PyCharm error while running PySpark code on Windows

WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
20/05/21 12:55:53 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.

It's just a warning; you can still run the PySpark code. From PyCharm you have to use spark-submit to execute your code; use pyspark when you want the REPL.
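Neither message needs fixing: the SparkUI line only means port 4040 was already taken, so Spark fell back to 4041. As a minimal sketch (app name and the `range(5)` job are placeholders, not from the question), you can also raise the log level from inside the script, as the startup banner suggests:

```python
from pyspark.sql import SparkSession

# Minimal local session; Spark binds the UI to the next free port (4041)
# automatically when 4040 is occupied by another application.
spark = SparkSession.builder.master("local[*]").appName("demo").getOrCreate()

# Hide the WARN-level startup noise, as the banner suggests.
spark.sparkContext.setLogLevel("ERROR")

print(spark.range(5).count())
spark.stop()
```

The native-hadoop warning appears whether you run this from PyCharm or via spark-submit, and it is harmless on Windows.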

Related

How do I solve these errors when installing Spark on Windows? Can anyone help me find a solution? I cannot start the project.

To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
21/11/02 12:09:01 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
21/11/02 12:09:02 ERROR SparkContext: Error initializing SparkContext.
java.lang.reflect.InvocationTargetException

Apache Nutch 2.3: won't inject urls (hangs) & hadoop log shows warning

I'm stuck trying to set up Nutch 2.3 with Elasticsearch 5.4. The problem is in Nutch, as I cannot get it to inject my URLs. The Hadoop log shows the following warning:
Console:
aurora apache-nutch-2.3.1 # runtime/local/bin/nutch inject urls/seed.txt
InjectorJob: starting at 2017-06-14 17:08:28
InjectorJob: Injecting urlDir: urls/seed.txt
**it hangs here**
and the
Hadoop log:
aurora apache-nutch-2.3.1 # cat runtime/local/logs/hadoop.log
2017-06-14 17:08:28,339 INFO crawl.InjectorJob - InjectorJob: starting at 2017-06-14 17:08:28
2017-06-14 17:08:28,340 INFO crawl.InjectorJob - InjectorJob: Injecting urlDir: urls/seed.txt
2017-06-14 17:08:28,992 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
I've tried setting my Hadoop environment variables following this thread (Hadoop "Unable to load native-hadoop library for your platform" warning), but I'm still getting the same error.
Any ideas?
Don't worry about the warning. I believe you are running on a Linux distribution.
Nutch 2.3 is not compatible with ES 5.x. I wrote a custom IndexWriter that invokes Logstash on a given port, which in turn invokes Elasticsearch. You may try this approach or something similar.
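The answer does not include the IndexWriter itself. As a rough, hypothetical sketch of the Logstash-in-the-middle idea (the port 8080 address and the document fields are assumptions, not from the original answer), this posts one crawled document to a Logstash HTTP input and lets Logstash forward it to Elasticsearch:

```python
import json
import urllib.request

# Hypothetical document as a crawler might produce it; Logstash ships it to
# Elasticsearch, so no Nutch code has to speak the ES 5.x protocol directly.
doc = {"url": "http://example.com/", "title": "Example", "content": "..."}

req = urllib.request.Request(
    "http://localhost:8080",  # assumed Logstash http input address
    data=json.dumps(doc).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=10) as resp:
    print(resp.status)  # 200 means Logstash accepted the document
```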

Unable to load native-hadoop library for your platform - Rancher

I am using Rancher to manage an environment running Hadoop + YARN (experimental) for Flink, plus ZooKeeper, in Rancher.
I am trying to configure HDFS in flink-conf.yaml.
These are the changes I made relating to HDFS:
fs.hdfs.hadoopconf: /etc/hadoop
recovery.zookeeper.storageDir: hdfs://:8020/flink/recovery
state.backend.fs.checkpointdir: hdfs://:8020/flink/checkpoints
And I get an error that says:
2016-09-06 14:10:44,965 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
What did I do wrong?
Best regards

Hadoop command `hadoop fs -ls` gives ConnectionRefused error

When I run a Hadoop command like hadoop fs -ls, I get the following error/warnings:
16/08/04 11:24:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ls: Call From master/172.17.100.54 to master:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Am I doing anything wrong with the hadoop path?
The Hadoop Native Libraries Guide says it is something to do with the installation; please check the documentation to resolve this.
Native Hadoop Library
Hadoop has native implementations of certain components for performance reasons and for non-availability of Java implementations. These components are available in a single, dynamically-linked native library called the native hadoop library. On the *nix platforms the library is named libhadoop.so.
Please note the following:
It is mandatory to install both the zlib and gzip development packages on the target platform in order to build the native hadoop library; however, for deployment it is sufficient to install just one package if you wish to use only one codec.
It is necessary to have the correct 32/64 libraries for zlib, depending on the 32/64 bit jvm for the target platform, in order to build and deploy the native hadoop library.
Runtime
The bin/hadoop script ensures that the native hadoop library is on the library path via the system property: -Djava.library.path=<path>
During runtime, check the hadoop log files for your MapReduce tasks.
If everything is all right, then:
DEBUG util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...
INFO util.NativeCodeLoader - Loaded the native-hadoop library
If something goes wrong, then:
INFO util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Check
NativeLibraryChecker is a tool to check whether native libraries are loaded correctly. You can launch NativeLibraryChecker as follows:
$ hadoop checknative -a
14/12/06 01:30:45 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version
14/12/06 01:30:45 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop: true /home/ozawa/hadoop/lib/native/libhadoop.so.1.0.0
zlib: true /lib/x86_64-linux-gnu/libz.so.1
snappy: true /usr/lib/libsnappy.so.1
lz4: true revision:99
bzip2: false
Second, the Connection refused error is related to your setup; please double-check it. A quick connectivity sketch follows the pointers below.
Also see the below as pointers:
Hadoop cluster setup - java.net.ConnectException: Connection refused
Hadoop - java.net.ConnectException: Connection refused
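As a quick first check, you can probe the NameNode RPC port directly; a small sketch (host and port taken from the error message above) might look like this:

```python
import socket

# master:9000 comes from the error message; it is the fs.defaultFS address.
# "Connection refused" means nothing is listening there: the NameNode is not
# running, or it is bound to a different interface/port.
try:
    with socket.create_connection(("master", 9000), timeout=5):
        print("NameNode RPC port is reachable")
except OSError as exc:
    print(f"Cannot reach master:9000 -> {exc}")
```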

Hortonworks Data node install: Exception in secureMain

I am trying to install a Hortonworks Hadoop single-node cluster. I am able to start the namenode and secondary namenode, but the datanode fails with the following error. How do I solve this issue?
2014-04-04 18:22:49,975 FATAL datanode.DataNode (DataNode.java:secureMain(1841)) - Exception in secureMain
java.lang.RuntimeException: Although a UNIX domain socket path is configured as /var/lib/hadoop-hdfs/dn_socket, we cannot start a localDataXceiverServer because libhadoop cannot be loaded.
See the Native Libraries Guide. Make sure libhadoop.so is available in $HADOOP_HOME/lib/native. Look in the logs for this message:
INFO util.NativeCodeLoader - Loaded the native-hadoop library
If instead you find
INFO util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
then it means libhadoop.so is not available, and you'll have to investigate why. Alternatively, you can turn off HDFS short-circuit if you wish, or enable the legacy short-circuit instead using dfs.client.use.legacy.blockreader.local, to remove the libhadoop dependency. But I reckon it would be better to find out what the problem with your library is.
Make sure you read and understand the articles linked before asking further questions.
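For reference, both workarounds named above are hdfs-site.xml changes; a sketch along these lines (to be adapted to your distribution's config management) could look like:

```xml
<!-- Option 1: turn off HDFS short-circuit reads entirely, so that
     libhadoop.so is no longer required for local reads. Also remove the
     dfs.domain.socket.path setting mentioned in the error. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>false</value>
</property>

<!-- Option 2 (alternative): keep short-circuit reads but use the legacy
     block reader, which also avoids the libhadoop dependency. -->
<!--
<property>
  <name>dfs.client.use.legacy.blockreader.local</name>
  <value>true</value>
</property>
-->
```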
