MapR Windows client not working

I am trying to install the MapR Windows client. I have followed all the steps outlined in the MapR Windows client installation instructions. I copied the ssl_truststore file from our cluster into the C:\opt\mapr\conf folder and ran configure.bat. It ran without any errors, and I verified that C:\opt\mapr\conf\mapr-clusters.conf was updated with the cluster name and CLDB nodes.
However, when I change to the folder c:\opt\mapr\hadoop\hadoop-2.7.0\bin and run the following command:
hadoop fs -ls /
I get the following error:
18/01/19 14:05:07 ERROR cldbutils.CLDBRpcCommonUtils: Exception during init
java.lang.UnsatisfiedLinkError: com.mapr.security.JNISecurity.SetClusterOption(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)I
at com.mapr.security.JNISecurity.SetClusterOption(Native Method)
at com.mapr.baseutils.cldbutils.CLDBRpcCommonUtils.init(CLDBRpcCommonUtils.java:163)
at com.mapr.baseutils.cldbutils.CLDBRpcCommonUtils.<init>(CLDBRpcCommonUtils.java:73)
at com.mapr.baseutils.cldbutils.CLDBRpcCommonUtils.<clinit>(CLDBRpcCommonUtils.java:63)
at org.apache.hadoop.conf.CoreDefaultProperties.<clinit>(CoreDefaultProperties.java:69)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2147)
at org.apache.hadoop.conf.Configuration.getProperties(Configuration.java:2362)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2579)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2531)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2444)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1156)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1128)
at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1464)
at org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:321)
at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:487)
at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:170)
at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:153)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:64)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
Exception in thread "main" java.lang.UnsatisfiedLinkError: com.mapr.security.JNISecurity.SetParsingDone()V
at com.mapr.security.JNISecurity.SetParsingDone(Native Method)
at com.mapr.baseutils.cldbutils.CLDBRpcCommonUtils.init(CLDBRpcCommonUtils.java:231)
at com.mapr.baseutils.cldbutils.CLDBRpcCommonUtils.<init>(CLDBRpcCommonUtils.java:73)
at com.mapr.baseutils.cldbutils.CLDBRpcCommonUtils.<clinit>(CLDBRpcCommonUtils.java:63)
at org.apache.hadoop.conf.CoreDefaultProperties.<clinit>(CoreDefaultProperties.java:69)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2147)
at org.apache.hadoop.conf.Configuration.getProperties(Configuration.java:2362)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2579)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2531)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2444)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1156)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1128)
at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1464)
at org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:321)
at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:487)
at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:170)
at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:153)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:64)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
We use Java 8 and Windows 7.
I have been stuck on this issue for a while. I tried all the options I could find but was not successful. Any help is greatly appreciated.

These are the steps:
1. Install the MapR applications.
2. Configure the MapR client:
/opt/mapr/server/configure.sh -N poc2.cibdatahub.com -c -C (your cluster name)
3. Upload the missing jar from a MapR cluster node to the edge node under:
/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/yarn/
4. Create a local mapr account if you don't have one, for example:
useradd mapr
passwd mapr
5. Set the environment for user mapr:
su mapr
vi ~/.bashrc
# append the lines below
export HADOOP_HOME=/opt/mapr/hadoop/hadoop-2.7.0
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export LD_LIBRARY_PATH=$HADOOP_COMMON_LIB_NATIVE_DIR
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
# Hive home directory configuration
export HIVE_HOME=/opt/mapr/hive/hive-2.1
export PATH="$PATH:$HIVE_HOME/bin"
export HADOOP_USER_NAME="USERname"
# load the environment
source ~/.bashrc
6. Edit /opt/mapr/spark/spark-2.1.0/conf/spark-defaults.conf by adding:
spark.yarn.archive maprfs:///apps/spark/jars/spark-jars.zip
7. Verify the cluster setting: /opt/mapr/conf/mapr-clusters.conf should have the option secure=false.
8. Test:
hadoop fs -Dfs.mapr.trace -ls
Please try these options.
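The "verify the cluster setting" step can be scripted. This is only a minimal sketch: it assumes each entry in mapr-clusters.conf is a single line of the form "<cluster name> secure=<true|false> <CLDB host>:<port> ...", and the function name cluster_secure_flag is made up for illustration.

```shell
# Print the value of the secure= flag from the first cluster entry of a
# mapr-clusters.conf-style file. Prints nothing if the flag is absent.
# Assumption: the flag appears on the first line as secure=true or secure=false.
cluster_secure_flag() {
  conf_file="$1"
  sed -n '1s/.*secure=\([A-Za-z]*\).*/\1/p' "$conf_file"
}

# Example (path taken from the steps above):
#   cluster_secure_flag /opt/mapr/conf/mapr-clusters.conf
```

If this prints true while the client was configured as non-secure (or the other way round), that mismatch is worth resolving before retrying the test command.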

Hadoop single-node starting issue

I'm trying to bring up a Hadoop standalone server (on AWS) by executing the start-dfs.sh script, but I get the error below:
Starting namenodes on [ip-xxx-xx-xxx-xx]
ip-xxx-xx-xxx-xx: Permission denied (publickey).
Starting datanodes
localhost: Permission denied (publickey).
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/hadoop/hdfs/tools/GetConf : Unsupported major.minor version 52.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:808)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:442)
at java.net.URLClassLoader.access$100(URLClassLoader.java:64)
at java.net.URLClassLoader$1.run(URLClassLoader.java:354)
at java.net.URLClassLoader$1.run(URLClassLoader.java:348)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:347)
at java.lang.ClassLoader.loadClass(ClassLoader.java:430)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:323)
at java.lang.ClassLoader.loadClass(ClassLoader.java:363)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:482)
Installed Java version is javac 1.7.0_181
Hadoop is 3.0.3.
Below are the path settings in the profile file:
export JAVA_HOME=/usr
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
#export PATH=$PATH:$HADOOP_CONF_DIR
export SCALA_HOME=/usr/local/scala
export PATH=$PATH:$SCALA_HOME/bin
What is the issue? Is there anything I'm missing?
Thanks
1. Run ssh-keygen.
2. It will ask for the location where the keys should be saved; I entered /home/hadoop/.ssh/id_rsa.
3. It will ask for a passphrase; keep it empty for simplicity.
4. cat /home/hadoop/.ssh/id_rsa.pub >> /home/hadoop/.ssh/authorized_keys (this appends the newly generated public key to the authorized_keys file in your user's .ssh directory)
5. ssh localhost should no longer ask for a password.
6. start-dfs.sh (now it should work!)
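The SSH steps address the "Permission denied (publickey)" lines, but the trace in the question also shows "Unsupported major.minor version 52.0". Class-file major version 52 corresponds to Java 8, so a Java 7 runtime such as the 1.7.0_181 mentioned in the question cannot load those classes; Hadoop 3.x needs Java 8. As a small sketch (the function name is made up), the mapping from the number in the error to a Java release is:

```shell
# Map a class-file major version to the Java release that targets it.
# For Java 5 and later the release number is (major - 44):
# 49 -> 5, 50 -> 6, 51 -> 7, 52 -> 8.
class_major_to_java() {
  major="$1"
  echo $((major - 44))
}

class_major_to_java 52   # prints 8
```

After installing a Java 8 runtime, JAVA_HOME in the profile would need to point at it so that the hadoop scripts pick it up.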

Exception :Server IPC version 9 cannot communicate with client version 4: using Hcatalog with Hive-0.14.0

I have the following tools:
Hadoop-2.6.0,
Hive-0.14.0,
hbase-0.94.8,
sqoop-1.4.5,
pig-0.14.0, installed in a pseudo-distributed environment on Ubuntu 14.0.4.
My goal is to use HCatalog as an interface to work with Hive, Pig, and MapReduce applications.
Steps I did:
1. I have MySQL configured as a remote metastore, with the mysql-connector-java-5.1.37 jar copied into HIVE_HOME/lib. I have created hive-site.xml in HIVE_HOME/conf for the remote metastore, which runs on the same machine.
2. I have a hive-env.sh file with HADOOP_HOME pointing to the Hadoop-2.6.0 home.
3. I have the remote metastore running on port 9083.
4. In my bashrc file I have the following environment variables set:
#Hadoop variables start
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export HADOOP_HOME=/home/user/hadoop-2.6.0
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
#Hadoop variables end
export HADOOP_USER_CLASSPATH_FIRST=true
export PIG_USER_CLASSPATH_FIRST=true
#PIG ENV VARIABLE
export PIG_HOME=/home/user/pig-0.14.0
export PATH=$PATH:$PIG_HOME/bin
#Hive Env Variable
export HIVE_HOME=/home/user/hive-0.14.0/apache-hive-0.14.0-bin
export PATH=$PATH:$HIVE_HOME/bin
#HCatalog env
export HCAT_HOME=$HIVE_HOME/hcatalog
export HCAT_HOME
export PATH=$PATH:$HCAT_HOME/bin
HCATJAR=$HCAT_HOME/share/hacatalog/hive-hcatalog-core-0.14.0.jar
export HCATJAR
HCATPIGJAR=$HCAT_HOME/share/hcatalog/hive-hcatalog-pig-adapter-0.14.0.jar
export HCATPIGJAR
export HADOOP_CLASSPATH=$HCATJAR:$HCATPIGJAR:$HIVE_HOME/lib/hive-exec-0.14.0.jar\
:$HIVE_HOME/lib/hive-metastore-0.14.0.jar:$HIVE_HOME/lib/jdo-api-*.jar:$HIVE_HOME/lib/libfb303-*.jar\
:$HIVE_HOME/lib/libthrift-*.jar:$HIVE_HOME/conf:$HADOOP_HOME/etc/hadoop
#Pig hcatalog integration
export PIG_OPTS=-Dhive.metastore.uris=thrift://localhost:9083
export PIG_CLASSPATH=$HCAT_HOME/share/hcatalog/*:$HIVE_HOME/lib/*:$HCATPIGJAR:$HIVE_HOME/conf:$HADOOP_HOME/etc/hadoop
I am trying to invoke the "hcat" command from the HIVE_HOME/hcatalog/bin path. Below is the error that is generated:
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:444)
at org.apache.hive.hcatalog.cli.HCatCli.main(HCatCli.java:149)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
at org.apache.hadoop.ipc.Client.call(Client.java:1070)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at com.sun.proxy.$Proxy5.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:427)
... 6 more
Observation: after a lot of Googling, if I understand correctly, "Server IPC version 9 cannot communicate with client version 4" indicates a Hadoop version mismatch.
So I added HADOOP_HOME to hive-env.sh, pointing to hadoop-2.6.0.
The error still persists. I am not sure what I am missing. Any help on this will be really appreciated.
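One thing worth checking, offered as a guess rather than a confirmed diagnosis: client IPC version 4 is what Hadoop 1.x client libraries speak, and Hadoop 1.x shipped its client classes in a single hadoop-core-<version>.jar. If such a jar sits in one of the directories on the classpaths above, it could shadow the 2.6.0 client. A minimal sketch for finding candidates (the function name is made up, and the directories to scan are up to you):

```shell
# List hadoop-core-*.jar files (the Hadoop 1.x client artifact) found under
# the given directories. Directories that do not exist are skipped.
find_hadoop1_jars() {
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      find "$dir" -name 'hadoop-core-*.jar' -print
    fi
  done
}

# Example using variables from the question's bashrc:
#   find_hadoop1_jars "$HIVE_HOME/lib" "$HCAT_HOME/share/hcatalog" "$PIG_HOME"
```

Any jar it reports is a candidate for removal from the classpath or for replacement with the matching Hadoop 2.6.0 client jars.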

Sqoop Export Oozie Workflow Fails with File Not Found, Works When Run from the Console

I have a hadoop cluster with 6 nodes. I'm pulling data out of MSSQL and back into MSSQL via Sqoop. Sqoop import commands work fine, and I can run a sqoop export command from the console (on one of the hadoop nodes). Here's the shell script I run:
SQLHOST=sqlservermaster.local
SQLDBNAME=db1
HIVEDBNAME=db1
BATCHID=
USERNAME="sqlusername"
PASSWORD="password"
sqoop export --connect 'jdbc:sqlserver://'$SQLHOST';username='$USERNAME';password='$PASSWORD';database='$SQLDBNAME'' --table ExportFromHive --columns col1,col2,col3 --export-dir /apps/hive/warehouse/$HIVEDBNAME.db/hivetablename
When I run this command from an oozie workflow, and it's passed the same parameters, I receive the error (when digging into the actual job run logs from the yarn scheduler screen):
2015-10-01 20:55:31,084 WARN [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Job init failed
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://hadoopnode1:8020/user/root/.staging/job_1443713197941_0134/job.splitmetainfo
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1568)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1432)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1390)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1312)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1080)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1519)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1515)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1448)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://hadoopnode1:8020/user/root/.staging/job_1443713197941_0134/job.splitmetainfo
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:51)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1563)
... 17 more
Has anyone ever seen this and been able to troubleshoot it? It only happens from the oozie workflow. There are similar topics but no one seems to have solved this specific problem.
Thanks!
I was able to solve this problem by setting the user.name property in the job.properties file for the Oozie workflow to the user yarn.
user.name=yarn
I think the problem was it did not have permission to create the staging files under /user/root. Once I modified the running user to yarn, the staging files were created under /user/yarn which did have the proper permission.
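To confirm which user a workflow will run as before submitting it, the property can be read back from job.properties. A minimal sketch, assuming a plain key=value properties file; the function name workflow_user is made up for illustration:

```shell
# Print the value of user.name from a key=value properties file.
# If the key appears more than once, the last occurrence wins.
workflow_user() {
  props_file="$1"
  sed -n 's/^user\.name[[:space:]]*=[[:space:]]*//p' "$props_file" | tail -n 1
}

# Example: inspect the HDFS home directory that will hold .staging
#   hdfs dfs -ls -d "/user/$(workflow_user job.properties)"
```

The listing shows the owner and permissions of the directory under which the job's .staging files are created.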

Unable to write file on HDFS

I have installed hadoop-2.6.0 and checked that all the Hadoop daemons are running. I am able to create or copy a directory in HDFS, but I am not able to copy a file to HDFS.
Command:
bin/hadoop fs -copyFromLocal /home/130853/Hadoop_Data/abc /trial/abc
It gives the following exception:
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(II[BI[BIILjava/lang/String;JZ)V
at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(Native Method)
at org.apache.hadoop.util.NativeCrc32.calculateChunkedSumsByteArray(NativeCrc32.java:86)
at org.apache.hadoop.util.DataChecksum.calculateChunkedSums(DataChecksum.java:430)
at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:202)
at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:163)
at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:144)
at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2217)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:54)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
at org.apache.hadoop.fs.shell.CommandWithDestination$TargetFileSystem.writeStreamToFile(CommandWithDestination.java:466)
at org.apache.hadoop.fs.shell.CommandWithDestination.copyStreamToTarget(CommandWithDestination.java:391)
at org.apache.hadoop.fs.shell.CommandWithDestination.copyFileToTarget(CommandWithDestination.java:328)
at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:263)
at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:248)
at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
at org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:243)
at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
at org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:220)
at org.apache.hadoop.fs.shell.CopyCommands$Put.processArguments(CopyCommands.java:267)
at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
Can someone please help on this?
Is your Linux system 64-bit or 32-bit?
If it's 64-bit, I suggest you recompile the Hadoop source code so that it supports the native library.
If it's 32-bit, I suggest you switch to a 64-bit system.
Try running this command:
hadoop checknative -a
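The 32-bit versus 64-bit question can also be answered for the native library file itself. A minimal sketch, assuming the library is an ELF file such as libhadoop.so under $HADOOP_HOME/lib/native (that location is an assumption, it is not stated in the thread). The fifth byte of an ELF header (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit objects; the function name elf_bits is made up:

```shell
# Print 32 or 64 depending on the EI_CLASS byte (offset 4) of an ELF file,
# or "unknown" if the byte has any other value.
elf_bits() {
  lib="$1"
  # od: skip 4 bytes, read 1 byte, print it as an unsigned decimal
  class_byte=$(od -An -tu1 -j4 -N1 "$lib" | tr -d '[:space:]')
  case "$class_byte" in
    1) echo 32 ;;
    2) echo 64 ;;
    *) echo unknown ;;
  esac
}

# Example (assumed path):
#   elf_bits "$HADOOP_HOME/lib/native/libhadoop.so"
```

Compare the result with the JVM and operating system (for example via uname -m); a mismatch is one reason native methods fail to link.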

Error while running Nutch on Hadoop multi cluster environment

I am running nutch on hadoop multi cluster environment.
Hadoop is throwing an error when nutch is being executed using the following command
$ bin/hadoop jar /home/nutch/nutch/runtime/deploy/nutch-1.5.1.job org.apache.nutch.crawl.Crawl urls -dir urls -depth 1 -topN 5
Error:
Exception in thread "main" java.io.IOException: Not a file: hdfs://master:54310/user/nutch/urls/crawldb
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:170)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:515)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:753)
at com.bdc.dod.dashboard.BDCQueryStatsViewer.run(BDCQueryStatsViewer.java:829)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at com.bdc.dod.dashboard.BDCQueryStatsViewer.main(BDCQueryStatsViewer.java:796)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
I tried every way I could find to solve this and fixed all the other issues, such as setting http.agent.name in the /local/conf path. I had installed it before and it went smoothly.
Can anybody suggest a solution?
By the way, I followed link for installing and running.
I was able to solve this issue. When copying files from the local file system to the HDFS destination, I used to run: bin/hadoop dfs -put ~/nutch/urls urls.
However, it should be "bin/hadoop dfs -put ~/nutch/urls/* urls"; here urls/* will allow subdirectories.
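The error message itself says that the job's input path urls contains crawldb, which is a directory; the old org.apache.hadoop.mapred.FileInputFormat shown in the trace rejects directories inside an input path. A small sketch for spotting such entries before running the crawl, assuming the usual ls-style listing where the permission string starts with d for directories and the path is the last column (the function name list_subdirs is made up):

```shell
# Read an "hadoop dfs -ls"-style listing on stdin and print the path of
# every entry whose permission string marks it as a directory.
list_subdirs() {
  awk '$1 ~ /^d/ { print $NF }'
}

# Example:
#   bin/hadoop dfs -ls urls | list_subdirs
```

If this prints anything for the seed directory, the input path holds more than plain seed files, which matches the "Not a file" error above.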
