Giraph / Hadoop reading manifest file

I am trying to run the RandomWalkWithRestart example: https://github.com/apache/giraph/blob/release-1.0/giraph-examples/src/main/java/org/apache/giraph/examples/RandomWalkWithRestartVertex.java
My input data is:
12 34 56
34 78
56 34 78
78 34
and I am running
hadoop jar giraph-examples-1.1.0-for-hadoop-2.2.0-jar-with-dependencies.jar GiraphRunner \
  -Dgiraph.zkList=<host>:port \
  -libjars giraph-examples-1.1.0-for-hadoop-2.2.0-jar-with-dependencies.jar \
  org.apache.giraph.examples.RandomWalkWithRestartComputation \
  -mc org.apache.giraph.examples.RandomWalkVertexMasterCompute \
  -wc org.apache.giraph.examples.RandomWalkWorkerContext \
  -vof org.apache.giraph.examples.VertexWithDoubleValueDoubleEdgeTextOutputFormat \
  -vif org.apache.giraph.examples.LongDoubleDoubleTextInputFormat \
  -vip giraph_algorithms/personalized_pr/input/graph.txt \
  -op giraph_algorithms/personalized_pr/out1 -w 1
But I am getting this error.. :-/
Error: java.lang.IllegalStateException: run: Caught an unrecoverable exception For input string: "PK�uE META-INF/��PKPK�uEMETA-INF/MANIFEST.MF�M��LK-.�"
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:101)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.NumberFormatException: For input string: "PK�uE META-INF/��PKPK�uEMETA-INF/MANIFEST.MF�M��LK-.�"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Long.parseLong(Long.java:441)
    at java.lang.Long.parseLong(Long.java:483)
    at org.apache.giraph.examples.RandomWalkWorkerContext.initializeSources(RandomWalkWorkerContext.java:131)
    at org.apache.giraph.examples.RandomWalkWorkerContext.setStaticVars(RandomWalkWorkerContext.java:160)
    at org.apache.giraph.examples.RandomWalkWorkerContext.preApplication(RandomWalkWorkerContext.java:146)
    at org.apache.giraph.graph.GraphTaskManager.workerContextPreApp(GraphTaskManager.java:815)
    at org.apache.giraph.graph.GraphTaskManager.prepareGraphStateAndWorkerContext(GraphTaskManager.java:451)
    at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:266)
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91)
    ... 7 more
Why is it reading the manifest file, when I specifically told it to read a file and not even a directory?

Because you passed the libjars argument as if it were the vertex class file.
Like the other arguments, you need to say: -D libjars=your_jar.jar.
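Put together, the command would look something like this (a sketch following the answer's suggestion; the host/port placeholder and paths are the ones from the question):

hadoop jar giraph-examples-1.1.0-for-hadoop-2.2.0-jar-with-dependencies.jar GiraphRunner \
  -Dgiraph.zkList=<host>:port \
  -D libjars=giraph-examples-1.1.0-for-hadoop-2.2.0-jar-with-dependencies.jar \
  org.apache.giraph.examples.RandomWalkWithRestartComputation \
  -mc org.apache.giraph.examples.RandomWalkVertexMasterCompute \
  -wc org.apache.giraph.examples.RandomWalkWorkerContext \
  -vof org.apache.giraph.examples.VertexWithDoubleValueDoubleEdgeTextOutputFormat \
  -vif org.apache.giraph.examples.LongDoubleDoubleTextInputFormat \
  -vip giraph_algorithms/personalized_pr/input/graph.txt \
  -op giraph_algorithms/personalized_pr/out1 -w 1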

Related

How to run a .exe file in a specific location using Gradle

I need to run a .exe file which is in a specific location.
This is the code I'm using:
tasks.register('buildComponents', Exec) {
    dependsOn createSetupIni
    doLast {
        exec {
            workingDir = file('tools/configBuild')
            executable = 'ConfigBuilder.exe'
            args = ["${configLocation}/setup.ini", "${logFilePath}/configBuild"]
        }
    }
}
But I'm getting the below error during execution.
Caused by: org.gradle.process.internal.ExecException: A problem occurred starting process 'command 'ConfigBuilder.exe''
at org.gradle.process.internal.DefaultExecHandle.execExceptionFor(DefaultExecHandle.java:232)
at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:209)
at org.gradle.process.internal.DefaultExecHandle.failed(DefaultExecHandle.java:356)
at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:86)
at org.gradle.internal.operations.CurrentBuildOperationPreservingRunnable.run(CurrentBuildOperationPreservingRunnable.java:42)
at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63)
at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:46)
at org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:55)
Caused by: net.rubygrapefruit.platform.NativeException: Could not start 'ConfigBuilder.exe'
at net.rubygrapefruit.platform.internal.DefaultProcessLauncher.start(DefaultProcessLauncher.java:27)
at net.rubygrapefruit.platform.internal.WindowsProcessLauncher.start(WindowsProcessLauncher.java:22)
at net.rubygrapefruit.platform.internal.WrapperProcessLauncher.start(WrapperProcessLauncher.java:36)
at org.gradle.process.internal.ExecHandleRunner.startProcess(ExecHandleRunner.java:97)
at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:70)
... 4 more
Caused by: java.io.IOException: Cannot run program "ConfigBuilder.exe" (in directory "<full path>\tools\configBuild"): CreateProcess error=2, The system cannot find the file specified
at net.rubygrapefruit.platform.internal.DefaultProcessLauncher.start(DefaultProcessLauncher.java:25)
... 8 more
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
... 9 more
Any idea about this?
Solved it by removing the exec {} block.
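For reference, a minimal sketch of what the task might look like without the nested exec {} block (an assumption based on that fix, not code from the thread; the absolute executable path is also an assumption, so Windows does not have to resolve the name against the working directory):

tasks.register('buildComponents', Exec) {
    dependsOn createSetupIni
    // Configure the Exec task directly instead of nesting exec {} inside doLast.
    workingDir = file('tools/configBuild')
    // An absolute path avoids relying on the OS searching for the .exe.
    executable = file('tools/configBuild/ConfigBuilder.exe')
    args = ["${configLocation}/setup.ini", "${logFilePath}/configBuild"]
}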

NiFi 1.10 Stateless - Caused by: java.nio.file.NoSuchFileException

RUN THIS COMMAND
/bin/nifi.sh stateless RunFromRegistry Once --file ./test/stateless_test1.json
LOG
Note: Use of this command is considered experimental. The commands and approach used may change from time to time.
Java home (JAVA_HOME): /home/deltaman/software/jdk1.8.0_211
Java options (STATELESS_JAVA_OPTS): -Xms1024m -Xmx1024m
13:48:39.835 [main] INFO org.apache.nifi.StatelessNiFi - Unpacking 100 NARs
13:50:51.513 [main] INFO org.apache.nifi.StatelessNiFi - Finished unpacking 100 NARs in 131671 millis
Exception in thread "main" java.lang.reflect.InvocationTargetException
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.StatelessNiFi.main(StatelessNiFi.java:103)
... 5 more
Caused by: java.nio.file.NoSuchFileException: ./test/stateless_test1.json
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
at java.nio.file.Files.newByteChannel(Files.java:361)
at java.nio.file.Files.newByteChannel(Files.java:407)
at java.nio.file.Files.readAllBytes(Files.java:3152)
at org.apache.nifi.stateless.runtimes.Program.runLocal(Program.java:119)
at org.apache.nifi.stateless.runtimes.Program.launch(Program.java:67)
... 10 more
It seems the file does not exist, but I can find the file as follows:
$ cat ./test/stateless_test1.json
{
  "registryUrl": "http://10.148.123.12:9991",
  "bucketId": "ec1b291e-c3f1-437c-a4e4-c069bd2f6ed1",
  "flowId": "b1f73fe8-2874-47a5-970c-6b25eea19497",
  "parameters": {
    "text": "xixixixi"
  }
}
CONFIGURATION
I don't know what the problem is. Any suggestion is appreciated!
/bin/nifi.sh stateless RunFromRegistry Once --file ./test/stateless_test1.json
Here --file is given a relative path; you must use the full path, such as:
/home/NiFi/nifi-1.10.0/bin/nifi.sh stateless RunFromRegistry Once --file /home/NiFi/nifi-1.10.0/test/stateless_test1.json
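If the command needs to stay portable, one alternative (an assumption, not from the thread) is to let the shell expand the absolute path for you:

/home/NiFi/nifi-1.10.0/bin/nifi.sh stateless RunFromRegistry Once --file "$(pwd)/test/stateless_test1.json"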

Unable to start blueprint container - Hello World

I'm new to OpenDaylight and I was following a tutorial to build a simple Hello World project (this tutorial), but when I run the project with ./karaf and check whether the module is initialized with log:display | grep hello, I get this error:
2017-08-04 12:43:57,159 | INFO | Event Dispatcher | YangTextSchemaContextResolver | 55 - org.opendaylight.yangtools.yang-parser-impl - 1.0.2.Boron-SR2 | Provided module name /META-INF/yang/hello.yang#0000-00-00.yang does not match actual text hello#2015-01-05.yang, corrected
2017-08-04 12:44:01,928 | INFO | Event Dispatcher | YangTextSchemaContextResolver | 55 - org.opendaylight.yangtools.yang-parser-impl - 1.0.2.Boron-SR2 | Provided module name /META-INF/yang/hello.yang#0000-00-00.yang does not match actual text hello#2015-01-05.yang, corrected
2017-08-04 12:44:08,295 | INFO | Event Dispatcher | BlueprintBundleTracker | 148 - org.opendaylight.controller.blueprint - 0.5.2.Boron-SR2 | Creating blueprint container for bundle org.opendaylight.hello.impl_0.1.0.SNAPSHOT [174] with paths [bundleentry://174.fwk592688102/org/opendaylight/blueprint/impl-blueprint.xml]
2017-08-04 12:44:08,318 | ERROR | Event Dispatcher | BlueprintContainerImpl | 15 - org.apache.aries.blueprint.core - 1.6.1 | Unable to start blueprint container for bundle org.opendaylight.hello.impl/0.1.0.SNAPSHOT
With the diag command I get this output:
opendaylight-user#root>diag
hello-impl (174)
----------------
Status: Failure
Blueprint
4/08/17 14:12
Exception:
Unable to validate xml
org.osgi.service.blueprint.container.ComponentDefinitionException: Unable to validate xml
at org.apache.aries.blueprint.parser.Parser.validate(Parser.java:349)
at org.apache.aries.blueprint.parser.Parser.validate(Parser.java:336)
at org.apache.aries.blueprint.container.BlueprintContainerImpl.doRun(BlueprintContainerImpl.java:343)
at org.apache.aries.blueprint.container.BlueprintContainerImpl.run(BlueprintContainerImpl.java:276)
at org.apache.aries.blueprint.container.BlueprintExtender.createContainer(BlueprintExtender.java:300)
at org.apache.aries.blueprint.container.BlueprintExtender.createContainer(BlueprintExtender.java:269)
at org.apache.aries.blueprint.container.BlueprintExtender.access$900(BlueprintExtender.java:68)
at org.apache.aries.blueprint.container.BlueprintExtender$BlueprintContainerServiceImpl.createContainer(BlueprintExtender.java:602)
at org.opendaylight.controller.blueprint.BlueprintBundleTracker.modifiedBundle(BlueprintBundleTracker.java:178)
at org.opendaylight.controller.blueprint.BlueprintBundleTracker.addingBundle(BlueprintBundleTracker.java:159)
at org.opendaylight.controller.blueprint.BlueprintBundleTracker.addingBundle(BlueprintBundleTracker.java:51)
at org.osgi.util.tracker.BundleTracker$Tracked.customizerAdding(BundleTracker.java:467)
at org.osgi.util.tracker.BundleTracker$Tracked.customizerAdding(BundleTracker.java:414)
at org.osgi.util.tracker.AbstractTracked.trackAdding(AbstractTracked.java:256)
at org.osgi.util.tracker.AbstractTracked.track(AbstractTracked.java:229)
at org.osgi.util.tracker.BundleTracker$Tracked.bundleChanged(BundleTracker.java:443)
at org.eclipse.osgi.framework.internal.core.BundleContextImpl.dispatchEvent(BundleContextImpl.java:847)
at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230)
at org.eclipse.osgi.framework.eventmgr.ListenerQueue.dispatchEventSynchronous(ListenerQueue.java:148)
at org.eclipse.osgi.framework.internal.core.Framework.publishBundleEventPrivileged(Framework.java:1568)
at org.eclipse.osgi.framework.internal.core.Framework.publishBundleEvent(Framework.java:1504)
at org.eclipse.osgi.framework.internal.core.Framework.publishBundleEvent(Framework.java:1499)
at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:391)
at org.eclipse.osgi.framework.internal.core.AbstractBundle.resume(AbstractBundle.java:390)
at org.eclipse.osgi.framework.internal.core.Framework.resumeBundle(Framework.java:1176)
at org.eclipse.osgi.framework.internal.core.StartLevelManager.resumeBundles(StartLevelManager.java:559)
at org.eclipse.osgi.framework.internal.core.StartLevelManager.resumeBundles(StartLevelManager.java:544)
at org.eclipse.osgi.framework.internal.core.StartLevelManager.incFWSL(StartLevelManager.java:457)
at org.eclipse.osgi.framework.internal.core.StartLevelManager.doSetStartLevel(StartLevelManager.java:243)
at org.eclipse.osgi.framework.internal.core.StartLevelManager.dispatchEvent(StartLevelManager.java:438)
at org.eclipse.osgi.framework.internal.core.StartLevelManager.dispatchEvent(StartLevelManager.java:1)
at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230)
at org.eclipse.osgi.framework.eventmgr.EventManager$EventThread.run(EventManager.java:340)
Caused by: org.xml.sax.SAXParseException: cvc-complex-type.2.3: Element 'blueprint' cannot have character [children], because the type's content type is element-only.
at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at org.apache.xerces.util.ErrorHandlerWrapper.error(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.xs.XMLSchemaValidator$XSIErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.xs.XMLSchemaValidator.reportSchemaError(Unknown Source)
at org.apache.xerces.impl.xs.XMLSchemaValidator.elementLocallyValidComplexType(Unknown Source)
at org.apache.xerces.impl.xs.XMLSchemaValidator.elementLocallyValidType(Unknown Source)
at org.apache.xerces.impl.xs.XMLSchemaValidator.processElementContent(Unknown Source)
at org.apache.xerces.impl.xs.XMLSchemaValidator.handleEndElement(Unknown Source)
at org.apache.xerces.impl.xs.XMLSchemaValidator.endElement(Unknown Source)
at org.apache.xerces.jaxp.validation.DOMValidatorHelper.finishNode(Unknown Source)
at org.apache.xerces.jaxp.validation.DOMValidatorHelper.validate(Unknown Source)
at org.apache.xerces.jaxp.validation.DOMValidatorHelper.validate(Unknown Source)
at org.apache.xerces.jaxp.validation.ValidatorImpl.validate(Unknown Source)
at javax.xml.validation.Validator.validate(Unknown Source)
at org.apache.aries.blueprint.parser.Parser.validate(Parser.java:346)
... 32 more
As I've said, I was following the tutorial, so my files are exactly the same as the ones in the OpenDaylight link (this is the repository I've created: GitHub).
I think it's important to say how I generated the project. This is the code:
mvn archetype:generate -DarchetypeGroupId=org.opendaylight.controller -DarchetypeArtifactId=opendaylight-startup-archetype -DarchetypeRepository=https://nexus.opendaylight.org/content/repositories/public/ -DarchetypeCatalog=remote -DarchetypeVersion=1.2.2-Boron-SR2
Thank you all,
Daniel Romero Morcillo
In the logs you provided:
Element 'blueprint' cannot have character [children], because the type's content type is element-only.
So I think there are simply some errors (invalid XML) in your blueprint file.
If it is exactly like the one in the link you provided, there are some extra characters on line 19.
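To illustrate what the validator is complaining about (a hypothetical fragment, not the actual file; the bean class name is invented): the blueprint schema only allows child elements inside <blueprint>, so any stray text between tags fails validation:

<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">
  xx  <!-- stray text like this, outside any element, triggers the error -->
  <bean id="helloProvider" class="org.opendaylight.hello.impl.HelloProvider"/>
</blueprint>

Removing the stray characters so that <blueprint> contains only elements should let the container validate and start.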

How to insert JSON data from HDFS into MySQL using Sqoop?

I have loaded JSON data into HDFS and created a table with the required columns in a MySQL database, as follows.
How do I create the table with a row formatter that accepts JSON?
My HDFS data
{
  "Employees": [
    {
      "userId": "rirani",
      "jobTitleName": "Developer",
      "firstName": "Romin",
      "lastName": "Irani",
      "preferredFullName": "Romin Irani",
      "employeeCode": "E1",
      "region": "CA",
      "phoneNumber": "408-1234567",
      "emailAddress": "romin.k.irani#gmail.com"
    },
    {
      "userId": "nirani",
      "jobTitleName": "Developer",
      "firstName": "Neil",
      "lastName": "Irani",
      "preferredFullName": "Neil Irani",
      "employeeCode": "E2",
      "region": "CA",
      "phoneNumber": "408-1111111",
      "emailAddress": "neilrirani#gmail.com"
    },
    {
      "userId": "thanks",
      "jobTitleName": "Program Directory",
      "firstName": "Tom",
      "lastName": "Hanks",
      "preferredFullName": "Tom Hanks",
      "employeeCode": "E3",
      "region": "CA",
      "phoneNumber": "408-2222222",
      "emailAddress": "tomhanks#gmail.com"
    }
  ]
}
My SQL table structure
mysql> create table employee(userid int,jobTitleName varchar(20),firstName varchar(20),lastName varchar(20),preferrredFullName varchar(20),employeeCode varchar(20),region varchar(20),phoneNumber varchar(20), emailAddress varchar(20),modifiedDate timestamp DEFAULT CURRENT_TIMESTAMP);
mysql> desc employee;
+--------------------+-------------+------+-----+-------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+-------------+------+-----+-------------------+-------+
| userid | int(11) | YES | | NULL | |
| jobTitleName | varchar(20) | YES | | NULL | |
| firstName | varchar(20) | YES | | NULL | |
| lastName | varchar(20) | YES | | NULL | |
| preferrredFullName | varchar(20) | YES | | NULL | |
| employeeCode | varchar(20) | YES | | NULL | |
| region | varchar(20) | YES | | NULL | |
| phoneNumber | varchar(20) | YES | | NULL | |
| emailAddress | varchar(20) | YES | | NULL | |
| modifiedDate | timestamp | NO | | CURRENT_TIMESTAMP | |
+--------------------+-------------+------+-----+-------------------+-------+
10 rows in set (0.00 sec)
I am trying to load the data from HDFS into MySQL for the above table using sqoop export, as follows:
sqoop export --connect jdbc:mysql://localhost/emp_scheme --username root --password adithyan --table employee --export-dir /user/adithyan/filesystem/employee.txt
It ends up with an exception, as follows:
17/02/18 19:35:35 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
17/02/18 19:35:35 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/02/18 19:35:35 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
17/02/18 19:35:35 INFO tool.CodeGenTool: Beginning code generation
17/02/18 19:35:36 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `employee` AS t LIMIT 1
17/02/18 19:35:36 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `employee` AS t LIMIT 1
17/02/18 19:35:36 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/adithyan/hadoop_dir/hadoop-1.2.1
Note: /tmp/sqoop-adithyan/compile/35afadf151a1dd1626a3658577cbc2dd/employee.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/02/18 19:35:41 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-adithyan/compile/35afadf151a1dd1626a3658577cbc2dd/employee.jar
17/02/18 19:35:41 INFO mapreduce.ExportJobBase: Beginning export of employee
17/02/18 19:35:45 INFO input.FileInputFormat: Total input paths to process : 1
17/02/18 19:35:45 INFO input.FileInputFormat: Total input paths to process : 1
17/02/18 19:35:45 INFO util.NativeCodeLoader: Loaded the native-hadoop library
17/02/18 19:35:45 WARN snappy.LoadSnappy: Snappy native library not loaded
17/02/18 19:35:46 INFO mapred.JobClient: Running job: job_201702181051_0002
17/02/18 19:35:47 INFO mapred.JobClient: map 0% reduce 0%
17/02/18 19:36:17 INFO mapred.JobClient: Task Id : attempt_201702181051_0002_m_000000_0, Status : FAILED
java.io.IOException: Can't export data, please check failed map task logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Can't parse input data: '"firstName":"Tom"'
at employee.__loadFromFields(employee.java:596)
at employee.parse(employee.java:499)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
Caused by: java.lang.NumberFormatException: For input string: ""firstName":"Tom""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:569)
at java.lang.Integer.valueOf(Integer.java:766)
at employee.__loadFromFields(employee.java:548)
... 12 more
17/02/18 19:36:18 INFO mapred.JobClient: Task Id : attempt_201702181051_0002_m_000001_0, Status : FAILED
java.io.IOException: Can't export data, please check failed map task logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Can't parse input data: '{'
at employee.__loadFromFields(employee.java:596)
at employee.parse(employee.java:499)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
Caused by: java.lang.NumberFormatException: For input string: "{"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.valueOf(Integer.java:766)
at employee.__loadFromFields(employee.java:548)
... 12 more
(The same two failures then repeat verbatim for attempts _m_000000_1, _m_000001_1, _m_000000_2, and _m_000001_2.)
Can somebody help me on this?
You may have to look at multiple options.
MySQL's JSON functions (JSON_SET / JSON_REPLACE / JSON_INSERT) may not be directly supported by Sqoop yet.
Another option is to pre-process the data using Pig, stage the flattened records in HDFS, and then sqoop them to the RDBMS, as sketched below.
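As a concrete illustration of the pre-processing idea (a sketch under assumptions: it uses plain Python instead of Pig, the local file names are invented, and the column order matches the employee table from the question):

#!/usr/bin/env python
# Flatten the Employees JSON into comma-separated lines matching the
# column order of the MySQL employee table, so sqoop export can parse
# each line as a plain delimited record.
import json

with open('employee.json') as f:      # local copy of the HDFS file
    doc = json.load(f)

with open('employee.csv', 'w') as out:
    for i, e in enumerate(doc['Employees'], start=1):
        row = [str(i),                # userid: the JSON has no numeric id
               e['jobTitleName'], e['firstName'], e['lastName'],
               e['preferredFullName'], e['employeeCode'], e['region'],
               e['phoneNumber'], e['emailAddress']]
        out.write(','.join(row) + '\n')

After copying employee.csv back to HDFS, the export from the question should work with --input-fields-terminated-by ',' and a --columns list that leaves out modifiedDate (its DEFAULT CURRENT_TIMESTAMP fills it in).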

How do I get the Hadoop input filename from mapper?

Hadoop streaming makes the filename available to every map task through an environment variable.
Python:
os.environ["map.input.file"]
Java:
System.getenv("map.input.file")
How about Ruby?
mapper.rb
#!/usr/bin/env ruby

STDIN.each_line do |line|
  line.split.each do |word|
    word = word[/([a-zA-Z0-9]+)/]
    word = word.gsub(/ /, "")
    puts [word, 1].join("\t")
  end
end

puts ENV['map.input.file']
How about:
ENV['map.input.file']
Ruby lets you assign to the ENV hash just as easily:
ENV['map.input.file'] = '/path/to/file'
All JobConf variables are put into environment variables by hadoop-streaming. The variable names are made "safe" by converting any character not in 0-9 A-Z a-z to _.
So map.input.file => map_input_file
Try: puts ENV['map_input_file']
Using the input from the OP, I tried this mapper:
#!/usr/bin/python
import os
file_name = os.getenv('map_input_file')
print file_name
and a standard wordcount reducer, using the command:
hadoop fs -rmr /user/itsjeevs/wc &&
hadoop jar $STRMJAR -files /home/jejoseph/wc_mapper.py,/home/jejoseph/wc_reducer.py \
-mapper wc_mapper.py \
-reducer wc_reducer.py \
-numReduceTasks 10 \
-input "/data/*" \
-output wc
which failed with this error:
16/03/10 15:21:32 INFO mapreduce.Job: Task Id : attempt_1455931799889_822384_m_000043_0, Status : FAILED
Error: java.io.IOException: Stream closed
at java.lang.ProcessBuilder$NullOutputStream.write(ProcessBuilder.java:434)
at java.io.OutputStream.write(OutputStream.java:116)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.hadoop.streaming.io.TextInputWriter.writeUTF8(TextInputWriter.java:72)
at org.apache.hadoop.streaming.io.TextInputWriter.writeValue(TextInputWriter.java:51)
at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:106)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
16/03/10 15:21:32 INFO mapreduce.Job: Task Id : attempt_1455931799889_822384_m_000077_0, Status : FAILED
Error: java.io.IOException: Broken pipe
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:345)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.hadoop.streaming.io.TextInputWriter.writeUTF8(TextInputWriter.java:72)
at org.apache.hadoop.streaming.io.TextInputWriter.writeValue(TextInputWriter.java:51)
at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:106)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Not sure what is happening.
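One plausible cause (an assumption, not confirmed in the thread): that mapper prints the filename and exits without ever consuming stdin, so when the streaming framework keeps feeding input to the already-finished process it hits the Broken pipe / Stream closed errors above. A sketch of a mapper that drains stdin first:

#!/usr/bin/python
import os
import sys

# Read all input so the framework's writer never hits a broken pipe.
for line in sys.stdin:
    pass

print os.getenv('map_input_file')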
