Spring Integration - outbound transfer renaming issues

I am using the sample program here to build my code. Everything works fine with a local SFTP test server, but when I tested today against my client's SFTP server, it threw the exception below.
When I debugged, I saw the file being written with the '.writing' extension on the client's SFTP server. The contents are fine and there is no issue with the transferred file itself, but the file name is now the problem. After reading through the Spring docs, I see that this is the temporary file extension, and that the program tries to rename the file back to its original name; since the client's SFTP server does not allow this, it throws an exception.
I tried changing temporary-file-suffix=".writing" to temporary-file-suffix="", thinking that would turn the option off, but the same issue occurred. Is there a workaround?
One other post mentions this, but there is no solution, or the issue simply vanished for that user.
> 2014-07-28 20:11:21,564 [main] INFO com.jcraft.jsch - SSH_MSG_NEWKEYS sent
> 2014-07-28 20:11:21,608 [main] INFO com.jcraft.jsch - SSH_MSG_NEWKEYS received
> 2014-07-28 20:11:21,678 [main] INFO com.jcraft.jsch - SSH_MSG_SERVICE_REQUEST sent
> 2014-07-28 20:11:21,770 [main] INFO com.jcraft.jsch - SSH_MSG_SERVICE_ACCEPT received
> 2014-07-28 20:11:21,818 [main] INFO com.jcraft.jsch - Authentications that can continue: publickey,keyboard-interactive,password
> 2014-07-28 20:11:21,818 [main] INFO com.jcraft.jsch - Next authentication method: publickey
> 2014-07-28 20:11:21,819 [main] INFO com.jcraft.jsch - Authentications that can continue: keyboard-interactive,password
> 2014-07-28 20:11:21,819 [main] INFO com.jcraft.jsch - Next authentication method: keyboard-interactive
> 2014-07-28 20:11:21,978 [main] INFO com.jcraft.jsch - Authentication succeeded (keyboard-interactive).
> 2014-07-28 20:11:22,199 [main] DEBUG org.springframework.beans.factory.support.DefaultListableBeanFactory - Returning cached instance of singleton bean 'integrationEvaluationContext'
> 2014-07-28 20:11:22,878 [main] DEBUG org.springframework.integration.sftp.session.SftpSession - Initial File rename failed, possibly because file already exists. Will attempt to delete file: remote/TESTFILE.ABC and execute rename again.
> 2014-07-28 20:11:22,958 [main] INFO com.jcraft.jsch - Disconnecting from fsgatewaytest.aexp.com port 22
> 2014-07-28 20:11:22,961 [Connect thread fsgatewaytest.aexp.com session] INFO com.jcraft.jsch - Caught an exception, leaving main loop due to Socket closed
> Exception in thread "main" org.springframework.messaging.MessageDeliveryException: Error handling message for file [data/TESTFILE.ABC -> TESTFILE.ABC]
> at org.springframework.integration.file.remote.RemoteFileTemplate$1.doInSession(RemoteFileTemplate.java:227)
> at org.springframework.integration.file.remote.RemoteFileTemplate$1.doInSession(RemoteFileTemplate.java:190)
> at org.springframework.integration.file.remote.RemoteFileTemplate.execute(RemoteFileTemplate.java:302)
> at org.springframework.integration.file.remote.RemoteFileTemplate.send(RemoteFileTemplate.java:190)
> at org.springframework.integration.file.remote.RemoteFileTemplate.send(RemoteFileTemplate.java:182)
> at org.springframework.integration.file.remote.handler.FileTransferringMessageHandler.handleMessageInternal(FileTransferringMessageHandler.java:101)
> at org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:78)
> at org.springframework.integration.dispatcher.AbstractDispatcher.tryOptimizedDispatch(AbstractDispatcher.java:116)
> at org.springframework.integration.dispatcher.UnicastingDispatcher.doDispatch(UnicastingDispatcher.java:101)
> at org.springframework.integration.dispatcher.UnicastingDispatcher.dispatch(UnicastingDispatcher.java:97)
> at org.springframework.integration.channel.AbstractSubscribableChannel.doSend(AbstractSubscribableChannel.java:77)
> at org.springframework.integration.channel.AbstractMessageChannel.send(AbstractMessageChannel.java:255)
> at org.springframework.integration.channel.AbstractMessageChannel.send(AbstractMessageChannel.java:223)
> at com.reachlocal.payment.integration.SftpOutboundTransferMain.main(SftpOutboundTransferMain.java:43)
> Caused by: org.springframework.messaging.MessagingException: Failed to write to 'inbox/REACHLOCALTST.CUFI.writing' while uploading the file
> at org.springframework.integration.file.remote.RemoteFileTemplate.sendFileToRemoteDirectory(RemoteFileTemplate.java:397)
> at org.springframework.integration.file.remote.RemoteFileTemplate.access$500(RemoteFileTemplate.java:56)
> at org.springframework.integration.file.remote.RemoteFileTemplate$1.doInSession(RemoteFileTemplate.java:213)
> ... 13 more
> Caused by: org.springframework.core.NestedIOException: Failed to delete file inbox/TESTFILE.ABC; nested exception is org.springframework.core.NestedIOException: Failed to remove file: 2: Specified file path is invalid.
> at org.springframework.integration.sftp.session.SftpSession.rename(SftpSession.java:200)
> at org.springframework.integration.file.remote.RemoteFileTemplate.sendFileToRemoteDirectory(RemoteFileTemplate.java:393)
> ... 15 more
> Caused by: org.springframework.core.NestedIOException: Failed to remove file: 2: Specified file path is invalid.
> at org.springframework.integration.sftp.session.SftpSession.remove(SftpSession.java:83)
> at org.springframework.integration.sftp.session.SftpSession.rename(SftpSession.java:194)
> ... 16 more
Updated config: Using use-temporary-file-name="false" solved this issue. Thanks a ton.
<int:channel id="inputChannel"/>
<int-sftp:outbound-channel-adapter id="sftpOutboundAdapter"
session-factory="sftpSessionFactory"
channel="inputChannel"
remote-filename-generator-expression="payload.getName()"
remote-directory="inbox"
use-temporary-file-name="false"/>

How about this use-temporary-file-name option?
In this case you end up with this:
try {
    // the file is first written under its temporary name
    session.write(inputStream, tempFilePath);
    // then rename it to its final name if necessary
    if (useTemporaryFileName) {
        session.rename(tempFilePath, remoteFilePath);
    }
}
From the mentioned doc:
However, there may be situations where you don't want to use this technique (for example, if the server does not permit renaming files). For situations like this, you can disable this feature by setting use-temporary-file-name to false (default is true). When this attribute is false, the file is written with its final name and the consuming application will need some other mechanism to detect that the file is completely uploaded before accessing it.

Related

Apache Atlas quickstart - kafka error

Environment: no Kerberos, no Ranger, no HDFS. EC2 with SSL.
I am getting this error after running $ATLAS_HOME/bin/quick_start.py https://$componentPrivateDNSRecord:21443 with the correct user/pass:
Creating sample types:
Created type [DB]
Created type [Table]
Created type [StorageDesc]
Created type [Column]
Created type [LoadProcess]
Created type [View]
Created type [JdbcAccess]
Created type [ETL]
Created type [Metric]
Created type [PII]
Created type [Fact]
Created type [Dimension]
Created type [Log Data]
Creating sample entities:
Exception in thread "main" com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
at com.sun.jersey.api.client.filter.HTTPBasicAuthFilter.handle(HTTPBasicAuthFilter.java:105)
at com.sun.jersey.api.client.Client.handle(Client.java:652)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.method(WebResource.java:634)
at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:334)
at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:311)
at org.apache.atlas.AtlasBaseClient.callAPI(AtlasBaseClient.java:199)
at org.apache.atlas.AtlasClientV2.createEntity(AtlasClientV2.java:277)
at org.apache.atlas.examples.QuickStartV2.createInstance(QuickStartV2.java:339)
at org.apache.atlas.examples.QuickStartV2.createDatabase(QuickStartV2.java:362)
at org.apache.atlas.examples.QuickStartV2.createEntities(QuickStartV2.java:268)
at org.apache.atlas.examples.QuickStartV2.runQuickstart(QuickStartV2.java:150)
at org.apache.atlas.examples.QuickStartV2.main(QuickStartV2.java:132)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:940)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:347)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:253)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153)
... 14 more
No sample data added to Apache Atlas Server.
Relevant code:
https://github.com/apache/incubator-atlas/blob/master/webapp/src/main/java/org/apache/atlas/examples/QuickStartV2.java
// This works
quickStartV2.createTypes();
// This errors
quickStartV2.createEntities();
First I thought the Atlas -> Kafka connectivity was the issue, but then I see:
[ec2-user@ip-10-160-187-181 logs]$ cat atlas_kafka_setup.log
2018-07-25 00:06:14,923 INFO - [main:] ~ Looking for atlas-application.properties in classpath (ApplicationProperties:78)
2018-07-25 00:06:14,926 INFO - [main:] ~ Loading atlas-application.properties from file:/home/ec2-user/atlas/distro/target/apache-atlas-1.0.0-SNAPSHOT-bin/apache-atlas-1.0.0-SNAPSHOT/conf/atlas-application.properties (ApplicationProperties:91)
2018-07-25 00:06:16,512 WARN - [main:] ~ Attempting to create topic ATLAS_HOOK (AtlasTopicCreator:72)
2018-07-25 00:06:17,004 WARN - [main:] ~ Created topic ATLAS_HOOK with partitions 1 and replicas 1 (AtlasTopicCreator:119)
2018-07-25 00:06:17,004 WARN - [main:] ~ Attempting to create topic ATLAS_ENTITIES (AtlasTopicCreator:72)
2018-07-25 00:06:17,024 WARN - [main:] ~ Created topic ATLAS_ENTITIES with partitions 1 and replicas 1 (AtlasTopicCreator:119)
2018-07-25 01:49:45,147 DEBUG - [main:] ~ Calling API [ GET : api/atlas/v2/types/typedefs ] (AtlasBaseClient:319)
2018-07-25 01:49:45,147 DEBUG - [main:] ~ Attempting to configure HTTPS connection using client configuration (SecureClientUtils$4:221)
2018-07-25 01:49:45,166 INFO - [main:] ~ Unable to configure HTTPS connection from configuration. Leveraging JDK properties. (SecureClientUtils$4:240)
2018-07-25 01:49:45,269 DEBUG - [main:] ~ API https://mydns:21443/api/atlas/v2/types/typedefs?name=Dimension returned status 200 (AtlasBaseClient:337)
2018-07-25 01:49:45,270 DEBUG - [main:] ~ Calling API [ GET : api/atlas/v2/types/typedefs ] (AtlasBaseClient:319)
2018-07-25 01:49:45,271 DEBUG - [main:] ~ Attempting to configure HTTPS connection using client configuration (SecureClientUtils$4:221)
2018-07-25 01:49:45,291 INFO - [main:] ~ Unable to configure HTTPS connection from configuration. Leveraging JDK properties. (SecureClientUtils$4:240)
2018-07-25 01:49:45,450 DEBUG - [main:] ~ API https://mydns:21443/api/atlas/v2/types/typedefs?name=Log+Data returned status 200 (AtlasBaseClient:337)
2018-07-25 01:49:45,455 DEBUG - [main:] ~ Calling API [ POST : api/atlas/v2/entity ] <== AtlasEntityWithExtInfo{entity=AtlasEntity{AtlasStruct{typeName='DB', attributes=[owner:John ETL, createTime:1532483385453, name:Sales, description:sales database, locationuri:hdfs://host:8000/apps/warehouse/sales]}guid='-6466195619848', status=null, createdBy='null', updatedBy='null', createTime=null, updateTime=null, version=0, relationshipAttributes=[], classifications=[], },AtlasEntityExtInfo{referredEntities={}}} (AtlasBaseClient:319)
2018-07-25 01:49:45,455 DEBUG - [main:] ~ Attempting to configure HTTPS connection using client configuration (SecureClientUtils$4:221)
2018-07-25 01:49:45,474 INFO - [main:] ~ Unable to configure HTTPS connection from configuration. Leveraging JDK properties. (SecureClientUtils$4:240)
2018-07-25 01:49:33,256 Audit: myuser/10.160.189.35-10.160.189.35 performed request POST https://mydns:21443/api/atlas/v2/types/typedefs (10.160.187.181) at time 2018-07-25T01:49Z
2018-07-25 01:49:45,445 Audit: myuser/10.160.189.35-10.160.189.35 performed request GET https://mydns:21443/api/atlas/v2/types/typedefs?name=Log+Data (10.160.187.181) at time 2018-07-25T01:49Z
2018-07-25 01:49:45,678 Audit: myuser/10.160.189.35-10.160.189.35 performed request POST https://mydns:21443/api/atlas/v2/entity (10.160.187.181) at time 2018-07-25T01:49Z
The 2 topics are returned by this:
$KAFKA_HOME/bin/kafka-topics.sh --list --zookeeper localhost:2181
Atlas's application.log does have this, and I'm not sure why:
2018-07-25 02:18:14,991 DEBUG - [NotificationHookConsumer thread-0:] ~ Give up sending metadata request since no node is available (NetworkClient$DefaultMetadataUpdater:625)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Initialize connection to node -1 for sending metadata request (NetworkClient$DefaultMetadataUpdater:644)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Initiating connection to node -1 at localhost:9027. (NetworkClient:496)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Connection with localhost/127.0.0.1 disconnected (Selector:345)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51)
at org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:73)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:309)
at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:229)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:134)
at java.lang.Thread.run(Thread.java:748)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Node -1 disconnected. (NetworkClient:463)
2018-07-25 02:18:15,018 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Give up sending metadata request since no node is available (NetworkClient$DefaultMetadataUpdater:625)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Initialize connection to node -1 for sending metadata request (NetworkClient$DefaultMetadataUpdater:644)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Initiating connection to node -1 at localhost:9027. (NetworkClient:496)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Connection with localhost/127.0.0.1 disconnected (Selector:345)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51)
at org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:73)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:309)
at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:134)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:183)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:973)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:937)
at org.apache.atlas.kafka.AtlasKafkaConsumer.receive(AtlasKafkaConsumer.java:63)
at org.apache.atlas.kafka.AtlasKafkaConsumer.receive(AtlasKafkaConsumer.java:55)
at org.apache.atlas.notification.NotificationHookConsumer$HookConsumer.doWork(NotificationHookConsumer.java:305)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Node -1 disconnected. (NetworkClient:463)
2018-07-25 02:18:15,092 DEBUG - [NotificationHookConsumer thread-0:] ~ Give up sending metadata request since no node is available (NetworkClient$DefaultMetadataUpdater:625)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Initialize connection to node -1 for sending metadata request (NetworkClient$DefaultMetadataUpdater:644)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Initiating connection to node -1 at localhost:9027. (NetworkClient:496)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Connection with localhost/127.0.0.1 disconnected (Selector:345)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51)
at org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:73)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:309)
at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:229)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:134)
at java.lang.Thread.run(Thread.java:748)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Node -1 disconnected. (NetworkClient:463)
2018-07-25 02:18:15,119 DEBUG - [kafka-producer-network-thread | producer-1:] ~ Give up sending metadata request since no node is available (NetworkClient$DefaultMetadataUpdater:625)
This fixed it!
sed -i 's/atlas.kafka.bootstrap.servers=localhost:9027/atlas.kafka.bootstrap.servers=localhost:9092/' $ATLAS_HOME/conf/atlas-application.properties
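For context, the quickstart's entity creation publishes notifications to the ATLAS_ENTITIES Kafka topic, so pointing atlas.kafka.bootstrap.servers at a port with no broker makes the POST hang until the read timeout shown above. A quick way to double-check the fix (a sketch; 9092 is only the default broker port, use whatever your broker actually listens on):

# show the effective setting Atlas will read
grep '^atlas.kafka.bootstrap.servers' $ATLAS_HOME/conf/atlas-application.properties
# confirm a broker is actually listening on that port
nc -zv localhost 9092
# restart Atlas so the changed property is picked up
$ATLAS_HOME/bin/atlas_stop.py && $ATLAS_HOME/bin/atlas_start.py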

Extracting words from a log file

I am trying to extract job IDs from a log file, and I'm having trouble extracting them in bash. I've tried using sed.
This is what my log file looks like:
> 2018-06-16 02:39:39,331 INFO org.apache.flink.client.cli.CliFrontend - Running 'list' command.
> 2018-06-16 02:39:39,641 INFO org.apache.flink.runtime.rest.RestClient - Rest client endpoint started.
> 2018-06-16 02:39:39,741 INFO org.apache.flink.client.cli.CliFrontend - Waiting for response...
> Waiting for response...
> 2018-06-16 02:39:39,953 INFO org.apache.flink.client.cli.CliFrontend - Successfully retrieved list of jobs
> ------------------ Running/Restarting Jobs -------------------
> 15.06.2018 18:49:44 : 1280dfd7b1de4c74cacf9515f371844b : jETTY HTTP Server -> servlet with content decompress -> pull from collections -> CSV to Avro encode -> Kafka publish (RUNNING)
> 16.06.2018 02:37:07 : aa7a691fa6c3f1ad619b6c0c4425ba1e : jETTY HTTP Server -> servlet with content decompress -> pull from collections -> CSV to Avro encode -> Kafka publish (RUNNING)
> --------------------------------------------------------------
> 2018-06-16 02:39:39,956 INFO org.apache.flink.runtime.rest.RestClient - Shutting down rest endpoint.
> 2018-06-16 02:39:39,957 INFO org.apache.flink.runtime.rest.RestClient - Rest endpoint shutdown complete.
I am using the following code to extract the lines containing the jobId:
extractRestResponse=`cat logFile.txt`
echo "extractRestResponse: "$extractRestResponse
w1="------------------ Running/Restarting Jobs -------------------"
w2="--------------------------------------------------------------"
extractRunningJobs="sed -e 's/.*'"$w1"'\(.*\)'"$w2"'.*/\1/' <<< $extractRestResponse"
runningJobs=`eval $extractRunningJobs`
echo "running jobs :"$runningJobs
However, this doesn't give me any result. I also notice that all newlines are lost when I print the extractRestResponse variable.
I also tried using this command but it doesn't give me any result:
extractRestResponse="sed -n '/"$w1"/,/"$w2"/{//!p}' logFile.txt"
With sed:
sed -n '/^-* Running\/Restarting Jobs -*/,/^--*/{//!p;}' logFile.txt
Explanations:
Input lines are echoed to the standard output by default after commands are applied. The -n flag suppresses this behavior.
/^-* Running\/Restarting Jobs -*/,/^--*/: matches lines from ^-* Running\/Restarting Jobs -* up to ^--* (inclusive)
//!p;: prints lines except those matching the addresses
awk to the rescue!
awk '/^-+$/{f=0} f; /^-+ Running\/Restarting Jobs -+$/{f=1}' logfile
You could improve your original substitution:
sed -e 's/.*'"$w1"'\(.*\)'"$w2"'.*/\1/' <<< $extractRestResponse
by using # as the delimiter:
sed -n "s#.*$w1\(.*\)$w2.*#\1#p" <<< $extractRestResponse
The output is the text between $w1 and $w2:
> 15.06.2018 18:49:44 : 1280dfd7b1de4c74cacf9515f371844b : jETTY HTTP Server -> servlet with content decompress -> pull from collections -> CSV to Avro encode -> Kafka publish (RUNNING)
> 16.06.2018 02:37:07 : aa7a691fa6c3f1ad619b6c0c4425ba1e : jETTY HTTP Server -> servlet with content decompress -> pull from collections -> CSV to Avro encode -> Kafka publish (RUNNING)
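If only the job IDs themselves are needed, one option is to skip the variable round-trip and pull them straight out of the delimited block (a sketch, assuming GNU grep and that the Flink job IDs are always 32 lowercase hex characters, as in the sample log):

# print only the job IDs between the Running/Restarting Jobs markers
sed -n '/^-* Running\/Restarting Jobs -*/,/^--*$/{//!p;}' logFile.txt | grep -oE '[0-9a-f]{32}'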

Tez - DAGAppMaster - java.lang.IllegalArgumentException: Invalid ContainerId

I am trying to launch a MapReduce job, but I get an error while executing the jobs in the shell or in Hive:
hive> select count(*) from employee ;
Query ID = mapr_20171107135114_a574713d-7d69-45e1-aa73-d4de07a3059b
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1510052734193_0005, Tracking URL = http://hdpsrvpre2.intranet.darty.fr:8088/proxy/application_1510052734193_0005/
Kill Command = /opt/mapr/hadoop/hadoop-2.7.0/bin/hadoop job -kill job_1510052734193_0005
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2017-11-07 13:51:25,951 Stage-1 map = 0%, reduce = 0%
Ended Job = job_1510052734193_0005 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: MAPRFS Read: 0 MAPRFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
In the ResourceManager logs, this is what I find:
> 2017-11-07 13:51:25,269 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1510052734193_0005_000002 State change from LAUNCHED to FINAL_SAVING
> 2017-11-07 13:51:25,269 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: Updating info for attempt: appattempt_1510052734193_0005_000002 at: /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMAppRoot/application_1510052734193_0005/appattempt_1510052734193_0005_000002
> 2017-11-07 13:51:25,283 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1510052734193_0005_000002
> 2017-11-07 13:51:25,283 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished, removing password for appattempt_1510052734193_0005_000002
> 2017-11-07 13:51:25,283 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1510052734193_0005_000002 State change from FINAL_SAVING to FAILED
> 2017-11-07 13:51:25,284 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of failed attempts is 2. The max attempts is 2
> 2017-11-07 13:51:25,284 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1510052734193_0005 with final state: FAILED
> 2017-11-07 13:51:25,284 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1510052734193_0005 State change from ACCEPTED to FINAL_SAVING
> 2017-11-07 13:51:25,284 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating info for app: application_1510052734193_0005
> 2017-11-07 13:51:25,284 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application appattempt_1510052734193_0005_000002 is done. finalState=FAILED
> 2017-11-07 13:51:25,284 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: Updating info for app: application_1510052734193_0005 at: /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMAppRoot/application_1510052734193_0005/application_1510052734193_0005
> 2017-11-07 13:51:25,284 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1510052734193_0005 requests cleared
> 2017-11-07 13:51:25,296 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1510052734193_0005 failed 2 times due to AM Container for appattempt_1510052734193_0005_000002 exited with exitCode: 1
> For more detailed output, check application tracking page: http://hdpsrvpre2.intranet.darty.fr:8088/cluster/app/application_1510052734193_0005 Then, click on links to logs of each attempt.
> Diagnostics: Exception from container-launch.
> Container id: container_e10_1510052734193_0005_02_000001
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
> at org.apache.hadoop.util.Shell.run(Shell.java:456)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
> at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:304)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:354)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:87)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
> at java.lang.Thread.run(Thread.java:748)
> Shell output: main : command provided 1 main : user is mapr main : requested yarn user is mapr
>
> Container exited with a non-zero exit code 1. Failing this attempt. Failing the application.
Also, in the syslog of the jobs I find:
2017-11-07 12:09:46,419 FATAL [main] app.DAGAppMaster: Error starting DAGAppMaster
java.lang.IllegalArgumentException: Invalid ContainerId: container_e10_1510052734193_0001_01_000001
at org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:182)
at org.apache.tez.dag.app.DAGAppMaster.main(DAGAppMaster.java:1794)
Caused by: java.lang.NumberFormatException: For input string: "e10"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:441)
at java.lang.Long.parseLong(Long.java:483)
at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationAttemptId(ConverterUtils.java:137)
at org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:177)
... 1 more
It seems that Tez is causing the issue. Is there any solution for this?
Thank you !
The "e10" in container_e10_1510052734193_0001_01_000001 is the container ID epoch that newer YARN releases prepend; a ConverterUtils class coming from an older Hadoop version (or from code built against one) cannot parse it, which is exactly the NumberFormatException above. So I think the execution environment has different versions of Hadoop and their respective jar files mixed together.
Please verify the environment, make sure only the required version is used, and remove references to the other versions from your environment variables.
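One way to spot such a mismatch (a sketch; the paths are illustrative and $TEZ_HOME is an assumption about your layout):

# version the hadoop CLI itself resolves to
hadoop version
# every jar on the effective classpath; look for more than one hadoop-yarn/hadoop-common version
hadoop classpath | tr ':' '\n' | sort -u | grep -Ei 'hadoop|tez'
# jars bundled with the Tez install itself
ls ${TEZ_HOME:-/opt/tez}/lib/*.jar 2>/dev/null
# environment variables that can drag in stale jars
env | grep -Ei 'hadoop|tez|yarn|classpath'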

Extract logs between time frame with non specific time stamp using bash

I want to fetch the log between two timestamps, but I do not have the exact timestamps with me. If a specific timestamp is present in the log, I can fetch with sed using the following command:
sed -rne "/$StartTime/,/$EndTime/p" <filename>
My problem is that the specific StartTime and EndTime I fetch from my DB might not be present in the log file, so I have to fetch the log between the times nearest to the StartTime and EndTime I provide, using >= and <= comparisons. I tried the following command, but it does not work:
awk '$0>=st && $0<=et' st=$StartTime et=$EndTime <filename>
Sample input and output
Input
Time retrieved from DB
StartTime - 2017-11-02 10:20:00
EndTime - 2017-11-02 11:20:00
The time present in log
T1 - 2017-11-02 10:17:44
T2 - 2017-11-02 11:19:32
Output: Entire Log text between T1 & T2
Sample Log
2017-03-03 10:43:18,736 [main] WARN - ORACLE_HOSTNAME=xxxxxxxxxx [OVERRIDES: xxxxxxxxxxxxxxxx]
2017-03-03 10:43:18,736 [main] WARN - NLS_DATE_FORMAT=DD-MON-YYYY HH24:MI:SS [OVERRIDES: DD-MON-YYYY HH24:MI:SS]
2017-03-03 10:43:18,736 [main] WARN - xxxxUsername=MDMPIUSER [OVERRIDES: MDMPIUSER]
2017-03-03 10:43:18,736 [main] WARN - BUNDLE_GEMFILE=uri:classloader://installer/Gemfile [OVERRIDES: uri:classloader://installer/Gemfile]
2017-03-03 10:43:18,736 [main] WARN - TIMEOUT=900 [OVERRIDES: 900]
2017-03-03 10:43:18,736 [main] WARN - SHLVL=4 [OVERRIDES: 4]
2017-03-03 10:43:18,736 [main] WARN - HISTSIZE=1000 [OVERRIDES: 1000]
2017-03-03 10:43:18,736 [main] WARN - JAVA_HOME=/usr/java/jdk1.8.0_60/jre [OVERRIDES: /usr/java/jdk1.8.0_60/jre]
2017-03-03 10:43:20,156 [main] WARN - APP_PROPS=/home/xxx/conf/appProperties [OVERRIDES: /home/xxx/conf/appProperties]
You can try:
awk -v start="$StartTime" -v end="$EndTime" '
    # strip "-", ",", spaces and ":" so timestamps compare as plain digit strings
    function fonct(date)
    {
        gsub(/-|,| |:/, "", date)
        return date
    }
    BEGIN {
        start = fonct(start)
        end = fonct(end)
    }
    {
        # $1 is the date and $2 the time of the current log line
        a = fonct($1 $2)
        if (a >= start && a <= end) print $0
    }' infile
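A minimal usage sketch, assuming the timestamps come back from the DB in the same "YYYY-MM-DD HH:MM:SS" form shown above (the variable values and the logfile.txt name are illustrative):

StartTime="2017-11-02 10:20:00"   # value retrieved from the DB
EndTime="2017-11-02 11:20:00"
awk -v start="$StartTime" -v end="$EndTime" '
    function fonct(date) { gsub(/-|,| |:/, "", date); return date }
    BEGIN { start = fonct(start); end = fonct(end) }
    { a = fonct($1 $2); if (a >= start && a <= end) print }
' logfile.txt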

Pipe Broken exception every time when I run Mahout samples at EC2 server

I've installed Mahout on the Bitnami AMI ami-02fb006b (as well as on several other AMIs, otherwise I wouldn't be asking the question),
according to instructions provided here
and here:
I'm always getting stuck when trying to run ./examples/bin/build-reuters.sh
Here's the output of the command:
> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. lda clustering
> Enter your choice : 1
> ok. You chose 1 and we'll use kmeans Clustering
> Downloading Reuters-21578
> % Total % Received % Xferd Average Speed Time Time Time Current
> Dload Upload Total Spent Left Speed
> 100 7959k 100 7959k 0 0 294k 0 0:00:26 0:00:26 --:--:-- 305k
> Extracting...
> Running on hadoop, using HADOOP_HOME=/usr/local/hadoop-0.20.2
> HADOOP_CONF_DIR=/usr/local/hadoop-0.20.2/conf
> MAHOUT-JOB: /usr/local/mahout-0.4/examples/target/mahout-examples-0.6-SNAPSHOT-job.jar
> 11/08/16 20:10:25 WARN driver.MahoutDriver: No org.apache.lucene.benchmark.utils.ExtractReuters.props found on classpath, will use command-line arguments only
> Deleting all files in mahout-work/reuters-out-tmp
> 11/08/16 20:10:30 INFO driver.MahoutDriver: Program took 4906 ms
> MAHOUT_LOCAL is set, running locally
> CLASSPATH:
> :/usr/local/mahout-0.4/src/conf:/usr/local/hadoop-0.20.2/conf:/usr/lib/jvm/java-6-openjdk//lib/tools.jar:/usr/local/mahout-0.4/mahout-*.jar:/usr/local/mahout-0.4/examples/target/mahout-examples-0.6-SNAPSHOT-job.jar:/usr/local/mahout-0.4/mahout-examples-*-job.jar:/usr/local/mahout-0.4/lib/*.jar:/usr/local/mahout-0.4/examples/target/dependency/antlr-2.7.7.jar:/usr/local/mahout-0.4/examples/target/dependency/antlr-3.2.jar:/usr/local/mahout-0.4/examples/target/dependency/antlr-runtime-3.2.jar:/usr/local/mahout-0.4/examples/target/dependency/avro-1.4.0-cassandra-1.jar:/usr/local/mahout-0.4/examples/target/dependency/bson-2.5.jar:/usr/local/mahout-0.4/examples/target/dependency/cassandra-all-0.8.1.jar:/usr/local/mahout-0.4/examples/target/dependency/cassandra-thrift-0.8.1.jar:/usr/local/mahout-0.4/examples/target/dependency/cglib-nodep-2.2.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-beanutils-1.7.0.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-beanutils-core-1.8.0.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-cli-1.2.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-cli-2.0-mahout.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-codec-1.4.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-collections-3.2.1.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-compress-1.1.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-configuration-1.6.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-dbcp-1.4.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-digester-1.7.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-httpclient-3.0.1.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-lang-2.6.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-logging-1.1.1.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-math-2.2.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-pool-1.5.6.jar:/usr/local/mahout-0.4/examples/target/dependency/concurrentlinkedhashmap-lru-1.1.jar:/usr/local/mahout-0.4/examples/target/dependency/easymock-3.0.jar:/usr/local/mahout-0.4/examples/target/dependency/google-collections-1.0-rc2.jar:/usr/local/mahout-0.4/examples/target/dependency/guava-r09.jar:/usr/local/mahout-0.4/examples/target/dependency/hadoop-core-0.20.203.0.jar:/usr/local/mahout-0.4/examples/target/dependency/hector-core-0.8.0-2.jar:/usr/local/mahout-0.4/examples/target/dependency/high-scale-lib-1.1.2.jar:/usr/local/mahout-0.4/examples/target/dependency/httpclient-4.0.1.jar:/usr/local/mahout-0.4/examples/target/dependency/httpcore-4.0.1.jar:/usr/local/mahout-0.4/examples/target/dependency/jackson-core-asl-1.8.2.jar:/usr/local/mahout-0.4/examples/target/dependency/jackson-mapper-asl-1.8.2.jar:/usr/local/mahout-0.4/examples/target/dependency/jakarta-regexp-1.4.jar:/usr/local/mahout-0.4/examples/target/dependency/jamm-0.2.2.jar:/usr/local/mahout-0.4/examples/target/dependency/jcommon-1.0.12.jar:/usr/local/mahout-0.4/examples/target/dependency/jetty-6.1.22.jar:/usr/local/mahout-0.4/examples/target/dependency/jetty-util-6.1.22.jar:/usr/local/mahout-0.4/examples/target/dependency/jfreechart-1.0.13.jar:/usr/local/mahout-0.4/examples/target/dependency/jline-0.9.94.jar:/usr/local/mahout-0.4/examples/target/dependency/json-simple-1.1.jar:/usr/local/mahout-0.4/examples/target/dependency/jul-to-slf4j-1.6.1.jar:/usr/local/mahout-0.4/examples/target/dependency/junit-4.8.2.jar:/usr/local/mahout-0.4/examples/target/dependency/libthrift-0.6.1.jar:/usr/local/mahout-0.4/exa
mples/target/dependency/log4j-1.2.16.jar:/usr/local/mahout-0.4/examples/target/dependency/lucene-analyzers-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/lucene-benchmark-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/lucene-core-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/lucene-highlighter-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/lucene-memory-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/lucene-queries-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/lucene-xercesImpl-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/mahout-collections-1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/mahout-core-0.6-SNAPSHOT.jar:/usr/local/mahout-0.4/examples/target/dependency/mahout-core-0.6-SNAPSHOT-tests.jar:/usr/local/mahout-0.4/examples/target/dependency/mahout-integration-0.6-SNAPSHOT.jar:/usr/local/mahout-0.4/examples/target/dependency/mahout-math-0.6-SNAPSHOT.jar:/usr/local/mahout-0.4/examples/target/dependency/mahout-math-0.6-SNAPSHOT-tests.jar:/usr/local/mahout-0.4/examples/target/dependency/mongo-java-driver-2.5.jar:/usr/local/mahout-0.4/examples/target/dependency/objenesis-1.2.jar:/usr/local/mahout-0.4/examples/target/dependency/servlet-api-2.5-20081211.jar:/usr/local/mahout-0.4/examples/target/dependency/servlet-api-2.5.jar:/usr/local/mahout-0.4/examples/target/dependency/slf4j-api-1.6.1.jar:/usr/local/mahout-0.4/examples/target/dependency/slf4j-jcl-1.6.1.jar:/usr/local/mahout-0.4/examples/target/dependency/slf4j-log4j12-1.6.1.jar:/usr/local/mahout-0.4/examples/target/dependency/snakeyaml-1.6.jar:/usr/local/mahout-0.4/examples/target/dependency/solr-commons-csv-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/speed4j-0.9.jar:/usr/local/mahout-0.4/examples/target/dependency/stringtemplate-3.2.jar:/usr/local/mahout-0.4/examples/target/dependency/uncommons-maths-1.2.2.jar:/usr/local/mahout-0.4/examples/target/dependency/uuid-3.2.0.jar:/usr/local/mahout-0.4/examples/target/dependency/watchmaker-framework-0.6.2.jar:/usr/local/mahout-0.4/examples/target/dependency/watchmaker-swing-0.6.2.jar:/usr/local/mahout-0.4/examples/target/dependency/xml-apis-1.0.b2.jar:/usr/local/mahout-0.4/examples/target/dependency/xpp3_min-1.1.4c.jar:/usr/local/mahout-0.4/examples/target/dependency/xstream-1.3.1.jar
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/local/mahout-0.4/examples/target/mahout-examples-0.6-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/local/mahout-0.4/examples/target/dependency/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/local/mahout-0.4/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
> 11/08/16 20:10:32 INFO common.AbstractJob: Command line arguments: {--charset=UTF-8, --chunkSize=5, --endPhase=2147483647, --fileFilterClass=org.apache.mahout.text.PrefixAdditionFilter, --input=mahout-work/reuters-out, --keyPrefix=, --output=mahout-work/reuters-out-seqdir, --startPhase=0, --tempDir=temp}
> Exception in thread "main" java.io.IOException: Call to localhost/127.0.0.1:9000 failed on local exception: java.io.IOException: Broken pipe
> at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
> at org.apache.hadoop.ipc.Client.call(Client.java:1033)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
> at $Proxy1.getProtocolVersion(Unknown Source)
> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:364)
> at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:208)
> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:175)
> at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1310)
> at org.apache.hadoop.fs.FileSystem.access$100(FileSystem.java:65)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1328)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:226)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:109)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:210)
> at org.apache.mahout.text.SequenceFilesFromDirectory.run(SequenceFilesFromDirectory.java:59)
> at org.apache.mahout.text.SequenceFilesFromDirectory.run(SequenceFilesFromDirectory.java:110)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:85)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:616)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
> Caused by: java.io.IOException: Broken pipe
> at sun.nio.ch.FileDispatcher.write0(Native Method)
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:122)
> at sun.nio.ch.IOUtil.write(IOUtil.java:93)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:352)
> at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:55)
> at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
> at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
> at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
> at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> at java.io.DataOutputStream.flush(DataOutputStream.java:123)
> at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:746)
> at org.apache.hadoop.ipc.Client.call(Client.java:1011)
> ... 25 more
> rmr: cannot remove mahout-work/reuters-out-seqdir: No such file or directory.
> put: File mahout-work/reuters-out-seqdir does not exist.
This is a consistent error, and I am getting it in every single installation I attempt.
What do I do to fix this?
This looks like an error from Hadoop and/or EC2. Hadoop workers failed writing data to each other for some reason. Why, I don't know, but I might guess that ports aren't open, even locally.
I have always used Amazon EMR directly instead.
Maybe you can debug by trying another M/R job to test. It is not related to Mahout directly as far as I can tell.
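If you want to narrow it down, a few generic checks may help (a sketch, assuming the single-node Hadoop 0.20.2 setup the script reports, with the namenode expected on localhost:9000):

# are the Hadoop daemons (NameNode, DataNode, JobTracker, ...) actually running?
jps
# is anything listening on the HDFS RPC port the exception mentions?
netstat -tln | grep 9000
# does a trivial HDFS operation work outside of Mahout?
/usr/local/hadoop-0.20.2/bin/hadoop fs -ls /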
