Unexpected exception when using Dataflow with Elasticsearch

I'm using Dataflow with these steps:
- Read from BigQuery
- Convert each table row to a JSON string
- Insert into Elasticsearch (7.5.2)
It works fine with ~100k records, but on the real data (8M records, ~65 GB) Dataflow throws an exception after roughly 300k insertions.
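For context, here is a minimal sketch of a pipeline with those three steps; the project, table, host, and index names are placeholders (not the actual job), and the TableRow-to-JSON step is simplified:

import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.TypeDescriptors;

public class BigQueryToElasticsearch {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(
        PipelineOptionsFactory.fromArgs(args).withValidation().create());

    p.apply("ReadFromBigQuery",
            BigQueryIO.readTableRows().from("my-project:my_dataset.my_table"))
     .apply("TableRowToJson",
            MapElements.into(TypeDescriptors.strings())
                // toString() is a stand-in here; a real job would use a proper JSON serializer
                .via((TableRow row) -> row.toString()))
     .apply("WriteToElasticsearch",
            ElasticsearchIO.write().withConnectionConfiguration(
                ElasticsearchIO.ConnectionConfiguration.create(
                    new String[] {"http://elasticsearch-host:9200"}, "my-index", "_doc")));

    p.run();
  }
}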
Error message from worker:
java.lang.RuntimeException: unexpected
    at org.apache.beam.runners.dataflow.worker.util.common.worker.CachingShuffleBatchReader.read(CachingShuffleBatchReader.java:104)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.BatchingShuffleEntryReader$ShuffleReadIterator.fillEntries(BatchingShuffleEntryReader.java:125)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.BatchingShuffleEntryReader$ShuffleReadIterator.fillEntriesIfNeeded(BatchingShuffleEntryReader.java:119)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.BatchingShuffleEntryReader$ShuffleReadIterator.hasNext(BatchingShuffleEntryReader.java:84)
    at org.apache.beam.runners.dataflow.worker.util.common.ForwardingReiterator.hasNext(ForwardingReiterator.java:63)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.GroupingShuffleEntryIterator.advance(GroupingShuffleEntryIterator.java:109)
    at org.apache.beam.runners.dataflow.worker.GroupingShuffleReader$GroupingShuffleReaderIterator.advance(GroupingShuffleReader.java:272)
    at org.apache.beam.runners.dataflow.worker.GroupingShuffleReader$GroupingShuffleReaderIterator.start(GroupingShuffleReader.java:266)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation$SynchronizedReaderIterator.start(ReadOperation.java:361)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:194)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:159)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
    at org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:411)
    at org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:380)
    at org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:305)
    at org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:140)
    at org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:120)
    at org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:107)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: DEADLINE_EXCEEDED: (g)RPC timed out when customerdataworker-khanhn-02170116-o318-harness-8pqr talking to customerdataworker-khanhn-02170116-o318-harness-f8zt:12346. Server unresponsive (ping error: Deadline Exceeded, {"created":"#1581934551.886578453","description":"Deadline Exceeded","file":"third_party/grpc/src/core/ext/filters/deadline/deadline_filter.cc","file_line":69,"grpc_status":4}). Typically one can self manage this issue, please read: https://cloud.google.com/dataflow/docs/guides/common-errors#tsg-rpc-timeout
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:531)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:492)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:83)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:196)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2312)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2278)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.CachingShuffleBatchReader.read(CachingShuffleBatchReader.java:101)
    ... 21 more
Caused by: java.io.IOException: DEADLINE_EXCEEDED: (g)RPC timed out when customerdataworker-khanhn-02170116-o318-harness-8pqr talking to customerdataworker-khanhn-02170116-o318-harness-f8zt:12346. Server unresponsive (ping error: Deadline Exceeded, {"created":"#1581934551.886578453","description":"Deadline Exceeded","file":"third_party/grpc/src/core/ext/filters/deadline/deadline_filter.cc","file_line":69,"grpc_status":4}). Typically one can self manage this issue, please read: https://cloud.google.com/dataflow/docs/guides/common-errors#tsg-rpc-timeout
    at org.apache.beam.runners.dataflow.worker.ApplianceShuffleReader.readIncludingPosition(Native Method)
    at org.apache.beam.runners.dataflow.worker.ChunkingShuffleBatchReader.read(ChunkingShuffleBatchReader.java:58)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.CachingShuffleBatchReader$1.load(CachingShuffleBatchReader.java:70)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.CachingShuffleBatchReader$1.load(CachingShuffleBatchReader.java:66)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
    ... 27 more
I also applied the recommendations from https://cloud.google.com/dataflow/docs/guides/common-errors#tsg-rpc-timeout (increasing the disk size and the number of workers), but it still doesn't work.
My current configs:
--runner=DataflowRunner \
--numWorkers=10 \
--maxNumWorkers=20 \
--diskSizeGb=150 \
--workerMachineType=n1-standard-1 \
--region=asia-east1
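For cross-checking, the same settings can also be set programmatically; a sketch, assuming the standard Dataflow runner options interface:

import org.apache.beam.runners.dataflow.DataflowRunner;
import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class JobOptions {
  public static void main(String[] args) {
    // Programmatic equivalent of the command-line flags above
    DataflowPipelineOptions options =
        PipelineOptionsFactory.as(DataflowPipelineOptions.class);
    options.setRunner(DataflowRunner.class);
    options.setNumWorkers(10);
    options.setMaxNumWorkers(20);
    options.setDiskSizeGb(150);
    options.setWorkerMachineType("n1-standard-1");
    options.setRegion("asia-east1");
  }
}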
Update 1:
Log on Stackdriver: (screenshot)

Thanks for the replies, I have solved the problem. The root cause is that the VM instances created by Dataflow are subject to firewall rules; you have to allow traffic between the worker VMs themselves, in particular on TCP ports 12345 and 12346 (the :12346 in the stack trace is exactly this worker-to-worker shuffle connection).

Related

Worker is getting restarted continuously with ClosedChannelException in supervisor

A worker in one of the supervisors is getting restarted continuously with a ClosedChannelException. If I run the same topology in another Storm cluster in a different environment, it runs without any errors.
Below is the error I can see in the Storm UI.
java.lang.RuntimeException: java.nio.channels.ClosedChannelException
    at org.apache.storm.kafka.ZkCoordinator.refresh(ZkCoordinator.java:103)
    at org.apache.storm.kafka.ZkCoordinator.getMyManagedPartitions(ZkCoordinator.java:69)
    at org.apache.storm.kafka.KafkaSpout.nextTuple(KafkaSpout.java:129)
    at org.apache.storm.daemon.executor$fn__7990$fn__8005$fn__8036.invoke(executor.clj:648)
    at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:484)
    at clojure.lang.AFn.run(AFn.java:22)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.ClosedChannelException
    at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
    at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:78)
    at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:68)
    at kafka.consumer.SimpleConsumer.getOffsetsBefore(SimpleConsumer.scala:127)
    at kafka.javaapi.consumer.SimpleConsumer.getOffsetsBefore(SimpleConsumer.scala:79)
    at org.apache.storm.kafka.KafkaUtils.getOffset(KafkaUtils.java:75)
    at org.apache.storm.kafka.KafkaUtils.getOffset(KafkaUtils.java:65)
    at org.apache.storm.kafka.PartitionManager.<init>(PartitionManager.java:94)
    at org.apache.storm.kafka.ZkCoordinator.refresh(ZkCoordinator.java:98)
    ... 6 more
Can anyone please help me find out the exact issue? Please let me know if you need any more information.
I faced this issue, and the problem was that the ZooKeeper host names were not being resolved from the worker host.
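If you suspect the same cause, one quick check is whether the ZooKeeper host names resolve from the worker host at all; a small sketch (the host names below are placeholders for whatever your storm.yaml lists as ZooKeeper servers):

import java.net.InetAddress;
import java.net.UnknownHostException;

public class ZkHostCheck {
  public static void main(String[] args) {
    // Hypothetical ZooKeeper hostnames; replace with the ones from your storm.yaml
    String[] zkHosts = {"zk1.example.com", "zk2.example.com", "zk3.example.com"};
    for (String host : zkHosts) {
      try {
        InetAddress addr = InetAddress.getByName(host);
        System.out.println(host + " resolves to " + addr.getHostAddress());
      } catch (UnknownHostException e) {
        System.out.println(host + " does NOT resolve from this worker: " + e.getMessage());
      }
    }
  }
}

Run it (or simply ping/nslookup the same names) on the supervisor host that keeps restarting the worker.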

Spark times out, maybe due to binaryFiles() with more than 1 million files in HDFS

I am reading millions of XML files via
val xmls = sc.binaryFiles(xmlDir)
The operation runs fine locally, but on YARN it fails with:
client token: N/A
diagnostics: Application application_1433491939773_0012 failed 2 times due to ApplicationMaster for attempt appattempt_1433491939773_0012_000002 timed out. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1433750951883
final status: FAILED
tracking URL: http://controller01:8088/cluster/app/application_1433491939773_0012
user: ariskk
Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:622)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
In Hadoop's userlogs I am frequently getting these messages:
15/06/08 09:15:38 WARN util.AkkaUtils: Error sending message [message = Heartbeat(1,[Lscala.Tuple2;#2b4f336b,BlockManagerId(1, controller01.stratified, 58510))] in 2 attempts
java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.util.AkkaUtils$.askWithReply(AkkaUtils.scala:195)
at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:427)
I run my Spark job via spark-submit, and it works for another HDFS directory that contains only 37k files. Any ideas how to resolve this?
OK, after getting some help on the Spark mailing list, I found out there were two issues:
The source directory: if it is given as /my_dir/, it makes Spark fail and creates the heartbeat issues. Instead it should be given as hdfs:///my_dir/*.
An out-of-memory error appears in the logs after fixing #1. This is the Spark driver running on YARN running out of memory due to the number of files (apparently it keeps all file info in memory). So I spark-submitted the job with --conf spark.driver.memory=8g, which fixed the issue.
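For illustration, a minimal sketch combining both fixes (shown with the Java API; the path is the hdfs:///my_dir/* form from above, and driver memory still has to go on the spark-submit command line):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.input.PortableDataStream;

public class ReadManyXmlFiles {
  public static void main(String[] args) {
    JavaSparkContext sc =
        new JavaSparkContext(new SparkConf().setAppName("read-xml-binaries"));

    // Fix #1: use a fully qualified HDFS glob rather than a bare /my_dir/ path
    JavaPairRDD<String, PortableDataStream> xmls = sc.binaryFiles("hdfs:///my_dir/*");
    System.out.println("files: " + xmls.count());

    sc.stop();
    // Fix #2 stays on the launch command, since the driver JVM is already
    // running by the time this code executes:
    //   spark-submit --conf spark.driver.memory=8g ...
  }
}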

SEVERE error writing to S3 backup

I'm running OpsCenter 5.1.1 with DataStax Enterprise 4.5.1. It's a 3-node cluster on AWS, and I'm backing up to S3 (still...). I've started seeing a new error; I think it's a different error than any I've posted before.
$ cqlsh
Connected to Test Cluster at localhost:9160.
[cqlsh 4.1.1 | Cassandra 2.0.8.39 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
I am seeing this error in the agent.log file:
node1_agent.log: SEVERE: error after writing 15736832/16777216 bytes to https://cassandra-dev-bkup.s3.amazonaws.com/snapshots/407bb4b1-5c91-43fe-9d4f-767115668037/sstables/1430904167-reporting_test-transaction_lookup-jb-288-Index.db?partNumber=2&uploadId=.MA3X4RYssg7xL_Hr7Msgze.J4exDq9zZ_0Y7qEj9gZhJ570j73kZNr5_nbxactmPMJeKf0XyZfEC0KAplWOz9lpyRCtNeeDCvCmtEXDchH8F1J2c57aq4MrxfBcyiZr
java.io.IOException: Error writing request body to server
at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3192)
at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3175)
at com.google.common.io.CountingOutputStream.write(CountingOutputStream.java:53)
at com.google.common.io.ByteStreams.copy(ByteStreams.java:179)
at org.jclouds.http.internal.JavaUrlHttpCommandExecutorService.writePayloadToConnection(JavaUrlHttpCommandExecutorService.java:308)
at org.jclouds.http.internal.JavaUrlHttpCommandExecutorService.convert(JavaUrlHttpCommandExecutorService.java:192)
at org.jclouds.http.internal.JavaUrlHttpCommandExecutorService.convert(JavaUrlHttpCommandExecutorService.java:72)
at org.jclouds.http.internal.BaseHttpCommandExecutorService.invoke(BaseHttpCommandExecutorService.java:95)
at org.jclouds.rest.internal.InvokeSyncToAsyncHttpMethod.invoke(InvokeSyncToAsyncHttpMethod.java:128)
at org.jclouds.rest.internal.InvokeSyncToAsyncHttpMethod.apply(InvokeSyncToAsyncHttpMethod.java:94)
at org.jclouds.rest.internal.InvokeSyncToAsyncHttpMethod.apply(InvokeSyncToAsyncHttpMethod.java:55)
at org.jclouds.rest.internal.DelegatesToInvocationFunction.handle(DelegatesToInvocationFunction.java:156)
at org.jclouds.rest.internal.DelegatesToInvocationFunction.invoke(DelegatesToInvocationFunction.java:123)
at com.sun.proxy.$Proxy48.uploadPart(Unknown Source)
at org.jclouds.aws.s3.blobstore.strategy.internal.SequentialMultipartUploadStrategy.prepareUploadPart(SequentialMultipartUploadStrategy.java:111)
at org.jclouds.aws.s3.blobstore.strategy.internal.SequentialMultipartUploadStrategy.execute(SequentialMultipartUploadStrategy.java:93)
at org.jclouds.aws.s3.blobstore.AWSS3BlobStore.putBlob(AWSS3BlobStore.java:89)
at org.jclouds.blobstore2$put_blob.doInvoke(blobstore2.clj:246)
at clojure.lang.RestFn.invoke(RestFn.java:494)
at opsagent.backups.destinations$create_blob$fn__12007.invoke(destinations.clj:69)
at opsagent.backups.destinations$create_blob.invoke(destinations.clj:64)
at opsagent.backups.destinations$fn__12170.invoke(destinations.clj:192)
at opsagent.backups.destinations$fn__11799$G__11792__11810.invoke(destinations.clj:24)
at opsagent.backups.staging$start_staging_BANG_$fn__12338$state_machine__7576__auto____12339$fn__12344$fn__12375.invoke(staging.clj:61)
at opsagent.backups.staging$start_staging_BANG_$fn__12338$state_machine__7576__auto____12339$fn__12344.invoke(staging.clj:59)
at opsagent.backups.staging$start_staging_BANG_$fn__12338$state_machine__7576__auto____12339.invoke(staging.clj:56)
at clojure.core.async.impl.ioc_macros$run_state_machine.invoke(ioc_macros.clj:940)
at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invoke(ioc_macros.clj:944)
at clojure.core.async.impl.ioc_macros$take_BANG_$fn__7592.invoke(ioc_macros.clj:953)
at clojure.core.async.impl.channels.ManyToManyChannel$fn__4097.invoke(channels.clj:102)
at clojure.lang.AFn.run(AFn.java:24)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
TL;DR -
Your SSTable, which is 38866048 bytes, is both on your filesystem and on S3. This means the file has transferred over and you are in good shape. There is no need to worry about this error (though I opened an internal ticket to handle this kind of exception rather than throw a dump).
Details - a summary of what I suspect happened:
1) There was a file transfer error when you reached 15736832 out of the 16777216-byte slice of the sstable.
2) At this point OpsCenter did not finish transferring the table or leave a partial version in S3.
3) Another backup attempt later on moved the sstable with no error, and a valid backup exists.
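If you want to double-check that the sstable really landed in S3 intact, one option is to compare the object size with the local file; a sketch using the AWS SDK for Java (the bucket, key, and local path below are assumptions read off the URL in the log, not values OpsCenter exposes):

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import java.io.File;

public class VerifyBackupSize {
  public static void main(String[] args) {
    AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

    // Illustrative values derived from the URL in the agent log
    String bucket = "cassandra-dev-bkup";
    String key = "snapshots/407bb4b1-5c91-43fe-9d4f-767115668037/sstables/"
               + "1430904167-reporting_test-transaction_lookup-jb-288-Index.db";
    long remote = s3.getObjectMetadata(bucket, key).getContentLength();

    // Hypothetical local path to the same sstable component
    long local = new File("/path/to/local/...-Index.db").length();

    System.out.println("s3=" + remote + " local=" + local
        + (remote == local ? " (match)" : " (mismatch)"));
  }
}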

Hadoop Balancer fails with - IOException: Couldn't set up IO streams (LeaseRenewer Warning)

I am stumbling across this error while running the Hadoop balancer from the NameNode. Any tips on cracking this? The process is also blocking the current user and giving an out-of-memory error on issuing any other command.
14/05/09 11:30:05 WARN hdfs.LeaseRenewer: Failed to renew lease for [DFSClient_NONMAPREDUCE_-77290934_1] for 936 seconds. Will retry shortly ...
java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "hadoop01.xx.xx.xx.xx.com/30.0.1.176"; destination host is: "hadoop01.xx.xx.xx.xx.com":8022;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:763)
at org.apache.hadoop.ipc.Client.call(Client.java:1242)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at $Proxy10.renewLease(Unknown Source)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at $Proxy10.renewLease(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:458)
at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:649)
at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417)
at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442)
at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Couldn't set up IO streams
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:671)
at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:252)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1291)
at org.apache.hadoop.ipc.Client.call(Client.java:1209)
... 15 more
Caused by: java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:640)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:664)
... 18 more
Once the number of threads created by Hadoop RPC reaches the node's process limit (ulimit -u), Java reports it as an out-of-memory error.
Try increasing the maximum number of processes allowed, i.e. your ulimit -u value.
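On Linux you can confirm what the current cap is from inside a JVM before raising it; a small sketch (the "Max processes" row of /proc/self/limits mirrors ulimit -u):

import java.nio.file.Files;
import java.nio.file.Paths;

public class MaxProcessesCheck {
  public static void main(String[] args) throws Exception {
    // Linux-only: /proc/self/limits lists per-process limits, including
    // "Max processes", which is what `ulimit -u` reports for this user.
    Files.lines(Paths.get("/proc/self/limits"))
         .filter(line -> line.startsWith("Max processes"))
         .forEach(System.out::println);
  }
}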

Cassandra upgrade 0.8.2 -> 0.8.4 gets error "failed connecting to all endpoints"

After upgrading Cassandra from 0.8.2 to 0.8.4, I got this error.
I have restarted Cassandra, removed data, etc.; nothing helps.
I have 6 identical machines in the cloud; before the upgrade it was working fine.
If I run netstat, it shows port 9160 listening.
nodetool ... ring responds with all 6 machines UP.
What could be the problem? :(
Exception in thread "main" java.io.IOException: Could not get input splits
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:157)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
at WordCount.run(Unknown Source)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at WordCount.main(Unknown Source)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints slave1/98.188.69.242
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:153)
... 7 more
Caused by: java.io.IOException: failed connecting to all endpoints slave1/98.188.69.242
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:234)
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:70)
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:190)
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:175)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
I don't know whether you have solved the problem. I met the same problem (with the same configuration as yours) and tried to solve it.
Problem location:
public List call() throws Exception {
    ...
    List tokens = getSubSplits(keyspace, cfName, range, conf);
    ...
}
In the method getSubSplits, when the method
createConnection(host, ConfigHelper.getRpcPort(conf), true)
is called, the format of host is not right: it is sometimes hostname/10.197.34.111 (host name plus IP address), so createConnection fails. We need to extract the IP address and then call createConnection.
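A hypothetical helper showing that clean-up; only the extraction of the IP address is new, the createConnection call is the existing one from ColumnFamilyInputFormat:

public final class HostAddressUtil {

  /** Strips a "hostname/" prefix, e.g. "slave1/98.188.69.242" -> "98.188.69.242". */
  public static String extractIpAddress(String host) {
    int slash = host.indexOf('/');
    return slash >= 0 ? host.substring(slash + 1) : host;
  }

  public static void main(String[] args) {
    System.out.println(extractIpAddress("slave1/98.188.69.242")); // 98.188.69.242
    System.out.println(extractIpAddress("98.188.69.242"));        // unchanged
  }
}

Inside getSubSplits you would then pass extractIpAddress(host) to createConnection(..., ConfigHelper.getRpcPort(conf), true) instead of the raw host string.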
You can try to change the code and try Hadoop again.
Good luck!
