spring-xd stream closed - python module - spring-xd

Whenever I try to use a Python module in Spring XD, I get the error below:
stream create pytest --definition "time | shell --command='python /home/Ubuntu/xd/echo2.py' --encoder=LF | log" --deploy
2016-03-21T19:23:38+0000 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Path cache event: path=/deployments/modules/allocated/ed02510e-f8b3-4f53-9848-e2268fbbade1/pytest.processor.shell.1, type=CHILD_ADDED
2016-03-21T19:23:38+0000 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module 'shell' for stream 'pytest'
2016-03-21T19:23:38+0000 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module [ModuleDescriptor#1c30c1dd moduleName = 'shell', moduleLabel = 'shell', group = 'pytest', sourceChannelName = [null], sinkChannelName = [null], index = 1, type = processor, parameters = map['command' -> 'python /home/Ubuntu/xd/echo2.py', 'encoder' -> 'LF'], children = list[[empty]]]
2016-03-21T19:23:39+0000 1.2.0.RELEASE ERROR SimpleAsyncTaskExecutor-1 process.ShellCommandProcessor - python: can't open file '/home/Ubuntu/xd/echo2.py': [Errno 2] No such file or directory
2016-03-21T19:23:39+0000 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Path cache event: path=/deployments/modules/allocated/ed02510e-f8b3-4f53-9848-e2268fbbade1/pytest.source.time.1, type=CHILD_ADDED
2016-03-21T19:23:39+0000 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module 'time' for stream 'pytest'
2016-03-21T19:23:39+0000 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module [ModuleDescriptor#5133da1 moduleName = 'time', moduleLabel = 'time', group = 'pytest', sourceChannelName = [null], sinkChannelName = [null], index = 0, type = source, parameters = map[[empty]], children = list[[empty]]]
2016-03-21T19:23:39+0000 1.2.0.RELEASE INFO DeploymentSupervisor-0 zk.ZKStreamDeploymentHandler - Deployment status for stream 'pytest': DeploymentStatus{state=deployed}
2016-03-21T19:23:39+0000 1.2.0.RELEASE ERROR task-scheduler-1 process.ShellCommandProcessor - Stream closed
java.io.IOException: Stream closed
at java.lang.ProcessBuilder$NullOutputStream.write(ProcessBuilder.java:434) ~[na:1.7.0_95]
at java.io.OutputStream.write(OutputStream.java:116) ~[na:1.7.0_95]
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[na:1.7.0_95]
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) ~[na:1.7.0_95]
at org.springframework.integration.ip.tcp.serializer.ByteArraySingleTerminatorSerializer.serialize(ByteArraySingleTerminatorSerializer.java:94) ~[spring-integration-ip-4.1.5.RELEASE.jar!/:na]

In my experience, 9 times out of 10 this is caused by the Python process crashing behind the scenes while the XD shell processor is unaware of it. Check the logs for the Python process to verify that it is still running. Here, the earlier ERROR line already shows the likely trigger: python could not open /home/Ubuntu/xd/echo2.py, so the process exited immediately and the processor's next write to its stdin failed with "Stream closed".
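Once the path is correct, the script itself has to behave as a line-oriented filter. Below is a minimal sketch of what an echo2.py for the shell processor typically looks like (an assumption, since the poster's actual script is not shown): with --encoder=LF the module sends one newline-terminated line per message on stdin and expects a newline-terminated reply on stdout, and the reply must be flushed.

#!/usr/bin/env python
# Hypothetical echo2.py for the Spring XD shell processor (sketch, not the original script).
import sys

for line in sys.stdin:
    # Echo each incoming newline-terminated message back on stdout.
    sys.stdout.write(line.rstrip("\n") + "\n")
    # Flush after every reply; a buffered or crashed script leaves the processor
    # writing into a dead pipe, which surfaces as the "Stream closed" error above.
    sys.stdout.flush()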

Related

Unable to execute import-hive.sh

I am getting the error below while running import-hive.sh. Could you please help me out with this?
hadoop@0.0.0.0:~/apache-atlas-2.1.0/hook/apache-atlas-hive-hook-2.1.0/hook-bin$ ./import-hive.sh
Using Hive configuration directory [/home/hadoop/hive/conf]
Log file for import is /home/hadoop/apache-atlas-2.1.0/hook/apache-atlas-hive-hook-2.1.0/logs/import-hive.log
2021-07-13T15:43:21,449 INFO [main] org.apache.atlas.ApplicationProperties - Looking for atlas-application.properties in classpath
2021-07-13T15:43:21,452 INFO [main] org.apache.atlas.ApplicationProperties - Loading atlas-application.properties from file:/home/hadoop/hive/conf/atlas-application.properties
2021-07-13T15:43:21,505 INFO [main] org.apache.atlas.ApplicationProperties - Using graphdb backend 'janus'
2021-07-13T15:43:21,505 INFO [main] org.apache.atlas.ApplicationProperties - Using storage backend 'hbase2'
2021-07-13T15:43:21,506 INFO [main] org.apache.atlas.ApplicationProperties - Using index backend 'solr'
2021-07-13T15:43:21,506 INFO [main] org.apache.atlas.ApplicationProperties - Atlas is running in MODE: PROD.
2021-07-13T15:43:21,506 INFO [main] org.apache.atlas.ApplicationProperties - Setting solr-wait-searcher property 'true'
2021-07-13T15:43:21,506 INFO [main] org.apache.atlas.ApplicationProperties - Setting index.search.map-name property 'false'
2021-07-13T15:43:21,506 INFO [main] org.apache.atlas.ApplicationProperties - Setting atlas.graph.index.search.max-result-set-size = 150
2021-07-13T15:43:21,506 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.db-cache = true
2021-07-13T15:43:21,506 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.db-cache-clean-wait = 20
2021-07-13T15:43:21,506 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.db-cache-size = 0.5
2021-07-13T15:43:21,506 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.tx-cache-size = 15000
2021-07-13T15:43:21,506 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.tx-dirty-size = 120
Enter username for atlas :- admin
Enter password for atlas :-
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/security/authentication/client/ConnectionConfigurator
at org.apache.atlas.AtlasBaseClient.getClient(AtlasBaseClient.java:287)
at org.apache.atlas.AtlasBaseClient.initializeState(AtlasBaseClient.java:454)
at org.apache.atlas.AtlasBaseClient.initializeState(AtlasBaseClient.java:449)
at org.apache.atlas.AtlasBaseClient.<init>(AtlasBaseClient.java:132)
at org.apache.atlas.AtlasClientV2.<init>(AtlasClientV2.java:94)
at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.main(HiveMetaStoreBridge.java:134)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.security.authentication.client.ConnectionConfigurator
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 6 more
Failed to import Hive Meta Data!!!

apache nifi Stateless - not able to set param of controller service (DBCPConnectionPool 1.10.0)

I am following the NiFi 1.10 stateless guideline to create a simple process group that executes a SQL statement against a MySQL database. I have put the necessary parameters of the DB controller service into a parameter context.
It works well on the NiFi canvas. I then added it to the registry and prepared a JSON parameter file, stateless-simpledb.json:
{
  "registryUrl": "http://localhost:18080",
  "bucketId": "cac8f127-e328-45c1-a4cb-0e03dc837ceb",
  "flowId": "cc2753f2-78f3-4449-a2fd-343dfeaafe15",
  "flowVersion": "3",
  "parameters": {
    "lastIngestId": "20000",
    "mysql-jdbc-driver-name": "com.mysql.jdbc.Driver",
    "db-user": "root",
    "db-password": "password",
    "db-con-url": "jdbc:mysql://localhost:3306/mms",
    "jdbc-jar-path": "/program/jdbc/mysql-connector-java.jar"
  }
}
and run the one-off command:
/program/nifi/bin/nifi.sh stateless RunFromRegistry Once --file /app/poc/nifi-stateless/conf/stateless-simpledb.json
It raises this error:
=== FlowFileRepository Type ===
org.apache.nifi.controller.repository.RocksDBFlowFileRepository
org.apache.nifi:nifi-framework-nar:1.10.0 || /program/nifi-1.10.0/work/stateless-nars/nifi-framework-nar-1.10.0.nar-unpacked
org.apache.nifi.controller.repository.WriteAheadFlowFileRepository
org.apache.nifi:nifi-framework-nar:1.10.0 || /program/nifi-1.10.0/work/stateless-nars/nifi-framework-nar-1.10.0.nar-unpacked
org.apache.nifi.controller.repository.VolatileFlowFileRepository
org.apache.nifi:nifi-framework-nar:1.10.0 || /program/nifi-1.10.0/work/stateless-nars/nifi-framework-nar-1.10.0.nar-unpacked
=== End FlowFileRepository types ===
23:32:32.626 [main] INFO org.apache.nifi.stateless.bootstrap.ExtensionDiscovery - Successfully discovered extensions in 4411 milliseconds
23:32:32.633 [main] DEBUG org.apache.nifi.stateless.core.ComponentFactory - Setting context class loader to org.apache.nifi.nar.InstanceClassLoader#50fa5938 (parent = org.apache.nifi.nar.NarClassLoader[/program/nifi-1.10.0/work/stateless-nars/nifi-dbcp-service-nar-1.10.0.nar-unpacked]) to create org.apache.nifi.dbcp.DBCPConnectionPool
23:32:32.647 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input #{jdbc-jar-path} found 1 Parameter references: [org.apache.nifi.parameter.StandardParameterReference#2d3eecda]
23:32:32.650 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input /program/jdbc/mysql-connector-java.jar found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 500 millis found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 8 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 0 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 8 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input -1 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input -1 found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input 30 mins found 0 Parameter references: []
23:32:32.651 [main] DEBUG org.apache.nifi.parameter.ExpressionLanguageAwareParameterParser - For input -1 found 0 Parameter references: []
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.bootstrap.RunStatelessNiFi.main(RunStatelessNiFi.java:69)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.StatelessNiFi.main(StatelessNiFi.java:103)
... 5 more
Caused by: java.lang.RuntimeException: Failed to enable Controller Service {id=691ecc97-ff46-3a5e-8aad-37dc568bc247, name=MYSQL-MMS-stateless-test, type=class org.apache.nifi.dbcp.DBCPConnectionPool} because validation failed: ['Database Connection URL' is invalid because Database Connection URL is required, 'Database Driver Class Name' is invalid because Database Driver Class Name is required]
at org.apache.nifi.stateless.core.StatelessControllerServiceLookup.enableControllerServices(StatelessControllerServiceLookup.java:133)
at org.apache.nifi.stateless.core.StatelessFlow.<init>(StatelessFlow.java:153)
at org.apache.nifi.stateless.core.StatelessFlow.createAndEnqueueFromJSON(StatelessFlow.java:469)
at org.apache.nifi.stateless.runtimes.Program.runLocal(Program.java:133)
at org.apache.nifi.stateless.runtimes.Program.launch(Program.java:67)
... 10 more
It seems the Apache NiFi stateless runtime fails to set the controller service parameters even though the service is in "process group" scope.
Does anyone have any advice?
As mentioned in the comments, this appears to be a known problem with the validation of controller services.
This can be avoided by using NiFi 1.12 or above, as it was fixed in the following JIRA: https://issues.apache.org/jira/plugins/servlet/mobile#issue/NIFI-7380
Though I am not entirely sure of this, it may also simply indicate that your controller service is not configured correctly; that would be worth double-checking.
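If it does turn out to be a configuration issue, the validation error names the two properties that never received values. In the flow, the DBCPConnectionPool properties are expected to reference the parameters with the #{...} syntax visible in the debug log above. A sketch of how the controller service would be configured (the first two property names are taken from the error message; the remaining names are the standard DBCPConnectionPool properties and are listed only for completeness):

Database Connection URL = #{db-con-url}
Database Driver Class Name = #{mysql-jdbc-driver-name}
Database Driver Location(s) = #{jdbc-jar-path}
Database User = #{db-user}
Password = #{db-password}

If the first two are left empty in the versioned flow, validation fails exactly as shown above.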

Flume: kafka channel and hdfs sink get unable to deliver event error

I want to try this new Flafka flow: using only a Kafka channel to transfer data to an HDFS sink. I started with a Kafka channel and a logger sink, which is easier to monitor. My configuration file is:
# Name the components on this agent
a1.sinks = sink1
a1.channels = channel1
a1.channels.channel1.type = org.apache.flume.channel.kafka.KafkaChannel
a1.channels.channel1.brokerList = localhost:9093,localhost:9094
a1.channels.channel1.topic = par4
a1.channels.channel1.zookeeperConnect = localhost:2181
a1.channels.channel1.parseAsFlumeEvent = false
a1.channels.cnannel1.kafka.consumer.timeout.ms = 1000000
a1.sinks.sink1.channel = channel1
a1.sinks.sink1.type = logger
I set up ZooKeeper and two brokers locally using the above port numbers, and I have a producer client that keeps pushing messages to Kafka.
I got the following messages:
2015-07-02 20:22:37,619 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start(PollingPropertiesFileConfigurationProvider.java:61)] Configuration provider starting
2015-07-02 20:22:37,623 (conf-file-poller-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:133)] Reloading configuration file:conf/example.conf
2015-07-02 20:22:37,629 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:sink1
2015-07-02 20:22:37,629 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:sink1
2015-07-02 20:22:37,629 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:931)] Added sinks: sink1 Agent: a1
2015-07-02 20:22:37,633 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:508)] Agent configuration for 'a1' has no sources.
2015-07-02 20:22:37,635 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:141)] Post-validation flume configuration contains configuration for agents: [a1]
2015-07-02 20:22:37,635 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:145)] Creating channels
2015-07-02 20:22:37,639 (conf-file-poller-0) [INFO - org.apache.flume.channel.DefaultChannelFactory.create(DefaultChannelFactory.java:42)] Creating instance of channel channel1 type org.apache.flume.channel.kafka.KafkaChannel
2015-07-02 20:22:37,650 (conf-file-poller-0) [INFO - org.apache.flume.channel.kafka.KafkaChannel.configure(KafkaChannel.java:168)] Group ID was not specified. Using flume as the group id.
2015-07-02 20:22:37,658 (conf-file-poller-0) [INFO - org.apache.flume.channel.kafka.KafkaChannel.configure(KafkaChannel.java:188)] {metadata.broker.list=localhost:9093,localhost:9094, request.required.acks=-1, group.id=flume, zookeeper.connect=localhost:2181, consumer.timeout.ms=100, auto.commit.enable=false}
2015-07-02 20:22:37,665 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:200)] Created channel channel1
2015-07-02 20:22:37,666 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:42)] Creating instance of sink: sink1, type: logger
2015-07-02 20:22:37,669 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:114)] Channel channel1 connected to [sink1]
2015-07-02 20:22:37,674 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:138)] Starting new configuration:{ sourceRunners:{} sinkRunners:{sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor#3362ba9e counterGroup:{ name:null counters:{} } }} channels:{channel1=org.apache.flume.channel.kafka.KafkaChannel{name: channel1}} }
2015-07-02 20:22:37,675 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:145)] Starting Channel channel1
2015-07-02 20:22:37,677 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.channel.kafka.KafkaChannel.start(KafkaChannel.java:96)] Starting Kafka Channel: channel1
2015-07-02 20:22:37,885 (lifecycleSupervisor-1-0) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Verifying properties
2015-07-02 20:22:37,903 (lifecycleSupervisor-1-0) [WARN - kafka.utils.Logging$class.warn(Logging.scala:83)] Property auto.commit.enable is not valid
2015-07-02 20:22:37,903 (lifecycleSupervisor-1-0) [WARN - kafka.utils.Logging$class.warn(Logging.scala:83)] Property consumer.timeout.ms is not valid
2015-07-02 20:22:37,903 (lifecycleSupervisor-1-0) [WARN - kafka.utils.Logging$class.warn(Logging.scala:83)] Property group.id is not valid
2015-07-02 20:22:37,904 (lifecycleSupervisor-1-0) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Property metadata.broker.list is overridden to localhost:9093,localhost:9094
2015-07-02 20:22:37,904 (lifecycleSupervisor-1-0) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Property request.required.acks is overridden to -1
2015-07-02 20:22:37,904 (lifecycleSupervisor-1-0) [WARN - kafka.utils.Logging$class.warn(Logging.scala:83)] Property zookeeper.connect is not valid
2015-07-02 20:22:37,929 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.channel.kafka.KafkaChannel.start(KafkaChannel.java:99)] Topic = par4
2015-07-02 20:22:37,929 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: CHANNEL, name: channel1: Successfully registered new MBean.
2015-07-02 20:22:37,930 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: CHANNEL, name: channel1 started
2015-07-02 20:22:37,930 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:173)] Starting Sink sink1
2015-07-02 20:22:37,939 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Verifying properties
2015-07-02 20:22:37,939 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Property auto.commit.enable is overridden to false
2015-07-02 20:22:37,939 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Property consumer.timeout.ms is overridden to 100
2015-07-02 20:22:37,939 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Property group.id is overridden to flume
2015-07-02 20:22:37,939 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - kafka.utils.Logging$class.warn(Logging.scala:83)] Property metadata.broker.list is not valid
2015-07-02 20:22:37,940 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - kafka.utils.Logging$class.warn(Logging.scala:83)] Property request.required.acks is not valid
2015-07-02 20:22:37,942 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Property zookeeper.connect is overridden to localhost:2181
2015-07-02 20:22:37,951 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] [flume_MACC02PHH5LG3QC-1435893757951-c4c69fb7], Connecting to zookeeper instance at localhost:2181
2015-07-02 20:22:37,952 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:160)] Unable to deliver event. Exception follows.
java.lang.IllegalStateException: close() called when transaction is OPEN - you must either commit or rollback first
at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
at org.apache.flume.channel.BasicTransactionSemantics.close(BasicTransactionSemantics.java:179)
at org.apache.flume.sink.LoggerSink.process(LoggerSink.java:105)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
at java.lang.Thread.run(Thread.java:745)
^C2015-07-02 20:22:39,497 (agent-shutdown-hook) [INFO - org.apache.flume.lifecycle.LifecycleSupervisor.stop(LifecycleSupervisor.java:79)] Stopping lifecycle supervisor 12
2015-07-02 20:22:39,499 (agent-shutdown-hook) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Shutting down producer
2015-07-02 20:22:39,499 (agent-shutdown-hook) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Closing all sync producers
2015-07-02 20:22:39,501 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:150)] Component type: CHANNEL, name: channel1 stopped
2015-07-02 20:22:39,501 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:156)] Shutdown Metric for type: CHANNEL, name: channel1. channel.start.time == 1435893757930
2015-07-02 20:22:39,501 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:162)] Shutdown Metric for type: CHANNEL, name: channel1. channel.stop.time == 1435893759501
2015-07-02 20:22:39,501 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:178)] Shutdown Metric for type: CHANNEL, name: channel1. channel.capacity == 0
2015-07-02 20:22:39,502 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:178)] Shutdown Metric for type: CHANNEL, name: channel1. channel.current.size == 0
2015-07-02 20:22:39,502 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:178)] Shutdown Metric for type: CHANNEL, name: channel1. channel.event.put.attempt == 0
2015-07-02 20:22:39,504 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:178)] Shutdown Metric for type: CHANNEL, name: channel1. channel.event.put.success == 0
2015-07-02 20:22:39,504 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:178)] Shutdown Metric for type: CHANNEL, name: channel1. channel.event.take.attempt == 0
2015-07-02 20:22:39,504 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:178)] Shutdown Metric for type: CHANNEL, name: channel1. channel.event.take.success == 0
2015-07-02 20:22:39,504 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:178)] Shutdown Metric for type: CHANNEL, name: channel1. channel.kafka.commit.time == 0
2015-07-02 20:22:39,504 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:178)] Shutdown Metric for type: CHANNEL, name: channel1. channel.kafka.event.get.time == 0
2015-07-02 20:22:39,504 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:178)] Shutdown Metric for type: CHANNEL, name: channel1. channel.kafka.event.send.time == 0
2015-07-02 20:22:39,504 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:178)] Shutdown Metric for type: CHANNEL, name: channel1. channel.rollback.count == 0
2015-07-02 20:22:39,505 (agent-shutdown-hook) [INFO - org.apache.flume.channel.kafka.KafkaChannel.stop(KafkaChannel.java:123)] Kafka channel channel1 stopped. Metrics: CHANNEL:channel1{channel.event.put.attempt=0, channel.event.put.success=0, channel.kafka.event.get.time=0, channel.current.size=0, channel.event.take.attempt=0, channel.event.take.success=0, channel.kafka.event.send.time=0, channel.capacity=0, channel.kafka.commit.time=0, channel.rollback.count=0}
2015-07-02 20:22:39,505 (agent-shutdown-hook) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.stop(PollingPropertiesFileConfigurationProvider.java:83)] Configuration provider stopping
I don't understand why I get this "unable to deliver event" error. (I also tried to set up an HDFS sink, which gives me the same error.)
I also don't understand why I didn't successfully set consumer.timeout.ms.
Looking for help, thanks!
Based on the answer from the community, this question can be resolved by following the two JIRA issues below.
https://issues.apache.org/jira/browse/FLUME-2734
https://issues.apache.org/jira/browse/FLUME-2735

Unable to load file to Hadoop using flume

I'm using Flume to move files to HDFS. While moving a file it shows this error; please help me solve this issue.
15/05/20 15:49:26 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: r1 started
15/05/20 15:49:26 INFO avro.ReliableSpoolingFileEventReader: Preparing to move file /home/crayondata.com/shanmugapriya/apache-flume-1.5.2-bin/staging/HypeVisitorTest.java to /home/crayondata.com/shanmugapriya/apache-flume-1.5.2-bin/staging/HypeVisitorTest.java.COMPLETED
15/05/20 15:49:26 INFO source.SpoolDirectorySource: Spooling Directory Source runner has shutdown.
15/05/20 15:49:26 INFO hdfs.HDFSDataStream: Serializer = TEXT, UseRawLocalFileSystem = false
15/05/20 15:49:26 INFO hdfs.BucketWriter: Creating hdfs://localhost:9000/sha/HypeVisitorTest.java.1432117166377.tmp
15/05/20 15:49:26 ERROR hdfs.HDFSEventSink: process failed
java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:216)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2564)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2574)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:270)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:262)
at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:718)
at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:183)
at org.apache.flume.sink.hdfs.BucketWriter.access$1700(BucketWriter.java:59)
at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:715)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/05/20 15:49:26 ERROR flume.SinkRunner: Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:471)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:216)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2564)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2574)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:270)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:262)
at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:718)
at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:183)
at org.apache.flume.sink.hdfs.BucketWriter.access$1700(BucketWriter.java:59)
at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:715)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
... 1 more
15/05/20 15:49:26 INFO source.SpoolDirectorySource: Spooling Directory Source runner has shutdown.
Here is my flumeconf.conf file:
# example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /home/shanmugapriya/apache-flume-1.5.2-bin/staging
a1.sources.r1.fileHeader = true
a1.sources.r1.maxBackoff = 10000
a1.sources.r1.basenameHeader = true
# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://localhost:9000/sha
a1.sinks.k1.hdfs.writeFormat = Text
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.idleTimeout = 100
a1.sinks.k1.hdfs.filePrefix = %{basename}
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 100000
a1.channels.c1.transactionCapacity = 1000
a1.channels.c1.byteCapacity = 0
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Please help me solve this. TIA.
@Shan, please confirm you have the relevant Hadoop HDFS jars in the classpath for Apache Flume; this UnsupportedOperationException from FileSystem.getScheme is typically a sign of missing or mismatched Hadoop jars.
Also, your HDFS sink uses port 9000, but the default port is normally 8020. Is this correct?

Flume agents are not connecting on different machines

Flume agent 1 does not connect to Flume agent 2. What could be the reason?
I am using Flume to stream a log file to HDFS using two agents. The first agent is located on the source machine where the log file exists, while the second agent is located on the machine (IP address 10.10.201.40) where Hadoop is installed.
The configuration file of the first agent (flume-src-agent.conf) is as follows:
source_agent.sources = weblogic_server
source_agent.sources.weblogic_server.type = exec
source_agent.sources.weblogic_server.command = tail -f AdminServer.log
source_agent.sources.weblogic_server.batchSize = 1
source_agent.sources.weblogic_server.channels = memoryChannel
source_agent.sources.weblogic_server.interceptors = itime ihost itype
source_agent.sources.weblogic_server.interceptors.itime.type = timestamp
source_agent.sources.weblogic_server.interceptors.ihost.type = host
source_agent.sources.weblogic_server.interceptors.ihost.useIP = false
source_agent.sources.weblogic_server.interceptors.ihost.hostHeader = host
source_agent.sources.weblogic_server.interceptors.itype.type = static
source_agent.sources.weblogic_server.interceptors.itype.key = log_type
source_agent.sources.weblogic_server.interceptors.itype.value = apache_access_combined
source_agent.channels = memoryChannel
source_agent.channels.memoryChannel.type = memory
source_agent.channels.memoryChannel.capacity = 100
source_agent.sinks = avro_sink
source_agent.sinks.avro_sink.type = avro
source_agent.sinks.avro_sink.channel = memoryChannel
source_agent.sinks.avro_sink.hostname = 10.10.201.40
source_agent.sinks.avro_sink.port = 4545
The configuration file of the second agent (flume-trg-agent.conf) is as follows:
collector.sources = AvroIn
collector.sources.AvroIn.type = avro
collector.sources.AvroIn.bind = 0.0.0.0
collector.sources.AvroIn.port = 4545
collector.sources.AvroIn.channels = mc1 mc2
collector.channels = mc1 mc2
collector.channels.mc1.type = memory
collector.channels.mc1.capacity = 100
collector.channels.mc2.type = memory
collector.channels.mc2.capacity = 100
collector.sinks = HadoopOut
collector.sinks.HadoopOut.type = hdfs
collector.sinks.HadoopOut.channel = mc2
collector.sinks.HadoopOut.hdfs.path = hdfs://localhost:54310/user/root
collector.sinks.HadoopOut.hdfs.callTimeout = 150000
collector.sinks.HadoopOut.hdfs.fileType = DataStream
collector.sinks.HadoopOut.hdfs.writeFormat = Text
collector.sinks.HadoopOut.hdfs.rollSize = 0
collector.sinks.HadoopOut.hdfs.rollCount = 10000
collector.sinks.HadoopOut.hdfs.rollInterval = 600
When the 1st agent is run, I get the following error:
2015-04-08 15:14:10,251 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:160)] Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Failed to send events
at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:382)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.flume.FlumeException: NettyAvroRpcClient {host:10.10.201.40, port:4545}: RPC connection error
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:161)
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:115)
at org.apache.flume.api.NettyAvroRpcClient.configure(NettyAvroRpcClient.java:590)
at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:88)
at org.apache.flume.sink.AvroSink.initializeRpcClient(AvroSink.java:127)
at org.apache.flume.sink.AbstractRpcSink.createConnection(AbstractRpcSink.java:209)
at org.apache.flume.sink.AbstractRpcSink.verifyConnection(AbstractRpcSink.java:269)
at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:339)
... 3 more
Caused by: java.io.IOException: Error connecting to /10.10.201.40:4545
at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:261)
at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:203)
at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:152)
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:147)
When the 2nd Agent is run, I get the following error:
2015-04-08 15:53:31,649 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:143)] Polling sink runner starting
2015-04-08 15:53:31,844 (lifecycleSupervisor-1-3) [ERROR - org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:253)] Unable to start EventDrivenSourceRunner: { source:Avro source AvroIn: {bindAddress: 0.0.0.0, port: 4545 } } - Exception follows.
org.jboss.netty.channel.ChannelException: Failed to bind to: /0.0.0.0:4545
at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:298)
at org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:106)
at org.apache.flume.source.AvroSource.start(AvroSource.java:225)
at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.bind(NioServerSocketPipelineSink.java:138)
at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleServerSocket(NioServerSocketPipelineSink.java:90)
at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:64)
at org.jboss.netty.channel.Channels.bind(Channels.java:569)
at org.jboss.netty.channel.AbstractChannel.bind(AbstractChannel.java:187)
at org.jboss.netty.bootstrap.ServerBootstrap$Binder.channelOpen(ServerBootstrap.java:343)
at org.jboss.netty.channel.Channels.fireChannelOpen(Channels.java:170)
at org.jboss.netty.channel.socket.nio.NioServerSocketChannelFactory.newChannel(NioServerSocketChannelFactory.java:158)
at org.jboss.netty.channel.socket.nio.NioServerSocketChannel.<init>(NioServerSocketChannel.java:80)
at org.jboss.netty.channel.socket.nio.NioServerSocketChannelFactory.newChannel(NioServerSocketChannelFactory.java:86)
at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:277)
... 13 more
The answer to your question is in the second log:
Address already in use
The reason is that another process is already using port 4545. Just reconfigure both agents to use another port, say 41414, and it should work.
For binding issues, run netstat -plten, find the PID of the process holding the port, and kill that process. Doing so will resolve the binding issue when you run the agent again.
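For example (assuming port 41414 is free on both machines), the only lines that change are the Avro sink port in flume-src-agent.conf and the Avro source port in flume-trg-agent.conf:

source_agent.sinks.avro_sink.port = 41414
collector.sources.AvroIn.port = 41414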
