hadoop too many logs on screen - hadoop

I recently started learning Hadoop through Hive. As a beginner I am not familiar with all the logs shown on the screen, so I would prefer a clean view with only the important ones. I am learning Hive from Rutherglen's "Programming Hive" book.
I have only just started, and I get a large number of log lines after the very first command, whereas the book shows just "OK, Time taken: 3.543 seconds".
Does anyone have a solution to reduce these logs?
PS: below are the logs I got from the command "create table x (a int);"
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Sep 28, 2014 12:10:28 AM org.apache.hadoop.hive.conf.HiveConf <clinit>
WARNING: hive-site.xml not found on CLASSPATH
Logging initialized using configuration in jar:file:/Users/admin/Documents/Study/software/Programming/Hive/hive-0.9.0-bin/lib/hive-common-0.9.0.jar!/hive-log4j.properties
Sep 28, 2014 12:10:28 AM SessionState printInfo
INFO: Logging initialized using configuration in jar:file:/Users/admin/Documents/Study/software/Programming/Hive/hive-0.9.0-bin/lib/hive-common-0.9.0.jar!/hive-log4j.properties
Hive history file=/tmp/admin/hive_job_log_admin_201409280010_720612579.txt
Sep 28, 2014 12:10:28 AM hive.ql.exec.HiveHistory printInfo
INFO: Hive history file=/tmp/admin/hive_job_log_admin_201409280010_720612579.txt
hive> CREATE TABLE x (a INT);
Sep 28, 2014 12:10:31 AM org.apache.hadoop.hive.ql.Driver PerfLogBegin
INFO: <PERFLOG method=Driver.run>
Sep 28, 2014 12:10:31 AM org.apache.hadoop.hive.ql.Driver PerfLogBegin
INFO: <PERFLOG method=compile>
Sep 28, 2014 12:10:31 AM hive.ql.parse.ParseDriver parse
INFO: Parsing command: CREATE TABLE x (a INT)
Sep 28, 2014 12:10:31 AM hive.ql.parse.ParseDriver parse
INFO: Parse Completed
Sep 28, 2014 12:10:31 AM org.apache.hadoop.hive.ql.parse.SemanticAnalyzer analyzeInternal
INFO: Starting Semantic Analysis
Sep 28, 2014 12:10:31 AM org.apache.hadoop.hive.ql.parse.SemanticAnalyzer analyzeCreateTable
INFO: Creating table x position=13
Sep 28, 2014 12:10:31 AM org.apache.hadoop.hive.ql.Driver compile
INFO: Semantic Analysis Completed
Sep 28, 2014 12:10:31 AM org.apache.hadoop.hive.ql.Driver getSchema
INFO: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
Sep 28, 2014 12:10:31 AM org.apache.hadoop.hive.ql.Driver PerfLogEnd
INFO: </PERFLOG method=compile start=1411877431127 end=1411877431388 duration=261>
Sep 28, 2014 12:10:31 AM org.apache.hadoop.hive.ql.Driver PerfLogBegin
INFO: <PERFLOG method=Driver.execute>
Sep 28, 2014 12:10:31 AM org.apache.hadoop.hive.ql.Driver execute
INFO: Starting command: CREATE TABLE x (a INT)
Sep 28, 2014 12:10:31 AM hive.ql.exec.DDLTask createTable
INFO: Default to LazySimpleSerDe for table x
Sep 28, 2014 12:10:31 AM hive.log getDDLFromFieldSchema
INFO: DDL: struct x { i32 a}
Sep 28, 2014 12:10:31 AM org.apache.hadoop.hive.metastore.HiveMetaStore newRawStore
INFO: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
Sep 28, 2014 12:10:31 AM org.apache.hadoop.hive.metastore.ObjectStore initialize
INFO: ObjectStore, initialize called
Sep 28, 2014 12:10:32 AM org.apache.hadoop.hive.metastore.ObjectStore getPMF
INFO: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
Sep 28, 2014 12:10:32 AM org.apache.hadoop.hive.metastore.ObjectStore setConf
INFO: Initialized ObjectStore
Sep 28, 2014 12:10:33 AM org.apache.hadoop.hive.metastore.HiveMetaStore logInfo
INFO: 0: create_table: db=default tbl=x
Sep 28, 2014 12:10:34 AM org.apache.hadoop.hive.ql.Driver PerfLogEnd
INFO: </PERFLOG method=Driver.execute start=1411877431389 end=1411877434527 duration=3138>
OK
Sep 28, 2014 12:10:34 AM org.apache.hadoop.hive.ql.Driver printInfo
INFO: OK
Sep 28, 2014 12:10:34 AM org.apache.hadoop.hive.ql.Driver PerfLogBegin
INFO: <PERFLOG method=releaseLocks>
Sep 28, 2014 12:10:34 AM org.apache.hadoop.hive.ql.Driver PerfLogEnd
INFO: </PERFLOG method=releaseLocks start=1411877434529 end=1411877434529 duration=0>
Sep 28, 2014 12:10:34 AM org.apache.hadoop.hive.ql.Driver PerfLogEnd
INFO: </PERFLOG method=Driver.run start=1411877431126 end=1411877434530 duration=3404>
Time taken: 3.407 seconds
Sep 28, 2014 12:10:34 AM CliDriver printInfo
INFO: Time taken: 3.407 seconds

Try starting the Hive shell as follows:
hive --hiveconf hive.root.logger=WARN,console
If you want to make this change persistent, edit the logger properties file HIVE_CONF_DIR/hive-log4j.properties. If you don't have this file in your HIVE_CONF_DIR, create it from the contents of hive-log4j.default in HIVE_CONF_DIR.
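For intuition about what a WARN threshold does, here is a minimal sketch. It uses the JDK's own java.util.logging, not Hive's log4j configuration, and the class, method and logger names are made up for illustration. The two messages are borrowed from the output in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Illustration only: uses java.util.logging, not Hive's log4j configuration.
final class LogThresholdSketch {

    // Handler that keeps every record it receives, so the filtering is observable.
    static final class CollectingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            messages.add(record.getLevel() + ": " + record.getMessage());
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    // Logs one INFO and one WARNING message through a logger set to the given
    // threshold and returns the messages that got through.
    static List<String> messagesAt(Level threshold) {
        Logger logger = Logger.getLogger("sketch." + threshold.getName());
        logger.setUseParentHandlers(false); // keep the demo off the real console
        logger.setLevel(threshold);
        CollectingHandler handler = new CollectingHandler();
        logger.addHandler(handler);
        logger.info("Parsing command: CREATE TABLE x (a INT)");
        logger.warning("hive-site.xml not found on CLASSPATH");
        logger.removeHandler(handler);
        return handler.messages;
    }

    public static void main(String[] args) {
        System.out.println(messagesAt(Level.INFO));    // both messages
        System.out.println(messagesAt(Level.WARNING)); // only the warning
    }
}
```

With the threshold at INFO both messages are kept; at WARNING only the warning survives. That is the kind of filtering hive.root.logger=WARN,console asks for on the Hive console.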

Related

mvn gluonfx:runagent failure - Unknown Attribute 'queryAllDeclaredMethods' .. in definition of class ..HelloController

I am trying to get my first native build working
(using Windows 10, JDK 17, JavaFX 17, Gluon 1.0.9, and Gluon GraalVM (graalvm-svm-windows-gluon-21.2.0-dev.zip)).
I am able to run mvn gluonfx:run
(and click on the one button I have in my test UI).
However, when I run mvn gluonfx:runagent, I get:
[Wed Nov 17 08:10:41 PST 2021][INFO] [SUB] Error: Error parsing reflection configuration in file:/C:/devel/repos/Gluon-SingleViewProject-jdk17/target/classes/META-INF%5cnative-image%5creflect-config.json:
[Wed Nov 17 08:10:41 PST 2021][INFO] [SUB] Unknown attribute 'queryAllDeclaredMethods' (supported attributes: allDeclaredConstructors, allPublicConstructors, allDeclaredMethods, allPublicMethods, allDeclaredFields, allPublicFields, methods, fields) in defintion of class com.gluonapplication.HelloController
[Wed Nov 17 08:10:41 PST 2021][INFO] [SUB] Verify that the configuration matches the schema described in the -H:PrintFlags=+ output for option ReflectionConfigurationResources.
[Wed Nov 17 08:10:41 PST 2021][INFO] [SUB] com.oracle.svm.core.util.UserError$UserException: Error parsing reflection configuration in file:/C:/devel/repos/Gluon-SingleViewProject-jdk17/target/classes/META-INF%5cnative-image%5creflect-config.json:
[Wed Nov 17 08:10:41 PST 2021][INFO] [SUB] Unknown attribute 'queryAllDeclaredMethods' (supported attributes: allDeclaredConstructors, allPublicConstructors, allDeclaredMethods, allPublicMethods, allDeclaredFields, allPublicFields, methods, fields) in defintion of class com.gluonapplication.HelloController
[Wed Nov 17 08:10:41 PST 2021][INFO] [SUB] Verify that the configuration matches the schema described in the -H:PrintFlags=+ output for option ReflectionConfigurationResources.
The HelloController simply consists of one method at the moment:
import javafx.event.ActionEvent;

public class HelloController {
    public void pressButton(ActionEvent ae) {
        System.out.println("hello, source pressed: " + ae.getSource());
    }
}
Any suggestions or tips are greatly appreciated. (Based on the error above, it looks like the build process may be calling a method that is unsupported for JDK 17?)

StreamSets upgrade and LDAP authentication

I just upgraded StreamSets from 2.1.0.2 to 2.4.0.0 using Cloudera Manager (5.8.2). I can no longer log in to StreamSets; I get "login failed". The new version seems to be using a different LDAP lookup method.
My logs BEFORE the update look like this:
Mar 15, 10:42:07.799 AM INFO com.streamsets.datacollector.http.LdapLoginModule
Searching for users with filter: '(&(objectClass={0})({1}={2}))' from base dn: DC=myComp,DC=Statistics,DC=ComQ,DC=uk
Mar 15, 10:42:07.826 AM INFO com.streamsets.datacollector.http.LdapLoginModule
Found user?: true
Mar 15, 10:42:07.826 AM INFO com.streamsets.datacollector.http.LdapLoginModule
Attempting authentication: CN=UserDV,OU=London,OU=ComQ,DC=ComQ,DC=Statistics,DC=comQ,DC=uk
My logs AFTER the update look like this:
Mar 15, 11:10:21.406 AM INFO com.streamsets.datacollector.http.LdapLoginModule
Accessing LDAP Server: ldaps://comQ.statisticsxxx.com:3269 startTLS: false
Mar 15, 11:10:22.086 AM INFO org.ldaptive.auth.SearchDnResolver
search for user=[org.ldaptive.auth.User@1573608120::identifier= userdv, context=null] failed using filter=[org.ldaptive.SearchFilter@1129802876::filter=(&(objectClass=user)(uid={user})), parameters={context=null, user=userdv}]
Mar 15, 11:10:22.087 AM INFO com.streamsets.datacollector.http.LdapLoginModule
Found user?: false
Mar 15, 11:10:22.087 AM ERROR com.streamsets.datacollector.http.LdapLoginModule
Result code: null - DN cannot be null
You should change ldap.userFilter in Cloudera Manager from uid={user} to name={user}.
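A rough sketch of why the attribute name matters, assuming the search filter is assembled the way the AFTER log prints it, (&(objectClass=user)(uid={user})), with {user} replaced by the login name. The class and method names are made up for illustration; this is not StreamSets or ldaptive code.

```java
// Illustration only: this is not StreamSets or ldaptive code.
final class LdapFilterSketch {

    // Escapes the characters that RFC 4515 reserves inside a filter value.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\5c"); break;
                case '*':  sb.append("\\2a"); break;
                case '(':  sb.append("\\28"); break;
                case ')':  sb.append("\\29"); break;
                case '\0': sb.append("\\00"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds (&(objectClass=<objectClass>)(<userFilter with {user} replaced>)).
    static String searchFilter(String objectClass, String userFilter, String user) {
        return "(&(objectClass=" + objectClass + ")("
                + userFilter.replace("{user}", escape(user)) + "))";
    }

    public static void main(String[] args) {
        // Filter from the failing log versus the suggested replacement.
        System.out.println(searchFilter("user", "uid={user}", "userdv"));
        System.out.println(searchFilter("user", "name={user}", "userdv"));
    }
}
```

If the directory entries do not carry the login in uid, the first filter matches nothing. That would explain "Found user?: false" followed by "DN cannot be null" in the log.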

Make Cygnus use WebHDFS to write to local HDFS

I'm trying to make a local Orion+Cygnus setup persist Orion's data on a local HDFS through WebHDFS.
Cygnus' instructions on GitHub say very little about WebHDFS, as the configuration there is mostly about HttpFS.
The OrionHDFSSink .md document says that hdfs_port=50070 is for WebHDFS, which is what my HDFS offers. So I would expect that setting the port this way makes Cygnus use WebHDFS automatically, but in my case it doesn't seem to work that way.
So, here's my agent_1.conf:
cygnusagent.sources = http-source
cygnusagent.sinks = hdfs-sink
cygnusagent.channels = hdfs-channel
# source configuration
cygnusagent.sources.http-source.channels = hdfs-channel
cygnusagent.sources.http-source.type = org.apache.flume.source.http.HTTPSource
cygnusagent.sources.http-source.port = 5050
cygnusagent.sources.http-source.handler = com.telefonica.iot.cygnus.handlers.OrionRestHandler
cygnusagent.sources.http-source.handler.notification_target = /notify
cygnusagent.sources.http-source.handler.default_service = def_serv
cygnusagent.sources.http-source.handler.default_service_path = def_servpath
cygnusagent.sources.http-source.handler.events_ttl = 4
cygnusagent.sources.http-source.interceptors = ts gi
cygnusagent.sources.http-source.interceptors.ts.type = timestamp
cygnusagent.sources.http-source.interceptors.gi.type = com.telefonica.iot.cygnus.interceptors.GroupingInterceptor$Builder
cygnusagent.sources.http-source.interceptors.gi.grouping_rules_conf_file = /usr/cygnus/conf/grouping_rules.conf
# OrionHDFSSink configuration
cygnusagent.sinks.hdfs-sink.channel = hdfs-channel
cygnusagent.sinks.hdfs-sink.type = com.telefonica.iot.cygnus.sinks.OrionHDFSSink
cygnusagent.sinks.hdfs-sink.hdfs_host = localHDFS.ip
cygnusagent.sinks.hdfs-sink.hdfs_port = 50070
cygnusagent.sinks.hdfs-sink.hdfs_username = HDFSrootUser
cygnusagent.sinks.hdfs-sink.attr_persistence = column
# hdfs-channel configuration
cygnusagent.channels.hdfs-channel.type = memory
cygnusagent.channels.hdfs-channel.capacity = 1000
cygnusagent.channels.hdfs-channel.transactionCapacity = 100
When I update an entity on Orion to which Cygnus is subscribed, Cygnus logs the following:
02 Sep 2015 20:09:12,353 INFO [2055470757#qtp-1523539038-0] (com.telefonica.iot.cygnus.handlers.OrionRestHandler.getEvents:150) - Starting transaction (1441217314-956-0000000000)
02 Sep 2015 20:09:12,362 INFO [2055470757#qtp-1523539038-0] (com.telefonica.iot.cygnus.handlers.OrionRestHandler.getEvents:236) - Received data ({ "subscriptionId" : "55e735c9b89e8535f8ca5ef2", "originator" : "localhost", "contextResponses" : [ { "contextElement" : { "type" : "Reading", "isPattern" : "false", "id" : "Reading1.1", "attributes" : [ { "name" : "Cost", "type" : "double", "value" : "32" }, { "name" : "Reading_ID", "type" : "integer", "value" : "14" }, { "name" : "Threshold", "type" : "double", "value" : "30" }, { "name" : "email", "type" : "string", "value" : "arthurmvieira#hotmail.com" } ] }, "statusCode" : { "code" : "200", "reasonPhrase" : "OK" } } ]})
02 Sep 2015 20:09:12,366 INFO [2055470757#qtp-1523539038-0] (com.telefonica.iot.cygnus.handlers.OrionRestHandler.getEvents:258) - Event put in the channel (id=2020008711, ttl=4)
02 Sep 2015 20:09:12,432 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:128) - Event got from the channel (id=2020008711, headers={fiware-servicepath=def_servpath, destination=reading1.1_reading, content-type=application/json, fiware-service=def_serv, ttl=4, transactionId=1441217314-956-0000000000, timestamp=1441217352368}, bodyLength=812)
02 Sep 2015 20:09:12,549 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionHDFSSink.persist:356) - [hdfs-sink] Persisting data at OrionHDFSSink. HDFS file (def_serv/def_servpath/reading1.1_reading/reading1.1_reading.txt), Data ({"recvTime":"2015-09-02T18:09:12.368Z","Cost":"32", "Cost_md":[],"Reading_ID":"14", "Reading_ID_md":[],"Threshold":"30", "Threshold_md":[],"email":"arthurmvieira#hotmail.com", "email_md":[]})
02 Sep 2015 20:09:12,557 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:143) - Persistence error (The /user/root/def_serv/def_servpath/reading1.1_reading directory could not be created in HDFS. HttpFS response: 503 Service unavailable)
02 Sep 2015 20:09:12,558 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:173) - An event was put again in the channel (id=2020008711, ttl=3)
02 Sep 2015 20:09:12,558 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:193) - Finishing transaction (1441217314-956-0000000000)
02 Sep 2015 20:09:13,560 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:128) - Event got from the channel (id=2020008711, headers={fiware-servicepath=def_servpath, destination=reading1.1_reading, content-type=application/json, fiware-service=def_serv, ttl=3, transactionId=1441217314-956-0000000000, timestamp=1441217352368}, bodyLength=812)
02 Sep 2015 20:09:13,574 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionHDFSSink.persist:356) - [hdfs-sink] Persisting data at OrionHDFSSink. HDFS file (def_serv/def_servpath/reading1.1_reading/reading1.1_reading.txt), Data ({"recvTime":"2015-09-02T18:09:12.368Z","Cost":"32", "Cost_md":[],"Reading_ID":"14", "Reading_ID_md":[],"Threshold":"30", "Threshold_md":[],"email":"arthurmvieira#hotmail.com", "email_md":[]})
02 Sep 2015 20:09:13,574 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:143) - Persistence error (The /user/root/def_serv/def_servpath/reading1.1_reading directory could not be created in HDFS. HttpFS response: 503 Service unavailable)
02 Sep 2015 20:09:13,575 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:173) - An event was put again in the channel (id=2020008711, ttl=2)
02 Sep 2015 20:09:13,575 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:193) - Finishing transaction (1441217314-956-0000000000)
02 Sep 2015 20:09:15,576 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:128) - Event got from the channel (id=2020008711, headers={fiware-servicepath=def_servpath, destination=reading1.1_reading, content-type=application/json, fiware-service=def_serv, ttl=2, transactionId=1441217314-956-0000000000, timestamp=1441217352368}, bodyLength=812)
02 Sep 2015 20:09:15,590 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionHDFSSink.persist:356) - [hdfs-sink] Persisting data at OrionHDFSSink. HDFS file (def_serv/def_servpath/reading1.1_reading/reading1.1_reading.txt), Data ({"recvTime":"2015-09-02T18:09:12.368Z","Cost":"32", "Cost_md":[],"Reading_ID":"14", "Reading_ID_md":[],"Threshold":"30", "Threshold_md":[],"email":"arthurmvieira#hotmail.com", "email_md":[]})
02 Sep 2015 20:09:15,599 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:143) - Persistence error (The /user/root/def_serv/def_servpath/reading1.1_reading directory could not be created in HDFS. HttpFS response: 503 Service unavailable)
02 Sep 2015 20:09:15,600 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:173) - An event was put again in the channel (id=2020008711, ttl=1)
02 Sep 2015 20:09:15,600 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:193) - Finishing transaction (1441217314-956-0000000000)
02 Sep 2015 20:09:18,601 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:128) - Event got from the channel (id=2020008711, headers={fiware-servicepath=def_servpath, destination=reading1.1_reading, content-type=application/json, fiware-service=def_serv, ttl=1, transactionId=1441217314-956-0000000000, timestamp=1441217352368}, bodyLength=812)
02 Sep 2015 20:09:18,615 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionHDFSSink.persist:356) - [hdfs-sink] Persisting data at OrionHDFSSink. HDFS file (def_serv/def_servpath/reading1.1_reading/reading1.1_reading.txt), Data ({"recvTime":"2015-09-02T18:09:12.368Z","Cost":"32", "Cost_md":[],"Reading_ID":"14", "Reading_ID_md":[],"Threshold":"30", "Threshold_md":[],"email":"arthurmvieira#hotmail.com", "email_md":[]})
02 Sep 2015 20:09:18,618 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:143) - Persistence error (The /user/root/def_serv/def_servpath/reading1.1_reading directory could not be created in HDFS. HttpFS response: 503 Service unavailable)
02 Sep 2015 20:09:18,621 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:173) - An event was put again in the channel (id=2020008711, ttl=0)
02 Sep 2015 20:09:18,621 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:193) - Finishing transaction (1441217314-956-0000000000)
02 Sep 2015 20:09:22,622 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:128) - Event got from the channel (id=2020008711, headers={fiware-servicepath=def_servpath, destination=reading1.1_reading, content-type=application/json, fiware-service=def_serv, ttl=0, transactionId=1441217314-956-0000000000, timestamp=1441217352368}, bodyLength=812)
02 Sep 2015 20:09:22,635 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionHDFSSink.persist:356) - [hdfs-sink] Persisting data at OrionHDFSSink. HDFS file (def_serv/def_servpath/reading1.1_reading/reading1.1_reading.txt), Data ({"recvTime":"2015-09-02T18:09:12.368Z","Cost":"32", "Cost_md":[],"Reading_ID":"14", "Reading_ID_md":[],"Threshold":"30", "Threshold_md":[],"email":"arthurmvieira#hotmail.com", "email_md":[]})
02 Sep 2015 20:09:22,635 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:143) - Persistence error (The /user/root/def_serv/def_servpath/reading1.1_reading directory could not be created in HDFS. HttpFS response: 503 Service unavailable)
02 Sep 2015 20:09:22,635 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:163) - The event TTL has expired, it is no more re-injected in the channel (id=2020008711, ttl=0)
02 Sep 2015 20:09:22,635 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:193) - Finishing transaction (1441217314-956-0000000000)
So you can see it's trying to use HttpFS, as it logs the response:
HttpFS response: 503 Service unavailable
...on each write attempt.
How should I configure the agent to use WebHDFS?
Thank you
I don't know what was happening, but the configuration mentioned above is correct and is working now.
It started working after several attempts at rebooting the instance and rewriting the config files, and after I had seen log errors other than the one mentioned.
At some point Cygnus was trying to write to localhost:50075 instead of {localHDFS.ip}:50070, but that went away after restarting Cygnus.
All instances are at their latest version (important).
Cygnus configuration for WebHDFS is just about setting the port to 50070; nothing else is required.
The connections to 50075 that you mention are correct as well, since that is how WebHDFS behaves. When you want to upload data to HDFS, the client (in this case, Cygnus) first contacts the namenode on port TCP/50070. The namenode then responds with a redirection to the datanode where the data will actually be uploaded. That redirection uses port TCP/50075, so datanode:50075 must be reachable from the client (Cygnus). That is why we use HttpFS in the global instance of Cosmos at FIWARE Lab: HttpFS works as a gateway that hides the details of the datanodes, so only a single entry point and port (14000) is required.
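The two-step exchange can be sketched in plain Java against the WebHDFS REST API (op=CREATE). This is only an illustration of the redirect, not how Cygnus implements it; the class and method names, the host name and the file path are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;

// Illustration only: hypothetical names, not Cygnus code.
final class WebHdfsCreateSketch {

    // URL of the first request, sent to the namenode (port 50070 in this thread).
    static String createUrl(String namenodeHost, int port, String hdfsPath, String user) {
        return "http://" + namenodeHost + ":" + port + "/webhdfs/v1" + hdfsPath
                + "?op=CREATE&user.name=" + user;
    }

    // Performs both steps and returns the HTTP status of the datanode request.
    static int upload(String namenodeHost, int port, String hdfsPath, String user,
                      byte[] data) throws IOException {
        // Step 1: ask the namenode where to write; send no data, do not follow the redirect.
        HttpURLConnection nn = (HttpURLConnection)
                URI.create(createUrl(namenodeHost, port, hdfsPath, user)).toURL().openConnection();
        nn.setRequestMethod("PUT");
        nn.setInstanceFollowRedirects(false);
        int status = nn.getResponseCode();               // a 307 redirect is expected
        String location = nn.getHeaderField("Location"); // points at a datanode (port 50075)
        nn.disconnect();
        if (location == null) {
            throw new IOException("No redirect from namenode, HTTP " + status);
        }

        // Step 2: send the file content to the datanode named in the Location header.
        HttpURLConnection dn = (HttpURLConnection) URI.create(location).toURL().openConnection();
        dn.setRequestMethod("PUT");
        dn.setDoOutput(true);
        try (OutputStream out = dn.getOutputStream()) {
            out.write(data);
        }
        int result = dn.getResponseCode();               // 201 Created on success
        dn.disconnect();
        return result;
    }

    public static void main(String[] args) {
        // Only prints the first URL; call upload(...) against a real cluster to try the exchange.
        System.out.println(createUrl("namenode.example", 50070, "/user/root/demo.txt", "root"));
    }
}
```

The second request goes to whatever host and port the namenode returns, so the client needs network access to the datanodes as well as to the namenode.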

LogStash failed action with response of 500, dropping action

I am trying to configure Logstash to watch a file and send events to an Elasticsearch server.
When I start Logstash with output to stdout, it runs fine:
stdout {
  codec => rubydebug
}
But when I add an elasticsearch output:
elasticsearch {
  cluster => 'myclustername'
  host => 'myip'
  node_name => 'Aragorn'
}
Logstash starts up:
Mar 16, 2015 3:44:24 PM org.elasticsearch.node.internal.InternalNode <init>
INFO: [Aragorn] version[1.4.0], pid[7136], build[bc94bd8/2014-11-05T14:26:12Z]
Mar 16, 2015 3:44:24 PM org.elasticsearch.node.internal.InternalNode <init>
INFO: [Aragorn] initializing ...
Mar 16, 2015 3:44:24 PM org.elasticsearch.plugins.PluginsService <init>
INFO: [Aragorn] loaded [], sites []
Mar 16, 2015 3:44:25 PM org.elasticsearch.node.internal.InternalNode <init>
INFO: [Aragorn] initialized
Mar 16, 2015 3:44:25 PM org.elasticsearch.node.internal.InternalNode start
INFO: [Aragorn] starting ...
Mar 16, 2015 3:44:25 PM org.elasticsearch.transport.TransportService doStart
INFO: [Aragorn] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.98.134.83:9300]}
Mar 16, 2015 3:44:25 PM org.elasticsearch.discovery.DiscoveryService doStart
INFO: [Aragorn] myclustername/RjasP2X0ShKXEl0f2WRxBA
Mar 16, 2015 3:44:30 PM org.elasticsearch.cluster.service.InternalClusterService$UpdateTask run
INFO: [Aragorn] detected_master [Aragorn][0YytUoWlQ2qgw2_0i5V4mQ][SOMEMACHINE][inet[/myip:9300]], added {[Aragorn][0YytUoWlQ2qgw2_0i5V4mQ][SOMEMACHINE][inet[/myip:9300]],}, reason: zen-disco-receive(from master [[Aragorn][0YytUoWlQ2qgw2_0i5V4mQ][SOMEMACHINE][inet[/myip:9300]]])
Mar 16, 2015 3:44:30 PM org.elasticsearch.node.internal.InternalNode start
INFO: [Aragorn] started
But when messages start coming in, nothing is actually sent to Elasticsearch, and warnings like these start to appear in the Logstash output:
WARNING: [Aragorn] Message not fully read (response) for [28] handler org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$6@17b531e, error [true], resetting
Mar 16, 2015 3:44:54 PM org.elasticsearch.transport.netty.MessageChannelHandler messageReceived
WARNING: [Aragorn] Message not fully read (response) for [29] handler org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$6@13082f0, error [true], resetting
and
{:timestamp=>"2015-03-16T15:44:54.377000+0100", :message=>"failed action with response of 500, dropping action...
(the above message is much longer but does not seem to contain any useful diagnostics)
What might be wrong?

MapReduce job failing even though the required class is present

My job is failing with a runtime exception saying that the mapper class was not found.
Below is the exception:
14/06/16 05:52:56 INFO mapred.JobClient: Task Id : attempt_201406071432_1142_m_000028_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.cloudera.sa.omniture.mr.OmnitureRawDataMapper not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1774)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:191)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:631)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.ClassNotFoundException: Class com.cloudera.sa.omniture.mr.OmnitureRawDataMapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1680)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1772)
... 8 more
However, the jar I use to run the job does contain the Mapper class, and I still hit this error. Please advise.
The listing below confirms that the Mapper class is present in the jar file.
[svtdphpd@d-hcr75y1 testJARs]$ jar tvf MalformedAnalysis.jar | grep Omniture
1405 Mon Jun 16 15:59:34 CDT 2014 com/cloudera/sa/omniture/mr/OmnitureRawDataMapper$HITDATA_PROBLEM.class
3728 Mon Jun 16 15:59:34 CDT 2014 com/cloudera/sa/omniture/mr/OmnitureRawDataMapper.class
5280 Mon Jun 16 16:00:46 CDT 2014 com/cloudera/sa/omniture/mr/OmnitureToRCFileJob.class
6436 Mon Jun 16 16:03:34 CDT 2014 com/cloudera/sa/omniture/mr/OmnitureDataFileRecordReader.class
3642 Mon Jun 16 16:00:16 CDT 2014 com/cloudera/sa/omniture/mr/OmnitureRawDataReducer.class
1792 Mon Jun 16 15:59:34 CDT 2014 com/cloudera/sa/omniture/mr/OmnitureDataFileInputFormat.class
Below is the Job configuration:
// Create job
Job job = Job.getInstance(config, "LoadOmnitureData");
job.setJarByClass(OmnitureToRCFileJob.class);
// Add named output for malformed records
MultipleOutputs.addNamedOutput(job, "Malformed",
        TextOutputFormat.class, NullWritable.class, Text.class);
FileInputFormat.addInputPath(job, new Path(inputPath));
job.setInputFormatClass(OmnitureDataFileInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
job.setOutputKeyClass(NullWritable.class);
job.setOutputValueClass(Text.class);
job.setReducerClass(OmnitureRawDataReducer.class);
job.setMapperClass(OmnitureRawDataMapper.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
TextOutputFormat.setOutputPath(job, new Path(outputPath));
job.setNumReduceTasks(numReduceTasks);
