Exception in YCSB load command - cassandra-2.0

I am trying to run the YCSB load command, but I get the error below after 310 seconds. I have set the following in my cassandra.yaml:
listen_address: localhost
seeds: 127.0.0.1
rpc_address: 0.0.0.0
broadcast_address: localhost
I also tried changing rpc_address to localhost, but I still get the same error. Can someone please help me with this?
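One thing worth checking: the log below shows YCSB trying to reach 172.31.18.250, while the cassandra.yaml above binds the node to localhost. A quick way to see whether Cassandra's Thrift port is actually reachable on either address is a plain socket probe; a minimal sketch, assuming the default Thrift rpc_port 9160 and using the address from the log:
import socket

# 9160 is Cassandra's default Thrift rpc_port; 172.31.18.250 is the address
# the YCSB log below reports it cannot reach. Adjust both as needed.
targets = [("127.0.0.1", 9160), ("172.31.18.250", 9160)]

for host, port in targets:
    try:
        with socket.create_connection((host, port), timeout=5):
            print(f"{host}:{port} is reachable")
    except OSError as exc:
        print(f"{host}:{port} is not reachable: {exc}")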
Loading workload...
Starting test.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/ubuntu/ycsb-0.3.0/cassandra-binding/lib/logback-classic-1.1.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/ubuntu/ycsb-0.3.0/cassandra-binding/lib/slf4j-log4j12-1.7.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2015-11-07 21:54:01:591 0 sec: 0 operations;
2015-11-07 21:54:11:545 10 sec: 0 operations;
2015-11-07 21:54:21:545 20 sec: 0 operations;
2015-11-07 21:54:31:545 30 sec: 0 operations;
2015-11-07 21:54:41:545 40 sec: 0 operations;
2015-11-07 21:54:51:545 50 sec: 0 operations;
2015-11-07 21:55:01:545 60 sec: 0 operations;
2015-11-07 21:55:11:545 70 sec: 0 operations;
2015-11-07 21:55:21:545 80 sec: 0 operations;
2015-11-07 21:55:31:545 90 sec: 0 operations;
2015-11-07 21:55:41:545 100 sec: 0 operations;
2015-11-07 21:55:51:545 110 sec: 0 operations;
2015-11-07 21:56:01:545 120 sec: 0 operations;
2015-11-07 21:56:11:545 130 sec: 0 operations;
2015-11-07 21:56:21:545 140 sec: 0 operations;
2015-11-07 21:56:31:545 150 sec: 0 operations;
2015-11-07 21:56:41:545 160 sec: 0 operations;
2015-11-07 21:56:51:545 170 sec: 0 operations;
2015-11-07 21:57:01:545 180 sec: 0 operations;
2015-11-07 21:57:11:545 190 sec: 0 operations;
2015-11-07 21:57:21:545 200 sec: 0 operations;
2015-11-07 21:57:31:545 210 sec: 0 operations;
2015-11-07 21:57:41:545 220 sec: 0 operations;
2015-11-07 21:57:51:545 230 sec: 0 operations;
2015-11-07 21:58:01:545 240 sec: 0 operations;
2015-11-07 21:58:11:545 250 sec: 0 operations;
2015-11-07 21:58:21:545 260 sec: 0 operations;
2015-11-07 21:58:31:545 270 sec: 0 operations;
2015-11-07 21:58:41:545 280 sec: 0 operations;
2015-11-07 21:58:51:545 290 sec: 0 operations;
2015-11-07 21:59:01:545 300 sec: 0 operations;
Unable to connect to 172.31.18.250 after 300 tries
com.yahoo.ycsb.DBException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
at com.yahoo.ycsb.db.CassandraClient10.init(CassandraClient10.java:162)
at com.yahoo.ycsb.DBWrapper.init(DBWrapper.java:63)
at com.yahoo.ycsb.ClientThread.run(Client.java:195)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
at org.apache.thrift.transport.TSocket.open(TSocket.java:187)
at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
at com.yahoo.ycsb.db.CassandraClient10.init(CassandraClient10.java:144)
... 2 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
... 4 more
2015-11-07 21:59:11:545 310 sec: 0 operations;

Related

Team site connection issue from my PC only; others are able to connect

com.interwoven.cssdk.common.CSException: (java.net.SocketException: Connection reset)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.interwoven.cssdk.client.axis.common.AxisExceptionTranslator.translateException(AxisExceptionTranslator.java:72)
at com.interwoven.cssdk.client.axis.access.AccessServiceAdapterImpl.beginSessionUsingPassword(AccessServiceAdapterImpl.java:228)
at com.interwoven.cssdk.client.axis.common.AxisFactory.getClient(AxisFactory.java:273)
Root cause:
AxisFault
faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException
faultSubcode:
faultString: java.net.SocketException: Connection reset
faultActor:
faultNode:
faultDetail:
{http://xml.apache.org/axis/}stackTrace:java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)

Unable to create Hive table, flaky metastore connections

I'm following this blog post that partitions S3 Access Logs by date using Hive and EMR. I was able to run this script against a small bucket of access logs okay, but table creation on top of a large bucket (~ 1.5 TB) fails with the following error:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.thrift.transport.TTransportException
I looked through the Hive logs under /mnt/var/log/hive, but nothing stands out. I'm not sure what the problem is, as this error is pretty generic. I'm pretty much following the blog post verbatim, and the script errors out on the following statement after 10 or 15 minutes:
CREATE EXTERNAL TABLE IF NOT EXISTS Accesslogs(....
Update: I found more log info and also ran Hive in debug mode. EMR gets intermittent connection failures to the metastore and then finally fails:
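Since the debug log that follows shows repeated "Connection refused" errors against the metastore URI, one quick sanity check is a plain TCP probe of that host and port, retried a few times the way the Hive client does. A minimal sketch, with the host and port taken from the log and running it on the EMR master node as an assumption:
import socket
import time

# Host/port copied from the metastore URI in the debug log below.
HOST, PORT = "ip-172-50-31-107.ec2.internal", 9083

for attempt in range(5):
    try:
        with socket.create_connection((HOST, PORT), timeout=5):
            print(f"attempt {attempt + 1}: metastore port is open")
            break
    except OSError as exc:
        print(f"attempt {attempt + 1}: {exc}")
        time.sleep(1)  # the Hive client also waits 1 second between attempts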
.........
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) [hive-cli-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) [hive-cli-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) [hive-cli-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) [hive-cli-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) [hive-cli-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686) [hive-cli-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_151]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_151]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_151]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
at org.apache.hadoop.util.RunJar.run(RunJar.java:221) [hadoop-common-2.7.3-amzn-5.jar:?]
at org.apache.hadoop.util.RunJar.main(RunJar.java:136) [hadoop-common-2.7.3-amzn-5.jar:?]
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_151]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_151]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_151]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_151]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_151]
at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_151]
at org.apache.thrift.transport.TSocket.open(TSocket.java:221) ~[hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
... 33 more
2017-12-10T15:18:18,718 INFO [e74af478-3227-4bf9-9fde-74d8babcf5f0 main([])]: hive.metastore (HiveMetaStoreClient.java:open(506)) - Waiting 1 seconds before next connection attempt.
2017-12-10T15:18:19,719 INFO [e74af478-3227-4bf9-9fde-74d8babcf5f0 main([])]: hive.metastore (HiveMetaStoreClient.java:open(392)) - Trying to connect to metastore with URI thrift://ip-172-50-31-107.ec2.internal:9083
2017-12-10T15:18:19,719 WARN [e74af478-3227-4bf9-9fde-74d8babcf5f0 main([])]: hive.metastore (HiveMetaStoreClient.java:open(472)) - Failed to connect to the MetaStore Server...
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.thrift.transport.TSocket.open(TSocket.java:226) ~[hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:465) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.reconnect(HiveMetaStoreClient.java:335) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:163) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at com.sun.proxy.$Proxy37.createTable(Unknown Source) [?:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_151]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_151]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_151]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2303) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at com.sun.proxy.$Proxy37.createTable(Unknown Source) [?:?]
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:854) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:869) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4356) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:354) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227) [hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) [hive-cli-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) [hive-cli-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) [hive-cli-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) [hive-cli-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) [hive-cli-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686) [hive-cli-2.3.1-amzn-0.jar:2.3.1-amzn-0]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_151]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_151]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_151]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
at org.apache.hadoop.util.RunJar.run(RunJar.java:221) [hadoop-common-2.7.3-amzn-5.jar:?]
at org.apache.hadoop.util.RunJar.main(RunJar.java:136) [hadoop-common-2.7.3-amzn-5.jar:?]
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_151]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_151]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_151]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_151]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_151]
at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_151]
at org.apache.thrift.transport.TSocket.open(TSocket.java:221) ~[hive-exec-2.3.1-amzn-0.jar:2.3.1-amzn-0]
... 33 more
2017-12-10T15:18:19,720 INFO [e74af478-3227-4bf9-9fde-74d8babcf5f0 main([])]: hive.metastore (HiveMetaStoreClient.java:open(506)) - Waiting 1 seconds before next connection attempt.
2017-12-10T15:18:20,721 INFO [e74af478-3227-4bf9-9fde-74d8babcf5f0 main([])]: hive.metastore (HiveMetaStoreClient.java:open(392)) - Trying to connect to metastore with URI thrift://ip-172-50-31-107.ec2.internal:9083
2017-12-10T15:18:20,721 INFO [e74af478-3227-4bf9-9fde-74d8babcf5f0 main([])]: hive.metastore (HiveMetaStoreClient.java:open(466)) - Opened a connection to metastore, current connections: 1
2017-12-10T15:18:20,795 INFO [e74af478-3227-4bf9-9fde-74d8babcf5f0 main([])]: hive.metastore (HiveMetaStoreClient.java:open(519)) - Connected to metastore.
2017-12-10T15:18:28,013 DEBUG [java-sdk-http-connection-reaper([])]: conn.PoolingHttpClientConnectionManager (PoolingHttpClientConnectionManager.java:closeIdleConnections(401)) - Closing connections idle longer than 60000 MILLISECONDS
2017-12-10T15:18:28,014 DEBUG [java-sdk-http-connection-reaper([])]: conn.PoolingHttpClientConnectionManager (PoolingHttpClientConnectionManager.java:closeIdleConnections(401)) - Closing connections idle longer than 60000 MILLISECONDS
2017-12-10T15:19:28,014 DEBUG [java-sdk-http-connection-reaper([])]: conn.PoolingHttpClientConnectionManager (PoolingHttpClientConnectionManager.java:closeIdleConnections(401)) - Closing connections idle longer than 60000 MILLISECONDS
2017-12-10T15:19:28,014 DEBUG [java-sdk-http-connection-reaper([])]: conn.PoolingHttpClientConnectionManager (PoolingHttpClientConnectionManager.java:closeIdleConnections(401)) - Closing connections idle longer than 60000 MILLISECONDS
2017-12-10T15:20:28,014 DEBUG [java-sdk-http-connection-reaper([])]: conn.PoolingHttpClientConnectionManager (PoolingHttpClientConnectionManager.java:closeIdleConnections(401)) - Closing connections idle longer than 60000 MILLISECONDS
2017-12-10T15:20:28,014 DEBUG [java-sdk-http-connection-reaper([])]: conn.PoolingHttpClientConnectionManager (PoolingHttpClientConnectionManager.java:closeIdleConnections(401)) - Closing connections idle longer than 60000 MILLISECONDS
2017-12-10T15:21:28,015 DEBUG [java-sdk-http-connection-reaper([])]: conn.PoolingHttpClientConnectionManager (PoolingHttpClientConnectionManager.java:closeIdleConnections(401)) - Closing connections idle longer than 60000 MILLISECONDS
2017-12-10T15:21:28,015 DEBUG [java-sdk-http-connection-reaper([])]: conn.PoolingHttpClientConnectionManager (PoolingHttpClientConnectionManager.java:closeIdleConnections(401)) - Closing connections idle longer than 60000 MILLISECONDS
2017-12-10T15:22:28,015 DEBUG [java-sdk-http-connection-reaper([])]: conn.PoolingHttpClientConnectionManager (PoolingHttpClientConnectionManager.java:closeIdleConnections(401)) - Closing connections idle longer than 60000 MILLISECONDS
2017-12-10T15:22:28,015 DEBUG [java-sdk-http-connection-reaper([])]: conn.PoolingHttpClientConnectionManager (PoolingHttpClientConnectionManager.java:closeIdleConnections(401)) - Closing connections idle longer than 60000 MILLISECONDS
2017-12-10T15:22:44,472 ERROR [e74af478-3227-4bf9-9fde-74d8babcf5f0 main([])]: exec.DDLTask (DDLTask.java:failed(639)) - org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTransportException
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:864)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:869)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4356)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:354)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:1199)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:1185)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2372)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.create_table_with_environment_context(SessionHiveMetaStoreClient.java:93)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:737)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:725)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173)
at com.sun.proxy.$Proxy37.createTable(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2303)
at com.sun.proxy.$Proxy37.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:854)
... 22 more
2017-12-10T15:22:44,472 ERROR [e74af478-3227-4bf9-9fde-74d8babcf5f0 main([])]: ql.Driver (SessionState.java:printError(1126)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.thrift.transport.TTransportException
2017-12-10T15:22:44,472 DEBUG [e74af478-3227-4bf9-9fde-74d8babcf5f0 main([])]: ql.Driver (DriverContext.java:shutdown(132)) - Shutting down query CREATE EXTERNAL TABLE IF NOT EXISTS Accesslogs(
BucketOwner string,
Bucket string,
RequestDateTime string,
RemoteIP string,
Requester string,
RequestID string,
Operation string,
Key string,
RequestURI_operation string,
RequestURI_key string,
RequestURI_httpProtoversion string,
HTTPstatus string,
ErrorCode string,
BytesSent string,
ObjectSize string,
TotalTime string,
TurnAroundTime string,
Referrer string,
UserAgent string,
VersionId string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1',
At a guess, Hive is trying to do something that is fast on a real filesystem (a recursive treewalk or a rename) but keels over when it gets to S3, where all of these operations are faked in the client.
Consider filing a JIRA against Hive for this; include any server-side logs you can, and try files at a few different scales to see when things start to fail.

Hive Too many bytes before newline

I am trying to run simple Hive queries like
select * from table_name;
but I get the error message below.
Error: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:213)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:333)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:719)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
... 11 more
Caused by: java.io.IOException: Too many bytes before newline: 2147483648
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:249)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:91)
at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:136)
at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
... 16 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 748 Reduce: 856 Cumulative CPU: 2938.2 sec HDFS Read: 173243267086 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 48 minutes 58 seconds 200 msec
I could not find a way around this problem. I also looked into this question, but could not find a way around Integer.MAX_VALUE. I would really appreciate any suggestions.
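The value 2147483648 in the error usually means the reader hit a single "line" longer than 2 GB, which typically points at missing or wrong newline delimiters in one of the input files rather than at a Hive setting. A minimal sketch for checking a local copy of a suspect file; the path here is a placeholder, not the real table location:
# Scan one input file and report the longest line, to confirm whether a
# record really exceeds the 2 GB limit. Placeholder path below.
path = "/tmp/suspect_input_file.txt"

chunk_size = 64 * 1024 * 1024  # 64 MB per read
longest = 0
current = 0
with open(path, "rb") as f:
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        parts = chunk.split(b"\n")
        for part in parts[:-1]:      # each of these ends a line
            current += len(part)
            longest = max(longest, current)
            current = 0
        current += len(parts[-1])    # carry the unfinished line into the next chunk
longest = max(longest, current)
print("longest line:", longest, "bytes")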

Exception when I use Apache Spark on a large dataset

So I get the following exception when I run some mapping and filtering on a relatively large dataset (~1 MB) using PySpark 1.4 on my local Windows machine. I have tried Apache Spark 1.6 and 2.0 and get the exact same exception.
However, if my dataset is small (~100 lines), the code works without any problem.
I have installed Apache Spark on my laptop and have a decent amount of storage space and RAM. I tried increasing the memory allocation for Spark, but it still throws the error. Please help!
Input File: c:/sparklocal/data/in/Emp/est12.txt
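For context, a minimal sketch of the kind of job the traceback below comes from: reading that input file and taking its first line. The input path and the .first() call mirror what the log shows; the master setting and app name are assumptions.
from pyspark import SparkConf, SparkContext

# Minimal local job mirroring the traceback below.
conf = SparkConf().setMaster("local[*]").setAppName("est12-load")
sc = SparkContext(conf=conf)

tv_platform_textslines = sc.textFile("c:/sparklocal/data/in/Emp/est12.txt")
header = tv_platform_textslines.first()  # the call that fails on the larger file
print(header)

sc.stop()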
16/09/29 15:32:58 ERROR PythonRDD: Python worker exited unexpectedly (crashed)
java.net.SocketException: Connection reset by peer: socket write error
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
at java.io.DataOutputStream.flush(Unknown Source)
at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:251)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1772)
at org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:208)
16/09/29 15:32:58 ERROR PythonRDD: This may have been caused by a prior exception:
java.net.SocketException: Connection reset by peer: socket write error
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
at java.io.DataOutputStream.flush(Unknown Source)
at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:251)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1772)
at org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:208)
16/09/29 15:32:58 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.net.SocketException: Connection reset by peer: socket write error
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
at java.io.DataOutputStream.flush(Unknown Source)
at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:251)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1772)
at org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:208)
16/09/29 15:32:58 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
Traceback (most recent call last):
File "c:/sparklocal/py/test.py", line 27, in <module>
header = tv_platform_textslines.first()
File "c:\sparklocal\spark\spark-1.4.1-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\rdd.py", line 1295, in first
File "c:\sparklocal\spark\spark-1.4.1-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\rdd.py", line 1277, in take
File "c:\sparklocal\spark\spark-1.4.1-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\context.py", line 897, in runJob
File "c:\sparklocal\spark\spark-1.4.1-bin-hadoop2.6\python\lib\py4j-0.8.2.1-src.zip\py4j\java_gateway.py", line 538, in __call__
File "c:\sparklocal\spark\spark-1.4.1-bin-hadoop2.6\python\lib\py4j-0.8.2.1-src.zip\py4j\protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.net.SocketException: Connection reset by peer: socket write error
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
at java.io.DataOutputStream.flush(Unknown Source)
at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:251)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1772)
at org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:208)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1273)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1264)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1263)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1263)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1457)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1418)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

Apache Thrift transport TTransportException

Hive Version : 0.13.1
Pig Version : 0.13.0
I was trying to read the Hive tables using Pig with the command below.
grunt> DATA = LOAD 'dev.profile' USING org.apache.hcatalog.pig.HCatLoader();
I get the following log output:
2014-07-16 22:44:58,986 [main] WARN org.apache.hadoop.hive.conf.HiveConf - DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
2014-07-16 22:44:59,037 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://localhost:10000
2014-07-16 22:44:59,057 [main] INFO hive.metastore - Connected to metastore.
2014-07-16 22:45:02,019 [main] WARN org.apache.hadoop.hive.conf.HiveConf - DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
2014-07-16 22:45:02,166 [main] WARN org.apache.hadoop.hive.conf.HiveConf - DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
When I describe the relation, the results come back as expected.
grunt> describe DATA
2014-07-16 22:46:42,189 [main] WARN org.apache.hadoop.hive.conf.HiveConf - DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
DATA: {name: chararray,age: int,salary: int}
But when I dump the data, I get a SocketTimeoutException:
2014-07-16 22:47:25,146 [main] ERROR hive.log - Got exception: org.apache.thrift.transport.TTransportException java.net.SocketTimeoutException: Read timed out
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:826)
at org.apache.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.isOpen(HiveClientCache.java:276)
at org.apache.hcatalog.common.HiveClientCache.get(HiveClientCache.java:146)
at org.apache.hcatalog.common.HCatUtil.getHiveClient(HCatUtil.java:548)
at org.apache.hcatalog.pig.PigHCatUtil.getHiveMetaClient(PigHCatUtil.java:158)
at org.apache.hcatalog.pig.PigHCatUtil.getTable(PigHCatUtil.java:200)
at org.apache.hcatalog.pig.HCatLoader.getSchema(HCatLoader.java:195)
at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:885)
at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1712)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1420)
at org.apache.pig.PigServer.storeEx(PigServer.java:1004)
at org.apache.pig.PigServer.store(PigServer.java:974)
at org.apache.pig.PigServer.openIterator(PigServer.java:887)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:752)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:228)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:542)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
... 40 more
2014-07-16 22:47:25,148 [main] ERROR hive.log - Converting exception to MetaException
2014-07-16 22:47:25,151 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://localhost:10000
2014-07-16 22:47:25,152 [main] INFO hive.metastore - Connected to metastore.
2014-07-16 22:47:45,173 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Cannot get schema from loadFunc org.apache.hcatalog.pig.HCatLoader
Failed to parse: Can not retrieve schema from loader org.apache.hcatalog.pig.HCatLoader#1342464f
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:198)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1712)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1420)
at org.apache.pig.PigServer.storeEx(PigServer.java:1004)
at org.apache.pig.PigServer.store(PigServer.java:974)
at org.apache.pig.PigServer.openIterator(PigServer.java:887)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:752)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:228)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:542)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.RuntimeException: Can not retrieve schema from loader org.apache.hcatalog.pig.HCatLoader#1342464f
at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:91)
at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:885)
at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
... 17 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: Cannot get schema from loadFunc org.apache.hcatalog.pig.HCatLoader
at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:179)
at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
... 24 more
Caused by: java.io.IOException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.hcatalog.pig.PigHCatUtil.getTable(PigHCatUtil.java:205)
at org.apache.hcatalog.pig.HCatLoader.getSchema(HCatLoader.java:195)
at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
... 25 more
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1036)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1022)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
at org.apache.hcatalog.common.HCatUtil.getTable(HCatUtil.java:194)
at org.apache.hcatalog.pig.PigHCatUtil.getTable(PigHCatUtil.java:201)
... 27 more
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
... 37 more
2014-07-16 22:47:45,176 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2245: Cannot get schema from loadFunc org.apache.hcatalog.pig.HCatLoader
Even though I am able to connect to the metastore, I am not able to retrieve the data. What could be the reason for the read failure?
At times the process also fails with java.lang.OutOfMemoryError: Java heap space.
Any help would be greatly appreciated.
Edit the hive-site.xml.
Replace hive.metastore.ds.retry with hive.hmshandler.retry.
vim /usr/local/Cellar/hive/0.13.1/libexec/conf/hive-site.xml
:%s/hive.metastore.ds.retry/hive.hmshandler.retry/g
:wq
