Nutch: failed with: java.net.SocketException: Connection reset - hadoop

I run Nutch on my Hadoop cluster.
When the job reaches the fetch step, I get java.net.SocketException: Connection reset. Here's the full stack trace:
2013-10-09 00:34:05,922 INFO org.apache.nutch.fetcher.Fetcher: fetch of Url error : xxxxxxx failed with: java.net.SocketException: Connection reset
2013-10-09 00:34:05,923 ERROR org.apache.nutch.protocol.httpclient.Http: Failed to get protocol output
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:77)
at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:105)
at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1115)
at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1373)
at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1832)
at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1590)
at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:995)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:397)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:170)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:396)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:324)
at org.apache.nutch.protocol.httpclient.HttpResponse.<init>(HttpResponse.java:94)
at org.apache.nutch.protocol.httpclient.Http.getResponse(Http.java:154)
at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:140)
at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:703)

You have to specify the protocol of each URL in your seed list. For example:
http://stackoverflow.com/
https://google.com
ftp://foo.bar
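A quick pre-flight check along these lines can flag seed entries that lack a protocol before the crawl starts. This is only a minimal sketch in Python and is not part of Nutch; the function name seeds_missing_protocol, the sample entry example.org/page, and the choice to skip lines starting with "#" as comments are assumptions made for illustration.

```python
import re

# A URL scheme: a letter, then letters/digits/"+"/"-"/".", followed by "://".
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def seeds_missing_protocol(lines):
    """Return the seed entries that do not start with a protocol such as http://."""
    missing = []
    for line in lines:
        entry = line.strip()
        # Skip blank lines and lines assumed to be comments.
        if not entry or entry.startswith("#"):
            continue
        if not SCHEME_RE.match(entry):
            missing.append(entry)
    return missing


if __name__ == "__main__":
    # Hypothetical seed list: the last entry has no protocol.
    seeds = [
        "http://stackoverflow.com/",
        "https://google.com",
        "ftp://foo.bar",
        "example.org/page",
    ]
    for entry in seeds_missing_protocol(seeds):
        print("missing protocol:", entry)
```

Any entry it reports would need a prefix such as http:// before being injected.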

Related

ambari on HDP cluster + ambari-metrics-collector service not start

We have an issue with the ambari-metrics-collector service (HDP cluster version 2.6.4 with 8 nodes).
The Ambari Metrics Collector service either can't start, or starts for a few seconds and then fails.
Details of the installed metrics packages:
rpm -qa | grep metrics
ambari-metrics-grafana-2.6.1.0-143.x86_64
ambari-metrics-monitor-2.6.1.0-143.x86_64
ambari-metrics-collector-2.5.0.3-7.x86_64
ambari-metrics-hadoop-sink-2.6.1.0-143.x86_64
All machines run RHEL 7.2.
We performed the following steps to try to resolve the problem:
1. Restart the metrics-collector service
su - ams -c '/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf/ stop'
su - ams -c '/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf/ start'
or
ambari-metrics-collector stop
ambari-metrics-collector start
2. Restart ambari-metrics-monitor on all nodes
ambari-metrics-monitor stop
ambari-metrics-monitor start
3. Clean the folder /var/lib/ambari-metrics-collector/hbase-tmp/zookeeper/
mv /var/lib/ambari-metrics-collector/hbase-tmp/zookeeper/zookeeper_0 /tmp/bck/zookeeper/
Then restart the metrics-collector service.
4. Tune the metrics-collector parameters according to https://docs.cloudera.com/HDPDocuments/Ambari-2.2.1.0/bk_ambari_reference_guide/content/_ams_general_guidelines.html
We updated the following parameters in Ambari:
metrics_collector_heap_size=1024
hbase_regionserver_heapsize=1024
hbase_master_heapsize=512
hbase_master_xmn_size=128
Status for now: steps 1-4 didn't help.
From the logs we can see the following.
Log file: ambari-metrics-collector.log
2020-06-25 09:06:14,474 WARN org.apache.zookeeper.ClientCnxn: Session 0x172eab71f310002 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
2020-06-25 09:06:14,575 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=master02.sys671.com:61181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-unsecure/meta-region-server
Log file: hbase-ams-master-master02.sys671.com.log
2020-06-25 09:38:18,799 WARN [RS:0;master02:51842-SendThread(master02.sys671.com:61181)] zookeeper.ClientCnxn: Session 0x172ead5d73a0004 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2020-06-25 09:38:20,437 INFO [main-SendThread(master02.sys671.com:61181)] zookeeper.ClientCnxn: Opening socket connection to server master02.sys671.com/23.2.35.171:61181. Will not attempt to authenticate using SASL (unknown error)
2020-06-25 09:38:20,438 WARN [main-SendThread(master02.sys671.com:61181)] zookeeper.ClientCnxn: Session 0x172ead5d73a0002 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
We also don't see the port from timeline.metrics.service.webapp.address listening:
netstat -tulpn | grep 6188
Any advice on how to continue from this point?
We'd appreciate any help with this problem.

com.datastax.driver.core.TransportException: [/xx.xxx.x.xxx:9042] Cannot connect

I have 2 nodes in a Cassandra cluster with IP:port aa.aaa.a.aaa:9043 (node) and xx.xxx.x.xxx:9043. When I try to connect using the following config:
PoolingOptions poolingOptions = new PoolingOptions();
poolingOptions.setCoreConnectionsPerHost(HostDistance.LOCAL, 2)
    .setMaxConnectionsPerHost(HostDistance.LOCAL, 4)
    .setCoreConnectionsPerHost(HostDistance.REMOTE, 2)
    .setMaxConnectionsPerHost(HostDistance.REMOTE, 4)
    .setMaxRequestsPerConnection(HostDistance.LOCAL, 200)
    .setMaxRequestsPerConnection(HostDistance.REMOTE, 200);
cluster = Cluster.builder()
    .addContactPointsWithPorts(socketAddressList)
    .withPoolingOptions(poolingOptions)
    .withRetryPolicy(DefaultRetryPolicy.INSTANCE)
    .withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
    .build();
Session session = cluster.connect(cassandraDB);
I get the following exception:
16/01/14 09:52:45 INFO core.NettyUtil: Did not find Netty's native epoll transport in the classpath, defaulting to NIO.
16/01/14 09:52:46 WARN core.Cluster: You listed /xx.xxx.x.xxx:9043 in your contact points, but it could not be reached at startup
16/01/14 09:52:47 INFO policies.DCAwareRoundRobinPolicy: Using data-center name 'name' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
ent.Futures$CombinedFuture setExceptionAndMaybeLog
SEVERE: input future failed.
com.datastax.driver.core.TransportException: [/xx.xxx.x.xxx:9042] Cannot connect
at com.datastax.driver.core.Connection$1.operationComplete(Connection.java:156)
at com.datastax.driver.core.Connection$1.operationComplete(Connection.java:139)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:268)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:284)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /xx.xxx.x.xxx:9042
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:281)
... 6 more
16/01/14 09:52:47 ERROR core.Session: Error creating pool to /xx.xxx.x.xxx:9042
com.datastax.driver.core.TransportException: [/xx.xxx.x.xxx:9042] Cannot connect
at com.datastax.driver.core.Connection$1.operationComplete(Connection.java:156)
at com.datastax.driver.core.Connection$1.operationComplete(Connection.java:139)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:268)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:284)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /xx.xxx.x.xxx:9042
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:281)
My question is:
Why is it trying to connect on port 9042, when I am not using this port anywhere in my code or in the config file?
Cassandra version: Cassandra 2.2.1
"Why is it trying to connect on port 9042, when I am not using this port anywhere in my code or in the config file?"
9042 is the default port for the CQL binary protocol.
Can you show us the content of the variable socketAddressList that you passed to the cluster builder?
Is there any reason you're using port 9043 instead of the default 9042 port?

Error connecting Oracle with Hadoop using Sqoop

I have added the ojdbc.jar file to /usr/lib/sqoop/lib and I am trying to connect Oracle to Hadoop using Sqoop, but I get an error.
I am using the following command:
sqoop list-tables --connect jdbc:oracle:thin://#192.162.2.8:1521:orcl --username hr --password abc
But I get the following error:
15/05/05 09:21:31 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/05/05 09:21:32 ERROR manager.OracleManager: Failed to rollback transaction
java.lang.NullPointerException
at com.cloudera.sqoop.manager.OracleManager.listTables(OracleManager.java:596)
at com.cloudera.sqoop.tool.ListTablesTool.run(ListTablesTool.java:49)
at com.cloudera.sqoop.Sqoop.run(Sqoop.java:144)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:180)
at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:218)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:228)
15/05/05 09:21:32 ERROR manager.OracleManager: Failed to list tables
java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:489)
at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:553)
at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:254)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:32)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:528)
at java.sql.DriverManager.getConnection(DriverManager.java:582)
at java.sql.DriverManager.getConnection(DriverManager.java:185)
at com.cloudera.sqoop.manager.OracleManager.makeConnection(OracleManager.java:275)
at com.cloudera.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:51)
at com.cloudera.sqoop.manager.OracleManager.listTables(OracleManager.java:585)
at com.cloudera.sqoop.tool.ListTablesTool.run(ListTablesTool.java:49)
at com.cloudera.sqoop.Sqoop.run(Sqoop.java:144)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:180)
at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:218)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:228)
Caused by: oracle.net.ns.NetException: The Network Adapter could not establish the connection
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:439)
at oracle.net.resolver.AddrResolution.resolveAndExecute(AddrResolution.java:454)
at oracle.net.ns.NSProtocol.establishConnection(NSProtocol.java:693)
at oracle.net.ns.NSProtocol.connect(NSProtocol.java:251)
at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1140)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:340)
... 16 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:529)
at oracle.net.nt.TcpNTAdapter.connect(TcpNTAdapter.java:149)
at oracle.net.nt.ConnOption.connect(ConnOption.java:133)
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:405)
Is there anything wrong with the sqoop command?
The error "The Network Adapter could not establish the connection" is caused by an incorrect JDBC URL. The JDBC URL in your sqoop command should be in this format: jdbc:oracle:thin:@192.162.2.8:1521:orcl
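To make the expected shape concrete, here is a minimal Python sketch that checks a string against the SID-style thin URL, jdbc:oracle:thin:@host:port:SID. It is not part of Sqoop or the Oracle driver; the function name parse_thin_sid_url is hypothetical, and the pattern deliberately ignores other URL forms the driver accepts, such as the service-name form jdbc:oracle:thin:@//host:port/service.

```python
import re

# SID-style thin URL: jdbc:oracle:thin:@host:port:SID (no "//" right after "thin:").
THIN_SID_RE = re.compile(
    r"^jdbc:oracle:thin:@(?P<host>[^:/@\s]+):(?P<port>\d+):(?P<sid>[^:/\s]+)$"
)


def parse_thin_sid_url(url):
    """Return (host, port, sid) for a SID-style thin URL, or None if it does not match."""
    match = THIN_SID_RE.match(url)
    if match is None:
        return None
    return match.group("host"), int(match.group("port")), match.group("sid")


if __name__ == "__main__":
    for candidate in (
        "jdbc:oracle:thin:@192.162.2.8:1521:orcl",
        "jdbc:oracle:thin://@192.162.2.8:1521:orcl",
    ):
        print(candidate, "->", parse_thin_sid_url(candidate))
```

A None result means the string does not have the host, port and SID where this form expects them, which is worth ruling out before looking at the network.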
As far as I know, the connection refused error can occur in the following scenarios:
The Oracle service might not be running on the specified host on the given port number.
A firewall in between might block client access to the Oracle server on the given port number.
So I suggest you first confirm the Oracle host, the port, and any firewall restriction in between.
You can easily check access using telnet, as below:
telnet 192.162.2.8 1521
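If telnet is not installed on the client machine, the same reachability check can be sketched in a few lines of Python. This is an illustrative helper, not a Sqoop or Oracle tool; the name can_connect and the 3-second default timeout are assumptions.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Host and port taken from the question's command.
    print(can_connect("192.162.2.8", 1521))
```

A False result only says the TCP connection failed. It does not distinguish between a stopped listener and a firewall dropping packets.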
Check whether the listener and the database are started. I just started the listener (lsnrctl start) and the database (sqlplus / as sysdba, then startup) and it worked.

HBase RegionServer: error telling master we are up

I am getting the following errors in the logs of the slave RegionServer.
The problem seems to be at:
regionserver.HRegionServer: reportForDuty to master=localhost,60000,1397430611631 with port=60020
The master is set as localhost, but it should actually point to the master node.
Even after going through the docs, I am unable to figure out how the slave determines which host is the master.
The complete log is:
2014-04-14 04:49:35,939 INFO [regionserver60020] regionserver.HRegionServer: CompactionChecker runs every 10sec
2014-04-14 04:49:35,950 INFO [regionserver60020] regionserver.HRegionServer: reportForDuty to master=localhost,60000,1397430611631 with port=60020, startcode=1397431174733
2014-04-14 04:49:36,083 WARN [regionserver60020] regionserver.HRegionServer: error telling master we are up
com.google.protobuf.ServiceException: java.net.ConnectException: Connection refused
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1671)
at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1712)
at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:5402)
at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2013)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:846)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:866)
at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1536)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1435)
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1654)
... 5 more
Check your /etc/hosts file. If there is something like
127.0.0.1 localhost yourhost
change it to
127.0.0.1 localhost
192.168.1.1 yourhost
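The check above can be sketched as a small Python helper that lists host names mapped to an IPv4 loopback address. The function name names_on_loopback and the ignore list are assumptions for illustration; it only looks at 127.x.x.x entries and does not consider IPv6 (::1).

```python
def names_on_loopback(hosts_text, ignore=("localhost", "localhost.localdomain")):
    """Return host names mapped to a 127.x.x.x address, excluding the ignored names."""
    flagged = []
    for raw in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if address.startswith("127."):
            flagged.extend(name for name in names if name not in ignore)
    return flagged


if __name__ == "__main__":
    with open("/etc/hosts") as handle:
        print(names_on_loopback(handle.read()))
```

Any name it prints is one that resolves to the local machine, so it cannot be reached under that name's loopback address from another node.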

what is causing recv failed (ActiveMQ 5.6.0 on Windows 7)

I'm having a problem with ActiveMQ 5.6.0 on Windows 7.
WARN | Transport Connection to: tcp://127.0.0.1:55793 failed: java.net.SocketException: Software caused connection abort: recv failed
WARN | Transport Connection to: tcp://127.0.0.1:55791 failed: java.net.SocketException: Software caused connection abort: recv failed
WARN | Transport Connection to: tcp://127.0.0.1:55792 failed: java.net.SocketException: Software caused connection abort: recv failed
Does anyone know what might cause this?
Can you run your app/JRE with the following command line arguments?
-Djava.net.preferIPv4Stack=true
