java.io.IOException: Async IO operation failed (1), reason: RC: 32 Broken pipe - websphere

I am getting the error below in the application log, even though request processing completed successfully, i.e. the Producer processed the request successfully.
Caused by: java.io.IOException: Async IO operation failed (1), reason: RC: 32 Broken pipe
at com.ibm.io.async.AsyncLibrary$IOExceptionCache.<init>(AsyncLibrary.java:924) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.io.async.AsyncLibrary$IOExceptionCache.get(AsyncLibrary.java:937) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.io.async.AsyncLibrary.getIOException(AsyncLibrary.java:951) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.io.async.AbstractAsyncChannel.multiIO(AbstractAsyncChannel.java:482) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.io.async.AsyncSocketChannelHelper.write(AsyncSocketChannelHelper.java:478) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.ws.tcp.channel.impl.AioSocketIOChannel.writeAIOSync(AioSocketIOChannel.java:353) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.ws.tcp.channel.impl.AioTCPWriteRequestContextImpl.processSyncWriteRequest(AioTCPWriteRequestContextImpl.java:126) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.ws.tcp.channel.impl.TCPWriteRequestContextImpl.write(TCPWriteRequestContextImpl.java:122) ~[?:CCX.CF [o1800.01]]
at com.ibm.ws.ssl.channel.impl.SSLUtils.flushCloseDown(SSLUtils.java:214) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.ws.ssl.channel.impl.SSLUtils.shutDownSSLEngine(SSLUtils.java:126) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.ws.ssl.channel.impl.SSLConnectionLink.cleanup(SSLConnectionLink.java:228) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.ws.ssl.channel.impl.SSLConnectionLink.close(SSLConnectionLink.java:172) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.close(HttpInboundLink.java:899) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.wsspi.channel.base.InboundApplicationLink.close(InboundApplicationLink.java:58) ~[?:CCX.CF [o1800.01]]
at com.ibm.ws.webcontainer.channel.WCChannelLink.close(WCChannelLink.java:333) ~[com.ibm.ws.webcontainer.jar:?]
at com.ibm.ws.webcontainer.channel.WCChannelLink.releaseChannelLink(WCChannelLink.java:503) ~[com.ibm.ws.webcontainer.jar:?]
at com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:405) ~[com.ibm.ws.webcontainer.jar:?]
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:465) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewRequest(HttpInboundLink.java:532) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.processRequest(HttpInboundLink.java:318) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.ready(HttpInboundLink.java:289) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.ws.ssl.channel.impl.SSLConnectionLink.determineNextChannel(SSLConnectionLink.java:1187) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.ws.ssl.channel.impl.SSLConnectionLink$MyReadCompletedCallback.complete(SSLConnectionLink.java:694) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.ws.ssl.channel.impl.SSLReadServiceContext$SSLReadCompletedCallback.complete(SSLReadServiceContext.java:1833) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:175) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161) ~[com.ibm.ws.runtime.jar:?]
at <unknown class>.<unknown method>(Unknown Source) ~[?:?]
at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:204) ~[com.ibm.ws.runtime.jar:?]
at com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:775) ~[com.ibm.ws.runtime.jar:?]
Spring 5.3.13 (REST controller)
IBM WAS 9.0
JDK 1.8
The Consumer received a 504 Gateway Timeout error, and the Producer log shows java.io.IOException: Async IO operation failed (1), reason: RC: 32 Broken pipe.
What could be the reason for RC: 32 Broken pipe?

For asynchronous web services, the client asynchronous response listener opens a socket with the default value of seven seconds to persist and listen for asynchronous responses. If the server operation takes longer than the default value, the server or client might receive the following exception:
java.io.IOException: Async IO operation failed (1), reason: RC: 32 Broken pipe
This exception occurs because the persistent read timeout limit is exceeded on the client and subsequently, the connection is closed.
Use this property (HttpInboundPersistReadTimeout) when you are reading large data, or when the network is slow enough that the server side needs more than the default seven seconds to read the data. If you receive the broken pipe exception on the server side, increase the value of this timeout property.
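The mechanism described above can be reproduced in miniature with plain java.net sockets. This is only a rough analogue, not WebSphere's channel framework: the class name PersistTimeoutSketch, the method serverTimedOut and the millisecond values are made up for illustration, and setSoTimeout stands in for the persistent read timeout. The server side waits a limited time for the next byte and closes the connection if nothing arrives.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class PersistTimeoutSketch {

    /**
     * Returns true if the server side gave up waiting (read timeout) before
     * the client sent anything. All names and timings here are illustrative.
     */
    public static boolean serverTimedOut(int readTimeoutMs, int clientIdleMs)
            throws IOException, InterruptedException {
        AtomicBoolean timedOut = new AtomicBoolean(false);
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    peer.setSoTimeout(readTimeoutMs);  // stand-in for the persist read timeout
                    peer.getInputStream().read();      // wait for the next byte from the client
                } catch (SocketTimeoutException e) {
                    timedOut.set(true);                // limit exceeded; peer is closed on exit
                } catch (IOException e) {
                    // other I/O failures are ignored in this sketch
                }
            });
            serverThread.start();
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                Thread.sleep(clientIdleMs);            // the client is busy elsewhere for this long
                try {
                    client.getOutputStream().write(1); // may arrive after the server has closed
                } catch (IOException e) {
                    // a late write can fail with a broken pipe or reset, depending on OS timing
                }
            }
            serverThread.join();
        }
        return timedOut.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("timeout 1000 ms, idle 50 ms: " + serverTimedOut(1000, 50));
        System.out.println("timeout 100 ms, idle 500 ms: " + serverTimedOut(100, 500));
    }
}
```

When the idle period is longer than the timeout, the connection is already gone by the time the other side tries to use it, which is the situation in which a broken pipe is reported. Raising the timeout above the longest expected pause avoids the early close.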

My application (Spring REST) is multithreaded, so several threads were trying to write to a single log file, which caused the issue.
1. I changed the log level from DEBUG to ERROR.
2. I implemented 3 log files with asynchronous loggers.
3. I changed HttpInboundPersistReadTimeout from 7 seconds to 10 seconds in the IBM WAS console.
https://logging.apache.org/log4j/2.x/manual/async.html
https://www.ibm.com/docs/en/was-nd/8.5.5?topic=services-http-transport-custom-properties-web-applications
Now there are no more timeout errors.

Related

WebSphere App Server timing out while stopping (ADMU3060E: Timed out waiting for server shutdown)

We are having daily issues in our production environment where WebSphere Application Servers time out while stopping. The error received is:
ADMU3060E: Timed out waiting for server shutdown
Below is some content from the error:
[6/14/20 3:30:21:650 CDT] 00000001 AdminTool A ADMU3201I: Server stop request issued. Waiting for stop status.
[6/14/20 3:50:21:719 CDT] 00000001 AdminTool A ADMU3111E: Server stop requested but failed to complete.
[6/14/20 3:50:21:720 CDT] 00000001 WsServerStop E ADMU3002E: Exception attempting to process server c0tcpc0
[6/14/20 3:50:21:721 CDT] 00000001 WsServerStop E ADMU3007E: Exception com.ibm.websphere.management.exception.AdminException: ADMU3060E: Timed out waiting for server shutdown.
[6/14/20 3:50:21:721 CDT] 00000001 WsServerStop A ADMU3007E: Exception com.ibm.websphere.management.exception.AdminException: ADMU3060E: Timed out waiting for server shutdown.
at com.ibm.ws.management.tools.WsServerStop.runTool(WsServerStop.java:434)
at com.ibm.ws.management.tools.AdminTool.executeUtility(AdminTool.java:271)
at com.ibm.ws.management.tools.WsServerStop.main(WsServerStop.java:113)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:95)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:56)
at java.lang.reflect.Method.invoke(Method.java:620)
at com.ibm.wsspi.bootstrap.WSLauncher.launchMain(WSLauncher.java:234)
at com.ibm.wsspi.bootstrap.WSLauncher.main(WSLauncher.java:96)
at com.ibm.wsspi.bootstrap.WSLauncher.run(WSLauncher.java:77)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
WebSphere Application Server version is 8.5.5.9 and the WebSphere SDK version is 7.0.9.30.
The stopServer command has an optional timeout (in seconds) that you can specify. See if using that helps, e.g. stopServer.sh server1 -timeout 20
stopServer command
(Sorry, I didn't notice at first that your stop and fail messages are already more than 20 seconds apart. Still, specifying a longer-than-default timeout in stopServer might be useful.)

internal.S3AbortableInputStream on hadoop fs -get s3 to EMR

When I ssh onto an EMR cluster and run the following command:
hadoop fs -get s3://path/to/my/files
I am getting the following error, and the file transfer fails partway through. I have used this command in the past, so I'm not sure what's up. Could it be related to the files' encryption? What would cause the stream to consistently close?
WARN internal.S3AbortableInputStream: Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
Exception in thread "main" org.apache.hadoop.fs.FSError: java.io.IOException: Stream Closed
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:253)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:74)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:108)
at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:261)
at org.apache.hadoop.fs.shell.CommandWithDestination$TargetFileSystem.writeStreamToFile(CommandWithDestination.java:478)
at org.apache.hadoop.fs.shell.CommandWithDestination.copyStreamToTarget(CommandWithDestination.java:395)
at org.apache.hadoop.fs.shell.CommandWithDestination.copyFileToTarget(CommandWithDestination.java:328)
at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:263)
at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:248)
at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:373)
at org.apache.hadoop.fs.shell.CommandWithDestination.recursePath(CommandWithDestination.java:291)
at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:319)
at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:373)
at org.apache.hadoop.fs.shell.CommandWithDestination.recursePath(CommandWithDestination.java:291)
at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:319)
at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
at org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:243)
at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
at org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:220)
at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
Caused by: java.io.IOException: Stream Closed
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:251)
... 30 more
My best guess: Not enough space on the cluster for the files.
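One way to test this guess before rerunning the copy is to compare the free space in the local destination directory with the size of the data. The sketch below uses only the JDK; the class name FreeSpaceCheck, the method hasRoom and the 10 GiB default are hypothetical, and the required size has to come from elsewhere (for example from hadoop fs -du -s on the source path).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FreeSpaceCheck {

    /** True if the file store that holds dir reports at least requiredBytes usable. */
    public static boolean hasRoom(Path dir, long requiredBytes) throws IOException {
        long usable = Files.getFileStore(dir).getUsableSpace();
        return usable >= requiredBytes;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical defaults: current directory and 10 GiB; pass your own values.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        long required = args.length > 1 ? Long.parseLong(args[1]) : 10L * 1024 * 1024 * 1024;
        System.out.println(dir.toAbsolutePath() + " has room for " + required + " bytes: "
                + hasRoom(dir, required));
    }
}
```

If this reports false for the directory that hadoop fs -get writes to, the "Stream Closed" failure partway through the transfer is consistent with the disk filling up.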

Spark times out, maybe due to binaryFiles() with more than 1 million files in HDFS

I am reading millions of xml files via
val xmls = sc.binaryFiles(xmlDir)
The operation runs fine locally but on yarn it fails with:
client token: N/A
diagnostics: Application application_1433491939773_0012 failed 2 times due to ApplicationMaster for attempt appattempt_1433491939773_0012_000002 timed out. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1433750951883
final status: FAILED
tracking URL: http://controller01:8088/cluster/app/application_1433491939773_0012
user: ariskk
Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:622)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
In the hadoops/userlogs logs I frequently see these messages:
15/06/08 09:15:38 WARN util.AkkaUtils: Error sending message [message = Heartbeat(1,[Lscala.Tuple2;#2b4f336b,BlockManagerId(1, controller01.stratified, 58510))] in 2 attempts
java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.util.AkkaUtils$.askWithReply(AkkaUtils.scala:195)
at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:427)
I run my Spark job via spark-submit and it works for another HDFS directory that contains only 37k files. Any ideas how to resolve this?
OK, after getting some help on the Spark mailing list, I found out there were 2 issues:
1. The source directory: if it is given as /my_dir/, Spark fails and the heartbeat issues appear. It should be given as hdfs:///my_dir/* instead.
2. After fixing #1, an out-of-memory error appears in the logs. The Spark driver running on YARN runs out of memory because of the number of files (apparently it keeps all file info in memory). I submitted the job with --conf spark.driver.memory=8g, which fixed the issue.

Hbase Heavy write Exception

This exception is raised in HBase when there is heavy writing to the cluster:
WARN org.apache.hadoop.ipc.HBaseServer: IPC Server listener on 60020: readAndProcess threw exception java.io.IOException: Connection reset by peer. Count of bytes read: 0
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:198)
at sun.nio.ch.IOUtil.read(IOUtil.java:171)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:245)
at org.apache.hadoop.hbase.ipc.HBaseServer.channelRead(HBaseServer.java:1676)
at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1120)
at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703)
at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495)
at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
And this warning is raised:
WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":761893,"call":"multi(org.apache.hadoop.hbase.client.MultiAction#5bf92021), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"172.16.0.121:55803","starttimems":1378784998180,"queuetimems":0,"class":"HRegionServer","responsesize":0,"method":"multi"}
WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call multi(org.apache.hadoop.hbase.client.MultiAction#5bf92021), rpc version=1, client version=29, methodsFingerPrint=54742778 from 172.16.0.121:55803: output error
WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 39 on 60020 caught: java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:135)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:326)
at org.apache.hadoop.hbase.ipc.HBaseServer.channelIO(HBaseServer.java:1710)
at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1653)
at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:924)
at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:1003)
at org.apache.hadoop.hbase.ipc.HBaseServer$Call.sendResponseIfReady(HBaseServer.java:409)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1346)
This is caused not by heavy writes but by big writes. "processingtimems":761893 means the server spent about 761 seconds processing the write operation, and the client timed out before it finished. Try reducing the number of items in each multi operation.
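Reducing the item count usually means splitting one large list of operations into several smaller calls. Here is a minimal sketch of that idea in plain Java. The class MultiBatchSketch and its methods are hypothetical, and the sender callback stands in for the real client call (with the HBase client it would be something like table.put(batch)); the batch size is something to tune, not a recommended value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class MultiBatchSketch {

    /** Splits items into consecutive sublists of at most batchSize elements. */
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    /** Sends each batch separately; sender stands in for the real client call. */
    public static <T> int sendInBatches(List<T> items, int batchSize, Consumer<List<T>> sender) {
        List<List<T>> batches = partition(items, batchSize);
        for (List<T> batch : batches) {
            sender.accept(batch);
        }
        return batches.size();
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("r1", "r2", "r3", "r4", "r5");
        int calls = sendInBatches(rows, 2, batch -> System.out.println("sending " + batch));
        System.out.println(calls + " calls");
    }
}
```

With smaller batches each call finishes sooner, so a single request is less likely to outlive the client's RPC timeout.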

JBOSS error Throwable while attempting to get a new connection: null Io exception: Connection reset

After JBoss has been running for about a week, it crashes with the following exception:
Software: JBoss 5.0, JDK 1.6, ojdbc14.jar
WARN [JBossManagedConnectionPool] Throwable while attempting to get a new connection: null
org.jboss.resource.JBossResourceException: Could not create connection; - nested throwable: (java.sql.SQLException: Io exception: Connection reset)
at org.jboss.resource.adapter.jdbc.local.LocalManagedConnectionFactory.getLocalManagedConnection(LocalManagedConnectionFactory.java:225)
at org.jboss.resource.adapter.jdbc.local.LocalManagedConnectionFactory.createManagedConnection(LocalManagedConnectionFactory.java:195)
at org.jboss.resource.connectionmanager.InternalManagedConnectionPool.createConnectionEventListener(InternalManagedConnectionPool.java:633)
at org.jboss.resource.connectionmanager.InternalManagedConnectionPool.getConnection(InternalManagedConnectionPool.java:267)
at org.jboss.resource.connectionmanager.JBossManagedConnectionPool$BasePool.getConnection(JBossManagedConnectionPool.java:622)
at org.jboss.resource.connectionmanager.BaseConnectionManager2.getManagedConnection(BaseConnectionManager2.java:404)
at org.jboss.resource.connectionmanager.TxConnectionManager.getManagedConnection(TxConnectionManager.java:381)
at org.jboss.resource.connectionmanager.BaseConnectionManager2.allocateConnection(BaseConnectionManager2.java:496)
at org.jboss.resource.connectionmanager.BaseConnectionManager2$ConnectionManagerProxy.allocateConnection(BaseConnectionManager2.java:941)
at org.jboss.resource.adapter.jdbc.WrapperDataSource.getConnection(WrapperDataSource.java:89)
at com.genexus.db.driver.GXConnection.connectJDBCDataSource(Unknown Source)
........
Caused by: java.sql.SQLException: Io exception: Connection reset
at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)
at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:146)
at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:255)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:387)
at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:420)
at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:165)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:35)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:801)
at org.jboss.resource.adapter.jdbc.local.LocalManagedConnectionFactory.getLocalManagedConnection(LocalManagedConnectionFactory.java:207)
... 58 more
Currently running SAS 9.3 and Oracle 11.2.0.2, and getting the same error as NEFTALY. Prior to this, there was an out-of-memory error caused by the limit on the number of processes/threads in Red Hat Linux 6.3. This was solved by increasing the limit from 40000 to 90000(!)
This exception can be caused by a wrong IP address, so check the binding IP address and the connection properties of the database.
<local-tx-datasource>
  <jndi-name>JNDI_NAME</jndi-name>
  <connection-url>jdbc:mysql://localhost:3306/DATABASE_NAME</connection-url>
  <driver-class>com.mysql.jdbc.Driver</driver-class>
  <user-name>root</user-name>
  <password>root</password>
  <min-pool-size>1</min-pool-size>
  <max-pool-size>10</max-pool-size>
  <idle-timeout-minutes>1</idle-timeout-minutes>
  <metadata>
    <type-mapping>mySQL</type-mapping>
  </metadata>
</local-tx-datasource>
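Before digging into the pool settings, it may help to confirm that the host and port from the connection-url are reachable from the application server at all. The following is a small sketch using only the JDK; the class name DbPortCheck, the method canConnect and the 3 second timeout are assumptions for illustration. It checks only that a TCP connection can be opened, not that the database login works.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class DbPortCheck {

    /** True if a TCP connection to host:port can be opened within timeoutMs. */
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;  // refused, unreachable, unresolved host, or timed out
        }
    }

    public static void main(String[] args) {
        // Defaults mirror the sample connection-url above; pass your own host and port.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 3306;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```

If this prints false when run on the JBoss host, the problem is likely in the address, the port, or the network path rather than in the datasource pool itself.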

Resources