.NET Remoting keeps a listening socket open if I stop debugging the server program

I have a sporadic issue; it does not happen every time.
I have a Windows service that listens on a port via .NET Remoting.
Here is the code that registers the channel:
void RegisterChannel(int portNumber, string bindTo)
{
    ListDictionary channelProperties = new ListDictionary();
    channelProperties.Add("name", Guid.NewGuid().ToString().ToUpper());
    channelProperties.Add("port", portNumber);
    channelProperties.Add("bindTo", bindTo);
    channelProperties.Add("secure", true.ToString());
    channelProperties.Add("impersonate", true.ToString());
    channelProperties.Add("useIpAddress", false.ToString());

    BinaryClientFormatterSinkProvider clientFormatter = new BinaryClientFormatterSinkProvider();
    BinaryServerFormatterSinkProvider serverFormatter = new BinaryServerFormatterSinkProvider();
    serverFormatter.TypeFilterLevel = TypeFilterLevel.Full;

    TcpChannel tcpChannel = new TcpChannel(channelProperties, clientFormatter, serverFormatter);
    ChannelServices.RegisterChannel(tcpChannel, true);
}
RegisterChannel(9000, "[::]");
RegisterChannel(9000, "0.0.0.0");
RemotingConfiguration.RegisterWellKnownServiceType(typeof(AccessServer), "AccessLayer", WellKnownObjectMode.Singleton);
If I start debugging this service in Visual Studio and then stop debugging, I cannot start debugging again.
I get the following exception:
Only one usage of each socket address (protocol/network address/port) is normally permitted
Running netstat -anbo shows:
TCP 0.0.0.0:9000 0.0.0.0:0 LISTENING 3432 [System]
TCP [::]:9000 [::]:0 LISTENING 3432 [System]
After that I have to reboot; otherwise the port stays taken forever.
Any ideas how to fix this?

Firstly, is the service process definitely being terminated when you stop debugging? It is possible for the process to keep running after you detach the debugger, meaning the service code would still be running and so still be occupying the port.
You might also want to change your service so that you explicitly unregister the channel. I've written Windows Services in the past that use Remoting: in "OnStart" a TcpChannel is configured, just as in the code in your question, and in "OnStop" that channel is unregistered by calling "UnregisterChannel" -
public static void UnregisterChannel(IChannel chan)
{
    // Due to some weird behaviour we can only remove the registered channel
    // by spinning through the list of channels
    foreach (IChannel ch in ChannelServices.RegisteredChannels)
    {
        if (ch.Equals(chan))
        {
            ChannelServices.UnregisterChannel(ch);
            return;
        }
    }
}
In order to do this, you'll need to keep a reference to "tcpChannel" hanging around so that you can unregister it when you're finished.
You would think that you could just call "ChannelServices.UnregisterChannel" directly with the tcpChannel reference that you are now keeping hold of until tidy-up time, but I encountered some problems with this. I can't recall precisely what the symptoms were, but I remember that they were avoided by looping through the RegisteredChannels set, looking for a reference that claimed to Equals my channel reference, and then passing that to "ChannelServices.UnregisterChannel".

Oracle JDBC intermittent connection reset SQLRecoverableException [duplicate]

I am getting the following error trying to read from a socket. I'm doing a readInt() on the InputStream, and I am getting this error. Perusing the documentation suggests that the client side closed the connection. In this scenario, I am the server.
I have access to the client log files and it is not closing the connection; in fact, its log files suggest that I am closing the connection. So does anybody have an idea why this is happening? What else should I check for? Does this arise when local resources are perhaps reaching thresholds?
I do note that I have the following line:
socket.setSoTimeout(10000);
just prior to the readInt(). There is a reason for this (long story), but just out of curiosity: are there circumstances under which this might lead to the indicated error? I have the server running in my IDE, and I happened to leave the IDE stuck on a breakpoint; I then noticed the exact same errors begin appearing in my own logs.
Anyway, just mentioning it, hopefully not a red herring. :-(
There are several possible causes.
The other end has deliberately reset the connection, in a way which I will not document here. It is rare, and generally incorrect, for application software to do this, but it is not unknown for commercial software.
More commonly, it is caused by writing to a connection that the other end has already closed normally. In other words an application protocol error.
It can also be caused by closing a socket when there is unread data in the socket receive buffer.
In Windows, 'software caused connection abort', which is not the same as 'connection reset', is caused by network problems sending from your end. There's a Microsoft knowledge base article about this.
Connection reset simply means that a TCP RST was received. This happens when your peer receives data that it can't process, and there can be various reasons for that.
The simplest is when you close the socket, and then write more data on the output stream. By closing the socket, you told your peer that you are done talking, and it can forget about your connection. When you send more data on that stream anyway, the peer rejects it with an RST to let you know it isn't listening.
In other cases, an intervening firewall or even the remote host itself might "forget" about your TCP connection. This could happen if you don't send any data for a long time (2 hours is a common time-out), or because the peer was rebooted and lost its information about active connections. Sending data on one of these defunct connections will cause a RST too.
Update in response to additional information:
Take a close look at your handling of the SocketTimeoutException. This exception is raised if the configured timeout is exceeded while blocked on a socket operation. The state of the socket itself is not changed when this exception is thrown, but if your exception handler closes the socket, and then tries to write to it, you'll be in a connection reset condition. setSoTimeout() is meant to give you a clean way to break out of a read() operation that might otherwise block forever, without doing dirty things like closing the socket from another thread.
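As a rough illustration of that advice (this is my sketch, not code from the question; the 10-second timeout simply mirrors the setSoTimeout(10000) mentioned above):
import java.io.DataInputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

class PatientReader {
    // Treat a timeout as "no data yet" rather than a broken connection.
    // Closing the socket inside the timeout handler and then writing to it
    // is what turns a harmless timeout into a connection reset.
    static int readNextInt(Socket socket, DataInputStream in) throws Exception {
        socket.setSoTimeout(10000); // same idea as in the question
        while (true) {
            try {
                return in.readInt(); // blocks for at most 10 seconds
            } catch (SocketTimeoutException e) {
                // The socket is still usable here: log, check a shutdown flag, retry.
            }
        }
    }
}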
Whenever I have had odd issues like this, I usually sit down with a tool like Wireshark and look at the raw data being passed back and forth. You might be surprised where things are being disconnected, and you are only being notified when you try to read.
You should inspect the full stack trace very carefully.
I have a server socket application and fixed a java.net.SocketException: Connection reset case.
In my case it happened while reading from a client Socket object that had closed its connection for some reason (network loss, firewall, application crash, or an intentional close).
I was re-establishing the connection whenever I got an error while reading from this Socket object.
Socket clientSocket = serverSocket.accept();
is = new BufferedReader(new InputStreamReader(clientSocket.getInputStream()));
int readed = is.read(); // WHERE THE ERROR STARTS !!!
The interesting thing is that if a client connects to my ServerSocket and closes its connection without sending anything, is.read() gets called repeatedly. It seems that, because the read sits in an infinite while loop, you keep trying to read from a closed connection.
If you use something like the following for the read operation:
while (true)
{
    Receive();
}
Then you get a stack trace something like the one below, over and over:
java.net.SocketException: Socket is closed
at java.net.ServerSocket.accept(ServerSocket.java:494)
What I did was simply close the ServerSocket, renew my connection, and wait for further incoming client connections:
String Receive() throws Exception
{
    try {
        int readed = is.read();
        ....
    } catch (Exception e) {
        tryReConnect();
        logit(); // etc.
    }
    //...
}
This re-establishes my connection after unknown client socket losses:
private void tryReConnect()
{
    try
    {
        serverSocket.close();
        // Drop my old, lost connection and let it be garbage collected immediately
        clientSocket = null;
        System.gc();
        // Re-open the listening socket (listenPort is assumed to be a field holding the port)
        serverSocket = new ServerSocket(listenPort);
        // Wait for a new client Socket connection and assign it to my field
        clientSocket = serverSocket.accept(); // Waiting for another connection
        System.out.println("Connection established...");
    } catch (Exception e) {
        String message = "ReConnect not successful " + e.getMessage();
        logit(); // etc...
    }
}
I couldn't find another way, because without a try/catch you cannot tell whether the connection has been lost; everything else looks normal. I saw this while I was getting Connection reset continuously.
Embarrassing to say it, but when I had this problem, it was simply a mistake: I was closing the connection before I had read all the data. In cases where small strings were returned it worked, but that was probably because the whole response was buffered before I closed it.
In cases where longer amounts of text were returned, the exception was thrown, since more than a buffer's worth was coming back.
You might check for this oversight. Remember that opening a URL is like opening a file: be sure to close it (release the connection) only once it has been fully read.
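A minimal sketch of the "read everything before you close it" idea, assuming a plain java.net.URL and a placeholder address:
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.URL;

class ReadFully {
    // Drain the whole response before releasing the connection; closing
    // early is what caused the resets described above.
    static byte[] fetch(String address) throws Exception {
        try (InputStream in = new URL(address).openStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) { // -1 means the peer is done
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } // the stream (and connection) is released only after everything was read
    }
}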
I had the same error. I have found the solution now: the problem was that the client program was finishing before the server had read the streams.
I had this problem with a SOA system written in Java. I was running both the client and the server on different physical machines and they worked fine for a long time, then those nasty connection resets appeared in the client log and there wasn't anything strange in the server log. Restarting both client and server didn't solve the problem. Finally we discovered that the heap on the server side was rather full so we increased the memory available to the JVM: problem solved! Note that there was no OutOfMemoryError in the log: memory was just scarce, not exhausted.
Check your server's Java version. This happened to me because my WebLogic 10.3.6 was on JDK 1.7.0_75, which defaulted to TLSv1. The REST endpoint I was trying to consume was rejecting anything below TLSv1.2.
By default Weblogic was trying to negotiate the strongest shared protocol. See details here: Issues with setting https.protocols System Property for HTTPS connections.
I added verbose SSL logging to identify the supported TLS. This indicated TLSv1 was being used for the handshake.
-Djavax.net.debug=ssl:handshake:verbose:keymanager:trustmanager -Djava.security.debug=access:stack
I resolved this by pushing the feature out to our JDK 8-compatible product; JDK 8 defaults to TLSv1.2. For those restricted to JDK 7, I also successfully tested a workaround by upgrading to TLSv1.2. I used this answer: How to enable TLS 1.2 in Java 7
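For those on Java 7, the workaround boils down to something like the following sketch; both approaches are typical ways to force TLSv1.2 for HttpsURLConnection, not the exact code from that answer:
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;

class Tls12OnJava7 {
    static void enable() throws Exception {
        // Option 1: system property, affects HttpsURLConnection only.
        System.setProperty("https.protocols", "TLSv1.2");

        // Option 2: build an explicit TLSv1.2 context and make it the default
        // socket factory for HttpsURLConnection.
        SSLContext ctx = SSLContext.getInstance("TLSv1.2");
        ctx.init(null, null, null); // default key managers, trust managers, SecureRandom
        HttpsURLConnection.setDefaultSSLSocketFactory(ctx.getSocketFactory());
    }
}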
I also had this problem with a Java program trying to send a command to a server via SSH. The problem was with the machine executing the Java code: it didn't have permission to connect to the remote server. The write() method was doing fine, but the read() method was throwing a java.net.SocketException: Connection reset. I fixed this by adding the client's SSH key to the remote server's known keys.
In my case it was a DNS problem.
I put the resolved IP in the hosts file and everything works fine.
Of course it is not a permanent solution, but it gives me time to fix the DNS problem.
In my experience, I often encounter the following situations:
If you work at a corporate company, contact the network and security team, because requests to external services may need the relevant endpoint to be explicitly permitted.
Another issue is that the SSL certificate may have expired on the server where your application is running.
I've seen this problem. In my case, the error was caused by reusing the same ClientRequest object in a specific Java class. That project was using JBoss RESTEasy.
Initially only one method used the ClientRequest object (kept as a field of the class) to make a request to a specific URL.
After that, another method was created to get data from another URL, but it reused the same ClientRequest object.
The solution: a second ClientRequest object was created in the same class, used exclusively for that call and not reused.
In my case it was a problem with the TLS version. I was using Retrofit with an OkHttp client, and after the ALB was updated on the server side I had to delete my connectionSpecs config:
OkHttpClient.Builder clientBuilder = new OkHttpClient.Builder();
List<ConnectionSpec> connectionSpecs = new ArrayList<>();
connectionSpecs.add(ConnectionSpec.COMPATIBLE_TLS);
// clientBuilder.connectionSpecs(connectionSpecs);
So try removing or adding this config to use a different TLS configuration.
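If you need to keep a connectionSpecs entry rather than delete it, a sketch along these lines may help (OkHttp 3 API; assumes you want to pin TLS 1.2):
import java.util.Collections;
import okhttp3.ConnectionSpec;
import okhttp3.OkHttpClient;
import okhttp3.TlsVersion;

class Tls12Client {
    static OkHttpClient build() {
        // Restrict the client to TLS 1.2 instead of the permissive COMPATIBLE_TLS.
        ConnectionSpec spec = new ConnectionSpec.Builder(ConnectionSpec.MODERN_TLS)
                .tlsVersions(TlsVersion.TLS_1_2)
                .build();
        return new OkHttpClient.Builder()
                .connectionSpecs(Collections.singletonList(spec))
                .build();
    }
}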
I used to get the 'NotifyUtil::java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:...' message in the Apache console of my NetBeans 7.4 setup.
I tried many solutions to get rid of it; what worked for me was enabling TLS on Tomcat.
Here is how:
Create a keystore file to store the server's private key and self-signed certificate by executing the following command:
Windows:
"%JAVA_HOME%\bin\keytool" -genkey -alias tomcat -keyalg RSA
Unix:
$JAVA_HOME/bin/keytool -genkey -alias tomcat -keyalg RSA
and specify a password value of "changeit".
As per https://tomcat.apache.org/tomcat-7.0-doc/ssl-howto.html
(This will create a .keystore file in your local user directory.)
Then edit the server.xml file (%CATALINA_HOME%\apache-tomcat-7.0.41.0_base\conf\server.xml), uncommenting and editing the relevant lines, to enable SSL and the TLS protocol:
<Connector port="8443" protocol="HTTP/1.1" SSLEnabled="true"
maxThreads="150" scheme="https" secure="true"
clientAuth="false" sslProtocol="TLS" keystorePass="changeit" />
I hope this helps

systemd socket activation listener end-of-program behaviour

What is the correct behaviour, on exit, for a systemd AF_UNIX socket-activated daemon?
The daemon.socket unit file creates the socket and passes it to my daemon, which accept()s new connections. What is supposed to happen when my daemon exits?
The usual thing is to close() and unlink() the socket. However, that does exactly what it says: the UNIX socket is no longer available in the filesystem, even though daemon.socket still reports as activated, which basically disables socket re-activation.
How do I create a restartable, socket-activated daemon that listen()s on its systemd-provided socket? Is the correct approach to leave the socket open?
After experimenting and reading the relevant manpages and the CUPS code, here is what I know:
There are two modes of operation:
inetd mode, where systemd accept()s the connection and passes the accepted FD to a newly spawned subprocess. This mode is used by default by ssh.socket, where every connection spawns a new ssh process.
Non-accepting mode, where the bound socket itself is forwarded to the daemon, which then handles all accept() calls.
Per man systemd.socket:
A daemon listening on an AF_UNIX socket may, but does not need to, call close(2) on the received socket before exiting. However, it must not unlink the socket from a file system.
So a received bound socket must not be unlink()ed. On the other hand, even stopping the .socket service doesn't remove the inode; for that, the RemoveOnStop= directive exists.
It would appear that the "recommended" way is to let systemd create the socket, pass it to the daemon, and in the daemon at most close() it. Once closed, it is of course only closed in the daemon, so the socket is still available in the system, meaning that after a close() the service will be activated again.
Presumably a call to sd_listen_fds() in a still running service would pass the FD anew. I have not tested this.
TL;DR: In a systemd socket-activated service, don't unlink(); you may close(); RemoveOnStop= additionally deletes the socket file when the .socket unit stops.
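For context, here is a minimal illustrative pair of unit files (names and paths are hypothetical) showing where RemoveOnStop= and non-accepting mode fit:
# daemon.socket (illustrative)
[Socket]
ListenStream=/run/mydaemon.sock
# Accept=no (the default) gives the non-accepting mode described above: the
# bound FD is handed to the daemon, which calls accept() itself and must not
# unlink() the socket file.
# RemoveOnStop= deletes the socket file when this .socket unit is stopped.
RemoveOnStop=yes

[Install]
WantedBy=sockets.target

# daemon.service (illustrative)
[Service]
ExecStart=/usr/local/bin/mydaemon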

Reconnecting with MQ - issues

I'm looking to implement a somewhat intelligent MQ comms module, which should be tolerant of outages in the network connection. Basically it should try to reconnect every 5 seconds if the connection is lost.
The problem is the following. I use the following code for reading:
queueMessage = new MQMessage();
queueMessage.Format = MQC.MQFMT_STRING;
queueGetMessageOptions = new MQGetMessageOptions();
queueGetMessageOptions.Options = MQC.MQGMO_SYNCPOINT + MQC.MQGMO_WAIT + MQC.MQGMO_FAIL_IF_QUIESCING;
queueGetMessageOptions.WaitInterval = 50;
producerQueue.Get(queueMessage, queueGetMessageOptions);
msg = queueMessage.ReadBytes(queueMessage.MessageLength);
(Of course I successfully connect to the queue manager beforehand, etc.)
The issue is this: when this routine runs but there is no connection at the time of the .Get, the code simply hangs and stays in the .Get.
I use a timer to detect a timeout (in theory even that shouldn't be necessary, right?), and on timeout I try to reconnect. BUT when this timeout expires, the queue manager still reports that it is connected, while it clearly is not (no physical connection is present any more). This issue has appeared since I started using SYNCPOINT, and I see the same behaviour when I cut the connection during writing, or when I try to force a Disconnect on the queue manager. So please help: what settings should I use to avoid getting stuck in Get and Put, and instead have an MQException thrown or something else I can handle?
Thanks!
UPDATE: I used the following code to connect to the QueueManager.
Hashtable props = new Hashtable();
props.Add(MQC.HOST_NAME_PROPERTY, Host);
props.Add(MQC.PORT_PROPERTY, Port);
props.Add(MQC.CHANNEL_PROPERTY, ChannelInfo);
if(User!="") props.Add(MQC.USER_ID_PROPERTY, User);
if(Password!="") props.Add(MQC.PASSWORD_PROPERTY, Password);
props.Add(MQC.TRANSPORT_PROPERTY, MQC.TRANSPORT_MQSERIES_MANAGED);
queueManager = new MQQueueManager(QueueManagerName, props);
producerQueue = queueManager.AccessQueue(
ProducerQueueName,
MQC.MQOO_INPUT_AS_Q_DEF // open queue for input
+ MQC.MQOO_FAIL_IF_QUIESCING); // but not if MQM stopping
consumerQueue = queueManager.AccessQueue(
ConsumerQueueName,
MQC.MQOO_OUTPUT + MQC.MQOO_BROWSE + MQC.MQOO_INPUT_AS_Q_DEF // open queue for output
+ MQC.MQOO_FAIL_IF_QUIESCING); // but not if MQM stopping
Needless to say, normally the code works well: read/write and connect/disconnect work as they should. I only have to figure out the current issue.
Thanks!
What version of MQ are you using? For automatic reconnection to work, the queue manager needs to be at least at MQ v7.0.1 and the MQ .NET client needs to be at the MQ v7.1 level.
Assuming you are using the MQ v7.1 .NET client, you need to specify the reconnect option when creating the connection. You can enable reconnection by adding something like:
props.Add(MQC.CONNECT_OPTIONS_PROPERTY, MQC.MQCNO_RECONNECT);
Reconnection can also be enabled/disabled from the mqclient.ini file.
What is surprising is that Get/Put hang when there is no network connection. I hope you are not connecting to a queue manager running on the same machine as your application. There is no need to set up a timer or anything like that: you can issue MQ calls, and if there is anything wrong with the connection, an exception will be thrown.
Update:
I think you are referring to the IsConnected property of the MQQueueManager class. The documentation says of this property: "If true, a connection to the queue manager has been made, and is not known to be broken. Any calls to IsConnected do not actively attempt to reach the queue manager, so it is possible that physical connectivity can break, but IsConnected can still return true. The IsConnected state is only updated when activity, for example, putting a message, getting a message, is performed on the queue manager. If false, a connection to the queue manager has not been made, or has been broken, or has been disconnected."
As you can see, a true value does not mean the connection is still up. My suggestion would be to call a method such as Put or Get and handle any exception thrown.
Put/Get/Disconnect calls hanging does appear to be a problem; my suggestion would be to raise a PMR with IBM.
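To make the "issue the call and handle the exception" advice concrete, here is a rough sketch using the classic Java MQ client (com.ibm.mq), which mirrors the .NET code in the question; the host, channel, queue and queue manager names, the 5-second delay, and the connect()/getWithRetry() helpers are placeholders of mine, not from the question:
import java.util.Hashtable;
import com.ibm.mq.MQException;
import com.ibm.mq.MQGetMessageOptions;
import com.ibm.mq.MQMessage;
import com.ibm.mq.MQQueue;
import com.ibm.mq.MQQueueManager;
import com.ibm.mq.constants.CMQC;

class ReconnectingConsumer {
    private MQQueueManager queueManager;
    private MQQueue producerQueue;

    void connect() throws MQException {
        Hashtable<String, Object> props = new Hashtable<>();
        props.put(CMQC.HOST_NAME_PROPERTY, "mqhost.example");   // placeholder
        props.put(CMQC.PORT_PROPERTY, 1414);
        props.put(CMQC.CHANNEL_PROPERTY, "SYSTEM.DEF.SVRCONN"); // placeholder
        props.put(CMQC.TRANSPORT_PROPERTY, CMQC.TRANSPORT_MQSERIES_CLIENT);
        queueManager = new MQQueueManager("QM1", props);        // placeholder QMgr name
        producerQueue = queueManager.accessQueue("IN.QUEUE",
                CMQC.MQOO_INPUT_AS_Q_DEF | CMQC.MQOO_FAIL_IF_QUIESCING);
    }

    byte[] getWithRetry() throws InterruptedException {
        while (true) {
            try {
                MQMessage msg = new MQMessage();
                MQGetMessageOptions gmo = new MQGetMessageOptions();
                gmo.options = CMQC.MQGMO_SYNCPOINT | CMQC.MQGMO_WAIT
                        | CMQC.MQGMO_FAIL_IF_QUIESCING;
                gmo.waitInterval = 50;
                producerQueue.get(msg, gmo);
                byte[] data = new byte[msg.getMessageLength()];
                msg.readFully(data);
                return data;
            } catch (MQException e) {
                if (e.reasonCode == CMQC.MQRC_NO_MSG_AVAILABLE) {
                    continue; // wait interval expired with no message, just poll again
                }
                // Anything else (e.g. MQRC_CONNECTION_BROKEN): wait 5 s and reconnect.
                Thread.sleep(5000);
                try { connect(); } catch (MQException ignored) { /* retry next loop */ }
            }
        }
    }
}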

AMQ9999 occuring in AMQERR01.LOG

I have the following error showing up in AMQERR01.LOG
AMQ9999: Channel 'MGATESrvChannel' to host 'Mgate (127.0.0.1)' ended
abnormally.
EXPLANATION:
The channel program running under process ID 1060(4364) for channel
'MGATESrvChannel' ended abnormally. The host name is 'Mgate (127.0.0.1)'; in
some cases the host name cannot be determined and so is shown as '????'.
This error is preceded with following message:
AMQ9508: Program cannot connect to the queue manager.
EXPLANATION:
The connection attempt to queue manager 'MGATE.QM' failed with reason code
2059.
ACTION:
Ensure that the queue manager is available and operational.
According to what I have been told, this can be caused by an application that is using the queue manager; however, it seems to me that it has more to do with the way the queue manager was set up, or something similar. Can anyone please shed some light on this?
Thanks in advance!
The 2059 says that a connection request was received and refused because the QMgr was not available. We used to see this a lot when the listener was run as a separate process or when inetd was used to start channels. This is because the listener was there to accept the connection but the QMgr processes were not.
Now that the listener is run as a child process of the QMgr, it is quite rare to see this in the WMQ error logs, though clients commonly see it. This is because when the listener runs as a child process of the QMgr, it goes down along with the QMgr, so there is nothing listening to receive the connection request and it bounces off the host's IP stack before ever getting to MQ code.
The AMQ9999 message says that a channel program, one of the QMgr's child processes, died or was killed, and this caused the channel to terminate. There are many reasons for a channel process to die, including being killed by the OS if resources are short, or being killed by a human operator. Other than that, the most common way they die is when they run in trusted (fastpath) mode and the attached program corrupts them.
It would help to narrow down the field to know the details of the QMgr in question - version and fix pack, how the listeners are started, channel settings, etc.
Start your listener up. You may also want to check the Control property for that listener, so that it starts automatically when the queue manager restarts.

How to close Winsock UDP socket hard?

I need to immediately close a UDP socket that has unsent data.
There is the SO_LINGER option for TCP sockets, but I couldn't find anything similar for UDP.
It's on Windows.
Thanks in advance.
Update 0:
Here is some background. My application has a first thread that opens/binds/closes the socket, and a second thread that sends datagrams through it.
In some cases, after closing the socket (error code = 0), the bind function returns error code 10048, "Address already in use". I found out that after close() executes, the port is still in use (via the netstat command). Maybe I am asking the wrong question and the reason for this behaviour is something else?
For all application purposes, once your send() returns, the packet is "sent". There's no send buffer like in TCP, and you have no control over the NIC packet queue. A normal close() is all you need.
Edit 0:
@EJP, here's a quote from UNP for you (Section 2.11, "UDP Output"):
This time, we show the socket send buffer as a dashed box because it doesn't really exist. A UDP socket has a send buffer size (which we can change with the SO_SNDBUF socket option, Section 7.5), but this is simply an upper limit on the maximum-sized UDP datagram that can be written to the socket. If an application writes a datagram larger than the socket send buffer size, EMSGSIZE is returned. Since UDP is unreliable, it does not need to keep a copy of the application's data and does not need an actual send buffer. (The application data is normally copied into a kernel buffer of some form as it passes down the protocol stack, but this copy is discarded by the datalink layer after the data is transmitted.)
This is what I meant in my answer: you have no control over the send buffer, so "for all application purposes" it does not exist.
I was having this problem with a Windows UDP socket as well. After hours of trying everything, I finally found that my problem was that I was calling socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP) on the main thread to create the socket, calling bind(...) and recvfrom() on a worker thread, and then, after closing the worker thread, calling closesocket(...) on the main thread. None of the functions returned an error, but for some reason doing this leaves the UDP address/port combination in use (so a future call to bind() triggers error 10048 WSAEADDRINUSE, and netstat -abot -p UDP also shows the port as still in use until the whole application is closed). The solution was to move the socket(...) and closesocket(...) calls into the worker thread.
Other than weird issues like the case above, there is normally no way a UDP socket can be left open after calling closesocket() on it. Microsoft explains that there is no connection maintained with a UDP socket and no need to call shutdown() or any other function. Usually the reason a TCP socket is left open after calling closesocket() is that it wasn't shut down gracefully and it waits for about 4 minutes in the TIME_WAIT state for possible additional data to come in before it actually closes. In the case above, netstat showed the UDP socket never closed until the application exited, even if I waited 30+ minutes.
If you're using a wrapper around Winsock like the .NET Framework, I've also read that some features like setting up async callbacks can leave a bound UDP socket open if you don't clean up the callbacks correctly, but I don't think there are any such features in the Win32 Winsock API that can cause that.
Just close it. There's nothing in UDP that says that pending data will be sent, unlike TCP.
