Is it possible that Windows leaks socket connections and that these sockets are not shown in TCPView and netstat?
After running a few applications that make many network connections, my Windows machine enters a state in which it is unable to open any new socket connection, even to itself (localhost).
For example, telnet to a local application fails because Windows cannot create a new socket.
Closing and restarting the network applications does not help; only a full Windows restart solves the problem.
netstat (and TCPView) show only a few dozen connections.
Thanks for your help.
No, it is not possible for those tools to miss leaked connections; something else is going on. Perhaps you are not looking at their detailed views, which would show closed sockets that are still sitting in the TIME_WAIT state. If you cannot open new socket connections, you are most likely encountering ephemeral port exhaustion. Wait for the ports to time out and become available again, or stop burning through ports in the first place.
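To see how that happens even when only a handful of connections are visible, here is a small illustrative sketch in Python (the loopback setup and iteration count are mine, not from the original question). Each short-lived connection parks its ephemeral client port in TIME_WAIT for a couple of minutes, so a busy application can run the dynamic port range dry while almost nothing shows up as ESTABLISHED:

    import socket

    # Open and immediately close many short-lived connections to a local
    # listener. Each close leaves the client's ephemeral port in TIME_WAIT.
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))      # let the OS pick a free port
    listener.listen(128)
    port = listener.getsockname()[1]

    for _ in range(2000):
        client = socket.socket()
        client.connect(("127.0.0.1", port))
        client.close()                   # client side enters TIME_WAIT
        server_side, _ = listener.accept()
        server_side.close()

Run netstat -ano afterwards and you will see the TIME_WAIT entries pile up; on a large enough scale, new connect attempts start failing even though hardly anything is ESTABLISHED.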
Related
I am trying to reverse engineer a third-party TCP client/server application on Windows XP SP3, for which I have no source available. My main line of attack is to use Wireshark to capture TCP traffic.
When I issue a certain GUI command on the client side, the client creates a TCP connection to the server, sends some data, and tears down the connection. The server port is 1234, and the client port is assigned by the OS and therefore varies.
Wireshark shows that the message corresponding to the GUI command I issued gets sent twice. The two messages bear different source ports, but they have the same destination port (1234, as mentioned previously).
The client side actually consists of several processes, and I would like to determine which processes are sending these messages. These processes are long-lived, so their PIDs are stable and known. However, the TCP connections involved are transient, lasting only a few milliseconds or so. Though I've captured the client-side port numbers in Wireshark and though I know all of the PIDs involved, the fact that the connections are transient makes it difficult to determine which PID opened the port. (If the connections were long-lived, I could use netstat to map port numbers to PIDs.) Does anybody have any suggestions on how I can determine which processes are creating these transient connections?
I can think of two things:
Try Sysinternals' TCPView program. It gives a detailed listing of all TCP connections opened by every process in the system. When a process creates a connection you will see it flash in TCPView (both connects and disconnects are highlighted), so you will know which processes to start looking into.
Try running the binary under a debugger. WinDbg supports multi-process debugging (so does Visual Studio, I think). You may only have export symbols to work with, but that is still enough for calls made into system DLLs. Try breaking on any suspected Windows APIs you know the process will call to create the connections; MSDN documents which DLLs export most system APIs.
Start here... post a follow-up if you get stuck again.
I ended up creating a batch file that runs netstat in a tight loop and appends its output to a text file. I ran this batch file while running the system, and by combing through all of the netstat dumps, I was able to find a dump that contained the PIDs associated with the ports.
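For anyone who wants to reproduce that approach, here is a rough equivalent written as a Python sketch rather than a batch file (the original batch file isn't reproduced here; port 1234 is the server port from the question). Note that even a tight loop can still miss connections that only live for a few milliseconds between polls:

    import subprocess, time

    # Poll "netstat -ano" (-o adds the owning PID) as fast as possible and log
    # every line that mentions the server port, with a timestamp.
    with open("netstat_log.txt", "w") as log:
        while True:                      # stop with Ctrl-C
            out = subprocess.run(["netstat", "-ano", "-p", "tcp"],
                                 capture_output=True, text=True).stdout
            for line in out.splitlines():
                if ":1234" in line:
                    log.write(f"{time.time():.3f} {line.strip()}\n")
                    log.flush()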
I have a small TCP server that listens on a port. While debugging, it's common for me to Ctrl-C the server in order to kill the process.
On Windows I'm able to restart the service quickly and the socket can be rebound. On Linux I have to wait a few minutes before bind() succeeds again.
While bind() is failing, it returns errno 98, "address already in use" (EADDRINUSE).
I'd like to better understand the differences in implementations. Windows sure is more friendly to the developer, but I kind of doubt Linux is doing the 'wrong thing'.
My best guess is that Linux is waiting until all possible clients have detected that the old socket is broken before allowing new sockets to be created, and the only way it could do this is to wait for them to time out.
Is there a way to change this behavior during development on Linux? I'm hoping to duplicate the way Windows does this.
You want to use the SO_REUSEADDR option on the socket on Linux. The relevant manpage is socket(7). Here's an example of its usage. This question explains what happens.
Here's a duplicate of this answer.
On Linux, SO_REUSEADDR allows you to bind to an address unless an active connection is present. On Windows this is the default behaviour. On Windows, SO_REUSEADDR allows you to additionally bind multiple sockets to the same addresses. See here and here for more.
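For reference, a minimal sketch of the fix, shown here in Python (in C it is the equivalent setsockopt() call with SO_REUSEADDR before bind(); the port number is just an example):

    import socket

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind(): allows rebinding the address right after a
    # Ctrl-C, even while the previous connection is still in TIME_WAIT.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", 8000))          # would fail with EADDRINUSE otherwise
    srv.listen(5)
    conn, addr = srv.accept()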
I am trying to simulate a scenario where one process's connection to its server is down while another process's connection to a different server stays up. Just pulling the network cable won't work in my case, since I need the other process's connection to stay up.
Is there any tool for this kind of job? I am on Windows. Thanks!
There are a few layers at which you can simulate this. The easiest case is when your two servers listen on two distinct TCP ports: run each connection through its own TCP proxy and stop or pause one of the proxies when you want to simulate a failure. On Windows I would suggest using tcpTrace for this.
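If you don't want to install anything, a throwaway proxy is only a few lines. Here is a rough Python sketch (not tcpTrace, and both port numbers are placeholders): point the client at LISTEN_PORT instead of the real server, then kill or suspend the script to simulate that one server going down while the other stays reachable:

    import socket, threading

    LISTEN_PORT = 9000                   # placeholder: where the client connects
    TARGET = ("127.0.0.1", 1234)         # placeholder: the real server

    def pump(src, dst):
        # Copy bytes one way until either side closes, then tear both down.
        try:
            while True:
                data = src.recv(4096)
                if not data:
                    break
                dst.sendall(data)
        except OSError:
            pass
        finally:
            src.close()
            dst.close()

    srv = socket.socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", LISTEN_PORT))
    srv.listen(5)
    while True:
        client, _ = srv.accept()
        upstream = socket.socket()
        upstream.connect(TARGET)
        threading.Thread(target=pump, args=(client, upstream), daemon=True).start()
        threading.Thread(target=pump, args=(upstream, client), daemon=True).start()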
Another option would be to have the two servers bound to two virtual NICs, which are bridged to the physical NIC. Of course if you have two physical NICs, you could bind each server process to a different physical NIC.
At a lower level, you can run a WAN simulator. Most simulators let you impair specific types of traffic or specific ports. One such simulator is Packetstorm.
One other method I would suggest is attaching a debugger to one of the processes and halting all of its threads. Often a process doesn't die outright but gets stuck in garbage collection or in a loop; because its sockets don't close, many 'high availability' solutions won't fail over automatically.
One approach would be to mock the relevant network connection code for the purposes of testing. In this case you would probably want the mock to return whatever the real code would return if the connection were down.
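As a concrete illustration, here is one way to do that in Python with unittest.mock; the address of the "down" server is a placeholder, and in a real test suite you would more likely patch your own connection wrapper than the socket module itself:

    import socket
    from unittest import mock

    DOWN_SERVER = ("10.0.0.5", 1234)     # placeholder for the server to simulate as down
    real_connect = socket.socket.connect

    def connect_with_outage(self, address):
        # Fail only for the chosen server; all other destinations behave normally.
        if address == DOWN_SERVER:
            raise ConnectionRefusedError("simulated outage")
        return real_connect(self, address)

    with mock.patch.object(socket.socket, "connect", connect_with_outage):
        pass  # exercise the client code under test here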
A poor man's approach, if you can use sleep/hibernate mode on your machine:
Set an outbound rule in the Windows Firewall to disallow connections for the particular program.
Already connected sockets stay connected: put the machine in sleep/hibernate mode for a brief moment to force those sockets to disconnect.
When the system is restored, the program cannot establish new connections.
New connections are made possible as soon as you disable the firewall rule.
Note that this does not simulate a network outage, because each connection attempt fails immediately with a permission error; but it does prevent the process from establishing connections.
I have an application that I suspect runs into problems because network nodes close its sockets to the various other servers it communicates with.
I would like to mimic that behaviour by shutting down one or several of the connections that can be seen in netstat.
I'm not an expert in OS-level networking, so this question may be naive; if so, do you have any other suggestions on how to mimic the situation?
Before attempting to simulate the problem, you can diagnose the situation with certainty by using Wireshark or, on *nix, tcpdump. Capture the traffic and observe whether one of the servers is sending you an RST or not.
If you are receiving an RST, it may come from the servers themselves (timing out and closing the connection while waiting for you to send them a response or data, or closing it because of server bugs or load limits), from your ISP's network equipment, or from your own network equipment (e.g. your wireless link going up and down).
Disconnecting your router, network cable, or wireless interface can simulate, to varying degrees, the connection issues you can encounter at any time when talking to your servers. Disconnecting your own PC's network cable should make the local stack drop the connection forcefully (as with an RST), whereas disconnecting a cable further upstream, but not the one directly attached to your PC (e.g. the cable between your router and your cable/DSL modem, or between the modem and the cable company's or telco's wall jack), lets you simulate timeout conditions.
You can use TCPView. It shows all the connections from your machine to others, or within the same machine, and it lets you close a connection, thus breaking it.
I would like to see how a program responds when its connection is severed. Aside from disabling the network card, is there a way to sever a TCP connection on Windows without killing the process or the thread that owns the connection?
The closest thing I've found to generating an OS-level error is to use something like TCPView to look at which sockets are open and sever them. I'm not sure exactly what it does to sever the connection, but it does close it in a way that the application can see.
TCPView by SysInternals lets you close a connection (and see all open connections).
One thing I've seen done is to have the network code written in such a way that a connection can be severed remotely. A product I once worked on was written that way. We even had a set of torture tests that would randomly break the connections. The product was meant to be transactional, and it was instructive to see how it behaved.
Of course, we then found a customer whose network was actually breaking connections all the time, and were very glad we'd tested so hard.
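To make the idea concrete, a very small local variant of such a "severable" connection might look like the Python sketch below. The original product's implementation isn't described in this answer, so treat this as a generic illustration; the failure probability and the single wrapped method are arbitrary choices:

    import random
    import socket

    class TortureSocket:
        # Wrap the socket the application uses and, with a small probability,
        # abort the connection mid-send so failure handling gets exercised.
        def __init__(self, sock, fail_rate=0.01):
            self._sock = sock
            self._fail_rate = fail_rate

        def sendall(self, data):
            if random.random() < self._fail_rate:
                self._sock.close()       # sever the connection under the caller
                raise ConnectionResetError("torture test: connection severed")
            return self._sock.sendall(data)

        def __getattr__(self, name):
            return getattr(self._sock, name)   # delegate everything else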
Why not just unplug the network cable?