I have an SSH server running on my Windows 10 machine. I ran the example-ssh2 demo, built with Visual Studio 2015 from libssh2 1.9.0.
I saw that the TCP socket was ESTABLISHED.
session = libssh2_session_init()
succeeded, but
libssh2_session_handshake(session, sock)
always returns -43.
Could you help me?
I was getting this -43 when connecting to servers on high-latency networks.
libssh2.h has this definition:
#define LIBSSH2_ERROR_SOCKET_RECV -43
I was able to connect consistently by adding a short sleep between the libssh2_session_init() call and the libssh2_session_handshake(session, sock) call, like this:
session = libssh2_session_init();
std::this_thread::sleep_for(std::chrono::milliseconds(300));
libssh2_session_handshake(session, sock);
I chanced upon this solution while single-stepping through the original code in the debugger: it connected successfully every time, so I tried the sleep and it worked. I tried several sleep times; 30 ms was the lowest value that still worked.
I found your post while trying to find an explanation for this behavior. I hope this method works for you.
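A fixed sleep could be replaced by waiting until the socket is actually readable. This rests on an assumption that the thread does not confirm: that the delay helps because the server's SSH identification banner has not yet arrived when the handshake starts. Below is a minimal Python sketch of that idea, using a local socket pair in place of a real SSH connection. wait_until_readable is a hypothetical helper, not part of libssh2.

```python
import select
import socket


def wait_until_readable(sock, timeout_s):
    """Block until sock has data to read or timeout_s elapses.

    Returns True if the socket became readable, False on timeout.
    """
    readable, _, _ = select.select([sock], [], [], timeout_s)
    return bool(readable)


if __name__ == "__main__":
    # Stand-ins for the client socket and the remote SSH server.
    client, server = socket.socketpair()
    print(wait_until_readable(client, 0.05))  # nothing has been sent yet
    server.sendall(b"SSH-2.0-demo\r\n")       # the server banner arrives
    print(wait_until_readable(client, 1.0))
    client.close()
    server.close()
```

In the C program, the equivalent would be a select() call on sock with a timeout, placed before libssh2_session_handshake(session, sock). Whether that removes the -43 in every case is untested here.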
We have recently observed rare UDP communication issues with the following symptoms:
A socket sendto() call fails with error WSAENOBUFS (10055)
A subsequent recvfrom() call on this socket does not receive anything, even though Wireshark shows that the network interface actually received the expected datagrams. This situation persists for approximately 8 seconds, afterwards new incoming datagrams can be received again from the socket.
In Windows System Log, there appears a Kernel-General information entry at the time of the sendto() error:
The access history in hive \??\C:\ProgramData\Microsoft\Provisioning\Microsoft-Desktop-Provisioning-Sequence.dat was cleared updating 0 keys and creating 0 modified pages.
The issue happens on a customer system running Microsoft Windows 10 Pro for Workstations, Version 10.0.17763 Build 17763.
On that system we were able to reproduce the issue with a simple test program written in C++ that echoes UDP datagrams. We verified that the thread receiving from the socket was actually responsive all the time, by specifying a timeout of 1 second using SO_RCVTIMEO, printing some “still alive” output and immediately calling recvfrom() again.
On our own test system, we were unable to observe the issue under the same circumstances as the customer. However, we were able to provoke similar effects when playing around with the network adapter settings while the test was running. Enabling Microsoft LLDP Protocol Driver showed the sendto() error and sometimes also resulted in the 8 second “silence” period, but without any Windows System Log entry.
Any hints are greatly appreciated.
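For readers who want to reproduce the setup, the test described above (a UDP echo loop with a receive timeout and a "still alive" message) might look roughly like this in Python. It is a sketch, not the original C++ program: echo_once and make_echo_socket are hypothetical names, and settimeout() plays the role that SO_RCVTIMEO plays in the original.

```python
import socket


def make_echo_socket(host="127.0.0.1", port=0, timeout_s=1.0):
    """Create a bound UDP socket whose receive calls time out after timeout_s."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout_s)
    sock.bind((host, port))
    return sock


def echo_once(sock):
    """Wait for one datagram and send it back to its sender.

    Returns the payload, or None if the receive timed out.
    """
    try:
        data, addr = sock.recvfrom(2048)
    except socket.timeout:
        print("still alive")  # the receiver is responsive, there is just no data
        return None
    sock.sendto(data, addr)
    return data


if __name__ == "__main__":
    server = make_echo_socket(timeout_s=0.2)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(1.0)
    echo_once(server)                             # no traffic yet: prints "still alive"
    client.sendto(b"ping", server.getsockname())
    echo_once(server)                             # echoes the datagram back
    print(client.recvfrom(2048)[0])
    client.close()
    server.close()
```

A real test would call echo_once in an endless loop; the demo above runs it only twice so that it terminates.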
The issue seems to be related to the Microsoft Provisioning Tool, starting with Windows 10 1809.
Disabling it fixed the issue in our case:
Open Task Scheduler, go to Microsoft/Windows/Management/Provisioning, and disable the Logon task.
Source: Windows TenForums
I had a project that used WinDivert to act as a router in my network. It worked fine, but now the same code no longer works. Previous versions that worked successfully no longer work either. I always get the same WinDivert error, 997 (Overlapped I/O operation is in progress).
For example, I get the error when I call WinDivertOpen. When I restart the computer to reset the WinDivert driver, I no longer get error 997 from WinDivertOpen, but I get it from WinDivertSend or WinDivertSendEx, and after calling them I get it from WinDivertOpen again. These functions worked fine for me months ago and my router worked as expected, but now I am stuck with these errors. Maybe this is caused by a Windows security update.
I need to know how to reset the driver without restarting the computer, and what I can do about this problem. I used WinDivert to block the TCP RST packets that Windows sends in response to the traffic my router forwards; Windows sends them when no socket is associated with the ports being forwarded. How can I block these packets without WinDivert, or with a working WinDivert setup?
The 997 error is ERROR_IO_PENDING, but the error code is meaningless unless WinDivertOpen returns INVALID_HANDLE_VALUE. Otherwise the call will have completed successfully.
Presumably you have upgraded to WinDivert 1.4 from a previous version. Simply replacing the binary files (dll/sys) won't work -- you must instead recompile your program against the new API.
What does it mean when the terminal shows this error, and how can I solve it?
packet_write_wait: Connection to xxx.xxx.xxx.xxx: Broken pipe
It just started happening today, after working normally for years.
My terminal keeps disconnecting after a certain time. I have already searched on Google, but most results are about "Write failed: Broken pipe.",
which I solved years ago. This new, annoying problem only appeared today.
I experienced this problem as well and spent a few days trying to bisect it.
As mentioned, playing with SSH KeepAlive parameters (ClientAliveInterval, ClientAliveCountMax, ServerAliveInterval and ServerAliveCountMax) or kernel TCP parameters (TCPKeepAlive on/off) does not solve the problem.
After playing with USB to Ethernet drivers and tcpdump, I realized the issue was due to the kernel 4.8 I was using. I switched the source (sending side) to 4.4 LTS and the problem disappeared (rsync via ssh and scp were working nicely again). The destination side can remain on 4.8 if you want, in my use case this was working (tested).
On the technical side, we can narrow the issue down a little thanks to the Wireshark dump I made below. It shows the TCP connection carrying the SSHv2 protocol being reset (the TCP RST flag set to 1), which aborts the connection. I don't know the cause of that RST yet; I need to bisect from 4.8.1 to 4.8.11 to find it.
I'm not saying your problem is specifically due to the kernel 4.8, but wrt. the date you posted your question/message, there are high chances you are currently using a kernel more recent than 4.4.
If that is an ssh connection, then you might want to make sure you send a keepalive message to the server.
ServerAliveInterval seems to be the most common strategy to keep a connection alive. To prevent the broken pipe problem, here is the ssh config I use in my ~/.ssh/config file (the system-wide client file is /etc/ssh/ssh_config; sshd_config is the server's file and does not accept these options):
Host myhostshortcut
    HostName myhost.com
    User barthelemy
    ServerAliveInterval 60
    ServerAliveCountMax 10
Connect through another wifi.
I don't know why or how it works, but it does.
The original poster sthapaun already mentioned this solution in a comment, but I want to add that the solution works for me, too.
I am working on a Windows (Microsoft Visual C++ 2005) application that uses several processes
running on different hosts in an intranet.
Processes communicate with each other using TCP/IP. Different processes can be on the
same host or on different hosts (i.e. the communication can be both within the same
host or between different hosts).
We have currently a bug that appears irregularly. The communication seems to work
for a while, then it stops working. Then it works again for some time.
When the communication does not work, we get an error (apparently while a process
was trying to send data). The call looks like this:
send(socket, (char *) data, (int) data_size, 0);
By inspecting the error code we get from
WSAGetLastError()
we see that it is error 10054. Here is what I found in the Microsoft documentation:
WSAECONNRESET
10054
Connection reset by peer.
An existing connection was forcibly closed by the remote host. This normally
results if the peer application on the remote host is suddenly stopped, the
host is rebooted, the host or remote network interface is disabled, or the
remote host uses a hard close (see setsockopt for more information on the
SO_LINGER option on the remote socket). This error may also result if a
connection was broken due to keep-alive activity detecting a failure while
one or more operations are in progress. Operations that were in progress
fail with WSAENETRESET. Subsequent operations fail with WSAECONNRESET.
So, as far as I understand, the connection was interrupted by the receiving process.
In some cases this error is (AFAIK) correct: one process has terminated and
is therefore not reachable. In other cases both the sender and receiver are running
and logging activity, but they cannot communicate due to the above error (the error
is reported in the logs).
My questions.
What does the SO_LINGER option mean?
What is a keep-alive activity and how can it break a connection?
How is it possible to avoid this problem or recover from it?
Regarding the last question. The first solution we tried (actually, it is rather a
workaround) was resending the message when the error occurs. Unfortunately, the
same error occurs over and over again for a while (a few minutes). So this is not
a solution.
At the moment we do not understand if we have a software problem or a configuration
issue: maybe we should check something in the windows registry?
One hypothesis was that the OS runs out of ephemeral ports (in case connections are
closed but ports are not released because of TcpTimedWaitDelay), but by analyzing
this issue we think that there should be plenty of them: the problem occurs even
if messages are not sent too frequently between processes. However, we still are not
100% sure that we can exclude this: can ephemeral ports get lost in some way?
Another detail that might help is that sending and receiving occurs in each process
concurrently in separate threads: are there any shared data structures in the
TCP/IP libraries that might get corrupted?
What is also very strange is that the problem occurs irregularly: communication works
OK for a few minutes, then it does not work for a few minutes, then it works again.
Thank you for any ideas and suggestions.
EDIT
Thanks for the hints confirming that the only possible explanation was a connection closed error. By further analysis of the problem, we found out that the server-side process of the connection had crashed / had been terminated and had been restarted. So there was a new server process running and listening on the correct port, but the client had not detected this and was still trying to use the old connection. We now have a mechanism to detect such situations and reset the connection on the client side.
That error means that the connection was closed by the
remote side. So there is nothing you can do in your program except accept that the connection is broken.
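The recovery described in the EDIT above (drop the dead connection and open a new one) can be sketched in Python as follows. This is a minimal sketch, not the poster's actual mechanism: ReconnectingClient and hard_close are hypothetical names. hard_close is included only to show the kind of SO_LINGER hard close that makes the peer see a reset.

```python
import socket
import struct


def hard_close(sock):
    """Close with SO_LINGER enabled and a zero timeout.

    The peer then sees a reset (RST) instead of an orderly shutdown.
    struct linger is two ints on Linux; Windows expects two 16-bit
    values, i.e. struct.pack("hh", 1, 0).
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()


class ReconnectingClient:
    """TCP client that reopens its connection after a reset by the peer."""

    def __init__(self, host, port, timeout_s=5.0):
        self.addr = (host, port)
        self.timeout_s = timeout_s
        self.sock = None
        self.reconnects = 0

    def _connect(self):
        self.sock = socket.create_connection(self.addr, timeout=self.timeout_s)

    def send(self, data):
        """Send data, reconnecting once if the old connection is dead."""
        if self.sock is None:
            self._connect()
        try:
            self.sock.sendall(data)
        except (ConnectionResetError, ConnectionAbortedError, BrokenPipeError):
            # The peer is gone (WSAECONNRESET / 10054 on Windows). Retrying on
            # the same socket cannot succeed, so drop it and open a new one.
            self.sock.close()
            self.reconnects += 1
            self._connect()
            self.sock.sendall(data)
```

Note that a send can still succeed locally after the peer process has died, because the reset only arrives in response to it. Detecting lost messages reliably needs an application-level acknowledgement.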
I was facing this problem for some days recently and found out that Adobe Acrobat Reader update was the culprit. As soon as you completely uninstall Adobe from the system everything returns back to normal.
I spent a long time debugging a 10054/10053 error in S3 pre-signed uploads.
It turns out that the S3 server will reject pre-signed uploads for the first 15 minutes of a bucket's life.
So if you're debugging S3, check that it's not a new bucket.
If you're debugging something else, this is most likely a problem on the server side, not the client side.
I implemented an asynchronous download that retrieves a remote file and stores it in IsolatedStorage so it can be used when off the network.
Everything works great when the network is up. However, when out of network, I noticed that the async download may take up to 2 minutes before it fires my MessageBox (which says that the connection to the server has failed).
Question:
Is there any way to define a timeout? Say, if my application does not receive any answer for X seconds, stop the async download and call a method.
Maybe a timeout is not the best practice. In that case, could you suggest something else?
I do not want my users to wait more than 15 seconds.
PS: my application is supposed to run on Wi-Fi only, so I assume that network speed is optimal.
Thanks for your help.
What I would recommend is checking the network type first via NetworkInterface. If NetworkInterfaceType is Wireless80211, you have a wireless connection (Wi-Fi). The returned connection can be None when there is no available way to connect, so you won't even have to start the download if there is no accessible network.
To answer your question: if you are using WebClient, you can't define a timeout. However, you can call instance.CancelAsync(). For an HttpWebRequest you can call instance.Abort().
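The pattern behind that advice is to start the download, and if it has not finished after X seconds, cancel it and call a fallback. As a language-neutral illustration, here is a Python analogue using asyncio. It is not the Windows Phone API: download_with_timeout and slow_download are hypothetical names, and slow_download merely simulates a download that hangs.

```python
import asyncio


async def download_with_timeout(download, timeout_s, on_timeout):
    """Await download(); if it takes longer than timeout_s, cancel it
    and return on_timeout() instead."""
    try:
        return await asyncio.wait_for(download(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the download at this point.
        return on_timeout()


async def slow_download():
    """Stand-in for a download that hangs because the network is down."""
    await asyncio.sleep(10)
    return b"file contents"


if __name__ == "__main__":
    result = asyncio.run(
        download_with_timeout(slow_download, 0.1, lambda: "connection failed")
    )
    print(result)
```

With WebClient, the same shape would be a timer started alongside the async download whose callback calls instance.CancelAsync().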