Perforce - RpcTransport: partial message read - performance

When using "revert -a" through P4V it waits for a few minutes and throws this error back at me.
RpcTransport: partial message read
TCP receive failed.
read: socket: WSAECONNRESET
The server status returns fine and there are no locked database files.
I suspect this problem is local to this computer as others don't have the same issue. Issueing the same command through the command prompt just has the command prompt sit there indefinitly.
Other commands such as submit and add will have the visual client sit there indefinitely but does not throw and error.
The files are stored on a local drive. This happens with multiply depots/workstations.

The 'WSAECONNRESET' error is issued by Windows, when a network socket is forcibly closed.
Regular occurrences of this error can indicate network problems.
More information is available here:
http://answers.perforce.com/articles/KB/2968/
Hope this helps,
Jen!

I got the same on windows machine. I guess in my case it was caused by corrupted config settings and because of popup error message I had no chance to set it correctly via GUI.
The command line SET command helped to set port and host name again:
p4 set P4PORT=<portnum>
This command reenables the GUI config dialog

A few years late, but for those still facing this:
I faced this error when fetching files from a large repo. I believe what caused this for me was low internet upload speeds due to which - even though I had high a download speed - the TCP acknowledgment from my computer was not getting sent, causing a connection failure.
Perform an upload speed test to determine if it is very low (in my case it had dropped to less than 0.1 Mbps). Fixing upload speeds is a separate topic, but in case it helps try restarting your router as a first step.

Related

Reasons for rare sendto()/recvfrom() issues under Winsock?

We recently observe rare UDP communication issues that show the following symptoms:
A socket sendto() call fails with error WSAENOBUFS (10055)
A subsequent recvfrom() call on this socket does not receive anything, even though Wireshark shows that the network interface actually received the expected datagrams. This situation persists for approximately 8 seconds, afterwards new incoming datagrams can be received again from the socket.
In Windows System Log, there appears a Kernel-General information entry at the time of the sendto() error:
The access history in hive \??\C:\ProgramData\Microsoft\Provisioning\Microsoft-Desktop-Provisioning-Sequence.dat was cleared updating 0 keys and creating 0 modified pages.
The issue happens on a customer system running Microsoft Windows 10 Pro for Workstations, Version 10.0.17763 Build 17763.
On that system we were able to reproduce the issue with a simple test program written in C++ that echoes UDP datagrams. We verified that the thread receiving from the socket was actually responsive all the time, by specifying a timeout of 1 second using SO_RCVTIMEO, printing some “still alive” output and immediately calling recvfrom() again.
On our own test system, we were unable to observe the issue under the same circumstances as the customer. However, we were able to provoke similar effects when playing around with the network adapter settings while the test was running. Enabling Microsoft LLDP Protocol Driver showed the sendto() error and sometimes also resulted in the 8 second “silence” period, but without any Windows System Log entry.
Any hints are greatly appreciated.
The issue seems to be related to Microsoft Provisioning Tool since Windows 10 1809.
Disabling it fixed the issue in our case:
Open Task Scheduler, go to Microsoft/Windows/Manangement/Provisioning and disable Logon task.
Source: Windows TenForums

Irregular socket errors (10054) on Windows application

I am working on a Windows (Microsoft Visual C++ 2005) application that uses several processes
running on different hosts in an intranet.
Processes communicate with each other using TCP/IP. Different processes can be on the
same host or on different hosts (i.e. the communication can be both within the same
host or between different hosts).
We have currently a bug that appears irregularly. The communication seems to work
for a while, then it stops working. Then it works again for some time.
When the communication does not work, we get an error (apparently while a process
was trying to send data). The call looks like this:
send(socket, (char *) data, (int) data_size, 0);
By inspecting the error code we get from
WSAGetLastError()
we see that it is an error 10054. Here is what I found in the Microsoft documentation
(see here):
WSAECONNRESET
10054
Connection reset by peer.
An existing connection was forcibly closed by the remote host. This normally
results if the peer application on the remote host is suddenly stopped, the
host is rebooted, the host or remote network interface is disabled, or the
remote host uses a hard close (see setsockopt for more information on the
SO_LINGER option on the remote socket). This error may also result if a
connection was broken due to keep-alive activity detecting a failure while
one or more operations are in progress. Operations that were in progress
fail with WSAENETRESET. Subsequent operations fail with WSAECONNRESET.
So, as far as I understand, the connection was interrupted by the receiving process.
In some cases this error is (AFAIK) correct: one process has terminated and
is therefore not reachable. In other cases both the sender and receiver are running
and logging activity, but they cannot communicate due to the above error (the error
is reported in the logs).
My questions.
What does the SO_LINGER option mean?
What is a keep-alive activity and how can it break a connection?
How is it possible to avoid this problem or recover from it?
Regarding the last question. The first solution we tried (actually, it is rather a
workaround) was resending the message when the error occurs. Unfortunately, the
same error occurs over and over again for a while (a few minutes). So this is not
a solution.
At the moment we do not understand if we have a software problem or a configuration
issue: maybe we should check something in the windows registry?
One hypothesis was that the OS runs out of ephemeral ports (in case connections are
closed but ports are not released because of TcpTimedWaitDelay), but by analyzing
this issue we think that there should be plenty of them: the problem occurs even
if messages are not sent too frequently between processes. However, we still are not
100% sure that we can exclude this: can ephemeral ports get lost in some way (???)
Another detail that might help is that sending and receiving occurs in each process
concurrently in separate threads: are there any shared data structures in the
TCP/IP libraries that might get corrupted?
What is also very strange is that the problem occurs irregularly: communication works
OK for a few minutes, then it does not work for a few minutes, then it works again.
Thank you for any ideas and suggestions.
EDIT
Thanks for the hints confirming that the only possible explanation was a connection closed error. By further analysis of the problem, we found out that the server-side process of the connection had crashed / had been terminated and had been restarted. So there was a new server process running and listening on the correct port, but the client had not detected this and was still trying to use the old connection. We now have a mechanism to detect such situations and reset the connection on the client side.
That error means that the connection was closed by the
remote site. So you cannot do anything on your programm except to accept that the connection is broken.
I was facing this problem for some days recently and found out that Adobe Acrobat Reader update was the culprit. As soon as you completely uninstall Adobe from the system everything returns back to normal.
I spent a long time debugging a 10054/10053 error in s3 pre-signed uploads
Turns out that the s3 server will reject pre-signed s3 uploads for the first 15 minutes of it's life.
So - If you're debugging s3 check it's not a new bucket.
If you're debugging something else - this is most likely a problem on the server side not client side.

OpenSSH connection trouble

I'm trying to use Putty 0.60 to log in to an OpenSSH 5.3 server. Connections with openssh from another Linux server are possible, but Putty fails. Putty's event log tells me "software caused connection abort" right after the DH key exchange, the server log doesn't report anything (set to INFO). I analyzed the traffic with Wireshark and got a whole bunch of "TCP retransmission" and "TCP DUP ACK" after said key exchange.
Sometimes I was able to log in, but at some point (usually < 2 min.) the connection froze without any logged messages. Sadly, I didn't capture a trace.
The server is my own (Funtoo with genkernel and gentoo-sources 2.6.34), so I may tweak it, but I'd still like to know what causes the error. Any suggestions? Thank you!
Ok, that was weird.
The problems cause was a network BIOS setting: a specified static IP and NIC = shared (Broadcom Extreme II) - system in question is a Dell Blade. By these settings, I somehow ended up with multiple MAC addresses for the IP - which killed my SSH connections. I honestly hope this helps somebody else...

What causes: Error -135: socket write error

I am trying to upload a 12MB .wav file on a Mac running Transmit to a Linux box running Apache and get the following error after uploading just 160KB of the file:
error-135-socket-write-error
Any clues why I may be getting this? I have successfully uploaded much larger files in the past and nothing has changed on the configuration.
The disk was full and this caused the error -135.
Check the File Name are there spaces or symbles in it
Hard to say without knowing which OS is running on the client and the server (i.e. where the error code might come from). The error means that the client couldn't send any more data to the server. This is either because the server stops accepting more bytes (hangup, disk full, quota exceeded) or because the connection times out (but then, I'd expect a "connection reset by peer" error).

BizTalk 2006 - receive file through FTP - timeout issues

When trying to receive a (large, approx. 100MB) file using an FTP adapter in BizTalk 2006, we run into the following problem, which causes the file to be processed over and over again.
Retrieving the file succeeds; it is placed into the MessageBox and processed properly
When the FTP adapter issues the DELE statement, it never reaches the FTP server the file is on (we have verified this by taking a look at the FTP server's logs)
there are no signs of timeouts on the FTP server; the FTP server log does not mention a timeout occurring
After the interval time set on the adapter expires, the FTP server will still find the large file that we have already processed in the previous run, because the DELE statement failed
The event log in BizTalk states that ‘The connection to the FTP server was broken prematurely’. That is why we think there is a timeout issue.
We have seen that retrieval of the file takes around 35 minutes. The FTP server timeout is set to 1 hour. no problems there I guess.
Then we found the following article: http://www.ncftp.com/ncftpd/doc/misc/ftp_and_firewalls.html#FirewallTimeouts. It states that a firewall / routing device might be responsible for the timeouts. The team managing our firewalls and routers told us that there were no timeouts set here.
Which leaves us in the dark on the cause of our problem. Does anyone of you have any suggestions? Or even better, the solution!!
Have you tried the solutions in this article?
I avoid using the FTP adapter. Instead I use a third party utility to retrieve files and move the transferred file to a file adapter receive location. Third party utilities allow you to configure rules, recovery actions etc, freeing BizTalk from having to manage the transfer.

Resources