TFS Build server unable to communicate with controller for load testing - mstest

What steps can I take to test the connectivity of a controller?
How can I ensure that my TFS build server is able to communicate with a controller in order to perform a load test?
My company is trying to automate load testing, and to accomplish this we are using a TFS build. I have a command line task that actually starts the load tests. To visualize, it looks like this:
Command Line Task
Tool: MSTest.exe
Arguments: /testcontainer:"some load test.loadtest" /testsettings:"some test settings.testsettings"
When I run the build, it times out after about 3-4.5 minutes and gives me this error:
Failed to queue test run 'some load test': A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
I've tried a few things to test the connectivity of the controller, but I'm not sure I've reached the answer I'm looking for yet.
The first thing I tried was to restart the controller, however that didn't change anything.
I then went into Visual Studio 2017 and opened the load test editor. From there I went into Manage Test Controllers. The dialog box is set to the correct controller, and the agents it controls are in the Ready state.
Then I looked at the test settings and verified that <RemoteController/> was also set correctly. I did this mainly to confirm that the error isn't in the files themselves. So the timeout should be caused by a connection problem between the build server and the controller.
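The manual check of the test settings file could also be scripted. The following is a minimal Python sketch, assuming only that the .testsettings file is XML and contains a RemoteController element as described above. The attribute layout of that element is not shown in the question, so the hypothetical helper `remote_controllers` simply returns whatever attributes it finds.

```python
import xml.etree.ElementTree as ET


def remote_controllers(source):
    """Return the attributes of every RemoteController element.

    `source` is a file path or a binary file object holding the
    .testsettings XML (hypothetical helper, not part of MSTest).
    """
    root = ET.parse(source).getroot()
    found = []
    for elem in root.iter():
        # Strip any XML namespace ("{uri}Tag" -> "Tag") before comparing.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "RemoteController":
            found.append(dict(elem.attrib))
    return found
```

Calling `remote_controllers("some test settings.testsettings")` on the build agent shows which controller the checked-out copy of the file actually points at.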
I know that the controller listens for incoming traffic on a specific port, 6901, so I checked the port connection using a .NET TcpClient in PowerShell:
$server = 'mycontroller'
$port = 6901
$client = New-Object Net.Sockets.TcpClient
try {
    # Connect throws if the TCP connection cannot be established
    $client.Connect($server, $port)
    $msg = "Connected to $server on Port $port"
} catch {
    $msg = "Could not connect to $server on Port $port"
} finally {
    $client.Dispose()
}
Write-Host $msg
Provided the controller is turned on, the script always reports that $client is able to connect to $server. Does a TcpClient object have any properties besides Connected that could give me more information about its connection? I ask because the build server still times out even though $client is seemingly able to connect. Do I need to give the build server the specific port to connect to? Shouldn't it do that automatically?
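For comparison, here is a minimal Python sketch of the same kind of probe that also reports the local and remote endpoints, the elapsed time, and the error text on failure. The function name `check_tcp_port` is made up for illustration. Note that a completed TCP connect only shows that the port accepts connections from the machine where the script runs; it does not exercise any further traffic a test run may need, so the probe is only meaningful when run from the build agent itself.

```python
import socket
import time


def check_tcp_port(host, port, timeout=5.0):
    """Try one TCP connection; return (ok, detail, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Both endpoints of the established connection.
            local_ip, local_port = sock.getsockname()[:2]
            remote_ip, remote_port = sock.getpeername()[:2]
            detail = f"{local_ip}:{local_port} -> {remote_ip}:{remote_port}"
            ok = True
    except OSError as exc:
        # Covers refused connections, timeouts and name lookup failures.
        detail = f"{type(exc).__name__}: {exc}"
        ok = False
    return ok, detail, time.monotonic() - start
```

For example, `check_tcp_port("mycontroller", 6901)` returns a tuple whose second item is either the endpoint pair or the exception text.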
I tried tracert through another command line task to see if I could pinpoint where the server was getting timed out:
2019-07-10T20:28:32.9051572Z C:\Windows\system32\cmd.exe /c "TRACERT.exe atsvstctstvm400"
2019-07-10T20:28:32.9271264Z Tracing route to mycontroller.local [10.208.3.56]
2019-07-10T20:28:32.9281250Z over a maximum of 30 hops:
2019-07-10T20:28:37.7253994Z 1 1 ms <1 ms <1 ms 10.200.0.5
2019-07-10T20:28:38.7269952Z 2 <1 ms <1 ms <1 ms 10.50.0.2
2019-07-10T20:28:39.8534160Z 3 38 ms 39 ms 38 ms 172.20.0.2
2019-07-10T20:28:52.3918376Z 4 * * * Request timed out.
2019-07-10T20:28:52.5396304Z 5 48 ms 48 ms 48 ms 10.208.3.56
2019-07-10T20:28:52.5396304Z Trace complete.
Could the 4th hop be where the build server is getting hung up? Despite the timeout on hop 4, the agent was still able to get through to the controller. If the 4th hop is what I need to be concerned with, how do I even go about fixing it? The output doesn't give me any information about where the 4th hop wanted to go.
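A hop that shows `* * *` while later hops still answer usually means that router does not reply to the probes, not that it drops traffic passing through it. The following Python sketch parses tracert output like the log above and separates silent hops from whether the final hop answered. The helper names are hypothetical, and treating "last hop responded" as "destination reached" is a simplifying assumption.

```python
import re

# One hop line: hop number, three probe results ("12 ms", "<1 ms" or "*"),
# then the address or message. search() skips any leading log timestamp.
HOP_RE = re.compile(r"(?:^|\s)(\d+)\s+((?:(?:<?\d+ ms|\*)\s+){3})(\S.*)$")


def parse_hops(lines):
    """Return a list of (hop_number, responded, target_text) tuples."""
    hops = []
    for line in lines:
        match = HOP_RE.search(line.rstrip())
        if match:
            number, probes, target = match.groups()
            # A hop responded if at least one probe is not "*".
            responded = any(p != "*" for p in probes.split() if p != "ms")
            hops.append((int(number), responded, target))
    return hops


def summarize(lines):
    """Return (silent_hop_numbers, last_hop_responded)."""
    hops = parse_hops(lines)
    silent = [number for number, responded, _ in hops if not responded]
    reached = bool(hops) and hops[-1][1]
    return silent, reached
```

Fed the eight log lines above, `summarize` reports hop 4 as silent and the last hop as having responded.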
So, in sum, I'm trying to run a load test through a command line task in tfs using MSTest.exe. When I run the build, I get a timeout error, which I believe means that the build server is unable to communicate with the controller.
I've tried a few ideas for troubleshooting, but I haven't been able to resolve the issue or shed any light on how to proceed.
How can I pinpoint where the communication error is?
What other troubleshooting solutions are there to resolve this issue?
For future load test runs, how can I test to ensure that the build server is able to communicate with the controller?
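One possible pre-flight step for future runs is a small script that retries the port probe a few times before MSTest is started, so the build fails fast with a clear message. This is a sketch only: `wait_for_port` is a hypothetical helper, the retry counts are arbitrary, and it checks TCP reachability of the controller port and nothing more.

```python
import socket
import time


def wait_for_port(host, port, attempts=3, delay=2.0, timeout=5.0):
    """Return True as soon as one TCP connect succeeds, else False."""
    for attempt in range(1, attempts + 1):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError as exc:
            print(f"attempt {attempt}/{attempts} failed: {exc}")
            if attempt < attempts:
                time.sleep(delay)  # give a restarting controller some time
    return False
```

In a command line task placed before the MSTest step, the script could end with `sys.exit(0 if wait_for_port("mycontroller", 6901) else 1)` so that a non-zero exit code stops the build.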

Related

Retry connection with JMeter servers (generators) in JMeter Distributed setup

I've a distributed JMeter setup with 1 client (controller) and 2 servers (generators).
If a generator crashes while a test is executing on this setup, the controller hangs even after the test duration ends.
Is there a way to reconnect the controller with the generator after the generator comes up again during the same test execution?
No, there is no such configuration option, and it is advisable to restart the servers.
Usually this is due to connectivity issues between the server and the controller, such as a port not being open.
For the relevant properties, have a look at:
https://jmeter.apache.org/usermanual/properties_reference.html#remote
For 1/ you can add this to user.properties:
client.continue_on_fail=true
server.exitaftertest=true

libssh2_session_handshake return -43

I have an SSH server running on my Windows 10 machine. I ran the demo named example-ssh2 from libssh2 1.90, built with Visual Studio 2015.
I saw that the TCP socket was ESTABLISHED.
session = libssh2_session_init()
succeeded, but
libssh2_session_handshake(session, sock)
always returns -43.
Could you help me?
I was getting this -43 when connecting to servers on high latency networks.
libssh2.h has this definition:
#define LIBSSH2_ERROR_SOCKET_RECV -43
I was able to connect successfully and consistently by adding a short sleep between the libssh2_session_init() call and the libssh2_session_handshake(session, sock) call, like this:
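#include <chrono>  // std::chrono::milliseconds (at the top of the file)
#include <thread>  // std::this_thread::sleep_for
session = libssh2_session_init();
std::this_thread::sleep_for(std::chrono::milliseconds(300));
libssh2_session_handshake(session, sock);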
I chanced upon this solution by single stepping over the original code in the debugger, and it would successfully connect every time, so I tried the sleep and it worked. I tried some different sleep times, with 30ms being the lowest value that still worked.
I found your post while trying to find an explanation for this behavior. I hope this method works for you.

java.net.SocketException: Connection reset on reaching 3000 users in JMeter

All the required changes have been made to the respective files, such as:
stalecheck=true,
keepalive is checked in HTTP Request Defaults,
retrycount=1,
hc.parameters file changes,
socket timeout is 240000.
We still see "java.net.SocketException: Connection reset" in the response data, although I can see valid requests being passed to the server.
The issue did not appear until we reached 3000 users; everything worked smoothly up to that point.
Connection reset can have many causes. Possible reasons are:
One of the server components is not able to handle the load, so it closes connections on its side.
On the JMeter side, check that you are running in non-GUI mode and that neither the JMeter JVM nor the injector machine is overloaded, which could explain this. See:
https://jmeter.apache.org/usermanual/get-started.html#non_gui

Not able to initialise SmartMeter because it generates an error

I have recorded a test script using SmartMeter.
But when I try to load the script with "Run SmartMeter Test",
it gives the following error. I do not need a remote server, as I want to run a basic test script from my local machine.
Connection refused to host: 127.0.0.1; nested exception is:
java.net.ConnectException: Connection refused connect
For background, please note that SmartMeter.io is based on Apache JMeter but adds new features such as one-click test reports, an advanced scenario recorder, a user-friendly distributed mode, acceptance criteria and many others.
Looking forward to your suggestions.
Thanks.
It looks like SmartMeter is unable to start a load generator. The most common reasons are that the required port 1099 is already occupied by some other program or that the load generator is blocked by your firewall. I suggest you restart your computer and try again. If that doesn't help, you need to find out what is occupying that port. See, for example, "How can you find out which process is listening on a port on Windows?"
You should be able to get some information from logs/generator.log.
You can also run tests from SmartMeter Editor, the same way you would do in JMeter.
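Whether port 1099 is already taken can be checked quickly by trying to bind a listener to it. A minimal Python sketch with a hypothetical helper `port_is_free` follows. It only tells you that the port is occupied, not by which process, and it assumes the port of interest is bound on 127.0.0.1 as in the error message above.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP listener can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": something else holds the port.
            return False
    return True
```

If `port_is_free(1099)` returns False before SmartMeter is started, another program is holding the port.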

'net use' over SSL fails unless port 443 is specified

We are attempting to connect to a WebDAV server using net use over SSL. On some servers we're seeing an issue in which this connection only succeeds if we specify port 443 in the URL.
Does Map
net use * "https://example.com:443/folder"
net use * "\\example.com#SSL#443\folder"
and, bizarrely, so does this:
net use * "\\example.com#SSLasdf\folder"
Does Not Map
net use * "https://example.com/folder"
net use * "\\example.com#SSL\folder"
In the non-working cases we consistently receive the following error:
System error 67 has occurred.
The network name cannot be found.
We have noticed some things that might be useful information:
We have a test server that's configured the same way as the prod server and it works as expected.
In the non-working cases, no incoming requests are ever seen at the prod server from the failing host.
All clients are based on the same image.
The problem does not manifest uniformly on all clients -- some work, some don't.
There is an existing, valid entry for example.com in the client DNS cache.
Flushing the client DNS cache of the affected servers does not resolve the problem.
Once the problem appears, it seems to stick. That is, if I execute one of the working mappings, delete it, and then immediately execute one of the non-working mappings, the problem persists.
We are utterly stumped. Any theories?
You are seeing different behaviors because you are connecting using different names. Once a name has been attempted and failed, the WebClient (this is the service that enables WebDAV) will cache the response for a period. To clear the cache, locate the WebClient service in the Services console and restart it. Or from an administrative command prompt execute the following command:
net.exe stop webclient && net.exe start webclient
We ultimately determined that we were mis-interpreting the System Error 67 that net use was returning. We discovered two interesting things:
In the event that the WebDAV server returns a 404 or a 50x on the initial, root folder PROPFIND, net use will (rightly) interpret this as the root folder being unavailable. The fact that it says the network name could not be found led us to believe that the problem was with name resolution, but it was really just saying, 'hey, I couldn't find anything at this path.'
If 'net use' fails due to a 404/50x, it appears that for a brief period of time it will automatically fail any additional mappings for that same host without issuing a request. For example, if net use http://me.com/foo returns a 404, then net use http://me.com/bar will instantly fail if made in rapid succession to that first call, and no request record will be seen in the WebDAV server logs.
My best guess is that appending the #443 port didn't make any real difference. What it perhaps did do was to trick net use into thinking it was talking to a different host, at least for the purposes of its 'auto-fail' feature. But that's just a guess.
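To see which HTTP status the server returns for a PROPFIND on the folder, independently of net use and its caching, a request can be sent directly. The Python sketch below sends a bodiless PROPFIND with `Depth: 0`. The function name is hypothetical, the request is not a byte-for-byte copy of what the Windows WebClient sends, and a server that requires authentication may answer 401 instead.

```python
import http.client
from urllib.parse import urlsplit


def propfind_status(url, timeout=10.0):
    """Send a Depth: 0 PROPFIND to url and return the HTTP status code."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    # A port of None makes http.client fall back to the scheme default.
    conn = conn_cls(parts.hostname, parts.port, timeout=timeout)
    try:
        conn.request("PROPFIND", parts.path or "/", headers={"Depth": "0"})
        response = conn.getresponse()
        response.read()  # drain the body before closing the connection
        return response.status
    finally:
        conn.close()
```

A 404 or 50x from `propfind_status("https://example.com/folder")` would match the case described above, where net use reports System error 67.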
