Troubleshooting a web service's speed - performance

C# / .NET 2.0, if that turns out to be applicable.
I'm going to start getting to the bottom of why a web service we have is acting slow. This web service jumps over a proxy, into another domain, queries a stored procedure for some data, and then returns an int/string/dataset, depending on what I ask for. I just wrote a console app that queries it repeatedly in the same fashion so that I can gather some statistics to start with.
Keep-alive is turned off for each request, for some legacy reason nobody documented, so there's an immediate smell.
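For reference, on a .NET 2.0 generated SOAP proxy that's usually done by overriding GetWebRequest; a minimal sketch of what I assume our legacy code looks like (assuming SimpleService, the proxy used later in this post, is generated as a partial class, which VS 2005 does by default):
// Sketch only - the actual legacy code may differ.
using System;
using System.Net;

public partial class SimpleService
{
    protected override WebRequest GetWebRequest(Uri uri)
    {
        HttpWebRequest request = (HttpWebRequest)base.GetWebRequest(uri);
        request.KeepAlive = false;   // forces a fresh TCP connection (and proxy hop) per call
        return request;
    }
}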
When looping through the same request multiple times, I noticed some strange behavior. Here's my output that reflects how long each iteration took to make the query and return data.
Beginning run #1...completed in 4859.3128 ms
Beginning run #2...completed in 3812.4512 ms
Beginning run #3...completed in 3828.076 ms
Beginning run #4...completed in 3828.076 ms
Beginning run #5...completed in 546.868 ms
Beginning run #6...completed in 3828.076 ms
Beginning run #7...completed in 546.868 ms
Beginning run #8...completed in 3828.076 ms
Beginning run #9...completed in 3828.076 ms
Beginning run #10...completed in 578.1176 ms
Beginning run #11...completed in 3796.8264 ms
Beginning run #12...completed in 3828.076 ms
Beginning run #13...completed in 3828.076 ms
Beginning run #14...completed in 3828.076 ms
Beginning run #15...completed in 3828.076 ms
Beginning run #16...completed in 3828.076 ms
Beginning run #17...completed in 546.868 ms
Beginning run #18...completed in 3828.076 ms
Beginning run #19...completed in 3828.076 ms
Beginning run #20...completed in 546.868 ms
Total time: 61165 ms
Average time per request: 3058 ms
I find it odd that there are multiple repeated values, identical down to a fraction of a millisecond. Is there some bottleneck that would cause the call to return in exactly the same amount of time over and over?
...hopefully my code for figuring out and displaying the millisecond duration isn't off, but the TimeSpan object tracking it is local to each loop iteration, so I don't think it's that.
EDIT: Jon asked for the timing code, so here ya go (variable names changed to protect the proprietary, so might have fat-fingered something that would make this not compile)...
int totalRunTime = 0;
for (int i = 0; i < numberOfIterations; i++)
{
    Console.Write("Beginning run #" + (i + 1).ToString() + "...");
    DateTime start = DateTime.Now;
    SimpleService ws = new SimpleService();
    DataSet ds = ws.CallSomeMethod();
    DateTime end = DateTime.Now;
    TimeSpan runTime = end - start;
    totalRunTime += (int)runTime.TotalMilliseconds;
    Console.Write("completed in " + runTime.TotalMilliseconds.ToString() + " ms\n");
}
Console.WriteLine("Total time: " + totalRunTime.ToString() + " ms");
Console.WriteLine("Average time per request: " + (totalRunTime / numberOfIterations).ToString() + " ms\n");

The simplest way, without running a profiler etc, is to make the web app log the exact time (as near as you can get it, obviously) it starts the operation, various times within the call, and the time it finishes. Then you can see where it's taking the time. (Using a Stopwatch will give you more accuracy, but it'll be slightly harder to do.)
I agree that it's odd that you've got repeated times. Could you post the code that's measuring it? I wouldn't be hugely surprised to see some sort of captured variable problem which is confusing your timings.
EDIT: Your timing code looks okay. That's very strange. I suggest you record the times at the web service as well, and see whether it looks the same. It's almost as if there's something deliberately throttling it.
When you run it, does it look like it's taking the amount of time it says it is - i.e. when it says it's taken 3 seconds, is that about 3 seconds after the last line was written?
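To illustrate, here's a minimal Stopwatch version of the loop from the question (same SimpleService proxy and CallSomeMethod; it needs a using System.Diagnostics; directive):
// Sketch: the same loop, timed with Stopwatch (high-resolution performance
// counter) instead of DateTime.Now.
Stopwatch total = Stopwatch.StartNew();
for (int i = 0; i < numberOfIterations; i++)
{
    Console.Write("Beginning run #" + (i + 1) + "...");
    Stopwatch sw = Stopwatch.StartNew();
    SimpleService ws = new SimpleService();
    DataSet ds = ws.CallSomeMethod();
    sw.Stop();
    Console.WriteLine("completed in " + sw.Elapsed.TotalMilliseconds + " ms");
}
total.Stop();
Console.WriteLine("Total time: " + total.Elapsed.TotalMilliseconds + " ms");
Console.WriteLine("Average time per request: " + (total.Elapsed.TotalMilliseconds / numberOfIterations) + " ms");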

Now you need to get some benchmark values for the other steps in the chain. Check the server logs to get the time your request hit the web server, and add some logging into the web service code to see when the web server hands off to the actual "working" code.
Once you've done that you can start to narrow down the performance of the slowest part, repeat as much as you like.
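For instance, a hedged sketch of what that server-side logging could look like inside the web method (RunStoredProcedure and Log are made-up names for illustration; use whatever data-access and logging code the service actually has):
// Illustration only - the real web method, data access and logging will differ.
// Requires using System.Data; using System.Diagnostics; using System.Web.Services;
[WebMethod]
public DataSet CallSomeMethod()
{
    Stopwatch sw = Stopwatch.StartNew();
    Log("Request received");                                // hypothetical logger

    DataSet result = RunStoredProcedure();                  // hypothetical data-access call
    Log("Stored procedure finished after " + sw.ElapsedMilliseconds + " ms");

    Log("Returning after " + sw.ElapsedMilliseconds + " ms total");
    return result;
}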

Could creating (and timing the creation of) the SimpleService be skewing your numbers?
What happens if you pull that out of the loop?
int totalRunTime = 0;
SimpleService ws = new SimpleService();
for (int i = 0; i < numberOfIterations; i++)
{
    Console.Write("Beginning run #" + (i + 1).ToString() + "...");
    DateTime start = DateTime.Now;
    DataSet ds = ws.CallSomeMethod();
    DateTime end = DateTime.Now;
    TimeSpan runTime = end - start;
    totalRunTime += (int)runTime.TotalMilliseconds;
    Console.Write("completed in " + runTime.TotalMilliseconds.ToString() + " ms\n");
}
Console.WriteLine("Total time: " + totalRunTime.ToString() + " ms");
Console.WriteLine("Average time per request: " + (totalRunTime / numberOfIterations).ToString() + " ms\n");

Related

JMeter - JSR223 get current time in milliseconds

I am trying to calculate SSE traffic latency in a simple load test using the following JSR223 Sampler:
EventHandler eventHandler = eventText -> {
    count++;
    // get the time from the server
    def result = eventText.substring(eventText.indexOf("data='") + 6, eventText.indexOf("', event")).trim() as Long;
    def currentTime = System.currentTimeMillis();
    def diff = currentTime - result;
    list.add(diff);
    resp = resp + "Time from server:" + result + ", JMeter time:" + currentTime + ", diff:" + diff + "\n";
};
SSEClient sseClient = SSEClient.builder().url(pURL).eventHandler(eventHandler).build();
sseClient.start();
sleep(SLEEP_TIME);
sseClient.shutdown();
The time from the server (NodeJS -JavaScript) is Date.now() and the time on JMeter is System.currentTimeMillis()
Both Server and JMeter are on the same computer.
It seems that the two time sources are not aligned, as in some cases the JMeter time is earlier than the server time.
So I cannot trust the results...
Any other methods I should use on the JavaScript side or the JMeter side?
You cannot trust the results in any case, because having the system under test and the load generator on the same machine is not the best idea: you won't get reliable results due to race conditions. Moreover, it will be much harder to analyze the bottlenecks, even with the PerfMon Plugin.
Also, as per the System.currentTimeMillis() function JavaDoc:
Returns the current time in milliseconds. Note that while the unit of time of the return value is a millisecond, the granularity of the value depends on the underlying operating system and may be larger. For example, many operating systems measure time in units of tens of milliseconds.
So if you want to measure the time difference between the previous and the next SSE, you can consider using System.nanoTime().
However, it's better to move JMeter to another machine, preferably a Linux one, as the precision of the System.currentTimeMillis() function there is much higher.

Need help with "Invalid outside procedure" error

I am trying to process information with a dead man's program. Every time I try to run it I get "Compile error: Invalid outside procedure". I've never messed with VB6 before. I've been searching for a solution, but I only get answers saying the code needs to be in a Sub or something, and those threads are closed and I haven't been able to get their solutions to work for me. http://pastebin.com/vR7A7iN5
I think the problem is in this specific section of the code but I am unsure how to place it into a sub or get it to work otherwise.
End Type
Dim recout As statrec
Open "\STAT\PP\QM1409\MGA013A\" For Random As #1 Len = 150
num.recs = LOF(1) / 150
Print num.recs
For Count = 1 To num.recs
Get #1, Count, recout
If recout.mga = "013" Then
It is a big reach to assume this is the only incompatibility, but start by wrapping the code you're talking about inside a procedure, as Plutonix mentioned. That IS definitely an error. However, I would not be surprised if there are more problems.
Example:
Public Sub GetStatRecs()
    Dim recout As statrec
    Open "\STAT\PP\QM1409\MGA013A\" For Random As #1 Len = 150
    num.recs = LOF(1) / 150
    Print num.recs
    For Count = 1 To num.recs
        Get #1, Count, recout
        ....
    Next Count
    ....
End Sub

Rate exceeded in workflow_execution polling

I am currently trying to modify a plugin for posting metrics to New Relic via AWS. I have successfully managed to make the plugin post metrics from SWF to New Relic (not originally in the plugin), but have encountered a problem when the program runs for too long.
When the program runs for about 10 minutes I get the following error:
Error occurred in poll cycle: Rate exceeded
I believe this is coming from my polling SWF for the workflow executions:
domain.workflow_executions.each do |execution|
  starttime = execution.started_at
  endtime = execution.closed_at
  isOpen = execution.open?
  status = execution.status
  if endtime != nil
    running_workflow_runtime_total += (endtime - starttime)
    number_of_completed_executions += 1
  end
  if status.to_s == "open"
    openCount = openCount + 1
  elsif status.to_s == "completed"
    completedCount = completedCount + 1
  elsif status.to_s == "failed"
    failedCount = failedCount + 1
  elsif status.to_s == "timed_out"
    timed_outCount = timed_outCount + 1
  end
end
This is called in a polling cycle every 60 seconds.
Is there a way to set the polling rate? Or another way to get the workflow executions?
Thanks. Here's a link to the Ruby SDK for SWF => link
The issue is likely that you are creating a large number of workflow executions and each iteration through the loop in workflow_executions is causing a lookup, which eventually is exceeding your rate limit.
This could also be getting a bit expensive, so be careful.
It's not clear what you're really trying to do, so I can't tell you how to fix it unless you post all your code (or the parts around calls to SWF).
You can see here:
https://github.com/aws/aws-sdk-ruby/blob/05d15cd1b6037e98f2db45f8c2597014ee376a59/lib/aws/simple_workflow/workflow_execution_collection.rb
that a call is made to SWF for each workflow in the collection.

Is measuring js execution time a way to tell how quickly the app is responding to requests?

I have something like a microtime() function at the very start of my node.js / express app.
function microtime(get_as_float) {
    // Returns either a string or a float containing the current time in seconds and microseconds
    //
    // version: 1109.2015
    // discuss at: http://phpjs.org/functions/microtime
    // + original by: Paulo Freitas
    // * example 1: timeStamp = microtime(true);
    // * results 1: timeStamp > 1000000000 && timeStamp < 2000000000
    var now = new Date().getTime() / 1000;
    var s = parseInt(now, 10);
    return (get_as_float) ? now : (Math.round((now - s) * 1000) / 1000) + ' ' + s;
}
The code of the actual app looks something like this:
application.post('/', function (request, response) {
    var t1 = microtime(true);
    //code
    //code
    response.send(something);
    console.log("Time elapsed: " + (microtime(true) - t1));
});
Time elapsed: 0.00599980354309082
My question is, does this mean that the time from when a POST request hits the server to when a response is sent out is, give or take, ~0.005s?
I've measured it client-side but my internet is pretty slow so I think there's some lag that has nothing to do with the application itself. What's a quick and easy way to check how quickly the requests are being processed?
Shameless plug here. I've written an agent that tracks the time usage for every Express request.
http://blog.notifymode.com/blog/2012/07/17/profiling-express-web-framwork-with-notifymode/
In fact, when I first started writing the agent, I took the same approach. But I soon realized that it is not accurate. My implementation tracks the time difference between the request and the response by substituting the Express router, which allowed me to add tracker functions. Feel free to give it a try.

Azure Storage Queue very slow from a worker role in the cloud, but not from my machine

I'm doing a very simple test with queues pointing to the real Azure Storage and, I don't know why, executing the test from my computer is quite a bit faster than deploying the worker role to Azure and executing it there. I'm not using Dev Storage when I test locally; my .cscfg has the connection string to the real storage.
The storage account and the roles are in the same affinity group.
The test is a web role and a worker role. The page tells the worker what test to do, the worker does it and returns the time consumed. This specific test measures how long it takes to get 1000 messages from an Azure queue using batches of 32 messages. First I test running in debug with VS, then I deploy the app to Azure and run it from there.
The results are:
From my computer: 34805.6495 ms.
From Azure role: 7956828.2851 ms.
That would mean it is faster to access queues from outside Azure than from inside, and that doesn't make sense.
I'm testing like this:
private TestResult InQueueScopeDo(String test, Guid id, Int64 itemCount)
{
    CloudStorageAccount account = CloudStorageAccount.Parse(_connectionString);
    CloudQueueClient client = account.CreateCloudQueueClient();
    CloudQueue queue = client.GetQueueReference(Guid.NewGuid().ToString());
    try
    {
        queue.Create();
        PreTestExecute(itemCount, queue);
        List<Int64> times = new List<Int64>();
        Stopwatch sw = new Stopwatch();
        for (Int64 i = 0; i < itemCount; i++)
        {
            sw.Start();
            Boolean valid = ItemTest(i, itemCount, queue);
            sw.Stop();
            if (valid)
                times.Add(sw.ElapsedTicks);
            sw.Reset();
        }
        return new TestResult(id, test + " with " + itemCount.ToString() + " elements",
            TimeSpan.FromTicks(times.Min()).TotalMilliseconds,
            TimeSpan.FromTicks(times.Max()).TotalMilliseconds,
            TimeSpan.FromTicks((Int64)Math.Round(times.Average())).TotalMilliseconds);
    }
    finally
    {
        queue.Delete();
    }
    return null;
}
The PreTestExecute puts the 1000 items on the queue with 2048 bytes each.
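(PreTestExecute itself isn't shown; a rough sketch of what it does, based on that description rather than the actual code, would be:)
// Rough sketch based on the description above, not the actual PreTestExecute:
// enqueue itemCount messages, each carrying roughly 2048 bytes of payload.
private void PreTestExecute(Int64 itemCount, CloudQueue queue)
{
    String payload = new String('x', 2048);
    for (Int64 i = 0; i < itemCount; i++)
    {
        queue.AddMessage(new CloudQueueMessage(payload));
    }
}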
And this is what happens in the ItemTest method for this test:
Boolean done = false;

public override bool ItemTest(long itemCurrent, long itemCount, CloudQueue queue)
{
    if (done)
        return false;
    CloudQueueMessage[] messages = null;
    while ((messages = queue.GetMessages((Int32)itemCount).ToArray()).Any())
    {
        foreach (var m in messages)
            queue.DeleteMessage(m);
    }
    done = true;
    return true;
}
I don't know what I'm doing wrong; same code, same connection string, and I get these results.
Any idea?
UPDATE:
The problem seems to be in the way I calculate it.
I have replaced times.Add(sw.ElapsedTicks); with times.Add(sw.ElapsedMilliseconds); and this block:
return new TestResult(id, test + " with " + itemCount.ToString() + " elements",
    TimeSpan.FromTicks(times.Min()).TotalMilliseconds,
    TimeSpan.FromTicks(times.Max()).TotalMilliseconds,
    TimeSpan.FromTicks((Int64)Math.Round(times.Average())).TotalMilliseconds);
for this one:
return new TestResult(id, test + " with " + itemCount.ToString() + " elements",
    times.Min(), times.Max(), times.Average());
And now the results are similar, so apparently there is a difference in how the precision is handled or something. I will research this later on.
The problem apparently was an issue with the different nature of Stopwatch ticks and TimeSpan ticks, as discussed here.
Stopwatch.ElapsedTicks Property
Stopwatch ticks are different from DateTime.Ticks. Each tick in the DateTime.Ticks value represents one 100-nanosecond interval. Each tick in the ElapsedTicks value represents the time interval equal to 1 second divided by the Frequency.
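In other words, raw ElapsedTicks has to be converted via Stopwatch.Frequency (or you simply use Elapsed / ElapsedMilliseconds, which already do that conversion); a quick sketch:
// Stopwatch ticks are 1/Stopwatch.Frequency seconds each, while TimeSpan/DateTime
// ticks are fixed 100 ns units, so passing ElapsedTicks to TimeSpan.FromTicks()
// silently mixes up the units.
Stopwatch sw = Stopwatch.StartNew();
// ... timed work ...
sw.Stop();

double wrongMs     = TimeSpan.FromTicks(sw.ElapsedTicks).TotalMilliseconds; // unit mismatch
double rightMs     = sw.ElapsedTicks * 1000.0 / Stopwatch.Frequency;        // explicit conversion
double alsoRightMs = sw.Elapsed.TotalMilliseconds;                          // simplest: let Stopwatch convert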
How is your CPU utilization? Is it possible that your code is spiking the CPU and your workstation is just much faster than your Azure node?
