Using WebSockets to connect to STT, and it's working well for the most part.
When streaming a half-hour newscast to Watson STT I find the response time to
be 1 to 2 seconds on average. Periodically I experience much longer delays:
8 seconds, 10 seconds. Sometimes things get really backed up and the delay can
be as much as 60 seconds or more! Has anyone else experienced this behavior?
Does anybody have a suggestion on how to overcome this problem?
Thanks!
The event you are reporting is not something that we at IBM Watson are aware of. Our customers are typically very happy with the latency of the Speech to Text service. Can you please contact technical support through this link so we can get to the bottom of this? https://support.eu-gb.bluemix.net/gethelp/
By the way, if you can provide the transaction ID of one of the transactions in which you experienced long delays, that would help very much with the troubleshooting.
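Since support asked for transaction IDs, it helps to timestamp each request/response pair on the client side so the slow ones can be identified and quoted in the ticket. A minimal sketch of that bookkeeping (the record format and the 5-second threshold are my assumptions, not anything defined by the Watson API):

```python
def flag_slow_transactions(records, threshold_s=5.0):
    """records: iterable of (transaction_id, sent_ts, received_ts) tuples,
    with timestamps in seconds.  Returns the transactions whose round-trip
    delay meets the threshold, worst offenders first."""
    slow = []
    for txn_id, sent, received in records:
        delay = received - sent
        if delay >= threshold_s:
            slow.append((txn_id, delay))
    # Sort descending by delay so the worst cases top the support ticket.
    return sorted(slow, key=lambda pair: pair[1], reverse=True)

samples = [
    ("txn-001", 0.0, 1.2),    # normal: 1.2 s
    ("txn-002", 10.0, 18.5),  # 8.5 s delay
    ("txn-003", 20.0, 81.0),  # 61 s delay
]
worst = flag_slow_transactions(samples)
```

Running this over a full half-hour stream would surface exactly which transactions to hand over for troubleshooting.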
Has anyone else experienced this behavior?
Yes.
Does anybody have a suggestion on how to overcome this problem? Thanks!
Use a self-hosted Kaldi-based transcriber.
Recently my system has had a problem like this: suddenly the integration latency from API Gateway is higher than usual, especially at night.
Checking CloudWatch Logs Insights, I saw this happening.
Somehow, for some requests, it took 7 s to finish the integration. At first I thought the problem was in the Lambda, and I checked that too, but it seems it's not.
The Lambda took only a little over 1 s to finish executing (slightly higher than normal), and it also started right after the integration completed.
Has anyone solved this kind of problem before? Can you please give me some advice?
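One way to narrow this down is to subtract the Lambda's own duration from API Gateway's reported integration latency for each request: a large remainder means the time is going to cold-start initialization, throttling, or network hops rather than to your function. A sketch of that comparison (the field names are my assumptions, loosely modeled on CloudWatch fields, not an exact Logs Insights schema):

```python
def integration_overhead(entries):
    """For each request, overhead_ms is the time API Gateway spent on the
    integration beyond the Lambda's measured execution time."""
    return [
        {"requestId": e["requestId"],
         "overhead_ms": e["integrationLatency_ms"] - e["lambdaDuration_ms"]}
        for e in entries
    ]

entries = [
    {"requestId": "a1", "integrationLatency_ms": 7000, "lambdaDuration_ms": 1100},
    {"requestId": "b2", "integrationLatency_ms": 1200, "lambdaDuration_ms": 1000},
]
report = integration_overhead(entries)
# Request a1 spent ~5.9 s outside the Lambda execution itself.
```

If the overhead clusters on requests that hit a freshly started instance, cold starts are the likely culprit; if it is spread evenly, look at the integration configuration instead.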
We have this architecture with two Node processes.
One polls a private API and pushes any changes to the second node.
The second node processes the data, calls a bunch of other APIs, and eventually emits a change event to the client, an HTML5 website, via socket.io.
This second node will always process the data and will always emit changes even if no clients are connected, so in my opinion CPU or memory usage is not greatly affected by the number of connected clients. Also note that this architecture is still running on a private staging environment.
Everything ran fine and we were ready to go live, until we noticed after a couple of days, maybe a week, that the second node suddenly gets extremely slow while the first node is still fine.
It gets so bad that even the connection between the two nodes times out, even though they run on the same machine and talk over localhost. It also takes more than 10 seconds to browse to the socket.io/socket.io.js file.
I know it's very hard to understand the problem without seeing any code, but I'm kind of pulling my hair out because we have to go live in a couple of days, my logs are not revealing anything, and Google isn't helping either.
Have you ever experienced anything like this? What was the problem and how did you fix it?
What's a good monitor and profiler for Node.js? (preferably free)
What are good practices for building a Node.js app that makes a lot of outgoing API calls?
Anything or anyone that could help me in the right direction of solving or even discovering the actual problem will be greatly appreciated!
Thank you!
I've never experienced anything like this, but maybe the second node is blocking the event loop by doing CPU-intensive work or waiting for some resource synchronously.
Add some logging in your code to see how much time the second node takes to process each change pushed by the first node. Maybe some type of change takes 10 seconds or so of CPU time to complete.
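The app in question is Node, but the timing pattern itself is language-agnostic; here is the idea sketched in Python (the handler and the change format are made up for illustration):

```python
import time

def timed(handler, log):
    """Wrap a change handler so every invocation's duration is recorded.
    Scanning the log afterwards shows which change types are slow."""
    def wrapper(change):
        start = time.perf_counter()
        try:
            return handler(change)
        finally:
            # Record (change type, elapsed seconds) even if the handler raises.
            log.append((change.get("type", "?"), time.perf_counter() - start))
    return wrapper

# Hypothetical usage: wrap whatever function the second node runs per change.
timings = []
process = timed(lambda change: change["payload"].upper(), timings)
process({"type": "price", "payload": "abc"})
process({"type": "news", "payload": "xyz"})
```

Sorting `timings` by duration after a day of traffic would point straight at the change type that stalls the loop.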
You should also start monitoring memory, CPU, and network connections. When things slow down, your monitoring will provide some clue as to where the bottleneck is.
For monitoring you can try the following three tools:
nodetime
hummingbird
node-monitor
Also read http://nodetime.com/blog/monitoring-nodejs-application-performance
It sounds like you have a memory leak somewhere in the second node, maybe from creating too many closures or anonymous functions. Do you notice your RAM usage steadily creeping up as it runs?
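One cheap way to confirm that suspicion is to sample the process's memory at a fixed interval (in Node, `process.memoryUsage().heapUsed`) and check whether the series keeps climbing over hours. A toy heuristic over such samples, sketched in Python (the 80% rising-steps and 1.5x growth thresholds are arbitrary choices of mine):

```python
def looks_like_leak(samples, min_growth_ratio=1.5):
    """Heuristic: memory that rises between most consecutive samples AND
    ends well above where it started suggests a leak rather than normal
    allocation churn.  samples: memory readings in any consistent unit."""
    if len(samples) < 2:
        return False
    rising_steps = sum(b >= a for a, b in zip(samples, samples[1:]))
    return (rising_steps / (len(samples) - 1) >= 0.8
            and samples[-1] / samples[0] >= min_growth_ratio)
```

For example, a series like `[100, 120, 150, 200, 300]` MB trips the heuristic, while `[100, 105, 100, 102, 101]` does not; a heap dump comparison at two points in time would then confirm what is being retained.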
I am using a modified version of the TaskCloud example to try to read/write my own data.
While testing a deployed version, I've noticed that the round-trip response time is slow.
From my Android device, I get a 100 ms ping response from appspot.com.
I have changed the App Engine application to do nothing (the Google dashboard shows insignificant average latency).
The problem is that the call to HttpClient's client.execute(post) takes about 3 seconds.
(This is the time when an instance is already loaded.)
Any suggestions would be greatly appreciated.
EDIT: I've watched the Google I/O video showing the CloudTasks Android/App Engine app, and you can see that refreshing the list (a single call to App Engine) takes about 3 seconds as well. The presenter says something about performance that I didn't fully get (debuggers are running at both ends?).
The video: http://www.youtube.com/watch?v=M7SxNNC429U&feature=related
Time location: 0:46:45
I'll keep investigating...
Thanks for your help so far.
EDIT 2: Back to this issue...
I've used the Shark packet sniffer to find out what is happening. Some of the time is spent negotiating an SSL connection for each server call. Using http (and ACSID) is faster than https (and SACSID).
A new DefaultHttpClient() and a new HttpPost() are created for each server call, so every call pays for a fresh TCP connection and SSL handshake instead of reusing an existing one.
EDIT 3:
Looking at the sniffer logs again, there is an almost 2-second delay before the actual POST.
I have also found that the issue exists on Android 2.2 (all versions) but is resolved on Android 2.3.
EDIT 4: It's been resolved. Please see my answer below.
It's difficult to answer your question, since no details about your app are provided. Anyway, you can try the Appstats tool provided by Google to analyze the bottleneck.
After using the Shark sniffer, I was able to understand the exact issue, and I found the answer in this question.
I used Liudvikas Bukys's comment and solved the problem with the suggested line, which disables the Expect: 100-continue handshake so the client no longer waits for the server's interim response before sending the POST body:
post.getParams().setBooleanParameter(CoreProtocolPNames.USE_EXPECT_CONTINUE, false);
Often the first call to your GAE app will take longer than subsequent calls. You should familiarize yourself with loading and warm-up requests and with how GAE handles instances of your app: http://code.google.com/intl/de-DE/appengine/docs/adminconsole/instances.html
Some things you could also try:
make your app handle more than one request per instance (make sure your app is thread-safe!) http://code.google.com/intl/de-DE/appengine/docs/java/config/appconfig.html#Using_Concurrent_Requests
enable the Always On feature in the app admin console (this will cost you)
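For a Java app on the App Engine of that era, the concurrent-requests setting mentioned above was a one-line flag in appengine-web.xml:

```xml
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <!-- Allow one instance to serve multiple requests at once;
       only safe if the servlet code is actually thread-safe. -->
  <threadsafe>true</threadsafe>
</appengine-web-app>
```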
In View Currently Executing Requests in IIS7, I can see some requests stuck in RequestAcquireState with a really high Time Elapsed, e.g. 7 seconds.
If I understand this correctly, it means that those 7 seconds were spent somewhere in the states below:
BEGIN_REQUEST
AUTHENTICATE_REQUEST
AUTHORIZE_REQUEST
RESOLVE_REQUEST_CACHE
MAP_REQUEST_HANDLER
ACQUIRE_REQUEST_STATE
We are using the ASP.NET session state server, which is accessed over the network by 3 web servers.
I've heard this can lead to locking issues: ASP.NET takes an exclusive lock on the session for each request that can write session state, so concurrent requests sharing a session ID queue up while acquiring state.
Does anybody have any idea how to diagnose this further?
Thanks,
Piotr
Has anyone done any sort of performance testing against MSMQ?
We have a solution in a prod environment where errors are added to an MSMQ queue for distribution to databases or event monitors.
We need to test the capacity of this system, but we're not sure how to start.
Does anyone know any tools, or have any tips?
Try overloading it with a test program and see where it balks/fails
(analogous to "destructive testing" in materials engineering).
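Before writing the real harness (in .NET that would mean sending via System.Messaging.MessageQueue), it is worth modeling the numbers: a queue "balks" once the send rate exceeds the drain rate, at which point the backlog grows linearly. A toy model of that, with made-up rates:

```python
def backlog_over_time(send_rate, drain_rate, seconds):
    """Per-second backlog of a queue fed at send_rate and drained at
    drain_rate (messages/second).  Sustained growth in the returned
    series marks the breaking point a destructive test is looking for."""
    backlog, history = 0, []
    for _ in range(seconds):
        # Backlog can shrink when drain outpaces send, but never below zero.
        backlog = max(0, backlog + send_rate - drain_rate)
        history.append(backlog)
    return history

# At 150 msg/s in vs 100 msg/s out, the backlog grows by 50 every second.
growth = backlog_over_time(150, 100, 3)
```

In the real test you would ramp the send rate in steps, watch the queue's journal and message count, and record the rate at which the backlog first starts climbing.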
QueueExplorer has a "Mass send" option which can be used to send a bunch of messages to a queue, with or without a delay between them. I know it's not a fully automated stress test, but running it from a few instances or even a few machines could generate a significant stress load.
Disclaimer: I'm author of QueueExplorer.
Yeah, I was considering that; I was hoping for a more ready-made, public tool.