MongoDB-Java performance with rebuilt Sync driver vs Async

I have been testing MongoDB 2.6.7 for the last couple of months using YCSB 0.1.4. I have captured good data comparing SSD to HDD and am producing engineering reports.
After my testing was completed, I wanted to explore the allanbank async driver. When I got it up and running (I am not a developer, so it was a challenge for me), I first wanted to try the rebuilt sync driver. I found performance improvements of 30-100%, depending on the workload, and was very happy with it.
Next, I tried the async driver. I was not able to see much difference between it and my results with the native driver.
The command I'm running is:
./bin/ycsb run mongodb -s -P workloads/workloadb -p mongodb.url=mongodb://192.168.0.13:27017/ycsb -p mongodb.writeConcern=strict -threads 96
Over the course of my testing (mostly with the native driver), I have experimented with more and fewer threads than 96; turned on "noatime"; tried both xfs and ext4; disabled hyperthreading; disabled half of my 12 cores; put the journal on a different drive; changed sync from 60 seconds to 1 second; and checked the network bandwidth between the client and server to ensure it's not oversubscribed (10GbE).
Any feedback or suggestions welcome.

The Async move exceeded my expectations. My experience is with the Python sync driver (pymongo) and the async driver (motor), and the async driver achieved greater than 10x the throughput. Further, motor still uses pymongo under the hood but adds the async ability. That could easily be the case with your allanbank driver.
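To make that concrete, here is a minimal sketch of "pymongo under the hood, plus async" (assuming recent pymongo/motor versions; the URL, database, and collection names are placeholders, not from your setup):

import asyncio
from pymongo import MongoClient
from motor.motor_asyncio import AsyncIOMotorClient

# Sync: each call blocks the calling thread until the server replies.
sync_client = MongoClient("mongodb://localhost:27017")
sync_client.ycsb.usertable.insert_one({"_id": "u1", "field0": "x"})

# Async: motor drives the same pymongo machinery from an event loop,
# so one thread can keep many operations in flight.
async def main():
    client = AsyncIOMotorClient("mongodb://localhost:27017")
    await client.ycsb.usertable.insert_one({"_id": "u2", "field0": "x"})

asyncio.run(main())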
Often the dramatic changes come from threading policies and OS configurations.
Async needn't and shouldn't use any more threads than there are cores on the VM or machine. For example, if your server code is spawning a new thread per incoming connection, then all bets are off. Start by looking at the way the driver is being utilized. A 4-core machine should use at most 4 incoming threads.
On the OS level, you may have to fine-tune parameters like net.core.somaxconn, net.core.netdev_max_backlog, fs.file-max, and nofile in /etc/security/limits.conf. The best place to start is nginx-related performance guides; nginx is the server that spearheaded, or at least caught the attention of, many Linux sysadmin enthusiasts. Contrary to popular lore, you should shorten your keepalive timeout rather than lengthen it. The default keep-alive timeout is some absurd number of seconds (4 hours); you might want to cut the cord at 1 minute. Basically, think of a short, sweet relationship with your client connections.
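Before tuning, it's worth checking what you currently have. A quick sketch (Linux only; these are the standard procfs paths for the sysctls named above):

# Print the current values of the sysctls mentioned above (Linux only).
from pathlib import Path

for path in (
    "/proc/sys/net/core/somaxconn",
    "/proc/sys/net/core/netdev_max_backlog",
    "/proc/sys/fs/file-max",
):
    print(path, "=", Path(path).read_text().strip())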
Bear in mind that Mongo itself is not async, so you should use a Mongo driver connection pool. Nevertheless, don't let the driver get stalled on slow queries: cut them off after 5 to 10 seconds using the Java equivalents of the following pymongo settings. I'm just cutting and pasting here, with no specific recommendations.
# Specifies a time limit for a query operation. If the specified time is exceeded, the operation will be aborted and ExecutionTimeout is raised. If max_time_ms is None, no limit is applied.
# Raises TypeError if max_time_ms is not an integer or None. Raises InvalidOperation if this Cursor has already been used.
CONN_MAX_TIME_MS = None
# socketTimeoutMS: (integer) How long (in milliseconds) a send or receive on a socket can take before timing out. Defaults to None (no timeout).
CLIENT_SOCKET_TIMEOUT_MS = None
# connectTimeoutMS: (integer) How long (in milliseconds) a connection can take to be opened before timing out. Defaults to 20000.
CLIENT_CONNECT_TIMEOUT_MS = 20000
# waitQueueTimeoutMS: (integer) How long (in milliseconds) a thread will wait for a socket from the pool if the pool has no free sockets. Defaults to None (no timeout).
CLIENT_WAIT_QUEUE_TIMEOUT_MS = None
# waitQueueMultiple: (integer) Multiplied by max_pool_size to give the number of threads allowed to wait for a socket at one time. Defaults to None (no waiters).
CLIENT_WAIT_QUEUE_MULTIPLY = None
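Wired together on the client, the pymongo version looks roughly like this (a sketch only, assuming pymongo 3.x-era option names; the 5-second query cutoff is just the suggestion above, and "usertable" is YCSB's default collection name):

from pymongo import MongoClient

# Client-level timeouts from the settings above (values in milliseconds).
client = MongoClient(
    "mongodb://192.168.0.13:27017/",
    socketTimeoutMS=None,      # no send/receive timeout
    connectTimeoutMS=20000,    # 20 s to open a connection
    waitQueueTimeoutMS=None,   # wait indefinitely for a pooled socket
)

# Per-query cutoff: tell the server to abort the query after 5 s.
docs = client.ycsb.usertable.find({}).max_time_ms(5000)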
Hopefully you will have the same success. I was ready to bail on Python prior to async.

Related

Why does ZeroMQ not receive a string when it becomes too large on a PUSH/PULL MT4 - Python setup?

I have an EA set in place that loops history trades and builds one large string with trade information. I then send this string every second from MT4 to the python backend using a plain PUSH/PULL pattern.
For whatever reason, the data isn't received on the pull side when the string transferred becomes too long. The backend PULL-socket slices each string and further processes it.
Any chance that the PULL-side is too slow to grab and process all the data which then causes an overflow (so that a delay arises due to the processing part)?
Talking about data sizes, we are well below 5 kB per second.
This is the PULL-socket, which manipulates the data after receiving it:
import zmq
from time import sleep

# zmq_socket is assumed to be a zmq.PULL socket with a receive timeout
# (RCVTIMEO) set; its setup is not shown in the original post.

while True:
    # check 24/7 for available data in the pull socket
    try:
        msg = zmq_socket.recv_string()
        data = msg.split("|")
        print(data)
        # if data is available and msg is account info, handle as follows
        if data[0] == "account_info":
            [...]
    except zmq.error.Again:
        print("\nResource timeout.. please try again.")
        sleep(0.000001)
I am a bit curious now, since the PULL socket seems unable to process even a string containing 40 trades with their corresponding information on a single MT4 client - Python connection. I actually planned to set it up to handle more than 5,000 MT4 client - Python backend connections at once.
Q : Any chance that the pull side is too slow to grab and process all the data which then causes an overflow (so that a delay arises due to the processing part)?
Zero chance.
Sending 640 B each second is definitely no showstopper (5 kB per second is nowhere near a performance ceiling...).
The posted problem formulation is otherwise undecidable.
Step 1) Prove with POSACK/NACK whether the PUSH side accepts the payload for sending error-free.
Step 2) Prove the PULL side is not to blame: run [PUSH.send(640*chr(64+i)) for i in range(10)] via a python-to-python tcp:// transport-class solo channel crossing a host-to-host hop, over at least your local physical network (no VMCI/emulated vLAN, no other localhost colocation); see the sketch after these steps.
Step 3) If both steps above got POSACK-ed, your next chances are the ZeroMQ configuration space and/or a PUSH-side incompatibility on the MT4 side, most probably "hidden" inside a (not mentioned) third-party ZeroMQ wrapper, or first-party issues with string handling/processing, which has been observed and mentioned many times in past posts about this trouble with well-"hidden" MQL4 internal ecosystem changes.
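A minimal python-to-python version of the Step 2 crosscheck could look like this (a sketch; the port is arbitrary, and for a true host-to-host hop you run the two halves on two machines, pointing the PUSH side at the PULL host's address):

import zmq

ctx = zmq.Context()

# PULL side (run this half on host A).
pull = ctx.socket(zmq.PULL)
pull.bind("tcp://*:5556")

# PUSH side (run this half on host B; 127.0.0.1 is a placeholder).
push = ctx.socket(zmq.PUSH)
push.connect("tcp://127.0.0.1:5556")

# Step 2's payload: ten 640 B ASCII strings.
for i in range(10):
    push.send_string(640 * chr(64 + i))

for _ in range(10):
    msg = pull.recv_string()
    print("received", len(msg), "bytes, starting with", msg[:4])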
Anyway, stay tuned. ZeroMQ is a sure bet and a true workhorse for professional, low-latency designs in the distributed-systems domain.

WinAPI - Why does ICMPSendEcho2Ex report false timeouts when Timeout is set below 1000ms?

Edit: I started by asking this as a PowerShell / .Net question and couldn't find any reference to it on the internet. With feedback, it appears to be a WINAPI issue, so this is an edit/rewrite/retag, but many of the tests and background reference .Net for that reason.
Summary
The WINAPI ping function IcmpSendEcho2 appears to have a timing bug if the ping timeout parameter is set below 1000ms (1 second). This causes it to return intermittent false timeout errors. Rather than a proportional "lower timeout = more fails" behaviour, it appears to be a hard cutoff: >= 1000ms gives expected behaviour, <= 999ms triggers false timeouts, often in an alternating success/fail/success/fail pattern.
I call them false timeouts partly because I have a WireShark packet capture showing a reply packet coming back well within the timeout, partly because the 1ms change shouldn't be a significant amount of time when the replies normally have 500-800ms of headroom, and partly because I can run two concurrent sets of pings with different timeouts and see different behaviour between the two.
In the comments of my original .Net question, @wOxxOm has:
located the open-source .Net code where System.Net.NetworkInformation.Ping() wraps the WinAPI; there appears to be no specific handling of timeouts there, and it's passed down to the lower layer directly - possibly line 675, with a call to UnsafeNetInfoNativeMethods.IcmpSendEcho2()
and @Lieven Keersmaekers has investigated and found things beyond my skill level to interpret:
"I can second this being an underlying WINAPI problem. Following a success and timedout call into IPHLPAPI!IcmpSendEcho2Ex: the 000003e7 parameter is on the stack, both set up an event and both return into IPHLPAPI!IcmpEchoRequestComplete with the difference of the success call's eax register containing 00000000 and the timedout call's eax register containing 00000102 (WAIT_TIMEOUT)
"Compiling a 64bit C# version, there's no more calls into IPHLPAPI. A consistent thing that shows up is clr.dll GetLastError() returning WSA_QOS_ADMISSION_FAILURE for timeouts. Also consistent in my sample is the order of thread executions between a success and a timeout call being slightly different."
This StackOverflow question hints that the WSA_QOS_ADMISSION_FAILURE might be a misleading error, and is actually IP_REQ_TIMED_OUT.
Testing steps:
Pick a distant host and set some continuous pings running. (The IP in my examples belongs to Baidu.cn (China) and has ping replies of ~310ms to me.) Expected behaviour for all of them: almost entirely ping replies, with occasional drops due to network conditions.
PowerShell / .Net, with 999ms timeout, actual result is bizarre reply/drop/reply/drop patterns, far more drops than expected:
$Pinger = New-Object -TypeName System.Net.NetworkInformation.Ping
while($true) {
    $Pinger.Send('111.13.101.208', 999)
    Start-Sleep -Seconds 1
}
command prompt ping.exe with 999ms timeout, actual result is more reliable (edit: but later findings call this into question as well):
ping 111.13.101.208 -t -w 999
PowerShell / .Net, with 1000ms timeout, actual result is as expected:
$Pinger = New-Object -TypeName System.Net.NetworkInformation.Ping
while($true) {
    $Pinger.Send('111.13.101.208', 1000)
    Start-Sleep -Seconds 1
}
It's repeatable with C# as well, but I've edited that code out now it seems to be a WinAPI problem.
Example screenshot of these running side by side
On the left, .Net with 999ms timeout and 50% failure.
Center, command prompt, almost all replies.
On the right, .Net with 1000ms timeout, almost all replies.
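For anyone who wants to reproduce the comparison outside PowerShell, here is a hedged Python sketch that shells out to ping.exe (which, per the blog post linked in the edit below, is affected too); the host is the same Baidu IP as above, and treating exit code 0 as "reply received" is a simplification:

import subprocess

HOST = "111.13.101.208"  # same test host as above

def ping_once(timeout_ms):
    # ping.exe flags: -n 1 = one echo request, -w = timeout in ms.
    # Exit code 0 roughly means a reply arrived (simplified: some
    # unreachable replies also exit 0; a real test should parse output).
    result = subprocess.run(
        ["ping", "-n", "1", "-w", str(timeout_ms), HOST],
        capture_output=True,
    )
    return result.returncode == 0

for timeout_ms in (999, 1000):
    replies = sum(ping_once(timeout_ms) for _ in range(20))
    print(f"timeout={timeout_ms} ms: {replies}/20 replies")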
Earlier investigations / Background
I started with a 500ms timeout, and the quantity of false timeouts seems to vary depending on the ping reply time of the remote host:
pinging something 30ms away reports TimedOut around 1 in 10 pings
pinging something 100ms away reports TimedOut around 1 in 4 pings
pinging something 300ms away reports TimedOut around 1 in 2 pings
All from the same computer, on the same internet connection, sending the same amount of data (32-byte buffer) with the same 500ms timeout setting, and with no other heavy bandwidth use. I run no antivirus networking filter outside Windows 10 defaults, and two other people I know have confirmed this frequent TimedOut behaviour on their computers (now two more have in the comments), with more or fewer timeouts, so it ought not to be my network card, drivers, or ISP.
WireShark packet capture of ping reply which is falsely reported as a timeout
I ran the ping by hand four times to a ~100ms away host, with a 500ms timeout, with WireShark capturing network traffic. PowerShell screenshot:
WireShark screenshot:
Note that the WireShark log records 4 requests leaving, 4 replies coming back, each with a time difference of around 0.11s (110 ms) - all well inside the timeout, but PowerShell wrongly reports the last one as a timeout.
Related questions
Googling shows me heaps of issues with System.Net.NetworkInformation.Ping but none which look the same, e.g.
System.Net.NetworkInformation.Ping crashing - it crashes if allocated/destroyed in a loop in .Net 3.5 because its internals get wrongly garbage collected. (.Net 4 here and not allocating in a loop)
Blue screen when using Ping - 6+ years of ping being able to BSOD Windows (not debugging an Async ping here)
https://github.com/dotnet/corefx/issues/15989 - it doesn't time out if you set a timeout of 1ms and a reply comes back in 20ms; it still succeeds. A false positive, but not a false negative.
The documentation for Ping() warns that a low timeout might still report success, but I can't see that it warns the timeout might falsely report failure if set below 1 second.
Edit: Now that I'm searching for ICMPSendEcho2, I have found exactly this problem documented before, in a May 2015 blog post: https://www.frameflow.com/ping-utility-flaw-in-windows-api-creating-false-timeouts/ - they find the same behaviour but offer no further explanation. They say that ping.exe is affected, whereas I originally thought it wasn't. They also say:
"In our tests we could not reproduce it on Windows Server 2003 R2 nor on Windows Server 2008 R2. However it was seen consistently in Windows Server 2012 R2 and in the latest builds of Windows 10."
Why? What's wrong with the timeout handling that makes it completely ignore ping responses coming into the network stack? Why is 1000ms a significant cutoff?

How to configure Redis connections with Rails 4, Puma and Sidekiq?

I am using Sidekiq (on Heroku with Puma) to send emails asynchronously and would like to use Redis to keep counters and cache models.
RedisCloud's free plan includes 30 connections to Redis. It is not clear to me how to manage:
redis connections used by Sidekiq
redis connections used in models (caching and counters)
Sidekiq Client size is configured like this:
Sidekiq.configure_client do |config|
  config.redis = {url: ENV["REDISCLOUD_URL"], size: 3}
end
If I understood this correctly, Puma forks multiple processes, 2 in my case, which will result in:
2 (Puma Workers) * 3 (size) * 1 (Web Dyno) = 6 connections to redis used to push jobs.
Sidekiq Server
With Sidekiq itself taking 2 connections (or 5 in version 4), setting a concurrency of 10 would result in a default server pool size of 12 or 15.
If I wanted to use all the remaining available connections (30 - 6 = 24), I could set:
Sidekiq.configure_client do |config|
  config.redis = { size: 19 }
end
Total Redis connections would be 19 + 5 (Sidekiq 4) = 24, and using the default concurrency of 25 would be OK.
As Mike Perham stated, generally the concurrency must not be more than (server pool size - 2) * 2.
Now, where it starts to get confusing for me is the use of Redis outside of Sidekiq.
# initializers/redis.rb
$redis = Redis.new(:url => uri)
Whenever I use Redis in a model or controller, I call it like so:
$redis.hincrby("mycounter", "key", 1)
As I understand it, all the Puma threads wait on each other for a single Redis connection when $redis.whateverFunction is called.
In this answer to What is the best way to use Redis in a Multi-threaded Rails environment? (Puma / Sidekiq), the recommended approach is to use the connection_pool gem, as also suggested in the Sidekiq wiki: https://github.com/mperham/sidekiq/wiki/Advanced-Options#connection-pooling
require 'connection_pool'
$redis = ConnectionPool.new(size: 10) { Redis.new }
If I understand it right, in that case $redis.whateverFunction would have its own connection pool of 10, and Sidekiq its own worker connection pool, which would now have to fit in a new total of 20 Redis connections (30 available in total - 10 Redis model connections), and the Sidekiq client and server sizes would need to be changed accordingly.
How do you determine the size of the connection pool (here 10) needed for model/controller Redis connections? And since Redis is single-threaded, how does increasing the connection pool actually increase Redis operation performance?
Any thoughts on this would be of great help.
Thx!
Redis is single-threaded, but written in pure C; it uses an event loop internally and handles connections asynchronously, so connection count does not affect it much for the same number of requests. It is capable of handling requests faster than your application can generate them, because of network delay, Ruby being slower than compiled and optimized C, etc., so you do not need to worry about it being single-threaded.
Increasing the number of connections is beneficial for concurrent requests from different threads: there's no need to wait for a response to be delivered over the network before the connection is freed up, plus Ruby can do parallel IO.
Also, you can tell the pool is too small when connection checkout times become worse than you expect/tolerate and the corresponding thread/worker idles while waiting for one, so benchmark your code and have a good look at your actual usage and behaviour patterns.
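To make "benchmark your code" concrete, here is a hedged sketch of the same pooling idea in Python's redis-py (the pool size, thread count, and iteration count are illustrative, and it assumes a Redis on localhost):

import threading
import time
import redis

# A shared pool, analogous to the connection_pool gem usage above
# (assumes Redis on localhost:6379).
pool = redis.ConnectionPool(max_connections=10)
r = redis.Redis(connection_pool=pool)

def worker():
    for _ in range(1000):
        r.hincrby("mycounter", "key", 1)  # same counter op as the question

start = time.time()
threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("10,000 HINCRBYs in %.2f s" % (time.time() - start))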
On the other hand, I'd advise against using all of the connection limit; there are times when you might need those extra connections. For example:
for graceful/"zero downtime" dyno restarts ("preboot") you need twice the connections, since the old processes are still running for some time
keep at least one free connection for emergency debugging, as you may want to connect from a console directly and see what data is inside when some unexpected high load comes

Windows Snmp Management Api - Snmp timeout/retry doesn't appear to work

I'm noticing some weird SNMP communication behaviour when using the MS SNMP Management API, in terms of timeout and retries. I was wondering if the Management API is supported on Win Server 2008 R1 x64. My program is a C++ 64-bit SNMP extension agent that uses the Management API to communicate with other agents as well.
This is my pseudo code:
SnmpMgrOpen(ip address, 150ms timeout, 3 retries)
start = getTickCount()
result = SnmpMgrRequest(get request with 3 or 4 OIDs)
finish = getTickCount()
if (result == some error)
{
    log Error including total time (i.e. finish - start ticks)
}
SnmpMgrClose()
When the SnmpMgrRequest call times out, the total time is anywhere from 1014ms to 5000ms. If I set retries to 0, the total time is still 1014ms to 5000ms.
I would expect, with retries set to 0, that SnmpMgrRequest would time out within 150ms. The documentation seems to imply this. Am I missing something? Is there a minimum timeout period of at least a second? What could be causing this behaviour?
Any help would be greatly appreciated. I'm at a loss here.
ballerstyle_98@hotmail.com
From my experience with SNMP on Windows platforms, the minimum timeout value is 1 second: even if you set it to a lower value, it will default to 1 second.
Also, the timeout value is doubled for the retries. So with a 150ms, 3-retry configuration, the worst case is a failed response to a request after 1 + 2 + 2 + 2 = 7 seconds.
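As a quick sanity check, the claimed floor-and-double behaviour can be written out like this (this just restates the model above; it is not an API call):

def worst_case_seconds(timeout_ms, retries):
    # Claimed behaviour: timeouts below 1 s are raised to 1 s,
    # and the timeout is doubled for the retry attempts.
    first = max(timeout_ms / 1000.0, 1.0)
    return first + retries * (first * 2)

print(worst_case_seconds(150, 3))  # -> 7.0, i.e. 1 + 2 + 2 + 2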
I hope this helps.

WinInet: timeout management in FTP put

My program puts a file onto a remote host using FTP. For some unavoidable reasons, the remote host needs some time to acknowledge the final packet of the data transmission: more time than the default timeout, which in my experience is around 30 seconds.
Therefore I wanted to increase the timeout to 5 minutes, using this code:
DWORD dwTimeout = 300000; // 5 minutes
pFtpConnection->SetOption(                    // KB176420: this has no effect on some
    INTERNET_OPTION_SEND_TIMEOUT, dwTimeout); // old versions of IE.
pFtpConnection->SetOption(
    INTERNET_OPTION_RECEIVE_TIMEOUT, dwTimeout);
pFtpConnection->SetOption(                    // NB: Docs say these 2 are not implemented.
    INTERNET_OPTION_DATA_SEND_TIMEOUT, dwTimeout);
pFtpConnection->SetOption(                    // Our own tests show that they are!
    INTERNET_OPTION_DATA_RECEIVE_TIMEOUT, dwTimeout);
This is MFC code which boils down to calling
InternetSetOption(hConnection, INTERNET_OPTION_XXX, &dwTimeout, sizeof(dwTimeout))
The problem is that this code apparently fails to modify the timeout on a non-negligible proportion of the computers where the program is used.
How can I reliably set the data connection timeout?
TIA,
Serge Wautier.
It looks like this WinInet issue can't be solved reliably.
I eventually switched from WinInet to Ultimate TCP/IP.
