Redis timing out on long strings (below the 512 MB max) - ruby

This is using the Ruby client.
> long_string = 'x' * 9_000_000; "created"
"created"
> long_string.bytesize / (1024.0 * 1024.0) # size in megabytes
8.58306884765625
> client.set('test', long_string)
Redis::TimeoutError: Connection timed out
I get the timeout error after five seconds (the default timeout). When I raise the timeout even to ten minutes, it still fails. Whether this happens also seems to be dependent on past calls to client.set, even at different keys.
According to the documentation, the max size for strings is 512 MB. Is this over-optimistic?
This answer vaguely suggests that Redis is not meant to handle long strings. Is that what's going on, or is the issue in the Ruby library?

This isn't a Redis limitation, more likely a client setting. Repeating the example with a different client (Python's redis-py) does not reproduce the issue:
In [1]: long_string = "x" * 9000000
In [2]: len(long_string) / (1024.0 ** 2)
Out[2]: 8.58306884765625
In [3]: import redis
In [4]: r = redis.StrictRedis()
In [5]: r.set('test', long_string)
Out[5]: True
In [6]: longer = long_string * 50
In [7]: r.set('test2', longer)
Out[7]: True
In [8]: r.strlen('test2')
Out[8]: 450000000

Related

How to configure windows for large number of concurrent connections

I have seen many tutorial on how to tune Linux to scale Node.js or Erlang server to 600K+ concurrent connections.
But I have not found similar tutorial for windows, can someone help with what are the similar knobs that exist for Windows.
/etc/security/limits.d/custom.conf
In root soft nofile 1000000
root hard nofile 1000000
* soft nofile 1000000
* hard nofile 1000000
* List item
/etc/sysctl.conf
fs.file-max = 1000000
fs.nr_open = 1000000
net.ipv4.netfilter.ip_conntrack_max = 1048576
net.nf_conntrack_max = 1048576
“fs.file-max”
The maximum file handles that can be allocated
“fs.nr_open”
Max amount of file handles that can be opened
“net.ipv4.netfilter.ip_conntrack_max”
how many connections the NAT can keep track of in the “tracking” table Default is: 65536
// Increase total number of connection allowed
[HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
TcpnumConnections = 16,777,214
// increase MaxFreeTcbs to > than number concurrent connection
[HKEY_LOCAL_MACHINE \System \CurrentControlSet \Services \Tcpip
\Parameters] MaxFreeTcbs = 2000 (Default = RAM dependent, but usual
Pro = 1000, Srv=2000)
// Increase hash table size for the hashtable helping TCP lookup
[HKEY_LOCAL_MACHINE \System \CurrentControlSet \services \Tcpip
\Parameters] MaxHashTableSize = 512 (Default = 512, Range = 64-65536)
// reduce tcp delay to limit half-open connection
[HKEY_LOCAL_MACHINE\System \CurrentControlSet \services \Tcpip \Parameters]
TcpTimedWaitDelay = 120 (Default = 240 secs, Range = 30-300)

Golang readers: Why writing int64 numbers using bitwise operator <<

I have come across the following code when dealing with Go readers to limit the number of bytes read from a remote client when sending a file through multipart upload (e.g. in Postman).
r.Body = http.MaxBytesReader(w, r.Body, 32<<20+1024)
If I am not mistaken, the above notation should represent 33555456 bytes, or 33.555456 MB (32 * 2 ^ 20) + 1024. Or is this number not correct?
What I don't understand is:
why did the author use it like this? Why using 20 and not some other number?
why the author used the notation +1024 at all? Why didn't he write 33 MB instead?
would it be OK to write 33555456 directly as int64?
If I am not mistaken, the above notation should represent 33555456 bytes, or 33.555456 MB (32 * 2 ^ 20) + 1024. Or is this number not correct?
Correct. You can trivially check it yourself.
fmt.Println(32<<20+1024)
Why didn't he write 33 MB instead?
Because this number is not 33 MB. 33 * 1024 * 1024 = 34603008
would it be OK to write 33555456 directly as int64?
Naturally. That's what it likely is reduced to during compilation anyway. This notation is likely easier to read, once you figure out the logic behind 32, 20 and 1024.
Ease of reading is why I almost always (when not using ruby) write constants like "50 MB" as 50 * 1024 * 1024 and "30 days" as 30 * 86400, etc.

xarray too slow for performance critical code

I planned to use xarray extensively in some numerically intensive scientific code that I am writing. So far, it makes the code very elegant, but I think I will have to abandon it as the performance cost is far too high.
Here is an example, which creates two arrays and multiplies parts of them together using xarray (with several indexing schemes), and numpy. I used num_comp=2 and num_x=10000:
Line # Hits Time Per Hit % Time Line Contents
4 #profile
5 def xr_timing(num_comp, num_x):
6 1 4112 4112.0 10.1 da1 = xr.DataArray(np.random.random([num_comp, num_x]).astype(np.float32), dims=['component', 'x'], coords={'component': ['a', 'b'], 'x': np.linspace(0, 1, num_x)})
7 1 438 438.0 1.1 da2 = da1.copy()
8 1 1398 1398.0 3.4 da2[:] = np.random.random([num_comp, num_x]).astype(np.float32)
9 1 7148 7148.0 17.6 da3 = da1.isel(component=0).drop('component') * da2.isel(component=0).drop('component')
10 1 6298 6298.0 15.5 da4 = da1[dict(component=0)].drop('component') * da2[dict(component=0)].drop('component')
11 1 7541 7541.0 18.6 da5 = da1.sel(component='a').drop('component') * da2.sel(component='a').drop('component')
12 1 7184 7184.0 17.7 da6 = da1.loc[dict(component='a')].drop('component') * da2.loc[dict(component='a')].drop('component')
13 1 6479 6479.0 16.0 da7 = da1[0, :].drop('component') * da2[0, :].drop('component')
15 #profile
16 def np_timing(num_comp, num_x):
17 1 1027 1027.0 50.2 da1 = np.random.random([num_comp, num_x]).astype(np.float32)
18 1 977 977.0 47.8 da2 = np.random.random([num_comp, num_x]).astype(np.float32)
19 1 41 41.0 2.0 da3 = da1[0, :] * da2[0, :]
The fastest xarray multiplication takes about 150X the time of the numpy version. This is just one of the operations in my code, but I find most of them are many times slower than the numpy equivalent, which is unfortunate as xarray makes the code so much clearer. Am I doing something wrong?
Update: Even da1[0, :].values * da2[0, :].values (which loses many of the benefits of using xarray) takes 2464 time units.
I am using xarray 0.9.6, pandas 0.21.0, numpy 1.13.3, and Python 3.5.2.
Update 2:
As requested by #Maximilian, here is a re-run with num_x=1000000:
Line # Hits Time Per Hit % Time Line Contents
# xarray
9 5 408596 81719.2 11.3 da3 = da1.isel(component=0).drop('component') * da2.isel(component=0).drop('component')
10 5 407003 81400.6 11.3 da4 = da1[dict(component=0)].drop('component') * da2[dict(component=0)].drop('component')
11 5 411248 82249.6 11.4 da5 = da1.sel(component='a').drop('component') * da2.sel(component='a').drop('component')
12 5 411730 82346.0 11.4 da6 = da1.loc[dict(component='a')].drop('component') * da2.loc[dict(component='a')].drop('component')
13 5 406757 81351.4 11.3 da7 = da1[0, :].drop('component') * da2[0, :].drop('component')
14 5 48800 9760.0 1.4 da8 = da1[0, :].values * da2[0, :].values
# numpy
20 5 37476 7495.2 2.9 da3 = da1[0, :] * da2[0, :]
The performance difference has decreased substantially, as expected (only about 10X slower now), but I am still glad that the issue will be mentioned in the next release of the documentation as even this amount of difference may surprise some people.
Yes, this is a known limitation for xarray. Performance sensitive code that uses small arrays is much slower for xarray than NumPy. I wrote a new section about this in our docs for the next version:
http://xarray.pydata.org/en/stable/computation.html#wrapping-custom-computation
You basically have two options:
Write your performance sensitive code on unwrapped arrays, and then wrap them back in xarray data structures. Xarray v0.10 has a new helper function (apply_ufunc) that makes this a little easier. See the link above if you are interested in this.
Use something other than xarray/Python to do your computation. This could also make sense because Python itself adds significant overhead. Julia's AxisArrays.jl looks like interesting, though I haven't tried it myself.
I suppose option 3 would be to rewrite xarray itself in C++ (e.g., on top of xtensor), but that would be much more involved!

Max31865 on Raspberry Pi setup

I'm pretty new to coding. I'm trying to read a PT100 rtd via my Raspberry Pi 3. I read that I needed the Max31865 RTD amplifier to properly read the data because the resistances are so small. I am fairly certain I have it plugged in correctly.
I'm using this code, only slightly editted.
https://github.com/steve71/MAX31865
I'm getting two different outputs so far but it doesn't seem to correlate with anything I'm changing (The byte associated with the readTemp mostly) since I've run the same code twice and gotten both outputs. The outputs are as follows:
config register byte: ff
RTD ADC Code: 32767
PT100 Resistance: 429.986877 ohms
Straight Line Approx. Temp: 767.968750 degC
Callendar-Van Dusen Temp (degC > 0): 988.792111 degC
high fault threshold: 32767
low fault threshold: 32767
and
config register byte: 08
RTD ADC Code: 0
PT100 Resistance: 0.000000 ohms
Straight Line Approx. Temp: -256.000000 degC
Callendar-Van Dusen Temp (degC > 0): -246.861024 degC
high fault threshold: 0
low fault threshold: 0
Any help would be appreciated.
I'am dealing exactly with the same issue right now. Do you use your Pt100 with 3- or 4-wires?
I fixed the problem by setting the correct configuration status register in Line 78 of the original code (https://github.com/steve71/MAX31865) to 0xA2
self.writeRegister(0, 0xA2)
I am using 4-wires, so i had to change bit4 from 1 (3-wires) to 0 (2- or 4-wires)
0xb10100010
After this, i've got this as output
config register byte: 80
RTD ADC Code: 8333
PT100 Resistance: 101.721191 ohms
Straight Line Approx. Temp: 4.406250 degC
Callendar-Van Dusen Temp (degC > 0): 4.406808 degC
high fault threshold: 32767
low fault threshold: 0
Brrr... it's very cold in my room, isn't it? To fix this, i had to change the reference resistance in Line 170 to 430 Ohm
R_REF = 430.0 # Reference Resistor
It's curious, because i red a lot of times, there is a 400 Ohm resistance mounted on this devices as the reference. Indeed, on the SMD resistor is a 3-digit Code "431" which means 430 Ohm. Humm...
But now i have it nice and warm in here
Callendar-Van Dusen Temp (degC > 0): 25.091629 degC
Best regards
Did you get this resolved ? In case you didn't, the below python class method works for me. I remember that I had some trouble with wiring the force terminals, from memory for 2-wire you have to bridge both force terminals.
def _take_Resistance_Reading(self):
msg = '%s: taking resistance reading...' % self.Name
try:
self.Logger.debug(msg + 'entered method take_resistance_Reading()')
with self._RLock:
reg = self.spi.readbytes(9)
del reg[0] # delete 0th dummy data
self.Logger.debug("%s: register values: %s", self.Name, reg)
RTDdata = reg[1] << 8 | reg[2]
self.Logger.debug("%s: RTD data: %s", self.Name, hex(RTDdata))
ADCcode = RTDdata >> 1
self.Logger.debug("%s: ADC code: %s", self.Name, hex(ADCcode))
self.Vout = ADCcode
self._Resistance = round(ADCcode * self.Rref / 8192, 1)
self.Logger.debug(msg + "success, Vout: %s, resistance: %s Ohm" % (self.Vout, self._Resistance))
return True
except Exception as e:

Ruby IO#read max length for single read

How can i determine the max length IO#read can get in a single read on the current platform?
irb(main):301:0> File.size('C:/large.file') / 1024 / 1024
=> 2145
irb(main):302:0> s = IO.read 'C:/large.file'
IOError: file too big for single read
That message comes from io.c, remain_size. It is emitted when the (remaining) size of the file is greater or equal to LONG_MAX. That value depends on the platform your Ruby has been compiled with.
At least in Ruby 1.8.7, the maximum value for Fixnums happens to be just half of that value (-1), so you could get the limit by
2 * 2 ** (1..128).to_a.find { | i | (1 << i).kind_of? Bignum } - 1
You should rather not rely on that.

Resources