Ruby: Too many open files # rb_sysopen - ruby

After opening a file with File.new(big_file) (without closing it) 1016 times (Ubuntu) or 1017 times (CentOS), it seems there is a limit and it raises:
Too many open files # rb_sysopen - big_file (Errno::EMFILE)
Is there any way to raise that limit?
On my systems, ulimit is set to unlimited.

EMFILE is too many files opened in your process.
ENFILE is too many files opened in the entire system.
So Errno::EMFILE is due to the Ruby process opening too many files. This limit is probably set to the default of 1024, which can be seen with:
$ ulimit -n
1024
Instead of:
$ ulimit
unlimited
You can raise the soft limit with ulimit -n <new limit> (up to the hard limit allowed for your user), or from inside the Ruby process itself, as sketched below.
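A minimal sketch using Process.setrlimit; note that a normal user can only raise the soft limit up to the hard limit the system allows:

soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)   # current soft and hard limits
Process.setrlimit(Process::RLIMIT_NOFILE, hard, hard)     # raise the soft limit to the hard limit
puts "open file limit is now #{Process.getrlimit(Process::RLIMIT_NOFILE).first}"

That said, the cleanest fix is usually to close files when you are done with them, for example by opening them with File.open and a block instead of File.new.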

Related

Mainframe pkunzip generates PEX013W Record(s) being truncated to lrecl=

I'm sending binary .gz files from Linux to z/OS via ftps. The file transfers seem to be fine, but when the mainframe folks pkunzip the file, they get a warning:
PEX013W Record(s) being truncated to lrecl= 996. Record# 1 is 1000 bytes.
Currently I’m sending the site commands:
SITE TRAIL
200 SITE command was accepted
SITE CYLINDERS PRIMARY=50 SECONDARY=50
200 SITE command was accepted
SITE RECFM=VB LRECL=1000 BLKSIZE=32000
200 SITE command was accepted
SITE CONDDISP=delete
200 SITE command was accepted
TYPE I
200 Representation type is Image
...
250 Transfer completed successfully.
QUIT
221 Quit command received. Goodbye.
They could read the file after the pkunzip, but having a warning is not a good thing.
Output from pkunzip:
SDSF OUTPUT DISPLAY RMD0063A JOB22093 DSID 103 LINE 25 COLUMNS 02- 81
COMMAND INPUT ===> SCROLL ===> CSR
PCM123I Authorized services are unavailable.
PAM030I INPUT Archive opened: TEST.FTP.SOA5021.GZ
PAM560I ARCHIVE FASTSEEK processing is disabled.
PDA000I DDNAME=SYS00001,DISP_STATUS=MOD,DISP_NORMAL=CATALOG,DISP_ABNORMAL=
PDA000I SPACE_TYPE=TRK,SPACE_TYPE=CYL,SPACE_TYPE=BLK
PDA000I SPACE_PRIMARY=4194304,SPACE_DIRBLKS=5767182,INFO_ALCFMT=00
PDA000I VOLUMES=DPPT71,INFO_CNTL=,INFO_STORCLASS=,INFO_MGMTCLASS=
PDA000I INFO_DATACLASS=,INFO_VSAMRECORG=00,INFO_VSAMKEYOFF=0
PDA000I INFO_COPYDD=,INFO_COPYMDL=,INFO_AVGRECU=00,INFO_DSTYPE=00
PEX013W Record(s) being truncated to lrecl= 996. Record# 1 is 1000 bytes.
PEX002I TEST.FTP.SOA5021
PEX003I Extracted to TEST.FTP.SOA5021I.TXT
PAM140I FILES: EXTRACTED EXCLUDED BYPASSED IN ERROR
PAM140I 1 0 0 0
PMT002I PKUNZIP processing complete. RC=00000004 4(Dec) Start: 12:59:48.86 End
Is there a better set of site commands to transfer a .gz file from Linux to z/OS to avoid this error?
**** Update ****
Using SaggingRufus's answer below, it turns out it doesn't much matter how you send the .gz file, as long as it's binary. His suggestion pointed us to the parameters passed to pkunzip for the output file, which was VB and was truncating 4 bytes off each record.
Because it is a variable block (VB) file, 4 bytes of each record are taken up by the record descriptor word. Allocate the output file with an LRECL of 1004 and it will be fine.
Rather than generating a .zip file, perhaps generate a .tar.gz file and transfer it to z/OS UNIX? Tar is shipped with z/OS by default, and Rocket Software provides a port of gzip that is optimized for z/OS.

Error during go build/run execution

I've created a simple go script: https://gist.github.com/kbl/86ed3b2112eb80522949f0ce574a04e3
It fetches some XML from the internet and then starts X goroutines. The value of X depends on the file content; in my case it was 1700 goroutines.
My first execution finished with:
$ go run mathandel1.go
2018/01/27 14:19:37 Get https://www.boardgamegeek.com/xmlapi/boardgame/162152?pricehistory=1&stats=1: dial tcp 72.233.16.130:443: socket: too many open files
2018/01/27 14:19:37 Get https://www.boardgamegeek.com/xmlapi/boardgame/148517?pricehistory=1&stats=1: dial tcp 72.233.16.130:443: socket: too many open files
exit status 1
I've tried to increase the ulimit to 2048.
Now I'm getting a different error; the script is the same though:
$ go build mathandel1.go
# command-line-arguments
/usr/local/go/pkg/tool/linux_amd64/link: flushing $WORK/command-line-arguments/_obj/exe/a.out: write $WORK/command-line-arguments/_obj/exe/a.out: file too large
What is causing that error? How can I fix that?
You ran ulimit 2048, which changed the maximum file size.
From man bash(1), ulimit section:
If no option is given, then -f is assumed.
This means that you have now set the maximum file size to 2048 bytes, which is probably not enough for.... anything.
I'm guessing you meant to change the limit for number of open file descriptors. For this, you want to run:
ulimit -n 2048
As for the original error (before the maximum file size was changed): you're launching 1700 goroutines, each performing an HTTP GET. Each creates a connection using a TCP socket, and these count against the open file descriptor limit.
Instead, you should limit the number of concurrent downloads. This can be done with a simple worker pool pattern, as sketched below.
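For illustration, here is a minimal worker-pool sketch in Ruby (the language used for the other examples in this document); in Go the same shape is a fixed number of goroutines reading URLs from a channel. The urls list and the pool size are assumptions, not part of the original script:

require 'net/http'

POOL_SIZE = 20                       # at most 20 downloads (and sockets) at once
queue = Queue.new
urls.each { |u| queue << u }         # urls is assumed to hold the URL strings
POOL_SIZE.times { queue << :stop }   # one stop marker per worker

workers = POOL_SIZE.times.map do
  Thread.new do
    while (url = queue.pop) != :stop
      Net::HTTP.get(URI(url))        # each worker keeps only one connection open
    end
  end
end
workers.each(&:join)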

Couch has apparent limit of attachment sizes on Mac OS X

I have plain vanilla CouchDB from Apache, running as an app on Mac OS X 10.9. If I try to attach an attachment that is above 1 MB in size to a document, it just hangs and does nothing.
I have tried CouchDB on Linux, and there the sky is the limit.
I first thought it had to do with low limits on the Mac, but it doesn't seem so:
➜ ~ ulimit -a
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 8192
-c: core file size (blocks) 0
-v: address space (kbytes) unlimited
-l: locked-in-memory size (kbytes) unlimited
-u: processes 709
-n: file descriptors 256
What is causing this? Why? And how do I fix it?
Check the config files given by couchdb -c. You probably have this somewhere in them (for some unknown reason):
[couchdb]
max_attachment_size = 1048576 ; bytes
Remove or comment out the line and you should be fine.
Or maybe it was compiled with this limit hardcoded, in which case you could add the line to one of the config files with a larger value.
Update
max_attachment_size is undocumented, so it is probably not safe to rely on. I leave the original answer as it seems to have solved the OP's problem, but according to the docs the attachment size should be unlimited. Also, attachment_stream_buffer_size is the config key controlling the chunk size of attachments, which might be relevant.
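If you want to experiment with that chunk size, it lives in the same [couchdb] section of the ini file; the value below is only an illustration (the commonly shipped default):

[couchdb]
attachment_stream_buffer_size = 4096 ; chunk size in bytes used when streaming attachments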

Errno::ENOMEM: Cannot allocate memory - cat

I have a job running in production which processes XML files.
There are around 4k XML files, 8 to 9 GB in size all together.
After processing we get CSV files as output. I have a cat command which merges all the CSV files into a single file, and on it I'm getting:
Errno::ENOMEM: Cannot allocate memory
on the cat (backtick) command.
Below are few details:
System Memory - 4 GB
Swap - 2 GB
Ruby : 1.9.3p286
Files are processed using nokogiri and saxbuilder-0.0.8.
There is a block of code which processes the 4,000 XML files and saves the output as CSV (one per XML file); sorry, I'm not supposed to share it because of company policy.
Below is the code which merges the output files into a single file:
Dir["#{processing_directory}/*.csv"].sort_by {|file| [file.count("/"), file]}.each {|file|
`cat #{file} >> #{final_output_file}`
}
I've taken memory consumption snapshots during processing. It consumes almost all of the memory, but it won't fail there.
It always fails on the cat command.
I guess that the backtick tries to fork a new process, which doesn't get enough memory, so it fails.
Please let me know your opinion and an alternative approach.
So it seems that your system is running pretty low on memory, and spawning a shell plus calling cat is too much for the little memory left.
If you don't mind losing some speed, you can merge the files in Ruby, with small buffers.
This avoids spawning a shell, and you can control the buffer size.
This is untested, but you get the idea:
buffer_size = 4096
output_file = File.open(final_output_file, 'w')
Dir["#{processing_directory}/*.csv"].sort_by { |file| [file.count("/"), file] }.each do |file|
  f = File.open(file)
  # copy in small chunks so a whole CSV never has to fit in memory
  while (buffer = f.read(buffer_size))
    output_file.write(buffer)
  end
  f.close
end
output_file.close
You are probably out of physical memory, so double check that and verify your swap (free -m). In case you don't have a swap space, create one.
Otherwise, if your memory is fine, the error is most likely caused by shell resource limits. You can check them with ulimit -a.
They can be changed with ulimit (see: help ulimit), e.g.
ulimit -Sn unlimited && ulimit -Sl unlimited
To make these limits persistent, you can create a limits file with the following shell command:
cat | sudo tee /etc/security/limits.d/01-${USER}.conf <<EOF
${USER} soft core unlimited
${USER} soft fsize unlimited
${USER} soft nofile 4096
${USER} soft nproc 30654
EOF
Or use /etc/sysctl.conf to change the limit globally (man sysctl.conf), e.g.
kern.maxprocperuid=1000
kern.maxproc=2000
kern.maxfilesperproc=20000
kern.maxfiles=50000
I had the same problem, but instead of cat it was sendmail (the mail gem).
I found the problem and a solution: installing the posix-spawn gem, e.g.
gem install posix-spawn
and here is an example:
require 'posix/spawn'

a = (1..500_000_000).to_a       # allocate a lot of memory in the parent process
POSIX::Spawn::spawn('ls')       # start the child via posix_spawn instead of fork
This time, creating the child process should succeed.
See also: Minimizing Memory Usage for Creating Application Subprocesses at Oracle.

Why can I only open 2045 files with Tie::File on Windows?

I have the following code that tries to tie arrays to files. Except that when I run it, it only creates 2045 files. What is the issue here?
#!/usr/bin/perl
use Tie::File;
for (my $i = 0; $i < 10000; $i++) {
    @files{$i} = ();
    tie @{$files{$i}}, 'Tie::File', "files//tiefile$i";
}
Edit: I am on Windows.
You are accumulating open file handles (see ulimit -n, setrlimit RLIMIT_NOFILE/RLIMIT_OFILE), and you ultimately hit a limit of 2048 open file descriptors (2045 + stdin + stdout + stderr).
Under Windows you will have to rewrite your application so that it has at most 2048 open file handles at any one time, since the 2048 limit is a hard limit (it cannot be modified) in MSVC's stdio.
On Linux machines go to /etc/security/limits.conf and add or modify these lines
* soft nofile 10003
* hard nofile 10003
This will increase the number of files each process can have open to 10003 (remember that you always start with three open: stdin, stdout, and stderr).
Based on your comments it sounds like you are using a Win32 machine. I can't find a way to increase the number of open files per process, but you might, and I stress might, be able to handle this through forking (which is really threading on Win32).
