Out of memory when running JavaScript with PhantomJS - shell

My shell script is written for Cygwin on Windows:
# main.sh
#!/bin/bash
[ "$#" -lt 1 ] && echo "Usage: thisscript.sh <filename.txt>" && exit 0
filename=$(basename -s .txt "$1")
i=0
while [ "$i" -eq 0 ]
do
    phantomjs --web-security=no myXHR.js "$filename.txt"
    logLastLine=$(tail -n 1 "$filename.log")
    if [[ "$logLastLine" =~ "Error" ]]; then
        echo "Error occurs, now keep looping it..."
    elif [[ "$logLastLine" =~ "503" ]]; then
        echo "Error occurs, now keep looping it..."
    elif [[ "$logLastLine" =~ "500" ]]; then
        echo "Error occurs, now keep looping it..."
    else
        echo "Complete! Exiting the execution..."
        i=1
    fi
done
And here is the code contained in myXHR.js:
// myXHR.js
phantom.onError = function(msg, trace) {
    console.log("PhantomJS Error");
    phantom.exit();
};
var fs = require('fs'), system = require('system');
if (system.args.length < 2) {
    console.log("Usage: myXHR.js <FILE>");
}
var content = '',
    f = null,
    lines = null,
    filename = '',
    eol = "\n";
try {
    f = fs.open(system.args[1], "r");
    filename = system.args[1].replace(/\.txt$/, "");
    content = f.read();
} catch (e) {
    console.log(e);
}
if (f) {
    f.close();
}
var request = new XMLHttpRequest();
if (content) {
    lines = content.split(eol);
    for (var i = 0; i < (lines.length - 1); i++) {
        request.open('GET', "http://stackoverflow.com/", false);
        request.send();
        if (request.status === 200) {
            try {
                // append one log line per input line so main.sh can tail the log
                fs.write(filename + ".log", lines[i] + "Succeed!" + eol, 'a');
            } catch (e) {
                console.log(e);
            }
        } else {
            try {
                fs.write(filename + ".log", lines[i] + "Error!" + eol, 'a');
            } catch (e) {
                console.log(e);
            }
        }
    }
}
phantom.exit();
To illustrate: the JavaScript, executed by PhantomJS, reads the first argument (a filename.txt file) passed to the shell script line by line. For each line it sends an XMLHttpRequest, checks the request status, and writes the result into the filename.log file.
Error status numbers include 503 and 500. Luckily these statuses are unlikely to occur again if I resend the same XMLHttpRequest, so what I need is an error handler that resends the same XMLHttpRequest when an error occurs.
In this error handler, I use X=$(tail -n 1 $filename.log) to see whether the last log line contains an error status number (the string "503" or "500"). For instance, if [[ "$X" =~ "503" ]] matches, the execution of the JavaScript is restarted by not setting i=1, so the while loop never exits until the last line of the imported file has been processed without any error status numbers.
(I know it is awkward to handle errors like this, but it was the quick solution that came to my mind.)
But this is theoretical. In practice, the script ends with a "Memory exhausted" error. I reckon this error is triggered by the large number of lines (>100k) in the $1 file, and that it occurs in the JavaScript part. Watching free -m while the JavaScript is running, I can see the used swap increasing.
Could anybody teach me how to release memory while the script is executing?
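One way to keep the memory per PhantomJS run bounded, without changing the retry loop, is to feed myXHR.js smaller pieces of the input, so a single run never holds all >100k lines and their results at once. A rough sketch only, assuming the lines are independent of each other and a GNU coreutils split that supports --additional-suffix (chunk size and file names are arbitrary):
# split the big input into 5000-line chunks, keeping the .txt suffix that myXHR.js strips
split -l 5000 --additional-suffix=.txt "$filename.txt" "${filename}_part_"
for part in "${filename}_part_"*.txt; do
    phantomjs --web-security=no myXHR.js "$part"
    cat "${part%.txt}.log" >> "$filename.log"   # merge the per-chunk logs
done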

Related

Perl one liner in Bash script

I have a bash script, and I'm trying to use a Perl one-liner to replace some text in a file, variables.php.
However, I would like to check if the Perl one-liner runs successfully and that's where I get hung up. I could just output the one-liner and it would work fine, but I would like to know for sure that it ran.
Basically, the function replace_variables() is the function that does the update, and it's the if statement there that I would like to check if my one-liner worked properly.
I've tried using the run_command function in that if statement, but that did not work, and I've tried putting the one-liner directly there, which also didn't work.
If I don't wrap it in an if statement, and just call the one-liner directly, everything works as intended.
Here's the full file:
#!/bin/bash
export CLI_CWD="$PWD"

site_variables() {
    if [ -f "$CLI_CWD/variables.php" ]; then
        return true
    else
        return false
    fi
}

replace_variables() {
    # perl -pi -e 's/(dbuser)(\s+)=\s.*;$/\1 = Config::get("db")["user"];/; s/(dbpass)(\s+)=\s.*;$/\1 = Config::get("db")["pass"];/; s/(dbname)(\s+)=\s.*;$/\1 = Config::get("db")["database"];/' "$CLI_CWD/variables.php"
    if [run_command ]; then
        echo "Updated variables.php successfully"
    else
        echo "Did not update variables.php"
    fi
}

run_command() {
    perl -pi -e 's/(dbuser)(\s+)=\s.*;$/\1 = Config::get("db")["user"];/; s/(dbpass)(\s+)=\s.*;$/\1 = Config::get("db")["pass"];/; s/(dbname)(\s+)=\s.*;$/\1 = Config::get("db")["database"];/' "$CLI_CWD/variables.php"
}

if [ site_variables ]; then
    replace_variables
else
    >&2 echo "Current directory ($(pwd)) is not a project root directory"
    exit 4
fi
Here's the function where the if statement fails:
replace_variables() {
    # perl -pi -e 's/(dbuser)(\s+)=\s.*;$/\1 = Config::get("db")["user"];/; s/(dbpass)(\s+)=\s.*;$/\1 = Config::get("db")["pass"];/; s/(dbname)(\s+)=\s.*;$/\1 = Config::get("db")["database"];/' "$CLI_CWD/variables.php"
    if [run_command ]; then
        echo "Updated variables.php successfully"
    else
        echo "Did not update variables.php"
    fi
}
You can see that I commented out the one-liner just before the if statement; it works if I let that run and remove the if/else check.
Here is the original file snippet before the update:
//Load from Settings DB
$dbuser = 'username';
$dbpass = 'password';
$dbname = 'database_name';
Here is the file snippet after the update has run:
//Load from Settings DB
$dbuser = Config::get("db")["user"];
$dbpass = Config::get("db")["pass"];
$dbname = Config::get("db")["database"];
tl;dr and Solution
This usage of if with [ ] will not give you the result you expect.
What you're looking for
...
if run_command; then
...
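Applied to the function from the question, that looks like this (a sketch; run_command itself stays exactly as you wrote it):
replace_variables() {
    if run_command; then    # run_command's exit status is the perl one-liner's exit status
        echo "Updated variables.php successfully"
    else
        echo "Did not update variables.php"
    fi
}
The same applies to if [ site_variables ]; then at the bottom of the script: call it as if site_variables; then. Note that the function body also needs return 0 / return 1 instead of return true / return false, because bash's return only accepts a numeric status.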
Longer explanation
Basics of if
if is a shell feature
based on the condition, it executes the body contained in between then and fi
the "condition" that if checks is a command
commands usually have a return/exit code. typically
0 for success
1 (common) and everything else for some error
e.g. 127 for command not found
when the return/exit code is 0, the body is executed
otherwise it is skipped; or control is passed to elif or else
the syntax is if <command>; then...
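A minimal illustration of that (grep here is just a stand-in command; the variables.php path is the one from the question):
# grep -q exits 0 when the pattern is found and non-zero otherwise,
# so its exit status can drive the if directly
if grep -q 'dbuser' "$CLI_CWD/variables.php"; then
    echo "found a dbuser line"
else
    echo "no dbuser line"
fi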
Where does that [ ] come from?
test is a command that can check file types and compare values
refer to man test and help test (bash only)
[ ... ] is a synonym for test
NB the brackets should be surrounded by spaces on both sides
if [ -f "/path/to/$filename" ]; then
exception: when terminated by a newline or ;, the space is not required
test (or [ ]) evaluates expressions and cannot execute other commands or functions
if [ expr ]; then is alternate syntax for if test expr; then
PS: good practice to "quote" your "$variables" when used with test or [ ]
PPS: [[ ... ]] is a different thing altogether. not POSIX; available only in some shells. take a look at this thread on the UNIX Stack Exchange

Kaldi librispeech data preparation error

I'm trying to build an ASR system. I'm using the Kaldi manual and the LibriSpeech corpus.
In the data preparation step I get this error:
utils/data/get_utt2dur.sh: segments file does not exist so getting durations
from wave files
utils/data/get_utt2dur.sh: could not get utterance lengths from sphere-file
headers, using wav-to-duration
utils/data/get_utt2dur.sh: line 99: wav-to-duration: command not found
And here is the piece of code where this error occurs:
if cat $data/wav.scp | perl -e '
  while (<>) { s/\|\s*$/ |/;  # make sure final | is preceded by space.
    @A = split;
    if (!($#A == 5 && $A[1] =~ m/sph2pipe$/ &&
          $A[2] eq "-f" && $A[3] eq "wav" && $A[5] eq "|")) { exit (1); }
    $utt = $A[0]; $sphere_file = $A[4];
    if (!open(F, "<$sphere_file")) { die "Error opening sphere file $sphere_file"; }
    $sample_rate = -1; $sample_count = -1;
    for ($n = 0; $n <= 30; $n++) {
      $line = <F>;
      if ($line =~ m/sample_rate -i (\d+)/) { $sample_rate = $1; }
      if ($line =~ m/sample_count -i (\d+)/) { $sample_count = $1; }
      if ($line =~ m/end_head/) { break; }
    }
    close(F);
    if ($sample_rate == -1 || $sample_count == -1) {
      die "could not parse sphere header from $sphere_file";
    }
    $duration = $sample_count * 1.0 / $sample_rate;
    print "$utt $duration\n";
  } ' > $data/utt2dur; then
  echo "$0: successfully obtained utterance lengths from sphere-file headers"
else
  echo "$0: could not get utterance lengths from sphere-file headers,
using wav-to-duration"
  if ! command -v wav-to-duration >/dev/null; then
    echo "$0: wav-to-duration is not on your path"
    exit 1;
  fi
In the file wav.scp I have lines like this:
6295-64301-0002 flac -c -d -s /home/tinin/kaldi/egs/librispeech/s5/LibriSpeech/dev-clean/6295/64301/6295-64301-0002.flac |
This dataset contains only flac files (they were downloaded via the provided script), so I don't understand why we are looking for wav files, or how to run the data preparation correctly (I didn't change the source code from the manual).
Also, if you could explain what is happening in this code, I would be very grateful, because I'm not familiar with bash and Perl.
Thank you a lot!
The problem I see from this line
utils/data/get_utt2dur.sh: line 99: wav-to-duration: command not found
is that you have not added the Kaldi tools to your path.
Check the file path.sh and see if the directories it adds to your path are correct (it contains ../../.. and that might not match your current folder setup).
As for the perl script, it counts the samples of the sound file and then divides by the sample rate to get the duration. Don't worry about the word 'wav': your files might be in another format; it's just the name of the Kaldi tool.
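For example, from the recipe directory you can check this quickly (a sketch; the exact directories added depend on KALDI_ROOT in your path.sh):
cd ~/kaldi/egs/librispeech/s5
. ./path.sh                   # prepends the Kaldi tool directories to $PATH
command -v wav-to-duration    # should now print the full path to the binary
# if it prints nothing, fix KALDI_ROOT in path.sh and make sure the tools in src/ are compiled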

Perl script hangs for no reason

So I have this small script which checks two log files for a specific line and compares the lines.
The script is used on several different Windows Bamboo agents, but on one of them it just hangs and doesn't exit. Since the script is used in Bamboo, the whole job hangs when this script doesn't exit.
When I check the computer via remote access and kill the script, the job continues until it reaches the script again.
This is the script, which is started by another script.
#! /usr/bin/perl
my $naluresult = 2;
my $hevcresult = 2;
my $hevcfailed = 0;
use strict;
use warnings;
#---------------------------------------------
#check for $ARGV[0] and $ARGV[1]
open( my $nalulog, "<", $ARGV[1] )
    or die "cannot open File:$!\n\n";
while (<$nalulog>) {
    chomp;
    $_ =~ s/\s+//g;
    if ( $_ =~ m/MD5:OK/ ) {
        $naluresult = 1;
    } else {
        if ( $_ =~ m/MD5:MISSING/ ) {
            $naluresult = 0;
        }
    }
}
close $nalulog;
#---------------------------------------------
open( my $hevclog, "<", $ARGV[0] )
    or die "cannot open File:$!\n\n";
while (<$hevclog>) {
    chomp;
    $_ =~ s/\s+//g;
    if ( $_ =~ m/MD5check:OK/ ) {
        $hevcresult = 1;
        last;
    } else {
        if ( $_ =~ m/MD5check:FAILED/ ) { $hevcfailed = 1; }
    }
    if ( $hevcfailed == 1 ) {
        #do stuff
    }
}
close $hevclog;
#---------------------------------------------
if ( $hevcresult == 2 ) {
    print("Missing MD5 status in HEVC Output");
    exit(-1);
} elsif ( $naluresult == 2 ) {
    print("Missing MD5 status in NALU Output");
    exit(-2);
} else {
    if ( $naluresult == $hevcresult ) { exit(0); }
    else {
        #different if-statements to print() to log
        exit(1);
    }
}
#---------------------EOF---------------------
If your files are just normal disk files that aren't being simultaneously written to by other processes, or locked, or anything like that, then there is nothing in the code you have here that should hang. If the files are both reasonable sizes, the code you have here should read through the files and finish.
However, if one of the files is locked, or is immensely large, or if you have other code that can get stuck in an infinite loop, that would explain why your program is hanging.
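If one of those environmental causes turns out to be the culprit, a defensive workaround is to bound the script's runtime from the calling script, so a stuck read cannot stall the whole Bamboo job. A rough bash sketch, assuming GNU coreutils' timeout is available on the agent (the script and log names here are placeholders):
# run the comparison with a 60-second ceiling; GNU timeout exits with 124 on timeout
timeout 60 perl compare_md5.pl hevc.log nalu.log
rc=$?
if [ "$rc" -eq 124 ]; then
    echo "MD5 comparison timed out" >&2
fi
exit "$rc"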

How to read the output AND the return value of an external program in a shell script?

I'm making a script that reads a tracking code, looks at the result of posting the tracking code to a website, prints some messages, and has a return value.
Here's part of the python code:
# update return True if there was a change to the .msg file
def update(cod):
    msg = extract_msg(cod)
    if msg == 'ERROR':
        print('ERROR: invalid code\n')
        sys.exit(2)
    file = open('.msg', "r+")
    old_msg = file.read()
    if msg == old_msg:
        return False
    else:
        print('Previous message: ' + old_msg)
        print('Latest message: ' + msg)
        file = overwrite(file, msg)
        file.close()
        return True

def main(argv):
    if len(argv) > 1:
        cod_rastr = argv[1]
    else:
        print("Error: no arg, no code\n")
        return -1
    # Verify if file exists
    if os.path.isfile(".msg") == False:
        arq = open('.msg', 'w')
        arq.close()
    # post() returns the source code of the resulting page of the posted code.
    cod = post(cod_rastr)
    if update(cod) == False:
        return 0
    else:
        print('\n Message!\n')
        return 1
And here I want to read not only the printed messages (for the end user) but also the return value (for conditional use). This script should read the output of the .py script and send me an email when there is an update since the last check (I'll put this script in the crontab):
#!/bin/bash
if [ -z "$1" ]; then
    echo usage: $0 CODE
    exit
fi
CODE=$1
STATUS=$(myscript.py $CODE 2>&1)
VAL=$?
FILE=$(<.msg)
# always prints 0 (zero)
echo $VAL
# I want to check for an existing update case
if [[ $STATUS == 'Message!' ]]
then
    echo $STATUS
    echo $FILE | mail myuser@mydomain.com -s '$CODE: Tracking status'
fi
The problem is that $? always returns 0, and my string check inside the if is not working, because I think it reads the update() prints too, which contain variables.
How can I make this shell script run, without changing the python script?
Thanks in advance.
I suspect that you can do what you want with the subprocess module. Either use rc = subprocess.call(...) to get a return code while directing stdout to a file, or use p = subprocess.Popen(...) and then perhaps p.communicate to get output and p.returncode to get the returncode.
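On the shell side of the question, note that $STATUS captures everything the Python script printed, not just the last message, so the exact comparison [[ $STATUS == 'Message!' ]] can never match. A substring (glob) match is one way to make that check work without changing the Python script; a small sketch using the variables from the question:
STATUS=$(myscript.py "$CODE" 2>&1)
VAL=$?                                    # exit status of myscript.py itself
if [[ $STATUS == *'Message!'* ]]; then    # substring match against all captured output
    echo "$STATUS"
    echo "$FILE" | mail myuser@mydomain.com -s "$CODE: Tracking status"
fi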

How can I check if stdin exists in PHP ( php-cgi )?

Setup and Background
I am working on a script that needs to run as /usr/bin/php-cgi instead of /usr/local/bin/php, and I'm having trouble checking for stdin.
If I use /usr/local/bin/php as the interpreter I can do something like
if (defined('STDIN')) { ... }
This doesn't seem to work with php-cgi - it looks to always be undefined. I checked the man page for php-cgi but didn't find it very helpful. Also, if I understand it correctly, the STDIN constant is a file handle for php://stdin. I read somewhere that this constant is not supposed to be available in php-cgi.
Requirements
The shebang needs to be #!/usr/bin/php-cgi -q
The script will sometimes be passed arguments
The script will sometimes receive input via STDIN
Current Script
#!/usr/bin/php-cgi -q
<?php
$stdin = '';
$fh = fopen('php://stdin', 'r');
if ($fh)
{
    while ($line = fgets($fh)) {
        $stdin .= $line;
    }
    fclose($fh);
}
echo $stdin;
Problematic Behavior
This works OK:
$ echo hello | ./myscript.php
hello
This just hangs:
./myscript.php
These things don't work for me:
Checking defined('STDIN') // always returns false
Looking to see if CONTENT_LENGTH is defined
Checking variables and constants
I have added this to the script and run it both ways:
print_r(get_defined_constants());
print_r($GLOBALS);
print_r($_COOKIE);
print_r($_ENV);
print_r($_FILES);
print_r($_GET);
print_r($_POST);
print_r($_REQUEST);
print_r($_SERVER);
echo shell_exec('printenv');
I then diff'ed the output and it is the same.
I don't know any other way to check for / get stdin via php-cgi without locking up the script if it does not exist.
/usr/bin/php-cgi -v yields: PHP 5.4.17 (cgi-fcgi)
You can use the stream_select function, such as:
$stdin = '';
$fh = fopen('php://stdin', 'r');
$read = array($fh);
$write = NULL;
$except = NULL;
if ( stream_select( $read, $write, $except, 0 ) === 1 ) {
    while ($line = fgets($fh)) {
        $stdin .= $line;
    }
}
fclose($fh);
Regarding your specific problem of hanging when there is no input: php stream reads are blocking operations by default. You can change that behavior with stream_set_blocking(). Like so:
$fh = fopen('php://stdin', 'r');
stream_set_blocking($fh, false);
$stdin = fgets($fh);
echo "stdin: '$stdin'"; // immediately returns "stdin: ''"
Note that this solution does not work with that magic file handle STDIN.
stream_get_meta_data helped me :)
And as mentioned in the previous answer by Seth Battin stream_set_blocking($fh, false); works very well 👍
The next piece of code reads data from stdin if it is provided and skips it when it's not.
For example:
echo "x" | php render.php
and php render.php
In the first case, I provide some data from another stream (I really need to see the changed files from git, something like git status | php render.php).
Here is an example of my solution which works:
$input = [];
$fp = fopen('php://stdin', 'r+');
$info = stream_get_meta_data($fp);
if (!$info['seekable'] && $fp) {
    while (false !== ($line = fgets($fp))) {
        $input[] = trim($line);
    }
    fclose($fp);
}
The problem is that you create an endless loop with the while ($line = fgets($fh)) part in your code.
$stdin = '';
$fh = fopen('php://stdin', 'r');
if ($fh) {
    // read *one* line from stdin up to "\r\n"
    $stdin = fgets($fh);
    fclose($fh);
}
echo $stdin;
The above would work if you're passing arguments like echo foo=bar | ./myscript.php and will read a single line when you call it like ./myscript.php
If you would like to read more lines and keep your original code, you can end the input by sending an end-of-file with CTRL + D.
To get parameters passed like ./myscript.php foo=bar you could check the contents of the $argv variable, in which the first element is always the name of the executing script:
./myscript.php foo=bar
// File: myscript.php
$stdin = '';
for ($i = 1; $i < count($argv); $i++) {
    $stdin .= $argv[$i];
}
I'm not sure that this solves anything, but perhaps it gives you some ideas.

Resources