How do I get GNU parallel to work on git bash in Windows 7?

I was able to install GNU Parallel globally in git-bash by following this answer.
However, on running an example command as mentioned in the parallel-tutorial,
parallel -k echo ::: A B C > abc-file
I keep getting this error
sh: -c: option requires an argument
sh: -c: option requires an argument
sh: -c: option requires an argument
.
.
.
What am I doing wrong here?

So the root cause is that Cygwin (unlike GNU/Linux) does not respect redirection of STDERR if the command line is too long.
GNU Parallel figures out how long the longest possible command line is by doing a binary search for the length. This is awfully slow on Cygwin because forking a 12 MB command line is horribly slow (and 12 MB seems to be the limit in my version of Cygwin).
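To make the binary search concrete, here is a minimal sketch of the idea. It is not GNU Parallel's actual code: find_max_len and fits are hypothetical names, and fits is a stand-in predicate for "a command line of this length still runs".

```shell
# find_max_len LOW HIGH PREDICATE
# Print the largest N in [LOW, HIGH] for which "PREDICATE N" succeeds.
# Assumes PREDICATE succeeds for LOW and that once it fails for some N
# it also fails for every larger N.
find_max_len() {
    local lo=$1 hi=$2 ok=$3 mid
    while [ "$lo" -lt "$hi" ]; do
        # Round up so the interval shrinks on every iteration.
        mid=$(( (lo + hi + 1) / 2 ))
        if "$ok" "$mid"; then
            lo=$mid              # mid works: the answer is mid or larger
        else
            hi=$(( mid - 1 ))    # mid fails: the answer is below mid
        fi
    done
    echo "$lo"
}

# Hypothetical predicate: pretend the longest working length is 12345.
fits() { [ "$1" -le 12345 ]; }

find_max_len 1 1000000 fits   # prints 12345
```

A search over roughly 12 million candidate lengths needs about 24 probes, which would be consistent with the error appearing around 25 times on the first run.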
Luckily it only has to be done once. After this GNU Parallel caches the line length in ~/.parallel/tmp/HOSTNAME/linelen, and this is the reason why you experience the problem when ~/.parallel/tmp is removed.
This is also the reason why it seemed that using a different version worked: You simply had a single run that finished, and thus cached the length. It was not the change of version that did this.
Until I manage to get Cygwin to suppress the sh: -c: option requires an argument message, all you need to do is ignore it and be patient. I should probably also put in a small warning, to let Cygwin users know that they have to be patient the first time.
Run:
parallel echo ::: 1
It will print the sh: -c: option requires an argument error around 25 times, but that is fine. It will take around 30 seconds to complete.
After this everything should be fast(er) and you should not see the error.
It should be fixed in the newest version in GIT: https://savannah.gnu.org/git/?group=parallel
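If you want to know in advance whether the slow first run is still ahead of you, you can check for the cache file. This is only a sketch: linelen_cached is a hypothetical helper, and it matches any host-name directory under ~/.parallel/tmp because how HOSTNAME is derived is not stated here.

```shell
# linelen_cached [TMPDIR]
# Succeed if a non-empty cached line-length file exists under TMPDIR
# (default: ~/.parallel/tmp, the location named in the answer above).
linelen_cached() {
    local tmpdir=${1:-$HOME/.parallel/tmp} f
    # The host-name directory varies, so accept any */linelen file.
    for f in "$tmpdir"/*/linelen; do
        [ -s "$f" ] && return 0
    done
    return 1
}
```

For example, `linelen_cached || echo "first run will be slow" >&2` before calling parallel tells you whether to expect the repeated error messages.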

You are not the first to have this problem, and currently we do not know what causes it. I have access to a Windows 10 machine and I do not see that behaviour. A workaround seems to be using an older version of GNU Parallel. You can help by figuring out which versions work. When you have found a single version that works (look here: https://ftpmirror.gnu.org/parallel), run this:
testone() {
  v="$1"
  # Download, unpack, and smoke-test one release; stop at the first failing step.
  wget -c "https://ftpmirror.gnu.org/parallel/parallel-$v.tar.bz2" &&
    tar xvf "parallel-$v.tar.bz2" &&
    cd "parallel-$v" &&
    src/parallel true ::: 1
}
export -f testone
parallel -k --joblog my.log testone {1}{2}22 ::: {2012..2020} ::: {01..12}
grep -E '\s0\s0\stest' my.log
This will give all versions that do work.
Post the output from parallel -Dall echo ::: foo for the newest working version, the version following it, and the newest version (20200322).

Related

Get log entries written while executing command

I have a service that writes to a file in /var/log. For testing purposes, I am looking for a way to extract just the log lines that are written while executing a command against the service. I know I could do it with a C program using fseek/ftell, but that would require extra tooling in the VM. I would prefer a pure bash solution (bash 4.4, Ubuntu 18.04). I thought maybe something about using tail -f might work, but I can't figure out exactly how to work that.
You can use the diff command. It takes two files as input and prints the lines that differ. Copy the log file before running the command, then compare the live log with the copy afterwards.
$ cat > logfile
line 1
line 2 asdf
$ cp logfile logfile-old
$ cat >> logfile
Third one.
Oups. Error occured.
$ diff logfile logfile-old
3,4d2
< Third one.
< Oups. Error occured.
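An alternative that avoids copying the whole log is to remember its size before the command runs and print only the bytes appended afterwards. This is a sketch under the assumption that the log is only appended to (not rotated or truncated) while the command runs; log_during is a hypothetical helper name.

```shell
# log_during LOGFILE COMMAND [ARGS...]
# Run COMMAND, then print whatever was appended to LOGFILE meanwhile.
# Returns COMMAND's exit status.
log_during() {
    local logfile=$1 before status
    shift
    before=$(wc -c < "$logfile")      # size in bytes before the command
    "$@"
    status=$?
    # tail -c +N starts output at byte N (1-based), so skip the old bytes.
    tail -c "+$(( before + 1 ))" "$logfile"
    return "$status"
}
```

With the example above, `log_during logfile some_command` would print only the lines written after `line 2 asdf`. If the service buffers its output, a short sleep before the tail may be needed.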

xargs command fails on 'LINE_MAX' assertion

I've been using git bash on Windows for a couple of years now and never had any real issues. Now I'm faced with a somewhat weird issue. Whenever the xargs command is used I get this error:
assertion "bc_ctl.arg_max >= LINE_MAX" failed: file "xargs.c", line 500, function: main
Aborted (core dumped)
The dumped core isn't particularly useful to me since it only contains Frame/Function/Args hexadecimal values:
Stack trace:
Frame Function Args
000FFFFC370 0018005D14C (000FFFFE3F4, 00000000002, 000FFFFC4A0, 000FFFFDE50)
...
End of stack trace
Even with a command as trivial as xargs --version, I still get the same result. I've searched and read many READMEs, blogs, and manuals, but I can't seem to find what's causing this. I know that the assertion means that my command is somehow bigger than LINE_MAX; which is this on my system:
workspace $>getconf LINE_MAX
2048
But I don't see how xargs --version is bigger/longer than 2048.
Any help or tips are appreciated.
Thanks
OK, it turns out that some other scripts were sourced when my git bash loaded. They were company '.sh' scripts, since I'm on my work PC. One of them was breaking my xargs by interfering with a couple of configuration settings (including LINE_MAX).
By 'unsourcing' these scripts I've solved my problem.
I'm still investigating the reason why getconf returned the proper values while xargs failed but at least I'm now able to use xargs properly.
Thanks
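One way to confirm that sourced scripts are the culprit, before hunting for the specific one, is to run the failing command in a shell that reads no startup files and inherits almost no environment. The sketch below is an assumption about how one might do that; clean_bash is a hypothetical helper name.

```shell
# clean_bash 'COMMAND STRING'
# Run the command string in a fresh bash that skips profile and rc files
# and starts from an (almost) empty environment.
clean_bash() {
    # env -i drops every exported variable; only HOME and PATH are passed on.
    env -i HOME="$HOME" PATH="$PATH" bash --noprofile --norc -c "$1"
}
```

If `clean_bash 'xargs --version'` works while plain `xargs --version` aborts, something in the startup scripts is responsible. Comparing `env | wc -c` in both shells is also worth a look, on the unconfirmed assumption that a very large environment leaves xargs too little room for its command line.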

How to extract only version from "go version" command using sed or other bash command in bash shell script running on Debian 10

The command go version currently prints go version go1.13.6 linux/amd64. I installed Go from the Go website rather than from Debian packages because the packaged version is old. Therefore traditional ways to extract the version number, like dpkg -s, cannot be used.
I've explored sed commands to extract only the number (1.13.6), like this other question on this site, which I grant is similar. However, after reading various sources online about what's possible with sed, and with my limited knowledge, I've been unable to work out how to tell sed where to start, let alone make it future-proof for new versions that may alter this number format slightly. I've tried to explore ways to say "find the 3rd to last number" so that I can then work backwards, or "find the 2nd word 'go'".
My efforts so far have been purely theoretical; since I can't work out where to begin, I've not included any attempts.
Can it be done?
$ v=$(go version | { read -r _ _ v _; echo "${v#go}"; })
$ echo $v
1.13.6
Further reading:
Compound commands.
The read command.
Parameter expansion.
Command substitution.
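Since the question asked about sed specifically, here is a sed sketch of the same extraction. It assumes the output keeps the shape "go version go<VERSION> <os/arch>" shown in the question; parse_go_version is a hypothetical helper name, and it reads the line from standard input so it can be tried without Go installed.

```shell
# Read a "go version" line on stdin and print only the version number.
parse_go_version() {
    # Capture everything between the literal "go version go" and the next
    # space; -n plus the p flag prints nothing if the line does not match.
    sed -n 's/^go version go\([^ ]*\) .*$/\1/p'
}

echo 'go version go1.13.6 linux/amd64' | parse_go_version   # prints 1.13.6
```

In a script this would be used as `v=$(go version | parse_go_version)`. Because the capture stops at the first space rather than matching digits only, a differently formatted version string would still be captured whole.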

How rm * works in case of huge number of files in AIX (ksh)

Is there any limit of arguments that can be passed to rm command in AIX? I use ksh. I tried to run
rm *
in a directory that (now) holds more than 500,000 files, but I received a strange error. If I remember correctly, it was something like "memory core dump". What does it mean? Can I assume that some files were removed? I don't know how many files were there before I executed this command.
I think the shell was not able to collect all the filenames and pass them to the rm command, so nothing was removed, but I really don't know. Can someone explain how it works?
How long an argument list is allowed to be is defined by ARG_MAX which you can query using getconf, e.g.:
$ uname -o
GNU/Linux
$ getconf ARG_MAX
2097152
This is not a limit set by the shell but by the underlying system call(s) involved.
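Because the shell expands * into one argument list before rm is ever started, a common way around the limit is to let find hand the names to rm in batches. The sketch below uses only POSIX find options and has not been verified on AIX. remove_files_in is a hypothetical helper name. Unlike rm *, it also removes hidden files.

```shell
# remove_files_in DIR
# Delete the regular files directly inside DIR (not in subdirectories).
remove_files_in() {
    (
        cd "$1" || exit 1
        # "! -name . -prune" stops find from descending below DIR;
        # "-exec ... {} +" runs rm with as many names as fit per call.
        find . ! -name . -prune -type f -exec rm -f {} +
    )
}
```

Each rm invocation started by find stays under ARG_MAX, so the number of files in the directory no longer matters.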

Use of the 'hash' command

I'm working on a small app based on ffmpeg, and I read a tutorial written for Ubuntu that advises running the hash command on the produced executable.
I'm curious about that command. Have you ever used it? For what purpose?
When I run it in my source folder, I get this (once compiled)
$ hash
hits command
1 /usr/bin/strip
1 /usr/local/bin/ffmpeg
1 /usr/bin/svn
4 /usr/local/bin/brew
2 /usr/bin/git
1 /bin/rm
1 /bin/cat
1 /usr/bin/ld
1 /bin/sh
4 /usr/bin/man
5 /usr/bin/make
4 /usr/bin/otool
15 /bin/ls
6 /usr/bin/open
2 /usr/bin/clear
Looks like a summary of my bash_history…
When I run it on an executable file, nothing much is displayed, and nothing seems to change in that application:
$ md5 ffserver
MD5 (ffserver) = 2beac612e5efd6ee4a827ae0893ee338
$ hash ffserver
$ md5 ffserver
MD5 (ffserver) = 2beac612e5efd6ee4a827ae0893ee338
When I look at the man page, it just says it's a builtin function. Really useful :)
It does work (or at least exists) on Linux and on Mac OS X.
hash isn't actually your history; it is a bash(1) shell built-in that maintains a hash table of recently executed programs:
Bash uses a hash table to remember the full pathnames of executable files (see hash under SHELL BUILTIN COMMANDS below). A full search of the directories in PATH is performed only if the command is not found in the hash table.
(From bash(1).)
The guide you found may have suggested running it just to see which ffmpeg command was going to be executed by the next step; perhaps there is an ffmpeg program supplied by the distribution packaging, and they wanted to make sure the new one would be executed instead of the distro-supplied one if you just typed ffmpeg at the shell.
It seems a stretch, because it would also require having the directory containing the new ffmpeg in the PATH before the distro-provided version, and there's no guarantee of that.
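The effect of the hash table is easiest to see when a newer program appears earlier in PATH after the shell has already remembered an older one. The following bash sketch sets that up in a temporary directory; the hello script and the directory names are made up for the demonstration.

```shell
demo=$(mktemp -d)
mkdir "$demo/first" "$demo/second"

# Initially the command exists only in the later PATH entry.
printf '#!/bin/sh\necho second\n' > "$demo/second/hello"
chmod +x "$demo/second/hello"
PATH="$demo/first:$demo/second:$PATH"

hello > "$demo/out1"    # found in second/, and that location is remembered

# Now install a copy in the earlier PATH entry.
printf '#!/bin/sh\necho first\n' > "$demo/first/hello"
chmod +x "$demo/first/hello"

hello > "$demo/out2"    # still runs second/hello: the remembered path wins
hash -r                 # forget all remembered locations
hello > "$demo/out3"    # PATH is searched again, so first/hello runs

cat "$demo/out1" "$demo/out2" "$demo/out3"   # second, second, first
```

This is the situation a tutorial may be guarding against after installing a freshly built ffmpeg: running hash -r (or hash ffmpeg) makes the shell look the command up again instead of reusing a stale path.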
If you use commands that might not be installed on the system, check for their availability and tell the user what's missing. From Scripting with style:
Example:
NEEDED_COMMANDS="sed awk lsof who"
missing_counter=0
# Intentionally unquoted: the list is split on whitespace.
for needed_command in $NEEDED_COMMANDS; do
  # hash succeeds only if the command can be found in PATH.
  if ! hash "$needed_command" >/dev/null 2>&1; then
    printf "Command not found in PATH: %s\n" "$needed_command" >&2
    ((missing_counter++))
  fi
done
if ((missing_counter > 0)); then
  printf "Minimum %d commands are missing in PATH, aborting\n" "$missing_counter" >&2
  exit 1
fi
