tar `--to-command` : how to send output to a custom function? - bash

I have a great many tar archives I need to extract files from and perform post-processing on (amongst other things, changing file encodings and applying some sed commands). I'm interested in using tar's --to-command option to apply a custom function which does all of those things in sequence.
Up until now, I have been doing:
tar -xzi --to-command=<line of shell commands>
Unfortunately, the list of commands I need to do has got larger and means it is no longer neat (nor probably sensible) to attempt to do everything on one line.
To neaten things up, I've written a function in another file, test-function.sh, which (tries to) perform those things in sequence:
#!/bin/bash
post_extract() {
<the things I need to do>
}
I realise the above example is incomplete, but my problem at the moment is that I can't get --to-command to find the post_extract function, so I can't even begin testing it.
Where should I put post_extract / what would be the idiomatic way of exposing it to tar's --to-command?

Given the behavior demonstrated in TAR with --to-command (in particular, --to-command='md5sum | ...' resulting in md5sum: |: No such file or directory), it's clear that tar --to-command doesn't invoke a shell, but simply performs shell-like parsing and then attempts an execv()-family invocation of the resulting command. This means the target command needs to be something that can actually be executed -- a script, or similar.
This behavior is (loosely) documented in the Running External Commands section of the GNU tar documentation.
One way around this is to have the thing that's invoked be a shell, and then use an exported function to ship code from your parent shell to that child process; as shown below:
#!/usr/bin/env bash
# ^^^^- MUST be bash, not sh
post_extract() {
: "do stuff here"
}
export -f post_extract
tar --to-command $'bash -c \'post_extract "$@"\' _' ...
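If you would rather not depend on exported functions surviving the trip through tar, the other route mentioned above is a standalone script. Here is a minimal sketch of that, assuming GNU tar (which pipes each member to the command's stdin and sets TAR_FILENAME in its environment). The OUT_DIR variable, the file names, the ISO-8859-1 source encoding and the sed expression are all made up for illustration, and it assumes the temporary directory path contains no spaces.

```shell
#!/usr/bin/env bash
# Sketch only: OUT_DIR, the sample file, the ISO-8859-1 source encoding and
# the sed expression are illustrative assumptions, not from the question.

workdir=$(mktemp -d)
export OUT_DIR="$workdir/out"   # hypothetical variable read by the handler

# The handler script. GNU tar feeds the member's contents on stdin and
# puts the member's name in the TAR_FILENAME environment variable.
cat > "$workdir/post_extract.sh" <<'EOF'
#!/usr/bin/env bash
out="$OUT_DIR/$TAR_FILENAME"
mkdir -p "$(dirname "$out")"
iconv -f ISO-8859-1 -t UTF-8 | sed 's/foo/bar/g' > "$out"
EOF

# Build a tiny archive to try the handler on.
printf 'foo one\n' > "$workdir/sample.txt"
tar -czf "$workdir/demo.tar.gz" -C "$workdir" sample.txt

# Run the handler once per archive member.
tar -xzf "$workdir/demo.tar.gz" --to-command="bash $workdir/post_extract.sh"

result=$(cat "$OUT_DIR/sample.txt")
echo "$result"
```

Because the handler is an ordinary file, you can also test it without tar by piping a file into it with TAR_FILENAME set by hand.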


How do I use SSFT (Shell Scripts Frontend Tool) on Ubuntu (or any Linux)?

I can't find a man page or any help for ssft. I want to use it in my bash scripts to select either kdialog (if on KDE) or zenity (if on gnome).
See Shell Scripts Frontend Tool
Surely the help pages are somewhere, but I must be overlooking them.
I am running Debian 6.0 Squeeze stable right now, and it has a manpage for ssft.sh. Try man ssft.sh. If that doesn't do what you want, let me know and you and I will figure out what does.
Update: All right. You have tried the manpage, which doesn't tell you what you want to know. There does not appear to exist any more thorough documentation for Ssft (maybe, when this is all over, you will write and contribute that very documentation). However, in Ssft's source appears to be a test script that makes the software do the various things it is designed to do. Sometimes, a good example is even better than a manual. That test script may be just what you need.
To extract the test script, run a sequence of commands like the following.
$ cd /tmp
$ apt-get source ssft
$ ls
$ cd ssft-0.9.13 # (Your version number may differ from 0.9.13.)
$ ls
$ cd tests
$ ls
When I do the above, the last ls listing reveals a shell script named ssft-test.sh. Inside that script appear to be several examples of how to use ssft.sh correctly.
http://man.devl.cz/man/1/ssft.sh
ssft.sh(1)
SSFT
Name
ssft.sh - library of shell script frontend functions
Synopsis
. ssft.sh
Description
ssft.sh is a library of shell functions that must be sourced from other scripts. If the script is executed without arguments it prints an usage message and also supports the options --doc, --help and --version.
To get a list of available functions call the script with the --doc argument and to get a description of what a given function does call the script with --doc FUNCTION_NAME.
On the typical case the library must be sourced and the SSFT_FRONTEND variable must be set to the desired frontend (zenity, dialog or text); if the variable is not set the default frontend is noninteractive.
To choose the theorically best looking frontend use the function ssft_choose_frontend as follows:
. ssft.sh
[ -n "$SSFT_FRONTEND" ] || SSFT_FRONTEND="$( ssft_choose_frontend )"
Written by Sergio Talens-Oliag .
$ /usr/bin/ssft.sh
Shell Script Frontend Tool (version 0.9.13)
Usage: . ssft.sh
When called directly the program supports the following options:
-d,--doc [FUNCTIONS] Prints the list of available functions. If function names are given prints functions' documentation.
-h,--help This message
-v,--version File version
functions:
$ /usr/bin/ssft.sh -d
ssft_set_textdomain
ssft_reset_textdomain
ssft_choose_frontend
ssft_print_text_title
ssft_display_message
ssft_display_error
ssft_display_emsg
ssft_file_selection
ssft_directory_selection
ssft_progress_bar
ssft_read_string
ssft_read_password
ssft_select_multiple
ssft_select_single
ssft_yesno
ssft_show_file

Copy a whole bunch of files with their names being changed a little bit using shell script

I have a very large number of files with very similar names: row1col1.txt, row1col2.txt, row1col3.txt, row1col4.txt......
I'd like to make copies of them all and change the names to row2col1.txt, row2col2.txt, row2col3.txt, row2col4.txt......
Using the cp command in shell script, how can I do it efficiently?
How are you going to generate the file names? How are you going to specify the substitution?
One possibility is:
ls row1col*.txt |
sed 's/row1\(.*\)/cp & row2\1/' |
sh -x
This uses ls to generate the list of names, and sed to generate a cp command for each named file, and pipes that to sh so that the copy operations occur. Don't pipe it to sh until you are confident that the rest is right.
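Another possibility, sketched here under the assumption that every source name starts with the literal prefix row1, is a plain loop with parameter expansion. It avoids parsing ls output and generating commands as text. The three sample files and the temporary directory are only there so the sketch runs on its own.

```shell
#!/usr/bin/env bash
# Sketch: copy row1*.txt to row2*.txt using shell parameter expansion.
# The temporary directory and sample files are illustrative only.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch row1col1.txt row1col2.txt row1col3.txt

for f in row1col*.txt; do
  # ${f#row1} strips the leading "row1", leaving e.g. "col1.txt".
  cp -- "$f" "row2${f#row1}"
done

ls
```

To preview what would happen first, put echo in front of cp.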
If you use the program mcp, contained in the package mmv, you can do it like this:
mcp row1\* row2\#1

What is the `< <()` syntax?

I've been using RVM for a while, and every time I just copied and pasted the following command to get it setup:
bash < <(curl -s https://rvm.beginrescueend.com/install/rvm)
It bugs me that I don't fully understand the syntax, and why we need the double <, and the parentheses. Can some one explain this or point me to the right reference?
The first < is input redirection: it feeds the contents of a file into the program as standard input. The second construct, <(), is process substitution: it treats the output of a process like a file. In this case, the effect is that you run the contents of that URL as though it were a bash script, which is very dangerous! If you don't trust the source completely, don't do that. An attacker could use this method to have you run commands that would compromise your system.
Just my 2 cents. Bash's <() construct, as @Daenyth stated, "treats output of a process like a file". This construct can be very useful. Just consider the following:
diff <(ls dir1) <(ls dir2)
This will show the differences between the contents of dir1 and dir2. Using vimdiff instead of diff is even cooler.
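To see why someone would write cmd < <(other) instead of other | cmd, here is a small sketch. The counter variables and the printf input are made up for illustration. With the redirection form the loop runs in the current shell, so its variables are still set afterwards; with a pipe, bash (with default settings) runs the loop in a subshell.

```shell
#!/usr/bin/env bash
# Sketch: "< <(...)" is a plain input redirection whose "file" is a
# process substitution. Names and input are illustrative.

count=0
while read -r line; do
  count=$((count + 1))
done < <(printf 'a\nb\nc\n')
echo "with < <(...): count=$count"

piped=0
printf 'a\nb\nc\n' | while read -r line; do
  piped=$((piped + 1))
done
# The loop above ran in a subshell, so the parent's variable is unchanged.
echo "with a pipe:   piped=$piped"
```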

Diff output from two programs without temporary files

Say I have two programs a and b that I can run with ./a and ./b.
Is it possible to diff their outputs without first writing to temporary files?
Use <(command) to pass one command's output to another program as if it were a file name. Bash pipes the program's output to a pipe and passes a file name like /dev/fd/63 to the outer command.
diff <(./a) <(./b)
Similarly you can use >(command) if you want to pipe something into a command.
This is called "Process Substitution" in Bash's man page.
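A quick way to convince yourself of this is to echo the substitution instead of handing it to a program. This is only a sketch; the printf commands stand in for ./a and ./b, and the exact name printed depends on the system (something like /dev/fd/63 is common on Linux).

```shell
#!/usr/bin/env bash
# Sketch: show what <(...) expands to, then use it with diff.
# The printf commands are stand-ins for ./a and ./b.

path=$(echo <(true))
echo "substitution expanded to: $path"

diff <(printf 'a\nb\n') <(printf 'a\nc\n')
status=$?
# diff exits 0 when the inputs match and 1 when they differ.
echo "diff exit status: $status"
```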
Adding to both the answers, if you want to see a side by side comparison, use vimdiff:
vimdiff <(./a) <(./b)
One option would be to use named pipes (FIFOs):
mkfifo a_fifo b_fifo
./a > a_fifo &
./b > b_fifo &
diff a_fifo b_fifo
... but John Kugelman's solution is much cleaner.
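If you do go the FIFO route, it is worth creating the pipes in a private directory and cleaning them up afterwards. A minimal sketch, where the printf commands stand in for ./a and ./b and the variable names are made up:

```shell
#!/usr/bin/env bash
# Sketch: diff two command outputs via named pipes, with cleanup.
# The printf commands are stand-ins for ./a and ./b.

fifo_dir=$(mktemp -d)
trap 'rm -rf "$fifo_dir"' EXIT   # remove the FIFOs when the script exits

mkfifo "$fifo_dir/a_fifo" "$fifo_dir/b_fifo"

# Each writer blocks until diff opens its FIFO for reading.
printf 'one\ntwo\n' > "$fifo_dir/a_fifo" &
printf 'one\n2\n'   > "$fifo_dir/b_fifo" &

diff "$fifo_dir/a_fifo" "$fifo_dir/b_fifo"
status=$?
wait   # reap the two background writers
echo "diff exit status: $status"
```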
For anyone curious, this is how you perform process substitution in the Fish shell:
Bash:
diff <(./a) <(./b)
Fish:
diff (./a | psub) (./b | psub)
Unfortunately the implementation in fish is currently deficient; fish will either hang or use a temporary file on disk. You also cannot use psub for output from your command.
Adding a little more to the already good answers (helped me!):
The command docker outputs its help to STD_ERR (i.e. file descriptor 2)
I wanted to see if docker attach and docker attach --help gave the same output
$ docker attach
$ docker attach --help
Having just typed those two commands, I did the following:
$ diff <(!-2 2>&1) <(!! 2>&1)
!! is the same as !-1 which means run the command 1 before this one - the last command
!-2 means run the command two before this one
2>&1 means send file_descriptor 2 output (STD_ERR) to the same place as file_descriptor 1 output (STD_OUT)
Hope this has been of some use.
For zsh, using =(command) automatically creates a temporary file and replaces =(command) with the path of that file. With ordinary command substitution, $(command) is instead replaced with the output of the command.
This zsh feature is very useful and can be used like so to compare the output of two commands using a diff tool, for example Beyond Compare:
bcomp =(ulimit -Sa | sort) =(ulimit -Ha | sort)
For Beyond Compare, note that you must use bcomp for the above (instead of bcompare), since bcomp launches the comparison and waits for it to complete. If you use bcompare, it launches the comparison and exits immediately, so the temporary files holding the commands' output disappear.
Read more here: http://zsh.sourceforge.net/Intro/intro_7.html
Also notice this:
Note that the shell creates a temporary file, and deletes it when the command is finished.
and the following, which explains the difference between <(...) and =(...):
If you read zsh's man page, you may notice that <(...) is another form of process substitution which is similar to =(...). There is an important difference between the two. In the <(...) case, the shell creates a named pipe (FIFO) instead of a file. This is better, since it does not fill up the file system; but it does not work in all cases. In fact, if we had replaced =(...) with <(...) in the examples above, all of them would have stopped working except for fgrep -f <(...). You can not edit a pipe, or open it as a mail folder; fgrep, however, has no problem with reading a list of words from a pipe. You may wonder why diff <(foo) bar doesn't work, since foo | diff - bar works; this is because diff creates a temporary file if it notices that one of its arguments is -, and then copies its standard input to the temporary file.

Can a shell script indicate that its lines be loaded into memory initially?

UPDATE: this is a repost of How to make shell scripts robust to source being changed as they run
This is a little thing that bothers me every now and then:
I write a shell script (bash) for a quick and dirty job
I run the script, and it runs for quite a while
While it's running, I edit a few lines in the script, configuring it for a different job
But the first process is still reading the same script file and gets all screwed up.
Apparently, the script is interpreted by loading each line from the file as it is needed. Is there some way that I can have the script indicate to the shell that the entire script file should be read into memory all at once? For example, Perl scripts seem to do this: editing the code file does not affect a process that's currently interpreting it (because it's initially parsed/compiled?).
I understand that there are many ways I could get around this problem. For example, I could try something like:
cat script.sh | sh
or
sh -c "`cat script.sh`"
... although those might not work correctly if the script file is large and there are limits on the size of stream buffers and command-line arguments. I could also write an auxiliary wrapper that copies a script file to a locked temporary file and then executes it, but that doesn't seem very portable.
So I was hoping for the simplest solution that would involve modifications only to the script, not the way in which it is invoked. Can I just add a line or two at the start of the script? I don't know if such a solution exists, but I'm guessing it might make use of the $0 variable...
The best answer I've found is a very slight variation on the solutions offered to How to make shell scripts robust to source being changed as they run. Thanks to camh for noting the repost!
#!/bin/sh
{
# Your stuff goes here
exit
}
This ensures that all of your code is parsed initially; note that the 'exit' is critical to ensuring that the file isn't accessed later to see if there are additional lines to interpret. Also, as noted on the previous post, this isn't a guarantee that other scripts called by your script will be safe.
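Here is a small sketch you can run to watch this work. It generates a brace-wrapped script that appends a new command to its own file while running, which simulates an edit made mid-run. The temporary file and the INJECTED marker are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: a brace-wrapped script ignores text appended to it mid-run.
# The temporary file and the INJECTED marker are illustrative.

script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/sh
{
  # Simulate an edit while running: append a command to this very file.
  echo 'echo INJECTED' >> "$0"
  echo "finished"
  exit
}
EOF

out=$(sh "$script")
echo "$out"
rm -f "$script"
```

The whole { ... } group is parsed before any of it runs, and exit stops the shell before it goes back to the file, so the appended line is never executed.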
Thanks everyone for the help!
Use an editor that doesn't modify the existing file, and instead creates a new file then replaces the old file. For example, using :set writebackup backupcopy=no in Vim.
How about a solution to how you edit it.
If the script is running, before editing it, do this:
mv script script-old
cp script-old script
rm script-old
Since the shell keeps the file open, everything will work okay as long as you don't change the contents of the open inode.
The above works because mv will preserve the old inode while cp will create a new one. Since a file's contents will not actually be removed if it is opened, you can remove it right away and it will be cleaned up once the shell closes the file.
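You can check the inode behaviour yourself with ls -i. A minimal sketch in a scratch directory; the inode_of helper is a made-up convenience function, not a standard command:

```shell
#!/usr/bin/env bash
# Sketch: mv keeps the inode, cp allocates a new one.
# inode_of is a hypothetical helper defined here for the demo.

dir=$(mktemp -d)
cd "$dir" || exit 1
echo 'echo original' > script

inode_of() { ls -i "$1" | awk '{print $1}'; }

before=$(inode_of script)
mv script script-old          # same inode, new name
moved=$(inode_of script-old)
cp script-old script          # brand-new inode under the old name
after=$(inode_of script)
rm script-old                 # a running shell would still hold this inode open

echo "before=$before moved=$moved after=$after"
```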
According to the bash documentation if instead of
#!/bin/bash
body of script
you try
#!/bin/bash
script=$(cat <<'SETVAR'
body of script
SETVAR
)
eval "$script"
then I think you will be in business.
Consider creating a new bang path for your quick-and-dirty jobs. If you start your scripts with:
#!/usr/local/fastbash
or something, then you can write a fastbash wrapper that uses one of the methods you mentioned. For portability, one can just create a symlink from fastbash to bash, or have a comment in the script saying one can replace fastbash with bash.
If you use Emacs, try M-x customize-variable break-hardlink-on-save. Setting this variable will tell Emacs to write to a temp file and then rename the temp file over the original instead of editing the original file directly. This should allow the running instance to keep its unmodified version while you save the new version.
Presumably, other semi-intelligent editors would have similar options.
A self contained way to make a script resistant to this problem is to have the script copy and re-execute itself like this:
#!/bin/bash
if [[ $0 != /tmp/copy-* ]] ; then
rm -f "/tmp/copy-$$"
cp "$0" "/tmp/copy-$$"
exec "/tmp/copy-$$" "$@"
echo "error copying and execing script"
exit 1
fi
rm -- "$0"
# rest of script...
(This will not work if the path used to invoke the original script begins with the characters /tmp/copy-)
(This is inspired by R Samuel Klatchko's answer)
