Joining every group of N lines into one with bash

I would like to join every group of N lines in the output of another command using bash.
Are there any standard Linux commands I can use to achieve this?
Example:
./command
46.219464 0.000993
17.951781 0.002545
15.770583 0.002873
87.431820 0.000664
97.380751 0.001921
25.338819 0.007437
Desired output:
46.219464 0.000993 17.951781 0.002545
15.770583 0.002873 87.431820 0.000664
97.380751 0.001921 25.338819 0.007437

If your output has a consistent number of fields, you can use xargs -n N to group N fields per line:
$ ...command... | xargs -n4
46.219464 0.000993 17.951781 0.002545
15.770583 0.002873 87.431820 0.000664
97.380751 0.001921 25.338819 0.007437
From man xargs:
-n max-args, --max-args=max-args
Use at most max-args arguments per command line. Fewer than max-args
arguments will be used if the size (see the -s option) is exceeded,
unless the -x option is given, in which case xargs will exit.
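As a small illustration with made-up data (the printf input is an assumption standing in for ./command): xargs -n counts whitespace-separated fields rather than lines, so joining every 2 lines of 2 fields each means -n4.

```shell
# Hypothetical stand-in for ./command: four lines with two fields each.
sample=$(printf '%s\n' '1 a' '2 b' '3 c' '4 d')

# With no command given, xargs runs echo; -n4 passes four fields per
# invocation, which here corresponds to two input lines per output line.
joined=$(printf '%s\n' "$sample" | xargs -n4)
printf '%s\n' "$joined"
# prints:
# 1 a 2 b
# 3 c 4 d
```

If the number of fields per line varies, the grouping no longer lines up with the input lines, which is why this only suits regular output.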

It seems like you're trying to join every two lines with a tab (\t) as the delimiter. If so, you could try the paste command below:
command | paste -d'\t' - -
If you want a space as the delimiter, use -d' ':
command | paste -d' ' - -
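The same idea extends to any N by adding one "-" per line to be joined. A minimal sketch with made-up input (the numbers 1 to 6 are only sample data):

```shell
# Each "-" tells paste to read one line from stdin, so three dashes
# join every three input lines into one output line.
joined3=$(printf '%s\n' 1 2 3 4 5 6 | paste -d' ' - - -)
printf '%s\n' "$joined3"
# prints:
# 1 2 3
# 4 5 6
```

Unlike xargs -n, this groups by lines, so it still works when lines have a varying number of fields.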

Related

Sed through files without using for loop?

I have a small script which basically generates a menu of all the scripts in my ~/scripts folder and next to each of them displays a sentence describing it, that sentence being the third line within the script commented out. I then plan to pipe this into fzf or dmenu to select it and start editing it or whatever.
1 #!/bin/bash
2
3 # a script to do
So it would look something like this
foo.sh a script to do X
bar.sh a script to do Y
Currently I have it run a for loop over all the files in the scripts folder and then run sed -n 3p on all of them.
for i in $(ls -1 ~/scripts); do
echo -n "$i"
sed -n 3p ~/scripts/"$i"
echo
done | column -t -s '#' | ...
I was wondering if there is a more efficient way of doing this that did not involve a for loop and only used sed. Any help will be appreciated. Thanks!
Instead of a loop that is parsing ls output + sed, you may try this awk command:
awk 'FNR == 3 {
f = FILENAME; sub(/^.*\//, "", f); print f, $0; nextfile
}' ~/scripts/* | column -t -s '#' | ...
Yes there is a more efficient way, but no, it doesn't only use sed. This is probably a silly optimization for your use case though, but it may be worthwhile nonetheless.
The inefficiency is that you're using ls to read the directory and then parsing its output. For large directories, that causes a lot of overhead for keeping that list in memory even though you only traverse it once. It's also not done correctly: consider filenames containing characters that the shell interprets specially, such as spaces.
The more efficient way is to use find in combination with its -exec option, which starts a second program with each found file in turn.
BTW: If you didn't rely on line numbers but maybe a tag to mark the description, you could also use grep -r, which avoids an additional process per file altogether.
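A rough sketch of the find -exec approach, under some assumptions: the temporary directory and the two sample scripts are made up for the demonstration, and -maxdepth is a GNU/BSD find extension rather than POSIX.

```shell
# Hypothetical sample data standing in for ~/scripts.
scripts_dir=$(mktemp -d)
printf '%s\n' '#!/bin/bash' '' '# a script to do X' > "$scripts_dir/foo.sh"
printf '%s\n' '#!/bin/bash' '' '# a script to do Y' > "$scripts_dir/bar.sh"

# For each regular file, print its base name followed by its third line.
# The filename is passed as $1 to the inline sh script, never re-parsed.
menu=$(find "$scripts_dir" -maxdepth 1 -type f -exec sh -c '
    printf "%s " "${1##*/}"
    sed -n 3p "$1"
' _ {} \; | sort)
printf '%s\n' "$menu"

rm -rf "$scripts_dir"
```

Because find hands each filename over as an argument, names with spaces or other special characters are handled safely. The result can be piped into column -t -s '#' as in the question.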
This might work for you (GNU sed):
sed -sn '1h;3{H;g;s/\n/ /p}' ~/scripts/*
Use the -s option to reset the line number addresses for each file.
Copy line 1 to the hold space.
Append line 3 to the hold space.
Copy the hold space into the pattern space.
Replace the newline with a space and print the result.
All files in the directory ~/scripts will be processed.
N.B. You may wish to replace the space delimiter by a tab or pipe the results to the column command.

Extracting output to use for a variable in bash

After running an fdisk -l, I want to be able to extract the volume group name to input later in a bash script. Example below:
fdisk -l | grep /dev/mapper/vg_palpatine
Disk /dev/mapper/vg_palpatine-lv_root: 105.1 GB
vgextend /dev/vg_$vg_name $partitioned_drive
I want to extract palpatine out of the fdisk -l output and input it as my $vg_name variable. I've been suggested a few different solutions, like sed, cut and awk, but I don't have much experience with those. How would I go about doing this?
We can replace the whole input string with just the volume name. To do this, we ask sed to substitute (s///) the text matched by the regex with the second expression:
vg_name=$(fdisk -l | grep /dev/mapper/vg_palpatine | sed -rn 's/.*vg_(.+)-lv_root.*/\1/p')
Regex explanation:
.* -> Any number of characters.
vg_(.+)-lv_root -> Capture the volume name as group 1.
We use the .* to match the whole input string, then we use group 1 (\1), which captures the volume name only, as substitution.
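Note that if you want to use $vg_name in a different script, you need to export it first, and that script must be started from this shell, since exported variables are only inherited by child processes.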
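To try the substitution without running fdisk, here is a minimal sketch that feeds the sample line from the question to sed. It uses -E instead of -r, which both GNU and BSD sed accept; the variable name line is only for illustration.

```shell
# Sample line copied from the question, standing in for real fdisk -l output.
line='Disk /dev/mapper/vg_palpatine-lv_root: 105.1 GB'

# Replace the whole line with capture group 1 (the text between vg_ and -lv_root)
# and print only lines where the substitution succeeded.
vg_name=$(printf '%s\n' "$line" | sed -En 's/.*vg_(.+)-lv_root.*/\1/p')
printf '%s\n' "$vg_name"
# prints: palpatine
```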
You can use this:
vg_name=$(fdisk -l | grep /dev/mapper/vg_palpatine | sed -n "s/.*\/\(.*\)-.*/\1/p")
vgextend /dev/$vg_name $partitioned_drive

Pass Every Line of Input as stdin for Invocation of Utility

I have a file containing valid xmls (one per line) and I want to execute a utility (xpath) on each line one by one.
I tried xargs but it doesn't seem to have an option to pass the line as stdin :-
% cat <xmls-file> | xargs -p -t -L1 xpath -p "//Path/to/node"
Cannot open file '//Path/to/node' at /System/Library/Perl/Extras/5.12/XML/XPath.pm line 53.
I also tried parallel --spreadstdin but that doesn't seem to work either :-
% cat <xmls-file> | parallel --spreadstdin xpath -p "//Path/to/node"
junk after document element at line 2, column 0, byte 1607
If you want every line of a file to be split off and made stdin for a utility,
you could use a while loop in the bash shell:
cat xmls-file | while IFS= read -r f
do ( echo "$f" > /tmp/input$$;
xpath -p "//Path/to/node" </tmp/input$$
rm -f /tmp/input$$
);
done
The $$ expands to the shell's process ID, giving the temporary file a unique name.
I assume xmls-file contains, on each line, what you want iterated into $f and that you want this as stdin for a command line, not as a parameter to the command.
On the other hand, your specification may be incorrect and maybe instead you need each line
to be part of a command. In that case, delete the echo and rm lines, and change the xpath command to include $f wherever the line from the file is needed.
I've not done much XML so the do command may need to be edited.
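A variant of the loop that pipes each line straight into the utility, without a temporary file, might look like the sketch below. The sample file and the use of grep -o as a stand-in for xpath are assumptions made so the example runs anywhere.

```shell
# Hypothetical sample input: one small XML document per line.
input=$(mktemp)
printf '%s\n' '<a><b>1</b></a>' '<a><b>22</b></a>' > "$input"

# Start one process per line and give it that line on stdin.
# grep -o stands in for: xpath -p "//Path/to/node"
results=$(
    while IFS= read -r line; do
        printf '%s\n' "$line" | grep -o '<b>[^<]*</b>'
    done < "$input"
)
printf '%s\n' "$results"

rm -f "$input"
```

IFS= and -r keep leading whitespace and backslashes in each line intact.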
You are very close with the GNU Parallel version; only -n1 is missing:
cat <xmls-file> | parallel -n1 --spreadstdin xpath -p "//Path/to/node"

appending file contents as parameter for unix shell command

I'm looking for a unix shell command to append the contents of a file as the parameters of another shell command. For example:
command << commandArguments.txt
xargs was built specifically for this:
cat commandArguments.txt | xargs mycommand
If you have multiple lines in the file, you can use xargs -L1 -P10 to run ten copies of your command at a time, in parallel.
xargs takes its standard input and formats it as positional parameters for a shell command. It was originally meant to deal with short command line limits, but it is useful for other purposes as well.
For example, within the last minute I've used it to connect to 10 servers in parallel and check their uptimes:
echo server{1..10} | tr ' ' '\n' | xargs -n 1 -P 50 -I ^ ssh ^ uptime
Some interesting aspects of this command pipeline:
The names of the servers to connect to were taken from the incoming pipe
The tr is needed to put each name on its own line. This is because, with -I, xargs treats each input line as a single argument.
The -n option controls how many incoming arguments are used per command invocation. -n 1 says make a new ssh process for each one (with -I this is already implied).
By default, the parameters are appended to the end of the command. With -I, one can specify a token (^) that will be replaced with the argument instead.
The -P option controls how many child processes run concurrently, greatly widening the space of interesting possibilities.
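The options discussed above can be compared side by side in a small sketch. The file contents and the use of echo in place of a real command (or ssh) are assumptions so that the example is harmless to run.

```shell
# Hypothetical arguments file, one argument per line.
args_file=$(mktemp)
printf '%s\n' alpha beta gamma > "$args_file"

# Default: all arguments are appended to a single invocation.
all_at_once=$(xargs echo run < "$args_file")

# -n 1: one invocation per argument.
one_per_call=$(xargs -n 1 echo run < "$args_file")

# -I '^': one invocation per line, with ^ replaced by that line.
with_token=$(xargs -I '^' echo ssh '^' uptime < "$args_file")

printf '%s\n' "$all_at_once" "$one_per_call" "$with_token"
rm -f "$args_file"
```

Adding -P to the last two forms would run the invocations concurrently, in which case the order of the output lines is no longer guaranteed.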
command `cat commandArguments.txt`
Using backticks substitutes the output of the enclosed command into the outer command line, where it is split on whitespace into separate arguments.

bash: shortest way to get n-th column of output

Let's say that during your workday you repeatedly encounter the following form of columnized output from some command in bash (in my case from executing svn st in my Rails working directory):
? changes.patch
M app/models/superman.rb
A app/models/superwoman.rb
In order to work with the output of your command - in this case the filenames - some sort of parsing is required so that the second column can be used as input for the next command.
What I've been doing is to use awk to get at the second column, e.g. when I want to remove all files (not that that's a typical use case :), I would do:
svn st | awk '{print $2}' | xargs rm
Since I type this a lot, a natural question is: is there a shorter (thus cooler) way of accomplishing this in bash?
NOTE:
What I am asking is essentially a shell command question even though my concrete example is on my svn workflow. If you feel that workflow is silly and suggest an alternative approach, I probably won't vote you down, but others might, since the question here is really how to get the n-th column command output in bash, in the shortest manner possible. Thanks :)
You can use cut to access the second field:
cut -f2
Edit:
Sorry, didn't realise that SVN doesn't use tabs in its output, so that's a bit useless. You can tailor cut to the output but it's a bit fragile - something like cut -c 10- would work, but the exact value will depend on your setup.
Another option is something like: sed 's/.\s\+//'
To accomplish the same thing as:
svn st | awk '{print $2}' | xargs rm
using only bash you can use:
svn st | while read -r a b; do rm "$b"; done
Granted, it's not shorter, but it's a bit more efficient and it handles whitespace in your filenames correctly.
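To see why the read approach copes with spaces, here is a sketch on made-up status output (the sample lines and variable names are assumptions, and printf replaces rm so nothing is deleted):

```shell
# Hypothetical svn st output; the second path contains a space.
sample='?       changes.patch
M       app/models/my file.rb'

# read puts the first word into status and the rest of the line into path,
# so embedded spaces in the path survive.
paths=$(printf '%s\n' "$sample" | while read -r status path; do
    printf '%s\n' "$path"
done)
printf '%s\n' "$paths"
# prints:
# changes.patch
# app/models/my file.rb
```

With awk '{print $2}' the second line would come out as app/models/my, because awk splits the path on the space as well.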
I found myself in the same situation and ended up adding these aliases to my .profile file:
alias c1="awk '{print \$1}'"
alias c2="awk '{print \$2}'"
alias c3="awk '{print \$3}'"
alias c4="awk '{print \$4}'"
alias c5="awk '{print \$5}'"
alias c6="awk '{print \$6}'"
alias c7="awk '{print \$7}'"
alias c8="awk '{print \$8}'"
alias c9="awk '{print \$9}'"
Which allows me to write things like this:
svn st | c2 | xargs rm
Try zsh. It supports global aliases (alias -g), which are expanded anywhere on the command line, so you can define X in your .zshrc to be
alias -g X="| cut -d' ' -f2"
then you can do:
cat file X
You can take it one step further and define it for the nth column:
alias -g X2="| cut -d' ' -f2"
alias -g X1="| cut -d' ' -f1"
alias -g X3="| cut -d' ' -f3"
which will output the nth column of file "file". You can do this for grep output or less output, too. This is very handy and a killer feature of the zsh.
You can go one step further and define D to be:
alias -g D="|xargs rm"
Now you can type:
cat file X1 D
to delete all files mentioned in the first column of file "file".
If you know the bash, the zsh is not much of a change except for some new features.
HTH Chris
Because you seem to be unfamiliar with scripts, here is an example.
#!/bin/sh
# usage: svn st | x 2 | xargs rm
col=$1
shift
awk -v col="$col" '{print $col}' "${@--}"
If you save this in ~/bin/x and make sure ~/bin is in your PATH (now that is something you can and should put in your .bashrc) you have the shortest possible command for generally extracting column n; x n.
The script should do proper error checking and bail if invoked with a non-numeric argument or the incorrect number of arguments, etc; but expanding on this bare-bones essential version will be in unit 102.
Maybe you will want to extend the script to allow a different column delimiter. Awk by default parses input into fields on whitespace; to use a different delimiter, use -F ':' where : is the new delimiter. Implementing this as an option to the script makes it slightly longer, so I'm leaving that as an exercise for the reader.
Usage
Given a file file:
1 2 3
4 5 6
You can either pass it via stdin (using a useless cat merely as a placeholder for something more useful);
$ cat file | sh script.sh 2
2
5
Or provide it as an argument to the script:
$ sh script.sh 2 file
2
5
Here, sh script.sh is assuming that the script is saved as script.sh in the current directory; if you save it with a more useful name somewhere in your PATH and mark it executable, as in the instructions above, obviously use the useful name instead (and no sh).
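The delimiter option left as an exercise could be sketched roughly as follows. The function name nthcol and the -d option are hypothetical choices for this illustration, not part of the original script.

```shell
# nthcol [-d DELIM] N [FILE...]: print column N of the files, or of stdin.
nthcol() {
    delim=
    if [ "$1" = "-d" ]; then
        delim=$2
        shift 2
    fi
    n=$1
    shift
    # "${@--}" expands to the remaining arguments, or to "-" (stdin) if none.
    if [ -n "$delim" ]; then
        awk -F "$delim" -v col="$n" '{print $col}' "${@--}"
    else
        awk -v col="$n" '{print $col}' "${@--}"
    fi
}

printf '1 2 3\n4 5 6\n' | nthcol 2
printf 'a:b\nc:d\n' | nthcol -d : 2
```

Without -d, awk's default whitespace splitting is kept, so the behavior matches the bare-bones version.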
It looks like you already have a solution. To make things easier, why not just put your command in a bash script (with a short name) and just run that instead of typing out that 'long' command every time?
If you are ok with manually selecting the column, you could be very fast using pick:
svn st | pick | xargs rm
Just go to any cell of the 2nd column, press c and then hit enter
Note that the file path does not have to be in the second column of svn st output. For example, if you modify a file and also modify its properties, it will be in the 3rd column.
See possible output examples in:
svn help st
Example output:
M wc/bar.c
A + wc/qax.c
I suggest cutting off the leading status columns with cut:
svn st | cut -c8- | while read FILE; do echo whatever with "$FILE"; done
If you want to be 100% sure, and deal with fancy filenames with white space at the end for example, you need to parse xml output:
svn st --xml | grep -o 'path=".*"' | sed 's/^path="//; s/"$//'
Of course you may want to use some real XML parser instead of grep/sed.

Resources