Bash substring with pipes and stdin - bash

My goal is to cut the output of a command down to an arbitrary number of characters (let's use 6). I would like to be able to append this command to the end of a pipeline, so it should be able to just use stdin.
echo "1234567890" | your command here
# desired output: 123456
I checked out awk, and I also noticed bash has substring expansion, but both of the solutions I've come up with seem longer than they need to be and I can't shake the feeling I'm missing something easier.
I'll post the two solutions I've found as answers, I welcome any critique as well as new solutions!
Solution found, thank you to all who answered!
It was close between jcollado and Mithrandir - I will probably end up using both in the future. Mithrandir's answer was an actual substring and is easier to view the result, but jcollado's answer lets me pipe it to the clipboard with no EOL character in the way.

Do you want something like this:
echo "1234567890" | cut -b 1-6

What about using head -c/--bytes?
$ echo t9p8uat4ep | head -c 6
t9p8ua

I had come up with:
echo "1234567890" | ( read -r h; echo "${h:0:6}" )
and
echo "1234567890" | awk '{print substr($0,1,6)}'
But both seemed like I was using a sledgehammer to hit a nail.

This might work for you:
printf "%.6s" 1234567890
123456

If your_command_here is cat:
% OUTPUT=t9p8uat4ep
% cat <<<${OUTPUT:0:6}
t9p8ua
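One difference between the answers above that is easy to miss: head -c counts bytes across the whole stream and adds no trailing newline, while cut -b works line by line and keeps each newline. A small sketch to compare the two; the function names are made up for illustration:

```shell
# Hypothetical helper names, only for this comparison.
first_n_bytes()    { head -c "$1"; }     # first N bytes of the whole stream
first_n_per_line() { cut -b "1-$1"; }    # first N bytes of every line

# echo supplies the newline that head -c leaves out
printf '1234567890\nabcdefghij\n' | first_n_bytes 6; echo
printf '1234567890\nabcdefghij\n' | first_n_per_line 6
```

Which one fits probably depends on whether the input is a single line and whether the trailing newline matters, for example when piping to the clipboard.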

Related

Bash show characters if not in string

I am trying out bash, and I am trying to make a simple hangman game now.
Everything is working but I don't understand how to do one thing:
I am showing the user the word with guessed letters (so for example, if the word is hello world and the user guessed the 'l', I show them **ll* ***l* )
I store the letters that the user already tried in var guess
I do that with the following:
echo "${word//[^[:space:]$guess]/*}"
The thing I want to do now is echo the alphabet, but leave out the letters that the user already tried, so in this case show the full alphabet without the L.
I already tried to do it the same way as shown above, but it doesn't quite work.
If you need any more info please let me know.
Thanks,
Tim
You don't show what you tried, but parameter expansion works fine.
$ alphabet=abcdefghijklmnopqrstuvwxyz
$ word="hello world"
$ guesses=aetl
$ echo "${word//[^[:space:]$guesses]/*}"
*ell* ***l*
$ echo "${alphabet//[$guesses]/*}"
*bcd*fghijk*mnopqrs*uvwxyz
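If the guessed letters should disappear from the alphabet instead of turning into stars, the same expansion with an empty replacement seems to do it. A minimal sketch, assuming guesses is not empty and only ever holds plain letters (characters such as ], ^ or - would change the meaning of the bracket expression):

```shell
alphabet=abcdefghijklmnopqrstuvwxyz
guesses=aetl

# An empty replacement deletes every character listed in $guesses.
remaining="${alphabet//[$guesses]/}"
echo "$remaining"
```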
First store both strings in files where they are stored one char per line:
sed 's/./&\n/g' <<< "$guess" | sort > guessfile
sed 's/./&\n/g' <<< "$word" | sort > wordfile
Then we can keep the letters that are present in only one of the files and paste the lines together as a string:
grep -xvf guessfile wordfile | paste -s -d'\0'
And of course we clean up after ourselves:
rm wordfile
rm guessfile
If the output is not correct, try switching arguments in grep (i.e. wordfile guessfile instead of guessfile wordfile).

Trimming pathnames beyond a keyword (awk, sed, ?)

I want to trim a pathname beyond a certain point after finding a keyword. I'm drawing a blank this morning.
/home/quikq/1.0/dev/Java/tsupdater/src/tsupdater.java
I want to find the keyword Java, save the pathname beyond that (tsupdater), then cut everything off after the Java portion.
I don't know if this is what you want, but you can split the pathname into two with:
echo "/home/quikq/1.0/dev/Java/tsupdater/src/tsupdater.java" | sed 'h;s/.*Java//p;g;s/Java.*/Java/'
Which outputs:
/tsupdater/src/tsupdater.java
/home/quikq/1.0/dev/Java
If you would like to save the second part into a file part2.txt and print the first part, you could do:
echo "/home/quikq/1.0/dev/Java/tsupdater/src/tsupdater.java" | sed -e 'h;s/.*Java//;w part2.txt' -e 'g;s/Java.*/Java/'
If you're writing a shell script:
myvar="/home/quikq/1.0/dev/Java/tsupdater/src/tsupdater.java"
part1="${myvar%Java*}Java"
part2="${myvar#*Java/}"
Hope this helps =)
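The same parameter expansions can also pull out just the directory name that follows the keyword, which is what the question seems to ask for. A sketch, assuming the keyword appears as a complete path component; the variable names are mine, and if the keyword is missing the results are not meaningful:

```shell
fullpath="/home/quikq/1.0/dev/Java/tsupdater/src/tsupdater.java"
keyword="Java"   # assumed to appear as a whole path component

base=${fullpath%%/"$keyword"/*}/$keyword   # everything up to and including the keyword
rest=${fullpath#*/"$keyword"/}             # everything after "keyword/"
name=${rest%%/*}                           # first component after the keyword

printf '%s\n' "$base" "$name"
```

Matching on /Java/ with the slashes keeps the lowercase .java extension, or a directory such as JavaDocs, from being picked up by accident.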
Take the one you need:
kent$ echo "/home/quikq/1.0/dev/Java/tsupdater/src/tsupdater.java"|sed -r 's#(.*Java/[^/]*).*#\1#g'
/home/quikq/1.0/dev/Java/tsupdater
kent$ echo "/home/quikq/1.0/dev/Java/tsupdater/src/tsupdater.java"|sed -r 's#(.*Java).*#\1#g'
/home/quikq/1.0/dev/Java
kent$ echo "/home/quikq/1.0/dev/Java/tsupdater/src/tsupdater.java"|sed -r 's#.*Java/([^/]*).*#\1#g'
tsupdater
I'm not entirely sure what you want as output (please specify more clearly), but this command:
echo "/home/quikq/1.0/dev/Java/tsupdater/src/tsupdater.java" | sed 's/.*Java//'
results in:
/tsupdater/src/tsupdater.java
If you want the preceding part then this command:
echo "/home/quikq/1.0/dev/Java/tsupdater/src/tsupdater.java" | sed 's/Java.*//'
results in:
/home/quikq/1.0/dev/
Like I said, I was having a weird morning, but it dawned on me.
echo /home/quikq/1.0/dev/Java/TSUpdater/src/TSUpdater.java | sed 's/Java.*//g'
Yields
/home/quikq/1.0/dev/
Lots of great tips here for chopping it up different ways though. Thanks a bunch!

BashScripting: Reading out a specific variable

My question is actually rather easy, but I suck at bash scripting and Google was no help either. So here is the problem:
I have an executable that writes me a few variables to stdout. Something like that:
MrFoo:~$ ./someExec
Variable1=5
Another_Weird_Variable=12
VARIABLENAME=42
What I want to do now is to read in a specific one of these variables (I already know its name), store the value and use it to give it as an argument to another executable.
So, a simple call like
./function2 5 # which comes from ./function2 Variable1 from above
I hope you understand the problem and can help me with it
With awk you can do something like this (this is for passing value of 1st variable)
./someExec | awk -F= 'NR==1{system("./function2 " $2)}'
or
awk -F= 'NR==1{system("./function2 " $2)}' <(./someExec)
Easiest way to go is probably to use a combination of shell and perl or ruby. I'll go with perl since it's what I cut my teeth on. :)
someExec.sh
#!/bin/bash
echo Variable1=5
echo Another_Weird_Variable=12
echo VARIABLENAME=42
my_shell_script.sh
#!/bin/bash
myVariable=`./someExec | perl -wlne 'print $1 if /Variable1=(.*)/'`
echo "Now call ./function2 $myVariable"
[EDIT]
Or awk, as Jaypal pointed out 58 seconds before I posted my answer. :) Basically, there are a lot of good solutions. Most importantly, though, make sure you handle both security and error cases properly. In both of the solutions so far, we're assuming that someExec will provide guaranteed well-formed and innocuous output. But, consider if someExec were compromised and instead provided output like:
./someExec
5 ; rm -rf / # Uh oh...
You can use awk like this:
./function2 $(./someExec | awk -F "=" '/Variable1/{print $2}')
which is equivalent to:
./function2 5
If you can make sure someExec's output is safe you can use eval.
eval $(./someExec)
./function2 $Variable1
You can use this very simple and straightforward way:
./someExec | grep "Variable1" | awk -F "=" '{print $2}'
If you want to use only one variable from the output, use the following:
eval "$(./someExec | grep '^Variable1=')"
./function2 "$Variable1"
And if you want to use all the variables from the output, use
eval "$(./someExec)"
./function2 "$<VARIABLE_NAME>"
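Picking up the warning about untrusted output raised earlier in this thread: the value can also be picked out without eval, so nothing the program prints is ever run as code. A minimal sketch; some_exec is a stand-in for ./someExec, and get_var is a made-up helper name:

```shell
# Stand-in for ./someExec so the sketch runs on its own (hypothetical).
some_exec() {
  printf '%s\n' 'Variable1=5' 'Another_Weird_Variable=12' 'VARIABLENAME=42'
}

# get_var NAME: read NAME=VALUE lines on stdin and print the value of the
# first line whose name matches exactly. Returns non-zero if none matches.
get_var() {
  wanted=$1
  while IFS='=' read -r name value; do
    if [ "$name" = "$wanted" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done
  return 1
}

if value=$(some_exec | get_var Variable1); then
  # In a real call, "$value" would be passed quoted as a single argument.
  echo "would run: ./function2 $value"
fi
```

The exact name comparison also avoids the case where a pattern like /Variable1/ matches Variable10 as well.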

bash: shortest way to get n-th column of output

Let's say that during your workday you repeatedly encounter the following form of columnized output from some command in bash (in my case from executing svn st in my Rails working directory):
? changes.patch
M app/models/superman.rb
A app/models/superwoman.rb
in order to work with the output of your command - in this case the filenames - some sort of parsing is required so that the second column can be used as input for the next command.
What I've been doing is to use awk to get at the second column, e.g. when I want to remove all files (not that that's a typical usecase :), I would do:
svn st | awk '{print $2}' | xargs rm
Since I type this a lot, a natural question is: is there a shorter (thus cooler) way of accomplishing this in bash?
NOTE:
What I am asking is essentially a shell command question even though my concrete example is on my svn workflow. If you feel that workflow is silly and suggest an alternative approach, I probably won't vote you down, but others might, since the question here is really how to get the n-th column command output in bash, in the shortest manner possible. Thanks :)
You can use cut to access the second field:
cut -f2
Edit:
Sorry, didn't realise that SVN doesn't use tabs in its output, so that's a bit useless. You can tailor cut to the output but it's a bit fragile - something like cut -c 10- would work, but the exact value will depend on your setup.
Another option is something like: sed 's/.\s\+//'
To accomplish the same thing as:
svn st | awk '{print $2}' | xargs rm
using only bash you can use:
svn st | while read -r a b; do rm "$b"; done
Granted, it's not shorter, but it's a bit more efficient and it handles whitespace in your filenames correctly.
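To see why the read loop copes with spaces while the awk version does not, here is a small sketch. fake_status is a made-up stand-in for svn st, with one filename that contains a space:

```shell
# Stand-in for `svn st` output (hypothetical), with a space in one filename.
fake_status() {
  printf '%s\n' '?       changes.patch' 'M       app/models/super man.rb'
}

# The last variable given to read receives the rest of the line,
# so the embedded space survives.
fake_status | while read -r status file; do
  printf '[%s]\n' "$file"
done

# awk splits on every run of whitespace, so the second name is cut short.
fake_status | awk '{print $2}'
```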
I found myself in the same situation and ended up adding these aliases to my .profile file:
alias c1="awk '{print \$1}'"
alias c2="awk '{print \$2}'"
alias c3="awk '{print \$3}'"
alias c4="awk '{print \$4}'"
alias c5="awk '{print \$5}'"
alias c6="awk '{print \$6}'"
alias c7="awk '{print \$7}'"
alias c8="awk '{print \$8}'"
alias c9="awk '{print \$9}'"
Which allows me to write things like this:
svn st | c2 | xargs rm
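The nine aliases could also be generated in a loop. A sketch using shell functions instead of aliases (my substitution, since functions also work inside scripts, where bash does not expand aliases by default):

```shell
# Define c1 ... c9; each prints the matching whitespace-separated column.
for i in 1 2 3 4 5 6 7 8 9; do
  # eval is fed only the loop counter here, never external input.
  eval "c$i() { awk '{print \$$i}'; }"
done

echo 'one two three' | c2
```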
Try the zsh. It supports global aliases (alias -g), so you can define X in your .zshrc to be
alias -g X="| cut -d' ' -f2"
then you can do:
cat file X
You can take it one step further and define it for the nth column:
alias -g X2="| cut -d' ' -f2"
alias -g X1="| cut -d' ' -f1"
alias -g X3="| cut -d' ' -f3"
which will output the nth column of file "file". You can do this for grep output or less output, too. This is very handy and a killer feature of the zsh.
You can go one step further and define D to be:
alias -g D="|xargs rm"
Now you can type:
cat file X1 D
to delete all files mentioned in the first column of file "file".
If you know the bash, the zsh is not much of a change except for some new features.
HTH Chris
Because you seem to be unfamiliar with scripts, here is an example.
#!/bin/sh
# usage: svn st | x 2 | xargs rm
col=$1
shift
awk -v col="$col" '{print $col}' "${@--}"
If you save this in ~/bin/x and make sure ~/bin is in your PATH (now that is something you can and should put in your .bashrc) you have the shortest possible command for generally extracting column n; x n.
The script should do proper error checking and bail if invoked with a non-numeric argument or the incorrect number of arguments, etc; but expanding on this bare-bones essential version will be in unit 102.
Maybe you will want to extend the script to allow a different column delimiter. Awk by default parses input into fields on whitespace; to use a different delimiter, use -F ':' where : is the new delimiter. Implementing this as an option to the script makes it slightly longer, so I'm leaving that as an exercise for the reader.
Usage
Given a file file:
1 2 3
4 5 6
You can either pass it via stdin (using a useless cat merely as a placeholder for something more useful);
$ cat file | sh script.sh 2
2
5
Or provide it as an argument to the script:
$ sh script.sh 2 file
2
5
Here, sh script.sh is assuming that the script is saved as script.sh in the current directory; if you save it with a more useful name somewhere in your PATH and mark it executable, as in the instructions above, obviously use the useful name instead (and no sh).
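For what it's worth, the two extensions mentioned above (argument checking and an optional delimiter) might look roughly like this. It is written as a function so it can be pasted into a shell to try out; the name nthcol and the argument order are assumptions of this sketch, not part of the script above:

```shell
# nthcol N [SEP]: print column N of stdin; SEP is an optional field separator.
nthcol() {
  case $1 in
    ''|*[!0-9]*) echo "usage: nthcol N [SEP]" >&2; return 2 ;;
  esac
  if [ -n "${2-}" ]; then
    awk -F "$2" -v col="$1" '{print $col}'
  else
    awk -v col="$1" '{print $col}'
  fi
}

printf '1 2 3\n4 5 6\n' | nthcol 2
printf 'a:b:c\n' | nthcol 3 :
```

Unlike the script, this version reads only stdin and does not accept file arguments.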
It looks like you already have a solution. To make things easier, why not just put your command in a bash script (with a short name) and just run that instead of typing out that 'long' command every time?
If you are ok with manually selecting the column, you could be very fast using pick:
svn st | pick | xargs rm
Just go to any cell of the 2nd column, press c and then hit enter
Note that the file path does not have to be in the second column of svn st output. For example, if you modify a file and also modify its property, it will be in the 3rd column.
See possible output examples in:
svn help st
Example output:
M wc/bar.c
A + wc/qax.c
I suggest cutting off the leading status columns and keeping everything from character 8 onward:
svn st | cut -c8- | while read FILE; do echo whatever with "$FILE"; done
If you want to be 100% sure, and deal with fancy filenames with white space at the end for example, you need to parse xml output:
svn st --xml | grep -o 'path=".*"' | sed 's/^path="//; s/"$//'
Of course you may want to use some real XML parser instead of grep/sed.

Copying part of a large file using command line

I've a text file with 2 million lines. Each line has some transaction information.
e.g.
23848923748, sample text, feild2 , 12/12/2008
etc
What I want to do is create a new file from a certain unique transaction number onwards. So I want to split the file at the line where this number exists.
How can I do this from the command line?
I can find the line by doing this:
cat myfile.txt | grep 23423423423
use sed like this
sed '/23423423423/,$!d' myfile.txt
Just confirm that the unique transaction number cannot appear as a pattern in some other part of the line (especially, before the correctly matching line) in your file.
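If the number might also show up in another field, one way around that caveat is to compare only the first comma-separated field. A sketch with awk (swapped in for sed here); the function name and the sample data are made up:

```shell
# from_id ID FILE: print FILE from the first line whose first
# comma-separated field is exactly ID.
from_id() {
  awk -F',' -v id="$1" '
    !found && ($1 "") == (id "") { found = 1 }   # "" forces a string comparison
    found                                        # print once that line was seen
  ' "$2"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
111, a, b, 01/01/2008
222, ref 333, c, 02/01/2008
333, d, e, 03/01/2008
444, f, g, 04/01/2008
EOF

from_id 333 "$sample"   # starts at the third line, not at "ref 333"
rm -f "$sample"
```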
There is already a 'perl' answer here, so, i'll give one more AWK way :-)
awk 'BEGIN{skip=1} /number/ {skip=0} {if (skip!=1) print $0}' myfile.txt
On a random file in my tmp directory, this is how I output everything from the line matching popd onwards in a file named tmp.sh:
tail -n+`grep -m 1 -n popd tmp.sh | cut -f 1 -d:` tmp.sh
tail -n+X prints from line X onwards; grep -n outputs lineno:matching line, cut extracts just lineno from that, and -m 1 makes grep stop at the first match so only one line number is produced.
So for your case it would be:
tail -n+`grep -m 1 -n 23423423423 myfile.txt | cut -f 1 -d:` myfile.txt
And it should indeed match from the first occurrence onwards.
It's not a pretty solution, but how about using -A parameter of grep?
Like this:
mc@zolty:/tmp$ cat a
1
2
3
4
5
6
7
mc@zolty:/tmp$ cat a | grep 3 -A1000000
3
4
5
6
7
The only problem I see in this solution is the 1000000 magic number. Probably someone will know the answer without using such a trick.
You can probably get the line number using Grep and then use Tail to print the file from that point into your output file.
Sorry I don't have actual code to show, but hopefully the idea is clear.
I would write a quick Perl script, frankly. It's invaluable for anything like this (relatively simple issues) and as soon as something more complex rears its head (as it will do!) then you'll need the extra power.
Something like:
#!/usr/bin/perl
use strict;
use warnings;

my $out = 0;
while (<STDIN>) {
    # Start printing at the first line that contains the transaction number
    $out = 1 if /23423423423/;
    print $_ if $out;
}
and run it using:
$ perl mysplit.pl < input > output
Not tested, I'm afraid.

Resources