Output only specific var from text - bash

Output only a specific var from text. In this case:
echo "ote -C -pname ap01 -HLS 134 -db 1 -instance 43 -log ap01"
I want to get only the value of "-pname".
Expected result:
ap01

-log takes a string, and that string could itself be -pname. The existing solutions so far fail to handle that and treat the value of the -log parameter as the start of another argument.
You'll have to recreate the argument parsing ote performs if you want a robust solution. The following is well on the way to doing that.
echo ... | perl -MGetopt::Long -nle'
local @ARGV = split;
GetOptions(\%args, "C", "pname=s", "HLS=i", "db=i", "instance=i", "log=s")
&& defined($args{pname})
&& print($args{pname});
'
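The same idea can be sketched in pure bash: walk the words and consume a value for every switch known to take one. This is a minimal, hypothetical re-implementation; the list of value-taking switches is an assumption based on the sample line.

```shell
#!/bin/bash
# Sketch: re-create ote's argument parsing, assuming every switch in
# "takes_value" consumes exactly one following argument.
line="ote -C -pname ap01 -HLS 134 -db 1 -instance 43 -log -pname"
read -r -a words <<< "$line"
takes_value=" -pname -HLS -db -instance -log "
i=1   # skip the command name "ote"
while [ "$i" -lt "${#words[@]}" ]; do
  w=${words[i]}
  case "$takes_value" in
    *" $w "*)
      ((i++))                                  # consume the switch's value
      [ "$w" = "-pname" ] && pname=${words[i]}
      ;;
  esac
  ((i++))
done
echo "$pname"   # → ap01 (the trailing "-log -pname" is handled correctly)
```

Because -log's value is consumed as a value, a literal "-pname" after -log no longer fools the parser.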

This will deal with a doubled -pname.
echo "ote -C -pname ap01 -HLS 134 -db 1 -instance 43 -log ap01" |\
perl -ne 'print ( /\s+-pname\s+([^-]\S*)/ )'
As ikegami notes below, if you should happen to want to use dashes as the first character of this value, the only way I know that you can be sure you're getting a value and not another switch is more complicated. One way is to do a negative lookahead for all known switches:
echo "ote -C -pname -pname -ap01- -HLS 134 -db 1 -instance 43 -log ap01" |\
perl -ne 'print ( /\s+-pname\s+(?!-(?:pname|other|known|switches))(\S+)/ )'

echo "ote -C -pname ap01 -HLS 134 -db 1 -instance 43 -log ap01" | \
awk '{for (i = 1; i <= NF; i++) {if ($i == "-pname") { print $(i+1); break; } } }'

echo "ote -C -pname ap01 -HLS 134 -db 1 -instance 43 -log ap01" | \
perl -pe '($_)= /-pname\s+([^-]\S*)/'

grep with look behind?
$ grep -Po '(?<=-pname )[^ ]*' <<< "ote -C -pname ap01 -HLS 134 -db 1 -instance 43 -log ap01"
ap01
As there might be many -pname in the string (see comments below), you can then "play" with head and tail to get the value you want.
Explanation
This uses -P for Perl regex and -o for "print only the matched parts of a matching line".
(?<=-pname ) is a look-behind: match strings that are preceded by -pname (note the space).
[^ ]* matches any run of characters until a space is found.
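For instance, when the flag repeats, head and tail let you pick which occurrence you get (a sketch on a hypothetical line with two -pname switches; GNU grep is required for -P):

```shell
line="ote -C -pname ap01 -db 1 -pname ap02 -log x"
echo "$line" | grep -Po '(?<=-pname )[^ ]*' | head -n 1   # first occurrence: ap01
echo "$line" | grep -Po '(?<=-pname )[^ ]*' | tail -n 1   # last occurrence:  ap02
```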

You can simply use (GNU) grep:
$ echo "ote -C -pname ap01 -HLS 134 -pname foo -db 1 -instance 43 -log ap01" |
grep -Po -- '-pname \K[^ ]+'
ap01
Explanation
The -P enables Perl Compatible Regular Expressions (PCREs), which gives us \K (meaning "discard anything matched up to this point"). The -o means "print only the matched portion of the line". So we then look for the string -pname followed by a space and then as many consecutive non-space characters as possible ([^ ]+). Because of the \K, everything before that is discarded, and because of the -o, only the matched portion is printed.
This will work for an arbitrary number of -pname flags as long as none of their values contain spaces.
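If PCRE grep isn't available, plain bash parameter expansion can do the same single-value extraction. A sketch; it assumes -pname occurs as a whole word and its value contains no spaces:

```shell
line="ote -C -pname ap01 -HLS 134 -db 1 -instance 43 -log ap01"
rest=${line#*-pname }   # drop everything through the first "-pname "
val=${rest%% *}         # keep the first word of what remains
echo "$val"             # → ap01
```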

This looks simple
xargs -n1 | sed -n "/-pname/,//p" | tail -n1
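Spelled out with the sample input, plus a slightly more direct sed variant that prints just the line after the match and quits (a sketch using only POSIX sed features):

```shell
line="ote -C -pname ap01 -HLS 134 -db 1 -instance 43 -log ap01"
# One word per line; the empty // reuses the /-pname/ regex, so the range
# runs to the next -pname (or end of input), and tail keeps the last word.
echo "$line" | xargs -n1 | sed -n "/-pname/,//p" | tail -n1
# More direct: on the -pname line, read the next line, print it, and quit.
echo "$line" | xargs -n1 | sed -n '/-pname/{n;p;q;}'
```

Note the first pipeline works on this input partly by luck: the range extends to the end of the word list, and tail -n1 happens to land on the wanted value.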

Related

Bash - finding substrings in string

I am new to bash. I have experience in Java and Python but no experience in bash, so I'm struggling with the simplest of tasks.
What I want to achieve is to look through the string and find certain substrings, numbers to be exact. But not all numbers, just numbers that are followed by " xyz". For example:
string="Blah blah boom boom 14 xyz foo bar 12 foo boom 55 XyZ hue hue 15 xyzlkj 45hh."
And I want to find numbers:
14 55 and 15
How would I go about that?
You can use grep with lookahead
echo "$string" | grep -i -P -o '[0-9]+(?= xyz)'
Explanation:
-i – ignore case
-P – interpret pattern as a Perl regular expression
-o – print only matching
[0-9]+(?= xyz) – match one or more numbers followed by xyz
For more information see:
https://linux.die.net/man/1/grep
http://www.regular-expressions.info/lookaround.html
https://github.com/tldr-pages/tldr/blob/master/pages/common/grep.md
grep + cut approach (without PCRE):
echo "$string" | grep -io '[0-9]* xyz' | cut -d ' ' -f1
The output:
14
55
15
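Without any external matcher at all, bash's own =~ operator with BASH_REMATCH can collect the same numbers. A sketch; nocasematch makes =~ case-insensitive, mirroring grep -i:

```shell
string="Blah blah boom boom 14 xyz foo bar 12 foo boom 55 XyZ hue hue 15 xyzlkj 45hh."
shopt -s nocasematch        # make the regex match XyZ as well as xyz
rest=$string
nums=()
# Repeatedly match "digits, space, xyz" and continue on the remainder
while [[ $rest =~ ([0-9]+)\ xyz(.*) ]]; do
  nums+=("${BASH_REMATCH[1]}")
  rest=${BASH_REMATCH[2]}
done
echo "${nums[@]}"   # → 14 55 15
```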

How to properly use the grep command to grab and store integers?

I am currently building a bash script for class, and I am trying to use the grep command to grab the values from a simple calculator program and store them in the variables I assign, but I keep receiving a syntax error message when I try to run the script. Any advice on how to fix it? my script looks like this:
#!/bin/bash
addanwser=$(grep -o "num1 + num2" Lab9 -a 5 2)
echo "addanwser"
subanwser=$(grep -o "num1 - num2" Lab9 -s 10 15)
echo "subanwser"
multianwser=$(grep -o "num1 * num2" Lab9 -m 3 10)
echo "multianwser"
divanwser=$(grep -o "num1 / num2" Lab9 -d 100 4)
echo "divanwser"
modanwser=$(grep -o "num1 % num2" Lab9 -r 300 7)
echo "modawser"
You want to grep the output of a command.
grep searches from either a file or standard input. So you can use any of these equivalent forms:
grep X file # 1. from a file
... things ... | grep X # 2. from stdin
grep X <<< "content" # 3. using here-strings
For this case, you want to use the last one, so that you execute the program and its output feeds grep directly:
grep <something> <<< "$(Lab9 -s 10 15)"
Which is the same as saying:
Lab9 -s 10 15 | grep <something>
So that grep will act on the output of your program. Since I don't know how Lab9 works, let's use a simple example with seq, that returns numbers from 5 to 15:
$ grep 5 <<< "$(seq 5 15)"
5
15
grep is usually used for finding matching lines of a text file. To actually grab a part of the matched line other tools such as awk are used.
Assuming the output looks like "num1 + num2 = 54" (i.e. fields are separated by space), this should do your job:
addanwser=$(Lab9 -a 5 2 | awk '{print $NF}')
echo "$addanwser"
Make sure you don't miss the '$' sign before addanwser when echo'ing it.
$NF selects the last field. You may select nth field using $n.
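Since Lab9 is the asker's own program, here is the same pattern with a stand-in command. The "num1 + num2 = 7" output shape is an assumption; echo merely plays the role of Lab9:

```shell
# echo stands in for "Lab9 -a 5 2"; $NF is the last whitespace-separated field
addanwser=$(echo "num1 + num2 = 7" | awk '{print $NF}')
echo "$addanwser"   # → 7
```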

Get the first real number from a series of files

I try to take the first number from each file.dat of the form:
5.01 1 56.413481000 -0.00063400 0.00095770
5.01 2 61.193808800 0.00102170 0.00078280
5.01 3 65.974136600 -0.00108170 0.00102620
5.01 4 70.754464300 0.00082490 0.00103630
and then use this number (5.01) as the title of a .png file.
I use a bash script and I know the command line=$(head -n 1 $f), as found in a question here, but that gives me the whole first line of the file $f.
In this case the spaces in the line are also kept, and the .png file title becomes:
plot 5.01 1 56.413481000 -0.00063400 0.00095770.png
Is there some way to take only 5.01 and get a trim title for the plot?
Thanks to all.
I'd probably just do it with perl:
VAL=$( echo "$line" | perl -pe 's/^[^\d]+//g;s/[^\d\.].*$//' )
Something like that anyway.
Should remove:
anything that isn't a digit from the start of the line;
anything that isn't a digit or . from there to the end of the line.
Or with grep:
grep -o "[0-9]*\.[0-9]*" file.dat | head -1
Edit:
Testing without the head -1 for a oneline input:
echo " 5.01 2 61.193808800 0.00102170 0.00078280" | grep -o "[0-9]*\.[0-9]*"
5.01
61.193808800
0.00102170
0.00078280
Using head -1 will return the first match on the first line.
When you know the match will be on the first line, we can ignore files with an incorrect first line (and avoid grepping through complete files):
Make a two-headed monster:
head -1 file.dat | grep -o "[0-9]*\.[0-9]*" | head -1
To extract the first field, assuming the fields are tab-separated:
val=$(head -n 1 "$f" | cut -f 1)
or, if they are space-separated instead:
val=$(head -n 1 "$f" | cut -f 1 -d ' ')
OR you can avoid calling any extra processes and keep all data manipulation in the bash shell with
while read realNum restOfLine ; do
break
done < "$f"
echo $realNum
This grabs the first "word" and puts the remaining into "restOfLine".
The break ensures that you only read the first line of the file.
IHTH
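Both the cut and the read approaches, run against a throwaway copy of the sample data (a sketch; mktemp keeps it self-contained):

```shell
f=$(mktemp)
printf '5.01 1 56.413481000 -0.00063400 0.00095770\n5.01 2 61.193808800 0.00102170 0.00078280\n' > "$f"
val=$(head -n 1 "$f" | cut -f 1 -d ' ')   # first space-separated field
read -r realNum restOfLine < "$f"         # first word via the shell itself
rm -f "$f"
echo "$val $realNum"   # → 5.01 5.01
```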

Replace range of numbers with certain number

I need to replace a range of numbers with a certain number. I really tried hard to code it myself with sed (like sed "s/[33-64]/64/") or awk, but I always get wrong results: it tends to replace single digits instead of whole numbers. What I need is: replace 0-32 -> 32, 33-64 -> 64, 65-128 -> 128, 129-255 -> 255. In between these numbers are IPs, which should stay untouched. I think this command selects everything but the IPs:
sed '/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/! ... '
So I have a file like this:
65.74.16.161
232
10.128.8.72
63
10.128.14.13
100
10.128.8.58
32
10.128.4.129
60
10.128.240.18
59
and it should look like this:
65.74.16.161
255
10.128.8.72
64
10.128.14.13
128
10.128.8.58
32
10.128.4.129
64
10.128.240.18
64
The [33-64] defines a character class and is a funny way of writing [3-6] and does indeed only match a single character — any single digit from 3, 4, 5 or 6. If you really want to do it with sed, and you're concerned with values from 33 to 64, then you have to write it out differently — and much more verbosely.
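You can see the single-character behaviour directly (a minimal demonstration of the pitfall, not a fix):

```shell
# [33-64] is the character set {3,4,5,6}; s/// replaces only the first such
# character, so "40" becomes "640" rather than being treated as one number
echo "40" | sed 's/[33-64]/64/'   # → 640
```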
In part it depends on which version of sed you have. A solution that will work with classic sed is:
sed -e 's/^[0-9]$/32/' \
-e 's/^[12][0-9]$/32/' \
-e 's/^3[012]$/32/' \
-e 's/^3[3-9]$/64/' \
-e 's/^[45][0-9]$/64/' \
-e 's/^6[0-4]$/64/' \
-e 's/^6[5-9]$/128/' \
-e 's/^[7-9][0-9]$/128/' \
-e 's/^1[01][0-9]$/128/' \
-e 's/^12[0-8]$/128/' \
-e 's/^129$/255/' \
-e 's/^1[3-9][0-9]$/255/' \
-e 's/^2[0-4][0-9]$/255/' \
-e 's/^25[0-5]$/255/'
But, as you can see, it is quite painful. If you have GNU sed, you can use the -r option to enable extended regular expressions; if you have Mac OS X or BSD sed, you can use the -E option to enable extended regular expressions. Then you can reduce the code above to:
sed -E \
-e 's/^([0-9]|[12][0-9]|3[012])$/32/' \
-e 's/^(3[3-9]|[45][0-9]|6[0-4])$/64/' \
-e 's/^(6[5-9]|[7-9][0-9]|1[01][0-9]|12[0-8])$/128/' \
-e 's/^(129|1[3-9][0-9]|2[0-4][0-9]|25[0-5])$/255/'
However, you might do better using awk:
awk '/^[0-9][0-9]*$/ { if ($1 <= 32) print 32
else if ($1 <= 64) print 64
else if ($1 <= 128) print 128
else if ($1 <= 255) print 255
else print $1
next
}
{ print }'
The final else clause passes through unchanged any unexpected values, such as 256 or 999 or, indeed, 123456789. There are those who would write 1 in place of { print } — the part of the awk script that matches and prints the IP addresses.
You can use this awk with some arithmetic:
awk '$1 == ($1+0) && $1<=255{$1 = ($1>128)?255:($1>64?128:32 * int(($1+31)/32))} 1' file
65.74.16.161
255
10.128.8.72
64
10.128.14.13
128
10.128.8.58
32
10.128.4.129
64
10.128.240.18
64
$1 == ($1+0) is a check to determine whether $1 is numeric.
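That numeric test can be checked in isolation (a sketch; non-numeric fields and the dotted IPs fail the comparison, so only the bare numbers are touched):

```shell
# Only genuinely numeric fields compare equal to themselves coerced to a number
printf '63\n10.128.8.72\nfoo\n' | awk '$1 == ($1+0) {print $1}'   # → 63
```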
Using awk:
awk -vFS=. ' NF == 1 { v=2^int((log($1)/log(2))+0.5); $1 = v>255?255:v; }1' input
Gives:
65.74.16.161
255
10.128.8.72
64
10.128.14.13
128
10.128.8.58
32
10.128.4.129
64
10.128.240.18
64

How do I pick random unique lines from a text file in shell?

I have a text file with an unknown number of lines. I need to grab some of those lines at random, but I don't want there to be any risk of repeats.
I tried this:
jot -r 3 1 `wc -l<input.txt` | while read n; do
awk -v n=$n 'NR==n' input.txt
done
But this is ugly, and doesn't protect against repeats.
I also tried this:
awk -vmax=3 'rand() > 0.5 {print;count++} count>max {exit}' input.txt
But that obviously isn't the right approach either, as I'm not guaranteed even to get max lines.
I'm stuck. How do I do this?
This might work for you:
shuf -n3 file
shuf is one of GNU coreutils.
If you have Python accessible (change the 10 to what you'd like):
python -c 'import random, sys; print("".join(random.sample(sys.stdin.readlines(), 10)).rstrip("\n"))' < input.txt
(This will work in Python 2.x and 3.x.)
Also, (again change the 10 to the appropriate value):
sort -R input.txt | head -10
If jot is on your system, then I guess you're running FreeBSD or OSX rather than Linux, so you probably don't have tools like rl or sort -R available.
No worries. I had to do this a while ago. Try this instead:
$ printf 'one\ntwo\nthree\nfour\nfive\n' > input.txt
$ cat rndlines
#!/bin/sh
# default to 3 lines of output
lines="${1:-3}"
# default to "input.txt" as input file
input="${2:-input.txt}"
# First, put a random number at the beginning of each line.
while read line; do
printf '%8d%s\n' $(jot -r 1 1 99999999) "$line"
done < "$input" |
sort -n | # Next, sort by the random number.
sed 's/^.\{8\}//' | # Last, remove the number from the start of each line.
head -n "$lines" # Show our output
$ ./rndlines input.txt
two
one
five
$ ./rndlines input.txt
four
two
three
$
Here's a 1-line example that also inserts the random number a little more cleanly using awk:
$ printf 'one\ntwo\nthree\nfour\nfive\n' | awk 'BEGIN{srand()} {printf("%8d%s\n", rand()*10000000, $0)}' | sort -n | head -n 3 | cut -c9-
Note that different versions of sed (in FreeBSD and OSX) may require the -E option instead of -r to handle ERE instead of BRE dialect in the regular expression, if you want to use that explicitly, though everything I've tested works with escaped bounds in BRE. (Ancient versions of sed (HP/UX, etc.) might not support this notation, but you'd only be using those if you already knew how to do this.)
This should do the trick, at least with bash and assuming your environment has the other commands available:
cat chk.c | while read x; do
echo $RANDOM:$x
done | sort -t: -k1 -n | tail -10 | sed 's/^[0-9]*://'
It basically outputs your file, placing a random number at the start of each line.
Then it sorts on that number, grabs the last 10 lines, and removes that number from them.
Hence, it gives you ten random lines from the file, with no repeats.
For example, here's a transcript of it running three times with that chk.c file:
====
pax$ testprog chk.c
} else {
}
newNode->next = NULL;
colm++;
====
pax$ testprog chk.c
}
arg++;
printf (" [%s] \n", currNode->value);
free (tempNode->value);
====
pax$ testprog chk.c
char tagBuff[101];
}
return ERR_OTHER;
#define ERR_MEM 1
===
pax$ _
sort -Ru filename | head -5
will ensure no duplicates. Not all implementations of sort have the -R option.
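The deduplication is easy to verify by line count (GNU sort assumed, since -R is a GNU extension):

```shell
# With -u, a line that repeats in the input can appear at most once in the output
printf 'a\na\nb\nc\n' | sort -Ru | wc -l   # prints 3
```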
To get N random lines from FILE with Perl:
perl -MList::Util=shuffle -e 'print shuffle <>' FILE | head -N
Here's an answer using ruby if you don't want to install anything else:
cat filename | ruby -e 'puts ARGF.read.split("\n").uniq.shuffle.join("\n")'
for example, given a file (dups.txt) that looks like:
1 2
1 3
2
1 2
3
4
1 3
5
6
6
7
You might get the following output (or some permutation):
cat dups.txt| ruby -e 'puts ARGF.read.split("\n").uniq.shuffle.join("\n")'
4
6
5
1 2
2
3
7
1 3
Further example from the comments:
printf 'test\ntest1\ntest2\n' | ruby -e 'puts ARGF.read.split("\n").uniq.shuffle.join("\n")'
test1
test
test2
Of course if you have a file with repeated lines of test you'll get just one line:
printf 'test\ntest\ntest\n' | ruby -e 'puts ARGF.read.split("\n").uniq.shuffle.join("\n")'
test