How do I split "abcd efgh" into "abcd" and "efgh"? - bash

This is really self explanatory. I'm working in a bash shell and I'm really new to shell scripting. I've found a lot of information about using tr and sed but all the examples I have found so far are removing delimiters and new lines. I really want to do the opposite of that. I want to be able to separate based on a blank space. I have a string like "abcd efgh" and I need it to be "abcd" "efgh" (all without quotes, just to show groupings).
I'm sure this is much simpler than I'm making it, but I'm very confused.
Updated Question:
I have a column of PIDs that I have put into an array, but each element of the array has both the pids in the column.
Column:
1234
5678
when I print out the entire array, all the different columns have been added so I have all the values, but when I print out a single element of my array I get something like:
1234 5678
which is not what I want.
I need to have an element for 1234 and a separate one for 5678.
This is my code so far:
#!/bin/bash
echo "Enter the File Name"
read ips
index=0
IFS=' '
while read myaddr myname; do
myips[$index]="$myaddr"
names[$index]="$myname"
index=$(($index+1))
done < $ips
echo "my IPs are: ${myips[*]}"
echo "the corresponding names are: ${names[*]}"
echo "Total IPs in the file: ${index}"
ind=0
for i in "${myips[@]}"
do
echo $i
pids=( $(jps | awk '{print $1}') )
for pid in "${pids[@]}"; do
echo $pid
done
echo "my PIDs are: ${pids}"
for j in "${pids[@]}"
do
mypids[$ind]="$j"
ind=$(($ind+1))
done
done
echo "${mypids[*]}"
echo "The 3rd PID is: ${mypids[2]}"
SAMPLE OUTPUT:
Total IPs in the file: 6
xxx.xxx.xxx.xxx
5504
1268
1
xxx.xxx.xxx.xxx
5504
4352
1
xxx.xxx.xxx.xxx
5504
4340
1
5504
1268 5504
4352 5504
4340
The 3rd pid is: 5504
4340
I need each pid to be separate, so that each element of the array, is a single pid. So for instance, the line "The 3rd pid is: " needs to look something like
The 3rd pid is: 5504
and the 4th element would be 4340

Try cut:
$ echo "abcd efgh" | cut -d" " -f1
abcd
$ echo "abcd efgh" | cut -d" " -f2
efgh
Alternatively, if at some point you want to do something more complex, do look into awk as well:
$ echo "abcd efgh" | awk '{print $1}'
abcd
$ echo "abcd efgh" | awk '{print $2}'
efgh
To address your updated question:
I have a column of PIDs that I have put into an array, but each element of the array has both the pids in the column.
If you want to load a column of data into an array, you could do something like this:
$ pgrep sshd # example command. Get pid of all sshd processes
795
32046
32225
$ A=(`pgrep sshd`) # store output of command in array A
$ echo ${A[0]} # print first value
795
$ echo ${A[1]} # print second value
32046
To address the example code you posted, the reason for your problem is that you've changed $IFS to a space (IFS=' '), which means that your columns, which are separated by newlines, are no longer being split.
Consider this example:
$ A=(`pgrep sshd`)
$ echo ${A[0]} # works as expected
795
$ IFS=' ' # change IFS to space only
$ A=(`pgrep sshd`)
$ echo ${A[0]} # newlines no longer used as separator
795
32046
32225
To avoid this problem, a common approach is to always back up the original IFS and restore it once you're done using the updated value. E.g.
# backup original IFS
OLDIFS=$IFS
IFS=' '
# .. do stuff ...
# restore after use
IFS=$OLDIFS
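If the goal is only to split one string, a possible alternative is to set IFS for the single read command, so nothing needs to be backed up or restored. This is a minimal sketch that assumes bash; the names str and parts are made up for illustration:

```shell
str="abcd efgh"

# IFS=' ' applies only to this read; the shell's global IFS is untouched.
# -r keeps backslashes literal, -a stores the fields in an array.
IFS=' ' read -r -a parts <<< "$str"

echo "first:  ${parts[0]}"
echo "second: ${parts[1]}"
echo "count:  ${#parts[@]}"
```

Because the assignment is a prefix of the read command, later commands in the script still see the original IFS.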

Sample file:
abcd efgh
bla blue
Using awk you can do the following
cat file.txt | awk '{print $1}'
This will output the following
abcd
bla
or
cat file.txt | awk '{print $2}'
This will output the following
efgh
blue
Awk is a really powerful command; I suggest you learn it as soon as you can. It will save you lots of headaches in bash scripting.

The other solutions are pretty good. I use cut often. However, I just wanted to add that if you always want to split on whitespace then xargs will do that for you. Then the command line version of printf can format the arguments (if reordering of strings is desired use awk as in the other solution). Here is an example for reference:
$ MYSTR="hello big world"
$ echo $MYSTR |xargs printf "%s : %s > %s\n"
hello : big > world

The read command handles input as entire lines (unless a delimiter is set with -d):
$ echo "abcd efgh" | while read item
do
echo $item
# Do something with item
done
abcd efgh
If you want to pipe each item to a command, you can do this:
echo "abcd efgh" | tr ' ' '\n' | while read item
do
echo $item
# Do something with item
done
abcd
efgh
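With the delimiter option, the tr step can be dropped. A sketch, assuming bash (read -d is not POSIX; the array name items is only for illustration):

```shell
items=()

# printf appends a trailing space so the last word is also terminated
# by the delimiter; read -d ' ' then returns one word per call.
while IFS= read -r -d ' ' item; do
    items+=("$item")
done < <(printf '%s ' "abcd efgh")

printf '%s\n' "${items[@]}"
```

Feeding the loop through process substitution instead of a pipe keeps it in the current shell, so items is still populated after the loop ends.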

No need to use external commands to split strings into words. The set built-in does just that:
string="abcd efgh"
set $string
# Now $1 is "abcd" and $2 is "efgh"
echo $1
echo $2
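One caveat: the unquoted $string is also subject to filename expansion, and a value starting with a dash would be read as options to set. A hedged sketch of a more defensive variant (plain POSIX sh; the sample value and variable names are illustrative):

```shell
string="-n abcd *"

set -f            # turn off globbing so "*" stays a literal word
set -- $string    # "--" stops set from treating "-n" as an option
set +f            # turn globbing back on

first=$1
second=$2
third=$3
count=$#
echo "$count words: $first / $second / $third"
```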

There is no difference between the string "abcd efgh" and the string "abcd" "efgh" other than, if passed as argument to a program, the first will be read as one argument where the second will be two arguments.
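The double quotes " are not part of the string; they only tell the shell to treat the enclosed text as a single word (no word splitting or globbing), just as the single quotes do (more aggressively, though, since they also suppress variable expansion).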
Now, you could have a string '"abcd efgh"' which you would like to transform into '"abcd" "efgh"', which you could do with sed 's/ /" "/' but that's probably not what you want.

Related

Correctly count number of lines in a bash variable

I need to count the number of lines of a given variable. For example I need to find how many lines VAR has, where VAR=$(git log -n 10 --format="%s").
I tried echo "$VAR" | wc -l, which does work, but if VAR is empty it prints 1, which is wrong. Is there a workaround for this? Something better than using an if clause to check whether the variable is empty... (maybe add a line and subtract 1 from the returned value?).
The wc counts the number of newline chars. You can use grep -c '^' for counting lines.
You can see the difference with:
#!/bin/bash
count_it() {
echo "Variablie contains $2: ==>$1<=="
echo -n 'grep:'; echo -n "$1" | grep -c '^'
echo -n 'wc :'; echo -n "$1" | wc -l
echo
}
VAR=''
count_it "$VAR" "empty variable"
VAR='one line'
count_it "$VAR" "one line without \n at the end"
VAR='line1
'
count_it "$VAR" "one line with \n at the end"
VAR='line1
line2'
count_it "$VAR" "two lines without \n at the end"
VAR='line1
line2
'
count_it "$VAR" "two lines with \n at the end"
which produces:
Variablie contains empty variable: ==><==
grep:0
wc : 0
Variablie contains one line without \n at the end: ==>one line<==
grep:1
wc : 0
Variablie contains one line with \n at the end: ==>line1
<==
grep:1
wc : 1
Variablie contains two lines without \n at the end: ==>line1
line2<==
grep:2
wc : 1
Variablie contains two lines with \n at the end: ==>line1
line2
<==
grep:2
wc : 2
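Wrapped in a function, the grep approach might look like this. It is only a sketch: count_lines is a made-up name, and printf '%s' is used instead of echo -n because echo -n is not portable across shells.

```shell
count_lines() {
    # printf '%s' prints the value without adding a newline, so an empty
    # variable produces no input at all and grep reports 0.
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    printf '%s' "$1" | grep -c '^' || true
}

count_lines ''
count_lines 'one line'
count_lines 'line1
line2
'
```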
You can always write it conditionally:
[ -n "$VAR" ] && echo "$VAR" | wc -l || echo 0
This will check whether $VAR has contents and act accordingly.
For a pure bash solution: instead of putting the output of the git command into a variable (which, arguably, is ugly), put it in an array, one line per field:
mapfile -t ary < <(git log -n 10 --format="%s")
Then you only need to count the number of fields in the array ary:
echo "${#ary[@]}"
This design will also make your life simpler if, e.g., you need to retrieve the 5th commit message:
echo "${ary[4]}"
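If the text is already in a variable, the same idea still works. A sketch assuming bash 4 or later (mapfile does not exist in older versions); VAR here is a stand-in for the git log output:

```shell
VAR='first commit
second commit
third commit'

# Feed the variable through printf '%s' rather than a here-string:
# <<< always appends a newline, which would turn an empty VAR into
# one empty element instead of zero elements.
mapfile -t ary < <(printf '%s' "$VAR")
echo "${#ary[@]}"

EMPTY=''
mapfile -t none < <(printf '%s' "$EMPTY")
echo "${#none[@]}"
```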
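Try (printf '%s' is used rather than echo, because echo always emits a newline and an empty variable would still be counted as one line):
printf '%s' "$VAR" | grep ^ | wc -l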

How to add multiple line of output one by one to a variable in Bash?

This might be a very basic question but I was not able to find solution. I have a script:
If I run w | awk '{print $1}' in command line in my server I get:
f931
smk591
sc271
bx972
gaw844
mbihk988
laid640
smk59
ycc951
Now I need to use this list in my bash script, one entry at a time, and perform some operations on each. I need to check each user's groups and print the users that are in a specific group. The command to check a user's groups is id username. How can I save them or iterate through them one by one in a loop?
What I have so far is
tmp=$(w | awk '{print $1}')
But it only returns the first record! I'd appreciate any help.
Populate an array with the output of the command:
$ tmp=( $(printf "a\nb\nc\n") )
$ echo "${tmp[0]}"
a
$ echo "${tmp[1]}"
b
$ echo "${tmp[2]}"
c
Replace the printf with your command (i.e. tmp=( $(w | awk '{print $1}') )) and man bash for how to work with bash arrays.
For a lengthier, more robust and complete example:
$ cat ./tstarrays.sh
# saving multi-line awk output in a bash array, one element per line
# See http://www.thegeekstuff.com/2010/06/bash-array-tutorial/ for
# more operations you can perform on an array and its elements.
oSET="$-"; set -f # save original set flags and turn off globbing
oIFS="$IFS"; IFS=$'\n' # save original IFS and make IFS a newline
array=( $(
awk 'BEGIN{
print "the quick brown"
print " fox jumped\tover\tthe"
print "lazy dogs back "
}'
) )
IFS="$oIFS" # restore original IFS value
set +f -$oSET # restore original set flags
for (( i=0; i < ${#array[@]}; i++ ));
do
printf "array[%d] of length=%d: \"%s\"\n" "$i" "${#array[$i]}" "${array[$i]}"
done
printf -- "----------\n"
printf -- "array[@]=\n\"%s\"\n" "${array[@]}"
printf -- "----------\n"
printf -- "array[*]=\n\"%s\"\n" "${array[*]}"
.
$ ./tstarrays.sh
array[0] of length=22: "the quick brown"
array[1] of length=23: " fox jumped over the"
array[2] of length=21: "lazy dogs back "
----------
array[@]=
"the quick brown"
array[@]=
" fox jumped over the"
array[@]=
"lazy dogs back "
----------
array[*]=
"the quick brown fox jumped over the lazy dogs back "
A couple of non-obvious key points to make sure your array gets populated with exactly what your command outputs:
If your command output can contain globbing characters then you should disable globbing before the command (oSET="$-"; set -f) and re-enable it afterwards (set +f -$oSET).
If your command output can contain spaces then set IFS to a newline before the command (oIFS="$IFS"; IFS=$'\n') and set it back to its old value after the command (IFS="$oIFS").
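On bash 4 or later, mapfile -t is a possible shortcut, since it stores each input line verbatim without word splitting or globbing, so neither set -f nor an IFS change is needed. A minimal sketch; the printf stands in for the real command:

```shell
mapfile -t array < <(printf '%s\n' 'the quick brown' ' fox *' 'lazy dogs back ')

# "${!array[@]}" expands to the list of indices
for i in "${!array[@]}"; do
    printf 'array[%d]="%s"\n' "$i" "${array[$i]}"
done
```

Leading and trailing spaces and the literal * survive unchanged in the elements.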
tmp=$(w | awk '{print $1}')
while read i
do
echo "$i"
done <<< "$tmp"
You can use a for loop, i.e.
for user in $(w | awk '{print $1}'); do echo $user; done
which in a script would look nicer as:
for user in $(w | awk '{print $1}')
do
echo $user
done
You can use the xargs command to do this:
w | awk '{print $1}' | xargs -I '{}' id '{}'
With the -I switch, xargs will take each line of its standard input separately, then construct and execute a command line by replacing the specified string '{}' in the command line template with the input line
I guess you should use who instead of w. Try this out,
who | awk '{print $1}' | xargs -n 1 id

Count multiple occurrences of a word on the same line using grep

Here is a small script that takes input from the user, searches a file for a pattern, and displays the requested number of lines from that file starting where the pattern is found. However, it searches line by line, as grep normally does. What I want is this: if the pattern occurs twice on the same line, the output should be printed twice. I hope that makes sense.
#!/bin/sh
cat /dev/null>copy.txt
echo "Please enter the sentence you want to search:"
read "inputVar"
echo "Please enter the name of the file in which you want to search:"
read "inputFileName"
echo "Please enter the number of lines you want to copy:"
read "inputLineNumber"
[ -z "$inputLineNumber" ] && inputLineNumber=20
cat /dev/null > copy.txt
for N in `grep -n $inputVar $inputFileName | cut -d ":" -f1`
do
LIMIT=`expr $N + $inputLineNumber`
sed -n $N,${LIMIT}p $inputFileName >> copy.txt
echo "-----------------------" >> copy.txt
done
cat copy.txt
As I understand it, the task is to count the number of pattern occurrences in a line. It can be done like so:
count=$((`echo "$line" | sed -e "s|$pattern|\n|g" | wc -l` - 1))
Suppose you have one file to read. Then the code will be the following:
#!/bin/bash
file=$1
pattern="an."
#reading file line by line
cat -n $file | while read input
do
#storing line to $tmp
tmp=`echo $input | grep "$pattern"`
#counting occurrences count
count=$((`echo "$tmp" | sed -e "s|$pattern|\n|g" | wc -l` - 1))
#printing $tmp line $count times
for i in `seq 1 $count`
do
echo $tmp
done
done
I checked this for pattern "an." and input:
I pass here an example of many 'an' letters
an
ananas
an-an-as
Output is:
$ ./test.sh input
1 I pass here an example of many 'an' letters
1 I pass here an example of many 'an' letters
1 I pass here an example of many 'an' letters
3 ananas
4 an-an-as
4 an-an-as
Adapt this to your needs.
How about using awk?
Assume the pattern you are searching for is in variable $pattern and the file you are checking is $file
Then
count=`awk 'BEGIN{n=0}{n+=split($0,a,"'$pattern'")-1}END {print n}' $file`
or for a line
count=`echo $line | awk '{n=split($0,a,"'$pattern'")-1;print n}'`
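Another option is grep -o, which prints every match on its own line so the matches can simply be counted. This is a sketch, not a drop-in replacement: -o is supported by GNU and BSD grep but is not in POSIX, and count_occurrences is a hypothetical helper name.

```shell
count_occurrences() {
    # $1 = pattern (basic regular expression), $2 = line to search
    # grep -o emits one line per match; wc -l counts them;
    # tr strips the padding some wc implementations add.
    printf '%s\n' "$2" | grep -o -- "$1" | wc -l | tr -d ' '
}

count_occurrences 'an' 'an-an-as'
count_occurrences 'an' 'no match here'
```

Note that matches found this way do not overlap, which is the same behaviour as the sed and awk approaches.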

Shell script: how to read only a portion of text from a variable

I'm developing a little script using ash shell (not bash).
Now I have a variable with the following composition:
VARIABLE = "number string status"
where number can be any number (currently between 1 and 18, but it could be higher in the future), string is a name, and status is either on or off.
The name usually contains only lowercase letters.
My problem is reading only the string part of the variable, dropping the number and the status.
How can I do that?
Two ways; one is to leverage $IFS and use a while loop - this will work for a single line quite happily - as:
echo "Part1 Part2 Part3" | while read a b c
do
echo $a
done
alternatively, use cut as follows:
a=`echo $var | cut -d' ' -f2`
echo $a
How about using cut?
name=$(echo "$variable" | cut -d " " -f 2)
UPDATE
Apparently, Ash doesn't understand $(...). Hopefully you can do this instead:
name=`echo "$variable" | cut -d " " -f 2`
How about :
name=$(echo "$variable" | awk '{print $2}')
#!/bin/sh
myvar="word1 word2 word3 wordX"
set -- $myvar
echo ${15} # outputs word 15
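Parameter expansion can also do this without starting any external process, and it is part of POSIX sh, so it should work in ash as well. A sketch assuming exactly three fields separated by single spaces; the sample value is invented:

```shell
variable="12 lamp on"

rest=${variable#* }      # drop the shortest prefix up to the first space -> "lamp on"
name=${rest% *}          # drop the shortest suffix from the last space   -> "lamp"
number=${variable%% *}   # keep everything before the first space         -> "12"
status=${variable##* }   # keep everything after the last space           -> "on"

echo "$name"
```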

results of wc as variables

I would like to use the lines coming from 'wc' as variables. For example:
echo 'foo bar' > file.txt
echo 'blah blah blah' >> file.txt
wc file.txt
2 5 23 file.txt
I would like to have something like $lines, $words and $characters associated to the values 2, 5, and 23. How can I do that in bash?
In pure bash: (no awk)
a=($(wc file.txt))
lines=${a[0]}
words=${a[1]}
chars=${a[2]}
This works by using bash's arrays. a=(1 2 3) creates an array with elements 1, 2 and 3. We can then access separate elements with the ${a[index]} syntax.
Alternative: (based on gonvaled solution)
read lines words chars <<< $(wc x)
Or in sh:
a=$(wc file.txt)
lines=$(echo $a|cut -d' ' -f1)
words=$(echo $a|cut -d' ' -f2)
chars=$(echo $a|cut -d' ' -f3)
There are other solutions but a simple one which I usually use is to put the output of wc in a temporary file, and then read from there:
wc file.txt > xxx
read lines words characters filename < xxx
echo "lines=$lines words=$words characters=$characters filename=$filename"
lines=2 words=5 characters=23 filename=file.txt
The advantage of this method is that you do not need to create several awk processes, one for each variable. The disadvantage is that you need a temporary file, which you should delete afterwards.
Be careful: this does not work:
wc file.txt | read lines words characters filename
The problem is that piping to read creates another process, and the variables are updated there, so they are not accessible in the calling shell.
Edit: adding solution by arnaud576875:
read lines words chars filename <<< $(wc x)
Works without writing to a file (and does not have the pipe problem). It is bash-specific.
From the bash manual:
Here Strings
A variant of here documents, the format is:
<<<word
The word is expanded and supplied to the command on its standard input.
The key is the "word is expanded" bit.
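Process substitution is another bash-specific way to avoid both the temporary file and the pipe problem. A sketch; the mktemp file only makes the example self-contained, and the trailing _ soaks up the file name that wc prints:

```shell
tmpfile=$(mktemp)
printf 'foo bar\nblah blah blah\n' > "$tmpfile"

# read runs in the current shell; only wc runs in a separate process,
# so the variables are still set after this line.
read -r lines words chars _ < <(wc "$tmpfile")

rm -f "$tmpfile"
echo "lines=$lines words=$words chars=$chars"
```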
lines=`wc file.txt | awk '{print $1}'`
words=`wc file.txt | awk '{print $2}'`
...
you can also store the wc result somewhere first.. and then parse it.. if you're picky about performance :)
Just to add another variant --
set -- `wc file.txt`
lines=$1
words=$2
chars=$3
This obviously clobbers $* and related variables. Unlike some of the other solutions here, it is portable to other Bourne shells.
I wanted to store the number of CSV files in a variable. The following worked for me:
CSV_COUNT=$(ls ./pathToSubdirectory | grep ".csv" | wc -l | xargs)
xargs removes the whitespace from the wc command
I ran this bash script not in the same folder as the csv files. Thus, the pathToSubdirectory
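Parsing ls output can miscount: grep ".csv" also matches names such as notcsv.txt, because the dot matches any character. A possible bash alternative is to let the shell expand a glob into an array and take its length. A sketch; the directory and file names are created only for the demonstration:

```shell
dir=$(mktemp -d)
touch "$dir/a.csv" "$dir/b.csv" "$dir/notcsv.txt"

shopt -s nullglob          # an unmatched glob expands to nothing
csv_files=("$dir"/*.csv)   # one array element per matching file
shopt -u nullglob          # restore the default behaviour

CSV_COUNT=${#csv_files[@]}
echo "$CSV_COUNT"

rm -rf "$dir"
```

No whitespace trimming is needed here, since the count never passes through wc.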
You can assign output to a variable by opening a sub shell:
$ x=$(wc some-file)
$ echo $x
1 6 60 some-file
Now, in order to get the separate variables, the simplest option is to use awk:
$ x=$(wc some-file | awk '{print $1}')
$ echo $x
1
declare -a result
result=( $(wc < file.txt) )
lines=${result[0]}
words=${result[1]}
characters=${result[2]}
echo "Lines: $lines, Words: $words, Characters: $characters"
