How to read a file for special lines in bash script - bash

I just want to read even line number from a file in bash shell, how to do it?
Also I just want to read the fifth line of a file, then how do it?

For the first one (even-numbered lines):
awk 'NR % 2 == 0' <filename>
For the second one:
awk 'NR == 5' <filename>
You can also use sed to get numbers in a specified range:
sed -ne '5,5p' <filename>

You could use the tail command: put it in a for loop for the first case, and the second case is trivial once you have the first.
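A rough sketch of that idea (file is a placeholder name; re-reading the file for every line is simple but not efficient):
# Fifth line only:
tail -n +5 file | head -n 1
# Even-numbered lines, one tail/head pair per line:
total=$(wc -l < file)
for ((n = 2; n <= total; n += 2)); do
    tail -n "+$n" file | head -n 1
done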
Or maybe you could even use awk:
awk NR==5 file_name

To read even-numbered lines using GNU sed:
sed -n "2~2 p" file
To print a specific line (number 5 here) from a file using sed:
sed '5q;d' file
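For illustration, GNU sed's first~step address picks every step-th line starting at line first; a quick run on an invented six-line demo.txt:
printf '%s\n' line1 line2 line3 line4 line5 line6 > demo.txt
sed -n '2~2p' demo.txt    # prints line2, line4, line6 (the even-numbered lines)
sed '5q;d' demo.txt       # prints line5 only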

Awk is often the answer (or, nowadays, Perl, Python etc. too)
If for some reason you must do it with only bash and the basic shell utilities:
while read -r line; do
    i=$(( (i + 1) % 2 ))
    if [[ $i -eq 0 ]]; then
        echo "$line"    # or whatever else you wanted to do with it
    fi
done < file
And to get a specific line:
cat file | head -5 | tail -1

try this:
For example, lines 3 through 6:
awk 'NR>=3 && NR<=6' input.txt
Here is a starting point you can build on (not complete):
#!/bin/bash
lines=$(awk 'NR>=3 && NR<=6' input.txt)
while read -r line; do
    # do stuff with "$line"
done <<< "$lines"

Bash while read line loop does not print every line in condition

I have the following situation:
I have a text file I'm trying to loop so I can know if each line has a match with ".mp3" in this case which is this one:
12 Stones.mp3
randomfile.txt
Aclarion.mp3
ransomwebpage.html
Agents Of The Sun.mp3
randomvideo.mp4
So, I've written the following script to process it:
while read line || [ -n "$line" ]
do
    varline=$(awk '/.mp3/{print "yes";next}{print "no"}')
    echo $varline
    if [ "$varline" == "yes" ]; then
        some-command
    else
        some-command
    fi
done < file.txt
The expected output would be:
yes
no
yes
no
yes
no
Instead, it seems to miss the first line and I get the following:
no
yes
no
yes
no
You really don't need Awk for a simple pattern match if that's all you used it for.
while IFS= read -r line; do
    case $line in
        *.mp3) some-command ;;
        *) some-other-command ;;
    esac
done <file.txt
If you are using Awk anyway for other reasons, looping the lines in a shell loop is inefficient and very often an antipattern. This doesn't really fix that, but at least avoids executing a new Awk instance on every iteration:
awk '{ print (($0 ~ /\.mp3$/) ? "yes" : "no") }' file.txt |
while IFS= read -r whether; do
    case $whether in
        'yes') some-command ;;
        'no') some-other-command ;;
    esac
done
If you need the contents of "$line" too, printing that from Awk as well and reading two distinct variables is a trivial change.
I simplified the read expression on the assumption that you can make sure your input file is well-formed separately. If you can't do that, you need to put back the more-complex guard against a missing newline on the last line in the file.
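For example, a sketch of that two-variable variant (passing "$line" as an argument to the commands is an assumption; some-command and some-other-command are the placeholders from above):
awk '{ v = ($0 ~ /\.mp3$/) ? "yes" : "no"; print v "\t" $0 }' file.txt |
while IFS=$'\t' read -r whether line; do
    case $whether in
        'yes') some-command "$line" ;;
        'no') some-other-command "$line" ;;
    esac
done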
Use awk
$ awk '{if ($0 ~ /mp3/) {print "yes"} else {print "no"}}' file.txt
yes
no
yes
no
yes
no
Or, more concisely:
$ awk '/mp3/{print "yes";next}{print "no"}' file.txt
$ awk '{print (/mp3/ ? "yes" : "no")}' file.txt
Have you forgotten something? Your awk has no explicit input; change to this instead:
while IFS= read -r line || [ -n "$line" ]
do
    varline=$(echo "$line" | awk '/.mp3/{print "yes";next}{print "no"}')
    echo "$varline"
    if [ "$varline" == "yes" ]; then
        some-command
    else
        some-other-command
    fi
done < file.txt
In this case, you might want to change the pattern to /\.mp3$/ or /\.mp3[[:space:]]*$/ for precise matching, because . matches any character; /.mp3/ will match Exmp3but.mp4 too, for example.
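A quick way to see the difference (input strings invented):
printf '%s\n' 'Exmp3but.mp4' 'song.mp3' | awk '{print ($0 ~ /.mp3/ ? "yes" : "no")}'      # yes, yes
printf '%s\n' 'Exmp3but.mp4' 'song.mp3' | awk '{print ($0 ~ /\.mp3$/ ? "yes" : "no")}'    # no, yes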
Update: changed while read line to while IFS= read -r line, to keep each line's content intact when assigning it to the variable.
And the awk part can be improved to:
awk '{print $0~/\.mp3$/ ? "yes":"no"}'
So with awk only, you can do it like this:
awk '{print $0~/\.mp3$/ ? "yes":"no"}' file.txt
Or if your purpose is just the commands in the if structure, you can just do this:
awk '/\.mp3$/{system("some-command");next}{system("some-other-command");}' file.txt
or this:
awk '{system($0~/\.mp3$/ ? "some-command" : "some-other-command")}' file.txt

How to remove a filename from the list of path in Shell

I would like to remove a file name only from the following configuration file.
Configuration File -- test.conf
knowledgebase/arun/test.rf
knowledgebase/arunraj/tester/test.drl
knowledgebase/arunraj2/arun/test/tester.drl
The above file should be read, and the trimmed contents should go to another file called output.txt.
The following is my attempt. It is not working for me at all; I am getting empty files only.
#!/bin/bash
file=test.conf
while IFS= read -r line
do
    # grep --exclude=*.drl line
    # awk 'BEGIN {getline line ; gsub("*.drl","", line) ; print line}'
    # awk '{ gsub("/",".drl",$NF); print line }' arun.conf
    # awk 'NF{NF--};1' line arun.conf
    echo $line | rev | cut -d'/' -f 1 | rev >> output.txt
done < "$file"
Expected Output :
knowledgebase/arun
knowledgebase/arunraj/tester
knowledgebase/arunraj2/arun/test
There's the dirname command to make it easy and reliable:
#!/bin/bash
file=test.conf
while IFS= read -r line
do
dirname "$line"
done < "$file" > output.txt
There are Bash shell parameter expansions that will work OK with the list of names given but won't work reliably for some names:
file=test.conf
while IFS= read -r line
do
echo "${line%/*}"
done < "$file" > output.txt
There's sed to do the job — easily with the given set of names:
sed 's%/[^/]*$%%' test.conf > output.txt
It's harder if you have to deal with names like /plain.file (or plain.file — the same sorts of edge cases that trip up the shell expansion).
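To make those edge cases concrete (names invented):
name=plain.file
echo "${name%/*}"    # prints: plain.file  (no slash, so nothing is stripped)
dirname "$name"      # prints: .
name=/plain.file
echo "${name%/*}"    # prints an empty line
dirname "$name"      # prints: /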
You could add Perl, Python, Awk variants to the list of ways of doing the job.
You can get the path like this:
path=${fullpath%/*}
It cuts away the last / and everything after it.
Using an awk one-liner (set FS and OFS to /, drop the last field, and print) you can do this:
awk 'BEGIN{FS=OFS="/"} {NF--} 1' test.conf
Output:
knowledgebase/arun
knowledgebase/arunraj/tester
knowledgebase/arunraj2/arun/test

awk parse filename and add result to the end of each line

I have number of files which have similar names like
DWH_Export_AUSTA_20120701_20120731_v1_1.csv.397.dat.2012-10-02 04-01-46.out
DWH_Export_AUSTA_20120701_20120731_v1_2.csv.397.dat.2012-10-02 04-03-12.out
DWH_Export_AUSTA_20120801_20120831_v1_1.csv.397.dat.2012-10-02 04-04-16.out
etc.
I need to get the number before .csv (1 or 2) from the file name and append it to the end of every line in the file, with a TAB separator.
I have written this code; it finds the number that I need, but I do not know how to put this number into the file. There is a space in the filename, and my script breaks because of it.
Also, I am not sure how to pass a list of files to the script. For now I am working with only one file.
My code:
#!/bin/sh
string="DWH_Export_AUSTA_20120701_20120731_v1_1.csv.397.dat.2012-10-02 04-01-46.out"
out=$(echo $string | awk 'BEGIN {FS="_"};{print substr ($7,0,1)}')
awk ' { print $0"\t$out" } ' $string
for file in *
do
sfx=$(echo "$file" | sed 's/.*_\(.*\).csv.*/\1/')
sed -i "s/$/\t$sfx/" "$file"
done
Using sed:
$ sed 's/.*_\(.*\).csv.*/&\t\1/' file
DWH_Export_AUSTA_20120701_20120731_v1_1.csv.397.dat.2012-10-02 04-01-46.out 1
DWH_Export_AUSTA_20120701_20120731_v1_2.csv.397.dat.2012-10-02 04-03-12.out 2
DWH_Export_AUSTA_20120801_20120831_v1_1.csv.397.dat.2012-10-02 04-04-16.out 1
To make this for many files:
sed 's/.*_\(.*\).csv.*/&\t\1/' file1 file2 file3
OR
sed 's/.*_\(.*\).csv.*/&\t\1/' file*
To have the change saved back in the same file (if you have GNU sed):
sed -i 's/.*_\(.*\).csv.*/&\t\1/' file
Untested, but this should do what you want (extract the number before .csv and append that number to the end of every line in the .out file)
awk 'FNR==1 { split(FILENAME, field, /[_.]/) }
{ print $0"\t"field[7] > FILENAME"_aaaa" }' *.out
for file in *_aaaa; do mv "$file" "${file/_aaaa}"; done
If I understood correctly, you want to append the number from the filename to every line in that file - this should do it:
#!/bin/bash
while [[ 0 < $# ]]; do
    num=$(echo "$1" | sed -r 's/.*_([0-9]+).csv.*/\t\1/' )
    #awk -e "{ print \$0\"\t${num}\"; }" < "$1" > "$1.new"
    #sed -r "s/$/\t$num/" < "$1" > "$1.new"
    #sed -ri "s/$/\t$num/" "$1"
    shift
done
Run the script and give it names of the files you want to process. $# is the number of command line arguments for the script which is decremented at the end of the loop by shift, which drops the first argument, and shifts the other ones. Extract the number from the filename and pick one of the three commented lines to do the appending: awk gives you more flexibility, first sed creates new files, second sed processes them in-place (in case you are running GNU sed, that is).
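For instance, picking the in-place GNU sed option, a sketch of that loop might look like this (quoting "$1" because the filenames contain spaces, and capturing just the digit so only one tab is appended):
#!/bin/bash
while (( $# > 0 )); do
    num=$(echo "$1" | sed -r 's/.*_([0-9]+)\.csv.*/\1/')
    sed -ri "s/$/\t$num/" "$1"    # GNU sed: append TAB + number to every line, in place
    shift
done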
Instead of awk, you may want to go with sed or coreutils.
Grab number from filename, with grep for variety:
num=$(<<<filename grep -Eo '[^_]+\.csv' | cut -d. -f1)
<<<filename is equivalent to echo filename.
With sed
Append num to each line with GNU sed:
sed "s/\$/\t$num" filename
Use the -i switch to modify filename in-place.
With paste
You also need to know the length of the file for this method:
len=$(<filename wc -l)
Combine filename and num with paste:
paste filename <(seq $len | while read; do echo $num; done)
Complete example
for filename in DWH_Export*; do
    num=$(echo "$filename" | grep -Eo '[^_]+\.csv' | cut -d. -f1)
    sed -i "s/\$/\t$num/" "$filename"
done

sh shell script of working with for loop

I am using an sh shell script to read the files in a folder and display them on the screen:
for d in `ls -1 $IMAGE_DIR | egrep "jpg$"`
do
    pgm_file=$IMAGE_DIR/`echo $d | sed 's/jpg$/pgm/'`
    echo "file $pgm_file";
done
The output is printed line by line:
file file1.jpg
file file2.jpg
file file3.jpg
file file4.jpg
Because I am not familiar with this language, I do not know how to get the output to print two results per row, like this:
file file1.jpg; file file2.jpg;
file file3.jpg; file file4.jpg;
In other languages I would just use d++, but that does not work in this case.
Is this doable? I would be happy if you could provide some sample code.
Thanks in advance.
Let the shell do more work for you:
end_of_line=""
for d in "$IMAGE_DIR"/*.jpg
do
file=$( basename "$d" )
printf "file %s; %s" "$file" "$end_of_line"
if [[ -z "$end_of_line" ]]; then
end_of_line=$'\n'
else
end_of_line=""
fi
pgm_file=${d%.jpg}.pgm
# do something with "$pgm_file"
done
for d in "$IMAGE_DIR"/*jpg; do
pgm_file=${d%jpg}pgm
printf '%s;\n' "$d"
done |
awk 'END {
if (ORS != RS)
print RS
}
ORS = NR % n ? FS : RS
' n=2
Set n to whatever value you need.
If you're on Solaris, use nawk or /usr/xpg4/bin/awk
(do not use /usr/bin/awk).
Note also that I'm trying to use a standard shell syntax,
given your question is sh related (i.e. you didn't mention bash or ksh,
for example).
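To see what the awk stage does on its own, here is a run on invented file names with n=2:
printf '%s;\n' file1.pgm file2.pgm file3.pgm |
awk 'END {
    if (ORS != RS)
      print RS
  }
  ORS = NR % n ? FS : RS
  ' n=2
# prints roughly:
# file1.pgm; file2.pgm;
# file3.pgm;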
Something like this inside the loop:
echo -n "something; "
[[ -n "$oddeven" ]] && oddeven= || { echo;oddeven=x;}
should do.
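A minimal sketch assembling that fragment into a full loop (initializing oddeven=x before the loop is an assumption; it makes the newline fall after every second entry):
oddeven=x
for d in "$IMAGE_DIR"/*.jpg; do
    echo -n "file $d; "
    [ -n "$oddeven" ] && oddeven= || { echo; oddeven=x; }
done
[ -z "$oddeven" ] && echo    # finish a dangling half-line, if any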
Three per line would be something like
[[ "$((n++%3))" = 0 ]] && echo
(with n=1 set before entering the loop).
Why use a loop at all? How about:
ls $IMAGE_DIR | egrep 'jpg$' |
sed -e 's/jpg$/pgm/' -e 's/^/file /' -e 's/$/;/' |
perl -pe '$. % 2 && chomp'
(The perl just deletes every other newline. You may want to insert a space and add a trailing newline if the last line is an odd number.)

results of wc as variables

I would like to use the lines coming from 'wc' as variables. For example:
echo 'foo bar' > file.txt
echo 'blah blah blah' >> file.txt
wc file.txt
2 5 23 file.txt
I would like to have something like $lines, $words and $characters associated with the values 2, 5, and 23. How can I do that in bash?
In pure bash: (no awk)
a=($(wc file.txt))
lines=${a[0]}
words=${a[1]}
chars=${a[2]}
This works by using bash's arrays. a=(1 2 3) creates an array with elements 1, 2 and 3. We can then access separate elements with the ${a[index]} syntax.
Alternative: (based on gonvaled solution)
read lines words chars filename <<< $(wc file.txt)
Or in sh:
a=$(wc file.txt)
lines=$(echo $a|cut -d' ' -f1)
words=$(echo $a|cut -d' ' -f2)
chars=$(echo $a|cut -d' ' -f3)
There are other solutions but a simple one which I usually use is to put the output of wc in a temporary file, and then read from there:
wc file.txt > xxx
read lines words characters filename < xxx
echo "lines=$lines words=$words characters=$characters filename=$filename"
lines=2 words=5 characters=23 filename=file.txt
The advantage of this method is that you do not need to create several awk processes, one for each variable. The disadvantage is that you need a temporary file, which you should delete afterwards.
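If the temporary file bothers you, here is a sketch of the same approach with automatic cleanup (mktemp and trap are additions of mine, not part of the original suggestion):
tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT    # remove the temp file when the script exits
wc file.txt > "$tmp"
read lines words characters filename < "$tmp"
echo "lines=$lines words=$words characters=$characters filename=$filename"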
Be careful: this does not work:
wc file.txt | read lines words characters filename
The problem is that piping to read creates another process, and the variables are updated there, so they are not accessible in the calling shell.
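If you are on a recent bash, one possible workaround is the lastpipe option (bash 4.2 or newer; it only takes effect when job control is off, which is the normal case in scripts), which runs the last stage of the pipeline in the current shell:
#!/bin/bash
shopt -s lastpipe    # bash >= 4.2; run the last pipeline stage in this shell
wc file.txt | read lines words characters filename
echo "lines=$lines words=$words characters=$characters filename=$filename"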
Edit: adding solution by arnaud576875:
read lines words chars filename <<< $(wc file.txt)
This works without writing to a file (and does not have the pipe problem). It is bash-specific.
From the bash manual:
Here Strings
A variant of here documents, the format is:
<<<word
The word is expanded and supplied to the command on its standard input.
The key is the "word is expanded" bit.
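A tiny illustration of that expansion (values invented):
read a b c <<< "1 2 3"
echo "$a / $b / $c"    # prints: 1 / 2 / 3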
lines=`wc file.txt | awk '{print $1}'`
words=`wc file.txt | awk '{print $2}'`
...
You can also store the wc result somewhere first and then parse it, if you're picky about performance :)
Just to add another variant --
set -- `wc file.txt`
lines=$1
words=$2
chars=$3
This obviously clobbers $* and related variables. Unlike some of the other solutions here, it is portable to other Bourne shells.
I wanted to store the number of csv files in a variable. The following worked for me:
CSV_COUNT=$(ls ./pathToSubdirectory | grep ".csv" | wc -l | xargs)
xargs trims the whitespace from the wc output.
I ran this bash script from a different folder than the csv files, hence the pathToSubdirectory.
You can assign output to a variable with command substitution (which runs in a subshell):
$ x=$(wc some-file)
$ echo $x
1 6 60 some-file
Now, in order to get the separate variables, the simplest option is to use awk:
$ x=$(wc some-file | awk '{print $1}')
$ echo $x
1
declare -a result
result=( $(wc < file.txt) )
lines=${result[0]}
words=${result[1]}
characters=${result[2]}
echo "Lines: $lines, Words: $words, Characters: $characters"
