I'm trying to create a script to fix a csv file like this:
field_one,field_two,field_three
,field_two,field_three
So I need to check inside my loop if the current line is missing field_one and replace it with sed with a new value for field_one (overwrite the line missing field_one).
For this I have a loop, but I need some help identifying whether the line is missing field one or not. I should probably use grep, but how do I use it in a loop and get its result?
while read -r line; do
# this is pseudocode:
# if $line matches regex then
# sed 's/,/newfieldone/'
# overwrite the corrected line in the file
# end if
done < my_file
Thanks a lot in advance for your help!!!!
Inside your loop you can run the following sed command:
sed 's/^\s*,/newfieldone,/'
To see if a line begins with a , and is hence missing field one, you can use if [[ "$line" =~ ^, ]].
For example:
while read -r line; do
if [[ "$line" =~ ^, ]]
then
echo "newfieldone$line"
else
echo "$line"
fi
done < my_file
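The loop above only prints the corrected lines. To overwrite the file as the question asks, one option is to write to a temporary file and move it over the original. This is a minimal sketch, assuming mktemp is available; the function name fill_first_field is made up for illustration:

```shell
#!/bin/bash
# fill_first_field FILE VALUE  (hypothetical helper, not from the posts above)
# Prepends VALUE to every line of FILE that starts with a comma,
# then replaces FILE with the corrected copy.
fill_first_field() {
    local file=$1 value=$2 tmp line
    tmp=$(mktemp) || return 1
    # "|| [[ -n $line ]]" keeps a last line that lacks a trailing newline
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ ^, ]]; then
            printf '%s%s\n' "$value" "$line"
        else
            printf '%s\n' "$line"
        fi
    done < "$file" > "$tmp" && mv "$tmp" "$file"
}
```

Called as fill_first_field my_file newfieldone. Since mv replaces the original, the result carries the permissions of the temporary file.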
Just for the heck of it, here's a solution in awk:
awk -F, '{if ($1 == "") print "field_one" $0; else print $0}' < /tmp/test.txt
$ sed -e "/^,/s/^,\([^,]*\),\([^,]\)/new_field_one,\1,\2/" < my_file
Edit: This probably is too complicated. Take one of the other fine answers :)
with sed try something like that:
sed -i 's|\(^,.*\)|new_field_one\1|g' <your file>
This might work for you:
a=Field_one,Field_two,Field_three
sed '/^,/c\'$a'' file
field_one,field_two,field_three
Field_one,Field_two,Field_three
Or if just inserting field_one:
a=Field_one
sed '/^,/s/^/'$a'/' file
field_one,field_two,field_three
Field_one,field_two,field_three
Simple bash solution using a case statement:
while read -r line; do
case "$line" in
,*) printf "%s%s\n" newfieldone "$line" ;;
*) printf "%s\n" "$line" ;;
esac
done < my_file
case uses "glob" matching, not regular expressions, so ,* matches a string beginning with a comma.
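To make the glob versus regex difference concrete, here is a small sketch; the three function names are hypothetical and only serve the comparison:

```shell
#!/bin/bash
# glob_starts_with_comma STR: glob test, "," followed by anything.
glob_starts_with_comma() {
    case $1 in
        ,*) return 0 ;;
        *)  return 1 ;;
    esac
}

# regex_starts_with_comma STR: the equivalent regex needs the ^ anchor.
regex_starts_with_comma() {
    [[ $1 =~ ^, ]]
}

# As a regex, ",*" means "zero or more commas", which matches any string.
regex_comma_star() {
    [[ $1 =~ ,* ]]
}
```

So the pattern ,* is only a correct "starts with a comma" test when it is used as a glob.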
sed -i 's/^,/fieldone,/' YOURFILE
This replaces the leading , with fieldone, on every line that starts with a comma (in place, so the original file gets overwritten; if you need a backup, try -i.backup).
If you want a dynamic fieldone value, it depends on how dynamic you want it to be :-), e.g.:
MYDYNAMICFIELDONE="DYNAF1"
sed -i "s/^,/${MYDYNAMICFIELDONE},/" YOURFILE
Or with your while loop:
while read -r line; do
MYDYNAMICFIELDONE="SET IT"
printf '%s\n' "$line" | sed "s/^,/${MYDYNAMICFIELDONE},/"
done < my_file > tmpfile
mv tmpfile my_file
Or with awk:
awk '
/^,/ {
DYNAF1="SET IT HERE"
# gensub is a gawk extension
print gensub("^,",DYNAF1 ",","g",$0)
next
}
{ print }
' INPUT > OUTPUT
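If the value comes from a shell variable, passing it with awk -v avoids splicing it into the program text. A minimal sketch, where fill_with_awk is a made-up name and the file is assumed to be plain comma-separated text without quoted fields:

```shell
#!/bin/bash
# fill_with_awk FILE VALUE  (hypothetical helper)
# Prints FILE with VALUE substituted for every empty first field.
# Note: a completely blank line also has an empty first field.
fill_with_awk() {
    awk -v val="$2" 'BEGIN { FS = OFS = "," } $1 == "" { $1 = val } 1' "$1"
}
```

Usage would be fill_with_awk INPUT "SET IT HERE" > OUTPUT.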
This is a pretty short one-liner with awk (note that it overwrites the first field on every line, not only the empty ones):
awk '{$1="field_one"}1' FS=',' OFS=',' file.csv
. . . and another awk one-liner:
awk '$1==""{$1="field_one"}1' FS=',' OFS=',' file
What about the use of bash only
while IFS=, read -r field_one field_two rest_of_line; do
echo "${field_one:-default_field_one_value},$field_two,$rest_of_line"
done < my_file > my_correct_file
where the 'default_field_one_value' is used if the 'field_one' is empty
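The ${var:-default} expansion does the work here. A reduced sketch of the same idea, reading from stdin and assuming every line has at least two fields (fill_default is a hypothetical name):

```shell
#!/bin/bash
# fill_default DEFAULT  (hypothetical helper)
# Reads CSV lines on stdin; "first" gets the text before the first comma,
# "rest" gets everything after it. ${first:-$default} expands to DEFAULT
# only when "first" is empty or unset.
fill_default() {
    local default=$1 first rest
    while IFS=, read -r first rest; do
        printf '%s,%s\n' "${first:-$default}" "$rest"
    done
}

printf 'a,b,c\n,b,c\n' | fill_default X
```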
I have this input file
gb|KY798440.1|
gb|KY842329.1|
MG082893.1
MG173246.1
and I want to get all the characters that are between the "|" or the full line if there is no "|". That is a desired output that looks like
KY798440.1
KY842329.1
MG082893.1
MG173246.1
I wrote:
while IFS= read -r line; do
if [[ $line == *\|* ]] ; then
sed 's/.*\|\(.*\)\|.*/\1/' <<< $line >> output_file
else echo $line >> output_file
fi
done < input_file
Which gives me
empty line
empty line
MG082893.1
MG173246.1
(note: empty line means an actual empty line; it doesn't actually write "empty line")
The sed command works on a single example (i.e. sed 's/.*\|\(.*\)\|.*/\1/' <<< "gb|KY842329.1|" outputs KY842329.1) but within the loop it just does a line return. The else echo $line >> output_file seems to work.
Bare sed:
$ sed 's/^[^|]*|\||[^|]*$//g' file
Output:
KY798440.1
KY842329.1
MG082893.1
MG173246.1
You could do
sed '/|/s/[^|]*|\([^|]*\)|.*/\1/' input
or
awk 'NF>1 {print $2} NF < 2 { print $1}' FS=\| input
or
sed -e 's/[^|]*|//' -e 's/|.*//' input
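A likely reason for the empty lines in the question: in GNU sed, \| inside a basic regular expression is the alternation operator rather than a literal pipe, so the first alternative .* can match the whole line with an empty \1. If you want to keep the shell loop, parameter expansion avoids sed altogether. A sketch, with between_pipes as a made-up name:

```shell
#!/bin/bash
# between_pipes  (hypothetical helper)
# Prints the text between the first pair of "|" on each stdin line,
# or the whole line when it contains no "|".
between_pipes() {
    local line
    while IFS= read -r line; do
        line=${line#*|}      # drop everything up to and including the first "|"
        line=${line%%|*}     # drop everything from the next "|" onwards
        printf '%s\n' "$line"
    done
}

printf 'gb|KY798440.1|\nMG082893.1\n' | between_pipes
```

With the input file from the question this would be between_pipes < input_file > output_file.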
I have the following situation:
I have a text file that I'm trying to loop over to find out whether each line matches ".mp3". The file is this one:
12 Stones.mp3
randomfile.txt
Aclarion.mp3
ransomwebpage.html
Agents Of The Sun.mp3
randomvideo.mp4
So, I've written the following script to process it:
while read line || [ -n "$line" ]
do
varline=$(awk '/.mp3/{print "yes";next}{print "no"}')
echo $varline
if [ "$varline" == "yes" ]; then
some-command
else
some-command
fi
done < file.txt
The expected output would be:
yes
no
yes
no
yes
no
Instead, it seems to miss the first line and I get the following:
no
yes
no
yes
no
You really don't need Awk for a simple pattern match if that's all you used it for.
while IFS= read -r line; do
case $line in
*.mp3) some-command;;
*) some-other-command;;
esac
done <file.txt
If you are using Awk anyway for other reasons, looping the lines in a shell loop is inefficient and very often an antipattern. This doesn't really fix that, but at least avoids executing a new Awk instance on every iteration:
awk '{ print (($0 ~ /\.mp3$/) ? "yes" : "no") }' file.txt |
while IFS= read -r whether; do
case $whether in
'yes') some-command ;;
'no') some-other-command;;
esac
done
If you need the contents of "$line" too, printing that from Awk as well and reading two distinct variables is a trivial change.
I simplified the read expression on the assumption that you can make sure your input file is well-formed separately. If you can't do that, you need to put back the more-complex guard against a missing newline on the last line in the file.
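For the "reading two distinct variables" variant, something along these lines should work. It is only a sketch: classify is a hypothetical name, the two printf calls stand in for some-command and some-other-command, and the lines are assumed not to start or end with a tab:

```shell
#!/bin/bash
# classify FILE  (hypothetical helper)
# Awk emits "yes" or "no", a tab, then the original line. The shell loop
# reads the flag into "whether" and the remainder into "line", so spaces
# inside the line are preserved.
classify() {
    awk '{ printf "%s\t%s\n", (($0 ~ /\.mp3$/) ? "yes" : "no"), $0 }' "$1" |
    while IFS=$'\t' read -r whether line; do
        case $whether in
            yes) printf 'mp3: %s\n' "$line" ;;
            no)  printf 'other: %s\n' "$line" ;;
        esac
    done
}
```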
Use awk
$ awk '{if ($0 ~ /mp3/) {print "yes"} else {print "no"}}' file.txt
yes
no
yes
no
yes
no
Or more concise:
$ awk '/mp3/{print "yes";next}{print "no"}' file.txt
$ awk '{print (/mp3/ ? "yes" : "no")}' file.txt
Did you forget something? Your awk has no explicit input, so it reads the rest of the loop's standard input itself. Change it to this instead:
while IFS= read -r line || [ -n "$line" ]
do
varline=$(echo "$line" | awk '/.mp3/{print "yes";next}{print "no"}')
echo $varline
if [ "$varline" == "yes" ]; then
some-command
else
some-other-command
fi
done < file.txt
In this case, you might need to change to /\.mp3$/ or /\.mp3[[:space:]]*$/ for precise matching.
This is because . matches any character, so for example /.mp3/ also matches Exmp3but.mp4.
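The same difference can be checked directly in bash, whose [[ =~ ]] operator also uses regular expressions. A small sketch with two made-up function names:

```shell
#!/bin/bash
# is_mp3 NAME: succeeds only when NAME ends in a literal ".mp3".
is_mp3() {
    [[ $1 =~ \.mp3$ ]]
}

# loose_mp3 NAME: the unescaped dot matches any character, so this also
# accepts names that merely contain "mp3" after some other character.
loose_mp3() {
    [[ $1 =~ .mp3 ]]
}

for name in 'Aclarion.mp3' 'Exmp3but.mp4'; do
    is_mp3 "$name"    && strict=yes || strict=no
    loose_mp3 "$name" && loose=yes  || loose=no
    printf '%s strict=%s loose=%s\n' "$name" "$strict" "$loose"
done
```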
Update: changed while read line to while IFS= read -r line, to keep each line's content intact when assigning to the variable.
And the awk part can be improved to:
awk '{print $0~/\.mp3$/ ? "yes":"no"}'
So with awk only, you can do it like this:
awk '{print $0~/\.mp3$/ ? "yes":"no"}' file.txt
Or if your purpose is just the commands in the if structure, you can just do this:
awk '/\.mp3$/{system("some-command");next}{system("some-other-command");}' file.txt
or this:
awk '{system($0~/\.mp3$/ ? "some-command" : "some-other-command")}' file.txt
I would like to remove a file name only from the following configuration file.
Configuration File -- test.conf
knowledgebase/arun/test.rf
knowledgebase/arunraj/tester/test.drl
knowledgebase/arunraj2/arun/test/tester.drl
The above file should be read, and the result with the file names removed should go to another file called output.txt.
The following is my attempt. It is not working for me at all; I am only getting empty files.
#!/bin/bash
file=test.conf
while IFS= read -r line
do
# grep --exclude=*.drl line
# awk 'BEGIN {getline line ; gsub("*.drl","", line) ; print line}'
# awk '{ gsub("/",".drl",$NF); print line }' arun.conf
# awk 'NF{NF--};1' line arun.conf
echo $line | rev | cut -d'/' -f 1 | rev >> output.txt
done < "$file"
Expected Output :
knowledgebase/arun
knowledgebase/arunraj/tester
knowledgebase/arunraj2/arun/test
There's the dirname command to make it easy and reliable:
#!/bin/bash
file=test.conf
while IFS= read -r line
do
dirname "$line"
done < "$file" > output.txt
There are Bash shell parameter expansions that will work OK with the list of names given but won't work reliably for some names:
file=test.conf
while IFS= read -r line
do
echo "${line%/*}"
done < "$file" > output.txt
There's sed to do the job — easily with the given set of names:
sed 's%/[^/]*$%%' test.conf > output.txt
It's harder if you have to deal with names like /plain.file (or plain.file — the same sorts of edge cases that trip up the shell expansion).
You could add Perl, Python, Awk variants to the list of ways of doing the job.
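To see where the expansion and dirname disagree, a quick comparison like this can help; show_parent is a hypothetical helper used only for the demonstration:

```shell
#!/bin/bash
# show_parent PATH  (hypothetical helper)
# Prints what dirname and the ${path%/*} expansion each make of PATH.
show_parent() {
    printf '%s: dirname=%s expansion=%s\n' "$1" "$(dirname "$1")" "${1%/*}"
}

show_parent 'knowledgebase/arun/test.rf'
show_parent '/plain.file'
show_parent 'plain.file'
```

For /plain.file the expansion yields an empty string where dirname gives /, and for plain.file the expansion returns the name unchanged where dirname gives a single dot.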
You can get the path like this:
path=${fullpath%/*}
It cuts away the string after the last /
Using awk one liner you can do this:
awk 'BEGIN{FS=OFS="/"} {NF--} 1' test.conf
Output:
knowledgebase/arun
knowledgebase/arunraj/tester
knowledgebase/arunraj2/arun/test
I'm trying to read file line by line in bash.
Every line has format as follows text|number.
I want to produce file with format as follows text,text,text etc. so new file would have just text from previous file separated by comma.
Here is what I've tried and couldn't get it to work :
FILENAME=$1
OLD_IFS=$IFSddd
IFS=$'\n'
i=0
for line in $(cat "$FILENAME"); do
array=(`echo $line | sed -e 's/|/,/g'`)
echo ${array[0]}
i=i+1;
done
IFS=$OLD_IFS
But this prints both text and number but in different format text number
here is sample input :
dsadadq-2321dsad-dasdas|4212
dsadadq-2321dsad-d22as|4322
here is sample output:
dsadadq-2321dsad-dasdas,dsadadq-2321dsad-d22as
What did I do wrong?
Not pure bash, but you could do this in awk:
awk -F'|' 'NR>1{printf(",")} {printf("%s",$1)}'
Alternately, in pure bash and without having to strip the final comma:
#!/bin/bash
# You can get your input from somewhere else if you like. Even stdin to the script.
input=$'dsadadq-2321dsad-dasdas|4212\ndsadadq-2321dsad-d22as|4322\n'
# Output should be reset to empty, for safety.
output=""
# Step through our input. (I don't know your column names.)
while IFS='|' read left right; do
# Only add a field if it exists. Salt to taste.
if [[ -n "$left" ]]; then
# Append data to output string
output="${output:+$output,}$left"
fi
done <<< "$input"
echo "$output"
No need for arrays and sed:
while IFS='' read line ; do
echo -n "${line%|*}",
done < "$FILENAME"
You just have to remove the last comma :-)
Using sed:
$ sed ':a;N;$!ba;s/|[0-9]*\n*/,/g;s/,$//' file
dsadadq-2321dsad-dasdas,dsadadq-2321dsad-d22as
Alternatively, here is a bit more readable sed with tr:
$ sed 's/|.*$/,/g' file | tr -d '\n' | sed 's/,$//'
dsadadq-2321dsad-dasdas,dsadadq-2321dsad-d22as
Choroba has the best answer (imho) except that it does not handle blank lines and it adds a trailing comma. Also, mucking with IFS is unnecessary.
This is a modification of his answer that solves those problems:
while read line ; do
if [ -n "$line" ]; then
if [ -n "$afterfirst" ]; then echo -n ,; fi
afterfirst=1
echo -n "${line%|*}"
fi
done < "$FILENAME"
The first if just filters out blank lines. The second if and the $afterfirst variable prevent the extra comma: a comma is echoed before every entry except the first one. ${line%|*} is a bash parameter expansion that deletes the end of a parameter if it matches a pattern. line is the parameter, % is the symbol that indicates the shortest matching trailing pattern should be deleted, and |* is the pattern to delete.
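A short illustration of how % differs from %% may be useful here. The sample value is invented and has two pipes so that the difference shows:

```shell
#!/bin/bash
# Sample value (made up for illustration) with two "|" separators.
line='text|12|34'

# % removes the shortest suffix matching the pattern "|*"
short=${line%|*}

# %% removes the longest suffix matching the same pattern
long=${line%%|*}

printf '%s\n%s\n' "$short" "$long"
```

With only one | per line, as in the question's input, both forms give the same result.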
I have a file csv :
data1,data2,data2
data3,data4,data5
data6,data7,data8
I want to convert it to (Contained in a variable):
variable=data1,data2,data2%0D%0Adata3,data4,data5%0D%0Adata6,data7,data8
My attempt :
data=''
cat csv | while read line
do
data="${data}%0D%0A${line}"
done
echo $data # Fails, since data remains empty (loop emulates a sub-shell and loses data)
Please help..
Simpler to just strip newlines from the file:
tr -d '\n' < yourfile.txt > concatfile.txt
In bash,
data=$(
while read line
do
echo -n "%0D%0A${line}"
done < csv)
In non-bash shells, you can use `...` instead of $(...). Also, echo -n, which suppresses the newline, is unfortunately not completely portable, but again this will work in bash.
Some of these answers are incredibly complicated. How about this.
data="$(xargs printf ',%s' < csv | cut -b 2-)"
or
data="$(tr '\n' ',' < csv | cut -b 2-)"
Too "external utility" for you?
IFS=$'\n', read -r -d '' -a data < csv
Now you have an array! Output it however you like, perhaps with
data="$(tr ' ' , <<<"${data[@]}")"
Still too "external utility?" Well fine,
data="$(printf '%s' "${data[0]}" ; printf ',%s' "${data[@]:1}")"
Yes, printf can be a builtin. If it isn't but your echo is and it supports -n, use echo -n instead:
data="$(echo -n "${data[0]}" ; for d in "${data[@]:1:${#data[@]}}" ; do echo -n ,"$d" ; done)"
Okay, now I admit that I am getting a bit silly. Andrew's answer is perfectly correct.
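For completeness, an array can also be joined without any loop by letting "${array[*]}" use the first character of IFS as the separator. A sketch, assuming bash 4 or newer for mapfile; join_with_commas is a made-up name:

```shell
#!/bin/bash
# join_with_commas FILE  (hypothetical helper)
# Reads the lines of FILE into an array and joins them with commas.
join_with_commas() {
    local -a lines
    mapfile -t lines < "$1"      # mapfile needs bash 4 or newer
    local IFS=,                  # "${lines[*]}" joins on the first IFS character
    printf '%s\n' "${lines[*]}"
}
```

This only handles single-character separators, so it does not cover the %0D%0A case directly.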
I would much prefer a loop:
for line in $(cat file.txt); do echo -n $line; done
Note: This solution requires the input file to have a new line at the end of the file or it will drop the last line.
Another short bash solution
variable=$(
RS=""
while read line; do
printf "%s%s" "$RS" "$line"
RS='%0D%0A'
done < filename
)
awk 'END { print r }
{ r = r ? r OFS $0 : $0 }
' OFS='%0D%0A' infile
With shell:
data=
while IFS= read -r; do
[ -n "$data" ] &&
data=$data%0D%0A$REPLY ||
data=$REPLY
done < infile
printf '%s\n' "$data"
Recent bash versions:
data=
while IFS= read -r; do
[[ -n $data ]] &&
data+=%0D%0A$REPLY ||
data=$REPLY
done < infile
printf '%s\n' "$data"
A very simple single-line solution that requires no extra files and is quite easy to understand (I think): just cat the file together and perform a sed replacement. Note that it assumes the lines themselves contain no spaces:
output=$(echo $(cat ./myFile.txt) | sed 's/ /%0D%0A/g')
Useless use of cat, punished! You want to feed the CSV into the loop
while read line; do
# ...
done < csv
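Filled in, that loop could look roughly like the following. It is a sketch rather than the one true way; join_lines is a hypothetical name and the separator is the %0D%0A from the question:

```shell
#!/bin/bash
# join_lines FILE  (hypothetical helper)
# Joins the lines of FILE with the literal separator %0D%0A.
# The loop reads via redirection instead of a pipe, so it runs in the
# current shell and "data" keeps its value after "done".
join_lines() {
    local data='' line
    while IFS= read -r line || [[ -n $line ]]; do
        # ${data:+%0D%0A} expands to the separator only once data is non-empty
        data+=${data:+%0D%0A}$line
    done < "$1"
    printf '%s\n' "$data"
}
```

Then variable=$(join_lines csv) gives the string the question asks for.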