Replace data on specific column and row - bash

I am currently new to shell scripting and I am having an issue replacing data. I need to replace the data in a specific column and row.
Below is a random database:
test:test1:test2:test3:test4
example:example1:example2:example3:example4
sample:sample1:sample2:sample3:sample4
For example, I would like to replace the word "test3" with "changed". How do I achieve this? I tried several commands like
awk -F : 'NR==n{$4=a}1' n="$row" a="$replace" test.txt
sed -i "$row"'s/\S\+/'"$replace"'/4' test.txt
Although there is no error when I run those commands, they do not replace my data either.
Can anyone give me some help with this problem?

Your awk version works fine for me with a minor modification:
$ row=1
$ replace=changed
$ awk 'BEGIN{FS=OFS=":"}NR==n{$4=a}1' n="$row" a="$replace" file
test:test1:test2:changed:test4
example:example1:example2:example3:example4
sample:sample1:sample2:sample3:sample4
I have defined the Output Field Separator OFS so that lines which are modified still have : between each field. To overwrite the original file, you can just do awk '...' file > tmp && mv tmp file.
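Putting it all together, a minimal sketch that overwrites the file (using the same row/replace variables as above):
row=1
replace=changed
awk 'BEGIN{FS=OFS=":"}NR==n{$4=a}1' n="$row" a="$replace" file > tmp && mv tmp file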

Here is an sed one-liner (using in-place editing):
#!/bin/bash
cat > /tmp/file <<EOF
test:test1:test2:test3:test4
example:example1:example2:example3:example4
sample:sample1:sample2:sample3:sample4
EOF
row=1
column=4
replace=changed
sed -i "$row"'s/^\(\([^:]*:\)\{'"$(($column - 1))"'\}\)[^:]*/\1'"$replace"'/' /tmp/file
cat /tmp/file
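For reference, with row=1 and column=4 the $(($column - 1)) arithmetic gives 3, so the generated command is roughly:
sed -i '1s/^\(\([^:]*:\)\{3\}\)[^:]*/\1changed/' /tmp/file
i.e. keep the first three "field:" groups and replace whatever follows up to the next colon.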

Related

AWK remove blank lines and append empty columns to all csv files in the directory

Hi, I am looking for a way to combine all of the below commands.
1. Remove blank lines in the csv file (comma delimited)
2. Add multiple empty columns to each line up to the 100th column
3. Perform actions 1 & 2 on all the files in the folder
I am still learning and this is the best I could get:
awk '!/^[[:space:]]*$/' x.csv > tmp && mv tmp x.csv
awk -F"," '($100="")1' OFS="," x.csv > tmp && mv tmp x.csv
They work individually, but I don't know how to put them together, and I am looking for a way to run them through all the files under the directory.
Looking for concrete AWK code or shell script calling AWK.
Thank you!
An example input would be:
a,b,c
x,y,z
Expected output would be:
a,b,c,,,,,,,,,,
x,y,z,,,,,,,,,,
You can combine them in one script without any loops:
$ awk 'BEGIN{FS=OFS=","} FNR==1{close(f); f=FILENAME".updated"} NF{$100=""; print > f}' files...
It won't overwrite the original files.
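If you later want the .updated files to take the place of the originals, a minimal follow-up sketch (assuming the inputs were the *.csv files in the current directory) would be:
for f in *.csv; do
    [ -f "$f.updated" ] && mv -- "$f.updated" "$f"
done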
You can pipe the output of the first to the other:
awk '!/^[[:space:]]*$/' x.csv | awk -F"," '($100="")1' OFS="," > new_x.csv
If you wanted to run the above on all the files in your directory, you would do:
shopt -s nullglob
for f in yourdirectory/*.csv; do
    # ${f##*/} strips the directory so the new_ prefix lands on the file name, not the path
    awk '!/^[[:space:]]*$/' "${f}" | awk -F"," '($100="")1' OFS="," > "yourdirectory/new_${f##*/}"
done
The shopt -s nullglob is so that an empty directory won't give you a literal *; this tip is quoted from a good source about looping through files.
With recent enough GNU awk you could:
$ gawk -i inplace 'BEGIN{FS=OFS=","}/\S/{NF=100;$1=$1;print}' *
Explained:
$ gawk -i inplace ' # using GNU awk and in-place file editing
BEGIN {
FS=OFS="," # set delimiters to a comma
}
/\S/ { # gawk specific regex operator that matches any character that is not a space
NF=100 # set the field count to 100 which truncates fields above it
$1=$1 # edit the first field to rebuild the record to actually get the extra commas
print # output records
}' *
Some test data (the first empty record is completely empty; the second contains only a space and a tab, so neither is visible below):
$ cat file
1,2,3
1,2,3,4,5,6,
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101
Output of cat file after the execution of the GNU awk program:
1,2,3,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2,3,4,5,6,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100

How to remove consecutive repeating characters from every line?

I have the below lines in a file
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;;Profilicollis;Profilicollis_altmani;
Acanthocephala;Eoacanthocephala;Neoechinorhynchida;Neoechinorhynchidae;;;;
Acanthocephala;;;;;;;
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;;Polymorphus;;
and I want to remove the repeating semicolon characters from all lines so they look like below (note: there are repeating semicolons in the middle of some of the above lines too)
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;Profilicollis;Profilicollis_altmani;
Acanthocephala;Eoacanthocephala;Neoechinorhynchida;Neoechinorhynchidae;
Acanthocephala;
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;Polymorphus;
I would appreciate if someone could kindly share a bash one-liner to accomplish this.
You can use tr with "squeeze":
tr -s ';' < infile
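tr has no in-place option, so to overwrite the file you would go through a temporary file, e.g.:
tr -s ';' < infile > infile.tmp && mv infile.tmp infile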
Or with perl:
perl -p -e 's/;+/;/g' myfile # writes output to stdout
or
perl -p -i -e 's/;+/;/g' myfile # does an in-place edit
If you want to edit the file itself:
printf "%s\n" 'g/;;/s/;\{2,\}/;/g' w | ed -s foo.txt
If you want to pipe a modified copy of the file to something else and leave the original unchanged:
sed 's/;\{2,\}/;/g' foo.txt | whatever
These replace runs of 2 or more semicolons with single ones.
This could be solved easily by substitutions, but I'll add an awk solution that plays with the FS/OFS variables:
awk -F';+' -v OFS=';' '$1=$1' file
or
awk -F';+' -v OFS=';' '($1=$1)||1' file
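The idea is that -F';+' treats any run of semicolons as a single input separator, and the $1=$1 assignment forces awk to rebuild the record using the single-character OFS. A quick check against one of the sample lines:
$ echo 'Acanthocephala;;;;;;;' | awk -F';+' -v OFS=';' '$1=$1'
Acanthocephala;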
Here's a sed version of alaniwi's answer:
sed 's/;\+/;/g' myfile # Write output to stdout
or
sed -i 's/;\+/;/g' myfile # Edit the file in-place

Removing n columns from all the files from directory in Unix

I have around 2000+ files, each with a random number of columns.
I want to remove the last 4 columns from each file.
I tried the command below, but it does not edit the file in place. The delimiter of the files is #:
awk -F"#" '{NF-=4;OFS="#";print}' test > testing.csv
I want to save the file with the same name (e.g. the file test should stay test).
How to remove last 4 columns and save the file with same name?
Can someone please help?
In case you have the latest version of GNU awk, could you please try the following.
gawk -i inplace -v INPLACE_SUFFIX=.bak 'BEGIN{FS=OFS=","} {NF-=4} 1' *.csv
This will also create a backup of each csv input file.
The above is the safe option, since it keeps a backup of each input file. In case you are happy with the above command and DO NOT want backup files, then you could simply run the following.
gawk -i inplace 'BEGIN{FS=OFS=","} {NF-=4} 1' *.csv
NOTE: In case anyone is using GNU awk version 5+, then we could use inplace::suffix='.bak' instead, as per Sundeep's comment.
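A rough sketch of that gawk 5+ form (assuming -v accepts the namespaced variable, and keeping the same comma-separated files as above):
gawk -i inplace -v inplace::suffix='.bak' 'BEGIN{FS=OFS=","} {NF-=4} 1' *.csv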
You really, really, really do not want to edit the files "in-place". It is (almost) always the wrong thing to do. For something like this, you want to do something like:
$ rm -rf new-dir/
$ mkdir new-dir
$ for file in old-dir/*; do
    f=${file#old-dir/}
    awk '{NF-=4; $1=$1; print}' FS=# OFS=# "$file" > new-dir/"$f"
  done
Then, after you know things have worked, you can replace your original directory with the new one.
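For example, something like this (one way of doing it, keeping the originals under a different name):
mv old-dir old-dir.orig && mv new-dir old-dir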
Using any POSIX awk:
tmp=$(mktemp) || exit 1
for file in *; do
awk '{sub(/(#[^#]*){4}$/,"")}1' "$file" > "$tmp" &&
mv -- "$tmp" "$file"
done
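The sub() strips the last four #-delimited fields in one shot; on a made-up line with six fields it behaves like this:
$ echo 'a#b#c#d#e#f' | awk '{sub(/(#[^#]*){4}$/,"")}1'
a#b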

search and replace multiple occurrences

So I have a file containing millions of lines.
and now within the file I have occurrences such as
=Continent
=Country
=State
=City
=Street
Now I have an Excel file in which I have the text that should replace these occurrences - as an example:
=Continent should be replaced with =Asia
Similarly for other text
Now I was thinking of writing a Java program to read my input file, read the mapping file, and search and replace each occurrence.
I am being lazy here - I was wondering if I could do the same using editors like Vim?
Would that be possible?
NOTE - I don't want to do a single text replacement - I have multiple strings that need to be found and replaced, and I don't want to do the search and replace manually for each.
EDIT1:
Contents of my file that I want to replace: "1.txt"
continent=cont_text
country=country_text
The file that contains the values I want to replace with : "to_replace.txt"
=cont_text~Asia
=country_text~India
And finally, using sed, here is my .sh file - but I am doing something wrong, because it does not replace the contents of "1.txt":
while IFS="~" read foo bar;
do
echo $foo
echo $bar
for filename in 1.txt; do
sed -i.backup 's/$foo/$bar/g;' $filename
done
done < to_replace.txt
You can't put $foo and $bar in single quotes because the shell won't expand them. You don't need the for $filename in 1.txt loop because sed will loop through the lines of 1.txt. And you can't use -i.backup inside the loop because it will change the backup file each time and not preserve the original. So your script should be:
#!/bin/bash
cp 1.txt 1.txt.backup
while IFS="~" read foo bar;
do
echo $foo
echo $bar
sed -i "s/$foo/=$bar/g;" 1.txt
done < to_replace.txt
Output:
$ cat 1.txt
continent=Asia
country=India
sed is for simple substitutions on individual lines, and shell is an environment from which to call tools, not a tool to manipulate text, so any time you write a shell loop to manipulate text you are doing it wrong.
Just use the tool that the same guys who invented sed and shell also invented to do general text processing jobs like this, awk:
$ awk -F'[=~]' -v OFS="=" 'NR==FNR{map[$2]=$3;next} {$2=map[$2]} 1' to_replace.txt 1.txt
continent=Asia
country=India
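Roughly how that works: splitting on either = or ~ makes the lookup key land in $2 for both files, so the first pass builds the map and the second pass rewrites the value (my annotation of the same command):
awk -F'[=~]' -v OFS="=" '
NR==FNR { map[$2]=$3; next }   # to_replace.txt: builds map["cont_text"]="Asia", map["country_text"]="India"
{ $2 = map[$2] }               # 1.txt: look up the value after "=" and swap it in
1                              # print the (possibly modified) record
' to_replace.txt 1.txt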
This sed command will do it without any loop:
sed -n 's#\(^=[^~]*\)~\(.*\)#s/\1/=\2/g#p' to_replace.txt |sed -i -f- 1.txt
Or sed with extended regex:
sed -nr 's#(^=[^~]*)~(.*)#s/\1/=\2/g#p' to_replace.txt | sed -i -f- 1.txt
Explanation:
The sed command:
sed -n 's#\(^=[^~]*\)~\(.*\)#s/\1/=\2/g#p' to_replace.txt
generates an output:
s/=cont_text/=Asia/g
s/=country_text/=India/g
which is then used as a sed script for the next sed after the pipe.
$ cat 1.txt
continent=Asia
country=India

bash script to add current date in each record as first column

How do I add the current date to each record as the first column?
Input file:
12345|Test1
67890|Test2
expected Output file:
2014-04-26|12345|Test1
2014-04-26|67890|Test2
Thanks,
sed -e "s,^,$(date +'%Y-%m-%d')|," file
If you use Linux (more specifically, GNU sed) then you may use in-place editing with the -i flag:
sed -i -e "s,^,$(date +'%Y-%m-%d')|," file
Otherwise you have to store the results in a temporary file and then rename it.
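For example, along these lines (same substitution, just going through a temporary file):
sed -e "s,^,$(date +'%Y-%m-%d')|," file > file.tmp && mv file.tmp file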
You could use awk:
awk -v OFS='|' -v cdate="$(date '+%Y-%m-%d')" '{print cdate, $0}' file
You can use sed for example:
sed -i "/^$/ !s/^/`date +"%Y-%m-%d"`|/" data_file
If you want to edit the file, why not use ed, the standard editor? The common and nice versions of ed will support the following:
printf '%s\n' "$(date '+%%s/^/%Y-%m-%d|/')" wq | ed -s file
(this will edit the file in place, so make sure you have appropriate backups if you want to revert the changes).
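For reference, the printf above generates a two-line ed script that looks like this (using the date from the question's example):
%s/^/2014-04-26|/
wq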
