Remove lines partially matching other lines in a file - bash

I have the following lines in input.txt:
client_citic_plat_fix44;CITICHK;interbridge_ulnet_se_eqx
client_citic_plat_fix44;CITICHK;interbridge_ulnet_se_eqx;CITICHK;interbridge_hk_eqx
client_dkp_crd;DELIVERTOCOMPID;DESTINATION
client_dkp_crd;NORD;interbridge_fr
client_dkp_crd;NORD;interbridge_fr;broker_nordea_2
client_dkp_crd;AVIA;interbridge_fr
client_dkp_crd;AVIA;interbridge_fr;interbridge_ld
client_dkp_crd;SEBAP;interbridge_fr
client_dkp_crd;SEBAP;interbridge_fr;broker_seb_ss_thl
client_epf_crd;DELIVERTOCOMPID;DESTINATION
I need some bash (awk/sed) script to remove the lines that are partially similar to others. Desired output should be:
client_citic_plat_fix44;CITICHK;interbridge_ulnet_se_eqx;CITICHK;interbridge_hk_eqx
client_dkp_crd;DELIVERTOCOMPID;DESTINATION
client_dkp_crd;NORD;interbridge_fr;broker_nordea_2
client_dkp_crd;AVIA;interbridge_fr;interbridge_ld
client_dkp_crd;SEBAP;interbridge_fr;broker_seb_ss_thl
client_epf_crd;DELIVERTOCOMPID;DESTINATION
Columns 1, 2 and 3 are always similar and I always want to remove the shortest line between the two compared.
Thanks!

Here's a solution using grep and sed:
#!/bin/bash
file="filepath"
while IFS= read -r line;do
(( $(grep -cF -- "$line" "$file") > 1 )) && sed -i "/^$line$/d" "$file"
done <"$file"
Note: this modifies your file in place.
To leave the input file untouched and write the result to another file instead, you can do this:
#!/bin/bash
infile="infilepath"
outfile="outfilepath"
cp "$infile" "$outfile"
while IFS= read -r line;do
(( $(grep -cF -- "$line" "$infile") > 1 )) && sed -i "/^$line$/d" "$outfile"
done <"$infile"
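As an alternative to calling grep and sed once per line, the same idea can be sketched in a single awk pass. This is only a sketch under assumptions: the function name drop_prefix_lines is made up for illustration, a line counts as "partial" only when another line starts with it followed by a ";", and the file is small enough to hold in memory (the comparison is quadratic).

```bash
#!/bin/bash
# Hypothetical helper: print every line of "$1" that is not a
# ";"-delimited prefix of some other line in the same file.
drop_prefix_lines() {
  awk '
    { lines[NR] = $0 }                      # remember all lines in order
    END {
      for (i = 1; i <= NR; i++) {
        keep = 1
        for (j = 1; j <= NR; j++) {
          # index() == 1 means lines[j] starts with lines[i] plus ";"
          if (i != j && index(lines[j], lines[i] ";") == 1) {
            keep = 0
            break
          }
        }
        if (keep) print lines[i]
      }
    }
  ' "$1"
}
```

Calling drop_prefix_lines input.txt > output.txt keeps the original line order and does not touch the input file.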

Related

Append out from reading lines in a txt file

I have a test.txt file with the following contents
100001
100003
100007
100008
100009
I am trying to loop through the text file and append each one with .xml.
Ex:
100001.xml
100003.xml
100007.xml
100008.xml
100009.xml
I have tried different variations of
while read p; do
echo "$p.xml"
done < test.txt
But the output comes out garbled, like this:
.xml01
.xml03
.xml07
.xml08
.xml09
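The output suggests that the file has Windows (CRLF) line endings: the carriage return at the end of each line moves the cursor back to the start of the line, so .xml overwrites the first characters. The commands below append .xml to each line while removing the carriage return, if present.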
With sed and bash:
#!/bin/bash
sed -E $'s/\r?$/.xml/' test.txt
With awk:
awk -v suffix='.xml' '{sub(/\r?$/,suffix)}1' test.txt
Using it in a bash loop:
#!/bin/bash
while IFS='' read -r filename
do
printf '%q\n' "$filename"
done < <(
awk -v suffix='.xml' '{sub(/\r?$/,suffix)}1' test.txt
)
Or doing the whole thing in pure shell:
while IFS='' read -r filename
do
fullname="${filename%$'\r'}.xml"
printf '%s\n' "$fullname"
done < test.txt
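To check whether carriage returns are really the cause before changing anything, a small test can help. The following is a sketch only: the function names has_crlf and append_suffix are hypothetical, and it assumes that dropping every carriage return in the file (not just trailing ones) is acceptable.

```bash
#!/bin/bash
# Hypothetical helper: succeed if any line of "$1" ends in a carriage return.
has_crlf() {
  grep -q $'\r$' "$1"
}

# Hypothetical helper: print each line of "$1" with carriage returns
# removed and the suffix "$2" appended.
append_suffix() {
  local file=$1 suffix=$2 line
  tr -d '\r' < "$file" | while IFS= read -r line; do
    printf '%s%s\n' "$line" "$suffix"
  done
}
```

For example, has_crlf test.txt && echo "CRLF line endings found" reports the problem, and append_suffix test.txt .xml prints the fixed names.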

How to add lines at the beginning of either empty or not file?

I want to add lines at the beginning of a file. This works:
sed -i '1s/^/#INFO\tFORMAT\tunknown\n/' file
sed -i '1s/^/##phasing=none\n/' file
However it doesn't work when my file is empty. I found these commands:
echo > file && sed '1s/^/#INFO\tFORMAT\tunknown\n/' -i file
echo > file && sed '1s/^/##phasing=none\n/' -i file
but the last command erases the result of the first one (and it also wipes the file when it isn't empty).
I would like to know how to add lines at the beginning of a file whether or not it is empty.
I tried a test with if [ -s file ] but without success.
Thanks!
You can use the insert command (i).
if [ -s file ]; then
sed -i '1i\
#INFO\tFORMAT\tunknown\
##phasing=none' file
else
printf '#INFO\tFORMAT\tunknown\n##phasing=none\n' > file
fi
Note that \t for tab is not POSIX and does not work in all sed implementations (e.g. BSD/Apple, where -i works differently too). You can use a raw tab instead, or a variable: tab=$(printf '\t').
You can use the i command in sed:
file='inputFile'
# insert a line break if file is empty
[[ ! -s $file ]] && echo > "$file"
sed -i.bak $'1i\
#INFO\tFORMAT\tunknown
' "$file"
Or you can ditch sed and do it in the shell using printf:
{ printf '#INFO\tFORMAT\tunknown\n'; cat file; } > file.new &&
mv file.new file
With plain bash and shell utilities:
#!/bin/bash
header=(
$'#INFO\tFORMAT\tunknown'
$'##phasing=none'
)
mv file file.bak &&
{ printf '%s\n' "${header[@]}"; cat file.bak; } > file &&
rm file.bak
Explicitly creating a new file, then moving it:
#!/bin/bash
echo -e '#INFO\tFORMAT\tunknown' | cat - file > file.new
mv file.new file
or slurping the whole content of the file into memory:
#!/bin/bash
printf '#INFO\tFORMAT\tunknown\n%s' "$(<file)" > file
It is trivial with ed if available/acceptable.
printf '%s\n' '0a' $'#INFO\tFORMAT\tunknown' $'##phasing=none' . ,p w | ed -s file
It even creates the file if it does not exist.
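The answers above can be folded into one reusable function that behaves the same for an empty, non-empty, or missing file. This is a sketch, not a definitive implementation: the name prepend_lines is hypothetical, and it assumes mktemp is available and that the file fits comfortably on disk twice.

```bash
#!/bin/bash
# Hypothetical helper: prepend_lines FILE LINE...
# Writes the given lines, followed by the old content of FILE (if any),
# back into FILE. Works for empty, non-empty, and missing files.
prepend_lines() {
  local file=$1 tmp
  shift
  [ "$#" -gt 0 ] || return 0             # nothing to prepend
  tmp=$(mktemp) || return 1
  {
    printf '%s\n' "$@"                   # one output line per argument
    if [ -f "$file" ]; then cat "$file"; fi
  } > "$tmp" &&
  cat "$tmp" > "$file" &&                # keeps the original file's permissions
  rm -f "$tmp"
}
```

Usage: prepend_lines file $'#INFO\tFORMAT\tunknown' '##phasing=none'. Copying the temporary file back with cat, rather than mv, is a deliberate choice here so that the permissions of an existing file stay as they were.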

Editing line with sed in for loop from other file

I'm new to bash.
I'm trying to edit a line with sed inside a for loop, using values read from another file.
Please tell me what I am doing wrong in my small script.
Am I missing another loop?
#!/bin/bash
# read and taking the line needed:
for j in `cat /tmp/check.txt`; do
# replacing the old value with and value:
sed -i "s+/tmp/old_name/+/${j}/+gi" file_destantion.txt$$
#giving numbers to the logs for checking
Num=j +1
# moving the changed file to .log number ( as for see that it is changed):
mv file_destantion.txt$$ file_destantion.txt$$.log$Num
#create ne source file to do the next value from /tmp/check:
cp -rp file_destantion.txt file_destantion.txt$$
done
In /tmp/check I have the values that I want to insert, one per loop iteration.
Contents of /tmp/check:
/tmp/check70
/tmp/check70_1
/tmp/_check7007
In the end, this is what I want:
.log1 > will contain /tmp/check70
.log2 > will contain /tmp/check70_1
.log3 > will contain /tmp/check7007
I found that this solution worked for me:
#!/bin/bash
count=0
grep -v '^ *#' < /tmp/check | while IFS= read -r line ;do
cp -rp file_destantion.txt file_destantion.txt$$
sed -i "s+/tmp/old_name/+${line}/+gi" file_destantion.txt$$
(( count++ ))
mv file_destantion.txt$$ "file_destantion.txt$$.log${count}"
cp -rp file_destantion.txt file_destantion.txt$$
done
Thank you very much @Cyrus for your guidance.
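A variant of the loop above that avoids the intermediate copies and sed -i can be sketched as follows. The function name make_variants is hypothetical, and the sketch assumes that the values never contain "+" or "&" (both are special in this sed expression). It also drops the GNU-only case-insensitive flag.

```bash
#!/bin/bash
# Hypothetical helper: make_variants TEMPLATE LIST
# For the n-th usable line of LIST, write a copy of TEMPLATE to
# TEMPLATE.log<n> with /tmp/old_name/ replaced by that line plus "/".
make_variants() {
  local template=$1 list=$2 count=0 line
  local comment_re='^[[:space:]]*#'
  while IFS= read -r line; do
    # skip blank lines and comment lines
    [[ -z $line || $line =~ $comment_re ]] && continue
    (( count += 1 ))
    sed "s+/tmp/old_name/+${line}/+g" "$template" > "${template}.log${count}"
  done < "$list"
}
```

For example, make_variants file_destantion.txt /tmp/check leaves the template untouched and produces file_destantion.txt.log1, file_destantion.txt.log2, and so on. Reading the list with a redirection instead of a pipe keeps count in the current shell.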

Applying sed to certain lines from a file using bash

I need your help with this.
I am currently trying to apply a sed command to lines from a file.
2014-08-05T09:29:13+01:00 (INFO:3824.87075728): [27219] [ <email@domain.com>] A message from <user1@domain.com> source <asdfg> this is a test.
I need to apply this sed command to lines like this one, but leave the other lines, which do not contain 'this is a test', unchanged.
pattern="this\ is\ a test"
while IFS='' read -r line; do
if [[ $line = *"${pattern}"* ]]; then
sed 's/\[ .*\(source\)/\1/g' ${line}
else
echo "${line}"
fi
done < ${INPUT} > ${OUPUT}
I have set the input and output files; however, editing the same file in place would be ideal.
Thank you for your input.
You don't need a loop for this. Use this sed:
sed -i.bak '/this is a test/s/\[ .*\(source\)/\1/g' "${INPUT}"
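The address in front of the s command (/this is a test/) restricts the substitution to matching lines, and every other line passes through unchanged. To preview the result before editing in place, the same expression can be run without -i. This is a small sketch with a hypothetical function name, preview_strip:

```bash
#!/bin/bash
# Hypothetical helper: print "$1" with the substitution applied only to
# lines containing "this is a test"; the file itself is not modified.
preview_strip() {
  sed '/this is a test/s/\[ .*\(source\)/\1/' "$1"
}
```

Running preview_strip "$INPUT" | diff "$INPUT" - shows exactly which lines would change.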

bash script to remove newline

I am trying to remove newlines from a file. My file looks like this (it contains backslashes):
line1\|
line2\|
I am using the following script to remove newlines:
#!/bin/bash
INPUT="file1"
while read line
do
echo -n $line
done < $INPUT
I get the following output:
line1|line2|
It removes the backslashes. How can I retain those backslashes?
The -r option to read prevents backslash processing of the input.
while read -r line
do
echo -n "$line"
done < $INPUT
But if you just want to remove all newlines from the input, the tr command would be better:
tr -d '\n' < $INPUT
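Note that a plain sed 's/\n//' /path/to/file does not work, because sed reads one line at a time and strips the trailing newline before applying the expression. With GNU sed you can pull all lines into the pattern space first: sed ':a;N;$!ba;s/\n//g' /path/to/file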
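If the loop form is preferred, for example because each line needs further processing, a sketch that keeps backslashes and leading whitespace intact might look like this. The function name join_lines is hypothetical. The || [[ -n $line ]] part is there so a final line without a trailing newline is not lost.

```bash
#!/bin/bash
# Hypothetical helper: print all lines of "$1" joined together,
# preserving backslashes and leading/trailing whitespace.
join_lines() {
  local line
  while IFS= read -r line || [[ -n $line ]]; do
    printf '%s' "$line"
  done < "$1"
}
```

Using printf '%s' instead of echo -n avoids surprises when a line starts with a dash or contains escape sequences.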
