insert a blank line between every two lines in a file using shell, sed or awk - shell

I have a file with many lines, and I want to insert a blank line between every two lines.
For example, the original file is:
xfdljflsad
fjdiaopqqq
dioapfdja;
and I want to turn it into:
xfdljflsad

fjdiaopqqq

dioapfdja;
How can I achieve this? I want to use a shell script, awk or sed for this.
Thanks!

With sed, use
sed G input-file
If pilcrow is correct and you do not want an additional newline at the end of the file,
then do:
sed '$!G' input-file
Another alternative is to use pr:
pr -dt input-file
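For example, on the sample above (assuming it is saved as input-file):
$ sed '$!G' input-file
xfdljflsad

fjdiaopqqq

dioapfdja;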

awk '{print nl $0; nl="\n"}' file
Here nl is empty for the first line and a newline for every subsequent line, so a blank line is printed before each line except the first.

My approach if I want to quickly regex a file:
vim file.txt
:%s/\n/\r\r/g
(In a Vim substitution, \r in the replacement inserts a newline; \n there would insert a NUL byte.)

Idiomatic awk:
awk 1 ORS='\n\n' file
Similar thing with perl:
perl -nE 'say' file
Append | head -n -1 (GNU head) if the trailing blank line is unwanted.

Related

How to convert multiple parameters URLs into single parameter URLs in bash

$ cat urls.txt
http://example.com/test/test/test?apple=&bat=&cat=&dog=
https://test.com/test/test/test?aa=&bb=&cc=
http://target.com/test/test?hmm=
I want output like below 👇🏻; how can I do that in bash (as a single-line command)?
$ cat urls.txt
http://example.com/test/test/test?apple=
http://example.com/test/test/test?bat=
http://example.com/test/test/test?cat=
http://example.com/test/test/test?dog=
https://test.com/test/test/test?aa=
https://test.com/test/test/test?bb=
https://test.com/test/test/test?cc=
http://target.com/test/test?hmm=
With GNU awk, splitting each line on ?, =& or = so that $1 is the base URL and $2 through $(NF-1) are the parameter names:
$ awk -F'?|=&|=' '{for(i=2;i<NF;i++) print $1 "?" $i "="}' urls.txt
http://example.com/test/test/test?apple=
http://example.com/test/test/test?bat=
http://example.com/test/test/test?cat=
http://example.com/test/test/test?dog=
https://test.com/test/test/test?aa=
https://test.com/test/test/test?bb=
https://test.com/test/test/test?cc=
http://target.com/test/test?hmm=
I tried sed but it got complex. If you use perl like this:
perl -pe 'if(/(.*\?)/){$url=$1;s#&#\n$url#g;}' url.txt
it works well.
With GNU awk using gensub():
awk '{print gensub(/^(https?:)(.*)(\?[[:alpha:]]+=)(.*)/,"\\1\\2\\3","g")}' file
http://example.com/test/test/test?apple=
https://test.com/test/test/test?aa=
http://target.com/test/test?hmm=
gensub() allows specifying components of the regexp in the replacement text, using parentheses in the regexp to mark the components (four here). We print only three of them: "\\1\\2\\3".
This might work for you (GNU sed):
sed -E 's/(([^?]+\?)[^=]+=)&/\1\n\2/;P;D' file
Replace each & with a newline followed by the substring before the first parameter, then print and delete the first line and repeat.
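If a pure-bash loop is acceptable (not a one-liner, and slower than awk), here is a minimal sketch, assuming each line contains exactly one ? and the input is urls.txt:
while IFS='?' read -r base params; do
  IFS='&' read -ra parts <<< "$params"   # split the query string on &
  for p in "${parts[@]}"; do
    printf '%s?%s\n' "$base" "$p"
  done
done < urls.txt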

How to remove consecutive repeating characters from every line?

I have the below lines in a file
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;;Profilicollis;Profilicollis_altmani;
Acanthocephala;Eoacanthocephala;Neoechinorhynchida;Neoechinorhynchidae;;;;
Acanthocephala;;;;;;;
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;;Polymorphus;;
and I want to collapse the repeating semicolons on every line so it looks like below (note: there are repeating semicolons in the middle of some of the lines too):
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;Profilicollis;Profilicollis_altmani;
Acanthocephala;Eoacanthocephala;Neoechinorhynchida;Neoechinorhynchidae;
Acanthocephala;
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;Polymorphus;
I would appreciate if someone could kindly share a bash one-liner to accomplish this.
You can use tr with "squeeze":
tr -s ';' < infile
perl -p -e 's/;+/;/g' myfile # writes output to stdout
or
perl -p -i -e 's/;+/;/g' myfile # does an in-place edit
If you want to edit the file itself:
printf "%s\n" 'g/;;/s/;\{2,\}/;/g' w | ed -s foo.txt
If you want to pipe a modified copy of the file to something else and leave the original unchanged:
sed 's/;\{2,\}/;/g' foo.txt | whatever
These replace runs of 2 or more semicolons with single ones.
This could be solved easily by substitution, but I'll add an awk solution that plays with the FS/OFS variables:
awk -F';+' -v OFS=';' '$1=$1' file
or
awk -F';+' -v OFS=';' '($1=$1)||1' file
Here's a sed version of alaniwi's answer:
sed 's/;\+/;/g' myfile # Write output to stdout
or
sed -i 's/;\+/;/g' myfile # Edit the file in-place
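For completeness, a pure-bash sketch using parameter expansion (assuming the data is in foo.txt); the inner loop is needed because each ${line//;;/;} pass only halves a run of semicolons:
while IFS= read -r line; do
  while [[ $line == *';;'* ]]; do
    line=${line//;;/;}        # collapse doubled semicolons until none remain
  done
  printf '%s\n' "$line"
done < foo.txt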

Delete first and last character from from each line of a txt file

I need to delete the first and the last character from each line of a text file.
For example:
Input
$ cat file1.txt
|head1|head2|head3|
|1|2|3|
|2|3|4|
Output:
head1|head2|head3
1|2|3
2|3|4
Using sed:
sed 's/.$//; s/^.//' inputfile
A simple way to do it in one sed command:
sed -E 's/^.|.$//g' file
Match a character at the start or the end of the line and replace with nothing.
In basic mode, remember that the | needs to be escaped:
sed 's/^.\|.$//g' file
If awk is helpful:
awk '{print substr($0,2,length($0)-2)}' file
head1|head2|head3
1|2|3
2|3|4
@smisra: Could you please try the following; it may help here too.
awk '{gsub(/^\||\|$/,X,$0);print}' Input_file
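A pure-bash alternative using parameter expansion is also possible, sketched here under the assumption that the input is file1.txt:
while IFS= read -r line; do
  line=${line#?}              # drop the first character
  printf '%s\n' "${line%?}"   # drop the last character
done < file1.txt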

remove first text with shell script

Please, someone help me with this bash script.
Let's say I have a file with lots of URLs like below:
https://example.com/x/c-ark4TxjU8/mybook.zip
https://example.com/x/y9kZvVp1k_Q/myfilename.zip
My question is: how do I remove all the other text and leave only the file name?
I've tried the command described in this url: How to delete first two lines and last four lines from a text file with bash?
But since the text is random, meaning the lines don't have fixed lengths, that code does not work here.
You can use the sed utility to parse out just the filenames
sed 's_.*\/__'
You can use awk; the easiest way I find is:
awk -F/ '{print $NF}' file.txt
or
awk -F/ '{print $6}' file.txt
You can also use sed:
sed 's;.*/;;' file.txt
You can use cut:
cut -d'/' -f6 file.txt
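Plain bash parameter expansion works too; a small sketch, assuming the URLs are in file.txt:
while IFS= read -r url; do
  printf '%s\n' "${url##*/}"   # strip everything up to and including the last /
done < file.txt
(basename "$url" would give the same result for these URLs.)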

Delete every other row in CSV file using AWK or grep

I have a file like this:
1000_Tv178.tif,34.88552709
1000_Tv178.tif,
1000_Tv178.tif,34.66987165
1000_Tv178.tif,
1001_Tv180.tif,65.51335742
1001_Tv180.tif,
1002_Tv184.tif,33.83784863
1002_Tv184.tif,
1002_Tv184.tif,22.82542442
1002_Tv184.tif,
How can I make it like this using a simple Bash command?
1000_Tv178.tif,34.88552709
1000_Tv178.tif,34.66987165
1001_Tv180.tif,65.51335742
1002_Tv184.tif,33.83784863
1002_Tv184.tif,22.82542442
In other words, I need to delete every other row, starting with the second.
Thanks!
hek2mgl's (deleted) answer was on the right track, given the output you actually desire.
awk -F, '$2'
This says, print every row where the second field has a value.
If the second field has a value but you want to exclude values that are nothing but whitespace, try this:
awk -F, '$2~/.*[^[:space:]].*/'
You could also do this with sed:
sed '/,$/d'
Which says, delete every line that ends with a comma. I'm sure there's a better way, I avoid sed.
If you really want to explicitly delete every other row:
awk 'NR%2'
This says, print every row where the row number modulo 2 is not zero. If you really want to delete every even row it doesn't actually matter that it's a comma-delimited file.
awk provides a simple way
awk 'NR % 2' file.txt
This might work for you (GNU sed):
sed '2~2d' file
or:
sed 'n;d' file
Here's the GNU sed equivalent of the awk answers provided. You can safely use sed's -i flag by specifying a backup extension:
sed -n -i.bak 'N;P' file.txt
Note that gawk4 can do this too:
gawk -i inplace -v INPLACE_SUFFIX=".bak" 'NR%2==1' file.txt
Results:
1000_Tv178.tif,34.88552709
1000_Tv178.tif,34.66987165
1001_Tv180.tif,65.51335742
1002_Tv184.tif,33.83784863
1002_Tv184.tif,22.82542442
If the OP's input does not contain a space after the last number or comma, this awk can be used:
awk '!/,$/'
1000_Tv178.tif,34.88552709
1000_Tv178.tif,34.66987165
1001_Tv180.tif,65.51335742
1002_Tv184.tif,33.83784863
1002_Tv184.tif,22.82542442
But it's not robust at all; any space after the comma breaks it.
This should handle a trailing space as well:
awk '!/,[ ]*$/'
Thanks for your help guys, but I also had to make a workaround:
I read the file into R and then wrote it out again. Then I installed the GNU version of awk and used gawk '{if ((FNR % 2) != 0) {print $0}}'. So if anyone else has the same problem, try that!
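For reference, a pure-bash version of the same idea, as a sketch that assumes the file is file.txt: print a line, then discard the one that follows it.
while IFS= read -r keep; do
  printf '%s\n' "$keep"
  IFS= read -r _ || break      # discard the next line; stop if there isn't one
done < file.txt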
