SED: First and last empty lines not removed - bash

I'm running the following but it's returning with empty lines at the top and bottom of the new file.
How do I output to a new file without these empty lines?
input | sed -E '/^$/d' > file.txt
The following has no effect either.
sed '1d'
sed '$d'
I'm unsure of where the expression has problems.

If you are comfortable using awk, this would work (note that it removes every blank line, not only the first and last):
awk 'NF' INPUT_FILE > OUTPUT_FILE

grep . file_name > outfile would do the job for you.

This might work for you:
echo -e " \t\r\nsomething\n \t \r\n" | sed '/^\s*$/d' | cat -n
1 something
N.B. This removes all blank lines. To preserve blank lines in the body of a file, use:
echo -e " \t\r\n something\n \nsomething else \n \t \r\n" |
sed ':a;$!{N;ba};s/^\(\s*\n\)*\|\(\s*\n\)*$//g'
something
something else
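If GNU sed's \s and \| extensions are not available, the same idea (drop blank lines only at the top and bottom) can be sketched with awk. This is a minimal sketch under assumptions, not a tested drop-in: trim_blank_edges is a hypothetical helper name, the input is assumed to use Unix line endings, and interior blank lines are re-emitted as truly empty lines, so any whitespace on them is lost.

```shell
# Hypothetical helper: strip leading and trailing blank lines, keep interior ones.
trim_blank_edges() {
  awk '
    NF {
      # A non-blank line: first flush the blank lines held back since the
      # previous non-blank line, then print the line itself.
      for (i = 0; i < held; i++) print ""
      held = 0
      seen = 1
      print
      next
    }
    # A blank line after some content: hold it until more content arrives.
    # Blank lines before any content are simply dropped.
    seen { held++ }
  '
}

printf '\n\nfirst\n\nsecond\n\n\n' | trim_blank_edges
```

Blank lines at the end are never flushed because no further content arrives, which is what removes them.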

Related

Sed find and replace expression works with literal but not with variable interpolation

For the following MCVE:
echo "test_num: 0" > test.txt
test_num=$(grep 'test_num:' test.txt | cut -d ':' -f 2)
new_test_num=$((test_num + 1))
echo $test_num
echo $new_test_num
sed -i "s/test_num: $test_num/test_num: $new_test_num/g" test.txt
cat test.txt
echo "sed -i "s/test_num: $test_num/test_num: $new_test_num/g" test.txt"
sed -i "s/test_num: 0/test_num: 1/g" test.txt
cat test.txt
Which outputs
0 # parsed original number correctly
1 # increment the number
test_num: 0 # sed with interpolated variable, does not work
sed -i s/test_num: 0/test_num: 1/g test.txt # interpolated parameter looks right
test_num: 1 # ???
Why does sed -i "s/test_num: $test_num/test_num: $new_test_num/g" test.txt not produce the expected result when sed -i "s/test_num: 0/test_num: 1/g" test.txt works just fine in the above example?
As mentioned in the comments, ${test_num} contains a leading space: cut -d ':' -f 2 returns everything after the colon, i.e. " 0". The sed pattern therefore must not contain another space between the colon and your variable.
I would also surround the variable names with curly braces {} to improve readability.
sed "s/test_num:${test_num}/test_num: ${new_test_num}/g" test.txt
test_num: 1
If you just want the number in ${test_num}, you can try something like:
grep 'test_num:' test.txt | awk -F ': ' '{print $2}'
awk lets you specify a delimiter that is longer than one character.
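To see the leading space for yourself, something like the following sketch may help. It is only an illustration: it works on a temporary file rather than test.txt, prints the sed result instead of editing in place, and strips the space with a parameter expansion, which is my choice here and not something the original script does.

```shell
# Work on a throwaway copy so nothing real is modified.
tmp=$(mktemp)
echo "test_num: 0" > "$tmp"

raw=$(grep 'test_num:' "$tmp" | cut -d ':' -f 2)
printf 'raw value: [%s]\n' "$raw"     # the brackets expose the leading space

test_num=${raw# }                     # drop one leading space, if present
new_test_num=$((test_num + 1))

# Print the edited text instead of using -i, so the sketch does not
# depend on how a particular sed implements in-place editing.
result=$(sed "s/test_num: ${test_num}/test_num: ${new_test_num}/" "$tmp")
echo "$result"

rm -f "$tmp"
```

Quoting "$raw" in the printf call matters: an unquoted echo $raw lets the shell drop the space, which is why the echo output in the question looked correct.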
Instead of grep|cut you can also use sed.
#! /bin/bash
exec <<EOF
test_num: 0
EOF
grep 'test_num:' | cut -d ':' -f 2
exec <<EOF
test_num: 0
EOF
sed -n 's/^test_num: //p'
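In a sed regular expression, $ has a special meaning (end of line), so it can help to keep the shell variables outside the quoted sed script text.
I suggest building the sed command from single-quoted segments with the variables spliced in between. Double-quote the variables so the shell does not split them, and make sure test_num carries no leading space:
sed -i 's/test_num: '"$test_num"'/test_num: '"$new_test_num"'/g' test.txt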
Another option is to use echo to expand the variables into the sed command first.
sed_command=$(echo "s/test_num:${test_num}/test_num: ${new_test_num}/g")
sed -i "$sed_command" test.txt

output of sed gives strange result when using capture groups

I'm doing the following command in a bash:
echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | sed -rn 's#^URL: \^/tags/([^/]+)/#\1#p'
I think this should output only the matching lines and the content of the capture group, so I'm expecting 0.0.0 as the result. But I'm getting 0.0.0abcd.
Why does the output contain parts from both the left and the right side of the /? What am I doing wrong?
echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' |
sed -rn 's#^URL: \^/tags/([^/]+)/#\1#p'
echo outputs two lines:
UNUSED
URL: ^/tags/0.0.0/abcd
The regular expression given to sed does not match the first line, so this line is not printed. The regular expression matches the second line, so URL: ^/tags/0.0.0/ is replaced with 0.0.0; only the matched part of the line is replaced, so abcd is passed unchanged.
To obtain the desired output you must also match abcd, for example with
sed -rn 's#^URL: \^/tags/([^/]+)/.*#\1#p'
where the .* eats all characters to the end of the line.
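The difference is easy to check side by side. A small sketch, using sed -E (the portable spelling of -r) and printf instead of echo -e; the variable names are only for illustration:

```shell
input='URL: ^/tags/0.0.0/abcd'

# Without .* only the matched prefix is replaced; "abcd" is left in place.
partial=$(printf '%s\n' "$input" | sed -En 's#^URL: \^/tags/([^/]+)/#\1#p')

# With .* the match extends to the end of the line, so only the group remains.
whole=$(printf '%s\n' "$input" | sed -En 's#^URL: \^/tags/([^/]+)/.*#\1#p')

echo "partial match: $partial"
echo "whole line:    $whole"
```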
You can use awk:
echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | awk -F/ 'index($0, "^/tags/"){print $3}'
0.0.0
This awk command uses / as the field delimiter and prints the 3rd field when the input line contains the text ^/tags/.
Alternatively, you can use gnu grep:
echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | grep -oP '^URL: \^/tags/\K([^/]+)'
0.0.0
Or this sed:
echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | sed -nE 's~^URL: \^/tags/([^/]+).*~\1~p'
0.0.0
This sed command also produces your desired output:
echo -e 'UNUSED\nURL: ^/tags/0.0.0/abcd' | sed -E '/URL/!d;s#.*/(.*)/[^/]*#\1#'

Bash: Replacing "" with newline character, using sed or tr

I'm trying to format output in a way that inserts newline characters after each 'line', with lines denoted by double quotes (""). The quotes themselves are temporary and to be stripped in a later step.
Input:
"a",1,"aa""b",2,"bb"
Output:
a,1,aa
b,2,bb
I've tried:
sed 's/""/\n/'
sed 's/""/\/g'
tr '""' '\n'
But tr seems to replace every quote character and sed seems to insert \n as text instead of a newline. What can I do to make this work?
echo '"a",1,"aa""b",2,"bb"' |awk -v RS='""' '{$1=$1} {gsub(/"/,"")}1'
a,1,aa
b,2,bb
or using sed:
echo '"a",1,"aa""b",2,"bb"' |sed -e 's/""/\n/' -e 's/"//g' # OR sed -e 's/""/\n/;s/"//g'
a,1,aa
b,2,bb
awk solution: the record separator is changed from the default newline to "", so awk treats each "" as the end of a record. Note that a multi-character RS is an extension (gawk supports it); POSIX leaves it unspecified.
sed solution: the first substitution converts "" into a newline, and the second removes the remaining " characters from each line.
neech#nicolaw.uk:~ $ cat file.txt
"a",1,"aa""b",2,"bb"
neech#nicolaw.uk:~ $ sed 's/""/\n/' file.txt | tr -d '"'
a,1,aa
b,2,bb
You seem to be dealing with a POSIX sed, which does not support the \n notation in the replacement text. Insert an actual newline into the replacement instead, either:
sed 's/""/\
/'
Or:
sed 's/""/\'$'\n''/'
E.g.:
sed 's/""/\
/' | tr -d \"
Output:
a,1,aa
b,2,bb
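Wrapped into a function, the backslash-newline form could look like the sketch below. split_records is a hypothetical name, and I have added the g flag on the assumption that real input may hold more than one "" pair; the commands above replace only the first.

```shell
# Hypothetical helper: turn each "" into a line break, then drop the quotes.
# The line after the backslash must start in column 0, because everything
# before the closing / becomes part of the replacement text.
split_records() {
  sed 's/""/\
/g' | tr -d '"'
}

printf '%s\n' '"a",1,"aa""b",2,"bb"' | split_records
```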
As suggested by George Vasiliou if you have perl you could use:
> echo '"a",1,"aa""b",2,"bb"' | perl -pe 's/""/"\n"/g;s/"//g'
This avoids the non portable sed problem.
Or, for a crappy hack version:
Replace the "" with another character that does not occur in the data, then use tr (which does understand \n) to turn that character into a newline, and finally remove the remaining single " characters.
So you can get the "" replaced with newline like this:
sed 's/""/#/g' | tr '#' '\n'
Then the rest follows:
> echo '"a",1,"aa""b",2,"bb"'| sed 's/""/#/g' | tr '#' '\n' | sed 's/\"//g'

Concatenate grep output string (bash script)

I'm processing some data from a text file using a bash script (Ubuntu 12.10).
The basic idea is that I select a certain line from a file using grep. Next, I process the line to get the number with sed. Both the grep and sed command are working. I can echo the number.
But the concatenation of the result with a string goes wrong.
I get different results when combining string when I do a grep command from a variable or a file. The concatenation goes wrong when I grep a file. It works as expected when I grep a variable with the same text as in the file.
What am I doing wrong with the grep from a file?
Contents of test.pdb
REMARK overall = 324.88
REMARK bon = 24.1918
REMARK coup = 0
My script
#!/bin/bash
#Correct function
echo "Working code"
TEXT="REMARK overall = 324.88\nREMARK bon = 24.1918\nREMARK coup = 0\n"
DATA=$(echo -e $TEXT | grep 'overall' | sed -n -e "s/^.*= //p" )
echo "Data: $DATA"
DATA="$DATA;0"
echo $DATA
#Not working
echo ""
echo "Not working code"
DATA=$(grep 'overall' test.pdb | sed -n -e "s/^.*= //p")
echo "Data: $DATA"
DATA="$DATA;0"
echo $DATA
Output
Working code
Data: 324.88
324.88;0
Not working code
Data: 324.88
;04.88
I went crazy with the same issue.
The real problem is that your "test.pdb" probably has the wrong EOL (end of line) characters.
Linux EOL: LF (aka \n)
Windows EOL: CR LF (aka \r \n)
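This means that the value captured from grep still ends with the extra character (\r). When echo prints it, the carriage return moves the cursor back to the start of the line, so ;0 overwrites the first characters of 324.88 and you see ;04.88. Luckily tr, sed and awk can strip it.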
So you can try also with:
DATA=$(grep 'overall' test.pdb | sed -n -e "s/^.*= //p" | sed -e 's/\r$//')
or
DATA=$(grep 'overall' test.pdb | sed -n -e "s/^.*= //p" | tr -d '\r')
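To confirm that a stray carriage return is the cause, a sketch like the following reproduces the problem with a temporary file written with Windows line endings. The file contents mirror test.pdb from the question; the file itself and the variable names raw and clean are made up for the illustration.

```shell
# Build a small file with CR LF line endings, as a Windows editor would.
tmp=$(mktemp)
printf 'REMARK overall = 324.88\r\nREMARK bon = 24.1918\r\n' > "$tmp"

# Command substitution strips the trailing newline but keeps the \r.
raw=$(grep 'overall' "$tmp" | sed -n 's/^.*= //p')
echo "length of raw value: ${#raw}"     # 7: six visible characters plus \r

# Removing the carriage return makes the concatenation behave.
clean=$(printf '%s' "$raw" | tr -d '\r')
echo "${clean};0"

rm -f "$tmp"
```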
With awk, it would be cleaner and more reliable, I guess:
$ awk '$2=="overall"{print "Working code\nData: " $4 "\n" $4 ";0"}' file.txt
Working code
Data: 324.88
324.88;0
Try this:
SUFFIX=";0"
DATA="${DATA}${SUFFIX}"

Delete first line of file if it's empty

How can I delete the first (!) line of a text file if it's empty, using e.g. sed or other standard UNIX tools? I tried this command:
sed '/^$/d' < somefile
But this deletes every empty line, not just the first line of the file when it is empty. Can I give sed a condition on the line number?
With Levon's answer I built this small script based on awk:
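#!/bin/bash
# Reading the find output line by line keeps file names with spaces intact
# (names containing newlines are still not handled).
find some_directory -name "*.csv" | while IFS= read -r FILE
do
  echo "Processing ${FILE}"
  # Skip the first line only when it has no fields; print everything else.
  awk '{if (NR==1 && NF==0) next};1' < "${FILE}" > "${FILE}.killfirstline"
  mv "${FILE}.killfirstline" "${FILE}"
done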
The simplest thing in sed is:
sed '1{/^$/d}'
Note that this does not delete a line that contains all blanks, but only a line that contains nothing but a single newline. To get rid of blanks:
sed '1{/^ *$/d}'
and to eliminate all whitespace:
sed '1{/^[[:space:]]*$/d}'
Note that some versions of sed require a terminator inside the block, so you might need to add a semi-colon. eg sed '1{/^$/d;}'
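A quick way to convince yourself that only the first line is affected is to feed the command two small inputs, one that starts with an empty line and one that does not. This is just a sketch; drop_empty_first_line is a hypothetical wrapper name, and it uses the variant with the semicolon so that it also runs on stricter sed versions.

```shell
# Hypothetical wrapper around the sed command from this answer.
drop_empty_first_line() {
  sed '1{/^$/d;}'
}

# Starts with an empty line: that line is removed, the later one is kept.
printf '\nfirst\n\nlast\n' | drop_empty_first_line

# Does not start with an empty line: the input passes through unchanged.
printf 'first\n\nlast\n' | drop_empty_first_line
```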
Using sed, try this:
sed -e '2,$b' -e '/^$/d' < somefile
or to make the change in place:
sed -i~ -e '2,$b' -e '/^$/d' somefile
If you don't have to do this in-place, you can use awk and redirect the output into a different file.
awk '{if (NR==1 && NF==0) next};1' somefile
This will print the contents of the file except if it's the first line (NR == 1) and it doesn't contain any data (NF == 0).
NR is the current line number; NF is the number of fields on the current line, separated by blanks/tabs.
E.g.,
$ cat -n data.txt
1
2 this is some text
3 and here
4 too
5
6 blank above
7 the end
$ awk '{if (NR==1 && NF==0) next};1' data.txt | cat -n
1 this is some text
2 and here
3 too
4
5 blank above
6 the end
and
$ cat -n data2.txt
1 this is some text
2 and here
3 too
4
5 blank above
6 the end
$ awk '{if (NR==1 && NF==0) next};1' data2.txt | cat -n
1 this is some text
2 and here
3 too
4
5 blank above
6 the end
Update:
This sed solution should also work for in-place replacement:
sed -i.bak '1{/^$/d}' somefile
The original file will be saved with a .bak extension
Delete the first line of all files under the current directory if the first line is empty:
find -type f | xargs sed -i -e '2,$b' -e '/^$/d'
This might work for you:
sed '1!b;/^$/d' file

Resources