Appending to line with sed, adding separator if necessary - bash

I have a properties file, which, when unmodified has the following line:
worker.list=
I would like to use sed to append to that line a value so that after sed has run, the line in the file reads:
worker.list=test
But, when I run the script a second time, I want sed to pick up that a value has already been added, and thus adds a separator:
worker.list=test,test
That's the bit that stumps me (frankly sed scares me with its power, but that's my problem!)
Rich

Thats easy! If you're running GNU sed, you can write it rather short
sed -e '/worker.list=/{s/$/,myValue/;s/=,/=/}'
That'll add ',myValue' to the line, and then remove the comma (if any) after the equal sign.
If you're stuck on some other platform you need to break it apart like so
sed -e '/worker.list=/{' -e 's/$/,myValue/' -e 's/=,/=/' -e '}'
It's a pretty stupid script in that it doesn't know about existance of values etc (I suppose you CAN do a more elaborate parsing, but why should you?), but I guess that's the beauty of it. Oh and it'll destroy a line like this
worker.list=,myval
which will turn into
worker.list=myval,test
If that's a problem let me know, and I'll fix that for you.
HTH.

you can also use awk. Set field delimiter to "=". then what you want to append is always field number 2. example
$ more file
worker.list=
$ awk -F"=" '/worker\.list/{$2=($2=="")? $2="test" : $2",test"}1' OFS="=" file
worker.list=test
$ awk -F"=" '/worker\.list/{$2=($2=="")? $2="test" : $2",test"}1' OFS="=" file >temp
$ mv temp file
$ awk -F"=" '/worker\.list/{$2=($2=="")? $2="test1" : $2",test1"}1' OFS="=" file
worker.list=test,test1
or the equivalent of the sed answer
$ awk -F"=" '/worker\.list/{$2=",test1";sub("=,","=")}1' OFS="=" file

Related

Remove filler spaces from blank lines in linux script

I am trying to work on a bash script that will take files from one github repo and copy them over to another one.
I have this mostly working however 1 file I am trying to move over has spaces on all of its blank lines like so:
FROM metrics_flags ORDER BY DeliveryDate ASC
)
SELECT * FROM selected;
""";
Notice how its not just a blank line, there are actually 10-20 spaces in between the 2 blocks of code on that blank line.
Is there some unix command that can parse the file and remove the spaces (but keep the blank line)?
I tried
awk 'NF { $1=$1; print }' file.txt
and
sed -e 's/^[ \t]*//' file.txt
with no success.
awk used without changing delimeters splits records (lines) into white-space-separated fields. By default any print commands obey the same separators for the output but any empty fields can be removed resulting in their white-space-separators not being used.
The 'trick' is to get awk to re-evaluate the line by setting any field (even empty ones) to itself:
awk '{$1=$1; print}' test.txt
will remove all white space that is not surrounding other printable characters and return the file contents to stdout where it can be redirected to file if required.
I don't know why you used NF as a pattern in your awk attempt, nor why it caused it to fail, but the similar approach without it, as above, works fine.
edit after a quick experiment, I think what was happening with your awk attempt was that setting the pattern to NF caused awk to skip lines with no printable fields completely. Removing that pattern allows the now empty lines to be printed.
This should do what you describe, replacing leading whitespace only from empty lines:
sed -E 's|^\s+$||' file
The -E (extended regex) is required for \s+ (\t also), meaning one or more whitespace characters. I think you might have accidentally used a lower e.
If you like the output, you can add -i to apply the edit to your file.
This is an example of using awk to achieve the same:
awk '{gsub(/^\s+$/, "")}; { print }' file
To apply it, use -i inplace:
awk -i inplace '{gsub(/^\s+$/, "")}; { print }' file
I tested this on Ubuntu 22.04 with GNU sed 4.8 and GNU awk 5.1.0
Odd ...
sed -i 's/^[[:space:]]*$//g' file.txt
definitely works for me; I don't see why your sed version wouldn't, though.
On MacOS, this works (TESTED):
sed -E -i "" 's/^[[:space:]]*$//g' file.txt

bash scripting: Can I get sed to output the original line, then a space, then the modified line?

I'm new to Unix in all its forms, so please go easy on me!
I have a bash script that will pipe an ls command with arbitrary filenames into sed, which will use an arbitrary replacement pattern on the files, and then this will be piped into awk for some processing. The catch is, awk needs to know both the original file name and the new one.
I've managed everything except getting the original file names into awk. For instance, let's say my files are test.* and my replacement pattern is 's:es:ar;', which would change every occurrence of "test" to "tart". For testing purposes I'm just using awk to print what it's receiving:
ls "$#" | sed "$pattern" | awk '{printf "0: %s\n1: %s\n2: %s\n", $0,$1,$2}'
where test.* is in $# and the pattern is stored in $pattern.
Clearly, this doesn't get me to where I want to be. The output is obviously
0: tart.c
1: tart.c
2:
If I could get sed to output "test.c tart.c", then I'd have two parameters for awk. I've played around with the pattern to no avail, even hardcoding "test.c" into the replacement. But of course that just gave me amateur results like "ttest.c art.c". Is it possible for sed to remember the input, then work it into the beginning of the output? Do I even have the right ideas? Thanks in advance!
Two ways to change the first t in a b in the duplicated field.
Duplicate (& replays the matched part), change first word and swap (remember 2 strings with a space in between):
echo test.c | sed -r 's/.*/& &/;s/t/b/;s/([^ ]*) (.*)/\2 \1/'
or with more magic (copy original value to buffer, make the change, insert value from buffer as the first line and replace eond of line with a space)
echo test.c | sed 'h;s/t/b/;x;G;s/\n/ /'
Use Perl instead of sed:
echo test.c | perl -lne 'print "$_ ", s/es/ar/r'
-l removes the newline from input and adds it after each print. The /r modifier to the substitution returns the modified string instead of changing the variable (Perl 5.14+ needed).
Old answer, not working for s/t/b/2 or s/.*/replaced/2:
You can duplicate the contents of the line with s/.*/& &/, then just tell sed that it should only apply the second substitution (this works at least in GNU sed):
echo test.c | sed 's/.*/& &/; s/es/ar/2'
$ echo 'foo' | awk '{old=$0; gsub(/o/,"e"); print old, $0}'
foo fee

Change three or more empty lines into two using bash, sed or awk

Lets say that we have string containing words and multiple empty lines. For instance
"1\n2\n\n3\n\n\n4\n\n\n\n2\n\n3\n\n\n1\n"
I would like to "shrink" three or more empty lines into two using bash, sed or awk to obtain string
"1\n2\n\n3\n\n4\n\n2\n\n3\n\n1\n"
Has anybody an idea?
with awk
$ awk -v RS= -v ORS='\n\n' 1 file
If perl is acceptable,
perl -00 -lpe1
ought to do it. It reads and outputs whole paragraphs, which has the side effect of normalizing 2+ newlines to just \n\n.
If the data isn't too voluminous and you have GNU sed, use sed -z to make it work on a single null-terminated record rather than one \n-terminated record per line :
sed -z 's/\n\n\n\n*/\n\n/g'
Or with extended regexs :
sed -zr 's/\n{3,}/\n\n/g'

How to extract lines after founding specific string

My example text is,
AA BB CC
DDD
process.get('name1')
process.get('name2')
process.get('name3')
process.get('name4')
process.get('name5')
process.get('name6')
EEE
FFF
...
I want to search the string "process.get('name1')" first, if found then extract the lines from "process.get('name1')" to "process.get('name6')".
How do I extract the lines using sed?
This should work and... it uses sed as per OP request:
$ sed -n "/^process\.get('name1')$/,/^process\.get('name6')$/p" file
sed is for simple substitutions on individual lines, for anything more interesting you should be using awk:
$ awk -v beg="process.get('name1')" -v end="process.get('name6')" \
'index($0,beg){f=1} f; index($0,end){f=0}' file
process.get('name1')
process.get('name2')
process.get('name3')
process.get('name4')
process.get('name5')
process.get('name6')
Note that you could use a range in awk, just like you are forced to in sed:
awk -v beg="process.get('name1')" -v end="process.get('name6')" \
'index($0,beg),index($0,end)' file
and you could use regexps after escaping metachars in awk, just like you are forced to in sed:
awk "/process\.get\('name1'\)/,/process\.get\('name6'\)/" file
but the first awk version above using strings instead of regexps and a flag variable is simpler (in as much as you don't have to figure out which chars are/aren't RE metacharacters), more robust and more easily extensible in future.
It's important to note that sed CANNOT operate on strings, just regexps, so when you say "I want to search for a string" you should stop trying to force sed to behave as if it can do that.
Imagine your search strings are passed in to a script as positional parameters $1 and $2. With awk you'd just init the awk variables from them in the expected way:
awk -v beg="$1" -v end="$2" 'index($0,beg){f=1} f; index($0,end){f=0}' file
whereas with sed you'd have to do something like:
beg=$(sed 's/[^^]/[&]/g; s/\^/\\^/g' <<< "$1")
end=$(sed 's/[^^]/[&]/g; s/\^/\\^/g' <<< "$2")
sed -n "/^${beg}$/,/^${end}$/p" file
to deactivate any metacharacters present. See Is it possible to escape regex metacharacters reliably with sed for details on escaping RE metachars for sed.
Finally - as mentioned above you COULD use a range expression with strings in awk:
awk -v beg="$1" -v end="$2" 'index($0,beg),index($0,end)' file
but I personally have never found that useful, there's always some slight requirements change comes along to make me wish I'd started out using a flag. See Is a /start/,/end/ range expression ever useful in awk? for details on that

Delete every other row in CSV file using AWK or grep

I have a file like this:
1000_Tv178.tif,34.88552709
1000_Tv178.tif,
1000_Tv178.tif,34.66987165
1000_Tv178.tif,
1001_Tv180.tif,65.51335742
1001_Tv180.tif,
1002_Tv184.tif,33.83784863
1002_Tv184.tif,
1002_Tv184.tif,22.82542442
1002_Tv184.tif,
How can I make it like this using a simple Bash command? :
1000_Tv178.tif,34.88552709
1000_Tv178.tif,34.66987165
1001_Tv180.tif,65.51335742
1002_Tv184.tif,33.83784863
1002_Tv184.tif,22.82542442
Im other words, I need to delete every other row, starting with the second.
Thanks!
hek2mgl's (deleted) answer was on the right track, given the output you actually desire.
awk -F, '$2'
This says, print every row where the second field has a value.
If the second field has a value, but is nothing but whitespace you want to exclude, try this:
awk -F, '$2~/.*[^[:space:]].*/'`
You could also do this with sed:
sed '/,$/d'
Which says, delete every line that ends with a comma. I'm sure there's a better way, I avoid sed.
If you really want to explicitly delete every other row:
awk 'NR%2'
This says, print every row where the row number modulo 2 is not zero. If you really want to delete every even row it doesn't actually matter that it's a comma-delimited file.
awk provides a simple way
awk 'NR % 2' file.txt
This might work for you (GNU sed):
sed '2~2d' file
or:
sed 'n;d' file
Here's the gnu sed equivalent of the awk answers provided. Now you can safely use sed's -i flag, by specifying a backup extension:
sed -n -i.bak 'N;P' file.txt
Note that gawk4 can do this too:
gawk -i inplace -v INPLACE_SUFFIX=".bak" 'NR%2==1' file.txt
Results:
1000_Tv178.tif,34.88552709
1000_Tv178.tif,34.66987165
1001_Tv180.tif,65.51335742
1002_Tv184.tif,33.83784863
1002_Tv184.tif,22.82542442
If OPs input does not contain space after last number or , this awk can be used.
awk '!/,$/'
1000_Tv178.tif,34.88552709
1000_Tv178.tif,34.66987165
1001_Tv180.tif,65.51335742
1002_Tv184.tif,33.83784863
1002_Tv184.tif,22.82542442
But its not robust at all, any space after , brakes it.
This should fix the last space:
awk '!/,[ ]*$/'
Thank for your help guys, but I also had to make a workaround:
Read it into R and then wrote it out again. Then I installed GNU versions of awk and used gawk '{if ((FNR % 2) != 0) {print $0}}'. So if anyone else have the same problem, try it!

Resources