How to add to the end of lines containing a pattern with sed or awk? - bash

Here is an example file:
somestuff...
all: thing otherthing
some other stuff
What I want to do is to add to the line that starts with all: like this:
somestuff...
all: thing otherthing anotherthing
some other stuff

This works for me
sed '/^all:/ s/$/ anotherthing/' file
The first part is the address pattern that selects the lines; the second part is an ordinary sed substitution that uses $ to match the end of the line.
To edit the file in place, use the -i option (available in GNU sed):
sed -i '/^all:/ s/$/ anotherthing/' file
Or you can redirect it to another file
sed '/^all:/ s/$/ anotherthing/' file > output
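A minimal end-to-end sketch of the commands above. It assumes a throwaway file created with mktemp; the variable names file, tmp and result are illustrative and not part of the original answer. It also shows the temporary-file pattern that can stand in for -i on systems whose sed lacks that option.

```shell
# Create a sample file matching the question (the path is a throwaway temp file).
file=$(mktemp)
printf '%s\n' 'somestuff...' 'all: thing otherthing' 'some other stuff' > "$file"

# Append to every line that starts with "all:"; other lines pass through unchanged.
result=$(sed '/^all:/ s/$/ anotherthing/' "$file")
printf '%s\n' "$result"

# Portable in-place edit: write to a temp file, then replace the original
# only if sed succeeded.
tmp=$(mktemp)
sed '/^all:/ s/$/ anotherthing/' "$file" > "$tmp" && mv "$tmp" "$file"
```

Writing to a temporary file first means the original is only overwritten after sed has finished successfully.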

You can append the text to $0 in awk if it matches the condition:
awk '/^all:/ {$0=$0" anotherthing"} 1' file
Explanation
/patt/ {...} if the line matches the pattern given by patt, then perform the actions described within {}.
In this case: /^all:/ {$0=$0" anotherthing"} if the line starts (represented by ^) with all:, then append anotherthing to the line.
1 as a true condition, triggers the default action of awk: print the current line (print $0). This will happen always, so it will either print the original line or the modified one.
Test
For your given input it returns:
somestuff...
all: thing otherthing anotherthing
some other stuff
Note you could also provide the text to append in a variable:
$ awk -v mytext=" EXTRA TEXT" '/^all:/ {$0=$0mytext} 1' file
somestuff...
all: thing otherthing EXTRA TEXT
some other stuff

This should work for you
sed -e 's_^all: .*_& anotherthing_'
Using s command (substitute) you can search for a line which satisfies a regular expression. In the command above, & stands for the matched string.

Here is another simple solution using sed.
$ sed -i 's/^all:.*/& anotherthing/' filename.txt
Explanation:
^all:.* matches every line that starts with 'all:' (without the ^ anchor, all.* would match from the first 'all' found anywhere in a line).
& represents the match (i.e. the complete line that starts with 'all:').
sed then replaces the match with itself followed by ' anotherthing'.
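The effect of anchoring the pattern can be checked with a small sketch. The two input lines are made up for illustration; only the second one starts with all:.

```shell
# Sample input: only the second line starts with "all:".
input=$(printf '%s\n' 'install: all deps' 'all: thing otherthing')

# Unanchored: "all.*" matches from the first "all" anywhere on a line,
# including the one inside "install".
unanchored=$(printf '%s\n' "$input" | sed 's/all.*/& anotherthing/')

# Anchored: "^all:" only matches at the start of the line.
anchored=$(printf '%s\n' "$input" | sed 's/^all:.*/& anotherthing/')

printf 'unanchored:\n%s\nanchored:\n%s\n' "$unanchored" "$anchored"
```

With the unanchored pattern both lines get the suffix; with the anchored pattern only the all: line does.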

In bash:
while IFS= read -r line ; do
  [[ $line == all:* ]] && line+=" anotherthing"
  printf '%s\n' "$line"
done < filename

Solution with awk:
awk '{if ($1 ~ /^all/) print $0, "anotherthing"; else print $0}' file
Simply: if the row starts with all print the row plus "anotherthing", else print just the row.

Related

Grep a line from a file and replace a substring and append the line to the original file in bash?

This is what I want to do.
for example my file contains many lines say :
ABC,2,4
DEF,5,6
GHI,8,9
I want to copy the second line and replace a substring EF(all occurrences) and make it XY and add this line back to the file so the file looks like this:
ABC,2,4
DEF,5,6
GHI,8,9
DXY,5,6
how can I achieve this in bash?
EDIT: I want to do this in general and not necessarily for the second line. I want to grep EF, and do the substitution in whatever line is returned.
Here's a simple Awk script.
awk -F, -v pat="EF" -v rep="XY" 'BEGIN { OFS=FS }
$1 ~ pat { x = $1; sub(pat, rep, x); y = $0; sub($1, x, y); a[++n] = y }
1
END { for(i=1; i<=n; i++) print a[i] }' file
The -F , says to use comma as the input field separator (internal variable FS) and in the BEGIN block, we also set that as the output field separator (OFS).
If the first field matches the pattern, we copy the first field into x, substitute pat with rep, and then substitute the first field of the whole line $0 with the new result, and append it to the array a.
1 is a shorthand to say "print the current input line".
Finally, in the END block, we output the values we have collected into a.
This could be somewhat simplified by hardcoding the pattern and the replacement, but I figured it's more useful to make it modular so that you can plug in whatever values you need.
While this all could be done in native Bash, it tends to get a bit tortured; spending the 30 minutes or so that it takes to get a basic understanding of Awk will be well worth your time. Perhaps tangentially see also while read loop extremely slow compared to cat, why? which explains part of the rationale for preferring to use an external tool like Awk over a pure Bash solution.
You can use the sed command:
sed '
/EF/H # copy all matching lines
${ # on the last line
p # print it
g # paste the copied lines
s/EF/XY/g # replace all occurrences
s/^\n// # get rid of the extra newline
}'
As a one-liner:
sed '/EF/H;${p;g;s/EF/XY/g;s/^\n//}' file.csv
If ed is available/acceptable, something like:
#!/bin/sh
ed -s file.txt <<-'EOF'
$kx
g/^.*EF.*,.*/t'x
'x+;$s/EF/XY/
,p
Q
EOF
Or in one-line.
printf '%s\n' '$kx' "g/^.*EF.*,.*/t'x" "'x+;\$s/EF/XY/" ,p Q | ed -s file.txt
Change Q to w if in-place editing is needed.
Remove the ,p to silence the output.
Using BASH:
#!/bin/bash
src="${1:-f.dat}"
rep="${2:-XY}"
declare -a new_lines
while read -r line ; do
if [[ "$line" == *EF* ]] ; then
new_lines+=("${line//EF/${rep}}")
fi
done <"$src"
printf "%s\n" "${new_lines[@]}" >> "$src"
Contents of f.dat before:
ABC,2,4
DEF,5,6
GHI,8,9
Contents of f.dat after:
ABC,2,4
DEF,5,6
GHI,8,9
DXY,5,6
Following on from the great answer by @tripleee, you can create a variation that uses a single call to sub() by outputting all records before the substitution is made, then adding the updated record to the array to be output by the END rule (use gsub() instead of sub() if EF can occur more than once in a record), e.g.
awk -F, '1; /EF/ {sub(/EF/,"XY"); a[++n]=$0} END {for(i=1;i<=n;i++) print a[i]}' file
Example Use/Output
An expanded input based on your answer to my comment below the question that all occurrences of EF will be replaced with XY in all records, e.g.
$ cat file
ABC,2,4
DEF,5,6
GHI,8,9
EFZ,3,7
Use and output:
$ awk -F, '1; /EF/ {sub(/EF/,"XY"); a[++n]=$0} END {for(i=1;i<=n;i++) print a[i]}' file
ABC,2,4
DEF,5,6
GHI,8,9
EFZ,3,7
DXY,5,6
XYZ,3,7
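As a quick check of how sub() and gsub() differ when the pattern occurs more than once in a record, here is a small sketch. The record EFEF,1,2 is made up for illustration and is not part of the original data.

```shell
# A record in which the pattern occurs twice (made-up data).
line='EFEF,1,2'

# sub() replaces only the first match in the record.
first_only=$(printf '%s\n' "$line" | awk '{ sub(/EF/, "XY"); print }')

# gsub() replaces every match in the record.
all_matches=$(printf '%s\n' "$line" | awk '{ gsub(/EF/, "XY"); print }')

printf '%s\n%s\n' "$first_only" "$all_matches"
```

The first command yields XYEF,1,2 and the second XYXY,1,2.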
Let me know if you have questions.

How to ignore case when using awk or sed [duplicate]

sed -i '/first/i This line to be added'
In this case,how to ignore case while searching for pattern =first
You can use the following:
sed 's/[Ff][Ii][Rr][Ss][Tt]/last/g' file
Otherwise, GNU sed offers the I (or i) flag for case-insensitive matching:
sed 's/first/last/Ig' file
From man sed:
I
i
The I modifier to regular-expression matching is a GNU extension which
makes sed match regexp in a case-insensitive manner.
Test
$ cat file
first
FiRst
FIRST
fir3st
$ sed 's/[Ff][Ii][Rr][Ss][Tt]/last/g' file
last
last
last
fir3st
$ sed 's/first/last/Ig' file
last
last
last
fir3st
GNU sed
sed '/first/Ii This line to be added' file
You can try
sed 's/first/somethingelse/gI'
If you want to save some typing, try awk. POSIX sed has no such option (the I flag shown above is a GNU extension), and IGNORECASE is likewise specific to GNU awk:
awk -v IGNORECASE="1" '/first/{your logic}' file
For versions of awk that don't understand the IGNORECASE special variable, you can use something like this:
awk 'toupper($0) ~ /PATTERN/ { print "string to insert" } 1' file
Convert each line to uppercase before testing whether it matches the pattern and if it does, print the string. 1 is the shortest true condition, so awk does the default thing: { print }.
To use a variable, you could go with this:
awk -v var="$foo" 'BEGIN { pattern = toupper(var) } toupper($0) ~ pattern { print "string to insert" } 1' file
This passes the shell variable $foo and transforms it to uppercase before the file is processed.
Slightly shorter with bash would be to use -v pattern="${foo^^}" and skip the BEGIN block.
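A small runnable sketch of the toupper() approach with a pattern passed in from the shell. The search term FiRst and the three input lines are made up for illustration; note that the awk variable var, not the shell name foo, is what the BEGIN block has to reference.

```shell
# Hypothetical search term supplied in mixed case.
foo='FiRst'

# Upper-case both the pattern and each line before comparing; print the
# extra line before every matching line, then print the line itself.
out=$(printf '%s\n' 'first' 'second' 'FIRST' |
  awk -v var="$foo" 'BEGIN { pattern = toupper(var) }
       toupper($0) ~ pattern { print "string to insert" } 1')

printf '%s\n' "$out"
```

Both first and FIRST are preceded by the inserted line, while second passes through unchanged.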
Use the following, \b for word boundary
sed 's/\bfirst\b/This line to be added/Ig' file

output csv with lines that contains only one column

with input csv file
sid,storeNo,latitude,longitude
2,1,-28.03720000,153.42921670
9
I wish to output only the lines with one column, in this example it's line 3.
how can this be done in bash shell script?
Using awk
The following awk command would be useful:
$ awk -F, 'NF==1' inputFile
9
What does it do?
-F, sets the field separator to ,
NF==1 matches lines where NF, the number of fields, is 1. No action is given, so the default action (printing the entire record) is taken; it is equivalent to NF==1{print $0}
inputFile input csv file to the awk script
Using grep
The same function can also be done using grep
$ grep -v ',' inputFile
9
-v inverts the match, printing lines that do not match the pattern
, is the pattern; combined with -v, grep prints the lines that do not contain the field separator ,
Using sed
$ sed -n '/^[^,]*$/p' inputFile
9
What does it do?
-n suppresses the automatic printing of the pattern space
/^[^,]*$/ selects lines that match the pattern, i.e. lines without any ,
^ anchors the regex at the start of the line
[^,]* matches any run of characters other than ,
$ anchors the regex at the end of the line
p prints the current pattern space, that is, the line that matched
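To see why these filters agree on the sample input, the following sketch prints the field count awk computes for each line and then runs all three commands on the same data. The variable names are illustrative only.

```shell
# Sample CSV from the question, held in a variable instead of a file.
csv=$(printf '%s\n' 'sid,storeNo,latitude,longitude' '2,1,-28.03720000,153.42921670' '9')

# Show the number of comma-separated fields awk sees on each line.
counts=$(printf '%s\n' "$csv" | awk -F, '{ print NF ": " $0 }')
printf '%s\n' "$counts"

# All three filters select the same single-column line here.
by_awk=$(printf '%s\n' "$csv" | awk -F, 'NF==1')
by_grep=$(printf '%s\n' "$csv" | grep -v ',')
by_sed=$(printf '%s\n' "$csv" | sed -n '/^[^,]*$/p')
printf '%s\n%s\n%s\n' "$by_awk" "$by_grep" "$by_sed"
```

The commands differ on empty lines: awk sees NF equal to 0 and skips them, whereas grep -v ',' and the sed pattern both print them.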
try this bash script
#!/bin/bash
while read -r line
do
IFS=","
set -- $line
case ${#} in
1) echo $line;;
*) continue;;
esac
done < file

How to output only text after a match with sed

I am using sed to find a certain match in a text file and then put this value in to a variable, my problem is that I only want the text after the match, and not the entire line.
Ans=$(sed -n '/^'$1':/,/~/{/:/{p;n};/~/q;p}' $file.txt)
Text File
q1:answer1
~
q2:answer2
~
q3:answer3
~
Actual Output
q1:answer1
Expected Output
answer1
With grep :
Ans=$(grep -oP "^$1:\K.*" file)
or with perl if your grep version doesn't support -P switch :
Ans=$(var=$1 perl -lne '/^$ENV{var}:\K.*/ and print $&' file)
In case a sed solution is needed - e.g., if answers could span multiple lines:
Ans=$(sed -r -n '/^'$1':(.*)/,/^(~)$/ { s//\1/; /^~$/q; p; }' file.txt)
(OSX users: use -E instead of -r).
Uses a backreference (\1) to replace the first matching line with its portion of interest; any other lines between the first matching one and the terminating ~ line are unaffected by the replacement (assuming they don't also start with $1:) and also printed.
Replace q with d if you don't want to quit after the first matching range.
By contrast, if the string of interest is limited to the line starting with $1:, there's no need to also match the ~ line, and the command can be simplified to:
Ans=$(sed -r -n '/^'$1':(.*)/ { s//\1/p; q; }' file.txt)
Remove q; if you don't want to quit after the first match.
However, the single-line case is more easily handled with a grep or awk solution - see @sputnick's and @anubhava's answers. If you wanted those to quit after the first match -- as in the snippets above and the code in the OP -- you'd need to add option -m 1 to the grep solution and ; exit to the awk solution (before the }).
Better use awk for this:
ans=$(awk -F':' -v s='q1' '$1 == s {print $2}' file)
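One caveat with splitting on : is that $2 stops at the next colon. The sketch below, using a made-up answer that itself contains a colon, contrasts $2 with stripping the key: prefix via index() and substr(); both variants exit after the first match.

```shell
# Sample data; the q2 answer contains a colon (made-up example).
data=$(printf '%s\n' 'q1:answer1' '~' 'q2:see: here' '~')

# With -F':' the second field holds only the text up to the next colon.
short=$(printf '%s\n' "$data" | awk -F':' -v s='q2' '$1 == s { print $2; exit }')

# Stripping the "key:" prefix instead keeps the rest of the line intact.
full=$(printf '%s\n' "$data" |
  awk -v s='q2' 'index($0, s ":") == 1 { print substr($0, length(s) + 2); exit }')

printf '%s\n%s\n' "$short" "$full"
```

The first command prints see, the second prints see: here.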

How to append to lines in a file that do not contain a specific pattern using shell script

I have a flat file as follows:
11|aaa
11|bbb|NO|xxx
11|ccc
11|ddd|NO|yyy
For lines that do not contain |NO|, I would like to add the string |YES| at the end. So my file should look like:
11|aaa|YES|
11|bbb|NO|xxx
11|ccc|YES|
11|ddd|NO|yyy
I am using AIX and sed -i option for inline replacements is not available. Hence, currently I'm using the following code to do this:
#Get the lines that do not contain |NO|
LINES=`grep -v "|NO|" file`
for i in $LINES
do
sed "/$i/{s/$/|YES|/;}" file > temp
mv temp file
done
The above works, however, as my file contains over 40000 lines, it takes about 3 hours to run. I believe it is taking so much time because it has to search for each line and write to a temp file. Is there a faster way to achieve this ?
This will be quick:
sed '/|NO|/!s/$/|YES|/' filename
If temp.txt is your file, try:
awk '$0 !~ /\|NO\|/ {print $0 "|YES|"} $0 ~ /\|NO\|/ {print}' temp.txt
Simple with awk. Put the code below into a script and run it with awk -f script file > temp
/\|NO\|/ { print; next; } # just print anything which contains |NO| and read next line
{ print $0 "|YES|"; } # For any other line (no pattern), print the line + |YES|
The two backslashes in the first pattern are needed: awk uses extended regular expressions, where an unescaped | means alternation, so /\|NO\|/ is what matches the literal text |NO|.
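Since sed -i is not available on the asker's system, a single-pass rewrite through a temporary file is one way to update the file itself. This is a sketch that assumes mktemp is available (a fixed temp file name works as well); the sample data is taken from the question.

```shell
# Sample data from the question, written to a throwaway file.
file=$(mktemp)
printf '%s\n' '11|aaa' '11|bbb|NO|xxx' '11|ccc' '11|ddd|NO|yyy' > "$file"

# One pass over the file: append |YES| to lines that do not contain |NO|,
# then replace the original only if sed succeeded.
tmp=$(mktemp)
sed '/|NO|/!s/$/|YES|/' "$file" > "$tmp" && mv "$tmp" "$file"

cat "$file"
```

The file is read and written once, instead of once per non-matching line as in the loop from the question.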
