Comment multiple lines between two markers in a file - bash

Let's say I have a file like this
blah
blah
MARKER 1
blah
blah
blah
MARKER 2
blah
I want a single-line command (awk? sed?) in bash to change it into
blah
blah
# MARKER 1
# blah
# blah
# blah
# MARKER 2
blah

Using sed:
$ sed '/^MARKER 1/,/^MARKER 2/s/^/#/' file
blah
blah
#MARKER 1
#blah
#blah
#blah
#MARKER 2
blah

Updated as anubhava suggested:
awk '/^MARKER 1/,/^MARKER 2/{$0 = "#" $0} 1' testt
blah
blah
#MARKER 1
#blah
#blah
#blah
#MARKER 2
blah
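Note that the desired output in the question has a space after the `#`, while both commands above emit a bare `#`. A minimal sketch of the sed variant with the space added, assuming a throwaway sample file (the `tmp` and `out` names are only for illustration):

```shell
# Build a throwaway sample file (hypothetical path) to try the command on.
tmp=$(mktemp -d)
printf '%s\n' blah blah 'MARKER 1' blah blah blah 'MARKER 2' blah > "$tmp/file"

# Prefix every line in the MARKER 1..MARKER 2 range with "# " (note the space).
out=$(sed '/^MARKER 1/,/^MARKER 2/s/^/# /' "$tmp/file")
printf '%s\n' "$out"

rm -rf "$tmp"
```

The awk version can be adjusted the same way by changing "#" to "# ".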

Related

Awk multiple file manipulation

Ok, let's try this again.
How can I open multiple files within AWK, and then just print them all to standard output? The following prints only the first line of each file.
BEGIN {
}
{
$file = $1;
(getline < $file)
print $0;
}
awk -f program.awk myindex
myindex is a list of files
file1
file2
file3
file4
an example of file1
rigrg
gdfgbt
rfghrth
thfg
bhtd
ht
hthrtjhrth
rtg
rthhrthrt
It sounds like you need something like this:
awk '
NR == FNR { ARGV[ARGC++]=$0; next }
FNR == 1 { found=0 }
$2 == "motd" { found=1 }
found
$1 == "customer" { nextfile }
' myindex
Untested, of course, since you didn't provide testable sample input/output. The above uses GNU awk for nextfile; with other awks, replace nextfile with found=0; next.
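For the literal question (print every line of every file listed in the index), the program in the question calls getline only once per index line, and `$file = $1` assigns to a field rather than to a variable. A sketch of the looped form, assuming one file name per line in the index; the sample files and the `tmp`/`out` names are made up for the demo:

```shell
# Throwaway sample data (hypothetical file names and contents).
tmp=$(mktemp -d)
printf '%s\n' rigrg gdfgbt > "$tmp/file1"
printf '%s\n' thfg bhtd > "$tmp/file2"
printf '%s\n' "$tmp/file1" "$tmp/file2" > "$tmp/myindex"

out=$(awk '{
    file = $0
    # getline returns 1 per line read, 0 at end of file, -1 on error,
    # so testing "> 0" also stops cleanly on a missing file.
    while ((getline line < file) > 0)
        print line
    close(file)   # release the file before moving to the next one
}' "$tmp/myindex")
printf '%s\n' "$out"

rm -rf "$tmp"
```

Testing the return value with `> 0` matters: a bare `while (getline line < file)` loops forever when the file cannot be opened, because -1 is also true.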
I'll propose a different approach since getline use needs to be very precise...
$ awk '/motd/{p=1} /Customer/{p=0} p' $(awk '{print $0".info"}' index)
motd
good stuff 1
good stuff 1
motd
good stuff 2
good stuff 2
motd
good stuff 3
good stuff 3
The inner awk prepares the file names as arguments to the main script. I added the 1/2/3 suffixes to show that the data comes from the corresponding file.
where the input files are:
==> index <==
one
two
three
==> one.info <==
blah
blah
blah
motd
good stuff 1
good stuff 1
Customer
blah
blah
end
==> three.info <==
blah
blah
blah
motd
good stuff 3
good stuff 3
Customer
blah
blah
end
==> two.info <==
blah
blah
blah
motd
good stuff 2
good stuff 2
Customer
blah
blah
To print the lines between motd and Customer from all files listed in the index file, use a cat + sed pipeline:
cat index | xargs -I {} sed -n '/^motd$/,/^Customer$/{/^motd$/d; /^Customer$/d;p}' {}".info"
The above will output the needed lines excluding pattern lines

merge lines based on pattern

I have been struggling to figure out how to 'unparse' lines in a log file (with 2 new line delimiters - '#' and '|') so all lines related to one time stamp are on one line.
Example:
2016-03-22 blah blah blah
|blah blah
|blah blah blah
#blah
|blah blah blah
2016-03-22 blah blah blah
|blah blah blah
#blah blah
#blah blah blah
|blah
Required Output
2016-03-22 blah blah blah |blah blah |blah blah blah #blah |blah blah blah
2016-03-22 blah blah blah |blah blah blah #blah blah #blah blah blah |blah
I thought I had this sussed simply by using xargs to put everything on one line and then using sed to add new lines at 2016, but I discovered there is a limit on the number of characters on one line, and the log file is so big that xargs was creating multiple lines.
Removing the carriage returns from lines starting with | and # would solve this but can't fathom how to do this either.
I've searched on here and found a few people posting similar questions but I can't interpret some of the solutions to fit in with my issue as I'm not familiar enough with sed/awk/xargs.
Would appreciate if anyone can offer some suggestions.
Thanks
You can use this awk command:
awk '/^[0-9]{4}(-[0-9]{2}){2}/ {
if (p!="")
print p
p=$0
next
}
{
p = p OFS $0
}
END {
print p
}' file
2016-03-22 blah blah blah |blah blah |blah blah blah #blah |blah blah blah
2016-03-22 blah blah blah |blah blah blah #blah blah #blah blah blah |blah
anubhava's answer works but it buffers the entirety of each line before printing it.
This prints as it reads each input line.
awk '{printf "%s%s", /^[|#]/?OFS:(NR>1)?"\n":"", $0} END{print ""}'
/^[|#]/ match lines starting with # or |
?OFS if matched lead with OFS (output field separator, space by default)
: otherwise
(NR>1) if we aren't on the first line
?"\n" output a newline
:"" otherwise output an empty string (to avoid a blank line at the top of the output)
END{print ""} make sure we end the last line with a newline
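The nested ternary above can also be spelled out as an if/else chain, which may be easier to follow. A sketch on a tiny made-up log, where the `sep` variable and the sample lines are assumptions for illustration:

```shell
# Throwaway sample log (hypothetical contents).
tmp=$(mktemp -d)
printf '%s\n' '2016-03-22 first' '|a' '#b' '2016-03-22 second' '|c' > "$tmp/log"

out=$(awk '{
    if (/^[|#]/)      sep = OFS    # continuation line: join with a space
    else if (NR > 1)  sep = "\n"   # new record: finish the previous line
    else              sep = ""     # very first line: nothing to finish
    printf "%s%s", sep, $0
} END { print "" }' "$tmp/log")
printf '%s\n' "$out"

rm -rf "$tmp"
```

Each input line is written as soon as it is read, so nothing but the current line is held in memory.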
This might work for you (GNU sed):
sed ':a;N;/\n....-..-.. /!s/\n/ /;ta;P;D' file
Read two lines into the pattern space and if the newline is not the start of a new record, replace it by a space and repeat i.e. append another line to the existing one etc.
If the line appended is the start of a new record, print the first line, delete it and repeat.
Remove the newlines, add a newline at the end of the line and insert newlines before each 2016:
echo '2016-03-22 blah blah blah
|blah blah
|blah blah blah
#blah
|blah blah blah
2016-03-22 blah blah blah
|blah blah blah
#blah blah
#blah blah blah
|blah ' | tr -d '\n' | sed -e 's/$/\n/' -e 's/2016-/\n2016-/g'
But how can I merge lines (only the matching words from each line) when the same word exists in both files?
All the words change automatically, and the files 1.txt and 2.txt are also regenerated automatically by a package manager's script in a GNOME 2 environment. Here "link" stands for http://link
example INPUT:
1.txt contains the detected HTTP links and versions of the packages:
link1/autotools-dev_20100122.1
link4/debhelper_8.0.0
link5/dreamchess_0.2.0
link5/dreamchess_0.2.0-2
link7/quilt_0.48
link7/quilt_0.48-7
link34/quilt-el_0.46.2
link34/quilt-el_0.46.2-1
2.txt contains needed extensions of packages:
autotools-dev_*.diff.gz
debhelper_*.diff.gz
debhelper_*.orig.tar.gz
libmxml-dev_*.diff.gz
libmxml-dev_*.dsc
libmxml-dev_*.orig.tar.gz
libsdl1.2-dev_*.diff.gz
libsdl1.2-dev_*.dsc
libsdl1.2-dev_*.orig.tar.gz
libsdl-image1.2-dev_*.diff.gz
libsdl-image1.2-dev_*.dsc
libsdl-image1.2-dev_*.orig.tar.gz
quilt_*.diff.gz
DESIRED OUTPUT to file 3.txt:
link1/autotools-dev_20100122.1.diff.gz
link4/debhelper_8.0.0.diff.gz
link4/debhelper_8.0.0.orig.tar.gz
libmxml-dev_*.diff.gz
libmxml-dev_*.dsc
libmxml-dev_*.orig.tar.gz
libsdl1.2-dev_*.diff.gz
libsdl1.2-dev_*.dsc
libsdl1.2-dev_*.orig.tar.gz
libsdl-image1.2-dev_*.diff.gz
libsdl-image1.2-dev_*.dsc
libsdl-image1.2-dev_*.orig.tar.gz
link7/quilt_0.48.diff.gz
link7/quilt_0.48-7.diff.gz
So I need a script that automatically detects the package names common to 1.txt and 2.txt and writes to 3.txt, on the line where the package name appears:
the http link and version from file 1.txt
the extension from file 2.txt
and, unchanged, the lines from file 2.txt whose package name does not appear in file 1.txt
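One possible approach, sketched under assumptions: the package name is taken to be everything before the first underscore in the last path component (1.txt) or before `_*` (2.txt), and the array name `urls` and the temporary directory are made up for the demo. The sample below is a subset of the input shown above.

```shell
# Throwaway copies of (a subset of) the sample input.
tmp=$(mktemp -d)
printf '%s\n' 'link1/autotools-dev_20100122.1' 'link7/quilt_0.48' \
    'link7/quilt_0.48-7' 'link34/quilt-el_0.46.2' > "$tmp/1.txt"
printf '%s\n' 'autotools-dev_*.diff.gz' 'libmxml-dev_*.dsc' \
    'quilt_*.diff.gz' > "$tmp/2.txt"

out=$(awk -F/ '
NR == FNR {                              # first file: link/name_version
    name = $NF; sub(/_.*/, "", name)     # package name before the first "_"
    urls[name] = urls[name] $0 "\n"      # collect every version of the package
    next
}
{                                        # second file: name_*.extension
    name = $0; sub(/_\*.*/, "", name)    # package name before "_*"
    ext = $0;  sub(/^[^*]*\*/, "", ext)  # extension: everything after the "*"
    if (name in urls) {
        n = split(urls[name], u, "\n")   # last element is empty (trailing newline)
        for (i = 1; i < n; i++) print u[i] ext
    } else
        print                            # no match in 1.txt: keep the line as is
}' "$tmp/1.txt" "$tmp/2.txt")
printf '%s\n' "$out"

rm -rf "$tmp"
```

Redirecting the awk output with `> 3.txt` would produce the requested file. The `NR == FNR` idiom assumes 1.txt is not empty.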

Sed match on multiple file, displaying the match together with filename and line number

This is a continuation of "Multiple line, repeated occurence matching"
I have many test*.txt files with contents as per previous thread.
test1.txt
blah blah..
blah blah..
blah abc blah1
blah blah..
blah blah..
blah abc blah2
blah blah..
blah efg1 blah blah
blah efg2 blah blah
blah blah..
blah blah..
blah abc blah3
blah blah..
blah blah..
blah abc blah4
blah blah..
blah blah blah
blah abc blah5
blah blah..
blah blah..
blah abc blah6
blah blah..
blah efg3 blah blah
blah efg4 blah blah
blah abc blah7
blah blah..
blah blah..
blah abc blah8
blah blah..
Now I want to run the sed command on all the files and also display the filename and line number (if possible) together with each match.
I run the command below:
ls test*.txt | xargs sed -n -f findMatch.txt
findMatch.txt content
/abc/h;/efg/!b;x;/abc/p;z;x
output is
blah abc blah2
blah abc blah6
blah abc blah2
blah abc blah6
blah abc blah2
blah abc blah6
I need a bit more detailed output as per below
test1.txt ln6 blah abc blah2
test1.txt ln23 blah abc blah6
test2.txt ln6 blah abc blah2
test2.txt ln23 blah abc blah6
test3.txt ln6 blah abc blah2
test3.txt ln23 blah abc blah6
The grep command can search all the files for a set of patterns:
grep -f pattern_file -Rn *.txt
-f Read the patterns from a file
-R Recursive
-n Line number
pattern_file
hai
hello
this
Output:
1.txt:1:hai
1.txt:2:hello
2.txt:1:hai
2.txt:2:this
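The grep answer does not reproduce the "last abc line before an efg line" logic of the sed script. A sketch of that logic in awk, which has the file name and per-file line number built in as FILENAME and FNR; the variable names `held` and `n` and the sample files are assumptions for the demo:

```shell
# Two throwaway sample files with identical (hypothetical) contents.
tmp=$(mktemp -d)
for f in test1.txt test2.txt; do
    printf '%s\n' 'blah' 'blah abc blah1' 'blah abc blah2' \
        'blah efg1' 'blah efg2' 'blah abc blah3' > "$tmp/$f"
done

out=$(cd "$tmp" && awk '
FNR == 1 { held = "" }               # reset the state at the start of each file
/abc/    { held = $0; n = FNR }      # remember the latest abc line and its number
/efg/ && held != "" {                # first efg after an abc: report, then clear
    print FILENAME, "ln" n, held
    held = ""
}' test*.txt)
printf '%s\n' "$out"

rm -rf "$tmp"
```

Clearing `held` after printing mirrors the `z;x` at the end of the sed script, so consecutive efg lines report the abc line only once.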

Bash & Printf: How can I both right pad and truncate?

In Bash ...
I know how to right pad with printf
printf "%-10s" "potato"
I know how to truncate with printf
printf "%.10s" "potatos are my best friends"
How can I do both at the same time?
LIST="aaa bbbbb ccc ddddd"
for ITEM in $LIST; do
printf "%-.4s blah" $ITEM
done
This prints
aaa blah
bbbbb blah
ccc blah
ddddd blah
I want it to print
aaa  blah
bbbb blah
ccc  blah
dddd blah
I'd rather not do something like this (unless there's no other option):
LIST="aaa bbbbb ccc ddddd"
for ITEM in $LIST; do
printf "%-4s blah" $(printf "%.4s" "$ITEM")
done
though, obviously, that works (it feels ugly and hackish).
You can use printf "%-4.4s" to get both padding and truncation in the output:
for ITEM in $LIST; do printf "%-4.4s blah\n" "$ITEM"; done
aaa  blah
bbbb blah
ccc  blah
dddd blah
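If the column width should not be hard-coded, printf also accepts `*` for the width and the precision and takes the values from its arguments. A small sketch, where the `width` variable is an assumption for illustration:

```shell
width=4
out=$(for item in aaa bbbbb ccc ddddd; do
    # Each * consumes one argument: the first sets the minimum field width
    # (padding), the second the precision (maximum characters printed).
    printf '%-*.*s blah\n' "$width" "$width" "$item"
done)
printf '%s\n' "$out"
```

Using the same value for both width and precision gives a fixed-width column: shorter strings are padded on the right, longer ones are cut.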

Filtering for blocks (with headers) containing content in bash

I have an output that looks like this:
foo-2
===========
foo-3
===========
bar bar bar
yadda yadda
blah blah
foo-54
===========
foo-26
===========
How do I print only the blocks that have text under them? For this example, that's foo-3. But I also want it to work if the data appears under foo-54 as well...
foo-3
===========
bar bar bar
yadda yadda
blah blah
I've tried "playing" with sed: sed -n '/foo/,/^foo/!p', but it doesn't print foo-3 itself, and it prints unnecessary ===== lines.
Thanks!
EDIT:
Re-reading the answers, I've apparently asked the wrong question. I don't know where my data is (whether it's in foo-3 or foo-XXXXX).
It doesn't have to be with sed. sed and awk are my "comfort zone"... Any solution will be appreciated.
Based on your edit, I think this GNU awk solution is the simplest and shortest script:
awk -v ORS= -v RS='foo-[0-9]+\n=+\n' '!NF{p=RT} NF{print p $0}' file
foo-3
===========
bar bar bar
yadda yadda
blah blah
Earlier Solution:
sed -n '/^foo-3/,/^foo/{/foo-3/p; /foo-/!p;}' f
foo-3
===========
bar bar bar
yadda yadda
blah blah
$ sed -n '/^foo-3/, ${/^foo-[^3]/q; p}' input
foo-3
===========
bar bar bar
yadda yadda
blah blah
A better solution using awk would be
$ awk 'flag && /^foo/{flag=0} /^foo-3$/{flag++} flag' input
foo-3
===========
bar bar bar
yadda yadda
blah blah
Here is an awk version:
awk '!(/^foo/ || $0~sep) {data=data RS $0;next} data {print header RS sep data RS;data=""} /^foo/ {header=$0}' sep="===========" file
foo-3
===========
bar bar bar
yadda yadda
blah blah
In a more readable form:
awk '
!(/^foo/ || $0~sep) {
data=data RS $0
next}
data {
print header "\n"sep data
data=""}
/^foo/ {
header=$0}
' sep="===========" file
cat file
foo-2
===========
foo-3
===========
bar bar bar
yadda yadda
blah blah
foo-34
===========
more data
test this
foo-26
===========
gives
foo-3
===========
bar bar bar
yadda yadda
blah blah
foo-34
===========
more data
test this
In pure native bash, printing only blocks with content (and their headers):
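#!/bin/bash
# Reads from stdin and prints only the blocks that have content under the header.
sep="==========="
last_line=''
header=''
content=( )
while read -r line; do
  if [[ $line = "$sep" ]]; then
    # A separator means the previous line was a new header: flush the finished block.
    if (( ${#content[@]} )); then
      printf '%s\n' "$header" "$sep" "${content[@]}"
    fi
    header=$last_line
    content=( )
  else
    [[ $last_line && $last_line != "$sep" ]] && content+=( "$last_line" )
  fi
  last_line=$line
done
# Flush the final block, which has no following separator to trigger printing.
[[ $last_line && $last_line != "$sep" ]] && content+=( "$last_line" )
if (( ${#content[@]} )); then
  printf '%s\n' "$header" "$sep" "${content[@]}"
fi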

Resources