Bash & Printf: How can I both right pad and truncate? - bash

In Bash ...
I know how to right pad with printf
printf "%-10s" "potato"
I know how to truncate with printf
printf "%.10s" "potatos are my best friends"
How can I do both at the same time?
LIST="aaa bbbbb ccc ddddd"
for ITEM in $LIST; do
printf "%-.4s blah" $ITEM
done
This prints
aaa blah
bbbbb blah
ccc blah
ddddd blah
I want it to print
aaa blah
bbbb blah
ccc blah
dddd blah
I'd rather not do something like this (unless there's no other option):
LIST="aaa bbbbb ccc ddddd"
for ITEM in $LIST; do
printf "%-4s blah" $(printf "%.4s" "$ITEM")
done
though, obviously, that works (it feels ugly and hackish).

You can use printf "%-4.4s" to get both in one format: the width (4) pads short values and the precision (.4) truncates long ones:
for ITEM in $LIST; do printf "%-4.4s blah\n" "$ITEM"; done
aaa blah
bbbb blah
ccc blah
dddd blah
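When the column width is not known in advance, the same format can be parameterized. The following is a minimal sketch, assuming bash's builtin printf, which accepts `*` for both width and precision. The helper name pad_trunc is hypothetical and not part of the answer above; the `|` in the output is only there to make the padding visible.

```bash
#!/usr/bin/env bash
# pad_trunc WIDTH TEXT (hypothetical helper): left-justify TEXT in a field of
# exactly WIDTH characters, padding short values and truncating long ones.
pad_trunc() {
  # The first "*" takes the minimum field width, the second the precision
  # (maximum number of characters printed); "-" left-justifies.
  printf '%-*.*s' "$1" "$1" "$2"
}

for item in aaa bbbbb ccc ddddd; do
  printf '%s|blah\n' "$(pad_trunc 4 "$item")"
done
```

Command substitution strips trailing newlines but keeps trailing spaces, so the padding survives the `$(...)`.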

Related

Bash include string between chars in awk print

I'm trying to parse a log file that will have lines like this:
aaa bbb ccc: [DDD] efg oi
aaa bbb ccc: lll [DDD] efg oo
aaa bbb ccc: [DDD]
where [DDD] can be at any place in line.
Only one thing will be between [ and ] in any line
Using awk and space as a delimiter, how can I print 1st, 3rd and all data (whole string) between [ and ]?
Expected output: aaa ccc: DDD
gawk(GNU awk) approach:
Let's say we have a file with the following line:
aaa bbb ccc: ddd [fff] ggg hhh
The command:
awk '{match($0,/\[([^]]+)\]/, a); print $1,$3,a[1]}' file
The output:
aaa ccc: fff
match(string, regexp [, array]): Search string for the longest, leftmost substring matched by the regular expression regexp and return the character position (index) at which that substring begins (one, if it starts at the beginning of string). If no match is found, return zero. The optional array argument is a gawk extension; here a[1] receives the text matched by the first parenthesized group.
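Where gawk is not available, the two-argument form of match() can do the same job through the RSTART and RLENGTH variables. This is a sketch under that assumption, not the answerer's method; the helper name extract_fields is hypothetical.

```bash
#!/usr/bin/env bash
# extract_fields [FILE...] (hypothetical helper): print field 1, field 3 and the
# text between the first "[" and "]" of each line that has a bracketed part.
extract_fields() {
  awk '
    # Two-argument match() sets RSTART and RLENGTH for the matched "[...]".
    match($0, /\[[^]]+\]/) {
      # Skip the leading "[" and drop the trailing "]".
      print $1, $3, substr($0, RSTART + 1, RLENGTH - 2)
    }
  ' "$@"
}

printf '%s\n' 'aaa bbb ccc: [DDD] efg oi' 'aaa bbb ccc: lll [DDD] efg oo' |
  extract_fields
```

Lines without a bracketed part are skipped rather than printed with an empty third column.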
Given:
$ cat file
aaa bbb ccc: [DDD] efg oi
aaa bbb [ccc:] lll DDD efg oo
aaa [bbb] ccc: DDD
(note -- changed from the OP's example)
In POSIX awk:
awk 'BEGIN{fields[1]; fields[3]}
{s=""
for (i=1;i<=NF;i++)
if ($i~/^\[/ || i in fields)
s=i>1 ? s OFS $i : $i
gsub(/\[|\]/,"",s)
print s
}' file
Prints:
aaa ccc: DDD
aaa ccc:
aaa bbb ccc:
This does not print the field twice if it is both enclosed in [] and in the selected fields array. (i.e., [aaa] bbb ccc: does not print aaa twice) It will also print in correct field order if you have aaa [bbb] ccc ...
awk '$5=="[DDD]"{gsub("[\\[\\]]","");print $1,$3,$5}' file
or
awk '$5=="[DDD]"{print $1,$3, substr($5,2,3)}' file
aaa ccc: DDD

Comment multiple lines between two markers in a file

Let's say I have a file like this
blah
blah
MARKER 1
blah
blah
blah
MARKER 2
blah
I want to find a single-line command (awk? sed?) in bash to change it into
blah
blah
# MARKER 1
# blah
# blah
# blah
# MARKER 2
blah
The same in sed:
$ sed '/^MARKER 1/,/^MARKER 2/s/^/#/' file
blah
blah
#MARKER 1
#blah
#blah
#blah
#MARKER 2
blah
Updated as anubhava suggested:
awk '/^MARKER 1/,/^MARKER 2/{$0 = "#" $0} 1' testt
blah
blah
#MARKER 1
#blah
#blah
#blah
#MARKER 2
blah
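The requested output has a space after the `#`, while the commands above insert a bare `#`. A minimal sketch of a variant that adds the space and takes the two markers as arguments; the helper name comment_between is hypothetical, and the markers are assumed to be basic regular expressions that contain no `/`.

```bash
#!/usr/bin/env bash
# comment_between START END [FILE...] (hypothetical helper): prefix every line
# in each range from a line matching START through the next line matching END
# with "# ". START and END must not contain "/".
comment_between() {
  local start=$1 end=$2
  shift 2
  sed "/$start/,/$end/s/^/# /" "$@"
}

printf '%s\n' blah 'MARKER 1' blah 'MARKER 2' blah |
  comment_between '^MARKER 1' '^MARKER 2'
```

With no file argument the helper reads standard input, as in the example.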

merge lines based on pattern

I have been struggling to figure out how to 'unparse' lines in a log file (with 2 new line delimiters - '#' and '|') so all lines related to one time stamp are on one line.
Example:
2016-03-22 blah blah blah
|blah blah
|blah blah blah
#blah
|blah blah blah
2016-03-22 blah blah blah
|blah blah blah
#blah blah
#blah blah blah
|blah
Required Output
2016-03-22 blah blah blah |blah blah |blah blah blah #blah |blah blah blah
2016-03-22 blah blah blah |blah blah blah #blah blah #blah blah blah |blah
I thought I had this sussed simply by using xargs to put everything on one line and then using sed to add new lines at 2016, but I discovered there is a limit on the number of characters on one line, and the log file is so big that xargs was creating multiple lines.
Removing the line breaks before lines starting with | and # would solve this, but I can't fathom how to do this either.
I've searched on here and found a few people posting similar questions but I can't interpret some of the solutions to fit in with my issue as I'm not familiar enough with sed/awk/xargs.
Would appreciate if anyone can offer some suggestions.
Thanks
You can use this awk command:
awk '/^[0-9]{4}(-[0-9]{2}){2}/ {
if (p!="")
print p
p=$0
next
}
{
p = p OFS $0
}
END {
print p
}' file
2016-03-22 blah blah blah |blah blah |blah blah blah #blah |blah blah blah
2016-03-22 blah blah blah |blah blah blah #blah blah #blah blah blah |blah
anubhava's answer works, but it buffers each merged record in full before printing it.
This version prints each input line as it is read:
awk '{printf "%s%s", /^[|#]/?OFS:(NR>1)?"\n":"", $0} END{print ""}'
/^[|#]/ match lines starting with # or |
?OFS if matched lead with OFS (output field separator, space by default)
: otherwise
(NR>1) if we aren't on the first line
?"\n" output a newline
:"" otherwise output nothing (to avoid a blank line at the top of the output)
END{print ""} make sure we end the last line with a newline
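The streaming idea can also be keyed on the date at the start of a record rather than on the `|` and `#` prefixes. This is a sketch under the assumption that every record starts with a YYYY-MM-DD date; the helper name merge_records is hypothetical. It spells the date out with bracket expressions because some older awk builds do not support interval expressions such as {4}.

```bash
#!/usr/bin/env bash
# merge_records [FILE...] (hypothetical helper): join continuation lines onto
# the preceding line that starts with a YYYY-MM-DD date.
merge_records() {
  awk '
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/ {
      # A dated line starts a new record: end the previous one first.
      printf "%s%s", (NR > 1 ? "\n" : ""), $0
      next
    }
    # Any other line is a continuation: append it after a separator.
    { printf "%s%s", OFS, $0 }
    # Terminate the last record, unless the input was empty.
    END { if (NR > 0) print "" }
  ' "$@"
}

printf '%s\n' '2016-03-22 one' '|two' '#three' '2016-03-23 four' '|five' |
  merge_records
```

Nothing is accumulated in a variable, so memory use stays flat however long a record gets.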
This might work for you (GNU sed):
sed ':a;N;/\n....-..-.. /!s/\n/ /;ta;P;D' file
Read two lines into the pattern space and if the newline is not the start of a new record, replace it by a space and repeat i.e. append another line to the existing one etc.
If the line appended is the start of a new record, print the first line, delete it and repeat.
Remove the newlines, add a newline at the end of the line and insert newlines before each 2016:
echo '2016-03-22 blah blah blah
|blah blah
|blah blah blah
#blah
|blah blah blah
2016-03-22 blah blah blah
|blah blah blah
#blah blah
#blah blah blah
|blah ' | tr -d '\n' | sed -e 's/$/\n/' -e 's/2016-/\n2016-/g'
But how can I merge lines (only the matching words from them) when the same word exists in both files?
All words change automatically, and the files 1.txt and 2.txt change automatically too, as part of a package manager's script in a Gnome 2 environment. Here "link" means http://link
example INPUT:
1.txt contains detected http and version of packages:
link1/autotools-dev_20100122.1
link4/debhelper_8.0.0
link5/dreamchess_0.2.0
link5/dreamchess_0.2.0-2
link7/quilt_0.48
link7/quilt_0.48-7
link34/quilt-el_0.46.2
link34/quilt-el_0.46.2-1
2.txt contains needed extensions of packages:
autotools-dev_*.diff.gz
debhelper_*.diff.gz
debhelper_*.orig.tar.gz
libmxml-dev_*.diff.gz
libmxml-dev_*.dsc
libmxml-dev_*.orig.tar.gz
libsdl1.2-dev_*.diff.gz
libsdl1.2-dev_*.dsc
libsdl1.2-dev_*.orig.tar.gz
libsdl-image1.2-dev_*.diff.gz
libsdl-image1.2-dev_*.dsc
libsdl-image1.2-dev_*.orig.tar.gz
quilt_*.diff.gz
DESIRED OUTPUT to file 3.txt:
link1/autotools-dev_20100122.1.diff.gz
link4/debhelper_8.0.0.diff.gz
link4/debhelper_8.0.0.orig.tar.gz
libmxml-dev_*.diff.gz
libmxml-dev_*.dsc
libmxml-dev_*.orig.tar.gz
libsdl1.2-dev_*.diff.gz
libsdl1.2-dev_*.dsc
libsdl1.2-dev_*.orig.tar.gz
libsdl-image1.2-dev_*.diff.gz
libsdl-image1.2-dev_*.dsc
libsdl-image1.2-dev_*.orig.tar.gz
link7/quilt_0.48.diff.gz
link7/quilt_0.48-7.diff.gz
So I need a script that automatically detects the package names common to 1.txt and 2.txt and writes to 3.txt, joined on the line where the package name occurs:
the http link and version from 1.txt
the extension from 2.txt
lines from 2.txt whose package name does not occur in 1.txt
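One possible approach, sketched in awk and not taken from any posted answer: read 1.txt first and index every entry by package name, then expand each pattern line of 2.txt once per known version. The helper name merge_packages is hypothetical. The sketch assumes that 1.txt is not empty, that the package name is everything between the last `/` and the last `_`, and that pattern lines contain the literal text `_*`.

```bash
#!/usr/bin/env bash
# merge_packages VERSIONS PATTERNS (hypothetical helper): VERSIONS holds lines
# like "link7/quilt_0.48", PATTERNS holds lines like "quilt_*.diff.gz".
merge_packages() {
  awk '
    # First file: remember every "link/name_version" entry under its name.
    FNR == NR {
      name = $0
      sub(/^.*\//, "", name)      # drop everything up to the last "/"
      sub(/_[^_]*$/, "", name)    # drop the "_version" suffix
      count[name]++
      entry[name, count[name]] = $0
      next
    }
    # Second file: expand "name_*ext" once per known version of name.
    {
      star = index($0, "_*")
      name = (star ? substr($0, 1, star - 1) : "")
      if (star && (name in count)) {
        ext = substr($0, star + 2)
        for (k = 1; k <= count[name]; k++)
          print entry[name, k] ext
      } else {
        print                     # no match in the first file: keep as is
      }
    }
  ' "$1" "$2"
}
```

It would be called as `merge_packages 1.txt 2.txt > 3.txt`. Packages that appear only in 1.txt, such as dreamchess in the example, produce no output, which matches the desired 3.txt.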

Sed match on multiple file, displaying the match together with filename and line number

This is a continuation of "Multiple line, repeated occurrence matching".
I have many test*.txt files with contents as per previous thread.
test1.txt
blah blah..
blah blah..
blah abc blah1
blah blah..
blah blah..
blah abc blah2
blah blah..
blah efg1 blah blah
blah efg2 blah blah
blah blah..
blah blah..
blah abc blah3
blah blah..
blah blah..
blah abc blah4
blah blah..
blah blah blah
blah abc blah5
blah blah..
blah blah..
blah abc blah6
blah blah..
blah efg3 blah blah
blah efg4 blah blah
blah abc blah7
blah blah..
blah blah..
blah abc blah8
blah blah..
Now I want to run the sed command on all files, and also display the filename and line number (if possible) along with the output of the sed command...
I run the command below:
ls test*.txt | xargs sed -n -f findMatch.txt
findMatch.txt content
/abc/h;/efg/!b;x;/abc/p;z;x
output is
blah abc blah2
blah abc blah6
blah abc blah2
blah abc blah6
blah abc blah2
blah abc blah6
I need a bit more detailed output as per below
test1.txt ln6 blah abc blah2
test1.txt ln23 blah abc blah6
test2.txt ln6 blah abc blah2
test2.txt ln23 blah abc blah6
test3.txt ln6 blah abc blah2
test3.txt ln23 blah abc blah6
grep can search for the given patterns in all files:
grep -f pattern_file -Rn *.txt
-f Read patterns from a file
-R Recursive
-n Line number
pattern_file
hai
hello
this
Output:
1.txt:1:hai
1.txt:2:hello
2.txt:1:hai
2.txt:2:this
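grep reports the file name and line number of plain pattern matches, but it does not reproduce the "last abc line before an efg line" logic of the sed script. A sketch of that logic in awk, which tracks FILENAME and FNR itself; the helper name last_abc_before_efg is hypothetical, and the output format follows the one requested in the question.

```bash
#!/usr/bin/env bash
# last_abc_before_efg FILE... (hypothetical helper): for each run of "efg"
# lines, print the most recent preceding "abc" line as "FILE lnN LINE".
last_abc_before_efg() {
  awk '
    FNR == 1 { held = "" }                 # new file: forget the previous one
    /abc/    { held = $0; held_nr = FNR }  # remember the latest abc line
    /efg/ && held != "" {
      printf "%s ln%d %s\n", FILENAME, held_nr, held
      held = ""                            # report each abc line only once
    }
  ' "$@"
}
```

It would be called as `last_abc_before_efg test*.txt`, which also avoids piping `ls` into `xargs`. Unlike a single sed run over several files, the state is reset at the start of each file.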

Filtering for blocks (with headers) containing content in bash

I have an output that looks like this:
foo-2
===========
foo-3
===========
bar bar bar
yadda yadda
blah blah
foo-54
===========
foo-26
===========
How do I print only what has text under it? Meaning, for this example it's foo-3. But I also want it to work if the data appears in foo-54 as well...
foo-3
===========
bar bar bar
yadda yadda
blah blah
I've tried "playing" with sed: sed -n '/foo/,/^foo/!p', but it doesn't print the foo-3 itself and it prints unnecessary ===== lines.
Thanks!
EDIT:
Re-reading the answers, I've apparently asked the wrong question. I don't know where my data is (whether it's in foo-3 or foo-XXXXX).
It doesn't have to be with sed. sed and awk are my "comfort zone"... Any solution will be appreciated.
Based on your edit I think this gnu-awk solution should be simplest/smallest script:
awk -v ORS= -v RS='foo-[0-9]+\n=+\n' '!NF{p=RT} NF{print p $0}' file
foo-3
===========
bar bar bar
yadda yadda
blah blah
Earlier Solution:
sed -n '/^foo-3/,/^foo/{/foo-3/p; /foo-/!p;}' f
foo-3
===========
bar bar bar
yadda yadda
blah blah
$ sed -n '/^foo-3/, ${/^foo-[^3]/q; p}' input
foo-3
===========
bar bar bar
yadda yadda
blah blah
A better solution using awk would be
$ awk 'flag && /^foo/{flag=0} /^foo-3$/{flag++} flag' input
foo-3
===========
bar bar bar
yadda yadda
blah blah
Here is an awk version:
awk '!(/^foo/ || $0~sep) {data=data RS $0;next} data {print header RS sep data RS;data=""} /^foo/ {header=$0}' sep="===========" file
foo-3
===========
bar bar bar
yadda yadda
blah blah
Somewhat more readable:
awk '
!(/^foo/ || $0~sep) {
data=data RS $0
next}
data {
print header "\n"sep data
data=""}
/^foo/ {
header=$0}
' sep="===========" file
cat file
foo-2
===========
foo-3
===========
bar bar bar
yadda yadda
blah blah
foo-34
===========
more data
test this
foo-26
===========
gives
foo-3
===========
bar bar bar
yadda yadda
blah blah
foo-34
===========
more data
test this
In pure native bash, printing only blocks with content (and their headers):
#!/bin/bash
sep="==========="
last_line=''
header=''
content=( )
while read -r line; do
if [[ $line = $sep ]]; then
if (( ${#content[@]} )); then
printf '%s\n' "$header" "$sep" "${content[@]}"
fi
header=$last_line
content=( )
else
[[ $last_line && $last_line != $sep ]] && content+=( "$last_line" )
fi
last_line=$line
done
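The loop above prints a block only when the next separator arrives, so a block with content at the very end of the input is never flushed. A sketch of a variant that handles that case by reading all lines first and treating any line followed by a separator as a header. The function name print_nonempty_blocks is hypothetical, and mapfile assumes bash 4 or later.

```bash
#!/usr/bin/env bash
# print_nonempty_blocks (hypothetical helper): read blocks from standard input
# and print only those that have content lines under their header.
print_nonempty_blocks() {
  local sep='===========' header='' i
  local -a lines=() content=()
  mapfile -t lines                      # requires bash 4 or later
  for (( i = 0; i < ${#lines[@]}; i++ )); do
    if [[ ${lines[i+1]-} == "$sep" ]]; then
      # lines[i] is a header because a separator follows it:
      # flush the previous block, then start a new one.
      if (( ${#content[@]} > 0 )); then
        printf '%s\n' "$header" "$sep" "${content[@]}"
      fi
      header=${lines[i]}
      content=()
      i=$(( i + 1 ))                    # skip the separator line
    else
      content+=( "${lines[i]}" )
    fi
  done
  # Flush the last block, which no later header would trigger.
  if (( ${#content[@]} > 0 )); then
    printf '%s\n' "$header" "$sep" "${content[@]}"
  fi
}
```

It would be called as `print_nonempty_blocks < file`. Reading the whole input into an array trades memory for simpler lookahead, which is reasonable for command output of modest size.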
