How to Indent a String in Bash using printf?

Is there an example of indenting strings in Bash (for output)?
I found examples using printf but they don't seem to work as expected.
I want to simply indent a given string with a given number of spaces.
echo "Header"
indent "Item 1" 2
indent "Sub Item 1a" 4
indent "Sub Item 1b" 4
would produce the output
Header
  Item 1
    Sub Item 1a
    Sub Item 1b

In printf, something like %3s means "a string, but with as many initial spaces as are necessary to ensure that the string is at least 3 columns wide".
This works even if the string is the empty string '', in which case %3s means essentially "three spaces".
So, for example, indent "Sub Item 1a" 4 can be expressed as printf '%4s%s\n' '' "Sub Item 1a", which prints four spaces followed by "Sub Item 1a" and a newline.
If you want, you can implement indent as a function:
function indent () {
local string="$1"
local num_spaces="$2"
printf "%${num_spaces}s%s\n" '' "$string"
}
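As a variation on the function above, here is a minimal sketch that passes the width through printf's `*` field-width specifier, so the format string stays constant. The default of 0 for a missing second argument is an assumption added for illustration.

```shell
#!/usr/bin/env bash
# Sketch: indent STRING by NUM_SPACES spaces (NUM_SPACES defaults to 0, an assumption).
indent() {
    local string="$1"
    local num_spaces="${2:-0}"
    # %*s takes its width from the next argument; padding an empty string
    # to that width yields exactly num_spaces spaces.
    printf '%*s%s\n' "$num_spaces" '' "$string"
}

echo "Header"
indent "Item 1" 2
indent "Sub Item 1a" 4
indent "Sub Item 1b" 4
```

Keeping the width out of the format string avoids building a format from user input.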

Related

Remove characters between markers in a bash variable

I'm trying to remove unknown characters between 2 known markers from a variable using bash.
eg
string="This text d #! more text jsdlj end and mo{re ;re end text.text"
I want to remove all the characters between the last occurrence of the word "text " (before the end word) and the first occurrence of the word "end" after it, keeping both of these markers.
result="This text d #! more text end and mo{re ;re end text.text"
I'll be using it as part of a find -print0 | xargs -0 bash -c 'command; command...etc.' script.
I've tried
echo $string | sed 's/[de][ex][ft][^\-]*//' ;
but that does it from the first "ext" and "-" (not the last "ext" before the end marker) and also does not retain the markers.
Any suggestions?
EDIT: So far the outcomes are as follows:
string="text text text lk;sdf;-end end 233-end.txt"
start="text "
end="-end"
Method 1
[[ $string =~ (.*'"${start}"').*('"${end}"'.*) ]] || :
nstring="${BASH_REMATCH[1]}${BASH_REMATCH[2]}" ;
echo "$nstring" ;
>"text text text -end.txt"
Required output = "text text text -end end 233-end.txt"
Method 2
temp=${cname%'"$end"'*}
nend=${cname#"$temp"}
nstart=${temp%'"$start"'*}
echo "$nstart$nend"
>"text text -end.txt"
Required output = "text text text -end end 233-end.txt"
Method 3
nstring=$(sed -E "s/(.*'"$start"').*('"$end"')/\1\2/" <<< "$string")
echo "$nstring";
>"text text text -end.txt"
Required output = "text text text -end end 233-end.txt"
Method 4
nstring=$(sed -En "s/(^.*'"$start"').*('"$end"'.*$)/\1\2/p" <<< "$string")
echo "$nstring" ;
>"text text text -end.txt"
Required output = "text text text -end end 233-end.txt"

Using Bash's Regex match:
#!/usr/bin/env bash
string='This text and more text jsdlj-end.text'
[[ $string =~ (.*text\ ).*(-end.*) ]] || :
printf %s\\n "${BASH_REMATCH[1]}${BASH_REMATCH[2]}"

UPDATE: question has been updated with more details for dealing with a string that contains multiple start and end markers.
The new input string:
This text d #! more text jsdlj end and mo{re ;re end text.text
Test case:
start marker = 'text'
end marker = 'end'
objective = remove all text between last start marker and before the first end marker (actually replace all said text with a single space)
Input with all markers in bold:
This text d #! more text jsdlj end and mo{re ;re end text.text
Input with the two markers of interest in bold:
This text d #! more text jsdlj end and mo{re ;re end text.text
Desired result:
This text d #! more text end and mo{re ;re end text.text
While we can use sed to remove the desired text (replace <space>jsdlj<space> with <space>), we have to deal with the fact that sed does greedy matching (fine for finding the 'last' start marker) but does not do non-greedy matching (needed to find the 'first' end marker). We can get around this limitation by switching out our end marker with a single-character replacement, simulate a non-greedy match, then switch back to the original end marker.
m1='text' # start marker
m2='end' # end marker
string="This text d #! more text jsdlj end and mo{re ;re end text.text"
sed -E "s/${m2}/@/g;s/(^.*${m1})[^@]*(@.*$)/\1 \2/;s/@/${m2}/g" <<< "${string}"
Where:
-E - enable Extended regex support (includes capture groups)
s/${m2}/@/g - replace our end marker with the single character @ (OP needs to pick a character that cannot show up in expected input strings; # would not be safe here because the sample string already contains #!)
(^.*${m1}) - 1st capture group; greedy match from start of string up to last start marker before ...
[^@]* - match everything that's not the @ character
(@.*$) - 2nd capture group; everything from @ character until end of string
\1 \2 - replace entire string with 1st capture group + <space> + 2nd capture group
s/@/${m2}/g - replace single character @ with our end marker
This generates:
This text d #! more text end and mo{re ;re end text.text
Personally, I'd probably opt for a more straightforward parameter expansion approach (similar to Jetchisel's answer), but that could be a bit problematic for inline xargs processing.
Original answer
One sed idea using capture groups:
$ string="This text and more text jsdlj-end.text"
$ sed -En 's/(^.*text ).*(-end.*$)/\1\2/p' <<< "${string}"
This text and more text -end.text
Where:
-En - enable Extended regex support (and capture groups) and (-n) disable default printing of pattern space
(^.*text ) - first capture group = start of line up to last text
.* - everything between the 2 capture groups
(-end.*$) - second capture group = from -end to end of string
\1\2/p - print the contents of the 2 capture groups.
Though this runs into issues if there are multiple -end strings on the 'end' of the string, eg:
$ string="This text and more text jsdlj-end -end.text"
$ sed -En 's/(^.*text ).*(-end.*$)/\1\2/p' <<< "${string}"
This text and more text -end.text
Whether this is correct or not depends on the desired output (and assuming this type of 'double' ending string is possible).

With Parameter Expansion.
string="This text and more text jsdlj-end.text"
temp=${string%-*}
end=${string#"$temp"}
start=${temp% *}
echo "$start$end"
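For the updated question (several start and end markers in one string), the same parameter-expansion idea can be extended. The sketch below is not taken from any of the answers: `remove_between` and its optional separator argument are hypothetical names introduced for illustration. It treats the markers as literal strings, finds the last start marker that still has an end marker after it, and cuts up to the first end marker that follows.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove_between STRING START END [SEP]
# SEP (default: one space) is what gets placed between the two markers.
remove_between() {
    local s="$1" start="$2" end="$3" sep="${4- }"
    local head before rest tail

    head=${s%"$end"*}            # everything before the last END
    if [ "$head" = "$s" ]; then  # no END marker at all: nothing to do
        printf '%s\n' "$s"; return
    fi

    before=${head%"$start"*}     # everything before the last START that precedes an END
    if [ "$before" = "$head" ]; then   # no START before the last END
        printf '%s\n' "$s"; return
    fi

    rest=${s#"$before$start"}    # text following that START marker
    tail=${rest#*"$end"}         # text following the first END after it

    printf '%s\n' "$before$start$sep$end$tail"
}

remove_between 'This text d #! more text jsdlj end and mo{re ;re end text.text' 'text' 'end'
```

Because the markers are quoted inside the expansions, characters such as `{`, `;` or `#` in the input need no escaping, and no placeholder character has to be reserved.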

This is a bit tricky using only a POSIX extended regex (ERE), but easy with a Perl-compatible regex (PCRE), so we switch from sed to perl.
To get the last text (that still has an end afterwards), put a greedy .* in front. The closest end after that text can then be matched using a non-greedy .*?.
We also put \b around text and end to avoid matching parts of other words (for example, the word send should not be matched even though it contains end).
perl -pe 's/(.*\btext\b).*?(\bend\b)/$1 $2/' <<< "$string"

Need to read file and grab each block of text between two blank lines

Need while loop that can get each two lines and store in variable.
while read data; do
echo $data
done
so I need to do something for each block of text which is two lines each.

For this input -
some text here
some text here a
some text here 2
some text here 2a
This will merge every two lines using while read line. It's NOT how I'd do it, but it does what you said you wanted:
last=""
while read line; do
if [ "$last" != "" ]; then
echo "$last$line"
last=""
else
last=$line
fi
done
if [ "$last" != "" ]; then
echo "$last"
fi
This great article (How to merge every two lines into one from the command line?) shows lots of different ways of merging 2 lines ..

You can read two lines in the while condition:
while read -r first && read -r second
do
echo "${first} ${second}"
done
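If the two-line blocks are separated by blank lines, as the question title suggests, the paired read can be combined with a check that skips the separators. This is only a sketch: `process_pairs` and the `block:` output format are made up for illustration, and the sample input assumes one blank line between blocks.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read two-line blocks from stdin, skipping blank lines.
process_pairs() {
    local first second
    while IFS= read -r first; do
        [ -z "$first" ] && continue    # blank separator line: skip it
        IFS= read -r second || :       # second line of the block (may be missing at EOF)
        printf 'block: %s | %s\n' "$first" "$second"
    done
}

process_pairs <<'EOF'
some text here
some text here a

some text here 2
some text here 2a
EOF
```

Inside the loop both lines of a block are available as separate variables, so the printf can be replaced with whatever needs to be done per block.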

It would help to know what you want to do to the two lines, but you can collect each block of 2 surrounded by empty lines easy enough with awk, e.g.
awk '
NF==0 { n=0; next }
n<2 { arr[++n]=$0 }
n==2 { printf "do to: %s & %s\n",arr[1],arr[2]; n=0 }
' file
or as a 1-liner:
awk 'NF==0{n=0;next} n<2{arr[++n]=$0} n==2{printf "do to: %s & %s\n",arr[1],arr[2]; n=0}' file
There are three rules. The first checks whether the line is empty with NF==0 and, if so, resets the index n to zero and skips to the next record (line). The second applies while n<2 and adds the current line to the array arr. The final rule, where n==2, does whatever you need to the lines held in arr[1] and arr[2] and then resets the index n to zero.
Example Input File
Shamelessly borrowed from the other answer and modified (thank you), you could have:
$ cat file
some text here
some text here a
some text here 2
some text here 2a
Example Use/Output
Where each 2-lines separated by whitespace are collected and then output with "do to: " prefixed and the lines joined by " & ", for example purposes only:
$ awk 'NF==0{n=0;next} n<2{arr[++n]=$0} n==2{printf "do to: %s & %s\n",arr[1],arr[2]; n=0}' file
do to: some text here & some text here a
do to: some text here 2 & some text here 2a
Depending on what you need to do to the lines, awk may provide a very efficient solution. (as may sed)

Read content of file line by line in unix using 'line'

I have a file - abc, which has the below content -
Bob 23
Jack 44
Rahul 36
I also have a shell script that adds up all the numbers in it.
The specific line that picks up these numbers is -
while read line
do
num=`echo ${line#* }`
sum=`expr $sum + $num`
count=`expr $count + 1`
done< "$readfile"
I assumed that the code is just picking up the last field from the file, but it's not. If I modify the file like this:
Bob 23 12
Jack 44 23
Rahul 36 34
the same script fails with a syntax error.
NOTE: I know there are other ways to pick up the field value, but I would like to know how this works.

The syntax ${line#* } removes the shortest prefix that ends in a space and returns the rest. That worked fine when you had just 2 columns. With 3 columns it returns the last 2 column values, and passing those to expr throws a syntax error. To illustrate, take
str='foo bar'
printf '%s\n' "${str#* }"
bar
but imagine the same for 3 fields
str='foo bar foobar'
printf '%s\n' "${str#* }"
bar foobar
To fix that, use the parameter expansion syntax "${str##* }", which removes the longest matching prefix instead. To fix your script for the example with 3 columns, I would use a script as below.
This redirects the file into the loop and uses the read command with the default IFS value (space, tab and newline), which splits each line into fields. I'm reading only the 3rd field on each line (even if the line has more fields); the _ marks the fields I'm skipping. You could also use named variables as placeholders and use their values in the script.
declare -i sum
while read -r _ _ value _ ; do
((sum+=value))
done < file
printf '%d\n' "$sum"
See Bash - Parameter Expansion (Substring removal) to understand more.
You could also use the PE syntax ${line##* } as below,
while read -r line ; do
((sum+=${line##* }))
done < file
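To see the operators next to each other, here is a small sketch that applies the four prefix/suffix removal forms to one of the sample lines. The variable names are chosen for illustration only.

```shell
#!/usr/bin/env bash
line='Bob 23 12'

after_first_space=${line#* }    # shortest prefix matching "* " removed
last_field=${line##* }          # longest prefix matching "* " removed
without_last=${line% *}         # shortest suffix matching " *" removed
first_field=${line%% *}         # longest suffix matching " *" removed

printf '%s\n' "$after_first_space"   # 23 12
printf '%s\n' "$last_field"          # 12
printf '%s\n' "$without_last"        # Bob 23
printf '%s\n' "$first_field"         # Bob
```

Only the ## form yields a single number for a three-column line, which is why the original script broke when a column was added.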
[Not relevant to the current question]
If you just want the sum and don't specifically need a bash script for it, you can use a simple Awk command to sum the values in the 3rd column:
awk '{sum+=$3}END{print sum}' inputfile

Print a string with its special characters printed as literal escape sequences

I have a string in a shell/bash script. I want to print the string with all its "special characters" (eg. newlines, tabs, etc.) printed as literal escape sequences (eg. a newline is printed as \n, a tab is printed as \t, and so on).
(Not sure if I'm using the correct terminology; the example should hopefully clarify things.)
Example
The desired output of...
a="foo\t\tbar"
b="foo bar"
print_escape_seq "$a"
print_escape_seq "$b"
...is:
foo\t\tbar
foo\t\tbar
$a and $b are strings that were read in from a text file.
There are two tab characters between foo and bar in the $b variable.
An attempt
This is what I've tried:
#!/bin/sh
print_escape_seq() {
str=$(printf "%q\n" $1)
str=${str/\/\//\/}
echo $str
}
a="foo\t\tbar"
b="foo bar"
print_escape_seq "$a"
print_escape_seq "$b"
The output is:
foo\t\tbar
foo bar
So, it doesn't work for $b.
Is there an entirely straightforward way to accomplish this that I've missed completely?

Bash (4.4 and later) has a string quoting operation, ${var@Q}.
Here is some example code:
bash_encode () {
esc=${1@Q}
echo "${esc:2:-1}"
}
testval=$(printf "hello\t\tworld")
set | grep "^testval="
echo "The encoded value of testval is" $(bash_encode "$testval")
Here is the output
testval=$'hello\t\tworld'
The encoded value of testval is hello\t\tworld
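One caveat with stripping a fixed two characters: ${var@Q} only produces the $'...' form when the value contains control characters, and otherwise returns a plain '...' string. The sketch below handles both cases and reuses the question's function name `print_escape_seq`. It assumes bash 4.4 or later, and values containing single quotes will still show shell quoting artifacts.

```shell
#!/usr/bin/env bash
# Sketch: print a string with control characters shown as escape sequences.
# Assumes bash >= 4.4 (for the @Q operator).
print_escape_seq() {
    local quoted=${1@Q}
    if [[ $quoted == "\$'"* ]]; then
        quoted=${quoted:2:-1}    # strip the leading $' and the trailing '
    else
        quoted=${quoted:1:-1}    # strip the plain single quotes
    fi
    printf '%s\n' "$quoted"
}

a='foo\t\tbar'                     # literal backslash-t sequences
b=$(printf 'foo\t\tbar')           # two real tab characters
print_escape_seq "$a"
print_escape_seq "$b"
```

Both calls print foo\t\tbar, which is the output the question asks for.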

You will need to create a search-and-replace pattern for each character you wish to replace. Something like this:
#!/bin/bash
esc() {
# replace each space with \s
v=${1// /\\s}
# replace each tab with \t ($'\t' is a literal tab character)
v=${v//$'\t'/\\t}
printf '%s\n' "$v"
}
esc "hello world"
esc $'hello\tworld'
This outputs
hello\sworld
hello\tworld

I needed something similar for file paths and realized that ls -1b does the job, but while researching I found this solution on Stack Overflow, which is closer to what you are asking for:
Command to escape a string in bash
Just compile it with gcc -o "escapify" escapify.c

Bash array + sed + html

I need to change the prices in an HTML file. I search for them and store them in an array, but then I have to replace them and save the result to /nuevo-focus.html.
price=( `cat /home/delkav/info-sitioweb/html/productos/autos/nuevo-focus.html | grep -oiE '([$][0-9.]{1,7})'|tr '\n' ' '` )
price2=( $90.880 $0 $920 $925 $930 $910 $800 $712 $27.220 $962 )
sub (){
for item in "${price[@]}"; do
for x in ${price2[@]}; do
sed s/$item/$x/g > /home/delkav/info-sitioweb/html/productos/autos/nuevo-focus.html
done
done
}
sub
The output of "cat /home/.../nuevo-focus.html|grep -oiE '([$][0-9.]{1,7})'|tr '\n' ' '" is:
$86.880 $0 $912 $908 $902 $897 $882 $812 $25.725 $715

In bash, $0 is the name of the script being run and $1 through $9 are its command line arguments. In the line:
price2=( $90.880 $0 $920 $925 $930 $910 $800 $712 $27.220 $962 )
the unquoted values are expanded before they are stored: $0 becomes the script name, and a value such as $90.880 is read as $9 followed by 0.880, so each $N turns into either an empty string or the corresponding command line argument.
Try doing this instead:
price2=( '$90.880' '$0' '$920' '$925' '$930' '$910' '$800' '$712' '$27.220' '$962' )
EDIT for part two of question
If what you are trying to do with the sed line is replace the prices in the file, overwriting the old ones, then you should do this:
sed -i "s/$item/$x/g" /home/delkav/info-sitioweb/html/productos/autos/nuevo-focus.html
This will perform the substitution in place (-i), modifying the input file.
EDIT for part three of the question
I just realized that your nested loop does not really make sense. I am assuming that what you want to do is replace each price from price with the corresponding price in price2.
If that is the case, then you should use a single loop, looping over the indices of the array:
for i in ${!price[*]}
do
sed -i "s/${price[$i]}/${price2[$i]}/g" /home/delkav/info-sitioweb/html/productos/autos/nuevo-focus.html
done
I'm not able to test that right now, but I think it should accomplish what you want.
To explain it a bit:
${!price[*]} gives you all of the indices of your array (e.g. 0 1 2 3 4 ...)
For each index we then replace the corresponding old price with the new one. There is no need for a nested loop as you have done. When you execute that, what you are basically doing is this:
replace every occurrence of "foo" with "bar"
# at this point, there are now no more occurrences of "foo" in your file
# so all of the other replacements do nothing
replace every occurrence of "foo" with "baz"
replace every occurrence of "foo" with "spam"
replace every occurrence of "foo" with "eggs"
replace every occurrence of "foo" with "qux"
replace every occurrence of "foo" with "whatever"
etc...
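Since the prices contain $ and ., which are also regex metacharacters, one alternative is to do the index-paired replacement with bash's own literal substitution instead of sed. This is a different technique from the sed -i loop above, shown only as a sketch: `replace_prices`, `old_prices` and `new_prices` are hypothetical names, and the file is read into memory as a whole.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace each old price with the new price at the same index.
# Relies on two global arrays, old_prices and new_prices, of equal length.
replace_prices() {
    local file="$1" content i
    content=$(<"$file")                  # whole file; trailing newlines are dropped
    for i in "${!old_prices[@]}"; do
        # A quoted pattern is matched literally, so "$" and "." need no escaping.
        content=${content//"${old_prices[i]}"/"${new_prices[i]}"}
    done
    printf '%s\n' "$content" > "$file"   # write back with a single trailing newline
}

# Single quotes keep bash from expanding $8, $9 and so on.
old_prices=('$86.880' '$912')
new_prices=('$90.880' '$920')
```

A call such as replace_prices /path/to/page.html (path hypothetical) then rewrites the file once. As with the sed loop, the replacements run one after another, so a new price that equals a later old price would be replaced again.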
