Bash %s format specifier eliminates spaces in string on print out - bash

I am starting to use printf instead of echo. My first foray into printf %s is this:
#!/bin/bash
danny=$(tail -1 /come/and/play/with/US.log| ~/walt/convert_gm_est)
printf "%s" $danny
08:02.020ZINFO<-casper/casperbox001wowSYSSTATS[sz=21,tag=0,aux=0]process_log2221397800
It eliminates all the spaces in the string. So I added a space after the format specifier. The string looks nicer with the spaces, plus I could rip out the time with awk very easily. I don't see anything about this online. I tried "%s\s" and that did not work. Is it standard procedure to put a space after %s, or am I missing something?
#!/bin/bash
danny=$(tail -1 /come/and/play/with/US.log| ~/walt/convert_gm_est)
printf "%s " $danny
08:02.020Z INFO <- casper/casperbox001wow SYSSTATS[sz=21, tag=0, aux=0] process_log 2221397 80 0 casper@casperbox001wow:~$

When the shell evaluates:
printf "%s" $danny
the shell will expand the value of the variable danny and then split it into words. It will also expand globs in those words. Once that is done, the expression will look something like this (quotes added for clarification):
printf '%s' '08:02.020Z' 'INFO' '<-' 'casper/casperbox001wow' 'SYSSTATS[sz=21,' ...
printf repeats its format string until all of the arguments are consumed. So using the format string %s causes the arguments to be concatenated without intervening spaces.
You probably meant to quote $danny so that it would be presented as a single argument to printf:
printf "%s" "$danny"
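To see the difference side by side, here is a minimal sketch. The sample line is a made-up stand-in for the real log output, and the variable names are only for illustration:

```shell
#!/bin/bash
# Hypothetical sample line standing in for the real log output.
line='08:02.020Z INFO <- process_log 2221397 80 0'

# Unquoted: the shell splits $line into seven words, and printf reuses
# the "%s" format for each one, so they are printed back to back.
unquoted=$(printf '%s' $line)

# Quoted: printf receives a single argument with its spaces intact.
quoted=$(printf '%s' "$line")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
```

If you also want a newline at the end, so the prompt does not run on after the output, use printf '%s\n' "$danny".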

Related

How do you convert characters to ASCII without use of the printf in bash

ascii() { printf '%d' "'$1"; }
I am currently using this function to convert characters to ASCII, however I just want to store the result of the function as a variable without printing the ascii. How would I go about this? (please bear in mind I have only been using bash for a few hours total, so sorry if this is a dumb question.)
In bash, after
printf -v numval "%d" "'$1"
the variable numval (you can use any other valid variable name) will hold the numerical value of the first character of the string contained in the positional parameter $1.
Alternatively, you can use the command substitution:
numval=$(printf "%d" "'$1")
Note that these still use printf but won't print anything to stdout.
As stated in the comment by @Charles Duffy, the printf -v version is more efficient, but less portable (standard POSIX shell does not support the -v option).
Thanks for your script! I didn't know how to get ASCII values, so "'M" rescued me.
I pass the name of a variable to the function, and the function stores its result in that variable.
I reserve function return values for, well, returning error/status codes.
#!/bin/sh
# posix
ascii () {
# $1 decimal ascii code return
# $2 character
eval $1=$(printf '%d' "'$2")
}
ascii cod 'M'
echo "'M' = $cod"
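In bash specifically, the same "store the result in a named variable" idea can be sketched without eval by combining it with the printf -v option mentioned earlier. The function name ascii_to_var is hypothetical, and this variant assumes bash (it is not POSIX sh):

```shell
#!/bin/bash
# ascii_to_var NAME CHAR
# Stores the decimal code of CHAR in the variable called NAME.
# (Hypothetical helper; requires bash because of printf -v.)
ascii_to_var() {
    printf -v "$1" '%d' "'$2"
}

ascii_to_var cod 'M'
echo "'M' = $cod"
```

Because nothing is passed through eval, the character argument is never re-parsed as shell code.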

In bash how can I get the last part of a string after the last hyphen [duplicate]

I have this variable:
A="Some variable has value abc.123"
I need to extract this value, i.e. abc.123. Is this possible in bash?
Simplest is
echo "$A" | awk '{print $NF}'
Edit: explanation of how this works...
awk breaks the input into different fields, using whitespace as the separator by default. Hardcoding 5 in place of NF prints out the 5th field in the input:
echo "$A" | awk '{print $5}'
NF is a built-in awk variable that gives the total number of fields in the current record. The following returns the number 5 because there are 5 fields in the string "Some variable has value abc.123":
echo "$A" | awk '{print NF}'
Combining $ with NF outputs the last field in the string, no matter how many fields your string contains.
Yes; this:
A="Some variable has value abc.123"
echo "${A##* }"
will print this:
abc.123
(The ${parameter##word} notation is explained in §3.5.3 "Shell Parameter Expansion" of the Bash Reference Manual.)
Some examples using parameter expansion
A="Some variable has value abc.123"
echo "${A##* }"
abc.123
Shortest trailing match on " " (space)
echo "${A% *}"
Some variable has value
Shortest trailing match on "." (dot)
echo "${A%.*}"
Some variable has value abc
Longest trailing match on " " (space)
echo "${A%% *}"
Some
Read more: Shell Parameter Expansion in the Bash Reference Manual.
The documentation is a bit painful to read, so I've summarised it in a simpler way.
Note that the '*' needs to swap places with the ' ' depending on whether you use # or %. (The * is just a wildcard, so you may need to take off your "regex hat" while reading.)
${A% *} - remove shortest trailing * (strip the last word)
${A%% *} - remove longest trailing * (strip the last words)
${A#* } - remove shortest leading * (strip the first word)
${A##* } - remove longest leading * (strip the first words)
Of course a "word" here may contain any character that isn't a literal space.
You might commonly use this syntax to trim filenames:
${A##*/} removes all containing folders, if any, from the start of the path, e.g.
/usr/bin/git -> git
/usr/bin/ -> (empty string)
${A%/*} removes the last file/folder/trailing slash, if any, from the end:
/usr/bin/git -> /usr/bin
/usr/bin/ -> /usr/bin
${A%.*} removes the last extension, if any (just be wary of things like my.path/noext):
archive.tar.gz -> archive.tar
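The filename cases above can be tried together in a short sketch. The path and the variable names are made up for illustration:

```shell
#!/bin/sh
# Hypothetical path, used only to illustrate the expansions.
path=/usr/bin/archive.tar.gz

dir=${path%/*}       # shortest trailing "/*" removed  -> /usr/bin
base=${path##*/}     # longest leading "*/" removed    -> archive.tar.gz
stem=${base%.*}      # shortest trailing ".*" removed  -> archive.tar
root=${base%%.*}     # longest trailing ".*" removed   -> archive
ext=${base##*.}      # longest leading "*." removed    -> gz

printf '%s\n' "$dir" "$base" "$stem" "$root" "$ext"
```

These expansions are part of POSIX sh, so they do not need bash and do not start any external process.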
How do you know where the value begins? If it always starts at the 5th word, you could use e.g.:
B=$(echo "$A" | cut -d ' ' -f 5-)
This uses the cut command to slice out part of the line, using a simple space as the word delimiter.
As pointed out by Zedfoxus here, this is a very clean method that works on most Unix-like systems, and you don't need to know the exact position of the substring.
A="Some variable has value abc.123"
echo "$A" | rev | cut -d ' ' -f 1 | rev
# abc.123
More ways to do this:
(Run each of these commands in your terminal to test this live.)
For all answers below, start by typing this in your terminal:
A="Some variable has value abc.123"
The array example (#3 below) is a really useful pattern, and depending on what you are trying to do, sometimes the best.
1. with awk, as the main answer shows
echo "$A" | awk '{print $NF}'
2. with grep:
echo "$A" | grep -o '[^ ]*$'
the -o says to only retain the matching portion of the string
the [^ ] part says "don't match spaces"; i.e. "not the space char"
the * means "match 0 or more instances of the preceding pattern (which is [^ ])", and the $ means "match the end of the line". So this matches the last word, from just after the last space through to the end of the line; i.e. abc.123 in this case.
3. via regular bash "indexed" arrays and array indexing
Convert A to an array, with elements being separated by the default IFS (Internal Field Separator) characters, which are space, tab, and newline:
Option 1 (will "break in mysterious ways", as @tripleee put it in a comment here, if the string stored in the A variable contains certain special shell characters, so Option 2 below is recommended instead!):
# Capture space-separated words as separate elements in array A_array
A_array=($A)
Option 2 [RECOMMENDED!]. Use the read command, as I explain in my answer here, and as is recommended by the bash shellcheck static code analyzer tool for shell scripts, in ShellCheck rule SC2206, here.
# Capture space-separated words as separate elements in array A_array, using
# a "herestring".
# See my answer here: https://stackoverflow.com/a/71575442/4561887
IFS=" " read -r -a A_array <<< "$A"
Then, print only the last element in the array:
# Print only the last element via bash array right-hand-side indexing syntax
echo "${A_array[-1]}" # last element only
Output:
abc.123
Going further:
This pattern is also useful because it lets you easily do the opposite: obtain all words except the last one, like this:
array_len="${#A_array[@]}"
array_len_minus_one=$((array_len - 1))
echo "${A_array[@]:0:$array_len_minus_one}"
Output:
Some variable has value
For more on the ${array[@]:start:length} array slicing syntax above, see my answer here: Unix & Linux: Bash: slice of positional parameters, and for more information on the bash "Arithmetic Expansion" syntax, see here:
https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Arithmetic-Expansion
https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Shell-Arithmetic
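Putting the array steps together, here is a minimal sketch. It assumes bash, the variable names are illustrative, and it computes the last index explicitly instead of using [-1], in case negative indices are not available in your bash version:

```shell
#!/bin/bash
A="Some variable has value abc.123"

# Split on spaces into an array; read stops at the newline that the
# herestring appends, so the last element stays clean.
IFS=' ' read -r -a words <<< "$A"

count=${#words[@]}
last=${words[count-1]}                    # subscripts are arithmetic expressions

# Copy every element except the last, then join the copy with spaces.
rest_words=("${words[@]:0:count-1}")
rest="${rest_words[*]}"

printf 'last: %s\nrest: %s\n' "$last" "$rest"
```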
You can use a Bash regex:
A="Some variable has value abc.123"
[[ $A =~ [[:blank:]]([^[:blank:]]+)$ ]] && echo "${BASH_REMATCH[1]}" || echo "no match"
Prints:
abc.123
That works with any [:blank:] delimiter in the current locale (usually [ \t]). If you want to be more specific:
A="Some variable has value abc.123"
pat='[ ]([^ ]+)$'
[[ $A =~ $pat ]] && echo "${BASH_REMATCH[1]}" || echo "no match"
echo "Some variable has value abc.123"| perl -nE'say $1 if /(\S+)$/'

Comparing special characters in Bash

I have a function I have written that outputs coloured output kind of like this:
foo() {
local arrow="\033[1;32m↑\033[0m"
echo "1$arrow"
}
I'd like to test this function so I'm using https://github.com/kward/shunit2 to do unit tests for it.
test_foo() {
local green_arrow="\033[1;32m↑\033[0m"
assertEquals "1$green_arrow" "$(foo)"
}
But shunit complains that: ASSERT:expected:< 1↑> but was:< 1↑>. I'm guessing there's a problem with hidden characters being output by the function.
Is there any way to strip special characters, or escape them from the variable?
Edit:
Running the following commands shows that the literal \033 sequence is being converted, probably by echo, into an actual escape character (which %q displays as \E):
printf '%q' "1$green_arrow"
' 1\\033[1;32m?\206\221\\033[0m'
printf '%q' "$(foo)"
' 1\E[1;32m?\206\221\E[0m'
Using printf -v varname_quoted %q "$varname" to get escaped forms of your content will provide an easier-to-read and easy-to-reason-about form, which can also be included in your code as literals with no additional quoting or escaping needed. (This can be simplified to printf %q "$varname" if your goal is to emit quoted content to stdout, rather than to another variable).
That is to say, to get a format string to a literal (with support in the format string for arguments, ie. %s, %02f, etc):
printf -v green_arrow '\033[1;32m↑\033[0m'
...or, a backslash-escaped string (which supports only a subset of full format-string syntax) to a literal:
printf -v green_arrow '%b' '\033[1;32m↑\033[0m'
...and then, to get that literal emitted in escaped, human-readable form for easy comparison and/or copy-and-paste into shell scripts as a literal:
printf '%q\n' "$green_arrow"
Tying this all together, the following is one way to get a more readable error message:
test_foo() {
local green_arrow=$'\E[1;32m↑\E[0m'
assertEquals "$(printf '%q' " 1$green_arrow")" "$(printf '%q' "$(foo)")"
}
...or, more efficiently (avoiding the unnecessary subshells):
test_foo() {
local green_arrow=$'\E[1;32m↑\E[0m'
local desired_answer_quoted actual_answer_quoted
printf -v desired_answer_quoted '%q' " 1$green_arrow"
printf -v actual_answer_quoted '%q' "$(foo)"
assertEquals "$desired_answer_quoted" "$actual_answer_quoted"
}
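The underlying mismatch can be reproduced outside of shunit2 with a small sketch. It assumes bash, and the variable names raw, cooked, and ansi are made up for illustration:

```shell
#!/bin/bash
raw='\033[1;32m'              # 10 literal characters, starting with a backslash
printf -v cooked '%b' "$raw"  # %b interprets the escape: ESC [ 1 ; 3 2 m
ansi=$'\033[1;32m'            # ANSI-C quoting yields the same 7 bytes directly

if [[ $cooked == "$ansi" ]]; then
    echo "cooked and ansi are identical"
fi
if [[ $raw != "$ansi" ]]; then
    echo "raw differs from ansi"
fi

# %q makes the difference visible without sending escapes to the terminal.
printf '%q\n' "$raw" "$cooked"
```

The exact text that %q prints can vary between bash versions, so compare the two lines with each other rather than against a fixed string.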

Bash Columns SED and BASH Commands without AWK?

I wrote 2 different scripts but I am stuck on the same problem.
I am building a table from a file ($2) that I get as an argument, and $1 is the number of columns. It is a little hard to explain, so I will show you the input and output.
The problem is that I don't know how to save each column in a separate variable so that I can build it into my HTML code later:
#printf <TR><TD>$...</TD><TD>$...</TD><TD>$..</TD></TR><TD>$...
The input looks like this:
Name\tSize\tType\tprobe
bla\t4711\tfile\t888888888
abcde\t4096\tdirectory\t5555
eeeee\t333333\tblock\t6666
aaaaaa\t111111\tpackage\t7777
sssss\t44444\tfile\t8888
bbbbb\t22222\tfolder\t9999
Code :
c=1
column=$1
file=$2
echo "$( < $file)"| while read Line ; do
Name=$(sed "s/\\\t/ /g" $file | cut -d' ' -f$c,-$column)
printf "$Name \n"
#let c=c+1
#printf "<TR><TD>$Name</TD><TD>$Size</TD><TD>$Type</TD></TR>\n"
exit 0
done
Output:
Name Size Type probe
bla 4711 file 888888888
abcde 4096 directory 5555
eeeee 333333 block 6666
aaaaaa 111111 package 7777
sssss 44444 file 8888
bbbbb 22222 folder 9999
This is a tailor-made job for awk. See this script:
awk -F'\t' '{printf "<tr>";for(i=1;i<=NF;i++) printf "<td>%s</td>", $i;print "</tr>"}' input
<tr><td>bla</td><td>4711</td><td>file</td><td>888888888</td></tr>
<tr><td>abcde</td><td>4096</td><td>directory</td><td>5555</td></tr>
<tr><td>eeeee</td><td>333333</td><td>block</td><td>6666</td></tr>
<tr><td>aaaaaa</td><td>111111</td><td>package</td><td>7777</td></tr>
<tr><td>sssss</td><td>44444</td><td>file</td><td>8888</td></tr>
<tr><td>bbbbb</td><td>22222</td><td>folder</td><td>9999</td></tr>
In bash:
celltype=th
while IFS=$'\t' read -a columns; do
rowcontents=$( printf "<$celltype>%s</$celltype>" "${columns[@]}" )
printf '<tr>%s</tr>\n' "$rowcontents"
celltype=td
done < <( sed $'s/\\\\t/\t/g' "$2")
Some explanations:
IFS=$'\t' read -a columns reads a line from standard input, using only the tab character to separate fields, and putting each field into a separate element of the array columns. We change IFS so that other whitespace, which could occur in a field, is not treated as a field delimiter.
On the first line read from standard input, <th> elements will be output by the printf line. After resetting the value of celltype at the end of the loop body, all subsequent rows will consist of <td> elements.
When setting the value of rowcontents, take advantage of the fact that printf reuses its format string as many times as necessary to consume all the arguments.
Input is via process substitution from the sed command, which requires a crazy amount of quoting. First, the entire argument is quoted with $'...', which tells bash to replace escaped characters. bash converts this to the literal string s/\\t/^T/g, where I am using ^T to represent a literal ASCII 09 tab character. When sed sees this argument, it performs its own escape replacement, so the search text is a literal backslash followed by a literal t, to be replaced by a literal tab character.
The first argument, the column count, is unnecessary and is ignored.
Normally, you avoid making the while loop part of a pipeline because you set parameters in the loop that you want to use later. Here, all the variables are truly local to the while loop, so you could avoid the process substitution and use a pipeline if you wish:
sed $'s/\\\\t/\t/g' "$2" | while IFS=$'\t' read -a columns; do
...
done
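As a self-contained way to try the loop, here is a minimal sketch that feeds it two lines containing real tab characters. The function name tsv_to_rows is hypothetical, and the cell type is interpolated into the format string so that each column consumes exactly one %s:

```shell
#!/bin/bash
# tsv_to_rows: hypothetical helper that reads tab-separated lines on stdin
# and prints one HTML table row per line (header cells for the first line).
tsv_to_rows() {
    local celltype=th columns
    while IFS=$'\t' read -r -a columns; do
        printf '<tr>'
        # The format is reused once per array element.
        printf "<$celltype>%s</$celltype>" "${columns[@]}"
        printf '</tr>\n'
        celltype=td
    done
}

rows=$(printf 'Name\tSize\nbla\t4711\n' | tsv_to_rows)
printf '%s\n' "$rows"
```

Note that tab counts as IFS whitespace, so consecutive tabs (empty cells) are collapsed by read; that is fine for input like the sample above, but worth knowing for sparse tables.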

SPRINTF in shell scripting?

I have an auto-generated file each day that gets called by a shell script.
But, the problem I'm facing is that the auto-generated file has a form of:
FILE_MM_DD.dat
... where MM and DD are 2-digit month and day-of-the-month strings.
I did some research and banged it at on my own, but I don't know how to create these custom strings using only shell scripting.
To be clear, I am aware of the DATE function in Bash, but what I'm looking for is the equivalent of the SPRINTF function in C.
In Bash:
var=$(printf 'FILE_%s_%s.dat' "$val1" "$val2")
or, the equivalent, and closer to sprintf:
printf -v var 'FILE_%s_%s.dat' "$val1" "$val2"
If your variables contain decimal values with leading zeros, you can remove the leading zeros:
val1=008; val2=02
var=$(printf 'FILE_%d_%d.dat' "$((10#$val1))" "$((10#$val2))")
or
printf -v var 'FILE_%d_%d.dat' "$((10#$val1))" "$((10#$val2))"
The $((10#$val1)) coerces the value into base 10 so the %d in the format specification doesn't think that "08" is an invalid octal value.
If you're using date (at least for GNU date), you can omit the leading zeros like this:
date '+FILE_%-m_%-d.dat'
For completeness, if you want to add leading zeros, padded to a certain width:
val1=8; val2=2
printf -v var 'FILE_%04d_%06d.dat' "$val1" "$val2"
or with dynamic widths:
val1=8; val2=2
width1=4; width2=6
printf -v var 'FILE_%0*d_%0*d.dat' "$width1" "$val1" "$width2" "$val2"
Adding leading zeros is useful for creating values that sort easily and align neatly in columns.
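The octal pitfall and the base-10 coercion described above can be checked with a short sketch. It assumes bash, and the variable names and the sample value are only illustrative:

```shell
#!/bin/bash
val=08   # e.g. a zero-padded day of the month

# Passing "08" straight to %d fails: a leading 0 means octal,
# and 8 is not an octal digit.
if printf -v direct '%d' "$val" 2>/dev/null; then
    status=accepted
else
    status=rejected
fi

# Forcing base 10 first avoids the problem; %02d then restores the padding.
printf -v padded 'FILE_%02d.dat' "$((10#$val))"

printf '%s\n%s\n' "$status" "$padded"
```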
Why not use the printf program from coreutils?
$ printf "FILE_%02d_%02d.dat" 1 2
FILE_01_02.dat
Try:
sprintf() { local stdin; read -d '' -u 0 stdin; printf "$@" "$stdin"; }
Example:
$ echo bar | sprintf "foo %s"
foo bar
