Why does herestring add a newline [duplicate] - bash

It seems that a here-string adds a line break. Is there a convenient way of removing it?
$ string='test'
$ echo -n $string | md5sum
098f6bcd4621d373cade4e832627b4f6 -
$ echo $string | md5sum
d8e8fca2dc0f896fd7cb4cb0031ba249 -
$ md5sum <<<"$string"
d8e8fca2dc0f896fd7cb4cb0031ba249 -

Yes, you are right: <<< adds a trailing new line.
You can see it with:
$ cat - <<< "hello" | od -c
0000000 h e l l o \n
0000006
Let's compare this with the other approaches:
$ echo "hello" | od -c
0000000 h e l l o \n
0000006
$ echo -n "hello" | od -c
0000000 h e l l o
0000005
$ printf "hello" | od -c
0000000 h e l l o
0000005
So we have the table:
         | adds newline |
---------|--------------|
printf   | No           |
echo -n  | No           |
echo     | Yes          |
<<<      | Yes          |
From Why does a bash here-string add a trailing newline char?:
Most commands expect text input. In the unix world, a text file
consists of a sequence of lines, each ending in a
newline.
So in most cases a final newline is required. An especially common
case is to grab the output of a command with a command substitution,
process it in some way, then pass it to another command. The command
substitution strips final newlines; <<< puts one back.

fedorqui's helpful answer shows that and why here-strings (and also here-documents) invariably append a newline.
As for:
Is there a convenient way of removing it?
In Bash, use printf inside a process substitution as an "\n-less" alternative to a here-string:
... < <(printf %s ...)
Applied to your example:
$ md5sum < <(printf %s 'test')
098f6bcd4621d373cade4e832627b4f6
Alternatively, as user202729 suggests, simply use printf %s in the pipeline, which has the added advantage of not only using a more familiar feature but also making the command work in (more strictly) POSIX-compliant shells (in scripts targeting /bin/sh):
$ printf %s 'test' | md5sum
098f6bcd4621d373cade4e832627b4f6

A "here doc" adds a newline:
$ string="hello test"
$ cat <<_test_ | xxd
> $string
> _test_
0000000: 6865 6c6c 6f20 7465 7374 0a hello test.
A "here string" does too:
$ cat <<<"$string" | xxd
0000000: 6865 6c6c 6f20 7465 7374 0a hello test.
Probably the easiest way to get a string that does not end in a newline is printf:
$ printf '%s' "$string" | xxd
0000000: 6865 6c6c 6f20 7465 7374 hello test
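As a quick cross-check of the comparisons above, here is a minimal sketch that counts the bytes each approach feeds to a command. The variable names herestring_bytes and printf_bytes are only illustrative, and the here-string line assumes bash.

```bash
#!/usr/bin/env bash
# Count the bytes that reach the command's standard input.
string='test'

# $(( ... )) strips the padding that some wc implementations print.
herestring_bytes=$(( $(wc -c <<<"$string") ))
printf_bytes=$(( $(printf '%s' "$string" | wc -c) ))

echo "here-string: $herestring_bytes bytes"   # 5: "test" plus the appended newline
echo "printf %s:   $printf_bytes bytes"       # 4: just "test"
```

The one-byte difference is the newline that <<< appends.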

Related

How does grep handle DOS end of line?

I have a Windows text file which contains a line (with ending CRLF)
aline
The following is several commands' output:
[root@panel ~]# grep aline file.txt
aline
[root@panel ~]# grep aline$'\r' file.txt
[root@panel ~]# grep aline$'\r'$'\n' file.txt
[root@panel ~]# grep aline$'\n' file.txt
aline
The first command's output is normal. I'm curious about the second and the third output. Why is it an empty line? And the last output, I think it can not find the string but it actually finds it, why? The commands are run on CentOS/bash.
In this case grep really does match the string "aline\r", but you don't see it because it is overwritten by the ANSI escape sequences that grep prints for color. Pipe the output to od -c and you'll see:
$ grep aline file.txt
aline
$ grep aline$'\r' file.txt
$ grep aline$'\r' --color=never file.txt
aline
$ grep aline$'\r' --color=never file.txt | od -c
0000000 a l i n e \r \n
0000007
$ grep aline$'\r' --color=always file.txt | od -c
0000000 033 [ 0 1 ; 3 1 m 033 [ K a l i n e
0000020 \r 033 [ m 033 [ K \n
0000030
With --color=never you can see the output string because grep doesn't print any color sequences: \r simply moves the cursor back to the start of the line and then a newline is printed, so nothing is overwritten. By default, however, grep checks whether it is writing to a terminal or to a pipe, and on a terminal it prints the matched string in color if supported. It seems that resetting the color and then printing \n clears the rest of the line: the 033 [ K that follows the \r in the od output above is the "erase to end of line" sequence, and at that point the cursor is already back at column 0.
To match \n you can use the -z option to make null bytes the line separator
$ grep -z aline$'\r'$'\n' --color=never file.txt
aline
$ grep -z aline$'\r'$'\n' --color=never file.txt | od -c
0000000 a l i n e \r \n \0
0000010
$ grep -z aline$'\r'$'\n' --color=always file.txt | od -c
0000000 033 [ 0 1 ; 3 1 m 033 [ K a l i n e
0000020 \r 033 [ m 033 [ K \n \0
0000031
Your last command, grep aline$'\n' file.txt, works because \n is simply a word separator in bash, so the command is just the same as grep aline file.txt. Exactly the same thing happens in the third command, grep aline$'\r'$'\n' file.txt. To pass a newline you must quote the argument to prevent word splitting:
$ echo "aline" | grep -z "aline$(echo $'\n')"
aline
To demonstrate the effect of the quote with the 3rd line I added another line to the file
$ cat file.txt
aline
another line
$ grep -z "aline$(echo $'\n')" file.txt | od -c
0000000 a l i n e \r \n a n o t h e r l
0000020 i n e \n \0
0000025
$ grep -z "aline$(echo $'\n')" file.txt
aline
another line
$
If the input is not well-formed, the behavior is undefined.
In practice, some versions of GNU grep use CR for internal purposes, so attempting to match it does not work at all, or produces really bizarre results.
For not entirely different reasons, passing in a literal newline as part of the regular expression could have some odd interpretations, including, but not limited to, interpreting the argument as two separate patterns. (Look at how grep -F reads from a file, and imagine that at least some implementations use the same logic to parse the command line.)
In the grand scheme of things, the sane solution is to fix the input so it's a valid text file before attempting to run Unix line-oriented tools on it.
For quick and dirty solutions, some tools have well-defined semantics for random binary input. Perl is a model citizen in this respect.
bash$ perl -ne 'print if /aline\r$/' <<<$'aline\r'
aline
Awk also tends to work amicably, though there are several implementations, so the risk that somebody somewhere has a version which doesn't behave identically to AT&T Awk is higher.
Maybe notice also how \r is the last character before the end of the line (the DOS line ending is the sequence CR LF, where LF is the standard Unix line terminator for text files).
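The "fix the input first" advice above can be sketched as follows. This is only an illustration: the printf call stands in for the real file.txt, and tr -d '\r' removes every CR, not only those at line ends, which is an assumption about what the file contains.

```bash
#!/usr/bin/env bash
# Simulate a one-line DOS file, strip the carriage returns, then match the whole line.
match=$(printf 'aline\r\n' | tr -d '\r' | grep -x 'aline')

# Dump the result to confirm that no \r is left in the matched line.
printf '%s\n' "$match" | od -c
```

Once the CR is gone, ordinary anchors such as -x or $ behave as expected, and no color sequence can hide the output.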
At least for me phuclv's answer doesn't completely cover the last case, i.e. grep aline$'\n' file.txt.
Your mileage may vary depending on which shell and which version and implementation of grep you are using, but for me grep -z "aline$(echo $'\n')" and grep -z aline$'\n' both just match the same pattern as grep -z aline.
This becomes more apparent if the -o switch is used, so that grep outputs only the matched string and not the entire line (which is the entire file for most text files when the -z option is used).
If you use the same file.txt as in phuclv's second example:
$ cat file.txt
aline
another line
$ grep -z "aline$(echo $'\n')" file.txt | od -c
0000000 a l i n e \r \n a n o t h e r l
0000020 i n e \n \0
0000025
$ grep -z -o "aline$(echo $'\n')" file.txt | od -c
0000000 a l i n e \0
0000006
$ grep -z -o aline$'\n' file.txt | od -c
0000000 a l i n e \0
0000006
$ grep -z -o aline file.txt | od -c
0000000 a l i n e \0
0000006
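One reason the "aline$(echo $'\n')" form cannot differ from plain aline is that command substitution strips trailing newlines before grep ever sees the argument. A small sketch to check what each spelling really produces (the variable names with_subst and with_quote are only illustrative, and $'...' assumes bash):

```bash
#!/usr/bin/env bash
with_subst="aline$(echo $'\n')"   # command substitution drops the trailing newline
with_quote=aline$'\n'             # ANSI-C quoting keeps it

echo "${#with_subst}"   # 5: just "aline"
echo "${#with_quote}"   # 6: "aline" plus the newline

# Byte-level view of both arguments.
printf '%s' "$with_subst" | od -c
printf '%s' "$with_quote" | od -c
```

In the second case the newline does reach grep; how grep then interprets it (for example as a separator between two patterns) is a separate matter and may depend on the implementation.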
To actually match a \n as part of the pattern I had to use the -P switch to turn on "Perl-compatible regular expressions":
$ grep -z -o -P 'aline\r\n' file.txt | od -c
0000000 a l i n e \r \n \0
0000010
$ grep -z -o -P 'aline\r\nanother' file.txt | od -c
0000000 a l i n e \r \n a n o t h e r \0
0000017
For reference:
grep --version|head -n1
grep (GNU grep) 3.1
bash --version|head -n1
GNU bash, version 4.4.20(1)-release (x86_64-pc-linux-gnu)

Convert 32/64 bit number to 4/8 character string

Given that:
$ printf "love" | od -td4 -A n
1702260588
$ printf "lovehate" | od -td8 -A n
7310575196135911276
Is there a concise (ideally without loops, awk, sed, perl or python) way in Bash to convert the numbers 1702260588 and 7310575196135911276 to love and lovehate respectively?
Here's what I came up with:
alpha() {
(($1)) && printf "\x"$(printf "%02x" $(($1%256)))$(alpha $(($1/256)))"\n"
}
alpha 1702260588
alpha 7310575196135911276
Output:
love
lovehate
Edit: Here's an answer using the xxd utility:
# The echo is only necessary to get a newline at the end.
echo $(printf "%x" 1702260588 | xxd -r -p | rev)
echo $(printf "%x" 7310575196135911276 | xxd -r -p | rev)
Output:
love
lovehate
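To see why the xxd answer needs rev, it may help to look at the number in hex. This is a minimal sketch, assuming a little-endian machine (which is what the od output in the question implies); the names num, hex and low are only illustrative.

```bash
#!/usr/bin/env bash
num=1702260588

# The hex digits list the bytes from most to least significant.
hex=$(printf '%x' "$num")
echo "$hex"    # 65766f6c, i.e. the bytes for "e" "v" "o" "l"

# The least significant byte is the first character of the original string.
low=$(printf '%02x' $(( num & 0xff )))
echo "$low"    # 6c, the byte for "l"
```

xxd -r -p turns 65766f6c into "evol", and reversing that gives "love".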

Ascii/Hex convert in bash

I'm now doing it this way:
[root@~]# echo Aa|hexdump -v
0000000 6141 000a
0000003
[root@~]# echo -e "\x41\x41\x41\x41"
AAAA
But it's not behaving exactly as I wanted: the hex form of Aa should be 4161, but the output is 6141 000a, which doesn't seem to make sense.
And when converting hex to ASCII, is there another utility so that I don't need the \x prefix?
The reason is that hexdump by default prints 16-bit integers, not bytes. If your system has them, hd (or hexdump -C) or xxd will provide less surprising output; if not, od -t x1 is a POSIX-standard way to get byte-by-byte hex output. You can use od -t x1c to show both the byte hex values and the corresponding letters.
If you have xxd (which ships with vim), you can use xxd -r to convert back from hex (from the same format xxd produces). If you just have plain hex (just the '4161', which is produced by xxd -p) you can use xxd -r -p to convert back.
For the first part, try
echo Aa | od -t x1
It prints byte-by-byte
$ echo Aa | od -t x1
0000000 41 61 0a
0000003
The 0a is the implicit newline that echo produces.
Use echo -n or printf instead.
$ printf Aa | od -t x1
0000000 41 61
0000002
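Building on od -t x1 and printf, a round trip can be sketched with two small helpers. The function names to_hex and from_hex are hypothetical, from_hex relies on bash (substring expansion and \x escapes in printf), and it assumes an even-length string of valid hex digits.

```bash
#!/usr/bin/env bash
# ASCII -> hex: dump bytes without offsets (-An), without squeezing (-v),
# one byte per field (-tx1), then drop the spaces and newlines.
to_hex() {
    printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n'
}

# Hex -> ASCII: walk the string two digits at a time and let printf
# expand each \xHH escape.
from_hex() {
    local hex=$1 i
    for (( i = 0; i < ${#hex}; i += 2 )); do
        printf "\\x${hex:i:2}"
    done
}

to_hex 'Aa'; echo      # 4161
from_hex '4161'; echo  # Aa
```

Using printf '%s' for the input avoids the trailing 0a that echo would add.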
For a single-line solution:
echo "Hello World" | xxd -ps -c 200 | tr -d '\n'
It will print:
48656c6c6f20576f726c640a
or for files:
cat /path/to/file | xxd -ps -c 200 | tr -d '\n'
For reverse operation:
echo '48656c6c6f20576f726c640a' | xxd -ps -r
It will print:
Hello World
$> printf "%x%x\n" "'A" "'a"
4161
With bash :
a=abcdefghij
for ((i=0;i<${#a};i++));do printf %02X \'${a:$i:1};done
6162636465666768696A
I use:
> echo Aa | tr -d '\n' | xxd -p
4161
> echo 414161 | tr -d '\n' | xxd -r -p
AAa
The tr -d '\n' will trim any possible newlines in your input
I don't know how crazy it looks, but it does the job really well:
ascii2hex(){ a="$*";s=0000000;printf "$a" | hexdump | grep "^$s"| sed s/' '//g| sed s/^$s//;}
Created this when I was trying to see my name in hex ;)
Use it however you like :)
Text2Conv="Aa"
for letter in $(echo "$Text2Conv" | sed "s/\(.\)/'\1 /g");do printf '%x' "$letter";done
4161
The trick is using sed to parse Text2Conv into a format we can then separate and loop over using for.
Finally got the correct thing
echo "Hello, world!" | tr -d '\n' | xxd -ps -c 200
Here is a little script I wrote to convert ASCII to hex. Hope it helps:
echo '0x'"`echo 'ASCII INPUT GOES HERE' | hexdump -vC | awk 'BEGIN {IFS="\t"} {$1=""; print }' | awk '{sub(/\|.*/,"")}1' | tr -d '\n' | tr -d ' '`" | rev | cut -c 3- | rev
SteinAir's answer above was helpful to me -- thank you! And below is a way it inspired, to convert hex strings to ascii:
for h in $(echo "4161" | sed "s/\(..\)/\1 /g"); do printf `echo "\x$h"`;done
Aa
echo -n Aa | hexdump -e '/1 "%02x"'; echo
According to http://mylinuxbook.com/hexdump/ you can use hexdump's format parameter:
echo Aa | hexdump -C -e '/1 "%02X"'
will return 4161
To add an extra linefeed at the end, append another formatter.
BUT: the format given above collapses repeated characters into a * line (hexdump's -v option disables this squeezing):
$ printf "Hello" | hexdump -e '/1 "%02X"'
48656C*
6F
instead of
48656C6C6F
jcomeau@aspire:~$ echo -n The quick brown fox jumps over the lazy dog | python -c "print raw_input().encode('hex'),"
54686520717569636b2062726f776e20666f78206a756d7073206f76657220746865206c617a7920646f67
jcomeau@aspire:~$ echo -n The quick brown fox jumps over the lazy dog | python -c "print raw_input().encode('hex')," | python -c "print raw_input().decode('hex'),"
The quick brown fox jumps over the lazy dog
It could be done with Python 3 as well, but differently, and I'm a lazy dog.
echo appends a newline (line feed, 0x0A) at the end.
Use
echo -n
to remove the extra 0x0A.
Also, hexdump does not work byte by byte by default. This is why it shows you the bytes in a weird endianness and why it shows you an extra 0x00.

reverse the order of characters in a string

In string "12345", out string "54321". Preferably without third party tools and regex.
I know you said "without third-party tools", but sometimes a tool is just too obviously the right one, plus it's installed on most Linux systems by default:
[madhatta#risby tmp]$ echo 12345 | rev
54321
See rev's man page for more.
Simple:
var="12345"
copy=${var}
len=${#copy}
for((i=$len-1;i>=0;i--)); do rev="$rev${copy:$i:1}"; done
echo "var: $var, rev: $rev"
Output:
$ bash rev
var: 12345, rev: 54321
Presume that a variable 'var' has the value '123'
var="123"
Reverse the string and store in a new variable 'rav':
rav=$(echo $var | rev)
You'll see the 'rav' has the value of '321' using echo.
echo $rav
rev | tail -r (BSD) or rev | tac (GNU) also reverse lines:
$ rev <<< $'12\n34' | tail -r
43
21
$ rev <<< $'12\n34' | gtac
43
21
If LC_CTYPE is C, rev reverses the bytes of multibyte characters:
$ LC_CTYPE=C rev <<< あの
��め�
$ export LC_ALL=C; LC_ALL=en_US.UTF-8 rev <<< あの
のあ
A bash solution improving on @osdyng's answer (my edit was not accepted):
var="12345" rev=""
for(( i=0 ; i<${#var} ; i++ )); do rev="${var:i:1}$rev"; done
echo "var: $var, rev: $rev"
Or an even simpler (bash) loop:
var=$1 len="${#var}" i=0 rev=""
while (( i<len )); do rev="${var:i++:1}$rev"; done
echo "var: $var, rev: $rev"
A POSIX solution:
var="12345" rev="" i=1
while [ "$i" -le "${#var}" ]
do rev="$(echo "$var" | awk -v i="$i" '{print(substr($0,i,1))}')$rev"
: $(( i+=1 ))
done
echo "var: $var, rev: $rev"
Note: This works on multi byte strings. Cut solutions will work only in ASCII (1 byte) strings.
Some simple methods of reversing a string
echo '!!!esreveR si sihT' | grep -o . | tac | tr -d '\n' ; echo
echo '!!!esreveR si sihT' | fold -w 1 | tac | tr -d '\n' ; echo
Convert to hex values then reverse
echo '!!!esreveR si sihT' | xxd -p | grep -o .. | tac | xxd -r -p ; echo
echo '!!!esreveR si sihT' | xxd -p | fold -w 2 | tac | xxd -r -p ; echo
This reverses the string "in place":
a=12345
len=${#a}
for ((i=1;i<len;i++)); do a=$a${a: -i*2:1}; done; a=${a:len-1}
echo $a
or the third line could be:
for ((i=0;i<len;i++)); do a=${a:i*2:1}$a; done; a=${a:0:len}
or
for ((i=1;i<len;i++)); do a=${a:0:len-i-1}${a: -i:i+1}${a:len-i-1:1}; done
For those without rev (recommended), there is the following simple awk solution that splits fields on the null string (every character is a separate field) and prints in reverse:
awk -F '' '{ for(i=NF; i; i--) printf("%c", $i); print "" }'
The above awk code is POSIX compliant. As a compliant awk implementation is guaranteed to be on every POSIX compliant OS, the solution should thus not be thought of as "3rd-party." This code will likely be more concise and understandable than a pure POSIX sh (or bash) solution.
(; I do not know if you consider the null string to -F a regex... ;)
If var=12345:
# bash
for((i=0;i<${#var};i++)); do rev="$rev${var:~i:1}"; done
# sh
c=$var; while [ "$c" ]; do rev=$rev${c#"${c%?}"}; c=${c%?}; done
echo "var: $var, rev: $rev"
Run it:
$ rev
var: 12345, rev: 54321
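The sh variant above leans on two parameter expansions: ${c%?} drops the last character, and ${c#"${c%?}"} strips that shortened prefix so that only the last character remains. A spelled-out sketch of the same idea, wrapped in a hypothetical function named reverse_string:

```bash
#!/bin/sh
# Reverse a string using only POSIX parameter expansion.
reverse_string() {
    c=$1
    rev=
    while [ -n "$c" ]; do
        last=${c#"${c%?}"}   # remove everything except the final character
        rev=$rev$last        # append it to the result
        c=${c%?}             # drop the final character from the remainder
    done
    printf '%s\n' "$rev"
}

reverse_string 12345   # 54321
```

The inner quotes in ${c#"${c%?}"} make the prefix match literally, so characters such as * or ? in the input are not treated as patterns.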
This can of course be shortened, but it should be simple to understand: the final print adds the newline.
echo 12345 | awk '{for (i = length($0); i > 0; i--) {printf("%s", substr($0, i, 1));} print "";}'
Nobody appears to have posted a sed solution, so here's one that works in non-GNU sed (so I wouldn't consider it "3rd party"). It does capture single characters using the regex ., but that's the only regex.
In two stages:
$ echo 123456 | sed $'s/./&\\\n/g' | sed -ne $'x;H;${x;s/\\n//g;p;}'
654321
This uses bash's ANSI-C quoting ($'...') to include newlines in the scripts (since the question is tagged bash). It works by first separating the input string into one line per character, and then by inserting each character into the beginning of the hold buffer.
x swaps the hold space and the pattern space, and
H appends the (current) pattern space to the hold space.
So for every character, we place that character into the hold space, then append the old hold space to it, thus reversing the input. The final command removes the newlines in order to reconstruct the original string.
This should work for any single string, but it will concatenate multi-line input into a single output string.
Here is another simpler awk solution:
awk 'BEGIN{FS=""} {for (i=NF; i>0; i--) s=s $i; print s}' <<< '123456'
654321
Try Perl:
echo 12345 | perl -nle 'print scalar reverse $_'
Source: Perl one-liners
read word
# Pass each character as data to %s rather than as the printf format string.
reve=$(echo "$word" | awk '{for(i=length($0); i>0;i--) printf("%s", substr($0,i,1));}')
echo "$reve"

Resources