Append to a string in bash

I'm trying to get a download URL using curl and awk and want to append something to that URL afterwards.
Here is a snippet of my code:
IMAGE=$(curl -I -s https://downloads.raspberrypi.org/raspbian_lite_latest | awk '/Location/ {print $2}')
CHECKSUM="$IMAGE.sha256"
echo $IMAGE
echo $CHECKSUM
What I'm getting is that it is somehow replacing parts at the beginning.
https://downloads.raspberrypi.org/raspbian_lite/images/raspbian_lite-2018-11-15/2018-11-13-raspbian-stretch-lite.zip
.sha256/downloads.raspberrypi.org/raspbian_lite/images/raspbian_lite-2018-11-15/2018-11-13-raspbian-stretch-lite.zip
I'm a bit helpless, because the following works as expected:
A="https""://abc.org/a_b/a.zip" # looks weird, but full URLs are not allowed here
B="$A.sha256"
echo $B
What am I doing wrong?

When you hexdump your string, you see that it uses Windows line endings (with a carriage return):
echo $IMAGE | hexdump -C
00000000 68 74 74 70 73 3a 2f 2f 64 6f 77 6e 6c 6f 61 64 |https://download|
00000010 73 2e 72 61 73 70 62 65 72 72 79 70 69 2e 6f 72 |s.raspberrypi.or|
00000020 67 2f 72 61 73 70 62 69 61 6e 5f 6c 69 74 65 2f |g/raspbian_lite/|
00000030 69 6d 61 67 65 73 2f 72 61 73 70 62 69 61 6e 5f |images/raspbian_|
00000040 6c 69 74 65 2d 32 30 31 38 2d 31 31 2d 31 35 2f |lite-2018-11-15/|
00000050 32 30 31 38 2d 31 31 2d 31 33 2d 72 61 73 70 62 |2018-11-13-raspb|
00000060 69 61 6e 2d 73 74 72 65 74 63 68 2d 6c 69 74 65 |ian-stretch-lite|
00000070 2e 7a 69 70 0d 0a |.zip..|
00000076
To fix that, use
IMAGE=$(curl -I -s https://downloads.raspberrypi.org/raspbian_lite_latest | awk '/Location/ {print $2}' | tr -d "\r")
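To try the fix without hitting the network, here is a minimal sketch. The fake_headers function and the example.org URL are made-up stand-ins for the real curl -I output; only the awk and tr stages come from the answer above.

```shell
# Hypothetical stand-in for `curl -I -s ...`: HTTP headers end in CRLF.
fake_headers() {
  printf 'HTTP/1.1 302 Found\r\nLocation: https://example.org/a.zip\r\n\r\n'
}

# Same pipeline as above, with tr deleting every carriage return.
image=$(fake_headers | awk '/Location/ {print $2}' | tr -d '\r')
checksum="$image.sha256"
printf '%s\n' "$checksum"   # prints https://example.org/a.zip.sha256
```

Note that tr -d '\r' removes all carriage returns, not just the trailing one, which is fine for a URL.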

The problem, apparently, is that your $IMAGE ends in a trailing '\r' (carriage return). So you've actually appended ".sha256" as you expected, giving "something\r.sha256", which when echoed displays: something, cursor back to the beginning of the line, .sha256. Long story short, strip that '\r'. E.g.:
IMAGE=$(curl -I -s https://downloads.raspberrypi.org/raspbian_lite_latest | awk '/Location/ {sub(/\r$/, "", $2); print $2}')
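Because the carriage return is invisible on a terminal, it can help to test for it explicitly. A small sketch, where has_cr is a helper name invented for this example:

```shell
# Command substitution strips trailing newlines only, so the CR survives.
cr=$(printf '\r')

# has_cr is a hypothetical helper: succeeds if its argument contains a CR.
has_cr() {
  case $1 in
    *"$cr"*) return 0 ;;
    *)       return 1 ;;
  esac
}

if has_cr "https://example.org/a.zip$cr"; then
  echo "value contains a carriage return"
fi
```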

Since you are using bash, you can use substring replacement, i.e. remove the \r from the IMAGE variable:
$ CHECKSUM="${IMAGE/$'\r'/}.sha256"
$ echo $CHECKSUM
https://downloads.raspberrypi.org/raspbian_lite/images/raspbian_lite-2018-11-15/2018-11-13-raspbian-stretch-lite.zip.sha256
or prepare for it in the awk part by setting the record separator RS:
... | awk -v RS="\r?\n" '/Location/ {print $2}'
Tested with gawk, mawk and original-awk. Surprisingly busybox awk removed it by itself:
$ echo -e \\r | busybox awk '{print $1}' | hexdump -C
00000000 0a |.|
but for example:
$ echo -e \\r | gawk '{print $1}' | hexdump -C
00000000 0d 0a |..|
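If you only want to drop a single trailing carriage return and leave the rest of the value alone, suffix removal is another option. This is a sketch rather than part of the tested commands above; the URL is a placeholder and the CR is appended by hand to simulate the header value.

```shell
cr=$(printf '\r')

# Simulated value as it would come back from the CRLF-terminated header.
IMAGE="https://example.org/a.zip$cr"

# ${var%pattern} removes the shortest matching suffix, here one trailing CR.
IMAGE=${IMAGE%"$cr"}
CHECKSUM="$IMAGE.sha256"
printf '%s\n' "$CHECKSUM"   # prints https://example.org/a.zip.sha256
```

Unlike ${IMAGE/$'\r'/}, this form does not need bash and does nothing if the value has no trailing CR.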

Related

Case doesn't work when examining tail output

I can't figure out why my case statement doesn't work when examining tail output.
tail -F -n1 /var/log/pihole.log |
while read input; do
echo "$input" | hexdump -C # just to physically compare the output
case $input in
cached|blacklisted|blocked)
echo "We have a match!";;
*)
echo "No match!"
esac
done
This always returns No match!, even if the strings are in the $input.
:~ $ ./pihole_test.sh
00000000 4a 61 6e 20 31 20 31 31 3a 35 35 3a 35 38 20 64 |Jan 1 11:55:58 d|
00000010 6e 73 6d 61 73 71 5b 36 39 36 5d 3a 20 65 78 61 |nsmasq[696]: exa|
00000020 63 74 6c 79 20 62 6c 61 63 6b 6c 69 73 74 65 64 |ctly blacklisted|
00000030 20 70 6c 61 79 2e 67 6f 6f 67 6c 65 2e 63 6f 6d | play.google.com|
00000040 20 69 73 20 30 2e 30 2e 30 2e 30 0a | is 0.0.0.0.|
0000004c
No match!

Replace
cached|blacklisted|blocked)
with
*cached*|*blacklisted*|*blocked*)
to match substrings.
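A case pattern has to match the whole word, which is why the bare keywords never matched a full log line. A minimal sketch of the corrected patterns, wrapped in a function (classify is a name made up for this example) so it can be tried without tail:

```shell
# classify is a hypothetical helper: prints "match" when the line contains
# one of the keywords anywhere, "no match" otherwise.
classify() {
  case $1 in
    *cached*|*blacklisted*|*blocked*) echo "match" ;;
    *)                                echo "no match" ;;
  esac
}

classify "Jan 1 11:55:58 dnsmasq[696]: exactly blacklisted play.google.com is 0.0.0.0"
classify "blacklisted"
```

Both calls print match: the first because of the surrounding *, the second because * also matches the empty string.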

How can I remove non-breaking spaces from a text file in bash?

I have a csv file with text and numbers.
Numbers bigger than 1000 are formatted like this: 1 000,
so there appears to be a space as the thousands separator, but it is not a regular space. I tried to remove it with sed, which worked where there was a real space, but not in this format.
It is also not a TAB; I removed all the TABs with "expand -t 1".
The following is a line that demonstrates the issue:
x17_Provident_GDN_REMARKETING_provident.hu_listák;Display_Hálózat;Szeged;2021-03-09;Kedd;Mobil;HUF;1 736;9;130.83;0.00
In penultimate row, in column 8: 1 736
is the problem.
And running this: grep -E -m 1 -e '[;]1[^;]+736[;]' <yourfile.csv | hexdump -C
gives:
00000000 78 31 37 5f 50 72 6f 76 69 64 65 6e 74 5f 47 44 |x17_Provident_GD|
00000010 4e 5f 52 45 4d 41 52 4b 45 54 49 4e 47 5f 70 72 |N_REMARKETING_pr|
00000020 6f 76 69 64 65 6e 74 2e 68 75 5f 6c 69 73 74 c3 |ovident.hu_list.|
00000030 a1 6b 3b 44 69 73 70 6c 61 79 5f 48 c3 a1 6c c3 |.k;Display_H..l.|
00000040 b3 7a 61 74 3b 53 7a 65 67 65 64 3b 32 30 32 31 |.zat;Szeged;2021|
00000050 2d 30 33 2d 30 39 3b 4b 65 64 64 3b 4d 6f 62 69 |-03-09;Kedd;Mobi|
00000060 6c 3b 48 55 46 3b 31 c2 a0 37 33 36 3b 39 3b 31 |l;HUF;1..736;9;1|
00000070 33 30 2e 38 33 3b 30 2e 30 30 0a |30.83;0.00.|
0000007b

It's a 2 byte, UTF-8 encoded non breaking space - c2 a0.
You can use perl to safely remove it.
perl -pe 's/\xc2\xa0//g' dirty.csv > clean.csv

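Once we knew it was a no-break space, I simply removed it with sed on a Mac, typing the character between the slashes (a no-break space, not a regular space) with this key combination:
Option+Space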
cat test4.csv | sed 's/ //g'

Similar to perl, you can use GNU sed with LC_ALL=C:
LC_ALL=C sed 's/\xc2\xa0//g'
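The \x escapes are a GNU sed extension. If your sed does not support them, one possible workaround is to build the two bytes with printf and put them into the sed expression literally. A sketch, with a shortened sample line standing in for the real CSV row:

```shell
# c2 a0 in octal is \302 \240; printf octal escapes are portable.
nbsp=$(printf '\302\240')

# Shortened, made-up sample modelled on the line in the question.
dirty="HUF;1${nbsp}736;9"

# LC_ALL=C makes sed treat the input as plain bytes.
clean=$(printf '%s\n' "$dirty" | LC_ALL=C sed "s/$nbsp//g")
printf '%s\n' "$clean"   # prints HUF;1736;9
```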

bash substitution after glob not working?

I've encountered strange behaviour with bash string substitution.
I expected the same substitution on $r1 and $var to yield exactly the same results.
Both strings seem to have the same value.
But that is not the case, and I can't understand what I am missing.
Maybe it is because of the glob? I just don't know. I am not a pure IT person, and maybe it's something that will be evident to you.
(A Repl.it link is at the bottom.)
mkdir -p T21805
touch T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
r1=T21805/*R1*
echo $r1;
echo ${r1%%_S1*z}
var=T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
echo ${var%%_S1*z}
echo $r1| hexdump -C
echo $var | hexdump -C
output :
echo $r1
T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
echo ${r1%%_S1*z}
T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
echo ${var%%_S1*z}
T21805/T21805_SI-GA-D8-BH25N7DSXY
echo $r1| hexdump -C
00000000 54 32 31 38 30 35 2f 54 32 31 38 30 35 5f 53 49 |T21805/T21805_SI|
00000010 2d 47 41 2d 44 38 2d 42 48 32 35 4e 37 44 53 58 |-GA-D8-BH25N7DSX|
00000020 59 5f 53 31 5f 4c 30 30 31 5f 52 31 5f 30 30 31 |Y_S1_L001_R1_001|
00000030 2e 66 61 73 74 71 2e 67 7a 0a |.fastq.gz.|
0000003a
echo $var | hexdump -C
00000000 54 32 31 38 30 35 2f 54 32 31 38 30 35 5f 53 49 |T21805/T21805_SI|
00000010 2d 47 41 2d 44 38 2d 42 48 32 35 4e 37 44 53 58 |-GA-D8-BH25N7DSX|
00000020 59 5f 53 31 5f 4c 30 30 31 5f 52 31 5f 30 30 31 |Y_S1_L001_R1_001|
00000030 2e 66 61 73 74 71 2e 67 7a 0a |.fastq.gz.|
0000003a
Repl.it
I am interested in understanding why this is not working; I can achieve my desired output using sed, for example.

Glob expansion doesn't happen at assignment time.
$ mkdir -p T21805
$ touch T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
$ touch T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_002.fastq.gz
$ r1=T21805/*R1*
$ printf '%s\n' "$r1"
T21805/*R1*
$ printf '%s\n' $r1
T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_002.fastq.gz
It happens after the unquoted r1 has been expanded. When you write ${r1%%_S1*z}, the value of r1 doesn't contain the string S1; only after ${r1} expands is there an S1 you could match against.
If you set an array, the assignment rules are different. The glob expands before the assignment, and so you can do your filtering on each element of the array.
$ r1=( T21805/*R1* )
$ printf '%s\n' "${r1[@]}"
T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_002.fastq.gz
$ printf '%s\n' "${r1[@]%%_S1*z}"
T21805/T21805_SI-GA-D8-BH25N7DSXY
T21805/T21805_SI-GA-D8-BH25N7DSXY
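If you don't need the array, a for loop is another way to get the glob expanded before the trimming happens, because each iteration sees one real filename. A sketch that runs in a throwaway directory (the mktemp directory is only there so the example does not touch your files):

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/T21805"
touch "$tmp/T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz"

# Scalar assignment keeps the pattern literally, so there is no _S1 to trim.
r1='T21805/*R1*'
literal=${r1%%_S1*z}
printf '%s\n' "$literal"    # prints T21805/*R1*

# In a for loop the glob is expanded first; f holds an actual filename.
prefixes=$(
  cd "$tmp" || exit 1
  for f in T21805/*R1*; do
    printf '%s\n' "${f%%_S1*z}"
  done
)
printf '%s\n' "$prefixes"   # prints T21805/T21805_SI-GA-D8-BH25N7DSXY

rm -rf "$tmp"
```

This also explains the output in the question: the unquoted echo ${r1%%_S1*z} leaves the pattern untouched and only then expands it to the full filename.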

I ran it after set -xv to see the contents of r1.
$ r1=T21805/*R1*
+ r1='T21805/*R1*'
$ var=T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
+ var=T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
The r1 in ${r1%%_S1*z} is T21805/*R1*.
r1 does not contain anything matching _S1*z.

Remove ^M at end of each line in shell script [duplicate]

I have this code, but it doesn't work.
line="20170425"
anycopia=${line:0:4}
mescopia=${line:4:2}
diacopia=${line:6:2}
echo $anycopia
echo $mescopia
echo $diacopia
DATE=$(date +%Y%m%d)
any=${DATE:0:4}
mes=${DATE:4:2}
dia=${DATE:6:2}
echo $any
echo $mes
echo $dia
if [ $anycopia == $any ]; then
echo "equals"
else
echo "not equals"
fi
Error:
syntax error near unexpected token fi
I've tried changing "then" but it doesn't matter, just like this:
if [ $anycopia == $any ]
then
echo "equals"
else
echo "not equals"
fi
And the same error keeps occurring.
PS: Other answers on Stack Overflow to the same question didn't work for me.
Edit:
I did this command:
hexdump -C script.sh
This is the output:
00000000 6c 69 6e 65 3d 22 32 30 31 37 30 34 32 35 22 0d |line="20170425".|
00000010 0a 61 6e 79 63 6f 70 69 61 3d 24 7b 6c 69 6e 65 |.anycopia=${line|
00000020 3a 30 3a 34 7d 0d 0a 6d 65 73 63 6f 70 69 61 3d |:0:4}..mescopia=|
00000030 24 7b 6c 69 6e 65 3a 34 3a 32 7d 0d 0a 64 69 61 |${line:4:2}..dia|
00000040 63 6f 70 69 61 3d 24 7b 6c 69 6e 65 3a 36 3a 32 |copia=${line:6:2|
00000050 7d 0d 0a 65 63 68 6f 20 24 61 6e 79 63 6f 70 69 |}..echo $anycopi|
00000060 61 0d 0a 65 63 68 6f 20 24 6d 65 73 63 6f 70 69 |a..echo $mescopi|
00000070 61 0d 0a 65 63 68 6f 20 24 64 69 61 63 6f 70 69 |a..echo $diacopi|
00000080 61 0d 0a 44 41 54 45 3d 24 28 64 61 74 65 20 2b |a..DATE=$(date +|
00000090 25 59 25 6d 25 64 29 0d 0a 61 6e 79 3d 24 7b 44 |%Y%m%d)..any=${D|
000000a0 41 54 45 3a 30 3a 34 7d 0d 0a 6d 65 73 3d 24 7b |ATE:0:4}..mes=${|
000000b0 44 41 54 45 3a 34 3a 32 7d 0d 0a 64 69 61 3d 24 |DATE:4:2}..dia=$|
000000c0 7b 44 41 54 45 3a 36 3a 32 7d 0d 0a 65 63 68 6f |{DATE:6:2}..echo|
000000d0 20 24 61 6e 79 0d 0a 65 63 68 6f 20 24 6d 65 73 | $any..echo $mes|
000000e0 0d 0a 65 63 68 6f 20 24 64 69 61 0d 0a 69 66 20 |..echo $dia..if |
000000f0 5b 20 24 61 6e 79 63 6f 70 69 61 20 3d 3d 20 24 |[ $anycopia == $|
00000100 61 6e 79 20 5d 3b 20 74 68 65 6e 0d 0a 20 20 20 |any ]; then.. |
00000110 20 65 63 68 6f 20 22 68 6f 6c 61 22 0d 0a 65 6c | echo "hola"..el|
00000120 73 65 0d 0a 20 20 20 20 65 63 68 6f 20 22 61 64 |se.. echo "ad|
00000130 65 75 22 0d 0a 66 69 0d 0a |eu"..fi..|
00000139
PPS: I'm running this with Bash on Ubuntu on Windows.
Edit2:
user@DESKTOP-UO9KRO4:/mnt/d$ cat -v script.sh
line="20170425"^M
anycopia=${line:0:4}^M
mescopia=${line:4:2}^M
diacopia=${line:6:2}^M
echo $anycopia^M
echo $mescopia^M
echo $diacopia^M
DATE=$(date +%Y%m%d)^M
any=${DATE:0:4}^M
mes=${DATE:4:2}^M
dia=${DATE:6:2}^M
echo $any^M
echo $mes^M
echo $dia^M
if [ $anycopia == $any ]; then^M
echo "hola"^M
else^M
echo "adeu"^M
fi^M

^M is a carriage return, and is commonly seen when files are copied from Windows. Run dos2unix to clean up those meta-characters.
dos2unix script.sh
Also, as safe coding practice:
Always double-quote your variables so they are not split when they contain spaces or shell metacharacters.
Define a proper shebang, i.e. the interpreter the script should run with (in most cases, if bash is available, #!/usr/bin/env bash or #!/bin/bash).
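If dos2unix is not installed, deleting the carriage returns with tr is one possible substitute. A sketch using a tiny made-up script written to a temporary file; the file contents are illustrative, not the script from the question:

```shell
crlf_script=$(mktemp)
clean_script=$(mktemp)

# Write a small script with Windows (CRLF) line endings.
printf 'if [ 1 = 1 ]; then\r\n  echo "equals"\r\nfi\r\n' > "$crlf_script"

# Delete every carriage return, leaving plain LF line endings.
tr -d '\r' < "$crlf_script" > "$clean_script"

output=$(sh "$clean_script")
printf '%s\n' "$output"   # prints equals

rm -f "$crlf_script" "$clean_script"
```

Running the CRLF version directly would fail in much the same way as in the question, because the shell reads then followed by a carriage return as a different word.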

New to awk and sed, How could I improve this? Multiple sed and awk commands

This is the script I've constructed
It takes a list of files according to the extension supplied as an argument.
It then removes everything before the pattern 00000000: in those files.
The pattern 00000000: is preceded by the string <pre>, it then removes those five first characters.
The script then removes the last three lines of the file
The script then outputs only the hexdump data of the file.
The script runs xxd to convert the hexdump to a file.jpg
if [[ $# -eq 0 ]] ; then
echo 'Run script as ./hexconv ext'
exit 0
fi
for file in *.$1
do
filename=$(basename $file)
extension="${filename##*.}"
filename="${filename%.*}"
sed -n '/00000000:/,$p' $file | sed '1s/^.....//' | head -n -3 | awk '{print $2" "$3" "$4" "$5" "$6" "$7" "$8" "$9" "$10" "$11" "$12" "$13" "$14" "$15" "$16" "$17}' | xxd -p -r > $filename.jpg
done
It works as I want it to, but I suspect there are ways to improve it; alas, I am a novice in the use of awk and sed.
Excerpt from file
<th>response-head:</th>
<td>HTTP/1.1 200 OK
Date: Sun, 15 Dec 2013 04:27:04 GMT
Server: PWS/8.0.18
X-Px: ms h0-s34.p6-lhr ( h0-s35.p6-lhr), ht-d h0-s35.p6-lhr.cdngp.net
Etag: "4556354-9fbf8-4e40387aadfc0"
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0, max-age=0
Accept-Ranges: bytes
Content-Length: 654328
Content-Type: image/jpeg
Last-Modified: Thu, 15 Aug 2013 21:55:19 GMT
Pragma: no-cache
</td>
</tr>
</table>
<hr/>
<pre>00000000: ff d8 ff e0 00 10 4a 46 49 46 00 01 01 01 00 48 ......JFIF.....H
00000010: 00 48 00 00 ff e1 00 18 45 78 69 66 00 00 49 49 .H......Exif..II
00000020: 2a 00 08 00 00 00 00 00 00 00 00 00 00 00 ff ed *...............
00000030: 00 48 50 68 74 73 68 70 20 33 2e 30 00 .HPhotoshop 3.0.
00000040: 38 42 49 4d 04 04 00 00 00 00 00 1c 01 5a 00 8BIM..........Z.
00000050: 03 1b 25 47 1c 02 00 00 02 00 02 00 38 42 49 4d ..%G........8BIM
00000060: 04 25 00 00 00 00 00 10 fc e1 89 c8 b7 c9 78 .%.............x
00000070: 34 62 34 07 58 77 eb ff e1 03 a5 68 74 74 70 /4b4.Xw.....http
00000080: 3a 6e 73 2e 61 64 62 65 2e 63 6d ://ns.adobe.com/
00000090: 78 61 70 31 2e 30 00 3c 78 70 61 63 6b xap/1.0/.<?xpack
000000a0: 65 74 20 62 65 67 69 6e 3d 22 ef bb bf 22 20 69 et begin="..." i
000000b0: 64 3d 22 57 35 4d 30 4d 70 43 65 68 69 48 7a 72 d="W5M0MpCehiHzr
000000c0: 65 53 7a 4e 54 63 7a 6b 63 39 64 22 3e 20 3c eSzNTczkc9d"?> <
000000d0: 78 3a 78 6d 70 6d 65 74 61 20 78 6d 6c 6e 73 3a x:xmpmeta xmlns:
000000e0: 78 3d 22 61 64 62 65 3a 6e 73 3a 6d 65 74 61 x="adobe:ns:meta
000000f0: 22 20 78 3a 78 6d 70 74 6b 3d 22 41 64 62 /" x:xmptk="Adob
00000100: 65 20 58 4d 50 20 43 72 65 20 35 2e 30 2d 63 e XMP Core 5.0-c
00000110: 30 36 31 20 36 34 2e 31 34 30 39 34 39 2c 20 32 061 64.140949, 2
00000120: 30 31 30 31 32 30 37 2d 31 30 3a 35 37 3a 010/12/07-10:57:

Although @CodeGnome is right and this might belong on Code Review SE, here you go anyway:
Slightly more efficient to combine the multiple sed commands into one, for example:
sed -n -e 's/^<pre>//' -e '/00000000:/,$p'
I decided to retract this part, as I'm not all that sure it's any better or clearer. Your version is fine, except that s/^<pre>// is better than s/^.....//.
Use exit 1 when checking the number of arguments to signal an error
What is for file in *. there? Iterate for all files ending with a dot? Typo?
Unless you're 100% sure the filenames will never contain spaces, you should quote them, but don't quote where you don't need, for example:
filename=$(basename "$file") # need to quote
extension=${filename##*.} # no need,
filename=${filename%.*} # no need
sed ... "$file" # need to quote
... | xxd > "$filename".jpg # need to quote
The last awk could be shorter and less error prone as a loop:
... | awk '{printf $2; for (i=3; i<=17; ++i) printf " " $i; print ""}'
It seems you want to learn. You might be interested in this other answer too: What are the rules to write robust shell scripts?
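To see what the awk loop produces, here is a sketch run against the first dump line from the question (with the leading <pre> already removed). It uses printf "%s" rather than passing the field as the format string, which is a small variation on the command above that avoids trouble if a field ever contains a % sign.

```shell
line='00000000: ff d8 ff e0 00 10 4a 46 49 46 00 01 01 01 00 48 ......JFIF.....H'

# Field 1 is the offset, fields 2..17 are the 16 hex bytes.
hex=$(printf '%s\n' "$line" |
  awk '{ printf "%s", $2; for (i = 3; i <= 17; ++i) printf " %s", $i; print "" }')

printf '%s\n' "$hex"   # prints ff d8 ff e0 00 10 4a 46 49 46 00 01 01 01 00 48
```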

The error message should be sent to stderr, should not hard-code the name of the script in case you rename it later, and should exit with a nonzero value.
if (( ! $# )); then
echo >&2 "Run script as '$0' \$extension"
exit 1
fi
If you're going to put the then on the same line as the if, then you should put the do on the same line as the for, too, for consistency:
for file in *.$1; do
Using file for the full name and filename for the basename is confusing variable name choice. I would use basename for the variable, to match the operation. And you need to quote the parameter expansion:
basename=$(basename "$file")
But you don't need to quote the right hand side of an assignment:
extension=${basename##*.}
The part of a filename without the extension is sometimes called the root (in vi and csh :-modifiers, you get it with :r)... using that name would be less confusing than changing an existing variable and reusing it:
root=${basename%.*}
As far as the actual pipeline, I would reorder it to put the head before the awk, since the sed and the head are all about what lines to print out and should be grouped together before the awk which modifies those selected lines. I would also use a loop and printf to make the awk a little more wieldy:
sed -n '/0\{8\}:/,$p' "$file" |
head -n -3 |
awk '{ printf "%s", $2; for (f=3;f<=17;++f) { printf " %s", $f }; print "" }' |
xxd -p -r > "$root.jpg"
done
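The line-selection part of that pipeline can be tried on its own. The sketch below uses a shortened, made-up sample instead of a real file, and leaves out head -n -3 (negative counts need GNU head) and xxd so that it runs anywhere:

```shell
# Shortened sample: one line of markup, then two dump lines.
sample='<hr/>
<pre>00000000: ff d8 ff e0
00000010: 00 48 00 00'

# Print from the first line containing eight zeros and a colon to the end,
# then strip the leading <pre> from the first printed line.
dump=$(printf '%s\n' "$sample" | sed -n '/0\{8\}:/,$p' | sed '1s/^<pre>//')
printf '%s\n' "$dump"
```

The output is the two dump lines, with the <hr/> line and the <pre> prefix gone.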
