The content of a variable created in bash contains some invisible symbols, and these are not a new line - bash

These 2 variables will have the same visible content
x_sign1="aabbccdd_and_somthing_else"
var1="...."
[........]
x_sign2=$(echo -n "${var1}${var2}${var3}" | shasum -a 256)
echo $x_sign2
====>
aabbccdd_and_somthing_else -
Note the "-" at the end.
However, their lengths are different, even though x_sign2 doesn't contain a newline character. To make sure of that:
x_sign22=$(echo -n "${var1}${var2}${var3}" | shasum -a 256 | tr -d '\n')
But:
echo ${#x_sign1}
====> 64
And:
echo ${#x_sign2}
====> 67
echo ${#x_sign22}
====> 67
The difference is 3 symbols. The visible content is identical.
Also, when I make a request via curl to a REST API which needs that value of a signature, x_sign1 always succeeds, whereas x_sign2 doesn't -- "wrong signature"
Why? How to fix that?

Nonprintable characters could be a lot of things. If you want to remove any "nonprintable" characters, then tell tr that's what you want removed.
POSIX tr has a number of named character classes. For all printable characters, use [:print:]. To remove everything NOT in that set, combine -d (delete) with -c (complement of the set). Note that the space character counts as printable, so this will not remove spaces.
So, instead of
tr -d '\n'
use
tr -dc '[:print:]'
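If the problem is just the " -" on the end: shasum prints the digest, then two spaces, then the input name, which is "-" for standard input. That accounts for the 3 extra characters (an unquoted echo collapses the two spaces into one, so only one is visible). Remove everything from the first space onward:
x_sign22="${x_sign22%% *}"
e.g.:
$: x="aabbccdd_and_somthing_else  -"
$: echo "[$x]"
[aabbccdd_and_somthing_else  -]
$: x="${x%% *}"
$: echo "[$x]"
[aabbccdd_and_somthing_else]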

Related

Bash script to add double quotes in .CSV comma delimited file

I need to add double quotes to the csv file. My sample data is like this..
378478,COMPLETED,Tracfone,,,"2020/03/29 09:39:22",,2787,,356074101197544,89148000005748235454,75176540
378328,COMPLETED,"Total Wireless","Unlimited Talk, Text, & Data (First 25GB High Speed, then unlimited 2GB)",50,"2020/03/29 06:10:01",200890899011202395,0899,0279395,356058102052972,89148000005117597971,67756296
I have tried some code available online with awk and sed, but the result is wrong, as shown below. The first digit of the first number is trimmed: for example, '378478' is displayed as '78478'.
It also adds double quotes around fields that are already quoted! Nothing seems to work perfectly. Please guide me!
"78478","COMPLETED","Tracfone","","",""2020/03/29 09:39:22"","","2787","","356074101197544","89148000005748235454","75176540"
"78328","COMPLETED",""Total Wireless"",""Unlimited Talk"," Text"," & Data (First 25GB High Speed"," then unlimited 2GB)"","50",""2020/03/29 06:10:01"","200890899011202395","0899","0279395","356058102052972","89148000005117597971","67756296"
"78329","COMPLETED",""Cricket Wireless"",""Unlimited Talk"," Text"," & 4G LTE Data w/ 15GB Hotspot"","60",""2020/03/29""
This is the code I am using:
awk -F"'?,'?" -v OFS='","' '{$1=$1; gsub(/^.|$/,"\"")} 1' file # or
sed -E 's/([^,]*) , (.*)/"\1" , "\2"/' file
My full code is below. My intention was to first convert all .xlsx files to .csv and then add double quotes to the same csv, saving it in the same file. I know the $file.csv part is wrong, hence I need some help.
find "$Src_Dir" -type f -iname "*.xlsx" -print>path/temp
cat path/temp | while IFS="" read -r -d $'\0' file;
do
echo $file
ssconvert "${file}" --export-type=Gnumeric_stf:stf_csv
awk -F"'?,'?" -v OFS='","' '{$1=$1; gsub(/^.|$/,"\"")} 1' $file > $file.csv
done
If you want to handle anything other than the simplest CSV files, you should probably move away from sed and awk. There are much better tools available.
For example, if you sudo apt install csvtool (or equivalent) on your favourite distro, you can use its call-per-line functionality to process each line in the input file. See the following script for an example:
#!/bin/bash
function quotify {
    # Start empty line, process every field.
    line=""
    while [[ $# -ne 0 ]] ; do
        # Append comma for all but first field, then quoted field.
        [[ -n "${line}" ]] && line="${line},"
        line="${line}\"$1\""
        shift
    done
    # Output the fully quoted line.
    echo "${line}"
}
# Needed to call functions. Also, ensure link: /bin/sh -> /bin/bash.
export -f quotify
# Pretty-print input and output.
echo "Input file:"
sed 's/^/ /' inputFile.csv
echo "Output file:"
csvtool call quotify inputFile.csv | sed 's/^/ /'
Note the quotify function which is called for each line in the CSV file, with the arguments set to each field within that line (sans quotes, whether the original fields had quotes or not).
It basically constructs a string of all the fields in the line, with quotes around them, then writes that to standard output, as shown below in the output from that script:
Input file:
378478,COMPLETED,Tracfone,,,"2020/03/29 09:39:22",,2787,,356074101197544,89148000005748235454,75176540
378328,COMPLETED,"Total Wireless","Unlimited Talk, Text, & Data (First 25GB High Speed, then unlimited 2GB)",50,"2020/03/29"
Output file:
"378478","COMPLETED","Tracfone","","","2020/03/29 09:39:22","","2787","","356074101197544","89148000005748235454","75176540"
"378328","COMPLETED","Total Wireless","Unlimited Talk, Text, & Data (First 25GB High Speed, then unlimited 2GB)","50","2020/03/29"
Even though using a separate tool is probably the easiest way to go, if you absolutely cannot install other packages, then you're going to have to code up something in a package you already have. The following bash script is a good place to start, as it uses no other tools to achieve its goal.
At the moment, it's tied to a very specific set of rules, as follows:
White space matters. Anything between the commas is considered part of the field. This especially matters when detecting a quoted field, it must have the quote as the first character, no abc, "d,e,f",ghi stuff since the "d,e,f" won't be handled correctly.
Quoted fields are allowed to contain commas, and "" sequences within them are turned into ".
It's probably not a good idea to supply ill-formatted CSV files :-)
But, with that in mind, here we go. I'll offer a brief textual description of each section but hopefully the comments in the code will be enough to figure out what's going on.
First, a function for finding the position of some string within another string, useful for working out the field bounds:
function findPos {
    haystack="$1"
    needle="$2"
    # Remove everything past the needle.
    prefix="${haystack%%${needle}*}"
    # If nothing was removed, it wasn't found, so supply massive number.
    # Otherwise, it was found at the length of the string with removed stuff.
    position=999999
    [[ ${#prefix} -ne ${#haystack} ]] && position=${#prefix}
    echo ${position}
}
Then we can use that in the function that works out the length of the next field. This basically just looks for the next comma for unquoted fields, and does special handling for quoted fields by building up the field from segments (it has to handle quotes within quotes and commas):
function getNextFieldLen {
    line="$1"
    # Empty line means all work done.
    [[ -z "${line}" ]] && echo -1 && return
    # Handle unquoted first, this is easy.
    [[ "${line:0:1}" != '"' ]] && { echo $(findPos "${line}" ","); return; }
    # Now handle quoted. Loop over all segments where a segment is defined as
    # the text up to the next <"">, assuming it's before the next <",>.
    field=""
    nextQuoteComma=$(findPos "${line}" '",')
    nextDoubleQuote=$(findPos "${line}" '""')
    while [[ ${nextDoubleQuote} -lt ${nextQuoteComma} ]]; do
        # Append segment to the field and go back for next segment.
        field="${field}${line:0:${nextDoubleQuote}}\"\""
        line="${line:${nextDoubleQuote}}"
        line="${line:2}"
        nextQuoteComma=$(findPos "${line}" '",')
        nextDoubleQuote=$(findPos "${line}" '""')
    done
    # Add final segment (up to the comma) and output entire field.
    field="${field}${line:0:${nextQuoteComma}}\""
    echo "${#field}"
}
Finally, there's the top-level function which will quotify whatever comes in via standard input:
function quotifyStdIn {
    # Process file line by line.
    while read -r line; do
        # Start with empty output line and non-comma separator.
        outLine="" ; sep=""
        # Place terminator to make processing easier, start field loop.
        line="${line},"
        fieldLen=$(getNextFieldLen "${line}")
        while [[ ${fieldLen} -ge 0 ]]; do
            # Get field and quotify if needed, adjust line (remove field and comma).
            field="${line:0:${fieldLen}}"
            [[ "${field:0:1}" = '"' ]] || field="\"${field}\""
            line="${line:$((fieldLen+1))}"
            #line="${line:${fieldLen}}"
            #line="${line:1}"
            # Append to output line and prepare for next field.
            outLine="${outLine}${sep}${field}"; sep=","
            fieldLen=$(getNextFieldLen "${line}")
        done
        # Output built line.
        echo "${outLine}"
    done
}
And, on the off-chance you want to read directly from a file (though providing a file name that's empty or "-" will use standard input so you can probably just use the file-based function for everything):
function quotifyFile {
    file="$1"
    # Empty file or "-" means standard input, otherwise take input from real file.
    [[ ${#file} -eq 0 ]] && { quotifyStdIn; return; }
    [[ "${file}" = "-" ]] && { quotifyStdIn; return; }
    quotifyStdIn < "${file}"
}
And, finally, because every program that's not a "Hello, world" one deserves some form of test harness, this is what you can use to test the various capabilities:
(
echo 'paxdiablo,was here'
echo 'and,"then, strangely,",he,was,not'
echo '50,"My name is ""Pax"", and yours is ""Bob""",42'
echo '17,"""Love"" is grand",19'
) > harness.csv
echo "Before:"
sed "s/^/ /" harness.csv
echo "After:"
quotifyFile harness.csv | sed "s/^/ /"
rm -rf harness.csv
And, since a test harness is of little use unless you run the tests, here's the results of the first run:
Before:
paxdiablo,was here
and,"then, strangely,",he,was,not
50,"My name is ""Pax"", and yours is ""Bob""",42
17,"""Love"" is grand",19
After:
"paxdiablo","was here"
"and","then, strangely,","he","was","not"
"50","My name is ""Pax"", and yours is ""Bob""","42"
"17","""Love"" is grand","19"
Hopefully, that will be enough to get you going in the absence of being able to install packages. Of course, if one of the packages you can't install is bash itself, then you have problems that I can't help you with :-)
Your starting CSV is not a good CSV: the 2 rows have a different number of columns
+--------+-----------+----------------+--------------------------------------------------------------------------+----+---------------------+---+------+---+-----------------+----------------------+----------+
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
+--------+-----------+----------------+--------------------------------------------------------------------------+----+---------------------+---+------+---+-----------------+----------------------+----------+
| 378478 | COMPLETED | Tracfone | - | - | 2020/03/29 09:39:22 | - | 2787 | - | 356074101197544 | 89148000005748235454 | 75176540 |
| 378328 | COMPLETED | Total Wireless | Unlimited Talk, Text, & Data (First 25GB High Speed, then unlimited 2GB) | 50 | 2020/03/29 | - | - | - | - | - | - |
+--------+-----------+----------------+--------------------------------------------------------------------------+----+---------------------+---+------+---+-----------------+----------------------+----------+
Using Miller (https://github.com/johnkerl/miller) you could run
mlr --csv --quote-all -N unsparsify input >output
to have
"378478","COMPLETED","Tracfone","","","2020/03/29 09:39:22","","2787","","356074101197544","89148000005748235454","75176540"
"378328","COMPLETED","Total Wireless","Unlimited Talk, Text, & Data (First 25GB High Speed, then unlimited 2GB)","50","2020/03/29","","","","","",""
You can use it by downloading the executable from https://github.com/johnkerl/miller/releases/tag/v5.7.0
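Separately from the quoting itself, the loop in the question mixes find -print (newline-separated names) with read -d $'\0' (NUL-separated input), and writes to $file.csv, which yields names like report.xlsx.csv. A minimal sketch of one way to repair both, assuming a hypothetical helper csv_name_for and leaving the actual conversion and quoting commands as comments because they depend on the tool you pick:

```bash
#!/usr/bin/env bash
# Source directory; the question uses "$Src_Dir". Defaults to "." for this sketch.
Src_Dir=${Src_Dir:-.}

# Hypothetical helper: derive "name.csv" from "name.xlsx" by dropping the last extension.
csv_name_for() {
    printf '%s.csv\n' "${1%.*}"
}

# -print0 pairs with read -d '' so names containing spaces or newlines survive.
while IFS= read -r -d '' file; do
    out=$(csv_name_for "$file")
    echo "would convert: $file -> $out"
    # ssconvert --export-type=Gnumeric_stf:stf_csv "$file" "$out.tmp"
    # <quoting command of your choice> "$out.tmp" > "$out" && rm -- "$out.tmp"
done < <(find "$Src_Dir" -type f -iname '*.xlsx' -print0)
```

Writing to a temporary file first avoids reading and truncating the same file in one pipeline.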

How to convert this string to an iterable list of Python files in Bash?

I have a string as follows:
string='[ "file1.py", "file2.py", "file3.py", "file4.py" ]'
How to convert it to an iterable list of individual filenames so that I can run a for loop on it as follows:
for filen in fileArr
do
echo $filen
done
Expected output:
file1.py
file2.py
file3.py
file4.py
So far I have just removed the first and last square brackets using string=${string:1:${#string}-2} but I still have quotations and commas to remove. Is there a clean and simple way to achieve this?
You can use the tr command to eliminate the start bracket, the end bracket, and the double quotation marks. You can also substitute a tab for each comma so that you can later iterate over the result.
Here is the code:
$ s='[ "file1.py", "file2.py", "file3.py", "file4.py" ]'
$ s2=$(echo $s | tr "[" " " | tr "]" " " | tr "\"" " " | tr "," "\\t")
$ for x in $s2; do echo $x ; done
file1.py
file2.py
file3.py
file4.py
More in-depth, the s2 statement uses:
s2=$( --> Assigns the output of the full command to s2
echo $s --> Prints the content of s to the pipes
| tr "[" " " --> Substitute the start bracket by spaces
| tr "]" " " --> Substitute the end bracket by spaces
| tr "\"" " " --> Substitute the double quotations by spaces
| tr "," "\\t" --> Substitute the comma by a tab
)
Notice that this code is valid for the kind of input you provided but it will not work if the filenames contain spaces.
EDIT:
Another solution is possible using substring replacement instead of using the tr command.
Here is the code:
$ s2=${s//\[/} # Erase the start brackets
$ s3=${s2//\]/} # Erase the end brackets
$ s4=${s3//\"/} # Erase the double quotations
$ s5=${s4//,/ } # Substitute the comma by a space
$ for x in $s5; do echo $x ; done
file1.py
file2.py
file3.py
file4.py
As with the previous solution, notice that the code will not work if the filenames contain spaces (since they will be considered separate entries).
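If filenames may contain spaces, one possible variation is to split on commas only and then trim each element. This is a sketch under the assumption that no filename contains a comma or a double quote; the sample string with "my file2.py" is made up for illustration.

```bash
#!/usr/bin/env bash
# Sample input; "my file2.py" is an invented name to show the space handling.
s='[ "file1.py", "my file2.py", "file3.py" ]'

# Drop the outer brackets, then split on commas only (IFS applies to this read alone).
inner=${s#\[}
inner=${inner%\]}
IFS=',' read -r -a raw_items <<< "$inner"

files=()
for item in "${raw_items[@]}"; do
    item=${item#"${item%%[![:space:]]*}"}   # trim leading whitespace
    item=${item%"${item##*[![:space:]]}"}   # trim trailing whitespace
    item=${item#\"}                         # drop opening quote
    item=${item%\"}                         # drop closing quote
    files+=("$item")
done

for f in "${files[@]}"; do
    echo "$f"
done
```

Because IFS is set only on the read command, the shell's global IFS stays untouched and nothing needs restoring afterwards.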
EDIT 2:
As @Z4-tier pointed out, the first option can be rewritten more compactly using the -d option of tr, which deletes the given characters. Also, the string obtained after parsing might not be iterable if the Internal Field Separator (IFS) is set incorrectly. Although I think the previous solutions cover most cases, you might consider setting and restoring the IFS value if it was set to something other than the default.
Hence, you could write:
$ s='[ "file1.py", "file2.py", "file3.py", "file4.py" ]'
$ s2=$(echo $s | tr -d "[]\"" | tr "," "\\t")
$ oldIFS=$IFS; IFS=$' \t\n' # default IFS: space, tab, newline
$ for x in $s2; do echo $x ; done
file1.py
file2.py
file3.py
file4.py
$ IFS=$oldIFS # restore your IFS value
tr <string1> <string2> will replace any occurrence of a character in string1 with the character appearing in the same index position from string2, so this can be done with 1 pipe.
tr can also be used as tr -d <string1> where any occurrence of a character in string1 is deleted.
s='[ "file1.py", "file2.py", "file3.py", "file4.py" ]'
s2=$(echo $s | tr -d '[]",')
This is not iterable like you might think. It is still one string:
bash-$ echo "${s2[0]}"
file1.py file2.py file3.py file4.py
Try this:
IFS=$' '
bash-$ my_array=($(echo $s | tr -d '[]",'))
bash-$ echo "${my_array[0]}"
file1.py
bash-$ for k in "${my_array[@]}"; do echo $k; done
file1.py
file2.py
file3.py
file4.py
This sets IFS (the internal field separator) to a space character to take advantage of the spaces that are left in the string after it gets run through tr. It uses a command substitution to translate s into what we called s2 above (now using my_array to indicate that it's not a string...), and surrounds it with (...) to create an indexed array (this is where IFS is important).

Elegant way to replace tr '\n' '\0' (Null byte generating warnings at runtime)

I doubt that I am making the best use of grep in my code, and I would like to find a better and cleaner way to extract the session ID and security level from my cookie file:
cat mycookie
# Netscape HTTP Cookie File
# https://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.
#HttpOnly_127.0.0.1 FALSE / FALSE 0 PHPSESSID 1hjs18icittvqvpa4tm2lv9b12
#HttpOnly_127.0.0.1 FALSE /mydir/ FALSE 0 security medium
The expected output is the SSID hash :
1hjs18icittvqvpa4tm2lv9b12
Piping grep with tr '\n' '\0' works like a charm on the command line, but generates warnings (warning: command substitution: ignored null byte in input) when the bash script runs. Here is the related code (with warnings):
ssid=$(grep -Po 'PHPSESSID.*' path/sessionFile | grep -Po '[a-z]|[0-9]' | tr '\n' '\0')
I am using bash 4.4.12 (x86_64-pc-linux-gnu) and found this crystal clear explanation:
Bash variables are stored as C strings. C strings are NUL-terminated.
They thus cannot store NULs by definition.
In both cases, the solutions I found use read:
# read content from stdin into array variable and a scalar variable "suffix"
array=( )
while IFS= read -r -d '' line; do
array+=( "$line" )
done < <(process that generates NUL stream here)
suffix=$line # content after last NUL, if any
# emit recorded content
printf '%s\0' "${array[@]}"; printf '%s' "$suffix"
I don't want to use arrays nor a while loop for this specific case, or others. I found this workaround using sed:
ssid=$(grep -Po 'PHPSESSID.*' path/sessionFile | grep -Po '[a-z]|[0-9]' | tr '\n' '_' | sed -e 's/_//g')
My two questions are:
1) Is there a better way to replace tr '\n' '\0' without using read in a while loop?
2) Is there a better way to extract the SSID and security level properly?
Thx
It looks like you're trying to get rid of the newlines in the output from grep, but turning them into nulls doesn't do this. Nulls aren't visible in your terminal, but are still there and (like many other nonprinting characters) will wreak havoc if they get treated as part of your actual data. If you want to get rid of the newlines, just tell tr to delete them for you with ... | tr -d '\n'. But if you're trying to get the PHPSESSID value from a Netscape-format cookie file, there's a much much better way:
ssid=$(awk '($6 == "PHPSESSID") {print $7}' path/sessionFile)
This looks for "PHPSESSID" only in the sixth field (not in e.g. the path or cookie values -- both places it could legally appear), and specifically prints the seventh field of matching lines (not just anything after "PHPSESSID" that happens to be a digit or lowercase letter).
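The question also asks for the security level, and the same field test extends to it. The sketch below wraps the awk call in a hypothetical helper named cookie_value and builds a throwaway sample file. It assumes the cookie file is tab-separated, as the Netscape format normally is; if yours uses spaces, drop the -F option.

```bash
#!/usr/bin/env bash
# Throwaway sample file mirroring the question's cookie jar (tab-separated).
cookie_file=$(mktemp)
printf '%s\n' \
    '# Netscape HTTP Cookie File' \
    $'#HttpOnly_127.0.0.1\tFALSE\t/\tFALSE\t0\tPHPSESSID\t1hjs18icittvqvpa4tm2lv9b12' \
    $'#HttpOnly_127.0.0.1\tFALSE\t/mydir/\tFALSE\t0\tsecurity\tmedium' > "$cookie_file"

# Hypothetical helper: print the value (field 7) of the cookie whose name (field 6) matches.
cookie_value() {
    awk -F '\t' -v name="$1" '$6 == name { print $7 }' "$2"
}

ssid=$(cookie_value PHPSESSID "$cookie_file")
level=$(cookie_value security "$cookie_file")
echo "ssid=$ssid level=$level"
rm -f -- "$cookie_file"
```

Passing the cookie name with -v keeps it out of the awk program text, so the same function serves both lookups.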
You could also try this, if you don't want to use awk:
ssid=$(grep -P '\bPHPSESSID\b' you_cookies_file)
echo $ssid # for debug only
which outputs something like
#HttpOnly_127.0.0.1 FALSE / FALSE 0 PHPSESSID 1hjs18icittvqvpa4tm2lv9b12
Then with cut(1) extract the relevant field:
echo $ssid |cut -d" " -f7
which outputs
1hjs18icittvqvpa4tm2lv9b12
Of course you should capture the last echo.
UPDATE
If you don't want to use cut, it is possible to emulate it with:
echo $ssid | (read a1 b2 c3 d4 e5 f6 g7; echo $g7)
Demonstration to capture in a variable:
$ field=$(echo $ssid | (read a1 b2 c3 d4 e5 f6 g7; echo $g7))
$ echo $field
1hjs18icittvqvpa4tm2lv9b12
$
Another way is to use positional parameters passing the string to a function which then refers to $7. Perhaps cleaner. Otherwise, you can use an array:
array=($(echo $ssid))
echo ${array[6]} # outputs the 7th field
It should also be possible to use regular expressions and/or string manipulation in bash, but they seem a little more difficult to me.
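For what it's worth, the regular-expression route could look roughly like the sketch below. It uses bash's =~ operator on a single line copied from the question's cookie file; the variable names are arbitrary, and a real script would read the line from the file first.

```bash
#!/usr/bin/env bash
# Sample line copied from the question's cookie file.
line='#HttpOnly_127.0.0.1 FALSE / FALSE 0 PHPSESSID 1hjs18icittvqvpa4tm2lv9b12'

# Five leading fields, the literal cookie name as the sixth, then capture the value.
re='^([^[:space:]]+[[:space:]]+){5}PHPSESSID[[:space:]]+([^[:space:]]+)'
ssid=''
if [[ $line =~ $re ]]; then
    ssid=${BASH_REMATCH[2]}
fi
echo "$ssid"
```

Keeping the pattern in a variable and leaving it unquoted on the right of =~ avoids bash treating it as a literal string.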

script variable in tr

I want to make a script that is looking for special numbers.
numbers like this 153 = 1^3+5^3+3^3
bash script 153 3
153
In my script I have this kinda thing
echo "$1" | tr -d " " | sed -e 's/\([[:digit:]]\)/\1+/g' | tr '+' '^"$2"+'
That last command doesn't work, it does change something, it changes 1+5+3+ to 1^+5^+3^+
So my question is: how can I use variables in tr?
tr replaces one character with another one. It can't replace one character with a longer string. That's sed's job:
set -- 153 3
echo "$1" | \
tr -d " " | \
sed -e 's/\([[:digit:]]\)/\1^'"$2"'+/g; s/+$//'
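Since the stated goal is to find numbers such as 153 = 1^3+5^3+3^3, the expression can also be built and evaluated in bash alone. This is only a sketch: the function name power_sum is made up, and it assumes the first argument consists solely of decimal digits.

```bash
#!/usr/bin/env bash
# Build "1^3+5^3+3^3" from a number and an exponent, and evaluate the sum.
power_sum() {
    local number=$1 exponent=$2 terms='' sum=0 digit i
    for (( i = 0; i < ${#number}; i++ )); do
        digit=${number:i:1}
        # Separate terms with "+" once the first term is in place.
        if [[ -n $terms ]]; then
            terms+='+'
        fi
        terms+="${digit}^${exponent}"
        (( sum += digit ** exponent ))
    done
    echo "$terms = $sum"
}

power_sum 153 3
```

Comparing the computed sum with the original number then tells you whether the number is one of the special ones.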
The answer by choroba is correct. Here is a python based one-liner:
$ set -- 153 3
$ python3 -c "print('+'.join([x+'^$2' for x in list('$1')]))"
1^3+5^3+3^3
Explanation:
list will convert the string "153" to ['1', '5', '3']
[ x+'^$2' for x in <list> ] is called list comprehension. Effectively it returns another list: ['1^3', '5^3', '3^3']
Then join them with '+'
NOTE: The only reason I added this answer is that it does not require adjusting the completed string after processing by built-in functions.
Below are the other common approaches:
$ python3 -c "print('^$2+'.join(list('$1')) + '^$2')" # Add "^3" after join returns "1^3+5^3+3"
$ echo $1 | sed "s/./&^$2+/g; s/+$//" # Remove last '+' sign from "1^3+5^3+3^3+"

to print words separated with special characters in shell script

I want a shell script that prints three words with labels. I have tried:
{
a="Uname/pass#last"
echo $a | tr "/" "\n" | tr "#" "\n"
output is:
Uname
pass
last
}
I want it as
{Username- Uname
Password- pass
lastname-last}
Ok, I guess you want to add a prefix to each result:
printf 'Username\nPassword\nlastname' > /tmp/prefixes
a="Uname/pass#last"
echo "${a}" | tr '/#' '\n\n' | paste -d':' /tmp/prefixes -
ie: paste together the output of /tmp/prefixes and of the Standard Input (-), which is receiving the output of : echo ".../...#..." | tr '/#' '\n\n'
(and in the resulting output, separate the 2 with a : in this example, or whatever else you would want. Ex: - like in your question.)
and it outputs :
Username:Uname
Password:pass
lastname:last
(I know you wanted a - instead of a : but I give my example with : to better separate the "-" denoting the standard input, and the ":" denoting the field-separator character in the output. Just change -d':' into -d'-' to have a - instead.)
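The same labelling can be sketched without a temporary file by letting read split the string on both separators. This assumes the password contains neither / nor #; otherwise the extra pieces end up in the last variable.

```bash
#!/usr/bin/env bash
a="Uname/pass#last"

# IFS is set for this one read only; "/" and "#" both act as separators.
IFS='/#' read -r uname pass last <<< "$a"

printf 'Username-%s\nPassword-%s\nlastname-%s\n' "$uname" "$pass" "$last"
```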
First off, I hope you're not going to manipulate important passwords in a shell script and external commands. There are some risks involved with that.
Defining the problem
I suspect you want to split a string encoding a user's username, password and surname into a three-line structure, adding tags to document which is which. For that, tr is insufficient.
However, it can be done inside the shell.
Example (bash, ksh):
function split_account_string {
    typeset account=${1:?account string} uname pass last t
    uname=${account%%/*}
    last=${account##*#}
    t=${account#$uname/}
    pass=${t%#*}
    [[ $uname/$pass#$last == "$account" ]] || return
    echo "{Username-$uname"
    echo "Password-$pass"
    echo "lastname-$last}"
}
split_account_string "USER_A/seKreT#John.Doe"
This function will extract all tokens between the first / and the last # as the value of the password. If either one is missing, it will print nothing, and return an error status.
When run, this gives:
{Username-USER_A
Password-seKreT
lastname-John.Doe}
Use this simple script and get the output.
#!/bin/bash
a="Uname/pass#last"
array2=(`echo $a | tr "/" "\n" | tr "#" "\n"`)
array1=(`echo -e "Username\nPassword\nlastname"`)
i=${#array1[@]}
for (( j=0 ; j<$i ; j++ ))
do
    echo "${array1[$j]}=${array2[$j]}"
done

Resources