Struggling with spaces in file.txt with cat - bash

Im trying to create a list of file-paths from a file but I don't seem to be able to get around the spaces in the file paths.
# Show current series list
PS3="Type a number or 'q' to quit: "
# Create a list of files to display
Current_list=`cat Current_series_list.txt`
select fileName in $Current_list; do
if [ -n "$fileName" ]; then
Selected_series=${fileName}
fi
break
done
The file path in the Current_series list is: /Volumes/Lara's Hard Drive/LARA HARD DRIVE/Series/The Big Bang Theory 3/The.Big.Bang.Theory S03E11.avi
and
/Volumes/Lara's Hard Drive/LARA HARD DRIVE/Series/nakitaS03E11.avi
So i would like them two be 1 and 2 respectively in my list but I get the following result.
1) /Volumes/Lara's 6) Big
2) Hard 7) Bang
3) Drive/LARA 8) Theory
4) HARD 9) 3/The.Big.Bang.Theory
5) DRIVE/Series/The 10) S03E11.avi
Type a number or 'q' to quit:

You need to trick it a little bit:
# Show current series list
PS3="Type a number or 'q' to quit: "
# Create a list of files to display
Current_list=$(tr '\n' ',' < Current_series_list.txt)
IFS=, read -a list <<< "$Current_list"
select fileName in "${list[#]}"; do
if [ -n "$fileName" ]; then
Selected_series="${fileName}"
fi
break
done
echo "you selected $fileName"
Executing:
$ ./a
1) /Volumes/Lara's Hard Drive/LARA HARD DRIVE/Series/The Big Bang Theory3/The.Big.Bang.Theory S03E11.avi
2) /Volumes/Lara's Hard Drive/LARA HARD DRIVE/Series/nakitaS03E11.avi
Type a number or 'q' to quit: 2
you selected /Volumes/Lara's Hard Drive/LARA HARD DRIVE/Series/nakitaS03E11.avi
The key point is that you have to convert a file into an array.
This part converts it into a "string one", "string two" format:
$ tr '\n' ',' < Current_series_list.txt
/Volumes/Lara's Hard Drive/LARA HARD DRIVE/Series/The Big Bang Theory 3/The.Big.Bang.Theory S03E11.avi,/Volumes/Lara's Hard Drive/LARA HARD DRIVE/Series/nakitaS03E11.avi,
While this one creates an array in the variable list based on the comma delimiter that was set in the previous step:
IFS=, read -a list <<< "$Current_list"

You could try to read each line of Current_series_list.txt separately into an array element and select from the expanded array "${Current_array[#]}":
# Show current series list
PS3="Type a number or 'q' to quit: "
# Create an array of files to display
Current_array=()
while read line; do Current_array+=("$line"); done < Current_series_list.txt
select fileName in "${Current_array[#]}"; do
if [ -n "$fileName" ]; then
Selected_series=${fileName}
fi
break
done

Related

Bash: checking substring increments with modular arithmetic

I have a list of files with file names that contain a substring of 6 numbers that represents HHMMSS, HH: 2 digits hour, MM: 2 digits minutes, SS: 2 digits seconds.
If the list of files is ordered, the increments should be in steps of 30 minutes, that is, the first substring should be 000000, followed by 003000, 010000, 013000, ..., 233000.
I want to check that no file is missing iterating the list of files and checking that neither of these substrings is missing. My approach:
string_check=000000
for file in ${file_list[#]}; do
if [[ ${file:22:6} == $string_check ]]; then
echo "Ok"
else
echo "Problem: an hour (file) is missing"
exit 99
fi
string_check=$((string_check+3000)) #this is the key line
done
And the previous to the last line is the key. It should be formatted to 6 digits, I know how to do that, but I want to add time like a clock, or, in more specific words, modular arithmetic modulo 60. How can that be done?
Assumptions:
all 6-digit strings are of the format xx[03]0000 (ie, has to be an even 00 or 30 minutes and no seconds)
if there are strings like xx1529 ... these will be ignored (see 2nd half of answer - use of comm - to address OP's comment about these types of strings being an error)
Instead of trying to do a bunch of mod 60 math for the MM (minutes) portion of the string, we can use a sequence generator to generate all the desired strings:
$ for string_check in {00..23}{00,30}00; do echo $string_check; done
000000
003000
010000
013000
... snip ...
230000
233000
While OP should be able to add this to the current code, I'm thinking we might go one step further and look at pre-parsing all of the filenames, pulling the 6-digit strings into an associative array (ie, the 6-digit strings act as the indexes), eg:
unset myarray
declare -A myarray
for file in ${file_list}
do
myarray[${file:22:6}]+=" ${file}" # in case multiple files have same 6-digit string
done
Using the sequence generator as the driver of our logic, we can pull this together like such:
for string_check in {00..23}{00,30}00
do
[[ -z "${myarray[${string_check}]}" ]] &&
echo "Problem: (file) '${string_check}' is missing"
done
NOTE: OP can decide if the process should finish checking all strings or if it should exit on the first missing string (per OP's current code).
One idea for using comm to compare the 2 lists of strings:
# display sequence generated strings that do not exist in the array:
comm -23 <(printf "%s\n" {00..23}{00,30}00) <(printf "%s\n" "${!myarray[#]}" | sort)
# OP has commented that strings not like 'xx[03]000]` should generate an error;
# display strings (extracted from file names) that do not exist in the sequence
comm -13 <(printf "%s\n" {00..23}{00,30}00) <(printf "%s\n" "${!myarray[#]}" | sort)
Where:
comm -23 - display only the lines from the first 'file' that do not exist in the second 'file' (ie, missing sequences of the format xx[03]000)
comm -13 - display only the lines from the second 'file' that do not exist in the first 'file' (ie, filenames with strings not of the format xx[03]000)
These lists could then be used as input to a loop, or passed to xargs, for additional processing as needed; keeping in mind the comm -13 output will display the indices of the array, while the associated contents of the array will contain the name of the original file(s) from which the 6-digit string was derived.
Doing this easy with POSIX shell and only using built-ins:
#!/usr/bin/env sh
# Print an x for each glob matched file, and store result in string_check
string_check=$(printf '%.0sx' ./*[0-2][0-9][03]000*)
# Now string_check length reflects the number of matches
if [ ${#string_check} -eq 48 ]; then
echo "Ok"
else
echo "Problem: an hour (file) is missing"
exit 99
fi
Alternatively:
#!/usr/bin/env sh
if [ "$(printf '%.0sx' ./*[0-2][0-9][03]000*)" \
= 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' ]; then
echo "Ok"
else
echo "Problem: an hour (file) is missing"
exit 99
fi

Creating a progress bar for BASH script exporting system log files

Essentially for a set number of systems logs pulled and exported I need to indicate the scripts progress by printing a character "#". This should eventually create a progress bar with a width of 60. Something like what's presented below: ############################################# ,additionally I need the characters to build from left to right indicating the progression of the script.
The Question/Problem that this code was based off of goes as follows: "Use a separate invocation of wevtutil el to get the count of the number of logs and scale this to,say, a width of 60."
SYSNAM=$(hostname)
LOGDIR=${1:-/tmp/${SYSNAM}_logs}
i=0
LOGCOUNT=$( wevtutil el | wc -l )
x=$(( LOGCOUNT/60 ))
wevtutil el | while read ALOG
do
ALOG="${ALOG%$'\r'}"
printf "${ALOG}:\r"
SAFNAM="${ALOG// /_}"
SAFNAM="${SAFNAM//\//-}"
wevtutil epl "$ALOG" "${SYSNAM}_${SAFNAM}.evtx"
done
I've attempted methods such as using echo -ne "#", and printf "#%0.s" however the issue that I encounter is that the "#" characters gets printed with each instance of the name of the log file being retrieved; also the pattern is printed vertically rather than horizontally.
LOGCOUNT=$( wevtutil el | wc -l )
x=$(( LOGCOUNT/60 ))
echo -ne "["
for i in {1..60}
do
if [[ $(( x*i )) != $LOGCOUNT ]]
then
echo -ne "#"
#printf '#%0.s'
fi
done
echo "]"
printf "\n"
echo "Transfer Complete."
echo "Total Log Files Transferred: $LOGCOUNT"
I tried previously integrating this code into the first block but no luck. But something tells me that I don't need to establish a whole new loop, I keep thinking that the first block of code only needs a few lines of modification. Anyhow sorry for the lengthy explanation, please let me know if anything additional is needed for assistance--Thank you.
For the sake of this answer I'm going to assume the desired output is a 2-liner that looks something like:
$ statbar
file: /bin/cygdbusmenu-qt5-2.dll
[######## ]
The following may not work for everyone as it comes down to individual terminal attributes and how they can(not) be manipulated by tput (ie, ymmv) ...
For my sample script I'm going to loop through the contents of /bin, printing the name of each file as I process it, while updating the status bar with a new '#' after each 20 files:
there are 719 files under my /bin so there should be 35 #'s in my status bar (I add an extra # at the end once processing has completed)
we'll use a few tput commands to handle cursor/line movement, plus erasing previous output from a line
for printing the status bar I've pre-calculated the number of #'s and then use 2 variables ... $barspace for spaces, $barhash for #'s; for each 20 files I strip a space off $barspace and add a single # to $barhash; by (re)printing these 2x variables every 20x files I get the appearance of a moving status bar
Putting this all together:
$ cat statbar
clear # make sure we have plenty of room to display our status bar;
# if we're at the bottom of the console/window and we cause the
# windows to 'scroll up' then 'tput sc/rc' will not work
tput sc # save pointer/reference to current terminal line
erase=$(tput el) # save control code for 'erase (rest of) line'
# init some variables; get a count of the number of files so we can pre-calculate the total length of our status bar
modcount=20
filecount=$(find /bin -type f | wc -l)
# generate a string of filecount/20+1 spaces (35+1 for my particular /bin)
barspace=
for ((i=1; i<=(filecount/modcount+1); i++))
do
barspace="${barspace} "
done
barhash= # start with no #'s for this variable
filecount=0 # we'll re-use this variable to keep track of # of files processed so need to reset
while read -r filename
do
filecount=$((filecount+1))
tput rc # return cursor to previously saved terminal line (tput sc)
# print filename (1st line of output); if shorter than previous filename we need to erase rest of line
printf "file: ${filename}${erase}\n"
# print our status bar (2nd line of output) on the first and every ${modcount} pass through loop;
if [ ${filecount} -eq 1 ]
then
printf "[${barhash}${barspace}]\n"
elif [[ $((filecount % ${modcount} )) -eq 0 ]]
then
# for every ${modcount}th file we ...
barspace=${barspace:1:100000} # strip a space from barspace
barhash="${barhash}#" # add a '#' to barhash
printf "[${barhash}${barspace}]\n" # print our new status bar
fi
done < <(find /bin -type f | sort -V)
# finish up the status bar (should only be 1 space left to 'convert' to a '#')
tput rc
printf "file: -- DONE --\n"
if [ ${#barspace} -gt 0 ]
then
barspace=${barspace:1:100000}
barhash="${barhash}#"
fi
printf "[${barhash}${barspace}]\n"
NOTE: While testing I had to periodically reset my terminal in order for the tput commands to function properly, eg:
$ reset
$ statbar
I couldn't get the above to work on any of the (internet) fiddle sites (basically having problems getting tput to work with the web-based 'terminals').
Here's a gif displaying the behavior ...
NOTES:
the script does print every filename to stdout but since this script isn't actually doing anything with the files in question a) the printfs occur quite rapidly and b) the video/gif only captures a (relatively) few fleeting images ("Duh, Mark!" ?)
the last printf "file: -- DONE --\n" was added after I created the gif, and I'm being lazy by not generating and uploading a new gif

Random word Bash script if a number is supplied as the first command line argument then it will select from only words with that many characters

I am trying to create a Bash script that
- prints a random word
- if a number is supplied as the first command line argument then it will select from only words with that many characters.
This is my go at the first section (print a random word):
C=$(sed -n "$RANDOM p" /usr/share/dict/words)
echo $C
I am really stuck with the second section. Can anyone help?
might help someone coming from ryans tutorial
#!/bin/bash
charlen=$1
grep -E "^.{$charlen}$" $PWD/words.txt | shuf -n 1
you have to use a while loop to read every single line of that file and check if the length of a word equals the specified number ( including apostrophes ). In my o.s it is 99171 line ( i.e the file).
#!/usr/bin/env bash
readWords() {
declare -i int="$1"
(( int == 0 )) && {
printf "%s\n" "$int is 0, cant find 0 words"
return 1
}
while read getWords;do
if [[ ${#getWords} -eq $int ]];then
printf "%s\n" "$getWords"
fi
done < /usr/share/dict/words
}
readWords 20
this function takes a single argument. the declare command coerces the argument into an integer, if the argument is a string , it coerces it into a number which is 0 . Since we don't have 0 words if the specified argument ( number ) is 0 ( or a string coerced to 0 ) return from the function.
Read every single line in /usr/share/dict/words, get the length of each line with ${#getWords} ( $# >> gives the length of a string/commandline parameters/array size ) check if it equals the specified argument ( number )
A loop is not required, you can do something like
CH=$1; # how many characters the word must have
WordFile=/usr/share/dict/words; # file to read from
# find how many words that matches that length
TOTW=$(grep -Ec "^.{$CH}$" $WordFile);
# pick a random one, if you expect more than 32767 hits you
# need to do something like ($RANDOM+1)*($RANDOM+1)
RWORD=$(($RANDOM%$TOTW+1));
#show that word
grep -E "^.{$CH}$" $WordFile|sed -n "$RWORD p"
Depending on things you probably need to add checks for things like that $1 is a reasonable number, the file exist, that TOTW is >0 and so on.
This code would achieve what you want:
awk -v n="$1" 'length($0) == n' /usr/share/dict/words > /tmp/wordsHolder
shuf -n 1 /tmp/wordsHolder
Some comments: by using "$RANDOM" (as you did on your original script attempt), one would generate an integer on the range 0 - 32767, which could be more (or less) than the number of words (lines) available, given the desired number of characters on a word -- thus, potential for errors here.
To avoid that, we are using a shuf syntax that will retrieve a (sub)randomly picked word (line) on the file using its entire range (from line 1 - last line of file).

Storing multiple columns of data from a file in a variable

I'm trying to read from a file the data that it contains and get 2 important pieces of data from the file and use it in a bash script. A string and then a number for example:
Box 12
Toy 85
Dog 13
Bottle 22
I was thinking I could write a while loop to loop through the file and store the data into a variable. However I need two different variables, one for the number and one for the word. How do I get them separated into two variables?
Example code:
#!/bin/bash
declare -a textarr numarr
while read -r text num;do
textarr+=("$text")
numarr+=("$num")
done <file
echo ${textarr[1]} ${numarr[1]} #will print Toy 85
data are stored into two array variables: textarr numarr.
You can access each one of them using index ${textarr[$index]} or all of them at once with ${textarr[#]}
To read all the data into a single associative array (in bash 4.0 or newer):
#!/bin/bash
declare -A data=( )
while read -r key value; do
data[$key]=$value
done <file
With that done, you can retrieve a value by key efficiently:
echo "${data[Box]}"
...or iterate over all keys:
for key in "${!data[#]}"; do
value=${data[$key]}
echo "Key $key has value $value"
done
You'll note that read takes multiple names on its argument list. When given more than one argument, it splits fields by IFS, putting columns into their respective variables (with the entire rest of the line going into the last variable named, if more columns exist than variables are named).
Here I provide my own solution which should be discussed. I am not sure this is a good solution or not. Using while read construct has the drawback of starting a new shell and it will not be able to update a variable outside the loop. Here is an example code which you can modify to suite your own need. If you have more column data to use, then slight adjustment is need.
#!/bin/sh
res=$(awk 'BEGIN{OFS=" "}{print $2, $3 }' mytabularfile.tab)
n=0
for x in $res; do
row=$(expr $n / 2)
col=$(expr $n % 2)
#echo "row: $row column: $col value: $x"
if [ $col -eq 0 ]; then
if [ $n -gt 0 ]; then
echo "row: $row "
echo col1=$col1 col2=$col2
fi
col1=$x
else
col2=$x
fi
n=$(expr $n + 1)
done
row=$(expr $row + 1)
echo "last row: $row col1=$col1 col2=$col2"

While loop computed hash compare in bash?

I am trying to write a script to count the number of zero fill sectors for a dd image file. This is what I have so far, but it is throwing an error saying it cannot open file #hashvalue#. Is there a better way to do this or what am I missing? Thanks in advance.
count=1
zfcount=0
while read Stuff; do
count+=1
if [ $Stuff == "bf619eac0cdf3f68d496ea9344137e8b" ]; then
zfcount+=1
fi
echo $Stuff
done < "$(dd if=test.dd bs=512 2> /dev/null | md5sum | cut -d ' ' -f 1)"
echo "Total Sector Count Is: $count"
echo "Zero Fill Sector Count is: $zfcount"
Doing this in bash is going to be extremely slow -- on the order of 20 minutes for a 1GB file.
Use another language, like Python, which can do this in a few seconds (if storage can keep up):
python -c '
import sys
total=0
zero=0
file = open(sys.argv[1], "r")
while True:
a=file.read(512)
if a:
total = total + 1
if all(x == "\x00" for x in a):
zero = zero + 1
else:
break
print "Total sectors: " + str(total)
print "Zeroed sectors: " + str(zero)
' yourfilehere
Your error message comes from this line:
done < "$(dd if=test.dd bs=512 2> /dev/null | md5sum | cut -d ' ' -f 1)"
What that does is reads your entire test.dd, calculates the md5sum of that data, and parses out just the hash value, then, by merit of being included inside $( ... ), it substitutes that hash value in place, so you end up with that line essentially acting like this:
done < e6e8c42ec6d41563fc28e50080b73025
(except, of course, you have a different hash). So, your shell attempts to read from a file named like the hash of your test.dd image, can't find the file, and complains.
Also, it appears that you are under the assumption that dd if=test.dd bs=512 ... will feed you 512-byte blocks one at a time to iterate over. This is not the case. dd will read the file in bs-sized blocks, and write it in the same sized blocks, but it does not insert a separator or synchronize in any way with whatever is on the other side of its pipe line.

Resources