Iterate through URLs - bash

If I wanted to use ffmpeg to download a bunch of .ts files from a website, and the url format is
http://example.com/video-1080Pxxxxx.ts
Where the xxxxx is a number from 00000 to 99999 (required zero padding), how would I iterate through that in bash so that it tries every integer starting at 00000, 00001, 00002, etc.?

Loop over the integer values from 0 to 99999, and use printf to pad to 5 digits.
for x in {0..99999}; do
zx=$(printf '%05d' $x) # zero-pad to 5 digits
url="http://example.com/video-1080P${zx}.ts"
... # Do something with url
done
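As a minimal sketch of the same idea, the padding can be wrapped in a small function. The name segment_url is hypothetical, and the base URL is just the placeholder from the question. printf -v stores the padded value in a variable without spawning a subshell.

```shell
#!/usr/bin/env bash
# segment_url N: print the URL for segment number N (hypothetical helper).
segment_url() {
    local padded
    # 10# forces base 10, so an already padded input such as 00009
    # is not rejected as an invalid octal number
    printf -v padded '%05d' "$((10#$1))"
    printf 'http://example.com/video-1080P%s.ts\n' "$padded"
}

# Print a few sample URLs; a real run would loop from 0 to 99999
for n in 0 1 2 99999; do
    segment_url "$n"
done
```

Each printed URL could then be handed to whatever download tool you are using.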

In pure bash:
$ n=99999 ; for ((i=0; i<=n; i++)) { s=$(printf "%05d" $i); echo $s ; }
or with a utility:
$ seq -w 0 99999
$ seq --help
Usage: seq [OPTION]... LAST
or: seq [OPTION]... FIRST LAST
or: seq [OPTION]... FIRST INCREMENT LAST
Print numbers from FIRST to LAST, in steps of INCREMENT.
Mandatory arguments to long options are mandatory for short options too.
-f, --format=FORMAT use printf style floating-point FORMAT
-s, --separator=STRING use STRING to separate numbers (default: \n)
-w, --equal-width equalize width by padding with leading zeroes

Why not do something like this with a for loop:
for i in 0000{0..9} 000{10..99} 00{100..999} 0{1000..9999} {10000..99999}
do
# Curl was used since some minimal installs of linux do not have wget
curl -O http://example.com/video-1080P"$i".ts
sleep 1
done
(I am sure that there is a much better way to do this but it is not presenting itself to me at the moment)

My Bash (4.3) can do this:
$ echo {001..010}
001 002 003 004 005 006 007 008 009 010
So you could just do
for i in {00000..99999}; do
url="http://example.com/video-1080P${i}.ts"
# Use url
done

Related

divide floating point numbers from two different outputs

I am writing a bash script that has 1) number of lines in a file matching a pattern and 2) total lines in a file.
a) To get the number of lines in a file within a directory that had a specific pattern I used grep -c "pattern" f*
b) For overall line count in each file within the directory I used
wc -l f*
I am trying to divide the output from 2 by 1. I have tried a for loop
for i in $a
do
printf "%f\n" $(($b/$a)
echo i
done
but that returns an error syntax error in expression (error token is "first file in directory")
I also have tried
bc "$b/$a"
which does not work either
I am not sure if this is possible to do -- any advice appreciated. thanks!
Sample: grep -c *f generates a list like this
myfile1 500
myfile2 0
myfile3 14
myfile4 18
and wc -l *f generates a list like this:
myfile1 500
myfile2 500
myfile3 500
myfile4 238
I want my output to be the outcome of output for grep/wc divided so for example
myfile1 1
myfile2 0
myfile3 0.28
myfile4 0.07
bash only supports integer math so the following will print the (silently) truncated integer value:
$ a=3 b=5
$ printf "%f\n" $(($b/$a))
1.000000
bc is one solution and with a tweak of OP's current code:
$ bc <<< "scale=2;$b/$a"
1.66
# or
$ echo "scale=4;$b/$a" | bc
1.6666
If you happen to start with real/float numbers the printf approach will error (more specifically, the $(($b/$a)) will generate an error):
$ a=3.55 b=8.456
$ printf "%f\n" $(($b/$a))
-bash: 8.456/3.55: syntax error: invalid arithmetic operator (error token is ".456/3.55")
bc to the rescue:
$ bc <<< "scale=2;$b/$a"
2.38
# or
$ echo "scale=4;$b/$a" | bc
2.3819
NOTE: in OP's parent code there should be a test for $a=0 and, if true, a decision on how to proceed (eg, set answer to 0; skip the calculation; print a warning message); otherwise this code will generate a divide-by-zero error
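A rough sketch of the per-file calculation with a zero guard might look like the following. The function names ratio and report_ratios are made up for illustration, printing "n/a" for an empty file is only one possible choice, and the division is done with awk here rather than bc.

```shell
#!/usr/bin/env bash
# ratio MATCHES TOTAL: print MATCHES/TOTAL with two decimals,
# or "n/a" when TOTAL is zero (assumed policy for the zero case).
ratio() {
    local matches=$1 total=$2
    if [ "$total" -eq 0 ]; then
        echo "n/a"
        return
    fi
    awk -v m="$matches" -v t="$total" 'BEGIN { printf "%.2f\n", m / t }'
}

# report_ratios PATTERN FILE...: print "file ratio" for each regular file.
report_ratios() {
    local pattern=$1 f matches total
    shift
    for f in "$@"; do
        [ -f "$f" ] || continue
        matches=$(grep -c -- "$pattern" < "$f")   # matching lines
        total=$(wc -l < "$f")                     # all lines
        printf '%s %s\n' "$f" "$(ratio "$matches" "$total")"
    done
}
```

It would be called as report_ratios "pattern" f* from the directory holding the files.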
bash doesn't have builtin floating-point arithmetic, but it can be simulated to some extent. For instance, in order to truncate the value of the fraction a/b to two decimal places (without rounding):
q=$((100*a/b)) # hoping multiplication won't overflow
echo ${q:0:-2}.${q: -2}
The number of decimal places can be made parametric:
n=4
q=$((10**n*a/b))
echo ${q:0:-n}.${q: -n}
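One caveat with the substring form: when the scaled quotient has fewer than n digits (for example 1/50 with n=2 gives q=2), the negative-length expansion fails, and for quotients below 1 the integer part comes out empty. A possible variant, sketched here under the assumption of non-negative integer inputs and n of at least 1, splits the scaled value arithmetically instead. The name fixed_div is hypothetical.

```shell
#!/usr/bin/env bash
# fixed_div A B N: print A/B truncated (not rounded) to N decimal places.
# Assumes non-negative integers, B != 0, N >= 1.
fixed_div() {
    local a=$1 b=$2 n=$3 q int frac
    q=$(( 10**n * a / b ))     # scaled integer quotient; can overflow for large inputs
    int=$(( q / 10**n ))       # integer part
    frac=$(( q % 10**n ))      # fractional digits as an integer
    # %0*d pads the fraction with leading zeros to exactly N digits
    printf '%d.%0*d\n' "$int" "$n" "$frac"
}

fixed_div 5 3 2     # prints 1.66
fixed_div 1 50 4    # prints 0.0200
```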
This awk does it all in one pass, printing the ratio of matching lines to total lines across all of the f* files combined:
awk '/pattern/{a+=1}END{print a/NR}' f*
jot 93765431 |
mawk -v __='[13579]6$' 'BEGIN {
_^=__=_*=FS=__ }{ __+=_<NF } END { if (___=NR) {
printf(" %\47*.f / %\47-*.f ( %.*f %% )\n",
_+=++_*_*_++,__,_,___,_--,_*__/___*_) } }'
4,688,271 / 93,765,431 ( 4.99999941343 % )
filtering pattern = [13579]6$

read file line by line and sum each line individually

I'm trying to make a script that creates a file, say file01.txt, and writes a number on each line.
001
002
...
998
999
then I want to read the file line by line and sum each line and say whether the number is even or odd.
sum each line like 0+0+1 = 1 which is odd
9+9+8 = 26 so even
001 odd
002 even
..
998 even
999 odd
I tried
while IFS=read -r line; do sum+=line >> file02.txt; done <file01.txt
but that sums the whole file not each line.
You can do this fairly easily in bash itself making use of built-in parameter expansions to trim leading zeros from the beginning of each line in order to sum the digits for odd / even.
To read from either a named file or stdin, you can use parameter expansion with a default value: the first argument (positional parameter) is used as the filename if given, and if not, input is read from stdin, e.g.
#!/bin/bash
infile="${1:-/dev/stdin}" ## read from file provided as $1 or stdin
You then use infile with your while loop, e.g.
while read -r line; do ## loop reading each line
...
done < "$infile"
To trim the leading zeros, first obtain the substring of leading zeros trimming all digits from the right until only zeros remain, e.g.
leading="${line%%[1-9]*}" ## get leading 0's
Now use the same kind of parameter expansion, with # instead of %%, to trim the leading-zeros substring from the front of line, saving the resulting number in value, e.g.
value="${line#$leading}" ## trim from front
Now zero your sum and loop over the digits in value to obtain the sum of digits:
for ((i=0;i<${#value};i++)); do ## loop summing digits
sum=$((sum + ${value:$i:1}))
done
All that remains is your even / odd test. Putting it all together in a short example script that intentionally outputs the sum of digits in addition to your wanted "odd" / "even" output, you could do:
#!/bin/bash
infile="${1:-/dev/stdin}" ## read from file provided as $1 or stdin
while read -r line; do ## read each line
[ "$line" -eq "$line" 2>/dev/null ] || continue ## validate integer
leading="${line%%[1-9]*}" ## get leading 0's
value="${line#$leading}" ## trim from front
sum=0 ## zero sum
for ((i=0;i<${#value};i++)); do ## loop summing digits
sum=$((sum + ${value:$i:1}))
done
printf "%s (sum=%d) - " "$line" "$sum" ## output line w/sum
## (temporary output)
if ((sum % 2 == 0)); then ## check odd / even
echo "even"
else
echo "odd"
fi
done < "$infile"
(note: you can actually loop over the digits in line and skip removing the leading zeros substring. The removal ensures that if the whole value is used in arithmetic it isn't interpreted as an octal value -- up to you)
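To illustrate the octal concern in that note: inside bash arithmetic a leading zero marks an octal literal, and a 10# prefix forces base 10. The helper name to_decimal below is made up for this sketch.

```shell
#!/usr/bin/env bash
# to_decimal STRING: interpret a zero-padded digit string as base 10.
to_decimal() {
    echo $(( 10#$1 ))
}

echo $(( 010 ))   # leading zero means octal, so this prints 8
to_decimal 010    # prints 10
to_decimal 009    # prints 9; a plain $(( 009 )) would be an error
```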
Example Use/Output
Using a quick process substitution to provide input of 001 - 020 on stdin you could do:
$ ./sumdigitsoddeven.sh < <(printf "%03d\n" {1..20})
001 (sum=1) - odd
002 (sum=2) - even
003 (sum=3) - odd
004 (sum=4) - even
005 (sum=5) - odd
006 (sum=6) - even
007 (sum=7) - odd
008 (sum=8) - even
009 (sum=9) - odd
010 (sum=1) - odd
011 (sum=2) - even
012 (sum=3) - odd
013 (sum=4) - even
014 (sum=5) - odd
015 (sum=6) - even
016 (sum=7) - odd
017 (sum=8) - even
018 (sum=9) - odd
019 (sum=10) - even
020 (sum=2) - even
You can simply remove the output of "(sum=X)" when you have confirmed it operates as you expect and redirect the output to your new file. Let me know if I understood your question properly and if you have further questions.
Would you please try the bash version:
parity=("even" "odd")
while IFS= read -r line; do
mapfile -t ary < <(fold -w1 <<< "$line")
sum=0
for i in "${ary[@]}"; do
(( sum += i ))
done
echo "$line" "${parity[sum % 2]}"
done < file01.txt > file02.txt
fold -w1 <<< "$line" splits the string $line into one character (one digit) per line.
mapfile reads the lines produced by the fold command into the array ary.
Please note that this bash script is slow and not suitable for large inputs.
With GNU awk:
awk -vFS='' '{sum=0; for(i=1;i<=NF;i++) sum+=$i;
print $0, sum%2 ? "odd" : "even"}' file01.txt
The FS awk variable defines the field separator. If it is set to the empty string (this is what the -vFS='' option does) then each character is a separate field.
The rest is trivial: the block between curly braces is executed for each line of the input. It computes the sum of the fields with a for loop (NF is another awk variable; its value is the number of fields in the current record). It then prints the original line ($0) followed by the string even if the sum is even, else odd.
pure awk:
BEGIN {
for (i=1; i<=999; i++) {
printf ("%03d\n", i) > ARGV[1]
}
close(ARGV[1])
ARGC = 2
FS = ""
result[0] = "even"
result[1] = "odd"
}
{
printf("%s: %s\n", $0, result[($1+$2+$3) % 2])
}
Processing a file line by line, and doing math, is a perfect task for awk.
pure bash:
set -e
printf '%03d\n' {1..999} > "${1:?no path provided}"
result=(even odd)
mapfile -t num_list < "$1"
for i in "${num_list[@]}"; do
echo $i: ${result[(${i:0:1} + ${i:1:1} + ${i:2:1}) % 2]}
done
A similar method can be applied in bash, but it's slower.
comparison:
bash is about 10x slower.
$ cd ./tmp.Kb5ug7tQTi
$ bash -c 'time awk -f ../solution.awk numlist-awk > result-awk'
real 0m0.108s
user 0m0.102s
sys 0m0.000s
$ bash -c 'time bash ../solution.bash numlist-bash > result-bash'
real 0m0.931s
user 0m0.929s
sys 0m0.000s
$ diff --report-identical result*
Files result-awk and result-bash are identical
$ diff --report-identical numlist*
Files numlist-awk and numlist-bash are identical
$ head -n 5 *
==> numlist-awk <==
001
002
003
004
005
==> numlist-bash <==
001
002
003
004
005
==> result-awk <==
001: odd
002: even
003: odd
004: even
005: odd
==> result-bash <==
001: odd
002: even
003: odd
004: even
005: odd
read is a bottleneck in a while IFS= read -r line loop. More info in this answer.
mapfile (combined with for loop) can be slightly faster, but still slow (it also copies all the data to an array first).
Both solutions create a number list in a new file (which was in the question), and print the odd/even results to stdout. The path for the file is given as a single argument.
In awk, you can set the field separator to empty (FS="") to process individual characters.
In bash it can be done with substring expansion (${var:index:length}).
Modulo 2 (number % 2) to get odd or even.
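The bash version above hard-codes three digits. As a sketch of how substring expansion and modulo 2 could be combined for lines of any length, assuming the input contains only decimal digits (digit_parity is a hypothetical name):

```shell
#!/usr/bin/env bash
# digit_parity STRING: print "even" or "odd" for the sum of the digits in STRING.
digit_parity() {
    local s=$1 sum=0 i
    local -a result=(even odd)
    for (( i = 0; i < ${#s}; i++ )); do
        # one character at a time, so leading zeros never form an octal literal
        sum=$(( sum + ${s:i:1} ))
    done
    echo "${result[sum % 2]}"
}

for n in 001 998 999 12345; do
    printf '%s: %s\n' "$n" "$(digit_parity "$n")"
done
```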

Iterate Through List with Seq and Variable

I am attempting to loop through a list of integers starting out like so:
start=000
for i in $(seq -w $start 48 006);
However, when I try this code above, the loop seems to loop once and then quit.
What do I need to modify? (The leading zeroes need to stay)
Could you please try following.
start=0
diff=6
for i in $(seq $start $diff 48);
do
printf '%03d\n' $i
done
Output will be as follows.
000
006
012
018
024
030
036
042
048
Problem in OP's code:
I believe the arguments to seq are in the wrong order: it should be the start point, then the increment, then the end point, i.e. seq start_point increment end_point.
Your attempt starts at 0 and increments by 48 with an end point of 6, so only the very first value (0) falls within the range and the loop runs just once.
EDIT: As per @Cyrus sir's comment adding BASH builtin solution here without using seq.
for ((i=0; i<=48; i=i+6)); do printf '%03d\n' $i; done
seq's input takes a start, increment-by, and finish.
You've swapped the increment-by and the finish: seq -w $start 48 006 means start at zero and increment by 48 to finish at 6. The simple fix is seq -w $start 6 48. Note: 006 is not needed, just 6, since seq -w pads all of the numbers to a common width.
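If seq is not available, the same FIRST INCREMENT LAST ordering can be mirrored in a small bash function. This is only a sketch: padded_range is a hypothetical name, the fourth argument (the pad width) is an addition for illustration, and the numeric arguments are assumed to be plain decimal integers without leading zeros.

```shell
#!/usr/bin/env bash
# padded_range FIRST INCREMENT LAST WIDTH: print zero-padded numbers, one per line.
padded_range() {
    local first=$1 step=$2 last=$3 width=$4 i
    for (( i = first; i <= last; i += step )); do
        printf '%0*d\n' "$width" "$i"   # * takes the pad width from an argument
    done
}

padded_range 0 6 48 3
```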

How to find if a series of non-float numbers are missing in a string

In Bash,
I want to find a range of non-float numbers in a string.
If I have a string like so:
"1.4.jpg 2.005.jpg 003: Blah.jpg Blah4.jpg 4.5.jpg"
And I want to find if numbers firstNum-lastNum are missing. Say, if
firstNum=1
lastNum=5
the function would return
"1 is missing, 2 is missing, 5 is missing"
It's relatively easy to find non-float numbers in a string, but what confuses my script is the "2.005.jpg" part of the string. My script doesn't understand how to recognize that 5 is part of float 2, and therefore should ignore it.
I would just say that if the number has leading zeros or has "[0-9]." in front of it, ignore it. But unfortunately, I need support for numbers with any number of leading zeros.
If you're not against using awk, you can use this script:
echo "1.4.jpg 2.005.jpg 003: Blah.jpg Blah4.jpg" | \
awk -v min=1 -v max=5 -v RS="[^0-9. ]+" '
($0+0)!~/\./&&/[0-9]+/{a[$0+0]}
END{for(i=min;i<=max;i++)if(!(i in a))print i " is missing"}'
This is a GNU awk script that relies on the record separator RS to split the line with only (float) numbers.
The trick is to add 0 to the found number and check that it is still in decimal form (without any dot .). If so, the number is stored in the array a.
The END block loops over all integers from min (1) to max (5) and prints a message for each number that is not in the array a.
The posix compliant alternative script is the following:
echo "1.4.jpg 2.005.jpg 003: Blah.jpg Blah4.jpg" | \
awk -v min=1 -v max=5 '
{
split($0,n,"[^0-9. ]+");
for(i in n){
if((n[i]+0)!~/\./&&n[i]~/[0-9]+/){
a[n[i]+0]
}
}
}
END{for(i=min;i<=max;i++)if(!(i in a))print i " is missing"}'
The main difference is the use of the function split(), which replaces RS. split breaks the input string apart and puts the numbers into the array n. The array elements are then checked and stored in the array a if they are decimal (non-float) numbers.
Take a look at this extglob pattern:
find_missing() {
shopt -s extglob
for(( i = $2; i <= $3; i++ )); do
[[ $1 = !(*[0-9]|*[0-9].)*(0)"$i"!(.[0-9]*|[0-9]*) ]] || printf '<%s> missing!\n' "$i"
done
}
Consider $i to be 4:
"$i": match the number
"$i"!(.[0-9]*|[0-9]*): match the number if it's not followed by either .<number>, which would make it a float number (4.1 for example), or simply followed by another number which would make it a different number (it would falsely consider 41 to be 4 for example)
*(0)"$i"!(.[0-9]*|[0-9]*): allow leading 0s
!(*[0-9]|*[0-9].)*(0)"$i"!(.[0-9]*|[0-9]*): match the number if it's not prefixed by <number>., which would make it a float number (1.4 for example), or prefixed by another number which would make it a different number (it would falsely consider 24 to be 4 for example)
shopt -s extglob: enable extended globbing
Test run:
$ find_missing "1.4.jpg 2.005.jpg 003: Blah.jpg Blah4.jpg" 1 5
<1> missing!
<2> missing!
<5> missing!
$ find_missing "1.4.jpg 2.005.jpg 003: Blah.jpg Blah4.jpg" 1 2
<1> missing!
<2> missing!
$ find_missing "001 3.002 A.4A" 1 4
<2> missing!
<3> missing!
Possible Answer:
Here's a bash function that gives the expected output value on the provided test case in a (hopefully) reasonable way:
function check_missing {
prefix=""
for i in {1..5}; do
# make sure that $i is present,
# with optional leading zeroes,
# but with at least one non-number
# *before* the zeroes and *after* $i
if ! [[ "$1" =~ .*[^0-9\.]0*"$i"\.?[^0-9\.].* ]]; then
echo -n "${prefix}${i} is missing"
prefix=", "
fi
done
echo
}
I'm not sure how well this will generalize to the other inputs you have (or how important the output formatting is), but hopefully it at least gives an idea for how to solve the problem.
Sample output:
> check_missing "001.004.jpg 2.005.jpg 003.jpg Blah4.jpg"
1 is missing, 2 is missing, 5 is missing
> check_missing "1.4.jpg 2.005.jpg 003: Blah.jpg Blah4.jpg"
1 is missing, 2 is missing, 5 is missing
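Another possible angle, sketched here as an alternative to the pattern-matching answers above rather than a replacement for them: pull out each maximal run of digits (with dots allowed only between digits), treat any run containing a dot as a float, and record the rest as integers. The function name find_missing_ints is hypothetical, and what counts as a float is an assumption based on the examples in the question.

```shell
#!/usr/bin/env bash
# find_missing_ints STRING FIRST LAST: report integers in FIRST..LAST that do
# not appear in STRING as standalone (non-float) numbers.
find_missing_ints() {
    local s=$1 first=$2 last=$3 run i
    local -a seen=()
    local re='([0-9]+(\.[0-9]+)*)'      # digits, optionally joined by dots
    while [[ $s =~ $re ]]; do
        run=${BASH_REMATCH[1]}
        # a run without a dot is an integer; 10# strips leading zeros safely
        [[ $run == *.* ]] || seen[$((10#$run))]=1
        s=${s#*"$run"}                  # continue after this run
    done
    for (( i = first; i <= last; i++ )); do
        [[ -n ${seen[i]:-} ]] || printf '%s is missing\n' "$i"
    done
}

find_missing_ints "1.4.jpg 2.005.jpg 003: Blah.jpg Blah4.jpg 4.5.jpg" 1 5
```

With the string from the question this reports 1, 2 and 5 as missing, one per line.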

Find all numbers between two numbers in Bash

I have two variables like:
a=200
b=205
and want to find out all numbers between these two numbers (including these specified numbers).
Check the seq instruction:
seq $a $b
seq is a good tool (as ChronoTrigger already stated), but it is not a bash builtin. Unfortunately the {1..4} notation does not work with variables, because brace expansion is performed before variable expansion. But there is a workaround:
a=200; b=205; eval "t=({$a..$b})"; echo ${t[*]}
Output:
200 201 202 203 204 205
The resulting array can be used in a for loop later: for i in ${t[*]};{ ...;}. But it is better to use a for((...)) loop for that, as 1_CR stated.
ADDED
If a string needs to be added as a prefix or suffix to all elements, that is pretty easy to do:
echo ${t[*]/#/ab}
echo ${t[*]/%/cd}
Output:
ab200 ab201 ab202 ab203 ab204 ab205
200cd 201cd 202cd 203cd 204cd 205cd
ADDED #2
If the array elements need a fixed number of digits, this can be used:
a=0; b=5; eval "t=({$a..$b})"; printf -v tt "%03d " ${t[*]}; t=($tt)
echo Array length: ${#t[*]}
echo ${t[*]}
Output:
Array length: 6
000 001 002 003 004 005
You could use the bash C-style for loop. Note that a $ is not needed before the a and b; this is characteristic of bash arithmetic expressions.
for ((i=a; i<=b; ++i))
do
echo $i
done
Alternately, to capture the numbers in an array
arr=()
for ((i=a; i<=b; ++i))
do
arr+=($i)
done
echo "${arr[*]}"
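The zero-padded array from the earlier answer can also be built with this loop, which avoids eval. A small sketch, assuming a pad width of 3 as in that example (the variable name padded is arbitrary):

```shell
#!/usr/bin/env bash
a=0
b=5
arr=()
for (( i = a; i <= b; ++i )); do
    printf -v padded '%03d' "$i"   # store the padded number without a subshell
    arr+=("$padded")
done
echo "Array length: ${#arr[@]}"
echo "${arr[*]}"
```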

Resources