Split number string arbitrarily using bash into fixed number of variables - bash

I have a string with 3000 elements (NOT in series) in bash,
sections='1 2 4 ... 3000'
I am trying to split this string into x chunks of length n.
I want x to be typically between 3-10. Each chunk may not be of
the same length.
Each chunk is the input to a job.
Looking at https://unix.stackexchange.com/questions/122499/bash-split-a-list-of-files
and using bash arrays, my first attempt looks like this:
#! /bin/bash
nArgs=10
nChunkSize=10
z="0 1 2 .. 1--"
zs=(${z// / })
echo ${zs[#]}
for i in $nArgs; do
echo "Creating argument: "$i
startItem=$i*$nChunkSize
zArg[$i] = ${zs[#]:($startItem:$chunkSize}
done
echo "Resulting args"
for i in $nArgs; do
echo "Argument"${zArgs[$1]}
done
The above is far from working I'm afraid. Any pointers on the ${zs[#]:($startItem:$chunkSize} syntax?
For an input of 13 elements:
z='0 1 2 3 4 5 6 7 8 10 11 12 15'
nChunks=3
and nArgs=4
I would like to obtain an array with 3 elements, zs with content
zs[0] = '0 1 2 3'
zs[1] = '4 5 6 7'
zs[2] = '8 10 11 12 15'
Each zs will be used as arguments to subsequent jobs.

First note: This is a bad idea. It won't work reliably with arbitrary (non-numeric) contents, as bash doesn't have support for nested arrays.
output=( )
sections_str='1 2 4 5 6 7 8 9 10 11 12 13 14 15 16 3000'
batch_size=4
read -r -a sections <<<"$sections_str"
for ((i=0; i<${#sections[#]}; i+=batch_size)); do
current_pieces=( "${sections[#]:i:batch_size}" )
output+=( "${current_pieces[*]}" )
done
declare -p output # to view your output
Notes:
zs=( $z ) is buggy. For example, any * inside your list will be replaced with a list of filenames in the current directory. Use read -a to read into an array in a reliable way that doesn't depend on shell configuration other than IFS (which can be controlled scoped to just that one line with IFS=' ' read -r -a).
${array[#]:start:count} expands to up to count items from your array, starting at position start.

Related

Bash how to add a sum of unknown numbers from a user [duplicate]

This question already has an answer here:
Command Line Arguments vs Input - What's the Difference?
(1 answer)
Closed 12 months ago.
I have for example a user with a set of numbers. How can I make bash add them together?
Example in one go the user enters
(The amount of numbers they enter is up to them and it is unknown)
bash file 3 1 5 2 2 4
How can I make bash return 17 directly from that example?
I tried
#!/usr/bin/env sh
sum=0
while read number && [ -n "$number" ]; do
sum=$((sum + ${number/#-}))
echo "$sum"
done
But this is not clean and it is returning
$ bash file
3
3
1
4
5
9
2
11
2
13
4
17
I instead want the user to only place their numbers in 1 go and not be there to put more and more numbers
Instead of having them excute the command like I have it like
bash file
1
3
4
etc
instead I want to do it in 1 go
bash file 1 3 5 6
How?
You can loop through all script arguments and calculate sum:
#!/usr/bin/env sh
sum=0
for i in "$#"; do
sum=$(( $sum + $i ))
done
echo $sum
Running with your example:
$ bash sum 3 1 5 2 2 4
17

Why is `for i in {1..10}` different from `for ((i=1; i<=10; i++))`?

I wrote a simple code via bash script,
Code 1
for((i=1;i<=10;i++))
do
echo $i
i=$((i+1))
echo $i
i=$((i+2))
done
output of Code 1
1
2
5
6
9
10
Code 2
for i in {1..10}
do
echo $i
i=$((i+1))
echo $i
i=$((i+2))
done
output of Code 2
1
2
2
3
3
4
4
5
5
6
6
7
7
8
8
9
9
10
10
11
I'm just wondering, why outputs are not same ?
Thnx in advance
With in, the variable iterates over the list. You can change its value in the loop, but when the next iteration starts, the next value will be assigned to it, regardless of what value you assigned to it. (And I can't imagine any other behaviour: should the shell try to guess how far in the list you want to jump? What if the value is repeated, or not present in the list at all?)
With the C-style for, the variable is initialized, and at each iteration, its value is changed and the condition checked. There's no list of values, only the condition to end the loop.
These are not shorthand for each other: They're completely different code, and expected to behave in different ways.
{1..10} is just shorthand for 1 2 3 4 5 6 7 8 9 10; when you run for i in 1 2 3 4 5 6 7 8 9 10, you're explicitly assigning those exact values to i in turn, overwriting whatever else was previously there.
By contrast, for ((i=1; i<=10; i++)) is providing three separate statements: An initializer (i=1), telling it what to do to start the loop; a check (i<=10) to tell it how to determine whether the loop is finished; and an update (i++), telling it what to do between loop iterations. These are completely arbitrary commands, and you can put any arithmetic expression you want in these positions.
The for((i=1;i<=10;i++)) loop could be rewritten as such:
i=1 # for loop's 'i=1'
while [[ "${i}" -le 10 ]] # for loop's 'i<=10'
do
echo $i
i=$((i+1))
echo $i
i=$((i+2))
((i++)) # for loop's 'i++'
done
This generates ... you guessed it ...
1
2
5
6
9
10
Charles Duffy has already expanded the for i in {1..10} loop to show how that behaves.

How to read multiple integers on same line in bash?

declare -A a
for((i=0;i<2;i++))
for((j=0;j<5;j++))
read a[$i,$j]
I want to take the inputs on same line , but this input
1 2 3 4 5
6 7 8 9 5
is not doing the work , I have to take all 10 integers on different line .
Can I read multiple variables on same line in Bash (if all are integers).
You can use -a to put multiple fields into an array:
#!/bin/bash
echo "Enter some numbers:"
read -ra myarray
echo "There were ${#myarray[#]} numbers and index 4 was ${myarray[4]}"
If you enter 4 8 15 16 23 42 the output is:
There were 6 numbers and index 4 was 23
The simple answer is we cannot do it, as there is no provision of 2d array in Bash.
Input :
1 2 3 4 5
6 7 8 9 5
The following code will take the desired input as string array(whole line as a single input) and convert the individual string to an array (which is a cubersome task as there are multiple integers in a string)
assuming the array dimension is 2x5 and then it prints the 2d array :
#!/bin/bash
declare -A b #Associative Array
for((i=0;i<2;i++))do
read a[$i]
done
for((i=0;i<2;i++))do
array=(${a[$i]}) # spliting the string into array
for((j=0;j<5;j++))do
b[$i,$j]=${array[$j]}
done
done
for((i=0;i<2;i++))do
for((j=0;j<5;j++))do
printf "${b[$i,$j]} "
done
echo
done
Hence we can conclude it is better to take input in multiple lines or else we have to follow these steps.

Replace the nth field of every mth line using awk or bash

For a file that contains entries similar to as follows:
foo 1 6 0
fam 5 11 3
wam 7 23 8
woo 2 8 4
kaz 6 4 9
faz 5 8 8
How would you replace the nth field of every mth line with the same element using bash or awk?
For example, if n = 1 and m = 3 and the element = wot, the output would be:
foo 1 6 0
fam 5 11 3
wot 7 23 8
woo 2 8 4
kaz 6 4 9
wot 5 8 8
I understand you can call / print every mth line using e.g.
awk 'NR%7==0' file
So far I have tried to keep this in memory but to no avail... I need to keep the rest of the file as well.
I would prefer answers using bash or awk, but sed solutions would also be helpful. I'm a beginner in all three. Please explain your solution.
awk -v m=3 -v n=1 -v el='wot' 'NR % m == 0 { $n = el } 1' file
Note, however, that the inter-field whitespace is not guaranteed to be preserved as-is, because awk splits a line into fields by any run of whitespace; as written, the output fields of modified lines will be separated by a single space.
If your input fields are consistently separated by 2 spaces, however, you can effectively preserve the input whitespace by adding -F' ' -v OFS=' ' to the awk invocation.
-v m=3 -v n=1 -v el='wot' defines Awk variables m, n, and el
NR % m == 0 is a pattern (condition) that evaluates to true for every m-th line.
{ $n = el } is the associated action that replaces the nth field of the input line with variable el, causing the line to be rebuilt, implicitly using OFS, the output-field separator, which defaults to a space.
1 is a common Awk shorthand for printing the (possibly modified) input line at hand.
Great little exercise. While I would probably lean toward an awk solution, in bash you can also rely on parameter expansion with substring replacement to replace the nth field of every mth line. Essentially, you can read every line, preserving whitespace, then check your line count, e.g. if c is your line counter and m your variable for mth line, you could use:
if (( $((c % m )) == 0)) ## test for mth line
If the line is a replacement line, you can read each word into an array after restoring default word-splitting and then use your array element index n-1 to provide the replacement (e.g. ${line/find/replace} with ${line/"${array[$((n-1))]}"/replace}).
If it isn't a replacement line, simply output the line unchanged. A short example could be similar to the following (to which you can add additional validations as required)
#!/bin/bash
[ -n "$1" -a -r "$1" ] || { ## filename given an readable
printf "error: insufficient or unreadable input.\n"
exit 1
}
n=${2:-1} ## variables with default n=1, m=3, e=wot
m=${3:-3}
e=${4:-wot}
c=1 ## line count
while IFS= read -r line; do
if (( $((c % m )) == 0)) ## test for mth line
then
IFS=$' \t\n'
a=( $line ) ## split into array
IFS=
echo "${line/"${a[$((n-1))]}"/$e}" ## nth replaced with e
else
echo "$line" ## otherwise just output line
fi
((c++)) ## advance counter
done <"$1"
Example Use/Output
n=1, m=3, e=wot
$ bash replmn.sh dat/repl.txt
foo 1 6 0
fam 5 11 3
wot 7 23 8
woo 2 8 4
kaz 6 4 9
wot 5 8 8
n=1, m=2, e=baz
$ bash replmn.sh dat/repl.txt 1 2 baz
foo 1 6 0
baz 5 11 3
wam 7 23 8
baz 2 8 4
kaz 6 4 9
baz 5 8 8
n=3, m=2, e=99
$ bash replmn.sh dat/repl.txt 3 2 99
foo 1 6 0
fam 5 99 3
wam 7 23 8
woo 2 99 4
kaz 6 4 9
faz 5 99 8
An awk solution is shorter (and avoids problems with duplicate occurrences of the replacement string in $line), but both would need similar validation of field existence, etc.. Learn from both and let me know if you have any questions.

How to nest loops correctly

I have 2 scripts, #1 and #2. Each work OK by themselves. I want to read a 15 row file, row by row, and process it. Script #2 selects rows. Row 0 is is indicated as firstline=0, lastline=1. Row 14 would be firstline=14, lastline=15. I see good results from echo. I want to do the same with script #1. Can't get my head around nesting correctly. Code below.
#!/bin/bash
# script 1
filename=slash
firstline=0
lastline=1
i=0
exec <${filename}
while read ; do
i=$(( $i + 1 ))
if [ "$i" -ge "${firstline}" ] ; then
if [ "$i" -gt "${lastline}" ] ; then
break
else
echo "${REPLY}" > slash1
fold -w 21 -s slash1 > news1
sleep 5
fi
fi
done
# script2
firstline=(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14)
lastline=(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
for ((i=0;i<${#firstline[#]};i++))
do
echo ${firstline[$i]} ${lastline[$i]};
done
Your question is very unclear, but perhaps you are simply looking for some simple function calls:
#!/bin/bash
script_1() {
filename=slash
firstline=$1
lastline=$2
i=0
exec <${filename}
while read ; do
i=$(( $i + 1 ))
if [ "$i" -ge "${firstline}" ] ; then
if [ "$i" -gt "${lastline}" ] ; then
break
else
echo "${REPLY}" > slash1
fold -w 21 -s slash1 > news1
sleep 5
fi
fi
done
}
# script2
firstline=(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14)
lastline=(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
for ((i=0;i<${#firstline[#]};i++))
do
script_1 ${firstline[$i]} ${lastline[$i]};
done
Note that reading the file this way is extremely inefficient, and there are undoubtedly better ways to handle this, but I am trying to minimize the changes from your code.
Update: Based on your later comments, the following idiomatic Bash code that uses sed to extract the line of interest in each iteration solves your problem much more simply:
Note:
- If the input file does not change between loop iterations, and the input file is small enough (as it is in the case at hand), it's more efficient to buffer the file contents in a variable up front, as is demonstrated in the original answer below.
- As tripleee points out in a comment: If simply reading the input lines sequentially is sufficient (as opposed to extracting lines by specific line numbers, then a single, simple while read -r line; do ... # fold and output, then sleep ... done < "$filename" is enough.
# Determine the input filename.
filename='slash'
# Count its number of lines.
lineCount=$(wc -l < "$filename")
# Loop over the line numbers of the file.
for (( lineNum = 1; lineNum <= lineCount; ++lineNum )); do
# Use `sed` to extract the line with the line number at hand,
# reformat it, and output to the target file.
fold -w 21 -s <(sed -n "$lineNum {p;q;}" "$filename") > 'news1'
sleep 5
done
A simplified version of what I think you're trying to achieve:
#!/bin/bash
# Split fields by newlines on input,
# and separate array items by newlines on output.
IFS=$'\n'
# Read all input lines up front, into array ${lines[#]}
# In terms of your code, you'd use
# read -d '' -ra lines < "$filename"
read -d '' -ra lines <<<$'line 1\nline 2\nline 3\nline 4\nline 5\nline 6\nline 7\nline 8\nline 9\nline 10\nline 11\nline 12\nline 13\nline 14\nline 15'
# Define the arrays specifying the line ranges to select.
firstline=(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14)
lastline=(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
# Loop over the ranges and select a range of lines in each iteration.
for ((i=0; i<${#firstline[#]}; i++)); do
extractedLines="${lines[*]: ${firstline[i]}: 1 + ${lastline[i]} - ${firstline[i]}}"
# Process the extracted lines.
# In terms of your code, the `> slash1` and `fold ...` commands would go here.
echo "$extractedLines"
echo '------'
done
Note:
The name of the array variable filled with read -ra is lines; ${lines[#]} is Bash syntax for returning all array elements as separate words (${lines[*]} also refers to all elements, but with slightly different semantics), and this syntax is used in the comments to illustrate that lines is indeed an array variable (note that if you were to use simply $lines to reference the variable, you'd implicitly get only the item with index 0, which is the same as: ${lines[0]}.
<<<$'line 1\n...' uses a here-string (<<<) to read an ad-hoc sample document (expressed as an ANSI C-quoted string ($'...')) in the interest of making my example code self-contained.
As stated in the comment, you'd read from $filename instead:
read -d '' -ra lines <"$filename"
extractedLines="${lines[*]: ${firstline[i]}: 1 + ${lastline[i]} - ${firstline[i]}}" extracts the lines of interest; ${firstline[i]} references the current element (index i) from array ${firstline[#]}; since the last token in Bash's array-slicing syntax
(${lines[*]: <startIndex>: <elementCount>}) is the count of elements to return, we must perform a calculation to determine the count, which is what 1 + ${lastline[i]} - ${firstline[i]} does.
By virtue of using "${lines[*]...}" rather than "${lines[#]...}", the extracted array elements are joined by the first character in $IFS, which in our case is a newline ($'\n') (when extracting a single line, that doesn't really matter).

Resources