Loop through variable 2 elements at a time in bash - bash

I have a bash variable which is a white-space separated list of strings and would like a loop that iterates 2 elements at a time from that list. I've made sure the length of the list is divisible by 2. So I want something like:
x="a bb cccc d"
while read first second; do
echo "($first,$second)"
done <<< $x
output should be:
(a,bb)
(cccc,d)
currently the above yields:
(a,bb cccc d)
Note: I need the assignment of $first and $second in my loop. echo was put in as a placeholder.
I'm looking for an efficient answer (preferably without a counter).

Depending on what you want to do with the values, you could do something like this:
$ x="one two buckle myShoe"
$ xargs -n 2 printf "(%s,%s)\n" <<<$x
(one,two)
(buckle,myShoe)
I assume that you really want to assign the values to shell variables, though, and that the body of your anticipated loop is more complicated than a simple formatted print. In fact, since printf automatically repeats the format until its arguments are exhausted, xargs in the above example was unnecessary; it could have just been:
printf "(%s,%s)\n" $x
Unfortunately, xargs is not a built-in, so you can't xargs an arbitrary bash pipeline. Nor can you just define a shell function and give xargs its name. You can, however, xargs a bash subshell, which provides a very general solution:
# Shell function which takes two arguments
doit() {
echo "Number of arguments: $#"
echo "First argument: $1"
echo "Second argument: $2"
}
# Make the function visible to subshells
export -f doit
x="one two buckle myShoe"
xargs -n2 bash -c 'doit "$@"' _ <<<$x
Finally, as per a discussion in comments, you can use either printf or xargs (but probably printf is more efficient, since it is a builtin) to reorganize the list into pairs, and then feed that into a while read loop:
printf "%s %s\n" $x |
while read -r first second; do
echo "($first,$second)"
done
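One caveat with the pipeline form above: each part of a pipeline normally runs in a subshell, so variables assigned inside the loop are gone once it finishes. Here is a minimal sketch that sidesteps this with process substitution (bash-specific; the array name pairs is just something made up for illustration):

```bash
x="a bb cccc d"
pairs=()   # hypothetical array, only here to show that assignments survive

# Process substitution keeps the while loop in the current shell,
# so anything assigned inside it is still visible afterwards.
while read -r first second; do
    pairs+=("($first,$second)")
done < <(printf '%s %s\n' $x)   # $x is unquoted on purpose so it word-splits

printf '%s\n' "${pairs[@]}"
```

Leaving $x unquoted is deliberate, since word splitting is what turns the string into separate printf arguments.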

Using shift and $1, $2, etc.
You can use set -- $x to assign each part of x to the bash variables $1, $2, $3, etc.
You can use shift <n> to shift the numbered variables down by n. (I.e. shift 2 moves the value of $3 to $1, and $4 to $2, etc.)
Example
x="a bb cccc d"
set -- $x
while [ ! -z "$1" ] # while $1 is not empty
do
# do whatever you want here with $1 and $2. you can take more
# than two at a time by calling shift with a higher argument
# (and use $1, $2, and $3).
echo "($1,$2)"
shift 2
done
This prints:
(a,bb)
(cccc,d)
If you need the command line arguments
Save them to a temporary variable:
ARGS=( "$@" ) # save the command line args
set -- $x
...
set -- "${ARGS[@]}" # restore them back
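Another option, if saving and restoring feels clumsy, is to do the set -- inside a function, because a function has its own positional parameters. This is a rough sketch, assuming the list is passed as a single argument; pairwise is a made-up helper name:

```bash
# pairwise is a hypothetical helper name, not a builtin
pairwise() {
    # set -- here only replaces the function's own positional parameters,
    # so the script's command line arguments are left alone
    set -- $1    # unquoted on purpose: split the list on whitespace
    while [ "$#" -ge 2 ]; do
        printf '(%s,%s)\n' "$1" "$2"
        shift 2
    done
}

pairwise "a bb cccc d"
```

Testing "$#" instead of checking whether $1 is empty also means the loop simply stops if the list turns out to have an odd length, rather than getting stuck on a failing shift 2.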

You can tell read to read a specific number of characters. This relies on every element having the same width, as in x="a b c d":
while read -n4 first second; do echo "($first,$second)"; done <<< "$x"
(a,b)
(c,d)
EDIT: For generic solution use awk:
x="aa bb cccc de"
awk '{for (i=1; i<=NF; i+=2) printf "(%s,%s)\n", $i, $(i+1) }' <<< "$x"
(aa,bb)
(cccc,de)

I would declare your variable like this :
x="a b c d"
A=($x)
And then
$ for ((i=0; i<${#A[*]}; i=i+2)); do echo "("${A[$i]},${A[$i+1]}")"; done
(a,b)
(c,d)
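Since the question asked for $first and $second to be assigned, here is a sketch of the same index-stepping idea with those assignments spelled out. It assumes bash, and it fills the array with read -ra instead of A=($x) so that characters like * in the list are not expanded as filenames:

```bash
x="a bb cccc d"
read -ra A <<< "$x"     # split on whitespace without filename expansion

# Step through the array two indexes at a time
for ((i = 0; i < ${#A[@]}; i += 2)); do
    first=${A[i]}
    second=${A[i+1]}
    echo "($first,$second)"
done
```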

Related

How do I select each information in one line with delimiters [duplicate]

I have this string stored in a variable:
IN="bla#some.com;john#home.com"
Now I would like to split the strings by ; delimiter so that I have:
ADDR1="bla#some.com"
ADDR2="john#home.com"
I don't necessarily need the ADDR1 and ADDR2 variables. If they are elements of an array that's even better.
After suggestions from the answers below, I ended up with the following which is what I was after:
#!/usr/bin/env bash
IN="bla#some.com;john#home.com"
mails=$(echo $IN | tr ";" "\n")
for addr in $mails
do
echo "> [$addr]"
done
Output:
> [bla#some.com]
> [john#home.com]
There was a solution involving setting the internal field separator (IFS) to ;. I am not sure what happened with that answer. How do you reset IFS back to its default?
RE: IFS solution, I tried this and it works, I keep the old IFS and then restore it:
IN="bla#some.com;john#home.com"
OIFS=$IFS
IFS=';'
mails2=$IN
for x in $mails2
do
echo "> [$x]"
done
IFS=$OIFS
BTW, when I tried
mails2=($IN)
I only got the first string when printing it in loop, without brackets around $IN it works.
You can set the internal field separator (IFS) variable, and then let read parse the input into an array. When the assignment is prefixed to a command, it only takes effect in that single command's environment (here, read). read then splits the input according to the value of IFS into an array, which we can then iterate over.
This example will parse one line of items separated by ;, pushing it into an array:
IFS=';' read -ra ADDR <<< "$IN"
for i in "${ADDR[@]}"; do
# process "$i"
done
This other example is for processing the whole content of $IN, each time one line of input separated by ;:
while IFS=';' read -ra ADDR; do
for i in "${ADDR[@]}"; do
# process "$i"
done
done <<< "$IN"
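To convince yourself that the IFS prefix really is limited to the read command, something like the following sketch can be run in bash. The sample string and the names parts and old_ifs are invented for the demonstration:

```bash
IN="alpha;beta gamma;delta"   # made-up sample data with a space in one field
old_ifs=$IFS                  # remember the current IFS for comparison

# The assignment in front of read applies to that one command only
IFS=';' read -ra parts <<< "$IN"

[ "$IFS" = "$old_ifs" ] && echo "IFS unchanged"
printf '[%s]\n' "${parts[@]}"
```

The field containing a space stays in one piece because only ; is treated as a separator while read runs.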
Taken from Bash shell script split array:
IN="bla#some.com;john#home.com"
arrIN=(${IN//;/ })
echo ${arrIN[1]} # Output: john#home.com
Explanation:
This construction replaces all occurrences of ';' (the initial // means global replace) in the string IN with ' ' (a single space), then interprets the space-delimited string as an array (that's what the surrounding parentheses do).
The syntax used inside of the curly braces to replace each ';' character with a ' ' character is called Parameter Expansion.
There are some common gotchas:
If the original string has spaces, you will need to use IFS:
IFS=':'; arrIN=($IN); unset IFS;
If the original string has spaces and the delimiter is a new line, you can set IFS with:
IFS=$'\n'; arrIN=($IN); unset IFS;
I've seen a couple of answers referencing the cut command, but they've all been deleted. It's a little odd that nobody has elaborated on that, because I think it's one of the more useful commands for doing this type of thing, especially for parsing delimited log files.
In the case of splitting this specific example into a bash script array, tr is probably more efficient, but cut can be used, and is more effective if you want to pull specific fields from the middle.
Example:
$ echo "bla#some.com;john#home.com" | cut -d ";" -f 1
bla#some.com
$ echo "bla#some.com;john#home.com" | cut -d ";" -f 2
john#home.com
You can obviously put that into a loop, and iterate the -f parameter to pull each field independently.
This gets more useful when you have a delimited log file with rows like this:
2015-04-27|12345|some action|an attribute|meta data
cut is very handy to be able to cat this file and select a particular field for further processing.
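As a small illustration of pulling fields out of such a row, here is a sketch that works on a single line held in a variable (the variable names are invented; with a real log you would feed the file to cut instead):

```bash
log_line='2015-04-27|12345|some action|an attribute|meta data'

# -d sets the delimiter, -f picks the field number(s)
action=$(printf '%s\n' "$log_line" | cut -d '|' -f 3)
date_and_id=$(printf '%s\n' "$log_line" | cut -d '|' -f 1,2)

echo "$action"        # some action
echo "$date_and_id"   # 2015-04-27|12345
```

When several fields are selected, cut joins them with the same delimiter it split on.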
If you don't mind processing them immediately, I like to do this:
for i in $(echo $IN | tr ";" "\n")
do
# process
done
You could use this kind of loop to initialize an array, but there's probably an easier way to do it.
Compatible answer
There are a lot of different ways to do this in bash.
However, it's important to first note that bash has many special features (so-called bashisms) that won't work in any other shell.
In particular, arrays, associative arrays, and pattern substitution, which are used in the solutions in this post as well as others in the thread, are bashisms and may not work under other shells that many people use.
For instance: on my Debian GNU/Linux, there is a standard shell called dash; I know many people who like to use another shell called ksh; and there is also a special tool called busybox with its own shell interpreter (ash).
For a POSIX-shell-compatible answer, go to the last part of this answer!
Requested string
The string to be split in the above question is:
IN="bla#some.com;john#home.com"
I will use a modified version of this string to ensure that my solution is robust to strings containing whitespace, which could break other solutions:
IN="bla#some.com;john#home.com;Full Name <fulnam#other.org>"
Split string based on delimiter in bash (version >=4.2)
In pure bash, we can create an array with elements split by a temporary value for IFS (the internal field separator). The IFS, among other things, tells bash which character(s) it should treat as a delimiter between elements when defining an array:
IN="bla#some.com;john#home.com;Full Name <fulnam#other.org>"
# save original IFS value so we can restore it later
oIFS="$IFS"
IFS=";"
declare -a fields=($IN)
IFS="$oIFS"
unset oIFS
In newer versions of bash, prefixing a command with an IFS definition changes the IFS for that command only and resets it to the previous value immediately afterwards. This means we can do the above in just one line:
IFS=\; read -a fields <<<"$IN"
# after this command, the IFS resets back to its previous value (here, the default):
set | grep ^IFS=
# IFS=$' \t\n'
We can see that the string IN has been stored into an array named fields, split on the semicolons:
set | grep ^fields=\\\|^IN=
# fields=([0]="bla#some.com" [1]="john#home.com" [2]="Full Name <fulnam#other.org>")
# IN='bla#some.com;john#home.com;Full Name <fulnam#other.org>'
(We can also display the contents of these variables using declare -p:)
declare -p IN fields
# declare -- IN="bla#some.com;john#home.com;Full Name <fulnam#other.org>"
# declare -a fields=([0]="bla#some.com" [1]="john#home.com" [2]="Full Name <fulnam#other.org>")
Note that read is the quickest way to do the split because there are no forks or external resources called.
Once the array is defined, you can use a simple loop to process each field (or, rather, each element in the array you've now defined):
# `"${fields[@]}"` expands to return every element of `fields` array as a separate argument
for x in "${fields[@]}" ;do
echo "> [$x]"
done
# > [bla#some.com]
# > [john#home.com]
# > [Full Name <fulnam#other.org>]
Or you could drop each field from the array after processing using a shifting approach, which I like:
while [ "$fields" ] ;do
echo "> [$fields]"
# slice the array
fields=("${fields[@]:1}")
done
# > [bla#some.com]
# > [john#home.com]
# > [Full Name <fulnam#other.org>]
And if you just want a simple printout of the array, you don't even need to loop over it:
printf "> [%s]\n" "${fields[@]}"
# > [bla#some.com]
# > [john#home.com]
# > [Full Name <fulnam#other.org>]
Update: recent bash >= 4.4
In newer versions of bash, you can also play with the command mapfile:
mapfile -td \; fields < <(printf "%s\0" "$IN")
This syntax preserves special chars, newlines and empty fields!
If you don't want to include empty fields, you could do the following:
mapfile -td \; fields <<<"$IN"
fields=("${fields[@]%$'\n'}") # drop '\n' added by '<<<'
With mapfile, you can also skip declaring an array and implicitly "loop" over the delimited elements, calling a function on each:
myPubliMail() {
printf "Seq: %6d: Sending mail to '%s'..." $1 "$2"
# mail -s "This is not a spam..." "$2" </path/to/body
printf "\e[3D, done.\n"
}
mapfile < <(printf "%s\0" "$IN") -td \; -c 1 -C myPubliMail
(Note: the \0 at end of the format string is useless if you don't care about empty fields at end of the string or they're not present.)
mapfile < <(echo -n "$IN") -td \; -c 1 -C myPubliMail
# Seq: 0: Sending mail to 'bla#some.com', done.
# Seq: 1: Sending mail to 'john#home.com', done.
# Seq: 2: Sending mail to 'Full Name <fulnam#other.org>', done.
Or you could use <<<, and in the function body include some processing to drop the newline it adds:
myPubliMail() {
local seq=$1 dest="${2%$'\n'}"
printf "Seq: %6d: Sending mail to '%s'..." $seq "$dest"
# mail -s "This is not a spam..." "$dest" </path/to/body
printf "\e[3D, done.\n"
}
mapfile <<<"$IN" -td \; -c 1 -C myPubliMail
# Renders the same output:
# Seq: 0: Sending mail to 'bla#some.com', done.
# Seq: 1: Sending mail to 'john#home.com', done.
# Seq: 2: Sending mail to 'Full Name <fulnam#other.org>', done.
Split string based on delimiter in shell
If you can't use bash, or if you want to write something that can be used in many different shells, you often can't use bashisms -- and this includes the arrays we've been using in the solutions above.
However, we don't need to use arrays to loop over "elements" of a string. There is a syntax used in many shells for deleting substrings of a string from the first or last occurrence of a pattern. Note that * is a wildcard that stands for zero or more characters:
(The lack of this approach in any solution posted so far is the main reason I'm writing this answer ;)
${var#*SubStr} # drops substring from start of string up to first occurrence of `SubStr`
${var##*SubStr} # drops substring from start of string up to last occurrence of `SubStr`
${var%SubStr*} # drops substring from last occurrence of `SubStr` to end of string
${var%%SubStr*} # drops substring from first occurrence of `SubStr` to end of string
As explained by Score_Under:
# and % delete the shortest possible matching substring from the start and end of the string respectively, and
## and %% delete the longest possible matching substring.
Using the above syntax, we can create an approach where we extract substring "elements" from the string by deleting the substrings up to or after the delimiter.
The codeblock below works well in bash (including Mac OS's bash), dash, ksh, lksh, yash, zsh, and busybox's ash:
(Thanks to Adam Katz's comment, making this loop a lot simpler!)
IN="bla#some.com;john#home.com;Full Name <fulnam#other.org>"
while [ "$IN" != "$iter" ] ;do
# extract the substring from start of string up to delimiter.
iter=${IN%%;*}
# delete this first "element" AND next separator, from $IN.
IN="${IN#$iter;}"
# Print (or doing anything with) the first "element".
printf '> [%s]\n' "$iter"
done
# > [bla#some.com]
# > [john#home.com]
# > [Full Name <fulnam#other.org>]
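If you need this in more than one place, the same parameter-expansion idea can be wrapped in a function. The following is only a sketch under a few assumptions: split_posix is a made-up name, the variables rest, delim and field are global because plain POSIX sh has no local, and the delimiter is quoted inside the expansion so it is matched literally:

```sh
# split_posix STRING DELIMITER : print each field on its own line
split_posix() {
    rest=$1
    delim=$2
    while :; do
        # everything before the first delimiter (or the whole rest)
        field=${rest%%"$delim"*}
        printf '%s\n' "$field"
        # if nothing was cut off, there was no delimiter left
        [ "$field" = "$rest" ] && break
        # drop the field just printed together with its delimiter
        rest=${rest#*"$delim"}
    done
}

split_posix 'x;;y z' ';'
```

Unlike the loop above, this version leaves the original variable untouched and also prints empty fields as empty lines.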
Why not cut?
cut is useful for extracting columns from big files, but forking repeatedly (var=$(echo ... | cut ...)) quickly becomes overkill!
Here is a correct syntax using cut, tested under many POSIX shells, as suggested by this other answer from DougW:
IN="bla#some.com;john#home.com;Full Name <fulnam#other.org>"
i=1
while iter=$(echo "$IN"|cut -d\; -f$i) ; [ -n "$iter" ] ;do
printf '> [%s]\n' "$iter"
i=$((i+1))
done
I wrote this in order to compare execution time.
On my Raspberry Pi, this looks like:
$ export TIMEFORMAT=$'(%U + %S) / \e[1m%R\e[0m : %P '
$ time sh splitDemo.sh >/dev/null
(0.000 + 0.019) / 0.019 : 99.63
$ time sh splitDemo_cut.sh >/dev/null
(0.051 + 0.041) / 0.188 : 48.98
The overall execution time is roughly 10x longer when cut is forked once per field!
This worked for me:
string="1;2"
echo $string | cut -d';' -f1 # output is 1
echo $string | cut -d';' -f2 # output is 2
I think AWK is the best and most efficient command to solve your problem. AWK is included by default in almost every Linux distribution.
echo "bla#some.com;john#home.com" | awk -F';' '{print $1,$2}'
will give
bla#some.com john#home.com
Of course you can store each email address by redefining the awk print field.
How about this approach:
IN="bla#some.com;john#home.com"
set -- "$IN"
IFS=";"; declare -a Array=($*)
echo "${Array[@]}"
echo "${Array[0]}"
echo "${Array[1]}"
Source
echo "bla#some.com;john#home.com" | sed -e 's/;/\n/g'
bla#some.com
john#home.com
This also works:
IN="bla#some.com;john#home.com"
echo ADD1=`echo $IN | cut -d \; -f 1`
echo ADD2=`echo $IN | cut -d \; -f 2`
Be careful, this solution is not always correct. In case you pass "bla#some.com" only, it will assign it to both ADD1 and ADD2.
A different take on Darron's answer, this is how I do it:
IN="bla#some.com;john#home.com"
read ADDR1 ADDR2 <<<$(IFS=";"; echo $IN)
How about this one liner, if you're not using arrays:
IFS=';' read ADDR1 ADDR2 <<<$IN
In Bash, a bullet proof way, that will work even if your variable contains newlines:
IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")
Look:
$ in=$'one;two three;*;there is\na newline\nin this field'
$ IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")
$ declare -p array
declare -a array='([0]="one" [1]="two three" [2]="*" [3]="there is
a newline
in this field")'
The trick for this to work is to use the -d option of read (delimiter) with an empty delimiter, so that read is forced to read everything it's fed. And we feed read with exactly the content of the variable in, with no trailing newline thanks to printf. Note that we're also putting the delimiter in printf to ensure that the string passed to read has a trailing delimiter. Without it, read would trim potential trailing empty fields:
$ in='one;two;three;' # there's an empty field
$ IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")
$ declare -p array
declare -a array='([0]="one" [1]="two" [2]="three" [3]="")'
the trailing empty field is preserved.
Update for Bash≥4.4
Since Bash 4.4, the builtin mapfile (aka readarray) supports the -d option to specify a delimiter. Hence another canonical way is:
mapfile -d ';' -t array < <(printf '%s;' "$in")
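For what it's worth, here is a short sketch of that mapfile form on a string with an empty field in the middle. It assumes Bash 4.4 or later, and the sample string is invented:

```bash
in='one;two three;;four'   # made-up input with an empty third field

# -d ';' splits on semicolons, -t strips the delimiter from each element;
# the extra ';' appended by printf terminates the last field
mapfile -d ';' -t array < <(printf '%s;' "$in")

echo "${#array[@]} fields"
printf '[%s]\n' "${array[@]}"
```

The empty field shows up as an empty element rather than being silently dropped.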
Without setting the IFS
If you just have one colon, you can do this:
a="foo:bar"
b=${a%:*}
c=${a##*:}
you will get:
b = foo
c = bar
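With more than one colon the choice between the single and doubled operators starts to matter. A quick sketch with an invented three-part string shows what each form keeps:

```bash
a="foo:bar:baz"   # made-up value with two colons

echo "${a%:*}"    # foo:bar  (shortest match removed from the end)
echo "${a%%:*}"   # foo      (longest match removed from the end)
echo "${a#*:}"    # bar:baz  (shortest match removed from the start)
echo "${a##*:}"   # baz      (longest match removed from the start)
```

So %% and # give you the first field and the remainder, while % and ## give you everything but the last field and the last field.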
Here is a clean 3-liner:
in="foo#bar;bizz#buzz;fizz#buzz;buzz#woof"
IFS=';' list=($in)
for item in "${list[@]}"; do echo $item; done
where IFS delimits words based on the separator and () is used to create an array. Then [@] is used to return each item as a separate word.
If you've any code after that, you also need to restore $IFS, e.g. unset IFS.
The following Bash/zsh function splits its first argument on the delimiter given by the second argument:
split() {
local string="$1"
local delimiter="$2"
if [ -n "$string" ]; then
local part
while read -d "$delimiter" part; do
echo $part
done <<< "$string"
echo $part
fi
}
For instance, the command
$ split 'a;b;c' ';'
yields
a
b
c
This output may, for instance, be piped to other commands. Example:
$ split 'a;b;c' ';' | cat -n
1 a
2 b
3 c
Compared to the other solutions given, this one has the following advantages:
IFS is not overridden: Due to dynamic scoping of even local variables, overriding IFS over a loop causes the new value to leak into function calls performed from within the loop.
Arrays are not used: Reading a string into an array using read requires the flag -a in Bash and -A in zsh.
If desired, the function may be put into a script as follows:
#!/usr/bin/env bash
split() {
# ...
}
split "$@"
you can apply awk to many situations
echo "bla#some.com;john#home.com"|awk -F';' '{printf "%s\n%s\n", $1, $2}'
also you can use this
echo "bla#some.com;john#home.com"|awk -F';' '{print $1,$2}' OFS="\n"
There is a simple and smart way like this:
echo "add:sfff" | xargs -d: -i echo {}
But you must use GNU xargs; BSD xargs does not support -d delim. If you use a Mac like me, you can install GNU xargs:
brew install findutils
then
echo "add:sfff" | gxargs -d: -i echo {}
So many answers and so many complexities. Try out a simpler solution:
echo "string1, string2" | tr , "\n"
tr (read, translate) replaces the first argument with the second argument in the input.
So tr , "\n" replaces the comma with a newline character in the input, and it becomes:
string1
string2
There are some cool answers here (errator esp.), but for something analogous to split in other languages -- which is what I took the original question to mean -- I settled on this:
IN="bla#some.com;john#home.com"
declare -a a="(${IN//;/ })";
Now ${a[0]}, ${a[1]}, etc, are as you would expect. Use ${#a[*]} for number of terms. Or to iterate, of course:
for i in ${a[*]}; do echo $i; done
IMPORTANT NOTE:
This works in cases where there are no spaces to worry about, which solved my problem, but may not solve yours. Go with the $IFS solution(s) in that case.
If no space, Why not this?
IN="bla#some.com;john#home.com"
arr=(`echo $IN | tr ';' ' '`)
echo ${arr[0]}
echo ${arr[1]}
This is the simplest way to do it.
spo='one;two;three'
OIFS=$IFS
IFS=';'
spo_array=($spo)
IFS=$OIFS
echo ${spo_array[*]}
Apart from the fantastic answers that were already provided, if it is just a matter of printing out the data you may consider using awk:
awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "$IN"
This sets the field separator to ;, so that it can loop through the fields with a for loop and print accordingly.
Test
$ IN="bla#some.com;john#home.com"
$ awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "$IN"
> [bla#some.com]
> [john#home.com]
With another input:
$ awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "a;b;c d;e_;f"
> [a]
> [b]
> [c d]
> [e_]
> [f]
IN="bla#some.com;john#home.com"
IFS=';'
read -a IN_arr <<< "${IN}"
for entry in "${IN_arr[@]}"
do
echo $entry
done
Output
bla#some.com
john#home.com
System : Ubuntu 12.04.1
Use the set built-in to load up the $@ array:
IN="bla#some.com;john#home.com"
IFS=';'; set $IN; IFS=$' \t\n'
Then, let the party begin:
echo $#
for a; do echo $a; done
ADDR1=$1 ADDR2=$2
Two bourne-ish alternatives where neither require bash arrays:
Case 1: Keep it nice and simple: Use a NewLine as the Record-Separator... eg.
IN="bla#some.com
john#home.com"
while read i; do
# process "$i" ... eg.
echo "[email:$i]"
done <<< "$IN"
Note: in this first case no sub-process is forked to assist with list manipulation.
Idea: Maybe it is worth using NL extensively internally, and only converting to a different RS when generating the final result externally.
Case 2: Using a ";" as a record separator... eg.
NL="
" IRS=";" ORS=";"
conv_IRS() {
exec tr "$1" "$NL"
}
conv_ORS() {
exec tr "$NL" "$1"
}
IN="bla#some.com;john#home.com"
IN="$(conv_IRS ";" <<< "$IN")"
while read i; do
# process "$i" ... eg.
echo -n "[email:$i]$ORS"
done <<< "$IN"
In both cases, a sub-list composed within the loop persists after the loop has completed. This is useful when manipulating lists in memory instead of storing them in files. {p.s. keep calm and carry on B-) }
In Android shell, most of the proposed methods just do not work:
$ IFS=':' read -ra ADDR <<<"$PATH"
/system/bin/sh: can't create temporary file /sqlite_stmt_journals/mksh.EbNoR10629: No such file or directory
What does work is:
$ for i in ${PATH//:/ }; do echo $i; done
/sbin
/vendor/bin
/system/sbin
/system/bin
/system/xbin
where // means global replacement.
IN='bla#some.com;john#home.com;Charlie Brown <cbrown#acme.com;!"#$%&/()[]{}*? are no problem;simple is beautiful :-)'
set -f
oldifs="$IFS"
IFS=';'; arrayIN=($IN)
IFS="$oldifs"
for i in "${arrayIN[@]}"; do
echo "$i"
done
set +f
Output:
bla#some.com
john#home.com
Charlie Brown <cbrown#acme.com
!"#$%&/()[]{}*? are no problem
simple is beautiful :-)
Explanation: Simple assignment using parenthesis () converts semicolon separated list into an array provided you have correct IFS while doing that. Standard FOR loop handles individual items in that array as usual.
Notice that the list given for IN variable must be "hard" quoted, that is, with single ticks.
IFS must be saved and restored since Bash does not treat an assignment the same way as a command. An alternate workaround is to wrap the assignment inside a function and call that function with a modified IFS. In that case separate saving/restoring of IFS is not needed. Thanks for "Bize" for pointing that out.
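The function workaround mentioned above could look roughly like this. It is only a sketch: split_on_semicolon and ifs_before are made-up names, the sample string is invented, and it relies on bash (in its default, non-POSIX mode) undoing a variable assignment that is prefixed to a function call once the function returns:

```bash
# Hypothetical helper: splits its first argument using whatever IFS
# the caller put in front of the call, and stores the result in arrayIN
split_on_semicolon() {
    set -f            # no filename expansion while splitting
    arrayIN=($1)      # IFS is ';' for the duration of this call
    set +f            # assumes globbing was enabled beforehand
}

IN='a b;c*;d'
ifs_before=$IFS       # kept only to show that IFS is restored
IFS=';' split_on_semicolon "$IN"

printf '[%s]\n' "${arrayIN[@]}"
[ "$IFS" = "$ifs_before" ] && echo "IFS restored"
```

No explicit save and restore of IFS is needed here because the assignment is scoped to the function call.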
Here's my answer!
DELIMITER_VAL='='
read -d '' F_ABOUT_DISTRO_R <<"EOF"
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.4 LTS"
NAME="Ubuntu"
VERSION="14.04.4 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.4 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
EOF
SPLIT_NOW=$(awk -F$DELIMITER_VAL '{for(i=1;i<=NF;i++){printf "%s\n", $i}}' <<<"${F_ABOUT_DISTRO_R}")
while read -r line; do
SPLIT+=("$line")
done <<< "$SPLIT_NOW"
for i in "${SPLIT[@]}"; do
echo "$i"
done
Why is this approach "the best" for me?
For two reasons:
You do not need to escape the delimiter;
You will not have problem with blank spaces. The value will be properly separated in the array.
A one-liner to split a string separated by ';' into an array is:
IN="bla#some.com;john#home.com"
ADDRS=( $(IFS=";"; echo $IN) )
echo ${ADDRS[0]}
echo ${ADDRS[1]}
This only sets IFS in a subshell, so you don't have to worry about saving and restoring its value.

Bash script to filter value from key value [duplicate]

I have this string stored in a variable:
IN="bla#some.com;john#home.com"
Now I would like to split the strings by ; delimiter so that I have:
ADDR1="bla#some.com"
ADDR2="john#home.com"
I don't necessarily need the ADDR1 and ADDR2 variables. If they are elements of an array that's even better.
After suggestions from the answers below, I ended up with the following which is what I was after:
#!/usr/bin/env bash
IN="bla#some.com;john#home.com"
mails=$(echo $IN | tr ";" "\n")
for addr in $mails
do
echo "> [$addr]"
done
Output:
> [bla#some.com]
> [john#home.com]
There was a solution involving setting Internal_field_separator (IFS) to ;. I am not sure what happened with that answer, how do you reset IFS back to default?
RE: IFS solution, I tried this and it works, I keep the old IFS and then restore it:
IN="bla#some.com;john#home.com"
OIFS=$IFS
IFS=';'
mails2=$IN
for x in $mails2
do
echo "> [$x]"
done
IFS=$OIFS
BTW, when I tried
mails2=($IN)
I only got the first string when printing it in loop, without brackets around $IN it works.
You can set the internal field separator (IFS) variable, and then let it parse into an array. When this happens in a command, then the assignment to IFS only takes place to that single command's environment (to read ). It then parses the input according to the IFS variable value into an array, which we can then iterate over.
This example will parse one line of items separated by ;, pushing it into an array:
IFS=';' read -ra ADDR <<< "$IN"
for i in "${ADDR[#]}"; do
# process "$i"
done
This other example is for processing the whole content of $IN, each time one line of input separated by ;:
while IFS=';' read -ra ADDR; do
for i in "${ADDR[#]}"; do
# process "$i"
done
done <<< "$IN"
Taken from Bash shell script split array:
IN="bla#some.com;john#home.com"
arrIN=(${IN//;/ })
echo ${arrIN[1]} # Output: john#home.com
Explanation:
This construction replaces all occurrences of ';' (the initial // means global replace) in the string IN with ' ' (a single space), then interprets the space-delimited string as an array (that's what the surrounding parentheses do).
The syntax used inside of the curly braces to replace each ';' character with a ' ' character is called Parameter Expansion.
There are some common gotchas:
If the original string has spaces, you will need to use IFS:
IFS=':'; arrIN=($IN); unset IFS;
If the original string has spaces and the delimiter is a new line, you can set IFS with:
IFS=$'\n'; arrIN=($IN); unset IFS;
I've seen a couple of answers referencing the cut command, but they've all been deleted. It's a little odd that nobody has elaborated on that, because I think it's one of the more useful commands for doing this type of thing, especially for parsing delimited log files.
In the case of splitting this specific example into a bash script array, tr is probably more efficient, but cut can be used, and is more effective if you want to pull specific fields from the middle.
Example:
$ echo "bla#some.com;john#home.com" | cut -d ";" -f 1
bla#some.com
$ echo "bla#some.com;john#home.com" | cut -d ";" -f 2
john#home.com
You can obviously put that into a loop, and iterate the -f parameter to pull each field independently.
This gets more useful when you have a delimited log file with rows like this:
2015-04-27|12345|some action|an attribute|meta data
cut is very handy to be able to cat this file and select a particular field for further processing.
If you don't mind processing them immediately, I like to do this:
for i in $(echo $IN | tr ";" "\n")
do
# process
done
You could use this kind of loop to initialize an array, but there's probably an easier way to do it.
Compatible answer
There are a lot of different ways to do this in bash.
However, it's important to first note that bash has many special features (so-called bashisms) that won't work in any other shell.
In particular, arrays, associative arrays, and pattern substitution, which are used in the solutions in this post as well as others in the thread, are bashisms and may not work under other shells that many people use.
For instance: on my Debian GNU/Linux, there is a standard shell called dash; I know many people who like to use another shell called ksh; and there is also a special tool called busybox with his own shell interpreter (ash).
For posix shell compatible answer, go to last part of this answer!
Requested string
The string to be split in the above question is:
IN="bla#some.com;john#home.com"
I will use a modified version of this string to ensure that my solution is robust to strings containing whitespace, which could break other solutions:
IN="bla#some.com;john#home.com;Full Name <fulnam#other.org>"
Split string based on delimiter in bash (version >=4.2)
In pure bash, we can create an array with elements split by a temporary value for IFS (the input field separator). The IFS, among other things, tells bash which character(s) it should treat as a delimiter between elements when defining an array:
IN="bla#some.com;john#home.com;Full Name <fulnam#other.org>"
# save original IFS value so we can restore it later
oIFS="$IFS"
IFS=";"
declare -a fields=($IN)
IFS="$oIFS"
unset oIFS
In newer versions of bash, prefixing a command with an IFS definition changes the IFS for that command only and resets it to the previous value immediately afterwards. This means we can do the above in just one line:
IFS=\; read -a fields <<<"$IN"
# after this command, the IFS resets back to its previous value (here, the default):
set | grep ^IFS=
# IFS=$' \t\n'
We can see that the string IN has been stored into an array named fields, split on the semicolons:
set | grep ^fields=\\\|^IN=
# fields=([0]="bla#some.com" [1]="john#home.com" [2]="Full Name <fulnam#other.org>")
# IN='bla#some.com;john#home.com;Full Name <fulnam#other.org>'
(We can also display the contents of these variables using declare -p:)
declare -p IN fields
# declare -- IN="bla#some.com;john#home.com;Full Name <fulnam#other.org>"
# declare -a fields=([0]="bla#some.com" [1]="john#home.com" [2]="Full Name <fulnam#other.org>")
Note that read is the quickest way to do the split because there are no forks or external resources called.
Once the array is defined, you can use a simple loop to process each field (or, rather, each element in the array you've now defined):
# `"${fields[#]}"` expands to return every element of `fields` array as a separate argument
for x in "${fields[#]}" ;do
echo "> [$x]"
done
# > [bla#some.com]
# > [john#home.com]
# > [Full Name <fulnam#other.org>]
Or you could drop each field from the array after processing using a shifting approach, which I like:
while [ "$fields" ] ;do
echo "> [$fields]"
# slice the array
fields=("${fields[#]:1}")
done
# > [bla#some.com]
# > [john#home.com]
# > [Full Name <fulnam#other.org>]
And if you just want a simple printout of the array, you don't even need to loop over it:
printf "> [%s]\n" "${fields[#]}"
# > [bla#some.com]
# > [john#home.com]
# > [Full Name <fulnam#other.org>]
Update: recent bash >= 4.4
In newer versions of bash, you can also play with the command mapfile:
mapfile -td \; fields < <(printf "%s\0" "$IN")
This syntax preserve special chars, newlines and empty fields!
If you don't want to include empty fields, you could do the following:
mapfile -td \; fields <<<"$IN"
fields=("${fields[#]%$'\n'}") # drop '\n' added by '<<<'
With mapfile, you can also skip declaring an array and implicitly "loop" over the delimited elements, calling a function on each:
myPubliMail() {
printf "Seq: %6d: Sending mail to '%s'..." $1 "$2"
# mail -s "This is not a spam..." "$2" </path/to/body
printf "\e[3D, done.\n"
}
mapfile < <(printf "%s\0" "$IN") -td \; -c 1 -C myPubliMail
(Note: the \0 at end of the format string is useless if you don't care about empty fields at end of the string or they're not present.)
mapfile < <(echo -n "$IN") -td \; -c 1 -C myPubliMail
# Seq: 0: Sending mail to 'bla#some.com', done.
# Seq: 1: Sending mail to 'john#home.com', done.
# Seq: 2: Sending mail to 'Full Name <fulnam#other.org>', done.
Or you could use <<<, and in the function body include some processing to drop the newline it adds:
myPubliMail() {
local seq=$1 dest="${2%$'\n'}"
printf "Seq: %6d: Sending mail to '%s'..." $seq "$dest"
# mail -s "This is not a spam..." "$dest" </path/to/body
printf "\e[3D, done.\n"
}
mapfile <<<"$IN" -td \; -c 1 -C myPubliMail
# Renders the same output:
# Seq: 0: Sending mail to 'bla#some.com', done.
# Seq: 1: Sending mail to 'john#home.com', done.
# Seq: 2: Sending mail to 'Full Name <fulnam#other.org>', done.
Split string based on delimiter in shell
If you can't use bash, or if you want to write something that can be used in many different shells, you often can't use bashisms -- and this includes the arrays we've been using in the solutions above.
However, we don't need to use arrays to loop over "elements" of a string. There is a syntax used in many shells for deleting substrings of a string from the first or last occurrence of a pattern. Note that * is a wildcard that stands for zero or more characters:
(The lack of this approach in any solution posted so far is the main reason I'm writing this answer ;)
${var#*SubStr} # drops substring from start of string up to first occurrence of `SubStr`
${var##*SubStr} # drops substring from start of string up to last occurrence of `SubStr`
${var%SubStr*} # drops substring from last occurrence of `SubStr` to end of string
${var%%SubStr*} # drops substring from first occurrence of `SubStr` to end of string
As explained by Score_Under:
# and % delete the shortest possible matching substring from the start and end of the string respectively, and
## and %% delete the longest possible matching substring.
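As a quick illustration of the four forms, here is a minimal sketch. The sample value `a;b;c` and the variable names are made up for this example and do not come from the answer:

```shell
# Hypothetical sample value, used only to show the four expansions.
path="a;b;c"

after_first=${path#*;}     # shortest match removed from the start -> b;c
after_last=${path##*;}     # longest match removed from the start  -> c
before_last=${path%;*}     # shortest match removed from the end   -> a;b
before_first=${path%%;*}   # longest match removed from the end    -> a

printf '%s\n' "$after_first" "$after_last" "$before_last" "$before_first"
```

None of these expansions forks a process, which is why they are useful as building blocks for a loop.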
Using the above syntax, we can create an approach where we extract substring "elements" from the string by deleting the substrings up to or after the delimiter.
The codeblock below works well in bash (including Mac OS's bash), dash, ksh, lksh, yash, zsh, and busybox's ash:
(Thanks to Adam Katz's comment, which made this loop a lot simpler!)
IN="bla#some.com;john#home.com;Full Name <fulnam#other.org>"
while [ "$IN" != "$iter" ] ;do
# extract the substring from start of string up to delimiter.
iter=${IN%%;*}
# delete this first "element" AND next separator, from $IN.
IN="${IN#$iter;}"
# Print (or doing anything with) the first "element".
printf '> [%s]\n' "$iter"
done
# > [bla#some.com]
# > [john#home.com]
# > [Full Name <fulnam#other.org>]
Why not cut?
cut is useful for extracting columns from big files, but forking repeatedly (var=$(echo ... | cut ...)) quickly becomes overkill!
Here is a correct syntax, tested under many POSIX shells, using cut as suggested by this other answer from DougW:
IN="bla#some.com;john#home.com;Full Name <fulnam#other.org>"
i=1
while iter=$(echo "$IN"|cut -d\; -f$i) ; [ -n "$iter" ] ;do
printf '> [%s]\n' "$iter"
i=$((i+1))
done
I wrote this in order to compare execution time.
On my Raspberry Pi, this looks like:
$ export TIMEFORMAT=$'(%U + %S) / \e[1m%R\e[0m : %P '
$ time sh splitDemo.sh >/dev/null
(0.000 + 0.019) / 0.019 : 99.63
$ time sh splitDemo_cut.sh >/dev/null
(0.051 + 0.041) / 0.188 : 48.98
Overall execution time is roughly 10x longer when cut is forked once per field!
This worked for me:
string="1;2"
echo $string | cut -d';' -f1 # output is 1
echo $string | cut -d';' -f2 # output is 2
I think AWK is the best and efficient command to resolve your problem. AWK is included by default in almost every Linux distribution.
echo "bla#some.com;john#home.com" | awk -F';' '{print $1,$2}'
will give
bla#some.com john#home.com
Of course you can store each email address by redefining the awk print field.
How about this approach:
IN="bla#some.com;john#home.com"
set -- "$IN"
IFS=";"; declare -a Array=($*)
echo "${Array[@]}"
echo "${Array[0]}"
echo "${Array[1]}"
Source
echo "bla#some.com;john#home.com" | sed -e 's/;/\n/g'
bla#some.com
john#home.com
This also works:
IN="bla#some.com;john#home.com"
echo ADD1=`echo $IN | cut -d \; -f 1`
echo ADD2=`echo $IN | cut -d \; -f 2`
Be careful, this solution is not always correct. In case you pass "bla#some.com" only, it will assign it to both ADD1 and ADD2.
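One way around that pitfall, sketched here as a suggestion rather than as part of the original answer, is the `-s` option of cut, which suppresses lines that contain no delimiter. The variable names follow the warning above; the single-address input is only an illustration:

```shell
# Input with a single address and no ';' delimiter.
IN="bla#some.com"

# Without -s, cut prints the whole line when the delimiter is absent.
ADD1=$(printf '%s\n' "$IN" | cut -d ';' -f 1)

# With -s, a line without the delimiter is suppressed, so ADD2 stays empty.
ADD2=$(printf '%s\n' "$IN" | cut -s -d ';' -f 2)

printf 'ADD1=[%s] ADD2=[%s]\n' "$ADD1" "$ADD2"
```

Using `-s` on the first field as well would make `ADD1` empty for such input, so it is applied only to the second field here.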
A different take on Darron's answer, this is how I do it:
IN="bla#some.com;john#home.com"
read ADDR1 ADDR2 <<<$(IFS=";"; echo $IN)
How about this one liner, if you're not using arrays:
IFS=';' read ADDR1 ADDR2 <<<$IN
In Bash, a bullet proof way, that will work even if your variable contains newlines:
IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")
Look:
$ in=$'one;two three;*;there is\na newline\nin this field'
$ IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")
$ declare -p array
declare -a array='([0]="one" [1]="two three" [2]="*" [3]="there is
a newline
in this field")'
The trick for this to work is to use the -d option of read (delimiter) with an empty delimiter, so that read is forced to read everything it's fed. And we feed read with exactly the content of the variable in, with no trailing newline, thanks to printf. Note that we're also putting the delimiter in printf to ensure that the string passed to read has a trailing delimiter. Without it, read would trim potential trailing empty fields:
$ in='one;two;three;' # there's an empty field
$ IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")
$ declare -p array
declare -a array='([0]="one" [1]="two" [2]="three" [3]="")'
the trailing empty field is preserved.
Update for Bash≥4.4
Since Bash 4.4, the builtin mapfile (aka readarray) supports the -d option to specify a delimiter. Hence another canonical way is:
mapfile -d ';' -t array < <(printf '%s;' "$in")
Without setting the IFS
If you just have one colon you can do that:
a="foo:bar"
b=${a%:*}
c=${a##*:}
you will get:
b = foo
c = bar
Here is a clean 3-liner:
in="foo#bar;bizz#buzz;fizz#buzz;buzz#woof"
IFS=';' list=($in)
for item in "${list[@]}"; do echo $item; done
where IFS delimits words based on the separator and () is used to create an array. Then [@] is used to return each item as a separate word.
If you've any code after that, you also need to restore $IFS, e.g. unset IFS.
The following Bash/zsh function splits its first argument on the delimiter given by the second argument:
split() {
local string="$1"
local delimiter="$2"
if [ -n "$string" ]; then
local part
while read -d "$delimiter" part; do
echo "$part"
done <<< "$string"
echo "$part"
fi
}
For instance, the command
$ split 'a;b;c' ';'
yields
a
b
c
This output may, for instance, be piped to other commands. Example:
$ split 'a;b;c' ';' | cat -n
1 a
2 b
3 c
Compared to the other solutions given, this one has the following advantages:
IFS is not overridden: Due to dynamic scoping of even local variables, overriding IFS over a loop causes the new value to leak into function calls performed from within the loop.
Arrays are not used: Reading a string into an array using read requires the flag -a in Bash and -A in zsh.
If desired, the function may be put into a script as follows:
#!/usr/bin/env bash
split() {
# ...
}
split "$@"
you can apply awk to many situations
echo "bla#some.com;john#home.com"|awk -F';' '{printf "%s\n%s\n", $1, $2}'
also you can use this
echo "bla#some.com;john#home.com"|awk -F';' '{print $1,$2}' OFS="\n"
There is a simple and smart way like this:
echo "add:sfff" | xargs -d: -i echo {}
But you must use GNU xargs; BSD xargs doesn't support -d delim. If you use an Apple Mac like me, you can install GNU xargs:
brew install findutils
then
echo "add:sfff" | gxargs -d: -i echo {}
There are some cool answers here (errator esp.), but for something analogous to split in other languages -- which is what I took the original question to mean -- I settled on this:
IN="bla#some.com;john#home.com"
declare -a a="(${IN//;/ })";
Now ${a[0]}, ${a[1]}, etc, are as you would expect. Use ${#a[*]} for number of terms. Or to iterate, of course:
for i in ${a[*]}; do echo $i; done
IMPORTANT NOTE:
This works in cases where there are no spaces to worry about, which solved my problem, but may not solve yours. Go with the $IFS solution(s) in that case.
So many answers and so many complexities. Try out a simpler solution:
echo "string1, string2" | tr , "\n"
tr (read: translate) replaces the characters of the first argument with those of the second argument in the input.
So tr , "\n" replaces the comma with a newline character in the input, and it becomes:
string1
string2
If no space, Why not this?
IN="bla#some.com;john#home.com"
arr=(`echo $IN | tr ';' ' '`)
echo ${arr[0]}
echo ${arr[1]}
This is the simplest way to do it.
spo='one;two;three'
OIFS=$IFS
IFS=';'
spo_array=($spo)
IFS=$OIFS
echo ${spo_array[*]}
Apart from the fantastic answers that were already provided, if it is just a matter of printing out the data you may consider using awk:
awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "$IN"
This sets the field separator to ;, so that it can loop through the fields with a for loop and print accordingly.
Test
$ IN="bla#some.com;john#home.com"
$ awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "$IN"
> [bla#some.com]
> [john#home.com]
With another input:
$ awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "a;b;c d;e_;f"
> [a]
> [b]
> [c d]
> [e_]
> [f]
IN="bla#some.com;john#home.com"
IFS=';'
read -a IN_arr <<< "${IN}"
for entry in "${IN_arr[@]}"
do
echo $entry
done
Output
bla#some.com
john#home.com
System : Ubuntu 12.04.1
Use the set built-in to load up the $@ array:
IN="bla#some.com;john#home.com"
IFS=';'; set $IN; IFS=$' \t\n'
Then, let the party begin:
echo $#
for a; do echo $a; done
ADDR1=$1 ADDR2=$2
Two Bourne-ish alternatives, neither of which requires bash arrays:
Case 1: Keep it nice and simple: Use a NewLine as the Record-Separator... eg.
IN="bla#some.com
john#home.com"
while read i; do
# process "$i" ... eg.
echo "[email:$i]"
done <<< "$IN"
Note: in this first case no sub-process is forked to assist with list manipulation.
Idea: Maybe it is worth using NL extensively internally, and only converting to a different RS when generating the final result externally.
Case 2: Using a ";" as a record separator... eg.
NL="
" IRS=";" ORS=";"
conv_IRS() {
exec tr "$1" "$NL"
}
conv_ORS() {
exec tr "$NL" "$1"
}
IN="bla#some.com;john#home.com"
IN="$(conv_IRS ";" <<< "$IN")"
while read i; do
# process "$i" ... eg.
echo -n "[email:$i]$ORS"
done <<< "$IN"
In both cases, a sub-list composed within the loop persists after the loop has completed. This is useful when manipulating lists in memory instead of storing lists in files. {p.s. keep calm and carry on B-) }
In Android shell, most of the proposed methods just do not work:
$ IFS=':' read -ra ADDR <<<"$PATH"
/system/bin/sh: can't create temporary file /sqlite_stmt_journals/mksh.EbNoR10629: No such file or directory
What does work is:
$ for i in ${PATH//:/ }; do echo $i; done
/sbin
/vendor/bin
/system/sbin
/system/bin
/system/xbin
where // means global replacement.
IN='bla#some.com;john#home.com;Charlie Brown <cbrown#acme.com;!"#$%&/()[]{}*? are no problem;simple is beautiful :-)'
set -f
oldifs="$IFS"
IFS=';'; arrayIN=($IN)
IFS="$oldifs"
for i in "${arrayIN[@]}"; do
echo "$i"
done
set +f
Output:
bla#some.com
john#home.com
Charlie Brown <cbrown#acme.com
!"#$%&/()[]{}*? are no problem
simple is beautiful :-)
Explanation: Simple assignment using parentheses () converts a semicolon-separated list into an array, provided IFS is set correctly at that point. A standard for loop then handles the individual items in that array as usual.
Notice that the list given for the IN variable must be "hard" quoted, that is, with single quotes.
IFS must be saved and restored since Bash does not treat an assignment the same way as a command. An alternative workaround is to wrap the assignment inside a function and call that function with a modified IFS. In that case separate saving/restoring of IFS is not needed. Thanks to "Bize" for pointing that out.
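The function-based workaround could look roughly like the sketch below. It is a variation, not the exact form described above: it declares IFS as `local` inside the function instead of passing it as a prefix assignment, and the function name `split_on_semicolon` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: splits its first argument on ';' into the global
# array arrayIN. IFS is local, so the caller's IFS is left untouched.
split_on_semicolon() {
    local IFS=';'
    set -f              # disable globbing so a field like '*' stays literal
    arrayIN=($1)
    set +f              # note: this re-enables globbing unconditionally
}

IN='bla#some.com;Charlie Brown <cbrown#acme.com;*'
split_on_semicolon "$IN"
printf '%s\n' "${arrayIN[@]}"
```

Because the change to IFS ends when the function returns, no explicit save and restore step is needed in the calling code.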
Here's my answer!
DELIMITER_VAL='='
read -d '' F_ABOUT_DISTRO_R <<"EOF"
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.4 LTS"
NAME="Ubuntu"
VERSION="14.04.4 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.4 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
EOF
SPLIT_NOW=$(awk -F$DELIMITER_VAL '{for(i=1;i<=NF;i++){printf "%s\n", $i}}' <<<"${F_ABOUT_DISTRO_R}")
while read -r line; do
SPLIT+=("$line")
done <<< "$SPLIT_NOW"
for i in "${SPLIT[@]}"; do
echo "$i"
done
Why is this approach "the best" for me?
For two reasons:
You do not need to escape the delimiter;
You will not have problem with blank spaces. The value will be properly separated in the array.
A one-liner to split a string separated by ';' into an array is:
IN="bla#some.com;john#home.com"
ADDRS=( $(IFS=";"; echo $IN) ) # IFS must be set before $IN is expanded, and $IN left unquoted, for the split to happen
echo ${ADDRS[0]}
echo ${ADDRS[1]}
This only sets IFS in a subshell, so you don't have to worry about saving and restoring its value.

How to parse multiple line output as separate variables

I'm relatively new to bash scripting and I would like someone to explain this properly, thank you. Here is my code:
#! /bin/bash
echo "first arg: $1"
echo "first arg: $2"
var="$( grep -rnw $1 -e $2 | cut -d ":" -f1 )"
var2=$( grep -rnw $1 -e $2 | cut -d ":" -f1 | awk '{print substr($0,length,1)}')
echo "$var"
echo "$var2"
The problem I have is with the output, the script I'm trying to write is a c++ function searcher, so upon launching my script I have 2 arguments, one for the directory and the second one as the function name. This is how my output looks like:
first arg: Projekt
first arg: iseven
Projekt/AX/include/ax.h
Projekt/AX/src/ax.cpp
h
p
Now my question is: how can I save the line-by-line output as variables, so that later on I can use var as a path, or use var2 as a character to compare? My plan was to use IF() statements to determine the type, idea: IF(last_char == p){echo:"something"}. What I've tried was this question: Capturing multiple line output into a Bash variable, and then treating it as an array, so my code looked like: "${var[0]}". Please explain how I can use my line output later on, as variables.
I'd use readarray to populate an array variable just in case there's spaces in your command's output that shouldn't be used as field separators that would end up messing up foo=( ... ). And you can use shell parameter expansion substring syntax to get the last character of a variable; no need for that awk bit in your var2:
#!/usr/bin/env bash
readarray -t lines < <(printf "%s\n" "Projekt/AX/include/ax.h" "Projekt/AX/src/ax.cpp")
for line in "${lines[@]}"; do
printf "%s\n%s\n" "$line" "${line: -1}" # Note the space before the -1
done
will display
Projekt/AX/include/ax.h
h
Projekt/AX/src/ax.cpp
p
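For the branching the question had in mind, a case statement can match on the file suffix directly instead of extracting the last character first. The sketch below uses a different technique from the `${line: -1}` expansion above, and the two paths are simply the sample output from the question:

```shell
# Classify each path by its suffix; 'kind' is a name made up for this sketch.
while IFS= read -r line; do
    case $line in
        *.h)   kind=header ;;
        *.cpp) kind=source ;;
        *)     kind=other ;;
    esac
    printf '%s is a %s file\n' "$line" "$kind"
done <<EOF
Projekt/AX/include/ax.h
Projekt/AX/src/ax.cpp
EOF
```

In a real script the here-document would be replaced by the grep pipeline, for example through process substitution as in the readarray call above.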

Iterate over arguments in a bash script and make use of their numbers

If I want to iterate over all arguments it is as easy as for i in "$@"; do .... However, let's say I want to start with the second argument and also make use of the arguments' positions for some basic calculation.
As an example I want to shorten these commands into one loop:
grep -v 'foobar' "$2" | grep -f $file > output1.txt
grep -v 'foobar' "$3" | grep -f $file > output2.txt
grep -v 'foobar' "$4" | grep -f $file > output3.txt
grep -v 'foobar' "$5" | grep -f $file > output4.txt
I tried many variations like for i in {2..5}; do grep -v 'foobar' "$$i" | grep -f $file > output$(($i-1)).txt; done; however, it seems bash expansion doesn't work like this.
EDIT:
Seems I made a mistake not emphasizing that I need to make use of the argument's position/number (i.e., 2 from $2). It's important because the output files get used separately later in the script. All of the provided answers so far seem correct but I don't know how to use them to make use of the argument's "number".
Couple correct answers already, another way could be:
for (( i=2; i <= "$#"; i++ )); do
echo "arg position: ${i}"
echo "arg value: ${!i}"
done
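Applied to the output naming from the question, the same loop might be sketched as follows. The positional parameters are set by hand with made-up file names, and the grep pipeline is replaced by a printf so the sketch runs without any input files:

```shell
#!/usr/bin/env bash
# Hypothetical arguments: $1 is skipped, $2..$4 are the files to process.
set -- skipped.txt a.txt b.txt c.txt

for (( i = 2; i <= $#; i++ )); do
    out="output$(( i - 1 )).txt"     # $2 -> output1.txt, $3 -> output2.txt, ...
    # In the real script this line would be the grep pipeline writing to "$out".
    printf '%s -> %s\n' "${!i}" "$out"
done
```

Here `${!i}` is indirect expansion: it yields the value of the positional parameter whose number is stored in `i`.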
If you do not want to shift off the first unneeded arguments you can use the indirection expansion:
for i in {2..5}; do
echo "${!i}"
done
You can also make use of array indexes directly:
#!/bin/bash
for i in "${@:2}"; do
echo "$i"
done
Which will iterate over the arguments beginning with the second argument. It will also preserve whitespace in arguments when quoted. e.g.
$ bash args.sh one two "three four" five
two
three four
five
You just need to use shift to move positional parameter once (thus discarding $1):
fn() { arg1="$1"; shift; for arg; do echo "$arg"; done; }
Call it as:
fn val1 val2 val3 val4
Output:
val2
val3
val4
Above answers are correct. Another approach:
a=("${@:2}")
for i in ${!a[@]}; do
echo "$i = ${a[i]}"
done
$@ is similar to an array, with the small difference that it is one-based and not zero-based ($0 is the name of the shell or shell script), and it can't be indexed (${@[1]} won't work to access $1). You can use parameter expansion (more specifically: substring expansion) to access a subset of your positional parameters:
$ set arg1 arg2 arg3 arg4 arg5 # Set $1, $2, $3, $4, $5
$ for i in "${@:2}"; do echo "$i"; done
arg2
arg3
arg4
arg5
Notice that this will not actually allow you to use the index of the argument and the argument itself. To do that, you'd have to either turn the positional parameters into a regular array and deal with the index offset by one ($1 would have index 0), or use indirect expansion (see andlrc's answer).
Example of the first method:
$ set arg1 arg2 arg3 arg4 arg5
$ args=("$@")
$ for i in 3 2 5 1 4; do echo "\$$i is ${args[$((i-1))]}"; done
$3 is arg3
$2 is arg2
$5 is arg5
$1 is arg1
$4 is arg4
Clearly, ${args[$((i-1))]} is rather messy, but it works.
Side note: if you want to access all positional parameters, you can use a convenient shorthand:
$ for i; do echo "$i"; done
arg1
arg2
arg3
arg4
arg5
for without the in part loops over all positional parameters (see manual).

Cut unix variable

I have the following at the moment:
for file in *
do
list="$list""$file "`cat $file | wc -l | sort -k1`$'\n'
done
echo "$list"
This is printing:
fileA 10
fileB 20
fileC 30
I would then like to cycle through $list and cut column 2 and perform calculations.
When I do:
for line in "$list"
do
noOfLinesInFile=`echo "$line" | cut -d\ -f2`
echo "$noOfLinesInFile"
done
It prints:
10
20
30
BUT, the for loop is only being entered once. In this example, it should be entering the loop 3 times.
Can someone please tell me what I should do here to achieve this?
If you quote the variable
for line in "$list"
there is only one word, so the loop is executed just once.
Without quotes, $line would be populated with any word found in the $list, which is not what you want, either, as it would process the values one by one, not lines.
You can set the $IFS variable to newline to split $list on newlines:
IFS=$'\n'
for line in $list ; do
...
done
Don't forget to reset IFS to the original value - either put the whole part into a subshell (if no variables should survive the loop)
(
IFS=$'\n'
for ...
)
or backup the value:
IFS_=$IFS
IFS=$'\n'
for ...
IFS=$IFS_
...
done
This is because lists in the shell are simply defined using whitespace as a separator.
# list="a b c"
# for i in $list; do echo $i; done
a
b
c
# for i in "$list"; do echo $i; done
a b c
In your first loop, you are not actually building a list in the shell sense.
You should set a separator other than the default, either for the loop, in the append, or in the cut...
Use arrays instead:
#!/bin/bash
files=()
linecounts=()
for file in *; do
files+=("$file")
linecounts+=("$(wc -l < "$file")")
done
for i in "${!files[@]}" ;do
echo "${linecounts[i]}"
printf '%s %s\n' "${files[i]}" "${linecounts[i]}" ## Another form.
done
Although it can be done more simply as printf '%s\n' "${linecounts[@]}".
wc -l will only output one value, so you don't need to sort it:
for file in *; do
list+="$file "$( wc -l < "$file" )$'\n'
done
echo "$list"
Then, you can use a while loop to read the list line-by-line:
while read file nlines; do
echo $nlines
done <<< "$list"
That while loop is fragile if any filename has spaces. This is a bit more robust:
while read -a words; do
echo ${words[-1]}
done <<< "$list"
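Since the question mentions performing calculations on the second column, here is a minimal sketch that sums the counts. It takes the last space-separated field with `${line##* }`, so a file name containing spaces does not break it. The sample list and the name `total` are assumptions for the example, and a here-document is used in place of `<<<` so that it also runs in plain sh:

```shell
# Hypothetical list in the same "name count" format, one entry per line.
list='file A 10
fileB 20'

total=0
while IFS= read -r line; do
    count=${line##* }              # everything after the last space
    total=$(( total + count ))
done <<EOF
$list
EOF

printf 'total lines: %s\n' "$total"
```

The loop reads from a redirection rather than a pipe, so `total` is still set after the loop ends.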

Resources