This question already has answers here:
Read values into a shell variable from a pipe
(17 answers)
Closed 12 months ago.
I just discovered mapfile in bash when shellcheck recommended it for my code, which captures two shell parameters from a space-delimited string of two words. But mapfile does not seem to function consistently. For example, this works as expected:
mapfile arr < myfile
The array arr is populated with entries corresponding to the lines of myfile. However, this does not work:
echo -e "hello\nworld" | mapfile arr
The array arr is not populated at all. And, this doesn't work either:
echo "hello world" | mapfile -d ' ' arr
I am not sure why it would make a difference where the standard input for the command comes from. I didn't know it was possible to distinguish where the input came from, a file or a pipeline.
Any clues?
Note to moderator: It was suggested that my question was a duplicate of "Read values into a shell variable from a pipe". I do not agree. mapfile is not mentioned anywhere in that question, nor did an SO search turn up any other useful Q/A. In addition, that referenced question did not deal with shell parameter assignments. Therefore, this question and its answers are valuable.
Technically, the array is being populated; the issue is that each command in a pipeline runs in its own subshell, so mapfile fills arr in a child process. When that subshell exits (back to the command line), the array assignment is lost, i.e., you can't pass assignments 'up' from a child process to the parent/calling process.
Try these instead:
$ mapfile -d ' ' arr < <(echo -e "hello\nworld")
$ typeset -p arr
declare -a arr=([0]=$'hello\nworld\n')
$ mapfile -d ' ' arr < <(echo "hello world")
$ typeset -p arr
declare -a arr=([0]="hello " [1]=$'world\n')
While these will populate the array ... you'll have to decide if this is what you were expecting to show up in the array. Perhaps the following are a bit closer to what's desired?
$ mapfile -t arr < <(echo -e "hello\nworld")
$ typeset -p arr
declare -a arr=([0]="hello" [1]="world")
$ mapfile -t arr < <(echo "hello world")
$ typeset -p arr
declare -a arr=([0]="hello world")
For the second pair of commands, if the intention is to parse each word into its own array element, then perhaps replace mapfile with read?
$ read -ra arr < <(echo "hello world")
$ typeset -p arr
declare -a arr=([0]="hello" [1]="world")
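The difference between the pipeline and the process substitution can be checked with a short script. This is only a sketch, assuming bash 4 or later and the default setting where lastpipe is off; the variable names piped, subst, first and second are made up for the illustration. The last part uses a here-string with read, which may be closer to the original goal of capturing two words in two parameters.

```shell
#!/usr/bin/env bash
# Pipeline: mapfile runs in a subshell, so the parent's array stays empty.
piped=()
printf 'hello\nworld\n' | mapfile -t piped
echo "piped has ${#piped[@]} elements"    # 0 elements

# Process substitution: mapfile runs in the current shell.
mapfile -t subst < <(printf 'hello\nworld\n')
echo "subst has ${#subst[@]} elements"    # 2 elements

# Here-string: split one space-delimited line into two variables.
read -r first second <<< "hello world"
echo "first=$first second=$second"
```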
I am trying to store the output of this:
mdfind "kMDItemContentType == 'com.apple.application-bundle'"
Output is like this:
/Applications/Safari.app
/Applications/Xcode.app
/Applications/Xcode.app/Contents/Applications/Accessibility Inspector.app
/Applications/Xcode.app/Contents/Applications/RealityComposer.app
/Applications/Xcode.app/Contents/Applications/FileMerge.app
/Applications/Xcode.app/Contents/Applications/Instruments.app
/Applications/Xcode.app/Contents/Applications/Create ML.app
I try to store it as an array but the contents are split per spaces:
bash-5.1$ arr=( $(/usr/bin/mdfind "kMDItemContentType == 'com.apple.application-bundle'") )
bash-5.1$ echo ${arr[2]}
/Applications/Xcode.app/Contents/Applications/Accessibility
bash-5.1$ echo ${arr[3]}
Inspector.app
bash-5.1$
So how can I do the trick?
Use readarray and process substitution to read a null-delimited series of paths into an array.
readarray -d '' arr < <(mdfind -0 "...")
The -0 option tells mdfind to terminate each path with a null byte instead of a linefeed. (This guards against rare, but legal, path names that include linefeeds. A null byte is not a valid character in any path component.)
The -d '' option tells readarray to treat the null byte as the end of a "line".
readarray populates an array with one "line" of input per element. The input is the output of mdfind; the process substitution ensures that readarray executes in the current shell, not a subshell induced by a pipe like
mdfind -0 "..." | readarray -d '' arr
(In some situations, you can make the last job of a pipeline execute in the current shell; that's beyond the scope of this answer, though.)
Example:
# Using printf to simulate mdfind -0
$ readarray -d '' arr < <(printf 'foo\nbar\000baz\000')
$ declare -p arr
declare -a arr=([0]=$'foo\nbar' [1]="baz")
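The parenthetical remark about running the last job of a pipeline in the current shell most likely refers to bash's lastpipe option. A minimal sketch, assuming bash 4.4 or later (for -d) and a non-interactive script, since lastpipe only takes effect while job control is inactive:

```shell
#!/usr/bin/env bash
# lastpipe makes the final command of a pipeline run in the current shell.
# It is ignored while job control is active (typical interactive prompt).
shopt -s lastpipe

# printf simulates mdfind -0, as in the example above.
printf 'foo\nbar\000baz\000' | readarray -d '' arr
declare -p arr
```

At an interactive prompt this would normally still leave arr empty unless job control is turned off first, which is one reason the process substitution form is the safer default.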
So I know I can use a single IFS in a read statement, but is it possible to use two? For instance if I have the text
variable = 5 + 1;
print variable;
And I have the code to assign every word split to an array, but I also want to split at the ; as well as a space, if it comes up.
Here is the code so far
INPUT="$1"
declare -a raw_parse
while IFS=' ' read -r -a raw_input; do
    for raw in "${raw_input[@]}"; do
        raw_parse+=("$raw")
    done
done < "$INPUT"
What comes out:
declare -a raw_parse=([0]="variable" [1]="=" [2]="5" [3]="+" [4]="1;" [5]="print" [6]="variable;")
What I want:
declare -a raw_parse=([0]="variable" [1]="=" [2]="5" [3]="+" [4]="1" [5]=";" [6]="print" [7]="variable" [8]=";")
A workaround with GNU sed. This inserts a space before every ; and replaces every newline with a space.
read -r -a raw_input < <(sed -z 's/;/ ;/g; s/\n/ /g' "$INPUT")
declare -p raw_input
Output:
declare -a raw_input=([0]="variable" [1]="=" [2]="5" [3]="+" [4]="1" [5]=";" [6]="print" [7]="variable" [8]=";")
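If GNU sed is not available, the same idea (pad every ; with spaces, then split on whitespace) can be sketched in pure bash with parameter expansion. The variable input below is a stand-in for the contents of the file named by "$INPUT", and words is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Stand-in for the contents of the input file.
input=$'variable = 5 + 1;\nprint variable;'

raw_parse=()
# ${input//;/ ; } surrounds every semicolon with spaces, so that the
# ordinary whitespace splitting of read -a turns it into its own word.
while IFS=' ' read -r -a words; do
    raw_parse+=("${words[@]}")
done < <(printf '%s\n' "${input//;/ ; }")

declare -p raw_parse
```

This only handles ; as an extra token; other punctuation would need its own substitution.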
A command emits the string: "[abc]=kjlkjkl [def]=yutuiu [ghi]=jljlkj"
I want to load a bash associative array using these key|value pairs, but the result I'm getting is a single row array where the key is formed of the first pair [abc]=kjlkjkl and the value is the whole of the rest of the string, so: declare -p arr returns declare -A arr["[abc]=kjlkjkl"]="[def]=yutuiu [ghi]=jljlkj"
This is what I am doing at the moment. Where am I going wrong please?
declare -A arr=()
while read -r a b; do
arr["$a"]="$b"
done < <(command that outputs the string "[abc]=kjlkjkl [def]=yutuiu [ghi]=jljlkj")
You need to parse it: split the string on spaces, split each key-value pair on the equals sign, and get rid of the brackets.
Here's one way, using tr to replace the spaces with newlines, then tr again to remove all brackets (including any that occur in a value), then IFS="=" to split the key-value pairs. I'm sure this could be done more effectively, like with AWK or Perl, but I don't know how.
declare -A arr=()
while IFS="=" read -r a b; do
arr["$a"]="$b"
done < <(
echo "[abc]=kjlkjkl [def]=yutuiu [ghi]=jljlkj" |
tr ' ' '\n' |
tr -d '[]'
)
echo "${arr[def]}" # -> yutuiu
See Cyrus's answer for another take on this, with the space and equals steps combined.
Append this to your command which outputs the string:
| tr ' =' '\n ' | tr -d '[]'
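Spelled out in full, that pipeline could feed the read loop roughly like this. It is a sketch in which printf stands in for the real command, and key and value are illustrative names; like the tr version above, it strips brackets from values as well as keys.

```shell
#!/usr/bin/env bash
declare -A arr=()
while read -r key value; do
    arr["$key"]=$value
done < <(
    # printf stands in for the command that emits the string.
    # First tr: space -> newline (one pair per line), '=' -> space.
    # Second tr: delete all brackets.
    printf '%s\n' "[abc]=kjlkjkl [def]=yutuiu [ghi]=jljlkj" |
        tr ' =' '\n ' |
        tr -d '[]'
)
echo "${arr[def]}"    # yutuiu
```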
You can use the "eval declare" trick - but be sure your input is clean.
#! /bin/bash
s='[abc]=kjlkjkl [def]=yutuiu [ghi]=jljlkj'
eval declare -A arr=("$s")
echo ${arr[def]} # yutuiu
If the input is insecure, don't use it. Imagine (don't try) what would happen if
s='); rm -rf / #'
The "proper" good™ solution would be to write your own parser and tokenize the input. For example read the input char by char, handle [ and ] and = and space and optionally quoting. After parsing the string, assign the output to an associative array.
A simple way could be:
echo "[abc]=kjlkjkl [def]=yutuiu [ghi]=jljlkj" |
xargs -n1 |
{
declare -A arr;
while IFS= read -r line; do
if [[ "$line" =~ ^\[([a-z]*)\]=([a-z]*)$ ]]; then
arr[${BASH_REMATCH[1]}]=${BASH_REMATCH[2]}
fi
done
declare -p arr
}
outputs:
declare -A arr=([abc]="kjlkjkl" [ghi]="jljlkj" [def]="yutuiu" )
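Note that in the pipeline above, arr lives inside the last stage of the pipe and is gone once the braces close. A variant that keeps the array in the current shell, without eval, might look like the sketch below; the names pairs, pair and re are illustrative, and the pattern assumes keys contain no ] and values contain no whitespace.

```shell
#!/usr/bin/env bash
s='[abc]=kjlkjkl [def]=yutuiu [ghi]=jljlkj'

declare -A arr=()
# Keeping the regex in a variable avoids quoting surprises inside [[ ]].
re='^\[([^]]+)\]=(.*)$'

# Split the string on whitespace into one token per pair.
read -r -a pairs <<< "$s"
for pair in "${pairs[@]}"; do
    # Accept only tokens shaped like [key]=value; skip anything else.
    if [[ $pair =~ $re ]]; then
        arr["${BASH_REMATCH[1]}"]=${BASH_REMATCH[2]}
    fi
done

declare -p arr
```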
In bash, there is an array like this:
arr=(12345_34, 5_32134, 8_123, 13_1234)
And I'd like to sort (decreasing order) this array based on the numbers before the underscore.
So the desired result is the following:
(12345_34, 13_1234, 8_123, 5_32134)
I tried sort -t _-k 2 -g $arr
arr=(12345_34 5_32134 8_123 13_1234)
readarray -t arr_sorted < <(printf '%s\n' "${arr[@]}" | sort -r -t _ -g)
declare -p arr_sorted
...properly emits as output the ordering specified in the question:
declare -a arr_sorted=([0]="12345_34" [1]="13_1234" [2]="8_123" [3]="5_32134")
If you need to target versions of bash too old to have readarray, a while read loop can substitute, with considerable loss of terseness:
# define the input array
arr=(12345_34 5_32134 8_123 13_1234)
# generate a sorted version
arr_sorted=( )
while IFS= read -r item; do
    arr_sorted+=( "$item" )
done < <(printf '%s\n' "${arr[@]}" | sort -r -t _ -g)
# print the sorted version to demonstrate that we built it correctly
declare -p arr_sorted
Try this:
sorted=($(printf '%s\n' "${arr[@]}" | sort -nr))
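Both pipelines above rely on sort reading the leading number of each line. A sketch that names the sort key explicitly, assuming the goal is a descending numeric sort on the part before the underscore; it uses readarray rather than an unquoted $(...) so that elements containing spaces or glob characters would survive intact.

```shell
#!/usr/bin/env bash
arr=(12345_34 5_32134 8_123 13_1234)

# -t _      : fields are separated by underscores
# -k1,1nr   : sort on field 1 only, numerically, in reverse order
readarray -t arr_sorted < <(printf '%s\n' "${arr[@]}" | sort -t _ -k1,1nr)

declare -p arr_sorted
```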
Here is how to split a tab-split line into an array:
IFS=$'\t' read -a array < <(echo -e "a\tb\tc")
And here is how to read null-terminated lines into an array:
while IFS= read -r -d '' item
do array+=("$item")
done < <(echo -e "a\0b\0c\0")
Now, is it possible to combine both and have a one-liner suitable to read a null-split line into an array?
First, putting -r -d '' together with -a array, and setting IFS to empty, doesn't seem to work:
IFS= read -r -d '' -a array < <(echo -e "a\0b\0c")
as array will have 1 item: a
Secondly, such a while construct for reading a null-split line into an array is mildly unsatisfactory. It will drop the last item if the last item isn't followed by a null. A workaround is to append the last item to the array after the loop, as suggested by chepner below.
while IFS= read -r -d '' item
do array+=("$item")
done < <(echo -e "a\0b\0c")
array+=("$item")
There isn't, if you don't consider a single while loop a "one"-liner.
while IFS= read -r -d '' item; do array+=("$item"); done < <(echo -e "a\0b\0c\0")
read only reads one line of input at a time, and you are using -d '' to define what a line is. That is, you aren't treating "a\0b\0c" as a single line of input, but as multiple lines separated by null bytes. The readarray command, introduced in bash 4, originally provided no way to specify what to consider as the line separator; its -d option was only added in bash 4.4.
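For the problem of a final item that isn't followed by a null, a commonly used idiom is to keep looping when read fails but has still stored something. A minimal sketch, using printf rather than echo -e so that no trailing newline sneaks into the last item:

```shell
#!/usr/bin/env bash
array=()
# read returns non-zero at end of input, but item still holds any
# unterminated trailing data; the || test lets the loop body see it.
while IFS= read -r -d '' item || [[ -n $item ]]; do
    array+=("$item")
done < <(printf 'a\0b\0c')

declare -p array
```

The trade-off is that a final unterminated item that happens to be empty is indistinguishable from plain end of input and is skipped.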
readarray a.k.a. mapfile supports nul termination with the -d '' flag, at least in bash 5.2.2 where I tested, and 4.4.20 on a test Ubuntu 18.04 container, which is as far back as I cared to go.
TL;DR:
mapfile -d '' ARRAYNAME < <(command that emits null bytes here)
mapfile -d '' ARRAYNAME < file-with-null-bytes
Order is significant: the array name must come after the -d ''. Using a pipe instead will not work how you expect, because mapfile will run in a subshell; hence the use of process substitution above.
To append to an existing array, use
mapfile -d '' -O "${#ARR[@]}" ARR
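As a small illustration of the append form, assuming bash 4.4 or later and an array that already holds one element (the element values are invented for the demo):

```shell
#!/usr/bin/env bash
ARR=(first)

# -O sets the index at which assignment starts; passing the current
# length appends, and mapfile does not clear the array when -O is given.
mapfile -d '' -O "${#ARR[@]}" ARR < <(printf 'second\0third\0')

declare -p ARR
```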
Demo and explanation
Set up a NUL-terminated stream with one plain entry, one entry with embedded space, and one entry with embedded newline. Use xargs -0 to show it's correct.
$ python3 -c 'print("\x00".join(["1", "2 3", "4\n5"]), end="\x00")'
12 34
5
$ python3 -c 'print("\x00".join(["1", "2 3", "4\n5"]), end="\x00")' \
| xargs -0 -I{} echo "[{}]"
[1]
[2 3]
[4
5]
Python is only used to generate the demo input; use whatever else you prefer.
Now we use mapfile -d '' to read the demo null terminated stream and verify that it was split solely on null bytes:
$ mapfile -d '' ARR < <(python3 -c 'print("\x00".join(["1", "2 3", "4\n5"]), end="\x00")')
$ echo "${#ARR[@]}"
3
$ printf "[%s]" "${ARR[@]}"
[1][2 3][4
5]$
Note:
readarray is just a synonym for mapfile (both are builtins, not a shell alias)
You can write mapfile -d $'\0' ARRAYNAME instead if you want, to make your intent more explicit. $'xxxxx' is a bash escape-string. But bash strings are null-terminated, so $'\0' is the same as the empty-string anyway. mapfile special-cases the zero-length delimiter argument to mean null terminated.
Order of arguments to mapfile is significant. Array name must be the last argument. mapfile ARRAYNAME -d '' is not the same as mapfile -d '' ARRAYNAME. Redirection operators are removed by the calling shell so they don't count.
Invoking mapfile in a pipeline won't export the resulting array to the shell that invokes the pipeline, because the mapfile in the pipeline will run in its own separate shell process. Use process substitution redirection < <(command) instead of a pipe.
So this won't work
$ # DELIBERATELY WRONG #
$ python3 -c 'print("\x00".join(["1", "2 3", "4\n5"]), end="\x00")' | mapfile -d '' ARR2
$ echo "${#ARR2[@]}"
0
$ echo "${ARR2[@]}"
$ # DELIBERATELY WRONG #
I used Python for the demo output because working with null bytes in bash is too painful. Here's an alternative using only sh etc, using the creation of files and then find -print0 to get the desired output:
mkdir omg
cd omg
touch '1' '2 3' $'4\n5'
find . -print0
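To close the loop, the find output can be read back with mapfile -d ''. This sketch assumes bash 4.4 or later and a find that supports -print0; it works in a temporary directory (the tmpdir and files names are illustrative) and adds -type f so that the directory itself is not listed.

```shell
#!/usr/bin/env bash
# Create the three demo files in a scratch directory.
tmpdir=$(mktemp -d)
touch "$tmpdir/1" "$tmpdir/2 3" "$tmpdir/"$'4\n5'

# Read the NUL-terminated paths into an array, one file per element.
mapfile -d '' files < <(find "$tmpdir" -type f -print0)
echo "${#files[@]}"    # 3

# Show each path unambiguously, including the embedded newline.
printf '%q\n' "${files[@]}"

rm -r "$tmpdir"
```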