Assume that the command alpha produces this output:
a b c
d
If I run the command
beta $(alpha)
then beta will be executed with four parameters, "a", "b", "c" and "d".
But if I run the command
beta "$(alpha)"
then beta will be executed with one parameter, "a b c d".
What should I write in order to execute beta with two parameters, "a b c" and "d"? That is, how do I force $(alpha) to return one parameter per output line from alpha?
You can use:
$ alpha | xargs -d "\n" beta
Similar to anubhava's answer, if you are using bash 4 or later.
readarray -t args < <(alpha)
beta "${args[@]}"
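To see the effect, here is a minimal sketch. The alpha and beta functions are hypothetical stand-ins for the real commands (they are not from the question): alpha prints the two sample lines, and beta reports what it received.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the question's commands.
alpha() { printf 'a b c\nd\n'; }
beta()  { printf '%d args\n' "$#"; printf '<%s>\n' "$@"; }

# One array element per output line; -t strips the trailing newline (bash 4+).
readarray -t args < <(alpha)

# Quoting the expansion keeps "a b c" together as a single argument.
beta "${args[@]}"
```

Under these assumptions beta should report two arguments, with "a b c" kept intact as the first one.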
Do that in 2 steps in bash:
IFS=$'\n' read -d '' a b < <(alpha)
beta "$a" "$b"
Example:
# set IFS to \n with -d'' to read 2 values in a and b
IFS=$'\n' read -d '' a b < <(echo $'a b c\nd')
# check a and b
declare -p a b
declare -- a="a b c"
declare -- b="d"
Script beta.sh should fix your issue:
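$ cat alpha.sh
#!/bin/bash
echo -e "a b c\nd"
$ cat beta.sh
#!/bin/bash
# $'\n' is a bash feature, hence the bash shebang
OIFS="$IFS"
IFS=$'\n'      # split the command output on newlines only
for i in $(./alpha.sh); do
    echo "$i"
done
IFS="$OIFS"    # restore the original separator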
I have a bash variable which is a white-space separated list of strings and would like a loop that iterates 2 elements at a time from that list. I've made sure the length of the list is divisible by 2. So I want something like:
x="a bb cccc d"
while read first second; do
echo "($first,$second)"
done <<< $x
output should be:
(a,bb)
(cccc,d)
currently the above yields:
(a,bb cccc d)
Note: I need the assignment of $first and $second in my loop. echo was put in as a placeholder.
I'm looking for an efficient answer (preferably without a counter).
Depending on what you want to do with the values, you could do something like this:
$ x="one two buckle myShoe"
$ xargs -n 2 printf "(%s,%s)\n" <<<$x
(one,two)
(buckle,myShoe)
I assume that you really want to assign the values to shell variables, though, and that the body of your anticipated loop is more complicated than a simple formatted print. In fact, since printf automatically repeats the format until its arguments are exhausted, xargs in the above example was unnecessary; it could have just been:
printf "(%s,%s)\n" $x
Unfortunately, xargs is not a built-in, so you can't xargs an arbitrary bash pipeline. Nor can you just define a shell function and give xargs its name. You can, however, xargs a bash subshell, which provides a very general solution:
# Shell function which takes two arguments
doit() {
echo "Number of arguments: $#"
echo "First argument: $1"
echo "Second argument: $2"
}
# Make the function visible to subshells
export -f doit
x="one two buckle myShoe"
xargs -n2 bash -c 'doit "$@"' _ <<<$x
Finally, as per a discussion in comments, you can use either printf or xargs (but probably printf is more efficient, since it is a builtin) to reorganize the list into pairs, and then feed that into a while read loop:
printf "%s %s\n" $x |
while read -r first second; do
echo "($first,$second)"
done
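One caveat with the pipe: in bash the while loop then runs in a subshell, so anything assigned inside it is gone once the loop ends. A sketch of the same idea using process substitution instead, assuming you want to keep the results afterwards (the pairs array is only an illustration, not part of the question):

```shell
#!/usr/bin/env bash
x="a bb cccc d"
pairs=()

# The loop runs in the current shell, so pairs survives after "done".
while read -r first second; do
    pairs+=("($first,$second)")
done < <(printf '%s %s\n' $x)   # $x is unquoted on purpose so it splits into words

printf '%s\n' "${pairs[@]}"
```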
Using shift and $1, $2, etc.
You can use set -- $x to assign each part of x to the bash variables $1, $2, $3, etc.
You can use shift <n> to shift the numbered variables down by n. (I.e. shift 2 moves the value of $3 to $1, and $4 to $2, etc.)
Example
x="a bb cccc d"
set -- $x
while [ ! -z "$1" ] # while $1 is not empty
do
# do whatever you want here with $1 and $2. you can take more
# than two at a time by calling shift with a higher argument
# (and use $1, $2, and $3).
echo "($1,$2)"
shift 2
done
This prints:
(a,bb)
(cccc,d)
If you need the command line arguments
Save them to a temporary variable:
ARGS=( "$@" ) # save the command line args
set -- $x
...
set -- "${ARGS[@]}" # restore them back
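An alternative that avoids saving and restoring is to do the work inside a function, because set -- in a function only replaces that function's own positional parameters. A minimal sketch, where pairs_of is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# pairs_of is a hypothetical helper; it takes the list as one string.
pairs_of() {
    set -- $1                 # unquoted on purpose: split the list into $1, $2, ...
    while [ "$#" -ge 2 ]; do
        printf '(%s,%s)\n' "$1" "$2"
        shift 2
    done
}

set -- original args          # simulate the script's own arguments
pairs_of "a bb cccc d"
echo "caller still has: $*"   # the caller's parameters are untouched
```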
You can tell read to read a specific number of characters. With x="a b c d", each pair occupies exactly four characters:
while read -n4 first second; do echo "($first,$second)"; done <<< "$x"
(a,b)
(c,d)
EDIT: For generic solution use awk:
x="aa bb cccc de"
awk '{for (i=1; i<=NF; i+=2) printf "(%s,%s)\n", $i, $(i+1) }' <<< "$x"
(aa,bb)
(cccc,de)
I would declare your variable like this :
x="a b c d"
A=($x)
And then
$ for ((i=0; i<${#A[*]}; i=i+2)); do echo "("${A[$i]},${A[$i+1]}")"; done
(a,b)
(c,d)
I can use herestrings to pass a string to a command, e.g.
cat <<< "This is a string"
How can I use herestrings to pass two strings to a command? How can I do something like
### not working
diff <<< "string1" "string2"
### working but overkill
echo "string1" > file1
echo "string2" > file2
diff file1 file2
You can't use two herestrings as input to the same command. In effect, the last one replaces all the others. Demonstration:
cat <<< "string 1" <<< "string 2" <<< "string 3"
# only shows "string 3"
On the other hand, if what you really want is to diff two immediate inputs, you can do it this way:
diff <(echo "string 1") <(echo "string 2")
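If you only need a yes/no answer, the same construct can be wrapped in a small function and used in a condition. This is a sketch; same is a hypothetical helper name, and printf is used instead of echo so that strings starting with a dash are handled safely:

```shell
#!/usr/bin/env bash
# same is a hypothetical helper: exit status 0 when the two strings are equal.
same() {
    diff <(printf '%s\n' "$1") <(printf '%s\n' "$2") >/dev/null
}

if same "string 1" "string 2"; then
    echo "identical"
else
    echo "different"
fi
```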
You can simply concatenate the two strings:
cat <<< "string1""string2"
(Note the lack of space between the two.) The here string now consists of a single word whose contents are the contents of the two strings.
I have a variable like this:
words="这是一条狗。"
I want to make a for loop on each of the characters, one at a time, e.g. first character="这", then character="是", character="一", etc.
The only way I know is to output each character to separate line in a file, then use while read line, but this seems very inefficient.
How can I process each character in a string through a for loop?
You can use a C-style for loop:
foo=string
for (( i=0; i<${#foo}; i++ )); do
echo "${foo:$i:1}"
done
${#foo} expands to the length of foo. ${foo:$i:1} expands to the substring starting at position $i of length 1.
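If you need the characters again later, the same loop can collect them into an array. A minimal sketch; the chars array and the sample string are assumptions for illustration:

```shell
#!/usr/bin/env bash
foo="a b!"
chars=()

for (( i = 0; i < ${#foo}; i++ )); do
    chars+=("${foo:i:1}")     # quoted, so the space is kept as its own element
done

printf '[%s]\n' "${chars[@]}"
```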
With sed in the dash shell and LANG=en_US.UTF-8, I got the following to work correctly:
$ echo "你好嗎 新年好。全型句號" | sed -e 's/\(.\)/\1\n/g'
你
好
嗎
新
年
好
。
全
型
句
號
and
$ echo "Hello world" | sed -e 's/\(.\)/\1\n/g'
H
e
l
l
o
w
o
r
l
d
Thus, output can be looped with while read ... ; do ... ; done
Edit: the sample text translates into English as follows.
"你好嗎 新年好。全型句號" is zh_TW.UTF-8 encoding for:
"你好嗎" = How are you[ doing]
" " = a normal space character
"新年好" = Happy new year
"。全型空格" = a double-byte-sized full-stop followed by text description
${#var} returns the length of var
${var:pos:N} returns N characters from pos onwards
Examples:
$ words="abc"
$ echo ${words:0:1}
a
$ echo ${words:1:1}
b
$ echo ${words:2:1}
c
so it is easy to iterate.
another way:
$ grep -o . <<< "abc"
a
b
c
or
$ grep -o . <<< "abc" | while read letter; do echo "my letter is $letter" ; done
my letter is a
my letter is b
my letter is c
I'm surprised no one has mentioned the obvious bash solution utilizing only while and read.
while read -n1 character; do
echo "$character"
done < <(echo -n "$words")
Note the use of echo -n to avoid the extraneous newline at the end. printf is another good option and may be more suitable for your particular needs. If you want to ignore whitespace then replace "$words" with "${words// /}".
Another option is fold. Please note however that it should never be fed into a for loop. Rather, use a while loop as follows:
while read char; do
echo "$char"
done < <(fold -w1 <<<"$words")
The primary benefit to using the external fold command (of the coreutils package) would be brevity. You can feed its output to another command such as xargs (part of the findutils package) as follows:
fold -w1 <<<"$words" | xargs -I% -- echo %
You'll want to replace the echo command used in the example above with the command you'd like to run against each character. Note that xargs will discard whitespace by default. You can use -d '\n' to disable that behavior.
Internationalization
I just tested fold with some of the Asian characters and realized it doesn't have Unicode support. So while it is fine for ASCII needs, it won't work for everyone. In that case there are some alternatives.
I'd probably replace fold -w1 with an awk array:
awk 'BEGIN{FS=""} {for (i=1;i<=NF;i++) print $i}'
Or the grep command mentioned in another answer:
grep -o .
Performance
FYI, I benchmarked the aforementioned options. The while loop and the fold loop were fast, nearly tying, with the while loop slightly ahead in the run shown here. Unsurprisingly, xargs was the slowest: about 75x slower.
Here is the (abbreviated) test code:
words=$(python -c 'from string import ascii_letters as l; print(l * 100)')
testrunner(){
for test in test_while_loop test_fold_loop test_fold_xargs test_awk_loop test_grep_loop; do
echo "$test"
(time for (( i=1; i<$((${1:-100} + 1)); i++ )); do "$test"; done >/dev/null) 2>&1 | sed '/^$/d'
echo
done
}
testrunner 100
Here are the results:
test_while_loop
real 0m5.821s
user 0m5.322s
sys 0m0.526s
test_fold_loop
real 0m6.051s
user 0m5.260s
sys 0m0.822s
test_fold_xargs
real 7m13.444s
user 0m24.531s
sys 6m44.704s
test_awk_loop
real 0m6.507s
user 0m5.858s
sys 0m0.788s
test_grep_loop
real 0m6.179s
user 0m5.409s
sys 0m0.921s
I believe there is still no ideal solution that would correctly preserve all whitespace characters and is fast enough, so I'll post my answer. Using ${foo:$i:1} works, but is very slow, which is especially noticeable with large strings, as I will show below.
My idea is an expansion of a method proposed by Six, which involves read -n1, with some changes to keep all characters and work correctly for any string:
while IFS='' read -r -d '' -n 1 char; do
# do something with $char
done < <(printf %s "$string")
How it works:
IFS='' - Redefining the internal field separator to an empty string prevents stripping of spaces and tabs. Doing it on the same line as read means that it will not affect other shell commands.
-r - Means "raw", which prevents read from treating \ at the end of the line as a special line concatenation character.
-d '' - Passing empty string as a delimiter prevents read from stripping newline characters. Actually means that null byte is used as a delimiter. -d '' is equal to -d $'\0'.
-n 1 - Means that one character at a time will be read.
printf %s "$string" - Using printf instead of echo -n is safer, because echo treats -n and -e as options. If you pass "-e" as a string, echo will not print anything.
< <(...) - Passing string to the loop using process substitution. If you use here-strings instead (done <<< "$string"), an extra newline character is appended at the end. Also, passing string through a pipe (printf %s "$string" | while ...) would make the loop run in a subshell, which means all variable operations are local within the loop.
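A quick way to convince yourself that nothing is lost is to rebuild a string that contains a leading space, a tab and a trailing newline. This is only a sketch; the variable names and the sample string are illustrative:

```shell
#!/usr/bin/env bash
string=$' a\tb\n'     # leading space, tab and trailing newline
rebuilt=''
count=0

while IFS='' read -r -d '' -n 1 char; do
    rebuilt+="$char"
    count=$((count + 1))
done < <(printf %s "$string")

if [ "$rebuilt" = "$string" ]; then
    echo "all $count characters preserved"
fi
```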
Now, let's test the performance with a huge string.
I used the following file as a source:
https://www.kernel.org/doc/Documentation/kbuild/makefiles.txt
The following script was called through time command:
#!/bin/bash
# Saving contents of the file into a variable named `string'.
# This is for test purposes only. In real code, you should use
# `done < "filename"' construct if you wish to read from a file.
# Using `string="$(cat makefiles.txt)"' would strip trailing newlines.
IFS='' read -r -d '' string < makefiles.txt
while IFS='' read -r -d '' -n 1 char; do
# remake the string by adding one character at a time
new_string+="$char"
done < <(printf %s "$string")
# confirm that new string is identical to the original
diff -u makefiles.txt <(printf %s "$new_string")
And the result is:
$ time ./test.sh
real 0m1.161s
user 0m1.036s
sys 0m0.116s
As we can see, it is quite fast.
Next, I replaced the loop with one that uses parameter expansion:
for (( i=0 ; i<${#string}; i++ )); do
new_string+="${string:$i:1}"
done
The output shows exactly how bad the performance loss is:
$ time ./test.sh
real 2m38.540s
user 2m34.916s
sys 0m3.576s
The exact numbers may vary on different systems, but the overall picture should be similar.
I've only tested this with ascii strings, but you could do something like:
while test -n "$words"; do
c=${words:0:1} # Get the first character
echo character is "'$c'"
words=${words:1} # trim the first character
done
It is also possible to split the string into a character array using fold and then iterate over this array:
for char in `echo "这是一条狗。" | fold -w1`; do
echo $char
done
The C style loop in @chepner's answer is in the shell function update_terminal_cwd, and the grep -o . solution is clever, but I was surprised not to see a solution using seq. Here's mine:
read word
for i in $(seq 1 ${#word}); do
echo "${word:i-1:1}"
done
#!/bin/bash
word=$(echo 'Your Message' |fold -w 1)
for letter in ${word} ; do echo "${letter} is a letter"; done
Here is the output:
Y is a letter
o is a letter
u is a letter
r is a letter
M is a letter
e is a letter
s is a letter
s is a letter
a is a letter
g is a letter
e is a letter
To iterate over ASCII characters in a POSIX-compliant shell, you can avoid external tools by using parameter expansion:
#!/bin/sh
str="Hello World!"
while [ ${#str} -gt 0 ]; do
next=${str#?}
echo "${str%"$next"}"
str=$next
done
or
str="Hello World!"
while [ -n "$str" ]; do
next=${str#?}
echo "${str%"$next"}"
str=$next
done
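The same loop can be wrapped in a function. In this sketch, each_char is a hypothetical name, and the inner expansion is quoted so that characters such as * in the string are treated literally rather than as patterns:

```shell
#!/bin/sh
# each_char is a hypothetical helper: prints one character of $1 per line.
each_char() {
    rest=$1
    while [ -n "$rest" ]; do
        tail=${rest#?}                    # everything after the first character
        printf '%s\n' "${rest%"$tail"}"   # strip that literal suffix, leaving one character
        rest=$tail
    done
}

each_char 'a*b'
```

printf is used instead of echo so that a character like "-" or a backslash is printed as-is.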
sed works with unicode
IFS=$'\n'
for z in $(sed 's/./&\n/g' <(printf '你好嗎')); do
echo hello: "$z"
done
outputs
hello: 你
hello: 好
hello: 嗎
Another approach, if you don't care about whitespace being ignored:
for char in $(sed -E s/'(.)'/'\1 '/g <<<"$your_string"); do
# Handle $char here
done
Another way is:
Characters="TESTING"
index=1
while [ $index -le ${#Characters} ]
do
echo ${Characters} | cut -c${index}-${index}
index=$(expr $index + 1)
done
fold and while read are great for the job as shown in some answers here. Contrary to those answers, I think it's much more intuitive to pipe in the order of execution:
echo "asdfg" | fold -w 1 | while read c; do
echo -n "$c "
done
Outputs: a s d f g
I share my solution:
read word
for char in $(grep -o . <<<"$word") ; do
echo $char
done
TEXT="hello world"
for i in {1..${#TEXT}}; do
echo ${TEXT[i]}
done
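where {1..N} is an inclusive range,
${#TEXT} is the number of characters in the string, and
${TEXT[i]} gets a character from the string like an item from an array.
Note that this is zsh syntax. In bash, brace expansion happens before parameter expansion, so {1..${#TEXT}} does not produce a range, and a string cannot be indexed like an array.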
I have a text file test.txt with the following content:
text1
text2
And I want to assign the content of the file to a UNIX variable, but when I do this:
testvar=$(cat test.txt)
echo $testvar
the result is:
text1 text2
instead of
text1
text2
Can someone suggest me a solution for this?
The assignment does not remove the newline characters; it's actually the echo doing this. You simply need to put quotes around the string to maintain those newlines:
echo "$testvar"
This will give the result you want. See the following transcript for a demo:
pax> cat num1.txt ; x=$(cat num1.txt)
line 1
line 2
pax> echo $x ; echo '===' ; echo "$x"
line 1 line 2
===
line 1
line 2
The reason why newlines are replaced with spaces is not entirely to do with the echo command, rather it's a combination of things.
When given a command line, bash splits it into words according to the documentation for the IFS variable:
IFS: The Internal Field Separator that is used for word splitting after expansion ... the default value is <space><tab><newline>.
That specifies that, by default, any of those three characters can be used to split your command into individual words. After that, the word separators are gone, all you have left is a list of words.
Combine that with the echo documentation (a bash internal command), and you'll see why the spaces are output:
echo [-neE] [arg ...]: Output the args, separated by spaces, followed by a newline.
When you use echo "$x", it forces the entire x variable to be a single word according to bash, hence it's not split. You can see that with:
pax> function count {
...> echo $#
...> }
pax> count 1 2 3
3
pax> count a b c d
4
pax> count $x
4
pax> count "$x"
1
Here, the count function simply prints out the number of arguments given. The 1 2 3 and a b c d variants show it in action.
Then we try it with the two variations on the x variable. The one without quotes shows that there are four words, "line", "1", "line" and "2". Adding the quotes makes it one single word, "line 1\nline 2".
This is due to the IFS (Internal Field Separator) variable, which contains a newline.
$ cat xx1
1
2
$ A=`cat xx1`
$ echo $A
1 2
$ echo "|$IFS|"
|
|
A workaround is to reset IFS to not contain the newline, temporarily:
$ IFSBAK=$IFS
$ IFS=" "
$ A=`cat xx1` # Can use $() as well
$ echo $A
1
2
$ IFS=$IFSBAK
To REVERT this horrible change for IFS:
IFS=$IFSBAK
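If saving and restoring IFS by hand feels fragile, one option is to confine the change to a function with local. A sketch under that assumption; count_lines is a hypothetical helper, and it assumes globbing was enabled before the call:

```shell
#!/usr/bin/env bash
# count_lines is a hypothetical helper: counts newline-separated fields in $1.
count_lines() {
    local IFS=$'\n'   # visible only inside this function
    set -f            # keep * and ? in the data from being expanded as globs
    set -- $1         # unquoted on purpose: split on newlines only
    set +f            # assumes globbing was on before the call
    echo "$#"
}

count_lines $'a b c\nd'
printf '%q\n' "$IFS"  # the global IFS is unchanged
```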
Bash 4 and later have the mapfile builtin, which reads lines from standard input into an array variable.
help mapfile
mapfile < file.txt lines
printf "%s" "${lines[@]}"
mapfile -t < file.txt lines # strip trailing newlines
printf "%s\n" "${lines[@]}"
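A small self-contained sketch of the same idea, assuming bash 4 or later. The temporary file merely stands in for test.txt:

```shell
#!/usr/bin/env bash
tmp=$(mktemp)                       # stand-in for test.txt
printf 'text1\ntext2\n' > "$tmp"

mapfile -t lines < "$tmp"           # -t drops the newline from each element
rm -f "$tmp"

echo "read ${#lines[@]} lines"
printf '%s\n' "${lines[@]}"         # one line per element, newlines restored
```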
See also:
http://bash-hackers.org/wiki/doku.php/commands/builtin/mapfile
Your variable is set correctly by testvar=$(cat test.txt). To display this variable, which contains newline characters, simply add double quotes, e.g.
echo "$testvar"
Here is the full example:
$ printf "text1\ntext2" > test.txt
$ testvar=$(<test.txt)
$ grep testvar <(set)
testvar=$'text1\ntext2'
$ echo "$testvar"
text1
text2
$ printf "%b" "$testvar"
text1
text2
Just if someone is interested in another option:
content=( $(cat test.txt) )
a=0
# -lt, not -le: valid indexes run from 0 to length-1
while [ "$a" -lt "${#content[@]}" ]
do
    echo "${content[$a]}"
    a=$((a+1))
done
The envdir utility provides an easy way to do this. envdir uses files to represent environment variables, with file names mapping to env var names, and file contents mapping to env var values. If the file contents contain newlines, so will the env var.
See https://pypi.python.org/pypi/envdir