Convert range to string - bash

If I run the
echo {0..9}
command, then I get the following output:
0 1 2 3 4 5 6 7 8 9
Can I somehow put the string "0 1 2 3 4 5 6 7 8 9" into a variable inside bash script? I only found a way using echo:
x=`echo {0..9}`
But this method implies the execution of an external program. Is it possible to somehow manage only with bash?
I'm interested not only in converting a range to a string, but also in concatenating each element with a string, for example:
datafiles=`echo data{0..9}.txt`

First of all,
x=`echo {0..9}`
doesn't call an external program (echo is a built-in), but it does create a subshell. If that isn't desired, you can use printf (also a built-in) with the -v option:
printf -v x ' %s' {0..9}
x=${x:1} # strip off the leading space
or
printf -v datafiles ' data%s.txt' {0..9}
datafiles=${datafiles:1}
or you may want to store them in an array:
datafiles=(data{0..9}.txt)
echo "${datafiles[@]}"
This last method will work correctly even if filenames contain whitespace characters:
datafiles=(data\ {0..9}\ .txt)
printf '%s\n' "${datafiles[@]}"
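If the space-separated string is still needed after building the array, one possible sketch is to join the array in the current shell. The variable name joined is only illustrative, and the result assumes IFS still has its default value:

```shell
#!/usr/bin/env bash
# Build the list as an array first; brace expansion runs in the current shell.
datafiles=(data{0..9}.txt)

# "${datafiles[*]}" joins the elements with the first character of IFS,
# which is a space unless IFS has been changed.
joined="${datafiles[*]}"

printf '%s\n' "$joined"
```

This avoids both the subshell and the need to strip a leading space, at the cost of keeping the array around.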

Related

printf returns multiple copies

On OSX High Sierra, bash's printf seems to behave erroneously. Consider:
printf "[%s]" "x"
returns
[x]
all good... but:
printf "[%s]" "x" "y"
returns
[x][y]
instead of just [x] !!
don't tell me: don't provide more parameters. I don't know what the format will look like as it's passed to me, but I have parameters
the docs don't address this squarely, merely stating:
The format string is reused as often as necessary to satisfy the arguments.
Any extra format specifications are evaluated with zero or the null string.
is this broken?
From posix utilities printf:
The format operand shall be reused as often as necessary to satisfy the argument operands.
This means exactly that the format string is repeated as many times as needed to consume all the arguments. This is how it was intended to work, and it is one of the most useful features of printf.
You want to repeat a character '#' 10 times? Nothing simpler:
printf "#%.0s" $(seq 10)
# will expand to:
printf "#%.0s" 1 2 3 4 5 6 7 8 9 10
# is equivalent to:
printf "#%.0s#%.0s#%.0s#%.0s#%.0s#%.0s#%.0s#%.0s#%.0s#%.0s" 1 2 3 4 5 6 7 8 9 10
The %.0s prints zero characters of its string argument, that is, nothing at all. Thus the # is printed once for each argument.
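The same trick can be sketched without seq by letting brace expansion supply the arguments and capturing the result with printf -v. The variable name rule is only illustrative:

```shell
#!/usr/bin/env bash
# '%.0s' consumes one argument and prints none of it, so only the literal '#'
# is emitted, once per argument.
printf -v rule '#%.0s' {1..10}
printf '%s\n' "$rule"
```

Note that the count here is fixed by the literal brace range; a count held in a variable would need seq or a loop instead.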
You have an array and want to print all array members separated with a newline? Nothing simpler:
arr=(1 2 3 value1 test5 text7)
printf "%s\n" "${arr[@]}"
From my understanding, it is behaving as stated in this sentence of the documentation:
The format string is reused as often as necessary to satisfy the arguments.
In your case, you have 2 arguments ("x" and "y") and just 1 format specifier ([%s]), so the format is reused (i.e. the same format is applied to each argument).
It iterates over the argument list, and when it reaches the end of the format string, it starts again from the beginning:
The command:
printf "[%s](%s)" "x" "y" "z" "a"
Outputs:
[x](y)[z](a)
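The other half of the quoted documentation sentence (extra format specifications are evaluated with the null string) can be seen with an odd number of arguments. A small sketch, with out as an illustrative variable name:

```shell
#!/usr/bin/env bash
# Two specifiers, three arguments: the format is applied twice, and the
# missing fourth argument is treated as an empty string.
out=$(printf '[%s](%s)\n' x y z)
printf '%s\n' "$out"
```

The second pass prints [z]() because there is no argument left for the second %s.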

Paste files conditionally with bash if and awk loop

I have a list of files that I want to paste to a master file (bar) if some awk condition is fulfilled.
for foo in *;
do
if awk '*condition* {exit 1}' $foo
then
:
else
paste $foo > bar
fi
done
However, it looks like only the last pasted file is in bar. Shouldn't paste add new columns to bar every time, without overwriting all the data completely?
File1    File2    Expected_Output    Actual_Output
1 4      1 NaN    1 4 1 NaN          1 NaN
2 5      2 7      2 5 2 7            2 7
3 6      3 8      3 6 3 8            3 8
Your paste command overwrites the file bar on every iteration of the loop (the > redirection truncates it each time), which is why you end up with only the last file.
declare -a FILES=()
for foo in *;
do
if awk '*condition* {exit 1}' $foo
then
:
else
FILES+=("$foo")
fi
done
paste "${FILES[@]}" > bar
This code accumulates all filenames that match your condition in an array named FILES, and calls paste only once, expanding all filenames into individual, quoted arguments (this is what "${FILES[@]}" does) and redirecting output to the bar file.
Additionally, you can replace the whole if/then/else block with :
awk '*condition* {exit 1}' "$foo" || FILES+=("$foo")
The || expresses a condition, and because of Bash performing a lazy evaluation of logical operators, the statement to the right is only executed if awk returns a non-zero return code.
Please note that I quoted "$foo" (when passing it to awk) in case your file names contain special characters.
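To try the accumulate-then-paste approach end to end, here is a self-contained sketch. The file names, their contents, and the awk condition (first line is numeric) are all made up for illustration, since the original condition is not shown:

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so the glob only sees the sample files.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

printf '1\n2\n3\n' > a.txt
printf '4\n5\n6\n' > b.txt
printf 'x\ny\nz\n' > c.txt

files=()
for f in *.txt; do
    # Hypothetical condition: awk exits with status 1 when the first line
    # is a plain number, and only then is the file added to the list.
    awk 'NR == 1 && /^[0-9]+$/ {exit 1}' "$f" || files+=("$f")
done

# A single paste call puts each selected file in its own column.
paste "${files[@]}" > bar
cat bar
```

Here a.txt and b.txt are selected and c.txt is skipped, so bar ends up with two tab-separated columns.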

How to delete leading newline in a string in bash?

I'm having the following issue. I have an array of numbers:
text="\n1\t2\t3\t4\t5\n6\t7\t8\t9\t0"
And I'd like to delete the leading newline.
I've tried
sed 's/.//' <<< "$text"
cut -c 1- <<< "$text"
and some iterations. But the issue is that both of those delete the first character AFTER EVERY newline. Resulting in this:
text="\n\t2\t3\t4\t5\n\t7\t8\t9\t0"
This is not what I want and there doesn't seem to be an answer to this case.
Is there a way to tell either of those commands to treat newlines like characters and the entire string as one entity?
awk to the rescue!
awk 'NR>1'
of course you can do the same with tail -n +2 or sed 1d as well.
You can probably use the substitution modifier (see parameter expansion and ANSI C quoting in the Bash manual):
$ text=$'\n1\t2\t3\t4\t5\n6\t7\t8\t9\t0'
$ echo "$text"

1 2 3 4 5
6 7 8 9 0
$ echo "${text/$'\n'/}"
1 2 3 4 5
6 7 8 9 0
$
It replaces the first newline with nothing, as requested. However, note that it is not anchored to the first character:
$ alt="${text/$'\n'/}"
$ echo "${alt/$'\n'/}"
1 2 3 4 56 7 8 9 0
$
Using a caret ^ before the newline doesn't help — it just means there's no match.
As pointed out by rici in the comments, if you read the manual page I referenced, you can find how to anchor the pattern at the start with a # prefix:
$ echo "${text/#$'\n'/}"
1 2 3 4 5
6 7 8 9 0
$ echo "${alt/#$'\n'/}"
1 2 3 4 5
6 7 8 9 0
$
The notation bears no obvious resemblance to other regex systems; you just have to know it.
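A closely related form is prefix removal with a single #, which is anchored at the start by definition. The sketch below compares the two; the shortened sample string and the variable names nl, a, b and c are illustrative:

```shell
#!/usr/bin/env bash
nl=$'\n'
text=$'\n1\t2\t3\n4\t5\t6'

# Anchored substitution, as in the answer above.
a=${text/#$nl/}

# Prefix removal: strips the pattern only if the value starts with it.
b=${text#$nl}

# Applying it a second time changes nothing, because the value no longer
# starts with a newline.
c=${b#$nl}

printf '%s\n' "$b"
```

Both forms leave the embedded newline between the two rows untouched.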

for loop control in bash using a string

I want to use a string to control a for loop in bash. My first test code produces what I would expect and what I want:
$ aa='1 2 3 4'
$ for ii in $aa; do echo $ii; done
1
2
3
4
I'd like to use something like the following instead. This doesn't give the output I'd like (I can see why it does what it does).
$ aa='1..4'
$ for ii in $aa; do echo $ii; done
1..4
Any suggestions on how I should modify the second example to give the same output as the first?
Thanks in advance for any thoughts. I'm slowly learning bash but still have a lot to learn.
Mike
The notation could be written out as:
for ii in {1..4}; do echo "$ii"; done
but the {1..4} needs to be written out literally like that, with no variables involved and not as the result of variable substitution. This is brace expansion (see the Bash manual), and it is performed before variable expansion and the other expansions. You'll probably be best off using:
for ii in $(seq 1 4); do echo "$ii"; done
where either the 1 or the 4 or both can be shell variables.
You could use seq command (see man seq).
$ aa='1 4'
$ for ii in $(seq $aa); do echo $ii; done
Bash won't do brace expansion with variables, but you can use eval:
$ aa='1..4'
$ for ii in $(eval echo {$aa}); do echo $ii; done
1
2
3
4
You could also split aa into an array:
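IFS=. arr=($aa)   # "1..4" splits into "1", "" and "4"
for ((ii=arr[0]; ii<=arr[2]; ii++)); do echo $ii; done
Note that each character in IFS acts as a separate delimiter, so the two dots of the range produce an empty field at index 1 and place the numbers at indexes 0 and 2. The assignment also leaves IFS set to a dot for the rest of the script, so restore it afterwards if later code relies on the default.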
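The same split can be sketched with parameter expansion, which leaves IFS alone. This assumes the string always has the form start..end with integer endpoints; the names start, end and out are illustrative:

```shell
#!/usr/bin/env bash
aa='1..4'

# Split the "start..end" string without touching IFS.
start=${aa%%..*}   # drop the first ".." and everything after it
end=${aa##*..}     # drop everything up to and including the last ".."

out=()
for ((ii = start; ii <= end; ii++)); do
    out+=("$ii")
done
printf '%s\n' "${out[@]}"
```

No validation is done here, so a malformed string would need to be checked before the loop.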
Note: there are certainly more elegant ways of doing this, such as Ben Grimm's answer, and this is not pure bash, since it uses seq and awk.
One way of achieving this is by calling seq. It would be trivial if you knew the numbers in the string beforehand, as no conversion would be needed: you could simply do seq 1 4, or seq $a $b for that matter.
I assume, however, that your input is indeed a string in the format you mentioned, that is, 1..4 or 20..100. In that case you could convert the string into 2 numbers and use them as parameters for seq.
One of possibly many ways of achieving this is:
$ `echo "1..4" | sed -e 's/\.\./ /g' | awk '{print "seq", $1, $2}'`
1
2
3
4
Note that this works the same way for any input in the given format. If desired, sed can be replaced by tr with similar results:
$ x="10..15"
$ `echo $x | tr "." " " | awk '{print "seq", $1, $2}'`
10
11
12
13
14
15

setting awk variables through inlining

I've got this:
./awktest -v fields=`cat testfile`
which ought to set fields variable to '1 2 3 4 5' which is all that testfile contains
It returns:
gawk: ./awktest:9: fatal: cannot open file `2' for reading (No such file or directory)
When I do this it works fine.
./awktest -v fields='1 2 3 4 5'
printing fields at the time of error yields:
1
printing fields in the second instance yields:
1 2 3 4 5
When I try it with 12345 instead of 1 2 3 4 5 it works fine for both, so it's a problem with the white space. What is this problem? And how do I fix it.
This is most likely not an awk question. Most likely, it is your shell that is the culprit.
For example, if awktest is:
#!/bin/bash
i=1
for arg in "$@"; do
printf "%d\t%s\n" $i "$arg"
((i++))
done
Then you get:
$ ./awktest -v fields=`cat testfile`
1 -v
2 fields=1
3 2
4 3
5 4
6 5
You see that the file contents are not being handled as a single word.
Simple solution: use double quotes on the command line:
$ ./awktest -v fields="$(< testfile)"
1 -v
2 fields=1 2 3 4 5
The $(< file) construct is a bash shortcut for `cat file` that does not need to spawn an external process.
Or, read the first line of the file in the awk BEGIN block
awk '
BEGIN {getline fields < "testfile"}
rest of awk program ...
'
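The effect of the quoting can be checked with a short, self-contained sketch. It uses an inline awk program instead of the awktest script, and the temporary directory and the variable names value and count are only for illustration:

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf '1 2 3 4 5\n' > "$workdir/testfile"

# Quoted: the whole file content reaches awk as the value of one variable.
value=$(awk -v fields="$(< "$workdir/testfile")" 'BEGIN { print fields }')

# Inside awk the value can then be split into its individual numbers.
count=$(awk -v fields="$(< "$workdir/testfile")" 'BEGIN { print split(fields, parts, " ") }')

printf '%s\n%s\n' "$value" "$count"
```

Without the double quotes around the substitution, the shell would pass 2, 3, 4 and 5 as separate arguments, which awk then treats as input file names.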
./awktest -v fields="`cat testfile`"
#note that:
#./awktest -v fields='`cat testfile`'
#does not work

Resources