Bash array containing output from 'find' function is incorrectly structured - bash

I am trying to create an array in bash that contains filenames for a subset of files stored in a single folder. I want the array to contain only filenames with the common string "zzz", and I want the array to contain one filename per element. I have been trying to use the find function to get filenames containing "zzz", and store the results in myarray.
Here is what I'm doing:
# Define folder containing files
file_dir=./my_files
# Define the common string
pattern="*zzz*"
# Store find output to myarray
readarray -d ' ' -t myarray < <(find ${file_dir} -name ${pattern})
# Print myarray
echo $myarray
Output:
./my_files/abc_zzz_1.nii.gz ./my_files/def_zzz_763.nii.gz ./my_files/ghi_zzz_628.nii.gz
myarray contains the correct filenames, however it does not appear to be structured in a way that allows indexing - I would like to be able to index the nth filename in myarray with ${myarray[n]}, however it seems that the full output from find is stored in a single element. echo ${myarray[0]} prints the same output as above, while echo ${myarray[1]} prints an empty line.
I figured that the whole output from find was being stored as a single string in ${myarray[0]}, so I tried to break the string up using:
read -r -a myarray2 <<< "${myarray[0]}"
...but this did not work as intended, because echo ${myarray2} only returns a single filename.
What am I doing wrong here?
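A likely explanation, offered as a sketch rather than a definitive answer: find separates its results with newlines, so -d ' ' never sees a space delimiter and the whole output lands in element 0. echo $myarray then prints only element 0, with the newlines turned into spaces by word splitting. The example below assumes bash 4.4 or later (for readarray -d '') and a find that supports -print0; the temporary directory and file names are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Throwaway directory with sample files (hypothetical names for this demo)
file_dir=$(mktemp -d)
touch "$file_dir/abc_zzz_1.nii.gz" "$file_dir/def_zzz_763.nii.gz" "$file_dir/other.nii.gz"

pattern='*zzz*'

# -print0 emits NUL-terminated names and -d '' makes readarray split on NUL.
# Quoting "$pattern" keeps the shell from expanding the glob before find sees it.
readarray -d '' -t myarray < <(find "$file_dir" -name "$pattern" -print0)

# "${myarray[@]}" expands to every element; plain $myarray is only element 0
printf '%s\n' "${myarray[@]}"
echo "count: ${#myarray[@]}"

# Clean up the demo directory
rm -rf "$file_dir"
```

If NUL delimiters are not available, readarray -t myarray < <(find ...) with no -d option splits on newlines, which is enough as long as no filename contains a newline.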

Related

How to concatenate string to comma-separated element in bash

I am new to Bash coding. I would like to concatenate a string to each element of a comma-separated strings "array".
This is an example of what I have in mind:
s=a,b,c
# Here a function to concatenate the string "_string" to each of them.
# Expected result:
a_string,b_string,c_string
One way:
$ s=a,b,c
$ echo ${s//,/_string,}_string
a_string,b_string,c_string
Using a proper array is generally a much more robust solution. It allows the values to contain literal commas, whitespace, etc.
s=(a b c)
printf '%s\n' "${s[@]/%/_string}"
As suggested by chepner, you can use IFS="," to merge the result with commas.
(IFS=","; echo "${s[*]/%/_string}")
(The subshell is useful to keep the scope of the IFS reassignment from leaking to the current shell.)
Simply, you could use a for loop
main() {
local input='a,b,c'
local append='_string'
# create an 'output' variable that is empty
local output=
# convert the input into an array called 'items' (without the commas)
IFS=',' read -ra items <<< "$input"
# loop over each item in the array, and append whatever string we want, in this case, '_string'
for item in "${items[@]}"; do
output+="${item}${append},"
done
# the loop added a comma after every item, so remove the trailing one to leave commas only _in between_ elements
output=${output%,}
echo "$output"
}
main
I've split it up in three steps:
Make it into an actual array.
Append _string to each element in the array using Parameter expansion.
Turn it back into a scalar (for which I've made a function called turn_array_into_scalar).
#!/bin/bash
function turn_array_into_scalar() {
local -n arr=$1 # -n makes `arr` a reference to the array named by $1 (here `s`)
local IFS=$2 # set the field separator to ,
arr="${arr[*]}" # "join" over IFS and assign it back to `arr`
}
s=a,b,c
# make it into an array by turning , into newline and reading into `s`
readarray -t s < <(tr , '\n' <<< "$s")
# append _string to each string in the array by using parameter expansion
s=( "${s[@]/%/_string}" )
# use the function to make it into a scalar again and join over ,
turn_array_into_scalar s ,
echo "$s"

how to find pattern and insert text in middle using shell script

I would like to add a name in the middle of dirPath
#!/bin/bash
name='agent_name-2'
dirPath='/var/azp/1/s'
I want to insert agent_name-2 after /var/azp in dirPath, and store it in a separate variable result like this
result=/var/azp/agent_name-2/1/s
If /var/azp is a hard coded string (i.e. constant), try:
name='agent_name-2'
dirPath='/var/azp/1/s'
result="/var/azp/$name${dirPath#/var/azp}"
Explanation: ${dirPath#/var/azp} removes the string /var/azp from the beginning of the string $dirPath.
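To make the explanation above concrete, here is a small sketch of how the prefix and suffix removal operators behave on the same value. The extra expansions beyond ${dirPath#/var/azp} are added only for comparison and are not part of the original answer.

```shell
#!/usr/bin/env bash
dirPath='/var/azp/1/s'

# '#' removes the shortest match of the pattern from the front
rest=${dirPath#/var/azp}
echo "$rest"                # /1/s

# If the prefix is not there, the value is left unchanged
echo "${dirPath#/opt}"      # /var/azp/1/s

# '##' removes the longest match from the front (similar to basename)
echo "${dirPath##*/}"       # s

# '%' removes the shortest match from the end (similar to dirname)
echo "${dirPath%/*}"        # /var/azp/1
```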
Try this:
#!/bin/bash
name='agent_name-2'
dirPath='/var/azp/1/s'
Split dirPath by / and store it in the array dirs.
IFS=/ read -r -a dirs <<< "$dirPath"
Calculate the middle of the array.
middle=$(((${#dirs[@]}+1)/2))
Create two new arrays left and right with the left and right half of the dirs array.
left=("${dirs[@]:0:$middle}")
right=("${dirs[@]:$middle}")
Join the left and right half and put the name in between.
result="$(printf "%s/" "${left[@]}" "$name" "${right[@]}")"
Remove the trailing slash.
result=${result%/}
Bash search-replace
You can use Bash's search and replace syntax ${variable//search/replace}.
prefix='/var/azp'
result=${dirPath//$prefix/$prefix\/$name}
# > /var/azp/agent_name-2/1/s
sed s
If $name doesn't contain any special characters, you could inject it into a sed search-replace:
$ sed "s|/var/azp|\0/$name|" <<< "$dirPath"
/var/azp/agent_name-2/1/s
Then for saving the result to a variable, see How do I set a variable to the output of a command in Bash?
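As a sketch of that last step, command substitution captures the sed output in a variable. This version uses & for the matched text, which POSIX sed supports (\0 is a GNU extension), and it assumes $name contains no |, & or backslash.

```shell
#!/usr/bin/env bash
name='agent_name-2'
dirPath='/var/azp/1/s'

# $(...) runs the command and substitutes its standard output;
# '&' in the replacement stands for the text matched by the pattern
result=$(sed "s|/var/azp|&/$name|" <<< "$dirPath")

echo "$result"
```

With the values above this prints /var/azp/agent_name-2/1/s.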

Basic string manipulation from filenames in bash

I have some file names in bash that I have acquired with
$ ones=$(find SRR*pass*1*.fq)
$ echo $ones
SRR6301033_pass_1_trimmed.fq
SRR6301034_pass_1_trimmed.fq
SRR6301037_pass_1_trimmed.fq
...
I then converted into an array so I can iterate over this list and perform some operations with filenames:
# convert to array
$ ones=(${ones// / })
and the iteration:
for i in $ones;
do
fle=$(basename $i)
out=$(echo $fle | grep -Po '(SRR\d*)')
echo "quants/$out.quant"
done
which produces:
quants/SRR6301033
SRR6301034
...
...
SRR6301220
SRR6301221.quant
However I want this:
quants/SRR6301033.quant
quants/SRR6301034.quant
...
...
quants/SRR6301220.quant
quants/SRR6301221.quant
Could somebody explain why what I'm doing doesn't work and how to correct it?
Why do this in such a complicated way? You can drop the unnecessary roundabouts and just use a for loop with built-in parameter expansion.
# Initialize an empty indexed array
array=()
# Loop over files ending with '.fq'. If there are no such files, the
# pattern '*.fq' stays unexpanded, the '-f' test on it fails, and
# 'continue' skips that single iteration, so the array stays empty
for file in *.fq; do
[ -f "$file" ] || continue
# Get the part of filename after the last '/' ( same as basename )
bName="${file##*/}"
# Remove the part after '.' (removing extension)
woExt="${bName%%.*}"
# In the resulting string, remove the part after first '_'
onlyFir="${woExt%%_*}"
# Append the result to the array, prefixing/suffixing strings 'quant'
array+=( quants/"$onlyFir".quant )
done
Now print the array to see the result
for entry in "${array[#]}"; do
printf '%s\n' "$entry"
done
Ways your attempt could fail
With ones=$(find SRR*pass*1*.fq) you are storing the results in a plain variable and not in an array. A plain variable has no way to distinguish whether its contents are a list or a single string separated by whitespace.
With echo $ones, i.e. an unquoted expansion, the string content is subject to word splitting. You might not see a difference as long as your filenames have no spaces, but a filename containing one would be split and its parts treated as separate files.
The part ${ones// / } makes no sense as a way of converting the string to an array: it replaces each space with a space, and the unquoted expansion is still subject to word splitting.
for i in $ones; is error prone for the reasons above: filenames with spaces would be interpreted as separate files instead of one. Also, once ones is an array, $ones expands only to its first element.
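The points above can be sketched as follows: let the glob fill an array directly and iterate over "${ones[@]}". The temporary directory and the sample file names are assumptions made for the demonstration; nullglob is used so that the array is simply empty when nothing matches.

```shell
#!/usr/bin/env bash
# Demo directory with sample files (hypothetical, mirroring the question)
workdir=$(mktemp -d)
touch "$workdir/SRR6301033_pass_1_trimmed.fq" \
      "$workdir/SRR6301034_pass_1_trimmed.fq" \
      "$workdir/notes.txt"

# With nullglob, a pattern that matches nothing expands to nothing
shopt -s nullglob

# The glob expands straight into array elements, one file per element
ones=( "$workdir"/SRR*pass*1*.fq )

results=()
for f in "${ones[@]}"; do
    base=${f##*/}        # strip the directory part
    id=${base%%_*}       # keep everything before the first '_'
    results+=( "quants/$id.quant" )
done

printf '%s\n' "${results[@]}"
rm -rf "$workdir"
```

For the two sample files this prints quants/SRR6301033.quant and quants/SRR6301034.quant, one per line.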

split element of array in multiple array bash

I need to read a file into an array.
Then store in a new array only the first column of each line
example file:
aa,1,2,3
bb,4,5,2
cc,7,1,4
mapfile -t arrFile < file
so arrFile holds all the rows:
${arrFile[0]} returns 'aa,1,2,3'
echo ${arrFile[0]} | cut -d "," -f1 returns 'aa'
How can I copy the first column of arrFile into another array, preferably without a while loop?
Why copy? Perhaps it is enough if you simply use ${arrFile[0]%%,*} ?
Or you can copy, using arr2=("${arrFile[@]%%,*}") (quoted, so that a first column containing spaces stays one element)
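Put together, a minimal sketch might look like this. It writes the example rows to a temporary file (the mktemp path is an assumption for the demo) and applies %%,* to every element at once, so no explicit loop is needed.

```shell
#!/usr/bin/env bash
# Recreate the example file in a temporary location
tmpfile=$(mktemp)
printf '%s\n' 'aa,1,2,3' 'bb,4,5,2' 'cc,7,1,4' > "$tmpfile"

# One array element per line, trailing newlines removed by -t
mapfile -t arrFile < "$tmpfile"

# %%,* deletes the longest suffix starting at a comma, applied per element
arr2=( "${arrFile[@]%%,*}" )

printf '%s\n' "${arr2[@]}"
rm -f "$tmpfile"
```

This prints aa, bb and cc on separate lines.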

Open file with two columns and dynamically create variables

I'm wondering if anyone can help. I've not managed to find much in the way of examples and I'm not sure where to start coding wise either.
I have a file with the following contents...
VarA=/path/to/a
VarB=/path/to/b
VarC=/path/to/c
VarD=description of program
...
The columns are delimited by the '=' and some of the items in the 2nd column may contain gaps as they aren't just paths.
Ideally I'd love to open this in my script once and store the first column as the variable and the second as the value, for example...
echo $VarA
...
/path/to/a
echo $VarB
...
/path/to/b
Is this possible or am I living in a fairy land?
Thanks
You might be able to use the following loop:
while IFS== read -r name value; do
declare "$name=$value"
done < file.txt
Note, though, that a line like foo="3 5" would include the quotes in the value of the variable foo.
A minus sign or any other special character isn't allowed in a shell variable name.
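One way to guard against such names, as a sketch only: check each key against the pattern for valid identifiers before calling declare. The sample file contents, including the deliberately invalid bad-name line, are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sample input with one invalid key (hypothetical data for the demo)
tmpfile=$(mktemp)
printf '%s\n' 'VarA=/path/to/a' 'VarD=description of program' 'bad-name=oops' > "$tmpfile"

# A valid name starts with a letter or underscore, then letters, digits, underscores
valid_name='^[A-Za-z_][A-Za-z0-9_]*$'

skipped=0
while IFS='=' read -r name value; do
    if [[ $name =~ $valid_name ]]; then
        declare "$name=$value"
    else
        echo "skipping invalid name: $name" >&2
        skipped=$((skipped + 1))
    fi
done < "$tmpfile"
rm -f "$tmpfile"

echo "$VarA"
echo "$VarD"
```

Because read assigns the remainder of the line to the last variable, a value that itself contains = is kept whole.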
You may consider using a Bash associative array to store each key and value together:
# declare an associative array
declare -A arr
# read file and populate the associative array
while IFS== read -r k v; do
arr["$k"]="$v"
done < file
# check output of our array
declare -p arr
declare -A arr='([VarA]="/path/to/a" [VarC]="/path/to/c" [VarB]="/path/to/b" [VarD]="description of program" )'
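Once the array is populated, values are looked up by key. The sketch below assumes bash 4 or later (for declare -A) and recreates the example file in a temporary location so it can run on its own.

```shell
#!/usr/bin/env bash
# Recreate the example file (temporary path is an assumption for the demo)
tmpfile=$(mktemp)
printf '%s\n' 'VarA=/path/to/a' 'VarB=/path/to/b' 'VarD=description of program' > "$tmpfile"

declare -A arr
while IFS='=' read -r k v; do
    [[ -n $k ]] || continue      # ignore empty lines
    arr["$k"]=$v
done < "$tmpfile"
rm -f "$tmpfile"

# Look up a single value by its key
echo "${arr[VarA]}"

# "${!arr[@]}" lists the keys; their order is not guaranteed
for key in "${!arr[@]}"; do
    printf '%s -> %s\n' "$key" "${arr[$key]}"
done
```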
What about source my-file? It won't work for values that contain unquoted spaces (such as VarD above), but it will work for simple values like the paths. This is an example:
reut@reut-home:~$ cat src
test=123
test2=abc/def
reut@reut-home:~$ echo $test $test2
reut@reut-home:~$ source src
reut@reut-home:~$ echo $test $test2
123 abc/def
