I have a template script with some analysis, and the only thing that I need to change in it is the case.
#!/bin/bash
CASE=XXX
... the rest of the script where I use $CASE
I created a list of all my cases, which I saved into a file: list.txt.
So my list.txt file may contain cases such as XXX, YYY, ZZZ.
Now I would like to run a loop over the list.txt content, fill my template_script.sh with a case from list.txt, and then save the file with a new name - script_CASE.sh
for case in `cat ./list.txt`;
do
# open template_script.sh
# use somehow the line from template_script.sh (maybe substitute CASE=$case)
# save template_script with a new name script_$case
done
In pure bash:
#!/bin/bash
while IFS= read -r casevalue; do
    escaped=${casevalue//\'/\'\\\'\'} # escape single quotes if any
    while IFS= read -r line; do
        if [[ $line = CASE=* ]]; then
            echo "CASE='$escaped'"
        else
            echo "$line"
        fi
    done < template_script.sh > "script_$casevalue"
done < list.txt
Note that saving to "script_$casevalue" may not work if the case contains a / character.
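If that is a possibility, one option (a sketch; adjust the policy to whatever suits your data) is to skip such entries at the top of the outer loop:
# inside the outer "while IFS= read -r casevalue" loop, before anything else
if [[ $casevalue == */* ]]; then
    printf 'skipping %s: contains a slash\n' "$casevalue" >&2
    continue
fi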
If it is guaranteed that case values (lines in list.txt) needn't be escaped, then using sed is simpler:
while IFS= read -r casevalue; do
sed -E "s/^CASE=(.*)/CASE=$casevalue/" template_script.sh > "script_$casevalue"
done < list.txt
But this approach is fragile and will fail, for instance, if a case value contains a & character. The pure bash version, I believe, is very robust.
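If you would rather keep sed, here is a sketch that pre-escapes the characters which are special on the right-hand side of s/// (backslash, & and the / delimiter); it still assumes the values contain no newlines:
while IFS= read -r casevalue; do
    # escape \, & and the / delimiter so sed treats the value literally
    rhs=$(printf '%s' "$casevalue" | sed 's,[&/\\],\\&,g')
    sed -E "s/^CASE=(.*)/CASE=$rhs/" template_script.sh > "script_$casevalue"
done < list.txt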
Converting my comment to an answer so that the solution is easy to find for future visitors.
You may use this bash script:
while read -r c; do
sed "s/^CASE=.*/CASE=$c/" template_script.sh > "script_${c}.sh"
done < list.txt
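So with list.txt containing XXX, YYY and ZZZ you end up with script_XXX.sh, script_YYY.sh and script_ZZZ.sh; if the generated scripts should be directly executable, you may also want:
chmod +x script_*.sh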
I have a list of files stored in a text file, and if a Python file is found in that list, I want to run the corresponding test file using Pytest.
My file looks like this:
/folder1/file1.txt
/folder1/file2.jpg
/folder1/file3.md
/folder1/file4.py
/folder1/folder2/file5.py
When the 4th/5th files are found, I want to run the pytest command like:
pytest /folder1/test_file4.py
pytest /folder1/folder2/test_file5.py
Currently, I am using this command:
cat /workspace/filelist.txt | while read line; do if [[ $$line == *.py ]]; then exec "pytest test_$${line}"; fi; done;
which is not working correctly, as I have file paths in the text as well. Any idea how to implement this?
Using Bash's variable substring removal to add the test_ prefix. One-liner:
$ while read line; do if [[ $line == *.py ]]; then echo "pytest ${line%/*}/test_${line##*/}"; fi; done < file
In more readable form:
while read line
do
    if [[ $line == *.py ]]
    then
        echo "pytest ${line%/*}/test_${line##*/}"
    fi
done < file
Output:
pytest /folder1/test_file4.py
pytest /folder1/folder2/test_file5.py
Don't know anything about Google Cloud Build so I'll let you experiment with the double dollar signs.
Update:
In case there are files already with the test_ prefix, use this bash script that utilizes extglob in the variable substring removal:
shopt -s extglob # notice
while read line
do
    if [[ $line == *.py ]]
    then
        echo "pytest ${line%/*}/test_${line##*/?(test_)}" # notice
    fi
done < file
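For example, if the list also contained a hypothetical /folder1/test_file6.py, the output would be:
pytest /folder1/test_file4.py
pytest /folder1/folder2/test_file5.py
pytest /folder1/test_file6.py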
You can easily refactor all your conditions into a simple sed script. This also gets rid of the useless cat and the similarly useless exec.
sed -n 's%[^/]*\.py$%test_&%p' /workspace/filelist.txt |
xargs -n 1 pytest
The regular expression matches anything after the last slash, which means the entire line if there is no slash; we include the .py suffix to make sure this only matches those files.
The pipe to xargs is a common way to convert standard input into command-line arguments. The -n 1 says to pass one argument at a time, rather than as many as possible. (Maybe pytest allows you to specify many tests; then, you can take out the -n 1 and let xargs pass in as many as it can fit.)
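For instance, if your pytest does accept several paths at once, the pipeline shortens to:
sed -n 's%[^/]*\.py$%test_&%p' /workspace/filelist.txt | xargs pytest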
If you want to avoid adding the test_ prefix to files which already have it, one solution is to break up the sed script into two separate actions:
sed -n 's%test_[^/]*\.py$%&%p;t;s%[^/]*\.py$%test_&%p' /workspace/filelist.txt |
xargs -n 1 pytest
The first p simply prints the matches verbatim; the t says if that matched, skip the rest of the script for this input.
(MacOS / BSD sed will want a newline instead of a semicolon after the t command.)
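That is, on macOS/BSD sed the same script would be written roughly as:
sed -n 's%test_[^/]*\.py$%&%p
t
s%[^/]*\.py$%test_&%p' /workspace/filelist.txt |
xargs -n 1 pytest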
sed is arguably a bit of a read-only language; this is already pressing towards the boundary where perhaps you would rewrite this in Awk instead.
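For what it's worth, a rough awk sketch of the same logic (an untested equivalent, not the original answer's code) might be:
awk -F/ -v OFS=/ '$NF ~ /\.py$/ {
    if ($NF !~ /^test_/) $NF = "test_" $NF   # prefix the basename unless already prefixed
    print
}' /workspace/filelist.txt |
xargs -n 1 pytest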
You may want to focus on lines that end with the ".py" string.
You can achieve that using grep combined with a regex so you can figure out if a line ends with .py - that eliminates the if statement.
IFS=$'\n'
for file in $(cat /workspace/filelist.txt | grep '\.py$'); do pytest "$file"; done
I am trying to change my code to read a .txt file from a directory. That directory contains only one .txt file, but I do not know its name beforehand. I only know it is a .txt file. Is it possible to do that using a shell script?
Below is my current code to read a file but I have to manually specify the file name.
#!/bin/bash
declare -a var
filename='file.txt'
let count=0
while read line; do
var[$count]=$line
((count++))
done < $filename
If you are 100% sure that there is only one matching file just replace:
done < $filename
by:
done < *.txt
Of course this will fail if you have zero or more than one matching file. So, it would be better to test first. For instance with:
tmp=$(shopt -p nullglob || true)
shopt -s nullglob
declare -a filename=(*.txt)
if (( ${#filename[@]} != 1 )); then
    printf 'error: zero or more than one *.txt file\n'
else
    declare -a var
    let count=0
    while read line; do
        var[$count]=$line
        ((count++))
    done < "${filename[0]}"
fi
eval "$tmp"
The shopt stuff stores the current status of the nullglob option in the variable tmp, enables the option, and restores the initial status at the end. Enabling nullglob is needed here if there is a risk that you have zero *.txt files. Without nullglob, that would store the literal string *.txt in the array filename.
Your loop could be optimized a bit:
declare -a var
while IFS= read -r line; do
var+=("$line")
done < "${filename[0]}"
IFS= is needed to preserve leading and trailing spaces in the read lines. Remove it if this is not what you want. The -r option of read preserves backslashes in the read line. Remove it if this is not what you want. The += assignment automatically adds an element at the end of an indexed array. No need to count. If you want to know how many elements your array contains just use ${#var[@]}, as we did to test the length of the filename array.
Note that storing the content of a text file in a bash array is better done with mapfile if your bash version is recent enough. See this other question, for instance.
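A minimal sketch of that (bash 4+), replacing the whole loop above:
mapfile -t var < "${filename[0]}"      # -t drops the trailing newline of each line
printf 'read %d lines\n' "${#var[@]}"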
Pathname expansion does not happen in a plain assignment, but you can trigger it via command substitution:
filename=$(echo -n *.txt)
If you want to make it foolproof (catching the case that the number of matching files is not 1), assign to an array and check its size:
farr=( *.txt )
if (( ${#farr[*]} != 1 ))
then
    echo "You lied to me when you said that there is exactly one .txt file"
else
    filename=${farr[0]}
fi
You can use this. The head command is used in this context to ensure one result.
filename=$(ls *.txt | head -n 1)
I am working in a directory with file names ending with fastq.gz. Using a loop like the following, I will be running a tool.
for i in `ls`; do if [[ "$i" == *".gz" ]]; then bwa aln ../hg38.fa $i > $i | sed 's/fastq.gz/sai/g'; fi; done
My question is, I want my output filename to end with .sai instead of fastq.gz while keeping the rest of the filename the same. Yet, as it first sees $i after >, it modifies the input file itself. I tried using it like <($i | sed 's/fastq.gz/sai/g') but that does not work either. What is the right way of writing this?
You can use parameter expansion (suffix removal) to compute the output filename.
Moreover, you shouldn't rely on the ls output but loop directly over the glob pattern you are looking for.
for file in *.fastq.gz; do
    name="${file%.fastq.gz}"      # strip the .fastq.gz suffix
    file_output="${name}.sai"
    bwa aln ../hg38.fa "${file}" > "${file_output}"
done
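To illustrate the two expansions with a hypothetical name:
file=sample1.fastq.gz        # hypothetical input
name="${file%.fastq.gz}"     # sample1
echo "${name}.sai"           # prints sample1.sai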
The idea is that I want to read any .txt file in a specific folder and do something. So I tried this code:
#!/bin/bash
#Read the file line by line
while read line
do
if [ $i -ne 0 ]; then
#do something...
fi
done < "*.txt"
echo "Finished!"
I think you got my idea now. Thanks for any advice.
After doing some stuff, I want to move the file to another folder.
Not sure what $i is in your if statement, but you can read all the .txt files in a dir line by line like this:
while read line; do
# your code here, eg
echo "$line"
done < <(cat *.txt)
For a "specific directory" (ie not the directory you are currently in):
DIR=/example/dir
while read line; do
# your code here, eg
echo "$line"
done < <(cat "$DIR"/*.txt)
To avoid using cat unnecessarily, you could use a for loop:
for file in *.txt
do
    while read line
    do
        # whatever
    done < "$file"
    mv -i "$file" /some/other/place
done
This treats each file separately so you can perform actions on each one individually. If you wanted to move all the files to the same place, you could do that outside the loop:
for file in *.txt
do
    while read line
    do
        # whatever
    done < "$file"
done
mv -i *.txt /some/other/place
As suggested in the comments, I have added the -i switch to mv, which prompts before overwriting files. This is probably a good idea, especially when you are expanding a * wildcard. If you would rather not be prompted, you could instead use the -n switch which will not overwrite any files.
I am trying to run a simple bash script but I am struggling with how to incorporate a condition. Any pointers? I would like to incorporate a condition such that when gdalinfo cannot open the image, it copies that particular file to another location. The loop
for file in `cat path.txt`; do gdalinfo $file;done
works fine in opening the images and also shows which ones cannot be opened.
The wrong code is:
for file in `cat path.txt`; do gdalinfo $file && echo $file; else cp $file /data/temp
Again, and again and again - a zillionth again...
Don't use constructions like
for file in `cat path.txt`
or
for file in `find .....`
for file in `any command what produces filenames`
Because the code will BREAK immediately when the filename or path contains a space. Never use it with any command that produces filenames. Bad practice. Very Bad. It is incorrect, mistaken, erroneous, inaccurate, inexact, imprecise, faulty, WRONG.
The correct form is:
for file in some/* # if you want/can use filenames directly from the filesystem
or
find . -print0 | while IFS= read -r -d '' file
or (if you are sure that no filename contains a newline) you can use
cat path.txt | while read -r file
but here the cat is useless (really - a command that only copies a file to STDOUT is useless). You should use instead
while read -r file
do
#whatever
done < path.txt
It is faster (it doesn't fork a new process, as happens with every pipe).
The above whiles will fill the correct filename into the variable file even when the filename contains a space. The for will not. Period. Uff. Omg.
And use "$variable_with_filename" instead of plain $variable_with_filename for the same reason. If the filename contains whitespace, any command will misunderstand it as two filenames. That is probably not what you want either.
So, enclose any shell variable that contains a filename in double quotes (not only filenames, but anything that can contain a space). "$variable" is correct.
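A tiny illustration with a hypothetical filename:
file="my file.txt"   # hypothetical name containing a space
ls "$file"           # one argument: my file.txt
ls $file             # two arguments: "my" and "file.txt"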
If I understand right, you want to copy files to /data/temp when gdalinfo returns an error.
while read -r file
do
gdalinfo "$file" || cp "$file" /data/temp
done < path.txt
Nice, short and safe (at least if your path.txt really contains one filename per line).
And maybe you want to use your script more than once, so don't put the filename inside it, but save the script in the form
while read -r file
do
gdalinfo "$file" || cp "$file" /data/temp
done
and use it like:
mygdalinfo < path.txt
more universal...
and maybe you want to only show the filenames for which gdalinfo returns an error
while read -r file
do
gdalinfo "$file" || printf "$file\n"
done
and if you change the printf '%s\n' "$file" to printf '%s\0' "$file" you can use the script in a pipe safely, so:
while read -r file
do
gdalinfo "$file" || printf "$file\0"
done
and use it for example as:
mygdalinfo < path.txt | xargs -0 -J% mv % /tmp/somewhere
Howgh.
You can say:
for file in `cat path.txt`; do gdalinfo $file || cp $file /data/temp; done
This would copy the file to /data/temp if gdalinfo cannot open the image.
If you want to print the filename in addition to copying it in case of failure, say:
for file in `cat path.txt`; do gdalinfo $file || (echo $file && cp $file /data/temp); done