Bash script to stdout stuck with redirect - bash

My bash script is the following:
#!/bin/bash
if [ ! -f "$1" ]; then
exit
fi
while read line;do
str1="[GAC]*T"
num=$"(echo $line | tr -d -c 'T' | wc -m)"
for((i=0;i<$num;i++))do
echo $line | sed "s/$str1/&\n/" | head -n1 -q
str1="${str1}[GAC]*T"
done
str1="[GAC]*T"
done < "$1
While it works normally as it should (take the filename input and print it line by line until the letter T and next letter T and so on) it prints to the terminal.
Input:
GATTT
ATCGT
Output:
GAT
GATT
GATTT
AT
ATCGT
When I'm using the script with | tee outputfile the outputfile is correct but when using the script with > outputfile the terminal hangs / is stuck and does not finish. Moreover it works with bash -x scriptname inputfile > outputfile but is stuck with bash scriptname inputfile > outputfile.

I made modifications to your original script, please try:
if [ ! -f "$1" ]; then
exit
fi
while IFS='' read -r line || [[ -n "$line" ]];do
str1="[GAC]*T"
num=$(echo $line | tr -d -c 'T' | wc -m)
for((i=0;i<$num;i++));do
echo $line | sed "s/$str1/&\n/" | head -n1 -q
str1="${str1}[GAC]*T"
done
str1="[GAC]*T"
done < "$1"
For input:
GATTT
ATCGT
This script outputs:
GAT
GATT
GATTT
AT
ATCGT
Modifications made to your original script were:
Line while read line; do changed to while IFS='' read -r line || [[ -n "$line" ]]; do. Why I did this is explained here: Read a file line by line assigning the value to a variable
Line num=$"(echo $line | tr -d -c 'T' | wc -m)" changed to num=$(echo $line | tr -d -c 'T' | wc -m)
Line for((i=0;i<$num;i++))do changed to for((i=0;i<$num;i++));do
Line done < "$1 changed to done < "$1"
Now you can do: ./scriptname inputfile > outputfile
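The same output can also be produced without starting sed and head once per T. What follows is only a minimal sketch in pure Bash, not the script from the question; the function name print_prefixes is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: for every line of a file, print the prefix that ends at each 'T'.
# print_prefixes is a hypothetical name, not part of the original script.
print_prefixes() {
    local file=$1 line i
    [[ -f $file ]] || return 1
    while IFS= read -r line || [[ -n $line ]]; do
        # Walk the line character by character.
        for (( i = 0; i < ${#line}; i++ )); do
            if [[ ${line:i:1} == T ]]; then
                # Print everything up to and including this T.
                printf '%s\n' "${line:0:i+1}"
            fi
        done
    done < "$file"
}

# Usage: print_prefixes inputfile > outputfile
```

Because nothing in the loop reads from standard input or writes anywhere but standard output, redirecting with > outputfile should behave the same as piping into tee.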

Try:
sed -r 's/([^T]*T+)/\1\n/g' gatc.txt > outputfile
instead of your script.
It takes some optional non-Ts, followed by at least one T and inserts a newline after the T.
cat gatc.txt
GATGATTGATTTATATCGT
sed -r 's/([^T]*T+)/\1\n/g' gatc.txt
GAT
GATT
GATTT
AT
AT
CGT
For multiple lines, to delete empty lines in the end:
echo "GATTT
ATCGT" | sed -r 's/([^T]*T+)/\1\n/g;' | sed '/^$/d'
GATTT
AT
CGT

Related

Shell: Add string to the end of each line, which match the pattern. Filenames are given in another file

I'm still new to the shell and need some help.
I have a file stapel_old.
Also I have in the same directory files like english_old_sync, math_old_sync and vocabulary_old_sync.
The content of stapel_old is:
english
math
vocabulary
The content of e.g. english is:
basic_grammar.md
spelling.md
orthography.md
I want to manipulate all files which are given in stapel_old like in this example:
take the first line of stapel_old 'english', (after that math, and so on)
convert in this case english to english_old_sync, (or after that what is given in second line, e.g. math to math_old_sync)
search in english_old_sync line by line for the pattern '.md'
And append to each line after .md :::#a1
The result should be e.g. of english_old_sync:
basic_grammar.md:::#a1
spelling.md:::#a1
orthography.md:::#a1
of math_old_sync:
geometry.md:::#a1
fractions.md:::#a1
and so on. stapel_old should stay unchanged.
How can I realize that?
I tried with sed -n, while loop (while read -r line), and I'm feeling it's somehow the right way - but I still get errors and not the expected result after 4 hours inspecting and reading.
Thank you!
EDIT
Here is the working code (The files are stored in folder 'olddata'):
clear
echo -e "$(tput setaf 1)$(tput setab 7)Learning directories:$(tput sgr 0)\n"
# put here directories which should not become flashcards, command: | grep -v 'name_of_directory_which_not_to_learn1' | grep -v 'directory2'
ls ../ | grep -v 00_gliederungsverweise | grep -v 0_weiter | grep -v bibliothek | grep -v notizen | grep -v Obsidian | grep -v z_nicht_uni | tee olddata/stapel_old
# count folders
echo -ne "\nHow much different folders: " && wc -l olddata/stapel_old | cut -d' ' -f1 | tee -a olddata/stapel_old
echo -e "Are this learning directories correct? [j ODER y]--> yes; [Other]-->no\n"
read lernvz_korrekt
if [ "$lernvz_korrekt" = j ] || [ "$lernvz_korrekt" = y ];
then
read -n 1 -s -r -p "Learning directories correct. Press any key to continue..."
else
read -n 1 -s -r -p "Learning directories not correct, please change in line 4. Press any key to continue..."
exit
fi
echo -e "\n_____________________________\n$(tput setaf 6)$(tput setab 5)Found cards:$(tput sgr 0)$(tput setaf 6)\n"
#GET && WRITE FOLDER NAMES into olddata/stapel_old
anzahl_zeilen=$(cat olddata/stapel_old |& tail -1)
#GET NAMES of .md files of every stapel and write All to 'stapelname'_old_sync
i=0
name="var_$i"
for (( num=1; num <= $anzahl_zeilen; num++ ))
do
i="$((i + 1))"
name="var_$i"
name=$(cat olddata/stapel_old | sed -n "$num"p)
find ../$name/ -name '*.md' | grep -v trash | grep -v Obsidian | rev | cut -d'/' -f1 | rev | tee olddata/$name"_old_sync"
done
(tput sgr 0)
I tried to add:
input="olddata/stapel_old"
while IFS= read -r line
do
sed -n "$line"p olddata/stapel_old
done < "$input"
The code to change only the english_old_sync is:
lines=$(wc -l olddata/english_old_sync | cut -d' ' -f1)
for ((num=1; num <= $lines; num++))
do
content=$(sed -n "$num"p olddata/english_old_sync)
sed -i "s/"$content"/""$content":::#a1/g"" olddata/english_old_sync
done
So now this needs to be the inner for-loop of an outer for-loop that holds the variable for english, right?
stapel_old should stay unchanged.
You could try a while + read loop and embed sed inside the loop.
#!/usr/bin/env bash
while IFS= read -r files; do
echo cp -v "$files" "${files}_old_sync" &&
echo sed '/^.*\.md$/s/$/:::#a1/' "${files}_old_sync"
done < olddata/stapel_old
convert in this case english to english_old_sync, (or after that what is given in second line, e.g. math to math_old_sync)
cp copies the file under a new name; if the goal is to rename the original file named in stapel_old, change cp to mv.
The -n and -i flags of sed were omitted; include them if needed (without -i, sed only prints the result instead of editing the file).
The script also assumes that there are no empty/blank lines in stapel_old. If there are, add an additional test right after the line with the do.
[[ -n $files ]] || continue
It also assumes that the entries in stapel_old are existing files. Just in case, add an additional test.
[[ -e $files ]] || { printf >&2 '%s no such file or directory.\n' "$files"; continue; }
Or an if statement.
if [[ ! -e $files ]]; then
printf >&2 '%s no such file or directory\n' "$files"
continue
fi
See also help test
See also help continue
Combining them all together should be something like:
#!/usr/bin/env bash
while IFS= read -r files; do
[[ -n $files ]] || continue
[[ -e $files ]] || {
printf >&2 '%s no such file or directory.\n' "$files"
continue
}
echo cp -v "$files" "${files}_old_sync" &&
echo sed '/^.*\.md$/s/$/:::#a1/' "${files}_old_sync"
done < olddata/stapel_old
Remove the echos once you're satisfied with the output, so that the script actually copies/renames and edits the files.
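If the *_old_sync files already exist in olddata, as in the question's EDIT, the copy step is not needed and the loop only has to edit each file in place. A minimal sketch under that assumption; append_suffix and its two parameters are hypothetical names, and sed -i without a backup suffix assumes GNU sed.

```shell
#!/usr/bin/env bash
# Sketch: for every name listed in a file, append ":::#a1" to each line
# ending in ".md" inside <dir>/<name>_old_sync.
# append_suffix, list and dir are hypothetical names for illustration.
append_suffix() {
    local list=$1 dir=$2 name target
    while IFS= read -r name || [[ -n $name ]]; do
        # Skip blank lines in the list.
        [[ -n $name ]] || continue
        target="$dir/${name}_old_sync"
        # Skip entries without a matching file (for example a trailing count line).
        if [[ ! -f $target ]]; then
            printf >&2 '%s: no such file\n' "$target"
            continue
        fi
        # GNU sed: edit in place, touching only lines that end in .md
        sed -i 's/\.md$/&:::#a1/' "$target"
    done < "$list"
}

# Usage: append_suffix olddata/stapel_old olddata
```

The list file is only read, so stapel_old stays unchanged. Running the function twice does not append the suffix a second time, because an edited line no longer ends in .md.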

how to open all links in a file and ignore comments using firefox?

so the file contains data like
# entertainment
youtube.com
twitch.tv
# research
google.com
wikipedia.com
...
and I would like to pass that file as an argument to a script that opens every line that doesn't start with a #. Any clues on how to do that?
so far what i have:
for Line in $Lines
do
case "# " in $Line start firefox $Line;; esac
done
some code that could be useful (?):
while read line; do chmod 755 "$line"; done < file.txt
grep -e '^[^#]' inputfile.txt | xargs -d '\n' firefox --new-tab
grep -e '^[^#]': Will print all lines that don't start with a sharp (comments)
xargs -d '\n' firefox --new-tab: Will pass each line that is not blank, as argument to Firefox.
Removes both the lines that start with # and empty lines.
#!/bin/bash
#
while read -r line
do
if [[ $(echo "$line" | grep -Ev "^#|^$") ]]
then
firefox --new-tab "$line" &
fi
done <file.txt
Skip the empty lines and the lines that start with a #
#!/usr/bin/env bash
while IFS= read -r url; do
[[ "$url" == \#* || -z "$url" ]] && continue
firefox --new-tab "$url" &
done < file.txt
awk 'NF && $1 !~ /^#/ {print "firefox --new-tab", $0, "&"}' file.txt | bash
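The filtering step shared by the answers above can be kept separate from launching the browser, which makes it easy to check what would be opened. A minimal sketch; list_urls is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print every line of a file that is neither blank nor a comment.
# list_urls is a hypothetical helper name.
list_urls() {
    local line
    while IFS= read -r line || [[ -n $line ]]; do
        # Strip leading whitespace so indented comments are skipped as well.
        line=${line#"${line%%[![:space:]]*}"}
        [[ -z $line || $line == \#* ]] && continue
        printf '%s\n' "$line"
    done < "$1"
}

# Usage (opens one tab per remaining line):
# list_urls file.txt | while IFS= read -r url; do firefox --new-tab "$url" & done
```

Running list_urls file.txt on its own first shows exactly which lines would be passed to Firefox.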

Intermittent piping failure in bash

I have a code snippet that looks like this
while grep "{{SECRETS}}" /tmp/kubernetes/$basefile | grep -v "#"; do
grep -n "{{SECRETS}}" /tmp/kubernetes/$basefile | grep -v "#" | head -n1 | while read -r line ; do
lineno=$(echo $line | cut -d':' -f1)
spaces=$(sed "${lineno}!d" /tmp/kubernetes/$basefile | awk -F'[^ \t]' '{print length($1)}')
spaces=$((spaces-1))
# Delete line that had {{SECRETS}}
sed -i -e "${lineno}d" /tmp/kubernetes/$basefile
while IFS='' read -r secretline || [[ -n "$secretline" ]]; do
newline=$(printf "%*s%s" $spaces "" "$secretline")
sed -i "${lineno}i\ ${newline}" /tmp/kubernetes/$basefile
lineno=$((lineno+1))
done < "/tmp/secrets.yaml"
done
done
in /tmp/kubernetes/$basefile, the string {{SECRETS}} appears twice 100% of the time.
Almost every single time, this completes fine. However, very infrequently, the script errors on its second loop through the file. like so, according to set -x
...
IFS=
+ read -r secretline
+ [[ -n '' ]]
+ read -r line
exit code 1
When it works, the set -x looks like this, and continues processesing the file correctly.
...
+ IFS=
+ read -r secretline
+ [[ -n '' ]]
+ read -r line
+ grep '{{SECRETS}}' /tmp/kubernetes/deployment.yaml
+ grep -v '#'
I have no answer for how this can only happen occasionally, so I think there's something about bash piping's parallelism I don't understand. Is there something in grep -n "{{SECRETS}}" /tmp/kubernetes/$basefile | grep -v "#" | head -n1 | while read -r line ; do that could lead to out-of-order execution somehow? Based on the error, it seems like it's trying to read a line, but can't because previous commands didn't work. But there's no indication of that in the set -x output.
A likely cause of the problem is that the pipeline containing the inner loop both reads and writes the "basefile" at the same time. See How to make reading and writing the same file in the same pipeline always “fail”?.
One way to fix the problem is do a full read of the file before trying to update it. Try:
basepath=/tmp/kubernetes/$basefile
secretspath=/tmp/secrets.yaml
while
line=$(grep -n "{{SECRETS}}" "$basepath" | grep -v "#" | head -n1)
[[ -n $line ]]
do
lineno=$(echo "$line" | cut -d':' -f1)
spaces=$(sed "${lineno}!d" "$basepath" \
| awk -F'[^ \t]' '{print length($1)}')
spaces=$((spaces-1))
# Delete line that had {{SECRETS}}
sed -i -e "${lineno}d" "$basepath"
while IFS='' read -r secretline || [[ -n "$secretline" ]]; do
newline=$(printf "%*s%s" $spaces "" "$secretline")
sed -i "${lineno}i\ ${newline}" "$basepath"
lineno=$((lineno+1))
done < "$secretspath"
done
(I introduced the variables basepath and secretspath to make the code easier to test.)
As an aside, it's also possible to do this with pure Bash code:
basepath=/tmp/kubernetes/$basefile
secretspath=/tmp/secrets.yaml
updated_lines=()
is_updated=0
while IFS= read -r line || [[ -n $line ]] ; do
if [[ $line == *'{{SECRETS}}'* && $line != *'#'* ]] ; then
spaces=${line%%[^[:space:]]*}
while IFS= read -r secretline || [[ -n $secretline ]]; do
updated_lines+=( "${spaces}${secretline}" )
done < "$secretspath"
is_updated=1
else
updated_lines+=( "$line" )
fi
done <"$basepath"
(( is_updated )) && printf '%s\n' "${updated_lines[@]}" >"$basepath"
The whole updated file is stored in memory (in the updated_lines array) but that shouldn't be a problem because any file that's too big to store in memory will almost certainly be too big to process line-by-line with Bash. Bash is generally extremely slow.
In this code spaces holds the actual space characters for indentation, not the number of them.
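A variation on the pure Bash approach writes the result to a temporary file and renames it over the original, so the base file is never read and written at the same time. This is only a sketch: expand_secrets is a hypothetical name, and the replacement file gets the permissions mktemp gives it rather than those of the original.

```shell
#!/usr/bin/env bash
# Sketch: replace each uncommented {{SECRETS}} line with the indented
# contents of a secrets file, writing to a temp file and renaming it.
# expand_secrets is a hypothetical name for illustration.
expand_secrets() {
    local base=$1 secrets=$2 tmp line indent secretline
    # Create the temp file next to the base file so mv stays on one filesystem.
    tmp=$(mktemp "${base}.XXXXXX") || return 1
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line == *'{{SECRETS}}'* && $line != *'#'* ]]; then
            # Leading whitespace of the placeholder line.
            indent=${line%%[^[:space:]]*}
            while IFS= read -r secretline || [[ -n $secretline ]]; do
                printf '%s%s\n' "$indent" "$secretline"
            done < "$secrets"
        else
            printf '%s\n' "$line"
        fi
    done < "$base" > "$tmp" && mv "$tmp" "$base"
}

# Usage: expand_secrets "/tmp/kubernetes/$basefile" /tmp/secrets.yaml
```

The base file is read once from start to finish and replaced in a single rename, so every placeholder is handled in one pass.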

How to pass a variable string to a txt file at the beginning of the file?

I have a problem.
I have a general program, gene.sh.
For every file (e.g. geneX.csv) it makes a directory with the name of the gene (example: Genex/geneX.csv). Then gene.sh runs another program, but that program needs a variable and I don't know how to pass it.
This is the program gene.sh:
#!/bin/bash
# Create a dictory for each file *.xls and *.csv
for fname in *.xlsx *csv
do
dname=${fname%.*}
[[ -d $dname ]] || mkdir "$dname"
mv "$fname" "$dname"
done
# For each gene go inside the directory and compile the programs getChromosomicPositions.sh to have the positions, and getHapolotipeStings.sh to have the variants
for geni in */; do
cd $geni
z=$(tail -n 1 *.csv | tr ';' "\n" | wc -l)
cd ..
cp getChromosomicPositions.sh $geni --->
cp getHaplotypeStrings.sh $geni
cd $geni
export z
./getChromosomicPositions.sh *.csv
export z
./getHaplotypeStrings.sh *.csv
cd ..
done
This is the program getChromosomicPositions.sh:
rm chrPosRs.txt
grep '^Haplotype\ ID' $1 | cut -d ";" -f 4-61 | tr ";" "\n" | awk '{print "select chrom,chromStart,chromEnd,name from snp147 where name=\""$1"\";"}' > listOfQuery.txt
while read l; do
echo $l > query.txt
mysql -h genome-mysql.cse.ucsc.edu -u genome -A -D hg38 --skip-column-names < query.txt > queryResult.txt
if [[ "$(cat queryResult.txt)" == "" ]];
then
cat query.txt |
while read line; do
echo $line | awk '$6 ~/rs/ {print $6}' > temp.txt;
if [[ "$(cat temp.txt)" != "" ]];
then cat temp.txt | awk -F'name="' '{print $2}' | sed -e 's/";//g' > temp.txt;
./getHGSVposHG19.sh temp.txt ---> Hear the problem--->
else
echo $line | awk '{num=sub(/.*:g\./,"");num+=sub(/\".*/,"");if(num==2){print};num=""}' > temp2.txt
fi
done
cat query.txt >> varianti.txt
echo "Missing Data" >> chrPosRs.txt
else
cat queryResult.txt >> chrPosRs.txt
fi
done < listOfQuery.txt
rm query*
Here is the problem:
I need to automatically put the value of the variable $geni from gene.sh at the beginning of the file temp.txt.
How can I do that?
Why not pass "$geni" as, say, the first argument when invoking your script, and treat the rest of the arguments as your expected .csv files? (Inside the script the .csv file is then "$2" instead of "$1".)
./getChromosomicPositions.sh "$geni" *.csv
Alternatively, you can set it as an environment variable for the script, so that it can be used there (or just export it).
geni="$geni" ./getChromosomicPositions.sh *.csv
In any case, once you have it available in the second script, you can do
if passed as the first argument:
echo "${1}:$(cat temp.txt | awk -F'name="' '{print $2}' | sed -e 's/";//g')"
or if passed as environment variable:
echo "${geni}:$(cat temp.txt | awk -F'name="' '{print $2}' | sed -e 's/";//g')"
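If the value really has to become the first line of temp.txt, as the question asks, one option is to build the new contents in a temporary file and move it into place. A minimal sketch; prepend_line is a hypothetical helper name, and it assumes the value is available in the second script by one of the two methods above.

```shell
#!/usr/bin/env bash
# Sketch: insert a line of text at the beginning of a file.
# prepend_line is a hypothetical helper name.
prepend_line() {
    local text=$1 file=$2 tmp
    tmp=$(mktemp) || return 1
    # Write the new first line, then the old contents, to the temp file.
    { printf '%s\n' "$text"; cat -- "$file"; } > "$tmp" && mv -- "$tmp" "$file"
}

# Usage inside getChromosomicPositions.sh, assuming geni was exported by gene.sh:
# prepend_line "$geni" temp.txt
```

Writing to a separate file first avoids reading and truncating temp.txt in the same command.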

Shell Script: Read line in file

I have a file paths.txt:
/my/path/Origin/.:your/path/Destiny/.
/my/path/Origin2/.:your/path/Destiny2/.
/...
/...
I need a Script CopyPaste.sh using file paths.txt to copy all files in OriginX to DestinyX
Something like that:
#!/bin/sh
while read line
do
var= $line | cut --d=":" -f1
car= $line | cut --d=":" -f2
cp -r var car
done < "paths.txt"
Use tr (translate) to turn the colon into a space and pass the result to cp in one go (this breaks if a path contains whitespace):
#!/bin/sh
while read line; do
cp -r `echo $line | tr ':' ' '`
done < "paths.txt"
You need to use command substitution to get command's output into a shell variable:
#!/bin/sh
while read line
do
var=`echo "$line" | cut -d ":" -f1`
car=`echo "$line" | cut -d ":" -f2`
cp -r "$var" "$car"
done < "paths.txt"
Though your script can be simplified by letting read split each line on the colon via IFS:
while IFS=: read -r var car; do
cp -r "$var" "$car"
done < "paths.txt"
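Putting the pieces together, the loop can be wrapped in a small function that also skips blank or incomplete lines. A minimal sketch in Bash; copy_pairs is a hypothetical name, and it assumes no path contains a colon.

```shell
#!/usr/bin/env bash
# Sketch: read "source:destination" pairs from a file and copy each source
# recursively to its destination.
# copy_pairs is a hypothetical name; paths are assumed to contain no colon.
copy_pairs() {
    local list=$1 src dst
    # IFS=: makes read split each line at the first colon into src and dst.
    while IFS=: read -r src dst || [[ -n $src ]]; do
        # Skip blank lines and lines without a destination.
        [[ -n $src && -n $dst ]] || continue
        cp -r -- "$src" "$dst"
    done < "$list"
}

# Usage: copy_pairs paths.txt
```

Quoting "$src" and "$dst" keeps paths with spaces intact, which the tr variant cannot do.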

Resources