I have a bunch of FASTQ files in a directory, and I want to trim 2 nucleotides (and the matching 2 quality characters) from the end of every read that is 51 base pairs long and ends with CTG or TTG.
Here is the shell script I wrote, but I am getting some errors. I need help, as I am new to shell scripting.
Input:
#HWI-ST1072:187:C35YUACXX:7:1101:1609:1983 1:N:0:ACAGTG
NGGAGAAAGAGAGTGTGTTTTTAGGGGGAGATTTTTAAAATGGTTGTTTTG
+
#0<BFFFFFFFFF<BFFFIIFFFFFIIIBFFFFFIIFIIIIIFFBFFFFFF
#HWI-ST1072:187:C35YUACXX:7:1101:1747:1995 1:N:0:ACAGTG
NGGTTGTGGTGGTGGGTATTTGTAGTTTTATTTATTCGGGAGGTTGAGCTG
+
#0<BFFFFFFFFFFIIBFFIIIIIIFIIIFFIIFIIIFIIFIIFFFFIIFF
#HWI-ST1072:187:C35YUACXX:7:1101:9351:2210 1:N:0:ACAGTG
CGGTTTTGTTTTATTTTGTATGATTAGGAGGGTTTTGGAGGTTTAGTTACC
+
BBBFFFFFFFFFFIIIIIFFIIFIIIIIIIIIFFIIFIFIIFFIIIFIIII
#HWI-ST1072:187:C35YUACXX:7:1101:1747:1995 1:N:0:ACAGTG
NGGTTGTGGTGGTGGGTATTTGTAGTTTTATTTAT
+
#0<BFFFFFFFFFFIIBFFIIIIIIFIIIFFIIFI
output:
#HWI-ST1072:187:C35YUACXX:7:1101:1609:1983 1:N:0:ACAGTG
NGGAGAAAGAGAGTGTGTTTTTAGGGGGAGATTTTTAAAATGGTTGTTT
+
#0<BFFFFFFFFF<BFFFIIFFFFFIIIBFFFFFIIFIIIIIFFBFFFF
#HWI-ST1072:187:C35YUACXX:7:1101:1747:1995 1:N:0:ACAGTG
NGGTTGTGGTGGTGGGTATTTGTAGTTTTATTTATTCGGGAGGTTGAGC
+
#0<BFFFFFFFFFFIIBFFIIIIIIFIIIFFIIFIIIFIIFIIFFFFII
#HWI-ST1072:187:C35YUACXX:7:1101:9351:2210 1:N:0:ACAGTG
CGGTTTTGTTTTATTTTGTATGATTAGGAGGGTTTTGGAGGTTTAGTTACC
+
BBBFFFFFFFFFFIIIIIFFIIFIIIIIIIIIFFIIFIFIIFFIIIFIIII
#HWI-ST1072:187:C35YUACXX:7:1101:1747:1995 1:N:0:ACAGTG
NGGTTGTGGTGGTGGGTATTTGTAGTTTTATTTAT
+
#0<BFFFFFFFFFFIIBFFIIIIIIFIIIFFIIFI
script:
for sample in *.fastq;do
name=$(echo ${sample} | sed 's/.fastq//')
while read line;do
if [ ${line:0:1} == "#" ] ; then
head="${line}"
$echo $head
elif [ "${head}" ] && [ "${line}" ] ; then
length=${#line}
if [ "${length}" = 51 -a "${line}" =~ *CTG|*TTG ] ; then
sequence= substr($line,0,49)
#echo $sequence
fi
elif [ ${line:0:1} == "+" ] ; then
plus="${line}"
#echo $plus
elif [ "${plus}" ] && [ "${line}" ] ; then
quality= substr($line,0,49)
#echo $quality
fi
print "${head}\n${sequence}\n${plus}\n${quality}" > ${name}_new.fq
done < $sample
done
I don't 100% understand what you're doing, but I fixed a few things. Try the script below:
#!/bin/bash
for sample in *.fastq; do
name="${sample/.fastq/}"
while read -r line; do
if [[ $line == '#'* ]]; then
head="$line" && echo "$head" >> "${name}_new.fq"
elif [[ -n $head && ${#line} == 51 && $line =~ (CTG|TTG)$ ]]; then
sequence="${line:0:49}" && echo "$sequence" >> "${name}_new.fq"
elif [[ $line == '+'* ]]; then
plus="$line" && echo "$line" >> "${name}_new.fq"
else
quality="$line" && echo "$quality" >> "${name}_new.fq"
fi
done < "$sample"
done
Example output
> cat sample_new.fq
> cat sample.fastq
#HWI-ST1072:187:C35YUACXX:7:1101:1609:1983 1:N:0:ACAGTG
NGGAGAAAGAGAGTGTGTTTTTAGGGGGAGATTTTTAAAATGGTTGTTTTG
+
#0<BFFFFFFFFF<BFFFIIFFFFFIIIBFFFFFIIFIIIIIFFBFFFFFF
#HWI-ST1072:187:C35YUACXX:7:1101:1747:1995 1:N:0:ACAGTG
NGGTTGTGGTGGTGGGTATTTGTAGTTTTATTTATTCGGGAGGTTGAGCTG
+
#0<BFFFFFFFFFFIIBFFIIIIIIFIIIFFIIFIIIFIIFIIFFFFIIFF
#HWI-ST1072:187:C35YUACXX:7:1101:9351:2210 1:N:0:ACAGTG
CGGTTTTGTTTTATTTTGTATGATTAGGAGGGTTTTGGAGGTTTAGTTACC
+
BBBFFFFFFFFFFIIIIIFFIIFIIIIIIIIIFFIIFIFIIFFIIIFIIII
#HWI-ST1072:187:C35YUACXX:7:1101:1747:1995 1:N:0:ACAGTG
NGGTTGTGGTGGTGGGTATTTGTAGTTTTATTTAT
+
#0<BFFFFFFFFFFIIBFFIIIIIIFIIIFFIIFI
> ./abovescript
> cat sample_new.fq
#HWI-ST1072:187:C35YUACXX:7:1101:1609:1983 1:N:0:ACAGTG
NGGAGAAAGAGAGTGTGTTTTTAGGGGGAGATTTTTAAAATGGTTGTTT
+
#HWI-ST1072:187:C35YUACXX:7:1101:1747:1995 1:N:0:ACAGTG
NGGTTGTGGTGGTGGGTATTTGTAGTTTTATTTATTCGGGAGGTTGAGC
+
#HWI-ST1072:187:C35YUACXX:7:1101:9351:2210 1:N:0:ACAGTG
CGGTTTTGTTTTATTTTGTATGATTAGGAGGGTTTTGGAGGTTTAGTTACC
+
BBBFFFFFFFFFFIIIIIFFIIFIIIIIIIIIFFIIFIFIIFFIIIFIIII
#HWI-ST1072:187:C35YUACXX:7:1101:1747:1995 1:N:0:ACAGTG
NGGTTGTGGTGGTGGGTATTTGTAGTTTTATTTAT
+
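The loop above copies the quality string unchanged, so a trimmed read ends up with a quality line that is two characters too long. A minimal sketch of another approach, assuming every record is exactly four lines (header, sequence, separator, quality): read one whole record at a time and trim the sequence and the quality together. The function name trim_fastq is made up for illustration.

```shell
#!/bin/bash
# trim_fastq (hypothetical name): read FASTQ records on stdin, write them to stdout.
# Assumption: every record is exactly 4 lines (header, sequence, separator, quality).
trim_fastq() {
    local head seq plus qual
    while IFS= read -r head && IFS= read -r seq &&
          IFS= read -r plus && IFS= read -r qual; do
        # Trim only reads that are 51 bases long and end in CTG or TTG.
        if [[ ${#seq} -eq 51 && $seq =~ (CTG|TTG)$ ]]; then
            seq=${seq:0:49}
            qual=${qual:0:49}
        fi
        printf '%s\n' "$head" "$seq" "$plus" "$qual"
    done
}

# Usage sketch: write one trimmed copy per input file.
for sample in *.fastq; do
    [[ -e $sample ]] || continue    # skip the literal pattern when nothing matches
    trim_fastq < "$sample" > "${sample%.fastq}_new.fq"
done
```

Because the record is read as a unit, the first character of a quality line never matters, which avoids mistaking a quality line for a header.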
I have an if-testing construct for a bash-progress bar:
progress_bar () {
PROGRESS=`expr $( cat $CONTROL_FILE ) - 100 | tr "-" "\r"`
clear
if [ "$PROGRESS" -eq 10 ]
then
echo -n "$PROGRESS % [# ] "
fi
if [ $PROGRESS -eq 20 ]
then
echo -n "$PROGRESS % [## ] "
fi
if [ $PROGRESS -eq 30 ]
then
echo -n "$PROGRESS % [### ] "
fi
if [ $PROGRESS -eq 40 ]
then
echo -n "$PROGRESS % [#### ] "
fi
if [ $PROGRESS -eq 50 ]
then
echo -n "$PROGRESS % [##### ] "
fi
if [ $PROGRESS -eq 60 ]
then
echo -n "$PROGRESS % [###### ] "
fi
if [ $PROGRESS -eq 70 ]
then
echo -n "$PROGRESS % [####### ] "
fi
if [ $PROGRESS -eq 80 ]
then
echo -n "$PROGRESS % [######## ] "
fi
if [ $PROGRESS -eq 90 ]
then
echo -n "$PROGRESS % [######### ] "
fi
if [ $PROGRESS -eq 100 ]
then
echo "$PROGRESS % [##########]"
fi
}
It works well: a "#" is added for every 10 counted. My question is: is there a way to make this more elegant, for example with a for loop?
Thanks
I'd do it like this:
progress_bar () {
local percent hashes
hashes='##########'
percent=$(<"$CONTROL_FILE")
printf '\r%d %% [%-10s] ' $((100 - percent)) "${hashes:percent/10}"
}
Would you please try the following:
progress_bar() {
local progress n bar
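# Distance from 100, as in the original, but without feeding a carriage return into the arithmetic below.
progress=$(( $(<"$CONTROL_FILE") - 100 ))
progress=${progress#-}
clear
n=$(( progress / 10 ))
if (( n > 0 )); then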
bar=$(printf "%${n}s" " " | tr " " "#")
fi
printf "%s %% [%s]" "$progress" "$bar"
(( $progress >= 100 )) && echo # put newline when done
}
printf "%${n}s" " " | tr " " "#" prints $n spaces (one space padded to width $n), then
replaces them with the same number of #s.
As uppercase names are not recommended for user variables, I've
changed PROGRESS to progress.
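The repeat idiom described above can also be done without external commands. A minimal sketch, assuming the percentage is passed as an argument instead of being read from $CONTROL_FILE; draw_bar is a made-up name.

```shell
#!/bin/bash
# draw_bar (hypothetical name): print "<percent> % [<bar>]" for a percentage 0..100.
draw_bar() {
    local percent=$1 n bar
    n=$(( percent / 10 ))
    printf -v bar '%*s' "$n" ''    # store n spaces in bar (printf -v writes to a variable)
    bar=${bar// /#}                # replace every space with '#'
    printf '%d %% [%-10s]\n' "$percent" "$bar"
}

draw_bar 30
```

The %-10s conversion pads the bar to ten columns, so the closing bracket stays in place while the bar grows.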
I'm writing a function in a shell script to check whether two numbers are palindromes, but I am getting an error: "command not found" on line 18. How can I fix this error?
#!/bin/bash
echo "Enter two number:"
read a
read b
for num in $a $b;
do
x=$x$sep$num
sep=" "
done
y=$x
num1=$a
num2=$b
rem=""
rev=0
for word in $y;
do
checkPalindrome $word
if [ $? -eq 0 ]
then
echo "$word is palindrome"
fi
done
checkPalindrome() {
local s=$1
for i in $s ;
do
while [ $i -gt 0]
do
rem=$(($i%10));
rev=$(($rev*10+$rem));
i=$(($i / 10));
done
done
if [[ $rev -eq $num1 && $rev -eq $num2 ]]
then
return 0;
else
return 1;
fi
}
Bash runs a script from top to bottom, so you need to define checkPalindrome() before the first line that calls it, as below:
#!/bin/bash
checkPalindrome() {
local n=$1 rem rev=0
# Reverse the digits of n; rev is local, so it starts at 0 on every call.
while [ "$n" -gt 0 ]
do
rem=$((n % 10))
rev=$((rev * 10 + rem))
n=$((n / 10))
done
# A number is a palindrome when it equals its own reversal.
[ "$rev" -eq "$1" ]
}
echo "Enter two number:"
read -r a
read -r b
for num in $a $b
do
x="$x$sep$num"
sep=" "
done
y="$x"
num1="$a"
num2="$b"
rem=""
rev=0
for word in $y;
do
if checkPalindrome "$word"
then
echo "$word is palindrome"
fi
done
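Why the order matters can be seen in a few lines. This is only an illustration with a made-up function greet: the first call happens before the definition has been executed, so bash looks for an external command of that name and fails.

```shell
#!/bin/bash
# First call: greet has not been defined yet, so the lookup fails
# (bash's "command not found" message is hidden by 2>/dev/null).
greet 2>/dev/null || echo "greet is not defined yet"

# The definition only takes effect once this line has been executed.
greet() {
    echo "hello"
}

# Second call: the function now exists.
greet
```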
You could consider the input as string (you don't have to restrict it to be an integer)
#!/bin/bash
is_palindrome() {
local arg=$1 i j
for ((i = 0, j = ${#arg} - 1; i < j; ++i, --j)); do
[[ ${arg:i:1} = "${arg:j:1}" ]] || return
done
}
read -r -p 'Enter two words: ' a b
for word in $a $b; do
is_palindrome "$word" && echo "$word is palindrome"
done
I was given the following assignment. We have a prebuilt GUI in a binary form, kept in $GAME_BIN. I have to write a script which connects the GUI with the AI engine. This is my code, which is pretty self descriptive. We have in this case: GAME_BIN=./sredniowiecze_gui, ai1=./idle_ai.sh, ai2=./idle_ai.sh
#!/bin/bash
# arguments parsing and setting $args - not posting this here
gui_outpipe=$(mktemp -u)
gui_inpipe=$(mktemp -u)
ai1_outpipe=$(mktemp -u)
ai1_inpipe=$(mktemp -u)
ai2_outpipe=$(mktemp -u)
ai2_inpipe=$(mktemp -u)
mkfifo $gui_outpipe $gui_inpipe $ai1_outpipe $ai1_inpipe $ai2_inpipe $ai2_outpipe
printinit 1 > $ai1_inpipe &
printinit 2 > $ai2_inpipe &
$GAME_BIN $args < $gui_inpipe &
$ai1 < $ai1_inpipe > $ai1_outpipe &
$ai1 < $ai2_inpipe > $ai2_outpipe &
while true; do
echo "Started the loop"
while true; do
read line < $ai1_outpipe || echo "Nothing read"
echo $line
if [[ $line ]]; then
echo "$line" > $gui_inpipe
echo "$line" > $ai2_inpipe
if [[ "$line" == "END_TURN" ]]; then
break
fi
fi
done
sleep $turndelay
while true; do
read line < $ai2_outpipe || echo "nothing read"
echo $line
if [[ $line ]]; then
echo "$line" > $gui_inpipe
echo "$line" > $ai2_inpipe
if [[ "$line" == "END_TURN" ]]; then
break
fi
fi
done
sleep $turndelay
done
wait
I created a simple idle AI contained in idle_ai.sh
#!/bin/sh
while true; do
echo END_TURN
done
Then the END_TURN message from the GUI is not received at all. On the other hand, the second END_TURN in line (*) is not received by the script. If I use my own C-written AI - very long code, not posting it here, no information from the AI is received in the second run of the while loop
I have absolutely no idea how to debug it.
Since I'm not eager to run binaries unsandboxed, I'm calling the script by firejail ./game.sh [irrelevant parameters]
EDIT after adding set -x the output is
INIT 10 3 1 1 1 3 9
+ [[ -n ./idle_ai.sh ]]
+ [[ -n '' ]]
+ [[ -n ./idle_ai.sh ]]
+ printinit 1
+ ./sredniowiecze_gui -human2
+ true
+ echo 'Started the loop'
Started the loop
+ true
+ read line
+ ./idle_ai.sh
+ echo 'INIT 10 3 1 1 1 3 9'
+ echo END_TURN
END_TURN
+ [[ -n END_TURN ]]
+ echo END_TURN
+ [[ END_TURN == \E\N\D\_\T\U\R\N ]]
+ break
+ true
+ read line
+ echo MOVE 5 9 5 8
MOVE 5 9 5 8
+ [[ -n MOVE 5 9 5 8 ]]
+ echo 'MOVE 5 9 5 8'
In AI vs AI mode:
INIT 10 3 1 1 1 3 9
+ [[ -n ./idle_ai.sh ]]
+ [[ -n ./idle_ai.sh ]]
+ printinit 1
+ printinit 2
+ ./sredniowiecze_gui
+ ./idle_ai.sh
+ true
+ echo 'Started the loop'
Started the loop
+ echo 'INIT 10 3 1 1 1 3 9'
+ true
+ read line
+ ./idle_ai.sh
+ echo 'INIT 10 3 2 1 1 3 9'
+ echo END_TURN
END_TURN
+ [[ -n END_TURN ]]
+ echo END_TURN
+ echo END_TURN
+ [[ END_TURN == \E\N\D\_\T\U\R\N ]]
+ break
+ sleep 1
+ true
+ read line
+ echo END_TURN
END_TURN
+ [[ -n END_TURN ]]
+ echo END_TURN
+ echo END_TURN
+ [[ END_TURN == \E\N\D\_\T\U\R\N ]]
+ break
+ sleep 1
+ true
+ echo 'Started the loop'
Started the loop
+ true
+ read line
EDIT2
I did the suggested changes, now my code is:
printinit 1 > $ai1_inpipe &
printinit 2 > $ai2_inpipe &
$GAME_BIN $args < $gui_inpipe &
$ai1 < $ai1_inpipe > $ai1_outpipe &
echo $!
$ai2 < $ai2_inpipe > $ai2_outpipe &
echo $!
while true; do
echo "Started the loop"
while true; do
read -u3 line || echo "Nothing read"
echo $line
if [[ $line ]]; then
echo "$line" > $gui_inpipe
echo "$line" > $ai2_inpipe
if [[ "$line" == "END_TURN" ]]; then
break
fi
fi
done
sleep $turndelay
while true; do
read -u4 line || echo "nothing read"
echo $line
if [[ $line ]]; then
echo "$line" > $gui_inpipe
echo "$line" > $ai1_inpipe
if [[ "$line" == "END_TURN" ]]; then
break
fi
fi
done
sleep $turndelay
done 3<$ai1_outpipe 4<$ai2_outpipe
And now the script gets stuck on the echo "$line" > $ai1_inpipe line, although the $ai2 process is still running.
EDIT3. Now the log with set -x is
INIT 10 3 1 1 1 3 9
+ [[ -n ./idle_ai.sh ]]
+ [[ -n ./idle_ai.sh ]]
+ printinit 1
+ printinit 2
+ ./sredniowiecze_gui
+ echo 26
26
+ ./idle_ai.sh
+ echo 'INIT 10 3 1 1 1 3 9'
+ echo 27
27
+ ./idle_ai.sh
+ echo 'INIT 10 3 2 1 1 3 9'
+ true
+ echo 'Started the loop'
Started the loop
+ true
+ read -u3 line
+ echo END_TURN
END_TURN
+ [[ -n END_TURN ]]
+ echo END_TURN
+ echo END_TURN
+ [[ END_TURN == \E\N\D\_\T\U\R\N ]]
+ break
+ sleep 1
+ true
+ read -u4 line
+ echo END_TURN
END_TURN
+ [[ -n END_TURN ]]
+ echo END_TURN
+ echo END_TURN
If you add an echo FOO before and after the call, like this:
echo FOO
echo "$line" > $ai1_inpipe
echo BAR
then echo FOO is executed and echo BAR not.
You're using read line < pipe, which reopens the pipe for every single read; anything the writer sent beyond the first line can be lost when the pipe is closed again.
Instead of doing that, open each pipe once and have read read from the open file descriptor.
EDIT: The same goes for writing to the FIFOs with echo. Like this:
while true; do
echo "Started the loop"
while true; do
read -u3 line || echo "Nothing read"
...
echo "$line" >&5
echo "$line" >&6
...
done
sleep $turndelay
while true; do
read -u4 line || echo "nothing read"
...
echo "$line" >&5
echo "$line" >&7
...
done
sleep $turndelay
done 3<$ai1_outpipe 4<$ai2_outpipe 5>$gui_inpipe 6>$ai2_inpipe 7>$ai1_inpipe
See these links for more help on the topic:
BashGuide Named pipes
BashFAQ 085
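The file-descriptor approach can be tried in isolation. This is a minimal sketch, not the game script: the FIFO lives in a throwaway directory and the three messages are made up.

```shell
#!/bin/bash
# Throwaway FIFO for the demo; path and messages are illustrative only.
dir=$(mktemp -d)
fifo=$dir/pipe
mkfifo "$fifo"

# Writer: opens the FIFO once, sends three lines, then closes it.
printf '%s\n' MOVE1 MOVE2 END_TURN > "$fifo" &

# Reader: open the FIFO once on fd 3 and keep it open for every read.
exec 3< "$fifo"
received=()
while IFS= read -r -u3 line; do
    received+=("$line")
    echo "got: $line"
done
exec 3<&-    # close fd 3 once the writer has finished

wait
rm -r "$dir"
```

With read line < "$fifo" inside the loop instead, the FIFO would be opened and closed on every iteration, and lines still sitting in the pipe may be dropped between two reads.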
I am using a Unix shell. How can I remove the newline characters between two specific strings?
For example, input is:
CASE when a in ('abcd','bdcdf') then
Shng,
END as xyz
Output should be:
CASE when a in ('abcd','bdcdf') then Shng END as xyz,
Parse the file line for line and remember when you see a CASE or END.
The code beneath uses a short syntax for an if-statement and an echo that suppresses \n by the -n parameter.
incase=0
cat x.sql | while read -r line; do
[[ ${line} = CASE* ]] && incase=1;
[[ ${line} = END* ]] && incase=0
[[ ${incase} = 0 ]] && echo "${line}"
[[ ${incase} = 1 ]] && echo -n "${line} "
done
EDIT:
When you have nested CASEs (like CASE ... CASE ... END ... END) and all
CASEs start on different lines, you can count how deeply you are nested.
incase=0
cat x.sql | while read -r line; do
[[ ${line} = CASE* ]] && (( incase = incase + 1)) ;
[[ ${line} = END* ]] && (( incase = incase - 1))
[[ ${incase} = 0 ]] && echo "${line}"
(( incase > 0 )) && echo -n "${line} "
done
# You might want an extra echo here so your last line will finish with a \n
echo
EDIT 2: You can often avoid cat (look for UUOC, "useless use of cat"). Here the code is better written as
incase=0
while read -r line; do
[[ ${line} = CASE* ]] && incase=1;
[[ ${line} = END* ]] && incase=0
[[ ${incase} = 0 ]] && echo "${line}"
[[ ${incase} = 1 ]] && echo -n "${line} "
done < x.sql
and, for nested CASEs that all start on different lines:
incase=0
while read -r line; do
[[ ${line} = CASE* ]] && (( incase = incase + 1)) ;
[[ ${line} = END* ]] && (( incase = incase - 1))
[[ ${incase} = 0 ]] && echo "${line}"
(( incase > 0 )) && echo -n "${line} "
done < x.sql
# You might want an extra echo here so your last line will finish with a \n
echo
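To try the nested variant without an x.sql file, the same idea can be wrapped in a function that reads standard input. This is a sketch only; join_case is a made-up name, and the here-document carries the sample from the question.

```shell
#!/bin/bash
# join_case (hypothetical name): join the lines of each CASE ... END block into one line.
join_case() {
    local line depth=0
    while IFS= read -r line; do
        case $line in
            CASE*) depth=$(( depth + 1 )) ;;
            END*)  depth=$(( depth - 1 )) ;;
        esac
        if (( depth > 0 )); then
            printf '%s ' "$line"     # inside a block: stay on the same output line
        else
            printf '%s\n' "$line"    # outside a block, or on the closing END line
        fi
    done
}

join_case <<'EOF'
CASE when a in ('abcd','bdcdf') then
Shng,
END as xyz
EOF
```

Note that the lines are only joined: the comma after Shng stays where it was in the input rather than moving to the end of the line.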
I'm having a weird issue incrementing a bash variable: it seems
to break after my first attempt at incrementing, and I cannot
pin down why. Here is a sample of what I am doing and the debug output.
Does anyone see any reason this should NOT work?
I am currently on GNU bash, version 4.2.45(1)-release (i686-pc-linux-gnu)
#!/bin/bash
set -ex
declare -i elem=0
echo $elem # 0
(( elem++ )) # breaks
echo $elem # 1 but never encountered
while IFS=$'\n' read -r line || [[ -n "$line" ]]; do
(( elem++ ))
echo $elem
done <"${1}" # foo\nbar\nbaz
Output
./incr.sh test
+ declare -i elem=0
+ echo 0
0
+ (( elem++ ))
The weirdest part is that after changing the initial increment to (( elem+=1 )),
the entire program increments correctly. This seems extremely buggy to the eye...
#!/bin/bash
set -ex
declare -i elem=0
echo $elem
(( elem+=1 ))
echo $elem
while IFS=$'\n' read -r line || [[ -n "$line" ]]; do
(( elem++ ))
echo $elem
done <"${1}" # foo\nbar\nbaz
Output
+ declare -i elem=0
+ echo 0
0
+ (( elem+=1 ))
+ echo 1
1
+ IFS='
'
+ read -r line
+ (( elem++ ))
+ echo 2
2
+ IFS='
'
+ read -r line
+ (( elem++ ))
+ echo 3
3
+ IFS='
'
+ read -r line
+ (( elem++ ))
+ echo 4
4
+ IFS='
'
+ read -r line
+ [[ -n '' ]]
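set -e makes your script exit when any command returns failure.
(( expr )) returns failure when the expression evaluates to 0. elem++ is a post-increment, so with elem=0 it evaluates to the old value 0, and (( elem++ )) returns failure just like (( 0 )). (( elem+=1 )) evaluates to 1, which is why that version keeps running.
Therefore, the script exits.
If you set -e and want to run commands whose status you don't care about, you can use
(( elem++ )) || true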
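The difference between the increment forms can be checked directly. A small sketch: each arithmetic command is tested inside an if, so it reports its status without stopping the script, even under set -e.

```shell
#!/bin/bash
# Compare how three ways of incrementing from 0 report success or failure.
elem=0
if (( elem++ )); then result=success; else result=failure; fi
echo "(( elem++ ))    -> $result, elem=$elem"

elem=0
if (( ++elem )); then result=success; else result=failure; fi
echo "(( ++elem ))    -> $result, elem=$elem"

elem=0
if (( elem += 1 )); then result=success; else result=failure; fi
echo "(( elem += 1 )) -> $result, elem=$elem"
```

Only the first form reports failure, although elem ends up as 1 in all three cases. A plain assignment such as elem=$(( elem + 1 )) always succeeds, so it is another option under set -e.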