I have a command like below
md5sum test1.txt | cut -f 1 -d " " >> test.txt
I want output of the above result prefixed with File_CheckSum:
Expected output: File_CheckSum: <checksumvalue>
I tried as follows
echo 'File_Checksum:' >> test.txt | md5sum test.txt | cut -f 1 -d " " >> test.txt
but getting result as
File_Checksum:
adbch345wjlfjsafhals
I want the entire output in 1 line
File_Checksum: adbch345wjlfjsafhals
echo writes a newline after it finishes writing its arguments. Some versions of echo allow a -n option to suppress this, but it's better to use printf instead.
You can use a command group to concatenate the standard output of your two commands:
{ printf 'File_Checksum: '; md5sum test.txt | cut -f 1 -d " "; } >> test.txt
Note that there is a race condition here: you can theoretically write to test.txt before md5sum is done reading from it, causing you to checksum more data than you intended. (Your original command mentions test1.txt and test.txt as separate files, so it's not clear if you are really reading from and writing to the same file.)
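If you really are hashing the file you append to, one way to sidestep the race is to finish reading before you start writing. A minimal sketch, assuming GNU md5sum is available; the scratch file and the `target` and `sum` variable names are made up for the demo:

```shell
#!/bin/sh
# Hypothetical scratch file standing in for test.txt.
tmpdir=$(mktemp -d)
target="$tmpdir/test.txt"
printf 'some data\n' > "$target"

# The command substitution runs to completion first, so md5sum has
# finished reading the file before anything is appended to it.
sum=$(md5sum < "$target" | cut -d ' ' -f 1)
printf 'File_Checksum: %s\n' "$sum" >> "$target"

tail -n 1 "$target"
```

The checksum then covers only the data that was in the file before the File_Checksum line was added.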
You can use command grouping to have a list of commands executed as a unit and redirect the output of the group at once:
{ printf 'File_Checksum: '; md5sum test1.txt | cut -f 1 -d " "; } >> test.txt
Or use a command substitution:
printf '%s: %s\n' "File_Checksum" "$(md5sum < test1.txt | cut ...)" > test.txt
Note that if you are trying to compute the hash of test.txt (the same file you are writing to), this changes things significantly.
Another option is:
{
printf "File_Checksum: "
md5sum ...
} > test.txt
Or:
exec > test.txt
printf "File_Checksum: "
md5sum ...
but be aware that all subsequent commands will also write their output to test.txt. The typical way to restore stdout is:
exec 3>&1
exec > test.txt # Redirect all subsequent commands to `test.txt`
printf "File_Checksum: "
md5sum ...
exec >&3 # Restore original stdout
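Here is a self-contained version of the save/redirect/restore pattern that you can run as-is. It is only a sketch: the temporary file and the 'demo data' input are placeholders, and it also closes descriptor 3 once stdout is restored, which the snippet above does not do.

```shell
#!/bin/sh
outfile=$(mktemp)

exec 3>&1              # save the current stdout on descriptor 3
exec > "$outfile"      # from here on, stdout goes to the file
printf 'File_Checksum: '
printf 'demo data\n' | md5sum | cut -d ' ' -f 1
exec >&3 3>&-          # restore stdout, then close the spare descriptor

echo "stdout is back"
cat "$outfile"
```

Closing descriptor 3 is optional, but it keeps the extra descriptor from leaking into later commands.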
Operator &&
e.g. mkdir example && cd example
I am at my wits' end as to why this loop is failing to concatenate the files the way I need it to. Basically, let's say we have the following files:
AB124661.lane3.R1.fastq.gz
AB124661.lane4.R1.fastq.gz
AB124661.lane3.R2.fastq.gz
AB124661.lane4.R2.fastq.gz
What we want is:
cat AB124661.lane3.R1.fastq.gz AB124661.lane4.R1.fastq.gz > AB124661.R1.fastq.gz
cat AB124661.lane3.R2.fastq.gz AB124661.lane4.R2.fastq.gz > AB124661.R2.fastq.gz
What I tried (and didn't work):
Create and save file names (AB124661) to a ID file:
ls -1 *R1*.gz | awk -F '.' '{print $1}' | sort | uniq > ID
This creates an ID file that stores the samples/files name.
Run the following loop:
for i in `cat ./ID`; do cat $i\.lane3.R1.fastq.gz $i\.lane4.R1.fastq.gz \> out/$i\.R1.fastq.gz; done
for i in `cat ./ID`; do cat $i\.lane3.R2.fastq.gz $i\.lane4.R2.fastq.gz \> out/$i\.R2.fastq.gz; done
The loop fails and concatenates into empty files.
Things I tried:
Yes, the ID file is definitely in the folder
When I run with echo it shows the cat command correct
Any help will be very much appreciated,
Best,
AC
Why are you escaping the > as \>? The shell then passes a literal > to cat as a file name, which results in cat: '>': No such file or directory instead of a redirection.
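To see the difference for yourself, here is a small sketch. `count_args` is a made-up helper that just reports how many arguments it received, and the temporary file is a placeholder:

```shell
#!/bin/sh
# count_args is a hypothetical helper, not a standard command.
count_args() { echo "$#"; }

outfile=$(mktemp)

# Escaped: \> is an ordinary word, so four arguments arrive
# and nothing is redirected.
escaped=$(count_args one two \> "$outfile")

# Unescaped: > is a redirection, so two arguments arrive
# and the count is written to the file.
count_args one two > "$outfile"
unescaped=$(cat "$outfile")

echo "escaped: $escaped, unescaped: $unescaped"
# prints: escaped: 4, unescaped: 2
```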
Don't read lines with for; use a while read loop instead:
while IFS= read -r id; do
cat "${id}.lane3.R1.fastq.gz" "${id}.lane4.R1.fastq.gz" > "out/${id}.R1.fastq.gz"
cat "${id}.lane3.R2.fastq.gz" "${id}.lane4.R2.fastq.gz" > "out/${id}.R2.fastq.gz"
done < ./ID
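The reason for preferring while read: a for loop over $(cat file) splits on any whitespace, not on lines. A quick sketch with a made-up ID file whose entries contain spaces (the IDs in the question do not, so there it is mostly a matter of habit):

```shell
#!/bin/sh
idfile=$(mktemp)
printf '%s\n' 'sample A' 'sample B' > "$idfile"

# for + cat: the unquoted expansion is split into words.
for_count=0
for i in $(cat "$idfile"); do
    for_count=$((for_count + 1))
done

# while read: one iteration per line, whitespace preserved.
read_count=0
while IFS= read -r id; do
    read_count=$((read_count + 1))
done < "$idfile"

echo "for: $for_count  while-read: $read_count"
# prints: for: 4  while-read: 2
```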
Let's say you have one ID per line stored in the file ./ID:
while read -r line; do
cat "$line".lane3.R1.fastq.gz "$line".lane4.R1.fastq.gz > "$line".R1.fastq.gz
cat "$line".lane3.R2.fastq.gz "$line".lane4.R2.fastq.gz > "$line".R2.fastq.gz
done < ./ID
A pure shell solution could look like this:
for file in *.fastq.gz; do
id=${file%%.*}
[ -e "$id".R1.fastq.gz ] || cat "$id".*.R1.fastq.gz > "$id".R1.fastq.gz
[ -e "$id".R2.fastq.gz ] || cat "$id".*.R2.fastq.gz > "$id".R2.fastq.gz
done
Alternatively:
printf '%s\n' *.fastq.gz | cut -d. -f1 | sort -u |
while IFS= read -r id; do
cat "$id".*.R1.fastq.gz > "$id".R1.fastq.gz
cat "$id".*.R2.fastq.gz > "$id".R2.fastq.gz
done
This solution assumes filenames of interest don't contain newline characters.
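If you want to try the glob-based approach without touching real data, here is a sketch that builds four tiny stand-in files in a scratch directory. The file contents are invented, and the output goes to an out/ directory as in the question:

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Tiny stand-ins for the real lane files.
printf 'r1-lane3\n' | gzip > AB124661.lane3.R1.fastq.gz
printf 'r1-lane4\n' | gzip > AB124661.lane4.R1.fastq.gz
printf 'r2-lane3\n' | gzip > AB124661.lane3.R2.fastq.gz
printf 'r2-lane4\n' | gzip > AB124661.lane4.R2.fastq.gz
mkdir out

# Derive the unique sample IDs from the file names, then merge per read.
printf '%s\n' *.lane*.fastq.gz | cut -d. -f1 | sort -u |
while IFS= read -r id; do
    cat "$id".lane*.R1.fastq.gz > "out/$id.R1.fastq.gz"
    cat "$id".lane*.R2.fastq.gz > "out/$id.R2.fastq.gz"
done

# Concatenated gzip members decompress as one stream.
gzip -dc out/AB124661.R1.fastq.gz
```

The glob expands in sorted order, so lane3 comes before lane4 in the merged file.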
I have a text file called log.txt that logs each file name and the path it came from, something like this:
2.txt
/home/test/etc/2.txt
Basically the file name and its previous location. I want to use grep to grab the file's directory, save it as a variable, and move the file back to its original location.
for var in "$@"
do
if grep "$var" log.txt
then
# code if found
else
# code if not found
fi
This just prints 2.txt and its directory to the console, since the directory line also contains 2.txt.
thanks.
Maybe flip the logic to make it more efficient?
f=''
while IFS= read -r prev
do case "$prev" in
   */*) [[ -e "$f" ]] && mv "$f" "$prev";; # path line: move the file back
   *)   f="$prev";;                        # name line: remember the name
   esac
done < log.txt
That walks through all the files in the log and, if they exist locally, moves them back. It should be functionally the same without a grep per file.
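To try the name-then-path idea safely, here is a sketch that sets up a scratch directory with a two-line log and one file to restore. The directory layout and the `name` and `line` variables are invented for the demo:

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Scratch setup: one file to restore and a two-line log entry for it.
mkdir -p restored/etc
touch 2.txt
printf '%s\n' '2.txt' "$workdir/restored/etc/2.txt" > log.txt

name=''
while IFS= read -r line; do
    case "$line" in
        */*) [ -e "$name" ] && mv -- "$name" "$line" ;;  # path line: move it back
        *)   name=$line ;;                               # name line: remember it
    esac
done < log.txt

ls restored/etc
# prints: 2.txt
```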
If the file name is always the same as the last component of the path, why save it in the log at all?
If it is, then:
while read prev
do f="${prev##*/}" # strip the path info
[[ -e "$f" ]] && mv "$f" "$prev"
done < <( grep / log.txt )
Having the file names on the same line would significantly simplify your script. But maybe try something like
# Convert from command-line arguments to lines
printf '%s\n' "$@" |
# Pair up with entries in file
awk 'NR==FNR { f[$0]; next }
FNR%2 { if ($0 in f) p=$0; else p=""; next }
p { print "mv \"" p "\" \"" $0 "\"" }' - log.txt |
sh
Test it by replacing sh with cat and see what you get. If it looks correct, switch back.
Briefly, something similar could perhaps be pulled off with printf '%s\n' "$@" | grep -A 1 -Fxf - log.txt, but you end up having to parse the output to pair up the lines anyway.
Another solution:
for f in `grep -v "/" log.txt`; do
grep "/$f" log.txt | xargs -I{} cp $f {}
done
grep -q (for "quiet") stops the output
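For example, a sketch with a throwaway log file. The -F and -x options are an addition here: they make grep match the name as a fixed string against the whole line, so the path line that merely contains 2.txt does not count as a hit.

```shell
#!/bin/sh
log=$(mktemp)
printf '%s\n' '2.txt' '/home/test/etc/2.txt' > "$log"

for name in 2.txt missing.txt; do
    # -q: no output, only the exit status is used
    if grep -qFx -- "$name" "$log"; then
        echo "$name: found"
    else
        echo "$name: not found"
    fi
done
# prints:
# 2.txt: found
# missing.txt: not found
```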
I have some bam files in my input directory, and for each bam file I want to calculate the number of mapped reads (using the Samtools view command) and print that number along with the name of the bam file into an output file. Though it is working, I am not getting the output that I want.
Here is how my code looks like
for file in input/*;
do
echo $file >> output;
samtools view -F 4 $file | wc -l >> output;
done
This works fine, but the problem is that it outputs the name of the file and the number of reads on different lines. Here is an example:
sample_data/wgEncodeUwRepliSeqBg02esG1bAlnRep1.bam
1784867
sample_data/wgEncodeUwRepliSeqBg02esG2AlnRep1.bam
2280544
I tried to convert the new line characters to tab by doing this
for file in input/*;
do
echo $file >> output;
samtools view -F 4 $file | wc -l >> output;
tr '\n' '\t' < output > output2
done
Here is the output for the same
sample_data/wgEncodeUwRepliSeqBg02esG1bAlnRep1.bam 1784867 sample_data/wgEncodeUwRepliSeqBg02esG2AlnRep1.bam 2280544
How can I now insert a newline character after each record? For example:
sample_data/wgEncodeUwRepliSeqBg02esG1bAlnRep1.bam 1784867
sample_data/wgEncodeUwRepliSeqBg02esG2AlnRep1.bam 2280544
Thanks
You could get the desired output by writing everything in one line. Something like:
echo -e "$file\t$(samtools view -F 4 "$file" | wc -l)" >> output
If you want to do it in two pieces, note that echo has a -n option to suppress trailing newlines, and -e to interpret escapes like \t, so you could do:
echo -ne "$file\t" >> output
samtools view -F 4 "$file" | wc -l >> output
Writing what you want the first time is cleaner than trying to post-process your output.
If the output of every file definitely consists of a filename and a number, I think you can easily change
tr '\n' '\t' < output > output2
to
tr '\n' '\t' < output | sed -r 's/([0-9]+\t)/\1\n/g' > output2
It matches each number followed by a tab and adds a newline character after it. The g flag is needed because after tr the whole file is a single line.
Just use a command substitution:
for file in input/*
do
printf '%s\t%d\n' "$file" "$(samtools view -F 4 "$file" | wc -l)"
done >> output
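The same pattern can be tried without samtools installed. In this sketch, `count_reads` is a hypothetical stand-in for `samtools view -F 4 "$file" | wc -l` that simply counts lines, and the two "bam" files are plain-text dummies:

```shell
#!/bin/sh
workdir=$(mktemp -d)
mkdir "$workdir/input"
printf 'r1\nr2\nr3\n' > "$workdir/input/a.bam"
printf 'r1\n' > "$workdir/input/b.bam"

# Hypothetical stand-in for: samtools view -F 4 "$1" | wc -l
count_reads() { wc -l < "$1"; }

for file in "$workdir"/input/*; do
    # One printf per file: name, tab, count, newline.
    printf '%s\t%d\n' "${file##*/}" "$(count_reads "$file")"
done > "$workdir/output"

cat "$workdir/output"
```

Redirecting once after done, rather than on every line inside the loop, opens the output file a single time.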
The example code below writes hi on a new line at every iteration. Is there a way to prevent this?
#!/bin/bash
while read line; do
var=$(echo $line | cut -d \, -f 2)
echo -n " $var"
done < file.csv > output.txt
Desired output is a concatenation of '$var's at each iteration. The code is run in OS X.
[Resolved]
In most cases of similar problems, klashww's answer would be what you want to try, so I would accept it as the answer. In my case, though, none of those options fixed the bug. The behavior was caused by an invisible '^M' (carriage return) character at the end of each line, because the file came from Windows. The lesson: strip the '^M' characters before processing the file in bash, using the line below. After that, the original code works fine.
tr -d '\015' < file > newfile
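A small sketch of the effect, using an invented two-line CSV with Windows line endings. The `skip` and `var` names are placeholders; '\r' and '\015' are the same character.

```shell
#!/bin/sh
csv=$(mktemp)
printf 'a,one\r\nb,two\r\n' > "$csv"

# Without cleaning: each value still ends in a carriage return.
raw=$(while IFS=, read -r skip var; do printf ' %s' "$var"; done < "$csv")

# With cleaning: strip the carriage returns first.
clean=$(tr -d '\r' < "$csv" |
        while IFS=, read -r skip var; do printf ' %s' "$var"; done)

printf '%s\n' "$clean"
# prints: " one two" (without the quotes)
```

In a terminal the carriage returns in the raw version send the cursor back to the start of the line, which is why the values appear to overwrite each other or break onto new lines.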
You might like to try using pure bash:
while IFS=',' read nu1 var nu2; do
echo -n " $var"
done < file.csv > output.txt
nu: "not used"
Use echo "hi\c" instead of echo -n "hi", or printf if available, for example printf "hi".
In your example, this should work:
while read line; do
var=$(echo $line | cut -d \, -f 2)
printf ' %s' "$var"
done < file.csv > output.txt
Or you can use a better tool:
awk -F, '{printf " %s", $2}' file.csv > output.txt
If everything else fails, brute-force it with tr:
echo " $var"| tr -d '\n'
I'm trying out process substitution and this is just a fun exercise.
I want to append the string "XXX" to all the values of 'ls':
paste -d ' ' <(ls -1) <(echo "XXX")
How come this does not work? XXX is not appended. However if I want to append the file name to itself such as
paste -d ' ' <(ls -1) <(ls -1)
it works.
I do not understand the behavior. Both echo and ls -1 write to stdout but echo's output isn't read by paste.
Try doing this, using a printf hack to display the file with zero length output and XXX appended.
paste -d ' ' <(ls -1) <(printf "%.0sXXX\n" * )
Demo :
$ ls -1
filename1
filename10
filename2
filename3
filename4
filename5
filename6
filename7
filename8
filename9
Output :
filename1 XXX
filename10 XXX
filename2 XXX
filename3 XXX
filename4 XXX
filename5 XXX
filename6 XXX
filename7 XXX
filename8 XXX
filename9 XXX
If you just want to append XXX, this one will be simpler :
printf "%sXXX\n" *
If you want the XXX after every line of the ls -1 output, you need a second command that outputs the string once per line. You are echoing it just once, so it is appended only to the first line of the ls output.
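Here is a sketch of that pairing behaviour. To keep it runnable in plain sh, a temporary file of made-up names stands in for <(ls -1), and the second input is fed to paste on stdin via - instead of a process substitution:

```shell
#!/bin/sh
names=$(mktemp)
printf '%s\n' file-a file-b file-c > "$names"

# One suffix line: only the first name gets a partner.
short=$(echo XXX | paste -d ' ' "$names" -)

# One suffix line per name: every line gets a partner.
full=$(sed 's/.*/XXX/' "$names" | paste -d ' ' "$names" -)

printf '%s\n' "$full"
# prints:
# file-a XXX
# file-b XXX
# file-c XXX
```

In the short case paste treats the exhausted input as empty lines, so the remaining names are followed by just the delimiter.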
If you are looking for a short command line to achieve the task, you can use sed:
ls -1 | sed -n 's/\(^.*\)$/\1 XXX/p'
And here's a funny one, not using any external command except the legendary yes command!
while read -u 4 head && read -u 5 tail ; do echo "$head $tail"; done 4< <(ls -1) 5< <(yes XXX)
(I'm only posting this because it's funny and it's actually not 100% off topic since it uses file descriptors and process substitutions)
... you have to:
for i in $( ls -1 ); do echo "$i XXXX"; done
Never use for i in $(command). See this answer for more details.
So, to answer the original question, you could simply use something like this:
for file in *; do echo "$file XXXX"; done
Another solution with awk :
ls -1|awk '{print $0" XXXX"}'
awk '{print $0" XXXX"}' <(ls -1) # with process substitution
Another solution with sed :
ls -1|sed "s/\(.*\)/\1 XXXX/g"
sed "s/\(.*\)/\1 XXXX/g" <(ls -1) # with process substitution
And useless solutions, just for fun :
while read; do echo "$REPLY XXXX"; done <<< "$(ls -1)"
ls -1|while read; do echo "$REPLY XXXX"; done
It does it only for the first line, since it groups the first line from parameter 1 with the first line from parameter 2:
paste -d ' ' <(ls -1) <(echo "XXX")
... outputs:
/dir/file-a XXX
/dir/file-b
/dir/file-c
... you have to:
for i in $( ls -1 ); do echo "$i XXXX"; done
You can use xargs for the same effect:
ls -1 | xargs -I{} echo {} XXX