cat multiple files based on ID in filename - bash

I would like to combine files that share the same ID before the first underscore into one file using cat. How do I do this for multiple files like those below?
I thought of something like this:
for f in *.R1.fastq.gz; do cat "$f" > "${f%}.fastq.gz"; done
Input:
9989_L004_R1.fastq.gz
9989_L005_R1.fastq.gz
9989_L009_R1.fastq.gz
9873_L008_R1.fastq.gz
9873_L005_R1.fastq.gz
9873_L001_R1.fastq.gz
Output:
9989.fastq.gz
9873.fastq.gz

for f in *_R1.fastq.gz; do cat "$f" >> "${f%%_*}.fastq.gz"; done
>> for appending,
${f%%_*} removes the longest suffix in $f matching _*.
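To illustrate the difference between shortest-match % and longest-match %% with one of the sample names above:

```shell
f=9989_L004_R1.fastq.gz
echo "${f%%_*}"   # longest suffix matching _* removed -> 9989
echo "${f%_*}"    # shortest suffix matching _* removed -> 9989_L004
```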

Here is another way:
for f in *_R1.fastq.gz; do
[[ -f "${f%%_*}.fastq.gz" ]] || cat "${f%%_*}"_*_R1.fastq.gz > "${f%%_*}.fastq.gz"
done
or, a bit more readable:
for f in *_R1.fastq.gz; do
key="${f%%_*}"
[[ -f "${key}.fastq.gz" ]] || cat "${key}"_*_R1.fastq.gz > "${key}.fastq.gz"
done
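A quick way to sanity-check either loop is a scratch directory with dummy files; the one-letter contents here are stand-ins (plain text works for the demonstration because cat does not care, and concatenated gzip members are themselves a valid gzip stream):

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf a > 9989_L004_R1.fastq.gz
printf b > 9989_L005_R1.fastq.gz
printf c > 9873_L008_R1.fastq.gz
# Append each lane file onto its ID's combined file
for f in *_R1.fastq.gz; do cat "$f" >> "${f%%_*}.fastq.gz"; done
cat 9989.fastq.gz   # -> ab
```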


Extract a line from a text file using grep?

I have a text file called log.txt that logs each file name and the path it came from, like this:
2.txt
/home/test/etc/2.txt
Basically the file name and its previous location. I want to use grep to grab the file's directory, save it as a variable, and move the file back to its original location.
for var in "$@"
do
if grep "$var" log.txt
then
# code if found
else
# code if not found
fi
done
This just prints 2.txt and its directory to the console, since the directory path contains 2.txt.
Thanks.
Maybe flip the logic to make it more efficient?
f=''
while read -r prev
do case "$prev" in
*/*) [[ -e "$f" ]] && mv "$f" "$prev";; # path line: move the file back
*) f="$prev";;                          # name line: remember it
esac
done < log.txt
That walks through every entry in the log and, if the named file exists locally, moves it back. It should be functionally the same without running a grep per file.
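Here is a scratch-directory run of that idea; the names 2.txt and restore/ are made up for the demonstration:

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
mkdir restore
printf '2.txt\nrestore/2.txt\n' > log.txt   # name line, then path line
printf hello > 2.txt
f=''
while read -r prev
do case "$prev" in
   */*) [ -e "$f" ] && mv "$f" "$prev" ;;  # path line: move the file back
   *)   f="$prev" ;;                       # name line: remember it
   esac
done < log.txt
```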
If the name is always just the basename of the path, why save it in the log at all? If it is, then:
while read prev
do f="${prev##*/}" # strip the path info
[[ -e "$f" ]] && mv "$f" "$prev"
done < <( grep / log.txt )
Having both names on the same line would significantly simplify your script. But maybe try something like
# Convert the command-line arguments to lines
printf '%s\n' "$@" |
# Pair up with entries in the file
awk 'NR==FNR { f[$0]; next }
FNR%2 { if ($0 in f) p=$0; else p=""; next }
p { print "mv \"" p "\" \"" $0 "\"" }' - log.txt |
sh
Test it by replacing sh with cat and inspecting what you get. If it looks correct, switch back.
Briefly, something similar could perhaps be pulled off with printf '%s\n' "$@" | grep -A 1 -Fxf - log.txt, but you end up having to parse the output to pair up the lines anyway.
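For example, with a hypothetical one-entry log and a single argument (simulated here with set --), the awk pipeline emits and runs the right mv:

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
mkdir back
printf '2.txt\nback/2.txt\n' > log.txt   # name line, then path line
printf data > 2.txt
set -- 2.txt                             # stand-in for the script's arguments
printf '%s\n' "$@" |
awk 'NR==FNR { f[$0]; next }
     FNR%2  { if ($0 in f) p=$0; else p=""; next }
     p      { print "mv \"" p "\" \"" $0 "\"" }' - log.txt |
sh
```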
Another solution:
for f in $(grep -v "/" log.txt); do
grep "/$f" log.txt | xargs -I{} cp "$f" {}
done
grep -q (for "quiet") suppresses the output, by the way.

Rename all files in the current directory whose names contain upper-case characters to all lower case

I am trying to write a shell script that renames all files in the current directory whose names contain upper-case characters to all lower case. For example, if the directory contains a file named CoUnt.c, it should be renamed to count.c.
for f in *;
do
if [ -f "$f" ]; then
tr 'A-Z' 'a-z'
fi
done
but it is not working.
Is there any better solution for this?
You are not passing any data into the tr program, and you are not capturing any output either.
If you are using sh:
for f in *[A-Z]*
do
if [ -f "$f" ]; then
new_name=$(echo "$f"|tr 'A-Z' 'a-z')
mv "$f" "$new_name"
fi
done
Note the indentation - it makes code easier to read.
If you are using bash there is no need to use an external program like tr, you can use bash expansion:
for f in *[A-Z]*
do
if [[ -f $f ]]; then
new_name=${f,,}
mv "$f" "$new_name"
fi
done
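As a quick sketch of those case-modification expansions (bash 4+ only; the sample name is from the question):

```shell
#!/bin/bash
s=CoUnt.c
echo "${s,,}"   # all lower -> count.c
echo "${s^^}"   # all upper -> COUNT.C
```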
The problem is that tr reads its input from stdin. So in order to translate upper to lower in each filename, you could do something like:
#!/bin/sh
for f in *
do
[ -f "$f" ] || continue
flc=$(echo "$f" | tr 'A-Z' 'a-z') ## form lower-case name
[ "$f" != "$flc" ] && echo mv "$f" "$flc"
done
(note: remove the echo before mv to actually move the files after you are satisfied with the operation)
Since I am unable to add a comment, posting here.
Used sed and it works for me (note that \L is a GNU sed extension):
#!/bin/bash
for i in *
do
if [ -f "$i" ]
then
kar=$(echo "$i" | sed 's/.*/\L&/')
mv "$i" "$kar"
fi
done
The following works, guarding against renaming a file to itself:
for f in *
do
if [ -f "$f" ]; then
lc=$(echo "$f" | tr 'A-Z' 'a-z')
[ "$f" != "$lc" ] && mv "$f" "$lc"
fi
done
I would recommend rename because it is simple, efficient and also will check for clashes when two different files resolve to the same result:
You can use it with a Perl transliteration:
rename 'y/A-Z/a-z/' *
See the rename(1) man page for documentation and examples.

loop through lines in each file in a folder - nested loop

I don't know why this is not working:
for g in *.txt; do for f in $(cat $g); do grep $f annotations.csv; done > ../$f_annot; done
I want to loop through each file in a folder, for each file I want to loop through each line and apply the grep command. When I do
for f in $(cat file1.txt); do grep $f annotations.csv; done > ../$f_annot
It works; it is the nested loop that doesn't output anything. It seems to run, but it lasts forever and does nothing.
When you have an empty txt file, grep $f annotations.csv turns into a bare grep reading from stdin, which blocks forever.
You might want to use something like
for g in *.txt; do
grep -f "$g" annotations.csv > ../"${g}_annot"
done
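Sketched with throwaway data (the gene names and two-column annotations.csv format are made up for the demonstration):

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf 'geneA\ngeneB\n' > list.txt
printf 'geneA,chr1\ngeneC,chr2\ngeneB,chr3\n' > annotations.csv
# One output file per pattern list, written alongside it here
for g in *.txt; do
  grep -f "$g" annotations.csv > "${g}_annot"
done
```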
SOLVED:
for file in *list.txt; do
while read -r line; do
grep "$line" annotations.csv
done < "$file" > "${file}_annot.txt"
done
:)
I get
Cannot write to a directory.
ksh: ../: 0403-005 Cannot create the specified file.
This is because $f_annot does not evaluate to what you expect: the shell looks up a variable named f_annot, which is empty, leaving only ../. It works better with ${f}_annot:
for g in *.txt; do for f in $(cat "$g"); do grep "$f" annotations.csv; done > ../"${f}_annot" ; done
But there is still an issue: the redirection is evaluated outside the inner loop, so $f does not hold a per-line value there, and the results of some loops get overwritten. Maybe this suits your need better:
for g in *.txt; do for f in $(cat "$g"); do grep "$f" annotations.csv; done ; done

Renaming Multiples Files To delete first portion of name

I have a list of files like so :
10_I_am_here_001.jpg
20_I_am_here_003.jpg
30_I_am_here_008.jpg
40_I_am_here_004.jpg
50_I_am_here_009.jpg
60_I_am_here_002.jpg
70_I_am_here_005.jpg
80_I_am_here_006.jpg
How can I rename all the files in a directory, so that I can drop ^[0-9]+_ from the filename ?
Thank you
Using pure BASH:
s='10_I_am_here_001.jpg'
echo "${s#[0-9]*_}"
I_am_here_001.jpg
You can then write a simple for loop in that directory like this:
for s in *; do
f="${s#[0-9]*_}" && mv "$s" "$f"
done
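The pattern above uses the shortest-match form; with one of the sample names, the difference between # and ## looks like this:

```shell
s=10_I_am_here_001.jpg
echo "${s#[0-9]*_}"    # shortest prefix match removed -> I_am_here_001.jpg
echo "${s##[0-9]*_}"   # longest prefix match removed  -> 001.jpg
```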
Using rename :
rename 's/^[0-9]+_//' *
Here's another bash idea, based on the file names ending in .jpg as shown above:
#!/bin/bash
ls *.jpg |\
while read -r FileName
do
NewName=$(echo "$FileName" | cut -f2- -d "_")
mv "$FileName" "$NewName"
done
With bash extended globbing
shopt -s extglob
for f in *
do
[[ $f == +([0-9])_*.jpg ]] && mv "$f" "${f#+([0-9])_}"
done

Bash script: iterate over files recursively and save output to a file with an identical name but a different extension

I'm trying to recursively iterate over all my .html files in a directory and convert them to .jade using a bash script.
#!/bin/bash
for f in ./*.html ./**/*.html ; do
cat $f | html2jade -d > $f + '.jade';
done;
Naturally the $f + '.jade' bit isn't correct. How might I fix this?
#!/bin/bash
shopt -s globstar
for f in **/*.html; do
html2jade -d < "$f" > "${f%.html}.jade"
done
In the shell, concatenation needs no operator; just write the pieces next to each other:
... > "$f.jade"
Also:
html2jade ... < "$f"
And:
... > "${f%.html}.jade"
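Putting those pieces together on a hypothetical path:

```shell
f=sub/index.html
echo "${f%.html}.jade"   # suffix stripped, new extension appended -> sub/index.jade
```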
