I have two files, each sorted numerically. I need help with a shell script that reads both files, maps their lines 1:1, and renames the listed files using the mapped case number.
For example:
cat case.txt
10_80
10_90
cat files.txt
A BCD_x 1.pdf
A BCD_x 2.pdf
ls pdf_dir
A BCD_x 1.pdf A BCD_x 2.pdf
Read these two text files and rename the PDF files in pdf_dir:
A BCD_x 1.pdf as A BCD_10_80.pdf
A BCD_x 2.pdf as A BCD_10_90.pdf
Use paste to create the "mapping", then shell facilities to do the renaming.
shopt -s extglob
while IFS=$'\t' read -r file replacement; do
echo mv "$file" "${file/x +([0-9])/$replacement}"
done < <(paste files.txt case.txt)
Remove "echo" when you're satisfied with the output.
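paste pairs line N of files.txt with line N of case.txt, so the mapping is only correct when both lists have the same number of lines. A minimal sketch of a guard you could run before the loop; check_pairing is a hypothetical helper name, not part of the answer above:

```shell
#!/bin/bash
# check_pairing FILES CASES
# Hypothetical helper: succeeds only if both lists have the same number of lines.
check_pairing() {
    local n_files n_cases
    n_files=$(wc -l < "$1") || return 2
    n_cases=$(wc -l < "$2") || return 2
    if (( n_files != n_cases )); then
        # Report the mismatch on stderr and signal failure to the caller.
        printf 'line count mismatch: %s files vs %s cases\n' "$n_files" "$n_cases" >&2
        return 1
    fi
}
```

Usage would be something like check_pairing files.txt case.txt || exit 1 placed before the while loop.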
Using awk:
awk 'FNR==NR{a[FNR]=$0;next}
{f=$0; sub(/_x [0-9]+/, "_" a[FNR]); system("mv \"" f "\" \"" $0 "\"")}' case.txt files.txt
Using normal array and sed substitution -
Removing echo before mv will provide you the move capability.
You can change the /path/to/pdf_dir/ to specify your path to desired directory
#!/bin/bash
i=0
while read line
do
arr[i]="$line"
((i=i+1));
done < files.txt
i=0
while read case
do
newFile=$(echo "${arr[i]}" | sed "s/x [0-9]*/$case/")
echo mv /path/to/pdf_dir/"${arr[i]}" /path/to/pdf_dir/"$newFile"
((i=i+1))
done < case.txt
If you have Bash 4.0 this could help:
#!/bin/bash
declare -A MAP
I=0
IFS=''
while read -r CASE; do
(( ++I ))
MAP["A BCD_x ${I}.pdf"]="A BCD_${CASE}.pdf"
done < case.txt
while read -r FILE; do
__=${MAP[$FILE]}
[[ -n $__ ]] && echo mv "$FILE" "$__" ## Remove echo when things seem right already.
done < files.txt
Note: Make sure the script is saved in UNIX file format (LF line endings, not CRLF).
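A script saved with Windows (CRLF) line endings typically fails in bash with errors mentioning $'\r'. A small sketch for checking and fixing that; has_crlf and strip_crlf are hypothetical helper names introduced here for illustration:

```shell
#!/bin/bash
# has_crlf FILE
# Hypothetical helper: succeeds if FILE contains at least one carriage return.
has_crlf() {
    grep -q $'\r' -- "$1"
}

# strip_crlf FILE
# Hypothetical helper: delete every carriage return in FILE via a temporary copy.
strip_crlf() {
    tr -d '\r' < "$1" > "$1.tmp" && mv -- "$1.tmp" "$1"
}
```

For example: has_crlf script.sh && strip_crlf script.sh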
I want to add lines at beginning of file, it works with:
sed -i '1s/^/#INFO\tFORMAT\tunknown\n/' file
sed -i '1s/^/##phasing=none\n/' file
However it doesn't work when my file is empty. I found these commands:
echo > file && sed '1s/^/#INFO\tFORMAT\tunknown\n/' -i file
echo > file && sed '1s/^/##phasing=none\n/' -i file
but the second command erases what the first one added (and the existing content too, if the file isn't empty).
I would like to know how to add lines at the beginning of a file whether or not the file is empty.
I tried a test with if [ -s file ], but without success.
Thanks!
You can use the insert command (i).
if [ -s file ]; then
sed -i '1i\
#INFO\tFORMAT\tunknown\
##phasing=none' file
else
printf '#INFO\tFORMAT\tunknown\n##phasing=none\n' > file
fi
Note that \t for tab is not POSIX and does not work in all sed implementations (e.g. BSD/macOS sed, where -i also works differently). You can use a raw tab instead, or a variable: tab=$(printf '\t').
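To illustrate the variable approach, here is a sketch that passes a real tab character to sed and writes through a temporary file instead of relying on -i. prepend_header is a hypothetical function name, and the .new suffix for the temporary file is an assumption:

```shell
#!/bin/bash
# prepend_header FILE
# Hypothetical helper: put the two header lines at the top of FILE.
prepend_header() {
    file=$1
    tab=$(printf '\t')            # a literal tab, so sed never sees \t
    if [ -s "$file" ]; then
        # Non-empty file: insert both lines before line 1.
        sed "1i\\
#INFO${tab}FORMAT${tab}unknown\\
##phasing=none" "$file" > "$file.new" && mv "$file.new" "$file"
    else
        # Empty or missing file: sed has no line 1 to act on, so write directly.
        printf '#INFO\tFORMAT\tunknown\n##phasing=none\n' > "$file"
    fi
}
```

Call it as prepend_header file.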
You should use the i command in sed:
file='inputFile'
# insert a line break if file is empty
[[ ! -s $file ]] && echo > "$file"
sed -i.bak $'1i\
#INFO\tFORMAT\tunknown
' "$file"
Or you can ditch sed and do it in the shell using printf:
{ printf '#INFO\tFORMAT\tunknown\n'; cat file; } > file.new &&
mv file.new file
With plain bash and shell utilities:
#!/bin/bash
header=(
$'#INFO\tFORMAT\tunknown'
$'##phasing=none'
)
mv file file.bak &&
{ printf '%s\n' "${header[@]}"; cat file.bak; } > file &&
rm file.bak
Explicitly creating a new file, then moving it:
#!/bin/bash
echo -e '#INFO\tFORMAT\tunknown' | cat - file > file.new
mv file.new file
or slurping the whole content of the file into memory:
#!/bin/bash
printf '#INFO\tFORMAT\tunknown\n%s' "$(<file)" > file
It is trivial with ed if available/acceptable.
printf '%s\n' '0a' $'#INFO\tFORMAT\tunknown' $'##phasing=none' . ,p w | ed -s file
It even creates the file if it does not exist.
I have been trying to rename some specific files based on a table but with no success. It either renames all files or gives error.
The directory contains hundreds of files named with long barcodes, and I want to rename only the files containing the pattern _1_.
Example files:
barcode_1_barcode_SL484171.fastq.gz
barcode_2_barcode_SL484171.fastq.gz
barcode_1_barcode_SL484370.fastq.gz
barcode_2_barcode_SL484370.fastq.gz
mytable.txt:
oldname newname
barcode_1_barcode_SL484171 Description1
barcode_2_barcode_SL484171 Description1
barcode_1_barcode_SL484370 Description2
barcode_2_barcode_SL484370 Description2
Desired output:
Description1.R1.fastq.gz Description2.R1.fastq.gz
As you can see in the table there are two files per description but I only want to rename the ones with the _1_ pattern.
Code I have tried:
for i in *_1_*.fastq.gz; do read oldname newname; mv "$oldname" "$newname".R1.fastq.gz; done < mytable.txt
for i in $(grep '_1_' mytable.txt); do read -r oldname newname; mv ${oldname} ${newname}.R1.fastq.gz; done < mytable.txt
for i in $(grep '_1_' mytable.txt); do oldname=$(cut -f1 $i);newname=$(cut -f2 $i); ln -s ${oldname} ${newname}.R1.fastq.gz; done
while read -r oldname newname
do
if [[ $oldname =~ "_1_" ]]
then
mv $oldname $newname
fi
done < mytable.txt
Something like this.
#!/usr/bin/env bash
while IFS= read -r files; do ##: Loop over the paths printed by `find . -name 'barcode_1_barcode*.gz' | grep -f <(cut -d' ' -f1 table.txt)`
while read -ru9 old_name prefix; do ##: Loop over the output of `grep 'barcode_1_barcode.*' table.txt`, read on file descriptor 9
if [[ $files == *"$old_name"* ]]; then ##: If the path from find contains the first field of table.txt (space delimited)
old_filename="${files%.fastq.gz}" ##: Strip the .fastq.gz extension from the path
extension="${files#"$old_filename"}" ##: Keep only the .fastq.gz extension
# mv -v "$files" "$prefix.R1${extension}"
printf '%s %s %s ==> %s\n' mv -v "$files" "$prefix.R1${extension}" ##: Print the rename that mv would perform
fi
done 9< <(grep 'barcode_1_barcode.*' table.txt)
done < <(find . -name 'barcode_1_barcode*.gz' | grep -f <(cut -d' ' -f1 table.txt) ) ##: Keep only the paths that match the first column/field of table.txt
Output from the OP's sample data/files.
renamed './barcode_1_barcode_SL484370.fastq.gz' -> 'Description2.R1.fastq.gz'
renamed './barcode_1_barcode_SL484171.fastq.gz' -> 'Description1.R1.fastq.gz'
If you're satisfied with the output either move the # from the front of mv to the
front of printf or just delete the entire line with printf and remove the # from
mv in order for mv to actually rename the files.
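For comparison, a shorter sketch that drives the rename from the table alone. It assumes the table is whitespace-delimited with the old name (without .fastq.gz) in the first column, as in the sample; rename_r1 and its optional directory argument are hypothetical, not part of the answer above:

```shell
#!/bin/bash
# rename_r1 TABLE [DIR]
# Hypothetical helper: print an mv command for every _1_ row whose file exists in DIR.
rename_r1() {
    local table=$1 dir=${2:-.} oldname newname
    while read -r oldname newname; do
        [[ $oldname == *_1_* ]] || continue                # skips the header and the _2_ rows
        [[ -e $dir/$oldname.fastq.gz ]] || continue        # skips rows with no matching file
        echo mv -- "$dir/$oldname.fastq.gz" "$dir/$newname.R1.fastq.gz"
    done < "$table"
}
```

Run rename_r1 mytable.txt to preview, and remove the echo once the printed commands look right.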
I have a CSV file of the form:
1,frog
2,truck
3,truck
4,deer
5,automobile
and so on, for about 50 000 entries. I want to create 50 000 separate .txt files named with the number before the comma and containing the word after the comma, like so:
1.txt contains: frog
2.txt contains: truck
3.txt contains: truck
4.txt contains: deer
5.txt contains: automobile
and so on.
This is the script I've written so far, but it does not work properly:
#!/bin/bash
folder=/home/data/cifar10
for file in $(find "$folder" -type f -iname "*.csv")
do
name=$(basename "$file" .txt)
while read -r tag line; do
printf '%s\n' "$line" >"$tag".txt
done <"$file"
rm "$file"
done
The issue is in your inner loop:
while read -r tag line; do
printf '%s\n' "$line" > "$tag".txt
done < "$file"
You need to set IFS to , so that tag and line are parsed correctly:
while IFS=, read -r tag line; do
printf '%s\n' "$line" > "$tag".txt
done < "$file"
With Bash 4.0+ you can use shopt -s globstar instead of find. Unlike for file in $(find ...), the glob is not subject to word splitting or pathname expansion of the results:
shopt -s globstar nullglob
for file in /home/data/cifar10/**/*.csv; do
while IFS=, read -r tag line; do
printf '%s\n' "$line" > "$tag".txt
done < "$file"
done
Note that the variable set by name=$(basename "$file" .txt) is never used in your code.
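The effect of IFS=, can be seen in a small sketch: read puts the text before the first comma in tag and everything after it, including any further commas, in line. split_csv and its output-directory argument are hypothetical additions; the loops above write to the current directory instead:

```shell
#!/bin/bash
# split_csv CSV OUTDIR
# Hypothetical helper: write the text after the first comma of each row to OUTDIR/<tag>.txt.
split_csv() {
    local csv=$1 outdir=$2 tag line
    while IFS=, read -r tag line; do
        [[ -n $tag ]] || continue                  # skip blank lines and rows with an empty tag
        printf '%s\n' "$line" > "$outdir/$tag.txt"
    done < "$csv"
}
```

For example, split_csv file.csv . reproduces the behaviour of the loop above for a single file.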
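An awk alternative (each file is closed after writing, so 50 000 outputs do not exhaust the open-file limit):
awk -F, '{f = $1 ".txt"; print $2 > f; close(f)}' file.csv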
awk 'BEGIN{FS=","} {print $1".txt contains: "$2}' file
1.txt contains: frog
2.txt contains: truck
3.txt contains: truck
4.txt contains: deer
5.txt contains: automobile
The following code works well, but it fails when the folder path contains "," or spaces:
dir data/ > folder_file.txt
IFS=$'\n'
for file in "`cat folder_file.txt`"
do
printf 'File found: %s\n' "$file"
ls "data/$file/" #-----------> "," and "space" break this task
done
Any idea how to escape the special characters?
It works now. Any other advice to make it better?
IFS=$'\n'
a=0
for file in out/*; do
ls "$file" > html_file.txt
for file2 in `cat html_file.txt`; do
echo $file
mv "$file""/""$file2" "$file""/""page_"$a
let a=$a+1
done
a=0
done
This is how you loop on the content of a directory:
#!/bin/bash
shopt -s nullglob
for file in data/*; do
printf 'File found: %s\n' "$file"
ls "$file"
done
We use the shell option nullglob so that the glob * expands to nothing (and hence the loop body never runs) in case there are no matches.
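A quick sketch of the difference nullglob makes, and of the fact that glob results keep commas and spaces intact. count_entries is a hypothetical helper; its body runs in a subshell so that the option does not leak into the calling shell:

```shell
#!/bin/bash
# count_entries DIR
# Hypothetical helper: print how many non-hidden entries DIR contains.
count_entries() (
    shopt -s nullglob        # an unmatched glob now expands to nothing
    entries=("$1"/*)         # one array element per entry, whatever characters it contains
    echo "${#entries[@]}"
)
```

Without nullglob, an empty directory would yield one element, namely the unexpanded pattern itself.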
I have a file input.txt that looks as follows.
abas_1.txt
abas_2.txt
abas_3.txt
1fgh.txt
3ghl_1.txt
3ghl_2.txt
I have a folder ff. The filenames of this folder are abas.txt, 1fgh.txt, 3ghl.txt. Based on the input file, I would like to create and rename the multiple copies in ff folder.
For example in the input file, abas has three copies. In the ff folder, I need to create the three copies of abas.txt and rename it as abas_1.txt, abas_2.txt, abas_3.txt. No need to copy and rename 1fgh.txt in ff folder.
Your valuable suggestions would be appreciated.
You can try something like this (to be run from within your folder ff):
#!/bin/bash
while IFS= read -r fn; do
[[ $fn =~ ^(.+)_[[:digit:]]+\.([^\.]+)$ ]] || continue
fn_orig=${BASH_REMATCH[1]}.${BASH_REMATCH[2]}
echo cp -nv -- "$fn_orig" "$fn"
done < input.txt
Remove the echo if you're happy with it.
If you don't want to run from within the folder ff, just replace the line
echo cp -nv -- "$fn_orig" "$fn"
with
echo cp -nv -- "ff/$fn_orig" "ff/$fn"
The -n option tells cp not to overwrite existing files, and the -v option makes it verbose. The -- tells cp that there are no more options beyond this point, so that it will not be confused if one of the file names starts with a hyphen.
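To see what the regular expression captures, here is the same match isolated in a function. orig_name is a hypothetical name used only for this illustration:

```shell
#!/bin/bash
# orig_name NAME
# Hypothetical helper: print the original file name for a numbered copy such as abas_2.txt.
# Fails (status 1) if NAME has no _<digits> part before its extension.
orig_name() {
    local re='^(.+)_[[:digit:]]+\.([^.]+)$'
    [[ $1 =~ $re ]] || return 1
    # BASH_REMATCH[1] is the base name, BASH_REMATCH[2] the extension.
    printf '%s.%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
}
```

With the sample input, orig_name abas_2.txt prints abas.txt, while orig_name 1fgh.txt fails, which is why that line is skipped by continue.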
Using for and grep:
#!/bin/bash
# Run from within the ff folder; input.txt is the list from the question.
for i in *
do
x="${i%.*}_"
grep -- "^$x" input.txt | while IFS= read -r j
do
cp -n -- "$i" "$j"
done
done
Try this one
#!/bin/bash
while IFS= read -r newFileName; do
#split the string by _ delimiter
arr=(${newFileName//_/ })
extension="${newFileName##*.}"
fileToCopy="${arr[0]}.$extension"
#check for empty : '1fgh.txt' case
if [ -n "${arr[1]}" ]; then
#check if file exists
if [ -f "$fileToCopy" ]; then
echo "copying $fileToCopy -> $newFileName"
cp "$fileToCopy" "$newFileName"
#else
# echo "File $fileToCopy does not exist, so it can't be copied"
fi
fi
done
You can call your script like this:
cat input.txt | ./script.sh
If you could change the format of input.txt, I suggest you adjust it in order to make your task easier. If not, here is my solution:
#!/bin/bash
SRC_DIR=/path/to/ff
INPUT=/path/to/input.txt
BACKUP_DIR=/path/to/backup
for cand in "$SRC_DIR"/*; do
cand=${cand##*/}
grep "^${cand%.*}_" "$INPUT" | while IFS= read -r new
do
cp -fv "$SRC_DIR/$cand" "$BACKUP_DIR/$new"
done
done