remove characters from filename in bash

I am trying to remove specific characters from file names in bash but am not getting the desired result.
for file in /home/cmccabe/Desktop/NGS/API/test/*.vcf.gz; do
mv -- "$file" "${file%%/*_variants_}.vcf.gz"
done
file name
TSVC_variants_IonXpress_004.vcf.gz
desired result
IonXpress_004.vcf.gz
current result (extension in filename repeats)
TSVC_variants_IonXpress_004.vcf.gz.vcf.gz
I have tried moving the * to the end and using /_variants_/, with the same results. Thank you :).

${var%%*foo} removes the longest suffix matching *foo (that is, a string ending with foo) from the end of the value of var. If no suffix matches, nothing is removed. I'm guessing you want ${var##*foo} to trim from the beginning, up through foo. You'll have to add the directory path back separately if you remove it, of course.
mv -- "$file" "/home/cmccabe/Desktop/NGS/API/test/${file##*_variants_}"
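As a sanity check, here is how the two expansions behave on the sample name from the question:

```shell
file="/home/cmccabe/Desktop/NGS/API/test/TSVC_variants_IonXpress_004.vcf.gz"

# %% trims a matching suffix: "/*_variants_" matches nothing at the
# end of the string, so the value comes back unchanged.
echo "${file%%/*_variants_}"

# ## trims the longest matching prefix, up through "_variants_".
echo "${file##*_variants_}"   # IonXpress_004.vcf.gz
```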

find . -type f -name "*.vcf.gz" -exec bash -c 'var="$1"; mv -- "$var" "${var/TSVC_variants_/}"' _ {} \;
may do the job for you. (Note the quoting around "$var" and the expansion, so that paths containing spaces survive.)

Related

Linux Bash - How to remove part of the filename of a file contained in folder

I have several directories containing files whose names contain the name of the folder more other words.
Example:
one/berg - one.txt
two/tree - two.txt
three/water - three.txt
and I would like to remain so:
one/berg.txt
two/tree.txt
three/water.txt
I tried with the sed command, find command, for command, etc.
I have failed to find a way to do it.
Could you help me? Thank you.
Short and simple, if you have GNU find:
find . -name '* - *.*' -execdir bash -c '
for file; do
ext=${file##*.}
mv -- "$file" "${file%% - *}.${ext}"
done
' _ {} +
-execdir executes the given command within the directory where each set of files are found, so one doesn't need to worry about directory names.
for file; do is a shorter way to write for file in "$@"; do.
${file##*.} expands to the contents of $file, with everything up to and including the last . removed (thus, it expands to the file's extension).
"${varname%% - *}" expands to the contents of the variable varname, with everything after <space><dash><space> removed from the end.
In the idiom -exec bash -c '...' _ {} + (as with -execdir), the script passed to bash -c is run with _ as $0, and all files found by find in the subsequent positions.
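The two expansions used above can be checked in isolation (sample name invented for illustration):

```shell
file="berg - one.txt"
ext=${file##*.}        # everything up to and including the last "." removed
stem=${file%% - *}     # everything from " - " onward removed
echo "${stem}.${ext}"  # berg.txt
```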
Here's a way to do it with the help of sed:
#!/bin/bash
find -type f -print0 | \
while IFS= read -r -d '' old_path; do
new_path="$(echo "$old_path" | sed -e 's|/\([^/]\+\)/\([^/]\+\) - \1.\([^/.]\+\)$|/\1/\2.\3|')"
if [[ $new_path != $old_path ]]; then
echo mv -- "$old_path" "$new_path"
# ^^^^ remove this "echo" to actually rename the files
fi
done
You must cd to the top level directory that contains all those files to do this. Also, it contains an echo, so it does not actually rename the files. Run it once to see if you like its output and if you do, remove the echo and run it again.
The basic idea is that we iterate over all files and for each file, we try to find if the file matches with the given pattern. If it does, we rename it. The pattern detects (and captures) the second last component of the path and also breaks up the last component of the path into 3 pieces: the prefix, the suffix (which must match with the previous path component), and the extension.
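The sed pattern can be tried on a single path first (GNU sed, since the pattern relies on \+):

```shell
# Apply the answer's sed expression to one sample path.
old_path="./one/berg - one.txt"
new_path="$(echo "$old_path" | sed -e 's|/\([^/]\+\)/\([^/]\+\) - \1.\([^/.]\+\)$|/\1/\2.\3|')"
echo "$new_path"   # ./one/berg.txt
```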

Can't use $myvar_.* to mv files starting with $myvar_

week=$(date +%W)
I'm trying to move files beginning with $week to another folder using mv.
So I have a file named:
25_myfile.zip
And the number at the beginning is a number of a week. So I want to move it using mv from the directory it's currently in to /mydir/week25/:
mv /mydir/$week\_.* /mydir/week$week;
But I get a stat error.
The problem
When you say
mv /mydir/$week\_.* /mydir/week$week;
# ^^
you are using the syntax $var\_.* (or ${var}_.* if you don't want to have to escape the underscore). You are trying to glob, but failing because you are using regular-expression syntax instead of a glob pattern.
The solution
Use globbing as described in Bash Reference Manual → 3.5.8 Filename Expansion. That is
After word splitting, unless the -f option has been set (see The Set
Builtin), Bash scans each word for the characters ‘*’, ‘?’, and ‘[’.
If one of these characters appears, then the word is regarded as a
pattern, and replaced with an alphabetically sorted list of filenames
matching the pattern (see Pattern Matching).
mv /mydir/$week\_* /mydir/week$week;
# ^
or, using ${ } to define the scope of the name of the variable:
mv /mydir/${week}_* /mydir/week$week;
# ^ ^ ^
Another approach
You just need an expression like:
for file in <matching condition>; do
mv "$file" /another/dir
done
In this case:
for file in ${week}_*; do
mv "$file" /mydir/week"${week}"/
done
Because ${week}_* will expand to those filenames starting with $week plus _.
See an example:
$ touch 23_a
$ touch 23_b
$ touch 23_c
$ touch 24_c
$ d=23
$ echo ${d}*
23_a 23_b 23_c
$ for f in ${d}*; do echo "$f --"; done
23_a --
23_b --
23_c --
Below is another alternative using find
week=25 && find /mydir -type f -not -path "/mydir/week*" \
-name "$week*zip" -exec mv {} "/mydir/week$week" \;

Bash Script; Replacing certain string in filename with another string using find and sed

I'm trying to create a bash script that replaces certain string in filename with another string using find command and sed command.
I want to take two arguments; first being variable OLD, the string I'm looking for in filename (string may also contain spaces), second being variable NEW, the string I want to change it to.
I want to find all files that contains OLD in filename AND ends with .jpg or .png in current directories and all subdirectories and change the OLD part with NEW.
#! /bin/bash
OLD="${1}"
NEW="${2}"
# maybe..?
for file in `find . -name "*$OLD*\.(jpg|png)"`; do
# ...
done
# or this..?
find . -name "*$OLD*\.(jpg|png)" | sed -E "s/$OLD/$NEW/"
# ...
I have wrote some things but they probably don't make any sense and I'm really confused with bash scripting as I am still very new to it. I will really appreciate any help. Thank you!
I'd just do this, as long as you don't have any special characters in $OLD and $NEW:
for file in $(find -E . -regex ".*$OLD.*\.(jpg|png)$")
do
echo ${file/$OLD/$NEW}
done
EDIT: It seems find -E is a BSD extension. With GNU find, use:
find . -regextype posix-extended -regex ".*$OLD.*\.(jpg|png)$"
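Since file names (and $OLD itself) may contain spaces, looping over find's output with for is fragile; a null-delimited loop is safer. A sketch, using made-up sample names in a scratch directory:

```shell
#!/bin/bash
OLD="old part"
NEW="new"

# Scratch directory with hypothetical sample files.
dir=$(mktemp -d)
touch "$dir/photo $OLD 1.jpg" "$dir/$OLD scan.png" "$dir/unrelated.jpg"

find "$dir" \( -name "*$OLD*.jpg" -o -name "*$OLD*.png" \) -print0 |
while IFS= read -r -d '' file; do
    # Quote $OLD inside the expansion so glob characters in it stay literal.
    mv -- "$file" "${file/"$OLD"/"$NEW"}"
done

ls "$dir"
```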

How can I write a script (to automate this process) that finds spaces and special characters in file names? I have a large number of file names in SVN

find . -type f | egrep -i "~||&|#|#|<|>|;|:|!|'^'|,|-|_" | tee temp.txt
I am not sure about special characters like * or $. Can you help me out with this?
First of all, I'd suggest to write a script which takes a single file name and fixes it. Then you can do:
find . -type f -exec /path/to/fixNames.sh "{}" \;
fixNames.sh could then contain:
rename 's/[ \t]/-/' "$1" # blanks
rename "s/'\",//" "$1" # characters to remove
rename 's/&/-n-/' "$1"
Note: Test this with a folder with some bad file names! Only run this against the real files when you know that this doesn't cause problems!
Related:
Introduction to shell scripting
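If rename is not available, the same substitutions can be sketched in pure bash with parameter expansion (fix_name is a hypothetical helper, not part of the original answer; it prints the fixed name rather than moving the file):

```shell
#!/bin/bash
fix_name() {
    local name="$1" tab=$'\t' drop="'\","
    local new="${name//[ $tab]/-}"   # blanks and tabs -> dashes
    new="${new//[$drop]/}"           # quotes and commas removed
    new="${new//&/-n-}"              # & -> -n-
    printf '%s\n' "$new"
}

fix_name 'my file, "v2" & final.txt'   # my-file-v2--n--final.txt
```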
How about setting up two arrays - one for the special characters, one for the replacements (they should contain the same number of indexes)?
#!/bin/bash
SPECIALCHARS=("," " " "&" "\\" "\"")
REPLACEMENTS=("" "-" "-n-" "" "")
for i in $(seq 0 $((${#SPECIALCHARS[@]}-1))); do
find . -exec rename "${SPECIALCHARS[$i]}" "${REPLACEMENTS[$i]}" {} \;
done

How can I read a list of filenames from a file in bash?

I'm trying to write a bash script that will process a list of files whose names are stored one per line in an input file, something like
find . -type f -mtime +15 > /tmp/filelist.txt
for F in $(cat /tmp/filelist.txt) ; do
...
done;
My problem is that filenames in filelist.txt may contain spaces, so the snippet above will expand the line
my text file.txt
to three different filenames, my, text and file.txt. How can I fix that?
Use read:
while read F ; do
echo $F
done </tmp/filelist.txt
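Note that a bare read strips leading whitespace and interprets backslashes; IFS= read -r avoids both. A self-contained sketch (the sample names here are invented):

```shell
# Write a sample list to a temp file, then read it back line by line.
list=$(mktemp)
printf '%s\n' "my text file.txt" "  indented name.txt" > "$list"

out=""
while IFS= read -r F; do   # IFS= keeps leading spaces, -r keeps backslashes
    out+="<$F>"
done < "$list"

echo "$out"   # <my text file.txt><  indented name.txt>
rm -f "$list"
```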
Alternatively use IFS to change how the shell separates your list:
OLDIFS=$IFS
IFS="
"
for F in $(cat /tmp/filelist.txt) ; do
echo $F
done
IFS=$OLDIFS
Alternatively (as suggested by @tangens), convert the body of your loop into a separate script, then use find's -exec option to run it for each file found directly.
You can do this without a temporary file using process substitution:
while read F
do
...
done < <(find . -type f -mtime +15)
use while read
cat $FILE | while read line
do
echo $line
done
You can use input redirection (done < "$FILE") instead of piping from cat.
You could use the -exec parameter of find and use the file names directly:
find . -type f -mtime +15 -exec <your command here> {} \;
The {} is a placeholder for the file name.
pipe your find command straight to while read loop
find . -type f -mtime +15 | while read -r line
do
printf "do something with $line\n"
done
I'm not a bash expert by any means (I usually write my scripts in Ruby or Python to be cross-platform), but I would use a regular expression to escape the spaces in each line before you process it.
For Bash Regex:
http://www.linuxjournal.com/node/1006996
In a similar situation in Ruby ( processing a csv file, and cleaning up each line before using it):
File.foreach(csv_file_name) do |line|
clean_line = line.gsub(/( )/, '\ ')
#this finds the space in your file name and escapes it
#do more stuff here
end
I believe you can skip the temporary file entirely and just directly iterate over the results of find, i.e.:
for F in $(find . -type f -mtime +15) ; do
...
done;
No guarantees that my syntax is correct but I'm pretty sure the concept works.
Edit: If you really do have to process the file with a list of filenames and can't simply combine the commands as I did above, then you can change the value of the IFS variable--it stands for Internal Field Separator--to change how bash determines fields. By default it is set to whitespace, so a newline, space, or tab will begin a new field. If you set it to contain only a newline, then you can iterate over the file just as you did before.
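A minimal sketch of that IFS approach, with a temp file standing in for /tmp/filelist.txt:

```shell
# Count lines read as whole filenames once IFS splits on newlines only.
list=$(mktemp)
printf '%s\n' "my text file.txt" "other file.txt" > "$list"

OLDIFS=$IFS
IFS=$'\n'                  # split only on newlines, not spaces or tabs
count=0
for F in $(cat "$list"); do
    count=$((count + 1))   # each whole line arrives as one filename
done
IFS=$OLDIFS
rm -f "$list"
echo "$count"   # 2
```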