Remove middle of name in bash - bash

I have 300+ files named:
T1_0000106_FS1_MAX_5743.nii.gz T1_0000214_FS1_MAX_5475.nii.gz
T1_0000107_FS1_MAX_5477.nii.gz T1_0000215_FS1_MAX_6162.nii.gz
I would like to remove everything between T1 and _5/6*.nii.gz so:
T1_5743.nii.gz T1_5475.nii.gz
T1_5477.nii.gz T1_6162.nii.gz
I can't figure out why it isn't working; I tried (from another post):
for file in *.gz;
do new_name=$(sed 's/_[^.]*_/_g' <<< "new_name");
mv "$file" "$new_name"; done
and variations of rename/sed but nothing changes.

Problems with your script include, at least,
s/_[^.]*_/_g is not a valid sed command. You appear to want s/_[^.]*_/_/g, or in this case, s/_[^.]*_/_/ would do fine as well.
<<< "new_name" redirects the literal string new_name into sed. You probably mean <<< "$file", since the name to transform is held in $file.
Personally, though, I would not bother with sed for this job, especially if you have a large number of files. Bash is perfectly capable of doing fairly complex string manipulation itself, and your needs don't push it too hard. Consider:
for f in *.gz; do
# The value of $f with the longest trailing match to _* removed
head=${f%%_*}
# The value of $f with the longest leading match to *_ removed
tail=${f##*_}
new_name="${head}_${tail}"
# Sanity check and avoid needless errors
if [ "$f" != "$new_name" ]; then
mv "$f" "$new_name"
fi
done
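For comparison, if you would rather keep sed, the loop from the question with both fixes applied might look like the minimal sketch below. The helper name new_name_for is made up for illustration, and the echo keeps it a dry run.

```shell
#!/usr/bin/env bash
# Compute the new name for one file. The greedy match _[^.]*_ spans from the
# first underscore to the last underscore before the first dot.
new_name_for() {
    sed 's/_[^.]*_/_/' <<< "$1"
}

for file in *.gz; do
    [ -e "$file" ] || continue          # no *.gz files: skip the literal pattern
    new_name=$(new_name_for "$file")
    # Dry run: drop the echo once the printed commands look right.
    [ "$file" = "$new_name" ] || echo mv -- "$file" "$new_name"
done
```

Names with fewer than two underscores before the first dot are left unchanged, so running it twice is harmless.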

You could do
for i in *_5*.nii.gz *_6*.nii.gz; do a=${i%%_*}; b=${i##*_}; [[ $i != "${a}_${b}" ]] && mv -- "$i" "${a}_${b}"; done
Edited following a suggestion that the file could already have been renamed.

Bash's built-in string substitution and its extglob option simplify the replacement of the middle part:
#!/usr/bin/env bash
shopt -s extglob
for file in T1_*.nii.gz; do
echo mv -- "$file" "${file/_+([^.])_/_}"
done
Remove the echo or pipe the output to a shell, if it matches your expectations.
Here is the output of my own test:
mv -- T1_0000106_FS1_MAX_5743.nii.gz T1_5743.nii.gz
mv -- T1_0000107_FS1_MAX_5477.nii.gz T1_5477.nii.gz
mv -- T1_0000214_FS1_MAX_5475.nii.gz T1_5475.nii.gz
mv -- T1_0000215_FS1_MAX_6162.nii.gz T1_6162.nii.gz

Related

Bash rename last underscore in string

I have got a directory with files in which some of them end with an underscore.
I would like to test each file to see if it ends with an underscore and then strip off the underscore.
I am currently running the following code:
for file in *;do
echo $file;
if [[ "${file:$length:1}" == "_" ]];then
mv $file $(echo $file | sed "s/.$//g");
fi
done
But it does not seem to be renaming the files that end with an underscore. For example, if I have a file called all_indoors_ I expect it to give me all_indoors.
You could use built-in string substitution:
for file in *_; do
mv "$file" "${file%_}"
done
Just use a regex to check the string:
for file in *
do
[[ $file =~ _$ ]] && echo mv "$file" "${file%%_}"
done
Once you are sure it works as intended, remove the echo so that the mv command executes!
It may even be cleaner to use *_ so that the for will just loop over the files with a name ending with _, as hek2mgl suggests in comments.
for file in *_
do
echo mv "$file" "${file%%_}"
done
You can use find, which will be recursive:
while IFS= read -r f; do
mv "$f" "${f:0:-1}" # Remove the last character from $f
done < <(find . -type f -name '*_')
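Although not a pure bash approach, you can use the Perl-based rename (originally written by Larry Wall, the person behind Perl). It is not part of every default Linux environment; some distributions instead ship the unrelated rename from util-linux (installed as rename.ul on Debian-based systems), which takes different arguments.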
You use rename with:
rename perlexpr files
(some flags omitted).
So you could use:
rename 's/_$//' *
if you want to remove a trailing underscore from each file name.
As @hek2mgl points out, there are multiple rename commands, so first check that you have picked the right one.

change file names in a directory with a certain pattern at the beginning

I want to remove the numbers from the file names in one directory:
file names:
89_Ajohn_text_phones
3_jpegs_directory
..
What I would like to have
Ajohn_text_phones
jpegs_directory
I tried:
rename 's/^\([0-9]|[0-9][0-9]\)_//g' *
but it did not work.
There are two rename tools. The one you appear to try to use is based on Perl, and as such uses perl-style regular expressions. The escaping rules are a little different from sed; in particular, parentheses for grouping aren't escaped (escaped parentheses are literal parentheses). Use
rename 's/^([0-9]|[0-9][0-9])_//' *
or, somewhat more concisely,
rename 's/^[0-9]{1,2}_//' *
This rename is the default on Debian-based distributions such as Ubuntu.
The other rename tool is from util-linux, and it is a very simple tool that does not work with regexes and cannot handle this particular requirement. It is the default on, for example, Fedora-based distributions, and what's worse, those often don't even package the Perl rename. You can find the Perl rename on CPAN and put it in /usr/local/bin if you want, but otherwise, your simplest option is probably a shell loop with mv and sed:
for f in *; do
# set dest to the target file name
# Note: using sed's extended regex syntax (-r) because it is nice.
# Much less escaping of metacharacters is needed. Note that
# sed on FreeBSD and Mac OS X uses -E instead of -r, so there
# you'd have to change that or use regular syntax, in which
# the regex would have to be written as ^[0-9]\{1,2\}_
dest="$(echo "$f" | sed -r 's/^[0-9]{1,2}_//')"
if [ "$f" = "$dest" ]; then
# $f did not match pattern; do nothing
true;
elif [ -e "$dest" ]; then
# avoid losing files.
echo "$dest already exists!"
else
mv "$f" "$dest"
fi
done
You could put this into a shell script, say rename.sh:
#!/bin/sh
transform="$1"
shift
for f in "$@"; do
dest="$(echo "$f" | sed -r "$transform")"
if [ "$f" = "$dest" ]; then
## do nothing
true;
elif [ -e "$dest" ]; then
echo "$dest already exists!"
else
mv "$f" "$dest"
fi
done
and call rename.sh 's/^[0-9]{1,2}_//' *.
One caveat remains: in
dest="$(echo "$f" | sed -r "$transform")"
there is a possibility that "$f" could be something that echo considers a command line option, such as -n or -e. I do not know a way to solve this portably (if you read this and know one, please leave a comment), but if you are in a position where tethering yourself to bash is not a problem, you can use bash's here strings instead:
dest="$(sed -r "$transform" <<< "$f")"
(and replace the #!/bin/sh shebang with #!/bin/bash). Such files are rare, though, so as timebombs go, this is not unlikely to be a dud.
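One commonly used portable workaround for the echo caveat is printf: its format string is fixed, so the file name is passed purely as data and is never parsed as an option. A minimal sketch, assuming a POSIX sh; the helper name transform_name is hypothetical, and -E is used instead of -r because both GNU and BSD sed accept it:

```shell
#!/bin/sh
# transform_name EXPR NAME: apply the sed expression EXPR to NAME.
# printf '%s\n' treats NAME as data, so names such as -n or -e are safe.
transform_name() {
    printf '%s\n' "$2" | sed -E "$1"
}

transform_name 's/^[0-9]{1,2}_//' '89_Ajohn_text_phones'   # prints Ajohn_text_phones
transform_name 's/^[0-9]{1,2}_//' '-n'                     # prints -n
```

In rename.sh this would replace the echo pipeline, i.e. dest="$(printf '%s\n' "$f" | sed -E "$transform")", while keeping the #!/bin/sh shebang.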
#!/bin/bash
for f in *
do
mv "$f" "${f#*_}"
done

Shell script Issues and Errors when tested in school's program

Files created in 'testdir':
file1 file2.old file3old file4.old
Execution of 'oldfiles2 testdir':
Files in 'testdir' after 'oldfiles2' was run:
file1.old file2.old file3old.old file4.old
Error: 'for' does not seem to loop only through required filenames
Please hit to continue with the Assignment
This is the error I get when my script is run by my school's test program.
Here is the script:
#!/bin/bash
shopt -s extglob nullglob
dir=$1
for file in "$dir"/!(*.old)
do
[[ $file == *.old ]] || mv -- "$file" "$file.old"
done
The assignment was written by someone who doesn't know bash well. Your approach is way better.
Instead of grepping ls, you can use extglob (and also nullglob in case there are no matches):
#!/bin/bash
shopt -s extglob nullglob
dir=$1
for file in "$dir"/!(*.old)
do
mv -- "$file" "$file.old"
done
As demonstrated by your test validator's output, it works perfectly:
file1 does not end in .old, and so it's renamed to file1.old
file2.old ends in .old, and is not renamed.
file3old does not end in .old (old != .old), and is renamed.
file4.old ends in .old, and is not renamed.
However, the validator refuses to accept it, indicating that the validator is wrong. A common mistake for people who don't know bash well (like your professor) is to use grep -v .old or grep -v '.old$', which doesn't actually check if files end .old because . means "any character".
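The difference is easy to see with the four file names from the validator output. This is only an illustrative sketch; the helper names buggy_filter and correct_filter are made up here:

```shell
#!/bin/sh
# Each filter prints the names that would be selected for renaming.
buggy_filter()   { grep -v '.old$'; }    # unescaped dot matches any character
correct_filter() { grep -v '\.old$'; }   # escaped dot matches a literal dot

names='file1
file2.old
file3old
file4.old'

printf '%s\n' "$names" | buggy_filter     # prints only file1
printf '%s\n' "$names" | correct_filter   # prints file1 and file3old
```

The buggy filter drops file3old because the 3 satisfies the unescaped dot, which is consistent with the validator complaining about file3old.old.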
We can emulate this bug in the script:
#!/bin/bash
shopt -s extglob nullglob
dir=$1
for file in "$dir"/!(*?old*)
do
mv -- "$file" "$file.old"
done
This code is objectively wrong, but may pass the incorrect validator. Alternatively, "$dir"/!(*?old) will emulate a buggy grep anchored to the end of the line.
If I read correctly what your teacher wants, then here is a one liner using grep -v and no if statement. You can block it out in the script or leave it as a one liner.
ls | grep -v '\.old' | while read FILE; do mv "${FILE}" "${FILE}.old"; done
BTW I've tested this and it works because the "." in '\.old' is a dot (or period) and not "any character" because it's escaped with a backslash.
Here is sample output from Terminal
System1:test 123$ ls -1
file name 1
file name 2
file name.old
file.old
file1
file2
System1:test 123$ ls | grep -v '\.old' | while read FILE; do mv "${FILE}" "${FILE}.old"; done
System1:test 123$ ls -1
file name 1.old
file name 2.old
file name.old
file.old
file1.old
file2.old
System1:test 123$
Try:
#!/bin/bash
for filename in $(ls $1 | grep -v "\.old$")
do
mv $1/$filename $1/$filename.old
done
In Bash you can use character classes beginning with the inversion character ^ or ! to match all characters except the listed character. In your case:
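for file in "$dir"/*.[^o][^l][^d]*; do
case $file in *.old) ;; *) mv -- "$file" "$file.old" ;; esac
done
That will match files in $dir whose name contains a dot followed by three characters that differ from o, l and d position by position, and move each one to $file.old. Note that names without a dot (such as file1) and names such as file.odd are not matched at all. For a case insensitive version: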
for file in "$dir"/*.[^oO][^lL][^dD]*; do
You can use the bash [[ operator for the test as well, as in [[ "$file" == *.old ]], but it is less portable. (The ^ form of negated character classes is also not portable; POSIX specifies ! instead.) Unless a file name could start with -, there isn't any reason to include -- following mv (but it doesn't hurt either).

How can I grep contents of files with bash only without using find or grep -r?

I have an assignment to write a bash program so that, if I type the following:
-bash-4.1$ ./sample.sh path regex keyword
it produces output like this:
path/sample.txt:12
path/sample.txt:34
path/dir/sample1.txt:56
path/dir/sample2.txt:78
The numbers are the line numbers of the search results. I have absolutely no idea how I can achieve this in bash without using find or grep -r. I am allowed to use grep, sed, awk, …
Break the problem into parts.
First, you need to obtain the file names to search in. How can you list the files in a directory and its subdirectories? (Hint: there's a glob pattern for that.)
You need to iterate over the files. What form of loop should this be?
For each file, you need to read each line from the file in turn. There's a builtin for that.
For each line, you need to test whether the line matches the specified regexp. There's a construct for that.
You need to maintain a counter of the number of lines read in a file to be able to print the line number.
Search for globstar in the bash manual.
See https://unix.stackexchange.com/questions/18886/why-is-while-ifs-read-used-so-often-instead-of-ifs-while-read/18936#18936 regarding while read loops.
shopt -s globstar # to enable **/
GLOBIGNORE=.:.. # to match dot files
dir=$1; regex=$2
for file in "$dir"/**/*; do
[[ -f $file ]] || continue
n=1
while IFS= read -r line; do
if [[ $line =~ $regex ]]; then
echo "$file:$n"
fi
((++n))
done <"$file"
done
It's possible that your teacher didn't intend you to use the globstar feature, which is a relatively recent addition to bash (appeared in version 4.0). If so, you'll need to write a recursive function to recurse into subdirectories.
traverse_directory () {
for x in "$1"/*; do
if [ -d "$x" ]; then
traverse_directory "$x"
elif [ -f "$x" ]; then
grep "$regexp" "$x"
fi
done
}
Putting this into practice:
#!/bin/sh
regexp="$2"
traverse_directory "$1"
Follow-up exercise: the glob pattern * omits files whose name begins with a . (dot files). You can easily match dot files as well by also looping over .*, i.e. for x in .* *; do …. However, this throws the function into an infinite loop as it recurses forever into . (and also ..). How can you change the function to work with dot files as well?
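One possible answer to the follow-up, sketched under a few assumptions: the extra patterns .[!.]* and ..?* cover dot files without ever matching . or .., symbolic links to directories are not followed, and the path:line output is built from grep -n. The names traverse_directory and regexp come from the function above; everything else is illustrative.

```shell
#!/bin/sh
# Recursive search that also visits dot files.
# .[!.]* matches names like .foo and ..?* matches names like ..foo;
# neither pattern can match . or .. themselves.
traverse_directory() {
    for x in "$1"/* "$1"/.[!.]* "$1"/..?*; do
        if [ -d "$x" ] && [ ! -L "$x" ]; then
            traverse_directory "$x"       # real directory: recurse
        elif [ -f "$x" ]; then
            # grep -n prints "N:line"; keep only N and prefix the path.
            grep -n -- "$regexp" "$x" | while IFS=: read -r n rest; do
                printf '%s:%s\n' "$x" "$n"
            done
        fi                                # unmatched patterns fall through
    done
}

# Usage: ./sample.sh path regex
if [ "$#" -ge 2 ]; then
    regexp=$2
    traverse_directory "$1"
fi
```

A pattern that matches nothing stays as a literal string, which is neither a directory nor a regular file, so it is silently skipped.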
while IFS= read -r
do
[[ $REPLY =~ foo ]] && printf '%s\n' "$REPLY"
done < file.txt

Basename puts single quotes around variable

I am writing a simple shell script to make automated backups, and I am trying to use basename to create a list of directories and then parse this list to get the first and the last directory from the list.
The problem is: when I use basename in the terminal, all goes fine and it gives me the list exactly as I want it. For example:
basename -a /var/*/
gives me a list of all the directories inside /var without the / in the end of the name, one per line.
BUT, when I use it inside a script and pass a variable to basename, it puts single quotes around the variable:
while read line; do
dir_name=$(echo $line)
basename -a $dir_name/*/ > dir_list.tmp
done < file_with_list.txt
When running with -x:
+ basename -a '/Volumes/OUTROS/backup/test/*/'
and, therefore, the result is not what I need.
Now, I know there must be a thousand ways to go around the basename problem, but then I'd learn nothing, right? ;)
How to get rid of the single quotes?
And if my directory name has spaces in it?
If your directory name could include spaces, you need to quote the value of dir_name (which is a good idea for any variable expansion, whether you expect spaces or not).
while read line; do
dir_name=$line
basename -a "$dir_name"/*/ > dir_list.tmp
done < file_with_list.txt
(As jordanm points out, you don't need to quote the RHS of a variable assignment.)
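Since the stated goal is to pick the first and the last directory from the list, a bash array may be simpler than a temporary file. The following is only a sketch under assumptions: the function name first_and_last is hypothetical, and nullglob is enabled so that a directory without subdirectories yields an empty array rather than the literal pattern.

```shell
#!/usr/bin/env bash
shopt -s nullglob

# Print the first and last subdirectory name (in glob order) of directory $1.
first_and_last() {
    local dirs=("$1"/*/)                 # quoted variable, unquoted glob
    (( ${#dirs[@]} )) || return 1        # no subdirectories found
    dirs=("${dirs[@]%/}")                # strip the trailing slash
    dirs=("${dirs[@]##*/}")              # keep only the last path component
    printf '%s\n' "${dirs[0]}" "${dirs[${#dirs[@]}-1]}"
}
```

Inside the while read loop this could be called as first_and_last "$line"; because the variable is quoted and only the /*/ part is left to the glob, names containing spaces survive intact.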
Assuming your goal is to populate dir_list.tmp with a list of directories found under each directory listed in file_with_list.txt, this might do.
#!/bin/bash
inputfile=file_with_list.txt
outputfile=dir_list.tmp
rm -f "$outputfile" # the -f makes rm fail silently if file does not exist
while read line; do
# basic syntax checking
if [[ ! ${line} =~ ^/[a-z][a-z0-9/-]*$ ]]; then
continue
fi
# collect targets using globbing
for target in "$line"/*; do
if [[ -d "$target" ]]; then
printf "%s\n" "$target" >> $outputfile
fi
done
done < $inputfile
As you develop whatever tool will process your dir_list.tmp file, be careful of special characters (including spaces) in that file.
Note that I'm using printf instead of echo so that targets whose first character is a hyphen won't cause errors.
This might work
while read; do
find "$REPLY" >> dir_list.tmp
done < file_with_list.txt

Resources