How to split path by last slash? - bash

I have a file (say called list.txt) that contains relative paths to files, one path per line, i.e. something like this:
foo/bar/file1
foo/bar/baz/file2
goo/file3
I need to write a bash script that processes one path at a time, splits it at the last slash and then launches another process feeding it the two pieces of the path as arguments. So far I have only the looping part:
for p in `cat list.txt`
do
# split $p like "foo/bar/file1" into "foo/bar/" as part1 and "file1" as part2
inner_process.sh $part1 $part2
done
How do I split? Will this work in the degenerate case where the path has no slashes?

Use basename and dirname, that's all you need.
part1=$(dirname "$p")
part2=$(basename "$p")
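To answer the degenerate case from the question: for a path with no slash, dirname prints a single dot. A small sketch to check this (show_split is a hypothetical helper name, not part of the answer above):

```shell
# Hypothetical helper: print the directory part and the file part of one path.
show_split() {
    printf '%s %s\n' "$(dirname "$1")" "$(basename "$1")"
}

show_split "foo/bar/file1"   # prints: foo/bar file1
show_split "file3"           # prints: . file3
```

Note that dirname drops the trailing slash, so part1 is "foo/bar" rather than "foo/bar/".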

A pure-bash approach that is safe for filenames containing spaces or unusual characters (provided inner_process.sh handles them correctly, but that's another story):
while read -r p; do
[[ "$p" == */* ]] || p="./$p"
inner_process.sh "${p%/*}" "${p##*/}"
done < list.txt
It also doesn't fork dirname and basename (in subshells) for each file.
The line [[ "$p" == */* ]] || p="./$p" handles the case where $p contains no slash: it prepends ./ so that "${p%/*}" yields . instead of leaving the filename unchanged.
See the Shell Parameter Expansion section in the Bash Reference Manual for more info on the % and ## symbols.
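To see the two expansions in isolation, here is a minimal sketch. The function name split_last_slash and the variables dir_part and base_part are made up for illustration, and a case statement stands in for the [[ ]] test so the snippet also runs in a plain POSIX sh:

```shell
# Hypothetical helper: sets dir_part and base_part from the path in $1.
split_last_slash() {
    p=$1
    case $p in
        */*) ;;             # already contains a slash, nothing to do
        *)   p="./$p" ;;    # no slash: make the directory part explicit
    esac
    dir_part=${p%/*}        # remove the shortest suffix matching "/*"
    base_part=${p##*/}      # remove the longest prefix matching "*/"
}

split_last_slash "foo/bar/baz/file2"
echo "$dir_part | $base_part"    # foo/bar/baz | file2

split_last_slash "file3"
echo "$dir_part | $base_part"    # . | file3
```

One difference from dirname: for a path like /file1 the directory part comes out empty, whereas dirname would print /.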

I found a great solution from this source.
p=/foo/bar/file1
path=${p%/*}
file=${p##*/}
This also works with spaces in the path, and it needs neither echo nor a command substitution.

While basename and dirnames are really helpful, maybe you are in the same situation as me:
I need to get only the first N folders of a path, and I can be in any folder, like these ones: /home/me/folder/i/want/, /home/me/, or /home/me/folder/i/want/folder/i/dont/want/.
So I used cut.
Here's the command to get only /home/me/folder/i/want, no matter where I am:
echo "/home/me/folder/i/want/folder/i/dont/want" | cut -f 1,2,3,4,5,6 -d "/"
Here, cut splits the string on "/" characters and prints only fields 1 through 6. Field 1 is the empty string before the leading slash, which is why six fields yield five path components.
Here are some examples:
$ echo $PWD
/home/me/folder/i/want/folder/i/dont/want
$ echo $PWD | cut -f 1,2,3,4,5,6 -d "/"
/home/me/folder/i/want
$ cd ../../../..
$ echo $PWD
/home/me/folder/i/want
$ echo $PWD | cut -f 1,2,3,4,5,6 -d "/"
/home/me/folder/i/want
$ cd ~
$ echo $PWD | cut -f 1,2,3,4,5,6 -d "/"
/home/me
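The same idea can be wrapped in a small function so the number of components is a parameter. This is only a sketch: first_components is a hypothetical name, and it assumes an absolute path with no newlines in it.

```shell
# Hypothetical helper: keep the first N components of an absolute path.
# $1 = absolute path, $2 = number of components to keep.
first_components() {
    # Field 1 is the empty string before the leading "/", hence N + 1 fields.
    printf '%s\n' "$1" | cut -d/ -f"1-$(( $2 + 1 ))"
}

first_components /home/me/folder/i/want/folder/i/dont/want 5   # /home/me/folder/i/want
first_components /home/me 5                                    # /home/me
```

When the path has fewer components than requested, cut simply prints what is there.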

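Here is one example that renames every file in the current directory so that its extension becomes .xml. It uses a glob instead of parsing ls, and quotes the names so spaces survive:
for f in *; do
# ${f%%.*} keeps everything before the first dot, like cut -f 1 -d "."
mv -- "$f" "${f%%.*}.xml"
done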

Related

Remove part of name of multiple files on mac os

I have a directory full of .png files with random characters in the middle of the filenames, like:
T1_021_É}ÉcÉjÉV_solid box.png
T1_091_ÉRÉjÉtÉ#Å[_City.png
T1_086_ÉnÉiÉ~ÉYÉL_holiday.png
I expect this after removing
T1_021_solid box.png
T1_091_City.png
T1_086_holiday.png
Thank you
Using for to collect the file lists and bash parameter expansion with substring removal, you can do the following in the directory containing the files:
for i in T1_*; do
beg="${i%_*_*}" ## trim from the back through the 2nd-to-last '_'
end="${i##*_}" ## trim from the front through the last '_'
mv "$i" "${beg}_$end" ## mv file to new name.
done
(Note: you don't have to use the variables beg and end; you can just combine both parameter expansions to form the new filename, e.g. mv "$i" "${i%_*_*}_${i##*_}". It's up to you, but beg and end make things a bit more readable.)
Result
New file names:
$ ls -al T1_*
T1_021_solid box.png
T1_086_holiday.png
T1_091_City.png
Just another way to approach it from bash only.
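If you want to check the new names before touching any files, the two expansions can be tried on plain strings first. A minimal sketch, where strip_third_field is a hypothetical helper and XXXX stands in for the random characters (assumed to contain no underscore themselves):

```shell
# Hypothetical helper: print the new name for one file without renaming anything.
strip_third_field() {
    name=$1
    # ${name%_*_*} drops the 2nd-to-last '_' and everything after it;
    # ${name##*_}  keeps only what follows the last '_'.
    printf '%s\n' "${name%_*_*}_${name##*_}"
}

strip_third_field "T1_091_XXXX_City.png"        # T1_091_City.png
strip_third_field "T1_021_XXXX_solid box.png"   # T1_021_solid box.png
```

Because the expansions are quoted, the space in "solid box.png" is preserved.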
Using cut
You can use cut to remove the 3rd field using '_' as the delimiter with :
for i in T1_*; do
mv "$i" "$(cut -d'_' -f-2,4- <<< "$i")" ## quoted so names with spaces survive
done
(same output)
The only drawback is that using cut in a command substitution spawns an additional subshell on each iteration.
If the random characters always have an _ before and after them:
find . -type f -iname "T1_0*" 2>/dev/null | while IFS= read -r file; do
mv "${file}" "$(echo "${file}" | cut -d'_' -f1,2,4-)"
done
Explanation:
Find all files that start with T1_
Read the list line by line using the while loop
Use _ as delimiter and cut the 3rd column
Use mv to rename
Filenames after renaming:
T1_021_solid box.png
T1_086_holiday.png
T1_091_City.png

Substitute shortest match of pattern in filename

I have files with the following filename pattern:
C14_1_S1_R1_001_copy1.fastq.gz
That I would like to be renamed this way:
C14_1_S1_R1.fastq.gz
I have tested unsuccessfully the following pattern replacement strategy:
for f in *.fastq.gz; do echo mv "$f" "${f/_*./_}"; done
Any suggestion is welcome.
Your original filename has several underscore characters but you only want to remove from the second to last underscore. In that case, try:
mv "$f" "${f%_*_*}.fastq.gz"
Consider a directory with these files:
$ ls -1
C14_1_S1_R1_001_copy1.fastq.gz
C15_1_S1_R1_001_copy1.fastq.gz
If we run our loop and then run a new ls, we see the changed filenames:
$ for f in ./*.fastq.gz; do mv "$f" "${f%_*_*}.fastq.gz"; done
$ ls -1
C14_1_S1_R1.fastq.gz
C15_1_S1_R1.fastq.gz
The key here is that ${var%word} is suffix removal and it matches the shortest possible suffix that matches the glob word. Thus, ${f%_*_*} removes the second-to-last underscore character and everything after it. ${f%_*_*}.fastq.gz removes the second-to-last underscore character and everything after and then restores your desired suffix of .fastq.gz.
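The difference between the shortest match (%) and the longest match (%%) is easy to see on the sample filename. A small sketch using only variable assignments (the variable names are made up for illustration):

```shell
f="C14_1_S1_R1_001_copy1.fastq.gz"

shortest=${f%_*_*}     # shortest suffix matching _*_* removed
longest=${f%%_*_*}     # longest suffix matching _*_* removed

echo "$shortest"              # C14_1_S1_R1
echo "$longest"               # C14
echo "${shortest}.fastq.gz"   # C14_1_S1_R1.fastq.gz
```

With %% the match starts at the first underscore, so almost the whole name is removed, which is why the single % is the right choice here.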
str="C14_1_S1_R1_001_copy1.fastq.gz"
front=$(echo "${str}" | cut -d'_' -f1-4)
back=$(echo "${str}" | cut --complement -d'.' -f1)
echo "${front}.${back}"
With regex using the =~ test operator and BASH_REMATCH
#!/usr/bin/env bash
for file in *.fastq.gz; do
if [[ $file =~ ^(.+)(_[[:digit:]]+_copy.*[^\.])(\.fastq\.gz)$ ]]; then
echo mv -v "$file" "${BASH_REMATCH[1]}${BASH_REMATCH[3]}"
fi
done
Basically it just splits C14_1_S1_R1_001_copy1.fastq.gz into three parts.
BASH_REMATCH[1] has C14_1_S1_R1
BASH_REMATCH[2] has _001_copy1
BASH_REMATCH[3] has .fastq.gz
Remove the echo if you're ok with the output so the files can be renamed.

How to split the contents of `$PATH` into distinct lines?

Suppose echo $PATH yields /first/dir:/second/dir:/third/dir.
Question: How does one echo the contents of $PATH one directory at a time as in:
$ newcommand $PATH
/first/dir
/second/dir
/third/dir
Preferably, I'm trying to figure out how to do this with a for loop that issues one instance of echo per instance of a directory in $PATH.
echo "$PATH" | tr ':' '\n'
That should do the trick. It simply takes the output of echo "$PATH" and replaces every colon with a newline.
Note that the quotation marks around $PATH prevent the shell from collapsing runs of successive spaces in the value while still outputting the content of the variable.
As an additional option (and in case you need the entries in an array for some other purpose) you can do this with a custom IFS and read -a:
IFS=: read -r -a patharr <<<"$PATH"
printf %s\\n "${patharr[@]}"
Or since the question asks for a version with a for loop:
for dir in "${patharr[@]}"; do
echo "$dir"
done
How about this:
echo "$PATH" | sed -e 's/:/\n/g'
(See sed's s command; sed -e 'y/:/\n/' will also work, and is equivalent to the tr ":" "\n" from some other answers.)
It's preferable not to complicate things unless absolutely necessary: a for loop is not needed here. There are other ways to execute a command for each entry in the list, more in line with the Unix Philosophy:
This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
such as:
echo "$PATH" | sed -e 's/:/\n/g' | xargs -n 1 echo
This is functionally equivalent to a for loop iterating over the PATH elements, executing that last echo command for each element. The -n 1 tells xargs to supply only 1 argument to its command; without it we would get the same output as echo "$PATH" | sed -e 'y/:/ /'.
Since this uses xargs, which has built-in support to split the input, and echoes the input if no command is given, we can write that as:
echo -n "$PATH" | xargs -d ':' -n 1
The -d ':' tells xargs to use : to separate its input rather than whitespace (-d is a GNU xargs extension), and the -n tells echo not to write a trailing newline; otherwise we end up with a blank trailing line.
Here is another, shorter one:
echo -e "${PATH//:/\\n}"
You can use tr (translate) to replace the colons (:) with newlines (\n), and then iterate over that in a for loop.
directories=$(echo $PATH | tr ":" "\n")
for directory in $directories
do
echo $directory
done
My idea is to use echo and awk.
echo "$PATH" | awk 'BEGIN {FS=":"} {for (i=1; i<=NF; i++) print $i}'
EDIT
This command is better than my former idea.
echo "$PATH" | awk 'BEGIN {FS=":"; OFS="\n"} {$1=$1; print $0}'
If you can guarantee that PATH does not contain embedded spaces, you can:
for dir in ${PATH//:/ }; do
echo $dir
done
If there are embedded spaces, this will fail badly.
# preserve the existing internal field separator
OLD_IFS=${IFS}
# define the internal field separator to be a colon
IFS=":"
# do what you need to do with $PATH
for DIRECTORY in ${PATH}
do
echo ${DIRECTORY}
done
# restore the original internal field separator
IFS=${OLD_IFS}
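A variation on the same IFS idea that avoids the save-and-restore step is to do the splitting inside a subshell, so the changed IFS never leaks out. This is a sketch only; print_path_entries is a hypothetical name:

```shell
# Hypothetical helper: print each entry of a colon-separated list on its own line.
# The function body is a ( ... ) subshell, so the IFS and set -f changes
# are discarded automatically when it returns.
print_path_entries() (
    IFS=:
    set -f                     # disable globbing so * in an entry is not expanded
    for dir in $1; do          # unquoted on purpose: split on ':' only
        printf '%s\n' "$dir"
    done
)

print_path_entries "/first/dir:/second dir:/third/dir"
```

Called as print_path_entries "$PATH", it prints one directory per line, and entries with embedded spaces stay intact because the space is no longer part of IFS.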

How could I append '\' in front of the space within a file name?

I was working on a program that could transfer files using sftp program:
sftp -oBatchMode=no -b ${BATCH_FILE} user@123.123.123.123:/home << EOF
bye
EOF
One of my requirements is that I must use a BATCH_FILE with sftp, and the batch file is generated by the following script:
files=$(ls -1 ${SRC_PATH}/*.txt)
echo "$files" > ${TEMP_FILE}
while read file
do
if [ -s "${file}" ]
then
echo ${file} >> "${PARSE_FILE}" ## line 1
fi
done < ${TEMP_FILE}
awk '$0="put "$0' ${PARSE_FILE} > ${BATCH_FILE}
Somehow my program isn't able to handle files with spaces in their names. I tried using the following code to replace line 1, but it failed; the output shows filename\.txt.
newfile=`echo $file | tr ' ' '\\ '`
echo ${newfile} >> "${PARSE_FILE}"
In order to handle file name with space, how could I append a \ in front of the space within a file name?
THE PROBLEM
The problem is that tr SET1 SET2 will replace the Nth character in SET1 with the Nth character in SET2, which means that you are effectively replacing every space by \, instead of adding a backslash before every space.
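The difference between mapping a character and inserting one in front of it can be seen side by side. A minimal sketch on a made-up filename (the variable names are only for illustration):

```shell
name="file with spaces.txt"

# tr maps one character to one character: each space BECOMES a backslash.
mapped=$(printf '%s\n' "$name" | tr ' ' '\\')

# sed substitutes a string: each space becomes backslash + space.
escaped=$(printf '%s\n' "$name" | sed 's/ /\\ /g')

printf '%s\n' "$mapped"     # file\with\spaces.txt
printf '%s\n' "$escaped"    # file\ with\ spaces.txt
```

tr can never make the string longer, so it is the wrong tool for adding an escape character.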
PROPOSED SOLUTION
Instead of trying to escape the spaces by hand, wrap every use of a variable that might contain spaces in double quotes and let the shell handle the trouble for you.
See the below example:
$ echo $FILENAME
file with spaces.txt
$ ls $FILENAME
ls: cannot access file: No such file or directory
ls: cannot access with: No such file or directory
ls: cannot access spaces.txt: No such file or directory
$ ls "$FILENAME"
file with spaces.txt
But I really wanna replace stuff..
Well, if you really want a command to change every ' ' (space) into '\ ' (backslash, space) you could use sed with a basic replace-pattern, as the below:
$ echo "file with spaces.txt" | sed 's, ,\\ ,g'
file\ with\ spaces.txt
I haven't looked too closely at what you're trying to do there, but I do know that bash can handle filenames with spaces in them if you double-quote them. Why not try quoting every filename variable and see if that works? You're quoting some of them but not all yet.
Like try these: "${newfile}" or just "$newfile" "$file" "$tempfile" etc...
You can further simplify your code if you're using Bash:
function generate_batch_file {
for FILE in "${SRC_PATH}"/*.txt; do
[[ -s $FILE ]] && echo "put ${FILE// /\\ }"
done
}
sftp -oBatchMode=no -b <(generate_batch_file) user@123.123.123.123:/home <<< "bye"
You can also try renaming the file to something without spaces, doing the transfer, and renaming it back after it is done.

Recursive BASH renaming

EDIT: Ok, I'm sorry, I should have specified that I was on Windows, and using win-bash, which is based on bash 1.14.2, along with the gnuwin32 tools. This means all of the solutions posted unfortunately didn't help out. It doesn't contain many of the advanced features. I have however figured it out finally. It's an ugly script, but it works.
#!/bin/bash
function readdir
{
cd "$1"
for infile in *
do
if [ -d "$infile" ]; then
readdir "$infile"
else
renamer "$infile"
fi
done
cd ..
}
function renamer
{
#replace " - " with a single underscore.
NEWFILE1=`echo "$1" | sed 's/\s-\s/_/g'`
#replace spaces with underscores
NEWFILE2=`echo "$NEWFILE1" | sed 's/\s/_/g'`
#replace "-" dashes with underscores.
NEWFILE3=`echo "$NEWFILE2" | sed 's/-/_/g'`
#remove exclamation points
NEWFILE4=`echo "$NEWFILE3" | sed 's/!//g'`
#remove commas
NEWFILE5=`echo "$NEWFILE4" | sed 's/,//g'`
#remove single quotes
NEWFILE6=`echo "$NEWFILE5" | sed "s/'//g"`
#replace & with _and_
NEWFILE7=`echo "$NEWFILE6" | sed "s/&/_and_/g"`
#remove typographic (curly) apostrophes
NEWFILE8=`echo "$NEWFILE7" | sed "s/’//g"`
mv "$1" "$NEWFILE8"
}
for infile in *
do
if [ -d "$infile" ]; then
readdir "$infile"
else
renamer "$infile"
fi
done
ls
I'm trying to create a bash script to recurse through a directory and rename files, to remove spaces, dashes and other characters. I've gotten the script working fine for what I need, except for the recursive part of it. I'm still new to this, so it's not as efficient as it should be, but it works. Anyone know how to make this recursive?
#!/bin/bash
for infile in *.*;
do
#replace " - " with a single underscore.
NEWFILE1=`echo $infile | sed 's/\s-\s/_/g'`;
#replace spaces with underscores
NEWFILE2=`echo $NEWFILE1 | sed 's/\s/_/g'`;
#replace "-" dashes with underscores.
NEWFILE3=`echo $NEWFILE2 | sed 's/-/_/g'`;
#remove exclamation points
NEWFILE4=`echo $NEWFILE3 | sed 's/!//g'`;
#remove commas
NEWFILE5=`echo $NEWFILE4 | sed 's/,//g'`;
mv "$infile" "$NEWFILE5";
done;
find is the command able to display all elements in a filesystem hierarchy. You can use it to execute a command on every found file or pipe the results to xargs which will handle the execution part.
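Take care that for infile in *.* only covers the current directory, and only names that contain a dot. When piping find output to another program, check the -print0 option of find, coupled to the -0 option of xargs, so that filenames containing whitespace survive.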
All those semicolons are superfluous and there's no reason to use all those variables. If you want to put the sed commands on separate lines and intersperse detailed comments you can still do that.
#!/bin/bash
find . | while read -r file
do
newfile=$(echo "$file" | sed '
#replace " - " with a single underscore.
s/\s-\s/_/g
#replace spaces with underscores
s/\s/_/g
#replace "-" dashes with underscores.
s/-/_/g
#remove exclamation points
s/!//g
#remove commas
s/,//g')
mv "$file" "$newfile"
done
This is much shorter:
#!/bin/bash
find . | while read -r file
do
# replace " - " or space or dash with underscores
# remove exclamation points and commas
newfile=$(echo "$file" | sed 's/\s-\s/_/g; s/\s/_/g; s/-/_/g; s/!//g; s/,//g')
mv "$file" "$newfile"
done
Shorter still:
#!/bin/bash
find . | while read -r file
do
# replace " - " or space or dash with underscores
# remove exclamation points and commas
# (\s is not special inside a bracket expression, so use [:space:] there)
newfile=$(echo "$file" | sed 's/\s-\s/_/g; s/[-[:space:]]/_/g; s/[!,]//g')
mv "$file" "$newfile"
done
In bash 4, setting the globstar option allows recursive globbing.
shopt -s globstar
for infile in **
...
Otherwise, use find.
while IFS= read -r infile
do
...
done < <(find ...)
or
find ... -exec ...
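For the -exec route, here is one way it could look. It is a sketch under some assumptions rather than a drop-in replacement: sanitize_tree is a hypothetical name, only a subset of the question's substitutions is applied, and it assumes no two names collide after cleaning (mv would silently overwrite). It renames only the last path component and walks depth-first, so a directory is renamed after everything inside it.

```shell
# Hypothetical helper: clean up file and directory names below the directory in $1.
sanitize_tree() (
    cd "$1" || exit 1
    # -depth: children before their parent directory; ! -name . skips the top itself.
    find . -depth ! -name . -exec sh -c '
        for path in "$@"; do
            dir=${path%/*}          # everything before the last slash
            base=${path##*/}        # last component only
            new=$(printf "%s" "$base" | sed "s/ - /_/g; s/[- ]/_/g; s/[!,]//g")
            [ "$base" = "$new" ] || mv -- "$path" "$dir/$new"
        done
    ' sh {} +
)
```

Usage would be sanitize_tree /some/dir; it is worth trying it on a copy first, or replacing mv with echo mv to preview the renames.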
I've used 'find' in the past to locate files then had it execute another application.
See '-exec'
rename 's/pattern/replacement/' glob_pattern

Resources