I'm writing a Bash script, and it prints every space-separated part of a file name on a new line.

for f in $( find tasks/1_uniq/ -type f -follow -print | sed -r 's/[[:blank:]]+/ /g' ); do
md5sum $f
done
And this is what it prints when it finds a space:
tasks/1_uniq/two/6/66/test/me
&
my
friends
I can't manage to escape the spaces properly.

This is due to word splitting: an unquoted expansion is split on the characters in $IFS (space, tab and newline by default), whereas a double-quoted expansion is passed as a single argument. It seems you want to split on newlines only, which is easy to do with read:
while IFS= read -r filepath; do
    md5sum "$filepath" # note the double quotes to avoid word splitting
done < <( find tasks/1_uniq/ -type f -follow -print | sed -r 's/[[:blank:]]+/ /g' )
I don't understand the purpose of the sed command: it modifies the file names, so md5sum may be given names that don't exist. Another option is
find ... -exec md5sum {} +
where ... is replaced with the find options.
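As a minimal sketch of the point above, the loop below reads NUL-delimited paths from find, so spaces in names never reach word splitting. The directory and file names are made up for the demonstration, and md5sum is assumed to be the GNU coreutils tool.

```bash
#!/usr/bin/env bash
# Hypothetical demo tree; none of these names come from the question.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/two/6"
printf 'hello\n' > "$demo_dir/two/6/me & my friends"
printf 'world\n' > "$demo_dir/plain"

# find emits each path followed by a NUL byte; read -d '' consumes
# one whole path at a time, whatever whitespace it contains.
checksums=()
while IFS= read -r -d '' filepath; do
    checksums+=("$(md5sum "$filepath")")
done < <(find "$demo_dir" -type f -print0)

printf '%s\n' "${checksums[@]}"
count=${#checksums[@]}
rm -rf "$demo_dir"
```

Each file produces exactly one checksum line, including the one whose name contains spaces and an ampersand.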

From your original code, it seems that your goal is to print the MD5 checksums of all files under a directory. In that case, you can simply use rhash:
rhash -r -M dir/
-r is for recursive and -M is for the MD5 hash sum.

Related

bash script create file in dir containing (multiple) file types

I want to create a file (containing the dir name) in every subdirectory of e.g. music that has at least one .mp3 file. I want exactly one file created whether the directory holds one .mp3 or many, and the directory name can contain whitespace. I tried something like this:
for i in $(find . -name "*.mp3" -printf "%h\n"|sort -u); do echo "$i" ; done
This breaks the path into two lines where the whitespace was, so:
./directory one
outputs as:
./directory
one
The construct $( ... ) in your
for x in $(find ... | ... | ... ) ; do ... ; done
executes whatever is in $( ... ) and hands its output to the for construct. The newline-separated output that you would see in the terminal, had you run the ... command from the shell prompt, arrives as one long list of names separated by blanks, as in
% ls -d1 b*
bianco nodi.pdf
bin
b.txt
% echo $(ls -d1 b*)
bianco nodi.pdf bin b.txt
%
Now the for loop assigns to i the first item in the list, in my example bianco, and of course that's not what you want...
This situation is handled with the following idiom, in which the shell reads ONE WHOLE LINE at a time:
% ls -d1 b* | while read i ; do echo "$i" ; ... ; done
in your case
find . -name "*.mp3" -printf "%h\n" | sort -u | while read i ; do
echo "$i"
done
hth, ciao
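The difference between the two loops can be counted directly. This is a small sketch that reuses the three names from the ls listing above as a fixed string, so no files need to exist.

```bash
#!/usr/bin/env bash
# The three names from the listing above, one per line.
listing=$'bianco nodi.pdf\nbin\nb.txt'

# Unquoted expansion: the shell splits on blanks as well as newlines.
split_count=0
for i in $listing; do
    split_count=$((split_count + 1))
done

# read consumes one whole line per iteration.
line_count=0
while IFS= read -r i; do
    line_count=$((line_count + 1))
done <<< "$listing"

echo "for loop saw $split_count items, while read saw $line_count lines"
```

The for loop sees four items because "bianco nodi.pdf" is split in two, while the read loop sees the three lines.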
Edit
My answer above catches the most common case of blanks inside the filename, but it still fails if there are blanks at the beginning or the end of the filename, and it also fails if there are newlines embedded in the filename.
Hence, I'm modifying my answer quite a bit, according to the advice from BroSlow (see comments below).
find . -name "*.mp3" -printf "%h\0" | \
sort -uz | while IFS= read -r -d '' i ; do
...
done
Key points
find's printf now separates filenames with a NUL.
sort, by the -z option, splits elements to be sorted on NULs rather than on newlines.
IFS= stops read from splitting its input on whitespace, so leading and trailing blanks survive.
read's option -d (this is a bashism) sets the character on which the input is split into records (by default, a newline).
Here I have -d '', which bash takes as specifying NUL, where BroSlow had $'\0', which bash expands, by the rules of ANSI-C quoting, to '' but which may be clearer because it shows an explicit reference to the NUL character.
I'd like to close with "Thank you, BroSlow".
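Putting the key points together, here is a sketch of the whole task, assuming GNU find and GNU sort. The marker file name dirname.txt and the demo directories are hypothetical choices, not something the question specifies.

```bash
#!/usr/bin/env bash
# Hypothetical demo tree with one directory that holds two .mp3 files.
music=$(mktemp -d)
mkdir -p "$music/directory one" "$music/empty dir"
touch "$music/directory one/a.mp3" "$music/directory one/b.mp3"

created=0
# Process substitution (instead of a pipe) keeps the loop in the
# current shell, so the counter is still set after the loop ends.
while IFS= read -r -d '' dir; do
    # "dirname.txt" is an assumed marker name; it stores the directory name.
    printf '%s\n' "${dir##*/}" > "$dir/dirname.txt"
    created=$((created + 1))
done < <(find "$music" -name '*.mp3' -printf '%h\0' | sort -uz)

marker=$(cat "$music/directory one/dirname.txt")
echo "created $created marker file(s); content: $marker"
rm -rf "$music"
```

sort -uz collapses the two hits in "directory one" into a single entry, so only one marker file is written there and none in the directory without .mp3 files.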

Rename Files to original extensions

I need help writing a bash script that will rename files that are being output with names of the form name.suffix.date. I need these files renamed to name.date.suffix instead.
Edited:
Changed suffix from date to ~
Here's what I have so far:
find . -type f -name "*.~" -print0 | while read -d $'\0' f
do
new=`echo "$f" | sed -e "s/~//"`
mv "$f" "$new"
done
This changes the suffix back to the original, but I can't figure out how to get the date placed before the extension (fname??)
You can use regular expression matching to pull apart the original file name:
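find . -type f -name "*.~" -print0 | while IFS= read -r -d $'\0' f
do
    dir=${f%/*}      # directory part of the path
    fname=${f##*/}   # file name without the directory
    # capture name, suffix and date; skip names that don't match
    [[ $fname =~ (.+)\.([^.]+)\.([^.]+)\.~$ ]] || continue
    name=${BASH_REMATCH[1]}
    suffix=${BASH_REMATCH[2]}
    d=${BASH_REMATCH[3]}
    mv "$f" "$dir/$name.$d.$suffix"
done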
Bash-only solution:
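while IFS=. read -r -u 9 -d '' name suffix date tilde
do
    mv "${name}.${suffix}.${date}.~" "${name}.${date}.${suffix}"
# -printf '%P\0' (GNU find) drops the leading "./", whose dot would otherwise be split off as an empty first field
done 9< <(find . -type f -name "*.~" -printf '%P\0')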
Notes:
-d '' gives you the same result as -d $'\0'
IFS=. splits each file name on the dots while reading it. Of course this means it breaks if there are dots anywhere else in the path.
Should otherwise work with pretty much any filenames, including those containing spaces, newlines and other funny business.
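A variant that tolerates extra dots in the name can peel the fields off from the right with parameter expansion. This is only a sketch: the function name reorder_name is hypothetical, and it assumes the name.suffix.date.~ layout used in the answers above.

```bash
#!/usr/bin/env bash
# reorder_name is a hypothetical helper: NAME.SUFFIX.DATE.~ -> NAME.DATE.SUFFIX
reorder_name() {
    local fname=$1 base date suffix
    base=${fname%'.~'}     # drop the trailing ".~"
    date=${base##*.}       # last dot-separated field is the date
    base=${base%.*}
    suffix=${base##*.}     # next field from the right is the suffix
    base=${base%.*}        # whatever remains is the name, dots included
    printf '%s.%s.%s\n' "$base" "$date" "$suffix"
}

reorder_name 'my file.txt.20240101.~'
reorder_name 'a.b.txt.20240101.~'
```

Because only the last two fields are split off, a name such as a.b keeps its inner dot. The function only prints the new name; the caller would pass it to mv.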
Create a list of the files first and redirect it to a file:
ls > fileList.txt
Open the file and read it line by line in Perl. Use a regex to match the parts of each file name and capture them like this:
my ($fileName,$suffix,$date)=($WholeFileName=~/(.*)\.(.*)\.(.*)/);
This should capture the three separate parts for you. Now all you need to do is move the old file to the new file name, which is a concatenation of the three variables you have captured: $newFileName = $fileName . "." . $date . "." . $suffix; If you have a sample file name, post a comment and I can reply with a short script. Perl is not the only way: you could also do this with bash or awk.
cut each part of your filenames:
FIN=$(echo test.12345.ABCDEF | sed -e 's/[a-zA-Z0-9]*[\\.][a-zA-Z0-9]*[\\.]//')
DEBUT=$(echo test.12345.ABCDEF | sed -e 's/[\\.][a-zA-Z0-9]*[\\.][a-zA-Z0-9]*//')
MILIEU=$(echo test.12345.ABCDEF | sed -e 's/'${FIN}'//' -e 's/'${DEBUT}'//' -e 's/[\.]*//g')
paste each part as expected:
echo ${DEBUT}.${FIN}.${MILIEU}
rename --no-act 's/(name-regex)\.(suffix-regex)\.(date-regex)/$1.$3.$2/' *
Tweak the three regexes to fit your file names, and remove --no-act when you're happy with the result to actually rename the files.

Bash Script which recursively makes all text in files lowercase

I'm trying to write a shell script which recursively goes through a directory, then in each file converts all Uppercase letters to lowercase ones. To be clear, I'm not trying to change the file names but the text in the files.
Considerations:
This is an old Fortran project which I am trying to make more accessible
I do not want to create a new file but rather write over the old one with the changes
There are several different file extensions in this directory, including .par .f .txt and others
What would be the best way to go about this?
To convert a file from upper case to lower case you can use ex (a good friend of ed, the standard editor):
ex -s file <<EOF
%s/[[:upper:]]\+/\L&/g
wq
EOF
or, if you like stuff on one line:
ex -s file <<< $'%s/[[:upper:]]\+/\L&/g\nwq'
Combining with find, you can then do:
find . -type f -exec bash -c "ex -s -- \"\$0\" <<< $'%s/[[:upper:]]\+/\L&/g\nwq'" {} \;
This method is 100% safe regarding spaces and funny symbols in the file names. No auxiliary files are created, copied or moved; files are only edited.
Edit.
Using glenn jackmann's suggestion, you can also write:
find . -type f -exec bash -c 'printf "%s\n" "%s/[[:upper:]]\+/\L&/g" "wq" | ex -s -- "$0"' {} \;
(the pro is that it avoids awkward escapes; the con is that it's longer).
You can translate all uppercase characters (A–Z) to lowercase (a–z) using the tr command and specifying a range of characters, as in:
$ tr 'A-Z' 'a-z' <be.fore >af.ter
There is also special syntax in tr for specifying this sort of range for upper- and lowercase conversions:
$ tr '[:upper:]' '[:lower:]' <be.fore >af.ter
The tr utility copies its input to its output, substituting or deleting selected characters. tr is short for translate or transliterate. It takes two sets of characters as parameters and replaces occurrences of the characters in the first set with the corresponding characters from the second set.
tr "set1" "set2" < input.txt > output.txt
Although tr doesn't support regular expressions, it does support ranges of characters. Just make sure that both arguments end up with the same number of characters. With GNU tr, if the second argument is shorter, its last character is repeated to match the length of the first argument; if the first argument is shorter, the excess characters of the second argument are ignored.
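Since tr only reads standard input and writes standard output, overwriting the original file needs a temporary copy. The following is a sketch of one way to do that recursively; the function name lowercase_in_place, the demo tree and the *.f filter are assumptions made for the example.

```bash
#!/usr/bin/env bash
# lowercase_in_place is a hypothetical helper that rewrites one file.
lowercase_in_place() {
    local file=$1 tmp
    tmp=$(mktemp) || return 1
    if tr '[:upper:]' '[:lower:]' < "$file" > "$tmp"; then
        # Copy the result back into the original file so that its
        # permissions and inode are kept.
        cat "$tmp" > "$file"
    fi
    rm -f -- "$tmp"
}

# Hypothetical demo tree with one Fortran source file.
project=$(mktemp -d)
mkdir -p "$project/src dir"
printf 'PROGRAM Main\n      END\n' > "$project/src dir/main.f"

find "$project" -type f -name '*.f' -print0 |
    while IFS= read -r -d '' f; do
        lowercase_in_place "$f"
    done

result=$(cat "$project/src dir/main.f")
printf '%s\n' "$result"
rm -rf "$project"
```

Dropping the -name test, or adding further ones joined with -o, would widen the set of files that get converted.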
sed -e 's/\(.*\)/\L\1/g' *
or you could pipe the files in from find
Expanding on @nullrevolution's solution:
find /path_to_files -type f -exec sed --in-place -e 's/\(.*\)/\L\1/g' '{}' \;
This one liner will look for all files in all sub-directories starting with /path_to_files as a base directory.
WARNING: This will change the case in ALL files in EVERY directory under /path_to_files, so make sure you want to do that before you execute this command. You can limit the scope of the find based on file extensions by using the following:
find /path_to_files -type f -name \*.txt -exec sed --in-place -e 's/\(.*\)/\L\1/g' '{}' \;
You may also want to make a backup of the original file before modifying it:
find /path_to_files -type f -name \*.txt -exec sed --in-place=-orig -e 's/\(.*\)/\L\1/g' '{}' \;
This will modify the file under its original name, while keeping an unmodified copy with "-orig" appended to the file name (i.e. file.txt would be backed up as file.txt-orig).
An explanation of each piece:
find /path_to_files This sets the base directory to the path provided.
-type f This restricts the search to regular files.
-exec COMMAND '{}' \; This executes the provided command once for each matched file. The '{}' is replaced by the current file name. The \; indicates the end of the command.
sed --in-place -e 's/\(.*\)/\L\1/g' The --in-place option makes the changes to the file itself without keeping a backup. The regular expression uses the backreference \1 to refer to the entire line and \L (a GNU sed extension) to convert it to lower case.
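The pieces explained above can be tried safely on a throwaway directory. This sketch assumes GNU sed (for \L and --in-place) and uses a made-up sample file; it writes the same substitution with & in place of the \(.*\) and \1 pair.

```bash
#!/usr/bin/env bash
# Hypothetical scratch directory with a single sample file.
work=$(mktemp -d)
printf 'Hello WORLD\n' > "$work/sample.txt"

# Lowercase every line in place, keeping a backup with the -orig suffix.
find "$work" -type f -name '*.txt' \
    -exec sed --in-place=-orig -e 's/.*/\L&/' '{}' \;

lowered=$(cat "$work/sample.txt")
backup=$(cat "$work/sample.txt-orig")
echo "edited: $lowered"
echo "backup: $backup"
rm -rf "$work"
```

The edited file ends up in lower case, while sample.txt-orig still holds the original text.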
Optional
(For a more archaic solution.)
find /path_to_files -type f -exec dd if='{}' of='{}'-lc conv=lcase \;
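Identifying text files can be a bit tricky in Unix-like environments. You can do something like this:
set -e -o noclobber
while IFS= read -r f; do
    # write the lowercased copy to a temporary file next to the original
    tr 'A-Z' 'a-z' <"$f" >"$f.$$"
    mv "$f.$$" "$f"
# file prints "name: description"; cut keeps only the name
done < <(find "$start_directory" -type f -exec file {} + | cut -d: -f1)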
This will fail on filenames with embedded colons or newlines, but should work on others, including those with spaces.

Parse output of a for loop into a variable as a comma delimited string

I'm grabbing all the filenames from a directory and I want to create a comma delimited string of these filenames so that I can pass that string as an argument to an application. This is my code snippet:
if [[ -n $(ls | grep lpt) ]]; then
for files in $(find . -maxdepth 1 -type f); do
#parse output into variable fileList
done
fi
How do I accomplish this?
Think easier:
find . -maxdepth 1 -type f -printf '%P,' | sed -e 's/,$/\n/'
The sed expression replaces the terminal , by a linebreak.
You should use the one-liners shown in the other answers, but in order to fill in your script, you could do:
if [[ -n $(ls | grep lpt) ]]; then
for file in $(find . -maxdepth 1 -type f); do
#parse output into variable fileList
fileList="$file,$fileList"
done
fi
#now remove the trailing comma from the fileList
fileList=$(sed 's/,$//' <<< "$fileList")
(Note: your for-loop won't work correctly if your filenames have spaces in them)
Instead of your loop, you could use find's -exec option, along with command substitution:
fileList=$(find . -maxdepth 1 -type f -exec echo -n "{}," \; | sed 's/,$//')
The sed bit is just to remove the trailing comma. sed is used to edit input streams, i.e. here, it gets piped text from find and is editing what it's getting. Since the command given leaves an extra , at the end, sed uses its substitution command (s) to get rid of it. The form is:
s/EXPRESSION/REPLACEMENT/
So ,$ means "a comma at the end of the line", since $ means "at the end of the line", and the nothingness between the second and third slashes means it gets replaced by nothing.
As for the \; in find, that's just a requirement of -exec, so that find knows where the command ends; it's in the man page. :)
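If the file names may contain spaces, one alternative is to collect them in an array and join the array with a comma. This is a sketch assuming bash and GNU find; join_by_comma and the demo file names are hypothetical.

```bash
#!/usr/bin/env bash
# join_by_comma is a hypothetical helper: "$*" joins the arguments
# with the first character of IFS, which is set to a comma locally.
join_by_comma() {
    local IFS=,
    printf '%s\n' "$*"
}

# Hypothetical demo directory.
dir=$(mktemp -d)
touch "$dir/a b.txt" "$dir/c.txt"

files=()
while IFS= read -r -d '' f; do
    files+=("$f")
done < <(find "$dir" -maxdepth 1 -type f -printf '%P\0' | LC_ALL=C sort -z)

fileList=$(join_by_comma "${files[@]}")
echo "$fileList"
rm -rf "$dir"
```

No trailing comma is produced, so the sed clean-up step is not needed. A file name that itself contains a comma would still be ambiguous to the receiving application.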

Recursive BASH renaming

EDIT: Ok, I'm sorry, I should have specified that I was on Windows, and using win-bash, which is based on bash 1.14.2, along with the gnuwin32 tools. This means all of the solutions posted unfortunately didn't help out. It doesn't contain many of the advanced features. I have however figured it out finally. It's an ugly script, but it works.
#!/bin/bash
function readdir
{
cd "$1"
for infile in *
do
if [ -d "$infile" ]; then
readdir "$infile"
else
renamer "$infile"
fi
done
cd ..
}
function renamer
{
#replace " - " with a single underscore.
NEWFILE1=`echo "$1" | sed 's/\s-\s/_/g'`
#replace spaces with underscores
NEWFILE2=`echo "$NEWFILE1" | sed 's/\s/_/g'`
#replace "-" dashes with underscores.
NEWFILE3=`echo "$NEWFILE2" | sed 's/-/_/g'`
#remove exclamation points
NEWFILE4=`echo "$NEWFILE3" | sed 's/!//g'`
#remove commas
NEWFILE5=`echo "$NEWFILE4" | sed 's/,//g'`
#remove single quotes
NEWFILE6=`echo "$NEWFILE5" | sed "s/'//g"`
#replace & with _and_
NEWFILE7=`echo "$NEWFILE6" | sed "s/&/_and_/g"`
#remove typographic apostrophes
NEWFILE8=`echo "$NEWFILE7" | sed "s/’//g"`
mv "$1" "$NEWFILE8"
}
for infile in *
do
if [ -d "$infile" ]; then
readdir "$infile"
else
renamer "$infile"
fi
done
ls
I'm trying to create a bash script to recurse through a directory and rename files, to remove spaces, dashes and other characters. I've gotten the script working fine for what I need, except for the recursive part of it. I'm still new to this, so it's not as efficient as it should be, but it works. Anyone know how to make this recursive?
#!/bin/bash
for infile in *.*;
do
#replace " - " with a single underscore.
NEWFILE1=`echo $infile | sed 's/\s-\s/_/g'`;
#replace spaces with underscores
NEWFILE2=`echo $NEWFILE1 | sed 's/\s/_/g'`;
#replace "-" dashes with underscores.
NEWFILE3=`echo $NEWFILE2 | sed 's/-/_/g'`;
#remove exclamation points
NEWFILE4=`echo $NEWFILE3 | sed 's/!//g'`;
#remove commas
NEWFILE5=`echo $NEWFILE4 | sed 's/,//g'`;
mv "$infile" "$NEWFILE5";
done;
find is the command able to display all elements in a filesystem hierarchy. You can use it to execute a command on every found file or pipe the results to xargs which will handle the execution part.
Take care: echo $infile without quotes mangles file names containing whitespace, and for infile in *.* skips names without a dot. If you pipe find into xargs, check the -print0 option of find, coupled with the -0 option of xargs.
All those semicolons are superfluous and there's no reason to use all those variables. If you want to put the sed commands on separate lines and intersperse detailed comments you can still do that.
#!/bin/bash
find . | while IFS= read -r file
do
    newfile=$(echo "$file" | sed '
#replace " - " with a single underscore.
s/\s-\s/_/g
#replace spaces with underscores
s/\s/_/g
#replace "-" dashes with underscores.
s/-/_/g
#remove exclamation points
s/!//g
#remove commas
s/,//g')
    mv "$file" "$newfile"
done
This is much shorter:
#!/bin/bash
find . | while IFS= read -r file
do
    # replace " - " or space or dash with underscores
    # remove exclamation points and commas
    newfile=$(echo "$file" | sed 's/\s-\s/_/g; s/\s/_/g; s/-/_/g; s/!//g; s/,//g')
    mv "$file" "$newfile"
done
Shorter still:
#!/bin/bash
find . | while IFS= read -r file
do
    # replace " - " or space or dash with underscores
    # remove exclamation points and commas
    # (\s is not a class inside brackets, hence [:space:])
    newfile=$(echo "$file" | sed 's/\s-\s/_/g; s/[-[:space:]]/_/g; s/[!,]//g')
    mv "$file" "$newfile"
done
In bash 4, setting the globstar option allows recursive globbing.
shopt -s globstar
for infile in **
...
Otherwise, use find.
while IFS= read -r infile
do
...
done < <(find ...)
or
find ... -exec ...
I've used 'find' in the past to locate files then had it execute another application.
See '-exec'
rename 's/pattern/replacement/' glob_pattern
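For completeness, here is a sketch of a recursive rename built on find that only touches the last path component and works depth-first, so a directory is renamed after everything inside it. It assumes a reasonably recent bash and GNU find, so it would not run on the win-bash 1.14 setup from the edit at the top. The names sanitize_name and rename_tree, and the demo files, are hypothetical.

```bash
#!/usr/bin/env bash
# sanitize_name is a hypothetical helper applying the same rules as the
# sed commands above, using parameter expansion instead.
sanitize_name() {
    local name=$1
    name=${name// - /_}      # " - " becomes a single underscore
    name=${name//[ -]/_}     # remaining spaces and dashes become underscores
    name=${name//[,!]/}      # commas and exclamation points are removed
    printf '%s\n' "$name"
}

rename_tree() {
    local root=$1 path dir base new
    # -depth lists a directory's contents before the directory itself,
    # so renaming a directory never invalidates paths still to come.
    find "$root" -depth -mindepth 1 -print0 |
        while IFS= read -r -d '' path; do
            dir=${path%/*}
            base=${path##*/}
            new=$(sanitize_name "$base")
            [ "$base" = "$new" ] || mv -n -- "$path" "$dir/$new"
        done
}

# Hypothetical demo tree.
tree=$(mktemp -d)
mkdir -p "$tree/Some Album - 2001"
touch "$tree/Some Album - 2001/Track 1, Intro!.mp3"
rename_tree "$tree"

renamed_ok=no
[ -f "$tree/Some_Album_2001/Track_1_Intro.mp3" ] && renamed_ok=yes
echo "renamed: $renamed_ok"
rm -rf "$tree"
```

mv -n refuses to overwrite an existing target, which matters when two different names sanitize to the same result.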

Resources