How to execute command on files with spaces in them? [duplicate] - bash

I am dealing with a legacy codebase where we're trying to convert all jpeg/png files to webp format using the cwebp command. Unfortunately, a lot of the image files were saved with spaces in their names.
Example: i am poorly named.jpg
So when I run the following bash script to find all jpegs in the directory, loop through them, and convert them, each space-separated word is treated as a separate file, so the image never gets converted.
We don't want to remove the whitespace, just create a webp file with the exact same name.
files=$(find ./ -type f -name "*.jpg")
for jpg in $files
do
    webp="${jpg/%jpg/webp}"
    if [ ! -f $webp ]; then
        echo "The webp version does not exist"
        cwebp -q 80 "$jpg" -o "$webp"
    fi
done
I've tried placing jpg=$(printf '%q' "$jpg") immediately after the do in the above code as well as other things.
I expect i am poorly named.webp to be created if file i am poorly named.jpg exists.

But there is no real reason to store all the filenames. An alternative is to pipe find into a while read loop:
find ./ -type f -name "*.jpg" | while IFS= read -r jpg
do
    # same body as above, with $webp quoted as well
    webp="${jpg/%jpg/webp}"
    if [ ! -f "$webp" ]; then
        cwebp -q 80 "$jpg" -o "$webp"
    fi
done
But this works only if a filename contains no linefeed. For files with linefeeds there are other solutions, such as the sketch below.
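A newline-safe variant (a sketch, assuming GNU or BSD find with -print0 and Bash's read -d ''): NUL-delimited names survive any character a filename can legally contain.
find ./ -type f -name "*.jpg" -print0 | while IFS= read -r -d '' jpg
do
    webp="${jpg/%jpg/webp}"
    if [ ! -f "$webp" ]; then
        cwebp -q 80 "$jpg" -o "$webp"
    fi
done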

Related

how list just one file from a (bash) shell directory listing

A bit lowly a query but here goes:
bash shell script. POSIX, Mint 21
I just want one/any (mp3) file from a directory. As a sample.
In normal execution, a full run, the code would be like this:
for f in *.mp3; do
    #statements
done
This works fine but if I wanted to sample just one file of such an array/glob (?) without looping, how might I do that? I don't care which file, just that it is an mp3 from the directory I am working in.
Should I just start this for-loop and then exit (break) after one statement, or is there a neater, more tailored-for-the-job way?
for f in *.mp3; do
    #statement
    break
done
Ta (can't believe how dopey I feel asking this one; my forehead will hurt when I see the answers)
Since you are using Linux (Mint), you've got GNU find, so one way to get one .mp3 file from the current directory is:
mp3file=$(find . -maxdepth 1 -mindepth 1 -name '*.mp3' -printf '%f' -quit)
-maxdepth 1 -mindepth 1 causes the search to be restricted to one level under the current directory.
-printf '%f' prints just the filename (e.g. foo.mp3). The -print option would print the path to the filename (e.g. ./foo.mp3). That may not matter to you.
-quit causes find to exit as soon as one match is found and printed.
Another option is to use the Bash : (colon) command and $_ (dollar underscore) special variable:
: *.mp3
mp3file=$_
: *.mp3 runs the : command with the list of .mp3 files in the current directory as arguments. The : command ignores its arguments and does nothing.
mp3file=$_ sets the value of the mp3file variable to the last argument supplied to the previous command (:).
The second option should not be used if the number of .mp3 files is large (hundreds or more) because it will find all of the files and sort them by name internally.
In both cases $mp3file should be checked to ensure that it really exists (e.g. [[ -e $mp3file ]]) before using it for anything else, in case there are no .mp3 files in the directory.
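For example, a minimal usage sketch of the second option, with the suggested existence check (the messages are just illustrative):
: *.mp3
mp3file=$_
if [[ -e $mp3file ]]; then
    echo "Sample file: $mp3file"
else
    echo "No .mp3 files in this directory" >&2
fi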
I would do it like this in POSIX shell:
mp3file=
for f in *.mp3; do
    if [ -f "$f" ]; then
        mp3file=$f
        break
    fi
done
# At this point, the variable mp3file contains a filename which
# represents a regular file (or a symbolic link to one) with the .mp3
# extension, or the empty string if there is no such file.
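A quick usage sketch following that loop (the messages are illustrative):
if [ -n "$mp3file" ]; then
    printf 'Sampling %s\n' "$mp3file"
else
    echo 'No .mp3 files here' >&2
fi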
The fact that you use
for f in *.mp3; do
suggests to me that the MP3s are named without too many strange characters in the filename.
In that case, if you really don't care which MP3, you could:
f=$(ls *.mp3 | head -n 1)
statement
Or, if you want a different one every time:
f=$(ls *.mp3 | sort -R | tail -n 1)
Note: if your filenames get more complicated (including spaces or other special characters), this will not work anymore.
Assuming you don't have spaces in your filenames (and personally I don't understand why the collective taboo is against using ls in scripts at all, rather than against having spaces in filenames), then:
ls *.mp3 | tr ' ' '\n' | sed -n '1p'

Get basename of files in folder [duplicate]

I want to compress image files and wrote the following bash script:
for i in *.png; do convert $i -quality 100% $i-comp.jpeg; done;
When I run it I get filenames like 48.png-comp.jpeg. I only want to have like 48-comp.jpeg, so basically removing the .png part of the filename.
I tried using this ${$i/.png/}, which gives me an error message.
Any suggestions how to do this?
${$i/.png/} is almost correct, but the second $ is not needed: inside a parameter expansion the variable name is written without its own $.
#!/bin/bash
for i in *.png; do
    convert "$i" -quality 88% "${i/.png/}-comp.jpeg"
done
Note: ${i%.png} is more commonly used to remove a file extension than ${i/.png/}, but both should produce the same output here.
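A quick illustration of the difference between the two expansions (the values are just examples):
i="48.png"
echo "${i/.png/}"    # -> 48   (replaces the first occurrence of .png)
echo "${i%.png}"     # -> 48   (strips .png only from the end)
i="a.png.b.png"      # a name where the two differ
echo "${i/.png/}"    # -> a.b.png
echo "${i%.png}"     # -> a.png.b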
You could use parameter expansion to strip the png extension for the output file name:
for i in *.png; do convert "$i" -quality 100% "${i%.png}-comp.jpeg"; done

convert png files to pdf with Bash? [duplicate]

for f in find *.png; do convert "$f" "$f".pdf; done
This is what I have to find the png files in the directory and convert them to pdf, but I get errors. What is a better way to do this in Bash?
convert: unable to open image `find': No such file or directory # error/blob.c/OpenBlob/2705.
convert: no decode delegate for this image format `' # error/constitute.c/ReadImage/504.
convert: no images defined `find.pdf' # error/convert.c/ConvertImageCommand/3257.
If you're working in just one directory and not requiring find, you can do the following:
for i in *.png; do convert "$i" "${i%.png}.pdf"; done
which uses the shell globbing to find your files. Note the variable substitution to convert from a png to a pdf extension.
Otherwise it's more complicated. I think your find args are not correct. I would try:
find . -name \*.png
Note that I specify the starting directory (.) and then the name pattern (via -name). You need to escape the glob (asterisk) such that the shell doesn't expand it, and instead passes it directly to find.
Now you can execute find in a subshell and use the results.
e.g.
for f in $(find . -name \*.png); do convert "$f" "$f".pdf; done
Note the $(...) which executes the subshell and makes the output available.
If your filenames contain whitespace, the shell may split on it and cause you further problems. If this is the case, there are a number of options, one of which is sketched below.
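One whitespace-safe option (a sketch, not taken from the linked answers): have find invoke the shell itself, so each filename is passed as an argument and never word-split. ${f%.png} also strips the extension, so foo.png becomes foo.pdf rather than foo.png.pdf:
find . -name '*.png' -exec sh -c '
    for f do
        convert "$f" "${f%.png}.pdf"
    done
' sh {} +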

Find files with special character in file-name in unix

Q: I am working with a script on a Unix platform and I have to find all the files in a directory that arrived within the last 8 hours.
I am using the command below to retrieve the files matching that condition:
find . -name "*.dat" -mmin -480
But there are a few files which have a special character (a double question mark, ??) in the file name itself, and with the above command a file with ?? in its name gets split into two parts on two lines.
For example:
file name: aabb??cc.dat
after the above command runs, it results in this:
$./aabb
$cc.dat
($ here is the Unix command prompt)
Can someone suggest a correction to the above command, or the right approach to handle this case?
This command will show you that find treats these files just like the others:
find . -name "*.dat" -mmin -480 -exec \
    ksh -c 'c=1
        for file do
            printf "file #%d is \"%s\"\n" "$c" "$file"
            c=$((c+1))
        done' sh {} +
If find shows some file names split across two lines, that's just because their names have an embedded newline. That's odd, but they are still valid file names.
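To see such a name unambiguously, you can have find hand the names to bash and print them with printf %q, which renders an embedded newline visibly (a sketch; %q is a feature of bash's builtin printf):
find . -name "*.dat" -mmin -480 -exec bash -c '
    for f do
        printf "%q\n" "$f"
    done
' bash {} +
A name with an embedded newline then prints on a single line, as something like $'./aabb\ncc.dat'.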

How to count all the human readable files in Bash?

I'm taking an intro course to UNIX and have a homework question that follows:
How many files in the previous question are text files? A text file is any file containing human-readable content. (TRICK QUESTION. Run the file command on a file to see whether the file is a text file or a binary data file! If you simply count the number of files with the .txt extension you will get no points for this question.)
The previous question simply asked how many regular files there were, which was easy to figure out by doing find . -type f | wc -l.
I'm just having trouble determining what "human readable content" is, since I'm assuming it means anything besides binary/assembly, but I thought that's what -type f displays. Maybe that's what the professor meant by saying "trick question"?
This question has a follow up later that also asks "What text files contain the string "csc" in any mix of upper and lower case?". Obviously "text" is referring to more than just .txt files, but I need to figure out the first question to determine this!
Quotes added for clarity:
Run the "file" command on a file to see whether the file is a text file or a binary data file!
The file command will inspect files and tell you what kind of file they appear to be. The word "text" will (almost) always be in the description for text files.
For example:
desktop.ini: Little-endian UTF-16 Unicode text, with CRLF, CR line terminators
tw2-wasteland.jpg: JPEG image data, JFIF standard 1.02
So the first part is asking you to run the file command and parse its output.
I'm just having trouble determining what "human readable content" is, since I'm assuming it means anything besides binary/assembly, but I thought that's what -type f displays.
find -type f finds files. It filters out other filesystem objects like directories, symlinks, and sockets. It will match any type of file, though: binary files, text files, anything.
Maybe that's what the professor meant by saying "trick question"?
It sounds like he's just saying don't do find -name '*.txt' or some such command to find text files. Don't assume a particular file extension. File extensions have much less meaning in UNIX than they do in Windows. Lots of files don't even have file extensions!
I'm thinking the professor wants us to be able to run the file command on all files and count how many of them have 'text' in the output.
How about a multi-part answer? I'll give the straightforward solution first, which is probably what your professor is looking for, and then explain its shortcomings and how you can improve upon it.
One way is to use xargs, if you've learned about that. xargs runs another command, using the data from stdin as that command's arguments.
$ find . -type f | xargs file
./netbeans-6.7.1.desktop: ASCII text
./VMWare.desktop: a /usr/bin/env xdg-open script text executable
./VMWare: cannot open `./VMWare' (No such file or directory)
(copy).desktop: cannot open `(copy).desktop' (No such file or directory)
./Eclipse.desktop: a /usr/bin/env xdg-open script text executable
That works. Sort of. It'd be good enough for a homework assignment. But not good enough for a real world script.
Notice how it broke on the file VMWare (copy).desktop because it has a space in it. This is due to xargs's default behavior of splitting the arguments on whitespace. We can fix that by using xargs -0 to split command arguments on NUL characters instead of whitespace. File names can't contain NUL characters, so this will be able to handle anything.
$ find . -type f -print0 | xargs -0 file
./netbeans-6.7.1.desktop: ASCII text
./VMWare.desktop: a /usr/bin/env xdg-open script text executable
./VMWare (copy).desktop: a /usr/bin/env xdg-open script text executable
./Eclipse.desktop: a /usr/bin/env xdg-open script text executable
This is good enough for a production script, and is something you'll encounter a lot. But I personally prefer an alternative syntax which doesn't require a pipe.
$ find . -type f -exec file {} \;
./netbeans-6.7.1.desktop: ASCII text
./VMWare.desktop: a /usr/bin/env xdg-open script text executable
./VMWare (copy).desktop: a /usr/bin/env xdg-open script text executable
./Eclipse.desktop: a /usr/bin/env xdg-open script text executable
To understand that: -exec calls file repeatedly, replacing {} with each file name it finds. The semicolon \; marks the end of the file command. (Using + instead of \; batches many file names into a single file invocation, much like xargs.)
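Since the assignment ultimately asks for a count, one way to finish the job (a sketch: -b is file's brief mode, which omits the file name so that a name containing the word "text" can't inflate the count):
find . -type f -exec file -b {} + | grep -c text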
There's a nice and easy way to determine whether a file is a human-readable text file: just use file --mime-type <filename> and look for text/plain. It works no matter whether the file has a .txt extension, a different extension, or none at all.
So you would do something like:
#!/bin/bash
fileTotal=0
while IFS= read -r file; do
    # file --mime-type prints "name: type"; the sed strips everything
    # up to the last space, leaving just the type
    mime=$(file --mime-type "$file" | sed 's/^.* //')
    if [ "$mime" = "text/plain" ]; then
        fileTotal=$((fileTotal + 1))
        echo "$fileTotal - $file"
    fi
done < <(find "$YOUR_DIR" -type f)
echo "$fileTotal human readable files found!"
and the output would be something like:
1 - /sampledir/samplefile
2 - /sampledir/anothersamplefile
....
23 human readable files found!
If you want to take it further to more MIME types that are human-readable (e.g., do HTML and/or XML count?), have a look at http://www.feedforall.com/mime-types.htm
