Using Zsh and QPDF to decrypt multiple PDFs

From this answer https://stackoverflow.com/a/59688271/7577919 I was able to decrypt multiple PDFs in place using this bash script:
temp=`ls`;
for each in $temp;
do qpdf --decrypt --replace-input $each;
done
However, I had initially attempted to do this in Zsh (since it's now the default shell in macOS 10.15 Catalina), but was unable to: it failed with the error "File name too long".
What is the difference between the for loops in Bash and Zsh and how would I go about writing a proper Zsh script?

There is no difference in the for-loop, but in the way variables are expanded. Consider this program:
x='a b'
for v in $x
do
echo $v
done
In bash, $x would be word-split into 2 arguments, and hence the loop would be executed twice, once for a and once for b. In zsh, $x would not undergo word-splitting and the loop would be executed once, for the value a b. This difference applies everywhere a parameter is expanded.
In your case, the loop is therefore executed only once, with each holding the complete output of the ls command.
Of course in your case, it would be simpler in zsh to write the loop as
for each in *(N)
but if you really need a variable, I would use an array:
temp=(*(N))
The N flag after the wildcard ensures that you get an empty list instead of an error message if no files match.
If you also want to catch the dot-files (similar to what ls -A would do), use (ND) instead.
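Putting that together, a minimal zsh version of the original script might look like this (the *.pdf pattern is my assumption; drop it to process every file, as the original did):
temp=(*.pdf(N))
for each in $temp;
do qpdf --decrypt --replace-input $each;
done
In zsh, expanding the array $temp yields each element as a separate word without further splitting, so filenames containing spaces are handled correctly even unquoted.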

Since parsing the output of ls is discouraged, it's better not to use ls to get the PDF filenames at all. Instead, you can use find:
find . -name '*.pdf' -exec qpdf --decrypt --replace-input "{}" \;
You can limit the search to PDFs in the current directory by adding -maxdepth 1 to the find command.
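For example, a non-recursive sketch (assuming you only want the PDFs directly in the current directory):
find . -maxdepth 1 -name '*.pdf' -exec qpdf --decrypt --replace-input "{}" \;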

Related

How to remove unknown file extensions from files using script

I can remove file extensions if I know the extensions, for example to remove .txt from files:
foreach file (`find . -type f`)
mv $file `basename $file .txt`
end
However if I don't know what kind of file extension to begin with, how would I do this?
I tried:
foreach file (`find . -type f`)
mv $file `basename $file .*`
end
but it wouldn't work.
What shell is this? At least in bash you can do:
find . -type f | while read -r; do
mv -- "$REPLY" "${REPLY%.*}"
done
(The usual caveats apply: This doesn't handle files whose name contains newlines.)
You can use sed to compute the base file name.
foreach file (`find . -type f`)
mv $file `echo $file | sed -e 's/^\(.*\)\.[^.]\+$/\1/'`
end
Be cautious: The command you seek to run could cause loss of data!
If you don't think your file names contain newlines or double quotes, then you could use:
find . -type f -name '?*.*' |
sed 's/\(.*\)\.[^.]*$/mv "&" "\1"/' |
sh
This generates your list of files (making sure that the names contain at least one character plus a .), runs each file name through the sed script to convert it into an mv command by effectively removing the material from the last . onwards, and then runs the stream of commands through a shell.
Clearly, you test this first by omitting the | sh part. Consider running it with | sh -x to get a trace of what the shell's doing. Consider making sure you capture the output of the shell, standard output and standard error, into a log file so you've got a record of the damage that occurred.
Do make sure you've got a backup of the original set of files before you start playing with this. It need only be a tar file stored in a different part of the directory hierarchy, and you can remove it as soon as you're happy with the results.
You can choose any shell; this doesn't rely on any shell constructs except pipes and single quotes and double quotes (pretty much common to all shells), and the sed script is version neutral too.
Note that if you have files xyz.c and xyz.h before you run this, you'll only have a file xyz afterwards (and what it contains depends on the order in which the files are processed, which needn't be alphabetic order).
If you think your file names might contain double quotes (but not single quotes), you can play with changing the quotes in the sed script. If you might have to deal with both, you need a more complex sed script. If you need to deal with newlines in file names, then it is time to (a) tell your user(s) to stop being silly and (b) fix the names so they don't contain newlines. Then you can use the script above. If that isn't feasible, you have to work a lot harder to get the job done accurately; you probably need to make sure you've got a find that supports -print0, a sed that supports -z and an xargs that supports -0 (installing the most recent GNU versions if you don't already have the right support in place).
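As a sketch of that harder, newline-safe route (my own variant: it reads null-delimited find output with bash's read -d '' instead of using sed -z and xargs -0, and assumes a find that supports -print0):
find . -type f -name '?*.*' -print0 |
while IFS= read -r -d '' file
do
    # strip everything from the last '.' onwards, as the sed script above does
    mv -- "$file" "${file%.*}"
done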
It's very simple:
$ set filename=/home/foo/bar.dat
$ echo ${filename:r}
/home/foo/bar
See more in man tcsh, in "History substitution":
r
Remove a filename extension '.xxx', leaving the root name.
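Applied to the original loop, that might look like this (assuming tcsh; the usual caveats about whitespace in names coming out of find still apply):
foreach file (`find . -type f -name '*.*'`)
    # :r strips the trailing '.xxx' extension, leaving the root name;
    # the -name '*.*' test skips files that have no extension at all
    mv $file $file:r
end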

Difference between using ls and find to loop over files in a bash script

I'm not sure I understand exactly why:
for f in `find . -name "strain_flame_00*.dat"`; do
echo $f
mybase=`basename $f .dat`
echo $mybase
done
works and:
for f in `ls strain_flame_00*.dat`; do
echo $f
mybase=`basename $f .dat`
echo $mybase
done
does not, i.e. the filename does not get stripped of the suffix. I think it's because what comes out of ls is formatted differently but I'm not sure. I even tried to put eval in front of ls...
The correct way to iterate over filenames here would be
for f in strain_flame_00*.dat; do
echo "$f"
mybase=$(basename "$f" .dat)
echo "$mybase"
done
Using for with a glob pattern, and then quoting all references to the filename is the safest way to use filenames that may have whitespace.
First of all, never parse the output of the ls command.
If you MUST use ls and you DON'T know what ls alias is out there, then do this:
(
COLUMNS=
LANG=
NLSPATH=
GLOBIGNORE=
LS_COLORS=
TZ=
unalias ls 2>/dev/null
unset -f ls
for f in `ls -1 strain_flame_00*.dat`; do
echo $f
mybase=`basename $f .dat`
echo $mybase
done
)
It is surrounded by parentheses to protect the existing environment, aliases and shell variables.
Various environment variables were NUKED (as ls does look them up).
One unalias command (self-explanatory).
One unset command (again, protection against an unscrupulous, over-lording 'ls' function).
Now you can see why NOT to use 'ls'.
Another difference that hasn't been mentioned yet is that find searches recursively by default, whereas ls does not (even though both can be switched between recursive and non-recursive behaviour through options, and find can be told to recurse only up to a specified depth).
And, as others have mentioned, if it can be achieved by globbing, you should avoid using either.
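For instance, with bash's globstar option (bash 4 or later, an assumption here), even a recursive walk can be done with a glob instead of ls or find:
shopt -s globstar
for f in **/strain_flame_00*.dat; do
    echo "$f"
    mybase=$(basename "$f" .dat)
    echo "$mybase"
done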

How do I wrap the results of a command in quotes to pass it to another command?

This is for the Apple platform. My end goal is to do a find and replace for a line inside of the firefox preference file "prefs.js" to turn off updates. I want to be able to do this for all accounts on the Mac, including the user template (didn't include that in the examples). So far I've been able to get a list of all the paths that have the prefs.js file with this:
find /Users -name prefs.js
I then put the old preference and new preference in variables:
oldPref='user_pref("app.update.enabled", false);'
newPref='user_pref("app.update.enabled", true);'
I then have a "for loop" with the sed command to replace the old preference with the new preference:
for prefs in `find /Users -name prefs.js`
do
sed "s/$oldPref/$newPref/g" "$prefs"
done
The problem I'm running into is that the "find" command returns the full paths with the stupid "Application Support" in the path name like this:
/Users/admin/Library/Application Support/Firefox/Profiles/437cwg3d.default/prefs.js
When the command runs, I get these errors:
sed: /Users/admin/Library/Application: No such file or directory
sed: Support/Firefox/Profiles/437cwg3d.default/prefs.js: No such file or directory
I'm assuming that I somehow need to get the "find" command to wrap the paths it outputs in quotes for the "sed" command to parse them correctly? Am I on the right path? I've tried to pipe the find command into sed to wrap quotes, but I can't get anything to work correctly. Please let me know if I should go about this differently. Thank you.
You don't want to run for prefs in ... on a list of files output from find. For a more complete explanation of why this is bad, see Greg's wiki page about parsing ls. You would only use a for loop in bash if you could match the files using a glob, which is difficult if you want to do it recursively.
It would be better, if you can swing it, to use find ... -exec ... instead. Perhaps something like:
find /Users -name prefs.js -exec sed -i.bak -e "s/$oldPref/$newPref/" {} \;
The sed command line is executed once for each file found by find. The {} gets replaced with the filename. Sed's -i option lets you run it in-place, rather than requiring stdin/stdout. Check the man page for usage details.
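Putting it together with the variables from the question, a minimal sketch of the whole script could be (the .bak suffix keeps a backup of each file sed touches):
#!/bin/bash
oldPref='user_pref("app.update.enabled", false);'
newPref='user_pref("app.update.enabled", true);'
# run sed in place on every prefs.js under /Users; {} is replaced by each path
find /Users -name prefs.js -exec sed -i.bak -e "s/$oldPref/$newPref/" {} \;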
(Grain of salt: I'm basing this on my experience with linux)
I think it has less to do with sed and more to do with the way the for loop array is formed. When the results of find are converted to an array, the space between Application and Support is treated as a delimiter.
There are several ways to work around this, but the easiest is probably to change the IFS variable. The IFS variable is an internal variable that your command line interpreter uses to separate fields (more info). You can change the IFS variable of the environment before running the find command.
Modified example from here:
#!/bin/bash
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
for f in `find /Users -name prefs.js`
do
echo "$f"
done
# restore $IFS
IFS=$SAVEIFS

Properly handle lists of files with whitespace in filename

I want to iterate over a list of files in Bash and perform some action. The problem: the file names may contain whitespace, which creates an obvious problem with wildcards or ls:
touch a\ b
FILES=* # or $(ls)
for FILE in $FILES; do echo $FILE; done
yields
a
b
Now, the conventional way to handle this is to use find … -print0 instead. However, this only works (well) in conjunction with xargs -0, not with Bash variables / loops.
My idea was to set $IFS to the null character to make this work. However, comp.unix.shell seems to think that this is impossible in bash.
Bummer. Well, it’s theoretically possible to use another character, such as : (after all, $PATH uses this format, too):
IFS=$':'
FILES=$(find . -print0 | xargs -0 printf "%s:")
for FILE in $FILES; do echo $FILE; done
(The output is slightly different but fair enough.)
However, I can’t help but feel that this is clumsy and that there should be a more direct way of accomplishing this, preferably using wildcards or ls.
The best way to handle this is to store the file list as an array, rather than a string (and be sure to double-quote all variable substitutions):
files=(*)
for file in "${files[@]}"; do
echo "$file"
done
If you want to generate an array from find's output (e.g. if you need to search recursively), see this previous answer.
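For reference, a sketch of building such an array from find with null-delimited names (assuming bash with process substitution; this survives spaces and even newlines in filenames):
files=()
while IFS= read -r -d '' f; do
    files+=("$f")
done < <(find . -type f -print0)

for file in "${files[@]}"; do
    echo "$file"
done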
Exactly what you have in the first example works fine for me in Msys Bash, Cygwin and on my Fedora box:
FILES=*
for FILE in $FILES
do
echo $FILE
done
It's very important to precede the loop with
IFS=""
otherwise files whose names contain two consecutive spaces will not be found.

calling grep from a bash script

I'm new to bash scripts (and the *nix shell altogether) but I'm trying to write this script to make grepping a codebase easier.
I have written this
#!/bin/bash
args=("$#");
for arg in args
grep arg * */* */*/* */*/*/* */*/*/*/*;
done
when I try to run it, this is what happens:
~/Work/richmond $ ./f.sh "\$_REQUEST\['a'\]"
./f.sh: line 4: syntax error near unexpected token `grep'
./f.sh: line 4: ` grep arg * */* */*/* */*/*/* */*/*/*/*;'
~/Work/richmond $
How do I do this properly?
And, I think a more important question is, how can I make grep recurse through subdirectories properly like this?
Any other tips and/or pitfalls with shell scripting and using bash in general would also be appreciated.
The syntax error is because you're missing do. As for searching recursively if your grep has the -R option you would do:
#!/bin/bash
for arg in "$#"; do
grep -R "$arg" *
done
Otherwise you could use find:
#!/bin/bash
for arg in "$#"; do
find . -exec grep "$arg" {} +
done
In the latter example, find will execute grep and replace the {} braces with the file names it finds, starting in the current directory (.).
(Notice that I also changed arg to "$arg". You need the dollar sign to get the variable's value, and the quotes tell the shell to treat its value as one big word, even if $arg contains spaces or newlines.)
On recursive grepping:
Depending on your grep version, you can pass -R to your grep command to have it search Recursively (in subdirectories).
The best solution is stated above, but try putting your statement in back ticks:
`grep ...`
You should use 'find' plus 'xargs' to do the file searching.
for arg in "$#"
do
find . -type f -print0 | xargs -0 grep "$arg" /dev/null
done
The '-print0' and '-0' options assume you're using GNU (or compatible) find and xargs, and they ensure that the script works even if there are spaces or other unexpected characters in your path names. Using xargs like this is more efficient than having find execute grep for each file; the /dev/null appears in the argument list so grep always reports the name of the file containing the match.
You might decide to simplify life - perhaps - by combining all the searches into one using either egrep or grep -E. An optimization would be to capture the output from find once and then feed that to xargs on each iteration.
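A sketch of that optimization (the temp-file name is hypothetical; find runs once and its null-delimited output is reused for every search term):
#!/bin/bash
filelist=$(mktemp)
find . -type f -print0 > "$filelist"
for arg in "$@"
do
    # /dev/null keeps grep printing filenames even when only one file matches
    xargs -0 grep "$arg" /dev/null < "$filelist"
done
rm -f "$filelist"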
Have a look at the findrepo script, which may give you some pointers.
If you just want a better grep and don't want to do anything yourself, use ack, which you can get at http://betterthangrep.com/.
