sed fails when "shopt -s nullglob" is set - bash

Some days ago I started a little bash script that should sum up the number of pages and the file size of all PDFs in a folder. It's working quite well now but there's still one thing I don't understand.
Why is sed always failing if shopt -s nullglob is set? Does somebody know why this happens?
I'm working with GNU Bash 4.3 and sed 4.2.2 in Ubuntu 14.04.
set -u
set -e
folder=$1
overallfilesize=0
overallpages=0
numberoffiles=0
#If glob fails nothing should be returned
shopt -s nullglob
for file in $folder/*.pdf
do
# Disable empty string if glob fails
# (Necessary because otherwise sed fails ?:|)
#shopt -u nullglob
# This command is allowed to fail
set +e
pdfinfo="$(pdfinfo "$file" 2> /dev/null)"
ret=$?
set -e
if [[ $ret -eq 0 ]]
then
#Remove every non digit in the result
sedstring='s/[^0-9]//g'
filesize=$(echo -e "$pdfinfo" | grep -m 1 "File size:" | sed $sedstring)
pages=$(echo -e "$pdfinfo" | grep -m 1 "Pages:" | sed $sedstring)
overallfilesize=$(($overallfilesize + $filesize))
overallpages=$(($overallpages+$pages))
numberoffiles=$(($numberoffiles+1))
fi
done
echo -e "Processed files: $numberoffiles"
echo -e "Pagesum: $overallpages"
echo -e "Filesizesum [Bytes]: $overallfilesize"

Here's a simpler test case for reproducing your problem:
#!/bin/bash
shopt -s nullglob
pattern='s/[^0-9]//g'
sed $pattern <<< foo42
Expected output:
42
Actual output:
Usage: sed [OPTION]... {script-only-if-no-other-script} [input-file]...
(sed usage follows)
This happens because s/[^0-9]//g is a valid glob (matching a directory structure like s/c/g), and leaving it unquoted asks bash to interpret it. Since you don't have a matching file, nullglob kicks in and removes the pattern entirely, so sed is called with no script at all.
Double quoting prevents word splitting and glob interpretation, which is almost always what you want:
#!/bin/bash
shopt -s nullglob
pattern='s/[^0-9]//g'
sed "$pattern" <<< foo42
This produces the expected output.
You should always double quote all your variable references, unless you have a specific reason not to.
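To see the effect without involving sed, here is a minimal sketch that just counts the arguments a command receives. The count_args helper and the temporary directory are assumptions made up for this illustration; they are not part of the original script.

```shell
#!/bin/bash
# count_args is a hypothetical helper that prints how many arguments it received.
count_args() { echo "$#"; }

workdir=$(mktemp -d)          # empty directory, so the glob cannot match anything
cd "$workdir" || exit 1

shopt -s nullglob
pattern='s/[^0-9]//g'
unquoted=$(count_args $pattern)    # the glob finds no match, so nullglob drops the word
quoted=$(count_args "$pattern")    # quoting keeps the string as one argument
shopt -u nullglob

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
cd / && rm -rf "$workdir"
```

Run in an empty directory, the unquoted call receives 0 arguments and the quoted call receives 1, which is exactly why sed printed its usage message.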

Related

How to write a Bash script to edit many text files using the same commands? [duplicate]

I'm very new to bash. I have ten text files that I want to edit with the same line of code.
#!/bin/bash
sed -i -e 's/.\{6\}/&\n/g' -e 's/edit/edit2/g' | tr -d "\n" | sed 's/edit2/edit/g'| grep -o "here.*there" | sed -r '/^.{,100}$/d'
< files 1-10
I know I could use sed -f sed.sh <file1 >file1 but that only works with sed commands and it only works one file at a time?
Do I have to run a loop?
There are some great existing answers on the Unix Stack Exchange that deal with this problem. Specifically, this post uses a loop to go recursively through all the files in a particular directory, as follows:
( shopt -s globstar dotglob;
for file in **; do
if [[ -f $file ]] && [[ -w $file ]]; then
sed -i -- 's/foo/bar/g' "$file"
fi
done
)
Note the line shopt -s globstar dotglob;. The globstar option makes ** match files in all subdirectories recursively, and dotglob makes globs match hidden files as well. We also enclose the code in parentheses so that it runs in a subshell, which keeps the shopt settings from leaking into the rest of the script.
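Here is a small sketch of what those two options do, using a throwaway directory. The directory layout and file names are made up for the illustration.

```shell
#!/bin/bash
workdir=$(mktemp -d)
mkdir -p "$workdir/sub"
touch "$workdir/a.txt" "$workdir/sub/b.txt" "$workdir/sub/.hidden"

# The command substitution is a subshell, so the shopt settings
# and the cd do not affect the calling shell.
count=$(
  shopt -s globstar dotglob
  cd "$workdir" || exit 1
  n=0
  for file in **; do
    # ** also returns directories, so keep regular files only
    [[ -f $file ]] && n=$((n + 1))
  done
  echo "$n"
)
echo "regular files found: $count"
rm -rf "$workdir"
```

The loop sees a.txt, sub/b.txt and sub/.hidden; without dotglob the hidden file would be skipped, and without globstar the ** would behave like a plain *.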
If you would like to apply this example to your files, you can place them in the current directory. Because your commands form a pipeline, sed -i cannot be used: only the first sed reads the file, and it is the output of the last command that has to be saved. The result is therefore written to a temporary file that then replaces the original:
( shopt -s globstar dotglob;
for file in **; do
if [[ -f $file ]] && [[ -w $file ]]; then
sed -e 's/.\{6\}/&\n/g' -e 's/edit/edit2/g' "$file" | tr -d "\n" | sed 's/edit2/edit/g' | grep -o "here.*there" | sed -r '/^.{,100}$/d' > "$file.tmp" && mv -- "$file.tmp" "$file"
fi
done
)
Note that "$file" is passed only to the first sed; every later command in the pipeline reads the output of the command before it.
There is another example in the linked post that lets you pick which files to run on, rather than all the files in a directory, which you can also re-purpose for your code:
( shopt -s globstar dotglob
sed -i -- 's/foo/bar/g' **/*baz*
sed -i -- 's/foo/bar/g' **/*.baz
)
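To answer your question about looping over each line: you would put a while loop inside the for loop that reads the file line by line and feeds each line to the pipeline, like so:
while IFS= read -r line; do
# Each line goes to the pipeline on standard input; -i is dropped because there is no file to edit in place
printf '%s\n' "$line" | sed -e 's/.\{6\}/&\n/g' -e 's/edit/edit2/g' | tr -d "\n" | sed 's/edit2/edit/g' | grep -o "here.*there" | sed -r '/^.{,100}$/d'
done < "$file"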
Although the for loop can be useful for dealing with files in recursive directories, I would recommend against also using another loop to grab lines, since it muddies your code, and it’s possible there is a better way to do it without parsing line by line.
The linked question is a fairly complete guide to many of the cases you may come across, and is also worth a read if you want to learn more.
Hope that helps!
You could use a for loop.
You could use the tool parallel.
Example
Create a set of test files using a for-loop
mkdir -p /tmp/so58333536
cd /tmp/so58333536
for i in 1.txt 2.txt 3.txt 4.txt 5.txt;do echo "The answer is 41" > $i;done
cat /tmp/so58333536/*
Now correct your mistake using parallel [1].
mkdir /tmp/so58333536.new
ls /tmp/so58333536/* |parallel "sed 's/41/42/' {} > /tmp/so58333536.new/{/}"
cat /tmp/so58333536.new/*
{} refers to the current file
{/} refers to the name of the current file (the path is removed)
Reads: List all files in so58333536 and apply the following sed command to each file and write the output to so58333536.new.
[1] Another option is to use sed -i for in-place editing.
Be very careful with this!! Mistakes can cause serious damage!
# !! Do not use -i option regularly !!
ls /tmp/so58333536/* |parallel "sed -i 's/41/42/'"
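For comparison, the plain for loop mentioned at the start of this answer could look like the sketch below. It uses temporary directories instead of /tmp/so58333536 so that it can be run safely; the src and dst names are assumptions for the illustration.

```shell
#!/bin/bash
src=$(mktemp -d)
dst=$(mktemp -d)

# Create a few test files
for i in 1 2 3; do
  echo "The answer is 41" > "$src/$i.txt"
done

# Apply the same sed command to every file and write the result to dst
for file in "$src"/*.txt; do
  sed 's/41/42/' "$file" > "$dst/${file##*/}"   # ${file##*/} strips the directory part
done

cat "$dst"/*
```

The originals in src stay untouched, which makes it easy to compare before and after.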

Associative array, file names referring to the path, for dmenu

I started playing with dmenu, and it seems able to automate almost everything. Unfortunately I'm not familiar with bash yet, and learning it should be on my list.
I have a folder for my markdowns with subfolders containing my files. I'm trying to have a script to show them in dmenu while using an alias.
If the path to a file is
/home/user/docs/markdown/practice01/rmd/network.rmd
I would like to have
network
as an option in my dmenu. So when I choose
network -----> /home/user/docs/markdown/practice01/rmd/network.rmd
Here is my broken script. There are a few things I'm missing.
This way I get the full path in dmenu, which I don't need. I tried to read about associative arrays but I can't figure them out in bash.
This script works, but if I press ESC to exit, it still opens an empty vim in my directory. Hence, I should know if statements huh!
#!/bin/bash
DMenu=("dmenu -l 10 -i -nb "#eaeaea" -sb "#E53935" -nf "#474747"")
cd ~/docs/markdown/
target=$(find -type f -name '*.rmd' | $DMenu)
st vim "$target"
I made a little example. But the problem is that it is a manual work to add each file, which definitely we don't wanna do right!
#!/bin/bash
declare -A dotfiles
dotfiles[i3]="/home/user/dotfiles/i3/.config/i3/config"
dotfiles[vimrc]="/home/user/dotfiles/vim/.vimrc"
list=("i3\nvimrc")
target=$(echo -e $list | dmenu -i -nb "#eaeaea" -sb "#E53935" -nf "#474747")
st vim "${dotfiles["$target"]}"
Thank you
Associative arrays can be weird... but returning output to a variable makes it easier to manipulate as any other string in bash, as shown in the example below:
prefix="$HOME/git/notes"
suffix=".md"
shopt -s nullglob globstar
item=( "$prefix"/**/*${suffix}) # Search *.md in all dirs/subdirs
item=( "${item[@]#"$prefix"/}" )
item=( "${item[@]%${suffix}}" ) # Removes '.md' string from item name
result=$(printf '%s\n' "${item[@]}" | dmenu)
[[ -n $result ]] || exit # exit if nothing is found
gedit "${prefix}/${result}.md" # Open file by adding again '.md'
When the percent sign (%) is used in the pattern ${variable%substring}, it will return content of the variable with the shortest occurrence of substring deleted from the back of the variable.
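As a quick illustration of these expansions, here is a sketch using the path from the question. The variable names are made up for the example.

```shell
#!/bin/bash
path="/home/user/docs/markdown/practice01/rmd/network.rmd"

file_name=${path##*/}         # longest match of '*/' removed from the front
menu_name=${file_name%.rmd}   # shortest match of '.rmd' removed from the back

echo "$file_name"
echo "$menu_name"
```

This prints network.rmd and then network, the latter being the label the question wants to show in dmenu.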
Listed below for reference are 2 examples I wrote, one in Bash and the other in Python, for managing pass and markdown notes with dmenu:
dmenu-pass.sh
dmenu-launch.py
Also, listed below are a couple nice articles that might help you out:
The weird, wondrous world of Bash arrays
Advanced Bash-Scripting Guide: Manipulating Strings
Instead of putting some code in an array, use a function!
my_dmenu() {
dmenu -l 10 -i -nb "#eaeaea" -sb "#e53935" -nf "#474747"
}
If your markdown files are all in the same folder (and not in subfolders), you certainly don't need find: use a glob instead! And if your files are in subfolders, use a glob as well (with the globstar shell option).
All in all:
#!/bin/bash
my_dmenu() {
dmenu -l 10 -i -nb "#eaeaea" -sb "#e53935" -nf "#474747"
}
base_dir=~/docs/markdown
# Also, check the return code of cd!
cd "$base_dir" || { echo >&2 "Can't cd to $base_dir. Exiting"; exit 1; }
# Using a glob: use the shell option nullglob
shopt -s nullglob
files=( *.rmd )
# Check that there are some files found:
if (( ${#files[@]} == 0 )); then
echo "No files found. Exiting."
exit 1
fi
# Now we're ready to send the files to dmenu:
chosen_file=$(printf '%s\n' "${files[@]}" | my_dmenu)
# If dmenu returns nothing: don't launch vim!
if [[ ! $chosen_file ]]; then
echo "No files selected. Exiting."
exit 1
fi
# Now you can launch vim!
st vim "$chosen_file"
If you also want to find the *.rmd files in subfolders: use instead:
shopt -s nullglob globstar
files=( **/*.rmd )
Edit to address the requirement in your comment (and the edit of your question):
If you want to strip the .rmd suffix to show in dmenu, use:
chosen_file=$(printf '%s\n' "${files[@]%.rmd}" | my_dmenu)
# ...
st vim "$chosen_file.rmd"
The expansion ${files[@]%.rmd} will strip the suffix .rmd from each field of the array files. Don't forget to add this suffix back when you edit the file (as shown in the last line).
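If you want the short name only (network rather than practice01/rmd/network), one option is the associative array the question asks about, filled automatically from the glob. This is only a sketch: it builds a sample tree in a temporary directory, and the choice variable stands in for what dmenu would return. Note that two files with the same base name would overwrite each other in the array.

```shell
#!/bin/bash
# Sample tree standing in for ~/docs/markdown (assumption for the demo)
base_dir=$(mktemp -d)
mkdir -p "$base_dir/practice01/rmd" "$base_dir/practice02/rmd"
touch "$base_dir/practice01/rmd/network.rmd" "$base_dir/practice02/rmd/storage.rmd"

shopt -s nullglob globstar
declare -A paths                 # short name -> full path (needs Bash 4+)
for f in "$base_dir"/**/*.rmd; do
  name=${f##*/}                  # drop the directories
  name=${name%.rmd}              # drop the suffix
  paths[$name]=$f
done

# The keys are what you would pipe to dmenu
menu=$(printf '%s\n' "${!paths[@]}" | sort)
echo "$menu"

choice=network                   # stand-in for: choice=$(echo "$menu" | my_dmenu)
[[ -n $choice ]] && echo "would open: ${paths[$choice]}"
```

In the real script the last line would become st vim "${paths[$choice]}", guarded by the same empty-selection check as above.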
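dmenuoptions=(-l 10 -i -nb '#eaeaea' -sb '#E53935' -nf '#474747')
# An array keeps each option as one argument; quoting $(...) keeps a path with spaces intact
st -e vim "$(find ~/docs/markdown -type f -name '*.rmd' | dmenu "${dmenuoptions[@]}")"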

Is there an easy way to set nullglob for one glob

In bash, if you do this:
mkdir /tmp/empty
array=(/tmp/empty/*)
you find that array now has one element, "/tmp/empty/*", not zero as you'd like. Thankfully, this can be avoided by turning on the nullglob shell option using shopt -s nullglob.
But nullglob is global, and when editing an existing shell script, it may break things (e.g., did someone check the exit code of ls foo* to check if there are files named starting with "foo"?). So, ideally, I'd like to turn it on only for a small scope, ideally a single filename expansion. You can turn it off again using shopt -u nullglob, but of course only if it was disabled before:
old_nullglob=$(shopt -p | grep 'nullglob$')
shopt -s nullglob
array=(/tmp/empty/*)
eval "$old_nullglob"
unset -v old_nullglob
makes me think there must be a better way. The obvious "put it in a subshell" doesn't work as of course the variable assignment dies with the subshell. Other than waiting for the Austin group to import ksh93 syntax, is there?
Unset it when done:
shopt -u nullglob
And properly (i.e. storing the previous state):
shopt -u | grep -q nullglob && changed=true && shopt -s nullglob
... do whatever you want ...
[ $changed ] && shopt -u nullglob; unset changed
With mapfile in Bash 4, you can load an array from a subshell with something like: mapfile -t array < <(shopt -s nullglob; for f in ./*; do echo "$f"; done). The -t option strips the trailing newline from each element. Full example:
$ shopt nullglob
nullglob off
$ find
.
./bar baz
./qux quux
$ mapfile -t array < <(shopt -s nullglob; for f in ./*; do echo "$f"; done)
$ shopt nullglob
nullglob off
$ echo ${#array[@]}
2
$ echo ${array[0]}
./bar baz
$ echo ${array[1]}
./qux quux
$ rm *
$ mapfile -t array < <(shopt -s nullglob; for f in ./*; do echo "$f"; done)
$ echo ${#array[@]}
0
Be sure to glob with ./* instead of a bare * when using echo to print the file name, so that a name starting with - is not taken as an option to echo.
This doesn't work with newline characters in the filename :( as pointed out by derobert.
If you need to handle newlines in the filename, you will have to do the much more verbose:
array=()
while IFS= read -r -d ''; do
array+=("$REPLY")
done < <(shopt -s nullglob; for f in ./*; do printf '%s\0' "$f"; done)
But by this point, it may be simpler to follow the advice of one of the other answers.
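On newer shells the read loop can be folded back into mapfile, since Bash 4.4 added a -d option for the delimiter. A sketch, assuming Bash 4.4 or later and using temporary directories with made-up file names:

```shell
#!/bin/bash
workdir=$(mktemp -d)
touch "$workdir/plain" "$workdir/two"$'\n'"lines"   # second name contains a newline

# -d '' splits on NUL bytes, -t removes the delimiter from each element
mapfile -d '' -t array < <(shopt -s nullglob; for f in "$workdir"/*; do printf '%s\0' "$f"; done)
echo "${#array[@]}"

# With no matches the loop body never runs, so the array stays empty
emptydir=$(mktemp -d)
mapfile -d '' -t none < <(shopt -s nullglob; for f in "$emptydir"/*; do printf '%s\0' "$f"; done)
echo "${#none[@]}"
```

The for loop is kept on purpose: a bare printf '%s\0' "$workdir"/* would still print one NUL when nothing matches, leaving a single empty element in the array.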
This is just a tiny bit better than your original suggestion:
local nullglob=$(shopt -p nullglob) ; shopt -s nullglob
... do whatever you want ...
$nullglob ; unset nullglob
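The same save-and-restore idea can be wrapped in a function so that callers never touch the option themselves. This is a sketch only; glob_matches and the global matches array are names invented for the example.

```shell
#!/bin/bash
# glob_matches PATTERN: hypothetical helper that fills the global array
# "matches" with the files matching PATTERN (empty array if none match).
glob_matches() {
  local restore IFS=              # empty IFS: no word splitting, globbing still happens
  restore=$(shopt -p nullglob)    # e.g. "shopt -u nullglob"
  shopt -s nullglob
  matches=( $1 )                  # unquoted on purpose so the pattern expands
  eval "$restore"                 # put nullglob back the way it was
}

before=$(shopt -p nullglob)
workdir=$(mktemp -d)

glob_matches "$workdir/*"
empty_count=${#matches[@]}

touch "$workdir/bar baz" "$workdir/qux quux"
glob_matches "$workdir/*"
full_count=${#matches[@]}

echo "$empty_count $full_count"
```

Because local is used, this only works inside a function, which is also true of the snippet above.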
This may be close to what you want; as is, it requires executing a command to expand the glob.
$ ls
file1 file2
$ array=( $(shopt -s nullglob; ls foo*) )
$ ls foo*
ls: foo*: No such file or directory
$ echo ${array[*]}
file1 file2
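Instead of setting array in the subshell, we create a subshell using $() whose output is captured by array. Be aware that when nothing matches, nullglob turns ls foo* into a bare ls, which lists every file in the directory; that is why array holds file1 and file2 above rather than being empty.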
This is the simplest solution I've found, though note that the (N) glob qualifier is zsh syntax and does not work in bash:
For example, to expand the literal **/*.mp3 into a glob for only a particular variable, you can use
VAR=**/*.mp3(N)
Source: https://unix.stackexchange.com/a/204944/56160

Exclude specific filename from shell globbing

I want to exclude a specific filename (say, fubar.log) from a shell (bash) globbing string, *.log. Nothing I have tried seems to work, because globbing doesn't use the standard RE set.
Test case: the directory contains
fubar.log
fubaz.log
barbaz.log
text.txt
and only fubaz.log and barbaz.log must be expanded by the glob.
If you are using bash:
#!/bin/bash
shopt -s extglob
ls !(fubar).log
or in a loop (the !(...) pattern still needs extglob to be set, otherwise bash reports a syntax error):
shopt -s extglob
for file in !(fubar).log
do
echo "$file"
done
or, without extglob:
for file in *.log
do
case "$file" in
fubar* ) continue;;
* ) echo "do your stuff with $file";;
esac
done
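If you need the remaining names in a variable rather than just printed, the loop variant can collect them in an array. A sketch using the test case from the question in a temporary directory; the kept array name is an assumption for the example.

```shell
#!/bin/bash
workdir=$(mktemp -d)
touch "$workdir"/{fubar.log,fubaz.log,barbaz.log,text.txt}

kept=()
for file in "$workdir"/*.log; do
  # Skip the one file that should be excluded
  [[ ${file##*/} == fubar.log ]] && continue
  kept+=("${file##*/}")
done

printf '%s\n' "${kept[@]}"
rm -rf "$workdir"
```

This needs no shell options at all, and the test on the exact name means a file such as fubar2.log would still be kept.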
Why don't you use grep? For example:
ls |grep -v fubar|while read line; do echo "reading $line"; done;
And here is the output:
reading barbaz.log
reading fubaz.log
reading text.txt
