Shell script to delete a set of files - shell

I am running a simulation which results in a large number of files. The files are named like this: a-*000.dat, a-*100.dat, ..., a-*500.dat, ..., a-*900.dat, where * is a two-digit number. I want to retain a-*000 and a-*500, and delete everything else.
I tried rm a-*{100,200,300,400,600,700,800,900} and it worked. But I have to run this for every value of *. Can you suggest a shell script so that I can avoid running rm many times?
PS: Hope this question is clear, objective, and specific. Please consider giving specific feedback before deleting/flagging this question.
Madhukar

I think you can achieve this by:
rm a-*[1-46-9]00.dat
(The bracket expression combines the ranges 1-4 and 6-9; a comma inside brackets would match a literal comma rather than separate the ranges.)
Regarding your attempt, rm a-*{100,200,300,400,600,700,800,900}.dat: I don't see why you need to run it for every value of the prefix. It should work fine if you type it literally, with the * in place, and it's equivalent to my suggestion above (only a bit longer).
As a side note, if you really needed to perform the rm multiple times, this can be automated by using a for loop in a script.
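As a minimal sketch of such a loop (assuming the two-digit prefix ranges over 00-99 and bash 4+ for zero-padded brace ranges):

```shell
# Sketch: one rm per two-digit prefix; -f ignores combinations that don't exist.
for i in {00..99}; do
    rm -f a-"$i"{100,200,300,400,600,700,800,900}.dat
done
```

Brace expansion happens before the variable is substituted, so each iteration removes the eight unwanted files for that prefix.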

You want bash's extended globbing which is detailed here: http://www.linuxjournal.com/content/bash-extended-globbing
The key is
shopt -s extglob
After this you can use extended pattern-matching operators (similar in power to regular expressions) for globbing. Use with care, and verify with ls before you wipe out everything.
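Applied to the original question, a sketch might look like this (the sample filenames are made up; !(0|5) matches any hundreds digit except 0 or 5):

```shell
cd "$(mktemp -d)"
touch a-05000.dat a-05100.dat a-05500.dat a-12300.dat   # sample files (assumed names)
shopt -s extglob
# Remove every a-NNX00.dat where the hundreds digit X is not 0 or 5.
rm a-[0-9][0-9]!(0|5)00.dat
ls    # a-05000.dat and a-05500.dat remain
```

Running ls with the same pattern first is a cheap way to preview what rm would delete.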

find . -regextype posix-basic -regex './a-[0-9]\{3\}00\.dat' -not -regex './a-[0-9]\{2\}[05]00\.dat' -exec rm -i {} \;
It looks ugly, but it is extensible where basic globbing is not enough.

Related

Is there a ZSH equivalent to "shopt -s nullglob"?

I'm currently working on a script that deletes all the PNG files from my Desktop. I want to create an array of file paths then use the rm command on each one.
This is the relevant bit of code:
#!/usr/bin/env bash
shopt -s nullglob
files=("$HOME"/Desktop/*.png)
files_found="${#files[@]}"
shopt -u nullglob
It has been recommended that I use shopt in case there are no matching files.
However, I'm on macOS and just discovered that shopt is not available in zsh. When I run the script I get command not found: shopt.
I've found that zsh has an equivalent called setopt, however after reading through the documentation I can't quite figure out which option is the correct one to use in this case. I can't seem to find any examples either.
Can anyone point me in the right direction?
The corresponding option in zsh is CSH_NULL_GLOB (documented in man zshoptions).
setopt CSH_NULL_GLOB
(As far as I can tell, the idea of a pattern disappearing rather than being treated literally comes from csh.)
The more zsh-like approach is not to set this as a general option (as suggested in the answer given by chepner), but to decide for each pattern whether or not you want the nullglob effect. For example,
for f in x*y*(N)
do
echo $f
done
simply skips the loop if there are no files matching the pattern.
I've just come to the realisation that the issue of shopt not being found was due to me auto-loading the file as a zsh function.
The script worked perfectly when I ran it like so:
bash ./tidy-desktop
Previously I had been running it just with the command tidy-desktop
Instead I now have this in my zsh_aliases:
alias tidy-desktop="~/.zshfn/tidy-desktop"
Thanks to @Charles Duffy for helping me figure out what was going on there!

Preprocess line before it is processed by bash

Is there a way to preprocess a line entered into bash in interactive mode before it is processed by bash?
I'd like to introduce some custom shorthand syntax to deal with long paths. For example, instead of writing 'cd /one/two/three/four/five', I'd like to be able to write something like 'cd /.../five', and then my preprocessing script would replace this by the former command (if a unique directory 'five' exists somewhere below /).
I found http://glyf.livejournal.com/63106.html which describes how to execute a hook function before a command is executed. However, the approach does not allow altering the command to be executed.
There's no good way of doing this generally for all commands.
However, you can do it for specific commands by overriding them with a function. For your cd case, you can stick something like this in your .bashrc:
cd() {
    local path="$1"
    [[ $path == "/.../five" ]] && path="/one/two/three/four/five"
    builtin cd "$path"
}
In bash 4 or later, you can use the globstar option.
shopt -s globstar
cd /**/five
assuming that five is a unique directory.
The short answer is: not directly. As you have found, the PROMPT_COMMAND shell variable allows you to run a command before the prompt is displayed, which can allow for some very creative uses, e.g. unlimited bash history, but nothing that would allow you to parse and replace input directly.
What you are wanting to do can be accomplished using functions and aliases within your .bashrc. One approach would be to use either findutils-locate or simply a find command to search directories below the present working directory for the last component in the ellipsed path, and then provide the full path in return. However, even with its indexing, locate would take a bit of time, and depending on the depth, find itself may be too slow for doing this generically for all possible directories. If, however, you had a list of specific directories you would like to implement something like this for, then the solution would be workable and relatively easy.
To provide any type of prototype or further detail, we would need to know more about how you intend to use the path information, and whether multiple paths could be provided in a single command.
Another issue arises if the directory five is non-unique...
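Putting those pieces together, here is a rough, hypothetical sketch of such a wrapper (the cdd name and the CDD_ROOT variable are assumptions for illustration, not from the original posts; -print -quit is GNU find and stops at the first match, so a non-unique name silently picks one):

```shell
# Hypothetical sketch: expand /.../name to the first matching directory.
CDD_ROOT=${CDD_ROOT:-/}          # search root; / per the question, but configurable
cdd() {
    local target=$1
    if [[ $target == /.../* ]]; then
        local name=${target#/.../}        # strip the /.../ shorthand prefix
        local match
        match=$(find "$CDD_ROOT" -type d -name "$name" -print -quit 2>/dev/null)
        [[ -n $match ]] && target=$match  # fall back to the literal path if no match
    fi
    builtin cd "$target"
}
```

With CDD_ROOT=/one, cdd /.../five would change into /one/two/three/four/five, assuming a unique directory named five exists below the root.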

BASH find -name (read variable)

I'm new to bash and have encountered a problem I can't solve. The issue is I need to use find -name with a name defined as a variable. Part of the script:
read MYNAME
find -name $MYNAME
But when I run the script and type in '*sh' for read, there are 0 results.
However, if i type directly in the terminal:
find -name '*sh'
it's working fine.
I also tried
read MYNAME
find -name \'$MYNAME\'
typing *sh for read, with no success.
Can anyone help me out?
Most probably
read MYNAME
find -name "$MYNAME"
is the version you are looking for. Without the double quotes, the shell expands the * in your *sh example before running find; that's why your first attempt didn't work.
You probably want
find -name "$MYNAME"
since this prevents $MYNAME from being subject to bash's pathname expansion (a.k.a. "globbing"), resulting in *sh being passed intact to find. One key difference is that globbing will not match hidden files such as .ssh, whereas find -name "*sh" will. However, since you don't define the expected behaviour you are seeking, it's hard to say what you need for sure.
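A quick demonstration of the difference (the filenames are made up for illustration):

```shell
cd "$(mktemp -d)"
touch run.sh notes.txt .ssh
pattern='*sh'
echo $pattern             # unquoted: the shell expands it here -> run.sh
echo "$pattern"           # quoted: stays literal -> *sh
find . -name "$pattern"   # find does the matching: ./run.sh and ./.ssh
```

Note that the shell's own expansion skipped the hidden .ssh, while find matched it.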

Keep wildcards from resolving in a ksh script

I am reading in a list of file names:
*.txt *.xml
which are space delimited. I read this into a variable in my ksh script, and I want to be able to manipulate it before putting each of them into a find command. The problem is, as soon as I do anything with the variable (for instance, breaking it into an array), the * resolves into filenames that are in my script's directory. What I want is for the *.txt to remain unchanged, so I can put that into my find command.
How do I do this? Unfortunately, I'm at work and can't just use perl or some other language.
set -f
turns off globbing in ksh, so * and ? characters are not expanded (globbed).
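A minimal sketch of the read-then-split step (this array-assignment form works in both ksh93 and bash; the pattern list is assumed):

```shell
set -f                      # disable pathname expansion
list='*.txt *.xml'          # e.g. read from input in the real script
patterns=($list)            # word-splitting happens, globbing does not
set +f                      # restore normal behaviour
for p in "${patterns[@]}"; do
    printf '%s\n' "$p"      # each pattern intact, ready for: find . -name "$p"
done
```

Quoting "$p" when it is finally used keeps the pattern from expanding at that point too.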
What's wrong with
'*.txt' '*.xml'
? Otherwise, you'll have to show us more of your issue. Maybe edit your post to include a small test case that illustrates your problem, plus the desired output or intermediate values.

shell scripting help

This is one of my homework exercise.
Write a shell program, which will take a directory as an argument.
The script should then print all the regular files in the directory and all
the recursive directories, with the following information in the given order for
each of the files
File name (full name from the specified directory), file size, owner
In case the directory argument is not given, the script should assume the
directory to be the current working directory
I am confused about how to approach this problem. For the listing of files part, I tried ls -R | awk ... but I was not able to do it because I could not find a suitable field separator for awk.
I know it's unfair to ask for a solution, but can you give me a hint on how to proceed with the problem? Thanks in advance.
You really don't want to use ls and awk for this. Instead you want to check the documentation for find to figure out what string to put in the following script:
find ${1:-.} -type f -printf "format-string-to-be-determined-by-reader\n"
The problem is that parsing the output of ls is complicated at best and dangerous at worst.
What you'll want to do is use find to produce the list of files and a small shell script to produce the output you want. While there are many possible methods to accomplish this I would use the following general form
while read -r file ; do
# retrieve information about $file
# print that information on one line
done < <(find ...)
With a suitable find command to select the files. To retrieve the metadata about the files I would use stat inside the loop, probably multiple times.
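As a hedged sketch of that shape (using GNU stat's -c format strings; on BSD/macOS you would need stat -f instead, and the find expression is left deliberately minimal):

```shell
# Sketch: print name, size, and owner for each regular file under $1 (default .).
dir=${1:-.}
while IFS= read -r file; do
    size=$(stat -c %s "$file")           # file size in bytes (GNU stat)
    owner=$(stat -c %U "$file")          # owning user name
    printf '%s %s %s\n' "$file" "$size" "$owner"
done < <(find "$dir" -type f)
```

The process substitution (done < <(find ...)) keeps the while loop in the current shell, which matters if you ever want to accumulate results in variables.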
I hope that's enough of a hint, but if you want a complete answer I can provide one.
awk is fine... use " +" as the separator.
Bah. Where's the challenge in using ls or find? May as well write a one-liner in perl to do all the work, and then just call the one-liner from a script. ;)
You can do your recursive directory traversal in the shell natively, and use stat to get the size and owner. Basically, you write a function that lists the directory (for element in *) and has the function change into the directory and call itself if [[ -d $element ]] is true. Something like
process_dir() {
    for elem in *; do
        do_print "$elem"
        if [[ -d "$elem" ]]
        then
            cd "$elem"
            process_dir
            cd ..
        fi
    done
}
or something akin to that.
Yeah, you'll have a zillion system calls to stat, but IMHO that's probably preferable to machine-parsing the output of a program whose output is intended to be human-readable. In this case, where performance is not an issue, it's worth it.
For bonus super happy fun times, change the value of IFS to a value which won't appear in a filename, so the shell globbing won't get confused by files containing whitespace in their names. I'd suggest either a newline or a slash.
Or take the easy way out and just use find with printf. :)
