Dealing with filenames using shell regex - shell

138.096.000.015.00111-138.096.201.072.38717
138.096.000.015.01008-138.096.201.072.00790
138.096.201.072.00790-138.096.000.015.01008
138.096.201.072.33853-173.194.020.147.00080
138.096.201.072.34293-173.194.034.009.00080
138.096.201.072.38717-138.096.000.015.00111
138.096.201.072.41741-173.194.034.025.00080
138.096.201.072.50612-173.194.034.007.00080
173.194.020.147.00080-138.096.201.072.33853
173.194.034.007.00080-138.096.201.072.50612
173.194.034.009.00080-138.096.201.072.34293
173.194.034.025.00080-138.096.201.072.41741
I have many folders, each containing many files with names like the ones above.
I want to remove the files whose names contain the substring "138.096.000",
and sometimes I want to list the files whose names contain the substring "00080".

To delete files with name containing "138.096.000":
find /root/of/files -type f -name '*138.096.000*' -exec rm {} \;
To list files with names containing "00080":
find /root/of/files -type f -name '*00080*'

rm $(find . -name \*138.096.000\*)
This uses the find command to find the appropriate files. This is executed within a subshell, and the output (the list of files) is used by rm. Note the escaping of the * pattern, since the shell will try and expand * itself.
This assumes you don't have filenames with spaces etc. You may prefer to do something like:
for i in $(find . -name \*138.096.000\*); do
  rm "$i"
done
in this scenario, or even
find . -name \*138.096.000\* | xargs rm
Note that the loop above executes rm once for each file, whereas the xargs variant passes many names to each rm invocation (depending on the number of files you have, it may run rm only once).
However, if you're using zsh then you can simply do:
rm **/*138.096.000*
(I'm assuming your directories aren't named like your files. Note the -type f test as used in Kamil's answer if this is the case.)
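For names that may contain spaces, one way to sidestep word splitting is to let find hand the names to rm itself, or to pair -print0 with xargs -0. The following is only a sketch: the scratch directory and the file with spaces in its name are made up for illustration.

```shell
# Scratch directory (hypothetical) so nothing real is deleted.
demo=$(mktemp -d)
touch "$demo/138.096.000.015.00111-138.096.201.072.38717" \
      "$demo/138.096.201.072.33853-173.194.020.147.00080" \
      "$demo/copy of 138.096.000.015.01008"

# Preview the matches before deleting anything.
find "$demo" -type f -name '*138.096.000*'

# find passes each name to rm as a single argument, many names per rm call.
find "$demo" -type f -name '*138.096.000*' -exec rm -- {} +

# NUL-separated alternative for commands that have to be driven through xargs
# (-print0 and -0 are widely available, but older systems may lack them).
find "$demo" -type f -name '*00080*' -print0 | xargs -0 ls -l

remaining=$(ls "$demo")
```

Running the first find on its own is a cheap dry run before the destructive one.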

Related

Use Find and xargs to delete dups in arraylist

I have arraylist of files and I am trying to use rm with xargs to remove files like:
dups=["test.csv","man.csv","teams.csv"]
How can I pass the complete dups array to find and delete these files?
I want to make changes below to make it work
find ${dups[@]} -type f -print0 | xargs -0 rm
Your find command is wrong.
# XXX buggy: read below
find foo bar baz -type f -print0
means look in the paths foo, bar, and baz, and print any actual files within those. (If one of the paths is a directory, it will find all files within that directory. If one of the paths is a file in the current directory, it will certainly find it, but then what do you need find for?)
If these are files in the current directory, simply
rm -- "${dups[@]}"
(notice also how to properly quote the array expansion).
If you want to look in all subdirectories for files with these names, you will need something like
find . -type f \( -name "test.csv" -o -name "man.csv" -o -name "teams.csv" \) -delete
or perhaps
find . -type f -regextype egrep -regex '.*/(test\.csv|man\.csv|teams\.csv)' -delete
though the -regex features are somewhat platform-dependent (try find -E instead of find -regextype egrep on *BSD/MacOS to enable ERE regex support).
Notice also how find has a built-in predicate -delete so you don't need the external utility rm at all. (Though if you wanted to run a different utility, find -exec utility {} + is still more efficient than xargs. Some really old find implementations didn't have the + syntax for -exec but you seem to be on Linux where it is widely supported.)
Building this command line from an array is not entirely trivial; I have proposed a duplicate which has a solution to a similar problem. But of course, if you are building the command from Java, it should be easy to figure out how to do this on the Java side instead of passing in an array to Bash; and then, you don't need Bash at all (you can pass this to find directly, or at least use sh instead of bash because the command doesn't require any Bash features).
I'm not a Java person, but from Python this would look like
import subprocess
command = ["find", ".", "-type", "f"]
prefix = "("
for filename in dups:
command.extend([prefix, "-name", filename])
prefix = "-o"
command.extend([")", "-delete"])
subprocess.run(command, check=True, encoding="utf-8")
Notice how the backslashes and quotes are not necessary when there is no shell involved.
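For comparison, roughly the same construction can be done in Bash itself. This is a sketch, not the linked duplicate's solution: the -name tests are collected in an array and the leading -o is sliced off. The scratch directory, the args array, and the sample files are assumptions for the demo, and dups is assumed to be non-empty.

```shell
# Scratch tree (hypothetical) to try the command on.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/test.csv" "$demo/sub/man.csv" "$demo/sub/keep.csv"

dups=("test.csv" "man.csv" "teams.csv")

# Build: -o -name test.csv -o -name man.csv -o -name teams.csv
args=()
for name in "${dups[@]}"; do
  args+=(-o -name "$name")
done

# "${args[@]:1}" drops the leading -o; swap -delete for -print to preview.
find "$demo" -type f \( "${args[@]:1}" \) -delete

left=$(find "$demo" -type f | sort)
```

Because each name is a separate array element, file names containing spaces or glob characters reach find unmodified.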

Find and rename multiple files using a bash script in Linux

As an example, in a directory /home/hel/files/ there are thousands of files and hundreds of directories.
An application saves there its output files with special characters in the file names.
I want to replace these special characters with underscores in all file names. e.g. -:"<>#
I wrote a bash script which simply repeats a command to rename the files using Linux/Unix 'rename'.
Example: file name: rename.sh
#!/bin/bash
rename "s/\'/_/g" *
rename 's/[-:"<>#\,&\s\(\)\[\]?!–~%„“;│\´\’\+#]/_/g' *
rename 'y/A-Z/a-z/' *
rename 's/\.(?=[^.]*\.)/_/g' *
rename 's/[_]{2,}/_/g' *
I execute the following find command:
find /home/hel/files/ -maxdepth 1 -type f -execdir /home/hel/scripts/rename.sh {} \+
Now the issue:
This works fine, except that it also renames subdirectories if their names contain the searched characters.
The find command searches just for files and not for directories.
I tried some other find variations like:
find /home/hel/files/ -maxdepth 1 -type f -execdir sh /home/hel/scripts/rename.sh {} \+
find /home/hel/files/ -maxdepth 1 -type f -execdir sh /home/hel/scripts/rename.sh {} +
find /home/hel/files/ -maxdepth 1 -type f -execdir sh /home/hel/scripts/rename.sh {} \;
They are all working, but with the same result.
What is not working:
find /home/hel/files/ -maxdepth 1 -type f -exec sh /home/hel/scripts/rename.sh {} \+
This one is dangerous, because it also renames the directories and files in the current directory from which you call the find command.
Maybe someone has an idea why this happens, or has a better solution.
The script rename.sh did not use its command line arguments at all, but instead searched files and directories (!) on its own using the glob *.
Change your script to the following.
#!/bin/bash
rename -d s/\''/_/g;
s/[-:"<>#\,&\s\(\)\[\]?!–~%„“;│\´\’\+#]/_/g;
y/A-Z/a-z/;
s/\.(?=[^.]*\.)/_/g;
s/[_]{2,}/_/g' "$@"
Then use find ... -maxdepth 1 -type f -exec sh .../rename.sh {} +.
Changes Made
Use "$@" instead of * to process the files given as arguments rather than everything in the current directory.
Execute rename only once, as a 2nd rename wouldn't find the files specified with "$@" after they were renamed by the 1st rename.
Use the -d option such that only the basenames are modified. find always puts a path in front of the files, at the very least ./. Without this option rename would change ./filename to mangledPath/newFilename and therefore move the file to another directory.
Note that man rename is a bit misleading
--path, --fullpath
Rename full path: including any directory component. DEFAULT
-d, --filename, --nopath, --nofullpath
Do not rename directory: only rename filename component of path.
For a given path rename -d 's...' some/path/basename just processes the basename and ignores the leading components some/path/. If basename is a directory it will still be renamed despite the -d option.
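The difference between * and "$@" can be seen without rename at all. In this sketch the two functions and the scratch directory are invented for the demonstration; they only print the names they would operate on.

```shell
# Scratch directory (hypothetical) with one subdirectory and two files.
demo=$(mktemp -d)
mkdir "$demo/sub:dir"
touch "$demo/a:b.txt" "$demo/plain.txt"

# Like the original script: ignores its arguments and globs the directory.
uses_glob() { printf '%s\n' *; }

# Like the fixed script: touches only what the caller passed in.
uses_args() { printf '%s\n' "$@"; }

# Both are called with a single file name, from inside the scratch directory.
glob_out=$(cd "$demo" && uses_glob "a:b.txt")
args_out=$(cd "$demo" && uses_args "a:b.txt")

printf 'glob sees:\n%s\n\nargs see:\n%s\n' "$glob_out" "$args_out"
```

The glob variant lists the subdirectory as well, which is how the directories ended up renamed even though find was restricted to -type f.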

How to delete all files or Sub-folders (both) in a folder except 2 folders with shell script

I would like to know how to delete all the contents of a folder (it contains other folders and some files) except for 2 folders and their contents.
The below command keeps the folder conf and removes all the other folders
find . ! -name 'conf' -type d -exec rm -rf {} +
I have tried to pipe it like below
find . -maxdepth 1 -type d ! -name 'conf' |find . -maxdepth 1 -type d ! -name 'foldername2'
but it didn't work.
Is it possible to do this with a single command?
You haven't specified which shell you're using, but if you're using bash then extended globs can help:
printf '%s\n' !(@(conf|foldername2)/)
If you're happy with the list of files and directories produced by that, then pass the same glob to rm -rf:
rm -rf !(@(conf|foldername2)/)
Inside a script, you may need to enable extglob using shopt -s extglob. Later, you can change -s to -u to unset the option.
If you're using a different shell, then you can add some more options to your find command:
find . -mindepth 1 -maxdepth 1 ! -name 'conf' ! -name 'foldername2' -exec rm -rf {} +
Try it without the -exec part first to print the matches rather than deleting everything.
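The dry-run-then-delete sequence can be tried safely in a scratch directory first. This is a sketch: the directory layout is made up, and -mindepth 1 is included so that the starting directory itself is never handed to rm (both -mindepth and -maxdepth are extensions found in GNU and BSD find).

```shell
# Scratch tree (hypothetical): two folders to keep, one folder and one file to remove.
demo=$(mktemp -d)
mkdir -p "$demo/conf" "$demo/foldername2/inner" "$demo/other/nested"
touch "$demo/conf/a.cfg" "$demo/stray.txt"

# Dry run: print what would be removed.
find "$demo" -mindepth 1 -maxdepth 1 ! -name conf ! -name foldername2 -print

# Same tests, now deleting.
find "$demo" -mindepth 1 -maxdepth 1 ! -name conf ! -name foldername2 \
  -exec rm -rf -- {} +

left=$(ls "$demo" | tr '\n' ' ')
```

Only the top-level names are tested, so anything inside conf and foldername2 is left alone.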
Maybe my little utility program can help you.
First, find the paths of your .sh files, then find the main folders that contain those files, and then remove everything except those folders.
I wrote drr for exactly this kind of task.
drr stands for "remove or rename files based on regular expressions". It is written in the D language, so you must compile it before using it.
Please be careful, since this is not an appropriate tool for beginners.

Remove all files that start with same prefix, but different filetype

How do I remove all files in a folder that start with the same prefix? For example:
I have files:
SVM1.txt
SVM2.csv
SVM3.mat
helloworld.txt
README.txt
I want to delete all the files that start with 'SVM'. Note that they start with the same prefix, but are of different filetype!
With wildcards, of course.
rm SVM*
In addition to the straightforward
rm SVM*
which might fail (command line too long) if there are many, many matching files, you can use
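find . ! -name . -prune -name 'SVM*' -exec rm {} +
which will repeatedly run rm on as many files at a time as possible until all matching files are deleted. ! -name . -prune keeps find from descending into any subdirectories to find matching files, while still letting it examine the entries of the current directory itself (a bare find . -prune would prune . and match nothing).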
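The non-recursive behaviour is easy to check in a scratch directory. The sketch below uses the ! -name . -prune form, which exempts the starting directory from pruning so that its entries are still examined; the directory and the sub/SVM4.txt file are made up for the test.

```shell
# Scratch directory (hypothetical) mirroring the example files, plus one nested match.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/SVM1.txt" "$demo/SVM2.csv" "$demo/SVM3.mat" \
      "$demo/helloworld.txt" "$demo/README.txt" "$demo/sub/SVM4.txt"

# Run from inside the directory so that the start point is literally ".".
(
  cd "$demo" || exit 1
  find . ! -name . -prune -name 'SVM*' -exec rm {} +
)

# Show what is left; sub/SVM4.txt survives because sub was pruned.
ls -R "$demo"
```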
In the directory where the files are,
ls | grep '^SVM.*' | xargs rm
Stop at grep ^SVM.* to double check that you have the right files to delete, then add the xargs rm.

How can I use terminal to copy and rename files from multiple folders?

I have a folder called "week1", and in that folder there are about ten other folders that all contain multiple files, including one called "submit.pdf". I would like to be able to copy all of the "submit.pdf" files into one folder, ideally using Terminal to expedite the process. I've tried cp week1/*/submit.pdf week1/ as well as cp week1/*/*.pdf week1/, but it only ends up copying one file. I just realized that it has been overwriting the file every time, which is why I'm stuck with one... is there any way I can prevent that from happening?
You don't indicate your OS, but if you're using GNU cp, you can use cp week1/*/submit.pdf --backup=t week1/ to have it (arbitrarily) number files that already exist; but that won't give you any real way to identify which is which.
You could, perhaps, do something like this:
for file in week1/*/submit.pdf; do cp "$file" "${file//\//-}"; done
… which will produce files named something like "week1-subdir-submit.pdf"
For what it's worth, the "${var/s/r}" notation means to take var, but before inserting its value, search for s (\/, meaning /, escaped because of the other special / in that expression), and replace it with r (-), to make the unique filenames.
Edit: There's actually one more / in there, to make it match multiple times, making the syntax (spaced out here only for readability):
"${ var / / \/ / - }"
That is: take var and replace every instance of / with -.
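The single-slash and double-slash forms can be compared side by side. This is a small sketch with a made-up path; the substitution syntax is a Bash feature (also found in ksh and zsh), not plain POSIX sh.

```shell
file='week1/alice/submit.pdf'   # hypothetical path

first_only=${file/\//-}   # one slash after the name: replace the first match only
every=${file//\//-}       # two slashes after the name: replace every match

printf '%s\n%s\n' "$first_only" "$every"
# prints:
#   week1-alice/submit.pdf
#   week1-alice-submit.pdf
```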
find to the rescue! Rule of thumb: If you can list the files you want with find, you can copy them. So try first this:
$ cd your_folder
$ find . -type f -iname 'submit.pdf'
Some notes:
find . means "start finding from the current directory"
-type f means "only find regular files" (i.e., not directories)
-iname 'submit.pdf' "... with case-insensitive name 'submit.pdf'". You don't need to quote the name here, but if you want to search using wildcards, you need to. E.g.:
~ foo$ find /usr/lib -iname '*.So*'
/usr/lib/pam/pam_deny.so.2
/usr/lib/pam/pam_env.so.2
/usr/lib/pam/pam_group.so.2
...
If you want to search case-sensitive, just use -name instead of -iname.
When this works, you can copy each file by using the -exec command. exec works by letting you specify a command to use on hits. It will run the command for each file find finds, and put the name of the file in {}. You end the sequence of commands by specifying \;.
So to echo all the files, do this:
$ find . -type f -iname submit.pdf -exec echo Found file {} \;
To copy them one by one:
$ find . -type f -iname submit.pdf -exec cp {} /destination/folder \;
Hope this helps!
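One caveat when every match has the same name: copying them all into one folder overwrites the earlier copies. A possible sketch, assuming the parent directory name makes a good enough prefix (the week1/alice and week1/bob layout and the out folder are made up):

```shell
# Scratch tree (hypothetical) with two identically named files.
demo=$(mktemp -d)
mkdir -p "$demo/week1/alice" "$demo/week1/bob" "$demo/out"
echo A > "$demo/week1/alice/submit.pdf"
echo B > "$demo/week1/bob/submit.pdf"

(
  cd "$demo/week1" || exit 1
  # The inline sh script receives the destination first, then the matches.
  find . -type f -name 'submit.pdf' -exec sh -c '
    dest=$1; shift
    for f do
      parent=$(basename "$(dirname "$f")")   # ./alice/submit.pdf -> alice
      cp -- "$f" "$dest/$parent-submit.pdf"
    done
  ' sh "$demo/out" {} +
)

copied=$(ls "$demo/out" | tr '\n' ' ')
```

If two matches share the same parent directory name at different depths, this naming would still collide, so a longer prefix would be needed in that case.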
