Operating on multiple specific folders at once with cp and rm commands - bash

I'm new to linux (using bash) and I wanted to ask about something that I do often while I work, I'll give two examples.
Deleting multiple specific folders inside a certain directory.
Copying multiple specific folders into a ceratin directory.
I succesfully done this with files, using find with some regex and then using -exec and -delete. But for folders I found it more problematic, because I had problem pipelining the list of folders I got to the cp/rm command succescfully, each time getting the "No such file or directory error".
Looking online I found the following command (in my case for copying all folders starting with a Z):
cp -r $(ls -A | grep "Z*") destination
But when I execute it it says nothing and the prompt won't show up again until I hit Ctrl+C and nothing is copied.
How can I achieve what I'm looking for? For both cp and rm.
Thanks in advance!

First of all, you are trying to grep "Z*" but it means you are looking for Z, ZZ, ZZZZ, ZZZZZ ?
also try to execute ls -A - you will get multiple columns. I think need at least ls -1A to print result one per line.
So for your command try something like:
cp -r $(ls -1A|grep "^p") destination
or
cp -r $(ls -1A|grep "^p") -t destination
But all the above is just to correct syntax of your example.
It is much better to use find. Just in case try to put target directory in quotas like:
find <PATH_FROM> -type d -exec cp -r \"{}\" -t target \;

Related

Rename files in bash based on content inside

I have a directory which has 70000 xml files in it. Each file has a tag which looks something like this, for the sake of simplicity:
<ns2:apple>, <ns2:orange>, <ns2:grapes>, <ns2:melon>. Each file has only one fruit tag, i.e. there cannot be both apple and orange in the same file.
I would like rename every file (add "1_" before the beginning of each filename) which has one of: <ns2:apple>, <ns2:orange>, <ns2:melon> inside of it.
I can find such files with egrep:
egrep -r '<ns2:apple>|<ns2:orange>|<ns2:melon>'
So how would it look as a bash script, which I can then user as a cron job?
P.S. Sorry I don't have any bash script draft, I have very little experience with it and the time is of the essence right now.
This may be done with this script:
#!/bin/sh
find /path/to/directory/with/xml -type f | while read f; do
grep -q -E '<ns2:apple>|<ns2:orange>|<ns2:melon>' "$f" && mv "$f" "1_${f}"
done
But it will rescan the directory each time it runs and append 1_ to each file containing one of your tags. This means a lot of excess IO and files with certain tags will be getting 1_ prefix each run, resulting in names like 1_1_1_1_file.xml.
Probably you should think more on design, e.g. move processed files to two directories based on whether file has certain tags or not:
#!/bin/sh
# create output dirs
mkdir -p /path/to/directory/with/xml/with_tags/ /path/to/directory/with/xml/without_tags/
find /path/to/directory/with/xml -maxdepth 1 -mindepth 1 -type f | while read f; do
if grep -q -E '<ns2:apple>|<ns2:orange>|<ns2:melon>'; then
mv "$f" /path/to/directory/with/xml/with_tags/
else
mv "$f" /path/to/directory/with/xml/without_tags/
fi
done
Run this command as a dry run, then remove --dry_run to actually rename the files:
grep -Pl '(<ns2:apple>|<ns2:orange>|<ns2:melon>)' *.xml | xargs rename --dry-run 's/^/1_/'
The command-line utility rename comes in many flavors. Most of them should work for this task. I used the rename version 1.601 by Aristotle Pagaltzis. To install rename, simply download its Perl script and place into $PATH. Or install rename using conda, like so:
conda install rename
Here, grep uses the following options:
-P : Use Perl regexes.
-l : Suppress normal output; instead print the name of each input file from which output would normally have been printed.
SEE ALSO:
grep manual

Passing variables to cp / mv

I can use the cp or mv command to copy/mv files to a new folder manually but while in a for loop, it fails.
I've tried various ways of doing this and none seem to work. The most frustrating part is it works when run locally.
A simple version of what I'm trying to do is shown below:
#!bin/bash
#Define path variables
source_dir=/home/me/loop
destination_dir=/home/me/loop/new
#Change working dir
cd "$source_dir"
#Step through source_dir for each .txt. file
for f in *.txt
do
# If the txt file was modified within the last 300 minutes...
if [[ $(find "$f" -mmin -300) ]]
then
# Add breaks for any spaces in filenames
f="${f// /\\ }"
# Copy file to destination
cp "$source_dir/$f $destination_dir/"
fi
done
Error message is:
cp: missing destination file operand after '/home/me/loop/first\ second.txt /home/me/loop/new/'
Try 'cp --help' for more information.
However, I can manually run:
mv /home/me/loop/first\ second.txt /home/me/loop/new/
and it works fine.
I get the same error using cp and similar errors using rsync so I'm not sure what I'm doing wrong...
cp "$source_dir/$f $destination_dir/"
When you surround both arguments with double quotes you turn them into one argument with an embedded space. Quote them separately.
cp "$source_dir/$f" "$destination_dir/"
There's no do anything special for spaces beforehand. The quoting already ensures files with whitespace are handled correctly.
# Add breaks for any spaces in filenames
f="${f// /\\ }"
Let's take a step back, though. Looping over all *.txt files and then checking each one with find is overly complicated. find already loops over multiple files and does arbitrary things to those files. You can do everything in this script in a single find command.
#!bin/bash
source_dir=/home/me/loop
destination_dir=/home/me/loop/new
find "$source_dir" -name '*.txt' -mmin -300 -exec cp -t "$destination_dir" {} +
You need to divide it in to two strings, like this:
cp "$source_dir/$f" "$destination_dir/"
by having as one you are basically telling cp that the entire line is the first parameter, where it is actually two (source and destination).
Edit: As #kamil-cuk and #aaron states there are better ways of doing what you try to do. Please read their comments

usage error when moving files of one extension to a directory

I have files with extension .mp3 that are in different folders within a directory, and I need to move them all to one directory to work with. I have looked at multiple tutorials and questions on SO, and no matter what I try, I either get
usage: mv [-f | -i | -n] [-v] source target
mv [-f | -i | -n] [-v] source ... directory
or "No such file or directory".
There are way too many to into each individual folder, but for now I cd into one of the folders. With this:
mv *.mp3 /Users/myname/Volumes/LaCie/model/folder/media
I get the "usage" error above. I looked at Moving files to a directory and tried:
find . | grep ".mp3" | xargs mv /Users/myname/Volumes/LaCie/model/folder/media
and get same error.
What am I doing wrong? What is the correct syntax? Also, will I be able to extract the mp3 files and move them if I'm in a directory that contains the directory with the files, but not in that directory itself? I appreciate insights into this. Thanks.
EDIT: A major part of this issue was that the path when using /Volumes starts with /Volumes. I was doing /Users/myname/Volumes and that was one reason I had so much trouble.
The problem with the way you're using xargs is that by default, xargs will append the arguments to the end of the command string you've provided it. So you'll end up running a bunch of mv commands that look like this:
mv /Users/myname/Volumes/LaCie/model/folder/media foo.mp3
You can fix that by telling xargs where to place the arguments within the command:
<other commands> | xargs -I{} mv {} /Users/myname/Volumes/LaCie/model/folder/media
The -I option lets you provide any arbitrary string as a placeholder for where the args should go. I used {} just because that seems to be the conventional token that you see used in similar contexts (such as with the -exec option of find, as shown below).
But there's an easier way to do it, using the find command's -exec option:
find . -name '*.mp3' -exec mv {} /Users/myname/Volumes/LaCie/model/folder/media \;
Also note the -name '*.mp3' part, which lets you get rid of the | grep ".mp3" part.
Lastly, just to be safe, I'd personally put a / at the end of your destination path. If the media directory doesn't exist in /Users/myname/Volumes/LaCie/model/folder, or if a non-directory item (such as a regular file or a symlink) named media exists in that location, then the find command above will happily just move all your mp3 files, one at a time, to that folder, creating a file named media there each time. And you will have lost all of your mp3 files except for the last one, which will now be a file named media.
However, with a trailing /, if media is not a directory, the mv commands will fail with an error saying so. So the revised command would be:
find . -name '*.mp3' -exec mv {} /Users/myname/Volumes/LaCie/model/folder/media/ \;
Update: Per Gordon Davisson's comment below, you should also consider adding -i or -n to the mv command, to avoid accidentally overwriting files with duplicate names. For example, if you have a/foo.mp3 and b/foo.mp3, the above command will overwrite one with the other. The -i option will cause mv to prompt you to confirm each file move, whereas the -n option (a.k.a. --no-clobber) will prevent mv from overwriting a file if a file with the same name already exists.
There are different implementations and versions of mv. You can check the allowed syntax of your version using man mv.
If you have GNU mv you could use mv -t target/dir *.mp3.
Most implementations should support mv *.mp3 target/dir.
If your mv only supports the absolute minimum of mv source target with exactly one source and one target file you can use the following command which should always work if target/dir/ exists.
for i in *.mp3; do mv "$i" "target/dir/$i"; done

mv Bash Shell Command (on Mac) overwriting files even with a -i?

I am flattening a directory of nested folders/picture files down to a single folder. I want to move all of the nested files up to the root level.
There are 3,381 files (no directories included in the count). I calculate this number using these two commands and subtracting the directory count (the second command):
find ./ | wc -l
find ./ -type d | wc -l
To flatten, I use this command:
find ./ -mindepth 2 -exec mv -i -v '{}' . \;
Problem is that when I get a count after running the flatten command, my count is off by 46. After going through the list of files before and after (I have a backup), I found that the mv command is overwriting files sometimes even though I'm using -i.
Here's details from the log for one of these files being overwritten...
.//Vacation/CIMG1075.JPG -> ./CIMG1075.JPG
..more log
..more log
..more log
.//dog pics/CIMG1075.JPG -> ./CIMG1075.JPG
So I can see that it is overwriting. I thought -i was supposed to stop this. I also tried a -n and got the same number. Note, I do have about 150 duplicate filenames. Was going to manually rename after I flattened everything I could.
Is it a timing issue?
Is there a way to resolve?
NOTE: it is prompting me that some of the files are overwrites. On those prompts I just press Enter so as not to overwrite. In the case above, there is no prompt. It just overwrites.
Apparently the manual entry clearly states:
The -n and -v options are non-standard and their use in scripts is not recommended.
In other words, you should mimic the -n option yourself. To do that, just check if the file exists and act accordingly. In a shell script where the file is supplied as the first argument, this could be done as follows:
[ -f "${1##*/}" ]
The file, as first argument, contains directories which can be stripped using ##*/. Now simply execute the mv using ||, since we want to execute when the file doesn't exist.
[ -f "${1##*/}" ] || mv "$1" .
Using this, you can edit your find command as follows:
find ./ -mindepth 2 -exec bash -c '[ -f "${0##*/}" ] || mv "$0" .' '{}' \;
Note that we now use $0 because of the bash -c usage. It's first argument, $0, can't be the script name because we have no script. This means the argument order is shifted with respect to a usual shell script.
Why not check if file exists, prior move? Then you can leave the file where it is or you can rename it or do something else...
Test -f or, [] should do the trick?
I am on tablet and can not easyly include the source.

Can I limit the recursion when copying using find (bash)

I have been given a list of folders which need to be found and copied to a new location.
I have basic knowledge of bash and have created a script to find and copy.
The basic command I am using is working, to a certain degree:
find ./ -iname "*searchString*" -type d -maxdepth 1 -exec cp -r {} /newPath/ \;
The problem I want to resolve is that each found folder contains the files that I want, but also contains subfolders which I do not want.
Is there any way to limit the recursion so that only the files at the root level of the found folder are copied: all subdirectories and files therein should be ignored.
Thanks in advance.
If you remove -R, cp doesn't copy directories:
cp *searchstring*/* /newpath
The command above copies dir1/file1 to /newpath/file1, but these commands copy it to /newpath/dir1/file1:
cp --parents *searchstring*/*(.) /newpath
for GNU cp and zsh
. is a qualifier for regular files in zsh
cp --parents dir1/file1 dir2 copies file1 to dir2/dir1 in GNU cp
t=/newpath;for d in *searchstring*/;do mkdir -p "$t/$d";cp "$d"* "$t/$d";done
find *searchstring*/ -type f -maxdepth 1 -exec rsync -R {} /newpath \;
-R (--relative) is like --parents in GNU cp
find . -ipath '*searchstring*/*' -type f -maxdepth 2 -exec ditto {} /newpath/{} \;
ditto is only available on OS X
ditto file dir/file creates dir if it doesn't exist
So ... you've been given a list of folders. Perhaps in a text file? You haven't provided an example, but you've said in comments that there will be no name collisions.
One option would be to use rsync, which is available as an add-on package for most versions of Unix and Linux. Rsync is basically an advanced copying tool -- you provide it with one or more sources, and a destination, and it makes sure things are synchronized. It knows how to copy things recursively, but it can't be told to limit its recursion to a particular depth, so the following will copy each item specified to your target, but it will do so recursively.
xargs -L 1 -J % rsync -vi -a % /path/to/target/ < sourcelist.txt
If sourcelist.txt contains a line with /foo/bar/slurm, then the slurm directory will be copied in its entiriety to /path/to/target/slurm/. But this would include directories contained within slurm.
This will work in pretty much any shell, not just bash. But it will fail if one of the lines in sourcelist.txt contains whitespace, or various special characters. So it's important to make sure that your sources (on the command line or in sourcelist.txt) are formatted correctly. Also, rsync has different behaviour if a source directory includes a trailing slash, and you should read the man page and decide which behaviour you want.
You can sanitize your input file fairly easily in sh, or bash. For example:
#!/bin/sh
# Avoid commented lines...
grep -v '^[[:space:]]*#' sourcelist.txt | while read line; do
# Remove any trailing slash, just in case
source=${line%%/}
# make sure source exist before we try to copy it
if [ -d "$source" ]; then
rsync -vi -a "$source" /path/to/target/
fi
done
But this still uses rsync's -a option, which copies things recursively.
I don't see a way to do this using rsync alone. Rsync has no -depth option, as find has. But I can see doing this in two passes -- once to copy all the directories, and once to copy the files from each directory.
So I'll make up an example, and assume further that folder names do not contain special characters like spaces or newlines. (This is important.)
First, let's do a single-pass copy of all the directories themselves, not recursing into them:
xargs -L 1 -J % rsync -vi -d % /path/to/target/ < sourcelist.txt
The -d option creates the directories that were specified in sourcelist.txt, if they exist.
Second, let's walk through the list of sources, copying each one:
# Basic sanity checking on input...
grep -v '^[[:space:]]*#' sourcelist.txt | while read line; do
if [ -d "$line" ]; then
# Strip trailing slashes, as before
source=${line%%/}
# Grab the directory name from the source path
target=${source##*/}
rsync -vi -a "$source/" "/path/to/target/$target/"
fi
done
Note the trailing slash after $source on the rsync line. This causes rsync to copy the contents of the directory, rather than the directory.
Does all this make sense? Does it match your requirements?
You can use find's ipath argument:
find . -maxdepth 2 -ipath './*searchString*/*' -type f -exec cp '{}' '/newPath/' ';'
Notice the path starts with ./ to match find's search directory, ends with /* in order to exclude files in the top level directory, and maxdepth is set to 2 to only recurse one level deep.
Edit:
Re-reading your comments, it seems like you want to preserve the directory you're copying from? E.g. when searching for foo*:
./foo1/* ---> copied to /newPath/foo1/* (not to /newPath/*)
./foo2/* ---> copied to /newPath/foo2/* (not to /newPath/*)
Also, the other requirement is to keep maxdepth at 1 for speed reasons.
(As pointed out in the comments, the following solution has security issues for specially crafted names)
Combining both, you could use this:
find . -maxdepth 1 -type d -iname 'searchString' -exec sh -c "mkdir -p '/newPath/{}'; cp "{}/*" '/newPath/{}/' 2>/dev/null" ';'
Edit 2:
Why not ditch find altogether and use a pure bash solution:
for d in *searchString*/; do mkdir -p "/newPath/$d"; cp "$d"* "/newPath/$d"; done
Note the / at the end of the search string, causing only directories to be considered for matching.

Resources