Need to move 500 specific files from one share directory to another - bash

I have a directory that has 50,000 files; I only need to move a subset of these files that I have in a text file. I am looking for a script/way to move these files out.
I keep finding items on how to move all files but this won't work for me.

Assuming you have GNU coreutils, you can use
xargs -a tomove.txt -d '\n' mv -t /path/to/destination
where tomove.txt contains the names of the files to be moved, one per line. -d '\n' makes sure filenames with blanks in them are interpreted properly by using newline (instead of any blank) as the filename delimiter.
xargs guarantees that we don't run into command line length limitations.
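If GNU xargs or mv -t is not available (for example on a stock macOS), a plain read loop does the same job. This is only a sketch: the function name move_listed is made up for illustration, and it assumes the names in the list are relative to the current directory.

```shell
#!/bin/sh
# move_listed LIST DEST (hypothetical helper): move every file named in LIST
# (one name per line) into DEST. Missing files are reported and skipped.
move_listed() {
    list=$1
    dest=$2
    # "|| [ -n ... ]" also handles a last line without a trailing newline.
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue          # skip blank lines
        if [ -e "$name" ]; then
            mv -- "$name" "$dest"/
        else
            printf 'skipping missing file: %s\n' "$name" >&2
        fi
    done < "$list"
}
```

Run it from the source directory, e.g. move_listed /path/to/tomove.txt /path/to/destination. It starts one mv per file, so it is slower than the xargs version, but 500 files is still quick.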

You can use the mv command. In the following example I move entries in list.txt from the directory old to the directory new.
$ mkdir old new
$ touch old/{a,b,c}.{sh,c,txt}
$ echo "a.sh
c.txt
b.c" > list.txt
$ cd old/
$ mv $(cat ../list.txt) ../new
$ find ../new/
../new/
../new/a.sh
../new/c.txt
../new/b.c

Related

Is there a nice way to move .txt files with a specific character in the file (not filename) to another directory?

Using bash on Mac, I'm trying check a folder of .txt files and, any file that has a bullet character (•) anywhere in it, move those .txt files to a new folder/subdirectory. Any suggestions?
grep can be used for checking for character presence in a file:
#!/bin/bash
for filename in ./a/*.txt; do
    if grep -q • "$filename"; then
        mv "$filename" ./b/
    fi
done
If there are no spaces in the .txt filenames (thanks @jordanm, for pointing that out), you can do it in one line: use grep to print the names of the files (-l) that contain a pattern (-e) and move them to your new_directory:
mv $(grep -l old_directory/*.txt -e •) new_directory/

Shell script to delete files whose names are not in a text file

I have a txt file which contains a list of file names
Example:
10.jpg
11.jpg
12.jpeg
...
The files listed there should be protected from deletion, and all other files in the folder should be deleted.
So I want the opposite logic of this question: Shell command/script to delete files whose names are in a text file
How do I do that?
Use extglob and Bash extended pattern matching !(pattern-list):
!(pattern-list)
Matches anything except one of the given patterns
where a pattern-list is a list of one or more patterns separated by a |.
extglob
If set, the extended pattern matching features described above are enabled.
So for example:
$ ls
10.jpg 11.jpg 12.jpeg 13.jpg 14.jpg 15.jpg 16.jpg a.txt
$ shopt -s extglob
$ shopt | grep extglob
extglob on
$ cat a.txt
10.jpg
11.jpg
12.jpeg
$ tr '\n' '|' < a.txt
10.jpg|11.jpg|12.jpeg|
$ ls !(`tr '\n' '|' < a.txt`)
13.jpg 14.jpg 15.jpg 16.jpg a.txt
The deleted files are 13.jpg 14.jpg 15.jpg 16.jpg a.txt according to the example.
So with extglob and !(pattern-list), we can obtain the files which are excluded based on the file content.
Additionally, if you want the pattern to also match entries starting with . (so that hidden files are included in the result), you can switch on the dotglob option with shopt -s dotglob.
This is one way that will work with bash GLOBIGNORE:
$ cat file2
10.jpg
11.jpg
12.jpg
$ ls *.jpg
10.jpg 11.jpg 12.jpg 13.jpg
$ echo $GLOBIGNORE
$ GLOBIGNORE=$(tr '\n' ':' <file2 )
$ echo $GLOBIGNORE
10.jpg:11.jpg:12.jpg:
$ ls *.jpg
13.jpg
As shown, globbing ignores whatever (file name, pattern, etc.) is listed in the GLOBIGNORE bash variable.
This is why the last ls reports only 13.jpg: the files 10.jpg, 11.jpg and 12.jpg are ignored.
As a result, rm *.jpg removes only 13.jpg on my system:
$ rm -iv *.jpg
rm: remove regular empty file '13.jpg'? y
removed '13.jpg'
When you are done, you can just set GLOBIGNORE to null:
$ GLOBIGNORE=
It is worth mentioning that GLOBIGNORE also accepts glob patterns instead of single filenames, such as *.jpg or my*.mp3.
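To avoid having to reset GLOBIGNORE afterwards, the assignment can be confined to a subshell. The following is a minimal sketch; the function name list_unprotected is hypothetical, and it assumes no protected name contains a colon or newline. Note that a non-null GLOBIGNORE also makes * match hidden files, as if dotglob were set.

```shell
#!/usr/bin/env bash
# list_unprotected KEEPFILE (hypothetical helper): print the entries of the
# current directory that are NOT listed in KEEPFILE (one name or glob per line).
list_unprotected() {
    (
        # The parentheses start a subshell, so this GLOBIGNORE value
        # disappears again when the function returns.
        GLOBIGNORE=$(tr '\n' ':' < "$1")
        for f in *; do
            [ -e "$f" ] && printf '%s\n' "$f"
        done
    )
}
```

Review the printed names first; once the list looks right, the same loop body can call rm -- "$f" instead of printf.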
Alternative:
We can use text-processing tools (grep, awk, etc.) to compare the file names listed in the ignore file with the files in the current directory:
$ awk 'NR==FNR{f[$0];next}(!($0 in f))' file2 <(find . -type f -name '*.jpg' -printf '%f\n')
13.jpg
$ rm -iv "$(awk 'NR==FNR{f[$0];next}(!($0 in f))' file2 <(find . -type f -name '*.jpg' -printf '%f\n'))"
rm: remove regular empty file '13.jpg'? y
removed '13.jpg'
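Note: This also makes use of bash process substitution, and will break if filenames include newlines. Because the command substitution is quoted, rm receives its whole output as a single argument, so the rm command as written works only when exactly one file is left over.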
Another alternative to George Vasiliou's answer would be to read the file with the names of the files to keep using the Bash builtin mapfile and then check for each of the files to be deleted whether it is in that list.
#! /bin/bash -eu
mapfile -t keepthose <keepme.txt
declare -a deletethose
for f in "$@"
do
    keep=0
    for not in "${keepthose[@]}"
    do
        [ "${not}" = "${f}" ] && keep=1 || :
    done
    [ ${keep} -gt 0 ] || deletethose+=("${f}")
done
# Remove the 'echo' if you really want to delete files.
echo rm -f "${deletethose[@]}"
The -t option causes mapfile to trim the trailing newline character from the lines it reads from the file. No other white-space will be trimmed, though. This might be what you want if your file names actually contain white-space but it could also cause subtle surprises if somebody accidentally puts a space before or after the name of an important file they want to keep.
Note that I'm first building a list of the files that should be deleted and then delete them all at once rather than deleting each file individually. This saves some sub-process invocations.
The lookup in the list, as coded above, has linear complexity, which gives the overall script quadratic complexity (precisely, N × M, where N is the number of command-line arguments and M the number of entries in the keepme.txt file). If you only have a few dozen files, this should be fine. For larger sets, Bash 4 and later can use the file names as keys in an associative array, because the keys are arbitrary non-empty strings rather than identifiers; each membership check is then a single lookup. Older versions of Bash have no such facility. If you are concerned with performance for many files, using a more powerful language like Python might also be worth consideration.
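For what it is worth, Bash 4 and newer accept almost any non-empty string as an associative array key, so the membership test can be done without the inner loop. A minimal sketch under that assumption (the function name filter_deletable is invented for this example):

```shell
#!/usr/bin/env bash
# Requires Bash 4 or newer for associative arrays (local -A / declare -A).
# filter_deletable KEEPFILE FILE... (hypothetical helper): print each FILE
# whose name is not listed in KEEPFILE.
filter_deletable() {
    local keepfile=$1 name f
    shift
    local -A keep=()
    while IFS= read -r name; do
        # Empty lines are skipped because an empty key is not allowed.
        [[ -n $name ]] && keep["$name"]=1
    done < "$keepfile"
    for f in "$@"; do
        # ${keep[key]+set} expands to "set" only if the key exists.
        [[ -n $f && -n ${keep["$f"]+set} ]] || printf '%s\n' "$f"
    done
}
```

Called as filter_deletable keepme.txt *, it prints the candidates for deletion, one per line. Like the loop above, it compares plain strings only.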
I would also like to mention that the above example simply compares strings. It will not realize that important.txt and ./important.txt are the same file and hence delete the file. It would be more robust to convert the file name to a canonical path using readlink -f before comparing it.
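A canonical comparison could look roughly like this. It is a sketch that assumes GNU readlink, whose -f option resolves a path to its canonical absolute form; same_path is a hypothetical name.

```shell
#!/bin/sh
# same_path A B (hypothetical helper): succeed if A and B resolve to the
# same canonical path. Assumes GNU readlink with the -f option.
same_path() {
    [ "$(readlink -f -- "$1")" = "$(readlink -f -- "$2")" ]
}
```

With this, same_path important.txt ./important.txt succeeds, whereas a plain string comparison of the two names fails.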
Furthermore, your users might want to be able to put globbing patterns (like important.*) into the list of files to keep. If you want to handle those, extra logic would be required.
Overall, specifying which files not to delete seems a little dangerous, because any mistake in the list errs on the bad side: a file you wanted to keep gets deleted.
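Provided there are no spaces or special escaped chars in the file names, either of these (or variations of these) would work. Note the - argument to sort, which makes it read the piped names in addition to the list file, and the -x and -F options to grep, which match whole names literally rather than as regular expressions:
rm -v $(stat -c %n * | sort - excluded_file_list | uniq -u)
stat -c %n * | grep -vxFf excluded_file_list | xargs rm -v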

Removing last n characters from Unix Filename before the extension

I have a bunch of files in a Unix directory:
test_XXXXX.txt
best_YYY.txt
nest_ZZZZZZZZZ.txt
I need to rename these files as
test.txt
best.txt
nest.txt
I am using ksh on AIX. Please let me know how I can accomplish the above using a single command.
Thanks,
In this case, it seems you have an _ to start every section you want to remove. If that's the case, then this ought to work:
for f in *.txt
do
    g="${f%%_*}.txt"
    echo mv "${f}" "${g}"
done
Remove the echo if the output seems correct, or replace the last line with done | ksh.
If the files aren't all .txt files, this is a little more general:
for f in *
do
    ext="${f##*.}"
    g="${f%%_*}.${ext}"
    echo mv "${f}" "${g}"
done
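To see what the two expansions do in isolation, here is a small sketch that wraps them in a function. The name new_name is made up; the case guard is an addition that leaves names without an underscore before the extension untouched. It uses only POSIX parameter expansion, so it should behave the same in ksh and bash.

```shell
#!/bin/sh
# new_name FILE (hypothetical helper): print FILE with everything from the
# first "_" up to the extension removed.
new_name() {
    f=$1
    case $f in
        # ${f%%_*} drops the longest suffix starting at "_" (keeps "test");
        # ${f##*.} drops the longest prefix ending in "." (keeps "txt").
        *_*.*) printf '%s.%s\n' "${f%%_*}" "${f##*.}" ;;
        # No "_" followed by an extension: leave the name unchanged.
        *)     printf '%s\n' "$f" ;;
    esac
}
# Possible use inside the loop:
#   g=$(new_name "$f"); [ "$f" = "$g" ] || echo mv "$f" "$g"
```

For example, new_name test_XXXXX.txt prints test.txt.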
If this is a one time (or not very often) occasion, I would create a script with
$ ls > rename.sh
$ vi rename.sh
:%s/\(.*\)/mv \1 \1/
(edit manually to remove all the XXXXX from the second file names)
:x
$ source rename.sh
If this need occurs frequently, I would need more insight into what XXXXX, YYY, and ZZZZZZZZZZZ are.
Addendum
Modify this to your liking:
ls | sed "{s/\(.*\)\(............\)\.txt$/mv \1\2.txt \1.txt/}" | sh
It transforms filenames by omitting 12 characters before .txt and passing the resulting mv command to a shell.
Beware: If there are non-matching filenames, it executes the filename—and not a mv command. I omitted a way to select only matching filenames.
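One way to select only matching filenames is to let sed print nothing by default. In this sketch, -n suppresses the normal output and the p flag prints a line only when the substitution succeeded; the function name gen_renames is hypothetical, and the pattern is tightened slightly so that at least one character remains in the new name.

```shell
#!/bin/sh
# gen_renames (hypothetical helper): read file names on stdin and print a mv
# command only for names with more than 12 characters before ".txt".
# Non-matching names produce no output at all.
gen_renames() {
    sed -n 's/^\(..*\)\(............\)\.txt$/mv \1\2.txt \1.txt/p'
}
```

Inspect the output of ls | gen_renames first, then append | sh to execute it. Names containing spaces or shell metacharacters are still not handled safely.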

How to archive files under certain dir that are not text files in Mac OS?

Hey, guys, I am using the zip command, but I want to archive all the files except *.txt. For example, I have two dirs, file1 and file2, and both of them contain some *.txt files. I want to archive only the non-text files from file1 and file2.
tl;dr: How do I tell Linux to give me all the files that don't match *.txt?
$ zip -r zipfile -x'*.txt' folder1 folder2 ...
Move to your desired directory and run:
ls | grep -P '\.(?!txt$)' | zip -@ zipname
This will create a zipname.zip file containing everything but .txt files. In short, what it does is:
List all files in the directory, one per line (this can be achieved by using the -1 option, however it is not needed here as it's the default when output is not the terminal, it is a pipe in this case).
Extract from that all lines that do not end in .txt. Note it's grep using a Perl regular expression (option -P) so the negative lookahead can be used.
Zip the list from stdin (-@) into zipname file.
Update
The first method I posted fails with files with two ., like I described in the comments. For some reason though, I forgot about the -v option for grep which prints only what doesn't match the regex. Plus, go ahead and include a case insensitive option.
ls | grep -vi '\.txt$' | zip -@ zipname
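Since ls only lists the top level, files inside subdirectories need a recursive listing. A possible sketch using find, which supports -iname in both the GNU and BSD versions (the function name list_non_txt is invented here):

```shell
#!/bin/sh
# list_non_txt DIR... (hypothetical helper): print every regular file under
# the given directories whose name does not end in .txt (case-insensitive),
# one path per line.
list_non_txt() {
    find "$@" -type f ! -iname '*.txt'
}
```

The output can be fed to zip in the same way, for example list_non_txt file1 file2 | zip -@ zipname. File names containing newlines would break the one-name-per-line format.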
Simple, use bash's Extended Glob option like so:
#!/bin/bash
shopt -s extglob
zip -some -options !(*.txt)
Edit
This isn't as good as the -x builtin option to zip but my solution is generic across any command that may not have this nice feature.

bash script to delete old deployments

I have a directory where our deployments go. A deployment (which is itself a directory) is named in the format:
<application-name>_<date>
e.g. trader-gui_20091102
There are multiple applications deployed to this same parent directory, so the contents of the parent directory might look something like this:
trader-gui_20091106
trader-gui_20091102
trader-gui_20091010
simulator_20091106
simulator_20091102
simulator_20090910
simulator_20090820
I want to write a bash script to clean out all deployments except for the most current of each application. (The most current denoted by the date in the name of the deployment). So running the bash script on the above parent directory would leave:
trader-gui_20091106
simulator_20091106
Any help would be appreciated.
A quick one-liner:
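ls | sed 's/_[0-9]\{8\}$//' | uniq |
while read name; do
    # -d lists the directories themselves rather than their contents;
    # rm needs -r because each deployment is a directory.
    rm -r $(ls -dr ${name}_[0-9]* | tail -n +2)
done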
List the files, chop off an underscore followed by eight digits, only keep unique names. For each name, remove everything but the most recent.
Assumptions:
the most recent will be last when sorted alphabetically. If that's not the case, add a sort that does what you want in the pipeline before tail -n +2
no other files in this directory. If there are, limit the output of the ls, or pipe it through a grep to select only what you want.
no weird characters in the filenames. If there are... instead of directly using the output of the inner ls pipeline, you'd probably want to pipe it into another while loop so you can quote the individual lines, or else capture it in an array so you can use the quoted expansion.
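A variant that avoids parsing ls output altogether could look like the sketch below. It only prints the stale deployments so they can be reviewed before anything is removed. It assumes Bash 4 or newer and the <application-name>_<date> layout with an 8-digit date from the question; stale_deployments is a hypothetical name.

```shell
#!/usr/bin/env bash
# Requires Bash 4 or newer for associative arrays.
# stale_deployments PARENT (hypothetical helper): print every
# <application-name>_<date> directory under PARENT except the newest one
# of each application.
stale_deployments() {
    local parent=$1 dir app
    local -A newest=()
    for dir in "$parent"/*_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/; do
        [ -d "$dir" ] || continue            # glob matched nothing
        dir=${dir%/}
        app=${dir%_*}                        # strip the trailing _<date>
        if [[ -z ${newest["$app"]+set} ]]; then
            newest["$app"]=$dir
        elif [[ ${dir##*_} > ${newest["$app"]##*_} ]]; then
            # dir is newer: the previously remembered one is stale.
            printf '%s\n' "${newest["$app"]}"
            newest["$app"]=$dir
        else
            printf '%s\n' "$dir"
        fi
    done
}
```

Once the printed list looks right, each line can be passed to rm -r, for example with a while IFS= read -r loop.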
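shopt -s extglob
keep=$(ls | awk -F"_" '{a[$1]=$NF}END{for(i in a)print i"_"a[i]}' | paste -sd'|' -)
rm -r !($keep)
Since the date in the filename is already sortable, the awk command finds the latest deployment of each application, and paste joins those names with |. rm -r !($keep) then removes every entry that is not one of the latest (-r is needed because deployments are directories). Running rm !($F) separately for each name would not work: each pass would also delete the latest deployment of every other application.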
You could try find:
# Example: Find and delete all directories in /tmp/ older than 7 days:
find /tmp/ -type d -mtime +7 -exec rm -rf {} \; &>/dev/null
