How do I write a bash script to copy files into a new folder based on name? - bash

I have a folder filled with ~300 files. They are named in this form username#mail.com.pdf. I need about 40 of them, and I have a list of usernames (saved in a file called names.txt). Each username is one line in the file. I need about 40 of these files, and would like to copy over the files I need into a new folder that has only the ones I need.
Where the file names.txt has as its first line the username only:
(eg, eternalmothra), the PDF file I want to copy over is named eternalmothra#mail.com.pdf.
while read p; do
ls | grep $p > file_names.txt
done <names.txt
This seems like it should read from the list, and for each line turns username into username#mail.com.pdf. Unfortunately, it seems like only the last one is saved to file_names.txt.
The second part of this is to copy all the files over:
while read p; do
mv $p foldername
done <file_names.txt
(I haven't tried that second part yet because the first part isn't working).
I'm doing all this with Cygwin, by the way.
1) What is wrong with the first script that it won't copy everything over?
2) If I get that to work, will the second script correctly copy them over? (Actually, I think it's preferable if they just get copied, not moved over).
Edit:
I would like to add that I figured out how to read lines from a txt file from here: Looping through content of a file in bash

Solution from comment: Your problem is just, that echo a > b is overwriting file, while echo a >> b is appending to file, so replace
ls | grep $p > file_names.txt
with
ls | grep $p >> file_names.txt
There might be more efficient solutions if the task runs everyday, but for a one-shot of 300 files your script is good.

Assuming you don't have file names with newlines in them (in which case your original approach would not have a chance of working anyway), try this.
printf '%s\n' * | grep -f names.txt | xargs cp -t foldername
The printf is necessary to work around the various issues with ls; passing the list of all the file names to grep in one go produces a list of all the matches, one per line; and passing that to xargs cp performs the copying. (To move instead of copy, use mv instead of cp, obviously; both support the -t option so as to make it convenient to run them under xargs.) The function of xargs is to convert standard input into arguments to the program you run as the argument to xargs.

Related

bash change absolute path in file line by line for script creation

I'm trying to create a bash script based on a input file (list.txt). The input File contains a list of files with absolute path. The output should be a bash script (move.sh) which moves the files to another location, preserve the folder structure, but changing the target folder name slightly before.
the Input list.txt File example looks like this :
/In/Folder_1/SomeFoldername1/somefilename_x.mp3
/In/Folder_2/SomeFoldername2/somefilename_y.mp3
/In/Folder_3/SomeFoldername3/somefilename_z.mp3
The output file (move.sh) should looks like this after creation :
mv "/In/Folder_1/SomeFoldername1/somefilename_x.mp3" /gain/Folder_1/
mv "/In/Folder_2/SomeFoldername2/somefilename_y.mp3" /gain/Folder_2/
mv "/In/Folder_3/SomeFoldername3/somefilename_z.mp3" /gain/Folder_3/
The folder structure should be preserved, more or less.
after executing the created bash script (move.sh), the result should looks like this :
/gain/Folder_1/somefilename_x.mp3
/gain/Folder_2/somefilename_y.mp3
/gain/Folder_3/somefilename_z.mp3
What I've done so far.
1. create a list of files with absolute path
find /In/ -iname "*.mp3" -type f > /home/maars/mp3/list.txt
2. create the move.sh script
cp -a /home/maars/mp3/list.txt /home/maars/mp3/move.sh
# read the list and split the absolute path into fields
while IFS= read -r line;do
fields=($(printf "%s" "$line"|cut -d'/' --output-delimiter=' ' -f1-))
done < /home/maars/mp3/move.sh
# add the target path based on variables at the end of the line
sed -i -E "s|\.mp3|\.mp3"\"" /gain/"${fields[1]}"/|g" /home/maars/mp3/move.sh
sed -i "s|/In/|mv "\""/In/|g" /home/maars/mp3/move.sh
The script just use the value of ${fields[1]}, which is Folder_1 and put this in all lines at the end. Instead of Folder_2 and Folder_3.
The current result looks like
mv "/In/Folder_1/SomeFoldername1/somefilename_x.mp3" /gain/Folder_1/
mv "/In/Folder_2/SomeFoldername2/somefilename_y.mp3" /gain/Folder_1/
mv "/In/Folder_3/SomeFoldername3/somefilename_z.mp3" /gain/Folder_1/
rsync is not an option since I need the full control of files to be moved.
What could I do better to solve this issue ?
EDIT : #Socowi helped me a lot by pointing me in the right direction. After I did a deep dive into the World of Regex, I could solve my Issues. Thank you very much
The script just use the value of ${fields[1]}, which is Folder_1 and put this in all lines at the end. Instead of Folder_2 and Folder_3.
You iterate over all lines and update fields for every line. After you finished the loop, fields retains its value (from the last line). You would have to move the sed commands into your loop and make sure that only the current line is replaced by sed. However, there's a better way – see down below.
What could I do better
There are a lot of things you could improve, for instance
Creating the array fields with mapfile -d/ fields instead of printf+cut+($()). That way, you also wouldn't have problems with spaces in paths.
Use sed only once instead of creating the array fields and using multiple sed commands. You can replace step 2 with this small script:
cp -a /home/maars/mp3/list.txt /home/maars/mp3/move.sh
sed -i -E 's|^/[^/]*/([^/]*).*$|mv "&" "/gain/\1"|' /home/maars/mp3/move.sh
However, the best optimization would be to drop that three step approach and use only one script to find and move the files:
find /In/ -iname "*.mp3" -type f -exec rename -n 's|^/.*?/(.*?)/.*/(.*)$|/gain/$1/$2|' {} +
The -n option will print what will be renamed without actually renaming anything . Remove the -n when you are happy with the result. Here is the output:
rename(/In/Folder_1/SomeFoldername1/somefilename_x.mp3, /gain/Folder_1/somefilename_x.mp3)
rename(/In/Folder_2/SomeFoldername2/somefilename_y.mp3, /gain/Folder_2/somefilename_y.mp3)
rename(/In/Folder_3/SomeFoldername3/somefilename_z.mp3, /gain/Folder_3/somefilename_z.mp3)
It's not builtin to bash, but the mmv command is nice for this kind of mv where you need to use wildcards in paths. Something like the following should work:
mmv "in/*/*/*" "#1/#3"
Note that this won't create the directories for you - but in your example above it looks like these already exist?

Script to copy directory of filenames to .txt file

This should be fairly easy and I understand the logic of it but my shell scripting is rather beginner.
Basically, I have a directory with a hundred files or so, and I want to copy their filenames to a .txt file. One line per filename. I know I'd want a loop for all the files in the directory, copy name to text file, repeat until there are no more files but not sure how to write that out in a .sh file.
(Also, just out of pure curiosity, how would I omit the file extensions? In this case, they're all the same extension but potentially in the future they may not be, and while I need the extensions right now I may not in the future. I'm assuming there might be a flag for this or would I use '.' as a delimiter to stop copying at that point?)
Thanks in advance!
It could be very easy with ls:
ls -1 [directory] > filename.txt
Note the flag -1, it tells ls to output filenames one per line regardless what the output is. Usually ls acts like ls -C if the stdout is a tty, and acts like ls -1 otherwise. Explicitly specifying this flag forces ls to output one per line.
If you want to do it manually, this is an example:
#!/bin/sh
cd [directory]
for i in *
do
echo "$i"
done > filename.txt
To omit extensions, you can use string replacement:
echo "${i%.*}"
For the first part, you can do
ls <dirname> > files.txt
I alias ls to ls -F, so to avoid any extraneous characters in the output, you would do
printf "%s\n" * > ../filename.txt
I put the output txt file in a different directory so the list of files does not include "filename.txt"
If you want to omit file extensions:
printf "%s\n" * | sed 's/\.[^.]*$//' > ../filename.txt

Search text and append to each end of line of text file - OSX

I'm new to OSX command line tools.
I am trying to find a block of text in a file and append this text at the end of all lines in another text file. At run time I don't know what this text will be, I just know it will be located within "BEGINHMM" and "ENDHMM". Also, I don't know the makeup of the destination file, except for that it will not be an empty text file.
The command which finds the block of text of interest is:
sed -n '/<BEGINHMM>/,/<ENDHMM>/p' proto
where "proto" is a text file containing the text of interest.
I've been trying to pipe the output of the above command to another 'sed' command, in the following manner:
xargs -I '{}' sed -i .bak 's/$/{}/' monophones0.txt
but I am getting some bizarre results, I see the "{}" inserted in the text for example.
I've also tried piping to:
xargs -0 sed -i .bak 's/$/&/' monophones0.txt
but I just get the printout (similar to terminal echo) of the text I am trying to grab.
Ultimately I want to loop over several 'proto' files in multiple directories and copy the text between the "BEGINHMM", "ENDHMM" block in each directory, and append the selected text to that directory's monophones.txt lines.
I am running the commands in the terminal, bash, OSX 10.12.2
Any help would be appreciated.
(1) Your sed command is of the form sed -n '/A/,/B/p'; this will include the lines on which A and B occur, even if these strings do not appear at the beginning of the line. This form may have other surprises in store for you as well (what do expect will happen if B is missing or repeated?), but the remainder of this post assumes that's what you want.
(2) It's not clear how you intend to specify the "proto" files, but you do indicate they might be in several directories, so for the remainder of this post, I'll assume they are listed, one per line, in a file named proto.txt in each directory. This will ensure that you don't run into any limitations on command-line length, but the following can easily be modified if you don't want to create such a file.
(3) Here is a script which will use the sed command you've mentioned to copy segments from each of the "proto" files specified in a directory to monophones0.txt in the directory in which the script is executed.
#!/bin/bash
OUT=monophones0.txt
cat proto.txt | while read file
do
if [ -r "$file" ] ; then
sed -n '/<BEGINHMM>/,/<ENDHMM>/p' "$file" >> $OUT
elif [ -n "$file" ] ; then
echo "NOT FOUND: $file" >&2
fi
done
Just like what you did before. tmpfile=$(mktemp); sed -n '/<BEGINHMM>/,/<ENDHMM>/p' proto >$tmpfile; sed -i .bak "r $tmpfile" monophones0.txt; rm $tmpfile. This is the basic idea; there are other checks you need to perform to make this a robust script.
– 4ae1e1

Shell - saving contents of file to variable then outputting the variable

First off, I'm really bad at shell, as you'll notice :)
Now then, I have the following task: The script gets two arguments (fileName, N). If the number of lines in the file is greater then N, then I need to cut the last N lines, then overwrite the contents of the file with it.
I thought of saving the contents of the file into a variable, then just cat-ing that to the file. However for some reason it's not working.
I have problems with saving the last N lines to a variable.
This is how I tried doing it:
lastNLines=`tail -$2 $1`
cat $lastNLines > $1
Your lastNLines is not a filename. cat takes filenames. You also cannot open the input file for writing, because the shell truncates it before tail can get to it, which is why you need to use a temporary file.
However, if you insist on not using a temporary file, here's a non-portable solution:
tail -n$2 $1 | sponge $1
You may need to install moreutils for sponge.
The arguments cat takes are file names, not the content.
Instead, you can use a temp file, like this:
tail -$2 $1 > $1._tmp
mv $1._tmp $1
To save the content to a variable, you can do what you already included in your question, or:
lastNLines=`cat $1`
(after the mv command, of course)

Create files using grep and wildcards with input file

This should be a no-brainer, but apparently I have no brain today.
I have 50 20-gig logs that contain entries from multiple apps, one of which addes a transaction ID to its log lines. I have 42 transaction IDs I need to review, and I'd like to parse out the appropriate lines into separate files.
To do a single file, the command would be simply,
grep CDBBDEADBEEF2020X02393 server.log* > CDBBDEADBEEF2020X02393.log
that creates a log isolated to that transaction, from all 50 server.logs.
Now, I have a file with 42 txnIDs (shortening to 4 here):
CDBBDEADBEEF2020X02393
CDBBDEADBEEF6548X02302
CDBBDE15644F2020X02354
ABBDEADBEEF21014777811
And I wrote:
#/bin/sh
grep $1 server.\* > $1.log
But that is not working. Changing the shebang to #/bin/bash -xv, gives me this weird output (obviously I'm playing with what the correct escape magic must be):
$ ./xtrakt.sh B7F6E465E006B1F1A
#!/bin/bash -xv
grep - ./server\.\*
' grep - './server.*
: No such file or directory
I have also tried the command line
grep - server.* < txids.txt > $1
But OBVIOUSLY that $1 is pointless and I have no idea how to get a file named per txid using the input redirect form of the command.
Thanks in advance for any ideas. I haven't gone the route of doing a foreach in the shell script, because I want grep to put the original filename in the output lines so I can examine context later if I need to.
Also - it would be great to have the server.* files ordered numerically (server.log.1, server.log.2 NOT server.log.1, server.log.10...)
try this:
while read -r txid
do
grep "$txid" server.* > "$txid.log"
done < txids.txt
and for the file ordering - rename files with one digit to two digit, with leading zeroes, e.g. mv server.log.1 server.log.01.

Resources