Select files reading names from a text file - bash

I have a directory of files. I want to select some of them based on a text file I have. The text file has partial names of the files which I need to select.
I tried using find
ls | while read -r line; do find -type f -name $line; done < ../src_pdm3012/good_G20P1.txt
I also tried using grep, but that does not seem to work either.

Since names are partial, you must find files with a name that contains the line you read, not a name that equals it:
while read -r line; do find -type f -name "*$line*"; done < ../src_pdm3012/good_G20P1.txt
Since you wanted to use grep, here is what you were looking for:
ls -a | grep -F -f ../src_pdm3012/good_G20P1.txt
... but keep in mind it is not good practice to parse the output of ls.
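If you want to avoid ls entirely, a minimal alternative (assuming none of the file names contain newlines; unlike ls -a, this also skips hidden files) is to let the shell expand the names and feed them to the same grep:
# print one name per line, then keep those matching a partial name from the list
printf '%s\n' * | grep -F -f ../src_pdm3012/good_G20P1.txt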

Related

Change string using sed with specific condition

I want to change paths in files using sed.
Now I run the following:
find . -type f | xargs sed -r -i "s/home\/some_dir/home\/another_dir/g"
But I want this to be applied only to those paths which correspond to actual files in my file system.
For instance: if I don't have the file home/some_dir/lol, then the corresponding string in any file will be ignored.
UPD (explanation):
Let's imagine I have following file structure:
-home
--some_dir
---file1
--another_dir
--dir_with_configs
---config.txt
And I am in the /home/dir_with_configs directory.
Let config.txt be like:
/home/some_dir/file1
/home/some_dir/lol
After running
find . -type f | xargs sed -r -i "s/home\/some_dir/home\/another_dir/g"
I will have config.txt like:
/home/another_dir/file1
/home/another_dir/lol
But I don't have file /home/another_dir/lol. So I somehow want to add check that file with given path exists and have config.txt like:
/home/another_dir/file1
/home/some_dir/lol
Presumably you want to do the following on each file that you find.
Test that each line is an existing file
Test that each existing file also exists on an alternate path
If both are true, replace the line with the alternate path
sed can't do most of this. You need a different tool. Something like perl or python would probably be the most efficient choice. Either way, your program will have to actually read each line of the input file and test if that line represents a real file on your system before doing anything to it.
Here's a bash example that passes the files discovered by find into a small script that reads the lines from those files, tests them, and makes any necessary substitutions to the output.
find . -type f -exec /bin/bash -c '
for file ; do
    while read -r line ; do
        if [[ -f "${line}" && -f "${line/some_dir/another_dir}" ]] ; then
            printf "%s\n" "${line/some_dir/another_dir}"
        else
            printf "%s\n" "${line}"
        fi
    done <"${file}" >"${file}.new" && mv "${file}.new" "${file}"
done
' _ {} +
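Since perl was mentioned above, here is the same logic as a one-liner; treat it as a rough sketch rather than a drop-in replacement:
find . -type f -exec perl -i -ne '
    chomp(my $line = $_);                          # strip the newline before the file tests
    (my $alt = $line) =~ s/some_dir/another_dir/;  # build the alternate path
    print((-f $line && -f $alt) ? "$alt\n" : "$line\n");
' {} +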

Using find and exec inside while loop

I've a .txt file which contains names of various files.
When I simply use a while loop it works fine,
while read -r name
do
echo "$name"
done <fileNames.txt
But,
when I try to use find inside the loop, like this:
while read -r name
do
find ./ -iname "$name" -exec sed -i '1s/^/NEW LINE INSERTED \n/' '{}' ';'
done < fileNames.txt
nothing happens!
If I use find outside the loop with a specific file name, it does what it's supposed to do; I can also use it on all files of a specific file type. But it doesn't work inside the loop.
What am I doing wrong over here?
I'm trying to read file names from a file, search for it inside a folder recursively and then append a line in the beginning using sed.
Use xargs instead to capture the results of find:
while read -r name
do
find ./ -iname "$name" |xargs sed -i '1s/^/NEW LINE INSERTED \n/'
done <fileNames.txt
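One caveat, assuming GNU find and xargs: plain xargs splits its input on whitespace, so file names containing spaces will break the call above. A null-delimited variant avoids the problem:
while read -r name
do
    # -print0/-0 keep odd file names intact; -r skips the sed call when nothing matches
    find ./ -iname "$name" -print0 | xargs -0 -r sed -i '1s/^/NEW LINE INSERTED \n/'
done <fileNames.txt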

Is there a way to import an argument list to find before running -exec?

I'm writing a script that performs a grep on .php files, but I also want it to perform the same grep on any files that are included by the .php files.
I'm using the PHP function
get_included_files();
to generate a list of included files (this list is then saved to a file) and I want my find to execute a grep on both the .php files found by the find, and all the files listed in my file list.
I've tried the following:
find -iname \*.php -exec grep 'foo' $(cat list.txt) {} +
find -iname \*.php | xargs -I {} grep 'foo' $(cat list.txt) {}
In both cases, I get either:
/usr/bin/find: Argument list too long
/usr/bin/xargs: Argument list too long
Any help would be appreciated
I think you may be able to avoid your problem by first constructing the joined list, then using xargs to provide it to grep:
find -iname \*.php | cat list.txt - | xargs grep 'foo'
Note that the - in cat's arguments refers to stdin in this context, that is, the list of files returned by find.
You could also insert a sort -u in there if you want to avoid duplicates.
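Spelled out, that deduplicating variant would be:
find -iname \*.php | cat list.txt - | sort -u | xargs grep 'foo'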
My supposition about your error is that the content of list.txt by itself already exceeds the argument-length limit; the $(cat list.txt) expansion happens in the shell before find or xargs is ever started, so it is the exec of find/xargs itself that fails with "Argument list too long", regardless of the other arguments.
In your commands, all file names in list.txt are first passed to find/xargs as positional arguments, and in your case the file contains more names than the OS argument-length limit (ARG_MAX) allows.
So just run grep for each name in list.txt:
#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
grep 'foo' "$line"
done < list.txt
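Running one grep per file is slow for long lists, though. If GNU xargs is available (an assumption), it can split the list into batches that fit under the ARG_MAX limit:
# -d '\n' treats each line as one argument; -r runs nothing on empty input
xargs -d '\n' -r grep 'foo' < list.txt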

Copying list of files to a directory

I want to search for all .fits files that contain a certain text in their name and then copy them to a directory.
I can use a command called fetchKeys to list the files that contain, say, 'foo'.
The command looks like this: fetchKeys -t 'foo' -F | grep .fits
This returns a list of .fits files that contain 'foo'. Great! Now I want to copy all of these to a directory /path/to/dir. There are too many files to do it individually; I need to copy them all with one command.
I'm thinking something like:
fetchKeys -t 'foo' -F | grep .fits > /path/to/dir
or
cp fetchKeys -t 'foo' -F | grep .fits /path/to/dir
but of course neither of these works. Any other ideas?
If this is on Linux/Unix, can you use the find command? That seems very much like fetchKeys.
$ find . -name "*foo*.fits" -type f -print0 | while read -r -d $'\0' file
do
    basename=$(basename "$file")
    cp "$file" "$fits_dir/$basename"
done
The find command will find all files that match *foo*.fits in their name. The -type f says they have to be files and not directories. The -print0 means print out the files found, but separate them with the NUL character. Normally, the find command will simply return a file on each line, but what if the file name contains spaces, tabs, new lines, or even other strange characters?
The -print0 will separate out files with nulls (\0), and the read -d $'\0' file means to read in each file separating by these null characters. If your files don't contain whitespace or strange characters, you could do this:
$ find . -name "*foo*.fits" -type f | while read -r file
do
    basename=$(basename "$file")
    cp "$file" "$fits_dir/$basename"
done
Basically, you read each file found by your find command into the shell variable file. Then, you can use that to copy the file into your $fits_dir or wherever you want.
Again, maybe there's a reason to use fetchKeys, and it may be possible to replace that find with fetchKeys, but I don't know the fetchKeys command.
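As a side note, the basename step is optional: cp "$file" "$fits_dir/" copies under the same base name whenever the target is a directory.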
Copy all files with the name containing foo to a certain directory:
find . -name "*foo*.fits" -type f -exec cp {} "/path/to/dir/" \;
Copy all files themselves containing foo to a certain directory (solution without xargs):
for f in `find . -type f -exec grep -l foo {} \;`; do cp "$f" /path/to/dir/; done
The find command has very useful arguments -exec, -print, -delete. They are very robust and eliminate the need to manually process the file names. The syntax for -exec is: -exec (what to do) \;. The name of the file currently processed will be substituted instead of the placeholder {}.
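For completeness, -exec can also batch many files per command with the + terminator; combined with GNU cp's -t option (assuming GNU coreutils), the whole copy needs only a handful of processes:
# -t names the target directory first, so find can append many files at once
find . -name "*foo*.fits" -type f -exec cp -t /path/to/dir/ {} +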
Other commands that are very useful for such tasks are sed and awk.
The xargs tool can execute a command for every batch of lines it gets from stdin. This time, we execute a cp command:
fetchKeys -t 'foo' -F | grep .fits | xargs -n 500 cp -vfa -t /path/to/dir
xargs is a very useful tool, although its parametrization is not really trivial. This command reads 500 file names at a time and calls a single cp for each group; GNU cp's -t option names the target directory first, which is what makes the batching possible. I haven't tested it in depth; if it doesn't work, leave a comment.

How to create a backup of files' lines containing "foo"

Basically I have a directory and sub-directories that need to be scanned to find .csv files. From there I want to copy all lines containing "foo" from the csv's found to new files (in the same directory as the original) but with the name reflecting the file it was found in.
So far I have
find -type f -name "*.csv" | xargs egrep -i "foo" > foo.csv
which yields one backup file (foo.csv) with everything in it, and the file each line was found in becomes part of the data, neither of which is what I want.
What I want:
For example if I have:
csv1.csv
csv2.csv
and they both have lines containing "foo", I would like those lines copied to:
csv1_foo.csv
csv2_foo.csv
and I don't want anything extra in the backups, other than the full line containing "foo" from the original file. I.e., I don't want the original file name in the backup data, which is what my current code produces.
Also, I suppose I should note that I'm using egrep, but my example doesn't use regex. I will be using regex in my search when I apply it to my specific scenario, so this probably needs to be taken into account when naming the new file. If that seems too difficult, an answer that doesn't account for regex would be fine.
Thanks ahead of time!
Try this and see if it helps:
find -type f -name "*.csv" | xargs -I {} sh -c 'filen=$(echo "{}" | sed -e "s/\.csv$//" -e "s/^\.\///") && egrep -i "foo" "{}" > "${filen}_foo.log"'
You can try this:
$ find . -type f -exec grep -H foo '{}' \; | perl -ne '`echo $2 >> $1_foo` if /(.*):(.*)/'
It uses:
find to iterate over files
grep to print file path:line tuples (-H switch)
perl to echo those lines to the output files (using backticks, but it could be done prettier).
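A slightly prettier version of the same idea, appending from within perl instead of shelling out (a sketch, assuming no file path contains a colon):
find . -type f -name "*.csv" -exec grep -H foo '{}' \; |
perl -ne 'if (/^(.*?):(.*)$/) { open my $fh, ">>", $1 . "_foo" or die $!; print $fh "$2\n" }'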
You can also try:
find -type f -name "*.csv" -a ! -name "*_foo.csv" | while IFS= read -r f; do
    grep foo "$f" > "${f%.csv}_foo.csv"
done
