Using cat and grep commands in Bash - bash

I'm having trouble with trying to achieve this bash command:
Concatenate all the text files in the current directory that have at least one occurrence of the word BOB (in any case) within the text of the file.
Is it correct to use the cat command and then use grep to find the occurrences of the word BOB?
cat grep -i [BOB] *.txt > catFile.txt

To handle filenames with whitespace characters correctly:
grep --null -l -i "BOB" *.txt | xargs -0 cat > catFile.txt
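As a rough sketch of how that pipeline behaves, the following builds a scratch directory and runs it there. The directory and the file names (with space.txt, upper.txt, plain.txt, combined.out) are made up for illustration. The output name deliberately does not end in .txt, on the assumption that you don't want the result file to match the input glob.

```shell
# Scratch directory and file names below are hypothetical, for illustration only.
demo=$(mktemp -d)
printf 'Bob was here\n' > "$demo/with space.txt"
printf 'BOB again\n'    > "$demo/upper.txt"
printf 'alice only\n'   > "$demo/plain.txt"

# grep -l lists matching files, --null separates the names with NUL bytes,
# and xargs -0 splits on those NULs, so "with space.txt" stays one argument.
(cd "$demo" && grep --null -l -i 'BOB' -- *.txt | xargs -0 cat > combined.out)

result=$(cat "$demo/combined.out")
printf '%s\n' "$result"
```

Only the two files that mention BOB (in any case) end up in combined.out; plain.txt is skipped.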

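Your issue was the need to pass grep's file names to cat via command substitution:
cat $(grep -l -i "BOB" *.txt) > catFile.txt
$(...) runs the enclosed command and substitutes its output into the command line. The result is split on whitespace, so this form breaks on file names containing spaces (and --null does not help here, because the shell drops NUL bytes inside a command substitution).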
-l returns only filenames of the things that matched

You could use find with -exec:
find -maxdepth 1 -name '*.txt' -exec grep -qi 'bob' {} \; \
-exec cat {} + > catFile.txt
-maxdepth 1 makes sure you don't search any deeper than the current directory
-name '*.txt' says to look at all files ending with .txt – for the case that there is also a directory ending in .txt, you could add -type f to only look at files
-exec grep -qi 'bob' {} \; runs grep for each .txt file found. If bob is in the file, the exit status is zero and the next directive is executed. -q makes sure the grep is silent.
-exec cat {} + runs cat on all the files that contain bob
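To see the "-exec as a filter" idea in isolation, here is a small sketch that prints the matching names instead of concatenating them. The scratch directory and the names hit.txt, miss.txt and dir.txt are assumptions made up for the example, and -maxdepth is assumed to be available (it is in GNU and BSD find).

```shell
# Hypothetical scratch data: one matching file, one non-matching file,
# and a directory whose name also ends in .txt.
demo=$(mktemp -d)
printf 'Bob\n'  > "$demo/hit.txt"
printf 'none\n' > "$demo/miss.txt"
mkdir "$demo/dir.txt"

# -exec grep -qi ... \; acts as a test: -print only runs when grep exits 0.
# -type f keeps the directory dir.txt out of the results.
matched=$(find "$demo" -maxdepth 1 -type f -name '*.txt' \
            -exec grep -qi 'bob' {} \; -print | sed 's|.*/||')
printf '%s\n' "$matched"
```

Replacing -print with -exec cat {} + gives back the concatenation from the answer above.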

You need to remove the square brackets...
grep -il "BOB" *

You can also use the following command that you must run from the directory containing your BOB files.
grep -il BOB *.in | xargs cat > BOB_concat.out
-i is an option used to set grep in case insensitive mode
-l will be used to output only the filename containing the pattern provided as argument to grep
*.in is used to find all the input files in the dir (should be adapted to your folder content)
then you pipe the result of the command to xargs in order to build the arguments that cat will use to produce your file concatenation.
Assumption:
Your folder contains only files without unusual characters in their names (e.g. spaces).

Related

How to find many files from txt file in directory and subdirectories, then copy all to new folder

I can't find posts that help with this exact problem:
On Mac Terminal I want to read a txt file (example.txt) containing file names such as:
20130815 144129 865 000000 0172 0780.bmp
20130815 144221 511 000003 1068 0408.bmp
....100 more
And I want to search for them in a certain folder/subfolders (example_folder). After each find, the file should be copied to a new folder x (new_destination).
Your help would be much appreciated!
Cheers,
Mo
You could use a piped command with a combination of ls, grep, xargs and cp.
So basically you start with getting the list of files
ls
then you filter them with egrep -e, grep -e or whatever flavor of grep your Mac's terminal provides. If you want to find all files ending in .txt, you can use the regex \.txt$ (which means "ends with '.txt'").
ls | egrep -e "yourRegexExpression"
After that you get an input stream, but cp doesn't work with input streams and only takes a bunch of arguments, that's why we use xargs to convert it to arguments. The final step is to add the flag -t to the argument to signify that the next argument is the target directory.
ls | egrep -e "yourRegexExpression" | xargs cp -t DIRECTORY
I hope this helps!
Edit
Sorry, I didn't read the question well enough; I have updated the answer to match your problem. Here the egrep command builds a rather large regex from all the file names, of the form (filename1|filename2|...|fileN). The $() evaluates the command inside it, which uses tr to translate newlines to "|" for the regex.
ls | egrep -e "("$(cat yourtextfile.txt | tr "\n" "|")")" | xargs cp -t DIRECTORY
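A possible variation, sketched here under a few assumptions: instead of building one big regex, grep can read the names straight from the list file with -f, treat them as fixed strings with -F, and require whole-line matches with -x. That sidesteps the dots in the names being read as regex wildcards. The loop uses plain cp rather than cp -t, since -t is a GNU option and may not exist on a Mac. The directory names src and dest and the file list.txt are invented for the example, and file names are assumed not to contain newlines.

```shell
# Hypothetical scratch layout: a source folder, a destination, and a name list.
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dest"
: > "$demo/src/20130815 144129 865 000000 0172 0780.bmp"
: > "$demo/src/unlisted.bmp"
printf '%s\n' '20130815 144129 865 000000 0172 0780.bmp' 'missing.bmp' > "$demo/list.txt"

# -F: fixed strings, -x: the whole line must match, -f: read patterns from a file.
# Reading line by line keeps names with spaces in one piece.
ls "$demo/src" | grep -Fx -f "$demo/list.txt" |
while IFS= read -r name; do
  cp -- "$demo/src/$name" "$demo/dest/"
done

copied=$(ls "$demo/dest")
printf '%s\n' "$copied"
```

This only looks at one directory, like the ls-based answer; searching subfolders needs find.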
You could do something like:
while IFS= read -r i; do
find /search/path -type f -name "$i" -exec cp "{}" /new/path \;
done < example.txt
This is how it works. For every line within example.txt (read whole, so names containing spaces stay intact):
while IFS= read -r i; do ... done < example.txt
it will try to find a file matching the line $i in the defined path:
find /search/path -type f -name "$i"
And if found, it will copy it to the desired location:
-exec cp "{}" /new/path \;

How to remove files using grep and rm?

grep -n magenta *| rm *
grep: a.txt: No such file or directory
grep: b: No such file or directory
The above command removes all files present in the directory (everything except . and ..).
It should remove only those files that contain the word "magenta".
Also, tried grep magenta * -exec rm '{}' \; but no luck.
Any idea?
Use xargs:
grep -l --null magenta ./* | xargs -0 rm
The purpose of xargs is to take input on stdin and place it on the command line of its argument.
What the options do:
The -l option tells grep not to print the matching text and instead just print the names of the files that contain matching text.
The --null option tells grep to separate the filenames with NUL characters. This allows all manner of filenames to be handled safely.
The -0 option tells xargs to treat its input as NUL-separated.
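A quick way to convince yourself before pointing this at real data is to try it in a throwaway directory. The sketch below does that; the file names are invented, and one of them starts with a dash and contains a space to exercise the awkward cases.

```shell
# Hypothetical scratch data: one file that mentions magenta, one that does not.
demo=$(mktemp -d)
printf 'magenta\n' > "$demo/-dash name.txt"
printf 'cyan\n'    > "$demo/keep.txt"

# ./* makes every name start with ./, --null and -0 keep each name whole,
# and -- tells rm that everything after it is a file name, not an option.
(cd "$demo" && grep -l --null magenta ./* | xargs -0 rm -f --)

remaining=$(ls "$demo")
printf '%s\n' "$remaining"
```

Only keep.txt should be left afterwards.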
Here is a safe way:
grep -lr --null magenta . | xargs -0 rm -f --
-l prints file names of files matching the search pattern.
-r performs a recursive search for the pattern magenta in the given directory (.). If this doesn't work, try -R.
--null outputs a NUL byte after each file name instead of the usual newline, so that xargs -0 sees the list as multiple names instead of one.
xargs -0 feeds the NUL-separated file names from grep to rm -f.
-- is often forgotten, but it is very important: it marks the end of options and allows for removal of files whose names begin with -.
If you would like to see which files are about to be deleted, simply remove the --null option and the | xargs -0 rm -f -- part.

From UNIX shell, how to find all files containing a specific string, then print the 4th line of each file?

I want to find all files within the current directory that contain a given string, then print just the 4th line of each file.
grep --null -l "$yourstring" * | # List all the files containing your string
xargs -0 -n 1 sed -n '4{p;q;}' # Run sed once per file and print its fourth line.
Different editions of grep have slightly different incantations of --null, but it's usually there in some form. Read your manpage for details.
Update: I believe one of the null file list incantations of grep is a reasonable solution that will cover the vast majority of real-world use cases, but to be entirely portable, if your version of grep does not support any null output it is not perfectly safe to use it with xargs, so you must resort to find.
find . -maxdepth 1 -type f -exec grep -q "$yourstring" {} \; -exec sed -n '4{p;q;}' {} \;
Because find arguments can almost all be used as predicates, the -exec grep -q… part filters the files that are eventually fed to sed down to only those that contain the required string.
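Here is a small sketch of that filter-then-print idea on made-up data. The search string needle and the files a.txt, b.txt and c.txt are assumptions for the example. It runs sed once per file with '4{p;q;}', so each file's own fourth line is printed and sed stops reading right after it.

```shell
# Hypothetical scratch data.
demo=$(mktemp -d)
printf 'l1\nl2\nl3\nfour-a needle\nl5\n' > "$demo/a.txt"  # contains needle
printf 'l1\nl2\nl3\nfour-b\n'            > "$demo/b.txt"  # no needle
printf 'needle\nshort\n'                 > "$demo/c.txt"  # needle, but under 4 lines

# The first -exec is the filter; the second only runs for files that passed it.
fourth=$(find "$demo" -maxdepth 1 -type f \
           -exec grep -q 'needle' {} \; \
           -exec sed -n '4{p;q;}' {} \;)
printf '%s\n' "$fourth"
```

b.txt is filtered out by grep, and c.txt passes the filter but has no fourth line, so only a.txt contributes output.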
From other user:
grep -Frl string . | xargs -n 1 sed -n 4p
Give a try to the below GNU find command,
find . -maxdepth 1 -type f -exec grep -l 'yourstring' {} \; | xargs -I {} awk 'NR==4{print; exit}' {}
It finds all the files in the current directory that contain the specified string, and prints line 4 of each file.
This while loop should work:
while read -d '' -r file; do
echo -n "$file: "
sed '4q;d' "$file"
done < <(grep --null -l "some-text" *.txt)

Bash find filter and copy - trouble with spaces

So after a lot of searching and trying to interpret others' questions and answers to my needs, I decided to ask for myself.
I'm trying to take a directory structure full of images and place all the images (regardless of extension) in a single folder. In addition to this, I want to be able to remove images matching certain filenames in the process. I have a find command working that outputs all the filepaths for me
find -type f -exec file -i -- {} + | grep -i image | sed 's/\:.*//'
but if I try to use that to copy files, I have trouble with the spaces in the filenames.
cp `find -type f -exec file -i -- {} + | grep -i image | sed 's/\:.*//'` out/
What am I doing wrong, and is there a better way to do this?
With the caveat that it won't work if files have newlines in their names:
find . -type f -exec file -i -- {} + |
awk -vFS=: -vOFS=: '$NF ~ /image/{NF--;printf "%s\0", $0}' |
xargs -0 cp -t out/
(Based on the answer by Jonathan Leffler and the subsequent comment discussion with him and @devnull.)
The find command works well if none of the file names contain any newlines. Within broad limits, the grep command works OK under the same circumstances. The sed command works fine as long as there are no colons in the file names. However, given that there are spaces in the names, the use of $(...) (command substitution, also indicated by back-ticks `...`) is a disaster. Unfortunately, xargs isn't readily a part of the solution; it splits on spaces by default. Because you have to run file and grep in the middle, you can't easily use the -print0 option to (GNU) find and the -0 option to (GNU) xargs.
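The word-splitting problem is easy to reproduce. The sketch below counts how many arguments a single file named "img space.png" turns into, first through an unquoted command substitution and then through a NUL-separated pipe (for the plain case where nothing has to run in between). The scratch directory is made up, the default IFS is assumed, and -print0 is assumed to be available (GNU and BSD find have it).

```shell
# Hypothetical scratch data: exactly one file, with a space in its name.
demo=$(mktemp -d)
: > "$demo/img space.png"

# Unquoted $(...) is split on whitespace, so one file becomes two words.
split_count=$(cd "$demo" && set -- $(find . -type f) && echo "$#")

# NUL-separated names arrive as a single argument.
safe_count=$(cd "$demo" && find . -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

printf 'unquoted: %s, NUL-separated: %s\n' "$split_count" "$safe_count"
```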
In some respects, it is crude, but in many ways, it is easiest if you write an executable shell script that can be invoked by find:
#!/bin/bash
for file in "$@"
do
if file -i -- "$file" | grep -i -q "$file:.*image"
then cp "$file" out/
fi
done
This is a little painful in that it invokes file and grep separately for each name, but it is reliable. The file command is even safe if the file name contains a newline; the grep is probably not.
If that script is called 'copyimage.sh', then the find command becomes:
find . -type f -exec ./copyimage.sh {} +
And, given the way the grep command is written, the copyimage.sh file won't be copied, even though its name contains the magic word 'image'.
Pipe the results of your find command to
xargs -l --replace cp "{}" out/
Example of how this works for me on Ubuntu 10.04:
atomic@atomic-desktop:~/temp$ ls
img.png img space.png
atomic@atomic-desktop:~/temp$ mkdir out
atomic@atomic-desktop:~/temp$ find -type f -exec file -i \{\} \; | grep -i image | sed 's/\:.*//' | xargs -l --replace cp -v "{}" out/
`./img.png' -> `out/img.png'
`./img space.png' -> `out/img space.png'
atomic@atomic-desktop:~/temp$ ls out
img.png img space.png
atomic@atomic-desktop:~/temp$

How to create a backup of files' lines containing "foo"

Basically I have a directory and sub-directories that needs to be scanned to find .csv files. From there I want to copy all lines containing "foo" from the csv's found to new files (in the same directory as the original) but with the name reflecting the file it was found in.
So far I have
find -type f -name "*.csv" | xargs egrep -i "foo" > foo.csv
which yields one backup file (foo.csv) with everything in it, and the location it was found in is part of the data. Both of which I don't want.
What I want:
For example if I have:
csv1.csv
csv2.csv
and they both have lines containing "foo", I would like those lines copied to:
csv1_foo.csv
csv2_foo.csv
and I don't want anything extra entered in the backups, other than the full line containing "foo" from the original file. I.e. I don't want the original file name in the backup data, which is what my current code does.
Also, I suppose I should note that I'm using egrep, but my example doesn't use regex. I will be using regex in my search when I apply it to my specific scenario, so this probably needs to be taken into account when naming the new file. If that seems too difficult, an answer that doesn't account for regex would be fine.
Thanks ahead of time!
Try this and see if it helps.
find -type f -name "*.csv" | xargs -I {} sh -c 'filen=`echo {} | sed 's/.csv//' | sed "s/.\///"` && egrep -i "foo" {} > ${filen}_foo.log'
You can try this:
$ find . -type f -exec grep -H foo '{}' \; | perl -ne '`echo $2 >> $1_foo` if /(.*):(.*)/'
It uses:
find to iterate over files
grep to print file path:line tuples (-H switch)
perl to echo those lines to the output files (using backticks, but it could be done more elegantly).
You can also try:
find -type f -name "*.csv" -a ! -name "*_foo.csv" | while read f; do
grep foo "$f" > "${f%.csv}_foo.csv"
done
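For names with spaces, one more variation is to let find hand the files to a small inline shell loop, so nothing is ever passed through a pipe as text. This is only a sketch: the scratch directory and the files csv1.csv and "csv 2.csv" are invented for the example, and the pattern is a plain foo rather than a real regex.

```shell
# Hypothetical scratch data, including a subdirectory and a name with a space.
demo=$(mktemp -d)
mkdir "$demo/sub"
printf 'a,foo,1\nb,bar,2\n' > "$demo/csv1.csv"
printf 'x,y\nfoo,z\n'       > "$demo/sub/csv 2.csv"

# find passes the matching paths as arguments to the inline script.
# ${f%.csv} strips the extension so the copy lands next to the original.
find "$demo" -type f -name '*.csv' ! -name '*_foo.csv' \
  -exec sh -c 'for f do grep -i "foo" -- "$f" > "${f%.csv}_foo.csv"; done' sh {} +

first=$(cat "$demo/csv1_foo.csv")
second=$(cat "$demo/sub/csv 2_foo.csv")
printf '%s\n%s\n' "$first" "$second"
```

Because grep is given a single file each time, it does not prefix the lines with the file name. A file with no matching lines still gets an empty _foo.csv, which you may or may not want.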
