Display filename of tar file - bash

I would like to know how to display the filename along with the lines matching a specific word in a tar file.
Command wise :
zcat file | grep "stuff" -r # shows what I want
zcat *.gz | grep "stuff" -ar # this fails

You can use zgrep:
For single file, you can use the following command to display filename:
zgrep "stuff" file.gz /dev/null
For multiple files:
zgrep "stuff" *.gz
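The /dev/null trick above can be wrapped in a small helper so the filename prefix appears no matter how many archives are passed. This is only a sketch: the function name grep_gz is made up for illustration, and it assumes a zgrep that accepts -e (as the one shipped with gzip does).

```shell
#!/bin/bash
# grep_gz PATTERN FILE...: search gzip-compressed files and always
# prefix each match with the name of the file it came from.
# (grep_gz is a hypothetical helper name, not part of any tool.)
grep_gz() {
    local pattern=$1
    shift
    # The extra /dev/null argument makes zgrep see more than one file,
    # so it prints "name:match" even when only one archive is given.
    zgrep -e "$pattern" "$@" /dev/null
}
```

Usage would look like grep_gz "stuff" file.gz or grep_gz "stuff" *.gz.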

Maybe this related answer can help. It uses tar to untar (you would need to add -z) and pipes each file of the archive to awk for "grepping" inside it.

I'm not quite sure what the question is, but if you are looking for tar files on your system, you can do something like this. It recursively searches the current directory and all of its subdirectories for .tar files. Hope this helps.
find . -name "*.tar"

If zcat file | grep "stuff" -r shows what you want, you can do this for multiple files:
for name in *.gz ; do zcat "$name" | grep -a "stuff" | sed -e "s/^/${name}: /" ; done
This command uses globbing (*) to expand to a list of .gz files in your working directory, then calls zcat for extraction, grep for the search and sed for prefixing with the filename on each of the files.
Note that if you are working with gzipped tarballs, most people give them a .tgz or .tar.gz instead of just .gz extension.

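This will output nameOfFileInTar:LineNumber:Match. Invoke with greptar.sh tarfile.tar.gz pattern (the z flag means the script expects a gzip-compressed tarball; drop it for a plain .tar).
If you don't want the line number, remove the -n option. If you only want the line number, add |cut -f1 -d: after the grep
#!/bin/bash
# Usage: greptar.sh tarfile.tar.gz pattern
TARFILE=$1
PATTERN=$2
# List the archive members, then extract each one to stdout and grep it.
tar ztf "$TARFILE" | while IFS= read -r FILE
do
res=$(tar -xzOf "$TARFILE" "$FILE" | grep -n -e "$PATTERN")
# $? is the exit status of grep: 0 means at least one line matched.
if [[ $? == 0 ]]; then
echo "$res" | while IFS= read -r line; do
echo "$FILE:$line"
done
fi
done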
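The script above re-reads the archive once per member. A single-pass variant might look like the sketch below. It assumes GNU tar (for --to-command and the TAR_FILENAME variable it sets) and GNU grep (for --label); the names grep_tar and GREPTAR_PATTERN are invented for this example.

```shell
#!/bin/bash
# grep_tar ARCHIVE PATTERN: print member:line-number:match for a
# gzip-compressed tarball, reading the archive only once.
# Assumes GNU tar (--to-command, TAR_FILENAME) and GNU grep (--label).
grep_tar() {
    local archive=$1
    # The pattern travels through the environment so it needs no extra
    # quoting inside the command string that tar hands to /bin/sh.
    # "|| true" keeps tar from reporting an error for members without a match
    # (it also hides genuine grep errors, which is acceptable for a sketch).
    GREPTAR_PATTERN=$2 tar -xzf "$archive" \
        --to-command='grep -n -H --label="$TAR_FILENAME" -e "$GREPTAR_PATTERN" || true'
}
```

Invoke it as grep_tar tarfile.tar.gz pattern. With --to-command nothing is written to disk; each regular file in the archive is piped to grep instead.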

Related

Ubuntu script to rename every file in folders and sub directories NOT matching sha1sum

On Ubuntu I have run this command.
sha1sum /home/abcd/random/1/1.mp4
So I have the following directory
/home/abcd/
Inside this folder are sub folders
/home/abcd/1
/home/abcd/2
etc
etc
These sub folders are filled with video files .mp4 extensions
Several of the files share the same sha1sum, 3c72363260a32992c3ab2e3a5e9b8cf082e02eac. I want to rename all the files NOT matching this hash to
vid_1.mp4
vid_2.mp4
etc
etc
How can I achieve this?
pseudocode
find mp4s
sha1sum them (outputs <sha1sum> <filename>)
pass lines not 3c72363260a32992c3ab2e3a5e9b8cf082e02eac
change lines from <sha1sum> <filename> to mv "<filename>" "vid_<filename>"
execute the lines
code
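cd /home/abcd
find . -name "*.mp4" -print0 |
xargs -r -0 sha1sum |
awk '$1!="3c72363260a32992c3ab2e3a5e9b8cf082e02eac"' |
# Prefix only the file name with vid_, keeping each file in its own directory.
while read -r sum file; do
mv -- "$file" "$(dirname "$file")/vid_$(basename "$file")"
done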
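You can try this for loop with a nested if statement
i=0
# The videos live in the sub folders, so glob one level down.
for f in /home/abcd/*/*.mp4; do
sum=$(sha1sum "$f" | awk '{print $1}')
if [[ "$sum" != 3c72363260a32992c3ab2e3a5e9b8cf082e02eac ]]; then
# Count only the files that actually get renamed.
i=$((i+1))
mv -- "$f" "$(dirname "$f")/vid_$i.mp4"
fi
done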
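Both answers can be combined into one function that builds the file list before renaming anything, so files that were already renamed are not hashed again. This is a sketch under assumptions: the name rename_unmatched and the DRY_RUN switch are hypothetical, and it relies on GNU find, sort -z and sha1sum as found on Ubuntu.

```shell
#!/bin/bash
# rename_unmatched ROOT HASH: rename every .mp4 under ROOT whose SHA-1
# differs from HASH to vid_N.mp4 inside its own directory.
# Set DRY_RUN=1 to only print the planned moves (hypothetical switch).
rename_unmatched() {
    local root=$1 keep=$2 i=0 f sum
    local -a files=()
    # Collect the list first so renamed files are not picked up again.
    while IFS= read -r -d '' f; do
        files+=("$f")
    done < <(find "$root" -type f -name '*.mp4' -print0 | sort -z)
    for f in "${files[@]}"; do
        sum=$(sha1sum "$f" | awk '{print $1}')
        # Leave files with the reference hash untouched.
        [[ $sum == "$keep" ]] && continue
        i=$((i + 1))
        if [[ ${DRY_RUN:-0} == 1 ]]; then
            printf 'mv %s %s\n' "$f" "$(dirname "$f")/vid_$i.mp4"
        else
            # -n refuses to overwrite an existing vid_N.mp4.
            mv -n -- "$f" "$(dirname "$f")/vid_$i.mp4"
        fi
    done
}
```

A cautious run would be DRY_RUN=1 rename_unmatched /home/abcd 3c72363260a32992c3ab2e3a5e9b8cf082e02eac, followed by the same call without DRY_RUN once the printed plan looks right.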

Copy files that have at least the mention of one certain word

I want to look through 100K+ text files from a directory and copy to another directory only the ones which contain at least one word from a list.
I tried doing an if statement with grep and cp but I have no idea how to make it to work this way.
for filename in *.txt
do
grep -o -i "cultiv" "protec" "agricult" $filename|wc -w
if [ wc -gt 0 ]
then cp $filename ~/desktop/filepath
fi
done
Obviously this does not work but I have no idea how to store the wc result and then compare it to 0 and only act on those files.
Use the -l option to have grep print all the filenames that match the pattern. Then use xargs to pass these as arguments to cp.
grep -l -E -i 'cultiv|protec|agricult' *.txt | xargs cp -t ~/desktop/filepath --
The -t option is a GNU cp extension, it allows you to put the destination directory first so that it will work with xargs.
If you're using a version without that option (for example BSD cp on macOS), you need to use the -J option of BSD xargs to substitute in the middle of the command.
grep -l -E -i 'cultiv|protec|agricult' *.txt | xargs -J {} cp -- {} ~/desktop/filepath
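With 100K+ files, the *.txt glob may exceed the shell's argument-length limit. One way around that, sketched here with GNU find, grep, xargs and cp assumed, is to let find hand the files to grep in batches and pass the matches on NUL-delimited. The function name copy_matching is made up for the example.

```shell
#!/bin/bash
# copy_matching SRC DEST PATTERN: copy every .txt file directly inside SRC
# that matches the extended regex PATTERN (case-insensitive) into DEST.
# Assumes GNU grep (-Z), GNU xargs (-0 -r) and GNU cp (-t).
copy_matching() {
    local src=$1 dest=$2 pattern=$3
    mkdir -p -- "$dest"
    # find batches the file names for grep, so no glob has to be expanded
    # by the shell; -Z makes grep end each printed name with a NUL byte.
    find "$src" -maxdepth 1 -type f -name '*.txt' \
        -exec grep -l -Z -E -i -e "$pattern" {} + |
        xargs -0 -r cp -t "$dest" --
}
```

For the word list in the question, the call would be copy_matching . ~/desktop/filepath 'cultiv|protec|agricult'.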

Shell Script: How to copy files with specific string from big corpus

I have a small bug and don't know how to solve it. I want to copy files that contain a specific string out of a big folder with many files. For this I use grep, ack or (in this example) ag. When I'm inside the folder it matches without problems, but when I loop over the files in the following script, it doesn't loop over the matches. Here is my script:
ag -l "${SEARCH_QUERY}" "${INPUT_DIR}" | while read -d $'\0' file; do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done
SEARCH_QUERY holds the String I want to find inside the files, INPUT_DIR is the folder where the files are located, OUTPUT_DIR is the folder where the found files should be copied to. Is there something wrong with the while do?
EDIT:
Thanks for the suggestions! I took this one now, because it also looks for files in subfolders and saves a list with all the files.
ag -l "${SEARCH_QUERY}" "${INPUT_DIR}" > "output_list.txt"
while read file
do
echo "${file##*/}"
cp "${file}" "${OUTPUT_DIR}/${file##*/}"
done < "output_list.txt"
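Better to implement it with a find command, as below. The destination uses ${file##*/} (the bare file name) because ${file} already contains the INPUT_DIR path:
find "${INPUT_DIR}" -type f -name "*.*" -print0 | xargs -0 grep -l -- "${SEARCH_QUERY}" > /tmp/file_list.txt
while IFS= read -r file
do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file##*/}"
done < /tmp/file_list.txt
rm /tmp/file_list.txt
or another option (note the glob must stay outside the quotes so the shell can expand it):
grep -l -- "${SEARCH_QUERY}" "${INPUT_DIR}"/*.* > /tmp/file_list.txt
while IFS= read -r file
do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file##*/}"
done < /tmp/file_list.txt
rm /tmp/file_list.txt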
If you do not mind doing it in just one line, then
grep -lr 'ONE\|TWO\|THREE' | xargs -I xxx -P 0 cp xxx dist/
Guide:
-l prints only the file name and nothing else
-r searches the CWD and all subdirectories recursively
'ONE\|TWO\|THREE' matches any of the words ONE, TWO or THREE
| pipes the output of grep to xargs
-I xxx stores each file name in xxx, which is just a placeholder
-P 0 runs as many cp commands in parallel as possible
cp copies each file xxx to the dist directory
If I understand the behavior of ag correctly, then to solve the problem in your loop you have to either
adjust the read delimiter to '\n', or
use ag -0 -l to force delimiting by '\0'.
Alternatively, you can use the following script, which is based on find instead of ag. Note that -name matches SEARCH_QUERY against file names, not file contents.
while IFS= read -r file; do
echo "$file"
cp "$file" "$OUTPUT_DIR/${file##*/}"
done < <(find "$INPUT_DIR" -name "*$SEARCH_QUERY*" -print)
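The delimiter point made in this thread (read -d '' only works when the producer really emits NUL bytes) can be sketched with GNU grep standing in for ag, since grep -Z is known to terminate each file name with a NUL. The function name copy_hits is hypothetical, and -F treats the query as a fixed string, which is an assumption about what "a specific string" means here.

```shell
#!/bin/bash
# copy_hits QUERY IN_DIR OUT_DIR: copy every file under IN_DIR whose
# contents include the fixed string QUERY into OUT_DIR (flat, by base name).
# Assumes GNU grep for -r and -Z; grep is swapped in for ag on purpose.
copy_hits() {
    local query=$1 in_dir=$2 out_dir=$3 file
    mkdir -p -- "$out_dir"
    # grep -Z ends each file name with a NUL byte, which is exactly
    # what "read -d ''" expects as its delimiter.
    while IFS= read -r -d '' file; do
        cp -- "$file" "$out_dir/${file##*/}"
    done < <(grep -rlZ -F -e "$query" -- "$in_dir")
}
```

Because only the base name is kept, two matching files with the same name in different subfolders would overwrite each other in OUT_DIR.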

Move files according to number in filename

I am trying to move files in folders according to a number in their names.
Files are named like fooNNN_bar.txt and I would like to organise them like /NNN/fooNNN_bar.txt
Here is what I have for now. It prints me the folder each file would have to move to. I'm not sure how to collect the number to add it into a mv command. Is this even the correct way to do it?
#!/bin/bash
for filename in foo*.txt;
do
echo "${filename}" | grep -Eo '[0-9]{1,4}';
done
Assuming your grep works as you want:
#!/bin/bash
for filename in foo*.txt; do
num=$(echo "${filename}" | grep -Eo '[0-9]{1,4}')
mkdir -p "$num"
mv "$filename" "$num"
done
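The number can also be captured without calling grep, using bash's own regex matching. This sketch assumes the fooNNN_bar.txt naming from the question (one to four digits right after foo, followed by an underscore); the function name sort_by_number is made up.

```shell
#!/bin/bash
# sort_by_number: move each fooNNN_*.txt in the current directory
# into a directory named after its number NNN.
sort_by_number() {
    local filename num
    # Pattern assumed from the question: "foo", 1-4 digits, then "_".
    local re='^foo([0-9]{1,4})_'
    for filename in foo*.txt; do
        # Skip the literal pattern when nothing matches the glob.
        [[ -e $filename ]] || continue
        if [[ $filename =~ $re ]]; then
            num=${BASH_REMATCH[1]}
            mkdir -p -- "$num"
            mv -- "$filename" "$num/"
        fi
    done
}
```

Files that do not fit the pattern are left where they are, and anchoring the regex means a second number later in the name cannot end up in num.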

Passing Arguments in Unix command line when using | symbol

I am trying to move all the video files in my Pictures directory to my Movies directory. This is on a Mac, by the way.
I thought I could simply recurse through all my picture directories with "ls -R".
Then I pipe that to grep -i ".avi", which gives me all the movie files.
Now I pipe these values to "mv -n $1 ~/Movies", which I hoped would move the files to the Movies folder.
I have a few problems.
1. "ls -R" does not list the path when listing the files, so I think I may fail to move the file.
2. I cannot seem to get the file name assigned to the $1 in the mv command.
Altogether my command looks like this (note I am running this from ~/Pictures):
ls -R | grep -i ".avi" | mv -n $1 ~/Movies
So right now I am not sure which part is failing but I do get this error:
usage: mv [-f | -i | -n] [-v] source target
mv [-f | -i | -n] [-v] source ... directory
If I remove the 'mv' command I get a listing of avi files without the path. Example below:
4883.AVI
4884.AVI
4885.AVI
4886.AVI
4887.AVI
...
Does anyone have any ideas on how I can get the path in the 'ls' output, or how to pass a value between the '|' commands?
Thanks.
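It's better to use the find command. On a Mac, find needs an explicit starting directory, and -iname also matches upper-case names such as 4883.AVI:
$ find . -iname "*.avi" -exec mv -n {} ~/Movies \;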
You should create a simple copy.sh like this
#!/bin/bash
cp $1 ~/Movies/
And run the command ./copy.sh "$(ls | grep avi)"
A bash for loop can find all the avi files in the current directory easily
shopt -s nullglob
for file in *.avi
do
mv -n "$file" ~/Movies/
done
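You can achieve this in many ways. One of them, in my opinion, is a while loop. Since ls -R prints bare file names without their directories, find is used here to get full paths:
find . -iname "*.avi" | while IFS= read -r movie
do
echo "moving $movie"
mv -n "$movie" ~/Movies/
done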
Use backticks
mv `ls *.avi` ~/Movies
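Putting the find-based advice into a reusable form could look like the sketch below. It sticks to options that both BSD (macOS) and GNU tools provide (-iname, -exec, mv -n); the function name move_videos is hypothetical.

```shell
#!/bin/bash
# move_videos SRC DEST: move every .avi file (any letter case) found
# anywhere under SRC into DEST, without overwriting existing files.
move_videos() {
    local src=$1 dest=$2
    mkdir -p -- "$dest"
    # find passes each full path to mv as {}, which solves both problems
    # from the question: the missing path and the missing argument.
    find "$src" -type f -iname '*.avi' -exec mv -n {} "$dest"/ \;
}
```

From the question's setup this would be called as move_videos ~/Pictures ~/Movies. DEST should not be located inside SRC, otherwise find would also visit the files it has just moved.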
