Extract text information from many subfolders - bash

I'm looking to extract information from subfolders.
I have a folder containing several folders, each of which contains several folders with text files.
I've done something like this, but it only works when the text files have different names (otherwise files with the same name are overwritten by the most recent ones):
mkdir target_directory
pwd=`pwd`
find $pwd . -name \*.txt -exec cp {} target_directory \;
cd target_directory
cat *.txt > all-info
rm *.txt
I was thinking of adding the directory name to the names of the extracted files. How can I do that?
Maybe there is a smarter way?
Thank you!

If your goal is to concatenate all *.txt files into target_directory/all-info, then just use cat {} in the -exec action of your find command and redirect the output:
$ mkdir -p target_directory
$ find . -type f -name '*.txt' -exec cat {} \; > target_directory/all-info

This should do the trick:
mkdir -p target_directory
find . -name "*.txt" -exec cat {} \; >> target_directory/all-info
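Since the question is about keeping track of which directory each piece of text came from, one possible variation is to write a header line with the source path before each file's contents. This is only a sketch and not part of the answers above; the sandbox directory, the sample files, and the header format are made up for the demonstration.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sandbox so the sketch can run without touching real data.
work=$(mktemp -d)
mkdir -p "$work/src/a/x" "$work/src/b" "$work/target_directory"
echo "alpha" > "$work/src/a/x/info.txt"
echo "beta"  > "$work/src/b/info.txt"

cd "$work/src"
# For every *.txt file, print a header with its path, then its contents.
# The redirection applies to find as a whole, so everything lands in one file.
find . -type f -name '*.txt' -exec sh -c '
  for f in "$@"; do
    printf "==> %s <==\n" "$f"
    cat "$f"
  done' sh {} + > "$work/target_directory/all-info"

cat "$work/target_directory/all-info"
```

Because all-info lives outside the tree being searched and does not end in .txt, find cannot pick it up as input.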

The cp man page mentions:
-n, --no-clobber
do not overwrite an existing file (overrides a previous -i option)
So I think your solution (only the find part) should be:
find $pwd . -name \*.txt -exec cp -n {} target_directory \;
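Note that cp -n keeps the first file with a given name and skips the later ones, so their contents never reach all-info. If every file should survive, the idea from the question (putting the directory into the file name) could look roughly like the sketch below. The sandbox, the sample files, and the choice of "_" as separator are assumptions for illustration only.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sandbox with two files that share the same base name.
work=$(mktemp -d)
mkdir -p "$work/src/a/x" "$work/src/b" "$work/target_directory"
echo "from a/x" > "$work/src/a/x/info.txt"
echo "from b"   > "$work/src/b/info.txt"

cd "$work/src"
# Copy each file into one flat directory, turning its relative path
# into the new name so that equal base names cannot collide.
find . -type f -name '*.txt' -exec sh -c '
  target=$1; shift
  for f in "$@"; do
    rel=${f#./}                              # a/x/info.txt
    flat=$(printf "%s" "$rel" | tr "/" "_")  # a_x_info.txt
    cp "$f" "$target/$flat"
  done' sh "$work/target_directory" {} +

ls "$work/target_directory"
```

Two different paths can still map to the same flat name if the directory names themselves contain "_", so a different separator may be needed for real data.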

Related

copy files with the base directory

I am searching a specific directory and its subdirectories for new files, and I would like to copy them. I am using this:
find /home/foo/hint/ -type f -mtime -2 -exec cp '{}' ~/new/ \;
It copies the files successfully, but some files have the same name in different subdirectories of /home/foo/hint/.
I would like to copy each file together with its base directory to the ~/new/ directory.
test@serv> find /home/foo/hint/ -type f -mtime -2 -exec ls '{}' \;
/home/foo/hint/do/pass/file.txt
/home/foo/hint/fit/file.txt
test@serv>
~/new/ should look like this after copy:
test@serv> ls -R ~/new/
/home/test/new/pass/:
file.txt
/home/test/new/fit/:
file.txt
test@serv>
platform: Solaris 10.
Since you can't use rsync or fancy GNU options, you need to roll your own using the shell.
The find command lets you run a full shell in your -exec, so you should be good to go with a one-liner to handle the names.
If I understand correctly, you only want the parent directory, not the full tree, copied to the target. The following might do:
#!/usr/bin/env bash
findopts=(
-type f
-mtime -2
-exec bash -c 'd="${0%/*}"; d="${d##*/}"; mkdir -p "$1/$d"; cp -v "$0" "$1/$d/"' {} ./new \;
)
find /home/foo/hint/ "${findopts[@]}"
Results:
$ find ./hint -type f -print
./hint/foo/slurm/file.txt
./hint/foo/file.txt
./hint/bar/file.txt
$ ./doit
./hint/foo/slurm/file.txt -> ./new/slurm/file.txt
./hint/foo/file.txt -> ./new/foo/file.txt
./hint/bar/file.txt -> ./new/bar/file.txt
I've put the options to find into a bash array for easier reading and management. The script for the -exec option is still a little unwieldy, so here's a breakdown of what it does for each file. Bear in mind that with bash -c, the arguments after the script string are numbered from zero, so the {} becomes $0 and the target directory becomes $1:
d="${0%/*}" # Store the source directory in a variable, then
d="${d##*/}" # strip everything up to the last slash, leaving the parent.
mkdir -p "$1/$d" # create the target directory if it doesn't already exist,
cp "$0" "$1/$d/" # then copy the file to it.
I used cp -v for verbose output as shown in "Results" above, but IIRC it's not supported by Solaris, so it can safely be left out there.
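The same breakdown can be tried end to end in a throwaway directory. The sketch below is a variation rather than the script above: it passes a placeholder as $0 so that the real arguments start at $1, which some people find easier to read. The sandbox layout mirrors the paths from the question but is otherwise invented, and it assumes a shell that supports the ${var%...} and ${var##...} expansions.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sandbox mirroring the layout from the question.
work=$(mktemp -d)
mkdir -p "$work/hint/do/pass" "$work/hint/fit" "$work/new"
echo one > "$work/hint/do/pass/file.txt"
echo two > "$work/hint/fit/file.txt"

find "$work/hint" -type f -mtime -2 -exec sh -c '
  dest=$1                 # target directory
  src=$2                  # file found by find
  d=${src%/*}             # directory part of the path
  d=${d##*/}              # last component only: the parent directory
  mkdir -p "$dest/$d"     # create the target directory if needed
  cp "$src" "$dest/$d/"   # then copy the file into it
' sh "$work/new" {} \;

find "$work/new" -type f | sort
```

Two files with the same name whose parent directories also share a name would still overwrite each other, because only the last directory component is kept.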
The --parents flag of GNU cp should do the trick:
find /home/foo/hint/ -type f -mtime -2 -exec cp --parents '{}' ~/new/ \;
Try testing with rsync -R, for example:
find /your/path -type f -mtime -2 -exec rsync -R '{}' ~/new/ \;
From the rsync man:
-R, --relative
Use relative paths. This means that the full path names specified on the
command line are sent to the server rather than just the last parts of the
filenames.
The problem with the answers by @Mureinik and @nbari might be that the full absolute path of each file is recreated inside the target directory. In that case you might want to switch to the base directory before running the command and go back to your current directory afterwards:
path_current=$PWD; cd /home/foo/hint/; find . -type f -mtime -2 -exec cp --parents '{}' ~/new/ \; ; cd "$path_current"
or
path_current=$PWD; cd /home/foo/hint/; find . -type f -mtime -2 -exec rsync -R '{}' ~/new/ \; ; cd "$path_current"
Both ways work for me on Linux. Let's hope that Solaris 10 knows about rsync's -R ! ;)
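Saving and restoring the directory by hand can be avoided by running the cd inside a subshell, since the directory change ends with the subshell. The sketch below also swaps cp --parents and rsync -R for a plain mkdir -p plus cp, in case neither is available; that substitution, the sandbox, and the sample files are assumptions of this example, not something from the answers above.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sandbox mirroring the layout from the question.
work=$(mktemp -d)
mkdir -p "$work/hint/do/pass" "$work/hint/fit" "$work/new"
echo one > "$work/hint/do/pass/file.txt"
echo two > "$work/hint/fit/file.txt"

start=$PWD
(
  # The cd only affects this subshell.
  cd "$work/hint"
  find . -type f -mtime -2 -exec sh -c '
    dest=$1; shift
    for f in "$@"; do
      mkdir -p "$dest/$(dirname "$f")"   # recreate the relative directory
      cp "$f" "$dest/$f"                 # copy the file below it
    done' sh "$work/new" {} +
)

echo "working directory is still: $PWD"
find "$work/new" -type f | sort
```

Because find starts from ".", the paths it reports are relative, so only the part below the base directory is recreated in the target.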
I found a way around it:
cd ~/new/
find /home/foo/hint/ -type f -mtime -2 -exec nawk -v f={} '{n=split(FILENAME, a, "/");j= a[n-1];system("mkdir -p "j"");system("cp "f" "j""); exit}' {} \;

Concatenate folder name into file

If I have a bunch of folders (f1, f2, f3) and they all have images in them (image1.jpg, image2.jpg), I'd like to add the folder name to the image name itself.
I.e.: f1_image1.jpg
What's the best way to do this? Preferably something I can run in Terminal.
Something like this?
for dir in *
do
    for image in "${dir}"/*.jpg
    do
        # remove the 'echo' if you think this works for you
        echo mv "${image}" "${dir}_$(basename "${image}")"
    done
done
This worked for me:
for dir in *
do
    for image in "${dir}"/*.jpg
    do
        image_name=$(echo "${image}" | cut -f2 -d/)
        mv "${image}" "${dir}_${image_name}"
    done
done
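Both loops above move the renamed image into the current directory. If the images should stay inside their folders, a variation along the following lines might help. It is a sketch only: the sandbox and the folder names (including one with a space) are invented, and it uses parameter expansion instead of basename or cut.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sandbox; one folder name contains a space on purpose.
work=$(mktemp -d)
mkdir -p "$work/f1" "$work/f 2"
touch "$work/f1/image1.jpg" "$work/f 2/image1.jpg"

cd "$work"
for dir in */; do
  dir=${dir%/}                        # strip the trailing slash
  for image in "$dir"/*.jpg; do
    [ -e "$image" ] || continue       # the glob matched nothing
    # ${image##*/} is the file name without its directory part
    mv "$image" "$dir/${dir}_${image##*/}"
  done
done

find . -type f | sort
```

Running it a second time would add the prefix again, so it is worth testing with echo in front of mv first.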
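A one-liner based on find:
find <path> -type f -exec sh -c 'mv "$1" "$(dirname "$1")/$(basename "$(dirname "$1")")_$(basename "$1")"' sh {} \;
where <path> is the root folder of your bunch of folders. Passing {} as an argument ($1) instead of embedding it in the script keeps names with spaces or special characters intact.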
example:
create the example files
mkdir -p /tmp/ffff/f{1,2,3}
touch /tmp/ffff/f{1,2,3}/image{1,2,3,4}.jpg
run
find /tmp/ffff -type f
you will get
/tmp/ffff/f3/image4.jpg
/tmp/ffff/f3/image3.jpg
/tmp/ffff/f3/image2.jpg
/tmp/ffff/f3/image1.jpg
/tmp/ffff/f2/image4.jpg
/tmp/ffff/f2/image3.jpg
/tmp/ffff/f2/image2.jpg
/tmp/ffff/f2/image1.jpg
/tmp/ffff/f1/image4.jpg
/tmp/ffff/f1/image3.jpg
/tmp/ffff/f1/image2.jpg
/tmp/ffff/f1/image1.jpg
run
find /tmp/ffff -type f -exec sh -c 'mv "$1" "$(dirname "$1")/$(basename "$(dirname "$1")")_$(basename "$1")"' sh {} \;
run again
find /tmp/ffff -type f
you will get
/tmp/ffff/f3/f3_image1.jpg
/tmp/ffff/f3/f3_image2.jpg
/tmp/ffff/f3/f3_image3.jpg
/tmp/ffff/f3/f3_image4.jpg
/tmp/ffff/f2/f2_image1.jpg
/tmp/ffff/f2/f2_image2.jpg
/tmp/ffff/f2/f2_image3.jpg
/tmp/ffff/f2/f2_image4.jpg
/tmp/ffff/f1/f1_image1.jpg
/tmp/ffff/f1/f1_image2.jpg
/tmp/ffff/f1/f1_image3.jpg
/tmp/ffff/f1/f1_image4.jpg

Error message when using find -exec to copy files

I use the find command to copy some files from one location to another. If I do
$ mkdir dir1 temp
$ touch dir1/dir1-file1
$ find . -iname "*file*" -exec cp {} temp/ \;
everything works fine as expected, but if I do
$ mkdir SR0a temp
$ touch SR0a/SR0a-file1
$ find . -iname "*file*" -exec cp {} temp/ \;
> cp: `./temp/SR0a-file1' and `temp/SR0a-file1' are the same file
I get an error message. I do not understand this behavior. Why do I get an error by simply changing names?
That is because find searches the SR0a/ folder first and then temp/. By the time it reaches temp/, the file has already been copied there, so find matches it again and cp is asked to copy it onto itself. find visits directory entries in the order the filesystem returns them, not alphabetically, so you have to take this into account when using it:
$ mkdir temp dir1 SR0a DIR TEMP
$ find .
.
./TEMP
./SR0a
./temp
./dir1
./DIR
In the dir1 case, find visits temp/ first, while it is still empty, so no such problem occurs. The search sequence is:
temp/
dir1/
When you search with SR0a, the sequence is:
SR0a/
temp/
so the file found in SR0a/ is copied into temp/ before find searches temp/, where it is then found again.
To fix it, either move the temp/ folder outside the current one:
$ mkdir SR0a ../temp
$ touch SR0a/SR0a-file1
$ find . -iname "*file*" -exec cp {} ../temp/ \;
or use a pipe to separate the find and copy steps:
$ find . -iname "*file*" | while read -r i; do cp "$i" temp/; done
This find should work:
find . -path ./temp -prune -o -iname "*file*" -type f -exec cp '{}' temp/ \;
-path ./temp -prune -o is used to skip the ./temp directory while copying files into the temp folder.
Your find command also finds the ./temp/*file* files and tries to copy them into the ./temp folder as well.
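To see the effect of -prune without touching real data, the scenario from the question can be replayed in a temporary directory. This is a sketch; the sandbox and the run_copy helper are hypothetical names introduced for the demonstration. The command is run twice on purpose, because on the second run temp/ already contains a copy that would trigger the "are the same file" error if temp/ were not pruned.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sandbox reproducing the directories from the question.
work=$(mktemp -d)
cd "$work"
mkdir SR0a temp
touch SR0a/SR0a-file1

# Hypothetical helper: copy matching files, never descending into ./temp.
run_copy() {
  find . -path ./temp -prune -o -type f -iname '*file*' -exec cp {} temp/ \;
}

# Collect anything printed by either run, including error messages.
errors=$( { run_copy; run_copy; } 2>&1 )

printf 'messages from find/cp: [%s]\n' "$errors"
ls temp
```

The path given to -path has to match the form find prints, so ./temp works here because the search starts at ".".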
The error occurs because find tries to copy a file from temp/ onto itself.
Pipe the output of find into a while loop to separate the copy from the find command.
Use cp with the options -frpvT so that the target is treated as a file path rather than a directory.
Print the real path of the source file and check whether the source and target paths are the same:
find . -iname "*file*" | while read -r f; do echo cp -frpvT "$(realpath "$f")" "/temp/$f"; done
If they are, correct the file path; once that is done, you can remove the echo from the command.

shell script to traverse files recursively

I need some assistance in creating a shell script to run a specific command (any) on each file in a folder, as well as recursively dive into sub-directories.
I'm not sure how to start.
A pointer in the right direction would suffice. Thank you.
To apply a command (say, echo) to all files below the current path, use
find . -type f -exec echo "{}" \;
For directories, use -type d.
You should be looking at the find command.
For example, to change the permissions of all JPEG files under your /tmp directory:
find /tmp -name '*.jpg' -exec chmod 777 {} ';'
Although, if there are a lot of files, you can combine it with xargs to batch them up, something like:
find /tmp -name '*.jpg' | xargs chmod 777
And, on implementations of find and xargs that support null-separation:
find /tmp -name '*.jpg' -print0 | xargs -0 chmod 777
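Another way to get the batching without xargs is the -exec ... {} + form, which current find implementations generally accept. The sketch below counts how many times a command is started with \; versus +. The sandbox, the file names, and the 644 mode are assumptions for illustration only.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sandbox; one directory name contains a space on purpose.
work=$(mktemp -d)
mkdir -p "$work/with space"
touch "$work/one.jpg" "$work/with space/two.jpg"

# With \; the command is started once per file ...
per_file=$(find "$work" -name '*.jpg' -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')
# ... with + each command receives as many paths as fit on its command line.
batched=$(find "$work" -name '*.jpg' -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')
echo "per file: $per_file, batched: $batched"

# The same batching applied to chmod; names with spaces are passed intact.
find "$work" -name '*.jpg' -exec chmod 644 {} +
```

With only two files this prints "per file: 2, batched: 1"; the difference matters more as the number of files grows.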
With Bash 4.0 or later:
#!/bin/bash
shopt -s globstar
for file in **/*.txt
do
echo "do something with $file"
done
To recursively list all files
find . -name '*'
And let's say, for example, that you want to grep each file:
find . -type f -name 'pattern' -print0 | xargs -0 grep 'searchtext'
Within a bash script, you can loop over the results of the find command this way (null-separated, so file names containing spaces survive):
find . -type f -print0 | while IFS= read -r -d '' F
do
    : # command that uses "$F"
done

using find with exec

I want to copy the files found by find (with the -exec cp option), but I'd like to change the names of those files, e.g. find ... -exec cp '{}' test_path/"test_"'{}', which should copy all files found by find to my test_path with the prefix 'test_'. But it doesn't work.
I'd be glad if anyone could give me some ideas on how to do it.
best regards
for i in $(find . -name "FILES.EXT"); do cp "$i" "test_path/test_$(basename "$i")"; done
It is assumed that you are in the directory that has the files to be copied and test_path is a subdir of it.
If you have Bash 4.0 or later, and assuming you are looking for .txt files:
cd /path
shopt -s globstar
for file in ./**/*.txt
do
echo cp "$file" "/test_path/test_${file##*/}"
done
or with GNU find
find /path -type f -iname "*.txt" -print0 | while IFS= read -r -d '' FILE
do
cp "$FILE" "/test_path/test_${FILE##*/}"
done
OR another version of GNU find+bash
find /path -type f -name "*txt" -printf "cp '%p' '/tmp/test_%f'\n" | bash
OR this ugly one if you don't have GNU find
$ find /path -name '*.txt' -type f -exec basename {} \; | xargs -I file echo cp /path/file /destination/test_file
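You should put the entire test_path/"test_"'{}' in double quotes, like:
find ... -exec cp "{}" "test_path/test_{}" \;
Note that {} expands to the whole path that find prints (for example ./dir/file), so the prefix ends up in front of the path rather than the file name unless the starting points are bare file names.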
I would break it up a bit, like this:
for line in $(find /tmp -type f); do FULL=$line; name=$(echo "$line" | rev | cut -d / -f -1 | rev); echo cp "$FULL" "new/location/test_$name"; done
Here's the output;
cp /tmp/gcc.version new/location/test_gcc.version
cp /tmp/gcc.version2 new/location/test_gcc.version2
Naturally, remove the echo so that cp actually runs instead of just printing what it would have done.
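For completeness, here is a sketch of the prefix copy that neither parses the output of find nor embeds {} in the target name. It hands the paths to a small sh script as arguments and builds the new name from the base name of each. The sandbox, the sample files, and the src directory name are invented for the demonstration; only test_path and the test_ prefix come from the question.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sandbox; one file name contains a space on purpose.
work=$(mktemp -d)
cd "$work"
mkdir -p src/sub test_path
touch src/a.txt "src/sub/b c.txt"

# Each found path arrives as a separate argument, so spaces are harmless.
find src -type f -name '*.txt' -exec sh -c '
  for f in "$@"; do
    cp "$f" "test_path/test_$(basename "$f")"
  done' sh {} +

ls test_path
```

Files with the same base name in different subdirectories would still overwrite each other in test_path, so this fits best when the names are known to be unique.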

Resources