find and cat to merge csv files - shell

I have thousands of files in sub-directories of ~/data. I wish to merge all those CSV files with a certain extension, say .x, and save the merged file to ~/data/merged.x.
I know I need to use find, cat and >> with the option -iname, but I'm finding it hard to do.
Thanks in advance

find ~/data -name "*.x" ! -name "merged.x" | while IFS= read -r file
do
    cat "$file" >> ~/data/merged.x    # append each file; skip the output file itself
done

find ~/data -type f ! -name 'merged.x' -a -name '*.x' -exec cat {} \+ >> ~/data/merged.x

find ~/data -type f -name "*.x" | xargs cat > ~/data/merged.x
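The two pipe-based answers split filenames on whitespace; the -exec version does not have that problem. If you prefer the pipeline style, a null-delimited sketch, assuming GNU find and xargs, that also excludes the output file:
find ~/data -type f -name '*.x' ! -name 'merged.x' -print0 \
    | xargs -0 cat > ~/data/merged.x
The -print0/-0 pair keeps each filename intact regardless of embedded whitespace.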

Related

Sort names of zipped files and write list to file

I tried to list the zipped files in sorted order and write the list to a new file, but it does not work properly in my shell script. Why is my script not working?
ls | grep gz | sort -t '.' -k 2,2n > filename
I did not find any problem with your command, but it does not seem like the right way to do this, at least to me. The two approaches I'm pasting below are better; try them out.
With only names :
find . -type f -name '*.gz' 2>/dev/null -exec basename {} \; | sort > filename.txt
With full paths :
find . -type f -name '*.gz' 2>/dev/null | sort > filename.txt
You can also add the "-maxdepth 1" flag to search only the current directory where you are running this, not recursively within nested dirs (GNU find expects it before the other tests):
find . -maxdepth 1 -type f -name '*.gz' 2>/dev/null | sort > filename.txt
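If the goal is the numeric ordering from the original attempt (e.g. log.2.gz before log.10.gz), GNU sort's version sort can be dropped in; a sketch, assuming GNU coreutils:
# -V compares embedded numbers numerically instead of character by character
find . -maxdepth 1 -type f -name '*.gz' | sort -V > filename.txt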
Hope this helps you :)

Output of 'cat' to find files with partial filenames

Say I've file1.txt with
ptext1
ptext2
ptext3
ptext4
These are the partial file names (library names) which I'm trying to find from a directory. Something like
cat file1.txt | xargs find . -name "*$0*"
or say,
cat file1.txt | awk '{system("find . -name " *$0*)}'
None of them are working.
Please suggest.
I'm sure there is a more elegant way, but you could always loop over the names and run find on each:
Update to reflect suggestions in comments
while read -r filename; do
find . -type f -name "*$filename*"
done < file1.txt
One way with xargs
xargs -I{} find . -name "*{}*" < file1.txt
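If running one find per pattern is too slow for a long list, you could also run find once and filter its output with grep; a sketch, assuming the lines in file1.txt are plain substrings rather than regexes:
# -F treats each pattern as a fixed string, -f reads one pattern per line
find . -type f | grep -Ff file1.txt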

Terminal find, directories last instead of first

I have a makefile that concatenates JavaScript files together and then runs the file through uglify-js to create a .min.js version.
I'm currently using this command to find and concat my files
find src/js -type f -name "*.js" -exec cat {} >> ${jsbuild}$# \;
But it lists files in sub-directories first. That makes heaps of sense, but I'd like it to list the .js files at the top of src/js before those in the directories, to avoid getting my undefined JS error.
Is there any way to do this? I've had a google around and seen the sort command and the -s flag for find, but it's a bit above my understanding at this point!
[EDIT]
The final solution is slightly different from the accepted answer, but that answer is marked accepted because it brought me to it. Here is the command I used:
cat `find src/js -type f -name "*.js" -print0 | xargs -0 stat -f "%z %N" | sort -n | sed -e "s|[0-9]*\ \ ||"` > public/js/myCleverScript.js
Possible solution:
use find to get filenames and directory depth, i.e. find ... -printf "%d\t%p\n"
sort the list by directory depth with sort -n
remove the directory depth from the output to leave filenames only
test:
without sorting:
$ find folder1/ -depth -type f -printf "%d\t%p\n"
2 folder1/f2/f3
1 folder1/file0
with sorting:
$ find folder1/ -type f -printf "%d\t%p\n" | sort -n | sed -e "s|[0-9]*\t||"
folder1/file0
folder1/f2/f3
The command you need looks like:
cat $(find src/js -type f -name "*.js" -printf "%d\t%p\n" | sort -n | sed -e "s|[0-9]*\t||") > min.js
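Note that the command substitution splits on whitespace, so this breaks on filenames containing spaces. A null-delimited variant of the same depth-sort idea, assuming GNU findutils and coreutils (sort -z and cut -z):
find src/js -type f -name '*.js' -printf '%d\t%p\0' \
    | sort -z -n \
    | cut -z -f2- \
    | xargs -0 cat > min.js
Each record is NUL-terminated end to end, so no filename is ever split.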
Mmmmm...
find src/js -type f
shouldn't find ANY directories at all, and doubly so as your directory names will probably not end in ".js". The brackets around your "-name" parameter are superfluous too; try removing them:
find src/js -type f -name "*.js" -exec cat {} >> ${jsbuild}$# \;
find can be given the first directory level already expanded on the command line, which enforces the order of directory tree traversal. This solves the problem just for the top directory (unlike the already accepted solution by Sergey Fedorov), but it should answer your question too, and more options are always welcome.
Using GNU coreutils ls, you can sort directories before regular files with the --group-directories-first option. From reading the Mac OS X ls manpage it seems that directories are always grouped on OS X, so you should just drop the option.
ls -A --group-directories-first -r | tac | xargs -I'%' find '%' -type f -name '*.js' -exec cat '{}' + > ${jsbuild}$#
If you do not have the tac command, you can easily implement it using sed; it reverses the order of lines. See the tac example in the GNU sed info documentation.
tac(){
    sed -n '1!G;$p;h'
}
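A quick sanity check of the sed-based replacement:
$ printf '1\n2\n3\n' | tac
3
2
1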
You could do something like this...
First create a variable holding the name of our output file:
OUT="$(pwd)/theLot.js"
Then, get all "*.js" in top directory into that file:
cat *.js > "$OUT"
Then have "find" grab all other "*.js" files below current directory:
find . -type d ! -name . -exec sh -c "cd {} ; cat *.js >> $OUT" \;
Just to explain the "find" command, it says:
find
. = starting at current directory
-type d = all directories, not files
! -name . = except the current one
-exec sh -c = and for each one you find execute the following
"..." = go to that directory and concatenate all "*.js" files there onto end of $OUT
\; = and that's all for today, thank you!
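One caveat with that -exec: embedding {} inside the sh -c string means a maliciously named directory could be executed as shell code, and an $OUT containing spaces would break the inner shell. A safer sketch of the same idea, passing both values as positional parameters instead:
find . -type d ! -name . -exec sh -c 'cd "$1" && cat *.js >> "$2"' sh {} "$OUT" \;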
I'd get the list of all the files:
$ find src/js -type f -name "*.js" > list.txt
Sort them by depth, i.e. by the number of '/' in them, using the following ruby script:
sort.rb:
files=[]; while gets; files<<$_; end
files.sort! {|a,b| a.count('/') <=> b.count('/')}
files.each {|f| puts f}
Like so:
$ ruby sort.rb < list.txt > sorted.txt
Concatenate them:
$ cat sorted.txt | while read -r FILE; do cat "$FILE" >> output.txt; done
(All this assumes that your file names don't contain newline characters.)
EDIT:
I was aiming for clarity. If you want conciseness, you can absolutely condense it to something like:
find src/js -name '*.js'| ruby -ne 'BEGIN{f=[];}; f<<$_; END{f.sort!{|a,b| a.count("/") <=> b.count("/")}; f.each{|e| puts e}}' | xargs cat >> concatenated

Run script against all txt files in directory and sub directories - BASH

What I'm trying to do is something along the lines of (this is pseudocode):
for txt in $(some fancy command ./*.txt); do
    some command here "$txt"
done
You can use find:
find /path -type f -name "*.txt" | while read -r txt; do
echo "$txt"; # Do something else
done
Use the -exec option to find:
find /usr/share/wordlists/*/* -type f -name '*.txt' -exec yourScript {} \;
Try
find . | grep "\.txt$" | xargs -I {} script.sh {}
find returns all files in the directory, grep selects only the .txt files, and xargs passes each file as a parameter to script.sh.
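If the paths may contain spaces, a null-delimited loop is safer; a sketch, assuming bash and GNU find, with script.sh standing in for whatever you want to run:
find /path -type f -name '*.txt' -print0 | while IFS= read -r -d '' txt; do
    ./script.sh "$txt"    # your command here
done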

Bash: recursively copy and rename files

I have a lot of files whose names end with '_100.jpg'. They are spread across nested folders / sub-folders. Now I want a trick to recursively copy and rename all of them to have a suffix of '_crop.jpg'. Unfortunately I'm not familiar with bash scripting so I don't know the exact way to do this. I googled and tried the 'find' command with the '-exec' parameter but with no luck.
Please help me. Thanks.
find bar -iname "*_100.jpg" -printf 'mv %p %p\n' \
| sed 's/_100\.jpg$/_crop\.jpg/' \
| while read -r l; do eval "$l"; done
If you have bash 4:
shopt -s globstar
for file in **/*_100.jpg; do
    echo mv "$file" "${file/_100.jpg/_crop.jpg}"
done
or using find
find . -type f -iname "*_100.jpg" | while read -r FILE
do
    echo mv "${FILE}" "${FILE/_100.jpg/_crop.jpg}"
done
This uses a Perl script that you may have already on your system. It's sometimes called prename instead of rename:
find /dir/to/start -type f -iname "*_100.jpg" -exec rename 's/_100/_crop/' {} \;
You can make the regexes more robust if you need to protect filenames that have "_100" repeated or in parts of the name you don't want changed.
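Since the question asks to copy rather than move, a sketch of the same find-based approach using cp and shell parameter expansion (this assumes the suffix is literally '_100.jpg'; -name is used instead of -iname because the ${f%...} expansion is case-sensitive):
find . -type f -name '*_100.jpg' -exec sh -c '
    for f; do
        cp "$f" "${f%_100.jpg}_crop.jpg"    # copy next to the original
    done' sh {} +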
