Bash script to traverse a directory - bash

I have a directory with XML files and other directories. All other directories have XML files and subdirectories, etc.
I need to write a script (bash, probably) that runs java XMLBeautifier directory for each directory. Since my skills at bash scripting are a bit rubbish, I would really appreciate a bit of help.

If you need to get the directories, you can use:
$ find . -type d
just pipe this to your program like this:
$ find . -type d | xargs java XMLBeautifier
Another approach would be to get all the files with find and pipe that to your program like this:
$ find . -name "*.xml" | xargs java XMLBeautifier
This finds all .xml files in the current directory and, recursively, in all of its subdirectories, then xargs hands them as arguments to java XMLBeautifier.
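Note that a plain pipe into xargs splits on whitespace, so file names containing spaces would break the commands above. With GNU find and xargs, a NUL-delimited variant is safer. A minimal sketch, using echo as a stand-in for java XMLBeautifier (the paths are made up for the demo):

```shell
# demo tree with an awkward file name (hypothetical paths)
mkdir -p /tmp/xmlbeauty_demo/sub
touch "/tmp/xmlbeauty_demo/a file.xml" /tmp/xmlbeauty_demo/sub/b.xml

# -print0 / -0 delimit names with NUL bytes, so spaces survive the pipe;
# replace echo with: java XMLBeautifier
find /tmp/xmlbeauty_demo -name "*.xml" -print0 | xargs -0 echo
```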

find is an awesome tool ... however, if you are not sure of the file names but have a vague idea of what those XML files contain, then you can use grep.
For instance, if you know for sure that all your XML files contain the phrase "correct xml file" (change this phrase to whatever you feel is appropriate), then run the following at your command line ...
grep -IRw "correct xml file" /path/to/directory/*
The -I option treats binary files as if they contained no matching data, so only text files are searched
The -R option searches your directory recursively (and, unlike -r, follows symbolic links)
The -w option ensures the pattern matches only as a whole word, not as part of a longer word
Hope this helps!
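Adding -l, which prints only the names of matching files instead of the matching lines, the combined command might look like this (a runnable sketch against a throwaway demo directory; adjust the path and phrase to your case):

```shell
# demo directory with one matching and one non-matching file
mkdir -p /tmp/grepdemo
printf 'this is a correct xml file\n' > /tmp/grepdemo/good.xml
printf 'something else entirely\n'    > /tmp/grepdemo/bad.xml

# -l prints each matching file name once instead of every matching line
grep -IRlw "correct xml file" /tmp/grepdemo/
```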

Related

Given a text file with file names, how can I find files in subdirectories of the current directory?

I have a bunch of files with different names in different subdirectories. I created a txt file with those names, but I cannot make find work using the file. I have seen posts on problems creating the list, and on not using find (though I do not understand the reason). Suggestions? It is difficult for me to come up with an example because I do not know how to reproduce the directory structure.
The following are the names of the files (just in case there is a formatting problem)
AO-169
AO-170
AO-171
The best that I came up with is:
cat ExtendedList.txt | xargs -I {} find . -name {}
It obviously dies in the first directory that it finds.
I also tried
ta="AO-169 AO-170 AO-171"
find . -name $ta
but it complains find: AO-170: unknown primary or operator
If you are trying to ask "how can I find files with any of these names in subdirectories of the current directory", the answer to that would look something like
xargs printf -- '-o\0-name\0%s\0' <ExtendedList.txt |
xargs -r0 find . -false
The -false is just a cute way to let the list of actual predicates start with "... or".
If the list of names in ExtendedList.txt is large, this could fail if the second xargs decides to break it up between -o and -name.
The option -0 is not portable, but should work e.g. on Linux or wherever you have GNU xargs.
If you can guarantee that the list of strings in ExtendedList.txt does not contain any characters which are problematic to the shell (like single quotes), you could simply say
sed "s/.*/-o -name '&'/" ExtendedList.txt |
xargs -r find . -false
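If portability matters more than speed, a plain loop that runs find once per name sidesteps the xargs batching concern entirely. A sketch, using a throwaway tree with the file names from the question:

```shell
# demo tree and name list (paths are hypothetical)
mkdir -p /tmp/findloop/sub
touch /tmp/findloop/sub/AO-169 /tmp/findloop/sub/AO-170
printf 'AO-169\nAO-170\n' > /tmp/findloop/ExtendedList.txt

# read one name per line; quoting "$name" keeps odd characters safe
while IFS= read -r name; do
  find /tmp/findloop -name "$name"
done < /tmp/findloop/ExtendedList.txt
```

This traverses the tree once per name, so it is slower on big trees, but it works with any POSIX shell and find.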

How to move files based on file names in a.csv doc - macOS Terminal?

Terminal noob here, in need of a little help :)
I have a 98 row long filename list in a .csv file. For example:
name01; name03, etc.
I have an external hard drive with a lot of files in chaotic file
structure. BUT the file names are consistent, something like:
name01_xy; name01_zq; name02_xyz etc.
I would like to copy every file and directory from the external hard
drive which begins with the filename stored in the .csv file to my
computer.
So basically it's a search and copy based on a text file from an eHDD to my computer. I guess the easiest way to do is a Terminal command. Do you have any advice? Thanks in advance!
The task can be split into three: read search criteria from file; find files by criteria; copy found files. We discuss each one separately and combine them in a one-liner step-by-step:
Read search criteria from .csv file
Since your .csv file is pretty much just a text file with one criterion per line, it's pretty easy: just cat the file.
$ cat file.csv
bea001f001
bea003n001
bea007f005
bea008f006
bea009n003
Find files
We will use find. Example: you have a directory /Users/me/where/to/search and want to find all files in there whose names start with bea001f001:
$ find /Users/me/where/to/search -type f -name "bea001f001*"
If you want to find all files that end with bea001f001, move the star wildcard (zero-or-more) to the beginning of the search criterion:
$ find /Users/me/where/to/search -type f -name "*bea001f001"
Now you can already guess what the search criterion for all files containing the name bea001f001 would look like: "*bea001f001*".
We use -type f to tell find that we are interested only in finding files and not directories.
Combine reading and finding
We use xargs to pass each line of the file to find as a -name argument:
$ cat file.csv | xargs -I [] find /Users/me/where/to/search -type f -name "[]*"
/Users/me/where/to/search/bea001f001_xy
/Users/me/where/to/search/bea001f001_xyz
/Users/me/where/to/search/bea009n003_zq
Copy files
We use cp. It is pretty straightforward: cp file target copies file into target if target is a directory, or to a file named target otherwise.
Complete one-liner
We pass results from find to cp not by piping, but by using the -exec argument passed to find:
$ cat file.csv | xargs -I [] find /Users/me/where/to/search -type f -name "[]*" -exec cp {} /Users/me/where/to/copy \;
Sorry, this is my first post here. In response to the comments above, only the last file is selected, likely because the other lines end with a carriage return \r. If you first append the directory to each filename in the csv, you can perform the copy with the following command, which strips the \r.
cp `tr -d '\r' < file.csv` /your/target/directory
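Alternatively, the carriage returns can be stripped before the names ever reach find, so the one-liner above works without editing the .csv. A runnable sketch with a throwaway source tree (the bea001f001 name is borrowed from the answer above):

```shell
# demo source tree, destination, and a CSV with Windows line endings
mkdir -p /tmp/csvcopy/src /tmp/csvcopy/dst
touch /tmp/csvcopy/src/bea001f001_xy /tmp/csvcopy/src/unrelated
printf 'bea001f001\r\n' > /tmp/csvcopy/file.csv

# tr deletes the \r so each name matches; the rest mirrors the one-liner
tr -d '\r' < /tmp/csvcopy/file.csv |
  xargs -I [] find /tmp/csvcopy/src -type f -name "[]*" -exec cp {} /tmp/csvcopy/dst \;
```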

find/grep to list found specific file that contains specific string

I have a root directory that I need to run a find and/or grep command on to return a list of files that contain a specific string.
Here's an example of the file and directory set up. In reality, this root directory contains a lot of subdirectories that each have a lot of subdirectories and files, but this example, I hope, gets my point across.
From root, I need to go through each of the children directories, specifically into subdir/ and look through file.html for the string "example:". If a result is found, I'd like it to print out the full path to file.html, such as website_two/subdir/file.html.
I figured limiting the search to subdir/file.html will greatly increase the speed of this operation.
I'm not too knowledgeable with find and grep commands, but I have tried the following with no luck, but I honestly don't know how to troubleshoot it.
find . -name "file.html" -exec grep -HI "example:" {} \;
EDIT: I understand this may be marked as a duplicate, but I think my question is more along the lines of how can I tell the command to only search a specific file in a specific path, looping through all root-> level directories.
find ./ -type f -iname file.html -exec grep -l "example:" {} +
or
grep -Rl "example:" ./ | grep -iE "file\.html?$" will do the trick.
Quote from GNU Grep 2.25 man page:
-R, --dereference-recursive
Read all files under each directory, recursively. Follow all symbolic links, unlike -r.
-l, --files-with-matches
Suppress normal output; instead print the name of each input file from which output would normally have
been printed. The scanning will stop on the first match.
-i, --ignore-case
Ignore case distinctions in both the PATTERN and the input files.
-E, --extended-regexp
Interpret PATTERN as an extended regular expression.
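To honour the original idea of only looking at subdir/file.html under each top-level directory, find's -path test can restrict the traversal. A sketch against a throwaway tree shaped like the question's example:

```shell
# demo layout: two sites, only one containing the string
mkdir -p /tmp/rootdemo/website_one/subdir /tmp/rootdemo/website_two/subdir
printf '<p>nothing here</p>\n' > /tmp/rootdemo/website_one/subdir/file.html
printf '<p>example: yes</p>\n' > /tmp/rootdemo/website_two/subdir/file.html

# -path matches against the whole path, so only .../subdir/file.html is grepped
find /tmp/rootdemo -path '*/subdir/file.html' -exec grep -l "example:" {} +
```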

Assign the name of the Folder to the name of a file

I would like to paste a number of files into a single file named, for example, "output.txt".
Nevertheless, I would like to assign the name of the folder to the name of the output file, so that it will be "output_<name of the Folder>.txt".
I have thousands of folders, so the paste command will run in a for loop I'm able to write. Can anyone help me please?
The question is incredibly unclear. I'm going to interpret it to mean that you have a large number of directories that all contain a file named 'output.txt', and you want to move those files to a single directory with the original path embedded in the name. Assuming that the root of the directory tree containing all the files is /path/to/source and you want to move them to /path/to/destination:
find /path/to/source -name output.txt -exec sh -c 'd=$(dirname "$1" | tr / _); cp "$1" "/path/to/destination/output_$d.txt"' sh {} \;
Relative paths will work fine as well as absolute paths.
I too am unclear about what you want, but mktemp(1) has TEMPLATES which might help.
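If the question is taken literally (paste the files inside each folder into output_<folder>.txt), the loop might be sketched like this, assuming the folders sit in the current directory and hold .txt files (names below are made up):

```shell
# demo folder with two columns to paste (hypothetical names)
mkdir -p /tmp/pastedemo/folderA
printf 'a\nb\n' > /tmp/pastedemo/folderA/one.txt
printf '1\n2\n' > /tmp/pastedemo/folderA/two.txt

cd /tmp/pastedemo
for d in */; do
  # ${d%/} strips the trailing slash so the output name stays clean
  paste "$d"*.txt > "output_${d%/}.txt"
done
```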

How to search for a file which name I don't know in Bash?

I know the directory which contains the file, but I don't know the file's name. I do know its name's ending (for example, .txt), and I also know there's exactly one file with that ending in the directory.
I've tried the following code: file=grep -w $1/*.txt, where $1 is the directory address, which didn't help at all; neither did file=$1/*.txt.
Is there any way to achieve what I'm looking for?
If you know the precise directory and a wildcard which will not match any other files, you could run a loop which loops only once.
for file in "$1"/*.txt; do
: "$file" refers to your file here
done
Of course, in a lot of situations, you don't really need to know the precise file name. If you want to run grep on the file which matches your wildcard, just do that:
grep "xyzzy" "$1"/*.txt
What you're looking for is the find command.
You can use it like this (make sure you're in the directory you would like to search):
find . -iname '*.txt'
Use the command man find to learn more about it.
/usr/bin/find directory_name -iname "*.txt"
If you want to operate on the files found, you can even use find with -exec.
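For that last point, -exec runs a command on every file find locates. A small runnable sketch using grep as the command (the directory and search string are made up):

```shell
# demo directory with one .txt file
mkdir -p /tmp/findexec
printf 'xyzzy appears here\n' > /tmp/findexec/note.txt

# + batches the found files into as few grep invocations as possible;
# -H forces grep to print the file name even when given a single file
find /tmp/findexec -iname '*.txt' -exec grep -H "xyzzy" {} +
```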
