Piping output of bash function - bash

I'm trying to connect the inputs/outputs of two bash functions with a
pipe. Here is a complete program which illustrates my issue:
function print_info {
files=$(ls);
echo $files;
}
touch "file.pattern"
print_info | grep "pattern"
rm -f file.pattern
But this simply outputs a list of all files, not those that match
"pattern". Can anyone help me understand why?

The reason this isn't working is that in
echo $files;
the variable $files is unquoted, so it undergoes word splitting: the shell breaks its value into individual arguments at whitespace (including the newlines between file names), and echo prints those arguments separated by single spaces. The output is therefore a single line. Because that one line contains "pattern", grep prints all of it, which is the full list of files.
The least invasive fix is to use
echo "$files";
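The effect of the quotes can be seen in isolation. This is a minimal sketch, assuming a hypothetical two-line value in place of the real ls output; grep -c '' is used only to count the lines that reach grep.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the output of ls: two names, one per line.
files=$'file.pattern\nother.txt'

# Unquoted: word splitting turns the newline into a space, so grep sees one line.
unquoted_lines=$(echo $files | grep -c '')

# Quoted: the newline survives, so grep sees one line per file name.
quoted_lines=$(echo "$files" | grep -c '')
quoted_match=$(echo "$files" | grep "pattern")

echo "unquoted: $unquoted_lines line(s), quoted: $quoted_lines line(s)"
echo "match: $quoted_match"
```

With the quotes in place, grep filters line by line and only the matching name is printed.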

Don't parse the output of the ls command. If the goal is to search the contents of the files in the current directory, find can run grep on them directly:
find . -maxdepth 1 -type f -exec grep "pattern" {} \;
If you are getting the file names from a function, pass them to grep as arguments (note that this breaks on names containing whitespace):
grep "pattern" $(print_info)

Related

What does each line of this bash script do?

I found an old past paper question with content not covered in my course. I hope I don't get examined on it, but what does this bash script do? I know grep takes input and outputs the lines containing the pattern, echo just repeats its input, and cat just displays its input. But I have no idea what this does as a whole. Any help please?
#!/bin/bash
outputFile=$1
for file in $(find -name '*txt' | grep data)
do
echo $file >> $outputFile
cat $file >> $outputFile
done
Each line:
#!/bin/bash
Hash-bang the script to use bash
outputFile=$1
Set the variable named "outputFile" to the first parameter passed into the script. Running the script would look like bash myScript.sh "/some/file/to/output.txt"
for file in $(find -name '*txt' | grep data)
do
Loop through every file in this directory and its subdirectories, looking for files whose path ends with the characters "txt" and contains the characters "data" somewhere. For each file found, the for loop runs one iteration with the file name stored in the variable "file".
echo $file >> $outputFile
Echo out/print the file name stored in the variable "file" to the outputFile
cat $file >> $outputFile
Take the contents of the file and stick it in the outputFile.
done
End the For Loop
There are some issues with this script though. If $outputFile or $file has a space in its name or path, the script will fail. It's good practice to put double quotes around variables, like:
cat "$file" >> "$outputFile"
#!/bin/bash
The shebang. If this script is executable and invoked directly, as in ./this_script, or found in the PATH, it will be run with /bin/bash.
outputFile=$1
Assign the first argument to the name outputFile.
++ find -name '*txt'
Recursively list all files with a name ending in "txt". It would be more standard to include the path and write this as find . -name '*.txt'.
+ … | grep data
Filter the previous list of file names. Only list those containing the string "data" in their names. This pipe could be eliminated by writing find . -name '*data*txt'.
for file in $(find -name '*txt' | grep data)
For every word in the output of the find | grep pipeline, assign that word to the name file and run the loop. This can break down if any of the found names have whitespace or glob characters in them. It would be better to use find's native -exec flag to handle this.
echo $file >> $outputFile
Append the expansion of the variable "file" to a new or existing file at the path found by expanding $outputFile. If the former expansion starts with a dash, echo could treat it as an option. If the latter expansion has whitespace or a glob character in it, this may cause an "ambiguous redirect" error. It would be better to quote the expansions, and use printf to avoid the option edge-case of echo, as in printf '%s\n' "$file" >> "$outputFile".
cat $file >> $outputFile
Append the contents of the file found at the expansion of the variable "file" to the path found by expanding $outputFile, or cause another ambiguous redirect error. It would be better to quote the expansions, like cat "$file" >> "$outputFile".
Assuming that none of the aforementioned expansion edge-cases were expected, it would be better to write this entire script like this:
find . -name '*data*txt' -print -exec cat {} \; >> "$1"
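To see what the one-liner produces, here is a small sketch that runs it against a scratch directory. The directory, file names, and contents are hypothetical, and find is pointed at the scratch directory instead of . so the example does not depend on where it is run.

```bash
#!/usr/bin/env bash
# Scratch data (hypothetical): one matching file with a space in its name,
# and one file that should be skipped.
workdir=$(mktemp -d)
out=$(mktemp)
printf 'one\n' > "$workdir/my data.txt"
printf 'two\n' > "$workdir/other.txt"

# Same shape as the one-liner above: print each matching name, then its contents.
find "$workdir" -name '*data*txt' -print -exec cat {} \; >> "$out"

cat "$out"
```

The name containing a space is handled correctly because find passes it to cat as a single argument; no word splitting is involved.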

How to use >> inside find -exec statement?

From time to time I have to append some text at the end of a bunch of files. I would normally find these files with find.
I've tried
find . -type f -name "test" -exec tail -n 2 /source.txt >> {} \;
This however results in writing the last two lines from /source.txt to a file named {} however many times a file was found matching the search criteria.
I guess I have to escape >> somehow but so far I wasn't successful.
Any help would be greatly appreciated.
-exec only takes one command (with optional arguments) and you can't use any bash operators in it.
So you need to wrap it in a bash -c '...' block, which executes everything between '...' in a new bash shell.
find . -type f -name "test" -exec bash -c 'tail -n 2 /source.txt >> "$1"' bash {} \;
Note: Everything after '...' is passed as regular arguments, except they start at $0 instead of $1. So the bash after ' is used as a placeholder to match how you would expect arguments and error processing to work in a regular shell, i.e. $1 is the first argument and errors generally start with bash or something meaningful
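The argument numbering and the append can both be tried out safely. This is a sketch under assumed names: the scratch directory, the source file, and the target file are all hypothetical, and the source path is passed as an extra argument rather than hard-coded.

```bash
#!/usr/bin/env bash
# Arguments after the command string fill $0, $1, ... in that order.
bash -c 'echo "zero=$0 one=$1"' bash first    # prints: zero=bash one=first

# Scratch setup (hypothetical names).
workdir=$(mktemp -d)
printf 'line1\nline2\nline3\n' > "$workdir/source.txt"
mkdir "$workdir/sub dir"
printf 'existing\n' > "$workdir/sub dir/test"

# $1 is the source file and $2 is the file find matched; the >> is
# interpreted by the inner bash, not by the shell that started find.
find "$workdir" -type f -name "test" \
  -exec bash -c 'tail -n 2 "$1" >> "$2"' bash "$workdir/source.txt" {} \;

cat "$workdir/sub dir/test"
```

Passing the file names as arguments instead of embedding {} in the command string keeps names with spaces or quotes from being re-parsed as shell code.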
If execution time is an issue, consider doing something like export variable="$(tail -n 2 /source.txt)" and using "$variable" in the -exec. This will also always write the same thing, unlike using tail in -exec, which could change if the file changes. Alternatively, you can use something like -exec ... + and pair it with tee to write to many files at once.
A more efficient alternative (assuming bash 4):
shopt -s globstar
to_augment=( **/test )
tail -n 2 /source.txt | tee -a "${to_augment[@]}" > /dev/null
First, you create an array with all the file names, using a simple pattern that should be equivalent to your call to find. Then, use tee to append the desired lines to all those files at once.
If you have more criteria for the find command, you can still use it; this version is not foolproof, as it assumes no filename contains a newline, but fixing that is best left to another question.
while read -r fname; do
to_augment+=( "$fname" )
done < <(find ...)
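One way to lift the newline restriction, shown here only as a sketch, is to have find emit NUL-terminated names with -print0 and read them with read -d ''. This swaps in a different delimiter than the loop above uses; the scratch directory and file names are hypothetical.

```bash
#!/usr/bin/env bash
# Scratch setup (hypothetical names).
workdir=$(mktemp -d)
printf 'line1\nline2\nline3\n' > "$workdir/source.txt"
mkdir "$workdir/x" "$workdir/y z"
: > "$workdir/x/test"
: > "$workdir/y z/test"

# Collect the matches into an array, one NUL-terminated name at a time.
to_augment=()
while IFS= read -r -d '' fname; do
  to_augment+=( "$fname" )
done < <(find "$workdir" -type f -name test -print0)

# Append the same two lines to every collected file with a single tee.
tail -n 2 "$workdir/source.txt" | tee -a "${to_augment[@]}" > /dev/null

echo "augmented ${#to_augment[@]} file(s)"
```

The loop reads from a process substitution rather than a pipe so that the array is filled in the current shell and is still available when tee runs.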

convert a file path into string

I'm getting an error while trying to replace a string in a directory path with another string:
sed: Error trying to read from {directory_path}: It's a directory
The shell script
#!/bin/sh
R2K_SOURCE="source/"
R2K_PROCESSED="processed/"
R2K_TEMP_DIR=""
echo " Procesando archivos desde $R2K_SOURCE "
for file in $(find $R2K_SOURCE )
do
if [ -d $file ]
then
R2K_TEMP_DIR=$( sed 's/"$R2K_SOURCE"/"$R2K_PROCESSED"/g' $file )
echo "directorio $R2K_TEMP_DIR"
else
# some code executes
:
fi
done
# find $R2K_PROCCESED -type f -size -200c -delete
I understand that the error is in this line:
R2K_TEMP_DIR=$( sed 's/"$R2K_SOURCE"/"$R2K_PROCESSED"/g' $file )
but I don't know how to tell sh to treat the $file variable as a string and not as a directory.
If you want to replace part of a path name, you can echo the path name and pipe it into sed.
You must also put the sed command in double quotes instead of single quotes, so that the shell expands the variables, and change the separator of the 's' command, like this:
R2K_TEMP_DIR=$(echo "$file" | sed "s:$R2K_SOURCE:$R2K_PROCESSED:g")
With ':' as the separator, the slashes in the variable values no longer clash with the syntax of the 's' command.
Update:
Even better is to remove the useless echo and use a here-string instead (here-strings are not part of POSIX sh, so this needs bash or a similar shell):
R2K_TEMP_DIR=$(sed "s:$R2K_SOURCE:$R2K_PROCESSED:g" <<< "$file")
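The difference the quote style makes can be checked with a single hypothetical path. This is a minimal sketch; the sample path is an assumption, and printf is used to feed sed so the example does not depend on here-strings.

```bash
#!/usr/bin/env bash
R2K_SOURCE="source/"
R2K_PROCESSED="processed/"
file="source/foo/bar"   # hypothetical sample path

# Single quotes: the shell does not expand the variables, so sed looks for
# the literal text $R2K_SOURCE and leaves the path unchanged.
single=$(printf '%s\n' "$file" | sed 's:$R2K_SOURCE:$R2K_PROCESSED:g')

# Double quotes: the shell substitutes the values before sed runs.
double=$(printf '%s\n' "$file" | sed "s:$R2K_SOURCE:$R2K_PROCESSED:g")

echo "single quotes: $single"
echo "double quotes: $double"
```

Note that the variable values are interpreted by sed as a regular expression and a replacement string, so values containing ':' or regex metacharacters would need extra care.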
First, don't use:
for item in $(find ...)
because you might overload the command line. Besides, the for loop cannot start until the process in $(...) finishes. Instead:
find ... | while read item
You also need to watch out for funky file names. The for loop will choke on every file with whitespace in its name. The find | while version copes with spaces inside a name, but it still mangles names with leading or trailing whitespace, backslashes, or newlines. Better:
find ... -print0 | while IFS= read -r -d '' item
This will put nulls between file names, and read will break on those nulls. This way, files with spaces, tabs, new lines, or anything else that could cause problems can be read without problems.
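A small sketch makes the difference measurable. It assumes a scratch directory with three hypothetical files, one of which has a newline in its name, and compares a line-based count with a NUL-delimited loop.

```bash
#!/usr/bin/env bash
# Scratch setup: three files with awkward (hypothetical) names.
workdir=$(mktemp -d)
: > "$workdir/plain"
: > "$workdir/two  spaces"
: > "$workdir/new"$'\n'"line"

# Line-based view: the embedded newline makes one file look like two entries.
lines=$(find "$workdir" -type f | grep -c '')

# NUL-delimited view: each file is read as exactly one item.
count=0
while IFS= read -r -d '' item; do
  count=$((count + 1))
  printf 'found: %q\n' "$item"
done < <(find "$workdir" -type f -print0)

echo "lines=$lines count=$count"
```

The loop reads from a process substitution here so that count is updated in the current shell; with a plain pipe, the while loop runs in a subshell and the counter would be lost.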
Your sed line is:
R2K_TEMP_DIR=$( sed 's/"$R2K_SOURCE"/"$R2K_PROCESSED"/g' $file )
This attempts to edit the file named by $file, which is a directory. What you want is to munge the directory name itself, so you have to pipe the name into sed. The sed command must also be in double quotes so that the shell expands the variables, and because their values contain slashes the 's' command needs a different separator:
R2K_TEMP_DIR=$(echo "$file" | sed "s:$R2K_SOURCE:$R2K_PROCESSED:g")
However, you might be better off using the shell's parameter expansion to rewrite the variable.
Basically, you have a directory called source/ and all of the files you're looking for are under that directory. You simply want to change:
source/foo/bar
to
processed/foo/bar
You could do something like this: ${file#source/}. The # means the pattern is matched against the left side (the start) of the value, and the shortest match of the glob pattern after the # is removed. Check the bash manpage and look under Parameter Expansion.
Thus, you could do something like this:
#!/bin/bash
R2K_SOURCE="source/"
R2K_PROCESSED="processed/"
R2K_TEMP_DIR=""
echo " Procesando archivos desde $R2K_SOURCE "
find "$R2K_SOURCE" -print0 | while IFS= read -r -d '' file
do
if [ -d "$file" ]
then
R2K_TEMP_DIR="processed/${file#source/}"
echo "directorio $R2K_TEMP_DIR"
else
# some code executes
:
fi
done
R2K_TEMP_DIR="processed/${file#source/}" removes the source/ from the start of $file and you merely prepend processed/ in its place.
Even better, it's way more efficient. In your original script, the $(..) creates another shell process to run your echo, which then pipes into another process running sed (assuming you use loentar's solution). With parameter expansion no subprocesses are started; the whole modification of your directory name happens inside the shell.
By the way, this should also work:
R2K_TEMP_DIR="$R2K_PROCESSED/${file#$R2K_SOURCE}"
I just didn't test that.
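Both forms are easy to check on a single value. This is a minimal sketch, assuming a hypothetical sample path; the inner quotes in ${file#"$R2K_SOURCE"} make the variable's value match literally rather than as a glob pattern.

```bash
#!/usr/bin/env bash
R2K_SOURCE="source/"
R2K_PROCESSED="processed/"
file="source/foo/bar"   # hypothetical sample path

# Hard-coded prefix, as in the script above.
fixed="processed/${file#source/}"

# Variable-driven form. R2K_PROCESSED already ends in a slash, so adding
# another "/" yields a doubled slash, which path lookups treat like a single one.
with_slash="$R2K_PROCESSED/${file#"$R2K_SOURCE"}"
without_slash="${R2K_PROCESSED}${file#"$R2K_SOURCE"}"

echo "$fixed"          # processed/foo/bar
echo "$with_slash"     # processed//foo/bar
echo "$without_slash"  # processed/foo/bar
```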

How can I read a list of filenames from a file in bash?

I'm trying to write a bash script that will process a list of files whose names are stored one per line in an input file, something the likes of
find . -type f -mtime +15 > /tmp/filelist.txt
for F in $(cat /tmp/filelist.txt) ; do
...
done;
My problem is that filenames in filelist.txt may contain spaces, so the snippet above will expand the line
my text file.txt
to three different filenames, my, text and file.txt. How can I fix that?
Use read:
while IFS= read -r F ; do
echo "$F"
done </tmp/filelist.txt
Alternatively use IFS to change how the shell separates your list:
OLDIFS=$IFS
IFS="
"
for F in $(cat /tmp/filelist.txt) ; do
echo $F
done
IFS=$OLDIFS
Alternatively (as suggested by @tangens), convert the body of your loop into a separate script, then use find's -exec option to run it for each file found directly.
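The difference between the word-splitting for loop and read can be seen with a two-line list. This is a sketch using a hypothetical temporary list file rather than /tmp/filelist.txt.

```bash
#!/usr/bin/env bash
# Hypothetical list file: two names, one of them containing spaces.
list=$(mktemp)
printf '%s\n' 'my text file.txt' 'plain.txt' > "$list"

# Word splitting, as in the original for loop: every space starts a new word.
words=( $(cat "$list") )

# read: one name per line, spaces and backslashes preserved.
names=()
while IFS= read -r F; do
  names+=( "$F" )
done < "$list"

echo "for loop would see ${#words[@]} items"
echo "read sees ${#names[@]} items; first is: ${names[0]}"
```

Setting IFS to empty for the read keeps leading and trailing whitespace in a name, and -r stops backslashes from being treated as escapes.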
You can do this without a temporary file using process substitution:
while read F
do
...
done < <(find . -type f -mtime +15)
Use while read:
cat "$FILE" | while read -r line
do
echo "$line"
done
You can also use a redirect (done < "$FILE") instead of piping from cat.
You could use the -exec parameter of find and use the file names directly:
find . -type f -mtime +15 -exec <your command here> {} \;
The {} is a placeholder for the file name.
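How {} is filled in depends on how the -exec clause is terminated. The sketch below, using a hypothetical scratch directory, has the invoked command report how many arguments it received: with \; the command runs once per file, while with + find passes as many names as fit into one invocation.

```bash
#!/usr/bin/env bash
# Scratch setup (hypothetical names), including a name with spaces.
workdir=$(mktemp -d)
: > "$workdir/my text file.txt"
: > "$workdir/b.txt"

# \; runs the command once per file: each run receives exactly one name.
per_file=$(find "$workdir" -type f -exec sh -c 'echo "$#"' sh {} \;)

# + batches the names: a single run receives both of them.
batched=$(find "$workdir" -type f -exec sh -c 'echo "$#"' sh {} +)

printf 'per file:\n%s\n' "$per_file"
printf 'batched: %s\n' "$batched"
```

In both forms each name arrives as one intact argument, so the space in the first file name causes no trouble.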
Pipe your find command straight into a while read loop:
find . -type f -mtime +15 | while read -r line
do
printf 'do something with %s\n' "$line"
done
I'm not a bash expert by any means (I usually write my scripts in Ruby or Python to be cross-platform), but I would use a regular expression to escape the spaces in each line before you process it.
For Bash Regex:
http://www.linuxjournal.com/node/1006996
In a similar situation in Ruby ( processing a csv file, and cleaning up each line before using it):
File.foreach(csv_file_name) do |line|
clean_line = line.gsub(/( )/, '\ ')
#this finds the space in your file name and escapes it
#do more stuff here
end
I believe you can skip the temporary file entirely and just directly iterate over the results of find, i.e.:
for F in $(find . -type f -mtime +15) ; do
...
done;
No guarantees that my syntax is correct but I'm pretty sure the concept works.
Edit: If you really do have to process the file with a list of filenames and can't simply combine the commands as I did above, then you can change the value of the IFS variable--it stands for Internal Field Separator--to change how bash determines fields. By default it is set to whitespace, so a newline, space, or tab will begin a new field. If you set it to contain only a newline, then you can iterate over the file just as you did before.

calling grep from a bash script

I'm new to bash scripts (and the *nix shell altogether) but I'm trying to write this script to make grepping a codebase easier.
I have written this
#!/bin/bash
args=("$@");
for arg in args
grep arg * */* */*/* */*/*/* */*/*/*/*;
done
when I try to run it, this is what happens:
~/Work/richmond $ ./f.sh "\$_REQUEST\['a'\]"
./f.sh: line 4: syntax error near unexpected token `grep'
./f.sh: line 4: ` grep arg * */* */*/* */*/*/* */*/*/*/*;'
~/Work/richmond $
How do I do this properly?
And, I think a more important question is, how can I make grep recurse through subdirectories properly like this?
Any other tips and/or pitfalls with shell scripting and using bash in general would also be appreciated.
The syntax error is because you're missing do. As for searching recursively if your grep has the -R option you would do:
#!/bin/bash
for arg in "$@"; do
grep -R "$arg" *
done
Otherwise you could use find:
#!/bin/bash
for arg in "$@"; do
find . -exec grep "$arg" {} +
done
In the latter example, find will execute grep and replace the {} braces with the file names it finds, starting in the current directory (.).
(Notice that I also changed arg to "$arg". You need the dollar sign to get the variable's value, and the quotes tell the shell to treat its value as one big word, even if $arg contains spaces or newlines.)
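The same quoting rule applies to "$@" itself. This is a minimal sketch with two hypothetical helper functions, count_args and demo, that reports how many arguments survive with and without the quotes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print how many arguments it was given.
count_args() { echo "$#"; }

demo() {
  quoted=$(count_args "$@")   # each original argument stays one word
  unquoted=$(count_args $@)   # arguments are re-split at whitespace
}

demo "two words" single

echo "quoted: $quoted, unquoted: $unquoted"
```

With the quotes the two arguments stay two; without them "two words" is split and three arguments arrive.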
On recursive grepping:
Depending on your grep version, you can pass -R to your grep command to have it search Recursively (in subdirectories).
The best solution is stated above, but try putting your statement in back ticks:
`grep ...`
You should use 'find' plus 'xargs' to do the file searching.
for arg in "$@"
do
find . -type f -print0 | xargs -0 grep "$arg" /dev/null
done
The '-print0' and '-0' options assume you're using GNU find and xargs (or other versions that support them) and ensure that the script works even if there are spaces or other unexpected characters in your path names. Using xargs like this is more efficient than having find execute grep once for each file; the /dev/null appears in the argument list so grep always reports the name of the file containing the match.
You might decide to simplify life - perhaps - by combining all the searches into one using either egrep or grep -E. An optimization would be to capture the output from find once and then feed that to xargs on each iteration.
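One way to combine the searches, sketched here, is to give grep several -e options instead of building an alternation for grep -E, so each argument is still treated as its own pattern. The function name search_all and the scratch files are hypothetical, and find's -exec ... + is used in place of xargs.

```bash
#!/usr/bin/env bash
# Hypothetical helper: search every file under a directory for any of the
# given patterns, using one grep invocation per batch of files.
search_all() {
  local dir=$1
  shift
  local patterns=() p
  for p in "$@"; do
    patterns+=( -e "$p" )
  done
  # /dev/null makes grep print file names even when a batch holds one file.
  find "$dir" -type f -exec grep "${patterns[@]}" /dev/null {} +
}

# Scratch data (hypothetical).
workdir=$(mktemp -d)
mkdir "$workdir/sub dir"
printf 'alpha\nbeta\n' > "$workdir/a.txt"
printf 'gamma\n' > "$workdir/sub dir/b.txt"

matches=$(search_all "$workdir" alpha gamma)
echo "$matches"
```

This walks the tree once regardless of how many patterns are given, instead of once per pattern as in the loop above.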
Have a look at the findrepo script which may give you some pointers
If you just want a better grep and don't want to do anything yourself, use ack, which you can get at http://betterthangrep.com/.
