After searching online I was able to figure out how to read a file line by line:
while read p; do
echo $p
done < file.txt
But I would actually like to modify the line in the file.
For example:
while read p; do
if condition
then
echo $p | perl -i -pe 's/a/b/'
fi
done < file.txt
However this doesn't actually modify the file.
Update A far better version of bash code added. Thanks to Charles Duffy for comments.
Your Perl one-liner takes a line piped into it by echo $p |, getting its standard input that way. It doesn't do anything with the file itself, so the -i flag has no effect. The -p makes it print to the standard output stream. So that whole line, echo ..., doesn't touch the file.
You can redirect the output to a new file and then move that to overwrite file.txt. Here is a simple-minded example that appends each line to a new file. For better bash code see the update below.
while read p; do
if condition
then
echo $p | perl -pe 's/a/b/' >> temp_out.txt
else
echo $p >> temp_out.txt
fi
done < file.txt
mv temp_out.txt file.txt
We have to add the else branch so that all unmodified lines are also appended. Note that in general we cannot have just some lines replaced; the whole file has to be re-written.
If this is all that the script does, you can do it with a very simple one-liner; see the end. If more work is done, you could also put it all in a Perl script, but I take it that there may be other good reasons for a bash script.
Update A much better version of the above. See read and echo in Builtins in Bash manual
Appending to the file for each line opens it anew every time; there is no need for that.
Just redirect at the end of the loop, much like it is done in the terminal
read uses backslash for escaping, removing it from input. Turn that off with -r
Leading and trailing whitespace is removed as part of breaking the line into words. Suppress this by clearing the variable that controls which characters are used for splitting, with IFS=
The echo $p can do all kinds of unintended things. A formatted print is better, printf '%s\n' "$p", or at least echo "$p"
With this,
while IFS= read -r p; do
if condition
then
echo "$p" | perl -pe 's/a/b/'
else
echo "$p"
fi
done < file.txt > temp_out.txt
mv temp_out.txt file.txt
Finally, if the sole purpose of the Perl one-liner were to run a simple substitution, it is much better to simply do that in the shell itself than to have a pipeline and run a whole new process for each line.
echo "${p//a/b}"
Thanks to Charles Duffy for raising all these points in comments.
A few comments on Perl one-liners. See documentation at perlrun.
The command perl -e '...' executes any valid Perl code between the ''. When we add the -n or -p switch it also reads standard input and runs that code on one line of it at a time, where -p additionally prints out each line after it is processed. The standard input can be supplied to it from a file,
perl -pe '...' input.txt
in which case adding the -i flag will result in the file being changed in place. Or, the input can be piped into it, for example
echo "input text" | perl -pe '...'
in which case the processed line is printed to standard output. This can be redirected to a file, as in the answer above.
To make changes to a given file a line at a time you only need this on the command line
perl -i -pe 's/a/b/' file.txt
If there is more work to do then it may well be better to put it in a script, of course. In this case the one-liner can be a command in the bash script as well, replacing all that code above (unless some bash-specific functionality is preferred for processing lines).
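For instance, if the condition can be expressed as a pattern match on the line itself, the whole bash loop above collapses into one in-place edit. A sketch, where /pattern/ stands in for your actual condition:
perl -i -pe 's/a/b/ if /pattern/' file.txt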
Related
I'm having an issue with something that seems to be a rookie error, but I can't find a solution.
I have a bash script : log.sh
which is :
#!/bin/bash
echo $1 >> log_out.txt
And with a file of filenames (taken from the output of "find"; the file is named filenames.txt and contains 53 lines of absolute paths) I try:
./log.sh $(cat filenames.txt)
the only output I have in the log_out.txt is the first line.
I need each line to be processed separately, as I need to pass each one as an argument to a pipeline of two programs.
I checked for :
my lines being terminated with \n
using a simple echo without writing to a file
all sorts of cat filenames.txt or (< filenames.txt) variants found on the internet
I'm sure it's a very dumb thing, but I can't figure out why I can't iterate over more than one line :(
Thanks
It is because ./log.sh $(cat filenames.txt) word-splits the file contents and passes them as separate arguments, while your script only ever uses $1, the first of them. Read the file line by line instead:
while IFS= read -r line; do
echo "$line";
done < filenames.txt
Edit according to: https://mywiki.wooledge.org/DontReadLinesWithFor
Edit#2:
To preserve leading and trailing whitespace in the result, set IFS to the null string.
You could simplify more and skip using explicit variable and use the default $REPLY
Source: http://wiki.bash-hackers.org/commands/builtin/read
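To actually hand each line to your script (or your pipeline), call it inside the loop instead of echo. A sketch, assuming log.sh is in the current directory:
while IFS= read -r line; do
./log.sh "$line"
done < filenames.txt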
You need to quote the command substitution. Otherwise $1 will just be the first word in the file.
./log.sh "$(cat filenames.txt)"
You should also quote the variable in the script, otherwise all the newlines will be converted to spaces.
echo "$1" >> log_out.txt
If you want to process each word separately, you can leave out the quotes
./log.sh $(cat filenames.txt)
and then use a loop in the script:
#!/bin/bash
for word in "$@"
do
echo "$word"
done >> log_out.txt
Note that this solution only works correctly when the file has one word per line and there are no wildcards in the words. See mywiki.wooledge.org/DontReadLinesWithFor for why this doesn't generalize to more complex lines.
You can also iterate over the arguments (one per line, as long as the paths contain no spaces or wildcards):
#!/bin/bash
for i in "$@"
do
echo "$i" >> log_out.txt
done
I'm new to UNIX and have this really simple problem:
I have a text-file (input.txt) containing a string in each line. It looks like this:
House
Monkey
Car
And inside my shell script I need to read this input file line by line to get to a variable like this:
things="House,Monkey,Car"
I know this sounds easy, but I just couldn't find any simple solution for this. My closest attempt so far:
#!/bin/sh
things=""
addToString() {
things="${things},$1"
}
while read line; do addToString $line ;done <input.txt
echo $things
But this won't work. According to my Google research I thought the while loop would create a new subshell, but I was wrong there (see the comment section). Nevertheless, the variable "things" was still not available in the echo later on. (I cannot just put the echo inside the while loop, because I need to work with that string later on.)
Could you please help me out here? Any help will be appreciated, thank you!
What you proposed works fine! I've only made two changes here: Adding missing quotes, and handling the empty-string case.
things=""
addToString() {
if [ -n "$things" ]; then
things="${things},$1"
else
things="$1"
fi
}
while read -r line; do addToString "$line"; done <input.txt
echo "$things"
If you were piping into while read, this would create a subshell, and that would eat your variables. You aren't piping -- you're doing a <input.txt redirection. No subshell, code works without changes.
That said, there are better ways to read lists of items into shell variables. On any version of bash after 3.0:
IFS=$'\n' read -r -d '' -a things <input.txt # read into an array
printf -v things_str '%s,' "${things[@]}" # write array to a comma-separated string
echo "${things_str%,}" # print that string w/o trailing comma
...on bash 4, that first line can be:
readarray -t things <input.txt # read into an array
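Once the lines are in an array, another compact way to join them is to set IFS to a comma inside a subshell, so the change does not leak out. A sketch:
( IFS=,; printf '%s\n' "${things[*]}" ) # prints House,Monkey,Car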
This is not a shell solution, but the truth is that solutions in pure shell are often excessively long and verbose. So e.g. to do string processing it is better to use special tools that are part of the “default” Unix environment.
sed ':b;N;$!bb;s/\n/,/g' < input.txt
If you want to omit empty lines, then:
sed ':b;N;$!bb;s/\n\n*/,/g' < input.txt
Speaking about your solution, it should work, but you should really always use quotes where applicable. E.g. this works for me:
things=""
while read line; do things="$things,$line"; done < input.txt
echo "$things"
(Of course, there is an issue with this code, as it outputs a leading comma. If you want to skip empty lines, just add an if check.)
This might/might not work, depending on the shell you are using. On my Ubuntu 14.04/x64, it works with both bash and dash.
To make it more reliable and independent from the shell's behavior, you can try to put the whole block into a subshell explicitly, using the (). For example:
(
things=""
addToString() {
things="${things},$1"
}
while read line; do addToString $line ;done
echo $things
) < input.txt
P.S. You can use something like this to avoid the initial comma. Without bash extensions (using short-circuit logical operators instead of the if for shortness):
test -z "$things" && things="$1" || things="${things},${1}"
Or with bash extensions:
things="${things}${things:+,}${1}"
P.P.S. How I would have done it:
tr '\n' ',' < input.txt | sed 's!,$!\n!'
You can do this too:
#!/bin/bash
while read -r i
do
[[ $things == "" ]] && things="$i" || things="$things","$i"
done < <(grep . input.txt)
echo "$things"
Output:
House,Monkey,Car
N.B:
Used grep to deal with empty lines and the possibility that there is no newline at the end of the file. (A plain while read will fail to read the last line if there is no newline at the end of the file.)
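An alternative that avoids the extra grep process is to make read also accept a last line that has no trailing newline. A sketch of the common || [ -n "$line" ] idiom:
while IFS= read -r i || [ -n "$i" ]; do
[[ -z $i ]] && continue # skip empty lines, as grep . did
[[ $things == "" ]] && things="$i" || things="$things","$i"
done < input.txt
echo "$things"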
Just new to Bash scripting and programming in general. I would like to automate the deletion of the first line of multiple .data files in a directory. My script is as follows:
#!/bin/bash
for f in *.data ;
do tail -n +2 $f | echo "processing $f";
done
I get the echo message but when I cat the file nothing has changed. Any ideas?
Thanks in advance
I get the echo message but when I cat the file nothing has changed.
Because simply tailing wouldn't change the file.
You could use sed to modify the files in-place with the first line excluded. Saying
sed -i '1d' *.data
would delete the first line from all .data files.
EDIT: BSD sed (on OSX) would expect an argument to -i, so you can either specify an extension to backup older files, or to edit the files in-place, say:
sed -i '' '1d' *.data
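A form that should work with both GNU and BSD sed is to supply a backup suffix. A sketch, where .bak is an arbitrary suffix you can remove afterwards:
sed -i.bak '1d' *.data
rm -- *.data.bak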
You are not changing the file itself. By using tail you simply read the file and print parts of it to stdout (the terminal), you have to redirect that output to a temporary file and then overwrite the original file with the temporary one.
#!/usr/bin/env bash
for f in *.data; do
tail -n +2 "$f" > "${f}".tmp && mv "${f}".tmp "$f"
echo "Processing $f"
done
Moreover it's not clear what you'd like to achieve with the echo command. Why do you use a pipe (|) there?
sed will give you an easier way to achieve this. See devnull's answer.
I'd do it this way:
#!/usr/bin/env bash
set -eu
for f in *.data; do
echo "processing $f"
tail -n +2 "$f" | sponge "$f"
done
If you don't have sponge you can get it in the moreutils package.
The quotes around the filename are important--they will make it work with filenames containing spaces. And the env thing at the top is so that people can set which Bash interpreter they want to use via their PATH, in case someone has a non-default one. The set -eu makes Bash exit if an error occurs, which is usually safer.
ed is the standard editor:
shopt -s nullglob
for f in *.data; do
echo "Processing file \`$f'"
ed -s -- "$f" < <( printf '%s\n' "1d" "wq" )
done
The shopt -s nullglob is here just because you should always use this when using globs, especially in a script: it will make globs expand to nothing if there are no matches; you don't want to run commands with uncontrolled arguments.
Next, we loop on all your files, and use ed with the commands:
1: go to first line
d: delete that line
wq: write and quit
Options for ed:
-s: tells ed to shut up! we don't want ed to print its junk on our screen.
--: end of options: this will make your script much more robust in case a file name starts with a hyphen, which would otherwise confuse ed into trying to process it as an option. With --, ed knows that there are no more options after that and will happily process any file, even one starting with a hyphen.
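If the process substitution looks opaque, the same two ed commands can also be supplied with a here-document. A sketch of the equivalent loop body:
ed -s -- "$f" <<'EOF'
1d
wq
EOF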
I have a text file, it contains a single word on each line.
I need a loop in bash to read each line, then perform a command each time it reads a line, using the input from that line as part of the command.
I am just not sure of the proper syntax to do this in bash. If anyone can help, it would be great. I need to use the line obtained from the text file as a parameter to call another function. The loop should stop when there are no more lines in the text file.
Pseudo code:
Read testfile.txt.
For each in testfile.txt
{
some_function linefromtestfile
}
How about:
while read line
do
echo "$line"
# or some_function "$line"
done < testfile.txt
As an alternative, using a file descriptor (#4 in this case):
file='testfile.txt'
exec 4<$file
while read -r -u4 t ; do
echo "$t"
done
Don't use cat! In a loop, cat is almost always wrong, e.g.
cat testfile.txt | while read -r line
do
# do something with "$line" here
done
and people might start to throw a UUoCA (Useless Use of Cat Award) at you.
while read line
do
nikto -Tuning x 1 6 -h $line -Format html -o NiktoSubdomainScans.html
done < testfile.txt
Tried this to automate a nikto scan of a list of domains after changing from the cat approach. It still just read the first line and ignored everything else.
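One likely culprit (an assumption, not verified here) is that the command inside the loop reads from standard input itself and so swallows the rest of testfile.txt; redirecting its stdin away from the loop's input is the usual fix. A sketch with the same options you used:
while IFS= read -r line; do
nikto -Tuning x 1 6 -h "$line" -Format html -o NiktoSubdomainScans.html < /dev/null
done < testfile.txt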
Given a text file with multiple lines, I would like to iterate over each line in a Bash script. I had attempted to use cut, but cut does not accept \n (newline) as a delimiter.
This is an example of the file I am working with:
one
two
three
four
Does anyone know how I can loop through each line of this text file in Bash?
I found myself with the same problem; this works for me:
cat file.cut | cut -d$'\n' -f1
Or:
cut -d$'\n' -f1 file.cut
Use cat for concatenating or displaying. No need for it here.
file="/path/to/file"
while read line; do
echo "${line}"
done < "${file}"
Simply use:
echo -n `cut ...`
This suppresses the \n at the end
cat FILE|while read line; do # 'line' is the variable name
echo "$line" # do something here
done
or (see comment):
while read line; do # 'line' is the variable name
echo "$line" # do something here
done < FILE
So, some really good (possibly better) answers have been provided already. But looking at the phrasing of the original question, which asked for a Bash for loop, it surprised me that nobody mentioned a solution that changes the field separator IFS. It's a pure Bash solution, just like the accepted read line answer.
old_IFS=$IFS
IFS=$'\n'
for field in $(<filename)
do your_thing;
done
IFS=$old_IFS
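One caveat with this approach: even with IFS changed, the unquoted $(<filename) is still subject to globbing, so a line containing * or ? would be expanded against filenames. Disabling globbing for the duration avoids that. A sketch, where your_thing stands in for your command:
old_IFS=$IFS
IFS=$'\n'
set -f # turn off globbing so lines with * or ? are not expanded
for field in $(<filename)
do
your_thing "$field"
done
set +f
IFS=$old_IFS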
If you are sure that the output will always be newline-delimited, use head -n 1 in lieu of cut -f1 (note that you mentioned a for loop in a script and your question was ultimately not script-related).
Many of the other answers, including the accepted one, use multiple lines unnecessarily. There is no need to do this over multiple lines or to change the default delimiter on the system.
Also, the solution provided by Ivan with -d$'\n' did not work for me either on Mac OSX or CentOS 7. Since his answer is four years old, I assume something must have changed on the logic of the $ character for this situation.
While loop with input redirection and read command.
You should not be using cut to perform a sequential iteration of each line in a file as cut was not designed to do this.
Print selected parts of lines from each FILE to standard output.
— man cut
TL;DR
You should use a while loop with the read -r command and redirect standard input to your file inside a function scope where IFS is set to \n and use -E when using echo.
processFile() { # Function scope to prevent overwriting IFS globally
file="$1" # Any file that exists
local IFS=$'\n' # Allows spaces and tabs
while read -r line; do # Read exits with 1 when done; -r allows \
echo -E "$line" # -E allows printing of \ instead of gibberish
done < "$file" # Input redirection allows us to read file from stdin
}
processFile /path/to/file
Iteration
In order to iterate over each line of a file, we can use a while loop. This will let us iterate as many times as we need to.
while <condition>; do
<body>
done
Getting our file ready to read
We can use the read command to store a single line from standard input in a variable. Before we can use that to read a line from our file, we need to redirect standard input to point to our file. We can do this with input redirection. According to the man pages for bash, the syntax for redirection is [fd]<file where fd defaults to standard input (a.k.a file descriptor 0). We can place this before or after our while loop.
while <condition>; do
<body>
done < /path/to/file
# or the non-traditional way
</path/to/file while <condition>; do
<body>
done
Reading the file and ending the loop
Now that our file can be read from standard input, we can use read. The syntax for read in our context is read [-r] var... where -r preserves the \ (backslash) character, instead of using it as an escape sequence character, and var is the name of the variable to store the input in. You can have multiple variables to store pieces of the input in, but we only need one to read an entire line. Along with this, to preserve any backslashes in any output from echo you will likely need to use the -E flag to disable the interpretation of backslash escapes. If you have any indentation (spaces or tabs), you will need to temporarily change the IFS (Internal Field Separator) variable to contain only a newline; normally it is set to " \t\n" (space, tab, newline).
main() {
local IFS=$'\n'
read -r line
echo -E "$line"
}
main
How do we use read to end our while loop?
There is really only one reliable way, that I know of, to determine when you've finished reading a file with read: check the exit value of read. If the exit value of read is 0 then we successfully read a line, if it is 1 or higher then we reached EOF (end of file). With that in mind, we can place the call to read in our while loop's condition section.
processFile() {
# Could be any file you want hardcoded or dynamic
file="$1"
local IFS=$'\n'
while read -r line; do
# Process line here
echo -E "$line"
done < "$file"
}
processFile /path/to/file1
processFile /path/to/file2
A visual breakdown of the above code via Explain Shell.
If I am executing a command and want to cut the output, but it has multiple lines, I found it helpful to do
echo $([command]) | cut [....]
This puts all the output of [command] on a single line that can be easier to process.
My opinion is that "cut" uses '\n' as its default delimiter.
If you want to use cut, I have two ways:
cut -d^M -f1 file_cut
I type ^M by pressing Ctrl+V and then Enter. Another way is
cut -c 1- file_cut
Does that help?