After searching online I was able to figure out how to read a file line by line:
while read p; do
echo $p
done < file.txt
But I would actually like to modify the line in the file.
For example:
while read p; do
if condition
then
echo $p | perl -i -pe 's/a/b/'
fi
done < file.txt
However this doesn't actually modify the file.
Update: a far better version of the bash code has been added. Thanks to Charles Duffy for comments.
Your Perl one-liner takes a line piped into it by echo $p |, getting its standard input that way. It doesn't do anything with the file itself, so the -i flag has no effect. The -p makes it print to the standard output stream. So that whole line, echo ..., doesn't touch the file.
You can redirect the output to a new file and then move that to overwrite file.txt. Here is a simple-minded example that appends each line to a new file. For better bash code see the update below.
while read p; do
if condition
then
echo $p | perl -pe 's/a/b/' >> temp_out.txt
else
echo $p >> temp_out.txt
fi
done < file.txt
mv temp_out.txt file.txt
We have to add the else where all unmodified lines are also appended. Note that in general we cannot have just some lines replaced but the whole file has to be re-written.
If this is all that the script does you can do it with a very simple one-liner, see the end. If more work is done you can also put it all in a Perl script but I take it that there may be other good reasons for a bash script.
Update: a much better version of the above. See read and echo under Builtins in the Bash manual.
Appending each line opens the file anew each time without a need for that.
Just redirect at the end of the loop, much like it is done in the terminal
read uses backslash for escaping, removing it from input. Turn that off with -r
Leading and trailing whitespace is removed, as a part of breaking the line into words. Suppress this by setting the variable that controls which characters are used for splitting to an empty value, IFS=
The echo $p can do all kinds of unintended things. A formatted print is better, printf '%s\n' "$p", or at least echo "$p"
With this,
while IFS= read -r p; do
if condition
then
echo "$p" | perl -pe 's/a/b/'
else
echo "$p"
fi
done < file.txt > temp_out.txt
mv temp_out.txt file.txt
Finally, if the sole purpose of the Perl one-liner were to run a simple substitution, it is much better to simply do that in the shell itself than to have a pipeline and run a whole new process for each line.
echo "${p//a/b}"
Thanks to Charles Duffy for raising all these points in comments.
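Putting these points together, the loop could be wrapped in a small function. This is only a sketch under assumptions: the name rewrite_matching_lines and the condition (the line starts with "#") are placeholders of mine, standing in for whatever "condition" is in the real script.

```shell
#!/usr/bin/env bash
# Sketch: rewrite a file in place, changing only the lines that meet a condition.
# rewrite_matching_lines and the "starts with #" test are made-up placeholders.
rewrite_matching_lines() {
    local file=$1 tmp p
    # Create the temporary file next to the target so mv is a plain rename.
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    # "|| [[ -n $p ]]" keeps a final line that lacks a trailing newline.
    while IFS= read -r p || [[ -n $p ]]; do
        if [[ $p == '#'* ]]; then
            printf '%s\n' "${p//a/b}"   # replace every "a" with "b"
        else
            printf '%s\n' "$p"          # unchanged lines are copied through
        fi
    done < "$file" > "$tmp" && mv "$tmp" "$file"
}
```

It would be called as rewrite_matching_lines file.txt. Note that the file created by mktemp has its own permissions, so the rewritten file may not keep the permissions of the original.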
A few comments on Perl one-liners. See documentation at perlrun.
The command perl -e '...' executes any valid Perl code between ''. When we add the -n or -p switch it also reads standard input and executes that code on one line at a time, where -p also prints out each line after it's processed. The standard input can be supplied to it from a file,
perl -pe '...' input.txt
in which case adding -i flag will result in the file being changed in-place. Or, the input can be piped into it, for example
echo "input text" | perl -pe '...'
in which case the processed line is printed to standard output. This can be redirected to a file, as in the answer above.
To make changes to a given file a line at a time you only need this on the command line
perl -i -pe 's/a/b/' file.txt
If there is more work to do then it may well be better to put it in a script, of course. In this case the one-liner can be a command in the bash script as well, replacing all that code above (unless some bash-specific functionality is preferred for processing lines).
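If the per-line condition can be written as a Perl expression, it can go into the one-liner itself, so no shell loop is needed. A minimal sketch, assuming the condition is "the line starts with #" (a placeholder of mine) and using a made-up wrapper name edit_in_place:

```shell
#!/usr/bin/env bash
# Sketch: apply the substitution only to lines matching a condition, in place.
# The condition /^#/ (line starts with "#") is a placeholder for the real one.
edit_in_place() {
    perl -i -pe 's/a/b/ if /^#/' "$1"
}
```

Writing -i.bak instead of -i makes perl keep the original as a backup with the .bak suffix, which is handy while testing.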
I made a script like this:
#! /usr/bin/bash
a=`ls ../wrfprd/wrfout_d0${i}* | cut -c22-25`
b=`ls ../wrfprd/wrfout_d0${i}* | cut -c27-28`
c=`ls ../wrfprd/wrfout_d0${i}* | cut -c30-31`
d=`ls ../wrfprd/wrfout_d0${i}* | cut -c33-34`
f=$a$b$c$d
echo $f
sed "s/.* startdate=.*/export startdate=${f}/g" ./post_process > post_process2
The echo command works and prints 2008042118, which is what I want, but post_process2 contains only export startdate= with no value, as if the variable f could not be expanded. I want to produce a line like export startdate=2008042118
First -- don't use ls here -- it's both expensive in terms of performance (compared to globbing, which is performed internal to the shell without starting any external programs), and doesn't guarantee useful output for the full range of possible filenames, making its use in this context inherently bug-prone. A better way to retrieve pieces from a filename, assuming a ksh-derived shell such as bash or zsh, would look like this:
#!/bin/bash
# this is an array, but we're only going to use the first element
file=( "../wrfprd/wrfout_d0${i}"* )
[[ -e $file ]] || { echo "No file found" >&2; exit 1; }
# substring offsets are zero-based, so cut -c22-25 corresponds to offset 21, length 4
f=${file:21:4}${file:26:2}${file:29:2}${file:32:2}
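Fixed character offsets break as soon as the directory prefix changes length. A sketch that avoids this by stripping the directory and the name prefix first is below. It assumes file names of the form wrfout_d01_2008-04-21_18:00:00, which is my guess from the cut columns and the expected output, and startdate_from_path is a made-up name.

```shell
#!/usr/bin/env bash
# Sketch: build YYYYMMDDHH from a name assumed to look like
#   ../wrfprd/wrfout_d01_2008-04-21_18:00:00
# (the exact file name format is an assumption, not given in the question).
startdate_from_path() {
    local name=${1##*/}                # drop the directory part
    local stamp=${name#wrfout_d??_}    # drop the "wrfout_d01_" prefix
    # stamp is now 2008-04-21_18:00:00; pick year, month, day, hour
    printf '%s\n' "${stamp:0:4}${stamp:5:2}${stamp:8:2}${stamp:11:2}"
}
```

With the array from above this would be used as f=$(startdate_from_path "${file[0]}").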
Second, don't use sed to modify code -- doing so requires that your runtime user have permission to modify its own code, and moreover invites injection vulnerabilities. Just write your content out to a data file:
printf '%s\n' "$f" >startdate.txt
...and, in your second script, to read in the value from that file:
# if the shebang is #!/bin/bash
startdate=$(<startdate.txt)
# if the shebang is #!/bin/sh
startdate=$(cat startdate.txt)
I have a big problem doing a script: basically, I read a line from files.
All lines are made of 3 to 8 characters contiguous (no space).
Then I used sed to replace those lines inside a pattern (aka "var" in my minimal script below)
var="iao"
for m in `more meshing/junction_names.txt`
do
echo $m
echo -n $m | xxd -ps | sed 's/[[:xdigit:]]\{2\}/\\x&/g'
echo $var |sed "s/a/b/"
echo $var |sed "s/a/$m/"
done
Now these are the first 3 record of my output (they are all the same anyway).
I am using Linux. According to Kate, all files are encoded as UTF-8. Very weird, huh? Any idea why that is would be welcome.
J_LEAK
\x4a\x5f\x4c\x45\x41\x4b\x0d
ibo
oJ_LEAK
JO_1
\x4a\x4f\x5f\x31\x0d
ibo
oJO_1
JPL2_F
\x4a\x50\x4c\x32\x5f\x46\x0d
ibo
oJPL2_F
JF_PL2
Your input file contains DOS carriage returns (or possibly, the absurd attempt to read it with more introduces them). The hex dump shows this clearly; every value ends with \x0d which translates to a control code which causes the terminal to jump the cursor back to the beginning of the line.
This is a massive FAQ and you can find many examples of how to troubleshoot this basic problem, including in the bash tag wiki.
Tangentially, you should always quote strings unless you specifically require the shell to perform wildcard expansion and whitespace tokenization on the value; and Bash has built-ins to avoid the inelegant and somewhat error-prone echo | sed. Finally, don't read lines with for.
var="iao"
tr -d '\015' <meshing/junction_names.txt |
while read -r m; do # don't use a for loop
echo "$m" # quote!
echo -n "$m" | xxd -ps | sed 's/[[:xdigit:]]\{2\}/\\x&/g'
echo "${var/a/b}" # quote; use Bash built-in substitution mechanism
echo "${var/a/$m}"
done
Perhaps you want to remove the carriage returns once and for all, and then just use while read .... done <fixed-file instead of the tr pipeline.
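If you would rather not run tr at all, the carriage return can also be dropped inside the loop with a parameter expansion. A minimal sketch (strip_cr is a name I made up for illustration):

```shell
#!/usr/bin/env bash
# Sketch: strip one trailing carriage return from each line read from stdin.
strip_cr() {
    local line cr=$'\r'
    while IFS= read -r line || [[ -n $line ]]; do
        printf '%s\n' "${line%"$cr"}"   # remove a trailing CR, if there is one
    done
}
```

The loop body is where the per-line work would go; here it only prints the cleaned line, so strip_cr <meshing/junction_names.txt behaves much like the tr pipeline.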
I'm new to UNIX and have this really simple problem:
I have a text-file (input.txt) containing a string in each line. It looks like this:
House
Monkey
Car
And inside my shell script I need to read this input file line by line to get to a variable like this:
things="House,Monkey,Car"
I know this sounds easy, but I just couldn't find any simple solution for this. My closest attempt so far:
#!/bin/sh
things=""
addToString() {
things="${things},$1"
}
while read line; do addToString $line ;done <input.txt
echo $things
But this won't work. Based on my Google research, I thought the while loop would create a new subshell, but I was wrong about that (see the comment section). Nevertheless, the variable "things" was still not available in the echo later on. (I cannot just put the echo inside the while loop, because I need to work with that string later on.)
Could you please help me out here? Any help will be appreciated, thank you!
What you proposed works fine! I've only made two changes here: Adding missing quotes, and handling the empty-string case.
things=""
addToString() {
if [ -n "$things" ]; then
things="${things},$1"
else
things="$1"
fi
}
while read -r line; do addToString "$line"; done <input.txt
echo "$things"
If you were piping into while read, this would create a subshell, and that would eat your variables. You aren't piping -- you're doing a <input.txt redirection. No subshell, code works without changes.
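The difference is easy to see in a small experiment. This sketch assumes bash with its default options (in particular, without shopt -s lastpipe); the variable names are mine.

```shell
#!/usr/bin/env bash
# Sketch: a variable set inside a piped loop is lost; with a redirection it survives.
piped_count=0
printf 'House\nMonkey\n' | while read -r line; do
    piped_count=$((piped_count + 1))             # runs in a subshell (bash default)
done

redirected_count=0
while read -r line; do
    redirected_count=$((redirected_count + 1))   # runs in the current shell
done <<'EOF'
House
Monkey
EOF

echo "piped: $piped_count, redirected: $redirected_count"
```

In bash with default settings this prints piped: 0, redirected: 2, because only the second loop ran in the shell that later executes the echo.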
That said, there are better ways to read lists of items into shell variables. On any version of bash after 3.0:
IFS=$'\n' read -r -d '' -a things <input.txt # read into an array
printf -v things_str '%s,' "${things[@]}" # write array to a comma-separated string
echo "${things_str%,}" # print that string w/o trailing comma
...on bash 4, that first line can be:
readarray -t things <input.txt # read into an array
This is not a shell solution, but the truth is that solutions in pure shell are often excessively long and verbose. So e.g. to do string processing it is better to use special tools that are part of the “default” Unix environment.
sed ':b;N;$!bb;s/\n/,/g' < input.txt
If you want to omit empty lines, then:
sed ':b;N;$!bb;s/\n\n*/,/g' < input.txt
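Along the same lines, the standard paste utility can do the joining, which avoids the sed loop entirely. A sketch (join_lines is a made-up name; grep . is used only to drop empty lines first):

```shell
#!/usr/bin/env bash
# Sketch: join the non-empty lines of a file with commas using grep and paste.
join_lines() {
    grep . "$1" | paste -s -d, -
}
```

The result can be captured with things=$(join_lines input.txt).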
Speaking about your solution, it should work, but you should really always use quotes where applicable. E.g. this works for me:
things=""
while read line; do things="$things,$line"; done < input.txt
echo "$things"
(Of course, there is an issue with this code, as it outputs a leading comma. If you want to skip empty lines, just add an if check.)
This might/might not work, depending on the shell you are using. On my Ubuntu 14.04/x64, it works with both bash and dash.
To make it more reliable and independent from the shell's behavior, you can try to put the whole block into a subshell explicitly, using the (). For example:
(
things=""
addToString() {
things="${things},$1"
}
while read line; do addToString $line ;done
echo $things
) < input.txt
P.S. You can use something like this to avoid the initial comma. Without bash extensions (using short-circuit logical operators instead of the if for shortness):
test -z "$things" && things="$1" || things="${things},${1}"
Or more compactly, using the ${parameter:+word} expansion (which POSIX also specifies, so it is not limited to bash):
things="${things}${things:+,}${1}"
P.P.S. How I would have done it:
tr '\n' ',' < input.txt | sed 's!,$!\n!'
You can do this too:
#!/bin/bash
while read -r i
do
[[ $things == "" ]] && things="$i" || things="$things","$i"
done < <(grep . input.txt)
echo "$things"
Output:
House,Monkey,Car
N.B:
I used grep to handle empty lines and the possibility of there being no newline at the end of the file. (A normal while read will fail to read the last line if there is no newline at the end of the file.)
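The missing-newline case can also be handled without grep, because read still fills the variable when it hits end of file on a partial line; it just returns a non-zero status. A sketch of that idiom (join_stdin is a name I made up):

```shell
#!/usr/bin/env bash
# Sketch: join stdin lines with commas, keeping a final line that has no newline.
join_stdin() {
    local line things=
    # read fails at EOF, but "$line" still holds any unterminated last line
    while IFS= read -r line || [[ -n $line ]]; do
        [[ -n $line ]] || continue          # skip empty lines
        things=${things:+$things,}$line     # add a comma only if things is non-empty
    done
    printf '%s\n' "$things"
}
```

Usage would be things=$(join_stdin <input.txt).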
I have some script that produces output with colors and I need to remove the ANSI codes.
#!/bin/bash
exec > >(tee log) # redirect the output to a file but keep it on stdout
exec 2>&1
./somescript
The output is (in log file):
java (pid 12321) is running...#[60G[#[0;32m OK #[0;39m]
I didn't know how to put the ESC character here, so I put # in its place.
I changed the script into:
#!/bin/bash
exec > >(tee log) # redirect the output to a file but keep it on stdout
exec 2>&1
./somescript | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[m|K]//g"
But now it gives me (in log file):
java (pid 12321) is running...#[60G[ OK ]
How can I also remove this #[60G?
Maybe there is a way to completely disable coloring for the entire script?
According to Wikipedia, the [m|K] in the sed command you're using is specifically designed to handle m (the color command) and K (the "erase part of line" command). Your script is trying to set absolute cursor position to 60 (^[[60G) to get all the OKs in a line, which your sed line doesn't cover.
(Properly, [m|K] should probably be (m|K) or [mK], because you're not trying to match a pipe character. But that's not important right now.)
If you switch that final match in your command to [mGK] or (m|G|K), you should be able to catch that extra control sequence.
./somescript | sed -r "s/\x1B\[([0-9]{1,3}(;[0-9]{1,2};?)?)?[mGK]//g"
IMHO, most of these answers try too hard to restrict what is inside the escape code. As a result, they end up missing common codes like [38;5;60m (foreground ANSI color 60 from 256-color mode).
They also require the -r option, which enables extended regular expressions in GNU sed. That is not required; it just makes the regex read better.
Here is a simpler answer that handles the 256-color escapes and works on systems with non-GNU sed:
./somescript | sed 's/\x1B\[[0-9;]\{1,\}[A-Za-z]//g'
This will catch anything that starts with ESC [, has one or more digits and semicolons, and ends with a letter. This should catch any of the common ANSI escape sequences.
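One caveat: the \x1B escape inside the sed expression is itself not understood by every sed. A sketch that sidesteps this by letting printf produce the ESC byte is below. The function name strip_csi is mine, and unlike the expression above it uses * so that a bare ESC [ m with no parameters is removed as well.

```shell
#!/usr/bin/env bash
# Sketch: strip CSI sequences (ESC [ params letter) without relying on \x1B in sed.
strip_csi() {
    local esc
    esc=$(printf '\033')    # a literal ESC byte, produced portably
    sed "s/${esc}\[[0-9;]*[A-Za-z]//g"
}
```

It is used as a filter, for example ./somescript | strip_csi.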
For funsies, here's a larger and more general (but minimally tested) solution for all conceivable ANSI escape sequences:
./somescript | sed 's/\x1B[@A-Z\\\]^_]\|\x1B\[[0-9:;<=>?]*[-!"#$%&'"'"'()*+,.\/]*[][\\@A-Z^_`a-z{|}~]//g'
(and if you have @edi9999's SI problem, add | sed "s/\x0f//g" to the end; this works for any control char by replacing 0f with the hex of the undesired char)
I couldn't get decent results from any of the other answers, but the following worked for me:
somescript | sed -r "s/[[:cntrl:]]\[[0-9]{1,3}m//g"
If I only removed the control char "^[", it left the rest of the color data, e.g., "33m". Including the color code and "m" did the trick. I'm puzzled why s/\x1B//g doesn't work, because \x1B[31m certainly works with echo.
I came across the ansi2txt tool from the colorized-logs package in Debian. The tool drops ANSI control codes from STDIN.
Usage example:
./somescript | ansi2txt
Source code http://github.com/kilobyte/colorized-logs
For Mac OSX or BSD use
./somescript | sed $'s,\x1b\\[[0-9;]*[a-zA-Z],,g'
The regular expression below will miss some ANSI Escape Codes sequences, as well as 3 digit colors. Example and Fix on regex101.com.
Use this instead:
./somescript | sed -r 's/\x1B\[(;?[0-9]{1,3})+[mGK]//g'
I also had the problem that sometimes, the SI character appeared.
It happened for example with this input : echo "$(tput setaf 1)foo$(tput sgr0) bar"
Here's a way to also strip the SI character (shift in) (0x0f)
./somescript | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" | sed "s/\x0f//g"
Much simpler function in pure Bash to filter-out common ANSI codes from a text stream:
# Strips common ANSI codes from a text stream
shopt -s extglob # Enable Bash Extended Globbing expressions
ansi_filter() {
local line
local IFS=
while read -r line || [[ "$line" ]]; do
printf '%s\n' "${line//$'\e'[\[(]*([0-9;])[@-n]/}"
done
}
See:
linuxjournal.com: Extended Globbing
gnu.org: Bash Parameter Expansion
I had a similar problem. All solutions I found did work well for the color codes but did not remove the characters added by "$(tput sgr0)" (resetting attributes).
Taking, for example, the solution in the comment by davemyron the length of the resulting string in the example below is 9, not 6:
#!/usr/bin/env bash
string="$(tput setaf 9)foobar$(tput sgr0)"
string_sed="$( sed -r "s/\x1B\[[0-9;]*[JKmsu]//g" <<< "${string}" )"
echo ${#string_sed}
In order to work properly, the regex had to be extended to also match the sequence added by sgr0 ("\E(B"):
string_sed="$( sed -r "s/\x1B(\[[0-9;]*[JKmsu]|\(B)//g" <<< "${string}" )"
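The same idea can be checked without a terminal database by writing the bytes out directly. The sketch below assumes that sgr0 expands to ESC ( B followed by ESC [ m, as it does for xterm-like terminals; strip_csi_and_charset is a made-up name, and any final letter is accepted rather than only JKmsu.

```shell
#!/usr/bin/env bash
# Sketch: strip CSI sequences and the "ESC ( B" charset sequence emitted by sgr0.
strip_csi_and_charset() {
    local esc
    esc=$(printf '\033')    # literal ESC byte
    sed -E "s/${esc}(\[[0-9;]*[A-Za-z]|\(B)//g"
}

# Hand-written stand-in for "$(tput setaf 9)foobar$(tput sgr0)"
string=$(printf '\033[91mfoobar\033(B\033[m')
string_sed=$(printf '%s\n' "$string" | strip_csi_and_charset)
echo "${#string_sed}"
```

Here the reported length is 6, since only foobar is left.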
Not sure what's in ./somescript but if escape sequences are not hardcoded you can set the terminal type to avoid them
TERM=dumb ./somescript
For example, if you try
TERM=dumb tput sgr0 | xxd
you'll see it produces no output while
tput sgr0 | xxd
00000000: 1b28 421b 5b6d .(B.[m
does (for xterm-256color).
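If ./somescript is under your control, it could also decide for itself whether to emit colors. This is a sketch of a common convention, not something taken from the script in the question: print_status is a hypothetical stand-in for whatever prints the status, and colors are used only when stdout is a terminal and TERM is not dumb.

```shell
#!/usr/bin/env bash
# Sketch: emit color only when stdout is a terminal and TERM is not "dumb".
# print_status is a hypothetical stand-in for whatever the script prints.
print_status() {
    if [[ -t 1 && ${TERM:-dumb} != dumb ]]; then
        printf '[\033[0;32m OK \033[0;39m]\n'   # colored variant
    else
        printf '[ OK ]\n'                       # plain variant for pipes and logs
    fi
}
```

With a check like this, output that goes through a pipe or into a file is plain text from the start.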
Hmm, not sure if this will work for you, but 'tr' will 'strip' (delete) control codes - try:
./somescript | tr -d '[:cntrl:]'
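Be aware of what this does and does not remove. tr deletes single characters, so the ESC byte goes away but the rest of each sequence stays, and newlines are control characters too, so all lines get joined. A small sketch with a made-up sample string:

```shell
#!/usr/bin/env bash
# Sketch: tr deletes the ESC byte (and newlines), but not the rest of the sequence.
sample=$(printf 'java is running...\033[0;32m OK \033[0;39m')
stripped=$(printf '%s\n' "$sample" | tr -d '[:cntrl:]')
printf '%s\n' "$stripped"
```

This prints java is running...[0;32m OK [0;39m, so tr alone is usually not enough for color codes.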
There's also a dedicated tool to handle ANSI escape sequences: ansifilter. Use the default --text output format to strip all ANSI escape sequences (note: not just coloring).
ref: https://stackoverflow.com/a/6534712
Here's a pure Bash solution.
Save as strip-escape-codes.sh, make executable and then run <command-producing-colorful-output> | ./strip-escape-codes.sh.
Note that this strips all ANSI escape codes/sequences. If you want to strip colors only, replace [a-zA-Z] with "m".
Bash >= 4.3 (namerefs, local -n, were added in 4.3):
#!/usr/bin/env bash
# Strip ANSI escape codes/sequences [$1: input string, $2: target variable]
function strip_escape_codes() {
local _input="$1" _i _char _escape=0
local -n _output="$2"; _output=""
for (( _i=0; _i < ${#_input}; _i++ )); do
_char="${_input:_i:1}"
if (( ${_escape} == 1 )); then
if [[ "${_char}" == [a-zA-Z] ]]; then
_escape=0
fi
continue
fi
if [[ "${_char}" == $'\e' ]]; then
_escape=1
continue
fi
_output+="${_char}"
done
}
while read -r line; do
strip_escape_codes "${line}" line_stripped
echo "${line_stripped}"
done
Bash < 4.3:
#!/usr/bin/env bash
# Strip ANSI escape codes/sequences [$1: input string, $2: target variable]
function strip_escape_codes() {
local input="$1" output="" i char escape=0
for (( i=0; i < ${#input}; ++i )); do # process all characters of input string
char="${input:i:1}" # get current character from input string
if (( ${escape} == 1 )); then # if we're currently within an escape sequence, check if
if [[ "${char}" == [a-zA-Z] ]]; then # end is reached, i.e. if current character is a letter
escape=0 # end reached, we're no longer within an escape sequence
fi
continue # skip current character, i.e. do not add to output
fi
if [[ "${char}" == $'\e' ]]; then # if current character is '\e', we've reached the start
escape=1 # of an escape sequence -> set flag
continue # skip current character, i.e. do not add to output
fi
output+="${char}" # add current character to output
done
printf -v "$2" '%s' "${output}" # assign output to target variable (no eval, so input is never executed)
}
while read -r line; do
strip_escape_codes "${line}" line_stripped
echo "${line_stripped}"
done
@jeff-bowman's solution helped me get rid of SOME of the color codes.
I added another small portion to the regex in order to remove some more:
sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" # Original. Removed Red ([31;40m[1m[error][0m)
sed -r "s/\x1B\[([0-9];)?([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" # With an addition, removed yellow and green ([1;33;40m[1m[warning][0m and [1;32;40m[1m[ok][0m)
The addition is ([0-9];)? right after \x1B\[; it removes yellow and green (and maybe more colors).
The controversial idea would be to reconfigure terminal settings for this process environment to let the process know that terminal does not support colors.
Something like TERM=xterm-mono ./somescript comes to my mind. YMMV with your specific OS and ability of your script to understand terminal color settings.
I had some issues with colorized output which the other solutions here didn't process correctly, so I built this perl one-liner. It looks for escape \e followed by opening bracket \[ followed by one or more color codes \d+ separated by semicolons, ending on m.
perl -ple 's/\e\[\d+(;\d+)*m//g'
It seems to work really well for colorized compiler output.
I came across this question/answers trying to do something similar as the OP. I found some other useful resources and came up with a log script based on those. Posting here in case it can help others.
Digging into the links helps understand some of the redirection which I won't try and explain because I'm just starting to understand it myself.
Usage will render the colorized output to the console, while stripping the color codes out of the text going to the log file. It will also include stderr in the logfile for any commands that don't work.
Edit: adding more usage at bottom to show how to log in different ways
#!/bin/bash
set -e
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
. $DIR/dev.conf
. $DIR/colors.cfg
filename=$(basename ${BASH_SOURCE[0]})
# remove extension
# filename=`echo $filename | grep -oP '.*?(?=\.)'`
filename=`echo $filename | awk -F\. '{print $1}'`
log=$DIR/logs/$filename-$target
if [ -f $log ]; then
cp $log "$log.bak"
fi
exec 3>&1 4>&2
trap 'exec 2>&4 1>&3' 0 1 2 3
exec 1>$log 2>&1
# log message
log(){
local m="$@"
echo -e "*** ${m} ***" >&3
echo "=================================================================================" >&3
local r="$@"
echo "================================================================================="
echo -e "*** $r ***" | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g"
echo "================================================================================="
}
echo "=================================================================================" >&3
log "${Cyan}The ${Yellow}${COMPOSE_PROJECT_NAME} ${filename} ${Cyan}script has been executed${NC}"
log $(ls) #log $(<command>)
log "${Green}Apply tag to image $source with version $version${NC}"
# log $(exec docker tag $source $target 3>&2) #prints error only to console
# log $(docker tag $source $target 2>&1) #prints error to both but doesn't exit on fail
log $(docker tag $source $target 2>&1) && exit $? #prints error to both AND exits on fail
# docker tag $source $target 2>&1 | tee $log # prints gibberish to log
echo $? # prints 0 because log function was successful
log "${Purple}Push $target to acr${NC}"
Here are the other links that helped:
Can I use sed to manipulate a variable in bash?
https://www.cyberciti.biz/faq/redirecting-stderr-to-stdout/
https://unix.stackexchange.com/questions/42728/what-does-31-12-23-do-in-a-script
https://serverfault.com/questions/103501/how-can-i-fully-log-all-bash-scripts-actions
https://www.gnu.org/software/bash/manual/bash.html#Redirections
I used perl because I have to do this frequently on many files. This goes through all files matching filename*.txt and removes any formatting. It works for my use case and may be useful for someone else too, so I thought I would post it here. Put your own file name pattern in place of filename*.txt, or set the FILENAME variable below to a space-separated list of file names.
$ FILENAME=$(ls filename*.txt) ; for file in $(echo $FILENAME); do echo $file; cat $file | perl -pe 's/\e([^\[\]]|\[.*?[a-zA-Z]|\].*?\a)//g' | col -b > $file-new; mv $file-new $file; done
My contribution:
./somescript | sed -r "s/\\x1B[\\x5d\[]([0-9]{1,3}(;[0-9]{1,3})?(;[0-9]{1,3})?)?[mGK]?//g"
This works for me:
./somescript | cat