how to restrain bash from removing blanks when processing file - bash

A simple yet annoying thing:
Using a script like this:
while read x; do
echo "$x"
done<file
on a file containing whitespace:
text
will give me an output without the whitespace:
text
The problem is that I need the whitespace before text (usually one tab, but not always).
So the question is: how to obtain identical lines as are in input file in such a script?
Update: Ok, so I changed my while read x to while IFS= read x.
echo "$x" gives me correct answer without stripping first tab, but, eval "echo $x" strips this tab.
What should I do then?

read is stripping the whitespace. Wipe $IFS first.
while IFS= read x
do
echo "$x"
done < file
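A minimal sketch of the difference, assuming a throwaway file with a single tab-indented line (the temporary file and the variable names stripped and kept are illustrative, not from the question):

```shell
# Create a temporary file whose only line starts with a tab.
tmp=$(mktemp)
printf '\ttext\n' > "$tmp"

# Default IFS: read trims leading and trailing whitespace.
while read -r x; do
  stripped=$x
done < "$tmp"

# Empty IFS for this read only: the line is kept as-is.
while IFS= read -r x; do
  kept=$x
done < "$tmp"

printf '[%s]\n' "$stripped"   # tab removed
printf '[%s]\n' "$kept"       # tab preserved
rm -f "$tmp"
```

The -r flag is added here so that backslashes in the input are also left alone; it is not required for the whitespace fix itself.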

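When read is called without a variable name, the entire line is put into a variable called REPLY, with leading and trailing whitespace intact. If you use REPLY instead of 'x', you won't have to worry about read's word splitting or IFS. (Backslashes are still interpreted unless you add -r.)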
I ran into the same trouble you are having when attempting to strip spaces off the end of filenames. REPLY came to the rescue:
find . -name '* ' -depth -print | while read; do mv -v "${REPLY}" "`echo "${REPLY}" | sed -e 's/ *$//'`"; done
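As a rough illustration of the REPLY behaviour in bash (the sample string and the variable names with_reply and with_name are made up for this sketch):

```shell
# read with no variable name stores the raw line in REPLY (bash behaviour).
with_reply=$(printf '  padded  \n' | { read -r; printf '[%s]' "$REPLY"; })

# read with a name and the default IFS trims the surrounding blanks.
with_name=$(printf '  padded  \n' | { read -r x; printf '[%s]' "$x"; })

printf '%s\n%s\n' "$with_reply" "$with_name"
```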

I found the solution to the problem 'eval "echo $x" strips this tab.' This should fix it:
eval "echo \"$x\""
I think this works because the inner (escaped) quotes are evaluated together with the echo, whereas I think that both
eval "echo $x"
and
eval echo "$x"
cause the quotes to be evaluated before the echo, which means that the string passed to echo has no quotes, causing the white space to be lost. So the complete answer is:
while IFS= read x
do
eval "echo \"$x\""
done < file
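The effect can be checked with a small sketch, assuming a value that starts with a tab (the names plain and quoted are illustrative):

```shell
x=$(printf '\ttext')            # value with a leading tab

plain=$(eval "echo $x")         # eval re-parses: echo <tab>text
quoted=$(eval "echo \"$x\"")    # eval re-parses: echo "<tab>text"

printf '[%s]\n' "$plain"        # tab lost to word splitting
printf '[%s]\n' "$quoted"       # tab kept inside the inner quotes
```

This still breaks if the value itself contains double quotes, backticks, or $, because eval re-parses it as shell code.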

Related

echo inside a for loop lists files matching the output pattern

I have a problem with the following for loop:
X="*back* OLD"
for P in $X
do
echo "-$P"
done
I need it to output just:
-*back*
-OLD
However, it lists all files in the current directory matching the *back* pattern. For example it gives the following:
-backup.bkp
-backup_new.bkp
-backup_X
-OLD
How to force it to output the exact pattern?
Use an array, as unquoted parameter expansions are still subject to globbing.
X=( "*back*" OLD )
for P in "${X[@]}"; do
printf '%s\n' "-$P"
done
(Use printf, as echo could try to interpret an argument as an option, for example, if you had n in the value of X.)
Use set -o noglob before your loop and set +o noglob after to disable and enable globbing.
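A minimal sketch of the noglob approach, wrapped in a hypothetical function list_patterns so that globbing is switched back on right after the loop:

```shell
X="*back* OLD"

list_patterns() {
  set -o noglob            # disable pathname expansion
  for P in $X; do          # still word-split, but no globbing
    printf '%s\n' "-$P"
  done
  set +o noglob            # re-enable pathname expansion
}

list_patterns
```

The unquoted $X is deliberate here: word splitting is what separates the two items, and only the filename matching is turned off.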
To prevent filename expansion you could read in the string as a Here String.
To iterate over the items, you could turn them into lines using parameter expansion and read them linewise using read. In order to be able to put a - sign as the first character, use printf instead of echo.
X="*back* OLD"
while read -r x
do printf -- '-%s\n' "$x"
done <<< "${X/ /$'\n'}"
Another way could be to use tr to transform the string into lines, then use paste with the - sign as delimiter and "nothing" from /dev/null as first column.
X="*back* OLD"
tr ' ' '\n' <<< "$X" | paste -d- /dev/null -
Both should output:
-*back*
-OLD

Bash doesn't print string with question mark in for loop

I have a text file with the following contents,
My test
strings
that dont have
a question
mark except this line?
but not
these two
and when I try to read the file in bash using, for example,
ph_lines="/path/to/file.txt"
for l in $(cat "$ph_lines")
do
echo "$l"
done
everything prints on the output except for the string with the question mark in it.
I have tried using while read line; do echo $line; done < $filename and it still has the same problem.
The only thing that would work to capture all of the lines is when I used sed to remove question marks.
for l in $(cat ${ph_lines} | sed $'s/\?//' )
Thank you!
To correctly read from a file, use the example from Read a file line by line assigning the value to a variable:
while IFS= read -r line; do
echo "$line"
done < my_filename.txt
The difference is not that it's a while loop (because you've tried that), but because this loop is correctly quoted. The problem you're seeing happens because you secretly enabled nullglob first, and then neglected to quote:
$ shopt -s nullglob
$ var='question?'
$ echo "$var"
question?
$ echo $var
(blank line)
Unquoted expansion causes pathname expansion, and since you enabled nullglob and have no matching files, the previous example shows nothing. If you had some matching files, you'd see those instead:
$ touch questions question2
$ echo $var
question2 questions
You can set up shellcheck in your editor to get automatic warnings about these issues.

sed result differs b/w command line & shell script

The following sed command from commandline returns what I expect.
$ echo './Adobe ReaderScreenSnapz001.jpg' | sed -e 's/.*\./After-1\./'
After-1.jpg <--- result
However, in the following bash script, sed seems not to act as I expect.
#!/bin/bash
beforeNamePrefix=$1
i=1
while IFS= read -r -u3 -d '' base_name; do
echo $base_name
rename=`(echo ${base_name} | sed -e s/.*\./After-$i./g)`
echo 'Renamed to ' $rename
i=$((i+1))
done 3< <(find . -name "$beforeNamePrefix*" -print0)
Result (with several files with similar names in the same directory):
./Adobe ReaderScreenSnapz001.jpg
Renamed to After-1. <--- file extension is missing.
./Adobe ReaderScreenSnapz002.jpg
Renamed to After-2.
./Adobe ReaderScreenSnapz003.jpg
Renamed to After-3.
./Adobe ReaderScreenSnapz004.jpg
Renamed to After-4.
Where am I wrong? Thank you.
You have omitted the single quotes around the sed program in your script. Without quoting, the shell strips the backslash from .*\. yielding a regular expression with quite a different meaning. (You will need double quotes for the $i substitution to work, though. You can mix single and double quotes, 's/.*\./'"After-$i./", or just add enough backslashes to escape the escaped escape sequence (sic).)
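Both quoting styles can be sketched as follows, using the file name from the question (the variable names mixed and doubled are illustrative):

```shell
i=1
base_name='./Adobe ReaderScreenSnapz001.jpg'

# Single quotes protect the regex; double quotes let $i expand.
mixed=$(printf '%s\n' "$base_name" | sed -e 's/.*\./'"After-$i./")

# All double quotes: \\ leaves a single backslash for sed.
doubled=$(printf '%s\n' "$base_name" | sed -e "s/.*\\./After-$i./")

printf 'Renamed to %s\n' "$mixed"
```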
Just use Parameter Expansion
#!/bin/bash
beforeNamePrefix="$1"
i=1
while IFS= read -r -u3 -d '' base_name; do
echo "$base_name"
rename="After-$((i++)).${base_name##*.}"
echo "Renamed to $rename"
done 3< <(find . -name "$beforeNamePrefix*" -print0)
I also fixed some quoting to prevent unwanted word splitting.
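One caveat with ${base_name##*.}: for a name without an extension, such as ./README, it strips only the leading dot and returns /README. A hedged sketch of a guard, using a hypothetical helper ext_of that is not part of the answer above:

```shell
# Print the extension of a path, or an empty line if there is none.
ext_of() {
  name=${1##*/}                            # drop any directory part
  case $name in
    ?*.*) printf '%s\n' "${name##*.}" ;;   # a dot after the first character
    *)    printf '\n' ;;                   # no extension (or a dotfile)
  esac
}

ext_of './Adobe ReaderScreenSnapz001.jpg'
ext_of './README'
```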

Bash Script Looping over line input

I'm doing the following, which basically works.
The script tries to insert some lines into a file to rewrite it.
But it is stripping all blank lines and also all line padding.
The main problem is that it does not process the last line of the file.
I'm not sure why.
while read line; do
<... process some things ...>
echo ${line}>> "${ACTION_PATH_IN}.work"
done < "${ACTION_PATH_IN}"
What can be done to fix this?
while IFS= read -r line; do
## some work
printf '%s\n' "$line" >> output
done < <(printf '%s\n' "$(cat input)")
An empty IFS tells read not to strip leading and trailing whitespace.
read -r prevents backslash at EOL from creating a line continuation.
Double-quote your parameter substitution ("$line") to prevent the shell from doing word splitting and globbing on its value.
Use printf '%s\n' instead of echo because it is reliable when processing values like -e, -n, etc.
< <(printf '%s\n' "$(cat input)") is an ugly way of LF terminating the contents of input. Other constructions are possible, depending on your requirements (pipe instead of redirect from process substitution if it is okay that your whole while runs in a subshell).
It might be better if you just ensured that it was LF-terminated before processing it.
Better yet, use a tool such as awk instead of the shell's while loop. First, awk is meant for parsing and manipulating files, so it has the advantage when processing a huge file. Second, you won't have to care whether the last line ends with a newline (for your case).
Hence the equivalent of your while read loop:
awk '{
# process lines
# print line > "newfile.txt"
}' file
One possible reason for not reading the last line is that the file does not end with a newline. On the whole, I'd expect it to work even so, but that could be why.
On MacOS X (10.7.1), I got this output, which is the behaviour you are seeing:
$ /bin/echo -n Hi
Hi$ /bin/echo -n Hi > x
$ while read line; do echo $line; done < x
$
The obvious fix is to ensure that the file ends with a newline.
First thing, use
echo "$line" >> ...
Note the quotes. If you don't put them, the shell itself will remove the padding.
As for the last line, it is strange. It may have to do with whether the last line of the file is terminated by a \n or not (it is a good practice to do so, and almost any editor will do that for you).
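If the file cannot be fixed beforehand, a common idiom is to also accept a final partial line by testing the variable when read fails. A minimal sketch, assuming a temporary file whose last line has no trailing newline (the names count and last are illustrative):

```shell
tmp=$(mktemp)
printf 'first\n  second without newline' > "$tmp"

count=0
last=
# read fails at end of file but still fills $line with the partial line,
# so the extra test lets the loop body run one more time.
while IFS= read -r line || [ -n "$line" ]; do
  count=$((count + 1))
  last=$line
done < "$tmp"

printf '%s lines, last: [%s]\n' "$count" "$last"
rm -f "$tmp"
```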

grep a pattern and output non-matching part of line

I know it is possible to invert grep output with the -v flag. Is there a way to only output the non-matching part of the matched line? I ask because I would like to use the return code of grep (which sed won't have). Here's sort of what I've got:
tags=$(grep "^$PAT" >/dev/null 2>&1)
[ "$?" -eq 0 ] && echo $tags
You could use sed:
$ sed -n "/$PAT/s/$PAT//p" $file
The only problem is that it'll return an exit code of 0 as long as the pattern is good, even if the pattern can't be found.
Explanation
The -n parameter tells sed not to print out any lines. Sed's default is to print out all lines of the file. Let's look at each part of the sed program in between the slashes. Assume the program is /1/2/3/4/5:
/$PAT/: This says to look for all lines that match pattern $PAT to run your substitution command. Otherwise, sed would operate on all lines, even if there is no substitution.
/s/: This says you will be doing a substitution
/$PAT/: This is the pattern you will be substituting. It's $PAT. So, you're searching for lines that contain $PAT and then you're going to substitute the pattern for something.
//: This is what you're substituting for $PAT. It is null. Therefore, you're deleting $PAT from the line.
/p: This final p says to print out the line.
Thus:
You tell sed not to print out the lines of the file as it processes them.
You're searching for all lines that contain $PAT.
On these lines, you're using the s command (substitution) to remove the pattern.
You're printing out the line once the pattern is removed from the line.
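A small sketch of this command on sample input, assuming $PAT contains no characters that are special to sed (such as / or .). Because the p flag prints only when a substitution was made, the shorter form without the leading address gives the same result; the variable names below are illustrative:

```shell
PAT='most of '
input='I will use most of them for jam.
No match on this line.'

with_address=$(printf '%s\n' "$input" | sed -n "/$PAT/s/$PAT//p")
without_address=$(printf '%s\n' "$input" | sed -n "s/$PAT//p")

printf '%s\n' "$with_address"
```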
How about using a combination of grep, sed and $PIPESTATUS to get the correct exit-status?
$ echo "Humans are not proud of their ancestors, and rarely invite them round to dinner" | grep dinner | sed -n "/dinner/s/dinner//p"
Humans are not proud of their ancestors, and rarely invite them round to
$ echo "${PIPESTATUS[1]}"
0
The members of the PIPESTATUS array hold the exit status of each command in the most recently executed pipeline. ${PIPESTATUS[0]} holds the exit status of the first command in the pipe, ${PIPESTATUS[1]} the exit status of the second command, and so on. Note the braces: without them, $PIPESTATUS[1] expands to the first element followed by a literal [1].
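To get grep's return code together with the stripped line, the pipeline can be wrapped in a function. This is only a sketch: strip_match is a hypothetical name, it needs bash for PIPESTATUS, and it assumes the pattern has no characters special to sed.

```shell
# Read stdin, print matching lines with the pattern removed,
# and return grep's exit status instead of sed's.
strip_match() {
  grep -- "$1" | sed "s/$1//"
  return "${PIPESTATUS[0]}"   # index 0 is grep in the pipeline above
}

printf '%s\n' 'rarely invite them round to dinner' | strip_match dinner
echo "exit status: $?"
```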
Your $tags will never have a value because you send grep's output to /dev/null. Aside from that little problem, there is no input to grep.
echo hello |grep "^he" -q ;
ret=$? ;
if [ $ret -eq 0 ];
then
echo there is he in hello;
fi
A successful return code is 0.
...here is one take at your 'problem':
pat="most of ";
data="The apples are ripe. I will use most of them for jam.";
echo $data |grep "$pat" -q;
ret=$?;
[ $ret -eq 0 ] && echo $data |sed "s/$pat//"
The apples are ripe. I will use them for jam.
... exact same thing?:
echo The apples are ripe. I will use most of them for jam. | sed ' s/most\ of\ //'
It seems to me you have confused the basic concepts. What are you trying to do anyway?
I am going to answer the title of the question directly instead of considering the detail of the question itself:
"grep a pattern and output non-matching part of line"
The title to this question is important to me because the pattern I am searching for contains characters that sed will assign special meaning to. I want to use grep because I can use -F or --fixed-strings to cause grep to interpret the pattern literally. Unfortunately, sed has no literal option, but both grep and bash have the ability to interpret patterns without considering any special characters.
Note: In my opinion, trying to backslash or escape special characters in a pattern appears complex in code and is unreliable because it is difficult to test. Using tools which are designed to search for literal text leaves me with a comfortable 'that will work' feeling without considering POSIX.
I used both grep and bash to produce the result because bash is slow and my use of fast grep creates a small output from a large input. This code searches for the literal twice, once during grep to quickly extract matching lines and once during =~ to remove the match itself from each line.
while IFS= read -r || [[ -n "$REPLY" ]]; do
if [[ "$REPLY" =~ (.*)("$LITERAL_PATTERN")(.*) ]]; then
printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[3]}"
else
printf "NOT-REFOUND" # should never happen
exit 1
fi
done < <(grep -F "$LITERAL_PATTERN" < "$INPUT_FILE")
Explanation:
IFS= Reassigning the input field separator is a special prefix for a read statement. Assigning IFS to the empty string causes read to accept each line with all spaces and tabs literally until end of line (assuming IFS is default space-tab-newline).
-r Tells read to accept backslashes in the input stream literally instead of considering them as the start of an escape sequence.
$REPLY Is created by read to store characters from the input stream. The newline at the end of each line will NOT be in $REPLY.
|| [[ -n "$REPLY" ]] The logical or causes the while loop to accept input which is not newline terminated. This does not need to exist because grep always provides a trailing newline for every match. But, I habitually use this in my read loops because without it, characters between the last newline and the end of file will be ignored because that causes read to fail even though content is successfully read.
=~ (.*)("$LITERAL_PATTERN")(.*) ]] Is a standard bash regex test, but anything in quotes is taken as a literal. If I wanted =~ to interpret the regex characters contained in $LITERAL_PATTERN, then I would need to remove the double quotes.
"${BASH_REMATCH[@]}" Is created by [[ =~ ]] where [0] is the entire match and [N] is the contents of the match in the Nth set of parentheses.
Note: I do not like to reassign stdin to a while loop because it is easy to get wrong and difficult to see what is happening later. I usually create a function for this type of operation which acts typically and expects file_name parameters or reassignment of stdin during the call.
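Such a function might look like the sketch below. The name remove_literal and its argument order are assumptions for illustration, not the author's actual helper; it needs bash for [[ =~ ]], BASH_REMATCH and process substitution.

```shell
# Usage: remove_literal LITERAL FILE
# Print every line of FILE that contains LITERAL, with one occurrence
# removed (the last one, since the first group is greedy).
remove_literal() {
  local literal=$1 file=$2 line
  while IFS= read -r line || [[ -n "$line" ]]; do
    # The quoted part of the regex is matched literally.
    if [[ "$line" =~ (.*)("$literal")(.*) ]]; then
      printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[3]}"
    fi
  done < <(grep -F -- "$literal" "$file")
}
```

For example, remove_literal '$5.00' prices.txt would treat the dollar sign and the dot as plain characters in both the grep and the =~ step (prices.txt is a made-up file name).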
