Delimiting single quote in BASH script from a SQL dump - bash

I am scrubbing a SQL dump file from MYSQL so that it is free of user information. The file is 100s of megs in size, and I have it all working except for the SQL quotes. I go line by line through the file and then use the statements in the form of:
RESULT=echo $LINE | sed "<something>"
This worked great until I came across this line:
INSERT INTO `brand` VALUES (42,84,'','brands/large_logo/L\'OrealLogo.jpg',0);
When I echo the line, the result is that I lose the L\'Oreal delimiter, and when I try to load it back via SQL, I get an error. Here's the line as printed by echo $LINE:
The problem is here v
INSERT INTO `brand` VALUES (42,84,'','brands/large_logo/L'OrealLogo.jpg',0);
Is there a way to keep echo from using the \' as an escape sequence for '? I feel like I am missing something obvious here, but just cannot put my finger on it.

Psychic debugging suggests that you are using a while read loop, but not supplying -r:
$ cat file
'Notice the \' here'
$ while read LINE; do echo "$LINE"; done < file
'Notice the ' here'
$ while read -r LINE; do echo "$LINE"; done < file
'Notice the \' here'
There are other concerns like missing $(..) and quoting in your RESULT=echo $LINE | sed "<something>" and the fact that you're running sed once for each line rather than for the stream, but these are separate issues.
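The difference is easy to reproduce with the line from the question. The sketch below is only an illustration: the temporary file stands in for the real dump, and the sed expression (renaming large_logo to logo) is a made-up placeholder, since the question does not show the real one.

```shell
# Hypothetical one-line dump, written to a temporary file.
dump=$(mktemp)
cat > "$dump" <<'EOF'
INSERT INTO `brand` VALUES (42,84,'','brands/large_logo/L\'OrealLogo.jpg',0);
EOF

# Without -r (left out on purpose), read treats the backslash as an
# escape character and drops it.
while read LINE; do lossy=$LINE; done < "$dump"

# With -r (and an empty IFS) the line is kept exactly as stored.
while IFS= read -r LINE; do exact=$LINE; done < "$dump"

printf '%s\n' "$lossy" "$exact"

# For a pure sed job, skip the loop and run sed once over the whole file.
# The substitution is a placeholder, not the asker's real expression.
scrubbed=$(sed 's/large_logo/logo/' "$dump")
printf '%s\n' "$scrubbed"

rm -f "$dump"
```

Running sed once over the file also avoids starting a new process for every line, which matters for a dump that is hundreds of megabytes in size.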

Related

Reading and writing line by line in a bash script

After searching online I was able to figure out how to read a file line by line:
while read p; do
echo $p
done < file.txt
But I would actually like to modify the line in the file.
For example:
while read p; do
if condition
then
echo $p | perl -i -pe 's/a/b/'
fi
done < file.txt
However this doesn't actually modify the file.
Update   A far better version of bash code added. Thanks to Charles Duffy for comments.
Your Perl one-liner takes a line piped into it by echo $p |, getting its standard input that way. It doesn't do anything with the file itself, so the -i flag has no effect. The -p makes it print to the standard output stream. So that whole line, echo ..., doesn't touch the file.
You can redirect the output to a new file and then move that to overwrite file.txt. Here is a simple minded example, that appends each line to a new file. For better bash code see the update below.
while read p; do
if condition
then
echo $p | perl -pe 's/a/b/' >> temp_out.txt
else
echo $p >> temp_out.txt
fi
done < file.txt
mv temp_out.txt file.txt
We have to add the else where all unmodified lines are also appended. Note that in general we cannot have just some lines replaced but the whole file has to be re-written.
If this is all that the script does you can do it with a very simple one-liner, see the end. If more work is done you can also put it all in a Perl script but I take it that there may be other good reasons for a bash script.
Update   A much better version of the above. See read and echo in Builtins in Bash manual
Appending each line opens the file anew each time without a need for that.
Just redirect at the end of the loop, much like it is done in the terminal
read uses backslash for escaping, removing it from input. Turn that off with -r
Trailing white space is removed, as a part of breaking the line into words. Suppress this by unsetting the variable that controls which characters are used for splitting, IFS=
The echo $p can do all kinds of unintended things. A formatted print is better, printf '%s\n' "$p", or at least echo "$p"
With this,
while IFS= read -r p; do
if condition
then
echo "$p" | perl -pe 's/a/b/'
else
echo "$p"
fi
done < file.txt > temp_out.txt
mv temp_out.txt file.txt
Finally, if the sole purpose of the Perl one-liner were to run a simple substitution, it is much better to simply do that in the shell itself than to have a pipeline and run a whole new process for each line.
echo "${p//a/b}"
Thanks to Charles Duffy for raising all these points in comments.
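Putting these points together, a minimal bash sketch might look like the following. The sample file and the condition (the line starts with "x") are assumptions made up for the example, since the question leaves the condition open.

```shell
#!/usr/bin/env bash
# Hypothetical input file; the third line has leading and trailing spaces
# and a literal backslash, to show that both survive.
file=$(mktemp)
printf '%s\n' 'x banana' 'y banana' '  x\ty  ' > "$file"

tmp=$(mktemp)
while IFS= read -r p; do
  if [[ $p == x* ]]; then         # made-up condition: line starts with "x"
    printf '%s\n' "${p//a/b}"     # replace every "a" with "b" in the shell
  else
    printf '%s\n' "$p"            # unmodified lines are copied as they are
  fi
done < "$file" > "$tmp"           # one redirection for the whole loop
mv "$tmp" "$file"

result=$(cat "$file")
printf '%s\n' "$result"
rm -f "$file"
```

Note that "${p//a/b}" replaces every occurrence, while the Perl s/a/b/ replaces only the first one; "${p/a/b}" with a single slash is the closer match.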
A few comments on Perl one-liners. See documentation at perlrun.
The command perl -e '...' executes any valid Perl code between ''. When we add the -n or -p switch it also reads standard input and executes that code on it one line at a time, where -p also prints out each line after it's processed. The standard input can be supplied to it from a file,
perl -pe '...' input.txt
in which case adding -i flag will result in the file being changed in-place. Or, the input can be piped into it, for example
echo "input text" | perl -pe '...'
in which case the processed line is printed to standard output. This can be redirected to a file, as in the answer above.
To make changes to a given file a line at a time you only need this on the command line
perl -i -pe 's/a/b/' file.txt
If there is more work to do then it may well be better to put it in a script, of course. In this case the one-liner can be a command in the bash script as well, replacing all that code above (unless some bash-specific functionality is preferred for processing lines).

Very weird behavior using sed

I have a big problem doing a script: basically, I read a line from files.
All lines are made of 3 to 8 characters contiguous (no space).
Then I used sed to replace those lines inside a pattern (aka "var" in my minimal script below)
var="iao"
for m in `more meshing/junction_names.txt`
do
echo $m
echo -n $m | xxd -ps | sed 's/[[:xdigit:]]\{2\}/\\x&/g'
echo $var |sed "s/a/b/"
echo $var |sed "s/a/$m/"
done
Now these are the first 3 record of my output (they are all the same anyway).
I am using Linux. According to Kate, all files are encoded as UTF-8. Very weird, huh? Any idea why this happens is welcome.
J_LEAK
\x4a\x5f\x4c\x45\x41\x4b\x0d
ibo
oJ_LEAK
JO_1
\x4a\x4f\x5f\x31\x0d
ibo
oJO_1
JPL2_F
\x4a\x50\x4c\x32\x5f\x46\x0d
ibo
oJPL2_F
JF_PL2
Your input file contains DOS carriage returns (or possibly, the absurd attempt to read it with more introduces them). The hex dump shows this clearly; every value ends with \x0d which translates to a control code which causes the terminal to jump the cursor back to the beginning of the line.
This is a massive FAQ and you can find many examples of how to troubleshoot this basic problem, including in the bash tag wiki.
Tangentially, you should always quote strings unless you specifically require the shell to perform wildcard expansion and whitespace tokenization on the value; and Bash has built-ins to avoid the inelegant and somewhat error-prone echo | sed. Finally, don't read lines with for.
var="iao"
tr -d '\015' <meshing/junction_names.txt |
while read -r m; do # don't use a for loop
echo "$m" # quote!
echo -n "$m" | xxd -ps | sed 's/[[:xdigit:]]\{2\}/\\x&/g'
echo "${var/a/b}" # quote; use Bash built-in substitution mechanism
echo "${var/a/$m}"
done
Perhaps you want to remove the carriage returns once and for all, and then just use while read .... done <fixed-file instead of the tr pipeline.
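To see why the output looks scrambled: with the carriage return left in, the substituted string is "i", the name, a carriage return, and then "o", so the terminal jumps back to column 0 and prints the "o" over the "i". The sketch below uses a made-up two-line file in place of meshing/junction_names.txt, counts the affected lines, and strips the carriage return inside the loop without any bash-only syntax.

```shell
# Hypothetical stand-in for meshing/junction_names.txt, with DOS line endings.
names=$(mktemp)
printf 'J_LEAK\r\nJO_1\r\n' > "$names"

cr=$(printf '\r')   # a literal carriage return, portable to plain sh

# Count the lines that end in a carriage return; non-zero means DOS endings.
crlf_lines=$(grep -c "$cr\$" "$names")
printf 'lines ending in CR: %s\n' "$crlf_lines"

# Strip one trailing CR per line while reading, then build the output.
fixed=$(
  while IFS= read -r m; do
    m=${m%"$cr"}
    printf 'i%so\n' "$m"   # same result as replacing "a" in "iao" with $m
  done < "$names"
)
printf '%s\n' "$fixed"

rm -f "$names"
```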

Read an input file in shell script and store its lines in a variable

I'm new to UNIX and have this really simple problem:
I have a text-file (input.txt) containing a string in each line. It looks like this:
House
Monkey
Car
And inside my shell script I need to read this input file line by line to get to a variable like this:
things="House,Monkey,Car"
I know this sounds easy, but I just couldn't find any simple solution for this. My closest attempt so far:
#!/bin/sh
things=""
addToString() {
things="${things},$1"
}
while read line; do addToString $line ;done <input.txt
echo $things
But this won't work. Based on my Google research, I thought the while loop would create a new subshell, but I was wrong there (see the comment section). Nevertheless, the variable "things" was still not available in the echo later on. (I cannot just write the echo inside the while loop, because I need to work with that string later on.)
Could you please help me out here? Any help will be appreciated, thank you!
What you proposed works fine! I've only made two changes here: Adding missing quotes, and handling the empty-string case.
things=""
addToString() {
if [ -n "$things" ]; then
things="${things},$1"
else
things="$1"
fi
}
while read -r line; do addToString "$line"; done <input.txt
echo "$things"
If you were piping into while read, this would create a subshell, and that would eat your variables. You aren't piping -- you're doing a <input.txt redirection. No subshell, code works without changes.
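The difference between the two forms can be checked with a counter. This is a small sketch with a made-up input file. The piped result of 0 is what bash gives with its default settings (lastpipe off) and what dash gives; some other shells run the last stage of a pipeline in the current shell and behave differently.

```shell
# Hypothetical input file with three lines.
input=$(mktemp)
printf '%s\n' House Monkey Car > "$input"

# Piped: the loop runs in a subshell, so the increments are lost afterwards.
count_piped=0
cat "$input" | while IFS= read -r line; do
  count_piped=$((count_piped + 1))
done

# Redirected: the loop runs in the current shell and the variable survives.
count_redirected=0
while IFS= read -r line; do
  count_redirected=$((count_redirected + 1))
done < "$input"

printf 'piped=%s redirected=%s\n' "$count_piped" "$count_redirected"
rm -f "$input"
```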
That said, there are better ways to read lists of items into shell variables. On any version of bash after 3.0:
IFS=$'\n' read -r -d '' -a things <input.txt # read into an array
printf -v things_str '%s,' "${things[@]}" # write array to a comma-separated string
echo "${things_str%,}" # print that string w/o trailing comma
...on bash 4, that first line can be:
readarray -t things <input.txt # read into an array
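As a variation on the array approach (assuming bash 4 or later for readarray), the join can also be done with "${things[*]}", which glues the elements together with the first character of IFS and so leaves no trailing comma to trim. The input file here is a temporary stand-in for input.txt.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for input.txt.
input=$(mktemp)
printf '%s\n' House Monkey Car > "$input"

readarray -t things < "$input"    # one array element per line (bash 4+)

# "${things[*]}" joins the elements with the first character of IFS.
# Changing IFS inside $( ) keeps the change from leaking into the script.
things_str=$(IFS=,; printf '%s' "${things[*]}")

printf '%s\n' "$things_str"
rm -f "$input"
```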
This is not a shell solution, but the truth is that solutions in pure shell are often excessively long and verbose. So e.g. to do string processing it is better to use special tools that are part of the “default” Unix environment.
sed ':b;N;$!bb;s/\n/,/g' < input.txt
If you want to omit empty lines, then:
sed ':b;N;$!bb;s/\n\n*/,/g' < input.txt
Speaking about your solution, it should work, but you should really always use quotes where applicable. E.g. this works for me:
things=""
while read line; do things="$things,$line"; done < input.txt
echo "$things"
(Of course, there is an issue with this code, as it outputs a leading comma. If you want to skip empty lines, just add an if check.)
This might/might not work, depending on the shell you are using. On my Ubuntu 14.04/x64, it works with both bash and dash.
To make it more reliable and independent from the shell's behavior, you can try to put the whole block into a subshell explicitly, using the (). For example:
(
things=""
addToString() {
things="${things},$1"
}
while read line; do addToString $line ;done
echo $things
) < input.txt
P.S. You can use something like this to avoid the initial comma. Without bash extensions (using short-circuit logical operators instead of the if for shortness):
test -z "$things" && things="$1" || things="${things},${1}"
Or with the ${parameter:+word} expansion (which is standard POSIX rather than a bash extension, so it also works in dash):
things="${things}${things:+,}${1}"
P.P.S. How I would have done it:
tr '\n' ',' < input.txt | sed 's!,$!\n!'
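A different tool for the same job, not mentioned above, is the POSIX paste utility: with -s it joins all lines of a file into one, and -d sets the separator. It adds no trailing comma, so no sed clean-up is needed. The file below is a temporary stand-in for input.txt.

```shell
# Hypothetical stand-in for input.txt.
input=$(mktemp)
printf '%s\n' House Monkey Car > "$input"

# -s: join all lines of the file serially; -d ',': use a comma as separator.
things=$(paste -s -d ',' "$input")

printf '%s\n' "$things"
rm -f "$input"
```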
You can do this too:
#!/bin/bash
while read -r i
do
[[ $things == "" ]] && things="$i" || things="$things","$i"
done < <(grep . input.txt)
echo "$things"
Output:
House,Monkey,Car
N.B:
I used grep to deal with empty lines and with the possibility that the file has no newline at the end. (A plain while read loop skips the last line if there is no newline at the end of the file, because read returns a non-zero status for it.)
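The missing-newline case can also be handled without grep. In that situation read still fills the variable but returns a non-zero status, so a common idiom is to add an extra test for a non-empty variable. A small sketch with a made-up file that lacks the final newline:

```shell
# Hypothetical input; note that there is no newline after "Car".
input=$(mktemp)
printf 'House\nMonkey\nCar' > "$input"

# Plain loop: the last, unterminated line is never processed.
plain=0
while IFS= read -r line; do
  plain=$((plain + 1))
done < "$input"

# Guarded loop: also run the body when read fails but left text in $line.
things=""
guarded=0
while IFS= read -r line || [ -n "$line" ]; do
  guarded=$((guarded + 1))
  things="${things}${things:+,}${line}"
done < "$input"

printf 'plain=%s guarded=%s things=%s\n' "$plain" "$guarded" "$things"
rm -f "$input"
```

Unlike the grep version, this keeps empty lines, so they would still need an explicit check if they should be skipped.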

How do I iterate over each line in a file with Bash?

Given a text file with multiple lines, I would like to iterate over each line in a Bash script. I had attempted to use cut, but cut does not accept \n (newline) as a delimiter.
This is an example of the file I am working with:
one
two
three
four
Does anyone know how I can loop through each line of this text file in Bash?
I found myself in the same problem, this works for me:
cat file.cut | cut -d$'\n' -f1
Or:
cut -d$'\n' -f1 file.cut
Use cat for concatenating or displaying. No need for it here.
file="/path/to/file"
while read line; do
echo "${line}"
done < "${file}"
Simply use:
echo -n `cut ...`
This suppresses the \n at the end
cat FILE|while read line; do # 'line' is the variable name
echo "$line" # do something here
done
or (see comment):
while read line; do # 'line' is the variable name
echo "$line" # do something here
done < FILE
So, some really good (possibly better) answers have been provided already. But looking at the phrasing of the original question, in wanting to use a BASH for-loop, it amazed me that nobody mentioned a solution with change of Field Separator IFS. It's a pure bash solution, just like the accepted read line
old_IFS=$IFS
IFS=$'\n'
for field in $(<filename)
do your_thing;
done
IFS=$old_IFS
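Two details matter for a for loop of this kind: IFS has to hold a real newline character (the two-character string \n would split on backslashes and on the letter n), and globbing should be switched off so that a line such as * is not expanded into file names. The sketch below shows both, with a made-up input file and $(cat ...) in place of $(<...) so that it also runs in plain sh. Empty lines are still dropped by this approach.

```shell
# Hypothetical input, including a line with a space and a line that is a glob.
input=$(mktemp)
printf '%s\n' one 'two words' '*' four > "$input"

old_IFS=$IFS
# A single literal newline between the quotes.
IFS='
'
set -f                      # disable globbing while the words are split
joined=""
for field in $(cat "$input"); do
  joined="${joined}${joined:+|}${field}"
done
set +f
IFS=$old_IFS

printf '%s\n' "$joined"
rm -f "$input"
```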
If you are sure that the output will always be newline-delimited, use head -n 1 in lieu of cut -f1 (note that you mentioned a for loop in a script and your question was ultimately not script-related).
Many of the other answers, including the accepted one, have multiple lines unnecessarily. No need to do this over multiple lines or changing the default delimiter on the system.
Also, the solution provided by Ivan with -d$'\n' did not work for me either on Mac OSX or CentOS 7. Since his answer is four years old, I assume something must have changed on the logic of the $ character for this situation.
While loop with input redirection and read command.
You should not be using cut to perform a sequential iteration of each line in a file as cut was not designed to do this.
Print selected parts of lines from each FILE to standard output.
— man cut
TL;DR
You should use a while loop with the read -r command and redirect standard input to your file inside a function scope where IFS is set to \n and use -E when using echo.
processFile() { # Function scope to prevent overwriting IFS globally
local file="$1" # Any file that exists
local IFS=$'\n' # Newline only, so leading spaces and tabs are preserved
while read -r line; do # read returns non-zero at end of file; -r keeps \ literal
echo -E "$line" # -E prints \ literally instead of interpreting escapes
done < "$file" # Input redirection lets read take the file from stdin
}
processFile /path/to/file
Iteration
In order to iterate over each line of a file, we can use a while loop. This will let us iterate as many times as we need to.
while <condition>; do
<body>
done
Getting our file ready to read
We can use the read command to store a single line from standard input in a variable. Before we can use that to read a line from our file, we need to redirect standard input to point to our file. We can do this with input redirection. According to the man pages for bash, the syntax for redirection is [fd]<file where fd defaults to standard input (a.k.a file descriptor 0). We can place this before or after our while loop.
while <condition>; do
<body>
done < /path/to/file
# or the non-traditional way
</path/to/file while <condition>; do
<body>
done
Reading the file and ending the loop
Now that our file can be read from standard input, we can use read. The syntax for read in our context is read [-r] var... where -r preserves the \ (backslash) character, instead of using it as an escape sequence character, and var is the name of the variable to store the input in. You can have multiple variables to store pieces of the input in but we only need one to read an entire line. Along with this, to preserve any backslashes in any output from echo you will likely need to use the -E flag to disable the interpretation of backslash escapes. If you have any indentation (spaces or tabs), you will need to temporarily change the IFS (Input Field Separators) variable to a newline only, written IFS=$'\n' in bash (a plain "\n" would set it to a backslash and the letter n); normally it is set to space, tab, and newline.
main() {
local IFS=$'\n'
read -r line
echo -E "$line"
}
main
How do we use read to end our while loop?
There is really only one reliable way, that I know of, to determine when you've finished reading a file with read: check the exit value of read. If the exit value of read is 0 then we successfully read a line, if it is 1 or higher then we reached EOF (end of file). With that in mind, we can place the call to read in our while loop's condition section.
processFile() {
# Could be any file you want hardcoded or dynamic
file="$1"
local IFS=$'\n'
while read -r line; do
# Process line here
echo -E "$line"
done < "$file"
}
processFile /path/to/file1
processFile /path/to/file2
A visual breakdown of the above code via Explain Shell.
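A variation on processFile, offered as a sketch rather than a replacement: setting IFS= as a prefix to read applies it to that one command only, so there is nothing to restore, and printf '%s\n' avoids relying on echo options, which differ between shells. The sample file is made up for the example.

```shell
processFile() {
  # IFS= applies to this read only; -r keeps backslashes literal.
  while IFS= read -r line; do
    printf '%s\n' "$line"     # prints the line as is, whatever it contains
  done < "$1"
}

# Hypothetical sample with leading spaces, a literal \t, and trailing spaces.
sample=$(mktemp)
printf '%s\n' '    indented' 'tab\there' 'trailing  ' > "$sample"

output=$(processFile "$sample")
printf '%s\n' "$output"
rm -f "$sample"
```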
If I am executing a command and want to cut the output but it has multiple lines I found it helpful to do
echo $([command]) | cut [....]
This puts all the output of [command] on a single line that can be easier to process.
In my opinion, cut already treats '\n' as its line delimiter: it processes the input one line at a time (its default field delimiter is TAB).
If you want to use cut, I have two ways:
cut -d^M -f1 file_cut
I type ^M by pressing Ctrl+V and then Enter. Another way is
cut -c 1- file_cut
Does that help?

Cat with new line

My input file's contents are:
welcome
welcome1
welcome2
My script is:
for groupline in `cat file`
do
echo $groupline;
done
I got the following output:
welcome
welcome1
welcome2
Why doesn't it print the empty line?
you need to set IFS to newline \n
IFS=$"\n"
for groupline in $(cat file)
do
echo "$groupline";
done
Or put double quotes. See here for explanation
for groupline in "$(cat file)"
do
echo "$groupline";
done
Without meddling with IFS, the "proper" way is to use a while read loop:
while read -r line
do
echo "$line"
done <"file"
Because you're doing it all wrong. You want while not for, and you want read, not cat:
while read groupline
do
echo "$groupline"
done < file
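The reason the for loop loses the empty line is word splitting: the unquoted $(cat file) is split on runs of whitespace, and an empty line produces no word at all. A small sketch that counts what each loop sees, using a made-up file with one blank line (the position of the blank line is an assumption):

```shell
# Hypothetical input with one empty line after "welcome".
file=$(mktemp)
printf 'welcome\n\nwelcome1\nwelcome2\n' > "$file"

# for: the command substitution is split into words; the blank line vanishes.
for_count=0
for groupline in $(cat "$file"); do
  for_count=$((for_count + 1))
done

# while read: one iteration per line, including the empty one.
while_count=0
blank_count=0
while IFS= read -r groupline; do
  while_count=$((while_count + 1))
  if [ -z "$groupline" ]; then
    blank_count=$((blank_count + 1))
  fi
done < "$file"

printf 'for=%s while=%s blank=%s\n' "$for_count" "$while_count" "$blank_count"
rm -f "$file"
```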
The solution ghostdog74 provided is helpful, but has a flaw.
IFS could not use double quotes (at least in Mac OS X), but can use single quotes like:
IFS=$'\n'
It's nice but not dash-compatible, maybe this is better:
IFS='
'
The blank line will be eaten in the following program:
IFS='
'
for line in $(cat file)
do
echo "$line"
done
But you cannot add double quotes around $(cat file), because the whole file is then treated as one single string:
for line in "$(cat file)"
If you want blank lines to be processed as well, use the following:
while read line
do
echo "$line"
done < file
Using IFS=$"\n" and var=$(cat text.txt) removes all the "n" characters from the output of echo $var
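This effect can be reproduced in isolation. In bash, $"..." is a locale-translated double-quoted string, so $"\n" yields the same two characters as '\n': a backslash and the letter n. With those in IFS, an unquoted expansion is split at every n, and echo joins the pieces with spaces. The sketch uses IFS='\n' so that it behaves the same in plain sh; the sample word is made up.

```shell
var="banana"

# IFS holds a backslash and the letter n, so $var is split at every "n".
# $var is unquoted on purpose, to trigger word splitting.
broken=$(IFS='\n'; echo $var)

# IFS holds a real newline: nothing in "banana" matches, so it stays whole.
nl='
'
intact=$(IFS=$nl; echo $var)

printf '%s\n' "$broken" "$intact"
```

Quoting the expansion, as in echo "$var", avoids the splitting altogether, whatever IFS contains.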
