bash script to update postgres database - bash

I have some html data stored in text files right now. I recently decided to store the HTML data in the pgsql database instead of flat files. Right now, the 'entries' table contains a 'path' column that points to the file. I have added a 'content' column that should now store the data in the file pointed to by 'path'. Once that is complete, the 'path' column will be deleted. The problem that I am having is that the files contain apostrophes that throw my script out of whack. What can I do to correct this issue??
Here is the script
#!/bin/sh
dbname="myDB"
username="username"
fileroot="/path/to/the/files/*"
for f in $fileroot
do
psql $dbname $username -c "
UPDATE entries
SET content='`cat $f`'
WHERE id=SELECT id FROM entries WHERE path LIKE '*`$f`';"
done
Note: The logic in the id=SELECT...FROM...WHERE path LIKE "" is not the issue. I have tested this with sample filenames in the pgsql environment.
The problem is that when I cat $f, any apostrophe in Edit: the contents of $f closes the SQL string, and I get a syntax error.

For the single quote escaping issue, a reasonable workaround might be to double the quotes, so you'd use:
`sed "s/'/''/g" < "$f"`
to include the file contents instead of the cat, and for the second invocation in the LIKE where you appeared to intend to use the file name use:
${f/"'"/"''"/}
to include the literal string content of $f instead of executing it, and double the quotes. The ${varname/match/replace} expression is bash syntax and may not work in all shells; use:
`echo "$f" | sed "s/'/''/g"`
if you need to worry about other shells.
There are a bunch of other problems in that SQL.
You're trying to execute $f in your second invocation. I'm pretty sure you didn't intend that; I imagine you meant to include the literal string.
Your subquery is also wrong, it lacks parentheses; (SELECT ...) not just SELECT.
Your LIKE expression is also probably not doing what you intended; you probably meant % instead of *, since % is the SQL wildcard.
If I also change backticks to $() (because it's clearer and easier to read IMO), fix the subquery syntax and add an alias to disambiguate the columns, and use a here-document instead passed to psql's stdin, the result is:
psql $dbname $username <<__END__
UPDATE entries
SET content=$(sed "s/'/''/g" < "$f")
WHERE id=(SELECT e.id FROM entries e WHERE e.path LIKE '$(echo "$f" | sed "s/'/''/g")');
__END__
The above assumes you're using a reasonably modern PostgreSQL with standard_conforming_strings = on. If you aren't, change the regexp to escape apostrophes with \ instead of doubling them, and prefix the string with E, so O'Brien becomes E'O\'Brien'. In modern PostgreSQL it'd instead become 'O''Brien'.
In general, I'd recommend using a real scripting language like Perl with DBD::Pg or Python with psycopg to solve scripting problems with databases. Working with the shell is a bit funky. This expression would be much easier to write with a database interface that supported parameterised statements.
For example, I'd write this as follows:
import os
import sys
import psycopg2
try:
connstr = sys.argv[1]
filename = sys.argv[2]
except IndexError as ex:
print("Usage: %s connect_string filename" % sys.argv[0])
print("Eg: %s \"dbname=test user=fred\" \"some_file\"" % sys.argv[0])
sys.exit(1)
def load_file(connstr,filename):
conn = psycopg2.connect(connstr)
curs = conn.cursor()
curs.execute("""
UPDATE entries
SET content = %s
WHERE id = (SELECT e.id FROM entries e WHERE e.path LIKE '%%'||%s);
""", (filename, open(filename,"rb").read()))
curs.close()
if __name__ == '__main__':
load_file(connstr,filename)
Note the SQL wildcard % is doubled to escape it, so it results in a single % in the final SQL. That's because Python is using % as its format-specifier so a literal % must be doubled to escape it.
You can trivially modify the above script to accept a list of file names, connect to the database once, and loop over the list of all file names. That'll be a lot faster, especially if you do it all in one transaction. It's a real pain to do that with psql scripting; you have to use bash co-process as shown here ... and it isn't worth the hassle.

In the original post, I made it sound like there were apostrophes in the filename represented by $f. This was NOT the case, so a simple echo "$f" was able to fix my issue.
To make it more clear, the contents of my files were formatted as html snippets, typically something like <p>Blah blah <b>blah</b>...</p>. After trying the solution posted by Craig, I realized I had used single quotes in some anchor tags, and I did NOT want to change those to something else. There were only a few files where this violation occurred, so I just changed these to double quotes by hand. I also realized that instead of escaping the apostrophes, it would be better to convert them to &apos; Here is the final script that I ended up using:
dbname="myDB"
username="username"
fileroot="/path/to/files/*"
for f in $fileroot
do
psql $dbname $username << __END__
UPDATE entries
SET content='$(sed "s/'/\&apos;/g" < "$f")'
WHERE id=(SELECT e.id FROM entries e WHERE path LIKE '%$(echo "$f")');
__END__
done
The format coloring on here might make it look like the syntax is incorrect, but I have verified that it is correct as posted.

Related

How to create new names for files with problematic characters for use in an existing bash scripted environment?

The goal is to get rid of (by changing) filenames that give headaches for scripting by translating them to something else. The reason is that in this nearly 30 year Unix / Linux environment, with a lot of existing scripts that may not be "written correctly", a new, large and important cache of files arrived that have to be managed, and so, a colleague has asked me to write a script to help with "problematic filenames" and translate them. They've got a list of chars to turn into dots, such as the comma, and another list to turn into underscores, such as whitespace, as but two examples and ran into problems which I asked about over here.
I was using tr to do it, but commenters to it said I should perhaps ask just about this instead of how to get tr to work. So, I have!
Parameter expansion can do this for you.
Note that unlike when using tr (as requested on your other question), when using parameter expansion you don't need to use backslashes inside your character class definitions: put the expansion in double quotes and bash will treat the results of that expansion as literal.
#!/usr/bin/env bash
toDots='\,;:|+##$%^&*~'
toUnderscores='}{]['"'"'="()`!'
# requires bash 5+: if debug=1, then print what we would do instead of doing it
runOrDebug() {
if (( debug )); then
printf '%s\n' "${*#Q}"
else
"$#"
fi
}
renameFiles() {
local name subDots subBoth
for name; do
subDots=${name//["$toDots"]/.}
subBoth=${subDots//["$toUnderscores"]/_}
if [[ $subBoth != "$name" ]]; then
runOrDebug mv -- "$name" "$subBoth"
fi
done
}
debug=1 renameFiles '[/a],/;[p:r|o\b+lem#a#t$i%c]/#(%$^!/(e^n&t*ry)~='
Note that toUnderscores is (except for the single quote in the middle) in single quotes, so all the backslashes in it are part of the variable's data rather than being syntax; because globs use character class syntax from REs, they're parsed as POSIX regular expression character class syntax.
See a demonstration of the technique running at https://ideone.com/kKE7IJ

Renaming the file Directory which contains Space based on CSV in Shell

I need to rename the files inside the folder that has a space in it eg(Deco/main library/file1.txt )
code:
while IFS="," read orig new pat
do
mv -v $pat$new $pat$orig
done < new.csv
csv file:
newname,file1.txt,Deco/main\\\ library/
error:
mv: invalid option -- '\'
Welcome to Stackoverflow!
First: Use quotes around the use of variables. That means except in very rare occasions, you always should use "$foo" instead of $foo because if you are using the latter, the shell is supposed (and will) interpret spaces in the variables as word delimiters which you rarely want. Especially in your case you do not want it.
Second: Your CSV file seems to contain backslashes to quote the spaces. And some additional step seems to have added another level of quotation so than now you end up with three backslashes and a space for each original space. If this really is the case (please double check if what you wrote in your question is correct, otherwise my answer doesn't fit), you need to unquote this before you can use it.
There are security issues involved in using eval, so do not use it lightly (this disclaimer is necessary whenever proposing to use eval), but if you have trust in the input you are handling to not contain any nastinesses, then you can do this using this code:
while IFS="," read orig new pat
do
eval eval mv -v "$pat$new" "$pat$orig"
done < new.csv
Using this, two levels of quotation are evaluated (that's what eval does) before the mv command is executed.
I strongly suggest to do a dry run by adding echo before the mv first. Then instead of executing your commands they are merely printed first.

How do I locally source environment variables that I have defined in a Docker-format env-file?

I've written a bunch of environment variables in Docker format, but now I want to use them outside of that context. How can I source them with one line of bash?
Details
Docker run and compose have a convenient facility for importing a set of environment variables from a file. That file has a very literal format.
The value is used as is and not modified at all. For example if the value is surrounded by quotes (as is often the case of shell variables), the quotes are included in the value passed
Lines beginning with # are treated as comments and are ignored
Blank lines are also ignored.
"If no = is provided and that variable is…exported in your local environment," docker "passes it to the container"
Thankfully, whitespace before the = will cause the run to fail
so, for example, this env-file:
# This is a comment, with an = sign, just to mess with us
VAR1=value1
VAR2=value2
USER
VAR3=is going to = trouble
VAR4=this $sign will mess with things
VAR5=var # with what looks like a comment
#VAR7 =would fail
VAR8= but what about this?
VAR9="and this?"
results in these env variables in the container:
user=ubuntu
VAR1=value1
VAR2=value2
VAR3=is going to = trouble
VAR4=this $sign will mess with things
VAR5=var # with what looks like a comment
VAR8= but what about this?
VAR9="and this?"
The bright side is that once I know what I'm working with, it's pretty easy to predict the effect. What I see is what I get. But I don't think bash would be able to interpret this in the same way without a lot of changes. How can I put this square Docker peg into a round Bash hole?
tl;dr:
source <(sed -E -e "s/^([^#])/export \1/" -e "s/=/='/" -e "s/(=.*)$/\1'/" env.list)
You're probably going to want to source a file, whose contents
are executed as if they were printed at the command line.
But what file? The raw docker env-file is inappropriate, because it won't export the assigned variables such that they can be used by child processes, and any of the input lines with spaces, quotes, and other special characters will have undesirable results.
Since you don't want to hand edit the file, you can use a stream editor to transform the lines to something more bash-friendly. I started out trying to solve this with one or two complex Perl 5 regular expressions, or some combination of tools, but I eventually settled on one sed command with one simple and two extended regular expressions:
sed -E -e "s/^([^#])/export \1/" -e "s/=/='/" -e "s/(=.*)$/\1'/" env.list
This does a lot.
The first expression prepends export to any line whose first character is anything but #.
As discussed, this makes the variables available to anything else you run in this session, your whole point of being here.
The second expression simply inserts a single-quote after the first = in a line, if applicable.
This will always enclose the whole value, whereas a greedy match could lop off some of (e.g.) VAR3, for example
The third expression appends a second quote to any line that has at least one =.
it's important here to match on the = again so we don't create an unmatched quotation mark
Results:
# This is a comment, with an =' sign, just to mess with us'
export VAR1='value1'
export VAR2='value2'
export USER
export VAR3='is going to = trouble'
export VAR4='this $sign will mess with things'
export VAR5='var # with what looks like a comment'
#VAR7 ='would fail'
export VAR8=' but what about this?'
export VAR9='"and this?"'
Some more details:
By wrapping the values in single-quotes, you've
prevented bash from assuming that the words after the space are a command
appropriately brought the # and all succeeding characters into the VAR5
prevented the evaluation of $sign, which, if wrapped in double-quotes, bash would have interpreted as a variable
Finally, we'll take advantage of process substitution to pass this stream as a file to source, bring all of this down to one line of bash.
source <(sed -E -e "s/^([^#])/export \1/" -e "s/=/='/" -e "s/(=.*)$/\1'/" env.list)
Et voilĂ !

Printf splits a string at spaces using Bash [duplicate]

This question already has answers here:
Why a variable assignment replaces tabs with spaces
(2 answers)
Closed 7 years ago.
I'm having some troubles with the printf function in bash.
I wrote a little script on which I pass a name and two letters (such as "sh", "py", "ht") and it creates a file in the current working directory named "name.extension".
For instance, if I execute seed test py a file named test.py is created in the current working dir with the shebang #!/usr/bin/python3.
So far, so good, nothing fancy: I'm learning shell scripting and I thought this could be a simple exercise to test the knowledge gained so far.
The problem is when I want to create an HTML file. This is the function that I use:
creaHtml(){
head='<!--DOCTYPE html-->\n<html>\n\t<head>\n\t\t<meta charset=\"UTF-8\">\n\t</head>\n\t<body>\n\t</body>\n</html>'
percorso=$CARTELLA_CORRENTE/$NOME_FILE.html
printf $head>>$percorso
chmod 755 $percorso
}
If I run, for instance, seed test ht the correct function (creaHtml) is called, test.html is created but if I try to look into it I only see:
<!--DOCTYPE
And nothing else.
This is the trace for that function:
[sviluppo:~/bin]$ seed test ht
+ creaHtml
+ head='<!--DOCTYPE html-->\n<html>\n\t<head>\n\t\t<meta charset=\"UTF-8\">\n\t</head>\n\t<body>\n\t</body>\n</html>'
+ percorso=/home/sviluppo/bin/test.html
+ printf '<!--DOCTYPE' 'html-->\n<html>\n\t<head>\n\t\t<meta' 'charset=\"UTF-8\">\n\t</head>\n\t<body>\n\t</body>\n</html>'
+ chmod 755 /home/sviluppo/bin/test.html
+ set +x
However, if I try to run printf '<!--DOCTYPE html-->\n<html>\n\t<head>\n\t\t<meta charset=\"UTF-8\">\n\t</head>\n\t<body>\n\t</body>\n</html>' from the terminal, I see the correct output: the "skeleton" of an HTML file neatly displayed with indentation and everything. What am I missing here?
Try echo -e instead of printf. printf is for printing formatted strings. Since you didn't protect $head with quotes, bash splits the string to form the command. The first word (before first white space) forms the format string. The rest are just arguments for things you didn't specify to print.
echo -e "$head" > "$percorso"
The -e evaluates your \n into newlines. I changed your >> to > since it looks like you want this to be the whole file, rather than append to any existing file you might have.
You have to be careful with quotes in bash. One thing can become many things. This actually makes it more powerful, but it can be confusing for people learning. Notice that I also put the file name "$percorso" in double quotes too. This evaluates the variable and makes sure that it ends up as one thing. If you use single quotes, it will be one word, but not evaluated. Unlike Python, there is a big difference between single and double quotes.
If you want to use printf for compatibility as #chepner pointed out, just be sure to quote it:
printf "$head" > "$percorso"
Actually that is much simpler anyway.

error in shell script: unexpected end of file

The following script is showing me "unexpected end of file" error. I have no clue why am I facing this error. My all the quotes are closed properly.
#!/usr/bin/sh
insertsql(){
#sqlite3 /mnt/rd/stats_flow_db.sqlite <<EOF
echo "insert into flow values($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18)"
#.quit
}
for i in {1..100}
do
src_ip = "10.1.2."+$i
echo $src_ip
src_ip_octets = ${src_ip//,/}
src_ip_int = $src_ip_octets[0]*1<<24+$src_ip_octets[1]*1<<16+$src_ip_octets[2]*1<<8+$src_ip_octets[3]
dst_ip = "10.1.1."+$i
dst_ip_octets = ${dst_ip//,/}
dst_ip_int = $dst_ip_octets[0]*1<<24+$dst_ip_octets[1]*1<<16+$dst_ip_octets[2]*1<<8+$dst_ip_octets[3]
insertsql(1, 10000, $dst_ip, 20000, $src_ip, "2012-08-02,12:30:25.0","2012-08-02,12:45:25.0",0,0,0,"flow_a010105_a010104_47173_5005_1_50183d19.rrd",0,12,$src_ip_int,$dst_ip_int,3,50000000,80000000)
done
That error is caused by <<. When encountering that, the script tries to read until it finds a line which has exactly (starting in the first column) what is found after the <<. As that is never found, the script searches to the end and then complains that the file ended unexpectedly.
That will not be your only problem, however. I see at least the following other problems:
You can only use $1 to $9 for positional parameters. If you want to go beyond that, the use of the shift command is required or, if your version of the shell supports it, use braces around the variable name; e.g. ${10}, ${11}...
Variable assignments must not have whitespace arount the equal sign
To call your insertsql you must not use ( and ); you'd define a new function that way.
The cass to your insertsql function must pass the parameters whitespace separated, not comma separated.
A couple of problems:
There should be no space between equal sign and two sides of an assignment: e.g.,: dst_ip="10.1.1.$i"
String concatenation is not done using plus sign e.g., dst_ip="10.1.1.$i"
There is no shift operator in bash, no <<: $dst_ip_octets[0]*1<<24 can be done with expr $dst_ip_octets[0] * 16777216 `
Functions are called just like shell scripts, arguments are separated by space and no parenthesis: insertsql 1 10000 ...
That is because you don't follow shell syntax.
To ser variable you are not allowed to use space around = and to concatenate two parts of string you shouldn't use +. So the string
src_ip = "10.1.2."+$i
become
src_ip="10.1.2.$i"
Why you're using the string
src_ip_octets = ${src_ip//,/}
I don't know. There is absolutely no commas in you variable. So even to delete all commas it should look like (the last / is not required in case you're just deleting symbols):
src_ip_octets=${src_ip//,}
The next string has a lot of symbols that shell intepreter at its own way and that's why you get the error about unexpected end of file (especially due to heredoc <<)
src_ip_int = $src_ip_octets[0]*1<<24+$src_ip_octets[1]*1<<16+$src_ip_octets[2]*1<<8+$src_ip_octets[3]
So I don't know what exactly did you mean, though it seems to me it should be something like
src_ip_int=$(( ${src_ip_octets%%*.}+$(echo $src_ip_octets|sed 's/[0-9]\+\.\(\[0-9]\+\)\..*/\1/')+$(echo $src_ip_octets|sed 's/\([0-9]\+\.\)\{2\}\(\[0-9]\+\)\..*/\1/')+${src_ip_octets##*.} ))
The same stuff is with the next strings.
You can't do this:
dst_ip_int = $dst_ip_octets[0]*1<<24+$dst_ip_octets[1]*1<<16+$dst_ip_octets[2]*1<<8+$dst_ip_octets[3]
The shell doesn't do math. This isn't C. If you want to do this sort of calculation, you'll need to use something like bc, dc or some other tool that can do the sort of math you're attempting here.
Most of those operators are actually shell metacharacters that mean something entirely different. For example, << is input redirection, and [ and ] are used for filename globbing.

Resources