How to use a variable in sqlite3 csv import via awk - bash

I want to import a semicolon-separated text file into a sqlite3 database. Unfortunately the file contains a lot of spaces that I have to remove first, so I used something like this:
sqlite3 -csv sqliteout.db ".mode list" ".separator ;" ".import '|./normalize.awk infile.csv' importtable"
normalize.awk removes all the spaces and writes the result to standard output. The leading pipe makes .import read from the command's output instead of from a file. That works fine so far. Since I want to use this in a shell script, I would like to replace "infile.csv" with a variable, but I can't find a way to do this because the variable does not seem to be evaluated inside the single quotes. So this is not working:
infile="infile.csv"
sqlite3 -csv sqliteout.db ".mode list" ".separator ;" ".import '|./normalize.awk ${infile}' importtable"
I can't see what the problem is. Can anybody help?
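One thing that may be worth checking, as a minimal sketch: in bash, single quotes that appear inside a double-quoted string are ordinary characters, so ${infile} should still be expanded in the command above. The snippet below only builds and prints the .import argument; it does not call sqlite3. The names import_cmd and literal_cmd are introduced here for illustration.

```shell
infile="infile.csv"

# Double quotes outside, single quotes inside: the variable is expanded
import_cmd=".import '|./normalize.awk ${infile}' importtable"

# Single quotes outside: nothing is expanded, the text stays literal
literal_cmd='.import "|./normalize.awk ${infile}" importtable'

printf '%s\n' "$import_cmd"
printf '%s\n' "$literal_cmd"
```

If the first printed line shows the file name, the variable did reach sqlite3, and the cause may lie elsewhere, for example in the path or working directory the script runs in.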

Related

sqlite dot commands variable expansion like bash

I have a loop in bash that sets some pathnames as variables.
Within that loop I want to perform some sqlite commands based on these variables.
for example:
sqlitedb="/Users/Documents/database.db"
for mm in 01 02 03; do
filename1="A:/data1-${mm}.csv"
filename2="D:/data2-${mm}.csv"
sqlite3 "$sqlitedb" #create new file, it is a temporary file. no problem with this command.
.mode csv
.import "$filename1" data_1 #try to import the first data file. This command doesn't work
.import "$filename2" data_2 #try to import the second data file. This command doesn't work
# now do some important sql stuff which joins these files.
.quit
rm -f "$sqlitedb" #remove old file, ready for the next loop
done
Clearly, SQLITE doesn't know about my BASH variables. What is the best way to set variables, loop through files, etc within sqlite3?
If it helps, I'm using WSL ubuntu 18.04
You need a heredoc, as mentioned in comments:
for mm in 01 02 03; do
    filename1="A:/data1-${mm}.csv"
    filename2="D:/data2-${mm}.csv"
    sqlite3 -batch -csv <<EOF
.import "$filename1" data_1
.import "$filename2" data_2
-- Do stuff with the tables
EOF
done
(If you leave off the filename, sqlite3 uses an in-memory database, so you don't need a manual temporary database file unless you have a lot of data to store.)
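The heredoc works because the shell expands variables in its body before sqlite3 ever reads it, as long as the delimiter is not quoted. A minimal sketch of the difference, using cat as a stand-in for sqlite3 so the generated text can be inspected (the names expanded and literal are made up for this example):

```shell
filename1="A:/data1-01.csv"

# Unquoted delimiter: the shell substitutes $filename1 into the text
expanded=$(cat <<EOF
.import "$filename1" data_1
EOF
)

# Quoted delimiter: the body is passed through untouched
literal=$(cat <<'EOF'
.import "$filename1" data_1
EOF
)

printf '%s\n' "$expanded"
printf '%s\n' "$literal"
```

Replacing cat with sqlite3 -batch -csv sends the same text to sqlite3 as dot-commands.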

How can I update column values with the content of a file without interpreting it?

I need to update values in a column when the row matches a certain WHERE clause, using the content of a text file.
The content of the file is JavaScript code and as such it may contain single quotes, double quotes, slashes and backslashes; off the top of my head, it could contain other special characters too.
The content of the file cannot be modified.
This has to be done via psql, since the update is automated using bash scripts.
Using the following command - where scriptName is a previously declared bash variable -
psql -U postgres db<<EOF
\set script $(cat $scriptName.js))
UPDATE table SET scriptColumn=:script WHERE nameColumn='$scriptName';
EOF
returns the following error
ERROR: syntax error at or near "{"
LINE 1: ...{//...
^
I would like to treat the content of the file $scriptName.js as plain text, and avoid any interpretation of it.
You should quote the variable:
UPDATE table SET scriptColumn=:'script' WHERE ...
That causes the contents of the variable to be properly escaped as a string literal.
I found a solution to my problem, even though I don't know why it works.
I leave it here in the hope it might be useful to someone else, or that someone more knowledgeable than me will be able to explain why it works now.
In short, setting the variable as a psql parameter did the trick:
psql -U postgres db -v script="$(cat $scriptName.js)"<<EOF
UPDATE table SET scriptColumn=:'script' WHERE nameColumn='$scriptName'
EOF
Not sure how this differs from
psql -U postgres db <<EOF
\set script "$(cat $scriptName.js)"
UPDATE table SET scriptColumn=:'script' WHERE nameColumn='$scriptName'
EOF
which I tried previously and returns the following error:
unterminated quoted string
ERROR: syntax error at or near "//"
LINE 1: // dummy text blahblah
Thanks to everybody who helped!
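A possible explanation, sketched below: with -v the shell hands the whole file content to psql as a single command-line argument, newlines included, so psql never has to parse it. Inside the heredoc, the shell pastes the file content into the text first, and a psql backslash command such as \set ends at the end of its line. For a multi-line file only the first line lands on the \set line, which would match the "unterminated quoted string" error. The snippet uses a temporary file and the helper count_args, both made up for this illustration, and does not call psql.

```shell
count_args() { echo "$#"; }

# Hypothetical two-line script file standing in for $scriptName.js
js_file=$(mktemp)
printf '%s\n' '// dummy text blahblah' 'function f() { return 1; }' > "$js_file"

# As with psql -v: "-v" plus one argument holding the whole file
n_args=$(count_args -v script="$(cat "$js_file")")

# As with \set in the heredoc: the content is spliced into the text
heredoc_text=$(cat <<EOF
\set script "$(cat "$js_file")"
UPDATE table SET scriptColumn=:'script';
EOF
)
# Only this first line belongs to the \set command
first_line=$(printf '%s\n' "$heredoc_text" | head -n 1)

printf 'arguments: %s\n' "$n_args"
printf 'first line: %s\n' "$first_line"
rm -f "$js_file"
```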

Pre-pending and appending to a shell variable

My goal is to load an external tables log file into a CLOB column in an oracle database. I've been having issues with the max size you can insert at once but I am able to insert the whole file if I to_clob each line of the log file, concatenate and then insert them (as far as I'm aware this seems to be the quickest and easiest way?):
insert into clob_insert_test values (to_clob('hfsdjhfjsdhfjksd')||chr(10)||to_clob('jhfklsdjfklsdjklfjdsjlk'));
My question is:
I'm reading the file into a shell variable as below so what I need to do is pre-pend to_clob(' to the beginning of each line of the variable and then append ')||chr(10)|| and remove the last ||chr(10)|| from the variable to finish. I can then use that variable in the SQL insert statement for the clob column. Is there a way I can directly do this on the variable rather than modifying the log file before reading it in?
log_content=$(<"$log_file")
Edit:
Sorry I don't think I was clear. Given the example log file I would expect the following variable contents.
Input file:
LOG file opened at 05/05/15 15:12:24
Field Definitions for table ext_loading
Record format DELIMITED BY NEWLINE
Variable contents:
to_clob('LOG file opened at 05/05/15 15:12:24')||chr(10)||to_clob('Field Definitions for table ext_loading')||chr(10)||to_clob('Record format DELIMITED BY NEWLINE')
I assume you have a file like:
this is me||chr(10)||adfasdf
asdas||chr(10)||asdfasdfasdas
And you want it to become something like:
to_clob('this is meadfasdf')||chr(10)||
to_clob('asdasasdfasdfasdas')||chr(10)||
If so, you can use sed like this:
sed -e "s/||chr(10)||//" -e "s/^/to_clob('/" -e "s/$/')||chr(10)||/" file
That is:
remove ||chr(10)|| once from each line.
add to_clob(' to the beginning of each line.
add ')||chr(10)|| to the end of each line.
And to store it in a variable:
log_content=$(sed -e "s/||chr(10)||//" -e "s/^/to_clob('/" -e "s/$/')||chr(10)||/" "$log_file")
Update
To match what you really need, you can also do this:
line=$(sed -e "/./s/^/to_clob('/" -e "/./s/$/')||chr(10)||/" "$log_file")
Then the output is:
$ echo $line # note, without quotes to have all of it together!
to_clob('LOG file opened at 05/05/15 15:12:24')||chr(10)|| to_clob('Field Definitions for table ext_loading')||chr(10)|| to_clob('Record format DELIMITED BY NEWLINE')||chr(10)||
And remove the last ||chr(10)|| with:
$ echo $line | sed 's/||chr(10)||$//'
to_clob('LOG file opened at 05/05/15 15:12:24')||chr(10)|| to_clob('Field Definitions for table ext_loading')||chr(10)|| to_clob('Record format DELIMITED BY NEWLINE')
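As an alternative sketch, the wrapping and the removal of the trailing separator can be done in one pass with awk by emitting the separator before every line except the first. Two things here go beyond the sed version and are assumptions of this example: empty lines are skipped entirely, and single quotes inside a log line are doubled so they do not end the SQL string early. build_clob_expr is a hypothetical helper name.

```shell
build_clob_expr() {
    # $1 is the log file; q holds a single quote so the awk program
    # itself does not need to contain one
    awk -v q="'" '
        $0 == "" { next }   # skip empty lines
        {
            gsub(q, q q)    # double every single quote in the line
            printf "%sto_clob(%s%s%s)", sep, q, $0, q
            sep = "||chr(10)||"   # separator goes before all later lines
        }
        END { print "" }
    ' "$1"
}

# Usage: log_content=$(build_clob_expr "$log_file")
```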

rrdtool xport filename with spaces

I'm trying to call the rrdtool xport command on an arbitrary number of files, so I'm writing a script that reads in the rrd file names and builds the DEF argument. The problem is that some of the rrd file names contain whitespace, e.g. "foo bar.rrd" (-_-), and when the DEF argument is generated it looks something like this:
DEF:a=foo bar.rrd:sum:AVERAGE
When this is passed to the rrdtool command, it generates an error saying "problems reading database name". I have also tried inserting the escape character ("\") before the whitespace so it looks like "foo\ bar.rrd", but when this is run in bash it still produces the same error, whereas when I echo the command, copy and paste it at the prompt and run it, it works fine...
Just put quotes around the whole thing
"DEF:a=foo bar.rrd:sum:AVERAGE"
rrdtool should be fine with the spaces.
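A small sketch of why the quotes matter and why a backslash stored in the variable does not help: the shell splits an unquoted expansion on whitespace after substituting it, and it does not re-read backslashes that come from a variable's value. The helper count_args is made up for this illustration and simply reports how many arguments a command would receive.

```shell
count_args() { echo "$#"; }

rrd="foo bar.rrd"
escaped='foo\ bar.rrd'

# Unquoted: split at the space into two arguments
unquoted=$(count_args DEF:a=$rrd:sum:AVERAGE)
# Backslash inside the value: still split, the backslash stays literal
backslashed=$(count_args DEF:a=$escaped:sum:AVERAGE)
# Quoted: one argument, space preserved
quoted=$(count_args "DEF:a=$rrd:sum:AVERAGE")

echo "unquoted=$unquoted backslashed=$backslashed quoted=$quoted"
# prints: unquoted=2 backslashed=2 quoted=1
```

Pasting the echoed command at the prompt works because the shell then parses the backslash as part of the command line rather than as data.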

bash script to update postgres database

I have some HTML data stored in text files right now. I recently decided to store the HTML data in the pgsql database instead of flat files. Right now, the 'entries' table contains a 'path' column that points to the file. I have added a 'content' column that should now store the data in the file pointed to by 'path'. Once that is complete, the 'path' column will be deleted. The problem that I am having is that the files contain apostrophes that throw my script out of whack. What can I do to correct this issue?
Here is the script
#!/bin/sh
dbname="myDB"
username="username"
fileroot="/path/to/the/files/*"
for f in $fileroot
do
psql $dbname $username -c "
UPDATE entries
SET content='`cat $f`'
WHERE id=SELECT id FROM entries WHERE path LIKE '*`$f`';"
done
Note: The logic in the id=SELECT...FROM...WHERE path LIKE "" is not the issue. I have tested this with sample filenames in the pgsql environment.
The problem is that when I cat $f, any apostrophe in the contents of $f closes the SQL string, and I get a syntax error.
For the single quote escaping issue, a reasonable workaround might be to double the quotes, so you'd use:
`sed "s/'/''/g" < "$f"`
to include the file contents instead of the cat, and for the second invocation in the LIKE where you appeared to intend to use the file name use:
${f//"'"/"''"}
to include the literal string content of $f instead of executing it, with the quotes doubled. The ${varname//match/replace} expansion (the double slash replaces every match, not just the first) is bash syntax and may not work in all shells; use:
`echo "$f" | sed "s/'/''/g"`
if you need to worry about other shells.
There are a bunch of other problems in that SQL.
You're trying to execute $f in your second invocation. I'm pretty sure you didn't intend that; I imagine you meant to include the literal string.
Your subquery is also wrong, it lacks parentheses; (SELECT ...) not just SELECT.
Your LIKE expression is also probably not doing what you intended; you probably meant % instead of *, since % is the SQL wildcard.
If I also change backticks to $() (because it's clearer and easier to read IMO), fix the subquery syntax, add an alias to disambiguate the columns, and pass the SQL to psql's stdin in a here-document instead of using -c, the result is:
psql $dbname $username <<__END__
UPDATE entries
SET content='$(sed "s/'/''/g" < "$f")'
WHERE id=(SELECT e.id FROM entries e WHERE e.path LIKE '%$(echo "$f" | sed "s/'/''/g")');
__END__
The above assumes you're using a reasonably modern PostgreSQL with standard_conforming_strings = on. If you aren't, change the regexp to escape apostrophes with \ instead of doubling them, and prefix the string with E, so O'Brien becomes E'O\'Brien'. In modern PostgreSQL it'd instead become 'O''Brien'.
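The quote-doubling step can be wrapped in a small helper. This is only a sketch, assuming standard_conforming_strings = on; sql_quote is a hypothetical name introduced here, not something psql provides.

```shell
sql_quote() {
    # Double every single quote in $1 and wrap the result in single quotes
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

sql_quote "O'Brien"
echo
# prints: 'O''Brien'
```

Because the helper goes through command substitution, trailing newlines in the value are dropped, which is usually harmless for file names but worth knowing for file contents.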
In general, I'd recommend using a real scripting language like Perl with DBD::Pg or Python with psycopg to solve scripting problems with databases. Working with the shell is a bit funky. This expression would be much easier to write with a database interface that supported parameterised statements.
For example, I'd write this as follows:
import sys
import psycopg2


def load_file(connstr, filename):
    # Read the file as text so it is bound as a string parameter
    with open(filename) as f:
        content = f.read()
    conn = psycopg2.connect(connstr)
    curs = conn.cursor()
    # Parameters are bound in order: content first, then filename
    curs.execute("""
        UPDATE entries
        SET content = %s
        WHERE id = (SELECT e.id FROM entries e WHERE e.path LIKE '%%'||%s);
    """, (content, filename))
    conn.commit()  # psycopg2 does not autocommit by default
    curs.close()
    conn.close()


if __name__ == '__main__':
    try:
        connstr = sys.argv[1]
        filename = sys.argv[2]
    except IndexError:
        print("Usage: %s connect_string filename" % sys.argv[0])
        print("Eg: %s \"dbname=test user=fred\" \"some_file\"" % sys.argv[0])
        sys.exit(1)
    load_file(connstr, filename)
Note that the SQL wildcard % is doubled so that it results in a single % in the final SQL. That's because psycopg2 uses % for its parameter placeholders, so a literal % must be doubled to escape it.
You can trivially modify the above script to accept a list of file names, connect to the database once, and loop over the list of all file names. That'll be a lot faster, especially if you do it all in one transaction. It's a real pain to do that with psql scripting; you have to use bash co-process as shown here ... and it isn't worth the hassle.
In the original post, I made it sound like there were apostrophes in the filename represented by $f. This was NOT the case, so a simple echo "$f" was able to fix my issue.
To make it clearer, the contents of my files were formatted as HTML snippets, typically something like <p>Blah blah <b>blah</b>...</p>. After trying the solution posted by Craig, I realized I had used single quotes in some anchor tags, and I did NOT want to change those to something else. There were only a few files where this violation occurred, so I just changed these to double quotes by hand. I also realized that instead of escaping the apostrophes, it would be better to convert them to &apos;. Here is the final script that I ended up using:
dbname="myDB"
username="username"
fileroot="/path/to/files/*"
for f in $fileroot
do
psql $dbname $username << __END__
UPDATE entries
SET content='$(sed "s/'/\&apos;/g" < "$f")'
WHERE id=(SELECT e.id FROM entries e WHERE path LIKE '%$(echo "$f")');
__END__
done
The format coloring on here might make it look like the syntax is incorrect, but I have verified that it is correct as posted.
