I have a loop in bash that sets some pathnames as variables.
Within that loop I want to perform some sqlite commands based on these variables.
For example:
sqlitedb="/Users/Documents/database.db"
for mm in 01 02 03; do
filename1="A:/data1-${mm}.csv"
filename2="D:/data2-${mm}.csv"
sqlite3 "$sqlitedb" #create new file, it is a temporary file. no problem with this command.
.mode csv
.import "$filename1" data_1 #try to import the first data file. This command doesn't work
.import "$filename2" data_2 #try to import the second data file. This command doesn't work
# now do some important sql stuff which joins these files.
.quit
rm -f "$sqlitedb" #remove old file, ready for the next loop
done
Clearly, SQLite doesn't know about my Bash variables. What is the best way to set variables, loop through files, etc. within sqlite3?
If it helps, I'm using Ubuntu 18.04 on WSL.
You need a heredoc, as mentioned in comments:
for mm in 01 02 03; do
filename1="A:/data1-${mm}.csv"
filename2="D:/data2-${mm}.csv"
sqlite3 -batch -csv <<EOF
.import "$filename1" data_1
.import "$filename2" data_2
-- Do stuff with the tables
EOF
done
(If you leave off a filename, SQLite uses an in-memory database, so you don't need a manual temporary database file unless you have a lot of data to store.)
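If the CSV files are too large to hold in memory, a sketch of the same loop with a file-backed scratch database, reusing the paths from the question and removing the scratch file at the end of each iteration as in the original:
sqlitedb="/Users/Documents/database.db"
for mm in 01 02 03; do
  filename1="A:/data1-${mm}.csv"
  filename2="D:/data2-${mm}.csv"
  sqlite3 -batch -csv "$sqlitedb" <<EOF
.import "$filename1" data_1
.import "$filename2" data_2
-- Do stuff with the tables
EOF
  rm -f "$sqlitedb"
done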
I want to import a txt file (semicolon separated) into a sqlite3 database. Unfortunately there are a lot of spaces in the file that I have to remove first. So I used something like this:
sqlite3 -csv sqliteout.db ".mode list" ".separator ;" ".import '|./normalize.awk infile.csv' importtable"
The normalize.awk removes all the spaces and prints everything again. The pipe makes .import read from the command's output instead of from a file. That works fine so far. Since I want to use this in a shell script, I would like to replace "infile.csv" with a variable, but I can't find a way to do this because the variable is not evaluated inside the single quotes. So, this is not working:
infile="infile.csv"
sqlite3 -csv sqliteout.db ".mode list" ".separator ;" ".import '|./normalize.awk ${infile}' importtable"
Am I somehow not seeing the solution/problem? Can anybody help?
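In the spirit of the heredoc answer above, one way to get ${infile} expanded might be to feed the dot commands to sqlite3 on standard input instead of as arguments. A minimal sketch, assuming your sqlite3 accepts the '|command' form of .import used in the original command:
infile="infile.csv"
sqlite3 sqliteout.db <<EOF
.mode list
.separator ";"
.import '|./normalize.awk ${infile}' importtable
EOF
With an unquoted EOF delimiter, bash expands ${infile} before sqlite3 ever sees the script.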
I'm trying to source a variable list which is populated into one single variable in bash.
I then want to source this single variable so that its contents (which are other variables) are available to the script.
I want to achieve this without having to spool the sqlplus output to a file and then source that file (that approach already works, as I have tried it).
Please find below what I'm trying:
#!/bin/bash
var_list=$(sqlplus -S /#mydatabase << EOF
set pagesize 0
set trimspool on
set headsep off
set echo off
set feedback off
set linesize 1000
set verify off
set termout off
select varlist from table;
EOF
)
#This already works when I echo any variable from the list
#echo "$var_list" > var_list.dat
#. var_list.dat
#echo "$var1, $var2, $var3"
#I'm trying to achieve the following
. $(echo "var_list")
echo "$any_variable_from_var_list"
The contents of var_list from the database are as follows:
var1="Test1"
var2="Test2"
var3="Test3"
I also tried sourcing it other ways such as:
. <<< $(echo "$var_list")
. $(cat "$var_list")
I'm not sure if I need to read in each line using a while loop instead.
Any advice is appreciated.
You can:
. /dev/stdin <<<"$var_list"
<<< is a here-string. It redirects the data after <<< to standard input.
/dev/stdin represents standard input, so reading from file descriptor 0 is like opening /dev/stdin and calling read() on the resulting file descriptor.
Because the source command needs a filename, we pass it /dev/stdin and redirect the data we want it to read to standard input. That way source reads the commands from standard input while thinking it's reading from a file.
Using /dev/stdin for tools that expect a file is quite common. I have no idea what references to give, so I'll link: the bash manual on here-strings, POSIX 7 Base Definitions 2.1.1p4 last bullet point, the Linux kernel documentation on /dev/ directory entries, the bash manual on shell builtins, maybe C99 7.19.3p7.
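For example, with the var_list captured from sqlplus in the question, a minimal usage sketch:
. /dev/stdin <<<"$var_list"
echo "$var1, $var2, $var3"   # Test1, Test2, Test3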
I needed a way to store dotenv values in files locally and in vars for DevOps pipelines, so I could then source them into the runtime environment on demand (from a file when available and from vars when not). More than that, I needed to store different dotenv sets in different vars and use them based on the source branch (which I load into $ENV in .gitlab-ci.yml via export ENV=${CI_COMMIT_BRANCH:=develop}). With this I'll have developEnv, qaEnv, and productionEnv, each being a var containing its appropriate dotenv contents (being redundant to be clear).
unset FOO; # Clear so we can confirm loading after
ENV=develop; #
developEnv="VERSION=1.2.3
FOO=bar"; # Creating a simple dotenv in a var, with linebreak (as the files passed through will have)
envVarName=${ENV}Env # Our dynamic var-name
source <(cat <<< "${!envVarName}") # Using the dynamic name,
echo $FOO;
# bar
I wrote a single-line shell script to import a .csv file into a sqlite3 database table.
echo -e '.separator "," \n.import testing.csv aj_test' | sqlite3 ajtest.db
sqlite3 database = ajtest.db
sqlite3 table in ajtest.db = aj_test
testing.csv has 3 columns; the first one is an integer and the other two are text, so the structure of aj_test is accordingly:
sqlite> .schema aj_test
CREATE TABLE aj_test(number integer not null,
first_name varchar(20) not null,
last_name varchar(20) not null);
When the script is run, it does not show any error, but it also does not import any data. Any guidelines as to what I have missed?
After much study and discussion, I found an answer that works properly:
echo -e ".separator ","\n.import /home/aj/ora_exported.csv qt_exported2" | sqlite3 testdatabase.db
The main thing is that I needed to include the full path of the .csv file in the import statement.
I found this to work:
(echo .separator ,; echo .import path/to/file.csv table_name) | sqlite3 filename.db
The accepted answer fails to work for me.
Why don't you take advantage of SQLite's built-in command-line options to load a CSV file into a SQLite database table? I assume you are writing a bash shell script to load the CSV data into a SQLite table.
Have a look at the single-line bash script below:
#!/bin/bash
sqlite3 -separator "," -cmd ".import /path/to/test.csv aj_test" ajtest.db
With my limited knowledge, I can't give you an example of how to automatically exit the sqlite3 CLI once the load into the db is done!
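One possible workaround, an assumption on my part rather than part of this answer: sqlite3 keeps reading commands from standard input after running the -cmd options, so redirecting stdin from /dev/null makes it hit end-of-file and exit immediately:
#!/bin/bash
sqlite3 -separator "," -cmd ".import /path/to/test.csv aj_test" ajtest.db < /dev/null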
This worked best for my needs because it is straightforward (unlike the echo solutions) and it doesn't leave the sqlite shell open.
#!/bin/bash
sqlite3 -separator ',' stuff.db ".import myfile.csv t_table_name"
I have some HTML data stored in text files right now. I recently decided to store the HTML data in the pgsql database instead of flat files. Right now, the 'entries' table contains a 'path' column that points to the file. I have added a 'content' column that should now store the data in the file pointed to by 'path'. Once that is complete, the 'path' column will be deleted. The problem that I am having is that the files contain apostrophes, which throw my script out of whack. What can I do to correct this issue?
Here is the script
#!/bin/sh
dbname="myDB"
username="username"
fileroot="/path/to/the/files/*"
for f in $fileroot
do
psql $dbname $username -c "
UPDATE entries
SET content='`cat $f`'
WHERE id=SELECT id FROM entries WHERE path LIKE '*`$f`';"
done
Note: The logic in the id=SELECT ... FROM ... WHERE path LIKE ... is not the issue. I have tested this with sample filenames in the pgsql environment.
The problem is that when I cat $f, any apostrophe in the contents of $f (edit: the contents, not the filename) closes the SQL string, and I get a syntax error.
For the single quote escaping issue, a reasonable workaround might be to double the quotes, so you'd use:
`sed "s/'/''/g" < "$f"`
to include the file contents instead of the cat, and for the second invocation, in the LIKE, where you appeared to intend to use the file name, use:
${f//"'"/"''"}
to include the literal string content of $f instead of executing it, and to double the quotes. The ${varname//match/replace} expression is bash syntax and may not work in all shells; use:
`echo "$f" | sed "s/'/''/g"`
if you need to worry about other shells.
There are a bunch of other problems in that SQL.
You're trying to execute $f in your second invocation, because the backticks around it run it as a command. I'm pretty sure you didn't intend that; I imagine you meant to include the literal string.
Your subquery is also wrong, it lacks parentheses; (SELECT ...) not just SELECT.
Your LIKE expression is also probably not doing what you intended; you probably meant % instead of *, since % is the SQL wildcard.
If I also change the backticks to $() (because it's clearer and easier to read, IMO), fix the subquery syntax, add an alias to disambiguate the columns, and pass a here-document to psql's stdin instead, the result is:
psql $dbname $username <<__END__
UPDATE entries
SET content='$(sed "s/'/''/g" < "$f")'
WHERE id=(SELECT e.id FROM entries e WHERE e.path LIKE '$(echo "$f" | sed "s/'/''/g")');
__END__
The above assumes you're using a reasonably modern PostgreSQL with standard_conforming_strings = on. If you aren't, change the regexp to escape apostrophes with \ instead of doubling them, and prefix the string with E, so O'Brien becomes E'O\'Brien'. In modern PostgreSQL it'd instead become 'O''Brien'.
In general, I'd recommend using a real scripting language like Perl with DBD::Pg or Python with psycopg to solve scripting problems with databases. Working with the shell is a bit funky. This expression would be much easier to write with a database interface that supported parameterised statements.
For example, I'd write this as follows:
import sys
import psycopg2

try:
    connstr = sys.argv[1]
    filename = sys.argv[2]
except IndexError:
    print("Usage: %s connect_string filename" % sys.argv[0])
    print("Eg: %s \"dbname=test user=fred\" \"some_file\"" % sys.argv[0])
    sys.exit(1)

def load_file(connstr, filename):
    conn = psycopg2.connect(connstr)
    curs = conn.cursor()
    # The file contents bind to the first %s (content), the filename to the second (path).
    curs.execute("""
        UPDATE entries
        SET content = %s
        WHERE id = (SELECT e.id FROM entries e WHERE e.path LIKE '%%'||%s);
        """, (open(filename, "rb").read(), filename))
    curs.close()
    conn.commit()

if __name__ == '__main__':
    load_file(connstr, filename)
Note that the SQL wildcard % is doubled to escape it, so it results in a single % in the final SQL. That's because psycopg2 uses % as its placeholder marker, so a literal % must be doubled to escape it.
You can trivially modify the above script to accept a list of file names, connect to the database once, and loop over all of them. That'll be a lot faster, especially if you do it all in one transaction. It's a real pain to do that with psql scripting; you have to use a bash co-process as shown here ... and it isn't worth the hassle.
In the original post, I made it sound like there were apostrophes in the filename represented by $f. This was NOT the case, so a simple echo "$f" was able to fix my issue.
To make it more clear, the contents of my files were formatted as HTML snippets, typically something like <p>Blah blah <b>blah</b>...</p>. After trying the solution posted by Craig, I realized I had used single quotes in some anchor tags, and I did NOT want to change those to something else. There were only a few files where this occurred, so I just changed those quotes to double quotes by hand. I also realized that instead of escaping the apostrophes, it would be better to convert them to &#39;. Here is the final script that I ended up using:
dbname="myDB"
username="username"
fileroot="/path/to/files/*"
for f in $fileroot
do
psql $dbname $username << __END__
UPDATE entries
SET content='$(sed "s/'/\&#39;/g" < "$f")'
WHERE id=(SELECT e.id FROM entries e WHERE path LIKE '%$(echo "$f")');
__END__
done
The format coloring on here might make it look like the syntax is incorrect, but I have verified that it is correct as posted.
I'm trying to write a script which lists a directory and creates an SQL script to insert these directories. The problem is I only want to insert new directories. Here is what I have so far:
#If file doesn't exist add the search path test
if [ ! -e /home/aydin/movies.sql ]
then
echo "SET SEARCH_PATH TO noti_test;" >> /home/aydin/movies.sql;
fi
cd /media/htpc/
for i in *
do
#for each directory escape any single quotes
movie=$(echo $i | sed "s:':\\\':g" )
#build sql insert string
insertString="INSERT INTO movies (movie) VALUES (E'$movie');";
#if sql string exists in file already
if grep -Fxq "$insertString" /home/aydin/movies.sql
then
#comment out string
sed -i "s/$insertString/--$insertString/g" /home/aydin/movies.sql
else
#add sql string
echo $insertString >> /home/aydin/movies.sql;
fi
done;
#execute script
psql -U "aydin.hassan" -d "aydin_1.0" -f /home/aydin/movies.sql;
It seems to work apart from one thing: the script doesn't recognise entries with single quotes in them, so upon running the script again with no new dirs, this is what the file looks like:
--INSERT INTO movies (movie) VALUES (E'007, Moonraker (1979)');
--INSERT INTO movies (movie) VALUES (E'007, Octopussy (1983)');
INSERT INTO movies (movie) VALUES (E'007, On Her Majesty\'s Secret Service (1969)');
I'm open to suggestions on a better way to do this also, my process seems pretty elongated and inefficient :)
Script looks generally good to me. Consider the revised version (untested):
#!/bin/bash

#If the file doesn't exist, add the search path test
if [ ! -e /home/aydin/movies.sql ]
then
    echo 'SET search_path=noti_test;' > /home/aydin/movies.sql
fi

cd /media/htpc/
for i in *
do
    #build sql insert string - single quotes work fine inside dollar-quoting
    insertString="INSERT INTO movies (movie) SELECT \$x\$$i\$x\$
WHERE NOT EXISTS (SELECT 1 FROM movies WHERE movie = \$x\$$i\$x\$);"

    #no need for grep. SQL is self-contained.
    echo "$insertString" >> /home/aydin/movies.sql
done

#execute script
psql -U "aydin.hassan" -d "aydin_1.0" -f /home/aydin/movies.sql
To start a new file, use > instead of >>
Use single quotes ' for string constants without variables to expand
Use PostgreSQL dollar-quoting so you don't have to worry about single quotes in the strings. You have to escape the $ characters so the shell doesn't treat them as special (see the sample generated line after these notes).
Use an "impossible" string for the dollar-quote tag, so it cannot appear in the data. To be absolutely sure, you could test whether the tag occurs in the string and alter it in that unlikely case.
Use SELECT .. WHERE NOT EXISTS for the INSERT to automatically prevent already existing entries to be re-inserted. This prevents duplicate entries in the table completely - not just among the new entries.
An index on movies.movie (possibly, but not necessarily UNIQUE) would speed up the INSERTs.
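For example, with the troublesome title from the question, the loop appends a line like this to movies.sql; the embedded apostrophe needs no escaping inside the $x$ ... $x$ quotes:
INSERT INTO movies (movie) SELECT $x$007, On Her Majesty's Secret Service (1969)$x$
WHERE NOT EXISTS (SELECT 1 FROM movies WHERE movie = $x$007, On Her Majesty's Secret Service (1969)$x$);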
Why bother with grep and sed and not just let the database detect duplicates?
Add a unique index on movie and create a new (temporary) insert script on each run, then execute it with autocommit (the default) or with the -v ON_ERROR_ROLLBACK=1 option of psql. To get a full insert script of your movie database, dump it with the --column-inserts option of pg_dump.
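A minimal sketch of that approach; the index name and the temporary script path are made up here:
# one-time: a unique index so the database itself rejects duplicate titles
psql -U "aydin.hassan" -d "aydin_1.0" -c "CREATE UNIQUE INDEX movies_movie_key ON movies (movie);"
# each run: regenerate the insert script, then load it; duplicate rows fail
# individually while the remaining inserts still go through
psql -U "aydin.hassan" -d "aydin_1.0" -v ON_ERROR_ROLLBACK=1 -f /home/aydin/movies_new.sql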
Hope this helps.
There's a utility daemon called incron which will fire your script whenever a file is written in the watched directory. It uses kernel inotify events, no loops - Linux only.
In its config (use full file paths):
/media/htpc IN_CLOSE_WRITE /home/aydin/added.sh $@/$#
Then the simplest added.sh script, without any param check:
#!/bin/bash
cat <<-EOsql | psql -U "aydin.hassan" -d "aydin_1.0"
INSERT INTO movies (movie) VALUES (E'$1');
EOsql
You can have thousands of files in one directory without the issues you can face with your original script.
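A hedged variant of added.sh, an assumption rather than part of the answer: it adds a basic param check and doubles apostrophes so titles like the Bond one above don't break the SQL string literal.
#!/bin/bash
# refuse to run without an argument
[ -n "$1" ] || { echo "usage: $0 /path/to/new/entry" >&2; exit 1; }
# keep only the entry name and double any apostrophes for SQL
movie=$(basename "$1" | sed "s/'/''/g")
cat <<EOsql | psql -U "aydin.hassan" -d "aydin_1.0"
INSERT INTO movies (movie) VALUES ('$movie');
EOsql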