I wrote a single-line shell script to import a .csv file into an sqlite3 database table.
echo -e '.separator "," \n.import testing.csv aj_test' | sqlite3 ajtest.db
sqlite3 database = ajtest.db
sqlite3 table in ajtest.db = aj_test
testing.csv has 3 columns: the first is an integer and the other two are text, so the structure of aj_test matches accordingly:
sqlite> .schema aj_test
CREATE TABLE aj_test(number integer not null,
first_name varchar(20) not null,
last_name varchar(20) not null);
When the script is run, it does not show any error, but it also does not import any data. Any pointers as to what I have missed?
After much study and discussion, I found an answer that works properly:
echo -e ".separator ","\n.import /home/aj/ora_exported.csv qt_exported2" | sqlite3 testdatabase.db
The main thing is that I needed to include the full path of the .csv file in the import statement.
I found this to work:
(echo .separator ,; echo .import path/to/file.csv table_name) | sqlite3 filename.db
The accepted answer fails to work for me.
Why don't you take advantage of SQLite's built-in command-line options to load a CSV file into an SQLite database table? I assume you are writing a bash shell script to load CSV data into an SQLite table.
Have a look at the single-line bash script below:
#!/bin/bash
sqlite3 -separator "," -cmd ".import /path/to/test.csv aj_test" ajtest.db
With my limited knowledge, I can't give you an example that automatically exits the sqlite CLI once the load into the db is done!
This worked best for my needs because it is straightforward (unlike the echo solutions) and it doesn't leave the sqlite shell open.
#!/bin/bash
sqlite3 -separator ',' stuff.db ".import myfile.csv t_table_name"
I have a loop in bash that sets some pathnames as variables.
Within that loop I want to perform some sqlite commands based on these variables.
for example:
sqlitedb="/Users/Documents/database.db"
for mm in 01 02 03; do
filename1="A:/data1-${mm}.csv"
filename2="D:/data2-${mm}.csv"
sqlite3 "$sqlitedb" #create new file, it is a temporary file. no problem with this command.
.mode csv
.import "$filename1" data_1 #try to import the first data file. This command doesn't work
.import "$filename2" data_2 #try to import the second data file. This command doesn't work
# now do some important sql stuff which joins these files.
.quit
rm -f "$sqlitedb" #remove old file, ready for the next loop
done
Clearly, sqlite3 doesn't know about my bash variables. What is the best way to set variables, loop through files, etc. within sqlite3?
If it helps, I'm using WSL Ubuntu 18.04.
You need a heredoc, as mentioned in comments:
for mm in 01 02 03; do
filename1="A:/data1-${mm}.csv"
filename2="D:/data2-${mm}.csv"
sqlite3 -batch -csv <<EOF
.import "$filename1" data_1
.import "$filename2" data_2
-- Do stuff with the tables
EOF
done
(If you leave off a filename, sqlite3 uses an in-memory database, so you don't need a manual temporary database file unless you have a lot of data to store.)
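If you do want the imported rows to land in a file-backed database instead, a minimal variant of the same heredoc (reusing the $sqlitedb, $filename1 and $filename2 variables from the question) would be:
sqlite3 -batch -csv "$sqlitedb" <<EOF
.import "$filename1" data_1
.import "$filename2" data_2
EOF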
In our environment we do not have access to the Hive metastore to query it directly.
I have a requirement to dynamically generate tablename, columnname pairs for a set of tables.
I was trying to achieve this by running "describe extended $tablename" into a file for all tables and picking up the tablename and columnname pairs from the file.
Is there an easier way to do this?
The desired output is like
table1|col1
table1|col2
table1|col3
table2|col1
table2|col2
table3|col1
This script will print the columns in the desired format for a single table. awk parses the output of the describe command, takes only the column name, prefixes it with the table_name variable and "|", and prints the resulting strings with \n as a delimiter between them.
#!/bin/bash
#Set table name here
TABLE_NAME=your_schema.your_table
TABLE_COLUMNS=$(hive -S -e "set hive.cli.print.header=false; describe ${TABLE_NAME};" | awk -v table_name="${TABLE_NAME}" -F " " 'f&&!NF{exit}{f=1}f{printf c table_name "|" toupper($1)}{c="\n"}')
#Print the table|COLUMN pairs
echo "$TABLE_COLUMNS"
You can easily modify it to generate output for all tables, for example by looping over the output of the show tables command, as sketched below.
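A rough, untested sketch of that extension (it reuses the awk command above and assumes the hive CLI is available; your_schema is a placeholder):
#!/bin/bash
SCHEMA=your_schema
for TABLE_NAME in $(hive -S -e "use ${SCHEMA}; show tables;"); do
    #Same parsing as above, one table per iteration
    hive -S -e "set hive.cli.print.header=false; describe ${SCHEMA}.${TABLE_NAME};" \
      | awk -v table_name="${TABLE_NAME}" -F " " 'f&&!NF{exit}{f=1}f{printf c table_name "|" toupper($1)}{c="\n"}'
    #The awk output has no trailing newline, so add one between tables
    echo ""
done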
The easier way is to access the metastore database directly.
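For instance, a heavily hedged sketch of such a query, assuming a MySQL-backed metastore with the standard DBS/TBLS/SDS/COLUMNS_V2 tables (credentials, metastore db name and schema name are placeholders):
#!/bin/bash
mysql -N -u metastore_user -p -D hive_metastore -e "
SELECT CONCAT(t.TBL_NAME, '|', c.COLUMN_NAME)
FROM DBS d
JOIN TBLS t       ON t.DB_ID = d.DB_ID
JOIN SDS s        ON s.SD_ID = t.SD_ID
JOIN COLUMNS_V2 c ON c.CD_ID = s.CD_ID
WHERE d.NAME = 'your_schema'
ORDER BY t.TBL_NAME, c.INTEGER_IDX;"
(That only helps if you are eventually granted read access to the metastore database, of course.)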
In my Perl script I'm fetching records from an Oracle DB: the column SQL_STATEMENT of table SQL_STATUS_INFO, which has data type LONG.
$sql_statement = "select SQL_STATEMENT";
$sql_statement .= " from SQL_STATUS_INFO";
my $sql_return = SQL_Exec($sql_statement);
my $DbRtn_DB = SQL_Fetch();
foreach my $fromdb (@{$DbRtn_DB}) {
printf "SQL_STATEMENT :@{$fromdb}[0]\n";
}
SQL_Exec and SQL_Fetch come from a Perl module for executing the statements and fetching the results.
In the table SQL_STATUS_INFO, I have the data below in the column SQL_STATEMENT.
CREATE TABLE "ECLIPSFILE_4G_DUMMY_TRNX" (ENBID VARCHAR2(30 CHAR),MCC NUMBER(38),MNC NUMBER,MNCLENGTH NUMBER,CELL_IDNUMBER(38),EARFCNDL NUMBER,PHYSICALID)
While fetching it through the above script, I was not able to get the entire statement into $DbRtn_DB.
The output I get when I run the above script is:
CREATE TABLEECLIPSFILE_4G_DUMMY_TRNX (ENBID VARCHAR2(30 CHAR),MCC NUMBER(38),MNC
So I am not able to get the entire statement.
But I am able to insert the long text into the db. I tried the same SQL statement in SQL Developer and there it gives me the entire string as output, but not through the Perl script.
How can I get the entire string into my Perl variable?
Do you have the same errors when using DBI directly?
In any case, without seeing the code of SQL_Exec and SQL_Fetch it is hard to debug your problem.
The result you quote is specifically 80 characters long, which may not be a coincidence.
I am a beginner in Hadoop/Hive. I did some research to find a way to export the results of a HiveQL query to CSV.
I am running the command line below in PuTTY:
Hive -e ‘use smartsourcing_analytics_prod; select * from solution_archive_data limit 10;’ > /home/temp.csv;
However, below is the error I am getting:
ParseException line 1:0 cannot recognize input near 'Hive' '-' 'e'
I would appreciate inputs regarding this.
Run your command from outside the hive shell, i.e. just from the Linux shell.
Run it with 'hive' instead of 'Hive'.
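For example (a sketch reusing the database and table names from the question, with plain single quotes):
hive -e 'use smartsourcing_analytics_prod; select * from solution_archive_data limit 10;' > /home/temp.csv
This fixes the ParseException; note the columns will still be tab-separated rather than comma-separated.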
Just redirecting your output into a csv file won't work. You can do:
hive -e 'YOUR QUERY HERE' | sed 's/[\t]/,/g' > sample.csv
as suggested here: How to export a Hive table into a CSV file?
AkashNegi's answer will also work for you... a bit longer though.
One way I do such things is to create an external table with the schema you want. Then do INSERT INTO TABLE target_table ... Look at the example below:
CREATE EXTERNAL TABLE isvaliddomainoutput (email_domain STRING, `count` BIGINT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ","
STORED AS TEXTFILE
LOCATION "/user/cloudera/am/member_email/isvaliddomain";
INSERT INTO TABLE isvaliddomainoutput
SELECT * FROM member_email WHERE isvalid = 1;
Now go to "/user/cloudera/am/member_email/isvaliddomain" and find your data.
Hope this helps.
I'm trying to write a script which lists a directory and creates an SQL script to insert these directories. The problem is I only want to insert new directories. Here is what I have so far:
#If file doesn't exist add the search path test
if [ ! -e /home/aydin/movies.sql ]
then
echo "SET SEARCH_PATH TO noti_test;" >> /home/aydin/movies.sql;
fi
cd /media/htpc/
for i in *
do
#for each directory escape any single quotes
movie=$(echo $i | sed "s:':\\\':g" )
#build sql insert string
insertString="INSERT INTO movies (movie) VALUES (E'$movie');";
#if sql string exists in file already
if grep -Fxq "$insertString" /home/aydin/movies.sql
then
#comment out string
sed -i "s/$insertString/--$insertString/g" /home/aydin/movies.sql
else
#add sql string
echo $insertString >> /home/aydin/movies.sql;
fi
done;
#execute script
psql -U "aydin.hassan" -d "aydin_1.0" -f /home/aydin/movies.sql;
It seems to work apart from one thing: the script doesn't recognise entries with single quotes in them, so upon running the script again with no new dirs, this is what the file looks like:
--INSERT INTO movies (movie) VALUES (E'007, Moonraker (1979)');
--INSERT INTO movies (movie) VALUES (E'007, Octopussy (1983)');
INSERT INTO movies (movie) VALUES (E'007, On Her Majesty\'s Secret Service (1969)');
I'm also open to suggestions on a better way to do this; my process seems pretty long-winded and inefficient :)
The script looks generally good to me. Consider the revised version (untested):
#! /bin/bash
#If file doesn't exist add the search path test
if [ ! -e /home/aydin/movies.sql ]
then
echo 'SET search_path=noti_test;' > /home/aydin/movies.sql;
fi
cd /media/htpc/
for i in *
do
#build sql insert string - single quotes work fine inside dollar-quoting
insertString="INSERT INTO movies (movie) SELECT \$x\$$movie\$x\$
WHERE NOT EXISTS (SELECT 1 FROM movies WHERE movie = \$x\$$movie\$x\$);"
#no need for grep. SQL is self-contained.
echo $insertString >> /home/aydin/movies.sql
done
#execute script
psql -U "aydin.hassan" -d "aydin_1.0" -f /home/aydin/movies.sql;
To start a new file, use > instead of >>
Use single quotes ' for string constants without variables to expand
Use PostgreSQL dollar-quoting so you don't have to worry about single quotes in the strings. You'll have to escape the $ character to remove its special meaning in the shell.
Use an "impossible" string for the dollar-quote tag, so it cannot appear in the data. If you can't be sure of one, you can test for the quote string and alter it in the unlikely case it is matched, to be absolutely sure.
Use SELECT .. WHERE NOT EXISTS for the INSERT to automatically prevent already existing entries from being re-inserted. This prevents duplicate entries in the table completely, not just among the new entries.
An index on movies.movie (possibly, but not necessarily UNIQUE) would speed up the INSERTs.
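For example, a possible one-off statement for that index (the index name is made up; user, database, table and column come from the script above):
psql -U "aydin.hassan" -d "aydin_1.0" -c 'CREATE INDEX movies_movie_idx ON movies (movie);'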
Why bother with grep and sed and not just let the database detect duplicates?
Add a unique index on movie, create a new (temporary) insert script on each run, and then execute it with autocommit (the default) or with the -v ON_ERROR_ROLLBACK=1 option of psql. To get a full insert script of your movie database, dump it with the --column-inserts option of pg_dump.
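A rough sketch of that approach (the index name and the temporary script path are made up for illustration; user and database names come from the question):
# one-time setup: let the database itself reject duplicates
psql -U "aydin.hassan" -d "aydin_1.0" -c 'CREATE UNIQUE INDEX movies_movie_uniq ON movies (movie);'
# each run: regenerate the insert script, then execute it; with autocommit a
# duplicate INSERT fails on its own and the remaining statements still run
# (ON_ERROR_ROLLBACK covers the case where the script opens a transaction)
psql -U "aydin.hassan" -d "aydin_1.0" -v ON_ERROR_ROLLBACK=1 -f /tmp/movies_new.sql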
Hope this helps.
There's a utility daemon called incron, which will fire your script whenever a file is written in a watched directory. It uses kernel inotify events, no loops; Linux only.
In its config (full file path):
/media/htpc IN_CLOSE_WRITE /home/aydin/added.sh $@/$#
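That line goes into the user's incron table, which (assuming a standard incron install) is edited and checked with:
incrontab -e
incrontab -l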
Then the simplest added.sh script, without any parameter checks:
#!/bin/bash
cat <<-EOsql | psql -U "aydin.hassan" -d "aydin_1.0"
INSERT INTO movies (movie) VALUES (E'$1');
EOsql
You can have thousands of files in one directory without the issues you can face with your original script.