I need to export data from a table containing around 3 million rows. The table has 9 columns and is in the following format:
Source | Num | User | Offer | Simul | Start_Date | End_Date | Label | Value
coms p | 0012| plin | synth | null | 04-JAN-15 | 31-JAN-15| page v | 8
However, when I open the csv file using Notepad++, only around 600,000 lines are displayed, and the display is as follows:
coms p ,12,plin ,synth , ,04/01/2015 00:00:00,04/01/2015 00:00:00,page v
8
As you can see, there are lots of spaces in some fields even though the table has none, the 0012 value of the Num field is displayed as 12, and the last field ends up on another line.
What's more, there is an empty line in the csv between two rows of the table.
Any idea how to make those useless spaces disappear, how to keep the whole row data on a single line in the csv, how to make the leading 00 appear for the Num field, and why only around 600,000 lines are displayed in Notepad++? I read that there is no row limit for csv files.
The sql I am using is below:
SET SQLFORMAT csv
SET HEAD OFF
spool /d:/applis/test/file.csv
select * from TEST;
spool off;
First, there are much easier ways to export a CSV if you're using SQL Developer or TOAD.
But for sql*plus, you can use set linesize 32000 to get all the columns to display on a single line, and set pagesize 0 will get rid of the initial CRLF. But it's displaying the columns in fixed-width format because that's how spool output works.
If you want to have variable-width columns, the most portable standardized way is to manually concatenate the columns yourself and not use select *.
set linesize 32000 -- print as much as possible on each line
set trimspool on -- don't pad lines with blank spaces
set pagesize 0 -- don't print blank lines between some rows
set termout off -- just print to spool file, not console (faster)
set echo off -- don't echo commands to output
set feedback on -- just for troubleshooting; will print the rowcount at the end of the file
spool /d:/applis/test/file.csv
select col1 || ',' || col2 || ',' || col3 from TEST;
spool off
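Applied to the table in the question, the concatenation might look something like the sketch below. The column names are taken from the question's layout; the TO_CHAR masks are assumptions ('FM0000' only restores the leading zeros on Num if Num is actually stored as a number, and the date mask should be whatever format you want in the file), and a column really named User would have to be a quoted identifier since USER is reserved in Oracle. If any field can contain a comma, it would also need to be wrapped in quotes.
select Source
    || ',' || to_char(Num, 'FM0000')            -- assumes Num is stored as a number
    || ',' || "User"                            -- quoted identifier; USER is reserved
    || ',' || Offer
    || ',' || Simul
    || ',' || to_char(Start_Date, 'DD/MM/YYYY')
    || ',' || to_char(End_Date, 'DD/MM/YYYY')
    || ',' || Label
    || ',' || Value
from TEST;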
It might be because of the column length in the database.
Suppose the Source column length in the database is 50; then it will take up 50 characters in the file.
Try TRIMOUT and TRIMSPOOL before the query:
SET TRIMOUT ON
SET TRIMSPOOL ON
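In the context of the question's script, those settings go before the spool; a minimal sketch (path and table name from the question):
SET TRIMOUT ON
SET TRIMSPOOL ON
SET HEAD OFF
spool /d:/applis/test/file.csv
select * from TEST;
spool off;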
This question branches off a question already asked.
I want to make a csv file with the db2 results including column names.
EXPORT TO ...
SELECT 1 as id, 'COL1', 'COL2', 'COL3' FROM sysibm.sysdummy1
UNION ALL
(SELECT 2 as id, COL1, COL2, COL3 FROM myTable)
ORDER BY id
While this does work, I am left with an unwanted column whose rows are all 1's and 2's.
Is there a way to do this via the db2 command or a full bash alternative without redundant columns while keeping the header at the top?
e.g.
Column 1 Column 2 Column 3
data 1 data 2 data3
... ... ...
instead of:
1 Column 1 Column 2 Column 3
2 data 1 data 2 data3
2 ... ... ...
All the answers I've seen use two separate export statements. The first generates the column headers:
db2 "EXPORT TO /tmp/header.csv of del
SELECT
SUBSTR(REPLACE(REPLACE(XMLSERIALIZE(CONTENT XMLAGG(XMLELEMENT(NAME c,colname)
ORDER BY colno) AS VARCHAR(1500)),'<C>',', '),'</C>',''),3)
FROM syscat.columns WHERE tabschema='${SCHEMA}' AND tabname='${TABLE}'"
Then export the query body:
db2 "EXPORT TO /tmp/body.csv of del
SELECT * FROM ${SCHEMA}.${TABLE}"
Then concatenate the two files:
cat /tmp/header.csv /tmp/body.csv > ${TABLE}.csv
If you just want headers for the extracted data, want those headers to always be on top, and want to be able to rename them so the output looks more user-friendly, all in a CSV file, you can do the following:
# Creates headers and new output file
HEADERS="ID,USERNAME,EMAIL,ACCOUNT DISABLED?"
echo "$HEADERS" > "$OUTPUT_FILE"
# Gets results from database
db2 -x "select ID, USERNAME, DISABLED FROM ${SCHEMA}.USER WHERE lcase(EMAIL)=lcase('$USER_EMAIL')" | while read ID USERNAME DISABLED ;
do
# Appends result to file
echo "${ID},${USERNAME},${USER_EMAIL},${DISABLED}" >> "$OUTPUT_FILE"
done
No temporary files or merging required.
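For completeness, the snippet assumes a few variables are set earlier in the script; a hypothetical setup (these names and values are placeholders, not from the original post) could be:
# Hypothetical placeholders - adjust to your environment
SCHEMA=MYSCHEMA
USER_EMAIL="someone@example.com"
OUTPUT_FILE=/tmp/users.csv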
Db2 for Linux/Unix/Windows lacks a (long overdue) simple option on the export command for this common requirement.
But using the bash shell you can run two separate exports (one for the column headers, the other for the data) and concatenate the results to a file via an intermediate named pipe.
Using an intermediate named pipe means you don't need two flat-file copies of the data.
It is ugly and awkward but it works.
Example fragment (you can initialize the variables to suit your environment):
mkfifo ${target_file_tmp}
(( $? != 0 )) && echo -e "\nERROR: failed to create named pipe ${target_file_tmp}" && exit 1
db2 -v "EXPORT TO ${target_file_header} of del SELECT 'COL1', 'COL2', 'COL3' FROM sysibm.sysdummy1 "
cat ${target_file_header} ${target_file_tmp} >> ${target_file} &
(( $? > 0 )) && echo "Failed to append ${target_file} . Check permissions and free space" && exit 1
db2 -v "EXPORT TO ${target_file_tmp} of del SELECT COL1, COL2, COL3 FROM myTable ORDER BY 1 "
rc=$?
(( rc == 1 )) && echo "Export found no rows matching the query" && exit 1
(( rc == 2 )) && echo "Export completed with warnings, your data might not be what you expect" && exit 1
(( rc > 2 )) && echo "Export failed. Check the messages from export" && exit 1
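The fragment also assumes the file variables were initialized earlier; purely as an illustration (paths invented here), something like:
# Illustrative values only
target_file=/tmp/mytable.csv
target_file_header=/tmp/mytable_header.del
target_file_tmp=/tmp/mytable_body.pipe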
This would work for your simple case
EXPORT TO ...
SELECT C1, C2, C3 FROM (
SELECT 1 as id, 'COL1' as C1, 'COL2' as C2, 'COL3' as C3 FROM sysibm.sysdummy1
UNION ALL
(SELECT 2 as id, COL1, COL2, COL3 FROM myTable)
)
ORDER BY id
Longer term, EXTERNAL TABLE support (already in Db2 Warehouse), which has an INCLUDEHEADER option, will (I guess) eventually appear in Db2.
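As a rough sketch of what that unload form looks like in Db2 Warehouse today (option names written from memory, so treat the exact spelling of INCLUDEHEADER ON as an assumption to verify against the documentation):
CREATE EXTERNAL TABLE '/tmp/mytable.csv'
    USING (DELIMITER ',' INCLUDEHEADER ON)
    AS SELECT COL1, COL2, COL3 FROM myTable;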
I wrote a stored procedure that extracts the header via the describe command. The names can be retrieved from a temporary table and exported to a file. The only thing that is still not possible is to concatenate the files via SQL, so a cat of both files redirected to another file is necessary as a last step.
CALL DBA.GENERATE_HEADERS('SELECT * FROM SYSCAT.TABLES') #
EXPORT TO myfile_header OF DEL SELECT * FROM SESSION.header #
EXPORT TO myfile_body OF DEL SELECT * FROM SYSCAT.TABLES #
!cat myfile_header myfile_body > myfile #
The code of the stored procedure is at: https://gist.github.com/angoca/8a2d616cd1159e5d59eff7b82f672b72
More information at: https://angocadb2.blogspot.com/2019/11/export-headers-of-export-in-db2.html.
I'm trying to run a COPY command that populates the db based on a concatenation of the csv values.
db columns names are:
col1,col2,col3
csv content is (just the numbers, names are the db column names):
1234,5678,5436
What I need is a way to insert data like this, based on my example. I want to put in the db:
col1, col2, col3
1234, 5678, "1234_XX_5678"
Should I use FILLERs? If so, what is the command?
my starting point is:
COPY SAMPLE.MYTABLE (col1,col2,col3)
FROM LOCAL
'c:\\1\\test.CSV'
UNCOMPRESSED DELIMITER ',' NULL AS 'NULL' ESCAPE AS '\' RECORD TERMINATOR '
' ENCLOSED BY '"' DIRECT STREAM NAME 'Identifier_0' EXCEPTIONS 'c:\\1\\test.exceptions'
REJECTED DATA 'c:\\1\\test.rejections' ABORT ON ERROR NO COMMIT;
Can you help with how to load those columns (basically col3)?
Thanks
There are different ways to do this.
1 - Pipe the data into vsql and edit the data on the fly using Linux tools
Eg:
cat file.csv | sed 's/,/ , /g' | awk '{print $1 $2 $3 $4 $1 "_XX_" $3}' |
  vsql -U user -w passwd -d dbname -c "COPY tbl FROM STDIN DELIMITER ',';"
2 - Use Fillers
copy tbl(
v1 filler int ,
v2 filler int ,
v3 filler int,
col1 as v1,
col2 as v2,
col3 as v1||'_XX_'||v2) from '/tmp/file.csv' delimiter ',' direct;
dbadmin=> select * from tbl;
col1 | col2 | col3
------+------+--------------
1234 | 5678 | 1234_XX_5678
(1 row)
I hope this helps :)
You don't even have to make the two input columns - which you load as-is anyway - FILLERs. This will do:
COPY mytable (
col1
, col2
, col3f FILLER int
, col3 AS col1::CHAR(4)||'_XX_'||col2::CHAR(4)
)
FROM LOCAL 'foo.txt'
DELIMITER ','
I have the result for the query:
SELECT leg_store_wh_code || ',' || rms_location col1 from SKS_CNV_LOCATION_XREF
as:
101,101 1,601 202,602 3,603 4,604 207,607 8,608 9,609 10,610 212,612 613,613 14,614 16,616 17,617 18,618 619,619 20,620 21,621 23,623 24,624 85,625 26,626 28,628 29,629 30,630 31,631 32,632 90,633 34,634 635,635 36,636 (store_list_result holds this)
I want to use all the values inside the loop, but only the first value is printed; it never moves on to the next values. Can anyone help me?
store_list="SELECT leg_store_wh_code || ',' || rms_location col1 from SKS_CNV_LOCATION_XREF ;"
store_list_result=`sqlplus -s $UP <<EOF
SET FEEDBACK OFF
SET HEAD OFF
SET AUTOPRINT OFF
SET LINESIZE 1000
SET TAB OFF
SET ECHO OFF
SET PAGESIZE 0
SET TERMOUT OFF
SET TRIMSPOOL ON
${store_list}
exit
EOF`
for i in store_list_result
do
LEG_ID=`echo $store_list_result | cut -d',' -f1`
echo $LEG_ID
RMS_ID=`echo $store_list_result | cut -d',' -f2 | cut -d' ' -f1`
echo $RMS_ID
The result I got is:
101
101
for i in store_list_result do LEG_ID=`echo $store_list_result | cut -d',' -f1`
How is it ever supposed to work? You are iterating once ($i is the literal string "store_list_result", not that variable's value), and then you use $store_list_result every time (still only once, and it's wrong anyway). And format your code properly; the backticks are getting eaten.
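A minimal corrected sketch, keeping the variable names from the question and assuming none of the values contain spaces, iterates over the contents of the variable rather than its name:
for pair in $store_list_result
do
    LEG_ID=$(echo "$pair" | cut -d',' -f1)
    RMS_ID=$(echo "$pair" | cut -d',' -f2)
    echo "$LEG_ID"
    echo "$RMS_ID"
done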
I have a shell script environment variable being populated by an SQL command.
The SQL command returns multiple records of 3 columns; I need to pass each record to another shell script.
QUERYRESULT=`${SQLPLUS_COMMAND} -s ${SQL_USER}/${SQL_PASSWD}@${SQL_SCHEMA}<<endl
set heading off feedback off
select col1, col2, col3
from mytable
where ......)
order by ......
;
exit
endl`
echo ${QUERYRESULT}
outputs a single line of all columns, space separated; all values are guaranteed to be non-null:
val1 val2 val3 val1 val2 val3 val1 val2 val3 ......
I need to call the following for each record
nextScript.bash val1 val2 val3
I could also run the query a second time just to count the records, to determine how many times I need to call nextScript.bash.
Any thoughts on how to get from a single env variable holding multiple sets of 3 parameters to executing the next script for each record?
Without the use of a variable:
( ${SQLPLUS_COMMAND} -s ${SQL_USER}/${SQL_PASSWD}@${SQL_SCHEMA}<<endl
set heading off feedback off
select col1, col2, col3
from mytable
where ......)
order by ......
;
exit
endl
) | while read line;do echo $line; done
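To actually launch the next script once per record, the echo in that final loop can be replaced so the three space-separated columns are read into separate variables (a sketch; it relies on the question's guarantee that no value is null or contains spaces):
) | while read val1 val2 val3; do
    # one output line = one record; skip any blank lines
    [ -n "$val1" ] && nextScript.bash "$val1" "$val2" "$val3"
done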
Shell Script
#! /bin/bash
sqlplus -s <username>/<passwd>@dbname << EOF
set echo on
set pagesize 0
set verify off
set lines 32000
set trimspool on
set feedback off
SELECT *
FROM <dbname>.<tablename1> tr
LEFT JOIN <tablename2> t2 ON t2.id2 = tr.id1
LEFT JOIN <tablename3> t3 ON t3.id2 = tr.id1
LEFT JOIN <tablename4> t4 ON t4.id2 = tr.id1
WHERE tr.TIMESTAMP > SYSDATE - 75 / 1440
AND tr.TIMESTAMP <= SYSDATE - 15 / 1440
AND t2.value in ( value1, value2, etc...)
ORDER BY timestamp;
exit;
EOF
Now, the purpose is to read 32,000 values in the t2.value column. These values are only numbers like 1234, 4567, 1236, etc. I guess I should put these numbers in a separate file and then read that file into t2.value. But I want the SQL to be executed only once, not 32,000 times. Can you please advise how that is possible? How can I get the values (separated by commas) into t2.value (by some loop, probably reading line by line)?
You could use SQL*Loader to load those values into a temporary table that you have created beforehand, with an index on its only column.
sqlldr user/password@sid control=ctl_file
Contents of ctl_file:
load data
infile *
append
into table MY_TEMP_TABLE
fields terminated by ";" optionally enclosed by '"'
(
column1
)
begindata
"value1"
"value2"
[...]
(Double quotes are optional, and unneeded for numbers.)
Then modify your query with:
AND t2.value in (SELECT column1 FROM my_temp_table)
and DROP my_temp_table afterwards.
You can create a comma separated list from the file that contains all the numbers one per line as:
t2val=$(cat your_file_with_numbers | tr '\n' ',' | sed 's/,$//')
Next you can use this variable $t2val as:
....
and t2.value in ( ${t2val} )
We replace the \n between the lines with a comma and delete the last comma, which has no number following it, as it would create a syntax error in Oracle.
#!/bin/bash
t2val=$(cat /home/trnid | tr '\n' ',' | sed 's/,$//')
sqlplus -s <username>/<passwd>@dbname > /home/file << EOF
set echo on
set pagesize 0
set verify off
set lines 32000
set trimspool on
set feedback off
SELECT *
FROM <dbname>.<tablename1> tr
LEFT JOIN <tablename2> t2 ON t2.id2 = tr.id1
LEFT JOIN <tablename3> t3 ON t3.id2 = tr.id1
LEFT JOIN <tablename4> t4 ON t4.id2 = tr.id1
WHERE tr.TIMESTAMP > SYSDATE - 75 / 1440
AND tr.TIMESTAMP <= SYSDATE - 15 / 1440
and t2.value in ( "$t2val")
order by timestamp;
exit;
EOF
The trnid file has a total of 32,000 lines (each number on a separate line). The length of each number is 11 digits.
I just happened to see a different error:
Input truncated to 7499 characters
SP2-0027: Input is too long (> 2499 characters) - line ignored
Input truncated to 7499 characters
SP2-0027: Input is too long (> 2499 characters) - line ignored.
The previous error I got was because I had inserted the numbers in the trnid file separated by commas and on different lines. In that case, I used only the command:
t2val=$(cat /home/trnid )