Folks,
H2 skips the FIRST line of the following CSV dataset,
and I couldn't find a solution or workaround.
I have already looked through the various H2 tutorials and, of course, searched
the internet.
Am I the only one (a newbie; my "home" is the IBM mainframe)
who has a problem inserting into an H2 database using CSVREAD?
In this example I expected CSVREAD to insert 5 (five!) lines
into the created table "VL01T098".
Note: there is no column-header line in the CSV dataset; this is the only form in which I receive the data.
AJ52B1;999;2013-01-04;2014-03-01;03Z;A
AJ52C1;777;2012-09-03;2012-08-19;03Z;
AJ52B1;;2013-01-04;2014-03-01;;X
AJ52B1;321;2014-05-12;;03Z;Y
AJ52B1;999;;2014-03-01;03Z;Z
And here is my SQL (from the H2-joboutput):
DROP TABLE IF EXISTS VL01T098;
Update count: 0
(0 ms)
CREATE TABLE VL01T098 (
MODELL CHAR(6)
, FZG_STAT CHAR(3)
, ABGABE_DATUM DATE
, VERSAND_DATUM DATE
, FZG_GRUPPE CHAR(3)
, AV_KZ CHAR(1))
AS SELECT * FROM
CSVREAD
('D:\VL01D_Test\LOAD-csv\T098.csv',
null,
'charset=UTF-8 fieldSeparator=; lineComment=#');
COMMIT;
select count(*) from VL01T098;
select * from VL01T098;
MODELL FZG_STAT ABGABE_DATUM VERSAND_DATUM FZG_GRUPPE AV_KZ
AJ52C1 777 2012-09-03 2012-08-19 03Z null
AJ52B1 null 2013-01-04 2014-03-01 null X
AJ52B1 321 2014-05-12 null 03Z Y
AJ52B1 999 null 2014-03-01 03Z Z
(4 rows, 0 ms)
Where has the first CSV line gone, and why is it lost?
Could you please help an H2 newbie with some IBM DB2 experience?
Many thanks in advance
Achim
You didn't specify a column list in the CSVREAD function. That means the column list is read from the file, as documented:
If the column names are specified (a list of column names separated
with the fieldSeparator), those are used, otherwise (or if they are
set to NULL) the first line of the file is interpreted as the column
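names.
To keep the first line as data, pass the column names as the second argument instead of null, separated by the fieldSeparator: 'MODELL;FZG_STAT;ABGABE_DATUM;VERSAND_DATUM;FZG_GRUPPE;AV_KZ'.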
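The header rule quoted from the documentation is not specific to H2. As a rough analogy only, Python's standard csv.DictReader behaves the same way: with no field names it consumes the first line as the header, and with explicit names every line is data. The sketch below uses the five sample lines from the question. It does not call H2, and the helper name read_rows is made up for illustration.

```python
import csv
import io

# The five sample lines from the question (no header line in the data).
CSV_TEXT = """\
AJ52B1;999;2013-01-04;2014-03-01;03Z;A
AJ52C1;777;2012-09-03;2012-08-19;03Z;
AJ52B1;;2013-01-04;2014-03-01;;X
AJ52B1;321;2014-05-12;;03Z;Y
AJ52B1;999;;2014-03-01;03Z;Z
"""

COLUMNS = ["MODELL", "FZG_STAT", "ABGABE_DATUM",
           "VERSAND_DATUM", "FZG_GRUPPE", "AV_KZ"]


def read_rows(text, fieldnames=None):
    """Hypothetical helper: parse semicolon-separated text into dicts."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=fieldnames,
                            delimiter=";")
    return list(reader)


# No column names given: the first line is consumed as the header.
without_names = read_rows(CSV_TEXT)
# Column names given: all five lines are treated as data.
with_names = read_rows(CSV_TEXT, COLUMNS)

print(len(without_names), len(with_names))
```

The first count is one lower than the second because the first line has been used up as column names, which matches the missing row in the question.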
For example, take the queries below:
create table t1 ( age int, name varchar(10) );
insert into t1 values(1, 'name'),(2, 'surname');
copy select * from t1 into 't1.dat' DELIMITERS '|','
','"' null as '';
The copy select command returns -1 as the affected row count, although it should return 2. I'm not sure why. At other times I have seen the same query return the correct affected row count.
If I run the same query in DBeaver, the tool I am using, I see this:
Updated Rows: -1
Query: copy select * from t1 into 't1.dat' DELIMITERS '|','
','"' null as ''
Finish time: Sat Apr 30 16:53:28 IST 2022
I think you have to use an absolute path to the export file, e.g.
copy select * from t1 into '/home/user/t1.dat' DELIMITERS '|','
','"' null as '';
When you see a message like "2 affected rows", it might actually be the result summary of the successfully completed INSERT INTO statement. That doesn't mean the COPY INTO <file> statement succeeded.
I want to store the values of the checked names from a checkbox in Oracle APEX in a table, but I am unable to do that.
I searched Google for help, but the steps I found didn't work either.
Could you please advise how to do that? I started from a blank form and added a checkbox to it, and the checkbox values come from an LOV.
As I understand, you are using a List of Values as the source of your checkboxes. Let's say you have the following values there:
return value   display value
----------------------------
123            andAND
456            Dibya
789            Anshul
321            aafirst
555            Anuj
When you select several values, APEX puts their return values into a single string, separated by colons (:). So for the case in your screenshot, the value of item P12_NEW will be 123:555. To split these values, you can use the following query:
select regexp_substr(:P12_NEW, '[^:]+', 1, level) val
from dual
connect by regexp_substr(:P12_NEW, '[^:]+', 1, level) is not null
The result will be:
val
------
123
555
Next, you need to put these values into table usr_amt (let's say with columns user_id and amount, and the amount is entered into item P12_AMOUNT):
merge into usr_amt t
using (select regexp_substr(:P12_NEW, '[^:]+', 1, level) user_id
from dual
connect by regexp_substr(:P12_NEW, '[^:]+', 1, level) is not null
) n on (n.user_id = t.user_id)
when matched then update
set t.amount = :P12_AMOUNT
when not matched then insert (user_id, amount)
values (n.user_id, :P12_AMOUNT)
This query looks up each selected user_id in the table. If the user is already there, it updates that row's amount to the value of item :P12_AMOUNT; if not, it inserts a row with the user_id and that value.
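The split-then-upsert logic of the MERGE can also be tried outside APEX. The following is a minimal sketch under stated assumptions: it uses Python with an in-memory SQLite database in place of Oracle, a plain UPDATE followed by INSERT in place of MERGE, and a made-up helper name save_selection. The table and column names usr_amt, user_id and amount are the ones assumed in the answer.

```python
import sqlite3


def save_selection(conn, selection, amount):
    """Hypothetical helper: upsert one row per colon-separated user id."""
    user_ids = [part for part in selection.split(":") if part]
    for user_id in user_ids:
        cur = conn.execute(
            "UPDATE usr_amt SET amount = ? WHERE user_id = ?",
            (amount, int(user_id)),
        )
        if cur.rowcount == 0:  # no existing row matched, so insert one
            conn.execute(
                "INSERT INTO usr_amt (user_id, amount) VALUES (?, ?)",
                (int(user_id), amount),
            )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE usr_amt (user_id INTEGER PRIMARY KEY, amount INTEGER)"
)
conn.execute("INSERT INTO usr_amt VALUES (123, 10)")  # an existing user

# "123:555" is the checkbox value from the example; 50 stands in for P12_AMOUNT.
save_selection(conn, "123:555", 50)
rows = conn.execute(
    "SELECT user_id, amount FROM usr_amt ORDER BY user_id"
).fetchall()
print(rows)
```

User 123 already exists and is updated, while user 555 is new and is inserted, which mirrors the matched and not matched branches of the MERGE.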
I'm reading over some Hive scripts from another team in my company and having trouble understanding a specific part of them. The part in question is where dt='${product_dt}', which can be found on the third line from the bottom of the code chunk below.
I've never seen this syntax before nor am I able to find anything via Google search (probably because I don't know the correct search terms to use). Any insight into what that where row filter step is doing would be appreciated.
set hive.security.authorization.enabled=false;
add jar /opt/mobiletl/prod_workflow_dir/lib/hiveudf_hash.jar;
create temporary function hash_string as 'HashString';
drop table 00_truthset_product_email_uid_pid;
create table 00_truthset_product_email_uid_pid as
select distinct email,
concat_ws('|', hash_string(lower(email), "SHA-1"),
hash_string(lower(email), "MD5"),
hash_string(upper(email), "SHA-1"),
hash_string(upper(email), "MD5")) as hashed_email,
uid, address_id, confidencescore
from product.prod_vintages
where dt='${product_dt}'
and email is not null and email != ''
and address_id is not null and address_id != '';
I tried set product_dt = 2014-12;, but it doesn't seem to work:
hive> SELECT dt FROM enabilink.prod_vintages GROUP BY dt LIMIT 10;
. . .
dt
2014-12
2015-01
2015-02
2015-03
2015-05
2015-07
2015-10
2016-01
2016-02
2016-03
hive> set product_dt = 2014-12;
hive> SELECT email FROM product.prod_vintages WHERE dt='${product_dt}';
. . .
Total MapReduce CPU Time Spent: 2 seconds 570 msec
OK
email
Time taken: 25.801 seconds
Those are variables set in Hive. If you have set the variable before the query (in the same session), Hive will replace it with the specified value.
For example:
set product_dt=03-11-2012;
Edit
Make sure that you are removing the spaces in your dt field (use trim UDF). Also, set the variable without spaces.
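The idea of a ${name} placeholder being replaced textually before the query runs can be illustrated with Python's string.Template, which happens to use the same placeholder syntax. This is only an analogy and not Hive's own substitution code, so it says nothing about how Hive itself treats spaces or unset variables. The query text is taken from the question.

```python
from string import Template

# Query text containing a ${name} placeholder, as in the script above.
query = Template(
    "SELECT email FROM product.prod_vintages WHERE dt='${product_dt}'"
)

# With a value supplied, the placeholder is replaced before the query runs.
resolved = query.substitute(product_dt="2014-12")

# A value with stray whitespace yields a literal that no longer equals '2014-12'.
padded = query.substitute(product_dt=" 2014-12")

# With no value supplied, Python's safe_substitute leaves the placeholder as is.
unresolved = query.safe_substitute()

print(resolved)
print(padded)
print(unresolved)
```

Printing the final query text like this is a quick way to see why a filter returns no rows: either the placeholder was never replaced, or the substituted literal differs from the stored dt values.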
The scenario
I've got two tables with identical structure.
TABLE [INFORMATION], [SYNC_INFORMATION]
[ITEM] [nvarchar](255) NOT NULL
[DESCRIPTION] [nvarchar](255) NULL
[EXTRA] [nvarchar](255) NULL
[UNIT] [nvarchar](2) NULL
[COST] [float] NULL
[STOCK] [nvarchar](1) NULL
[CURRENCY] [nvarchar](255) NULL
[LASTUPDATE] [nvarchar](50) NULL
[IN] [nvarchar](4) NULL
[CLIENT] [nvarchar](255) NULL
I'm trying to create a synchronize procedure that will be triggered by a scheduled event at a given time every day.
CREATE PROCEDURE [dbo].[usp_SynchronizeInformation]
AS
BEGIN
SET NOCOUNT ON;
--Update all rows
UPDATE TARGET_TABLE
SET TARGET_TABLE.[DESCRIPTION] = SOURCE_TABLE.[DESCRIPTION],
TARGET_TABLE.[EXTRA] = SOURCE_TABLE.[EXTRA],
TARGET_TABLE.[UNIT] = SOURCE_TABLE.[UNIT],
TARGET_TABLE.[COST] = SOURCE_TABLE.[COST],
TARGET_TABLE.[STOCK] = SOURCE_TABLE.[STOCK],
TARGET_TABLE.[CURRENCY] = SOURCE_TABLE.[CURRENCY],
TARGET_TABLE.[LASTUPDATE] = SOURCE_TABLE.[LASTUPDATE],
TARGET_TABLE.[IN] = SOURCE_TABLE.[IN],
TARGET_TABLE.[CLIENT] = SOURCE_TABLE.[CLIENT]
FROM SYNC_INFORMATION TARGET_TABLE
JOIN LSERVER.dbo.INFORMATION SOURCE_TABLE ON TARGET_TABLE.ITEMNO = SOURCE_TABLE.ITEMNO
WHERE TARGET_TABLE.ITEMNO = SOURCE_TABLE.ITEMNO
--Add new rows
INSERT INTO SYNC_INFORMATION (ITEMNO, DESCRIPTION, EXTRA, UNIT, STANDARDCOST, STOCKTYPE, CURRENCY_ID, LASTSTANDARDUPDATE, IN_ID, CLIENTCODE)
SELECT
src.ITEM,
src.DESCRIPTION,
src.EXTRA,
src.UNIT,
src.COST,
src.STOCKTYPE,
src.CURRENCY_ID,
src.LASTUPDATE,
src.IN,
src.CLIENT
FROM LSERVER.dbo.INFORMATION src
LEFT JOIN SYNC_INFORMATION targ ON src.ITEMNO = targ.ITEMNO
WHERE
targ.ITEMNO IS NULL
END
Currently, this procedure (including some others that are also executed at the same time) takes about 15 seconds to execute.
I'm planning on adding a "Synchronize" button in my work interface so that users can manually synchronize when, for instance, a new item is added and needs to be used the same day.
But in order for me to do that, I need to trim those 15 seconds as much as possible.
Instead of updating every single row, as my procedure does, is it possible to update only the rows whose values do not match?
This would greatly increase the execution speed, since it wouldn't have to update all 4000 rows when maybe only 20 actually need it.
Can this be done in a better way, or optimized?
Does it need improvements, if yes, where?
How would you solve this?
Would also appreciate some time differences between the solutions so I can compare them.
UPDATE
Using marc_s's CHECKSUM is really brilliant. The problem is that in some instances different information produces the same checksum. Here's an example. Because the content is classified, I can only show you 2 columns, but I can say that all columns have identical information except these 2. To clarify: this screenshot is of all the rows that had duplicate CHECKSUMs. These are also the only rows with a hyphen in the ITEM column; I've checked.
The query was simply
SELECT *, CHECKSUM(*) FROM SYNC_INFORMATION
If you can change the table structure ever so slightly - you could add a computed CHECKSUM column to your two tables, and in the case the ITEM is identical, you could then check that checksum column to see if there are any differences at all in the columns of the table.
If you can do this - try something like this here:
ALTER TABLE dbo.[INFORMATION]
ADD CheckSumColumn AS CHECKSUM([DESCRIPTION], [EXTRA], [UNIT],
[COST], [STOCK], [CURRENCY],
[LASTUPDATE], [IN], [CLIENT]) PERSISTED
Of course, only include those columns that should be considered when deciding whether a source and a target row are identical (this depends on your needs and requirements).
This persists a new column to your table, which is calculated as the checksum over the columns specified in the list of arguments to the CHECKSUM function.
This value is persisted, i.e. it could be indexed, too! :-O
Now, you could simplify your UPDATE to
UPDATE TARGET_TABLE
SET ......
FROM SYNC_INFORMATION TARGET_TABLE
JOIN LSERVER.dbo.INFORMATION SOURCE_TABLE ON TARGET_TABLE.ITEMNO = SOURCE_TABLE.ITEMNO
WHERE
TARGET_TABLE.ITEMNO = SOURCE_TABLE.ITEMNO
AND TARGET_TABLE.CheckSumColumn <> SOURCE_TABLE.CheckSumColumn
Read more about the CHECKSUM T-SQL function on MSDN!
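The "update only what differs" idea can be sketched outside SQL Server as well. The Python example below is a minimal illustration under stated assumptions: it swaps CHECKSUM for a SHA-256 digest from hashlib (which makes accidental collisions like those reported in the question's update far less likely), keeps NULL and the empty string distinct, and uses made-up sample rows and helper names. In SQL Server itself, HASHBYTES would be the closer built-in counterpart, but that swap is a suggestion and not part of the answer above.

```python
import hashlib


def row_digest(row):
    """Hash the compared columns; None and '' are kept distinct."""
    parts = []
    for value in row:
        parts.append("\x00N" if value is None else "\x00V" + str(value))
    return hashlib.sha256("".join(parts).encode("utf-8")).hexdigest()


def changed_keys(source, target):
    """Return keys present in both tables whose compared columns differ."""
    return sorted(
        key for key, row in source.items()
        if key in target and row_digest(row) != row_digest(target[key])
    )


# Made-up sample data: ITEM -> (DESCRIPTION, UNIT, COST)
source = {
    "A-1": ("Bolt", "PC", 1.5),
    "A-2": ("Nut", "PC", 0.25),
    "B-1": ("Washer", None, 0.1),
}
target = {
    "A-1": ("Bolt", "PC", 1.5),   # identical, skipped
    "A-2": ("Nut", "PC", 0.2),    # COST differs, needs an update
    "B-1": ("Washer", "", 0.1),   # NULL vs empty string differs
}

print(changed_keys(source, target))
```

Only the keys returned by changed_keys would need an UPDATE, which is the same filtering the extra CheckSumColumn comparison performs in the WHERE clause.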
Question:
Is it possible to have a column name in a select statement changed based on a value in its result set?
For example, if a year value in a result set is less than 1950, name the column OldYear, otherwise name the column NewYear. The year value in the result set is guaranteed to be the same for all records.
I'm thinking this is impossible, but here was my failed attempt to test the idea:
select 1 as
(case
when 2 = 1 then "name1";
when 1 = 1 then "name2")
from dual;
You can't vary a column name per row of a result set. This is basic to relational databases. The names of columns are part of the table "header" and a name applies to the column under it for all rows.
Re comment: OK, maybe the OP Americus means that the result is known to be exactly one row. But regardless, SQL has no syntax to support a dynamic column alias. Column aliases must be constant in a query.
Even dynamic SQL doesn't help, because you'd have to run the query twice. Once to get the value, and a second time to re-run the query with a different column alias.
The "correct" way to do this in SQL is to have both columns, and have the column that is inappropriate be NULL, such as:
SELECT
CASE WHEN year < 1950 THEN year ELSE NULL END AS OldYear,
CASE WHEN year >= 1950 THEN year ELSE NULL END AS NewYear
FROM some_table_with_years;
There is no good reason to change the column name dynamically - it's analogous to the name of a variable in procedural code - it's just a label that you might refer to later in your code, so you don't want it to change at runtime.
I'm guessing what you're really after is a way to format the output (e.g. for printing in a report) differently depending on the data. In that case I would generate the heading text as a separate column in the query, e.g.:
SELECT 1 AS mydata
,case
when 2 = 1 then 'name1'
when 1 = 1 then 'name2'
end AS myheader
FROM dual;
Then the calling procedure would take the values returned for mydata and myheader and format them for output as required.
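What the calling side might look like is sketched below in Python. The assumptions are loud ones: an in-memory SQLite database stands in for Oracle (so there is no FROM dual), and the report layout is invented for illustration. The query itself has the same shape as the one above, returning the data and the heading text as two ordinary columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same shape as the query above; SQLite needs no FROM dual.
row = conn.execute(
    """
    SELECT 1 AS mydata,
           CASE
             WHEN 2 = 1 THEN 'name1'
             WHEN 1 = 1 THEN 'name2'
           END AS myheader
    """
).fetchone()

mydata, myheader = row

# The caller, not the query, decides what heading is shown.
report = f"{myheader}\n{'-' * len(myheader)}\n{mydata}"
print(report)
```

The column aliases stay constant, and only the displayed heading varies with the data.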
You will need something similar to this:
select 'select ' || CASE WHEN YEAR<1950 THEN 'OLDYEAR' ELSE 'NEWYEAR' END || ' FROM TABLE 1' from TABLE_WITH_DATA
This solution requires that you launch SQLPLUS and a .sql file from a .bat file or using some other method with the appropriate Oracle credentials. The .bat file can be kicked off manually, from a server scheduled task, Control-M job, etc...
Output is a .csv file. This also requires that you replace all commas in the output with some other character or risk column/data mismatch in the output.
The trick is that your column headers and data are selected in two different SELECT statements.
It isn't perfect, but it does work, and it's the closest to standard Oracle SQL that I've found for a dynamic column header outside of a development environment. We use this extensively to generate recurring daily/weekly/monthly reports to users without resorting to a GUI. Output is saved to a shared network drive directory/Sharepoint.
REM BEGIN runExtract1.bat file -----------------------------------------
sqlplus username/password@database @C:\DailyExtracts\Extract1.sql > C:\DailyExtracts\Extract1.log
exit
REM END runExtract1.bat file -------------------------------------------
REM BEGIN Extract1.sql file --------------------------------------------
set colsep ,
set pagesize 0
set trimspool on
set linesize 4000
column dt new_val X
select to_char(sysdate,'MON-YYYY') dt from dual;
spool c:\DailyExtracts\&X._Extract1.csv
select '&X-Project_id', 'datacolumn2-Project_Name', 'datacolumn3-Plant_id' from dual;
select
PROJ_ID
||','||
replace(PROJ_NAME,',',';')-- "Project Name"
||','||
PLANT_ID-- "Plant ID"
from PROJECTS
where ADDED_DATE >= TO_DATE('01-'||to_char(sysdate,'MON-YYYY'), 'DD-MON-YYYY');
spool off
exit
/
REM ------------------------------------------------------------------
CSV OUTPUT (opened in Excel and copy/pasted):
old 1: select '&X-Project_id' 'datacolumn2-Project_Name' 'datacolumn3-Plant_id' from dual
new 1: select 'MAR-2018-Project_id' 'datacolumn2-Project_Name' 'datacolumn3-Plant_id' from dual
MAR-2018-Project_id datacolumn2-Project_Name datacolumn3-Plant_id
31415 name1 1007
31415 name1 2032
32123 name2 3302
32123 name2 3384
32963 name3 2530
33629 name4 1161
34180 name5 1173
34180 name5 1205
...
...
etc...
135 rows selected.
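For comparison, the same two-step pattern (a date-stamped header written first, then the data) can be sketched in Python with the standard csv module. This is an alternative illustration and not part of the SQL*Plus solution above; the helper name build_report and the sample rows are made up. One difference worth noting is that csv.writer quotes fields containing commas, so the names do not need their commas replaced.

```python
import csv
import io
from datetime import date


def build_report(rows, run_date):
    """Hypothetical helper: return CSV text with a date-stamped header."""
    prefix = run_date.strftime("%b-%Y").upper()  # e.g. MAR-2018
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header and data are written separately, like the two SELECTs above.
    writer.writerow([f"{prefix}-Project_id", "Project_Name", "Plant_id"])
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up rows; the second project name contains a comma.
rows = [(31415, "name1", 1007), (32123, "name2, phase 2", 3302)]
text = build_report(rows, date(2018, 3, 1))
print(text)
```

Reading the text back with csv.reader returns the comma-containing name as a single field, so the columns stay aligned.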