Valid MySQL query breaks when used as boundary-query - sqoop

Note: This is NOT a duplicate of Sqoop - Syntaxt error - Boundary Query - “error in your SQL syntax”
To limit the fetch to only the last 8 days, I'm using the following boundary-query with Sqoop
SELECT min(`created_at`),
max(`created_at`)
FROM `billing_db`.`billing_ledger`
WHERE `created_at` >= timestamp(date(convert_tz(now(), IF(@@global.time_zone = 'SYSTEM', @@system_time_zone, @@global.time_zone),'Asia/Kolkata')) + interval -2 DAY)
I've broken the query into multiple lines here for readability; I actually pass it to Sqoop as a single line.
An explanation of the different parts of the boundary-query:
IF(@@global.time_zone = 'SYSTEM', @@system_time_zone, @@global.time_zone)
determines the server timezone
works for both MySQL & TiDB
convert_tz(now(), <server-timezone>,'Asia/Kolkata')
converts the time from the server timezone to IST
timestamp(date(<ist-timestamp>) + interval -{num_days} DAY)
returns the IST timestamp at 00:00 hours for the date which is {num_days} before today (current-time -> tz-specific)
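The cutoff that this WHERE clause computes can be reproduced outside the database as a sanity check. The following is a minimal Python sketch, not part of the original pipeline; the function name `ist_midnight_cutoff` and the fixed +05:30 offset used for IST are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: IST is modelled as a fixed UTC+05:30 offset (India has no DST).
IST = timezone(timedelta(hours=5, minutes=30))


def ist_midnight_cutoff(now_utc: datetime, num_days: int) -> datetime:
    """Return IST midnight of the date num_days before 'today' in IST.

    Mirrors timestamp(date(convert_tz(now(), <server-tz>, 'Asia/Kolkata'))
    + interval -num_days DAY) and returns a naive IST wall-clock value.
    """
    today_ist = now_utc.astimezone(IST).date()
    cutoff_date = today_ist - timedelta(days=num_days)
    return datetime(cutoff_date.year, cutoff_date.month, cutoff_date.day)


if __name__ == "__main__":
    now = datetime(2020, 5, 10, 14, 45, tzinfo=timezone.utc)  # 20:15 IST
    print(ist_midnight_cutoff(now, 2))  # 2020-05-08 00:00:00
```

With a current time of 20:15 IST on 2020-05-10 and two days of lookback, the cutoff is 2020-05-08 00:00:00, which agrees with the minimum value in the MySQL result of this post.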
While the query works fine on MySQL
mysql> SELECT min(`created_at`),
-> max(`created_at`)
-> FROM `billing_db`.`billing_ledger`
-> WHERE `created_at` >= timestamp(date(convert_tz(now(), IF(@@global.time_zone = 'SYSTEM', @@system_time_zone, @@global.time_zone),'Asia/Kolkata')) + interval -2 DAY);
+---------------------+---------------------+
| min(`created_at`) | max(`created_at`) |
+---------------------+---------------------+
| 2020-05-08 00:00:00 | 2020-05-10 20:12:32 |
+---------------------+---------------------+
1 row in set (0.02 sec)
It breaks with the following stack trace on Sqoop
INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT min(), max() FROM . WHERE >= timestamp(date(convert_tz(now(), IF(@@global.time_zone = 'SYSTEM', @@system_time_zone, @@global.time_zone),'Asia/Kolkata')) + interval -2 DAY)
[2020-05-10 12:45:34,968] {ssh_utils.py:130} WARNING - 20/05/10 18:15:34 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/hadoop/.staging/job_1589114450995_0001
[2020-05-10 12:45:34,971] {ssh_utils.py:130} WARNING - 20/05/10 18:15:34 DEBUG util.ClassLoaderStack: Restoring classloader: sun.misc.Launcher$AppClassLoader@6ab7a896
[2020-05-10 12:45:34,973] {ssh_utils.py:130} WARNING - 20/05/10 18:15:34 ERROR tool.ImportTool: Import failed: java.io.IOException: java.sql.SQLSyntaxErrorException: (conn=313686) You have an error in your SQL syntax; check the manual that corresponds to your TiDB version for the right syntax to use line 1 column 12 near "), max() FROM . WHERE >= timestamp(date(convert_tz(now(), IF(@@global.time_zone = 'SYSTEM', @@system_time_zone, @@global.time_zone),'Asia/Kolkata')) + interval -2 DAY)"
at org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat.getSplits(DataDrivenDBInputFormat.java:207)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:303)
For the record
Using WHERE $CONDITIONS is required in --query (free-form query import), but for --boundary-query it is NOT mandatory. Without it, Sqoop merely generates this warning:
WARN db.DataDrivenDBInputFormat: Could not find $CONDITIONS token in query: SELECT min(), max() FROM . WHERE >= timestamp(date(convert_tz(now(), IF(@@global.time_zone = 'SYSTEM', @@system_time_zone, @@global.time_zone),'Asia/Kolkata')) + interval -2 DAY); splits may not partition data.
I've been using similarly complex boundary queries elsewhere in my pipeline, but in this particular case it breaks
What I have tried
I tried adding aliases in SELECT clause of query like this
SELECT min(`created_at`) AS min_created_at,...

Backticks `` were the culprit
Removing backticks from boundary-query resolved the error
Some comments in discussions point out that backticks can cause weird things with Sqoop
But the docs make no mention of it, and some discussions even encourage using them
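A plausible mechanism, which the post does not confirm: if the command line reaches Sqoop through a POSIX shell with the query inside double quotes, the shell treats each backtick pair as command substitution. That would match the empty identifiers in the logged BoundingValsQuery (`min(), max() FROM .`). The sketch below shows one way to keep backticks intact by quoting the argument with Python's `shlex.quote`; the helper name `boundary_query_arg` is hypothetical and the other `sqoop import` options are left out.

```python
import shlex


def boundary_query_arg(query: str) -> str:
    """Return a shell-safe '--boundary-query ...' fragment.

    shlex.quote wraps the text in single quotes, inside which a POSIX
    shell performs no backtick command substitution.
    """
    return "--boundary-query " + shlex.quote(query)


if __name__ == "__main__":
    q = ("SELECT min(`created_at`), max(`created_at`) "
         "FROM `billing_db`.`billing_ledger`")
    fragment = boundary_query_arg(q)
    # Tokenising the fragment with POSIX quoting rules recovers the
    # query unchanged, backticks included.
    print(shlex.split(fragment)[1] == q)
```

Dropping the backticks, as done above, avoids the problem entirely when the identifiers do not need quoting.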

Related

Try to convert Time to Sec in Oracle SQL

I am trying to convert a time to seconds, but whatever I try I get an error message.
The following query is what I have done so far
SELECT
SUM(TIME_TO_SEC(mi.Time)),
uti.Date_
FROM
users ui
LEFT JOIN project_users pui
ON(ui.UserID = pui.UserID)
LEFT JOIN user_timesheets uti
ON(ui.UserID = uti.user_id)
LEFT JOIN moments mi
ON(uti.UserTimesheetsID = mi.UserTimesheetsID)
WHERE
uti.user_id = 1 AND mi.Time != ''
AND
EXTRACT(MONTH FROM uti.Date_) = '2020-01-21'
AND
EXTRACT(YEAR FROM uti.Date_) = '2020-01-21'
AND
mi.AtestStatus = 1
GROUP BY
uti.Date_
HAVING SUM(SELECT(TIME_TO_SEC(mi.Time))) > 28800;
I get this error
ORA-00936: missing expression
00936. 00000 - "missing expression"
*Cause:
*Action:
Error at Line: 74 Column: 36
I am not sure what to use here for the conversion, but so far I have tried TO_CHAR and CAST
The reference link is here
REFERENCE
You are referring to the TIME_TO_SEC function from the MySQL documentation, though the question is tagged oracle. Use extract(second ...), or search for an Oracle equivalent of extracting the epoch, depending on what you want.
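For reference, what MySQL's TIME_TO_SEC computes is simple arithmetic, sketched here in Python. The 'HH:MM:SS' string format for mi.Time is an assumption, since the post does not show the column type, and the function name is hypothetical.

```python
def time_to_seconds(hhmmss: str) -> int:
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


if __name__ == "__main__":
    # 28800 seconds, the threshold in the HAVING clause, is 8 hours.
    print(time_to_seconds("08:00:00"))  # 28800
```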
Also, the expressions EXTRACT(MONTH FROM uti.Date_) = '2020-01-21' and EXTRACT(YEAR... look suspicious: the returned values are definitely not of the form 'YYYY-MM-DD'.
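To illustrate the point about EXTRACT with a Python analogue (an analogy only, not Oracle code): the month and year parts of a date are plain numbers.

```python
from datetime import date

d = date(2020, 1, 21)

# Like EXTRACT(MONTH ...) and EXTRACT(YEAR ...), these are plain numbers,
# so comparing them with the string '2020-01-21' cannot succeed.
print(d.month)  # 1
print(d.year)   # 2020
print(d.month == "2020-01-21")  # False
```

The comparison in the WHERE clause would therefore need numeric values such as 1 and 2020.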

Merge statement issue - error unable to get a stable set of rows in the source tables

Hi, I am getting an error while running the merge statement below in an Oracle DB. Can you please let me know how to fix it?
--Query
MERGE INTO d_prod_fld dp USING
(SELECT stg_prod_fld_id,
prod_cd_id,
country_name
FROM stg_prod_fld_delta pd
LEFT OUTER JOIN d_loc dl ON (dl.prod_cd_num=lpad(pd.prod_cd_id, 3, '0'))
WHERE pd.efft_to > trunc(sysdate+1)
AND pd.prod_cd_id IS NOT NULL ) stg
ON (dp.cd_id=stg.stg_prod_fld_id)
WHEN matched THEN
UPDATE
SET dl.prod_country=stg.country_name;
d_prod_fld - target dimension table,
stg_prod_fld_delta - stage table,
d_loc - lookup table
When I run the above query in the sandbox it runs fine, but when I run it in the actual development environment it shows the following error:
Error starting at line : 1 in command -
Error report -
SQL Error: ORA-30926: unable to get a stable set of rows in the source tables
30926. 00000 - "unable to get a stable set of rows in the source tables"
*Cause: A stable set of rows could not be got because of large dml
activity or a non-deterministic where clause.
*Action: Remove any non-deterministic where clauses and reissue the dml.
This means concurrent DML statements are running against the source table stg_prod_fld_delta or the lookup table d_loc and are modifying the results of the SELECT query.
You need to acquire a lock on stg_prod_fld_delta and d_loc using SELECT * FROM table WHERE condition = value FOR UPDATE NOWAIT before running the MERGE statement.
Also issue a COMMIT after the MERGE statement.
Please check the following:
SELECT *
FROM stg_prod_fld_delta pd
WHERE pd.efft_to > trunc(sysdate+1)
AND pd.prod_cd_id IS NOT NULL
for update nowait;
select * from d_loc dl
for update nowait;
MERGE INTO d_prod_fld dp USING
(SELECT stg_prod_fld_id,
prod_cd_id,
country_name
FROM stg_prod_fld_delta pd
LEFT OUTER JOIN d_loc dl ON (dl.prod_cd_num=lpad(pd.prod_cd_id, 3,
'0'))
WHERE pd.efft_to > trunc(sysdate+1)
AND pd.prod_cd_id IS NOT NULL ) stg ON (dp.cd_id=stg.stg_prod_fld_id)
WHEN matched THEN
UPDATE
SET dl.prod_country=stg.country_name;
COMMIT;
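Independently of the locking approach above, a frequently reported trigger for ORA-30926 is a USING subquery that returns more than one row for the same target row, for example when the LEFT OUTER JOIN to d_loc matches several lookup rows. Whether that applies here cannot be told from the post. The Python sketch below, with made-up sample rows and a hypothetical helper name, shows the kind of duplicate-key check one could run on the source rows.

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Return the key values that occur more than once in rows."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Made-up sample of what the USING subquery might return.
    source_rows = [
        {"stg_prod_fld_id": 1, "country_name": "IN"},
        {"stg_prod_fld_id": 2, "country_name": "US"},
        {"stg_prod_fld_id": 2, "country_name": "CA"},  # same key twice
    ]
    print(duplicate_keys(source_rows, "stg_prod_fld_id"))  # [2]
```

If such a check finds duplicates, the source would have to be reduced to one row per join key before the MERGE.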

Oracle SQL: Insert failed ORA-01722: invalid number, Data is numeric not string, why'd it fail?

I did a quick search and some have said this error is a result of trying to load numeric values that should be strings? I have no alpha characters. I will later be finding min/max/avg of all columns so I need them to be strings.
error:
--Insert failed for rows 1 through 50
--ORA-01722: invalid number
1 row of 23 columns of data (0's included):
25:33.5 - - - - - - -1.23 6.56 6.93 0 - - - - - 998.26 - - - - - - - 183.2 2.35 - 840 - - - - - - 1.56 -1.56 0 - - 0 - - - - - - - - 0 84.2 - 47.97 - - - - - - - - - - - - 0.81 0.48 - - - - - 0 11.37 4.5 -10.05 - - 13.3
for columns:
INSERT INTO CAR_LOGS (DEVICE_TIME, GX, GY, GZ, G_CALIBRATE, BAROMETER, ENGINE_COOL, ENGINE_LOAD, ENGINE_RPM, FUEL_TRIM_BANK1_LONG, FUEL_TRIM_BANK1_SENSOR1, FUEL_TRIM_BANK1_SENSOR2, FUEL_TRIM_BANK1_SHORT, GPS_VS_OBD_SPEED_DIFF, AIR_INTAKE_TEMP, MASS_AIR_FLOW_RATE, O2_VOLTS_BANK1_SENSOR1, O2_VOLTS_BANK1_SENSOR2, SPEED, THROTTLE_POSITION, TIMING_ADVANCE, TURBO_BOOST_VACUUM_GUAGE, VOLTAGE)
VALUES (27-Sep-2016 19:25:33.467,-1.23,6.56,6.93,-0.0,998.26,183.2,2.35,840.0,1.56,-1.56,0.0,0.0,0.0,84.2,47.97,0.81,0.48,0.0,11.37,4.5,-10.05,13.3);
I have like 8k rows of data in this file and it's failed of course on all of them. I'm fairly new to sql. My father is the db expert, I'm just learning programming at school/db on the side. Using Oracle SQL, I tried importing a csv file directly into the table and chose the columns correctly.
Also, I have like 20 files...any advice for a rookie on how to load them all the same way?
I found:
LOAD
DATA
cd path
cat file*.csv > all_files.csv
APPEND INTO TABLE TBL_DATA_FILE
EVALUATE CHECK_CONSTRAINTS
REENABLE DISABLED_CONSTRAINTS
EXCEPTIONS EXCEPTION_TABLE
FIELDS TERMINATED BY ","
OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
COL0,
COL1,
COL2,
COL3,
COL4
);
Do I replace path with
C:\Users\c_thu\Desktop\Database\CarLogsSEPT.2016
or attach it directly after the word path?
Two questions I guess but I only really want to figure out this error. Second question is meh.
This error means you're trying to insert a value into a numeric column that cannot be converted to a number. Can you please provide the table description?
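As an illustration of a value that cannot be converted to a number: the sample row shown in the question contains '-' placeholders, and a bare '-' is not a valid number. Whether those placeholders actually reach the numeric columns is not clear from the post. A small Python sketch (hypothetical helper name) of cleaning such fields before loading:

```python
from typing import Optional


def to_number_or_none(field: str) -> Optional[float]:
    """Return the field as a float, or None for blanks and '-' placeholders."""
    text = field.strip()
    if text in ("", "-"):
        return None  # load as NULL instead of a non-numeric string
    return float(text)


if __name__ == "__main__":
    row = ["-1.23", "6.56", "-", "0", "998.26"]
    print([to_number_or_none(f) for f in row])
    # [-1.23, 6.56, None, 0.0, 998.26]
```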
In any case, please use
to_timestamp('27-Sep-2016 19:25:33.467','dd-mon-yyyy hh24:mi:ss.ff')
when inserting the timestamp instead of just writing 27-Sep-2016 19:25:33.467 (to_date cannot parse the fractional seconds, so to_timestamp with .ff is needed here).
You shouldn't insert a date or timestamp without specifying its format in the conversion function. The NLS_DATE_FORMAT parameter (NLS_TIMESTAMP_FORMAT for timestamps) defines the default format for the current session, but it is session dependent, so if you do not pass an explicit format your code may not work on other clients.
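The same principle, an explicit format instead of a session default, shown as a Python sketch for anyone preprocessing the CSV before loading. The format string is my reading of the sample value in the question.

```python
from datetime import datetime

raw = "27-Sep-2016 19:25:33.467"

# %b is the abbreviated month name (locale dependent, like Oracle's MON)
# and %f is the fractional-seconds part.
parsed = datetime.strptime(raw, "%d-%b-%Y %H:%M:%S.%f")
print(parsed.isoformat())  # 2016-09-27T19:25:33.467000
```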
The issue is that you are passing the timestamp without single quotes. Try the statement below.
INSERT INTO CAR_LOGS (DEVICE_TIME,
GX,
GY,
GZ,
G_CALIBRATE,
BAROMETER,
ENGINE_COOL,
ENGINE_LOAD,
ENGINE_RPM,
FUEL_TRIM_BANK1_LONG,
FUEL_TRIM_BANK1_SENSOR1,
FUEL_TRIM_BANK1_SENSOR2,
FUEL_TRIM_BANK1_SHORT,
GPS_VS_OBD_SPEED_DIFF,
AIR_INTAKE_TEMP,
MASS_AIR_FLOW_RATE,
O2_VOLTS_BANK1_SENSOR1,
O2_VOLTS_BANK1_SENSOR2,
SPEED,
THROTTLE_POSITION,
TIMING_ADVANCE,
TURBO_BOOST_VACUUM_GUAGE,
VOLTAGE)
VALUES (
TO_TIMESTAMP ('27-Sep-2016 19:25:33.467',
'dd-mon-yyyy HH24:MI:SS.FF'),
-1.23,
6.56,
6.93,
-0.0,
998.26,
183.2,
2.35,
840.0,
1.56,
-1.56,
0.0,
0.0,
0.0,
84.2,
47.97,
0.81,
0.48,
0.0,
11.37,
4.5,
-10.05,
13.3);
Quick Demo:
CREATE TABLE CAR_LOGS
(
DEVICE_TIME TIMESTAMP, --datatype of column has to be timestamp to show milliseconds
GX NUMBER,
GY NUMBER,
GZ NUMBER
);
Record insert:
INSERT INTO CAR_LOGS (DEVICE_TIME,
GX,
GY,
GZ)
VALUES (
TO_TIMESTAMP ('27-Sep-2016 19:25:33.467',
'dd-mon-yyyy HH24:MI:SS.FF'),
-1.23,
6.56,
6.93);

How to generate diff between TIMESTAMP and DATE in SELECT in oracle 10

I need to query 2 tables: one contains a TIMESTAMP(6) column, the other contains a DATE column. I want to write a select statement that prints both values and the difference between them in a third column.
SB_BATCH.B_CREATE_DT - timestamp
SB_MESSAGE.M_START_TIME - date
SELECT SB_BATCH.B_UID, SB_BATCH.B_CREATE_DT, SB_MESSAGE.M_START_TIME,
to_date(to_char(SB_BATCH.B_CREATE_DT), 'DD-MON-RR HH24:MI:SS') as time_in_minutes
FROM SB_BATCH, SB_MESSAGE
WHERE
SB_BATCH.B_UID = SB_MESSAGE.M_B_UID;
Result:
Error report -
SQL Error: ORA-01830: date format picture ends before converting entire input string
01830. 00000 - "date format picture ends before converting entire input string"
You can subtract two timestamps to get an INTERVAL DAY TO SECOND, from which you calculate how many minutes elapsed between the two timestamps. In order to convert SB_MESSAGE.M_START_TIME to a timestamp you can use CAST.
Note that I have also replaced your implicit table join with an explicit INNER JOIN, moving the join condition to the ON clause.
SELECT t.B_UID,
t.B_CREATE_DT,
t.M_START_TIME,
EXTRACT(DAY FROM t.diff)*24*60 +
EXTRACT(HOUR FROM t.diff)*60 +
EXTRACT(MINUTE FROM t.diff) +
ROUND(EXTRACT(SECOND FROM t.diff) / 60.0) AS diff_in_minutes
FROM
(
SELECT SB_BATCH.B_UID,
SB_BATCH.B_CREATE_DT,
SB_MESSAGE.M_START_TIME,
SB_BATCH.B_CREATE_DT - CAST(SB_MESSAGE.M_START_TIME AS TIMESTAMP) AS diff
FROM SB_BATCH
INNER JOIN SB_MESSAGE
ON SB_BATCH.B_UID = SB_MESSAGE.M_B_UID
) t
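The arithmetic of the outer SELECT can be checked with a short Python sketch. It assumes a non-negative difference and rounds the seconds half up; the function name and the sample values are made up for illustration.

```python
import math
from datetime import datetime


def diff_in_minutes(create_dt: datetime, start_time: datetime) -> int:
    """Whole minutes between two datetimes, seconds rounded half up.

    Follows the query's formula for a non-negative difference:
    days*24*60 + hours*60 + minutes + round(seconds / 60).
    """
    diff = create_dt - start_time
    hours, rem = divmod(diff.seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    seconds += diff.microseconds / 1_000_000
    return (diff.days * 24 * 60 + hours * 60 + minutes
            + math.floor(seconds / 60 + 0.5))


if __name__ == "__main__":
    # Made-up values: the difference is 1 day, 2 h, 3 min, 40.5 s.
    b_create_dt = datetime(2016, 8, 24, 5, 26, 24, 500000)
    m_start_time = datetime(2016, 8, 23, 3, 22, 44)
    print(diff_in_minutes(b_create_dt, m_start_time))  # 1564
```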
Convert the timestamp to a date using cast(... as date). Then take the difference between the dates, which is a number - expressed in days, so if you want it in minutes, multiply by 24*60. Then round the result as needed. I made up a small example below to isolate just the steps needed to answer your question. (Note that your query has many other problems, for example you didn't actually take a difference of anything anywhere. If you need help with your query in general, please post it as a separate question.)
select ts, dt, round( (sysdate - cast(ts as date))*24*60, 2) as time_diff_in_minutes
from (select to_timestamp('2016-08-23 03:22:44.734000', 'yyyy-mm-dd hh24:mi:ss.ff') as ts,
sysdate as dt from dual )
;
TS DT TIME_DIFF_IN_MINUTES
-------------------------------- ------------------- --------------------
2016-08-23 03:22:44.734000000 2016-08-23 08:09:15 286.52
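The number in the last column can be reproduced by hand. The Python sketch below uses the two values from the output above and assumes that the cast to DATE drops the fractional seconds, which is consistent with the 286.52 shown.

```python
from datetime import datetime

ts = datetime(2016, 8, 23, 3, 22, 44, 734000)
dt = datetime(2016, 8, 23, 8, 9, 15)

# Assumption: the cast to DATE drops the fractional seconds.
ts_as_date = ts.replace(microsecond=0)

# Oracle's date difference is in days and is multiplied by 24*60;
# converting the elapsed seconds to minutes is equivalent.
minutes = round((dt - ts_as_date).total_seconds() / 60, 2)
print(minutes)  # 286.52
```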

SQL Navigator throws 'ORA-01843: Not a valid month' but query runs in other applications

I have been stuck on this error many times, but now I can no longer work around it and have to get rid of it.
Sometimes I run a query in SQL Navigator 6.1 XPert Edition and it throws:
ORA-01843: Not a valid month
But if I run the same query against the same database in another application (e.g. Aqua Data Studio) it works fine. It happens only in isolated cases.
Could it be a configuration problem?
EDIT: This query has that problem:
select
quantity dias_a_vencer
, estab
, initcap (planejador) planejador
, atributo2 fabrica
, mrp.item montagem
, initcap (descricao) des_montagem
, mrp.nro_docmto num_of
, initcap (mrp.fornecedor) cliente
, mrp.project_number projeto
, initcap (comprador) processista
, trunc (mrp.data_inicio) data_inicio
from etlt_mrp_exceptions mrp
where
mrp.compile_designator = 'ENGI'
and mrp.dt_coleta > sysdate - 50
and estab = '179' -- PARAMETRO ESTAB FILTRO
and atributo2 = '11' -- PARAMETRO FABRICA FILTRO
and nvl (mrp.quantity, 0) > 0
and dt_coleta = '05/12/2011' -- parametro do grafico acima
and initcap (planejador) = 'Maria Cristina Da Cruz Costa' -- parametro do grafico acima
order by quantity
, des_montagem
To make your query fail-safe in all environments, you have to change this line:
and dt_coleta = '05/12/2011'
to
and dt_coleta = to_date('05/12/2011', 'DD/MM/YYYY')
Assuming that you meant December 5th, and not May 12th.
Btw: what datatype are the columns estab and atributo2? If those are numbers you should remove the single quotes around the parameters. That is another "implicit" data conversion that would e.g. prevent the usage of an index on those columns.
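The ambiguity of '05/12/2011' is easy to demonstrate. In this Python sketch (an analogy, not Oracle code) the same string parses to two different dates depending on the format, and a day-first value is rejected outright under a month-first format, much like a client whose default date format differs.

```python
from datetime import datetime

raw = "05/12/2011"

as_day_first = datetime.strptime(raw, "%d/%m/%Y").date()
as_month_first = datetime.strptime(raw, "%m/%d/%Y").date()

print(as_day_first)    # 2011-12-05
print(as_month_first)  # 2011-05-12

# A day-first value with day > 12 cannot be read month-first at all.
try:
    datetime.strptime("25/12/2011", "%m/%d/%Y")
except ValueError as exc:
    print("rejected:", exc)
```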
Always specify a date format, never assume it or use default formats. For example:
insert into mytable (mydate) values (to_date('02/28/2011', 'MM/DD/YYYY'));
Unfortunately, even using TO_DATE() does not guarantee success. (But I would strongly recommend that you are always aware of the date format in play.) For instance using SQL*Developer, this works:
alter session set nls_date_format='yyyy-mon-dd hh24:mi:ss.ddd';
select * from nns.nns_logGER WHERE LOG_DATE >= '2014-jul-30 14:47:16.211';
but this fails with "ORA-01834: day of month conflicts with Julian date" error:
alter session set nls_date_format='yyyy-mon-dd hh24:mi:ss.ddd';
select * from nns.nns_logGER WHERE LOG_DATE >= '2014-jul-30 14:47:16.210';
Notice that I changed only the last digit. And using to_date did not help:
select * from nns.nns_logGER WHERE LOG_DATE >=
to_date('2014-jul-30 14:47:16.210','yyyy-mon-dd hh24:mi:ss.ddd');
fails in the same way.
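I wish there were better news. One thing to check, though: in Oracle format models DDD is the day of the year rather than milliseconds, so this may not be an internal problem after all.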
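The day-of-year arithmetic behind the two literals can be checked quickly. In Oracle date format models the DDD element stands for day of year (1-366); reading the trailing .211 and .210 that way is my inference, not something the post confirms. A Python sketch:

```python
from datetime import date

day_of_year = date(2014, 7, 30).timetuple().tm_yday
print(day_of_year)  # 211
```

Under that reading, .211 agrees with 30 July 2014 while .210 names a different day, which would explain the "day of month conflicts with Julian date" message. For real fractional seconds, a TIMESTAMP with the FF element would be the thing to try.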

Resources