QueryDSL: How to insert or update? - h2

I'm trying to implement https://stackoverflow.com/a/16392399/14731 for a table called "Modules" using QueryDSL. Here is my query:
String newName = "MyModule";
QModules modules = QModules.modules;
BooleanExpression moduleNotExists = session.subQuery().
from(modules).where(modules.name.eq(newName)).notExists();
SimpleSubQuery<String> setModuleName = session.subQuery().
where(moduleNotExists).unique(Expressions.constant(newName));
long moduleId = session.insert(modules).set(modules.name, setModuleName).
executeWithKey(modules.id);
I am expecting this to translate into:
insert into modules(name)
select 'MyModule'
where not exists
(select 1 from modules where modules.name = 'MyModule')
Instead, I am getting:
NULL not allowed for column "NAME"; SQL statement:
insert into MODULES (NAME)
values ((select ?
from dual
where not exists (select 1
from MODULES MODULES
where MODULES.NAME = ?)))
where ? is equal to MyModule.
Why does QueryDSL insert from dual? I am expecting it to omit from altogether.
How do I fix this query?

For the INSERT INTO ... SELECT form, use
columns(...).select(...)
Your error suggests that the generated INSERT statement is syntactically valid, but semantically not what you want.
InsertClause.set(...) does not give you the conditional insertion you are aiming for.
In other words, with
columns(...).select(...)
you map the full result set into an INSERT template, and no rows are inserted for an empty result set, whereas with
set(...)
you map the query result to a single column of an INSERT template, and a null value is used when the result is empty.
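The difference between the two statement shapes can be reproduced without QueryDSL. Below is a minimal sketch using Python's sqlite3 module instead of H2 (an assumption made only so the snippet runs standalone; the table layout is guessed from the question). It runs the INSERT ... SELECT form twice, then the VALUES ((subquery)) form that set(...) generated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only the NOT NULL name column matters here.
conn.execute("CREATE TABLE modules (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# INSERT ... SELECT: an empty result set simply inserts zero rows.
INSERT_SELECT = """
    INSERT INTO modules (name)
    SELECT ?
    WHERE NOT EXISTS (SELECT 1 FROM modules WHERE name = ?)
"""

# INSERT ... VALUES ((subquery)): an empty scalar subquery evaluates to NULL.
INSERT_VALUES = """
    INSERT INTO modules (name)
    VALUES ((SELECT ? WHERE NOT EXISTS (SELECT 1 FROM modules WHERE name = ?)))
"""


def count_modules():
    return conn.execute("SELECT COUNT(*) FROM modules").fetchone()[0]


name = "MyModule"

conn.execute(INSERT_SELECT, (name, name))
count_after_first = count_modules()   # the row did not exist, so it is inserted

conn.execute(INSERT_SELECT, (name, name))
count_after_second = count_modules()  # the row exists, so nothing is inserted

try:
    conn.execute(INSERT_VALUES, (name, name))
    values_error = None
except sqlite3.IntegrityError as exc:
    # The subquery returned no row, so NULL was offered to a NOT NULL column.
    values_error = str(exc)

print(count_after_first, count_after_second, values_error)
```

The second INSERT ... SELECT is a silent no-op, while the VALUES form fails with a NOT NULL violation, which matches the error in the question.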

Related

List of multiple column condition in query (kind of batch)

When searching with a single record, this query works:
@Query(value = "select * from table t where t.column1 = :column1 and t.column2 = :column2 and t.column3 = :column3")
Flux<Invoice> findByMultipleColumn(@Param("column1") String column1, @Param("column2") String column2, @Param("column3") String column3);
But when I have a list of criteria instead of a single row condition, I have to loop over the list and call the above query multiple times, which is not a feasible solution.
Pseudocode:
for (Criteria criteria : criteriaList) {
repository.findByMultipleColumn(criteria.getColumn1(), criteria.getColumn2(), criteria.getColumn3());
}
I am trying to find a way to run the above query once for a whole list of (column1, column2, column3) criteria, something like the following (this is not a working solution):
@Query(value = "select * from table t where t.column1 = :column1 and t.column2 = :column2 and t.column3 = :column3")
Flux<Invoice> findByMultipleColumn(@Param List<Table> table);
Is there any way to achieve this?
This would be doable if column1, column2 and column3 were embedded; then you could do
@Query("select * from Entity where embeddedProperty in (:values)")
Flux<Entity> findByEmbeddedPropertyIn(Collection<EmbeddedClass> values);
Which would generate the following native SQL clause
Where (column1, column2, column3) in ((x, y, z), ...)
If you don't want to pack these fields into an embeddable class, you can also try a workaround:
@Query("select * from Entity where Concat(column1, ';', column2, ';', column3) in (:parametersConcatenatedInJava)")
Flux<Entity> findBy3Columns(Collection<String> parametersConcatenatedInJava);
It's of course not bulletproof: any of the three columns could contain ";" in its value, it might be problematic if their type is not string, and so on.
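To make the ";" caveat concrete, here is a small sketch with Python's sqlite3 module (the invoice table and its rows are made up for illustration, and SQLite's || operator stands in for Concat). Two different triples produce the same concatenated key, so a search for one also returns the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the three searchable columns.
conn.execute("CREATE TABLE invoice (column1 TEXT, column2 TEXT, column3 TEXT)")
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?)",
    [("a", "b", "c"), ("a;b", "c", "d"), ("a", "b;c", "d")],
)

# Key built in application code for the single triple ("a;b", "c", "d").
wanted = ";".join(("a;b", "c", "d"))

rows = conn.execute(
    "SELECT column1, column2, column3 FROM invoice "
    "WHERE (column1 || ';' || column2 || ';' || column3) IN (?) "
    "ORDER BY column1",
    (wanted,),
).fetchall()

# Both ("a", "b;c", "d") and ("a;b", "c", "d") concatenate to "a;b;c;d".
print(rows)
```

Choosing a separator that cannot occur in the data, or escaping it, avoids the collision, but that has to be enforced outside the query.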
Edit:
A third option is to use the Specification API. Using the criteria builder, you can combine multiple AND/OR predicates, then pass that specification as an argument to a repository that extends JpaSpecificationExecutor (if you're fetching whole entities), or use an entity manager if you're using projections. Read more about specifications.
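The AND/OR combination described for the third option boils down to one parenthesised AND group per criteria triple, joined with OR. The sketch below shows that shape in plain SQL through Python's sqlite3 module rather than through the Specification API; the invoice table, its rows and the find_by_criteria helper are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice (id INTEGER, column1 TEXT, column2 TEXT, column3 TEXT)"
)
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?, ?)",
    [(1, "a", "b", "c"), (2, "a", "x", "c"), (3, "d", "e", "f"), (4, "a", "e", "c")],
)


def find_by_criteria(conn, criteria):
    """Return ids of rows matching any (column1, column2, column3) triple."""
    if not criteria:
        return []
    # One AND group per triple, OR-ed together; values stay bound parameters.
    group = "(column1 = ? AND column2 = ? AND column3 = ?)"
    where = " OR ".join([group] * len(criteria))
    params = [value for triple in criteria for value in triple]
    sql = f"SELECT id FROM invoice WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]


matched = find_by_criteria(conn, [("a", "b", "c"), ("d", "e", "f")])
mixed = find_by_criteria(conn, [("a", "e", "f")])  # values taken from different rows
print(matched, mixed)
```

Because each triple is matched as a unit, a combination assembled from different rows does not match anything, which is the behaviour the single-row query has today.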

Update statement with joins in Oracle

I need to update one column in table A with the result of a multiplication of one field from table A with one field from table B.
It would be pretty simple to do this in T-SQL, but I can't write the correct syntax in Oracle.
What I've tried:
UPDATE TABLE_A
SET TABLE_A.COLUMN_TO_UPDATE =
(select TABLE_A.COLUMN_WITH_SOME_VALUE * TABLE_B.COLUMN_WITH_PERCENTAGE
from TABLE_A
INNER JOIN TABLE_B
ON TABLE_A.PRODUCT_ID = TABLE_B.PRODUCT_ID
AND TABLE_A.SALES_CHANNEL_ID = TABLE_B.SALES_CHANNEL_ID)
WHERE TABLE_A.MONTH_ID IN (201601, 201602, 201603);
But I keep getting errors. Could anybody help me, please?
I generally prefer the format below for such cases, since it ensures no update is performed when the joined query returns no data, whereas the solution provided by Brian Leach sets the new value to null if a record exists in the first table but has no match in the second.
UPDATE
(
select TABLE_A.COLUMN_TO_UPDATE
, TABLE_A.PRODUCT_ID
, TABLE_A.COLUMN_WITH_SOME_VALUE * TABLE_B.COLUMN_WITH_PERCENTAGE as value
from TABLE_A
INNER JOIN TABLE_B
ON TABLE_A.PRODUCT_ID = TABLE_B.PRODUCT_ID
AND TABLE_A.SALES_CHANNEL_ID = TABLE_B.SALES_CHANNEL_ID
AND TABLE_A.MONTH_ID IN (201601, 201602, 201603)
) DATA
SET DATA.COLUMN_TO_UPDATE = DATA.value;
This solution can run into key-preserved table errors, which shouldn't be a problem here since I expect a single row in both tables for each product (ID).
More on the key-preserved table concept in inner joins can be found here:
https://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:548422757486
@Jayesh Mulwani raised a valid point: this will set the value to null if there is no matching record. This may or may not be the desired result. If it isn't, and no change is desired, you can change the select statement to:
coalesce((SELECT table_b.column_with_percentage
FROM table_b
WHERE table_a.product_id = table_b.product_id AND table_a.sales_channel_id = table_b.sales_channel_id),1)
If leaving unmatched rows unchanged is the desired outcome, Jayesh's solution will be more efficient, as it only updates matching records.
UPDATE table_a
SET table_a.column_to_update = table_a.column_with_some_value
* (SELECT table_b.column_with_percentage
FROM table_b
WHERE table_a.product_id = table_b.product_id
AND table_a.sales_channel_id = table_b.sales_channel_id)
WHERE table_a.month_id IN (201601, 201602, 201603);
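The null-on-no-match behaviour discussed in both answers is easy to check on toy data. The sketch below uses Python's sqlite3 module rather than Oracle, with two made-up rows and the month_id filter left out for brevity. SQLite cannot update an inline view, so a WHERE EXISTS guard stands in for Jayesh's form here; that substitution is an assumption of this sketch, not something either answer proposes.

```python
import sqlite3

SCHEMA = """
    CREATE TABLE table_a (product_id INTEGER, sales_channel_id INTEGER,
                          column_with_some_value REAL, column_to_update REAL);
    CREATE TABLE table_b (product_id INTEGER, sales_channel_id INTEGER,
                          column_with_percentage REAL);
    INSERT INTO table_a VALUES (1, 10, 200.0, -1.0);  -- has a match in table_b
    INSERT INTO table_a VALUES (2, 10, 300.0, -1.0);  -- no match in table_b
    INSERT INTO table_b VALUES (1, 10, 0.5);
"""

# Correlated lookup of the percentage for the current table_a row.
LOOKUP = """(SELECT b.column_with_percentage
             FROM table_b b
             WHERE b.product_id = table_a.product_id
               AND b.sales_channel_id = table_a.sales_channel_id)"""


def run(update_sql):
    """Apply one UPDATE to a fresh copy of the data and return the result."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(update_sql)
    return conn.execute(
        "SELECT product_id, column_to_update FROM table_a ORDER BY product_id"
    ).fetchall()


SET_CLAUSE = "UPDATE table_a SET column_to_update = column_with_some_value * "

plain = run(SET_CLAUSE + LOOKUP)                        # unmatched row becomes NULL
with_coalesce = run(SET_CLAUSE + f"coalesce({LOOKUP}, 1)")   # unmatched row is multiplied by 1
with_exists = run(SET_CLAUSE + LOOKUP + f" WHERE EXISTS {LOOKUP}")  # unmatched row is skipped

print(plain, with_coalesce, with_exists)
```

Note that the coalesce variant still writes to the unmatched row (it stores column_with_some_value times 1), while the guarded variant leaves column_to_update untouched.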

Oracle: Invalid identifier

I am using the following query in oracle. However, it gives an error saying that "c.par" in line 5 is an invalid parameter. No idea why. The columns exist. I checked. I have been struggling with this for a long time. All I want to do is to merge one table into another and update it using oracle. Could someone please help?
MERGE INTO SPRENTHIERARCHIES
USING ( SELECT c.PARENTCATEGORYID AS par,
e.rootcategoryId AS root
FROM SPRENTCATEGORIES c,SPRENTHIERARCHIES e
WHERE e.root (+)= c.par
) SPRENTCATEGORIES
ON (SPRENTHIERARCHIES.rootcategoryId = SPRENTCATEGORIES.parentcategoryId)
WHEN MATCHED THEN
UPDATE SET e.root=c.par
The e and c aliases only exist within the query in the using clause. You're trying to refer to them in the update clause. You're also using a column alias from the using clause against the target table, which doesn't have that column (unless your tables have both rootcategoryId and root, and parentCategoryId and par).
So this:
UPDATE SET e.root=c.par
should be:
UPDATE SET SPRENTHIERARCHIES.rootcategoryId= SPRENTCATEGORIES.par
And in that using clause you're trying to use column aliases at the same level of query where they are defined, so this:
WHERE e.root (+)= c.par
should be:
WHERE e.rootcategoryId (+)= c.PARENTCATEGORYID
Your on clause is wrong too, as that is not using the column alias:
ON (SPRENTHIERARCHIES.rootcategoryId = SPRENTCATEGORIES.par)
But I'd suggest you replace the old syntax in the using clause with proper join clauses:
MERGE INTO SPRENTHIERARCHIES
USING ( SELECT c.PARENTCATEGORYID AS par,
e.rootcategoryId AS root
FROM SPRENTCATEGORIES c
LEFT JOIN SPRENTHIERARCHIES e
ON e.rootcategoryId = c.PARENTCATEGORYID
) SPRENTCATEGORIES
ON (SPRENTHIERARCHIES.rootcategoryId = SPRENTCATEGORIES.par)
WHEN MATCHED THEN
UPDATE SET SPRENTHIERARCHIES.rootcategoryId= SPRENTCATEGORIES.par
You have a more fundamental problem though, as you're trying to update a joining column; this will get:
ORA-38104: Columns referenced in the ON Clause cannot be updated
As Gordon Linoff suggested you can use an update rather than a merge. Something like:
UPDATE SPRENTHIERARCHIES h
SET h.rootcategoryId = (
SELECT c.PARENTCATEGORYID
FROM SPRENTCATEGORIES c
WHERE c.PARENTCATEGORYID = h.rootCategoryID
)
WHERE EXISTS (
SELECT null
FROM SPRENTCATEGORIES c
WHERE c.PARENTCATEGORYID = h.rootCategoryID
)
The where exists clause is there in case there is no matching record, which the outer join in your original query implies. But in this form it's even more obvious that you're going to update rootcategoryId to the same value, since you're selecting the parentCategoryID which is equal to it. So the update (or merge) seems to be pointless.

Hive LATERAL VIEW and WHERE Clause using Sub query

I'm looking for a way to optimize my query.
We have a table with events called lea, with a column app_properties, which are tags, stored as a comma separated string.
I would like to select all the events that match the result of a query that select the desired tags.
My first try:
SELECT uuid, app_properties, tag
FROM events
LATERAL VIEW explode(split(app_properties, '(, |,)')) tag_table AS tag
WHERE tag IN (SELECT source_value FROM mapping WHERE indicator = 'Bandwidth Usage')
But Hive will not allow this...
FAILED: SemanticException [Error 10249]: Line 4:6 Unsupported SubQuery Expression 'tag': Correlating expression cannot contain unqualified column references.
I gave it another try by replacing WHERE tag IN with WHERE tag_table.tag IN, but no luck...
FAILED: SemanticException Line 4:6 Invalid table alias tag_table' in definition of SubQuery sq_1 [tag_table.tag IN (SELECT source_value FROM mapping WHERE indicator = 'Bandwidth Usage')] used as sq_1 at Line 4:20.
In the end, the query below gives the desired result, but I have a feeling that this is not the most optimized way of solving this use case. Has anyone run into the same use case, where you need to select from a LATERAL VIEW using a subquery?
SELECT to_date(substring(events.time, 0, 10)) as date, t2.code, t2.indicator, count(1) as total
FROM events
LEFT JOIN (
SELECT distinct t.uuid, im.code, im.indicator
FROM mapping im
RIGHT JOIN (
SELECT tag, uuid
FROM events
LATERAL VIEW explode(split(app_properties, '(, |,)')) tag_table AS tag
) t
ON im.source_value = t.tag AND im.indicator = 'Bandwidth Usage'
WHERE im.source_value IS NOT NULL
) t2 ON (events.uuid = t2.uuid)
WHERE t2.code IS NOT NULL
GROUP BY to_date(substring(events.time, 0, 10)), t2.code, t2.indicator;
The Hive subquery in the WHERE clause can be used with IN, NOT IN, EXISTS, or NOT EXISTS as follows. If the alias (see the following example for the employee table) is not specified before columns (name) in the WHERE condition, Hive will report the error Correlating expression cannot contain unqualified column references. This is a limitation of the Hive subquery.
From Apache Hive Essentials.
I guess this problem is also caused by the subquery: events should have an alias.

NOT IN query... odd results

I need a list of users in one database that are not listed as the new_user_id in another. There are 112,815 matching users in both databases; user_id is the key in all queried tables.
Query #1 works, and gives me 111,327 users who are NOT referenced as a new_user_id. But it requires querying the same data twice.
-- 111,327 GSU users are NOT listed as a CSS new user
-- 1,488 GSU users ARE listed as a new user in CSS
--
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id, cud.new_user_id, cud.user_type_code
from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id
where gup.user_id not in (select cud.new_user_id
from css.user_desc cud
where cud.new_user_id is not null);
Query #2 would be perfect... and I'm actually surprised that it's syntactically accepted. But it gives me a result that makes no sense.
-- This gives me 1,505 users... I've checked, and they are not
-- referenced as new_user_ids in CSS, but I don't know why the ones
-- that were excluded were excluded.
--
-- Where are the missing 109,822, and what excluded them?
--
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id, cud.new_user_id, cud.user_type_code
from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id
where gup.user_id not in (cudsubq.new_user_id);
What exactly is the where clause in the second query doing, and why is it excluding 109,822 records from the results?
Note The above query is a simplification of what I'm really after. There are other/better ways to do the above queries... they're just representative of the part of the query that's giving me problems.
Read this: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_QUESTION_ID:442029737684
From what I understand, your cudsubq.new_user_id can be NULL even though both tables are joined by user_id, so you won't get results from the NOT IN operator when the subset contains NULL values. Consider the example in the article:
select * from dual where dummy not in ( NULL )
This returns no records. Try using the NOT EXISTS operator or just another kind of join. Here is a good source: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
And what you need is the fourth example:
SELECT COUNT(descr.user_id)
FROM
user_profile prof
LEFT OUTER JOIN user_desc descr
ON prof.user_id = descr.user_id
WHERE descr.new_user_id IS NULL
OR descr.new_user_id != prof.user_id
Second query is semantically different. In this case
where gup.user_id not in (cudsubq.new_user_id)
cudsubq.new_user_id is treated as expression (doc: IN condition), not as a subquery, thus the whole clause is basically equivalent to
where gup.user_id != cudsubq.new_user_id
So, in your first query, you're literally asking "show me all users in GUP, who also have entries in CSS and their GUP.ID is not matching ANY NOT NULL NEW_ID in CSS ".
However, the second query is "show me all users in GUP, who also have entries in CSS and their GUP.ID is not equal to their RESPECTIVE NULLABLE (no is not null clause, remember?) CSS.NEW_ID value".
Any comparison against a null, whether through (not) in or through equality/inequality, yields unknown rather than true, so a not in list that contains a null can never match:
12:07:54 SYSTEM@oars_sandbox> select * from dual where 1 not in (null, 2, 3, 4);
no rows selected
Elapsed: 00:00:00.00
This is where you lose your rows. I would probably rewrite your second query's where clause as
where cudsubq.new_user_id is null, assuming that non-matching users have null new_user_id.
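The null behaviour is not Oracle-specific, so it can be checked quickly elsewhere. A minimal sketch with Python's sqlite3 module (SQLite allows a SELECT without FROM, so no dual table is needed; the matches helper is a hypothetical name):

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def matches(predicate):
    """Return True if `SELECT 1 WHERE <predicate>` yields a row."""
    return conn.execute(f"SELECT 1 WHERE {predicate}").fetchone() is not None


# A NOT IN list containing NULL can never be satisfied.
not_in_with_null = matches("1 NOT IN (NULL, 2, 3, 4)")
# Without the NULL, the same check succeeds.
not_in_without_null = matches("1 NOT IN (2, 3, 4)")
# A one-element list behaves like a plain inequality against NULL.
not_in_single_null = matches("1 NOT IN (NULL)")
inequality_with_null = matches("1 != NULL")
# IS NULL is the test that does match.
is_null_check = matches("NULL IS NULL")

print(not_in_with_null, not_in_without_null,
      not_in_single_null, inequality_with_null, is_null_check)
```

Only the second and the last predicate return a row, which is why every joined row whose new_user_id is null drops out of the second query.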
Your second select compares gup.user_id with the new_user_id of the row it is currently joined to. You can rewrite the query like this to keep the rows where new_user_id is null as well:
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id, cud.new_user_id, cud.user_type_code
from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id
where gup.user_id != cudsubq.new_user_id or cudsubq.new_user_id is null;
You mentioned that you are comparing a list of users in one database with a list of users in another, so you do need to query twice, and you are not querying the same data. Maybe you can use the "minus" operator to avoid using "in":
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id from css.user_desc cud
minus
select cud.new_user_id from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id;
You want the user_ids from table gup that don't match any new_user_id in table cud, right? It sounds like a job for a left join:
SELECT count(gup.user_id)
FROM gsu.user_profile gup LEFT JOIN css.user_desc cud
ON gup.user_id = cud.new_user_id
WHERE cud.new_user_id is NULL
The join keeps all rows of gup, matching them with a new_user_id if possible. The WHERE condition keeps only the rows that have no matching row in cud.
(Apologies if you know this already and you're only interested in the behavior of the not in query)
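For comparison, here is a small sketch of the anti-join on toy data, again with Python's sqlite3 module. The four users are invented and the gsu/css schema prefixes are dropped, so treat it as an illustration of the pattern rather than of the real tables. It runs the left join form, an equivalent NOT EXISTS form, and a NOT IN without the null filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_profile (user_id INTEGER PRIMARY KEY);
    CREATE TABLE user_desc (user_id INTEGER PRIMARY KEY, new_user_id INTEGER);
    INSERT INTO user_profile VALUES (1), (2), (3), (4);
    -- Only user 3 is referenced as somebody's new_user_id.
    INSERT INTO user_desc VALUES (1, NULL), (2, 3), (3, NULL), (4, NULL);
""")


def ids(sql):
    return [row[0] for row in conn.execute(sql)]


left_join = ids("""
    SELECT gup.user_id
    FROM user_profile gup
    LEFT JOIN user_desc cud ON gup.user_id = cud.new_user_id
    WHERE cud.new_user_id IS NULL
    ORDER BY gup.user_id
""")

not_exists = ids("""
    SELECT gup.user_id
    FROM user_profile gup
    WHERE NOT EXISTS (SELECT 1 FROM user_desc cud
                      WHERE cud.new_user_id = gup.user_id)
    ORDER BY gup.user_id
""")

# Without "new_user_id IS NOT NULL" the subquery contains NULLs.
not_in_unfiltered = ids("""
    SELECT gup.user_id
    FROM user_profile gup
    WHERE gup.user_id NOT IN (SELECT new_user_id FROM user_desc)
    ORDER BY gup.user_id
""")

print(left_join, not_exists, not_in_unfiltered)
```

The left join and NOT EXISTS forms agree, while the unfiltered NOT IN returns nothing, which is why the first query in the question needs its is not null condition.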

Resources