I have a table with 5 columns like this:
id, name, firstName, job, number
There are many records in this table. Now, imagine these records:
- 1 Konan toto doctor 45
- 2 Konan tata doctor 50
- 3 Konan toto doctor 60
- 4 simba popo police 44
- 5 simba tata police 88
- 6 pikar popo doctor 99
- 7 simba popo doctor 72
now I want to find only record which job is doctor and get record with no duplicate in (name,firstName)
(if we have many records with same name + lastName we return only one record , let say anyone)
the result will be
- 1 Konan toto doctor 45
- 2 Konan tata doctor 50
- 6 pikar popo doctor 99
- 7 simba popo doctor 72
the record with id 3 is removed because there is duplicate on (name,firstName) the others because we only need job = doctor
What is the Hibernate Query to get the desired results ?
I'm not sure why this has to be done with JPQL, it seems rather difficult and produce an inefficient query that probably requires a self join. Here's a relatively simple Oracle SQL based solution
SELECT
MAX(id) KEEP (DENSE_RANK FIRST ORDER BY id) AS id,
name,
firstName,
MAX(job) KEEP (DENSE_RANK FIRST ORDER BY id) AS job,
MAX(number) KEEP (DENSE_RANK FIRST ORDER BY id) AS number
FROM t
WHERE job = 'doctor'
GROUP BY name, firstName
A more standards compliant solution using window functions:
SELECT
FROM (
SELECT
id, name, firstName, job, number,
ROW_NUMBER() OVER (PARTITION BY name, firstName ORDER BY id) rn
FROM t
WHERE job = 'doctor'
) t
WHERE rn = 1
This is a special case of the top N per category problem. I recommend using a native query, which you can still map to entities if you really need to.
Related
It seems to simple, but not getting desired results
I have a table with there data
Team_id, Player_id, Player_name Game_cd
1 100 abc 24
1 1000 xyz 24
1 588 ert 24
1 500 you 24
2 600 ops 24
2 700 dps 24
2 900 lmv 24
2 200 hmv 24
I have to write a query to get a result like this
Home_team home_plr_id home_player away_team away_plr_id away_player
1 100 abc 2 600 ops
1 1000 xyz 2 900 lmv
The query I wrote
select f1.Team_id as home_team,
f1.player_id as home_plr_id,
f1.player_Name as home_player,
f2.Team_id as away_team,
f2.player_id as away_plr_id,
f2.player_Name as home_player
from game f1, game f2
where
f1.team_id<> f2.team_id and
f1.game_cd = f2.game_cd
Alternative to #Radagast81's self-join is pivot, available in your Oracle version:
select home_plr_id, home_plr_name, away_plr_id, away_plr_name
from (select game.*,
row_number() over (partition by team_id order by player_id) rn
from game)
pivot (max(player_id) plr_id, max(player_name) plr_name
for team_id in (1 home, 2 away))
SQL Fiddle
Players have to be numbered somehow (here by ID), it can be done by name, null or even random. This numbering is needed only to put them in same rows. Pivot works also if numbers of players in teams differs.
It is not clear how you want to pair a home player with an away player. But provided that you don't care about that, the following might be what you are looking for:
WITH game_p AS (SELECT team_id, player_id, player_name, game_cd
, ROW_NUMBER() over (PARTITION BY team_id, game_cd ORDER BY player_id) pos
, dense_rank() over (PARTITION BY game_cd ORDER BY team_id) team_pos
FROM game)
SELECT NVL(f1.game_cd, f2.game_cd) AS game_cd
, f1.Team_id as home_team
, f1.player_id as home_plr_id
, f1.player_Name as home_player
, f2.Team_id as away_team
, f2.player_id as away_plr_id
, f2.player_Name as away_player
FROM (SELECT * FROM game_p WHERE team_pos = 1) f1
FULL JOIN (SELECT * FROM game_p WHERE team_pos = 2) f2
ON f1.game_cd = f2.game_cd
AND f1.pos = f2.pos
The new column POS gives any player of each team a position to pair them with the other team.
The new column TEAM_POS is to get the team_id mapped to the values 1 and 2, as the team_id's can differ per game.
Finally do a FULL JOIN to get the final list. If the number of players are allways the same for both teams you can do a normal join instead...
I have been working with an application that is integrated with spring and Hibernate 4.X.X and its transaction is managed by JTA in Weblogic application server. After 3 years, there are about 40 million records only into one table from 100 tables that exist in my DB. The DB is Oracle 11g. The response time of a query is about 5 minutes because of increasing the count of records of this tables.
I customized the query and put it into Sql Developer and run the query advisor plan for suggestion some Index. Totally after doing such this, its response time is reduced to 2 minute. But even so, this response time does not satisfy the Custumer. To further clarify I put the query, It is as following:
select *
from (select (count(storehouse0_.ID) over()) as col_0_0_,
storehouse3_.storeHouse_ID as col_1_0_,
(DBPK_PUB_STOREHOUSE.get_Storehouse_Title(storehouse5_.id, 1)) as col_2_0_,
storehouse5_.Organization_Code as col_3_0_,
publicgood1_.Goods_Item_Id as col_4_0_,
storehouse0_.storeHouse_Inventory_Id as col_5_0_,
storehouse0_.Id as col_6_0_,
storehouse3_.samapel_Item_Id as col_7_0_,
samapelite10_.MAINNAME as col_8_0_,
publicgood1_.serial_Number as col_9_0_,
publicgood1_1_.production_Year as col_10_0_,
samapelpar2_.ID_SourceInfo as col_11_0_,
samapelpar2_.Pn as col_12_0_,
storehouse3_.expire_Date as col_13_0_,
publicgood1_1_.Status_Id as col_14_0_,
baseinform12_.Topic as col_15_0_,
publicgood1_.public_Num as col_16_0_,
cast(publicgood1_1_.goods_Status as number(10, 0)) as col_17_0_,
publicgood1_1_.goods_Status as col_18_0_,
publicgood1_1_.deleted as col_19_0_
from amd.Core_StoreHouse_Inventory_Item storehouse0_,
amd.Core_STOREHOUSE_INVENTORY storehouse3_,
amd.Core_STOREHOUSE storehouse5_,
amd.SMP_SAMAPEL_CODE samapelite10_
cross join amd.Core_Goods_Item_Public publicgood1_
inner join amd.Core_Goods_Item publicgood1_1_
on publicgood1_.Goods_Item_Id = publicgood1_1_.Id
left outer join amd.SMP_SOURCEINFO samapelpar2_
on publicgood1_1_.Samapel_Part_Number_Id =
samapelpar2_.ID_SourceInfo, amd.App_BaseInformation
baseinform12_
where not exists
(select ssec.samapelITem_id
from core_security_samapelitem ssec
inner join core_goods_item g
on ssec.samapelitem_id = g.samapel_item_id
where not exists (SELECT aa.groupid
FROM app_actiongroup aa
where aa.groupid in
(select au.groupid
from app_usergroup au
where au.userid = 1)
and aa.actionid = 9054)
and ssec.isenable = 1
and storehouse0_.goods_Item_ID = g.id)
and not exists
(select *
from CORE_POWER_SECURITY cps
where not exists (SELECT aa.groupid
FROM app_actiongroup aa
where aa.groupid in
(select au.groupid
from app_usergroup au
where au.userid = 1)
and aa.actionid = 9055)
and cps.inventory_id =
storehouse0_.storeHouse_Inventory_Id
and cps.goodsitemtype = 6)
and storehouse0_.storeHouse_Inventory_Id = storehouse3_.Id
and storehouse3_.storeHouse_ID = storehouse5_.Id
and storehouse3_.samapel_Item_Id = samapelite10_.MAINCODE
and publicgood1_1_.Status_Id = baseinform12_.ID
and 1 <> 2
and storehouse0_.goods_Item_ID = publicgood1_.Goods_Item_Id
and publicgood1_1_.edited = 0
and publicgood1_1_.deleted = 0
and (exists (select storehouse13_.Id
from amd.Core_STOREHOUSE storehouse13_
cross join amd.core_power power16_
cross join amd.core_power power17_
where storehouse5_.powerID = power16_.Id
and storehouse13_.powerID = power17_.Id
and (storehouse13_.Id in (741684217))
and storehouse13_.storeHouseType = 2
and (power16_.hierarchiCode like
power17_.hierarchiCode || '%')) or
(storehouse3_.storeHouse_ID in (741684217)) and
storehouse5_.storeHouseType = 1)
and (storehouse5_.storeHouse_Status not in (2, 3))
order by storehouse3_.samapel_Item_Id)
where rownum <= 10
[Note: This query is generated by Hibernate].
It is clear that order by 40 million holds so much time.
I find the main issue of this query. I omitted the “order by” and run the query, its response time was reduced to about 5 second. I was wonderful why the “order by” affects so much the response time.
(Some body may think that if this table is partitioned or use another facility of oracle, it may get better response time. Ok it may be right but my emphasis is the “order by” performance. If there is a way that do the “order by” responsibility, why not to do it). Any way I am not able to omit the “order by” because the Customer needs to order and it is necessary for paging. I find a solution that is explained by an example. This solution I order only some records that is needed. How, I will explain later. It is clear when oracle wants to sort 40 million records, it naturally takes so much time. I replace “order by” with “where clause”. With doing this replacement the response time was reduces from 2 minute to about 5 second and this is very exciting for me.
I explain my solution via an example, anybody that read this Post tells me whether this solution is good or there are another solution that I do not know exists.
Another hand I have a solution that is explained later, if it is ok or not. Whether I use it or not.
I explain my solution:
Let’s assumed that there are two table as below:
Post table
Id Others fields
1
2
3
4
5
… …
Post_comment table
Id post_id
1 5
2 5
3 5
4 5
6 5
7 2
8 2
9 2
10 3
11 1
12 1
13 1
14 1
15 1
16 1
17 1
18 1
19 1
20 1
21 1
22 1
23 1
24 1
25 1
26 4
27 4
There is a form that shows the result of join between POST table and POST_COMMENT table.
I explain both query with “order by” all records of that table and “order by” only specific records that are needed. The result of two query are exactly the same but the response time of second approach is the better than that one.
You assume that the page size is 10 and you are in page 3.
The first query with the “order by” all records of that table:
select *
from (Select res.*, rownum as rownum_
from (Select * from POST_COMMENT Order by post_id asc) res
Where rownum <= 30)
where rownum_ > 20
The second solution:
Before execution the query, I query as below:
select *
from (select post_id, count(id) from POST_COMMENT group by post_id)
order by post_id asc
So the result of it is the below:
Post_id Count(id) Sum(count(id))
1 15 15
2 3 18
3 1 19
4 2 21
5 5 26
It needs to say that the third column that is "Sum(count(id))" is calculated after that query.Any entry of this column is sum all before records.
So there is a formula that specifics which post_id must be selected. The formula is the below:
pageSize = 10, pageNumber = 3
from : (pageNumber – 1) * pageCount 2 * 10 = 20
to : (pageNumber – 1) * pageCount + pageCount 20 + 10 = 30
So I need the posts that are between (20, 30] of Sum(count(id)). According to this, I need only two post_id that have value 4,5. According to this the main query of second approach is:
select *
from (select rownum as rownum_, res.*
from (select *
from (select * from POST_COMMENT where post_id in (4, 5))
order by post_id asc) res
where rownum <= 30)
where rownum_ > 20
If you look at both query, you will see the biggest difference. The second query only selects the records of POST_COMENT that have post_id that are 4 and 5. After that, orders this records not all records of that table.
After posting this post, I have searched. finally I am redirected to HERE . I can reach to the response time that is very excited for me. It is reduced from 3 minutes to less than 3 seconds. It is necessary to know, I only use one tip from all of the query optimization guidelines that are in that site that is Duplicate constant condition for different tables whenever possible.
Note: Before doing this tip, there are some indexs on fields that are in where-clause and order-by.
I have two tables affiliation and customer, in that i have data like this
aff_id From_cus_id
------ -----------
1 10
2 20
3 30
4 40
5 50
cust_id cust_aff_id
------- -------
10
20
30
40
50
i need to update data for cust_aff_id column from affiliation table which is aff_id like below
cust_id cust_aff_id
------- -------
10 1
20 2
30 3
40 4
50 5
could u please give reply if anyone knows......
Oracle doesn't have an UPDATE with join syntax, but you can use a subquery instead:
UPDATE customer
SET customer.cust_aff_id =
(SELECT aff_id FROM affiliation WHERE From_cus_id = customer.cust_id)
merge into customer t2
using affiliation t1 on (t1.From_cus_id =t2.cust_id )
WHEN MATCHED THEN
update set t2.cust_aff_id = t1.aff_id
;
Here is an update with join syntax. This, quite reasonably, works only if from_cus_id is primary key in the first table and cust_id is foreign key in the second table, referencing the first table. Without these conditions, the requirement doesn't make much sense in the first place anyway... but Oracle requires that these constraints be stated explicitly in the tables. This is also reasonable on Oracle's part IMO.
update
( select t1.aff_id, t2.cust_aff_id
from affiliation t1 join customer t2 on t2.cust_id = t1.from_cus_id) j
set j.cust_aff_id = j.aff_id;
I am facing an issue using connect by.
I have a query through which I retrieve a few columns including these three:
ID
ParentID
ObjectID
Now for the same ID and parentID, there are multiple objects associated e.g.
ID ParentID ObjectID
1 0 112
1 0 113
2 0 111
2 0 112
3 1 111
4 1 112
I am trying to use connect by but I'm unable to get the result in a proper hierarchy. I need it the way it is showed below. Take an ID-parentID combo, display all rows with that ID-parentID and then all the children of this ID i.e. whose parentID=ID
ID ParentID ObjectID
1 0 112
1 0 113
3 1 111
4 1 112
2 0 111
2 0 112
select ID,parent_id, object_id from table start with parent_id=0
connect by prior id=parent_id order by id,parent_id
Above query is not resulting into proper hierarchy that i need.
Well, your problem appears to be that you are using a non-normalized table design. If a given ID always has the same ParentID, that relationship shouldn't be indicated separately in all these rows.
A better design would be to have a single table showing the parent child relationships, with ID as a primary key, and a second table showing the mappings of ID to ObjectID, where I presume both columns together would comprise the primary key. Then you would apply your hierarchical query against the first table, and join the results of that to the other table to get the relevant objects for each row.
You can emulate this with your current table structure ...
with parent_child as (select distinct id, parent_id from table),
tree as (select id, parent_id from parent_child
start with parent_id = 0
connect by prior id = parent_id )
select id, table.parent_id, table.object_id
from tree join table using (id)
Here's a script that runs. Not ideal but will work -
select * from (select distinct test.id,
parent_id,
object_id,
connect_by_root test.id root
from test
start with test.parent_id = 0
connect by prior test.id = parent_id)
order by root,id
First of all Thanks to all who tried helping me.
Finally i changed my approach as applying hierarchy CONNECT BY clause to inner queryw ith multiple joins was not working for me.
I took following approach
Get the hierarchical data from First table i.e. table with ID-ParentID. Select Query table1 using CONNECT BY. It will give the ID in proper sequence.
Join the retrieved List of ID.
Pass the above ID as comma seperated string in select query IN Clause to second table with ID-ObjectID.
select * from table2 where ID in (above Joined string of ID) order by
instr('above Joined string of ID',ID);
ORDER BY INSTR did the magic. It will give me the result ordered by the IN Clause data and IN Clause string is prepared using the hierarchical query. Hence it will obviously be in sequence.
Again Thanks all for the help!
Note: Above approach has one constraint : ID passed as comma separated string in IN Clause. IN Clause has a limit of characters inside it. I guess 1000 chars. Not sure.
But as i am sure on the data of First table that it will not be so much so as to cross limit of 1000 chars. Hence i chose above approach.
Let's say I have table data similar to the following:
123456 John Doe 1 Green 2001
234567 Jane Doe 1 Yellow 2001
234567 Jane Doe 2 Red 2001
345678 Jim Doe 1 Red 2001
What I am attempting to do is only isolate the records for Jane Doe based upon the fact that she has more than one row in this table. (More that one sequence number)
I cannot isolate based upon ID, names, colors, years, etc...
The number 1 in the sequence tells me that is the first record and I need to be able to display that record, as well as the number 2 record -- The change record.
If the table is called users, and the fields called ID, fname, lname, seq_no, color, date. How would I write the code to select only records that have more than one row in this table? For Example:
I want the query to display this only based upon the existence of the multiple rows:
234567 Jane Doe 1 Yellow 2001
234567 Jane Doe 2 Red 2001
In PL/SQL
First, to find the IDs for records with multiple rows you would use:
SELECT ID FROM table GROUP BY ID HAVING COUNT(*) > 1
So you could get all the records for all those people with
SELECT * FROM table WHERE ID IN (SELECT ID FROM table GROUP BY ID HAVING COUNT(*) > 1)
If you know that the second sequence ID will always be "2" and that the "2" record will never be deleted, you might find something like:
SELECT * FROM table WHERE ID IN (SELECT ID FROM table WHERE SequenceID = 2)
to be faster, but you better be sure the requirements are guaranteed to be met in your database (and you would want a compound index on (SequenceID, ID)).
Try something like the following. It's a single tablescan, as opposed to 2 like the others.
SELECT * FROM (
SELECT t1.*, COUNT(name) OVER (PARTITION BY name) mycount FROM TABLE t1
)
WHERE mycount >1;
INNER JOIN
JOIN:
SELECT u1.ID, u1.fname, u1.lname, u1.seq_no, u1.color, u1.date
FROM users u1 JOIN users u2 ON (u1.ID = u2.ID and u2.seq_no = 2)
WHERE:
SELECT u1.ID, u1.fname, u1.lname, u1.seq_no, u1.color, u1.date
FROM users u1, thetable u2
WHERE
u1.ID = u2.ID AND
u2.seq_no = 2
Check out the HAVING clause for a summary query. You can specify stuff like
HAVING COUNT(*) >= 2
and so forth.