select attribute mysql - filter

I have these MySQL tables:
**product (id,name)**
1 Samsung
2 Toshiba
3 Sony
**attribute (id,name,parentid)**
1 Size 0
2 19" 1
3 17" 1
4 15" 1
5 Color 0
6 White 5
7 Black 5
8 Price 0
9 <$100 8
10 $100-$300 8
11 >$300 8
**attribute2product (id,productid,attributeid)**
1 1 2
2 1 6
3 2 2
4 2 7
5 3 3
6 3 7
7 1 9
8 2 9
9 3 10
I list them like this:
**Size**
-- 19" (2)
-- 17" (1)
-- 15" (0)
**Color**
-- White (1)
-- Black (2)
**Price**
-- <$100 (1)
-- $100-$300 (1)
-- >$300 (1)
Please help me write a MySQL query that lists each attribute name and counts the products that have that attribute. For example, when Size 19" (attribute.id 2) is selected:
**Size**
-- 19"
**Color**
-- White (1)
-- Black (1)
**Price**
-- <$100 (1)
-- $100-$300 (1)
The logic would be: query attribute2product for the selected attribute >> select the matching productid values >> query the other attributes of those products and display each attribute name with the number of products that now have it (like Magento).
Thanks,

I've modified the query. This should be what you want, based on your updates:
SELECT attribute.name AS attributename, COUNT(*) AS numofproducts FROM product
INNER JOIN attribute2product ON attribute2product.productid = product.id
INNER JOIN attribute ON attribute.id = attribute2product.attributeid
WHERE product.id IN
(
SELECT p.id FROM product AS p
INNER JOIN attribute2product AS a2p ON a2p.productid = p.id
WHERE a2p.attributeid = 2
)
GROUP BY attribute.id, attribute.name;
Based on your above data I got:
attributename numofproducts
19" 2
White 1
Black 1
<$100 2
For multiple attributes (based on a blog article by Quassnoi, a more knowledgeable expert):
I've removed the product table since it's not needed here.
SELECT attribute.name AS attributename, COUNT(*) AS numofproducts
FROM attribute2product
INNER JOIN attribute ON attribute.id = attribute2product.attributeid
WHERE attribute2product.productid IN (
SELECT o.productid
FROM (
SELECT productid
FROM (
SELECT 2 AS att
UNION ALL
SELECT 6 AS att
) v
JOIN attribute2product ON attributeid >= att AND attributeid <= att
) o
GROUP BY o.productid
HAVING COUNT(*) = 2
)
GROUP BY attribute.id, attribute.name
2 and 6 refer to 19" and White, respectively. HAVING COUNT(*) = 2 requires a product to match both attributes. More attributes can be added by appending the following to the innermost derived table and raising the count in the HAVING clause to match:
UNION ALL
SELECT <attributeid> AS att
As expected, the query returns:
attributename numofproducts
19" 1
White 1
<$100 1
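To check the filtered counts above without a MySQL server, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database from Python. The helper name facet_counts is hypothetical, and the subquery uses a plain GROUP BY ... HAVING in place of the UNION ALL derived table, so treat it as an equivalent formulation rather than the exact query from the answer.

```python
import sqlite3

# Sample data copied from the question; table and column names follow it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT, parentid INTEGER);
CREATE TABLE attribute2product (
    id INTEGER PRIMARY KEY, productid INTEGER, attributeid INTEGER);
INSERT INTO attribute VALUES
 (1,'Size',0),(2,'19"',1),(3,'17"',1),(4,'15"',1),
 (5,'Color',0),(6,'White',5),(7,'Black',5),
 (8,'Price',0),(9,'<$100',8),(10,'$100-$300',8),(11,'>$300',8);
INSERT INTO attribute2product VALUES
 (1,1,2),(2,1,6),(3,2,2),(4,2,7),(5,3,3),(6,3,7),(7,1,9),(8,2,9),(9,3,10);
""")


def facet_counts(selected):
    """Count products per attribute among products that have every selected attribute.

    `selected` is a non-empty list of attribute ids (hypothetical helper).
    """
    marks = ",".join("?" * len(selected))
    sql = f"""
        SELECT a.name, COUNT(*)
        FROM attribute2product AS a2p
        JOIN attribute AS a ON a.id = a2p.attributeid
        WHERE a2p.productid IN (
            SELECT productid
            FROM attribute2product
            WHERE attributeid IN ({marks})
            GROUP BY productid
            HAVING COUNT(DISTINCT attributeid) = ?
        )
        GROUP BY a.id, a.name
    """
    return dict(conn.execute(sql, (*selected, len(selected))))


one = facet_counts([2])      # only 19" selected
two = facet_counts([2, 6])   # 19" and White selected
print(one)
print(two)
```

With the sample data this reproduces both result sets shown above: two products for 19" alone, and a single product once White is added.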


Oracle - how to insert if not exists?

Example dB : https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=49af209811bce88aa67b42387f1bb5f6
I'd like to insert this row:
1002 9 1 UNKNOWN
because this row exists:
1002 5 1 JIM
I was thinking about something like
select codeclient from STATS_CLIENT_TEST where CODEAXESTAT=5
and then inserting codeclient, 9, 1, UNKNOWN,
but I'm not sure how to do it. Should it be a simple query or PL/SQL?
What's the best way to do it?
Thanks
Use an INSERT .. SELECT statement with a PARTITIONed outer join:
INSERT INTO stats_client_test (
codeclient, codeaxestat, codeelementstat, valeuraxestatistiqueclient
)
SELECT cc.codeclient,
s.codeaxestat,
s.codeelementstat,
'UNKNOWN'
FROM (SELECT DISTINCT codeclient FROM stats_client_test) cc
LEFT OUTER JOIN stats_client_test s
PARTITION BY (s.codeaxestat, s.codeelementstat)
ON (s.codeclient = cc.codeclient)
WHERE s.rowid IS NULL;
or a MERGE statement:
MERGE INTO stats_client_test dst
USING (
SELECT cc.codeclient,
s.codeaxestat,
s.codeelementstat,
s.ROWID AS rid
FROM (SELECT DISTINCT codeclient FROM stats_client_test) cc
LEFT OUTER JOIN stats_client_test s
PARTITION BY (s.codeaxestat, s.codeelementstat)
ON (s.codeclient = cc.codeclient)
) src
ON (dst.ROWID = src.rid)
WHEN NOT MATCHED THEN
INSERT (codeclient, codeaxestat, codeelementstat, valeuraxestatistiqueclient)
VALUES (src.codeclient, src.codeaxestat, src.codeelementstat, 'UNKNOWN');
Here's one option: using the MINUS set operator, find missing codeclient values and then insert appropriate row(s).
Before:
SQL> select * From stats_client_Test order by codeaxestat, codeclient;
CODECLIENT CODEAXESTAT CODEELEMENTSTAT VALEURAXESTATISTIQUECLIENT
-------------------- ----------- --------------- ----------------------------------------
1000 5 1 JOHN
1001 5 1 ALICE
1002 5 1 JIM
1003 5 1 BOB
1000 9 1 MAN
1001 9 1 WOMAN
1003 9 1 MAN
7 rows selected.
Query:
SQL> insert into stats_client_test
2 (codeclient, codeaxestat, codeelementstat, VALEURAXESTATISTIQUECLIENT)
3 select x.codeclient, 9, 1, 'unknown'
4 from (select codeclient from stats_client_Test
5 where codeaxestat = 5
6 minus
7 select codeclient from stats_client_Test
8 where codeaxestat = 9
9 ) x;
1 row created.
After:
SQL> select * From stats_client_Test order by codeaxestat, codeclient;
CODECLIENT CODEAXESTAT CODEELEMENTSTAT VALEURAXESTATISTIQUECLIENT
-------------------- ----------- --------------- ----------------------------------------
1000 5 1 JOHN
1001 5 1 ALICE
1002 5 1 JIM
1003 5 1 BOB
1000 9 1 MAN
1001 9 1 WOMAN
1002 9 1 unknown --> here it is
1003 9 1 MAN
8 rows selected.
SQL>
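The set-difference insert can be tried outside Oracle as well. Below is a minimal sketch against an in-memory SQLite database from Python; SQLite spells Oracle's MINUS as EXCEPT, and the starting rows are an assumption (the table from the output above without the 1002 / 9 row). Running the statement twice shows that it only inserts what is missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stats_client_test (
    codeclient TEXT,
    codeaxestat INTEGER,
    codeelementstat INTEGER,
    valeuraxestatistiqueclient TEXT);
-- Assumed starting state: client 1002 has no row for codeaxestat = 9.
INSERT INTO stats_client_test VALUES
 ('1000',5,1,'JOHN'),('1001',5,1,'ALICE'),('1002',5,1,'JIM'),('1003',5,1,'BOB'),
 ('1000',9,1,'MAN'),('1001',9,1,'WOMAN'),('1003',9,1,'MAN');
""")

# EXCEPT is SQLite's spelling of Oracle's MINUS set operator.
INSERT_MISSING = """
INSERT INTO stats_client_test
    (codeclient, codeaxestat, codeelementstat, valeuraxestatistiqueclient)
SELECT x.codeclient, 9, 1, 'unknown'
FROM (SELECT codeclient FROM stats_client_test WHERE codeaxestat = 5
      EXCEPT
      SELECT codeclient FROM stats_client_test WHERE codeaxestat = 9) AS x
"""

first = conn.execute(INSERT_MISSING).rowcount    # rows inserted on the first run
second = conn.execute(INSERT_MISSING).rowcount   # rows inserted on the second run
rows = conn.execute(
    "SELECT codeclient, valeuraxestatistiqueclient FROM stats_client_test "
    "WHERE codeaxestat = 9 ORDER BY codeclient").fetchall()
print(first, second, rows)
```

The second run finds nothing to add, which is the "insert if not exists" behaviour the question asks for.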

I need a Select for Max(Version)

I have two Tables with a foreign key from t1.ID to T2.T_ID
T1:
ID PR_ID Version
1 1 1
2 2 1
3 2 2
4 3 1
5 3 2
6 4 1
T2:
ID T_ID ab_nr
1 1 56
2 2 3
3 3 76
4 4 4
5 5 87
6 6 64
I need a select that returns all T2.IDs belonging to the highest T1.Version for each PR_ID. For example, PR_ID 2 and PR_ID 3 each have two versions; as the end result I only need the rows for T1.IDs 1, 3, 5 and 6.
I tried it with:
SELECT * FROM T2
JOIN T1 ON T1.ID = T2.T_ID
WHERE T1.Version IN (SELECT MAX(VERSION) FROM T1);
but this doesn't work because it only returns the rows with version 2 and nothing else.
There are always many ways to skin a SQL cat, but here's a simple one.
SELECT t2.*
FROM t1
INNER JOIN t2 ON t2.t_id = t1.id
WHERE NOT EXISTS ( SELECT 'higher version for the same PR_ID'
FROM t1 t1x
WHERE t1x.pr_id = t1.pr_id
AND t1x.version > t1.version )
That is, add a NOT EXISTS condition to filter out any results that are for old versions.
The way you tried to do it was on the right track, but you just needed to correlate your MAX(VERSION) subquery so that it got the max version for the current PR_ID. Like this:
SELECT * FROM T2
JOIN T1 ON T1.ID = T2.T_ID
WHERE T1.Version IN (SELECT MAX(T1X.VERSION) FROM T1 T1X
-- You missed this part, below
WHERE T1X.PR_ID = T1.PR_ID
);
Anyway, try either of these. If performance is not good, we can start looking at more efficient ways of doing it (e.g., MAX ... KEEP)
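Both variants can be checked against the sample rows from the question. This is a minimal sketch using an in-memory SQLite database from Python; the variable names are made up for the example, and the SQL is the same NOT EXISTS and correlated MAX logic as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER PRIMARY KEY, pr_id INTEGER, version INTEGER);
CREATE TABLE t2 (id INTEGER PRIMARY KEY, t_id INTEGER REFERENCES t1(id), ab_nr INTEGER);
INSERT INTO t1 VALUES (1,1,1),(2,2,1),(3,2,2),(4,3,1),(5,3,2),(6,4,1);
INSERT INTO t2 VALUES (1,1,56),(2,2,3),(3,3,76),(4,4,4),(5,5,87),(6,6,64);
""")

# Variant 1: drop every row for which a higher version of the same PR_ID exists.
not_exists_ids = [row[0] for row in conn.execute("""
    SELECT t2.id
    FROM t1
    JOIN t2 ON t2.t_id = t1.id
    WHERE NOT EXISTS (SELECT 1
                      FROM t1 AS t1x
                      WHERE t1x.pr_id = t1.pr_id
                        AND t1x.version > t1.version)
    ORDER BY t2.id
""")]

# Variant 2: compare against the max version of the *current* PR_ID.
correlated_max_ids = [row[0] for row in conn.execute("""
    SELECT t2.id
    FROM t2
    JOIN t1 ON t1.id = t2.t_id
    WHERE t1.version = (SELECT MAX(t1x.version)
                        FROM t1 AS t1x
                        WHERE t1x.pr_id = t1.pr_id)
    ORDER BY t2.id
""")]

print(not_exists_ids, correlated_max_ids)
```

Both lists come out as the T2 rows for T1.IDs 1, 3, 5 and 6, matching the result the question asks for.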

Use SYS_CONNECT_BY_PATH to aggregate values

I would like to generate a data hierarchy.
This query:
select connect_by_root(parent_id) as root_id
,ID
,SYS_CONNECT_BY_PATH(PARENT_ID,'/') PATH
,level
,line
,LINE*power(10,-level+1) CALC
,ltrim(SYS_CONNECT_BY_PATH(lpad(LINE,3,'0'), '.'),'.') SORT
from (
select 3 ID, 1 LINE, 2 PARENT_ID FROM DUAL
union all
select 4 ID, 2 LINE, 2 PARENT_ID FROM DUAL
union all
select 5 ID, 3 LINE, 2 PARENT_ID FROM DUAL
union all
select 6 ID, 1 LINE, 5 PARENT_ID FROM DUAL
union all
select 7 ID, 1 LINE, 6 PARENT_ID FROM DUAL
) v
start with v.parent_id = 2
connect by nocycle prior id=parent_id
Generates:
ROOT_ID ID PATH LEVEL LINE CALC SORT
2 3 /2 1 1 1 001
2 4 /2 1 2 2 002
2 5 /2 1 3 3 003
2 6 /2/5 2 1 0.1 003.001
2 7 /2/5/6 3 1 0.01 003.001.001
What I would like:
ROOT_ID ID PATH LEVEL LINE CALC
2 3 /2 1 1 1
2 4 /2 1 2 2
2 5 /2 1 3 3
2 6 /2/5 2 1 3.1
2 7 /2/5/6 3 1 3.11
Is there a way to get sys_connect_by_path (or another function) to tally the CALC column and its parents?
Currently, I'm using the SORT field for ordering the rows; I'd rather sort on a proper numerical value (CALC field).
Try this:
select connect_by_root(parent_id) as root_id
,ID
,SYS_CONNECT_BY_PATH(PARENT_ID,'/') PATH
,level
,line
,LINE*power(10,-level+1) CALC
,XMLCAST(XMLQUERY(ltrim(SYS_CONNECT_BY_PATH(LINE*power(10,-level+1), '+'),'+') RETURNING CONTENT) AS NUMBER) SORT
from (
select 3 ID, 1 LINE, 2 PARENT_ID FROM DUAL
union all
select 4 ID, 2 LINE, 2 PARENT_ID FROM DUAL
union all
select 5 ID, 3 LINE, 2 PARENT_ID FROM DUAL
union all
select 6 ID, 1 LINE, 5 PARENT_ID FROM DUAL
union all
select 7 ID, 1 LINE, 6 PARENT_ID FROM DUAL
) v
start with v.parent_id = 2
connect by nocycle prior id=parent_id
Alternatively, you can take your SORT column and, after some fiddling (changing the first dot to a comma and removing the other dots), convert the result to a number.
The key part is here:
to_number(regexp_replace(regexp_replace(SORT,'\.',',',1,1),'\.',null),
'99D9999' , ' NLS_NUMERIC_CHARACTERS = '',.'' ') sort2
Example:
3.1.1 -> 3,1.1 -> 3,11, which is then converted to a number.
The complete query is here:
with v as (
select 3 ID, 1 LINE, 2 PARENT_ID FROM DUAL
union all
select 4 ID, 2 LINE, 2 PARENT_ID FROM DUAL
union all
select 5 ID, 3 LINE, 2 PARENT_ID FROM DUAL
union all
select 6 ID, 1 LINE, 5 PARENT_ID FROM DUAL
union all
select 7 ID, 1 LINE, 6 PARENT_ID FROM DUAL
), v2 as (
select connect_by_root(parent_id) as root_id
,ID
,SYS_CONNECT_BY_PATH(PARENT_ID,'/') PATH
,level my_level
,line
,LINE*power(10,-level+1) CALC
,ltrim(SYS_CONNECT_BY_PATH( LINE , '.'),'.') SORT
from v
start with v.parent_id = 2
connect by nocycle prior id=parent_id
)
select ROOT_ID, ID, PATH, my_level, LINE, CALC, SORT,
to_number(regexp_replace(regexp_replace(SORT,'\.',',',1,1),'\.',null),'99D9999' , ' NLS_NUMERIC_CHARACTERS = '',.'' ') sort2
from v2
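Another way to get the running CALC value is to carry it down the hierarchy yourself instead of post-processing a path string. The sketch below does that with a recursive CTE, run against an in-memory SQLite database from Python. It is a swapped-in technique, not the CONNECT BY approach from the answers; the column names lvl, scale and running are made up for the example, and Oracle supports a similar recursive WITH clause from 11g Release 2 onwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE v (id INTEGER, line INTEGER, parent_id INTEGER);
INSERT INTO v VALUES (3,1,2),(4,2,2),(5,3,2),(6,1,5),(7,1,6);
""")

# Each level contributes line * 10^(1 - level); the parent's running total
# is passed down and the child's own contribution is added to it.
rows = conn.execute("""
WITH RECURSIVE h(id, lvl, scale, running) AS (
    SELECT id, 1, 1.0, line * 1.0
    FROM v
    WHERE parent_id = 2
    UNION ALL
    SELECT v.id, h.lvl + 1, h.scale / 10.0,
           h.running + v.line * (h.scale / 10.0)
    FROM v
    JOIN h ON v.parent_id = h.id
)
SELECT id, lvl, ROUND(running, 6)
FROM h
ORDER BY running
""").fetchall()

for row in rows:
    print(row)
```

Sorting on the numeric running total gives the same row order as the zero-padded SORT string, with ids 6 and 7 landing at roughly 3.1 and 3.11. Floating-point arithmetic is used here, so the values are rounded before display.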

How to find the nearest neighbor in Hive? Any windowing function?

Given a table
$cat data.csv
ID,State,City,Price,Flag
1,CA,A,95,0
2,CA,A,96,1
3,CA,A,195,1
4,NY,B,124,0
5,NY,B,128,1
6,NY,C,24,0
7,NY,C,27,1
8,NY,C,29,0
9,NY,C,39,1
Expected Result:
ID0, ID1
1,2
4,5
6,7
8,7
For each ID with Flag=0 above, we want to find another ID with Flag=1 that has the same "State" and "City" and the nearest Price.
I have two rough ideas:
Method 1.
Use a left outer join with the table itself on
(a.State=b.State and a.City=b.city and a.Flag=0 and b.Flag=1),
where a.Flag=0 and b.Flag=1,
and then use RANK() over (partitioned by a.State,a.City order by a.Price - b.Price) as rank
where rank=1
Method 2.
Use a left outer join with the table itself,
on
(a.State=b.State and a.City=b.city and a.Flag=0 and b.Flag=1),
where a.Flag=0 and b.Flag=1,
and then Use Distribute by a.State,a.City Sort by Price_Diff ASC limit 1
What's the best way to find the nearest neighbor in Hive?
Any valuable tips will be greatly appreciated!
select a.id, b.id , min(abs(b.price-a.price)) as delta
from data as a
inner join data as b
on a.state=b.state and
a.flag=0 and b.flag=1 and
a.city=b.city
group by a.id, b.id
order by delta asc;
This returns
1 2 1 <---
8 7 2 <---
6 7 3 <---
4 5 4 <---
8 9 10
6 9 15
1 3 100
The problem is that the last 3 rows reuse IDs that already appear in the first 4 rows.
select a.id as id0, b.id as id1, abs(b.price-a.price) as delta,
rank() over ( partition by a.state, a.city order by abs(b.price-a.price) )
from data as a
inner join data as b
on a.state=b.state and
a.flag=0 and b.flag=1 and
a.city=b.city;
This will return
id0 id1 delta rank
1 2 1 1 <---
1 3 100 2
4 5 4 1 <---
8 7 2 1 <---
6 7 3 2
8 9 10 3
6 9 15 4
The pair 6,7 does not get rank 1, and in a sense this is correct, because the rank is computed per state and city rather than per Flag=0 row:
6,NY,C,24,0
7,NY,C,27,1
8,NY,C,29,0
9,NY,C,39,1
The lowest price difference among (6,7), (6,9), (8,7), (8,9) is in (8,7), so only that pair gets rank 1 (ambiguous join).
I think you will love this video about the topic: Big Data Analytics Using Window Functions
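If one nearest neighbour per Flag=0 row is wanted, as in the expected result from the question, one option is to rank per a.id instead of per state and city. The sketch below is a minimal check of that idea using an in-memory SQLite database from Python (window functions need SQLite 3.25 or later). The alias rn and the tie-break on b.id are assumptions; the same ROW_NUMBER() ... OVER clause should carry over to Hive versions that support windowing functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (id INTEGER, state TEXT, city TEXT, price INTEGER, flag INTEGER);
INSERT INTO data VALUES
 (1,'CA','A',95,0),(2,'CA','A',96,1),(3,'CA','A',195,1),
 (4,'NY','B',124,0),(5,'NY','B',128,1),
 (6,'NY','C',24,0),(7,'NY','C',27,1),(8,'NY','C',29,0),(9,'NY','C',39,1);
""")

# Number the Flag=1 candidates of each Flag=0 row by price distance,
# then keep only the closest candidate (rn = 1) per Flag=0 row.
rows = conn.execute("""
SELECT id0, id1
FROM (
    SELECT a.id AS id0,
           b.id AS id1,
           ROW_NUMBER() OVER (PARTITION BY a.id
                              ORDER BY ABS(b.price - a.price), b.id) AS rn
    FROM data AS a
    JOIN data AS b
      ON a.state = b.state
     AND a.city = b.city
     AND a.flag = 0
     AND b.flag = 1
) AS ranked
WHERE rn = 1
ORDER BY id0
""").fetchall()

print(rows)
```

Partitioning by a.id lets both 6 and 8 pick 7 as their neighbour, which is why the pair 6,7 is no longer dropped.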

Updating status in 1 table, based on most recent response in another table

I'm using an Oracle 11g R1 database. Please help me with what I'm trying to achieve.
Table 1
-------
ID Name Status
-- ---- ------
1 John 0
2 Chris 0
3 Joel 0
4 Mike 0
5 Henry 0
Table 2
-------
ID Status ResponseDate
-- ------ -------------
1 0 1-Jan-2013
1 1 31-Jan-2013
1 2 3-Feb-2013
1 6 19-Jan-2013
2 6 3-Mar-2013
2 2 1-Mar-2013
2 1 4-Mar-2013
2 0 2-Mar-2013
3 0 3-Feb-2013
3 1 2-Feb-2013
3 2 1-Feb-2013
4 2 4-Apr-2013
4 1 6-Apr-2013
4 0 1-Apr-2013
5 1 31-Mar-2013
5 6 4-Apr-2013
5 3 10-Jan-2013
I would like to update Table1.status based on the most recent response recorded for each ID. So the statuses in Table1 should finally be updated as below:
ID Name Status
-- ---- ------
1 John 2
2 Chris 1
3 Joel 0
4 Mike 1
5 Henry 6
update table1 t1
set status = (
select max(status) keep (dense_rank last order by responsedate)
from table2 t2
where t2.id = t1.id
);
update table1 t1
set status =
(
select status
from table2
where id = t1.id and responseDate =
(
select max(responseDate)
from table2
where id = t1.id
)
)
Of course you can update the status column of table1 every time the need arises, but you might consider creating a view, V_Table1 for instance, which will always give you fresh, up-to-date information:
create or replace view V_Table1 as
select max(t.id) as id
, max(t.name) as name
, max(q.status) keep(dense_rank first
order by q.ResponseDate desc) as status
from table_1 t
join table_2 q
on (q.id = t.id)
group by t.id
Result:
select *
from V_Table1
ID NAME STATUS
-------- ----- ----------
1 John 2
2 Chris 1
3 Joel 0
4 Mike 1
5 Henry 6
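The correlated-subquery update can be checked against the sample data from the question. This is a minimal sketch using an in-memory SQLite database from Python; the dates are rewritten in ISO format so they compare correctly as text, and the WHERE EXISTS guard is an addition (an assumption about the desired behaviour) that leaves rows without any response untouched instead of setting them to NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, status INTEGER);
CREATE TABLE table2 (id INTEGER, status INTEGER, responsedate TEXT);
INSERT INTO table1 VALUES
 (1,'John',0),(2,'Chris',0),(3,'Joel',0),(4,'Mike',0),(5,'Henry',0);
INSERT INTO table2 VALUES
 (1,0,'2013-01-01'),(1,1,'2013-01-31'),(1,2,'2013-02-03'),(1,6,'2013-01-19'),
 (2,6,'2013-03-03'),(2,2,'2013-03-01'),(2,1,'2013-03-04'),(2,0,'2013-03-02'),
 (3,0,'2013-02-03'),(3,1,'2013-02-02'),(3,2,'2013-02-01'),
 (4,2,'2013-04-04'),(4,1,'2013-04-06'),(4,0,'2013-04-01'),
 (5,1,'2013-03-31'),(5,6,'2013-04-04'),(5,3,'2013-01-10');
""")

# Set each status to the status of the row with the latest response date.
conn.execute("""
UPDATE table1
SET status = (SELECT s.status
              FROM table2 AS s
              WHERE s.id = table1.id
                AND s.responsedate = (SELECT MAX(m.responsedate)
                                      FROM table2 AS m
                                      WHERE m.id = table1.id))
WHERE EXISTS (SELECT 1 FROM table2 AS e WHERE e.id = table1.id)
""")

statuses = dict(conn.execute("SELECT id, status FROM table1 ORDER BY id"))
print(statuses)
```

The resulting statuses match the expected table in the question: 2, 1, 0, 1 and 6 for ids 1 to 5.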
