merge several records with the same call_id into one - oracle

There is a table_verif table where I need to insert data, here is an example of fields and records for this table:
call_id | datetime | process_name | param_1 | param_2 | param_3 | param_4 |
------------------------------------------------------------------------------------------
1234567 | 29.12.2022| verification process | greeting| presentation | result | waiting |
---------------------------------------------------------------------------------------
And there is my table my_table with entries:
call_id | datetime | process_name | param_1 |
--------------------------------------------------------------
1234567 | 29.12.2022 | complaints | Establishing contact |
--------------------------------------------------------------
1234567 | 29.12.2022 | complaints | Identification |
--------------------------------------------------------------
1234567 | 29.12.2022 | complaints | Num specification |
--------------------------------------------------------------
1234567 | 29.12.2022 | complaints | Data transfer |
I need to translate these four records into a table_verif format:
call_id| datetime |process_name| param_1 | param_2 | param_3 | param_4
--------------------------------------------------------------------------------------------
1234567|29.12.2022|complaints |Establishing contact|Identification|Num specification|Data transfer
I have shortened the example to param_4, but in practice the columns run from param_1 to param_9.

You can pivot a result set (more explanation here), but you need to have a value to pivot on and provide the ordering for the result. For example you could start with a query like:
select call_id, datetime, process_name, param_1,
-- establish some ranking - use your own criteria
row_number() over (partition by call_id, process_name order by datetime, param_1) as rn
from my_table
which with the sample data plus reverse-engineered data from your example table_verif row might produce:
CALL_ID | DATETIME | PROCESS_NAME | PARAM_1 | RN
----------------------------------------------------------------------
1234567 | 29-DEC-22 | complaints | Data transfer | 1
1234567 | 29-DEC-22 | complaints | Establishing contact | 2
1234567 | 29-DEC-22 | complaints | Identification | 3
1234567 | 29-DEC-22 | complaints | Num specification | 4
1234567 | 29-DEC-22 | verification process | greeting | 1
1234567 | 29-DEC-22 | verification process | presentation | 2
1234567 | 29-DEC-22 | verification process | result | 3
1234567 | 29-DEC-22 | verification process | waiting | 4
You may have different criteria to decide the parameter order, perhaps from a look-up table you haven't shown. However you get it, once you have a generated value like rn you can use that for the pivot:
select *
from (
select call_id, datetime, process_name, param_1,
-- establish some ranking - use your own criteria
row_number() over (partition by call_id, process_name order by datetime, param_1) as rn
from my_table
)
pivot (
max(param_1)
for (rn) in (1 as param_1, 2 as param_2, 3 as param_3, 4 as param_4, 5 as param_5)
)
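CALL_ID | DATETIME | PROCESS_NAME | PARAM_1 | PARAM_2 | PARAM_3 | PARAM_4 | PARAM_5
--------------------------------------------------------------------------------------------------------------
1234567 | 29-DEC-22 | complaints | Data transfer | Establishing contact | Identification | Num specification | null
1234567 | 29-DEC-22 | verification process | greeting | presentation | result | waiting | null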
You can then use that as an insert ... select ... statement to populate table_verif, or if this is permanent data then you can turn the query into a view instead of duplicating it.
You could also pivot by listing all the possible param_1 values in the in() clause instead of generating a ranking value, but that would give you a much wider result, and doesn't seem to be what you want here.
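As a rough, portable illustration of the same idea, here is a minimal sketch that runs in SQLite through Python's sqlite3 module rather than in Oracle. SQLite has no PIVOT clause, so this swaps in conditional aggregation (MAX(CASE ...)) over the same row_number() ranking; the ISO date string and the four-column limit are assumptions made for the example, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (call_id INTEGER, datetime TEXT, process_name TEXT, param_1 TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1234567, "2022-12-29", "complaints", "Establishing contact"),
        (1234567, "2022-12-29", "complaints", "Identification"),
        (1234567, "2022-12-29", "complaints", "Num specification"),
        (1234567, "2022-12-29", "complaints", "Data transfer"),
    ],
)

# Rank the rows per call/process, then fold each rank into its own column.
# The ordering (datetime, param_1) is only a placeholder criterion.
pivot_sql = """
SELECT call_id, datetime, process_name,
       MAX(CASE WHEN rn = 1 THEN param_1 END) AS param_1,
       MAX(CASE WHEN rn = 2 THEN param_1 END) AS param_2,
       MAX(CASE WHEN rn = 3 THEN param_1 END) AS param_3,
       MAX(CASE WHEN rn = 4 THEN param_1 END) AS param_4
FROM (
    SELECT call_id, datetime, process_name, param_1,
           ROW_NUMBER() OVER (PARTITION BY call_id, process_name
                              ORDER BY datetime, param_1) AS rn
    FROM my_table
)
GROUP BY call_id, datetime, process_name
"""
rows = conn.execute(pivot_sql).fetchall()
print(rows)
```

Because the placeholder ordering is alphabetical on param_1, the columns come out in the same order as the pivot result above, not in the order the rows were entered; a real ordering column would be needed to reproduce the layout from the question.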

Related

Vertica database added duplicate entry with same primary key

I am running a Docker image of Vertica on Windows. I have created a table in Vertica with this schema (student_id is the primary key):
dbadmin@d1f942c8c1e0(*)=> \d testschema.student;
List of Fields by Tables
Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key
------------+---------+------------+-------------+------+---------+----------+-------------+-------------
testschema | student | student_id | int | 8 | | t | t |
testschema | student | name | varchar(20) | 20 | | f | f |
testschema | student | major | varchar(20) | 20 | | f | f |
(3 rows)
student_id is a primary key. I am testing loading data from csv file using copy command.
First I used insert - insert into testschema.student values (1,'Jack','Biology');
Then I created a csv file at /home/dbadmin/vertica_test directory -
vi student.csv
2,Kate,Sociology
3,Claire,English
4,Jack,Biology
5,Mike,Comp. Sci
Then I ran this command
copy testschema.students from '/home/dbadmin/vertica_test/student.csv' delimiter ',' rejected data as table students_rejected;
I tested the result
select * from testschema.student - shows 5 rows
select * from students_rejected; - no rows
Then I created another csv file with bad data in the /home/dbadmin/vertica_test directory
vi student_bad.csv
bad_data_type_for_student_id,UnaddedStudent, UnaddedSubject
6,Cassey,Physical Education
I added data from bad csv file
copy testschema.students from '/home/dbadmin/vertica_test/student_bad.csv' delimiter ',' rejected data as table students_rejected;
Then I tested the output
select * from testschema.student - shows 6 rows <-- only one row got added. all ok
select * from students_rejected; - shows 1 row <-- bad row's entry is here. all ok
all looks good
Then I added the bad data again without the rejected data option
copy testschema.students from '/home/dbadmin/vertica_test/student_bad.csv' delimiter ',' ;
But now the entry with student id 6 got added again!!
student_id | name | major
------------+--------+--------------------
1 | Jack | Biology
2 | Kate | Sociology
3 | Claire | English
4 | Jack | Biology
5 | Mike | Comp. Sci
6 | Cassey | Physical Education <--
6 | Cassey | Physical Education <--
Shouldn't this have got rejected?
If you created your students table with a command of this type:
DROP TABLE IF EXISTS students;
CREATE TABLE students (
student_id int
, name varchar(20)
, major varchar(20)
, CONSTRAINT pk_students PRIMARY KEY(student_id)
);
that is, without the explicit keyword ENABLED, then the primary key constraint is disabled. That is, you can happily insert duplicates, but will run into an error if you later want to join to the students table via the primary key column.
With the primary key constraint enabled ...
[...]
, CONSTRAINT pk_students PRIMARY KEY(student_id) ENABLED
[...]
I think you get the desired effect.
The whole scenario:
DROP TABLE IF EXISTS students;
CREATE TABLE students (
student_id int
, name varchar(20)
, major varchar(20)
, CONSTRAINT pk_students PRIMARY KEY(student_id) ENABLED
);
INSERT INTO students
SELECT 1,'Jack' ,'Biology'
UNION ALL SELECT 2,'Kate' ,'Sociology'
UNION ALL SELECT 3,'Claire','English'
UNION ALL SELECT 4,'Jack' ,'Biology'
UNION ALL SELECT 5,'Mike' ,'Comp. Sci'
UNION ALL SELECT 6,'Cassey','Physical Education'
;
-- out OUTPUT
-- out --------
-- out 6
COMMIT;
COPY students FROM STDIN DELIMITER ','
REJECTED DATA AS TABLE students_rejected;
6,Cassey,Physical Education
\.
-- out vsql:/home/gessnerm/._vfv.sql:4: ERROR 6745:
-- out Duplicate key values: 'student_id=6'
-- out -- violates constraint 'dbadmin.students.pk_students'
SELECT * FROM students;
-- out student_id | name | major
-- out ------------+--------+--------------------
-- out 1 | Jack | Biology
-- out 2 | Kate | Sociology
-- out 3 | Claire | English
-- out 4 | Jack | Biology
-- out 5 | Mike | Comp. Sci
-- out 6 | Cassey | Physical Education
SELECT * FROM students_rejected;
-- out node_name | file_name | session_id | transaction_id | statement_id | batch_number | row_number | rejected_data | rejected_data_orig_length | rejected_reason
-- out -----------+-----------+------------+----------------+--------------+--------------+------------+---------------+---------------------------+-----------------
-- out (0 rows)
With the constraint disabled, the only reliable check seems to be the ANALYZE_CONSTRAINTS() call ...
ALTER TABLE students ALTER CONSTRAINT pk_students DISABLED;
-- out Time: First fetch (0 rows): 7.618 ms. All rows formatted: 7.632 ms
COPY students FROM STDIN DELIMITER ','
REJECTED DATA AS TABLE students_rejected;
6,Cassey,Physical Education
\.
-- out Time: First fetch (0 rows): 31.790 ms. All rows formatted: 31.791 ms
SELECT * FROM students;
-- out student_id | name | major
-- out ------------+--------+--------------------
-- out 1 | Jack | Biology
-- out 2 | Kate | Sociology
-- out 3 | Claire | English
-- out 4 | Jack | Biology
-- out 5 | Mike | Comp. Sci
-- out 6 | Cassey | Physical Education
-- out 6 | Cassey | Physical Education
SELECT * FROM students_rejected;
-- out node_name | file_name | session_id | transaction_id | statement_id | batch_number | row_number | rejected_data | rejected_data_orig_length | rejected_reason
-- out -----------+-----------+------------+----------------+--------------+--------------+------------+---------------+---------------------------+-----------------
-- out (0 rows)
SELECT ANALYZE_CONSTRAINTS('students');
-- out Schema Name | Table Name | Column Names | Constraint Name | Constraint Type | Column Values
-- out -------------+------------+--------------+-----------------+-----------------+---------------
-- out dbadmin | students | student_id | pk_students | PRIMARY | ('6')
-- out (1 row)
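For readers who want to see what such a check boils down to, here is a minimal sketch of a plain GROUP BY ... HAVING duplicate-key query. It is not Vertica: it runs in SQLite through Python's sqlite3 module, and the table is deliberately created without any key so that it behaves like a table whose primary key is declared but not enforced. It only approximates what ANALYZE_CONSTRAINTS reports for a primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No PRIMARY KEY on purpose: this mimics a key that is declared but not enforced.
conn.execute("CREATE TABLE students (student_id INTEGER, name TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        (1, "Jack", "Biology"),
        (2, "Kate", "Sociology"),
        (3, "Claire", "English"),
        (4, "Jack", "Biology"),
        (5, "Mike", "Comp. Sci"),
        (6, "Cassey", "Physical Education"),
        (6, "Cassey", "Physical Education"),  # the duplicate load
    ],
)

# List every key value that occurs more than once, with its number of copies.
duplicate_keys = conn.execute(
    """
    SELECT student_id, COUNT(*) AS copies
    FROM students
    GROUP BY student_id
    HAVING COUNT(*) > 1
    ORDER BY student_id
    """
).fetchall()
print(duplicate_keys)
```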

Oracle 11g insert into select from a table with duplicate rows

I have one table that needs to be split into several other tables.
The main table is just a transitional (staging) table.
I dump data from an Excel file into it (from 5k to 200k rows) and, using insert into select, split it into the correct tables (five different tables).
However, the latest dataset that my client sent has records with duplicate values.
The primary key for my table is usually ENI. But even this value is duplicated, because the same company can be both a customer and a service provider, so it has two different records that use the same ENI.
What I have so far:
I found a script that uses merge and modified it to find rows with the same ENI and set the same main_id on all of them
|Main_id| ENI | company_name| Type
| 1 | 1864 | JOHN | C
| 2 | 351485 | JOEL | C
| 3 | 16546 | MICHEL | C
| 2 | 351485 | JOEL J. | S
| 1 | 1864 | JOHN E. E. | C
Main_id: primary key that the main DB uses
ENI: unique company number
Type: 'C' - CUSTOMER, 'S' - SERVICE PROVIDER
In some cases the duplicates can have the same type, as with id 1.
There are several other columns...
What I need:
Insert any one of the rows for each main_id that my other script has already sorted, and set a flag on the others to show that they were not inserted. I can't delete any data, because I'll need to send this information to the customer to validate.
Or maybe I simply can't do it this way and have to go back to good old Excel.
Edit: as a question below this is a example
|Main_id| ENI | company_name| Type| RANK|
| 1 | 1864 | JOHN | C | 1 |
| 2 | 351485 | JOEL | C | 1 |
| 3 | 16546 | MICHEL | C | 1 |
| 2 | 351485 | JOEL J. | S | 2 |
| 1 | 1864 | JOHN E. E. | C | 2 |
RANK would work like this: ENI 1864 appears 2 times,
so the first one found gets 1, the second gets 2, and so on. I tried using
RANK() OVER (PARTITION BY MAIN_ID ORDER BY ENI)
RANK() OVER (PARTITION BY company_name ORDER BY ENI)
Thanks to TEJASH I was able to come up with this solution:
MERGE INTO TABLEA S
USING (Select ROWID AS ID,
row_number() Over(partition by eni order by eni, type) as RANK_DUPLICATED
From TABLEA
) T
ON (S.ROWID = T.ID)
WHEN MATCHED THEN UPDATE SET S.RANK_DUPLICATED= T.RANK_DUPLICATED;
As far as I understand your problem, you just need to identify the duplicates based on 2 columns. You can achieve that with an analytic function as follows:
Select t.*,
row_number() Over(partition by main_id, eni order by company_name) as rnk
From your_table t
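To make the effect of that ranking concrete, here is a minimal sketch run against the sample rows from the question. It uses SQLite through Python's sqlite3 module as a stand-in for Oracle (ROW_NUMBER() behaves the same way for this purpose), and the split into to_insert and flagged lists is only an illustration of how rnk could drive the "insert one, flag the rest" requirement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (main_id INTEGER, eni INTEGER, company_name TEXT, type TEXT)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        (1, 1864, "JOHN", "C"),
        (2, 351485, "JOEL", "C"),
        (3, 16546, "MICHEL", "C"),
        (2, 351485, "JOEL J.", "S"),
        (1, 1864, "JOHN E. E.", "C"),
    ],
)

# rnk = 1 marks the row to keep for each (main_id, eni); higher values are duplicates.
ranked = conn.execute(
    """
    SELECT main_id, eni, company_name, type,
           ROW_NUMBER() OVER (PARTITION BY main_id, eni
                              ORDER BY company_name) AS rnk
    FROM your_table
    ORDER BY main_id, rnk
    """
).fetchall()

to_insert = [row for row in ranked if row[4] == 1]  # candidates for the target tables
flagged = [row for row in ranked if row[4] > 1]     # kept, but marked as not inserted
print(to_insert)
print(flagged)
```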

Compare fields from the query results in Oracle

I need to obtain all the records from a table where the FIRSTNAME and LASTNAME of the records are the same but their BIRTHDATE values differ by 15 years or more.
Consider my table looks like:
_______________________________________________________________________________
| PRIMARY_ID | UNIQUE_ID | FIRSTNAME | LASTNAME | SUFFIX | BIRTHDATE |
_______________________________________________________________________________
| 12345 | abcd | john | collin | Mr | 1975-10-01 00:00:00|
| 12345 | cdef | john | collin | Mr | 1960-10-01 00:00:00|
| 12345 | efgh | john | collin | Mr | 1975-10-01 00:00:00|
| 12345 | ghij | john | collin | Mr | 1960-10-01 00:00:00|
| 12345 | aaaa | john | collin | Mr | 1975-10-01 00:00:00|
| 12345 | bdfs | john | collin | Mr | 1975-10-01 00:00:00|
| 12345 | asdf | john | collin | Mr | null |
| 12345 | dfgh | john | collin | Mr | null |
| 23456 | ghij | jeremy | lynch | Mr | 1982-10-15 00:00:00|
| 23456 | aaaa | jacob | lynch | Mr | 1945-10-12 00:00:00|
| 23456 | bdfs | jeremy | lynch | Mr | 1945-10-12 00:00:00|
| 23456 | asdf | jacob | lynch | Mr | null |
| 23456 | dfgh | jeremy | lynch | Mr | null |
_______________________________________________________________________________
In this table, for PRIMARY_ID 12345, the FIRSTNAME and LASTNAME are all the same but the BIRTHDATE difference between the UNIQUE_IDs is 15 years, so this PRIMARY_ID needs to be pulled out. For PRIMARY_ID 23456, on the other hand, the FIRSTNAME is not the same for all UNIQUE_ID records, so it must not be pulled out.
The table might contain NULL values for BIRTHDATE, which should be ignored.
This is what I have tried till now:
SELECT
/*+ PARALLEL(16) */
PRIMARY_ID,
UNIQUE_ID,
FIRSTNAME,
LASTNAME,
SUFFIX,
BIRTHDATE,
RANK() OVER ( ORDER BY FIRSTNAME, LASTNAME, SUFFIX, BIRTHDATE) "GROUP"
FROM TABLE;
I have queried to form separate groups to distinguish by FIRSTNAME, LASTNAME and BIRTHDATE. I do not know on how to proceed further with this.
Can someone please help out?
NOTE: The BIRTHDATE field is in varchar datatype and I use Oracle 12C.
As I understand it, the goal is to return the distinct set of primary_id values for which alphabetically adjacent unique_id rows that share the same firstname and lastname have birthdates 15 or more years apart. I am also assuming that a NULL birthdate should interrupt the comparison and be treated as a non-match (otherwise primary_id 23456 would also match here, through the pseudo-adjacent bdfs and ghij rows).
There are other ways to do this, but one way available in 12c is to use pattern matching with MATCH_RECOGNIZE. An example is below. It uses a difference of 5478 days to represent 15 years, but you could refine that if you need greater exactness around leap days and so on.
SELECT DISTINCT PRIMARY_ID
FROM THE_TABLE
MATCH_RECOGNIZE (
PARTITION BY PRIMARY_ID
ORDER BY UNIQUE_ID
ONE ROW PER MATCH
AFTER MATCH SKIP PAST LAST ROW
PATTERN(FIFTEEN_DIFF)
DEFINE FIFTEEN_DIFF AS
(FIFTEEN_DIFF.FIRSTNAME = PREV(FIFTEEN_DIFF.FIRSTNAME)
AND FIFTEEN_DIFF.LASTNAME = PREV(FIFTEEN_DIFF.LASTNAME)
AND (ABS(EXTRACT( DAY FROM (TO_TIMESTAMP(FIFTEEN_DIFF.BIRTHDATE,'YYYY-MM-DD HH24:MI:SS') - PREV(TO_TIMESTAMP(FIFTEEN_DIFF.BIRTHDATE,'YYYY-MM-DD HH24:MI:SS'))))) >= 5478)));
Result:
PRIMARY_ID
12345
1 row selected.
The above query does the following:
PARTITIONs to look at each PRIMARY_ID group individually,
then ORDERs by the UNIQUE_ID, so only alphabetically-adjacent records are compared.
Then each record is compared to the last, and if they share FIRSTNAME and LASTNAME, and their BIRTHDATEs differ by 15+ years, they are counted as a MATCH, and returns one record to indicate this.
After any match is found, it skips to the next row and resumes comparing.
Since only the distinct matches are desired, a DISTINCT is included in the select statement.
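The steps above can also be sketched without MATCH_RECOGNIZE. The following is a hedged illustration only: it swaps in the LAG() window function to compare each row with the previous one, and it runs in SQLite through Python's sqlite3 module rather than in Oracle, so julianday() stands in for the TO_TIMESTAMP arithmetic. The data is the sample from the question, with the SUFFIX column left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE the_table (primary_id INTEGER, unique_id TEXT, "
    "firstname TEXT, lastname TEXT, birthdate TEXT)"
)
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?, ?, ?)",
    [
        (12345, "abcd", "john", "collin", "1975-10-01 00:00:00"),
        (12345, "cdef", "john", "collin", "1960-10-01 00:00:00"),
        (12345, "efgh", "john", "collin", "1975-10-01 00:00:00"),
        (12345, "ghij", "john", "collin", "1960-10-01 00:00:00"),
        (12345, "aaaa", "john", "collin", "1975-10-01 00:00:00"),
        (12345, "bdfs", "john", "collin", "1975-10-01 00:00:00"),
        (12345, "asdf", "john", "collin", None),
        (12345, "dfgh", "john", "collin", None),
        (23456, "ghij", "jeremy", "lynch", "1982-10-15 00:00:00"),
        (23456, "aaaa", "jacob", "lynch", "1945-10-12 00:00:00"),
        (23456, "bdfs", "jeremy", "lynch", "1945-10-12 00:00:00"),
        (23456, "asdf", "jacob", "lynch", None),
        (23456, "dfgh", "jeremy", "lynch", None),
    ],
)

# Compare every row with the previous row (by unique_id) inside its primary_id.
# A NULL birthdate makes the date comparison NULL, so that pair never matches.
lag_sql = """
SELECT DISTINCT primary_id
FROM (
    SELECT primary_id, firstname, lastname, birthdate,
           LAG(firstname) OVER (PARTITION BY primary_id ORDER BY unique_id) AS prev_firstname,
           LAG(lastname)  OVER (PARTITION BY primary_id ORDER BY unique_id) AS prev_lastname,
           LAG(birthdate) OVER (PARTITION BY primary_id ORDER BY unique_id) AS prev_birthdate
    FROM the_table
)
WHERE firstname = prev_firstname
  AND lastname = prev_lastname
  AND ABS(julianday(birthdate) - julianday(prev_birthdate)) >= 5478
ORDER BY primary_id
"""
matches = conn.execute(lag_sql).fetchall()
print(matches)
```

In Oracle the same shape should work with LAG() and date subtraction, but that variant is untested here.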
EDIT:
In response to follow-up questions, adding two additional examples.
Alternative 1: Pre-Filter NULL
This will bring different UNIQUE_ID into proximity, giving different matches.
SELECT DISTINCT PRIMARY_ID
FROM (SELECT PRIMARY_ID, UNIQUE_ID, FIRSTNAME, LASTNAME, SUFFIX, BIRTHDATE
FROM THE_TABLE
WHERE BIRTHDATE
IS NOT NULL)
MATCH_RECOGNIZE (
PARTITION BY PRIMARY_ID
ORDER BY UNIQUE_ID
ONE ROW PER MATCH
AFTER MATCH SKIP PAST LAST ROW
PATTERN (FIFTEEN_DIFF)
DEFINE FIFTEEN_DIFF AS
(FIFTEEN_DIFF.FIRSTNAME = PREV(FIFTEEN_DIFF.FIRSTNAME)
AND FIFTEEN_DIFF.LASTNAME = PREV(FIFTEEN_DIFF.LASTNAME)
AND (ABS(EXTRACT(DAY FROM (TO_TIMESTAMP(FIFTEEN_DIFF.BIRTHDATE , 'YYYY-MM-DD HH24:MI:SS') -
PREV(TO_TIMESTAMP(FIFTEEN_DIFF.BIRTHDATE , 'YYYY-MM-DD HH24:MI:SS'))))) >= 5478)));
Result (this now includes PRIMARY_ID 23456, as removing NULL brings two UNIQUE_IDs into order that are 15+ years apart):
PRIMARY_ID
12345
23456
2 rows selected.
Alternative 2: Count NULL as a match
SELECT DISTINCT PRIMARY_ID
FROM THE_TABLE
MATCH_RECOGNIZE (
PARTITION BY PRIMARY_ID
ORDER BY UNIQUE_ID
ONE ROW PER MATCH
AFTER MATCH SKIP PAST LAST ROW
PATTERN (FIFTEEN_DIFF)
DEFINE FIFTEEN_DIFF AS
(FIFTEEN_DIFF.FIRSTNAME = PREV(FIFTEEN_DIFF.FIRSTNAME)
AND FIFTEEN_DIFF.LASTNAME = PREV(FIFTEEN_DIFF.LASTNAME)
AND ((ABS(EXTRACT(DAY FROM (TO_TIMESTAMP(FIFTEEN_DIFF.BIRTHDATE , 'YYYY-MM-DD HH24:MI:SS') -
PREV(TO_TIMESTAMP(FIFTEEN_DIFF.BIRTHDATE , 'YYYY-MM-DD HH24:MI:SS'))))) >= 5478)
OR (LEAST(FIFTEEN_DIFF.BIRTHDATE,PREV(FIFTEEN_DIFF.BIRTHDATE)) IS NULL
AND COALESCE(FIFTEEN_DIFF.BIRTHDATE,PREV(FIFTEEN_DIFF.BIRTHDATE)) IS NOT NULL))));
Result (this also returns both PRIMARY_IDs, as NULL is now counted as a match):
PRIMARY_ID
12345
23456
2 rows selected.

SSRS (Report Builder) Min Value and Max Value from Same Dataset

I am attempting to extract data from a database that has an error in it. I can't resolve the error (it's a "design feature"), so I have to try to query around it. Here's how the data is stored.
Record ID | Create Date | Update Date | Record Status
123 | 05/01/2018 | 05/01/2018 | Active
123 | 05/08/2018 | 05/08/2018 | Active
123 | 05/15/2018 | 05/15/2018 | Closed
123 | 05/22/2018 | 05/22/2018 | Closed
456 | 06/02/2018 | 06/02/2018 | Pending
456 | 06/09/2018 | 06/09/2018 | Active
456 | 06/16/2018 | 06/16/2018 | Active
456 | 06/23/2018 | 06/23/2018 | Suspended
And so on. As you can see, the Create Date and Update Date values match on each row. The Create Date value is supposed to be the date the Record ID was initially created, but it's actually being captured as the date the Record ID update was created.
What I need is a report that brings me a single row per Record ID that shows me the minimum Create Date and the maximum Update Date, so that the result looks something like this:
Record ID | Create Date | Update Date | Record Status
123 | 05/01/2018 | 05/22/2018 | Closed
456 | 06/02/2018 | 06/23/2018 | Suspended
I've tried using the MIN and MAX aggregate functions in the Query Designer, and that works just fine until I add any other field that may change through the life of the record. I get this:
Record ID | Create Date | Update Date | Record Status
123 | 05/01/2018 | 05/08/2018 | Active
123 | 05/15/2018 | 05/22/2018 | Closed
456 | 06/02/2018 | 06/02/2018 | Pending
456 | 06/09/2018 | 06/16/2018 | Active
456 | 06/23/2018 | 06/23/2018 | Suspended
I'm relatively new to Report Builder, though I think I'm picking up its concepts quickly. What am I missing here?
Edited to add that when I use the Query Designer, the query text looks like this:
SELECT
DB.RECORD.RECORD_ID
,DB.RECORD.RECORD_STATUS_CODE
,MAX(DB.RECORD.RECORD_CREATED_DATE) AS Max_RECORD_CREATED_DATE
,MIN(DB.RECORD.RECORD_UPDATED_DATE) AS Min_RECORD_UPDATED_DATE
FROM
DB.RECORD
GROUP BY
DB.RECORD.RECORD_ID
,DB.RECORD.RECORD_STATUS_CODE
There are more elegant ways to do this using CTEs but this is a simple solution.
First I replicated your data sample and stuffed it into a table variable @t. Then all we do is group by Record ID, getting the min create date and max update date (ignoring the status for now). We then join this subquery back to your original table on Record ID and update date, which gives us the last row for each Record ID, and take the status from there.
DECLARE @t TABLE ([Record ID] int, [Create Date] date, [Update Date] date, [Record Status] varchar(20))
INSERT INTO @t VALUES
(123, '2018-05-01', '2018-05-01', 'Active'),
(123, '2018-05-08', '2018-05-08', 'Active'),
(123, '2018-05-15', '2018-05-15', 'Closed'),
(123, '2018-05-22', '2018-05-22', 'Closed'),
(456, '2018-06-02', '2018-06-02', 'Pending'),
(456, '2018-06-09', '2018-06-09', 'Active'),
(456, '2018-06-16', '2018-06-16', 'Active'),
(456, '2018-06-23', '2018-06-23', 'Suspended')
SELECT
g.[Record ID], g.[Create Date], g.[Update Date], t.[Record Status]
FROM
(
SELECT [Record ID], MIN([Create Date]) AS [Create Date], MAX([Update Date]) AS [Update Date]
FROM @t
GROUP BY [Record ID]
) g
JOIN @t t ON g.[Record ID] = t.[Record ID] and g.[Update Date] = t.[Update Date]
ORDER BY g.[Record ID]
Here's a link to a SQL Fiddle that shows the results. The only difference with this version is the table name.
http://sqlfiddle.com/#!18/0bb22/1/0
UPDATE Based on your dataset query
This may not be 100% right, as I don't have all your data to test with, but you probably just need the following.
SELECT
g.RECORD_ID, g.RECORD_CREATED_DATE, g.lastUpdateDate, t.RECORD_STATUS_CODE
FROM
(
SELECT RECORD_ID, MIN(RECORD_CREATED_DATE) AS RECORD_CREATED_DATE, MAX(RECORD_UPDATED_DATE) AS lastUpdateDate
FROM DB.RECORD
GROUP BY RECORD_ID
) g
JOIN DB.RECORD t ON g.RECORD_ID = t.RECORD_ID and g.lastUpdateDate = t.RECORD_UPDATED_DATE
ORDER BY g.RECORD_ID
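For anyone who wants to try the group-then-join-back pattern without a SQL Server instance, here is a minimal sketch using SQLite through Python's sqlite3 module. The table name record and the short column names are assumptions made for the example; the data is the sample from the question, with dates stored as ISO strings so that MIN and MAX sort them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE record (record_id INTEGER, created TEXT, updated TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO record VALUES (?, ?, ?, ?)",
    [
        (123, "2018-05-01", "2018-05-01", "Active"),
        (123, "2018-05-08", "2018-05-08", "Active"),
        (123, "2018-05-15", "2018-05-15", "Closed"),
        (123, "2018-05-22", "2018-05-22", "Closed"),
        (456, "2018-06-02", "2018-06-02", "Pending"),
        (456, "2018-06-09", "2018-06-09", "Active"),
        (456, "2018-06-16", "2018-06-16", "Active"),
        (456, "2018-06-23", "2018-06-23", "Suspended"),
    ],
)

# Step 1 (subquery g): one row per record with its first create and last update date.
# Step 2 (join): pick the status from the row that carries that last update date.
latest_sql = """
SELECT g.record_id, g.first_created, g.last_updated, t.status
FROM (
    SELECT record_id, MIN(created) AS first_created, MAX(updated) AS last_updated
    FROM record
    GROUP BY record_id
) g
JOIN record t
  ON t.record_id = g.record_id AND t.updated = g.last_updated
ORDER BY g.record_id
"""
summary = conn.execute(latest_sql).fetchall()
print(summary)
```

One thing to keep in mind: if two rows for the same record share the latest update date, the join returns both of them, so the result is only one row per record when that date is unique.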

(Nested?) Select statement with MAX and WHERE clause

I'm racking my brain over a set of data in order to generate a report from an Oracle DB.
Data are in two tables:
SUPPLY
DEVICE
There is only one column that links the two tables:
SUPPLY.DEVICE_ID
DEVICE.ID
In SUPPLY, there are these data: (Markdown is not working well. it's supposed to show a table)
| DEVICE_ID | COLOR_TYPE | SERIAL | UNINSTALL_DATE |
|----------- |------------ |-------------- |--------------------- |
| 1232 | 1 | CAP857496 | 08/11/2016,19:10:50 |
| 5263 | 2 | CAP57421 | 07/11/2016,11:20:00 |
| 758 | 3 | CBO753421869 | 07/11/2016,04:25:00 |
| 758 | 4 | CC9876543 | 06/11/2016,11:40:00 |
| 8575 | 4 | CVF75421 | 05/11/2016,23:59:00 |
| 758 | 4 | CAP67543 | 30/09/2016,11:00:00 |
In DEVICE, there are columns that I've to select all (more or less), but each row is unique.
What i need to achieve is:
for each SUPPLY.DEVICE_ID and SUPPLY.COLOR_TYPE, I need the most recent ROW -> MAX(UNINSTALL_DATE)
JOINED with
more or less all the columns in DEVICE.
At the end I should have something like this:
| ACCOUNT_CODE | MODEL | DEVICE.SERIAL | DEVICE_ID | COLOR_TYPE | SUPPLY.SERIAL | UNINSTALL_DATE |
|-------------- |------- |--------------- |----------- |------------ |--------------- |--------------------- |
| BUSTO | MS410 | LM753 | 1232 | 1 | CAP857496 | 08/11/2016,19:10:50 |
| MACCHI | MX310 | XC876 | 5263 | 2 | CAP57421 | 07/11/2016,11:20:00 |
| ASL_COMO | MX711 | AB123 | 758 | 3 | CBO753421869 | 07/11/2016,04:25:00 |
| ASL_COMO | MX711 | AB123 | 758 | 4 | CC9876543 | 06/11/2016,11:40:00 |
| ASL_VARESE | X950 | DE8745 | 8575 | 4 | CVF75421 | 05/11/2016,23:59:00 |
So far, using a nested select like:
SELECT DEVICE_ID,COLOR_TYPE,SERIAL,UNINSTALL_DATE FROM
(SELECT DEVICE_ID,COLOR_TYPE,SERIAL,UNINSTALL_DATE
FROM SUPPLY WHERE DEVICE_ID = '123456' ORDER BY UNINSTALL_DATE DESC)
WHERE ROWNUM <= 1
I managed to get the highest value in the UNINSTALL_DATE column, after trying MAX(UNINSTALL_DATE) or HIGHEST(UNINSTALL_DATE).
I tried also:
SELECT SUPPLY.DEVICE_ID, SUPPLY.COLOR_TYPE, ....
FROM SUPPLY,DEVICE WHERE SUPPLY.DEVICE_ID = DEVICE.ID
and it works, but gives me ALL the items, basically it's a merge of the two tables.
When I try to narrow down the selected data, I get errors or an empty result.
I'm starting to wonder whether it's even possible to obtain this data, and I'm about to export the data to Excel and work from there, but I hope someone can help me before I give up...
Thank you in advance.
for each SUPPLY.DEVICE_ID and SUPPLY.COLOR_TYPE, I need the most recent ROW -> MAX(UNINSTALL_DATE)
Use ROW_NUMBER function in this way:
SELECT s.*,
row_number() OVER (
PARTITION BY DEVICE_ID, COLOR_TYPE
ORDER BY UNINSTALL_DATE DESC
) As RN
FROM SUPPLY s
This query marks most recent rows with RN=1
JOINED with more or less all the columns in DEVICE.
Just join the above query to DEVICE table
SELECT d.*,
x.COLOR_TYPE,
x.SERIAL,
x.UNINSTALL_DATE
FROM (
SELECT s.*,
row_number() OVER (
PARTITION BY DEVICE_ID, COLOR_TYPE
ORDER BY UNINSTALL_DATE DESC
) As RN
FROM SUPPLY s
) x
JOIN DEVICE d
ON d.ID = x.DEVICE_ID AND x.RN=1
OK - so you could group by device_id, color_type and select max(uninstall_date) as well, and join to the other table. But you would miss the serial value for the most recent row (for each combination of device_id, color_type).
There are a few ways to fix that. Your attempt with rownum was close, but the problem is that you need to order within each "group" (by device_id, color_type) and get the first row from each group. I am sure someone will post a solution along those lines, using either row_number() or rank() or perhaps the analytic version of max(uninstall_date).
When you just need the "top" row from each group, you can use keep (dense_rank first/last) - which may be slightly more efficient - like so:
select device_id, color_type,
max(serial) keep (dense_rank last order by uninstall_date) as serial,
max(uninstall_date) as uninstall_date
from supply
group by device_id, color_type
;
and then join to the other table. NOTE: dense_rank last will pick up the row OR ROWS with the most recent (max) date for each group. If there are ties, that is more than one row; the serial will then be the max (in lexicographical order) among those rows with the most recent date. You can also select min, or add some order so you pick a specific one (you didn't discuss this possibility).
SELECT
d.ACCOUNT_CODE, d.DNS_HOST_NAME,d.IP_ADDRESS,d.MODEL_NAME,d.OVERRIDE_SERIAL_NUMBER,d.SERIAL_NUMBER,
s.COLOR, s.SERIAL_NUMBER, s.UNINSTALL_TIME
FROM (
SELECT s.DEVICE_ID, s.LAST_LEVEL_READ, s.SERIAL_NUMBER,TRUNC(s.UNINSTALL_TIME), row_number()
OVER (
PARTITION BY DEVICE_ID, COLOR
ORDER BY UNINSTALL_TIME DESC
) As RN
FROM SUPPLY s
WHERE s.UNINSTALL_TIME IS NOT NULL AND s.SERIAL_NUMBER IS NOT NULL
)
JOIN DEVICE d
ON d.ID = s.DEVICE_ID AND s.RN=1;
@krokodilko: thank you very much for your help. The first query works. I modified it to remove junk, using the real column names I need (yesterday evening I had no access to the DB) and getting only the data I need.
Unfortunately, when I join the two tables as you suggested, I get this error:
ORA-00904: "S"."RN": invalid identifier
00904. 00000 - "%s: invalid identifier"
If I remove s. before RN, the ORA-00904 moves back to s.DEVICE_ID.
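The ORA-00904 is consistent with a scoping problem: the alias s exists only inside the parentheses, and the derived table itself has no alias, so the outer SELECT and the ON clause have nothing called s to refer to. The inner query also has to select every column the outer query uses (COLOR and UNINSTALL_TIME are missing from it). Below is a minimal sketch of the corrected shape, run in SQLite through Python's sqlite3 module as a stand-in for Oracle. The alias x, the reduced column list, the ISO timestamp strings and the sample rows (adapted from the question) are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE device (id INTEGER, account_code TEXT, model_name TEXT, serial_number TEXT)")
conn.execute("CREATE TABLE supply (device_id INTEGER, color INTEGER, serial_number TEXT, uninstall_time TEXT)")
conn.executemany(
    "INSERT INTO device VALUES (?, ?, ?, ?)",
    [
        (1232, "BUSTO", "MS410", "LM753"),
        (5263, "MACCHI", "MX310", "XC876"),
        (758, "ASL_COMO", "MX711", "AB123"),
        (8575, "ASL_VARESE", "X950", "DE8745"),
    ],
)
conn.executemany(
    "INSERT INTO supply VALUES (?, ?, ?, ?)",
    [
        (1232, 1, "CAP857496", "2016-11-08 19:10:50"),
        (5263, 2, "CAP57421", "2016-11-07 11:20:00"),
        (758, 3, "CBO753421869", "2016-11-07 04:25:00"),
        (758, 4, "CC9876543", "2016-11-06 11:40:00"),
        (8575, 4, "CVF75421", "2016-11-05 23:59:00"),
        (758, 4, "CAP67543", "2016-09-30 11:00:00"),
    ],
)

# The derived table gets its own alias (x); s is only visible inside it.
latest_sql = """
SELECT d.account_code, d.model_name, d.serial_number,
       x.device_id, x.color, x.serial_number, x.uninstall_time
FROM (
    SELECT s.device_id, s.color, s.serial_number, s.uninstall_time,
           ROW_NUMBER() OVER (PARTITION BY s.device_id, s.color
                              ORDER BY s.uninstall_time DESC) AS rn
    FROM supply s
    WHERE s.uninstall_time IS NOT NULL AND s.serial_number IS NOT NULL
) x
JOIN device d ON d.id = x.device_id
WHERE x.rn = 1
ORDER BY x.uninstall_time DESC
"""
latest = conn.execute(latest_sql).fetchall()
for row in latest:
    print(row)
```

The older CAP67543 row for device 758, color 4 gets rn = 2 and is filtered out, which leaves the five rows shown as the desired result in the question.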

Resources