I am trying to update the sample JSON data below into an Oracle 19 table (I want to update 1000 rows from the JSON with one query):
create table jt_test (
CUST_NUM int, SORT_ORDER int, CATEGORY varchar2(100)
);
[
{"CUST_NUM": 12345, "SORT_ORDER": 1, "CATEGORY": "ICE CREAM"}
{"CUST_NUM": 12345, "SORT_ORDER": 2, "CATEGORY": "ICE CREAM"}
{"CUST_NUM": 12345, "SORT_ORDER": 3, "CATEGORY": "ICE CREAM"}
]
I used this tutorial and this one for inserting rows from JSON, and that works perfectly. But I have no idea how to update rows. How can I do it?
Note: I use Oracle 19c and connect and insert into the DB with the cx_Oracle Python module.
Code for inserting from JSON into Oracle columns:
DECLARE
myJSON varchar2(1000) := '[
{"CUST_NUM": 12345, "SORT_ORDER": 1, "CATEGORY": "ICE CREAM"},
{"CUST_NUM": 12345, "SORT_ORDER": 2, "CATEGORY": "ICE CREAM"},
{"CUST_NUM": 12345, "SORT_ORDER": 3, "CATEGORY": "ICE CREAM"}
]';
BEGIN
insert into jt_test
select * from json_table ( myjson, '$[*]'
columns (
CUST_NUM, SORT_ORDER, CATEGORY
)
);
END;
In SQL Developer, use the code below:
MERGE INTO jt_test destttt using(
SELECT CUST_NUM,SORT_ORDER,CATEGORY FROM json_table (
'[
{"CUST_NUM": 12345, "SORT_ORDER": 1, "CATEGORY": "ICE CREAM"},
{"CUST_NUM": 12345, "SORT_ORDER": 2, "CATEGORY": "ICE CREAM"},
{"CUST_NUM": 12345, "SORT_ORDER": 3, "CATEGORY": "ICE CREAM"}
]'
,'$[*]'
COLUMNS
CUST_NUM int PATH '$.CUST_NUM',
SORT_ORDER int PATH '$.SORT_ORDER',
CATEGORY varchar2 PATH '$.CATEGORY' ) ) srccccc
ON ( destttt.CUST_NUM= srccccc.CUST_NUM)
WHEN MATCHED THEN UPDATE SET destttt.CATEGORY=srccccc.CATEGORY
WHEN NOT MATCHED THEN INSERT ( CUST_NUM,SORT_ORDER,CATEGORY) VALUES (srccccc.CUST_NUM,srccccc.SORT_ORDER,srccccc.CATEGORY);
In Python with cx_Oracle, use the code below:
long_json_string = '''[
{"CUST_NUM": 12345, "SORT_ORDER": 1, "CATEGORY": "ICE CREAM"},
{"CUST_NUM": 12345, "SORT_ORDER": 2, "CATEGORY": "ICE CREAM"},
{"CUST_NUM": 12345, "SORT_ORDER": 3, "CATEGORY": "ICE CREAM"}
]'''
sql = '''
DECLARE jsonvalue CLOB := :long_json_string ;
begin
MERGE INTO jt_test destttt using(
SELECT CUST_NUM,SORT_ORDER,CATEGORY FROM json_table (jsonvalue
,'$[*]'
COLUMNS
CUST_NUM int PATH '$.CUST_NUM',
SORT_ORDER int PATH '$.SORT_ORDER',
CATEGORY varchar2 PATH '$.CATEGORY' ) ) srccccc
ON ( destttt.CUST_NUM= srccccc.CUST_NUM)
WHEN MATCHED THEN UPDATE SET destttt.CATEGORY=srccccc.CATEGORY
WHEN NOT MATCHED THEN INSERT ( CUST_NUM,SORT_ORDER,CATEGORY) VALUES (srccccc.CUST_NUM,srccccc.SORT_ORDER,srccccc.CATEGORY);
end;
'''
cursor.execute(sql, long_json_string=long_json_string)
Note 1: Do not forget to commit at the end (with cx_Oracle, call connection.commit() after the execute).
Note 2: Make sure the column you use for the comparison in the ON clause is not repeated in the JSON; otherwise the MERGE fails because it cannot get a stable set of rows (see the sketch after these notes for one way to deduplicate).
Note 3: JSON keys are case-sensitive, i.e. CUST_NUM is different from cust_num, CUST_num, and so on.
Wrong : CUST_NUM int PATH '$.CUST_num' or CUST_NUM int PATH '$.cusr _num'
Ok: CUST_NUM int PATH '$.CUST_NUM'
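As a follow-up to Note 2: if the JSON can repeat the merge key (as CUST_NUM 12345 does in the sample), a minimal sketch is to keep only one row per key in the USING subquery before merging. Keeping the row with the highest SORT_ORDER is only an assumption here; pick whatever rule fits your data. The :long_json_string bind is the same one used in the Python example above.
MERGE INTO jt_test destttt USING (
  SELECT CUST_NUM, SORT_ORDER, CATEGORY
  FROM (
    SELECT j.*,
           -- keep one row per CUST_NUM (highest SORT_ORDER wins; adjust as needed)
           ROW_NUMBER() OVER (PARTITION BY j.CUST_NUM ORDER BY j.SORT_ORDER DESC) rn
    FROM json_table(:long_json_string, '$[*]'
           COLUMNS (
             CUST_NUM   int           PATH '$.CUST_NUM',
             SORT_ORDER int           PATH '$.SORT_ORDER',
             CATEGORY   varchar2(100) PATH '$.CATEGORY'
           )) j
  )
  WHERE rn = 1
) srccccc
ON ( destttt.CUST_NUM = srccccc.CUST_NUM )
WHEN MATCHED THEN UPDATE SET destttt.CATEGORY = srccccc.CATEGORY
WHEN NOT MATCHED THEN INSERT ( CUST_NUM, SORT_ORDER, CATEGORY )
  VALUES ( srccccc.CUST_NUM, srccccc.SORT_ORDER, srccccc.CATEGORY );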
Related
I'm trying to get a specific dataset from the Scival REST API into an Oracle database table. Below is the JSON payload that I'm trying to manipulate.
{
"metrics": [{
"metricType": "ScholarlyOutput",
"valueByYear": {
"2017": 4,
"2018": 0,
"2019": 3,
"2020": 1,
"2021": 1
}
}],
"author": {
"link": {
"#ref": "self",
"#href": "https://api.elsevier.com/analytics/scival/author/123456789?apiKey=xxxxxxxxxx&httpAccept=text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2",
"#type": "text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2"
},
"name": "Citizen, John",
"id": 123456789,
"uri": "Author/123456789"
}
}
I'm able to query the 'author' bit with the below SQL.
SELECT jt.*
FROM TABLE d,
JSON_TABLE(d.column format json, '$.author' COLUMNS (
"id" VARCHAR2 PATH '$.id',
"name" VARCHAR2 PATH '$.name')
) jt;
However, I'm not able to get the 'valueByYear' value. I've tried below.
SELECT jt.*
FROM TABLE d,
JSON_TABLE
(d.column, '$.metrics[*]' COLUMNS
(
"metric_Type" VARCHAR2 PATH '$.metricType'
,"Value_By_Year" NUMBER PATH '$.valueByYear'
NESTED PATH '$.valueByYear[1]' COLUMNS
("2021" NUMBER PATH '$.valueByYear[1]'
)
)
) jt;
I would appreciate if you could let me know what I'm missing here. I'm after the latest 'year' value.
You can use:
SELECT jt.*
FROM table_name d,
JSON_TABLE(
d.column_name format json,
'$'
COLUMNS (
id VARCHAR2 PATH '$.author.id',
name VARCHAR2 PATH '$.author.name',
NESTED PATH '$.metrics[*]' COLUMNS (
metricType VARCHAR2(30) PATH '$.metricType',
value2021 NUMBER PATH '$.valueByYear."2021"'
)
)
) jt;
Which, for the sample data:
CREATE TABLE table_name (
column_name CLOB CHECK (column_name IS JSON)
);
INSERT INTO table_name (column_name) VALUES (
'{
"metrics": [{
"metricType": "ScholarlyOutput",
"valueByYear": {
"2017": 4,
"2018": 0,
"2019": 3,
"2020": 1,
"2021": 1
}
}],
"author": {
"link": {
"#ref": "self",
"#href": "https://api.elsevier.com/analytics/scival/author/123456789?apiKey=xxxxxxxxxx&httpAccept=text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2",
"#type": "text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2"
},
"name": "Citizen, John",
"id": 123456789,
"uri": "Author/123456789"
}
}'
);
Outputs:
       ID | NAME          | METRICTYPE      | VALUE2021
--------: | :------------ | :-------------- | --------:
123456789 | Citizen, John | ScholarlyOutput |         1
db<>fiddle here
If you want to do it dynamically in PL/SQL, then you can create the types:
CREATE TYPE scival_row IS OBJECT(
name VARCHAR2(100),
id NUMBER(12),
metricType VARCHAR2(50),
year NUMBER(4),
value NUMBER
);
CREATE TYPE scival_tbl IS TABLE OF scival_row;
and then the pipelined function:
CREATE FUNCTION parseScival(
i_json CLOB,
i_year NUMBER
) RETURN scival_tbl PIPELINED DETERMINISTIC
IS
v_obj JSON_OBJECT_T := JSON_OBJECT_T.parse(i_json);
v_author JSON_OBJECT_T := v_obj.get_Object('author');
v_name VARCHAR2(100) := v_author.get_String('name');
v_id NUMBER(12) := v_author.get_Number('id');
v_metrics JSON_ARRAY_T := v_obj.get_Array('metrics');
v_metric JSON_OBJECT_T;
BEGIN
FOR i IN 0 .. v_metrics.Get_Size - 1 LOOP
v_metric := TREAT(v_metrics.get(i) AS JSON_OBJECT_T);
PIPE ROW(
scival_row(
v_name,
v_id,
v_metric.get_string('metricType'),
i_year,
v_metric.get_object('valueByYear').get_number(i_year)
)
);
END LOOP;
RETURN;
END;
/
Then you can use the query:
SELECT j.*
FROM table_name t
CROSS APPLY TABLE(parseScival(t.column_name, 2021)) j
Which outputs:
NAME          |        ID | METRICTYPE      | YEAR | VALUE
:------------ | --------: | :-------------- | ---: | ----:
Citizen, John | 123456789 | ScholarlyOutput | 2021 |     1
db<>fiddle here
My source table looks like this:
id|value|count
Value is a String of values separated by semicolons(;). For example it may look like this
A;B;C;D;
Some may not have values at a certain position, like this
A;;;D;
First, I selectively moved records to a new table (targettable) based on which positions have values, using a regexp. I achieved this by using [^;]+; for positions that must have some value between the semicolons, and [^;]*; for positions I don't care about. For example, if I wanted the 1st and 4th positions to have values, I could combine the regexp with insert into like this
insert into
targettable tt (id, value, count)
SELECT some_seq.nextval,value, count
FROM source
WHERE
regexp_like(value, '^[^;]+;[^;]*;[^;]*;[^;]+;')
so now my new table has a list of records that have values at the 1st and 4th position. It may look like this
1|A;B;C;D;|2
2|B;;;E;|1
3|A;D;;D|3
Next, there are 2 things I want to do: 1. get rid of the values other than the 1st and 4th; 2. combine identical values and add up their counts. For example, records 1 and 3 are the same, so I want to trim them so they become A;D;, and then add their counts, so 2+3=5. Now my new table looks like this
1|A;D;|5
2|B;E;|1
As long as I can somehow get to the final table from the source table, I don't care about the steps. The intermediate table is not required, but it may help me achieve the final result. I'm not sure if I can go any further with Oracle though. If not, I'll have to move and process the records with Java. Bear in mind I have millions of records, so I would consider the Oracle method if it is possible.
You should be able to skip the intermediate table; just extract the 1st and 4th elements, using the regexp_substr() function, while checking that those are not null:
select regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) -- first position
|| ';' || regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) -- fourth position
|| ';' as value, -- if you want trailing semicolon
count
from source
where regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) is not null
and regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) is not null;
VALUE COUNT
------------------ ----------
A;D; 2
B;E; 1
A;D; 3
and then aggregate those results:
select value, sum(count) as count
from (
select regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) -- first position
|| ';' || regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) -- fourth position
|| ';' as value, -- if you want trailing semicolon
count
from source
where regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) is not null
and regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) is not null
)
group by value;
VALUE COUNT
------------------ ----------
A;D; 5
B;E; 1
Then for your insert you can use that query, either with an auto-increment ID (12c+), or setting an ID from a sequence via a trigger, or possibly wrapped in another level of subquery to get the value explicitly:
insert into target (id, value, count)
select some_seq.nextval, value, count
from (
select value, sum(count) as count
from (
select regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) -- first position
|| ';' || regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) -- fourth position
|| ';' as value, -- if you want trailing semicolon
count
from source
where regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) is not null
and regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) is not null
)
group by value
);
If you're creating a new sequence to do that, so they start from 1, you can use rownum or row_number() instead.
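For example, a minimal sketch of the rownum variant, reusing the same aggregated query (the generated IDs simply follow whatever order the aggregated rows come back in):
insert into target (id, value, count)
select rownum, value, count
from (
  select value, sum(count) as count
  from (
    select regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) -- first position
      || ';' || regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) -- fourth position
      || ';' as value, -- if you want trailing semicolon
      count
    from source
    where regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) is not null
    and regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) is not null
  )
  group by value
);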
Incidentally, using a keyword or a function name like count as a column name is confusing (sum(count) !?); those might not be your real names though.
I would use regexp_replace to remove the 2nd and 3rd parts of the string, combined with an aggregate query to get the total count, like:
SELECT
    regexp_replace(value, '^([^;]+;)[^;]*;[^;]*;([^;]+;).*$', '\1\2'),
    SUM(count)
FROM source
WHERE
    regexp_like(value, '^[^;]+;[^;]*;[^;]*;[^;]+;')
GROUP BY
    regexp_replace(value, '^([^;]+;)[^;]*;[^;]*;([^;]+;).*$', '\1\2')
I'm not used to working with scheduled tasks, so I need some advice (is my approach good or bad?).
I'm designing a function that runs every 20 minutes. This function retrieves data from a JSON file (which I do not have control over) and inserts the data into the database.
When I was doing this I did not consider that it would create a unique ID problem in the database, given that it is the same data that gets updated each time.
I thought of writing two functions:
1: the first insertions (INSERT)
2: updating the data according to the ID (UPDATE)
@Component
public class LoadSportsCompetition {
@PostConstruct
public void insert() {
// 1 : get json data
// 2 : insert in DB
}
@Scheduled(cron="0 0/20 * * * ?")
public void update() {
// 1 : get json data
// 2 : update rows by ID
}
}
The (most probably) best way to handle this in PostgreSQL 9.5 and later is to use INSERT ... ON CONFLICT ... DO UPDATE.
Let's assume this is your original table (very simple, for the sake of this example):
CREATE TABLE tbl
(
tbl_id INTEGER,
payload JSONB,
CONSTRAINT tbl_pk
PRIMARY KEY (tbl_id)
) ;
We fill it with the starting data:
INSERT INTO tbl
(tbl_id, payload)
VALUES
(1, '{"a":12}'),
(2, '{"a":13, "b": 25}'),
(3, '{"a":15, "b": [12,13,14]}'),
(4, '{"a":12, "c": "something"}'),
(5, '{"a":13, "x": 1234.567}'),
(6, '{"a":12, "x": 1234.789}') ;
Now we perform a non-conflicting insert (i.e.: the ON CONFLICT ... DO won't be executed):
-- A normal insert, no conflict
INSERT INTO tbl
(tbl_id, payload)
VALUES
(7, '{"x": 1234.56, "y": 3456.78}')
ON CONFLICT ON CONSTRAINT tbl_pk DO
UPDATE
SET payload = excluded.payload ; -- Note: the excluded pseudo-table comprises the conflicting rows
And now we perform one INSERT that would generate a PRIMARY KEY conflict, which will be handled by the ON CONFLICT clause and will perform an update
-- A conflicting insert
INSERT INTO tbl
(tbl_id, payload)
VALUES
(3, '{"a": 16, "b": "I don''t know"}')
ON CONFLICT ON CONSTRAINT tbl_pk DO
UPDATE
SET payload = excluded.payload ;
And now, a two row insert that will conflict on one row, and insert the other:
-- Now one of each
-- A conflicting insert
INSERT INTO tbl
(tbl_id, payload)
VALUES
(4, '{"a": 18, "b": "I will we updated"}'),
(9, '{"a": 17, "b": "I am nuber 9"}')
ON CONFLICT ON CONSTRAINT tbl_pk DO UPDATE
SET payload = excluded.payload ;
We check now the table:
SELECT * FROM tbl ORDER BY tbl_id ;
tbl_id | payload
-----: | :----------------------------------
1 | {"a": 12}
2 | {"a": 13, "b": 25}
3 | {"a": 16, "b": "I don't know"}
4 | {"a": 18, "b": "I will we updated"}
5 | {"a": 13, "x": 1234.567}
6 | {"a": 12, "x": 1234.789}
7 | {"x": 1234.56, "y": 3456.78}
9 | {"a": 17, "b": "I am nuber 9"}
Your code should loop through your incoming data and perform the INSERT/UPDATE (sometimes called MERGE or UPSERT) either one row at a time or in batches, with multi-row VALUES.
You can get all the code at dbfiddle here
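For completeness, a minimal per-row sketch of that idea, written here as a PostgreSQL prepared statement just to illustrate the binding (your driver would normally handle this for you; the payload value is only a sample):
-- One row at a time: the application binds the id and the JSON payload
PREPARE upsert_one (integer, jsonb) AS
INSERT INTO tbl
    (tbl_id, payload)
VALUES
    ($1, $2)
ON CONFLICT ON CONSTRAINT tbl_pk DO
UPDATE
    SET payload = excluded.payload ;

-- Example call with sample values
EXECUTE upsert_one (3, '{"a": 16, "b": "updated again"}') ;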
There is also one alternative, which is better suited if you work in batches: use a WITH statement that has one UPDATE clause, followed by an INSERT:
-- Avoiding (most) concurrency issues.
BEGIN TRANSACTION ;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE ;
WITH data_to_load (tbl_id, payload) AS
(
VALUES
(3, '{"a": 16, "b": "I don''t know"}' :: jsonb),
(4, '{"a": 18, "b": "I will we updated"}'),
(7, '{"x": 1234.56, "y": 3456.78}'),
(9, '{"a": 17, "b": "I am nuber 9"}')
),
update_existing AS
(
UPDATE
tbl
SET
payload = data_to_load.payload
FROM
data_to_load
WHERE
tbl.tbl_id = data_to_load.tbl_id
)
-- Insert the non-existing
INSERT INTO
tbl
(tbl_id, payload)
SELECT
tbl_id, payload
FROM
data_to_load
WHERE
data_to_load.tbl_id NOT IN (SELECT tbl_id FROM tbl) ;
COMMIT TRANSACTION ;
You'll get the same results, as you can see at dbfiddle here.
In both cases, be ready for error handling, and be prepared to retry your transactions if they conflict due to concurrent actions also modifying your database. Your transactions can be explicit (as in the second case) or implicit, if you have some kind of auto-commit for every single INSERT.
I’m interested in trying out CockroachDB, but typically SQL databases don’t natively support JSON. Is there a way for me to access fields of JSON objects in queries if I store them in CockroachDB?
UPDATE: CockroachDB supports JSON now.
Be Flexible & Consistent: JSON Comes to CockroachDB
We are excited to announce support for JSON in our 2.0 release (coming in April) and available now via our most recent 2.0 Beta release. Now you can use both structured and semi-structured data within the same database.
CockroachDB supports JSON. It stores JSON data in the JSONB data type (Binary JSON).
CREATE TABLE my_table1 (
id INT PRIMARY KEY,
data JSONB
);
The size of a JSONB field is variable but should be kept within 1 MB to ensure satisfactory performance.
We can insert JSON string as follows:
INSERT INTO my_table1 (id, data)
VALUES
(1, '{"name": "Mary", "age": 16, "city": "Singapore"}'::JSONB),
(2, '{"name": "John", "age": 17, "city": "Malaysia" }'::JSONB),
(3, '{"name": "Pete", "age": 18, "city": "Vienna" }'::JSONB),
(99,'{"name": "Anna", "gender": "Female" }'::JSONB);
SELECT * from my_table1;
id | data
-----+---------------------------------------------------
1 | {"age": 16, "city": "Singapore", "name": "Mary"}
2 | {"age": 17, "city": "Malaysia", "name": "John"}
3 | {"age": 18, "city": "Vienna", "name": "Pete"}
99 | {"gender": "Female", "name": "Anna"}
(4 rows)
Return rows (and returning data as JSONB)
SELECT * FROM my_table1 WHERE data->'age' = '17'::JSONB;
id | data
-----+--------------------------------------------------
2 | {"age": 17, "city": "Malaysia", "name": "John"}
(1 row)
Return rows (and returning data as string)
SELECT * FROM my_table1 WHERE data->>'age' = '17';
id | data
-----+--------------------------------------------------
2 | {"age": 17, "city": "Malaysia", "name": "John"}
(1 row)
Return rows showing age and city from data field as JSONB
SELECT id,
data->'age' AS "age",
data->'city' AS "city"
FROM my_table1;
id | age | city
-----+------+--------------
1 | 16 | "Singapore"
2 | 17 | "Malaysia"
3 | 18 | "Vienna"
99 | NULL | NULL
(4 rows)
Return rows if data field has age subfield
SELECT id,
data->'age' AS "age",
data->'city' AS "city"
FROM my_table1
WHERE data ? 'age';
id | age | city
-----+-----+--------------
1 | 16 | "Singapore"
2 | 17 | "Malaysia"
3 | 18 | "Vienna"
(3 rows)
Return rows if age subfield (as string) is equal to 17
SELECT * FROM my_table1 WHERE data->>'age' = '17';
id | data
-----+--------------------------------------------------
2 | {"age": 17, "city": "Malaysia", "name": "John"}
(1 row)
Return rows if age subfield (as JSONB) is equal to 17
SELECT * FROM my_table1 WHERE data->'age' = '17'::JSONB;
id | data
-----+--------------------------------------------------
2 | {"age": 17, "city": "Malaysia", "name": "John"}
(1 row)
Select rows if name and gender subfields exist in the data field.
SELECT * FROM my_table1 WHERE data ?& ARRAY['name', 'gender'];
id | data
-----+---------------------------------------
99 | {"gender": "Female", "name": "Anna"}
(1 row)
I have, let's say, two tables:
Student(id, name);
Class (id, name, student_id);
How can I select all students, but ordered by their class count?
Students:
1, "John"
2, "Andrew"
Classes:
1, french, 1
2, french, 2
3, Spanish, 1
4, English, 1
It should order:
John
Andrew
Right now I get students:
return entites.students.Include(w=>w.classes).ToList();
Order part is missing...
EDIT
Great, it works, but how should it look when the classes table is referenced by a schools table and I want to get students ordered by school count?
Students (id, name);
Classes (id, name, students_id);
Schools (id, name, classes_id);
Students:
1, "John"
2, "Andrew"
Classes:
1, french, 1
2, french, 2
3, Spanish, 1
4, English, 1
5, English, 2
Schools:
1, "Primary school", 1
2, "Secondary school", 2
3, "Another school", 5
It should give me:
Andrew
John
Assuming your file has a
using System.Linq;
above the namespace directive, you can do:
entities.students.Include(s => s.classes).OrderByDescending(s => s.classes.Count());