Java 1.4: how to insert multiple records into a database in one single hit using executeBatch?

I am reading record data from a file (the record count can be up to thousands). Now I want to insert each record into the database, and I want to insert all of the records in one hit to reduce the performance cost. If I use addBatch(String sqlQuery) on a Statement object, my SQL query has to be static, but in my case the query is non-static. Please tell me the possible solutions with the best performance.
Platform:
Java 1.4
SQL Server 2000

From Wiki
A SQL feature (since SQL-92) is the use of row value constructors to insert multiple rows at a time in a single SQL statement:
INSERT INTO table_name (column1 [, column2, ...])
VALUES (value1a [, value1b, ...]),
       (value2a [, value2b, ...]),
       ...
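Note that the multi-row VALUES syntax above is a SQL-92 feature that SQL Server 2000 does not yet support, so on that platform the practical route is JDBC batching with a PreparedStatement: the SQL text stays fixed with parameter markers, the values change per record, and addBatch()/executeBatch() send the whole batch to the server in one round trip. Below is a minimal, Java 1.4-compatible sketch; the JDBC URL, credentials, and the records table with its columns are hypothetical and only for illustration.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BatchInsertExample {
    public static void main(String[] args) throws SQLException {
        // Hypothetical connection details for the SQL Server 2000 JDBC driver; adjust to your environment.
        Connection con = DriverManager.getConnection(
                "jdbc:microsoft:sqlserver://localhost:1433;DatabaseName=test", "user", "password");
        try {
            con.setAutoCommit(false); // commit once, after the whole batch

            // The SQL text is static; only the bound values vary per record.
            PreparedStatement ps = con.prepareStatement(
                    "INSERT INTO records (id, name) VALUES (?, ?)");
            try {
                // In the real code these values would come from the file being read.
                String[][] rows = { {"1", "first"}, {"2", "second"}, {"3", "third"} };
                for (int i = 0; i < rows.length; i++) {
                    ps.setInt(1, Integer.parseInt(rows[i][0]));
                    ps.setString(2, rows[i][1]);
                    ps.addBatch();
                }
                ps.executeBatch(); // one round trip for all queued inserts
                con.commit();
            } finally {
                ps.close();
            }
        } finally {
            con.close();
        }
    }
}

For files with thousands of records it usually also helps to call executeBatch() every few hundred rows rather than queuing everything before the first execution, to keep memory use and driver buffers bounded.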

Related

Import massive table from Oracle to PostgreSQL with oracle-fdw return ORA-01406

I work on a project that transfers data from an Oracle database to a PostgreSQL database to build a data warehouse with bash & SQL scripts. To access the Oracle database, I use the PostgreSQL extension oracle-fdw.
One of my scripts imports data from a massive table (~100,000,000 new rows/day). This table is partitioned and each partition contains one day of data. The query I use to import the data looks like this:
INSERT INTO postgre_target_table (some_fields)
SELECT some_aggregated_fields -- (~150 fields)
FROM oracle_source_table
WHERE partition_id = :v_partition_id AND some_others_filters
GROUP BY primary_key;
On the DEV server, the query works fine (there is much less data on this server), but on PREPROD it returns the error ORA-01406: fetched column value was truncated.
In some posts, people say that the output fields may be too small, but if I run a simple SELECT query without the INSERT or GROUP BY I get the same error.
Another idea I found in another post is to create a view on the Oracle side, but my query uses multiple parameters that I cannot use in a view.
The last idea I found is to create an Oracle stored procedure that fills a table with the aggregated data and then import from that table, but the Oracle database is critical and my customer prefers to avoid adding more data to it.
Now, I'm starting to think there's no solution and it's not good...
PostgreSQL version : 12.4 / Oracle version : 11.2
UPDATE
It seems my problem is more complicated than I thought.
After applying the modification suggested by Laurenz Albe, the query runs correctly in pgAdmin, but the problem still appears when I use the psql command.
Moreover, another query seems to have the same problem. This other query does not use the same source table as the first one; it uses 4 joined tables without any partition. The common point between these queries is their structure.
The detail I omitted to specify in the original post is that the purpose of both queries is to pivot a table. They look like this:
SELECT osr.id,
       MIN(CASE osr.category WHEN 123 THEN 1 END) AS field1,
       MIN(CASE osr.category WHEN 264 THEN 1 END) AS field2,
       MIN(CASE osr.category WHEN 975 THEN 1 END) AS field3,
       ...
FROM oracle_source_table osr
WHERE osr.category IN (123, 264, 975, ...)
GROUP BY osr.id;
Now that I have detailed what the queries look like, I can give you some results I got with the second one without changing the value of max_long (this query is lighter than the first one):
Sometimes it works (~10%) and sometimes it fails (~90%) in pgAdmin, but it never works with the psql command.
If I delete the WHERE clause, it always works.
I don't understand why deleting the WHERE clause changes anything: the field used in this clause is a NUMBER(6, 0) between 0 and 2500 and it is still used in the SELECT clause... Oh, and in the 4 Oracle tables used by this query there is no LONG datatype, only NUMBER.
Among the 20 queries I have, only these two have a problem; their structure is similar and I don't believe in coincidences.
Don't despair!
Set the max_long option on the foreign table big enough that all your oversized data fit.
The documentation has the details:
max_long (optional, defaults to "32767")
The maximal length of any LONG, LONG RAW and XMLTYPE columns in the Oracle table. Possible values are integers between 1 and 1073741823 (the maximal size of a bytea in PostgreSQL). This amount of memory will be allocated at least twice, so large values will consume a lot of memory.
If max_long is less than the length of the longest value retrieved, you will receive the error message
ORA-01406: fetched column value was truncated
Example:
ALTER FOREIGN TABLE my_tab OPTIONS (ADD max_long '1000000');

Nifi executeSql with 30 threads very slow

We are using HDF to fetch large data from Oracle. We have a GenerateTableFetch processor creating partitions of 8000 records, which produces queries like the one below:
Select * from ( Select a.*, ROWNUM rnum FROM (SELECT * FROM OPUSER.DEPENDENCY_TYPES WHERE (1=1))a WHERE ROWNUM <= 368000) WHERE rnum > 361000
Now this query takes almost 20-25 minutes to return from Oracle.
Is there anything we are doing wrong, or any configuration changes we can make?
NiFi uses a JDBC connection, so is there any Oracle-side configuration for that?
Also, would it help if we somehow added a parallelism hint to the query, for example /*+ parallel(c,2) */?
I'm guessing you're using Oracle 11 (or less) and have selected Oracle as the database type. Since LIMIT/OFFSET wasn't introduced until Oracle 12, NiFi uses the nested SELECT with ROWNUM approach to ensure each "page" of data contains unique values. If you are using Oracle 12+, make sure to use the Oracle 12+ database adapter instead, as it can leverage the LIMIT/OFFSET capabilities resulting in a faster query. Also make sure you have the appropriate index(es) in place to help with query execution.
As of NiFi 1.7.0, you might also consider setting the Column for Value Partitioning property. If you have a column (perhaps your DEPENDENCY_TYPES column) that is fairly uniformly distributed, and is not "too sparse" in relation to your Partition Size property value, GenerateTableFetch can use the column's values rather than the ROWNUM approach, resulting in faster queries. See NIFI-5143 and the GenerateTableFetch documentation for more details.
If you need to add hints to the JDBC session, then as of NiFi 1.9.0 (see NIFI-5780 for more details) you can add pre- and post-query statements to ExecuteSQL.

BatchUpdate using an Oracle view

I have a complex Oracle view which takes around ~ 2 - 3 seconds to execute.
I'm trying to insert values from the Oracle view into a table.
I'm using JdbcTemplate batchUpdate() to insert multiple values into the table.
In batchUpdate(), a PreparedStatement is used to set the values.
Will using an Oracle view cause any performance issues?
By using PreparedStatement, SQL statements are precompiled. But in the case of a VIEW, will the view be executed each time the insert query is fired?
Views are just SQL statements. They are not slower or faster than the underlying SQL query.
However, when complex views (multi-table joins and aggregation) are built on top of other complex views, the optimizer may get confused and try to outsmart itself, leading to really bad execution plans. The problems tend to be even worse if you don't have constraints and referential integrity in place.
A final note: if you are merely pulling data out of the database to stuff it back in, you will probably achieve better performance by doing the entire operation inside the database instead. For example, say you pull "order lines" from the database and then update the "order header" with an "Order Total Qty". In that case you should probably do something like the following instead:
merge into order_header h
using (select order_id, sum(order_qty) as order_total_qty
         from order_line
        group by order_id) l
   on (h.order_id = l.order_id)
 when matched then update
  set h.order_total_qty = l.order_total_qty;
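If the write really does have to go through the Java layer, the same idea still applies: issue one INSERT ... SELECT that reads the view and writes the target table in a single statement, so the view is evaluated once and no rows are round-tripped through the JVM. A hedged sketch, assuming Spring's JdbcTemplate and made-up view/table/column names:

import javax.sql.DataSource;
import org.springframework.jdbc.core.JdbcTemplate;

public class ViewToTableCopy {

    private final JdbcTemplate jdbcTemplate;

    public ViewToTableCopy(DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }

    // Copies all rows of the (hypothetical) complex view into the target table
    // with a single INSERT ... SELECT; the view runs once, no per-row batching needed.
    public int copySnapshot(java.sql.Date asOfDate) {
        return jdbcTemplate.update(
                "INSERT INTO target_table (id, total_qty, as_of_date) " +
                "SELECT v.id, v.total_qty, ? FROM complex_view v",
                new Object[] { asOfDate });
    }
}

With this shape the 2-3 second cost of the view is paid once per copy, not once per batch of inserted rows.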

select condition from cdef$ where rowid=:1 query elapsed time is more

In the DB trace, there is a query taking a long time. Can someone explain what it means? It seems to be a very generic Oracle query, not related to my custom tables.
select condition from cdef$ where rowid=:1;
I found the same query in multiple places in the trc files (DB trace), and one of them has a huge elapsed time. So, what would the solution be to avoid it taking such a long time? I am using Oracle 11g.
You're right, that is an example of Oracle's recursive SQL, the statements it runs against the data dictionary to support our application SQL. That particular statement is the query Oracle runs to get the Search Condition of a CHECK constraint. If you are inserting or updating rows in tables with check constraints you will see it a lot.
The actual statement shouldn't take long to run, so it is unlikely to be the source of a performance problem on its own, unless you are running lots of insert statements with hard-coded values: Oracle will run that query every time it parses a fresh insert or update statement, and that gets expensive if you're not using bind variables.
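In JDBC terms, the difference is between concatenating literal values into each statement (every row produces a distinct SQL text that must be hard-parsed, triggering the recursive cdef$ lookup each time) and using a PreparedStatement with bind variables (one parse, many executions). A quick illustrative sketch; the table my_table and its column are hypothetical:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

public class BindVariableExample {

    // Hard-coded literals: every row is a new SQL text, so Oracle hard-parses
    // each one (and re-reads the check constraint definitions from cdef$).
    static void insertWithLiterals(Connection con, int[] values) throws SQLException {
        Statement stmt = con.createStatement();
        try {
            for (int i = 0; i < values.length; i++) {
                stmt.executeUpdate("INSERT INTO my_table (val) VALUES (" + values[i] + ")");
            }
        } finally {
            stmt.close();
        }
    }

    // Bind variable: one SQL text, parsed once, executed many times.
    static void insertWithBinds(Connection con, int[] values) throws SQLException {
        PreparedStatement ps = con.prepareStatement("INSERT INTO my_table (val) VALUES (?)");
        try {
            for (int i = 0; i < values.length; i++) {
                ps.setInt(1, values[i]);
                ps.executeUpdate();
            }
        } finally {
            ps.close();
        }
    }
}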

JDBC batch creation in Sybase

I have a requirement of updating a table which has about 5 million rows.
For that purpose I want to create batch statements in Java and update as a bulk operation.
Right now I have 100 batches and it works fine, but when I increase the number of batches beyond one hundred I get an exception: com.sybase.jdbc2.jdbc.SybBatchUpdateException: JZ0BE: BatchUpdateException: Error occurred while executing batch statement: Message empty.
How can I have more batch statements in my CallableStatement object?
Not enough reputation to leave comments... but what types of statements are you batching? How many of the rows are you updating? Does the table have a primary key? How many columns are in the table, and how many of those columns are you updating?
Generic answer:
The JDBC framework in Sybase is extremely fast. You might at least consider writing a simple procedure that receives the primary key (or other) information you're using to identify the row, along with the new values that row will be updated to, as input variables. This procedure updates a single row only.
Wrap this procedure in its own Java method that handles the CallableStatement, registers your out error number and error message params, etc.
Then you can loop through whatever constructs you're using now to update data, and use the same Java method to call the procedure to update the values row by row, as in the sketch below.
Again, I don't know the volume of what you're trying to do... but I do know that if you're doing single-row updates, this will be VERY fast.
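As a rough sketch of that pattern, assuming a hypothetical single-row procedure update_one_row(id, new_value, err_no output, err_msg output), the Java wrapper might look like this:

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

public class RowUpdater {

    private final Connection con;

    public RowUpdater(Connection con) {
        this.con = con;
    }

    // Calls the hypothetical single-row procedure and returns its error number
    // (0 meaning success). Procedure name and parameters are illustrative only.
    public int updateOneRow(long id, String newValue) throws SQLException {
        CallableStatement cs = con.prepareCall("{call update_one_row(?, ?, ?, ?)}");
        try {
            cs.setLong(1, id);
            cs.setString(2, newValue);
            cs.registerOutParameter(3, Types.INTEGER);  // error number
            cs.registerOutParameter(4, Types.VARCHAR);  // error message
            cs.execute();

            int errNo = cs.getInt(3);
            if (errNo != 0) {
                System.err.println("update_one_row failed: " + cs.getString(4));
            }
            return errNo;
        } finally {
            cs.close();
        }
    }
}

The caller then simply loops over its rows and invokes updateOneRow(...) for each one, as described above.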
