JDBC batch execution guarantees - performance

We're trying to execute a batch insert into Azure Synapse (formerly Azure SQL Data warehouse). Problems are:
Performance is abysmal (~1 second for insertion of one row of less than 2KB and 20-25 columns)
It scales linearly (~90 seconds for 100 rows I think)
We're using standard JDBC batch insertion pattern addBatch() & executeBatch() with PreparedStatements (https://stackoverflow.com/a/3786127/496289).
We're using JDBC driver provided by Microsoft.
We know what's wrong: DB telemetry clearly shows that the DB is breaking the batch down and more or less running it as if it were in a for-loop. No batch "optimization".
Curiously, when the underlying data source is SQL Server, batch scales as expected.
Question is: Is there nothing in standard/spec that says executeBatch() should scale better than linearly?
E.g. JDBC™ 4.3 Specification (JSR 221) says it can improve performance, not it must.
CHAPTER 14 Batch Updates
The batch update facility allows multiple SQL statements to be submitted to a data source for processing at once. Submitting multiple SQL statements, instead of individually, can greatly improve performance. Statement, PreparedStatement, and CallableStatement objects can be used to submit batch updates
14.1.4 PreparedStatement Objects has no such explicit/implied statement to say batch mechanism is for better performance.
Should probably add that Azure Synapse is capable of loading 1 trillion rows of data (~450 GB in Parquet format) from Data Lake in 17-26 minutes with 500 DWUs.

The JDBC specification doesn't require any kind of optimization for batch execution. In fact, not all databases support batch execution. A conforming JDBC driver is expected to implement batch execution whether or not the underlying database system supports it.
If the database system doesn't support it, the JDBC driver will simulate batch execution by repeatedly executing the statement in a loop. Such an implementation will not perform better than manually executing the statement repeatedly.
This is also why the text you quote says "can greatly improve performance" and not will or must.
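To make the "simulated batch" case concrete, here is a minimal sketch in plain Java of what such a fallback amounts to. It is not code from any real driver: SimulatedBatch and its executeOne callback are hypothetical names, and the callback stands in for one statement execution (one round trip) against the database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical model of a driver that cannot send a real batch to the database.
final class SimulatedBatch {
    private final List<Object[]> pending = new ArrayList<>();
    // Stand-in for a single statement execution; returns the update count.
    private final ToIntFunction<Object[]> executeOne;

    SimulatedBatch(ToIntFunction<Object[]> executeOne) {
        this.executeOne = executeOne;
    }

    // Like PreparedStatement.addBatch(): remember one set of parameters.
    void addBatch(Object... params) {
        pending.add(params.clone());
    }

    // Like executeBatch(), but implemented as a plain loop:
    // one execution per queued parameter set, so cost grows linearly.
    int[] executeBatch() {
        int[] updateCounts = new int[pending.size()];
        for (int i = 0; i < pending.size(); i++) {
            updateCounts[i] = executeOne.applyAsInt(pending.get(i));
        }
        pending.clear();
        return updateCounts;
    }
}
```

With N queued rows the callback runs N times, which is the linear behaviour described in the question. A driver with real batch support would instead ship the queued parameter sets to the server in fewer round trips.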


Running a stored procedure in multi threaded way in oracle

I have a job which picks a record from a cursor and then it calls a stored procedure which processes the record picked up from the cursor.
The stored procedure has multiple queries to process the records. In all, procedure takes about 0.3 seconds to process a single record picked up by the cursor but since cursor contains more than 100k records it takes hours to complete the job.
The queries in the stored procedure are all optimized
I was thinking of making the procedure run in multi threaded way as in java and other programming language.
Can it be done in oracle? or is there any other way I can reduce the run time of my job.
I agree with the comments regarding processing cursors in a loop. As Tom Kyte often said "Row at a time [processing] is slow at a time"; Oracle performs best with set based operations and row-at-a-time operations usually have scalability issues (i.e. very susceptible to poor performance when things change on the DB such as CPU capacity, workload, number of records that need processing, changes in size of underlying tables, ...).
You probably already know that Oracle has had a Java VM built into the DB engine since 8i, so you might be able to wrap Java code in PL/SQL, but this is not for the faint of heart [not saying that you are, just sayin'].
Before going to the trouble of re-writing your application, I would recommend the following tuning approach as it may yield some actionable tunings [assumes diagnostics and tuning pack licenses; won't remove the scalability issues but may lessen the impact of them]:
In versions of oracle 11g and above:
Find the top-level SQL ID recorded in gv$active_session_history and dba_hist_active_sess_history for the call to the PL/SQL procedure.
Examine the wait events for the sql_id's under that top_level_sql_id. (they tell you what the SQL is waiting on).
Run the tuning advisor on those sql_id's and check for any tuning recommendations. Even if the SQL is already sub-second, getting it from hundredths of a second to thousandths of a second can have a big impact when it is called many times.
Run the ADDM report for the period when the procedure is running. Often you will find that heavy PL/SQL processes require increase in PGA. Further, ADDM may advise other relevant actions (e.g. increase SGA, session cached cursors, db writer processes, log buffer, run segment tuning advisor, ...)
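To put rough numbers on the "called many times" point, here is a back-of-the-envelope sketch in Java. The 0.3 seconds per record and 100k records come from the question; JobTimeEstimate is a hypothetical helper, and any faster per-record time you plug in is an assumption, not a measurement.

```java
import java.time.Duration;

// Hypothetical helper: estimates the runtime of a row-at-a-time job.
final class JobTimeEstimate {
    // Total time when records are processed strictly one after another.
    static Duration serialRuntime(long records, Duration perRecord) {
        return perRecord.multipliedBy(records);
    }
}
```

For example, serialRuntime(100_000, Duration.ofMillis(300)) comes to 500 minutes (about 8.3 hours), while the same call with a hypothetical 30 ms per record comes to 50 minutes. The work is still row-at-a-time, so this only lessens the scalability problem rather than removing it.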

performance issue in getting millions of record from database and processing in ERP in mule esb

We are trying to fetch millions of records per day from a database and process them in an ERP system, and we are facing performance issues. Does the Community have any solution for this?
What is the best way to process the records in Mule? Should we use batch, or is there an alternative to it? And whichever solution we use, how should we use it so that we do not run into performance issues?
Since we don't have details on your specific situation, here are some general ideas. You will definitely need to do performance testing when dealing with large data sets to make sure your flow design is performing well.
Just to clarify, I'm giving options below that show streaming, which are slightly less performant, but will allow you to process large datasets. If you can handle the dataset in memory and you want faster processing, then turn off streaming.
Test your db queries outside of mule to make sure they are performant and tables are properly indexed.
Use streaming db connection. Tweak chunk size for performance testing. (Using this with batch scope is a good combo)
If using on-premise runtime, do performance tuning.
Use batch scope (enterprise edition)
Batch sounds like what you want. For each batch step, Mule creates a batch job instance, and each instance contains a persistent queue with the batched records. However, it makes a deep copy of the MuleEvent (flow variables, flow construct, message, processing time, session and exchange pattern), so keep a light footprint before going into your batch job. If you have to put a payload with millions of records into flow variables to do some manipulation, make sure you delete them before the batch starts executing. Mule loads these batch steps into memory and executes them concurrently, so the memory you need is roughly the size of the batch job instance (in particular the MuleEvent) multiplied by the number of batch steps.
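The streaming and chunk size advice above boils down to never holding the whole result set in memory at once. Here is a minimal sketch of that idea in plain Java. This is not Mule code: ChunkedProcessor and processInChunks are hypothetical names, and the Iterator stands in for a streaming database cursor.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: consumes a streamed source in fixed-size chunks.
final class ChunkedProcessor {
    // Hands records to the handler chunkSize at a time and returns the number
    // of chunks processed. Only one chunk is buffered at any moment.
    static <T> int processInChunks(Iterator<T> source, int chunkSize,
                                   Consumer<List<T>> handler) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        List<T> chunk = new ArrayList<>(chunkSize);
        while (source.hasNext()) {
            chunk.add(source.next());
            if (chunk.size() == chunkSize) {
                handler.accept(chunk);
                chunks++;
                chunk = new ArrayList<>(chunkSize); // fresh buffer for the next chunk
            }
        }
        if (!chunk.isEmpty()) { // trailing partial chunk
            handler.accept(chunk);
            chunks++;
        }
        return chunks;
    }
}
```

Peak memory is bounded by the chunk size rather than by the total number of records, which is the trade-off behind tweaking chunk size during performance testing: larger chunks mean fewer handler calls but more memory per call.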

Apache Drill has bad performance against SQL Server

I tried using apache-drill to run a simple join-aggregate query and the speed wasn't really good. my test query was:
SELECT p.Product_Category, SUM(f.sales)
FROM facts f
JOIN Product p on f.pkey = p.pkey
GROUP BY p.Product_Category
Where facts has about 422,000 rows and product has 600 rows. the grouping comes back with 4 rows.
First I tested this query on SqlServer and got a result back in about 150ms.
With drill I first tried to connect directly to SqlServer and run the query, but that was slow (about 5 sec).
Then I tried saving the tables into json files and reading from them, but that was even slower, so I tried parquet files.
I got the result back on the first run in about 3 sec. The next run took about 900ms, and then it stabilized at about 500ms.
From reading around, this makes no sense and drill should be faster!
I tried "REFRESH TABLE METADATA", but the speed didn't change.
I was running this on windows, through the drill command line.
Any idea if I need some extra configuration or something?
Drill is very fast, but it's designed for large distributed queries while joining across several different data sources... and you're not using it that way.
SQL Server is one of the fastest relational databases. Data is stored efficiently, cached in memory, and the query runs in a single process so the scan and join are very quick. Apache Drill has much more work to do in comparison. It has to interpret your query into a distributed plan, send it to all the drillbit processes, which then look up the data sources, access the data using the connectors, run the query, return the results to the first node for aggregation, and then you receive the final output.
Depending on the data source, Drill might have to read all the data and filter it separately which adds even more time. JSON files are slow because they are verbose text files that are parsed line by line. Parquet is much faster because it's a binary compressed column-oriented storage format designed for efficient scanning, especially when you're only accessing certain columns.
If you have a small dataset stored on a single machine then any relational database will be faster than Drill.
The fact that Drill gets you results in 500ms with Parquet is actually impressive considering how much more work it has to do to give you the flexibility it provides. If you only have a few million rows, stick with SQL server. If you have billions of rows, then use the SQL Server columnstore feature to store data in columnar format with great compression and performance.
Use Apache Drill when you:
Have 10s of billions of rows or more
Have data spread across many machines
Have unstructured data like JSON stored in files without a standard schema
Want to split the query across many machines to run faster in parallel
Want to access data from different databases and file systems
Want to join data across these different data sources
One thing people need to understand about how Drill works is how Drill translates an SQL query to an executable plan to fetch and process data from, theoretically, any source of data. I deliberately didn't say data source so people won't think of databases or any software-based data management system.
Drill uses storage plugins to read records from whatever data the storage plugin supports.
After Drill gets these rows, it starts performing whatever is needed to execute the query. That may be filtering, sorting, joining, projecting (selecting specific columns), etc.
So Drill doesn't by default use any of the source's capabilities for processing the queried data. In fact, the source may not have any such capability!
If you wish to leverage any of the source's data processing features, you'll have to modify the storage plugin you're using to access this source.
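As a toy illustration of that difference (this is not Drill code, and PushdownModel with its methods is made up for the example), the sketch below counts how many rows cross from the source into the engine when the engine filters by itself versus when the source applies the filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model: engine-side filtering vs. filtering pushed to the source.
final class PushdownModel {
    // Result rows plus the number of rows handed from the source to the engine.
    static final class Result {
        final List<Integer> rows;
        final int rowsTransferred;

        Result(List<Integer> rows, int rowsTransferred) {
            this.rows = rows;
            this.rowsTransferred = rowsTransferred;
        }
    }

    // Default behaviour described above: read everything, then filter in the engine.
    static Result filterInEngine(List<Integer> source, Predicate<Integer> filter) {
        List<Integer> scanned = new ArrayList<>(source); // full scan reaches the engine
        List<Integer> kept = new ArrayList<>();
        for (Integer row : scanned) {
            if (filter.test(row)) {
                kept.add(row);
            }
        }
        return new Result(kept, scanned.size());
    }

    // Pushdown: the source applies the filter, so only matching rows are transferred.
    static Result filterAtSource(List<Integer> source, Predicate<Integer> filter) {
        List<Integer> kept = new ArrayList<>();
        for (Integer row : source) {
            if (filter.test(row)) {
                kept.add(row);
            }
        }
        return new Result(kept, kept.size());
    }
}
```

Both methods return the same rows. Only the amount of data moved differs, and that gap is what a storage plugin with pushdown support is meant to close.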
One query I regularly remember when I think about Drill's performance, is this one
Select a.CUST_ID, (Select count(*) From SALES.CUSTOMERS where CUST_ID < a.CUST_ID) rowNum from SALES.CUSTOMERS a Order by CUST_ID
Only because of the < comparison operator, Drill has to load the whole table (actually a Parquet file), SORT IT, then perform the join.
This query took around 18 minutes to run on my machine which is a not so powerful machine but still, the effort Drill needs to perform to process this query must not be ignored.
Drill's purpose is not to be fast; its purpose is to handle vast amounts of data and run SQL queries against structured and semi-structured data. There are probably other things that I can't think of at the moment, but you may find more information in other answers.

jdbc with oracle DB - out of memory

I've written a simple code that reads a table from oracle DB.
When I run it on a very big table, I see that it consumes a huge amount of memory.
I thought that setting the fetch size would optimize memory usage (that's what happens when using it on SQL Server), but it didn't. I tried various values, from 10 to 100000.
I can't see how to perform what should be a simple task: exporting a very big Oracle table to a CSV file.
I use ojdbc6.jar as a driver.
also I use
Any idea?
Seems like creating the statement with ResultSet.TYPE_FORWARD_ONLY solved this problem.
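For reference, a minimal sketch of what that can look like with standard JDBC calls only. TableCsvExport, export and csvField are hypothetical names, the connection and query are supplied by the caller, and the CSV quoting is a simple assumption rather than a full CSV implementation.

```java
import java.io.IOException;
import java.io.Writer;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper: streams a query result to CSV one row at a time.
final class TableCsvExport {
    // Quotes a field if it contains a comma, a quote or a line break; null becomes empty.
    static String csvField(String value) {
        if (value == null) {
            return "";
        }
        boolean needsQuotes = value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r");
        String escaped = value.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Writes all rows of the query to out and returns the number of rows written.
    static long export(Connection conn, String query, int fetchSize, Writer out)
            throws SQLException, IOException {
        // Forward-only, read-only: the result set does not need to support
        // scrolling back over rows that were already read.
        try (Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            stmt.setFetchSize(fetchSize); // hint: rows fetched per round trip
            try (ResultSet rs = stmt.executeQuery(query)) {
                ResultSetMetaData meta = rs.getMetaData();
                int columns = meta.getColumnCount();
                long rows = 0;
                while (rs.next()) {
                    for (int i = 1; i <= columns; i++) {
                        if (i > 1) {
                            out.write(',');
                        }
                        out.write(csvField(rs.getString(i)));
                    }
                    out.write('\n');
                    rows++;
                }
                return rows;
            }
        }
    }
}
```

Each row is written as soon as it is read and nothing is collected in a list, so memory use on the Java side should depend on the fetch size rather than on the size of the table.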

Does the compiled prepared statement in the database driver still require compilation in the database?

In the Oracle JDBC driver, there is an option to cache prepared statements. My understanding of this is that the prepared statements are precompiled by the driver, then cached, which improves performance for cached prepared statements.
My question is, does this mean that the database never has to compile those prepared statements? Does the JDBC driver send some precompiled representation, or is there still some kind of parsing/compilation that happens in the database itself?
When you use the implicit statement cache (or the Oracle Extension for the explicit Statement Cache) the Oracle Driver will cache a prepared- or callable statement after(!) the close() for re-use with the physical connection.
So what happens is: if a prepared statement is used and the physical connection has never seen it, the driver sends the SQL to the DB. Depending on whether the DB has seen the statement before, it does a soft parse or a hard parse. So typically, with a pool of 10 connections, you will see 10 parses, one of them being a hard parse.
After the statement is closed on a connection, the Oracle driver puts the handle to the parsed statement (shared cursor) into an LRU cache. The next time you call prepareStatement on that connection, it finds this cached handle and does not need to send the SQL at all. This results in an execution with NO PARSE.
If more (different) prepared statements are used on a physical connection than the cache can hold, the least recently used open shared cursor is closed. This results in another soft parse the next time that statement is used, because the SQL has to be sent to the server again.
This is basically the same function as some data sources for middleware have implemented more generically (for example prepared-statement-cache in JBoss). Use only one of both to avoid double caching.
You can find the details here:
Also check out the Oracle Unified Connection Pool (UCP) which supports this and interacts with FAN.
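The caching behaviour described above can be modelled in a few lines of plain Java. This is only a sketch of the LRU idea, not the Oracle driver's implementation: StatementCacheModel is a hypothetical class, it folds "prepare, execute, close" into one call, and it does not distinguish soft from hard parses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a per-connection statement cache with LRU eviction.
final class StatementCacheModel {
    private final Map<String, Integer> cursors;
    private int parses = 0;
    private int nextHandle = 1;

    StatementCacheModel(final int maxStatements) {
        // accessOrder = true makes iteration order "least recently used first".
        this.cursors = new LinkedHashMap<String, Integer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                return size() > maxStatements; // close the least recently used cursor
            }
        };
    }

    // Returns true if the SQL had to be sent to the server (a parse),
    // false if a cached cursor handle was reused (no parse).
    boolean prepareExecuteClose(String sql) {
        if (cursors.get(sql) != null) {
            return false;
        }
        parses++;
        cursors.put(sql, nextHandle++);
        return true;
    }

    int parseCount() {
        return parses;
    }
}
```

With a cache size of 2, running statements A, A, B, C, A causes four parses: the second A is served from the cache, but C pushes A out, so the final A has to be sent to the server again.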
I think that this answers your question: (sorry it is powerpoint but it defines how the prepared statement is sent to Oracle, how Oracle stores it in the Shared SQL pool, processes it, etc). The main performance gain you are getting from Prepared statements is that on the 1+nth run you are avoiding hard parses of the sql statement.
Oracle (or your DB of choice) stores the prepared statement; Java just sends it the same statement text, which the DB matches against what it has stored. This is a limited resource, however: after some time without being queried, shared SQL is purged (especially uncommon queries), and then a re-parse is required, whether or not the statement is cached in your Java application.
