I'm using DBeaver to connect to an Oracle database. Connecting to the database and viewing table properties work fine without any delay, but fetching table data is very slow (sometimes around 50 seconds).
Any settings to speed up fetching table data in DBeaver?
Changing the following setting in your Oracle DB connection will make fetching table data faster than leaving it unset.
Right-click on your DB connection --> Edit Connection --> Oracle properties --> tick 'Use RULE hint for system catalog queries'
(by default this is not set)
UPDATE
In the newer version (21.0.0) of DBeaver, many more performance options appear here. Turning them on significantly improved performance for me.
I've never used DBeaver, but I often see applications that use too small an "array fetch size"**, which can cause slow fetches.
** Array fetch size note:
As per the Oracle documentation, the Fetch Buffer Size is an application-side memory setting that affects the number of rows returned by a single fetch. Generally you balance the number of rows returned by a single fetch (a.k.a. the array fetch size) against the total number of rows that need to be fetched.
A low array fetch size compared to the number of rows needed to be returned will manifest as delays from increased network and client side processing needed to process each fetch (i.e. the high cost of each network round trip [SQL*Net protocol]).
If this is the case, you will likely see very high waits on "SQL*Net message from client" [in gv$session or elsewhere].
SQL*Net message from client
This wait event is posted by the session when it is waiting for a message from the client to arrive. Generally, this means that the session is just sitting idle; however, in a client/server environment it could also mean that either the client process is running slow or there are network latency delays. The database performance is not degraded by high wait times for this wait event.
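DBeaver talks to Oracle through a JDBC driver, so the same knob shows up in plain JDBC code. Here is a minimal sketch of what raising the array fetch size looks like at the JDBC level; the connection string, credentials, and table name are placeholders, not taken from the original post:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FetchSizeDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details -- substitute your own.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1", "scott", "tiger");
             Statement stmt = conn.createStatement()) {

            // Oracle's JDBC default is 10 rows per round trip; raising it means a
            // large result set needs far fewer SQL*Net round trips.
            stmt.setFetchSize(500);

            long start = System.currentTimeMillis();
            int rows = 0;
            try (ResultSet rs = stmt.executeQuery("SELECT * FROM some_big_table")) {
                while (rs.next()) {
                    rows++;  // a new round trip only happens when the buffered rows run out
                }
            }
            System.out.printf("Fetched %d rows in %d ms%n",
                    rows, System.currentTimeMillis() - start);
        }
    }
}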
Related
I am curious about ways to tune bulk inserts via Apache NiFi for better speed, and whether a different driver or other configurations could speed up the process. Any input or references to resources would be greatly appreciated!
This is my current flow, with configurations included in pictures. The source DB is Oracle and the destination DB is IBM Db2 for z/OS:
I think you have a few things working against you:
You probably have low concurrency set on the PutDatabaseRecord processor.
You have a very large fetch size.
You have a very large record-per-flowfile count.
From what I've read in the past, the fetch size controls how many records will be pulled from the query's remote result in each iteration. So in your case, it has to pull 100k records before it will even register data being ready. Try dropping it down to 1k records for the fetch and experiment with 100-1000 records per flowfile.
If you're bulk inserting that flowfile, you're also sending over 100k inserts at once.
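NiFi's source processors (e.g. ExecuteSQL or QueryDatabaseTable) and PutDatabaseRecord handle the JDBC plumbing internally, but the two knobs above map onto plain JDBC concepts. The following hand-written sketch only illustrates the trade-off; the table names, columns, and connection strings are made up for the example:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class OracleToDb2Copy {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details for illustration only.
        try (Connection src = DriverManager.getConnection(
                     "jdbc:oracle:thin:@//orahost:1521/SRC", "user", "pw");
             Connection dst = DriverManager.getConnection(
                     "jdbc:db2://db2host:446/DSNLOC", "user", "pw")) {

            dst.setAutoCommit(false);
            try (Statement read = src.createStatement();
                 PreparedStatement write = dst.prepareStatement(
                         "INSERT INTO TARGET_TBL (ID, PAYLOAD) VALUES (?, ?)")) {

                // Roughly NiFi's "Fetch Size": rows pulled from Oracle per round trip.
                read.setFetchSize(1000);

                int pending = 0;
                try (ResultSet rs = read.executeQuery(
                        "SELECT id, payload FROM source_tbl")) {
                    while (rs.next()) {
                        write.setLong(1, rs.getLong(1));
                        write.setString(2, rs.getString(2));
                        write.addBatch();

                        // Roughly NiFi's records-per-flowfile / batch size:
                        // rows sent to Db2 in one executeBatch + commit.
                        if (++pending == 1000) {
                            write.executeBatch();
                            dst.commit();
                            pending = 0;
                        }
                    }
                }
                if (pending > 0) {        // flush the remaining rows
                    write.executeBatch();
                    dst.commit();
                }
            }
        }
    }
}

In NiFi the same balance applies: smaller flowfiles give PutDatabaseRecord smaller units of work that its concurrent tasks can actually run in parallel.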
We have created some new SSAS Tabular models which fetch data directly from Oracle. But after some testing, we found that with real customer data (a few million rows), the processing times climb to nearly 4 hours. Our goal is to keep them under about 15 minutes (due to existing system performance requirements). We fetch from plain Oracle tables, so query performance is not the bottleneck.
Are there any general design guides/best practices to handle such a scenario?
Check your application side array fetch size as you could be experiencing network latency.
** Array fetch size note:
As per the Oracle documentation, the Fetch Buffer Size is an application-side memory setting that affects the number of rows returned by a single fetch. Generally, you balance the number of rows returned by a single fetch (a.k.a. the array fetch size) against the total number of rows that need to be fetched.
A low array fetch size compared to the number of rows needed to be returned will manifest as delays from increased network and client side processing needed to process each fetch (i.e. the high cost of each network round trip [SQL*Net protocol]).
If this is the case, on the Oracle side you will likely see very high waits on “SQL*Net message from client”. [This wait event is posted by the session when it is waiting for a message from the client to arrive. Generally, this means that the session is just sitting idle, however, in a Client/Server environment it could also mean that either the client process is running slow or there are network latency delays. The database performance is not degraded by high wait times for this wait event.]
As I like to say: "SQL*Net is a chatty protocol"; so even though Oracle may be done with its processing of the query, excessive network round trips result in slower response times on the client side. You should suspect that a low array fetch size is contributing to the slowness if the elapsed time to get the data into the application is much longer than the elapsed time for the DB to run the SQL; in that case app-side processing time can also be a factor [you can look into app-specific ways to troubleshoot/tune app-side processing].
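To put rough numbers on the chattiness (illustrative figures, not from the original case): fetching 1,000,000 rows with an array fetch size of 10 takes about 100,000 round trips, so at even 1 ms of network latency per round trip that is roughly 100 seconds spent purely on the wire. At an array fetch size of 1,000 the same result set needs only about 1,000 round trips, or roughly 1 second of wire time.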
Array fetch size is not an attribute of the Oracle account nor is it an Oracle side session setting. Array fetch size can only be set at the client; there is no DB setting for the array fetch size the client will use. Every client application has a different mechanism for specifying the array fetch size:
Informatica: ?? config. file param ??? setting at the connection or result set level ??
Cognos: http://www-01.ibm.com/support/docview.wss?uid=swg21981559
SQL*Plus: set arraysize n
Java/JDBC: setFetchSize(int rows) /* method in Statement, PreparedStatement, CallableStatement, and ResultSet objects */ or the Properties object put method "defaultRowPrefetch" (see the JDBC sketch after this list)
http://download.oracle.com/otn_hosted_doc/jdeveloper/905/jdbc-javadoc/oracle/jdbc/OracleDriver.html (another link on Oracle JDBC defaultRowPrefetch)
http://www.oracle.com/technetwork/database/enterprise-edition/jdbc-faq-090281.html
.NET: per the Oracle .NET Developer's Guide, the FetchSize property represents the total memory size in bytes that ODP.NET allocates to cache the data fetched from a database round-trip. The FetchSize property can be set on the OracleCommand, OracleDataReader, or OracleRefCursor object, depending on the situation. It controls the fetch size for filling a DataSet or DataTable using an OracleDataAdapter.
ODBC driver: ?? something like: SetRowsetSize
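For the Java/JDBC case, the two mechanisms above can be combined: defaultRowPrefetch sets a connection-wide default and setFetchSize() overrides it per statement or result set. A minimal sketch (connection details and table name are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

public class PrefetchDefaults {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("user", "scott");        // placeholder credentials
        props.setProperty("password", "tiger");
        // Connection-wide default array fetch size (Oracle JDBC connection property);
        // the driver's built-in default is 10 rows per round trip.
        props.setProperty("defaultRowPrefetch", "200");

        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1", props);
             Statement stmt = conn.createStatement()) {

            // Per-statement override of the connection-wide default.
            stmt.setFetchSize(1000);

            try (ResultSet rs = stmt.executeQuery("SELECT * FROM big_table")) {
                while (rs.next()) {
                    // process rows; a new round trip happens roughly every 1000 rows
                }
            }
        }
    }
}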
I'd like to save planner cost by using a plan cache, since the ORCA/legacy optimizer can take dozens of milliseconds.
I think Greenplum caches query plans at the session level; when the session ends, other sessions cannot share the analyzed plan. What's more, we can't keep a session always on, since the GP system will not release resources until the TCP connection is disconnected.
Most major databases cache plans after the first run and reuse them across connections.
So, is there any switch that turns on query plan caching across connections? Also, within a session I can see that the client-side timing statistics do not match the "Total time" the planner gives.
Postgres can cache plans as well, but on a per-session basis; once the session ends, the cached plan is thrown away. This can be tricky to optimize/analyze, but it is generally of less importance unless the query you are executing is really complex and/or there are a lot of repeated queries.
The documentation explains those in detail pretty well. We can query pg_prepared_statements to see what is cached. Note that it is not available across sessions and visible only to the current session.
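If you want to see what is cached in your own session, a small JDBC sketch against Postgres/Greenplum makes the session-local behaviour visible (connection details and the orders table are placeholders). The PREPARE here is the SQL-level statement; open a second connection and the view comes back empty, since the cache is per session:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PreparedStatementCache {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://gpmaster:5432/mydb", "gpadmin", "secret");
             Statement stmt = conn.createStatement()) {

            // SQL-level PREPARE: the prepared statement lives only in this session.
            stmt.execute("PREPARE get_order (int) AS "
                       + "SELECT * FROM orders WHERE order_id = $1");

            // pg_prepared_statements only shows this session's entries.
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT name, statement, from_sql FROM pg_prepared_statements")) {
                while (rs.next()) {
                    System.out.printf("%s | from_sql=%s%n",
                            rs.getString("name"), rs.getBoolean("from_sql"));
                }
            }
        }
    }
}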
When a user starts a session with Greenplum Database and issues a query, the system creates groups or 'gangs' of worker processes on each segment to do the work. After the work is done, the segment worker processes are destroyed except for a cached number which is set by the gp_cached_segworkers_threshold parameter.
A lower setting conserves system resources on the segment hosts, but a higher setting may improve performance for power-users that want to issue many complex queries in a row.
Also see gp_max_local_distributed_cache.
Obviously, the more you cache, the less memory there will be available for other connections and queries. Perhaps not a big deal if you are only hosting a few power users running concurrent queries... but you may need to adjust your gp_vmem_protect_limit accordingly.
For clarification:
Segment resources are released after the gp_vmem_idle_resource_timeout.
Only the master session will remain until the TCP connection is dropped.
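If you want to check how your cluster is currently configured, the GUCs mentioned above can be read from any session. A quick sketch over JDBC; the connection details are placeholders and the parameter list just echoes the settings named in this answer:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class GreenplumSettings {
    public static void main(String[] args) throws Exception {
        String[] gucs = {
                "gp_cached_segworkers_threshold",
                "gp_max_local_distributed_cache",
                "gp_vmem_idle_resource_timeout",
                "gp_vmem_protect_limit"
        };
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://gpmaster:5432/mydb", "gpadmin", "secret");
             Statement stmt = conn.createStatement()) {
            for (String guc : gucs) {
                // SHOW returns the current value of a server configuration parameter.
                try (ResultSet rs = stmt.executeQuery("SHOW " + guc)) {
                    if (rs.next()) {
                        System.out.println(guc + " = " + rs.getString(1));
                    }
                }
            }
        }
    }
}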
I was reading some interesting stuff about JDBC pre-fetch size, but I cannot find any answer to a few questions:
The Java app I'm working on is designed to fetch rows from cursors opened and returned by functions within PL/SQL packages. I was wondering whether the pre-fetch default setting of the JDBC driver actually affects the fetching process or not, given that the SQL statements are parsed and the cursors opened within the Oracle database. I tried setting the fetch size in the JBoss configuration file and printing the fetch size read back from the statement. The new value (100, just for testing purposes) was returned, but I see no difference in how the application performs.
I also read that pre-fetching enhances performance by reducing the number of round trips between the client and the database server, but how can I measure the number of round trips in order to verify and quantify the actual benefit I could get by tuning the pre-fetch size?
Yes, the Oracle JDBC thin driver will use the configured prefetch size when fetching from any cursor, whether the cursor was opened by the client or from within a stored procedure.
The easiest way to count the round trips is to look at the SQL*Net trace. You can turn on SQL*Net tracing on the server side by adding trace_level_server = 16 to your sqlnet.ora file (on the server, because JDBC thin doesn't use sqlnet.ora on the client). Each foreground process will then dump its network traffic into a trace file. You can then see the network packets exchanged with the client and count the round trips. By default the driver fetches rows 10 by 10, but since you have increased the fetch size to 100 it should fetch up to that many rows in one single round trip.
Note that unless your client is far away from your server (significant ping time), the cost of a round trip won't be high, and unless you're fetching a very large number of rows (tens of thousands) you won't see much difference in performance from increasing the fetch size. The default of 10 usually works fine for most OLTP applications. If your client is far away, you can also consider increasing the SDU size (the maximum size of a SQL*Net packet). The default is 8k, but you can increase it up to 2MB in 12.2.
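For the specific case in the question (cursors opened inside PL/SQL packages), the fetch size can also be set directly on the ResultSet you get back from the REF CURSOR out-parameter. A minimal sketch, assuming a hypothetical package function my_pkg.get_rows that returns a SYS_REFCURSOR and placeholder connection details:

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

import oracle.jdbc.OracleTypes;

public class RefCursorFetch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1", "scott", "tiger");
             CallableStatement cs = conn.prepareCall("{ ? = call my_pkg.get_rows() }")) {

            cs.registerOutParameter(1, OracleTypes.CURSOR);
            cs.execute();

            try (ResultSet rs = (ResultSet) cs.getObject(1)) {
                // Applies to this cursor even though it was opened inside PL/SQL:
                // up to 100 rows now come back per SQL*Net round trip.
                rs.setFetchSize(100);
                while (rs.next()) {
                    // process rows
                }
            }
        }
    }
}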
We have a TDBGrid connected to a TClientDataSet via a TDataSetProvider in Delphi 7 with an Oracle database.
It works fine for showing the contents of small tables, but the program hangs when you try to open a table with many rows (for example 2 million rows), because TClientDataSet tries to load the whole table into memory.
I tried setting "FetchOnDemand" to True on our TClientDataSet and "poFetchDetailsOnDemand" to True in the Options of the TDataSetProvider, but it does not solve the problem. Any ideas?
Update:
My solution is:
TClientDataSet.FetchOnDemand = True
TDataSetProvider.Options.poFetchDetailsOnDemand = True
TClientDataSet.PacketRecords = 500
I solved the problem by setting the "PacketRecords" property of TCustomClientDataSet. This property indicates the number or type of records in a single data packet. PacketRecords defaults to -1, meaning that a single packet should contain all records in the dataset, but I changed it to 500 rows.
When working with an RDBMS, and especially with large datasets, trying to access a whole table is exactly what you shouldn't do. That's a typical newbie mistake, or a habit borrowed from old file-based small database engines.
When working with an RDBMS, you should load only the rows you're interested in, display/modify/update/insert them, and send the changes back to the database. That means a SELECT with a proper WHERE clause and also an ORDER BY - remember that row ordering is never assured when you issue a SELECT without an ORDER BY; a database engine is free to retrieve rows in whatever order it sees fit for a given query.
If you have to perform bulk changes, you need to do them in SQL and have them processed on the server, not load a whole table client side, modify it, and send changes row by row to the database.
Loading large datasets client side may fail for several reasons: lack of memory (especially in 32-bit applications), memory fragmentation, and so on. You will probably flood the network with data you don't need, force the database to perform a full scan, and maybe flood the database cache as well.
Client datasets are therefore not designed to handle millions or billions of rows. They are designed to cache the rows you need client side and then apply the changes back to the remote data. You need to change your application logic.