I am using the SPRING jdbcTemplate to extract some data from my DB using the following piece of code :
jdbctemplate.query(SELECT_SQL_QUERY, new RowCallbackHandler(){
public void processRow(resultsetRS) throws SQLException{
// Some operations here
}
});
When I use a select query,I want to know wether this method jdbctemplate.query() loads everything from the database before to process the data or loads row after another ?
I need the answer because I am using two SELECT queries and the second depends of the results of the first one (the results of the operations done on the selected data), means that if the second call loads everything before doing any treatement it won't take into accound the last changes of the first call (because i'm using paralellism in my code).
A SELECT query executed in JDBC returns all the rows as a set.
You have to execute the first query and then execute the second one.
This is the source of (n+1) death by latency.
A better solution might be to do it in one query: You'll only have one network round trip that way.
Related
In my application I am making a query to oracle and getting data this way
<select id="getAll" resultType="com.mappers.MyOracleMapper">
SELECT * FROM "OracleTable"
</select>
I get all the data, the problem is that there is a lot of data and it will take too much time to process all the data at once, since the response from the database will come in 3-4 minutes, this is not convenient.
How to make it so that I receive lines in portions without using the id field (since it does not exist, I do not know why). That is, take the first portion of lines, for example, the first 50, process them and take the next portion. It would be desirable to place a variable in properties that will be responsible for the number of lines in portions.
I can't do this in mybatis. This is new to me. Thanks in advance.
there is such a field and it is unique
OFFSET 10 ROWS
FETCH NEXT 10 ROWS ONLY
don't work, because the version is earlier than 12c
If you want to read millions of rows that's going to take time. It's normal to expect a few minutes to read and receive all the data over the wire.
Now, you have two options:
Use a Cursor
In MyBatis you can read the result of the query using the buffering a cursor gives you. The cursor reads a few hundred rows at a time and your app reads them one by one. Your app doesn't notice that behind the scenes there is buffering. Pretty good. For example, you can do:
Cursor<Client> clients = this.sqlSession.selectCursor("getAll");
for (Client c : clients) {
// process one client
}
Consider that cursors remain open until the end of the transaction. If you close the transaction (or exit the method marked as #Transactional) the cursor won't be usable anymore.
Use Manual Pagination
This solution can work well for the first pages of the result set, but it becomes increasingly inefficient and slooooooow the more you advance in the result set. Use it only as a last resort.
The only case where this strategy can be efficient is when you have the chance of implementing "key set pagination". I assume it's not the case here.
You can modify your query to perform explicit pagination. For example, you can do:
<select id="getPage" resultType="com.mappers.MyOracleMapper">
select * from (
SELECT rownum rnum, x.*
FROM OracleTable
WHERE rownum <= #{endingRow}
ORDER BY id
) x
where rnum >= #{startingRow}
</select>
You'll need to provide the extra parameters startingRow and endingRow.
NOTE: It's imperative you include an ORDER BY clause. Otherwise the pagination logic is meaningless. Choose any ordering you want, preferrably something that is backed up by an existing index.
jdbcTemplate.query(getQuery(id),
rs -> {
if(rs.next()) {
mainDTO.setSim(rs.getString("sim"));
mainDTO.setImei(rs.getString("imei"));
}
});
I use above code fragment to retrieve data from database and getting more than 100 records. But for all the records, sim and imei numbers are same. other fields are different. When executing above code I can get sim and imei number from the first record itself. but query run on all over the records and hence it take more than 3 seconds to complete. here is the problem.
How I can stop retrieving other records after I got the value for sim and imei from the first record. I cant change the sql query as the documentation and need to do the optimization in java code itself.
how can I optimize this to perform within below 100 mills.
You have two choices, either limit using a SQL query or use JdbcTemplate#setMaxRows:
SQL
You need to edit the query including what columns are about to be selected and the table name:
SELECT * FROM table LIMIT 1
JDBC
Use JdbcTemplate#setMaxRows to configure the JdbcTemplate to return up to one row:
jdbcTemplate.setMaxRows(1);
I guess it mimics Statement#setMaxRows.
We are trying to use a simple user-defined function (UDF) in the where clause of a query in Azure Cosmos DB, but it's not working correctly. The end goal is to query all results where the timestamp _ts is greater than or equal to yesterday, but our first step is to get a UDF working.
The abbreviated data looks like this:
[
{
"_ts": 1500000007
}
{
"_ts": 1500000005
}
]
Using the Azure Portal's Cosmos DB Data Explorer, a simple query like the following will correctly return one result:
SELECT * FROM c WHERE c._ts >= 1500000006
A simple query using our UDF incorrectly returns zero results. This query is below:
SELECT * FROM c WHERE c._ts >= udf.getHardcodedTime()
The definition of the function is below:
function getHardcodedTime(){
return 1500000006;
}
And here is a screenshot of the UDF for verification:
As you can see, the only difference between the two queries is that one query uses a hard-coded value while the other query uses a UDF to get a hard-coded value. The problem is that the query using the UDF returns zero results instead of returning one result.
Are we using the UDF correctly?
Update 1
When the UDF is updated to return the number 1, then we get a different count of results each time.
New function:
function getHardcodedTime(){
return 1;
}
New query: SELECT count(1) FROM c WHERE c._ts >= udf.getHardcodedTime()
Results vary with 7240, 7236, 7233, 7264, etc. (This set is the actual order of responses from Cosmos DB.)
By your description, the most likely cause is that the UDF version is slow and returns a partial result with continuation token instead of final result.
The concept of continuation is explained here as:
If a query's results cannot fit within a single page of results, then the REST API returns a continuation token through the x-ms-continuation-token response header. Clients can paginate results by including the header in subsequent results.
The observed count variation could be caused by the query stopping for next page at slightly different times. Check the x-ms-continuation header to know if that's the case.
Why is UDF slow?
UDF returning a constant is not the same as constant for the query. Correct me if you know better, but from CosmosDB side, it does not know that the specific UDF is actually deterministic, but assumes it could evaluate to any value and hence has to be executed for each document to check for a match. It means it cannot use an index and has to do a slow full scan.
What can you do?
Option 1: Follow continuations
If you don't care about the performance, you could keep using the UDF and just follow the continuation until all rows are processed.
Some DocumentDB clients can do this for you (ex: .net API), so this may be the fastest fix if you are in a hurry. But beware, this does not scale (keeps getting slower and cost more and more RU) and you should not consider this a long-term solution.
Option 2: Drop UDF, use parameters
You could pass the hardcodedTime instead as parameter. This way the query execution engine would know the value, could use a matching index and give you correct results without the hassle of continuations.
I don't know which API you use, but related reading just in case: Parameterized SQL queries.
Option 3: Wrap in stored proc
If you really-really must control the hardcodedTime in UDF then you could implement a server-side procedure which would:
query the hardcodedTime from UDF
query the documents, passing the hardcodedTime as parameter
return results as SP output.
It would use the UDF AND index, but brings a hefty overhead in the amount of code required. Do your math if keeping UDF is worth the extra effort in dev and maintenance.
Related documentation about SPs: Azure Cosmos DB server-side programming: Stored procedures, database triggers, and UDFs
I have a product search engine using Coldfusion8 and MySQL 5.0.88
The product search has two display modes: Multiple View and Single View.
Multiple displays basic record info, Single requires additional data to be polled from the database.
Right now a user does a search and I'm polling the database for
(a) total records and
(b) records FROM to TO.
The user always goes to Single view from his current resultset, so my idea was to store the current resultset for each user and not have to query the database again to get (waste a) overall number of records and (waste b) a the single record I already queried before AND then getting the detail information I still need for the Single view.
However, I'm getting nowhere with this.
I cannot cache the current resultset-query, because it's unique to each user(session).
The queries are running inside a CFINVOKED method inside a CFC I'm calling through AJAX, so the whole query runs and afterwards the CFC and CFINVOKE method are discarded, so I can't use query of query or variables.cfc_storage.
So my idea was to store the current resultset in the Session scope, which will be updated with every new search, the user runs (either pagination or completely new search). The maximum results stored will be the number of results displayed.
I can store the query allright, using:
<cfset Session.resultset = query_name>
This stores the whole query with results, like so:
query
CACHED: false
EXECUTIONTIME: 2031
SQL: SELECT a.*, p.ek, p.vk, p.x, p.y
FROM arts a
LEFT JOIN p ON
...
LEFT JOIN f ON
...
WHERE a.aktiv = "ja"
AND
... 20 conditions ...
SQLPARAMETERS: [array]
1) ... 20+ parameters
RESULTSET:
[Record # 1]
a: true
style: 402
price: 2.3
currency: CHF
...
[Record # 2]
a: true
style: 402abc
...
This would be overwritten every time a user does a new search. However, if a user wants to see the details of one of these items, I don't need to query (total number of records & get one record) if I can access the record I need from my temp storage. This way I would save two database trips worth 2031 execution time each to get data which I already pulled before.
The tradeoff would be every user having a resultset of up to 48 results (max number of items per page) in Session.scope.
My questions:
1. Is this feasable or should I requery the database?
2. If I have a struture/array/object like a the above, how do I pick the record I need out of it by style number = how do I access the resultset? I can't just loop over the stored query (tried this for a while now...).
Thanks for help!
KISS rule. Just re-query the database unless you find the performance is really an issue. With the correct index, it should scales pretty well. When the it is an issue, you can simply add query cache there.
QoQ would introduce overhead (on the CF side, memory & computation), and might return stale data (where the query in session is older than the one on DB). I only use QoQ when the same query is used on the same view, but not throughout a Session time span.
Feasible? Yes, depending on how many users and how much data this stores in memory, it's probably much better than going to the DB again.
It seems like the best way to get the single record you want is a query of query. In CF you can create another query that uses an existing query as it's data source. It would look like this:
<cfquery name="subQuery" dbtype="query">
SELECT *
FROM Session.resultset
WHERE style = #SelectedStyleVariable#
</cfquery>
note that if you are using CFBuilder, it will probably scream Error at you for not having a datasource, this is a bug in CFBuilder, you are not required to have a datasource if your DBType is "query"
Depending on how many records, what I would do is have the detail data stored in application scope as a structure where the ID is the key. Something like:
APPLICATION.products[product_id].product_name
.product_price
.product_attribute
Then you would really only need to query for the ID of the item on demand.
And to improve the "on demand" query, you have at least two "in code" options:
1. A query of query, where you query the entire collection of items once, and then query from that for the data you need.
2. Verity or SOLR to index everything and then you'd only have to query for everything when refreshing your search collection. That would be tons faster than doing all the joins for every single query.
I've the following problem with EJB and Oracle database.
I've some native SQL query deployed in Oc4j that returns more than 21k rows from Oracle DB. when I run the query against Oracle DB I get JOOM (out of memory) exception.
And because the requirements was to include pagination for the result set, so we decided to use em.setMaxResult, em.setFirstResult to return only 10 rows a time.
Using the EntityManager to implement the pagination put us in some problem As Later, it was required to sort the result returned, but the whole result not just the 10 rows returned by setMaxResult()! We found that, to put the clause ORDER BY xxxx in the native query makes the query performance became too bad.
So, we are considering doing the pagination in the Database layer (using Oracle rownum or any other technique).
Later, we recognized that, If we use em.clear() we might be able to avoid the JOOM exception by making something like:
define the result list
while database has more records
{
use entityManager get next 10 records and add them to the result list
entityManager.clear();
}
return result list
So, we could implement the paging on the Servlet side (using session.getAttribute("all_result").sublist(from, to)) and thus we can do the sort using Java as opposite to SQL sort.
Provided code for pagination, might help you.
em.createQuery("Select o from Entity o where o.id > :lastId order by o.id");
query.setParameter("lastId", previousResult.get(previousResult.size()-1).getId());
query.setMaxResults(10);