Neo4j counts all nodes of one label very slowly - performance

I have a node label named "Event" under which I store all events performed by my web portal.
Now I have 1,500,000 Events!
So when I count the number of Events, I run this query:
MATCH (e:Event) RETURN count(e) AS numberOfEvent
But it's extremely slow: 25,000 ms!
The same query in a classical RDBMS like Postgres executes in 200 ms!
Is this normal, or is my query incorrectly written?
Regards
Olivier

Is that the first time you run the query? Can you run it again?
I think Postgres stores the total count, while Neo4j will actually load the data.
So you are measuring your disk speed for loading the data.
We will work on improving that by using our database statistics internally for this kind of query.
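One way to check whether the first run was paying the cost of a cold cache is to time the same query twice in a row. Here is a minimal sketch, assuming the Neo4j Java driver and a local Bolt endpoint; the URI and credentials are placeholders:

    import org.neo4j.driver.AuthTokens;
    import org.neo4j.driver.Driver;
    import org.neo4j.driver.GraphDatabase;
    import org.neo4j.driver.Session;

    public class CountTiming {
        public static void main(String[] args) {
            // Placeholder URI and credentials - adjust for your installation.
            try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                    AuthTokens.basic("neo4j", "password"));
                 Session session = driver.session()) {
                // Run the count twice: the first run may load data from disk,
                // the second should be served from the page cache.
                for (int i = 1; i <= 2; i++) {
                    long start = System.nanoTime();
                    long count = session.run("MATCH (e:Event) RETURN count(e) AS numberOfEvent")
                            .single().get("numberOfEvent").asLong();
                    long ms = (System.nanoTime() - start) / 1_000_000;
                    System.out.printf("Run %d: %d events in %d ms%n", i, count, ms);
                }
            }
        }
    }

If the second run is dramatically faster, the original 25,000 ms was dominated by loading data from disk rather than by the query itself.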

Related

Fetching rows with Snowflake JDBC while the query is running on the server

I have a complex query that runs a long time (e.g. 30 minutes) in Snowflake when I run it in the Snowflake console. I am making the same query from a JVM application using the JDBC driver. What appears to happen is this:
Snowflake processes the query from start to finish, taking 30 minutes.
JVM application receives the rows. The first receive happens 30 minutes after the query started.
What I'd like to happen is that Snowflake starts to send rows to my application while it is still executing the query, as soon as data is ready. This way my application could start processing the rows in the first 30 minutes.
Is this possible with Snowflake and JDBC?
First of all, I would suggest checking the Snowflake warehouse size and doing some tuning. It's not worth waiting 30 minutes when, by resizing the warehouse, the query time can be reduced to a quarter or less. By doing either of the below, your cost will be almost the same or lower, since query execution time shrinks roughly linearly as you increase the warehouse size. Refer to the link.
Scale up by resizing a warehouse.
Scale out by adding clusters to a warehouse (requires Snowflake Enterprise Edition or higher).
Now coming to JDBC, I believe it behaves the same way as it does for other databases.
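For reference, the usual JDBC pattern looks like the sketch below, assuming the Snowflake JDBC driver is on the classpath; the account URL, credentials, and table name are placeholders. Note that executeQuery blocks until Snowflake finishes the statement; the fetch size only controls how result rows are batched to the client afterwards, so rows do not arrive while the query is still executing:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.util.Properties;

    public class SnowflakeFetch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("user", "MY_USER");          // placeholder credentials
            props.put("password", "MY_PASSWORD");
            props.put("warehouse", "MY_WH");

            // Placeholder account URL.
            String url = "jdbc:snowflake://myaccount.snowflakecomputing.com/";
            try (Connection conn = DriverManager.getConnection(url, props);
                 Statement stmt = conn.createStatement()) {
                stmt.setFetchSize(1000); // batching of the finished result set

                long start = System.currentTimeMillis();
                // Blocks until the server has finished executing the query.
                try (ResultSet rs = stmt.executeQuery("SELECT * FROM my_big_table")) {
                    System.out.println("First rows available after "
                            + (System.currentTimeMillis() - start) + " ms");
                    while (rs.next()) {
                        // process each row as it is transferred to the client
                    }
                }
            }
        }
    }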

Improving query execution time

I am working with Spring Data Mongo. I have around 2,000 documents stored (this will probably reach 10,000 in the upcoming 2-3 months), and I would like to extract them all. However, the query takes around 2.5 seconds, which is pretty bad in my opinion. I am using MongoRepository's default findAll().
I tried increasing the cursor batch size to 500, 1,000, and 2,000 without much improvement (the best result was 2.13 seconds).
Currently I'm using a workaround - I store the documents in a different collection used as a cache; extracting that data takes around 0.25 seconds. But I would like to figure out how to fix the original query's execution time.
I would like the query to return in less than 1 second; less is even better.
Without knowing the exact details, I cannot recommend a specific method.
But for data-selection queries, indexing will help you.
Please try indexing the collection.
https://docs.mongodb.com/manual/indexes/
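With Spring Data Mongo, an index can be declared directly on the document class. Here is a minimal sketch, assuming a hypothetical Customer document and a field that queries actually filter or sort on (an index does not by itself speed up an unfiltered findAll()):

    import org.springframework.data.annotation.Id;
    import org.springframework.data.mongodb.core.index.Indexed;
    import org.springframework.data.mongodb.core.mapping.Document;

    // Hypothetical document class for illustration.
    @Document(collection = "customers")
    public class Customer {

        @Id
        private String id;

        // Declares a single-field index on "name", so queries that
        // filter or sort by name can avoid a full collection scan.
        @Indexed
        private String name;

        // getters and setters omitted for brevity
    }

Depending on the Spring Data version, automatic index creation may need to be enabled explicitly (e.g. spring.data.mongodb.auto-index-creation=true in Spring Boot), or the index can be created once programmatically via mongoTemplate.indexOps(Customer.class).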

How to get the network cost in one MariaDB query using JDBC?

When I query using HeidiSql, the console will give the info like this:
/* Affected rows: 0 Found rows: 2,632,206 Warnings: 0 Duration for 1 query: 0.008 sec. (+ 389.069 sec. network) */
I want to use JDBC to do the performance testing on our database.
So distinguishing the network cost from the actual query cost is important in my case.
How to get the network cost in one MariaDB query using JDBC? Is it possible?
In HeidiSQL, I am defining the query duration as the time which mysql_real_query took to execute.
That "network" duration is the time which mysql_store_result takes afterwards.
See also:
https://mariadb.com/kb/en/mariadb/mysql_real_query/
https://mariadb.com/kb/en/mariadb/mysql_store_result/
I guess JDBC has methods similar to the C API's, so the above-mentioned logic from HeidiSQL should be easy to adapt.
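Translated to JDBC, that means timing statement execution separately from result transfer. Below is a minimal sketch, assuming MariaDB Connector/J and a placeholder connection URL. Note that by default the driver may buffer the whole result set during executeQuery, so setting a streaming fetch size is what makes the two timings meaningful:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class QueryVsNetworkTiming {
        public static void main(String[] args) throws Exception {
            // Placeholder URL and credentials.
            String url = "jdbc:mariadb://localhost:3306/mydb?user=me&password=secret";
            try (Connection conn = DriverManager.getConnection(url);
                 Statement stmt = conn.createStatement()) {
                // Hint the driver to stream rows instead of buffering the
                // whole result set inside executeQuery().
                stmt.setFetchSize(1000);

                long t0 = System.nanoTime();
                try (ResultSet rs = stmt.executeQuery("SELECT * FROM big_table")) {
                    long afterExecute = System.nanoTime();
                    int rows = 0;
                    while (rs.next()) {
                        rows++; // consuming all rows is the transfer/network part
                    }
                    long afterFetch = System.nanoTime();
                    System.out.printf("execute: %d ms, fetch %d rows: %d ms%n",
                            (afterExecute - t0) / 1_000_000,
                            rows,
                            (afterFetch - afterExecute) / 1_000_000);
                }
            }
        }
    }

The first timing roughly corresponds to mysql_real_query, the second to mysql_store_result in the HeidiSQL measurement described above.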

A map that sometimes runs fast, and sometimes slow

I built up a map containing this logic:
SOURCES -> SORTER -> AGG(FIRST BY GROUP) -> 2 LOOKUPS -> FILTER -> TARGET
Now, when I manually run the query generated by the sources, adding the 2 lookups with a LEFT JOIN and sorting, the query takes about 30 seconds.
I ran the same map in my DEV environment to try to debug it, but suddenly it ran in 2 minutes (connected to the same connection as in PRODUCTION, and the map is truncate/insert).
I looked up the history of this session, and its running time ranges from 6 minutes up to an hour or more, with the same amount of data every day!
I've tried adding statistics/increasing the commit interval but nothing seems to help.
Any suggestions?
Thanks in advance.
First thing: the query from the source (with lookups) returning data within 30 seconds doesn't mean you will get all the data within 30 seconds. The SQL client tool shows only the first 50 to 500 records; extracting the complete data set may need more time.
Now, I don't see many reasons for slowness. Here are my thoughts:
Did you find any pattern to the slowness, like during month end or month start? Mainly, the source and lookup data (if the lookups are tables) may be the reason for slowness. When a table's size varies rapidly, or the table is not analyzed, or it undergoes lots of delete/load operations, its cost varies and the SQL becomes slower. Make sure stats are gathered periodically for the lookup and source tables.
Maybe some other operation, running in parallel to your map, is eating up all your resources, so the map takes an hour to complete.
How much data does it process - thousands, millions, or billions of rows? Depending on that, you can re-arrange the map like this to improve performance: source > source qualifier > lookup > filter > sorter + aggregator > target.

CouchDB query performance

Does querying data get slower in CouchDB as the number of documents grows?
Example Scenario:
I have a combobox in a form for the customer name. When the user types the customer name, I have to autofill it.
There will be around 10k customer documents in CouchDB. I understand that I have to create a view to do this.
CouchDB database is in the local machine where the application resides.
Question:
Will it take more than 2-3 seconds to query the DB for matching customer names?
Will each query take more time if there are many documents in CouchDB (say around 100,000 documents)?
Any pointers on how to create views/indexes would be helpful.
Thanks in advance.
The view runs on every document, but only once. After that, the document's view value(s) are stored forever. Fetching a customer by name will be very fast because you would normally have only a few new documents to process in the view at query time.
Query time will not increase noticeably if you have more documents. Technically, access times grow logarithmically with the number of documents. However, in practice fetching documents is basically constant time and very unlikely to be a problem.
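To make this concrete, an autofill view can emit the customer name as the key and then be queried with a startkey/endkey prefix range. Here is a minimal sketch, assuming a local CouchDB, a database named customers, and the JDK's java.net.http client; the design document name and the doc.name field are placeholders:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class CustomerAutofill {
        private static final HttpClient CLIENT = HttpClient.newHttpClient();
        private static final String DB = "http://localhost:5984/customers";

        public static void main(String[] args) throws Exception {
            // One-time setup: a design document whose map function emits
            // the customer name as the view key.
            String designDoc = """
                {"views": {"by_name": {
                    "map": "function(doc) { if (doc.name) { emit(doc.name, null); } }"
                }}}""";
            HttpRequest put = HttpRequest.newBuilder(URI.create(DB + "/_design/app"))
                    .header("Content-Type", "application/json")
                    .PUT(HttpRequest.BodyPublishers.ofString(designDoc))
                    .build();
            CLIENT.send(put, HttpResponse.BodyHandlers.ofString());

            // Prefix search: keys from "Smi" up to "Smi\ufff0" match all names
            // starting with "Smi" (\ufff0 is the conventional high sort key).
            // %22 is a URL-encoded quote; %EF%BF%B0 encodes \ufff0.
            String query = DB + "/_design/app/_view/by_name"
                    + "?startkey=%22Smi%22&endkey=%22Smi%EF%BF%B0%22&limit=10";
            HttpRequest get = HttpRequest.newBuilder(URI.create(query)).build();
            HttpResponse<String> resp =
                    CLIENT.send(get, HttpResponse.BodyHandlers.ofString());
            System.out.println(resp.body()); // JSON rows with matching names
        }
    }

Because the view's B-tree is keyed on the name, this range read touches only the few matching entries, which is why query time stays roughly constant as the document count grows.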
