I implemented a JDBC test plan against the database behind my web server (a server I built myself). When I run a simple request from the JMeter client (e.g. SELECT * FROM link d WHERE d.link LIKE '%com%'), JMeter's CPU usage goes high (90-100%) and stays there for a long time (~5 minutes), even though I set my test plan to 6 seconds. On the server side the CPU is only high for a short time, 5-7 seconds (I think that is how long the query takes in the database). I tried increasing the HEAP in jmeter.bat to more than 1024m, but it didn't help.
Can you help me to solve this problem?
I'd run EXPLAIN PLAN on that SQL query. You're likely to see a TABLE SCAN because of the way you wrote the WHERE clause: a leading wildcard (LIKE '%com%') prevents the database from using an index on that column. A full scan takes a lot of time, and more as your table grows, because it has to examine each and every record.
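The exact syntax depends on which database you are using. Assuming something MySQL-compatible (where EXPLAIN returns the plan as an ordinary result set), a rough sketch of checking the plan over JDBC could look like this; the connection URL, credentials, and column names below are placeholders for illustration:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ExplainCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; adjust for your server.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:mysql://localhost:3306/test", "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                 "EXPLAIN SELECT * FROM link d WHERE d.link LIKE '%com%'")) {
            while (rs.next()) {
                // type=ALL together with key=null means a full table scan.
                System.out.println("type=" + rs.getString("type")
                        + " key=" + rs.getString("key")
                        + " rows=" + rs.getString("rows"));
            }
        }
    }
}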
I have a complex query that takes a long time to run (e.g. 30 minutes) in Snowflake when I run it in the Snowflake console. I am making the same query from a JVM application using the JDBC driver. What appears to happen is this:
Snowflake processes the query from start to finish, taking 30 minutes.
JVM application receives the rows. The first receive happens 30 minutes after the query started.
What I'd like to happen is that Snowflake starts to send rows to my application while it is still executing the query, as soon as data is ready. This way my application could start processing the rows in the first 30 minutes.
Is this possible with Snowflake and JDBC?
First of all, I would suggest checking the Snowflake warehouse size and tuning it. It's not worth waiting 30 minutes when resizing the warehouse can cut the query time to a quarter or less. With either of the options below, your cost will be almost the same or lower, because query execution time is reduced roughly linearly as you increase the warehouse size (see the sketch after this list). Refer to the Snowflake documentation on warehouse scaling:
Scale up by resizing a warehouse.
Scale out by adding clusters to a warehouse (requires Snowflake Enterprise Edition or higher).
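Both options are single statements you can issue before running the heavy query. A minimal sketch over JDBC, where the warehouse name, size, cluster counts, and connection details are all placeholder assumptions:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ResizeWarehouse {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:snowflake://myaccount.snowflakecomputing.com/", "user", "password");
             Statement stmt = conn.createStatement()) {
            // Scale up: a larger warehouse generally shortens a single query's execution time.
            stmt.execute("ALTER WAREHOUSE my_wh SET WAREHOUSE_SIZE = 'LARGE'");
            // Scale out (Enterprise Edition or higher): more clusters help with concurrent queries.
            stmt.execute("ALTER WAREHOUSE my_wh SET MIN_CLUSTER_COUNT = 1 MAX_CLUSTER_COUNT = 3");
        }
    }
}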
Now coming to JDBC, I believe it behaves the same way here as it does for other databases: the driver only starts handing rows to your application once the statement has finished executing on the server.
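A minimal sketch of what that looks like in practice: executeQuery() returns only after the query completes, and setFetchSize() is just a hint for how the finished result set is pulled back in batches (the connection details and table name below are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SnowflakeFetch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:snowflake://myaccount.snowflakecomputing.com/", "user", "password");
             Statement stmt = conn.createStatement()) {
            // Hint: pull the completed result set in chunks of 10,000 rows per round trip.
            stmt.setFetchSize(10000);
            // executeQuery() blocks until the query has finished on the server.
            try (ResultSet rs = stmt.executeQuery("SELECT * FROM my_large_table")) {
                while (rs.next()) {
                    // process each row as it is fetched from the result set
                }
            }
        }
    }
}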
I'm reading data from a table in Sybase using a Table Input step. The query is really simple:
SELECT person_ref, displayname FROM person
That table has about 2 million rows. I'm connecting to Sybase ASE 12. My user has read-only rights. PDI is using the jconnect driver with the following options:
IMPLICIT_CURSOR_FETCH_SIZE=5000
SELECT_OPENS_CURSOR=True
I've also tried using the noholdlock option on that query to change the isolation level.
The problem is that the query seems to remain idle for a long time, nearly a minute. PDI indicates that the step is in idle state for that time and then changes to Running. This makes it hard to measure the time the process takes, because PDI won't start measuring time until the steps change state from idle.
I can't seem to find anything in the manuals, or any option that will speed up the read time by decreasing or eliminating this idle time. Is there an option I'm missing? Does the idle status mean that PDI is just waiting for a response from Sybase?
Maybe your query just takes a long time to retrieve the data.
The latency is in the JDBC architecture: the query is sent to the database, which stores the result rows in a buffer, and only when that buffer is full is the data transferred back to PDI. Until PDI receives some data, the Table Input step stays in idle mode.
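If you want to confirm that the wait is on the Sybase/jConnect side rather than inside PDI, you can time the same query with plain JDBC outside PDI. A rough sketch; the driver class shown is for jConnect 7 (jconn4.jar), and the host, port, database, and credentials are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

public class SybaseFetchTest {
    public static void main(String[] args) throws Exception {
        Class.forName("com.sybase.jdbc4.jdbc.SybDriver"); // adjust for your jConnect version
        Properties props = new Properties();
        props.put("USER", "readonly_user");
        props.put("PASSWORD", "secret");
        // Same options the PDI connection uses (see the question above).
        props.put("IMPLICIT_CURSOR_FETCH_SIZE", "5000");
        props.put("SELECT_OPENS_CURSOR", "true");

        long start = System.currentTimeMillis();
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:sybase:Tds:dbhost:5000/mydb", props);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                 "SELECT person_ref, displayname FROM person")) {
            long firstRow = 0;
            long count = 0;
            while (rs.next()) {
                if (count++ == 0) {
                    firstRow = System.currentTimeMillis() - start;
                }
            }
            System.out.println("first row after " + firstRow + " ms, " + count
                    + " rows in " + (System.currentTimeMillis() - start) + " ms");
        }
    }
}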
If you want to measure the time including the idle time, put a step in the transformation that fires without any latency, for example a Generate Rows step (1 row is enough). You do not need to connect this step to anything, as PDI starts all the steps in parallel as soon as possible.
You won't see the total result on the Table Input row of the Step Metrics bottom tab, but you will have it in the Metrics tab.
You can also use a 'Block this step until steps finish' step. There is an example in the sample directory shipped with your distribution: open yourKettleInstallDir/sample/transformation/Block this step until steps finish.ktr, replace the top row with your flow, then watch the statistics of the blocking step.
In my opinion, you have another step in your transformation locking the person table. There is an overwhelming probability that you have a Table Output step trying to truncate the person table.
I don't know if this is what I would call an answer, but I definitely found a way to get the Sybase connection to respond quickly. There's a querying tool called Sybase Anywhere that you can use to query the DB directly. What I did was look at an installation on a separate machine that had a good connection.
That machine had an ODBC connection defined for the Sybase DB, and the install of the client tool had its own version of the Sybase drivers, along with some DLL files. I took the jars and DLLs and put them on the machine that had PDI installed, made sure they were all on the classpath, and created a generic JDBC connection that pointed to the system ODBC one. It's going at the speed you would expect now.
I have an Azure website running about 100K requests/hour and it connects to Azure SQL S2 database with about 8GB throughput/day. I've spent a lot of time optimizing the database indexes, queries, etc. Normally the Data IO, CPU and Log IO percentages are well behaved in the 20% range.
A portion of the recent data is retained to support our customers. I have a nightly maintenance procedure that removes obsolete data to manage database size. This mostly works well, with the exception of removing image blobs in a varbinary(max) field.
The nightly procedure has a loop that sets the varbinary(max) field of 10 records to null at a time, waits a couple of seconds, then sets the next 10. The nightly total for this loop is about 2,000 records.
This loop will run for about 45-60 minutes and then stop, with nothing returned to my remote SQL Agent job and no error reported. A second and sometimes a third run of the procedure is necessary to finish setting the desired blobs to null.
In an attempt to alleviate the load on the nightly procedure, I started running a job once every 30 seconds throughout the day - it sets one blob to null each time.
Normally this trickle job is fine and runs in 1-6 seconds. However, once or twice a day something goes wrong and I can find no explanation for it. The Data I/O percentage peaks at 100% and stays there for 30-60 minutes or longer. This causes the database responsiveness to suffer, and the website performance goes with it. The trickle job also reports running for this extended period of time. If I stop the SQL Agent job, it can take a few minutes to stop, but the Data I/O continues at 100% for the 30-60 minute period.
The web service requests and database demands are relatively steady throughout the business day - no volatile demands that would explain this. No database deadlocks or other errors are reported. It's as if the database hits some kind of backlog limit where its ability to keep up suddenly drops and then it can't catch up until something that is jammed finally clears. Then the performance will suddenly return to normal.
Do you have any ideas what might be causing this intermittent and unpredictable issue? Any ideas what I could look at when one of these events is happening to determine why the Data I/O is 100% for an extended period of time? Thank you.
If you are on SQL DB V12, you may also consider using the Query Store feature to root cause this performance problem. It's now in public preview.
In order to turn on Query Store just run the following statement:
ALTER DATABASE your_db SET QUERY_STORE = ON;
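Once Query Store has been collecting data for a while, its catalog views can show what was consuming I/O during one of those 100% windows. A rough sketch of pulling the top consumers over JDBC; the connection string is a placeholder, and the aggregation shown is just one reasonable way to rank statements by logical reads:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class TopIoQueries {
    public static void main(String[] args) throws Exception {
        String sql =
            "SELECT TOP 10 qt.query_sql_text, " +
            "       SUM(qrs.count_executions * qrs.avg_logical_io_reads) AS total_logical_reads " +
            "FROM sys.query_store_runtime_stats qrs " +
            "JOIN sys.query_store_plan p ON p.plan_id = qrs.plan_id " +
            "JOIN sys.query_store_query q ON q.query_id = p.query_id " +
            "JOIN sys.query_store_query_text qt ON qt.query_text_id = q.query_text_id " +
            "GROUP BY qt.query_sql_text " +
            "ORDER BY total_logical_reads DESC";
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:sqlserver://yourserver.database.windows.net:1433;databaseName=your_db;user=your_user;password=your_password;encrypt=true");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                // Print total logical reads next to the statement text.
                System.out.println(rs.getString("total_logical_reads") + "  "
                        + rs.getString("query_sql_text"));
            }
        }
    }
}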
I want MongoDB to hold query results in RAM for a longer period of time (say 30 minutes, if memory is available). Is that possible? Or is there any way I can make sure the data is pre-loaded into RAM before subsequent queries run against it?
In fact I am wondering about simple query performance in MongoDB. I have a dedicated server with 10 GB RAM, and my db.stats() output is as follows:
db.stats();
{
"db": "test",
"collections":16,
"objects":625690,
"avgObjSize":68.90,
"dataSize":43061996,
"storageSize":1121402888,
"numExtents":74,
"indexes":25,
"indexSize":28207200,
"fileSize":469762048,
"nsSizeMB":16,
"ok":1
}
Now when I query a single document (as mentioned here) from a web service, it loads in 1.3 seconds. Subsequent calls of the same query respond in 400 ms, and then after a few seconds it goes back to taking 1.3 seconds. It looks like MongoDB has dropped the previously queried document from memory, even though no other queries are asking for data mapped into RAM.
Please explain this, and let me know of any way to make subsequent queries respond faster.
Your observed performance problem on an initial query is likely one of the following issues (in rough order of likelihood):
1) Your application / web service has some overhead to initialize on first request (e.g. allocating memory, setting up connection pools, resolving DNS, ...).
2) Indexes or data you have requested are not yet in memory, so need to be loaded.
3) The Query Optimizer may take a bit longer to run on the first request, as it is comparing the plan execution for your query pattern.
It would be very helpful to test the query via the mongo shell, and isolate whether the overhead is related to MongoDB or your web service (rather than timing both, as you have done).
Following are some notes related to MongoDB.
Caching
MongoDB doesn't have a "caching" time for documents in memory. It uses memory-mapped files for disk I/O and the documents in memory are based on your active queries (documents/indexes you've recently loaded) as well as the available memory. The operating system's virtual memory manager is in charge of caching, and typically will follow a Least-Recently Used (LRU) algorithm to decide which pages to swap out of memory.
Memory Usage
The expected behaviour is that over time MongoDB will grow to use all free memory to store your active working data set.
Looking at your provided db.stats() numbers (and assuming that is your only database), it looks like your database size is currently about 1 GB, so you should be able to keep everything within your 10 GB total RAM unless:
there are other processes competing for memory
you have restarted your mongod server and those documents/indexes haven't been requested yet
In MongoDB 2.2, there is a new touch command you can use to load indexes or documents into memory after a server restart. This should only be used on initial startup to "warm up" the server, as otherwise you could be unhelpfully forcing actual "active" data out of memory.
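A minimal sketch of warming a collection after a restart with the touch command, written against the legacy 2.x Java driver; the host and collection name are placeholders:

import com.mongodb.BasicDBObject;
import com.mongodb.CommandResult;
import com.mongodb.DB;
import com.mongodb.MongoClient;

public class WarmUpCollection {
    public static void main(String[] args) throws Exception {
        MongoClient mongo = new MongoClient("localhost", 27017);
        DB db = mongo.getDB("test");
        CommandResult result = db.command(
                new BasicDBObject("touch", "mycollection")
                        .append("data", true)     // load the documents into memory
                        .append("index", true));  // load the indexes into memory
        System.out.println(result);
        mongo.close();
    }
}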
On a Linux system, for example, you can use the top command and should see that:
virtual bytes/VSIZE will tend to be the size of the entire database
if the server doesn't have other processes running, resident bytes/RSIZE will be the total memory of the machine (this includes file system cache contents)
mongod should not use swap (since the files are memory-mapped)
You can use the mongostat tool to get a quick view of your mongod activity .. or more usefully, use a service like MMS to monitor metrics over time.
Query Optimizer
The MongoDB Query Optimizer compares plan execution for a query pattern every ~1,000 write operations, and then caches the "winning" query plan until the next time the optimizer runs .. or you explicitly call an explain() on that query.
This should be a straightforward one to test: run your query in the mongo shell with .explain() and look at the ms timings, and also the number of index entries and documents scanned. The timing for an explain() isn't the actual time the queries will take to run, as it includes the cost of comparing the plans. The typical execution will be much faster .. and you can look for slow queries in your mongod log.
By default MongoDB will log all queries slower than 100ms, so this provides a good starting point to look for queries to optimize. You can adjust the slow ms value with the --slowms config option, or using the Database Profiler commands.
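If you prefer to adjust this from the application rather than the command line, the profile command does the same job as the shell's db.setProfilingLevel(). A small sketch with the legacy 2.x Java driver; the host, database name, and the 50 ms threshold are placeholder choices:

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.MongoClient;

public class EnableProfiler {
    public static void main(String[] args) throws Exception {
        MongoClient mongo = new MongoClient("localhost", 27017);
        DB db = mongo.getDB("test");
        // profile: 1 = log only operations slower than slowms; 2 = log everything
        db.command(new BasicDBObject("profile", 1).append("slowms", 50));
        mongo.close();
    }
}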
Further reading in the MongoDB documentation:
Caching
Checking Server Memory Usage
Database Profiler
Explain
Monitoring & Diagnostics
Does anyone know how one can get the total number of calls to an MSSQL2000 server during a specified time, let’s say 24 hours?
We want figures of how many calls our production machine gets per day, but we can’t find any good tools/strategies for this.
Best regards
Fredrik
You could use SQL Profiler?
http://technet.microsoft.com/en-us/library/aa173918(SQL.80).aspx
http://www.sqlteam.com/article/sql-server-2000-performance-tuning-tools
http://support.microsoft.com/kb/325263
I think using SQL Profiler is overkill in this situation, particularly as it can create a substantial load on the server depending on what you trace. SQL Server exposes the raw values behind its performance counters via the sysperfinfo system table; you should be able to run this query once each day and subtract the values to work out how many SQL batch requests you received for the day:
SELECT cntr_value
FROM sysperfinfo
WHERE object_name = 'SQLServer:SQL Statistics'
AND counter_name = 'Batch Requests/sec'
This will obviously only work if the server is up for the whole day; restarting will reset the number.
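Despite its name, the 'Batch Requests/sec' counter in sysperfinfo is a cumulative count since server start, so the daily figure is the difference between two snapshots taken 24 hours apart. A rough sketch of that snapshot-and-diff approach using the jTDS driver; the connection details and the two persistence helpers are placeholders for whatever storage you use:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DailyBatchRequestCount {
    public static void main(String[] args) throws Exception {
        Class.forName("net.sourceforge.jtds.jdbc.Driver"); // jTDS supports SQL Server 2000
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:jtds:sqlserver://dbserver:1433/master", "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                 "SELECT cntr_value FROM sysperfinfo " +
                 "WHERE object_name = 'SQLServer:SQL Statistics' " +
                 "AND counter_name = 'Batch Requests/sec'")) {
            rs.next();
            long today = rs.getLong(1);
            long yesterday = loadPreviousSnapshot();  // hypothetical helper: value saved 24h ago
            System.out.println("Batches in the last 24h: " + (today - yesterday));
            saveSnapshot(today);                      // hypothetical helper: persist today's value
        }
    }

    static long loadPreviousSnapshot() { return 0L; } // placeholder implementation
    static void saveSnapshot(long value) { }          // placeholder implementation
}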
I solved this another way (all calls are "routed" through an IIS cluster, and I was able to analyze their logs).
Thanks!