We have two m3.medium Cassandra nodes, and we are seeing this type of warning:
2016-11-27 00:53:06,097 QueryProcessor.java:123 - 1 prepared statements discarded in the last minute because cache limit reached (10 MB)
WARN [ScheduledTasks:1] 2016-11-27 00:57:06,097 QueryProcessor.java:123 - 1 prepared statements discarded in the last minute because cache limit reached (10 MB)
Would we lose any incoming write data when this happens?
Nothing bad will happen here. The drivers handle this: if you try to use a statement that has been evicted, they re-prepare it for you.
The warning is still a useful signal, though. In particular, it flags that you may be creating or re-creating prepared statements constantly. In that case you are probably running slower than if you did not prepare them at all: the query is blocked waiting for the coordinator to acknowledge that the statement has been prepared, which can add something like 2-3x the latency.
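If it helps, this is roughly what reuse looks like with the DataStax Java driver (3.x API assumed; the keyspace, table and contact point are placeholders): prepare the statement once, then bind and execute it for every write instead of re-preparing it per request.

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class PrepareOnce {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("my_ks")) {
            // Prepared once; the driver caches it and re-prepares it transparently
            // if the server ever evicts it.
            PreparedStatement insert =
                    session.prepare("INSERT INTO events (id, payload) VALUES (?, ?)");
            for (int i = 0; i < 1000; i++) {
                BoundStatement bound = insert.bind(i, "payload-" + i);
                session.execute(bound);
            }
        }
    }
}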
No, Cassandra will not lose any incoming writes.
Cassandra caches all prepared statements. In your case the prepared statement cache limit was exceeded, so it evicts some prepared statements and prepares the new ones.
I noticed that after inserting 400,000 rows into MySQL using Spring Boot, the heap size and RAM consumption go up but never drop back down afterwards.
If a similar request is made a second time, the heap grows by an additional 5-10%.
What could the problem be?
Why is the heap not cleared after executing a save query?
Is there a problem with the garbage collector? If yes, how can it be fixed?
P.S.: I tried some of the solutions provided in this article, but they did not help:
https://medium.com/quick-code/what-a-recurring-outofmemory-error-taught-me-11f2061063a1
Edited:
I tried selecting 1.2 million records in order to fill up the heap.
After 20-25 minutes had passed, the heap started to decrease. Why is this happening? Isn't the garbage collector supposed to clear the heap faster? At this rate, if other requests come in during those 25 minutes, the server will just crash.
Edited 2:
It seems that when I try the same thing on EC2, the garbage collector does not work at all. It has already happened 3 times that the server simply runs out of memory and crashes.
Does anyone know the cause?
I ran one @Async method which in turn called another two @Async methods from different beans. They finished execution, and after that the heap never went down.
I do not have any @Cacheable methods or similar. I am just getting some data from tables, processing it and updating it. There are no InputStreams etc.
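For what it's worth, this is roughly the shape of the setup described above, just to make it concrete; the class and method names are invented, and @EnableAsync is assumed to be configured elsewhere.

import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

@Service
public class MainTask {

    private final StepOne stepOne;
    private final StepTwo stepTwo;

    public MainTask(StepOne stepOne, StepTwo stepTwo) {
        this.stepOne = stepOne;
        this.stepTwo = stepTwo;
    }

    @Async
    public void run() {
        stepOne.process();   // @Async method in another bean
        stepTwo.process();   // @Async method in another bean
    }
}

@Service
class StepOne {
    @Async
    public void process() { /* read rows from the table, process, update */ }
}

@Service
class StepTwo {
    @Async
    public void process() { /* read rows from the table, process, update */ }
}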
We're currently using H2 1.4.199 in embedded mode with the default nio file protocol and the MVStore storage engine. The WRITE_DELAY parameter is set to 60 seconds.
We run a batch of about 30,000 insert/update/delete statements within 2 seconds (in one transaction), followed by another batch of a couple of hundred statements only 30 seconds later (in a second transaction). The next attempt to open a DB connection (only 2 minutes later) shows that the DB is corrupt:
File corrupted while reading record: null. Possible solution: use the recovery tool [90030-199]
Since the transactions occur within a minute, we wonder whether the write_delay of 60 seconds might be contributing to the issue.
Changing WRITE_DELAY to 60 s (from the default of 0.5 s) will definitely increase your risk of lost transactions, and I do not see a good reason for doing it. It should not cause database corruption, though. More likely some thread interruptions do that, since you are running a web server and who knows what else in the same JVM. Using the async file store might help in that area, and yes, it is stable enough (how much worse can it get for your app than a database corruption, anyway?).
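If you do go back to the defaults and try the async file store, a minimal sketch looks like this (assuming your H2 build supports the async: file system prefix; the database path and credentials are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class H2AsyncExample {
    public static void main(String[] args) throws Exception {
        // "async:" selects the asynchronous file store mentioned above.
        String url = "jdbc:h2:async:./data/app";
        try (Connection conn = DriverManager.getConnection(url, "sa", "");
             Statement st = conn.createStatement()) {
            // Restore the default commit/flush delay (value is in milliseconds).
            st.execute("SET WRITE_DELAY 500");
        }
    }
}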
I have configured Nutch 2.3.1 with the Hadoop/HBase ecosystem. I have not changed gora.buffer.read.limit or gora.buffer.write.limit, i.e., I am using their default value of 10000 in both cases. At the generate phase, I set topN to 100,000. During the generate job I get the following information:
org.apache.gora.mapreduce.GoraRecordWriter: Flushing the datastore after 60000 records
After the job completed, I found that 100,000 URLs were marked to be fetched, which is what I wanted. But I am confused: what does the above message mean? What is the impact of gora.buffer.read.limit on my crawling?
Can someone guide me?
That log message is written here. By default, the buffer is flushed after writing 10000 records, so you must have configured gora.buffer.write.limit to 60000 somewhere (in core-site.xml, mapred-site.xml, or code?).
It is not important, since it is logged at INFO level. It only notifies you that the write buffer is about to be written to the storage.
The writing process happens each time you call store.flush(), or in batches of gora.buffer.write.limit records.
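For reference, the 60000 in your log would come from a property entry along these lines in core-site.xml or mapred-site.xml (which file depends on your setup):

<property>
  <name>gora.buffer.write.limit</name>
  <value>60000</value>
</property>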
I have a performance problem with my process. It's an asynchronous task launched in a CMT bean (on a JBoss server).
Each iteration performs 1 update and 3 inserts into my DB via Hibernate. The process is split into new transactions every 100 iterations.
Flush is called on the EntityManager after every update/insert.
While the performance of the first batches is satisfying (around 5-8 s), it slows down drastically over time. The 30th batch takes around 30 s to finish, and later batches grow to over 2 minutes each.
I tried switching FlushModeType to COMMIT, manually clearing/closing entity managers, and clearing the EntityManager cache; I also looked for memory leaks and can't find the reason for this slowdown.
I measured the execution time of small pieces of code, and every piece involving the database connection slows down over time. I understand that a transaction slows down as more entities are processed, but why is a new transaction also slower?
The latest process consists of 250,000 iterations (2,500 transactions in 1 thread) and takes forever to end.
If needed I'll provide more information. Any help would be appreciated.
I tried simplifying the code to do just one Hibernate insert and no other operations, and it still slows down over time. This is an abstract pseudo-view of what's going on inside.
Bean1
@Asynchronous
@TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
public void mainTask() {
    while (...) {
        subTask();
    }
}

Bean2
@TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
public void subTask() {
    100.times {
        3 * Insert
        1 * Update
    }
}
I am sure my suggestions may not be accurate, or you may have tried them already, but I'd like to give it a try. As you mentioned that the DB connection is the bottleneck, I'll go after that.
After reading the question, I find that the time taken by a transaction is proportional to the iteration number. So it looks like entities created in the first iterations are being sent to Hibernate again in later iterations.
For example, in the 4th iteration, entities created in the 1st, 2nd and 3rd iterations are also being sent for update, or are reaching Hibernate somehow.
That could be the reason for the degradation in performance as the iterations progress, since the number of records to be updated/inserted/selected grows with each iteration.
I can think of the following possibilities off the top of my head:
The Hibernate session used in the first iteration is being used until the end. Because of this, entities created in the first iteration are being updated in later iterations as well. I read that you tried closing entity managers etc., but still, please check where the session ends. You can create a new session in each transaction, or detach the entities created in the session after each transaction (see the sketch at the end of this answer).
The list which filters records for each iteration is sending already-processed records. That is, the list should send records 300 to 399 in the 3rd iteration, but records 0 to 399 are being sent to the transaction.
If you're using HQL, try a named query. The last time I used Hibernate (about 8 months ago), I noticed that with HQL the number of objects loaded by Hibernate was much higher than with a named query.
Hibernate provides a way to print the actual SQL queries/parameters sent to the DB, so you can check exactly what is being sent. Link for this:
How to print a query string with parameter values when using Hibernate
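A minimal sketch of the first possibility, assuming a container-managed EntityManager; the entity classes (Item, ChildA, ChildB, ChildC) are hypothetical stand-ins for your 3 inserts and 1 update:

import java.util.List;
import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class BatchWorker {

    @PersistenceContext
    private EntityManager em;

    @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
    public void subTask(List<Item> batch) {
        for (Item item : batch) {
            em.persist(new ChildA(item));   // the 3 inserts
            em.persist(new ChildB(item));
            em.persist(new ChildC(item));
            item.setProcessed(true);
            em.merge(item);                 // the 1 update
        }
        // Push the pending SQL to the database, then detach everything so the
        // persistence context does not keep growing from one batch to the next.
        em.flush();
        em.clear();
    }
}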
We faced similar problems on a couple of occasions when executing batch programs with JPA.
The only solution we could find was to use the JDBC API for batch programs that involve a lot of processing.
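The JDBC route looks roughly like this (the JDBC URL, credentials, table and batch size are placeholders): statements are accumulated with addBatch() and sent to the database in chunks with executeBatch(), which avoids a growing persistence context entirely.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class JdbcBatchInsert {
    public static void main(String[] args) throws Exception {
        String url = args[0];  // JDBC URL of your database (placeholder)
        try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
            conn.setAutoCommit(false);
            String sql = "INSERT INTO event_log (id, message) VALUES (?, ?)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                for (int i = 0; i < 100_000; i++) {
                    ps.setInt(1, i);
                    ps.setString(2, "row " + i);
                    ps.addBatch();
                    if (i % 1_000 == 0) {
                        ps.executeBatch();   // send a chunk to the database
                    }
                }
                ps.executeBatch();           // flush the remainder
            }
            conn.commit();
        }
    }
}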
It turns out it's a Java problem. We tested our application on different configurations, and it works fine when our JBoss runs on Java 1.7 (as opposed to 1.6): every batch finishes in about 5 seconds. We now face a choice between upgrading our Java to 1.7 and digging deeper to find out what's wrong with our setup on Java 1.6.
I have implemented an AgentX subagent using mib2c.create-dataset.conf (with caching enabled).
In my snmpd.conf: agentXTimeout 15
In the testtable.h file I have changed the cache timeout as below:
#define testTABLE_TIMEOUT 60
According to my understanding, it reloads the data every 60 seconds.
Now my issue is that if the amount of data in the table exceeds a certain size, it takes a while to load.
Because of that, if I run an SNMPWALK during a reload it gives me “no response from the host”. Likewise, if I walk the whole table and testTABLE_TIMEOUT expires partway through, the walk stops in the middle with the same error (no response from the host).
Please tell me how to solve this. My table holds a large amount of data that changes frequently.
I read somewhere:
(when the agent receives a request for something in this table and the cache is older than the defined timeout (12s > 10s), then it does re-load the data. This is the expected behaviour.
However the agent does not automatically release the local cache (i.e. call the 'free' routine) as soon as the timeout has expired.
Instead this is handled by a regular "garbage collection" run (once a minute), which will free any stale caches.
In the meantime, a request that tries to use that cache will spot that it's expired, and reload the data.)
Is there any connection between these two? I can’t quite get this... How do I resolve my problem?
Unfortunately, if your data set is very large and takes a long time to load, then you simply have to live with the slow load and slow response. You can try loading the data on a regular basis using snmp_alarm or something similar so it's immediately available when a request comes in, but that doesn't really solve the problem either, since the request could still come in right after the alarm is triggered and the agent will still take a long time to respond.
So... the best thing to do is optimize your load routine as much as possible, and possibly increase the timeout that the manager uses. For snmpwalk, for example, you might add -t 30 to the command-line arguments, and I bet everything will suddenly work just fine.
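For example (the host, community string and table OID are placeholders; -t sets the timeout in seconds and -r the number of retries):

snmpwalk -v2c -c public -t 30 -r 2 192.0.2.10 TEST-MIB::testTable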