I know that ClickHouse is not ACID-compliant, so I don't expect durability in the ACID sense. But the question is: is it possible to lose inserts if the server crashes?
ClickHouse is not durable.
You can lose data that was successfully inserted during the last 8-10 minutes on a spontaneous hardware reboot, but not on a ClickHouse crash.
For performance reasons ClickHouse does not use fsync (this improves insert performance dramatically), so the latest level-0 parts (inserts) may exist only in the Linux page cache. You can reduce that 10-minute window by tuning Linux kernel parameters. (You can also configure direct I/O for merges, starting from 1 byte, so that level-1 parts are written through to disk.)
You can also use replicated tables and quorum inserts. But even in this case you can lose data if both replicas are in the same rack and the rack loses power.
In early 2021 ClickHouse will start to support WAL + fsync.
It will be controlled by the following parameters:
min_rows_to_fsync_after_merge
min_compressed_bytes_to_fsync_after_merge
min_compressed_bytes_to_fsync_after_fetch
fsync_after_insert
fsync_part_directory
write_ahead_log_bytes_to_fsync
write_ahead_log_interval_ms_to_fsync
in_memory_parts_insert_sync
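For illustration, here is a minimal sketch of how the fsync-related MergeTree settings could be applied once they are available (the table and column names are made up, and exact defaults/availability depend on the ClickHouse version):
CREATE TABLE events
(
    event_date Date,
    event_id   UInt64,
    payload    String
)
ENGINE = MergeTree
ORDER BY (event_date, event_id)
SETTINGS
    fsync_after_insert = 1,             -- fsync every part written by an INSERT
    fsync_part_directory = 1,           -- fsync the part directory after writes
    min_rows_to_fsync_after_merge = 1;  -- fsync merged parts regardless of size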
I can see that the Firebird 2.1 process (on Linux, for our program) reaches 97% CPU load. The load may be distributed, e.g. the server has 4 cores, 2 of them are consumed at 97% by the Firebird process, and the remaining 2 are under normal load (1-10%). The bad thing is that this 97% peak can last half an hour, an hour or even longer.
As I understand it, I just need to determine the Firebird transaction and the Firebird attachment (i.e. connection) that created this peak; then I can ask the user/software instance that created this connection/attachment to close their program and start anew. When the attachment is closed, Firebird senses this and stops any CPU load and processing that was assigned to that attachment.
So my aim is to look at the data from the monitoring tables (mon$...) and determine the offending transaction/connection.
I came up with the select (for Firebird 2.1):
select a.mon$user, sa.*, t.*
from mon$transactions t
left join mon$io_stats s on (t.mon$stat_id=s.mon$stat_id)
left join mon$attachments a on (t.mon$attachment_id=a.mon$attachment_id)
left join mon$statements sa on (t.mon$transaction_id=sa.mon$transaction_id)
where s.mon$page_reads>1000000
This SQL seems to be right, but in practice the results are misleading. For example, my select returns several entries whose a.mon$timestamp is 4 or even more hours old. I cannot believe there are transactions that old which are still consuming resources. The strange thing is that those records have no data from the left-joined mon$statements. So I have some information about long-running transactions, but no information about the statements that created or prolonged them. I don't even understand whether such transactions are actually creating the CPU peak or whether this data is obsolete.
So, how do I correct this SQL (or write it completely anew) to find the statements/attachments that are causing the CPU load in Firebird 2.1?
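For reference, here is a statement-level variant of the same idea that I was also considering (joining mon$statements directly to mon$attachments and mon$io_stats; the mon$state = 1 filter, meant to limit the output to currently running statements, is my assumption), but I am not sure it is the right approach either:
select a.mon$user, a.mon$remote_address, s.mon$sql_text,
       io.mon$page_reads, io.mon$page_fetches
from mon$statements s
join mon$attachments a on (s.mon$attachment_id = a.mon$attachment_id)
left join mon$io_stats io on (s.mon$stat_id = io.mon$stat_id)
where s.mon$state = 1
order by io.mon$page_fetches desc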
I have a cluster with two Redis Docker instances (v3.2.5) that I use for caching responses from Spring Boot microservices.
I've disabled all persistence and the number of keys is stable over time, all of them expiring between 5 minutes and 1 day.
Despite this, I can see the memory usage creeping up. It looks like once a day (around midnight) it uses a lot of memory and then releases some of it.
Does anyone have any idea what this process may be, and whether there's any way to configure Redis to avoid using that much memory?
The number of keys I have doesn't justify this amount of memory.
UPDATE
After taking a snapshot of the database and loading the data into a fresh new Redis instance (same version, same config), used_memory_human is 10 times lower than on the original one.
Is it possible that key expiration doesn't really delete keys from memory?
I'd like to save planning cost by using a plan cache, since the ORCA/legacy optimizer can take dozens of milliseconds.
I think Greenplum caches query plans at the session level: when the session ends, the plan is gone, and other sessions cannot share the analyzed plan. What's more, we can't keep sessions open forever, since the Greenplum system will not release resources until the TCP connection is disconnected.
Most major databases cache plans after the first run and reuse them across connections.
So, is there any switch that turns on query plan caching across connections? Also, within a session I can see that the client-side timing statistics don't match the "Total time" the planner gives.
Postgres can cache plans as well, but on a per-session basis; once the session ends, the cached plan is thrown away. This can be tricky to optimize/analyze, but it is generally of less importance unless the query you are executing is really complex and/or there are a lot of repeated queries.
The documentation explains this in detail pretty well. You can query pg_prepared_statements to see what is cached. Note that it is not available across sessions and is visible only to the current session.
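For example, a minimal sketch within one session (the table and statement names are made up):
PREPARE recent_orders (int) AS
    SELECT * FROM orders WHERE customer_id = $1;

EXECUTE recent_orders(42);    -- reuses the plan prepared above within this session

SELECT name, statement, prepare_time
FROM pg_prepared_statements;  -- shows only this session's prepared statements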
When a user starts a session with Greenplum Database and issues a query, the system creates groups or 'gangs' of worker processes on each segment to do the work. After the work is done, the segment worker processes are destroyed except for a cached number which is set by the gp_cached_segworkers_threshold parameter.
A lower setting conserves system resources on the segment hosts, but a higher setting may improve performance for power-users that want to issue many complex queries in a row.
Also see gp_max_local_distributed_cache.
Obviously, the more you cache, the less memory will be available for other connections and queries. That is perhaps not a big deal if you are only hosting a few power users running concurrent queries... but you may need to adjust your gp_vmem_protect_limit accordingly.
For clarification:
Segment resources are released after the gp_vmem_idle_resource_timeout.
Only the master session will remain until the TCP connection is dropped.
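For reference, the current values can be checked per session with SHOW (whether a given parameter can also be changed at the session level, or only via gpconfig, depends on the parameter):
SHOW gp_cached_segworkers_threshold;
SHOW gp_vmem_idle_resource_timeout;
SHOW gp_vmem_protect_limit;
SHOW gp_max_local_distributed_cache;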
I am currently working on an application that continuously queries a database for real time data to be displayed.
In order to have minimal impact on the systems writing to the database, which are essential to business operations, I am connecting directly to the read-only replica in the availability group (using the read-only replica's server name, as opposed to read-only routing via the Always On listener by means of ApplicationIntent=ReadOnly).
Even so, we are noticing increased response times when inserting data on the primary server.
To my understanding of secondary replicas this should not be the case. I am using NOLOCK hints in the query as well. I am very perplexed by this and do not quite understand what is causing the increase in response times. All I have thought of so far is that SQL Server, regardless of the NOLOCK hint, is locking the table I am reading from and holding up the synchronous replication to the read-only replica, which in turn locks the primary instance's table, which holds up the insert query.
Is this the case, or is there something I am not quite understanding with regard to Always On read-only replicas?
I found this document which I think best describes what could be possible causes of the increases in response times on the primary server.
In general it's a good read for those who are looking into using their Always On availability group to distribute load between their primary and secondary replicas.
For those who don't wish to read the whole document, it taught me the following (in my own rough words):
Although very unlikely, it is possible that workloads running on the secondary replica can impact the time taken to send the acknowledgement that the transaction has committed (the replication to the secondary). When using synchronous-commit mode, the primary waits for this acknowledgement before committing the transaction it is running (an insert, for example). So an increase in the time taken for the secondary replica's acknowledgement causes the primary replica to take longer on the insert.
It is much better explained in the document though, under the 'Impact on Primary Workload' section. And again, if you want to know more, I really suggest you read it.
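If you want to check this on your own availability group, a rough sketch is to look at the per-database replica state DMVs (available in SQL Server 2012 and later; exact columns vary slightly by version):
SELECT ar.replica_server_name,
       ar.availability_mode_desc,      -- SYNCHRONOUS_COMMIT vs ASYNCHRONOUS_COMMIT
       drs.synchronization_state_desc,
       drs.log_send_queue_size,        -- KB of log waiting to be sent to the secondary
       drs.redo_queue_size             -- KB of log waiting to be redone on the secondary
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar ON ar.replica_id = drs.replica_id;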
We have a test system which matches our production system like for like. 6 months ago we did some testing on new hardware, and found the performance limit of our system.
However, now we are re-doing the testing with a view to adding further hardware, and we have found that the system doesn't perform as it used to.
The reason for this is that on one specific volume we are now doing random I/O, which used to be sequential. On top of this, it has turned out that the activity on this volume from Oracle, which is 100% writes, is actually in 8 KB blocks, whereas before the writes were up to 128 KB.
So something has caused the Oracle database writer (DBWR) to stop batching up its writes.
We've extensively checked our configuration and cannot see any difference between our test and production systems. We've also opened a call with Oracle, but at this stage information is slow in coming.
So, ultimately, these are two related questions:
Can you rely on Oracle multiblock writes? Is that a safe thing to engineer/tune your system for?
Why would Oracle change its behaviour?
We're not at this stage necessarily blaming Oracle - it may well be reacting to something in the environment - but what?
The OS/architecture is Solaris/SPARC.
Oh, I forgot to mention: the insert table has no indexes and only a couple of foreign keys - it's designed as a bucket for the fastest possible inserts. It's also partitioned on the key field.
Thanks for any tips!
More description of the workload would allow some hypotheses.
If you are updating random blocks, then the DBWR process(es) will have little choice but to do single-block writes. Indexes especially are likely to have writes all over the place. If you have an index on character values and need to insert a new 'M' entry where there isn't room, Oracle will allocate a new block for the index and split the current block. You'll end up with some of those 'M' entries in the original block and some in the new block (which will be the last [used] block in the last extent).
I suspect you are most likely to get multi-block writes when bulk-inserting into tables, as new blocks are allocated and written to. Potentially, you initially had (say) 1 GB of extents allocated and were writing into that space. Now you might have reached the limit of that and be creating new extents (say 50 MB), which may be coming from scattered file locations (e.g. space freed by other tables that have been dropped).
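One way to sanity-check the extent-allocation theory would be to look at how the extents of the insert table are laid out across data files (the owner and table name below are placeholders, and you need access to DBA_EXTENTS):
SELECT segment_name,
       file_id,
       COUNT(*)              AS extents,
       SUM(bytes)/1024/1024  AS mb
FROM   dba_extents
WHERE  owner = 'APP_OWNER'            -- placeholder
AND    segment_name = 'INSERT_TABLE'  -- placeholder
GROUP  BY segment_name, file_id
ORDER  BY file_id;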