(note this isnt about parallel execution of a query inside the RDB, but peformance characteristics of submitting queries in parallel).
I have a process that executes 1000's (if not 10,000s+) of queries, in a single threaded manner (i.e. send, wait for response, process, send....), loosely of the form
select a,b from table where id = 123
i.e. query a single record on an already indexed field
on an oracle database.
This process takes longer than desired, and doing some metrics on it, I'm sure that 90% of the time is spent server side execution (and transport) rather than client side.
This process naturally can be split into N 'jobs', and its been suggested that this could/should speed up the process.
Naively you would expect it to run N times quicker (with a small overhead to merge the answer).
Given that (loosely) SQL is 'serialised' is this the case though? That would imply that actually it would probably not run quicker at all.
I assume that for an update on a single record (for example) N updates would have to be effectively serialised, but for N reads, this may not be the case.
Which theory is most accurate (or neither)
I'm not a dba, it looks like for reads, reads never block reads, so assuming infinite resources the theory would be that N reads could be run completely in parallel with no blocking. For writes and reads it gets more complex depending on how you set up your transactions/locks but thats out of scope for me.
Related
We are using a timestamp to ensure that entries in a log table are recorded sequentially, but we have found a potential flaw. Say, for example, we have two nodes in our RAC and the node timestamps are 1000ms off. Our app server inserts two log entries within 30ms seconds of each other. The first insert is serviced by Node1 and the second by Node2. With 1000ms difference between the two nodes, the timestamp could potentially show the log entries occurring in the wrong order! (I would just use a sequence, but our sequences are cached for performance reasons... )
NTP sync doesn't help this situation because NTP has a fault tolerance of 128ms -- which leaves the door open for records to be recorded out of order when they occur more frequently than that.
I have a feeling I'm looking at this problem the wrong way. My ultimate goal is to be able to retrieve the actual sequence that log entries are recorded. It doesn't have to be by a timestamp column.
An Oracle sequence with ORDER specified is guaranteed to return numbers in order across a RAC cluster. So
create sequence my_seq
start with 1
increment by 1
order;
Now, in order to do this, that means that you're going to be doing a fair amount of inter-node communication in order to ensure that access to the sequence is serialized appropriately. That's going to make this significantly more expensive than a normal sequence. If you need to guarantee order, though, it's probably the most efficient approach that you're going to have.
Bear in mind that an attached timestamp on a row is generated at time of the insert or update, but the time that the actual change to the database takes place is when the commit happens - which, depending on the complexity of the transactions, row 1 might get inserted before row2, but gett committed after.
The only thing I am aware of in Oracle across the nodes that guarantees the order is the SCN that Oracle attaches to the transaction, and by which transactions in a RAC environment can be ordered for things like Streams replication.
1000ms? It is one sec, isn't it? IMHO it is a lot. If you really need precise time, then simply give up the idea of global time. Generate timestamps on log server and assume that each log server has it's own local time. Read something about Lamport's time, if you need some theory. But maybe the source of your problem is somewhere else. RAC synchronises time between nodes, and it would log some bigger discrepancy.
If two consecutive events are logged by two different connections, is the same thread using both connections? Or are those evens passed to background threads and then those threads write into the database? i.e. is it logged sequentially or in parallel?
Can anyone explain me the definitions and differences between sequential consistency and quiescent consistency? In the most dumb form possible :|
I did read this: Example of execution which is sequentially consistent but not quiescently consistent
But I am not able to understand Sequential and quiescent consistency itself :(
Sequential consistency requires that the operations should appear to take effect in the order they are specified in each program. Basically it enforces program order within each individual process, and allows all processes to assume they are observing the same order of operations. Let's say we have 2 processes enqueuing and dequeuing items on a queue q:
P1 -- q.enq(x) -----------------------------
P2 -------------- q.enq(y) ---- q.deq():y --
This is not the expected behaviour from a FIFO queue. We'd expect to dequeue x because P1 enqueues x before P2 enqueues y. However this scenario is allowed in the sequential consistency model because sequential consistency doesn't require that the order seen by all processes is correct (real-time order). There's at least one sequential execution that can explain these results and one is:
P2:q.enq(y) P1:q.enq(x) P2:q.deq():y
In this execution each process executes operations in program order meaning each process executes its operations in the order they're specified in each process.
Quiescent consistency requires non-overlapping operations to appear to take effect in their real-time order, but overlapping operations might be reordered. Therefore, the same scenario is not allowed in the quiescent consistency model because we expect q.enq(x) to appear to take effect before q.enq(y), and q.deq() to return x instead of y. Also quiescent consistency doesn't necessarily preserve program order. If q.enq(x) and q.enq(y) would be concurrent (overlapping) operations, they could be reordered and q.deq():y would be quiescently consistent.
Basically some executions are sequentially consistent but not quiescently consistent, and vice versa.
First you should understand what is program order, it is literally how you expect your program runs in the order of the appearance of instructions.
But a program order is only for a single thread program, if you have multithreads, then problem comes as the program order may not hold or even not exist as sometimes you cannot tell which thread's method call happens first.
A quiescent consistency describes a clear program order of all threads' behaviors. that is no overlaps are allowed since it is required a quiescent period between two method calls.
A sequential consistency allows overlaps, but requires one can find a program order in which all the method calls can be put in a place and still returns correct value and behaves correctly.
I have a cluster application, which is divided into a controller and a bunch of workers. The controller runs on a dedicated host, the workers phone in over the network and get handed jobs, so far so normal. (Basically the "divide-and-conquer pipeline" from the zeromq manual, with job-specific wrinkles. That's not important right now.)
The controller's core data structure is unordered_map<string, queue<string>> in pseudo-C++ (the controller is actually implemented in Python, but I am open to the possibility of rewriting it in something else). The strings in the queues define jobs, and the keys of the map are a categorization of the jobs. The controller is seeded with a set of jobs; when a worker starts up, the controller removes one string from one of the queues and hands it out as the worker's first job. The worker may crash during the run, in which case the job gets put back on the appropriate queue (there is an ancillary table of outstanding jobs). If it completes the job successfully, it will send back a list of new job-strings, which the controller will sort into the appropriate queues. Then it will pull another string off some queue and send it to the worker as its next job; usually, but not always, it will pick the same queue as the previous job for that worker.
Now, the question. This data structure currently sits entirely in main memory, which was fine for small-scale test runs, but at full scale is eating all available RAM on the controller, all by itself. And the controller has several other tasks to accomplish, so that's no good.
What approach should I take? So far, I have considered:
a) to convert this to a primarily-on-disk data structure. It could be cached in RAM to some extent for efficiency, but jobs take tens of seconds to complete, so it's okay if it's not that efficient,
b) using a relational database - e.g. SQLite, (but SQL schemas are a very poor fit AFAICT),
c) using a NoSQL database with persistency support, e.g. Redis (data structure maps over trivially, but this still appears very RAM-centric to make me feel confident that the memory-hog problem will actually go away)
Concrete numbers: For a full-scale run, there will be between one and ten million keys in the hash, and less than 100 entries in each queue. String length varies wildly but is unlikely to be more than 250-ish bytes. So, a hypothetical (impossible) zero-overhead data structure would require 234 – 237 bytes of storage.
Ultimately, it all boils down on how you define efficiency needed on part of the controller -- e.g. response times, throughput, memory consumption, disk consumption, scalability... These properties are directly or indirectly related to:
number of requests the controller needs to handle per second (throughput)
acceptable response times
future growth expectations
From your options, here's how I'd evaluate each option:
a) to convert this to a primarily-on-disk data structure. It could be
cached in RAM to some extent for efficiency, but jobs take tens of
seconds to complete, so it's okay if it's not that efficient,
Given the current memory hog requirement, some form of persistent storage seems a reaonsable choice. Caching comes into play if there is a repeatable access pattern, say the same queue is accessed over and over again -- otherwise, caching is likely not to help.
This option makes sense if 1) you cannot find a database that maps trivially to your data structure (unlikely), 2) for some other reason you want to have your own on-disk format, e.g. you find that converting to a database is too much overhead (again, unlikely).
One alternative to databases is to look at persistent queues (e.g. using a RabbitMQ backing store), but I'm not sure what the per-queue or overall size limits are.
b) using a relational database - e.g. SQLite, (but SQL schemas are a
very poor fit AFAICT),
As you mention, SQL is probably not a good fit for your requirements, even though you could surely map your data structure to a relational model somehow.
However, NoSQL databases like MongoDB or CouchDB seem much more appropriate. Either way, a database of some sort seems viable as long as they can meet your throughput requirement. Many if not most NoSQL databases are also a good choice from a scalability perspective, as they include support for sharding data across multiple machines.
c) using a NoSQL database with persistency support, e.g. Redis (data
structure maps over trivially, but this still appears very RAM-centric
to make me feel confident that the memory-hog problem will actually go
away)
An in-memory database like Redis doesn't solve the memory hog problem, unless you set up a cluster of machines that each holds a part of the overall data. This makes sense only if keeping all data in-memory is needed due to low response times requirements. Yet, given the nature of your jobs, taking tens of seconds to complete, response times, respective to workers, hardly matter.
If you find, however, that response times do matter, Redis would be a good choice, as it handles partitioning trivially using either client-side consistent-hashing or at the cluster level, thus also supporting scalability scenarios.
In any case
Before you choose a solution, be sure to clarify your requirements. You mention you want an efficient solution. Since efficiency can only be gauged against some set of requirements, here's the list of questions I would try to answer first:
*Requirements
how many jobs are expected to complete, say per minute or per hour?
how many workers are needed to do so?
concluding from that:
what is the expected load in requestes/per second, and
what response times are expected on part of the controller (handing out jobs, receiving results)?
And looking into the future:
will the workload increase, i.e. does your solution need to scale up (more jobs per time unit, more more data per job?)
will there be a need for persistency of jobs and results, e.g. for auditing purposes?
Again, concluding from that,
how will this influence the number of workers?
what effect will it have on the number of requests/second on part of the controller?
With these answers, you will find yourself in a better position to choose a solution.
I would look into a message queue like RabbitMQ. This way it will first fill up the RAM and then use the disk. I have up to 500,000,000 objects in queues on a single server and it's just plugging away.
RabbitMQ works on Windows and Linux and has simple connectors/SDKs to about any kind of language.
https://www.rabbitmq.com/
Given: SQL Server 2008 R2. Quit some speedin data discs. Log discs lagging.
Required: LOTS LOTS LOTS of inserts. Like 10.000 to 30.000 rows into a simple table with two indices per second. Inserts have an intrinsic order and will not repeat, as such order of inserts must not be maintained in short term (i.e. multiple parallel inserts are ok).
So far: accumulating data into a queue. Regularly (async threadpool) emptying up to 1024 entries into a work item that gets queued. Threadpool (custom class) has 32 possible threads. Opens 32 connections.
Problem: performance is off by a factor of 300.... only about 100 to 150 rows are inserted per second. Log wait time is up to 40% - 45% of processing time (ms per second) in sql server. Server cpu load is low (4% to 5% or so).
Not usable: bulk insert. The data must be written as real time as possible to the disc. THis is pretty much an archivl process of data running through the system, but there are queries which need access to the data regularly. I could try dumping them to disc and using bulk upload 1-2 times per second.... will give this a try.
Anyone a smart idea? My next step is moving the log to a fast disc set (128gb modern ssd) and to see what happens then. The significant performance boost probably will do things quite different. But even then.... the question is whether / what is feasible.
So, please fire on the smart ideas.
Ok, anywering myself. Going to give SqlBulkCopy a try, batching up to 65536 entries and flushing them out every second in an async fashion. Will report on the gains.
I'm going through the exact same issue here, so I'll go through the steps i'm taking to improve my performance.
Separate the log and the dbf file onto different spindle sets
Use basic recovery
you didn't mention any indexing requirements other than the fact that the order of inserts isn't important - in this case clustered indexes on anything other than an identity column shouldn't be used.
start your scaling of concurrency again from 1 and stop when your performance flattens out; anything over this will likely hurt performance.
rather than dropping to disk to bcp, and as you are using SQL Server 2008, consider inserting multiple rows at a time; this statement inserts three rows in a single sql call
INSERT INTO table VALUES ( 1,2,3 ), ( 4,5,6 ), ( 7,8,9 )
I was topping out at ~500 distinct inserts per second from a single thread. After ruling out the network and CPU (0 on both client and server), I assumed that disk io on the server was to blame, however inserting in batches of three got me 1500 inserts per second which rules out disk io.
It's clear that the MS client library has an upper limit baked into it (and a dive into reflector shows some hairy async completion code).
Batching in this way, waiting for x events to be received before calling insert, has me now inserting at ~2700 inserts per second from a single thread which appears to be the upper limit for my configuration.
Note: if you don't have a constant stream of events arriving at all times, you might consider adding a timer that flushes your inserts after a certain period (so that you see the last event of the day!)
Some suggestions for increasing insert performance:
Increase ADO.NET BatchSize
Choose the target table's clustered index wisely, so that inserts won't lead to clustered index node splits (e.g. autoinc column)
Insert into a temporary heap table first, then issue one big "insert-by-select" statement to push all that staging table data into the actual target table
Apply SqlBulkCopy
Choose "Bulk Logged" recovery model instad of "Full" recovery model
Place a table lock before inserting (if your business scenario allows for it)
Taken from Tips For Lightning-Fast Insert Performance On SqlServer
I have a very large set of data (~3 million records) which needs to be merged with updates and new records on a daily schedule. I have a stored procedure that actually breaks up the record set into 1000 record chunks and uses the MERGE command with temp tables in an attempt to avoid locking the live table while the data is updating. The problem is that it doesn't exactly help. The table still "locks up" and our website that uses the data receives timeouts when attempting to access the data. I even tried splitting it up into 100 record chunks and even tried a WAITFOR DELAY '000:00:5' to see if it would help to pause between merging the chunks. It's still rather sluggish.
I'm looking for any suggestions, best practices, or examples on how to merge large sets of data without locking the tables.
Thanks
Change your front end to use NOLOCK or READ UNCOMMITTED when doing the selects.
You can't NOLOCK MERGE,INSERT, or UPDATE as the records must be locked in order to perform the update. However, you can NOLOCK the SELECTS.
Note that you should use this with caution. If dirty reads are okay, then go ahead. However, if the reads require the updated data then you need to go down a different path and figure out exactly why merging 3M records is causing an issue.
I'd be willing to bet that most of the time is spent reading data from the disk during the merge command and/or working around low memory situations. You might be better off simply stuffing more ram into your database server.
An ideal amount would be to have enough ram to pull the whole database into memory as needed. For example, if you have a 4GB database, then make sure you have 8GB of RAM.. in an x64 server of course.
I'm afraid that I've quite the opposite experience. We were performing updates and insertions where the source table had only a fraction of the number of rows as the target table, which was in the millions.
When we combined the source table records across the entire operational window and then performed the MERGE just once, we saw a 500% increase in performance. My explanation for this is that you are paying for the up front analysis of the MERGE command just once instead of over and over again in a tight loop.
Furthermore, I am certain that merging 1.6 million rows (source) into 7 million rows (target), as opposed to 400 rows into 7 million rows over 4000 distinct operations (in our case) leverages the capabilities of the SQL server engine much better. Again, a fair amount of the work is in the analysis of the two data sets and this is done only once.
Another question I have to ask is well is whether you are aware that the MERGE command performs much better with indexes on both the source and target tables? I would like to refer you to the following link:
http://msdn.microsoft.com/en-us/library/cc879317(v=SQL.100).aspx
From personal experience, the main problem with MERGE is that since it does page lock it precludes any concurrency in your INSERTs directed to a table. So if you go down this road it is fundamental that you batch all updates that will hit a table in a single writer.
For example: we had a table on which INSERT took a crazy 0.2 seconds per entry, most of this time seemingly being wasted on transaction latching, so we switched this over to using MERGE and some quick tests showed that it allowed us to insert 256 entries in 0.4 seconds or even 512 in 0.5 seconds, we tested this with load generators and all seemed to be fine, until it hit production and everything blocked to hell on the page locks, resulting in a much lower total throughput than with the individual INSERTs.
The solution was to not only batch the entries from a single producer in a MERGE operation, but also to batch the batch from producers going to individual DB in a single MERGE operation through an additional level of queue (previously also a single connection per DB, but using MARS to interleave all the producers call to the stored procedure doing the actual MERGE transaction), this way we were then able to handle many thousands of INSERTs per second without problem.
Having the NOLOCK hints on all of your front-end reads is an absolute must, always.