Do Databricks Tables Support Transaction Isolation Levels?

I am working with Databricks tables in the Azure Databricks service, but it looks to me like Databricks tables do not support transaction isolation levels. What happens when a table is being updated/deleted/inserted into while another process is accessing (reading or modifying) the same table?

Azure Databricks table schema is immutable.
Delta Lake provides ACID transaction guarantees between reads and writes. This means that:
Multiple writers across multiple clusters can simultaneously modify a table partition; they see a consistent snapshot view of the table, and there is a serial order for these writes.
Readers continue to see a consistent snapshot view of the table that the Azure Databricks job started with, even when a table is modified during a job.
The isolation level of a table defines the degree to which a transaction must be isolated from modifications made by concurrent transactions. Delta Lake on Azure Databricks supports two isolation levels: Serializable and WriteSerializable.
Serializable: The strongest isolation level. It ensures that committed write operations and all reads are Serializable. Operations are allowed as long as there exists a serial sequence of executing them one-at-a-time that generates the same outcome as that seen in the table. For the write operations, the serial sequence is exactly the same as that seen in the table’s history.
WriteSerializable (Default): A weaker isolation level than Serializable. It ensures only that the write operations (that is, not reads) are serializable. However, this is still stronger than Snapshot isolation. WriteSerializable is the default isolation level because it provides a good balance of data consistency and availability for most common operations.
In this mode, the content of the Delta table may be different from that which is expected from the sequence of operations seen in the table history. This is because this mode allows certain pairs of concurrent writes (say, operations X and Y) to proceed such that the result would be as if Y was performed before X (that is, serializable between them) even though the history would show that Y was committed after X. To disallow this reordering, set the table isolation level to be Serializable to cause these transactions to fail.
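A minimal sketch of opting a table into the stricter level, assuming a Delta table named `events` (the table name is a placeholder; `delta.isolationLevel` is the documented table property):

```sql
-- Switch from the default WriteSerializable to Serializable so that
-- concurrent writes that cannot be serialized in history order fail.
ALTER TABLE events SET TBLPROPERTIES ('delta.isolationLevel' = 'Serializable');
```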
For more information on which types of operations can conflict with each other in each isolation level and the possible errors, see Concurrency control.
For more details, refer to "Azure Databricks - Isolation levels".

Related

Oracle Optimizer Statistics Advisor and its effect on gathering table statistics

Recently I have been reading more about the 'Optimizer Statistics Advisor' and I have done some tests on my test database. It gave the recommendation below:
Rule Name: UseConcurrent
Rule Description: Use Concurrent preference for Statistics Collection
Finding: The CONCURRENT preference is not used.
Recommendation: Set the CONCURRENT preference.
Example:
dbms_stats.set_global_prefs('CONCURRENT', 'ALL');
Rationale: The system's condition satisfies the use of concurrent statistics
gathering. Using CONCURRENT increases the efficiency of statistics
gathering.
As I understand from Oracle Base:
Concurrent statistics collection is simply the ability to gather statistics on multiple tables, or table partitions, at the same time. This is done using a combination of the job scheduler, advanced queuing and resource manager.
so this recommendation applies to the whole database, not to individual tables. What I am saying is: if I gather statistics for a single table, such a recommendation will not have any benefit, correct? Also, is there a way the 'Optimizer Statistics Advisor' can be applied to specific tables?
Unlikely, because one of the overall aims for the optimizer team is that for 99% of customers, the default settings of the optimizer statistics gathering mechanisms will be sufficient to ensure good plans.
This is why (for example)
when you use CTAS, you'll get stats collected automatically on creation, so (in most cases) there's no need for additional steps before using the table in queries
a table that has queries that use predicates on skewed data will ultimately end up with histograms (via SYS.COL_USAGE$) without user intervention needed
tables without stats will get them automatically in the next run, and until then will be dynamically sampled
tables that change by 10% will have their stats picked up again on the next run
and so forth. The aim is that (except in niche cases) you'll not need to "worry" about the optimizer statistics process.
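As for the per-table part of the question: gathering statistics for a single non-partitioned table runs as one job, so the CONCURRENT preference (which parallelizes across tables and partitions) buys nothing there. A minimal sketch, with placeholder owner and table names:

```sql
-- Stats for one non-partitioned table run as a single operation;
-- the global CONCURRENT preference does not speed this up.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => 'APP_OWNER', tabname => 'ORDERS');
END;
/
```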

CockroachDB 2-phase commits with or without blocking?

Two-phase commits are supposed to suffer from blocking problems. Is that the case with CockroachDB, and if not, how is it avoided?
Summary: 2-phase commits are blocking, so it is important to keep the thing that is being 2-phase committed as "small" as possible, so that the set of actions that get blocked is minimal. CockroachDB does this using MVCC with write intents, 2-phase committing only a single intent. Because CockroachDB provides serializable transactions, it reorders transaction timestamps so that blocking occurs only where absolutely necessary.
Longer answer
2-phase commits are blocking after the first phase, while all participants wait for a reply from the coordinator as to whether the second phase is to be committed or aborted. During this time period, participants that have already sent a "Yes" vote cannot unilaterally revoke their vote, but also cannot treat the transaction as committed (as the coordinator might come back with an abort). So they are forced to block all subsequent actions that need to concretely know the state of this transaction. The key word in that sentence is "need": it is on us to design our system to reduce that set to the bare minimum. CockroachDB uses write intents and MVCC to minimize these dependencies.
Consider a naïve implementation of a distributed (multi-key) transactional key-value store: I wish to transactionally commit some write transaction t1. t1 spans many keys across many machines, but of particular concern is that it writes k1 = v2. k1 is on machine m1 (let's say k1=v1 was the previous value).
Since t1 spans many keys on many machines, all of them are involved in a 2-phase commit transaction. Once that 2-phase transaction is begun, we have to note that we have an intent to write k1=v2, and the status of the transaction is unknown (the transaction may abort, because one of the other writes cannot proceed).
Now if some other transaction t2 comes along which wants to read the value of k1, we simply cannot give that transaction an authoritative answer, until we know the final result of the 2-phase commit. t2 is blocked.
But, we (and CockroachDB) can do better. We can keep multiple versions of values for each key, and have a concurrency control mechanism to keep all of these versions in order. Namely, we can assign our transactions timestamps, and have our writes look (loosely) as follows:
`k1 = v1 committed at time=1`
`k1 = v2 at time=110 INTENT (pending transaction t1)`
Now, when t2 comes along, it has an option: it can choose to do the read at time<=109, which would not be blocked on t1. Of course, some transactions cannot do this (if say, they also are distributed, and there's a different component that simply requires a higher timestamp). Those transactions will be blocked. But in practice, this frees up the database to assign timestamps such that many types of transactions can proceed.
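CockroachDB also exposes this idea directly in SQL: a client that can tolerate slightly stale data may request a historical read at a fixed past timestamp, which never blocks on in-flight write intents with higher timestamps (the table name is a placeholder):

```sql
-- Read the table as it was ~10 seconds ago; no blocking on newer intents.
SELECT * FROM accounts AS OF SYSTEM TIME '-10s';
```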
As the other answer says, Cockroach Labs has a post about CockroachDB's use of MVCC (linked in that answer), which explains some further details as well.
CockroachDB has a long blog post on how it uses 2-phase commit without locking here: https://www.cockroachlabs.com/blog/how-cockroachdb-distributes-atomic-transactions/
The part that deals most with the prevention of locking is its use of "write intents" (see the heading "Stage: Write Intents" in the blog post).

Can a single SELECT significantly degrade the performance of an Oracle Database?

All,
Consider
an Oracle Database of version 10gR2 or greater.
Hardware sufficient to exceed existing peak workload by more than 100%
Can you construct any circumstance in which a single user with a single connection and read-only access to user tables (not system tables or v$ or x$ tables) can degrade the overall performance of an Oracle database? Also list mitigation strategies, if any.
As stipulated, consider that this is not about a database which is a few CPU-cycles away from saturation such that any additional load would be dangerous. This is about a well sized box for the current workload.
E.g.
If a user uses a parallel hint, Oracle may use a very high DOP to execute that query and starve other processes of CPU. Mitigation: explain to the user that parallel hints are forbidden.
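For illustration, a hypothetical query of this shape (table name invented) requests a very high degree of parallelism:

```sql
-- Asks for 64 parallel workers; on a shared system this can starve
-- every other session of CPU and I/O.
SELECT /*+ PARALLEL(s, 64) */ COUNT(*)
FROM sales s;
```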
The simplest approach would be to issue a query that does a Cartesian product of your biggest table with itself a bunch of times. That will blow out your TEMP tablespace rather quickly and generate errors for other sessions that need to sort. You can mitigate that either by granting limited quotas on TEMP (which may get tricky if this is an application account that is used by multiple people simultaneously rather than an individually identifiable account) or by using Resource Manager to kill sessions that either run too long or that use too much CPU or I/O resources.
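A sketch of such a query (big_table is a placeholder); the repeated self-join explodes the row count and the ORDER BY forces a huge sort to TEMP:

```sql
-- Three-way Cartesian self-join: row_count^3 rows sorted into TEMP.
SELECT *
FROM   big_table a, big_table b, big_table c
ORDER  BY a.id;
```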
Even without an explicit PARALLEL hint, it's possible that Oracle will use parallelism automatically. Depending on the Oracle version and how you've configured parallelism, you can turn on parallel automatic tuning to limit the total number of parallel workers active at any time. Of course, that doesn't prevent the read-only user from creating a dozen sessions, each of which spawns a couple of parallel workers that choke off other sessions that you actually want to run more parallel workers. You can use Resource Manager to configure priorities for different types of workload if and when the system becomes CPU constrained.
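A minimal Resource Manager sketch along those lines, assuming a plan DAYTIME_PLAN and consumer group ADHOC_USERS already exist (all names are placeholders):

```sql
BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA;
  -- Kill sessions in the ad-hoc group after 10 minutes of execution.
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan             => 'DAYTIME_PLAN',
    group_or_subplan => 'ADHOC_USERS',
    comment          => 'Limit runaway ad-hoc queries',
    switch_group     => 'KILL_SESSION',
    switch_time      => 600);
  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA;
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA;
END;
/
```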
And if you allow an anonymous PL/SQL block, there are more ways to generate havoc by, for example, creating a nested table that gets filled with billions of rows until your PGA is exhausted.
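For example, a sketch of such a block (do not run this on a shared system); it grows a collection until the session's PGA memory is exhausted (ORA-04030):

```sql
DECLARE
  TYPE t_tab IS TABLE OF VARCHAR2(4000);
  v t_tab := t_tab();
BEGIN
  LOOP
    -- Each iteration appends another ~4 KB element, held in PGA.
    v.EXTEND;
    v(v.COUNT) := RPAD('x', 4000, 'x');
  END LOOP;
END;
/
```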

Would you recommend using Hadoop/HBASE?

We have a SQL Server 2008 instance, and one of the tables, say table A, has the following characteristics:
Every day we get several heterogeneous feeds from other systems with numerical data.
Feeds are staged elsewhere, converted to a format compliant with A's schema.
Inserted into A.
Schema looks like:
<BusinessDate> <TypeId> <InsertDate> <AxisX> <AxisY> <Value>
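In SQL Server terms, that is roughly the following (column types are guesses for illustration):

```sql
CREATE TABLE A (
    BusinessDate DATE          NOT NULL,
    TypeId       INT           NOT NULL,
    InsertDate   DATETIME      NOT NULL,
    AxisX        INT           NOT NULL,
    AxisY        INT           NOT NULL,
    Value        DECIMAL(18,6) NOT NULL
);
```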
The table has a variable number of rows. Essentially we have to purge it at the weekends, otherwise its size affects performance. So the size ranges from 3m-15m rows during the week. Due to some new requirements we expect this number to increase by 10m by the end of 2012. So we would be talking about 10m-25m rows.
Now in addition
Data in A never changes. The middle tier may use A's data, but only as a read-only operation. Typically the middle tier doesn't even care about the contents; it usually (not always, but in 80% of cases) runs stored procs to generate reports and delivers those reports to other systems.
Clients of this table would typically want to do long sequential reads for one business date and type, i.e. "get me all type 1 values for today".
Clients will want to join this table with 3-5 more tables and then deliver reports to other systems.
The above assumptions are not necessarily valid for all tables with which A is joined. For example we usually join A with a table B and do a computation like B.value*A.value. B.value is a volatile column.
Question
A's characteristics do sound very much like what HBase and other column oriented schemas can offer.
However some of the joins are with volatile data.
Would you recommend migrating A to an HBase schema?
And also, if we were to move A, I would assume we would also have to migrate B and other dependent tables, which (unlike A) are used in several other places by the middle tier. Wouldn't this complicate things a lot?
25 million rows doesn't sound big enough to justify using HBase, although the usage pattern fits. You need a NameNode, a JobTracker, an HBase master, and then your region servers, so you'll be needing a minimum of maybe 5 nodes to run HBase in any reasonable way. Your rows are so small I'm guessing it's maybe 10 GB of data, so storing this across 5 servers seems like overkill.
If you do go this route (perhaps you want to store more than a week's data at once), there are ways to integrate HBase with relational DBs. Hive, for example, provides ODBC/JDBC connectivity and can query HBase. Oracle and Teradata both provide integration between their relational DB software and non-relational storage. I know Microsoft has recently announced that they are dropping Dryad in favor of integrating with Hadoop, but I am not certain how far along that process is with respect to SQL Server. And if all you need is "get a list of IDs to use in my SQL query", you can of course write something yourself easily enough.
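For a flavor of the Hive route: a Hive external table can be mapped onto an existing HBase table via the HBase storage handler and then queried with SQL (table names and the column mapping below are placeholders):

```sql
CREATE EXTERNAL TABLE a_hbase (rowkey STRING, type_id INT, value DOUBLE)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,d:type_id,d:value')
TBLPROPERTIES ('hbase.table.name' = 'A');
```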
I think HBase is very exciting, and there may be things you haven't mentioned which would drive you towards it (e.g. high availability). But my gut says you can probably scale out your relational db much more cheaply than switching to HBase.

A question about Oracle undo segment binding

I'm no DBA, I just want to learn about Oracle's Multi-Version Concurrency model.
When launching a DML operation, the first step in the MVCC protocol is to bind an undo segment. The question is: why can one undo segment only serve one active transaction?
Thank you for your time.
Multi-Version Concurrency is probably the most important concept to grasp when it comes to Oracle. It is good for programmers to understand it even if they don't want to become DBAs.
There are a few aspects to this, but they all come down to efficiency: undo management is overhead, so minimizing the number of cycles devoted to it contributes to the overall performance of the database.
A transaction can consist of many statements and generate a lot of undo: it might insert a single row, or it might delete thirty thousand. It is better to assign one empty UNDO block at the start rather than continually scouting around for partially filled blocks with enough space.
Following on from that, sharing undo blocks would require the kernel to keep track of usage at a much finer granularity, which is just added complexity.
When the transaction completes, the undo is released (unless it is still needed, see the next point). The fewer blocks the transaction has used, the fewer latches have to be reset. Plus, if blocks were shared, we would have to free shards of a block, which is just more effort.
The key thing about MVCC is read consistency. This means that all the records returned by a long-running query will appear in the state they had when the query started. So if I issue a SELECT on the EMP table which takes fifteen minutes to run, and halfway through you commit an update of all the salaries, I won't see your change. The database does this by retrieving the undo data from the blocks your transaction used. Again, this is a lot easier when all the undo data is collocated in one or two blocks.
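A two-session sketch of that scenario (EMP as in the example):

```sql
-- Session 1: starts a long-running query at time T0.
SELECT ename, sal FROM emp;            -- runs for fifteen minutes

-- Session 2: meanwhile, updates every salary and commits.
UPDATE emp SET sal = sal * 1.1;
COMMIT;

-- Session 1 still returns salaries as of T0: Oracle rebuilds the
-- pre-update row versions from the undo written by Session 2.
```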
"why one undo segment can only serve for one active transaction?"
It is simply a design decision. That is how undo segments are designed to work. I guess that it was done to address some of the issues that could occur with the previous rollback mechanism.
Rollback (which is still available but deprecated in favor of undo) included explicit creation of rollback segments by the DBA, and multiple transactions could be assigned to a single rollback segment. This had some drawbacks, most obviously that if one transaction assigned to a given segment generated enough rollback data that the segment was full (and could no longer extend), then other transactions using the same segment would be unable to perform any operation that would generate rollback data.
I'm surmising that one design goal of the new undo feature was to prevent this sort of inter-transaction dependency. Therefore, they designed the mechanism so that the DBA sizes and creates the undo tablespace, but the management of segments within it is done internally by Oracle. This allows the use of dedicated segments by each transaction. They can still cause problems for each other if the tablespace fills up (and cannot autoextend), but at the segment level there is no possibility of one transaction causing problems for another.
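You can see the one-segment-per-transaction binding in the dictionary: each row in V$TRANSACTION carries the undo segment number (XIDUSN) it is bound to. A query sketch:

```sql
-- One row per active transaction; XIDUSN identifies its undo segment.
SELECT s.sid, t.xidusn AS undo_segment#, t.used_ublk, t.used_urec
FROM   v$transaction t
JOIN   v$session     s ON s.taddr = t.addr;
```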
