Sqoop: sqoop export DB2 lock - sqoop

Does sqoop export issue any locks while exporting data from Hive to DB2?
What type of lock does it issue? If there is a lock, how are these locks released?
I get a validation error because there are parallel sqoop export processes running against the same DB2 table, hence I am wondering whether any locks are issued and of what type.

Yes Aavik, DB2 does use locks. There are three types of locks:
1. S lock (share)
2. U lock (update)
3. X lock (exclusive)
When you scan a table to read data, e.g. when you run select * from table where <condition>, DB2 performs a read operation and applies an S lock on the table, meaning other requests can read the data but cannot update or write it.
When you update existing rows, it applies a U lock.
When you insert new data, it acquires an X lock, meaning no other read or update operations are allowed.
So when you do a sqoop export from Hive to DB2, it acquires an X lock on the table because it is inserting new records.
When you do a sqoop import, it acquires an S lock on the table.
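As an illustration, DB2 also lets you request these lock modes explicitly with LOCK TABLE; a minimal sketch, where mytable is a placeholder name:
LOCK TABLE mytable IN SHARE MODE;     -- S lock: others can still read, but not update
LOCK TABLE mytable IN EXCLUSIVE MODE; -- X lock: blocks other readers and writers until the unit of work ends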
This is a very common problem everywhere; you have a few options to overcome it.
1. Maintain separate views/tables for regular transactions.
2. Increase the maximum number of retries, or write a shell script that checks whether the DB2 table is free of locks before starting the export (see the sketch below); basically you have to create a dependency between the jobs. I know this becomes a bit complicated, and there may be better ways to do it.
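For option 2, a minimal sketch of such a check, assuming a DB2 LUW system where the SYSIBMADM.SNAPLOCK administrative view is available (MYSCHEMA and MYTABLE are placeholders); a wrapper script would only start the export when this returns no rows:
SELECT agent_id, lock_object_type, lock_mode, lock_status
FROM sysibmadm.snaplock
WHERE tabschema = 'MYSCHEMA'
  AND tabname = 'MYTABLE';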
Hope this gives you a better understanding.

Related

Preserve exclusive table lock after DDL in Oracle

It is a well known fact that in an Oracle database it is not possible to make a transaction out of multiple DDL statements.
However, is there any way to lock a specific set of database objects within the current connection so that after a DDL query is executed, all locks are held until they are explicitly released?
An obvious solution of this kind doesn't work, because executing the DDL statement automatically commits the transaction, and with it, the locks are released:
LOCK TABLE x ....;
LOCK TABLE y ....;
ALTER TABLE x ....; -- Does not work properly since table locks are released here
ALTER TABLE y ....;
ALTER TABLE x ....;
COMMIT;
The DBMS_LOCK option doesn't work either, because it is an advisory lock, and the concurrent thread must respect this lock and at least be aware of its existence.
Moreover, there is no way to control which statements concurrent threads/sessions execute. Queries can only be issued from the current session, and it must be guaranteed that no intermediate queries against tables X and Y are executed from other sessions until the current session's work has finished.
Are there any ideas how this can be implemented?
PS: Please don't mention the high-level task or XY problem. There is no high-level task. The question is posed exactly as it is.
A bit of a joke (breaks all dependent PL/SQL), but... ;)
ALTER TABLE x RENAME TO x__my_precious;
ALTER TABLE y RENAME TO y__my_precious;
ALTER TABLE x__my_precious ...;
ALTER TABLE y__my_precious ...;
ALTER TABLE x__my_precious RENAME TO x;
ALTER TABLE y__my_precious RENAME TO y;
I'm pretty sure what you're trying to do isn't possible with Oracle's native transaction control. DDL will always end a transaction, so no lock on that object is going to survive it. Even if you immediately attempted to lock it after the DDL, another waiting session could slip in and obtain the lock before you do.
You can, however, serialize access to the table by utilizing another dummy table or row in a dummy table, assuming you control the code of any process wishing to access the table. If this is the case, then before accessing the table, attempt to lock the dummy table or a row in it first, and only if it succeeds continue with accessing the main table. Then the process that does DDL can take out that same lock (preventing other processes from proceeding), then do the DDL in a subroutine (named PL/SQL block) with PRAGMA AUTONOMOUS_TRANSACTION. That way the DDL ends the autonomous transaction rather than the main one, which still holds the lock on the dummy table.
You have to use a dummy table because, if you tried to use the same table you want to modify, you'd deadlock yourself. Of course, this only works if you can make all other processes do the lock-the-dummy-table safety check before they can proceed.
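A minimal sketch of that approach, assuming a pre-created guard table; the name ddl_guard and the ALTER statements are placeholders:
LOCK TABLE ddl_guard IN EXCLUSIVE MODE;  -- held by the main transaction

DECLARE
  -- The DDL runs in an autonomous transaction, so its implicit commit
  -- does not release the lock held by the main transaction.
  PROCEDURE run_ddl(p_stmt VARCHAR2) IS
    PRAGMA AUTONOMOUS_TRANSACTION;
  BEGIN
    EXECUTE IMMEDIATE p_stmt;
  END;
BEGIN
  run_ddl('ALTER TABLE x ADD (tmp_col NUMBER)');  -- placeholder DDL
  run_ddl('ALTER TABLE y ADD (tmp_col NUMBER)');
END;
/

COMMIT;  -- only now is the lock on ddl_guard released
Other processes that should be serialized would first acquire the lock on ddl_guard (e.g. in share mode) before touching x and y.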
Lastly, although what I said above should work, it is likely that you're trying to do something you shouldn't do. DDL against hot objects isn't a good idea. Whatever you're trying to do, there is probably a better way than modifying objects on the fly like this. Even if you are able to keep others locked out, you are likely to cause object reference errors, package invalidations, SQL cursor invalidations, etc. It can be a real mess.

Hive truncate table takes too much time

My Hive query Truncate table tablename is taking too much time. The table definition has these properties defined:
CLUSTERED BY(field1) INTO 2 BUCKETS
STORED AS ORC TBLPROPERTIES('transactional'='true');
The data in the table may be only 20-30k rows.
ACID transactions are enabled.
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.enforce.bucketing=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.compactor.initiator.on=true;
set hive.compactor.worker.threads=1;
After waiting for a long time, it throws the error below:
FAILED: Error in acquiring locks: Lock acquisition for
LockRequest(component:[LockComponent(type:EXCLUSIVE, level:TABLE, dbname:db1,
tablename:tbl1, operationType:NO_TXN, isAcid:true)], txnid:0, user:xyz,
hostname:host123, agentInfo:xyz_20190310220349_62d794b8-3166-4049-b9f9-646e40f1d344) timed out after 5503335ms. LockResponse(lockid:5563,
state:WAITING)
But no other user or job is using this table, so there should be nothing to wait on for the lock. What else could be the reason for the wait?
Also, an insert query (for a particular condition) was executed right before the truncate.
As there was no other answer, I would like to mention that in my case DELETE FROM the table completed in the usual time (it took 2 minutes and, more importantly, gave no lock error), compared to TRUNCATE.
Hive's concurrency support for ACID tables is not quite right.
As per https://community.hortonworks.com/content/supportkb/150639/hive-queries-randomly-fail-due-to-error-in-acquiri.html
Disable concurrency support with
(set hive.support.concurrency=false)
and restart the affected components.
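Before turning concurrency off, it can also help to confirm whether anything really holds a lock; a minimal check from the Hive CLI/Beeline, using the database and table names from the error message:
USE db1;
SHOW LOCKS tbl1 EXTENDED;  -- lists locks currently held or requested on the table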

Disable queries on table while updating

I have a pl/sql script that clears (via delete from statements) and populates several dependent tables like this:
delete from table-A
insert into table-A values(...)
delete from table-B
insert into table-B values(...)
These operations take ~10 seconds to complete, and I'd like to block all SQL queries that try to read data from table-A or table-B while the tables are being updated. These queries should wait and continue execution once table-A and table-B are completely updated.
What is the proper way to do this?
As others have pointed out, Oracle's basic concurrency model is that writers do not block readers and readers do not block writers. You can't stop a simple select from running. Your queries will see the data as of the SCN that they started executing (assuming that you're using the default read committed transaction isolation level) so they will have a consistent view of the data before your updates started.
You could potentially acquire a custom named lock using dbms_lock.request. You would need to acquire this lock before running your updates and every session that queries the tables would also need to acquire the lock before it starts to query the tables. That will, obviously, decrease the scalability of your application but it will accomplish what you appear to be asking for. Presumably, the sessions doing queries can acquire the lock in shared mode while the session doing the updates would need to acquire it in exclusive mode.
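A rough sketch of that idea for the session doing the updates, using a made-up lock name refresh_tables_lock (reading sessions would run the same code but request DBMS_LOCK.S_MODE before querying the tables):
DECLARE
  l_handle VARCHAR2(128);
  l_status INTEGER;
BEGIN
  -- Turn the agreed-upon lock name into a handle shared by all sessions.
  DBMS_LOCK.ALLOCATE_UNIQUE('refresh_tables_lock', l_handle);

  -- The updating session takes the lock exclusively.
  l_status := DBMS_LOCK.REQUEST(lockhandle        => l_handle,
                                lockmode          => DBMS_LOCK.X_MODE,
                                timeout           => 60,
                                release_on_commit => TRUE);
  IF l_status = 0 THEN
    -- ... delete from / insert into table-A and table-B here ...
    COMMIT;  -- releases the lock because release_on_commit => TRUE
  END IF;
END;
/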

What is the use of disable operation in hbase?

I know it is to disallow anyone from performing any operation on a table when a schema change is going to be made.
> disable 'table_name'
But I want more clarification on it. Why should we disallow others from performing any operation on it? Is it just because wrong or unexpected results would be returned if a query were run while a schema change is in progress?
HBase is a strictly consistent NoSQL database for both reads and writes.
So maintaining consistency during DB operations is very important for HBase.
HBase requires disabling a table when altering its schema or dropping it.
HBase doesn't have a protocol for telling all the regions to pick up schema changes online, so we need to disable the table before altering it.
Dropping an HBase table is a two-step procedure:
1. Close all the regions, i.e. disable the table.
2. Drop them, i.e. drop the table.
So we must disable all operations on the table, except a few such as list, is_enabled, and is_disabled, before dropping it.

ORA-00054 while loading large data file

I get ORA-00054 while loading large data files (~10 GB).
The error occurs when a new file is loaded after a previous file.
Any ideas how I can solve this?
One possible scenario:
Is this a direct-path load? If so, please check the v$locked_object view and see whether the table is being locked by someone during your load.
select dbao.object_name
from v$locked_object vlo,
dba_objects dbao
where vlo.object_id = dbao.object_id
and dbao.object_name = 'Table that you are trying to load...'
From the Oracle Documentation at http://download.oracle.com/docs/cd/B10500_01/server.920/a96524/c21dlins.htm:
Locking Considerations with Direct-Path INSERT
During direct-path INSERT, Oracle obtains exclusive locks on the table (or on all partitions of a partitioned table). As a result, users cannot perform any concurrent insert, update, or delete operations on the table, and concurrent index creation and build operations are not permitted. Concurrent queries, however, are supported, but the query will return only the information before the insert operation.
Maybe this is linked to tablespace datafile sizes or table size, because ORA-00054 usually appears when an ALTER statement is run.
I do not pretend to be right here.
Check those views.
DBA_BLOCKERS - Shows non-waiting sessions holding locks that are being waited on
DBA_DDL_LOCKS - Shows all DDL locks held or being requested
DBA_DML_LOCKS - Shows all DML locks held or being requested
DBA_LOCK_INTERNAL - Displays one row for every lock or latch held or being requested, with the username of who is holding it
DBA_LOCKS - Shows all locks or latches held or being requested
DBA_WAITERS - Shows all sessions waiting on, but not holding, the waited-for locks
http://www.dba-oracle.com/t_ora_00054_locks.htm
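For example, a quick check against one of those views for the table being loaded (MY_TABLE is a placeholder):
SELECT session_id, owner, name, mode_held, mode_requested
FROM dba_dml_locks
WHERE name = 'MY_TABLE';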
Your table seems to be locked: ORA-00054.
It can be because of the way the Oracle driver handles BLOB types (the driver locks the record, opens a stream to write the binary data, and needs "some help" to release the record).
I would try the following sequence (sketched below):
Load the first file
COMMIT;
Load the second file
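If the loads are direct-path (APPEND) inserts, that sequence would look roughly like this; target_table and the staging sources are placeholders:
INSERT /*+ APPEND */ INTO target_table SELECT * FROM staging_file_1;
COMMIT;  -- the exclusive table lock taken by the direct-path insert is held until this commit
INSERT /*+ APPEND */ INTO target_table SELECT * FROM staging_file_2;
COMMIT;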
