How to optimize user search in OIM 11gR2PS2 - oracle

We have a large user base in our Oracle Identity Manager system, with over 0.5 million records in the USR table. Our trusted reconciliation scheduled jobs run every 2 hours. While running the trusted reconciliation scheduled jobs for LDAP and FlatFile, OIM fires a search query on the USR table every time to list all active users. Due to the large user base, this query takes a long time, and our scheduled job, which is supposed to bring in fewer than 100 inserts/updates, takes around 1 hour to complete. Is there a way to optimize it? I have gone through the OIM optimization guide and have done all the optimizations suggested by Oracle, which include putting the USR table in the default buffer pool. Any suggestions would be appreciated.
Thanks.
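One way to see where the time actually goes (a hedged sketch, not official OIM tuning guidance; the LIKE filter on the SQL text and the DBMS_XPLAN format option are assumptions about how the reconciliation search shows up in your shared pool):

-- Find the long-running search against USR and how much time it accounts for
SELECT sql_id, executions, elapsed_time/1e6 AS elapsed_sec, sql_text
FROM   v$sql
WHERE  UPPER(sql_text) LIKE '%FROM USR%'   -- assumption: the statement selects from USR
ORDER  BY elapsed_time DESC;

-- Inspect the plan of the worst offender (substitute the sql_id found above)
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('&sql_id', NULL, 'ALLSTATS LAST'));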

Related

Create Oracle DB user with expiration time

I would like to create an Oracle DB user and then disable him exactly 8 hours later.
I don't care if the user just gets locked or if all of his roles are revoked; I just want to prevent him from doing any activity on the DB exactly 8 hours after his DB user was created.
Does Oracle provide such an option out of the box?
If not, I might go with the following solution:
create a table where all newly created DB users are stored (with the DB user creation time)
create a trigger on CREATE USER, so I save the DB username and his creation time in my table
create a function/job that checks my table every 5 minutes for any user older than 8 hours and locks him
My proposed solution is very nasty so I really hope there's a better solution for my issue.
How about creating a profile, which is a set of limits on database resources? If you assign the profile to a user, then that user cannot exceed these limits.
In particular, check the following parameters:
CONNECT_TIME: Specify the total elapsed time limit for a session, expressed in minutes
PASSWORD_LIFE_TIME: Specify the number of days the same password can be used for authentication. If you also set a value for PASSWORD_GRACE_TIME, then the password expires if it is not changed within the grace period, and further connections are rejected. If you omit this clause, then the default is 180 days
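A rough sketch of how that might look (the profile and user names are made up for illustration; note that CONNECT_TIME is a resource limit, so it is only enforced when the RESOURCE_LIMIT initialization parameter is TRUE):

CREATE PROFILE temp_8h_profile LIMIT
  CONNECT_TIME        480     -- minutes: any single session is disconnected after 8 hours
  PASSWORD_LIFE_TIME  8/24;   -- days (fractions allowed): the password expires 8 hours after it is set

CREATE USER temp_user IDENTIFIED BY "SomePassword#1" PROFILE temp_8h_profile;
GRANT CREATE SESSION TO temp_user;

Neither limit locks the account at exactly the 8-hour mark on its own: CONNECT_TIME caps individual sessions and PASSWORD_LIFE_TIME expires the password (subject to the grace-period behavior described above), so if "exactly 8 hours after creation" is a hard requirement, a scheduled lock job may still be needed.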

Oracle auto-gather statistics & stale statistics

Starting in Oracle 11g, GATHER_STATS_JOB is no longer valid and has been replaced by "auto optimizer stats collection".
This job supposedly runs during the maintenance windows and gathers statistics for objects which have changed 10% or more, or which have stale stats. If this is true, then why, when I run a query checking stale_stats = 'YES', do I still get objects back?
Maybe I'm not understanding how the job executes...
There are two broad possibilities:
Oracle updates stale_statistics to "YES" in dba_tab_statistics periodically throughout the day as tables undergo changes. It is entirely possible that a table had just under the threshold amount of changes when stats were gathered this morning and that stale_stats flipped to "YES" during the day today when a few more changes were made.
Depending on how many objects had stale stats when the job ran, how much data those tables contained, how large your maintenance window is, and how powerful your server is, it is possible that the statistics job had to be aborted before it could re-gather all the stale statistics. If the job was aborted, that abort would be logged in the job history. If this happened because there happened to be a large number of changes one day (say you ran an annual purge process that deleted a large amount of data from almost every table in the database), the stale statistics would be updated over the course of several days' worth of statistics job runs until the job caught up.
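Two quick data-dictionary checks along those lines (a sketch; the dictionary view names are the standard ones, and the client_name string is assumed to match the 11g label for the auto stats task):

-- Which objects are flagged stale right now, and when they were last analyzed
SELECT owner, table_name, stale_stats, last_analyzed
FROM   dba_tab_statistics
WHERE  stale_stats = 'YES'
ORDER  BY last_analyzed;

-- Whether the auto stats task actually completed in recent maintenance windows
SELECT client_name, window_name, job_status, job_start_time, job_duration
FROM   dba_autotask_job_history
WHERE  client_name = 'auto optimizer stats collection'
ORDER  BY job_start_time DESC;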

In Apache Hive, A DB with high count of External Tables takes too long to DROP CASCADE

I have found lots of answers about HOW to drop a DB and all its tables, but nothing about why it takes ~3-4 seconds per table to drop, seemingly in SERIAL (one after the other).
I have a database with 2,414 EXTERNAL Tables pointed at parquet locations, and DROP DATABASE <db> CASCADE; can take 1-2 HOURS just to drop the metadata for the DB.
In a separate session I can repeatedly SHOW TABLES IN <deleted DB>; and watch the count of tables go down at a rate of about 1 every 3-4 seconds. This takes upwards of 2 hours before the session releases the delete and allows us to replace the DB with a new one...
504 rows selected (0.29 seconds)
...
503 rows selected (0.17 seconds)
...
502 rows selected (0.29 seconds)
...
What is taking Hive so long?
Is there a configuration I can use to make it quicker?
Is there a way I can tell what it's doing during that time?
I would think others would have encountered this problem if it were more common, so that makes me think I have a setting somewhere I can tweak to fix this (?)...
The Parquet files don't seem to be deleted underneath the dropped database, so it doesn't seem to have anything to do with HDFS/Parquet files, unless dropping an external table checks them for some reason...
Any ideas why it would be so slow?
AFAIK, it has to drop all of the table's references. It may be an external table, but there can still be a lot of partitions, stats, etc. to remove from the metastore. Also, if it has a lot of rows, it needs to acquire specific locks.
You may want to check the metastore (MySQL or equivalent) and see if you can introduce any indexes or collect stats on a periodic basis.
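If the metastore is MySQL-backed, a quick look at how much metadata each table drags along can hint at where the 3-4 seconds per table go (a sketch against the stock metastore schema; DBS, TBLS and PARTITIONS are the standard table names, and 'your_db' is a placeholder):

-- Partition counts per table for the database being dropped
SELECT t.TBL_NAME, COUNT(p.PART_ID) AS partition_count
FROM   TBLS t
JOIN   DBS d ON d.DB_ID = t.DB_ID
LEFT JOIN PARTITIONS p ON p.TBL_ID = t.TBL_ID
WHERE  d.NAME = 'your_db'
GROUP BY t.TBL_NAME
ORDER BY partition_count DESC;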

How does Oracle handle concurrency in a clustered environment?

I have to implement a database solution wherein contention is handled in a clustered environment. There is a scenario wherein multiple users try to access a bank account at the same time and deposit money into it if the balance is less than $100. How can I make sure that no extra money is deposited? Basically, this query is supposed to fire:
update acct set balance=balance+25 where acct_no=x ;
Since the database is clustered, the account ends up being credited multiple times.
I am looking for a purely Oracle-based solution.
Clustering doesn't matter here; the mechanism that prevents the scenario you're fearing/seeing is locking.
Consider the scenario of user A and then user B trying to do an update, based on a check (less than 100 dollars in the account):
If both the check and the update are done in the same transaction, locking will prevent user B from doing the check UNTIL user A has done both the check and the actual update. In other words, user B will find the check failing and will not perform the requested action.
When a user says "at the same time", you should know that the database does not know that concept: all transactions are sequential, even if their timestamps match to the millisecond. Consider the change number kept in the redo logs; there is only one counter. Transactions X and Y are done before or after each other, never at the same time.
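A minimal sketch of the check and update done in one transaction (assuming a table acct(acct_no, balance); SELECT ... FOR UPDATE makes the second session block until the first commits, on a single instance or across RAC nodes):

DECLARE
  v_balance acct.balance%TYPE;
BEGIN
  SELECT balance INTO v_balance
  FROM   acct
  WHERE  acct_no = :x
  FOR UPDATE;                 -- other sessions block here until we commit or roll back

  IF v_balance < 100 THEN
    UPDATE acct SET balance = balance + 25 WHERE acct_no = :x;
  END IF;

  COMMIT;
END;
/

An alternative is a single conditional statement, UPDATE acct SET balance = balance + 25 WHERE acct_no = :x AND balance < 100; which only credits the account when the condition still holds at execution time.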
That doesn't sound right ... When Oracle locks a row for update, the lock should be held across all nodes. What version of Oracle are you using, and can you provide a step-by-step example of what you're doing?
Oracle 11 doc here:
http://docs.oracle.com/cd/B28359_01/server.111/b28318/consist.htm#CNCPT020

Caching expensive SQL query in memory or in the database?

Let me start by describing the scenario. I have an MVC 3 application with SQL Server 2008. In one of the pages we display a list of Products that is returned from the database and is UNIQUE per logged in user.
The SQL query (actually a VIEW) used to return the list of products is VERY expensive.
It is based on very complex business requirements which cannot be changed at this stage.
The database schema cannot be changed or redesigned as it is used by other applications.
There are 50k products and 5k users (each user may have access to 1 up to 50k products).
In order to display the Products page for the logged in user we use:
SELECT TOP X * FROM [VIEW] WHERE UserID = #UserId -- where 'X' is the size of the page
The query above returns a maximum of 50 rows (maximum page size). The WHERE clause restricts the number of rows to a maximum of 50k (products that the user has access to).
The page is taking about 5 to 7 seconds to load and that is exactly the time the SQL query above takes to run in SQL.
Problem:
The user goes to the Products page and very likely uses paging, re-sorts the results, goes to the details page, etc., and then goes back to the list. And every time it takes 5-7s to display the results.
That is unacceptable, but at the same time the business team has accepted that the first time the Products page is loaded it can take 5-7s. Therefore, we thought about CACHING.
We now have two options to choose from, the most "obvious" one, at least to me, is using .Net Caching (in memory / in proc). (Please note that Distributed Cache is not allowed at the moment for technical constraints with our provider / hosting partner).
But I'm not very comfortable with this. We could end up with lots of products in memory (when there are 50 or 100 users logged in simultaneously) which could cause other issues on the server, like .Net constantly removing cache items to free up space while our code inserts new items.
The SECOND option:
The main problem here is that it is very EXPENSIVE to generate the User x Product x Access view, so we thought we could create a flat table (or in other words a CACHE of all products x users in the database). This table would be exactly the result of the view.
However the results can change at any time if new products are added, user permissions are changed, etc. So we would need to constantly refresh the table (which could take a few seconds) and this started to get a little bit complex.
Similarly, we thought we could implement some sort of Cache Provider: upon request from a user, we would run the original SQL query and select the products from the view (5-7s, acceptable only once) and save that result in a flat table called ProductUserAccessCache in SQL. On the next request, we would get the values from this cached table (since we can easily tell that results were cached for that particular user) with a fast query and no calculations in SQL.
Any time a product was added or a permission changed, we would truncate the cached table, and upon a new request the table would be repopulated for the requesting user.
It doesn't seem too complex to me, but what we are doing here is basically creating a NEW cache "provider".
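A rough T-SQL sketch of that cached-table flow (ProductUserAccessCache is the table described above; the column list and the @UserId / @PageSize parameters are placeholders):

IF NOT EXISTS (SELECT 1 FROM ProductUserAccessCache WHERE UserID = @UserId)
BEGIN
    -- First request for this user: pay the expensive view once and materialize the rows
    INSERT INTO ProductUserAccessCache (UserID, ProductID /* , ...other columns */)
    SELECT UserID, ProductID /* , ... */
    FROM   [VIEW]
    WHERE  UserID = @UserId;
END;

-- Every later request reads the precomputed rows instead of hitting the expensive view
SELECT TOP (@PageSize) *
FROM   ProductUserAccessCache
WHERE  UserID = @UserId;

Invalidation is then just a TRUNCATE TABLE ProductUserAccessCache (or a per-user DELETE) whenever a product or permission changes, as described above.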
Does any one have any experience with this kind of issue?
Would it be better to use .Net Caching (in proc)?
Any suggestions?
We were facing a similar issue some time ago, and we were thinking of using EF caching in order to avoid the delay in retrieving the information. Our problem was a 1-2 second delay. Here is some info that might help on how to cache a table by extending EF. One of the drawbacks of caching is how fresh you need the information to be, so you set your cache expiration accordingly. Depending on that expiration, users might need to wait longer than they would like for fresh info, but if your users can accept that they might be seeing outdated info in order to avoid the delay, then the tradeoff is worth it.
In our scenario, we decided it was better to have fresh info than quick info, but as I said before, our waiting period wasn't that long.
Hope it helps
