Good morning,
We have an application that is used to create daily reports.
When a daily report is finished, it is locked.
However, a daily report sometimes has to be changed after locking.
In that case the report is unlocked.
After the changes have been made, the report is locked again.
Here is what we would like to do:
When a report is unlocked, we would like to take a snapshot of its records.
When the report is locked again, we would like to take another snapshot of the records, compare it with the previous snapshot (taken at the moment of unlocking), and see what changes were made to the records. We would like to see the before and after values of each field.
A daily report spans around 40 tables and several hundred fields, so a change to a daily report can touch a few hundred fields across around 40 tables.
We are only interested in comparing the state at the moment of unlocking with the state at the moment of locking again. (In other words, we are not interested in every intermediate change made between unlocking and locking again.)
What is the best/recommended way to do this?
Thanks in advance for the answers.
One way to do that is to go for FDA (Flashback Data Archive)
Kindly check the below links that explain how you can do that.
https://oracle-base.com/articles/11g/flashback-and-logminer-enhancements-11gr1#flashback_data_archive
https://oracle-base.com/articles/12c/flashback-data-archive-fda-enhancements-12cr1
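Independently of FDA, the before/after comparison described in the question can be sketched in a few lines. This is only an illustration, not an Oracle feature: the snapshot layout (a dictionary keyed by table name and primary key), the function `diff_snapshots`, and the table and column names are all assumptions made for the example.

```python
from typing import Any, Dict, List, Tuple

# Assumed layout: a snapshot maps (table_name, primary_key) -> {column: value}.
RowKey = Tuple[str, Any]
Snapshot = Dict[RowKey, Dict[str, Any]]


def diff_snapshots(before: Snapshot, after: Snapshot) -> List[tuple]:
    """Return (table, pk, column, old_value, new_value) for every difference.

    A row missing from one snapshot is reported with None on that side,
    so an inserted or deleted row shows up column by column.
    """
    changes = []
    for key in sorted(set(before) | set(after), key=repr):
        old_row = before.get(key, {})
        new_row = after.get(key, {})
        for column in sorted(set(old_row) | set(new_row)):
            old_value = old_row.get(column)
            new_value = new_row.get(column)
            if old_value != new_value:
                changes.append((key[0], key[1], column, old_value, new_value))
    return changes


# Hypothetical data: state at unlock versus state at re-lock.
at_unlock = {
    ("report_header", 1): {"status": "draft", "total": 100},
    ("report_line", 10): {"qty": 5},
}
at_relock = {
    ("report_header", 1): {"status": "final", "total": 100},
    ("report_line", 11): {"qty": 7},
}
changes = diff_snapshots(at_unlock, at_relock)
for change in changes:
    print(change)
```

Only the two end states are compared, so intermediate edits between unlock and re-lock never appear in the result, which matches the stated requirement.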
I'm trying to work out why my Oracle 19c database is "suddenly" experiencing high commit waits. Looking in V$ACTIVE_SESSION_HISTORY and DBA_HIST_ACTIVE_SESS_HISTORY shows me that lots of sessions are waiting on "log file sync" and the blocking session is the LGWR process. Not a sign of a problem in itself, but a couple of months ago (before a recent set of product updates) it wasn't doing that, so I'm trying to understand what has changed. Either some code changes made over the last 2 months have caused this, or potentially the I/O system is experiencing a problem.
Because it's an OLTP system we have many different types of transaction, and I'm finding it difficult to filter out the noise from the performance views. What I'd like to be able to do is identify the sessions which are doing most commits, and also the sessions that are doing the "largest" commits, and then I can trace these back to see which pieces of code are responsible etc.
I would therefore like to be able to create a table such as this:
SESSION_ID   SESSION_SERIAL#   COMMIT_COUNT   COMMIT_SIZE
1            12345             3              132436
For commit size, I guessed I would need to use something like the "wait time" as an approximation and was hoping that the TM_DELTA_DB_TIME column would help me out here, but not sure how to measure the number of commits. I had hoped that the XID column would allow me to see the transaction boundaries but it's usually NULL.
And now I've stopped to question why there isn't an easier way to do this, and whether I'm going about it the wrong way. Surely I can't be the only person to want more in-depth understanding of the commit activity within their Oracle database. Or am I asking for data that doesn't exist in the views?
If anybody has some tips for where to look I would be very grateful!
Suppose a data loading process populates a fact table or dimension table, and analysis later shows that 100 million records were loaded improperly. What steps do I need to perform to clean up the data properly?
Here are two practices which help in that scenario:
1. Take a backup or snapshot before each batch. In the case of a major error like this you can roll back to the snapshot, reload and process the correct data.
2. Maintain an insert-only persistent staging area in the DW, such as a data vault, with each row stamped with a batch ID and timestamp. Remove the rows in error, and rebuild your facts and dimensions.
If this represents a real situation your only chance is #1.
If you don't have a reliable backup, and you have updated and/or deleted rows during the ETL/ELT process, you don't have any record of the pre-fail state and it may be impossible to go back.
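The second practice (batch-stamped, insert-only staging) can be sketched as follows. This is a minimal illustration using an in-memory SQLite database; the tables `staging_sales` and `fact_sales`, their columns, and the helper functions are hypothetical and stand in for a real warehouse schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging_sales (batch_id INTEGER, loaded_at TEXT,
                                customer TEXT, amount REAL);
    CREATE TABLE fact_sales (customer TEXT PRIMARY KEY, total_amount REAL);
""")


def load_batch(batch_id, loaded_at, rows):
    """Append rows to staging, stamping each with its batch ID and timestamp."""
    with conn:
        conn.executemany(
            "INSERT INTO staging_sales VALUES (?, ?, ?, ?)",
            [(batch_id, loaded_at, customer, amount) for customer, amount in rows],
        )


def rebuild_fact():
    """Recompute the fact table from whatever is currently in staging."""
    with conn:
        conn.execute("DELETE FROM fact_sales")
        conn.execute(
            "INSERT INTO fact_sales "
            "SELECT customer, SUM(amount) FROM staging_sales GROUP BY customer"
        )


def remove_batch(batch_id):
    """Drop one bad batch from staging, then rebuild the derived table."""
    with conn:
        conn.execute("DELETE FROM staging_sales WHERE batch_id = ?", (batch_id,))
    rebuild_fact()


load_batch(1, "2024-01-01", [("alice", 10.0), ("bob", 5.0)])
load_batch(2, "2024-01-02", [("alice", 999.0)])  # the improperly loaded batch
rebuild_fact()
remove_batch(2)
totals = dict(conn.execute("SELECT customer, total_amount FROM fact_sales"))
print(totals)
```

Because staging is never updated in place, the bad batch can be identified and removed by its ID alone, and the derived tables are rebuilt from the rows that remain.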
I want to build a data model which supports:
1. Data history - store every change of data. This is not a problem: http://en.wikibooks.org/wiki/Java_Persistence/Advanced_Topics#History
2. Data planning - the user should be able to prepare a record with validity sometime in the future (for example, I know that a customer's name changes from May, so I prepare a record that becomes valid on the 1st of May).
How can I do point 2?
How can I do these things together (points 1 & 2)?
If you really need point 2 - and I would think very hard about this, because in my experience users will never use it, and you will be spending a lot of effort to support something no one will ever use - anyway, if you really need it, then:
Make no changes at all directly in the table. All changes go through history.
Behind the scenes, periodically you will run a batch updater. This goes through history, finds all unapplied changes (set a status flag in the history to be able to rapidly find them), and applies them, and it checks the date to make sure it is time to apply the change.
You are going to have to deal with merges. What if the user says: "In one month my name changes," and then goes in and changes their name effective today? You have a conflict. How do you resolve it? You can either prevent any immediate changes until the pending ones are applied (or at least require every new change to have a date after the last unapplied one), or you can change it now and change it again in a month.
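The batch updater described above might look roughly like this. It is a sketch under assumptions: the `Change` record, the `applied` status flag, and the in-memory `customers` dictionary are hypothetical stand-ins for the history table and the live table. It resolves conflicts with the second option (apply the immediate change now, and the future one when its date arrives).

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Change:
    """One row of the history: a change that takes effect on a given date."""
    customer_id: int
    field: str
    new_value: str
    effective_from: date
    applied: bool = False  # status flag used to find unapplied changes quickly


def apply_due_changes(current, history, today):
    """Apply every unapplied change whose effective date has arrived, oldest first."""
    due = [c for c in history if not c.applied and c.effective_from <= today]
    for change in sorted(due, key=lambda c: c.effective_from):
        current.setdefault(change.customer_id, {})[change.field] = change.new_value
        change.applied = True
    return len(due)


customers = {1: {"name": "Old Name"}}
history = [
    Change(1, "name", "Future Name", date(2024, 5, 1)),   # planned for May
    Change(1, "name", "Interim Name", date(2024, 4, 1)),  # entered later, effective sooner
]

applied_in_april = apply_due_changes(customers, history, date(2024, 4, 15))
name_in_april = customers[1]["name"]
applied_in_may = apply_due_changes(customers, history, date(2024, 5, 1))
name_in_may = customers[1]["name"]
print(name_in_april, "->", name_in_may)
```

Since the live data is only ever written by this function, the history stays the single source of truth for both past and planned values.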
I think storing the changes to the data should be handled in the background. Look into data warehousing and slowly changing dimensions (http://en.wikipedia.org/wiki/Slowly_changing_dimension), implemented in a stored procedure that handles new records and the predecessors of those new records, which become "expired records". Once you have allowed for SCD, it's quite easy to find the historic expired records that you're after.
I have a database project for a web app, and currently I have it configured to fail if data loss may occur during deployment. I feel safer this way. However I've run into a problem. I actually need to deploy changes on some things where I'm okay with the possible data loss, i.e. shortening column lengths where nothing would actually get deleted, but the system thinks it would.
I have 2 questions.
The first is this: other than enabling or disabling the catch-all go or no-go, is there any way to have more granular control over this process, i.e. to specify columns it's okay to drop or shorten?
The second is, how do you guys handle these situations? Initially I had hoped that adding a pre-deployment script to drop the columns would be sufficient, however it seems to catch drops etc. in those files as well.
No, unfortunately there isn't any way to control it at a more granular level.
I disable it when I know I'll be deploying something that will cause data loss but is what I want. Then I re-enable it after. Also, I would always check the change script that comes out when deploying to production.
Just update the column in a pre-deployment script to the truncated length?
E.g., to truncate mycol to 20 characters:
UPDATE mytable
SET mycol = LEFT(mycol, 20)
WHERE mycol != LEFT(mycol, 20)
The Microsoft guidance is to move the data out into a temporary table in pre-deployment, let the deployment engine run a check to see whether the table contains rows (this will pass because it is now empty) and upgrade the schema, and move the data back in a post deployment script.
For more information, see Barclay Hill's posts on the subject:
Managing data motion during your deployments (Part 1)
Managing data motion during your deployments (Part 2)
Here is the issue.
On a site I've recently taken over it tracks "miles" you ran in a day. So a user can log into the site, add that they ran 5 miles. This is then added to the database.
At the end of the day, around 1am, a service runs which calculates all the miles that all the users ran that day and writes a text file to App_Data. That text file is then displayed in Flash on the home page.
I think this is kind of ridiculous. I was told they had to do this due to massive performance issues. They won't tell me exactly how they were doing it before or what the major performance issue was.
So what approach would you guys take? The first thing that popped into my mind was a web service which gets the data via an AJAX call. Perhaps every time a new "mile" entry is added, a trigger is fired and updates the "GlobalMiles" table.
I'd appreciate any info or tips on this.
Thanks so much!
Answering this question is a bit difficult since we don't know all of your requirements, or what didn't work before. So here are some different ideas.
First, revisit your assumptions. Generating a static report once a day is a perfectly valid solution if all you need is daily reports. Why hit the database multiple times throughout the day if all that's needed is a snapshot? For instance, lots of blog software used to write HTML files when a post was published rather than serving the entry from the database each time, and many still do as an optimization. Is the "real-time" feature something you are adding?
I wouldn't jump to AJAX right away. Use the same input method, and just move the report from static to dynamic. Doing too much at once is a good way to get yourself buried. When changing existing code I try to find areas that I can change in isolation with the least amount of impact on the rest of the application. Once you have the dynamic report, you can add AJAX (and please use progressive enhancement).
As for the dynamic report itself you have a few options.
Of course you can just SELECT SUM(), but it sounds like that would cause the performance problems if each user has a large number of entries.
If your database supports it, I would look at using an indexed view (sometimes called a materialized view). The database keeps it up to date as entries change, so it allows fast reads of the real-time sum:
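-- Schema-bound views must reference tables by two-part name (dbo schema assumed here)
CREATE VIEW dbo.vw_Miles WITH SCHEMABINDING AS
SELECT SUM([Count]) AS TotalMiles,
COUNT_BIG(*) AS [EntryCount], -- required when an indexed view uses GROUP BY
UserId
FROM dbo.Miles
GROUP BY UserId
GO
CREATE UNIQUE CLUSTERED INDEX ix_Miles ON dbo.vw_Miles(UserId)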
If the overhead of that is too much, @jn29098's solution is a good one. Roll it up using a scheduled task. If there are a lot of entries for each user, you could add only the delta since the last time the task was run.
UPDATE GlobalMiles SET [TotalMiles] = [TotalMiles] +
-- ISNULL keeps the total unchanged when there are no new entries
ISNULL((SELECT SUM([Count])
FROM Miles
WHERE UserId = @id
AND EntryDate > @lastTaskRun
GROUP BY UserId), 0)
WHERE UserId = @id
If you don't care about storing the individual entries but only the total you can update the count on the fly:
UPDATE Miles SET [Count] = [Count] + @newCount WHERE UserId = @id
You could use this method in conjunction with the SPROC that adds the entry and have both worlds.
Finally, your trigger method would work as well. It's an alternative to the indexed view where you do the update yourself on a table instead of SQL Server doing it automatically. It's also similar to the previous option, except that you move the global update out of the sproc and into a trigger.
The last three options make it more difficult to handle the situation when an entry is removed, although if that's not a feature of your application then you may not need to worry about that.
Now that you've got materialized, real-time data in your database, you can generate your report dynamically. Then you can add the fancy parts with AJAX.
If they are truly having performance issues due to too many hits on the database, then I suggest that you take all the input and put it into a message queue (MSMQ). Then you can have a service on the other end that picks up the messages and does a bulk insert of the data. This way you have fewer db hits. Then you can output to the text file on the update too.
I would create a summary table that's rolled up once/hour or nightly which calculates total miles run. For individual requests you could pull from the nightly summary table plus any additional logged miles for the period between the last rollup calculation and when the user views the page to get the total for that user.
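The "summary plus delta" read described above can be sketched like this, using an in-memory SQLite database purely for illustration. The tables `miles`, `miles_summary`, and `rollup_state`, and the function names, are assumptions made for the example rather than the site's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE miles (user_id INTEGER, miles REAL, entry_date TEXT);
    CREATE TABLE miles_summary (user_id INTEGER PRIMARY KEY, total_miles REAL);
    CREATE TABLE rollup_state (last_run TEXT);
    INSERT INTO rollup_state VALUES ('1970-01-01T00:00:00');
""")


def log_miles(user_id, miles, entry_date):
    """Record one entry; entry_date is an ISO-8601 string so it sorts as text."""
    with conn:
        conn.execute("INSERT INTO miles VALUES (?, ?, ?)",
                     (user_id, miles, entry_date))


def run_rollup(now):
    """Scheduled task: rebuild the summary from all entries up to `now`."""
    with conn:
        conn.execute("DELETE FROM miles_summary")
        conn.execute(
            "INSERT INTO miles_summary "
            "SELECT user_id, SUM(miles) FROM miles "
            "WHERE entry_date <= ? GROUP BY user_id",
            (now,),
        )
        conn.execute("UPDATE rollup_state SET last_run = ?", (now,))


def total_for_user(user_id):
    """Summary total plus anything logged after the last rollup."""
    row = conn.execute(
        """
        SELECT COALESCE((SELECT total_miles FROM miles_summary
                         WHERE user_id = :u), 0)
             + COALESCE((SELECT SUM(miles) FROM miles
                         WHERE user_id = :u
                           AND entry_date > (SELECT last_run FROM rollup_state)), 0)
        """,
        {"u": user_id},
    ).fetchone()
    return row[0]


log_miles(1, 5.0, "2024-01-01T08:00:00")
log_miles(1, 3.0, "2024-01-01T18:00:00")
run_rollup("2024-01-02T01:00:00")
log_miles(1, 2.5, "2024-01-02T07:30:00")  # logged after the nightly rollup
total = total_for_user(1)
print(total)
```

The per-request query only scans entries newer than the last rollup, so its cost depends on how much was logged since then rather than on the full history.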
How many users are you talking about and how many log records per day?