We're running Oracle 12c SE. I've read a lot of postings that say v$segment_statistics may show which objects are the most frequently queried or updated. However, can that be broken down further? Say one wants to see during what times of day certain objects are hotter than others, or the number of physical reads or writes per hour for a given table.
Does Oracle SE offer this in any of the v$ views?
It sounds like you are describing dba_hist_seg_stat, which is one of the views populated as part of the Automatic Workload Repository (AWR). If you're on Standard Edition, I don't believe querying these views would violate your license agreement, but I don't keep up to date with changes to licensing terms, particularly for Standard Edition.
You could replicate this functionality yourself by putting together a job that runs every few minutes, queries v$segment_statistics, and writes the delta from the prior snap to a custom table. You could then query that table to see what activity was going on at different points in time.
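For illustration, a minimal sketch of that snapshot approach might look like the following (the table, procedure, and job names are made up, and the owner of the procedure needs a direct grant on V_$SEGMENT_STATISTICS):

    -- Holding table for periodic snapshots of v$segment_statistics
    CREATE TABLE seg_stat_hist (
      snap_time      DATE,
      owner          VARCHAR2(128),
      object_name    VARCHAR2(128),
      statistic_name VARCHAR2(64),
      value          NUMBER
    );

    -- Capture the statistics of interest (values are cumulative since instance startup)
    CREATE OR REPLACE PROCEDURE snap_seg_stats IS
    BEGIN
      INSERT INTO seg_stat_hist (snap_time, owner, object_name, statistic_name, value)
      SELECT SYSDATE, owner, object_name, statistic_name, value
      FROM   v$segment_statistics
      WHERE  statistic_name IN ('logical reads', 'physical reads', 'physical writes');
      COMMIT;
    END;
    /

    -- Run the capture every 15 minutes
    BEGIN
      DBMS_SCHEDULER.CREATE_JOB(
        job_name        => 'SNAP_SEG_STATS_JOB',
        job_type        => 'STORED_PROCEDURE',
        job_action      => 'SNAP_SEG_STATS',
        repeat_interval => 'FREQ=MINUTELY; INTERVAL=15',
        enabled         => TRUE);
    END;
    /

Because the captured values are cumulative, you would compute per-interval deltas at query time, for example with LAG(value) OVER (PARTITION BY owner, object_name, statistic_name ORDER BY snap_time).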
At work, my team accesses and works in a number of different databases using our team login. We have a ton of tables and views in each respective schema and I would guess that only ~10% are used regularly. As such, I would like to clean up these schemas to keep only those tables and views which are actually used and delete all the other ones (or at least archive them).
Is there any way for me to see the last time that a view was run, or the last time that a table was queried? My thinking is that if I can see that a view/table hasn't been used in x amount of time, then I'd feel more comfortable dropping it. My fear is that without such a process, I might drop tables/views that are used in Tableau dashboards and for other purposes.
Please check this link.
DBA_HIST views can only show data going back as far as the AWR retention period, not beyond that, and even then it won't be conclusive.
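If AWR is available and licensed in your environment, one common (and still not conclusive) check is to look for the last time an object appeared in a captured SQL plan; a rough sketch, with MY_SCHEMA as a placeholder:

    -- Last time an object showed up in an AWR-captured plan
    -- (limited by AWR retention, and only statements expensive enough to be captured)
    SELECT p.object_owner, p.object_name,
           MAX(s.end_interval_time) AS last_seen
    FROM   dba_hist_sql_plan p
           JOIN dba_hist_sqlstat  st ON st.sql_id = p.sql_id
           JOIN dba_hist_snapshot s  ON s.snap_id = st.snap_id
                                    AND s.dbid    = st.dbid
                                    AND s.instance_number = st.instance_number
    WHERE  p.object_owner = 'MY_SCHEMA'
    GROUP  BY p.object_owner, p.object_name
    ORDER  BY last_seen;

Absence from this list does not prove an object is unused, so treat it as a hint rather than a green light to drop things.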
I'm endeavoring to develop an application that uses Oracle as the database back-end. The application will calculate several statistics from the various tables in the database. The front-end will most likely be a web application, and this front-end will display various charts and calculated statistics. Now, I imagine that it would be more efficient to perform the calculations in the database rather than in the service layer, because those calculations would need to be performed for every web request. That being the case, I'm not sure which mechanism to use (e.g. stored procedure, function, view).

To illustrate what I'm going for, suppose I want to keep statistics of student grades for many students. I would like to have a web interface that lets me view those statistics on a student-by-student basis and also on an all-inclusive basis. Some of the stats depend on aggregates (e.g. average, min, max) of all of the student grades, and some depend only on an individual student. In this situation, every time a record is added or updated, the aggregates would have to be recalculated.

So I am speculating that if I had a special table holding all of the calculated values I need, and a trigger (or triggers) to recalculate everything when a record is added or updated, then all I would need to do from a web-request point of view is have the service layer pull the desired values from this special table. I'm just not sure if this is the best way to go, so I am asking the community for any input/advice.

Note: Although I'm using Oracle, I'm open to using PostgreSQL or MySQL.
Thanks in advance
The scenario you are describing would be ideal for using materialized views. They can be designed to refresh automatically (and incrementally) every time the source data is updated by your application. The calculations would be built into the view definition. No triggers required, and likely no stored procedures unless your calculations involve multiple steps. Check here: https://oracle-base.com/articles/misc/materialized-views and here: https://medium.com/oracledevs/lightning-fast-sql-with-real-time-materialized-views-12-things-developers-will-love-about-oracle-54bcc9eac358 for more info.
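For the student-grades example, a minimal sketch might look like this (the grades table and its columns are assumptions, and the extra COUNT/SUM columns are there because fast refresh of aggregate materialized views generally requires them):

    -- Assumed base table: grades(student_id, course_id, grade)
    -- Materialized view log so the MV can be refreshed incrementally
    CREATE MATERIALIZED VIEW LOG ON grades
      WITH ROWID, SEQUENCE (student_id, grade)
      INCLUDING NEW VALUES;

    -- Per-student aggregates, kept up to date on every commit against GRADES
    CREATE MATERIALIZED VIEW student_grade_stats
      BUILD IMMEDIATE
      REFRESH FAST ON COMMIT
      AS
      SELECT student_id,
             COUNT(*)     AS n_rows,
             COUNT(grade) AS n_grades,
             SUM(grade)   AS sum_grade,
             AVG(grade)   AS avg_grade,
             MIN(grade)   AS min_grade,   -- note: MIN/MAX can restrict fast refresh
             MAX(grade)   AS max_grade    -- to insert-only DML; check the restrictions
      FROM   grades
      GROUP BY student_id;

The service layer then simply selects from STUDENT_GRADE_STATS; the enterprise-wide aggregates could be a second, similar materialized view without the GROUP BY.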
It's a fairly common real-world problem, and I believe a solution exists, but I couldn't find one.
We have a database called Transactions that contains tables such as Positions, Securities, Bogies, Accounts, Commodities and so on, which are updated continuously, every second, whenever a new transaction happens. For the time being, we have replicated the master database Transactions to a new database named TRN, on which we do all the querying and updating.
We want a sort of monitoring system (like the htop process viewer in Linux) for the database that dynamically lists the rows being updated in its tables at any given time.
TL;DR Is there any way to get a continuously updating list of changed rows in any table in the database?
Currently we are working with Sybase and Oracle DBMSs on a Linux (Ubuntu) platform, but we would like generic answers that cover most platforms and DBMSs (including MySQL), as well as any tools, utilities, or scripts that can do this, so it will be easier for us to migrate to other platforms and/or DBMSs in the future.
To list updated rows, you conceptually need one of two things:
The updating statement's effect on the table.
A previous version of the table to compare with.
How you get them and in what form is completely up to you.
The first option lets you list updates with statement-level granularity, while the second is more suitable for time-based granularity.
Some options off the top of my head:
Write to a temporary table
Add a field with a transaction id/timestamp (sketched below)
Make clones of the table regularly
AFAICS, Oracle doesn't have built-in facilities to get the affected rows, only their count.
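As a sketch of the timestamp-field option mentioned above (Oracle syntax, using the Positions table from the question; adapt the trigger for Sybase or MySQL):

    -- Add a change marker to the table you want to watch
    ALTER TABLE positions ADD (last_modified TIMESTAMP);

    -- Keep the marker current on every insert/update
    CREATE OR REPLACE TRIGGER positions_touch
    BEFORE INSERT OR UPDATE ON positions
    FOR EACH ROW
    BEGIN
      :NEW.last_modified := SYSTIMESTAMP;
    END;
    /

    -- "htop-like" poll: rows changed in the last 10 seconds
    SELECT *
    FROM   positions
    WHERE  last_modified > SYSTIMESTAMP - INTERVAL '10' SECOND;

A monitoring script would just run the polling query in a loop; deletes would still need something extra, for example a shadow table populated by a delete trigger.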
Not a lot of details in the question so not sure how much of this will be of use ...
'Sybase' is mentioned but nothing is said about which Sybase RDBMS product (ASE? SQLAnywhere? IQ? Advantage?)
by 'replicated master database transaction' I'm assuming this means the primary database is being replicated (as opposed to the database called 'master' in a Sybase ASE instance)
no mention is made of what products/tools are being used to 'replicate' the transactions to the 'new database' named 'TRN'
So, assuming part of your environment includes Sybase(SAP) ASE ...
MDA tables can be used to capture counters of DML operations (eg, insert/update/delete) over a given time period; see the query sketch after this list
MDA tables can capture some SQL text, though the volume/quality could be in doubt if a) MDA is not configured properly and/or b) the DML operations are wrapped up in prepared statements, stored procs and triggers
auditing could be enabled to capture some commands but again, volume/quality could be in doubt based on how the DML commands are executed
also keep in mind that there's a performance hit for using MDA tables and/or auditing, with the level of performance degradation based on individual config settings and the volume of DML activity
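As a rough illustration of the MDA counters (assuming the MDA tables are installed and enabled; column availability varies by ASE version):

    -- Cumulative per-object DML counters; sample repeatedly and diff the values
    SELECT DBName, ObjectName,
           RowsInserted, RowsUpdated, RowsDeleted,
           LogicalReads, PhysicalReads
    FROM   master..monOpenObjectActivity
    WHERE  DBName = 'TRN'
    ORDER  BY RowsUpdated DESC

This gives table-level activity only, not the individual rows that changed.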
Assuming you're using the Sybase(SAP) Replication Server product, those replicated transactions sent through repserver likely have all the info you need to know which tables/rows are being affected; so you have a couple options:
route a copy of the transactions to another database where you can capture the transactions in whatever format you need [you'll need to design the database and/or any customized repserver function strings]
consider using the Sybase(SAP) Real Time Data Streaming product (yeah, additional li$ence is required) which is specifically designed for scenarios like yours, ie, pull transactions off the repserver queues and format for use in downstream systems (eg, tibco/mqs, custom apps)
I'm not aware of any 'generic' products that work, out of the box, as per your (limited) requirements. You're likely looking at some different solutions and/or customized code to cover your particular situation.
I've recently been working with an Oracle database to generate some reports. What I need is to get result sets of specific records (SELECT statements only), sometimes quite large ones, to be used for generating the reports in an Excel file.
At first, the reports were queried from views, but some of them are slow (they have some complex subqueries). I was asked to improve the performance and also fix some field mappings. I also want to tidy things up, because when I query against a view, I must explicitly reference the right column names. I want to move the data work into the database, so the web app just passes parameters and calls the right result set.
I'm new to Oracle, so which is better for this kind of task: a stored procedure or a function? Or under what conditions might a view be better?
Makes no difference whether you compile your SQL in a view, SP or function. It is the SQL itself that matters.
As long as you are able to meet your requirements with views, they are a good option. If you intend to break up your queries into multiple ones to achieve better performance, then you should go for stored procedures; in that case it would be advisable to bundle them together in a package. If your problem is performance, there may not be a silver-bullet solution for it; you will have to work on your queries and your design.
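If you do go the stored-procedure route, a common Oracle pattern for handing a result set back to a web application is a packaged procedure that returns a SYS_REFCURSOR; a minimal sketch (the report name, parameters, and SALES table are placeholders):

    CREATE OR REPLACE PACKAGE report_pkg AS
      PROCEDURE sales_report(p_from IN DATE,
                             p_to   IN DATE,
                             p_rows OUT SYS_REFCURSOR);
    END report_pkg;
    /

    CREATE OR REPLACE PACKAGE BODY report_pkg AS
      PROCEDURE sales_report(p_from IN DATE,
                             p_to   IN DATE,
                             p_rows OUT SYS_REFCURSOR) IS
      BEGIN
        OPEN p_rows FOR
          SELECT order_date, product, amount
          FROM   sales
          WHERE  order_date BETWEEN p_from AND p_to;
      END sales_report;
    END report_pkg;
    /

The web application just binds the two parameters and fetches rows from the returned cursor, so column handling stays inside the database.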
If the problem is performance due to complex SELECT query (queries), you can consider tuning the queries. Often you will find queries written 15-20 years ago, which do not use functionality and techniques that were introduced by Oracle in more recent versions (even if the organization spent the big bucks to buy the more recent versions - making it into a waste of money). Honestly, that may be too much of a task for you if you are new at Oracle; also, some slow queries may have been written by people just like you, many years ago - before they had a chance to learn a lot about Oracle and have experience with it.
Another thing: if the reports don't need to reflect the absolute current state of the underlying tables (for example, if "what was in the tables at the end of the business day yesterday" is acceptable), you can create a materialized view. The query behind it will not run any faster than it would in a regular view, but it can run overnight (say), or every six hours, or whatever, so that the further reporting processing does not have to wait for the queries to complete. This is one of the main uses of materialized views.
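A sketch of such a scheduled materialized view (the ORDERS table and the 02:00 schedule are placeholders):

    -- Rebuilt completely every night at 02:00, so reports read precomputed data
    CREATE MATERIALIZED VIEW daily_sales_mv
      BUILD IMMEDIATE
      REFRESH COMPLETE
      START WITH TRUNC(SYSDATE) + 1 + 2/24
      NEXT       TRUNC(SYSDATE) + 1 + 2/24
      AS
      SELECT TRUNC(order_date) AS order_day,
             region,
             SUM(amount)       AS total_amount,
             COUNT(*)          AS order_count
      FROM   orders
      GROUP BY TRUNC(order_date), region;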
Good luck!
My team needs to find a solution to the following problem:
Our application allows users to view total sales for the enterprise, totals by product, totals by region, totals by region x product, totals by regions x division, etc. You get the idea. There are so many values that need to be aggregated to get many of those totals that they cannot be computed on the fly - we have to pre-aggregate them to provide decent response times, a process that takes about 5 minutes.
The problem, which we thought was a common one but can find no references to, is how to allow updates to various sales without shutting off the users. Also, the users cannot accept eventual consistency - if they drill down on a total of 12 they better see numbers that add up to 12. So we need Consistency + Availability.
The best solution we've come up with so far is to direct all queries to a redundant database, "B" (optimized for queries) while updates are directed to the primary database, "A". When we decide to spend the 5 minutes to update all the aggregates, we update database "C", which is yet another redundant database just like "B". Then, new user sessions get directed to "C", while existing user sessions continue to use "B". Eventually, warning anyone left using "B", we kill the sessions on "B" and re-aggregate there, swapping the roles of "B" and "C". Typical drain-stop scenario.
We are surprised that we cannot find any discussion of this and are concerned that we are over-engineering this problem, or maybe it's not the problem we think it is. Any advice is greatly appreciated.
This was an interesting problem so I thought about it on the train, and I came up with the idea of storing a timestamp for each row in the database that you aggregate over. (I think this technique has a name, but it escapes me and googling isn't finding it...)
The timestamp would indicate when this row was inserted. In addition:
-If rows can be updated, then you will have two 'versions' of the row at once, one more recent than the other.
-If rows can be deleted, then there will need to be a 'deleted version' row that specifies when it was deleted.
Now you can do things such as:
1) Say you update the aggregates at Jan 1 2000 midnight. You can have views of the table return the table's data as though it was Jan 1 2000 midnight, ignoring all inserts/updates/deletes more recent than that. Now the aggregates are as up to date as the data in the view AND you can keep adding data to the underlying table.
2) I don't know how feasible/easy to guarantee it's reliable this would be, but you could have 'differentially computed aggregates' where on Jan 2 2000 midnight, you take the aggregates of Jan 1 2000 midnight and update them only with the data that has been changed since that time - saving you from recomputing so much historical data. (Of course, it gets hairier once you consider rows being updated or deleted that are older than 24 hours)
3) Whenever you bring your aggregates up to date, you can merge updated and deleted rows with their older versions and get rid of the older versions, so you only have to keep duplicate rows around while you need them to separate rows that have been aggregated from rows that haven't. (This also means that if, for instance, all your aggregates run at once and you update a row three times in quick succession, you only need to keep the most recent update-indicating row.)
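A sketch of how those versioned rows and the "as of the last aggregation" view could look (all names are invented for the example):

    -- Single-row control table recording when the aggregates were last rebuilt
    CREATE TABLE agg_control (last_agg_time TIMESTAMP NOT NULL);

    -- Each sale is stored as one or more row versions
    CREATE TABLE sales_versions (
      sale_id    NUMBER,
      region     VARCHAR2(30),
      amount     NUMBER,
      valid_from TIMESTAMP NOT NULL,  -- when this version was inserted
      valid_to   TIMESTAMP            -- NULL = still current; set on update/delete
    );

    -- The table "as of" the last aggregation run; drill-downs read through this,
    -- so their details always add up to the precomputed totals
    CREATE OR REPLACE VIEW sales_as_of_last_agg AS
      SELECT sale_id, region, amount
      FROM   sales_versions
      WHERE  valid_from <= (SELECT last_agg_time FROM agg_control)
        AND (valid_to IS NULL
             OR valid_to > (SELECT last_agg_time FROM agg_control));

New inserts, updates, and deletes keep landing in SALES_VERSIONS while the view (and therefore the aggregates) stays frozen until AGG_CONTROL is advanced at the next aggregation run.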
If the aggregates cannot be computed on the fly, then caching of result sets, as you are doing in another database, helps solve the issue of availability with faster response times.
For consistency, you may be able to make use of some form of transaction isolation. For example, MySQL supports a number of transaction isolation levels, of which REPEATABLE READ may come close to providing you with some consistency within a single transaction. If a transaction can be left open across multiple requests as the users drill down into the data, they effectively see a snapshot of the database state as of the first request.
In a more generic sense, you're just after a handle to the data, provided by the client, to indicate a consistent set. As in Patashu's answer, the handle for a client requesting a set of aggregates could be time based. The first stage of client interaction would be to get a handle to the latest aggregate data, eg the current time. It would then pass that handle with each request. As requests are made of the server, it uses the handle to determine which set of aggregate data to return. Rather than having both server "B" and "C", all aggregate data could be stored in server "B", with each set of aggregate data carrying the handle information. This then allows requests to a single server for aggregate data, both new and old. At some point, old aggregate data could be purged from "B".
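Concretely, the handle could just be a version column on a single aggregate table (names and bind variables are illustrative):

    -- All generations of aggregates in one table, keyed by a version handle
    CREATE TABLE sales_aggregates (
      agg_version  NUMBER,        -- the handle the client carries around
      region       VARCHAR2(30),
      product      VARCHAR2(30),
      total_amount NUMBER
    );

    -- First request of a session: pick up the newest available handle
    SELECT MAX(agg_version) FROM sales_aggregates;

    -- Every later drill-down passes that handle back
    SELECT region, product, total_amount
    FROM   sales_aggregates
    WHERE  agg_version = :client_handle
    AND    region      = :region;

    -- Old generations can be purged once no session still uses them
    DELETE FROM sales_aggregates
    WHERE  agg_version < :oldest_handle_in_use;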
Perhaps a search on transaction isolation will turn up more results for discussion on consistency.
I think you're looking for Data Warehousing concepts
In computing, a data warehouse or enterprise data warehouse (DW, DWH, or EDW) is a database used for reporting and data analysis. It is a central repository of data which is created by integrating data from one or more disparate sources. Data warehouses store current as well as historical data and are used for creating trending reports for senior management reporting such as annual and quarterly comparisons.

...

Unlike the ETL-based data warehouse, the integrated source data systems and the data warehouse are all integrated since there is no transformation of dimensional or reference data. This integrated data warehouse architecture supports the drill down from the aggregate data of the data warehouse to the transactional data of the integrated source data systems.