Oracle materialized view vs JPA query - oracle

My case is that a third party prepares a table in our schema domain on which we run different spring batch jobs that look for mutations (diff between the given third party table and our own tables). This table will contain about 200k records on average.
My question is simply: does generating a material view up front provide any benefits vs running the query at runtime?
Since the third party table will be populated on our command (basically it's a db boolean field that is set to 1, after which a scheduler picks it up to populate the table. Don't ask me why it's done this way), the query needs to run anyway.
Obviously from an application point of view, it seems more performant to query a flat material view. However, I'm not sure if there is any real performance benefit, since the material view needs to be built on db level.
Thanks.

The benefit of a materialized view here is if you are running the multiple times (more so if the query is expensive and / or there is a big drop in cardinality).
If you are only hitting the query once then you there isn't going to be a huge amount in it. You are running the same query either way and you have the overhead of inserting into the materialized view but you also have the benefit that you can tune this a lot easier than you could querying via JPA and could apply things like compression so less data is transferred back to the application but for 200k rows any difference is likely to be small.
All in all, unless you are running the same query multiple times then I wouldn't bother.
Update
One other thing to consider is coupling. Referencing a materialized view directly in JPA would allow you to update any logic without updating the application but the flip side of this is that logic is hidden outside the application which can make debugging a pain.
Also if you are just referencing a materialized view directly and not using any query rewrite or rollup features then am simple table created via CTAS would actually be better as you still have the precomputed data without the (small) overhead of maintaining the materialized view.

Related

Materialized View vs SSAS Cube

Here is current scenario - We have 3 tables in Oracle DB (with millions of records) which are being used to generate SSRS reports.
These reports are displaying complex data calculation such as deviations, median etc.
SSRS fetch data using stored procs in oracle (joining all the 3 tables) based on date parameters
Calculations are performed in SSRS and data is displayed in tables and charts
Now, for small date duration, report is getting generated quite fast, so no issues there.
When date range is big like a week or 2-3 months, report takes lot of time to process and most of the time it gets timed out as well.
To resolve this issue, I am thinking to remove calculations from SSRS and move them to DB level. Where we can have pre-calculated data
which will be served to SSRS reports for faster report generation.
In order to do this, I can see 2 options -
Oracle Materialized Views
SSAS Cube
I have never used Materialized Views before, so I am a bit skeptical about its performance specially FAST REFRESH issues.
What way would you prefer? MV or SSAS or mix of both?
Data models (SSAS) are great for organizing data, consolidating business logic, and defining how calculations behave in different scopes. They are generally faster to query than the raw data which is what you currently have. There is some caching involved, but you still have to query the data and wait for it to be processed. Models are also most appropriate when you have multiple reports that will be using a common set of data.
With a materialized view, you can shift the heavy lifting of calculation time to the scheduled refresh. Think of it as essentially the same as creating a new table that is refreshed by a procedure. This will greatly improve query times for the report especially if the date column you're filtering on is indexed. Also, the development and maintenance requirements are much lower for this than a model.
So, based on your specifications I would suggest the materialized view.
I would concur with the Materialized View (MV) approach. Depending on the amount and type (insert vs update vs delete) would determine if a fast refresh is possible or practical.
Counter intuitively, a FULL refresh is often a better approach, since you can better take advantage of set based SQL processing, together with parallelism to build the MV.

"Saving" BigQuery Views for use in Tableau

I'm trying to make faster dashboards in Tableau by creating views of my calculations directly in BigQuery.
Based on my understating if the gcloud documentation here, the view will re-execute the query once it is accessed, so it kinda defeats my goal.*
*My goal is to eliminate calculations on the fly, be it in Tableau or BigQuery.
Is it possible to "save" these views, by way of scheduled scripts or workflows?
Thanks,
A view is best thought of as a way to reformat a table to make it look more convenient to further queries. The query still has to run on BigQuery so the benefits will be that the view may look simpler to Tableau than the raw table (particularly convenient if the view uses some complex SQL to create some of its columns). But it won't save calculation time.
But, if your view is doing some complex consolidation of a larger table then it might be worth saving the results as a new table instead of creating a view. This is OK if your underlying table doesn't change frequently (rule of thumb if you use the results every day and the table changes weekly, it is probably worthwhile and certainly so if the changes are monthly). Then Tableau will be querying pre-consolidated results rather than the much larger raw table. BigQuery storage and processing is cheap so this is often a reasonable solution.
Another alternative is to use a Tableau extract to bring the data into your local drive or server. This is only practical if the table is small enough to fit locally and will only work really well for speed if it fits into local memory (which can be a lot more than you might think). But extracts, at least on Tableau server, can be set to refresh on a schedule, making much faster user interaction and absolving you of having to remember to manually update the consolidated table.

Oracle Materialized View - Need help in creating View with very large number of records

We have to create a materialised view in our database from a remote database which is currently in production. The view has about 5 crore records and is taking a long time. In between , the connection has snapped once and not even a single record persisted in our database. Since the remote database is production server, we get a very limited window to create the view.
My question is, can we have something like auto commit /auto start from where left at last time while the view is created so that we don't have to do the entire operation in one go ?
We are tying to make the query such that records can be fetched in smaller numbers as a alternate. But the data is read only for us and doesn't really have a where clause at this point which we can use.
Due to sensitive nature of data, I cannot post the view structure or the query.
No, you can not commit during the process of creating the view. But why don't you import the data to a table instead of a view? This would provide the possibility to commit in between. Furthermore this might provide the possibility, to load only the delta of changes maybe on daily basis - this would reduce the required time dramatically.

Is there any advantages related to Performance in Oracle View

I'm learning about Oracle Views and I got the concept of views but little confused about performance.
I was watching video and there I listen that oracle view can increase the performance. Suppose I have created view like below.
CREATE VIEW SALES_MAN
AS
SELECT * FROM EMP WHERE JOB='SALESMAN';
Ok now I have executed query to get SALES_MAN detail.
SELECT * FROM SALES_MAN
Ok now confusion start.
I listened in video that once the above query SELECT * FROM SALES_MAN will be executed the DATA/RECORD will be placed into cache memory after hitting the oracle DB. and if I will execute same query( IN Current Session/Login ) Oracle Engine will not hit to Database and will give you record from CACHED-MEMORY is it right?
But I have read on many websites that View add nothing to SQL performance more
Another Reference that says view not help in performance. Here
So Views increase performance too or not?
A view is just a kind of virtual table defined by a query. It has no proper data and performance will depend on the underlying table(s) and the query definition.
But you also have Materialized View wich stores the results from the view query definition. Data is synchronized automatically, and you can add indexes to the view. In this case, you can achieve better performances. The cost to pay is
more space (the view contains data dupplicated),
no access to the up to date data from the underlying tables
You (and perhaps the creator of the un-cited video) are confusing two different concepts. The reason the data that was in the cache was used on the second execution was NOT because it was the second execution of the view. As long as data remains in the cache, it is available for ANY query that needs it. The very fact that the first execution of the view had to get the data from disk should be a strong clue. And what if your second use of the view wasn't until hours or days later when the data was no longer in the cache? The answer remains. A view does NOT improve performance.

How to access data in Dynamics CRM?

What is the best way in terms of speed of the platform and maintainability to access data (read only) on Dynamics CRM 4? I've done all three, but interested in the opinions of the crowd.
Via the API
Via the webservices directly
Via DB calls to the views
...and why?
My thoughts normally center around DB calls to the views but I know there are purists out there.
Given both requirements I'd say you want to call the views. Properly crafted SQL queries will fly.
Going through the API is required if you plan to modify data, but it isnt the fastest approach around because it doesnt allow deep loading of entities. For instance if you want to look at customers and their orders you'll have to load both up individually and then join them manually. Where as a SQL query will already have the data joined.
Nevermind that the TDS stream is a lot more effecient that the SOAP messages being used by the API & webservices.
UPDATE
I should point out in regard to the views and CRM database in general: CRM does not optimize the indexes on the tables or views for custom entities (how could it?). So if you have a truckload entity that you lookup by destination all the time you'll need to add an index for that property. Depending upon your application it could make a huge difference in performance.
I'll add to jake's comment by saying that querying against the tables directly instead of the views (*base & *extensionbase) will be even faster.
In order of speed it'd be:
direct table query
view query
filterd view query
api call
Direct table updates:
I disagree with Jake that all updates must go through the API. The correct statement is that going through the API is the only supported way to do updates. There are in fact several instances where directly modifying the tables is the most reasonable option:
One time imports of large volumes of data while the system is not in operation.
Modification of specific fields across large volumes of data.
I agree that this sort of direct modification should only be a last resort when the performance of the API is unacceptable. However, if you want to modify a boolean field on thousands of records, doing a direct SQL update to the table is a great option.
Relative Speed
I agree with XVargas as far as relative speed.
Unfiltered Views vs Tables: I have not found the performance advantage to be worth the hassle of manually joining the base and extension tables.
Unfiltered views vs Filtered views: I recently was working with a complicated query which took about 15 minutes to run using the filtered views. After switching to the unfiltered views this query ran in about 10 seconds. Looking at the respective query plans, the raw query had 8 operations while the query against the filtered views had over 80 operations.
Unfiltered Views vs API: I have never compared querying through the API against querying views, but I have compared the cost of writing data through the API vs inserting directly through SQL. Importing millions of records through the API can take several days, while the same operation using insert statements might take several minutes. I assume the difference isn't as great during reads but it is probably still large.

Resources