To OLAP or not to OLAP - Oracle

I want to ask what the performance gains are of pulling reports in SharePoint 2010 directly from an Oracle DW using an ODBC connection, as opposed to building an OLAP layer using SSAS and accessing the data that way.

It sounds like you're confusing terminologies and technologies. Comparing ODBC to OLAP and SSAS is like comparing apples to oranges. They are very different things used for very different purposes. The commonality is that they both deal with data... and apples and oranges are both fruit.
But, trying to read between the lines: OLAP, if configured correctly, will deliver actionable information (business intelligence, etc.) much more quickly and readily than data aggregated from a standard RDBMS. After all, that's what OLAP is designed for:
In computing, online analytical processing, or OLAP (/ˈoʊlæp/), is an approach to swiftly answer multi-dimensional analytical (MDA) queries. OLAP is part of the broader category of business intelligence, which also encompasses relational reporting and data mining.

Now that SQL Server 2012 has been released, you may want to look into building a BISM (using SSAS) over your Oracle data and then displaying the results in SharePoint.
Although you can use the 'PowerPivot for SharePoint' mode to host (render) the BISM using SharePoint resources, you're better off getting a separate server to host SSAS, so that the effort of rendering (presumably) large reports doesn't slow down your SharePoint server.
Of course it is still possible to build an OLAP cube using SSAS, but unless you need one or more of the features not yet available in the BISM model, you'd do better to build a BISM than an OLAP cube, as the BISM leverages VertiPaq (so it's simpler to build, i.e. it requires less ETL from Oracle). For more, check out my deck on SlideShare about the BISM.

Related

Data Transformations in Snowflake - Views, Tools, etc.?

We're considering Snowflake and want to understand how we could use it, and possibly other tools, to overcome one of our main problems - ETL! We currently use a legacy DWH with an ETL process consisting of SSIS and some views. This has all the common pitfalls of this methodology - most notably that it takes ages!
I was under the assumption that we'd move to an ELT model in Snowflake, so I started to research tools to do the 'T' part of it. However, I've just been listening to this podcast: https://www.dataengineeringpodcast.com/snowflakedb-cloud-data-warehouse-episode-110/
It suggests that just slapping a SQL view over something and exposing it in, say, Power BI or Tableau is enough for the T part of things!
Just wondering what people's experience was here?
- Do you do transformations just by writing a view in Snowflake?
- Do you use a third party tool specifically to address this need?
Secondary to this, for the Extraction and Loading, do you:
- Do this using Snowflake only
- Use a third party tool
I'm specifically interested in whether you do this to create some kind of time series in Snowflake from a non-time-series source. That's something we'd be keen to do.
This question is hard to answer without sounding opinionated, especially not knowing your use case. For what it's worth here is what I think:
Don't stick views on top of your tables and expose them to a reporting tool unless you have a very, very simple setup. If you're considering a tool like Snowflake then you will probably want to go for something more sustainable; this approach can become prohibitive in terms of cost and the complexity of your views.
Use a third-party tool to manage your ELT process. Your choice of tool will depend on your internal skills and cloud strategy; have a look at the tools out there like Stitch, Fivetran, etc. If you don't mind having on-premises technologies, why not stick with SSIS or use something like Apache Airflow (which requires up-skilling)?
Snowflake will not help you with the E of ELT; you will need a third-party tool such as SSIS to manage extracting data from your other systems. It will help with the L part: for this you can use Snowpipe or COPY commands, which are available within the Snowflake ecosystem. Snowflake will also help you share your data with external parties, which is really nice.
My organization has created a fairly complicated dimensional model in Snowflake using layers of SQL views, at which we can point our reporting tools. We use a separate replication tool for extraction from source systems and loading into Snowflake. Using views simplifies our approach in that we don't need an additional tool. It also makes managing the code easier than something like SSIS: for instance, we can search for code using the Snowflake interface or our version-control tool instead of having to open individual SSIS packages.
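To make the ELT split above concrete - COPY for the L, a layered view for the T, including a generated date spine for the time-series case the question asks about - here is a minimal Scala/JDBC sketch. It assumes the Snowflake JDBC driver is on the classpath, and the account, warehouse, stage, table, and view names are all hypothetical.

```scala
import java.sql.DriverManager
import java.util.Properties

object SnowflakeEltSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical account and credentials; the Snowflake JDBC driver jar must be on the classpath.
    val props = new Properties()
    props.setProperty("user", "ETL_USER")
    props.setProperty("password", sys.env.getOrElse("SNOWFLAKE_PASSWORD", ""))
    props.setProperty("warehouse", "ETL_WH")
    props.setProperty("db", "ANALYTICS")
    props.setProperty("schema", "RAW")

    val conn = DriverManager.getConnection(
      "jdbc:snowflake://myaccount.snowflakecomputing.com/", props)
    val stmt = conn.createStatement()
    try {
      // L: bulk-load staged files into a raw table (stage and table are made up).
      stmt.execute(
        """COPY INTO RAW.ORDERS
          |FROM @RAW.ORDERS_STAGE
          |FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)""".stripMargin)

      // T: a layered view that turns the raw, non-time-series data into a daily
      // series by joining it to a generated date spine.
      stmt.execute(
        """CREATE OR REPLACE VIEW PRESENTATION.DAILY_ORDERS AS
          |WITH date_spine AS (
          |  SELECT DATEADD(day, SEQ4(), '2020-01-01'::date) AS day
          |  FROM TABLE(GENERATOR(ROWCOUNT => 3650))
          |)
          |SELECT d.day, COUNT(o.order_id) AS orders, SUM(o.amount) AS revenue
          |FROM date_spine d
          |LEFT JOIN RAW.ORDERS o ON o.order_date = d.day
          |GROUP BY d.day""".stripMargin)
    } finally {
      stmt.close()
      conn.close()
    }
  }
}
```

Because the transformation layer is just SQL text, it can be searched and version-controlled like any other code, which is the main appeal of the views-only approach described above.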

Oracle database driver benchmark

I'm currently working in a situation where a .NET stack manages an Oracle database. Because the legacy code relies heavily on PL/SQL stored procedures that do the majority of the work, the correct choice of driver to connect to the database is of primary importance.
For this reason, knowing that Oracle provides a large number of drivers for the best-known programming languages, I was trying to find a documented benchmark (even with all the problems and contextual influences of such tests) comparing the different Oracle drivers across programming languages, to support the hypothesis that the best-performing choice for an isolated test microservice would be the Java driver in combination with Scala (Java is Oracle's property now, after all).
Are there any benchmarks on the internet that could support (or refute) this hypothesis?
EDIT
I didn't provide all the information. What I'm trying to achieve is to develop a series of microservices focused on fetching data from the database and converting it into JSON to be consumed by the front end. The .NET driver behaves seamlessly until the numbers become really overwhelming (> 1000 rows).
That's why I was wondering whether it could make sense to use JDBC to increase performance, having heard, for instance, that the .NET driver for SQL Server (both made by the same company) performs five times better than the Oracle one when it comes to gathering data from a cursor.
Your choice of driver may not give you much mileage if most of the work is in PL/SQL stored procedures. Having said that, JDBC is a good choice. However, don't fight it if your developers are more familiar with other drivers such as Oracle ODBC, the Oracle provider for .NET, ADO, etc.; all of these have an Oracle driver to access an Oracle DB.
You don't have to convert to JSON yourself; the Oracle DB can do the conversion. Your network latency is affected more by how big the pipe is, the array size, and the packet size than by the protocol.
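To make the 'array size' and 'Oracle can produce the JSON' points concrete, here is a minimal, hedged Scala/JDBC sketch; the connection string, the orders table, and its columns are hypothetical, and it assumes the ojdbc driver is on the classpath. A larger fetch size cuts network round trips once result sets run into the thousands of rows, and JSON_OBJECT/JSON_ARRAYAGG (available from Oracle 12.2 onward) let the database hand back ready-made JSON instead of the microservice serializing row by row.

```scala
import java.sql.DriverManager

object OracleJsonFetchSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical connection details; requires the ojdbc driver on the classpath.
    val conn = DriverManager.getConnection(
      "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1",
      "app_user",
      sys.env.getOrElse("DB_PASSWORD", ""))
    try {
      val stmt = conn.createStatement()
      // Fetch rows in larger batches (the JDBC analogue of the 'array size' above):
      // fewer network round trips when result sets run into thousands of rows.
      stmt.setFetchSize(1000)

      // Let the database build the JSON document instead of the microservice.
      val rs = stmt.executeQuery(
        """SELECT JSON_ARRAYAGG(
          |         JSON_OBJECT('id' VALUE order_id, 'amount' VALUE amount)
          |         RETURNING CLOB)
          |FROM orders""".stripMargin)

      if (rs.next()) {
        val json = rs.getString(1) // the whole payload, already serialized by Oracle
        println(json.take(200))
      }
      rs.close()
      stmt.close()
    } finally {
      conn.close()
    }
  }
}
```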

What's the difference between creating a report in BI Publisher and creating a report using a BI Administration Tool repository?

I've been trying to build a repository using the BI Administration tool in OBIEE. The repository is then loaded through Enterprise Manager without any issues, but a new analysis cannot be performed because of the following error, which appears whenever I try to view the results of my analysis in the analytics presentation service on port 9704:
Message returned from OBIS. [nQSError: 42016] Check database specific features table. Must be able to push at least a single table reference to a remote database (HY000)
which I haven't been able to correct. Then I came across a link:
https://www.youtube.com/watch?v=TK7KYaCEGZU
which creates a report using BI Publisher. It has the features I want:
1. I am able to access my table on the Oracle server
2. I choose the columns and make them into a data model
3. I use this model to create my report
When I use the BI Administration tool, we're asked to build a physical layer, then the BMM layer, and then the presentation layer. But there is no reference to such terms when creating a data model using the BI Publisher tool, so I'd like to know the difference between creating a report with the BI Publisher tool and with the BI Administration tool.
From my perspective, the biggest differences are scalability and consumption, though this really only scratches the surface.
Scalability
The OBI business model should be considered more widely than one reporting requirement. A business model can be used by multiple presentation catalogs, and in turn each presentation catalog can be used by many Analyses and Dashboards. So you only have to build the data model once, and it can be reused and added to over time to make a more complete... well, model of the business. This gives you "one version of the truth", with reports all coming from this one data source.
In Publisher, there is the concept of the data model. This is generally built specifically for one report. You then have many data models (perhaps using the same underlying data sources), for your many reports.
Consumption
Although Publisher reports can be accessed through the browser and interacted with, the interactivity is generally limited and the reports are (as the name suggests) published in PDF or Microsoft Office formats.
With OBI dashboards there is a much greater degree of interactivity, with prompts and navigation. Although these can be published through Agents, they are usually consumed through the browser on a self-serve basis, allowing users the flexibility to answer business questions as they arise.

MonetDB - anyone using it in production?

I am very interested in using MonetDB as a data mart, holding some huge data tables for querying and reporting.
However, after some searching, I am unable to find any online posts or blogs regarding the use of MonetDB in any kind of production capacity.
Also, there seems to be little to no activity online regarding MonetDB.
Is this a bad sign for the future of MonetDB?
> I am very interested in using MonetDB as a data mart, holding some huge data tables for querying and reporting
My boss is also interested in MonetDB and I had the same reaction as you. No one is writing about MonetDB... is no one using MonetDB?
Regardless, I have been running performance tests on datasets of 500,000 to 1,000,000 records comparing MonetDB (a column-oriented DBMS) vs. MySQL (a row-oriented DBMS), and MonetDB beats MySQL in all regards - even in bulk inserts, which hypothetically it should not be as good at.
I can't speculate as to what all this means for MonetDB's future, but while it's around you might want to check it out because it performs well.
(I run Windows 7 and am communicating with each database using PHP)
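The tests above were run from PHP; purely as an illustration of the methodology (not a substitute for the poster's numbers), here is a rough Scala/JDBC sketch that times the same kind of workload - a batched bulk insert followed by an aggregate - against both engines. The JDBC URLs, credentials, database names, and the bench table are placeholders, and the MonetDB and MySQL JDBC drivers would need to be on the classpath.

```scala
import java.sql.DriverManager

object ColumnVsRowBench {
  // Time an identical workload (bulk insert + aggregate) against any JDBC URL.
  def run(url: String, user: String, pass: String): Unit = {
    val conn = DriverManager.getConnection(url, user, pass)
    try {
      val stmt = conn.createStatement()
      stmt.execute("CREATE TABLE bench (id INT, category INT, amount DECIMAL(10,2))")

      val insert = conn.prepareStatement("INSERT INTO bench VALUES (?, ?, ?)")
      conn.setAutoCommit(false)
      val t0 = System.nanoTime()
      for (i <- 1 to 500000) {
        insert.setInt(1, i)
        insert.setInt(2, i % 100)
        insert.setBigDecimal(3, java.math.BigDecimal.valueOf(i % 1000, 2))
        insert.addBatch()
        if (i % 10000 == 0) insert.executeBatch()
      }
      insert.executeBatch()
      conn.commit()
      conn.setAutoCommit(true)
      val t1 = System.nanoTime()

      val rs = stmt.executeQuery("SELECT category, SUM(amount) FROM bench GROUP BY category")
      while (rs.next()) rs.getBigDecimal(2)
      val t2 = System.nanoTime()

      println(f"$url%s  insert: ${(t1 - t0) / 1e9}%.1fs  aggregate: ${(t2 - t1) / 1e9}%.3fs")
      stmt.execute("DROP TABLE bench")
    } finally conn.close()
  }

  def main(args: Array[String]): Unit = {
    // Placeholder connection details for each engine.
    run("jdbc:monetdb://localhost:50000/demo", "monetdb", "monetdb")
    run("jdbc:mysql://localhost:3306/bench_db", "bench", "bench")
  }
}
```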
I'm reacting a bit late to this post, but I'd like to add my voice to those using MonetDB in a production environment. We use it as the back-end of Spinque, a framework for designing complex search solutions. I've been using MonetDB for about 10 years, but only in the past 3 years in a production environment. Clearly it has pros and cons and bugs like all other products, but it is being developed and improved very actively (I don't understand the low-activity signs that you refer to). If you want a DB that allows you to be ahead of the market standards, it's a good choice. Otherwise, just go for MS SQL ;)
I've been evaluating it lately for a client so I've had some time with it. My impression at this point is that it is just finishing "growing up" from being an academic experimental playground. It clearly has yet to be really discovered, though it does have some rough edges which might hinder certain applications.
As I write, I'm in the process of trying to load over 100 million rows into an instance (at 27 million presently). So far, it performs startlingly well in some areas (aggregates), but is oddly sluggish in others (most joins I've tried so far); that said, I've not yet run the recommended sampling process, and I'm forcing it to live on just a single server with 32 GB of RAM.
I've found a few little glitches and one thing that caused a full service crash (obscure and reported), but I'm thinking that for many applications MonetDB could be just the ticket. Columnar storage (rather than NoSQL) seems to be the future IMO.
I'll update this if I find anything particularly interesting.
MonetDB is first and foremost a research system, but it has progressed far beyond the level of the average research prototype. It is the only open-source relational column-store platform I know of that supports full SQL. I have used it myself at CWI in many research projects that are not core DB research, but do need advanced DB technology.
You can see on the users' mailing list that deployments happen in many different organisations. As Roberto Cornacchia stated in a different answer, it is the backend of all Spinque deployments and we are happy MonetDB users. MonetDB is also used in a variety of non-profit projects such as OpenStreetMap and OpenKvK.
More and more commercial parties deploy MonetDB for analytics. (They do not always like to advertise that their analyses depend on an open source system.) Recently, MonetDB Solutions has started to provide dedicated commercial support for these deployments.
We have been using MonetDB in our business. We analyse very large data sets with many millions of rows. Traditional methods of data warehousing on SQL databases had become painfully slow, and the problem we were facing was that the data was only going to get bigger! The only way forward was to go columnar.
The results have been amazing. When you have very few joins it is staggeringly quick. Even with joins on the data sets we are looking at it is still frightening how fast it comes back.
Having seen some of the commercial partnerships, I think MonetDB is going to boom over the next few years. I believe some of the major BI suppliers are using MonetDB under the hood to perform their large-data work.

Simulated OLAP

We have a client on Oracle Standard Edition, and a project that would be ten times easier to address using OLAP. However, Oracle only supports OLAP in the Enterprise Edition.
Migration to Enterprise is not possible.
I'm thinking of doing some manual simulation of OLAP, creating relational tables to simulate the technology.
Do you know of some other way I could do this? Maybe an open-source tool for OLAP? Any ideas?
You can simulate OLAP functionality using client side tools pointed at a relational database.
Personally I think the best tool for the job is probably Tableau Desktop. This is an amazingly sophisticated front-end analytics tool that will make your relational data look multidimensional without much effort, and the tool itself is really mind-blowing. They have a free trial so you can take it for a spin. We use Tableau heavily for our own analysis and have been very impressed. Of course, this tool also works with multidimensional databases, so if you end up with some cubes at the end of the day you can continue to use the Tableau front end.
As for open source, you could try out Palo - an open source MOLAP server and Excel front end.
If you are interested in building your own reporting front end and use .NET, there are a number of components (such as the DevExpress PivotGrid or the several tools from RadarSoft) that will do the same thing, but they will require some elbow grease to get wired together.
I find that it's the schema that causes most of the issues people have with querying a database. OLAP forces you into either a flat table or a star/snowflake schema, which is easy to query and faster than the source OLTP tables. So if you ETL your source into a flat table or star schema you should get 80% of what you get from OLAP, the remaining 20% being MDX, analytic functions, and performance.
Note that you should get a performance boost from a star schema in a relational database as well, and Oracle probably has analytic functions in PL/SQL anyway.
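As a small, hypothetical illustration of that 80%: a fact table joined to two dimensions, aggregated, and ranked with an analytic (window) function - the kind of question a cube would otherwise answer. The star-schema table and column names are made up, and the connection details assume an Oracle JDBC (ojdbc) setup.

```scala
import java.sql.DriverManager

object StarSchemaSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical star schema: SALES_FACT joined to DIM_DATE and DIM_PRODUCT.
    val conn = DriverManager.getConnection(
      "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1",
      "report_user",
      sys.env.getOrElse("DB_PASSWORD", ""))
    try {
      val rs = conn.createStatement().executeQuery(
        """SELECT d.calendar_year,
          |       p.product_category,
          |       SUM(f.amount)                                AS revenue,
          |       RANK() OVER (PARTITION BY d.calendar_year
          |                    ORDER BY SUM(f.amount) DESC)    AS rank_in_year
          |FROM   sales_fact f
          |JOIN   dim_date    d ON d.date_key    = f.date_key
          |JOIN   dim_product p ON p.product_key = f.product_key
          |GROUP  BY d.calendar_year, p.product_category""".stripMargin)
      while (rs.next())
        println(s"${rs.getInt(1)}  ${rs.getString(2)}  ${rs.getBigDecimal(3)}  #${rs.getInt(4)}")
    } finally conn.close()
  }
}
```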
Try an open-source OLAP server called Mondrian. IIRC the XMLA API on this is sufficiently compatible with Analysis Services to fool Pivot Table Services, which would allow you to use it with ProClarity or Excel.
IIRC it was originally designed to work over Oracle - it is a HOLAP architecture using base tables in the underlying relational store and caching aggregates. You can also make use of materialised views and query rewrite in the underlying Oracle database to do aggregates.
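A rough sketch of that materialized-view/query-rewrite idea follows (object names are hypothetical, and the session needs the QUERY REWRITE privilege): the summary is computed once, and with rewrite enabled the optimizer can transparently answer matching aggregate queries - whether they come from Mondrian or anywhere else - from the precomputed summary.

```scala
import java.sql.DriverManager

object AggregateRewriteSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical connection details; requires the ojdbc driver on the classpath.
    val conn = DriverManager.getConnection(
      "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1",
      "dw_owner",
      sys.env.getOrElse("DB_PASSWORD", ""))
    try {
      val stmt = conn.createStatement()
      // Precompute the aggregate once; ENABLE QUERY REWRITE lets the optimizer
      // substitute this summary for matching GROUP BY queries on sales_fact.
      stmt.execute(
        """CREATE MATERIALIZED VIEW sales_by_month_mv
          |BUILD IMMEDIATE
          |REFRESH COMPLETE ON DEMAND
          |ENABLE QUERY REWRITE
          |AS
          |SELECT d.calendar_month, p.product_category, SUM(f.amount) AS revenue
          |FROM   sales_fact f
          |JOIN   dim_date    d ON d.date_key    = f.date_key
          |JOIN   dim_product p ON p.product_key = f.product_key
          |GROUP  BY d.calendar_month, p.product_category""".stripMargin)

      // A query written against the base tables; with query rewrite on, Oracle
      // can answer it from sales_by_month_mv without touching the fact table.
      val rs = stmt.executeQuery(
        """SELECT d.calendar_month, SUM(f.amount)
          |FROM   sales_fact f
          |JOIN   dim_date d ON d.date_key = f.date_key
          |GROUP  BY d.calendar_month""".stripMargin)
      while (rs.next()) println(s"${rs.getString(1)}: ${rs.getBigDecimal(2)}")
      stmt.close()
    } finally conn.close()
  }
}
```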
A few more thoughts on this topic:
Actually, Oracle Standard does have an OLAP facility based on a descendant of Express embedded in the database engine, storing its internal data structures in BLOBs in the main tablespaces. Using this is technically possible but not necessarily advisable, for the following reasons:
It uses a highly non-standard OLAP query engine with very little third-party tool support (AFAIK ArcPlan is the only third-party OLAP front end supporting 10g+ OLAP), poor documentation for the query language, and almost no third-party literature describing it. It will work with BI Beans if you feel like writing a JSP front end. It is not compatible with MDX at all. As of early 2006, the best Oracle could do when asked about drillthrough (this functionality was not supported in Discoverer 'Drake') was to recommend building a JSP application using BI Beans.
The reason there is no migration path from Standard to Enterprise is that Enterprise is actually what used to be Siebel Analytics. Standard is the old Oracle OLAP/Express descendant, which Oracle partners recommended avoiding even before Oracle bought out Siebel. Oracle has not even attempted to support migrating.
From this point of view, Mondrian is actually the most cost-effective OLAP solution for an Oracle Standard Edition shop. You can get a supported version from an outfit called Pentaho. The next cheapest is Analysis Services, which comes with SQL Server. Following that you are into the likes of Hyperion Essbase, which will be an order of magnitude more expensive than SQL Server or any supported version of Mondrian.
Whilst MS SQL Server offers OLAP, you'll need an Enterprise licence to use a cube in a live environment that is web-facing.
You might also want to give www.icCube.com a try - we're quite flexible on the data source used to populate the cube and quite cost-effective compared to the big actors in the market.
