I am in the process of creating a star schema-based cube using SSAS 2008.
I would like advice on whether it is best practice to create my own date/time table in the DW database as the basis for the Time dimension, or to use the SSAS 'generate a time table' option (either in the data source or on the server)?
Second question: assuming 'generate in the data source' is not an option, which is better, creating my own table or generating it on the server?
Two factors may influence the decision:
1) will be using YTD measures
2) will use the Gregorian calendar in the first pass, but will be adding the Muslim calendar in a second pass.
Thanks for your help
Well, if I say that it's much easier and SSAS will do it very well, i.e. use the 'generate a time table' option, then your second question is answered as well: generate it on the server. I have done it and it works very well.
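For comparison, if you do decide to build your own date dimension in the DW (which some find easier to extend when a second calendar such as the Hijri one has to be added later), a minimal sketch of such a table might look like the following; all table and column names here are purely illustrative:

-- Hypothetical custom date dimension (names are illustrative, not prescriptive)
CREATE TABLE dbo.DimDate (
    DateKey          INT      NOT NULL PRIMARY KEY,  -- e.g. 20240131
    FullDate         DATE     NOT NULL,
    CalendarYear     INT      NOT NULL,
    CalendarQuarter  TINYINT  NOT NULL,
    CalendarMonth    TINYINT  NOT NULL,
    DayOfMonth       TINYINT  NOT NULL,
    -- columns for the second (Hijri) calendar, populated in the second pass
    HijriYear        INT      NULL,
    HijriMonth       TINYINT  NULL,
    HijriDay         TINYINT  NULL
);

The YTD calculations in the cube can then be written against attribute hierarchies built on these columns, and the Hijri columns can be filled in later without touching the Gregorian ones.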
Here is the current scenario: we have 3 tables in an Oracle DB (with millions of records) which are being used to generate SSRS reports.
These reports display complex calculations such as deviations, medians, etc.
SSRS fetches the data using stored procs in Oracle (joining all 3 tables) based on date parameters.
The calculations are performed in SSRS and the data is displayed in tables and charts.
Now, for a small date range the report is generated quite fast, so no issues there.
When the date range is big, like a week or 2-3 months, the report takes a lot of time to process and most of the time it gets timed out as well.
To resolve this issue, I am thinking of removing the calculations from SSRS and moving them to the DB level, where we can have pre-calculated data
which will be served to SSRS reports for faster report generation.
In order to do this, I can see 2 options -
Oracle Materialized Views
SSAS Cube
I have never used materialized views before, so I am a bit skeptical about their performance, especially the FAST REFRESH issues.
What way would you prefer? MV or SSAS or mix of both?
Data models (SSAS) are great for organizing data, consolidating business logic, and defining how calculations behave in different scopes. They are generally faster to query than the raw data which is what you currently have. There is some caching involved, but you still have to query the data and wait for it to be processed. Models are also most appropriate when you have multiple reports that will be using a common set of data.
With a materialized view, you can shift the heavy lifting of calculation time to the scheduled refresh. Think of it as essentially the same as creating a new table that is refreshed by a procedure. This will greatly improve query times for the report especially if the date column you're filtering on is indexed. Also, the development and maintenance requirements are much lower for this than a model.
So, based on your specifications I would suggest the materialized view.
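To make that concrete, a rough sketch of such a materialized view follows; the table, column, and view names are invented for illustration, and the real query would use your three tables and your deviation/median calculations:

-- Hypothetical materialized view that pre-computes the report figures nightly
CREATE MATERIALIZED VIEW mv_report_calcs
  BUILD IMMEDIATE
  REFRESH COMPLETE
  START WITH SYSDATE NEXT TRUNC(SYSDATE) + 1 + 2/24   -- rebuild at 02:00 every night
AS
SELECT t1.report_date,
       MEDIAN(t2.reading) AS median_reading,
       STDDEV(t2.reading) AS deviation_reading
FROM   table1 t1
JOIN   table2 t2 ON t2.t1_id = t1.id
JOIN   table3 t3 ON t3.t1_id = t1.id
GROUP  BY t1.report_date;

-- index the column the report filters on, so the date-range queries stay fast
CREATE INDEX mv_report_calcs_dt_ix ON mv_report_calcs (report_date);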
I would concur with the Materialized View (MV) approach. The amount and type of change (insert vs update vs delete) would determine whether a fast refresh is possible or practical.
Counter-intuitively, a FULL refresh is often the better approach, since you can take better advantage of set-based SQL processing, together with parallelism, to build the MV.
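If you go the FULL refresh route, a sketch of an on-demand complete refresh (reusing the hypothetical view name from the previous answer) could be as simple as:

BEGIN
  DBMS_MVIEW.REFRESH(
    list           => 'MV_REPORT_CALCS',
    method         => 'C',      -- complete refresh
    parallelism    => 4,        -- let Oracle parallelise the rebuild
    atomic_refresh => FALSE     -- truncate + direct-path insert instead of delete/insert
  );
END;
/

With atomic_refresh set to FALSE the view is truncated and reloaded, which is usually much faster for a full rebuild, at the cost of the data being briefly unavailable during the refresh.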
I have recently been working with an Oracle database to generate some reports. What I need is to get result sets of specific records (SELECT statements only), sometimes large ones, to be used for generating the reports in Excel files.
At first, the reports were queried from views, but some of them are slow (they have some complex subqueries). I was asked to improve the performance and also fix some field mappings. I also want to tidy things up, because when I query against a view I must explicitly name the right columns. I want to move the data work into the database, so the web app just passes parameters and calls the right result set.
I'm new to Oracle, so which is better for this kind of task: using an SP or a function? Or under what conditions would a view be better?
Makes no difference whether you compile your SQL in a view, SP or function. It is the SQL itself that matters.
As long as you are able to meet your requirements with the views, they should be a good option. If you intend to break up your queries into multiple ones to achieve better performance, then you should go for stored procedures. If you decide to go for stored procedures, it would be advisable to create a package and bundle all the stored procedures together in it. If your problem is performance, there may not be a silver-bullet solution; you will have to work on your queries and your design.
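As a rough illustration of the package approach (all object names are invented), each report query can be exposed as a procedure that returns a ref cursor, so the web app only binds parameters and reads the result set:

CREATE OR REPLACE PACKAGE report_pkg AS
  PROCEDURE get_report_data(p_from IN DATE,
                            p_to   IN DATE,
                            p_rc   OUT SYS_REFCURSOR);
END report_pkg;
/

CREATE OR REPLACE PACKAGE BODY report_pkg AS
  PROCEDURE get_report_data(p_from IN DATE,
                            p_to   IN DATE,
                            p_rc   OUT SYS_REFCURSOR) IS
  BEGIN
    OPEN p_rc FOR
      SELECT o.order_id, o.order_date, o.amount   -- invented columns
      FROM   orders o                             -- invented table
      WHERE  o.order_date BETWEEN p_from AND p_to;
  END get_report_data;
END report_pkg;
/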
If the problem is performance due to complex SELECT query (queries), you can consider tuning the queries. Often you will find queries written 15-20 years ago, which do not use functionality and techniques that were introduced by Oracle in more recent versions (even if the organization spent the big bucks to buy the more recent versions - making it into a waste of money). Honestly, that may be too much of a task for you if you are new at Oracle; also, some slow queries may have been written by people just like you, many years ago - before they had a chance to learn a lot about Oracle and have experience with it.
Another thing, if the reports don't need to use the absolute current state of the underlying tables (for example, if "what was in the tables at the end of the business day yesterday" is acceptable), you can create a materialized view. It will not work any faster than a regular view, but it can run overnight (say), or every six hours, or whatever - so that the further reporting processing from there will not have to wait for the queries to complete. This is one of the main uses of materialized views.
Good luck!
Question:
We (that is to say I, a single person) are supposed to implement "user generated reports" (in 1 month at most, with a presentation at the end of the month/start of the new month).
Problem 1:
By user, I mean users who do not have any technical skills, like SQL or VBA.
Problem 2:
Technology is .NET ONLY, so I cannot use Java (and things based on Java like Jasper)
Problem 3:
Exports to Excel should be possible (and I mean XLS or XLSX, not XML or CSV)
Problem 4:
Grouping of data should be possible (multiple groups)
Problem 5:
The database is Microsoft SQL Server (presumably 2008 R2, but it could end up being 2008 R1 or 2005)
Bonus "Problem":
Web based, with ASP.NET WebForms, but can also be desktop based, if web is not possible
Now apart from the sheer ridiculousness of those requirements and time constraints...
One solution would be the report builder supplied by SSRS (SQL-Server Reporting Service).
However, there are some disadvantages, which I think are pretty severe:
The user creating the report basically still needs to know SQL (left, right, inner, outer join and their consequences). Since the user probably doesn't understand the difference, they will just blame me if they get no or wrong results (inner join on a null column for example).
The user creating the report knows nothing about the database/data-structure (e.g. soft deletes, duration dates). Also garbage-in garbage-out is probably going to be a problem, complete with wrong data etc. ...
If they are going to make a matrix and sum subtotals from unrounded values, the grand total is not going to match the sum of the subtotals, because the report displays subtotals rounded to something like 2 decimal places, but it calculates the grand total from the sum of all the values (which are NOT rounded), not from the sum of the subtotals (which are rounded). For example, subtotals of 1.004 and 1.004 both display as 1.00, while their total of 2.008 displays as 2.01. Again, they will blame me, or the data, or the report builder for it.
Since the report-builder is not going to display the number of results after adding an additional table with a join, a user will have no way of telling whether they have the right number of records, which will inevitably result in wrong results. Again, they will blame me.
Filters on dates: one needs to apply them, but not necessarily in the WHERE clause; sometimes they have to go in the JOIN condition. Report Builder doesn't support that, and it's not possible to create a serious report without it.
Status: as said, we use soft deletes, with a status field where status 99 means deleted. Filtering the status in the WHERE clause is dangerous and must sometimes happen in the JOIN instead (see the sketch after this list of disadvantages). Again, Report Builder doesn't support that, unless you use raw SQL, which is pointless since the users are not going to know SQL.
Installing Report Builder requires admin rights, or the customer company's IT department to install it. It also requires the appropriate .NET Framework for the appropriate Report Builder, and the appropriate Report Builder for the appropriate report server, since SQL Server 2005 Reporting Services is not going to work with reports for SQL Server 2008 Reporting Services, and 2008 R1 not with R2. On top of that, all users who are supposed to create reports have to be in a certain Reporting Services report-generator role, which requires the IT department to put them into the appropriate Active Directory group, which so far has never worked with any of the customers we have had. Plus, I don't trust the IT department to install the appropriate Report Builder, if they agree to install it at all.
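To make the point about filters in the join versus the where clause concrete, here is a hypothetical sketch (table and column names are invented, not from our schema):

-- Filtering the soft-delete status in the WHERE clause turns the outer join
-- into an inner join, so customers with no matching orders disappear:
SELECT c.customer_id, o.order_id
FROM   customers c
LEFT JOIN orders o ON o.customer_id = c.customer_id
WHERE  o.status <> 99;

-- Filtering in the JOIN condition keeps every customer and only excludes
-- the soft-deleted orders:
SELECT c.customer_id, o.order_id
FROM   customers c
LEFT JOIN orders o ON o.customer_id = c.customer_id
                  AND o.status <> 99;

This is exactly the limitation described above.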
Now, I once (quite a while ago) happened to watch a presentation on SSAS (SQL Server Analysis Services) on YouTube.
But I can't find the link anymore.
But anyway, I don't have any experience in SSAS, only SSRS.
I think it would be possible to use (or abuse) SSAS in such a way that the users could connect to it via Excel, get the data, and sum it more or less as they want. They would also be able to see the raw data.
And I could pre-prepare a few queries for raw data from tables (which I could do with Report Builder as well, via datasets).
Does anybody know SSAS well enough to tell me whether this is feasible in that amount of time?
And whether the add-in required for Analysis Services and the Excel versions (2007/2010) is compatible with all Analysis Services versions, or whether there are problems accessing 2008 R2 from Excel 2007 or SSAS 2005 from Excel 2010?
Or whether I am bound to run into more problems with SSAS than with Report Builder?
If your question is whether SSAS is a reasonable approach to your problem, my answer is yes. The benefit of SSAS is that generally speaking the data is modeled in a way that is readily understood by business users and easily manipulated in Excel to produce a variety of reports with no knowledge of a query language. With any version of SSAS, you can use Excel versions 2007 or 2010. There is no add-in required for this - the provider is built into both Excel versions already. Furthermore, by putting the model into SSAS, you are actually making your data more readily accessible by a variety of tools - you can use Excel or SSRS or a variety of 3rd party tools if you so desire. In other words, you're not limiting your options with this approach, but expanding your options as compared to Report Builder.
That said, working with SSAS can be simple or hard. It depends on the type of data you're working with and the complexity of any calculations that must be added to the model. Whether you can achieve your goals in the amount of time you have available is not a question I can answer. It really depends on the type of data that you have and the type of reports that your users need.
I can point you to a couple of resources. I wrote an article for TechNet as a gentle introduction: http://technet.microsoft.com/en-us/magazine/ee677579.aspx. It was written for SSAS 2008 but the principles apply to SSAS 2005, SSAS 2008, SSAS 2008 R2, and SSAS 2012.
If you prefer a video introduction, see http://channel9.msdn.com/Blogs/rdoherty/Demo-Developing-a-SQL-Server-2008-R2-Analysis-Services-Database to start. You can find a lot of free video material on SSAS at channel9. Just do a search for SSAS.
This is a problem that my friend asked about over the phone. The C# 3.5 program he has written fills a DataSet from a Patient Master table which has 350,000 records. It uses the Microsoft ADO.NET driver for Oracle. The ExecuteQuery method takes over 30 seconds to fill the dataset. However, the same query (fetching about 20K records) takes less than 3 seconds in Toad. He is not using any transactions within the program. There is an index on the column (Name) which is being used for the search.
These are some alternatives I suggested:
1) Use a DataReader and then populate a DataTable and pass it to the form to bind to the combo box (which is not a good idea, since it is likely to take the same time)
2) Try Oracle's own ADO.NET driver
3) Use ANTS Profiler to see if you can identify any particular ADO.NET line
Has anyone faced similar problems, and what are some ways of resolving this?
Thanks,
Chak.
You really need to do an extended SQL trace to see where the slowness is coming from. Here is a paper from Cary Millsap (of Method R and formerly of Hotsos) that details how to do this:
http://method-r.com/downloads/doc_details/10-for-developers-making-friends-with-the-oracle-database-cary-millsap
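For reference, one common way to turn on extended SQL trace is the 10046 event (level 12 captures bind values and wait events); the commands below are standard Oracle syntax, though in this case the trace would have to be enabled in the application's own session, for example via a logon trigger or DBMS_MONITOR (the trace file identifier here is just an example):

ALTER SESSION SET tracefile_identifier = 'patient_fetch';
ALTER SESSION SET events '10046 trace name context forever, level 12';

-- ... run the slow query from the application here ...

ALTER SESSION SET events '10046 trace name context off';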
Toad typically only fetches the first x rows (500 in my setup), so double-check that the comparison is valid.
Then you should try to separate the DB work from the form work, if possible, to see whether it is the DB that is taking up the time.
If that's the case, try the Oracle libraries to see if they are any faster; we've seen 50% improvements between the latest Oracle driver and the standard Microsoft driver.
Without knowing the actual code he uses to accomplish his tasks, and without knowing the number of rows he's actually fetching (I'm hoping he doesn't read all 350K of them?), it's impossible to say anything that's going to help him.
Have him add a code snippet to the question for clarity.
If we put aside the rights and wrongs of putting demo data into a live system for a minute (that's a whole separate discussion!), we are being asked to store some demo data in our live system so that it can be demonstrated credibly, without the appearance of smoke and mirrors (we want to use the same login page, for example).
Since I'm sure this is a challenge many other people must have faced, I'd be interested to know what approaches people have devised to separate this data so that it doesn't get in the way of day-to-day operations on their systems.
As I alluded to above, I'm aware that this probably isn't best practice. :-)
Can you instead segregate the data into a new database, and just redirect your connection strings (they're not hard-coded, right? right?) to point to the demo database? This way, live data isn't tainted, and your code looks identical. We actually do a three-tier deployment this way: we do local development, deploy to QC environments that get snapshots of the live data every few months, and then deploy to live when testing is complete.
FWIW, we're looking at using Oracle's row-level security / Virtual Private Database (VPD) feature to separate the demo data from the rest.
I've often seen it on certain types of live systems.
For example, point of sale systems in a supermarket: cashiers are trained on the production point of sale terminals.
The key is to carefully identify the test or training data. I wouldn't say that there's any explicit best practice for how to model this in a database; it's going to be application-specific.
You really have to carefully define the scope of what is covered by the test/training scenarios. For example, you don't want the training/test transactions to appear in production reports (but you may want to be able to create reports with this data for training/test purposes).
Completely disagree with Joe. Oracle has a tool to do this regardless of implementation. Before I read your answer I was going to say VPD... But that could have an impact on Production.
Remember, every table in a query changes from
SELECT * FROM tableA
to
SELECT * FROM (SELECT * FROM tableA WHERE Data_quality = 'PROD') -- or however you do it
Every table that has a policy on it, that is.
So, assuming your test data has to span EVERY table, every table will have to have a policy, and every table will be filtered before any SQL can begin working.
You can even hide that column from the users. You'll need to write the policy with some deftness if you do. You'll have to create that value based on how the data is inserted and expose the column to certain admin accounts for maintenance.
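For what it's worth, a rough sketch of the VPD plumbing (schema, table, column, and account names are all invented) might look like this; the policy function returns the predicate that Oracle silently appends to queries against the table:

-- Policy function: demo/admin accounts see everything, everyone else sees PROD rows only
CREATE OR REPLACE FUNCTION demo_data_policy (
  p_schema IN VARCHAR2,
  p_object IN VARCHAR2
) RETURN VARCHAR2 IS
BEGIN
  IF SYS_CONTEXT('USERENV', 'SESSION_USER') IN ('DEMO_USER', 'APP_ADMIN') THEN
    RETURN NULL;                       -- no extra predicate for these accounts
  END IF;
  RETURN 'data_quality = ''PROD''';
END;
/

-- Attach the policy to the table
BEGIN
  DBMS_RLS.ADD_POLICY(
    object_schema   => 'APP',
    object_name     => 'TABLEA',
    policy_name     => 'TABLEA_DEMO_FILTER',
    function_schema => 'APP',
    policy_function => 'DEMO_DATA_POLICY',
    statement_types => 'SELECT');
END;
/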