Calculate percentages at different grouping levels without creating separate percentage formulas for each group level - crystal-reports-2008

In Crystal Reports, I am looking for a way to calculate percentages at different grouping levels without creating a separate percentage formula for each group level.
Can someone please tell me whether there is a way to do that, or do I have to create a separate percentage formula for each group level?
Thank you.

There is no built-in way to do this; you need to create a separate percentage formula for each grouping level.

Related

Tableau running slow / queries taking a long time

To speed up processing time in Tableau: is it better to combine multiple calculated fields into one calculated field, or to have the equation broken out into pieces?
Thanks!
Neither option is related to optimization. However, it is good practice to hide unused fields from the "Tables" sidebar.

Datamart modelling of the fact table: indicators as columns, or rows with one column called indicator

I am modelling a datamart and have multiple measures (indicators) and dimensions.
When modelling the fact table, is it better to make each indicator a separate column, or to have one column that contains the indicator value, together with a dimension of indicators?
Please give me your opinions, and when should each option be chosen?
Dimensional modelling aims for each fact table to represent a business process where you take measurements, with each measurement stored as a separate, named column. The aim is that these are things you can drag onto your BI tool's report without the user having to go off to another table to work out which measure they are looking at.
The Kimball Group don't normally recommend the approach where you create a measure-type dimension and produce a 'generic' fact. It makes the fact table larger (one row for each measurement) and makes calculations between measurements within a single measurement event (fact) more difficult, as the sketch below illustrates.
Where would this end? You could feasibly have one fact table that represents all measurements from all your facts. This might be easier to model and load into, and might be exactly what you need in your situation, but it doesn't make it easier to report from, and it wouldn't be called a dimensional model.
The situation in which Kimball suggests this would be an acceptable technique, however, is when you could have hundreds of potential measurements but only a few would be applicable to any particular fact.
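
As a minimal sketch of that trade-off, here is a hypothetical pandas comparison (the measure names revenue and cost are assumptions, not from the question) of a wide fact table against a generic fact with a measure-type dimension:

import pandas as pd

# Wide fact table: one row per measurement event, one named column per measure
wide = pd.DataFrame({
    "date_key": [20240101, 20240102],
    "revenue":  [100.0, 150.0],
    "cost":     [60.0, 90.0],
})
wide["margin"] = wide["revenue"] - wide["cost"]  # calculation between measures is trivial

# Generic fact with a measure-type dimension: one row per individual measurement
generic = pd.DataFrame({
    "date_key":  [20240101, 20240101, 20240102, 20240102],
    "indicator": ["revenue", "cost", "revenue", "cost"],
    "value":     [100.0, 60.0, 150.0, 90.0],
})
# The same calculation now needs a pivot back to the wide shape first
pivoted = generic.pivot(index="date_key", columns="indicator", values="value")
pivoted["margin"] = pivoted["revenue"] - pivoted["cost"]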

BIRT: How to find the average of a derived measure in a cross tab in SpagoBI?

In my BIRT report I am using a data cube to show some data with the help of a cross tab. In the cross tab I added a derived measure built from other measures.
Question:
How do I calculate the average of that derived measure in the cross tab?
Please answer my question. Thanks
In my case I made a derived measure that depends on other measures.
Suppose I have two measures named monthlyRent and totalRent. With the help of these two measures I made a derived measure like this:
derived measure = (monthlyRent/totalRent)*100
Now I want to calculate the grand total for this derived measure.
Steps:
Right-click on the cell.
Then choose Insert.
Dynamic Text.
Available Column Bindings.
Cross tab.
Then select both measures, monthlyRent and totalRent.
The resulting expression will be like this: (data["monthlyRent"]/data["totalRent"])*100
If you need any further help, you can contact me and I will provide a practical example of this.
Thanks

How to detect which data are affecting the result of a feature with machine learning?

Firstly, I will illustrate the scenario: I have a dataset like
ProductID, ProductType, MachineID, MachineModel, MachineSpeed, RejectDate, RejectVolume, etc.
I want to find which field(s) are the reason for the increase in my RejectVolume. Also, in this scenario, every product has a RejectVolume; that is, RejectVolume is nonzero and takes continuous but different values. Thanks to this, I can recognize the reason(s) and find a solution for reducing the value of RejectVolume.
Can you give me any ideas for creating the model?
Thank you.
You want to look at feature selection methods.
In this scenario you could start with linear regression using the Lasso for feature selection. This is done by successively increasing the Lasso regularization term, which shrinks the weights of unimportant features towards zero, leaving you with the features that have the most impact.
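
A minimal sketch of that approach with scikit-learn, assuming the dataset is already loaded into a DataFrame named df with the columns from the question (the chosen feature columns and alpha values are assumptions to illustrate the idea):

import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Assumed: df is the dataset; one-hot encode categorical fields, keep numeric ones
X = pd.get_dummies(df[["ProductType", "MachineModel", "MachineSpeed"]])
y = df["RejectVolume"]
X_scaled = StandardScaler().fit_transform(X)

# Increasing alpha strengthens the L1 penalty; weak features drop to zero weight
for alpha in [0.01, 0.1, 1.0, 10.0]:
    model = Lasso(alpha=alpha).fit(X_scaled, y)
    kept = [name for name, w in zip(X.columns, model.coef_) if w != 0]
    print(alpha, kept)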

Appropriate clustering method for 1 or 2 dimensional data

I have a set of data I have generated that consists of extracted mass values (well, m/z, but that's not so important) and a time. I extract the data from the file; however, it is possible to get repeat measurements, and this results in a large amount of redundancy within the dataset. I am looking for a method to cluster these in order to group those that are related, based on either similarity in mass alone, or similarity in mass and time.
An example of data that should be grouped together is:
m/z time
337.65 1524.6
337.65 1524.6
337.65 1604.3
However, I have no way to determine how many clusters I will have. Does anyone know of an efficient way to accomplish this, possibly using a simple distance metric? I am sadly not familiar with clustering algorithms.
http://en.wikipedia.org/wiki/Cluster_analysis
http://en.wikipedia.org/wiki/DBSCAN
Read the section about hierarchical clustering, and also look into DBSCAN if you really don't want to specify the number of clusters in advance. You will need to define a distance metric, and that step is where you determine which of the features, or combination of features, you will be clustering on.
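
A minimal DBSCAN sketch with scikit-learn, using the three (m/z, time) rows from the question; the eps tolerance and the column weights are assumptions you would set from your mass accuracy and domain knowledge:

import numpy as np
from sklearn.cluster import DBSCAN

# The (m/z, time) rows from the question
data = np.array([
    [337.65, 1524.6],
    [337.65, 1524.6],
    [337.65, 1604.3],
])

# Cluster on mass alone: any masses within eps of each other get grouped
labels = DBSCAN(eps=0.1, min_samples=1).fit_predict(data[:, [0]])
print(labels)  # [0 0 0]: all three rows fall into one mass group

# To cluster on mass and time together, rescale the columns first so that
# a single eps is meaningful for both dimensions
labels2 = DBSCAN(eps=0.5, min_samples=1).fit_predict(data * [1.0, 0.01])
print(labels2)  # [0 0 1]: the row at time 1604.3 becomes its own group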
Why don't you just set a threshold?
If successive values (by time) do not differ by more than +-0.1 (in m/z), they are grouped together. Alternatively, use a relative threshold: differ by less than +-0.1%. Set these thresholds according to your domain knowledge.
That sounds like the straightforward way of preprocessing this data to me.
Using a "clustering" algorithm here seems total overkill to me. Clustering algorithms will try to discover much more complex structures than what you are trying to find here. The result will likely be surprising and hard to control. The straightforward change-threshold approach (which I would not call clustering!) is very simple to explain, understand, and control, as the sketch below shows.
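
A minimal sketch of this change-threshold approach, assuming the mass values arrive sorted by time and using the absolute +-0.1 tolerance suggested above:

def group_by_threshold(masses, tol=0.1):
    # Start a new group whenever the next value jumps by more than tol
    groups = [[masses[0]]]
    for m in masses[1:]:
        if abs(m - groups[-1][-1]) <= tol:
            groups[-1].append(m)
        else:
            groups.append([m])
    return groups

print(group_by_threshold([337.65, 337.65, 337.66, 412.20]))
# [[337.65, 337.65, 337.66], [412.2]]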
For the simple one-dimensional case, K-means clustering (http://en.wikipedia.org/wiki/K-means_clustering#Standard_algorithm) is appropriate and can be used directly. The only issue is selecting an appropriate K. The best way to select a good K is to plot K against the residual variance and select the K that "dramatically" reduces the variance. Another strategy is to use an information criterion (e.g. the Bayesian Information Criterion).
You can extend K-means to multi-dimensional data easily, but you should beware of the scaling of the individual dimensions. E.g. among the items (1KG, 1KM) and (2KG, 2KM), the nearest point to (1.7KG, 1.4KM) is (2KG, 2KM) at these scales; but once you start expressing the second dimension in metres, the opposite is probably true.
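
A minimal sketch of that K-selection procedure with scikit-learn, on an assumed one-dimensional array of mass values (the data and the range of K are illustrative assumptions):

import numpy as np
from sklearn.cluster import KMeans

masses = np.array([337.65, 337.65, 337.66, 412.20, 412.21, 509.90]).reshape(-1, 1)

# Print K against the within-cluster variance (inertia) and look for the
# "elbow" where adding another cluster stops reducing it dramatically
for k in range(1, 5):
    km = KMeans(n_clusters=k, n_init=10).fit(masses)
    print(k, km.inertia_)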
