Combining dimension values to create an additional dimension value


I have a simple sample data set:
Name Database Amount
Brian DC1 50
Brian DC2 100
Steve DC1 34
Bill DC2 90
Ed DC1 49
Suz DC2 82
I'm struggling to create a calc or formula to combine "DC1" and "DC2" so that it says both. I'd like to ultimately be able to filter on "DC1", "DC2", or "Both"
Pushing this logic down to a SQL CASE statement isn't feasible.
Is this possible to perform? Should I be creating sets and then using combined sets?

Create a level-of-detail (LOD) calculated field that counts how many distinct database names each name has. A name with more than one distinct database gets the value 'Both'. Your code will look something like this:
IF {FIXED [Name] : COUNTD([Database])} > 1 THEN 'Both' ELSE [Database] END
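The same idea can be checked outside Tableau. The following is a minimal Python sketch, not Tableau syntax: it uses the sample rows from the question, and the helper name `database_group` is made up for illustration. It counts the distinct databases per name and labels a row 'Both' only when that count is greater than one.

```python
from collections import defaultdict

# Sample rows from the question: (Name, Database, Amount)
rows = [
    ("Brian", "DC1", 50),
    ("Brian", "DC2", 100),
    ("Steve", "DC1", 34),
    ("Bill", "DC2", 90),
    ("Ed", "DC1", 49),
    ("Suz", "DC2", 82),
]

# Rough equivalent of {FIXED [Name] : COUNTD([Database])}:
# collect the set of distinct databases seen for each name.
databases_by_name = defaultdict(set)
for name, database, _amount in rows:
    databases_by_name[name].add(database)


def database_group(name, database):
    """Hypothetical helper: 'Both' if the name has more than one database."""
    return "Both" if len(databases_by_name[name]) > 1 else database


labelled = [(name, database_group(name, db), amount) for name, db, amount in rows]
for row in labelled:
    print(row)
```

With this data only Brian has two distinct databases, so both of his rows are labelled 'Both' and every other row keeps its own database value.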

Related

How do you add greater than filter in Apache superset?

I am new to Apache Superset and wanted to know if there's a way to implement "greater than" filter. For example, I have a column like this:
Cost
2000
2400
3000
1200
2320
1000
1800
2010
2800
I know that I can put a cost filter that says:
Cost: [2000][2400][3000]
And put all my filtering entries here, but what I actually want is something like this:
Cost: [>=2500]
Which will provide me output as:
Cost
3000
2800
Just wondering if there's a way to do so?
Thanks in advance.

You can write custom SQL that groups the values of the Cost column into buckets, and then use that new column in your filter. Something like this (your_table is a placeholder for your own table name):
select Cost, case when Cost >= 1000 and Cost < 2000 then '<2000' when Cost >= 2000 and Cost < 3000 then '<3000' end as cost_group from your_table
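To see which bucket each sample value would land in, here is a small Python sketch of the same grouping. It is only an illustration, not Superset code: the function name `cost_group` is made up, and it assumes the lower bound of each bucket is inclusive, so that values such as 1000 and 2000 are not left without a group.

```python
def cost_group(cost):
    """Hypothetical bucketing helper; lower bounds are assumed inclusive."""
    if 1000 <= cost < 2000:
        return "<2000"
    if 2000 <= cost < 3000:
        return "<3000"
    return None  # outside every bucket, like a CASE with no ELSE


# Sample Cost values from the question
costs = [2000, 2400, 3000, 1200, 2320, 1000, 1800, 2010, 2800]
grouped = [(cost, cost_group(cost)) for cost in costs]
for cost, group in grouped:
    print(cost, group)
```

Note that 3000 falls outside both buckets here, so a real query would need one more bucket (or an ELSE branch) to cover it.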

Unfortunately, no.
The issue was tracked under "Dashboard conditional filter on numbers", but it has been closed.
You can comment on that thread to show interest.

Creating advanced SUMIF() calculations in Quicksight

I have a couple of joined Athena tables in Quicksight. The data looks something like this:
Ans_Count | ID | Alias
10 | 1 | A
10 | 1 | B
10 | 1 | C
20 | 2 | D
20 | 2 | E
20 | 2 | F
I want to create a calculated field such that it sums the Ans_Count column based on distinct IDs only. i.e., in the example above the result should be 30.
How do I do that?? Thanks!

Are you looking for the sum before or after applying a filter?
Sumif(Ans_Count,ID) may be what you're looking for.
If you need to always return the result of the sum, regardless of the filter on the visual, look at the sumOver() function.

You can use distinctCountOver at PRE_AGG level to count unique number of values for a given partition. You could use that count to drive the sumIf condition as well.
Example : distinctCountOver(operand, [partition fields], PRE_AGG)
More details about the visual's group-by specification, and an example where there are duplicate IDs, would help in giving a specific solution.
It might even be as simple as minOver(Ans_Count, [ID], PRE_AGG) and using SUM aggregation on top of it in the visual.
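The result the question expects can be checked outside QuickSight. The following is a plain Python sketch, not QuickSight syntax, using the sample rows from the question: it keeps one Ans_Count per distinct ID (the minimum, mirroring the minOver idea) and then sums those values.

```python
# Sample rows from the question: (Ans_Count, ID, Alias)
rows = [
    (10, 1, "A"),
    (10, 1, "B"),
    (10, 1, "C"),
    (20, 2, "D"),
    (20, 2, "E"),
    (20, 2, "F"),
]

# Keep a single Ans_Count per ID (the minimum seen for that ID).
min_by_id = {}
for ans_count, id_, _alias in rows:
    min_by_id[id_] = min(ans_count, min_by_id.get(id_, ans_count))

# Sum one value per distinct ID.
total = sum(min_by_id.values())
print(total)  # 30
```

Whether a given QuickSight visual produces the same number depends on its group-by fields, so treat this only as a reference value to compare against.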

If you want another column with the values repeated, use sumOver(Ans_Count, [ID], PRE_AGG). Or, if you want to aggregate via QuickSight, you would use sumOver(sum(Ans_Count), [ID]).

I agree with the above suggestions to use sumOver(sum(Ans_Count), [ID]).
I have yet to understand the use cases for pre_agg, so if anyone has concrete examples please share them!
Another suggestion would be to do a sum over a partition in your table (if possible) before uploading the dataset, then check whether the results match QuickSight's aggregations. I find QuickSight can be tricky with calculated fields, aggregations, and nested ifs, so I've been doing calculations in SQL where possible before bringing the data into QuickSight, to get a better grasp of what the outputs should look like. This is obviously an extra step, but it helps in understanding how QuickSight performs its calculations (the documentation doesn't always say much) and in spotting things that don't look right (I've had a few) before you share your analysis with a wider group.

how to use lookups

Lookup table:-
FROM_METER TO_METER COST
1 5 0.004
6 10 0.006
11 20 0.012
Needed output:- (here for 15 and 7 for example cost output will come as from lookup master)
METER COST
15 0.012
7 0.006
etc....

A lookup table is normally used to provide code or reference values for regular data. Assuming your business table is the one you called MASTER, you would join its records to your lookup like this:
select m.meter
, l.cost
from master m
join lookup l
on ( m.meter between l.from_meter and l.to_meter)
Your data model appears to have a problem: it doesn't guarantee that every METER in the MASTER table will find a matching record in the lookup table. This is why a more common approach is to use types or categories (say 'cheap', 'reasonable', 'expensive').
Obviously, without knowing your business rules it is hard for me to say whether your model is correct, but you should probably consider what to do if a MASTER row doesn't have a matching lookup row. Perhaps the relation is enforced in the application, but in my experience that approach is hard to make bulletproof.
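The range join and the missing-match concern can both be tried out quickly. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than your actual database, the MASTER table layout (a single meter column) is assumed, and the meter value 25 is an invented row added to show what happens when no range matches. It uses a LEFT JOIN so that unmatched meters are kept with a null cost instead of silently disappearing.

```python
import sqlite3

# In-memory SQLite database (assumption: stands in for the real database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lookup (from_meter INTEGER, to_meter INTEGER, cost REAL)")
conn.execute("CREATE TABLE master (meter INTEGER)")

# Lookup rows from the question.
conn.executemany(
    "INSERT INTO lookup VALUES (?, ?, ?)",
    [(1, 5, 0.004), (6, 10, 0.006), (11, 20, 0.012)],
)
# Meters 15 and 7 come from the question; 25 is an invented out-of-range value.
conn.executemany("INSERT INTO master VALUES (?)", [(15,), (7,), (25,)])

# LEFT JOIN keeps master rows that fall outside every lookup range.
rows = conn.execute(
    """
    SELECT m.meter, l.cost
    FROM master m
    LEFT JOIN lookup l
      ON m.meter BETWEEN l.from_meter AND l.to_meter
    ORDER BY m.rowid
    """
).fetchall()
conn.close()

for meter, cost in rows:
    print(meter, cost)
```

Meter 25 comes back with a cost of None, which makes the gap in the ranges visible instead of dropping the row as an inner join would.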

select ? as Meter, cost from Lookup
where ? >= FROM_METER
and ? <= TO_METER
Replace each ? with your parameter.

Real time data processing

I am parsing keywords several times per second. Every second I have 1000 to 5000 keywords. I want to find outliers, growth, and other things that are called technical analysis. One of the problems is how to store the data.
I would be able to do something like:
20-01 20-02 20-03
brother 0 3 4
table 1 0 0
cup 34 54 78
But there might be a lot of keywords. For every new batch of data I need to check whether each word already exists. If it doesn't, I must add the new word and a new row for it. What is the right way to organize the storage? Should I use a key/value database, NoSQL, or something else?

How to convert multiple rows into a single row in Informatica for a large volume of data, need best solution

I have data in table A as below
Assetid attribute value
1546 Ins_date 05062011
1546 status active
1546 X 10.4567
1546 Y 27.56
1546 size 17
675 X 4.778
675 Y 53.676
675 depth 5
675 st_date 06092010
This table has many asset IDs (1546, 675, etc.), and the attributes may vary between assets.
I want output as below:
assetid ins_date status X Y Size depth st_date
1546 05062011 active 10.4567 27.56 17 null null
675 null null 4.778 53.676 null 5 06092010
I have created a stored procedure and call it from Informatica to produce this output. However, since I have a large volume of data, it takes a long time to load.
Please suggest a simpler and faster way to load it.

Use a router to split the rows into separate groups depending on attribute and then use a set of joiners to merge the rows with the same assetid values.

Use an Aggregator transformation to condense the records into one record per assetid. Then for each attribute, create a port that returns MAX(value) where the attribute matches. Note that this method assumes that you know all possible attributes ahead of time.
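The Aggregator approach above can be sketched in plain Python to show the expected shape of the output. This is not Informatica code: it uses the sample rows from the question, assumes the list of attributes is known ahead of time (as the note above says), and compares values as strings when taking the maximum.

```python
# Sample rows from the question: (Assetid, attribute, value)
rows = [
    (1546, "Ins_date", "05062011"),
    (1546, "status", "active"),
    (1546, "X", "10.4567"),
    (1546, "Y", "27.56"),
    (1546, "size", "17"),
    (675, "X", "4.778"),
    (675, "Y", "53.676"),
    (675, "depth", "5"),
    (675, "st_date", "06092010"),
]

# Assumption: every possible attribute is known in advance.
attributes = ["ins_date", "status", "x", "y", "size", "depth", "st_date"]

# One record per asset ID; each attribute column starts as None (null).
pivot = {}
for asset_id, attribute, value in rows:
    record = pivot.setdefault(asset_id, dict.fromkeys(attributes))
    key = attribute.lower()
    current = record[key]
    # Equivalent of MAX(value) restricted to rows with a matching attribute.
    record[key] = value if current is None else max(current, value)

for asset_id, record in pivot.items():
    print(asset_id, record)
```

Attributes that an asset never reports stay as None, which corresponds to the null cells in the desired output.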

As suggested in the previous answer, you can use the Aggregator. Since your data set is large, you can also use a technique based on a variable port in an Expression transformation, provided the data is sorted before it reaches the expression.
You can download sample mappings that demonstrate both techniques from the Informatica Marketplace app titled "PowerCenter Mapping: Convert Rows Into Columns".
