CONTEXT:
I want to monitor payment transactions for money laundering, where payments cross multiple borders. There are a max of 6 countries shown per transaction. For each of these countries, I need to know a risk level.
I have 2 tables:
Transaction data (where there are many transactions from same country)
Country Risk (containing each country once, with an added risk classification. There are 100+ countries, and only 6 different Risk levels).
Problem:
I would like to look up the Risk Classification per country in Transaction Data. The problem is, there are 6 countries per transaction in Transaction Data. So I have to link Transaction data to Country Risk 6 times. Only 1 relationship can be active, of course.
So I tried writing the following measure:
CALCULATE(
VALUES('Country Risk'[Risk classification]);
USERELATIONSHIP('Transaction Data'[Country 2];'Country Risk'[Country Code]))
I get an error though when using the measure in a graph where I listed the countries from Transaction Data (where every country is mentioned in multiple rows) and I wanted to see the related risk categories:
A table of multiple values was supplied where a single value was expected
What am I doing wrong?
Made similar test data: https://drive.google.com/file/d/1_kJW-BpbrwCsbSpxdo7AJ3IzPy2oLWFJ/view?usp=sharing
Needed:
for every C (C1-C6) column I need to add the risk category.
For every C column I need to make a pie chart showing the number of transactions per risk category for that C column
Pie charts should filter the transaction overview.
I've talked to a PBI consultant about this. There is no way to solve this the way I wanted (multiple relationships between 2 tables all acting as if they were active at the same time).
The only ways of getting it done would be:
1. Write measures (but that doesn't allow cross filtering between the pie chart and the table below).
2. Unpivot the country columns (but that wouldn't allow 6 columns with risk category in the table).
3. Have 6 dimension tables (this solves the issue, because it allows cross filtering between the pie chart and the other pie charts and table, and it allows 6 columns for the separate risks in the table visual).
Thanks for trying to help, guys!
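For readers outside Power BI, the logic behind option 3 (one independent lookup per country column) can be sketched in plain Python. This is only an illustration of the idea, not a Power BI implementation; the sample rows, country codes, risk labels, and function names are all made up, and only three of the six C columns are shown.

```python
from collections import Counter

# Hypothetical sample data; country codes and risk labels are invented.
country_risk = {"NL": "Low", "DE": "Low", "PA": "High", "RU": "Medium"}

transactions = [
    {"id": 1, "C1": "NL", "C2": "PA", "C3": "DE"},
    {"id": 2, "C1": "DE", "C2": "RU", "C3": None},
    {"id": 3, "C1": "NL", "C2": "PA", "C3": "RU"},
]
country_columns = ["C1", "C2", "C3"]  # the real data has C1..C6


def add_risk_columns(rows, columns, risk_lookup):
    """Return copies of rows with one '<column> risk' field per country column."""
    enriched = []
    for row in rows:
        new_row = dict(row)
        for col in columns:
            # Each column gets its own independent lookup, which is what
            # six separate dimension tables provide in the model.
            new_row[f"{col} risk"] = risk_lookup.get(row.get(col))
        enriched.append(new_row)
    return enriched


def count_by_risk(rows, column):
    """Number of transactions per risk category for one country column."""
    key = f"{column} risk"
    return Counter(r[key] for r in rows if r[key] is not None)


enriched = add_risk_columns(transactions, country_columns, country_risk)
c2_counts = count_by_risk(enriched, "C2")  # data behind the C2 pie chart
```

Each call to count_by_risk corresponds to the data behind one pie chart, and the enriched rows correspond to the table visual with one risk column per country column.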
I have a table of customers with different statuses in different months.
I have added the Status value to a Power BI slicer visual to filter the matrix data. When I select, for example, A, it only shows customers who have status A in a certain period.
Filtered Customer Data
(6 and 8 are missing because they don't have status A in any period.) The problem is that I want to see all the statuses of the customers who had status A at least once. Is that possible somehow in Power BI?
Result I want to See
Good news: there is a pretty easy fix for this.
Create a new table using DAX.
FilterableStatuses =
SUMMARIZE(
DemoData,
DemoData[CustomerID],
DemoData[Status]
)
Create a relationship in your model between CustomerID on this new table and CustomerID on the table from your visual. It will be Many to Many, and I prefer not to leave the filter direction as 'both' -- make it so FilterableStatuses filters your original table.
Create a slicer using the status from FilterableStatuses rather than the original table, and that should give you the behavior that you're after. In essence, rather than filtering the visual by [Status], you are filtering the list of CustomerIDs by status, and then letting the new relationship filter your visual to those CustomerIDs.
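If it helps to see why this works, here is the same two-step filtering written as a small Python sketch. It is not DAX and not what Power BI runs internally; the rows and the function name rows_for_status are made up for illustration.

```python
# Hypothetical rows standing in for DemoData: (CustomerID, Month, Status).
demo_data = [
    (1, "Jan", "A"), (1, "Feb", "B"),
    (6, "Jan", "B"), (6, "Feb", "C"),
    (7, "Jan", "C"), (7, "Feb", "A"),
]

# Rough equivalent of SUMMARIZE(DemoData, CustomerID, Status):
# the distinct (CustomerID, Status) pairs.
filterable_statuses = {(cid, status) for cid, _month, status in demo_data}


def rows_for_status(selected_status):
    """All rows of every customer who had selected_status at least once."""
    # Step 1: the slicer narrows the pairs table to the matching customers.
    customer_ids = {
        cid for cid, status in filterable_statuses if status == selected_status
    }
    # Step 2: the relationship passes only CustomerID, so the original
    # table keeps every status of those customers.
    return [row for row in demo_data if row[0] in customer_ids]


result = rows_for_status("A")
```

With this sample data, customers 1 and 7 keep all of their rows, and customer 6 (who never had status A) drops out.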
Hope it helps!
Using Java and Oracle.
We need to send changes to an employee's Email and UserID to a third party.
The actual table is Employee. We also keep an intermediate table, which we use to detect changes before sending them to the third party.
These are the database designs we have in mind for the intermediate table:
Single table:
EmployeeiD|Value|Type|UpdateDate
Value is the userid or email; Type will be 'email' or 'userid'. The update date is kept so we can figure out which of email or userid was different and send the update to the third party.
Multiple tables:
Employee_EmailID
EmpId|EmailID|Updatedate
Employee_UserID
EmpId|UserID|Updatedate
Java flow will be:
Pick employee from actual table.
Pick employee from above intermediate table.
Compare differences. Update difference to third party.
Update above table with updated value and last update date.
Which one is considered the best way: the single-table approach or multiple tables? Or is there a standard way to implement this? There are 10,000 employees in the system.
The intermediate table just stores delta records, i.e. records already transferred to the third party, so that they can be compared the next day.
Good database design has separate tables for different concepts. Using the same database column to hold different types of data will lead to code which is harder to understand, prone to data corruption and less performant.
You may think it's only two tables and a few tens of thousands of rows, so does it matter? But that is only your current requirement. What you choose now will set the template for what happens when (say) you need to add telephone numbers to the process.
Now in future if we get 5 more entities to update
Do you mean "entities", like say Customers rather than Employees? Or do you really mean "attributes" as in my example of Employee Telephone Number?
Generally speaking we have a separate table for distinct entities, and all the attributes of that entity are grouped at the same cardinality. To take your example, I would expect an Employee to have one UserID and one Email Address so I would design the table like this:
Employee_audit
EmpId|UserID|EmailID|Updatedate
That is, I have one record which stores the complete state of the Employee record at the Updatedate.
If we add a new entity, say Customers, then we have a new table. Simple. But a new attribute like Employee Phone Number offers a choice, because an employee can have more than one: work landline, mobile, fax, home, etc. So we could represent this in three ways: a child table with a type column, multiple child tables for each type, or as distinct columns on the Employee record.
For the main Employee table I would choose the separate table (or tables, depending on whether I'm shooting for 6NF). But for an audit table I would choose one record per Employee and pivot the phone numbers like this:
Employee_audit
EmpId|UserID|EmailID|Landline|Mobile|Fax|Home|Updatedate
The one thing I would never do is have a single table with type and value columns. It seems attractive because it means we could track additional entities without any further DDL. But in fact it becomes harder to re-assemble the complete state of an Employee at any given time with each attribute we add. Also it means the auditing process itself is more complicated (because it needs to determine which attributes have changed and whether it needs to audit the change) and more expensive (because changing three attributes on the same record entails inserting three audit records).
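To make the one-record-per-employee design concrete, here is a minimal Python sketch of the compare-and-update flow against an Employee_audit snapshot. The question uses Java and Oracle, so treat this purely as an illustration of the logic: the in-memory dictionaries, the sample values, and the sync function are assumptions, not part of any real system.

```python
from datetime import date

# Hypothetical snapshots keyed by EmpId; real data would come from the
# Employee table and the Employee_audit table.
employees = {
    1: {"UserID": "jdoe", "EmailID": "jdoe@example.com"},
    2: {"UserID": "asmith", "EmailID": "alice.smith@example.com"},
    3: {"UserID": "bnew", "EmailID": "bnew@example.com"},
}
employee_audit = {
    1: {"UserID": "jdoe", "EmailID": "jdoe@example.com",
        "Updatedate": date(2024, 1, 1)},
    2: {"UserID": "asmith", "EmailID": "asmith@example.com",
        "Updatedate": date(2024, 1, 1)},
}
TRACKED = ("UserID", "EmailID")


def sync(current, audit, today):
    """Return {EmpId: changed attributes} and refresh the audit snapshot."""
    to_send = {}
    for emp_id, row in current.items():
        previous = audit.get(emp_id, {})
        changed = {a: row[a] for a in TRACKED if row[a] != previous.get(a)}
        if changed:
            to_send[emp_id] = changed
            # One audit record per employee holds the complete state.
            audit[emp_id] = {**{a: row[a] for a in TRACKED}, "Updatedate": today}
    return to_send


changes = sync(employees, employee_audit, date(2024, 1, 2))
```

Because the whole state sits in one row, a change to several attributes still results in a single audit write, and adding a new tracked attribute only means extending TRACKED (and the table) rather than adding a new type value.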
We have an audit table which we get from the OLTP system. It records any activity done by the user, including whether they downloaded an attachment, read a note, wrote a note, or made any change to an incident, etc. How do we include this audit table activity in our dimensional model for an incident management system (IT service management)?
At a simple level, which is all I can offer given the level of detail in the question: look at your audit table and decide which categories of audit you want as dimensions. Perhaps there are audit_type, user_type, and audit_subtype fields or something like that? Typically you also have another field called a "measure" or "quantity", used for numeric statistics and aggregate functions. For example, you might have store_id and product_cat as categorical dimensions, and roll up sales$ as min, max, avg, stdev grouped by different date levels like month and quarter and by other dimensions. If your data is purely categorical by date, then COUNT() is usually used as a calculated measure.
You really just need to decide how you want to be able to drill up and drill down through the data, which categories matter, and which quantities matter. Once you decide that, create a flat table with FKs to lookup tables. A star schema is simply a fact table with a bunch of lookup tables floating around it like a star.
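As a rough sketch of what that could look like for audit activity, here is a tiny star schema built in an in-memory SQLite database from Python. The table names, column names, and sample rows are all assumptions made up for the example; your actual audit fields will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_activity_type (activity_type_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_user (user_id INTEGER PRIMARY KEY, user_type TEXT);
-- Fact table: one row per audited activity, with FKs to the lookup tables.
CREATE TABLE fact_audit (
    incident_id INTEGER,
    activity_type_id INTEGER REFERENCES dim_activity_type(activity_type_id),
    user_id INTEGER REFERENCES dim_user(user_id),
    activity_date TEXT
);
INSERT INTO dim_activity_type VALUES
    (1, 'read note'), (2, 'wrote note'), (3, 'downloaded attachment');
INSERT INTO dim_user VALUES (10, 'agent'), (11, 'customer');
INSERT INTO fact_audit VALUES
    (100, 1, 10, '2024-01-05'),
    (100, 2, 10, '2024-01-05'),
    (100, 1, 11, '2024-01-06'),
    (101, 3, 11, '2024-02-01');
""")

# The data is purely categorical, so COUNT(*) serves as the measure.
rows = conn.execute("""
SELECT a.name, COUNT(*) AS activity_count
FROM fact_audit f
JOIN dim_activity_type a ON a.activity_type_id = f.activity_type_id
GROUP BY a.name
ORDER BY a.name
""").fetchall()
```

Swapping the GROUP BY column for user_type, incident_id, or the month of activity_date gives the other drill paths.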
Hope this helps
https://web.stanford.edu/dept/itss/docs/oracle/10g/olap.101/b10333/globdiag.gif
Assume that we have a star schema as above.
My question is: in real time, how do we populate the unit_price and unit_cost columns of the fact table?
Can anyone provide star schema tables with real data?
I am having a hard time understanding star schemas.
Please help!
A star schema consists of two types of tables: fact tables and dimensions.
The idea of the star design is that you can split your data into two parts.
The static part is described with dimensions and the dynamic part (= transactions) in the fact table.
Each transaction is stored in the fact table as a new record and is connected to the surrounding dimensions, that define the context of the transaction.
The example in link contains two fact tables: SHIPMENTS and PRODUCT_CONDITIONS.
Note that the fact tables in the link are named UNITS_HISTORY_FACT and PRICE_AND_COST_HISTORY_FACT, but I don't find those names the best choice.
The SHIPMENTS table stores one record for each shipment of a PRODUCT to a CUSTOMER at some TIME, via a defined CHANNEL.
All the above information is defined using the corresponding keys of the respective dimensions.
The fact table also contains MEASURES describing the attributes of the transaction, here the number of UNITS shipped.
The structure of the fact table would be therefore
CUSTOMER_ID
PRODUCT_ID
TIME_ID
CHANNEL_ID
UNITS
The second fact table (bottom) is more interesting, because here you split the product in two parts:
PRODUCT dimension, defining the ID, name and other more static attributes
PRODUCT_CONDITION fact table, designed with the expectation that the price and cost of the product will change over time.
With each change of the price or cost, insert a new record in the fact table and connect it to the PRODUCT and TIME (of change).
The structure of the fact table would be therefore
PRODUCT_ID
TIME_ID
UNIT_PRICE
UNIT_COST
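To show how UNIT_PRICE and UNIT_COST get populated, here is a minimal sketch using an in-memory SQLite database from Python. The sample product, the prices, the use of an ISO date string as TIME_ID, and the helper functions record_change and condition_at are all assumptions for illustration; they are not taken from the linked Oracle schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product_condition (
    product_id INTEGER REFERENCES product(product_id),
    time_id TEXT,          -- ISO date used as the time key in this sketch
    unit_price REAL,
    unit_cost REAL
);
INSERT INTO product VALUES (1, 'Widget');
""")


def record_change(product_id, time_id, unit_price, unit_cost):
    """Insert a new fact row whenever price or cost changes (no updates)."""
    conn.execute(
        "INSERT INTO product_condition VALUES (?, ?, ?, ?)",
        (product_id, time_id, unit_price, unit_cost),
    )


def condition_at(product_id, time_id):
    """Price and cost in effect on a date: the latest row not after it."""
    return conn.execute(
        """SELECT unit_price, unit_cost FROM product_condition
           WHERE product_id = ? AND time_id <= ?
           ORDER BY time_id DESC LIMIT 1""",
        (product_id, time_id),
    ).fetchone()


record_change(1, "2024-01-01", 10.0, 6.0)  # initial price and cost
record_change(1, "2024-03-01", 12.0, 6.5)  # later change adds a new row
```

The point of the sketch is that existing fact rows are never updated: each change adds a row, so the history stays queryable, for example to find the price that applied when a shipment happened.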
A final note on the design of the TIME dimension.
The best practice for connecting the fact table with the dimension tables is to use meaningless IDs (surrogate keys), but for the TIME dimension you should be careful. For big (time-partitioned) fact tables, the natural key (DATE format) is often used so that partitioning features can be deployed. See more details in How I Defined a Time Dimension Using a Surrogate Key and other resources on the web.
I am trying to create a report which groups on a column called "Legal Entity." When the output is directed to Excel, a separate tab will be created for each distinct entity in the query resultset.
For each Excel tab/Legal Entity, there will be two "sections." The first is a repeating section that breaks on a column "Funding Arrangement Type." After all of the Funding Arrangement Types are exhausted, there will be a single "Totals" grid which will summarize the data on the tab for the current Legal Entity. The data will be summarized across all Funding Arrangement Types within the current Legal Entity.
Because the Totals (lower) grid is really just a summarization of the same source query, Query1, I thought that I would also bind the Totals grid to it. However, if I do that, I get a run time error that tells me that I need to establish a Master-Detail relationship (If I decide to use a separate query for the Totals grid, the Totals grid "will not be aware" of the current Legal Entity/tab that must be considered when summarizing.)
Therefore, I continued with my guess at how the Master-Detail relationship should be defined. I made various attempts to link the two grids, including:
On all of the dimension (non-summarized) columns.
On Legal Entity
On Legal Entity and Funding Arrangement Type
Doing so affected previously correct totals reported in the upper cross tab results.
This Master-Detail approach is foreign to me, and as a result I don't understand what it is doing.
I also tried using a separate query, Query2, for the lower totals grid and adding a filter to filter SQL2 where SQL2.LegalEntity = SQL1.LegalEntity, in an effort to get the totals grid to summarize within the current LegalEntity grouping. This resulted in a cross join error.
I’m a real noob with Cognos. Suggestions are welcomed. Thank you!
I was able to get it working by binding both grids to a single query and establishing a Master-Detail Relationship on Legal Entity for both grids. Before doing that, I added these columns to both grids and hid them; I'm not sure if this was necessary.