Relations between tables - database-relations

I have 3 tables: A, B and C.
Table A is in relation (n:1) with B and with C.
Typically I store in A the B.Id (or the C.Id) and the table name.
e.g.
A.ParentId = 1, A.TableName = "B"
A.ParentId = 1, A.TableName = "C"
A.ParentId = 2, A.TableName = "B"
Is it a good solution? Are there any other solutions?

Why not 2 parentid columns?
A.ParentIdB = 1
A.ParentIdC = 3

Another possibility is to introduce another table, Content (D), that serves as a "supertype" to Posts and Images. A row in Comments (A) would then reference a primary key in Content, as would each row in Posts (B) and Images (C). Any fields common to Posts and Images (perhaps "title" or "date") would move to Content, and the original tables would keep only the information specific to a post or an image (perhaps "body" or "resolution"). Joins become easier than with table names stored in a field, but nothing in the schema stops a single Content row from being both a post and an image (or from being referenced by several Posts or Images rows, unless the foreign key is also unique). Really, though, it depends on the situation that you're trying to model.
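A minimal sketch of the supertype layout, using Python's built-in sqlite3 module only to keep it runnable. The table and column names (content, posts, images, comments, comment_text) are illustrative assumptions, not part of the original question. Making content_id the primary key of each subtype table is one way to stop a Content row from being referenced by several Posts rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE content (            -- supertype (D): fields shared by posts and images
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE posts (              -- subtype (B): post-specific fields only
    content_id INTEGER PRIMARY KEY REFERENCES content(id),
    body       TEXT NOT NULL
);
CREATE TABLE images (             -- subtype (C): image-specific fields only
    content_id INTEGER PRIMARY KEY REFERENCES content(id),
    resolution TEXT NOT NULL
);
CREATE TABLE comments (           -- table A: one real foreign key, no table name column
    id           INTEGER PRIMARY KEY,
    content_id   INTEGER NOT NULL REFERENCES content(id),
    comment_text TEXT NOT NULL
);
""")
conn.executemany("INSERT INTO content VALUES (?, ?)", [(1, "Hello"), (2, "Sunset")])
conn.execute("INSERT INTO posts VALUES (1, 'First post')")
conn.execute("INSERT INTO images VALUES (2, '1920x1080')")
conn.executemany(
    "INSERT INTO comments (content_id, comment_text) VALUES (?, ?)",
    [(1, "Nice post"), (2, "Great shot")],
)

# A comment joins to its parent without knowing whether it is a post or an image.
rows = conn.execute("""
    SELECT c.comment_text, ct.title
    FROM comments AS c
    JOIN content AS ct ON ct.id = c.content_id
    ORDER BY c.id
""").fetchall()
print(rows)
```

With this layout the database can reject a comment that points at a parent that does not exist, which the ParentId plus TableName pair cannot do.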

Related

DAX/POWERBI : count tickets attached to an item and all its child items

I am new to DAX and Power BI and am having trouble writing a DAX formula for my case.
I have two tables: Assets and Tickets. Each has an Id, and Assets also has a ParentAssetId (which can be 0 or None).
In a DAX expression, I would like to count (and list) all the tickets attached to an asset and its children.
I tried the following, without success:
nbChildTickets =
VAR mykey =
    SELECTEDVALUE ( Assets[AssetKey] )
VAR mypar =
    SELECTEDVALUE ( Assets[ParentId] )
RETURN
    CALCULATE (
        COUNTX ( Tickets, Tickets[TicketKey] ),
        FILTER ( Tickets, RELATED ( Assets[ParentId] ) = mykey )
    )
The Tables and the Canvas
The Assets table contains both the AssetKey and the ParentId columns.
Do you have any idea, or a tutorial, on how to do this?
Thanks
One question: are these tables already related? If they are, you shouldn't need a measure like that to count the tickets by asset and its parent. I would do it in one of the following two ways, supposing you need the result in a table visual.
Option 1.
1. Create a simple measure that counts the rows of the Tickets table, for example Number of Tickets = COUNTROWS('Tickets').
2. Drag a table visual onto the canvas.
3. Add the Assets columns and their child columns to the table, then add your new measure.
Option 2. In case each ticket has an ID.
1. Drag a table visual onto the canvas.
2. Add the Assets columns and their child columns to the table, and also add your ticket Id.
3. In the Fields section (where you drag and drop your columns), right-click the ticket Id and select the Count option.
Remember that both tables must already be related for this to work. Otherwise Power BI will not know how to calculate and display this combination of data.
First create two relationships between the Assets and Tickets tables. The first is the active one-to-many relationship on the AssetKey column.
The second relationship is inactive: Assets[AssetKey] = Tickets[ParentId]
Now use the measures below:
Number Of Tickets = COUNT(Tickets[TicketKey])
Number of Child = CALCULATE(COUNT(Tickets[AssetKey]), USERELATIONSHIP(Assets[AssetKey], Tickets[ParentId]))
[Image: relationship diagram]
[Image: resulting output]
The blank row can be removed with a visual-level filter.
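The counting logic the question asks for (tickets on an asset plus tickets on all of its descendants) can be sketched outside DAX to make the expected numbers explicit. The Python below is only an illustration on made-up sample rows; the data and the helper name tickets_in_subtree are assumptions. In DAX itself, parent-child hierarchies are often flattened with the PATH and PATHCONTAINS functions, which this sketch does not attempt.

```python
from collections import defaultdict

# Hypothetical sample data: (AssetKey, ParentId); ParentId 0 means "no parent".
assets = [(1, 0), (2, 1), (3, 1), (4, 2), (5, 0)]
# Hypothetical sample data: (TicketKey, AssetKey).
tickets = [(100, 1), (101, 2), (102, 4), (103, 4), (104, 5)]

# Index children by parent and tickets by asset for quick lookups.
children = defaultdict(list)
for asset_key, parent_id in assets:
    if parent_id:
        children[parent_id].append(asset_key)

tickets_per_asset = defaultdict(list)
for ticket_key, asset_key in tickets:
    tickets_per_asset[asset_key].append(ticket_key)


def tickets_in_subtree(asset_key):
    """Return the tickets attached to asset_key and to all of its descendants."""
    found = []
    stack = [asset_key]
    while stack:
        current = stack.pop()
        found.extend(tickets_per_asset.get(current, []))
        stack.extend(children.get(current, []))
    return sorted(found)


for key, _ in assets:
    subtree = tickets_in_subtree(key)
    print(key, len(subtree), subtree)
```

A table like this is a handy reference when checking whether a DAX measure returns the right count for each asset.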

How to design querying multiple tags on analytics database

I would like to store custom purchase tags on each transaction. For example, if a user bought shoes, the tags are "SPORTS", "NIKE", "SHOES", "COLOUR_BLACK", "SIZE_12", ...
These are the tags the seller is interested in querying to understand the sales.
My idea is to assign a new code to every new tag as it comes in (something like a hash code, but sequential). The codes start with the 26 letters "a" to "z" and continue with "aa", "ab", "ac", ..., "zz", and so on. All the tags given for one transaction are then kept in a single varchar column called tag, separated by "|".
Let us assume the mapping (kept at application level) is:
"SPORTS" = a
"TENNIS" = b
"CRICKET" = c
...
...
"NIKE" = z //Brands company
"ADIDAS" = aa
"WOODLAND" = ab
...
...
SHOES = ay
...
...
COLOUR_BLACK = bc
COLOUR_RED = bd
COLOUR_BLUE = be
...
SIZE_12 = cq
...
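The sequential letter codes described above behave like a bijective base-26 numbering (1 = a, 26 = z, 27 = aa). A small sketch of that conversion in Python follows; the function names code_for and number_for are hypothetical, and the tag numbers are simply the position of each tag in the sequence.

```python
import string

ALPHABET = string.ascii_lowercase


def code_for(n):
    """Map a 1-based tag number to its letter code: 1 -> 'a', 26 -> 'z', 27 -> 'aa'."""
    if n < 1:
        raise ValueError("tag numbers start at 1")
    letters = []
    while n > 0:
        # Subtracting 1 first makes the digits run a..z with no zero digit.
        n, remainder = divmod(n - 1, 26)
        letters.append(ALPHABET[remainder])
    return "".join(reversed(letters))


def number_for(code):
    """Inverse of code_for: 'a' -> 1, 'z' -> 26, 'aa' -> 27."""
    n = 0
    for ch in code:
        n = n * 26 + ALPHABET.index(ch) + 1
    return n


def encode_tags(numbers):
    """Build the pipe-delimited column value for one transaction."""
    return "|" + "|".join(code_for(n) for n in sorted(numbers)) + "|"


print(encode_tags([1, 26, 51, 55, 95]))
```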
So for the purchase above, the stored value is tag="|a|z|ay|bc|cq|". The seller can then find the number of SHOES sold with the WHERE condition tag LIKE '%|ay|%'. The problem is that an index (a sort key in Redshift) cannot be used for a LIKE pattern that starts with %. How can I solve this, given that I might have 100 million records? I don't want a full table scan.
Is there any way to fix this?
Update_1:
I have not used the bridge table (cross-reference table) approach, because I want to GROUP BY the results after searching for the specified tags. My solution gives only one row when two tags match in a single transaction, but wouldn't a join to a bridge table give me two rows? Then my sum() would be doubled.
I got a suggestion to put
EXISTS (SELECT 1 FROM transaction_tag WHERE tag_id = 'zz' AND trans_id = tr.trans_id)
in the WHERE clause once for each tag (note: this assumes tr is an alias for the transaction table in the surrounding query).
I have not followed this either, since I have to combine AND and OR conditions on the tags, for example ("SPORTS" AND "ADIDAS"), or "SHOE" AND ("NIKE" OR "ADIDAS").
Update_2:
I have not used a bit field, since I don't know whether Redshift supports it. Also, I assume my system will have a minimum of 3,500 tags; allocating one bit for each results in about 438 bytes (3,500 / 8) per transaction, even though a transaction can have at most 5 tags. Is there any optimisation here?
Solution_1:
I have thought of adding a min value and a max value (both SMALLINT) alongside the tag column, and applying an index to them.
Something like this:
"SPORTS" = a = 1
"TENNIS" = b = 2
"CRICKET" = c = 3
...
...
"NIKE" = z = 26
"ADIDAS" = aa = 27
So my column values are
`tag="|a|z|ay|bc|cq|"` //sorted?
`minTag=1`
`maxTag=95` //for cq
The query for searching shoes (ay=51) is
minTag <= 51 AND maxTag >= 51 AND tag LIKE '%|ay|%'
and the query for searching shoes (ay=51) AND SIZE_12 (cq=95) is
minTag <= 51 AND maxTag >= 95 AND tag LIKE '%|ay|%|cq|%'
Will this give any benefit? Kindly suggest any alternatives.
You can implement auto-tagging while the files are loaded to S3. Tagging at the DB level is too late in the process; it is tedious and involves a lot of hard-coding.
1. While loading to S3, tag the object using the AWS s3api CLI, for example:
aws s3api put-object-tagging --bucket --key --tagging "TagSet=[{Key=Addidas,Value=AY}]"
capture tags dynamically by sending and as a parameter
2. Load the tags into DynamoDB as a metadata store.
3. Load the data into Redshift using the COPY command from S3.
You can store the tags column as a varchar bit mask, i.e. a strictly defined sequence of 1s and 0s: 1 if the purchase is marked with a given tag and 0 if not. Every row then holds a sequence of 0s and 1s whose length equals the number of tags you have. This sequence is sortable. You would still need to look into the middle of it, but you know the exact position to check, so you don't need LIKE, just SUBSTRING. As a further optimization, you could convert this bit mask to an integer value (unique for each sequence) and match on that, but AFAIK Redshift doesn't support that out of the box yet; you would have to define the rules yourself.
UPD: It looks like the best option here is to keep tags in a separate table and create an ETL process that unwraps the tags into a tabular structure of (order_id, tag_id), distributed by order_id and sorted by tag_id. Optionally, you can create a view that joins this table with the order table. Lookups for orders with a particular tag, and further aggregations of those orders, should then be efficient. There is no silver bullet for optimizing this in a flat table; at least, I don't know of one that would not bring a lot of unnecessary complexity compared with the "relational" solution.
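The worry raised in the question, that a bridge table doubles sums, applies to joining against it. Filtering with EXISTS keeps one row per order, and AND/OR combinations become ordinary boolean expressions. The sketch below uses SQLite from Python purely as a stand-in to show the query shape; the table names, the sample rows and the use of tag names as tag_id values are assumptions, and Redshift's distribution and sort keys are not modelled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL NOT NULL);
CREATE TABLE order_tag (
    order_id INTEGER NOT NULL,
    tag_id   TEXT NOT NULL,       -- tag names used as ids to keep the example readable
    PRIMARY KEY (order_id, tag_id)
);
""")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 100.0), (2, 50.0), (3, 70.0)])
conn.executemany("INSERT INTO order_tag VALUES (?, ?)", [
    (1, "SPORTS"), (1, "NIKE"), (1, "SHOES"),
    (2, "SPORTS"), (2, "ADIDAS"), (2, "SHOES"),
    (3, "SPORTS"), (3, "ADIDAS"),
])

# "SHOES" AND ("NIKE" OR "ADIDAS"): each order contributes at most one row.
shoes_query = """
SELECT COUNT(*), SUM(o.amount)
FROM orders AS o
WHERE EXISTS (SELECT 1 FROM order_tag t WHERE t.order_id = o.order_id AND t.tag_id = 'SHOES')
  AND (EXISTS (SELECT 1 FROM order_tag t WHERE t.order_id = o.order_id AND t.tag_id = 'NIKE')
    OR EXISTS (SELECT 1 FROM order_tag t WHERE t.order_id = o.order_id AND t.tag_id = 'ADIDAS'))
"""
shoes_count, shoes_total = conn.execute(shoes_query).fetchone()

# "SPORTS" AND "ADIDAS"
sports_query = """
SELECT COUNT(*), SUM(o.amount)
FROM orders AS o
WHERE EXISTS (SELECT 1 FROM order_tag t WHERE t.order_id = o.order_id AND t.tag_id = 'SPORTS')
  AND EXISTS (SELECT 1 FROM order_tag t WHERE t.order_id = o.order_id AND t.tag_id = 'ADIDAS')
"""
sports_count, sports_total = conn.execute(sports_query).fetchone()
print(shoes_count, shoes_total, sports_count, sports_total)
```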

How can I group, sum and count with the Sequel ORM and PostgreSQL?

This is too tough for me guys. It's for Jeremy!
I have two tables (although I can also envision needing to join a third) and I want to sum one field and count rows in the same table while joining with another table, returning the result in JSON format.
First of all, the field that needs to be summed has the data type numeric(10,2), and the data is inserted as params['amount'].to_f.
The tables are expense_projects, which has the project name and the company id, and expense_items, which has the company_id, item and amount (to mention just the critical columns). The "company_id" columns are disambiguated.
So, the following code:
expense_items = DB[:expense_projects].left_join(:expense_items, :expense_project_id => :project_id).where(:project_company_id => company_id).to_a.to_json
works fine but when I add
expense_total = expense_items.sum(:amount).to_f.to_json
I get an error message which says
TypeError - no implicit conversion of Symbol into Integer:
So the first question is: why, and how can this be fixed?
Then I want to join the two tables, get all the project names from the left (first) table, and sum the amounts and count the items in the second table. I have tried
DB[:expense_projects].left_join(:expense_items, :expense_items_company_id => expense_projects_company_id).count(:item).sum(:amount).to_json
and variations of this, all of which fail.
I would like a result that includes all the project names (even those with no expense entries) and looks something like:
project   item_count   item_amount
pr 1      7            34.87
pr 2      0            0
and so on. How can this be achieved with one query that returns the result in JSON format?
Many thanks, guys.
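Figured it out; I hope this helps somebody else. The TypeError most likely occurred because expense_items was already a JSON string (the result of to_json), so sum was String#sum, which expects an integer argument rather than a symbol. The aggregation has to be part of the dataset query instead:
DB[:expense_projects___p].where(:project_company_id=>user_company_id).
  left_join(:expense_items___i, :expense_project_id=>:project_id).
  select_group(:p__project_name).
  select_more{count(:i__item_id)}.
  select_more{sum(:i__amount)}.to_a.to_json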
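The query shape behind that answer is a LEFT JOIN with GROUP BY, counting a column from the right-hand table so that projects without items report 0. The sketch below reproduces it with Python's sqlite3 and json modules on invented rows; it is not Sequel code, the sample data is an assumption, and the COALESCE around SUM is an addition so that empty projects show 0 instead of NULL.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE expense_projects (project_id INTEGER PRIMARY KEY, project_name TEXT);
CREATE TABLE expense_items (
    item_id            INTEGER PRIMARY KEY,
    expense_project_id INTEGER,
    amount             NUMERIC
);
""")
conn.executemany("INSERT INTO expense_projects VALUES (?, ?)", [(1, "pr 1"), (2, "pr 2")])
conn.executemany(
    "INSERT INTO expense_items (expense_project_id, amount) VALUES (?, ?)",
    [(1, 10.5), (1, 4.5)],
)

rows = conn.execute("""
    SELECT p.project_name,
           COUNT(i.item_id)           AS item_count,   -- counts only matched items
           COALESCE(SUM(i.amount), 0) AS item_amount   -- 0 instead of NULL when empty
    FROM expense_projects AS p
    LEFT JOIN expense_items AS i ON i.expense_project_id = p.project_id
    GROUP BY p.project_id, p.project_name
    ORDER BY p.project_name
""").fetchall()

result = [dict(zip(("project", "item_count", "item_amount"), row)) for row in rows]
print(json.dumps(result))
```

Counting i.item_id rather than using COUNT(*) matters here: COUNT(*) would report 1 for a project with no items, because the left join still produces one row for it.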

Core Data. Is it possible creating a view like you would do with normal SQL

In the normal SQL world you would use CREATE VIEW ... to define a view on one or more tables, e.g. to capture a join and a GROUP BY. Is something like that possible in Core Data?
The reason I'm asking is that I have a table of details. Each detail record has two keys and an amount. I need to show the sum of the amounts grouped by the two keys in a table view, i.e. the first key as the section and the second key as a normal entry with the summed amount. I thought an NSFetchedResultsController (FRC) would work, but it does not group (add up) the detail records. With a normal fetch request I can group and get everything, but handling the sections manually seems to be a lot of work. So I thought the best approach would be to put a view on the table and use the FRC to bring it into the table view. Does that make sense? Any help is very much appreciated.
example:
I have three fields:
A X 2
A X 2
A Z 3
B X 2
B Y 2
B Y 1
B Z 8
As a result I need:
Section: A
X 4
Z 3
Section: B
X 2
Y 3
Z 8
I am not sure whether there is a shorter answer, but here's how you can do it.
I'll assume the three columns are called firstCol, secondCol and thirdCol.
You can use this predicate to get all the objects for "A" and put them in resultArray:
// Loop over the letters A to Z; for "A" it looks like this:
NSPredicate *aPredicate = [NSPredicate predicateWithFormat:@"firstCol = %@", @"A"];
Then find all the second-column letters for the objects that have A in the first column (resultArray):
NSArray *allLetters = [resultArray valueForKeyPath:@"@distinctUnionOfObjects.secondCol"];
In the case of "A", allLetters will contain X and Z. Then loop over allLetters and add up the third column:
for (NSString *letter in allLetters) {
    // Keep only the objects whose second column matches this letter.
    NSPredicate *letterPredicate = [NSPredicate predicateWithFormat:@"secondCol = %@", letter];
    NSArray *matches = [resultArray filteredArrayUsingPredicate:letterPredicate];
    // Sum the third column, e.g. 4 for X in the case of "A".
    NSNumber *sum = [matches valueForKeyPath:@"@sum.thirdCol"];
    // Insert the sum in an array and then a dictionary that can be used as the data source of the table.
}
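The grouping that the loop above builds (sections keyed by the first column, rows keyed by the second, amounts summed) is shown here as a language-neutral sketch in Python rather than Objective-C. The variable names are illustrative, and the records are the sample rows from the question.

```python
from collections import defaultdict

# (first key, second key, amount), as listed in the question
records = [
    ("A", "X", 2), ("A", "X", 2), ("A", "Z", 3),
    ("B", "X", 2), ("B", "Y", 2), ("B", "Y", 1), ("B", "Z", 8),
]

# sections[first_key][second_key] accumulates the summed amount.
sections = defaultdict(lambda: defaultdict(int))
for first_key, second_key, amount in records:
    sections[first_key][second_key] += amount

# Flatten into an ordered structure a table view data source could index:
# one entry per section, each holding its (second key, sum) rows.
table_data = [
    (section, sorted(rows.items()))
    for section, rows in sorted(sections.items())
]

for section, rows in table_data:
    print("Section:", section)
    for second_key, total in rows:
        print(" ", second_key, total)
```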

LINQ Grouping help

I have a database table that holds parent and child records much like a Categories table. The ParentID field of this table holds the ID of that record's parent record...
My table columns are: SectionID, Title, Number, ParentID, Active
I only plan to allow the parent-to-child relationship to go two levels deep, so I have sections and sub-sections and that's it.
I need to output this data in my MVC view page as an outline, like so:
Section 1
    Sub-Section 1 of 1
    Sub-Section 2 of 1
    Sub-Section 3 of 1
Section 2
    Sub-Section 1 of 2
    Sub-Section 2 of 2
    Sub-Section 3 of 2
Section 3
I am using Entity Framework 4.0 and MVC 2.0 and have never tried something like this with LINQ. I have a FK set up on the section table mapping the ParentID back to the SectionID hoping EF would create a complex "Section" type with the Sub-Sections as a property of type list of Sections but maybe I did not set things up correctly.
So I am guessing I can still get the end result using a LINQ query. Can someone point me to some sample code that could provide a solution or possibly a hint in the right direction?
Update:
I was able to straighten out my EDMX so that I can get the sub-sections for each section as a property of type list, but now I realize I need to sort the related entities.
var sections = from section in dataContext.Sections
               where section.Active == true && section.ParentID == 0
               orderby section.Number
               select new Section
               {
                   SectionID = section.SectionID,
                   Title = section.Title,
                   Number = section.Number,
                   ParentID = section.ParentID,
                   Timestamp = section.Timestamp,
                   Active = section.Active,
                   Children = section.Children.OrderBy(c => c.Number)
               };
produces the following error.
Cannot implicitly convert type 'System.Linq.IOrderedEnumerable' to 'System.Data.Objects.DataClasses.EntityCollection'
Your model has two navigation properties, Sections1 and Section1. Rename the first one to Children and the second one to Parent.
Depending on whether you have a root Section, have each top-level section parented to itself, or make the parent nullable, your query might look something like:
// assume top sections are ones where parent == self
var topSections = context.Sections.Where(section => section.ParentID == section.SectionID);
// now put them in order (might have multiple orderings depending on input, pick one)
topSections = topSections.OrderBy(section => section.Title);
// now get the children in order, using an anonymous type for the projection
var result = topSections.Select(section => new { top = section, children = section.Children.OrderBy(child => child.Title) });
For some LINQ examples:
http://msdn.microsoft.com/en-us/vcsharp/aa336746.aspx
This covers pretty much all of the LINQ operations; have a look at GroupBy in particular. The key is to understand the input and output of each operation so that you can chain several in series. There is no shortcut: you have to learn what they do so you know what's at hand. LINQ expressions are just combinations of these operations with some syntactic sugar.
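The shape of the projection (ordered top-level sections, each paired with its ordered children) is sketched below in Python rather than C#, only to show the intended result. The Section dataclass, the sample rows and the convention that parent_id 0 marks a top-level section (as in the question's own query) are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Section:
    section_id: int
    title: str
    number: int
    parent_id: int  # 0 marks a top-level section (assumption taken from the question's query)
    active: bool = True


def build_outline(sections):
    """Return [(section title, [sub-section titles])], both levels ordered by number."""
    active = [s for s in sections if s.active]
    top_level = sorted((s for s in active if s.parent_id == 0), key=lambda s: s.number)
    outline = []
    for parent in top_level:
        kids = sorted(
            (s for s in active if s.parent_id == parent.section_id),
            key=lambda s: s.number,
        )
        outline.append((parent.title, [kid.title for kid in kids]))
    return outline


sample = [
    Section(2, "Section 2", 2, 0),
    Section(1, "Section 1", 1, 0),
    Section(11, "Sub-Section 2 of 1", 2, 1),
    Section(10, "Sub-Section 1 of 1", 1, 1),
    Section(20, "Sub-Section 1 of 2", 1, 2),
    Section(21, "Hidden", 2, 2, active=False),
]

for title, children in build_outline(sample):
    print(title)
    for child in children:
        print("   ", child)
```

Projecting into a plain pair (or an anonymous type, as in the answer) avoids assigning an ordered sequence to the entity's own Children collection, which is what triggered the conversion error in the question.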

Resources