Add column to nested tables using outer value in Power Query - powerquery

I have an outer table with "Name" and "Content" columns, I also have nested tables contained in the "Content" column
How do I add a new column in the nested tables using the value "Name" from the outer one?
If I add a new column in the outer using
= Table.AddColumn(Step-1,"NewColOut", each Table.AddColumn([Content],"FileName", (x)=> [Name]))
I have no problem, what if I want to transform "Content" without adding a new column in the outer one?
I tried Table.TransformColumns but to no avail, I am not able to bring in the "Name" value at the nested table level
any help would be greately appreciated

You don't really need to do this, since if you expand the embedded table, it will automatically copy down the filename, but if you wanted to, you could use the simple line
#"Added Custom1" = Table.AddColumn(#"PriorStepNameGoesHere, "NewColOut", each let name=[Name] in Table.AddColumn([Content],"Filename",each name))
or with transform
#"Added Custom1"= Table.FromRecords(Table.TransformRows(#"PriorStepNameGoesHere",
(r) => Record.TransformFields(r,
{"Content", each Table.AddColumn(_,"NewColOut",each r[Name]) })))

Related

Combine PowerBI DAX Filter and SELECTCOLUMN

I want to create a new table based on this one:
that filters for Warehouse=2 and "drops" the columns "Price" and "Cost" like this:
I have managed to apply the filter in the first step using:
FILTER(oldtable;oldtable[Warehouse]=2)
and then in the next step cold create another table that only selects the required columns using:
newtable2=SELECTCOLUMNS("newtable1";"Articlename";...)
But I want to be able to combine these two functions and create the table straight away.
This is very simple, because in your first step, a table is returned which you can use directly in your second statement.
newTabel = SELECTCOLUMNS(FILTER(warehouse;warehouse[Warehouse]=2);"ArticleName";warehouse[Articlename];"AmountSold";warehouse[AmountSold];"WareHouse";warehouse[Warehouse])
If you want to keep the overview, you can also use variables and return:
newTabel =
var filteredTable = FILTER(warehouse;warehouse[Warehouse]=2)
return SELECTCOLUMNS(filteredTable;"ArticleName";warehouse[Articlename];"AmountSold";warehouse[AmountSold];"WareHouse";warehouse[Warehouse])

Drupal 7 | Query through multiple node references

Let me start with structure first:
[main_node]->field_reference_to_sub_node->[sub_node]->field_ref_to_sub_sub_node->[sub_sub_node]
[sub_sub_node]->field_type = ['wrong_type', 'right_type']
How to efficiently query all [sub_sub_node] ids with right_type, referenced by main_node (which is current opened node)?
Doing node_load on foreach seems a bit of overkill for this. Anybody has some better solutions? Greatly appreciated!
If you want to directly query the table of the fields:
$query = db_select('node', 'n')->fields('n_sub_subnode', array('nid'));
$query->innerJoin('table_for_field_reference_to_sub_node', 'subnode', "n.nid = subnode.entity_id AND subnode.entity_type='node'");
$query->innerJoin('node', 'n_subnode', 'subnode.subnode_target_id = n_subnode.nid');
$query->innerJoin('table_for_field_ref_to_sub_sub_node', 'sub_subnode', "n_subnode.nid = sub_subnode.entity_id AND sub_subnode.entity_type='node'");
$query->innerJoin('node', 'n_sub_subnode', 'sub_subnode.sub_subnode_target_id = n_sub_subnode.nid');
$query->innerJoin('table_for_field_type', 'field_type', "n_sub_subnode.nid = field_type.entity_id AND field_type.entity_type='node'");
$query->condition('n.nid', 'your_main_node_nid');
$query->condition('field_type.field_type_value', 'right_type');
Here is the explanation of each line:
$query = db_select('node', 'n')->fields('n_sub_subnode', array('nid'));
We start by querying the base node table, with the alias 'n'. This is the table used for the 'main_node'. The node ids which will be returned will be however from another alias (n_sub_subnode), you will see it below.
$query->innerJoin('table_for_field_reference_to_sub_node', 'subnode', "n.nid = subnode.entity_id AND subnode.entity_type='node'");
The first join is with the table of the field_reference_to_sub_node field, so you have to replace this with the actual name of the table. This is how we will get the references to the subnodes.
$query->innerJoin('node', 'n_subnode', 'subnode.subnode_target_id = n_subnode.nid');
A join back to the node table for the subnodes. You have to replace the 'subnode_target_id' with the actual field for the target id from the field_reference_to_sub_node table. The main purpose of this join is to make sure there are valid nodes in the subnode field.
$query->innerJoin('table_for_field_ref_to_sub_sub_node', 'sub_subnode', "n_subnode.nid = sub_subnode.entity_id AND sub_subnode.entity_type='node'");
The join to the table that contains references to the sub_sub_node, so you have to replace the 'table_for_field_ref_to_sub_sub_node' with the actual name of the table. This is how we get the references to the sub_sub_nodes.
$query->innerJoin('node', 'n_sub_subnode', 'sub_subnode.sub_subnode_target_id = n_sub_subnode.nid');
The join back to the node table for the sub_sub_nodes, to make sure we have valid references. You have to replace the 'sub_subnode_target_id' with the actual field for the target id from the 'field_ref_to_sub_sub_node' table.
$query->innerJoin('table_for_field_type', 'field_type', "n_sub_subnode.nid = field_type.entity_id AND field_type.entity_type='node'");
And we can now finally join the table with the field_type information. You have to replace the 'table_for_field_type' with the actual name of the table.
$query->condition('n.nid', 'your_main_node_nid');
You can put now a condition for the main node id if you want.
$query->condition('field_type.field_type_value', 'right_type');
And the condition for the field type. You have to replace the 'field_type_value' with the actual name of the table field for the value.
Of course, if you are really sure that you always have valid references, you can skip the joins to the node table and directly join the field tables using the target id and the entity_id fields (basically the target_id from on field table has to be the entity_id for the next one).
I really hope I do not have typos, so please check the queries carefully.

Some help on a Union LINQ expression please

I need some help constructing a LINQ expression. I tend to use Lambda syntax.
I have 2 tables
OrderItem >- LibraryItem
OrderItem has a number of columns:
Id
FkLibraryItemId
Text
FkOrderId
LibraryItem has a number of Columns:
Id
Text
Type
Usually when selecting an "OrderItem", one picks a "Library Item". The "Id" and "Text" value are placed into the item record.
Sometimes a user may add a one off "OrderItem" which does not need storing in the "LibraryItem" table. It is simply stored in "OrderItem", but without a "FkLibraryItemId". So I have records in "OrderItem" that do not exist in "LibraryItem".
I need the LINQ to pull out all the relevant "LibraryItem" records of "Type=X" in addition to the "OrderItem" records for the relevant Order Id.
Many thanks in advance.
UPDATE:
I think I am talking about something like:
LibraryItem.Select(new{Id,Text}).Union(Order.Select(new{Id, Text})
context.OrderItems.Include("LibraryItems")
.Where(o => o.OrderId == orderId
&&(o.LibraryItem != null ? o.LibraryItem.Type == "X" : true))
Linq-to-SQL has a different syntax for Include, which basically eager-loads the reference objects (probably LoadWith, I don't remember at the moment).
Actually you can also do it with left inner join.

How do I improve the performance of this simple LINQ?

I have two tables, one parent "Point" and one child "PointValue", connected by a single foreign key "PointID", making a one-to-many relation in SQL Server 2005.
I have a LINQ query:
var points = from p in ContextDB.Points
//join v in ContextDB.PointValues on p.PointID equals v.PointID
where p.InstanceID == instanceId
orderby p.PointInTime descending
select new
{
Point = p,
Values = p.PointValues.Take(16).ToList()
};
As you can see from the commented out join and the "Values" assignment, the "Point" table has a relation to "PointValue" (called "Points" and "PointValues" by LINQ).
When iterating through the "var points" IQueryable (say, when binding it to a GridView, etc.) the initial query is very fast, however iterating through the "Values" property is very slow. SQL Profiler shows me that for each value in the "points" IQueryable another query is executed.
How do I get this to be one query?
Interestingly, the initial query becomes very slow when the join is uncommented.
I think you want to use the DataLoadOptions.LoadWith method, described here:
http://msdn.microsoft.com/en-us/library/system.data.linq.dataloadoptions.loadwith.aspx
In your case you would do something like the following, when creating your DataContext:
DataLoadOptions options = new DataLoadOptions();
ContextDB.LoadOptions = options;
options.LoadWith((Point p) => p.PointValues);
You should make sure that the PointValues table has an index on the PointID column.
See also this SO question: Does Foreign Key improve query performance?

Can I force the auto-generated Linq-to-SQL classes to use an OUTER JOIN?

Let's say I have an Order table which has a FirstSalesPersonId field and a SecondSalesPersonId field. Both of these are foreign keys that reference the SalesPerson table. For any given order, either one or two salespersons may be credited with the order. In other words, FirstSalesPersonId can never be NULL, but SecondSalesPersonId can be NULL.
When I drop my Order and SalesPerson tables onto the "Linq to SQL Classes" design surface, the class builder spots the two FK relationships from the Order table to the SalesPerson table, and so the generated Order class has a SalesPerson field and a SalesPerson1 field (which I can rename to SalesPerson1 and SalesPerson2 to avoid confusion).
Because I always want to have the salesperson data available whenever I process an order, I am using DataLoadOptions.LoadWith to specify that the two salesperson fields are populated when the order instance is populated, as follows:
dataLoadOptions.LoadWith<Order>(o => o.SalesPerson1);
dataLoadOptions.LoadWith<Order>(o => o.SalesPerson2);
The problem I'm having is that Linq to SQL is using something like the following SQL to load an order:
SELECT ...
FROM Order O
INNER JOIN SalesPerson SP1 ON SP1.salesPersonId = O.firstSalesPersonId
INNER JOIN SalesPerson SP2 ON SP2.salesPersonId = O.secondSalesPersonId
This would make sense if there were always two salesperson records, but because there is sometimes no second salesperson (secondSalesPersonId is NULL), the INNER JOIN causes the query to return no records in that case.
What I effectively want here is to change the second INNER JOIN into a LEFT OUTER JOIN. Is there a way to do that through the UI for the class generator? If not, how else can I achieve this?
(Note that because I'm using the generated classes almost exclusively, I'd rather not have something tacked on the side for this one case if I can avoid it).
Edit: per my comment reply, the SecondSalesPersonId field is nullable (in the DB, and in the generated classes).
The default behaviour actually is a LEFT JOIN, assuming you've set up the model correctly.
Here's a slightly anonymized example that I just tested on one of my own databases:
class Program
{
static void Main(string[] args)
{
using (TestDataContext context = new TestDataContext())
{
DataLoadOptions dlo = new DataLoadOptions();
dlo.LoadWith<Place>(p => p.Address);
context.LoadOptions = dlo;
var places = context.Places.Where(p => p.ID >= 100 && p.ID <= 200);
foreach (var place in places)
{
Console.WriteLine(p.ID, p.AddressID);
}
}
}
}
This is just a simple test that prints out a list of places and their address IDs. Here is the query text that appears in the profiler:
SELECT [t0].[ID], [t0].[Name], [t0].[AddressID], ...
FROM [dbo].[Places] AS [t0]
LEFT OUTER JOIN (
SELECT 1 AS [test], [t1].[AddressID],
[t1].[StreetLine1], [t1].[StreetLine2],
[t1].[City], [t1].[Region], [t1].[Country], [t1].[PostalCode]
FROM [dbo].[Addresses] AS [t1]
) AS [t2] ON [t2].[AddressID] = [t0].[AddressID]
WHERE ([t0].[PlaceID] >= #p0) AND ([t0].[PlaceID] <= #p1)
This isn't exactly a very pretty query (your guess is as good as mine as to what that 1 as [test] is all about), but it's definitively a LEFT JOIN and doesn't exhibit the problem you seem to be having. And this is just using the generated classes, I haven't made any changes.
Note that I also tested this on a dual relationship (i.e. a single Place having two Address references, one nullable, one not), and I get the exact same results. The first (non-nullable) gets turned into an INNER JOIN, and the second gets turned into a LEFT JOIN.
It has to be something in your model, like changing the nullability of the second reference. I know you say it's configured as nullable, but maybe you need to double-check? If it's definitely nullable then I suggest you post your full schema and DBML so somebody can try to reproduce the behaviour that you're seeing.
If you make the secondSalesPersonId field in the database table nullable, LINQ-to-SQL should properly construct the Association object so that the resulting SQL statement will do the LEFT OUTER JOIN.
UPDATE:
Since the field is nullable, your problem may be in explicitly declaring dataLoadOptions.LoadWith<>(). I'm running a similar situation in my current project where I have an Order, but the order goes through multiple stages. Each stage corresponds to a separate table with data related to that stage. I simply retrieve the Order, and the appropriate data follows along, if it exists. I don't use the dataLoadOptions at all, and it does what I need it to do. For example, if the Order has a purchase order record, but no invoice record, Order.PurchaseOrder will contain the purchase order data and Order.Invoice will be null. My query looks something like this:
DC.Orders.Where(a => a.Order_ID == id).SingleOrDefault();
I try not to micromanage LINQ-to-SQL...it does 95% of what I need straight out of the box.
UPDATE 2:
I found this post that discusses the use of DefaultIfEmpty() in order to populated child entities with null if they don't exist. I tried it out with LINQPad on my database and converted that example to lambda syntax (since that's what I use):
ParentTable.GroupJoin
(
ChildTable,
p => p.ParentTable_ID,
c => c.ChildTable_ID,
(p, aggregate) => new { p = p, aggregate = aggregate }
)
.SelectMany (a => a.aggregate.DefaultIfEmpty (),
(a, c) => new
{
ParentTableEntity = a.p,
ChildTableEntity = c
}
)
From what I can figure out from this statement, the GroupJoin expression relates the parent and child tables, while the SelectMany expression aggregates the related child records. The key appears to be the use of the DefaultIfEmpty, which forces the inclusion of the parent entity record even if there are no related child records. (Thanks for compelling me to dig into this further...I think I may have found some useful stuff to help with a pretty huge report I've got on my pipeline...)
UPDATE 3:
If the goal is to keep it simple, then it looks like you're going to have to reference those salesperson fields directly in your Select() expression. The reason you're having to use LoadWith<>() in the first place is because the tables are not being referenced anywhere in your query statement, so the LINQ engine won't automatically pull that information in.
As an example, given this structure:
MailingList ListCompany
=========== ===========
List_ID (PK) ListCompany_ID (PK)
ListCompany_ID (FK) FullName (string)
I want to get the name of the company associated with a particular mailing list:
MailingLists.Where(a => a.List_ID == 2).Select(a => a.ListCompany.FullName)
If that association has NOT been made, meaning that the ListCompany_ID field in the MailingList table for that record is equal to null, this is the resulting SQL generated by the LINQ engine:
SELECT [t1].[FullName]
FROM [MailingLists] AS [t0]
LEFT OUTER JOIN [ListCompanies] AS [t1] ON [t1].[ListCompany_ID] = [t0].[ListCompany_ID]
WHERE [t0].[List_ID] = #p0

Resources