Drupal 7 | Query through multiple node references - performance

Let me start with structure first:
[main_node]->field_reference_to_sub_node->[sub_node]->field_ref_to_sub_sub_node->[sub_sub_node]
[sub_sub_node]->field_type = ['wrong_type', 'right_type']
How to efficiently query all [sub_sub_node] ids with right_type, referenced by main_node (which is current opened node)?
Doing node_load on foreach seems a bit of overkill for this. Anybody has some better solutions? Greatly appreciated!

If you want to directly query the table of the fields:
$query = db_select('node', 'n')->fields('n_sub_subnode', array('nid'));
$query->innerJoin('table_for_field_reference_to_sub_node', 'subnode', "n.nid = subnode.entity_id AND subnode.entity_type='node'");
$query->innerJoin('node', 'n_subnode', 'subnode.subnode_target_id = n_subnode.nid');
$query->innerJoin('table_for_field_ref_to_sub_sub_node', 'sub_subnode', "n_subnode.nid = sub_subnode.entity_id AND sub_subnode.entity_type='node'");
$query->innerJoin('node', 'n_sub_subnode', 'sub_subnode.sub_subnode_target_id = n_sub_subnode.nid');
$query->innerJoin('table_for_field_type', 'field_type', "n_sub_subnode.nid = field_type.entity_id AND field_type.entity_type='node'");
$query->condition('n.nid', 'your_main_node_nid');
$query->condition('field_type.field_type_value', 'right_type');
Here is the explanation of each line:
$query = db_select('node', 'n')->fields('n_sub_subnode', array('nid'));
We start by querying the base node table, with the alias 'n'. This is the table used for the 'main_node'. The node ids which will be returned will be however from another alias (n_sub_subnode), you will see it below.
$query->innerJoin('table_for_field_reference_to_sub_node', 'subnode', "n.nid = subnode.entity_id AND subnode.entity_type='node'");
The first join is with the table of the field_reference_to_sub_node field, so you have to replace this with the actual name of the table. This is how we will get the references to the subnodes.
$query->innerJoin('node', 'n_subnode', 'subnode.subnode_target_id = n_subnode.nid');
A join back to the node table for the subnodes. You have to replace the 'subnode_target_id' with the actual field for the target id from the field_reference_to_sub_node table. The main purpose of this join is to make sure there are valid nodes in the subnode field.
$query->innerJoin('table_for_field_ref_to_sub_sub_node', 'sub_subnode', "n_subnode.nid = sub_subnode.entity_id AND sub_subnode.entity_type='node'");
The join to the table that contains references to the sub_sub_node, so you have to replace the 'table_for_field_ref_to_sub_sub_node' with the actual name of the table. This is how we get the references to the sub_sub_nodes.
$query->innerJoin('node', 'n_sub_subnode', 'sub_subnode.sub_subnode_target_id = n_sub_subnode.nid');
The join back to the node table for the sub_sub_nodes, to make sure we have valid references. You have to replace the 'sub_subnode_target_id' with the actual field for the target id from the 'field_ref_to_sub_sub_node' table.
$query->innerJoin('table_for_field_type', 'field_type', "n_sub_subnode.nid = field_type.entity_id AND field_type.entity_type='node'");
And we can now finally join the table with the field_type information. You have to replace the 'table_for_field_type' with the actual name of the table.
$query->condition('n.nid', 'your_main_node_nid');
You can put now a condition for the main node id if you want.
$query->condition('field_type.field_type_value', 'right_type');
And the condition for the field type. You have to replace the 'field_type_value' with the actual name of the table field for the value.
Of course, if you are really sure that you always have valid references, you can skip the joins to the node table and directly join the field tables using the target id and the entity_id fields (basically the target_id from on field table has to be the entity_id for the next one).
I really hope I do not have typos, so please check the queries carefully.

Related

For table cmdb_rel_ci, I want to retrieve unique parent.sys_class_name with count for "type=In Rack::Rack contains"

For table cmdb_rel_ci, I want to retrieve unique parent.sys_class_name with count for "type=In Rack::Rack contains". I am doing practice in out of the box instance.
At table level URL is as below:
URL
I want to retrieve result from above URL with my below script.
var count = new GlideAggregate('cmdb_rel_ci');
count.addQuery('type','e76b8c7b0a0a0aa70082c9f7c2f9dc64');// sys_id of type In Rack::Rack contains e76b8c7b0a0a0aa70082c9f7c2f9dc64
count.addAggregate('COUNT', 'parent.sys_class_name');
count.query();
while(count.next()){
var parentClassName = count.parent.sys_class_name.toString();
var parentClassNameCount = count.getAggregate('COUNT','parent.sys_class_name');
gs.log(parentClassName + " : " + parentClassNameCount );
}
The issue is I am getting parentClassName empty.
Try this instead:
var parentClassName = count.getValue("parent.sys_class_name")
Since it's a GlideAggregate query (instead of GlideRecord), the query being issued isn't returning all of the fields on the target table. With GlideRecord, dot-walking through a reference field (e.g. parent.sys_class_name) automatically resolves that referenced record to provide access to its field values. This is made possible by the fact that the driving/original query brought back the value of the parent field. This is not happening with GlideAggregate. The query in this case basically looks like:
SELECT cmdb1.`sys_class_name` AS `parent_sys_class_name`, count(*)
FROM (cmdb_rel_ci cmdb_rel_ci0 LEFT JOIN cmdb cmdb1 ON cmdb_rel_ci0.`parent` = cmdb1.`sys_id` )
WHERE cmdb_rel_ci0.`type` = 'e76b8c7b0a0a0aa70082c9f7c2f9dc64'
GROUP BY cmdb1.`sys_class_name`
ORDER BY cmdb1.`sys_class_name`
So, you actually have access specifically to that dot-walked sys_class_name that's being grouped, but not through the dot-walk. The call to getValue("parent.sys_class_name") is expectedly resolved to the returned column aliased as parent_sys_class_name.
That being said, what you're doing probably should also work, based on user expectations, so you've not done anything incorrect here.

.Where in LinQ not working correctly

I have Documents table and Signs table. Document record can be related with many records in Signs table.
Now, I want to get all records of Documents table when document ID appears in Signs table.
Here I get all documents:
var documents = (from c in context.documents select c);
Here I get all my signs and save into List:
var myDocuments = (from s in context.signs where s.UserId== id select s.ID).ToList();
This list contains collection on document ID.
And here, I'm trying to get all documents that exists in myDocuments list:
documents.Where(item => myDocuments.Contains(item.ID));
But, when I do .ToList() allways return all records (in database only exists one compatible record)
What is wrong in LinQ statement?
The problem is that this statement doesn't modify the contents of documents, it merely returns the results (which you're not doing anything with):
documents.Where(item => myDocuments.Contains(item.ID));
documents is still the full list.
Change this line to something like:
var matchingIDDocs = documents.Where(item => myDocuments.Contains(item.ID));
And then use matchingIDDocs in place of "documents" later in your code.

joining custom tables using magento commands

I have been trying to join two custom table using magento's commands. After searching i came across this block of generic code
$collection = Mage::getModel('module/model_name')->getCollection();
$collection->getSelect()->join( array('table_alias'=>$this->getTable('module/table_name')),
'main_table.foreign_id = table_alias.primary_key',
array('table_alias.*'),
'schema_name_if_different');
Following this as template I have tried to join my tables together but have only returned errors such as incorrect table name or table doesn't exist or some other error.
Just to clear things up can someone please correct me on my understanding
$collection = Mage::getModel('module/model_name')->getCollection();
Gets an instance of your model. Within that model is the table that holds the required data (for this example I shall call the table p)
$collection->getSelect()
Select data from table p
->join()
Requires three parameters to join two table together
PARAM1
array('table_alias'=>$this->getTable('module/table_name'))
'the alised name you give the table' => 'the table you want to add to the collection (this has been set up in the model folder)'
PARAM2
'main_table.foreign_id = table_alias.primary_key'
This bit i don't get (it seems straight forward though)
my main table (p) doesn't have a foreign id (it has it's primary key - is that also its foreign id)?
has to be equal to the alised name you gave in param1
PARAM3
'main_table.foreign_id = table_alias.primary_key'
get all from alised name
Where have I gone wrong on my understanding?
Please have a look in below sql join statement, I am using it in my project and it is working perfectly.
Syntax
$collection = Mage::getModel('module/model_name')->getCollection();
$collection->getSelect()->join(Mage::getConfig()->getTablePrefix().'table_name_for_join', 'main_table.your_table_field ='.Mage::getConfig()->getTablePrefix().'table_name_for_join.join_table_field', array('field_name_you_want_to_fetch_from_db'));
Working Query Example
$collection = Mage::getModel('module/model_name')->getCollection();
$collection->getSelect()->join(Mage::getConfig()->getTablePrefix().'catalog_product_entity_varchar', 'main_table.products_id ='.Mage::getConfig()->getTablePrefix().'catalog_product_entity_varchar.entity_id', array('value'));
Hope this will work for you !!

Linq Contains issue: cannot formulate the equivalent of 'WHERE IN' query

In the table ReservationWorkerPeriods there are records of all workers that are planned to work on a given period on any possible machine.
The additional table WorkerOnMachineOnConstructionSite contains columns workerId, MachineId and ConstructionSiteId.
From the table ReservationWorkerPeriods I would like to retrieve just workers who work on selected machine.
In order to retrieve just relevant records from WorkerOnMachineOnConstructionSite table I have written the following code:
var relevantWorkerOnMachineOnConstructionSite = (from cswm in currentConstructionSiteSchedule.ContrustionSiteWorkerOnMachine
where cswm.MachineId == machineId
select cswm).ToList();
workerOnMachineOnConstructionSite = relevantWorkerOnMachineOnConstructionSite as List<ContrustionSiteWorkerOnMachine>;
These records are also used in the application so I don't want to bypass the above code even if is possible to directly retrieve just workerPeriods for workers who work on selected machine. Anyway I haven't figured out how it is possible to retrieve the relevant workerPeriods once we know which userIDs are relevant.
I have tried the following code:
var userIDs = from w in workerOnMachineOnConstructionSite select new {w.WorkerId};
List<ReservationWorkerPeriods> workerPeriods = currentConstructionSiteSchedule.ReservationWorkerPeriods.ToList();
allocatedWorkers = workerPeriods.Where(wp => userIDs.Contains(wp.WorkerId));
but it seems to be incorrect and don't know how to fix it. Does anyone know what is the problem and how it is possible to retrieve just records which contain userIDs from the list?
Currently, you are constructing an anonymous object on the fly, with one property. You'll want to grab the id directly with (note the missing curly braces):
var userIDs = from w in workerOnMachineOnConstructionSite select w.WorkerId;
Also, in such cases, don't call ToList on it - the variable userIDs just contains the query, not the result. If you use that variable in a further query, the provider can translate it to a single sql query.

Can I force the auto-generated Linq-to-SQL classes to use an OUTER JOIN?

Let's say I have an Order table which has a FirstSalesPersonId field and a SecondSalesPersonId field. Both of these are foreign keys that reference the SalesPerson table. For any given order, either one or two salespersons may be credited with the order. In other words, FirstSalesPersonId can never be NULL, but SecondSalesPersonId can be NULL.
When I drop my Order and SalesPerson tables onto the "Linq to SQL Classes" design surface, the class builder spots the two FK relationships from the Order table to the SalesPerson table, and so the generated Order class has a SalesPerson field and a SalesPerson1 field (which I can rename to SalesPerson1 and SalesPerson2 to avoid confusion).
Because I always want to have the salesperson data available whenever I process an order, I am using DataLoadOptions.LoadWith to specify that the two salesperson fields are populated when the order instance is populated, as follows:
dataLoadOptions.LoadWith<Order>(o => o.SalesPerson1);
dataLoadOptions.LoadWith<Order>(o => o.SalesPerson2);
The problem I'm having is that Linq to SQL is using something like the following SQL to load an order:
SELECT ...
FROM Order O
INNER JOIN SalesPerson SP1 ON SP1.salesPersonId = O.firstSalesPersonId
INNER JOIN SalesPerson SP2 ON SP2.salesPersonId = O.secondSalesPersonId
This would make sense if there were always two salesperson records, but because there is sometimes no second salesperson (secondSalesPersonId is NULL), the INNER JOIN causes the query to return no records in that case.
What I effectively want here is to change the second INNER JOIN into a LEFT OUTER JOIN. Is there a way to do that through the UI for the class generator? If not, how else can I achieve this?
(Note that because I'm using the generated classes almost exclusively, I'd rather not have something tacked on the side for this one case if I can avoid it).
Edit: per my comment reply, the SecondSalesPersonId field is nullable (in the DB, and in the generated classes).
The default behaviour actually is a LEFT JOIN, assuming you've set up the model correctly.
Here's a slightly anonymized example that I just tested on one of my own databases:
class Program
{
static void Main(string[] args)
{
using (TestDataContext context = new TestDataContext())
{
DataLoadOptions dlo = new DataLoadOptions();
dlo.LoadWith<Place>(p => p.Address);
context.LoadOptions = dlo;
var places = context.Places.Where(p => p.ID >= 100 && p.ID <= 200);
foreach (var place in places)
{
Console.WriteLine(p.ID, p.AddressID);
}
}
}
}
This is just a simple test that prints out a list of places and their address IDs. Here is the query text that appears in the profiler:
SELECT [t0].[ID], [t0].[Name], [t0].[AddressID], ...
FROM [dbo].[Places] AS [t0]
LEFT OUTER JOIN (
SELECT 1 AS [test], [t1].[AddressID],
[t1].[StreetLine1], [t1].[StreetLine2],
[t1].[City], [t1].[Region], [t1].[Country], [t1].[PostalCode]
FROM [dbo].[Addresses] AS [t1]
) AS [t2] ON [t2].[AddressID] = [t0].[AddressID]
WHERE ([t0].[PlaceID] >= #p0) AND ([t0].[PlaceID] <= #p1)
This isn't exactly a very pretty query (your guess is as good as mine as to what that 1 as [test] is all about), but it's definitively a LEFT JOIN and doesn't exhibit the problem you seem to be having. And this is just using the generated classes, I haven't made any changes.
Note that I also tested this on a dual relationship (i.e. a single Place having two Address references, one nullable, one not), and I get the exact same results. The first (non-nullable) gets turned into an INNER JOIN, and the second gets turned into a LEFT JOIN.
It has to be something in your model, like changing the nullability of the second reference. I know you say it's configured as nullable, but maybe you need to double-check? If it's definitely nullable then I suggest you post your full schema and DBML so somebody can try to reproduce the behaviour that you're seeing.
If you make the secondSalesPersonId field in the database table nullable, LINQ-to-SQL should properly construct the Association object so that the resulting SQL statement will do the LEFT OUTER JOIN.
UPDATE:
Since the field is nullable, your problem may be in explicitly declaring dataLoadOptions.LoadWith<>(). I'm running a similar situation in my current project where I have an Order, but the order goes through multiple stages. Each stage corresponds to a separate table with data related to that stage. I simply retrieve the Order, and the appropriate data follows along, if it exists. I don't use the dataLoadOptions at all, and it does what I need it to do. For example, if the Order has a purchase order record, but no invoice record, Order.PurchaseOrder will contain the purchase order data and Order.Invoice will be null. My query looks something like this:
DC.Orders.Where(a => a.Order_ID == id).SingleOrDefault();
I try not to micromanage LINQ-to-SQL...it does 95% of what I need straight out of the box.
UPDATE 2:
I found this post that discusses the use of DefaultIfEmpty() in order to populated child entities with null if they don't exist. I tried it out with LINQPad on my database and converted that example to lambda syntax (since that's what I use):
ParentTable.GroupJoin
(
ChildTable,
p => p.ParentTable_ID,
c => c.ChildTable_ID,
(p, aggregate) => new { p = p, aggregate = aggregate }
)
.SelectMany (a => a.aggregate.DefaultIfEmpty (),
(a, c) => new
{
ParentTableEntity = a.p,
ChildTableEntity = c
}
)
From what I can figure out from this statement, the GroupJoin expression relates the parent and child tables, while the SelectMany expression aggregates the related child records. The key appears to be the use of the DefaultIfEmpty, which forces the inclusion of the parent entity record even if there are no related child records. (Thanks for compelling me to dig into this further...I think I may have found some useful stuff to help with a pretty huge report I've got on my pipeline...)
UPDATE 3:
If the goal is to keep it simple, then it looks like you're going to have to reference those salesperson fields directly in your Select() expression. The reason you're having to use LoadWith<>() in the first place is because the tables are not being referenced anywhere in your query statement, so the LINQ engine won't automatically pull that information in.
As an example, given this structure:
MailingList ListCompany
=========== ===========
List_ID (PK) ListCompany_ID (PK)
ListCompany_ID (FK) FullName (string)
I want to get the name of the company associated with a particular mailing list:
MailingLists.Where(a => a.List_ID == 2).Select(a => a.ListCompany.FullName)
If that association has NOT been made, meaning that the ListCompany_ID field in the MailingList table for that record is equal to null, this is the resulting SQL generated by the LINQ engine:
SELECT [t1].[FullName]
FROM [MailingLists] AS [t0]
LEFT OUTER JOIN [ListCompanies] AS [t1] ON [t1].[ListCompany_ID] = [t0].[ListCompany_ID]
WHERE [t0].[List_ID] = #p0

Resources