How to enable solr document cache between 2 independent entities? - caching

I am trying to index an application that has both metadata and attachments. I am using DIH (DataImportHandler) to build the Solr documents. In my DIH config XML I have defined two separate entities: the first fetches the metadata for all my active records, and the second fetches the attachments (zero to many) for those records.
Both entities have a "SEQUENCE ID" in common. Is there a way I can cache the SEQUENCE IDs from my first entity and reuse them in the second entity?
I'd appreciate any help.
Thanks!
<entity name="metadata" query="SELECT sequence_id,field1,field2 from table1 where active_indicator='Y'">
<field column="sequence_id" name="seq_id"/>
<field column="field1" name="field_1"/>
<field column="field2" name="field_2" />
</entity>
<entity name="attachments" query="select t2.sequence_id,t2.field3 from table1 t1,table2 t2 where t1.sequence_id=t2.sequence_id and t1.sequence_id IN (SELECT sequence_id from table1 where active_indicator='Y')">
<!-- If I can get the t1.sequence_id values cached, I can use them to select attachments for those sequence IDs alone -->
<field column="sequence_id" name="seq_id"/>
<field column="field3" name="field_3"/>
</entity>
</document>

You can wrap your dependent entities inside the entity it depends on, which will give you access to the column values for the rows in the parent entity. There's an example in the community wiki with multiple entities within each other.
<entity name="record" query="SELECT * FROM records">
<entity name="attachment"
query="SELECT field_1, field_2 FROM attachments WHERE SEQUENCE_ID='${record.SEQUENCE_ID}'">
<field name="attachment_val" column="field_1" />
</entity>
<field ... />
</entity>
This is the basic DIH configuration and should work. To rewrite this to use the cache implementation (if that's what you're asking for), configure the inner entity to select all rows, including the key column, and use the cache for lookups:
<entity name="attachment"
query="SELECT SEQUENCE_ID, field_1, field_2 FROM attachments"
cacheImpl="SortedMapBackedCache" cacheKey="SEQUENCE_ID" cacheLookup="record.SEQUENCE_ID">
....
</entity>
The functionality should otherwise be the same.
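To make the difference concrete, here is a minimal Python sketch of the idea behind the cached lookup: the child rows are read once, grouped by the cache key, and each parent row then does an in-memory lookup instead of issuing its own query. This is only an illustration of the concept, not Solr's implementation; the sample rows and the helper names `build_cache` and `join_with_cache` are made up for the example.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the two SQL result sets.
records = [
    {"SEQUENCE_ID": 1, "title": "first"},
    {"SEQUENCE_ID": 2, "title": "second"},
    {"SEQUENCE_ID": 3, "title": "third"},
]
attachments = [
    {"SEQUENCE_ID": 1, "field_1": "a.pdf"},
    {"SEQUENCE_ID": 1, "field_1": "b.pdf"},
    {"SEQUENCE_ID": 3, "field_1": "c.pdf"},
]


def build_cache(rows, cache_key):
    """Group child rows by the cache key, reading the child rows only once."""
    cache = defaultdict(list)
    for row in rows:
        cache[row[cache_key]].append(row)
    return cache


def join_with_cache(parents, cache, lookup_column):
    """Attach cached child values to each parent row without further queries."""
    docs = []
    for parent in parents:
        children = cache.get(parent[lookup_column], [])
        docs.append({**parent, "attachment_val": [c["field_1"] for c in children]})
    return docs


cache = build_cache(attachments, "SEQUENCE_ID")
docs = join_with_cache(records, cache, "SEQUENCE_ID")
for doc in docs:
    print(doc)
```

The trade-off is memory for round trips: the nested-query form runs one child query per parent row, while the cached form holds every child row in memory at once.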

Related

Solr DataImportHandler cache with multiple keys

I'm trying to index data from a database using the Solr DataImportHandler. The database contains products defined by a class and with multiple features with the following relevant tables:
product table contains a PK product_id and a FK class_id
product_feature table contains a PK (product_id & feature_id)
class_feature table contains a PK (class_id & feature_id)
For each product I need to index multiple product_feature and for each of those I need to index multiple class_feature. I also need to cache sub-entities for improved performance. The entity-definitions in the config look like this:
<entity name="product" query="select product_id, class_id, ... from product">
...
<entity name="product_feature" query="select product_id, feature_id, ... from product_feature"
cacheImpl="SortedMapBackedCache" cacheKey="product_id" cacheLookup="product.product_id">
...
<entity name="class_feature" query="select class_id, feature_id, ... from class"
cacheImpl="SortedMapBackedCache" where="class_id=product.class_id AND feature_id=product_feature.feature_id">
...
</entity>
</entity>
</entity>
Notice in the innermost entity class_feature, I'm defining a where attribute matching two FKs, one on the outermost product and another on the direct parent product_feature. This doesn't seem to work. How must I define an entity cache to match on multiple keys?
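What the where attribute above is trying to express is a lookup on a pair of values. As a language-neutral illustration (plain Python, not DIH configuration, and not a claim about which attributes DIH accepts), a cache keyed on the tuple (class_id, feature_id) would behave roughly as follows. The sample rows and the helper `build_composite_cache` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows standing in for the class_feature result set.
class_features = [
    {"class_id": 10, "feature_id": 1, "label": "color"},
    {"class_id": 10, "feature_id": 2, "label": "size"},
    {"class_id": 20, "feature_id": 1, "label": "weight"},
]


def build_composite_cache(rows, key_columns):
    """Group rows under a tuple built from several key columns."""
    cache = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        cache[key].append(row)
    return cache


cache = build_composite_cache(class_features, ("class_id", "feature_id"))

# One key part comes from the outer entity, the other from the direct parent.
product = {"product_id": 100, "class_id": 10}
product_feature = {"product_id": 100, "feature_id": 2}
matches = cache.get((product["class_id"], product_feature["feature_id"]), [])
print([m["label"] for m in matches])
```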

How do I order a fetchXml result set with linked entities first by a value of the linked entity, then by a value of the linking entity?

I am trying to formulate a fetchXml query that returns a set of entities in the same order as would the following SQL statement:
select
t1.col_1,
t2.col_2
from
tab_one t1 join
tab_two t2 on t1.id = t2.id
order by
t2.col_2,
t1.col_1
With the following approach, the result set is ordered by t1.col_1.
<fetch mapping="logical" version="1.0">
<entity name="tab_one">
<attribute name="col_1" />
<link-entity name="tab_two" from="id" to="id" alias="t2" link-type="inner">
<attribute name="col_2" />
<order attribute="col_2" />
</link-entity>
<order attribute="col_1" />
</entity>
</fetch>
Where, or in what order, do I have to put the <order attribute ...> tags so that the result set is ordered first by t2.col_2 and then by t1.col_1?

Different types of null: how to deal with them in MyBatis?

I am doing some web development based on Freemarker + Spring + MyBatis + Oracle. I am now facing a simple-looking but hard problem: how to deal with different types of 'NULL'.
For example, a table stores information about many students, one row per student, including the student's name, class number, and teacher (we assume each student has only one teacher).
NULL 1: we want to find the students of class three, so our query condition becomes {class_number='3'}. With this condition, our query becomes:
select * from student_table where class_number='3'
The MyBatis config file can look like this:
<select id="selectStudents" parameterType="map" resultType="map">
select * from student_table a where 1=1
<if test="class_number!= null">
and a.CLASS_NUMBER= #{class_number,jdbcType=VARCHAR}
</if>
<if test="teacher_name!= null">
and a.TEACHER_NAME= #{teacher_name,jdbcType=VARCHAR}
</if>
</select>
Yes, this is what we want.
NULL 2: for historical reasons, some rows have a null teacher_name, which means that the student's teacher is unknown. We want to find the students whose teacher is unknown. Obviously, the XML file above cannot meet this demand. We should modify it like this:
<select id="selectStudents" parameterType="map" resultType="map">
select * from student_table a where 1=1
<choose>
<when test="class_number != null and class_number != ''">
and a.CLASS_NUMBER= #{class_number,jdbcType=VARCHAR}
</when>
<otherwise>
and a.CLASS_NUMBER is null
</otherwise>
</choose>
<choose>
<when test="teacher_name != null and teacher_name != ''">
and a.TEACHER_NAME= #{teacher_name,jdbcType=VARCHAR}
</when>
<otherwise>
and a.TEACHER_NAME is null
</otherwise>
</choose>
</select>
If we have both query requirements in a single system, what should we do?

xquery return Boolean if node exists at a certain location or not

I have XML similar to the following in a SQL 2008 database, stored in an XML field. I would like to return a true or false indication if a node exists in a specific section of the XML.
<root>
<node attribute1='value1' attribute2='value2'>
<sub1 name='ID' value="1" />
<sub2 name='project' value="abc" />
<sub3 name='Lead' value="John" />
</node>
<entry attribute1='value1' attribute2='value2'>
<message>start</message>
</entry>
<entry attribute1='value1' attribute2='value2'>
<attribute name='project' value='done'>
</entry>
<node attribute1='value1'>
<sub1 name='ID' value="2" />
<sub2 name='project' value="abc" />
<sub3 name='Lead' value="John" />
</node>
<entry attribute1='value1' attribute2='value2'>
<message>start</message>
</entry>
<node attribute1='value1'>
<sub1 name='ID' value="3" />
<sub2 name='project' value="abc" />
<sub3 name='Lead' value="John" />
</node>
<entry attribute1='value1' attribute2='value2'>
<message>start</message>
</entry>
<node attribute1='value1'>
<sub1 name='ID' value="4" />
<sub2 name='project' value="abc" />
<sub3 name='Lead' value="John" />
</node>
<entry attribute1='value1' attribute2='value2'>
<message>start</message>
</entry>
<entry attribute1='value1' attribute2='value2'>
<attribute name='project' value='done'>
</entry>
</root>
As you'll notice, the <attribute> node may or may not occur after a node with 'ID'. In this example, you can see it in the first and fourth "sections" for lack of a better term.
With the following table structure:
ID (PK)
EventID (FK)
RawXML (XML)
Created (datetime)
Here is an extract of the SQL/xQuery that I have so far:
WITH XMLNAMESPACES(
'http://www.w3.org/2001/XMLSchema-instance' as xsi,
),
t1 as(
SELECT distinct
x.EventId
, c.value ('(//node/sub[@name=''ID'']/@value)[1]', 'nvarchar(max)') as ID
, c.value ('(//node/sub[@name=''ID''][1][descendant::attribute/@name=''project''])[1]', 'nvarchar(max)' ) as Exists
FROM
Table1 x
CROSS APPLY
RawXML.nodes('./.') as t(c)
)
select distinct
t1.ID
, t1.Exists
from t1
I will be running the script 4 or more times (incrementing all of the singleton values before each run)
For the XML given, I need to end up with the following results after running the query 4 times:
(the values of the IDs will not be known, so I can't use them in the query)
ID Exists
---- -------
1 true
2 false
3 false
4 true
With the SQL given, I didn't get any errors but it's taking forever (well over 45 minutes) and I haven't even let it finish yet. It really shouldn't take this long to parse the XML.
UPDATE: I limited my query to make sure it was only parsing one row (one XML file) and it finished in 57 seconds. However, I got a result of '0' for ID 1 and ID 2 when I should have had a '1' for ID 1.
And I'm sure most of you are aware that following-sibling, etc isn't supported by SQL Server so unfortunately that's not an option.
Just for reference, I've used this successfully to find the two instances of 'Project', but it ignores where in the XML they occur:
c.value ('(//node[descendant::attribute/@name=''Project''])[1]', 'nvarchar(max)' ) as TrueFalse
So basically, I need to know if the node with name='Project' exists after a node with name='ID' BUT before the next instance of a node with name='ID'
You have some errors in your XML and, judging by the query you use, I also changed the sub nodes.
You can enumerate your ID and project nodes using row_number() and then check whether the "next row" is a project node or an ID row using regular SQL instead of XQuery.
-- Temp table to hold the extracted values from the XML
create table #C
(
rn int primary key,
ID int
);
-- Get the enumerated rows with ID.
-- project nodes will have NULL in ID
insert into #C
select row_number() over(order by T.N) as rn,
T.N.value('sub[@name = "ID"][1]/@value', 'int') as ID
from table1
cross apply RawXML.nodes('/root/*[sub/@name = "ID" or attribute/@name = "project"]') as T(N)
-- Get the ID's and check if the next row is a project node
select C1.ID,
case when exists (
select *
from #C as C2
where C1.rn + 1 = C2.rn and
C2.ID is null
)
then 1
else 0
end as [Exists]
from #C as C1
where C1.ID is not null;
drop table #C;
You can do it without a temp table using a CTE instead but I suspect that the temp table version will be faster.
with C as
(
select row_number() over(order by T.N) as rn,
T.N.value('sub[@name = "ID"][1]/@value', 'int') as ID
from table1
cross apply RawXML.nodes('/root/*[sub/@name = "ID" or attribute/@name = "project"]') as T(N)
)
select C1.ID,
case when exists (
select *
from C as C2
where C1.rn + 1 = C2.rn and
C2.ID is null
)
then 1
else 0
end as [Exists]
from C as C1
where C1.ID is not null;
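The enumerate-then-check-the-next-row technique is independent of SQL Server. As a rough sketch of the same logic in Python (standard library only, using a trimmed and well-formed stand-in for the sample XML with the sub nodes renamed to sub), it could look like this. The helper names are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for the sample XML (sub nodes renamed to <sub>).
RAW_XML = """
<root>
  <node><sub name="ID" value="1"/><sub name="project" value="abc"/></node>
  <entry><message>start</message></entry>
  <entry><attribute name="project" value="done"/></entry>
  <node><sub name="ID" value="2"/></node>
  <entry><message>start</message></entry>
  <node><sub name="ID" value="3"/></node>
  <node><sub name="ID" value="4"/></node>
  <entry><attribute name="project" value="done"/></entry>
</root>
"""


def id_of(element):
    """Return the ID carried by a <sub name="ID"> child, or None."""
    for sub in element.findall("sub"):
        if sub.get("name") == "ID":
            return int(sub.get("value"))
    return None


def is_project_marker(element):
    """True if the element has an <attribute name="project"> child."""
    return any(a.get("name") == "project" for a in element.findall("attribute"))


def project_follows_id(xml_text):
    """Pair each ID with whether the next relevant element is a project marker."""
    root = ET.fromstring(xml_text)
    # Keep only ID elements and project markers, in document order.
    # Project markers are stored as None, like the NULL IDs in the temp table.
    rows = []
    for child in root:
        ident = id_of(child)
        if ident is not None or is_project_marker(child):
            rows.append(ident)
    result = []
    for position, ident in enumerate(rows):
        if ident is None:
            continue
        next_is_project = position + 1 < len(rows) and rows[position + 1] is None
        result.append((ident, next_is_project))
    return result


results = project_follows_id(RAW_XML)
print(results)
```

The list index plays the role of rn from row_number(), and the `rows[position + 1] is None` test corresponds to the `C1.rn + 1 = C2.rn and C2.ID is null` condition.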

How can I accomplish 2 outer joins in fetchXML between same 2 entities?

I'm trying to establish 2 left outer joins in fetchXML. Can I accomplish this sql statement...
select
a.new_campaignid
, a.new_ContactId
, b.new_campaigncontactstatusId
from
new_ContactCampaignNN AS a
left outer join new_campaigncontactstatus AS b ON a.new_contactid = b.new_ContactId
AND a.new_campaignid = b.new_CampaignId
into a fetchXML statement such as this?
<fetch mapping='logical' distinct='true'>
<entity name='new_contactcampaignnn'>
<attribute name='new_campaignid' />
<attribute name='new_contactid' />
<filter type='and'>
<condition attribute ='new_campaignid' operator='eq' value='72C9284B-905D-E111-9847-002655325864'/>
</filter>
<link-entity name='new_campaigncontactstatus' from='new_contactid' to='new_contactid' visible='true' link-type='outer' alias='new_contactcampaignnn_new_campaigncontactstatus'>
<attribute name='new_campaigncontactstatusid' />
<link-entity name='new_contactcampaignnn' from='new_campaignid' to='new_campaignid' visible='true' link-type='outer' alias='new_contactcampaignnn_new_campaigncontactstatus1'></link-entity>
</link-entity>
</entity>
</fetch>
I think LINQtoCRM can do this, and you can probably also poke around with LinqPad.
