Accessing ancestor values in xpath with Solr DataImportHandler

If my xml is structured like so:
<fruit>
<apple appleId="apple_1">
<core coreId="core_1">
<seed>1</seed>
<seed>2</seed>
</core>
</apple>
<apple appleId="apple_2">
<core coreId="core_1">
<seed>1</seed>
</core>
</apple>
</fruit>
and I want the seeds to be the documents in my Solr schema, how can I access the appleId and coreId?
Here's the pertinent entity definition from my data-config.xml:
<entity name="apples"
processor="XPathEntityProcessor"
stream="true"
forEach="/fruit/apple/core/seed"
url="fruit.xml"
transformer="script:create_id"
>
<field column="seed_s" xpath="/fruit/apple/core/seed" />
<field column="apple_id_s" xpath="/fruit/apple/@appleId" />
</entity>
script:create_id creates a unique id for each seed.
In this example, apple_id_s is coming back as null.

I found the problem. I need to set commonField="true" on the ancestor fields and make forEach loop through each apple and core as well as each seed. I also need to set pk="seed_s", which triggers Solr to store the document.
Here's my new entity definition:
<entity name="apples"
processor="XPathEntityProcessor"
stream="true"
pk="seed_s"
forEach="/fruit/apple/core/seed | /fruit/apple | /fruit/apple/core"
url="fruit.xml"
transformer="script:create_id"
>
<field column="seed_s" xpath="/fruit/apple/core/seed" />
<field column="apple_id_s" xpath="/fruit/apple/@appleId" commonField="true"/>
<field column="core_id_s" xpath="/fruit/apple/core/@coreId" commonField="true"/>
</entity>
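The body of create_id is not shown in the thread. As a rough sketch only, a DataImportHandler script transformer that combines the ancestor values with the seed could look like the following. The function body and the target field name id are assumptions made for illustration, not the original author's code.

```xml
<dataConfig>
  <script><![CDATA[
    // Hypothetical body for create_id; the original function is not shown.
    function create_id(row) {
      var apple = row.get('apple_id_s');
      var core = row.get('core_id_s');
      var seed = row.get('seed_s');
      // Only build an id when all three parts are present.
      if (apple != null && core != null && seed != null) {
        // 'id' is an assumed unique-key field name.
        row.put('id', apple + '_' + core + '_' + seed);
      }
      return row;
    }
  ]]></script>
  <document>
    <!-- The apples entity goes here, with transformer="script:create_id" -->
  </document>
</dataConfig>
```

Seed values and core ids repeat across apples in the sample XML, which is why a unique id of this kind would need all three values.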

Related

Spring Integration Xpath Filter with Regex

I am using an Xpath Filter to filter out some incoming events.
This is my sample input xml. I need to filter by the value of fieldC, by allowing events that have a fieldC value of 1,2,3,6 or 7.
<?xml version="1.0" encoding="UTF-8"?>
<add>
<doc>
<field name="fieldA">453.97</field>
<field name="fieldB">278.25</field>
<field name="fieldC">3</field>
<field name="fieldD">Agent</field>
<field name="fieldE">Mobile Site</field>
<field name="fieldF">Cancel</field>
<field name="fieldG">2015-09-14T13:17:21.000Z</field>
</doc>
</add>
Xpath Tried:
/add/doc/field[@name='fieldC']/text()
/add/doc/field[@name='fieldC']
<int:chain input-channel="channelIn" output-channel="channelOut">
<int-xml:xpath-filter id="filterEvents" match-value="3" match-type="exact">
<int-xml:xpath-expression expression="/add/doc/field[@name='fieldC']/text()" />
</int-xml:xpath-filter>
</int:chain>
Filter by match-type 'exact' works, but I am not able to get the same working with regex.
Regex Tried: /^(1|2|3|6|7)$/
Any help would be appreciated.

Your regex has bad syntax; lose the enclosing / delimiters...
match-value="^(1|2|3|6|7)$" match-type="regex"
...works fine for me.
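Put together with the chain from the question, the suggestion would look roughly like this. It is a sketch that reuses the channel names and filter id from the question unchanged; only match-value and match-type differ from the 'exact' version.

```xml
<int:chain input-channel="channelIn" output-channel="channelOut">
    <!-- Intended to pass only messages whose fieldC text is 1, 2, 3, 6 or 7 -->
    <int-xml:xpath-filter id="filterEvents" match-value="^(1|2|3|6|7)$" match-type="regex">
        <int-xml:xpath-expression expression="/add/doc/field[@name='fieldC']/text()" />
    </int-xml:xpath-filter>
</int:chain>
```

The /.../ wrapper is regex-literal syntax from languages such as JavaScript and Perl. In a Java regular expression a / is an ordinary character, so a pattern written with them would look for literal slashes in the value.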

Solr DataImportHandler Cache Support for Multiple Values

I'm trying to use a cache for some entities in my DataImportHandler configuration. When the cache is enabled, I only get the first value of my multivalued field. My configuration looks like this:
<entity name="product" query="SELECT product_id FROM Product WHERE 1">
<entity name="strength" query="SELECT *
FROM Strength WHERE product_id = '${product.product_id}'">
<entity name="form" query="SELECT CONCAT(parent_route,'|',form_name) AS form_name, LOWER(CONCAT_WS('\n',form_name,parent_route)) AS form_name_s,
CAST(form_id AS CHAR(10)) AS form_id_string FROM Form WHERE form_id = '${strength.form_id}'"
transformer="RegexTransformer"
cacheImpl="SortedMapBackedCache" cacheLookup="strength.form_id" cacheKey="form_id_string">
<field column="form_name" name="form_name" />
<field column="form_name_s" splitBy="\n" />
</entity>
</entity>
</entity>
There should be two rows returned for the entity "form", but only the first one is visible if the cache is enabled. Does Solr not have the ability to cache multiple rows, or am I doing something wrong? My Solr version is 4.1.

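The problem is fixed when the WHERE clause of the cached query is removed. I'm not sure the following configuration is ideal, but as I understand it the aim of the cache is to reduce the number of queries: without the WHERE clause, the form query no longer depends on the parent row, and rows are looked up by cacheKey instead.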
<entity name="product" query="SELECT product_id FROM Product WHERE 1">
<entity name="strength" query="SELECT *
FROM Strength WHERE product_id = '${product.product_id}'">
<entity name="form" query="SELECT CONCAT(parent_route,'|',form_name) AS form_name, LOWER(CONCAT_WS('\n',form_name,parent_route)) AS form_name_s,
CAST(form_id AS CHAR(10)) AS form_id_string FROM Form"
transformer="RegexTransformer"
cacheImpl="SortedMapBackedCache" cacheLookup="strength.form_id" cacheKey="form_id_string">
<field column="form_name" name="form_name" />
<field column="form_name_s" splitBy="\n" />
</entity>
</entity>
</entity>

How to apply filter in search view, based on user - OpenERP 7.0?

I am trying to apply some filters in my tree view. All was going fine until I tried to apply filters based on user.id.
My XML code looks like this:
<record model="ir.ui.view" id="view_generic_request_search">
<field name="name">generic_request.search</field>
<field name="model">generic.request</field>
<field name="arch" type="xml">
<search string="Search Request">
<filter icon="terp-mail-message-new" string="My Requests" name="my_requests_filter" domain="[('requestor','=',user.id)]" />
<filter icon="terp-mail-message-new" string="Requests I'm responsible" name="request_im_responsible_filter" domain="[('responsible_name','=',user.id)]" />
<filter icon="terp-mail-message-new" string="Requests I own" name="requests_i_own_filter" domain="[('owner','=',user.id)]" />
<separator />
<filter icon="terp-mail-message-new" string="Denied Requests" name="denied_requests_filter" domain="[('state','=','denied')]"/>
<filter icon="terp-mail-message-new" string="Authorized Requests" name="authorized_requests_filter" domain="[('state','=','authorized')]"/>
<filter icon="terp-mail-message-new" string="Confirmed Requests" name="confirmed_requests_filter" domain="[('state','=','confirmed')]"/>
<separator/>
<group expand="0" string="Group By...">
<filter string="Requested by" domain="[]" context="{'group_by' : 'requestor'}" />
<filter string="Responsible person" domain="[]" context="{'group_by' : 'responsible_name'}" />
<filter string="Status" domain="[]" context="{'group_by': 'state'}"/>
</group>
</search>
</field>
</record>
All the filters and group-bys work fine, except the 3 based on user.id.
I get different JS errors in different browsers:
Chrome & IE
Uncaught TypeError: Cannot read property 'length' of undefined
http://myserveraddress:8069/web/webclient/js?db=may_9:3256
Firefox:
TypeError: results.group_by is undefined
http://myserveraddress:8069/web/webclient/js?db=may_9:3256
I tried adding context="{'group_by' : 'requestor'}", just in case, but I get the same error! Any idea what I'm missing here?
Thanks in advance.

I guess I'm losing my mind with OpenERP. I was formatting the filter domain badly: I should use uid instead of user.id. The filters should therefore look like <filter icon="terp-mail-message-new" string="My Requests" name="my_requests_filter" domain="[('requestor','=',uid)]" />
And, BTW, to make a filter the default on the tree view, add the following code to the action definition:
<record model="ir.actions.act_window" id="action_generic_request">
<field name="name">Generic Request</field>
<field name="res_model">generic.request</field>
<field name="view_type">form</field>
<field name="context">{"search_default_my_requests_filter":1}</field>
<field name="view_mode">tree,form</field>
</record>
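Applying the same uid change to all three user-based filters from the question would give something like the fragment below, which belongs inside the existing <search> element. It assumes that requestor, responsible_name and owner all reference users, which the question implies but does not state.

```xml
<!-- uid is evaluated to the current user's id when the domain is applied -->
<filter icon="terp-mail-message-new" string="My Requests" name="my_requests_filter" domain="[('requestor','=',uid)]" />
<filter icon="terp-mail-message-new" string="Requests I'm responsible" name="request_im_responsible_filter" domain="[('responsible_name','=',uid)]" />
<filter icon="terp-mail-message-new" string="Requests I own" name="requests_i_own_filter" domain="[('owner','=',uid)]" />
```

The search_default_ key in the action context is followed by the filter's name attribute, so search_default_my_requests_filter switches on the first of these filters by default.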

Can Solr join tables in-memory?

There is a table of n products, and a table of features of these products. Each product has many features. Given a Solr DataImportHandler configuration:
<document name="products">
<entity name="item" query="select id, name from item">
<field column="ID" name="id" />
<field column="NAME" name="name" />
<entity name="feature"
query="select feature_name, description from feature where item_id='${item.ID}'">
<field name="feature_name" column="feature_name" />
<field name="description" column="description" />
</entity>
</entity>
</document>
Solr will run n + 1 queries to fetch this data. 1 for the main query, n for the queries to fetch the features. This is inefficient for large numbers of items. Is it possible to configure Solr such that it will run these queries separately and join them in-memory instead? All rows from both tables will be fetched.

This can be done using CachedSqlEntityProcessor:
<document name="products">
<entity name="item" query="select id, name from item">
<field column="ID" name="id" />
<field column="NAME" name="name" />
<entity name="feature"
query="select item_id, feature_name, description from feature"
cacheKey="item_id"
cacheLookup="item.ID"
processor="CachedSqlEntityProcessor">
<field name="feature_name" column="feature_name" />
<field name="description" column="description" />
</entity>
</entity>
</document>
Since Solr's index is 'flat', feature_name and description are not connected in any way; each product will have multi-valued fields for each of these.
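The Solr 4.1 configuration in the cache thread above enables caching with cacheImpl on an ordinary entity instead of naming CachedSqlEntityProcessor. Written that way, the feature entity might look like the sketch below. This assumes a Solr version that supports cacheImpl, and it assumes the feature_name field is meant to map to the feature_name column.

```xml
<entity name="feature"
        query="select item_id, feature_name, description from feature"
        cacheImpl="SortedMapBackedCache"
        cacheKey="item_id"
        cacheLookup="item.ID">
    <!-- All feature rows are fetched by one query and looked up per item by item_id -->
    <field name="feature_name" column="feature_name" />
    <field name="description" column="description" />
</entity>
```

Either way the cache holds the whole feature table in memory, so this trades heap space for fewer round trips to the database.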

I am not sure whether Solr can do this, but the database can. Assuming that you are using MySQL, use JOIN and GROUP_CONCAT to turn this into a single query. The query should look something like this:
SELECT item.id, item.name, GROUP_CONCAT(feature.description) AS descriptions FROM item INNER JOIN feature ON (feature.item_id = item.id) GROUP BY item.id
Note that DESC is a reserved word in MySQL, so it cannot be used as an unquoted alias. Don't forget to use the RegexTransformer on descriptions to separate out the multiple values.
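A minimal sketch of how the single-query approach could be wired into the entity, using the same splitBy attribute that appears in the cache thread above. The alias descriptions is an assumption chosen here to steer clear of the reserved word DESC, and the lower-case column names assume MySQL returns them as written in the query.

```xml
<entity name="item" transformer="RegexTransformer"
        query="SELECT item.id, item.name, GROUP_CONCAT(feature.description) AS descriptions
               FROM item INNER JOIN feature ON (feature.item_id = item.id)
               GROUP BY item.id">
    <field column="id" name="id" />
    <field column="name" name="name" />
    <!-- GROUP_CONCAT joins values with a comma by default; splitBy turns them back into multiple values -->
    <field column="descriptions" name="description" splitBy="," />
</entity>
```

Two caveats apply to this design: a description that itself contains a comma would be split in the wrong place (GROUP_CONCAT accepts a SEPARATOR clause for choosing another delimiter), and MySQL truncates the concatenated result at group_concat_max_len, which defaults to 1024.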

Solr ClobTransformer

I have been stuck with ClobTransformer in Solr for the past 3 days. I want to convert an Oracle CLOB field to a text field in Solr. I am using multiple cores, and I started my config and schema files from scratch.
This is my config file:
<lib dir="../../../dist/" regex="apache-solr-dataimporthandler-.*\.jar" />
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</requestHandler>
These are the columns in my schema file for a core:
<field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
<field name="mandp" type="text_en_splitting" indexed="true" stored="true" multiValued="false" />
This is my data-config.xml for the core:
<dataConfig>
<dataSource type="JdbcDataSource"
driver="oracle.jdbc.driver.OracleDriver"
url="jdbc:oracle:thin:@***"
user="***"
password="****"/>
<document>
<entity name="wiki" transformer="ClobTransformer"
query="Select t.id as id, t.mandp From table1 t">
<field column="mandp" name="mandp" clob="true" />
</entity>
</document>
</dataConfig>
When I start Solr, I can see in the console that the dataimporthandler*.jar files have loaded successfully. When I run my data import from http://localhost:8983/solr/wiki/dataimport?command=full-import&clean=false, I don't see any errors in the console, nor do I see anything related to the transformer or the CLOB. Even if I type nonsense in the transformer attribute (transformer="bla bla bla"), no error is thrown, which could mean that my transformer attribute is completely ignored or that full logging is turned off.
When I query Solr, I see oracle.sql.CLOB@375c929a in the mandp field. Using the HTMLStripTransformer class has no effect either, of course. I want to use both on this field.
Any ideas are appreciated!!!

It looks like the ClobTransformer is not fired. I would personally change the mandp column name inside the query like this:
Select t.id as id, t.mandp as mandp From table1 t

Please add transformer="ClobTransformer, RegexTransformer" to the entity in your data-config.xml file.
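Since the question asks for both ClobTransformer and HTMLStripTransformer on the same field, a combined entity might look like the sketch below. It uses the aliased query from the first answer together with the clob and stripHTML field attributes; whether it resolves the original problem is not confirmed in the thread.

```xml
<entity name="wiki" transformer="ClobTransformer,HTMLStripTransformer"
        query="Select t.id as id, t.mandp as mandp From table1 t">
    <!-- Transformers run in the order listed: CLOB to string first, then HTML stripping -->
    <!-- Assumption to verify: Oracle may report the column as MANDP, in which case
         column="MANDP" may be needed for the transformers to find it -->
    <field column="mandp" name="mandp" clob="true" stripHTML="true" />
</entity>
```

The order matters because HTML stripping operates on text, so the CLOB has to be converted to a string before it runs.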
