Configure dismax request handler to boost a field

I want to apply a boost when searching. If a query term occurs in both description and name, documents that have the term in the description field should rank higher in the search results. For this I configured the dismax request handler as follows:
<requestHandler name="dismax" class="solr.DisMaxRequestHandler" default="true" >
<lst name="defaults">
<str name="echoParams">explicit</str>
<float name="tie">0.01</float>
<str name="qf">
text^0.5 name^1.0 description^1.5
</str>
<str name="fl">
UID_PK,name,price,description
</str>
<str name="mm">
2<-1 5<-2 6<90%
</str>
<int name="ps">100</int>
<str name="q.alt">*:*</str>
<str name="f.name.hl.fragsize">0</str>
<str name="f.name.hl.alternateField">name</str>
<str name="f.text.hl.fragmenter">regex</str> <!-- defined below -->
</lst>
</requestHandler>
But I do not see any effect on my search results. Do I need any further configuration to see the effect?

Related

Xquery get node with specific child element

I am using XQuery 1.0 and have the following problem.
My input message:
<Body>
<album>
<contents>
<content>correct</content>
<content>hardcore</content>
</contents>
</album>
<album>
<contents>
<content>incorrect</content>
<content>punk</content>
</contents>
</album>
<album>
<contents>
<content>incorrect</content>
<content>rock</content>
</contents>
</album>
</Body>
Desired result:
I would like to search for the 'Album' node that contains the child element <content>correct</content> and when the node has been found I would like to pick/use the element <content>hardcore</content>. Note that the order of the album nodes is subject to change. So a first() or [1] will not be sufficient.
What I tried:
if (body/album/contents/content[text()='correct']) then ???

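If I understand you correctly, you probably don't need a full XQuery expression for that. A plain XPath expression such as
//contents/content[.="correct"]/following-sibling::content
should be enough. It selects every content element that follows the content element whose value is "correct", whichever album that is in.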
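
For readers who want to try the idea outside an XQuery engine, here is a minimal Python sketch. It assumes the standard library's xml.etree.ElementTree, which does not support the following-sibling axis, so the sibling step is emulated with a loop. The function name following_contents and the SAMPLE constant are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Sample input copied from the question (whitespace compacted).
SAMPLE = """<Body>
  <album><contents><content>incorrect</content><content>punk</content></contents></album>
  <album><contents><content>correct</content><content>hardcore</content></contents></album>
  <album><contents><content>incorrect</content><content>rock</content></contents></album>
</Body>"""


def following_contents(xml_text, marker="correct"):
    """Return the text of each <content> that follows a <content> equal to marker."""
    root = ET.fromstring(xml_text)
    results = []
    # The predicate [content='...'] keeps only <contents> elements that
    # have a <content> child whose text equals the marker.
    for contents in root.findall(f".//contents[content='{marker}']"):
        seen_marker = False
        for child in contents.findall("content"):
            if seen_marker:
                # Emulates the following-sibling::content step.
                results.append(child.text)
            elif child.text == marker:
                seen_marker = True
    return results


print(following_contents(SAMPLE))
```

Because the match is made on the child value rather than on position, the result does not depend on the order of the album nodes.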

Query Xpath match first string instead of contain text()

I have an XML file like this:
<Doc> A0B100 </Doc>
<Doc> A0B101 </Doc>
<Doc> B1A100 </Doc>
<Doc> B1A101 </Doc>
I use an XPath query to select the values of nodes that contain "B1".
My code:
$txtSearch = "B1";
$titles = $xpath->query("Doc[contains(text(),\"$txtSearch\")]");
It returned all 4 values:
A0B100
A0B101
B1A100
B1A101
But I only want to match values that begin with the search string, so the result I expect is:
B1A100
B1A101
How can I do that?

Use this XPath:
Doc[starts-with(normalize-space(text()),\"$txtSearch\")]
normalize-space() is added to trim the spaces around the values in your sample XML.
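
The same prefix test can be sketched in Python to check the expected result. This is only an illustration: the standard library's xml.etree.ElementTree has no starts-with() function, so the comparison is done in Python, and the Docs root element, the normalize_space helper and the docs_starting_with function are assumptions added so the sample is self-contained.

```python
import xml.etree.ElementTree as ET

# The <Doc> elements from the question, wrapped in a hypothetical <Docs>
# root so that the sample parses as a single document.
SAMPLE = (
    "<Docs>"
    "<Doc> A0B100 </Doc><Doc> A0B101 </Doc>"
    "<Doc> B1A100 </Doc><Doc> B1A101 </Doc>"
    "</Docs>"
)


def normalize_space(text):
    """Approximate XPath normalize-space(): trim and collapse whitespace."""
    return " ".join((text or "").split())


def docs_starting_with(xml_text, prefix):
    """Return normalized <Doc> values that begin with prefix."""
    root = ET.fromstring(xml_text)
    values = (normalize_space(doc.text) for doc in root.findall("Doc"))
    return [value for value in values if value.startswith(prefix)]


print(docs_starting_with(SAMPLE, "B1"))
```

Without the normalization step the leading space in each value would make the prefix test fail for every node, which is why the XPath above wraps text() in normalize-space().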

Apache Solr only returning results if wildcard character (*) is present (with Magento)

I have Solr set up with Magento Enterprise Edition 1.9 and for the most part it works well. However, there are certain terms (e.g. "banana") which return no results even though product names in my catalog contain the word "banana".
However, as soon as I search for "banana*", with a wildcard, it returns results as expected.
I have used Magento's default schema for Solr so I don't have experience in tweaking Solr's schema file, so any advice would be appreciated.
Edit: here is a link to both my schema and config files: https://gist.github.com/anonymous/8d7a7106eb4e594d5adc
Edit 2: exploring my index using Luke I noticed that when I changed my default field from "fulltext" to "fulltext1_en" or "name_en", my normal query "banana" worked as expected. When I made this change in my schema, the search is working as expected. This leads me to more questions, however: I'm not sure how "fulltext" relates to "fulltext1_en". Why does "fulltext" not work but "fulltext1_en" does? Doesn't "fulltext" exist since it's in the Magento schema? And how was I getting any search results at all if the "fulltext" field simply didn't exist in my schema?

Without seeing your schema: a standard keyword search in Solr returns results only for the exact word, not for partial matches. As you said, adding a wildcard works much like a regular expression, but some of the results will be rubbish.
One workaround is to add a spellcheck component:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<lst name="spellchecker">
<str name="classname">solr.IndexBasedSpellChecker</str>
<str name="spellcheckIndexDir">./spellchecker</str>
<str name="field">textsuggest</str>
<str name="comparatorClass">freq</str>
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
Then add the component to your request handler:
<lst name="defaults">
<str name="spellcheck.dictionary">default</str>
<str name="spellcheck">true</str>
<str name="spellcheck.count">3</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.extendedResults">true</str>
<str name="spellcheck.collate">true</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
Or you can create a new field type and field, similar to those used for autocomplete, like this:
<!-- autocomplete_edge : Will match from the left of the field, e.g. if the document field
is "A brown fox" and the query is "A bro", it will match, but not "brown"
-->
<fieldType name="autocomplete_edge" class="solr.TextField">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:_-])" replacement=" " replace="all"/>
<filter class="solr.EdgeNGramFilterFactory" maxGramSize="30" minGramSize="1"/>
<filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d\*æøåÆØÅ ])" replacement="" replace="all"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:_-])" replacement=" " replace="all"/>
<filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d\*æøåÆØÅ ])" replacement="" replace="all"/>
<filter class="solr.PatternReplaceFilterFactory" pattern="^(.{30})(.*)?" replacement="$1" replace="all"/>
</analyzer>
</fieldType>
<field name="textnge" type="autocomplete_edge" indexed="true" stored="false" />

solr dismax lower boost for empty values

I have Solr documents that look like this:
<doc>
<float name="score">1.7004467</float>
<str name="name">Love</str>
<str name="id">15801637</str>
<int name="itemCount">3</int>
<date name="last_modified">2012-08-10T11:04:28Z</date>
<str name="emailaddress"/>
</doc>
<doc>
<str name="name">Love</str>
<str name="id">158015757</str>
<int name="itemCount">3</int>
<date name="last_modified">2012-08-10T11:04:28Z</date>
<str name="emailaddress">xxx#yy.com</str>
</doc>
I want to write a query that matches documents by name but boosts records that have an emailaddress so they appear at the top, with records lacking an emailaddress toward the bottom.
I don't want to sort by email address. I would prefer to use dismax (I am presenting a simplified problem here).

Check, for example, how to boost the score, or, if you are using the DisMax parser, the Boost Query (bq) parameter.
The range query emailaddress:[* TO *] matches documents whose emailaddress field has a value.
For your condition you can try bq=emailaddress:[* TO *]^2.0
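
As a rough sketch of how such a request could be assembled, the following Python builds a query string with the bq parameter URL-encoded. The base URL, the qf and fl values and the function name build_boosted_query are assumptions for illustration; only defType, q, qf, bq and fl themselves are standard Solr request parameters.

```python
from urllib.parse import parse_qs, urlencode


def build_boosted_query(base_url, term):
    """Build a dismax request URL that boosts documents having an emailaddress."""
    params = {
        "defType": "dismax",                # use the DisMax query parser
        "q": term,                          # the user's search term
        "qf": "name",                       # assumed: match on the name field only
        "bq": "emailaddress:[* TO *]^2.0",  # boost query from the answer above
        "fl": "id,name,emailaddress,score", # assumed field list, to inspect scores
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical host and handler path; adjust to your installation.
url = build_boosted_query("http://localhost:8983/solr/select", "Love")
print(url)
```

Including score in fl makes it easy to check whether the boost actually changes the ordering of the two sample documents.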

Solr search/faceting results have strange behaviour: i only get "stemmed" strings (hope it's correct definition)

Sorry for the bad title, but I didn't know how to describe my problem.
I'm using sunburnt (a Python interface) to query Solr from my Django app.
When I search, everything is fine: I get the full string.
On the other hand, when I facet (say, on the "job_title" field), I get only the stemmed words,
like this:
<lst name="job_title">
<int name="manag">17095</int>
<int name="sale">7689</int>
<int name="engin">6995</int>
<int name="consult">4907</int>
<int name="account">4710</int>
<int name="develop">4509</int>
<int name="senior">4366</int>
and so on...
This is my text fieldType definition:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>
I think the PorterStemFilter is the one causing this, but I need it to enable suggestions. Any help?

This is why you usually facet on unanalyzed fields. Add another field with StrField type, use a copyField directive to get the data there, and facet on this new string field.
