Solr Search and QueryParser - magento

I have a Magento installation with Apache Solr (in a Linux environment).
When I search with a keyword, it shows unrelated products.
Later I figured out that Solr is adding a query term and searching with that as well.
Here is an example.
Below is part of my Solr results XML:
<lst name="debug">
<str name="rawquerystring">bbb</str>
<str name="querystring">bbb</str>
<str name="parsedquery">text:PP text:bbb</str>
<str name="parsedquery_toString">text:PP text:bbb</str>
I searched with the keyword "bbb", but in the parsedquery Solr has added another query term, "PP".
So it is returning the products that have "pp" in the description.
How can I prevent this automatic generation of a query term?
I hope my issue is clear.

Most probably it's a dismax/edismax parser. The query passes through the analysis chain defined for the 'text' field type. Your 'PP' is somehow related to 'bbb', so the query is expanded. For example, it can be a stemmed variation or a synonym.
Check your schema.xml as D_K suggested.
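As a rough illustration of how an analysis chain can turn one term into two, here is a toy model in Python. It is not Solr code: the SYNONYMS table and the expand_query function are made up for this example, and the real cause in your setup may be a different filter (a stemmer, for instance).

```python
# Toy model of query-time synonym expansion; this is NOT Solr's implementation.
# The mapping is hypothetical and only mirrors the debug output in the question.
SYNONYMS = {"bbb": ["PP"]}


def expand_query(raw_query, field="text"):
    """Return a parsedquery-like string with synonyms added for each token."""
    clauses = []
    for token in raw_query.split():
        # Synonyms are emitted alongside the original token.
        for term in SYNONYMS.get(token, []) + [token]:
            clauses.append(f"{field}:{term}")
    return " ".join(clauses)


print(expand_query("bbb"))  # text:PP text:bbb
```

If something like this is happening, the field type for 'text' in schema.xml is where the filter and the file it reads would be listed.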

Related

ElasticSearch: is it possible to highlight words in the query rather than the results

We use ElasticSearch in a reverse manner from what I usually see. We store lots of small documents, usually 1 or 2 words, for example, Job Titles like "software engineering", "car mechanics", "architect", etc.
Then we query with a longer string, for example a 1000 word Job Spec. This way we get all Job Titles present in the text of the Job Spec.
It works well. But I was wondering whether I could get ElasticSearch to highlight the matching Job Titles in the Job Spec, i.e. highlight the results in the query. I have tried the highlight keyword, but it doesn't highlight the query text, it highlights the results. I'm not sure how to get the query to be returned in the ElasticSearch response, let alone whether it can be highlighted.
You might wonder why I need ElasticSearch to highlight the query. Can't I just pick out all the results from the text and highlight them myself? Yes, I can, but there are various things to think about that make it hard, such as stemming and stopword removal. For example, "jquery" is stemmed to "jqueri" when ElasticSearch tokenises it, so it's found as a result, but if I want to highlight it myself, I have to unstem it so it matches the original text. ElasticSearch also removes symbols, so "terms & conditions" would become "terms conditions", which is problematic if I want to highlight it manually, as I have to add back the "&" symbol. There are a hundred other problem cases, hence the question about whether ElasticSearch can do it for me.
I'm quite sure highlighting the query string isn't possible - only highlighting parts of documents in an index.
What you might try is indexing the query string itself in its own index, and then using the results of the first query as the query terms for a second query against the query string (in the second index). You could then have highlighting on the query string. You'll have to make an extra request to ES each time, but I think it'll get you what you want.
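A minimal sketch of the two request bodies, written as Python dicts. The field names "title" and "text", the document id and the helper names are assumptions for illustration; the job spec is assumed to have been indexed already as a document in a second index.

```python
import json


def title_search_body(job_spec):
    """First request: find the short job-title documents that occur in the long text."""
    return {"query": {"match": {"title": job_spec}}}


def highlight_body(spec_id, titles):
    """Second request: match the found titles against the indexed job spec."""
    return {
        "query": {
            "bool": {
                # Restrict the search to the one job-spec document.
                "filter": [{"ids": {"values": [spec_id]}}],
                "should": [{"match_phrase": {"text": t}} for t in titles],
                "minimum_should_match": 1,
            }
        },
        # number_of_fragments 0 asks for the whole field back, highlighted,
        # instead of short snippets.
        "highlight": {"fields": {"text": {"number_of_fragments": 0}}},
    }


body = highlight_body("spec-1", ["software engineering", "architect"])
print(json.dumps(body, indent=2))
```

Because the highlighter works on the analyzed text, stemming and symbol removal would be handled by ES rather than by your own code.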

How to query all fields individually with ElasticSearch

As I understand it, ElasticSearch searches on the magic _all field by default. The problem with this seems to be that if a field uses a different index analyzer, the analyzed data from this field is not searched.
I've had success with searching on the fields ['domain', '_all'] but I really need to avoid having to manually specify each field which was analyzed differently. I see fields supports wildcards but seemingly not '*' on its own. I could do a*, b*, c*, d* etc. but this seems a tad inefficient.
The special field "_all" is discontinued, and the copy_to function can be used instead, as per the official documentation. This approach lets you create a computed field (managed by Elasticsearch) into which data from other fields is copied, to mimic an _all search.
However, there is an alternative approach: multi_match with wildcard field names as part of the query. This works just like the earlier mechanism of searching the "_all" field.
{"multi_match":{"query":"java","fields":["*"]}}
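For comparison, here is a sketch of both options as Python dicts. The field names "domain", "body" and "all_text" are assumptions, and the mapping uses the typeless layout of recent Elasticsearch versions, so it may need adjusting for older ones.

```python
import json

# Option 1 (hypothetical mapping): copy every field of interest into one
# catch-all field and search that field instead of _all.
mapping = {
    "mappings": {
        "properties": {
            "domain": {"type": "keyword", "copy_to": "all_text"},
            "body": {"type": "text", "copy_to": "all_text"},
            "all_text": {"type": "text"},
        }
    }
}


def search_all_fields(term):
    """Option 2: a multi_match over a wildcard, so each field uses its own analyzer."""
    return {"query": {"multi_match": {"query": term, "fields": ["*"]}}}


print(json.dumps(mapping, indent=2))
print(json.dumps(search_all_fields("java")))
```

Note that with copy_to the copied values are analyzed with the analyzer of the target field, whereas the wildcard multi_match queries each field as it was indexed.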

Magento Solr: Increase weight of title in schema.xml?

I have a Magento installation with Solr search. This works fairly well, although it doesn't give relevant hits on a few queries. It seems that the "title" attribute is not weighted higher than other attributes such as "description". Is there any way to define in schema.xml that title should have a higher weight?
Or any other ideas on how to manipulate the search results of a specific product? E.g. adding another attribute with some meta keywords?
I'm new to Solr so any feedback is appreciated!
My schema.xml is located here.
What you're looking for is called 'boosting'.
You can do boosting at index time (more performant at query time) and at query time (more flexible).
More here:
How to boost fields in solr
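A minimal sketch of a query-time boost, assuming the edismax parser is available and that the fields are really called "title" and "description" in your schema (Magento's Solr integration may use different names). The boosted_query helper and the boost value 5 are made up for the example.

```python
from urllib.parse import urlencode


def boosted_query(keywords, title_boost=5):
    """Build a /select URL whose qf weights title matches above description matches."""
    params = {
        "q": keywords,
        "defType": "edismax",
        # qf lists the fields to search; ^N multiplies the weight of a field.
        "qf": f"title^{title_boost} description",
    }
    return "/select?" + urlencode(params)


print(boosted_query("red shoes"))
# /select?q=red+shoes&defType=edismax&qf=title%5E5+description
```

Since this is only a request parameter, you can try different boost values without reindexing.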

SOLR - search results without any default sort

I'm searching with the following query:
/select?q=*:*&fq=fld:dddd OR fld:aaaa OR fld:bbbb
where, the field fld is a String type and uniqueKey.
I'm getting results as:
<doc>
<str name="fld">aaaa</str>
</doc>
<doc>
<str name="fld">bbbb</str>
</doc>
<doc>
<str name="fld">dddd</str>
</doc>
It looks like the results are sorted. But I want the results to be "un-sorted", meaning I want the results in the order I gave in the fq condition. That is, I want the results as follows:
<doc>
<str name="fld">dddd</str>
</doc>
<doc>
<str name="fld">aaaa</str>
</doc>
<doc>
<str name="fld">bbbb</str>
</doc>
How do we do that? Thanks in advance!
If you add score to your fl, you will see that all of them have the same score value, so the score plays no part in the order, which is why you see aaaa, bbbb, dddd.
You can change the scoring or give boosts at query time, depending on the order you want, to get a similar effect. Other than that, I don't think it is possible without writing a plugin or hacking the Solr source.
You can also add a RandomSortField to your schema. Then sort the results randomly. See:
http://lucene.apache.org/solr/4_0_0/solr-core/org/apache/solr/schema/RandomSortField.html
EDIT: After re-reading the post, I realize that's not what you're looking for. You might try sorting using a function:
http://wiki.apache.org/solr/FunctionQuery
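One way to try the boosting idea is sketched below in Python. It moves the condition from fq into q, because filter queries do not contribute to the score, and it gives earlier values a larger boost. The ordered_query helper is hypothetical, and the approach assumes every value matches exactly one document (fld is the uniqueKey), so the unboosted scores would otherwise be equal.

```python
from urllib.parse import urlencode


def ordered_query(field, values):
    """Give earlier values a larger boost so they score, and so sort, higher."""
    n = len(values)
    clauses = [f"{field}:{v}^{n - i}" for i, v in enumerate(values)]
    return " OR ".join(clauses)


q = ordered_query("fld", ["dddd", "aaaa", "bbbb"])
print(q)  # fld:dddd^3 OR fld:aaaa^2 OR fld:bbbb^1

# Ask for the score as well, so the ordering can be checked in the response.
print("/select?" + urlencode({"q": q, "fl": "fld,score", "sort": "score desc"}))
```

If the scores still tie, reordering the documents in the client by their position in the original list is a simple fallback.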

In elasticsearch, is there a way to show which field in a document was the "hit"?

When searching some documents using elasticsearch, I'd like to see which field in the document was the "hit" that flagged it up as a search result. Is there a native way to do this, or do I need to do it in the search client?
E.g:
GET /events/_search?q=nottingham
gives me:
{//elided
{'hits'[
{'id':1,
'name': 'Some name',
'nicknames': ['Nottingham']
}]}}
It's obvious from this example that the nickname matched, but can I get elasticsearch to flag that for me?
Elasticsearch can find and highlight terms from your query in the result fields. See http://www.elasticsearch.org/guide/reference/api/search/highlighting.html for more information. Technically speaking, it's not the same as flagging fields that caused the "hit", but for most practical purposes, it's as useful.
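A minimal sketch of that idea in Python: request highlighting on all fields, then read the field names out of each hit's highlight section. The helper names and the sample response are made up for illustration; the response is hand-written to follow the usual hits/highlight layout, not captured from a real cluster.

```python
def highlight_request(text):
    """Search body that asks for highlights on any field that matched."""
    return {
        "query": {"query_string": {"query": text}},
        "highlight": {"fields": {"*": {}}},
    }


def matched_fields(response):
    """Map each hit's _id to the field names present in its highlight section."""
    return {
        hit["_id"]: sorted(hit.get("highlight", {}))
        for hit in response["hits"]["hits"]
    }


# Hand-written sample response (an assumption, for demonstration only).
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "_source": {"name": "Some name", "nicknames": ["Nottingham"]},
                "highlight": {"nicknames": ["<em>Nottingham</em>"]},
            }
        ]
    }
}

print(matched_fields(sample_response))  # {'1': ['nicknames']}
```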
