Query search string on Elasticsearch

I have a field that's defined as below.
"findings": {
"type": "string",
"fields": {
"orig": {
"type": "string"
},
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
},
The findings field contains the following text:
S91 - fiber cut
Now, when I do a 'term' search on 'findings.orig' for the word 'fiber', I get a search response, but when I do a 'query string' search on 'findings.orig' for 'fiber cut', I don't get any search response.
When I do a 'query string' search on '_all' for 'fiber cut', I do get the search response.
Why don't I get any response for 'fiber cut' with a 'query string' search on 'findings.orig'?

You can try this; hope it works:
GET index/type/_search
{
  "query": {
    "query_string": {
      "fields": ["findings.orig"],
      "query": "S91 - fiber cut"
    }
  }
}
If you want to search in nested fields, wrap it in a nested query, as sketched below.
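A minimal sketch of the nested variant, assuming (purely for illustration) that the field lives under a nested object called events; the path and field names here are hypothetical, not from the question:

GET index/type/_search
{
  "query": {
    "nested": {
      "path": "events",
      "query": {
        "query_string": {
          "fields": ["events.findings.orig"],
          "query": "S91 - fiber cut"
        }
      }
    }
  }
}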

Related

Elasticsearch query string (full text search) which can't be searched

We have the document below. It cannot be found by searching for financialmarkets, but it can be found by searching for industry_icon_financialmarkets.png. Can anyone tell me the reason?
content is a text type field.
Document:
{
  "title": "test",
  "content": "industry_icon_financialmarkets.png"
}
Query:
{
  "from": 0,
  "size": 2,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "\"industry_icon_financialmarkets.png\""
          }
        }
      ]
    }
  }
}
The default analyzer for a text field is standard, which won't break industry_icon_financialmarkets.png into tokens using _ as a delimiter. I would suggest using the simple analyzer instead, which breaks text into terms whenever it encounters a character that is not a letter.
You can also add a sub-field of type keyword to retain the original value.
So the mapping of the field should be:
{
  "content": {
    "type": "text",
    "analyzer": "simple",
    "fields": {
      "keyword": {
        "type": "keyword"
      }
    }
  }
}
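You can check the difference with the _analyze API; with the simple analyzer, the underscores and the dot become token boundaries:

POST _analyze
{
  "analyzer": "simple",
  "text": "industry_icon_financialmarkets.png"
}

This returns the tokens industry, icon, financialmarkets, and png, whereas the standard analyzer keeps the whole value as a single token, which is why only the full string was searchable.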
When creating an index, we should define our own mapping for each field based on its type to get the expected results.
Mapping
PUT relevance
{"mapping":{"ID":{"type":"long"},"title":
{"type":"keyword","analyzer":"my_analyzer"},
"content":
{"type":"string","analyzer":"my_analyzer","search_analyzer":"my_analyzer"}},
"settings":
{"analysis":
{"analyzer":
{"my_analyzer":
{"tokenizer":"my_tokenizer"}},
"tokenizer":
{"my_tokenizer":
{"type":"ngram","min_gram":3,"max_gram":30,"token_chars":
["letter","digit"]
}
}
},"number_of_shards":5,"number_of_replicas":2
}
}
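Note: on Elasticsearch 7.x and later, a min_gram of 3 with a max_gram of 30 exceeds the default index.max_ngram_diff of 1, so creating this index will fail unless you also add "index.max_ngram_diff": 27 to the settings (or narrow the gram range).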
Then start inserting documents,
POST relevance/_doc/1
{
  "title": "1elastic",
  "content": "working fine"
}
(Replace special characters with spaces in your own code before inserting into the ES index.)
Query
GET relevance/_search
{"size":20,"query":{"bool":{"must":[{"match":{"content":
{"query":"fine","fuzziness":1}}}]}}}

Getting results for multi_match cross_fields query in elasticsearch with custom analyzer

I have an Elasticsearch 5.3 server with products.
Each product has a 14-digit product code that has to be searchable by the following rules: the complete code should match, as well as a search term containing only the last 9, the last 6, the last 5, or the last 4 digits.
In order to achieve this I created a custom analyzer which creates the appropriate tokens at index time using the pattern_capture token filter. This seems to be working correctly; the _analyze API shows that the correct terms are created.
To fetch the documents from Elasticsearch I'm using a multi_match cross_fields bool query to search a number of fields simultaneously.
When the query string has one part that matches a product code and another part that matches any of the other fields, no results are returned; but when I search for each part separately, the appropriate results are returned. Also, when the query has multiple parts spanning any of the fields except the product code, the correct results are returned.
My mapping and analyzer:
PUT /store
{
  "mappings": {
    "products": {
      "properties": {
        "productCode": {
          "analyzer": "ProductCode",
          "search_analyzer": "standard",
          "type": "text"
        },
        "description": {
          "type": "text"
        },
        "remarks": {
          "type": "text"
        }
      }
    }
  },
  "settings": {
    "analysis": {
      "filter": {
        "ProductCodeNGram": {
          "type": "pattern_capture",
          "preserve_original": "true",
          "patterns": [
            "\\d{5}(\\d{9})",
            "\\d{8}(\\d{6})",
            "\\d{9}(\\d{5})",
            "\\d{10}(\\d{4})"
          ]
        }
      },
      "analyzer": {
        "ProductCode": {
          "filter": ["ProductCodeNGram"],
          "type": "custom",
          "preserve_original": "true",
          "tokenizer": "standard"
        }
      }
    }
  }
}
The query
GET /store/products/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "[query_string]",
            "fields": ["productCode", "description", "remarks"],
            "type": "cross_fields",
            "operator": "and"
          }
        }
      ]
    }
  }
}
Sample data
POST /store/products
{
  "productCode": "999999123456789",
  "description": "Foo bar",
  "remarks": "Foobar"
}
The following query strings all return one result:
"456789", "foo", "foobar", "foo foobar".
But the query_string "foo 456789" returns no results.
I am very curious as to why the last search does not return any results. I am convinced that it should.
The problem is that you are doing a cross_fields query over fields with different analyzers. cross_fields only works for fields that use the same analyzer; it in fact groups the fields by analyzer before doing the cross-field matching. You can find more information in this documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html#_literal_cross_field_literal_and_analysis
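One workaround in that spirit, sketched against the fields from the question: split the query yourself so that each clause only spans fields sharing an analyzer, then combine the clauses in a bool query.

GET /store/products/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "productCode": "[query_string]"
          }
        },
        {
          "multi_match": {
            "query": "[query_string]",
            "fields": ["description", "remarks"],
            "type": "cross_fields"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}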
Although cross_fields needs the same analyzer across the fields it operates on, I've had luck using the tie_breaker parameter to allow other fields (that use different analyzers) to be weighed for the total score.
This has the added benefit of allowing per-field boosting to be calculated in the final score, too.
Here's an example using your query:
GET /store/products/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "[query_string]",
            "fields": ["productCode", "description", "remarks"],
            "type": "cross_fields",
            "tie_breaker": 1
          }
        }
      ]
    }
  }
}
(You may need to tweak the tie_breaker value.)
I also removed the operator field, as I believe using the "AND" operator will cause fields that don't have the same analyzer to be scored inappropriately.

Elasticsearch match exact term

I have an Elasticsearch repo and an application that creates documents for what we call 'assets'. I need to prevent users from creating 'assets' with the same 'title'.
When the user tries to create an 'asset', I query the repo with the title, and if there is a match an error message is shown to the user.
My problem is that when I query the title I get multiple results (for similar matches).
This is my query so far:
GET assets-1/asset/_search
{
  "query": {
    "match": {
      "title": {
        "query": "test",
        "operator": "and"
      }
    }
  }
}
I have many records with title: 'test 1', 'test 2', 'test bla' and only one with the title 'test'.
But I am getting all of the above.
Is there any condition or property I can add to the query so that it matches the term exactly?
Your title field is probably analyzed and thus the test token will match any title containing that token.
In order to implement an exact match you need to have a not_analyzed field and do a term query on it.
You need to change the mapping of your title field to this:
curl -XPUT localhost:9200/assets-1/_mapping/asset -d '{
  "asset": {
    "properties": {
      "title": {
        "type": "string",
        "fields": {
          "raw": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}'
Then you need to reindex your data and you'll then be able to run an exact match query like this:
curl -XPOST localhost:9200/assets-1/asset/_search -d '{
  "query": {
    "term": {
      "title.raw": "test"
    }
  }
}'
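If you are on a version that has the _update_by_query API (2.3 and later), one way to populate the new title.raw sub-field without copying to a new index is to rewrite every document in place:

curl -XPOST 'localhost:9200/assets-1/_update_by_query?conflicts=proceed'

Each document is re-saved unchanged, which causes the new sub-field to be indexed.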

Elasticsearch UTF-8 characters in search

I have an indexed record:
"žiema"
Elasticsearch settings:
index:
    cmpCategory: { type: string, analyzer: like_analyzer }
Analyzer
analysis:
    char_filter:
        lt_characters:
            type: mapping
            mappings: ["ą=>a","Ą=>a","č=>c","Č=>c","ę=>e","Ę=>e","ė=>e","Ė=>e","į=>i","Į=>i","š=>s","Š=>s","ų=>u","Ų=>u","ū=>u","ž=>z","Ū=>u"]
    analyzer:
        like_analyzer:
            type: snowball
            tokenizer: standard
            filter: [lowercase, asciifolding]
            char_filter: [lt_characters]
What I want:
Searching for the keyword "žiema" should find the record "žiema", and searching for "ziema" should also find "žiema". How do I do that?
I tried replacing the characters and the asciifolding filter.
What am I doing wrong?
You can try indexing your field twice, as shown in the documentation.
PUT /my_index/_mapping/my_type
{
  "properties": {
    "cmpCategory": {
      "type": "string",
      "analyzer": "standard",
      "fields": {
        "folded": {
          "type": "string",
          "analyzer": "like_analyzer"
        }
      }
    }
  }
}
so the cmpCategory field is indexed as standard with diacritics, and the cmpCategory.folded field is indexed without diacritics.
Then, while searching, you search on both fields, as such:
GET /my_index/_search
{
  "query": {
    "multi_match": {
      "type": "most_fields",
      "query": "žiema",
      "fields": ["cmpCategory", "cmpCategory.folded"]
    }
  }
}
Also, I'm not sure if the char_filter is necessary since the asciifolding filter already does that transformation.
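On that last point, a quick request-body _analyze call (Elasticsearch 5+ syntax) shows that asciifolding alone already folds ž to z:

GET _analyze
{
  "tokenizer": "standard",
  "filter": ["lowercase", "asciifolding"],
  "text": "žiema"
}

This returns the single token ziema, so the char_filter mappings may indeed be redundant for these characters.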

How can I get a search term with a space to be one search term

I have an Elasticsearch index with a field called "name", mapped as follows:
"name": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
},
Now let's say I have a record "Brooklyn Technical High School".
I would like somebody searching for "brooklyn t*" to have that show up. For example: http://myserver/_search?q=name:brooklyn+t*
However, it seems to be tokenizing the search term and searching for both "brooklyn" and "t", because I get back results like "Ps 335 Granville T Woods".
I would like it to search the not_analyzed field with the whole term. Enclosing it in quotes doesn't seem to help either.
You need to use the term query.
The term query won't analyze/tokenize the string before applying the search.
{
  "query": {
    "term": {
      "user": "kimchy"
    }
  }
}
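For the question's actual use case, matching "brooklyn t*" as one term, a prefix query on the not_analyzed sub-field is a closer fit. A sketch using the name.raw field from the question (note that the prefix is case-sensitive, because the raw field is stored unanalyzed):

GET /_search
{
  "query": {
    "prefix": {
      "name.raw": "Brooklyn T"
    }
  }
}

Because name.raw is not analyzed, the whole prefix, space included, is compared against the stored value, so "Brooklyn T" matches "Brooklyn Technical High School" but not "Ps 335 Granville T Woods".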
