ElasticSearch - Multiple queries in one call (with per-query limits)

I have a problem with ElasticSearch and I need your help :)
Today I have an index containing my documents. These documents represent either Products or Categories.
The structure is this:
{
  "_index": "documents-XXXX",
  "_type": "_doc",
  "_id": "cat-31",
  "_score": 1.0,
  "_source": {
    "title": "Category A",
    "type": "category",
    "uniqId": "cat-31",
    [...]
  }
},
{
  "_index": "documents-XXXX",
  "_type": "_doc",
  "_id": "prod-1",
  "_score": 1.0,
  "_source": {
    "title": "Product 1",
    "type": "product",
    "uniqId": "prod-1",
    [...]
  }
},
What I'd like to do, in one call, is:
Have 5 documents whose type is "Product" and 2 documents whose type is "Category". Do you think it's possible?
That is, two queries in a single call with query-level limits.
Also, isn't it better to make two different indexes, one for the products, the other for the categories?
If so, I have the same question, how, in a single call, do both queries?
Thanks in advance

If product and category are different contexts, I would try to separate them into different indices. Is this type field used in all your queries to filter results? E.g. do you search for the term xpto only in docs with type product, or do you search without applying any filter?
About your other question: you can send two queries in a single request. The Multi search API can help with this.
You get two responses, one for each query.
GET my-index-000001/_msearch
{ }
{"query": { "term": { "type": { "value": "product" } }}}
{"index": "my-index-000001"}
{"query": { "term": { "type": { "value": "category" } }}}

Related

ElasticSearch - Phrase match on whole document? Not just one specific field

Is there a way I can use elastic match_phrase on an entire document? Not just one specific field.
We want the user to be able to enter a search term with quotes, and do a phrase match anywhere in the document.
{
  "size": 20,
  "from": 0,
  "query": {
    "match_phrase": {
      "my_column_name": "I want to search for this exact phrase"
    }
  }
}
Currently, I have only found phrase matching for specific fields. I must specify the fields to do the phrase matching within.
Our document has hundreds of fields, so I don't think it's feasible to manually enter the 600+ fields into every match_phrase query. The resultant JSON would be huge.
You can use a multi-match query with type phrase that runs a match_phrase query on each field and uses the _score from the best field. See phrase and phrase_prefix.
If no fields are provided, the multi_match query defaults to the index.query.default_field index setting, which in turn defaults to *. This extracts all fields in the mapping that are eligible for term queries and filters out the metadata fields. All extracted fields are then combined to build a query.
Adding a working example with index data, search query and search result
Index data:
{
  "name": "John",
  "cost": 55,
  "title": "Will Smith"
}
{
  "name": "Will Smith",
  "cost": 55,
  "title": "book"
}
Search Query:
{
  "query": {
    "multi_match": {
      "query": "Will Smith",
      "type": "phrase"
    }
  }
}
Search Result:
"hits": [
{
"_index": "64519840",
"_type": "_doc",
"_id": "1",
"_score": 1.2199391,
"_source": {
"name": "Will Smith",
"cost": 55,
"title": "book"
}
},
{
"_index": "64519840",
"_type": "_doc",
"_id": "2",
"_score": 1.2199391,
"_source": {
"name": "John",
"cost": 55,
"title": "Will Smith"
}
}
]
You can also pass * in the multi_match query's fields parameter, which will search all the available fields in the document. But it will slow your query down, since you are searching the whole document.
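A minimal sketch of that variant, reusing the example index above; passing "fields": ["*"] explicitly behaves like the default and runs the phrase match on every eligible field:

GET 64519840/_search
{
  "query": {
    "multi_match": {
      "query": "Will Smith",
      "type": "phrase",
      "fields": ["*"]
    }
  }
}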

Is there any way to match similar terms in Elastic Search

I have a big Elasticsearch document.
I am searching it with the below query:
{"size": 1000, "query": {"query_string": {"query": "( string1 )"}}}
Let's say my string1 = Product. If someone accidentally types prduct (forgetting the o), is there any way to search for that also?
{"size": 1000, "query": {"query_string": {"query": "( prdct )"}}} should also return the results for both prdct and product.
You can use a fuzzy query, which returns documents that contain terms similar to the search term. See the Elasticsearch fuzzy query documentation for a detailed explanation of fuzzy queries.
The fuzziness parameter can be set to 0, 1, 2, or AUTO. With AUTO, the allowed edit distance depends on the length of the term:
0..2 = must match exactly
3..5 = one edit allowed
More than 5 = two edits allowed
Since prdct is two edits away from product, you need an explicit fuzziness of at least 2 here.
Index Data:
{
  "title": "product"
}
{
  "title": "prdct"
}
Search Query:
{
  "query": {
    "fuzzy": {
      "title": {
        "value": "prdct",
        "fuzziness": 2,
        "transpositions": true,
        "boost": 5
      }
    }
  }
}
Search Result:
"hits": [
{
"_index": "my-index1",
"_type": "_doc",
"_id": "2",
"_score": 3.465736,
"_source": {
"title": "prdct"
}
},
{
"_index": "my-index1",
"_type": "_doc",
"_id": "1",
"_score": 2.0794415,
"_source": {
"title": "product"
}
}
]
There are many solutions to this problem:
Suggestions (did you mean X instead).
Fuzziness (edits from your original search term; see the sketch after this list).
Partial matching with autocomplete (if someone types "pr" and you provide the available search terms, they can click on the correct results right away) or n-grams (matching groups of letters).
All of those have tradeoffs in index / search overhead as well as the classic precision / recall problem.
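As a sketch of the fuzziness option against the title field from the example above: a match query with "fuzziness": "AUTO" applies the length-based rules, so prduct (6 letters, two edits allowed) still finds product, while the shorter prdct would need the explicit fuzziness of 2 shown earlier.

GET my-index1/_search
{
  "query": {
    "match": {
      "title": {
        "query": "prduct",
        "fuzziness": "AUTO"
      }
    }
  }
}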

Nested attribute term Query

I have documents something like below
{
  "_index": "lines",
  "_type": "lineitems",
  "_id": "4002_11",
  "_score": 2.6288738,
  "_source": {
    "data": {
      "type": "Shirt"
    }
  }
}
I want to get a count based on the type attribute value. Any suggestions?
I tried a term query but had no luck with that.
You should use the terms aggregation; it returns the number of documents aggregated in one bucket per distinct "type" field value.
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html
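A minimal sketch, assuming the default dynamic mapping created a data.type.keyword subfield:

GET lines/_search
{
  "size": 0,
  "aggs": {
    "types": {
      "terms": {
        "field": "data.type.keyword"
      }
    }
  }
}

The response's aggregations.types.buckets array then contains one doc_count per distinct type value.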

Elastic filter with dot (.) in name

I'm pretty new to ELK and seem to start with the complicated questions ;-)
I have elements that look like following:
{
  "_index": "asd01",
  "_type": "doc",
  "_id": "...",
  "_score": 0,
  "_source": {
    "#version": "1",
    "my-key": "hello.world.to.everyone",
    "#timestamp": "2018-02-05T13:45:00.000Z",
    "msg": "myval1"
  }
},
{
  "_index": "asd01",
  "_type": "doc",
  "_id": "...",
  "_score": 0,
  "_source": {
    "#version": "1",
    "my-key": "helloworld.from.someone",
    "#timestamp": "2018-02-05T13:44:59.000Z",
    "msg": "myval2"
  }
}
I want to filter for my-key values that start with "hello." and ignore elements that start with "helloworld.". The dot seems to be interpreted as a wildcard, and no kind of escaping seems to work.
I want this as a filter, so that I can use the same expression in Kibana as well as in the API directly.
Can someone point me to how to get this working with Elasticsearch 6.1.1?
It's not being used as a wildcard; it's being removed by the default analyzer (the standard analyzer), which splits the value into tokens at the dots. If you do not specify a mapping, Elasticsearch creates one for you. For string fields it creates a multi-field: the default is text (with the default standard analyzer) plus a keyword subfield holding the raw, unanalyzed value. If you do not want this behaviour, you must specify the mapping explicitly during index creation, or update it and reindex the data.
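If you do go that route, a minimal sketch of an explicit mapping that indexes my-key only as a keyword (the index name asd01-v2 is hypothetical, and 6.x still expects the type name, doc here, inside mappings):

PUT asd01-v2
{
  "mappings": {
    "doc": {
      "properties": {
        "my-key": {
          "type": "keyword"
        }
      }
    }
  }
}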
Try using this
GET asd01/_search
{
  "query": {
    "wildcard": {
      "my-key.keyword": {
        "value": "hello.*"
      }
    }
  }
}
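Since my-key.keyword holds the raw, unanalyzed value, a prefix query expresses the same "starts with hello." condition and is typically cheaper than a wildcard:

GET asd01/_search
{
  "query": {
    "prefix": {
      "my-key.keyword": {
        "value": "hello."
      }
    }
  }
}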

Directions on how to index words annotated with their type (entity, etc.) so that Elasticsearch/w.e. returns these words with their annotations?

I'm trying to build a very simple NLP chat (I could even say pseudo-NLP?), where I want to identify a fixed subset of intents (verbs, sentiments) and entities (products, etc.).
It's a kind of entity identification or named-entity recognition, but I'm not sure I need a full-fledged NER solution for what I want to achieve. I don't care if the person types cars instead of car: they HAVE to type the EXACT word. So no need to deal with language stuff here.
It doesn't need to identify and classify the words; I'm just looking for a way that, when I search a phrase, returns all results that contain each word of it.
I want to index something like:
want [type: intent]
buy [type: intent]
computer [type: entity]
car [type: entity]
Then the user will type:
I want to buy a car.
Then I send this phrase to ElasticSearch/Solr/w.e. and it should return me something like below (it doesn't have to be structured like that, but each word should come with its type):
[
  {"word": "want", "type": "intent"},
  {"word": "buy", "type": "intent"},
  {"word": "car", "type": "entity"}
]
The approach I came up with was indexing each word as:
{
  "word": "car",
  "type": "entity"
}
{
  "word": "buy",
  "type": "intent"
}
And then I provide the whole phrase, searching by "word". But I've had no success so far: ElasticSearch doesn't return any of the words, even though the phrase contains words that are indexed.
Any insights/ideas/tips to keep this using one of the main search engines?
If I do need to use a dedicated NER solution, what would be the approach to annotate words like this, without needing to worry about fixing typos, multiple languages, etc.? I want to return results only if the person types the intents and entities exactly as they are, so not an advanced NLP solution.
Curiously, I didn't find much about this on Google.
I created a basic index and indexed some documents like this
PUT nlpindex/mytype/1
{
"word": "buy",
"type": "intent"
}
I used a query_string query to search for all the words that appear in the phrase:
GET nlpindex/_search
{
  "query": {
    "query_string": {
      "query": "I want to buy a car",
      "default_field": "word"
    }
  }
}
By default the operator is OR, so it will search for every single word of the phrase in the word field.
These are the results I get:
"hits": [
{
"_index": "nlpindex",
"_type": "mytype",
"_id": "1",
"_score": 0.09427826,
"_source": {
"word": "car",
"type": "entity"
}
},
{
"_index": "nlpindex",
"_type": "mytype",
"_id": "4",
"_score": 0.09427826,
"_source": {
"word": "want",
"type": "intent"
}
},
{
"_index": "nlpindex",
"_type": "mytype",
"_id": "3",
"_score": 0.09427826,
"_source": {
"word": "buy",
"type": "intent"
}
}
]
Does this help?
