Elasticsearch term query returns all the data - elasticsearch

I am trying to query my Elasticsearch index. Here is my data; you can run it as-is in Sense. It creates the index fst and fills it with 4 items.
The search then returns the wrong number of results: I expect exactly one hit.
PUT fst/objects/ggg
{
"frameAttributes": {
"identities": [
{ "_id": "DSC00263", "_score": 0.655822},
{ "_id": "DSC00262", "_score": 0.59957 },
{ "_id": "DSC00244", "_score": 0.220819},
{ "_id": "DSC00300", "_score": 0.191191},
{"_id": "DSC00276", "_score": 0.124561}
]
}
}
PUT fst/objects/ffffff
{
"frameAttributes": {
"identities": [
{"_id": "DSC00222","_score": 0.191009},
{"_id": "DSC00261","_score": 0.146157},
{"_id": "DSC00329","_score": 0.14518},
{"_id": "DSC00225","_score": 0.12622},
{"_id": "DSC00295","_score": 0.12396}
]
}
}
PUT fst/objects/aaaa
{
"frameAttributes": {
"identities": [
{"_id": "DSC00229","_score": 0.223149},
{"_id": "DSC00240","_score": 0.178388},
{"_id": "DSC00228","_score": 0.173769},
{"_id": "DSC00257","_score": 0.166746},
{"_id": "DSC00226","_score": 0.153071}
]
}
}
PUT fst/objects/abcdef
{
"frameAttributes": {
"identities": [
{ "_id": "DSC00262","_score": 0.427957},
{"_id": "DSC00263","_score": 0.408772},
{"_id": "DSC00282","_score": 0.284546 },
{ "_id": "DSC00283","_score": 0.191374},
{"_id": "DSC00299", "_score": 0.165478}
]
}
}
My query should return only one result:
GET fst/_search
{
"query": {
"term": {
"frameAttributes.identities._id": {
"value": "DSC00229"
}
}
}
}

You need to map the field you are querying as not_analyzed. In your case, that field is _id. You can do this when creating the index, for example:
PUT /gb/_mapping/tweet
{
"properties" : {
"tag" : {
"type" : "string",
"index": "not_analyzed"
}
}
}
Check this link for reference: https://www.elastic.co/guide/en/elasticsearch/guide/current/_finding_exact_values.html

It took a while, but I learned something new: Elasticsearch doesn't like you creating fields called _id. I changed it to personId and now it just works.
Thanks all :)
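For anyone hitting the same thing, the rename can be done on the client side before indexing. Here is a minimal Python sketch, assuming the documents are plain dicts shaped like the ones in the question. The helper name `rename_reserved_keys` and the `RENAMES` table are hypothetical; only the target name `personId` comes from the fix above.

```python
import json

# Hypothetical rename table. Only "personId" comes from the thread; add other
# underscore-prefixed keys here if they cause the same trouble.
RENAMES = {"_id": "personId"}


def rename_reserved_keys(value):
    """Recursively copy a JSON-like structure, renaming keys listed in RENAMES."""
    if isinstance(value, dict):
        return {RENAMES.get(k, k): rename_reserved_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [rename_reserved_keys(item) for item in value]
    return value


# Shortened copy of the "aaaa" document from the question.
doc = {
    "frameAttributes": {
        "identities": [
            {"_id": "DSC00229", "_score": 0.223149},
            {"_id": "DSC00240", "_score": 0.178388},
        ]
    }
}

cleaned = rename_reserved_keys(doc)
print(json.dumps(cleaned, indent=2))
```

The function returns a new structure and leaves the input untouched, so the renamed copy can be indexed while the original is kept for comparison. The term query then targets frameAttributes.identities.personId instead of frameAttributes.identities._id.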

Related

Elastic Multimatch string with dash (or other symbol)

I am trying to match dashes (and other symbols) in my Elasticsearch query.
It is a fuzzy search on all fields using the default whitespace analyzer.
My query:
function_score: {
query: {
multi_match: {
query: string,
analyzer: "whitespace",
fuzziness: 1
}
}
}
However, this gives unexpected results with dash characters. For example, Central-Park doesn't work with this.
Dashes only work well when I use a phrase match and strip out the double quotes, but then there is no fuzziness.
Does anyone know how I can get fuzzy search to work normally with dashes, please?
Adding a working example with index mapping, index data, search query, and search result
Index Mapping:
{
"mappings": {
"properties": {
"place": {
"type": "text",
"analyzer":"whitespace"
}
}
}
}
Index Data:
{
"place": "Cwntral-Park"
}
{
"place": "Central-Park"
}
{
"place": "Central-Area"
}
Search Query:
{
"query": {
"bool": {
"should": {
"match": {
"place": {
"query": "Central-Park",
"fuzziness": 1
}
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "65605120",
"_type": "_doc",
"_id": "1",
"_score": 0.9808291,
"_source": {
"place": "Central-Park"
}
},
{
"_index": "65605120",
"_type": "_doc",
"_id": "3",
"_score": 0.8990934,
"_source": {
"place": "Cwntral-Park"
}
}
]
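To see why the mapping above returns those two documents, here is a rough Python sketch of the two steps involved: whitespace tokenization keeps the dash inside the token, and a fuzziness of 1 then allows one edit per token. This is an approximation under stated assumptions. It uses plain Levenshtein distance (Elasticsearch also treats a transposition as a single edit by default) and it ignores scoring entirely. The function names are made up for illustration.

```python
def whitespace_tokens(text):
    """Split on whitespace only, so a dash stays inside the token."""
    return text.split()


def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,          # delete a character from a
                current[j - 1] + 1,       # insert a character into a
                previous[j - 1] + cost,   # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def fuzzy_hits(query, places, max_edits=1):
    """Return places that have a token within max_edits of some query token."""
    query_tokens = whitespace_tokens(query)
    return [
        place for place in places
        if any(edit_distance(q, t) <= max_edits
               for q in query_tokens
               for t in whitespace_tokens(place))
    ]


places = ["Cwntral-Park", "Central-Park", "Central-Area"]
print(whitespace_tokens("Central-Park"))
print(fuzzy_hits("Central-Park", places))
```

Because "Central-Park" is a single token, "Cwntral-Park" is one substitution away and matches, while "Central-Area" is several edits away and does not.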

Elasticsearch associating exact match terms

I have a search index of filenames containing over 100,000 entries that share about 500 unique variations of the main filename field. I have recently made some modifications to certain filename values that are being generated from my data. I was wondering if there is a way to link certain queries to return an exact match. In the following query:
"query": {
"bool": {
"must": [
{
"match": {
"filename": "foo-bar"
}
}
]
}
}
How could I modify the index and associate the results so that the above query also matches foo-bar-baz, but not foo-bar-foo or any other variation?
Thanks in advance for your help.
You can use a term query instead of a match query. It is a good fit for a keyword field:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html
Adding a working example with index data and search query. (Using the default mapping)
Index Data:
{
"fileName": "foo-bar"
}
{
"fileName": "foo-bar-baz"
}
{
"fileName": "foo-bar-foo"
}
Search Query:
{
"query": {
"bool": {
"should": [
{
"match": {
"fileName.keyword": "foo-bar"
}
},
{
"match": {
"fileName.keyword": "foo-bar-baz"
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "1",
"_score": 0.9808291,
"_source": {
"fileName": "foo-bar"
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "2",
"_score": 0.9808291,
"_source": {
"fileName": "foo-bar-baz"
}
}
]
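The two should clauses above can also be generated from a lookup table, which may be easier to maintain when there are many associations. A minimal Python sketch follows, assuming the association between a search term and the stored file names lives in application code. The `ASSOCIATED` table and both function names are hypothetical. The body it builds uses a terms query on the keyword sub-field, which matches any of the listed exact values.

```python
import json


def exact_filenames_query(field, filenames):
    """Build a search body matching docs whose field equals any listed value."""
    return {"query": {"terms": {field: list(filenames)}}}


def exact_match(docs, key, filenames):
    """Local stand-in for the query: keep docs whose value is exactly in the set."""
    wanted = set(filenames)
    return [doc for doc in docs if doc.get(key) in wanted]


# Hypothetical association table: which stored names a search term should return.
ASSOCIATED = {"foo-bar": ["foo-bar", "foo-bar-baz"]}

docs = [
    {"fileName": "foo-bar"},
    {"fileName": "foo-bar-baz"},
    {"fileName": "foo-bar-foo"},
]

body = exact_filenames_query("fileName.keyword", ASSOCIATED["foo-bar"])
print(json.dumps(body))
print(exact_match(docs, "fileName", ASSOCIATED["foo-bar"]))
```

The local `exact_match` helper only previews which documents an exact comparison would keep. The printed body is what would be sent to the search endpoint.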

Elasticsearch: why is it not possible to get suggestions by criteria?

I want to get suggestions from some text for a specific user.
As I understand it, Elasticsearch provides suggestions based on the whole dictionary (inverted index), which contains all the terms in the index.
So if user1 posts some text, that text can be suggested to user2. Am I right?
Is it possible to add a filter by some criterion (by user, for example) to reduce the set of terms that can be suggested?
Yes, that is very much possible. Let me show you with an example that uses a query with filter context:
Index def
{
"mappings": {
"properties": {
"title": {
"type": "text" --> inverted index for storing suggestions on the title field
},
"userId" : {
"type" : "keyword" --> like in your example
}
}
}
}
Index sample doc
{
"title" : "foo baz",
"userId" : "katrin"
}
{
"title" : "foo bar",
"userId" : "opster"
}
Search query without userId filter
{
"query": {
"bool": {
"must": {
"match": {
"title": "foo"
}
}
}
}
}
Search results (both documents are returned)
"hits": [
{
"_index": "so_suggest",
"_type": "_doc",
"_id": "1",
"_score": 0.18232156,
"_source": {
"title": "foo bar",
"userId": "opster" --> note another user
}
},
{
"_index": "so_suggest",
"_type": "_doc",
"_id": "2",
"_score": 0.18232156,
"_source": {
"title": "foo baz",
"userId": "katrin" --> note user
}
}
]
Now let's narrow the suggestions by filtering to the docs created by user katrin
Search query
{
"query": {
"bool": {
"must": {
"match": {
"title": "foo"
}
},
"filter": { --> note filter on userId field
"term": {
"userId": "katrin"
}
}
}
}
}
Search result
"hits": [
{
"_index": "so_suggest",
"_type": "_doc",
"_id": "2",
"_score": 0.18232156,
"_source": {
"title": "foo baz",
"userId": "katrin"
}
}
]
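If the user id comes from application code, the filtered query above can be built programmatically. A minimal Python sketch, assuming the mapping from this answer (title as text, userId as keyword). The function names `user_scoped_query` and `local_preview` are made up for illustration, and `local_preview` is only a rough in-memory stand-in, not how Elasticsearch analyzes or scores text.

```python
import json


def user_scoped_query(text, user_id):
    """Build a bool query: full-text match on title, restricted to one user."""
    return {
        "query": {
            "bool": {
                "must": {"match": {"title": text}},
                # Filter context: narrows the result set without affecting the score.
                "filter": {"term": {"userId": user_id}},
            }
        }
    }


def local_preview(docs, text, user_id):
    """Rough in-memory stand-in: same user, and the word appears in the title."""
    return [
        doc for doc in docs
        if doc["userId"] == user_id and text.lower() in doc["title"].lower().split()
    ]


docs = [
    {"title": "foo baz", "userId": "katrin"},
    {"title": "foo bar", "userId": "opster"},
]

print(json.dumps(user_scoped_query("foo", "katrin"), indent=2))
print(local_preview(docs, "foo", "katrin"))
```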

find by query and push to array in Elastic search

I store data in Elasticsearch like this:
{
"_index": "my_index",
"_type": "doc",
"_id": "6lDquGEBFRQVe0x93eHk",
"_version": 1,
"_score": 1,
"_source": {
"ID_Number": "6947503728601",
"Userrname":"Jack.m07",
"name": "Jack",
"photos": ["img/one.png"]
}
}
I want to find a user by ID_Number and push a new value to photos, e.g.:
"photos": ["img/one.png","img/two.png"]
How can I implement this? What is the query?
I found the answer. This is the query:
POST my_index/_update_by_query?conflicts=proceed
{
"script": {
"inline": "ctx._source.photos.add(params.new_photos)",
"params": {
"new_photos": "img/two.png"
}
},
"query": {
"terms": {
"ID_Number": ["6947503728601"]
}
}
}
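To make the effect of that script easier to check before running it against a real index, here is a small Python model of what it does to each matching document. This is a local sketch only, not an Elasticsearch client call. The helper names are hypothetical, and the `setdefault` guard for a missing photos array is an addition that is not part of the script above.

```python
import copy


def push_photo(source, new_photo):
    """Return a copy of the document source with new_photo appended to photos."""
    updated = copy.deepcopy(source)
    # Assumption: create the array if it is missing (the script above does not).
    updated.setdefault("photos", []).append(new_photo)
    return updated


def update_by_id_number(docs, id_number, new_photo):
    """Apply push_photo to every doc whose ID_Number matches, leave the rest alone."""
    return [
        push_photo(doc, new_photo) if doc.get("ID_Number") == id_number else doc
        for doc in docs
    ]


docs = [
    {"ID_Number": "6947503728601", "name": "Jack", "photos": ["img/one.png"]},
    {"ID_Number": "0000000000000", "name": "Other", "photos": []},
]

result = update_by_id_number(docs, "6947503728601", "img/two.png")
print(result[0]["photos"])
print(result[1]["photos"])
```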

Is it possible to perform user count / cardinality with logical relationship in ElasticSearch?

I have documents of Users with the following format:
{
userId: "<userId>",
userAttributes: [
"<Attribute1>",
"<Attribute2>",
...
"<AttributeN>"
]
}
I want to get the number of unique users that satisfy a logical statement, for example: how many users have attribute1 AND attribute2 OR attribute3?
I've read about the cardinality aggregation, but it seems to work on a single field value and lacks the "AND" and "OR" logic I need.
Note that I have around 1,000,000,000 documents and I need the results as fast as possible, which is why I was looking at cardinality estimation.
What about this attempt, which treats userAttributes as a simple array of strings (analyzed in my case, but each one is a single lowercase term):
POST /users/user/_bulk
{"index":{"_id":1}}
{"userId":123,"userAttributes":["xxx","yyy","zzz"]}
{"index":{"_id":2}}
{"userId":234,"userAttributes":["xxx","yyy","aaa"]}
{"index":{"_id":3}}
{"userId":345,"userAttributes":["xxx","yyy","bbb"]}
{"index":{"_id":4}}
{"userId":456,"userAttributes":["xxx","ccc","zzz"]}
{"index":{"_id":5}}
{"userId":567,"userAttributes":["xxx","ddd","ooo"]}
GET /users/user/_search
{
"query": {
"query_string": {
"query": "userAttributes:(((xxx AND yyy) NOT zzz) OR ooo)"
}
},
"aggs": {
"unique_ids": {
"cardinality": {
"field": "userId"
}
}
}
}
which gives the following:
"hits": [
{
"_index": "users",
"_type": "user",
"_id": "2",
"_score": 0.16471066,
"_source": {
"userAttributes": [
"xxx",
"yyy",
"aaa"
]
}
},
{
"_index": "users",
"_type": "user",
"_id": "3",
"_score": 0.04318809,
"_source": {
"userAttributes": [
"xxx",
"yyy",
"bbb"
]
}
},
{
"_index": "users",
"_type": "user",
"_id": "5",
"_score": 0.021594046,
"_source": {
"userAttributes": [
"xxx",
"ddd",
"ooo"
]
}
}
]
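As a sanity check on small data, the same logic can be evaluated exactly in plain Python and compared against what the cardinality aggregation reports. This is only a sketch over the five sample documents above. The `matches` function hard-codes the example expression, and the exact set-based count is a swapped-in technique for verification (the cardinality aggregation itself returns an approximate count).

```python
# The five sample documents from the bulk request above.
docs = [
    {"userId": 123, "userAttributes": ["xxx", "yyy", "zzz"]},
    {"userId": 234, "userAttributes": ["xxx", "yyy", "aaa"]},
    {"userId": 345, "userAttributes": ["xxx", "yyy", "bbb"]},
    {"userId": 456, "userAttributes": ["xxx", "ccc", "zzz"]},
    {"userId": 567, "userAttributes": ["xxx", "ddd", "ooo"]},
]


def matches(attributes):
    """Evaluate ((xxx AND yyy) NOT zzz) OR ooo against one user's attributes."""
    attrs = set(attributes)
    return ("xxx" in attrs and "yyy" in attrs and "zzz" not in attrs) or "ooo" in attrs


# Exact distinct count: collect matching user ids in a set.
unique_users = {doc["userId"] for doc in docs if matches(doc["userAttributes"])}

print(sorted(unique_users))
print(len(unique_users))
```

On the sample data this selects the same three documents as the hits listed above (ids 2, 3 and 5).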

Resources