Elasticsearch: sorting on a field with a wildcard in the key - elasticsearch

I have a data structure like the following, returned from a query. I want to sort on the date inside the object values.
{
users: {
"1234": {
name: "User 1",
joining_date: "2022-12-28T11:37:00.000Z"
},
"3456": {
name: "User 2",
joining_date: "2022-12-18T11:37:00.000Z"
}
}
}
This is my query so far.
GET /_search
{
"sort" : [ {
"users.*.joining_date": {
"order": "desc",
"format": "date",
"unmapped_type": "long"
} }
],
"query": {
"query_string": {
"query": "_schema:users"
}
}
}
The problem is with using a wildcard in the key. I have tried multiple combinations from the documentation but nothing worked so far. I will be grateful for any help.
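One possible workaround, assuming the result set is small enough to sort in the application rather than in Elasticsearch, is to compute a single sort key per document, for example the latest joining_date among the keyed users. The sketch below is plain Python and does not call Elasticsearch; the helper names, the doc_id field, and the third sample user are hypothetical.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"  # matches e.g. 2022-12-28T11:37:00.000Z


def latest_joining_date(doc):
    """Return the most recent joining_date among the keyed users, or None."""
    dates = [
        datetime.strptime(user["joining_date"], DATE_FORMAT)
        for user in doc.get("users", {}).values()
        if "joining_date" in user
    ]
    return max(dates) if dates else None


def sort_newest_first(docs):
    """Sort documents by their latest joining_date; undated documents go last."""
    return sorted(
        docs,
        key=lambda doc: latest_joining_date(doc) or datetime.min,
        reverse=True,
    )


# Hypothetical documents; "doc_id" is added only to make the order visible.
docs = [
    {"doc_id": "a", "users": {
        "1234": {"name": "User 1", "joining_date": "2022-12-28T11:37:00.000Z"},
        "3456": {"name": "User 2", "joining_date": "2022-12-18T11:37:00.000Z"},
    }},
    {"doc_id": "b", "users": {
        "7890": {"name": "User 3", "joining_date": "2023-01-05T09:00:00.000Z"},
    }},
    {"doc_id": "c", "users": {}},
]

print([doc["doc_id"] for doc in sort_newest_first(docs)])
```

If sorting has to happen inside Elasticsearch, the same idea would mean storing such a precomputed key in its own concrete field at index time; that is a design assumption, not something the question confirms.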

Related

Search for documents with exactly different fields values

I'm adding documents with the following structure:
{
"proposta": {
"matriculaIndicacao": 654321,
"filial": 100,
"cpf": "12345678901",
"idStatus": "3",
"status": "Reprovada",
"dadosPessoais": {
"nome": "John Five",
"dataNascimento": "1980-12-01",
"email": "fulanodasilva#fulano.com.br",
"emailValidado": true,
"telefoneCelular": "11 99876-9999",
"telefoneCelularValidado": true,
"telefoneResidencial": "11 2211-1122",
"idGenero": "1",
"genero": "M"
}
}
}
I'm trying to perform a search with multiple field values.
I can successfully search for a document with a specific cpf attribute using the following search:
{
"query": {
"term" : {
"proposta.cpf" : "23798770823"
}
}
}
But now I need to add an AND clause, like
{
"query": {
"term" : {
"proposta.cpf" : "23798770823"
,"proposta.dadosPessoais.dataNascimento": "1980-12-01"
}
}
}
but it's returning an error message.
P.S: If possible I would like to perform a search where if the field doesn't exist, it returns the document that matches only the proposta.cpf field.
I really appreciate any help.
The idea is to combine your constraints within a bool/should query
{
"query": {
"bool": {
"should": [
{
"term": {
"proposta.cpf": "23798770823"
}
},
{
"term": {
"proposta.dadosPessoais.dataNascimento": "1980-12-01"
}
}
]
}
}
}
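Note that a bool query containing only should clauses returns documents that match either clause. If cpf must always match and the birth date should match only when the field is present (the P.S. in the question), one possible shape is sketched below in Python. The function name is hypothetical; the clause types (bool, must, should, must_not, exists, minimum_should_match) are standard query DSL.

```python
import json


def cpf_with_optional_birth_date(cpf, birth_date):
    """Build a query body: cpf is required; the birth date must match
    only on documents where the field exists."""
    birth_field = "proposta.dadosPessoais.dataNascimento"
    return {
        "query": {
            "bool": {
                # Always required.
                "must": [{"term": {"proposta.cpf": cpf}}],
                # At least one of these alternatives has to hold:
                # the date matches, or the field is missing altogether.
                "should": [
                    {"term": {birth_field: birth_date}},
                    {"bool": {"must_not": {"exists": {"field": birth_field}}}},
                ],
                "minimum_should_match": 1,
            }
        }
    }


body = cpf_with_optional_birth_date("23798770823", "1980-12-01")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the request body of a _search call.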

Elasticsearch: Search in an array of JSONs

I'm using Elasticsearch with the Python library, and I have a problem writing the search query when the object becomes a little complex. The objects in my index are built like this:
{
"id" : 120,
"name": "bob",
"shared_status": {
"post_id": 123456789,
"text": "This is a sample",
"urls" : [
{
"url": "http://test.1.com",
"displayed_url": "test.1.com"
},
{
"url": "http://blabla.com",
"displayed_url": "blabla.com"
}
]
}
}
Now I want a query that returns this document only if one of the displayed URLs contains the substring "test" and the main document has a "text" field. So I wrote this query:
{
"query": {
"bool": {
"must": [
{"exists": {"field": "text"}}
]
}
}
}
But I don't know what query to add for the part: one of the displayed URLs contains the substring "test".
Is that possible? How does iteration over the list work?
If you didn't define an explicit mapping for your schema, Elasticsearch creates a default mapping based on the input data:
urls will be of type object
displayed_url will be of type string and will use the standard analyzer
As you don't need any association between url and displayed_url, the current schema will work fine.
You can use a match query for a full-text match:
GET _search
{
"query": {
"bool": {
"must": [
{
"exists": {
"field": "shared_status.text"
}
},
{
"match": {
"shared_status.urls.displayed_url": "test"
}
}
]
}
}
}
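On the question of how the list is iterated: for an array of plain (non-nested) objects there is no iteration at query time. Roughly speaking, the values are flattened into multi-value fields keyed by their full dotted path. The following is a simplified Python model of that flattening, not Elasticsearch code; flatten is a hypothetical helper.

```python
def flatten(value, prefix=""):
    """Collect every leaf value under its dotted path, merging array
    elements into one list per path."""
    fields = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            for leaf_path, leaf_values in flatten(child, path).items():
                fields.setdefault(leaf_path, []).extend(leaf_values)
    elif isinstance(value, list):
        # Array elements share the parent's path; their order is not kept
        # as positions, only as a bag of values.
        for item in value:
            for leaf_path, leaf_values in flatten(item, prefix).items():
                fields.setdefault(leaf_path, []).extend(leaf_values)
    else:
        fields[prefix] = [value]
    return fields


doc = {
    "id": 120,
    "name": "bob",
    "shared_status": {
        "post_id": 123456789,
        "text": "This is a sample",
        "urls": [
            {"url": "http://test.1.com", "displayed_url": "test.1.com"},
            {"url": "http://blabla.com", "displayed_url": "blabla.com"},
        ],
    },
}

for path, values in flatten(doc).items():
    print(path, values)
```

Under this model the leaf paths carry the shared_status. prefix, so a query has to name the field by its full path.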

Elasticsearch: Get report of unmatched should elements in a bool query

I'm looking for a way to get a report of unmatched should queries and display it.
For instance I have two user objects
User 1:
{
"username": "user1",
"docType": "user",
"level": "Professor",
"discipline": "Sciences",
"sub-discipline": "Mathematical"
}
User 2:
{
"username": "user2",
"docType": "user",
"level": "Professor",
"discipline": "Sciences",
"subDiscipline": "Physics"
}
When I run a bool query where the discipline match is in the must clause and the sub-discipline is in the should clause:
bool:
must: [{
term: { "doc.docType": "user" }
},{
term: { "doc.level": "professor" }
},{
term: { "doc.discipline": "sciences" }
}],
should: [{
term: { "subDiscipline": "physics" }
}]
How can I get the unmatched elements in my results, like this:
Result 1: user1 match 100%
Result 2: user2 match 70% (unmatched sub-discipline "physics")
I had a look at the Explain API, but its output doesn't seem designed for this use case and looks very complicated to parse.
You will need to use named queries for this.
Give each clause a _name and create a bool query like the one below:
{
"query": {
"bool": {
"must": [
{
"match": {
"SourceName": {
"query": "CNN",
"_name": "sourceMatch"
}
}
},
{
"match": {
"author": {
"query": "qbox.io",
"_name": "author"
}
}
}
]
}
}
}
In the response, each hit carries a matched_queries array listing the named queries that matched it.
You can use this information to compute the stats you are looking for.
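As a rough sketch of that last step, the Python below compares each hit's matched_queries with the full list of query names and reports what is missing. The sample response is hypothetical and trimmed to the fields used; report_unmatched is an illustrative helper, not part of any client library. Since every returned hit necessarily matches all must clauses, this is mostly informative for clauses placed in should.

```python
def report_unmatched(response, named_queries):
    """Return (doc id, matched share in percent, unmatched names) per hit."""
    report = []
    for hit in response["hits"]["hits"]:
        matched = set(hit.get("matched_queries", []))
        unmatched = [name for name in named_queries if name not in matched]
        share = round(100 * (len(named_queries) - len(unmatched)) / len(named_queries))
        report.append((hit["_id"], share, unmatched))
    return report


# Hypothetical response, reduced to the relevant parts.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "matched_queries": ["sourceMatch", "author"]},
            {"_id": "2", "matched_queries": ["sourceMatch"]},
        ]
    }
}

for doc_id, share, unmatched in report_unmatched(sample_response, ["sourceMatch", "author"]):
    print(f"Result {doc_id}: match {share}% (unmatched: {unmatched})")
```

The percentage here is simply the fraction of named clauses that matched; a weighted score would need per-clause weights chosen by the application.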

Querystring search on array elements in Elastic Search

I'm trying to learn elasticsearch with a simple example application, that lists quotations associated with people. The example mapping might look like:
{
"people" : {
"properties" : {
"name" : { "type" : "string"},
"quotations" : { "type" : "string" }
}
}
}
Some example data might look like:
{ "name" : "Mr A",
"quotations" : [ "quotation one, this and that and these"
, "quotation two, those and that"]
}
{ "name" : "Mr B",
"quotations" : [ "quotation three, this and that"
, "quotation four, those and these"]
}
I would like to be able to use the querystring api on individual quotations, and return the people who match. For instance, I might want to find people who have a quotation that contains (this AND these) - which should return "Mr A" but not "Mr B", and so on. How can I achieve this?
EDIT1:
Andrei's answer below seems to work, with data values now looking like:
{"name":"Mr A","quotations":[{"value" : "quotation one, this and that and these"}, {"value" : "quotation two, those and that"}]}
However, I can't seem to get a query_string query to work. The following produces no results:
{
"query": {
"nested": {
"path": "quotations",
"query": {
"query_string": {
"default_field": "quotations",
"query": "quotations.value:this AND these"
}
}
}
}
}
Is there a way to get a query_string query working with a nested object?
Edit2: Yes it is, see Andrei's answer.
To achieve that, you need to use nested objects, so that the query runs against each individual value of the nested object rather than against a flattened list of values. For example:
{
"mappings": {
"people": {
"properties": {
"name": {
"type": "string"
},
"quotations": {
"type": "nested",
"properties": {
"value": {
"type": "string"
}
}
}
}
}
}
}
Values:
{"name":"Mr A","quotations":[{"value": "quotation one, this and that and these"}, {"value": "quotation two, those and that"}]}
{"name":"Mr B","quotations":[{"value": "quotation three, this and that"}, {"value": "quotation four, those and these"}]}
Query:
{
"query": {
"nested": {
"path": "quotations",
"query": {
"bool": {
"must": [
{ "match": {"quotations.value": "this"}},
{ "match": {"quotations.value": "these"}}
]
}
}
}
}
}
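Regarding the query_string attempt in EDIT1: in "quotations.value:this AND these" only the term "this" is bound to quotations.value, while "these" falls back to the default field, which was set to "quotations" (the nested object itself rather than a text field). One likely fix, sketched here in Python, is to point default_field at quotations.value and leave the terms unqualified. The function name is hypothetical, and this is not necessarily the exact query the original answer used.

```python
import json


def nested_query_string(path, field, expression):
    """Wrap a query_string query in a nested query so the expression is
    evaluated against each nested object separately."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "query_string": {
                        # Unqualified terms in the expression use this field.
                        "default_field": f"{path}.{field}",
                        "query": expression,
                    }
                },
            }
        }
    }


body = nested_query_string("quotations", "value", "this AND these")
print(json.dumps(body, indent=2))
```

An equivalent alternative is to keep the field prefix and group the terms, as in quotations.value:(this AND these).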
Unfortunately there is no good way to do that.
https://web.archive.org/web/20141021073225/http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/complex-core-fields.html
When you get a document back from Elasticsearch, any arrays will be in
the same order as when you indexed the document. The _source field
that you get back contains exactly the same JSON document that you
indexed.
However, arrays are indexed — made searchable — as multi-value fields,
which are unordered. At search time you can’t refer to “the first
element” or “the last element”. Rather think of an array as a bag of
values.
In other words, it is always considering all values in the array.
This will return only Mr A:
{
"query": {
"match": {
"quotations": {
"query": "quotation one",
"operator": "AND"
}
}
}
}
But this will return both Mr A & Mr B:
{
"query": {
"match": {
"quotations": {
"query": "this these",
"operator": "AND"
}
}
}
}
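The difference between the two results can be modelled in a few lines of Python. This is a simplified simulation of the "bag of values" behaviour described above, not Elasticsearch code: the tokenizer is a crude lowercase word split, and the function names are illustrative.

```python
import re


def tokens(text):
    """Very rough stand-in for an analyzer: lowercase word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def matches_flattened(quotations, terms):
    """AND match against the whole array treated as one bag of tokens."""
    bag = set()
    for quotation in quotations:
        bag |= tokens(quotation)
    return set(terms) <= bag


def matches_single_quotation(quotations, terms):
    """AND match that must be satisfied inside one quotation (nested-like)."""
    return any(set(terms) <= tokens(quotation) for quotation in quotations)


people = {
    "Mr A": ["quotation one, this and that and these", "quotation two, those and that"],
    "Mr B": ["quotation three, this and that", "quotation four, those and these"],
}

for name, quotations in people.items():
    print(
        name,
        "flattened:", matches_flattened(quotations, ["this", "these"]),
        "per quotation:", matches_single_quotation(quotations, ["this", "these"]),
    )
```

In this model both people match when the array is flattened, because Mr B has "this" in one quotation and "these" in another; only Mr A matches when both terms must occur in the same quotation.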
If scripting is enabled, this should work:
"script": {
"inline": "for(element in _source.quotations) { if(element.contains('this') && element.contains('these')) {return true;} }; return false;"
}

How to filter terms aggregation

Currently I have something like this
aggs: {
categories: {
terms: {
field: 'category'
}
}
}
and this gives me the number of products in each category. But I have an additional condition: I need the number of products in each category that are not already sold, so I need to filter the terms somehow.
Is there an elegant way of doing this with the aggregation framework, or do I need to write a filtered query?
Thank you
You can combine the Terms Aggregation with the Filter Aggregation; this is how it should look (tested):
aggs: {
categories: {
filter: {term: {sold: false}},
aggs: {
names: {
terms: {field: 'category'}
}
}
}
}
You can also add more conditions to the filter. I hope this helps.
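For completeness, here is a small Python sketch of building that request body and reading the per-category counts back out. The sold field name follows the answer above; the helper names and the sample response are hypothetical, and "size": 0 is added only to skip the search hits.

```python
def build_category_counts(sold):
    """Filter aggregation wrapping a terms aggregation."""
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {
            "categories": {
                "filter": {"term": {"sold": sold}},
                "aggs": {"names": {"terms": {"field": "category"}}},
            }
        },
    }


def read_category_counts(response):
    """Map each category bucket key to its document count."""
    buckets = response["aggregations"]["categories"]["names"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Hypothetical response, reduced to the parts that are read.
sample_response = {
    "aggregations": {
        "categories": {
            "doc_count": 5,
            "names": {
                "buckets": [
                    {"key": "books", "doc_count": 3},
                    {"key": "toys", "doc_count": 2},
                ]
            },
        }
    }
}

print(build_category_counts(sold=False))
print(read_category_counts(sample_response))
```

The terms buckets sit one level below the filter aggregation, under the name given to the inner aggregation (names here).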
Just to add to the other answer, you can also use a nested aggregation. This is similar to what I had to do. I'm using Elasticsearch 5.2.
From the docs, here is the basic syntax:
"aggregations" : {
"<aggregation_name>" : {
"<aggregation_type>" : {
<aggregation_body>
}
[,"aggregations" : { [<sub_aggregation>]+ } ]?
}
[,"<aggregation_name_2>" : { ... } ]*
}
This is how I implemented it:
GET <path> core_data/_search
{
"aggs": {
"NAME": {
"nested": {
"path": "ATTRIBUTES"
},
"aggs": {
"NAME": {
"filter": {
"term": {
"ATTRIBUTES.ATTR_TYPE": "EDUCATION_DEGREE"
}
},
"aggs": {
"NAME": {
"terms": {
"field": "ATTRIBUTES.DESCRIPTION",
"size": 100
}
}
}
}
}
}
}
}
This filtered the data down to one bucket, which is what I needed.
