Elasticsearch must_not terms query for Spring - spring-boot

I have a database model for a user as:
{
"referenceId": "123",
"name": "Frank",
"department": "CS",
"negativeUserList": [ "980", "739" ]
}
I am writing an ES 2.4 query that performs a full-text search (FTS) on the user name within a given department, but the result should not contain any user whose ID is in the negativeUserList array. So, for the object above, even if user 980 has the same name and department, he (980) should not appear in the FTS results of the searching user (123). I have written the following query, which works well:
{
"query": {
"bool": {
"must": {
"multi_match": {
"query": "A",
"fields": [
"name.ngram"
],
"minimum_should_match": "100%"
}
},
"filter": {
"term": {
"department": "cs"
}
},
"must_not": {
"terms": {
"referenceId": [
"980",
"739"
]
}
}
}
}
}
But when I integrate the same with Spring ElasticsearchRepository as:
@Query("{ \"query\": { \"bool\": { \"must\": { \"multi_match\": { \"query\": \"?0\", \"fields\": [ \"name.ngram\" ], \"minimum_should_match\": \"100%\" } }, \"filter\": { \"term\": { \"department\": \"?1\" } }, \"must_not\": { \"terms\": { \"referenceId\": [ \"?2\" ] } } } } }")
Page<UserSummary> getFreeTextSearchForUser(String freeText, String currentUserDepartment, List<String> negativeUserList, Pageable pageable);
Then I do not get any results when currentUserDepartment is cs and negativeUserList is [ "980", "739" ], even though I do get results when I run the native query against the ES REST endpoint directly.
As of now, most users have an empty negativeUserList array, and the query should run normally for them.
EDIT:
When I modify the Spring query with:
{ \"referenceId\": \"?2\" }
i.e. without the square brackets around \"?2\", I get this error:
Caused by: org.elasticsearch.index.query.QueryParsingException: [terms] query does not support [referenceId]
at org.elasticsearch.index.query.TermsQueryParser.parse(TermsQueryParser.java:155)
If I now remove the double-quotes:
{ \"referenceId\": ?2 }
I get no results, but also no error. What am I doing wrong?

Related

ElasticSearch query error: query malformed, no start_object

I was providing the below query to ElasticSearch:
{
"bool": {
"filter": [
[
"000TBT-E",
"07N3P3-E",
"BNTX",
"PFE"
],
{
"terms": {
"query_uuid": [
"0284EFDB592041B7BEC9867C096FD881",
"051AA9969F3140BFABA50B2B195249CF",
"079EF8038042418EA3A219A62A09845A",
"0890230614E14DE59473C9A2ADA72FC9",
"0A8B26197C034154B516652E30100D4D",
"110D2E40E2E04DB6845D0CFC62FA537B",
"12529A8CB715483F98333818FCBD54C8",
"1379AD490F914C8893B68E41ACF115C0",
"137C1B522A37441884B566F914127836",
"13A0E9DCD4DE4350985802D1826252DF",
"FE4F0480A0D04B7FB67B3E9B8BC96496"
]
}
}
]
}
}
When this query was executed, Elasticsearch threw the error:
'{"col":106,"line":1,"reason":"[_na] query malformed, must start with start_object","root_cause":[{"col":106,"line":1,"reason":"[_na] query malformed, must start with start_object","type":"parsing_exception"}],"type":"parsing_exception"}'
May I know what I am missing here?
There are two issues in your query:
You need to put everything inside the query section
The first filter is not acting on any field
Here is a working query:
{
"query": { <-- add this
"bool": {
"filter": [
{
"terms": { <-- add this
"fieldname": [ <-- add this
"000TBT-E",
"07N3P3-E",
"BNTX",
"PFE"
]
}
},
{
"terms": {
"query_uuid": [
"0284EFDB592041B7BEC9867C096FD881",
"051AA9969F3140BFABA50B2B195249CF",
"079EF8038042418EA3A219A62A09845A",
"0890230614E14DE59473C9A2ADA72FC9",
"0A8B26197C034154B516652E30100D4D",
"110D2E40E2E04DB6845D0CFC62FA537B",
"12529A8CB715483F98333818FCBD54C8",
"1379AD490F914C8893B68E41ACF115C0",
"137C1B522A37441884B566F914127836",
"13A0E9DCD4DE4350985802D1826252DF",
"FE4F0480A0D04B7FB67B3E9B8BC96496"
]
}
}
]
}
}
}
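As a rough illustration of the two fixes above, here is a minimal Python sketch that builds the same shape of request body programmatically, so that every entry in the filter array is an object wrapping a terms clause and everything sits under the top-level query key. The helper name build_filter_query is hypothetical, "fieldname" is the same placeholder used in the answer, and only two of the UUIDs are repeated for brevity.

```python
import json


def build_filter_query(field_values):
    """Build a bool/filter search body with one terms clause per field.

    field_values maps a field name to the list of values allowed for it.
    """
    filters = [
        {"terms": {field: list(values)}}
        for field, values in field_values.items()
    ]
    return {"query": {"bool": {"filter": filters}}}


body = build_filter_query({
    # "fieldname" is a placeholder, as in the answer above.
    "fieldname": ["000TBT-E", "07N3P3-E", "BNTX", "PFE"],
    "query_uuid": [
        "0284EFDB592041B7BEC9867C096FD881",
        "051AA9969F3140BFABA50B2B195249CF",
    ],
})

# Serialize to the JSON that would be sent as the search request body.
print(json.dumps(body, indent=2))
```

Building the body this way makes it hard to end up with a bare array inside filter, which is what triggered the "must start with start_object" error.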

Elasticsearch query on array field and matching on exactly one object from array

I store objects like this in Elasticsearch:
{
"userName": "Cool User",
"orders":[
{
"orderType": "type1",
"amount": 500
},
{
"orderType": "type2",
"amount": 1000
}
]
}
Everything works when I search by the 'orders.orderType' or 'orders.amount' fields.
But what query do I have to use to get objects which have 'orders.amount >= 500' and 'orders.orderType=type2'?
I've tried a query like this:
{
"query": {
"bool": {
"must": [
{
"range": {
"orders.amount": {
"from": "499"
}
}
},
{
"query_string": {
"query": "type2",
"fields": [
"orders.orderType"
]
}
}
]
}
}
}
..but this request returns records that have 'orders.orderType=type2' OR 'orders.amount >= 500'.
Please help me construct a query that looks for records containing an object inside the orders array that has both amount >= 500 AND orderType=type2.
Finally, I found a blog post that describes exactly my case:
https://www.bmc.com/blogs/elasticsearch-nested-searches-embedded-documents/
Thanks for help.
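For readers who do not want to follow the link, here is a minimal Python sketch of the kind of nested query that approach relies on. It assumes the orders field has been mapped with type nested (the default object mapping is not enough); the helper name nested_orders_query and the mapping fragment are illustrative, and the exact mapping envelope around the properties block depends on the Elasticsearch version.

```python
import json

# Assumed mapping fragment: "orders" must be declared as nested so that
# each order is indexed as its own hidden sub-document.
orders_mapping_fragment = {"properties": {"orders": {"type": "nested"}}}


def nested_orders_query(order_type, min_amount):
    """Match documents where a single order satisfies both conditions."""
    return {
        "query": {
            "nested": {
                "path": "orders",
                "query": {
                    "bool": {
                        "must": [
                            {"range": {"orders.amount": {"gte": min_amount}}},
                            {"match": {"orders.orderType": order_type}},
                        ]
                    }
                },
            }
        }
    }


body = nested_orders_query("type2", 500)
print(json.dumps(body, indent=2))
```

Because both clauses sit inside the same nested query, they are evaluated against one order at a time rather than against the values of all orders pooled together.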

Search for documents with exactly different fields values

I'm adding documents with the following structure:
{
"proposta": {
"matriculaIndicacao": 654321,
"filial": 100,
"cpf": "12345678901",
"idStatus": "3",
"status": "Reprovada",
"dadosPessoais": {
"nome": "John Five",
"dataNascimento": "1980-12-01",
"email": "fulanodasilva@fulano.com.br",
"emailValidado": true,
"telefoneCelular": "11 99876-9999",
"telefoneCelularValidado": true,
"telefoneResidencial": "11 2211-1122",
"idGenero": "1",
"genero": "M"
}
}
}
I'm trying to perform a search with multiple field values.
I can successfully search for a document with a specific cpf attribute using the following search:
{
"query": {
"term" : {
"proposta.cpf" : "23798770823"
}
}
}
But now I need to add an AND clause, like
{
"query": {
"term" : {
"proposta.cpf" : "23798770823"
,"proposta.dadosPessoais.dataNascimento": "1980-12-01"
}
}
}
but it returns an error message.
P.S.: If possible, I would like the search to return documents that match only the proposta.cpf field when the other field doesn't exist.
I really appreciate any help.
The idea is to combine your constraints within a bool/should query:
{
"query": {
"bool": {
"should": [
{
"term": {
"proposta.cpf": "23798770823"
}
},
{
"term": {
"proposta.dadosPessoais.dataNascimento": "1980-12-01"
}
}
]
}
}
}
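Note that a bool query containing only should clauses matches documents that satisfy at least one of them, so it behaves like OR. If a strict AND on cpf plus the "match the date, or accept documents where the date field is missing" behaviour from the P.S. is wanted, one possible variant is sketched below in Python. This is only a sketch: the helper name cpf_and_optional_birthdate is hypothetical, and it assumes an Elasticsearch version where exists is available as a query.

```python
import json

DATE_FIELD = "proposta.dadosPessoais.dataNascimento"


def cpf_and_optional_birthdate(cpf, birth_date):
    """cpf must match; the birth date must match OR the field must be absent."""
    return {
        "query": {
            "bool": {
                "must": [{"term": {"proposta.cpf": cpf}}],
                "should": [
                    {"term": {DATE_FIELD: birth_date}},
                    {"bool": {"must_not": {"exists": {"field": DATE_FIELD}}}},
                ],
                # Require at least one of the two should clauses.
                "minimum_should_match": 1,
            }
        }
    }


body = cpf_and_optional_birthdate("23798770823", "1980-12-01")
print(json.dumps(body, indent=2))
```

Without minimum_should_match, the should clauses would only influence scoring once a must clause is present, so documents with a different birth date would still be returned.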

Elasticsearch: Search in an array of JSONs

I'm using Elasticsearch with the Python library, and I have a problem writing a search query when the object becomes a little more complex. The objects in my index are built like this:
{
"id" : 120,
"name": "bob",
"shared_status": {
"post_id": 123456789,
"text": "This is a sample",
"urls" : [
{
"url": "http://test.1.com",
"displayed_url": "test.1.com"
},
{
"url": "http://blabla.com",
"displayed_url": "blabla.com"
}
]
}
}
Now I want a query that returns this document only if one of the displayed URLs contains the substring "test" and the main document has a "text" field. So I wrote this query:
{
"query": {
"bool": {
"must": [
{"exists": {"field": "text"}}
]
}
}
}
But I don't know what to add for the part where one of the displayed URLs contains the substring "test".
Is that possible? How does iteration over the list work?
If you didn't define an explicit mapping for your schema, Elasticsearch creates a default mapping based on the input data:
urls will be of type object
displayed_url will be of type string and using standard analyzer
As you don't need any association between url and displayed_url, the current schema will work fine.
You can use a match query for full text match
GET _search
{
"query": {
"bool": {
"must": [
{
"exists": {
"field": "text"
}
},
{
"match": {
"urls.displayed_url": "test"
}
}
]
}
}
}
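To make the "how does iteration over the list work" part more concrete: with the default object mapping there is no per-element iteration at query time. Arrays of objects are flattened into multi-valued fields when the document is indexed. The Python sketch below is only a simplified model of that flattening (the helper name flatten_object_array is made up, and real indexing also analyzes each value into tokens), using the urls array from the question.

```python
def flatten_object_array(field, objects):
    """Simplified model of how an array of objects is flattened at index time.

    Each inner key becomes one multi-valued field named "<field>.<key>",
    and the link between values of the same object is lost.
    """
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


urls = [
    {"url": "http://test.1.com", "displayed_url": "test.1.com"},
    {"url": "http://blabla.com", "displayed_url": "blabla.com"},
]

flat = flatten_object_array("urls", urls)
for name, values in flat.items():
    print(name, values)
```

A match query on the displayed_url field is then checked against all collected values at once, which is why the document matches as soon as any one of the URLs contains the term. Depending on the Elasticsearch version, the full path (here shared_status.urls.displayed_url) may be required in the query.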

Elasticsearch: Get report of unmatched should elements in a bool query

I'm looking for a way to get a report of the should queries that did not match and display it.
For instance I have two user objects
User 1:
{
"username": "user1",
"docType": "user",
"level": "Professor",
"discipline": "Sciences",
"sub-discipline": "Mathematical"
}
User 2:
{
"username": "user1",
"docType": "user",
"level": "Professor",
"discipline": "Sciences",
"subDiscipline": "Physics"
}
When I do a bool query where the discipline match is in the must clause and the sub-discipline is in the should clause:
bool:
must: [{
term: { "doc.docType": "user" }
},{
term: { "doc.level": "professor" }
},{
term: { "doc.discipline": "sciences" }
}],
should: [{
term: { "subDiscipline": "physics" }
}]
How can I get the unmatched elements in my result, like this:
Result 1: user1 match 100%
Result 2: user2 match 70% (unmatched subdiscipline "physics")
I had a look at the Explain API, but its output doesn't seem to be designed for this use case and seems very complicated to parse.
You will need to use named queries for this.
Using them, create a bool query like the one below:
{
"query": {
"bool": {
"must": [
{
"match": {
"SourceName": {
"query": "CNN",
"_name": "sourceMatch"
}
}
},
{
"match": {
"author": {
"query": "qbox.io",
"_name": "author"
}
}
}
]
}
}
}
The result section will tell you which of the named queries matched.
You can use this information to build the stats you are looking for.
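As a minimal sketch of that last step: each hit in the search response carries a matched_queries array listing the _name values that matched, so the unmatched ones can be derived by comparing against the names you defined. The Python below works on hand-written sample hits rather than real Elasticsearch output, and the helper name unmatched_named_queries is hypothetical. A hit can only lack a name if the corresponding clause is optional, for example placed under should.

```python
def unmatched_named_queries(hit, expected_names):
    """Return the expected query names missing from a hit's matched_queries."""
    matched = set(hit.get("matched_queries", []))
    return [name for name in expected_names if name not in matched]


# Hand-written sample hits for illustration only.
sample_hits = [
    {"_id": "1", "matched_queries": ["sourceMatch", "author"]},
    {"_id": "2", "matched_queries": ["sourceMatch"]},
]

expected = ["sourceMatch", "author"]

# Map each hit id to the list of named queries it did not match.
report = {
    hit["_id"]: unmatched_named_queries(hit, expected)
    for hit in sample_hits
}
print(report)
```

From such a report, a percentage like the one in the question could be computed as the share of expected names that matched for each hit.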
