Elasticsearch: query for documents where a field does not exist at all, without matching [] (empty array field)

I have some documents with field links : [] while other documents don't have the field links at all.
I want to get documents which don't have the field links at all.
I have tried the following query:
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "links"
        }
      }
    }
  }
}
But this query also returns the documents with links:[]

Your best bet is to modify the mapping of the field so that it accounts for null values; refer to the documentation.
You could use a wildcard query (*) inside a bool query to check whether the field has any terms, but that is a very inefficient, slow way to query and may not be practical depending on the cardinality of that field.
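To illustrate why must_not exists cannot separate the two cases, here is a small Python model of how the exists query treats a field. It is a simplified sketch of the documented behaviour (a field whose value is missing, null, or [] has no indexed values), not Elasticsearch code, and the function names has_indexed_value and truly_missing are hypothetical. One possible fallback, assuming _source is stored and returned, is to run the must_not exists query and then check each hit's _source on the client:

```python
def has_indexed_value(source, field):
    """Simplified model of `exists`: true only if the field holds at least one non-null value."""
    value = source.get(field)
    values = value if isinstance(value, list) else [value]
    return any(v is not None for v in values)


def truly_missing(source, field):
    """Client-side check: the key is absent from the _source document altogether."""
    return field not in source


# Sample _source documents (made up for illustration).
docs = [
    {"title": "a"},                   # no links field at all
    {"title": "b", "links": []},      # empty array
    {"title": "c", "links": None},    # explicit null
    {"title": "d", "links": ["x"]},   # has a value
]

# What `must_not: exists` would return under this model.
must_not_exists = [d for d in docs if not has_indexed_value(d, "links")]
# Post-filter that keeps only documents where the field is really absent.
absent = [d for d in must_not_exists if truly_missing(d, "links")]

print([d["title"] for d in must_not_exists])
print([d["title"] for d in absent])
```

The post-filter only works if the original JSON is available in _source, and it moves part of the filtering out of Elasticsearch, so it may not suit very large result sets.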

Related

Find all documents where field exists literally: either it has a value or it is null in Elasticsearch

I know how to determine the documents in Elasticsearch 6.8 with the non empty field, e.g.:
GET grch38_test__wes__grch38__variants__20210222/_search
{
  "query": {
    "bool": {
      "must": [{
        "exists": {
          "field": "hgmd_accession"
        }
      }]
    }
  }
}
But how can I return documents with non-null values together with those holding empty values in one query? I need to find the documents where the field literally exists, whether it is empty or set to null. Some documents in my index do not have the field at all, and I need to _reindex only the ones where the field is present in any form.
I don't think null values can be searched, because Elasticsearch does not index them.
If you can change your index mapping, then you should look into the null_value mapping parameter provided by Elasticsearch.
Find it here: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/null-value.html
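As a rough sketch of the null_value approach, the Python snippet below builds a 6.x-style mapping and two query bodies. The keyword field type, the _doc type name and the "NULL" placeholder are assumptions made for illustration; the placeholder must be a value that never occurs in real data. Note that null_value only replaces an explicit null at index time: an empty array contains no explicit null and is not replaced, and documents indexed before the mapping was in place are not affected until they are reindexed.

```python
import json

NULL_MARKER = "NULL"  # hypothetical placeholder; must not collide with real values

# Mapping body for index creation (6.x style, with an assumed `_doc` type).
mapping = {
    "mappings": {
        "_doc": {
            "properties": {
                "hgmd_accession": {
                    "type": "keyword",          # assumed field type
                    "null_value": NULL_MARKER,  # explicit nulls are indexed as this term
                }
            }
        }
    }
}

# Documents that were indexed with an explicit null for the field.
only_nulls = {"query": {"term": {"hgmd_accession": NULL_MARKER}}}

# Documents with a real value or an explicit null: both now have an indexed term.
value_or_null = {"query": {"exists": {"field": "hgmd_accession"}}}

print(json.dumps(mapping, indent=2))
print(json.dumps(only_nulls))
print(json.dumps(value_or_null))
```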

Elasticsearch join-like query on multiple types and different fields

I have an Elasticsearch index called my_index which contains documents of two types, Type1 and Type2.
The two document types contain different data about the same type of entity.
The two document types both contain the ID of the related entity.
I've been trying to construct a join-like query which would return entities which match conditions on both document types, but I can't get it to work, and I also can't find any citation in the Elasticsearch multi-type or query documentation that says it's not possible.
The problem I'm trying to solve is avoiding having to manually join two result sets by getting all Type1 hits and all Type2 hits and doing the join outside of Elasticsearch, since the index has millions of documents.
The equivalent in SQL would be
select * from
  Type1 inner join Type2
  on Type2.EntityId = Type1.EntityId
where
  Type1.Field = Condition AND
  Type2.Field = Condition [...]
The URL I'm using to query against is http://elastic/my_index/Type1,Type2/_search to include both document types.
If I perform a blank query against this URL, I get hits of both Type1 and Type2.
If I add a criterion for Type1, it works as expected:
{
  "query": {
    "bool": {
      "must": [
        { "term": { "FieldOnType1": "lorem" } }
      ]
    }
  }
}
Somehow Elasticsearch can infer that FieldOnType1 is indeed a field on Type1.
When I add a criterion for Type2, I don't get any hits:
{
  "query": {
    "bool": {
      "must": [
        { "term": { "FieldOnType1": "lorem" } },
        { "term": { "FieldOnType2": "ipsum" } }
      ]
    }
  }
}
In reality, there are sometimes more than 2 term queries, or range queries and term queries.
I'm guessing the problem with the above query is that no single document can match both criteria at once.
I've tried using should instead of must, qualifying the field names with type names, and many variations of the query (including using filters instead of queries), but everything gives me 0 hits.
Similar questions here suggest to use the Elasticsearch multi-search API instead of the search API, but that won't solve my "manual join" problem.
Is there a way to make an elaborate "OR" query that would allow queries on both types? Or something else?
Try a multi_match query (I use ES 6, so I have one index per type):
GET index1,index2/_search
{
  "query": {
    "multi_match": {
      "query": "1",
      "fields": ["FieldOnType1", "FieldOnType2"]
    }
  }
}
If you need to match different values on different fields, a should clause should work:
GET test,test1/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {"firstName": "john"}
        },
        {
          "term": {"firstName1": "jerry1"}
        }
      ]
    }
  }
}
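A should query like the one above returns individual documents that match either condition; it does not correlate documents with each other, so the intersection on the entity ID still has to be computed by the caller. A minimal Python sketch of that step follows. The hit structure (_type and _source keys), the EntityId field name and the function name entities_matching_all_types are assumptions based on the question, and a large result set would have to be paged through rather than held in one list.

```python
from collections import defaultdict


def entities_matching_all_types(hits, required_types, key="EntityId"):
    """Return entity IDs that have at least one hit from every required type."""
    types_by_entity = defaultdict(set)
    for hit in hits:
        types_by_entity[hit["_source"][key]].add(hit["_type"])
    return sorted(
        entity for entity, types in types_by_entity.items()
        if required_types <= types
    )


# Made-up hits, shaped like the "hits.hits" array of a search response.
hits = [
    {"_type": "Type1", "_source": {"EntityId": 1, "FieldOnType1": "lorem"}},
    {"_type": "Type2", "_source": {"EntityId": 1, "FieldOnType2": "ipsum"}},
    {"_type": "Type1", "_source": {"EntityId": 2, "FieldOnType1": "lorem"}},
]

matched = entities_matching_all_types(hits, {"Type1", "Type2"})
print(matched)
```

Entity 2 is dropped because only its Type1 document matched, which mirrors the inner join in the SQL from the question.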

Scope Elasticsearch Results to Specific Ids

I have a question about the Elasticsearch DSL.
I would like to do a full text search, but scope the searchable records to a specific array of database ids.
In SQL world, it would be the functional equivalent of WHERE id IN(1, 2, 3, 4).
I've been researching, but I find the Elasticsearch query DSL documentation a little cryptic and devoid of useful examples. Can anyone point me in the right direction?
Here is an example query which might work for you. This assumes that the _all field is enabled on your index (which is the default). It will do a full text search across all the fields in your index. Additionally, with the added ids filter, the query will exclude any document whose id is not in the given array.
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "_all": "your search text"
        }
      },
      "filter": {
        "ids": {
          "values": ["1", "2", "3", "4"]
        }
      }
    }
  }
}
Hope this helps!
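If the list of IDs comes from application code, the request body can be assembled programmatically. The sketch below is one way to do it in plain Python; the function name scoped_search is hypothetical, and the field argument defaults to _all only to mirror the answer above (the _all field is not available in newer Elasticsearch versions, so a concrete field name would be passed instead).

```python
import json


def scoped_search(text, ids, field="_all"):
    """Build a full-text query restricted to the given document IDs."""
    return {
        "query": {
            "bool": {
                # Scored full-text part.
                "must": {"match": {field: text}},
                # Unscored restriction, the counterpart of WHERE id IN (...).
                "filter": {"ids": {"values": [str(i) for i in ids]}},
            }
        }
    }


body = scoped_search("your search text", [1, 2, 3, 4])
print(json.dumps(body, indent=2))
```

Putting the ids query in filter rather than must keeps it from contributing to the relevance score.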
As discussed by Ali Beyad, the ids query can do that for you. Just to complement his answer, here is a working example in case anyone needs it in the future.
GET index_name/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "field": "your query"
          }
        },
        {
          "ids": {
            "values": ["0aRM6ngBFlDmSSLpu_J4", "0qRM6ngBFlDmSSLpu_J4"]
          }
        }
      ]
    }
  }
}
You can create a bool query that contains an Ids query in a MUST clause:
https://www.elastic.co/guide/en/elasticsearch/reference/2.0/query-dsl-ids-query.html
By using a MUST clause in a bool query, your search will be further limited by the Ids you specify. I'm assuming here by Ids you mean the _id value for your documents.
According to the Elasticsearch documentation, the ids query returns documents based on their IDs:
GET /_search
{
  "query": {
    "ids": {
      "values": ["1", "4", "100"]
    }
  }
}
With ElasticaBundle on Symfony 5.2:
$query = new Query();
$IdsQuery = new Query\Ids();
$IdsQuery->setIds($id);
$query->setQuery($IdsQuery);
$this->finder->find($query, $limit);
You have two options.
The ids query:
GET index/_search
{
  "query": {
    "ids": {
      "values": ["1", "2", "3"]
    }
  }
}
or the terms query:
GET index/_search
{
  "query": {
    "terms": {
      "yourNonPrimaryIdField": ["1", "2", "3"]
    }
  }
}
The ids query targets the document's internal _id field (the primary ID). Documents often contain secondary IDs as well, which you would target through the terms query.
Note that if your secondary IDs contain uppercase characters and you don't map their field as keyword, they'll be analyzed (and lowercased), and the terms query will appear broken because it only works with exact matches. More on this here: Only getting results when elasticsearch is case sensitive

Why can't I retrieve records in Elasticsearch using a bool query?

I've inserted a record in Elasticsearch and I can see it here:
But this query returns nothing:
{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must": {
            "term": {
              "name": "Ehsanl"
            }
          }
        }
      }
    }
  }
}
I send this query with the POST method to this URL: http://127.0.0.1:9200/mydb/customers2/_search
What's wrong with that?
Try giving the name as "ehsanl". All in lower case.
What you see on your screenshot is the original document as you indexed it (_source field).
However, by default, string fields are analyzed (see this answer for more detail about analysis).
With the standard analyzer, your name value was lowercased to ehsanl and stored that way in the index. A term query searches for the exact value Ehsanl in the index, which doesn't exist.
You can either:
use the value ehsanl with a term query, or
use the value Ehsanl with a match query, which applies the same analyzer to the search text before searching.
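The difference between the two options can be illustrated with a toy model in Python. This is only a rough stand-in for the standard analyzer (it splits on non-word characters and lowercases), not the real implementation, and the function names are hypothetical:

```python
import re


def simple_analyze(text):
    """Rough stand-in for the standard analyzer: tokenize and lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


# Terms stored in the index for the document whose name is "Ehsanl".
indexed_terms = set(simple_analyze("Ehsanl"))


def term_query_matches(value):
    """A term query compares the value as given, without analysis."""
    return value in indexed_terms


def match_query_matches(value):
    """A match query analyzes the search text first, then looks up the terms."""
    return any(token in indexed_terms for token in simple_analyze(value))


print(term_query_matches("Ehsanl"))
print(term_query_matches("ehsanl"))
print(match_query_matches("Ehsanl"))
```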

Sorting a match query with ElasticSearch

I'm trying to use ElasticSearch to find all records containing a particular string. I'm using a match query for this, and it's working fine.
Now, I'm trying to sort the results based on a particular field. When I try this, I get some very unexpected output, and none of the records even contain my initial search query.
My request is structured as follows:
{
  "query": {
    "match": {"_all": "some_search_string"}
  },
  "sort": [
    {
      "some_field": {
        "order": "asc"
      }
    }
  ]
}
Am I doing something wrong here?
In order to sort on a string field, your mapping must contain a non-analyzed version of this field. Here's a simple blog post I found that describes how you can do this using the multi_field mapping type.
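As a sketch of what such a mapping can look like, the Python snippet below builds a mapping with an extra non-analyzed sub-field and a search body that sorts on it. It assumes the text/keyword syntax of Elasticsearch 5.x and later (older versions expressed the same idea with a string field set to index: not_analyzed), and the sub-field name raw is a hypothetical choice:

```python
import json

# Field properties: analyzed for full-text search, plus an exact-value sub-field.
properties = {
    "some_field": {
        "type": "text",
        "fields": {
            "raw": {"type": "keyword"}  # hypothetical sub-field name, not analyzed
        },
    }
}

# Same match query as in the question, but sorting on the non-analyzed sub-field.
search_body = {
    "query": {"match": {"_all": "some_search_string"}},
    "sort": [{"some_field.raw": {"order": "asc"}}],
}

print(json.dumps(properties, indent=2))
print(json.dumps(search_body, indent=2))
```

Sorting on some_field.raw orders documents by the whole original string rather than by individual analyzed tokens.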

Resources