Return list of affected indices from Elasticsearch

I need to write a query that searches across all indices in Elasticsearch and returns a list of all indices where at least one document meets the query requirements.
For now I'm fetching the top 2000 documents and de-duplicating them by index name.

To search across all indices in Elasticsearch, you can use the _all option.
You can try something like the following to get the indices that have hits for the query:
POST _all/_search
{
  "query": {
    "query_string": {
      "query": "your search criteria"
    }
  }
}
Most APIs that refer to an index parameter support execution across multiple indices, using simple test1,test2,test3 notation (or _all for all indices).
You can extract the index name from the result set, where it appears under _index.
Sample result:
"hits": [
  {
    "_index": "index-name",
  }
]
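Instead of fetching documents and de-duplicating them on the client, one option is to let Elasticsearch group the hits by index with a terms aggregation on the _index metadata field. The following is a minimal Python sketch that only builds the request body and parses a response. The helper names, the max_indices default, and the sample response are hypothetical and not part of the original answer.

```python
def build_index_listing_body(query_text, max_indices=1000):
    """Build a search body that returns one bucket per matching index."""
    return {
        "size": 0,  # no hits are needed, only the aggregation
        "query": {"query_string": {"query": query_text}},
        "aggs": {
            # "indices" is an arbitrary aggregation name chosen for this sketch
            "indices": {"terms": {"field": "_index", "size": max_indices}}
        },
    }


def extract_index_names(response):
    """Return the names of indices with at least one matching document."""
    buckets = response["aggregations"]["indices"]["buckets"]
    return [b["key"] for b in buckets if b["doc_count"] > 0]


# Hypothetical response, trimmed to the part the helper reads.
sample_response = {
    "aggregations": {
        "indices": {
            "buckets": [
                {"key": "logs-2019", "doc_count": 12},
                {"key": "users", "doc_count": 1},
            ]
        }
    }
}

index_body = build_index_listing_body("your search criteria")
index_names = extract_index_names(sample_response)
print(index_names)
```

The body would be sent to _all/_search as in the answer above. If more indices match than max_indices allows, the bucket list is truncated, so that value is something to size for your cluster.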

Related

In Elasticsearch, how do I combine multiple filters with OR without affecting the score?

In Elasticsearch, I want to filter my results with two different clauses aggregated with OR e.g. return documents with PropertyA=true OR PropertyB=true.
I've been trying to do this using a bool query. My base query is just a text search in must. If I put both clauses in the filter occurrence type, it aggregates them with an AND. If I put both clauses in the should occurrence type with minimum_should_match set to 1, then I get the right results. But then, documents matching both conditions get a higher score because "should" runs in a query context.
How do I filter to only documents matching either of two conditions, without increasing the score of documents matching both conditions?
Thanks in advance
You need to leverage the constant_score query, so everything runs in the filter context:
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "minimum_should_match": 1,
          "should": [
            {
              "term": {
                "PropertyA": true
              }
            },
            {
              "term": {
                "PropertyB": true
              }
            }
          ]
        }
      }
    }
  }
}
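If the text search in must should keep contributing to the score, a related approach is to nest the OR as a bool query inside the filter clause of the outer bool query, since everything under filter runs in filter context. A minimal Python sketch of building such a body, assuming an Elasticsearch version whose bool query supports filter; the helper name and the "title" field are hypothetical:

```python
def build_or_filtered_query(text_field, text, flag_fields):
    """Score on the text match only; require at least one flag to be true."""
    or_clauses = [{"term": {field: True}} for field in flag_fields]
    return {
        "query": {
            "bool": {
                # Query context: this clause determines the score.
                "must": [{"match": {text_field: text}}],
                # Filter context: the nested should clauses act as an
                # unscored OR.
                "filter": [
                    {"bool": {"should": or_clauses, "minimum_should_match": 1}}
                ],
            }
        }
    }


or_body = build_or_filtered_query("title", "quick fox", ["PropertyA", "PropertyB"])
print(or_body)
```

Documents matching both PropertyA and PropertyB pass the filter exactly like documents matching one of them, so the ranking depends only on the must clause.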

Why is the Elasticsearch score with a wildcard query always 1.0?

When I run a search in Elasticsearch with a wildcard query (wildcard at the end), the score for all hits is 1.0.
Is this by design? Can I change this behavior somewhere?
Elasticsearch is basically saying that all results are equally relevant. A wildcard query only checks whether a term matches the pattern, so by default every hit gets the same constant score, much like with a match_all. As soon as you add some additional context through the various types of queries, you will notice changes in the scoring.
Depending on your ultimate goal, you may want to look into the Function Score query - reference: https://www.elastic.co/guide/en/elasticsearch/reference/6.7/query-dsl-function-score-query.html
The first example provided would give you essentially random scores for all documents in your cluster:
GET /_search
{
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "boost": "5",
      "random_score": {},
      "boost_mode": "multiply"
    }
  }
}
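Another knob that may be relevant is the rewrite parameter accepted by multi-term queries such as wildcard: the value scoring_boolean asks Elasticsearch to compute a relevance score per matching term instead of a constant one. A minimal Python sketch that only builds the request body; the helper name, the "name" field, and the "joh*" pattern are hypothetical. A pattern that matches many terms can exceed the boolean clause limit with this rewrite, so treat it as something to try rather than a recommendation.

```python
def build_scored_wildcard_query(field, pattern):
    """Build a wildcard query that requests per-term scoring via rewrite."""
    return {
        "query": {
            "wildcard": {
                field: {
                    "value": pattern,
                    # Ask for scored boolean rewriting instead of the
                    # default constant-score behavior.
                    "rewrite": "scoring_boolean",
                }
            }
        }
    }


wildcard_body = build_scored_wildcard_query("name", "joh*")
print(wildcard_body)
```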

Elasticsearch join-like query on multiple types and different fields

I have an Elasticsearch index called my_index which contains documents of two types, Type1 and Type2.
The two document types contain different data about the same type of entity.
The two document types both contain the ID of the related entity.
I've been trying to construct a join-like query which would return entities which match conditions on both document types, but I can't get it to work, and I also can't find any citation in the Elasticsearch multi-type or query documentation that says it's not possible.
The problem I'm trying to solve is avoiding having to manually join two result sets by getting all Type1 hits and all Type2 hits and doing the join outside of Elasticsearch, since the index has millions of documents.
The equivalent in SQL would be
select * from
Type1 inner join Type2
on Type2.EntityId = Type1.EntityId
where
Type1.Field = Condition AND
Type2.Field = Condition [...]
The URL I'm using to query against is http://elastic/my_index/Type1,Type2/_search to include both document types.
If I perform a blank query against this URL, I get hits of both Type1 and Type2.
If I add a criterion for Type1, it works as expected:
{
  "query": {
    "bool": {
      "must": [
        { "term": { "FieldOnType1": "lorem" } }
      ]
    }
  }
}
Somehow Elasticsearch can infer that FieldOnType1 is indeed a field on Type1.
When I add a criterion for Type2, I don't get any hits:
{
  "query": {
    "bool": {
      "must": [
        { "term": { "FieldOnType1": "lorem" } },
        { "term": { "FieldOnType2": "ipsum" } }
      ]
    }
  }
}
In reality, there are sometimes more than 2 term queries, or range queries and term queries.
I'm guessing the problem with the above query is that no single document can match both criteria at once.
I've tried using should instead of must, qualifying the field names with type names, and many variations of the query (including using filters instead of queries), but everything gives me 0 hits.
Similar questions here suggest to use the Elasticsearch multi-search API instead of the search API, but that won't solve my "manual join" problem.
Is there a way to make an elaborate "OR" query that would allow queries on both types? Or something else?
Try a multi_match query (I use ES 6, so I have one index per type):
GET index1,index2/_search
{
  "query": {
    "multi_match": {
      "query": "1",
      "fields": ["FieldOnType1", "FieldOnType2"]
    }
  }
}
If you need different values for different fields, a bool query with should clauses should work:
GET test,test1/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": { "firstName": "john" }
        },
        {
          "term": { "firstName1": "jerry1" }
        }
      ]
    }
  }
}

Scope Elasticsearch Results to Specific Ids

I have a question about the Elasticsearch DSL.
I would like to do a full text search, but scope the searchable records to a specific array of database ids.
In SQL world, it would be the functional equivalent of WHERE id IN(1, 2, 3, 4).
I've been researching, but I find the Elasticsearch query DSL documentation a little cryptic and devoid of useful examples. Can anyone point me in the right direction?
Here is an example query which might work for you. It assumes that the _all field is enabled on your index (which was the default before Elasticsearch 6.0). It does a full text search across all the fields in your index. Additionally, with the added ids filter, the query excludes any document whose id is not in the given array.
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "_all": "your search text"
        }
      },
      "filter": {
        "ids": {
          "values": ["1", "2", "3", "4"]
        }
      }
    }
  }
}
Hope this helps!
As discussed by Ali Beyad, the ids query can do that for you. To complement his answer, here is a working example in case anyone needs it in the future.
GET index_name/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "field": "your query"
          }
        },
        {
          "ids": {
            "values": ["0aRM6ngBFlDmSSLpu_J4", "0qRM6ngBFlDmSSLpu_J4"]
          }
        }
      ]
    }
  }
}
You can create a bool query that contains an Ids query in a MUST clause:
https://www.elastic.co/guide/en/elasticsearch/reference/2.0/query-dsl-ids-query.html
By using a MUST clause in a bool query, your search will be further limited by the Ids you specify. I'm assuming here by Ids you mean the _id value for your documents.
According to the Elasticsearch documentation, the ids query returns documents based on their IDs:
GET /_search
{
  "query": {
    "ids": {
      "values": ["1", "4", "100"]
    }
  }
}
With ElasticaBundle on Symfony 5.2:
$query = new Query();
$IdsQuery = new Query\Ids();
$IdsQuery->setIds($id);
$query->setQuery($IdsQuery);
$this->finder->find($query, $limit);
You have two options.
The ids query:
GET index/_search
{
  "query": {
    "ids": {
      "values": ["1", "2", "3"]
    }
  }
}
or
The terms query:
GET index/_search
{
  "query": {
    "terms": {
      "yourNonPrimaryIdField": ["1", "2", "3"]
    }
  }
}
The ids query targets the document's internal _id field (the primary ID). But documents often contain secondary (and further) IDs, which you would target through the terms query.
Note that if your secondary IDs contain uppercase chars and you don't set their field's mapping to keyword, they'll be normalized (and lowercased) and the terms query will appear broken because it only works with exact matches. More on this here: Only getting results when elasticsearch is case sensitive
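Pulling the answers in this thread together, the WHERE id IN (...) behavior can be sketched as a bool query whose must clause does the full text search and whose filter clause restricts the candidates with an ids query. The Python helper below only builds the request body; its name and the "description" field are hypothetical, and it converts the IDs to strings to match the string values used in the examples above.

```python
def build_scoped_search(field, text, ids):
    """Full text search on `field`, restricted to documents with the given _id values."""
    return {
        "query": {
            "bool": {
                # Scored full text match.
                "must": [{"match": {field: text}}],
                # Unscored restriction to the allowed document IDs.
                "filter": [{"ids": {"values": [str(i) for i in ids]}}],
            }
        }
    }


scoped_body = build_scoped_search("description", "your search text", [1, 2, 3, 4])
print(scoped_body)
```

Putting the ids query under filter rather than must keeps it out of the score calculation. For a secondary ID field, the ids clause could be swapped for a terms clause on that field.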

Why can't I retrieve records in Elasticsearch using a bool query?

I've inserted a record in Elasticsearch and I can see it here:
But this query returns nothing:
{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must": {
            "term": {
              "name": "Ehsanl"
            }
          }
        }
      }
    }
  }
}
I send this query with the POST method to this URL: http://127.0.0.1:9200/mydb/customers2/_search
What's wrong with that?
Try giving the name as "ehsanl". All in lower case.
What you see on your screenshot is the original document as you indexed it (_source field).
However, by default, string fields are analyzed (see this answer for more detail about analysis).
With the standard analyzer, your name value was lowercased to ehsanl and stored that way in the index. A term query searches the index for the exact value Ehsanl, which doesn't exist there.
You can either:
use the value ehsanl with a term query, or
use the value Ehsanl with a match query, which applies the same analyzer to the search text before searching.
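The difference between the two options can be illustrated with a toy model in Python. This is not how Elasticsearch is implemented: toy_analyze is a rough, hypothetical stand-in for the standard analyzer (split into word tokens, lowercase), and the two lookup functions only mimic the fact that term skips analysis while match applies it.

```python
import re


def toy_analyze(text):
    """Rough stand-in for the standard analyzer: word tokens, lowercased."""
    return [token.lower() for token in re.findall(r"\w+", text)]


# What ends up in the index for the name field of the document.
indexed_terms = set(toy_analyze("Ehsanl"))


def term_lookup(value):
    """Like a term query: the value is compared to indexed terms as-is."""
    return value in indexed_terms


def match_lookup(value):
    """Like a match query: the value is analyzed first, then compared."""
    return any(token in indexed_terms for token in toy_analyze(value))


print(term_lookup("Ehsanl"))   # False: the index holds "ehsanl"
print(term_lookup("ehsanl"))   # True
print(match_lookup("Ehsanl"))  # True: analyzed to "ehsanl" before lookup
```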
