Search in GET with a multifield condition - elasticsearch

I would like to perform an elasticsearch query relying on a GET request.
This query successfully allow me to see all the messages inside the index addressed to a particular sender (i.e., where sender.id == some value).
http://localhost:9200/myindex/messages/_search?q=sender.id:user1
Now, I would like to add a new field. In my case study, to retrieve only the messages with the boolean flag received set to true. So I tried:
http://localhost:9200/myindex/messages/_search?q=sender.id:user1&received:true
But this doesn't work, and I can't find any documentation / example on how to perform a GET query with multifield.
Please note that the parameter received exists, is always set, and is correctly working when used alone.

The q parameter take the lucene query syntax.
So to add another condition use the following:
http://localhost:9200/myindex/messages/_search?q=sender.id:user1%20AND%20received:true

Related

Must returns more results than Filter

I checked this question What is the difference between must and filter in Query DSL in elasticsearch? and read answers.
As far as I understood must and filter should return same result. Am I right? But when I change filter query to must, I receive more result? What I am doing wrong?
I compared filter and must query and got different result.
Must query gives you some score that is used to add to the total score of the doc.
Filter query does not add any score. It is just used to decide whether a doc is returned or not in the result set.
By just looking at the screenshot of the query attached, when you change filter query to must query it starts adding some value to the total score of the doc.
Since you are using min_score condition, the must clause makes more docs exceed 0.2 score and hence more docs are returned in the final result set.
Rest things will be more clear when you share the complete query.

How to overcome maxClauseCount error when using multi_match query

I have 10+ Indexes on my Elasticsearch server.
Each Index has 1 or more fields with different kind of Analyzers: keyword, standard, ngram and etc...
For Global search I am using multi_match without specifying any explicit fields.
For querying I am using using elasticsearch-dsl library, the code is bellow:
def search_for_index(indice, term, num_of_result=10):
s = Search(index=indice).sort({"_score": "desc"})
s = s[:num_of_result]
s = s.query('multi_match', query=term, operator='and')
response = s.execute()
return response.to_dict()['hits']['hits']
I get very good result, and search is working just fine, but sometimes someone enters a bit longer text, and I am getting maxClauseCount error.
For example, search that raises an error when search term term is equal to:
term=We are working on your request and will keep you posted at the earliest.
Or any other little longer text raises the same error.
Can you help me figure it out maybe some better approach for my Global search so that I can avoid this kind of error?
First of all - this limitation is there for a reason. The more boolean clauses you have - the heavier search would be. Think of it as crossing (AND) or joining (OR) subset of document ids for each of the clause. This is very heavy operation, that is why initially it has a limit of 1024 clauses.
General recommendation would be to try reduce number of fields you're searching. Maybe you have fields which consist no text data or just have some internal ids. You could cross them out during multi_match query by specifying fields section explicitly.
If you're still decided to go with current approach and you're using Elasticsearch 5.5+ and higher you could alter those by adding following line in elasticsearch.yml and restart your instance.
indices.query.bool.max_clause_count: 250000
If you're using pre-5 version of Elasticsearch the setting is called index.query.bool.max_clause_count

Couchdb get the changed document with each change notification

I'm quite sure that I want to be notified with the inserted document by each insertion in the couch db.
something like this:
http://localhost:5058/db-name/_chnages/_view/inserted-document
And I like the response to be something like the following:
{
"id":"0552065465",
"name":"james"
.
.
.
}
Reconnecting to the database for giving the actual document by each notification can cause performance issues.
Can I define a view that return the actual document by each change?
There are 3 possible way to define if a document was just added:
You add a status field to your document with a specific status for new documents.
If the revision starts with a 1- but it's not 100% accurate according to this if you do replication.
In the changes response, check if the number of revision of the document is equal to one. If so, it means it was just added(best solution IMO)
If you want to query the _changes endpoint and directly get the newly inserted documents, you can use the approach #1 and use a filter function that only returns documents with status="new".
Otherwise, you should go with approach #3 and filter the _changes responses locally. Eg: your application would receive all changes and only handle documents with revisions array count equal to 1.
And as you mentioned, you want to receive the document, not only the _id and the _rev. To do so, you can simply add the query parameter: include_docs=true

Elasticsearch not searching some fields

I have just updated a website, the update adds new fields to elasticsearch.
In my dev environment, it all works fine. but on the live site, the new fields are not being found.
Eg. I have added a new field with the value : 1
However, when adding a filtered query of
{"field":1}
It does not find any matching results.
When I look in the documents, I can see docs with the field set to 1
Would the reason for this be that the new field was added after the mappings was set? I am not all that familiar with elasticsearch, So I am not really sure where to start looking to fix it.
Any help would be appreciated.
Update:
querying from URL shows nothing either
_search/?pretty=true&size=50&q=field1:*
however there is another field that was added at the same time which I can search on.
I can see field1 in the result set but it just wont allow me to search on it.
Only difference i see in the mapping is that the one that is working is set to type:long whereas the one not working is set as type:string
Is it a length issue on the ngram? what was your "min_gram" settings?
When you check on your index settings like this:
GET <host>/<index_name>/_settings
Does it work when you filter for a two digit field?
Are all the field values one digit?
It's OK to add a field after the mapping was set. ElasticSearch will guess the mapping for you. (in fact, it's one of their selling features --- no need to define the mapping, just throw the data at it)
There are a few things that can go wrong:
Verify that data is actually in the index. To do that, just navigate to the _search url with no parameters, you should see the field if it is indexed.
Look at your mapping. Could it be that the field is explicitly set not to be indexed?
Another possibility is that your query is wrong (but that is unlikely, since you're saying it works in the development environment)

Conditional sorting in Solr 3.6

We're running Solr 3.6 and are trying to apply a conditional sort on the result set. To clarify, the data is a set of bids, and we want to add the option to sort by the current user's bid, so it can't function as a regular sort (as the bid will be different for each user that runs the query).
The documents in the result set include a "CurrentUserId" and "CurrentBid" field, so I think we need something like the following to sort:
sort=((CurrentUserId = 12345) ? CurrentBid : 0) desc
This is just pseudocode, but the idea is that if the currentUserId in Solr matches the user Id (12345 in this example), then sort by CurrentBid, otherwise, just use 0.
It seems like doing a sort by query might be the way to go with achieving this (or at least form part of the solution), using something like the following query:
http://localhost:8080/solr/select/?q=:&sort=query(CurrentUserId:10330 AND CurrentBid:[1 TO *])+desc
This doesn't seem to be working for me though, and results in the following error:
sort param could not be parsed as a query, and is not a field that exists in the index: ...
The Solr documentation indicates that the query function can be used as a sort parameter from Solr 1.4 onwards, so this seems like it should work.
Any advice on how to go about achieving this would be greatly appreciated.
According to the Solr Documentation link you provided,
Any type of subquery is supported through either parameter dereferencing $otherparam or direct specification of the query string in the LocalParams via "v".
So based on the examples and your query, I think one or both of the following should work:
http://localhost:8080/solr/select/?q=:&sort=query($qq)+desc&qq=(CurrentUserId:10330 AND CurrentBid:[1 TO *])
http://localhost:8080/solr/select/?q=:&sort=query({v='CurrentUserId:10330 AND CurrentBid:[1 TO *]'})+desc

Resources