I am passing some parameters via GET to limit the query result, and a query string is also passed in the URL. In addition, I am sending a request body to filter the results.
curl -XGET 'http://localhost:9200/books/fantasy/_search?from=0&size=10&q=%2A' -d '{
"query":{
"filtered":{
"filter":{
"exists":{
"field":"speacial.ean"
}
}
}
}
}'
I just want to check whether this approach is okay. Are there any downsides to doing it like this? Or should I avoid passing parameters in the URL when a body is used?
This seems to work, but is it bad practice?
GET requests are not supposed to carry a body (more information on this here). While curl might convert your GET requests with a body to POST, many tools might simply drop the body, or it might be sent to Elasticsearch but ignored because you used GET.
When I execute this query in Sense, I get all the documents instead of only those matching my query, which shows that the body has been ignored:
GET myIndex/_search
{
"query": {
"match": {
"zlob": true
}
}
}
This example shows that you should avoid using GET for requests with a body, because the result will depend on the tool you use for your REST queries.
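To keep the result independent of the client, the same search can be sent as an explicit POST, with the paging parameters moved into the body. Below is a minimal Python sketch, assuming a node at http://localhost:9200 and the books/fantasy path from the question. build_search_request is a hypothetical helper, and nothing is sent until the request is handed to urlopen:

```python
import json
import urllib.request


def build_search_request(base_url, path, query, from_=0, size=10):
    """Build (but do not send) a POST _search request carrying a JSON body."""
    body = {"from": from_, "size": size, "query": query}
    return urllib.request.Request(
        url=f"{base_url}/{path}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same filter as in the question; base URL and path are assumptions.
request = build_search_request(
    "http://localhost:9200",
    "books/fantasy",
    {"filtered": {"filter": {"exists": {"field": "speacial.ean"}}}},
)

# To run it against a live node (assumption: one is listening locally):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Because the method is stated explicitly, no tool along the way has to guess what to do with a GET that carries a body.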
The simplest example:
GET /_search
{
"from" : 0, "size" : 10,
"query" : {
"term" : { "user" : "kimchy" }
}
}
Rewritten without a request body, as a URI search:
GET /_search?from=0&size=10&q=user:kimchy
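For illustration, the query string of such a URI search can be assembled with Python's standard library, which also takes care of percent-encoding. term_to_uri_search is a hypothetical helper name; note that q= is parsed as a query string query, so it is only roughly equivalent to the term query in the body version:

```python
from urllib.parse import parse_qs, urlencode


def term_to_uri_search(field, value, from_=0, size=10):
    """Return the query string for a URI search matching field:value."""
    return urlencode({"from": from_, "size": size, "q": f"{field}:{value}"})


query_string = term_to_uri_search("user", "kimchy")
print(f"/_search?{query_string}")  # /_search?from=0&size=10&q=user%3Akimchy
```

The colon is encoded as %3A, which the server decodes back to user:kimchy.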
Is it possible to rewrite a Search Template request such as this one in the same way:
GET /_search/template
{
"id": "sample_id_script",
"params": {
"gte": "2020-10-15 00:00:00",
"lte": "2020-10-15 23:59:59"
}
}
Yes, it's possible via the source query string parameter!! You simply need to inline your JSON body and add the other &source_content_type=application/json query string parameter, and voilà!
GET /_search/template?source={"id": "sample_id_script","params": {"gte": "2020-10-15 00:00:00","lte": "2020-10-15 23:59:59"}}&source_content_type=application/json
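Typed by hand, the inlined JSON contains spaces, quotes, and braces that most HTTP clients expect to be percent-encoded. Here is a small Python sketch of building that path safely; template_search_path is a hypothetical helper, and the template id and parameters are the ones from the question:

```python
import json
from urllib.parse import parse_qs, urlencode


def template_search_path(template_id, params):
    """Inline a search template call into the `source` query string parameter."""
    body = {"id": template_id, "params": params}
    query = urlencode({
        # Compact separators keep the encoded URL as short as possible.
        "source": json.dumps(body, separators=(",", ":")),
        "source_content_type": "application/json",
    })
    return f"/_search/template?{query}"


path = template_search_path(
    "sample_id_script",
    {"gte": "2020-10-15 00:00:00", "lte": "2020-10-15 23:59:59"},
)
print(path)
```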
Please note, though, that it's not the same concept as the example you're showing. In your example, we're hitting the _search endpoint and sending a query (i.e. using q=) expressed in the Lucene Expression language. It's basically the equivalent of what you would send in a query_string query.
The second case is different, because you're sending a search template via the _search/template endpoint. So even though the mechanism is the same (i.e. sending a payload via the query string), the semantics are different.
I am trying to find out which specific field actually matched for a search that returned a document.
Ex. I have a table index where there are fields called table_name and column_name...
My search query looks at both of those fields. If I fire a search query and either of them matches, I want to know which one matched: column_name or table_name.
I am aware of the Explain API but that will require me to call another API...
You don't need to call the explain API. The search API supports the explain flag:
GET stackoverflow/_search?explain=true
This will return the _explanation section along with the _source section.
Update
Another solution would be to use highlight. I've used this before, for manually evaluating queries. It's an easy way to get some feedback on what matched
GET stackoverflow/_search
{
"query": {
"match": {
"FIELD": "TEXT"
}
},
"highlight": {
"fields": {
"*": {}
}
}
}
Of course, you can have the explain flag set as well.
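On the client side, the field names that matched are simply the keys of each hit's highlight object. A minimal Python sketch, assuming the response has already been parsed into a dict; matched_fields and the sample hit are made up for illustration:

```python
def matched_fields(hit):
    """Return the names of the fields that produced highlight fragments."""
    return sorted(hit.get("highlight", {}))


# Hypothetical hit, shaped like one entry of hits.hits in a search response.
sample_hit = {
    "_id": "1",
    "_source": {"table_name": "orders", "column_name": "customer"},
    "highlight": {"column_name": ["<em>customer</em>"]},
}

print(matched_fields(sample_hit))  # ['column_name']
```

A hit without a highlight section yields an empty list, which means none of the highlighted fields produced a fragment.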
This seemingly simple task is not well-documented in the ElasticSearch documentation:
We have an ElasticSearch instance with an index that has a field in it called sourceId. What API call would I make to first, GET all documents with 100 in the sourceId field (to verify the results before deletion) and then to DELETE same documents?
You probably need to make two API calls here: the first to view the count of documents, the second to perform the deletion.
The query is the same in both cases, but the endpoints are different. I'm also assuming that sourceId is of type keyword.
Query to Verify
POST <your_index_name>/_search
{
"size": 0,
"query": {
"term": {
"sourceId": "100"
}
}
}
Execute the above Term Query and take note of hits.total in the response.
Remove the "size":0 from the above query if you want to see the full documents in the response.
Once you have the details, you can go ahead and perform the deletion using the same query, as shown below. Notice the different endpoint, though.
Query to Delete
POST <your_index_name>/_delete_by_query
{
"query": {
"term": {
"sourceId": "100"
}
}
}
Once you execute the Delete By Query, check the deleted field in the response. It should show the same number.
I've used a term query here, but you can also use a Match query or any complex Bool query. Just make sure that the query is correct.
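The two steps can be sketched in Python as follows. This is only an outline under assumptions: send is a hypothetical callable that performs the HTTP request and returns the parsed JSON, and fake_send stands in for it with canned responses. hits.total is read in both shapes, since it is a plain number in older versions and an object with a value field in 7.x and later:

```python
def total_hits(search_response):
    """Read hits.total, which is an int in older versions and an object in 7.x+."""
    total = search_response["hits"]["total"]
    return total["value"] if isinstance(total, dict) else total


def verify_then_delete(send, index, query):
    """Count the matches, delete them, and report whether the numbers agree.

    `send` is a hypothetical callable (method, path, body) -> parsed JSON
    response; plug in whatever HTTP client you use.
    """
    body = {"query": query}
    expected = total_hits(send("POST", f"/{index}/_search", {"size": 0, **body}))
    deleted = send("POST", f"/{index}/_delete_by_query", body)["deleted"]
    return expected, deleted, expected == deleted


# Stand-in transport returning canned responses, for illustration only.
def fake_send(method, path, body):
    if path.endswith("/_search"):
        return {"hits": {"total": {"value": 3, "relation": "eq"}, "hits": []}}
    return {"deleted": 3}


result = verify_then_delete(fake_send, "my_index", {"term": {"sourceId": "100"}})
print(result)  # (3, 3, True)
```

In practice you would inspect the count (and probably the documents) before triggering the delete, rather than running both calls back to back.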
Hope it helps!
POST /my_index/_delete_by_query?conflicts=proceed&pretty
{
"query": {
"match_all": {}
}
}
Delete all the documents of an index without deleting the mapping and settings:
See: https://opster.com/guides/elasticsearch/search-apis/elasticsearch-delete-by-query/
POST http://localhost:9200/test2/drug?pretty
{
"title": "I can do this"
}
get test2/drug/_search
{
"query" : {
"match": {
"title": "cancer"
}
}
}
The mappings are:
{
"test2": {
"mappings": {
"drug": {
"properties": {
"title": {
"type": "string"
}
}
}
}
}
}
Running the above query returns the document. I want to understand what Elasticsearch is doing behind the scenes. Judging by the output of the default analyzer, it does not tokenize cancer in a way that produces "can", so why is a document containing the word "can" returned? In other words, what other processing is applied to the search query "cancer"?
Updated
Is there a command I can run on my box that will clear all indexes and everything so I have a clean slate? I ran delete /* which succeeded but still getting a match.
If you are using Sense, the problem with your test is the lowercase get request. In Sense it should be GET (capital letters).
The explanation is related to GET vs. POST http methods.
Behind the scenes, Sense actually converts a GET request to an HTTP POST (given that many browsers do not support HTTP GET requests with a request body). This means that, even if you write GET, the actual HTTP request is a POST.
Because Sense's autocomplete forces upper-case letters for request methods, it also uses upper-case letters when deciding whether a request with a body is a GET (and not a lowercase get). If it is, the request is transformed into a POST. If the comparison decides it is not a GET, the request is sent as is, meaning with a get method and with a body. Since the body is ignored, what reaches Elasticsearch is a test2/drug/_search, which is basically a match_all.
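The behaviour described above can be modelled in a few lines of Python. This is a toy model of the explanation, not Sense's actual source code, and effective_request is a made-up name:

```python
def effective_request(method, path, body=None):
    """Model the described rule: only an exact upper-case GET that carries
    a body is rewritten to POST; anything else is passed through unchanged."""
    if method == "GET" and body is not None:
        return ("POST", path, body)
    return (method, path, body)


query = {"query": {"match": {"title": "cancer"}}}

print(effective_request("GET", "test2/drug/_search", query)[0])  # POST
print(effective_request("get", "test2/drug/_search", query)[0])  # get
```

In the second case the request still leaves as a get with a body, and it is that body which ends up being ignored.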
I guess that you configured in your index mappings an NGram filter or tokenizer. Let's suppose (I hope you'll confirm my hypothesis) an Edge NGram is configured. You can check it with:
GET test2/_mapping
Then the document is tokenized: i,c,ca,can,d,do,t,th,thi,this. As a result, in the index, the token can points to the document I can do this
When you're searching cancer, the tokens c,ca,can,canc,cance,cancer are produced by the same analysis chain, and then looked for in the index. As a result your document is found.
With the NGram filter, you often need to configure a different analyzer for search than for indexing, for instance:
index_analyzer/analyzer: standard + edge ngram
search_analyzer: standard alone
Then if you search can you'll find documents containing can,cancer,candy... But if you search cancer, you'll only find documents containing cancer,cancerology... and so on.
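To make the difference concrete, here is a rough Python simulation of the two analysis chains. It is only an approximation under assumptions: lowercasing plus whitespace splitting stands in for the standard analyzer, and min_gram=1 and max_gram=10 are assumed settings, not values taken from the question:

```python
def standard_tokens(text):
    """Crude stand-in for the standard analyzer: lowercase, split on whitespace."""
    return set(text.lower().split())


def edge_ngrams(text, min_gram=1, max_gram=10):
    """Emit every prefix of each token, from min_gram up to max_gram characters."""
    tokens = set()
    for word in standard_tokens(text):
        for length in range(min_gram, min(len(word), max_gram) + 1):
            tokens.add(word[:length])
    return tokens


indexed = edge_ngrams("I can do this")

# Same analyzer at search time: "cancer" also emits "can", so it matches.
print(bool(edge_ngrams("cancer") & indexed))      # True
# Standard analyzer at search time: only "cancer" is looked up, so no match.
print(bool(standard_tokens("cancer") & indexed))  # False
```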
So I've set up the following data set so I can test searching on a field storing multiple values:
post /test/participant
{
"Synonyms" : [ "foo" ]
}
post /test/participant
{
"Synonyms" : [ "bar" ]
}
post /test/participant
{
"Synonyms" : [ "foo", "bar" ]
}
I've tried to get some data back by trying something like:
get /test/participant/_search
{
"query": {
"filtered": {
"filter": {
"term": { "Synonyms": "foo" }
}
}
}
}
and I was expecting to get back the first and third records (see order above). However, I keep getting all the records back. I've tried no end of alterations to the query to get something sensible (there's not enough space to add them here), and all I keep getting is all the records in the index. Does anyone have an idea how I would query to get back those records with "foo" as a value (1st and 3rd)? And is there some subtle point I've been missing here? I'm aware that Elasticsearch does not store the values as an array but as an unordered collection.
I think you are running these queries in Sense, right?
The commands you need are these:
POST /test/participant
{"Synonyms":["foo"]}
POST /test/participant
{"Synonyms":["bar"]}
POST /test/participant
{"Synonyms":["foo","bar"]}
GET /test/participant/_search
{
"query": {
"filtered": {
"filter": {
"term": {
"Synonyms": "foo"
}
}
}
}
}
The explanation is related to GET vs. POST http methods.
Behind the scenes, Sense actually converts a GET request to an HTTP POST (given that many browsers do not support HTTP GET requests with a request body). This means that, even if you write GET, the actual HTTP request is a POST.
Because Sense's autocomplete forces upper-case letters for request methods, it also uses upper-case letters when deciding whether a request with a body is a GET (and not a get). If it is, the request is transformed into a POST. If the comparison decides it is not a GET, the request is sent as is, meaning with a get method and with a body. Since the body is ignored, what reaches Elasticsearch is a get /test/participant/_search, which is basically a match_all and, of course, returns all documents :-).