I'm new to the elastic-cloud interface. It allows to chooose operations get, post, put and del. I'm trying to submit queries, but I don't know the precise syntax. For instance:
tweet/_search?q=something
works, but:
tweet/_search?q={ "match_all": {} }
does not, returning a parser error. I have tried with double quotes, but it seems that then it searches for the query as a string.
The preferred way to test the search APIs are using the POST method, GET API in some case, gives even incorrect search results as it ignores the search and brings the top 10 search results for match_all query.
Elasticsearch supports both methods GET and POST to search but using the GET method which has payload information isn't common on modern app-severs, although Elasticsearch implemented it requires carefully crafting your queries.
Still, if you want to use the GET API, then for complex queries its better to send it as part of request body, I know it sounds weird to send a body to GET request but it works 😀 .
Related
I'm trying to create a small app that displays some simple visualizations from data indexed on Elasticsearch (on an AWS managed Elasticsearch service).
Since, to the best of my knowledge, the degree of access control that AWS offers over its ES service is based on allowing specific HTTP verbs (GET, POST, etc), to simplify my life and the ES admin's, I'm granting this app "read only" permissions, so only GET and HEAD.
However, I see that for its search API, ES exposes a GET endpoint that works with query string parameters, and a POST endpoint that works with a JSON based "Query DSL". This DSL seems to be the preferred method in all examples I have seen online and in the books.
Given the predominance of the Query DSL throughout the documentation, I was wondering:
Does the the Query DSL exposes functionality that standard query string parameters don't, or are they both functionally equivalent?
Does the POST search endpoint result in any data being actually POSTED, or is this only a workaround to allow to send JSON as a query that breaks a little bit with REST conventions?
As per the docs
You can use query parameters to define your search criteria directly in the request URI, rather than in the request body. Request URI searches do not support the full Elasticsearch Query DSL, but are handy for testing.
The GET behavior is slightly confusing but even Kibana sends a POST in the background when you perform a GET with a body. If you have to use GET, some query results might be unexpected. What's your exact use case? Which queries are we talking?
FYI more useful info is here and here.
Did the Explain endpoint ever support search_type: dfs_query_then_fetch? If it does now (I'm on 7.1), how do I specify it?
I was thrown for a loop when using the Explain API on two identical documents, but seeing different score calculations. Learning the documents lived in different shards, and that the TF/IDF inputs were calculated per-shard explained the difference. Using dfs_query_then_fetch on the Search API normalized the scores, but the ElasticSearch .net client (both LowLevel and NEST) don't appear to expose a way to specify it for calls to the Explain API.
I also tried to form a request manually, passing it as a querystring or request body parameter. Both fail saying the argument is invalid. I thought perhaps the Explain endpoint didn't offer a way to specify dfs_query_then_fetch, but digging through some old issues it appears that it at least did at some point:
https://github.com/elastic/elasticsearch/issues/2612
Search type is not supported on the explain API. An approach that might work would be to use the Search API with dfs_query_then_fetch and explain, with a compound query that filters only to the document you're interested in (using IdsQuery), along with the query you want the explanation for.
I need to have the sort configuration to be part of the response, so lets assume i have this query
{ sort: [ {"name":"asc"},{"age:"descr"}]}
I would need to have this as part of the response to synchronise my facets / ui state with that sorting. I see there is a "sort" response field, but it basically lists the values which have been picked for the sort, but not which field and which sort type.
Reading the docs i am not sure it should be the case https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html#_sort_values
Not able to find anything about this in the web, a lot of how to sort examples, also on stack, but nothing about how to reflect the sort in the response.
If it matters, i am currently using Elasticsearch 2.4
This is in fact not possible out of the box.
I solved this using my middleware. So when a client wants to search on ES, it happens this way
client -> middleware -> ES
To include sorting in the response, the middleware does something like this
result = es.search(query)
result['sort'] = query['sort'] if query.key?('sort')
return result
So i copy the sort field from the request into the response and this is in fact very useful for the client, when dealing with aggregations / facetted searches
I'm trying to debug an ElasticSearch query. I've enabled explain for the problematic query, and that is showing that the query is doing a product of intermediate scores where it should be doing a sum. (I'm creating the query request using elastic4s.)
The problem is I cannot see what the generated query actually is. I want to determine whether the bug is in elastic4s (generating the query request incorrectly), in my code, or in elasticsearch. So I've enabled logging for the embedded elasticsearch instance used in the tests using the following code:
ESLoggerFactory.setDefaultFactory(new Slf4jESLoggerFactory())
val settings = Settings.settingsBuilder
.put("path.data", dataDirPath)
.put("path.home", "/var/elastic/")
.put("cluster.name", clusterName)
.put("http.enabled", httpEnabled)
.put("index.number_of_shards", 1)
.put("index.number_of_replicas", 0)
.put("discovery.zen.ping.multicast.enabled", false)
.put("index.refresh_interval", "10ms")
.put("script.engine.groovy.inline.search", true)
.put("script.engine.groovy.inline.update", true)
.put("script.engine.groovy.inline.mapping", true)
.put("index.search.slowlog.threshold.query.debug", "0s")
.put("index.search.slowlog.threshold.fetch.debug", "0s")
.build
but I can't find any queries being logged in the log file configured in my logback.xml. Other log messages from elasticsearch are appearing there, just not the actual queries.
You can't, at least not directly, at least not in ES versions currently available. It's something that has been discussed at some length (eg https://github.com/elastic/elasticsearch/issues/9172 and https://github.com/elastic/elasticsearch/issues/12187) it seems like this may change soon, with the rewrite of the tasks API. In the meantime, you can use things like ES Restlog (https://github.com/etsy/es-restlog) and/or put nginx in front of ES and capture the queries in the nginx logs. You can also use tcpdump (eg tcpdump -vvv -x -X -i any port 9200) and capture the query as it's running on the server. One last option is to modify your application and echo the query instead of executing it (and/or inserting the query into ES itself before you execute it, since the query itself is JSON).
In the specific case of elastic4s, it offers the ability to call .show on the elastic4s query object to generate what the JSON body part of the request would have been if the JSON-over-HTTP protocol had been used to send the request, for most types of request. This can then be logged at a convenient point in your code, e.g. if you have one method that generates all ES search queries. The code in Elasticsearch that generates the fake JSON could still have bugs of course, so it should not entirely be trusted. However, it's worth trying to reproduce the issue with the output of .show using Sense against a real Elasticsearch cluster over HTTP - if you can, you (a) know that it's not an elastic4s bug, and (b) can easily manipulate the JSON to try to figure out what's causing the problem.
show calls toString in some cases, so with the plain Elasticsearch API or another JVM-based wrapper on top of it, you can call that to get the JSON string to log.
With embedded Elasticsearch, this is as good as you're going to get in terms of logging - short of putting a breakpoint on the builder invocations and observing the actual Java Elasticsearch request objects that are created (which is the most accurate approach).
The official reference states that one can send _search requests also through POST instead of GET because not all clients support sending bodys with GET (see https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html). You can then insert the query parameters from the URL also as JSON directly in the body.
Now I wonder: is this true for all GET requests that Elasticsearch offers that need query parameters?
For example, the _stat endpoint (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-stats.html) is documented as a GET request (which makes sense), but supports URI parameters. Is it safe to use POST in this case as well and pass the parameters in the body using JSON?
No, the _search endpoint is one of a few special cases. If you look at the source code for the _stats endpoint in RestIndicesStatsAction.java, you can see that only the GET HTTP method is supported.
Using the POST method usually makes sense only when the payload to be sent can be substantially big, which is not the case for the few parameters such as the ones accepted by the _stats endpoint. In that case, sending those parameters in the query string is usually more than sufficient.