Query fetching case-sensitive results in elasticsearch - elasticsearch

I have a field like this in my indexed documents
"screen_name : "9GAG"
And this is my query:
{
"query": {
"term": {
"screen_name": "9gag"
}
}
}
Im getting zero hits. But when I replace "9gag" with "9GAG" it works fine. Why is this happening and how can this be fixed?

Related

How to compare two date fields in same document in elasticsearch

In my elastic search index, each document will have two date fields createdDate and modifiedDate. I'm trying to add a filter in kibana to fetch the documents where the modifiedDate is greater than createdDate. How to create this filter in kibana?
Tried Using below query instead of greater than it is considering as gte and fetching all records
GET index/_search
{
"query": {
"bool": {
"filter": {
"script": {
"script" : {
"inline" : "doc['modifiedTime'].value.getMillis() > doc['createdTime'].value.getMillis()",
"lang" : "painless"
}
}
}
}
}
}
There are a few options.
Option A: The easiest and most performant one is to store the difference of the two fields inside a new field of your document, e.g.
{
"createDate": "2022-01-11T12:34:56Z",
"modifiedDate": "2022-01-11T12:34:56Z",
"diffMillis": 0
}
{
"createDate": "2022-01-11T12:34:56Z",
"modifiedDate": "2022-01-11T12:35:58",
"diffMillis": 62000
}
Then, in Kibana you can query on diffMillis > 0 and figure out all documents that have been modified after their creation.
Option B: You can use a script query
GET index/_search
{
"query": {
"bool": {
"filter": {
"script": {
"script": """
return doc['createdDate'].value.millis < doc['modifiedDate'].value.millis;
"""
}
}
}
}
}
Note: depending on the amount of data you have, this option can potentially have disastrous performance, because it needs to be evaluated on ALL of your documents.
Option C: If you're using ES 7.11+, you can use runtime fields directly from the Kibana Discover view.
You can use the following script in order to add a new runtime field (e.g. name it diffMillis) to your index pattern:
emit(doc['modifiedDate'].value.millis - doc['createdDate'].value.millis)
And then you can add the following query into your search bar
diffMillis > 0

ElasticSearch query text

My index data is
{
"full_name":"Edwin Powell Hubble",
"job": "IT"
}
{
"full_name":"John Edwin",
"job": "Accountant"
}
{
"first_name":"Eric Petterson",
"job": "Accountant"
}
I am not sure if anyone could help me to build a query to get data that have full_name as Edwin. It tried with term query seem not really work.
Since full_name can be of any length and should be analyzed when indexed, I believe you have mapped the attribute as of type text.
For the same reason I also believe you will have requirements to return results as 'Edwin Powell Hubble' and 'John Edwin' when searched with 'Edwin' and return 'Edwin Powell Hubble' when search with 'Edwin Pow'
match_phrase_prefix should help you with these use cases.
GET /_search
{
"query": {
"match_phrase_prefix": {
"full_name": "Edwin"
}
}
}
You can use the match query to get data that have full_name as Edwin
{
"query": {
"match": {
"full_name": "edwin"
}
}
}
Term query works on exact text match, so you will not get any document for Edwin since there is no data in your sample index data that have a match for full_name as Edwin

Delete all index except one/some in Elasticsearch?

Is there any way to delete all indices except one?
We can use the metadata _index of document in a GET request:
GET _count
{
"query": {
"match": {
"_index": "indexname"
}
}
}
The above query doesn't make sense but just to show that we can use _index inside a query I have mentioned it.
I have tried the below query, but I guess _all API doesn't support query.
DELETE _all
{
"query" : {
"bool" : {
"must_not" : [
{
"match": {
"_index": "indexname"
}
}
]
}
}
}
Is there any way to delete all indices except one/some without using bulk API ?
Try to use multiple indices syntax. You can specify all indices with * and then exclude some of them with -.
Suppose we need to remove all indices except foo and bar, so the HTTP request should be
curl -X DELETE -i 'http://{server}:{port}/*,-foo,-bar'

Why is my very simple ElasticSearch query failing, SearchPhaseExecutionException

Im trying to do a search for "dog chew" in the invention-title field in my PatentGrants type.
query url: POST http://localhost:9200/patents/patentGrants/_search
query body:
{
"query": {
"match_all": {
"invention-title": "dog chew"
}
}
}
Below is a picture of the data in my patents index and below that is a picture of my query and the error message.
Try this:
{
"query": {
"match": {
"inventionTitle": "dog chew"
}
}
}
The field name in the screenshot is inventionTitle not invention-title.
https://www.elastic.co/guide/en/elasticsearch/reference/1.6/query-dsl-match-all-query.html - use match instead of match_all. match_all doesn't accept a search query.

Filter facet returns count of all documents and not range

I'm using Elasticsearch and Nest to create a query for documents within a specific time range as well as doing some filter facets. The query looks like this:
{
"facets": {
"notfound": {
"query": {
"term": {
"statusCode": {
"value": 404
}
}
}
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"time": {
"from": "2014-04-05T05:25:37",
"to": "2014-04-07T05:25:37"
}
}
}
]
}
}
}
In the specific case, the total hits of the search is 21 documents, which fits the documents within that time range in Elasticsearch. But the "notfound" facet returns 38, which fits the total number of ErrorDocuments with a StatusCode value of 404.
As I understand the documentation, facets collects data from withing the search. In this case, the "notfound" facet should never be able to return a count higher that 21.
What am I doing wrong here?
There's a distinct difference between filter/query/filtered_query/facet filter which is good to know.
Top level filter
{
filter: {}
}
This acts as a post-filter, meaning it will filter the results after the query phase has ended. Since facets are part of the query phase filters do not influence the documents that are facetted over. Filters do not alter score and are therefor very cacheable.
Top level query
{
query: {}
}
Queries influence the score of a document and are therefor less cacheable than filters. Queries run in the query phase and thus also influence the documents that are facetted over.
Filtered query
{
query: {
filtered: {
filter: {}
query: {}
}
}
}
This allows you to run filters in the query phase taking advantage of their better cacheability and have them influence the documents that are facetted over.
Facet filter
"facets" : {
"<FACET NAME>" : {
"<FACET TYPE>" : {
...
},
"facet_filter" : {
"term" : { "user" : "kimchy"}
}
}
}
this allows you to apply a filter to the documents that the facet is run over. Remember that the it'll be a combination of the queryphase/facetfilter unless you also specify global:true on the facet as well.
Query Facet/Filter Facet
{
"facets" : {
"wow_facet" : {
"query" : {
"term" : { "tag" : "wow" }
}
}
}
}
Which is the one that #thomasardal is using in this case which is perfectly fine, it's a facet type which returns a single value: the query hit count.
The fact that your Query Facet returns 38 and not 21 is because you use a filter for your time range.
You can fix this by either doing the filter in a filtered_query in the query phase or apply a facet filter(not a filter_facet) to your query_facet although because filters are cached better you better use facet filter inside you filter facet.
Confusingly Filter Facets are specified using .FacetFilter() on the search object. I will change this in 1.0 to avoid future confusion.
Sadly: .FacetFilter() and .FacetQuery() in NEST do not allow you to specify a facet filter like you can with other facets:
var results = typedClient.Search<object>(s => s
.FacetTerm(ft=>ft
.OnField("myfield")
.FacetFilter(f=>f.Term("filter_facet_on_this_field", "value"))
)
);
You issue here is that you are performing a Filter Facet and not a normal facet on your query (which will follow the restrictions applied via the query filter). In the JSON, the issue is because of the "query" between the facet name "notfound" and the "terms" entry. This is telling Elasticsearch to run this as a separate query and facet on the results of this separate query and not your main query with the date range filter. So your JSON should look like the following:
{
"facets": {
"notfound": {
"term": {
"statusCode": {
"value": 404
}
}
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"time": {
"from": "2014-04-05T05:25:37",
"to": "2014-04-07T05:25:37"
}
}
}
]
}
}
}
Since I see you have this tagged with NEST as well, in your call using NEST, you are probably using FacetFilter on your search request, switch this to just Facet to get the desired result.

Resources