Elasticsearch with NativeSearchQueryBuilder: spaces and uppercase

I'm using the following code to filter with the Elasticsearch Java API. It works fine and returns results for a simple term, but if the text contains spaces or uppercase letters it returns no data.
If I use
String query = "{\"bool\":{\"should\":[{\"term\":{\"name\":\"test\"}}]}}";
it returns data, but with
String query = "{\"bool\":{\"should\":[{\"term\":{\"name\":\"test airportone\"}}]}}";
or
String query = "{\"bool\":{\"should\":[{\"term\":{\"name\":\"TEST\"}}]}}";
it returns no data.
String query = "{\"bool\":{\"should\":[{\"term\":{\"name\":\"test airport one\"}}]}}";
BoolQueryBuilder bool = new BoolQueryBuilder();
bool.must(new WrapperQueryBuilder(query));
SearchQuery searchQuery = new NativeSearchQueryBuilder()
    .withQuery(bool)
    .build();
Page<Asset> asset = elasticsearchTemplate.queryForPage(searchQuery, Asset.class);
return asset.getContent();

You have two options, depending on your use case.
First option: use match instead of term if you want to take advantage of Elasticsearch's full-text search capabilities.
{
  "bool": {
    "should": [{
      "match": {
        "name": "test airportone"
      }
    }]
  }
}
Second option: specify that the name field is not analyzed when mapping your index, so Elasticsearch stores it as-is and only exact matches are returned.
"mappings": {
  "user": {
    "properties": {
      "name": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}
(Note: "string" with "index": "not_analyzed" is pre-5.x syntax; on current versions use "type": "keyword" instead.)
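To see why the original queries behave this way, here is a minimal Python sketch of what the standard analyzer does to a text field at index time (a simplified illustration, not the real Lucene analyzer; the sample field value is assumed):

```python
import re

def standard_analyze(value):
    # Rough approximation of the standard analyzer:
    # split into words and lowercase each token.
    return [t.lower() for t in re.findall(r"\w+", value)]

# Suppose a document's name field holds "Test Airportone" (assumed sample).
tokens = standard_analyze("Test Airportone")
print(tokens)  # ['test', 'airportone']

# A term query is NOT analyzed: its raw input is compared to single tokens.
print("test" in tokens)              # True  -> returns data
print("TEST" in tokens)              # False -> tokens were lowercased
print("test airportone" in tokens)   # False -> the value is two tokens
```

A match query, by contrast, analyzes its input the same way before comparing, which is why it finds the document.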

Related

Elasticsearch "data": { "type": "float" } query returns incorrect results

I have a query like the one below. When the date_partition field is "type": "float", queries for 20220109, 20220108 and 20220107 all return results.
When the field is "type": "long", only the 20220109 query returns a result, which is what I want.
For each of the queries below, the result comes back as if 20220109 had been sent:
--> 20220109, 20220108, 20220107
PUT date
{
  "mappings": {
    "properties": {
      "date_partition_float": {
        "type": "float"
      },
      "date_partition_long": {
        "type": "long"
      }
    }
  }
}

POST date/_doc
{
  "date_partition_float": "20220109",
  "date_partition_long": "20220109"
}
# this returns the document
GET date/_search
{
  "query": {
    "match": {
      "date_partition_float": "20220108"
    }
  }
}

# this returns nothing
GET date/_search
{
  "query": {
    "match": {
      "date_partition_long": "20220108"
    }
  }
}
Is this a bug, or is this how the float type works?
I have 2 years of data loaded into Elasticsearch (day by day, ~20 GB primary shard size per day, ~15 TB total). What is the best way to change the type of just this field?
I have 5 float fields in my mapping; what is the fastest way to change all of them?
Note: I have the solutions below in mind, but I'm afraid they are slow:
update by query API
reindex API
runtime field in the search request (especially this one)
Thank you!
That date_partition field should have the date type with format yyyyMMdd; that's the only sensible type to use here, not long, and even less float.
PUT date
{
  "mappings": {
    "properties": {
      "date_partition": {
        "type": "date",
        "format": "yyyyMMdd"
      }
    }
  }
}
It's not logical to query for 20220108 and have the 20220109 document returned in the results.
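The behavior in the question is single-precision rounding rather than a bug: a 32-bit float has a 24-bit significand, so above 2**24 (16,777,216) consecutive integers are no longer representable and neighboring "dates" collapse to the same stored value. A quick stdlib sketch of the rounding:

```python
import struct

def as_float32(x):
    # Round-trip through a 32-bit float, the way a "float" field stores numbers.
    return struct.unpack("f", struct.pack("f", float(x)))[0]

# 20220107, 20220108 and 20220109 all round to the same 32-bit value,
# which is why all three match queries returned the same document:
print(as_float32(20220107))  # 20220108.0
print(as_float32(20220108))  # 20220108.0
print(as_float32(20220109))  # 20220108.0
```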
Using the date type would also allow you to use proper time-based range queries and create date_histogram aggregations on your data.
You can either recreate the index with the adequate type and reindex your data, or add a new field to your existing index and update it by query. Both options are valid.
This thread may be the answer to my question => https://discuss.elastic.co/t/elasticsearch-data-type-float-returns-incorrect-results/300335

Kibana search pattern issue

I am trying to create an Elasticsearch query for one of my library projects. I am trying to use a regex but I do not get any results. This is the query:
GET /manifestation_v1/_search
{
  "query": {
    "regexp": {
      "bibliographicInformation.title": {
        "value": "python access*"
      }
    }
  }
}
access* ends in a wildcard, so I want a query that treats the input as python access* rather than the literal python access.
Can anyone who already has some experience with Kibana help me out?
You can try a wildcard query:
{
  "query": {
    "wildcard": {
      "bibliographicInformation.title": {
        "value": "saba safavi*"
      }
    }
  }
}
You need to run the regexp query on a keyword field and use .* instead of *, e.g.:
GET /manifestation_v1/_search
{
  "query": {
    "regexp": {
      "bibliographicInformation.title": {
        "value": "python access.*"
      }
    }
  }
}
Regex is slower; you can also try a prefix query:
{
  "query": {
    "prefix": {
      "bibliographicInformation.title": {
        "value": "python access"
      }
    }
  }
}
If the field is of nested type then you need to use a nested query.
Update
For a "text" type field, the value is stored as tokens, i.e. "python access" is stored as ["python", "access"]. Your query is trying to match "python access*" against each of these tokens individually. You need to query against the keyword field, which is stored as the single value "python access".
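The update above can be sketched with Python's re module (Elasticsearch regexp queries are anchored, i.e. the pattern must match the entire term):

```python
import re

pattern = "python access.*"           # the corrected regexp from the answer

text_tokens = ["python", "access"]    # what the text field stores
keyword_value = "python access"       # what the keyword field stores

# Against the text field the pattern is tested per token, so it never matches:
print(any(re.fullmatch(pattern, t) for t in text_tokens))  # False

# Against the keyword field the whole value is a single term, so it matches:
print(bool(re.fullmatch(pattern, keyword_value)))          # True
```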

Elasticsearch: why can't find by `term` query but can find by `match` query?

I am using Elasticsearch 11 to query text.
I have the query below, but it doesn't return any documents:
POST /_search
{
  "query": {
    "term": {
      "metric_name": { "value": "ConsumedReadCapacityUnits", "boost": 1.0 }
    }
  }
}
Then I change it to a match query like below, which does find the matching document:
POST /_search
{
  "query": {
    "match": {
      "metric_name": "ConsumedReadCapacityUnits"
    }
  }
}
Based on the term query docs, it matches an exact term, and ConsumedReadCapacityUnits is the exact value of metric_name, so why does the term query return nothing?
A match query analyzes the search term with the field's analyzer (the standard analyzer if none is specified) and then matches the analyzed terms against the terms stored in the inverted index. By default a text field uses the standard analyzer; e.g. SchooL gets analyzed to school.
A term query returns documents that contain an exact term in a provided field. If you have not defined any explicit index mapping, you need to add .keyword to the field name; this targets the keyword sub-field, which is stored without analysis.
As mentioned in the comments above, the mapping type of ConsumedReadCapacityUnits is text, so you can enable term queries on it by updating your index mapping.
If you want to store the ConsumedReadCapacityUnits field as both text and keyword type, you can update your index mapping as shown below to use multi-fields:
PUT /_mapping
{
  "properties": {
    "ConsumedReadCapacityUnits": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword"
        }
      }
    }
  }
}
Then reindex the data. After this, you will be able to run term queries against the "ConsumedReadCapacityUnits.keyword" field (keyword type) and full-text queries against "ConsumedReadCapacityUnits" (text type).
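For comparison with the original query in the question (where metric_name is the field and ConsumedReadCapacityUnits its value), the same multi-field approach applied to metric_name would make the term query succeed in this form (a sketch, assuming a metric_name.keyword sub-field exists):

```
POST /_search
{
  "query": {
    "term": {
      "metric_name.keyword": { "value": "ConsumedReadCapacityUnits", "boost": 1.0 }
    }
  }
}
```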
Or, the other way is to create a new index with the mapping below:
{
  "mappings": {
    "properties": {
      "ConsumedReadCapacityUnits": {
        "type": "keyword"
      }
    }
  }
}
and then index the data into this new index.

How can I configure multi-fields support for undeclared new attribute in Elasticsearch Indexing?

By default, Elasticsearch adds any new string attribute to the index mapping with type text, but I need multi-field support (text and keyword).
Your mapping is correct and it can be used for both match and term queries; you can find more info about multi-fields here.
For a full-text query you can use:
{
  "query": {
    "match": {
      "dynamic_field001": "search"
    }
  }
}
and for a term query:
{
  "query": {
    "term": {
      "dynamic_field001.keyword": "search"
    }
  }
}
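For attributes that are not declared up front, one way to get text plus keyword for every new string field is a dynamic template in the index mapping (a sketch; the index and template names here are arbitrary). This mirrors what recent Elasticsearch versions already do by default for dynamically mapped strings:

```
PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_multifields": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    ]
  }
}
```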

Elastic query bool must match issue

Below is the query part of an Elasticsearch GET API call run from the command line inside an OpenShift pod. In a fetch of 2000 documents I get non-matching elements as well as those matching the match query. How can I limit the results to only the matching elements?
I specifically want to get only documents matching {\"kubernetes.container_name\":\"xyz\"}.
Any suggestions will be appreciated.
-d ' {\"query\": { \"bool\" :{\"must\" :{\"match\" :{\"kubernetes.container_name\":\"xyz\"}},\"filter\" : {\"range\": {\"#timestamp\": {\"gte\": \"now-2m\",\"lt\": \"now-1m\"}}}}},\"_source\":[\"#timestamp\",\"message\",\"kubernetes.container_name\"],\"size\":2000}'"
For exact matches there are two things you need to do:
Make use of term queries.
Ensure that the field is of the keyword datatype.
The text datatype goes through an analysis phase. For example, if your data is This is a beautiful day, then during ingestion the text is broken into tokens which are lowercased to [this, is, a, beautiful, day] and added to the inverted index. This happens via the Standard Analyzer, the default analyzer applied to text fields.
When you query, the analyzer is applied again at query time, and the resulting words are looked up in the respective documents. As a result you see documents appearing even without an exact match.
In order to do an exact match, you need to use a keyword field, as it does not go through the analysis phase.
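The analysis step described above can be sketched in Python (a rough approximation, not the actual Lucene implementation):

```python
import re

def standard_analyze(text):
    # Approximation of the standard analyzer: tokenize on word boundaries,
    # then lowercase every token before it goes into the inverted index.
    return [t.lower() for t in re.findall(r"\w+", text)]

print(standard_analyze("This is a beautiful day"))
# ['this', 'is', 'a', 'beautiful', 'day']
```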
What I'd suggest is to create a keyword sibling field for the text field as shown below, and then re-ingest all the data:
Mapping:
PUT my_sample_index
{
  "mappings": {
    "properties": {
      "kubernetes": {
        "type": "object",
        "properties": {
          "container_name": {
            "type": "text",
            "fields": {            <--- Note this
              "keyword": {         <--- This is the container_name.keyword field
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}
Note that I'm assuming you are making use of object type.
Request Query:
POST my_sample_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "kubernetes.container_name.keyword": {
              "value": "xyz"
            }
          }
        }
      ]
    }
  }
}
Hope this helps!
