How can I find the locations that are within a certain range of my input location? - elasticsearch

I have a mapping like below:
{
"my_locations": {
"aliases": {
},
"mappings": {
"_doc": {
"properties": {
"location": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
I know that if the field type of location is "geo_point", then I can use the following geo distance query:
GET /my_locations/_search
{
"query": {
"bool" : {
"must" : {
"match_all" : {}
},
"filter" : {
"geo_distance" : {
"distance" : "200km",
"location" : {
"lat" : 40,
"lon" : -70
}
}
}
}
}
}
I read (in the Elasticsearch documentation and on Stack Overflow) that I cannot change the field type of location from text to geo_point, and I already have a lot of data. So how can I find the locations that are within a certain range of my input location?

First, you need to create a new index with the correct data type:
PUT my_locations_2
{
"mappings": {
"_doc": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
}
Then you can use the reindex API in order to copy the data from the old index to the new one:
POST _reindex
{
"source": {
"index": "my_locations"
},
"dest": {
"index": "my_locations_2"
}
}
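As a rough illustration of what a geo distance filter checks, here is a minimal Python sketch. It assumes the existing text values are "lat,lon" strings (the question does not show the stored format), and the function names and the Earth radius constant are choices made for this example, not anything Elasticsearch exposes. It is useful mainly for sanity-checking a few documents before or after reindexing; Elasticsearch's own distance calculation may differ slightly.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch


def parse_lat_lon(text):
    """Parse a "lat,lon" string (assumed format) and validate the ranges."""
    lat_str, lon_str = text.split(",")
    lat, lon = float(lat_str), float(lon_str)
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"out-of-range coordinates: {text!r}")
    return lat, lon


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_distance(center, locations, max_km):
    """Return the "lat,lon" strings that lie within max_km of center (lat, lon)."""
    hits = []
    for text in locations:
        lat, lon = parse_lat_lon(text)
        if haversine_km(center[0], center[1], lat, lon) <= max_km:
            hits.append(text)
    return hits


if __name__ == "__main__":
    # Same centre and radius as the geo_distance query in the question.
    print(within_distance((40, -70), ["40,-70", "41,-70", "50,-70"], 200))
```

Values that fail `parse_lat_lon` are worth checking before running the reindex, since they may not be accepted by a geo_point field.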

Related

Update "keyword" to "text" field type of an index for inexact words matching in elasticsearch

{
"myindex": {
"mappings": {
"properties": {
"city": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
I tried to update it with the PUT request below on the index, but I still get the above output from _mapping:
{
"_doc" : {
"properties" : {
"city" : {"type" : "text"}
}
}
}
I am not able to query with inexact words because its type is "keyword". For the query below, the actual value in the record is "Mumbai":
{
"query": {
"bool": {
"must": {
"match": {
"city": {
"query": "Mumbi",
"minimum_should_match": "10%"
}
}
}
}
}
}
The mapping below (the one shared in the question) stores 'city' as text and 'city.keyword' as a keyword.
{
"myindex": {
"mappings": {
"properties": {
"city": {
"type": "text", // ==========> Store city as text
"fields": {
"keyword": {
"type": "keyword", // =========> store city.keyword as a keyword
"ignore_above": 256
}
}
}
}
}
}
}
Yours is a use case for fuzzy search, not minimum_should_match.
ES blog post on fuzzy search: https://www.elastic.co/blog/found-fuzzy-search
Try the query below:
{
"query": {
"match": {
"city": {
"query": "mubai",
"fuzziness": "AUTO"
}
}
}
}
minimum_should_match
Minimum number of clauses that must match for a document to be returned.
It signifies the percentage of clauses, not the percentage of the string. Go through the documentation for this parameter to frame the query so that it returns the expected results; a query framed the wrong way returns unexpected results.
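To make the fuzziness idea concrete, here is a small Python sketch of edit-distance matching. It uses plain Levenshtein distance (Elasticsearch also counts a transposition of two adjacent characters as one edit by default, which this sketch does not), and it mirrors the documented "AUTO" thresholds: terms of 0 to 2 characters must match exactly, 3 to 5 allow one edit, and longer terms allow two. The function names are made up for this example.

```python
def levenshtein(a, b):
    """Number of single-character inserts, deletes, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def auto_fuzziness(term):
    """Allowed edits for a term under the "AUTO" setting (default thresholds 3 and 6)."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def fuzzy_match(query, indexed):
    """True if the lowercased query term is within the allowed edits of the indexed term."""
    q, t = query.lower(), indexed.lower()
    return levenshtein(q, t) <= auto_fuzziness(q)


if __name__ == "__main__":
    for query in ("Mumbi", "mubai", "delhi"):
        print(query, fuzzy_match(query, "Mumbai"))
```

Both "Mumbi" and "mubai" are one insertion away from "mumbai", which is why a fuzzy match query finds the record while an exact term lookup does not.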

elasticsearch reindex nested object's element to keyword

I have an index structured like below:
"my_index": {
"mappings": {
"my_index": {
"properties": {
"adId": {
"type": "keyword"
},
"name": {
"type": "keyword"
},
"title": {
"type": "keyword"
},
"creativeStatistics": {
"type": "nested",
"properties": {
"clicks": {
"type": "long"
},
"creativeId": {
"type": "keyword"
}
}
}
}
}
}
}
I need to remove the nested object in a new index and just save the creativeId as a new keyword (to make it clear: I know I will lose the clicks data, and that is not important). This means the final new index schema would be:
"my_new_index": {
"mappings": {
"my_new_index": {
"properties": {
"adId": {
"type": "keyword"
},
"name": {
"type": "keyword"
},
"title": {
"type": "keyword"
},
"creativeId": {
"type": "keyword"
}
}
}
}
}
Right now each row has exactly one creativeStatistics entry, so there is no complexity in selecting one of the creativeIds.
I know it is possible to reindex using Painless scripts, but I don't know how to do that. Any help will be appreciated.
You can do it like this:
POST _reindex
{
"source": {
"index": "my_old_index"
},
"dest": {
"index": "my_new_index"
},
"script": {
"source": "if (ctx._source.creativeStatistics != null && ctx._source.creativeStatistics.size() > 0) {ctx._source.creativeId = ctx._source.creativeStatistics[0].creativeId; ctx._source.remove('creativeStatistics')}",
"lang": "painless"
}
}
You can also create a Pipeline by creating a Script Processor as follows:
PUT _ingest/pipeline/my_pipeline
{
"description" : "My pipeline",
"processors" : [
{ "script" : {
"source": "for (item in ctx.creativeStatistics) { if(item.creativeId!=null) {ctx.creativeId = item.creativeId;} }"
}
},
{
"remove": {
"field": "creativeStatistics"
}
}
]
}
Note that if you have multiple nested objects, the last object's creativeId is the one that is kept. Also, creativeId is only added if a source document has one in its creativeStatistics.
Below is how you can then use the pipeline in a reindex request:
POST _reindex
{
"source": {
"index": "creativeindex_src"
},
"dest": {
"index": "creativeindex_dest",
"pipeline": "my_pipeline"
}
}
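To preview what the reindex script does to a single document, the transformation can be sketched in plain Python. This mirrors the Painless script in the first answer (take the first entry's creativeId, then drop the nested field) and assumes that creativeStatistics is stored as a list of objects in _source; the function name and sample document are hypothetical.

```python
import copy


def flatten_creative(doc):
    """Return a copy of doc with creativeStatistics replaced by a top-level creativeId."""
    out = copy.deepcopy(doc)  # leave the caller's document untouched
    stats = out.get("creativeStatistics")
    if stats:  # not None and not empty, like the null and size() checks in the script
        out["creativeId"] = stats[0].get("creativeId")
        del out["creativeStatistics"]
    return out


if __name__ == "__main__":
    sample = {
        "adId": "a1",
        "name": "n1",
        "title": "t1",
        "creativeStatistics": [{"clicks": 12, "creativeId": "c-42"}],
    }
    print(flatten_creative(sample))
```

Running a few real documents through a function like this before reindexing is a cheap way to confirm the target shape matches the new mapping.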

How to search an Elasticsearch index by partial text of a field in the indexed document?

I have an Elasticsearch index where I keep certain data. Each document in the index has a field named file_name in a nested document. So a doc looks like:
{
...
"file_data":{
"file_name": "sample_filename_acp24_20180223_1222.json"
}
...
}
I want my search to return the above document if I search for sample, filename, acp24, 20180223, and so on.
So far I have tried the following analyzer and full-text search query, but the search still doesn't return the above doc if I search for acp24 or 20180223.
Index Mapping
{
"index_name": {
"mappings": {
"type": {
"properties": {
"file_data": {
"type": "nested",
"properties": {
"file_name": {
"type": "text",
"analyzer": "keyword_analyzer",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
}
Analyzer
{
"analysis": {
"analyzer": {
"keyword_analyzer":{
"type": "pattern",
"pattern":"\\W|_",
"lowercase": true
}
}
}
}
Search Query
{
"query": {
"match_phrase_prefix": {
"_all": {
"query": "20180223",
"analyzer": "keyword_analyzer"
}
}
}
}
Any help on how to achieve this is very much appreciated. I have spent so many hours with this and still couldn't find a solution.
If I understand correctly, you could use the wildcard query:
POST /my_index/_search
{
"query" : {
"wildcard" : {
"file_data.file_name" : {
"wildcard" : "sample_*filename_acp24*", "boost" : 2.0
}
}
}
}
(tested with elasticsearch 6.1, might need to change the syntax for other versions)
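To see which search terms each approach can match, here is a small Python approximation. The first function imitates the question's pattern analyzer (split on `\W|_`, lowercase, drop empty tokens) with `re.split`; the second uses `fnmatchcase` as a stand-in for wildcard matching against the whole, unanalyzed file name. Both are rough stand-ins for illustration only: `fnmatch` also gives `[...]` a special meaning, which the Elasticsearch wildcard query does not.

```python
import re
from fnmatch import fnmatchcase

FILE_NAME = "sample_filename_acp24_20180223_1222.json"


def pattern_tokens(text, pattern=r"\W|_"):
    """Approximate a lowercasing pattern analyzer: split on the regex, drop empty tokens."""
    return [tok for tok in re.split(pattern, text.lower()) if tok]


def wildcard_matches(value, pattern):
    """Approximate a wildcard match against the complete (unanalyzed) value."""
    return fnmatchcase(value, pattern)


if __name__ == "__main__":
    print(pattern_tokens(FILE_NAME))
    print(wildcard_matches(FILE_NAME, "sample_*filename_acp24*"))
    print(wildcard_matches(FILE_NAME, "*20180223*"))
```

The token list shows that acp24 and 20180223 exist as separate terms after analysis, so a plain match query against file_data.file_name itself (rather than _all) is another option worth trying.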

elasticsearch run any query on field exists

I want to run a query/filter based on whether a field exists. In our case, we store a value only if the user answers that particular field; otherwise we do not store the field at all. How can I run the query?
Below is my mapping:
"mappings": {
"responses_10_57": {
"properties": {
"rid": {
"type": "long"
},
"end_time": {
"type": "date",
"format": "dateOptionalTime"
},
"start_time": {
"type": "date",
"format": "dateOptionalTime"
},
"qid_1": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"analyzer": "str_params"
}
}
},
"qid_2": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"analyzer": "str_params"
}
}
},
"qid_3": {
"properties": {
"msg_text": {
"type": "string"
},
"msg_tags": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"analyzer": "str_params"
}
}
}
}
}
}
}
}
qid_1 is the name field, qid_2 is the category field, and qid_3 is the text message field.
But qid_3 is not a mandatory field, so we do not store it if the user didn't enter a text message.
1) I want a count per category of those who answered the third question.
2) I want to find the names of those who answered the third question.
How can I write these two queries?
Both queries should have an exists filter to limit the response to only those documents where the qid_3 exists (is not null). For your first query you could try a terms aggregation. For your second query, you can filter the source to include only the names in the response or store the field and use fields.
1)
{
"size": 0,
"query" : {
"filtered" : {
"filter" : {
"exists" : { "field" : "qid_3" }
}
}
},
"aggs" : {
"group_by_category" : {
"terms" : { "field" : "qid_2" }
}
}
}
2)
{
"filter" : {
"exists" : { "field" : "qid_3" }
},
"_source": [ "qid_1"]
}
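The intent of the two queries can be sketched over an in-memory list of documents in Python: keep only documents where qid_3 is present, then either count them per category or list the names. The sample documents and function names are invented for illustration, and this says nothing about how Elasticsearch evaluates the filter internally.

```python
from collections import Counter


def has_field(doc, field):
    """Rough equivalent of an exists filter: the field is present and not null."""
    return doc.get(field) is not None


def count_by_category(docs):
    """1) Per-category count of respondents who answered the third question."""
    return Counter(d["qid_2"] for d in docs if has_field(d, "qid_3"))


def names_with_answer(docs):
    """2) Names of respondents who answered the third question."""
    return [d["qid_1"] for d in docs if has_field(d, "qid_3")]


if __name__ == "__main__":
    responses = [  # hypothetical sample data
        {"qid_1": "alice", "qid_2": "sports", "qid_3": {"msg_text": "hello"}},
        {"qid_1": "bob", "qid_2": "sports"},
        {"qid_1": "carol", "qid_2": "music", "qid_3": {"msg_text": "hi"}},
    ]
    print(count_by_category(responses))
    print(names_with_answer(responses))
```

Note that in the counting case the filter has to be applied before the grouping; filtering only the returned hits would leave the per-category counts unchanged.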

How to filter fields in ElasticSearch using GET

I recently installed ElasticSearch with the Wikipedia river because I'm working on an autocomplete box with article titles. I have been trying to figure out the best way to query the dataset. The following works:
/wikipedia/_search?q=mysearch&fields=title,redirect&size=20
but I would like to add more constraints to the search:
disambiguation=false, redirect=false, stub=false, special=false
I'm new to ElasticSearch and the documentation hasn't gotten me far. From what I've read I need a filtered query; is there a way to do that from a GET request? That would make it much easier for my specific use case. If not, how would the POST request look? Thanks in advance.
For reference, the mapping is:
{
"wikipedia": {
"page": {
"properties": {
"category": {
"type": "string"
},
"disambiguation": {
"type": "boolean"
},
"link": {
"type": "string"
},
"redirect": {
"type": "boolean"
},
"special": {
"type": "boolean"
},
"stub": {
"type": "boolean"
},
"text": {
"type": "string"
},
"title": {
"type": "string"
}
}
}
}
}
To add more constraints, you can continue with the Lucene query syntax and do something like:
/wikipedia/_search?q=mysearch AND disambiguation:false AND redirect:false AND stub:false AND special:false&fields=title,redirect&size=20
To improve performance, you can use filters through the JSON API; the query will look like:
curl -XGET 'http://localhost:9200/wikipedia/_search?pretty=true' -d '
{
"from" : 0,
"size" : 20,
"query":{
"filtered" : {
"query" : {
"text" : { "title" : "test" }
},
"filter" : {
"and": [
{
"term" : {
"stub" : false
}
},
{
"term" : {
"disambiguation" : false
}
},
{
"term" : {
"redirect" : false
}
},
{
"term" : {
"special" : false
}
}
]
}
}
}
}
'
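Since the query-string form above contains spaces and colons, it usually needs URL encoding when it is built in code. Here is a minimal Python sketch that assembles the same GET path from a search term and a dict of boolean constraints; the helper name and its parameters are assumptions for this example, and only the standard library's `urlencode` is used.

```python
from urllib.parse import urlencode


def build_search_path(index, text, constraints, fields, size):
    """Build a URI search path with a Lucene query string and URL-encoded parameters."""
    # Booleans are rendered as lowercase true/false, as in the query string above.
    clauses = [text] + [f"{name}:{str(value).lower()}" for name, value in constraints.items()]
    params = {
        "q": " AND ".join(clauses),
        "fields": ",".join(fields),
        "size": size,
    }
    return f"/{index}/_search?{urlencode(params)}"


if __name__ == "__main__":
    path = build_search_path(
        "wikipedia",
        "mysearch",
        {"disambiguation": False, "redirect": False, "stub": False, "special": False},
        ["title", "redirect"],
        20,
    )
    print(path)
```

Keeping the constraints in a dict makes it easy to toggle individual flags from the autocomplete box without hand-editing the query string.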
