Elasticsearch using special characters - elasticsearch

I'm indexing pdf's into Elasticsearch.
When I search for content like "§ 123", than the "§" is ignored.
What do I have to do so that "§" is also included in the search?
Here the Indices.Create:
CreateIndexResponse createIndexResponse = elasticClient.Indices.Create(sep.DefaultIndex, c => c
.Settings(s => s
.Analysis(a => a
.Analyzers(ad => ad
.Custom("windows_path_hierarchy_analyzer", ca => ca
.Tokenizer("windows_path_hierarchy_tokenizer")
)
)
.Tokenizers(t => t
.PathHierarchy("windows_path_hierarchy_tokenizer", ph => ph
.Delimiter('\\')
)
.NGram("ngram_tokenizer", td => td
.MinGram(2)
.MaxGram(20)
.TokenChars(
TokenChar.Letter,
TokenChar.Digit,
TokenChar.Symbol)
)
)
)
)
.Map<ElasticDocument>(mp => mp
.AutoMap()
.Properties(ps => ps
.Text(s => s
.Name(n => n.Path)
.Analyzer("windows_path_hierarchy_analyzer")
)
.Object<Attachment>(a => a
.Name(n => n.Attachment)
.AutoMap()
)
)
)
);
Here the mappings:
{
"attachments": {
"mappings": {
"properties": {
"attachment": {
"properties": {
"author": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"content": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"content_length": {
"type": "long"
},
"content_type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"creator_tool": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"date": {
"type": "date"
},
"description": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"format": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"keywords": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"language": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"metadata_date": {
"type": "date"
},
"modified": {
"type": "date"
},
"title": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"id": {
"type": "long"
},
"instance": {
"type": "long"
},
"path": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}

Tldr;
As pointed out by #llermaly in the mapping you do not seems to be using the analyser you are creating with your code.
Be default text fields are going to be analysed with the standard analyser.
Solution
You will need to specify the analyser you want to be used on text fields
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "windows_path_hierarchy_analyzer"
}
}
}
}
else the standard analyser is used.
Which is going to delete special char.
POST _analyze
{
"analyzer": "standard",
"text": "§ 123"
}
gives you:
{
"tokens": [
{
"token": "123",
"start_offset": 2,
"end_offset": 5,
"type": "<NUM>",
"position": 0
}
]
}
The § has been removed.

Related

results not showing during indexing elastic search

I am performing elastic search full indexing using Bulk request. I have an issue during the indexing the results are coming as empty. As I am deleting the index during the full index, How I can handle this situation.
I have done the these steps:
delete index
Create index
Create Mapping
bulk request
Index properties and Mapping:
{
"products": {
"aliases": {},
"mappings": {
"properties": {
"assemblyrequired": {
"type": "boolean"
},
"australianmade": {
"type": "boolean"
},
"australiasellable": {
"type": "boolean"
},
"avgRating": {
"type": "float"
},
"category": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"categorylevel1": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"categorylevel2": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"categorylevel3": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"categoryname": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"categoryname_old": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"clearance": {
"type": "boolean"
},
"commercialuse": {
"type": "boolean"
},
"customisable": {
"type": "boolean"
},
"depth": {
"type": "float"
},
"freedelivery": {
"type": "boolean"
},
"genericcolourcode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"height": {
"type": "float"
},
"hideprice": {
"type": "boolean"
},
"listprice": {
"type": "float"
},
"materialcode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"moneybackguarantee": {
"type": "boolean"
},
"newrelease": {
"type": "boolean"
},
"numberOfRating": {
"type": "long"
},
"online": {
"type": "boolean"
},
"outdooruse": {
"type": "boolean"
},
"predictivecategorydata": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"pricematchguarantee": {
"type": "boolean"
},
"productcode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"productid": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"productimageurl": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"productname": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"producttypecode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"promotedprice": {
"type": "float"
},
"sale": {
"type": "integer"
},
"saleprice": {
"type": "float"
},
"sellable": {
"type": "boolean"
},
"sellercode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"shortdescription": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"sku": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"sortweight": {
"type": "long"
},
"state": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"stylecode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"warrantycode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"weight": {
"type": "float"
},
"width": {
"type": "float"
}
}
},
"settings": {
"index": {
"number_of_shards": "1",
"provided_name": "products",
"max_result_window": "500000",
"creation_date": "1595814303422",
"number_of_replicas": "1",
"uuid": "sGJxwr73Rkyu7-JekWFYsw",
"version": {
"created": "7060199"
}
}
}
}
}
I have around 75k documents.
Thanks,
Sree.
If you want the full index to be available during the reindex, your only option is to not delete the original index until after the indexing is done. In that case, I would probably work with aliases. For example, let's assume products-2020.07.28 was your current index, you would then create a new index for today and change the alias as soon as the indexing is done.
Create Index
PUT /products-2020.07.28
{
"settings": {
... your settings ...
},
"mappings": {
... your mappings ...
}
}
Bulk Index Request
Change Alias to new Index
POST /_aliases
{
"actions" : [
{ "remove" : { "index" : "products-2020.07.27", "alias" : "products" } },
{ "add" : { "index" : "products-2020.07.28", "alias" : "products" } }
]
}
Delete old Index
DELETE /products-2020.07.27
Any requests can then go directly to the alias, instead of the index.
GET /products/_search
That way you can reindex without the user noticing anything.

How to find objects with inner objects having multiple fields by specific values in Elastic Search

I have an index with objects named "DynamicFields" and each of them have inner objects named "Fields" like this:
{
"DynamicFields": [
{
"Fields": [
{
"DFieldVal": "Value1",
"Owned": 0,
"DFieldRelCode": 181254,
"DFieldCode": 1835
},
{
"DFieldVal": "Value2",
"Owned": 0,
"DFieldRelCode": 181255,
"DFieldCode": 1836
},
{
"DFieldVal": "Value3",
"Owned": 1,
"DFieldRelCode": 181256,
"DFieldCode": 1837
},
{
"DFieldVal": "Value4",
"Owned": 0,
"DFieldRelCode": 181257,
"DFieldCode": 1838
}
]
}
]
}
I need to find objects "DynamicFields" that has inner objects "Fields" with this exact values:
"DFieldCode": 1837
and
"Owned": 0
Im using this query for it, but it gives me wrong result, it should return an empty result because there isn't any inner object "Fields" having both of the values:
{
"from":0,
"size":10,
"query": {
"bool":{
"must":[
{ "terms": { "DynamicFields.Fields.Owned" : [0] } },
{ "terms": { "DynamicFields.Fields.DFieldCode" : [1837] } }
]
}
}
}
I think the problem is that Elastic search sees the inner objects properties as normal property for the Root Object so it returns the objects that have the mentioned fields in all inner objects no matter in the same inner object.
EDIT:
i have summarized the data to make it simpler
the mapping is full map of the data:
{
"marketplace": {
"mappings": {
"object": {
"properties": {
"Addresses": {
"properties": {
"AddrID": {
"type": "long"
},
"AddressText": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"AddressTree": {
"properties": {
"AddrFieldRelID": {
"type": "long"
},
"AddrTitleName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"AddrTitlePersianName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"AddrValName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Latitude": {
"type": "float"
},
"Longitude": {
"type": "float"
}
}
},
"Latitude": {
"type": "float"
},
"Longitude": {
"type": "float"
},
"Tel": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"DelFlag": {
"type": "long"
},
"DynamicFields": {
"properties": {
"DynamicDefCode": {
"type": "long"
},
"DynamicDefDataTypeName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"DynamicDefName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"DynamicValKind": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Fields": {
"properties": {
"DFieldCode": {
"type": "long"
},
"DFieldRelCode": {
"type": "long"
},
"DFieldVal": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Owned": {
"type": "boolean"
}
}
}
}
},
"GFRefCode": {
"type": "long"
},
"GoodsDesc": {
"properties": {
"FName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"GoodsFullName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Supplier": {
"properties": {
"Barcode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"GPackDayPrice": {
"type": "long"
},
"GoodsEnterDate": {
"type": "date"
},
"GoodsFinalCode": {
"type": "long"
},
"GoodsFullName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"GoodsWHStock": {
"type": "long"
},
"StoreName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"UserName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"WHName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"WareHouseCode": {
"type": "long"
}
}
},
"UserName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"GoodsFinalCode": {
"type": "long"
},
"Images": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"IsMainObject": {
"type": "boolean"
},
"ObjectDetailPackID": {
"type": "long"
},
"ObjectKind": {
"type": "long"
},
"Prices": {
"properties": {
"Barcode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"GPWeight": {
"type": "float"
},
"GpackDayPrice": {
"type": "long"
},
"PackingName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"RefGoodsFinalCode": {
"type": "long"
},
"TreePath": {
"properties": {
"DFieldCode": {
"type": "long"
},
"DFieldRelCode": {
"type": "long"
},
"DFieldVal": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
}
thanks.
As your index shows you saved your array as an object,
read more about this here
Basically unless specified otherwise elasticsearch flattens arrays when being saved, making objects in arrays lose their structure.
you should define the type of Fields as nested to avoid this.

Elasticsearch - using nested object value in Function Score

I currently have a nested object interest_scores in ES that looks like this:
[{
username: 'Somebody',
interest_scores: [
{ name: 'Running', score: 10 }
{ name: 'Food and drinks', score: 21 }
]
},
{
username: 'SomebodyElse',
interest_scores: [
{ name: 'Running', score: 7 }
{ name: 'Food and drinks', score: 29 }
]
}]
When I enter the search term Running I would like the user with the highest score for Running to get returned first.
I know the way to do this is to use a Function Score Query but I am not sure how to use the matching search term in the function / script. What I think is that the query will return all documents that have the interest "Running" and then I could use something like interest_scores.{match}.score to add to or multiply by the document score.
Any help with this would be greatly appreciated!
As requested, here is the mapping:
{
"influencers": {
"mappings": {
"influencer": {
"properties": {
"email": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"gender": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"geo": {
"type": "geo_point"
},
"hashtags": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"interest_scores": {
"type": "nested",
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"score": {
"type": "long"
}
}
},
"interests": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"language": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"location": {
"properties": {
"city": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"country": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"country_code": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"lat": {
"type": "float"
},
"lng": {
"type": "float"
},
"state_code": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"subdivision": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"network_data": {
"properties": {
"facebook": {
"properties": {
"url": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"username": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"instagram": {
"properties": {
"bio": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"engagement": {
"type": "float"
},
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"picture": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"reach": {
"type": "long"
},
"url": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"username": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"pinterest": {
"properties": {
"url": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"username": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"twitter": {
"properties": {
"bio": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"engagement": {
"type": "float"
},
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"picture": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"reach": {
"type": "long"
},
"url": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"username": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"youtube": {
"properties": {
"bio": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"engagement": {
"type": "float"
},
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"picture": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"reach": {
"type": "long"
},
"url": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"username": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"videos": {
"type": "long"
},
"views": {
"type": "long"
},
"views_per_video": {
"type": "float"
}
}
}
}
},
"networks": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"picture": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"total_reach": {
"type": "long"
},
"username": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
I do not have a function score query yet, I am only testing in the Dev Tools of Kibana - I do have all of the other filters working correctly though. I am just looking to say "If the search term matches a interest_scores.name then sort the hits by the interest_scores.score of that interest_scores.name
Update
The following seems to be working when I test it in Kibana dev tools:
{
"query": {
"nested": {
"path": "interest_scores",
"score_mode": "sum",
"query": {
"function_score": {
"query": {
"match": { "interest_scores.name": "Running" }
},
"script_score": {
"script": "_score + doc['interest_scores.score'].value"
}
}
}
}
}
}
I have tested it with a few different search terms and it always returns the highest score first, but what is weird is that I get the same results when I remove the script_score function. Can anyone tell me if this is a good solution, or why it works without the script_score?
As described here, you can sort by nested fields:
{
"_source": false, # for inner hits - you can remove it
"query": {
"nested": {
"path": "interest_scores",
"filter": {
"range": {
"interest_scores.score": {
"gte": "0"
}
}
},
"inner_hits": {} # for inner hits - you can remove it
}
},
"sort": {
"interest_scores.score": {
"order": "desc",
"mode": "max",
"nested_filter": {
"range": {
"interest_scores.score": {
"gte": "0"
}
}
}
}
}
}
*Pay attention that, you can use the inner_hits ability to show only relevant nested documents. If all inner hits documents are relevant - please remove the marked lines.
**Use the filter on score field or on any other field (e.g: name you would like to filter by).
EDIT 1:
If you want to get the sorted scores of specific name, try:
{
"_source": false,
"query": {
"nested": {
"path": "interest_scores",
"filter": {
"term": {
"interest_scores.name": "SCORE_NAME"
}
},
"inner_hits": {}
}
},
"sort": {
"interest_scores.score": {
"order": "desc",
"mode": "max",
"nested_filter": {
"range": {
"interest_scores.score": {
"gte": "0"
}
}
}
}
}
}
Put the desired score name instead SCORE_NAME.

Missing query for nested property in ElasticSearch

I am trying to write a query that returns all docs which don't have a particular field which is a nested property but it doesn't work.
I am using ES 5.4
{
"query": {
"bool": {
"must_not": [
{
"exists": {
"field": "shares"
}
}
]
}
}
}
What am I doing wrong?
This is my mapping
{
"test": {
"aliases": {},
"mappings": {
"vendor": {
"properties": {
"address": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"categories": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"city": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"displayTypes": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"emailId": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"facebookAppId": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"foursquareType": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"googleTypes": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"hours": {
"properties": {
"day": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"from": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"to": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"isOnBoarded": {
"type": "boolean"
},
"kiaskCategories": {
"type": "keyword"
},
"latitude": {
"type": "float"
},
"location": {
"type": "geo_point"
},
"longitude": {
"type": "float"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"phoneNumber": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"pictures": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"placeId": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"price": {
"type": "float"
},
"rating": {
"type": "float"
},
"shares": {
"type": "nested",
"properties": {
"_id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"content": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"createdDate": {
"type": "date"
},
"endDate": {
"type": "date"
},
"facebookLikes": {
"type": "long"
},
"images": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"price": {
"type": "long"
},
"quantity": {
"type": "long"
},
"shareType": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"startDate": {
"type": "date"
},
"tags": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"source": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"state": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"timezone": {
"type": "float"
},
"url": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"website": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"yelpTypes": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
},
"settings": {
"index": {
"creation_date": "1496319860700",
"number_of_shards": "5",
"number_of_replicas": "1",
"uuid": "0S0A9YN-S-SrW4eeOHX06w",
"version": {
"created": "5040099"
},
"provided_name": "test"
}
}
}
}
this should work
You have enforce nested data type mappings and use nested query.
{
"query" : {
"nested": {
"path": "shares",
"query": {
"bool": {
"must_not": [
{
"exists" : {
"field" : "shares._id"
}
}
]
}
}
}
}
}
Note: The following setup works for me.
PUT test_index
{
"mappings": {
"document_type" : {
"properties": {
"name" : {
"type": "text"
},
"shares" : {
"type": "nested"
}
}
}
}
}
POST test_index/document_type
{
"name" : "vicky"
}
POST test_index/_search
{
"query": {
"nested": {
"path": "shares",
"query": {
"bool": {
"must_not": [
{
"exists" : {
"field" : "shares.city"
}
}
]
}
}
}
}
}

How can I use/map the field 'geoip.locations' in my index for Kibana which is looking for type 'geo_point'?

I have a log that returns a field called 'ip' from a parsed json field called 'message'.
I've set up my logstash.conf file as so:
filter {
json {
source => "message"
}
geoip {
source => "ip"
target => "geoip"
#add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
#add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
}
#mutate {
# convert => [ "[geoip][coordinates]", "float"]
#}
}
However, I'm not seeing anything I can use in my index pattern to create a map. Kibana gives me the error:
The "myindex*" index pattern does not contain any of the following field types: geo_point
What am I doing wrong here?
EDIT: results of curl on /_mapping
{
"<my index>": {
"mappings": {
"%{[#metadata][type]}": {
"properties": {
"#timestamp": {
"type": "date"
},
"#version": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"host": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"message": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"tracking log": {
"properties": {
"#timestamp": {
"type": "date"
},
"#version": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"accept_language": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"agent": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"beat": {
"properties": {
"hostname": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"version": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"context": {
"properties": {
"course_id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"course_user_tags": {
"type": "object"
},
"org_id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"path": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"user_id": {
"type": "long"
}
}
},
"event": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"event_source": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"event_type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"geoip": {
"properties": {
"city_name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"continent_code": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"coordinates": {
"type": "float"
},
"country_code2": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"country_code3": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"country_name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"dma_code": {
"type": "long"
},
"ip": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"latitude": {
"type": "float"
},
"location": {
"type": "float"
},
"longitude": {
"type": "float"
},
"postal_code": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"region_code": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"region_name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"timezone": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"host": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"input_type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"ip": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"message": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"offset": {
"type": "long"
},
"page": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"referer": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"session": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"source": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"tags": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"time": {
"type": "date"
},
"type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"username": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
I used this command to execute the map:
curl -XPOST -u elastic 'localhost:9200/_template/<my index>' -H 'Content-Type: application/json' -d'
{"order": 0, "template": "<my index>*","mappings": {"_default_": {"dynamic_templates": [{"location_fields": {"mapping": {"type": "geo_point"},"match": "geoip"}}]}}}
'
You don't need the add_field lines or the mutate/convert. What you do need is to properly map the geoip field as a geo_point in elasticsearch.
Generally you'd do this with an index template -- this mapps any field named geoip to a geo_point -- you can be more specific with your templates though.
POST /_template/myindex
{
"order": 0,
"template": "myindex*",
"mappings": {
"_default_": {
"dynamic_templates": [
{
"location_fields": {
"mapping": {
"type": "geo_point"
},
"match": "location"
}
}
]
}
}
}
In any event, you'll need to re-index your data to get the mapping right and then tell kibana to reload your index mappings (under settings).
Ok - I solved this with the help of #Alcanzar. Apparently, as he said, it is impossible to change a the type of a field once data has been sent to it -- essentially, once you've started logstash and started using it.
So -- to map geoip.location to type geo_point -- here it what I did:
Steps:
1) I stopped logstash.
2) Deleted my index using 'DELETE /"
3) Created a new index with the name I wanted using 'PUT /' in the Kibana dev console.
4) Applied the mapping #Alcanzar contributed above using 'geoip.location' instead of 'location'
5) Restarted logstash.
Now geoip.location is of type geo_point.
It might be obvious that these are the steps to take for some of you experienced ES users, but it wasn't to me. I hope this helps someone else.

Resources