field type mismatches in elasticsearch

I am loading data into Elasticsearch from a JSON file exported from MongoDB. I am facing an issue where some array fields from the JSON end up stored as strings.
"_source" : {
  "CITIES" : [
    "ABC"
  ],
  "CITY_AREAS" : """["COLONY (AIT)"]""",
  "INTERESTS" : [
    "CARS"
  ]
}
I have not defined any mapping, and I know Elasticsearch derives its default (dynamic) mapping from the very first document inserted into the index.
I want a solution where I can run an update command that converts the field to an array for all documents containing "CITY_AREAS".
E.g.:
"CITY_AREAS" : ["COLONY (AIT)"]
P.S.: Some documents have the "CITY_AREAS" key and some don't.

You will need to reindex for this, so that the correct mapping type is used. An update will not work.
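If you reindex from the client side, the stringified value can be decoded before each document is sent to the new index. The following is only a minimal sketch of that clean-up step; the helper name `normalize_city_areas` is made up for illustration, and it assumes the bad values are JSON-encoded lists like the one shown in the question.

```python
import json


def normalize_city_areas(source, field="CITY_AREAS"):
    """Return a copy of `source` where a JSON-encoded list in `field` is decoded.

    Documents without the field, or where it is already a list, are returned
    unchanged (as a shallow copy).
    """
    fixed = dict(source)
    value = fixed.get(field)
    if isinstance(value, str):
        try:
            decoded = json.loads(value)
        except json.JSONDecodeError:
            # Not JSON at all: leave the original string in place.
            return fixed
        if isinstance(decoded, list):
            fixed[field] = decoded
    return fixed


if __name__ == "__main__":
    doc = {"CITIES": ["ABC"], "CITY_AREAS": '["COLONY (AIT)"]', "INTERESTS": ["CARS"]}
    print(normalize_city_areas(doc))
```

Each cleaned document would then be indexed into the new index; documents that lack the key pass through untouched.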

Related

Index DynamoDB streams to elastic search

I need to implement the following entities in a DynamoDB table.
I have stored these entities in DynamoDB as shown below.
Partition Key : PROJ#ProjectId:CountryId
Sort Key : Project Name
Company : company data as JSON document
Since this is a one-to-many relationship, N projects of the same company create N project records, and the same company details are stored in the Company attribute of each record. The reason for doing this is that the most critical data access path is via ProjectId and CountryId (assume that I can't change this DB design).
I need to implement search functionality that supports filtering by company name, address, project name, country, etc. (using a single filter or any combination of these filters). I'm using DynamoDB Streams to feed an Elasticsearch cluster, propagating every creation, deletion, or update there, and I use the Elasticsearch API to query the data.
But I need to index the data in the following format, so that the data is not duplicated when I receive the details from Elasticsearch:
{
  "id" : 1,
  "name" : "ABC",
  "description" : "description",
  "address" : "address",
  "projects" : [
    {
      "id" : 10,
      "name" : "project 1",
      "countryId" : 10
    },
    {
      "id" : 20,
      "name" : "project 1",
      "countryId" : 10
    }
  ]
}
At record creation time, since Project records are created as single records, is there a recommended or standard way to collect all the Project records of a Company, build the above JSON document, and index it in Elasticsearch?
This is how I would approach it:
In Elasticsearch, the document id will be the companyID.
You can create a Lambda function that is triggered by the DynamoDB stream and uses Elasticsearch's update-by-query API to find the document, with a Painless script that updates the projects section of the document. This will work for infrequent changes.
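As a rough illustration of the scripted-update idea, the sketch below builds a request body that a Lambda handler could send for each stream record. Note that it targets the single-document Update API (a POST to the index's `_update` endpoint for the company id) rather than update-by-query, since the document id is already known to be the companyID. The function name, the shape of the `company` and `project` dicts, and the Painless script are assumptions for illustration and have not been run against a cluster.

```python
# Painless script: drop any existing entry for this project id, then append
# the new version of the project to the company's `projects` list.
PAINLESS_UPSERT_PROJECT = (
    "ctx._source.projects.removeIf(p -> p.id == params.project.id); "
    "ctx._source.projects.add(params.project);"
)


def build_company_update(company, project):
    """Build an Update API body that adds or replaces one project of a company.

    `company` holds the company-level fields (id, name, ...), `project` holds
    one project record. If the company document does not exist yet, the
    `upsert` document is indexed instead of running the script.
    """
    return {
        "script": {
            "lang": "painless",
            "source": PAINLESS_UPSERT_PROJECT,
            "params": {"project": project},
        },
        "upsert": {**company, "projects": [project]},
    }


if __name__ == "__main__":
    body = build_company_update(
        {"id": 1, "name": "ABC", "description": "description", "address": "address"},
        {"id": 10, "name": "project 1", "countryId": 10},
    )
    print(body)
```

Deletions from the stream would need a similar script that only performs the `removeIf` step.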

elastic search fetch the exact match first followed by others

I am new to Elasticsearch.
I have an education index in ES.
index creation
When I search for 'btech' with a match query such as
"match" : { "name" : "btech" }
the result looks like this:
result json object
But I need btech (the exact-match word) as the first document, with the remaining documents following it.
What do I have to change in my index creation to achieve that?
Can anybody please help me?
You can use a term query:
"term" : { "name" : "btech" }
Or a regexp query:
"regexp" : { "name" : "btech" }
You are using the text type; make sure to look at the keyword type too.
From the documentation:
If you need to index structured content such as email addresses,
hostnames, status codes, or tags, it is likely that you should rather
use a keyword field.
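One way to combine the two ideas, so that exact matches come first and other matches follow, is to keep the match query and add a boosted term clause on a keyword sub-field. This is only a sketch: it assumes the `name` field has a `name.keyword` sub-field (which depends on your mapping), and the helper name and boost value are arbitrary.

```python
def exact_first_query(field, text, keyword_subfield="keyword", boost=10.0):
    """Build a bool query: `must` match on the text field, plus an optional
    boosted `term` clause on the keyword sub-field so exact values score higher.
    """
    return {
        "query": {
            "bool": {
                # Every hit has to satisfy the analyzed match query.
                "must": [{"match": {field: text}}],
                # Documents whose keyword value equals `text` get extra score.
                "should": [
                    {
                        "term": {
                            f"{field}.{keyword_subfield}": {
                                "value": text,
                                "boost": boost,
                            }
                        }
                    }
                ],
            }
        }
    }


if __name__ == "__main__":
    print(exact_first_query("name", "btech"))
```

Keep in mind that a term query on a keyword field is case-sensitive unless the field uses a normalizer.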

How to change the field data type in elasticsearch

"#version":{
"type":"string",
"index":"not_analyzed",
"ignore_above":1024
},
Here I have to change the type from string to long.
I have used curl -XPUT 'http://localhost:9200/' (this is just a sample).
Does anyone have any idea how to do this?
Assuming you are using dynamic mapping (which is the default), the type of a field depends on the type of the data in that field in the first indexed document.
So if the first indexed document has a field "version" of type string, the mapping will have a field "version" of type string.
See the documentation on dynamic mapping.
You can't change the type of an existing field in a mapping. As explained in the documentation, you need to create a new index with the correct mapping and reindex your data.
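The two steps could look roughly like the sketch below, which only builds the request bodies. It assumes a recent Elasticsearch version with typeless mappings and the `_reindex` API (neither applies to very old releases); the index names and helper names are placeholders.

```python
def build_new_index_body(field, es_type):
    """Body for creating the new index with an explicit mapping for one field."""
    return {"mappings": {"properties": {field: {"type": es_type}}}}


def build_reindex_body(source_index, dest_index):
    """Body for POST _reindex, copying documents from the old index to the new one."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


if __name__ == "__main__":
    # Step 1: PUT /my_index_v2 with this body (index name is hypothetical).
    print(build_new_index_body("version", "long"))
    # Step 2: POST /_reindex with this body.
    print(build_reindex_body("my_index", "my_index_v2"))
```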

can terms lookup mechanism query by other field but id?

Here is the official Elasticsearch documentation on the terms query:
https://www.elastic.co/guide/en/elasticsearch/reference/2.1/query-dsl-terms-query.html
As we can see, to run a terms lookup query we should use a command like this:
curl -XGET localhost:9200/tweets/_search -d '{
"query" : {
"terms" : {
"user" : {
"index" : "users",
"type" : "user",
"id" : "2",
"path" : "followers"
}
}
}
}'
But what if I want to look up the user by another field?
Assume that users has other fields such as name. Can I use the terms lookup mechanism to find the tweets by giving the user's name instead of the id?
I have tried a command like this:
curl -XGET localhost:9200/tweets/_search -d '{
"query" : {
"terms" : {
"user" : {
"index" : "users",
"type" : "user",
"name" : "Jane",
"path" : "followers"
}
}
}
}'
but it returns an error.
Looking forward to your help. Thank you!
The terms lookup mechanism is basically a built-in optimization that saves you from making two queries to JOIN two indices, i.e. one in index A to get the ids to look up and a second to fetch the documents with those ids in index B.
Unlike in SQL, such a JOIN can only work on the id field, since this is the only way to uniquely retrieve a document from Elasticsearch via a GET call, which is exactly what Elasticsearch does in the terms lookup.
So to answer your question, the terms lookup mechanism will not work on any field other than the id field, since the first document to be retrieved must be unique. In your case, ES would not know how to fetch the document for the user named Jane, since name is just a field present in the user document and in no way a unique identifier for user Jane.
I think you did not understand exactly how this works. The terms lookup query works by reading values from a field of the document with the given id. In this case, you are matching the value of the field user in the tweets index against the values of the field followers in the document with id "2" in the users index and user type.
If you want to read the values from a different field of that document, simply name that field in "path".
What you mainly need to understand is that the lookup values are all fetched from a field of a single document, not from multiple documents.
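If you only know the user's name, the fallback is the manual two-query JOIN mentioned above: first search users by name, then feed the collected followers into an ordinary terms query on tweets. A minimal sketch of the query bodies and the glue between them follows; the function names are made up, and the response shape is the standard search response with `hits.hits[]._source`.

```python
def user_by_name_query(name):
    """First query: find user documents by name, returning only `followers`."""
    return {"query": {"match": {"name": name}}, "_source": ["followers"]}


def collect_followers(search_response):
    """Gather the distinct follower values from all hits of the first query."""
    followers = []
    for hit in search_response["hits"]["hits"]:
        for follower in hit["_source"].get("followers", []):
            if follower not in followers:
                followers.append(follower)
    return followers


def tweets_by_users_query(user_ids):
    """Second query: a plain terms query on the `user` field of tweets."""
    return {"query": {"terms": {"user": user_ids}}}


if __name__ == "__main__":
    # Fake response standing in for the result of the first search.
    response = {"hits": {"hits": [{"_source": {"followers": ["1", "3"]}}]}}
    print(tweets_by_users_query(collect_followers(response)))
```

Because name is not unique, the first query can return several users; the sketch simply merges their followers.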

Query two indexes simultaneously in Kibana 4?

Whenever I create a visualization, Kibana 4 asks me to select the index to search. My project requires searching data that is spread across multiple indexes, and hence I am stuck. I want to search two indexes for my data and then visualize the results. Any help would be valuable.
Kibana can create a visualization from multiple indexes, but the indexes (or their aliases) must have similar names. For example, you can read data from the indexes logstash-2015-01-01 and logstash-2015-01-02 using the pattern logstash-*.
But yes, it would be handy if we could write something like index1,another_index.
A solution that works in any case: create an alias in Elasticsearch for the indexes you want to query simultaneously, and then use the alias as an index pattern in Kibana.
With the Marvel plugin, through the Sense interface, you can create an alias for multiple indexes with this request:
POST _aliases
{
"actions" : [
{ "add" : { "index" : "test1", "alias" : "alias1" } },
{ "add" : { "index" : "test2", "alias" : "alias1" } }
]
}
Or using curl:
curl -XPOST 'http://localhost:9200/_aliases' -d '
{
"actions" : [
{ "add" : { "index" : "test1", "alias" : "alias1" } },
{ "add" : { "index" : "test2", "alias" : "alias1" } }
]
}'
Then you just need to add an index pattern in Kibana for "alias1" and create your visualizations.
For more information on aliases, see https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html
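If the list of indexes is long or changes often, the same request body can be generated instead of written by hand. A small sketch (the helper name is made up; the body has the same shape as the request shown above):

```python
import json


def build_alias_actions(alias, indexes):
    """Build the body for POST _aliases that adds every index to one alias."""
    return {
        "actions": [{"add": {"index": index, "alias": alias}} for index in indexes]
    }


if __name__ == "__main__":
    print(json.dumps(build_alias_actions("alias1", ["test1", "test2"]), indent=2))
```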
Thanks for all the help, but I figured out a way to do this.
In the Index Pattern settings of Kibana 4, create an index pattern named _all. This index pattern covers all the indexes present in your Elasticsearch cluster. When you create a new visualization, simply select the _all index pattern; the fields from all your indexes are then accessible and you can use them to create visualizations.
If I understand what you are asking correctly, then it may depend on how you've named your indexes.
I can query multiple logstash indexes by selecting my pattern 'logstash-*'. When you set up your indexes, Kibana gives you the option to specify a pattern.
(Settings => Indices => Index Pattern => Add New)
I hope that helps.
Two wildcards (i.e. *-*) work for me in Kibana 4.
I'm not sure I understand correctly, but I think your best option is to create the visualization separately on each of the two indexes and build a dashboard that includes both visualizations.
Kibana can't display a single visualization with searches from two separate indexes.
