How to speed up document append to existing array in elasticsearch?

I am using Elasticsearch version 6.3.1.
I have created a nested type field, which I use to append all the documents that share the same ID.
Here is my schema for the index:
curl -XPUT 'localhost:9200/axes_index_test12?pretty' -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "axes_type_test12": {
      "properties": {
        "totalData": {
          "type": "nested",
          "properties": {
            "gpsdt": {
              "type": "date",
              "format": "dateOptionalTime"
            },
            "extbatlevel": {
              "type": "integer"
            },
            "intbatlevel": {
              "type": "integer"
            },
            "lastgpsdt": {
              "type": "date",
              "format": "dateOptionalTime"
            },
            "satno": {
              "type": "integer"
            },
            "srtangle": {
              "type": "integer"
            }
          }
        },
        "imei": {
          "type": "long"
        },
        "date": {
          "type": "date",
          "format": "dateOptionalTime"
        },
        "id": {
          "type": "long"
        }
      }
    }
  }
}'
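Note that the append script below assumes the parent document already exists with totalData initialized as an array; a minimal sketch of seeding such a document (the field values here are placeholders):

import json
import requests

# Hypothetical seed document: totalData starts out as an empty array so
# that the update script below has something to append to.
seed = {
    "imei": 123456789012345,
    "date": "2018-07-01",
    "id": 1,
    "totalData": []
}

document_id = 1
url = "http://localhost:9200/axes_index_test12/axes_type_test12/" + str(document_id)
r = requests.put(url, data=json.dumps(seed),
                 headers={"Content-Type": "application/json"})
print(r.json())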
And to append into the existing array I call the following API.
Here is the document which I have to append:
data = {
    "script": {
        "source": "ctx._source.totalData.add(params.count)",
        "lang": "painless",
        "params": {
            "count": {
                "gpsdt": gpsdt,
                "analog1": analog1,
                "analog2": analog2,
                "analog3": analog3,
                "analog4": analog4,
                "digital1": digital1,
                "digital2": digital2,
                "digital3": digital3,
                "digital4": digital4,
                "extbatlevel": extbatlevel,
                "intbatlevel": intbatlevel,
                "lastgpsdt": lastgpsdt,
                "latitude": latitude,
                "longitude": longitude,
                "odo": odo,
                "odometer": odometer,
                "satno": satno,
                "srtangle": srtangle,
                "speed": speed
            }
        }
    }
}
Document parsing:
json_data = json.dumps(data)
And the API URL is:
API_ENDPOINT = "http://localhost:9200/axes_index_test12/axes_type_test12/"+str(documentId)+"/_update"
And finally I call this API:
headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
r = requests.post(url=API_ENDPOINT, data=json_data, headers=headers)
Everything works with this, but I am not getting good performance when I append new documents to the existing array.
So please suggest what changes I should make.
I have a 4-node cluster: 1 master, 2 data nodes, and one coordinator node.
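One avenue worth trying, as a sketch rather than a definitive fix: each scripted update rewrites the whole parent document together with all of its nested documents, so sending one HTTP request per append multiplies that cost. Batching several appends into a single _bulk request (same Painless script as above, hypothetical helper) would look roughly like this:

import json
import requests

API = "http://localhost:9200/_bulk"

def bulk_append(updates):
    """updates: list of (document_id, count_dict) pairs to batch together."""
    lines = []
    for document_id, count in updates:
        # Action line: a scripted update on the given document.
        lines.append(json.dumps({
            "update": {
                "_index": "axes_index_test12",
                "_type": "axes_type_test12",
                "_id": document_id
            }
        }))
        # Payload line: the same append script as in the question.
        lines.append(json.dumps({
            "script": {
                "source": "ctx._source.totalData.add(params.count)",
                "lang": "painless",
                "params": {"count": count}
            }
        }))
    body = "\n".join(lines) + "\n"  # a bulk body must end with a newline
    return requests.post(API, data=body,
                         headers={"Content-Type": "application/x-ndjson"})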

Related

Bulk API error while indexing data into elasticsearch

I want to import some data into Elasticsearch using the Bulk API. This is the mapping I have created using Kibana Dev Tools:
PUT /main-news-test-data
{
  "mappings": {
    "properties": {
      "content": {
        "type": "text"
      },
      "title": {
        "type": "text"
      },
      "lead": {
        "type": "text"
      },
      "agency": {
        "type": "keyword"
      },
      "date_created": {
        "type": "date"
      },
      "url": {
        "type": "keyword"
      },
      "image": {
        "type": "keyword"
      },
      "category": {
        "type": "keyword"
      },
      "id": {
        "type": "keyword"
      }
    }
  }
}
and this is my bulk data:
{ "index" : { "_index" : "main-news-test-data", "_id" : "1" } }
{
"content":"\u0641\u0647\u06cc\u0645\u0647 \u062d\u0633\u0646\u200c\u0645\u06cc\u0631\u06cc: \u0627\u06af\u0631\u0686\u0647 \u062f\u0631 \u0647\u06cc\u0627\u0647\u0648\u06cc ",
"title":"\u06a9\u0627\u0631\u0647\u0627\u06cc \u0642\u0627\u0644\u06cc\u0628\u0627\u0641",
"lead":"\u062c\u0627\u0645\u0639\u0647 > \u0634\u0647\u0631\u06cc -.",
"agency":"13",
"date_created":1494518193,
"url":"http://www.khabaronline.ir/(X(1)S(bud4wg3ebzbxv51mj45iwjtp))/detail/663749/society/urban",
"image":"uploads/2017/05/11/1589793661.jpg",
"category":"15",
"id":"2981643"
}
{ "index" : { "_index" : "main-news-test-data", "_id" : "2" } }
{
....
but when I want to post data I receive this error:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Malformed action/metadata line [3], expected START_OBJECT but found [VALUE_STRING]"
      }
    ]
  },
  "status" : 400
}
What is the problem? I used both PowerShell and the POST method in Kibana Dev Tools, but I receive the same error in both.
The data should be specified on a single line, like this:
{ "index" : { "_index" : "main-news-test-data", "_id" : "1" } }
{ "content":"\u0641\u0647","title":"\u06a9" }
Please refer to this SO answer.
Try the below format for the bulk JSON. I have tested this Bulk API request locally as well, and it works fine:
{ "index" : { "_index" : "main-news-test-data", "_id" : "1" } }
{"content":"\u0641\u0647\u06cc\u0645\u0647 \u062d\u0633\u0646\u200c\u0645\u06cc\u0631\u06cc: \u0627\u06af\u0631\u0686\u0647 \u062f\u0631 \u0647\u06cc\u0627\u0647\u0648\u06cc ", "title":"\u06a9\u0627\u0631\u0647\u0627\u06cc \u0642\u0627\u0644\u06cc\u0628\u0627\u0641", "lead":"\u062c\u0627\u0645\u0639\u0647 > \u0634\u0647\u0631\u06cc -.", "agency":"13", "date_created":1494518193, "url":"http://www.khabaronline.ir/(X(1)S(bud4wg3ebzbxv51mj45iwjtp))/detail/663749/society/urban", "image":"uploads/2017/05/11/1589793661.jpg", "category":"15", "id":"2981643"}
Don't forget to add a newline at the end of your content.
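If you are generating the bulk body programmatically, a minimal sketch (assuming Python and the requests library, with hypothetical document dicts) that serializes each document onto a single line and appends the required trailing newline:

import json
import requests

docs = [  # hypothetical documents; each must serialize to a single line
    {"content": "...", "title": "...", "agency": "13", "id": "2981643"},
]

lines = []
for i, doc in enumerate(docs, start=1):
    lines.append(json.dumps({"index": {"_index": "main-news-test-data", "_id": str(i)}}))
    lines.append(json.dumps(doc))  # json.dumps never emits raw newlines

body = "\n".join(lines) + "\n"  # bulk requests must end with a newline
r = requests.post("http://localhost:9200/_bulk", data=body.encode("utf-8"),
                  headers={"Content-Type": "application/x-ndjson"})
print(r.json())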

Mapping array of long values from hive to elastic search index

I have data in hive in following format
user_ids name city owner_ids
[1, 324, 456] some_name some_city [4567, 12345678]
I want to be able to search by user_ids = 324 or owner_ids = 12345678 as filter criteria and get back the above document as the response (exact match on IDs).
Currently I am using a dynamic template for the mapping, which maps the user_ids field to long, and I am unable to get any results. What type should I force the field mapping of user_ids and owner_ids to in order to get this response?
Mapping configuration
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  },
  "mappings": {
    "doc": {
      "dynamic_templates": [
        {
          "strings_as_keywords": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword"
            }
          }
        }
      ]
    }
  }
}
Result mapping
{
  "user_search" : {
    "mappings" : {
      "doc" : {
        "properties" : {
          "name" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "city" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "ds" : {
            "type" : "date"
          },
          "user_ids" : {
            "type" : "long"
          },
          "owner_ids" : {
            "type" : "long"
          }
        }
      }
    }
  }
}
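For reference, the exact-match filter described above might look roughly like this (a sketch, assuming the elasticsearch-py client; a long field is matched element by element, so a term filter should hit any document whose array contains the value):

from elasticsearch import Elasticsearch

es = Elasticsearch()  # assumes a local cluster on the default port

# A document shaped like the hive row above would look like:
# {"user_ids": [1, 324, 456], "name": "some_name",
#  "city": "some_city", "owner_ids": [4567, 12345678]}
body = {
    "query": {
        "bool": {
            "filter": [
                {"term": {"user_ids": 324}}  # matches any element of the array
            ]
        }
    }
}
res = es.search(index="user_search", body=body)
print(res["hits"]["total"])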

Kibana - Pie-Chart with sum over two different fields

In one index I have two mapping types.
"mappings" : {
"deliveries" : {
"properties" : {
"#timestamp": { "type" : "date", "format": "yyyy-MM-dd" },
"receiptName" : { "type" : "text" },
"amountDelivered" : { "type" : "integer" },
"amountSold" : { "type" : "integer" },
"sellingPrice" : { "type" : "float" },
"earned" : { "type" : "float" }
}
},
"expenses" : {
"properties" : {
"#timestamp": { "type" : "date", "format": "yyyy-MM-dd" },
"description": { "type" : "text" },
"amount": { "type": "float" }
}
}
}
Now I want to create a simple pie chart in Kibana that sums up deliveries.earned and expenses.amount.
Is this possible, or do I have to switch to a client application? The number of documents (2 or 3 a month) is really too small to start any development here xD
You can create a simple scripted_field through Kibana which maps the amount and earned fields to the same field, called transaction_amount.
Painless script:
if(doc['earned'].length > 0) { return doc['earned']; } else { return doc['amount']; }
Then you can create a Pie Chart with "Slice Size" configured as the sum of transaction_amount and "Split Slices" configured as a Terms Aggregation on _type.
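Should you ever outgrow the scripted field, the same total can also be computed client-side with two sum aggregations in a single request; a minimal sketch, assuming the elasticsearch-py client and a placeholder index name:

from elasticsearch import Elasticsearch

es = Elasticsearch()  # assumes a local cluster; the index name below is a placeholder

body = {
    "size": 0,  # we only want the aggregations, not the hits
    "aggs": {
        "earned_total": {"sum": {"field": "earned"}},
        "expenses_total": {"sum": {"field": "amount"}}
    }
}
res = es.search(index="my_index", body=body)
total = (res["aggregations"]["earned_total"]["value"]
         + res["aggregations"]["expenses_total"]["value"])
print(total)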

elasticsearch: search within text

For example, if a name is P.Shanmukha Sharma and the user searches for Shanmukha, it will not show up in the search results; results are returned only for P.Shanmukha and Sharma. Is there any way that searching for Shanmukha will return the result?
"user" : {
"properties" : {
"city" : {
"type" : "string",
"analyzer" : "autocomplete",
"search_analyzer" : "standard"
},
"created" : {
"type" : "date",
"format" : "strict_date_optional_time||epoch_millis"
},
"id" : {
"type" : "long"
},
"latitude" : {
"type" : "double"
},
"longitude" : {
"type" : "double"
},
"profile_image" : {
"type" : "string"
},
"state" : {
"type" : "string",
"analyzer" : "autocomplete",
"search_analyzer" : "standard"
},
"super_verification" : {
"type" : "string"
},
"type" : {
"type" : "string"
},
"username" : {
"type" : "string",
"analyzer" : "autocomplete",
"search_analyzer" : "standard"
}
}
}
username is defined with the autocomplete analyzer and the standard search analyzer,
and the search query is:
def EsSearch(self, index, page, size, searchTerm):
    body = {
        'query': {
            'match': searchTerm
        },
        'sort': {
            'created': {
                'order': 'desc'
            }
        },
        'filter': {
            'term': {
                'super_verification': 'verified'
            }
        }
    }
    res = self.conn.search(index=index, body=body)
    output = []
    for doc in res['hits']['hits']:
        output.append(doc['_source'])
    return output
After doing so much research on ES, I got this solution with a wildcard query. Thanks everyone!
{
  "query": {
    "wildcard": {
      "username": {
        "value": "*Shanmukha*"
      }
    }
  }
}
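Plugged into an EsSearch-style helper, that query might look like this (a sketch; note that a leading * forces Elasticsearch to scan many terms, so wildcard queries like this can be slow on large indices):

def es_wildcard_search(conn, index, term):
    # Sketch: wraps the wildcard query above; `conn` is assumed to be an
    # elasticsearch-py client, like self.conn in EsSearch.
    body = {
        "query": {
            "wildcard": {
                "username": {"value": "*" + term + "*"}
            }
        }
    }
    res = conn.search(index=index, body=body)
    return [doc["_source"] for doc in res["hits"]["hits"]]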
Basically, there are two ways to do so:
By the GET method and URL:
http://localhost:9200/your_index/your_type/_search?q=username:*Shanmukha*&pretty=true
By a fuzzy query, as given by #krrish.

how to change type of a value in elasticsearch

I am trying to do a geo map of a value in Elasticsearch, but the type of client_location is set to string and I would like to change it to geo_point. When I run the following I am getting:
#curl -XGET "http://core.z0z0.tk:9200/_all/_mappings/http?pretty"
{
  "packetbeat-2015.12.04" : {
    "mappings" : {
      "http" : {
        "properties" : {
          "#timestamp" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
          },
          "beat" : {
            "properties" : {
              "hostname" : {
                "type" : "string"
              },
              "name" : {
                "type" : "string"
              }
            }
          },
          "bytes_in" : {
            "type" : "long"
          },
          "bytes_out" : {
            "type" : "long"
          },
          "client_ip" : {
            "type" : "string"
          },
          "client_location" : {
            "type" : "string"
          },
          "client_port" : {
            "type" : "long"
          },
          "client_proc" : {
            "type" : "string"
          },
          "client_server" : {
            "type" : "string"
          },
          "count" : {
            "type" : "long"
          },
          "direction" : {
            "type" : "string"
          },
          "http" : {
            "properties" : {
              "code" : {
                "type" : "long"
              },
              "content_length" : {
                "type" : "long"
              },
              "phrase" : {
                "type" : "string"
              }
            }
          },
          "ip" : {
            "type" : "string"
          },
          "method" : {
            "type" : "string"
          },
          "notes" : {
            "type" : "string"
          },
          "params" : {
            "type" : "string"
          },
          "path" : {
            "type" : "string"
          },
          "port" : {
            "type" : "long"
          },
          "proc" : {
            "type" : "string"
          },
          "query" : {
            "type" : "string"
          },
          "responsetime" : {
            "type" : "long"
          },
          "server" : {
            "type" : "string"
          },
          "status" : {
            "type" : "string"
          },
          "type" : {
            "type" : "string"
          }
        }
      }
    }
  }
}
When I run the following command to change the type of the value from string to geo_point, I get the following error:
# curl -XPUT "http://localhost:9200/_all/_mappings/http" -d '
> {
> "http" : {
> "properties" : {
> "client_location" : {
> "type" : "geo_point"
> }
> }
> }
> }
> '
{"error":{"root_cause":[{"type":"merge_mapping_exception","reason":"Merge failed with failures {[mapper [client_location] of different type, current_type [string], merged_type[geo_point]]}"}],"type":"merge_mapping_exception","reason":"Merge failed with failures {[mapper [client_location] of different type, current_type [string], merged_type [geo_point]]}"},"status":400}
Any suggestions on how I should correctly change the type?
Thanks in advance.
Unfortunately, once you've created a field you cannot change its type anymore. The best thing to do is to delete the index and recreate it properly with the adequate mapping.
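A minimal sketch of that route, using Python requests (the index and type names are taken from the question; everything else in the original mapping would have to be re-declared too):

import json
import requests

# Sketch of the delete-and-recreate route. Only client_location is shown;
# the rest of the packetbeat mapping would need to be carried over as well.
base = "http://localhost:9200/packetbeat-2015.12.04"

requests.delete(base)  # WARNING: this destroys all data in the index

mapping = {
    "mappings": {
        "http": {
            "properties": {
                "client_location": {"type": "geo_point"}
            }
        }
    }
}
r = requests.put(base, data=json.dumps(mapping),
                 headers={"Content-Type": "application/json"})
print(r.json())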
Another temporary solution, if you don't want to delete your index immediately, is to create a sub-field of your existing field:
# curl -XPUT "http://localhost:9200/_all/_mappings/http" -d '{
  "http": {
    "properties": {
      "client_location": {
        "type": "string",
        "fields": {
          "geo": {
            "type": "geo_point"
          }
        }
      }
    }
  }
}'
And then you can access it in your queries using client_location.geo.
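For example, a geo_distance filter against the new sub-field might look like this (a sketch; the coordinates and distance are placeholders):

import json
import requests

# Sketch: a geo_distance filter on the new sub-field. The bool/filter
# form works in Elasticsearch 2.x, which this packetbeat index is on.
query = {
    "query": {
        "bool": {
            "filter": {
                "geo_distance": {
                    "distance": "10km",
                    "client_location.geo": "40.7,-74.0"  # "lat,lon"
                }
            }
        }
    }
}
r = requests.post("http://localhost:9200/packetbeat-2015.12.04/http/_search",
                  data=json.dumps(query))
print(r.json())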
Also note that you have to re-index your data in order to populate that new sub-field... which means you might just as well delete your index and re-create it properly.
UPDATE
After installing Packetbeat you need to make sure to install the packetbeat template yourself as described here (i.e. it is not done automatically):
https://www.elastic.co/guide/en/beats/packetbeat/current/packetbeat-getting-started.html#packetbeat-template
curl -XPUT 'http://localhost:9200/_template/packetbeat' -d#/etc/packetbeat/packetbeat.template.json
