I'm getting an error when I run a full-text search for the full-width number "9" in Couchbase 6.0.3.
If I search for a string such as "9abc", the search succeeds, so I think the Couchbase search library recognizes "9" as a number and fails to parse it. I don't know how to resolve this problem. Please help!
Couchbase 6.0.3
ConjunctionQuery fts = SearchQuery.conjuncts(SearchQuery.queryString(source));
fts = fts.and(SearchQuery.matchPhrase("123").field("tm"));
fts = fts.and(SearchQuery.booleanField(true).field("active"));
SearchQuery query = new SearchQuery("segmentIndex", fts);
SearchQueryResult result = bucket.query(query);
The exception thrown is: err: bleve: QueryBleve validating request, err: parse error: error parsing number: strconv.ParseFloat: parsing.
{
"name": "tmSegmentIndex",
"type": "fulltext-index",
"params": {
"doc_config": {
"docid_prefix_delim": "",
"docid_regexp": "",
"mode": "type_field",
"type_field": "type"
},
"mapping": {
"analysis": {
"analyzers": {
"remove_fullsize_number": {
"char_filters": [
"remove_fullsize_number"
],
"token_filters": [
"cjk_bigram",
"cjk_width"
],
"tokenizer": "whitespace",
"type": "custom"
}
},
"char_filters": {
"remove_fullsize_number": {
"regexp": "9",
"replace": "9",
"type": "regexp"
}
}
},
"default_analyzer": "cjk",
"default_datetime_parser": "dateTimeOptional",
"default_field": "_all",
"default_mapping": {
"default_analyzer": "cjk",
"dynamic": true,
"enabled": true
},
"default_type": "_default",
"docvalues_dynamic": true,
"index_dynamic": true,
"store_dynamic": false,
"type_field": "_type"
},
"store": {
"indexType": "scorch",
"kvStoreName": "mossStore"
}
},
"sourceType": "couchbase",
"sourceName": "tm-segment",
"sourceUUID": "973fdbffc567cdfe8f423289b9700f19",
"sourceParams": {},
"planParams": {
"maxPartitionsPerPIndex": 171,
"numReplicas": 0
},
"uuid": "1265a6bedbfd027c"
}
Can you try a custom analyzer with an asciifolding character filter like the one below?
Also, when you search directly from the UI box without a field name, the query is run against the "_all" field, which won't use the right/intended analyzer for parsing the query text.
You can field-scope the query there, e.g. field:"9" (a Java sketch of field-scoping follows the index definition below).
{
"type": "fulltext-index",
"name": "FTS",
"uuid": "401ee8132818cee3",
"sourceType": "couchbase",
"sourceName": "sample",
"sourceUUID": "6bd6d0b1c714fcd7697a349ff8166bf8",
"planParams": {
"maxPartitionsPerPIndex": 171,
"indexPartitions": 6
},
"params": {
"doc_config": {
"docid_prefix_delim": "",
"docid_regexp": "",
"mode": "type_field",
"type_field": "type"
},
"mapping": {
"analysis": {
"analyzers": {
"custom": {
"char_filters": [
"asciifolding"
],
"tokenizer": "unicode",
"type": "custom"
}
}
},
"default_analyzer": "standard",
"default_datetime_parser": "dateTimeOptional",
"default_field": "_all",
"default_mapping": {
"dynamic": false,
"enabled": true,
"properties": {
"id": {
"dynamic": false,
"enabled": true,
"fields": [
{
"analyzer": "custom",
"docvalues": true,
"include_in_all": true,
"include_term_vectors": true,
"index": true,
"name": "id",
"type": "text"
}
]
}
}
},
"default_type": "_default",
"docvalues_dynamic": true,
"index_dynamic": true,
"store_dynamic": false,
"type_field": "_type"
},
"store": {
"indexType": "scorch"
}
},
"sourceParams": {}
}
Asciifolding filters are part of the Couchbase 6.5.0 release. It's available in beta for trials.
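Tying this back to the Java SDK snippet in the question, here is a hedged sketch of field-scoping the user text with a match query instead of queryString. The field names come from the question; the rest is illustrative, not a confirmed fix.
// Hedged sketch: scope the text to the "tm" field so that field's analyzer is applied
// and the full-width "9" is not parsed as a number by the query-string syntax.
ConjunctionQuery fts = SearchQuery.conjuncts(
        SearchQuery.match(source).field("tm"));
fts = fts.and(SearchQuery.booleanField(true).field("active"));
SearchQuery query = new SearchQuery("segmentIndex", fts);
SearchQueryResult result = bucket.query(query);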
I am currently getting data from an external API for use in my Laravel API. I have everything working, but I feel like it is slow.
I'm getting the data from the API with Http::get('url'), and that part is fast. It is only when I start looping through the data and making edits that things slow down.
I don't need all the data, but it would still be nice to edit it before entering it into the database, if possible, as things aren't very consistent. I also have a few columns that use the data and some logic to build new columns, so that each app/site doesn't need to do it.
I am saving to the database on each foreach iteration with the Eloquent Model::updateOrCreate() method, which works, but these JSON files can easily be 6000 lines long or more, so it obviously takes time to loop through each set, modify values, and then save to the database each time. There usually aren't more than 200 or so entries, but it still takes time. I will probably eventually update this to the new upsert() method to make fewer queries to the database. Running on my localhost, it currently takes about a minute and a half, which just seems way too long.
Here is a shortened version of how I was looping through the data.
$json = json_decode($contents, true);
$features = $json['features'];
foreach ($features as $feature){
// Get ID
$id = $feature['id'];
// Get primary condition data
$geometry = $feature['geometry'];
$properties = $feature['properties'];
// Get secondary geometry data
$geometryType = $geometry['type'];
$coordinates = $geometry['coordinates'];
Model::updateOrCreate(
[
'id' => $id,
],
[
'coordinates' => $coordinates,
'geometry_type' => $geometryType,
]);
}
Most of what I'm doing behind the scenes to the data before going into the database is cleaning up some text strings but there are a few logic things to normalize or prep the data for websites and apps.
Is there a more efficient way to get the same result? This will ultimately be used in a scheduler and run on an interval.
Example Data structure from API documentation
{
"$schema": "http://json-schema.org/draft-04/schema#",
"additionalProperties": false,
"properties": {
"features": {
"items": {
"additionalProperties": false,
"properties": {
"attributes": {
"type": [
"object",
"null"
]
},
"geometry": {
"additionalProperties": false,
"properties": {
"coordinates": {
"items": {
"items": {
"type": "number"
},
"type": "array"
},
"type": "array"
},
"type": {
"type": "string"
}
},
"required": [
"coordinates",
"type"
],
"type": "object"
},
"properties": {
"additionalProperties": false,
"properties": {
"currentConditions": {
"items": {
"properties": {
"additionalData": {
"type": "string"
},
"conditionDescription": {
"type": "string"
},
"conditionId": {
"type": "integer"
},
"confirmationTime": {
"type": "integer"
},
"confirmationUserName": {
"type": "string"
},
"endTime": {
"type": "integer"
},
"id": {
"type": "integer"
},
"sourceType": {
"type": "string"
},
"startTime": {
"type": "integer"
},
"updateTime": {
"type": "integer"
}
},
"required": [
"id",
"userName",
"updateTime",
"startTime",
"conditionId",
"conditionDescription",
"confirmationUserName",
"confirmationTime",
"sourceType",
"endTime"
],
"type": "object"
},
"type": "array"
},
"id": {
"type": "string"
},
"name": {
"type": "string"
},
"nameId": {
"type": "string"
},
"parentAreaId": {
"type": "integer"
},
"parentSubAreaId": {
"type": "integer"
},
"primaryLatitude": {
"type": "number"
},
"primaryLongitude": {
"type": "number"
},
"primaryMP": {
"type": "number"
},
"routeId": {
"type": "integer"
},
"routeName": {
"type": "string"
},
"routeSegmentIndex": {
"type": "integer"
},
"secondaryLatitude": {
"type": "number"
},
"secondaryLongitude": {
"type": "number"
},
"secondaryMP": {
"type": "number"
},
"sortOrder": {
"type": "integer"
}
},
"required": [
"id",
"name",
"nameId",
"routeId",
"routeName",
"primaryMP",
"secondaryMP",
"primaryLatitude",
"primaryLongitude",
"secondaryLatitude",
"secondaryLongitude",
"sortOrder",
"parentAreaId",
"parentSubAreaId",
"routeSegmentIndex",
"currentConditions"
],
"type": "object"
},
"type": {
"type": "string"
}
},
"required": [
"type",
"geometry",
"properties",
"attributes"
],
"type": "object"
},
"type": "array"
},
"type": {
"type": "string"
}
},
"required": [
"type",
"features"
],
"type": "object"
}
Second, related question.
Since this is being updated on an interval I have it updating and creating records from the json data, but is there an efficient way to delete old records that are no longer in the json file? I currently get an array of current ids and compare them to the new ids and then loop through each and delete them. There has to be a better way.
I have no idea what to say about your first question, but I think you could try something like this for the second one.
SomeModel::query()->whereNotIn('id', $newIds)->delete();
You can collect $newIds during the first loop.
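A minimal sketch of how both pieces could fit together, assuming Laravel 8+ (where Eloquent's upsert() is available) and the column names from the question; the JSON-encoding of coordinates is an assumption about how that column is stored.
$json = json_decode($contents, true);

$newIds = [];
$rows = [];

foreach ($json['features'] as $feature) {
    $newIds[] = $feature['id'];

    // Build the rows in memory instead of writing on every iteration.
    $rows[] = [
        'id'            => $feature['id'],
        // upsert() writes raw values, so encode the array manually
        // (assumption: the coordinates column stores JSON text).
        'coordinates'   => json_encode($feature['geometry']['coordinates']),
        'geometry_type' => $feature['geometry']['type'],
    ];
}

// One bulk write instead of one updateOrCreate() query per feature.
Model::upsert($rows, ['id'], ['coordinates', 'geometry_type']);

// Delete records whose ids are no longer present in the feed.
Model::whereNotIn('id', $newIds)->delete();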
Issue: a completion suggester with a custom keyword lowercase analyzer is not working as expected. We can reproduce the issue with the following steps.
I'm not able to understand what's causing the issue here. However, if we search for "PRAXIS CONSULTING AND INFORMATION SERVICES PRIVATE", it does return a result.
Create index
curl -X PUT "localhost:9200/com.tmp.index?pretty" -H 'Content-Type: application/json' -d'{
"mappings": {
"dynamic": "false",
"properties": {
"namesuggest": {
"type": "completion",
"analyzer": "keyword_lowercase_analyzer",
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50,
"contexts": [
{
"name": "searchable",
"type": "CATEGORY"
}
]
}
}
},
"settings": {
"index": {
"mapping": {
"ignore_malformed": "true"
},
"refresh_interval": "5s",
"analysis": {
"analyzer": {
"keyword_lowercase_analyzer": {
"filter": [
"lowercase"
],
"type": "custom",
"tokenizer": "keyword"
}
}
},
"number_of_replicas": "0",
"number_of_shards": "1"
}
}
}'
Index document
curl -X PUT "localhost:9200/com.tmp.index/_doc/123?pretty" -H 'Content-Type: application/json' -d'{
"namesuggest": {
"input": [
"PRAXIS CONSULTING AND INFORMATION SERVICES PRIVATE LIMITED."
],
"contexts": {
"searchable": [
"*"
]
}
}
}
'
Issue: the completion suggester does not return a result
curl -X GET "localhost:9200/com.tmp.index/_search?pretty" -H 'Content-Type: application/json' -d'{
"suggest": {
"legalEntity": {
"prefix": "PRAXIS CONSULTING AND INFORMATION SERVICES PRIVATE LIMITED.",
"completion": {
"field": "namesuggest",
"size": 10,
"contexts": {
"searchable": [
{
"context": "*",
"boost": 1,
"prefix": false
}
]
}
}
}
}
}'
You are facing this issue because the default value of the max_input_length parameter is 50.
Below is the description of this parameter from the documentation:
Limits the length of a single input, defaults to 50 UTF-16 code
points. This limit is only used at index time to reduce the total
number of characters per input string in order to prevent massive
inputs from bloating the underlying datastructure. Most use cases
won’t be influenced by the default value since prefix completions
seldom grow beyond prefixes longer than a handful of characters.
If you enter the string below, which is exactly 50 characters, you will get a response:
PRAXIS CONSULTING AND INFORMATION SERVICES PRIVATE
Now, if you add one or two more characters to the string above, it will not return the result:
PRAXIS CONSULTING AND INFORMATION SERVICES PRIVATE L
You can keep this default behaviour, or you can update your index mapping with an increased value of the max_input_length parameter and reindex your data (see the reindex sketch after the mapping below).
{
"mappings": {
"dynamic": "false",
"properties": {
"namesuggest": {
"type": "completion",
"analyzer": "keyword_lowercase_analyzer",
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 100,
"contexts": [
{
"name": "searchable",
"type": "CATEGORY"
}
]
}
}
},
"settings": {
"index": {
"mapping": {
"ignore_malformed": "true"
},
"refresh_interval": "5s",
"analysis": {
"analyzer": {
"keyword_lowercase_analyzer": {
"filter": [
"lowercase"
],
"type": "custom",
"tokenizer": "keyword"
}
}
},
"number_of_replicas": "0",
"number_of_shards": "1"
}
}
}
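For completeness, a hedged sketch of one way to apply the change: create a new index with the updated mapping and copy the existing documents over with the _reindex API, then point the application at the new index. The index name com.tmp.index.v2 and the file updated-mapping.json (holding the mapping body shown above) are illustrative, not part of the original setup.
# Hedged sketch: create a new index with the updated mapping, then reindex into it.
curl -X PUT "localhost:9200/com.tmp.index.v2?pretty" -H 'Content-Type: application/json' -d @updated-mapping.json

curl -X POST "localhost:9200/_reindex?pretty" -H 'Content-Type: application/json' -d'{
  "source": { "index": "com.tmp.index" },
  "dest": { "index": "com.tmp.index.v2" }
}'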
You will get a response like the one below after updating the index:
"suggest": {
"legalEntity": [
{
"text": "PRAXIS CONSULTING AND INFORMATION SERVICES PRIVATE LIMITED",
"offset": 0,
"length": 58,
"options": [
{
"text": "PRAXIS CONSULTING AND INFORMATION SERVICES PRIVATE LIMITED.",
"_index": "74071871",
"_id": "123",
"_score": 1,
"_source": {
"namesuggest": {
"input": [
"PRAXIS CONSULTING AND INFORMATION SERVICES PRIVATE LIMITED."
],
"contexts": {
"searchable": [
"*"
]
}
}
},
"contexts": {
"searchable": [
"*"
]
}
}
]
}
]
}
I have my data indexed in Elasticsearch version 7.11. This is the mapping I got when I directly added documents to my index.
{"properties":{"name":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}
I haven't added the keyword part, but I have no idea where it came from.
I am running a wildcard query on the same field, but I am unable to get data for keywords with spaces.
{
"query": {
"bool":{
"should":[
{"wildcard": {"name":"*hello world*"}}
]
}
}
}
I have seen many answers related to not_analyzed, and I have tried updating {"index": "true"} in the mapping, but with no help. How do I make the wildcard search work in this version of Elasticsearch?
I tried adding the wildcard field type:
PUT http://localhost:9001/indexname/_mapping
{
"properties": {
"name": {
"type" :"wildcard"
}
}
}
And got the following response:
{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "mapper [name] cannot be changed from type [text] to [wildcard]"
}
],
"type": "illegal_argument_exception",
"reason": "mapper [name] cannot be changed from type [text] to [wildcard]"
},
"status": 400
}
Here is a sample document to match:
{
"_index": "accelerators",
"_type": "_doc",
"_id": "602ec047a70f7f30bcf75dec",
"_score": 1.0,
"_source": {
"acc_id": "602ec047a70f7f30bcf75dec",
"name": "hello world example",
"type": "Accelerator",
"description": "khdkhfk ldsjl klsdkl",
"teamMembers": [
{
"userId": "karthik.r#gmail.com",
"name": "Karthik Ganesh R",
"shortName": "KR",
"isOwner": true
},
{
"userId": "anand.sajan#gmail.com",
"name": "Anand Sajan",
"shortName": "AS",
"isOwner": false
}
],
"sectorObj": [
{
"item_id": 14,
"item_text": "Cross-sector"
}
],
"geographyObj": [
{
"item_id": 4,
"item_text": "Global"
}
],
"technologyObj": [
{
"item_id": 1,
"item_text": "Artificial Intelligence"
}
],
"themeColor": 1,
"mainImage": "assets/images/Graphics/Asset 35.svg",
"features": [
{
"name": "Ideation",
"icon": "Asset 1007.svg"
},
{
"name": "Innovation",
"icon": "Asset 1044.svg"
},
{
"name": "Strategy",
"icon": "Asset 1129.svg"
},
{
"name": "Intuitive",
"icon": "Asset 964.svg"
}
],
"logo": {
"actualFileName": "",
"fileExtension": "",
"fileName": "",
"fileSize": 0,
"fileUrl": ""
},
"customLogo": {
"logoColor": "#B9241C",
"logoText": "EC",
"logoTextColor": "#F6F6FA"
},
"collaborators": [
{
"userId": "muhammed.arif#gmail.com",
"name": "muhammed Arif P T",
"shortName": "MA"
},
{
"userId": "anand.sajan#gmail.com",
"name": "Anand Sajan",
"shortName": "AS"
}
],
"created_date": "2021-02-18T19:30:15.238000Z",
"modified_date": "2021-03-11T11:45:49.583000Z"
}
}
You cannot modify a field mapping once created. However, you can create another sub-field of type wildcard, like this:
PUT http://localhost:9001/indexname/_mapping
{
"properties": {
"name": {
"type": "text",
"fields": {
"wildcard": {
"type" :"wildcard"
},
"keyword": {
"type" :"keyword",
"ignore_above":256
}
}
}
}
}
When the mapping is updated, you need to reindex your data so that the new field gets indexed, like this:
POST http://localhost:9001/indexname/_update_by_query
And then when this finishes, you'll be able to query on this new field like this:
{
"query": {
"bool": {
"should": [
{
"wildcard": {
"name.wildcard": "*hello world*"
}
}
]
}
}
}
I'm trying to run a range aggregation on the following data set:
{
"ProductType": 1,
"ProductDefinition": "fc588f8e-14f2-4871-891f-c73a4e3d17ca",
"ParentProduct": null,
"Sku": "074617",
"VariantSku": null,
"Name": "Paraboot Avoriaz/Jannu Marron Brut Marron Brown Hiking Boot Shoes",
"AllowOrdering": true,
"Rating": null,
"ThumbnailImageUrl": "/media/1106/074617.jpg",
"PrimaryImageUrl": "/media/1106/074617.jpg",
"Categories": [
"399d7b20-18cc-46c0-b63e-79eadb9390c7"
],
"RelatedProducts": [],
"Variants": [
"84a7ff9f-edf0-4aab-87f9-ba4efd44db74",
"e2eb2c50-6abc-4fbe-8fc8-89e6644b23ef",
"a7e16ccc-c14f-42f5-afb2-9b7d9aefbc5c"
],
"PriceGroups": [
"86182755-519f-4e05-96ef-5f93a59bbaec"
],
"DisplayName": "Paraboot Avoriaz/Jannu Marron Brut Marron Brown Hiking Boot Shoes",
"ShortDescription": "",
"LongDescription": "<ul><li>Paraboot Avoriaz Mountaineering Boots</li><li>Marron Brut Marron (Brown)</li><li>Full leather inners and uppers</li><li>Norwegien Welted Commando Sole</li><li>Hand made in France</li><li>Style number : 074617</li></ul><p>As featured on Pritchards.co.uk</p>",
"UnitPrices": {
"EUR 15 pct": 343.85
},
"Taxes": {
"EUR 15 pct": 51.5775
},
"PricesInclTax": {
"EUR 15 pct": 395.4275
},
"Slug": "paraboot-avoriazjannu-marron-brut-marron-brown-hiking-boot-shoes",
"VariantsProperties": [
{
"Key": "ShoeSize",
"Value": "8"
},
{
"Key": "ShoeSize",
"Value": "10"
},
{
"Key": "ShoeSize",
"Value": "6"
}
],
"Guid": "0d4f6899-c66a-4416-8f5d-26822c3b57ae",
"Id": 178,
"ShowOnHomepage": true
}
I'm aggregating on VariantsProperties, which has the following mapping:
"VariantsProperties": {
"type": "nested",
"properties": {
"Key": {
"type": "keyword"
},
"Value": {
"type": "keyword"
}
}
}
Terms aggregations work fine with the following code:
{
"aggs": {
"Nest": {
"nested": {
"path": "VariantsProperties"
},
"aggs": {
"fieldIds": {
"terms": {
"field": "VariantsProperties.Key"
},
"aggs": {
"values": {
"terms": {
"field": "VariantsProperties.Value"
}
}
}
}
}
}
}
}
However, when I try to do a range aggregation to get shoes in sizes between 8 and 12, such as:
{
"aggs": {
"Nest": {
"nested": {
"path": "VariantsProperties"
},
"aggs": {
"fieldIds": {
"range": {
"field": "VariantsProperties.Value",
"ranges": [ { "from": 8, "to": 12 }]
}
}
}
}
}
}
I get the following error:
{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "Field [VariantsProperties.Value] of type [keyword] is not supported for aggregation [range]"
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "product-avenueproductindexdefinition-24476f82-en-us",
"node": "ejgN4XecT1SUfgrhzP8uZg",
"reason": {
"type": "illegal_argument_exception",
"reason": "Field [VariantsProperties.Value] of type [keyword] is not supported for aggregation [range]"
}
}
],
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Field [VariantsProperties.Value] of type [keyword] is not supported for aggregation [range]",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Field [VariantsProperties.Value] of type [keyword] is not supported for aggregation [range]"
}
}
},
"status": 400
}
Is there a way to "transform" the terms aggregation into a range aggregation, without the need of changing the schema? I know I could build the ranges myself by extracting the data from the terms aggregation and building the ranges out of it, however, I would prefer a solution within the elastic itself.
There are two ways to solve this:
Option A: Use a script instead of a field. This option will work without having to reindex your data, but depending on your volume of data, the performance might suffer.
POST test/_search
{
"aggs": {
"Nest": {
"nested": {
"path": "VariantsProperties"
},
"aggs": {
"fieldIds": {
"range": {
"script": "Integer.parseInt(doc['VariantsProperties.Value'].value)",
"ranges": [
{
"from": 8,
"to": 12
}
]
}
}
}
}
}
}
Option B: Add an integer sub-field in your mapping.
PUT my-index/_mapping
{
"properties": {
"VariantsProperties": {
"type": "nested",
"properties": {
"Key": {
"type": "keyword"
},
"Value": {
"type": "keyword",
"fields": {
"numeric": {
"type": "integer",
"ignore_malformed": true
}
}
}
}
}
}
}
Once your mapping is modified, you can run _update_by_query on your index in order to reindex the VariantsProperties.Value data:
POST my-index/_update_by_query
Finally, when this last command is done, you can run the range aggregation on the VariantsProperties.Value.numeric field.
Also note that this second option will be more performant in the long term.
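For reference, a hedged sketch of the range aggregation from the question rewritten against the new sub-field (same ranges as above):
{
  "aggs": {
    "Nest": {
      "nested": {
        "path": "VariantsProperties"
      },
      "aggs": {
        "fieldIds": {
          "range": {
            "field": "VariantsProperties.Value.numeric",
            "ranges": [ { "from": 8, "to": 12 } ]
          }
        }
      }
    }
  }
}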
I am migrating Elasticsearch prod data from 1.4.3 to 5.5, for which I am using reindex. When I try to reindex the old ES index to the new ES index, the reindexing fails with an exception: Failed Reason: mapper [THROUGHPUT_ROWS_PER_SEC] cannot be changed from type [long] to [float]. Failed Type: illegal_argument_exception
ES mapping for the task_history index in ES 1.4.3:
{
"task_history": {
"mappings": {
"task_run_hist": {
"_all": {
"enabled": false
},
"_routing": {
"required": true,
"path": "org_id"
},
"properties": {
"RUN_TIME_IN_MINS": {
"type": "double"
},
"THROUGHPUT_ROWS_PER_SEC": {
"type": "long"
},
"account_name": {
"type": "string",
"index": "not_analyzed",
"store": true
}
}
}
}
}
}
ES mapping for the task_history index in ES 5.5 (this mapping gets created as part of reindexing):
{
"task_history": {
"mappings": {
"task_run_hist": {
"_all": {
"enabled": false
},
"_routing": {
"required": true
},
"properties": {
"RUN_TIME_IN_MINS": {
"type": "float"
},
"THROUGHPUT_ROWS_PER_SEC": {
"type": "long"
},
"account_name": {
"type": "keyword",
"store": true
}
}
}
}
}
}
Sample data
{
"_index": "task_history",
"_type": "task_run_hist",
"_id": "1421955143",
"_score": 1,
"_source": {
"RUN_TIME_IN_MINS": 0.47,
"THROUGHPUT_ROWS_PER_SEC": 46,
"org_id": "xxxxxx",
"account_name": "Soma Acc1"
}
},
{
"_index": "task_history",
"_type": "task_run_hist",
"_id": "1421943738",
"_score": 1,
"_source": {
"RUN_TIME_IN_MINS": 1.02,
"THROUGHPUT_ROWS_PER_SEC": 65.28,
"org_id": "yyyyyy",
"account_name": "Choma Acc1"
}
}
Two questions:
How is Elasticsearch 1.4.3 saving float numbers when the mapping for THROUGHPUT_ROWS_PER_SEC is of type long?
If it's a data issue in the old ES, how can I remove all float numbers before starting the reindexing process?
For the second option, I am trying to list all documents having float numbers using the query below, so that I can verify and delete them, but the query still lists documents where THROUGHPUT_ROWS_PER_SEC is not a floating-point number.
Note: Groovy scripting is enabled
GET task_history/task_run_hist/_search?size=100
{
"filter": {
"script": {
"script": "doc['THROUGHPUT_ROWS_PER_SEC'].value % 1 == 0"
}
}
}
Update with the solution provided by Val:
When I try the script below in the reindex, I get a runtime error (listed below). Any clue what is going wrong here? I added an additional condition to convert RUN_TIME_IN_MINS to float, as your original script resulted in an error on the RUN_TIME_IN_MINS field: mapper [RUN_TIME_IN_MINS] cannot be changed from type [long] to [float]
POST _reindex?wait_for_completion=false
{
"source": {
"remote": {
"host": "http://esip:15000"
},
"index": "task_history"
},
"dest": {
"index": "task_history"
},
"script": {
"inline": "if (ctx._source.THROUGHPUT_ROWS_PER_SEC % 1 != 0) { ctx.op = 'noop' } ctx._source.RUN_TIME_IN_MINS = (float) ctx._source.RUN_TIME_IN_MINS;",
"lang": "painless"
}
}
Runtime error
{
"completed": true,
"task": {
"node": "wZOzypYlSayIRlhp9y3lVA",
"id": 645528,
"type": "transport",
"action": "indices:data/write/reindex",
"status": {
"total": 18249521,
"updated": 4691,
"created": 181721,
"deleted": 0,
"batches": 37,
"version_conflicts": 0,
"noops": 67076,
"retries": {
"bulk": 0,
"search": 0
},
"throttled_millis": 0,
"requests_per_second": -1,
"throttled_until_millis": 0
},
"description": """
reindex from [host=esip port=15000 query={
"match_all" : {
"boost" : 1.0
}
}][task_history] updated with Script{type=inline, lang='painless', idOrCode='if (ctx._source.THROUGHPUT_ROWS_PER_SEC % 1 != 0) { ctx.op = 'noop' } ctx._source.RUN_TIME_IN_MINS = (float) ctx._source.RUN_TIME_IN_MINS;', options={}, params={}} to [task_history]
""",
"start_time_in_millis": 1502336063507,
"running_time_in_nanos": 93094657751,
"cancellable": true
},
"error": {
"type": "script_exception",
"reason": "runtime error",
"script_stack": [],
"script": "if (ctx._source.THROUGHPUT_ROWS_PER_SEC % 1 != 0) { ctx.op = 'noop' } ctx._source.RUN_TIME_IN_MINS = (float) ctx._source.RUN_TIME_IN_MINS;",
"lang": "painless",
"caused_by": {
"type": "null_pointer_exception",
"reason": null
}
}
}
You obviously want to keep your existing ES 5.x mapping with a long, so all you need to do is add a script to your reindex call that skips documents whose THROUGHPUT_ROWS_PER_SEC is not a whole number. Something like this should do (a null-guarded variant is sketched after this call):
POST _reindex
{
"source": {
"remote": {
"host": "http://es1host:9200",
},
"index": "task_history"
},
"dest": {
"index": "task_history"
},
"script": {
"inline": "if (ctx._source.THROUGHPUT_ROWS_PER_SEC % 1 != 0) { ctx.op = 'noop' }" },
"lang": "painless"
}
}
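As a hedged follow-up on the runtime error shown in the update: the null_pointer_exception suggests some documents have no THROUGHPUT_ROWS_PER_SEC value at all, so the modulo check throws on null. A sketch of the same reindex call with a null guard, reusing the host from the question; this is an assumption about the cause, not a confirmed fix:
POST _reindex?wait_for_completion=false
{
  "source": {
    "remote": {
      "host": "http://esip:15000"
    },
    "index": "task_history"
  },
  "dest": {
    "index": "task_history"
  },
  "script": {
    "inline": "if (ctx._source.THROUGHPUT_ROWS_PER_SEC != null && ctx._source.THROUGHPUT_ROWS_PER_SEC % 1 != 0) { ctx.op = 'noop' }",
    "lang": "painless"
  }
}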