Setting a hardcoded value on an Elastic document with Painless - elasticsearch

I'm trying to learn Painless so that I can use it to enrich and manipulate incoming documents. However, every way I've seen of accessing the document results in errors.
I entered the following in the Painless Lab in Kibana; the errors I get are shown in the comments:
def paths = new String[3];
paths[0]= '.com';
paths[1] = 'bar.com';
paths[2] = 'foo.bar.com';
doc['my_field'] = paths; // does not work: '[Ljava.lang.String; cannot be cast to org.elasticsearch.index.fielddata.ScriptDocValues'
ctx.my_field = paths; // does not compile: 'cannot resolve symbol [ctx.my_field]'
return doc['my_field'] == 'field_value'; // does not work: 'No field found for [my_field] in mapping'
doc['my_field'] == 'field_value' complains despite the field being present in the test document, though doc.containsKey('my_field') does return false.
How should I actually be accessing and manipulating the incoming document? I'm using Elasticsearch 7.12.

You can create an ingest pipeline with a set processor to add a hardcoded value to each incoming document:
{
  "description" : "sets the value of count to 1",
  "processors": [
    {
      "set": {
        "field": "count",
        "value": 1
      }
    }
  ]
}
Each Painless context exposes only a specific set of variables and APIs. You are using String[], which may be causing the issue, so use a List such as ArrayList instead. You can check an example for the Painless Lab here.
Below is the script I tried in the Painless Lab; it works as expected:
def ctx = params.ctx;
ArrayList paths = new ArrayList();
paths.add('.com');
paths.add('bar.com');
paths.add('foo.bar.com');
ctx['my_field'] = paths;
return ctx
Add the following in the Parameters tab (I forgot to include this in the answer at first). It is required because in an actual implementation you read values from the context and update the context.
{
"ctx":{
"my_field":["test"]
}
}
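Painless lists and maps are the ordinary java.util collections, so the script above can be mirrored in plain Java to see what it does to the context. This is only a sketch that runs outside Elasticsearch: the class name CtxSketch and the method setPaths are made up for illustration, and ctx is modeled as a plain Map like the one supplied in the Parameters tab.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Painless script; not an Elasticsearch API.
final class CtxSketch {

    // Mirrors the script: build a list and store it under my_field.
    static Map<String, Object> setPaths(Map<String, Object> ctx) {
        List<String> paths = new ArrayList<>();
        paths.add(".com");
        paths.add("bar.com");
        paths.add("foo.bar.com");
        ctx.put("my_field", paths); // same effect as ctx['my_field'] = paths
        return ctx;
    }

    public static void main(String[] args) {
        Map<String, Object> ctx = new HashMap<>();
        // Same starting value as in the Parameters tab.
        ctx.put("my_field", new ArrayList<>(List.of("test")));
        System.out.println(setPaths(ctx));
    }
}
```

The existing value of my_field is replaced rather than appended to, which is also what the assignment in the script does.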

Related

Including distance in geo search result

I have implemented a distance-based search using the Elasticsearch Java API.
QueryBuilder geoQuery = QueryBuilders.geoDistanceQuery("article_location.location").point(lat , lon).distance(5 , DistanceUnit.KILOMETERS);
QueryBuilders.boolQuery().filter(geoQuery);
However, I would like to get the distance from the input lat/lon in each result. Is there any way to specify that?
I see that in the Elasticsearch DSL this can be done via script_fields, but I do not know how to do it via the Java API.
"script_fields": {
  "distance": {
    "script": "doc['latlong'].distanceInKm(lat, lon)"
  }
}
I'm using elasticsearch-rest-high-level-client version 7.9.2.
The Java client also lets you add script fields to your query, like this:
sourceBuilder.query(...);
// add lat/lon as script parameters
Map<String, Object> params = new HashMap<>();
params.put("lat", lat);
params.put("lon", lon);
// create the script; distanceInKm is not available in Painless, while arcDistance
// returns meters, so divide by 1000 to get kilometers
Script script = new Script(ScriptType.INLINE, "painless", "doc['latlong'].arcDistance(params.lat, params.lon) / 1000", params);
// add the script field to the source query builder
sourceBuilder.scriptField("distance", script);
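For intuition about what such a distance script field returns per hit, the great-circle distance between the query point and a document's point can be computed with the haversine formula. The sketch below is plain Java and is not Elasticsearch's own implementation; the class name, the method name and the mean Earth radius of 6371.0088 km are assumptions chosen for illustration.

```java
// Hypothetical helper; it only illustrates the kind of value a distance script field yields.
final class GeoDistanceSketch {

    static final double EARTH_RADIUS_KM = 6371.0088; // assumed mean Earth radius

    // Great-circle distance in kilometers between two lat/lon points given in degrees.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Clamp to 1.0 to guard against rounding errors before asin.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    public static void main(String[] args) {
        // One degree of longitude along the equator.
        System.out.println(haversineKm(0.0, 0.0, 0.0, 1.0));
    }
}
```

Comparing a few hits against a local calculation like this is a quick way to check that the script field is reading the intended geo_point field and unit.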

ElasticSearch / NEST 6 - Serialization of enums as strings in terms query

I've been trying to update to ES6 and NEST 6 and running into issues with NEST serializing of search requests - specifically serializing Terms queries where the underlying C# type is an enum.
I've got a Status enum mapped in my index as a Keyword, and it is correctly stored in its string representation by using NEST.JsonNetSerializer and setting the contract JSON converter, as per "Elasticsearch / NEST 6 - storing enums as string".
The issue comes when trying to search based on this Status enum. When I try to use a Terms query to specify multiple values, these values are being serialized as integers in the request and causing the search to find no results due to the type mismatch.
Interestingly the enum is serialized correctly as a string in a Term query, so I'm theorizing that the StringEnumConverter is being ignored in a scenario where it's having to serialize a collection of enums rather than a single enum.
Let's show it a little more clearly in code. Here's the enum and the (simplified) model used to define the index:
public enum CampaignStatus
{
Active = 0,
Sold = 1,
Withdrawn = 2
}
public class SalesCampaignSearchModel
{
[Keyword]
public Guid Id { get; set; }
[Keyword(DocValues = true)]
public CampaignStatus CampaignStatus { get; set; }
}
Here's a snippet of constructing the settings for the ElasticClient:
var pool = new SingleNodeConnectionPool(new Uri(nodeUri));
var connectionSettings = new ConnectionSettings(pool, (builtin, serializerSettings) =>
new JsonNetSerializer(builtin,
serializerSettings,
contractJsonConverters: new JsonConverter[]{new StringEnumConverter()}
)
)
.EnableHttpCompression();
Here's the Term query that correctly returns results:
var singleTermFilterQuery = new SearchDescriptor<SalesCampaignSearchModel>()
.Query(x => x.Term(y => y.Field(z => z.CampaignStatus).Value(CampaignStatus.Active)));
Generating the request:
{
"query": {
"term": {
"campaignStatus": {
"value": "Active"
}
}
}
}
Here's the Terms query that does not return results:
var termsFilterQuery = new SearchDescriptor<SalesCampaignSearchModel>()
.Query(x => x.Terms(y => y.Field(z => z.CampaignStatus).Terms(CampaignStatus.Active, CampaignStatus.Sold)));
Generating the request:
{
"query": {
"terms": {
"campaignStatus": [
0,
1
]
}
}
}
So far I've had a pretty good poke around at the options being presented by the JsonNetSerializer, tried a bunch of the available attributes (NEST.StringEnumAttribute, [JsonConverter(typeof(StringEnumConverter))] rather than using the global one on the client, having an explicit filter object with ItemConverterType set on the collection of CampaignStatuses, etc.) and the only thing that has had any success was a very brute-force .ToString() every time I need to query on an enum.
These are toy examples from a reasonably extensive codebase that I'm trying to migrate across to NEST 6, so what I'm wanting is to be able to specify global configuration somewhere rather than multiple developer teams needing to be mindful of this kind of eccentricity.
So yeah... I've been looking at this for a couple of days now. Good chances there's something silly I've missed. Otherwise I'm wondering if I need to be providing some JsonConverter with a contract that would match to an arbitrary collection of enums, and whether NEST and their tweaked Json.NET serializer should just be doing that kind of recursive resolution out of the box already.
Any help would be greatly appreciated, as I'm going a bit crazy with this one.

Elasticsearch 2.x index mapping _id

I ran ElasticSearch 1.x (happily) for over a year. Now it's time for some upgrading - to 2.1.x. The nodes should be turned off and then (one-by-one) on again. Seems easy enough.
But then I ran into trouble. The major problem is the field _uid, which I set myself so that I knew the exact location of a document from any other one (by hashing a value). This way I knew that only that exact document would be returned. During the upgrade I got:
MapperParsingException[Field [_uid] is a metadata field and cannot be added inside a document. Use the index API request parameters.]
But when I try to map my former _uid to _id (which should also be good enough), I get a similar error.
The reason I used the _uid field is that its lookup time is a lot lower than that of a termsQuery (or the like).
How can I still use the _uid or _id field in each document for fast, exact lookup of specific documents? Note that I have to fetch thousands of exact documents at a time, so I need an ID-like query. It may also happen that a document with the given _uid or _id does not exist (in that case I want, as now, a 'false-like' result).
Note: The upgrade from 1.x to 2.x is pretty big (filters gone, no dots in field names, no default access to _xxx).
Update (to no avail):
Updating the mapping of _uid or _id using:
final XContentBuilder mappingBuilder = XContentFactory.jsonBuilder().startObject().startObject(type).startObject("_id").field("enabled", "true").field("default", "xxxx").endObject()
.endObject().endObject();
CLIENT.admin().indices().prepareCreate(index).addMapping(type, mappingBuilder)
.setSettings(Settings.settingsBuilder().put("number_of_shards", nShards).put("number_of_replicas", nReplicas)).execute().actionGet();
results in:
MapperParsingException[Failed to parse mapping [XXXX]: _id is not configurable]; nested: MapperParsingException[_id is not configurable];
Update: Changed the name to _id instead of _uid, since the latter is built out of _type#_id. So then I'd need to be able to write to _id.
Since there appears to be no way to set _uid or _id inside the document, I'll post my solution. I mapped every document that had a _uid to a uid field (for internal referencing). At some point it occurred to me that you can set the relevant id when indexing.
To bulk insert documents with an id you can:
final BulkRequestBuilder builder = client.prepareBulk();
for (final Doc doc : docs) {
builder.add(client.prepareIndex(index, type, doc.getId()).setSource(doc.toJson()));
}
final BulkResponse bulkResponse = builder.execute().actionGet();
Notice the third argument: it may be null (or left out by using the two-argument overload), in which case the id will be generated by ES.
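If the id is derived by hashing a value, as described in the question, the string passed as that third argument could be produced along these lines. SHA-256 and the names DocIdSketch and idFor are assumptions for illustration; the original post does not say which hash was used.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a stable document id from an arbitrary value.
final class DocIdSketch {

    // Returns the lowercase hex SHA-256 digest of the value (64 characters).
    static String idFor(String value) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(idFor("some value identifying the document"));
    }
}
```

Because the same input always yields the same id, the id of any document can be recomputed later from its value and used for a get or multi-get without running a search.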
To then get some documents by id you can:
final List<String> uids = getUidsFromSomeMethod(); // ids of the documents to get
final MultiGetRequestBuilder builder = CLIENT.prepareMultiGet();
builder.add(index_name, type, uids);
final MultiGetResponse multiResponse = builder.execute().actionGet();
for (final MultiGetItemResponse response : multiResponse.getResponses()) {
    // note: getResponse() is null for items that failed
    if (only_want_to_know_whether_it_exists) {
        // in this case I simply want to know whether the doc exists
        exist.add(response.getResponse().isExists());
    } else {
        // retrieve the doc as JSON (read from the item response, not from the request builder)
        final String string = response.getResponse().getSourceAsString();
        // handle JSON
    }
}
If you only want one:
final GetResponse response = client.prepareGet().setIndex(index).setType(type).setId(id).execute().actionGet();
The equivalent over the REST API is shown in the mapping-id-field documentation (note: exact copy):
# Example documents
PUT my_index/my_type/1
{
"text": "Document with ID 1"
}
PUT my_index/my_type/2
{
"text": "Document with ID 2"
}
GET my_index/_search
{
"query": {
"terms": {
"_id": [ "1", "2" ]
}
},
"script_fields": {
"UID": {
"script": "doc['_id']"
}
}
}

How to update multiple fields using java api elasticsearch script

I am trying to update multiple values in an index using the Java API through an Elasticsearch script, but I am not able to update the fields.
Sample code:
1:
UpdateResponse response = request.setScript("ctx._source").setScriptParams(scriptParams).execute().actionGet();
2:
UpdateResponse response = request.setScript("ctx._source.").setScriptParams(scriptParams).execute().actionGet();
If I include the trailing dot ("ctx._source."), I get an IllegalArgumentException; if I omit it, no exception is thrown but the values are not updated in the index.
Can anyone tell me how to resolve this?
First of all, your script (ctx._source) doesn't do anything, as one of the commenters already pointed out. If you want to update, say, field "a", then you would need a script like:
ctx._source.a = "foobar"
This would assign the string "foobar" to field "a". You can do more than simple assignment, though. Check out the docs for more details and examples:
http://www.elasticsearch.org/guide/reference/api/update/
Updating multiple fields with one script is also possible. You can use semicolons to separate different MVEL instructions. E.g.:
ctx._source.a = "foo"; ctx._source.b = "bar"
Elasticsearch has an Update Java API. Look at the following code:
client.prepareUpdate("index","type","1153")
.addScriptParam("assignee", assign)
.addScriptParam("newobject", responsearray)
.setScript("ctx._source.assignee=assignee;ctx._source.responsearray=newobject ").execute().actionGet();
Here, the assign variable contains an object value and the responsearray variable contains a list of data.
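When many fields are updated this way, the semicolon-separated script can be generated instead of written by hand. The sketch below is plain Java with made-up names (UpdateScriptSketch, buildScript); it assumes each script parameter is named after the field it fills and that field names are simple identifiers, which is a narrower convention than the answer above uses (there the parameter newobject fills responsearray).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds one update script that assigns several fields.
final class UpdateScriptSketch {

    // For ["a", "b"] this returns "ctx._source.a = a; ctx._source.b = b".
    static String buildScript(List<String> fields) {
        List<String> assignments = new ArrayList<>();
        for (String field : fields) {
            // Assumption: the script parameter has the same name as the field.
            assignments.add("ctx._source." + field + " = " + field);
        }
        return String.join("; ", assignments);
    }

    public static void main(String[] args) {
        System.out.println(buildScript(List.of("assignee", "responsearray")));
    }
}
```

The resulting string would be passed to setScript, with one script parameter added per field name.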
You can do the same with the Spring Java client using the following code. I am also listing the imports used in the code.
import java.util.Date;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.index.query.QueryBuilder;
import org.springframework.data.elasticsearch.core.query.UpdateQuery;
import org.springframework.data.elasticsearch.core.query.UpdateQueryBuilder;
private UpdateQuery updateExistingDocument(String Id) {
    // Add the UpdatedDateTime, CreatedDateTime, CreatedBy and UpdatedBy fields to an existing document
    UpdateRequest updateRequest = new UpdateRequest().doc("UpdatedDateTime", new Date(), "CreatedDateTime", new Date(), "CreatedBy", "admin", "UpdatedBy", "admin");
    // Create the update query
    UpdateQuery updateQuery = new UpdateQueryBuilder().withId(Id).withClass(ElasticSearchDocument.class).build();
    updateQuery.setUpdateRequest(updateRequest);
    // Execute the update
    elasticsearchTemplate.update(updateQuery);
    // The method declares an UpdateQuery return type, so return the executed query
    return updateQuery;
}
XContentType contentType =
    org.elasticsearch.client.Requests.INDEX_CONTENT_TYPE;
public XContentBuilder getBuilder(User assign) {
    try {
        XContentBuilder builder = XContentFactory.contentBuilder(contentType);
        builder.startObject();
        // objectMap is a Jackson ObjectMapper; convert the user into a plain map
        Map<String, ?> assignMap = objectMap.convertValue(assign, Map.class);
        builder.field("assignee", assignMap);
        builder.endObject(); // close the object opened above
        return builder;
    } catch (IOException e) {
        log.error("custom field index", e);
        // rethrow so the caller never receives a missing source
        throw new java.io.UncheckedIOException(e);
    }
}
IndexRequest indexRequest = new IndexRequest();
indexRequest.source(getBuilder(assign));
UpdateQuery updateQuery = new UpdateQueryBuilder()
    .withType(<IndexType>)
    .withIndexName(<IndexName>)
    .withId(String.valueOf(id))
    .withClass(<IndexClass>)
    .withIndexRequest(indexRequest)
    .build();

GSON - Exclude Object based on Value of Field

I have some JSON which contains a key named "type". This key can have the value include or exclude. I want to configure Gson so that it skips deserialization, and creates no object, when the value is exclude.
I realize I can write a custom deserializer, check for the appropriate value, and create the object or not. However, I was not sure if there was another way using some type of exclusion strategy.
The example I outlined is over-simplified. My real JSON contains many more fields.
// Deserialize me
{
"type" : "include"
}
// Skip over me, and do not deserialize
{
"type" : "exclude"
}
I don't think the ExclusionStrategy can help here. It works with classes rather than instances and at the time instances get processed there's just a result of its evaluation present (in case you want to have a look at the code, see ReflectiveTypeAdapterFactory.BoundField).
This might help you...
Gson gson = new Gson();
JsonObject jsonObj = gson.fromJson (jsonStr, JsonElement.class).getAsJsonObject();
// Just read the required field
if(jsonObj.get("type").getAsString().equals("include")) {
// Continue parsing/deserializing other details
} else {
// Skip
}
You can refer to this for the Gson API documentation.
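The peek-then-decide approach can be wrapped so that callers receive an empty result for skipped objects instead of null. The sketch below is dependency-free Java: a Map stands in for the parsed JsonObject and a Function stands in for the real deserialization call, and the names ConditionalDeserializeSketch and deserializeIfIncluded are hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical helper showing the pattern; it does not use the Gson API itself.
final class ConditionalDeserializeSketch {

    // Applies the mapper only when the "type" entry equals "include".
    static <T> Optional<T> deserializeIfIncluded(Map<String, Object> tree,
                                                 Function<Map<String, Object>, T> mapper) {
        Object type = tree.get("type");
        if (!"include".equals(type)) {
            // Missing type, null type or "exclude": skip without creating an object.
            return Optional.empty();
        }
        return Optional.ofNullable(mapper.apply(tree));
    }

    public static void main(String[] args) {
        Map<String, Object> included = Map.of("type", "include");
        Map<String, Object> excluded = Map.of("type", "exclude");
        System.out.println(deserializeIfIncluded(included, tree -> "object built").isPresent());
        System.out.println(deserializeIfIncluded(excluded, tree -> "object built").isPresent());
    }
}
```

Comparing with "include".equals(type) also covers input where the type key is absent, which would otherwise raise a NullPointerException on the getAsString() call in the snippet above.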

Resources