How to update multiple fields using java api elasticsearch script - elasticsearch

I am trying to update multiple values in an index using the Java API through an Elasticsearch script, but I am not able to update the fields.
Sample code:
1:
UpdateResponse response = request.setScript("ctx._source").setScriptParams(scriptParams).execute().actionGet();
2:
UpdateResponse response = request.setScript("ctx._source.").setScriptParams(scriptParams).execute().actionGet();
If I include the dot in ("ctx._source.") I get an IllegalArgumentException, and if I do not use the dot, there is no exception but the values are not updated in the index.
Can anyone tell me how to resolve this?

First of all, your script (ctx._source) doesn't do anything, as one of the commenters already pointed out. If you want to update, say, field "a", then you would need a script like:
ctx._source.a = "foobar"
This would assign the string "foobar" to field "a". You can do more than simple assignment, though. Check out the docs for more details and examples:
http://www.elasticsearch.org/guide/reference/api/update/
Updating multiple fields with one script is also possible. You can use semicolons to separate different MVEL instructions. E.g.:
ctx._source.a = "foo"; ctx._source.b = "bar"

Elasticsearch has an Update Java API. Look at the following code:
client.prepareUpdate("index", "type", "1153")
        .addScriptParam("assignee", assign)
        .addScriptParam("newobject", responsearray)
        .setScript("ctx._source.assignee = assignee; ctx._source.responsearray = newobject")
        .execute().actionGet();
Here, the assign variable contains an object value and the responsearray variable contains a list of data.

You can do the same with the Spring Data Elasticsearch Java client using the following code. I have also listed the imports used in the code.
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.index.query.QueryBuilder;
import org.springframework.data.elasticsearch.core.query.UpdateQuery;
import org.springframework.data.elasticsearch.core.query.UpdateQueryBuilder;
private UpdateQuery updateExistingDocument(String id) {
    // Add UpdatedDateTime, CreatedDateTime, CreatedBy, UpdatedBy fields to an existing document
    UpdateRequest updateRequest = new UpdateRequest().doc(
            "UpdatedDateTime", new Date(),
            "CreatedDateTime", new Date(),
            "CreatedBy", "admin",
            "UpdatedBy", "admin");
    // Create the update query
    UpdateQuery updateQuery = new UpdateQueryBuilder().withId(id)
            .withClass(ElasticSearchDocument.class).build();
    updateQuery.setUpdateRequest(updateRequest);
    // Execute the update
    elasticsearchTemplate.update(updateQuery);
    return updateQuery;
}

XContentType contentType = org.elasticsearch.client.Requests.INDEX_CONTENT_TYPE;

public XContentBuilder getBuilder(User assign) {
    try {
        XContentBuilder builder = XContentFactory.contentBuilder(contentType);
        builder.startObject();
        Map<String, ?> assignMap = objectMap.convertValue(assign, Map.class);
        builder.field("assignee", assignMap);
        builder.endObject();
        return builder;
    } catch (IOException e) {
        log.error("custom field index", e);
    }
    return null;
}

// Wrap the partial document in an IndexRequest and build the update query
IndexRequest indexRequest = new IndexRequest();
indexRequest.source(getBuilder(assign));
UpdateQuery updateQuery = new UpdateQueryBuilder()
        .withType(<IndexType>)
        .withIndexName(<IndexName>)
        .withId(String.valueOf(id))
        .withClass(<IndexClass>)
        .withIndexRequest(indexRequest)
        .build();

Related

How to use ES Java API to create a new type of an index

I have succeeded in creating an index using a Client; the code looks like this:
public static boolean addIndex(Client client, String index) throws Exception {
    if (client == null) {
        client = getSettingClient();
    }
    CreateIndexRequestBuilder requestBuilder = client.admin().indices().prepareCreate(index);
    CreateIndexResponse response = requestBuilder.execute().actionGet();
    return response.isAcknowledged();
    // client.close();
}

public static boolean addIndexType(Client client, String index, String type) throws Exception {
    if (client == null) {
        client = getSettingClient();
    }
    TypesExistsAction action = TypesExistsAction.INSTANCE;
    TypesExistsRequestBuilder requestBuilder = new TypesExistsRequestBuilder(client, action, index);
    requestBuilder.setTypes(type);
    TypesExistsResponse response = requestBuilder.get();
    return response.isExists();
}
However, the addIndexType method has no effect; the type is not created.
How do I create the type?
You can create types when you create the index by providing a proper mapping configuration. Alternatively, a type gets created when you index a document of that type. However, the first suggestion is the better one, because then you can control the full mapping of that type instead of relying on dynamic mapping.
You can set types in the following way:
// JSON schema is the JSON mapping which you want to give for the index.
JSONObject builder = new JSONObject().put(type, JSONSchema);
// And then just fire the below command
client.admin().indices().preparePutMapping(indexName)
.setType(type)
.setSource(builder.toString(), XContentType.JSON)
.execute()
.actionGet();
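Alternatively, if you want the mapping applied at index-creation time (the first suggestion above), a minimal sketch, assuming the CreateIndexRequestBuilder.addMapping(String, String, XContentType) overload and a hypothetical mapping JSON, could look like this:
// Hypothetical mapping for the type; replace with your real field definitions
String mappingJson = "{\"properties\":{\"title\":{\"type\":\"text\"}}}";
client.admin().indices().prepareCreate(index)
        .addMapping(type, mappingJson, XContentType.JSON)
        .get();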

Using spring jdbc template to query for list of parameters

I'm new to the Spring JDBC template, and I'm wondering if I can pass a list of parameters and execute the query once for each parameter in the list. In the examples I've seen, the list of parameters passed is used for a single execution of the query with all of the parameters provided. Instead, I am trying to execute the query multiple times, each time using the next parameter in the list.
For example:
Let's say I have a List of Ids - params (Strings)
List<String> params = new ArrayList<String>();
params.add("1234");
params.add("2345");
trying to do something like:
getJdbcTemplate().query(sql, params, new CustomResultSetExtractor());
which I know, per the documentation, is not allowed; for one thing, it has to be an array. I've seen simple examples where the query is something like "select * from employee where id = ?" and they pass new Object[]{"1234"} into the method. I'm also trying to avoid the IN() condition. In my case each id will return multiple rows, which is why I'm using a ResultSetExtractor.
I know one option would be to iterate over list and include each id in list as a parameter, something like:
for(String id : params){
getJdbcTemplate().query(sql, new Object[]{id}, new CustomResultSetExtractor());
}
I just want to know if I can do this some other way. I should mention that I am trying to do a SELECT; originally I was hoping to return a list of custom objects for each result set.
You do need to pass an array of params for the API, but you may also assume that your first param is an array. I believe this should work:
String sql = "select * from employee where id in (:ids)"; // or should there be '?'
getJdbcTemplate().query(sql, new Object[]{params}, new CustomResultSetExtractor());
Or you could explicitly specify that the parameter is an array:
getJdbcTemplate().query(sql, new Object[]{params}, new int[]{java.sql.Types.ARRAY}, new CustomResultSetExtractor());
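Note that plain JdbcTemplate expects ? placeholders rather than named parameters, so the :ids style above may not bind as written. If you want named parameters with a list, a sketch using Spring's NamedParameterJdbcTemplate (an alternative to the approach above, not something the answer requires) would be:
String sql = "select * from employee where id in (:ids)";
NamedParameterJdbcTemplate namedTemplate = new NamedParameterJdbcTemplate(dataSource);
// params (the List<String> of ids) is expanded into the IN clause automatically
namedTemplate.query(sql, Collections.singletonMap("ids", params), new CustomResultSetExtractor());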
You can use a PreparedStatement and do a batch job, e.g. from http://docs.spring.io/spring/docs/current/spring-framework-reference/html/jdbc.html:
public int[] batchUpdate(final List<Actor> actors) {
    int[] updateCounts = jdbcTemplate.batchUpdate(
            "update t_actor set first_name = ?, " +
            "last_name = ? where id = ?",
            new BatchPreparedStatementSetter() {
                public void setValues(PreparedStatement ps, int i) throws SQLException {
                    ps.setString(1, actors.get(i).getFirstName());
                    ps.setString(2, actors.get(i).getLastName());
                    ps.setLong(3, actors.get(i).getId().longValue());
                }
                public int getBatchSize() {
                    return actors.size();
                }
            });
    return updateCounts;
}
I know you don't want to use the IN clause, but I think it's the best solution for your problem.
If you use a for loop in this way, I think it's not optimal.
for(String id : params){
getJdbcTemplate().query(sql, new Object[]{id}, new CustomResultSetExtractor());
}
I think it's a better solution to use the IN clause, and then use a ResultSetExtractor to iterate over the result data. Your extractor can return a Map instead of a List, actually a Map of Lists (a sketch follows the tutorial link below):
Map<Integer, List<MyObject>>
Here is a simple tutorial explaining its use:
http://pure-essence.net/2011/03/16/how-to-execute-in-sql-in-spring-jdbctemplate/
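A minimal sketch of such an extractor (MyObject and its columns are hypothetical placeholders):
public class GroupingExtractor implements ResultSetExtractor<Map<Integer, List<MyObject>>> {
    public Map<Integer, List<MyObject>> extractData(ResultSet rs) throws SQLException, DataAccessException {
        Map<Integer, List<MyObject>> result = new HashMap<Integer, List<MyObject>>();
        while (rs.next()) {
            int id = rs.getInt("id");
            List<MyObject> list = result.get(id);
            if (list == null) {
                list = new ArrayList<MyObject>();
                result.put(id, list);
            }
            // map the remaining columns onto your object as needed
            list.add(new MyObject(id, rs.getString("name")));
        }
        return result;
    }
}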
I think this is the best solution:
public List<TestUser> findUserByIds(int[] ids) {
    String[] s = new String[ids.length];
    Arrays.fill(s, "?");
    String placeholders = StringUtils.join(s, ',');
    return jdbcTemplate.query(String.format("select * from users where id in (%s)", placeholders),
            ArrayUtils.toObject(ids), new BeanPropertyRowMapper<>(TestUser.class));
}
This may be what you want. BeanPropertyRowMapper is just an example; it will be very slow when there are a lot of records, so you should change it to another, more efficient RowMapper.

saving & updating full json document with Spring data MongoTemplate

I'm using Spring Data MongoTemplate to manage Mongo operations. I'm trying to save & update full JSON documents (using String.class in Java).
Example:
String content = "{\"MyId\":\"1\",\"code\":\"UG\",\"variables\":[1,2,3,4,5]}";
String updatedContent = "{\"MyId\":\"1\",\"code\":\"XX\",\"variables\":[6,7,8,9,10]}";
I know that I can update code & variables independently using:
Query query = new Query(where("MyId").is("1"));
Update update1 = new Update().set("code", "XX");
getMongoTemplate().upsert(query, update1, collectionId);
Update update2 = new Update().set("variables", "[6,7,8,9,10]");
getMongoTemplate().upsert(query, update2, collectionId);
But due to our application architecture, it would be more useful for us to replace the full document directly. As far as I know:
getMongoTemplate().save(content,collectionId)
getMongoTemplate().save(updatedContent,collectionId)
implements save-or-update functionality, but this creates two separate documents; it does not update anything.
Am I missing something? Is there another approach? Thanks.
You can use the following code:
Query query = new Query();
query.addCriteria(Criteria.where("MyId").is("1"));
Update update = new Update();
Iterator<String> iterator = json.keys();
while (iterator.hasNext()) {
    String key = iterator.next();
    if (!key.equals("MyId")) {
        Object value = json.get(key);
        update.set(key, value);
    }
}
mongoTemplate.updateFirst(query, update, entityClass);
There may be other ways to get the key set from the JSON; use whichever is convenient for you. For example, you can use a BasicDBObject to get the key set, and you can obtain a BasicDBObject via mongoTemplate.getConverter().
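For completeness, the json variable used in the loop above is assumed to be an org.json.JSONObject parsed from the incoming document string, for example:
// updatedContent is the full JSON string from the question
JSONObject json = new JSONObject(updatedContent);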

Elasticearch and Spark: Updating existing entities

What is the correct way, when using Elasticsearch with Spark, to update existing entities?
I wanted to do something like the following:
Get existing data as a map.
Create a new map, and populate it with the updated fields.
Persist the new map.
However, there are several issues:
The list of returned fields cannot contain the _id, as it is not part of the source.
If, for testing, I hardcode an existing _id in the map of new values, the following exception is thrown:
org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest
How should the _id be retrieved, and how should it be passed back to Spark?
I include the following code below to better illustrate what I was trying to do:
JavaRDD<Map<String, Object>> esRDD = JavaEsSpark.esRDD(jsc, INDEX_NAME+"/"+TYPE_NAME,
        "?source=field1,field2").values();
Iterator<Map<String, Object>> iter = esRDD.toLocalIterator();
List<Map<String, Object>> listToPersist = new ArrayList<Map<String, Object>>();
while (iter.hasNext()) {
    Map<String, Object> map = iter.next();
    // Get existing values, and do transformation logic
    Map<String, Object> newMap = new HashMap<String, Object>();
    newMap.put("_id", ??????);
    newMap.put("field1", new_value);
    listToPersist.add(newMap);
}
JavaRDD javaRDD = jsc.parallelize(ImmutableList.copyOf(listToPersist));
JavaEsSpark.saveToEs(javaRDD, INDEX_NAME+"/"+TYPE_NAME);
Ideally, I would want to update the existing map in place, rather than create a new one.
Does anyone have any example code to show, when using Spark, the correct way to update existing entities in elasticsearch?
Thanks
This is how I've done it (Scala/Spark 2.3/Elastic-Hadoop v6.5).
To read (id or other metadata):
spark
.read
.format("org.elasticsearch.spark.sql")
.option("es.read.metadata",true) // allow to read metadata
.load("yourindex/yourtype")
.select(col("_metadata._id").as("myId"),...)
To update particular columns in ES:
myDataFrame
.select("myId","columnToUpdate")
.saveToEs(
"yourindex/yourtype",
Map(
"es.mapping.id" -> "myId",
"es.write.operation" -> "update", // important to change operation to partial update
"es.mapping.exclude" -> "myId"
)
)
Try adding this upsert to your Spark:
.config("es.write.operation", "upsert")
That will let you add new fields to existing documents.
According to the Elasticsearch configuration documentation, you can get document metadata like _id by setting the read metadata option to true:
.config("es.read.metadata", "true")
And I think you cannot use '_id' as a field name, but you can create a new field with a different name, like:
newMap.put("idfield", yourId);
Then set the name of the new field as the value of the mapping id option, to inform Elasticsearch that this field holds the document id:
.config("es.mapping.id", "idfield")
BTW, don't forget to set the write operation to update:
.config("es.write.operation", "update")

How to iterate through Elasticsearch source using Apache Spark?

I am trying to build a recommendation system by integrating Elasticsearch with Apache Spark. I am using Java. I am using the MovieLens dataset as example data, and I have already indexed it into Elasticsearch. So far, I have been able to read the input from the Elasticsearch index as follows:
SparkConf conf = new SparkConf().setAppName("Example App").setMaster("local");
conf.set("spark.serializer", org.apache.spark.serializer.KryoSerializer.class.getName());
conf.set("es.nodes", "localhost");
conf.set("es.port", "9200");
JavaSparkContext sc = new JavaSparkContext(conf);
JavaPairRDD<String, Map<String, Object>> esRDD = JavaEsSpark.esRDD(sc, "movielens/recommendation");
Using esRDD.collect(), I can see that I am retrieving the data from Elasticsearch correctly. Now I need to feed the user id, item id and preference from the Elasticsearch result into Spark's recommendation module. If I were using a csv file, I would be able to do it as follows:
String path = "resources/user_data.data";
JavaRDD<String> data = sc.textFile(path);
JavaRDD<Rating> ratings = data.map(
        new Function<String, Rating>() {
            public Rating call(String s) {
                String[] sarray = s.split(" ");
                return new Rating(Integer.parseInt(sarray[0]), Integer.parseInt(sarray[1]),
                        Double.parseDouble(sarray[2]));
            }
        }
);
What would be an equivalent mapping if I need to iterate through the Elasticsearch output stored in esRDD and create a similar map as above? If there is any example code that I could refer to, that would be a great help.
Apologies for not answering the Spark question directly, but in case you missed it, there is a description of doing recommendations on MovieLens data using elasticsearch here: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_significant_terms_demo.html
You have not specified the format of the data in Elasticsearch, but let's assume it has the fields userId, movieId and rating, so an example document looks something like {"userId":1,"movieId":1,"rating":4}.
Then you should be able to do (ignoring null checks etc.):
JavaRDD<Rating> ratings = esRDD.values().map(
        new Function<Map<String, Object>, Rating>() {
            public Rating call(Map<String, Object> m) {
                // the _source values come back as Objects, so convert via toString()
                int userId = Integer.parseInt(m.get("userId").toString());
                int movieId = Integer.parseInt(m.get("movieId").toString());
                double rating = Double.parseDouble(m.get("rating").toString());
                return new Rating(userId, movieId, rating);
            }
        }
);
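The resulting ratings RDD can then be fed into MLlib's ALS recommender. A short usage sketch (assuming org.apache.spark.mllib.recommendation.ALS, with hypothetical rank, iteration and lambda values):
// Train a matrix-factorization model and ask for the top 5 items for user 1
MatrixFactorizationModel model = ALS.train(JavaRDD.toRDD(ratings), 10, 10, 0.01);
Rating[] recommendations = model.recommendProducts(1, 5);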
