Setting Time To Live (TTL) from Java - sample requested - elasticsearch

EDIT:
This is basically what I want to do, only in Java
Using ElasticSearch, we add documents to an index by passing IndexRequest items to a BulkRequestBuilder.
I would like for the documents to be dropped from the index after some time has passed (time to live/ttl)
This can be done either by setting a default for the index, or on a per-document basis. Either approach is fine by me.
The code below is an attempt to do it per document. It does not work; I think that's because TTL is not enabled for the index. Either show me what Java code I need to add to enable TTL so the code below works, or show me different Java code that enables TTL and sets a default TTL value for the index. I know how to do it from the REST API, but I need to do it from Java code, if at all possible.
    logger.debug("Indexing record ({}): {}", id, map);
    final IndexRequest indexRequest = new IndexRequest(_indexName, _documentType, id);
    final long debug = indexRequest.ttl();
    if (_ttl > 0) {
        indexRequest.ttl(_ttl);
        System.out.println("Setting TTL to " + _ttl);
        System.out.println("IndexRequest now has ttl of " + indexRequest.ttl());
    }
    indexRequest.source(map);
    indexRequest.operationThreaded(false);
    bulkRequestBuilder.add(indexRequest);
} // end of loop over records
// execute and block until done.
BulkResponse response;
try {
    response = bulkRequestBuilder.execute().actionGet();
} catch (Exception e) {
    logger.error("Bulk index failed", e);
}
Later I check in my unit test by polling this method, but the document count never goes down.
public long getDocumentCount() throws Exception {
    Client client = getClient();
    try {
        client.admin().indices().refresh(new RefreshRequest(INDEX_NAME)).actionGet();
        ActionFuture<CountResponse> response = client.count(new CountRequest(INDEX_NAME).types(DOCUMENT_TYPE));
        CountResponse countResponse = response.get();
        return countResponse.getCount();
    } finally {
        client.close();
    }
}

After a LONG day of googling and writing test programs, I came up with a working example of how to use ttl and basic index/object creation from the Java API. Frankly most of the examples in the docs are trivial, and some JavaDoc and end-to-end examples would go a LONG way to help those of us who are using the non-REST interfaces.
Ah well.
Code here: Adding mapping to a type from Java - how do I do it?
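For anyone landing here, a minimal sketch of what enabling _ttl from Java can look like, assuming a pre-2.x client where the _ttl metadata field still exists (the index name "myindex" and type "mytype" are placeholders):

client.admin().indices().preparePutMapping("myindex")
    .setType("mytype")
    .setSource(XContentFactory.jsonBuilder()
        .startObject()
            .startObject("mytype")
                .startObject("_ttl")
                    .field("enabled", true)   // turn TTL on for this type
                    .field("default", "1d")   // expire documents after one day by default
                .endObject()
            .endObject()
        .endObject())
    .execute().actionGet();

With _ttl enabled in the mapping, the per-document indexRequest.ttl(_ttl) call above should take effect. Note that _ttl was deprecated in Elasticsearch 2.x and removed in 5.x.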

How to correctly get the results from an MgetResponse object?

In our app, we synchronize some of our data to Elasticsearch; part of this data is users' records. The app is Grails 5.1 and we are using the Elasticsearch Java API Client for the Elasticsearch integration.
The indexing is working perfectly fine, and an example of user data looks like this:
Now, we have the following function that is supposed to get the list of users by their ids:
PublicUser[] getAllByIds(Long[] ids) {
    MgetRequest request = new MgetRequest.Builder()
        .ids(ids.collect { it.toString() }.toList())
        .index("users")
        .build()
    MgetResponse<PublicUser> response = elasticSearchClientProviderService.getClient().mget(
        request,
        PublicUser.class
    )
    response.docs().collect {
        it.result().source()
    }
}
And when the response holds at least one user record, we get a list of PublicUser objects, as expected.
However, if the search result is empty, the eventual return from this function is a list with one null element.
Some investigation
response.docs() holds a single non-existing document (it looks like this one is filled in with the request data).
And, as a result, the return from this function is (as I mentioned above) a list with one null element.
Another observation: I expected the response object to have .hits(), with the actual results accessible through response.hits().hits(). But none of those exist.
The only reason I started looking into docs() directly is because of this documentation: https://www.elastic.co/guide/en/elasticsearch/reference/master/docs-multi-get.html
There is a lack of Elasticsearch Java API Client docs; they mostly refer to the REST API docs.
What is the correct way to get the list of results from mget request?
For now, I am solving it the following way. I'd be glad to hear of a better way, though.
PublicUser[] getAllByIds(Long[] ids) {
    MgetRequest request = new MgetRequest.Builder()
        .ids(ids.collect { it.toString() }.toList())
        .index("users")
        .build()
    MgetResponse<PublicUser> response = elasticSearchClientProviderService.getClient().mget(
        request,
        PublicUser.class
    )
    List<PublicUser> users = []
    response.docs().each {
        if (it.result().found()) {
            users.add(it.result().source())
        }
    }
    users
}
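From plain Java rather than Groovy, the same found()-based filtering can be written with streams. A minimal sketch, assuming the 8.x Elasticsearch Java API Client and a pre-built List<String> of ids (idStrings is a placeholder name):

List<PublicUser> users = client.mget(
        m -> m.index("users").ids(idStrings),  // idStrings: List<String>, assumed
        PublicUser.class
    )
    .docs().stream()
    .map(item -> item.result())
    .filter(result -> result.found())   // drop the placeholder docs for missing ids
    .map(result -> result.source())
    .collect(Collectors.toList());

Missing ids come back as items whose result has found() == false and a null source, which is why filtering on found() avoids the null elements described above.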

How can I enable automatic slicing on Elasticsearch operations like UpdateByQuery or Reindex using the Nest client?

I'm using the Nest client to programmatically execute requests against an Elasticsearch index. I need to use the UpdateByQuery API to update existing data in my index. To improve performance on large data sets, the recommended approach is to use slicing. In my case I'd like to use the automatic slicing feature documented here.
I've tested this out in the Kibana dev console and it works beautifully. I'm struggling with how to set this property in code through the Nest client interface. Here's a code snippet:
var request = new Nest.UpdateByQueryRequest(indexModel.Name);
request.Conflicts = Elasticsearch.Net.Conflicts.Proceed;
request.Query = filterQuery;
// TODO Need to set slices to auto but the current client doesn't allow it and the server
// rejects a value of 0
request.Slices = 0;
var elasticResult = await _elasticClient.UpdateByQueryAsync(request, cancellationToken);
The comments on that property indicate that it can be set to "auto", but it expects a long so that's not possible.
// Summary:
// The number of slices this task should be divided into. Defaults to 1, meaning
// the task isn't sliced into subtasks. Can be set to `auto`.
public long? Slices { get; set; }
Setting to 0 just throws an error on the server. Has anyone else tried doing this? Is there some other way to configure this behavior? Other APIs seem to have the same problem, like ReindexOnServerAsync.
This was a bug in the spec and an unfortunate consequence of generating this part of the client from the spec.
The spec has been fixed and the change will be reflected in a future version of the client. For now though, it can be set with the following:
var request = new Nest.UpdateByQueryRequest(indexModel.Name);
request.Conflicts = Elasticsearch.Net.Conflicts.Proceed;
request.Query = filterQuery;
((IRequest)request).RequestParameters.SetQueryString("slices", "auto");
var elasticResult = await _elasticClient.UpdateByQueryAsync(request, cancellationToken);

Update Builder gives a late response when multiple versions exist in Elasticsearch?

Project: Spring Boot
I'm updating my Elasticsearch document in the following way:
@Override
public Document update(DocumentDTO document) {
    try {
        Document doc = documentMapper.documentDTOToDocument(document);
        Optional<Document> fetchDocument = documentRepository.findById(document.getId());
        if (fetchDocument.isPresent()) {
            fetchDocument.get().setTag(doc.getTag());
            Document result = documentRepository.save(fetchDocument.get());
            final UpdateRequest updateRequest = new UpdateRequest(Constants.INDEX_NAME, Constants.INDEX_TYPE, document.getId().toString());
            updateRequest.setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
            updateRequest.doc(jsonBuilder().startObject().field("tag", doc.getTag()).endObject());
            UpdateResponse updateResponse = client.update(updateRequest, RequestOptions.DEFAULT);
            log.info("ES result : " + updateResponse.status());
            return result;
        }
    } catch (Exception ex) {
        log.info(ex.getMessage());
    }
    return null;
}
Using this, my document updates successfully and the version increments, but once the version goes past 20 or so, it takes a long time to retrieve the data (around 14 seconds).
I'm still confused about how versioning works. What happens in the update and delete scenarios? At search time, does Elasticsearch process all the versions of the data and send back the latest one?
Elasticsearch internally uses Lucene, which uses immutable segments to store the data. As these segments are immutable, every update in Elasticsearch internally marks the old document as deleted (a soft delete) and inserts a new document (with a new version).
The old document is later cleaned up during a background segment-merging process.
A newly updated document should be available within 1 second (the default refresh interval), but this can be disabled or changed, so please check this setting on your index. I can see you are using the wait_for refresh policy in your code; please remove it and you should see the updated document quickly, provided you have not changed the default refresh_interval.
Note: both the update and delete operations work similarly; the only difference is that in a delete operation no new document is created, and the old document is marked soft-deleted and later permanently removed during a segment merge.
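In terms of the code in the question, that just means not setting WAIT_UNTIL. A sketch of the relevant lines, assuming the same High Level REST Client setup as above:

final UpdateRequest updateRequest = new UpdateRequest(Constants.INDEX_NAME,
        Constants.INDEX_TYPE, document.getId().toString());
// No setRefreshPolicy(WAIT_UNTIL): with the default policy, the call returns
// as soon as the update is durable, instead of blocking until the next
// refresh makes the change searchable.
updateRequest.doc(jsonBuilder().startObject().field("tag", doc.getTag()).endObject());
UpdateResponse updateResponse = client.update(updateRequest, RequestOptions.DEFAULT);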

How to enable document routing in Transport Client or Node Client

I want to use the _routing field in Elasticsearch.
But I am not able to find any Java API to enable it.
I have gone through link 1 and link 2, but neither seems to address this.
My code:
public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
    this.collector = collector;
    Settings settings = ImmutableSettings.settingsBuilder()
        .put("cluster.name", elasticSearchCluster).build();
    this.client = new TransportClient(settings)
        .addTransportAddress(new InetSocketTransportAddress(esHost, esPort));
}
public void execute(Tuple tuple) {
    try {
        String document = tuple.toString();
        byte[] byteBuffer = document.getBytes();
        IndexResponse response = this.client.prepareIndex(indexName, type, id)
            .setSource(byteBuffer).execute().actionGet();
    } catch (Exception e) {
        e.printStackTrace();
    }
    collector.ack(tuple);
}
Note that I am using a TransportClient here, as there does not seem to be a good way of using a NodeClient with Storm, but the question is independent of that. If there is a way of using a NodeClient with routing, please suggest it; otherwise, routing with the TransportClient would also be of great help.
I believe you are confusing two different "routing" concepts in ES. One is document routing and the other is index allocation routing (or "filtering").
The _routing field allows you to specify the value to be used when indexing each document to determine which shard the document will be indexed on. The other two links you provided refer to an index-level (as opposed to document-level) setting that determines how the shards of an index are allocated to the various nodes in your cluster.
It sounds like you are trying to do document routing. This can be accomplished in the Java API using the IndexRequestBuilder class and the setRouting(String) method. Have a look at the source code on GitHub.
There are also some good code examples here which specify the routing field during indexing.
Almost! You can just replace one line of code.
From:
IndexResponse response = this.client.prepareIndex(indexName, type, id)
    .setSource(byteBuffer).execute().actionGet();
To:
String routingValue = "ANY_ROUTING_VALUE_YOU_WANT";
IndexResponse response = this.client.prepareIndex(indexName, type, id)
    .setSource(byteBuffer)
    .setRouting(routingValue)
    .execute().actionGet();
Then your documents will be stored in the specific shard corresponding to the routing value you provide. At search time, you can provide the same routing value so that your search request hits only that specific shard.
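For the search side, the TransportClient's SearchRequestBuilder takes the same routing value. A sketch (the match-all query is just a placeholder):

String routingValue = "ANY_ROUTING_VALUE_YOU_WANT";
SearchResponse searchResponse = this.client.prepareSearch(indexName)
    .setRouting(routingValue)                 // only hit the shard(s) for this routing value
    .setQuery(QueryBuilders.matchAllQuery())  // placeholder query
    .execute().actionGet();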

Cannot make XBAP cookies work

I am trying to make an XBAP application that communicates with a web service requiring login.
But I want the user to skip the login step if they have already logged in within the last seven days.
I got it to work using HTML/ASPX, but it fails continuously with XBAP.
While debugging, the application is given full trust.
This is the code I have so far to write the cookie:
protected static void WriteToCookie(
    string pName,
    Dictionary<string, string> pData,
    int pExpiresInDays)
{
    // Set the cookie value.
    string data = "";
    foreach (string key in pData.Keys)
    {
        data += String.Format("{0}={1};", key, pData[key]);
    }
    string expires = "expires=" + DateTime.Now.AddDays(pExpiresInDays).ToUniversalTime().ToString("r");
    data += expires;
    try
    {
        Application.SetCookie(new Uri(pName), data);
    }
    catch (Exception ex)
    {
    }
}
And this is what I have to read the cookie:
protected static Dictionary<string, string> ReadFromCookie(
    string pName)
{
    Dictionary<string, string> data = new Dictionary<string, string>();
    try
    {
        string myCookie = Application.GetCookie(new Uri(pName));
        // Returns the cookie information.
        if (String.IsNullOrEmpty(myCookie) == false)
        {
            string[] splitted = myCookie.Split(new char[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
            string[] sub;
            foreach (string split in splitted)
            {
                sub = split.Split(new char[] { ':' }, StringSplitOptions.RemoveEmptyEntries);
                if (sub[0] == "expires")
                {
                    continue;
                }
                data.Add(sub[0], sub[1]);
            }
        }
    }
    catch (Exception ex)
    {
    }
    return data;
}
The pName is set with:
string uri = "http://MyWebSiteName.com";
When the user authenticates the first time, I call the WriteToCookie function and set the cookie to expire in 7 days.
It looks like everything is fine, as I get no exceptions or error messages. (I have a breakpoint in the catch.)
After that, I close the session and start it again.
The first thing I do is a ReadFromCookie.
Then I get an exception with the following message: "No more data is available".
So my application sends the user automatically back to the login screen.
I also tried doing a ReadFromCookie right after the WriteToCookie in the same session, and I get the same error.
Application.SetCookie(new Uri("http://MyWebSiteName.com/WpfBrowserApplication1.xbap"), "Hellllo");
string myCookie2 = Application.GetCookie(new Uri("http://MyWebSiteName.com/WpfBrowserApplication1.xbap"));
It seems to me that the cookie is not even written in the first place.
So I am guessing I am doing something wrong.
Maybe the URI I am using is wrong. Is there a specific format needed for it?
Just like you need a very specific format for the expires date.
I have been searching the internet quite a lot for a good sample/tutorial about using cookies with XBAP, and I could not find anything really well documented or tested.
A lot of people say that it works, but there is no real sample to try.
A lot of people also handle the authentication in HTML, then go to the XBAP after successfully reading/writing the cookies.
I would prefer a full XBAP solution if possible.
To answer some questions before they are asked, here are the project settings:
Debug:
Command line arguments: -debug -debugSecurityZoneURL http://MyWebSiteName.com "C:\Work\MyWebSiteName\MyWebSiteNameXBAP\bin\Debug\MyWebSiteNameXBAP.xbap"
Security:
Enable ClickOnce security settings (Checked)
This is a full trust application (selected)
I also created a certificate and added it to the 3 stores as explained in "publisher cannot be verified" message displayed
So I do not have the warning popup anymore. I just wanted to make sure that it was not a permission issue.
I finally found the answer to this problem.
Thanks to this CodeProject article, I was able to write/read cookies from the XBAP code.
As I had guessed, the URI needs to be very specific and you cannot pass everything you want in it.
What did the trick was using BrowserInteropHelper.Source.
In the end the read/write code looks like:
Application.SetCookie(BrowserInteropHelper.Source, data);
string myCookie = Application.GetCookie(BrowserInteropHelper.Source);
It looks like you cannot use ';' to separate your own data.
If you do, you will only get the first entry of your data back.
Use a different separator (e.g. ':') and then you can get everything back.
The data looks like this:
n=something:k=somethingElse;expires=Tue, 12 May 2015 14:18:56 GMT;
The only thing I do not get back from Application.GetCookie is the expire date.
Not sure if it is normal or not. Maybe it is flushed out automatically for some reason. If someone knows why, I would appreciate a comment to enlighten me.
At least now I can read/write data to the cookie in XBAP. Yeah!
