Elasticsearch query for newest index

Elasticsearch newbie.
I would like to query for the newest index.
Every day Logstash creates new indices with a naming convention something like our_sales_data-%{dd-mm-yyyy}%, or something very close. So I end up with lots of indices like:
our-sales-data-14-09-2015
our-sales-data-15-09-2015
our-sales-data-16-09-2015
and so on.
I need to be able to query for the newest index. Obviously I can query for and retrieve all the indices with 'our-sales-data*' in the name... but I only want to return the very newest one and no other.
Possible?

Well, the preferred method would be to compute the latest index name on the client side by resolving the date in our_sales_data-%{dd-mm-yyyy}%.
Another solution would be to run a sort query and fetch the single latest document. You can then infer the newest index from the _index field of the returned hit.
GET /our-sales-data*/_search
{
  "size": 1,
  "sort": {
    "@timestamp": {
      "order": "desc"
    }
  }
}

We have a search alias and a write alias. The write alias technically always points at the latest index, until we roll it over and add a new one to this alias.
Our search alias contains all the previous indices plus the latest index (which is also behind the write alias).
Could you do something like this and then just query the write alias?
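If you go that route, a minimal sketch might look like this (the index and alias names below are made up for illustration, and is_write_index needs a reasonably recent Elasticsearch version):
# Create today's index and attach both aliases; is_write_index marks it
# as the only index that receives writes through the write alias.
PUT /our-sales-data-000001
{
  "aliases": {
    "our-sales-data-write": { "is_write_index": true },
    "our-sales-data-search": {}
  }
}

# Querying the write alias then only ever touches the newest index.
GET /our-sales-data-write/_search
{
  "size": 1,
  "sort": { "@timestamp": { "order": "desc" } }
}

# Rolling over creates the next index in the series and moves the write flag to it.
POST /our-sales-data-write/_rollover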

Related

Elasticsearch Join

I have two indices. One index, "indications", has some set of values.
The other is "projects". In this index I add an indication value, like "indication = oncology".
Now I want to show all indications, which I can do using a terms aggregation. But my issue is that I also want to show the count of projects in which each indication is used.
So for that, I would need to write a join query.
Can anyone help me to resolve this issue?
Expected result example:
[{name:"oncology",projectCount:"12"}]
You cannot have joins in Elasticsearch. What you can do is store the indication name in the project index and then run a terms aggregation on the project index. That will basically get you the distinct indications across all project documents and the count for each indication.
Something of the sort:
GET /project/_search
{
  "size": 0,
  "aggs": {
    "indication": {
      "terms": {
        "field": "indication_name"
      }
    }
  }
}
Elasticsearch does not support joins. That's the whole point of NoSQL: you keep the data as denormalised as you can and make the documents more and more self-sufficient.
There are some ways in which you can add some sort of relationship between your data. This is a nice blog on it.
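As a concrete illustration of the denormalised approach (a sketch only: the field names match the aggregation above, indication_name would need to be a keyword field, and the ES 7 style _doc endpoint is assumed):
PUT /project/_doc/1
{
  "project_name": "Project A",
  "indication_name": "oncology"
}

PUT /project/_doc/2
{
  "project_name": "Project B",
  "indication_name": "oncology"
}
The terms aggregation then returns one bucket per indication, whose doc_count you can map to projectCount on the client side.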

Add _id to the source as a separate field for all existing docs in an index

I'm new to Elasticsearch. I need to go through all the documents, take the _id and add it to the _source as a separate field via a script. Is it possible? If yes, can I have an example of something similar or a link to similar scripts? I haven't seen anything like that in the docs. Why do I need it? Because after that I will do a SELECT with Opendistro and SQL. That framework cannot return fields which are not in _source. If anyone can suggest something I would be very grateful.
There are two options:
First option: add the new field to your existing index, populate it for the existing documents, and rebuild the index.
Second option: simply define the new field in a new index's mapping (keep all the other fields the same) and then use the reindex API with the script below.
"script": {
"source": "ctx._source.<your-field-name> = ctx._id"
}
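Put together, the reindex call might look like the sketch below (the index names old-index / new-index and the field name doc_id are placeholders, not something from your setup):
POST /_reindex
{
  "source": { "index": "old-index" },
  "dest": { "index": "new-index" },
  "script": {
    "lang": "painless",
    "source": "ctx._source.doc_id = ctx._id"
  }
}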

How to find what index a field belongs to in Elasticsearch?

I am new to Elasticsearch. I have to write a query using a given field, but I don't know how to find the appropriate index. How would I find this information?
Edit:
Here's an easier/better way using the mapping API:
GET _mapping/field/<fieldname>
One of the ways you can find out is to get the records where the field exists.
Replace <fieldName> with your field's name. /_search will search across all indices and return any document that matches or has the field. Set _source to false, since you don't care about the document contents, only the index name.
GET /_search
{
  "_source": false,
  "query": {
    "exists": {
      "field": "<fieldName>"
    }
  }
}
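Each hit in the response then carries an _index field, which is the piece of information you are after. A trimmed, made-up response might look roughly like this (the exact shape of "total" and the hit metadata varies by Elasticsearch version):
{
  "hits": {
    "total": 42,
    "hits": [
      { "_index": "some-index-a", "_id": "1", "_score": 1.0 },
      { "_index": "some-index-b", "_id": "7", "_score": 1.0 }
    ]
  }
}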
Another, more visual way to do that is through the Kibana Index Management UI (assuming you have privileges to access it).
There you can click on the indices and open the mappings tab to get all fields of the particular index. Then just search for the desired field.
Summary:
@Polynomial Proton's answer is the way of choice 90% of the time. I just wanted to show you another way to solve your issue. It will require more manual steps than @Polynomial Proton's answer. Also, if you have a large number of indices this way is not appropriate.

How to update nested data in Elasticsearch?

I am new to Elasticsearch. I have successfully set up an Elasticsearch server and implemented an ES package in Laravel. Now I can add data to Elasticsearch, but the problem is: how can I update a nested item value in a row? I have added a screenshot of my data structure here.
Now how can I update comment_id 1 with my desired content?
In your case it will be a little problematic.
You should be aware of the way Elasticsearch indexes arrays of objects.
So in your case you will get something like this:
{
  ...
  "comments": {
    "id": [1, 2, 3],
    "comment": ["this is comment1", "this is comment2", "this is comment3"]
  }
}
So you lose the correlation between "id" and "comment".
If you would like to keep this correlation, you will need to define "comments" as "nested" in your mappings (see the nested type documentation); a sketch follows below.
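A nested mapping for that structure might look like this (the index name posts is hypothetical, and this uses the typeless ES 7 mapping syntax; older versions need a mapping type in the path):
PUT /posts
{
  "mappings": {
    "properties": {
      "comments": {
        "type": "nested",
        "properties": {
          "id":      { "type": "integer" },
          "comment": { "type": "text" }
        }
      }
    }
  }
}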
In order to update your nested document you will probably need to use a scripted update.
If you need to update a specific comment in the array, you can write a script that finds it and replaces it, or you can read the whole array, edit it and overwrite the current array.
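For example, a scripted update along these lines would replace the text of the comment with id 1 (a sketch only: the index name, document id and field names are assumptions based on the structure above, using the ES 7 style _update endpoint):
POST /posts/_update/1
{
  "script": {
    "lang": "painless",
    "source": "for (def c : ctx._source.comments) { if (c.id == params.id) { c.comment = params.text } }",
    "params": {
      "id": 1,
      "text": "my updated comment"
    }
  }
}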

How to retrieve all documents with version n in Elasticsearch

ElasticSearch comes with versioning https://www.elastic.co/blog/versioning
Maybe I misunderstood the meaning of the versioning here.
I want to find all the documents that are in version 1 so I can update them.
An obvious way is to go through all the documents one by one and select those that are at version 1.
Question:
Is it possible to retrieve all the Documents that are in version 1 with ONE query?
Because of Elasticsearch's distributed nature, it needs a way to ensure that changes are applied in the correct order. This is where _version comes into play. It's an internal way of making sure that an older version of a document never overwrites a newer version.
You can also use _version as a way to make sure that the document you want to delete / update hasn't been modified in the meantime. This is done by specifying the version number in the URL; for example, PUT /website/blog/1?version=5 will succeed only if the current _version of the document stored in the index is 5.
You can read more about it here: Optimistic Concurrency Control
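As a console sketch of that conditional update (the document body here is just an illustration; note that newer Elasticsearch versions replace ?version= for this purpose with if_seq_no / if_primary_term):
PUT /website/blog/1?version=5
{
  "title": "My blog post",
  "text": "Updated only if the stored _version is still 5"
}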
To answer your question,
Is it possible to retrieve all the Documents that are in version 1 with ONE query?
No.
You can use a scripted _reindex into an empty temporary index. The target index will then contain just those documents that have _version=1.
You can add further query stanzas as well, to limit your raw input (using the inverted index, which is faster), as well as further Painless conditions (per document, more flexible); a variant with a query stanza is sketched after the reindex call below.
# ES5; use "source" instead of "inline" for later ES versions.
http POST :9200/_reindex <<END
{
  "conflicts": "proceed",
  "source": {
    "index": "your-source-index"
  },
  "dest": {
    "index": "some-temp-index",
    "version_type": "external"
  },
  "script": {
    "lang": "painless",
    "inline": "if (ctx._version != 1) { ctx.op = 'delete' }"
  }
}
END
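As mentioned above, you can also limit the raw input with a query stanza inside "source". A sketch of the same call with such a filter, in console syntax (the date range, field name and index names are only illustrative):
POST /_reindex
{
  "conflicts": "proceed",
  "source": {
    "index": "your-source-index",
    "query": {
      "range": { "@timestamp": { "gte": "now-30d/d" } }
    }
  },
  "dest": {
    "index": "some-temp-index",
    "version_type": "external"
  },
  "script": {
    "lang": "painless",
    "source": "if (ctx._version != 1) { ctx.op = 'delete' }"
  }
}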
