CouchDB: get the changed document with each change notification (view)

I want to be notified with the inserted document on each insertion into CouchDB.
Something like this:
http://localhost:5058/db-name/_changes/_view/inserted-document
And I would like the response to look something like the following:
{
"id":"0552065465",
"name":"james"
.
.
.
}
Reconnecting to the database to fetch the actual document on each notification can cause performance issues.
Can I define a view that returns the actual document with each change?

There are three possible ways to tell whether a document was just added:
1. Add a status field to your document, with a specific status for new documents.
2. Check whether the revision starts with 1-. This is not 100% accurate if you use replication.
3. In the changes response, check whether the document's number of revisions is equal to one. If so, it was just added (best solution IMO).
If you want to query the _changes endpoint and directly get the newly inserted documents, you can use approach #1 together with a filter function that only returns documents with status="new".
Otherwise, go with approach #3 and filter the _changes responses locally. E.g. your application would receive all changes and only handle documents whose revisions array has a count of 1.
And as you mentioned, you want to receive the document itself, not only the _id and the _rev. To do so, simply add the query parameter include_docs=true.
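As a rough sketch of the client side, the snippet below builds a _changes URL with include_docs=true (and an optional filter) and pulls the embedded documents out of a response body. The filter name "app/new_docs", the helper names, and the sample payload are assumptions made up for illustration; the host and database name are taken from the question.

```python
import json
from urllib.parse import urlencode


def changes_url(base_url, db, filter_name=None, since="0"):
    """Build a _changes URL that asks CouchDB to embed each document."""
    params = {"include_docs": "true", "since": since}
    if filter_name is not None:
        # Expected form is "designdoc/filtername"; the value is hypothetical.
        params["filter"] = filter_name
    return f"{base_url}/{db}/_changes?{urlencode(params, safe='/')}"


def docs_from_changes(body):
    """Return the embedded, non-deleted documents from a _changes response."""
    payload = json.loads(body)
    return [
        row["doc"]
        for row in payload.get("results", [])
        if row.get("doc") is not None and not row.get("deleted", False)
    ]


# Hand-written sample shaped like a _changes response (not real server output).
sample = json.dumps({
    "results": [
        {"seq": "1-abc", "id": "0552065465",
         "changes": [{"rev": "1-967a"}],
         "doc": {"_id": "0552065465", "_rev": "1-967a", "name": "james"}},
        {"seq": "2-def", "id": "old", "deleted": True,
         "changes": [{"rev": "2-7051"}],
         "doc": {"_id": "old", "_rev": "2-7051", "_deleted": True}},
    ],
    "last_seq": "2-def",
})

print(changes_url("http://localhost:5058", "db-name", "app/new_docs"))
print(docs_from_changes(sample))
```

With this shape, one request to _changes returns both the notification and the document, so no second round trip per change is needed.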

Related

Get the latest index of a certain form in elasticsearch

I am periodically training an anomaly detection model and saving it in an Elasticsearch index named 'anomaly_detection_model-' + date_model_trained. This means that I end up with indices of the form: anomaly_detection_model-31.08.2022, anomaly_detection_model-29.08.2022, anomaly_detection_model-27.08.2022, etc. I need to be able to access the latest model in real time in order to make predictions, without knowing the date on which that model was trained. Do you have any ideas how this could be done?
If you need to access the most recently created index, try this:
GET /_cat/indices?pretty&s=creation.date:desc
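To pick the latest model index programmatically, one option is to request the same listing as JSON (for example with format=json&h=index,creation.date) and select the newest matching row on the client. The sketch below assumes that row shape; the helper name and the sample rows are made up for illustration.

```python
def latest_index(rows, prefix="anomaly_detection_model-"):
    """Return the name of the most recently created index with the given prefix.

    rows is assumed to be the parsed JSON from
    GET /_cat/indices?format=json&h=index,creation.date
    where creation.date is epoch milliseconds encoded as a string.
    """
    matching = [row for row in rows if row["index"].startswith(prefix)]
    if not matching:
        return None
    newest = max(matching, key=lambda row: int(row["creation.date"]))
    return newest["index"]


# Hand-written sample rows (not real cluster output).
rows = [
    {"index": "anomaly_detection_model-27.08.2022", "creation.date": "1661558400000"},
    {"index": "anomaly_detection_model-31.08.2022", "creation.date": "1661904000000"},
    {"index": "anomaly_detection_model-29.08.2022", "creation.date": "1661731200000"},
    {"index": "some-other-index", "creation.date": "1662000000000"},
]
print(latest_index(rows))
```

Filtering by prefix matters here because the newest index in the cluster is not necessarily a model index.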

elasticsearch copy field when indexing

I would like to create a one-to-many relationship for the purpose of aggregations.
The "join" will be on a field called "common_id":
When I create the first document belonging to a group, I would like to use its flakeId (its _id) as the common_id.
When adding other documents belonging to the same group, I would like to explicitly set the common_id to the same value as the first document I added. This can be done by my app, since my application will know the common_id of the first element.
My problem is with the first document:
How can I tell Elasticsearch to copy the _id into common_id in a single call (I know I can do it using an update script, or using two calls, one for index and one for update, but this requires two requests instead of one)?
I would like a simple syntax for this.
Thanks

Bulk add new field to ALL documents in an elasticsearch index

I need to add a new field to ALL documents in an index without pulling each document down and pushing it back up (this would take about a day). Is it possible to use the _bulk API to achieve this?
I have also researched the update_by_query plugin, and it seems it would take just as long as pulling the documents down and pushing them back myself.
Yes, the bulk API supports updates which can add a new field using a partial document or script. To iterate through your document ids do a scan and scroll with the fields parameter set to an empty array.
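A minimal sketch of what such a bulk request body could look like, using partial-document updates. The index name, field name, and helper name are hypothetical, and older Elasticsearch versions may also expect a _type entry in the action line.

```python
import json


def bulk_update_body(index, doc_ids, new_fields):
    """Build a newline-delimited _bulk body that adds new_fields to each document."""
    lines = []
    for doc_id in doc_ids:
        # Action line: which document to update.
        lines.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        # Partial document: merged into the existing source.
        lines.append(json.dumps({"doc": new_fields}))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_update_body("my-index", ["1", "2"], {"new_field": "default"})
print(body, end="")
```

The document ids would come from the scan-and-scroll pass described above, and the body would be sent in batches to the _bulk endpoint.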

Elasticsearch not searching some fields

I have just updated a website, and the update adds new fields to Elasticsearch.
In my dev environment it all works fine, but on the live site the new fields are not being found.
E.g. I have added a new field with the value: 1
However, when adding a filtered query of
{"field":1}
It does not find any matching results.
When I look at the documents, I can see docs with the field set to 1.
Could the reason be that the new field was added after the mapping was set? I am not all that familiar with Elasticsearch, so I am not really sure where to start looking to fix it.
Any help would be appreciated.
Update:
Querying from the URL shows nothing either:
_search/?pretty=true&size=50&q=field1:*
However, there is another field that was added at the same time which I can search on.
I can see field1 in the result set, but it just won't let me search on it.
The only difference I see in the mapping is that the field that works is set to type:long, whereas the one that does not work is set to type:string.
Is it a length issue with the ngram? What is your "min_gram" setting?
You can check your index settings like this:
GET <host>/<index_name>/_settings
Does it work when you filter for a two digit field?
Are all the field values one digit?
It's OK to add a field after the mapping was set. Elasticsearch will guess the mapping for you (in fact, it's one of its selling features: no need to define the mapping, just throw the data at it).
There are a few things that can go wrong:
Verify that the data is actually in the index. To do that, just navigate to the _search URL with no parameters; you should see the field if it is indexed.
Look at your mapping. Could it be that the field is explicitly set not to be indexed?
Another possibility is that your query is wrong (but that is unlikely, since you say it works in the development environment).

Passing parameters to a couchbase view

I'm looking to search for a particular JSON document in a bucket and I don't know its document ID; all I know is the value of one of the sub-keys. I've looked through the API documentation but I'm still confused when it comes to my particular use case:
In Mongo I can do a dynamic query like:
bucket.get({ "name" : "some-arbritrary-name-here" })
With Couchbase, I'm under the impression that you need to create an index (for example on the name property) and use startKey / endKey, but this feels wrong: could you still end up with multiple documents being returned? It would be nice to be able to pass a parameter to the view so that an exact match could be performed on it. Also, how would we handle multi-dimensional searches, i.e. name and category?
I'd like to do as much of the filtering as possible on the Couchbase instance and ideally narrow it down to one record, rather than having to filter when it comes back to the app tier. Something like passing a dynamic value to the mapping function and only emitting documents that match.
I know you can use LINQ with Couchbase to filter, but if I've read the docs correctly this filtering is still done client-side. At least if we could narrow down the returned dataset to a sensible subset, client-side filtering wouldn't be such a big deal.
Cheers
So you are correct on one point: you need to create a view (an index, indeed) to be able to query on the content of the JSON document.
So in your case you have to create a view with this kind of code:
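function (doc, meta) {
  // Only index documents of the expected type (just a good practice to type the doc).
  if (doc.type == "youtype") {
    // Key is the name; no value is needed because the document id is returned with each row.
    emit(doc.name, null);
  }
}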
So this will create an index - distributed across all the nodes of your cluster - that you can now use in your application. You can point to a specific value using the "key" parameter.
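For illustration, here is a hedged sketch of how a client might build the view query URL with an exact-match key. The key has to be JSON-encoded, so a string value keeps its double quotes. The host, bucket, design document, and view names are assumptions, and 8092 is assumed to be the view API port of the node.

```python
import json
from urllib.parse import quote


def view_query_url(host, bucket, design_doc, view, key):
    """Build a Couchbase view URL that matches exactly one emitted key."""
    # The key must be valid JSON, e.g. "james" becomes "\"james\"" before URL-encoding.
    encoded_key = quote(json.dumps(key), safe="")
    return (
        f"http://{host}:8092/{bucket}/_design/{design_doc}"
        f"/_view/{view}?key={encoded_key}"
    )


# All names below are hypothetical.
url = view_query_url("localhost", "my-bucket", "docs", "by_name", "some-name")
print(url)
```

Several documents can still share the same name, so the query may return more than one row; the key only guarantees an exact match on the emitted value.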

Resources