CouchDB Views: created_at greater than a passed value - view

I'm trying to write a couchdb view that takes a created_at timestamp in a sortable format (2009/05/07 21:40:17 +0000) and returns all documents that have a greater created_at value.
I'm specifically using couch_foo but if I can figure out how to write the view I can create it in futon or in the couch_foo model instead of letting couch_foo do it for me.
I've searched all around and can't figure out the map/reduce to do this, if it's possible.

This is the kind of problem I ran into initially before I fully understood how views work.
The key to understanding this is that the view function runs only once for each (revision of a) document. In other words, when you query a view, you don't run the function; you simply look up the results from when the function ran. As such, there is no way to pass any user-submitted parameters into a view.
How then do you compare a value in a view with a user-submitted value? The secret is to emit that field as a key in the map function and rely on CouchDB ordering the results by key.
Your map function would be something like
"map" : "function(doc) { emit(doc.created_at, doc); }"
and you would query it like so:
http://localhost:5984/db/_design/ddoc/_view/view?startkey=%222009/05/07%2021:40:17%20%2B0000%22
I have taken the liberty of URI-encoding the quotes, the spaces, and the plus sign in the URL so that it should be usable as is.
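If you'd rather not encode the key by hand, a small helper can do it for you. This is only a sketch: viewUrl and strictlyAfter are made-up names, and the host, database, design document, and view names are just the placeholders from the URL above. Keep in mind that startkey is inclusive, so a document whose created_at equals the value you pass is returned too; the second helper drops such rows if you need a strict "greater than".

```javascript
// Build the view URL; the key must be sent as JSON (hence the quotes) and then URI-encoded.
function viewUrl(base, db, ddoc, view, startkey) {
  const key = encodeURIComponent(JSON.stringify(startkey));
  return `${base}/${db}/_design/${ddoc}/_view/${view}?startkey=${key}`;
}

// startkey is inclusive, so filter out exact matches client-side for "strictly greater".
// `rows` is assumed to be the "rows" array of the view response.
function strictlyAfter(rows, startkey) {
  return rows.filter((row) => row.key > startkey);
}

const url = viewUrl("http://localhost:5984", "db", "ddoc", "view", "2009/05/07 21:40:17 +0000");
console.log(url);
```

The plain string comparison in strictlyAfter is good enough here only because every key uses the same fixed-width timestamp format.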

You want to write a view that creates a key of the timestamp field in that format, then query it with the startkey parameter.
So the view would look something like:
"map" : "function(doc) { emit(doc.timestamp_field, doc) }"
And your URL would be something like:
http://myserver/database/_design/mydoc/_view/myview?startkey="2009/05/07 21:40:17 +0000"
The HTTP view API page on the Wiki has more info. You may also consider the User Mailing List.

Please mind that CouchDB only works on JSON values, so these timestamps are compared as plain strings. If the timezone of the timestamps stored in your documents differs from the timezone of your startkey, the query will likely return the wrong documents.
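One way around the timezone problem is to normalize every timestamp to UTC before it is stored or used as a startkey. A minimal sketch, assuming you are free to choose how created_at is written; toSortableUtc is a hypothetical helper name:

```javascript
// Format a Date in the sortable "YYYY/MM/DD HH:MM:SS +0000" layout, always in UTC.
function toSortableUtc(date) {
  const pad = (n) => String(n).padStart(2, "0");
  const day = `${date.getUTCFullYear()}/${pad(date.getUTCMonth() + 1)}/${pad(date.getUTCDate())}`;
  const time = `${pad(date.getUTCHours())}:${pad(date.getUTCMinutes())}:${pad(date.getUTCSeconds())}`;
  return `${day} ${time} +0000`;
}

// A timestamp given with a +02:00 offset ends up as the same instant expressed in UTC.
console.log(toSortableUtc(new Date("2009-05-07T23:40:17+02:00")));
```

With every key written this way, string order and chronological order agree.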

Related

Is it possible to pass LIVE VIEW name as a parameter in Clickhouse query?

For example, like
CREATE LIVE VIEW %(live_view_name)s WITH REFRESH
It works only with other parameters (WHERE condition) but not with view name.
It looks like you are using Python to make the HTTP query?
I don't think it's possible that way, because the parameter value will be escaped as a string literal, not as a field name.
Have a look at
https://clickhouse.com/docs/en/interfaces/http#cli-queries-with-parameters
and
https://clickhouse.com/docs/en/interfaces/cli#cli-queries-with-parameters
and try to use CREATE LIVE VIEW {live_view_name:Identifier} WITH REFRESH
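For reference, over the HTTP interface each {name:Type} placeholder is filled from a URL parameter called param_name. Below is a rough sketch of building such a request URL. clickhouseUrl is a hypothetical helper, the host and port are assumptions, the statement is abbreviated exactly as in the question, and whether an Identifier parameter is accepted in this position is not something I have tested.

```javascript
// Build a ClickHouse HTTP URL with the query text and its param_<name> substitutions.
function clickhouseUrl(baseUrl, query, params) {
  const parts = [`query=${encodeURIComponent(query)}`];
  for (const [name, value] of Object.entries(params)) {
    parts.push(`param_${name}=${encodeURIComponent(value)}`);
  }
  return `${baseUrl}/?${parts.join("&")}`;
}

// The statement is left incomplete on purpose, as it appears in the question.
const url = clickhouseUrl(
  "http://localhost:8123",
  "CREATE LIVE VIEW {live_view_name:Identifier} WITH REFRESH",
  { live_view_name: "my_view" }
);
console.log(url);
```

Send the request with POST rather than GET, since GET requests are treated as read-only.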

Couchdb get the changed document with each change notification

I want to be notified with the inserted document on each insertion into CouchDB.
Something like this:
http://localhost:5058/db-name/_changes/_view/inserted-document
And I like the response to be something like the following:
{
"id":"0552065465",
"name":"james"
.
.
.
}
Going back to the database to fetch the actual document for each notification can cause performance issues.
Can I define a view that returns the actual document with each change?
There are 3 possible ways to tell whether a document was just added:
1. Add a status field to your documents, with a specific status for new documents.
2. Check whether the revision starts with 1-. According to this, that is not 100% accurate if you use replication.
3. In the changes response, check whether the number of revisions of the document is equal to one. If so, it means it was just added (best solution IMO).
If you want to query the _changes endpoint and directly get the newly inserted documents, you can use the approach #1 and use a filter function that only returns documents with status="new".
Otherwise, you should go with approach #3 and filter the _changes responses locally. For example, your application would receive all changes and only handle documents whose revisions array has exactly one entry.
And as you mentioned, you want to receive the document, not only the _id and the _rev. To do so, you can simply add the query parameter: include_docs=true
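Approach #1 combined with include_docs could look roughly like the sketch below. The design document name "app", the filter name "new_docs", and the status value "new" are all assumptions made for illustration; only the filter and include_docs query parameters come from CouchDB itself.

```javascript
// Filter function for the _changes feed: let through only documents flagged as new.
const newDocsFilter = function (doc, req) {
  return doc.status === "new";
};

// CouchDB stores filter functions as strings inside a design document.
const designDoc = {
  _id: "_design/app",
  filters: { new_docs: newDocsFilter.toString() }
};

// URL of a changes feed that applies the filter and embeds the full documents.
function changesUrl(base, db) {
  const params = new URLSearchParams({ filter: "app/new_docs", include_docs: "true" });
  return `${base}/${db}/_changes?${params}`;
}

console.log(changesUrl("http://localhost:5984", "db-name"));
```

Each entry in the response then carries the document under its doc field, so no second request is needed.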

ES custom dynamic mapping field name change

I have a use case which is a bit similar to the ES example of dynamic_template where I want certain strings to be analyzed and certain not.
My document fields don't have such a convention and the decision is made based on an external schema. So currently my flow is:
I grab the input document from the DB
I grab the appropriate schema (same database, currently using logstash for import)
I adjust the name in the document accordingly (using logstash's ruby mutator):
if not analyzed I don't change the name
if analyzed I change it to ORIGINALNAME_analyzed
This handles the analyzed/not_analyzed problem thanks to the dynamic_template I set, but now the user doesn't know which fields are analyzed, so there is no easy way to write queries without knowing the actual name of each field.
I wanted to use field name aliases but apparently ES doesn't support them. Are there any other mechanisms I'm missing I could use here like field rename after indexation or something else?
For example this ancient thread mentions that field.sub.name can be queried as just name but I'm guessing this has changed when they disallowed . in the name some time ago since I cannot get it to work?
Let the user only create queries with the original name. I believe you have some code that converts this user query to Elasticsearch query. When converting to Elasticsearch query, instead of using the field name provided by the user alone use both the field names ORIGINALNAME as well as ORIGINALNAME_analyzed. If you are using a match query, convert it to multi_match. If you are using a term query, convert it to a bool should query. I guess you get where I am going with this.
Elasticsearch won't mind if a field does not exist. This can be a problem if there is already a field whose original name ends in _analyzed, but with some tricks that can be fixed too.
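The conversion described above could be sketched like this. expandToBothFields is a hypothetical name, and the sketch assumes the user's query arrives in the short forms { match: { field: text } } or { term: { field: value } }; anything else is passed through untouched.

```javascript
// Rewrite a single-field query so it targets both ORIGINALNAME and ORIGINALNAME_analyzed.
function expandToBothFields(query) {
  if (query.match) {
    const [field, text] = Object.entries(query.match)[0];
    return { multi_match: { query: text, fields: [field, `${field}_analyzed`] } };
  }
  if (query.term) {
    const [field, value] = Object.entries(query.term)[0];
    return {
      bool: {
        should: [
          { term: { [field]: value } },
          { term: { [`${field}_analyzed`]: value } }
        ]
      }
    };
  }
  return query; // other query types are left as they are
}

console.log(JSON.stringify(expandToBothFields({ match: { title: "quick fox" } })));
```

The user keeps writing queries against the original field name and never has to know which variant was indexed.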

How to filter and delete a record that has a specific attribute

I'm trying to filter records that have a specific key and delete them. I tried "withFields" and "hasFields", but it seems that I can't apply delete to them. How can I do that?
r.db('databaseFoo').table('checkpoints').filter(function (user) {
return user('type').default(false);
}).delete();
If you want all documents that have a type key, you can use hasFields for that.
r.db('databaseFoo').table('checkpoints')
.hasFields('type')
In your current query, what you are doing is getting all documents whose type value is truthy; documents that don't have a type key, or where the value for type is false or null, are skipped. This might be what you want, but it's a little confusing if you only want documents that have a type property.
Keeping a reference to the original document
The problem with using hasFields is that it converts a selection (a sequence with a reference to the specific rows in the database, which you can update and delete) into a plain sequence, with which you can't do that. This is a known issue in RethinkDB. You can read this blog post to understand the different types in ReQL a bit better.
In order to get around this, you can use the hasFields method with the filter method.
r.db('databaseFoo').table('checkpoints')
.filter(r.row.hasFields('type'))
.delete()
This query will work since it returns a selection which can then be passed into delete.
If you want to get all records with a specific value at a specific key, you can do so in a couple of different ways. To get all documents where the property type is equal to false, you can do as follows:
r.db('databaseFoo').table('checkpoints')
.filter({ type: false })
or, you can do:
r.db('databaseFoo').table('checkpoints')
.filter(r.row('type').eq(false))

Passing parameters to a couchbase view

I'm looking to search for a particular JSON document in a bucket and I don't know its document ID, all I know is the value of one of the sub-keys. I've looked through the API documentation but still confused when it comes to my particular use case:
In mongo I can do a dynamic query like:
bucket.get({ "name" : "some-arbritrary-name-here" })
With couchbase I'm under the impression that you need to create an index (for example on the name property) and use startKey / endKey but this feels wrong - could you still end up with multiple documents being returned? Would be nice to be able to pass a parameter to the view that an exact match could be performed on. Also how would we handle multi-dimensional searches? i.e. name and category.
I'd like to do as much of the filtering as possible on the couchbase instance and ideally narrow it down to one record rather than having to filter when it comes back to the App Tier. Something like passing a dynamic value to the mapping function and only emitting documents that match.
I know you can use LINQ with couchbase to filter but if I've read the docs correctly this filtering is still done client-side but at least if we could narrow down the returned dataset to a sensible subset, client-side filtering wouldn't be such a big deal.
Cheers
So you are correct on one point: you need to create a view (an index, indeed) to be able to query on the content of the JSON document.
So in your case you have to create a view with this kind of code:
function (doc, meta) {
  if (doc.type == "yourtype") { // just a good practice to type the doc
    emit(doc.name);
  }
}
So this will create an index - distributed on all the nodes of your cluster - that you can now use in your application. You can look up a specific value using the "key" parameter.
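For the name-and-category case, one option is to emit a compound (array) key and pass the same array as the key parameter. The sketch below is illustrative only: byNameAndCategory and keyParam are made-up names, the category field is taken from the question, and the local emit merely stands in for the one Couchbase provides inside a view.

```javascript
// Stand-in for the emit() available inside a view; it only records rows locally.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Compound-key variant of the map function above: the key is [name, category].
function byNameAndCategory(doc, meta) {
  if (doc.type == "yourtype") {
    emit([doc.name, doc.category], null);
  }
}

// The key parameter must be JSON, then URI-encoded.
function keyParam(key) {
  return "key=" + encodeURIComponent(JSON.stringify(key));
}

console.log(keyParam("some-name"));            // exact match on a single value
console.log(keyParam(["some-name", "books"])); // exact match on name and category
```

An exact key match can still return more than one row if several documents share the same name (or the same name and category), so the application should be prepared for that.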

Resources