How do I get unique field values using rethinkdb javascript? - rethinkdb

I have a field which has similar values. For eg {country : 'US'} occurs multiple times in the table. Similar for other countries too. I want to return an array which contains non-redundant values of 'country' field. I am new to creating Databases so likely this is a trivial question but I couldn't find anything useful in rethinkdb api.[SOLVED]
Thanks

You can use distinct, but the distinct command was created for short sequences only.
If you have a lot of data, you can use map/reduce
r.table("data").map(function(doc) {
return r.object(doc("country"), true) // return { <country>: true}
}).reduce(function(left, right) {
return left.merge(right)
}).keys() // return all the keys of the final document

Related

Getting max value on server (Entity Framework)

I'm using EF Core but I'm not really an expert with it, especially when it comes to details like querying tables in a performant manner...
So what I try to do is simply get the max-value of one column from a table with filtered data.
What I have so far is this:
protected override void ReadExistingDBEntry()
{
using Model.ResultContext db = new();
// Filter Tabledata to the Rows relevant to us. the whole Table may contain 0 rows or millions of them
IQueryable<Measurement> dbMeasuringsExisting = db.Measurements
.Where(meas => meas.MeasuringInstanceGuid == Globals.MeasProgInstance.Guid
&& meas.MachineId == DBMatchingItem.Id);
if (dbMeasuringsExisting.Any())
{
// the max value we're interested in. Still dbMeasuringsExisting could contain millions of rows
iMaxMessID = dbMeasuringsExisting.Max(meas => meas.MessID);
}
}
The equivalent SQL to what I want would be something like this.
select max(MessID)
from Measurement
where MeasuringInstanceGuid = Globals.MeasProgInstance.Guid
and MachineId = DBMatchingItem.Id;
While the above code works (it returns the correct value), I think it has a performance issue when the database table is getting larger, because the max filtering is done at the client-side after all rows are transferred, or am I wrong here?
How to do it better? I want the database server to filter my data. Of course I don't want any SQL script ;-)
This can be addressed by typing the return as nullable so that you do not get a returned error and then applying a default value for the int. Alternatively, you can just assign it to a nullable int. Note, the assumption here of an integer return type of the ID. The same principal would apply to a Guid as well.
int MaxMessID = dbMeasuringsExisting.Max(p => (int?)p.MessID) ?? 0;
There is no need for the Any() statement as that causes an additional trip to the database which is not desirable in this case.

rethinkdb - hasFields to find all documents with multiple multiple missing conditions

I found an answer for finding all documents in a table with missing fields in this SO thread RethinkDB - Find documents with missing field, however I want to filter according to a missing field AND a certain value in a different field.
I want to return all documents that are missing field email and whose isCurrent: value is 1. So, I want to return all current clients who are missing the email field, so that I can add the field.
The documentation on rethink's site does not cover this case.
Here's my best attempt:
r.db('client').table('basic_info').filter(function (row) {
return row.hasFields({email: true }).not(),
/*no idea how to add another criteria here (such as .filter({isCurrent:1})*/
}).filter
Actually, you can do it in one filter. And, also, it will be faster than your current solution:
r.db('client').table('basic_info').filter(function (row) {
return row.hasFields({email: true }).not()
.and(row.hasFields({isCurrent: true }))
.and(row("isCurrent").eq(1));
})
or:
r.db('client').table('basic_info').filter(function (row) {
return row.hasFields({email: true }).not()
.and(row("isCurrent").default(0).eq(1));
})
I just realized I can chain multiple .filter commands.
Here's what worked for me:
r.db('client').table('basic_info').filter(function (row) {
return row.hasFields({email: true }).not()
}).filter({isCurrent: 1}).;
My next quest: put all of these into an array and then feed the email addresses in batch

Searching a MongoDB collection from the end (c#)

I am looking for the most efficient way to get the last elements of a fairly large (> 1 million docs) MongoDB collection.
Specifically, it is the oplog collection and I am looking for all entries after a given timestamp. It makes no sense to search the first million or so entries for a timestamp larger than the current one, since they are all definitely older because the collection is stored in its natural order.
Is there a way to tell MongoDB to search from the end of a collection?
I tried a linq query with Skip(N) but it's very slow. It seems it parses through all documents from the beginning and just doesn't return the first N.
The most efficient way is probably using aggregation. If your collection is sorted, you can get the last Timestamp using this aggregation:
var group = new BsonDocument
{
{
"$group", new BsonDocument
{
{"_id", 0},
{"newestTimeStamp", new BsonDocument { {"$last","$timeStamp"} } }
}
}
};
var pipeline = new[] {group};
var result = _dtCollection.Aggregate(pipeline);
}
Then you can deserialize the result into a Timestamp class. If you want to get several elements, you could create a similar expression using $match.
Also make sure to add an index to the collection on the TimeStamp field. This will probably make your LINQ-query faster if you decide to use that instead.

Rethinkdb: Know which records were updated

Is there any way to update a sequence and know the primary keys of the updated documents?
table.filter({some:"value"}).update({something:"else"})
Then know the primary keys of the records that were updated without needing a second query?
It's currently not possible to return multiple values with {returnVals: true}, see for example https://github.com/rethinkdb/rethinkdb/issues/1382
There's is however a way to trick the system with forEach
r.db('test').table('test').filter({some: "value"}).forEach(function(doc) {
return r.db('test').table('test').get(doc('id')).update({something: "else"}, {returnVals: true}).do(function(result) {
return {generated_keys: [result("new_val")]}
})
})("generated_keys")
While it works, it's really really hack-ish. Hopefully with array limits, returnVals will soon be available for range writes.

Sorting a NotesDocumentCollection based on a date field in SSJS

Using Server side javascript, I need to sort a NotesDcumentCollection based on a field in the collection containing a date when the documents was created or any built in field when the documents was created.
It would be nice if the function could take a sort option parameter so I could put in if I want the result back in ascending or descending order.
the reason I need this is because I use database.getModifiedDocuments() which returns an unsorted notesdocumentcollection. I need to return the documents in descending order.
The following code is a modified snippet from openNTF which returns the collection in ascending order.
function sortColByDateItem(dc:NotesDocumentCollection, iName:String) {
try{
var rl:java.util.Vector = new java.util.Vector();
var tm:java.util.TreeMap = new java.util.TreeMap();
var doc:NotesNotesDocument = dc.getFirstDocument();
while (doc != null) {
tm.put(doc.getItemValueDateTimeArray(iName)[0].toJavaDate(), doc);
doc = dc.getNextDocument(doc);
}
var tCol:java.util.Collection = tm.values();
var tIt:java.util.Iterator = tCol.iterator();
while (tIt.hasNext()) {
rl.add(tIt.next());
}
return rl;
}catch(e){
}
}
When you construct the TreeMap, pass a Comparator to the constructor. This allows you to define custom sorting instead of "natural" sorting, which by default sorts ascending. Alternatively, you can call descendingMap against the TreeMap to return a clone in reverse order.
This is a very expensive methodology if you are dealing with large number of documents. I mostly use NotesViewEntrycollection (always sorted according to the source view) or view navigator.
For large databases, you may use a view, sorted according to the modified date and navigate through entries of that view until the most recent date your code has been executed (which you have to save it somewhere).
For smaller operations, Tim's method is great!

Resources