display calculated data with CouchDB and PouchDB - view

I'm trying to understand how to return calculated data on docs using CouchDB and PouchDB.
Say I have two types of docs on my CouchDB: Blocks and Reports.
A Report consists of: report_id, block_id and date.
A Block consists of: block_id and name.
I'd like to calculate, for each block, its last report_id (the id of the most recent report), and return it with the block's doc.
Is there a way to achieve that?
I'm assuming that a View of some type will do the trick but I can't figure it out.

You can do this with map/reduce functions in CouchDB.
Let's say you have these documents:
{
  "_id": "report_1",
  "type": "report",
  "block_id": "block_1",
  "date": "1500325245"
}
{
  "_id": "report_2",
  "type": "report",
  "block_id": "block_1",
  "date": "1153170045"
}
You would like to get the report with the highest timestamp (in this case, report_1).
We start by creating a map function that maps each report with its block_id as the key, and the timestamp plus the report id as the value for the reduce function.
Map :
function (doc) {
  if (doc.type == "report")
    emit(doc.block_id, { date: doc.date, report: doc._id });
}
Then, we create a reduce function. Since the value it returns has the same shape as the values emitted by the map function, the same body can handle both the first reduce pass and the rereduce pass: in both cases we simply keep the value with the highest timestamp.
Reduce function :
function (keys, values, rereduce) {
  // Keep the {date, report} pair with the largest timestamp.
  // Works unchanged for rereduce, because we return the same shape we receive.
  var max = { date: "0", report: null };
  for (var i = 0; i < values.length; i++) {
    if (parseInt(values[i].date, 10) > parseInt(max.date, 10)) {
      max = values[i];
    }
  }
  // The id of the most recent report is in max.report.
  return max;
}
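To check the reduce logic outside CouchDB, here is a plain-Node sketch: the view machinery is simulated (no CouchDB involved), and a max-by-timestamp reduce is run over the mapped rows and then rereduced over the partial result, mimicking the two passes CouchDB performs.

```javascript
// Plain-Node sketch of the view logic; only the reduce body
// resembles real view code, the rest simulates CouchDB's passes.
function reduceFn(keys, values, rereduce) {
  // Keep the {date, report} pair with the largest timestamp;
  // identical behaviour for the reduce and rereduce pass.
  var max = { date: "0", report: null };
  for (var i = 0; i < values.length; i++) {
    if (parseInt(values[i].date, 10) > parseInt(max.date, 10)) {
      max = values[i];
    }
  }
  return max;
}

// Values as the map function would emit them for block_1:
var mapped = [
  { date: "1500325245", report: "report_1" },
  { date: "1153170045", report: "report_2" }
];

var firstPass = reduceFn(null, mapped, false);  // reduce over emitted values
var result = reduceFn(null, [firstPass], true); // rereduce over partial results
console.log(result.report); // report_1
```

Querying the view with group=true then yields one row per block_id whose value carries the most recent report.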

Related

elasticsearch sort by price with currency

I have data
{
  "id": 1000,
  "price": "99,01USA"
},
{
  "id": 1001,
  "price": "100USA"
},
{
  "id": 1002,
  "price": "780USA"
},
{
  "id": 1003,
  "price": "20USA"
}
How can I sort by price (ASC, DESC)?
You can alter it a little to parse the price to a number and then sort on it.
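For the price strings in the question (e.g. "99,01USA"), a small sketch of that idea: strip the currency suffix, treat the comma as a decimal point, and sort on the numeric value (the format is taken from the question; adjust the parsing to your real data):

```javascript
var items = [
  { id: 1000, price: "99,01USA" },
  { id: 1001, price: "100USA" },
  { id: 1002, price: "780USA" },
  { id: 1003, price: "20USA" }
];

// Convert "99,01USA" -> 99.01: comma becomes a decimal point,
// then everything that is not a digit or dot is dropped.
function priceToNumber(price) {
  return parseFloat(price.replace(",", ".").replace(/[^0-9.]/g, ""));
}

items.sort(function (a, b) {
  return priceToNumber(a.price) - priceToNumber(b.price); // ASC; swap a and b for DESC
});

console.log(items.map(function (i) { return i.id; })); // [ 1003, 1000, 1001, 1002 ]
```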
You can create a dynamic sort function that sorts objects by their value that you pass:
function dynamicSort(property) {
  var sortOrder = 1;
  if (property[0] === "-") {
    sortOrder = -1;
    property = property.substr(1);
  }
  return function (a, b) {
    /* next line works with strings and numbers,
     * and you may want to customize it to your needs
     */
    var result = (a[property] < b[property]) ? -1 : (a[property] > b[property]) ? 1 : 0;
    return result * sortOrder;
  }
}
So you can have an array of objects like this:
var People = [
  { Name: "Name", Surname: "Surname" },
  { Name: "AAA", Surname: "ZZZ" },
  { Name: "Name", Surname: "AAA" }
];
...and it will work when you do:
People.sort(dynamicSort("Name"));
People.sort(dynamicSort("Surname"));
People.sort(dynamicSort("-Surname"));
Actually this already answers the question. The part below was added because many people contacted me, complaining that it doesn't work with multiple parameters.
Multiple Parameters
You can use the function below to generate sort functions with multiple sort parameters.
function dynamicSortMultiple() {
  /*
   * save the arguments object as it will be overwritten
   * note that arguments object is an array-like object
   * consisting of the names of the properties to sort by
   */
  var props = arguments;
  return function (obj1, obj2) {
    var i = 0, result = 0, numberOfProperties = props.length;
    /* try getting a different result from 0 (equal)
     * as long as we have extra properties to compare
     */
    while (result === 0 && i < numberOfProperties) {
      result = dynamicSort(props[i])(obj1, obj2);
      i++;
    }
    return result;
  }
}
Which would enable you to do something like this:
People.sort(dynamicSortMultiple("Name", "-Surname"));
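For a quick sanity check, here is the above wired together as one runnable snippet (same functions and data as earlier in this answer; only the final console.log is added):

```javascript
function dynamicSort(property) {
  var sortOrder = 1;
  if (property[0] === "-") {
    sortOrder = -1;
    property = property.substr(1);
  }
  return function (a, b) {
    var result = (a[property] < b[property]) ? -1 : (a[property] > b[property]) ? 1 : 0;
    return result * sortOrder;
  };
}

function dynamicSortMultiple() {
  var props = arguments; // property names to sort by, in priority order
  return function (obj1, obj2) {
    var i = 0, result = 0, numberOfProperties = props.length;
    // keep comparing until a property breaks the tie
    while (result === 0 && i < numberOfProperties) {
      result = dynamicSort(props[i])(obj1, obj2);
      i++;
    }
    return result;
  };
}

var People = [
  { Name: "Name", Surname: "Surname" },
  { Name: "AAA", Surname: "ZZZ" },
  { Name: "Name", Surname: "AAA" }
];

// Sort by Name ascending, then Surname descending:
People.sort(dynamicSortMultiple("Name", "-Surname"));
console.log(People.map(function (p) { return p.Surname; })); // [ 'ZZZ', 'Surname', 'AAA' ]
```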
Subclassing Array
For the lucky among us who can use ES6, which allows extending the native objects:
class MyArray extends Array {
sortBy(...args) {
return this.sort(dynamicSortMultiple(...args));
}
}
That would enable this:
MyArray.from(People).sortBy("Name", "-Surname");

Getting field value in elastic search plugin against document returned by result

My requirement is to search documents from Elasticsearch based on fuzzy matching and then 'rescore' the documents by comparing a field of each document against an input string. For example, if the query returns 3 documents (doc:1,2,3), then for the constant value 'Star Wars' the comparison should be:
doc:1, MovieName:"Star Wars" (compare ('Star Wars','Star Wars'))
doc:2, MovieName:"Starr Warz" (compare ('Star Wars','Starr Warz'))
doc:3, MovieName:"The Star Wars" (compare ('Star Wars','The Star Wars'))
I found the following elasticsearch rescore plugin example and implemented it to achieve the above.
https://github.com/elastic/elasticsearch/blob/6.2/plugins/examples/rescore/src/main/java/org/elasticsearch/example/rescore/ExampleRescoreBuilder.java
I am able to pass and access the input 'Star Wars' in the plugin, however I am facing trouble getting the value of the MovieName field of the documents returned in the results (topdocs).
My Query:
GET movie-idx/_search?
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "fields": [
              "MovieName"
            ],
            "query": "Star Wars",
            "minimum_should_match": "61%",
            "fuzziness": 1,
            "_name": "fuzzy"
          }
        }
      ]
    }
  },
  "rescore": {
    "calculateMovieScore": {
      "MovieName": "Star Wars"
    }
  }
}
And my rescorer class looks like:
private static class DocsRescorer implements Rescorer {
    private static final DocsRescorer INSTANCE = new DocsRescorer();

    @Override
    public TopDocs rescore(TopDocs topDocs, IndexSearcher searcher, RescoreContext rescoreContext) throws IOException {
        DocRescoreContext context = (DocRescoreContext) rescoreContext;
        int end = Math.min(topDocs.scoreDocs.length, rescoreContext.getWindowSize());
        MovieScorer movieScorer = new MovieScorerBuilder()
                .withInputName(context.MovieName)
                .build();
        for (int i = 0; i < end; i++) {
            String name = <get MovieName value from actual document returned by topdocs>
            float score = movieScorer.calculateScore(name);
            topDocs.scoreDocs[i].score = score;
        }
        List<ScoreDoc> scoreDocList = Stream.of(topDocs.scoreDocs)
                .filter((a) -> a.score >= context.threshold)
                .sorted((a, b) -> {
                    if (a.score > b.score) {
                        return -1;
                    }
                    if (a.score < b.score) {
                        return 1;
                    }
                    // Safe because doc ids >= 0
                    return a.doc - b.doc;
                })
                .collect(Collectors.toList());
        ScoreDoc[] scoreDocs = scoreDocList.toArray(new ScoreDoc[scoreDocList.size()]);
        topDocs.scoreDocs = scoreDocs;
        return topDocs;
    }

    @Override
    public Explanation explain(int topLevelDocId, IndexSearcher searcher, RescoreContext rescoreContext,
            Explanation sourceExplanation) throws IOException {
        DocRescoreContext context = (DocRescoreContext) rescoreContext;
        // Note that this is inaccurate because it ignores the factor field
        return Explanation.match(context.factor, "test", singletonList(sourceExplanation));
    }

    @Override
    public void extractTerms(IndexSearcher searcher, RescoreContext rescoreContext, Set<Term> termsSet) {
        // Since we don't use queries there are no terms to extract.
    }
}
My understanding is that the plugin code executes once: it gets topDocs as the result of the initial query (the fuzzy search in this case), and the for (int i = 0; i < end; i++) loop goes through each document returned in the result. The place where I need help is:
String name = <get MovieName value from actual document returned by topdocs>
I know it's been over 2 years, but I ran into the same problem and found a solution, so I'm posting it here. This was done for a Rescorer plugin in ES 7.8.0; the base example I used is the Grouping plugin Link.
It's a bunch of code that I don't fully understand, but the main principle is that you need an IFD (IndexFieldData<?>) instance for the field you want to read. In my example, I just needed the _id of the hits. It looks like this:
1) Prepare the IFD in advance and pass it to the RescoreContext: add a member to the class extending RescoreContext to keep this IFD on the context; let's call it "idField" (later used in section 3).
@Override
public RescoreContext innerBuildContext(int windowSize, QueryShardContext queryShardContext) throws IOException {
    return new MyRescoreContext(windowSize, queryShardContext.getForField(queryShardContext.fieldMapper("_id")));
}
2) Next, in the Rescorer itself (method rescore(...)):
2.1) First, sort by scoreDoc.doc:
ScoreDoc[] hits = topDocs.scoreDocs;
Arrays.sort(hits, Comparator.comparingInt((d) -> d.doc));
2.2) Perform black magic (code I don't understand):
List<LeafReaderContext> readerContexts = searcher.getIndexReader().leaves();
int currentReaderIx = -1;
int currentReaderEndDoc = 0;
LeafReaderContext currentReaderContext = null;
// end = number of hits to rescore, e.g. Math.min(hits.length, rescoreContext.getWindowSize())
for (int i = 0; i < end; i++) {
    ScoreDoc hit = hits[i];
    // find the segment that contains the current document
    while (hit.doc >= currentReaderEndDoc) {
        currentReaderIx++;
        currentReaderContext = readerContexts.get(currentReaderIx);
        currentReaderEndDoc = currentReaderContext.docBase + currentReaderContext.reader().maxDoc();
    }
    int docId = hit.doc - currentReaderContext.docBase;
    // code from section 3 goes here //
}
3) And now, with this magical "docId" in hand, you can fetch the value from the IFD inside the for loop:
SortedBinaryDocValues values = rescoreContext.idField.load(currentReaderContext).getBytesValues();
values.advanceExact(docId);
String id = values.nextValue().utf8ToString();
In your case, instead of the _id field, get the IFD for the field you want, and build a HashMap from docId to string value inside that for loop.
Then use this map in the same loop where you apply the score.
Hope this helps! This technique is not documented at all and there are no explanations anywhere.

How should I handle very large projections in an event-sourcing context?

I wanted to explore the implications of event-sourcing vs. active-record.
Suppose I have events with payloads like this:
{
  "type": "userCreated",
  "id": "4a4cf26c-76ec-4a5a-b839-10cadd206eac",
  "name": "Alice",
  "passwordHash": "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
}
... and...
{
  "type": "userDeactivated",
  "id": "39fd0e9a-1025-42e6-8793-ed5bfa236f40"
}
I can reach the current state of my system with a reducer like this:
const activeUsers = new Map();
for (const event of events) {
  // userCreated
  if (event.payload.type == 'userCreated') {
    const { id, name, passwordHash } = event.payload;
    if (!activeUsers.has(id)) {
      activeUsers.set(id, { name, passwordHash });
    }
  }
  // userDeactivated
  if (event.payload.type == 'userDeactivated') {
    const { id } = event.payload;
    if (activeUsers.has(id)) {
      activeUsers.delete(id);
    }
  }
}
However, I cannot have my entire user table in a single Map.
So it seems I need a reducer for each user:
const userReducer = id => // filter events by user id...
But this will lead to slow performance because I need to run a reducer over all events for each new user.
I could also shard the users by a function of their id:
const shard = nShards => id => {
  let hash = 0, i, chr;
  if (id.length === 0) {
    return hash;
  }
  for (i = 0; i < id.length; i++) {
    chr = id.charCodeAt(i);
    hash = ((hash << 5) - hash) + chr;
    hash |= 0; // Convert to 32bit integer
  }
  // hash can be negative; take the absolute value before the modulo
  return Math.abs(hash) % nShards;
};
Then the maps will be less enormous.
How is this problem typically solved in event-sourcing models?
As I understand it, you think you need to replay all the events through a reducer in order to query all the users, correct?
This is where CQRS comes into play, together with read models/denormalizers.
What almost everyone does is keep a read model (stored, for example, in a SQL database or something else that is good at querying data). This read model is continuously updated as new events are created.
When you need to query all users, you query this read model instead of replaying all events.
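A minimal sketch of that idea, reusing the event shapes from the question (the in-memory Map stands in for a real queryable store such as a SQL table; the event ids here are made up):

```javascript
// Read model kept up to date as events arrive, so queries
// never replay the event log.
const readModel = new Map(); // id -> { name, passwordHash }

// Projection: applied once per new event (e.g. from a subscription),
// not on every query.
function project(event) {
  const p = event.payload;
  if (p.type === 'userCreated' && !readModel.has(p.id)) {
    readModel.set(p.id, { name: p.name, passwordHash: p.passwordHash });
  }
  if (p.type === 'userDeactivated') {
    readModel.delete(p.id);
  }
}

project({ payload: { type: 'userCreated', id: 'u1', name: 'Alice', passwordHash: 'x' } });
project({ payload: { type: 'userDeactivated', id: 'u1' } });
console.log(readModel.size); // 0
```

Querying "all active users" is then just a read of the current map (or a SELECT on the read-side table), which stays fast no matter how long the event log grows.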

Script to return array for scripted metric aggregation from combine

For scripted metric aggregation, in the example shown in the documentation, the combine script returns a single number.
Instead, can I pass an array or hash here?
I tried doing it, and though it did not return any error, I am not able to access those values from the reduce script.
In the reduce script, what I get per shard is an instance that converted to string reads as 'Script2$_run_closure1#52ef3bd9'.
Kindly let me know if this can be accomplished in any way.
At least for Elasticsearch version 1.5.1 you can do so.
For example, we can modify Elasticsearch example (scripted metric aggregation) to receive an average profit (profit divided by number of transactions):
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "avg_profit": {
      "scripted_metric": {
        "init_script": "_agg['transactions'] = []",
        "map_script": "if (doc['type'].value == \"sale\") { _agg.transactions.add(doc['amount'].value) } else { _agg.transactions.add(-1 * doc['amount'].value) }",
        "combine_script": "profit = 0; num_of_transactions = 0; for (t in _agg.transactions) { profit += t; num_of_transactions += 1 }; return [profit, num_of_transactions]",
        "reduce_script": "profit = 0; num_of_transactions = 0; for (a in _aggs) { profit += a[0] as int; num_of_transactions += a[1] as int }; return profit / num_of_transactions as float"
      }
    }
  }
}
NOTE: this is just a demo for an array in the combine script, you can calculate average easily without using any arrays.
The response will look like:
"aggregations" : {
  "avg_profit" : {
    "value" : 42.5
  }
}
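The arithmetic done by the combine and reduce scripts above can be sketched in plain JavaScript (no Elasticsearch here; the per-shard transaction amounts are made up for illustration):

```javascript
// combine_script equivalent: each shard returns one [profit, count] pair.
function combine(transactions) {
  let profit = 0, count = 0;
  for (const t of transactions) { profit += t; count += 1; }
  return [profit, count];
}

// reduce_script equivalent: merge the per-shard pairs, then average.
function reduce(perShard) {
  let profit = 0, count = 0;
  for (const [p, c] of perShard) { profit += p; count += c; }
  return profit / count;
}

// Two hypothetical shards (sale amounts positive, costs negative):
const shardA = combine([100, -15]); // [85, 2]
const shardB = combine([50, 35]);   // [85, 2]
console.log(reduce([shardA, shardB])); // 42.5
```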

SNMP4j agent snmp table

I have created an SNMP agent using the snmp4j API but I am getting an issue with SNMP table registration.
Once I register a table and add rows to it, if I then set a value in the table, all the rows get set to the same value.
I have an SNMP table created from JSON.
In the table below, if I set the value .1.3.6.1.4.1.1.201.6.2. it sets the values for all the rows that are registered in the table. Does anyone know how to register rows and set their values properly using the SNMP4J agent?
{
  "tableName": "table1",
  "tableId": ".1.3.6.1.4.1.1.201.6.1",
  "columns": [
    {
      "columnName": "column1",
      "columnOID": 1,
      "dataType": 70,
      "accessType": 1,
      "defaultValue": 0
    },
    {
      "columnName": "column2",
      "columnOID": 2,
      "dataType": 70,
      "accessType": 1,
      "defaultValue": 0
    },
    {
      "columnName": "column3",
      "columnOID": 3,
      "dataType": 70,
      "accessType": 1,
      "defaultValue": 0
    }
  ]
}
public static MOTable<MOTableRow<Variable>, MOColumn<Variable>, MOTableModel<MOTableRow<Variable>>> createTableFromJSON(
        JSONObject data) {
    MOTable table = null;
    if (data != null) {
        MOTableSubIndex[] subIndex = new MOTableSubIndex[] { moFactory
                .createSubIndex(null, SMIConstants.SYNTAX_INTEGER, 1, 100) };
        MOTableIndex index = moFactory.createIndex(subIndex, false,
                new MOTableIndexValidator() {
                    public boolean isValidIndex(OID index) {
                        boolean isValidIndex = true;
                        return isValidIndex;
                    }
                });
        Object indexesObj = data.get("indexValues");
        if (indexesObj != null) {
            String indexes = data.getString("indexValues");
            String tableOID = data.getString("tableId");
            JSONArray columnArray = data.getJSONArray("columns");
            int columnSize = columnArray.size();
            MOColumn[] columns = new MOColumn[columnSize];
            Variable[] initialValues = new Variable[columnSize];
            for (int i = 0; i < columnSize; i++) {
                JSONObject columnObject = columnArray.getJSONObject(i);
                columns[i] = moFactory.createColumn(columnObject
                        .getInt("columnOID"), columnObject.getInt("dataType"),
                        moFactory.createAccess(columnObject
                                .getInt("accessType")));
                initialValues[i] = getVariable(columnObject.get("defaultValue"));
            }
            MOTableModel tableModel = moFactory.createTableModel(new OID(
                    tableOID), index, columns);
            table = moFactory.createTable(new OID(tableOID), index, columns,
                    tableModel);
            String[] indexArrString = indexes.split(";");
            for (String indexStr : indexArrString) {
                MOTableRow<Variable> row = createRow(new Integer(indexStr.trim()), initialValues);
                table.addRow(row);
            }
        }
    }
    return table;
}
First of all, OIDs do not start with a dot (as specified by ASN.1).
Second, you do not seem to use any row index data. Rows are identified by their indexes. A row index is the instance identifier suffix of a tabular instance OID:
<tableOID>.1.<rowIndex>
It can consist of several sub-index values encoded as OIDs.