Custom Unicode sorting order for a PouchDB/CouchDB index (Mango query)

I am using PouchDB (with a Cloudant remote database) to have a local database in a dictionary web app.
I need to have an index with a custom Pashto alphabet order (using Arabic unicode letters).
The localdb.find queries with $gte (alphabetically searching with partial words) do not work well because of the irregular Unicode characters in the Pashto alphabet.
Is it possible to create a custom sort, based on the Pashto alphabet, for an index?
See Mango Query Language

In this reference it is mentioned that:
The most important feature of a view result is that it is sorted by key.
Assume you have a database consisting of docs with a unicodeString field inside each doc. So a sample doc would look like below:
{
  "_id": "2018-01-30-18-04-11",
  "_rev": "AE19EBC7654",
  "title": "Hello elephant",
  "unicodeString": "שלום פיל"
}
Now you can have a CouchDB view with a map function like this:
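function (doc) {
  // Key: doc.unicodeString (the view is sorted by it); value: doc.title.
  // Docs without the field are skipped so they do not emit an undefined key.
  if (doc.unicodeString) {
    emit(doc.unicodeString, doc.title);
  }
}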
The above view sorts all the docs in the database by their key, which is doc.unicodeString. Therefore, if you query this view, your docs come back sorted by the Unicode string inside each doc.
If you have 3 docs in the database and query the above view, you receive a response like this, in which the rows array is sorted by the key of each row:
{
  "total_rows": 3,
  "offset": 0,
  "rows": [
    {
      "key": "ארץ",
      "id": "2017-09-01-09-05-11",
      "value": "Earth"
    },
    {
      "key": "בין",
      "id": "2015-01-19-11-30-28",
      "value": "between"
    },
    {
      "key": "שלום פיל",
      "id": "2018-01-30-18-04-11",
      "value": "Hello elephant"
    }
  ]
}
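The view above sorts string keys with the database's default collation, which may not match the Pashto alphabet. One possible workaround, which is not part of the answer above, is to emit a derived key: an array holding each letter's rank in the custom alphabet. The sketch below is a hedged illustration only. ALPHABET is a hypothetical, deliberately incomplete five-letter list (alef, be, pe, te, tte), and sortKey, compareKeys, prefixRange and mapByCustomOrder are made-up helper names. It relies on CouchDB-style collation comparing arrays element by element and sorting objects after numbers.

```javascript
// Hypothetical, incomplete alphabet: alef, be, pe, te, tte.
// Replace it with the full Pashto alphabet in the order you want.
var ALPHABET = ["\u0627", "\u0628", "\u067E", "\u062A", "\u067C"];

// Turn a word into an array of ranks. Characters that are not in
// ALPHABET sort after the known letters, by UTF-16 code unit.
function sortKey(word) {
  return word.split("").map(function (ch) {
    var rank = ALPHABET.indexOf(ch);
    return rank === -1 ? ALPHABET.length + ch.charCodeAt(0) : rank;
  });
}

// Map function for the view: the rank array becomes the key.
// In a stored design doc, ALPHABET and sortKey would have to be
// inlined inside this function, because map functions are
// stored as standalone source text.
function mapByCustomOrder(doc) {
  if (doc.unicodeString) {
    emit(sortKey(doc.unicodeString), doc.title);
  }
}

// Local stand-in for array collation on numeric arrays:
// compare element by element, shorter array first on a tie.
function compareKeys(a, b) {
  for (var i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return a.length - b.length;
}

// Key range for a "starts with" search: {} sorts after any number,
// so the range covers every key that begins with the prefix ranks.
function prefixRange(prefix) {
  var start = sortKey(prefix);
  return { startkey: start, endkey: start.concat([{}]) };
}

// Demo: "te alef" precedes "pe alef" by code point,
// but "pe alef" comes first in the custom order.
var words = ["\u062A\u0627", "\u067E\u0627"];
var sorted = words.slice().sort(function (x, y) {
  return compareKeys(sortKey(x), sortKey(y));
});
```

The startkey and endkey from prefixRange could then be passed as view query options in place of a $gte selector, assuming the view is built with mapByCustomOrder.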

Related

Filtering JSON based on sub array in a Power Automate Flow

I have some json data that I would like to filter in a Power Automate Flow.
A simplified version of the json is as follows:
[
  {
    "ItemId": "1",
    "Blah": "test1",
    "CustomFieldArray": [
      {
        "Name": "Code",
        "Value": "A"
      },
      {
        "Name": "Category",
        "Value": "Test"
      }
    ]
  },
  {
    "ItemId": "2",
    "Blah": "test2",
    "CustomFieldArray": [
      {
        "Name": "Code",
        "Value": "B"
      },
      {
        "Name": "Category",
        "Value": "Test"
      }
    ]
  }
]
For example, I wish to filter items based on Name = "Code" and Value = "A". I should be left with the item with ItemId 1 in that case.
I can't figure out how to do this in Power Automate. It would be nice to change the data structure, but this is the way the data is, and I'm trying to work out if this is possible in Power Automate without changing the data itself.
Firstly, I had to fix your JSON, it wasn't complete.
Secondly, filtering on sub array information isn't what I'd call easy. However, to get around the limitations, you can perform a bit of trickery.
Prior to the step above, I created a variable of type Array and called it Array.
In the step above, the left hand side expression is ...
string(item()?['CustomFieldArray'])
... and the contains comparison on the right hand side is simply as you can see, a string with the appropriate filter value ...
{"Name":"Code","Value":"A"}
... it's not an expression or a proper object, just a string.
If you need the match to be case-insensitive, just set everything to lower case using the toLower expression on the left (and use a lower-case filter string on the right).
Although it's hard to see, that will produce your desired result ...
... you can see by the vertical scrollbars that it's reduced the size of the array.
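For readers who want to see the logic of the trick outside Power Automate, here is a rough JavaScript analogue. It is a sketch, not a Power Automate expression: JSON.stringify stands in for string(), indexOf stands in for the contains comparison, and the variable names are made up for illustration. It also shows a structural alternative that does not depend on key order or whitespace in the serialized text.

```javascript
var items = [
  {
    ItemId: "1",
    Blah: "test1",
    CustomFieldArray: [
      { Name: "Code", Value: "A" },
      { Name: "Category", Value: "Test" }
    ]
  },
  {
    ItemId: "2",
    Blah: "test2",
    CustomFieldArray: [
      { Name: "Code", Value: "B" },
      { Name: "Category", Value: "Test" }
    ]
  }
];

// The string trick: serialize the sub array and look for the exact
// text of the wanted entry. This only works while the serialized form
// keeps the same key order and has no extra whitespace.
var needle = '{"Name":"Code","Value":"A"}';
var byString = items.filter(function (item) {
  return JSON.stringify(item.CustomFieldArray).indexOf(needle) !== -1;
});

// Structural alternative: inspect the entries themselves.
var byStructure = items.filter(function (item) {
  return item.CustomFieldArray.some(function (field) {
    return field.Name === "Code" && field.Value === "A";
  });
});
```

Both filters keep only the item with ItemId 1 for the sample data.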

Index main-object, sub-objects, and do a search on sub-objects (that return sub-objects)

I have an object like the one below (simplified here). Each strain has many chromosomes, which have many locus, which have many features, which have many products, and so on. Here I include just one of each.
The structure in json is:
{
  "name": "my strain",
  "public": false,
  "authorized_users": [1, 23, 51],
  "chromosomes": [
    {
      "name": "C1",
      "locus": [
        {
          "name": "locus1",
          "features": [
            {
              "name": "feature1",
              "products": [
                {
                  "name": "product1"
                  //...
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
I want to add this object to Elasticsearch. For the moment I have added the objects separately: locus, features and products. Searching works (I want to type a keyword and look it up in the names of locus, features and products), but I need to duplicate data like public and authorized_users in each subobject.
Can I index the whole object in Elasticsearch and still search at each level (locus, features and products) and get the matches back individually, without returning the Strain object?
Yes you can search at any level (ie, with a query like "chromosomes.locus.name").
But as you have arrays at each level, you will have to use nested objects (and nested query) to get exactly what you want, which is a bit more complex:
https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
https://www.elastic.co/guide/en/elasticsearch/reference/5.3/query-dsl-nested-query.html
For your last question: by default, no. Elasticsearch returns the whole JSON source document for each hit, not the individual subobjects.
If you want only data from the subobjects, you will have to use nested aggregations, or request inner_hits on the nested query.
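As a hedged illustration of the nested query mentioned above, the sketch below builds a request body that searches locus names. It assumes the index mapping declares both chromosomes and chromosomes.locus as nested fields, and locusNameQuery is a made-up helper name. The inner_hits option asks Elasticsearch to return the matching nested objects alongside each hit, and _source: false leaves out the full strain document.

```javascript
// Build a search body that matches a keyword against locus names.
// Assumes "chromosomes" and "chromosomes.locus" are mapped as nested.
function locusNameQuery(keyword) {
  return {
    _source: false, // do not return the whole strain document
    query: {
      nested: {
        path: "chromosomes.locus",
        query: {
          match: { "chromosomes.locus.name": keyword }
        },
        inner_hits: {} // return the matching locus objects per hit
      }
    }
  };
}

var body = locusNameQuery("locus1");
```

The same shape, with a deeper path, would apply to features and products under the same mapping assumption.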

Using elastic search to build flow/funnel results based on unique identifiers

I want to be able to return a set of counts of individual documents from a single index based on a previous set of results, and am wondering if there is a way to do it without running a separate query for each.
So, given a data set like this (simplified version of my ES documents):
{
  "name": "visit",
  "sessionId": "session1"
},
{
  "name": "visit",
  "sessionId": "session2"
},
{
  "name": "visit",
  "sessionId": "session3"
},
{
  "name": "click",
  "sessionId": "session1"
},
{
  "name": "click",
  "sessionId": "session3"
}
What I would like to do is be able to search for name: visit and give a count of all those. That part is easy. But I would also like to be able to now count my name: click docs that have the sessionId of the name: visit result set and return a count of how many of those name: click there were as well as the name: visit.
Is there an easy way to do this? I have looked at aggregation APIs but they all seem to not quite fit my needs. There also seems to be a parent/child relationship but it doesn't apply to my situation since both documents I want to individually get counts of are of the same type.
Expected result would be something like this:
{
  "count": {
    // total number of visit events since this is my start point
    "visit": 3,
    // the amount of click results that have sessionId
    // matching my previous search's sessionId values
    "click": 2
  }
}
At first glance, you need to do this in two queries:
the first aggregation query to retrieve the sessionIds and
a second aggregation query filtered with those sessionIds to find the count of clicks.
I don't think it's a big deal to run those two queries, but that depends on how much data you have and how many sessionIds you want to retrieve at once.
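The two queries described above could look roughly like the request bodies below. This is a sketch under assumptions: name and sessionId are taken to be keyword fields, the aggregation name sessions and the three function names are made up for illustration, and maxSessions caps how many sessionIds the first query returns.

```javascript
// Query 1: match visits and collect their sessionIds in a terms aggregation.
function visitSessionsQuery(maxSessions) {
  return {
    size: 0,
    query: { term: { name: "visit" } },
    aggs: {
      sessions: { terms: { field: "sessionId", size: maxSessions } }
    }
  };
}

// Pull the sessionIds out of the first response's aggregation buckets.
function sessionIdsFrom(response) {
  return response.aggregations.sessions.buckets.map(function (bucket) {
    return bucket.key;
  });
}

// Query 2: match clicks restricted to the sessionIds from query 1.
function clickCountQuery(sessionIds) {
  return {
    size: 0,
    query: {
      bool: {
        filter: [
          { term: { name: "click" } },
          { terms: { sessionId: sessionIds } }
        ]
      }
    }
  };
}
```

The visit and click counts would then come from the total hit counts of the two responses.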

CouchDB pagination and sorting

So I am using this approach from the CouchDB docs to perform pagination.
Request rows_per_page + 1 rows from the view
Display rows_per_page rows, store + 1 row as next_startkey and next_startkey_docid
As page information, keep startkey and next_startkey
Use the next_* values to create the next link, and use the others to create the previous link
One thing I don't understand is how to sort with this approach, assuming each document has a last updated timestamp and I want to sort by that field instead of by id.
First of all, sorting will always be on the KEYS.
Querying _all_docs is like querying a table where the key is the _id.
[
  {
    "key": "my_first_id",
    "value": {}
  },
  {
    "key": "my_second_id",
    "value": {}
  }
]
So if you want to sort on a field other than _id, you will need to use Map/Reduce (views). For example, you could create a view where the key is the updatedAt field.
This would result in something like this :
[
  {
    "key": "1475553268",
    "value": {}
  },
  {
    "key": "1475858068",
    "value": {}
  }
]
So the rows come back sorted by that key :)
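Putting the view and the pagination recipe from the question together might look like the sketch below. It is illustrative only: updatedAt is the field name used in the answer, while mapByUpdatedAt, pageOptions and splitPage are made-up helper names. The option names limit, startkey and startkey_docid are the standard view query parameters.

```javascript
// Map function for a view keyed by the timestamp.
function mapByUpdatedAt(doc) {
  if (doc.updatedAt) {
    emit(doc.updatedAt, null);
  }
}

// Build the view query options for one page: always ask for one extra row.
// cursor is null for the first page.
function pageOptions(rowsPerPage, cursor) {
  var opts = { limit: rowsPerPage + 1 };
  if (cursor) {
    opts.startkey = cursor.startkey;
    opts.startkey_docid = cursor.startkey_docid;
  }
  return opts;
}

// Split the returned rows into the rows to display and the cursor
// for the next page (null when there is no extra row).
function splitPage(rows, rowsPerPage) {
  var extra = rows.length > rowsPerPage ? rows[rowsPerPage] : null;
  return {
    display: rows.slice(0, rowsPerPage),
    next: extra ? { startkey: extra.key, startkey_docid: extra.id } : null
  };
}
```

Because the view key is the timestamp, each page follows the updatedAt order, and startkey_docid keeps paging stable when several docs share the same timestamp.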

RestKit 2.0 - Mapping json array to an entity relationship loses array sequence

I have a problem mapping json to CoreData and reading it out again. I map from json to an Activity entity with a relationship to last participant entities. The last_participants field is an array of the most recent participants, ordered most recent first by the API.
{
  "id": 50,
  "type": "Initiative",
  "last_participants": [
    {
      "id": 15,
      "first_name": "Chris"
    },
    {
      "id": 3,
      "first_name": "Mary"
    },
    {
      "id": 213,
      "first_name": "Dany"
    }
  ]
}
I have RestKit logging on and see that the mapping reads the array elements one by one and keeps the order. However, CoreData saves them as an NSSet of entities, and then the order gets lost. When I read out the data, it is mixed up. What options do I have to keep the order in which the array was mapped? Any help would be great.
2 options:
Use an ordered set in Core Data (set on the relationship in the data model inspector).
Use the @metadata provided by RestKit to access the collection order during mapping.
