I have a ProductDocument model in CosmosDB, which represents a Product. Within that model there is a subdocument contributors which holds who has contributed to the Product. Each contributor has a role.
I have been experimenting with a query that needs to:
1. Only select ProductDocument with a contributor.roleDescription of Author
2. Only select ProductDocument with a division of Pub 1
3. Only include contributors subdocuments with a contributor.roleDescription of Author in the result set.
I'm struggling with:
Part 3 of the list above. How do I accomplish this, given that my result set currently includes contributors with a roleDescription of both Author AND Illustrator?
Example Cosmos Model:
[
{
"id": "1",
"coverTitle": "A Title",
"pubPrice": 2.99,
"division" :"Pub 1",
"Availability": {
"code": "20",
"description": "Digital No Stock"
},
"contributors": [
{
"id": 1,
"firstName": "Brad",
"lastName": "Smith",
"roleDescription": "Author",
"roleCode": "A01"
},
{
"id": 2,
"firstName": "Steve",
"lastName": "Bradley",
"roleDescription": "Illustrator",
"roleCode": "A12"
}
]
},
{
"id": "2",
"coverTitle": "Another Title",
"division" :"Pub 2",
"pubPrice": 2.99,
"Availability": {
"code": "50",
"description": "In Stock"
},
"contributors": [
{
"id": 1,
"firstName": "Gareth Bradley",
"lastName": "Smith",
"roleDescription": "Author",
"roleCode": "A01"
}
]
}]
Here is my SQL which I have been playing around with in the Data Explorer:
SELECT VALUE p
FROM Products p
JOIN c IN p.contributors
WHERE c.roleDescription = 'Author'
AND p.division = 'Pub 1'
Here is my LINQ query from my service:
var query = client.CreateDocumentQuery<ProductDocument>(
        UriFactory.CreateDocumentCollectionUri("BiblioAPI", "Products"),
        new FeedOptions
        {
            MaxItemCount = -1,
            EnableCrossPartitionQuery = true
        })
    .SelectMany(product => product.Contributors
        .Where(contributor => contributor.RoleDescription == "Author")
        .Select(c => product)
        .Where(p => product.Division == "Pub 1"))
    .AsDocumentQuery();
List<ProductDocument> results = new List<ProductDocument>();
while (query.HasMoreResults)
{
    results.AddRange(await query.ExecuteNextAsync<ProductDocument>());
}
It selects the correct records, but how do I exclude the Illustrator contributor subdocument? At the moment I get the following:
{
"id": "1",
"coverTitle": "A Title",
"pubPrice": 2.99,
"division" :"Pub 1",
"Availability": {
"code": "20",
"description": "Digital No Stock"
},
"contributors": [
{
"id": 1,
"firstName": "Brad",
"lastName": "Smith",
"roleDescription": "Author",
"roleCode": "A01"
},
{
"id": 2,
"firstName": "Steve",
"lastName": "Bradley",
"roleDescription": "Illustrator",
"roleCode": "A12"
}
]
}
But the following output is what I want, excluding the Illustrator contributor sub document:
{
"id": "1",
"coverTitle": "A Title",
"pubPrice": 2.99,
"division" :"Pub 1",
"Availability": {
"code": "20",
"description": "Digital No Stock"
},
"contributors": [
{
"id": 1,
"firstName": "Brad",
"lastName": "Smith",
"roleDescription": "Author",
"roleCode": "A01"
}
]
}
EDIT:
I would like to select a Product only if at least one of its contributor subdocuments has a roleDescription equal to Author. So if the Product record doesn't include an Author contributor, I don't want it.
I want to include every contributor subdocument whose roleDescription equals Author. So if there are multiple Author contributor subdocuments for a Product, I want to include them all, but exclude the Illustrator ones.
The result could be a collection of ProductDocuments.
Help with the fluent LINQ syntax would be greatly appreciated.
Azure CosmosDB now supports subqueries. Using subqueries, you could do this in two ways, with minor differences:
You could utilize the ARRAY expression with a subquery in your projection, filtering out contributors that you don’t want, and projecting all your other attributes. This query assumes that you need a select list of attributes to project apart from the array.
SELECT c.id, c.coverTitle, c.division, ARRAY(SELECT VALUE contributor from contributor in c.contributors WHERE contributor.roleDescription = "Author") contributors
FROM c
WHERE c.division="Pub 1"
This assumes that you need to filter on division "Pub 1" first followed by the subquery with the ARRAY expression.
Alternatively, if you want the entire document along with the filtered contributors, you could do this:
SELECT c, ARRAY(SELECT VALUE contributor FROM contributor IN c.contributors WHERE contributor.roleDescription = "Author") contributors
FROM c
WHERE c.division="Pub 1"
This will project each original document with a "Pub 1" division in the property labeled "c", along with a filtered contributor array in a separate property labeled "contributors". You could refer to this contributor array for your filtered contributors and ignore the one in the document.
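To make the intended result shape concrete, here is a minimal sketch in plain Python that mimics the first query on in-memory data. It is not Cosmos DB SDK code; the function name filter_products and the trimmed sample documents are made up for illustration. Note that it also drops products with no matching contributor, which is what the question's EDIT asks for and which is stricter than the two queries above.

```python
def filter_products(products, division, role):
    """Keep products in `division` and trim their contributors to `role`."""
    results = []
    for product in products:
        if product.get("division") != division:
            continue
        matching = [c for c in product.get("contributors", [])
                    if c.get("roleDescription") == role]
        if not matching:
            # Assumption taken from the question's EDIT: skip products
            # that have no contributor with the requested role.
            continue
        # Copy the product and swap in the filtered contributor list,
        # leaving the original document untouched.
        results.append({**product, "contributors": matching})
    return results


# Sample data trimmed from the question.
products = [
    {"id": "1", "coverTitle": "A Title", "division": "Pub 1",
     "contributors": [
         {"id": 1, "firstName": "Brad", "roleDescription": "Author"},
         {"id": 2, "firstName": "Steve", "roleDescription": "Illustrator"},
     ]},
    {"id": "2", "coverTitle": "Another Title", "division": "Pub 2",
     "contributors": [
         {"id": 1, "firstName": "Gareth Bradley", "roleDescription": "Author"},
     ]},
]

filtered = filter_products(products, "Pub 1", "Author")
```

With the sample data, filtered holds only product "1", and its contributors list contains only the Author entry.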
This will do what you want for the sample data, but if a product has multiple matching contributors it might not do quite what you are after. It's hard to tell from your question whether that is exactly what you want.
SELECT p.id, p.coverTitle, p.pubPrice, p.division, p.Availability, c as contributors
FROM Products p
JOIN c IN p.contributors
WHERE c.roleDescription = 'Author'
AND p.division = 'Pub 1'
and the output is:
[
{
"id": "1",
"coverTitle": "A Title",
"pubPrice": 2.99,
"division": "Pub 1",
"Availability": {
"code": "20",
"description": "Digital No Stock"
},
"contributors": {
"id": 1,
"firstName": "Brad",
"lastName": "Smith",
"roleDescription": "Author",
"roleCode": "A01"
}
}
]
Note that contributors is not a list, it's a single value, so if multiple contributors match the filter, then you will get the same product returned multiple times.
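The duplication can be seen in a small in-memory sketch of what the JOIN does: it emits one row per matching contributor. This is plain Python, not Cosmos DB code; join_rows and the second Author in the sample data are hypothetical.

```python
def join_rows(products, division, role):
    """Emit one row per (product, matching contributor) pair, like the JOIN."""
    rows = []
    for product in products:
        if product["division"] != division:
            continue
        for contributor in product["contributors"]:
            if contributor["roleDescription"] == role:
                # Copy every product field except the original array,
                # then attach the single matching contributor.
                row = {k: v for k, v in product.items() if k != "contributors"}
                row["contributors"] = contributor
                rows.append(row)
    return rows


# Hypothetical product with two Authors and one Illustrator.
sample = [
    {"id": "1", "division": "Pub 1",
     "contributors": [
         {"id": 1, "roleDescription": "Author"},
         {"id": 2, "roleDescription": "Illustrator"},
         {"id": 3, "roleDescription": "Author"},
     ]},
]

rows = join_rows(sample, "Pub 1", "Author")
```

Here rows contains product "1" twice, once for each Author, which is why the subquery form is the better fit when a product can have several authors.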
I have 2 tables:
Categories ( id, name )
Sub_categories ( id, key, value, category_id )
I'm trying to get all categories whose sub_categories are all deactivated (that is, soft-deleted).
Let me explain in more detail.
I have sub_categories data like this:
[
{
"id": 1,
"category_id": 1,
"key": "sub 1",
"value": "sub_1",
"deleted_at": null
},
{
"id": 2,
"category_id": 1,
"key": "sub 2",
"value": "1",
"deleted_at": null
},
{
"id": 4,
"category_id": 1,
"key": "sub 3",
"value": "1",
"deleted_at": "2019-07-09 06:06:01"
},
{
"id": 5,
"category_id": 2,
"key": "sub 1",
"value": "33",
"deleted_at": "2019-07-09 06:06:01"
},
{
"id": 6,
"category_id": 2,
"key": "sub 2",
"value": "33",
"deleted_at": "2019-07-09 06:06:01"
}
]
I want only category_id 2 (where all sub_categories are soft-deleted).
Here's the category model code:
public function subCategory() {
    $this->makeVisible('deleted_at');
    return $this->hasMany('App\SubCategory', 'category_id', 'id');
}
$categories = Categories::doesntHave('subCategory')->get();
You first have to define the relationship between Categories and Sub_categories: https://laravel.com/docs/5.8/eloquent-relationships.
Then use a query to get what you want: https://laravel.com/docs/5.8/queries
Get IDs of categories with sub categories:
$categoryIdsWithSubCategories = SubCategory::get()->pluck('category_id')->toArray();
Get categories without sub categories:
$categoriesWithoutSubCategories = Category::whereNotIn('id', $categoryIdsWithSubCategories)->get();
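For comparison, the condition the question asks for (every sub category of a category is soft-deleted) can be sketched on in-memory data. This is plain Python rather than Eloquent, the helper name fully_deleted_category_ids is made up, and the rows are trimmed from the question's sample data.

```python
from collections import defaultdict


def fully_deleted_category_ids(sub_categories):
    """Return ids of categories whose sub categories are all soft-deleted."""
    deleted_flags = defaultdict(list)
    for sub in sub_categories:
        # A row counts as soft-deleted when deleted_at is set.
        deleted_flags[sub["category_id"]].append(sub["deleted_at"] is not None)
    return sorted(cid for cid, flags in deleted_flags.items() if all(flags))


sub_categories = [
    {"id": 1, "category_id": 1, "deleted_at": None},
    {"id": 2, "category_id": 1, "deleted_at": None},
    {"id": 4, "category_id": 1, "deleted_at": "2019-07-09 06:06:01"},
    {"id": 5, "category_id": 2, "deleted_at": "2019-07-09 06:06:01"},
    {"id": 6, "category_id": 2, "deleted_at": "2019-07-09 06:06:01"},
]

category_ids = fully_deleted_category_ids(sub_categories)
```

With this data category_ids is [2]. Categories that have no sub categories at all never appear in the input, so they are not returned by this sketch.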
I need help with a specific ES query.
I have objects in an Elasticsearch index. Here is an example of one of them (Participant):
{
"_id": null,
"ObjectID": 6008,
"EventID": null,
"IndexName": "crmws",
"version_id": 66244,
"ObjectData": {
"PARTICIPANTTYPE": "2",
"STATE": "ACTIVE",
"EXTERNALID": "01010111",
"CREATORID": 1006,
"partAttributeList":
[
{
"SYSNAME": "A",
"VALUE": "V1"
},
{
"SYSNAME": "B",
"VALUE": "V2"
},
{
"SYSNAME": "C",
"VALUE": "V2"
}
],
....
I need to find only the entities that match on their partAttributeList entries. For example, the whole Participant entity where SYSNAME=A and VALUE=V1 occur in the same partAttributeList entry.
If I use the usual matches:
{"match": {"ObjectData.partAttributeList.SYSNAME": "A"}},
{"match": {"ObjectData.partAttributeList.VALUE": "V1"}}
then of course I find more objects than I really need. Here is an example of a redundant object that can be found:
...
{
"SYSNAME": "A",
"VALUE": "X"
},
{
"SYSNAME": "B",
"VALUE": "V1"
}..
As I understand it, you are trying to search multiple fields of the same object for exact matches of a piece of text, so please try this out:
https://www.elastic.co/guide/en/elasticsearch/guide/current/multi-query-strings.html
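The difference between the two behaviours in the question can be sketched in plain Python: the two separate matches act like flat_match below, while the question asks for same_element_match. Both function names are made up for illustration. One commonly used way to get the second behaviour in Elasticsearch, which the link above does not cover, is a nested query; the query body at the end is a sketch that assumes partAttributeList has been mapped with type nested, which the question does not state.

```python
def flat_match(attrs, sysname, value):
    """Both values occur somewhere in the list, possibly in different entries."""
    return (any(a["SYSNAME"] == sysname for a in attrs)
            and any(a["VALUE"] == value for a in attrs))


def same_element_match(attrs, sysname, value):
    """Both values occur together in a single list entry."""
    return any(a["SYSNAME"] == sysname and a["VALUE"] == value for a in attrs)


wanted = [{"SYSNAME": "A", "VALUE": "V1"}, {"SYSNAME": "B", "VALUE": "V2"}]
redundant = [{"SYSNAME": "A", "VALUE": "X"}, {"SYSNAME": "B", "VALUE": "V1"}]

# Sketch of a nested query body. Assumption: the index mapping declares
# ObjectData.partAttributeList as a "nested" field.
nested_query = {
    "query": {
        "nested": {
            "path": "ObjectData.partAttributeList",
            "query": {
                "bool": {
                    "must": [
                        {"match": {"ObjectData.partAttributeList.SYSNAME": "A"}},
                        {"match": {"ObjectData.partAttributeList.VALUE": "V1"}},
                    ]
                }
            },
        }
    }
}
```

flat_match accepts both lists, including the redundant one, while same_element_match accepts only the wanted one.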
I have been working with OpenRefine for the last few days, trying to figure out how to export a Google Sheets spreadsheet to a JSON file.
I have the following data that I want to export to a JSON file.
id  first name  last name  friends first name  friends last name  family first name  family last name
1   James       Brown      Judy                Garland            Mary               Brown
                           John                Neverland          Marlene            Brown
                           Paul                Garland            Judy               Brown
2   John        Buller     Amy                 Garland            Francis            Buller
                           Peter               Flake              John               Buller
                           Jules               Peter              Judy               Buller
The JSON that I'm expecting is:
{
"results": [
{
"id": 1,
"firstName": "James",
"lastName": "Brown",
"has": {
"friends": [
{
"firstName": "Judy",
"lastName": "Garland"
},
{
"firstName": "John",
"lastName": "Neverland"
},
{
"firstName": "Paul",
"lastName": "Garland"
}
],
"family": [
{
"firstName": "Mary",
"lastName": "Brown"
},
{
"firstName": "Marlene",
"lastName": "Brown"
},
{
"firstName": "Judy",
"lastName": "Brown"
}
]
}
},
{
"id": 2,
"firstName": "John",
"lastName": "Buller",
"has": {
"friends": [
{
"firstName": "Amy",
"lastName": "Garland"
},
{
"firstName": "Peter",
"lastName": "Flake"
},
{
"firstName": "Jules",
"lastName": "Peter"
}
],
"family": [
{
"firstName": "Francis",
"lastName": "Buller"
},
{
"firstName": "John",
"lastName": "Buller"
},
{
"firstName": "Judy",
"lastName": "Buller"
}
]
}
}
]
}
So far I have tried several approaches:
1) Using excel-to-json, but it's limited to a single level of nesting and it has some restrictions on column names.
2) Using OpenRefine and the Templating tool, but I have encountered several issues:
- Although the data is detected as records in OpenRefine, you export rows and not records, so it will export 6 rows to JSON, 4 of them containing empty data.
- If I try filling down columns it will also export 6 rows to JSON, 4 of them with duplicates, thus losing the relations between the person and their family members and friends.
Any help would be much appreciated, as I'm trying to export about 150,000 records of this type that have to be in this JSON format.
OpenRefine only supports one level of nesting. You might need to go with a programming language or an ETL solution to get nested elements.
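As a rough illustration of the programming-language route, here is a minimal Python sketch that folds the rows into the nested structure from the question. It assumes the sheet has been exported so that each row is a dict keyed by the column headers (for example via csv.DictReader) and that continuation rows have an empty id; rows_to_records and the inline sample rows are made up for illustration.

```python
import json

HEADERS = ["id", "first name", "last name",
           "friends first name", "friends last name",
           "family first name", "family last name"]


def rows_to_records(rows):
    """Group continuation rows (empty id) under the preceding person."""
    records = []
    current = None
    for row in rows:
        if row["id"]:
            # A non-empty id starts a new person record.
            current = {
                "id": int(row["id"]),
                "firstName": row["first name"],
                "lastName": row["last name"],
                "has": {"friends": [], "family": []},
            }
            records.append(current)
        elif current is None:
            raise ValueError("continuation row found before any record")
        for group in ("friends", "family"):
            first = row[group + " first name"]
            last = row[group + " last name"]
            if first or last:
                current["has"][group].append(
                    {"firstName": first, "lastName": last})
    return {"results": records}


# First record from the question, written as raw row tuples.
raw = [
    ("1", "James", "Brown", "Judy", "Garland", "Mary", "Brown"),
    ("", "", "", "John", "Neverland", "Marlene", "Brown"),
    ("", "", "", "Paul", "Garland", "Judy", "Brown"),
]
rows = [dict(zip(HEADERS, values)) for values in raw]
result = rows_to_records(rows)
output = json.dumps(result, indent=2)
```

Because each row is processed once, the same loop should also cope with a file of around 150,000 records if the rows are streamed from the CSV reader.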
We currently have a problem designing a mapping in Elasticsearch that lets us query a relational structure; we have not been able to solve it with our non-document-oriented thinking from SQL.
We want to create a many-to-many relation between different Elasticsearch entries. We need this so that we can edit an entry once and keep all of its usages up to date.
To describe the problem, we'll use the following simple data model:
Broadcast      Content
------------   -----------
Id             Id
Day            Title
Contents []    Description
So we have two different types to index, broadcasts and contents.
A broadcast can have many contents and single contents could also be part of different broadcasts (e.g. repetition).
JSON like:
index/broadcasts
{
"Id": "B1",
"Day": "2014-10-15",
"Contents": [
"C1",
"C2"
]
}
{
"Id": "B2",
"Day": "2014-10-16",
"Contents": [
"C1",
"C3"
]
}
index/contents
{
"Id": "C1",
"Title": "Wake up",
"Description": "Morning show with Jeff Bridges"
}
{
"Id": "C2",
"Title": "Have a break!",
"Description": "Everything about Android"
}
{
"Id": "C3",
"Title": "Late Night Disaster",
"Description": "Comedy show"
}
Now we want to rename "Late Night Disaster" to something more precise and keep all references up to date.
How could we approach this? Are there further options in ES, like includes in RavenDB?
Nested objects and parent-child relations haven't helped us so far.
What about denormalizing? It seems difficult when you come from an SQL mindset, but give it a try: even with millions of documents, Lucene indexing can help, and renaming becomes a batch job.
[
{
"Id": "B1",
"Day": "2014-10-15",
"Contents": [
{
"Title": "Wake up",
"Description": "Morning show with Jeff Bridges"
},
{
"Title": "Have a break!",
"Description": "Everything about Android"
}
]
},
{
"Id": "B2",
"Day": "2014-10-16",
"Contents": [
{
"Title": "Wake up",
"Description": "Morning show with Jeff Bridges"
},
{
"Title": "Late Night Disaster",
"Description": "Comedy show"
}
]
}
]
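A minimal sketch of the idea in plain Python, working on in-memory dicts rather than an Elasticsearch client: denormalize embeds a copy of each content into its broadcasts, and rename_content is the batch job that touches every copy. Both function names and the new title are hypothetical. One assumption differs from the example above: the embedded copies keep the content Id so that the batch job can find them.

```python
def denormalize(broadcasts, contents):
    """Replace content ids in each broadcast with copies of the content."""
    by_id = {content["Id"]: content for content in contents}
    return [
        {**b, "Contents": [dict(by_id[cid]) for cid in b["Contents"]]}
        for b in broadcasts
    ]


def rename_content(denormalized, content_id, new_title):
    """Batch job: update the title in every embedded copy of a content."""
    changed = 0
    for broadcast in denormalized:
        for content in broadcast["Contents"]:
            if content["Id"] == content_id:
                content["Title"] = new_title
                changed += 1
    return changed


broadcasts = [
    {"Id": "B1", "Day": "2014-10-15", "Contents": ["C1", "C2"]},
    {"Id": "B2", "Day": "2014-10-16", "Contents": ["C1", "C3"]},
]
contents = [
    {"Id": "C1", "Title": "Wake up"},
    {"Id": "C2", "Title": "Have a break!"},
    {"Id": "C3", "Title": "Late Night Disaster"},
]

docs = denormalize(broadcasts, contents)
# "Late Night Show" is a made-up replacement title.
updated = rename_content(docs, "C3", "Late Night Show")
```

The trade-off is that a rename has to rewrite every broadcast that embeds the content, in exchange for reads that need no join.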
I am developing a platform with a JSON API using Python Flask. In some cases I need to join three tables. The question "How to join tables with an array of IDs" gave me some guidance, but I need a solution that goes beyond it.
Let's assume we have four tables for a messaging app.
Accounts
Conversations
Messages
Message Readers
Accounts table snippet
{
"id": "account111",
"name": "John Doe",
},
Conversations table snippet
{
"id": "conversation111",
"to": ["account111", "account222", "account333"], // accounts who are participating the conversation
"subject": "RethinkDB",
}
Messages table snippet
{
"id": "message111",
"text": "I love how RethinkDB does joins.",
"from": "account111", // accounts who is the author of the message
"conversation": "conversation111"
}
Message Readers table snippet
{
"id": "messagereader111",
"message": "message111",
"reader": "account111",
}
My question is "What's the magic query to get the document below when I receive a get request on an account document with id="account111"?"
{
"id": "account111",
"name": "John Doe",
"conversations": [ // 2) Join account table with conversations
{
"id": "conversation111",
"name": "RethinkDB",
"to": [ // 3) Join conversations table with accounts
{
"id": "account111",
"name": "John Doe",
},
{
"id": "account222",
"name": "Bobby Zoya",
},
{
"id": "account333",
"name": "Maya Bee",
},
],
"messages": [ // 4) Join conversations with messages
{
"id": "message111",
"text": "I love how RethinkDB does joins.",
"from": { // 5) Join messages with accounts
"id": "account111",
"name": "John Doe",
},
"message_readers": [
{
"name": "John Doe",
"id": "account111",
}
],
},
],
},
],
}
Any guidance or advice would be fantastic. JavaScript or Python code would be awesome.
I had a hard time understanding what you want (you have multiple documents with the id 111), but I think this is the query you are looking for.
Python query:
r.table("accounts").map(lambda account:
    account.merge({
        "conversations": r.table("conversations").filter(lambda conversation:
            conversation["to"].contains(account["id"])
        ).coerce_to("array").map(lambda conversation:
            conversation.merge({
                "to": conversation["to"].map(lambda account:
                    r.table("accounts").get(account)
                ).pluck(["id", "name"]).coerce_to("array"),
                "messages": r.table("messages").filter(lambda message:
                    message["conversation"] == conversation["id"]
                ).coerce_to("array").map(lambda message:
                    message.merge({
                        "from": r.table("accounts").get(message["from"]).pluck(["id", "name"]),
                        "readers": r.table("message_readers").filter(lambda message_reader:
                            message["id"] == message_reader["message"]
                        ).coerce_to("array").order_by(r.desc("created_on")),
                    })
                ).order_by(r.desc("created_on"))
            })
        ).order_by(r.desc("modified_on"))
    })
).order_by("id").run(db_connection)