MongoDB pagination with grouped item - performance

I am building a messaging module into an existing web app. We are storing the messages in Mongo with a data structure that looks something like:
{
  id: "",
  inResponseToMessageId: "",
  to: [],
  cc: [],
  bcc: [],
  subject: "",
  body: "",
  owners: [{id: 4, status: "read", folder: "inbox"}, {id: 1, status: "unread", folder: "inbox"}],
  dateSent:
}
The client would like us to be able to display messages in both a conversation view and a singleton view.
I am having trouble figuring out an efficient query that meets these requirements:
Return results grouped by message thread.
Work well with pagination.
Sortable by date and subject.
I can modify the data structure however I need in order to get this to work well.
Below are a few possible solutions but they all seem to fall short:
Store messages as children of a thread object.
Add a threadId to each message and then query and group by threadId.
Create some type of meta information object that helps.
The problem with the standard mongo group or $group function is that I imagine it will perform very poorly when the collection is large. We are expecting there to be hundreds of millions of messages in the collection.

Put a threadId on your messages.
Return results grouped by message thread.
You can find messages by thread like
db.messages.find({ "threadId" : id })
I don't think there's any need to group them in the sense of the $group operator.
Work well with pagination.
Pagination for the messages in what order? Pagination works well if you sort on a unique field. dateSent will usually be unique within a thread if you keep it to millisecond precision, so you can paginate on it. If collisions are possible, add _id to the sort as a tie-breaker.
// page 1
db.messages.find({ "threadId" : id }).sort({ "dateSent" : -1 }).limit(25)
// page 2
// the sort is descending, so the next page holds messages older than the last one shown
db.messages.find({ "threadId" : id, "dateSent" : { "$lt" : <25th date sent> } }).sort({ "dateSent" : -1 }).limit(25)
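Two messages can share a timestamp even at millisecond precision, so a common refinement is to page on the pair (dateSent, _id) instead of dateSent alone. Here is a minimal sketch of how the filter for the next page could be built. The helper name nextPageFilter and the shape of `last` are made up for illustration, and it assumes _id values are unique and comparable with $lt.

```javascript
// Build the find() filter for the next page of a thread, newest first.
// `last` is the final document of the previous page, or null for page 1.
// Assumption: _id is unique and can be compared with $lt.
function nextPageFilter(threadId, last) {
  const filter = { threadId: threadId };
  if (last) {
    // Match rows strictly after `last` in (dateSent desc, _id desc) order.
    filter.$or = [
      { dateSent: { $lt: last.dateSent } },
      { dateSent: last.dateSent, _id: { $lt: last._id } },
    ];
  }
  return filter;
}

// Hypothetical usage in the mongo shell:
//   db.messages.find(nextPageFilter(id, last))
//     .sort({ dateSent: -1, _id: -1 }).limit(25)
// A compound index such as { threadId: 1, dateSent: -1, _id: -1 }
// would be a natural candidate to support this query.
```

Unlike skip-based paging, this keeps each page query anchored on the index, so the cost should not grow with the page number.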
Sortable by date and subject.
Who sorts messages by subject? Anyway, this is just a matter of creating the right indexes if you want to retrieve messages in date or subject order. Depending on your requirements you might be doing this sorting for a client view, where it might not be necessary to have the database sort the results. The client could do it for the returned subset instead.

Related

Adding a custom sorting to listing with an aggregate in shopware 6

I am trying to build a custom sorting for the product listings in Shopware 6.
I want to include a foreign table (the entity is leasingPlanEntity), get the minimum of one of its fields (period_price), and then order the search result by that value.
I have already built a subscriber and tried the following, which seems to work:
public static function getSubscribedEvents(): array
{
    return [
        //ProductListingCollectFilterEvent::class => 'addFilter'
        ProductListingCriteriaEvent::class => ['addCriteria', 5000]
    ];
}
public function addCriteria(ProductListingCriteriaEvent $event): void
{
    $criteria = $event->getCriteria();
    $criteria->addAssociation('leasingPlan');
    $criteria->addAggregation(new MinAggregation('min_period_price', 'leasingPlan.periodPrice'));
    // Add the sorting.
    $availableSortings = $event->getCriteria()->getExtension('sortings') ?? new ProductSortingCollection();
    $myCustomSorting = new ProductSortingEntity();
    $myCustomSorting->setId(Uuid::randomHex());
    $myCustomSorting->setActive(true);
    $myCustomSorting->setTranslated(['label' => 'My Custom Sorting at runtime']);
    $myCustomSorting->setKey('my-custom-runtime-sort');
    $myCustomSorting->setPriority(5);
    $myCustomSorting->setFields([
        [
            'field' => 'leasingPlan.periodPrice',
            'order' => 'asc',
            'priority' => 1,
            'naturalSorting' => 0,
        ],
    ]);
    $availableSortings->add($myCustomSorting);
    $event->getCriteria()->addExtension('sortings', $availableSortings);
}
Is this already the right way to get the min(periodPrice)? Or does it just take an arbitrary value from the leasingPlan table to define the sort order?
I didn't find a way to reference the min_period_price aggregate value in the $myCustomSorting->setFields method.
Update 1
Some days later, I asked a less complex question in the shopware community on slack:
Is it possible to use the DAL to define a subquery for an association in the product-listing?
It should generate something like:
FROM
JOIN (
SELECT ... FROM ... WHERE ... GROUP BY ... ORDER BY ...
) AS ...
The answer there was:
Don't think so
Update 2
I also did an in-depth analysis of the DAL query builder, and it really does not seem possible to perform a subquery with the current version.
Update 3 - Different approach
A different approach might be to define custom fields in the main entity. Every time a change is made to the main entity, the values of these custom fields would be recalculated.
This is a lot of overhead to implement, especially when the fields you are adding depend on other data, such as the availability of a product in the store.
So check whether it is worth the extra work. It would be better to have a way to build subqueries.
Unfortunately it seems that in your case there is no easy way to achieve this, if I understand the issue correctly.
Consider the following: for each product you can have multiple leasingPlan entities, and I assume that for a given context (like a specific sales channel or listing) that still holds. This means that you would have to sort the leasingPlan entities by price, then take the one with the lowest price, and then sort the products by their lowest-price leasingPlan's price.
There seems to be no other way to achieve that, and unfortunately for you, sorting is applied at the end, even if it is sort of a subquery.
So, for example, if you have the following snippet
$criteria = $event->getCriteria();
$criteria->addAssociation('leasingPlan');
$criteria->getAssociation('leasingPlan')
    ->addSorting(new FieldSorting('price', FieldSorting::ASCENDING))
    ->setLimit(1)
;
The actual price sorting would be applied AFTER the leasingPlan entities are fetched. Essentially, only the already-fetched results would be sorted, meaning that you would not get the cheapest leasing plan per product, but simply the first one.
You can only do something like that with filters, but in this case there is nothing to filter by. I assume you don't have one leasingPlan per SalesChannel or per language, which would let you limit that list to just one entry that could be used for sorting.
That is not to mention that this could not be included in a ProductSortingEntity, but you could always work around that by plugging into the appropriate events and modifying the criteria at runtime.
I see two ways to resolve your issue:
Making another table which would store the cheapest leasingPlan per product and just using that as your association
Storing the information about the cheapest leasingPlans in e.g. cache and using that for filtering (caution: a mistake here would probably break the sorting, for example if you end up with too few or too many leasingPlans per product)
public function applyCustomSorting(ProductListingCriteriaEvent $event): void
{
    // One leasingPlan per one product
    $cheapestLeasingPlans = $this->myCustomService->getCheapestLeasingPlanIds();
    $criteria = $event->getCriteria();
    $criteria->addAssociation('leasingPlan');
    $criteria->getAssociation('leasingPlan')
        ->addSorting(new FieldSorting('price', FieldSorting::ASCENDING))
        ->addFilter(new EqualsAnyFilter('id', $cheapestLeasingPlans))
    ;
}
And then you could sort by
$criteria->addSorting(new FieldSorting('leasingPlan.periodPrice', FieldSorting::ASCENDING));
There should be no need to add the association manually and to add the aggregation to the criteria, that should happen automatically behind the scenes if your custom sorting is selected in the storefront.
For more information refer to the official docs.
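Both suggestions above (a dedicated table or a cache of the cheapest leasingPlan per product) rely on the same precomputation: reducing all leasing plans to exactly one id per product. Here is a rough sketch of that step, written in JavaScript only to keep it short; in a plugin it would live in PHP or in a SQL GROUP BY. The field names id, productId and periodPrice are assumptions, not taken from the actual entity definition.

```javascript
// Pick the id of the cheapest leasing plan for each product.
// Field names (id, productId, periodPrice) are assumed for illustration.
function cheapestPlanIds(plans) {
  const best = new Map(); // productId -> cheapest plan seen so far
  for (const plan of plans) {
    const current = best.get(plan.productId);
    if (!current || plan.periodPrice < current.periodPrice) {
      best.set(plan.productId, plan);
    }
  }
  // Exactly one id per product, which is what the filter needs.
  return Array.from(best.values(), (plan) => plan.id);
}

const plans = [
  { id: "a", productId: "p1", periodPrice: 30 },
  { id: "b", productId: "p1", periodPrice: 20 },
  { id: "c", productId: "p2", periodPrice: 50 },
];
const ids = cheapestPlanIds(plans);
```

The result would have to be recomputed whenever a leasing plan or its price changes, which is the maintenance cost mentioned in the question's third update.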

Group queries in GraphQL (not "group by")

In my app there are many entities that are exposed via GraphQL. All of those entities have resolvers, and those have many methods (I think they are called "fields" in GraphQL). Since only one Query type is allowed, I get an "endless" list of fields that belong to many different contexts, e.g.:
query {
  newsRss (...)
  newsCurrent (...)
  userById(...)
  weatherCurrent (...)
  weatherForecast(...)
  # ... many more
}
As you can see, there are still 3 different contexts here: news, users and weather. Now I can go on and prefix all fields ([contextName]FieldName), as I did in the example, but the list gets longer and longer.
Is there a way to "group" some of them together, if they relate to the same context? Like so, in case of the weather context:
query {
  weather {
    current(...)
    forecast(...)
  }
}
Thanks in advance!
If you want to group them together, you need a type that contains all the fields of the same context. Taking weather as an example, you need a type that contains the currentWeather and forecastWeather fields. Does this concept make sense in your application, so that you can name it easily and users will not find it strange? If so, you can change the schema to achieve this.
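As a rough illustration of such a grouping type, here is a sketch using the resolver-map layout that JavaScript servers such as Apollo Server accept. The type name Weather, the arguments and the returned strings are all invented for the example. The one detail that matters is that the root weather field returns a non-null placeholder object, otherwise the nested field resolvers would never run.

```javascript
// Schema (SDL) with a grouping type; all names here are illustrative.
const typeDefs = `
  type Weather {
    current(city: String!): String
    forecast(city: String!, days: Int!): [String]
  }
  type Query {
    weather: Weather
  }
`;

const resolvers = {
  Query: {
    // Return an empty placeholder so the Weather field resolvers get called.
    weather: () => ({}),
  },
  Weather: {
    current: (_parent, { city }) => `Sunny in ${city}`,
    forecast: (_parent, { city, days }) =>
      Array.from({ length: days }, (_, i) => `Day ${i + 1} in ${city}`),
  },
};
```

With a schema like this, a client could send query { weather { current(city: "Oslo") } } and the flat weatherCurrent / weatherForecast fields would no longer be needed on Query.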
On the other hand, if all fields of the same context return the same type and just filter on different things, consider defining an argument on the root query field to specify the filter condition, something like:
query {
weather(type:CURRENT){}
}
and
query {
weather(type:FORECAST){}
}
to query the current weather and forecast weather respectively.
So it is a question about how you design the schema.

readFragment to return all object of a type

I'm using Apollo Client to request a deeply nested dataset from my server. Something like:
-Show
  id
  title
  ...
  -Seasons
    number
    -Episodes
      id
      number
      airdate
Thanks to normalization my episodes are stored individually, but I cannot query them. For example, I would like to query all the episodes and then sort them by date to display what is coming next.
The only ways I see are to 'reduce' my show list to an array of episodes and then do the filtering, or to send a new query to the server.
But it would be much faster if I could get a list of all episodes from the cache.
Unfortunately, with readFragment you can only query one object by its id.
Question:
Is there a way to query the cache for all objects of a given type?
This answer is late, but it may help someone else: Apollo currently does not support this. The GitHub issue below discusses it and also contains a workaround.
https://github.com/apollographql/apollo-client/issues/4724#issuecomment-487373566
Here is the workaround, copied from @superandrew213:
const serializedState = client.cache.extract()
const typeNameItems = Object.values(serializedState)
  .filter(item => item.__typename === 'TypeName')
  .map(item => client.readFragment({
    fragmentName: 'FragmentName',
    fragment: Fragment,
    id: item.id,
  }))
Please note that this method is slow, especially if you have a large normalized cache.
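One detail worth checking when adapting the workaround: the id passed to readFragment has to be the cache ID the object is stored under, which is not necessarily the same as the object's own id field. The sketch below collects cache IDs by __typename from a snapshot. It assumes cache.extract() returns a flat object keyed by cache ID, and the keys such as "Episode:10" follow the usual Typename:id pattern but are only mock data here.

```javascript
// Collect the cache IDs of every normalized object with a given __typename.
// Assumption: `extracted` is the flat object returned by cache.extract(),
// mapping each cache ID to the fields stored for that object.
function cacheIdsOfType(extracted, typename) {
  return Object.entries(extracted)
    .filter(([, fields]) => fields && fields.__typename === typename)
    .map(([cacheId]) => cacheId);
}

// Mock snapshot for illustration only.
const snapshot = {
  "Show:1": { __typename: "Show", id: 1, title: "Example" },
  "Episode:10": { __typename: "Episode", id: 10, airdate: "2020-01-02" },
  "Episode:11": { __typename: "Episode", id: 11, airdate: "2020-01-01" },
  ROOT_QUERY: { __typename: "Query" },
};

const episodeIds = cacheIdsOfType(snapshot, "Episode");

// Sort the matching entries by airdate to find what airs first.
const byAirdate = episodeIds
  .map((cacheId) => snapshot[cacheId])
  .sort((a, b) => a.airdate.localeCompare(b.airdate));
```

Each collected cache ID could then be handed to client.readFragment as the id option. The scan still touches every cache entry, so the performance caveat above applies here too.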

Ruby: Paginate and sort across a large number of records

When simply displaying large amounts of data (over 100k records) my code works well, and I paginate on the server.
However, when I need to sort this data I'm stuck. I'm only sorting on the page, and NOT sorting on ALL the records related to this one customer.
How can I paginate but also sort across all the records of my customer and NOT simply sort the records returned from the server side pagination?
I'm also using BootStrap Table to display all my data.
Here is my code that gets all the customers:
def get_customers
  @data_to_return = []
  @currency = current_shop.country_currency
  customers = current_shop.customers.limit(records_limit).offset(records_offset)#.order("#{sort_by}" " " "#{sort_order}")
  customers.each do |customer|
    @data_to_return.push(
      state: false,
      id: customer.id,
      email: customer.email,
      accepts_marketing: customer.accepts_marketing,
      customer_status: customer.customer_status,
      tags: customer.tags)
  end
  sort_customers
end
And then this is the sort_customers method:
def sort_customers
  # This only sorts the rows of the current page, not all of the customer's records.
  fixed_data = @data_to_return.sort_by {|hsh| hsh[sort_by]}
  customer_size = current_shop.customers.length
  if sort_order == "ASC"
    fixed_data
  else
    fixed_data.reverse!
  end
  render json: {"total": customer_size, "rows": fixed_data}
end
In the above code you can see that data_to_return comes from get_customers and is limited to one page. But I don't want to return ALL the customers, for many reasons.
How can I sort across all the records, but only return the paginated subset?
You should actually sort at the model/query level, not at the ruby level.
The difference is basically:
# sort in ruby
relation.sort_by { |item| foo(item) }
# sort in database - composes with pagination
relation.order('column_name ASC/DESC')
In the first case, the relation is implicitly executed, enumerated and converted to array before calling sort_by. If you did pagination (manually or with kaminari), you will get just that page of data.
In the second case, you are composing the limit, offset and where (kaminari uses limit and offset under the hood anyway, and where is implicit when you use associations) with an order, so your database would execute
SELECT `customers`.* FROM `customers`
WHERE ...
ORDER BY ...
LIMIT ...
OFFSET ...
which will return the correct data.
A good option is to define scopes in the model, like
class Customer < ApplicationRecord
  scope :sorted_by_email, ->(ascending = true) { order("email #{ascending ? 'ASC' : 'DESC'}") }
end
# in controller
customers = current_shop.customers.
  limit(records_limit).
  offset(records_offset).
  sorted_by_email(false)
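To see why the order of the two steps matters, here is a small in-memory illustration, written in JavaScript purely as a language-neutral sketch with made-up data. Sorting after pagination only reorders the rows that were already fetched, while sorting before pagination corresponds to what ORDER BY combined with LIMIT/OFFSET does in the database.

```javascript
// Four made-up customers in insertion (id) order.
const customers = [
  { id: 1, email: "d@example.com" },
  { id: 2, email: "a@example.com" },
  { id: 3, email: "c@example.com" },
  { id: 4, email: "b@example.com" },
];

const byEmail = (a, b) => a.email.localeCompare(b.email);
const page = (rows, offset, limit) => rows.slice(offset, offset + limit);

// Paginate first, then sort: only the two fetched rows get ordered.
const pageThenSort = page(customers, 0, 2).sort(byEmail);

// Sort first, then paginate: the behaviour of ORDER BY + LIMIT/OFFSET.
const sortThenPage = page([...customers].sort(byEmail), 0, 2);

console.log(pageThenSort.map((c) => c.email));
console.log(sortThenPage.map((c) => c.email));
```

The first page of the sorted set should contain the a@ and b@ addresses, whereas the paginate-first variant can only ever show rows 1 and 2, whatever their sort position is in the full table.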
You can resolve the sorting and pagination issue with the DataTables library, which works on the client side. It is a jQuery plugin. With this approach you need to load all the data into the page; once that is done, sorting and paging work very well.
Please check the references below.
DataTables jQuery library
DataTables gem for Rails
You can try these; they work very well, and you can customise them too.
If the answer is helpful, you can accept it.

Zoho Creator: Sort sub-form records in both Main Forms and Views/Reports

Zoho Creator is a great system for quickly creating simple cloud applications. I've run into a problem with sub-forms, though: currently, Zoho Creator does not provide functionality for sorting sub-form records by a specified column. Instead, it sorts records in the order in which they were added.
My sub-form is a Creator Form that's linked to another Creator Form (basically, 2 different tables). The forms are linked with a bi-directional lookup relationship.
I've seen and tried implementing these "hacks", but none of them work for my situation:
Zoho Forums, "Subforms sorting rows"
Zoho Forums, "Hack to sort rows of a subform and pre-populate row fields that I want to preset"
I also called Zoho tech support, and after looking at my application, they said that sorting sub-form records is not currently possible.
Any other ideas?
My tested solution is still a hack, but until Zoho implements a method to sort sub-form records via the GUI, this will have to do.
First, create a function that you can call from anywhere (e.g. when a new sub-form record is added or changed)--for details on that, go here: http://www.zoho.com/creator/help/script/functions.html
This function will first duplicate the sub-form records by the parent record ID (sorting by the appropriate column) and then delete all sub-form records that were inserted before the script started:
int SubFormRecords_SortByAnything_ReturnCount(int ParentRecordID)
{
    // Remember the start time so the original rows can be told apart from the copies.
    scriptStartTime = zoho.currenttime;
    for each rSubFormRecord in SubFormRecords [ParentFieldName == input.ParentRecordID] sort by FieldName1, FieldName3, FieldName2
    {
        NewSubFormRecordID = insert into SubFormRecords
        [
            // Keep the copy linked to the same parent record.
            ParentFieldName = input.ParentRecordID
            FieldName1 = rSubFormRecord.FieldName1
            FieldName2 = rSubFormRecord.FieldName2
            FieldName3 = rSubFormRecord.FieldName3
        ];
    }
    // Delete the original, unsorted rows of this parent; the fresh copies stay.
    delete from SubFormRecords[ (ParentFieldName == input.ParentRecordID && Added_Time < scriptStartTime) ];
    return SubFormRecords[ParentFieldName == input.ParentRecordID].count();
}
Once the above sorting function is in place (customized for your application), call it when appropriate. I call it when adding a record associated with the sub-form, or when I change the sorting column values.
That works well, and as long as you don't have complex logic associated with adding and deleting records, it should have minimal impact on application performance.
Please let me know whether that works for you, and if you have any better ideas.
Caveat: This solution is not suitable for forms containing additional sub-form records because deleting the records will delete linked sub-form values.
Thanks.
I have a very simple workaround:
1) Add a Form Workflow.
2) Record Event: Create, Edit, or Create/Edit (as per your requirement).
3) Form Event: On successful form submission.
4) Let Main_Form be the link name of the main form.
5) Let Sub_Form be the link name of the sub-form (not the link name you specify in the main form for the same sub-form).
6) Let Field1 and Field2 be the sub-form fields by which you want to sort the sub-form records.
7) Let Link_ID be the lookup field in the sub-form that points to the main form's ID.
Workflow
1)Sub_Records = Sub_Form[Link_ID == input.ID] sort by Field1,Field2;
(sort by multiple fields, add asc/desc as per requirement)
2)delete from Sub_Form[Link_ID == input.ID];
3)for each sub_record in Sub_Records
{
    insert into Sub_Form
    [
        Added_User = zoho.loginuser
        Link_ID = input.ID
        Field1 = sub_record.Field1
        Field2 = sub_record.Field2
    ]
}
//Now check the results in the edit view of the main form

Resources