Performance & User Experience of fetching paginated data with paginated children - user-interface

I'm tackling an issue and would like some input on how to solve it. Basically, what I'm trying to do is the following: I have an LMS app in which teachers can create courses which contain different kinds of learning material. The course is modeled as a polymorphic tree: each node of the tree represents a type of learning content; learning content is hierarchical, hence the tree (for example, there can be a node of type Topic which has nodes of type Lesson as children).
When a student accesses a course, they are shown the top-level nodes for the tree. Some nodes, like Folder, don't expose their children and require the user to "open" the folder in the UI. Other nodes, like the above mentioned Topic, do show their children: you can think of them being rendered as a title -- the title of the topic -- in the UI, and the children nodes appear below the title.
In order to limit the size of the initial fetch response, I want to use pagination on the number of nodes returned by each response. Similarly, a node's children should also be returned paginated upon request.
I'm planning on using some form of infinite scroll in order to provide a seamless experience (think Google Classroom).
I'm looking for the optimal way to fetch and display data which balances performance and user experience. Keep in mind: the problem arises solely from the fact that some nodes are rendered with their children immediately shown; if all types of nodes required the user to "open" them, this would be trivial as I could just do the following:
when the student enters a course, fetch the top-level nodes paginated
when a node is opened, fetch its children paginated
in both views, simply fetch the following pages when the user scrolls to the bottom of the page
Here are the 2 approaches I'm torn between:
never include a node's children in the response. This is the simplest approach, especially from the backend standpoint: when a student enters a course, the frontend fetches the top-level nodes paginated, and further pages are fetched as the user scrolls down. If a fetched node is one of those that render their children immediately (e.g. a Topic node), the frontend makes a separate call to get its (paginated) children. The downsides are more requests to the server and a potentially bad user experience if the node has a lot of children, as they will keep appearing and pushing down the content below the parent node (I can show a skeleton to mitigate this)
include the paginated children in the representation of a node. The upside is that if the node contains few children, the whole node and its children can be rendered right away without issues. If there are more children than the page size, the frontend will perform subsequent requests to obtain the other pages. This is much more complex to handle as there is nested pagination
Is there a clear winner here? Are there other better approaches I could take?
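To make the second approach concrete, here is a minimal sketch of what a nested-pagination response could look like. Everything in it is an assumption for illustration: the in-memory `TREE`, the `INLINE_TYPES` set, the `page_of_children` function, and the use of a plain offset as the cursor are all hypothetical and not part of any real API.

```python
# Hypothetical in-memory course tree: node id -> (node type, ordered child ids).
TREE = {
    "root": ("Course", ["t1", "f1", "l9"]),
    "t1": ("Topic", ["l1", "l2", "l3"]),
    "f1": ("Folder", ["l4", "l5"]),
    "l1": ("Lesson", []),
    "l2": ("Lesson", []),
    "l3": ("Lesson", []),
    "l4": ("Lesson", []),
    "l5": ("Lesson", []),
    "l9": ("Lesson", []),
}

# Node types whose children are rendered inline (assumption for this sketch).
INLINE_TYPES = {"Topic"}


def page_of_children(parent_id, cursor=0, size=2):
    """Return one page of a node's children plus the cursor of the next page.

    The cursor is a simple offset here; `None` means there are no more pages.
    """
    _, child_ids = TREE[parent_id]
    chunk = child_ids[cursor:cursor + size]
    next_cursor = cursor + size if cursor + size < len(child_ids) else None

    items = []
    for child_id in chunk:
        node_type, _ = TREE[child_id]
        item = {"id": child_id, "type": node_type}
        # Embed only the FIRST page of children, and only for inline types.
        if node_type in INLINE_TYPES:
            item["children"] = page_of_children(child_id, 0, size)
        items.append(item)
    return {"items": items, "next": next_cursor}
```

With this shape, the same function serves both the top-level list and any node's children, so the frontend can reuse one "load next page" routine: it calls `page_of_children(parent_id, cursor)` with whatever `next` value it last received for that parent, whether the parent is the course or an inline Topic.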

Related

How to save persistent data structure into elastic search

I have a relational DB that stores a persistent segment tree structure, which means the relationships between parents and children are always many-to-many. The height of this tree is about 4 to 5 levels.
Persistent segment tree reference: https://www.geeksforgeeks.org/persistent-segment-tree-set-1-introduction/
Because it is a segment tree, node data won't change much; when it does change, a new node is generated.
My question is which way I should save my data into Elasticsearch.
My current thought:
nested would build too many documents: any change in a leaf node generates a new root, and the new root carries a lot of redundant data because most nodes are unchanged; it also makes node versioning hard to trace
parent-child mode looks like a good fit; however, each node may have multiple parents and children
this many-to-many relationship is almost everywhere in my DB, so denormalizing it seems impossible. The only solution I can think of is maintaining the children and parent IDs at each level, but that is hard to maintain, and joins might make it slow
Please enlighten me with your ideas.

Request part (not all) of a JSON or XML object?

I'm planning a D3.js application that will display a network graph (i.e. display nodes and edges--not a line plot or barchart, etc.). Only some nodes and edges need to be displayed at any given moment, and the attributes of nodes and edges will change, too--all in response to user interaction. So far, so good--I know that d3.js can do this sort of thing, as illustrated by the force-collapsible example and the health and wealth of nations example. It would be simplest to keep all of the data in a single JSON or XML object.
I'm worried that if my application loads all of the data needed for all parts of the network at any time, I'll overwhelm the user's system. A typical network will have 35000 nodes, with attributes that vary at up to 5000 timesteps. (This is about 4GB in a GEXF format XML file with unnecessary whitespace removed.)
Is there a way to request only part of a JSON or XML object, i.e. only those parts of the tree that I need at a given time? Or will I have to do something more complicated? Any pointers to options to investigate will be appreciated.
(This might be a FAQ, but it's one of those things that's difficult to search for.)

Efficient view updating with functional data model

In functional programming, data models are immutable, and updating a data model is done by applying a function on the data model, and getting a new version of the data model in return. I'm wondering how people write efficient viewers/editors for such data models, though (more specifically in Clojure)
A simplified example: suppose that you want to implement a viewer for a huge tree. In the non-functional world, you could have a controller for the Tree, with a function updateNode(Node, Value), which could then notify all observers to tell them that a specific node in the tree has been updated. On the viewer side, you would put all the nodes in a TreeView widget, keep a mapping of Node->WidgetNode, and when you are notified that a Node has changed, you can update just the one corresponding NodeWidget in the tree that needs updating.
The solution described in another Clojure MVC question talks about keeping the model in a ref, and adding a watcher. While this would indeed allow you to get notified of a change in the model, you still wouldn't know which node was updated, and would have to traverse the whole tree, correct?
The best thing I can come up with off the top of my head requires you, in the worst case, to update all the nodes on the path from the root to the changed node (as all these nodes will be different).
What is the standard solution for updating views on immutable data models?
I'm not sure how this is a problem that's unique to functional programming. If you kept all of your state in a singly rooted mutable object graph with a notify when it changed, the same issue would exist.
To get around this, you could simply store the current state of the model, and some information about what changed in the last edit. You could even keep a history of these things to allow for easy undo/redo because Clojure's persistent data structures make that extremely efficient with their shared underlying state.
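As a rough illustration of why shared underlying state helps here, the sketch below imitates path copying with plain Python dicts (treated as immutable by convention; Clojure's persistent maps do this natively). The helper names `assoc_in` and `changed_paths` are made up for this example. Because unchanged subtrees are shared between versions, a viewer can compare the old and new model by identity and descend only into the parts that differ.

```python
def assoc_in(tree, path, value):
    """Return a new tree with `value` at `path`, copying only nodes on that path."""
    if not path:
        return value
    key, rest = path[0], path[1:]
    new_tree = dict(tree)  # shallow copy: sibling subtrees are shared, not copied
    new_tree[key] = assoc_in(tree[key], rest, value)
    return new_tree


def changed_paths(old, new, prefix=()):
    """Yield the paths of values that differ, skipping subtrees shared by identity."""
    if old is new:
        return  # same object in both versions: nothing below here changed
    if isinstance(old, dict) and isinstance(new, dict):
        for key in old.keys() | new.keys():
            yield from changed_paths(old.get(key), new.get(key), prefix + (key,))
    elif old != new:
        yield prefix


if __name__ == "__main__":
    model = {"a": {"x": 1, "y": 2}, "b": {"z": 3}}
    new_model = assoc_in(model, ("a", "x"), 10)
    # Keeping (new_model, path) together is the "what changed" record for one edit.
    print(list(changed_paths(model, new_model)))
```

The diff only visits the nodes on the path from the root to the change, which matches the worst case described in the question; storing the edited path alongside each version avoids even that walk.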
That's just one thought on how to attack it. I'm sure there are many more.
I also think it's worth asking, "How efficient does it need to be?" The answer is, "just efficient enough for the circumstances." It might be that a plain map of data will work because you don't really have all that much data to deal with in a given application.

Suitable tree data structure

Which is the most suitable tree data structure to model hierarchical (containment relationship) content? My language is a bit informal, as I don't have much theoretical background on this.
Parent node can have multiple children.
Unique parent
Tree structure is rarely changed, so it is OK to recreate it rather than add/rearrange nodes.
Two way traversal
mainly interested in, find parent, find children, find a node with a unique id
Every node has a unique id
There might be only hundreds of nodes in total, so performance may not be big influence
Persistence may be good to have, but not necessary as I plan to use it in memory after reading the data from DB.
My language of choice is go (golang), so available libraries are limited. Please give a recommendation without considering the language which best fit the above requirement.
http://godashboard.appspot.com/ lists some of the available tree libraries. Not sure about the quality and how active they are. I read good things about
https://github.com/petar/GoLLRB
http://www.stathat.com/src/treap
Please let me know if any additional information is required.
Since there are only hundreds of nodes, just make the structure the same as you have described.
Each node has a unique reference to parent node.
Each node has a list of child node.
Each node has an id
There is an (external) map from id --> node. It may not even be necessary.
Two-way traversal is possible, since parent and child nodes are known. The same goes for find parent and find children.
Find by id can be done by traversing the whole tree if there is no map. Or you can use the map to quickly find the node.
Add node is easy, since there is a list in each node. Rearrange is also easy, since you can just freely add/remove from list of child nodes and reassign the parent node.
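The structure described above can be sketched in a few lines. This is a minimal illustration, not a recommendation of a specific library; the names `Node`, `add_child`, and `build_index` are invented for the example, and it does no cycle checking.

```python
class Node:
    """A tree node with a unique id, one parent reference, and a list of children."""

    def __init__(self, node_id, parent=None):
        self.id = node_id
        self.parent = None
        self.children = []
        if parent is not None:
            parent.add_child(self)

    def add_child(self, child):
        """Attach `child` here, detaching it from any previous parent first."""
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        self.children.append(child)


def build_index(root):
    """Walk the tree once and return the external map from id to node."""
    index = {}
    stack = [root]
    while stack:
        node = stack.pop()
        index[node.id] = node
        stack.extend(node.children)
    return index
```

Finding a parent is `node.parent`, finding children is `node.children`, and finding a node by id is a lookup in the dict returned by `build_index`. Since the tree rarely changes, rebuilding the index after a change is simpler than keeping it in sync.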
I'm answering this question from a language-agnostic perspective. This is a tree with no limit on the number of children per node, so ready-made implementations are not that common.
I think a B-Tree is the way to go for your requirements. http://en.wikipedia.org/wiki/B-tree
Points 1, 2, 3: a B-Tree inherently supports these (multiple children, unique parent, insertion/deletion of elements).
Points 4, 5: in the default implementation, each node has pointers to its children. Additionally, you can maintain a pointer to the parent in each node. You can implement your search/traversal operations with BFS/DFS with the help of these pointers.
Point 6: depends on the implementation of your insert method, if you don't allow duplicate records.
Points 7, 8: not an issue, as you have mentioned that you have only hundreds of records. B-Trees are also quite a good data structure for external disk storage, though.

What are pro/cons of push/pull data flow models?

I've been developing an in-house DSP application (Java w/ hooks for Groovy/Jython/JRuby, plugins via OSGi, plenty of JNI to go around) in data flow/diagram style, similar to Pure Data and Simulink. My current design is a push model. The user interacts with some source component, causing it to push data onto the next component and so on until an end block (typically a display or file writer). There are some unique challenges with this design, specifically when a component starves for input: there is no easy way to request more input. I have mitigated some of this with feedback control flow, e.g. an FFT block can broadcast that it needs more data to the source block of its chain. I've contemplated adding support for components to be either push/pull/both.
I'm looking for responses regarding the merits of push vs pull vs both/hybrid. Have you done this before? What are some of the "gotchas"? How did you handle them? Is there a better solution for this problem?
Some experience with a "mostly-pull" approach in a large-scale product:
Model: Nodes build a 1:N tree, i.e. each component (except the root) has 1 parent and 1..N children. Data flows almost exclusively from parent to children. Change notifications can originate from any node in the tree.
Implementation: All leafs are notified with the sending node's id and a "generation" counter. Leafs know which node path they depend on, so they know if they need to update. (Any other child node update algorithm would do, too, and might have been better in hindsight).
Leafs query their parent for current data, query bubbles up recursively. The generation counter is included, so the bubble-up stops at the originating node.
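One possible reading of this mechanism is sketched below. It is an interpretation, not the product's actual code: the class `FlowNode`, its methods, and the per-node cache are all assumptions made for illustration. A node that publishes a change records the new generation; a notified leaf pulls with that generation, and each stale ancestor asks its own parent until a node that is already current for that generation answers from its cache.

```python
class FlowNode:
    """Hypothetical pull-model node that caches data tagged with a generation."""

    def __init__(self, name, parent=None, transform=None):
        self.name = name
        self.parent = parent
        self.transform = transform or (lambda data: data)
        self.cached = None
        self.generation = -1  # generation of the cached data
        self.pulls = 0        # how often this node was queried (for illustration)

    def publish(self, data, generation):
        """Record new data originating at this node."""
        self.cached = data
        self.generation = generation

    def depends_on(self, sender):
        """True if `sender` lies on the path from this node up to the root."""
        node = self.parent
        while node is not None:
            if node is sender:
                return True
            node = node.parent
        return False

    def notify(self, sender, generation):
        """Change notification: pull only if the sender is on our path."""
        if not self.depends_on(sender):
            return None
        return self.pull(generation)

    def pull(self, generation):
        """Return data current as of `generation`, querying upward only if stale."""
        self.pulls += 1
        if self.parent is None or self.generation >= generation:
            return self.cached  # bubble-up stops here
        self.cached = self.transform(self.parent.pull(generation))
        self.generation = generation
        return self.cached
```

In this sketch a leaf that does not depend on the sending node ignores the notification entirely, so it causes no queries, and nodes above the originating node are never asked.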
Advantages:
parent nodes don't need much/any information about their children. Data can be consumed by anyone - this allowed a generic approach to implementing some (initially not expected) non-UI functionality on top of the data intended for display
Child nodes can aggregate and delay updates (avoiding repaints sure beats fast painting)
inactive leafs cause no data traffic at all
Disadvantages:
Incremental updates are expensive, as full data is published.
The implementation actually allows for different data packets to be requested (and the generation counter could prevent unnecessary data traffic), but the data packets initially designed are very large. Slicing them was an afterthought, but works OK.
You need a really good generation mechanism. The one initially implemented collided with initial updates (that need special handling - see "incremental updates") and aggregation of updates
the need for data travelling up the tree was greatly underestimated.
publish is cheap only when the node offers read-only access to current data. This might require additional update synchronization, though
sometimes you want intermediate nodes to update, even when all leafs are inactive
some leafs ended up implementing polling, some base nodes ended up relying on that. ugly.
Generally:
Data-Pull "feels" more native to me when the data and processing layer should know nothing about the UI. However, it requires a complex change notification mechanism to avoid "Updating the universe".
Data-Push simplifies incremental updates, but only if the sender intimately knows the receiver.
I have no experience of similar scale using other models, so I can't really make a recommendation. Looking back, I see that I've mostly used pull, which was less of a hassle. It would be interesting to see other people's experiences.
I work on a pure-pull image processing library. It's more geared to batch-style operations where we don't have to deal with dynamic inputs and for that it seems to work very well. Pull works especially well for large data sets and for threading: we scale linearly to at least 32 CPUs (depending on the graph being evaluated, of course, heh).
We have a GUI that allows leaves to be dynamic data sources (for example, a video camera delivering frames) and they are handled by throwing away and rebuilding the relevant parts of the graph on a change. This is cheap in our case, so the overhead isn't that high.

Resources