dataloader facebook optimization - graphql

Hi everybody!
I'm trying to use Facebook's DataLoader in my GraphQL project.
I've run into the following problem. When I query my database by ids, for example: select * from books where books.author in (4,5,6,7), I get an Error: "function did not return a Promise of an Array of the same length as the Array of keys". This happens because a single id such as 4 can match more than one book.
Does anybody know how to fix it?

Dataloader is expecting you to return an array of the same length as the input to your loader. So, if the loader gets [4,5,6,7] as an input, it will need to return an array with a length of 4. Also keep in mind that the results returned from the loader need to be in the same order as the input ids. This may or may not be something you have to worry about depending on how the data is returned from your database.

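You should return one array per id, i.e. an array of arrays. Convert the SQL result (a flat list in which the same author id can appear several times) into arrays of records grouped by id, keeping the same number and order of entries as the input ids. An id with no matching rows should map to an empty array.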
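The grouping step described above could look roughly like this. It is a minimal sketch in plain JavaScript: the helper name groupByKeys, the keyField parameter, and the sample rows are made up for illustration and are not part of DataLoader.

```javascript
// Groups a flat list of rows into one array per requested key,
// preserving the order and length of `keys`.
// `keyField` is the (assumed) name of the column holding the foreign key.
function groupByKeys(keys, rows, keyField) {
  // Start with an empty bucket for every requested key,
  // so keys without matches still produce an (empty) array.
  const groups = new Map(keys.map((key) => [key, []]));
  for (const row of rows) {
    const bucket = groups.get(row[keyField]);
    if (bucket) bucket.push(row);
  }
  return keys.map((key) => groups.get(key));
}

// Hypothetical rows, as they might come back from
// select * from books where books.author in (4,5,6,7)
const rows = [
  { id: 1, author: 5, title: 'B' },
  { id: 2, author: 4, title: 'A' },
  { id: 3, author: 4, title: 'C' },
];

const grouped = groupByKeys([4, 5, 6, 7], rows, 'author');
console.log(grouped.map((books) => books.length)); // two books for 4, one for 5, none for 6 and 7
```

Inside a batch function this might be used as new DataLoader(async (authorIds) => groupByKeys(authorIds, await queryBooks(authorIds), 'author')), where queryBooks is a hypothetical function that runs the SQL query. Note that the lookup is type-sensitive: if the database returns ids as strings while the loader keys are numbers, they need to be normalized first.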

Related

Sorting Issue After Table Render in Laravel DataTables as a Service Implementation

I have implemented Laravel DataTables as a service.
The first two columns are the actual id and name, so I am able to sort them asc/desc after the table renders.
But the next few columns are rendered after performing some calculations, i.e. their values are not fetched directly from a column but computed.
I am unable to sort the columns whose values are calculated, and I get this error. I know it is looking for a column with that name, e.g. outstanding_amount, which I don't have in the DB; it is an amount calculated from two or more columns that live in other tables.
Any suggestions on how to overcome this issue?
It looks like you're trying to sort by values that aren't columns, but calculated values.
So the main issue here is to give Eloquent/MySQL the data it needs to do the sorting.
// You might need to do some joins first
->addSelect(DB::raw('your_calc as outstanding_amount'))
->orderBy('outstanding_amount') // asc can be omitted as this is the default
// Alternative: if you don't need the calculated value in the result, order by the raw expression
// Don't forget any joins you might need
->orderByRaw('your_calc_for_outstanding_amount ASC')
For SQL functions it works as follows:
->addSelect(DB::raw('COUNT(products.id) as product_count'))
->orderByRaw('COUNT(products.id) DESC')

RethinkDB: count unique rows within grouped data

I'm trying to count all unique rows within grouped data, i.e, how many unique rows exist within each group.
Although groupedData.distinct().count() works for relatively small amounts of rows, running it on ~200k rows, such as in my case, ends with "over size limit".
I understand why it happens, yet I can't come up with a more efficient way of doing it. Is there one?
In my experience, count is expensive in RethinkDB, especially a count that requires iterating over the whole data set. I struggled with this for a while myself.
To my understanding, when you pass the grouped data to distinct, it creates an array, so it is subject to the 100,000-element array limit.
To solve this, I think we have to use a stream and count the stream instead. We cannot use group because, again to my understanding, it returns a group of streams, in other words an array of streams.
So here is how I solved it:
Create an index on the field I want to group by.
Call distinct on that table with the index.
Map over the stream, passing each value to a count with getAll, using the index.
An example query:
r.table('t').distinct({index: 'index_name'})
  .map(function(value) {
    return {group: value, total: r.table('t').getAll(value, {index: 'index_name'}).count()}
  })
With this, everything is a stream, and we can lazily iterate over the result set to get the count of each group.

couchdb - retrieve unique documents for a view that emits non-unique two array keys

I have a map function in a CouchDB view that emits non-unique two-element array keys for documents of type message, e.g.
The first position in the array key is a user_id, the second position represents whether or not the user has read the message.
This works nicely in that I can set include_docs=true and retrieve the actual documents. However, I'm retrieving duplicate documents in that case, as you can see above in the view results. I need to be able to write a view that can be queried to return unique messages that have been read by a given user. Additionally, I need to be able to efficiently paginate the resultset.
Notice in the image above that [66, true] is emitted twice for doc id 26a9a271de3aac494d37b17334aaf7f3. As far as I can tell, with the keys in my map function, I cannot reduce in such a way that unique documents are returned.
The next idea I had was to also emit doc._id in the map function and reduce with group_level=exact, the result being:
Now I am able to get unique document ids, but I cannot get the documents without doing a second query. And even with a second query, paginating this way would require a lot of complexity (at least I think so).
The last idea I came up with is to emit the entire document rather than doc._id in the third position of the array key; then I can access the entire document and likely paginate. This seems really brutish.
So my question is:
Is #3 above a terrible idea? Is there something I'm missing? Is there a better approach?
Thanks in advance.
See @WickedGrey's comment on the question. The solution is to ensure that I never emit the same key twice for one document. I do this in the map function by keeping track of the keys in an array as I emit them, then skipping the emit if the key is already in the array.
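A map function along those lines might look like the sketch below. The original map function is not shown in the question, so the document shape is an assumption: doc.recipients and its user_id and read fields are hypothetical and would need to be adapted to the real schema. The emit function defined at the top is only a stand-in so the sketch runs outside CouchDB, where emit is provided by the view server.

```javascript
// Stand-in for CouchDB's built-in emit(), so the sketch runs outside CouchDB.
var emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// Map function: emits [user_id, read] at most once per document.
// `doc.recipients`, `user_id` and `read` are hypothetical field names.
function map(doc) {
  if (doc.type !== 'message') return;
  var seen = []; // serialized keys already emitted for this document
  (doc.recipients || []).forEach(function (recipient) {
    var key = [recipient.user_id, recipient.read];
    var serialized = JSON.stringify(key);
    if (seen.indexOf(serialized) !== -1) return; // skip duplicate key
    seen.push(serialized);
    emit(key, null);
  });
}

// Example document in which [66, true] would otherwise be emitted twice.
map({
  _id: '26a9a271de3aac494d37b17334aaf7f3',
  type: 'message',
  recipients: [
    { user_id: 66, read: true },
    { user_id: 66, read: true },
    { user_id: 67, read: false },
  ],
});
console.log(emitted.length); // 2 rows instead of 3
```

Because each key now appears at most once per document, querying the view with include_docs=true should return each message only once for a given [user_id, read] key, and no reduce step is needed.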

Comment System using Redis Database System

I am trying to build a comment system using Redis. I am currently using hashes to store the comment data, but the problem I am facing is that after 10 or 12 comments, the comments lose their order and start appearing randomly. Does anyone know which data type should be used for building a commenting system with Redis? Currently my hashes are of the form:
postid:comments commentid:userid "Testcomment"
Thanks, Any help will be appreciated.
Hashes are set up for quick access by key rather than retrieval in order. If you need items in a particular order, try a list or sorted set.
The reason it appears to work at first is an optimization for small hashes: when you only have a small number of items, a list is the most efficient structure, so that is what Redis uses internally. When you get more items, an actual hashmap is needed for efficient querying, and Redis rearranges the data so that it is ordered by hash rather than by insertion order.
With my web app, I am using a format like this.
(appname):(postid):(comment id) - The hash of the posts
(appname):(postid):count - The latest comment id
I then query the (appname):(postid):count key to get the number of times I should run a loop that fetches the contents of each (appname):(postid):(comment id) hash.
Example Code
$c = $redis->get('(appname):(postid):count');
for ($i = 0; $i < $c; $i++) {
    var_dump($redis->hgetall('(appname):(postid):'.$i));
}

Efficient data structure to get an ID

I need an efficient data structure to generate IDs. The IDs should be able to be released using a method in the data structure. After an ID was released it can be generated again. The data structure must always retrieve the lowest unused ID.
What efficient data structure can be used for this?
Can't you just increment an integer and return that, with appropriate concurrency control? If someone releases an integer, store it in a separate sorted data structure. If that collection of released integers is empty, generating an ID is as simple as read, increment, write, return. If it is not empty, just read, remove, and return the smallest released integer.
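One way to sketch this idea is a counter plus a binary min-heap of released IDs. The class name IdAllocator, the method names, and the choice to start at 0 are assumptions made for illustration. The sketch is single-threaded, so the concurrency control mentioned above is left out, and it does not guard against releasing the same ID twice or releasing an ID that was never handed out.

```javascript
class IdAllocator {
  constructor() {
    this.next = 0;      // smallest ID that has never been handed out
    this.released = []; // binary min-heap of released IDs
  }

  // Returns the lowest unused ID. Every released ID is smaller than
  // `next`, so the heap minimum wins whenever the heap is non-empty.
  acquire() {
    const heap = this.released;
    if (heap.length === 0) return this.next++;
    const smallest = heap[0];
    const last = heap.pop();
    if (heap.length > 0) {
      heap[0] = last;
      let i = 0;
      for (;;) { // sift the moved element down to restore heap order
        const left = 2 * i + 1;
        const right = left + 1;
        let min = i;
        if (left < heap.length && heap[left] < heap[min]) min = left;
        if (right < heap.length && heap[right] < heap[min]) min = right;
        if (min === i) break;
        [heap[i], heap[min]] = [heap[min], heap[i]];
        i = min;
      }
    }
    return smallest;
  }

  // Makes `id` available again.
  release(id) {
    const heap = this.released;
    heap.push(id);
    let i = heap.length - 1;
    while (i > 0) { // sift the new element up to restore heap order
      const parent = (i - 1) >> 1;
      if (heap[parent] <= heap[i]) break;
      [heap[i], heap[parent]] = [heap[parent], heap[i]];
      i = parent;
    }
  }
}

const ids = new IdAllocator();
const first = [ids.acquire(), ids.acquire(), ids.acquire()]; // 0, 1, 2
ids.release(1);
ids.release(0);
const reused = [ids.acquire(), ids.acquire(), ids.acquire()]; // 0, 1, 3
console.log(first, reused);
```

With this layout, acquire and release both take O(log n) time in the number of currently released IDs, and acquire is O(1) when nothing has been released.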
