I use the Room library to access a database. I need to select a lot of data and want to track the progress of the loading process. Previously I could track progress while iterating over a cursor. With Room's @Query I can only get the whole set of selected data at once. Is there any way to track the progress of select queries using Room? Below is an example query:
@Query("select * from large_table")
fun getALotOfData(): LiveData<Array<OneObject>>
My app uses AppSync resolvers to fetch data from DDB and return it to our front-end. One table we have is for Notifications. A Notification can be either pending or default (non-pending). The table itself has a primary key of notification_id and we have a GSI called userIndex to grab the notifications for a user, with a sort key of timestamp.
In the app, I show all notifications in a list, pending first and then default. Given that a user may have many notifications, I'd like to implement pagination to fetch a batch at a time. The only way I've been able to do this is to
Change the query to include an isPending parameter, which I use as a filter expression so the query returns only notifications that are isPending or isNotPending.
Store two "nextTokens", one for each isPending and isNotPending, along with corresponding lists.
Make separate queries for pending/non-pending, and use the filter to return to the appropriate list.
This is obviously inefficient, and I am re-reading data from DynamoDB. My question is: given my DynamoDB table and requirements, is there a way to paginate so that I get all the pending notifications first (sorted by timestamp) and then all the default notifications (sorted by timestamp), using one query and one nextToken?
I've seen the use of @model and @key, but I haven't been able to make it work in my app.
Thanks!
No, not really. There is a hard limit on the size of what a DynamoDB query returns, and that cannot be bypassed. The only way to make use of nextToken is another query.
However, it is also worth noting that the FilterExpression is applied after the items have already been read by the query. It does not reduce the items read from the table, only what is returned to you. So the next token is still going to be (relatively) the same for each query. You can instead filter the results yourself after each call, before the next pagination query, and save yourself some duplicate calls.
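The idea of filtering each page yourself, rather than running separate filtered queries, could be sketched roughly as below. This is only an illustration under assumptions, not AppSync or DynamoDB code: `fetch_page` is a hypothetical stand-in for one unfiltered query against the userIndex GSI, and the `is_pending` attribute name is assumed.

```python
from typing import Callable, Dict, List, Optional, Tuple

# One page of results: (items, next_token). next_token is None on the last page.
Page = Tuple[List[Dict], Optional[str]]


def load_notifications(fetch_page: Callable[[Optional[str]], Page]):
    """Walk the pages with a single token and split the items locally."""
    pending, default = [], []
    token = None
    while True:
        items, token = fetch_page(token)
        for item in items:
            # "is_pending" is an assumed attribute name.
            (pending if item["is_pending"] else default).append(item)
        if token is None:
            break
    return pending, default


# Hypothetical in-memory data, already sorted by timestamp like the GSI would be.
_ROWS = [
    {"id": "n1", "timestamp": 1, "is_pending": False},
    {"id": "n2", "timestamp": 2, "is_pending": True},
    {"id": "n3", "timestamp": 3, "is_pending": False},
    {"id": "n4", "timestamp": 4, "is_pending": True},
    {"id": "n5", "timestamp": 5, "is_pending": True},
]


def fake_fetch_page(token: Optional[str]) -> Page:
    """Stand-in for the real query: returns two items per page."""
    start = int(token) if token else 0
    end = start + 2
    next_token = str(end) if end < len(_ROWS) else None
    return _ROWS[start:end], next_token


pending, default = load_notifications(fake_fetch_page)
```

Each item is read once and lands in the right list, and both lists keep the timestamp order of the index. The loop here drains every page; a real client would more likely fetch one page per user scroll and keep the token between calls.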
When building the query and type graph structure in a GraphQL API, where would you put highly contextual queries that only apply to the viewer?
On the top-level (query.friendRequests)
This would remove noise in the User entity and only keep queries in there that are queryable for all users. Not just the viewing user.
It would add many more top-level queries, with the risk that they become specialized for specific things, which doesn't really fit the thinking-in-graphs and model-data-around-business-logic ideas.
On the viewer entity (query.viewer.friendRequests)
From a data perspective, it makes more sense to put it underneath the viewer entity (which is a User type). Friend requests always belong to a parent object, which is always a user.
Other Examples
Dashboard widgets
User notifications
Action items / TODO items / Task lists
Messages
Counters and badges
What are your thoughts on this? What would be a good practice to follow for viewer-contextual queries that don't apply to other user entities in an API implementation?
We have always put it under a specific field in Query. We started with a me query that returned a user, but this did not turn out to be very practical: the user type got very big, and most fields did not need the whole user object, only the user's ID. In your example we would have run two queries:
SELECT * FROM account WHERE id = $id
SELECT * FROM friend_request WHERE account_id = $id
Unless we also queried a trivial field on the me query, the first query was completely wasted.
Then we got inspired a bit by this thread, and especially by this answer from Lee Byron:
Viewer is what we used everywhere at FB, so it’s stuck with me. Also, a Viewer is not a User, it’s an Auth session - which references a User. So there’s a useful distinction of terms.
Now we have a viewer query that returns a Viewer object. This object then has a user field to query the actual user object. This also might or might not help solve the problem around private and public fields on your user object.
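A rough sketch of why this helps, assuming plain Python resolvers rather than any particular GraphQL library: the Viewer holds only the user's ID, so viewer.friendRequests never triggers the account query. The in-memory tables, the query log, and the helper names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical in-memory stand-ins for the account and friend_request tables.
ACCOUNTS = {1: {"id": 1, "name": "Ada"}}
FRIEND_REQUESTS = [
    {"account_id": 1, "from_id": 2},
    {"account_id": 3, "from_id": 1},
]
QUERY_LOG = []  # records which "table" each lookup touched


def load_account(account_id):
    QUERY_LOG.append("account")
    return ACCOUNTS[account_id]


def load_friend_requests(account_id):
    QUERY_LOG.append("friend_request")
    return [r for r in FRIEND_REQUESTS if r["account_id"] == account_id]


@dataclass
class Viewer:
    """Represents the auth session; it only references the user by ID."""
    user_id: int

    def user(self):
        # The account row is read only when the client asks for viewer.user.
        return load_account(self.user_id)

    def friend_requests(self):
        # Needs only the ID, so no account query is issued.
        return load_friend_requests(self.user_id)


viewer = Viewer(user_id=1)
requests = viewer.friend_requests()
```

After this runs, QUERY_LOG contains only the friend_request lookup; the account lookup happens only if viewer.user() is called.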
Is it possible, using an ADO.NET typed DataSet containing two tables in a parent/child relationship, to populate the DataSet with ONE trip to the d/b (query could return one or two tables; if one, then result set has columns from both tables, right?), and to update the d/b with ONE trip to the d/b (call to generated stored proc, I guess).
By "is it possible", I mean is it possible to have Visual Studio (2012) automagically generate the classes and SQL code to make this happen?
Or am I kind of on my own? It's looking an awful lot like VS really wants to generate one d/b server round trip for each table involved.
*I guess the update stored proc would have to take table-typed parameters from both parent and child, and perform inserts/updates/deletes appropriately.
Yes, one round trip per table is the way to go.
(- It's certainly possible to use a join query to populate a datatable but VS will then be reluctant to generate update etc SQL. This may or may not be a problem, depending on what you intend to do with the dataset.)
But if you have two tables in a dataset, let's say customers and orders, then you would typically use two queries, and two trips to the db:
SELECT * FROM customers WHERE customers.customerid=@customerid
and
SELECT * FROM orders WHERE orders.customerid=@customerid
Somewhat more counter-intuitive is the situation where you want all customers and orders for one country:
SELECT * FROM customers WHERE customers.countryid=@countryid
and
SELECT orders.* FROM orders INNER JOIN customers ON customers.customerid=orders.customerid WHERE customers.countryid=@countryid
Note how the join query returns data from only one table, but uses the join to identify which rows to return.
Then, once you have the data in your dataset, you can navigate it using the GetParentRow and GetChildRows methods. This is how ADO.NET manages hierarchical data.
You do need this one-table-at-a-time approach because, assuming you have foreign key constraints in your db, inserts and updates must run in the reverse table order from deletes: parents before children when inserting, children before parents when deleting.
EDIT Yes, this does mean that in some circumstances, depending on the data you want and the structure of your primary keys, you could end up with a humungous set of JOINS that still only pull the data from the table at the end of the hierarchy. This might seem wrong in terms of traditional SQL, but actually it's fine. The time you have lost in the multiple, more complex queries is saved by the reduced amount of data you have to pull back across the wire, compared with one big join query that would be returning multiple copies of the parent data.
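The customers/orders pattern can be tried out with a small sketch. It uses Python's sqlite3 module and an in-memory database rather than ADO.NET, and the table contents are made up; only the two-query shape follows the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customerid INTEGER PRIMARY KEY,
                            name TEXT,
                            countryid INTEGER);
    CREATE TABLE orders (orderid INTEGER PRIMARY KEY,
                         customerid INTEGER REFERENCES customers(customerid),
                         total REAL);
    INSERT INTO customers VALUES (1, 'Ann', 10), (2, 'Bob', 10), (3, 'Cy', 20);
    INSERT INTO orders VALUES (100, 1, 5.0), (101, 1, 7.5),
                              (102, 2, 3.0), (103, 3, 9.0);
""")

countryid = 10

# Trip 1: the parent rows for the country.
customers = conn.execute(
    "SELECT * FROM customers WHERE countryid = ?", (countryid,)
).fetchall()

# Trip 2: child rows only. The join decides which orders qualify,
# but no customer columns are sent back a second time.
orders = conn.execute(
    """SELECT orders.* FROM orders
       INNER JOIN customers ON customers.customerid = orders.customerid
       WHERE customers.countryid = ?
       ORDER BY orders.orderid""",
    (countryid,),
).fetchall()

# Group children by parent key, loosely what navigating child rows gives you.
orders_by_customer = {}
for orderid, customerid, total in orders:
    orders_by_customer.setdefault(customerid, []).append(orderid)
```

Each customer row crosses the wire once, whereas a single joined SELECT * would repeat Ann's columns for each of her orders.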
We use CRM 4.0 at our institution and have no plans to upgrade at present, as we've spent the last year and a half customising and extending the CRM to work with our processes.
A tiny part of our model is a simple hierarchy: we have a group of learning rooms that has a one-to-many relationship with another entity that describes the courses available for that learning room.
Another entity has a list of all potential and enrolled students who have expressed an interest in whichever course.
That bit's all straightforward and works pretty well and is modelled into 3 custom entities.
Now, we've got an Admin application that reads the rooms and then wants to show the courses for that room, but only where there are enrolled students.
In SQL this is simplified to:
SELECT DISTINCT r.CourseName, r.OtherInformation
FROM Rooms r
INNER JOIN Students S
ON S.CourseId = r.CourseId
WHERE r.RoomId = @RoomId
And this indeed is very close to the eventual SQL that CRM generates.
We use a Crm QueryEntity, a Filter and a LinkEntity to represent this same structure.
The problem now is that CRM normalizes a customized entity into a Base table, which holds the standard CRM entity data that all entities share, and an ExtensionBase table, which holds our customisations. To give flattened access to this, it creates a view that merges both tables.
This view is what is used by the Generated SQL.
Now the base tables have indices but the view doesn't.
All we want to do is return Courses where the inner join is satisfied; it's enough to prove there are entries, and CRM makes it SELECT DISTINCT, so we only get one item back for Room.
At first this worked perfectly well, but now that we have thousands of queries, it takes well over 30 seconds and of course causes a timeout in anything but SMS.
I'm given to believe that we can create and alter indices on tables in CRM and that this is not considered an unsupported modification; but what about views?
I know that if we alter an entity then its views are recreated, which would of course make us redo our indices when this happens.
Is there any way to hint to CRM 4.0 that we want a specific index in place?
Another source recommends that where you get problems like this, then it's best to bring data closer together, but this isn't something I'd feel comfortable in trying to engineer into our solution.
I had considered putting a new entity in that only has RoomId, CourseId and Enrolment Count in to it, but that smacks of being incredibly hacky too; After all, an index would resolve the need to duplicate this data and have some kind of trigger that updates the data after every student operation.
Lastly, whilst I know we're stuck on CRM 4 at the moment, is this the kind of thing that we could expect to have resolved in CRM 2011? It would certainly add more weight to the argument for upgrading this 5-year-old product.
Since views are "dynamic" (conceptually, their contents are generated on-the-fly from the base tables every time they are used), they typically can't be indexed. However, SQL Server does support something called an "indexed view". You need to create a unique clustered index on the view, and the query analyzer should be able to use it to speed up your join.
Someone asked a similar question here and I see no conclusive answer. The concerns cited by Microsoft are referential integrity (a non-issue here) and upgrade complications. You mention the unsupported option of adding the index to the view and managing it over upgrades and entity changes. That is an option; unsupported and hackish as it is, it should work.
FetchXml does have aggregation, but the query execution plan still uses the views. Here is the SQL generated from a simple select count from incident:
'select
top 5000 COUNT(*) as "rowcount"
, MAX("__AggLimitExceededFlag__") as "__AggregateLimitExceeded__" from (select top 50001 case when ROW_NUMBER() over(order by (SELECT 1)) > 50000 then 1 else 0 end as "__AggLimitExceededFlag__" from Incident as "incident0" ...
I don't see a supported solution for your problem.
If you are building an outside admin app and you are hosting CRM 4 on-premise you could go directly to the database for your query bypassing the CRM API. Not supported but would allow you to solve the problem.
I'm going to add this as a potential answer, although I don't believe it's a sustainable or indeed valid long-term solution.
After analysing the indexes that CRM had defined automatically, I realised that selecting more information in my query would be enough to fulfil the column requirements of an index, and now the query runs in less than a second.
I am using an IList<Employee>, populated using LINQ, that holds more than 5000 records. Is there a better way to do this? empdetailsList has 5000 items.
Example :
foreach (Employee emp in empdetailsList)
{
    // One call, and potentially one database round trip, per employee.
    Employee employee = Details.GetFeeDetails(emp.Emplid);
}
The above example takes a lot of time, because it iterates over each item in empdetailsList and fetches the corresponding fees list for each one.
Can anybody suggest what to do?
Linq to SQL/Linq to Entities use a deferred execution pattern. As soon as you call For Each or anything else that indirectly calls GetEnumerator, that's when your query gets translated into SQL and performed against the database.
The trick is to make sure your query is completely and correctly defined before that happens. Use Where(...), and the other Linq filters to reduce as much as possible the amount of data the query will retrieve. These filters are built into a single query before the database is called.
Linq to SQL/Linq to Entities also both use Lazy Loading. This is where if you have related entities (like Sales Order --> has many Sales Order Lines --> has 1 Product), the query will not return them unless it knows it needs to. If you did something like this:
Dim orders = entities.SalesOrders
For Each o In orders
    For Each ol In o.SalesOrderLines
        Console.WriteLine(ol.Product.Name)
    Next
Next
You will get awful performance, because at the time of calling GetEnumerator (the start of the For Each), the query engine doesn't know you need the related entities, so "saves time" by ignoring them. If you observe the database activity, you'll then see hundreds/thousands of database roundtrips as each related entity is then retrieved 1 at a time.
To avoid this problem, if you know you'll need related entities, use the Include() method in Entity Framework. If you've got it right, when you profile the database activity you should only see a single query being made, and every item being retrieved by that query should be used for something by your application.
If the call to Details.GetFeeDetails(emp.Emplid); involves another round-trip of some sort, then that's the issue. I would suggest altering your query in this case to return fee details with the original IList<Employee> query.
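The difference between one round trip per employee and a single combined query can be illustrated with a small sketch. It uses Python's sqlite3 module rather than LINQ; the employee and fee tables, their columns, and the `run` helper that counts round trips are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emplid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fee (emplid INTEGER, amount REAL);
""")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [(i, f"emp{i}") for i in range(1, 101)])
conn.executemany("INSERT INTO fee VALUES (?, ?)",
                 [(i, 10.0 * i) for i in range(1, 101)])

query_count = 0


def run(sql, params=()):
    """Execute a query and count it as one round trip (hypothetical helper)."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# Per-item pattern: one query for the list, then one more per employee.
slow = {}
for (emplid,) in run("SELECT emplid FROM employee"):
    slow[emplid] = run("SELECT amount FROM fee WHERE emplid = ?", (emplid,))
n_plus_one_count = query_count

# Combined pattern: employees and their fees come back from a single query.
query_count = 0
fast = {}
for emplid, amount in run(
        "SELECT e.emplid, f.amount FROM employee e "
        "JOIN fee f ON f.emplid = e.emplid"):
    fast.setdefault(emplid, []).append((amount,))
batched_count = query_count
```

With 100 employees the first pattern issues 101 queries and the second issues 1, while both produce the same fee data. With 5000 employees and a real network between application and database, that gap is where the time goes.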