Return subset of Redis Values matching specific property? - performance

Let's say I have this kind of model:
public class MyModel
{
    public long ID { get; set; }
    public long ParentModelID { get; set; }
    public long ReferenceID1 { get; set; }
    public long ReferenceID2 { get; set; }
}
There are more attributes, but for example's sake this is enough. There are around 5,000 to 10,000 rows of this model, currently stored in a Redis Set.
Is there an efficient way in Redis to query only a subset of the whole data set? For example, in LINQ I can do:
allModels.Where(m => m.ParentModelID == my_id);
or
allModels.Where(m => m.ReferenceID1 == my_referenceid);
Basically, I want to search the dataset without first returning all of it and running the LINQ queries client-side, because fetching 10,000 rows to keep only 100 is inefficient.

You can use an OHM (Object-Hash Mapper, like an ORM) in your favorite language to achieve the LINQ-like behavior. There are quite a few listed under the "Higher level libraries and tools" section of the [Redis Clients page](https://redis.io/clients).
Alternatively, you can implement it yourself using the patterns described at https://redis.io/topics/indexes.

You can't use something like LINQ in Redis out of the box. Redis is just a key-value store, so it doesn't have the same principles or luxuries as a relational database. It doesn't have queries or relations, so something like LINQ just doesn't translate at all.
As a workaround, you could segment your data using different keys. Each key could reference a set that stores values with a specific range of reference Ids. That way you wouldn't need to retrieve all 10,000 items.
I would also recommend looking at hashes. Depending on your use case, they might be more appropriate than a set, as they're better at storing complex data objects.
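The key layout both answers hint at can be sketched without a Redis connection. The class below is a hypothetical in-memory stand-in, not a Redis client: `IndexedStore`, the `idx:parent:` and `idx:ref1:` key names, and the method names are all assumptions made for illustration. Each dictionary entry plays the role of one Redis key, and the comments name the Redis command that would do the same job.

```csharp
using System.Collections.Generic;
using System.Linq;

public class MyModel
{
    public long ID { get; set; }
    public long ParentModelID { get; set; }
    public long ReferenceID1 { get; set; }
    public long ReferenceID2 { get; set; }
}

// In-memory model of the layout: one record per ID plus one set of IDs per indexed value.
public class IndexedStore
{
    // Stands in for one hash per model, e.g. key "model:{ID}" (HSET / HGETALL).
    private readonly Dictionary<long, MyModel> byId = new Dictionary<long, MyModel>();

    // Stands in for one set per indexed value, e.g. key "idx:parent:{ParentModelID}" (SADD / SMEMBERS).
    private readonly Dictionary<string, HashSet<long>> index = new Dictionary<string, HashSet<long>>();

    public void Save(MyModel m)
    {
        byId[m.ID] = m;
        AddToIndex("idx:parent:" + m.ParentModelID, m.ID);
        AddToIndex("idx:ref1:" + m.ReferenceID1, m.ID);
    }

    public List<MyModel> FindByParent(long parentId)
    {
        return Lookup("idx:parent:" + parentId);
    }

    public List<MyModel> FindByReference1(long referenceId)
    {
        return Lookup("idx:ref1:" + referenceId);
    }

    private void AddToIndex(string key, long id)
    {
        HashSet<long> ids;
        if (!index.TryGetValue(key, out ids))
        {
            ids = new HashSet<long>();
            index[key] = ids;
        }
        ids.Add(id);
    }

    // Reads only the IDs in one index set, then fetches just those records.
    private List<MyModel> Lookup(string key)
    {
        HashSet<long> ids;
        if (!index.TryGetValue(key, out ids))
        {
            return new List<MyModel>();
        }
        return ids.Select(id => byId[id]).ToList();
    }
}
```

With this layout, `FindByParent(my_id)` touches one index set and the matching records rather than all 10,000 rows. The sketch does not remove stale index entries when an indexed property changes; a real implementation would have to handle that on update and delete.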

Related

How can I delete all records from a table?

I've been searching for an answer on how to delete ALL records from a table using LINQ method syntax but all answers do it based on an attribute.
I want to delete every single record from the database.
The table looks like so:
public class Inventory
{
    public int InventoryId { get; set; }
    public string InventoryName { get; set; }
}
I'm not looking to delete records based on a specific name or id.
I want to delete ALL records.
LINQ method syntax isn't a must, but I do prefer it since it's easier to read.
To delete all data from a DB table I recommend using SQL:
TRUNCATE TABLE <tableName>
Linq is not meant to change the source. There are no LINQ methods to delete or update any element from your input.
The only way to change your input is to select the data (or its identifiers) that you want to delete into some collection, and then delete the items one by one in a foreach. Your interface to the source collection may already have a DeleteRange; in that case you don't need the foreach.
Alas you didn't mention what your table was: Is it a System.Data.DataTable? Or maybe an Entity Framework DbSet<...>? Any other commonly used class that represents a Table?
If your table class is a System.Data.DataTable, or implements ICollection, it should have a Clear method.
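As a minimal sketch of the DataTable case, assuming the Inventory table from the question is held in memory (the class name `InventoryTableDemo`, its methods, and the sample rows are made up for illustration):

```csharp
using System.Data;

public static class InventoryTableDemo
{
    // Builds an in-memory table shaped like the Inventory class in the question.
    public static DataTable BuildSample()
    {
        var table = new DataTable("Inventory");
        table.Columns.Add("InventoryId", typeof(int));
        table.Columns.Add("InventoryName", typeof(string));
        table.Rows.Add(1, "Bolts");
        table.Rows.Add(2, "Nuts");
        return table;
    }

    // Clear() removes every row but keeps the column definitions.
    public static int RowCountAfterClear()
    {
        DataTable table = BuildSample();
        table.Clear();
        return table.Rows.Count;
    }
}
```

Note that this only empties the in-memory table; nothing is deleted in a database unless the change is pushed back by some other means.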
If your table is an Entity Framework DbSet<...>, then it depends on your provider (the database management system that you use) whether you can use Clear. Usually you need to do the following:
using (var dbContext = new MyDbContext(...))
{
    List<...> itemsToDelete = dbContext.MyTable.Where(...).ToList();
    dbContext.MyTable.RemoveRange(itemsToDelete);
    dbContext.SaveChanges();
}

How can I speedup IEnumerable<T> database access

I am using EF6 code first to execute a query that pulls a large amount of data for parallelized GPU processing. The linq query returns an IEnumerable.
IEnumerable<DatabaseObject> results = ( from items in _myContext.DbSet
select items).Include("Table1").Include("Table2");
Now, I need to perform some statistical analysis on the complete set of data, and present the result to the user.
Unfortunately, because of the sheer size of the returned data, just doing a
results.ToList() is taking an extremely long time to complete... and I haven't even begun the parallelized processing of the data as yet!
Is there anything that I can do to make this more efficient other than reducing the amount of data being pulled? Reducing it is not an option, since the complete set of data needs to be processed.
EDIT 1
My current code first is as follows:
public class Orders
{
    [Key]
    public virtual DateTime ServerTimeId { get; set; }
    public string Seller { get; set; }
    public decimal Price { get; set; }
    public decimal Volume { get; set; }
    public List<Table1> Tables1 { get; set; }
    public List<Table2> Tables2 { get; set; }
}
Although my query speeds up significantly without .Include, if I do not use .Include("Tables1").Include("Tables2") these fields are null in the final result for this query:
var result = (from items in _context.DbOrders
              select items).Include("Tables1").Include("Tables2");
In my DbContext, I have defined:
public DbSet<Orders> DbOrders { get; set; }
If there is a way to force EF6 to populate these tables without the use of .Include, then I'd be very pleased if someone could instruct me.
You can load the main table, DbOrders and the child tables separately into the context:
_myContext.Configuration.ProxyCreationEnabled = false;
_myContext.DbOrders.Load();
_myContext.Table1.Load();
_myContext.Table2.Load();
Now the context is fully charged with the data you need. I hope you won't run into an out of memory exception (because then the whole approach collapses).
Entity Framework executes relationship fixup, which means that it populates the navigation properties of the loaded orders (Tables1 and Tables2 in the question's model).
Proxy creation is disabled for two reasons:
The materialized objects will be as lightweight as possible.
Lazy loading is disabled; otherwise it would be triggered when you access a navigation property.
Now you can continue working with the data by accessing the Local collection:
from entity in _myContext.DbOrders.Local
...
You can further try to speed up the process by unmapping all database fields that you don't need. This makes the SQL result sets smaller and the materialized objects even lighter. To achieve that, you may have to create a dedicated context.

Linq Pivot Query on Column with Unknown Values

Say I have the following class:
public class Sightings
{
public string CommonName { get; set; }
public string ScientificName { get; set; }
public string TimePeriod { get; set; }
public bool Seen { get; set; }
}
I would like to create a pivot query on TimePeriod. If I knew what the values in TimePeriod were ahead of time I would know how to write the query. However, I don't know. They could be years (e.g., 2007, 2008, 2009 etc.) or they could be months (e.g., Jan-2004, Feb-2004, March-2004) or they could be quarters (e.g., Q1-2004, Q2-2005, Q3-2004 etc.). It really depends on how the user wants to view the data.
Is it possible to write a Linq query that can do this, when the values in the pivot column are not known?
I've considered writing the values to an Excel sheet, creating a pivot sheet and then reading the values out of the sheet. I may have to do that if a Linq query won't work.
Do you know the most granular values that the user can choose?
If for example it's months, then you can simply write a query for months (see Is it possible to Pivot data using LINQ?).
If the user then chooses years, it should be simple to aggregate those values to years.
If they want quarters, then you can aggregate to quarters and so on.
It may not be the best or most generic solution, but it should get the job done quite easily.
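If the pivot can be done in memory, the column values do not have to be known up front, because they can be read from the data itself. The following is a sketch under that assumption, using LINQ to Objects; `SightingPivot`, `Columns` and `Rows` are hypothetical names, and each cell is taken to mean "seen at least once in that period".

```csharp
using System.Collections.Generic;
using System.Linq;

public class Sightings
{
    public string CommonName { get; set; }
    public string ScientificName { get; set; }
    public string TimePeriod { get; set; }
    public bool Seen { get; set; }
}

public static class SightingPivot
{
    // Column headers: whatever distinct TimePeriod values the data contains.
    public static List<string> Columns(IEnumerable<Sightings> data)
    {
        return data.Select(s => s.TimePeriod).Distinct().OrderBy(p => p).ToList();
    }

    // One row per CommonName; each row maps a period to "seen at least once".
    public static Dictionary<string, Dictionary<string, bool>> Rows(IEnumerable<Sightings> data)
    {
        return data
            .GroupBy(s => s.CommonName)
            .ToDictionary(
                species => species.Key,
                species => species
                    .GroupBy(s => s.TimePeriod)
                    .ToDictionary(period => period.Key, period => period.Any(s => s.Seen)));
    }
}
```

A species with no record for a given period simply has no entry in its row, so a caller rendering the grid would use `TryGetValue` per column and treat a miss as an empty cell.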

How to use a Dictionary or Hashtable for LINQ query performance underneath an OData service

I am very new to OData (only started on it yesterday) so please excuse me if this question is too dumb :-)
I have built a test project as a Proof of Concept for migrating our current web services to OData. For this test project, I am using Reflection Providers to expose POCO classes via OData. These POCO classes come from in-memory cache. Below is the code so far:
public class DataSource
{
    public IQueryable<Category> CategoryList
    {
        get
        {
            List<Category> categoryList = GetCategoryListFromCache();
            return categoryList.AsQueryable();
        }
    }

    // below method is only required to allow navigation
    // from Category to Product via OData urls
    // eg: OData.svc/CategoryList(1)/ProductList(2) and so on
    public IQueryable<Category> ProductList
    {
        get
        {
            return null;
        }
    }
}

[DataServiceKeyAttribute("CategoryId")]
public class Category
{
    public int CategoryId { get; set; }
    public string CategoryName { get; set; }
    public List<Product> ProductList { get; set; }
}

[DataServiceKeyAttribute("ProductId")]
public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
}
To the best of my knowledge, OData is going to use LINQ behind the scenes to query these in-memory objects (a List in this case) if somebody navigates to OData.svc/CategoryList(1)/ProductList(2) and so on.
Here is the problem though: In the real world scenario, I am looking at over 18 million records inside the cache representing over 24 different entities.
The current production web services make very good use of .NET Dictionary and Hashtable collections to ensure very fast look ups and to avoid a lot of looping. So to get to a Product having ProductID 2 under Category having CategoryID 1, the current web services just do 2 look ups, ie: first one to locate the Category and the second one to locate the Product inside the Category. Something like a btree.
I wanted to know how I could follow a similar architecture with OData, where I could tell OData and LINQ to use Dictionary or Hashtable collections for locating records rather than looping over a generic List.
Is that possible using Reflection Providers, or am I left with no choice but to write my own custom provider for OData?
Thanks in advance.
You will need to process expression trees, so you will need at least a partial IQueryable implementation over the underlying LINQ to Objects. You don't need a full-blown custom provider for this, though; just return your IQueryable from the properties on the context class.
In that IQueryable you would have to recognize filters on the "key" properties (.Where(p => p.ProductID == 2)) and translate them into a dictionary/hashtable lookup. Then you can use LINQ to Objects to process the rest of the query.
But if the client issues a query with filter which doesn't touch the key property, it will end up doing a full scan. Although, your custom IQueryable could detect that and fail such query if you choose so.
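The recognition step can be sketched on its own, without the surrounding IQueryable plumbing. The code below is an illustration under stated assumptions, not an OData provider: `KeyFilter`, `TryGetKey` and `Find` are hypothetical names, and only the exact shape `p => p.ProductId == <int literal>` is recognized. Anything else, including a comparison against a captured variable, falls back to a full scan.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
}

public static class KeyFilter
{
    // Returns true (and the key) only for predicates shaped like
    // p => p.<keyProperty> == <int constant>.
    public static bool TryGetKey<T>(Expression<Func<T, bool>> predicate, string keyProperty, out int key)
    {
        key = 0;
        var body = predicate.Body as BinaryExpression;
        if (body == null || body.NodeType != ExpressionType.Equal)
        {
            return false;
        }
        var member = body.Left as MemberExpression;
        var constant = body.Right as ConstantExpression;
        if (member == null || constant == null)
        {
            return false;
        }
        if (member.Member.Name != keyProperty || !(constant.Value is int))
        {
            return false;
        }
        key = (int)constant.Value;
        return true;
    }

    // Uses the dictionary when the filter is a key lookup, otherwise scans.
    public static IEnumerable<Product> Find(
        Dictionary<int, Product> index,
        Expression<Func<Product, bool>> predicate)
    {
        int key;
        if (TryGetKey(predicate, "ProductId", out key))
        {
            Product hit;
            return index.TryGetValue(key, out hit) ? new[] { hit } : new Product[0];
        }
        // Not a key filter: LINQ to Objects handles it with a full scan.
        return index.Values.Where(predicate.Compile());
    }
}
```

A stricter variant could throw in the fallback branch instead of scanning, which matches the option of failing queries that do not touch the key property.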

Linq dynamic queries for user search screens

I have a database that has a user search screen that is "dynamic": I can add search criteria on the fly, based on the columns available in the particular view the search is based on, and the user can use them immediately. Previously I had been using nettiers for this database, but now I am programming a new application against it using RIA, Entity Framework 4 and LINQ.
I currently have 2 tables that are used for this, one that fills the combobox with the available search string patterns:
LastName
LastName, FirstName
Phone
etc....
Then I have another table that splits those criteria out and is used in my nettiers algorithms. It works well, but I want to use LINQ, and it doesn't fit this model very well. Besides, I think I can pare it down to just one table with LINQ,
using a format similar to this or something very close:
ID  Criteria  WhereClause
1   LastName  Lastname Like '%{0}%'
Now, I know this won't fit directly into a LINQ query, but I am using a universal syntax for clarity here.
the real where clause would look something like this: a=>a.LastName.Contains("{0}")
My first question is: Is that even possible to do? Feed a lambda in to a string and use it in a Linq Query?
My second question is: at one point when I was researching this before I found a linq syntax that had a prefix like it.LastName{0}
and I appear to have tried using it because vestiges of it are still in my test databases, but I don't recall where I read about it.
Is anyone doing this? I have done some searches and found similar occurrences, but they mostly have static fields that are optional, which is not exactly the way I am doing it.
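As for your first question, you can do this using Dynamic LINQ, as described by Scott Gu. Dynamic LINQ parses C#-like expressions rather than SQL, so a LIKE 'test%' filter is written with StartsWith:
var query = Northwind.Products.Where("Lastname.StartsWith(@0)", "test");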
I'm not sure how detailed your dynamic query needs to be, but when I need to do dynamic queries, I create a class to represent filter values. Then I pass that class to a search method on my repository. If the value for a field is null then the query ignores it. If it has a value it adds the appropriate filter.
public class CustomerSearchCriteria
{
    public string LastName { get; set; }
    public string FirstName { get; set; }
    public string Phone { get; set; }
}

public IEnumerable<Customer> Search(CustomerSearchCriteria criteria)
{
    // Start from the full set; each non-null criterion narrows the query.
    var q = db.Customers();
    if (criteria.FirstName != null)
    {
        q = q.Where(c => c.FirstName.Contains(criteria.FirstName));
    }
    if (criteria.LastName != null)
    {
        q = q.Where(c => c.LastName.Contains(criteria.LastName));
    }
    if (criteria.Phone != null)
    {
        q = q.Where(c => c.Phone.Contains(criteria.Phone));
    }
    return q.AsEnumerable();
}
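If the criteria table stored only a property name per row instead of a where-clause string, a predicate could also be built with the expression API. This is a sketch of that idea rather than anything from the answer above: `CriteriaPredicate` and its `Contains` method are hypothetical names, and it only covers string properties matched with Contains.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public class Customer
{
    public string LastName { get; set; }
    public string FirstName { get; set; }
    public string Phone { get; set; }
}

public static class CriteriaPredicate
{
    // Builds a => a.<propertyName>.Contains(value) for a string property.
    // Expression.Property throws ArgumentException if the property does not exist.
    public static Expression<Func<T, bool>> Contains<T>(string propertyName, string value)
    {
        ParameterExpression parameter = Expression.Parameter(typeof(T), "a");
        MemberExpression property = Expression.Property(parameter, propertyName);
        var containsMethod = typeof(string).GetMethod("Contains", new[] { typeof(string) });
        MethodCallExpression body = Expression.Call(property, containsMethod, Expression.Constant(value));
        return Expression.Lambda<Func<T, bool>>(body, parameter);
    }
}
```

The result can be passed to `Where` on an `IQueryable<Customer>`, for example `customers.Where(CriteriaPredicate.Contains<Customer>("LastName", "smi"))`, so the property name can come from a table row while the user's input stays a separate value.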
