How can I create multiple Lists with a lambda expression?

I have a User class with an age property.
In my method I have a List<User>. How can I split it into multiple lists for further use, like this:
List<User> lt6Users = new ArrayList<User>();
List<User> gt6Users = new ArrayList<User>();
for(User user:users){
if(user.getAge()<6){
lt6Users.add(user);
}
if(user.getAge()>6){
gt6Users.add(user);
}
// more condition
}
I only know two ways to do this with lambda expressions:
lt6Users = users.stream().filter(user->user.getAge()<6).collect(Collectors.toList());
gt6Users = users.stream().filter(user->user.getAge()>6).collect(Collectors.toList());
The code above performs poorly because it loops through the list once for every condition.
users.stream().forEach(user->{
if(user.getAge()<6){
lt6Users.add(user);
}
if(user.getAge()>6){
gt6Users.add(user);
}
});
The code above looks just like the original loop without lambda expressions. Is there another way to write this using lambda features such as filter and Predicate?

You can use Collectors.partitioningBy(Predicate<? super T> predicate) :
Map<Boolean, List<User>> partition = users.stream()
.collect(Collectors.partitioningBy(user->user.getAge()<6));
partition.get(true) will give you the list of Users with ages < 6, and partition.get(false) will give you the list of the Users with ages >= 6.
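As a minimal, self-contained sketch of the approach above: the User class here is only a stand-in for the one in the question (just getAge() is assumed), and the AgeGroup enum and classify helper are hypothetical names introduced for illustration. partitioningBy covers the two-way split; if there are more than two conditions, Collectors.groupingBy with a classifier function produces one list per group, still in a single pass.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionUsersExample {

    // Stand-in for the User class from the question; only getAge() is assumed.
    static class User {
        private final int age;
        User(int age) { this.age = age; }
        int getAge() { return age; }
    }

    // Hypothetical grouping used to illustrate more than two conditions.
    enum AgeGroup { UNDER_6, EXACTLY_6, OVER_6 }

    static AgeGroup classify(User user) {
        if (user.getAge() < 6) return AgeGroup.UNDER_6;
        if (user.getAge() > 6) return AgeGroup.OVER_6;
        return AgeGroup.EXACTLY_6;
    }

    // Two-way split in one pass: true -> age < 6, false -> age >= 6.
    static Map<Boolean, List<User>> splitAtSix(List<User> users) {
        return users.stream()
                .collect(Collectors.partitioningBy(user -> user.getAge() < 6));
    }

    // N-way split in one pass: one list per AgeGroup that actually occurs.
    static Map<AgeGroup, List<User>> groupByAge(List<User> users) {
        return users.stream()
                .collect(Collectors.groupingBy(PartitionUsersExample::classify));
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(new User(3), new User(6), new User(9), new User(5));
        System.out.println("under 6: " + splitAtSix(users).get(true).size());
        System.out.println("groups: " + groupByAge(users).keySet());
    }
}
```

Note that partitioningBy always returns both the true and the false key, while groupingBy only contains keys for groups that have at least one member, so getOrDefault is the safer way to read a group that may be empty.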

I've found a way to use lambda expression for this problem:
Write a method with Predicate:
public void addUser(List<User> users,User user,Predicate<User> p){
if(p.test(user)){
users.add(user);
}
}
So the loop can be written like this:
users.forEach(user->{
addUser(lt6Users,user,(User u)->u.getAge()<6);
addUser(gt6Users,user,(User u)->u.getAge()>6);
// more condition
});
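The helper above could be taken one step further by treating the conditions as data instead of repeating the call for each list. The sketch below is one possible shape and not part of the original answer: the User class is a stand-in, and splitBy, ageConditions and the bucket names "lt6" and "gt6" are hypothetical. It walks the list once and adds each user to every bucket whose predicate matches, mirroring the independent if statements in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class PredicateBucketsExample {

    // Stand-in for the User class from the question; only getAge() is assumed.
    static class User {
        private final int age;
        User(int age) { this.age = age; }
        int getAge() { return age; }
    }

    // Hypothetical named conditions; more entries can be added without touching the loop.
    static Map<String, Predicate<User>> ageConditions() {
        Map<String, Predicate<User>> conditions = new LinkedHashMap<>();
        conditions.put("lt6", user -> user.getAge() < 6);
        conditions.put("gt6", user -> user.getAge() > 6);
        return conditions;
    }

    // Walks the list once and adds each item to every bucket whose predicate matches.
    static <T> Map<String, List<T>> splitBy(List<T> items, Map<String, Predicate<T>> conditions) {
        Map<String, List<T>> buckets = new LinkedHashMap<>();
        for (String name : conditions.keySet()) {
            buckets.put(name, new ArrayList<>());
        }
        for (T item : items) {
            conditions.forEach((name, condition) -> {
                if (condition.test(item)) {
                    buckets.get(name).add(item);
                }
            });
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(new User(3), new User(6), new User(9));
        Map<String, List<User>> buckets = splitBy(users, ageConditions());
        buckets.forEach((name, matched) -> System.out.println(name + ": " + matched.size()));
    }
}
```

Unlike groupingBy, this allows an item to land in several buckets or in none, which matters when the conditions overlap or leave gaps (a user aged exactly 6 matches neither condition here).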

Related

Don't know how to treat that as a predicate

I'm trying to run a custom query in my repository, but I'm getting an InvalidDataAccessResourceUsageException: "Don't know how to treat that as a predicate String("n.id = '1234'")".
public void myMethod() {
myRepository.queryUsingCustomFilters("n.id = '1234'");
}
public interface MyRepository extends Neo4jRepository<MyObject, String> {
@Query("MATCH (n) WHERE {filter} RETURN n")
List<MyObject> queryUsingCustomFilters(@Param("filter") String filter);
}
I have a simple example for now, but the string I'm passing in the future could be a little bit more complicated, such as "n.id = '1234' AND (n.name = 'one name' OR n.name = 'another name')"
I don't believe you can pass entire clauses/predicates/queries as a @Param.
If you want to build queries at run time, you might want to look at composing it using the lower level Neo4j OGM filters (see https://neo4j.com/docs/ogm-manual/current/reference/#reference:filters)
So in the case you describe above, you could simply add Filters as required and chain them together to build your WHERE clause

Using spring jdbc template to query for list of parameters

I'm new to Spring's JDBC template and am wondering whether I can pass a list of parameters and execute a query once for each parameter in the list. In the many examples I've seen, the list of parameters is used for a single execution of the query with all the parameters provided. Instead, I am trying to execute the query multiple times, each time using the next parameter in the list.
For example:
Let's say I have a List of Ids - params (Strings)
List<String> params = new ArrayList<String>();
params.add("1234");
params.add("2345");
trying to do something like:
getJdbcTemplate().query(sql, params, new CustomResultSetExtractor());
which I know as per documentation is not allowed. I mean for one it has to be an array. I've seen simple examples where query is something like "select * from employee where id = ?" and they are passing new Object[]{"1234"} into method. And I'm trying to avoid the IN() condition. In my case each id will return multiple rows which is why I'm using ResultSetExtractor.
I know one option would be to iterate over list and include each id in list as a parameter, something like:
for(String id : params){
getJdbcTemplate().query(sql, new Object[]{id}, new CustomResultSetExtractor());
}
I just want to know if I can do this some other way. Sorry, I should mention that I am trying to do a SELECT. Originally I was hoping to return a List of custom objects for each result set.
You do need to pass an array of params for the API, but you may also assume that your first param is an array. I believe this should work:
String sql = "select * from employee where id in (:ids)"; // or should there be '?'
getJdbcTemplate().query(sql, new Object[]{params}, new CustomResultSetExtractor());
Or you could explicitly specify that the parameter is an array:
getJdbcTemplate().query(sql, new Object[]{params}, new int[]{java.sql.Types.ARRAY}, new CustomResultSetExtractor());
You can use a PreparedStatement and run it as a batch job:
E.g., from http://docs.spring.io/spring/docs/current/spring-framework-reference/html/jdbc.html
public int[] batchUpdate(final List<Actor> actors) {
int[] updateCounts = jdbcTemplate.batchUpdate("update t_actor set first_name = ?, " +
"last_name = ? where id = ?",
new BatchPreparedStatementSetter() {
public void setValues(PreparedStatement ps, int i) throws SQLException {
ps.setString(1, actors.get(i).getFirstName());
ps.setString(2, actors.get(i).getLastName());
ps.setLong(3, actors.get(i).getId().longValue());
}
public int getBatchSize() {
return actors.size();
}
});
return updateCounts;
}
I know you don't want to use the IN clause, but I think it's the best solution for your problem.
If you use a for loop in this way, I think it's not optimal:
for(String id : params){
getJdbcTemplate().query(sql, new Object[]{id}, new CustomResultSetExtractor());
}
I think it's a better solution to use the in clause. And then use a ResultSetExtractor to iterate over the result data. Your extractor can return a Map instead of a List, actually a Map of List.
Map<Integer, List<MyObject>>
Here is a simple tutorial explaining its use:
http://pure-essence.net/2011/03/16/how-to-execute-in-sql-in-spring-jdbctemplate/
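The grouping step inside such an extractor could look roughly like the sketch below. It deliberately leaves out Spring and JDBC so that it runs on its own: Row is a hypothetical stand-in for one record read from the ResultSet, and groupById is an illustrative name. In a real ResultSetExtractor the loop would iterate with rs.next() and read the columns from the ResultSet instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupRowsByIdExample {

    // Hypothetical stand-in for one row of the result set.
    static class Row {
        final int id;
        final String name;
        Row(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Collects all rows that share an id into one list, keeping first-seen id order.
    static Map<Integer, List<Row>> groupById(Iterable<Row> rows) {
        Map<Integer, List<Row>> byId = new LinkedHashMap<>();
        for (Row row : rows) {
            byId.computeIfAbsent(row.id, key -> new ArrayList<>()).add(row);
        }
        return byId;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(1234, "first"), new Row(2345, "second"), new Row(1234, "third"));
        groupById(rows).forEach((id, group) -> System.out.println(id + " -> " + group.size() + " row(s)"));
    }
}
```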
I think this is the best solution:
public List<TestUser> findUserByIds(int[] ids) {
String[] s = new String[ids.length];
Arrays.fill(s, "?");
String sql = StringUtils.join(s, ',');
return jdbcTemplate.query(String.format("select * from users where id in (%s)", sql),
ArrayUtils.toObject(ids), new BeanPropertyRowMapper<>(TestUser.class));
}
This may be what you want. BeanPropertyRowMapper is just an example; it will be very slow when there are a lot of records, so you should replace it with a more efficient RowMapper.
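If Apache Commons Lang is not on the classpath, the placeholder list can also be built with the JDK alone. A minimal sketch, assuming the same users table as in the answer above; placeholders and selectUsersByIds are hypothetical helper names, and the resulting SQL string would be passed to the JDBC template together with the ids as arguments.

```java
import java.util.Collections;

class InClauseExample {

    // Builds "?,?,?" for the given number of values.
    static String placeholders(int count) {
        if (count < 1) {
            // Most databases reject an empty IN () list, so fail early.
            throw new IllegalArgumentException("at least one value is required");
        }
        return String.join(",", Collections.nCopies(count, "?"));
    }

    // Only placeholders are formatted into the SQL text; the values stay bind parameters.
    static String selectUsersByIds(int idCount) {
        return String.format("select * from users where id in (%s)", placeholders(idCount));
    }

    public static void main(String[] args) {
        System.out.println(selectUsersByIds(3));
    }
}
```

Because only question marks are formatted into the string, the ids themselves are still bound by the driver and never concatenated into the SQL.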

Scalable Contains method for LINQ against a SQL backend

I'm looking for an elegant way to execute a Contains() statement in a scalable way. Please allow me to give some background before I come to the actual question.
The IN statement
In Entity Framework and LINQ to SQL the Contains statement is translated as a SQL IN statement. For instance, from this statement:
var ids = Enumerable.Range(1,10);
var courses = Courses.Where(c => ids.Contains(c.CourseID)).ToList();
Entity Framework will generate
SELECT
[Extent1].[CourseID] AS [CourseID],
[Extent1].[Title] AS [Title],
[Extent1].[Credits] AS [Credits],
[Extent1].[DepartmentID] AS [DepartmentID]
FROM [dbo].[Course] AS [Extent1]
WHERE [Extent1].[CourseID] IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Unfortunately, the IN statement is not scalable. As per MSDN:
Including an extremely large number of values (many thousands) in an IN clause can consume resources and return errors 8623 or 8632
which has to do with running out of resources or exceeding expression limits.
But before these errors occur, the IN statement becomes increasingly slow as the number of items grows. I can't find documentation about its growth rate, but in my experience with SQL Server it performs well up to a few thousand items and gets dramatically slower beyond that.
Scalable
We can't always avoid this statement. A JOIN with the source data instead would generally perform much better, but that's only possible when the source data is in the same context. Here I'm dealing with data coming from a client in a disconnected scenario. So I have been looking for a scalable solution. A satisfactory approach turned out to be cutting the operation into chunks:
var courses = ids.ToChunks(1000)
.Select(chunk => Courses.Where(c => chunk.Contains(c.CourseID)))
.SelectMany(x => x).ToList();
(where ToChunks is this little extension method).
This executes the query in chunks of 1000 that all perform well enough. With e.g. 5000 items, 5 queries will run that together are likely to be faster than one query with 5000 items.
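The ToChunks extension method itself is not shown in the post. Purely as an illustration of the chunking idea, here is a sketch in Java (the language used for the added examples on this page); toChunks is a hypothetical helper, and the comment in main marks the point where each chunk would be turned into its own IN query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ChunkingExample {

    // Splits the source list into consecutive chunks of at most chunkSize items.
    static <T> List<List<T>> toChunks(List<T> source, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < source.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, source.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(source.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> ids = IntStream.rangeClosed(1, 5000).boxed().collect(Collectors.toList());
        for (List<Integer> chunk : toChunks(ids, 1000)) {
            // Each chunk would become its own "WHERE CourseID IN (...)" query.
            System.out.println(chunk.get(0) + ".." + chunk.get(chunk.size() - 1));
        }
    }
}
```

The last chunk is simply shorter when the item count is not a multiple of the chunk size, so no padding or special casing is needed.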
But not DRY
But of course I don't want to scatter this construct all over my code. I am looking for an extension method by which any IQueryable<T> can be transformed into a chunky executing statement. Ideally something like this:
var courses = Courses.Where(c => ids.Contains(c.CourseID))
.AsChunky(1000)
.ToList();
But maybe this
var courses = Courses.ChunkyContains(c => c.CourseID, ids, 1000)
.ToList();
I've given the latter solution a first shot:
public static IEnumerable<TEntity> ChunkyContains<TEntity, TContains>(
this IQueryable<TEntity> query,
Expression<Func<TEntity,TContains>> match,
IEnumerable<TContains> containList,
int chunkSize = 500)
{
return containList.ToChunks(chunkSize)
.Select (chunk => query.Where(x => chunk.Contains(match)))
.SelectMany(x => x);
}
Obviously, the part x => chunk.Contains(match) doesn't compile. But I don't know how to manipulate the match expression into a Contains expression.
Maybe someone can help me make this solution work. And of course I'm open to other approaches to make this statement scalable.
I solved this problem with a slightly different approach a few months ago. Maybe it's a good solution for you too.
I didn't want my solution to change the query itself, so an ids.ChunkContains(p.Id) or a special WhereContains method was not an option. The solution should also be able to combine a Contains with another filter, and to use the same collection multiple times:
db.TestEntities.Where(p => (ids.Contains(p.Id) || ids.Contains(p.ParentId)) && p.Name.StartsWith("Test"))
So I tried to encapsulate the logic in a special ToList method that could rewrite the Expression for a specified collection to be queried in chunks.
var ids = Enumerable.Range(1, 11);
var result = db.TestEntities.Where(p => ids.Contains(p.Id) && p.Name.StartsWith ("Test"))
.ToChunkedList(ids,4);
To rewrite the expression tree, I find all Contains method calls on local collections in the query with a few helper classes.
private class ContainsExpression
{
public ContainsExpression(MethodCallExpression methodCall)
{
this.MethodCall = methodCall;
}
public MethodCallExpression MethodCall { get; private set; }
public object GetValue()
{
var parent = MethodCall.Object ?? MethodCall.Arguments.FirstOrDefault();
return Expression.Lambda<Func<object>>(parent).Compile()();
}
public bool IsLocalList()
{
Expression parent = MethodCall.Object ?? MethodCall.Arguments.FirstOrDefault();
while (parent != null) {
if (parent is ConstantExpression)
return true;
var member = parent as MemberExpression;
if (member != null) {
parent = member.Expression;
} else {
parent = null;
}
}
return false;
}
}
private class FindExpressionVisitor<T> : ExpressionVisitor where T : Expression
{
public List<T> FoundItems { get; private set; }
public FindExpressionVisitor()
{
this.FoundItems = new List<T>();
}
public override Expression Visit(Expression node)
{
var found = node as T;
if (found != null) {
this.FoundItems.Add(found);
}
return base.Visit(node);
}
}
public static List<T> ToChunkedList<T, TValue>(this IQueryable<T> query, IEnumerable<TValue> list, int chunkSize)
{
var finder = new FindExpressionVisitor<MethodCallExpression>();
finder.Visit(query.Expression);
var methodCalls = finder.FoundItems.Where(p => p.Method.Name == "Contains").Select(p => new ContainsExpression(p)).Where(p => p.IsLocalList()).ToList();
var localLists = methodCalls.Where(p => p.GetValue() == list).ToList();
If the local collection passed in the ToChunkedList method was found in the query expression, I replace the Contains call to the original list with a new call to a temporary list containing the ids for one batch.
if (localLists.Any()) {
var result = new List<T>();
var valueList = new List<TValue>();
var containsMethod = typeof(Enumerable).GetMethods(BindingFlags.Static | BindingFlags.Public)
.Single(p => p.Name == "Contains" && p.GetParameters().Count() == 2)
.MakeGenericMethod(typeof(TValue));
var queryExpression = query.Expression;
foreach (var item in localLists) {
var parameter = new List<Expression>();
parameter.Add(Expression.Constant(valueList));
if (item.MethodCall.Object == null) {
parameter.AddRange(item.MethodCall.Arguments.Skip(1));
} else {
parameter.AddRange(item.MethodCall.Arguments);
}
var call = Expression.Call(containsMethod, parameter.ToArray());
var replacer = new ExpressionReplacer(item.MethodCall,call);
queryExpression = replacer.Visit(queryExpression);
}
var chunkQuery = query.Provider.CreateQuery<T>(queryExpression);
for (int i = 0; i < Math.Ceiling((decimal)list.Count() / chunkSize); i++) {
valueList.Clear();
valueList.AddRange(list.Skip(i * chunkSize).Take(chunkSize));
result.AddRange(chunkQuery.ToList());
}
return result;
}
// if the collection was not found return query.ToList()
return query.ToList();
}
Expression Replacer:
private class ExpressionReplacer : ExpressionVisitor {
private Expression find, replace;
public ExpressionReplacer(Expression find, Expression replace)
{
this.find = find;
this.replace = replace;
}
public override Expression Visit(Expression node)
{
if (node == this.find)
return this.replace;
return base.Visit(node);
}
}
Please allow me to provide an alternative to the Chunky approach.
The technique involving Contains in your predicate works well for:
A constant (non-volatile) list of values.
A small list of values.
Contains will do great if your local data has those two characteristics, because this small set of values will be hardcoded in the final SQL query.
The problem begins when your list of values has entropy (is non-constant). As of this writing, Entity Framework (Classic and Core) does not try to parameterize these values in any way. This forces SQL Server to generate a query plan every time it sees a new combination of values in your query. This operation is expensive and is aggravated by the overall complexity of your query (e.g. many tables, a lot of values in the list, etc.).
The Chunky approach still suffers from this SQL Server query plan cache pollution problem, because it does not parameterize the query; it just splits the cost of creating one big execution plan into smaller ones that are easier for SQL Server to compute (and discard). Furthermore, every chunk adds a round-trip to the database, which increases the time needed to resolve the query.
An Efficient Solution for EF Core
🎉 NEW! QueryableValues EF6 Edition has arrived!
For EF Core keep reading below.
Wouldn't it be nice to have a way of composing local data in your query in a way that's SQL Server friendly? Enter QueryableValues.
I designed this library with these two main goals:
It MUST solve the SQL Server's query plan cache pollution problem ✅
It MUST be fast! ⚡
It has a flexible API that allows you to compose local data provided by an IEnumerable<T> and you get back an IQueryable<T>; just use it as if it were another entity of your DbContext (really), e.g.:
// Sample values.
IEnumerable<int> values = Enumerable.Range(1, 1000);
// Using a Join (query syntax).
var query1 =
from e in dbContext.MyEntities
join v in dbContext.AsQueryableValues(values) on e.Id equals v
select new
{
e.Id,
e.Name
};
// Using Contains (method syntax)
var query2 = dbContext.MyEntities
.Where(e => dbContext.AsQueryableValues(values).Contains(e.Id))
.Select(e => new
{
e.Id,
e.Name
});
You can also compose complex types!
It goes without saying that the provided IEnumerable<T> is only enumerated at the time that your query is materialized (not before), preserving the same behavior of EF Core in this regard.
How Does It Work?
Internally QueryableValues creates a parameterized query and provides your values in a serialized format that is natively understood by SQL Server. This allows your query to be resolved with a single round-trip to the database and avoids creating a new query plan on subsequent executions due to the parameterized nature of it.
Useful Links
Nuget Package
GitHub Repository
Benchmarks
SQL Server Cache Pollution Problem
QueryableValues is distributed under the MIT license
LINQKit to the rescue! There might be a better way that does it directly, but this seems to work fine and makes it pretty clear what's being done. The addition is AsExpandable(), which lets you use the Invoke extension.
using LinqKit;
public static IEnumerable<TEntity> ChunkyContains<TEntity, TContains>(
this IQueryable<TEntity> query,
Expression<Func<TEntity,TContains>> match,
IEnumerable<TContains> containList,
int chunkSize = 500)
{
return containList
.ToChunks(chunkSize)
.Select (chunk => query.AsExpandable()
.Where(x => chunk.Contains(match.Invoke(x))))
.SelectMany(x => x);
}
You might also want to do this:
containList.Distinct()
.ToChunks(chunkSize)
...or something similar, so you don't get duplicate results if something like this occurs:
query.ChunkyContains(x => x.Id, new List<int> { 1, 1 }, 1);
Another way would be to build the predicate this way (of course, some parts should be improved, just giving the idea).
public static Expression<Func<TEntity, bool>> ContainsPredicate<TEntity, TContains>(this IEnumerable<TContains> chunk, Expression<Func<TEntity, TContains>> match)
{
return Expression.Lambda<Func<TEntity, bool>>(Expression.Call(
typeof (Enumerable),
"Contains",
new[]
{
typeof (TContains)
},
Expression.Constant(chunk, typeof(IEnumerable<TContains>)), match.Body),
match.Parameters);
}
which you could call in your ChunkyContains method:
return containList.ToChunks(chunkSize)
.Select(chunk => query.Where(ContainsPredicate(chunk, match)))
.SelectMany(x => x);
Using a stored procedure with a table-valued parameter could also work well. In effect, you write a join in the stored procedure between your table/view and the table-valued parameter.
https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql/table-valued-parameters

How can I create an Expression within another Expression?

Forgive me if this has been asked already. I've only just started using LINQ. I have the following Expression:
public static Expression<Func<TblCustomer, CustomerSummary>> SelectToSummary()
{
return m => (new CustomerSummary()
{
ID = m.ID,
CustomerName = m.CustomerName,
LastSalesContact = // This is a Person entity, no idea how to create it
});
}
I want to be able to populate LastSalesContact, which is a Person entity.
The details that I wish to populate come from m.LatestPerson, so how can I map the fields from m.LatestPerson to LastSalesContact? I want the mapping to be reusable, i.e. I do not want to do this:
LastSalesContact = new Person()
{
// Etc
}
Can I use a static Expression, such as this:
public static Expression<Func<TblUser, User>> SelectToUser()
{
return x => (new User()
{
// Populate
});
}
UPDATE:
This is what I need to do:
return m => (new CustomerSummary()
{
ID = m.ID,
CustomerName = m.CustomerName,
LastSalesContact = new Person()
{
PersonId = m.LatestPerson.PersonId,
PersonName = m.LatestPerson.PersonName,
Company = new Company()
{
CompanyId = m.LatestPerson.Company.CompanyId,
// etc.
}
}
});
But I will be re-using the Person() creation in about 10-15 different classes, so I don't want exactly the same code duplicated X amount of times. I'd probably also want to do the same for Company.
Can't you just use automapper for that?
public static Expression<Func<TblCustomer, CustomerSummary>> SelectToSummary()
{
return m => Mapper.Map<TblCustomer, CustomerSummary>(m);
}
You'd have to do some bootstrapping, but then it's very reusable.
UPDATE:
I may not be getting something, but what is the purpose of this function? If you just want to map one Tbl object or a collection of them to other objects, why have the expression?
You could just have something like this:
var customers = _customerRepository.GetAll(); // returns IEnumerable<TblCustomer>
var summaries = Mapper.Map<IEnumerable<TblCustomer>, IEnumerable<CustomerSummary>>(customers);
Or is there something I missed?
I don't think you'll be able to use a lambda expression to do this... you'll need to build up the expression tree by hand using the factory methods in Expression. It's unlikely to be pleasant, to be honest.
My generally preferred way of working out how to build up expression trees is to start with a simple example of what you want to do written as a lambda expression, and then decompile it. That should show you how the expression tree is built - although the C# compiler gets to use the metadata associated with properties more easily than we can (we have to use Type.GetProperty).
This is always assuming I've understood you correctly... it's quite possible that I haven't.
How about this:
public static Person CreatePerson(TblPerson data)
{
// ...
}
public static Expression<Func<TblPerson, Person>> CreatePersonExpression()
{
return d => CreatePerson(d);
}
return m => (new CustomerSummary()
{
ID = m.ID,
CustomerName = m.CustomerName,
LastSalesContact = CreatePerson(m.LatestPerson)
});

Using an IEqualityComparer with a LINQ to Entities Except clause

I have an entity set that I'd like to compare with a subset, selecting everything except the subset.
So, my query looks like this:
Products.Except(ProductsToRemove(), new ProductComparer())
The ProductsToRemove() method returns a List<Product> after it performs a few tasks. So in its simplest form it's the above.
The ProductComparer() class looks like this:
public class ProductComparer : IEqualityComparer<Product>
{
public bool Equals(Product a, Product b)
{
if (ReferenceEquals(a, b)) return true;
if (ReferenceEquals(a, null) || ReferenceEquals(b, null))
return false;
return a.Id == b.Id;
}
public int GetHashCode(Product product)
{
if (ReferenceEquals(product, null)) return 0;
var hashProductId = product.Id.GetHashCode();
return hashProductId;
}
}
However, I continually receive the following exception:
LINQ to Entities does not recognize
the method
'System.Linq.IQueryable`1[UnitedOne.Data.Sql.Product]
Except[Product](System.Linq.IQueryable`1[UnitedOne.Data.Sql.Product],
System.Collections.Generic.IEnumerable`1[UnitedOne.Data.Sql.Product],
System.Collections.Generic.IEqualityComparer`1[UnitedOne.Data.Sql.Product])'
method, and this method cannot be
translated into a store expression.
Linq to Entities isn't actually executing your query, it is interpreting your code, converting it to TSQL, then executing that on the server.
Under the covers, it is coded with the knowledge of how operators and common functions operate and how those relate to TSQL. The problem is that the developers of L2E have no idea how exactly you are implementing IEqualityComparer. Therefore they cannot figure out that when you say Class A == Class B you mean (for example) "Where Person.FirstName == FirstName AND Person.LastName == LastName".
So, when the L2E interpreter hits a method it doesn't recognize, it throws this exception.
There are two ways you can work around this. First, develop a Where() that satisfies your equality requirements but that doesn't rely on any custom method. In other words, test for equality of properties of the instance rather than an Equals method defined on the class.
Second, you can trigger the execution of the query and then do your comparisons in memory. For instance:
var notThisItem = new Item{Id = "HurrDurr"};
var items = Db.Items.ToArray(); // Sql query executed here
var except = items.Except(new[] { notThisItem }); // performed in memory; Except takes a sequence and uses Item's Equals/GetHashCode unless a comparer is passed
Obviously this will bring much more data across the wire and be more memory intensive. The first option is usually the best.
You're trying to convert the Except call with your custom IEqualityComparer into Entity SQL.
Obviously, your class cannot be converted into SQL.
You need to write Products.AsEnumerable().Except(ProductsToRemove(), new ProductComparer()) to force it to execute on the client. Note that this will download all of the products from the server.
By the way, your ProductComparer class should be a singleton, like this:
public class ProductComparer : IEqualityComparer<Product> {
private ProductComparer() { }
public static readonly ProductComparer Instance = new ProductComparer();
...
}
The IEqualityComparer<T> can only be executed locally; it can't be translated to a SQL command, hence the error.
