List.Find, HashSet, or LINQ: which one is better for searching a list? - linq

I have a list of strings in which I want to find a particular value and return it.
If I only need to test whether the value exists, I can use a HashSet instead of a list:
HashSet<string> data = new HashSet<string>();
bool contains = data.Contains("lokendra");
For the list, however, I am using Find, because I also want to get the matching value back from the list.
I have found this method to be time consuming. The method that contains this code is called more than 1000 times, and the list holds approximately 20000 to 25000 items, so the total time adds up. Is there any other way to make the search faster?
List<Employee> employeeData = new List<Employee>();
var result = employeeData.Find(element => element.name == "lokendra");
Is there a LINQ method, or any other approach, that retrieves the data faster?
Please help.
public struct Employee
{
public string role;
public string id;
public int salary;
public string name;
public string address;
}
I have a list of this structure, and if the name field matches the value "lokendra" I want to return the whole object. Consider the list to be the employee data.
HashSet gives a faster membership test. Is there a similar way to search the data and return the matching item quickly, other than Find?

It sounds like what you actually want is a Dictionary<string, Employee>. Build that once, and you can query it efficiently many times. You can build it from a list of employees easily:
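// Assumes Employee exposes a Name property. ToDictionary throws if two employees share a name.
var employeesByName = employees.ToDictionary(e => e.Name);
...
Employee employee;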
if (employeesByName.TryGetValue(name, out employee))
{
// Yay, found the employee
}
else
{
// Nope, no employee with that name
}
EDIT: Now I've seen your edit... please don't create struct types like this. You almost certainly want a class instead, and one with properties rather than public fields...

You can try employeeData.FirstOrDefault(e => e.name == "lokendra"), but it still has to iterate over the collection, so it will perform about the same as the list's Find method.
If the list content is set only once and you then search it again and again, you should consider implementing your own solution:
Sort the list before the first search.
Use binary search, which is O(log n) per lookup instead of O(n) for the standard Find and Where.
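The sort-once, binary-search-many-times idea can be sketched as follows. This is only an illustration, written in Java; the Employee class here is a hypothetical stand-in for the type in the question, and SortedEmployeeLookup and findByName are names made up for the sketch. In C#, List<T>.Sort and List<T>.BinarySearch play the same roles.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for the Employee type from the question.
class Employee {
    final String id;
    final String name;
    final int salary;

    Employee(String id, String name, int salary) {
        this.id = id;
        this.name = name;
        this.salary = salary;
    }
}

class SortedEmployeeLookup {
    private static final Comparator<Employee> BY_NAME =
            Comparator.comparing((Employee e) -> e.name);

    private final List<Employee> sortedByName;

    SortedEmployeeLookup(List<Employee> employees) {
        // Copy and sort once: O(n log n).
        sortedByName = new ArrayList<>(employees);
        sortedByName.sort(BY_NAME);
    }

    // Each lookup is O(log n). Returns null when no employee has that name.
    // If several employees share the name, any one of them may be returned.
    Employee findByName(String name) {
        Employee probe = new Employee(null, name, 0);
        int index = Collections.binarySearch(sortedByName, probe, BY_NAME);
        return index >= 0 ? sortedByName.get(index) : null;
    }
}
```

This only pays off when the list changes rarely: every modification means either re-sorting or inserting at the right position. A dictionary keyed by name avoids that bookkeeping and is usually the simpler choice.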

Related

How to GroupBy objects from a list by some common catalog of properties in Java 8

I've been struggling with one of my lists of data: after generating it, one of the requirements is to group its elements by several common parameters (more than one).
What I should get at the end is a map whose values are lists of objects that share those parameters. For example:
List<Cause> listToGroup = new ArrayList<>();
listToGroup.add(Similar);
listToGroup.add(Common);
listToGroup.add(Similar);
listToGroup.add(Similar);
listToGroup.add(Common);
This is a rough way of representing two groups, Similar and Common, which should end up in two different lists. (The real list is produced by a request to other methods; here I added the elements manually just to show what it could contain.) My main problem is the grouping criterion: it is based on a set of parameters that must be shared, but not on all of them. If the required parameters are equal, the objects belong in the same list. The class below shows this: some of its parameters are not considered when grouping.
public class Cause extends GeneralDomain {
//parameters which must be equals between objects
private Long id;
private Date creationDate;
private Part origin;
private Part destination;
//parameters which are not required to be equal
private BigDecimal value;
private Stage stageEvent;
//omitted getters and setters
}
I've been looking at Comparator and at the groupingBy collector provided in Java 8, but at the moment I only know how to do this for a single parameter (for example, grouping by id). I have no idea how to group by more than one parameter.
// This would be the code if the requirement were to group by just one parameter, but in my case there are several.
Map<Long, List<Cause>> result = request.getList(criteria)
.stream()
.map(p -> parsin.createDto(p))
.collect(groupingBy(Cause::getId));
I would be really glad for any suggestion. I'm sorry if my explanation is not clear; the problem has become so complicated that it is hard even for me to explain.
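One common way to group by several fields at once is to pass groupingBy a composite key, for example a List built from the fields that must match, since lists compare element by element. The sketch below is only an illustration under simplifying assumptions: this Cause is a cut-down, hypothetical version of the class above (origin and destination are plain strings, creationDate is left out), and CauseGrouping and groupByKeyFields are made-up names. With the real class, Part and Date would need meaningful equals and hashCode implementations for the key to work.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Simplified, hypothetical version of the Cause class from the question.
class Cause {
    final Long id;
    final String origin;
    final String destination;
    final int value; // not part of the grouping key

    Cause(Long id, String origin, String destination, int value) {
        this.id = id;
        this.origin = origin;
        this.destination = destination;
        this.value = value;
    }
}

class CauseGrouping {
    // Causes whose id, origin and destination are all equal land in the same list.
    static Map<List<Object>, List<Cause>> groupByKeyFields(List<Cause> causes) {
        return causes.stream()
                .collect(Collectors.groupingBy(
                        c -> Arrays.<Object>asList(c.id, c.origin, c.destination)));
    }
}
```

A small dedicated key class with its own equals and hashCode does the same job and gives the map key a readable type, at the cost of a little more code.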

spring crud repository find top n Items by field A and field B in list order by field C

I have in a Spring Repo something like this:
findTop10ItemsByCategIdInOrderByInsertDateDesc(List ids)
I want the first 10 items where category id in list of ids ordered by insert date.
Another similar query:
findTop10ItemsByDomainIdAndCategIdInOrderByInsertDateDesc(List ids, @Param Integer domainId)
Here I want the domain id to be equal to the given param and the categId to be in the given list.
I managed to resolve it using @Query, but I wonder if there is a one-liner for the above queries.
Thanks
EDIT
The top works fine. Initially I had findTop10ItemsByDomainIdAndCategIdOrderByInsertDateDesc. Now I want the results from a list of category ids. That's the new requirement.
SECOND EDIT
My query works for finding the set of results where the domain id is equal to a given param and the categ id is contained in a given list. BUT I found out that HQL doesn't support a setMaxResult kind of thing such as top or limit.
@Query("select i from Items i where i.domainId = :domainId and i.categId in :categoryIds order by i.insertDate desc")
The params for this method were (@Param("domainid") Integer domainid, List<Integer> categoryIds), but it seems that I'm allowed either to put the @Param annotation on every parameter or to use no @Param at all (except for Pageable; not my case).
I still don't know how to achieve this:
extract the top n elements where field a is equal to a param and field b is in a set of params, ordered by another field.
PS: sorry for the tags, but there is no spring-crudrepository tag :)
The method to resolve your problem is:
List<MyClass> findTop10ByDomainIdAndCategIdInOrderByInsertDateDesc(Long domainId, List<Long> ids);
Top10 limits the results to first 10.
ByDomainId limits results to those that have passed domainId.
And adds another constraint.
CategIdIn limits results to those entries that have categId in the passed List.
OrderByInsertDateDesc orders results descending by insert date before limiting to TOP10.
I have tested this query on the following example:
List<User> findTop10ByEmailAndPropInOrderByIdDesc(String email, List<Long> props);
Where User is:
private Long id;
private String username;
private String password;
private String email;
private Long prop;
Currently I would recommend using LocalDate or LocalDateTime for storing dates using Spring Data JPA.

How to combine collection of linq queries into a single sql request

Thanks for checking this out.
My situation is that I have a system where the user can create custom filtered views, which I build into a LINQ query on the request. On the interface they want to see the counts of all the views they have created; pretty straightforward. I'm familiar with combining multiple queries into a single call, but in this case I don't know in advance how many queries I have.
Does anyone know of a technique where this loop combines the count queries into a single query that I can then execute with a ToList() or FirstOrDefault()?
//TODO Performance this isn't good...
foreach (IMeetingViewDetail view in currentViews)
{
view.RecordCount = GetViewSpecificQuery(view.CustomFilters).Count();
}
Here is an example of multiple queries combined as I'm referring to. This is two queries which I then combine into an anonymous projection resulting in a single request to the sql server.
IQueryable<EventType> eventTypes = _eventTypeService.GetRecords().AreActive<EventType>();
IQueryable<EventPreferredSetup> preferredSetupTypes = _eventPreferredSetupService.GetRecords().AreActive<EventPreferredSetup>();
var options = someBaseQuery.Select(x => new
{
EventTypes = eventTypes.AsEnumerable(),
PreferredSetupTypes = preferredSetupTypes.AsEnumerable()
}).FirstOrDefault();
Well, for performance considerations, I would change the interface from IEnumerable<T> to a collection that has a Count property. Both IList<T> and ICollection<T> have a Count property.
This way, the collection object is keeping track of its size and you just need to read it.
If you really wanted to avoid the loop, you could redefine the RecordCount to be a lazy loaded integer that calls GetViewSpecificQuery to get the count once.
private int? _recordCount = null;
public int RecordCount
{
get
{
if (_recordCount == null)
_recordCount = GetViewSpecificQuery(view.CustomFilters).Count;
return _recordCount.Value;
}
}

Linq compared to IComparer

I have seen this class that looks like this:
/// <summary>
/// Provides an internal structure to sort the query parameter
/// </summary>
protected class QueryParameter
{
public QueryParameter(string name, string value)
{
Name = name;
Value = value;
}
public string Name { get; private set; }
public string Value { get; private set; }
}
/// <summary>
/// Comparer class used to perform the sorting of the query parameters
/// </summary>
protected class QueryParameterComparer : IComparer<QueryParameter>
{
public int Compare(QueryParameter x, QueryParameter y)
{
return x.Name == y.Name
? string.Compare(x.Value, y.Value)
: string.Compare(x.Name, y.Name);
}
}
Then there is a call later in the code that does the sort:
parameters.Sort(new QueryParameterComparer());
which all works fine.
I decided that it was a waste of time creating a QueryParameter class that only had name value and it would probably be better to use Dictionary. With the dictionary, rather than use the Sort(new QueryParameterComparer()); I figured I could just do this:
parameters.ToList().Sort((x, y) => x.Key == y.Key ? string.Compare(x.Value, y.Value) : string.Compare(x.Key, y.Key));
The code compiles fine, but I am unsure whether it is working because the list just seems to output in the same order it was put in.
So, can anyone tell me if I am doing this correctly or if I am missing something simple?
Cheers
/r3plica
The List<T>.Sort method is not part of LINQ.
You can use OrderBy/ThenBy extension methods before calling ToList():
var sortedParameters = parameters.OrderBy(x => x.Key).ThenBy(x => x.Value).ToList();
From your code, I surmise that parameters is your dictionary, and you're calling
parameters.ToList().Sort(...);
and then carrying on using parameters.
ToList() creates a new list; you are then sorting this list and discarding it. You're not sorting parameters at all, and in fact you can't sort it because it's a dictionary.
What you need is something along the lines of
var parametersList = parameters.ToList();
parametersList.Sort(...);
where ... is the same sort as before.
You could also do
var parametersList = parameters.OrderBy(...).ToList();
which is a more LINQ-y way of doing things.
It may even be appropriate to just do e.g.
foreach(var kvp in parameters.OrderBy(...))
(or however you plan on using the sorted sequence) if you're changing the dictionary more often than you're using the sorted sequence (i.e. there's no point caching a sorted version because the original data changes a lot).
Another point to note: a dictionary can't contain duplicate keys, so there's no point checking x.Key == y.Key any more. You just need to sort via (x, y) => string.Compare(x.Key, y.Key)
I'd be careful here, though. By the look of it, the original code did support duplicate keys, so by switching to a dictionary you might be breaking something.
A Dictionary is essentially a hash map: it gives you access to any element, given its key, in constant time, O(1), because it performs a lookup in a hash table.
So if you wanted to order the elements only because you intended to do a binary search later, you do not need to: use the dictionary directly. (If you need to look up by value as well as by key, you could keep a pair of dictionaries with the same elements but with keys and values swapped.)
As somebody wrote before me, if your question is how to order a list with LINQ, you should use OrderBy and ThenBy.

Design a Data Structure for web server to store history of visited pages

The server must maintain data for last n days. It must show the most visited pages of the current day first and then the most visited pages of next day and so on.
I'm thinking along the lines of a hash map of hash maps. Any suggestions?
Outer hash map with key of type date and value of type hash map.
Inner hash map with key of type string containing the url and value of type int containing the visit count.
Example in C#:
// Outer hash map
var visitsByDay =
new Dictionary<DateTime, VisitsByUrl> { { currentDate, new VisitsByUrl() } };
...
// inner hash map
public class VisitsByUrl
{
public Dictionary<string, int> Urls { get; set; }
public VisitsByUrl()
{
Urls = new Dictionary<string, int>();
}
public void Add(string url)
{
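// The indexer throws for a missing key, so look the URL up with TryGetValue.
int count;
if (Urls.TryGetValue(url, out count))
Urls[url] = count + 1;
else
Urls.Add(url, 1);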
}
}
You can keep one hash per day, of type hash<url,hits>, and a queue of length n that holds these per-day hashes. You also store a separate hash, totalStats, which holds the sum over all of them.
class Stats {
queue< hash<url,hits> > completeStats;
hash<url,hits> totalStats;
public:
int getNoOfTodayHits(url) {
return completeStats[n-1][url];
}
int getTotalStats(url) {
return totalStats[url];
}
void addAnotherDay() {
// before popping check if the length is n or not :)
hash<url,hits> lastStats = completeStats.pop();
hash<url,hits> todayStats;
completeStats.push_back(todayStats);
// traverse through lastStats and decrease the value from total stats;
}
// etc.
};
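The queue-of-hashes idea can be sketched in runnable form as follows. This is only an illustration in Java under stated assumptions: VisitStats and its method names are made up for the sketch, URLs are plain strings, and the window size is assumed to be at least 1. It also adds a topToday method that sorts the current day's pages by hit count, which the outline above leaves open.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class VisitStats {
    private final int maxDays; // assumed to be >= 1
    private final Deque<Map<String, Integer>> days = new ArrayDeque<>(); // oldest day first
    private final Map<String, Integer> totals = new HashMap<>(); // hits summed over the window

    VisitStats(int maxDays) {
        this.maxDays = maxDays;
        days.addLast(new HashMap<>()); // the current day
    }

    void recordVisit(String url) {
        days.peekLast().merge(url, 1, Integer::sum);
        totals.merge(url, 1, Integer::sum);
    }

    void startNewDay() {
        if (days.size() == maxDays) {
            // Drop the oldest day and subtract its hits from the totals.
            for (Map.Entry<String, Integer> e : days.removeFirst().entrySet()) {
                totals.computeIfPresent(e.getKey(), (url, hits) -> {
                    int remaining = hits - e.getValue();
                    return remaining > 0 ? remaining : null; // null removes the entry
                });
            }
        }
        days.addLast(new HashMap<>());
    }

    int hitsToday(String url) {
        return days.peekLast().getOrDefault(url, 0);
    }

    int totalHits(String url) {
        return totals.getOrDefault(url, 0);
    }

    // Today's URLs, most visited first.
    List<String> topToday() {
        return days.peekLast().entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

Sorting on demand keeps recordVisit cheap. If the ranking is read far more often than pages are visited, a structure that stays sorted might be worth the extra bookkeeping.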
We can use a combination of a stack and a hash map.
We create an object holding the URL and a timestamp, then push it onto the stack.
The most recently visited URL will be on top.
We combine the timestamp with the URL to create a key, which is mapped to the visit count for that URL.
To display the most visited pages in chronological order, we pop the stack, build the key, and fetch the count associated with the URL, sorting the entries while displaying them.
Time complexity: O(n) plus the sort time (which depends on the number of pages visited)
This depends on what you want. For example, do you want to store the actual data for the pages in the history, or just the URLs? If somebody has visited a page twice, should it show up twice in the history?
A hash map would be suitable if you wanted to store the data for a page, and wanted each page to show up only once.
If, as I'd consider more likely, you want to store only the URLs, but want each stored multiple times if it was visited more than once, an array/vector would probably make more sense. If you expect to see a lot of duplication of (relatively) long URLs, you could create a set of URLs, and for each visit store some sort of pointer/index/reference to the URL in question. Note, however, that maintaining this can become somewhat non-trivial.
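The "set of URLs plus a reference per visit" variant could look roughly like this. It is a minimal sketch in Java, not a definitive design: VisitLog and its method names are made up for illustration, and each distinct URL is simply given an integer index the first time it is seen.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class VisitLog {
    private final Map<String, Integer> urlIds = new HashMap<>(); // URL -> index into urls
    private final List<String> urls = new ArrayList<>();         // each distinct URL stored once
    private final List<Integer> visits = new ArrayList<>();      // one index per visit, oldest first

    void recordVisit(String url) {
        Integer id = urlIds.get(url);
        if (id == null) {
            // First time this URL is seen: store it once and remember its index.
            id = urls.size();
            urls.add(url);
            urlIds.put(url, id);
        }
        visits.add(id);
    }

    // The full history in visit order, with repeats.
    List<String> history() {
        List<String> result = new ArrayList<>(visits.size());
        for (int id : visits) {
            result.add(urls.get(id));
        }
        return result;
    }

    int distinctUrlCount() {
        return urls.size();
    }
}
```

The non-trivial part mentioned above shows up as soon as old visits have to expire: a URL can only be removed from the table once no remaining visit refers to it, which needs reference counts or a periodic rebuild.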

Resources