Multidimensional data lookup - data-structures

I have a collection of tuples of N values. A value may be a wildcard (matches any value), or a concrete value. What would be the best way to look up all tuples in the collection matching a specific tuple without scanning the entire collection and testing items one by one?
E.g. 1.2.3 matches 1.*.3 and *.*.3, but not 1.2.4 or *.2.4.
What data structure am I looking for here?

I'd use a trie to implement this. Here's how I would construct the trie:
The data structure would look like:
Trie{
    Integer value
    Map<Integer, Trie> tries
}
To insert:
insert(tuple, trie){
    curTrie = trie
    foreach(number in tuple){
        nextTrie = curTrie.getTrie(number)
        //add the number to the trie if it isn't in there
        if(nextTrie == null){
            newTrie = new Trie(number)
            curTrie.setTrie(number, newTrie)
        }
        curTrie = curTrie.getTrie(number)
    }
}
To get all the tuples:
getTuples(tuple, trie){
    //base case: the whole query has been consumed
    if(tuple is empty)
        return { emptyTuple }
    allTuples = {}
    if(head(tuple) == "*"){
        //a wildcard in the query follows every child
        forEach(subTrie in trie.tries){
            forEach(partialTuple in getTuples(restOf(tuple), subTrie)){
                allTuples.add(subTrie.value + partialTuple)
            }
        }
        return allTuples
    }
    subTrie = trie.getTrie(head(tuple))
    if(subTrie == null)
        return allTuples //nothing stored along this path matches
    forEach(partialTuple in getTuples(restOf(tuple), subTrie)){
        allTuples.add(head(tuple) + partialTuple)
    }
    return allTuples
}
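The pseudocode handles a wildcard in the query. The question also allows wildcards inside the stored tuples, so that the concrete tuple 1.2.3 finds 1.*.3 and *.*.3. A minimal Python sketch of that direction, assuming "*" is stored as an ordinary trie key and that all tuples have the same length; the names TrieNode, insert and find_matches are illustrative and not part of the answer:

```python
WILDCARD = "*"  # assumed marker for "matches any value"


class TrieNode:
    def __init__(self):
        self.children = {}         # value (or WILDCARD) -> TrieNode
        self.is_tuple_end = False  # True if a stored tuple ends here


def insert(root, pattern):
    """Store one pattern, e.g. (1, "*", 3)."""
    node = root
    for value in pattern:
        node = node.children.setdefault(value, TrieNode())
    node.is_tuple_end = True


def find_matches(root, query):
    """Return every stored pattern that matches the concrete query tuple."""
    results = []

    def walk(node, depth, prefix):
        if depth == len(query):
            if node.is_tuple_end:
                results.append(tuple(prefix))
            return
        # Follow the exact child and the wildcard child (deduplicated in
        # case the query value happens to be the wildcard marker itself).
        for key in dict.fromkeys((query[depth], WILDCARD)):
            child = node.children.get(key)
            if child is not None:
                walk(child, depth + 1, prefix + [key])

    walk(root, 0, [])
    return results


root = TrieNode()
for pattern in [(1, "*", 3), ("*", "*", 3), (1, 2, 4), ("*", 2, 4)]:
    insert(root, pattern)
print(find_matches(root, (1, 2, 3)))
```

Each level follows at most two children, so a lookup visits at most 2^N nodes for tuples of length N, independent of how many tuples are stored.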

Related

Kotlin. ArrayList, how to move element to first position

I have a list of Lessons. Here is my Lessons class:
data class Lessons(
    val id: Long,
    val name: String,
    val time: Long,
    val key: String
)
I need to move the element whose key field has the value "priority" to the beginning of the list.
Here is my code:
val priorityLesson = lessons.find { it.key == "priority" }
if (priorityLesson != null) {
lessons.remove(priorityLesson)
lessons.add(0, priorityLesson)
}
Everything works, but I do not like this solution; perhaps there is a more efficient way to do this. In addition, I have to convert the list to a mutable one, and I would like to keep it immutable.
Please help me.
One way is to call partition() to split the list into a list of priority lesson(s), and a list of non-priority lessons; you can then rejoin them:
val sorted = lessons.partition { it.key == "priority" }
    .let { it.first + it.second }
As well as handling the case of exactly one priority lesson, that will cope if there are none or several. And it preserves the order of priority lessons, and the order of non-priority lessons.
(That will take a little more memory than modifying the list in-place; but it scales the same — both are 𝒪(n). It's also easier to understand and harder to get wrong!)
First, I would call your class Lesson rather than Lessons as it represents a single lesson. Your choice of the variable name lessons is good for your list of lessons.
You can use a mutable list and swap the item into the first position (the element that was first moves to the priority lesson's old index):
val priorityLessonIndex = lessons.indexOfFirst { it.key == "priority" }
if (priorityLessonIndex != -1)
    lessons[0] = lessons[priorityLessonIndex]
        .also { lessons[priorityLessonIndex] = lessons[0] }
Or you can use an immutable list:
val priorityLesson = lessons.firstOrNull { it.key == "priority" }
val newList =
    if (priorityLesson != null)
        listOf(priorityLesson) + (lessons - priorityLesson)
    else
        lessons
A possibly more efficient way, which avoids creation of intermediate lists:
val newList = buildList(lessons.size) {
    lessons.filterTo(this) { it.key == "priority" }
    lessons.filterTo(this) { it.key != "priority" }
}

How to use Stream, to write an efficient shuffling method

I have an ArrayList of Residence objects. Each Residence object has two fields, type::String, and price::BigInteger. I was wondering if there is an efficient way to restructure the list so that no two Residence objects with the same type are next to each other. The goal is to write an efficient shuffling method.
I suggest you use a HashMap<String, Stack<Residence>> that holds one stack of residences per type.
Then loop over the map's entries in round-robin fashion, popping one item from each non-empty stack and adding it to a new list.
Assuming your ArrayList of Residence is residences, the code should be something like this (not tested; it only illustrates the algorithm):
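HashMap<String, Stack<Residence>> hm = new HashMap<>();
ArrayList<Residence> resultList = new ArrayList<>();
for (Residence r : residences) {
    // one stack per type, created the first time the type is seen
    hm.computeIfAbsent(r.type, k -> new Stack<>()).push(r);
}
boolean exist = true;
while (exist) {
    exist = false;
    // one round: take at most one residence of each type
    for (Map.Entry<String, Stack<Residence>> m : hm.entrySet()) {
        if (!m.getValue().isEmpty()) {
            exist = true;
            resultList.add(m.getValue().pop());
        }
    }
}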
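The same round-robin idea as a small Python sketch, assuming each residence is a (type, price) pair; round_robin_by_type is an illustrative name. It also shows a limitation: once only one type has items left, the remaining items of that type end up next to each other.

```python
from collections import defaultdict


def round_robin_by_type(residences):
    """Interleave (type, price) pairs by cycling through the types."""
    stacks = defaultdict(list)  # type -> list used as a stack
    for residence in residences:
        stacks[residence[0]].append(residence)

    result = []
    # Keep doing rounds while at least one stack is non-empty.
    while any(stacks.values()):
        for stack in stacks.values():
            if stack:
                result.append(stack.pop())
    return result


balanced = [("house", 1), ("flat", 2), ("house", 3), ("flat", 4)]
print([r[0] for r in round_robin_by_type(balanced)])

# With three "a" items and one "b", the last two "a" items are adjacent.
skewed = [("a", 1), ("a", 2), ("a", 3), ("b", 4)]
print([r[0] for r in round_robin_by_type(skewed)])
```

If one type makes up more than half of the list (rounded up), no arrangement can avoid adjacent items of that type, so the output should be checked when the input is that skewed.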

Most efficient way to determine if there are any differences between specific properties of 2 lists of items?

In C# .NET 4.0, I am struggling to come up with the most efficient way to determine if the contents of 2 lists of items contain any differences.
I don't need to know what the differences are, just true/false whether the lists are different based on my criteria.
The 2 lists I am trying to compare contain FileInfo objects, and I want to compare only the FileInfo.Name and FileInfo.LastWriteTimeUtc properties of each item. All the FileInfo items are for files located in the same directory, so the FileInfo.Name values will be unique.
To summarize, I am looking for a single Boolean result for the following criteria:
Does ListA contain any items with FileInfo.Name not in ListB?
Does ListB contain any items with FileInfo.Name not in ListA?
For items with the same FileInfo.Name in both lists, are the FileInfo.LastWriteTimeUtc values different?
Thank you,
Kyle
I would use a custom IEqualityComparer<FileInfo> for this task:
public class FileNameAndLastWriteTimeUtcComparer : IEqualityComparer<FileInfo>
{
    public bool Equals(FileInfo x, FileInfo y)
    {
        if (Object.ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.FullName.Equals(y.FullName) && x.LastWriteTimeUtc.Equals(y.LastWriteTimeUtc);
    }
    public int GetHashCode(FileInfo fi)
    {
        unchecked // Overflow is fine, just wrap
        {
            int hash = 17;
            hash = hash * 23 + fi.FullName.GetHashCode();
            hash = hash * 23 + fi.LastWriteTimeUtc.GetHashCode();
            return hash;
        }
    }
}
Now you can use a HashSet<FileInfo> with this comparer and HashSet<T>.SetEquals:
var comparer = new FileNameAndLastWriteTimeUtcComparer();
var uniqueFiles1 = new HashSet<FileInfo>(list1, comparer);
bool anyDifferences = !uniqueFiles1.SetEquals(list2);
Note that I've used FileInfo.FullName instead of Name, since file names on their own aren't unique.
Sidenote: another advantage is that you can use this comparer for many LINQ methods like GroupBy, Except, Intersect or Distinct.
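The reason a single set comparison is enough: if each file is reduced to a (name, last write time) pair, all three criteria from the question collapse into "are the two sets of pairs equal?". A minimal Python sketch of that reasoning, assuming the listings are already available as such pairs; listings_differ is an illustrative name:

```python
def listings_differ(files_a, files_b):
    """files_a / files_b: iterables of (name, last_write_time) pairs.

    A name present on only one side, or a name whose timestamp differs,
    produces a pair that is missing from the other set.
    """
    return set(files_a) != set(files_b)


a = [("report.txt", 1700000000), ("data.csv", 1700000500)]
b = [("data.csv", 1700000500), ("report.txt", 1700000000)]
print(listings_differ(a, b))  # same pairs in a different order
```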
This is not the most efficient way (probably ranks a 4 out of 5 in the quick-and-dirty category):
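// Project only the compared properties; including the FileInfo itself
// would make the comparison fall back to reference equality for it.
var comparableListA = ListA.Select(a =>
    new { Name = a.Name, LastWrite = a.LastWriteTimeUtc });
var comparableListB = ListB.Select(b =>
    new { Name = b.Name, LastWrite = b.LastWriteTimeUtc });
// Except is one-directional, so check both directions
var youHaveDiff = comparableListA.Except(comparableListB).Any()
    || comparableListB.Except(comparableListA).Any();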
Explanation:
Anonymous types are compared by property values, which is what you're looking to do; that led me to a LINQ projection along these lines.
P.S.
You should double check the syntax, I just rattled this off without the compiler.

Search for a list of words in a paragraph

I have a paragraph written in English.
I have a list of words.
I want to check if the paragraph contains any one of the words.
What is the best algorithm to do so?
Presently, I have the following but it seems very naive:
private boolean findMatch(List<String> list, String param, ArrayList<String> skipChars) {
    boolean matchResult = false;
    for (String s : list) {
        // ignore words that appear in the skip list
        if (skipChars == null || !skipChars.contains(s)) {
            if (param.indexOf(s) != -1) {
                matchResult = true;
                break;
            }
        }
    }
    return matchResult;
}
Split the paragraph into words and store them in a hash table.
Now, for each word in your list, search for it in the hash table.
For real-life applications this will probably do.
--EDIT--
If you cannot split the paragraph into words, and you only need to tell whether any one of the words is in the paragraph, I suggest constructing a trie from your list of words, and then going over the paragraph and checking the trie for matches as you go.
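A minimal Python sketch of the hash-table approach, assuming words are runs of letters and apostrophes and that matching should be case-insensitive; contains_any_word and skip_words are illustrative names:

```python
import re


def contains_any_word(paragraph, words, skip_words=()):
    """Return True if any word from `words` occurs as a whole word."""
    # Build the hash set once: O(length of paragraph).
    paragraph_words = set(re.findall(r"[a-z']+", paragraph.lower()))
    skipped = {w.lower() for w in skip_words}
    # Each lookup is O(1) on average.
    return any(
        w.lower() in paragraph_words
        for w in words
        if w.lower() not in skipped
    )


text = "The quick brown fox jumps over the lazy dog."
print(contains_any_word(text, ["cat", "fox"]))
```

Unlike a substring test such as indexOf, this matches whole words only, so "cat" does not match inside "category".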
In C# I usually use LINQ (LINQ to Objects) for querying a list and getting the result.
This is my code:
private bool findMatch(List<String> list, String param, List<String> skipChars)
{
    if (skipChars == null)
        skipChars = new List<string>();
    var c = (from l in list.Except(skipChars)
             where param.IndexOf(l) != -1
             select l).Count();
    return c != 0;
}

Item-by-item list comparison, updating each item with its result (no third list)

The solutions I have found so far in my research on comparing lists of objects have usually generated a new list of objects, say of those items existing in one list, but not in the other. In my case, I want to compare two lists to discover the items whose key exists in one list and not the other (comparing both ways), and for those keys found in both lists, checking whether the value is the same or different.
The object being compared has multiple properties that constitute the key, plus a property that constitutes the value, and finally, an enum property that describes the result of the comparison, e.g., {Equal, NotEqual, NoMatch, NotYetCompared}. So my object might look like:
class MyObject
{
    //Key combination
    string columnA;
    string columnB;
    decimal columnC;
    //The Value
    decimal columnD;
    //Enum for comparison, used for styling the item (value hidden from UI)
    //Alternatively...this could be a string type, holding the enum.ToString()
    MyComparisonEnum result;
}
These objects are collected into two ObservableCollection<MyObject> instances to be compared. Both lists are presented in the UI in data grids, with the rows styled based on the comparison result enum, so the user can easily see which keys are in the new dataset but not in the old, and vice versa, along with those keys that are in both datasets with a different value.
Would LINQ be a suitable tool to solve this efficiently, or should I use loops to scan the lists and break out when the key is found, etc. (a solution like this comes naturally to me from my procedural programming background)... or some other method?
Thank you!
You can use Except and Intersect:
var list1 = new List<MyObject>();
var list2 = new List<MyObject>();
// initialization code
var notIn2 = list1.Except(list2);
var notIn1 = list2.Except(list1);
var both = list1.Intersect(list2);
To find objects with different values (columnD) you can use this (quite efficient) LINQ query:
var diffValue = from o1 in list1
                join o2 in list2
                on new { o1.columnA, o1.columnB, o1.columnC } equals new { o2.columnA, o2.columnB, o2.columnC }
                where o1.columnD != o2.columnD
                select new { Object1 = o1, Object2 = o2 };
foreach (var diff in diffValue)
{
    MyObject obj1 = diff.Object1;
    MyObject obj2 = diff.Object2;
    Console.WriteLine("Obj1-Value:{0} Obj2-Value:{1}", obj1.columnD, obj2.columnD);
}
Except and Intersect compare the objects by their key only when you override Equals and GetHashCode appropriately:
class MyObject
{
    //Key combination
    string columnA;
    string columnB;
    decimal columnC;
    //The Value
    decimal columnD;
    //Enum for comparison, used for styling the item (value hidden from UI)
    //Alternatively...this could be a string type, holding the enum.ToString()
    MyComparisonEnum result;
    public override bool Equals(object obj)
    {
        if (obj == null || !(obj is MyObject)) return false;
        MyObject other = (MyObject)obj;
        // static string.Equals tolerates null fields, like GetHashCode below
        return string.Equals(columnA, other.columnA) && string.Equals(columnB, other.columnB) && columnC.Equals(other.columnC);
    }
    public override int GetHashCode()
    {
        // only the key columns take part, so equal keys give equal hashes
        int hash = 19;
        hash = hash + (columnA ?? "").GetHashCode();
        hash = hash + (columnB ?? "").GetHashCode();
        hash = hash + columnC.GetHashCode();
        return hash;
    }
}
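The question asks for the result to be written back onto each item rather than collected in a third list. One way to do that is to index one list by key and walk the other once. A minimal Python sketch of that idea, assuming keys are unique within each list; Row, Result and compare_in_place are illustrative names that mirror MyObject and MyComparisonEnum:

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class Result(Enum):
    NOT_YET_COMPARED = 0
    EQUAL = 1
    NOT_EQUAL = 2
    NO_MATCH = 3


@dataclass
class Row:
    column_a: str
    column_b: str
    column_c: Decimal
    column_d: Decimal  # the value
    result: Result = Result.NOT_YET_COMPARED

    def key(self):
        return (self.column_a, self.column_b, self.column_c)


def compare_in_place(old_rows, new_rows):
    """Set .result on every row of both lists; no third list is built."""
    old_by_key = {row.key(): row for row in old_rows}
    matched = set()
    for new in new_rows:
        old = old_by_key.get(new.key())
        if old is None:
            new.result = Result.NO_MATCH  # key only in the new list
            continue
        matched.add(new.key())
        same = old.column_d == new.column_d
        old.result = new.result = Result.EQUAL if same else Result.NOT_EQUAL
    for key, old in old_by_key.items():
        if key not in matched:
            old.result = Result.NO_MATCH  # key only in the old list
```

Both lists are traversed once, so the work grows linearly with their combined size instead of with the product of their sizes, as nested loops would.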
