Data Structure for Multi-Level Traversal

I have an application that deals with data in the following structure:
struct Message
{
    int time;
    string name;
    string details;
};
For example, I may have a data set that looks like the following:
9:00:00 Bob <Info>
9:01:00 John <Info>
9:05:00 Bob <Info>
9:11:00 Mary <Info>
9:17:00 John <Info>
9:25:00 Mary <Info>
9:30:00 Bob <Info>
And I will have a list of Message structures that represent each line in the data set.
Some operations I will need to do on this data include:
Collect all data in chronological order and dostuff()
Collect all data from John (or whoever) in chronological order and dostuff()
So, I need a way to traverse the list such that I can pass through every message in chronological order, and also choose a person and pass through only their messages in chronological order.
My thoughts are to have a struct like this:
struct Node
{
    Message* message;
    Node* next_time;
    Node* next_name;
};
In which next_time points to the next Node in chronological order, and next_name points to the next Node that belongs to message->name. And a Root structure points to the first of each type.
struct Root
{
    Node* first_time;
    Node* first_bob;
    Node* first_john;
    Node* first_mary;
    Node* last_time;
    Node* last_bob;
    Node* last_john;
    Node* last_mary;
};
This structure allows me to fairly easily traverse through every message, or through only Bob's messages, or only John's, etc.
However, I am worried that maybe this is more complicated than it needs to be. I also have concerns about maintenance (see below). I need the search/select/read operations to be pretty fast, which I think they are. And I need insert operations to be reasonably fast. But right now, for every Message I insert, I must (1) update some next_time pointer and (2) update some next_name pointer.
My question is:
Does a data structure exist that provides this type of functionality already? If not, am I approaching this problem correctly?
Please provide any code samples in C++ or C#, if possible.
Thanks.
Additional: Suppose later I want to add to my Message struct. Let's say I add a field called City. Now, I may want to do this:
Collect all data from a specific City in chronological order and dostuff()
This would require adding a next_city, and then for every insert, I would have to update next_time, next_name, AND next_city.
Further, suppose I want to do this:
Collect all data from a specific City AND a specific name in chronological order and dostuff()
I think this makes the problem considerably more difficult, unless I opt to traverse every Message and skip the ones I don't care about.

A simple linked list of all messages (sorted by time).
struct Node
{
    Message* message;
    Node* next_time;   // next message in chronological order
};
This satisfies requirement 1: with a tail pointer, add() appends in O(1), and getAll() simply hands back the head of the list in O(1) (walking the whole list is of course O(n)).
A separate hashmap with the user as key and a list of Message* as values.
Hashmap
{key = User, value = List(Message*)}
This satisfies requirement 2: appending a new entry to the end of the specific user's list is O(1), and getAllOfUser() hands back that list in O(1).
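For concreteness, here is a minimal sketch of that layout in C# (the question allows C# samples). The class and member names are mine, and it assumes messages arrive already ordered by time, so a plain append keeps both views chronological:
using System;
using System.Collections.Generic;

class Message
{
    public int Time;           // e.g. seconds since midnight
    public string Name;
    public string Details;
}

class MessageStore
{
    // Requirement 1: every message, in chronological order.
    private readonly LinkedList<Message> _all = new LinkedList<Message>();

    // Requirement 2: each user's messages, in chronological order.
    private readonly Dictionary<string, List<Message>> _byUser =
        new Dictionary<string, List<Message>>();

    public void Add(Message m)                      // O(1) append to both views
    {
        _all.AddLast(m);
        if (!_byUser.TryGetValue(m.Name, out var list))
            _byUser[m.Name] = list = new List<Message>();
        list.Add(m);
    }

    public IEnumerable<Message> GetAll() => _all;   // O(1) to obtain; walking it is O(n)

    public IEnumerable<Message> GetAllOfUser(string name) =>
        _byUser.TryGetValue(name, out var list)
            ? (IEnumerable<Message>)list
            : Array.Empty<Message>();
}
The later City requirement is then just one more dictionary keyed by city, filled in the same Add call; a combined name-and-city query can walk the shorter of the two lists and filter by the other field.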

I would probably create a class to represent a user, storing the users in a hash table keyed by some identifier such as the name. Each user then holds a list of its Messages sorted in chronological order, and every Message is also stored in a single global list holding everyone's Messages in chronological order. Each added Message has to be inserted once into each collection, which could be O(log n) time, or as bad as O(n), depending on the data structure.
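A rough sketch of that arrangement in C# (the names are mine, and it reuses the Message type from the sketch above). A SortedList keyed by time keeps each collection chronological even when messages arrive out of order, at the cost of an O(log n) search plus an O(n) worst-case shift per insert; it also assumes timestamps are unique within a collection, since SortedList rejects duplicate keys:
using System.Collections.Generic;

class User
{
    public string Name;
    public SortedList<int, Message> Messages = new SortedList<int, Message>();
}

class MessageIndex
{
    private readonly Dictionary<string, User> _users = new Dictionary<string, User>();
    private readonly SortedList<int, Message> _allByTime = new SortedList<int, Message>();

    public void Add(Message m)
    {
        _allByTime.Add(m.Time, m);                  // global chronological view
        if (!_users.TryGetValue(m.Name, out var u))
            _users[m.Name] = u = new User { Name = m.Name };
        u.Messages.Add(m.Time, m);                  // per-user chronological view
    }

    public IList<Message> MessagesFor(string name) =>
        _users[name].Messages.Values;               // already sorted by time
}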

Related

Extracting all children belonging to a specific parent in GraphQL

I am using GraphQL and Java. I need to extract all the children belonging to a specific parent. I have used the approach below, but it fetches only the parent and none of the children.
schema {
    query: Query
}
type LearningResource {
    id: ID
    name: String
    type: String
    children: [LearningResource]
}
type Query {
    fetchLearningResource: LearningResource
}
@Component
public class LearningResourceDataFetcher implements DataFetcher {
    @Override
    public LearningResource get(DataFetchingEnvironment dataFetchingEnvironment) {
        LearningResource lr3 = new LearningResource();
        lr3.setId("id-03");
        lr3.setName("Resource-3");
        lr3.setType("Book");
        LearningResource lr2 = new LearningResource();
        lr2.setId("id-02");
        lr2.setName("Resource-2");
        lr2.setType("Paper");
        LearningResource lr1 = new LearningResource();
        lr1.setId("id-01");
        lr1.setName("Resource-1");
        lr1.setType("Paper");
        List<LearningResource> learningResources = new ArrayList<>();
        learningResources.add(lr2);
        learningResources.add(lr3);
        lr1.setChildren(learningResources);
        return lr1;
    }
}
return RuntimeWiring.newRuntimeWiring()
    .type("Query", typeWiring -> typeWiring.dataFetcher("fetchLearningResource", learningResourceDataFetcher))
    .build();
My Controller endpoint
@RequestMapping(value = "/queryType", method = RequestMethod.POST)
public ResponseEntity query(@RequestBody String query) {
    System.out.println(query);
    ExecutionResult result = graphQL.execute(query);
    System.out.println(result.getErrors());
    System.out.println(result.getData().toString());
    return ResponseEntity.ok(result.getData());
}
My request would be like below
{
    fetchLearningResource
    {
        name
    }
}
Can anybody please help me sort this out?
Because I get asked this question a lot in real life, I'll answer it in detail here so people have an easier time googling (and I have something to point at).
As noted in the comments, the selection for each level has to be explicit and there is no notion of an infinitely recursive query like get everything under a node to the bottom (or get all children of this parent recursively to the bottom).
The reason is mostly that allowing such queries could easily put you in a dangerous situation: a user would be able to request the entire object graph from the server in one easy go! For any non-trivial data size, this would kill the server and saturate the network in no time. Additionally, what would happen once a recursive relationship is encountered?
Still, there is a semi-controlled escape-hatch you could use here. If the scope in which you need everything is limited (and it really should be), you could map the output type of a specific query as a (complex) scalar.
In your case, this would mean mapping LearningResource as a scalar. Then, fetchLearningResource would effectively be returning a JSON blob, where the blob would happen to be all the children and their children recursively. Query resolution doesn't descend deeper once a scalar field is reached, as scalars are leaf nodes, so it can't keep resolving the children level-by-level. This means you'd have to recursively fetch everything in one go, by yourself, as the GraphQL engine can't help you here. It also means sub-selections become impossible (as scalars can't have sub-selections - again, they're leaf nodes), so the client would always get all the children and all the fields from each child back. If you still need the ability to limit the selection in certain cases, you can expose 2 different queries, e.g. fetchLearningResource and fetchAllLearningResources, where the former would be mapped as it is now, and the latter would return the scalar as explained.
An object scalar implementation is provided by the graphql-java ExtendedScalars project.
The schema could then look like:
schema {
query: Query
}
scalar Object
type Query {
fetchLearningResource: Object
}
And you'd use the method above to produce the scalar implementation:
RuntimeWiring.newRuntimeWiring()
.scalar(ExtendedScalars.Object) //register the scalar impl
.type("Query", typeWiring -> typeWiring.dataFetcher("fetchLearningResource", learningResourceDataFetcher)).build();
Depending on how you process the results of this query, the DataFetcher for fetchLearningResource may need to turn the resulting object into a map-of-maps (JSON-like object) before returning to the client. If you simply JSON-serialize the result anyway, you can likely skip this. Note that you're side-stepping all safety mechanisms here and must take care not to produce enormous results. By extension, if you need this in many places, you're very likely using a completely wrong technology for your problem.
I have not tested this with your code myself, so I might have skipped something important, but this should be enough to get you (or anyone googling) onto the right track (if you're sure this is the right track).
UPDATE: I've seen someone implement a custom Instrumentation that rewrites the query immediately after it's parsed, and adds all fields to the selection set if no field had already been selected, recursively. This effectively allows them to select everything implicitly.
In graphql-java v11 and prior, you could mutate the parsed query (represented by the Document class). As of v12 this is no longer possible, but instrumentations in turn gain the ability to replace the Document explicitly via the new instrumentDocument method.
Of course, this only makes sense if your schema is such that it cannot be exploited, or you fully control the client so there's no danger. You could also do it selectively for some types only, but that would be extremely confusing to use.

data structure for a file system--interview

I've encountered the following interview question online. Based on my understanding, it asks you to design a data structure that simulates a file system. Can anyone give me some hints?
// addMapping("/foo/bar/x", "XController")
// addMapping("/foo/bar/z", "ZController")
// addMapping("/foo/baz", "BazController");
// getMapping("/foo/bar/x") -> ["XController"]
// getMapping("/foo/bar") -> ["XController", "ZController"]
public void addMapping(String path, String destination) {
    // candidate TODO
}
public List<String> getMapping(String path) {
    // candidate TODO
}
I think the best structure for this mapping is a trie, or even better its compressed version, a Patricia tree (a.k.a. radix tree). The idea is the following: both structures store the prefixes of the paths, the way a dictionary trie stores prefixes of words. When a user queries a given path, you traverse the structure (be it a trie or a radix tree) according to the query string. After that, you walk the subtree under the node where you end up and print all the controllers associated with the nodes in it.
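The stubs above are Java, but here is a quick, purely illustrative sketch of the uncompressed trie variant in C#: each path segment becomes a node, AddMapping marks the final node with its destination, and GetMapping walks to the queried node and then collects every destination in its subtree. (A Patricia/radix tree would additionally merge chains of single-child nodes, but the traversal logic is the same.)
using System.Collections.Generic;

class PathTrie
{
    private class Node
    {
        public Dictionary<string, Node> Children = new Dictionary<string, Node>();
        public string Destination;                  // null for purely intermediate nodes
    }

    private readonly Node _root = new Node();

    public void AddMapping(string path, string destination)
    {
        var node = _root;
        foreach (var part in path.Split('/'))
        {
            if (part.Length == 0) continue;         // skip the empty segment before the leading '/'
            if (!node.Children.TryGetValue(part, out var child))
                node.Children[part] = child = new Node();
            node = child;
        }
        node.Destination = destination;
    }

    public List<string> GetMapping(string path)
    {
        var node = _root;
        foreach (var part in path.Split('/'))
        {
            if (part.Length == 0) continue;
            if (!node.Children.TryGetValue(part, out node))
                return new List<string>();          // no such prefix
        }
        var result = new List<string>();
        Collect(node, result);                      // gather everything under this node
        return result;
    }

    private void Collect(Node node, List<string> result)
    {
        if (node.Destination != null) result.Add(node.Destination);
        foreach (var child in node.Children.Values) Collect(child, result);
    }
}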

Linked List Class in MATLAB - Insert node manually without insertAfter()

I am trying to use the Linked List Class implementation in MATLAB.
Now, it says the only way to insert a node into the list is by using insertBefore() or insertAfter().
But I want to insert the node manually, by specifying the Next value of the new node, like
newnode = dlnode(new);
ptr.Next = newnode;
newnode.Next = ptrnxt;
Would this work? I cannot use the insertBefore() or insertAfter() in my particular application since I am not maintaining a pointer of the current node.
Details regarding the linked list class are given here.
No, I don't think that would work with just those three lines of code, because it ignores all the other logic needed to make sure that Next and Prev are set for newnode, and it skips the update to ptrnxt that would be needed so that its Prev is now newnode. (Also, the Next property is private, so you would have to change that to public…)
It is unclear what you mean by "I cannot use the insertBefore() or insertAfter() in my particular application since I am not maintaining a pointer of the current node", yet you have the nodes that you want to insert newnode between. I'm guessing that the order of your nodes (prior to the insertion of the new one) is …, ptr, ptrnxt, …. Then why not just use
newnode.insertAfter(ptr);
which would change the order to …,ptr,newnode,ptrnxt,… and all three properties for each node would be set correctly/automatically.
Else, you would have to change your code to something like
newnode = dlnode(new);
ptr.Next = newnode;
newnode.Prev = ptr; % to make sure that newnode points back to ptr
newnode.Next = ptrnxt;
ptrnxt.Prev = newnode; % to make sure that ptrnxt points back to newnode
Much easier and safer to use the insertAfter and insertBefore methods.

Efficient implementation of immutable (double) LinkedList

Having read the question Immutable or not immutable? and the answers to my previous questions on immutability, I am still a bit puzzled about an efficient implementation of a simple LinkedList that is immutable. For an array that seems easy - copy the array and return a new structure based on that copy.
Supposedly we have a general class of Node:
class Node {
    private Object value;
    private Node next;
}
And a class LinkedList based on the above, allowing the user to add, remove, etc. Now, how would we ensure immutability? Should we recursively copy all the references to the list when we insert an element?
I am also curious about the answers in Immutable or not immutable? that mention a certain optimization leading to O(log n) time and space with the help of a binary tree. Also, I read somewhere that adding an element to the front is O(1) as well. This puzzles me greatly: if we don't provide a copy of the references, then in reality we are modifying the same data structure from two different sources, which breaks immutability...
Would any of your answers also work on doubly-linked lists? I look forward to any replies/pointers to other questions/solutions. Thanks in advance for your help.
Supposedly we have a general class of Node and a class LinkedList based on the above, allowing the user to add, remove, etc. Now, how would we ensure immutability?
You ensure immutability by making every field of the object readonly, and ensuring that every object referred to by one of those readonly fields is also an immutable object. If the fields are all readonly and only refer to other immutable data, then clearly the object will be immutable!
Should we recursively copy all the references to the list when we insert an element?
You could. The distinction you are getting at here is the difference between immutable and persistent. An immutable data structure cannot be changed. A persistent data structure takes advantage of the fact that a data structure is immutable in order to re-use its parts.
A persistent immutable linked list is particularly easy:
abstract class ImmutableList
{
    public static readonly ImmutableList Empty = new EmptyList();
    private ImmutableList() {}
    public abstract int Head { get; }
    public abstract ImmutableList Tail { get; }
    public abstract bool IsEmpty { get; }
    public abstract ImmutableList Add(int head);

    private sealed class EmptyList : ImmutableList
    {
        public override int Head { get { throw new Exception(); } }
        public override ImmutableList Tail { get { throw new Exception(); } }
        public override bool IsEmpty { get { return true; } }
        public override ImmutableList Add(int head)
        {
            return new List(head, this);
        }
    }

    private sealed class List : ImmutableList
    {
        private readonly int head;
        private readonly ImmutableList tail;
        public List(int head, ImmutableList tail)   // needed so Add can build the new cell
        {
            this.head = head;
            this.tail = tail;
        }
        public override int Head { get { return head; } }
        public override ImmutableList Tail { get { return tail; } }
        public override bool IsEmpty { get { return false; } }
        public override ImmutableList Add(int head)
        {
            return new List(head, this);
        }
    }
}
...
ImmutableList list1 = ImmutableList.Empty;
ImmutableList list2 = list1.Add(100);
ImmutableList list3 = list2.Add(400);
And there you go. Of course you would want to add better exception handling and more methods, like IEnumerable<int> methods. But there is a persistent immutable list. Every time you make a new list, you re-use the contents of an existing immutable list; list3 re-uses the contents of list2, which it can do safely because list2 is never going to change.
Would any of your answers also work on doubly-linked lists?
You can of course easily make a doubly-linked list that does a full copy of the entire data structure every time, but that would be dumb; you might as well just use an array and copy the entire array.
Making a persistent doubly-linked list is quite difficult but there are ways to do it. What I would do is approach the problem from the other direction. Rather than saying "can I make a persistent doubly-linked list?" ask yourself "what are the properties of a doubly-linked list that I find attractive?" List those properties and then see if you can come up with a persistent data structure that has those properties.
For example, if the property you like is that doubly-linked lists can be cheaply extended from either end, cheaply broken in half into two lists, and two lists can be cheaply concatenated together, then the persistent structure you want is an immutable catenable deque, not a doubly-linked list. I give an example of an immutable non-catenable deque here:
http://blogs.msdn.com/b/ericlippert/archive/2008/02/12/immutability-in-c-part-eleven-a-working-double-ended-queue.aspx
Extending it to be a catenable deque is left as an exercise; the paper I link to on finger trees is a good one to read.
UPDATE:
According to the above, we need to copy the prefix up to the insertion point. By the logic of immutability, if we delete anything from the prefix we get a new list, just as we would for the suffix... Why copy only the prefix then, and not the suffix?
Well consider an example. What if we have the list (10, 20, 30, 40), and we want to insert 25 at position 2? So we want (10, 20, 25, 30, 40).
What parts can we reuse? The tails we have in hand are (20, 30, 40), (30, 40) and (40). Clearly we can re-use (30, 40).
Drawing a diagram might help. We have:
10 ----> 20 ----> 30 -----> 40 -----> Empty
and we want
10 ----> 20 ----> 25 -----> 30 -----> 40 -----> Empty
so let's make
10 ----> 20 --------------> 30 -----> 40 -----> Empty
                           /
10 ----> 20 ----> 25 -----/
We can re-use (30, 40) because that part is in common to both lists.
UPDATE:
Would it be possible to provide the code for random insertion and deletion as well?
Here's a recursive solution:
ImmutableList InsertAt(int value, int position)
{
    if (position < 0)
        throw new Exception();
    else if (position == 0)
        return this.Add(value);
    else
        return tail.InsertAt(value, position - 1).Add(head);
}
Do you see why this works?
Now as an exercise, write a recursive DeleteAt.
Now, as an exercise, write a non-recursive InsertAt and DeleteAt. Remember, you have an immutable linked list at your disposal, so you can use one in your iterative solution!
Should we recursively copy all the references to the list when we insert an element?
You should recursively copy the prefix of the list up until the insertion point, yes.
That means that insertion into an immutable linked list is O(n). (As is inserting, rather than overwriting, an element in an array.)
For this reason insertion is usually frowned upon (along with appending and concatenation).
The usual operation on immutable linked lists is "cons", i.e. appending an element at the start, which is O(1).
You can see clearly the complexity in e.g. a Haskell implementation. Given a linked list defined as a recursive type:
data List a = Empty | Node a (List a)
we can define "cons" (inserting an element at the front) directly as:
cons a xs = Node a xs
Clearly an O(1) operation. Insertion, on the other hand, must be defined recursively: find the insertion point, copy the prefix of the list up to it, and link that copied prefix through the new node to a reference to the (immutable, shared) tail.
The important thing to remember about linked lists is:
linear access
For immutable lists this means:
copying the prefix of a list
sharing the tail.
If you are frequently inserting new elements, a log-based structure , such as a tree, is preferred.
There is a way to emulate "mutation": using immutable maps.
For a linked list of Strings (in Scala style pseudocode):
case class ListItem(s:String, id:UUID, nextID: UUID)
then the ListItems can be stored in a map where the key is UUID:
type MyList = Map[UUID, ListItem]
If I want to insert a new ListItem into val list : MyList :
def insertAfter(l: MyList, e: ListItem) = {
  val beforeE = l.getElementBefore(e)            // element that will precede e
  val afterE = l.getElementAfter(e)              // element that will follow e
  val eToInsert = e.copy(nextID = afterE.id)     // e now points at its successor
  val beforeE_new = beforeE.copy(nextID = e.id)  // predecessor now points at e
  val l_tmp = l.update(beforeE.id, beforeE_new)
  l_tmp.add(eToInsert)
}
Where add, update and get take effectively constant time using a Map: http://docs.scala-lang.org/overviews/collections/performance-characteristics
Implementing a doubly linked list works similarly.

Can we sort an IList partially?

IList<A_Desc,A_premium,B_Desc,B_Premium>
Can I sort the two columns A_Desc, A_premium... based on A_Desc?
And let B_Desc, B_Premium remain in the same order as before sorting?
First off, a list can only be of one type, and only has one "column" of data, so you actually want two lists and a data type that holds "desc" and "premium". "desc" sounds like a String to me; I don't know what Premium is, but I'll pretend it's a double for lack of better ideas. I don't know what this data is supposed to represent, so to me, it's just some thingie.
public class Thingie {
    public String desc;
    public double premium;
}
That is, of course, a terrible way to define the class - I should instead have desc and premium be private, and Desc and Premium as public Properties with Get and Set methods. But this is the fastest way for me to get the point across.
It's more canonical to make Thingie implement IComparable, and compare itself to other Thingie objects. But I'm editing an answer I wrote before I knew you needed to write a custom type, and had the freedom to just make it implement IComparable. So here's the IComparer approach, which lets you sort objects that don't sort themselves by telling C# how to sort them.
Implement an IComparer that operates over your custom type.
public class ThingieSorter : IComparer<Thingie>
{
    public int Compare(Thingie t1, Thingie t2)
    {
        int r = t1.desc.CompareTo(t2.desc);
        if (r != 0) { return r; }
        return t1.premium.CompareTo(t2.premium);
    }
}
C# doesn't require IList to implement Sort - it might be inefficient if it's a LinkedList. So let's make a new list, based on arrays, which does sort efficiently, and sort it:
public List<Thingie> sortedOf(IList<Thingie> list)
{
    List<Thingie> ret = new List<Thingie>(list);
    ret.Sort(new ThingieSorter());
    return ret;
}
List<Thingie> implements the interface IList<Thingie>, so replacing your original list with this one shouldn't break anything, as long as you have nothing holding onto the original list and magically expecting it to be sorted. If that's happening, refactor your code so it doesn't grab the reference until after your list has been sorted, since it can't be sorted in place.
