Determining whether a singly-linked list has a loop is a common interview question. But why would a linked list exist with a loop? When would a linked list with a loop be useful?
Circularly linked lists can be useful when you want to iterate through the entire list starting from an arbitrary node, or insert at any position. They simplify the algorithms for those operations because you don't have to treat the beginning or end of the list as a special case.
It then doesn't matter, for example, which of the elements you pass to the function as a parameter. When using the sentinel pattern, you will iterate over every element all the same.
Apache uses circularly linked lists to store all its filters.
For some reason they gave it a fancy marketing name ("Bucket Brigades") and call it "a major innovation" - lol.
To find uses for circularly linked lists, think about structures that are circular by nature: a pool of buffers that are used and released in FIFO order, a set of processes that should be time-shared in round-robin order, or the coordinates of a polygon's vertices stored in order.
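As a rough illustration (a Java sketch, not taken from any of the answers above; the class and method names are made up), iterating a circular singly-linked list can start at any node, with no special case for a head or tail:

```java
// Minimal sketch of a circular singly-linked list (hypothetical names).
class CircularList {
    static class Node<T> {
        T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    // Visit every element starting from an arbitrary node: no head/tail
    // special cases, just stop once we come back around to the start.
    static <T> void visitAll(Node<T> start, java.util.function.Consumer<T> action) {
        if (start == null) return;
        Node<T> current = start;
        do {
            action.accept(current.value);
            current = current.next;
        } while (current != start);
    }
}
```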
Many answers given here consider a completely looped (circular) linked list.
The interview question is a bit different: we have to detect a loop, which can be partial, somewhere in the list.
E.g. (sorry for the poor graphics):
1 -> 2 -> 3 -> 4 -> 5
          ^         |
          |         v
          7 <------ 6
How can it happen? Link corruption, for example:
a) People trying to break in through UGC (User-Generated Content); bad data can come in.
b) Design flaws in the early stages.
This loop detection is also part of testing we do on LinkedLists to look for suspected bugs.
You've not asked for solutions, but I'll put a few here in brief:
Keep pushing visited nodes into a list (an ArrayList in Java). The moment you find a duplicate, a loop is detected.
Tortoise-and-hare approach: the tortoise moves one node at a time, the hare two at a time. They meet only if there is a loop (see the sketch after this list).
Store a boolean flag *visited* in each node as you traverse. Encountering *true* means a loop.
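For reference, here is a minimal Java sketch of the tortoise-and-hare idea (the Node class is hypothetical):

```java
// Floyd's tortoise-and-hare: slow moves one node per step, fast moves two.
// If there is a loop anywhere in the list, the two pointers eventually meet inside it.
class LoopDetector {
    static class Node {
        int value;
        Node next;
    }

    static boolean hasLoop(Node head) {
        Node slow = head, fast = head;
        while (fast != null && fast.next != null) {
            slow = slow.next;          // one step
            fast = fast.next.next;     // two steps
            if (slow == fast) return true;  // met inside the loop
        }
        return false;                  // reached the end: no loop
    }
}
```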
Access to nodes in a linked list can get pretty slow if the list gets large. I did think of a way to speed up the access: keep a separate structure (it could itself be a linked list) with shortcuts to every 100th node. This way, if I want to get the 205th element, the program has to go through this path: shortcut to [100] -> shortcut to [200] -> [201] -> ... -> [205]. This is much faster than going through the whole list to the 205th element: 5 steps instead of 204. Yes, it gets slower if I want the n-hundred-and-99th element, but the program still skips a large part of the list to get there, so it's faster in the long run.
But those shortcuts require readjusting after adding and removing elements. Removing isn't a real problem: remove an element and set certain shortcuts to point to the next nodes, namely the shortcuts that pointed at the former n-hundredth nodes. Adding more data is a problem: when adding a new element, certain shortcuts must be moved back to point to the previous nodes. In order to get to those nodes, the program must go all the way through the list, starting from the last shortcut that still points at an n-hundredth element. Unless the nodes also point to their previous elements, the whole process can get as slow as removing an element from a vector.
Is there a way to speed up the access while keeping adding and removing elements fairly fast? This is just out of curiosity, not a question of whether it's a good idea to use this in a real program.
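For concreteness, the shortcut-to-every-100th-node lookup described above might look roughly like this (a sketch only; the types, and the assumption that the shortcut index is fully populated, are mine):

```java
// Sketch: a singly-linked list plus an index of shortcuts to every 100th node.
class IndexedList {
    static class Node {
        int value;
        Node next;
    }

    // Assumption: shortcuts.get(k) points at the node with index k * 100,
    // and shortcuts.get(0) is the head of the list.
    java.util.List<Node> shortcuts = new java.util.ArrayList<>();

    int get(int n) {
        Node current = shortcuts.get(n / 100);        // jump to the nearest preceding shortcut
        for (int i = (n / 100) * 100; i < n; i++) {   // then walk at most 99 links
            current = current.next;
        }
        return current.value;
    }
}
```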
You're using the wrong data structure. Linked lists are best used for sequentially-accessed lists, not for randomly-accessed collections. For that, you're better off with a hash table of some sort.
I'll get straight to it. I'm working on a web or phone app that is responsible for scheduling. I want students to input courses they've taken, and I give them possible combinations of courses they should take that fit their requirements.
However, let's say there are 150 courses that fit their requirements and they're looking for 3 courses. That would be 150C3 combinations, right?
Would it be feasible to run something like this in browser or a mobile device?
First of all, you need a smarter algorithm that can prune the search tree. Also, if you are doing this for the same set of courses over and over again, doing the computation on the server would be better, and perhaps precomputing a suitable data structure can reduce the execution time of the queries. For example, you can create a tree where each sub-tree under a node contains nodes that are 'compatible'.
Sounds to me like you're viewing this completely wrong. At most institutions there are 1) curriculum requirements for graduation, and 2) prerequisites for many requirements and electives. This isn't a pure combinatorial problem, it's a dependency tree. For instance, if Course 201, Course 301, and Course 401 are all required for the student's major, higher numbers have the lower numbered ones as prereqs, and the student is a Junior, you should be strongly recommending that Course 201 be taken ASAP.
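A hedged sketch of that dependency-driven view (the data structures and names are illustrative, not from the answer): recommend only courses whose prerequisites the student has already completed.

```java
import java.util.*;

// Sketch: filter candidate courses down to those whose prerequisites are satisfied.
class Recommender {
    // prereqs.get("Course 301") -> ["Course 201"], etc. (hypothetical data)
    static List<String> eligible(Map<String, List<String>> prereqs,
                                 Set<String> completed,
                                 Collection<String> candidates) {
        List<String> result = new ArrayList<>();
        for (String course : candidates) {
            List<String> needed = prereqs.getOrDefault(course, List.of());
            // Recommend only courses not yet taken whose prereqs are all completed.
            if (!completed.contains(course) && completed.containsAll(needed)) {
                result.add(course);
            }
        }
        return result;
    }
}
```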
Yay, mathematics I think I can handle!
If there are 150 courses and you have to choose 3, then the number of possibilities is (150*149*148)/(3*2) = 551,300 (correction per Jerry), which is certainly better than 150 factorial, which has a whole lot more zeros ;)
Now, you really don't want to build an array of that size, and you don't have to! All web languages can pick a random element from an array, so just keep the 150 courses in an array and request 3 random, unique entries from it.
While the number of potential course combinations is very large, based on your post I see no reason to even attempt to calculate them. Randomly selecting k items from an n-sized list is delightfully trivial, even for old, slow devices!
Is there any particular reason you'd need to calculate all the potential course combinations, instead of just grab-bagging one random selection as a suggestion? If not, problem solved!
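A minimal sketch of that grab-bag idea in Java (assuming the eligible courses are already in a list):

```java
import java.util.*;

// Sketch: pick 3 random, distinct courses out of the ~150 that fit the requirements.
class GrabBag {
    static List<String> suggest(List<String> eligibleCourses, int k) {
        List<String> bag = new ArrayList<>(eligibleCourses); // copy so we don't disturb the original
        Collections.shuffle(bag);                            // uniform random permutation
        return bag.subList(0, Math.min(k, bag.size()));      // first k entries are a random selection
    }
}
```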
Option 1 (time/space costly): let the user on the mobile phone browse the list of roughly 550,000 possible combinations, page by page, with the processing done server-side.
Option 2 (simple): instead of the huge decision tree, provide a 150-item bag; when the student chooses an item from the bag, remove it from the bag.
Option 3 (complex): expand your decision tree (possible choices) using a dependency tree (a parent course requires its child courses), the list of courses already taken by the student, and his track/level.
As far as I know, most educational systems use the third option, which requires having a profile for the student.
I have algorithms that work with dynamically growing lists (contiguous memory, like a C++ vector, Java ArrayList, or C# List). Until recently, these algorithms would insert new values into the middle of the lists. Of course, this was usually a very slow operation: every time an item was added, all the items after it needed to be shifted to a higher index. Do this a few times for each algorithm and things get really slow.
My realization was that I could add the new items to the end of the list and then rotate them into position later. That's one option!
Another option, when I know how many items I'm adding ahead of time, is to add that many items to the back, shift the existing items, and then perform the algorithm in place in the hole I've made for myself. The downside is that I have to add some default values to the end of the list and then just overwrite them.
I did a quick analysis of these options and concluded that the second option is more efficient. My reasoning was that the rotation in the first option would result in in-place swaps (each requiring a temporary). My only concern with the second option is that I am creating a bunch of default values that just get thrown away. Most of the time, these default values will be null or a zero-filled value type.
However, I'd like someone else familiar with algorithms to tell me which approach would be faster. Or, perhaps there's an even more efficient solution I haven't considered.
Arrays aren't efficient for lots of insertions or deletions into anywhere other than the end of the array. Consider whether using a different data structure (such as one suggested in one of the other answers) may be more efficient. Without knowing the problem you're trying to solve, it's near-impossible to suggest a data structure (there's no one solution for all problems). That being said...
The second option is definitely the better of the two. A somewhat better variant that avoids the default-value issue: given, say, 0123789 with 456 to be inserted in the middle, simply copy 789 to the end and overwrite the middle 789 with 456, so the only intermediate step is 0123789789.
Your default-value concern is, however, (generally) not a big issue:
In Java, for one, you cannot (to my knowledge) even allocate an array that isn't 0- or null-filled. The C++ STL containers also enforce this, I believe (but not raw C++ arrays).
The size of a pointer compared to any moderately sized class is minimal, so assigning it a default value also takes minimal time. In Java and C# everything is a reference; in C++ you can use pointers (something like boost::shared_ptr or a pointer-vector is preferable to raw pointers). This doesn't apply to primitives, but those are small to start with, so they're generally not a big issue either.
I'd also suggest forcing a reallocation to a specified size before you start inserting to the end of the array (Java's ArrayList::ensureCapacity or C++'s vector::reserve). In case you didn't know - varying-length-array implementations tend to have an internal array that's bigger than what size() returns or what's accessible (in order to prevent constant reallocation of memory as you insert or delete values).
Also note that there are more efficient methods to copy parts of an array than doing it manually with for loops (e.g. Java's System.arraycopy).
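A rough Java sketch of that copy-the-tail idea for a plain array (the sizes, names, and int element type are made up); it reproduces the 0123789789 intermediate step from the answer above:

```java
import java.util.Arrays;

// Sketch: insert newItems into data at position pos with a single tail copy,
// instead of shifting elements one insertion at a time.
class BlockInsert {
    static int[] insertBlock(int[] data, int pos, int[] newItems) {
        int[] result = Arrays.copyOf(data, data.length + newItems.length); // tail is default-filled for now
        // Move the old tail to its final position in one copy.
        System.arraycopy(data, pos, result, pos + newItems.length, data.length - pos);
        // Overwrite the hole with the new items.
        System.arraycopy(newItems, 0, result, pos, newItems.length);
        return result;
    }
}
// e.g. insertBlock(new int[]{0,1,2,3,7,8,9}, 4, new int[]{4,5,6}) returns {0,1,2,3,4,5,6,7,8,9}
```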
You might want to consider changing your representation of the list from using a dynamic array to using some other structure. Here are two options that allow you to implement these operations efficiently:
An order statistic tree is a modified type of binary tree that supports insertions and selections anywhere in O(log n) time, as well as lookups in O(log n) time. This will increase your memory usage quite a bit because of the overhead for the pointers and extra bookkeeping, but should dramatically speed up insertions. However, it will slow down lookups a bit.
If you always know the insertion point in advance, you could consider switching to a linked list instead of an array, and just keep a pointer to the linked list cell where insertions will occur. However, this slows down random access to O(n), which could possibly be an issue in your setup.
Alternatively, if you always know where insertions will happen, you could consider representing your array as two stacks: one stack holding the contents of the array to the left of the insertion point and one holding the reverse of the elements to the right of it. This makes insertions fast, and with the right type of stack implementation it could keep random access fast as well (see the sketch after this answer).
Hope this helps!
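A hedged sketch of the two-stack idea from the last option (sometimes implemented as a gap buffer; the class here is illustrative):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: the list is split at the insertion point into a "left" stack and a
// reversed "right" stack, so inserting at the split point is O(1).
class TwoStackList<T> {
    private final Deque<T> left = new ArrayDeque<>();   // elements before the insertion point
    private final Deque<T> right = new ArrayDeque<>();  // elements after it, nearest element on top

    void insert(T value) {
        left.push(value);              // insert at the current split point
    }

    void moveSplitRight() {            // shift the insertion point one element to the right
        if (!right.isEmpty()) left.push(right.pop());
    }

    void moveSplitLeft() {             // shift the insertion point one element to the left
        if (!left.isEmpty()) right.push(left.pop());
    }
}
```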
HashMaps and linked lists were designed for the problem you are having. Given an indexed data structure with numbered items, inserting an item in the middle requires renumbering every item after it.
You need a data structure that is optimized to make inserts constant-time, O(1), operations. HashMaps were designed to make insert and delete operations lightning quick regardless of dataset size.
I can't pretend to do the HashMap subject justice by describing it. Here is a good intro: http://en.wikipedia.org/wiki/Hash_table
I heard an interview question:
"Print a singly-linked list backwards,
in constant space and linear time."
My solution was to reverse the linked list in place and then print it like that. Is there another solution that is non-destructive?
You've already figured out most of the answer: reverse the linked list in place, and traverse the list back to the beginning to print it. To keep it from being (permanently) destructive, reverse the linked list in place again as you're traversing it back to the beginning and printing it.
Note, however, that this only works if you either only have a single thread of execution, or make the whole traversal a critical section so only one thread does it at a time (i.e., a second thread can never play with the list in the middle of the traversal).
If you reverse it again after printing it will no longer be destructive, since the original order is restored.
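A minimal Java sketch of that reverse-print-reverse approach (the node type is hypothetical); each pass is O(n) time and the whole thing uses O(1) extra space:

```java
// Sketch: reverse in place, print front to back (which is the original back to front),
// then reverse again to restore the original order.
// Not safe if other threads read the list in the meantime, as noted above.
class ReversePrinter {
    static class Node {
        int value;
        Node next;
    }

    static Node reverse(Node head) {
        Node prev = null;
        while (head != null) {
            Node next = head.next;
            head.next = prev;
            prev = head;
            head = next;
        }
        return prev;
    }

    static void printBackwards(Node head) {
        Node reversed = reverse(head);
        for (Node n = reversed; n != null; n = n.next) {
            System.out.println(n.value);
        }
        reverse(reversed);   // restore the original order
    }
}
```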
You could recurse down the linked list chain, passing along a reference to whatever you wish to write to. Each node would call the child node's print function, passing the reference along, before printing itself.
That way the call passes down the list until the last node, which has no child and writes first; then each node back up the chain writes after the one below it, all the way back to the front.
Edit
This actually doesn't fit the specs because of the linear space used on the stack. If you had something external to walk the functions and a way to write to the front of a string, the basic logic could still work, though.
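For completeness, a simplified sketch of that recursive approach (printing directly instead of passing a writer); as noted, it uses O(n) call-stack space, so it misses the constant-space requirement:

```java
// Sketch: print the rest of the list first, then the current node.
// Uses O(n) call-stack space, so it does not meet the constant-space requirement.
class RecursivePrinter {
    static class Node {
        int value;
        Node next;
    }

    static void printBackwards(Node node) {
        if (node == null) return;
        printBackwards(node.next);      // the nodes below print first...
        System.out.println(node.value); // ...then this node
    }
}
```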
Okay, this could be an interview question, but it is actually an exercise from Weiss's algorithms book. The question clearly states that we cannot use recursion (something the interviewer will hide and reveal later on), as recursion will not use constant space; recursion will mostly become a major point of discussion going forward. The solution is: reverse, print, and reverse back.
Here's an unconventional approach: Change your console to right-to-left reading order and then print the list in normal order. They will appear in backward order. Having to visit the actual data in reverse order doesn't sound like a constraint to the problem.
To experiment, I've (long ago) implemented Conway's Game of Life (and I'm aware of this related question!).
My implementation worked by keeping 2 arrays of booleans, representing the 'last state', and the 'state being updated' (the 2 arrays being swapped at each iteration). While this is reasonably fast, I've often wondered about how to optimize this.
One idea, for example, would be to precompute at iteration N the zones that could be modified at iteration (N+1) (so that if a cell does not belong to such a zone, it won't even be considered for modification at iteration (N+1)). I'm aware that this is very vague, and I never took time to go into the details...
Do you have any ideas (or experience!) of how to go about optimizing (for speed) Game of Life iterations?
I am going to quote my answer from the other question, because the chapters I mention have some very interesting and fine-tuned solutions. Some of the implementation details are in C and/or assembly, yes, but for the most part the algorithms can work in any language:
Chapters 17 and 18 of Michael Abrash's Graphics Programmer's Black Book are among the most interesting reads I have ever had. It is a lesson in thinking outside the box. The whole book is great really, but the final optimized solutions to the Game of Life are incredible bits of programming.
There are some super-fast implementations that (from memory) represent cells of 8 or more adjacent squares as bit patterns and use that as an index into a large array of precalculated values to determine in a single machine instruction if a cell is live or dead.
Check out here:
http://dotat.at/prog/life/life.html
Also XLife:
http://linux.maruhn.com/sec/xlife.html
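As a much-simplified sketch of the bit-pattern lookup idea mentioned a couple of paragraphs above (real implementations pack many cells per word; this version just encodes one 3x3 neighborhood as 9 bits):

```java
// Sketch: precompute the next state of a cell for every possible 3x3 neighborhood.
// Bit 4 is the center cell; the other 8 bits are its neighbors.
class LifeTable {
    static final boolean[] NEXT = new boolean[512];

    static {
        for (int pattern = 0; pattern < 512; pattern++) {
            boolean alive = (pattern & (1 << 4)) != 0;
            int neighbors = Integer.bitCount(pattern) - (alive ? 1 : 0);
            // Conway's rules: a cell is live next generation if it has 3 live
            // neighbors, or is currently live with 2 live neighbors.
            NEXT[pattern] = neighbors == 3 || (alive && neighbors == 2);
        }
    }

    // During a generation, build the 9-bit pattern for each cell and index NEXT.
    static boolean nextState(int pattern) {
        return NEXT[pattern];
    }
}
```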
You should look into Hashlife, the ultimate optimization. It uses the quadtree approach that skinp mentioned.
As mentioned in Abrash's Black Book, one of the most simple and straightforward ways to get a huge speedup is to keep a change list.
Instead of iterating through the entire cell grid each time, keep a copy of all the cells that you change.
This will narrow down the work you have to do on each iteration.
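A rough sketch of that change-list idea (the cell encoding and the boundary handling are my simplifications): after a generation, only cells that changed, plus their neighbors, can possibly change next time.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: instead of scanning the whole grid, re-examine only the cells that
// changed in the last generation plus their neighbors, since those are the
// only cells whose state can possibly change.
class ChangeList {
    // Cells are encoded as y * width + x.
    static Set<Integer> cellsToCheck(Set<Integer> changedLastGen, int width, int height) {
        Set<Integer> toCheck = new HashSet<>();
        for (int cell : changedLastGen) {
            int x = cell % width, y = cell / width;
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int nx = x + dx, ny = y + dy;
                    if (nx >= 0 && nx < width && ny >= 0 && ny < height) {
                        toCheck.add(ny * width + nx);   // the cell itself and its neighbors
                    }
                }
            }
        }
        return toCheck;
    }
}
```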
The algorithm itself is inherently parallelizable. Using the same double-buffered method in an unoptimized CUDA kernel, I'm getting around 25ms per generation in a 4096x4096 wrapped world.
Which algorithm is most efficient mainly depends on the initial state.
If the majority of cells are dead, you can save a lot of CPU time by skipping empty parts and not calculating things cell by cell.
In my opinion it makes sense to check for completely dead spaces first when your initial state is something like "random, but with the chance of life lower than 5%."
I would just divide the matrix up into halves and start checking the bigger ones first.
So if you have a field of 10,000 * 10,000, you'd first accumulate the states of the upper-left quarter of 5,000 * 5,000.
If the sum of states is zero in the first quarter, you can ignore this first quarter completely for now and check the upper-right 5,000 * 5,000 for life next.
If its sum of states is > 0, you now divide the second quarter up into 4 pieces again and repeat this check for life for each of these subspaces.
You could go down to subframes of 8*8 or 10*10 (not sure what makes the most sense here).
Whenever you find life, you mark these subspaces as "has life".
Only spaces which "have life" need to be divided into smaller subspaces; the empty ones can be skipped.
When you are finished assigning the "has life" attribute to all possible subspaces, you end up with a list of subspaces which you now simply extend by +1 in each direction (with empty cells) and apply the regular (or modified) Game of Life rules to them.
You might think that dividing a 10,000 * 10,000 space into 8*8 subspaces is a lot of tasks, but accumulating their state values is in fact much, much less computing work than running the GoL algorithm on each cell plus its 8 neighbours, comparing the counts, and storing the new state for the next iteration somewhere...
But as I said above, for a random initial state with 30% population this won't make much sense, as there will not be many completely dead 8*8 subspaces to find (let alone dead 256*256 subspaces).
And of course, the path to perfect optimisation will, last but not least, depend on your language.
Two ideas:
(1) Many configurations are mostly empty space. Keep a linked list (not necessarily in order, that would take more time) of the live cells, and during an update, only update around the live cells (this is similar to your vague suggestion, OysterD :)
(2) Keep an extra array which stores the # of live cells in each row of 3 positions (left-center-right). Now when you compute the new dead/live value of a cell, you need only 4 read operations (top/bottom rows and the center-side positions), and 4 write operations (update the 3 affected row summary values, and the dead/live value of the new cell). This is a slight improvement from 8 reads and 1 write, assuming writes are no slower than reads. I'm guessing you might be able to be more clever with such configurations and arrive at an even better improvement along these lines.
If you don't want anything too complex, you can use a grid to slice it up, and if that part of the grid is empty, don't try to simulate it (see Tyler's answer). However, you could do a few more optimizations:
Set different grid sizes depending on the number of live cells: if there aren't a lot of live cells, that likely means they are in a tiny place.
When you randomize it, don't use the grid code until the user changes the data. I've personally tested randomizing it, and even after a long time it still fills most of the board (unless the grid is sufficiently small, at which point the grid code won't help much anymore).
If you are showing it on the screen, don't use rectangles for pixel sizes 1 and 2: instead, set the pixels of the output directly. For any higher pixel size I find it's okay to use the native rectangle-filling code. Also, preset the background so you don't have to fill the rectangles for the dead cells (not the live ones, because live cells disappear pretty quickly).
I don't exactly know how this can be done, but I remember some of my friends had to represent this game's grid with a quadtree for an assignment. I'm guessing it's really good for optimizing the space of the grid, since you basically only represent the occupied cells. I don't know about execution speed, though.
It's a two dimensional automaton, so you can probably look up optimization techniques. Your notion seems to be about compressing the number of cells you need to check at each step. Since you only ever need to check cells that are occupied or adjacent to an occupied cell, perhaps you could keep a buffer of all such cells, updating it at each step as you process each cell.
If your field is initially empty, this will be much faster. You probably can find some balance point at which maintaining the buffer is more costly than processing all the cells.
There are table-driven solutions for this that resolve multiple cells in each table lookup. A google query should give you some examples.
I implemented this in C#:
All cells have a location, a neighbor count, a state, and access to the rule.
1. Put all the live cells in array B into array A.
2. Have all the cells in array A add 1 to the neighbor count of their neighbors.
3. Have all the cells in array A put themselves and their neighbors in array B.
4. All the cells in array B update according to the rule and their state.
5. All the cells in array B set their neighbor counts back to 0.
Pros:
Ignores cells that don't need to be updated
Cons:
4 arrays: a 2D array for the grid, an array for the live cells, and an array for the active cells.
Can't process rule B0.
Processes cells one by one.
Cells aren't just booleans
Possible improvements:
Cells also have an "Updated" value, they are updated only if they haven't
updated in the current tick, removing the need of array B as mentioned above
Instead of array B being the ones with live neighbors, array B could be the
cells without, and those check for rule B0.