vector<int> vec(N, 0);
int* ptr = vec;
Can we do this with a vector as we would with an array, or do we have to use iterators? What I want is to maintain a pointer to the elements of the vector.
I can think of a couple of ways to get a pointer to the elements of a vector:
int* p = &(vec[0]);
or
int* p = vec.data();
However, unless you know what you are doing, I don't recommend using pointers to access the elements of a vector. Use iterators instead. There is not much to gain, and much to lose, by preferring pointers over iterators.
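For example, a minimal sketch of iterator-based access (illustrative only):

#include <iostream>
#include <vector>

int main() {
    std::vector<int> vec(10, 0);
    // Iterators work with every standard algorithm and container,
    // so code written against them is easier to change later.
    for (std::vector<int>::iterator it = vec.begin(); it != vec.end(); ++it) {
        *it = 42;  // same dereference syntax as a pointer
    }
    std::cout << vec.front() << '\n';  // prints 42
}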
I can think of only a few, for example the zero-length list or set, or the zero-length string. How about empty matrices or tensors? How about parallelograms whose angles are all zero degrees? How about a rectangle with two sides of zero length? Or a triangle with one angle of 180 degrees and the other two zero? Can we keep going with many-sided polygons? Nah, that doesn't feel right. But I do believe there are similar degenerate shapes in 3-space.
But those don't interest me much. I'm looking for common math functions, often used in programming, which have well-known degenerate cases. I do lots of Mathematica and some JavaScript programming, but the actual programming language doesn't really matter, as this is more of a computer science question.
There are some interesting examples of degenerate data structures:
Degenerate Binary Tree - It's basically a Binary Tree where every parent has only one child. So it degenerates into a linked list.
Hash Table with a constant hash function - Hash Table collisions can be handled in two main ways:
Chaining - Every cell of the array links to a linked list and elements with the same hash value are chained together into this list. So, when the hash function is constant, all elements have the same hash value and they are all connected; here, the hash table degenerates into a linked list.
Probing - here, if an element has the same hash as another one, we simply look for an empty slot. Now, when the probing sequence is linear (so if cell i is occupied we look at cell i+1) and the hash value is always the same, every insertion only generates collisions: each element is put into the first empty slot, and the table again degenerates into a linked list (see the sketch below).
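A tiny illustration of the constant-hash degeneration (C++; the ConstantHash functor is made up for the example):

#include <cstddef>
#include <string>
#include <unordered_map>

// Deliberately degenerate hasher: every key lands in bucket 0, so
// lookups decay from expected O(1) to an O(n) walk along one chain.
struct ConstantHash {
    std::size_t operator()(const std::string&) const { return 0; }
};

int main() {
    std::unordered_map<std::string, int, ConstantHash> table;
    table["a"] = 1;
    table["b"] = 2;  // collides with "a": chained into the same bucket
    table["c"] = 3;  // the table is now effectively a linked list
}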
Classes with no methods - A class without methods, so written like this:
class Fraction {
    int numerator;
    int denominator;
};
It degenerates into a struct, so like this:
struct Fraction {
    int numerator;
    int denominator;
};
And so on. Obviously, there are many other examples of degenerate cases for data structures or functions (in graph theory for example).
I hope this can help.
I'm interested in implementing persistent (e.g. purely functional, immutable, etc), growable vectors in F#, so that they might be used in the .NET framework. My current implementation is a variant on the Hash-Mapped Trie, and is done according to Clojure's implementation.
I'm having trouble implementing random-access insertions and deletions (inserting and removing elements at random indices) using this implementation. Is there some algorithm/modification that allows these operations efficiently, or some other implementation I can look at?
Clarification: when I say 'inserts' and 'deletes' I mean, for example, that given the list [1; 2; 3; 4], an insert of 500 at position 1 gives me [1; 500; 2; 3; 4]. I don't mean a set or associate operation.
Finger trees might be what you are looking for. There is a Clojure implementation available.
Immutable vectors/lists typically provide fast updates by only allowing insertions at one end and then sharing the immutable data at the other end. If you want to do non-head/tail insertions, what you're actually asking to do is mutate the immutable end of your collection. You'll have to split the vector around the item you want to insert and then splice it back together to create a new vector, and the best you'll be able to do that in is O(n) time.
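As a rough sketch of that split-and-splice cost (a hand-rolled persistent cons list in C++ for illustration, not Clojure's trie; all names are made up):

#include <memory>

// Immutable cons list: nodes are never modified after creation, only shared.
struct PNode;
using PList = std::shared_ptr<const PNode>;
struct PNode {
    int value;
    PList next;
};

PList cons(int v, PList tail) {
    return std::make_shared<const PNode>(PNode{v, std::move(tail)});
}

// Insert at index i: the first i nodes are copied (the "split"), while
// everything after the new element is shared (the "splice back together").
// Cost is O(i), i.e. O(n) in the worst case, exactly as described above.
PList insert_at(const PList& xs, int i, int v) {
    if (i == 0) return cons(v, xs);
    return cons(xs->value, insert_at(xs->next, i - 1, v));
}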
Immutable sorted trees work a little differently, but they won't let you re-number indices (keys) in less than O(n) time either.
Basically, if someone had discovered an efficient way to support random-access insertions in an immutable vector, it would likely already be supported in one of the mainstream functional languages; as far as I know there is no such data structure or algorithm, so there's no such implementation.
The only thing you can do is split and join. This is very inefficient with Clojure vectors. That is why Phil Bagwell implemented a persistent vector that can be split and joined in O(log n).
You might want to look at this video: http://blip.tv/clojure/phill-bagwell-striving-to-make-things-simple-and-fast-5936145
or directly to his paper here: infoscience.epfl.ch/record/169879/files/RMTrees.pdf
Port the Haskell HAMT library? Its insert operation is O(log n).
I know how to implement linked list using array. For example
we define a struct as follow:
struct Node {
    int data;
    int link;
};
"data" stores the info and "link" stores the index in the array of next node.
Can anybody tell me what is the advantage and disadvantage of implementing a linked list using array compared to "ordinary" linked list? Any suggestion will be appreciated.
If you back a linked list with an array, you'll end up with the disadvantages of both. Consequently, this is probably not a very good way to implement it.
Some immediate disadvantages:
You'll have dead space in the array (entries which aren't currently used for items) taking up memory.
You'll have to keep track of the free entries; after a few insertions and deletions, these free entries could be anywhere (see the free-list sketch after this answer).
Using an array will impose an upper limit on the size of the linked list.
I suppose some advantages are:
If you're on a 64 bit system, your "pointers" will take up less space (though the extra space required by free entries probably outweighs this advantage)
You could serialise the array to disk and read it back in with an mmap() call easily. Though, you'd be better off using some sort of protocol buffer for portability.
You could make some guarantees about elements in the array being close to each other in memory.
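To make the free-entry bookkeeping concrete, here is a minimal sketch of an array-backed list with a free list threaded through the unused slots (hypothetical layout, using a growable std::vector as the backing array):

#include <vector>

// Array-backed singly-linked list. 'link' holds the index of the next
// node; -1 acts as the null pointer. Unused entries are threaded into
// a free list so insertions can reuse them.
struct Pool {
    struct Node { int data; int link; };
    std::vector<Node> nodes;
    int head = -1;       // first element of the list
    int free_head = -1;  // first reusable slot

    int allocate(int data) {
        int idx;
        if (free_head != -1) {            // reuse a freed slot
            idx = free_head;
            free_head = nodes[idx].link;
        } else {                          // grow the backing array
            idx = static_cast<int>(nodes.size());
            nodes.push_back({});
        }
        nodes[idx] = {data, -1};
        return idx;
    }

    void push_front(int data) {
        int idx = allocate(data);
        nodes[idx].link = head;
        head = idx;
    }

    void pop_front() {
        int idx = head;
        head = nodes[idx].link;
        nodes[idx].link = free_head;      // return the slot to the free list
        free_head = idx;
    }
};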
Can anybody tell me the advantages and disadvantages of implementing a linked list using an array, compared to an "ordinary" linked list?
Linked lists have the following complexity:
cons x xs : O(1)
append n m : O(n)
index i xs : O(n)
if your representation uses a strict, contiguous array, you will have different complexity:
cons will require copying the old array: O(n)
append will require copying both arrays into a new contiguous space: O(n + m)
index can be implemented as array access: O(1)
That is, a linked list API implemented in terms of arrays will behave like an array.
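To make those costs concrete, a minimal sketch of "cons" over a strict, contiguous array (using std::vector as the array; the helper name is made up):

#include <vector>

// cons on a contiguous array: the whole buffer is copied, O(n).
std::vector<int> cons(int x, const std::vector<int>& xs) {
    std::vector<int> out;
    out.reserve(xs.size() + 1);
    out.push_back(x);
    out.insert(out.end(), xs.begin(), xs.end());
    return out;
}
// index, by contrast, is O(1): xs[i] is plain pointer arithmetic.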
You can mitigate this somewhat by using a linked list or tree of strict arrays, leading to ropes or finger trees or lazy sequences.
A stack can be implemented in two ways: with an array, or with a linked list. Because the array version has some disadvantages, most programmers use a linked list for the implementation, as in the sketch below. First, a linked-list stack does not have to declare its size up front, so the amount of data it can store is not limited. Second, the pointers in a linked list are easy to declare and use; only one pointer is needed, called the top pointer. A stack uses the LIFO (last in, first out) discipline. The linked-list implementation has some disadvantages of its own, but most programmers still implement stacks using linked lists.
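A minimal sketch of such a linked-list stack (C++ for illustration; names are made up):

#include <cassert>

// Stack as a singly-linked list: one 'top' pointer is all the
// bookkeeping needed, and there is no fixed capacity to declare.
class Stack {
    struct Node { int data; Node* next; };
    Node* top_ = nullptr;

public:
    bool empty() const { return top_ == nullptr; }

    void push(int v) { top_ = new Node{v, top_}; }  // O(1)

    int pop() {                                     // O(1), LIFO order
        assert(!empty());
        Node* n = top_;
        int v = n->data;
        top_ = n->next;
        delete n;
        return v;
    }

    ~Stack() { while (!empty()) pop(); }
};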
With an array implementation you get contiguous storage and faster access to the nodes of the list; with a pointer-based linked list, on the other hand, access to the nodes is sequential, one link at a time.
The array implementation is helpful when you are dealing with a fixed number of elements, because resizing an array is expensive as far as performance is concerned, and if you are required to insert or delete nodes in the middle of the list you have to shift every node after that point.
Contrary to this, you should use the pointer implementation when you don't know the number of nodes you will want, as such a list can grow and shrink efficiently, and you don't need to shift any nodes: insertion and deletion are done simply by dereferencing and re-linking pointers.
When we have a hash table with chaining:
I am just wondering if maintaining the list at each bucket in sorted order affects the running time for searching, inserting and deleting in the hash table.
In theory: yes, since in the average case you will only have to walk half the chain to find if an item is on the chain or not.
In practice, there is probably not much difference, since the chains are typically very short, and the increased code complexity would also cost some cycles, mainly in the "insert" case.
BTW: in most cases the number of slots is considerably smaller than the keyspace of the hash values. If you can afford the space, storing the hash values in the chain nodes will save recomputing the hash value on every hop, and will avoid most of the final compares. This is of course a space/time tradeoff. As in:
struct hashnode **this;

for (this = &table[slot]; *this; this = &(*this)->link) {
    if ((*this)->hash != the_hash) continue;
    if (compare((*this)->payload, the_value)) continue;
    break;
}
/* At this point "this" points to the pointer that points to the wanted
   element, or to the NULL pointer where it should be inserted.
   For the sorted-list example, you should instead break out of the loop
   if the compare function returns > 0, and handle that special case here. */
Hypothetically, you've chosen your hash algorithm and map size to mitigate the number of collisions in the first place. At that point you should have a very small list (ideally one or two elements) at any position, so the extra effort of maintaining a sorted structure in the chain almost certainly costs more than just iterating over the small number of items in that bucket.
Yes, of course. The usually-cited O(1) for a hash table assumes perfect hashing, where no two distinct items resolve to the same hash.
In practice, that won't be the case. You'll always have (for a big enough data set) collisions. And collisions will mean more work at lookup time, regardless of whether you're using chaining or some other collision-resolution technique.
That's why it's very, very important to select a good hash function that is well designed/written, and a good match for the data you'll be using as the key for your hash table. Different types of data will hash better with different hash functions, in practice.
Say I have a binary tree with the following definition for a node.
struct node
{
    int key1;
    int key2;
};
The binary search tree is created on the basis of key1. Now, is it possible to rearrange the binary search tree on the basis of key2 in O(1) space? (I can do it in non-constant space using an array of pointers to the nodes.)
The actual problem where I require this is "counting number of occurrences of unique words in a file and displaying the result in decreasing order of frequency."
Here, a BST node is
{
    char *word;
    int freq;
}
The BST is first created on the basis of the alphabetical order of the words, and in the end I want it ordered by freq.
Am I wrong in my choice of data structure, i.e. a BST?
I think you can create a new tree sorted by freq, pushing all the elements into it as you pop them from the old tree.
The extra space for that could be O(1), though more likely O(log N) for the traversal, which isn't big anyway.
Also, I don't know what it's called in C#, but in Python you can use a list and sort it in place by two different keys.
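A rough sketch of that rebuild (in C++; note this assumes the node also carries left/right child pointers, which the posted struct omits):

#include <map>

struct node {
    char *word;
    int   freq;
    node *left;   // assumed child pointers, not shown in the question
    node *right;
};

// Walk the word-ordered tree and push every element into a new tree
// keyed by frequency. std::multimap is a balanced BST that tolerates
// duplicate keys (several words can share one frequency).
void rebuild_by_freq(const node* t, std::multimap<int, const char*>& out) {
    if (!t) return;
    rebuild_by_freq(t->left, out);
    out.emplace(t->freq, t->word);
    rebuild_by_freq(t->right, out);
}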
A map or BST is good if you need sorted output for your dictionary.
It is also good if you need to mix add, remove and lookup operations.
I don't think that is your need here. You load the dictionary, sort it, then only do lookups in it, right?
In this case a sorted array is probably a better container. (See Item 23 of Effective STL by Scott Meyers.)
(Update: simply consider that a map may generate more memory cache misses than a sorted array, since an array keeps its data contiguous in memory, while each node in a map contains two pointers to other nodes of the map. When your objects are simple and don't take much space in memory, a sorted vector is probably the better option. I warmly recommend reading that item of Meyers's book.)
For the kind of sort you are talking about, you will need stable_sort from the STL.
The idea is to sort the dictionary by word, then sort with stable_sort() on the frequency key.
It will give something like this (not actually tested, but you get the idea):
#include <algorithm>
#include <string>
#include <vector>

struct Node
{
    char *word;
    int key;
};

bool operator<(const Node& l, const Node& r)
{
    return std::string(l.word) < std::string(r.word);
}

bool freq_comp(const Node& l, const Node& r)
{
    return l.key < r.key;
}

std::vector<Node> my_vector;

... // loading elements

std::sort(my_vector.begin(), my_vector.end());
std::stable_sort(my_vector.begin(), my_vector.end(), freq_comp);
Using a HashTable (Java) or Dictionary (.NET) or equivalent data structure in your language of choice (hash_set or hash_map in STL) will give you O(1) inserts during the counting phase, unlike the binary search tree which would be somewhere from O(log n) to O(n) on insert depending on whether it balances itself. If performance is really that important just make sure you try to initialize your HashTable to a large enough size that it won't need to resize itself dynamically, which can be expensive.
As for listing by frequency, I can't immediately think of a tricky way to do that without involving a sort, which would be O(n log n).
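A sketch of both phases (C++ standard containers used in place of the Java/.NET ones mentioned; names are illustrative):

#include <algorithm>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Count with a hash map (expected O(1) per insert), then pay the
// O(n log n) sort once at the end to order by frequency.
std::vector<std::pair<std::string, int>>
count_and_rank(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;
    counts.reserve(words.size());          // avoid rehashing while counting
    for (const std::string& w : words) ++counts[w];

    std::vector<std::pair<std::string, int>> ranked(counts.begin(), counts.end());
    std::sort(ranked.begin(), ranked.end(),
              [](const auto& a, const auto& b) { return a.second > b.second; });
    return ranked;
}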
Here is my suggestion for re-balancing the tree based off of the new keys (well, I have 2 suggestions).
The first and more direct one is to somehow adapt heapsort's "bubble-up" function (to use Sedgewick's name for it). Here is a link to Wikipedia; there they call it "sift-up". It is not designed for an entirely unbalanced tree (which is what you'd need), but I believe it demonstrates the basic flow of an in-place reordering of a tree. It may be a bit hard to follow because the tree is in fact stored in an array rather than as a tree (though the logic in a sense treats it as a tree). Perhaps, though, you'll find such an array-based representation is best! Who knows.
The more crazy-out-there suggestion of mine is to use a splay tree. I think they're nifty, and here's the wiki link. Basically, whichever element you access is "bubbled up" to the top, but it maintains the BST invariants. So you maintain the original Key1 for building the initial tree, but hopefully most of the "higher-frequency" values will also be near the top. This may not be enough (as all it will mean is that higher-frequency words will be "near" the top of the tree, not necessarily ordered in any fashion), but if you do happen to have or find or make a tree-balancing algorithm, it may run a lot faster on such a splay tree.
Hope this helps! And thank you for an interesting riddle, this sounds like a good Haskell project to me..... :)
You can easily do this in O(1) space, but not in O(1) time ;-)
Even though recursively re-arranging the whole tree until it is sorted again seems possible, it is effectively a sort, so it cannot do better than O(n log n) in general, and would probably be worse in practice. So you might get a better result by adding all the nodes to an array once you are done with the tree and just sorting that array by frequency with quicksort (which is O(n log n) on average). At least that's what I would do. Even though it takes extra space, it sounds more promising to me than re-arranging the tree in place.
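Roughly, that collect-and-sort approach looks like this (a sketch, again assuming child pointers on the node, which the question's struct omits):

#include <algorithm>
#include <vector>

struct node { char *word; int freq; node *left, *right; };  // assumed layout

// In-order walk that flattens the finished tree into an array of pointers.
void collect(node* t, std::vector<node*>& out) {
    if (!t) return;
    collect(t->left, out);
    out.push_back(t);
    collect(t->right, out);
}

// Sort the array by frequency, descending: O(n log n), no tree surgery.
void sort_by_freq(node* root, std::vector<node*>& out) {
    collect(root, out);
    std::sort(out.begin(), out.end(),
              [](const node* a, const node* b) { return a->freq > b->freq; });
}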
One approach you could consider is to build two trees. One indexed by word, one indexed by freq.
As long as the tree nodes contain a pointer to the data node, you can access the data via the word-based tree to update the info, and later via the freq-based tree for output.
Although, if speed is really that important, I'd be looking to get rid of the string as a key. String comparisons are notoriously slow.
If speed is not important, I think your best bet is to gather the data based on word and re-sort based on freq as yves has suggested.