Related
Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 8 years ago.
Improve this question
How can I distinguish between two different users, like two different neighbours who lives in a same address and goes to the same office, but they have different patterns of driving and have different office schedules. I wanted to find out the probability of two persons who behaves more or less exactly. Depending on the resolution of the map, I wants to figure them, where they are, how often they are. Can I create a pattern ´for each drivers into some signatures, where their identity can be traced upon.
I assume, by the way that you asked your question, that you haven't had any plausible ideas yet. So I'll make an answer which is purely based on an idea that you might like to try out.
I initially thought of suggesting something along the line of word-similarity metrics, but because order is not necessarily important here, maybe it's worth trying something simpler to start. In fact, if I ever find myself considering something complex when developing a model, I take a step back and try to simplify. It's quicker to code, and you don't get so attached to something that's a dead end.
So, how about histograms? If you divide up time and space into larger blocks, you can increment a value in the relevant location for each time interval. You get a 2D histogram of a person's location. You can use basic anti-aliasing to make the histograms more representative.
From there, it's down to histogram comparison. You could implement something real basic using only 1D strips. You know, like sum the similarity measure for each of the vertical and horizontal strips. Linear histogram comparison is super-easy, and just a few lines of code in a language like C. Good enough for proof of concept. If it feels you're on the right track, then start looking for more tricky ideas...
The next thing I'd do is further stratify my data, using days of the week and statutory holidays... Maybe even stratify further using seasonal variables. I've found it pretty effective for forecasting electricity load, which is as much about social patterns as it is about weather. The trends become much more distinct when you separate an influencing variable.
So, after stratification you get a stack of 2D 'slices', and your signature becomes a kind of 3D volume. I see nothing wrong with representing the entire planet as a grid. Whether your squares represent 100m or 1km. It's easy to store this sparsely and prune out anything that's outside some number of standard deviations. You might choose only the most major events for the day and end up with a handful of locations.
You can then focus on the comparison metric. Maybe some kind of image-based gradient- or cluster-analysis. I'm sure there's loads of really great stuff out there. This is just the kinds of starting-points I make, having done no research.
If you need to add some temporal information to introduce separation between people with very similar lives, you can maybe build some lags into the system... Such as "where they were an hour ago". At that point (or possibly before), you probably want to switch from my over-simplified approach of averaging out a person's daily activities, and instead use something like classification trees. This kind of thing is very easy and rapid to develop with a tool like MATLAB or R.
Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 11 years ago.
Improve this question
I am thinking of something like a 3x3 matrix for each of the x,y,z coordinates. But that would be a waste of memory since a lot of block spaces are empty. Another solution would be to have a hashmap ((x,y,z) -> BlockObject), but that doesn't seem too efficient either.
When I say efficient, I do not mean optimal. It simply means that it would be enough to run smoothly on your modern day computer. Keep in mind, that the worlds generated by minecraft are quite huge, efficiency is important regardless. There's is also tons of meta-data that needs to be stored.
As noted in my comment, I have no idea how MineCraft does this, but a common efficient way of representing this sort of data is an Octree; http://en.wikipedia.org/wiki/Octree. The general idea is that it's like a binary tree but in three-space. You recursively divide each block of space in each dimension to get eight smaller blocks, and each block contains the pointers to the smaller blocks and a pointer to its parent block.
This allows you to be efficient about storing large blocks of the same material (e.g., "empty space"), because you can terminate the recursion whenever you get to a block that is made up of all the same thing, even if you haven't recursed down to the level of individual "cube" units.
Also, this means that you can efficiently find all the cubes in a given region by taking your current block and going up the tree just far enough to get to a block that contains all you can see -- and that way, you can very easily ignore all the cubes that are somewhere else.
If you're interested in exploring alternative means to represent Minecraft world (chunk)data, you can also look into the idea of bitstrings. Each 'chunk' is comprised of a volume 16*16*128, whereas 16*16 can adequately be represented by a single byte character and can be consolidated into a binary string.
As this approach is highly specific to a certain goal of trading client-computation vs highly optimized storage and transfer time, it seems imprudent to attempt to explain all the details, but I have created a specification for just this purpose, if you're interested.
Using this method, calculating storage cost is drastically different than the current 1byte-per-block, but instead is 'variable-bit-rate': ((1bit-per-block, rounded up to a multiple of 8) * (number of unique layers a blocktype appears in a chunk) + 2bytes)
This is then summed for the (unique number of blocktypes in that chunk).
Pretty much only in deliberate edgecases can this be more expensive than a normally structured chunk, in excess of 99% of Minecraft chunks are naturally generated and would benefit from this variable-bit-representation by a ratio of 8:1 or more in many of my tests.
Your best bet is to decompile Minecraft and look at the source. Modifying Minecraft: The Source Code is a nice walkthrough on how to do that.
Minecraft is very far from efficent. It just stores "chunks" of data.
Check out the "Map formats" in the Development Resources at Minecraft Wiki. AFAIK, the internal representation is exactly the same.
Why and when should I use stack or queue data structures instead of arrays/lists? Can you please show an example for a state thats it'll be better if you'll use stack or queue?
You've been to a cafeteria, right? and seen a stack of plates? When a clean plate is added to the stack, it is put on top. When a plate is removed, it is removed from the top. So it is called Last-In-First-Out (LIFO). A computer stack is like that, except it holds numbers, and you can make one out of an array or a list, if you like. (If the stack of plates has a spring underneath, they say you "push" one onto the top, and when you remove one you "pop" it off. That's where those words come from.)
In the cafeteria, go in back, where they wash dishes. They have a conveyor-belt where they put plates to be washed in one end, and they come out the other end, in the same order. That's a queue or First-In-First-Out (FIFO). You can also make one of those out of an array or a list if you like.
What are they good for? Well, suppose you have a tree data structure (which is like a real tree made of wood except it is upside down), and you want to write a program to walk completely through it, so as to print out all the leaves.
One way is to do a depth-first walk. You start at the trunk and go to the first branch, and then go to the first branch of that branch, and so on, until you get to a leaf, and print it. But how do you back up to get to the next branch? Well, every time you go down a branch, you "push" some information in your stack, and when you back up you "pop" it back out, and that tells you which branch to take next. That's how you keep track of which branch to do next at each point.
Another way is a breadth-first walk. Starting from the trunk, you number all the branches off the trunk, and put those numbers in the queue. Then you take a number out the other end, go to that branch, and for every branch coming off of it, you again number them (consecutively with the first) and put those in the queue. As you keep doing this you are going to visit first the branches that are 1 branch away from the trunk. Then you are going to visit all the branches that are 2 branches away from the trunk, and so on. Eventually you will get to the leaves and you can print them.
These are two fundamental concepts in programming.
Because they help manage your data in more a particular way than arrays and lists.
Queue is first in, first out (FIFO)
Stack is last in, first out (LIFO)
Arrays and lists are random access. They are very flexible and also easily corruptible. IF you want to manage your data as FIFO or LIFO it's best to use those, already implemented, collections.
Use a queue when you want to get things out in the order that you put them in.
Use a stack when you want to get things out in the reverse order than you put them in.
Use a list when you want to get anything out, regardless of when you put them in (and when you don't want them to automatically be removed).
When you want to enforce a certain usage pattern on your data structure. It means that when you're debugging a problem, you won't have to wonder if someone randomly inserted an element into the middle of your list, messing up some invariants.
Stack
Fundamentally whenever you need to put a reverse gear & get the elements in constant time,use a Stack.
Stack follows LIFO it’s just a way of arranging data.
Appln:
Achieving the undo operation in notepads.
Browser back button uses a Stack.
Queue:
Queues are implemented using a First-In-Fist-Out (FIFO) principle
Appln:
In real life, Call Center phone systems will use Queues, to hold people calling them in an order, until a service representative is free.
CPU scheduling, Disk Scheduling. When multiple processes require CPU at the same time, various CPU scheduling algorithms are used which are implemented using Queue data structure.
In print spooling
Breadth First search in a Graph
Handling of interrupts in real-time systems. The interrupts are handled in the same order as they arrive, First come first served.
Apart from the usage enforcement that others have already mentioned, there is also a performance issue. When you want to remove something from the beginning of an array or a List (ArrayList) it usually takes O(n) time, but for a queue it takes O(1) time. That can make a huge difference if there are a lot of elements.
Arrays/lists and stacks/queues aren't mutually exclusive concepts. In fact, any stack or queue implementations you find are almost certainly using linked lists under the hood.
Array and list structures provide a description of how the data is stored, along with guarantees of the complexity of fundamental operations on the structures.
Stacks and queues give a high level description of how elements are inserted or removed. A queue is First-In-First-Out, while a stack is First-In-Last-Out.
For example, if you are implementing a message queue, you will use a queue. But the queue itself may store each message in a linked list. "Pushing" a message adds it to the front of the linked list; "popping" a message removes it from the end of the linked list.
It's a matter of intent. Stacks and queues are often implemented using arrays and lists, but the addition and deletion of elements is more strictly defined.
A stack or queue is a logical data structure; it would be implemented under the covers with a physical structure (e.g. list, array, tree, etc.)
You are welcome to "roll your own" if you want, or take advantage of an already-implemented abstraction.
The stack and the Queue are more advanced ways to handle a collection that the array itself, which doesn't establish any order in the way the elements behave inside the collection.
The Stack ( LIFO - Last in first out) and a Queue (FIFO - First in First out ) establish and order in which your elements are inserted and removed from a collection.
You can use an Array or a Linked List as the Storage structure to implement the Stack or the Queue pattern. Or even create with those basic structures more complex patterns like Binary Trees or priority queues, which might also bring not only an order in the insertion and removal of elements but also sorting them inside the collection.
There are algorithms that are easier to conceptualize, write and read with stacks rather than arrays. It makes cleaner code with less control logic and iterators since those are presupposed by the data structure itself.
For example, you can avoid a redundant "reverse" call on an array where you've pushed elements that you want to pop in reverse order, if you used a stack.
I think stack and queue both are memory accessing concepts which are used according to application demand. On the other hand, array and lists are two memory accessing techniques and they are used to implement stack(LIFO) and queue(FIFO) concepts.
The question is ambiguous, for you can represent the abstract data type of a stack or queue using an array or linked data structure.
The difference between a linked list implementation of a stack or queue and an array implementation has the same basic tradeoff as any array vs. dynamic data structure.
A linked queue/linked stack has flexible, high speed insertions/deletions with a reasonable implementation, but requires more storage than an array. Insertions/deletions are inexpensive at the ends of an array until you run out of space; an array implementation of a queue or stack will require more work to resize, since you'd need to copy the original into a larger structure (or fail with an overflow error).
Stacks are used in cache based applications, like recently opened/used application will comes up.
Queues are used in deleting/remove the data, like first inserted data needs to be deleted at first.
The use of queue has always been somewhat obscure to me (other than the most obvious one).
Stacks on the other hand are intrinsically linked to nesting which is also an essential part of backtracking.
For example, while checking if in a sentence brackets have been properly closed or not, it is easy to see that
sentence := chars | chars(chars)chars | chars{chars}chars | chars[chars]chars --- suppose cases like (chars) is not valid
chars := char | char char
char := a | b | ... | z | ␢ --- ignored uppercase
So now, when checking a given input is a sentence, if you encounter a (, you must check whether the part from here to ) is a sentence or not. This is nesting. If you ever study about context free languages and the push down automata, you will see in detail how stacks involved in these nested problems.
If you want to see difference between the use of stacks and queues, I recommend that you look up Breadth-First Search and Depth-First Search algorithms and their implementations.
I've been experimenting with different ideas of how to store a 2D game world. I'm interested in hearing techniques of storing large quantities of objects while managing the set that's visible ( lets say 100,000 tiles square ). Obviously the techniques can vary based on how the game renders that space.
Lets assume that we're describing a scrolling 2d game world rather than screen based as you could fairly easily do screen based rendering from such a setup while the converse is a bit more messy.
Looking for language agnostic solutions here so it's more helpful to others.
Edit: I think a good answer here would be a general review of the ideas to consider when thinking about this, as some of the responders have attempted, but also begin to explain how different solutions would apply to those scenarios. It's a somewhat complex question, so I would expect a good answer to reflect that.
Quadtrees are a fairly efficient solution for storing data about a large 2-dimensional world and the objects within it.
You might get some ideas on how to implement this from some spatial data structures like range or kd trees.
However, the answer to this question would vary considerably depending exactly on how your game works.
Are we talking a 2D platformer with 10 enemies onscreen, 20 more offscreen but "active", and an unknown number more "inactive"? If so, you can probably store your whole level as an array of "screens" where you manipulate the ones closest to you.
Or do you mean a true 2D game with lots of up/down movement too? You might have to be a bit more careful here.
The platform is also of some importance. If you're implementing a simple platformer for desktop PCs, you probably wouldn't have to worry about performance as much as you would on an embedded device. This is no excuse to be naive about it, but you might not have to be terribly clever either.
This is a somewhat interesting question I think. Presumably someone smarter than I who has experience with implementing platformers has thought these things out already.
Break the world into smaller areas, and deal with them. Any solution to this problem is going to boil down to this concept (such as quadtrees, mentioned in another answer). The differences will be in how they subdivide the world.
How much data is stored per tile? How fast can players move across the world? What's the behavior of NPCs, etc., that are offscreen? Do they just reset when the player comes back (like old Zelda games)? Do they simply resume where they were? Do they do some kind of catch-up script?
How much different rendering data is going to be needed for different areas?
How much of the world can be seen at one time?
All of these questions are going to immpact your solution, as well as the capabilities of your platform. Coming up with a general answer for these without having a reasonable idea of these parameters is going to be a bit difficult.
Assuming that your game will only update what is visible and some area around what is visible, just break the world in "screens" (a "screen" is a rectangular area on the tilemap that can fill the whole screen). Keep in memory the "screens" around the visible area (and some more if you want to update entities which are close to the character - but there is little reason to update an entity that far away) and have the rest on disk with a cache to avoid loading/unloading of commonly visited areas when you move around. Some setup like:
+---+---+---+---+---+---+---+
|FFF|FFF|FFF|FFF|FFF|FFF|FFF|
+---+---+---+---+---+---+---+
|FFF|NNN|NNN|NNN|NNN|NNN|FFF|
+---+---+---+---+---+---+---+
|FFF|NNN|NNN|NNN|NNN|NNN|FFF|
+---+---+---+---+---+---+---+
|FFF|NNN|NNN|VVV|NNN|NNN|FFF|
+---+---+---+---+---+---+---+
|FFF|NNN|NNN|NNN|NNN|NNN|FFF|
+---+---+---+---+---+---+---+
|FFF|NNN|NNN|NNN|NNN|NNN|FFF|
+---+---+---+---+---+---+---+
|FFF|FFF|FFF|FFF|FFF|FFF|FFF|
+---+---+---+---+---+---+---+
Where "V" part is the "screen" where the center (hero or whatever) is, the "N" parts are those who are nearby and have active (updating) entities, are checked for collisions, etc and "F" parts are far parts which might get updated infrequently and are prone to be "swapped" out (stored to disk). Of course you might want to use more "N" screens than two :-).
Note btw that since 2D games do not usually hold much data instead of saving the far away parts to disk you might want to just keep them in memory compressed.
You probably want to use a single int or byte array that links to block types. If you need more optimization from there, then you'll want to link to more complicated data structures like oct trees from your array. There is a good discussion on a Java game forum here: http://www.javagaming.org/index.php/topic,20505.30.html text
Anything with links becomes very expensive because the pointer takes up something like 8 bytes each, depending upon the language, so depending upon how populated your world is it can get expensive very quickly (8 pointers 8 bytes each is 64 bytes per item, and a byte array is 1 byte per item). So unless 1/64 of your world is empty, a byte array is going to be a much better option. You're also going to need to spend a lot of time iterating down the tree whenever you're doing a lookup for collision or whatever else - a byte array will be an instantaneous lookup.
Hopefully that's detailed enough for you. :-)
What is the correct name for the following data structure? It is:
A queue of fixed size
New elements are added to the start
Whenever the queue gets above a certain size a number of elements are removed from the end
"a fixed sized FIFO queue"
Sometimes buffer, sometimes ring buffer ( as that's how it's normally implemented ). I'm not aware of anything which denotes your strategy for removing items in batches, though it's not uncommon.
I think it may depend on the actual implementation of this. A practical example of what you describe is the Circular Buffer or Ring Buffer where the oldest data is overwritten by new data once the buffer is full. This would be one of traditional ways of implementing such a data structure in something like C.
EDIT: Ok, so the Circular Buffer doesn't quite fit. How about Finite Buffer Queue, or Finite Capacity Queue? But those don't really cover the self-limiting aspect...
Self-limiting Finite Capacity Bratt Queue.
Auto-popping...
My point being that I don't think there's an official name for a data structure with the exact properties that you mention, so you might as well make one up based on the data structure that nearest resembles it, perhaps combined with some of your structure's unique properties. It's probably going to be pretty wordy though...
EDIT: Or perhaps it's a Cyclic Queue. The article describes it as:
this article describes a Queue similar to the System.Collections.Queue, except that it has > a fixed buffer size. This means, of course, that the buffer could not be large enough to > hold all the items added to the Queue, in which case the oldest items are being dropped.
...which sounds a lot like yours. Nice and concise too.
it is a circular buffer
In hardware, a similar structure is called a shift register.
In embedded systems, this is almost universally referred to as circular buffers.