Why do so many things run in 'human observable time'? [closed] - complexity-theory

I've studied complexity theory and I come from a solid programming background, and it has always seemed odd to me that so many things run in times that are intrinsic to humans. I'm wondering if anyone has any ideas as to why this is.
I'm generally speaking of times in the range of 1 second to 1 hour. If you consider how narrow that span of time is relative to the billions of operations per second a computer can handle, it seems odd that such a large number of things fall into that category.
A few examples:
Encoding video: 20 minutes
Checking for updates: 5 seconds
Starting a computer: 45 seconds
You get the idea...
Don't you think most things should fall into one of two categories: instantaneous / millions of years?

Probably because it signifies the cut-off where people consider further optimization not worth the effort.
And clearly, having a computer that takes millions of years to boot wouldn't be very useful (or maybe it would, but you just wouldn't know yet, because it's still booting :P )

Given that computers are tools, and tools are meant to be set up, used, and have their results analyzed by humans (mostly), it makes sense that the majority of operations would be designed so they don't take longer than the lifespan of a typical human.
I would argue that most single operations are effectively "instantaneous" (in that they run in less than perceptible time), but they are rarely used as single operations. Humans are capable of creating complexity, and given that many computational operations intrinsically involve a balance between speed and some other factor (quality, memory usage, etc.), it actually makes sense that many operations are designed so that this balance places them into times that are intrinsic to humans. However, I'd personally word that as "a time that is assumed to be acceptable to a human user, given the result generated."

Related

Football (soccer) match processing [closed]

I want to implement football match processing, and I have spent a lot of time looking for good algorithms to do it. I have some data as input: players, each with a set of parameters. Some of the parameters are static while the match is being processed (skills) and some are dynamic (physical state, psychological state) and change during the process. I also have external parameters that I can change manually. It doesn't need to be very close to real football (except for the result; 20:0 would be awful anyway). The last main idea is that the same input should not lead to the same output, so some of the intermediate calculations should return random values.
The algorithm should not be very slow, because in the near future it will need to process about 1000 matches at the same time, step by step. Each step will be calculated once every 3 seconds. The steps should also be logically linked, because I will render the match graphically with all ball and player movements.
What algorithms can you recommend for me? I thought about neural network but I'm not sure that this is a good solution.
You will really help me, because I have spent about half a year searching for this. Thank you very much!
Let's say you have an "action" every 5 minutes of the game, so 90/5 = 18 actions. To make it more realistic you can choose a random number instead, e.g.:
numberOfActions = randomInt(10, 20); // randomInt: uniform random integer in [10, 20] (helper assumed)
This number can then be the length of your for() loop.
Then you have interactions between the defence and offence parameters of your two sets of players. Let's say each point of difference offenceA - defenceB adds a ten percent chance to succeed:
if ((TeamA.Offence - TeamB.Defence) * 10 > randomInt(0, 100))
{
    TeamA.points++;
}
Of course the goalkeeper can decrease this probability, maybe even significantly.
And so on. Of course you can make it more complicated; for example, you can compare the stats of only certain players, depending on who has the ball. Both your offence and defence parameters can be decreased over time and raised by condition:
TeamA._realOffenceValue =
    TeamA.Offence *
    (1.0 - (double) i / numberOfActions) * // i = index of the current action; cast avoids integer division
    TeamA.leftOffencePlayer.Condition;
Remember that in games like Football Manager or Europa Universalis everything is about cheating the player. Balancing the game is a job of many hours, and nobody will do it for you on a forum :)
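Putting the pieces together, here is a minimal, self-contained Java sketch of this action-loop approach; every class, field, and constant in it is illustrative, and the balance values are placeholders you would have to tune:

import java.util.Random;

// Hypothetical sketch of the action-loop approach described above.
// Team, offence/defence scales, and all constants are made up.
public class MatchSim {
    static final Random RNG = new Random();

    static class Team {
        String name;
        double offence, defence; // assumed 0..10 scale
        int goals;
        Team(String name, double offence, double defence) {
            this.name = name; this.offence = offence; this.defence = defence;
        }
    }

    // One simulated match: a random number of "actions", each a chance to score.
    static void play(Team a, Team b) {
        int numberOfActions = 10 + RNG.nextInt(11); // uniform in [10, 20]
        for (int i = 0; i < numberOfActions; i++) {
            double fatigue = 1.0 - (double) i / numberOfActions; // decays over the match
            Team att = RNG.nextBoolean() ? a : b;                // who attacks this action
            Team def = (att == a) ? b : a;
            double chance = (att.offence * fatigue - def.defence) * 10; // percent
            if (chance > RNG.nextInt(100)) att.goals++;
        }
        System.out.println(a.name + " " + a.goals + " : " + b.goals + " " + b.name);
    }

    public static void main(String[] args) {
        play(new Team("TeamA", 7.5, 6.0), new Team("TeamB", 6.5, 7.0));
    }
}

Because each step only reads the teams' current state, the same loop also works when advanced one action every 3 seconds across many concurrent matches.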

What are the standard data structures that can be used to efficiently represent the world of minecraft? [closed]

I am thinking of something like a 3-dimensional matrix indexed by the x, y, z coordinates. But that would be a waste of memory, since a lot of block spaces are empty. Another solution would be a hashmap ((x,y,z) -> BlockObject), but that doesn't seem too efficient either.
When I say efficient, I do not mean optimal; it simply needs to run smoothly on a modern-day computer. Keep in mind that the worlds generated by Minecraft are quite huge, so efficiency is important regardless. There is also tons of meta-data that needs to be stored.
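To be concrete, the hashmap version I have in mind would be something like this sketch (with BlockObject simplified to a byte block-type id, and 0 assumed to mean "empty"):

import java.util.HashMap;
import java.util.Map;

// Sketch of the sparse (x,y,z) -> block idea from the question.
// Only non-empty blocks are stored; absent keys mean "air".
public class SparseWorld {
    record Pos(int x, int y, int z) {} // value-based key with hashCode/equals

    private final Map<Pos, Byte> blocks = new HashMap<>();

    void set(int x, int y, int z, byte type) { blocks.put(new Pos(x, y, z), type); }

    byte get(int x, int y, int z) {
        return blocks.getOrDefault(new Pos(x, y, z), (byte) 0); // 0 = empty (assumed)
    }
}

The per-entry overhead of the map is what makes this costly for dense regions, which is why the structures suggested below group blocks into larger units.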
As noted in my comment, I have no idea how Minecraft does this, but a common, efficient way of representing this sort of data is an octree: http://en.wikipedia.org/wiki/Octree. The general idea is that it's like a binary tree, but in three-space. You recursively divide each block of space in each dimension to get eight smaller blocks, and each block contains pointers to the smaller blocks and a pointer to its parent block.
This allows you to be efficient about storing large blocks of the same material (e.g., "empty space"), because you can terminate the recursion whenever you get to a block that is made up of all the same thing, even if you haven't recursed down to the level of individual "cube" units.
Also, this means that you can efficiently find all the cubes in a given region by taking your current block and going up the tree just far enough to get to a block that contains all you can see -- and that way, you can very easily ignore all the cubes that are somewhere else.
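A minimal Java sketch of the idea (the node layout and byte block-type ids are assumptions for illustration, not Minecraft's actual format):

// Hypothetical octree sketch: a node either stores one uniform block type
// for its whole cube, or eight children subdividing the cube.
public class Octree {
    Byte uniformType;   // non-null => the entire cube is this block type
    Octree[] children;  // non-null => subdivided into 8 octants
    final int size;     // cube edge length (power of two)

    Octree(int size, byte type) { this.size = size; this.uniformType = type; }

    byte get(int x, int y, int z) {
        if (uniformType != null) return uniformType; // recursion stops early: uniform cube
        int h = size / 2;
        int idx = (x >= h ? 1 : 0) | (y >= h ? 2 : 0) | (z >= h ? 4 : 0);
        return children[idx].get(x % h, y % h, z % h);
    }

    void set(int x, int y, int z, byte type) {
        if (size == 1) { uniformType = type; return; }
        if (uniformType != null) {          // split a uniform cube lazily before writing
            children = new Octree[8];
            for (int i = 0; i < 8; i++) children[i] = new Octree(size / 2, uniformType);
            uniformType = null;
        }
        int h = size / 2;
        int idx = (x >= h ? 1 : 0) | (y >= h ? 2 : 0) | (z >= h ? 4 : 0);
        children[idx].set(x % h, y % h, z % h, type);
        // (a production version would also re-merge children that become uniform)
    }
}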
If you're interested in exploring alternative means of representing Minecraft world (chunk) data, you can also look into the idea of bitstrings. Each 'chunk' comprises a 16*16*128 volume; each 16*16 layer can be represented with one bit per block and consolidated into a binary string.
As this approach is highly specific to a certain goal, trading client computation for highly optimized storage and transfer time, it seems imprudent to attempt to explain all the details, but I have created a specification for just this purpose, if you're interested.
Using this method, the storage cost is drastically different from the current 1-byte-per-block scheme; instead it is 'variable-bit-rate': ((1 bit per block, rounded up to a multiple of 8) * (number of unique layers in which a block type appears in the chunk) + 2 bytes).
This is then summed over the unique block types in that chunk.
Pretty much only in deliberate edge cases can this be more expensive than a normally structured chunk; in excess of 99% of Minecraft chunks are naturally generated and would benefit from this variable-bit representation by a ratio of 8:1 or more in many of my tests.
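As a rough illustration of the one-bit-per-block layer idea (this is my reading of the scheme above, not the author's actual specification):

import java.util.BitSet;

// Hypothetical sketch: one 16x16 layer's occupancy of a single block type,
// stored as 256 bits (at most 32 bytes) instead of 256 one-byte block ids.
public class LayerMask {
    private final BitSet bits = new BitSet(16 * 16);

    void set(int x, int z, boolean isThisType) { bits.set(x * 16 + z, isThisType); }

    boolean isType(int x, int z) { return bits.get(x * 16 + z); }

    byte[] toBytes() { return bits.toByteArray(); } // serialized layer mask
}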
Your best bet is to decompile Minecraft and look at the source. Modifying Minecraft: The Source Code is a nice walkthrough on how to do that.
Minecraft is very far from efficient. It just stores "chunks" of data.
Check out the "Map formats" in the Development Resources at Minecraft Wiki. AFAIK, the internal representation is exactly the same.

Why are numbers rounded in standard computer algorithms? [closed]

I understand that rounding makes algorithms faster and reduces storage use, and that these would have been critical features for software running on the hardware of previous decades, but is this still important? If the calculations were done with exact rational arithmetic, there would be no rounding errors at all, which would simplify many algorithms, as you would no longer have to worry about catastrophic cancellation or anything like that.
Floating point is much faster than arbitrary-precision and symbolic packages, and 12-16 significant figures is usually plenty for demanding science/engineering applications where non-integral computations are relevant.
The programming language ABC used rational numbers (x / y where x and y were integers) wherever possible.
Sometimes calculations would become very slow because the numerator and denominator had become very big.
So it turns out that it's a bad idea if you don't put some kind of limit on the numerator and denominator.
In the vast majority of computations, the size of the numbers required to compute answers exactly would quickly grow beyond the point where computation would be worth the effort, and in many calculations it would grow beyond the point where exact calculation would even be possible. Consider that even running something like a simple third-order IIR filter for a dozen iterations would require a fraction with thousands of bits in the denominator; running the algorithm for a few thousand iterations (hardly an unusual operation) could require more bits in the denominator than there are atoms in the universe.
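To make that growth concrete, here is a toy Java sketch: a first-order recurrence with made-up coefficients standing in for a real filter, computed with exact rationals, printing how many bits the reduced denominator needs:

import java.math.BigInteger;

// Toy sketch: iterate y <- (2/3)*y + 1/(n+2) exactly and watch the
// denominator grow. Real filters with real input data grow far faster,
// because every new input sample contributes its own denominator.
public class RationalGrowth {
    public static void main(String[] args) {
        BigInteger num = BigInteger.ONE, den = BigInteger.ONE; // y = 1/1
        final BigInteger A_NUM = BigInteger.valueOf(2), A_DEN = BigInteger.valueOf(3);
        for (int n = 1; n <= 1000; n++) {
            BigInteger xDen = BigInteger.valueOf(n + 2); // input sample x = 1/(n+2)
            // y = (A_NUM*num*xDen + A_DEN*den) / (A_DEN*den*xDen), then reduce
            BigInteger newNum = A_NUM.multiply(num).multiply(xDen)
                                     .add(A_DEN.multiply(den));
            BigInteger newDen = A_DEN.multiply(den).multiply(xDen);
            BigInteger g = newNum.gcd(newDen);
            num = newNum.divide(g);
            den = newDen.divide(g);
            if (n % 200 == 0)
                System.out.println("after " + n + " steps: denominator needs "
                        + den.bitLength() + " bits");
        }
    }
}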
Many numerical algorithms still require fixed-precision numbers in order to perform well enough. Such calculations can be implemented in hardware because the numbers fit entirely in registers, whereas arbitrary-precision calculations must be implemented in software, and there is a massive performance difference between the two. Ask anybody who crunches numbers for a living whether they'd be OK with things running X times slower, and they will probably say "no, that's completely unworkable."
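To get a feel for the gap, a crude (non-rigorous) Java comparison of hardware doubles against software decimal arithmetic; BigDecimal is fixed-scale decimal rather than exact rationals, but the software-arithmetic overhead is the point:

import java.math.BigDecimal;
import java.math.RoundingMode;

// Rough sketch, not a proper benchmark: sum 1/i with doubles vs. BigDecimal.
public class PrecisionCost {
    public static void main(String[] args) {
        int n = 1_000_000;
        long t0 = System.nanoTime();
        double d = 0;
        for (int i = 1; i <= n; i++) d += 1.0 / i;
        long t1 = System.nanoTime();
        BigDecimal b = BigDecimal.ZERO;
        for (int i = 1; i <= n; i++)
            b = b.add(BigDecimal.ONE.divide(BigDecimal.valueOf(i), 34, RoundingMode.HALF_EVEN));
        long t2 = System.nanoTime();
        System.out.printf("double: %d ms, BigDecimal: %d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
        System.out.println(d + " vs " + b.doubleValue()); // keep results live
    }
}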
Also, I think you'll find that arbitrary precision is impractical and sometimes even impossible. For example, the number of digits can grow fast enough that you'll eventually have to drop some, and then you're back to square one: rounding problems!
Finally, sometimes the digits beyond a certain precision do not matter anyway. For example, generally the number of significant digits should reflect the level of experimental uncertainty.
So, which algorithms do you have in mind?
Traditionally, integer arithmetic is easier and cheaper to implement in hardware (it uses less space on the die, so you can fit more units on there). Especially when you go into the DSP segment, this can make a lot of difference.

Simple tool for shipping dates estimates without uncertainty [closed]

What I'm looking for is very simple: I want a tool that computes the calculated shipping date (as opposed to one estimated from confidence intervals), given a list of tasks, each with a total estimate and current progress, without introducing further uncertainty, as I want to handle that externally.
I want it to take workday durations and user-input holidays into account, etc.
I know FogBugz's Evidence-Based Scheduling does something very close to that, but I would like it without the statistical aspect and the associated confidence intervals. I'm aware it's a drastic simplification and that statistical estimation is the essence of EBS, but I'm not looking for a subjective discussion here; I just want to be able to access this simple piece of information (the supposedly exact shipping date) at any given time during the project without having to calculate it myself.
So I'm looking for one of three things: 1) a way to customize FogBugz (6.0) to show me the information I want besides confidence intervals, 2) a way to customize FogBugz to set estimate uncertainty to 0, or 3) another (free) tool that does exactly what I want.
EDIT: By "supposedly exact" or "calculated", I don't mean with respect to what is actually going to happen; that would indeed be trying to predict the future. I mean with respect to the information that was input, together with its obvious uncertainty. In that case, I guess estimates for individual tasks should be seen more as spending limits or upper bounds. The information I would like to compute is really very simple: if everything goes exactly as specified, where does that take us? Then, with information about how the estimates were made, such as the ability of each individual developer to make good estimates, I can derive the confidence interval. EBS does this automatically and, undoubtedly, very well, which is why I use it. What I would like to obtain is one more little piece of information, i.e. the same starting point EBS uses, so that I can play with my own assumptions as to how the statistical estimation should be made.
FogBugz will show you the sum of estimates at the bottom of the LIST page, labelled "Total estimated time remaining". This is the raw sum of estimates, without any EBS calculations.
You can't predict the future. So any calculated shipping date can only be a guess. That guess depends on the confidence intervals around each individual number that went into it. This is a matter of definition -- even though you may not like it.
You may want to have a "100% confident" date, but such a thing (by definition) cannot exist. You cannot have an uncertainty of zero unless you want a date infinitely far in the future. It's the nature of statistics: the distribution is actually infinite, but data is considerably more likely to cluster around the mean.
The only thing you can do is pick a really big confidence interval (99.7%). You are free to ignore the supporting statistical facts about the confidence interval and pretend it has zero uncertainty. For all practical purposes 0.3% uncertainty is small enough that you're not going to be unhappy with that date.
However, all statistically-based predictions of the future must have uncertainty. It's a law.
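To make the 99.7% option concrete, a minimal sketch (made-up numbers, assuming independent task estimates and a normal approximation, i.e. the 3-sigma rule):

// Hypothetical sketch: a "99.7% confident" ship date under a normal approximation.
double[] meanDays  = {3.0, 5.0, 2.0, 8.0};  // per-task estimates (made up)
double[] sigmaDays = {1.0, 2.0, 0.5, 3.0};  // per-task uncertainties (made up)
double totalMean = 0, totalVar = 0;
for (int i = 0; i < meanDays.length; i++) {
    totalMean += meanDays[i];
    totalVar  += sigmaDays[i] * sigmaDays[i]; // variances add only if tasks are independent
}
// The "calculated" date the question asks for is just totalMean (18 days here);
// widening by three standard deviations gives the ~99.7% figure (~29.3 days).
double shipDays = totalMean + 3 * Math.sqrt(totalVar);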

What is the most effective way to present and communicate a performance improvement (e.g. percentages, raw data, graphics)? [closed]

Is it better to describe improvements using percentages or just the differences in the numbers? For example, if you improved the performance of a critical ETL SQL query from 4000 ms to 312 ms, how would you present it as an 'Accomplishment' on a performance review?
In currency. Money is the most effective medium for communicating value, which is what you're trying to use the performance review to demonstrate.
Person hours saved, (very roughly) estimated value of $NEW_THING_THE_COMPANY_CAN_DO_AS_RESULT, future hardware upgrades averted, etc.
You get the nice bonus that you show that you're sensitive to the company's financial position; a geek who can align himself with what the company is really about.
Take potato
Drench Potato in Lighter Fluid
Light potato on fire
Hand potato to boss
Make boss hold it for 4 seconds.
Ask boss how long those 4 seconds felt
Ask boss how much better half a second would have been
Bask in glory
It is always better to measure relative improvement.
So, if you brought it down to 312 ms from 4000 ms, that is an improvement of 3688 ms, which is 92.2% of the original runtime. So, you reduced the runtime by 92.2%. In other words, you brought the runtime down to only 7.8% of what it was originally.
Absolute numbers, on the other hand, usually are not that good since they are not comparable. (If your original runtime was 4,000,000ms then an improvement of 3688ms isn't that great.)
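In code form (a trivial sketch, using the numbers from the question):

double oldMs = 4000, newMs = 312;
double reduction    = (oldMs - newMs) / oldMs; // 0.922 -> runtime reduced by 92.2%
double fractionLeft = newMs / oldMs;           // 0.078 -> 7.8% of the original runtime
double speedup      = oldMs / newMs;           // ~12.8 -> roughly a 13x speedup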
See this link for some nice chart suggestions.
Comparison to Requirements
If I have requirements (response time, throughput), I like to color code the absolute numbers like so:
Green: <= 80% of the requirement (response time); >= 120% of the requirement (throughput)
No formatting: Meets the requirement.
Red: Does not meet the requirement.
Comparisons are interesting, but only if we have enough to see trends over time: is our performance steadily improving or degrading? Ultimately, the business only cares whether we're meeting the requirement. It's only when we don't that they ask for comparisons to previous releases.
Comparison of Benchmarks
If I'm comparing benchmarks to some baseline, then I like to use percentages, but only if the benchmark is a statistically significant change from the baseline.
Hardware Sizing
If I'm doing hardware sizing or capacity planning, then I like to express the performance as the absolute number plus the cost per transaction. For example:
System A: 1,000 transactions/second, $0.02/transaction
System B: 1,500 transactions/second, $0.04/transaction
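The cost-per-transaction figure is just amortized system cost divided by transaction volume over the same period, e.g. (made-up numbers):

double amortizedCostDollars = 20_000;   // assumed system cost over the period
double transactions = 1_000_000;        // assumed transaction volume over the same period
double costPerTx = amortizedCostDollars / transactions; // = $0.02 per transaction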
Use whichever appears most impressive given the change. According to one method of calculation, that change sped up the query by 1,300%, which looks more impressive than a 13x improvement, or:
============= <-- old query
= <-- new query
Although the graph isn't a bad method.
If you can calculate the improvement in money, then go for that. One piece of software I wrote many years ago saved a few engineers a little bit of time each day. Figuring in the cost of salary, benefits, and overhead, it turned into a savings of more than $12k per year for a small company.
-Adam
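A back-of-the-envelope version of that kind of calculation (all numbers made up):

// Hypothetical sketch: time saved converted to dollars per year.
int engineers = 3;
double minutesSavedPerDay = 10;
int workDaysPerYear = 250;
double loadedCostPerHour = 100; // salary + benefits + overhead (assumed)
double savingsPerYear = engineers * (minutesSavedPerDay / 60.0)
        * workDaysPerYear * loadedCostPerHour; // = $12,500 per year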
Rule of thumb: whichever sounds more impressive.
If you went from 10 tasks done in a period to 12, you could say you improved performance by 20%.
Saying you did two more tasks doesn't seem that impressive.
In your case, both numbers sound good, but try different representations and see what you get!
Sometimes graphics help a lot when the improvement is spread across a number of factors but the combined number somehow does not look that cool.
Example: You have 5 params A, B, C, D, E. You could make a bar chart with those 5 params and "before and after" values side by side for each param. That sure will look impressive.
God, I'm starting to sound like my friend from marketing!
runs away screaming
You can make numbers and graphs say anything you want; the important thing is to make them say something meaningful and relevant to the audience you're presenting them to. If it's end users, you can show them differences in screen refreshes (something they understand); for managers, perhaps the reduced number of servers they'll need in order to support the application ($ savings); for financial people, it's all about the $: how much did it save them? A general rule: the less technical the group, the more graphical and dramatic you need to be.
