When an optimization is no longer "micro-optimization" [closed] - performance

Closed 9 years ago.
I'm a team leader/feature architect who came up from a developer position, so I have coding experience, and a lot of what is being evolved nowadays was originally implemented by me. Now to the point: while reviewing some code for the sake of refactoring (and some nostalgia), I found a bunch of places that could be optimized, so as an exercise I gave myself two days to explore and improved a lot of things. After running a benchmark, I found that the overall module performance had improved by about 5%.
So I approached some colleagues (and the team I run) and presented my changes. I was surprised by the general impression that this was "micro-optimization". If you look at each optimization individually, then yes, they are micro, but if you look at the big picture...
So my question here is: When is an optimization no longer considered "micro"?

Whether an optimization is micro or not is usually not important. The important stuff is whether it gives you any bang for the buck.
You wrote that you spent two whole working days for a 5% performance increase. Did you spend those days wisely? Were the things you fixed the slowest parts of your application, or at least the performance issues that were easiest to fix? Did your changes let you reach a performance target that you weren't reaching before? Does 5% performance matter at all in your case? Usually you want something like a 100% or 1000% increase if you figure out that you need to improve performance.
Could you perform your optimizations without disturbing readability and/or maintainability of the code?
Besides, what other costs did those optimizations incur? How much regression testing were you required to perform? How many new bugs did you create?
I know this looks more like a list of questions than an answer, but these are the kinds of questions that should drive your decision about whether to make an optimization.

Personally, I would differentiate between changes that lead to a reduction in algorithmic time or space complexity (from O(N^2) to O(N), for example) and changes that speed up the code or reduce its memory requirements but keep the overall complexities the same. I'd call the latter micro optimizations.
However, keep in mind that while this is a precise definition it should not be the only criterion for deciding whether a change is worth keeping: Reduced code complexity (as in difficulty to understand) is often more important, especially if speed and memory requirements are not a major cause of concern.
Ultimately, the answer will depend on your project: for software running on embedded devices, the rules are different from those for stuff running on a Hadoop cluster.
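To make the distinction concrete, here is a minimal sketch (hypothetical functions, Python used just for illustration): the first change lowers the complexity class, while the second only shaves constant factors and would count as a micro-optimization.

```python
# Complexity-level change: O(N^2) -> O(N) duplicate detection.
def has_duplicates_quadratic(items):
    # For every element, scan the rest of the list: O(N^2).
    return any(x == y for i, x in enumerate(items) for y in items[i + 1:])

def has_duplicates_linear(items):
    # A set gives O(1) average-case membership tests, so one pass suffices: O(N).
    seen = set()
    for x in items:
        if x in seen:
            return True
        seen.add(x)
    return False

# Micro-optimization: same O(N) complexity, just less per-iteration overhead,
# e.g. hoisting a name lookup out of a hot loop.
def total_length_micro(strings):
    total = 0
    length = len          # bind the built-in to a local to avoid repeated lookups
    for s in strings:
        total += length(s)
    return total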

Related

KISS & design patterns [closed]

Closed 10 years ago.
I'm presented with a need to rewrite an old legacy desktop application. It is a smallish non-Java desktop program that still supports the daily tasks of several internal user communities.
The language in which the application is written is both antiquated and no longer supported. I'm a junior developer, and I need to rewrite it. In order to avoid the app-rewrite sinkhole, I'm planning on starting out using the existing database and data structures (there are some significant limitations, but as painful as refactoring will be, this approach will get the initial work done more quickly and avoid a migration, both of which are key to success).
My challenge is that I'm very conflicted about the concept of Keep It Simple. I understand that it is talking about functionality, not design. But as I look at writing this app, it seems like a tremendous amount of time could be spent chasing down design patterns (I'm really struggling with dependency injection in particular) when sticking with good (but non-"Gang of Four") design could get the job done dramatically faster and more simply.
This app will grow and live for a long time, but it will never become a 4 million line enterprise behemoth, and its objects aren't going to be used by another app (yes, I know, but what if....YAGNI!).
The question
Does KISS ever apply to architecture & design? Can the "refactor it later" rule be extended so far as to say, for example, "we'll come back around to dependency injection later" or is the project dooming itself if it doesn't bake in all the so-called critical framework support right away?
I want to do it "right"....but it also has to get done. If I make it too complex to finish, it'll be a failure regardless of design.
I'd say KISS certainly applies to architecture and design.
Over-architecture is a common problem in my experience, and there's a code smell that relates:
Contrived complexity: forced usage of overly complicated design patterns where simpler design would suffice.
If the use of a more advanced design pattern, paradigm, or architecture isn't appropriate for the scale of your project, don't force it.
You have to weigh the potential costs for the architecture against the potential savings... but also consider what the additional time savings will be for implementing it sooner rather than later.
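As a rough sketch of the "simple but still testable" middle ground (hypothetical names, not your actual codebase): plain constructor injection gives most of the benefit people want from dependency injection without pulling in a framework, and it can be migrated to a container later if that ever becomes necessary.

```python
class ReportService:
    """Depends on an abstraction passed in by the caller, not on a concrete class."""

    def __init__(self, repository):
        # The dependency is injected by whoever constructs the service.
        self._repository = repository

    def monthly_totals(self, month):
        rows = self._repository.fetch_orders(month)
        return sum(row["amount"] for row in rows)

# Production wiring is just a constructor call, e.g. (hypothetical class):
# service = ReportService(SqlOrderRepository(connection_string))

# ...and tests can pass a stub without any container or framework.
class StubRepository:
    def fetch_orders(self, month):
        return [{"amount": 10}, {"amount": 32}]

assert ReportService(StubRepository()).monthly_totals("2024-01") == 42
```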
Yes, KISS, but see Refactoring to Patterns by Joshua Kerievsky (http://www.amazon.com/Refactoring-Patterns-Joshua-Kerievsky/dp/0321213351) and consider refactoring towards a design pattern in small steps. The code should sort of tell you what to do.

How do I choose a CPU and a GPU for a fair comparison? [closed]

Closed 10 years ago.
I need to make a persuasive argument that a good GPU would be valuable to someone who needs to do certain calculations and may be willing to write his/her own code to do those calculations. I have written CUDA code to do the calculations quickly with a GPU, and I want to compare its computation time to that of a version adapted to use only a CPU. The difficult part is to argue that I am making a reasonably fair comparison even though I am not comparing apples to apples.
If I do not have a way to claim that the CPU and GPU that I have chosen are of comparable quality in some meaningful sense, then it could be argued that I might as well have deliberately chosen a good GPU and a bad CPU. How can I choose the CPU and GPU so that the comparison seems reasonable? My best idea is to choose a CPU and GPU that cost about the same amount of money; is there a better way?
There isn't a bulletproof comparison method. You could compare units with a similar price and/or units that were released around the same time. The latter would show that, comparing the state-of-the-art technology available at a given time in both product categories, the GPU comes out ahead.
I would consider three main costs:
Cost of initial purchase
Development cost for the application
Ongoing cost in power consumption
You can sum these up to get a total cost for using each solution for some period of time, for instance one year.
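A back-of-the-envelope sketch of that total-cost calculation, with purely made-up numbers (replace them with your own prices and measurements):

```python
def total_cost(purchase_price, dev_hours, hourly_rate, watts,
               hours_per_day, price_per_kwh, days=365):
    """Purchase + development + electricity over a period, all in one currency."""
    development = dev_hours * hourly_rate
    energy_kwh = watts / 1000 * hours_per_day * days
    return purchase_price + development + energy_kwh * price_per_kwh

# Hypothetical figures only: a cheaper CPU-only build vs. a GPU build that needs
# extra CUDA development time but finishes the daily workload much faster.
cpu_total = total_cost(purchase_price=400, dev_hours=40, hourly_rate=50,
                       watts=150, hours_per_day=8, price_per_kwh=0.25)
gpu_total = total_cost(purchase_price=900, dev_hours=120, hourly_rate=50,
                       watts=300, hours_per_day=1, price_per_kwh=0.25)
print(f"CPU solution: {cpu_total:.0f}, GPU solution: {gpu_total:.0f}")
```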
Also, take into account that it is only worth paying for increased performance if the increase is actually needed. For instance, you may need to run some calculation every day. If you have a full day to run the calculations, it may not matter whether they run in 5 minutes or an hour.

Is this defensive programming? [closed]

Closed 11 years ago.
I've always thought defensive programming was evil (and I still do), because in my experience defensive programming has typically involved some sort of unreasonable sacrifice based on unpredictable outcomes. For example, I've seen a lot of people try to code defensively against their own co-workers. They'll do things "just in case" the code changes in some way later on. They end up sacrificing performance in some way, or they'll resort to some silver bullet for all circumstances.
This specific coding practice, does it count as defensive programming? If not, what would this practice be called?
Wikipedia defines defensive programming as a guard against unpredictable usage of the software, but does not mention defensive programming strategies for protecting code integrity against other programmers, so I'm not sure whether the term applies, or what this practice is called.
Basically I want to be able to argue with the people that do this and tell them what they are doing is wrong, in a professional way. I want to be able to objectively argue against this because it does more harm than good.
"Overengineering" is wrong.
"Defensive Programming" is Good.
It takes wisdom, experience ... and maybe a standing policy of frequent code reviews ... to tell the difference.
It all depends on the specifics. If you're developing software for other programmers to reuse, it makes sense to do at least a little defensive programming. For instance, you can document requirements about input all you want, but sometimes you need to test that the input actually conforms to the requirements to avoid disastrous behavior (e.g., destroying a database). This usually involves a (trivial) performance hit.
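A minimal sketch of what that "little defensive programming" might look like at a public boundary (hypothetical names, Python just for illustration): validate what callers hand you, fail loudly, and let internal helpers trust their arguments.

```python
def archive_orders(order_ids, cutoff_date):
    """Public entry point: validate what callers pass in before touching storage."""
    if not order_ids:
        raise ValueError("order_ids must be a non-empty sequence")
    if any(not isinstance(oid, int) or oid <= 0 for oid in order_ids):
        raise ValueError("order_ids must contain positive integers")
    # Internal helpers below this boundary trust their arguments; re-checking
    # everything at every layer is where "defensive" turns into overengineering.
    return [_archive_one(oid, cutoff_date) for oid in order_ids]

def _archive_one(order_id, cutoff_date):
    # ... the real archiving work would go here ...
    return order_id
```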
On the other hand, defensiveness can be way overdone. Perhaps that is what's informing your view. A specific example or two would help distinguish what's going on.

What % of programming time do you spend debugging? [closed]

Closed 13 years ago.
What % of programming time do you spend debugging? What do you think are acceptable percentages for certain programming mediums?
About 90% of my time is spent debugging or refactoring/rewriting my coworkers' code that never worked but was still committed to Git as "working".
This might be explained by the bad morale in this (quite big) company, a result of poor management.
Management's opinion about my suggestions:
Unit Tests: forbidden, take too much time.
Development Environment: No spare server and working on live data is no problem, you just have to be careful.
QA/Testing: Developers can test on their own, no need for a separate tester.
Object Oriented Programming: Too complex, new programmers won't be able to understand the code fast enough.
Written Specs: Take too much time, it's easier to just tell the programmers to create what we need directly.
Developer Training: Too expensive and programmers won't be able to work while in the training.
Not a lot, now that I have lots of unit tests. Unless you count time spent writing tests and fixing failing tests as debugging time, which I don't really. It's relatively rare now to have to step through code in order to see why a test is failing.
How much time you have to spend debugging depends on the codebase. If it is too high, that is likely a symptom of other problems, e.g. lack of adequate exception handling, logging, testing, repeatability, etc. What counts as "too high" is subjective.
If you do have to debug an error, think about making a failing test before you fix it, so that the error does not recur.
The worst that I have had to work on was a large and complex simulation written entirely without tests. Sometimes it failed in the middle of a run, and to reproduce a crash involved setting a breakpoint, starting the run and waiting half an hour or more. Then make a change and repeat. Don't ever get yourself into that morale-sapping and productivity-destroying situation.
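A minimal sketch of the "failing test first" habit mentioned above, using Python's unittest with a made-up function and a made-up bug report:

```python
import unittest

def parse_price(text):
    # Imagine the reported bug: "$ 1,299.00" used to crash the importer.
    return float(text.replace("$", "").replace(",", "").strip())

class ParsePriceRegressionTest(unittest.TestCase):
    def test_price_with_currency_symbol_and_thousands_separator(self):
        # Written to fail before the fix; it now pins the behaviour down
        # so the same bug cannot quietly come back.
        self.assertEqual(parse_price("$ 1,299.00"), 1299.0)

if __name__ == "__main__":
    unittest.main()
```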
There is so much variety when it comes to writing software that it's impossible to give a solid answer. Complexity can increase debugging time: for example, if the codebase is very large and the code itself is poorly written, the amount of time spent debugging goes up.
One way to reduce the debugging time is to write unit tests. I've been doing this for a while and found it helps reduce the number of bugs which are released to the customer.

Current "hot" topics in parallel programming? [closed]

Closed 11 years ago.
I hope this is the right place to ask[1], but I've read a lot of good comments on other topics here, so I'll just ask. At the moment I'm searching for a topic for my dissertation (a Ph.D. in non-German countries, I think), which must have something to do with parallelism or concurrency, etc., but otherwise I'm quite free to choose what I'm interested in. Anything involving GPUs is off the table, because a colleague of mine already does research on that topic and we'd like to have something else for me :)
So, the magic question is: what would you say are interesting topics in this area? Personally I'm interested in parallel functional programming languages and virtual machines in general, but I'd say that a lot of work has already been done there or is being actively researched (e.g. in the Haskell community).
I'd greatly appreciate any help in pointing me to other interesting topics.
Best regards,
Michael
PS: I've already looked at https://stackoverflow.com/questions/212253/what-are-the-developments-going-on-in-all-languages-in-parallel-programming-area but there weren't a lot of answers.
[1] I've already asked at http://lambda-the-ultimate.org but unfortunately there was not as much response as I had hoped.
Software transactional memory?
Another research area is automated parallelization. That is to say, given a sequence of instructions S0..Sn, come up with multiple sequences that perform the same work in fewer steps.
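As a toy illustration of the kind of transformation an automatic parallelizer aims to discover on its own (hand-written here, with made-up functions, purely to show the idea):

```python
from concurrent.futures import ProcessPoolExecutor

def sum_of_squares_sequential(numbers):
    # The straightforward sequence of instructions a parallelizer would analyse.
    total = 0
    for n in numbers:
        total += n * n
    return total

def sum_of_squares_parallel(numbers, workers=4):
    # The hand-written equivalent of what the tool should derive: the iterations
    # are independent, so the data can be split into chunks and reduced at the end.
    chunks = [numbers[i::workers] for i in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares_sequential, chunks))

if __name__ == "__main__":
    data = list(range(1_000_000))
    assert sum_of_squares_sequential(data) == sum_of_squares_parallel(data)
```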
Erlang programming!
Off the top of my head:
Load balancing and how to achieve the best level of parallelization. I think it can be a very good starting point for a PhD, because you can propose a new methodology and compare it against real measured values (number of steps, as already mentioned, CPU usage, memory usage, etc.), either in general or for a particular algorithm or set of tasks (image processing, for example).
Parallel Garbage Collection. There are a lot of algorithms for collection, and a lot of algorithms for representing objects in memory. For example, there is recent work from the Haskell community on parallel GC: http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel-gc/index.htm
Then again, there is a good way to present your results and compare them with others, and it gives you flexibility in the end: you can focus on concurrent data structures later, or on synchronization primitives, or algorithms, etc.
You probably already have your PhD by now ;). Anyway: fault tolerance on massively parallel systems comes to mind.
Parallel processing and rules engines are both high-visibility topics in the commercial/industrial computing world. So, how about looking at parallel implementations of the Rete algorithm (introductory descriptions here and here), the foundation under many commercial business rules engines? Are there techniques for building Rete networks that are better suited for parallelization? Could a "vanilla" Rete network be refactored into multiple networks that could be executed more effectively in parallel? Etc.
Parallelism-friendly common-application features. So far, parallelism work has focused heavily on scientific computing and programming languages, but not so much on consumer applications or consumer-application-friendly features/data structures/design patterns, and these will be very important in a multi-core world.
You mentioned Haskell, so you have surely stumbled upon Data Parallel Haskell. Since big data analysis is a buzzword lately, and given that the Map/Reduce niche is overcrowded, I think that DPH is a good area of research.
