Recommendations for Fast Multipole Method implementation? - algorithm

I'm interested in implementing the Fast Multipole Method to efficiently simulate a system of repulsive particles.
I've found a large collection of references discussing FMM, but none seem very approachable for non-mathematicians who want to fully understand the algorithm.
Can you recommend a ground-up reference that clearly explains the mathematics behind the process, and includes pseudocode exemplifying a proper implementation?

I am by no means an expert in FMM, but this Java implementation and introduction is the best source I've found so far for explaining it carefully and slowly. The paper is good at defining terms before using them, and the code is at least useful as a reference point. The math still gets hairy very quickly, but it is what it is :)
A pedestrian introduction to fast multipole methods is a close second. It doesn't explain the actual details of a working FMM implementation, but it's a good introduction to the basic ideas.

I like the short course on FMM. It begins with FMM in 1D, then uses the theory of complex variables to do FMM in 2D. Then there is the crazy 3D version, which uses the theory of spherical harmonics and which I guess can be very difficult for a non-mathematician. But if you need FMM only in 2D you should be fine.
Unfortunately no pseudocode is given there.
But do you really need the accuracy of FMM? You might be fine with the Barnes-Hut algorithm.
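To give a feel for the idea that Barnes-Hut and FMM share, here is a minimal Python sketch, assuming a 1/r repulsive potential and a made-up cluster of four charges. It replaces a distant cluster with a single point carrying the total charge at the centre of charge and compares that with the direct sum. The function names and sample data are illustrative only; a real Barnes-Hut code adds a tree and an opening-angle test, and FMM keeps higher-order expansion terms instead of stopping at this one.

```python
import math

def direct_potential(target, sources):
    """Exact potential at `target`: sum q / r over every source."""
    tx, ty = target
    return sum(q / math.hypot(tx - x, ty - y) for x, y, q in sources)

def monopole_potential(target, sources):
    """Approximate the cluster by its total charge at its centre of charge."""
    total = sum(q for _, _, q in sources)
    cx = sum(x * q for x, _, q in sources) / total
    cy = sum(y * q for _, y, q in sources) / total
    return total / math.hypot(target[0] - cx, target[1] - cy)

# Hypothetical cluster: (x, y, charge) triples inside the unit square.
cluster = [(0.1, 0.2, 1.0), (0.8, 0.3, 2.0), (0.4, 0.9, 1.5), (0.6, 0.6, 0.5)]
far_target = (20.0, 15.0)

exact = direct_potential(far_target, cluster)
approx = monopole_potential(far_target, cluster)
relative_error = abs(exact - approx) / exact
print(f"exact={exact:.6f} approx={approx:.6f} relative error={relative_error:.2e}")
```

The approximation gets worse as the target moves closer to the cluster, which is why tree codes only use it for cells that are far away relative to their size.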

After running into a similar issue to you, I ended up writing a fully-documented Python fast multipole method implementation, pybbfmm. I've also written a short, mathematics-free tutorial on how the method works. Together, I think they're substantially more accessible than any of the other presentations I could find.
(meta: Although this is effectively a linkpost, the OP is explicitly asking for a link. I've added what I think was missing from the last one - the name of the library - but I'm not sure how else to offer this answer except as a name and a link. Certainly it doesn't feel any more linkpost-y than the accepted answer. If this one gets deleted as well, I'll give up)

Related

What algorithm is more efficient in computing eigenvectors?

I need to find all eigenpairs of a symmetric real matrix. A quick search shows that there are 5 main algorithms for doing that:
1. Jacobi rotations. Every researcher says it's ineffective.
2. QR. My textbook says it's the best way.
3. Divide-and-conquer. Wikipedia says it's better than QR.
4. MRRR. Never ever heard of it. Any paper seems incomprehensible to me. However, it's newer than DnC, so it must be faster.
5. Homotopy method. Don't know anything on that.
2, 3 and 4 require the said matrix to be first brought to Hessenberg form. My question is, which one of those is actually the best? I failed to find a paper with the desired comparison. Moreover, some papers are behind a paywall. Beyond that, I tried to read the LAPACK code, but this seems impossible as it is optimised as hell. I need your help!
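To make the first method on the list concrete, here is a minimal pure-Python sketch of cyclic Jacobi rotations for a small symmetric matrix. The name `jacobi_eigen` and its parameters are made up for illustration. It says nothing about which of the five methods is fastest, and a tuned library routine is the sensible choice for real work.

```python
import math

def jacobi_eigen(matrix, tol=1e-18, max_sweeps=50):
    """Return (eigenvalues, eigenvectors-as-columns) of a symmetric matrix."""
    n = len(matrix)
    a = [row[:] for row in matrix]                      # work on a copy
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        # Stop once the squared off-diagonal mass is negligible.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle in [-pi/4, pi/4] that zeroes a[p][q].
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):                      # columns: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):                      # rows: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):                      # accumulate V <- V J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v

values, vectors = jacobi_eigen([[2.0, 1.0], [1.0, 2.0]])
print(sorted(values))
```

Column `j` of `vectors` is the eigenvector that belongs to `values[j]`; the eigenvalues come back unsorted.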

What's the most effective workflow between people who develop algorithms and developers?

We are developing software that does pattern recognition in video. We have 7 mathematicians who create the algorithms, plus 2 developers who maintain and develop the application that uses these algorithms. The problem is that the mathematicians use different development tools to create algorithms, such as Matlab, C, and C++. Also, because they are not developers, they don't pay much attention to memory management or multi-threading. This is one of the reasons why the app has a lot of bugs.
If you have a similar situation in your company, how do you deal with it? What are the best tools you can recommend for creating algorithms? What should communication between mathematicians and developers look like? In your opinion, what is the most effective way to work with high-level tools?
I am not sure whether you devs are rewriting the mathematicians' stuff or just have to interface with it, so I am not sure if my answer is of any use.
However: I work together with a bunch of PhD candidates and postdocs on a machine learning library, and am a student myself. In that process, I came to translate a lot of algorithms from Python/NumPy to C++/BLAS.
This process can be quite tedious - especially with numerical and stochastic algorithms, it is hard to find bugs.
So here is what I did: Get some sample inputs and calculate the results with the Python code. Generate unit tests for the C++ version out of these, and then start coding the algorithms in C++.
Checking concrete sample inputs with the outputs is essential in this setting.
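A minimal sketch of that workflow, kept entirely in Python so it runs on its own. `softmax_reference` stands in for the researchers' original code and `softmax_port` for the translation (both hypothetical; in the setting described above the port would be C++ and the recorded cases would be written to a file the C++ tests read).

```python
import math

def softmax_reference(xs):
    """Stand-in for the original research code (hypothetical example)."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_port(xs):
    """Stand-in for the translated version; subtracts the max for stability."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def record_cases(reference, inputs):
    """Run the reference once and keep (input, expected output) pairs."""
    return [(x, reference(x)) for x in inputs]

def check_port(port, cases, rel_tol=1e-9):
    """Return the inputs whose ported output drifts from the recorded one."""
    failures = []
    for x, expected in cases:
        actual = port(x)
        same = len(actual) == len(expected) and all(
            math.isclose(a, e, rel_tol=rel_tol, abs_tol=1e-12)
            for a, e in zip(actual, expected)
        )
        if not same:
            failures.append(x)
    return failures

sample_inputs = [[0.0, 0.0], [1.0, 2.0, 3.0], [-5.0, 0.5, 4.0]]
cases = record_cases(softmax_reference, sample_inputs)
print(check_port(softmax_port, cases))   # an empty list means the port agrees
```

Comparing with a tolerance rather than `==` matters for numerical code, since a correct port will rarely match the reference bit for bit.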
I agree with Makach.
Let the guys who are creating the algorithms use the tools that they are most familiar with, because there are two separate (and equally important) tasks to work on in this project. First, there is the creation of an efficient, elegant, and mathematically sound algorithm; then there is the twistedly difficult task of translating it into CPU-speak. The mathematicians should focus on the first task, and to make it easier for them, allow them to use the tools they are comfortable with. In terms of man-hours, it is a much more efficient use of their time to write MATLAB code than it would be to have them learn a new programming language.
Your task is to unearth the (presumably) brilliant mathematics that are buried within the gibberish code.
That part is just a perspective on the problem at hand. Here's the actual answer.
Communication, mutual respect, and teaching/learning.
Communication & Mutual Respect
You must communicate with them often. Work closely with them and ask them questions whenever you come across something you're not sure of. This is much easier when there is mutual respect, which means that if you spend all your time criticizing their coding abilities, then they will be forced to spend all their time criticizing your math abilities. Instead, try quick learning-sessions. ("Lunch & Learn" is a fairly common tactic)
Teaching/Learning
The first and most important piece of wisdom to impart to them is commenting. Have them comment the crap out of their code. Tell them that the comments are much more important than the code quality, and that as long as their comments are right, they can leave the rest up to you guys. Because they can. They don't need to have their code look beautiful, or be the fastest; they just need it to make sense to you guys.
To continue this mutual learning scenario, if you notice some very simple, very common mistakes they are making (nothing NEARLY as complicated as multithreading), just give them a quick heads up. "That way works (or doesn't) but here's a way to do it that is a little different but it will make your lives much much easier." Encourage them to reciprocate by noticing which nuances or parts of their algorithms you and your team are having difficulty with and giving a little tutorial about them.
Once you guys get the communication flowing, you'll find it easier and easier to shape their coding style to what is best for your team, and they will also find it easier to understand why you don't see it the same way they do.
Also, as mentioned by Kekoav, make sure they provide a few fully loaded test cases.
That means for something like
A -> B -> C -> D -> Solution
They would provide you all the values for A, then what it looks like at B, then what it looks like at C and so on. So that you can be certain that not only is it correct at the end, but it's also correct at every step of the way. Try to have them provide examples that are regular, and also a few of them that are unusual, so that you can be certain your code covers edge cases.
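A minimal sketch of such a stage-by-stage check, assuming a made-up three-step pipeline (normalise, square, sum) in place of the real algorithm. The stage functions and the expected values are illustrative; in practice the expected values would come from the mathematicians' own run.

```python
import math

def normalise(values):          # A -> B
    peak = max(abs(v) for v in values)
    return [v / peak for v in values]

def square(values):             # B -> C
    return [v * v for v in values]

def total(values):              # C -> D
    return sum(values)

STAGES = [("B", normalise), ("C", square), ("D", total)]

def first_divergence(start, expected_by_stage, rel_tol=1e-9):
    """Run each stage; return the name of the first one that disagrees, or None."""
    current = start
    for name, stage in STAGES:
        current = stage(current)
        expected = expected_by_stage[name]
        # Treat scalars as one-element lists so every stage compares the same way.
        got = current if isinstance(current, list) else [current]
        want = expected if isinstance(expected, list) else [expected]
        if len(got) != len(want) or not all(
            math.isclose(g, w, rel_tol=rel_tol, abs_tol=1e-12)
            for g, w in zip(got, want)
        ):
            return name
    return None

a_values = [2.0, -4.0, 1.0]
expected = {"B": [0.5, -1.0, 0.25], "C": [0.25, 1.0, 0.0625], "D": 1.3125}
print(first_divergence(a_values, expected))   # prints None when every stage matches
```

Knowing which stage diverges first narrows a bug down to one step of the translation instead of the whole pipeline.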
I'd recommend the devs spend a few hours getting used to Matlab, especially the Matlab debugger. If their background is CS then they'll already be familiar with vectors and matrices theoretically if not practically. Other than the matrix being the default data structure, Matlab is C-like and easy enough to interpret for translation into another language.
I have been working with a physics professor lately, and have a little experience with this (although admittedly I'm no expert).
I have had to translate a lot of Matlab code into another language. It has been difficult because a lot (most) of the operations are absent in the target language, including anything to do with precision and with working with matrices and vectors. A good math library needs to be found, or created, to fit your needs.
The best way that I have found is to do the following:
Get the algorithm to work correctly in the new language.
Create some tests to verify that the algorithm is producing desired output. Have your mathematicians verify that your converted solution in fact works, and that you have covered all bases with your tests.
Then after it is working, and you can trust your tests, optimize the algorithm to be good coding style, have good design and performance characteristics. Use your regression tests to make sure you aren't breaking anything.
I normally start with a verbatim copy of their algorithms into the other language, and then work from there, regardless of if I do a lot of tests.
It is important to get a working copy first, in case the performance is really not an issue and you need to move on to other things and can come back later to make it faster.
This is your job. How you deal with this is what identifies you as a system developer.
Communicate with your colleagues. Draw and explain, have meetings, agree upon and set standards requirements, follow your plans and talk to your project manager. Make sure that your relevant colleagues are joining up on meetings. Have 1-1 talks etc etc
You cannot blame the mathematicians for bugs the developers create. Implementation is the developers' job to worry about, not the mathematicians'.

How to cultivate algorithm intuition?

When faced with a problem in software I usually see a solution right away. Of course, what I see is usually somewhat off, and I always need to sit down and design (admittedly, I usually don't design enough), but I get a certain intuition right away.
My problem is I don't get that same intuition when it comes to advanced algorithms. I feel much more up to the task of building another Facebook than building another Google search or a Music Genome Project. It's probably because I've been building software for quite some time, but I have little experience with composing algorithms.
I would like the community's advice on what to read and what projects to undertake to be better at composing algorithms.
(This question has nothing to do with Algorithmic composition. Well, almost nothing)
+1 To whoever said experience is the best teacher.
There are several online portals which have a lot of programming problems, that you can submit your own solutions to, and get an automated pass/fail indication.
http://www.spoj.pl/
http://uva.onlinejudge.org/
http://www.topcoder.com/tc
http://code.google.com/codejam/contests.html
http://projecteuler.net/
https://codeforces.com
https://leetcode.com
The USACO training site is the training program that all USA computing olympiad participants go through. It goes step by step, introducing more and more complex algorithms as you go.
You might find it helpful to perform algorithms physically. For example, when you're studying sorting algorithms, practice doing each one with a deck of cards. That will activate different parts of your brain than reading or programming alone will.
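Insertion sort is the one most people already do by hand with cards: pick up the next card and slide it left until it sits in order. As a small sketch to lay next to the deck (the function name and the sample hand are made up), this version records the hand after each card is placed so you can compare it with what is on the table.

```python
def insertion_sort_steps(cards):
    """Sort a copy of `cards`, returning the hand after each card is placed."""
    hand = list(cards)
    snapshots = []
    for i in range(1, len(hand)):
        card = hand[i]                 # the card just picked up
        j = i - 1
        while j >= 0 and hand[j] > card:
            hand[j + 1] = hand[j]      # slide the larger card one slot right
            j -= 1
        hand[j + 1] = card             # drop the new card into the gap
        snapshots.append(list(hand))
    return snapshots

for step in insertion_sort_steps([7, 2, 9, 4]):
    print(step)
# [2, 7, 9, 4]
# [2, 7, 9, 4]
# [2, 4, 7, 9]
```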
Steve Yegge referred to "The Algorithm Design Manual" in one of his rants. I haven't seen it myself, but it sounds like it's just the ticket from his description.
My absolute favorite for this kind of interview preparation is Steven Skiena's The Algorithm Design Manual. More than any other book it helped me understand just how astonishingly commonplace (and important) graph problems are – they should be part of every working programmer's toolkit. The book also covers basic data structures and sorting algorithms, which is a nice bonus. But the gold mine is the second half of the book, which is a sort of encyclopedia of 1-pagers on zillions of useful problems and various ways to solve them, without too much detail. Almost every 1-pager has a simple picture, making it easy to remember. This is a great way to learn how to identify hundreds of problem types.
problem domain
First you must understand the problem domain. An elegant solution to the wrong problem is no good, nor is an inefficient solution to the right problem in most cases. Solution quality, in other words, is often relative. A simple scheduling problem that has a deterministic solution that takes ten minutes to run may be fine if schedules are recalculated once per week, but if schedules change several times a day then a genetic algorithm solution that converges in a few seconds may be required.
decomposition and mapping
Second, decompose the problem into sub-problems and known/unknown elements that correspond to elements of the solution. Sometimes this is obvious, e.g. to count widgets you need a way of identifying widgets, an incrementable counter, and a way of storing the count. Sometimes it is not so obvious. Sometimes you have to decompose the problem, the domain, and possible solutions at the same time and try several different mappings between them to find one that leads to the correct results [this is the general method].
model
Model the solution, in your head at least, and walk through it to see if it works correctly. Adjust as necessary (See decomposition and mapping, above).
composition/interfaces
Many times you can find elements of the problem and elements of the solution that map to each other and produce partial results that are useful. This composition and interface construction provides the kernel of the solution, and also serves to reduce the scope of the problem remaining. So then you just loop back to the top with a smaller initial problem, and go through it again.
experience
Experience is the best teacher, of course, but reading about different kinds of problems and solutions will also be helpful. Studying some of the well-known algorithms and their applications is likewise very helpful, e.g. Dijkstra, Bresenham, Unification, and of course, graph theory.
I am not sure intuition can be cultivated, but I think I know what you are asking. The more problems you solve, the more information and experience you have at your disposal for future problems. So, I say just practice. Practice programming real world applications and you run into plenty of problems. Sometimes, solving puzzles can be very educational as well.
I try to find physical analogues when I'm looking at a complex problem.

Minimum CompSci Knowledge Needed for Writing Desktop Apps

Having been a hobbyist programmer for 3 years (mainly Python and C) and never having written an application longer than 500 lines of code, I find myself faced with two choices :
(1) Learn the essentials of data structures and algorithm design so I can become a l33t computer scientist.
(2) Learn Qt, which would help me build projects I have been itching to build for a long time.
For learning (1), everyone seems to recommend reading CLRS. Unfortunately, reading CLRS would take me at least a year of study (or more, I'm not Peter Krumins). I also understand that to accomplish any moderately complex task using (2), I will need to understand at least the fundamentals of (1), which brings me to my question: assuming I use C++ as the programming language of choice, which parts of CLRS would give me sufficient knowledge of algorithms and data structures to work on large projects using (2)?
In other words, I need a list of theoretical CompSci topics absolutely essential for everyday application programming tasks. Also, I want to use CLRS as a handy reference, so I don't want to skip any material critical to understanding the later sections of the book.
Don't get me wrong here. Discrete math and the theoretical underpinnings of CompSci have been on my "TODO: URGENT" list for about 6 months now, but I just don't have enough time owing to college work. After a long time, I have 15 days off to do whatever the hell I like, and I want to spend these 15 days building applications I really want to build rather than sitting at my desk, pen and paper in hand, trying to write down the solution to a textbook problem.
(BTW, a less-math-more-code resource on algorithms will be highly appreciated. I'm just out of high school and my math is not at the level it should be.)
Thanks :)
This could be considered heresy, but the vast majority of application code does not require much understanding of algorithms and data structures. Most languages provide libraries which contain collection classes, searching and sorting algorithms, etc. You generally don't need to understand the theory behind how these work, just use them!
However, if you've never written anything longer than 500 lines, then there are a lot of things you DO need to learn, such as how to write your application's code so that it's flexible, maintainable, etc.
For a less-math, more code resource on algorithms than CLRS, check out Algorithms in a Nutshell. If you're going to be writing desktop applications, I don't consider CLRS to be required reading. If you're using C++ I think Sedgewick is a more appropriate choice.
Try some online comp sci courses. Berkeley has some, as does MIT. Software engineering radio is a great podcast also.
See these questions as well:
What are some good computer science resources for a blind programmer?
https://stackoverflow.com/questions/360542/plumber-programmers-vs-computer-scientists#360554
Heed the wisdom of Don and just do it. Can you define the features that you want your application to have? Can you break those features down into smaller tasks? Can you organize the code produced by those tasks into a coherent structure?
Of course you can. Identify any 'risky' areas (areas that you do not understand, e.g. something that requires more math than you know, or special algorithms you would have to research) and either find another solution, prototype a solution, or come back to SO and ask specific questions.
Moving from 500 LOC to a real (even if small) application is not that easy.
As Don was pointing out, you'll need to learn a lot of things about code (flexibility, reuse, etc.), and you'll need to learn some basics of configuration management as well (Visual SourceSafe, SVN?).
But the main issue is that you need a way to avoid being overwhelmed by your growing set of features and the code behind them. That is not easy. What I can suggest is to put something in place to 'automatically' test your code (even in a very basic way) via some regression tests. Otherwise it's going to be hard.
As you can see, I think it's not related at all to data structures, algorithms, or whatever.
Good luck and let us know
I must say that sitting down with a dry old textbook and reading it through is not the way to learn how to do anything effectively, even if you are making notes. Doing it is the best way to learn, using the textbooks as a reference. Indeed, using sites like this as a reference.
As for data structures - learn which one is good for whatever situation you envision: Sets (sorted and unsorted), Lists (ArrayList, LinkedList), Maps (HashMap, TreeMap). Complexity of doing basic operations - adding, removing, searching, sorting, etc. That will help you to select an appropriate library data structure to use in your application.
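The names above are Java's; as a rough sketch, here is how the same choices look with Python's built-ins (chosen only to keep the example short, and the sample data is made up). C++ offers the same families in its standard library.

```python
from bisect import insort
from collections import deque

# Membership: a list scans every element, a set hashes straight to it.
names_list = ["ada", "grace", "linus"]
names_set = set(names_list)
found = "grace" in names_set          # average O(1); `in names_list` is O(n)

# Lookup by key: a dict plays the role the answer gives to HashMap.
ages = {"ada": 36, "grace": 85}       # made-up sample values
grace_age = ages.get("grace")

# Sorted order: keep a list sorted on insert instead of re-sorting each time.
scores = []
for s in (40, 10, 30):
    insort(scores, s)                 # O(log n) search, O(n) shift

# Queue: deque pops from the left in O(1); list.pop(0) is O(n).
jobs = deque(["parse", "render"])
first_job = jobs.popleft()

print(found, grace_age, scores, first_job)
```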
And also make sure you're reasonably comfortable with MVC - i.e., ensure your model is separate from your view (the Qt front-end) as best as possible. Best would be to have the model and algorithms working on their own, and then put the GUI on top. Or a unit test on top. Etc...
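A minimal sketch of that separation, with a plain text "view" standing in for a Qt widget so the example runs without any GUI toolkit. `CounterModel` and `TextView` are hypothetical names; the point is only that the model never imports or calls anything from the front-end.

```python
class CounterModel:
    """Application state and rules; imports nothing from any GUI toolkit."""

    def __init__(self):
        self.count = 0
        self._listeners = []

    def subscribe(self, callback):
        """Register a function to call with the new count after each change."""
        self._listeners.append(callback)

    def increment(self):
        self.count += 1
        for notify in self._listeners:
            notify(self.count)


class TextView:
    """Stand-in for a widget: it only renders what the model reports."""

    def __init__(self, model):
        self.lines = []
        model.subscribe(self.render)

    def render(self, count):
        self.lines.append(f"count = {count}")


model = CounterModel()
view = TextView(model)
model.increment()
model.increment()
print(view.lines)   # ['count = 1', 'count = 2']
```

Because the model knows nothing about the view, a unit test can subscribe to it in exactly the same way the GUI would.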
Good luck!
It's like saying you want to move to France, so should you learn French from a book, and what are the essential words - or should you just go to France and find out which words you need to know from experience and from copying the locals.
Writing code is part of learning computer science. I was writing code long before I'd even heard of the term, and lots of people were writing code before the term was invented.
Besides, you say you're itching to write certain applications. That can't be taught, so just go ahead and do it. Some things you only learn by doing.
(The theoretical foundations will just give you a deeper understanding of what you wind up doing anyway, which will mainly be copying other people's approaches. The only caveat is that in some cases the theoretical stuff will tell you what's futile to attempt - e.g. if one of your itches is to solve an NP complete problem, you probably won't succeed :-)
I would say the practical aspects of coding are more important. In particular, source control is vital if you don't use that already. I like bzr as an easy to set up and use system, though GUI support isn't as mature as it could be.
I'd then move on to one or both of the classics about the craft of coding, namely
The Pragmatic Programmer
Code Complete 2
You could also check out the list of recommended books on Stack Overflow.

How often do you use pseudocode in the real world?

Back in college, only the use of pseudocode was evangelized more than OOP in my curriculum. Just like commenting (and other preached 'best practices'), I found that in crunch time pseudocode was often neglected. So my question is... who actually uses it a lot of the time? Or do you only use it when an algorithm is really hard to conceptualize entirely in your head? I'm interested in responses from everyone: wet-behind-the-ears junior developers to grizzled vets who were around back in the punch card days.
As for me personally, I mostly only use it for the difficult stuff.
I use it all the time. Any time I have to explain a design decision, I'll use it. Talking to non-technical staff, I'll use it. It has application not only for programming, but for explaining how anything is done.
Working with a team on multiple platforms (Java front-end with a COBOL backend, in this case) it's much easier to explain how a bit of code works using pseudocode than it is to show real code.
During design stage, pseudocode is especially useful because it helps you see the solution and whether or not it's feasible. I've seen some designs that looked very elegant, only to try to implement them and realize I couldn't even generate pseudocode. Turned out, the designer had never tried thinking about a theoretical implementation. Had he tried to write up some pseudocode representing his solution, I never would have had to waste 2 weeks trying to figure out why I couldn't get it to work.
I use pseudocode when away from a computer and only have paper and pen. It doesn't make much sense to worry about syntax for code that won't compile (can't compile paper).
I almost always use it nowadays when creating any non-trivial routines. I create the pseudo code as comments, and continue to expand it until I get to the point that I can just write the equivalent code below it. I have found this significantly speeds up development, reduces the "just write code" syndrome that often requires rewrites for things that weren't originally considered as it forces you to think through the entire process before writing actual code, and serves as good base for code documentation after it is written.
I and the other developers on my team use it all the time: in emails, on the whiteboard, or just in conversation. Pseudocode is taught to help you think the way you need to in order to program. If you really understand pseudocode, you can catch on to almost any programming language, because the main difference between them all is syntax.
If I'm working out something complex, I use it a lot, but I use it as comments. For instance, I'll stub out the procedure, and put in each step I think I need to do. As I then write the code, I'll leave the comments: it says what I was trying to do.
procedure GetTextFromValidIndex (input int indexValue, output string textValue)
    // initialize
    // check to see if indexValue is within the acceptable range
    //    get min, max from db
    //    if indexValue not between min and max
    //       then return with an error
    // find corresponding text in db based on indexValue
    // return textValue
    return "Not Written";
end procedure;
I've never, not even once, needed to write the pseudocode of a program before writing it.
However, occasionally I've had to write pseudocode after writing code, which usually happens when I'm trying to describe the high-level implementation of a program to get someone up to speed with new code in a short amount of time. And by "high-level implementation", I mean one line of pseudocode describes 50 or so lines of C#, for example:
Core dumps a bunch of XML files to a folder and runs the process.exe
executable with a few commandline parameters.
The process.exe reads each file
Each file is read line by line
Unique words are pulled out of the file and stored in a database
File is deleted when it's finished processing
That kind of pseudocode is good enough to describe roughly 1000 lines of code, and good enough to accurately inform a newbie what the program is actually doing.
On many occasions when I don't know how to solve a problem, I actually find myself drawing my modules on a whiteboard in very high level terms to get a clear picture of how they're interacting, drawing a prototype of a database schema, drawing a data structure (especially trees, graphs, arrays, etc.) to get a good handle on how to traverse and process it, etc.
I use it when explaining concepts. It helps to trim out the unnecessary bits of language so that examples only have the details pertinent to the question being asked.
I use it a fair amount on StackOverflow.
I don't use pseudocode as it is taught in school, and haven't in a very long time.
I do use english descriptions of algorithms when the logic is complex enough to warrant it; they're called "comments". ;-)
When explaining things to others, or working things out on paper, I use diagrams as much as possible - the simpler the better.
Steve McConnell's Code Complete, in its chapter 9, "The Pseudocode Programming Process", proposes an interesting approach: when writing a function longer than a few lines, use simple pseudocode (in the form of comments) to outline what the function/procedure needs to do before writing the actual code that does it. The pseudocode comments can then become actual comments in the body of the function.
I tend to use this for any function that does more than what can be quickly understood by looking at a screenful (max) of code. It works especially well if you are already used to separating your function body into code "paragraphs" - units of semantically related code separated by a blank line. Then the "pseudocode comments" work like "headers" to these paragraphs.
PS: Some people may argue that "you shouldn't comment what, but why, and only when it's not trivial to understand for a reader who knows the language in question better than you". I generally agree with this, but I do make an exception for the PPP. The criteria for the presence and form of a comment shouldn't be set in stone, but ultimately governed by wise, well-thought application of common sense anyway. If you find yourself refusing to try out a slight bend to a subjective "rule" just for the sake of it, you might need to step back and ask whether you're facing it critically enough.
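As a small illustration of how that ends up looking (a made-up function, not an example from the book): each comment was written first as a line of pseudocode, and the code was then filled in underneath it as a "paragraph".

```python
def mean_of_valid(readings, low, high):
    """Average the readings that fall inside [low, high], or None if none do."""
    # keep only readings inside the accepted range
    valid = [r for r in readings if low <= r <= high]

    # with nothing left, report that instead of dividing by zero
    if not valid:
        return None

    # average the remaining readings
    return sum(valid) / len(valid)


print(mean_of_valid([1.0, 50.0, 3.0], 0.0, 10.0))   # 2.0
```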
Mostly use it for nutting out really complex code, or when explaining code to either other developers or non developers who understand the system.
I also use flow diagrams or UML-type diagrams when trying to do the above...
I generally use it when developing multiple if else statements that are nested which can be confusing.
This way I don't need to go back and document it, since it's already been done.
Fairly rarely, although I often document a method before writing the body of it.
However, If I'm helping another developer with how to approach a problem, I'll often write an email with a pseudocode solution.
I don't use pseudocode at all.
I'm more comfortable with the syntax of C style languages than I am with Pseudocode.
What I do do quite frequently for design purposes is essentially a functional decomposition style of coding.
public void doBigJob( params )
{
    doTask1( params);
    doTask2( params);
    doTask3( params);
}
private void doTask1( params)
{
    doSubTask1_1(params);
    ...
}
Which, in an ideal world, would eventually turn into working code as methods become more and more trivial. However, in real life, there is a heck of a lot of refactoring and rethinking of design.
We find this works well enough, as rarely do we come across an algorithm that is both: Incredibly complex and hard to code and not better solved using UML or other modelling technique.
I never use or used it.
I always try to prototype in a real language when I need to do something complex, usually writing unit tests first to figure out what the code needs to do.

Resources