Where can I find sample algorithms for analyzing historical stock prices? - algorithm

Can anyone direct me in the right direction?
Basically, I'm trying to analyze stock prices and see if I can spot any patterns. I'm using PHP and MySQL to do this. Where can I find sample algorithms like the ones used in MetaStock or thinkorswim? I know they are closed source, but are there any tutorials available for beginners?
Thank you,
P.S. I don't even know what to search for in Google :(

A basic, educational algorithm to start with is a dual-crossover moving average. Simply chart fast (say, 5-day) and slow (say, 10-day) moving averages of a stock's closing price, and you have a weak predictor of when to buy long (fast line goes above slow) and sell short (slow line goes above the fast). After getting this working, you could implement exponential smoothing (see previously linked wiki article).
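As a rough illustration of the crossover idea above, here is a minimal sketch in Python (the question uses PHP, but the logic ports directly). The function names and the default 5-day and 10-day windows are my own choices for illustration, and this is only an educational toy, not a trading system.

```python
def simple_moving_average(prices, window):
    """Return a list as long as prices; entries are None until a full window exists."""
    averages = []
    running_sum = 0.0
    for i, price in enumerate(prices):
        running_sum += price
        if i >= window:
            # Drop the price that just fell out of the window.
            running_sum -= prices[i - window]
        averages.append(running_sum / window if i >= window - 1 else None)
    return averages


def crossover_signals(prices, fast=5, slow=10):
    """Return (index, 'buy' | 'sell') pairs where the fast average crosses the slow one."""
    fast_ma = simple_moving_average(prices, fast)
    slow_ma = simple_moving_average(prices, slow)
    signals = []
    previous_diff = None
    for i in range(len(prices)):
        if fast_ma[i] is None or slow_ma[i] is None:
            continue
        diff = fast_ma[i] - slow_ma[i]
        if previous_diff is not None:
            if previous_diff <= 0 < diff:
                signals.append((i, "buy"))   # fast line moved above slow line
            elif previous_diff >= 0 > diff:
                signals.append((i, "sell"))  # fast line moved below slow line
        previous_diff = diff
    return signals
```

Calling `crossover_signals(closing_prices)` on a list of daily closes, oldest first, gives the days on which the two lines cross; plotting both averages next to the price makes the behaviour easier to judge.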
That would be a decent start. Take a look at other technical analysis techniques, but do keep in mind that this is quite a perilous method of trading.
Update: As for actually implementing this? You're a PHP programmer, so here is a charting library for PHP. This is the one I used a few years ago for this very project, and it worked out swimmingly. Maybe someone else can recommend a better one. If you need a free source of data, take a look at Yahoo! Finance's historical data. They dispense CSV files containing daily opening prices, closing prices, trading volume, etc. of virtually every indexed corporation.
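For the data side, a small sketch of reading closing prices out of such a CSV file could look like the following. The column names `Date` and `Close` and the sample rows are assumptions made up for illustration; check the header of the file you actually download.

```python
import csv
import io


def load_closing_prices(csv_file):
    """Read (date, close) pairs from an open CSV file, sorted oldest first.

    Assumes a header row with 'Date' (ISO format) and 'Close' columns.
    """
    reader = csv.DictReader(csv_file)
    rows = [(row["Date"], float(row["Close"])) for row in reader]
    rows.sort(key=lambda pair: pair[0])  # ISO dates sort correctly as strings
    return rows


# Made-up sample data, newest first, standing in for a downloaded file.
SAMPLE = """Date,Open,High,Low,Close,Volume
2010-01-05,10.2,10.9,10.1,10.8,120000
2010-01-04,10.0,10.5,9.9,10.2,100000
"""

prices = load_closing_prices(io.StringIO(SAMPLE))
```

With a real file you would pass `open("prices.csv", newline="")` instead of the `StringIO` object (the file name here is hypothetical).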

Check out the algorithms at Investopedia; FM Labs also has formulas for a lot of technical analysis indicators.

First you will need a solid math background: statistics in general, correlation analysis, linear algebra... If you really want to push it, check out dimensional transposition. Then you will need a solid basis in data mining. Associations can be useful if you want to link strict numerical data with news headlines and other events.
One thing is for sure: you will most likely not find pre-digested algorithms out there that will make you rich...
I know someone who is trying just that... He is somewhat successful (meaning he is not losing money and is making a bit) and is making his own algorithms... I should mention he has a doctorate in actuarial science.
Here are a few more links... hope they help out a bit
http://mathworld.wolfram.com/ActuarialScience.html
http://www.actuary.com/actuarial-science/
http://www.actuary.ca/
Best of luck to you

Save yourself time and use programs like NinjaTrader and Wealth-Lab. Both of them are great technical analysis platforms and accept C# as a programming language for defining your trading rules. Every possible technical indicator you can imagine is already included and if you need something more advanced you can always write your own indicator. You would also need a lot of data in order for your analysis to be statistically significant. For US stocks and ETFs, visit www.Kibot.com. We have good experience using their data.

Here's a pattern for ya
http://ddshankar.files.wordpress.com/2008/02/image001.jpg

I'd start with a good introduction to time series analysis and go from there. If you're interested in finding patterns then the interesting term is "1D-Pattern Matching". But for that you need nice features, so google for "Feature extraction in time series". Remember GIGO (garbage in, garbage out), so make sure you have error-free stock price data for a sufficiently long time period before you start.

May I suggest that you do a little reading with respect to the Kalman filter? Wikipedia is a pretty good place to start:
http://en.wikipedia.org/wiki/Kalman_filter/
This should give you a little background on the problem of estimating and predicting the variables of some system (the stock market in this case).
But the stock market is not very well behaved, so you may want to familiarize yourself with nonlinear extensions to the KF. Yes, the Wikipedia entry has sections on the extended KF and the unscented KF, but here is an introduction that is just a little more in-depth:
http://cslu.cse.ogi.edu/nsel/ukf/
I suppose if anyone had ever tried this before then it would have been all over the news and very well known. So you may very well be on to something.
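To make the Kalman filter idea a bit more concrete, here is a minimal one-dimensional sketch in Python. It assumes the hidden "true" value follows a random walk and that the two noise variances are known constants; both are simplifying assumptions of mine, and the parameter names and defaults are illustrative only. It is the plain linear filter, not the extended or unscented variants mentioned above.

```python
def kalman_1d(measurements, process_var=1e-3, measurement_var=1.0):
    """Smooth a noisy series, assuming the hidden value follows a random walk.

    measurements must be non-empty; the first value seeds the estimate.
    """
    estimate = measurements[0]
    error_var = 1.0  # initial uncertainty about the estimate (assumed)
    estimates = [estimate]
    for z in measurements[1:]:
        # Predict: a random walk keeps the estimate, but uncertainty grows.
        error_var += process_var
        # Update: blend prediction and new measurement using the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= (1.0 - gain)
        estimates.append(estimate)
    return estimates
```

A larger `process_var` makes the filter follow the measurements more closely; a larger `measurement_var` makes it smoother and slower to react.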

Use TradeStation
It is a platform that lets you write software to analyze historical stock data. You can even write programs that trade the stock, and you can backtest your program on historical data or run it in real time throughout the day.

Related

How can I avoid people using my code for evil? [closed]

I'm not sure if this is quite the right place, but it seems like a decent place to ask.
My current job involves manual analysis of large data sets (at several levels, each more refined and done by increasingly experienced analysts). About a year ago, I started developing some utilities to track analyst performance by comparing results at earlier levels to final levels. At first, this worked quite well - we used it in-shop as a simple indicator to help focus training efforts and do a better job overall.
Recently though, the results have been taken out of context and used in a way I never intended. It seems management (one person in particular) has started using the results of these tools to directly affect EPRs (enlisted performance reports; it's an Air Force thing, but I assume something similar exists in other areas) and similar paperwork. The problem isn't who is using these results, but how. I've made it clear to everyone that the results are, quite simply, error-prone.
There are numerous unavoidable obstacles to generating this data, which I have worked to minimize with some nifty heuristics and such. Taken in the proper context, they're a useful tool. Out of context however, as they are now being used, they do more harm than good.
The manager(s) in question are taking the results as literal indicators of whether an analyst is performing well or poorly. The results are being averaged and individual scores are being ranked as above (good) or below (bad) average. This is being done with no regard for inherent margins of error and sample bias, with no regard for any sort of proper interpretation. I know of at least one person whose performance rating was marked down for an 'accuracy percentage' less than one percentage point below average (when the typical margin of error from the calculation method alone is around two to three percent).
I'm in the process of writing a formal report on the errors present in the system ("Beginner's Guide to Meaningful Statistical Analysis" included), but all signs point to this having no effect.
Short of deliberately breaking the tools (a route I'd prefer avoiding but am strongly considering under the circumstances), I'm wondering if anyone here has effectively dealt with similar situations before? Any insight into how to approach this would be greatly appreciated.
Update:
Thanks for the responses - plenty of good ideas all around.
If anyone is curious, I'm moving in the direction of 'refine, educate, and take control of interpretation'. I've started rebuilding my tools to try and negate or track error better and automatically generate any numbers and graphs they could want, with included documentation throughout (while hiding away as obscure references the raw data they currently seem so eager to import to the 'magical' excel sheets).
In particular, I'm hopeful that visual representations of error and properly created ranking systems (taking into account error, standard deviations, etc.) will help the situation.
Either modify the output to include error information (so if the error is +/- 5 %, don't output 22%, output 17% - 27%), or educate those whom this is being used against to the error so that they can defend themselves when it is used against them.
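A small sketch of what "output the range instead of the number" could look like, in Python. The function names are hypothetical, and treating two scores as distinguishable only when their intervals do not overlap is a deliberately crude rule of thumb, not a proper statistical test.

```python
def format_with_error(score_pct, margin_pct):
    """Render a score as a range, e.g. 22 with a margin of 5 becomes '17% - 27%'."""
    low = max(0.0, float(score_pct) - margin_pct)
    high = min(100.0, float(score_pct) + margin_pct)
    return f"{low:.0f}% - {high:.0f}%"


def distinguishable(score_a, score_b, margin_pct):
    """True only if the two error intervals do not overlap (crude heuristic)."""
    return abs(score_a - score_b) > 2 * margin_pct
```

With a margin of 2.5 points, `distinguishable(91.0, 90.2, 2.5)` is false, which is exactly the "less than one percentage point below average" case from the question.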
Well, you seem to have run afoul of the Law of Unintended Consequences in the context of human behavior.
Unfortunately, once the cat is out of the bag, it's pretty hard to put back in. You have a few options (which are not mutually exclusive, by the way) to consider, including:
Alter the reports so that their data can no longer be abused in the way you describe.
Work with management to help them understand why their use of your data is improper or misleading.
Work with those whose performance is being measured to pressure management to rethink their policy on the matter.
Work with management/analysts to come up with a viable means to measure performance in a way that is fair to everyone.
Break the report in a manner that makes them unusable for any purposes.
Clearly there is a desire on the part of management to get analytics on performance of analysts. Likely there is a real need for this ... and your reports happened to fill a void in the available information. The best option for everyone would be to find a way to effectively and fairly fill this need. There are many possible ways to achieve this - from dropping dense rankings in favor of performance tiers to using time-over-time variance to refine performance measurements.
Now, it's entirely possible that the existing reports you've provided simply cannot be applied in a fair and accurate manner to address this problem. In which case, you should work with your management team to make sure they understand why this is the case - and either redefine the way performance is measured or take the time to develop an appropriate and fair methodology.
One of the strongest means to convince management that their (ab)use of the data in your report is unwise is to remind them of the concept of perverse incentives. It's entirely possible that over time, analysts will modify their behavior in a way that results in higher rankings in performance reports at the cost of real performance or quality of results that are not otherwise captured or expressed. You seem to have a good understanding of your domain - so I would hope that you could provide specific and dramatic examples of such consequences to help make your case.
All you can do is to try and educate the managers as to why what they're doing is incorrect.
Beyond that, you can't stop idiots from being idiotic, and you'll just go mad trying.
I definitely wouldn't "break" code that people are relying on, even if it's not a specific deliverable. That will only cause them to complain about you, a move which may affect your own EPR :-)
I really think the key here is good communication with your managers.
Besides, I like PatrickV's idea. You could also try some other ways to engineer your tool around the problem so that it'll seem silly/be hard to use it as performance measurement - change the name of the statistics to mean something other than "how good programmer X is", make it hard to get data per-person, show error statistics.
You can also try to display the data in another way (this may actually make your managers think you are trying to help them). Show a graph - a difference of several pixels in position may be harder to identify than a numeric result (my guess - your managers are using Excel and coloring red everything below average). Draw the error margin so it doesn't make sense to obsess over fractions of percentages.
Give the result as a scale - low and high margin that take into account your error information, it is harder to compare.
Edit: Oh yeah, and read about "social interfaces". You can start with Spolsky's Not Just Usability and Building Communities with Software.
I would echo @paxdiablo's advice, as a first step:
Work on the report on the inherent errors. In fact, make it the introduction to every copy generated.
When you refer to the measurement errors, indicate they are the lower limit of the errors (unless there actually aren't any).
Try to educate the manager(s) in the error of his/her ways.
If possible, discuss the issue with your manager, and perhaps with the offending managers' management. Depending on how familiar you are with them, you should probably limit it to expressing some concerns and giving a heads-up.
Consult your HR department, or whoever is in charge of fairness in the performance reviews.
Good luck.
The problem is that the code is not yours, it belongs to your company. They really can do whatever they want with it.
I hate to say this, but if you have an issue with the ethics of your company you will have to leave that company.
One thing you could do is implement the comparison yourself. If he really wants to check if somebody is performing significantly less than the rest, it should be tested formally as well.
Now, choosing the right test is a bit tricky without knowing the data and the structure, so I can't really advise you on that one. Just take into account that if you do pairwise comparisons, or compare multiple scores against an average, you run into the multiple testing problem. A classic way of correcting for it is the Bonferroni correction. If you implement that one, you can be sure that at a certain point no one will stand out any more. The Bonferroni correction is very conservative. Another option is using Dunn-Sidak, which is supposed to be less conservative.
The correct implementation would be an ANOVA - if the assumptions are met and the data are suitable, of course - with a post-hoc comparison like a Tukey Honest Significant Difference test. That way at least the uncertainty on the results is taken into account.
If you don't have a clue on which test to use, describe your data in detail on stats.stackexchange.com and ask for help on which test to use.
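For reference, the two corrections mentioned above boil down to adjusting the per-test significance level. A minimal Python sketch, with function names of my own choosing and assuming you already have one p-value per comparison:

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests


def sidak_alpha(alpha, n_tests):
    """Per-test significance level under the (Dunn-)Sidak correction."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)


def significant(p_values, alpha=0.05, correction=bonferroni_alpha):
    """Flag which p-values survive the chosen multiple-testing correction."""
    threshold = correction(alpha, len(p_values))
    return [p < threshold for p in p_values]
```

For ten comparisons at an overall level of 0.05, Bonferroni tests each one at 0.005, while Sidak allows a slightly larger threshold (about 0.0051), which is why it is described as less conservative.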
Cheers
I just wanted to elaborate on the Perverse Incentives answer of LBushkin. I can easily see your problem extending to where analysts will avoid difficult topics for fear of reducing their score. Or maybe they will provide the same answer as earlier stages to avoid hurting a friend's score, even if that is not correct. An interesting question is what happens if the later answer is incorrect - you have no truth, just successive analytic opinions - in this case I assume the first answer is marked as "incorrect", right?
Maybe presenting some of these extensions to the manager will help.

How to assemble a project with software products and your own code

Let's say you have a specific project on hand, it can be divided to parts, and you are not completely sure about all the difficulties that will arise.
Time is of the essence.
How do you decide whether a part should use software product or your own code? (considering, that some tools are awesome, but will require much time to learn)
How do you choose the right software product?
How much time (as a percentage) should this stage of choosing the right product, if any, take, and how much time to evaluate a single product?
Is there a way-back, is it o.k to change your mind, after putting efforts in a product, and finding it not suitable?
I would love to hear any rules of thumb about those.
Changing your decisions is like changing your blueprint for a house while it's already being built.
It will entirely depend on what you have spent in time and money to that point.
Some considerations:
0) Understand the problem in clear and simple terms before beginning. Know what's critical to its success and then use that list to see if any software, language, or tool will aid it, and at what cost, and if the cost outweighs the benefit.
1) Use a crammer's schedule. Build it in the order of what you would build if you only had 1 day or 1 week and no more to work on it. It's amazing how much doesn't matter anymore when you have to do 50% of the features at 100% of the quality. Focus on value, value, value. Read something like 37signals' book Getting Real for more on this.
2) Do not re-invent the wheel. It's always easier it seems to build something from scratch. Unless you are doing a fraction of the implementation and it's truly simpler, meaning you can avoid abstraction until you forget what you were building, consider it. If you can build it faster, better, cheaper and in the same amount of time, do it.
3) Know the features of your tools, and the benefits any tools need to give your solution. You should be familiar with or at least aware of many of the tools out there that you may or may not integrate.
4) Pick a language that is used to solve a lot of problems. Chances are you will find many great libraries and tools to build your software that will save your time. If you need something that delivers, can run, and you can lean on the smarts of others, use something established, or a language that can access .NET or Java easily if need be.
For each part of your software you recognize as a software component/package:
How do you decide whether a part should use software product or your own code?
(considering, that some tools are awesome, but will require much time to learn)
Ask yourself whether the component you are considering is a part of your product's main business core.
If not, then it is usually better to use an existing solution and not spend too much time on it.
If it is, then make sure there is no existing product that is better than what you are planning. - If there is, consider purchasing licenses to it instead of developing your product.
Search online for similar components (commercial, open source and even articles/demo-source-code).
Do any of them implement all of your requirements from the components?
How much do they cost, would it cost you more to develop and maintain a similar component?
What are the license conditions? - Are they OK for your product?
If the component includes a user interface, is it pleasant to look at and easy to use?
If you answered yes to all the above then do not develop the component yourself.
If not:
Is the component open source or published in an article / demo code? - If so, is it robust? Could you take the code and improve it, or use it as an example to help you write code that is more suitable for your requirements? - If so, write your own code, using that code as part of your own component so that it is not developed from scratch.
If your answer to the above is no, then you'll have to develop your own (or you're searching in the wrong places).
How do you choose the right software product?
See answers to 1.
How much time (as a percentage) should this stage of choosing the right product, if any, take, and how much time to evaluate a single product?
Clear an entire day, search for existing components, read about them (features, prices, reviews) and download + install up to 5 of them.
Clear another day evaluate 2-3 products, compare demos/examples, look at code, write 2 small examples of using each (same example different product).
If you choose more than 3, clear another day and test the others.
Is there a way-back, is it o.k to change your mind, after putting efforts in a product, and finding it not suitable?
Always design your software so that every component is replaceable.
This guarantees that there is always "a way back".
(Use interfaces and the adapter design pattern, divide the code into many assemblies, and connect all components as loosely as possible, using events, binding, etc. - loose coupling.)
Even if you implement something yourself make sure there is a way back - sometime you may use the wrong technology/design and have to replace a component with a new one you develop/purchase.
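A minimal sketch of the "interface plus adapter" idea, in Python. All class names here, including `VendorChartLib`, are hypothetical stand-ins: the point is only that the rest of the code depends on your own interface, so a purchased component and a home-grown one can be swapped without touching the callers.

```python
from abc import ABC, abstractmethod


class ChartRenderer(ABC):
    """The interface the rest of the application depends on."""

    @abstractmethod
    def render(self, points):
        """Render a list of (x, y) pairs and return a description."""


class VendorChartLib:
    """Stand-in for a third-party product with its own, different API."""

    def draw_series(self, xs, ys):
        return f"vendor chart with {len(xs)} points"


class VendorChartAdapter(ChartRenderer):
    """Adapter: translates our interface into the vendor's calls."""

    def __init__(self, vendor):
        self._vendor = vendor

    def render(self, points):
        xs = [x for x, _ in points]
        ys = [y for _, y in points]
        return self._vendor.draw_series(xs, ys)


class HomegrownRenderer(ChartRenderer):
    """Replacement written in-house; callers cannot tell the difference."""

    def render(self, points):
        return f"homegrown chart with {len(points)} points"


def build_report(renderer, points):
    # Depends only on ChartRenderer, never on a concrete product.
    return "Report: " + renderer.render(points)
```

If the vendor product turns out to be unsuitable, only the adapter is thrown away; `build_report` and everything else built on `ChartRenderer` stays as it is.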
Other rules of thumb:
Consider which application-wide technologies to use before considering each component.
Writing in assembly would take the longest, in C less, in C++ even less, in more modern languages such as C#, Java, Delphi even less.
Which has more off-the-shelf components that are relevant to you? What does your team have experience in?
If you are using .NET (C#), then WPF could help you lower the coupling between GUI and business logic and make a better looking GUI; however, it takes time to learn how to use it (a 5 day minimum course is very much recommended).
As in any art, the difficulty is composing a good solution from a very large space of possible solutions. There are as many ways to go about this as there are developers.
I’d normally spend some time understanding the problem and stating it clearly and succinctly as possible, preferably in a written form. The problem description should be completely abstracted away from any possible solutions. Next I’d normally list available constraints that will need to be applied to the solution (time, budget, legal, political, performance, usability, skill availability within team and so on).
Then the theory goes that you need to look on the market for something that solves the problem and meets the constraints at the same time. In practise, the process is not that straight-forward: you try to identify market categories that are likely to be useful, then research them, see what is available and continuously try to reduce the gap between the constraints and capabilities as much as possible, often by going back and revisiting and re-negotiating the constraints.
A few generic tips:
During the research keep coming back to the original problem.
There is always more than one solution, try to extend breadth (concentrating on very different ways of solving the problem) of the search space before going deeper.
Be clear on the number of options worth researching, and the amount of time worth spending on each of them before deciding whether to investigate further.
It’s seldom worth finding an optimal solution, especially when the technological landscape keeps changing very rapidly. Look for a solution that is good enough: “The Paradox of Choice - Why More is Less”.
It’s rarely worth turning to users for help (unless they are software experts) on choosing between several options. If you’ve got a number of options all looking equally attractive that means you need to go back and understand the original problem better, it’s likely you’ve missed a requirement or two.
Some further notes on using third-party components (refers to GUI components, but easy to apply to other software areas as well).
And even more notes on scoping, composing and researching for a project.
How do you decide whether a part should use software product or your own code? (considering, that some tools are awesome, but will require much time to learn)
Ask yourself two questions.
1) Is it a mature product? If yes, then
2) How long it would take to create the functionality it provides on your own. If that value times your hourly rate is greater than the cost of the product, then use that product.
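The two questions above amount to a very small calculation. A sketch in Python, with a hypothetical function name and made-up figures; it deliberately ignores learning time, maintenance, and licensing terms, which the other answers discuss:

```python
def should_buy(product_cost, build_hours, hourly_rate, is_mature):
    """Apply the two-question rule: mature product, and cheaper than building."""
    if not is_mature:
        return False
    return build_hours * hourly_rate > product_cost


# Made-up example: a 500 product versus 40 hours of work at 50 per hour.
decision = should_buy(product_cost=500, build_hours=40, hourly_rate=50, is_mature=True)
```

Here building would cost 2000 against a price of 500, so the rule says to use the product.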
How do you choose the right software product?
Consult your network of other developers. Have they used it? Did they run into problems? Consult the interweb. Create a prototype using the product. Does it work well? Any major bugs?
How much time (as a percentage) should this stage of choosing the right product, if any, take, and how much time to evaluate a single product?
It depends on the size of the project, and the criticality of the product to the success. Most of the time, you are going to be able to get a high level view of the product in a very short amount of time.
It may be just a few minutes using it before you say, nope - not ready for prime time. If it makes it past that, a day or two of experimentation may tell you that it passes muster for your project.
If it's a huge project with many developers, then you probably want to spend more time doing a prototype application with it to be sure it's worth investing all that time in.
Is there a way-back, is it o.k to change your mind, after putting efforts in a product, and finding it not suitable?
If you find it's not working out, there's nothing wrong with going back. In fact you probably have to. Ideally you will find this out early. Not at the 11th hour. Again, this is the purpose of prototyping.
There are already some really good answers here, so I won't repeat them. However, there is one point you should definitely consider, and though I would have thought it's obvious, I haven't seen it mentioned here yet:
The personnel you have available to implement the solution, their core competency, and their general level of competence.
Who you have to implement this (assuming it's a team, and not just yourself - but relevant even if it's just you, too...) can have a HUGE effect on the outcome. If you don't have experienced programmers to help you develop this, you're better off looking for some OTS product to do the work for you... Or, even if you have programmers but they are not likely to succeed, you still might want to find a solution with lower overall project risk.

Is it possible to manage developers with high turnover if you can't lower the turnover rate? [closed]

I lead a small group of programmers in a university setting, having just moved into this position last year. While the majority of our team are full time employees, we have a couple of people who are traditionally graduate assistants.
The competition for these assistantships is fairly intense, as they get free graduate school tuition on top of their salary while they have the job. We require that they sign up for at least a year, though we consider ourselves lucky if they stay for two. After that, they get their master's degree and move on to bigger and better things.
As you can imagine, hiring and re-training these positions is time- and resource-intensive. To make matters worse, up to now they have typically been the sole developer working on their respective projects, with me acting in an advisory and supervisory role, so wrangling the projects themselves to fight the entropy as we switch from developer to developer is a task unto itself.
I'm tempted to bring up to the administrators the possibility of hiring a full-time (and long-haul) developer to replace these two positions, but for a school in a budget crisis, paying for two half-time graduate assistants is far cheaper (in terms of salary and benefits) than paying for one full-time developer. Also, since I'm new to this position, I'd like to avoid seeming as though I'm not able to deal with what I signed up for. For the foreseeable future, I don't think the practice of hiring short-term graduate assistants is going to change.
My question: What can I do to create an effective training program considering that the employees may be gone after as little as a year on the job?
How much time should I invest in training them, and how much would simply be a waste of time?
How much time should they take simply getting acclimated to our process and the project?
Are there any specific training practices or techniques that can help with this kind of situation?
Has anyone dealt with a similar situation before?
Do I worry too much, or not enough?
By the way, and for the record, we do the vast majority of our development in Perl. It's hard to find grad students who know Perl, while on the other hand everybody seems to have at least an academic understanding of Java. Hence this question which I asked a while back.
Why don't you ask the students what they find difficult and make cheat sheets, lectures, etc. for the parts of the job that they have trouble with? Maybe you need to create some introductory Perl lectures or purchase some dead trees. How about a Safari subscription at O'Reilly? I'd ask the students how they prefer to learn, though, before embarking on a training project. Everyone has different learning styles.
I'd also spend some time and capital creating a culture of professional software development at work. It'll be tough since academic programmers are often neophytes and used to kludging up solutions (I'm an academic programmer, btw) but the students will thank you in the long run. Maybe you can all go out to lunch once a week to discuss programming and other topics. You might also want to take some time to do code reviews so people can learn from each other.
With high turnover you definitely need to ensure that knowledge transfer occurs. Make sure you are using source code control and that your students understand proper commenting. I'd also make the students create brief documentation for posterity. If they are getting credit, make them turn in a writeup of their progress once a semester. You can put this in a directory in the project's repository for anyone who inherits it. As mentioned in other posts, a group wiki can really help with knowledge transfer. We use Mediawiki in our group and like it a lot.
One last thing I should add is that I find it helps to keep a list of projects for new developers that are relatively easy and can be completed in a month or so. They are a great way for new people to get acclimated to your development environment.
This is a relative question, and should be taken on a case-by-case basis. If the new hire already knows Perl, you don't need to go over this piece of training (yes, you could put Perl as a mandatory prerequisite, but that would significantly limit your applicant pool), and their first bit of training should be something like fixing a bug in an existing application or walking them through an application they will maintain. Though the fact that the developers are only there for a year makes me think the development styles are going to vary some (if not a lot).
Getting the new person up to speed with your process is very important, as long as your process works. In this high turnover environment, you should put a strong emphasis on documentation in your process. A Wiki is a great thing to have for this documentation, since it's centralized and any of the developers can access it. Having them try to figure out how a project works by themselves (with little to no documentation) is a waste of both their time and your time.
Perhaps I'm reading too much into the question, but if your university teaches Java, why are you using Perl? Wouldn't it make more sense to use the tools that your students already know? This alone would cut the learning curve significantly. [once you eliminate the legacy code of course]
Other than that, try:
break the projects up into month-sized bites
overlap the internships by at least 2 months, if not 6, so the new guy can work with and be trained by the old guy(s)
document whatever repeatable processes you have (as was suggested by Mark Nold)
if the grad students are cheaper than full-time pros, quit whinin' ;-) If not, go for the pros.
Have you considered making a "three ring binder" like McDonald's and many other high-turnover industries have? Have one folder which you can print out and hand to the new hire which shows the new hire some basics of getting up and running with Perl in your environment. This should be a "hello world", plus some basic regex and array manipulation. Lastly your manual should go on to show examples of the 5 things you find yourself doing all the time.
The example code may be authenticating users against an external security system, walking through recordsets or using ghostscript to create PDFs. Whatever they are, they should cover the basics of what you meet 80% of the time. More importantly the examples should show users how you expect the code to be written for clarity (eg: naming and approach), and give them some insight into servers and software in use and other practicalities which a generic book won't show them.
You won't get the binder right first time, but since you have a high staff turnover, you'll have plenty of time to test and improve it.
On top of this I would pick a single Perl programming book and give the new user their own copy of your three ring binder, plus "Programming Perl", to keep on their first day. At a cost of $50 per hire I'm sure it's a lot cheaper than the alternatives, and you'll have them flipping burgers.... I mean cutting code in no time.
My initial couple of thoughts are that you should:
hire for the position, i.e. it's Perl-centric so make that a big part of the pre-requisites. That way you don't need that piece of training as well.
invest time in the on-boarding process, maybe use a wiki so that you can easily update it to help bring them on board.
Edit: Some extra points:
maybe have a chat and see if Perl can be introduced into the curriculum? If not, then make it known six months before the ads go up that applicants need to know Perl. This way you'll get people who have Perl experience and who have actively demonstrated their motivation.
can you open up some small projects so that they could be done by potential candidates during this six months?
approach the design of your large-scale projects so that they can be done in a piecemeal manner. This is how the ACE components have been done, iirc.
allow a specific period for documentation and review of the work done by the departing grad student.
allow an overlap period of at least a couple of weeks where the new grad student can work with the departing grad student. They can learn the development environment and they can be guinea pigs for any updates to your wiki.
Still more to come...
HTH
cheers
That is a pickle, but it is not as uncommon in the commercial sector as you would think. I heard a statistic once that the average tenure of a programmer industry-wide is about 18-24 months. Normally I would suggest getting more experienced programmers who would require less ramp-up time and only need to be trained on the problem domain and technology updates, not the basics.
I think your best bet is just to ask for about 30-50% more grad students than it will take to actually perform the job to account for the learning and ramp-up time and invest in some additional resources for testing as this environment is a recipe for mistakes since everyone is learning on the job. Also, this is probably difficult given the academic schedule, but try to stagger the start dates as much as possible to maximize the overlap between employees. Pair-programming teams of new-hires/old-hires might also help increase consistency and supplement the training without sacrificing too much productivity.
How much time should I invest in training them, and how much would simply be a waste of time?
Answer: It's not the amount of time or the amount of waste, but perhaps the approach. Would it be possible to train by video: record yourself training one person and provide that as training for subsequent students/developers? You can add to it over time, and it reduces the time you need to spend going through the same process.
How much time should they take simply getting acclimated to our process and the project?
Answer: This all depends on the person: a minimum of a day and a maximum of 2-3 weeks, I would guess, on average.
Are there any specific training practices or techniques that can help with this kind of situation?
Answer: Video training (home made), having the current student/developer create/update a wiki of needed info, points of interest, etc.
Has anyone dealt with a similar situation before?
Answer: It used to be a 12-18 month average turnover. I would imagine that it's changed now (longer), but every company has turnover, though perhaps not forced like yours, where the resource is students.
Do I worry too much, or not enough?
Answer: Knowledge lost through transition is a key risk area...
Is the application something you could consider open-sourcing?

Minimum CompSci Knowledge Needed for Writing Desktop Apps

Having been a hobbyist programmer for 3 years (mainly Python and C) and never having written an application longer than 500 lines of code, I find myself faced with two choices :
(1) Learn the essentials of data structures and algorithm design so I can become a l33t computer scientist.
(2) Learn Qt, which would help me build projects I have been itching to build for a long time.
For learning (1), everyone seems to recommend reading CLRS. Unfortunately, reading CLRS would take me at least a year of study (or more, I'm not Peter Krumins). I also understand that to accomplish any moderately complex task using (2), I will need to understand at least the fundamentals of (1), which brings me to my question: assuming I use C++ as the programming language of choice, which parts of CLRS would give me sufficient knowledge of algorithms and data structures to work on large projects using (2)?
In other words, I need a list of theoretical CompSci topics absolutely essential for everyday application programming tasks. Also, I want to use CLRS as a handy reference, so I don't want to skip any material critical to understanding the later sections of the book.
Don't get me wrong here. Discrete math and the theoretical underpinnings of CompSci have been on my "TODO: URGENT" list for about 6 months now, but I just don't have enough time owing to college work. After a long time, I have 15 days off to do whatever the hell I like, and I want to spend these 15 days building applications I really want to build rather than sitting at my desk, pen and paper in hand, trying to write down the solution to a textbook problem.
(BTW, a less-math-more-code resource on algorithms will be highly appreciated. I'm just out of high school and my math is not at the level it should be.)
Thanks :)
This could be considered heresy, but the vast majority of application code does not require much understanding of algorithms and data structures. Most languages provide libraries which contain collection classes, searching and sorting algorithms, etc. You generally don't need to understand the theory behind how these work, just use them!
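As a minimal sketch of what "just use them" can look like in Python (the asker's language; the word list is made up for illustration), the standard library already covers sorting, binary search and counting without any hand-written algorithm:

```python
import bisect
from collections import Counter

# Hypothetical data, invented for illustration.
words = ["pear", "apple", "fig", "apple", "kiwi", "fig", "apple"]

# Sorting: the built-in sorted() returns a new sorted list.
unique_sorted = sorted(set(words))

def contains(sorted_items, item):
    """Binary search on an already sorted list, via the bisect module."""
    i = bisect.bisect_left(sorted_items, item)
    return i < len(sorted_items) and sorted_items[i] == item

# Counting: Counter is a ready-made dictionary-based tally.
counts = Counter(words)

print(unique_sorted)
print(contains(unique_sorted, "kiwi"))
print(counts.most_common(1))
```

Knowing that `bisect` needs sorted input is about all the theory this requires; the rest is reading the library documentation.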
However, if you've never written anything longer than 500 lines, then there are a lot of things you DO need to learn, such as how to write your application's code so that it's flexible, maintainable, etc.
For a less-math, more code resource on algorithms than CLRS, check out Algorithms in a Nutshell. If you're going to be writing desktop applications, I don't consider CLRS to be required reading. If you're using C++ I think Sedgewick is a more appropriate choice.
Try some online comp sci courses. Berkeley has some, as does MIT. Software engineering radio is a great podcast also.
See these questions as well:
What are some good computer science resources for a blind programmer?
https://stackoverflow.com/questions/360542/plumber-programmers-vs-computer-scientists#360554
Heed the wisdom of Don and just do it. Can you define the features that you want your application to have? Can you break those features down into smaller tasks? Can you organize the code produced by those tasks into a coherent structure?
Of course you can. Identify any 'risky' areas (areas that you do not understand, e.g. something that requires more math than you know, or special algorithms you would have to research) and either find another solution, prototype a solution, or come back to SO and ask specific questions.
Moving from 500 LOC to a real (even if small) application is not that easy.
As Don was pointing out, you'll need to learn a lot of things about code (flexibility, reuse, etc.), and you need to learn the very basics of configuration management as well (Visual SourceSafe, svn?).
But the main issue is that you need a way to avoid being overwhelmed by the combination of your functionality and your code. That is not easy. What I can suggest is to put something in place to 'automatically' test your code (even in a very basic way) via some regression tests. Otherwise it's going to be hard.
As you can see, I think it's not related at all to data structures, algorithms or whatever.
Good luck and let us know
I must say that sitting down with a dry old textbook and reading it through is not the way to learn how to do anything effectively, even if you are making notes. Doing it is the best way to learn, using the textbooks as a reference. Indeed, using sites like this as a reference.
As for data structures - learn which one is good for whatever situation you envision: Sets (sorted and unsorted), Lists (ArrayList, LinkedList), Maps (HashMap, TreeMap). Complexity of doing basic operations - adding, removing, searching, sorting, etc. That will help you to select an appropriate library data structure to use in your application.
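A rough sketch of that trade-off, using Python's built-in containers as stand-ins for the Java-style names above (an assumption made for illustration: `list` for ArrayList, `set` for an unsorted set, `dict` for HashMap, `collections.deque` for a structure with cheap operations at both ends):

```python
from collections import deque

# Hypothetical sample data.
ids = list(range(100_000))

as_list = ids                        # ordered; index access O(1), membership test O(n)
as_set = set(ids)                    # unordered; membership test O(1) on average
as_dict = {i: str(i) for i in ids}   # key -> value lookup, O(1) on average
as_deque = deque(ids)                # O(1) append and pop at both ends

def first_missing(candidates, known):
    """Return the first candidate not present in `known`, or None."""
    for c in candidates:
        if c not in known:
            return c
    return None

print(first_missing([5, 99_999, 100_000], as_set))
```

Passing `as_list` instead of `as_set` gives the same answer, but every `in` test then scans the list, which is the kind of difference worth knowing when picking a container.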
And also make sure you're reasonably comfortable with MVC - i.e., ensure your model is separate from your view (the Qt front-end) as best as possible. Best would be to have the model and algorithms working on their own, and then put the GUI on top. Or a unit test on top. Etc...
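A minimal sketch of "the model working on its own, with a unit test on top", assuming a hypothetical to-do list application (the `TodoModel` class and its methods are invented for illustration). The model imports nothing from the GUI toolkit, so a Qt view could be added later without touching it:

```python
import unittest

class TodoModel:
    """Hypothetical model: plain Python, no GUI imports."""

    def __init__(self):
        self._items = []

    def add(self, title):
        title = title.strip()
        if not title:
            raise ValueError("title must not be empty")
        self._items.append({"title": title, "done": False})

    def complete(self, index):
        self._items[index]["done"] = True

    def pending(self):
        return [item["title"] for item in self._items if not item["done"]]


class TodoModelTest(unittest.TestCase):
    def test_completed_items_are_not_pending(self):
        model = TodoModel()
        model.add("write model")
        model.add("add GUI")
        model.complete(0)
        self.assertEqual(model.pending(), ["add GUI"])

    def test_empty_title_rejected(self):
        with self.assertRaises(ValueError):
            TodoModel().add("   ")
```

Saved as a module, the tests run with `python -m unittest`. Rerunning them after every change also gives a basic regression suite.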
Good luck!
It's like saying you want to move to France, so should you learn French from a book, and what are the essential words - or should you just go to France and find out which words you need to know from experience and from copying the locals.
Writing code is part of learning computer science. I was writing code long before I'd even heard of the term, and lots of people were writing code before the term was invented.
Besides, you say you're itching to write certain applications. That can't be taught, so just go ahead and do it. Some things you only learn by doing.
(The theoretical foundations will just give you a deeper understanding of what you wind up doing anyway, which will mainly be copying other people's approaches. The only caveat is that in some cases the theoretical stuff will tell you what's futile to attempt - e.g. if one of your itches is to solve an NP complete problem, you probably won't succeed :-)
I would say the practical aspects of coding are more important. In particular, source control is vital if you don't use that already. I like bzr as an easy to set up and use system, though GUI support isn't as mature as it could be.
I'd then move on to one or both of the classics about the craft of coding, namely
The Pragmatic Programmer
Code Complete 2
You could also check out the list of recommended books on Stack Overflow.

Does anyone work with Function Points? [closed]

Some questions about Function Points:
1) Is it a reasonably precise way to do estimates? (I'm not unreasonable here, but just want to know compared to other estimation methods)
2) And is the effort required worth the benefit you get out of it?
3) Which type of Function Points do you use?
4) Do you use any tools for doing this?
Edit: I am interested in hearing from people who use them or have used them. I've read up on estimation practices, including pros/cons of various techniques, but I'm interested in the value in practice.
I was an IFPUG Certified Function Point Specialist from 2002-2005, and I still use them to estimate business applications (web-based and thick-client). My experience is mostly with smaller projects (1000 FP or less).
I settled on Function Points after using Use Case Points and Lines of Code. (I've been actively working with estimation techniques for 10+ years now).
Some questions about Function Points:
1) Is it a reasonably precise way to do estimates? (I'm not unreasonable here, but just want to know compared to other estimation methods)
Hard to answer quickly, as it depends on where you are in the lifecycle (from gleam-in-the-eye to done). You also have to realize that there's more to estimation than precision.
Their greatest strength is that, when coupled with historical data, they hold up well under pressure from decision-makers. By separating the scope of the project from productivity (h/FP), they result in far more constructive conversations. (I first got involved in metrics-based estimation when I, a web programmer, had to convince a personal friend of my company's founder and CEO to go back to his investors and tell them that the date he had been promising was unattainable. We all knew it was, but it was the project history and functional sizing (home-grown use case points at the time) that actually convinced him.)
Their advantage is greatest early in the lifecycle, when you have to assess the feasibility of a project before a team has even been assembled.
Contrary to common belief, it doesn't take that long to come up with a useful count, if you know what you're doing. Just off of the basic information types (logical files) inferred in an initial client meeting, and average productivity of our team, I could come up with a rough count (but no rougher than all the other unknowns at that stage) and a useful estimate in an afternoon.
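The arithmetic behind such a rough estimate can be sketched as size (FP) multiplied by historical productivity (h/FP). The project history and the 250 FP count below are hypothetical numbers chosen for illustration, not figures from this answer:

```python
# Hypothetical history: (function points delivered, hours spent) per past project.
history = [(100, 800), (100, 1000), (200, 1800)]

# Productivity per project in hours per function point (h/FP).
rates = [hours / fp for fp, hours in history]
best, worst = min(rates), max(rates)

# Overall productivity: total hours over total function points.
mean = sum(hours for _, hours in history) / sum(fp for fp, _ in history)

def estimate_hours(fp_count):
    """Turn a rough FP count into an effort range using past productivity."""
    return {
        "low": fp_count * best,
        "expected": fp_count * mean,
        "high": fp_count * worst,
    }

print(estimate_hours(250))
```

Keeping the count (scope) and the rate (productivity) as separate inputs is what makes the conversation constructive: a decision-maker can argue for a smaller scope, but the rate comes from project history.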
Combine Function Point Analysis with a Facilitated Requirements Workshop and you have a great project set-up approach.
Once things were getting serious and we had nominated a team, we would then use Planning Poker and some other estimation techniques to come up with an independent number, and compare the two.
2) And is the effort required worth the benefit you get out of it?
Absolutely. I've found preparing a count to be an excellent way to review user-goal-level requirements for consistency and completeness, in addition to all the other benefits. This was even in setting up Agile projects. I often found implied stories the customer had missed.
3) Which type of Function Points do you use?
IFPUG CPM (Counting Practices Manual) 4.2
4) Do you use any tools for doing this?
An Excel spreadsheet template I was given by the person who trained me. You put in the file or transaction attributes, and it does all of the table lookups for you.
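The kind of lookup such a spreadsheet performs can be sketched as follows. The weights are the standard IFPUG weights for the unadjusted count; the component list is hypothetical, and the sketch assumes each component's complexity rating has already been decided (in a real count it is derived from the file and transaction attributes mentioned above):

```python
# Standard IFPUG weights for the unadjusted function point count.
WEIGHTS = {
    "EI":  {"low": 3, "average": 4,  "high": 6},   # external inputs
    "EO":  {"low": 4, "average": 5,  "high": 7},   # external outputs
    "EQ":  {"low": 3, "average": 4,  "high": 6},   # external inquiries
    "ILF": {"low": 7, "average": 10, "high": 15},  # internal logical files
    "EIF": {"low": 5, "average": 7,  "high": 10},  # external interface files
}

def unadjusted_fp(components):
    """Sum the weight of each (type, complexity) pair."""
    return sum(WEIGHTS[kind][complexity] for kind, complexity in components)

# Hypothetical count for a small application.
components = [
    ("ILF", "low"), ("ILF", "average"),
    ("EI", "low"), ("EI", "average"), ("EI", "high"),
    ("EO", "average"),
    ("EQ", "low"),
]

print(unadjusted_fp(components))
```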
As a concluding note, NO estimate is as precise (or more precisely, accurate) as the bean-counters would like, for reasons that have been well documented in many other places. So you have to run your projects in ways that can accommodate that (three cheers for Agile).
But estimates are still a vital part of decision support in a business environment, and I would never want to be without my function points. I suspect the people who characterize them as "fantasy" have never seen them properly used (and I have seen them overhyped and misused grotesquely, believe me).
Don't get me wrong, FP have an arbitrary feel to them at times. But, to paraphrase Churchill, Function Points are the worst possible early-lifecycle estimation technique known, except for all the others.
Mike Cohn, in his Agile Estimating and Planning, considers FPs to be great but difficult to get right. He (obviously) recommends using story-point-based estimation instead. I tend to agree, as with each new project I see the benefits of the Agile approach more and more.
1) Is it a reasonably precise way to do estimates? (I'm not unreasonable here, but just want to know compared to other estimation methods)
As far as estimation precision goes, function points are very good. In my experience they are great but expensive in terms of the effort involved if you want to do it properly. Not that many projects can afford an elaboration phase to get the FP-based estimates right.
2) And is the effort required worth the benefit you get out of it?
FPs are great because they are officially recognised by ISO, which gives your estimates a great deal of credibility. If you work on a big project for a big client, it might be useful to invest in official-looking detailed estimates. But if the level of uncertainty is high to start with (integration with other vendors, legacy systems, loose requirements etc.), you will not get anywhere near precision anyway, so usually you just have to accept this and redo the estimates later. If that is the case, a cheaper way of doing the estimates (user stories and story points) is better.
3) Which type of Function Points do you use?
If I understand this part of your question correctly, we used to do estimates based on Feature Points but gradually moved away from these on almost all projects, except for the ones with a heavy emphasis on internal functionality.
4) Do you use any tools for doing this?
Excel is great with all the formulas you could use. Using Google Spreadsheets instead of Excel helps if you want to do that collaboratively.
There is also a great tool built-in to the Sparx Enterprise Architect which allows you to do the estimates based on the Use Cases which could be used for FP estimations as well.
The great hacknot is offline now, but it is in book form. He has an essay on function points: http://www.scribd.com/doc/459372/hacknot-book-a4, concluding they are a fantasy (which I agree with).
Joel on Software has a reasonable-sounding alternative called Evidence-Based Scheduling that at least sounds like it might work....
From what I have studied about Function Points (one of my teachers was highly involved in developing the theory of function points, and even he wasn't able to answer all our questions), function points fail in many ways: just because something is read or written doesn't mean you can evaluate the effort correctly. You might have a result of 450 function points, and some of these function points will take 1 hour and some will take 1 week. It's a metric that I will never use again.
No because any particular requirement can have an arbitrary amount of effort based on how precise (or imprecise) the author of the requirement is, and the level of experience of the function point assessor.
No because administration of imprecise derivations of abstract functionality yields no reliable estimate.
None if I can help it.
Tools? For function points? How about Excel? Or Word? Or Notepad? Or Edlin?
To answer your questions:
Yes they are more precise than anything else I have encountered (in 20+ years).
Yes they are well worth the effort. You can estimate size, resources, quality and schedule from just the FP count - extremely useful. It takes an average of 1 minute to count an FP manually and an average of 8 hours to fully code an FP (approximately $800 worth). Consider the carpenter's saying of "measure twice cut once". And now a shameless plug: with https://www.ScopeMaster.com you can measure 1 FP per second, and you don't need to learn how!
I like Cosmic Function Points (because they are versatile) and IFPUG because there is a lot of published data (mostly from Capers Jones).
Having invested considerable time, effort and money in developing a tool that counts FPs automatically from requirements, I shall never have to do it manually again!
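To put the figures quoted in this answer side by side (1 minute to count an FP manually, 8 hours and roughly $800 to build one), here is a small worked calculation. The 600 FP project size is a hypothetical input, and the per-FP figures are this answer's claims rather than measured data:

```python
# Per-FP figures as quoted in the answer above.
MINUTES_TO_COUNT_PER_FP = 1
HOURS_TO_BUILD_PER_FP = 8
COST_PER_FP_USD = 800

def counting_overhead(fp_count):
    """Compare manual counting effort with build effort for a given size."""
    count_hours = fp_count * MINUTES_TO_COUNT_PER_FP / 60
    build_hours = fp_count * HOURS_TO_BUILD_PER_FP
    return {
        "count_hours": count_hours,
        "build_hours": build_hours,
        "build_cost_usd": fp_count * COST_PER_FP_USD,
        "overhead_ratio": count_hours / build_hours,
    }

# Hypothetical project size.
print(counting_overhead(600))
```

On these numbers the ratio is 1/480 whatever the project size, so manual counting costs about 0.2% of the build effort.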

Resources