How to localize syntax errors in Mathematica? - wolfram-mathematica

Is there a trick to localizing syntax errors? I occasionally run into this situation when editing existing code. Here's a recent example; do you see an efficient way to find the error? (I already found it, just not efficiently.)
http://yaroslavvb.com/upload/save/syntax-error.nb

Maybe I am missing something, but in your example it was quite easy: once I put your code into a notebook and tried to run it, the usual orange bracket appeared, and when you expand the messages, it states very clearly that the problem was an empty pair of parentheses in your vertexFormula function. In my experience, most of the time the orange box gives enough hints.
Another great way, which I use on a daily basis, is the code highlighting in Workbench. It immediately highlights syntax errors, plus you get very powerful Eclipse-based navigation, both for a single package and across multiple packages. It might seem as if you lose some flexibility by going from interactive FrontEnd development to Workbench, but I found the opposite (or maybe this is the revenge of my enterprise Java background): you can still have your notebook(s) in a Workbench project, where you do the initial development, but then they get attached to the project and to the packages you have already developed and use. The transition from notebook to package could not be easier, since you can continue using the notebook after you transfer the code into a package, and you don't even have to worry about loading the package, as long as you do it just once inside a project. Overall, I find Workbench-based development much more fun as soon as your project goes over some critical mass (I'd say perhaps around 1000 loc, but ymmv). Completely new and independent chunks of functionality, though, I still prefer to prototype entirely in the FrontEnd.
If you adhere to some specific coding standards in your code, or work with code which does, you might be able to develop a simple partial parser which at least breaks the code into complete chunks (function definitions, Module blocks, CompoundExpressions, etc.). Then you could use ToExpression (say, Map it over the list of string code chunks returned by your partial parser) to see which piece of code is problematic (it will return $Failed on it). But if you use Workbench, this is unnecessary altogether.

Related

Does Netlogo have a way for Code Debugging?

Does anybody know if there is a possibility to trace code in NetLogo statement by statement, like using F7 or F8 in C/C++,...
A longer, more generic comment:
/*
So as the comments above say, there is not a debugger but there are work-arounds.
Replayable random seeds and many print statements help. I define a v-print routine that prints only if a "verbose" switch is TRUE, so I can put in lots of v-print statements, sometimes one per line of code, but shut them all off until needed with one interface switch (see the sketch just after this comment).
For many Netlogo users, judging by the questions, this is the first programming language they have ever used. So there is some generic advice like "take small steps and do sanity checks often, expecting that something will go wrong."
But maybe the best and hardest advice is to learn enough Git and GitHub to set up a repository and "SAVE COMMENTED VERSIONS OFTEN", because otherwise it is a losing proposition to try to debug tangled code with multiple errors that you have been making random changes to, hoping they help. Sometimes you need to fall back to the prior last-good version and work forward from there again, step by step, looking for where it breaks. Working forward is easy. Debugging backwards is hard.
Debugging code with one error is much easier than debugging code with multiple errors.
Decades ago I used to repair analog televisions. I loved rich people because they called as soon as one thing was wrong. Poor people waited until the set was full of errors then called. I've seen similar behavior in programmers continuing to code hoping that prior errors will somehow magically go away. Slow is fast.
GitHub has been a life-saver a few times when I accidentally deleted a section of code in the editor and didn't notice I'd done it. It's easy to have lines of code highlighted and then type a word while not looking at the screen. Then you go crazy with code that worked yesterday and doesn't work today and you "didn't change anything".
Using GitHub to restore a prior version, or simply doing a diff between the current version and a prior version, shows what changed.
*/
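The v-print idea above is language-agnostic. Here is a minimal sketch of it in Python (NetLogo's own syntax differs; the names are made up for illustration):

import sys

VERBOSE = False  # one global switch, like the single interface switch in NetLogo

def v_print(*args):
    # prints only when the verbose switch is on
    if VERBOSE:
        print(*args, file=sys.stderr)

v_print("tick count:", 42)  # silent unless VERBOSE is True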

The best or speediest way to understand an uncommented and complex project [closed]

I have a complex project without comments. The project is programmed in Java but has more than one main class, uses several .txt files as templates, and uses several .bat files. I don't know where or how to start exploring the project, and I need to make some changes to it.
As with others I say this is a slow process.
However having done this in the past many times, this is my methodology:
Identify as many requirements as you can that the code fulfils. This may give you some reasons why certain things are the way they are when you look deeper. A common way of finding these is to look for any tests that may be available. The automated ones are best, but usually they're as missing as the comments.
Find the entry points to the code. These will give you places where you can poke the code to see how different inputs affect the flow. Common entry points are code-loading and 'Main'-type functions, service interfaces, web page postbacks, etc.
Diagram the code. Look for tools that can build black/white box pictures of the code. For me this is invaluable. I have on occasion printed out large listings and attacked them with markers and rulers. Your aim is to create your own flow chart (mental or otherwise) of the code flow.
Using the above (iteratively), build a set of outputs from the code that you think should occur, and add to these the outputs you may already know about, such as logs, data files, database writes, etc.
Finally, if you have time, create some tests, manual ones or preferably in automated test harnesses, to verify the above. This is where I start to involve the debugger to see detail in the code.
This methodology usually gives me confidence to make changes.
Note this is an iterative process and can be done with portions of the code or overall, as you see fit. I usually favour a top-down approach to start with, and then, as I gain some insight, I drill down until details become overwhelming, and then I repeat. However, this is just because my mind works this way; you may be different. Good luck.
Find the main Main class. The starting point.
Start drawing a picture of the classes and the objects they own and the external entities they reference.
Follow all the branches until you can find a logical ending.
I've used UML reverse engineering tools in the past and while a visual picture is good, stepping through the code has always been the hardest and yet best methodology for me.
And, as you step through each piece, you can add in your own comments.
I usually start with doxygen, turning on every extraction option (especially EXTRACT_ALL and EXTRACT_PRIVATE), and enable the SOURCE_BROWSER, HAVE_DOT, CALL_GRAPH and CALLER_GRAPH options (you also need to have dot installed). This gives a good view of the software. For every function the calls are displayed and linked in a graph, and the sources are linked from there as well.
While doxygen is intended for C and C++, it also works with Java sources (set the OPTIMIZE_OUTPUT_JAVA option).
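For reference, the Doxyfile settings mentioned above would look something like this (all are standard doxygen options):

EXTRACT_ALL          = YES
EXTRACT_PRIVATE      = YES
SOURCE_BROWSER       = YES
HAVE_DOT             = YES
CALL_GRAPH           = YES
CALLER_GRAPH         = YES
OPTIMIZE_OUTPUT_JAVA = YES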
Ouch. I'm afraid there is no speedy way to do this. Comment out a line (or two) -> test -> see what breaks. You could also put break statements here and there and run the debugger. That should give you some indication of how you got there (i.e., what the hierarchy between the classes is).
Hopefully the original developers used some patterns that you can recognize and make notes. Make lots of notes of everything. Start by trying to understand the high level structure and work down from there.
Be prepared to spend endless hours not understanding what the thing is doing.
Speak to the client and try to understand what the project is for, and what are all the things that it does. Someone somewhere had to put in some requirements for the stuff that's in there, if only in an email.
I would try to find the entry point in the code closest to where you suspect you'll need to start making your changes, set a breakpoint, and start debugging. Check out the contents of local variables and work your way deeper as you become familiar with what's going on. Then, when you have a basic understanding of the area of code you're going to be working with, start fiddling with some small changes. Test your understanding of it. Try diagramming what you see happening. If you can do that confidently, you'll be able to decide whether you need to go back and continue learning more about the code, or whether you know enough to get done what you need to get done.
A start is to use an automated UML modeling tool (if you use Eclipse you can use a plug-in) and start creating UML diagrams of the various classes to see how they are related at a high level and to visualize the code. This has helped me many times.
If there are log files being generated, have a look at them to understand the flow from the starting point (the main class). Otherwise, put in debug statements to understand the flow.
Ya, that sounds like a pretty bad spot to be in.
I would say that the best way is to just walk through the program line by line. Try to grasp the big picture in the code, and write a lot of notes, both on paper and in comments in the code.
I would say a good approach is to generate documentation using javadoc or doxygen's class diagram feature, then, as you run the code, traverse the class diagrams generated by doxygen and see who calls what. This works wonderfully for me every time I am working on such a project.
I completely agree with most of the answers posted.
I can add: use a development tool that reverse-engineers the code and creates a class diagram, to get an overall picture of what is involved.
Then you need patience. But you will be a stronger and smarter developer when you get through...
Good luck!
One of the best and first things to do is to try to build and run the code. It might sound a bit simplistic, but the problem when you take over undocumented code is often that you can't even build and run it, and you have no clue where to start.

Removing tightly coupled code

Forgive me if this is a dupe but I couldn't find anything that hit this exact question.
I'm working with a legacy application that is very tightly coupled. We are pulling out some of the major functionality because we will be getting that functionality from an external service.
What is the best way to begin removing the now unused code? Should I just start at the extreme base, remove, and refactor my way up the stack? During lunch I'm going to go take a look at Working Effectively with Legacy Code.
If you can, and it makes sense in your problem domain, I would try to keep the legacy code functioning in parallel with the new API during development, and use the results from the legacy API to cross-check that the new API is working as expected.
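A minimal sketch of that parallel cross-check, in Python; new_quote and legacy_quote are hypothetical stand-ins for the new service and the legacy code path:

import logging

log = logging.getLogger(__name__)

def new_quote(order_id):     # stand-in for the external service call
    return 100

def legacy_quote(order_id):  # stand-in for the legacy code path
    return 100

def get_quote(order_id):
    new_result = new_quote(order_id)
    try:
        old_result = legacy_quote(order_id)
        if old_result != new_result:
            log.warning("mismatch for %s: legacy=%r new=%r",
                        order_id, old_result, new_result)
    except Exception:
        log.exception("legacy cross-check failed for %s", order_id)
    return new_result  # the new API is authoritative; legacy is only a check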
I think the most important thing you can do is to refactor/remove/test in very small chunks. It's tedious and time consuming but it will help limit risks and errors later on.
I would also start with code that is "low risk" to change.
My advice is to use FindBugs and PMD/CPD (copy-paste detector) to remove dead code (code that cannot or will not be called), unused variables, and duplicated code. Getting rid of this junk will make refactoring easier.
Then learn the key mappings for the common refactorings in your IDE. Extract Method and Introduce Variable should be committed to muscle memory after an hour.
Use the primary disadvantage of tightly coupled code to... your advantage!
Step 1: Identify the area which provides the redundant functionality which you want to replace. Break it...do a quick smoke test of some of the critical parts of the application. Get the feel.
Step 2: Depending on what language it is find the relevant static-code analysis tools and get the needed refactoring info.
Step 3: Repeat Step 1 in incremental levels of narrowing down to the exact pattern.
All this, of course, in a sandbox environment. This may seem a bit haphazard, but if you limit yourself to critical functionality testing, you may get many leads in the process. You will definitely identify the pattern of the legacy code, if nothing else.
You absolutely cannot do this with a live development version [new features being added]. You must start with a feature freeze.
I tend to look at all of the components of the system in an overview and see the biggest places of reuse. From there I would implement the appropriate design pattern to solve it, and make the new component reusable. Write test cases to ensure the new code works as expected, then refactor your code around the new change. Then repeat [overview, etc] till you are satisfied.
I would suggest this for many reasons:
Everyone working with you on refactoring will learn something
People learn how to avoid design mistakes down the road
Everyone working on it will get a better understanding of the code base

What helps you improve your ability to find bugs?

I want to know if there are methods to quickly find bugs in a program.
It seems that the more you master the architecture of your software, the more quickly you can locate the bugs.
How do programmers improve their ability to find bugs?
Logging, and unit tests. The more information you have about what happened, the easier it is to reproduce it. The more modular you can make your code, the easier it is to check that it really is misbehaving where you think it is, and then check that your fix solves the problem.
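As a toy illustration in Python: a small, modular function that logs its inputs and outputs, plus an assertion-style test, so you can both see what happened and check that the fix holds:

import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger(__name__)

def normalize(name):
    log.debug("normalize(%r)", name)      # record what came in
    result = name.strip().lower()
    log.debug("normalize -> %r", result)  # record what went out
    return result

assert normalize("  Alice ") == "alice"   # a minimal unit test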
Divide and conquer. Whenever you are debugging, you should be thinking about cutting down the possible locations of the problem. Every time you run the app, you should be trying to eliminate a possible source and zero in on the actual location. This can be done with logging, with a debugger, assertions, etc.
Here's a prophylactic method after you have found a bug: I find it really helpful to take a minute and think about the bug.
What was the bug, exactly, in essence?
Why did it occur?
Could you have found it earlier, or more easily?
What else did you learn from the bug?
I find taking a minute to think about these things will make it far less likely that you will produce the same bug in the future.
I will assume you mean logic bugs. The best way I have found to catch logic bugs is to implement some sort of testing scheme. Check out JUnit as the standard. Essentially, you define a set of accepted outputs for your methods. Every time you compile your system it checks all of your test cases. If you have introduced new logic that breaks your tests, you will know about it instantly and know exactly what you have to fix.
Test-driven design is a pretty big movement in programming right now. You will be hard-pressed to find a language that doesn't support some kind of testing. Even JavaScript has a multitude of test suites.
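The JUnit idea carries over to almost any language. A minimal sketch using Python's built-in unittest module, with a toy add function standing in for your real method:

import unittest

def add(a, b):  # the method under test (toy example)
    return a + b

class TestAdd(unittest.TestCase):
    def test_accepted_outputs(self):
        # the accepted outputs you define up front
        self.assertEqual(add(2, 2), 4)
        self.assertEqual(add(-1, 1), 0)

if __name__ == "__main__":
    unittest.main()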
Experience makes you a better debugger. Pay close attention to the bugs that you AND others commonly make. Try to figure out if/how these bugs apply to ALL code that affects you, not the single instance of where the bug was seen.
Raymond Chen is famous for his powers of psychic debugging.
Most of what looks like psychic debugging is really just knowing what people tend to get wrong.
That means that you don't necessarily have to be intimately familiar with the architecture / system. You just need enough knowledge to understand the types of bugs that apply and are easy to make.
I personally take the approach of thinking about where the bug may be in the code before actually opening up the code and taking a look. When you first start with this approach, it may not actually work very well, especially if you are pretty unfamiliar with the code base. However, over time someone will be able to tell you the behavior they are experiencing and you'll have a good idea where the problem is located or you may even know what to fix in the code to remedy the problem before even looking at the code.
I was on a project for several years that was maintained by a vendor. They were not very good debuggers, and most of the time it was up to us to point them to the area of the code that had the problem. What made our problem worse was that we didn't have a nice way to view the source code, so a lot of our "debugging" was done by feel.
Error checking and reporting. The #1 newbie coder debugging mistake is to turn off error reporting, avoid checking whether what's going on makes sense, and so on. In general, people feel that if they can't see anything going wrong then nothing is going wrong. Which of course could not be further from the truth.
Instead, your code should be chock-full of error checks that make lots of noise, with detailed reporting, someplace you will see it. (This doesn't mean inside a production web page.) Then, instead of having to trace an error all over the place because it got passed through sixteen layers of execution before it finally reached someplace that broke, your errors start happening close to the actual issue.
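A hedged sketch of the difference in Python: validate at the point of entry and fail loudly with detail, instead of letting a bad value travel sixteen layers down before anything breaks:

def apply_discount(price, rate):
    # fail close to the actual issue, with enough detail to act on
    if not 0.0 <= rate <= 1.0:
        raise ValueError("discount rate out of range: %r" % (rate,))
    if price < 0:
        raise ValueError("negative price: %r" % (price,))
    return price * (1.0 - rate)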
It seems that the more you master the architecture of your software, the more quickly you can locate the bugs.
After understanding the architecture, one's ability to find bugs in the application increases with their ability to identify and write extensive tests.
Know your tools.
Make sure that you know how to use conditional breakpoints and watches in your debugger.
Use static analysis tools as well - they can point out the more obvious issues.
Sleep and rest.
Use programming methods that produce fewer bugs in the first place.
If implementing a single stand-alone functional requirement takes N separate point-edits to the source code, the number of bugs put into the code is roughly proportional to N, so find programming methods that minimize N. Ways to do this: DRY (don't repeat yourself), code generation, and DSLs (domain-specific languages).
Where bugs are likely, have unit tests.
Obviously. IMHO, the best unit tests are Monte Carlo.
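A Monte Carlo test in that spirit might look like this in Python: feed random inputs to the code under test and compare against a slow but obviously correct oracle (both functions are invented for the example):

import random

def fast_clamp(x, lo, hi):    # code under test (toy example)
    return max(lo, min(x, hi))

def oracle_clamp(x, lo, hi):  # slow but obviously correct reference
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x

for _ in range(10000):
    lo = random.uniform(-1e6, 1e6)
    hi = lo + random.uniform(0, 1e6)
    x = random.uniform(-2e6, 2e6)
    assert fast_clamp(x, lo, hi) == oracle_clamp(x, lo, hi)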
Make intermediate results visible.
For example, compilers have intermediate representations, in the form of 4-tuples. If there is a bug, the intermediate code can be examined. That tells you whether the bug is in the first or second half of the compiler.
P.S. Most programmers are not aware that they have a choice of how much data structure to use. The less data structure you use, the lower the chances of bugs (and performance issues) caused by it.
I find tracepoints to be an invaluable debugging tool. They are a bit like logging, except you create them during a debugging session to solve a particular issue, like breakpoints.
Printing the stack trace in a tracepoint can be especially useful. For example, you can print the hash code and stack trace in the constructor of an object, and then later on, when the object is used again, you can search for its hash code to see which client code created it. The same goes for seeing who disposed it or called a certain method, etc.
They are also great for debugging issues related to window focus changes and the like, where the debugger would interfere if you dropped into break mode.
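In plain Python you can approximate the constructor trick with a few lines, capturing the creation stack so it can be inspected later (Widget is a hypothetical class for illustration):

import traceback

class Widget:
    def __init__(self):
        # tracepoint-style: record who created this object
        self._created_at = "".join(traceback.format_stack(limit=8))

    def report_origin(self):
        print("Widget %#x was created at:" % id(self))
        print(self._created_at)

w = Widget()
w.report_origin()  # search the output for this id later, like the hash code trick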
Static code tools like FindBugs
Assertions, assertions, and assertions.
Some areas of our code have 4 or 5 assertions for each line of real code. When we get a bug report, the first thing that happens is that the customer data is processed in our debug build; 99 times out of a hundred an assert will fire near the cause of the bug.
Additionally, our debug build performs redundant calculations to ensure that an optimized algorithm is returning the correct result, and debug functions are used to examine the sanity of data structures.
The hardest thing new developers have to contend with is getting their code to survive the assertions of the code they are calling.
Additionally, we do not allow any code to be put back to top level that causes any integration or unit test to fail.
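A toy Python version of the redundant-calculation idea: the optimized routine checks itself against an obviously correct (but quadratic) reference, and the check vanishes when assertions are disabled (python -O):

def running_max(xs):
    assert len(xs) > 0, "running_max requires a non-empty sequence"
    best, out = xs[0], []
    for x in xs:
        if x > best:
            best = x
        out.append(best)
    # redundant reference calculation, debug runs only
    assert out == [max(xs[:i + 1]) for i in range(len(xs))]
    return out

print(running_max([3, 1, 4, 1, 5]))  # [3, 3, 4, 4, 5]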
Stepping through the code, examining flow/state where unexpected behavior is occurring. (Then develop a test for it, of course).
Writing Debug.Write(message) in your code and using DebugView is another option. Then run your application to find out what is going on.
"Architecture" in software means something like:
Several components
The components interact across clearly-defined interfaces
Each component has a well-defined responsibility
The responsibility of one component is unlike the responsibilities of other components
So, as you said, the better the architecture the easier it is to find bugs.
First: knowing the bug, you can decide which functionality is broken, and therefore know which component implements that functionality. For example, if the bug is that something isn't being logged properly, then the bug should be in one of three places:
In the component that's responsible for logging (your logging library)
Or, above that in the application code which is using this library
Or, below that in the system code which this library is using
Second: examine the data transfered across the interfaces between components. To continue the previous example above:
Set a debugger breakpoint on the application code which invokes the logger API, to verify whether the logger API is being used correctly (e.g. whether it's being invoked at all, whether parameters are as-expected, etc.).
Doing this tells you whether the bug is in the component above this interface, or in the component that's below this interface.
Repeat (perhaps using binary search if the call stack is very deep) until you've found which component is at fault.
When you come to the point that you think there must be a bug in the OS, check your assertions -- and put them into the code with "assert" statements.
Conversely, as you are writing the code, think of the range of valid inputs for your algorithms and put in assertions to make sure you have what you think you have. Same goes for output: Check that you produced what you think you produced.
E.g. if you expect a non-empty list:
l = getList(input)
assert l, "List was empty for input: %s" % str(input)
I'm part of the QA team at work, and knowing anything about the product and how it is developed helps a lot in finding bugs. Also, when I make new QA tools I pass them to our dev team to test; finding bugs in your own code is just plain hard!
Some people say programmers are tainted, so that they cannot see bugs in their own product; and we are not talking about code here, we are beyond that: usability and functionality itself.
Meanwhile, unit testing seems to be a nice solution for finding bugs in your own code, but it's totally pointless if you're wrong even before writing the unit test. How are you going to find the bugs then? You don't! Let your co-workers find them, or hire a QA guy.
Scientific debugging is what I always used, and it greatly helps.
Basically, if you can replicate a bug, you can track down its origin. You should then run some experiments, observe the results, and form hypotheses about why the bug happens.
Writing down all your hypotheses, attempts, expected results and observed results can help you track down the bugs, particularly if they're nasty.
There are automated tools that can help you with that process, particularly git-bisect (and similar bisection tools on other revision systems) to quickly find which change introduced the bug, unit testing to reproduce a bug and prevent regressions in your code (can be used in combination with bisect), and delta debugging to find the culprit in your code (similar to git-bisect but whereas git-bisect works on the code history, delta debugging works on the code directly).
But whatever the tools you are using, the most important benefit is in the scientific methodology, as this is the formalization of what most experienced debuggers do.

What can you do to a legacy codebase that will have the greatest impact on improving the quality?

As you work in a legacy codebase what will have the greatest impact over time that will improve the quality of the codebase?
Remove unused code
Remove duplicated code
Add unit tests to improve test coverage where coverage is low
Create consistent formatting across files
Update 3rd party software
Reduce warnings generated by static analysis tools (e.g., FindBugs)
The codebase has been written by many developers with varying levels of expertise over many years, with a lot of areas untested and some untestable without spending a significant time on writing tests.
Read Michael Feathers' book "Working Effectively with Legacy Code"
This is a GREAT book.
If you don't like that answer, then the best advice I can give would be:
First, stop making new legacy code[1]
[1]: Legacy code = code without unit tests and therefore an unknown
Changing legacy code without an automated test suite in place is dangerous and irresponsible. Without good unit test coverage, you can't possibly know what effect those changes will have. Feathers recommends a "stranglehold" approach where you isolate areas of code you need to change, write some basic tests to verify basic assumptions, make small changes backed by unit tests, and work out from there.
NOTE: I'm not saying you need to stop everything and spend weeks writing tests for everything. Quite the contrary, just test around the areas you need to test and work out from there.
Jimmy Bogard and Ray Houston did an interesting screen cast on a subject very similar to this:
http://www.lostechies.com/blogs/jimmy_bogard/archive/2008/05/06/pablotv-eliminating-static-dependencies-screencast.aspx
I work with a legacy 1M LOC application written and modified by about 50 programmers.
* Remove unused code
Almost useless... just ignore it. You won't get a big return on investment (ROI) from that one.
* Remove duplicated code
Actually, when I fix something I always search for duplicates. If I find some, I extract a generic function or comment every occurrence of the duplication (sometimes the effort of extracting a generic function isn't worth it). The main idea is that I hate doing the same action more than once. Another reason is that there's always someone (could be me) who forgets to check for other occurrences...
* Add unit tests to improve test coverage where coverage is low
Automated unit tests are wonderful... but if you have a big backlog, the task itself is hard to promote unless you have stability issues. Go with the part you are working on and hope that in a few years you have decent coverage.
* Create consistent formatting across files
IMO the difference in formatting is part of the legacy. It gives you a hint about who wrote the code, or when. This can give you some clues about how to behave in that part of the code. Reformatting isn't fun, and it doesn't give any value to your customer.
* Update 3rd party software
Do it only if there are really nice new features, or if the version you have is not supported on the new operating system.
* Reduce warnings generated by static analysis tools
It can be worth it. Sometimes a warning can hide a potential bug.
I'd say 'remove duplicated code' pretty much means you have to pull code out and abstract it so it can be used in multiple places - this, in theory, makes bugs easier to fix because you only have to fix one piece of code, as opposed to many pieces of code, to fix a bug in it.
Add unit tests to improve test coverage. Having good test coverage will allow you to refactor and improve functionality without fear.
There is a good book on this written by the author of CppUnit, Working Effectively with Legacy Code.
Adding tests to legacy code is certainly more challenging than creating them from scratch. The most useful concept I've taken away from the book is the notion of "seams", which Feathers defines as
"a place where you can alter behavior in your program without editing in that place."
Sometimes it's worth refactoring to create seams that will make future testing easier (or possible in the first place). The Google testing blog has several interesting posts on the subject, mostly revolving around the process of dependency injection.
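A tiny Python illustration of such a seam, via constructor injection (the class and names are invented for the example): production keeps the real default, while a test alters the behavior from outside, without editing the method itself:

import datetime

class ReportScheduler:
    def __init__(self, clock=datetime.datetime.now):
        self._clock = clock  # the seam: behavior can be swapped from outside

    def is_overdue(self, deadline):
        return self._clock() > deadline

# in a test, inject a fixed clock instead of the real one
fixed = lambda: datetime.datetime(2020, 1, 2)
assert ReportScheduler(clock=fixed).is_overdue(datetime.datetime(2020, 1, 1))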
I can relate to this question, as I currently have one of 'those' old-school codebases in my lap. It's not really legacy, but it certainly hasn't followed the trends of the years.
I'll tell you the things I would love to fix in it as they bug me every day:
Document the input and output variables
Refactor the variable names so they actually mean something, rather than some Hungarian notation prefix followed by a three-letter acronym with some obscure meaning. CamelCase is the way to go.
I'm scared to death of changing any code as it will affect hundreds of clients that use the software and someone WILL notice even the most obscure side effect. Any repeatable regression tests would be a blessing since there are zero now.
The rest is really peanuts. These are the main problems with a legacy codebase, they really eat up tons of time.
I'd say it largely depends on what you want to do with the legacy code...
If it will indefinitely remain in maintenance mode and it's working fine, doing nothing at all is your best bet. "If it ain't broke, don't fix it."
If it's not working fine, removing the unused code and refactoring the duplicate code will make debugging a lot easier. However, I would only make these changes on the erring code.
If you plan on a version 2.0, add unit tests and clean up the code you will bring forward.
Good documentation. As someone who has to maintain and extend legacy code, that is the number one problem. It's difficult, if not downright dangerous, to change code you don't understand. Even if you're lucky enough to be handed documented code, how sure are you that the documentation is right? That it covers all of the implicit knowledge of the original author? That it speaks to all of the "tricks" and edge cases?
Good documentation is what allows those other than the original author to understand, fix, and extend even bad code. I'll take hacked yet well-documented code that I can understand over perfect yet inscrutable code any day of the week.
The single biggest thing that I've done to the legacy code that I have to work with is to build a real API around it. It's a 1970s-style COBOL API that I've built a .NET object model around, so that all the unsafe code is in one place, all of the translation between the API's native data types and .NET data types is in one place, the primary methods return and accept DataSets, and so on.
This was immensely difficult to do right, and there are still some defects in it that I know about. It's not terrifically efficient either, with all the marshalling that goes on. But on the other hand, I can build a DataGridView that round-trips data to a 15-year-old application which persists its data in Btrieve (!) in about half an hour, and it works. When customers come to me with projects, my estimates are in days and weeks rather than months and years.
As a parallel to what Josh Segall said, I would say comment the hell out of it. I've worked on several very large legacy systems that got dumped in my lap, and I found the biggest problem was keeping track of what I had already learned about a particular section of code. Once I started leaving notes as I went, including "To Do" notes, I stopped re-figuring out what I had already figured out. Then I could focus on how those code segments flow and interact.
I would say just leave it alone for the most part. If it's not broken then don't fix it. If it is broken then go ahead and fix and improve the portion of the code that is broken and its immediately surrounding code. You can use the pain of the bug or sorely missing feature to justify the effort and expense of improving that part.
I would not recommend any wholesale kind of rewrite, refactor, reformat, or putting in of unit tests that is not guided by actual business or end-user need.
If you do get the opportunity to fix something, then do it right (the chance of doing it right the first time might have already passed, but since you are touching that part again, you might as well do it right this time around), and this includes all the items you mentioned.
So in summary, there's no single or just a few things that you should do. You should do it all but in small portions and in an opportunistic manner.
Late to the party, but the following may be worth doing where a function/method is used or referenced often:
Local variables often tend to be poorly named in legacy code (often owing to their scope expanding when a method is modified, and not being updated to reflect this). Renaming these in line with their actual purpose can help clarify legacy code.
Even just laying out the method slightly differently can work wonders - for instance, putting all the clauses of an if on one line.
There might be stale/confusing code comments there already. Remove them if they're not needed, or amend them if you absolutely have to. (Of course, I'm not advocating removal of useful comments, just those that are a hindrance.)
These might not have the massive headline impact you're looking for, but they are low risk, particularly if the code can't be unit tested.
