I am an engineering student deciding on my final-year project.
One of the many candidates is an online UML tool with code generation facilities. But I did not take a compiler design class, so I am not familiar with code generation techniques.
I want to know which techniques I should study in order to build something like this. If these techniques are as complicated as writing a compiler, then perhaps I will have to abandon this idea.
Compilation is really the opposite of the kind of code generation you are describing, so I don't think you need to know how to write a compiler.
Code generation can be as simple as combining text strings or using templates, or as complex as using Reflection.Emit to create classes at runtime.
I would start with this Wikipedia article.
Creating a UML tool is a long-term project. It requires many different kinds of expertise, more than a single team member can be expected to have.
Your academic project is too ambitious.
An easier project, which as far as I know has not been done, is to generate code from an activity or state diagram. Do not try to recreate the graphical editor, because that is very complex; instead, take the XMI export and generate code from it using an XML parser. This would be a good 6-month project for your thesis :-)
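To give a feel for the "export in, code out" idea, here is a minimal sketch in Python. The XML layout below is a hypothetical, heavily simplified stand-in for a state machine export; real XMI uses namespaces and a different element structure, so the element and attribute names (stateMachine, transition, source, event, target) are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for an exported state diagram.
# Real XMI files use namespaces and a different element layout.
SAMPLE = """
<stateMachine name="Door">
  <transition source="Closed" event="open" target="Opened"/>
  <transition source="Opened" event="close" target="Closed"/>
</stateMachine>
"""


def generate_state_machine(xml_text: str) -> str:
    """Read the transitions and emit Python source for a table-driven class."""
    root = ET.fromstring(xml_text)
    lines = [f"class {root.get('name')}:", "    TRANSITIONS = {"]
    for t in root.findall("transition"):
        key = (t.get("source"), t.get("event"))
        lines.append(f"        {key!r}: {t.get('target')!r},")
    lines += [
        "    }",
        "",
        "    def __init__(self, state):",
        "        self.state = state",
        "",
        "    def fire(self, event):",
        "        self.state = self.TRANSITIONS[(self.state, event)]",
    ]
    return "\n".join(lines)


source = generate_state_machine(SAMPLE)
print(source)
```

The generator itself is only string assembly; most of the real work in such a project would probably go into handling the actual export format of a given tool.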
Most UML tools generate source code. The generation is normally quite a bit simpler than a compiler as well. For example, a class diagram will have a collection of data structures representing classes and links between those classes (inheritance). To generate output, you walk through the class objects, and for each you "print" out a representation of that object in the syntax of the target language.
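The walk-and-print approach described above can be sketched roughly like this in Python. The ClassModel and Attribute structures and the generate_java function are made-up names for illustration, not taken from any particular UML tool, and the output is only Java-like skeleton text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attribute:
    name: str
    type_name: str


@dataclass
class ClassModel:
    """Hypothetical in-memory representation of one class in a diagram."""
    name: str
    parent: Optional[str] = None  # inheritance link, if any
    attributes: List[Attribute] = field(default_factory=list)


def generate_java(cls: ClassModel) -> str:
    """'Print' one class model in the syntax of the target language."""
    extends = f" extends {cls.parent}" if cls.parent else ""
    lines = [f"public class {cls.name}{extends} {{"]
    for attr in cls.attributes:
        lines.append(f"    private {attr.type_name} {attr.name};")
    lines.append("}")
    return "\n".join(lines)


def generate_all(classes: List[ClassModel]) -> str:
    """Walk through every class object and concatenate the output."""
    return "\n\n".join(generate_java(c) for c in classes)


model = [
    ClassModel("Shape", attributes=[Attribute("name", "String")]),
    ClassModel("Circle", parent="Shape", attributes=[Attribute("radius", "double")]),
]
print(generate_all(model))
```

Supporting another target language would then mostly mean writing another "print" function over the same data structures.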
I'm not sure exactly what capabilities your code generation will require, but the UML tools that I have used are not very sophisticated in their code generation.
Tools that I have used simply create files and drop your function names into them with arguments derived from the inputs. This would not require any understanding of compilers. Most of the difficulty would be in the user interface and how you store the data to make code generation easy.
You can just find that here:
http://yuml.me and http://askuml.com
So iOS 6 deprecates presentModalViewController:animated: and dismissModalViewControllerAnimated:, and it replaces them with presentViewController:animated:completion: and dismissViewControllerAnimated:completion:, respectively. I suppose I could use find-replace to update my app, although it would be awkward with the present* methods, since the controller to be presented is different every time. I know I could handle that situation with a regex, but I don't feel comfortable enough with regex to try using it with my 1000+-files-big app.
So I'm wondering: Does Xcode have some magic "update deprecated methods" command or something? I mean, I've described my particular situation above, but in general, deprecations come around with every OS release. Is there a better way to update an app than simply to use find-replace?
You might be interested in Program Transformation Systems.
These are tools that can automatically modify source code, using pattern-directed source-to-source transformations ("if you see this source-level pattern, replace it by that source-level pattern") that operate on code structures rather than text. Done properly, these transformations can be reliable and semantically correct, and they're a lot easier to write than low-level procedural code that navigates and smashes nanoscopic actual tree structures.
It is not the case that using such tools is easy; such tools have to know how to parse the language of interest (e.g., Objective-C) into compiler data structures, process the patterns, and regenerate compilable source code from the modified structures. Even with the basic transformation engine, somebody needs to carefully define parsers (and unparsers!) for the dialects of the languages of interest. And it takes time to learn how to use such a tool even if you have such parsers/unparsers. This is worth it if the changes you need to make are "regular" (in the program transformation sense, not the regexp sense) and widespread (as yours seem to be).
Our DMS Software Reengineering Toolkit has an Objective-C front end, and can carry out such transformations.
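As a toy illustration of rewriting code structures rather than text, here is a sketch using Python's standard ast module on Python source. It is not an Objective-C transformation and not how DMS works; the method names present_modal and present are hypothetical stand-ins for a deprecated call and its replacement.

```python
import ast


class ReplaceDeprecatedCall(ast.NodeTransformer):
    """Rewrite obj.present_modal(vc, animated) into
    obj.present(vc, animated, completion=None).

    Both method names are hypothetical stand-ins for a deprecated API.
    """

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)  # handle nested calls first
        func = node.func
        if isinstance(func, ast.Attribute) and func.attr == "present_modal":
            func.attr = "present"
            node.keywords.append(
                ast.keyword(arg="completion", value=ast.Constant(value=None))
            )
        return node


def modernize(source: str) -> str:
    """Parse, transform the tree, and regenerate source text."""
    tree = ReplaceDeprecatedCall().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # ast.unparse requires Python 3.9+


old = "self.present_modal(make_controller(kind), True)"
print(modernize(old))
```

Because the match is on a call node, the argument expression can be anything, and occurrences of the name inside string literals are left alone, which is where a plain find-replace tends to go wrong.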
No, there is no magic like that.
I am in the process of understanding and building a static code analysis tool for a proprietary language from a big company. The reason for doing this: I have to review a rather large code base, a static code analysis tool would help a lot, and they do not have one for the language so far.
I would like to know how one goes about building a static code analysis tool, e.g. like Lint or Splint for C.
Any books, articles, blogs, sites, etc. would help.
Thanks.
I know this is an old post, but the answers don't really seem that satisfactory. This article is a pretty good introduction to the technology behind the static analysis tools, and has several links to examples.
A good book is "Secure Programming with Static Analysis" by Brian Chess and Jacob West.
You need good infrastructure, such as a parser, a tree builder, tree analyzers, symbol table builders, and flow analyzers; then, to get on with your specific task, you need to code specific checks for the specific problems of interest to you, using all the infrastructure machinery.
Building all that foundation machinery is actually pretty hard, and it doesn't help you do your specific task. People don't write the operating system for every application they code; why should you build all the infrastructure? Like an OS, it is better if you simply acquire good infrastructure.
People will tell you to use lex and yacc. That's kind of like suggesting you use the real-time kernel part of the OS; useful, but far from all the infrastructure you really need.
Our DMS Software Reengineering Toolkit provides all the necessary infrastructure. It has been used to define many language front ends as well as many tools for such languages.
Such infrastructure would allow you to define your specific nonstandard language relatively quickly, and then get on with your task of coding your special checks.
There is a blog by DeepSource that covers everything one needs to know to build an understanding of static code analysis and equip you with the basic theory and the right tools so that you can write analyzers on your own.
Here’s the link: https://deepsource.io/blog/introduction-static-code-analysis/
1. Obviously you need a parser for the language. A good high-level AST is useful.
2. You need to enumerate a set of "mistakes" in the language. Without knowing more about the language in question, we can't help here. Examples: unallocated pointers in C, etc.
3. Combine the AST with the mistakes in #2.
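A minimal sketch of those steps, using Python as a stand-in since the proprietary language is unknown: Python's standard ast module plays the role of the parser, and the two "mistakes" checked here (a bare except, and comparing to None with == or !=) are just examples chosen for illustration.

```python
import ast
from typing import List, Tuple


class MistakeFinder(ast.NodeVisitor):
    """Walks the AST and records (line, message) for each known mistake."""

    def __init__(self) -> None:
        self.findings: List[Tuple[int, str]] = []

    def visit_ExceptHandler(self, node: ast.ExceptHandler) -> None:
        if node.type is None:  # "except:" with no exception class
            self.findings.append((node.lineno, "bare except"))
        self.generic_visit(node)

    def visit_Compare(self, node: ast.Compare) -> None:
        for op, right in zip(node.ops, node.comparators):
            is_none = isinstance(right, ast.Constant) and right.value is None
            if isinstance(op, (ast.Eq, ast.NotEq)) and is_none:
                self.findings.append((node.lineno, "comparison to None with == or !="))
        self.generic_visit(node)


def analyze(source: str) -> List[Tuple[int, str]]:
    finder = MistakeFinder()
    finder.visit(ast.parse(source))
    return sorted(finder.findings)


CODE = """\
try:
    value = load()
except:
    value = None
if value == None:
    print("missing")
"""
for line, message in analyze(CODE):
    print(f"line {line}: {message}")
```

Each new check is one more visit method; for a language without a ready-made parser, producing the AST in the first place is likely the larger part of the work.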
I'm running a refactoring code dojo for some coworkers who asked how refactoring and patterns go together, and I need a sample code base. Does anyone know of a good starting point that isn't so horrible that they can't make heads or tails of the code, but that they can rewrite their way to something useful?
I would actually suggest refactoring some of your and your coworkers' code.
There are always places that an existing codebase can be refactored, and the familiarity with the existing code will help make it feel more like a useful thing and less like an exercise. Find something in your company's code to use as an example, if possible.
Here are some code samples, both the original and the refactored versions, so you can prepare your kata or simply compare the results once the refactoring is performed:
My books have both shorter examples and a longer, actually book-length, example. The code is free to download.
VB Code Examples
C# Code Examples
A nice example from Refactoring Workbook
There are a lot of examples on the internet of simple games like Tic-Tac-Toe or Snake that have a lot of smells but are simple enough to start with refactoring.
The first chapter of Martin Fowler's "Refactoring" is a good starting point for refactoring. I understood most of the concepts when one of my teachers at school used this example.
What is the general knowledge level of your coworkers?
Something as basic as code duplication should be easy to wrap their heads around: two pieces of (nearly) identical code that can be refactored into a reusable method, class, or whatever. Using a (past) example from your own codebase would be good.
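If nothing suitable turns up in your own codebase, a small made-up case like the following could serve as a warm-up. All function names here are invented for illustration; the point is only the before/after shape of extracting duplicated logic.

```python
# Before: the same validation and normalization appears in two places.
def register_user_before(name, email):
    if "@" not in email or email.startswith("@"):
        raise ValueError("invalid email")
    return {"name": name, "email": email.lower()}


def update_email_before(user, email):
    if "@" not in email or email.startswith("@"):
        raise ValueError("invalid email")
    user["email"] = email.lower()
    return user


# After: the shared logic is extracted into one reusable function.
def normalize_email(email):
    """Validate and normalize an email address (deliberately simplistic)."""
    if "@" not in email or email.startswith("@"):
        raise ValueError("invalid email")
    return email.lower()


def register_user(name, email):
    return {"name": name, "email": normalize_email(email)}


def update_email(user, email):
    user["email"] = normalize_email(email)
    return user


print(register_user("Ann", "Ann@Example.com"))
```

The behavior stays the same before and after, which is what makes it a refactoring rather than a rewrite.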
I would recommend developing a simple example project for a specific requirement.
Then add one more requirement and make changes to the existing classes. Keep doing this and show them how difficult each change becomes when the code is not designed properly. They will recognize this easily, because it is what they do in their day-to-day work. Make them realize that if patterns and principles are not followed from the beginning, they will end up in a mess.
Once they realize that, either start from scratch or refactor the existing messed-up code. Now add a requirement and let them see that it is easy to make a change in the refactored code, so that only a few classes need to be tested, one change does not affect the others, and so on.
You could use computer, keyboard and printer classes as an example. Add requirements such as wanting the computer to read from a mouse; another requirement could be that the computer should save to a hard disk rather than print. Finally, in your refactored code the computer class should depend on an abstract input device class and an abstract output device class, and the keyboard class should inherit from the input device class.
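The end state of that example might look something like the sketch below. The class and method names (read, write, transfer) are my own choices for illustration; the answer only prescribes the dependency on abstract input and output device classes.

```python
from abc import ABC, abstractmethod
from typing import List


class InputDevice(ABC):
    @abstractmethod
    def read(self) -> str: ...


class OutputDevice(ABC):
    @abstractmethod
    def write(self, data: str) -> None: ...


class Keyboard(InputDevice):
    def __init__(self, typed: str) -> None:
        self.typed = typed

    def read(self) -> str:
        return self.typed


class Mouse(InputDevice):
    def read(self) -> str:
        return "click"


class Printer(OutputDevice):
    def __init__(self) -> None:
        self.pages: List[str] = []

    def write(self, data: str) -> None:
        self.pages.append(data)


class HardDisk(OutputDevice):
    def __init__(self) -> None:
        self.blocks: List[str] = []

    def write(self, data: str) -> None:
        self.blocks.append(data)


class Computer:
    """Depends only on the abstractions, never on a concrete device."""

    def __init__(self, source: InputDevice, sink: OutputDevice) -> None:
        self.source = source
        self.sink = sink

    def transfer(self) -> None:
        self.sink.write(self.source.read())


disk = HardDisk()
Computer(Mouse(), disk).transfer()  # new requirement, Computer unchanged
print(disk.blocks)
```

Adding the mouse and the hard disk requires no change to Computer, which is the contrast with the unrefactored version that the exercise is meant to make visible.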
Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin considers refactoring.
I'm loving Refactoring Guru examples.
In there you can find design patterns examples too.
Refactoring addresses non-functional concerns: the code correctly performs the functionality it was designed for, but it is difficult to debug, takes more effort to maintain, and may have performance bottlenecks. Refactoring changes the code so that it is easier to maintain, more readable, and more efficient.
Thus we need to focus on criteria that make code more readable and easier to maintain.
It is obvious that a very large method/function can be difficult to understand.
A class that depends on hundreds of other classes makes things worse while debugging.
Code should be as readable as a description of a workflow.
You can also use tools like Sonar, which can help you identify critical criteria such as "Cyclomatic Complexity":
http://www.sonarsource.org/managing-cyclomatic-complexity-to-increase-maintainability/
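To make the metric less abstract for a dojo, a rough estimate can be computed by counting branching constructs. The sketch below does this for Python source with the standard ast module; it is a simplification for illustration and does not reproduce the exact counting rules Sonar or any other tool applies.

```python
import ast

# Constructs treated as decision points in this simplified estimate.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Rough estimate: 1 + number of decision points in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a or b or c" contributes one extra path per additional operand
            complexity += len(node.values) - 1
    return complexity


SAMPLE = """\
def classify(n):
    if n < 0 or n > 100:
        return "out of range"
    for d in (2, 3):
        if n % d == 0:
            return "divisible"
    return "other"
"""
print(cyclomatic_complexity(SAMPLE))
```

Here the two if statements, the for loop, and the or each add one to the base value of 1.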
You can ask them to write code themselves and then see how the tool performs the refactoring.
Apart from that, you can write code in Eclipse, which has options that do the refactoring for you.
It's a bit dated (2003), but IBM has several refactoring examples (that work[ed?] in Eclipse) at http://www.ibm.com/developerworks/library/os-ecref/
As a pet project, I was thinking about writing a program to migrate applications written in a language A into a language B.
A and B would be object-oriented languages. I suppose it is a very hard task: mapping language constructs that are alike is doable, but mapping library concepts will be a very long task.
I was wondering what tools to use. I know this has to do with compilation, but I'm a bit afraid to use Lex and Yacc and all that stuff.
I was thinking of maybe using the Eclipse Modeling Framework, which would help me write transformations of models (of application code) in a readable form.
But first I would have to write parsers for creating the models (and also create the metamodel from the language grammar).
Are there tools that exist that would make my task easier?
You can use special transformation tools/languages for that, such as TXL or Stratego/XT.
You can also have a look at, and easily try, the Java to Python and Java to Tcl migration projects I made with TXL.
You are right about mapping library concepts. It is a rather hard and long task. There are two ways here:
Fully migrate the class library from language A to B
Migrate classes/functions from language A to the corresponding concepts in language B
The approach you choose depends on your goals and the time/resources available. Also, in many cases you won't be doing a general A->B migration that covers all possible cases; you will just need to convert some project/library/etc., so you will see in your particular case what is better to do with classes/libraries.
I think this is almost impossibly hard, especially as a personal project. But if you are going to do it, don't make life even more difficult for yourself by trying to come up with a general solution. Choose two specific real-life programming languages and investigate the possibilities of converting between them. I think you will be shocked by the number of problems and issues this will expose.
There are some tools for direct migration for some combinations of A and B.
There are a variety of reverse engineering and code generation tools for different languages and platforms. It's fairly rare to see reverse engineering tools which capture all the semantics of the source language, and the semantics of UML are not well defined (since it's designed to map to different implementation languages, it itself doesn't define a complete execution model for its behavioural representations), so you're unlikely to be able to reverse engineer and generate code between tools. You may find one tool that does full reverse engineering and full code generation for your A and B languages, and so may be able to get somewhere.
In general you don't use the same idioms on different platforms, so you're more likely to get something which emulates A code on B rather than something which corresponds to a native B solution.
If you want to use Java as the source language (the language you are converting from), then you might use the Checkstyle AST (it is used to write rules). It gives you a tree structure with every operation in the source code. This will be much easier than writing your own parser or using regex.
You can run com.puppycrawl.tools.checkstyle.gui.Main from checkstyle-4.4.jar to launch a Swing GUI that parses Java source code.
Based on your comment
I'm not sure yet, but I think the source language/framework would be Java/Swing and the target some RIA language like Flex or a Javascript/Ajax framework. – Alain Michel 3 hours ago
Google Web Toolkit might be worth a look.
See this answer: What kinds of patterns could I enforce on the code to make it easier to translate to another programming language?
So, a highly hypothetical question, and more of a discussion about the coding style and practices you use daily.
I will take as an example CodeGear RAD Studio 2009 (sorry to all D7 fans, but Unicode rules).
I have the capability to expand/collapse functions/procedures/records and a few other complex data structures, but what if the code is lengthy?
Which is more efficient for getting the task done: spending the time to add comments (which are required anyway) and expanding/collapsing the necessary areas, or using the possibilities that OMT offers?
To give an example of my own: I have a small app, about 1.5k lines, and I do not use modeling. Is that smart enough, or do I lose a lot of time when I need to find some simple references or (event) calls?
If I understand your question correctly, it is about finding your way around code (yours or someone else's).
I use Model Maker Code Explorer for browsing through source code (and for refactoring existing code, and creating new code). At EUR 99, it is dead cheap for what it does.
It usually gives me a perfect overview of what I need, and has a nice 'search' interface as well.
If I need more complex searches, I usually use the GExperts (grep) search function: it is blazingly fast, and with good naming of your identifiers, it is usually a breeze to find stuff.
If I understand your question correctly, you want to know what is more efficient:
Use comments and expandable sections.
Use modeling techniques.
I think it depends on personal style. Modeling can be great, but has dangers of spending too much time creating nice pictures.
We have a large app, 500k+ lines. We do not use collapsible sections because we keep our file sizes acceptable and we have a good file organisation structure. We sometimes use modeling if complex parts are added (class diagrams and state diagrams). And we use lots of comments to explain difficult parts.
If you have Delphi 2009 you can also use the Delphi Class Explorer (in the View menu) to see your classes. It seems a little bit cryptic, but only for the first 5 minutes. After that you will get used to it.
You can also use CnPack, a very impressive package, to help you manage your project. Basically, a new menu called 'CnPack' appears in the IDE, with a bunch of wizards to help you find your way around the source. Some examples:
Uses Cleaner
Procedure List (it gives you the incremental search capability for your procedures - very neat)
Bookmark Browser
etc.