How can I develop a custom transpiler using NLP?

There are ~500k code snippets written in a proprietary language that I have to port to a new system, which also uses its own proprietary language. I have the following:
Vocabulary and grammar of source and destination languages
Sample of 1500 converted rules (for training if required) of different complexity
I am not looking for 100% automation, but maybe a transpiler that automates part of the work. Can it be done using NLP? I have already gone through this, this, Rascal, Haxe and Spoofax, but I could not find much documentation on how to create a custom transpiler.
Any help is appreciated. Thank you!

Find links/relationships between 2 variables/objects in the code

EDIT:
I found that doxygen can generate call graphs for classes, but I could not find any options or examples where the call/caller graph is generated for public/private members of the classes, such as fields and methods. See the example that I provided below.
Is it possible to find links/relationships between two variables/objects in the code using IDE tools or code editors, e.g. Visual Studio or Sublime?
e.g.
a=func(b,c);
w=func(a,c);
Here w and b are indirectly related to each other.
In convoluted code it is very difficult to manually find such relationships.
I understand that reflection and dynamic nature of some languages can limit such analysis.
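For Python source specifically, the kind of relationship described in the question can be approximated with the standard-library ast module. The following is only a minimal sketch under stated assumptions: the helper names dependency_graph and depends_on are made up for illustration, the analysis is flow-insensitive, it only looks at plain assignments, and it counts every name read on the right-hand side (including the called function's name) as a dependency. It does not handle aliasing, attributes, or the dynamic features mentioned above.

```python
import ast
from collections import defaultdict


def dependency_graph(source):
    """Map each assigned name to the set of names read on its right-hand side."""
    deps = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            # Every plain name appearing in the assigned expression.
            reads = {n.id for n in ast.walk(node.value) if isinstance(n, ast.Name)}
            for target in node.targets:
                for t in ast.walk(target):
                    if isinstance(t, ast.Name):
                        deps[t.id] |= reads
    return deps


def depends_on(deps, var, other):
    """Return True if `var` depends on `other` directly or transitively."""
    seen, stack = set(), [var]
    while stack:
        current = stack.pop()
        for dep in deps.get(current, ()):
            if dep == other:
                return True
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return False


code = "a = func(b, c)\nw = func(a, c)\n"
deps = dependency_graph(code)
print(depends_on(deps, "w", "b"))  # w reaches b through a
print(depends_on(deps, "b", "w"))  # b is never assigned from w
```

A real tool would need per-scope handling and control-flow awareness, so this only illustrates the idea of walking assignments and following them transitively.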
You need to provide the language you are looking to use. If I take a guess and say C/C++ you can use CCTree and Cscope in general for this functionality. Most open source developers use Cscope extensively for this purpose.
Eclipse CDT also has call graphs. I know it is a bit of a pain to work outside of Visual Studio for this purpose, but cost is part of the reason to use open source instead.
Your best bet to cover all languages for the purpose of browsing is Exuberant Ctags. This works with a fair number of editors and all the languages you listed. With that large a list of languages and use cases, it's probably worth your time to learn either vim or emacs and the integrations supported here.
For Python you can also take a look at pycscope with cscope. Another excellent alternative for Python is Rope. Rope supports finding definitions and usages as part of its standard set of tools.
Most developers do not need CCTree, as browsing code bases with cscope is relatively straightforward. I have used Exuberant Ctags + emacs on a huge variety of languages for years. It takes a little time to learn, but the upsides are that it's free, portable, and powerful. Another alternative to CCTree is codegraph for some of your target languages.
Found a list of tools and comparison:
https://github.com/OpenGrok/OpenGrok/wiki/Comparison-with-Similar-Tools
EDIT:
This is possible in doxygen, but only for classes and their relationships.
I found it: this is Code Map in Visual Studio Ultimate:
http://blogs.msdn.com/b/visualstudioalm/archive/2014/11/12/announcing-visual-studio-2015-preview-availability.aspx

Where can I find math intense apps in Ruby

I have found many Rails apps, mainly enterprise and social-networking kinds of web apps. I see that Ruby is compared with some of the great OOP languages like Java and C#, but I am finding it hard to find math-intensive apps. Any knowledgeable input (links to sample programs, etc.) that shows the language being used with ease, serves as a jumpstart, or shows how the language can be used for a variety of math problems is greatly appreciated.
Unfortunately, Ruby hasn't ventured very far into mathematical and scientific computing. Currently, there is a pre-alpha library called SciRuby that is attempting to bring more math oriented capabilities to Ruby. They are trying to build a NumPy/SciPy equivalent. A few projects that are under SciRuby with example usage are:
NMatrix
Rubyvis
Statsample
Each project has various examples on how to get started/contribute. A good place to start is their docs and their mailing list.
Hope that helps.

Best practices for internationalization using PyQt4

I want to add multiple language support to my application which is written in Python using PyQt4. I was looking for information on how to add multiple languages and would like to see how other people do this.
Here I read:
The PyQt behaviour is unsatisfactory and may be changed in the future.
It is recommended that QCoreApplication.translate() be used in
preference to tr() (and trUtf8()). This is guaranteed to work with
current and future versions of PyQt and makes it much easier to share
message files between Python and C++ code.
In files generated by pyuic4 I see something like:
WPopupCalendar.setWindowTitle(QtGui.QApplication.translate("WPopupCalendar", "Календарь", None, QtGui.QApplication.UnicodeUTF8))
This looks too long to me. I was thinking of writing my own tr helper function that would somehow automate the process.
Also, I could not find articles describing a workflow and the specifics of developing multilingual apps in Python with PyQt4.
Could you please advise me on some good and convenient techniques for this?
Just use tr (or trUtf8) everywhere to start with. Only bother with translate when you identify code that is affected by the issue with multiple inheritance (which could easily be never).
I would suggest you have a look at Qt's i18n overview and the Qt Linguist Manual. They are both oriented towards C++ projects, but they should give you a pretty clear idea of what's required.
For a working example, you could also download the source code of the Eric Python IDE - it's written in PyQt4, and has support for a half dozen or more languages.
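The "own tr helper" idea from the question could look roughly like the sketch below. It is not taken from PyQt4 itself: make_tr, fake_translate and CATALOG are hypothetical names, and fake_translate is only a stand-in so that the example runs without PyQt4 installed. In a real application the callable passed in would be QtGui.QApplication.translate, with the same argument order as the pyuic4-generated call shown in the question.

```python
def make_tr(translate, context, *extra_args):
    """Return a short tr(text) helper bound to one translation context.

    `translate` is any callable that takes (context, text, disambiguation, ...),
    which is the argument order used in the pyuic4-generated code.
    """
    def tr(text, disambiguation=None):
        return translate(context, text, disambiguation, *extra_args)
    return tr


# Stand-in for QtGui.QApplication.translate (assumption: used here only so the
# sketch runs without PyQt4). It looks the text up in a hypothetical catalog.
CATALOG = {("WPopupCalendar", "Calendar"): "Календарь"}


def fake_translate(context, text, disambiguation=None, *extra_args):
    return CATALOG.get((context, text), text)


tr = make_tr(fake_translate, "WPopupCalendar")
print(tr("Calendar"))  # found in the stand-in catalog
print(tr("Close"))     # no entry, so the source text comes back unchanged
```

With PyQt4 this would presumably become make_tr(QtGui.QApplication.translate, "WPopupCalendar", QtGui.QApplication.UnicodeUTF8). Whether pylupdate4 still extracts strings wrapped in a custom helper, and under which context, should be verified before relying on this.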

why use Google V8

I don't get it. I'm a C/C++ programmer, what's the possible use of V8 for me? There are few examples and tutorials out there, and they all lack substance - I don't want to use another library to just add a couple of numbers or print something in a console window.
My question is: is there a real use for this technology, and if so, what would the scenario be?
Also, can I do any part of GUI this way?
Help is appreciated.
"V8 is Google's open source JavaScript engine"
So the whole point is ability to write code in JavaScript, and run it quite fast (for an interpreted dynamic language). Google Chrome, which is written in C++, uses it for internal scripting — not only for regular web page scripting, but also for extension code. Let's consider this as a 'real use'.
So, if your app needs scripting, V8 may be good for you (JS is not a perfect language, but still quite decent). As for GUI, you'll need to bind your GUI components to JS first; there are no built-in UI components (unlike Tcl, which has Tk).
One real use of V8 is node.js. I hope that is good enough.
Google V8 is a JavaScript engine.
I don't really think it is what you are looking for.
V8 is a JavaScript engine. The most common use for it is to allow users of your software to write scripts in a simpler language than the one your software is written in (C++ in your case).
It's the same approach used by MATLAB, AutoCAD, Microsoft Office, etc.
If you write any kind of commercial application, you can expose some APIs and allow other developers to create add-ons for your application without requiring them to know C/C++.
How about this for real use: you can use JavaScript as a debugging or testing tool. Add a JavaScript console to your app and bind the commands of your GUI application to JavaScript functions, and you'll be able to test your UI application using JavaScript scripts. This reduces the amount of manual testing needed: manual testing would only have to verify that the correct command was executed as a result of a user action.
You can do GUI in javascript the same way that Qt is being used in Python and other scripting languages (see PyQt, and QtRuby, PerlQt, etc.). For how to create bindings for V8 you may want to check out this

Pros and cons of using gettext instead of QObject.tr() for localization of PyQt4 application?

I have a couple of applications written in PyQt4 where I've used the standard Python gettext library for internationalization and localization of the GUI. It works well for me. But I selected gettext only because I already had knowledge and experience of using gettext, and zero experience with the Qt4 tr() approach.
Now I'd like to compare both approaches better and understand what I'm missing by using gettext instead of QObject.tr. Is there any serious reason why I should not use gettext for Qt4/PyQt4 applications?
In my understanding, the advantages of using gettext are:
GNU gettext is mature and seems to be the de facto standard in the GNU/Linux world.
There are enough specialized editors for PO files to simplify translators' work, although the textual nature of PO templates makes them not strictly necessary.
There are even web services available that can be used for collaborative translation.
gettext is part of the Python standard library, so I don't need to install anything special to use it at runtime.
It has very good support for singular/plural form selection via ngettext().
What I see as advantages of QObject.tr():
This is the native technology for Qt4/PyQt4, so maybe it will work better/faster (although I have no data to prove it).
The messages to translate may carry additional context information that helps translators choose the best variant for homonyms, e.g. the English word "Letter" can be translated as "Character", "Mail" or even a kind of "Paper size" depending on the actual context.
What I see as disadvantages of QObject.tr() vs gettext:
I did not find in the Qt documentation how singular/plural selection is supported there.
The Qt4 TS translation template is in XML format and is therefore more complex to edit without a special editor (Qt Linguist), and there seem to be no other third-party solutions or web services. So translators would have to learn a new tool (if they are already familiar with PO tools).
But none of the items above is critical enough to say clearly that one tool is better than the other. And I don't want to start a flame war about which is better, because that's very subjective. I just want to know what I am missing in the pros and cons of QObject.tr() vs gettext.
One simple reason to use QObject.tr() is:
It saves you the need to install gettext on Windows, making cross-platform work a bit easier.
I try to have as few binary dependencies as possible on Windows.
All have their pros and cons, but to define them more clearly you would first have to decide whether you're targeting a mobile environment or a desktop environment.
Within our company we use different methods, simply because the ideal solution does not exist yet.
For desktop development we're using PO files, simply because the buttons are not scaled and therefore the text will fit.
For mobile development, the translation of a string depends on the button size, which can be different on landscape and portrait devices.
This complicates things a little, because a PO file can hold just one translation of a given word.
So we selected XLIFF for this, so that we could assign unique IDs to each string.
This is not an easy task either, because there are no good solutions for converting .RC files to XLIFF files
(current tools convert ALL strings between "", which is of course unwanted behavior).
So I wrote a converter for this task.
However, plural forms are very important for localization, so a solution without them is not a good localization solution.
Therefore, I would say go for PO/gettext.
Greetings,
Floris.
At the current time, Qt does not handle plural forms when you're making use of QT_TRANSLATE_NOOP.
You could add that arguments are handled differently.
With gettext, we can do
_("Hello %(name)s from %(city)s") % person.__dict__
whereas in PyQt, we do
self.tr("Hello %1 from %2").arg(person.name).arg(person.city)
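The gettext side of this comparison, together with the ngettext() plural selection mentioned in the question, can be tried with the standard library alone. This is a minimal sketch: the Person class and the message strings are made up for illustration, and gettext.NullTranslations is used so that no compiled .mo catalog is needed (it simply returns the source strings).

```python
import gettext

# With no compiled .mo catalog loaded, NullTranslations returns the source
# strings unchanged, which is enough to show the calling pattern.
translation = gettext.NullTranslations()
_ = translation.gettext
ngettext = translation.ngettext


class Person:
    """Hypothetical stand-in for the `person` object used in the answer."""

    def __init__(self, name, city):
        self.name = name
        self.city = city


def greeting(person):
    # Named placeholders let translators reorder the arguments freely.
    return _("Hello %(name)s from %(city)s") % vars(person)


def inbox_summary(count):
    # ngettext picks the singular or plural message according to `count`.
    return ngettext("%(n)d new message", "%(n)d new messages", count) % {"n": count}


print(greeting(Person("Ann", "Oslo")))
print(inbox_summary(1))
print(inbox_summary(5))
```

With a real catalog, gettext.translation(...) would replace NullTranslations, and the plural rule would come from the catalog's Plural-Forms header rather than the simple n == 1 fallback.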

Resources