I would like to know about the extent of support for CUDA in the Rose compiler. I am trying to build a source-to-source translator for CUDA. Is that possible using the Rose compiler? Which distribution of Rose should I use?
I know this has been discussed earlier (support for CUDA in the Rose compiler), but I cannot tell whether CUDA support is actually there or not. The Rose user manual does not have much information either.
Rose has a C++ front end and a Fortran front end that seem reasonably well integrated. The Rose system design, IMHO, is not amenable to easy integration of other front-end parsers (such as you would presumably need to parse CUDA), although with enough effort you could do it. (Rose originally only had C++; Fortran was grafted on.)
If you don't see explicit mention of CUDA in the Rose manuals, it's pretty likely because it simply isn't there.
If you want to process CUDA using source-to-source transformations, you'll need both a CUDA parser and an appropriate set of transformation machinery, something like what Rose has.
I cannot offer you a CUDA parser, but my company does provide industrial-strength source-to-source program transformation systems in the form of the DMS Software Reengineering Toolkit.
DMS has been used to carry out massive transformations on large C++ systems, so I think it quite reasonable to say it is at least as competent as Rose for that purpose. DMS has also been used to process extremely large C and Fortran systems, and other codes in Java, C#, ECMAScript, PHP, and many other languages, so I think it safe to say it is considerably easier to integrate a different front end into DMS.
CUDA, as I understand it, is a C99 derivative. DMS has a C front end, with explicit support for building various C dialects. Most of C99 is already built using the dialect mechanism. That might be a pretty good starting point.
You can try other tools such as ANTLR as an alternative, but I think it will soon become obvious that ANTLR, Rose, and DMS are in very different leagues in terms of their ability to parse, analyze, and transform complex systems of real code.
I'm trying to figure out how creating a simple programming language works, both the syntax and the compiler itself. I've done some research on this topic, but I still can't quite pin down what my real question is.
I would think that existing programming languages' compilers are built with already existing programming languages, and therefore it would only make sense to base my compiler on one of those languages too.
Although, since the very first language with a compiler didn't have another language to be based on, that can't be entirely true; it must have been based on something else, like the computer's own machine language.
Which way is the best way to go, and how do I get to my goal, which is creating a simple (with room for expanding) programming language?
Any answer is appreciated!
The very first compilers were based on assembler coding. Where did the assemblers come from?
The very first assemblers were based on painfully entered raw binary machine code instructions.
Hardly anybody enters binary; at the very least, some kind of debugger program is used to do this. Hardly anybody codes compilers using assemblers anymore either; in many cases, a first compiler for a language is coded in C.
If you want to build a programming language, your first step is to get a compiler book (google "compiler book") and read it from cover to cover. If you try to avoid this step, you'll spend a huge amount of energy to try and invent what you need, and you'll likely fail.
Key tools for building compilers are parser generators, and program transformation systems. The former is the classic answer. The latter is a high-tech answer, and isn't very common, but can produce language processing tools much more quickly than classic answers. You need the compiler book background to understand these tools.
Which way is the best way to create a simple programming language?
Unlike a majority of people, I don't believe that creating a language is about using a compiler or interpreter. While you will most likely need a compiler or interpreter to implement your new language, they are tools, just as pencil and paper are tools. Don't start by using a tool and think you have accomplished something. It would be like using a wrench to make an engine that doesn't work, and then claiming you made an engine because you used a wrench.
To create a good programming language you have to have a goal for your language.
Since you mention programming language as opposed to some other type of language such as SQL, or a markup language such as HTML, I will take it that you want a Turing complete language.
Since most Turing complete languages support arithmetic, I would start with a simple arithmetic expression language and build on that. There are a huge number of examples of these on the Internet, but be forewarned that many have problems.
Next, learn how to build abstract syntax trees (ASTs) for arithmetic expressions, e.g.
3 + 2 * 6

      +
     / \
    3   *
       / \
      2   6
Do not use a compiler to build the AST; build it by hand in the language you are using to write your programming language, e.g. if you are using Java to create a C++ compiler, then create the AST using Java.
Then write an evaluator for the AST that will walk the tree.
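For instance, a minimal hand-built AST and evaluator for that expression might look like the sketch below (written in Java only as an example; the Expr/Num/BinOp names are purely illustrative, not from any library):

// Minimal AST for arithmetic expressions, built by hand (no parser yet).
interface Expr {
    int eval();                       // walk the tree and compute a value
}

class Num implements Expr {
    final int value;
    Num(int value) { this.value = value; }
    public int eval() { return value; }
}

class BinOp implements Expr {
    final char op;
    final Expr left, right;
    BinOp(char op, Expr left, Expr right) { this.op = op; this.left = left; this.right = right; }
    public int eval() {
        int l = left.eval(), r = right.eval();
        switch (op) {
            case '+': return l + r;
            case '-': return l - r;
            case '*': return l * r;
            case '/': return l / r;
            default:  throw new IllegalStateException("unknown operator: " + op);
        }
    }
}

public class AstDemo {
    public static void main(String[] args) {
        // 3 + 2 * 6, constructed by hand exactly as in the diagram above
        Expr tree = new BinOp('+', new Num(3), new BinOp('*', new Num(2), new Num(6)));
        System.out.println(tree.eval());  // prints 15
    }
}

The point is that the tree structure and the evaluation are completely independent of any parsing.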
Once you are able to correctly build an AST and evaluate it, add the lexer/parser that translates human-readable source code into an AST. This is where you will need a good compiler design book.
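As a rough sketch of that step, assuming the Expr, Num, and BinOp classes from the previous example, a tiny hand-written recursive-descent parser for the same arithmetic grammar could look something like this:

// Recursive-descent parser: turns "3 + 2 * 6" into the same AST built by hand above.
class Parser {
    private final String src;
    private int pos = 0;

    Parser(String src) { this.src = src; }

    Expr parse() {                       // expr := term (('+'|'-') term)*
        Expr left = term();
        while (true) {
            char c = peek();
            if (c == '+' || c == '-') { pos++; left = new BinOp(c, left, term()); }
            else return left;
        }
    }

    private Expr term() {                // term := number (('*'|'/') number)*
        Expr left = number();
        while (true) {
            char c = peek();
            if (c == '*' || c == '/') { pos++; left = new BinOp(c, left, number()); }
            else return left;
        }
    }

    private Expr number() {              // number := one or more digits
        skipSpaces();
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        return new Num(Integer.parseInt(src.substring(start, pos)));
    }

    private char peek() { skipSpaces(); return pos < src.length() ? src.charAt(pos) : '\0'; }

    private void skipSpaces() { while (pos < src.length() && src.charAt(pos) == ' ') pos++; }
}

// Usage: new Parser("3 + 2 * 6").parse().eval()  evaluates to 15.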
Now you can compile the AST into assembly or byte code or just continue using an evaluator.
From this point on you just add features to your language, again starting with the AST and then modifying the parser and the code generator, if you implemented one.
How to create a simple (with room for expanding) programming language?
As I noted: start with an arithmetic evaluator and add language concepts one at a time. Since you are new at this, you may find that a concept you add is actually better expressed as a composition of simpler concepts, and that you should add those simpler concepts first, building up to the higher-level concept.
Because your question is so general I can't give more specific answers. I see that you already have a few close votes noting such.
If you want to build an unlimited extensibility into your language, consider implementing a simple metaprogramming system in it.
This way you can start with some very simple and small language, and then build an arbitrarily complex language, or a set of different languages, by extending it with its own macros. Such a language can be trivially turned into any other language.
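To make the idea concrete, here is a deliberately tiny, hypothetical sketch in Java of that kind of mechanism: macros that rewrite the token stream of a toy postfix language before the core evaluator ever sees it (none of this is taken from any real macro system):

import java.util.*;

// Toy "metaprogramming" for a postfix (RPN) mini-language: user-defined macros
// rewrite the token stream before evaluation, so new words can be added without
// ever touching the core evaluator.
public class MacroDemo {
    static Map<String, List<String>> macros = new HashMap<>();

    static List<String> expand(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (macros.containsKey(t)) out.addAll(expand(macros.get(t)));  // recursive expansion
            else out.add(t);
        }
        return out;
    }

    static int eval(List<String> tokens) {               // core language: numbers, + and * only
        Deque<Integer> stack = new ArrayDeque<>();
        for (String t : tokens) {
            if (t.equals("+")) stack.push(stack.pop() + stack.pop());
            else if (t.equals("*")) stack.push(stack.pop() * stack.pop());
            else stack.push(Integer.parseInt(t));
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // "double" is not part of the core language; it is defined in terms of it.
        macros.put("double", Arrays.asList("2", "*"));
        List<String> program = Arrays.asList("6", "double", "1", "+");
        System.out.println(eval(expand(program)));        // prints 13
    }
}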
Take a look at Forth and Lisp - both can be built upon some extremely trivial core which is then extended to a fully capable language. You don't even need any other high level language to implement such a chain: a simple Forth can be bootstrapped in about a couple of hundred lines of x86 assembly.
If you're determined enough, you can even skip the assembler and write in machine code straight away; for something of this scale it's quite manageable in a reasonable time and might give you some indispensable experience.
Well, inventing a language is inventing a language. To implement it, you usually use an existing language; then at some point, assuming your new language is one in which a compiler can be written, you write a compiler in your new language, use the binary produced by the existing language to compile that same-language compiler, and then do it once more with the binary from the same-language compiler. If that all works, you are self-hosting: a compiler that can compile its own language's compiler.
If you have never made a language or a compiler, then you are a long, long way from that. You might try one of the many examples online of a simple C-like compiler that can only do some simple things (and can never self-compile), and get your feet wet with something like that.
At the end of the day, a programming language, to be useful, has to compile down to something, ideally machine code, be it for a real machine or a virtual one as with Python, Java, or old Pascal. But sometimes one language compiles down to another known language, C++ for example, and then you use the existing tools for that language to take it down to something that can execute.
This has been asked and answered a number of times now. If you go far enough back, or want to get as pure as you can, you start with machine code and a way to enter it (see the many computers built this way: the DEC PDP series, the Altair, etc., the entry method being manual address, data, and clock switches). The "compiler", or in the case of assembly/machine code the "assembler", is the human with paper and pencil, or pen if you are that good. You manually write out your assembly language, you then manually convert that to machine code, then you manually flip switches to enter the program into RAM, then you manually push the run button.
The first assemblers and, later, compilers were written this way: you make an assembler using machine code produced by a human assembler, then self-host that. Then you use either the human assembler or the software assembler to write your first compiler for your first ever non-assembly language, then you rewrite the compiler in the new language, then you self-host that. Repeat until it is the present day and there are more compilers and languages than you could ever master, and a myriad of choices of editors and languages to build a compiler for a new language upon.
I wonder if there exists some kind of universal and easy-to-code opcode (or assembly) language that provides the basic set of instructions available in most of today's CPUs (not some fancy CISC, register-only machine, just a common one), with the possibility to "compile", micro-optimize, and "interpret" it on any of the mentioned CPUs.
I'm thinking of something like the MARS MIPS simulator (rather simple and easy-to-read code), but with the possibility of making real programs. No libraries necessary (though it would be a nice thing if possible); just enough to make things (libraries or UNIX-like tools) faster in a uniform way.
Sorry if that's a silly question, I'm new to assembler. I just don't find NASM or UNIX assembly language either extremely cross-platform or easy to read and code.
The JVM bytecode is sort of like assembly language, but without pointer arithmetic. However, it's quite object-oriented. On the positive side, it's totally cross-platform.
You might want to look at LLVM bytecode - but bear in mind this warning: http://llvm.org/docs/FAQ.html#can-i-compile-c-or-c-code-to-platform-independent-llvm-bitcode
First thing: writing in Assembly does not guarantee a speed increase. Using the correct algorithm for the job at hand has the greatest impact on speed. By the time you need to go down to Assembly to squeeze the last few drops out you can only really do that by adapting the algorithm to the specific architecture of the hardware in question. A generic HLA (High Level Assembler) pretty much defeats the purpose of writing your code in Assembly. Note that I am not knocking Randall Hyde’s HLA, which is a great product, I’m just saying that you don’t gain anything from writing Assembly the way a compiler generates machine code. Most C and C++ compilers have very good optimizers, and can produce machine code superior to almost any naïve implementation in ASM.
See if you can find these books (second hand, they are out of print) by Michael Abrash: "Zen of Assembly Language" and "Zen of Code Optimization". Or see if you can find his articles on DDJ. They will give you an insight into optimization second to none.
Related stuff, which I hope might be useful:
There is flat assembler, which takes the approach of a kind of portable assembler.
Menuet OS is an interesting project: an operating system with a graphical user interface written in assembler, with a great assembly API.
LLVM IR provides quite portable assembly, backed by a powerful compiler that underlies many projects, including Clang.
This is a question more targeted towards language features and not coding.
Could you tell me which would be the better language (OCaml or Scheme?) to use for basic game development?
My knowledge of both Scheme and OCaml is pretty basic, and I find both equally challenging to work with, so I was unable to determine which would be the better one with respect to scalability and ease of use.
If any of you have extensive development experience with either of the two languages, please give me your input.
Any inputs appreciated.
Thank you.
Both OCaml and Racket (PLT Scheme) have OpenGL bindings. It looks like Racket doesn't have SDL bindings however, which may or may not be important to you.
Racket uses a JIT compiler, OCaml can be compiled to native code or byte code (and there are a couple of JIT compilers for OCaml).
OCaml is faster than Racket for most of the benchmarks on Languages Benchmark Game.*
Personally I would choose OCaml. It can be compiled to native code, executes faster and has bindings to SDL (which provides input, sound and buffered 2D graphics, among other things).
Another option to consider is F# which is another ML dialect. F# can take advantage of the XNA framework. XNA will limit you to Windows however (from what I understand F# can only be used in dlls on the XBox; there are Mono implementations of XNA but I'm not sure how complete they are).
The benchmark game can only give you a rough idea of the relative efficiency of a language's implementation. A game is much more complex than the tests used by the benchmark game.
I am looking into learning a programming language (taking a course) for use in image analysis and processing, and possibly bioinformatics too. Which language should I go for, C or Java? Other languages are not an option for me. Also, please explain why either of the languages is a better option for my application.
You have to balance raw processing power and developer time. Java is getting pretty fast too and if you are finished a couple of days early, you have more time to process the data.
It all depends on volume.
More importantly, I suggest you look for the libraries and frameworks that already exist, see which fits closest to what needs to be done, and choose whatever language that library was written in, be it C, Java, or Fortran.
For Java I found BioJava.org as a starting point.
Java isn't TOO bad for image processing. If you manage your source objects appropriately, you'll have a chance at getting reasonable performance out of it. Some of the things I like with Java that relate to imaging (see the small sketch after this list):
Java Advanced Imaging
2D Graphics utilities (take a look at BufferedImages)
ImageJ, etc
Get it to work with JAMA
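As a small illustration of the BufferedImage route, here is a hedged sketch of a simple per-pixel pass (the file names and the threshold value are arbitrary placeholders):

import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

// Simple per-pixel pass over a BufferedImage: convert to greyscale and threshold.
// "input.png"/"output.png" and the threshold of 128 are arbitrary placeholders.
public class ThresholdDemo {
    public static void main(String[] args) throws Exception {
        BufferedImage img = ImageIO.read(new File("input.png"));
        BufferedImage out = new BufferedImage(img.getWidth(), img.getHeight(),
                                              BufferedImage.TYPE_BYTE_GRAY);
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
                int grey = (r + g + b) / 3;
                int v = grey >= 128 ? 255 : 0;               // simple threshold
                out.setRGB(x, y, (v << 16) | (v << 8) | v);  // pack back into RGB
            }
        }
        ImageIO.write(out, "png", new File("output.png"));
    }
}

For tighter loops you would normally pull the pixel data out once via getRaster() instead of calling getRGB per pixel; that is the kind of "managing your source objects" that keeps the performance acceptable.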
Ask someone in the field you're working in (ie, bioinformatics)
For solar images, the majority of the work is done in IDL, Fortran, Matlab, Python, C or Perl (PDL). (Roughly in that order ... IDL is definitely first, as the majority of the instrument calibration software is written in IDL)
Because of this, there's a lot of toolkits already written in those languages for our field. Frequently, with large reference data sets, the PI releases some software package as an example of how to interpret / interact with the data format. I can only assume that Bioinformatics would be similar.
If you end up going a different route than the rest of the field, you're going to have a much harder time working with other scientists as you can't share code as easily.
Note -- There are a number of the visualization tools that have been released in our field that were written in Java, but they assume that the images have already been prepped by some other process.
The most popular computer vision (image processing, image analysis) library is OpenCV, which is written in C++ but can also be used with Python and Java (the official OpenCV4Android and the non-official JavaCV).
There are bioinformatics applications that are basically image processing, so OpenCV will take care of those. But there are also some that are not; they are, for example, based on machine learning, so if you need something beyond image/video processing you will need another bioinformatics-oriented library. OpenCV also has a machine learning module, but it is more focused on computer vision.
About the languages, C vs Java: most has been said in the other answers. I should add that these libraries are now C++ based, not plain C. If your applications have real-time processing needs, C++ will probably be better for that; if not, Java will be more than enough, as it is more friendly.
Ideally, you would use something like Java or (even better) Python for "high-level" stuff, and compile in C the routines that require a lot of processing power (for instance using Cython, etc).
Some scientific libraries exist for Python (SciPy and NumPy), and they are a good start, although it isn't yet straightforward to combine Python and C (you need to tweak things a bit).
Just my two pence worth: Java doesn't allow the use of pointers, as opposed to C/C++ or C#. So if you are going to manipulate pixels directly, i.e. write your own image processing functions, they will be much slower than the equivalent in C++. On the other hand, C++ is a total nightmare of a language compared to Java; it will take you at least twice as long to write the equivalent bit of code in C++. So with all the productivity gain you can probably afford to buy a computer that makes up for the difference in runtime ;-)
I know other languages aren't an option for you, but personally I can highly recommend C# for image processing or computer vision: it allows pointers, and hence image-processing functions in C# are only about half as slow as in C++ (an acceptable trade-off, I think), and it has excellent integration with native C++ and a good wrapper library for OpenCV.
Disclaimer: I work for TunaCode.
If you have to make a choice between different languages to get started in image processing, I would recommend starting with C++. You get raw pointer access, which is a must if you want to operate on individual pixels.
Next, what kind of imaging are you interested in? Just-for-fun image filters, or some heavy stuff like motion estimation, tracking, detection, etc.? For that I would recommend you take a look at CUVILib, since sooner rather than later you will need performance in imaging functionality, and that's what CUVI provides. You can use it standalone if it serves your purposes, or you can plug it into other libraries like Intel IPP, ITK, OpenCV, etc.
Fortran's performance on the Computer Language Benchmarks Game is surprisingly bad. Today's results put Fortran 14th and 11th on the two quad-core tests, and 7th and 10th on the single-core tests.
Now, I know benchmarks are never perfect, but still, Fortran was (is?) often considered THE language for high-performance computing, and it seems like the type of problems used in this benchmark should be to Fortran's advantage. In a recent article on computational physics, Landau (2008) wrote:
However, [Java] is not as efficient or as well supported for HPC and parallel processing as are FORTRAN and C, the latter two having highly developed compilers and many more scientific subroutine libraries available. FORTRAN, in turn, is still the dominant language for HPC, with FORTRAN 90/95 being a surprisingly nice, modern, and effective language; but alas, it is hardly taught by any CS departments, and compilers can be expensive.
Is it only because of the compiler used by the language shootout (Intel's free compiler for Linux)?
No, this isn't just because of the compiler.
What benchmarks like this measure -- where the program differs from benchmark to benchmark -- is largely the amount of effort (and quality of effort) that the programmer put into writing any given program. I suspect that Fortran is at a significant disadvantage in that particular metric -- unlike C and C++, the pool of programmers who'd want to try their hand at making the benchmark program better is pretty small, and unlike most anything else, they likely don't feel like they have something to prove either. So there's no motivation for someone to spend a few days poring over generated assembly code and profiling the program to make it go faster.
This is fairly clear from the results that were obtained. In general, with sufficient programming effort and a decent compiler, neither C, C++, nor Fortran will be significantly slower than assembly code -- certainly not more than 5-10% at worst, except for pathological cases. The fact that the actual results obtained here vary more than that indicates to me that "sufficient programming effort" has not been expended.
There are exceptions when you allow the assembly to use vector instructions, but don't allow the C/C++/Fortran to use corresponding compiler intrinsics -- automatic vectorization is not even a close approximation of perfect and probably never will be. I don't know how much those are likely to apply here.
Similarly, an exception is in things like string handling, where you depend heavily on the runtime library (which may be of varying quality; Fortran is rarely a case where a fast string library will make money for the compiler vendor!), and on the basic definition of a "string" and how that's represented in memory.
Some random thoughts:
Fortran used to do very well because it was easier to identify loop invariants, which made some optimizations easier for the compiler. Since then:
Compilers have gotten much more sophisticated. Enormous effort has been put into C and C++ compilers in particular. Have the Fortran compilers kept up? I suppose gfortran uses the same back end as gcc and g++, but what about the Intel compiler? It used to be good, but is it still?
Some languages have gained a lot of specialized keywords and syntax to help the compiler (restrict and const int * const p in C, and inline in C++). Not knowing Fortran 90 or 95, I can't say whether these have kept pace.
I've looked at these tests. It's not like the compiler is wrong or something. In most tests Fortran is comparable to C++, except for some where it gets beaten by a factor of 10. These tests just reflect what one should know from the beginning: Fortran is simply NOT an all-around interoperable programming language. It is suited for efficient computation and has good list operations and such, but, for example, its I/O is poor unless you do it with Fortran-specific methods, e.g. 'unformatted' I/O.
Let me give you an example: the 'reverse-complement' program, which is supposed to read a large (on the order of 10^8 B) file from stdin line by line, do something with it, and print the resulting large file to stdout. The pretty straightforward Fortran program is about 10 times slower on a single core (~10 s) than a HEAVILY optimized C++ one (~1 s). When you try to play with the program, you'll see that just the simple formatted read and write take more than 8 seconds. In the Fortran way, if you care about efficiency, you'd just write an unformatted structure to a file and read it back in no time (which is totally non-portable and all, but who cares anyway; efficient code is supposed to be fast and optimized for a specific machine, not able to run everywhere).
So the short answer is: don't worry, just do your job, and if you want to write a super-efficient operating system, then sorry, Fortran is just not the way to get that kind of performance.
This benchmark is just silly.
For example, they measure CPU time for the whole program to run. As mcmint stated (and it might actually be true), Fortran I/O sucks.* But who cares? In real-world tasks you read input for a few seconds, then do calculations for hours/days/months, and finally write output for a few seconds. That's why in most benchmarks I/O operations are excluded from the time measurements (unless, of course, you are benchmarking I/O itself).
Norbert Wiener, in his book God & Golem, Inc., wrote:
Render unto man the things which are man’s and unto the computer the things which are the computer’s.
In my opinion, applying this principle when implementing an algorithm in any programming language means:
Write code that is as readable and simple as you can, and let the compiler do the optimizations.
This is especially important in real-world (huge) applications. Dirty tricks (so heavily used in many benchmarks), even if they might improve efficiency to some extent (5%, maybe 10%), are not for real-world projects.
* C/C++ uses stream I/O, but Fortran traditionally uses record-based I/O (there is further reading on this elsewhere). Anyway, that is why the I/O in those benchmarks is so surprising. The use of stdin/stdout redirection might also be a source of the problem. Why not simply use the file reading/writing facilities provided by the language or the standard library? Once again, that would be a more real-world situation.
I would like to say that even if the benchmark does not bring up the best results for FORTRAN, this language will still be used, and for a long time. The reasons for using it are not just performance but also something you might call ease of programmability. Lots of people who learned to use it in the '60s and '70s are now too old to get into new stuff, and they know how to use FORTRAN pretty well. I mean, there are a lot of human factors in whether a language gets used. The programmer also matters.
Considering they did not publish the exact compiler options they used for the Intel Fortran Compiler, I have little faith in their benchmark.
I would also remark that both Intel's math library, MKL, and AMD's math library, ACML, use the Intel Fortran Compiler.
Edit:
I did find the compilation options when you click on the benchmark's name. The result is surprising since the optimization level seems reasonable. It may come down to the efficiency of the algorithm.