Related
One of the promises of side-effect free, referentially transparent functional programming is that such code can be extensively optimized.
To quote Wikipedia:
Immutability of data can, in many cases, lead to execution efficiency, by allowing the compiler to make assumptions that are unsafe in an imperative language, thus increasing opportunities for inline expansion.
I'd like to see examples where a functional language compiler outperforms an imperative one by producing a better optimized code.
Edit: I tried to give a specific scenario, but apparently it wasn't a good idea. So I'll try to explain it in a different way.
Programmers translate ideas (algorithms) into languages that machines can understand. At the same time, one of the most important aspects of the translation is that also humans can understand the resulting code. Unfortunately, in many cases there is a trade-off: A concise, readable code suffers from slow performance and needs to be manually optimized. This is error-prone, time consuming, and it makes the code less readable (up to totally unreadable).
The foundations of functional languages, such as immutability and referential transparency, allow compilers to perform extensive optimizations, which could replace manual optimization of code and free programmers from this trade-off. I'm looking for examples of ideas (algorithms) and their implementations, such that:
the (functional) implementation is close to the original idea and is easy to understand,
it is extensively optimized by the compiler of the language, and
it is hard (or impossible) to write similarly efficient code in an imperative language without manual optimizations that reduce its conciseness and readability.
I apologize if it is a bit vague, but I hope the idea is clear. I don't want to give unnecessary restrictions on the answers. I'm open to suggestions if someone knows how to express it better.
My interest isn't just theoretical. I'd like to use such examples (among other things) to motivate students to get interested in functional programming.
At first, I wasn't satisfied by a few examples suggested in the comments. On second thoughts I take my objections back, those are good examples. Please feel free to expand them to full answers so that people can comment and vote for them.
(One class of such examples will be most likely parallelized code, which can take advantage of multiple CPU cores. Often in functional languages this can be done easily without sacrificing code simplicity (like in Haskell by adding par or pseq in appropriate places). I' be interested in such examples too, but also in other, non-parallel ones.)
There are cases where the same algorithm will optimize better in a pure context. Specifically, stream fusion allows an algorithm that consists of a sequence of loops that may be of widely varying form: maps, filters, folds, unfolds, to be composed into a single loop.
The equivalent optimization in a conventional imperative setting, with mutable data in loops, would have to achieve a full effect analysis, which no one does.
So at least for the class of algorithms that are implemented as pipelines of ana- and catamorphisms on sequences, you can guarantee optimization results that are not possible in an imperative setting.
A very recent paper Haskell beats C using generalised stream fusion by Geoff Mainland, Simon Peyton Jones, Simon Marlow, Roman Leshchinskiy (submitted to ICFP 2013) describes such an example. Abstract (with the interesting part in bold):
Stream fusion [6] is a powerful technique for automatically transforming
high-level sequence-processing functions into efficient implementations.
It has been used to great effect in Haskell libraries
for manipulating byte arrays, Unicode text, and unboxed vectors.
However, some operations, like vector append, still do not perform
well within the standard stream fusion framework. Others,
like SIMD computation using the SSE and AVX instructions available
on modern x86 chips, do not seem to fit in the framework at
all.
In this paper we introduce generalized stream fusion, which
solves these issues. The key insight is to bundle together multiple
stream representations, each tuned for a particular class of stream
consumer. We also describe a stream representation suited for efficient
computation with SSE instructions. Our ideas are implemented
in modified versions of the GHC compiler and vector library.
Benchmarks show that high-level Haskell code written using
our compiler and libraries can produce code that is faster than both
compiler- and hand-vectorized C.
This is just a note, not an answer: the gcc has a pure attribute suggesting it can take account of purity; the obvious reasons are remarked on in the manual here.
I would think that 'static single assignment' imposes a form of purity -- see the links at http://lambda-the-ultimate.org/node/2860 or the wikipedia article.
make and various build systems perform better for large projects by assuming that various build steps are referentially transparent; as such, they only need to rerun steps that have had their inputs change.
For small to medium sized changes, this can be a lot faster than building from scratch.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 5 years ago.
Improve this question
At university I programmed a FPGA in a C-like language. However, I also know that one usually programs FPGAs in Verilog or VHDL. Is this a designer choice? If so, what are the performance drawbacks?
I would ideally like to program the FPGA in a C-like language, rather than VHDL.
I was thinking of getting an Xilinx Virtex-5 if it makes any difference?
FPGA's are not processors. C is a language designed for processors.
Yes, there are C to FPGA compilers.
Are they a good idea? I'd say No. The design you're going to end up with is (from what I've seen) normally a state machine that has one state per line of code in the C. The state machine then moves through the states performing the algorithm. Either that or some other kind of Turing machine is put in place to execute the code.
This is not how somebody skilled in FPGA design would generally approach a problem. It's a slow, and potentially gate hungry way doing things.
In the same way that English is a better language to write a novel than Fortran, VHDL and Verilog are better languages to describe logic circuits than C.
If you're serious about using FPGAs, use a language that is designed to describe logic circuits. It might be a steep learning curve, but the results will be much better IMHO.
The short answer is "yes, certainly".
Here's an excellent survey of C compilers for FPGAs and FPGA-based systems.
C-to-hardware compiler (HLL synthesis)
Performance drawbacks and considerations are found in the system architecture and communication bandwidths rather than in using C vs. a hardware design language (HDL). The considerations in using C vs. an HDL lies in programming time and software maintenance issues, not so much in performance.
You can install a soft processor core inside the FPGA logic, and run your C code inside the virtual processor. Xilinx has Microblaze (licensed) and Picoblaze (free) cores. There are other soft cores you can implement as well (MIPS, x86, 8051, etc).
However, this is largely considered a "hack", as the cores are very slow compared to real cores. And I think that any C-to-FPGA conversion is ultimately going to start smelling like running a soft core, and not give you the efficiency you deserve for running on a FPGA. FPGAs are not Turing machines, they are a sack of logic gates. You can build a Turing machine out of the gates, but that is not why you bought the sack of gates.
Its sort of like buying a bag of Legos, and building a hammer and set of nails out of the bricks. It might work, but you are better off buying a hammer to pound nails, and better off building Castles, Space Ships and Fire Stations with the Legos.
I'd like to add something that I believe is the closest answer to the OP's question. If you're looking for a C-like language (which is not the same as C), you should definitely check out Synflow. The idea is to have a modern language that allows you to design faster without the learning curve of VHDL/Verilog and with no overhead. Also it's free and open source!
Disclosure: I'm co-founder of Synflow :-)
You should have a look at SystemC. The advantages of using a C based language is plentiful. Especially, on a system design perspective you can utilize that your other software (firmware and other low level stuff) is written in C. Hence, your software team can on a really early stage test against the rtl code.
In 2011, Xilinx bought the company AutoESL that had developed high level synthesis with SystemC. Xilinx has reused the name when the released its product "AutoESL". Especially with their new circuit Zynq, there a dual core ARM Cortex A9 embedded together with the FPGA logic, this will probably become a powerful tool for system development.
There are indeed some compilers that allow you to infer (solve using an incomplete description) hardware circuits using a high-level language like C. "C-to-gates" is in fact a popular buzzword. The image companies advertise is that programmers are able to write hardware if the language they use is one they have used to describe software. This is incredibly wrong for a number of reasons, chief among them being the fundamental differences between the execution model assumed by languages like C and hardware description languages.
An illustrative example: C assumes at its heart a large randomly accessible linear-addressed memory - an assumption that rarely holds for hardware. A C-to-gates compiler faces a challenging task of interperting the behavior of the program described, and designing a hardware circuit with the same behavior.
While C-like languages are a great productivity tool in limited use cases, these compilers certainly don't allow you to suddenly know how to design hardware if you are familiar with C.
Hope this helps,
I guess you used Handel C. Its a subset of C. From what I know the result is not very optimized. Verilog and VHDL allows for more optimization. I am saying this based on the my experience with Handel C a few years back
You might want to take a look at C-to-hardware technology, where you can write C code and it will get compiled/translated to VHDL or Verilog. This post lists a few compilers. Haven't used it myself so I don't have any experience with it. Hope this helps!
Many designers write VHDL/Verilog instead of a high-level language, for the same reasons that many programmers used to (and still do in some cases) write assembly instead of Java: you can tune resource usage and performance at a low level. Both VHDL and Verilog are languages designed for designing hardware. C is not. Given enough time, you could always write a program in VHDL/Verilog that will outperform a high-level language program. What an HLL gives you is 1) faster development, 2) ease of maintenance, and 3) possibly greater portability.
There have been many efforts to compile existing high-level programming languages (C is one) to FPGA targets. Most of them do, in fact, generate optimized code. Impulse C, for example, is a subset of C with some add-on libraries that support process-level parallelism, plus a compiler that optimizes the C input for instruction-level parallelism, too. It pipelines loops, maps certain operations to high-performance hardware primitives it knows the underlying FPGA family provides, etc. (Full disclosure: I helped build the Impulse C toolchain.)
The C-to-hardware environments list Carlito and David Pointer link to is pretty exhaustive. Xilinx Virtex-5 is supported by many of them, and if you're using any recent FPGA family from a major vendor, choice of hardware shouldn't be a problem. Some of the HLL environments support built-in (or softcore) embedded CPUs better than others.
Let's say you have to implement a tool to efficiently solve an NP-hard problem, with unavoidable possible explosion of memory usage (the output size in some cases exponential to the input size) and you are particularly concerned about the performances of this tool at running time. The source code has also to be readable and understandable once the underlying theory is known, and this requirement is as important as the efficiency of the tool itself.
I personally think that 3 languages could be suitable for these three requirements: c++, scala, java.
They all provide the right abstraction on data types that makes it possible to compare different structures or apply the same algorithms (which is also important) to different data types.
C++ has the advantage of being statically compiled and optimized, and with function inlining (if the data structures and algorithms are designed carefully) and other optimisation techniques it's possible to achieve a performance close to that of pure C while maintaining a fairly good readability.
If you also put a lot of care in data representation you can optimise the cache performance, which can gain orders of magnitude in speed when the cache miss rate is low.
Java is instead JIT compiled, which allows to apply optimisations during runtime, and in this category of algorithms that could have different behaviours between different runs, that may be a plus. I fear instead that such an approach could suffer from garbage collector, however in the case of this algorithm it's common to continuously allocate memory and java heap performance is notoriously better than C/C++ and if you implement your own memory manager inside the language you could even achieve good efficiency.
This approach instead is not able to inline method invocation (which induces a huge performance penalty) and doesn't give you control over the cache performance. Among the pros there's a better and cleaner syntax than C++.
My concerns about scala are more or less the same as Java, plus the fact that I can't control how the language is optimised unless I have a deep knowledge on the compiler and the standard library. But well: I get a very clean syntax :)
What's your take on the subject? Have you had to deal with this already? Would you implement an algorithm with such properties and requirements in any of these languages or would you suggest something else? How would you compare them?
Usually I’d say “C++” in a heartbeat. The secret being that C++ simply produces less (memory) garbage that needs managing.
On the other hand, your observation that
however in the case of this algorithm it's common to continuously allocate memory
is a hint that Java / Scala may actually be more suited. But then you could use a small object heap in C++ as well. Boost has one that uses the standard allocator interface, if memory serves.
Another advantage of C++ is obviously the use of abstraction without penalty through templates – i.e. that you can easily create generic algorithmic components that can interact without incurring a runtime overhead due to abstraction. In fact, you noted that
it's possible to achieve a performance close to that of pure C while maintaining a fairly good readability
– this is looking at things the wrong way: Templates allow C++ to achieve performance superior to that of C while still maintaining high abstraction.
D might be worth a look, seeing as how it tries to be a better C++.
From a superficial glance, it has better source code readability than C++ does, so that's one of your points covered.
It also has memory management, which makes playing with algorithms a bit easier.
And templates
Here is a stackoverflow discussion comparing the performance of C++ and D
The languages you noticed were my first guesses as well.
Each language has a different take on how to handle specific issues like compilation, memory management and source code, but in theory, any of them should be fitting to your problem.
It is impossible to tell which is best, and there is likely no major difference if you are familiar enough with all of them to work around their respective quirks.
And obviously, if you actually find the need to optimize (I'm not sure if that's a given), that's possible in each language. Lower level languages obviously offer more options, but are also (far) more complex to actually improve.
A single note about C++ vs Java: This is really a holy war, and if you've followed the recent development you'll probably have your own opinion. I, for one, think Java offers enough good aspects to make up for its flaws, usually.
And a final note on C++ vs C: According to my knowledge, the difference usually amounts to a sufficiently low percentage to ignore this. It it doesn't make a difference for the source code, it's fine to go with C, if C++ could make for easier-to-read source code, go with C++. In any case, the choice is kind of negligible.
In the end, remember that money spent on a few hours of programming/optimizing this could as well go into slightly superior hardware to make up for missed tiny details.
It all boils down to: Any of your options is fine as long as you do it right (domain knowledge).
I would use a language which makes it very easy to work on the algorithm. Get the algorithm right and it could very easily outweigh any advantage from fine-tuning the wrong algorithm. Don't be scared to play around in a language normally thought of as slow in execution speed if that language makes it easier to express algorithmic ideas. It is usually much easier to transcribe the right algorithm into another language than it is to eek-out the last dregs of speed from the wrong algorithm in the fastest executing language.
So do it in a language you are comfortable with and which is expressive. You might surprise yourself and find that what is produced is fast enough!
I've read in Wikipedia that neural-network functions defined on a field of arbitrary real/rational numbers (along with algorithmic schemas, and the speculative `transrecursive' models) have more computational power than the computers we use today. Of course it was a page of russian wikipedia (ru.wikipedia.org) and that may be not properly proven, but that's not the only source of such.. rumors
Now, the thing that I really do not understand is: How can a string-rewriting machine (NNs are exactly string-rewriting machines just as Turing machines are; only programming language is different) be more powerful than a universally capable U-machine?
Yes, the descriptive instrument is really different, but the fact is that any function of such class can be (easily or not) turned to be a legal Turing-machine. Am I wrong? Do I miss something important?
What is the cause of people saying that? I do know that the fenomenum of undecidability is widely accepted today (though not consistently proven according to what I've read), but I do not really see a smallest chance of NNs being able to solve that particular problem.
Add-in: Not consistently proven according to what I've read - I meant that you might want to take a look at A. Zenkin's (russian mathematician) papers after mid-90-s where he persuasively postulates the wrongness of G. Cantor's concepts, including transfinite sets, uncountable sets, diagonalization method (method used in the proof of undecidability by Turing) and maybe others. Even Goedel's incompletness theorems were proven in right way in only 21-st century.. That's all just to plug Zenkin's work to the post cause I don't know how widespread that knowledge is in CS community so forgive me if that did look stupid.
Thank you!
From what little research I've done, most of these claims of trans-Turing systems, or of the incorrectness of Cantor's diagonalization proof, etc. are, shall we say, "controversial" in legitimate mathematical circles. Words like "crank" get thrown around frequently.
Obviously, the strong Church-Turing thesis remains unproven, but as you pointed out there's really no good reason to believe that artificial neural networks constitute computational capabilities beyond general recursion/UTMs/lambda calculus/etc.
From a theoretical viewpoint, I think you're absolutely correct -- neural networks provide very little that's new or different.
From a practical viewpoint, neural networks are simply a way of casting solutions into a form where parallel execution is natural and easy, whereas Turing machines are sequential in nature, and executing their sequences in parallel is relatively difficult. In fact, most of what's been done in CPU development over the last few decades has basically been figuring out ways to execute code in parallel while maintaining the illusion that it's executing in sequence. A lot of the hardware in a modern CPU is devoted to maintaining that illusion, and the degree to which parallel execution has become explicit is mostly an admission that maintaining the illusion has become prohibitively expensive.
Anyone who "proves" that Cantor's diagonal method doesn't work proves only their own incompetence. Cf. Wilfred Hodges' An editor recalls some hopeless papers for a surprisingly sympathetic explanation of what kind of thing is going wrong with these attempts.
You can provide speculative descriptions of hyper-Turing neural nets, just as you can provide speculative descriptions of other kinds of hyper-Turing computers: there's nothing incoherent in the idea that hypercomputation is possible, and speculative descriptions of mechanical hypercomputers have been made where the hypercomputer is stipulated to have infinitely fine engravings that encode an oracle for the Halting machine: the existence of such a machine is consistent with Newtonian mechanics, though not quantum mechanics. Rather, the Church-Turing thesis says that they can't be constructed, and there are two reasons to believe the Church-Turing thesis is correct:
No such machines have ever been constructed; and
There's work been done connecting models of physics to models of computation, going back to work in the early 1970s by Robin Gandy, with recent work by people such as David Deutsch (e.g., Machines, Logic and Quantum Physics and John Tucker (e.g., Computations via experiments with kinematic systems) which argues that physics doesn't support hypercomputation.
The main point is that the truth of the Church-Turing thesis is an empirical fact, and not a mathematical fact. It's one that we can have confidence is true, but not certainty.
From a layman's perspective, I see that
NNs can be more effective at solving some types problems than a turing machine, but they are not compuationally more powerful.
Even if NNs were provably more powerful than TMs, execution on current hardware would render them less powerful, since current hardware is only an apporximation of a TM and can only execute problems computable by a bounded TM.
You may be interested in S. Franklin and M. Garzon, Neural computability. There is a preview on Google. It discusses the computational power of neural nets and also states that it is rumored that neural nets are strictly more powerful than Turing machines.
I am brainstorming an idea of developing a high level software to manipulate matrix algebra equations, tensor manipulations to be exact, to produce optimized C++ code using several criteria such as sizes of dimensions, available memory on the system, etc.
Something which is similar in spirit to tensor contraction engine, TCE, but specifically oriented towards producing optimized rather than general code.
The end result desired is software which is expert in producing parallel program in my domain.
Does this sort of development fall on the category of expert systems?
What other projects out there work in the same area of producing code given the constraints?
What you are describing is more like a Domain-Specific Language.
http://en.wikipedia.org/wiki/Domain-specific_language
It wouldn't be called an expert system, at least not in the traditional sense of this concept.
Expert systems are rule-based inference engines, whereby the expertise in question is clearly encapsulated in the rules. The system you suggest, while possibly encapsulating insight about the nature of the problem domain inside a linear algebra model of sorts, would act more as a black box than an expert system. One of the characteristics of expert systems is that they can produce an "explanation" of their reasoning, and such a feature is possible in part because the knowledge representation, while formalized, remains close to simple statements in a natural language; matrices and operations on them, while possibly being derived upon similar observation of reality, are a lot less transparent...
It is unclear from the description in the question if the system you propose would optimize existing code (possibly in a limited domain), or if it would produced optimized code, in that case driven bay some external goal/function...
Well production systems (rule systems) are one of four general approaches to computation (Turing machines, Church recursive functions, Post production systems and Markov algorithms [and several more have been added to that list]) which more or less have these respective realizations: imperative programming, functional programming, rule based programming - as far as I know Markov algorithms don't have an independent implementation. These are all Turing equivalent.
So rule based programming can be used to write anything at all. Also early mathematical/symbolic manipulation programs did generally use rule based programming until the problem was sufficiently well understood (whereupon the approach was changed to imperative or constraint programming - see MACSYMA - hmmm MACSYMA was written in Lisp so perhaps I have a different program in mind or perhaps they originally implemented a rule system in Lisp for this).
You could easily write a rule system to perform the matrix manipulations. You could keep a trace depending on logical support to record the actual rules fired that contributed to a solution (some rules that fire might not contribute directly to a solution afterall). Then for every rule you have a mapping to a set of C++ instructions (these don't have to be "complete" - they sort of act more like a semi-executable requirement) which are output as an intermediate language. Then that is read by a parser to link it to the required input data and any kind of fix up needed. You might find it easier to generate functional code - for one thing after the fix up you could more easily optimize the output code in functional source.
Having said that, other contributors have outlined a domain specific language approach and that is what the TED people did too (my suggestion is that too just using rules).