Writing a Ruby extension in Go (golang) - ruby

Are there some tutorials or practical lessons on how to write an extension for Ruby in Go?

Go 1.5 added support for building shared libraries that are callable from C (and thus from Ruby via FFI). This makes the process easier than in pre-1.5 releases (when it was necessary to write the C glue layer), and the Go runtime is now usable, making this actually useful in real life (goroutines and memory allocations were not possible before, as they require the Go runtime, which was not useable if Go was not the main entry point).
goFuncs.go:
package main
import "C"
//export GoAdd
func GoAdd(a, b C.int) C.int {
return a + b
}
func main() {} // Required but ignored
Note that the //export GoAdd comment is required for each exported function; the symbol after export is how the function will be exported.
goFromRuby.rb:
require 'ffi'
module GoFuncs
extend FFI::Library
ffi_lib './goFuncs.so'
attach_function :GoAdd, [:int, :int], :int
end
puts GoFuncs.GoAdd(41, 1)
The library is built with:
go build -buildmode=c-shared -o goFuncs.so goFuncs.go
Running the Ruby script produces:
42

Normally I'd try to give you a straight answer but the comments so far show there might not be one. So, hopefully this answer with a generic solution and some other possibilities will be acceptable.
One generic solution: compile high level language program into library callable from C. Wrap that for Ruby. One has to be extremely careful about integration at this point. This trick was a nice kludge to integrate many languages in the past, usually for legacy reasons. Thing is, I'm not a Go developer and I don't know that you can compile Go into something callable from C. Moving on.
Create two standalone programs: Ruby and Go program. In the programs, use a very efficient way of passing data back and forth. The extension will simply establish a connection to the Go program, send the data, wait for the result, and pass the result back into Ruby. The communication channel might be OS IPC, sockets, etc. Whatever each supports. The data format can be extremely simple if there's no security issues and you're using predefined message formats. That further boosts speed. Some of my older programs used XDR for binary format. These days, people seem to use things like JSON, Protocol Buffers and ZeroMQ style wire protocols.
Variation of second suggestion: use ZeroMQ! Or something similar. ZeroMQ is fast, robust and has bindings for both languages. It manages the whole above paragraph for you. Drawbacks are that it's less flexible wrt performance tuning and has extra stuff you don't need.
The tricky part of using two processes and passing data between them is a speed penalty. The overhead might not justify leaving Ruby. However, Go has great native performance and concurrency features that might justify coding part of an application in it versus a scripting language like Ruby. (Probably one of your justifications for your question.) So, try each of these strategies. If you get a working program that's also faster, use it. Otherwise, stick with Ruby.
Maybe less appealing option: use something other than Go that has similar advantages, allows call from C, and can be integrated. Althought it's not very popular, Ada is a possibility. It's long been strong in native code, (restricted) concurrency, reliability, low-level support, cross-language development and IDE (GNAT). Also, Julia is a new language for high performance technical and parallel programming that can be compiled into a library callable from C. It has a JIT too. Maybe changing problem statement from Ruby+Go to Ruby+(more suitable language) will solve the problem?

As of Go 1.5, there's a new build mode that tells the Go compiler to output a shared library and a C header file:
-buildmode c-shared
(This is explained in more detail in this helpful tutorial: http://blog.ralch.com/tutorial/golang-sharing-libraries/)
With the new build mode, you no longer have to write a C glue layer yourself (as previously suggested in earlier responses). Once you have the shared-library and the header file, you can proceed to use FFI to call the Go-created shared library (example here: https://www.amberbit.com/blog/2014/6/12/calling-c-cpp-from-ruby/)

Related

Is there parallelism in Elm?

It's possible to write parallel code in Elm? Elm is pure functional, so no locking is needed. Of course, I can use Javascript FFI, spawn workers here and do it on my own. But, I want more user friendly "way" of doing this.
Short answer
No, not currently. But the next release (0.15) release will have new ways to handle effects inside Elm so you will need to use ports + JavaScript code less. So there may well be a way to spawn workers inside Elm in the next version.
More background
If you're feeling adventurous, try reading the published paper on Elm (or the longer original thesis), which shows that the original flavour of FRP that Elm uses is well suited for fine-grained concurrency. There is also an async construct which can potentially make part of the program run separately in a more coarse-grained manner. That might be support with OS-level threads (like JS Webworkers) and parallelism.
There have been earlier experiments with Webworkers. There is certainly an interest in concurrency within the community, but JavaScript doesn't offer (m)any great options for concurrency.
For reading tips on the paper, here's post of mine from the elm-discuss mailing list:
If you want to know more about signals and opt-in async, I suggest you try Evan's PLDI paper on Elm. Read from the introduction (1) up to building GUIs (4). You can skip the type system (3.2) and functional evaluation (3.3.1), that may save you some time. Most in and after building GUIs (4) is probably stuff you know already. Figure 8 is probably the best overview of what the async keyword does (note that the async keyword is not implemented in the current Elm compiler).

How to properly use Golang packages in the standard library or third-party with Goroutines?

Hi Golang programmers,
First of all I apologize if my question is not very clear initially but I'm trying to understand the proper usage pattern when writing Golang code that uses Goroutines when using the standard lib or other libraries.
Let me elaborate: Suppose I import some package that I didn't have a hand in writing that I want to utilize. Let's say this package does a simple http get request somehow to a website such as Flickr for example. If I want a concurrent request, I can just prefix the function call with the go keyword. But how do I know, that this package when doing the request doesn't already do some internal go calls itself therefore making my go calls redundant?
Do Golang packages typically say in the documentation that their method is "greened"? Or perhaps they provide two versions of a method, one that is green and one that is straight synchronous?
In my quest to understand Go idioms and usage patterns I feel like when using even packages in the standard lib that I can't be sure if my go commands are necessary. I suppose I can profile the calls, or write test code but that feels odd to have to figure out if a func is already "green".
I suppose another possibility is that it's up to me to study the source code of whatever I'm using and understand how it should be used and if the go keyword is necessary.
If anybody can shed some light on this or point me to the right documentation or even a Golang screen-cast I'd much appreciate it. I think Rob Pike briefly mentions in one talk that a good client api written go is just written in a typical synchronous manner and it's up to the caller of that api to have the choice of making it green or not.
Thanks for your time,
-Ralph
If a function / method returns some value(s), or have a side effect like that (io.Reader.Read) - then it's necessarily a synchronous thing. Unless documented otherwise, no safety for concurrent use by multiple goroutines should be assumed.
If it accepts a closure (callback) or a channel or if it returns a channel - then it is often an asynchronous thing. If that's the case, it's normally either obvious or explicitly documented. Asynchronous stuff like this is usually safe for concurrent use by multiple goroutines.

How do different apps written in different languages interact?

Off late I'd been hearing that applications written in different languages can call each other's functions/subroutines. Now, till recently I felt that was very natural - since all, yes all - that's what I thought then, silly me! - languages are compiled into machine code and that should be same for all the languages. Only some time back did I realise that even languages compiled in 'higher machine code' - IL, byte code etc. can interact with each other, the applications actually. I tried to find the answer a lot of times, but failed - no answer satisfied me - either they assumed I knew a lot about compilers, or something that I totally didn't agree with, and other stuff...Please explain in an easy to understand way how this works out. Especially how languages compiled into 'pure' machine code have different something called 'calling conventions' is what is making me clutch my hair.
This is actually a very broad topic. Languages compiled to machine code can often call each others' routines, though usually not without effort; e.g., C++ code can call C routines when properly declared:
// declare the C function foo so it can be called by C++ code
extern "C" {
void foo(int, char *);
}
This is about as simple as it gets, because C++ was explicitly designed for compatibility with C (it includes support for calling C++ routines from C as well).
Calling conventions indeed complicate the picture in that C routines compiled by one compiler might not be callable from C compiled by another compiler, unless they share a common calling convention. For example, one compiler might compile
foo(i, j);
to (pseudo-assembly)
PUSH the value of i on the stack
PUSH the value of j on the stack
JUMP into foo
while another might push the values of i and j in reverse order, or place them in registers. If foo was compiled by a compiler following another convention, it might try to fetch its arguments off the stack in the wrong order, leading to unpredictable behavior (consider yourself lucky if it crashes immediately).
Some compilers support various calling conventions for this purpose. The Wikipedia article introduces calling conventions; for more details, consult your compiler's documentation.
Finally, mixing bytecode-compiled or interpreted languages and lower-level ones in the same address space is still more complicated. High-level language implementations commonly come with their own set of conventions to extend them with lower-level (C or C++) code. E.g., Java has JNI and JNA.

A scripting engine for Ruby?

I am creating a Ruby On Rails website, and for one part it needs to be dynamic so that (sorta) trusted users can make parts of the website work differently. For this, I need a scripting language. In a sort of similar project in ASP.Net, I wrote my own scripting language/DSL. I can not use that source code(written at work) though, and I don't want to make another scripting language if I don't have to.
So, what choices do I have? The scripting must be locked down and not be able to crash my server or anything. I'd really like if I could use Ruby as the scripting language, but it's not strictly necessary. Also, this scripting part will be called on almost every request for the website, sometimes more than once. So, speed is a factor.
I looked at the RubyLuaBridge but it is Alpha status and seems dead.
What choices for a scripting language do I have in a Ruby project?
Also, I will have full control over where this project is deployed(root access), so there are no real limits..
There's also Rufus-lua though it's at version 0.1.0...
What about JRuby? You can use java implementation of many scripting language, such as javascript, scheme etc
Well, since it hasn't been suggested yet, there's Locking Ruby In The Safe as described by the Pickaxe book. This allows you to use Ruby as the language without significant slowdown AFAIK.
This technique is intended to allow safe sandboxing of untrusted Ruby code and bug fixes and discussions are directed toward keeping it that way, but infinite loops and some other things still allow malicious users to peg the CPU. (e.g. this discussion maybe.)
What I don't know is how you return data that is inherently safe to use from outside the safe thread. A singleton object (for instance) can mimic whatever class and then do something dangerous when any method is called in the returning thread. I'm still googling around about it. (The Ruby Programming Language says that level 4 "Prevents metaprogramming methods" which would allow you to safely verify the class of a returned object, which I suppose would make results safe to use.)
Barring that, it might not be hard (*snrk*) to implement a Lisp-1 with dynamic scope since you already have a garbage collector.

How do multiple languages interact in one project?

I heard some people program in multiple languages in one project. I can't imagine how the languages interact with each other.
I mean there is no Java method like
myProgram.callCfunction(parameters);
never happens or am I wrong?
Having multiple languages in one project is actually quite common, however the principles behind are not always simple.
In the simple case, different languages are compiled to the same code. For example, C and C++ code typically is compiled into machine assembler or C# and VB.Net is compiled into IL (the language understood by the .NET runtime).
It gets more difficult if the languages/compilers use a differnt type system. There can be many different ways, basic data types such as integer, float and doubles are represented internally, and there is even more ways to represent strings. When passing types around between the different languages it must be sure that both sides interpret the type the same or - if not - the types are correctly mapped. This sort of type mapping is also known as marshalling.
Classic examples of interoperability between different program languages are (mostly from the Windows world):
The various languages available for the .NET platfrom. This includes C#, VB.Net, J#, IronRuby, F#, XSLT and many other less popular languages.
Native COM components written in C++ or VB can be used with a huge variety of languages: VBScript, VB, all .NET languages, Java
Win32 api functions can be called from .NET or VB
IPC (inter process communication)
Corba, probably the most comprehensive (and most complex) approach
Web services and other service-oriented architectures, probably the most modern approach
Generally, any decently sized web project will use about five languages: HTML, CSS, Javascript, some kind of server-side “getting things done” language (ASP, JSP, CGI scripts with Perl, PHP, etc.), and some variant of SQL for database connectivity.
(This is, of course, hand-waving away the argument about whether or not HTML and CSS count as programming languages – I’m the “they are, but just not Turing-complete languages” camp, but that’s a whole other thread.)
Some examples of how all those work together:
If you’re going the best-practices route, the structure of a web page is in HTML, and the instructions for how to display it are in CSS – which could be in the same file, but don’t have to be. The CSS contains a bunch of classes, which the HTML refers to, and it’s up to the browser to figure out how to click them together.
Taking all that a step further, any javascript scripts on that page can alter any of the HTML/CSS that is present (change contents of HTML entities, swap out one CSS class for another, change the behavior of the CSS, and so on.) It does this via something called the Document Object Model, which is essentially a language and platform-independent API to manipulate HTML pages in an object-like manner (at which point I’ll back away slowly and just provide a link to the relevant wiki article.)
But then, where does all the HTML / CSS / Javascript come from? That’s what the server-side language does. In the simplest form, the serer-side language is a program that returns a giant string holding an HTML page as its output. This, obviously, can get much more complex: HTML forms and query string parameters can be used as input for our server side program, and then you have the whole AJAX thing where the javascript gets to send data directly to the server language as well. You can also get fancy where the server language can customize the HTML, CSS, and Javascript that gets spit out – essentially, you have a program in one language writing a program in another language.
The Server-side language to SQL connection works much the same. There are a lot of ways to make it both more complex and safer, but the simplest way is for your server language to dynamically build a string with a SQL command in it, hand that to the database via some kind of connector, and get back a result set. (This is a case where you really do have a function that boils down to someValue = database.executeThisSQLCommand( SQLString ). )
So to wrap this up, different languages in this case either communicate by actually writing programs in each other, or by handing data around in very simple easy to parse formats that everybody can understand. (Strings, mainly.)
Multiple languages in use is called "interoperability" or "interop" for short.
Your example is wrong. Java can call C functions.
The language provides a mechanism for interoperability.
In the case of .NET, languages are compiled into IL as part of the CLI. Thus any .NET language can interop (call methods defined by) modules defined in any other .NET language.
As an example:
I can define a method in C#
static void Hello(){ Console.WriteLine("Hello World");}
And I can call it from Python (IronPython)
Hello()
And get the expected output.
Generally speaking, some languages interop better than others, especially if the language authors specifically made interop a feature of the language.
Multiple languages can interact with:
Piped input/output (ANY language can do this
because input and output must by necessity be implemented in every non-toy
language)
Having code in one language compile to a native library
while the other supports calling native code.
Communicating over a loopback network connection. You can
run into difficulties with firewall interference this way.
Databases. These can be thought of as a "universal" data
storage format, and hence can be accessed by most languages
with database extensions. This generally
requires one program to finish operation before the next program
can access the database. In addition, all 'communications' are
generally written to disk.
If the languages involved run on the same runtime (i.e. .NET,
JVM), then you generally can pass object data from one language
directly to the other with little impedence.
In almost every case, you have to convert any communication to a common
format before it can be exchanged (the exception is languages on the
same runtime). This is why multiple languages are rarely used in one
project.
I work on a large enterprise project which is comprised of (at the last count) about 8 languages. The majority of the communication is via an enterprise level message bus which contains bindings for multiple languages to tap into and pass data back and forth. It's called tibco.
There are many different ways you can use different languages in one project
There are two main categories that come to mind:
Using Different languages together to build one application. For example using Java to build the GUI and using JNI to access C API (so answering your question you can call C functions from Java ;))
Using different languages together in the one project if they are not part of the same application. For example. I am currently working on an iPhone app that has uses a large amount of text. I am currently using three languages: Python (to work with the original sources of the text), SQL (to put the results of the python app in a format easily accessible from the iPhone sqlite3 API) and Objective C to build the actual app. Even though the final product will only be Objective C, I've used two other languages to get to the final product.
You could have an app where the bulk of the work is done in Java, but there might be some portion of it, like maybe a data parser or something is written in Python or what have you. Almost two separate apps really, perhaps the parser is just doing some work on files and then your main app in Java is using them for something. If someone were to ask me what I used in this project I'd say "Java and Python."
There are various ways that multiple languages can be used in one project. Some examples:
You can write a DLL in, say, C, and then use that library from, say, a VB program.
You could write a server program in, say C++, and have lots of different language implementations of the client.
A web project often uses lots of languages; for example a server program, written in, say, Java (a programming language), which fetches data from a database using SQL (a query language), sends the result to the browser in HTML (a markup language), which the user can interact with using Javascript (a scripting language)...
Badly. If there is no urgent need, stick to a single language. You're increasing dependencies and complexity. But when you have existing code providing interesting functionality, it can be easier to glue it together than to recreate it.
It depends on the type of project. If you want to experiment, you can set up a web project, in .NET, and change the language on a page by page basis. It does not work as you show in your pseudocode, but it is multiple languages. Of course, the actual code directory must be a single language.
There are a couple of ways in which code in languages can interact directly. As long as the data being passed between the code is in the right format, at the bits and bytes level, then there is no reason why different languages can't interop. This approach is used in traditional windows DLL development. Even on different platforms, if you can get the format correct (look at big/little endian if interested) it will work as long as your linker (not compiler) knows how to join code together.
Beyond that there are numerous other ways in which languages can talk to each other. In the .Net world code is compiled down to IL code, which is the same for every language, in this way C#, VB.Net are all the same under the hood and can call/work with each other seamlessly.
Just to add to the list of examples, it's fairly common to optimize Python code in C or C++ or write a C library to bind another library to Python.

Resources