I am trying to port a native Ruby extension to FFI. The exposed Ruby interface is
auto_link(text, mode=:all, link_attr=nil, skip_tags=nil, flags=0) { |link_text| ... }
and the block is optional.
There are two functions in the original C implementation:
int rinku_autolink(
struct buf *ob,
const uint8_t *text,
size_t size,
autolink_mode mode,
unsigned int flags,
const char *link_attr,
const char **skip_tags,
void (*link_text_cb)(struct buf *ob, const struct buf *link, void *payload),
void *payload)
which does the actual work and
static VALUE rb_rinku_autolink(int argc, VALUE *argv, VALUE self)
which deals with the default arguments and block callback stuff.
My question is: if I want to expose the same Ruby interface through FFI, which of the two functions above should be registered with attach_function, or should I define another C function for FFI? And whichever function I choose, how do I define the default argument values in attach_function?
The answer is neither. I think you are misunderstanding the point of FFI, or I am misunderstanding your post. If you are porting your native extension to FFI, that should mean that you are getting rid of all the C code in your code base and converting it to Ruby. You should convert the features of the old C methods rinku_autolink() and rb_rinku_autolink() into Ruby, probably a single Ruby method. Then if that Ruby method needs to call some C functions you would use FFI's attach_function method to get access to those.
If that's not what you are doing, could you please explain what your actual goal is and why?
I am reading Android kernel code and I keep running into this kind of data structure:
static const struct file_operations tracing_fops = {
.open = tracing_open,
.read = seq_read,
.write = tracing_write_stub,
.llseek = tracing_seek,
.release = tracing_release,
};
Can someone explain this syntax in general? The right-hand sides of the assignments are function names, and &tracing_fops is later passed as an argument to another function that initializes the debugfs file system.
The initializer uses designated initializers (C99 Section 6.7.8), a feature that is often used together with compound literals. Regarding compound literals, C99 Section 6.5.2.5 says:
A postfix expression that consists of a parenthesized type name
followed by a brace- enclosed list of initializers is a compound
literal. It provides an unnamed object whose value is given by the
initializer list.
In simpler terms, according to the GCC documentation on compound literals:
A compound literal looks like a cast of a brace-enclosed aggregate
initializer list. Its value is an object of the type specified in the
cast, containing the elements specified in the initializer. Unlike the
result of a cast, a compound literal is an lvalue. ISO C99 and later
support compound literals. As an extension, GCC supports compound
literals also in C90 mode and in C++, although as explained below, the
C++ semantics are somewhat different.
A simple example of a designated initializer:
struct foo { int x; int y; };
void func(void) {
struct foo var = { .x = 2, .y = 3 };
...
}
In the question's example, the struct file_operations is defined in include/linux/fs.h and tracing_fops is in kernel/trace/trace.c file in Linux source tree.
struct file_operations {
struct module *owner;
loff_t (*llseek) (struct file *, loff_t, int);
ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
...
};
The open, read, and write members are function pointers, that is, pointers that point to functions. A function pointer can be called just like a normal function. The tracing_fops structure is of type struct file_operations. Its function pointer members are set to functions defined in the same trace.c file using designated initializers.
With designated initializers, we don't have to explicitly initialize every member of the structure type, because the remaining members are set to zero or null. Members can also be initialized without depending on their declaration order. The signature of each function must match the type of the member it is assigned to. For example, the parameters of
int (*open) (struct inode *, struct file *);
are the same as those of
int tracing_open(struct inode *inode, struct file *file);
In object-oriented programming, this idea is somewhat similar to a virtual function table.
This is simply a struct initialization, using field names to assign values to specific fields only. You can take a look at struct initialization at cppreference which demonstrates these use cases (and even more advanced situations, such as omitting specific field names, etc.)
The Linux kernel sources often make use of structs consisting of sets of function pointers for related operations. These are used to provide distinct implementations of the same interface, akin to what would be accomplished using class inheritance in object-oriented languages. For instance, in C++ this same idea would be implemented using virtual methods and the function pointers would be stored in the class vtable (which means this would be implicit rather than explicit in C++.)
Using this struct in C is similar to how you'd use an object of a class using virtual methods in C++, since you can simply call one of the "methods" using:
int r = fops->open(inode, filp);
The actual code typically tests whether the struct member is set, since the struct initialization will keep the pointers that are not explicitly mentioned set to NULL, making it possible to use this kind of struct to implement optional operations as well.
The main difference is that in C++ you have an implicit reference to the object itself (this), while in C you have to pass it as an additional argument in cases where it's needed.
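A minimal sketch of the optional-operation pattern described above, written in C++ with plain aggregate initialization instead of the kernel's designated initializers. The names demo_ops, demo_open, demo_release, tracing_like_ops and call_optional are hypothetical and do not come from the kernel sources; the sketch only assumes that members left out of the initializer end up as null pointers.

```cpp
#include <string>

// Hypothetical miniature of a file_operations-style table: every member is a
// function pointer, and members missing from the initializer are null.
struct demo_ops {
    int (*open)(std::string& log);
    int (*release)(std::string& log);
    int (*flush)(std::string& log);  // optional operation
};

static int demo_open(std::string& log)    { log += "open;";    return 0; }
static int demo_release(std::string& log) { log += "release;"; return 0; }

// Only the first two members are given; flush is value-initialized to nullptr.
static const demo_ops tracing_like_ops = { demo_open, demo_release };

// Call an operation only if this implementation provides it.
int call_optional(int (*op)(std::string&), std::string& log) {
    if (op == nullptr) {
        return -1;  // operation not supported
    }
    return op(log);
}
```

A caller would pass tracing_like_ops.open or tracing_like_ops.flush to call_optional and treat the -1 result as "not implemented", which mirrors the NULL check the kernel performs before calling through the table.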
OK, I have been muddling through Stack Overflow posts on the particulars of void*, along with books like The C Programming Language (K&R) and The C++ Programming Language (Stroustrup). What have I learned? That void* is a generic pointer with no type implied. It requires a cast to a defined type, and printing a void* just yields the address.
What else do I know? void* can't be dereferenced, and thus far it remains the one item in C/C++ about which I have found much written but little understanding imparted.
I understand that it must be cast such as *(char*)void* but what makes no sense to me for a generic pointer is that I must somehow already know what type I need in order to grab a value. I'm a Java programmer; I understand generic types but this is something I struggle with.
So I wrote some code
typedef struct node
{
void* data;
node* link;
}Node;
typedef struct list
{
Node* head;
}List;
Node* add_new(void* data, Node* link);
void show(Node* head);
Node* add_new(void* data, Node* link)
{
Node* newNode = new Node();
newNode->data = data;
newNode->link = link;
return newNode;
}
void show(Node* head)
{
while (head != nullptr)
{
std::cout << head->data;
head = head->link;
}
}
int main()
{
List list;
list.head = nullptr;
list.head = add_new("My Name", list.head);
list.head = add_new("Your Name", list.head);
list.head = add_new("Our Name", list.head);
show(list.head);
fgetc(stdin);
return 0;
}
I'll handle the memory deallocation later. Assuming I have no understanding of the type stored in void*, how do I get the value out? This implies I already need to know the type, which reveals nothing about the generic nature of void*. I can follow what is written here, but I still don't understand it.
Why am I expecting void* to cooperate and the compiler to automatically cast out the type that is hidden internally in some register on the heap or stack?
I'll handle the memory deallocation later. Assuming I have no understanding of the type stored in void*, how do I get the value out?
You can't. You must know the valid types that the pointer can be cast to before you can dereference it.
Here are a couple of options for using a generic type:
If you are able to use a C++17 compiler, you may use std::any.
If you are able to use the boost libraries, you may use boost::any.
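As a rough illustration of the std::any option, here is a minimal sketch that assumes a C++17 compiler. The helper name describe is hypothetical. It shows that even with std::any the caller still has to name the candidate types; the difference from void* is that a wrong guess yields a null pointer instead of undefined behavior.

```cpp
#include <any>
#include <string>

// Returns a textual description of the stored value for the types we expect.
std::string describe(const std::any& value) {
    // The pointer form of any_cast returns nullptr when the type does not match.
    if (const int* i = std::any_cast<int>(&value)) {
        return "int:" + std::to_string(*i);
    }
    if (const std::string* s = std::any_cast<std::string>(&value)) {
        return "string:" + *s;
    }
    return "unknown";
}
```

A list node could then hold a std::any instead of a void*, and show() could call describe on each node's data.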
Unlike Java, in C/C++ you are working with raw memory pointers. There is no encapsulation whatsoever. The void * type means the variable is an address in memory, and anything can be stored there. With a type like int * you tell the compiler what you are referring to. The compiler also knows the size of the type (say, 4 bytes for int), and the address will typically be a multiple of 4 in that case (memory alignment). On top of that, if you give the compiler the type, it performs consistency checks at compile time, not afterwards. None of this happens with void *.
In a nutshell, you are working on bare metal. Types are compile-time information for the compiler and carry no runtime information. Nor does the runtime track the objects you create dynamically. An allocation is merely a region of memory in which you can eventually store anything.
The main reason to use void* is that different things may be pointed at. Thus, I may pass in an int* or Node* or anything else. But unless you know either the type or the length, you can't do anything with it.
But if you know the length, you can handle the memory pointed at without knowing the type. Casting it as a char* is used because it is a single byte, so if I have a void* and a number of bytes, I can copy the memory somewhere else, or zero it out.
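The byte-level handling described above might look like the following sketch. The function names zero_bytes and copy_bytes are hypothetical; the only assumption is that the caller supplies the correct length in bytes.

```cpp
#include <cstddef>
#include <cstring>

// Zero out `length` bytes without knowing what type lives at `data`.
void zero_bytes(void* data, std::size_t length) {
    unsigned char* bytes = static_cast<unsigned char*>(data);
    for (std::size_t i = 0; i < length; ++i) {
        bytes[i] = 0;
    }
}

// Copy `length` bytes from one untyped location to another.
void copy_bytes(void* destination, const void* source, std::size_t length) {
    std::memcpy(destination, source, length);
}
```

For example, copy_bytes(&b, &a, sizeof a) duplicates an object of any trivially copyable type, and neither function ever needs to learn what that type is.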
Additionally, if it is a pointer to a class, but you don't know if it is a parent or inherited class, you may be able to assume one and find out a flag inside the data which tells you which one. But no matter what, when you want to do much beyond passing it to another function, you need to cast it as something. char* is just the easiest single byte value to use.
Your confusion comes from the habit of working with Java programs. Java code is a set of instructions for a virtual machine, where the role of RAM is played by something like a database that stores the name, type, size, and data of each object. The language you are learning now is meant to be compiled into instructions for the CPU, with the same memory organization as the underlying OS. The memory model used by C and C++ is an abstraction built on top of the most popular OSes, designed so that code runs efficiently once compiled for a given platform and OS. Naturally, that organization doesn't include any stored data about types, except for RTTI in C++.
In your case, RTTI cannot be used directly unless you create a wrapper around your naked pointer that stores the data.
In fact, the C++ standard library contains a vast collection of container class templates that are usable and portable, since they are defined by the ISO standard. About three quarters of the standard is a description of the library, often referred to as the STL. Using them is preferable to working with naked pointers, unless you intend to create your own container for some reason. For this particular task, only the C++17 standard offers the std::any class, which was previously available in the Boost library. Naturally, it is possible to reimplement it or, in some cases, to replace it with std::variant.
Assuming I have no understanding of the type stored in void*, how do I get the value out
You don't.
What you can do is record the type stored in the void*.
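One way to record the type is to store a tag next to the pointer. The sketch below is only an illustration under that assumption; Kind, TaggedPtr and to_text are hypothetical names, and the set of supported types is fixed by the enum.

```cpp
#include <string>

// The tag says which type the void pointer really refers to.
enum class Kind { Int, CString };

struct TaggedPtr {
    Kind kind;
    const void* data;
};

// Cast back to the recorded type before reading the value.
std::string to_text(const TaggedPtr& p) {
    switch (p.kind) {
    case Kind::Int:
        return std::to_string(*static_cast<const int*>(p.data));
    case Kind::CString:
        return static_cast<const char*>(p.data);
    }
    return "";
}
```

The cast is only as safe as the tag: whoever creates a TaggedPtr is responsible for setting kind to match what data points at.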
In C, void* is used to pass a pointer to some data through one layer of abstraction and receive it at the other end, where it is cast back to the type that the code knows it was given.
#include <stdio.h>
void do_callback( void(*pfun)(void*), void* pdata ) {
pfun(pdata);
}
void print_int( void* pint ) {
printf( "%d", *(int*)pint );
}
int main() {
int x = 7;
do_callback( print_int, &x );
}
Here, we forget the type of &x and pass it through do_callback.
It is later passed to code inside do_callback or elsewhere that knows that the void* is actually an int*. So it casts it back and uses it as an int.
The void* and the consumer void(*)(void*) are coupled. The above code is "provably correct", but the proof does not lie in the type system; instead, it depends on the fact we only use that void* in a context that knows it is an int*.
In C++ you can use void* similarly. But you can also get fancy.
Suppose you want a pointer to anything printable. Something is printable if it can be << to a std::ostream.
struct printable {
void const* ptr = 0;
void(*print_f)(std::ostream&, void const*) = 0;
printable() {}
printable(printable&&)=default;
printable(printable const&)=default;
printable& operator=(printable&&)=default;
printable& operator=(printable const&)=default;
template<class T,std::size_t N>
printable( T(&t)[N] ):
ptr( t ),
print_f( []( std::ostream& os, void const* pt) {
T* ptr = (T*)pt;
for (std::size_t i = 0; i < N; ++i)
os << ptr[i];
})
{}
template<std::size_t N>
printable( char(&t)[N] ):
ptr( t ),
print_f( []( std::ostream& os, void const* pt) {
os << (char const*)pt;
})
{}
template<class T,
std::enable_if_t<!std::is_same<std::decay_t<T>, printable>{}, int> =0
>
printable( T&& t ):
ptr( std::addressof(t) ),
print_f( []( std::ostream& os, void const* pt) {
os << *(std::remove_reference_t<T>*)pt;
})
{}
friend
std::ostream& operator<<( std::ostream& os, printable self ) {
self.print_f( os, self.ptr );
return os;
}
explicit operator bool()const{ return print_f; }
};
What I just did is a technique called "type erasure" in C++ (vaguely similar to Java type erasure).
void send_to_log( printable p ) {
std::cerr << p;
}
Here we created an ad-hoc "virtual" interface to the concept of printing on a type.
The type need not support any actual interface (no binary layout requirements), it just has to support a certain syntax.
We create our own virtual dispatch table system for an arbitrary type.
This is used in the C++ standard library. In C++11 there is std::function<Signature>, and in C++17 there is std::any.
std::any is a void* that knows how to destroy and copy its contents, and if you know the type you can cast it back to the original type. You can also query it to ask whether it holds a specific type.
Mixing std::any with the above type-erasure technique lets you create regular types (that behave like values, not references) with arbitrary duck-typed interfaces.
When receiving data on the wire and passing it to upper-layer applications, we normally have, in C style, a struct with a void*, for example:
struct SData{
//... len, size, version, msg type, ...
void* payload;
};
Later in the code, after error checking, allocating memory, and so on, we can do something like:
if(msgType == type1){
struct SType1* ptr = (struct SType1*) SData->payload;
}
In C++, an attempt to use unique_ptr fails in the following snippet:
struct SData{
// .. len, size, version, msg type, ...
std::unique_ptr<void> payload;
};
But as you know, this will cause:
error: static assertion failed: can't delete pointer to incomplete type
Is there a way to use smart pointers to handle this?
One solution I found is here:
Should std::unique_ptr<void> be permitted
Which requires creating a custom deleter:
void HandleDeleter(HANDLE h)
{
if (h) CloseHandle(h);
}
using
UniHandle = unique_ptr<void, function<void(HANDLE)>>;
This will require significantly more additional code (compared to the simple unsafe C Style), since for each type of payload there has to be some logic added.
This will require significantly more additional code (compared to the simple unsafe C Style), since for each type of payload there has to be some logic added.
The additional complexity is only calling the added destructors. You could use a function pointer instead of std::function since no closure state should ever be used.
If you don't want destructors, but only to add RAII to the C idiom, then use a custom deleter which simply does operator delete or std::free.
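A minimal sketch of that last suggestion, assuming the payload is allocated with std::malloc and needs no destructor. The payload layout SType1, the helper free_payload and the factory make_type1 are hypothetical and only stand in for the real message types.

```cpp
#include <cstdlib>
#include <cstring>
#include <memory>

// Hypothetical payload layout for message type 1.
struct SType1 {
    int id;
    char name[16];
};

// Plain function used as the deleter; it only releases the raw storage.
inline void free_payload(void* p) { std::free(p); }

// The deleter is a function pointer, so no closure state is stored.
using PayloadPtr = std::unique_ptr<void, void (*)(void*)>;

struct SData {
    int msgType = 0;
    PayloadPtr payload{nullptr, free_payload};
};

// Builds a type-1 message whose payload is owned by the smart pointer.
SData make_type1(int id, const char* name) {
    SData msg;
    void* raw = std::malloc(sizeof(SType1));
    if (raw == nullptr) {
        return msg;  // allocation failed: payload stays empty
    }
    SType1* body = static_cast<SType1*>(raw);
    body->id = id;
    std::strncpy(body->name, name, sizeof(body->name) - 1);
    body->name[sizeof(body->name) - 1] = '\0';
    msg.msgType = 1;
    msg.payload.reset(raw);
    return msg;
}
```

Reading the payload still uses the same cast as the C version, for example static_cast<SType1*>(msg.payload.get()), but the memory is released automatically when the SData object goes out of scope.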
I would like to use Fiddle to access a native library compiled from Rust code. The C representation of the struct is very simple, it is just a pointer and a length:
typedef struct {
char *data;
size_t len;
} my_thing_t;
// Example function that somehow accepts a struct
void accepts_a_struct(my_thing_t thing);
// Example function that somehow returns a struct
my_thing_t returns_a_struct(void);
However, all examples I can find accept or return pointers to structs, and not the structs themselves. I'd like to avoid the double indirection if at all possible.
I've borrowed an example from the Fiddle::Importer documentation. However, I do not see how to properly call the extern method with a structure instead of a pointer to a structure:
require 'fiddle'
require 'fiddle/import'
module LibSum
extend Fiddle::Importer
dlload './libsum.so'
extern 'double sum(double*, int)'
extern 'double split(double)'
end
Note
Fiddle is not the same as the FFI gem. Fiddle is a component of the Ruby standard library, and is not provided as a separate gem. These related questions refer to the FFI gem, and not to Fiddle:
How to wrap function in Ruby FFI method that takes struct as argument?
How do I specify a struct as the return value of a function in RubyFFI?
I've gone through the Fiddle documentation, and as far as I can see this is not possible, since even the core function definition Fiddle::Function.new requires argument types that Fiddle::CParser can handle. I ran various tests, and to make it work I had to transform your code into something like this:
test2.c
#include <stdio.h>
#include <stdlib.h>
typedef struct {
char *data;
char *more_data;
size_t len;
} my_thing_t;
my_thing_t *returns_a_struct(void){
my_thing_t *structure = malloc(sizeof(my_thing_t));
structure->data = "test2";
structure->more_data = "I am more data";
structure->len = 5;
return structure;
}
irb
require 'fiddle'
require 'fiddle/import'
module Testmd
extend Fiddle::Importer
dlload './test2.dll'
RetStruct = struct ['char *data','char *more_data','size_t len']
extern 'RetStruct* returns_a_struct(void)'
end
include Testmd
2.2.1 :013 > res = Testmd::returns_a_struct(nil)
=> #<Fiddle::Pointer:0x00000000b12a10 ptr=0x00000000e066b0 size=0 free=0x00000000000000>
2.2.1 :014 > s = RetStruct.new(res)
=> #<Testmd::RetStruct:0x00000000c3e9e8 #entity=#<Fiddle::CStructEntity:0x000000007f0ad0 ptr=0x00000000e066b0 size=24 free=0x00000000000000>>
2.2.1 :015 > s.data.to_s
=> "test2"
2.2.1 :016 > s.more_data.to_s
=> "I am more data"
2.2.1 :017 > s.len
=> 5
My conclusion is that Fiddle can operate on simple types but needs struct and union types to be passed by reference. It does have wrappers for these types, though. These wrappers also inherit from Fiddle::Pointer, which suggests that we are meant to use pointers for these data types.
If you want more details on this, or want to add this functionality, you can go through their Git repository.
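If the by-value functions cannot be changed, one possible workaround is a thin native shim that exposes pointer-based entry points, so the caller owns the struct memory and nothing has to be allocated with malloc. The sketch below is an assumption-laden illustration: returns_a_struct_into and accepts_a_struct_from are hypothetical names, and the two by-value functions are given stand-in bodies only so the example is self-contained (in the real library they would come from the Rust side).

```cpp
#include <stddef.h>

extern "C" {

typedef struct {
    char*  data;
    size_t len;
} my_thing_t;

// Stand-in state and definitions, present only to keep the sketch complete.
static char   g_buffer[] = "test2";
static size_t g_last_len = 0;

my_thing_t returns_a_struct(void) {
    my_thing_t thing = { g_buffer, sizeof(g_buffer) - 1 };
    return thing;
}

void accepts_a_struct(my_thing_t thing) { g_last_len = thing.len; }

// Hypothetical shims: the struct crosses the boundary through a pointer,
// while the original by-value functions stay untouched.
void returns_a_struct_into(my_thing_t* out) { *out = returns_a_struct(); }

void accepts_a_struct_from(const my_thing_t* in) { accepts_a_struct(*in); }

}  // extern "C"
```

On the Ruby side, the struct would then be allocated through the Fiddle struct wrapper and its pointer handed to the shim functions.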
When taking the address of a variadic function template, the g++ compiler (version 4.8.2) reports this error:
address of overloaded function with no contextual type information
The code in question:
template<typename... Args>
void redirect_function(const char *format, Args... args)
{
pLog->Write(format, args...); // or: printf(format, args...);
}
void *fnPtr = (void *)&redirect_function; // The error occurs here.
Here is what I do with this somewhere else:
typedef void (*log_bridge)(const char*, ...);
log_bridge LogWrite;
LogWrite = (log_bridge)fnPtr;
I have no other way to do this, so please don't suggest completely different approaches.
Well, the reason it's not possible is simple: there is a clear ambiguity. redirect_function is not a function; like all function templates, it's more like a set of overloads generated from the template for different argument types.
The function needs to get instantiated first to be able to get its address, and you provide no necessary information to do this.
In other words the problem is that you cannot possibly know which concrete overload of redirect_function you should use on the problematic line.
The only thing you could do is to provide template arguments explicitly.
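A minimal sketch of providing the template arguments explicitly. The template format_message and the alias IntFormatter are hypothetical stand-ins for redirect_function. Note that the selected instantiation is an ordinary function with a fixed parameter list, so the sketch stores it in a matching function pointer type rather than in a C-style variadic one.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for redirect_function: formats into a string.
template <typename... Args>
std::string format_message(const char* format, Args... args) {
    char buffer[128];
    std::snprintf(buffer, sizeof buffer, format, args...);
    return buffer;
}

// Pointer type that matches exactly one instantiation of the template.
using IntFormatter = std::string (*)(const char*, int);

// Naming the template argument selects a single concrete function,
// so its address is no longer ambiguous.
IntFormatter int_formatter = &format_message<int>;
```

Each distinct argument list needs its own instantiation and its own pointer type, so this approach works only when the set of call signatures is known in advance.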