Using nogil with cpdef class method in cython - parallel-processing

I want to design a cdef class whose methods can be run in parallel and therefore I need to set them as nogil. I see I can do this for cdef methods, but for some reason I cannot understand I am not allowed to do the same with cpdef methods. This in particular fails
cdef class Test:
cdef int i
def __init__(self):
self.i = 0
cpdef int incr(self) nogil:
self.i += 1;
return self.i
while the same with cdef int incr would work. This is somewhat surprising, because in normal cpdef functions the nogil attribute is allowed:
cpdef int testfunc(int x) nogil:
return x + 1
Am I missing something or doing something wrong here?

If you look at the C code generated (omitting nogil) you'll see that the first thing the method does is check whether it has been overridden by a Python subclass. This requires the GIL.
(Note that this can't happen for a cdef function since it is never known about by Python, so there is no issue there.)
Fortunately, it is easy to make it so that your Cython class cannot be subclassed and the problem goes away (it compiles fine with nogil):
cimport cython
@cython.final
cdef class Test:
# the rest of your code is exactly the same so isn't reproduced...

Related

Why should we initialize data members on declaration (not necessarily on constructor)?

Could anyone explain the reason for this coding recommendation?
Since C++11, please initialize data members on declaration (not
necessary on constructor) :
class Limit
{
public:
Limit() = default;
private:
int32_t quantity = 0;
double price = 0.0;
};
Someone thinks (correctly) that this way the variable is always initialised. Which is a good thing if it is initialised with a meaningful value and bad if the value is not meaningful. For example a person’s year of birth is a number from say 1890 to 2021. Initialising it to 0 isn’t useful and can only prevent the compiler from warning you.
So do this if you have a value that is always a useful initialisation value. I wouldn’t do it for anything that is likely to be overwritten in a constructor or shortly after.
I found this answer in the C++ Core Guidelines, rule C.48:
C.48: Prefer in-class initializers to member initializers in constructors for constant initializers
Reason
Makes it explicit that the same value is expected to be used in all constructors. Avoids repetition. Avoids maintenance problems. It leads to the shortest and most efficient code.
Example, bad
class X { // BAD
int i;
string s;
int j;
public:
X() :i{666}, s{"qqq"} { } // j is uninitialized
X(int ii) :i{ii} {} // s is "" and j is uninitialized
// ...
};
How would a maintainer know whether j was deliberately uninitialized (probably a bad idea anyway) and whether it was intentional to give s the default value "" in one case and qqq in another (almost certainly a bug)? The problem with j (forgetting to initialize a member) often happens when a new member is added to an existing class.
Example
class X2 {
int i {666};
string s {"qqq"};
int j {0};
public:
X2() = default; // all members are initialized to their defaults
X2(int ii) :i{ii} {} // s and j initialized to their defaults
// ...
};
Alternative: We can get part of the benefits from default arguments to constructors, and that is not uncommon in older code. However, that is less explicit, causes more arguments to be passed, and is repetitive when there is more than one constructor:
class X3 { // BAD: inexplicit, argument passing overhead
int i;
string s;
int j;
public:
X3(int ii = 666, const string& ss = "qqq", int jj = 0)
:i{ii}, s{ss}, j{jj} { } // all members are initialized to their defaults
// ...
};
Enforcement
(Simple) Every constructor should initialize every member variable (either explicitly, via a delegating ctor call or via default construction).
(Simple) Default arguments to constructors suggest an in-class initializer might be more appropriate.
Guideline C.45 also explains it.
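As a minimal sketch of the rule above, the following reuses the Limit class from the question and adds a hypothetical one-argument constructor and two accessors (none of which appear in the original) to show that a constructor only has to mention the members it wants to override:

```cpp
#include <cassert>
#include <cstdint>

// Default member initializers supply the value whenever a constructor
// does not mention the member in its own initializer list.
class Limit {
public:
    Limit() = default;                 // quantity = 0, price = 0.0

    // Hypothetical extra constructor, added for illustration only.
    explicit Limit(std::int32_t q)
        : quantity{q} {}               // overrides quantity; price keeps its default

    std::int32_t getQuantity() const { return quantity; }
    double getPrice() const { return price; }

private:
    std::int32_t quantity = 0;
    double price = 0.0;
};
```

Constructing `Limit{7}` should leave `price` at its in-class default, so adding a new member later only requires one initializer at the declaration.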

Casting Parent Struct to Child Struct [duplicate]

In c++ what is object slicing and when does it occur?
"Slicing" is where you assign an object of a derived class to an instance of a base class, thereby losing part of the information - some of it is "sliced" away.
For example,
class A {
int foo;
};
class B : public A {
int bar;
};
So an object of type B has two data members, foo and bar.
Then if you were to write this:
B b;
A a = b;
Then the information in b about member bar is lost in a.
Most answers here fail to explain what the actual problem with slicing is. They only explain the benign cases of slicing, not the treacherous ones. Assume, like the other answers, that you're dealing with two classes A and B, where B derives (publicly) from A.
In this situation, C++ lets you pass an instance of B to A's assignment operator (and also to the copy constructor). This works because an instance of B can be converted to a const A&, which is what assignment operators and copy-constructors expect their arguments to be.
The benign case
B b;
A a = b;
Nothing bad happens there - you asked for an instance of A which is a copy of B, and that's exactly what you get. Sure, a won't contain some of b's members, but how should it? It's an A, after all, not a B, so it hasn't even heard about these members, let alone would be able to store them.
The treacherous case
B b1;
B b2;
A& a_ref = b2;
a_ref = b1;
//b2 now contains a mixture of b1 and b2!
You might think that b2 will be a copy of b1 afterward. But, alas, it's not! If you inspect it, you'll discover that b2 is a Frankensteinian creature, made from some chunks of b1 (the chunks that B inherits from A), and some chunks of b2 (the chunks that only B contains). Ouch!
What happened? Well, C++ by default doesn't treat assignment operators as virtual. Thus, the line a_ref = b1 will call the assignment operator of A, not that of B. This is because, for non-virtual functions, the declared (formally: static) type (which is A&) determines which function is called, as opposed to the actual (formally: dynamic) type (which would be B, since a_ref references an instance of B). Now, A's assignment operator obviously knows only about the members declared in A, so it will copy only those, leaving the members added in B unchanged.
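The mixed state described above can be observed with a small sketch. The member names `a_part` and `b_part` and the helper `assign_through_base_ref` are assumptions made for illustration:

```cpp
#include <cassert>

struct A {
    int a_part = 0;
};

struct B : A {
    int b_part = 0;
    B(int a, int b) { a_part = a; b_part = b; }
};

// Assigns src to dst through a base-class reference and returns dst.
// Only A::operator= runs, so dst keeps its own b_part.
inline B assign_through_base_ref(B dst, const B& src) {
    A& a_ref = dst;   // static type A&, dynamic type B
    a_ref = src;      // calls A::operator=(const A&), copying a_part only
    return dst;
}
```

With `b1` built as `B(1, 10)` and `b2` as `B(2, 20)`, the returned object should hold `a_part` from `b1` and `b_part` from `b2`.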
A solution
Assigning only to parts of an object usually makes little sense, yet C++, unfortunately, provides no built-in way to forbid this. You can, however, roll your own. The first step is making the assignment operator virtual. This will guarantee that it's always the actual type's assignment operator which is called, not the declared type's. The second step is to use dynamic_cast to verify that the assigned object has a compatible type. The third step is to do the actual assignment in a (protected!) member assign(), since B's assign() will probably want to use A's assign() to copy A's members.
class A {
public:
virtual A& operator= (const A& a) {
assign(a);
return *this;
}
protected:
void assign(const A& a) {
// copy members of A from a to this
}
};
class B : public A {
public:
virtual B& operator= (const A& a) {
if (const B* b = dynamic_cast<const B*>(&a))
assign(*b);
else
throw bad_assignment();
return *this;
}
protected:
void assign(const B& b) {
A::assign(b); // Let A's assign() copy members of A from b to this
// copy members of B from b to this
}
};
Note that, for pure convenience, B's operator= covariantly overrides the return type, since it knows that it's returning an instance of B.
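For readers who want to try the pattern, here is a compilable sketch under some assumptions: `bad_assignment` is not a standard type, so a hypothetical definition is supplied, and the members `a_val` and `b_val` are invented to give `assign()` something to copy.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical exception type; the answer above leaves bad_assignment undefined.
struct bad_assignment : std::logic_error {
    bad_assignment() : std::logic_error("incompatible dynamic type in assignment") {}
};

class A {
public:
    int a_val = 0;   // illustrative member
    virtual ~A() = default;
    virtual A& operator=(const A& a) {
        assign(a);
        return *this;
    }
protected:
    void assign(const A& a) { a_val = a.a_val; }
};

class B : public A {
public:
    int b_val = 0;   // illustrative member
    B& operator=(const A& a) override {
        if (const B* b = dynamic_cast<const B*>(&a))
            assign(*b);
        else
            throw bad_assignment();
        return *this;
    }
protected:
    void assign(const B& b) {
        A::assign(b);        // copy the A part
        b_val = b.b_val;     // copy the B part
    }
};
```

Assigning one B to another through an `A&` should now copy both parts, while assigning a plain A to a B through an `A&` should throw.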
If you have a base class A and a derived class B, then you can do the following.
void wantAnA(A myA)
{
// work with myA
}
B derived;
// work with the object "derived"
wantAnA(derived);
Now the method wantAnA needs a copy of derived. However, the object derived cannot be copied completely, as the class B could invent additional member variables which are not in its base class A.
Therefore, to call wantAnA, the compiler will "slice off" all additional members of the derived class. The result might be an object you did not want to create, because
it may be incomplete,
it behaves like an A-object (all special behaviour of the class B is lost).
These are all good answers. I would just like to add an execution example when passing objects by value vs by reference:
#include <iostream>
using namespace std;
// Base class
class A {
public:
A() {}
A(const A& a) {
cout << "'A' copy constructor" << endl;
}
virtual void run() const { cout << "I am an 'A'" << endl; }
};
// Derived class
class B: public A {
public:
B():A() {}
B(const B& a):A(a) {
cout << "'B' copy constructor" << endl;
}
virtual void run() const { cout << "I am a 'B'" << endl; }
};
void g(const A & a) {
a.run();
}
void h(const A a) {
a.run();
}
int main() {
cout << "Call by reference" << endl;
g(B());
cout << endl << "Call by copy" << endl;
h(B());
}
The output is:
Call by reference
I am a 'B'
Call by copy
'A' copy constructor
I am an 'A'
Third match in google for "C++ slicing" gives me this Wikipedia article http://en.wikipedia.org/wiki/Object_slicing and this (heated, but the first few posts define the problem) : http://bytes.com/forum/thread163565.html
So it's when you assign an object of a subclass to the super class. The superclass knows nothing of the additional information in the subclass, and hasn't got room to store it, so the additional information gets "sliced off".
If those links don't give enough info for a "good answer" please edit your question to let us know what more you're looking for.
The slicing problem is serious because it can result in memory corruption, and it is very difficult to guarantee a program does not suffer from it. To design it out of the language, classes that support inheritance should be accessible by reference only (not by value). The D programming language has this property.
Consider class A, and class B derived from A. Memory corruption can happen if the A part has a pointer p, and a B instance that points p to B's additional data. Then, when the additional data gets sliced off, p is pointing to garbage.
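A small sketch of that pointer scenario, using an invented member `extra` and helper `sliced_copy_points_into_original`. It only inspects the pointer while the original object is still alive, so it stays well defined:

```cpp
#include <cassert>

struct A {
    const int* p = nullptr;   // may point at data owned by a derived object
};

struct B : A {
    int extra = 42;
    B() { p = &extra; }       // p refers to this object's own extra
};

// Returns true if a sliced copy of b still points into b itself.
inline bool sliced_copy_points_into_original(const B& b) {
    A a = b;                  // slicing: only the A part (the pointer) is copied
    return a.p == &b.extra;   // the copy has no 'extra' of its own
}
```

The sliced copy carries a pointer into an object it does not own, so once the original B is destroyed the pointer would dangle.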
In C++, a derived class object can be assigned to a base class object, but the other way is not possible.
class Base { int x, y; };
class Derived : public Base { int z, w; };
int main()
{
Derived d;
Base b = d; // Object Slicing, z and w of d are sliced off
}
Object slicing happens when a derived class object is assigned to a base class object, additional attributes of a derived class object are sliced off to form the base class object.
I see that all the answers cover slicing of data members. Here is an example where slicing means an overridden method is not called:
class A{
public:
virtual void Say(){
std::cout<<"I am A"<<std::endl;
}
};
class B: public A{
public:
void Say() override{
std::cout<<"I am B"<<std::endl;
}
};
int main(){
B b;
A a1;
A a2=b;
b.Say(); // I am B
a1.Say(); // I am A
a2.Say(); // I am A why???
}
B (object b) is derived from A (objects a1 and a2). As expected, b and a1 call their own member functions. From a polymorphism viewpoint, however, we might expect a2, which was initialised from b, to call B's override, and it does not. a2 stores only the A part of b; that is object slicing in C++.
To solve this problem, a reference or pointer should be used
A& a2=b;
a2.Say(); // I am B
or
A* a2 = &b;
a2->Say(); // I am B
So ... Why is losing the derived information bad? ... because the author of the derived class may have changed the representation such that slicing off the extra information changes the value being represented by the object. This can happen if the derived class is used to cache a representation that is more efficient for certain operations, but expensive to transform back to the base representation.
Also thought someone should also mention what you should do to avoid slicing...
Get a copy of C++ Coding Standards: 101 Rules, Guidelines, and Best Practices. Dealing with slicing is #54.
It suggests a somewhat sophisticated pattern to fully deal with the issue: have a protected copy constructor, a protected pure virtual DoClone, and a public Clone with an assert which will tell you if a (further) derived class failed to implement DoClone correctly. (The Clone method makes a proper deep copy of the polymorphic object.)
You can also mark the copy constructor on the base explicit which allows for explicit slicing if it is desired.
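The Clone pattern described above might look roughly like this. The `Shape`/`Circle` hierarchy and `Name()` are hypothetical, and returning `std::unique_ptr` from `Clone` is a choice made for this sketch rather than something the book is quoted as prescribing:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <typeinfo>

class Shape {   // hypothetical base class
public:
    virtual ~Shape() = default;

    // Public, non-virtual entry point that checks the result of DoClone.
    std::unique_ptr<Shape> Clone() const {
        Shape* raw = DoClone();
        // Fires if a further-derived class forgot to override DoClone.
        assert(typeid(*raw) == typeid(*this));
        return std::unique_ptr<Shape>(raw);
    }

    virtual std::string Name() const = 0;

protected:
    Shape() = default;
    Shape(const Shape&) = default;   // protected: blocks accidental slicing copies
    virtual Shape* DoClone() const = 0;
};

class Circle : public Shape {
public:
    std::string Name() const override { return "Circle"; }
protected:
    Circle* DoClone() const override { return new Circle(*this); }
};
```

Because the base copy constructor is protected, code outside the hierarchy cannot write `Shape s = circle;`, and `Clone()` yields a copy with the full dynamic type.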
The slicing problem in C++ arises from the value semantics of its objects, which remained mostly due to compatibility with C structs. You need to use explicit reference or pointer syntax to achieve "normal" object behavior found in most other languages that do objects, i.e., objects are always passed around by reference.
The short answer is that you slice the object by assigning a derived object to a base object by value, i.e. the remaining object is only a part of the derived object. In order to preserve value semantics, slicing is a reasonable behavior and has its relatively rare uses, which don't exist in most other languages. Some people consider it a feature of C++, while many consider it one of the quirks/misfeatures of C++.
1. THE DEFINITION OF SLICING PROBLEM
If D is a derived class of the base class B, then you can assign an object of type D to a variable (or parameter) of type B.
EXAMPLE
class Pet
{
public:
string name;
};
class Dog : public Pet
{
public:
string breed;
};
int main()
{
Dog dog;
Pet pet;
dog.name = "Tommy";
dog.breed = "Kangal Dog";
pet = dog;
cout << pet.breed; // ERROR: Pet has no member named breed
}
Although the above assignment is allowed, the value that is assigned to the variable pet loses its breed field. This is called the slicing problem.
2. HOW TO FIX THE SLICING PROBLEM
To defeat the problem, we use pointers to dynamic variables.
EXAMPLE
Pet *ptrP;
Dog *ptrD;
ptrD = new Dog;
ptrD->name = "Tommy";
ptrD->breed = "Kangal Dog";
ptrP = ptrD;
cout << ((Dog *)ptrP)->breed;
In this case, none of the data members or member functions of the dynamic variable being pointed to by ptrD (descendant class object) will be lost. In addition, if you want to call member functions through the base pointer and get the derived behaviour, those functions must be virtual.
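The same idea can be sketched with a smart pointer and a virtual function instead of a raw `new` and a cast. The `describe()` member and the `make_dog` helper are assumptions added for illustration, and `std::make_unique` requires C++14:

```cpp
#include <cassert>
#include <memory>
#include <string>

class Pet {
public:
    std::string name;
    virtual ~Pet() = default;
    virtual std::string describe() const { return name; }
};

class Dog : public Pet {
public:
    std::string breed;
    std::string describe() const override { return name + " (" + breed + ")"; }
};

// Builds a Dog but hands it back through a base-class pointer.
inline std::unique_ptr<Pet> make_dog(const std::string& name, const std::string& breed) {
    auto dog = std::make_unique<Dog>();
    dog->name = name;
    dog->breed = breed;
    return dog;   // converts to unique_ptr<Pet>; the Dog object itself is untouched
}
```

Calling `describe()` through the `Pet` pointer dispatches to `Dog::describe`, so the breed is still reachable without a cast.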
It seems to me that slicing isn't so much a problem other than when your own classes and program are poorly architected/designed.
If I pass a subclass object in as a parameter to a method which takes a parameter of type superclass, I should certainly be aware of that and know that, internally, the called method will be working with the superclass (aka base class) object only.
It seems to me that only the unreasonable expectation that providing a subclass where a base class is requested would somehow produce subclass-specific results would cause slicing to be a problem. It's either poor design in the use of the method or a poor subclass implementation. I'm guessing it's usually the result of sacrificing good OOP design in favor of expediency or performance gains.
OK, I'll give it a try after reading many posts explaining object slicing but not how it becomes problematic.
The vicious scenario that can result in memory corruption is the following:
Class provides (accidentally, possibly compiler-generated) assignment on a polymorphic base class.
Client copies and slices an instance of a derived class.
Client calls a virtual member function that accesses the sliced-off state.
Slicing means that the data added by a subclass are discarded when an object of the subclass is passed or returned by value to or from a function expecting a base class object.
Explanation:
Consider the following class declaration:
class baseclass
{
...
baseclass & operator =(const baseclass&);
baseclass(const baseclass&);
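};
void function( )
{
baseclass obj1=m; // m is an object of a class derived from baseclass: the copy constructor slices it
obj1=m; // the assignment operator slices it as well
}
As the baseclass copy functions don't know anything about the derived class, only the base part of the derived object is copied. This is commonly referred to as slicing.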
class A
{
public:
int x;
};
class B : public A
{
public:
B( ) : c('a') { x = 1; }
char c;
};
int main( )
{
A a;
B b;
a = b; // b.c == 'a' is "sliced" off
return 0;
}
When a derived class object is assigned to a base class object, the additional attributes of the derived class object are sliced off (discarded) to form the base class object.
class Base {
int x;
};
class Derived : public Base {
int z;
};
int main()
{
Derived d;
Base b = d; // Object Slicing, z of d is sliced off
}
When a derived class object is assigned to a base class object, all the members of the derived class object are copied to the base class object except those that are not present in the base class. These members are sliced away by the compiler.
This is called Object Slicing.
Here is an Example:
#include <iostream>
using namespace std;
class Base
{
public:
int a;
int b;
int c;
Base()
{
a=10;
b=20;
c=30;
}
};
class Derived : public Base
{
public:
int d;
int e;
Derived()
{
d=40;
e=50;
}
};
int main()
{
Derived d;
cout<<d.a<<"\n";
cout<<d.b<<"\n";
cout<<d.c<<"\n";
cout<<d.d<<"\n";
cout<<d.e<<"\n";
Base b = d;
cout<<b.a<<"\n";
cout<<b.b<<"\n";
cout<<b.c<<"\n";
cout<<b.d<<"\n";
cout<<b.e<<"\n";
return 0;
}
It will generate:
[Error] 'class Base' has no member named 'd'
[Error] 'class Base' has no member named 'e'
I just ran across the slicing problem and promptly landed here. So let me add my two cents to this.
Let's have an example from "production code" (or something that comes kind of close):
Let's say we have something that dispatches actions. A control center UI for example.
This UI needs to get a list of things that are currently able to be dispatched. So we define a class that contains the dispatch-information. Let's call it Action. So an Action has some member variables. For simplicity we just have 2, being a std::string name and a std::function<void()> f. Then it has a void activate() which just executes the f member.
So the UI gets a std::vector<Action> supplied. Imagine some functions like:
void push_back(Action toAdd);
Now we have established how it looks from the UI's perspective. No problem so far. But some other guy who works on this project suddenly decides that there are specialized actions that need more information in the Action object, for whatever reason. That could also be solved with lambda captures. This example is not taken 1-1 from the code.
So the guy derives from Action to add his own flavour.
He passes an instance of his home-brewed class to the push_back but then the program goes haywire.
So what happened?
As you might have guessed: the object has been sliced.
The extra information from the instance has been lost, and f is now prone to undefined behaviour.
I hope this example brings light about for those people who can't really imagine things when talking about As and Bs being derived in some manner.
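A compact sketch of that scenario follows. `TimedAction`, `delay_ms`, `kind()`, and the `Ui` wrapper are all hypothetical names chosen for illustration; only `Action`, `name`, `f`, `activate()`, and the by-value `push_back` come from the description above.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct Action {
    std::string name;
    std::function<void()> f;
    virtual ~Action() = default;
    virtual std::string kind() const { return "Action"; }   // hypothetical helper
    void activate() const { if (f) f(); }
};

// Hypothetical specialised action carrying extra data.
struct TimedAction : Action {
    int delay_ms = 0;
    std::string kind() const override { return "TimedAction"; }
};

struct Ui {
    std::vector<Action> actions;
    // Taking the parameter by value copies only the Action part of the argument.
    void push_back(Action toAdd) { actions.push_back(std::move(toAdd)); }
};
```

After pushing a `TimedAction`, the stored element is a plain `Action`: `name` survives, but `delay_ms` and the override are gone. Storing `std::unique_ptr<Action>` in the vector would be one way to avoid this.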

link functions with mismatching signature

I'm playing around with the gcc and g++ compilers and trying to compile some C code with them. My purpose is to see how the compiler/linker ensures that, when a module containing a function declaration is linked with a module containing that function's implementation, the correct functions are linked (in terms of parameters passed and values returned).
for example let's take a look at this code
#include <stdio.h>
extern int foo(int b, int c);
int main()
{
int f = foo(5, 8);
printf("%d",f);
}
After compilation, my symbol table has a symbol for foo, but the ELF file format has no place that describes the arguments taken or the function signature (int(int,int)). So basically, if I write some other code such as this:
char foo(int a, int b, int c)
{
return (char) ( a + b + c );
}
and compile that module, it will also have a symbol called foo. What happens if I link these modules together? I have never thought about this, or about how a compiler would overcome this weakness. I know that g++ generates a prefix for every symbol according to its namespace, but does it also take the signature into account? If anyone has encountered this, it would be great if they could shed some light on the problem.
The problem is solved with name mangling.
In compiler construction, name mangling (also called name decoration)
is a technique used to solve various problems caused by the need to
resolve unique names for programming entities in many modern
programming languages.
It provides a way of encoding additional information in the name of a
function, structure, class or another datatype in order to pass more
semantic information from the compilers to linkers.
The need arises where the language allows different entities to be
named with the same identifier as long as they occupy a different
namespace (where a namespace is typically defined by a module, class,
or explicit namespace directive) or have different signatures (such as
function overloading).
Note the simple example:
Consider the following two definitions of f() in a C++ program:
int f (void) { return 1; }
int f (int) { return 0; }
void g (void) { int i = f(), j = f(0); }
These are distinct functions, with no relation to each other apart
from the name. If they were natively translated into C with no
changes, the result would be an error — C does not permit two
functions with the same name. The C++ compiler therefore will encode
the type information in the symbol name, the result being something
resembling:
int __f_v (void) { return 1; }
int __f_i (int) { return 0; }
void __g_v (void) { int i = __f_v(), j = __f_i(0); }
Notice that g() is mangled even though there is no conflict; name
mangling applies to all symbols.
Wow, I kept exploring and testing this on my own and came up with an answer that quite amazed me.
I wrote the following code and compiled it with gcc:
main.c
#include <stdio.h>
extern int foo(int a, char b);
int main()
{
int g = foo(5, 6);
printf("%d", g);
return 0;
}
foo.c
typedef struct{
int a;
int b;
char c;
char d;
} mystruct;
mystruct foo(int a, int b)
{
mystruct my;
my.a = a;
my.b = a + 1;
my.c = (char) b;
my.d = (char) (b + 1);
return my;
}
I first compiled foo.c to foo.o with gcc and checked the symbol table using readelf; it had an entry called foo.
I then compiled main.c to main.o and checked its symbol table, which also had an entry called foo. I linked the two together and, surprisingly, it worked. Running the resulting executable ended in a segmentation fault, which makes sense: the implementation of foo in foo.o probably expects three parameters (the first being the address of the struct to fill in), and main.o never passes that parameter. The implementation therefore reads memory from main's stack frame that was never set up for it, treats it as an address, and crashes. That's fine.
Next I compiled both modules again with g++ instead of gcc, and the result was amazing. The symbol in foo.o was _Z3fooii, while in main.o it was _Z3fooic. My guess is that the ii suffix means int, int and the ic suffix means int, char, i.e. the parameter types of the function, which lets the linker match a function declaration only with an implementation that takes the same parameters. So I changed my foo declaration in main.c to
extern int foo(int a, int b);
I recompiled and this time got the symbol _Z3fooii in both object files. I linked the two modules again and this time it worked. Running it again ended in a segmentation fault, which also makes sense, because the mangled name does not encode the return type, so the mismatch between int and mystruct goes unchecked. Anyway, this confirmed my original thought: g++ includes the function's parameter types in the symbol name and thereby forces the linker to match each function declaration with an implementation taking the same parameters.
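The guess about the ii and ic suffixes can be checked by demangling the two symbols. The sketch below assumes GCC or Clang, since `abi::__cxa_demangle` from `<cxxabi.h>` is an extension of those toolchains rather than standard C++; the wrapper name `demangle` is made up for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>   // GCC/Clang extension, not part of standard C++
#include <string>

// Demangles an Itanium-ABI symbol name; returns the input unchanged on failure.
inline std::string demangle(const char* mangled) {
    int status = 0;
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr)
        return mangled;
    std::string result(out);
    std::free(out);   // the buffer is allocated with malloc by __cxa_demangle
    return result;
}
```

Passing `"_Z3fooii"` and `"_Z3fooic"` should yield `foo(int, int)` and `foo(int, char)`. Note that neither result mentions a return type. The command-line tool `c++filt` performs the same translation.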

Is it possible to call Rust's struct fields directly from Ruby code without implementing extern "C" getter to the corresponding fields

I am thinking about writing a Ruby gem with Rust. Let's assume I want to create some structs in Rust which are returned to the Ruby code similar to the example here. After getting the Point struct in my Ruby code, I would like to call its attributes directly. Currently I would have to do something like this:
point.rb:
require "fiddle"
require "fiddle/import"
module RustPoint
extend Fiddle::Importer
dlload "./libmain.dylib"
extern "Point* make_point(int, int)"
extern "double get_distance(Point*, Point*)"
extern "int y(Point*)"
extern "int x(Point*)"
end
main.rs:
use std::num::pow;
pub struct Point { x: int, y: int }
#[no_mangle]
pub extern "C" fn make_point(x: int, y: int) -> Box<Point> {
box Point { x: x, y: y }
}
#[no_mangle]
pub extern "C" fn x(p: &Point) -> int {
p.x
}
#[no_mangle]
pub extern "C" fn y(p: &Point) -> int {
p.y
}
and use this in Ruby:
point = RustPoint::make_point(0, 42)
# To get x:
x = RustPoint::x(point)
to get an x value. I would prefer something like:
point = RustPoint::make_point(0, 42)
# To get x:
x = point.x
Does anyone know a library or a way to implement this more easily? I think it would be much nicer if I didn't see a difference in the point object from the Ruby side. It should not make a difference whether this is a C extension, a Ruby object, or written in Rust.
Edit: I want the Rust code to behave like a native extension. So the returned struct should be callable from the Ruby side similar to a C struct using Ruby objects as values. Of course a library would be necessary to handle the Ruby objects in Rust code.
You could wrap the whole thing in a custom delegator:
class RustDelegator
attr_accessor :__delegate_class__, :__delegate__
def method_missing(method_name, *arguments, &block)
__delegate_class__.public_send(method_name, *__rust_arguments__(arguments), &block)
end
def respond_to_missing?(name, include_private = false)
__delegate_class__.respond_to?(name, include_private)
end
private
def __rust_arguments__(arguments)
arguments.unshift(__delegate__)
end
end
class Point < RustDelegator
def initialize(x, y)
self.__delegate_class__ = RustPoint
self.__delegate__ = RustPoint::make_point(x, y)
end
end
p = Point.new(0, 42)
#=> #<Point:0x007fb4a4b5b9d0 @__delegate__=[0, 42], @__delegate_class__=RustPoint>
p.x
#=> 0
p.y
#=> 42
Rust provides a native C interface for struct as well. If you define your struct like this:
#[repr(C)]
pub struct Point {
pub x: i32,
pub y: i32
}
It will behave like the C struct
struct Point
{
int32_t x;
int32_t y;
};
You may then use it in Ruby like any other C struct.
I recommend using the fixed-size int types rather than plain int, because you have no real guarantee that Rust's int is the same size as C's int. If you really need C's int, you should probably use libc::c_int.
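To make the expected layout concrete, the C-side struct above can be checked at compile time. This is only a sketch of the C/C++ half (it says nothing about what the Rust compiler emits); the helper `check_point_roundtrip` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

struct Point {
    std::int32_t x;
    std::int32_t y;
};

// Layout facts a foreign caller would rely on when mirroring this struct.
static_assert(std::is_standard_layout<Point>::value, "Point must be standard layout");
static_assert(sizeof(Point) == 2 * sizeof(std::int32_t), "no padding expected");
static_assert(offsetof(Point, x) == 0, "x is the first field");
static_assert(offsetof(Point, y) == sizeof(std::int32_t), "y follows x directly");

// Runtime spot check that the fields are initialised in declaration order.
inline void check_point_roundtrip() {
    Point p{3, 4};
    assert(p.x == 3 && p.y == 4);
}
```

With fixed-size fields, the offsets are the same regardless of how wide plain `int` happens to be on the platform.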

Ruby FFI how to define default arguments

I am trying to port a native extension of ruby to FFI. The exposed ruby interface is
auto_link(text, mode=:all, link_attr=nil, skip_tags=nil, flags=0) { |link_text| ... }
and the block is optional.
There are two functions in the original c implementation:
int rinku_autolink(
struct buf *ob,
const uint8_t *text,
size_t size,
autolink_mode mode,
unsigned int flags,
const char *link_attr,
const char **skip_tags,
void (*link_text_cb)(struct buf *ob, const struct buf *link, void *payload),
void *payload)
which does the actual work and
static VALUE rb_rinku_autolink(int argc, VALUE *argv, VALUE self)
which deals with the default arguments and block callback stuff.
My question is if I want to expose the same ruby interface in FFI, which one of the above functions should be registered with attach_function, or should I define another c function for FFI? And whichever function to choose how to define the default argument values in attach_function?
The answer is neither. I think you are misunderstanding the point of FFI, or I am misunderstanding your post. If you are porting your native extension to FFI, that should mean that you are getting rid of all the C code in your code base and converting it to Ruby. You should convert the features of the old C methods rinku_autolink() and rb_rinku_autolink() into Ruby, probably a single Ruby method. Then if that Ruby method needs to call some C functions you would use FFI's attach_function method to get access to those.
If that's not what you are doing, could you please explain what your actual goal is and why?
