I have found some code in PyCXX that may be buggy.
Is it indeed a bug, and if so, what is the right way to fix it?
Here is the problem:
struct PythonClassInstance
{
PyObject_HEAD
ExtObjBase* m_pycxx_object;
}
:
{
:
table->tp_new = extension_object_new; // PyTypeObject
:
}
:
static PyObject* extension_object_new(
PyTypeObject* subtype, PyObject* args, PyObject* kwds )
{
PythonClassInstance* o = reinterpret_cast<PythonClassInstance *>
( subtype->tp_alloc(subtype,0) );
if( ! o )
return nullptr;
o->m_pycxx_object = nullptr;
PyObject* self = reinterpret_cast<PyObject* >( o );
return self;
}
Now PyObject_HEAD expands to "PyObject ob_base;", so PythonClassInstance simply extends PyObject with one extra pointer (which will point to PyCXX's representation of this PyObject).
tp_alloc allocates memory for storing a PyObject.
The code then casts this pointer to a PythonClassInstance*, laying claim to an extra 4 (or 8?) bytes that it does not own!
And then it sets this extra memory to 0.
This looks very dangerous, and I'm surprised the bug has gone unnoticed. The risk is that some future object will get placed in this location (that is meant to be storing the ExtObjBase*).
How to fix it?
PythonClassInstance foo{};
PyObject* tmp = subtype->tp_alloc(subtype,0);
// !!! memcpy sizeof(PyObject) bytes starting from location tmp into location (void*)foo
But I think now maybe I need to release tmp, and I don't think I should be playing with memory directly like this. I feel like it could be jeopardising Python's memory management/garbage collection inbuilt machinery.
The other option is that maybe I can persuade tp_alloc to allocate 4 extra bytes (or is it 8 now; enough for a pointer) by passing in 1 instead of 0.
Documentation says this second parameter is "Py_ssize_t nitems" and:
If the type’s tp_itemsize is non-zero, the object’s ob_size field
should be initialized to nitems and the length of the allocated memory
block should be tp_basicsize + nitems*tp_itemsize, rounded up to a
multiple of sizeof(void*); otherwise, nitems is not used and the
length of the block should be tp_basicsize.
So it looks like I should be setting:
table->tp_itemsize = sizeof(void*);
:
PyObject* tmp = subtype->tp_alloc(subtype,1);
EDIT: just tried this and it causes a crash
But then the documentation goes on to say:
Do not use this function to do any other instance initialization, not
even to allocate additional memory; that should be done by tp_new.
Now I'm not sure whether this code belongs in tp_new or tp_init.
Related:
Passing arguments to tp_new and tp_init from subtypes in Python C API
Python C-API Object Allocation
The code is correct.
As long as the PyTypeObject for the extension object is properly initialized it should work.
The base class tp_alloc receives subtype so it should know how much memory to allocate by checking the tp_basicsize member.
This is a common Python C/API pattern, as demonstrated in the tutorial.
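To make the argument concrete without pulling in Python.h, here is a toy model of the pattern. Everything in it (ToyObject, ToyType, toy_alloc, toy_new) is a hypothetical stand-in and not the CPython API; the only assumption carried over from the docs quoted in the question is that the allocator sizes the block from the type's basicsize rather than from sizeof the base struct.

```cpp
#include <cstddef>
#include <cstdlib>

struct ToyType;  // hypothetical stand-in for PyTypeObject

// Hypothetical stand-in for PyObject: just a header.
struct ToyObject {
    std::size_t refcount;
    const ToyType* type;
};

struct ToyType {
    std::size_t basicsize;  // plays the role of tp_basicsize
};

// Header plus one extra pointer, like PythonClassInstance.
struct ToyInstance {
    ToyObject head;
    void* payload;
};

// Plays the role of tp_alloc: the block size comes from the *type*,
// so a subtype with a larger basicsize gets a larger, zeroed block.
ToyObject* toy_alloc(const ToyType* type) {
    void* mem = std::calloc(1, type->basicsize);
    if (!mem) return nullptr;
    ToyObject* obj = static_cast<ToyObject*>(mem);
    obj->refcount = 1;
    obj->type = type;
    return obj;
}

// Plays the role of extension_object_new: the cast is fine as long as
// the type was registered with basicsize == sizeof(ToyInstance).
ToyInstance* toy_new(const ToyType* subtype) {
    ToyInstance* inst = reinterpret_cast<ToyInstance*>(toy_alloc(subtype));
    if (!inst) return nullptr;
    inst->payload = nullptr;
    return inst;
}
```

The extra pointer is only "unowned" memory if the type was registered with a basicsize smaller than the instance struct, which is why the initialization of the type object is the thing to check.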
Actually this is a (minor/harmless) bug in PyCXX
SO would like to convert this answer to a comment, which makes no sense: a comment can't be awarded the green tick of completion, so I have to ramble in order to qualify it as an answer. blerh.
I have a boost::multiprecision::cpp_int in big endian and have to change it to little endian. How can I do that? I tried with boost::endian::conversion but that did not work.
boost::multiprecision::cpp_int bigEndianInt("0xe35fa931a0000");
boost::multiprecision::cpp_int littleEndianInt;
littleEndianInt = boost::endian::endian_reverse(bigEndianInt);
The memory layout of boost multi-precision types is implementation detail. So you cannot assume much about it anyways (they're not supposed to be bitwise serializable).
Just read a random section of the docs:
MinBits
Determines the number of Bits to store directly within the object before resorting to dynamic memory allocation. When zero, this field is determined automatically based on how many bits can be stored in union with the dynamic storage header: setting a larger value may improve performance as larger integer values will be stored internally before memory allocation is required.
It's not immediately clear that you have any chance at some level of "normal int behaviour" in memory layout. The only exception would be when MinBits==MaxBits.
Indeed, we can static_assert that the size of cpp_int with such backend configs match the corresponding byte-sizes.
It turns out that there's even a promising tag in the backend base-class to indicate "triviality" (this is truly promising): trivial_tag, so let's use it:
Live On Coliru
#include <boost/multiprecision/cpp_int.hpp>
namespace mp = boost::multiprecision;
template <int bits> using simple_be =
mp::cpp_int_backend<bits, bits, mp::unsigned_magnitude>;
template <int bits> using my_int =
mp::number<simple_be<bits>, mp::et_off>;
using my_int8_t = my_int<8>;
using my_int16_t = my_int<16>;
using my_int32_t = my_int<32>;
using my_int64_t = my_int<64>;
using my_int128_t = my_int<128>;
using my_int192_t = my_int<192>;
using my_int256_t = my_int<256>;
template <typename Num>
constexpr bool is_trivial_v = Num::backend_type::trivial_tag::value;
int main() {
static_assert(sizeof(my_int8_t) == 1);
static_assert(sizeof(my_int16_t) == 2);
static_assert(sizeof(my_int32_t) == 4);
static_assert(sizeof(my_int64_t) == 8);
static_assert(sizeof(my_int128_t) == 16);
static_assert(is_trivial_v<my_int8_t>);
static_assert(is_trivial_v<my_int16_t>);
static_assert(is_trivial_v<my_int32_t>);
static_assert(is_trivial_v<my_int64_t>);
static_assert(is_trivial_v<my_int128_t>);
// however it doesn't scale
static_assert(sizeof(my_int192_t) != 24);
static_assert(sizeof(my_int256_t) != 32);
static_assert(not is_trivial_v<my_int192_t>);
static_assert(not is_trivial_v<my_int256_t>);
}
Concluding: you can have trivial int representation up to a certain point, after which you get the allocator-based dynamic-limb implementation no matter what.
Note that using unsigned_packed instead of unsigned_magnitude representation never leads to a trivial backend implementation.
Note that triviality might depend on compiler/platform choices (e.g. it's likely that the 128-bit case, my_int128_t above, uses some builtin compiler/standard library support on GCC).
Given this, you MIGHT be able to pull off what you wanted to do with hacks IF your backend configuration supports triviality. Sadly, I think it requires you to manually overload endian_reverse for the 128-bit case, because the GCC builtins did not include __builtin_bswap128 when this was written, nor does Boost Endian define one.
I'd suggest working off the information here: How to make GCC generate bswap instruction for big endian store without builtins?
Final Demo (not complete)
#include <boost/multiprecision/cpp_int.hpp>
#include <boost/endian/buffers.hpp>
#include <iostream>
namespace mp = boost::multiprecision;
namespace be = boost::endian;
template <int bits> void check() {
using T = mp::number<mp::cpp_int_backend<bits, bits, mp::unsigned_magnitude>, mp::et_off>;
static_assert(sizeof(T) == bits/8);
static_assert(T::backend_type::trivial_tag::value);
be::endian_buffer<be::order::big, T, bits, be::align::no> buf;
buf = T("0x0102030405060708090a0b0c0d0e0f00");
std::cout << std::hex << buf.value() << "\n";
}
int main() {
check<128>();
}
(Changing be::order::big to be::order::native obviously makes it compile. The other way to complete it would be to have an ADL accessible overload for endian_reverse for your int type.)
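For the missing 128-bit reversal itself, a portable sketch could look like the following. U128, byte_reverse64 and byte_reverse128 are hypothetical names, not Boost types or functions; the struct merely models a 128-bit value as two 64-bit halves, and wiring this into Boost.Endian would still need an endian_reverse overload for your actual integer type.

```cpp
#include <cstdint>

// Hypothetical 128-bit value modelled as two 64-bit halves.
struct U128 {
    std::uint64_t hi;  // most significant 64 bits
    std::uint64_t lo;  // least significant 64 bits
};

// Reverse the byte order of a 64-bit value using only shifts and masks.
constexpr std::uint64_t byte_reverse64(std::uint64_t v) {
    // swap adjacent bytes
    v = ((v & 0x00FF00FF00FF00FFull) << 8) | ((v >> 8) & 0x00FF00FF00FF00FFull);
    // swap adjacent 16-bit groups
    v = ((v & 0x0000FFFF0000FFFFull) << 16) | ((v >> 16) & 0x0000FFFF0000FFFFull);
    // swap the two 32-bit halves
    return (v << 32) | (v >> 32);
}

// Reversing 128 bits = reverse each half and exchange the halves.
constexpr U128 byte_reverse128(U128 v) {
    return U128{byte_reverse64(v.lo), byte_reverse64(v.hi)};
}
```

Compilers typically recognise the shift-and-mask pattern and emit a bswap instruction, which is the point of the linked question.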
This is both trivial and, in the general case, unanswerable. Let me explain:
For a general N-bit integer, where N is a large number, there is unlikely to be any well defined byte order, indeed even for 64 and 128 bit integers there are more than 2 possible orders in use: https://en.wikipedia.org/wiki/Endianness#Middle-endian.
On any platform, with any native endianness you can always extract the bytes of a cpp_int, the first example here: https://www.boost.org/doc/libs/1_73_0/libs/multiprecision/doc/html/boost_multiprecision/tut/import_export.html#boost_multiprecision.tut.import_export.examples shows you how. When exporting bytes like this, they are always most significant byte first, so you can subsequently rearrange them how you wish. You should not however, rearrange them and load them back into a cpp_int as the class won't know what to do with the result!
If you know that the value is small enough to fit into a native integer type, then you can simply cast to the native integer and use a system API on the result. As in endian_reverse(static_cast<int64_t>(my_cpp_int)). Again, don't assign the result back into a cpp_int as it requires native byte order.
If you wish to check whether a value is small enough to fit in an N-bit integer for the approach above, you can use the msb function, which returns the index of the most significant bit in the cpp_int. Add one to that to obtain the number of bits used, filter out the zero case, and the code looks like:
unsigned bits_used = my_cpp_int.is_zero() ? 0 : msb(my_cpp_int) + 1;
Note that all of the above use completely portable code - no hacking of the underlying implementation is required.
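To illustrate the idea without depending on Boost, here is the same logic on a plain std::uint64_t. The helper names bits_used, export_bytes_msb_first and to_lsb_first are made up for this sketch; with a cpp_int you would use msb and the export facility from the linked tutorial instead.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Number of bits needed to represent v (0 for v == 0),
// the analogue of "msb(x) + 1 with the zero case filtered out".
unsigned bits_used(std::uint64_t v) {
    unsigned bits = 0;
    while (v != 0) {
        ++bits;
        v >>= 1;
    }
    return bits;
}

// Extract the bytes arithmetically, most significant byte first.
// This works identically on any host endianness.
std::vector<std::uint8_t> export_bytes_msb_first(std::uint64_t v) {
    std::vector<std::uint8_t> out;
    const unsigned nbytes = (bits_used(v) + 7) / 8;
    for (unsigned i = nbytes; i-- > 0;) {
        out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
    }
    return out;
}

// Rearranging afterwards is just a sequence operation on the bytes.
std::vector<std::uint8_t> to_lsb_first(std::vector<std::uint8_t> bytes) {
    std::reverse(bytes.begin(), bytes.end());
    return bytes;
}
```

The reordered bytes are meant for writing to a file or the wire, not for loading back into the integer type.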
This is a CodeChef problem that I recently came across. The answer seems to be right for every test case where the input string is shorter than 128 bytes, as it passes a couple of test cases. For every input longer than 128 bytes it prints a large value which seems to be garbage.
std::string str;
std::cin>>str;
vector<pair<char,int>> v;
v.push_back(make_pair('C',0));
v.push_back(make_pair('H',0));
v.push_back(make_pair('E',0));
v.push_back(make_pair('F',0));
int i=0;
while(1)
{
if(str[i]=='C')
v['C'].second++;
else if (str[i]=='H')
{
v['H'].second++;
v['C'].second--;
}
else if (str[i]=='E')
{
v['E'].second++;
v['C'].second--;
}
else if (str[i]=='F')
v['F'].second++;
else
break;
i++;
Even enclosing the same code within
/*reading the string values from a file and not console*/
std::string input;
std::ifstream infile("input.txt");
while(getline(infile,input))
{
istringstream in(input);
string str;
in>>str;
/* above code goes here */
}
generates the same result. I am not looking for any solution(s) or hint(s) to get to the right answer, as I want to test the correctness of my algorithm. But I want to know why this happens, as I am new to vector containers.
-Regards.
if(str[i]=='C')
v['C'].second++;
You're modifying v[67]
... which is not contained in your vector, and thus either invalid memory or uninitialized
You seem to be trying to use a vector as an associative array. There is already such a structure in C++: a std::map. Use that instead.
By writing v['C'] you actually access index 67 (the ASCII code of 'C', given that 'A' is 65) of a container having only 4 items. That is undefined behavior, and what you actually observe depends on the compiler and mode (debug vs release).
What you probably wanted to use was map i.e. map<char,int> v; instead of vector<pair<char,int>> v; and simple v['C']++; instead of v['C'].second++;
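A minimal sketch of the map-based version, for illustration only. It just counts the four letters and stops at the first other character like the loop in the question; it deliberately does not reproduce the asker's algorithm, and the function name count_chef_letters is made up.

```cpp
#include <map>
#include <string>

// Count occurrences of 'C', 'H', 'E' and 'F', stopping at the first
// character that is none of them (mirrors the "else break" in the question).
std::map<char, int> count_chef_letters(const std::string& str) {
    std::map<char, int> counts{{'C', 0}, {'H', 0}, {'E', 0}, {'F', 0}};
    for (char ch : str) {
        auto it = counts.find(ch);  // keyed by the character itself
        if (it == counts.end()) {
            break;
        }
        ++it->second;
    }
    return counts;
}
```

Iterating over the string with a range-based for also avoids reading past the end of str, which the while(1) loop only survives because str[str.size()] happens to be '\0'.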
I'm looking to integrate a scripting engine in my C/C++ program. Currently, I am looking at Google V8.
How do I efficiently handle 64 bit values in V8? My C/C++ program uses 64 bit values extensively for keeping handles/pointers. I don't want them separately allocated on the heap. There appears to be a V8::External value type. Can I assign it to a Javascript variable and use it as a value type?
function foo() {
var a = MyNativeFunctionReturningAnUnsigned64BitValue();
var b = a; // Hopefully, b is a stack allocated value capable of
// keeping a 64 bit pointer or some other uint64 structure.
MyNativeFunctionThatAcceptsAnUnsigned64BitValue(b);
}
If it is not possible in V8, how about SpiderMonkey? I know that Duktape (Javascript engine) has a non-ECMAScript-standard 64 bit value type (stack allocated) to host pointers, but I would assume that other engines also want to keep track of external pointers from within their objects.
No it's not possible and I'm afraid duktape could be violating the spec unless it took some great pains to ensure it's not observable.
You can store pointers in objects so to store 64-bit ints directly on an object you need pointers to have the same size:
Local<FunctionTemplate> function_template = FunctionTemplate::New(isolate);
// Instances of this function have room for 1 internal field
function_template->InstanceTemplate()->SetInternalFieldCount(1);
Local<Object> object = function_template->GetFunction()->NewInstance();
static_assert(sizeof(void*) == sizeof(uint64_t));
uint64_t integer = 1;
object->SetAlignedPointerInInternalField(0, reinterpret_cast<void*>(integer));
uint64_t result = reinterpret_cast<uint64_t>(object->GetAlignedPointerInInternalField(0));
This is of course far from being efficient.
I've written a simple Bag class. A Bag is filled with a fixed ratio of Temperature enums. It allows you to grab one at random and automatically refills itself when empty. It looks like this:
class Bag {
var items = Temperature[]()
init () {
refill()
}
func grab()-> Temperature {
if items.isEmpty {
refill()
}
var i = Int(arc4random()) % items.count
return items.removeAtIndex(i)
}
func refill() {
items.append(.Normal)
items.append(.Hot)
items.append(.Hot)
items.append(.Cold)
items.append(.Cold)
}
}
The Temperature enum looks like this:
enum Temperature: Int {
case Normal, Hot, Cold
}
My GameScene:SKScene has a constant instance property bag:Bag. (I've tried with a variable as well.) When I need a new temperature I call bag.grab(), once in didMoveToView and when appropriate in touchesEnded.
Randomly this call crashes on the if items.isEmpty line in Bag.grab(). The error is EXC_BAD_INSTRUCTION. Checking the debugger shows items is size=1 and [0] = (AppName.Temperature) <invalid> (0x10).
Edit Looks like I don't understand the debugger info. Even valid arrays show size=1 and unrelated values for [0] =. So no help there.
I can't get it to crash isolated in a Playground. It's probably something obvious but I'm stumped.
The function arc4random returns a UInt32. If you get a value higher than Int.max, the Int(...) conversion will crash.
Using
Int(arc4random_uniform(UInt32(items.count)))
should be a better solution.
(Blame the strange crash messages in the Alpha version...)
I found that the best way to solve this is by using rand() instead of arc4random()
the code, in your case, could be:
var i = Int(rand()) % items.count
This method will generate a random Int value between the given minimum and maximum
func randomInt(min: Int, max:Int) -> Int {
return min + Int(arc4random_uniform(UInt32(max - min + 1)))
}
The crash that you were experiencing is due to the fact that Swift detected a type inconsistency at runtime.
Since Int != UInt32 you will have to first type cast the input argument of arc4random_uniform before you can compute the random number.
Swift doesn't allow converting from one integer type to another if the value doesn't fit in the target type. E.g. the following code will work okay:
let x = 32
let y = UInt8(x)
Why? Because 32 is a possible value for an int of type UInt8. But the following code will fail:
let x = 332
let y = UInt8(x)
That's because you cannot assign 332 to an unsigned 8 bit int type, it can only take values 0 to 255 and nothing else.
When you do casts in C, the int is simply truncated, which may be unexpected or undesired, as the programmer may not be aware that truncation may take place. So Swift handles things a bit differently here. It will allow such casts as long as no truncation takes place, but if there is truncation, you get a runtime exception. If you think truncation is okay, then you must do the truncation yourself to let Swift know that this is intended behavior; otherwise Swift must assume that it is accidental.
This is even documented (documentation of UnsignedInteger):
Convert from Swift's widest unsigned integer type,
trapping on overflow.
And what you see is the "overflow trapping", which is poorly done as, of course, one could have made that trap actually explain what's going on.
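For contrast, here is what the two behaviours look like on the C side, written in C++ rather than Swift. The function names truncating_u8 and checked_u8 are made up for this sketch; the checked variant only imitates Swift's trap by throwing an exception, it is not how Swift implements it.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// C-style behaviour: the value is silently reduced modulo 256.
std::uint8_t truncating_u8(int x) {
    return static_cast<std::uint8_t>(x);
}

// Swift-like behaviour: refuse the conversion if the value does not fit.
std::uint8_t checked_u8(int x) {
    if (x < 0 || x > std::numeric_limits<std::uint8_t>::max()) {
        throw std::overflow_error("value does not fit in uint8_t");
    }
    return static_cast<std::uint8_t>(x);
}
```

With these, truncating_u8(332) quietly yields 76 (332 - 256), checked_u8(32) yields 32, and checked_u8(332) throws, which corresponds to the crash in the question.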
Assuming that items never has more than 2^32 elements (a bit more than 4 billion), the following code is safe:
var i = Int(arc4random() % UInt32(items.count))
If it can have more than 2^32 elements, you get another problem anyway as then you need a different random number function that produces random numbers beyond 2^32.
This crash is only possible on 32-bit systems. Int changes between 32-bits (Int32) and 64-bits (Int64) depending on the device architecture (see the docs).
UInt32's max is 2^32 − 1. Int64's max is 2^63 − 1, so Int64 can easily handle UInt32.max. However, Int32's max is 2^31 − 1, which means UInt32 can handle numbers greater than Int32 can, and trying to create an Int32 from a number greater than 2^31-1 will create an overflow.
I confirmed this by trying to compile the line Int(UInt32.max). On the simulators and newer devices, this compiles just fine. But I connected my old iPod Touch (32-bit device) and got this compiler error:
Integer overflows when converted from UInt32 to Int
Xcode won't even compile this line for 32-bit devices, which is likely the crash that is happening at runtime. Many of the other answers in this post are good solutions, so I won't add or copy those. I just felt that this question was missing a detailed explanation of what was going on.
This will automatically create a random Int for you:
var i = random() % items.count
i is of Int type, so no conversion necessary!
You can use
Int(rand())
To prevent same random numbers when the app starts, you can call srand()
srand(UInt32(NSDate().timeIntervalSinceReferenceDate))
let randomNumber: Int = Int(rand()) % items.count
I'm new to the world of parallel programming and openmp, so this may be a futile question, but I can't really come up with good answer to what I'm experiencing, so I hope someone will be able to shed some light on the matter.
What I am trying to achieve is to have a private copy of a dynamically allocated matrix (of integers) for every thread that will handle the following parallel section, but as soon as the flow of execution enters said region, the reference to the supposedly private matrix holds a null value.
Is there any limitation of this directive I'm not aware of? Everything seems to work just fine with monodimensional dynamic arrays.
A snippet of the code is the following one...
#define n 10000
int **matrix;
#pragma omp threadprivate(matrix)
int main()
{
matrix = (int**) calloc(n, sizeof(int*));
for(int i=0;i<n;i++) matrix[i] = (int*) calloc(n, sizeof(int));
AdjacencyMatrix(n, matrix);
...
/* Explicitly turn off dynamic threads */
omp_set_dynamic(0);
#pragma omp parallel
{
// From now on, matrix is NULL...
executor_p(matrix, n);
}
....
Look at the OpenMP documentation regarding what happens with the threadprivate directive:
On first entry to a parallel region, data in THREADPRIVATE variables and common blocks should be assumed undefined, unless a COPYIN clause is specified in the PARALLEL directive
There's no guarantee of what value is going to be stored in the matrix variable in the parallel region.
OpenMP can privatise only variables with known storage size. That is, you can have a private copy of an array if it was defined like double matrix[N][M]. In your case, not only is the storage size unknown (a pointer doesn't store the number of elements that it is pointing to), but your matrix is also not a contiguous area in memory; it is a pointer to a list of dynamically allocated rows.
What you would end up with is having a private copy of the top-level pointer, not a private copy of the matrix data itself.
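One way to get a genuinely private matrix per thread is to copy the data explicitly inside the parallel region. The sketch below is C++ rather than the C of the question, and FlatMatrix and shared_survives_private_edits are hypothetical names; it assumes a contiguous row-major layout so that one copy duplicates all the cells. Without OpenMP enabled the pragma is ignored and the block simply runs once.

```cpp
#include <cstddef>
#include <vector>

// Square matrix stored in one contiguous, row-major allocation.
struct FlatMatrix {
    std::size_t n;
    std::vector<int> cells;

    explicit FlatMatrix(std::size_t size) : n(size), cells(size * size, 0) {}

    int& at(std::size_t r, std::size_t c) { return cells[r * n + c]; }
    int at(std::size_t r, std::size_t c) const { return cells[r * n + c]; }
};

// Each thread deep-copies the shared matrix and edits only its copy.
// Returns true if every private edit was visible locally and the shared
// matrix is unchanged afterwards. Requires n > 0.
bool shared_survives_private_edits(std::size_t n) {
    FlatMatrix shared(n);
    shared.at(0, 0) = 7;

    bool all_ok = true;
    #pragma omp parallel reduction(&&:all_ok)
    {
        FlatMatrix local = shared;  // copies the cells, not just a pointer
        local.at(0, 0) = -1;        // private edit
        all_ok = all_ok && (local.at(0, 0) == -1);
    }
    return all_ok && shared.at(0, 0) == 7;
}
```

The cost is one full copy per thread, so for a 10000 x 10000 matrix this is only sensible if the threads really need to modify their own version.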