I have a Metal 2 shader on macOS 10.12 that I am trying to pass an array of ints into, but Xcode is giving me the compile-time error Unknown type name 'array'. Here is the code I am using:
kernel void computeMandelbrot(texture2d<float, access::write> output [[texture(0)]],
                              constant int &maxIterations [[buffer(1)]],
                              const array<int, 10> &hist [[buffer(2)]],
                              uint2 gid [[thread_position_in_grid]])
{
    // Compute Mandelbrot
}
I have also tried using the keyword constant instead of const, but then I get the error Parameter may not be qualified with an address space. I had read that arrays of textures were not supported in Metal on macOS, but I was not sure whether this applied to arrays of other types. Any help would be greatly appreciated, thank you!
A few things:
Metal 2 is not available in macOS 10.12. It is new with 10.13.
array is only usable with textures and samplers.
array is not available on macOS <=10.12. It's available on macOS 10.13+ with Metal 2.
You can declare your parameter as constant int *hist [[buffer(2)]]. It won't have an explicit length; just limit which elements you reference.
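For example, here is a minimal sketch of the kernel from the question rewritten that way (everything except the hist parameter is unchanged):
kernel void computeMandelbrot(texture2d<float, access::write> output [[texture(0)]],
                              constant int &maxIterations [[buffer(1)]],
                              constant int *hist [[buffer(2)]],
                              uint2 gid [[thread_position_in_grid]])
{
    // hist has no declared length; only index the elements you actually
    // bound on the CPU side, e.g. hist[0] through hist[9] for 10 ints.
}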
Edit: I was wrong about array being usable only with textures and samplers. The spec suggests that by introducing it in a section titled "Arrays of Textures and Samplers" and only illustrating its use that way, but the template class itself seems generically useful.
However, it did not come to macOS until Metal 2; in other words, it's only available on macOS 10.13+.
I am currently trying to do some TensorFlow inference (C backend) using Boost::GIL (challenging). I need a few things. I have been able to load my PNG image (rgb8_image_t)
and convert it to rgb32f_image_t.
I still need three things: the raw pointer to the data, the amount of memory allocated, and the dimensions.
For the memory allocated: unfortunately the function total_allocated_size_in_bytes() is private, so I did this:
boost::gil::view(dest).size() * boost::gil::view(dest).num_channels() * sizeof(value_type);
This is valid as long as there is no extra padding for alignment. But is there a nicer alternative?
For the dimensions, I need to match numpy (via Pillow); I hope both libraries use the same memory-layout pattern. From my understanding the data is interleaved and contiguous by default, so it should be fine.
Last, the raw pointer _memory: it is a private data member of the image class with no dedicated accessor. boost::gil::view(dest).row_begin(0) returns an iterator to the first pixel, but I'm not sure how I could get the underlying _memory pointer. Any suggestions?
Thank you very much,
++t
ps: TensorFlow offers a C++ backend; however, it is not available from any package manager, and wrangling Bazel is beyond my strength.
The GIL documentation describes the various memory layouts pretty accurately.
The point of the library, though, is to abstract away the memory layout. If you require a specific representation (planar/interleaved, packed or unpacked), you are doing things "the hard way" as far as the library interface is concerned.
So, I think you can read and convert in one go, e.g. for a jpeg:
gil::rgb32f_image_t img;
gil::image_read_settings<gil::jpeg_tag> settings;
gil::read_and_convert_image("input.jpg", img, settings);
Now getting the raw data is possible:
auto* raw_data = gil::interleaved_view_get_raw_data(view(img));
It happens to be the case that the preferred implementation storage is interleaved, which is likely what you're expecting. If your particular image storage is planar, the call will not compile (and you'd probably want planar_view_get_raw_data(vw, plane_index) instead).
Note that you'll have to reinterpret_cast to float [const]* if you need that, because there is no public interface to get a reference to scoped_channel_value<>::value_. But the BaseChannelValue type is indeed float, and you can assert that the wrapper doesn't add any weight:
static_assert(sizeof(float) == sizeof(raw_data[0]));
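Putting the two together, a minimal sketch of the cast described above (channel_floats is my name, not a GIL API):
auto* raw_data = gil::interleaved_view_get_raw_data(gil::view(img));
// Safe because of the static_assert above: the wrapper is layout-compatible with float.
auto* channel_floats = reinterpret_cast<float*>(raw_data);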
Alternative Approach:
Going the other way around, you can set up your own raw pixel buffer, mount a mutable view over it, and use that as the destination of the initial read/convert:
// get dimension
gil::image_read_settings<gil::jpeg_tag> settings;
auto info = gil::read_image_info("input.jpg", settings).get_info();
// setup raw pixel buffer & view
using pixel = gil::rgb32f_pixel_t;
auto data = std::make_unique<pixel[]>(info._width * info._height);
auto vw = gil::interleaved_view(info._width, info._height, data.get(),
                                info._width * sizeof(pixel));
// load into buffer
read_and_convert_view("input.jpg", vw, settings);
I've actually checked that it works correctly by writing out the resulting view:
//// just for test - doesn't work for 32f, so choose another pixel format
//gil::write_view("output.png", vw, gil::png_tag());
While trying to create an object-oriented OpenACC implementation I stumbled upon this question.
From there I took the code provided by @mat-colgrove at GTC15 (code available at http://www.pgroup.com/lit/samples/gtc15_S5233.tar).
Since I am interested in how to use objects to manage data on the device with OpenACC, I posted another question.
I was quite impressed by the ease of the OpenACCArray::swap function, so I created a small example to test it (see gist).
First I tried to just swap, hoping it would be sufficient to swap the pointers on the host, but this ends in a fatal memory error (presumably because the size and capacity members are not updated on the device).
A safer approach, which I assumed would work, is to update the host, swap the arrays, and then update the device. This runs but produces wrong results.
I am compiling for NVIDIA accelerators.
Looks like this is my fault, since I didn't test the swap routine.
The problem is that while the code swaps the data on the host, the device copies of the objects still point to the old arrays. The fix is to re-attach the lists (i.e. set the objects' device pointers to the correct arrays):
void swap(OpenACCArray<type>& x)
{
    type* tmp_list = list;
    int tmp_size = _size;
    int tmp_capacity = _capacity;
    list = x.list;
    _size = x._size;
    _capacity = x._capacity;
    x.list = tmp_list;
    x._size = tmp_size;
    x._capacity = tmp_capacity;
#ifdef _OPENACC
    #pragma acc update device(_size,_capacity,x._size,x._capacity)
    acc_attach((void**)&list);
    acc_attach((void**)&x.list);
#endif
}
"acc_attach" is a PGI extension that hopefully will be adopted in the OpenACC 3.0 standard.
Thanks for trying things out and let me know if you encounter other issues.
- Mat
Now that we've reached Swift 2.0, I've decided to convert my, as yet unfinished, OS X app to Swift. I'm making progress, but I've run into some issues with using termios and could use some clarification and advice.
The termios struct is treated as a struct in Swift, no surprise there, but what is surprising is that the array of control characters in the struct is now a tuple. I was expecting it to just be an array. As you might imagine, it took me a while to figure this out. Working in a playground, if I do:
var settings:termios = termios()
print(settings)
then I get the correct details printed for the struct.
In Obj-C to set the control characters you would use, say,
cfmakeraw(&settings);
settings.c_cc[VMIN] = 1;
where VMIN is a #define equal to 16 in termios.h. In Swift I have to do
cfmakeraw(&settings)
settings.c_cc.16 = 1
which works, but is a bit more opaque. I would prefer to use something along the lines of
settings.c_cc.vmin = 1
instead, but I can't find any documentation describing the Swift "version" of termios. Does anyone know whether the tuple has pre-assigned names for its elements, or, if not, whether there is a way to assign names after the fact? Should I just create my own tuple with named elements and then assign it to settings.c_cc?
Interestingly, despite the fact that pre-processor directives are not supposed to work in Swift, if I do
print(VMIN)
print(VTIME)
then the correct values are printed and no compiler errors are produced. I'd be interested in any clarification or comments on that. Is it a bug?
The remaining issues have to do with further configuration of the termios.
The definition of cfsetspeed is given as
func cfsetspeed(_: UnsafeMutablePointer<termios>, _: speed_t) -> Int32
and speed_t is typedef'ed as an unsigned long. In Obj-C we'd do
cfsetspeed(&settings, B38400);
but since B38400 is a #define in termios.h, we can no longer do that. Has Apple set up replacement global constants for things like this in Swift, and if so, can anyone tell me where they are documented? The alternative seems to be to just plug in the raw values and lose readability, or to create my own versions of the constants previously defined in termios.h. I'm happy to go that route if there isn't a better choice.
Let's start with your second problem, which is easier to solve.
B38400 is available in Swift; it just has the wrong type, so you have to convert it explicitly:
var settings = termios()
cfsetspeed(&settings, speed_t(B38400))
Your first problem has no "nice" solution that I know of.
Fixed-size arrays are imported to Swift as tuples, and, as far as I know, you cannot address a tuple element with a variable index.
However, Swift preserves the memory layout of structures imported from C, as confirmed by Apple engineer Joe Groff. Therefore you can take the address of the tuple and "rebind" it to a pointer to the element type:
var settings = termios()
withUnsafeMutablePointer(to: &settings.c_cc) { (tuplePtr) -> Void in
    tuplePtr.withMemoryRebound(to: cc_t.self, capacity: MemoryLayout.size(ofValue: settings.c_cc)) {
        $0[Int(VMIN)] = 1
    }
}
(Code updated for Swift 4+.)
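If you need to set several control characters, you could wrap the rebinding in a small helper. A sketch under the same assumptions (the function name setControlCharacter is mine, not part of any API):
func setControlCharacter(_ settings: inout termios, _ index: Int32, _ value: cc_t) {
    withUnsafeMutablePointer(to: &settings.c_cc) { tuplePtr in
        // cc_t is UInt8, so the tuple's byte size equals its element count.
        tuplePtr.withMemoryRebound(to: cc_t.self, capacity: MemoryLayout.size(ofValue: tuplePtr.pointee)) {
            $0[Int(index)] = value
        }
    }
}

// Usage:
var settings = termios()
cfmakeraw(&settings)
setControlCharacter(&settings, VMIN, 1)
setControlCharacter(&settings, VTIME, 0)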
The following simple program produces an EXC_BAD_ACCESS (segmentation fault) when run, and I don't understand why:
#include <OpenGL/gl.h>
int main(void) {
    const GLubyte *strVersion;
    // The next line gives an 'EXC_BAD_ACCESS'
    strVersion = glGetString(GL_VERSION);
}
I'm running in Xcode in OS X 10.6.5, and I'm linking against the OpenGL framework. Any ideas would be appreciated.
You have to create an OpenGL context before you can call gl* functions. There are several ways to do that, for example using GLUT or SDL.
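For example, a minimal sketch using GLUT (link the GLUT framework in addition to OpenGL; the window is created only to obtain a current context):
#include <GLUT/glut.h>
#include <OpenGL/gl.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    glutInit(&argc, argv);
    glutCreateWindow("context"); // creates an OpenGL context and makes it current
    const GLubyte *strVersion = glGetString(GL_VERSION); // now safe to call
    printf("GL_VERSION: %s\n", strVersion);
    return 0;
}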
The C specification declares glGetString as
const GLubyte *glGetString(GLenum name);
so with a current context you should be able to get the version string as follows:
const char *GLVersionString = (const char *)glGetString(GL_VERSION);
// Or better yet, use the GL3 way to get the version number
int OpenGLVersion[2];
glGetIntegerv(GL_MAJOR_VERSION, &OpenGLVersion[0]);
glGetIntegerv(GL_MINOR_VERSION, &OpenGLVersion[1]);
Here is some more basic information on glGetString:
glGetString returns a pointer to a static string describing some aspect of the current GL connection. name can be one of the following:
GL_VENDOR
Returns the company responsible for this GL implementation.
This name does not change from release to release.
GL_RENDERER
Returns the name of the renderer.
This name is typically specific to a particular configuration of a hardware platform.
It does not change from release to release.
GL_VERSION
Returns a version or release number.
GL_SHADING_LANGUAGE_VERSION
Returns a version or release number for the shading language.
GL_EXTENSIONS
Returns a space-separated list of supported extensions to GL.
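For example, once a context is current you could dump all of these (a short sketch; the casts are needed because glGetString returns const GLubyte *):
printf("GL_VENDOR:   %s\n", (const char *)glGetString(GL_VENDOR));
printf("GL_RENDERER: %s\n", (const char *)glGetString(GL_RENDERER));
printf("GL_VERSION:  %s\n", (const char *)glGetString(GL_VERSION));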
I'm keen on using Boost's object_pool class to reuse memory for a set of video frames.
boost::object_pool< VideoFrame > FramePool;
Now, the VideoFrame class has two constructors: the first version takes 4 arguments while the second takes 6.
For every "new" video frame that is allocated, I would like to call the constructor on the object using either the 4- or the 6-parameter version. For example:
//VideoFrame *F = new VideoFrame(IdeckLinkOutput, &R, PixelFormat, FrameFlags);
VideoFrame *F = FramePool.construct(IdeckLinkOutput, &R, PixelFormat, FrameFlags);
Building this on MSVS 2005, I receive the error:
error C2660: 'boost::object_pool<T>::construct' : function does not take 4 arguments
According to the documentation for the construct method of object_pool, "ElementType must have a constructor matching ???; the number of parameters given must not exceed what is supported through pool_construct".
I've seen Boost's page for pool_construct, but I'm not too sure what direction I need to take. The build of Boost on my machine has pool_construct.m4, pool_construct.sh, pool_construct.bat, and pool_construct.inc. The question is what to do with these example files within my own project: would I create my own variation of pool_construct.inc and include it in my project? How would I add the file?
Any tips/recommendations would be much appreciated. Please note that I have installed GNU m4.
zerodefect.
If I look at /usr/include/boost/pool/detail/pool_construct.inc on my Debian machine (sorry, I don't have access to MSVC at the moment), I see it only supports up to 3 constructor arguments.
Messing with m4 as per the documentation to support more than those 3 sounds like a pain compared with simply adding a constructor that bundles enough of the arguments into a single struct or boost::tuple to bring the total passed down to the supported number.
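A sketch of that workaround (the struct name VideoFrameArgs, its field types, and the added constructor are hypothetical; substitute the real types from your 4-argument constructor):
// Bundle the constructor parameters so construct() only needs one argument.
struct VideoFrameArgs
{
    IDeckLinkOutput *output;
    Rect            *r;
    PixelFormatT     pixelFormat;
    FrameFlagsT      frameFlags;
};

// VideoFrame then gains a forwarding constructor, e.g.
//   explicit VideoFrame(VideoFrameArgs const &a); // initializes members from a

VideoFrameArgs args = { IdeckLinkOutput, &R, PixelFormat, FrameFlags };
VideoFrame *F = FramePool.construct(args); // one argument, within the 3 supported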