Is it possible to make custom RenderScript intrinsics?

RenderScript intrinsics are very fast and useful. However, there are situations where we might want to build our own intrinsics, e.g. the current convolution doesn't support the "valid" mode as in MATLAB. It would be very nice to have it. So I'm wondering: is it possible to do so and connect nicely with the Java layer (just like the existing intrinsics)? If it's possible, would you sketch how? Thank you.

No, there's no way to add custom intrinsics right now. In the next release we're planning to add support for clipped intrinsics, in the same way that clipped kernels have operated since 4.3, which would allow you to implement a "valid" mode equivalent to MATLAB's.

Related

Is there a way to determine GPU warp/wavefront/SIMD width on Android?

My question is similar to the question "OpenCL - How do I query for a device's SIMD width?", but I'm wondering whether there's any way to do this outside of OpenCL, CUDA, or anything else that's not really available on Android, which I'm targeting. I am writing an OpenGL ES 3.1 application that makes use of compute shaders, and for certain GPGPU algorithms, such as the efficient parallel reduction described by Nvidia (in the Reduction #5 section), there are optimizations you can make if you know the "warp" (a.k.a. wavefront, a.k.a. SIMD width) size of the GPU the code will be running on. I'm also not sure whether this is consistent enough across Android GPUs to just make a hard-coded assumption without querying anything, or whether there's some table of GPU info I can reference, etc.
I tried Googling whether there is any way to do this in OpenGL, or even in general on Android, but I could not find anything. Is this possible? If not, is there a "recommended" workaround, like assuming some minimum possible warp size in cases where that may still produce a small speed-up?
For OpenGL ES, if the implementation supports the KHR_shader_subgroup extension, you can use glGetIntegerv(SUBGROUP_SIZE_KHR, ...) to get the subgroup size.
https://www.khronos.org/registry/OpenGL/extensions/KHR/KHR_shader_subgroup.txt
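A minimal sketch of that query, under stated assumptions: a current OpenGL ES 3.1 context exists, and the SUBGROUP_SIZE_KHR enum value (0x9532) is copied from the extension spec in case your headers don't define it.

    // Query the subgroup (warp/wavefront/SIMD) size under GL_KHR_shader_subgroup.
    #include <GLES3/gl31.h>
    #include <cstring>

    #ifndef GL_SUBGROUP_SIZE_KHR
    #define GL_SUBGROUP_SIZE_KHR 0x9532 // value from the extension spec
    #endif

    static bool hasExtension(const char* name) {
        GLint n = 0;
        glGetIntegerv(GL_NUM_EXTENSIONS, &n);
        for (GLint i = 0; i < n; ++i) {
            const GLubyte* ext = glGetStringi(GL_EXTENSIONS, i);
            if (ext && std::strcmp(reinterpret_cast<const char*>(ext), name) == 0)
                return true;
        }
        return false;
    }

    // Returns the subgroup size, or -1 if the extension is unavailable.
    GLint querySubgroupSize() {
        if (!hasExtension("GL_KHR_shader_subgroup"))
            return -1;
        GLint size = 0;
        glGetIntegerv(GL_SUBGROUP_SIZE_KHR, &size);
        return size;
    }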
For the sake of completeness: for Vulkan 1.1 you can query the subgroup size from the device properties, via VkPhysicalDeviceSubgroupProperties.subgroupSize.
https://www.khronos.org/blog/vulkan-subgroup-tutorial
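A minimal sketch of the Vulkan 1.1 query, assuming you have already created an instance and picked a VkPhysicalDevice (that setup is omitted):

    #include <vulkan/vulkan.h>

    uint32_t querySubgroupSize(VkPhysicalDevice physicalDevice) {
        VkPhysicalDeviceSubgroupProperties subgroupProps{};
        subgroupProps.sType =
            VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_PROPERTIES;

        VkPhysicalDeviceProperties2 props2{};
        props2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
        props2.pNext = &subgroupProps; // chain the subgroup struct

        vkGetPhysicalDeviceProperties2(physicalDevice, &props2);
        return subgroupProps.subgroupSize; // e.g. 32 or 64 depending on the GPU
    }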

Porting DirectX to OpenGL ES (iPhone)

I have been asked to investigate porting ten-year-old DirectX (v7-9) games to OpenGL ES, initially for the iPhone.
I have never undertaken a game port like this before (and will be hiring someone to do it), but I'd like to understand the process.
Are there any resources/books/blogs that will help me in understanding the process?
Are there any projects like Mono that can accomplish this?
TBH, a porting job like this is involved but fairly easy.
First you start by replacing all the DirectX calls with "stubs" (i.e. empty functions). You do this until you can get the software to compile. Once it compiles, you start implementing all the stub functions. There will be a number of gotchas along the way, but it's worth doing.
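To make that concrete, here is a hedged sketch of the stub stage in C++; the interface and method names (IRenderDevice, DrawIndexed, and so on) are hypothetical placeholders for whatever D3D wrapper the game actually uses, not a real DirectX or GLES API:

    #include <cstdio>

    class IRenderDevice {
    public:
        virtual ~IRenderDevice() = default;

        // Step 1: every method is an empty stub so the project compiles and links.
        virtual void Clear(float /*r*/, float /*g*/, float /*b*/) {}
        virtual void SetTexture(int /*stage*/, unsigned /*textureId*/) {}

        // Step 2: implement the stubs one by one with GLES calls. Logging
        // which stubs still get hit tells you what to port next.
        virtual void DrawIndexed(int indexCount) {
            std::printf("STUB: DrawIndexed(%d)\n", indexCount);
        }
    };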
If you need to port to and support phones earlier than the iPhone 3GS, you have a more complex task, as that hardware only supports GLES 1, which is fixed-function only. You will have to "emulate" the shaders somehow. On mobile platforms I have, in the past, written assembler code that performs "vertex shading" directly on the vertex data. Pixel shading is often more complicated, but you can usually provide enough information through the "vertex shading" to get this going. Some graphical features you may just have to drop.
Later versions of the iPhone use GLES 2, so you have access to GLSL ... ATI have written, and Aras P of Unity3D fame has extended, software that will port HLSL code to GLSL.
Once you have done all this you get on to the optimisation stage. You will probably find that your first pass isn't very efficient. This is perfectly normal. At this point you can look at the code from a higher level and see how you can move code around and do things differently to get best performance.
In summary: Your first step will be to get the code to compile without DirectX. Your next step will be the actual porting of DirectX calls to OpenGL ES calls. Finally you will want to refactor the remaining code for best performance.
(P.S: I'd be happy to do the porting work for you. Contact me through my linkedin page in my profile ;)).
Not a complete answer, but in the hope of helping a little...
I'm not aware of anything targeting OpenGL ES specifically, but Cedega, Cider and VirtualBox, amongst others, provide translation of DirectX calls to OpenGL calls, and OpenGL ES is, broadly speaking, OpenGL with a lot of very rarely used bits and some slower and redundant parts removed. So it would probably be worth at least investigating those products; at least VirtualBox is open source.
The SGX part in the iPhone 3GS onwards has a fully programmable pipeline, making it equivalent to a DirectX 10 part, so the hardware is there. The older MBX is fixed pipeline with the dot3 extension but no cube maps and only two texture units. It also has the matrix palette extension, so you can do good animation and pretty good lighting if multiple passes is acceptable.

v8 is too slow for my purpose

I'm working on a music visualization plugin for libvisual. It's an AVS clone -- AVS being from Winamp. Right now I have a superscope plugin. This element has 4 scripts, and "point" is run at every pixel. You can imagine that it has to be rather fast. The original libvisual avs clone had a JIT compiler that was really fast, but it had some bugs and wasn't fully implemented, so I decided to try v8. Well, v8 is too slow running the compiled script at every pixel. Is there any other script engine that would be pretty fast for this purpose?
If you are running your updates at a per-pixel level, I would suggest keeping an off-screen, in-memory representation of the screen and updating the screen as a whole, not each individual pixel. This is a common issue with bitmap updates in general, not V8 per se. I don't know enough about the specific environment you are working in to be much help, only that, as I said, trying to update individual pixels against a UI canvas one at a time is a common performance issue. If you can render to an offline/offscreen representation of your canvas/UI surface and then update it all at once, your performance will be much better.
Also, there will be some dependencies on how your event model is worked out. If this doesn't work well, you may need to bring this logic into a compiled COM object or something, but any per-pixel update scheme will run into the same issues. Not saying you are doing this; just noting again that this is the most common issue with this type of problem.
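A minimal C++ sketch of the off-screen approach described above; the presentFrame function and the per-pixel expression are assumptions standing in for your platform's one-shot blit and the compiled "point" script:

    #include <cstdint>
    #include <vector>

    constexpr int W = 640, H = 480;

    void presentFrame(const std::uint32_t* pixels, int w, int h); // platform blit, assumed

    void renderFrame(std::vector<std::uint32_t>& backBuffer) {
        // Run the per-pixel work against plain memory, not the UI surface.
        for (int y = 0; y < H; ++y)
            for (int x = 0; x < W; ++x)
                backBuffer[y * W + x] =
                    0xFF000000u | static_cast<std::uint32_t>((x ^ y) & 0xFF);

        // One bulk update instead of W*H individual canvas writes.
        presentFrame(backBuffer.data(), W, H);
    }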
Sounds like you need to use native code, or maybe a Java applet (not that I recommend a Java applet; use one only if you are in full control of the client environment).

How does a marker-based augmented reality algorithm (like ARToolKit's) work?

For my job I've been using a Java version of ARToolKit (NyARToolkit). So far it has proven good enough for our needs, but my boss is starting to want the framework ported to other platforms such as the web (Flash, etc.) and mobile. While I suppose I could use other ports, I'm increasingly annoyed by not knowing how the kit works and, beyond that, by some of its limitations. Later I'll also need to extend the kit's abilities to add things like interaction (virtual buttons on cards, etc.), which as far as I've seen aren't supported in NyARToolkit.
So basically, I need to replace ARToolKit with a custom marker detector (and in NyARToolkit's case, try to get rid of JMF and use a better solution via JNI). However, I don't know how these detectors work. I know about 3D graphics and I've built a nice framework around it, but I need to know how to build the underlying tech :-).
Does anyone know any sources on how to implement a marker-based augmented reality application from scratch? When searching on Google I only find "applications" of AR, not the underlying algorithms :-/.
'From scratch' is a relative term. Truly doing it from scratch, without using any pre-existing vision code, would be very painful and you wouldn't do a better job of it than the entire computer vision community.
However, if you want to do AR with existing vision code, this is more reasonable. The essential sub-tasks are:
Find the markers in your image or video.
Make sure they are the ones you want.
Figure out how they are oriented relative to the camera.
The first task is keypoint localization. Techniques for this include SIFT keypoint detection, the Harris corner detector, and others. Some of these have open-source implementations; I think OpenCV has the Harris corner detector in the function GoodFeaturesToTrack.
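For example, a minimal keypoint-localization sketch with OpenCV's C++ API; the parameter values are illustrative defaults, not taken from this answer:

    #include <opencv2/imgproc.hpp>
    #include <vector>

    std::vector<cv::Point2f> findCorners(const cv::Mat& gray) {
        std::vector<cv::Point2f> corners;
        cv::goodFeaturesToTrack(gray, corners,
                                /*maxCorners=*/200,
                                /*qualityLevel=*/0.01,
                                /*minDistance=*/10.0,
                                cv::noArray(),              // no mask
                                /*blockSize=*/3,
                                /*useHarrisDetector=*/true, // Harris variant
                                /*k=*/0.04);
        return corners;
    }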
The second task is making region descriptors. Techniques for this include SIFT descriptors, HOG descriptors, and many many others. There should be an open-source implementation of one of these somewhere.
The third task is also done by keypoint localizers. Ideally you want an affine transformation, since this will tell you how the marker is sitting in 3-space. The Harris affine detector should work for this. For more details go here: http://en.wikipedia.org/wiki/Harris_affine_region_detector
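As a concrete, swapped-in alternative to the Harris affine detector for this third task: if you already have matched marker corners, OpenCV's findHomography recovers how the flat marker maps into the image, and decomposing that homography with the camera intrinsics yields the marker's rotation and translation. A hedged sketch:

    #include <opencv2/calib3d.hpp>
    #include <vector>

    cv::Mat markerHomography(const std::vector<cv::Point2f>& templateCorners,
                             const std::vector<cv::Point2f>& imageCorners) {
        // RANSAC makes the estimate robust to a few bad correspondences.
        return cv::findHomography(templateCorners, imageCorners, cv::RANSAC);
    }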

Image Recognition

I'd like to do some work with the nitty-gritty of computer imaging. I'm looking for a way to read single pixels of data, analyze them programmatically, and change them. What is the best language to use for this (Python, C++, Java...)? What is the best file format?
I don't want any super fancy software/APIs... I'm looking for the bare basics.
If you need speed (and you'll probably always want speed with image processing), you definitely have to work with raw pixel data.
Java has a real disadvantage here: you cannot access memory directly, which makes pixel access quite slow compared to direct memory access.
C++ is definitely the language of choice for production image processing. But you can, for example, also use C#, as it allows unsafe code in specific areas. (Take a look at the Scan0 pointer property of the BitmapData class.)
I've used C# successfully for image processing applications, and they are definitely much faster than their Java counterparts.
I would not use any scripting language or Java for such a purpose.
It's very easy to manipulate the large multi-dimensional or complex arrays of pixel information that make up pictures using high-level languages such as Python. There's a library called PIL (the Python Imaging Library) that is quite useful and will let you do general filters and transformations (change the brightness, soften, desaturate, crop, etc.) as well as manipulate the raw pixel data.
It is the easiest and simplest image library I've used to date and can be extended to do whatever it is you're interested in (edge detection in very little code, for example).
I studied Artificial Intelligence and Computer Vision, so I know pretty well the kinds of tools that are used in this field.
Basically: you can use whatever you want as long as you know how it works behind the scenes.
Now, depending on what you want to achieve, you can use:
C: but you will lose a lot of time on bug hunting and memory management when implementing your algorithms. So theoretically this is the fastest language for this kind of job, but if your algorithms are not computationally efficient (in terms of complexity), or if you lose too much time chasing bugs, it's clearly not worth it. I would advise implementing your application in another language first; you can always optimize small parts of your code with C bindings later.
Octave/MATLAB: a very efficient language, almost as efficient as C, in which you can write very elegant and succinct algorithms. If you are into vectorization and matrix and linear operations, you should go with it. However, you won't be able to develop a whole application in this language; it's focused on algorithms, though you can always develop an interface in another language later.
Python: an all-in-one, elegant and accessible language, used in very large-scale applications such as Google's and Facebook's. You can do pretty much anything you want with Python, any kind of application. It is perfectly suited if you want to build a full application (with client interaction and all, not only algorithms), or if you want to quickly draft a prototype using existing libraries, since Python has a very large set of high-quality libraries, like OpenCV. However, if you only want to work on algorithms, you should probably use Octave/MATLAB.
The answer that was selected as the solution is very biased, and you should be careful about this kind of archaic advice.
Nowadays, hardware is cheaper than wetware (humans), so you should use languages that let you produce results faster, even at the cost of a few CPU cycles or some memory.
Also, a lot of people tend to think that as long as you implement your software in C/C++, you have found the Holy Grail of speed. This is just not true: first, algorithmic complexity matters a lot more than the language you are using (a bad algorithm will never beat a better algorithm, even if implemented in the slowest language in the universe), and second, high-level languages nowadays do a lot of caching and speed optimization for you, which can make your program run even faster than in C/C++.
Of course, you can always do all of the above in C/C++, but how much of your time are you willing to waste reinventing the wheel?
Not only will C/C++ be faster, but most of the image processing sample code you find out there will be in C as well, so it will be easier to incorporate things you find.
If you are looking to do numerical work on your images (think matrices) and you're into Python, check out http://www.scipy.org/PyLab - it basically gives you the ability to do MATLAB-style work in Python; a buddy of mine swears by it.
(This might not apply to the OP, who only wanted the bare basics, but now that the speed issue has been brought up, I do need to write this, just for the record.)
If you really need speed, it's better to forget about working at the pixel-by-pixel level and instead see whether the operations you need to perform can be vectorized. For example, for your C/C++ code you could use the excellent Intel IPP library (no, I don't work for Intel).
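As a minimal illustration of the whole-buffer style (plain C++, not IPP, whose exact calls aren't shown here): one tight loop over contiguous memory that the compiler can auto-vectorize, instead of a per-pixel getter/setter call per pixel.

    #include <cstddef>
    #include <cstdint>

    // Brighten an 8-bit grayscale image in place, saturating at 255.
    void brighten(std::uint8_t* pixels, std::size_t count, int delta) {
        for (std::size_t i = 0; i < count; ++i) {
            int v = pixels[i] + delta;
            pixels[i] = static_cast<std::uint8_t>(v > 255 ? 255 : v);
        }
    }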
It depends a little on what you're trying to do.
If runtime speed is your issue, then C++ is the best way to go.
If speed of development is an issue, though, I would suggest looking at Java. You said that you wanted low-level manipulation of pixels, which Java will do for you. But the other thing that might be an issue is the handling of the various file formats. Java has some very nice APIs for reading and writing various image formats to file (in particular the Java2D library; you can choose to ignore the higher levels of the API).
If you do go for the C++ option (or Python, come to think of it), I would again suggest using a library to get you over the startup issues of reading and writing files. I've previously had success with libgd.
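A hedged sketch of what the libgd route looks like (it's a C API, callable from C++); the example assumes a truecolor PNG, since palette images would need converting first:

    #include <gd.h>
    #include <cstdio>

    // Load a PNG, invert every pixel, and write the result back out.
    bool invertPng(const char* inPath, const char* outPath) {
        std::FILE* in = std::fopen(inPath, "rb");
        if (!in) return false;
        gdImagePtr im = gdImageCreateFromPng(in); // truecolor for RGB(A) PNGs
        std::fclose(in);
        if (!im) return false;

        for (int y = 0; y < gdImageSY(im); ++y)
            for (int x = 0; x < gdImageSX(im); ++x) {
                int c = gdImageGetPixel(im, x, y);
                gdImageSetPixel(im, x, y, gdTrueColor(
                    255 - gdTrueColorGetRed(c),
                    255 - gdTrueColorGetGreen(c),
                    255 - gdTrueColorGetBlue(c)));
            }

        std::FILE* out = std::fopen(outPath, "wb");
        if (!out) { gdImageDestroy(im); return false; }
        gdImagePng(im, out);
        std::fclose(out);
        gdImageDestroy(im);
        return true;
    }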
What language do you know the best? To me, this is the real question.
If you're going to spend months and months learning one particular language, then there's no real advantage in using Python or Java just for their supposed development speed.
I'm particularly proficient in C++ and I think that for this particular task I can be as speedy as a Java programmer, for example. With the aid of some good library (OpenCV comes to mind) you can create anything you need in a matter of a couple of lines of C++ code, really.
Short answer: C++ and OpenCV
Short answer? I'd say C++; you have far more flexibility in manipulating raw chunks of memory than in Python or Java.
