Where is cutil_math.h in CUDA SDK 5.0? [duplicate]

The CUDA FAQ says:
CUDA defines vector types such as float4, but doesn't include any
operators on them by default. However, you can define your own
operators using standard C++. The CUDA SDK includes a header
"cutil_math.h" that defines some common operations on the vector
types.
However, I cannot find this header in CUDA SDK 5.0. Has it been removed or renamed?
I've found a version of the header here. How is it related to the one that's supposed to come with the SDK?

The cutil functionality was deleted from the CUDA 5.0 Samples (i.e. the "SDK"). You can still download a previous SDK and compile it under CUDA 5; you should then have everything that came with previous SDKs.
The official notice was given by NVIDIA in the CUDA 5.0 release notes (CUDA_Samples_Release_Notes.pdf, installed with the samples). As to why, I imagine NVIDIA's sentiment regarding cutil was something like what is expressed here: "not suitable for use in a real application. It is completely unsupported", but people were using it in real applications anyway. So one way to try to put a stop to that is to delete it, I suppose. That's just speculation.
Note some additional useful info provided in the release notes:
CUTIL has been removed with the CUDA Samples in CUDA 5.0, and replaced
with helper functions found in NVIDIA_CUDA-5.0/common/inc:
helper_cuda.h, helper_cuda_gl.h, helper_cuda_drvapi.h,
helper_functions.h, helper_image.h, helper_math.h, helper_string.h,
helper_timer.h
These helper functions handle CUDA device
initialization, CUDA error checking, string parsing, image file
loading and saving, and timing functions. The CUDA Samples projects no
longer have references and dependencies to CUTIL, and now use these
helper functions going forward.
So you may find useful functions in some of those header files.

In the latest SDK, helper_math.h implements most of the required operators; however, it is still missing logical operators such as OR and AND. A sketch of adding them yourself follows.
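For example, a minimal sketch, assuming a CUDA toolchain (nvcc); the int4 overloads below are illustrative additions written in the same style as helper_math.h, not part of any SDK header:

    #include <cuda_runtime.h>  // vector types: int4, float4, ...

    // Componentwise logical AND for int4; each lane yields 0 or 1.
    inline __host__ __device__ int4 operator&&(int4 a, int4 b)
    {
        return make_int4(a.x && b.x, a.y && b.y, a.z && b.z, a.w && b.w);
    }

    // Componentwise logical OR for int4.
    inline __host__ __device__ int4 operator||(int4 a, int4 b)
    {
        return make_int4(a.x || b.x, a.y || b.y, a.z || b.z, a.w || b.w);
    }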

Related

Are there benefits in producing a .NET 6.0 version of a .NET Standard 2.0 library?

If I have a .NET Standard 2.0 library project that is being consumed by a .NET 6.0 console project, are there any performance benefits if I also instruct the compiler to produce a .NET 6.0 version of the library?
I don't plan to use any functionality available only in .NET 6.0; I just want to know if the .NET 6.0 version receives extra love from the compiler.
I asked the same thing on Twitter and was fortunate enough to receive feedback from well-known experts Bartosz Adamczewski, Immo Landwerth, Jared Parsons, and Lucas Trzesniewski.
Here is the question link.
Here are the most relevant bits of info you can extract from the original Twitter thread:
What you might gain is better IL, so things like strings and certain
other things are handled better by the front-end compiler C# and
better IL is generated, this, in turn, could provide better codegen in
JIT - Bartosz Adamczewski
#jaredpar can correct me but for the most part the code gen isn't
depending on the target framework, except for cases where the code gen
uses specific APIs, which is rare. - Immo Landwerth
That is correct. At the compiler level there is no concept of target
frameworks, there are just references. Hence all decisions about code
gen are based on the API in the references. - Jared Parsons
In the case of .NET 6, you'll get access to some new APIs such as
DefaultInterpolatedStringHandler, so for instance all of your string
interpolation expressions ($"...") will get a perf boost just by
targeting net6.0. Also, there are new method overloads such as
StringBuilder.Append that take string interpolation handlers as
parameters. Your string interpolation expressions will target these
instead when targeting net6.0, and your code will allocate less than
on other targets. So yes, in some cases, your code will get more
love by the compiler if you add a net6.0 target 🙂 - Lucas
Trzesniewski

Do I really need an OpenCL SDK?

I just tried to make myself familiar with OpenCL, but I got totally confused when everyone on the Internet was talking about downloading a vendor-specific OpenCL SDK. Why would I need that?
My understanding of using OpenCL was the following:
Download the OpenCL header files that are maintained by Khronos in the Khronos OpenCL Registry and make them available to your compiler.
Compile your code and link against OpenCL.dll.
The reason this confuses me is that I thought OpenCL was supposed to abstract away vendor-specific implementations. If I now download a vendor-specific SDK, isn't that advantage destroyed?
Can someone please clear this up?
Your understanding is absolutely correct - you do not need any vendor SDKs in order to develop or run OpenCL programs. All you need are the headers and a library to link against. The vendor SDKs provide sample code that may be useful to look at while you are learning how to use OpenCL, and they may also provide tools that can aid development.
On Windows, you will need an OpenCL.lib library to link against, which the SDKs do provide. You can download the sources for this library and build it yourself if you wish.
There is no harm in using a specific vendor's SDK, however. The headers and library they provide should just be the stock Khronos versions that you can download yourself. This means that an OpenCL application built using one vendor's SDK will still run just fine against other vendors' devices.
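As a quick sanity check that no vendor SDK is needed, here is a minimal sketch that compiles against just the stock Khronos headers and links against OpenCL.lib on Windows (or -lOpenCL elsewhere); it merely counts the installed platforms:

    #include <CL/cl.h>
    #include <stdio.h>

    int main(void)
    {
        // Ask the runtime how many OpenCL platforms are installed.
        cl_uint count = 0;
        cl_int err = clGetPlatformIDs(0, NULL, &count);
        if (err != CL_SUCCESS || count == 0) {
            printf("No OpenCL platforms found\n");
            return 1;
        }
        printf("%u OpenCL platform(s) available\n", count);
        return 0;
    }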

Expanding media capabilities of Win Embedded CE 6.0

I have an embedded device with WinCE 6.0 as the OS. The manufacturer provides an IDE for third-party development on it. The IDE pretty much allows nothing more than:
.NET 3.5 Compact Framework scripting that's invoked from various events in the main application
Adding files to the device.
The included media player seems to use DirectShow, and the OS only has a media codec for MPEG-1 encoded video playback. My goal is to be able to play media encoded with some other codecs as well inside that main application.
I've already managed to use DirectShowNETCF (a DirectShow wrapper for the .NET Compact Framework) and successfully play back MPEG-1 encoded video.
I'm totally new to this stuff and I have tons of (stupid) questions. I'll try to narrow them down:
The OS is based on WinCE, but as far as I understand, it's actually always some customized version of it (built via Platform Builder). The only "correct" way of developing anything for it afterwards is to use the SDK the manufacturer usually provides. Right? In my case, the SDK is extremely limited and tightly integrated into the IDE, as noted above. However, .NET CF 3.5 is capable of interop, so it's possible to call native libraries, as long as they are compiled for the correct platform.
Compiled code is pretty much just instructions for the processor (assembly code), and the compiler chooses the correct instructions based on the target processor setting. There's also the PE header, which defines which platform the program is meant to run on. If I target my "helloworld.exe" (which does nothing but return a specific exit code) at x86 and compile it with VC, should it work?
If the PE header is in fact the problem, is it possible to set it up for WinCE without the SDK? Do I REALLY need the whole SDK for creating a simple executable that uses only base types? I'm using VS2010, which doesn't even support smart device development anymore, and I'd hate to downgrade just for testing purposes.
The above questions are a prequel to my actual idea: porting ffmpeg/ffdshow to WinCE. This actually already exists, but it is neither targeted nor built for Intel Atom. Comments?
If a native implementation is not possible and I end up implementing some specific codec in C#... well, that would probably be quite a massive task. But having to choose C# over native, could I run into problems with codec performance? I mean, is C# THAT much slower?
Thank you.
I've not seen an OEM that shipped their own IDE, but it's certainly possible. That shouldn't change how apps can be created, however. It's possible that they've done a lot of work to make sure only things from their IDE work, but that would be a serious amount of work for not much benefit, so I'd think it's unlikely.
As for your specific questions:
The OS is Windows CE, not "based on" it. The OS is, however, componentized, so not all pieces are going to be available. An SDK generally provides a mechanism to filter out what isn't available. You can actually use any SDK that targets the right processor architecture, but if your app calls into a library for something that isn't in the OS, then you'll get at the very least an error. For managed code this is all irrelevant, because the CF isn't componentized. If it's there, a CF app can run (and if it's not, you can often install it after the fact). This means that if the platform supports the CF, then you can write a CF app and run it. That app can then call native stuff via P/Invoke (unless, of course, the OS creator decided to add security to prevent that; this is possible in the OS, though I've never seen it implemented).
Yes, compiled code is just "instructions". For native code, yes, they are processor instructions. For managed code, they are MSIL instructions that the managed runtime in turn converts to processor instructions at JIT time. If your target is an ARM platform, you cannot use an x86 compiler. Broadly speaking, you need to use the correct Microsoft compiler that supports Windows CE, and call that compiler with the proper switches to tell it not only the processor architecture but also the target OS, because the linking that needs to be done will be different for OS-level APIs and even the C runtimes. The short of it is that for your platform, you need to use Visual Studio 2008 Pro.
For native apps, you need some SDK that targets the same OS version (CE 6.0) and processor architecture (e.g. ARMv4I). Having it match the OS feature set is also useful but not a requirement. For managed code, you can just use the SDKs that ship with Studio because managed code is not processor-dependent. Still, you have to go back to Studio 2008 because 2010 doesn't have any WinCE compilers.
If you've found an existing library, then you can try to use it. Things that might impede your progress are A) it's unlikely to use an SDK you have so you probably have to create new project files (painful, but workable) and B) if it uses features not available in your OS, then you'd have to work around those. If you're missing OS features, you're probably out of luck but if it already has a media player and codec, I suspect you'll be ok.
Don't implement this in managed code. Seriously, just don't do it. Could you? Yes. Performance could probably be made nearly the same, except that to avoid GC stuttering you're going to have to basically create your own memory manager. The amount of work involved in this path is very, very large.

glReadBuffer vs glReadBufferNV in OpenGL ES 2.0

I'm trying to build OpenSceneGraph 3.2 for the Ubuntu armhf architecture, but I'm getting a compile error about a symbol not being found. The symbol in question is glReadBuffer. I looked at the GLES2/gl2.h header, and indeed, that symbol is not there. However, the symbol is present in GLES3/gl3.h, and documentation online suggests that the function was added in OpenGL ES 3.0. However, I did find a function named glReadBufferNV in GLES2/gl2ext.h (which is not #include'd in the source files).
I'm wondering if glReadBufferNV can be used instead of glReadBuffer, and what the possible side effects might be. I suspect that NV stands for Nvidia, and that it is an Nvidia-only implementation. Is this correct? If so, is there any way to get glReadBuffer in OpenGL ES 2.0? (I am under the impression that OpenSceneGraph can be built under OpenGL ES 2.0.)
Edit: As it turned out, the code that builds this portion of OpenSceneGraph was excluded when building with OpenGL ES or OpenGL 3.0. However, I'm still interested in what's special about glReadBufferNV.
As your research suggests, glReadBuffer was added to ES in 3.0; it is not defined in 2.0. Prior to that, as per the header file you found, an extension defined glReadBufferNV: specifically, the NV_read_buffer extension.
So what happened is that something wasn't in the spec, but at least Nvidia thought it would be useful, so they implemented it as an OpenGL extension. That extension was subsequently discussed at Khronos, had all the edge cases and ambiguities dealt with, and eventually made its way into the core spec.
That's generally how GL development proceeds: extensions come along to provide functionality that's not yet in the main library, they're discussed and refined and adopted into the core spec if appropriate.
Comparing the official specification for glReadBuffer with the extension documentation, the extension has a few ties into other extensions that you wouldn't expect to make it into the core spec (e.g. COLOR_ATTACHMENTi_NV is supported as a source) but see resolved issue 7:
Version 6 of this specification isn't compatible with OpenGL ES 3.0.
For contexts without a back buffer, this extension makes FRONT the
default read buffer. ES 3.0 instead calls it BACK.
How can this be harmonized?
RESOLVED: Update the specification to match ES 3.0 behavior. This
introduces a backwards incompatibility, but few applications are
expected to be affected. In the EGL ecosystem where ES 2.0 is
prevalent, only pixmaps have no backbuffer and their usage remains
limited.
So the extension has retroactively been modified to bring it into line with what was agreed for the core spec.
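If you still want the extension's functionality on an ES 2.0 context, the usual pattern is to test the extension string and resolve the entry point at run time. A sketch, assuming an EGL-based window system; the helper name initReadBuffer is made up for illustration:

    #include <EGL/egl.h>
    #include <GLES2/gl2.h>
    #include <GLES2/gl2ext.h>  // declares PFNGLREADBUFFERNVPROC for NV_read_buffer
    #include <string.h>

    static PFNGLREADBUFFERNVPROC pglReadBufferNV = NULL;

    // Returns 1 if glReadBufferNV is usable on the current context.
    static int initReadBuffer(void)
    {
        const char *exts = (const char *)glGetString(GL_EXTENSIONS);
        if (exts && strstr(exts, "GL_NV_read_buffer"))
            pglReadBufferNV = (PFNGLREADBUFFERNVPROC)
                eglGetProcAddress("glReadBufferNV");
        return pglReadBufferNV != NULL;
    }

If the extension string doesn't advertise GL_NV_read_buffer, you're limited to the context's default read buffer.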

Loading OpenCL kernels from bitcode in the correct architecture

I'm using Xcode (version 5.3) to compile OpenCL kernels to bitcode, as explained in WWDC 2013 session 508.
Xcode generates 4 different files, each with a different extension according to the architecture for which it is targeted.
The extensions are: cl.gpu_32.bc, cl.gpu_64.bc, cl.x86_64.bc, cl.i386.bc.
In session 508, they only load a single file (the one with the cl.gpu_32.bc extension) and use it.
Is it possible to generate a single cl_program that supports all the devices associated with the context?
How do I know which architecture to use for each of the available devices?
Sample code that reads all the files and generates a single cl_program would be very helpful.
Apple provides sample code that covers loading platform-specific bitcode:
https://developer.apple.com/library/mac/samplecode/OpenCLOfflineCompilation/Introduction/Intro.html#//apple_ref/doc/uid/DTS40011196
From the description:
This sample demonstrates how developers can utilize the OpenCL offline
compiler to transform their human-readable OpenCL source files into
shippable bitcode. It includes an example Makefile that demonstrates
how to invoke the compiler, and a self-contained OpenCL program that
shows how to build a program from the generated bitcode. The sample
covers the case of using bitcode on 64 and 32 bit CPU devices, as well
as 32 bit GPU devices.
The readme covers the CLI arguments and the single-file C program contains lots of explanations.
It seems from the Apple sample code (referenced by weichsel) that all that's needed is to query CL_DEVICE_ADDRESS_BITS and CL_DEVICE_TYPE (checking for CL_DEVICE_TYPE_GPU) with clGetDeviceInfo to distinguish between all the possible architectures, as in the sketch below.
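A sketch of that selection logic; the kernels.* file names mirror the Xcode output from the question and are assumptions:

    #include <OpenCL/opencl.h>  // Apple's OpenCL header location

    // Map a device to the bitcode file Xcode generated for its architecture.
    static const char *bitcodeForDevice(cl_device_id dev)
    {
        cl_device_type type;
        cl_uint bits;
        clGetDeviceInfo(dev, CL_DEVICE_TYPE, sizeof(type), &type, NULL);
        clGetDeviceInfo(dev, CL_DEVICE_ADDRESS_BITS, sizeof(bits), &bits, NULL);

        if (type & CL_DEVICE_TYPE_GPU)
            return bits == 64 ? "kernels.cl.gpu_64.bc" : "kernels.cl.gpu_32.bc";
        return bits == 64 ? "kernels.cl.x86_64.bc" : "kernels.cl.i386.bc";
    }

The returned path can then be read from disk and passed to clCreateProgramWithBinary for that device.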
