I'm currently learning OpenCL and have run into some problems with my program (it runs fine on the GPU but not on the CPU), so I wanted to try gDEBugger. I'd like to know whether it's possible to see the input data passed to the kernel in gDEBugger and, if so, how. I searched but couldn't find any information. I hope you'll be able to help.
Take care.
Best regards,
Jacq
You can see the input and output buffers in the Textures and Buffers viewer.
P.S: gDEBugger was replaced by CodeXL (http://developer.amd.com/tools-and-sdks/heterogeneous-computing/codexl/). You will enjoy this one...
I may just be using this program wrong, but I've been having a lot of difficulty using the C++ profiler on macOS. There are two major issues:
I'd preferably like to see line-by-line annotations of where time is spent not just per-function. I haven't figured out how to do this.
When I look at the per-function annotations, the function annotation is so large it pushes the function name off the screen, and Instruments doesn't seem to want to let me scroll to the right. See photo.
I'm sure I'm using this wrong. Would anyone be able to help me out?
Thanks so much!
I'd preferably like to see line-by-line annotations of where time is
spent not just per-function. I haven't figured out how to do this.
Double-clicking a function in the call tree view opens the source view, which shows line-by-line statistics.
The following articles should help you interpret the data Instruments generates:
Measuring Your App's Memory Usage with Instruments
Finding the Slow Spots in Your Code with the Time Profiler Instrument
I would like to know if there is a proper method to track memory accesses
across multiple resources at once. For example I set up a simple dual core CPU
by advancing the simple.py from learning gem5 (I just added another
TimingSimpleCPU and made the port connections).
I took a look at the different debug options and found for example the
MemoryAccess flag (and others), but this seemed to only show the accesses at
the DRAM or one other resource component.
Nevertheless I imagine a way to track events across CPU, bus and finally memory.
Does this feature already exist?
What can I try next? Would it be a good idea to add my own --debug-flag, or can I work
with the TraceCPU for my use case?
I haven't worked much with gem5 yet, so I'm not sure how to achieve this. So far I have only run in SE mode; would FS mode be a solution?
Finally I also found the TraceCPUData flag in the --debug-flags, but running
this with my config script created no output (like many other flags btw. ...).
It seems that this is a --debug-flag for the TraceCPU, what kind of output does this flag create and can it help me?
I ask this question because I'd like to know this from my kernel mode Windows driver.
I have some library code that I am porting from user mode, and it comes with a stress test; that stress test code needs to know when the CPU is idle.
A quick Google search turned up nothing, at least in the first several pages of results.
You need to use ZwQuerySystemInformation with the SystemProcessorPerformanceInformation info class (you get an array of SYSTEM_PROCESSOR_PERFORMANCE_INFORMATION structures as output).
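Once you have two snapshots of that array, deciding whether the CPU is idle is just arithmetic on the counter deltas. Here is a minimal sketch of that arithmetic in Python, not driver code: `ProcPerfSample`, `idle_fraction`, `all_cpus_idle` and the 0.95 threshold are hypothetical names and values made up for illustration. It also assumes that the kernel-time counter already includes idle time, which is how these counters are usually described; please check that against the documentation for your target Windows version.

```python
from dataclasses import dataclass


@dataclass
class ProcPerfSample:
    """Hypothetical stand-in for one per-processor entry of the returned array.

    Only the three counters used below are modelled. All values are
    cumulative time counters taken at one point in time.
    """
    idle_time: int
    kernel_time: int  # assumption: this counter already includes idle_time
    user_time: int


def idle_fraction(prev: ProcPerfSample, cur: ProcPerfSample) -> float:
    """Return the share of the interval between two samples spent idle."""
    idle = cur.idle_time - prev.idle_time
    # Under the assumption above, kernel + user covers the whole interval.
    total = (cur.kernel_time - prev.kernel_time) + (cur.user_time - prev.user_time)
    if total <= 0:
        return 0.0  # no time elapsed between the samples, nothing to report
    return idle / total


def all_cpus_idle(prev_samples, cur_samples, threshold=0.95):
    """True if every processor was idle for at least `threshold` of the interval."""
    return all(
        idle_fraction(p, c) >= threshold
        for p, c in zip(prev_samples, cur_samples)
    )
```

In the driver you would take one snapshot, wait for a sampling interval, take a second snapshot, and run the same calculation on each array element.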
There should be a standard, board- and architecture-independent way to do this, just like there is with initramfs, no?
I'm using powerpc and linux-3.10, if it matters. If there are better facilities later, I'd be interested to hear about them.
And if anyone knows of a board where this is currently working that I could use as a reference, that would also be helpful.
I've been searching and searching and I find a lot of information about why dts/dtb exists, a fair amount about the ongoing discussion of whether they are useful, and some about how to write dts or use existing dts, but nothing about how to embed them.
Quick descriptions or pointers to relevant doc would be very much appreciated.
What you need is the Flattened Image Tree (FIT) format. FIT uses DTS syntax to describe images embedded into one master image. For example, you can package a zImage, one or more DTB files, an initramfs image, and so on. Take a look at these slides for details.
If the bootloader supports device trees, the DTB can be loaded like any other (u/m)Image, and the bootloader should pass it to the kernel. If not, we have to use the kernel option CONFIG_(ARM_)APPENDED_DTB to load newer kernels. Not an option for PowerPC?
cat x.dtb >> zImage
To load an initramfs in that case, use CONFIG_INITRAMFS_SOURCE to include it in the kernel build.
In my OpenCL kernel I would like to both read from and write to an image2d_t object. According to the OpenCL standard, I can only specify either __read_only or __write_only.
However, I figured if I send the same cl_mem as two separate kernel arguments (one with __read_only and one with __write_only) I can do both.
When I do a write followed by a read, I will probably get the old value(?), but in my case I want to read the old value first, update it, and write it back to the image. A simple example would be "increment each pixel by 1". It seems to work 99.9% of the time but sometimes gives me artifacts.
Does anybody know if this is possible at all or if I have to expect undefined behaviour?
According to the OpenCL standard, an image can be used either for reading or for writing within one kernel, not both. So if you need to read and write the same memory object, you have to use two images or switch to a regular buffer. With the same image bound to both arguments, there is no guarantee that your kernel will work correctly.
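The two-image approach is usually done as a "ping-pong": each launch reads only from one image and writes only to the other, and the host swaps the two handles between launches. Below is a minimal host-side sketch of that pattern in plain Python, with nested lists standing in for images. `increment_pass` and `run_passes` are hypothetical names, and no OpenCL calls are made; it only illustrates the data flow for the "increment each pixel by 1" example.

```python
def increment_pass(src, dst):
    """Mimic one kernel launch: read only from src, write only to dst."""
    for y, row in enumerate(src):
        for x, value in enumerate(row):
            dst[y][x] = value + 1


def run_passes(image, passes):
    """Apply the increment `passes` times using two separate images."""
    front = [row[:] for row in image]             # plays the read-only image
    back = [[0] * len(row) for row in image]      # plays the write-only image
    for _ in range(passes):
        increment_pass(front, back)
        # Swap roles: the image just written becomes the next input.
        front, back = back, front
    return front
```

In a real program the swap would correspond to exchanging the two cl_mem arguments before the next launch, so no single launch ever reads and writes the same image.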