Error while trying to debug CUDA code using TOTALVIEW - debugging

I am trying to fix an error related to a SEGMENTATION FAULT. When I try to track it down by stepping through the code in the debugger, I get a couple of errors:
ERROR: cuda_trace_obj::initialize_cuda_library: Cuda initialize() returned CUDBG_ERROR_INITIALIZATION_FAILURE(20)!
ERROR: cuda_system_status_t::initialize: Error CUDBG_ERROR_UNINITIALIZED(5) getting device count
Any help or pointers regarding the above mentioned errors is appreciated.

This error often occurs when you debug a CUDA application on a computer with a single GPU and an X11 server running.
On a single-GPU system, CUDA applications can be debugged only if no X11 server (on Linux) or no Aqua desktop manager (on Mac OS X) is running on that system.
As far as I know, only the command-line debugger CUDA-GDB is able to lift this restriction, by enabling software preemption as described in the cuda-gdb documentation, and that works only on devices with compute capability 3.5 (SM 3.5) and higher.
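As a rough sketch of that workaround: to my knowledge, the cuda-gdb documentation describes an environment variable, CUDA_DEBUGGER_SOFTWARE_PREEMPTION=1, with an equivalent "set cuda software_preemption on" command inside the debugger. The Python helper below only prepares the environment and command line for such a launch. The binary name ./my_app is a placeholder, and the variable name should be checked against the documentation for your toolkit version.

```python
import os
import shutil
import subprocess


def preemption_env(base_env=None):
    """Return a copy of the environment with software preemption requested.

    Assumption: cuda-gdb honours CUDA_DEBUGGER_SOFTWARE_PREEMPTION=1,
    as described in its documentation for single-GPU debugging.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_DEBUGGER_SOFTWARE_PREEMPTION"] = "1"
    return env


def cuda_gdb_command(binary, *args):
    """Build the argv for debugging `binary` (gdb-style --args passthrough)."""
    return ["cuda-gdb", "--args", binary, *args]


def launch(binary, *args):
    """Start cuda-gdb on `binary`; raises if cuda-gdb is not on PATH."""
    if shutil.which("cuda-gdb") is None:
        raise FileNotFoundError("cuda-gdb not found on PATH")
    return subprocess.call(cuda_gdb_command(binary, *args), env=preemption_env())


if __name__ == "__main__":
    # ./my_app is a hypothetical binary name used only for illustration.
    print(cuda_gdb_command("./my_app"))
```

This applies to cuda-gdb only. Whether TotalView offers a comparable switch is not something the answer above establishes.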

Error running 'make DETECT_DEVICES' on Intel FPGA Monitor Program

I'm currently trying to run ARM assembly on my DE series board. However when I try to open my project I get the following error on the Intel FPGA Monitor Program:
Error running 'make DETECT_DEVICES'. (java.io.IOException: The pipe is being closed)
How can I solve that?
It depends on the OS you are running. If you are running Windows 11, it's not going to work because, unfortunately, there is no USB Blaster II driver support for it.
(see: https://community.intel.com/t5/Programmable-Devices/USB-Blaster-for-Windows-11/m-p/1422212#M87272)
NazrulNaim_Intel Employee
10-16-2022 11:57 PM
Hi Fari,
Regarding the issue with the USB blaster, as mention by ak6dn there will be issues regarding installing the USB blaster in Windows 11 because It is not officially supported yet by Intel. We cannot sure that it will 100% works in windows 11. As for work around to troubleshoot the issue, you can follow the instruction from the link that I have attached below.
https://www.terasic.com.tw/wiki/Altera_USB_Blaster_Driver_Installation_Instructions
Regards,
Nazrul Naim
I suggest you use a VM with Windows 10 if that's the case.
The FPGA Monitor Program requires WSL1 with a Linux distro installed on your PC. Make sure WSL1 is set as the default: WSL2 is not supported and will result in crashes when you try to compile your code.
To install WSL1 and set it to default, follow this link:
https://learn.microsoft.com/en-us/windows/wsl/install
After installation, launch the installed distro and follow this link step by step:
https://www.intel.com/content/www/us/en/docs/programmable/683525/21-3/installing-windows-subsystem-for-linux.html
Although the document refers to the NIOS II EDS, it also applies to the FPGA Monitor Program.
Also make sure that the version of Quartus matches the version of the FPGA Monitor Program, and keep the Linux distro running in the background while compiling.
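To check which WSL version each installed distro uses, you can look at the output of wsl -l -v. The sketch below parses that listing in Python and reports distros that are not on WSL1. The sample text and distro names are made up for illustration, and the column layout is assumed to be NAME / STATE / VERSION.

```python
def parse_wsl_list(text):
    """Parse the table printed by `wsl -l -v` into a list of dicts.

    Assumes three whitespace-separated columns (NAME, STATE, VERSION)
    and a leading '*' on the default distro.
    """
    distros = []
    for raw in text.splitlines():
        fields = raw.replace("*", " ").split()
        if len(fields) < 3 or not fields[-1].isdigit():
            continue  # header row or blank line
        distros.append({
            "name": fields[0],
            "state": fields[1],
            "version": int(fields[-1]),
            "default": raw.lstrip().startswith("*"),
        })
    return distros


def distros_not_on_wsl1(text):
    """Names of distros whose VERSION column is not 1."""
    return [d["name"] for d in parse_wsl_list(text) if d["version"] != 1]


# Illustrative sample only; real output depends on your installation.
SAMPLE = """  NAME            STATE           VERSION
* Ubuntu-20.04    Running         1
  Debian          Stopped         2
"""

if __name__ == "__main__":
    print(distros_not_on_wsl1(SAMPLE))
```

A distro reported here can be converted with wsl --set-version <distro> 1, and wsl --set-default-version 1 makes WSL1 the default for new installs. If you capture the output of wsl.exe from a script, note that it is typically UTF-16 encoded and needs decoding first.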

How to run Gem5 system emulation with Golang program

I am trying to run Gem5 system emulation with a binary I compiled from a Golang program. I am using the X86 O3CPU and classic memory. However, I have to launch the same process on 3 CPUs to get the system emulation set up. Otherwise I receive the error: fail to create new OS thread. I tried setting GOMAXPROCS to 1 or MAXTHREADS to 1. Neither solves the problem.
But even if I run the binary with 3 CPUs, I may still receive this error:
wirep: p->m=824633877504(2) p->status=1
fatal error: wirep: invalid p state
after many hours of emulation. Does anyone have experience with running Gem5 with Golang?

LLDB crashes on raspberry pi

I have a C++ program running "fine", but when I try to debug it with LLDB, LLDB just quits on me.
Process ... launching
Segmentation fault
Even if I set a breakpoint on the very first line in main I just get these two lines. Googling yields the typical memory leak errors in user code. I don't think that's the case here since my program runs outside of LLDB.
I am not experiencing any issues under Ubuntu. Could it be related to ARM (Raspberry Pi)?
You might have more luck asking about the state of the lldb port to the Raspberry Pi on the lldb-dev mailing list. Details here:
http://lists.llvm.org/mailman/listinfo/lldb-dev

UHD error with REDHAWK

I made a node which contains a USRP_UHD and a GPP (and made sure the ip_address is correct for USRP_UHD). I launched the domain based on this node. However, I got the following error:
UHD Error:
Device discovery error: AssertionError: libusb_init(&_context) == 0
in libusb_session_impl::libusb_session_impl()
at /builddir/build/BUILD/uhd-release_003_005_003/host/lib/transport/libusb1_base.cpp:37
UHD Error:
Device discovery error: AssertionError: libusb_init(&_context) == 0
in libusb_session_impl::libusb_session_impl()
at /builddir/build/BUILD/uhd-release_003_005_003/host/lib/transport/libusb1_base.cpp:37
...
-- Opening a USRP2/N-Series device...
-- Current recv frame size: 1472 bytes
-- Current send frame size: 1472 bytes
UHD Warning:
Unable to set the thread priority. Performance may be negatively affected.
Please see the general application notes in the manual for instructions.
EnvironmentError: OSError: error in pthread_setschedparam
I did get two unallocated tuners (TX/RX for each), but it is not easy to allocate these two tuners for use, whatever parameters I try.
Besides, if I just launch the domain and launch the single device USRP_UHD, or simply run the discover USRP_UHD command from the command line, I get the same error:
UHD Error:
Device discovery error: AssertionError: libusb_init(&_context) == 0
in libusb_session_impl::libusb_session_impl()
at /builddir/build/BUILD/uhd-release_003_005_003/host/lib/transport/libusb1_base.cpp:37
2016-02-01 16:59:20 WARN USRP_UHD_i:943 - WARNING: NO UHD (USRP) DEVICES FOUND!
Could anybody figure out where this problem is? Thanks in advance!
So, first of all, the good news is that this is happening during autodetection of USB devices, so your N2xx is not inherently affected, but:
UHD 3.5.3 is not only old, it's ancient. You should really uninstall it (on Debian or a derivative such as Ubuntu, that would be sudo apt-get remove uhd-host libuhd003 libuhd-dev), install a new version directly from Ettus (I can help you with that, if necessary) and rebuild Redhawk against that version.
Really, really do that. There's been so much improvement in behaviour like failure handling that fixing this without updating isn't really worth it.
Now, if you explicitly specify a device address that allows you to cancel USB-based USRP detection completely, you should be fine. As device address, use type=usrp2 for USRP2, N200 and N210.
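For illustration, a UHD device address is a comma-separated list of key=value pairs. The small Python helper below builds such a string and the matching uhd_usrp_probe command line. The IP address is a placeholder, and how the string is passed to the Redhawk USRP_UHD device depends on its properties, which are not covered here.

```python
def device_args(**hints):
    """Join key=value hints into a UHD device address string."""
    return ",".join(f"{key}={value}" for key, value in hints.items())


def probe_command(args):
    """Command line for probing one explicitly addressed device."""
    return ["uhd_usrp_probe", f"--args={args}"]


if __name__ == "__main__":
    # 192.168.10.2 is a placeholder; use your device's actual address.
    args = device_args(type="usrp2", addr="192.168.10.2")
    print(args)
    print(" ".join(probe_command(args)))
```

Addressing the device explicitly like this is meant to restrict discovery to the networked USRP2/N2xx family, so the USB transport that raises the libusb_init assertion should not be needed.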
I ran into this issue trying to install UHD v3.9.3 in a CentOS 7 Docker container. The error message points to a USB issue, not to Redhawk. The Redhawk Device, USRP_UHD, is just an abstraction layer on top of the Ettus UHD drivers, so the easiest way to tell whether the problem is Redhawk or something else is to run one of the UHD commands, like uhd_usrp_probe, directly from a terminal and see whether it produces the same error.
To check whether the problem is directly related to the USB drivers, try the command lsusb. This should list all USB devices connected to your OS. These are good debug tips to isolate where the problem is.
If you happen to be doing this in Linux containers or Docker, you have to grant the proper privileges; see docker-any-way-to-give-access-to-host-usb-or-serial-device. Otherwise, assuming you built UHD from source, check the output of the make test step: if all the tests passed, there shouldn't be anything wrong with the UHD library.
Edit: Also, if you're running this inside a VM, you have to make sure your host has given network/USB/etc. privileges to the hypervisor (e.g. VirtualBox) during installation, or that you've attached the correct virtual hardware in the VM configuration.
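The isolation steps above can be scripted. This is a minimal Python sketch that runs lsusb and uhd_usrp_probe if they are installed and reports a coarse status for each. The status labels are my own, and a "failed" status only means a non-zero exit code, so read the captured output for the real cause.

```python
import shutil
import subprocess


def run_check(command, timeout=60):
    """Run a diagnostic command; return (status, output) without raising."""
    if shutil.which(command[0]) is None:
        return ("missing", "")
    try:
        result = subprocess.run(
            command, capture_output=True, text=True,
            errors="replace", timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return ("timeout", "")
    status = "ok" if result.returncode == 0 else "failed"
    return (status, result.stdout + result.stderr)


def isolate():
    """Check the USB layer first, then UHD on its own, outside Redhawk."""
    checks = {"usb": ["lsusb"], "uhd": ["uhd_usrp_probe"]}
    return {name: run_check(cmd)[0] for name, cmd in checks.items()}


if __name__ == "__main__":
    for name, status in isolate().items():
        print(f"{name}: {status}")
```

If the usb check already fails or is missing inside a container or VM, the problem is below UHD and Redhawk, which matches the privilege and virtual hardware points above.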

CUDA Nvidia NSight Debugging: "CUDA grid launch failed"

When I try to debug an arbitrary CUDA application, e.g. the matrix multiplication or convolutionSeparable sample from the Nvidia GPU Computing SDK 4.0, I always get an output similar to:
Parallel Nsight Debug
CUDA grid launch failed: CUcontext: 2059192 CUmodule: 348912936 Function: _Z9matrixMulILi32EEvPfS0_S0_ii
……
……
And a file with the following content is showing up:
Parallel Nsight CUDA Debugger
The application being debugged with the Nexus CUDA debugger, was unable to
find any associated source. This could be for a number of reasons:
1) CUDA has not been initialized.
Make sure cuInit has been called, and it returned a successful result.
2) No CUDA contexts have been created.
Once a context is created, memory can be examined in the context. Each context
shows up as a single "Thread" in the Visual Studio Threads view. (Debug | Windows | Threads)
3) There are no active CUDA grids in any context.
A grid must be launched in order to hit breakpoints.
4) You have selected the "Default Context" in the Visual Studio Threads view.
This context is a placeholder shown when there are no available actual CUDA
contexts. It does not show real data.
5) No CUDA modules have been loaded.
You can see which modules are loaded in each CUDA context by showing the
Visual Studio Modules view. (Debug | Windows | Modules)
6) Symbolics were not found for the loaded .cubin.
The module needs to be built with debug information. Please specify the
-G0 switch when building.
7) A grid launch failed while running a kernel.
Each breakpoint within the corresponding “.cu” file is completely ignored during the run. When I just run the application, without Nsight Debugging, the program executes without any problems.
What can I do to tackle this problem?
My Setup:
1xIntel GPU and 1x NV 570GTX, I want to use the local debugging option
Win 7. Pro 64Bit
Dev Env.: VS2008 or VS2010
CUDA 4.0 & Parallel Nsight 2.0
NV Driver Vers.: 285.38
WPF is disabled
TDR is disabled
Windows runs in Basic mode (no aero)
Project Properties: Cuda Runtime API -> GPU -> Generate GPU Debug Information -> Yes (-G0)
Firstly, you need to ensure that your display is driven by the Intel integrated graphics and not the NVIDIA GPU. This is because when you hit a breakpoint in CUDA code you are stalling the entire GPU, so if the same GPU was used for display then your system would lock up naturally.
Note that the hardware requirements for Parallel Nsight indicate you need two supported GPUs whereas you only have one, but if I understand correctly it's possible to use a non-NVIDIA GPU for display (I haven't tried).
Assuming the above is working you should start by trying out the samples included with Parallel Nsight. You can find them in the Parallel Nsight menu group in the start menu.
A CUDA grid launch failure has a wide variety of causes. This one is probably an access to an array beyond its allocated size, what in the x86 world is called a segmentation fault. I debug these by selectively commenting out parts of the kernel under test until the error goes away (what we used to call wolf fence debugging). Another cause of grid launch failure is a kernel that takes too long (1 or 2 seconds) to execute.
The reason the debugger isn't helping is that the debugger ONLY stops 1 thread in 1 block! Your access error is occurring before then. Also, you can't use printf to find the bug, as the output does not get returned in the event of a grid launch failure.
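One common way a kernel ends up reading past the end of an array, not necessarily the cause in this case, is that a launch rounds the thread count up to a whole number of blocks. The Python sketch below (plain arithmetic, not CUDA code) shows which global indices fall outside the array when the kernel has no bounds guard. The block size of 256 and the array length of 1000 are made-up values.

```python
def launch_config(n, threads_per_block=256):
    """Blocks needed to cover n elements (ceiling division)."""
    blocks = (n + threads_per_block - 1) // threads_per_block
    return blocks, threads_per_block


def out_of_range_indices(n, threads_per_block=256):
    """Global thread indices that would touch memory past element n - 1.

    Mirrors the usual idx = blockIdx.x * blockDim.x + threadIdx.x;
    a kernel should skip these with a check like `if idx < n`.
    """
    blocks, tpb = launch_config(n, threads_per_block)
    return [
        block * tpb + thread
        for block in range(blocks)
        for thread in range(tpb)
        if block * tpb + thread >= n
    ]


if __name__ == "__main__":
    n = 1000  # hypothetical array length
    blocks, tpb = launch_config(n)
    extra = out_of_range_indices(n)
    print(f"{blocks} blocks x {tpb} threads, {len(extra)} threads past the end")
```

Checking the kernel's index computation against the allocated size in this way is a cheaper first step than commenting out parts of the kernel.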
To add a potential solution on top of the answers already given: one way to avoid the error is to run the Nsight Monitor with administrator rights.
The answer for this is definitely using the correct driver for the installation of Parallel NSight. For the latest version (2.1 RC2, currently), this is driver version 285.86. For the current stable version 2.0, this is driver version 270.81, as another poster mentioned.
