I just managed to install SYCL ComputeCpp + OpenCL (from CUDA) and ran CMake to generate the VS2019 .sln for the samples, both successfully.
I've tried to run the matrix_multiply example ONLY, for now.
It ran successfully using the Intel FPGA emulator as a default device.
Changing the device to the CPU device also worked.
Choosing the host device took ages without exiting.
When I tried to change the device to the NVIDIA GeForce GTX 1650 Ti,
I got an exception from ComputeCpp: ComputeCpp:RT0100, etc.
Googling a bit, I found I'd probably have to output PTX instead of SPIR.
So I regenerated the .sln using -DCOMPUTECPP_BITCODE=ptx64
After doing that, the kernel ran successfully on the NVIDIA GPU.
My first question is: is that needed because NVIDIA does NOT support SPIR at the time of this writing, only PTX?
However this broke the other devices, which are now reporting:
[ComputeCpp:RT0107] Failed to create program from binary
This now happens for all the other devices: Intel GPU, Device CPU, Device FPGA (which were formerly working).
Inspecting the .sycl file, I now find SYCL_matrix_multiply_cpp_bin_nvptx64[].
My question is: how can I support NVIDIA with PTX and "normal" devices with SPIR together in the same exe? I made a menu from which the user can choose a device to play with, but now it only works for NVIDIA.
What am I doing wrong, please?
I would expect to be able to run the same .sycl code on all the devices, regardless of whether it contains PTX or SPIR. How can I manage that?
EDIT: I just tried to retarget the bitcode to spirv64, since computecpp_info told me all my devices are supposed to support it.
However, with that setting no device works anymore :-(
I have written a simple kernel module that causes a segmentation fault whenever I try to open, read from, or write to it. The open, read and write functions do nothing but print a message. I searched but could not find a solution. Please help. Below is the dmesg output when trying to write.
Platform : Raspberry Pi 3b+ , Raspbian 4.19.58-v7+
There is no issue with the code, since I have run it on my Ubuntu PC and it works perfectly fine.
screenshot of dmesg
Finally the SparkFun Edge boards arrived today ;-)
Following this well-written guide: https://codelabs.developers.google.com/codelabs/sparkfun-tensorflow/#3 I am stuck with the following NoResponseError when trying to flash the code onto the Ambiq with the uart_wired_update.py script that comes with the TensorFlow examples
opprud$ python3 tensorflow/lite/experimental/micro/tools/make/downloads/AmbiqSuite-Rel2.0.0/tools/apollo3_scripts/uart_wired_update.py -b 115200 /dev/cu.usbserial-1430 -r 1 -f main_nonsecure_wire.bin -i 6
MOJ/Connecting with Corvette over serial port /dev/cu.usbserial-1430...
Sending Hello.
No response for command 0x00000000
Traceback (most recent call last):
  File "tensorflow/lite/experimental/micro/tools/make/downloads/AmbiqSuite-Rel2.0.0/tools/apollo3_scripts/uart_wired_update.py", line 336, in <module>
    main()
  File "tensorflow/lite/experimental/micro/tools/make/downloads/AmbiqSuite-Rel2.0.0/tools/apollo3_scripts/uart_wired_update.py", line 38, in main
    connect_device(ser)
  File "tensorflow/lite/experimental/micro/tools/make/downloads/AmbiqSuite-Rel2.0.0/tools/apollo3_scripts/uart_wired_update.py", line 58, in connect_device
    response = send_command(hello, 88, ser)
  File "tensorflow/lite/experimental/micro/tools/make/downloads/AmbiqSuite-Rel2.0.0/tools/apollo3_scripts/uart_wired_update.py", line 235, in send_command
    raise NoResponseError
__main__.NoResponseError
My setup:
MacBook Pro, tried both an old 15" and a new 13"
SparkFun Serial Basic Breakout, USB-C version (jumpered to 3v3 by default)
FTDI 3v3 serial cable
I have tried
two different edge boards, with the correct Key14 & reset combo + misc variants and timing
legacy USB on old Macbook
new Macbook w USB C
FTDI 3v3 serial cable as alternative to sparkfun serial board
Running an alternative uart_boot_host.py script in tensorflow/lite/experimental/micro/tools/make/downloads/AmbiqSuite-Rel2.0.0/tools/bootloader_scripts/uart_boot_host.py, also no response
With a scope I can measure the handshake bytes '0x14', '0x55', '0x9d', '0xe9', '0x0', '0x0', '0x8', '0x0' being transmitted initially at 115200 on the TXO pin of the programming header, but the Ambiq does not reply at all.
By the way, the onboard demo is running: the blue LED is flashing and some "yes"es are being recognized.
Any inputs welcome.
Does anyone know the protocol for the Corvette bootloader?
Are there any CPU revision changes from the first batch of boards, or possibly any lock bits accidentally programmed by SparkFun?
rgds from an eager TF lite user ;-)
I measured the actual baud rate with a scope on the RX/TX pins and saw that the bit timing with the default OSX serial driver is rather imprecise, approximately 10% off, causing faulty readings and ultimately missing bytes when the baud rate is high.
After updating to the CH340 serial driver, the timing improved and the bit timings were correct.
At 921600 bps, a single 8N1 byte (10 bits on the wire) is supposed to take about 10.9 µs.
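The per-byte figure follows from the 8N1 frame length: one start bit, eight data bits and one stop bit sent at the configured baud rate. Here is a minimal sketch of that arithmetic in Python. The helper names are hypothetical, and the drift estimate assumes an idealized receiver that samples every bit at its midpoint, which is a simplification of what a real UART does.

```python
def frame_time_us(baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Microseconds needed to send one UART frame, start bit included."""
    bits_per_frame = 1 + data_bits + parity_bits + stop_bits
    return bits_per_frame / baud * 1e6


def drift_at_last_sample(clock_error, data_bits=8, stop_bits=1):
    """Accumulated timing error, in bit periods, at the midpoint of the last bit.

    Assumption: the receiver resynchronizes on the start bit (index 0) and
    samples bit i at (i + 0.5) bit periods, so the error grows linearly.
    """
    last_bit_index = data_bits + stop_bits
    return clock_error * (last_bit_index + 0.5)


for baud in (115200, 921600):
    print(f"{baud} baud: {frame_time_us(baud):.2f} us per 8N1 byte")

# A drift above 0.5 bit periods means the last bits are sampled in the wrong slot.
print(f"drift with a 10% clock error: {drift_at_last_sample(0.10):.2f} bit periods")
```

Under these assumptions a 10% timing error drifts by close to a full bit period by the end of the frame, which would be consistent with the faulty readings described above.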
Driver install
https://github.com/adrianmihalko/ch340g-ch34g-ch34x-mac-os-x-driver
This is what worked for me: (source: github.com/sparkfun/SparkFun_Edge_BSP/issues/3, the SparkFunEdge tutorial and my teammates!). I am running this on a Linux machine (x86_64; Run $ uname -a) and my SparkfunEdge DEVICENAME=/dev/ttyUSB0
The tutorial does warn you about this problem at Step 4:
Note: Some users have reported issues with their operating system's
default drivers for programmer, so we recommend installing the
driver before you continue.
Click on the driver link and follow the instructions under "Other Linux distributions" as follows:
Install the correct version of the ch34 library.
$ git clone https://github.com/juliagoda/CH341SER.git
$ cd CH341SER/
$ make
$ sudo insmod ch34x.ko
$ sudo rmmod ch341
To verify that the correct driver is being used, run:
$ dmesg
..
[889247.585301] usb 1-7: ch341-uart converter now attached to ttyUSB0
[955698.718839] usbcore: registered new interface driver ch34x
[955698.718848] usbserial: USB Serial support registered for ch34x
[955759.196437] usbserial: USB Serial deregistering driver ch341-uart
[955759.196576] ch341-uart ttyUSB0: ch341-uart converter now disconnected from ttyUSB0
[955759.196601] usbcore: deregistering interface driver ch341
[955759.196643] ch341 1-7:1.0: device disconnected
Now unplug the USB-C cable from the SparkfunEdge board and plug it back in:
$ dmesg
....
[955876.176950] ch34x 1-7:1.0: ch34x converter detected
[955876.177320] usb 1-7: ch34x converter now attached to ttyUSB0
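The check above amounts to finding the most recent "now attached" line for the tty in the dmesg output and reading the driver name from it. As a rough sketch, this could be automated in Python as follows. The function name and the regular expression are my own and only match the line format shown in the logs above; other kernels may word the message differently.

```python
import re

# Matches lines such as:
#   [955876.177320] usb 1-7: ch34x converter now attached to ttyUSB0
ATTACH_RE = re.compile(
    r"usb \S+: (?P<driver>\S+) converter now attached to (?P<tty>tty\w+)"
)


def attached_driver(dmesg_text, tty="ttyUSB0"):
    """Return the driver named in the last 'now attached' line for tty, or None."""
    driver = None
    for line in dmesg_text.splitlines():
        match = ATTACH_RE.search(line)
        if match and match.group("tty") == tty:
            driver = match.group("driver")  # later lines override earlier ones
    return driver


# Two of the lines quoted above, used as sample input.
SAMPLE = """\
[889247.585301] usb 1-7: ch341-uart converter now attached to ttyUSB0
[955876.177320] usb 1-7: ch34x converter now attached to ttyUSB0
"""

print(attached_driver(SAMPLE))
```

On a live system the text could come from something like subprocess.run(["dmesg"], capture_output=True, text=True).stdout, which may require root privileges depending on the distribution.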
Glad to hear that you're so excited about the board. I have a hunch that this will be an easy fix.
The Edge boards handed out at the conference have a bootloader set for 115200 baud; however, the Edge boards from the second batch have been upgraded to flash at 921600 baud, greatly reducing flashing time. Try changing the baud rate in your serial upload script.
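For the command quoted in the question, changing the baud rate means changing the value passed to -b. The sketch below rebuilds that command line in Python with the baud rate as a parameter. The flash_command helper is hypothetical, and every flag and path is copied from the question rather than from the script's documentation.

```python
import shlex

# Script path and flags are taken verbatim from the command in the question.
SCRIPT = (
    "tensorflow/lite/experimental/micro/tools/make/downloads/"
    "AmbiqSuite-Rel2.0.0/tools/apollo3_scripts/uart_wired_update.py"
)


def flash_command(port, baud, image="main_nonsecure_wire.bin"):
    """Build the argument list for uart_wired_update.py at the given baud rate."""
    return [
        "python3", SCRIPT,
        "-b", str(baud),
        port,
        "-r", "1",
        "-f", image,
        "-i", "6",
    ]


# Second-batch boards: try 921600 instead of 115200.
print(shlex.join(flash_command("/dev/cu.usbserial-1430", 921600)))
```

The resulting list can be passed to subprocess.run, which makes it easy to try both 115200 and 921600 when it is unclear which batch a board belongs to.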
You can also set up the Ambiq Software Development Kit to write your own applications for the Apollo3 microcontroller. Check out the tutorial here: Using the Edge Board with Ambiq SDK
Since I can't comment on your post (not enough reputation.... thanks SE) I'll be responding here.
If the baud rate accuracy is a problem, I doubt that it is caused by the OS; I'd rather think it is a problem with the USB-serial converter chip. I've been using the CH340G, whereas the USB-C version has the CH340C IC. The difference between the two is that the "C" version includes an internal oscillator to provide the frequency reference. It is possible that that one is less accurate...? I'll try it out over here (but on Windows) and let you know.
If this is a persisting problem, would you mind making a post on the SparkFun forums? That way our tech support can get linked in (they are also the people who could get you replacement hardware in case it is defective). Here's a forum for the Edge: SparkFun Edge Forums
If the problem is coming from the OS then the only fix that we can do with the Edge is to reduce the bootloader speed. We're working on a short tutorial about how to do that, but it would require having a programmer/debugger for Cortex-M processors. The Ambiq Apollo3 Evaluation Board has a built-in SEGGER J-Link which is what we used to program the boards.
On macOS Mojave, installing/reinstalling the CH340 driver worked for me:
https://learn.sparkfun.com/tutorials/how-to-install-ch340-drivers#mac-osx
Before installing, check whether you actually see the SparkFun Edge device with:
ls -l /dev/cu*
If the driver is correctly installed, you should see:
/dev/cu.wchusbserial1420
I had a similar issue: I only had /dev/cu.usbserial-1420 and thought that was the device to use, whereas the right one is /dev/cu.wchusbserial1420, which only showed up after I installed the CH340 driver.
After that, flashing the device worked for me.
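The device check described above can be sketched in Python as well. The pick_edge_port helper is hypothetical; it simply prefers a wchusbserial node, which in my case only existed once the CH340 driver was installed, and ignores the generic usbserial one.

```python
import glob


def pick_edge_port(candidates):
    """Return the first CH340 driver node (wchusbserial) among candidates, or None."""
    wch_ports = sorted(p for p in candidates if "wchusbserial" in p)
    return wch_ports[0] if wch_ports else None


# Same listing as `ls -l /dev/cu*`, restricted to the device paths.
ports = glob.glob("/dev/cu.*")
print(pick_edge_port(ports)
      or "no wchusbserial device found; is the CH340 driver installed?")
```

If only /dev/cu.usbserial-* shows up, the helper returns None, which matches the situation I was in before installing the driver.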
I have multiple questions regarding debugging a Raspberry Pi 3 from a Linux x64 host using gdb-multiarch, as well as writing bare-metal programs in general. We are currently facing a problem where our code appears not to be loaded into memory. When we begin debugging in GDB we start at address 0. Three instructions down we jump to 0x10000. If I modify my linker script to place our code at either address I get the same result: we jump to 0x10000 and our code isn't loaded there. Instead we get this
We noticed also that GDB is using 32 bit register names here when we're supposed to be debugging 64 bit code.
Again a recap of what we're using:
QEMU with versatile-pb machine.
An aarch64 GCC cross-compiler.
GDB-multiarch.
We've tried on two different hosts: Ubuntu 16.04 x64 Host running in virtualbox. Mint x64 running natively.
We also tried the arm-none-eabi toolchain but were running into problems not having our code compiled as 64 bit.
Help is much appreciated! Thanks!
You don't give your command line, but "versatile-pb" is a 32-bit only board type, so trying to run 64-bit code on it is going to misbehave in confusing ways. You need to tell QEMU to emulate a 64-bit capable board that matches what your bare-metal code is expecting to run on.
In QEMU 2.12 there will be a "raspi3" QEMU board which may be helpful for you; you'd need to try building the latest 2.12 release candidate tarball at the moment if you wanted to experiment with that (2.12 release isn't due for another couple of weeks). Otherwise you could use the "virt" board if you made sure your bare metal code was built to be able to run on that board.
I have a C++ program that runs "fine", but when I try to debug it with LLDB, LLDB just quits on me.
Process ... launching
Segmentation fault
Even if I set a breakpoint on the very first line of main I just get these two lines. Googling only turns up the typical memory errors in user code. I don't think that's the case here, since my program runs fine outside of LLDB.
I am not experiencing any issues under Ubuntu. Could it be related to ARM (Raspberry Pi)?
You might have more luck sending a query about the state of the lldb port to raspberry pi on the lldb-dev mailing list. Details here:
http://lists.llvm.org/mailman/listinfo/lldb-dev