Loading variable addresses into registers PowerPC inline Assembly - gcc

I am trying to put together and example of coding inline assembly code in a 'C' program. I have not had any success. I am using GCC 4.9.0. As near as I can tell my syntax is correct. However the build blows up with the following errors:
/tmp/ccqC2wtq.s:48: Error: syntax error; found `(', expected `,'
Makefile:51: recipe for target 'all' failed
/tmp/ccqC2wtq.s:48: Error: junk at end of line: `(31)'
/tmp/ccqC2wtq.s:49: Error: syntax error; found `(', expected `,'
/tmp/ccqC2wtq.s:49: Error: junk at end of line: `(9)'
These are related to the input/output/clobbers lines in the code. Anyone have an idea where I went wrong?
asm volatile("li 7, %0\n" // load destination address
"li 8, %1\n" // load source address
"li 9, 100\n" // load count
// save source address for return
"mr 3,7\n"
// prepare for loop
"subi 8,8,1\n"
"subi 9,9,1\n"
// perform copy
"1:\n"
"lbzu 10,2(8)\n"
"stbu 10,2(7)\n"
"subi 9,9,1\n" // Decrement the count
"bne 1b\n" // If zero, we've finished
"blr\n"
: // no outputs
: "m" (array), "m" (stringArray)
: "r7", "r8"
);

It's not clear what you are trying to do with the initial instructions:
li 7, %0
li 8, %1
Do you want to load the address of the variables into those registers? In general, this is not possible because the address is not representable in an immediate operand. The easiest way out is to avoid using r7 and r8, and instead use %0 and %1 directly as the register names. It seems that you want to use these registers as base addresses, so you should use the b constraint:
: "b" (array), "b" (stringArray)
Then GCC will take care of the details of materializing the address in a suitable fashion.
You cannot return using blr from inline assembler because it's not possible to tear down the stack frame GCC created. You also need to double-check the clobbers and make sure that you list all the things you clobber (including condition codes, memory, and overwritten input operands).

I was able to get this working by declaring a pair of pointers and initializing them with the addresses of the arrays. I didn't realize that the addresses wouldn't be available directly. I've used inline assembly very sparsely in the past, usually just
to raise or lower interrupt masks, not to do anything that references variables. I just
had some folks who wanted an example. And the "blr" was leftover when I copied a
snipet of a pure assembly routine to use as a starting point. Thanks for the responses.
The final code piece looks like this:
int main()
{
char * stringArrayPtr;
unsigned char * myArrayPtr;
unsigned char myArray[100];
stringArrayPtr = (char *)&stringArray;
myArrayPtr = myArray;
asm volatile(
"lwz 7,%[myArrayPtr]\n" // load destination address
"lwz 8, %[stringArrayPtr]\n" // load source address
"li 9, 100\n" // load count
"mr 3,7\n" // save source address for return
"subi 8,8,1\n" // prepare for loop
"subi 9,9,1\n"
"1:\n" // perform copy
"lbzu 10,1(8)\n"
"stbu 10,1(7)\n"
"subic. 9,9,1\n" // Decrement the count
"bne 1b\n" // If zero, we've finished
// The outputs / inputs / clobbered list
// No output variables are specified here
// See GCC documentation on extended assembly for details.
:
: [stringArrayPtr] "m" (stringArrayPtr), [myArrayPtr]"m"(myArrayPtr)
: "7","8","9","10"
);
}

Related

Rust debugging doesn't stop at the breakpoints when debugging stm32f407 via openocd and gdb

I have a problem debugging an stm32f407vet6 board and rust code.
The point of the problem is that GDB ignores breakpoints.
After setting breakpoints and executing the "continue" command in gdb, the program continues to ignore all breakpoints.
The only way to stop the program running is to cause an interrupt using the "ctrl + c" command.
After this command, the board stops its execution on the line currently being executed.
I have tried to set breakpoints on all lines where I can set them, but all the attempts are unsuccessful.
$ openocd
Open On-Chip Debugger 0.10.0 (2020-07-01) [https://github.com/sysprogs/openocd]
Licensed under GNU GPL v2
libusb1 09e75e98b4d9ea7909e8837b7a3f00dda4589dc3
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "hla_swd". To override use 'transport select <transport>'.
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : clock speed 2000 kHz
Error: libusb_open() failed with LIBUSB_ERROR_NOT_SUPPORTED
Info : STLINK V2J35S7 (API v2) VID:PID 0483:3748
Info : Target voltage: 6.436364
Info : stm32f4x.cpu: hardware has 6 breakpoints, 4 watchpoints
Info : starting gdb server for stm32f4x.cpu on 3333
Info : Listening on port 3333 for gdb connections
$ arm-none-eabi-gdb -q target\thumbv7em-none-eabihf\debug\test_blink
Reading symbols from target\thumbv7em-none-eabihf\debug\test_blink...
(gdb) target remote :3333
Remote debugging using :3333
0x00004070 in core::ptr::read_volatile (src=0xe000e010) at C:\Users\User\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib/rustlib/src/rust\src/libcore/ptr/mod.rs:1005
1005 pub unsafe fn read_volatile<T>(src: *const T) -> T {
(gdb) load
Loading section .vector_table, size 0x1a8 lma 0x0
Loading section .text, size 0x47bc lma 0x1a8
Loading section .rodata, size 0xbf0 lma 0x4970
Start address 0x47a2, load size 21844
Transfer rate: 100 KB/sec, 5461 bytes/write.
(gdb) b main
Breakpoint 1 at 0x1f2: file src\main.rs, line 15.
(gdb) continue
Continuing.
Program received signal SIGINT, Interrupt.
0x00001530 in cortex_m::peripheral::syst::<impl cortex_m::peripheral::SYST>::has_wrapped (self=0x1000fc6c)
at C:\Users\User\.cargo\registry\src\github.com-1ecc6299db9ec823\cortex-m-0.6.3\src\peripheral/syst.rs:135
135 pub fn has_wrapped(&mut self) -> bool {
(gdb) bt
#0 0x00001530 in cortex_m::peripheral::syst::<impl cortex_m::peripheral::SYST>::has_wrapped (self=0x1000fc6c)
at C:\Users\User\.cargo\registry\src\github.com-1ecc6299db9ec823\cortex-m-0.6.3\src\peripheral/syst.rs:135
#1 0x00003450 in <stm32f4xx_hal::delay::Delay as embedded_hal::blocking::delay::DelayUs<u32>>::delay_us (self=0x1000fc6c, us=500000)
at C:\Users\User\.cargo\registry\src\github.com-1ecc6299db9ec823\stm32f4xx-hal-0.8.3\src/delay.rs:69
#2 0x0000339e in <stm32f4xx_hal::delay::Delay as embedded_hal::blocking::delay::DelayMs<u32>>::delay_ms (self=0x1000fc6c, ms=500)
at C:\Users\User\.cargo\registry\src\github.com-1ecc6299db9ec823\stm32f4xx-hal-0.8.3\src/delay.rs:32
#3 0x00000318 in test_blink::__cortex_m_rt_main () at src\main.rs:40
#4 0x000001f6 in main () at src\main.rs:15
memory.x file:
MEMORY
{
/* NOTE 1 K = 1 KiBi = 1024 bytes */
/* TODO Adjust these memory regions to match your device memory layout */
/* These values correspond to the LM3S6965, one of the few devices QEMU can emulate */
CCMRAM : ORIGIN = 0x10000000, LENGTH = 64K
RAM : ORIGIN = 0x20000000, LENGTH = 128K
FLASH : ORIGIN = 0x00000000, LENGTH = 512K
}
/* This is where the call stack will be allocated. */
/* The stack is of the full descending type. */
/* You may want to use this variable to locate the call stack and static
variables in different memory regions. Below is shown the default value */
_stack_start = ORIGIN(CCMRAM) + LENGTH(CCMRAM);
/* You can use this symbol to customize the location of the .text section */
/* If omitted the .text section will be placed right after the .vector_table
section */
/* This is required only on microcontrollers that store some configuration right
after the vector table */
/* _stext = ORIGIN(FLASH) + 0x400; */
/* Example of putting non-initialized variables into custom RAM locations. */
/* This assumes you have defined a region RAM2 above, and in the Rust
sources added the attribute `#[link_section = ".ram2bss"]` to the data
you want to place there. */
/* Note that the section will not be zero-initialized by the runtime! */
/* SECTIONS {
.ram2bss (NOLOAD) : ALIGN(4) {
*(.ram2bss);
. = ALIGN(4);
} > RAM2
} INSERT AFTER .bss;
*/
openocd.cfg file:
# Sample OpenOCD configuration for the STM32F3DISCOVERY development board
# Depending on the hardware revision you got you'll have to pick ONE of these
# interfaces. At any time only one interface should be commented out.
# Revision C (newer revision)
source [find interface/stlink.cfg]
# Revision A and B (older revisions)
# source [find interface/stlink-v2.cfg]
source [find target/stm32f4x.cfg]
# use hardware reset, connect under reset
# reset_config none separate
main.rs file:
#![no_main]
#![no_std]
#![allow(unsafe_code)]
// Halt on panic
#[allow(unused_extern_crates)] // NOTE(allow) bug rust-lang/rust#53964
extern crate panic_halt; // panic handler
use cortex_m;
use cortex_m_rt::entry;
use stm32f4xx_hal as hal;
use crate::hal::{prelude::*, stm32};
#[entry]
fn main() -> ! {
if let (Some(dp), Some(cp)) = (
stm32::Peripherals::take(),
cortex_m::peripheral::Peripherals::take(),
) {
let rcc = dp.RCC.constrain();
let clocks = rcc
.cfgr
.sysclk(168.mhz())
.freeze();
let mut delay = hal::delay::Delay::new(cp.SYST, clocks);
let gpioa = dp.GPIOA.split();
let mut l1 = gpioa.pa6.into_push_pull_output();
let mut l2 = gpioa.pa7.into_push_pull_output();
loop {
l1.set_low().unwrap();
l2.set_high().unwrap();
delay.delay_ms(500u32);
l1.set_high().unwrap();
l2.set_low().unwrap();
delay.delay_ms(500u32);
}
}
loop {}
}
Cargo.toml file:
[package]
name = "test_blink"
version = "0.1.0"
authors = ["Alex"]
edition = "2018"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
embedded-hal = "0.2"
nb = "0.1.2"
cortex-m = "0.6"
cortex-m-rt = "0.6"
# Panic behaviour, see https://crates.io/keywords/panic-impl for alternatives
panic-halt = "0.2"
cortex-m-log="0.6.2"
[dependencies.stm32f4xx-hal]
version = "0.8.3"
features = ["rt", "stm32f407"]
I am new to rust embedded and maybe I have done something wrong, but I have already tried all the options I can find on the Internet.
At first I thought it was a problem with the cortex-debug plugin for vscode and even created the issue, but the guys couldn't help me because the problem is obviously not on their side.
Debugging "C" code in cubeIDE works, so I dare to assume that the problem is somewhere in rust--gdb--openocd. Perhaps I am missing something, but unfortunately I cannot find it myself yet.
I would appreciate any resources or ideas to solve this problem.
I'm hoping you checked out this resources:
Discovery - debug
From your screen-grab of arm-none-eabi-gdb it does indeed look it it did not hit the break point.
you should have seen this message afterwards:
Note: automatically using hardware breakpoints for read-only addresses.
Breakpoint 1, main () at ...
Did you compile your source with symbols, and unoptimised?
Your config all looks right to me otherwise.

How can I put some memory section in particular memory

I am trying to understand the GCC link script and create a small demo to practice , however I got the "syntax error" from ld. I appreciate any comments or suggestions. Thank you so much!
hello.c
__attribute__((section(".testsection"))) volatile int testVariable;
hello.ld
MEMORY {
TEST_SECTION: ORIGIN = 0x43840000 , LENGTH = 0x50
}
SECTIONS{
.testsection: > TEST_SECTION /* Syntax error here*/
}
compile command
gcc -T hello.ld -o hello hello.o
c:/gnu/bin/../lib/gcc/arm-none-eabi/9.2.1/../../../../arm-none-eabi/bin/ld.exe:../hello.ld:20: syntax error
collect2.exe: error: ld returned 1 exit status
Original post
MEMORY {
TEST_SECTION: ORIGIN = 0x43840000 LENGTH = 0x50
}
SECTIONS{
.testsection: > TEST_SECTION
}
Solution
MEMORY {
TEST_SECTION: ORIGIN = 0x43840000, LENGTH = 0x50
}
SECTIONS{
.testsection: { *(.testsection*) } > TEST_SECTION
}
I think it was missing two things.
First the comma between the origin value and LENGTH.
Second a section command was missing. I don't think the GNU linker script syntax allows for that, you probably need to put something in there anyway to have this be useful. You could try
.testsection: { } > TEST_SECTION
but my guess without trying it is that you wouldn't end up with anything in that memory.
Edit
After doing an experiment:
so.s
.text
add r0,r0,r0
add r1,r2,r3
add r1,r1,r1
add r2,r2,r2
so.ld
MEMORY
{
bob : ORIGIN = 0x30000000, LENGTH = 0x1000
ted : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
.hello : { *(.text*) } > bob
.world : { } > ted
.text : { } > ted
}
Disassembly of section .hello:
30000000 <.hello>:
30000000: e0800000 add r0, r0, r0
30000004: e0821003 add r1, r2, r3
30000008: e0811001 add r1, r1, r1
3000000c: e0822002 add r2, r2, r2
As I suspected, first I already new the names in MEMORY are whatever you want, as you saw as well, RAM, ROM, etc are not special names just can't use reserved names if any. I suspected that SECTIONS worked the same way the thing on the left is a new name you are creating for the output. This built and linked just fine (not a real program obviously just enough to get the tools to do a demonstration).
The linker did not complain, but at the same time since there was nothing actually placed in ted then nothing was placed in that memory space in the output binary. .hello and .text don't have to match, one is an output name one is an input name, very often for simple stuff folks make them match, why create a new name. Only did it here for demonstration purposes to show that by using the same name on the left does not automatically include the input section to the output.
I highly recommend doing experiments like this and using tools like readelf and objdump and others to examine what is going on (as well as reading the documentation for the scripting language).

Meaning of yywrap() in flex

What does this instructions mean in flex (lex) :
#define yywrap() 1
and this [ \t]+$
i find it in the code below:
(%%
[ \t]+ putchar('_');
[ \t]+%
%%
input "hello world"
output "hello_world"
)
According to The Lex & Yacc Page :
When the scanner receives an end-of-file indication from YY_INPUT, it then checks the yywrap() function. If yywrap() returns false (zero), then it is assumed that the function has gone ahead and set up yyin to point to another input file, and scanning continues. If it returns true (non-zero), then the scanner terminates, returning 0 to its caller. Note that in either case, the start condition remains unchanged; it does not revert to INITIAL.
The #define is used to simplify building the program (so that no -ll linkage option is needed).
Further reading:
What are lex and yacc?
Routines to reprocess input
6. How do Lex and YACC work internally (Lex and YACC primer/HOWTO)

FFTW R2C two-dimensional size parameters

I cannot get what size parameters are for fftwf_plan_dft_r2c_2d
Input: a N rows by M cols matrix
Output: a N rows by floor(M/2) + 1 cols matrix ?
Are parameters input or output size?
Tried to give input size. This is what GDB sais
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1676.0x768]
0x637eed72 in n1fv_8 () from C:\devfiles\bin\libfftw3f-3.dll
(gdb) backtrace
#0 0x637eed72 in n1fv_8 () from C:\devfiles\bin\libfftw3f-3.dll
#1 0x7c91a000 in ntdll!RtlpUnWaitCriticalSection ()
from C:\WINDOWS\system32\ntdll.dll
No other thread uses fftw at the time.
The context:
Herbs::MatrixStorage<float> spectrum_in(frame_a.nRowsGet(),frame_a.nColsGet());
Herbs::MatrixStorage<std::complex<float>>
spectrum_out(frame_a.nRowsGet(),frame_a.nColsGet()/2+1);
Check for heap corruption issued by possible alllocation mistakes in MatrixStorage class . The rows in the matrix is one large block. rowGet returns the pointer to the given row.
memset(spectrum_in.rowGet(0),0
,sizeof(float)*spectrum_in.nRowsGet()*spectrum_in.nColsGet());
memset(spectrum_out.rowGet(0),0
,sizeof(float)*spectrum_out.nRowsGet()*spectrum_out.nColsGet());
heapdump(); //Heap seams to be fine after these
Causes sigsevg
FFT::PlanFloat_2dR2C plan(spectrum_in,spectrum_out);
The plan constructor does the following
plan=FFT::PlanFloat_2dR2C::PlanFloat_2dR2C(Herbs::MatrixStorage<InputType>& buffer_in
,Herbs::MatrixStorage<OutputType>& buffer_out)
{
plan=fftwf_plan_dft_r2c_2d
(
buffer_in.nRowsGet()
,buffer_in.nColsGet()
,buffer_in.rowGet(0)
,(fftwf_complex*)buffer_out.rowGet(0)
,FFTW_MEASURE
);
}
EDIT:
I used a precompiled DLL instead. GCC may produce bad code on 32-bit Windows (From release notes):
Removed an archaic stack-alignment hack that was failing with gcc-4.7/i386. Added stack-alignment hack necessary for gcc on Windows/i386. We will regret this in ten years (see previous change).
EDIT 2:
The dll became bad due to wrong options given to the configure script. The documentation is now updated.
You are correct. For an NxM float or double input you should allocate Nx(M/2+1) fftwf_complex or fftw_complex output.
The parameters are the input size. For example,
#include "fftw3.h"
#define Width 1024
#define Height 768
int main(){
float input[Width*Height];
fftwf_complex *fft = new fftwf_complex[((Width/2)+1)*Height];
fftwf_plan fplan = fftwf_plan_dft_r2c_2d(Height, Width,
(float*)input, fft, FFTW_ESTIMATE);
fftwf_execute(fplan);
fftwf_destroy_plan(fplan);
}
See Chapter 4.3 Basic Interface of the FFTW User Manual

CLR IL-significance of square bracket on .locals init

I am trying to generate a dynamic assembly using Reflection & Emit in .NET. I am getting an error, "Common Language Runtime detected an invalid program." I created another program which has the functionality I want using hard-coded types. The functionality I am trying to write will ultimately use dynamic types, but I can use ILDasm to see the IL I need to generate. I am comparing the IL I am generating with the IL which the compiler generates. In the .locals init declaration of one method I see there is an extra item in the compiler-generated code,
compiler-generated:
.locals init ([0] class [System.Core]System.Linq.Expressions.ParameterExpression CS$0$0000,
[1] class [System.Core]System.Linq.Expressions.ParameterExpression[] CS$0$0001)
mine:
.locals init (class [System.Core]System.Linq.Expressions.ParameterExpression V_0,
class [System.Core]System.Linq.Expressions.ParameterExpression[] V_1)
I don't understand the significance of the "[0]" and "[1]" in the compiler-generated code. Can anyone tell me what it means?
As a more general question, I can follow most ILDasm output without too much trouble. But every so often I run across a problematic expression. For instance, in this line from ILDasm
callvirt instance class [EntityFramework]System.Data.Entity.ModelConfiguration.EntityTypeConfiguration`1<!!0> [EntityFramework]System.Data.Entity.DbModelBuilder::Entity<class DynamicEdmxTrial.HardFooAsset>()
the "!!0" probably refers to the generic type of the Entity<>, but I don't know for sure, and I wonder if there is a key to ILDasm output that would explain its more obscure output to me.
The specification is freely available here. It takes a little getting used to, but most details are easily found once you figure out the structure.
!! is listed in II.7.1 Types:
Type ::= | Description | Clause
‘!’ Int32 | Generic parameter in a type definition, | §II.9.1
| accessed by index from 0 |
| ‘!!’ Int32 | Generic parameter in a method | §II.9.2
| definition, accessed by index from 0 |
...
In other words, inside a method that C# would call f<T, U>(), !!0 would be T, and !!1 would be U.
However, the [0] is a good question. The spec does not seem to address it. The .locals directive is described in II.15.4.1.3 The .locals directive, which lists the syntax as
MethodBodyItem ::= ...
| .locals [ init ] ‘(’ LocalsSignature ‘)’
LocalsSignature ::= Local [ ‘,’ Local ]*
Local ::= Type [ Id ]
There is nothing that seems to allow [0] there unless it is part of a Type, and Type does not allow anything starting with [ either. My guess is that this is an undocumented peculiarity specific to Microsoft's implementation, intended to help the human reader see that location 0 is local variable CS$0$0000, for when the generated instructions access local variables by index.
Experimenting with ILAsm shows that this is exactly what it means. Taking a simple C# program:
static class Program {
static void Main() {
int i = 0, j = 1;
}
}
and compiling and then disassembling it (csc test.cs && ildasm /text test.exe >test.il) shows:
....
.locals init (int32 V_0,
int32 V_1)
IL_0000: nop
IL_0001: ldc.i4.0
IL_0002: stloc.0
IL_0003: ldc.i4.1
IL_0004: stloc.1
IL_0005: ret
....
Modifying the .locals to
.locals init ([0] int32 V_0, [0] int32 V_1)
gives a useful warning message:
test.il(41) : warning : Local var slot 0 is in use
And indeed, declaring variables of different types, then reordering them using [2], [1], [0], assembling and immediately disassembling the result, shows that the variables got reordered.

Resources