HLS: How to separate AXI4 signals - fpga

I am trying to write a module that uses the AXI4 streaming protocol to communicate with the previous and next modules. The modules use the following communication signals:
TDATA, which is 16 bits,
TKEEP, which is 2 bits,
TUSER, which is 1 bit,
TVALID, which is 1 bit,
TREADY, which is 1 bit and goes towards the previous module, and
TLAST, which is 1 bit.
These all need to be separate signals. I tried to implement it using the following code:
#include "core.h"
void core_module(hls::stream<ap_axis_str> &input_stream, hls::stream<ap_axis_str> &output_stream){
#pragma HLS INTERFACE axis port=input_stream
#pragma HLS INTERFACE axis port=output_stream
#pragma HLS INTERFACE s_axilite port=return bundle=CTRL
ap_axis_str strm_val_in;
ap_axis_str strm_val_out;
for (int i = 0; i<NDATA; i++){
strm_val_in = input_stream.read();
strm_val_out.data = strm_val_in.data * 2;
strm_val_out.keep = 3;
strm_val_out.valid = 1;
strm_val_in.ready = 1;
strm_val_out.user = ((i%2)==0);
strm_val_out.last = (i == NDATA-1) ? 1:0;
output_stream.write(strm_val_out);
}
}
where the header file is
#ifndef core_h
#define core_h
#include <ap_int.h>
#include <ap_axi_sdata.h>
#include <hls_stream.h>
typedef ap_uint<16> word;
#define NDATA 10
struct ap_axis_str {
word data;
ap_uint<2> keep;
bool user;
bool last;
bool ready;
bool valid;
};
void core_module(hls::stream<ap_axis_str> &input_stream, hls::stream<ap_axis_str> &output_stream);
#endif
The problem is that this doesn't separate the signals. When I synthesise it and run it in the co-simulation (giving it values from 0 to 9), even if the result is what I expect it to be, the waveform produced looks like this:
We can see that TREADY, TVALID, and TDATA are there, but not the other 3. Furthermore, looking at the contents of TDATA (which for some reason are 64 bits) we notice that they contain all the signals. They are the following:
0001000001030000,
0001000000030002,
0001000001030004,
0001000000030006,
...
000100000003000c, (they are in base 16)
0001000001030010,
0001000100030012.
From which we can see that the 3 in position 12 is probably what was intended to be TKEEP, the 1 in position 8 which only appears in the last case is probably what was intended to be TUSER, the last 4 digits are what was supposed to be TDATA, etc. Additionally, the program drops TREADY when it isn't ready to receive data, which is what is intended of TREADY, but I didn't program it to work this way, which means that it's automatically generated and probably has nothing to do with the TREADY I told it to have.
So my question is: How do I make a module that gives out the correct 6 separate signals for the version of the AXI4 protocol that we are using?

Well, according to the Xilinx Documentation,
If you specify an hls::stream object with a data type other than ap_axis or ap_axiu, the tool will infer an AXI4-Stream interface without the TLAST signal, or any of the side-channel signals. This implementation of the AXI4-Stream interface consumes fewer device resources, but offers no visibility into when the stream is ending.
Now I had already imported the needed module with#include <ap_axi_sdata.h>, all I needed to do was actually use it by removing
struct ap_axis_str {
word data;
ap_uint<2> keep;
bool user;
bool last;
bool ready;
bool valid;
};
and replacing it with
typedef ap_axiu<16, 1, 0, 0> ap_axis_str;
Additionally, I needed to remove my manual attempt to control TREADY and TVALID, as those are done automatically.

Related

Use field of union in intialiser of another variable

I am building records for unit testing a software module. Record data is serialised before sending it to the UUT.
The records contain bitfields, so I would like to build serialised records using these same bitfields at compile-time (to prevent having to account for little- and big-endian issues and where the bits in bitfield go) and use a union to access the (serialised) data. I have to calculate a checksum over the record, so I need the bitfields as bytes to do so.
My attempt so far is:
/* defines for 64 bit valid record */
#define REC3_ID EEID_ARRAY_FIRST
#define REC3_SIZE 1
#define REC3_INDEX 248
#define REC3_SI0 MAKE_SIZE_INDEX0(REC3_SIZE,REC3_INDEX)
#define REC3_SI1 MAKE_SIZE_INDEX1(REC3_SIZE,REC3_INDEX)
#define REC3_VALUE0 0xf2
#define REC3_VALUE1 0x4f
#define REC3_VALUE2 0xb8
#define REC3_VALUE3 0xa0
#define REC3_DATA \
MAKE_CHKSUM7(REC3_ID,REC3_SI0,REC3_SI1,REC3_VALUE0,REC3_VALUE1,REC3_VALUE2,REC3_VALUE3),\
REC3_ID,REC3_SI0,REC3_SI1,REC3_VALUE0,REC3_VALUE1,REC3_VALUE2,REC3_VALUE3
#define CHKSUM_SEED (0x2a)
#define MAKE_CHKSUM7(v0,v1,v2,v3,v4,v5,v6) (0x100-(((v0)+(v1)+(v2)+(v3)+(v4)+(v5)+(v6)+CHKSUM_SEED)%0x100))
typedef union
{
uint8_t si[2];
struct
{
uint16_t s: 6;
uint16_t i: 10;
} b;
} si_t;
MAKE_SIZE_INDEX0(size,index) ((si_t){.b.s=size,.b.i=index}).si[0]
MAKE_SIZE_INDEX1(size,index) ((si_t){.b.s=size,.b.i=index}).si[1]
static uint8_t rec3[] = {REC3_DATA};
The problem lies with the macros MAKE_SIZE_INDEX0 and MAKE_SIZE_INDEX1. I can't get them to compile (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.11))
The problem can be simplified to:
uint8_t rec[] = {0x12, (si_t){.b.s=4,.b.i=8}.si[0], (si_t){.b.s=4,.b.i=8}.si[1], 0x34};
But that results in error:
error: initializer element is not constant
I know I can create my records at run-time, but I wondered whether it is possible to let the preprocessor handle it.
My alternative is something like:
#if defined (TGT_ARCHITECTURE_x86_64)
#define MAKE_SIZE_INDEX0(size,index) (((size)&0x3f)+(((index)<<6)&0xc0))
#define MAKE_SIZE_INDEX1(size,index) ((index>>2)&0xff)
#endif
But this depends on whether the target is little-endian or big-endian and how it stores bitfields.
static uint8_t rec3[] = { (si_t){.b.s=4,.b.i=8}.si[0] };
Variables with static storage duration must be initialized only using a static initializer - it has to be a constant expression. There is a list of what is allowed in a constant expression - using array subscript operator on a array embedded in a compound literal is not allowed in a constant expression. It's basically the same as you can't do static int a[] = {1, 2}; static int b = a[1];
On a side note, the standard says that implementations are allowed to accept custom forms of constant expression. So the code may happen to work with a different compiler and even a different gcc version (as with newest gcc versions you may initialize variable with static storage duration with const qualived variable, which is an extension).
The compiler errors with "initializer element is not constant", as the element used to initialize variable with static storage duration is not a constant expression.
Using bit-fields to extract a bit-mask of a variable is compiler dependent, compiler options dependent (gcc storage layout) and shouldn't be used in portable code. Compiler is free to reorder the bitfields in your struct and is free to add padding between bit fields members. As advertised on stackoverflow many, many times, use bitmasks - they work every time.

Trap memory accesses inside a standard executable built with MinGW

So my problem sounds like this.
I have some platform dependent code (embedded system) which writes to some MMIO locations that are hardcoded at specific addresses.
I compile this code with some management code inside a standard executable (mainly for testing) but also for simulation (because it takes longer to find basic bugs inside the actual HW platform).
To alleviate the hardcoded pointers, i just redefine them to some variables inside the memory pool. And this works really well.
The problem is that there is specific hardware behavior on some of the MMIO locations (w1c for example) which makes "correct" testing hard to impossible.
These are the solutions i thought of:
1 - Somehow redefine the accesses to those registers and try to insert some immediate function to simulate the dynamic behavior. This is not really usable since there are various ways to write to the MMIO locations (pointers and stuff).
2 - Somehow leave the addresses hardcoded and trap the illegal access through a seg fault, find the location that triggered, extract exactly where the access was made, handle and return. I am not really sure how this would work (and even if it's possible).
3 - Use some sort of emulation. This will surely work, but it will void the whole purpose of running fast and native on a standard computer.
4 - Virtualization ?? Probably will take a lot of time to implement. Not really sure if the gain is justifiable.
Does anyone have any idea if this can be accomplished without going too deep? Maybe is there a way to manipulate the compiler in some way to define a memory area for which every access will generate a callback. Not really an expert in x86/gcc stuff.
Edit: It seems that it's not really possible to do this in a platform independent way, and since it will be only windows, i will use the available API (which seems to work as expected). Found this Q here:
Is set single step trap available on win 7?
I will put the whole "simulated" register file inside a number of pages, guard them, and trigger a callback from which i will extract all the necessary info, do my stuff then continue execution.
Thanks all for responding.
I think #2 is the best approach. I routinely use approach #4, but I use it to test code that is running in the kernel, so I need a layer below the kernel to trap and emulate the accesses. Since you have already put your code into a user-mode application, #2 should be simpler.
The answers to this question may provide help in implementing #2. How to write a signal handler to catch SIGSEGV?
What you really want to do, though, is to emulate the memory access and then have the segv handler return to the instruction after the access. This sample code works on Linux. I'm not sure if the behavior it is taking advantage of is undefined, though.
#include <stdint.h>
#include <stdio.h>
#include <signal.h>
#define REG_ADDR ((volatile uint32_t *)0x12340000f000ULL)
static uint32_t read_reg(volatile uint32_t *reg_addr)
{
uint32_t r;
asm("mov (%1), %0" : "=a"(r) : "r"(reg_addr));
return r;
}
static void segv_handler(int, siginfo_t *, void *);
int main()
{
struct sigaction action = { 0, };
action.sa_sigaction = segv_handler;
action.sa_flags = SA_SIGINFO;
sigaction(SIGSEGV, &action, NULL);
// force sigsegv
uint32_t a = read_reg(REG_ADDR);
printf("after segv, a = %d\n", a);
return 0;
}
static void segv_handler(int, siginfo_t *info, void *ucontext_arg)
{
ucontext_t *ucontext = static_cast<ucontext_t *>(ucontext_arg);
ucontext->uc_mcontext.gregs[REG_RAX] = 1234;
ucontext->uc_mcontext.gregs[REG_RIP] += 2;
}
The code to read the register is written in assembly to ensure that both the destination register and the length of the instruction are known.
This is how the Windows version of prl's answer could look like:
#include <stdint.h>
#include <stdio.h>
#include <windows.h>
#define REG_ADDR ((volatile uint32_t *)0x12340000f000ULL)
static uint32_t read_reg(volatile uint32_t *reg_addr)
{
uint32_t r;
asm("mov (%1), %0" : "=a"(r) : "r"(reg_addr));
return r;
}
static LONG WINAPI segv_handler(EXCEPTION_POINTERS *);
int main()
{
SetUnhandledExceptionFilter(segv_handler);
// force sigsegv
uint32_t a = read_reg(REG_ADDR);
printf("after segv, a = %d\n", a);
return 0;
}
static LONG WINAPI segv_handler(EXCEPTION_POINTERS *ep)
{
// only handle read access violation of REG_ADDR
if (ep->ExceptionRecord->ExceptionCode != EXCEPTION_ACCESS_VIOLATION ||
ep->ExceptionRecord->ExceptionInformation[0] != 0 ||
ep->ExceptionRecord->ExceptionInformation[1] != (ULONG_PTR)REG_ADDR)
return EXCEPTION_CONTINUE_SEARCH;
ep->ContextRecord->Rax = 1234;
ep->ContextRecord->Rip += 2;
return EXCEPTION_CONTINUE_EXECUTION;
}
So, the solution (code snippet) is as follows:
First of all, i have a variable:
__attribute__ ((aligned (4096))) int g_test;
Second, inside my main function, i do the following:
AddVectoredExceptionHandler(1, VectoredHandler);
DWORD old;
VirtualProtect(&g_test, 4096, PAGE_READWRITE | PAGE_GUARD, &old);
The handler looks like this:
LONG WINAPI VectoredHandler(struct _EXCEPTION_POINTERS *ExceptionInfo)
{
static DWORD last_addr;
if (ExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_GUARD_PAGE_VIOLATION) {
last_addr = ExceptionInfo->ExceptionRecord->ExceptionInformation[1];
ExceptionInfo->ContextRecord->EFlags |= 0x100; /* Single step to trigger the next one */
return EXCEPTION_CONTINUE_EXECUTION;
}
if (ExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_SINGLE_STEP) {
DWORD old;
VirtualProtect((PVOID)(last_addr & ~PAGE_MASK), 4096, PAGE_READWRITE | PAGE_GUARD, &old);
return EXCEPTION_CONTINUE_EXECUTION;
}
return EXCEPTION_CONTINUE_SEARCH;
}
This is only a basic skeleton for the functionality. Basically I guard the page on which the variable resides, i have some linked lists in which i hold pointers to the function and values for the address in question. I check that the fault generating address is inside my list then i trigger the callback.
On first guard hit, the page protection will be disabled by the system, but i can call my PRE_WRITE callback where i can save the variable state. Because a single step is issued through the EFlags, it will be followed immediately by a single step exception (which means that the variable was written), and i can trigger a WRITE callback. All the data required for the operation is contained inside the ExceptionInformation array.
When someone tries to write to that variable:
*(int *)&g_test = 1;
A PRE_WRITE followed by a WRITE will be triggered,
When i do:
int x = *(int *)&g_test;
A READ will be issued.
In this way i can manipulate the data flow in a way that does not require modifications of the original source code.
Note: This is intended to be used as part of a test framework and any penalty hit is deemed acceptable.
For example, W1C (Write 1 to clear) operation can be accomplished:
void MYREG_hook(reg_cbk_t type)
{
/** We need to save the pre-write state
* This is safe since we are assured to be called with
* both PRE_WRITE and WRITE in the correct order
*/
static int pre;
switch (type) {
case REG_READ: /* Called pre-read */
break;
case REG_PRE_WRITE: /* Called pre-write */
pre = g_test;
break;
case REG_WRITE: /* Called after write */
g_test = pre & ~g_test; /* W1C */
break;
default:
break;
}
}
This was possible also with seg-faults on illegal addresses, but i had to issue one for each R/W, and keep track of a "virtual register file" so a bigger penalty hit. In this way i can only guard specific areas of memory or none, depending on the registered monitors.

Where can I find documentation on SPI2STATbits?

I've recently come across spi2statbits in the following function:
int WriteSPI2( int data)
{
int f;
SPI2BUF = data; // write to buffer for TX
while( !SPI2STATbits.SPIRBF)
{f=1;}// wait transfer completion
return SPI2BUF; // read the received value
} // WriteSPI2
I'm using the above in conjunction with a PIC24FJ128GA010 project.
I've been searching around to find more about SPI2STATbits but haven't found actual documentation. I assume this is a fairly basic requirement.
Can someone direct me to the correct documentation?
See page 130 of the datasheet , look for SPIxSTAT.
For the particular device, do a search for SPI2STATbits in all included header files.
It's in p24FJ128GA010.h as
#define SPI2STAT SPI2STAT extern volatile uint16_t SPI2STAT __attribute__((__sfr__)); __extension__ typedef struct tagSPI2STATBITS {...

Control relay from PIC18 microchip

I have a PIC18F24K20 microchip, and wants to control a relay. It works fine from my RasPI over GPIO - but i cant get it working trough my microchip.
My test program is this:
#include <xc.h>
#define R1 LATBbits.LATB0
#define R1_TRIS TRISBbits.RB0
#define R2 LATBbits.LATB1
#define R2_TRIS TRISBbits.RB1
void main(void) {
R1_TRIS = 0;
R2_TRIS = 0;
R1 = 1;
R2 = 0;
return;
}
What is im doing wrong?
replace the return;
with:
while(1)
{
ClrWdt();
}
according datasheet,RB0 and RB1 have several modules connected to these pins,so you should verify they are turned off:
Analog,
ECCP,
Comparator.
BTW why using two pins in order to control one relay?
3.you may need add driver in order to operat the relay.
according datasheet, add following initialization code:
CCP1CON=0;
CCP2CON=0;
ADCON0=0;
CM1CON0=0;
CM2CON0=0;
also PBADEN bit at configuration bit should be zero.
The main function should never return in the embedded PIC processors. In some implementations, it would cause a software reset which would cause your pins to go back to high impedance mode. Try adding while (1); at the end of your main.
Check if the used pins have other functions. The typical gotcha is that the pins double as analog pins and are enable by default.
Disable them by looking up which AN pin they correspond to in the datasheet and disable them with code like
ANSEL.ANS0 = 0;
ANSEL.ANS1 = 0;
If you enable watchdog functionality you also might want to add a
ClrWdt();
to the main WHILE loop (which was a good suggestion from Mathieu)

Setting register values in PIC16F876 using Hi Tech PICC

I am using MPLABx and the HI Tech PICC compiler. My target chip is a PIC16F876. By looking at the pic16f876.h include file, it appears that it should be possible to set the system registers of the chip by referring to them by name.
For example, within the CCP1CON register, bits 0 to 3 set how the CCP and PWM modules work. By looking at the pic16f876.h file, it looks like it should be possible to refer to these 4 bits alone, without change the value of the rest of the CCP1CON register.
However, I have tried to refer to these 4 bits in a variety of ways with no success.
I have tried;
CCP1CON.CCP1M=0xC0; this results in "error: struct/union required
CCP1CON:CCP1M=0xC0; this results in "error: undefined identifier "CCP1M"
but both have failed. I have read through the Hi Tech PICC compiler manual, but cannot see how to do this.
From the pic16f876.h file, it looks to me as though I should be able to refer to these subsets within the system registers by name, as they are defined in the .h file.
Does anyone know how to accomplish this?
Excerpt from pic16f876.h
// Register: CCP1CON
volatile unsigned char CCP1CON # 0x017;
// bit and bitfield definitions
volatile bit CCP1Y # ((unsigned)&CCP1CON*8)+4;
volatile bit CCP1X # ((unsigned)&CCP1CON*8)+5;
volatile bit CCP1M0 # ((unsigned)&CCP1CON*8)+0;
volatile bit CCP1M1 # ((unsigned)&CCP1CON*8)+1;
volatile bit CCP1M2 # ((unsigned)&CCP1CON*8)+2;
volatile bit CCP1M3 # ((unsigned)&CCP1CON*8)+3;
#ifndef _LIB_BUILD
volatile union {
struct {
unsigned CCP1M : 4;
unsigned CCP1Y : 1;
unsigned CCP1X : 1;
};
struct {
unsigned CCP1M0 : 1;
unsigned CCP1M1 : 1;
unsigned CCP1M2 : 1;
unsigned CCP1M3 : 1;
};
} CCP1CONbits # 0x017;
#endif
You need to access the bitfield members through an instance of a struct. In this case, that is CCP1CONbits. Because it is a bitfield, you only need to have the number of significant bits as defined in the bitfield, not the full eight bits in your code.
So:
CCP1CONbits.CCP1M = 0x0c;
Should be the equivalent of what you are trying to do. If you want to set all eight bits at once you can use CCP1CON = 0xc0. That would set the CCP1M bits to 0x0c and all the other bits to zero.
The header you gave also has individual bit symbols, so you could do this too:
CCP1M0 = 1;
CCP1M1 = 1;
CCP1M2 = 0;
CCP1M3 = 0;
Although the bitfield approach is cleaner.

Resources