kiss_fftr: different results on x86 vs. Atheros AR9331 (gcc)

This is my first question on Stack Overflow and my English is unfortunately poor, but I want to try.
A customized routine based on kissfft's twotonetest produces very different results on two different systems.
The program compiled with gcc under Ubuntu on x86 produces the correct values. The program built with the OpenWRT SDK for the Arduino YUN (Atheros AR9331) displays incorrect values. It seems as if the definition of FIXED_POINT is being ignored.
The following is defined:
#define FIXED_POINT 32
The function:
double GetFreqBuf( tBuf * io_pBuf, int nfft)
{
    kiss_fftr_cfg cfg = NULL;
    kiss_fft_cpx *kout = NULL;
    kiss_fft_scalar *tbuf = NULL;
    uint32_t ptr;
    int i;
    double sigpow = 0;
    double noisepow = 0;
    long maxrange = SHRT_MAX;

    cfg = kiss_fftr_alloc(nfft, 0, NULL, NULL);
    tbuf = KISS_FFT_MALLOC(nfft * sizeof(kiss_fft_scalar));
    kout = KISS_FFT_MALLOC(nfft * sizeof(kiss_fft_cpx));

    /* generate the array from samples */
    for (i = 0; i < nfft; i++) {
        // only one channel -- a crutch; it would also work with 2 channels now, but this way is faster
        if (io_pBuf->IndexNextValue >= (i * 2))
            ptr = io_pBuf->IndexNextValue - (i * 2);
        else
            ptr = io_pBuf->bufSize - ((i * 2) - io_pBuf->IndexNextValue);
        tbuf[i] = io_pBuf->aData[ptr];
    }

    kiss_fftr(cfg, tbuf, kout);

    for (i = 0; i < (nfft / 2 + 1); ++i) {
        double tmpr = (double)kout[i].r / (double)maxrange;
        double tmpi = (double)kout[i].i / (double)maxrange;
        double mag2 = tmpr * tmpr + tmpi * tmpi;
        if (i != 0 && i != nfft / 2)
            mag2 *= 2; /* all bins except DC and Nyquist have symmetric counterparts implied */
        /* if there is power between the frequencies, it is signal, otherwise noise */
        if (i > nfft / 96 && i < nfft / 32)
            noisepow += mag2;
        else
            sigpow += mag2;
    }
    kiss_fft_cleanup();
    //printf("TEST %d values, noisepow: %f sigpow: %f noise # %fdB\n", nfft, noisepow, sigpow, 10*log10(noisepow/sigpow + 1e-30));
    free(cfg);
    free(tbuf);
    free(kout);
    return 10 * log10(noisepow / sigpow + 1e-30);
}
Samples of 16-bit sound from the same file are used as input. Results differ, for example, from -3 dB to -15 dB. Where would you start troubleshooting?

Possibility #1 (most likely)
You are compiling kissfft.c or kiss_fftr.c with different preprocessor definitions than the calling code. This happens to a lot of people.
An easy way to force the same FIXED_POINT everywhere is to edit kiss_fft.h directly. Another option: verify with some printf debugging, i.e. place the following in various places:
printf(__FILE__ " sees sizeof(kiss_fft_scalar)=%d\n", (int)sizeof(kiss_fft_scalar));
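If you would rather have the build fail loudly on a mismatch, a compile-time guard works too. A minimal sketch, assuming a C11 toolchain and FIXED_POINT=32 (where kiss_fft_scalar should be a 32-bit integer type):

#include "kiss_fft.h"

/* Hedged guard: with FIXED_POINT=32, kiss_fft_scalar should be 4 bytes.
 * If this fires in one translation unit but not another, the flags differ. */
_Static_assert(sizeof(kiss_fft_scalar) == 4,
               "FIXED_POINT mismatch: expected a 32-bit kiss_fft_scalar");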
Possibility #2
Perhaps the FIXED_POINT=16 code works but the FIXED_POINT=32 code does not, because something is being handled incorrectly either inside kissfft or on the platform. The 32-bit fixed-point code relies on int64_t being implemented correctly.
Is that Atheros a 16-bit processor? I know kissfft has been used successfully on 16-bit platforms, but I'm not sure whether FIXED_POINT=32 real FFTs have been used on a 16-bit fixed-point platform.
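A quick way to test the int64_t path on the target is a widening-multiply check. This is just a sketch, independent of kissfft; the expected value is exact since (2^31 - 1)^2 = 2^62 - 2^32 + 1:

#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>

int main(void)
{
    /* FIXED_POINT=32 multiplies two 32-bit scalars into an int64_t product */
    int64_t p = (int64_t)INT32_MAX * INT32_MAX;
    printf("%" PRId64 " (expect 4611686014132420609)\n", p);
    return 0;
}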
Good luck,
Mark

Related

How do I enable GDB/GEF to allow me to see how the stack changes as I insert discrete input?

I am trying to identify the offset at which a buffer overflow occurs via pwntools and gdb. Here is the C code (x64):
int input[8];
int count, num;

count = 0;
while (1)
{
    printf("Enter:\n");
    scanf("%d", &num);
    if (num == -1) {
        break;
    } else {
        input[count++] = num;
    }
}
Understanding that the size of the integer is 4 bytes, I am attempting to feed the program a string of integers via pwntools (code below):
from pwn import *

context.log_level = "debug"
io = gdb.debug('_file_')
for i in range(0, 10, 1):
    io.clean()
    io.sendline("{:d}".format(i))
io.interactive()
However, I am having trouble finding the offset and trying to debug the program via gdb. I would like to be able to see changes to the stack as each integer is input (via ni or si). Is there a better way to identify where the program crashes?
I am using the for loop as a proxy for pattern create (with the hope to see which integer causes the crash).
Any insights would be greatly appreciated!

Getting TSC rate from x86 kernel

I have an embedded Linux system running on an Atom, which is a new enough CPU to have an invariant TSC (time stamp counter), whose frequency the kernel measures on startup. I use the TSC in my own code to keep time (avoiding kernel calls), and my startup code measures the TSC rate, but I'd rather just use the kernel's measurement. Is there any way to retrieve this from the kernel? It's not in /proc/cpuinfo anywhere.
BPFtrace
As root, you can retrieve the kernel's TSC rate with bpftrace:
# bpftrace -e 'BEGIN { printf("%u\n", *kaddr("tsc_khz")); exit(); }' | tail -n1
(tested it on CentOS 7 and Fedora 29)
That is the value that is defined, exported and maintained/calibrated in arch/x86/kernel/tsc.c.
GDB
Alternatively, also as root, you can also read it from /proc/kcore, e.g.:
# gdb /dev/null /proc/kcore -ex 'x/uw 0x'$(grep '\<tsc_khz\>' /proc/kallsyms \
| cut -d' ' -f1) -batch 2>/dev/null | tail -n 1 | cut -f2
(tested it on CentOS 7 and Fedora 29)
SystemTap
If the system has neither bpftrace nor gdb available, but does have SystemTap, you can get it like this (as root):
# cat tsc_khz.stp
#!/usr/bin/stap -g
function get_tsc_khz() %{ /* pure */
    THIS->__retvalue = tsc_khz;
%}
probe oneshot {
    printf("%u\n", get_tsc_khz());
}
# ./tsc_khz.stp
Of course, you can also write a small kernel module that provides access to tsc_khz via the /sys pseudo file system. Even better, somebody already did that and a tsc_freq_khz module is available on GitHub. With that the following should work:
# modprobe tsc_freq_khz
$ cat /sys/devices/system/cpu/cpu0/tsc_freq_khz
(tested on Fedora 29, reading the sysfs file doesn't require root)
Kernel Messages
In case nothing of the above is an option you can parse the TSC rate from the kernel logs. But this gets ugly fast because you see different kinds of messages on different hardware and kernels, e.g. on a Fedora 29 i7 system:
$ journalctl --boot | grep 'kernel: tsc:' -i | cut -d' ' -f5-
kernel: tsc: Detected 2800.000 MHz processor
kernel: tsc: Detected 2808.000 MHz TSC
But on a Fedora 29 Intel Atom just:
kernel: tsc: Detected 2200.000 MHz processor
While on a CentOS 7 i5 system:
kernel: tsc: Fast TSC calibration using PIT
kernel: tsc: Detected 1895.542 MHz processor
kernel: tsc: Refined TSC clocksource calibration: 1895.614 MHz
Perf Values
The Linux Kernel doesn't provide an API to read the TSC rate, yet. But it does provide one for getting the mult and shift values that can be used to convert TSC counts to nanoseconds. Those values are derived from tsc_khz - also in arch/x86/kernel/tsc.c - where tsc_khz is initialized and calibrated. And they are shared with userspace.
Example program that uses the perf API and accesses the shared page:
#include <asm/unistd.h>
#include <inttypes.h>
#include <linux/perf_event.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>
static long perf_event_open(struct perf_event_attr *hw_event, pid_t pid,
                            int cpu, int group_fd, unsigned long flags)
{
    return syscall(__NR_perf_event_open, hw_event, pid, cpu, group_fd, flags);
}
The actual code:
int main(int argc, char **argv)
{
    struct perf_event_attr pe = {
        .type = PERF_TYPE_HARDWARE,
        .size = sizeof(struct perf_event_attr),
        .config = PERF_COUNT_HW_INSTRUCTIONS,
        .disabled = 1,
        .exclude_kernel = 1,
        .exclude_hv = 1
    };
    int fd = perf_event_open(&pe, 0, -1, -1, 0);
    if (fd == -1) {
        perror("perf_event_open failed");
        return 1;
    }
    void *addr = mmap(NULL, 4*1024, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) { /* mmap returns MAP_FAILED, not NULL, on error */
        perror("mmap failed");
        return 1;
    }
    struct perf_event_mmap_page *pc = addr;
    if (pc->cap_user_time != 1) {
        fprintf(stderr, "Perf system doesn't support user time\n");
        return 1;
    }
    printf("%16s %5s\n", "mult", "shift");
    printf("%16" PRIu32 " %5" PRIu16 "\n", pc->time_mult, pc->time_shift);
    close(fd);
    return 0;
}
Tested on Fedora 29; it also works for non-root users.
Those values can be used to convert a TSC count to nanoseconds with a function like this one:
static uint64_t mul_u64_u32_shr(uint64_t cyc, uint32_t mult, uint32_t shift)
{
    __uint128_t x = cyc;
    x *= mult;
    x >>= shift;
    return x;
}
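For completeness, here is a hypothetical way those values could be wired up with the TSC itself. This assumes x86_64 with GCC/Clang's __rdtsc() intrinsic and reuses mul_u64_u32_shr from above; note that perf's full clock conversion also involves time_offset/time_zero, which this sketch ignores for simple deltas:

#include <x86intrin.h>  /* __rdtsc() */

/* Hypothetical helper: scale a TSC delta to nanoseconds using the
 * time_mult/time_shift values read from the perf mmap page above. */
static uint64_t tsc_delta_ns(uint64_t t0, uint64_t t1,
                             uint32_t mult, uint32_t shift)
{
    return mul_u64_u32_shr(t1 - t0, mult, shift);
}

/* usage: uint64_t t0 = __rdtsc(); ...work...; uint64_t t1 = __rdtsc();
 *        uint64_t ns = tsc_delta_ns(t0, t1, pc->time_mult, pc->time_shift); */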
CPUID/MSR
Another way to obtain the TSC rate is to follow DPDK's lead.
DPDK on x86_64 basically uses the following strategy:
1. Read the 'Time Stamp Counter and Nominal Core Crystal Clock Information Leaf' via cpuid intrinsics (doesn't require special privileges), if available
2. Read it from the MSR (requires the rawio capability and read permissions on /dev/cpu/*/msr), if possible
3. Calibrate it in userspace by other means, otherwise
FWIW, a quick test shows that this cpuid leaf doesn't seem to be that widely available yet, e.g. an i7 Skylake and a Goldmont Atom don't have it. Otherwise, as can be seen from the DPDK code, using the MSR requires a bunch of intricate case distinctions.
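For illustration, reading that leaf with GCC/Clang's cpuid.h looks something like the sketch below. It only works on CPUs that actually report all three leaf-0x15 values, which, as noted, many don't:

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Leaf 0x15 exists only if the maximum basic leaf is >= 0x15 */
    if (__get_cpuid_max(0, NULL) < 0x15) {
        fprintf(stderr, "CPUID leaf 0x15 not available\n");
        return 1;
    }
    __cpuid(0x15, eax, ebx, ecx, edx);
    /* eax = ratio denominator, ebx = ratio numerator, ecx = crystal Hz */
    if (eax == 0 || ebx == 0 || ecx == 0) {
        fprintf(stderr, "CPU doesn't report all leaf 0x15 values\n");
        return 1;
    }
    printf("TSC rate: %llu Hz\n", (unsigned long long)ecx * ebx / eax);
    return 0;
}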
However, in case the program already uses DPDK, getting the TSC rate, getting TSC values or converting TSC values is just a matter of using the right DPDK API.
I had a brief look and there doesn't seem to be a built-in way to directly get this information from the kernel.
However, the symbol tsc_khz (which I'm guessing is what you want) is exported by the kernel. You could write a small kernel module that exposes a sysfs interface and use that to read out the value of tsc_khz from userspace.
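Such a module can be quite small. The following is an untested sketch (it assumes an x86 kernel, where arch/x86/kernel/tsc.c exports tsc_khz) that exposes the value as /sys/kernel/tsc_khz:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kobject.h>
#include <linux/sysfs.h>

extern unsigned int tsc_khz; /* exported by arch/x86/kernel/tsc.c */

static ssize_t tsc_khz_show(struct kobject *kobj,
                            struct kobj_attribute *attr, char *buf)
{
    return sprintf(buf, "%u\n", tsc_khz);
}

static struct kobj_attribute tsc_khz_attr = __ATTR_RO(tsc_khz);

static int __init tsc_khz_init(void)
{
    /* creates /sys/kernel/tsc_khz */
    return sysfs_create_file(kernel_kobj, &tsc_khz_attr.attr);
}

static void __exit tsc_khz_exit(void)
{
    sysfs_remove_file(kernel_kobj, &tsc_khz_attr.attr);
}

module_init(tsc_khz_init);
module_exit(tsc_khz_exit);
MODULE_LICENSE("GPL");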
If writing a kernel module is not an option, it may be possible to use some Dark Magic™ to read out the value directly from the kernel memory space. Parse the kernel binary or System.map file to find the location of the tsc_khz symbol and read it from /dev/{k}mem. This is, of course, only possible provided that the kernel is configured with the appropriate options.
Lastly, from reading the kernel source comments, it looks like there's a possibility that the TSC may be unstable on some platforms. I don't know much about the inner workings of the x86 arch but this may be something you want to take into consideration.
The TSC rate is directly related to "cpu MHz" in /proc/cpuinfo. Actually, the better number to use is "bogomips". The reason is that while the TSC frequency is the maximum CPU frequency, the current "cpu MHz" can vary at the time of your invocation.
The bogomips value is computed at boot. You'll need to adjust this value by the number of cores and the processor count (i.e. the number of hyperthreads). That gives you [fractional] MHz. That is what I use to do what you want to do.
To get the processor count, look for the last "processor: " line. The processor count is <value> + 1. Call it "cpu_count".
To get the number of cores, any "cpu cores: " line works; the number of cores is <value>. Call it "core_count".
So, the formula is:
smt_count = cpu_count;
if (core_count)
    smt_count /= core_count;

cpu_freq_in_khz = (bogomips * scale_factor) / smt_count;
That is extracted from my actual code, which is below.
Here's the actual code I use. You won't be able to use it directly because it relies on boilerplate I have, but it should give you some ideas, particularly with how to compute the kHz value from /proc/cpuinfo.
// syslgx/tvtsc -- system time routines (RDTSC)
#include <tgb.h>
#include <zprt.h>
tgb_t systvinit_tgb[] = {
    { .tgb_val = 1, .tgb_tag = "cpu_mhz" },
    { .tgb_val = 2, .tgb_tag = "bogomips" },
    { .tgb_val = 3, .tgb_tag = "processor" },
    { .tgb_val = 4, .tgb_tag = "cpu_cores" },
    { .tgb_val = 5, .tgb_tag = "clflush_size" },
    { .tgb_val = 6, .tgb_tag = "cache_alignment" },
    TGBEOT
};
// _systvinit -- get CPU speed
static void
_systvinit(void)
{
    const char *file;
    const char *dlm;
    XFIL *xfsrc;
    int matchflg;
    char *cp;
    char *cur;
    char *rhs;
    char lhs[1000];
    tgb_pc tgb;
    syskhz_t khzcpu;
    syskhz_t khzbogo;
    syskhz_t khzcur;
    sysmpi_p mpi;

    file = "/proc/cpuinfo";
    xfsrc = fopen(file,"r");
    if (xfsrc == NULL)
        sysfault("systvinit: unable to open '%s' -- %s\n",file,xstrerror());

    dlm = " \t";
    khzcpu = 0;
    khzbogo = 0;

    mpi = &SYS->sys_cpucnt;
    SYSZAPME(mpi);

    // (1) look for "cpu MHz : 3192.515" (preferred)
    // (2) look for "bogomips : 3192.51" (alternate)
    // FIXME/CAE -- on machines with speed-step, bogomips may be preferred (or
    // disable it)
    while (1) {
        cp = fgets(lhs,sizeof(lhs),xfsrc);
        if (cp == NULL)
            break;

        // strip newline
        cp = strchr(lhs,'\n');
        if (cp != NULL)
            *cp = 0;

        // look for symbol value divider
        cp = strchr(lhs,':');
        if (cp == NULL)
            continue;

        // split symbol and value
        *cp = 0;
        rhs = cp + 1;

        // strip trailing whitespace from symbol
        for (cp -= 1; cp >= lhs; --cp) {
            if (! XCTWHITE(*cp))
                break;
            *cp = 0;
        }

        // convert "foo bar" into "foo_bar"
        for (cp = lhs; *cp != 0; ++cp) {
            if (XCTWHITE(*cp))
                *cp = '_';
        }

        // match on interesting data
        matchflg = 0;
        for (tgb = systvinit_tgb; TGBMORE(tgb); ++tgb) {
            if (strcasecmp(lhs,tgb->tgb_tag) == 0) {
                matchflg = tgb->tgb_val;
                break;
            }
        }
        if (! matchflg)
            continue;

        // look for the value
        cp = strtok_r(rhs,dlm,&cur);
        if (cp == NULL)
            continue;

        zprt(ZPXHOWSETUP,"_systvinit: GRAB/%d lhs='%s' cp='%s'\n",
            matchflg,lhs,cp);

        // process the value
        // NOTE: because of Intel's speed step, take the highest cpu speed
        switch (matchflg) {
        case 1: // genuine CPU speed
            khzcur = _systvinitkhz(cp);
            if (khzcur > khzcpu)
                khzcpu = khzcur;
            break;
        case 2: // the consolation prize
            khzcur = _systvinitkhz(cp);
            // we've seen some "wild" values
            if (khzcur > 10000000)
                break;
            if (khzcur > khzbogo)
                khzbogo = khzcur;
            break;
        case 3: // remember # of cpu's so we can adjust bogomips
            mpi->mpi_cpucnt = atoi(cp);
            mpi->mpi_cpucnt += 1;
            break;
        case 4: // remember # of cpu cores so we can adjust bogomips
            mpi->mpi_corecnt = atoi(cp);
            break;
        case 5: // cache flush size
            mpi->mpi_cshflush = atoi(cp);
            break;
        case 6: // cache alignment
            mpi->mpi_cshalign = atoi(cp);
            break;
        }
    }

    fclose(xfsrc);

    // we want to know the number of hyperthreads
    mpi->mpi_smtcnt = mpi->mpi_cpucnt;
    if (mpi->mpi_corecnt)
        mpi->mpi_smtcnt /= mpi->mpi_corecnt;

    zprt(ZPXHOWSETUP,"_systvinit: FINAL khzcpu=%d khzbogo=%d mpi_cpucnt=%d mpi_corecnt=%d mpi_smtcnt=%d mpi_cshalign=%d mpi_cshflush=%d\n",
        khzcpu,khzbogo,mpi->mpi_cpucnt,mpi->mpi_corecnt,mpi->mpi_smtcnt,
        mpi->mpi_cshalign,mpi->mpi_cshflush);

    if ((mpi->mpi_cshalign == 0) || (mpi->mpi_cshflush == 0))
        sysfault("_systvinit: cache parameter fault\n");

    do {
        // use the best reference
        // FIXME/CAE -- with speed step, bogomips is better
#if 0
        if (khzcpu != 0)
            break;
#endif

        khzcpu = khzbogo;
        if (mpi->mpi_smtcnt)
            khzcpu /= mpi->mpi_smtcnt;
        if (khzcpu != 0)
            break;

        sysfault("_systvinit: unable to obtain cpu speed\n");
    } while (0);

    systvkhz(khzcpu);

    zprt(ZPXHOWSETUP,"_systvinit: EXIT\n");
}
// _systvinitkhz -- decode value
// RETURNS: CPU freq in khz
static syskhz_t
_systvinitkhz(char *str)
{
    char *src;
    char *dst;
    int rhscnt;
    char bf[100];
    syskhz_t khz;

    zprt(ZPXHOWSETUP,"_systvinitkhz: ENTER str='%s'\n",str);

    dst = bf;
    src = str;

    // get lhs of lhs.rhs
    for (; *src != 0; ++src, ++dst) {
        if (*src == '.')
            break;
        *dst = *src;
    }

    // skip over the dot
    ++src;

    // get rhs of lhs.rhs and determine how many rhs digits we have
    rhscnt = 0;
    for (; *src != 0; ++src, ++dst, ++rhscnt)
        *dst = *src;

    *dst = 0;
    khz = atol(bf);

    zprt(ZPXHOWSETUP,"_systvinitkhz: PRESCALE bf='%s' khz=%d rhscnt=%d\n",
        bf,khz,rhscnt);

    // scale down (e.g. we got xxxx.yyyy)
    for (; rhscnt > 3; --rhscnt)
        khz /= 10;

    // scale up (e.g. we got xxxx.yy--bogomips does this)
    for (; rhscnt < 3; ++rhscnt)
        khz *= 10;

    zprt(ZPXHOWSETUP,"_systvinitkhz: EXIT khz=%d\n",khz);

    return khz;
}
UPDATE:
Sigh. Yes.
I was using "cpu MHz" from /proc/cpuinfo prior to the introduction of processors with "speed step" technology, so I switched to "bogomips" and the algorithm was derived empirically based on that. When I derived it, I only had access to hyperthreaded machines. However, I've found an old one that is not and the SMT thing isn't valid.
However, it appears that bogomips is always 2x the [maximum] CPU speed. See http://www.clifton.nl/bogo-faq.html. That hasn't always been my experience on all kernel versions over the years [IIRC, I started with 0.99.x], but it's probably a reliable assumption these days.
With "constant TSC" [which all newer processors have], denoted by constant_tsc in the flags: field in /proc/cpuinfo, the TSC rate is the maximum CPU frequency.
Originally, the only way to get the frequency information was from /proc/cpuinfo. Now, however, in more modern kernels, there is another way that may be easier and more definitive [I had code coverage for this in other software of mine, but had forgotten about it]:
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq
The content of this file is the maximum CPU frequency in kHz. There are analogous files for the other CPU cores. The files should be identical for most sane motherboards (e.g. ones that are composed of the same model chip and don't try to mix [say] i7s and Atoms). Otherwise, you'd have to keep track of the info on a per-core basis, and that would get messy fast.
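Reading that file from C is straightforward. A minimal sketch (the path is the one quoted above; returns -1 on failure):

#include <stdio.h>

long read_max_freq_khz(void)
{
    long khz = -1;
    FILE *f = fopen("/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq", "r");
    if (f) {
        if (fscanf(f, "%ld", &khz) != 1)
            khz = -1;
        fclose(f);
    }
    return khz;
}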
The given directory also has other interesting files. For example, if your processor has "speed step" [and some of the other files can tell you that], you can force maximum performance by writing performance to the scaling_governor file. This will disable use of speed step.
If the processor did not have constant_tsc, you'd have to disable speed step [and run the cores at maximum rate] to get accurate measurements.

Cannot get OpenAL to play sound

I've searched the net and I've searched here. I've found code that I could compile and it works fine, but for some reason my code won't produce any sound. I'm porting an old game to the PC (Windows), and I'm trying to make it as authentic as possible, so I want to use generated wave forms. I've pretty much copied and pasted the working code (only adding in multiple voices), and it still won't work (even though the exact same code for a single voice works fine). I know I'm missing something obvious, but I just cannot figure out what. Any help would be appreciated, thank you.
First, some notes. I was looking for something that would allow me to use the original methodology. The original system used paired bytes for music (sound effects, of which there were only 2, were handled in code): a time byte that counted down every time the routine was called, and a note byte that was played until time reached zero. This was done by patching into the interrupt vector; Windows doesn't allow that, so I set up a timer routine that accomplished the same thing. The timer kicks in, updates the display, and then runs the music sequence. I set this up with a defined time so that I only have one place to adjust the timing (to get it as close as possible to the original sequence).
The music is a generated wave form (I've double-checked the math and even examined the generated data in debug mode) and it looks good. The sequence looks good, but doesn't actually produce sound.
I tried SDL2 first, and its method of only playing one sound doesn't work for me. Also, unless I make the sample duration extremely short (and the sound produced this way is awful), I can't match the timing (it plays the entire sample through its own interrupt without letting me make adjustments). Blending the 3 voices together (when they all run with different timings) is a mess. Most of the other engines I examined work in much the same way: they want to use their own callback interrupt and won't allow me to tweak it appropriately.
This is why I started working with OpenAL. It allows multiple voices (sources) and allows me to set the timings myself. On advice from several forums, I set it up so that the sample lengths are all multiples of full cycles.
Anyway, here's the code.
int main(int argc, char* argv[])
{
    FreeConsole(); //Get rid of the DOS console, don't need it
    if (InitLog() < 0) return -1; //Start logging
    UINT_PTR tim = NULL;
    SDL_Event event;
    InitVideo(false); //Set to window for now, will put options in later
    curmusic = 5;
    InitAudio();
    SetTimer(NULL, tim, _FREQ_, TimerProc);
    SDL_PollEvent(&event);
    while (event.type != SDL_KEYDOWN) SDL_PollEvent(&event);
    SDL_Quit();
    return 0;
}
void CALLBACK TimerProc(HWND hWind, UINT Msg, UINT_PTR idEvent, DWORD dwTime)
{
    RenderOutput();
    PlayMusic();
    //UpdateTimer();
    //RotateGate();
    return;
}
void InitAudio(void)
{
    ALCdevice *dev;
    ALCcontext *cxt;
    Log("Initializing OpenAL Audio\r\n");
    dev = alcOpenDevice(NULL);
    if (!dev) {
        Log("Failed to open an audio device\r\n");
        exit(-1);
    }
    cxt = alcCreateContext(dev, NULL);
    alcMakeContextCurrent(cxt);
    if (!cxt) {
        Log("Failed to create audio context\r\n");
        exit(-1);
    }
    alGenBuffers(4, Buffer);
    if (alGetError() != AL_NO_ERROR) {
        Log("Error during buffer creation\r\n");
        exit(-1);
    }
    alGenSources(4, Source);
    if (alGetError() != AL_NO_ERROR) {
        Log("Error during source creation\r\n");
        exit(-1);
    }
    return;
}
void PlayMusic()
{
    static int oldsong, ofset, mtime[4];
    double freq;
    ALuint srate = 44100;
    ALuint voice, i, note, len, hold;
    short buf[4][_BUFFSIZE_];
    bool test[4] = {false, false, false, false};

    if (curmusic != oldsong) {
        oldsong = (int)curmusic;
        if (curmusic > 0)
            ofset = moffset[(curmusic - 1)];
        for (voice = 1; voice < 4; voice++)
            alSourceStop(Source[voice]);
        mtime[voice] = 0;
        return;
    }
    if (curmusic == 0) return;

    //Only 3 voices for music, but have
    for (voice = 0; voice < 3; voice++) { // 4 set aside for eventual sound effects
        if (mtime[voice] == 0) { //is note finished
            alSourceStop(Source[voice]); //It is, so stop the channel (source)
            mtime[voice] = music[ofset++]; //Get the next duration
            if (mtime[voice] == 0) { oldsong = 0; return; } //zero marks end, so restart
            note = music[ofset++]; //Get the next note
            if (note > 127) { //The old HW this was designed for could only
                if (note == 255) note = 127; //use values 128 - 255 (255 = 127)
                freq = (15980 / (voice + (int)(voice / 3))) / (256 - note); //freq of note
                len = (ALuint)(srate / freq); //A single cycle of that freq.
                hold = len;
                while (len < (srate / (1000 / _FREQ_))) len += hold; //Multiply till 1 interrupt cycle
                while (len > _BUFFSIZE_) len -= hold; //Don't overload buffer
                if (len == 0) len = _BUFFSIZE_; //Just to be safe
                for (i = 0; i < len; i++) //calculate sine wave and put in buffer
                    buf[voice][i] = (short)((32760 * sin((2 * M_PI * i * freq) / srate)));
                alBufferData(Buffer[voice], AL_FORMAT_MONO16, buf[voice], len, srate);
                alSourcei(openAL.Source[i], AL_LOOPING, AL_TRUE);
                alSourcei(Source[i], AL_BUFFER, Buffer[i]);
                alSourcePlay(Source[voice]);
            }
        } else --mtime[voice];
    }
}
Well, it turns out there were 3 problems with my code. First, you have to link the built wave buffer to the AL generated buffer "before" you link the buffer to the source:
alBufferData(buffer, AL_FORMAT_MONO16, &wave_sample, sample_length * sizeof(short), frequency);
alSourcei(source, AL_BUFFER, buffer);
Also, in the above example, I multiplied sample_length by the number of bytes in each sample (in this case sizeof(short)).
The final problem was that you need to un-link the buffer from the source before you change the buffer data:
alSourcei(source,AL_BUFFER,NULL);
The music would play, but not correctly until I added that line to the note change code.
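Putting the three fixes together, the per-note update sequence looks something like this. This is just a sketch using the question's variable names; wave_sample, sample_length, and frequency are illustrative:

alSourceStop(source);                    // stop the voice first
alSourcei(source, AL_BUFFER, NULL);      // detach the buffer before refilling it
alBufferData(buffer, AL_FORMAT_MONO16,   // fill the buffer *before* re-attaching
             &wave_sample, sample_length * sizeof(short), frequency);
alSourcei(source, AL_BUFFER, buffer);    // now attach the refreshed buffer
alSourcei(source, AL_LOOPING, AL_TRUE);
alSourcePlay(source);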

vsnprintf on an ATMega2560

I am using a toolkit to do some Elliptic Curve Cryptography on an ATMega2560. When trying to use the print functions in the toolkit I am getting an empty string. I know the print functions work, because the x86 version prints the variables without a problem. I am not experienced with the ATMega and would love any help on this matter. The print code is included below.
Code to print a big number (it itself calls a util_print):
void bn_print(bn_t a) {
    int i;

    if (a->sign == BN_NEG) {
        util_print("-");
    }
    if (a->used == 0) {
        util_print("0\n");
    } else {
#if WORD == 64
        util_print("%lX", (unsigned long int)a->dp[a->used - 1]);
        for (i = a->used - 2; i >= 0; i--) {
            util_print("%.*lX", (int)(2 * (BN_DIGIT / 8)),
                    (unsigned long int)a->dp[i]);
        }
#else
        util_print("%llX", (unsigned long long int)a->dp[a->used - 1]);
        for (i = a->used - 2; i >= 0; i--) {
            util_print("%.*llX", (int)(2 * (BN_DIGIT / 8)),
                    (unsigned long long int)a->dp[i]);
        }
#endif
        util_print("\n");
    }
}
The code to actually print a big number variable:
static char buffer[64 + 1];
void util_printf(char *format, ...) {
#ifndef QUIET
#if ARCH == AVR
    char *pointer = &buffer[1];
    va_list list;
    va_start(list, format);
    vsnprintf(pointer, 128, format, list);
    buffer[0] = (unsigned char)2;
    va_end(list);
#elif ARCH == MSP
    va_list list;
    va_start(list, format);
    vprintf(format, list);
    va_end(list);
#else
    va_list list;
    va_start(list, format);
    vprintf(format, list);
    fflush(stdout);
    va_end(list);
#endif
#endif
}
Edit: I do have UART initialized and can output printf statements to a console.
I'm one of the authors of the RELIC toolkit. The current util_printf() function is used to print inside the Avrora simulator, for debugging purposes. I'm glad that you could adapt the code to your purposes. As a side note, the buffer size problem was already fixed in more recent releases of the toolkit.
Let me know if you have further problems with the library. You can either contact me personally or write directly to the discussion group.
Thank you!
vsnprintf stores its output in the given buffer (which in this case is the address pointed to by the pointer variable). In order for the output to show on the console (through the UART) you must send the buffer using printf (try adding printf("%s", pointer) after the vsnprintf call). If you're using avr-libc, don't forget to initialize the std streams before making any call to a printf function.
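For reference, stream initialization on avr-libc typically looks something like this. A minimal sketch; uart_putchar is an assumed helper that writes one character out over your already-initialized UART:

#include <stdio.h>

// Assumed helper: writes one character to the (already initialized) UART.
static int uart_putchar(char c, FILE *stream);

static FILE uart_stdout = FDEV_SETUP_STREAM(uart_putchar, NULL,
                                            _FDEV_SETUP_WRITE);

void stdio_init(void)
{
    stdout = &uart_stdout;  // printf/vprintf now go out over the UART
}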
By the way, your code is vulnerable to a buffer overflow: buffer[64 + 1] means your buffer is only 65 bytes, while vsnprintf(pointer, 128, format, list) tells vsnprintf that the buffer can hold 128 bytes. Change that limit to 64 bytes or below (pointer starts at &buffer[1], leaving 64 bytes) to avoid the overflow.
Alright, so I found a workaround to print the bn numbers to stdout on an ATMega2560. The toolkit comes with a function that writes a variable to a string (bn_write_str), so I implemented my own print function as such:
void print_bn(bn_t a)
{
    char print[BN_SIZE];            // max precision of a bn number
    int bi = bn_bits(a);            // get the number of bits of the number
    bn_write_str(print, bi, a, 16); // 16 indicates the radix (hexadecimal)
    printf("%s\n", print);
}
This function will print a bn number in hexadecimal format.
Hope this helps anyone using the RELIC toolkit with an AVR.
This skips the util_print calls.

Getting the current stack trace on Mac OS X

I'm trying to work out how to store and then print the current stack in my C++ apps on Mac OS X. The main problem seems to be getting dladdr to return the right symbol when given an address inside the main executable. I suspect that the issue is actually a compile option, but I'm not sure.
I have tried the backtrace code from Darwin/Leopard but it calls dladdr and has the same issue as my own code calling dladdr.
Original post:
Currently I'm capturing the stack with this code:
int BackTrace(Addr *buffer, int max_frames)
{
    void **frame = (void **)__builtin_frame_address(0);
    void **bp = (void **)(*frame);
    void *ip = frame[1];
    int i;
    for (i = 0; bp && ip && i < max_frames; i++)
    {
        *(buffer++) = ip;
        ip = bp[1];
        bp = (void **)(bp[0]);
    }
    return i;
}
Which seems to work ok. Then to print the stack I'm looking at using dladdr like this:
Dl_info dli;
if (dladdr(Ip, &dli))
{
    ptrdiff_t offset;
    int c = 0;
    if (dli.dli_fname && dli.dli_fbase)
    {
        offset = (ptrdiff_t)Ip - (ptrdiff_t)dli.dli_fbase;
        c = snprintf(buf, buflen, "%s+0x%lx", dli.dli_fname, (long)offset);
    }
    if (dli.dli_sname && dli.dli_saddr)
    {
        offset = (ptrdiff_t)Ip - (ptrdiff_t)dli.dli_saddr;
        c += snprintf(buf+c, buflen-c, "(%s+0x%lx)", dli.dli_sname, (long)offset);
    }
    if (c > 0)
        snprintf(buf+c, buflen-c, " [%p]", Ip);
}
Which almost works, some example output:
/Users/matthew/Library/Frameworks/Lgi.framework/Versions/A/Lgi+0x2473d(LgiStackTrace+0x5d) [0x102c73d]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x2a006(tart+0x28e72) [0x2b006]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x2f438(tart+0x2e2a4) [0x30438]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x35e9c(tart+0x34d08) [0x36e9c]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x1296(tart+0x102) [0x2296]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x11bd(tart+0x29) [0x21bd]
It's getting the method name right for the shared object but not for the main app. Those just map to "tart" (or "start" minus the first character).
Ideally I'd like line numbers as well as the method name at that point, but I'll settle for the correct function/method name for starters. Maybe shoot for line numbers after that; on Linux I hear you have to write your own parser for a private ELF block that has its own instruction set. Sounds scary.
Anyway, can anyone sort this code out so it gets the method names right?
What releases of OS X are you targeting? If you are running on Mac OS X 10.5 or higher, you can just use the backtrace() and backtrace_symbols() library calls. They are defined in execinfo.h, and there is a manpage with some sample code.
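For reference, a minimal sketch of that route (essentially the manpage example):

#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

void PrintStack(void)
{
    void *frames[64];
    int n = backtrace(frames, 64);            // capture up to 64 return addresses
    char **symbols = backtrace_symbols(frames, n);
    if (symbols) {
        for (int i = 0; i < n; i++)
            printf("%s\n", symbols[i]);       // module, symbol name, and offset
        free(symbols);
    }
}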
Edit:
You mentioned in the comments that you need to run on Tiger. You can probably just include the implementation from Libc in your app. The source is available from Apple's opensource site. Here is a link to the relevant file.
