Add breakpoints and install handlers - macOS

My high-level goal is something like this:
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

void print_backtrace() {
    void *callstack[128];
    // backtrace() expects a frame count, not a byte count
    int framesC = backtrace(callstack, sizeof(callstack) / sizeof(callstack[0]));
    printf("backtrace() returned %d addresses\n", framesC);
    char **strs = backtrace_symbols(callstack, framesC);
    for (int i = 0; i < framesC; ++i) {
        if (strs[i])
            printf("%s\n", strs[i]);
        else
            break;
    }
    free(strs);
}
install_breakpoint_handler("__NSAutoreleaseNoPool", print_backtrace);
So, each time the __NSAutoreleaseNoPool function breakpoint is hit, print_backtrace should be called. (All within the same binary; I'm not trying to catch breakpoints in separate processes.)
I guess I could somehow do this via ptrace. Is there an easy-to-use, lightweight library for this?
Currently I'm searching for a Mac OS X solution, but cross-platform would of course be nice.

I just found one lib (I had even used it a few years ago...): mach_override
I also found this debuglib but haven't tried it.
See here for a demonstration for __NSAutoreleaseNoPool: it automatically executes print_backtrace.
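A rough idea of how install_breakpoint_handler could be built on mach_override: patch the target function so it jumps to a hook that runs the handler and then calls the original code through the reentry island. This is a sketch only; the assumed signature of __NSAutoreleaseNoPool and the helper names here are assumptions, and the only mach_override API relied on is mach_override_ptr.

#include <dlfcn.h>
#include "mach_override.h"

// Assumed signature: __NSAutoreleaseNoPool presumably receives the offending
// object; the hook forwards it unchanged either way.
static void (*orig_NSAutoreleaseNoPool)(void *obj);

static void hooked_NSAutoreleaseNoPool(void *obj) {
    print_backtrace();                      // run our handler first
    if (orig_NSAutoreleaseNoPool)
        orig_NSAutoreleaseNoPool(obj);      // then fall through to the original
}

int install_backtrace_hook(void) {
    // dlsym() prepends the leading underscore itself, so pass the C-level name
    void *target = dlsym(RTLD_DEFAULT, "__NSAutoreleaseNoPool");
    if (!target)
        return -1;
    return mach_override_ptr(target,
                             (void *)hooked_NSAutoreleaseNoPool,
                             (void **)&orig_NSAutoreleaseNoPool);
}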

Related

Has there been any change between kernel 5.15 and 5.4.0 concerning ioctl valid commands?

We have some custom driver working on 5.4.0. It's pretty old and the original developers are no longer supporting it, so we have to maintain it in our systems.
When upgrading to Ubuntu 22 (kernel 5.15), the driver suddenly stopped working, and sending an ioctl with the command SIOCDEVPRIVATE (which used to work in kernel 5.4.0, and is in fact used to get some necessary device information) now gives an "ioctl: Operation not supported" error, with no extra information anywhere in the logs.
So... has something changed between those two kernels? We did have to adapt some of the structures used to register the driver, but I can't see anything concerning registering valid operations there. Do I have to register valid operations somewhere now?
Alternatively, does somebody know what part of the kernel code is checking for the operation to be supported? I've been trying to find it from ioctl.c, but I can't seem to find where that particular error comes from.
The driver code that supposedly takes care of this (it doesn't even reach its first line on 5.15):
static int u50_dev_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) {
    struct u50_priv *priv = netdev_priv(dev);
    if (cmd == SIOCDEVPRIVATE) {
        memcpy(&ifr->ifr_data, priv->tty->name, strlen(priv->tty->name));
    }
    return 0;
}
And the attempt to access it, which no longer works:
struct ifreq ifr = {0};
struct ifaddrs *ifaddr, *ifa;
getifaddrs(&ifaddr);
for (ifa = ifaddr; ifa != NULL; ifa = ifa->ifa_next) {
    memcpy(ifr.ifr_name, ifa->ifa_name, IFNAMSIZ);
    if (ioctl(lonsd, SIOCDEVPRIVATE, &ifr) < 0) {
        perror("ioctl");
        syslog(LOG_ERR, "Ioctl:%d: %s\n", __LINE__, strerror(errno));
    }
...
And the structure used for registration:
static const struct net_device_ops u50_netdev_ops = {
    .ndo_init            = u50_dev_init,
    .ndo_uninit          = u50_dev_uninit,
    .ndo_open            = u50_dev_open,
    .ndo_stop            = u50_dev_stop,
    .ndo_start_xmit      = u50_dev_xmit,
    .ndo_do_ioctl        = u50_dev_ioctl,
    .ndo_set_mac_address = U50SetHWAddr,
};
If you need code to respond to SIOCDEVPRIVATE, you used to be able to do it via ndo_do_ioctl (writing a compatible function and linking it into a net_device_ops struct, as in 5.4). In 5.15, however, this changed: according to the kernel documentation you now have to implement an ndo_siocdevprivate function instead, because ndo_do_ioctl is no longer called for private ioctls.
source:
https://elixir.bootlin.com/linux/v5.15.57/source/include/linux/netdevice.h
Patch that did this: spinics.net/lists/netdev/msg698158.html
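A minimal sketch of what the port might look like, assuming the 5.15 handler signature from netdevice.h (the new data argument carries the userspace pointer; the body below is just the original handler carried over unchanged):

static int u50_dev_siocdevprivate(struct net_device *dev, struct ifreq *ifr,
                                  void __user *data, int cmd)
{
    struct u50_priv *priv = netdev_priv(dev);

    if (cmd == SIOCDEVPRIVATE)
        memcpy(&ifr->ifr_data, priv->tty->name, strlen(priv->tty->name));
    return 0;
}

static const struct net_device_ops u50_netdev_ops = {
    /* ... other handlers as before ... */
    .ndo_siocdevprivate = u50_dev_siocdevprivate,
};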

Why does clock() return -1 in C

I'm trying to implement an error handler using the clock() function from the time.h library. The code runs on an embedded system (Colibri iMX7, M4 processor). The function monitors a current value within a specific range; if the current value isn't correct, the function should return an error message.
The function checks whether the error is occurring: on the first run it saves the first appearance of the error in a clock_t as a reference, and on subsequent runs, if the error is still there, it compares the current time from clock() against that reference to see whether the error has persisted longer than a specific time.
The problem is that clock() always returns -1. What should I do to avoid that? Also, why can't I declare a clock_t variable as static (e.g. static clock_t start_t = clock();)?
Please see the function below:
bool CrossLink_check_error_LED_UV_current_clock(int current_state, int current_at_LED_UV)
{
    bool has_LED_UV_current_deviated = false;
    static int current_number_of_errors_Current_LED_CANNON = 0;
    clock_t startTimeError = clock();
    const int maximum_operational_current_when_on = 2000;
    const int minimum_turned_on_LED_UV_current = 45;
    if( (current_at_LED_UV > maximum_operational_current_when_on)
      ||(current_state!=STATE_EMITTING && (current_at_LED_UV > minimum_turned_on_LED_UV_current))
      ||(current_state==STATE_EMITTING && (current_at_LED_UV < minimum_turned_on_LED_UV_current)) ){
        current_number_of_errors_Current_LED_CANNON++;
        if(current_number_of_errors_Current_LED_CANNON > 1) {
            if (clock() - startTimeError > 50000){ // 50ms
                has_LED_UV_current_deviated = true;
                PRINTF("current_at_LED_UV: %d", current_at_LED_UV);
                if(current_state==STATE_EMITTING){
                    PRINTF(" at state emitting");
                }
                PRINTF("\n\r");
            }
        }else{
            if(startTimeError == -1){
                startTimeError = clock();
            }
        }
    }else{
        startTimeError = 0;
        current_number_of_errors_Current_LED_CANNON = 0;
    }
    return has_LED_UV_current_deviated;
}
Edit: I forgot to mention before, but we are using the GCC 9.3.1 arm-none-eabi compiler with CMake to build the executable. Our embedded system (Colibri iMX7, made by Toradex) consists of two A7 processors that run our Linux (the more visual interface), while the program that controls our device runs on an M4 processor without an OS, just pure bare metal.
For a lot of the functions provided in the C standard library, if you have the documentation installed (it usually gets installed with the compiler), you can view it using the man command in the shell. man clock tells me:
NAME
       clock - determine processor time
SYNOPSIS
       #include <time.h>
       clock_t clock(void);
DESCRIPTION
       The clock() function returns an approximation of processor time used by the program.
RETURN VALUE
       The value returned is the CPU time used so far as a clock_t; to get the number of seconds used, divide by CLOCKS_PER_SEC. If the processor time used is not available or its value cannot be represented, the function returns the value (clock_t) -1.
etc.
This tells us that -1 means that the processor time (CLOCK_PROCESS_CPUTIME_ID) is unavailable. The solution is to use CLOCK_MONOTONIC instead. We can select the clock we want to use with clock_gettime.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

struct timespec clock_time;
if (clock_gettime(CLOCK_MONOTONIC, &clock_time)) {
    printf("CLOCK_MONOTONIC is unavailable!\n");
    exit(1);
}
printf("Seconds: %lld Nanoseconds: %ld\n",
       (long long)clock_time.tv_sec, clock_time.tv_nsec);
To answer the second part of your question:
static clock_t start_time = clock();
is not allowed because the return value of clock() is not known until runtime, and in C the initializer of a variable with static storage duration must be a compile-time constant.
You can write:
static clock_t start_time = 0;
if (start_time == 0)
{
    start_time = clock();
}
But this may or may not be suitable to use in this case, depending on whether zero is a legitimate return value of the function. If it could be, you would need something like:
static bool start_time_initialized = false;
static clock_t start_time;
if (!start_time_initialized)
{
    start_time_initialized = true;
    start_time = clock();
}
The above is reliable only if you cannot have two copies of this function running at once (it is not re-entrant).
If you have a POSIX library available you could use a pthread_once_t to do the same as the above bool but in a re-entrant way. See man pthread_once for details.
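A sketch of that pthread_once variant, assuming a POSIX threads library is available:

#include <pthread.h>
#include <time.h>

static pthread_once_t start_time_once = PTHREAD_ONCE_INIT;
static clock_t start_time;

static void init_start_time(void)
{
    start_time = clock();  // runs exactly once, even with concurrent callers
}

/* at the point of use: */
pthread_once(&start_time_once, init_start_time);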
Note that C++ allows more complicated options in this area, but you have asked about C.
Note also that abbreviating "start time" as start_t is a very bad idea, because the suffix _t means "type" and should only be used for type names.
In the end, the problem was that since we are running our code on bare metal, the clock() function wasn't working. We ended up using an internal timer on the M4 processor that we found, so now everything is fine. Thanks for the answers.
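The poster doesn't say which timer they used; on a Cortex-M4 the usual bare-metal fallback is the core's SysTick timer. A sketch using standard CMSIS names (everything here is an assumption about their setup, not their actual code):

#include <stdint.h>

/* Normally provided by the vendor's CMSIS headers; declared here only to
   keep the sketch self-contained. */
extern uint32_t SystemCoreClock;
extern uint32_t SysTick_Config(uint32_t ticks);

static volatile uint32_t g_ms_ticks;

void SysTick_Handler(void)  // fires every 1 ms once configured
{
    g_ms_ticks++;
}

void timer_init(void)
{
    SysTick_Config(SystemCoreClock / 1000u);  // 1 kHz tick
}

uint32_t millis(void)  // a clock()-like monotonic millisecond counter
{
    return g_ms_ticks;
}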

Using a Batch File to send Arguments to MFC Application with a GUI

I'm looking for the best approach to using a batch file to send arguments to my MFC application, rather than relying on the GUI. Does anyone know the best way to go about this?
I use the following code in my InitInstance method of my app class:
LPWSTR *szArglist = nullptr;
int iNumArgs = 0;
szArglist = CommandLineToArgvW(GetCommandLineW(), &iNumArgs);
if (iNumArgs > 0 && szArglist != nullptr)
{
    for (int iArg = 0; iArg < iNumArgs; iArg++)
    {
        CString strArg(szArglist[iArg]);
        int iDelim = strArg.Find(_T("="));
        if (iDelim != -1)
        {
            CString strParamName = strArg.Left(iDelim);
            CString strParamValue = strArg.Mid(iDelim + 1);
            if (strParamName.CollateNoCase(_T("/lang")) == 0)
            {
                m_strPathLanguageResourceOverride.Format(_T("%sMeetSchedAssist%s.dll"),
                    (LPCTSTR)GetProgramPath(), (LPCTSTR)strParamValue.MakeUpper());
                if (!PathFileExists(m_strPathLanguageResourceOverride))
                    m_strPathLanguageResourceOverride = _T("");
            }
        }
    }
    // Free memory allocated for CommandLineToArgvW arguments.
    LocalFree(szArglist);
}
As you can see, I use the CommandLineToArgvW method to extract and process the command line arguments.
A GUI program can receive command line arguments just like a command line program can.
Your application class (CWinApp, if memory serves) contains a member named m_lpCmdLine that holds the command-line arguments (if any) as a string.
If you also want to deal with shell parameters, you'll probably also want to look at CWinApp::ParseCommandLine and CCommandLineInfo (note: if you're dealing with a wizard-generated program, chances are CWinApp::ParseCommandLine is already being called by default).
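For the batch file side, a hypothetical invocation matching the /lang=value format parsed above might look like this (the path and the DEU value are made-up examples, not from the original post):

@echo off
rem Launch the app with a language override; the argument arrives on the
rem command line exactly as typed and is parsed by the InitInstance code above.
start "" "C:\Program Files\MeetSchedAssist\MeetSchedAssist.exe" /lang=DEU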

OpenCL possible reason a clGetEventInfo would cause a segfault?

I have a pretty complicated OpenCL app. It fires up 5 different contexts on 5 different GPUs and executes the same kernel on all of them, splitting the work up into 1024 "chunks" to be processed.
Each time a kernel finishes, a result is checked for and the GPU is given a new chunk. Sometimes as the app is starting (and very rarely mid-run) it will immediately segfault on the clGetEventInfo call.
This is done in a loop, using callbacks and clGetEventInfo calls to ensure something is finished before moving on to the next step.
GDB output:
(gdb) back
#0 0x00007fdc686ab525 in clGetEventInfo () from /usr/lib/libOpenCL.so.1
#1 0x00000000004018c1 in ready (event=0x26a00000267) at gputest.c:165
#2 0x0000000000404b5a in main (argc=9, argv=0x7fffdfe3b268) at gputest.c:544
The ready function:
int ready(cl_event event) {
    int rdy;
    if (!event)
        return 0;
    clGetEventInfo(event, CL_EVENT_COMMAND_EXECUTION_STATUS, sizeof(cl_int), &rdy, NULL);
    if (rdy == CL_COMPLETE)
        return 1;
    return 0;
}
How the kernel is run, the event set, and checked. Some pseudocode inserted for brevity:
while (test if loop is complete) {
    for (j = 0; j < GPUS; j++) {
        if (gpu[j].waiting && loops < 9999) {
            gpu[j].waiting = 0;
            offset[j] = loops * 1024 * 1024;
            loops++;
            EC("kernel init", clEnqueueNDRangeKernel(queues[j], kernel_init[j], 1,
                &(offset[j]), &global_work_size, &work128, 0, NULL, &events[j]));
            gpu[j].readsearch = events[j];
            gpu[j].reading = 1;
        }
    }
    for (j = 0; j < GPUS; j++) {
        if (gpu[j].reading && ready(gpu[j].readsearch)) {
            gpu[j].reading = 0;
            gpu[j].waiting = 1;
            // unrelated reporting and other code here
        }
    }
}
It's pretty simple. There is more to the code, but it's unrelated. The ready/checking function is very simple. I even added debugging to the ready function to printf the event # to see what was happening when it crashed - nothing really, no pattern I could see.
What could be causing this?
Ugh. Found the problem. Since you cannot initialize values when you declare a struct, I was using some values uninitialized. I malloc'ed the gpu structs and then just started using them, so if(gpu[x].reading && ...) was reading random, completely uninitialized data. Sometimes it was non-zero, which allowed the ready() function to fire off. Since the gpu[x].readsearch event was never set in the first place, clGetEventInfo bombed trying to use whatever was at that memory location.
This would be time number 482,847 that accidentally using uninitialized variables has burned me.
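A minimal sketch of the kind of fix described, assuming the structs are heap-allocated as the poster says (the struct layout and names are illustrative, inferred from the usage above):

#include <stdlib.h>
#include <CL/cl.h>

#define GPUS 5  // 5 GPUs, per the question

struct gpu_state {
    int      waiting;
    int      reading;
    cl_event readsearch;
};

/* calloc() zeroes the memory, so the flags start cleared and the
   cl_event starts NULL, which ready() already checks for. */
struct gpu_state *gpu = calloc(GPUS, sizeof *gpu);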

Getting the current stack trace on Mac OS X

I'm trying to work out how to store and then print the current stack in my C++ apps on Mac OS X. The main problem seems to be getting dladdr to return the right symbol when given an address inside the main executable. I suspect that the issue is actually a compile option, but I'm not sure.
I have tried the backtrace code from Darwin/Leopard but it calls dladdr and has the same issue as my own code calling dladdr.
Original post:
Currently I'm capturing the stack with this code:
int BackTrace(Addr *buffer, int max_frames)
{
    void **frame = (void **)__builtin_frame_address(0);
    void **bp = (void **)(*frame);
    void *ip = frame[1];
    int i;
    for (i = 0; bp && ip && i < max_frames; i++)
    {
        *(buffer++) = ip;
        ip = bp[1];
        bp = (void **)(bp[0]);
    }
    return i;
}
Which seems to work ok. Then to print the stack I'm looking at using dladdr like this:
Dl_info dli;
if (dladdr(Ip, &dli))
{
    ptrdiff_t offset;
    int c = 0;
    if (dli.dli_fname && dli.dli_fbase)
    {
        offset = (ptrdiff_t)Ip - (ptrdiff_t)dli.dli_fbase;
        c = snprintf(buf, buflen, "%s+0x%x", dli.dli_fname, offset);
    }
    if (dli.dli_sname && dli.dli_saddr)
    {
        offset = (ptrdiff_t)Ip - (ptrdiff_t)dli.dli_saddr;
        c += snprintf(buf + c, buflen - c, "(%s+0x%x)", dli.dli_sname, offset);
    }
    if (c > 0)
        snprintf(buf + c, buflen - c, " [%p]", Ip);
}
Which almost works, some example output:
/Users/matthew/Library/Frameworks/Lgi.framework/Versions/A/Lgi+0x2473d(LgiStackTrace+0x5d) [0x102c73d]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x2a006(tart+0x28e72) [0x2b006]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x2f438(tart+0x2e2a4) [0x30438]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x35e9c(tart+0x34d08) [0x36e9c]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x1296(tart+0x102) [0x2296]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x11bd(tart+0x29) [0x21bd]
It's getting the method name right for the shared object but not for the main app; those just map to "tart" (i.e. "start" minus the first character).
Ideally I'd like line numbers as well as the method name at that point, but I'll settle for the correct function/method name for starters. Maybe shoot for line numbers after that; on Linux I hear you have to write your own parser for a private ELF block that has its own instruction set. Sounds scary.
Anyway, can anyone sort this code out so it gets the method names right?
What releases of OS X are you targeting? If you are running on Mac OS X 10.5 and higher you can just use the backtrace() and backtrace_symbols() library calls. They are defined in execinfo.h, and there is a manpage with some sample code.
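For example, a minimal dump routine along the lines of the manpage sample (a sketch, not the poster's code):

#include <execinfo.h>
#include <unistd.h>

static void dump_stack(void)
{
    void *frames[128];
    int n = backtrace(frames, 128);
    // backtrace_symbols_fd() writes one symbolicated line per frame
    // directly to the file descriptor, avoiding malloc entirely.
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
}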
Edit:
You mentioned in the comments that you need to run on Tiger. You can probably just include the implementation from Libc in your app. The source is available from Apple's open source site. Here is a link to the relevant file.
