FreeBSD newbus driver loading succesfully but cant create /dev/** file and debugging - debugging

I am installing a new newbuf driver on FreeBSD 10.0 . After compiling with make the driver.ko file has been created and than kldload can load successfully. kldload returns 0 and I can see the device at the kldstat output. When attempt to use the driver opening the /dev/** file, the file is not exist.
I think that this /dev/** file should be created by make_dev function which is located in device_attach member method. To test if the kldload reaches this attaching function; when write printf and uprintf to debug the driver, I can not see any output at console nor dmesg output.
But the problem is after writing printf at beginnings (after local variable definitions) of device_identify and device_probe functions, I can't see any output again at console nor dmesg.
My question is that even if the physical driver has problem (not located etc.), should I see the ouput of printf at the device_identify member function which is called by kldload at starting course (I think)?
Do I have a mistake when debugging newbuf driver with printf (I also tried a hello_world device driver and at this driver I can take output of printf at dmesg)?
Mainly how can I test/debug this driver's kldload processes?
Below some parts of my driver code (I think at least I should see MSG1, but I can not see):
struct mydrv_softc
{
device_t dev;
};
static devclass_t mydrv_devclass;
static struct cdevsw mydrv_cdevsw = {
.d_version = D_VERSION,
.d_name = "mydrv",
.d_flags = D_NEEDGIANT,
.d_open = mydrv_open,
.d_close = mydrv_close,
.d_ioctl = mydrv_ioctl,
.d_write = mydrv_write,
.d_read = mydrv_read
};
static void mydrv_identify (driver_t *driver, device_t parent) {
devclass_t dc;
device_t child;
printf("MSG1: The process inside the identfy function.");
dc = devclass_find("mydrv");
if (devclass_get_device(dc, 0) == NULL) {
child = BUS_ADD_CHILD(parent, 0, "mydrv", -1);
}
}
static int mydrv_probe(device_t dev) {
printf("MSG2: The process inside the probe function.");
mydrv_init();
if (device_get_unit(dev) != 0)
return (ENXIO);
device_set_desc(dev, "FreeBSD Device Driver");
return (0);
}
static int mydrv_attach(device_t dev) {
struct mydrv_softc *sc;
device_printf(dev, "MSG3: The process will make attachment.");
sc = (struct mydrv_softc *) device_get_softc(dev);
sc->dev = (device_t)make_dev(&mydrv_cdevsw, 0, UID_ROOT, GID_WHEEL, 0644, "mydrv_drv");
return 0;
}
static int mydrv_detach(device_t dev) {
struct mydrv_softc *sc;
sc = (struct mydrv_softc *) device_get_softc(dev);
destroy_dev((struct cdev*)(sc->dev));
bus_generic_detach(dev);
return 0;
}
static device_method_t mydrv_methods[] = {
DEVMETHOD(device_identify, mydrv_identify),
DEVMETHOD(device_probe, mydrv_probe),
DEVMETHOD(device_attach, mydrv_attach),
DEVMETHOD(device_detach, mydrv_detach),
{ 0, 0 }
};
static driver_t mydrv_driver = {
"mydrv",
mydrv_methods,
sizeof(struct mydrv_softc),
};
DRIVER_MODULE(mydrv, ppbus, mydrv_driver, mydrv_devclass, 0, 0);

If you don't see your printf's output on your console then your device functions will probably not be called. Can you show us your module's code?
Have you used DRIVER_MODULE() or DEV_MODULE()?
What parent bus are you using?

I guess printf works fine, but I prefer to use device_printf as it also prints the device name, and will be easier when looking through logs or dmesg output. Also leave multiple debug prints and check the log files on your system. Most logs for the device drivers are logged in /var/log/messages. But check other log files too.
Are you running your code on a virtual machine? Some device drivers don't show up their device files in /dev if the OS is running on a virtual machine. You should probably run your OS on actual hardware for the device file to show up.
As far as I know, you can't see the output in dmesg if you cannot find the corresponding device file in /dev but you may have luck with logs as I mentioned.
The easiest way to debug is of course using the printf statements. Other than this, you can debug the kernel using gdb running on another system. I am not familiar with the exact process but I know you can do this. Google it.

Related

Has there been any change between kernel 5.15 and 5.4.0 concerning ioctl valid commands?

We have some custom driver working on 5.4.0. It's pretty old and the original developers are no longer supporting it, so we have to maintain it in our systems.
When upgrading to Ubuntu 22 (Kernel 5.15), the driver suddenly stopped working, and sending ioctl with the command SIOCDEVPRIVATE (which used to work in kernel 5.4.0, and in fact is used to get some necessary device information)now gives "ioctl: Operation not supported" error with no extra information anywhere on the logs.
So... has something changed between those two kernels? We did have to adapt some of the structures used to register the driver, but I can't see anything concerning registering valid operations there. Do I have to register valid operations somewhere now?
Alternatively, does somebody know what part of the kernel code is checking for the operation to be supported? I've been trying to find it from ioctl.c, but I can't seem to find where that particular error comes from.
The driver code that supposedly takes care of this (doesn't even reach first line on 5.15):
static int u50_dev_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) {
struct u50_priv *priv = netdev_priv(dev);
if (cmd == SIOCDEVPRIVATE) {
memcpy(&ifr->ifr_data, priv->tty->name, strlen(priv->tty->name));
}
return 0;
}
And the attempt to access it that does no longer work:
struct ifreq ifr = {0};
struct ifaddrs *ifaddr, *ifa;
getifaddrs(&ifaddr);
for (ifa = ifaddr; ifa != NULL; ifa = ifa->ifa_next) {
memcpy(ifr.ifr_name, ifa->ifa_name, IFNAMSIZ);
if (ioctl(lonsd, SIOCDEVPRIVATE, &ifr) < 0) {
perror("ioctl");
syslog(LOG_ERR, "Ioctl:%d: %s\n", __LINE__, strerror(errno));
}
...
and structure for registration
static const struct net_device_ops u50_netdev_ops = {
.ndo_init = u50_dev_init,
.ndo_uninit = u50_dev_uninit,
.ndo_open = u50_dev_open,
.ndo_stop = u50_dev_stop,
.ndo_start_xmit = u50_dev_xmit,
.ndo_do_ioctl = u50_dev_ioctl,
.ndo_set_mac_address = U50SetHWAddr,
};
If you need some code to respond to SIOCDEVPRIVATE, you used to be able to do it via ndo_do_ioctl (writing a compatible function, then linking it in a net_device_ops struct in 5.4). However, in 5.15 it was changed so now you have to implement a ndo_siocdevprivate function, rather than ndo_do_ioctl, which is no longer called, according to the kernel documentation.
source:
https://elixir.bootlin.com/linux/v5.15.57/source/include/linux/netdevice.h
Patch that did this: spinics.net/lists/netdev/msg698158.html

Device number linux module issue

I'm developing character devices for a Linux kernel distro Poky 1.7 built with Yocto and installed on a Freescale imx6sx target board. They're simple modules, that are just used to read/write IO registers of the SoC.
Everything has worked well till I've load the modules after login, by simple calling in the target console
modprobe mydevice
Problems have begun since I've automatically load this modules at boot time. First time system startup after flashing my board, my custom modules are loaded and I can open them, do read/write operations...therefore they're working.
Restarting the system (switching it off and on or rebooting), modules are still loaded, but I'm unable to use them.
After some trials, what I've seen is that the problem is dynamic allocation of the device number they use, in particular function alloc_chrdev_region( ) called in the __init( ) function of the module.
In fact, really first time the system is switched on after has been flashed, the device file's device number in /dev is up-to-date with the /sys/class/myclass/mydevice/dev device number, while from the first reboot they doesn't match anymore. Briefly:
First boot:
/dev/mydevice device number [Major, Minor] = 247,0
/sys/class/myclass/mydevice/dev device number [Major, Minor] = 247,0
Second and later boots:
/dev/mydevice device number [Major, Minor] = 247,0
/sys/class/myclass/mydevice/dev device number [Major, Minor] = 248,0
So, it seems like the device file is kept in memory instead of been updated at every boot.
Here is main code for __init( ) and __exit( ) functions:
static struct class *g_deviceClass;
static struct cdev g_characterDevice;
static dev_t g_deviceNumber;
static int __init npe_power_drv_init(void)
{
int32_t result;
if ( alloc_chrdev_region(&g_deviceNumber, 0, 1, DEVICE_NAME) < 0 )
{
return -EAGAIN;
}
cdev_init(&g_characterDevice, &npe_power_drv_ops);
g_deviceClass = class_create(THIS_MODULE, CLASS_NAME);
if( device_create(g_deviceClass, NULL, g_deviceNumber, NULL, DEVICE_NAME) == NULL )
{
class_destroy(g_deviceClass);
unregister_chrdev_region(g_deviceNumber, 1);
return -EAGAIN;
}
result = cdev_add(&g_characterDevice, g_deviceNumber, 1);
if( result < 0 )
{
device_destroy(g_deviceClass, g_deviceNumber);
class_destroy(g_deviceClass);
unregister_chrdev_region(g_deviceNumber, 1);
return -EAGAIN;
}
return 0;
}
static void __exit npe_power_drv_exit(void)
{
device_destroy(g_deviceClass, g_deviceNumber);
class_destroy(g_deviceClass);
cdev_del(&g_characterDevice);
unregister_chrdev_region(g_deviceNumber, 1);
}
module_init(npe_power_drv_init);
module_exit(npe_power_drv_exit);
Using static device number allocation, register_chrdev_region( ) everything works fine. Sure I can use this latter way to implement my code, but I would like to know why I've got this behaviour.
Many thanks for any help
Andrea

KEXT: vnode_open() result in Kernel Panic

Sorry if it was asked before, but I can't really google any.
I was trying to read files within the KEXT of OSX, using vnode_open() like the following:
struct vnode *vp = NULL;
kern_return_t kret;
vfs_context_t ctx = vfs_context_current();
kret = vnode_open(path, FREAD, 0, 0, &vp, ctx);
if (kret != KERN_SUCCESS) {
// Error log
} else {
proc_t proc = vfs_context_proc(ctx);
kauth_cred_t vp_cred = vfs_context_ucred(ctx);
char *buf = NULL;
int resid;
int len = sizeof(struct astruct);
buf = (char *)IOMalloc(len);
kret = vn_rdwr(UIO_READ, fvp, (caddr_t)buf,
len, 0, UIO_SYSSPACE, 0, vp_cred, &resid, proc);
vnode_close(fvp, FREAD, ctx);
if (kret != KERN_SUCCESS) {
// Error log
}
// Do something with the result.
}
vfs_context_rele(ctx);
Once the kext was loaded, the system panics and reboots. As long as vnode_open() is there, it panics.
Am I doing it wrong?
One thing that immediately stands out:
You shouldn't be using vfs_context_current() - use vfs_context_create(NULL). What's worse, you're subsequently calling vfs_context_rele(ctx); on the returned context. Only vfs_context_create retains, vfs_context_current() does not, so you're over-releasing the VFS context. This could certainly cause a kernel panic.
In general, when developing kexts, you can really be doing a lot more than just letting the system reboot after a kernel panic:
Panic logs are written to NVRAM, and on next boot, are saved to file. You can inspect them via Console.app, under "System Diagnostic Reports" starting with "kernel".
Panic logs contain a stack trace, which is by default unsymbolicated. You can symbolicate them after the crash, but it's a lot more convenient to set the keepsyms=1 boot argument and get the crash handler to symbolicate for you.
You can set up the debug boot argument to make crashes initiate the kernel debugger or core dumps, write the stack trace to serial/firewire console, etc.
These things are documented in part on Apple's site, but a bit of searching on the web might get you some more detailed information. In any case, they're incredibly useful for debugging problems like this.

Setting process name on Mac OS X at runtime

I'm trying to change my process' name as it appears in ps and Activity Monitor at runtime. I found several notes that there is no portable way to do this (which I don't care about).
Here's what I tried. None of these approaches worked for me.
Changing argv[0] (seems to be the way to go on some Unix systems)
Calling [[NSProcessInfo processInfo] setProcessName:#"someName"]
Calling setprogname (calling getprogname returns the name I set, but that is irrelevant)
I also read about a function called setproctitle which should be defined in stdlib.h if it is available, but it's not there.
There must be a way to accomplish this because QTKitServer - the faceless decoder for QuickTime Player X - has its corresponding QuickTime Player's PID in its process name.
Does anybody have a clue about how to accomplish this? I'd very much prefer a Core Foundation or POSIXy way over an Objective-C method to do this.
Thanks,
Marco
Edit: If it is in any way relevant, I'm using Mac OS X 10.6.5 and Xcode 3.2.5
There are good reasons to change the process name. Java software should change process names because when running different java tools I want to see which java process is for which tool.
Chromium does it: http://src.chromium.org/viewvc/chrome/trunk/src/base/mac/mac_util.mm.
Node.js uses same code to implement Process.title = 'newtitle': https://github.com/joyent/node/blob/master/src/platform_darwin_proctitle.cc
Note: This fails if someone does su to a different not logged-in user: https://github.com/joyent/node/issues/1727
Here the source code in its full complex glory. By the way, someone told me it also works for Mac OS X Lion and also fails with su.
// Copyright (c) 2011 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
void SetProcessName(CFStringRef process_name) {
if (!process_name || CFStringGetLength(process_name) == 0) {
NOTREACHED() << "SetProcessName given bad name.";
return;
}
if (![NSThread isMainThread]) {
NOTREACHED() << "Should only set process name from main thread.";
return;
}
// Warning: here be dragons! This is SPI reverse-engineered from WebKit's
// plugin host, and could break at any time (although realistically it's only
// likely to break in a new major release).
// When 10.7 is available, check that this still works, and update this
// comment for 10.8.
// Private CFType used in these LaunchServices calls.
typedef CFTypeRef PrivateLSASN;
typedef PrivateLSASN (*LSGetCurrentApplicationASNType)();
typedef OSStatus (*LSSetApplicationInformationItemType)(int, PrivateLSASN,
CFStringRef,
CFStringRef,
CFDictionaryRef*);
static LSGetCurrentApplicationASNType ls_get_current_application_asn_func =
NULL;
static LSSetApplicationInformationItemType
ls_set_application_information_item_func = NULL;
static CFStringRef ls_display_name_key = NULL;
static bool did_symbol_lookup = false;
if (!did_symbol_lookup) {
did_symbol_lookup = true;
CFBundleRef launch_services_bundle =
CFBundleGetBundleWithIdentifier(CFSTR("com.apple.LaunchServices"));
if (!launch_services_bundle) {
LOG(ERROR) << "Failed to look up LaunchServices bundle";
return;
}
ls_get_current_application_asn_func =
reinterpret_cast<LSGetCurrentApplicationASNType>(
CFBundleGetFunctionPointerForName(
launch_services_bundle, CFSTR("_LSGetCurrentApplicationASN")));
if (!ls_get_current_application_asn_func)
LOG(ERROR) << "Could not find _LSGetCurrentApplicationASN";
ls_set_application_information_item_func =
reinterpret_cast<LSSetApplicationInformationItemType>(
CFBundleGetFunctionPointerForName(
launch_services_bundle,
CFSTR("_LSSetApplicationInformationItem")));
if (!ls_set_application_information_item_func)
LOG(ERROR) << "Could not find _LSSetApplicationInformationItem";
CFStringRef* key_pointer = reinterpret_cast<CFStringRef*>(
CFBundleGetDataPointerForName(launch_services_bundle,
CFSTR("_kLSDisplayNameKey")));
ls_display_name_key = key_pointer ? *key_pointer : NULL;
if (!ls_display_name_key)
LOG(ERROR) << "Could not find _kLSDisplayNameKey";
// Internally, this call relies on the Mach ports that are started up by the
// Carbon Process Manager. In debug builds this usually happens due to how
// the logging layers are started up; but in release, it isn't started in as
// much of a defined order. So if the symbols had to be loaded, go ahead
// and force a call to make sure the manager has been initialized and hence
// the ports are opened.
ProcessSerialNumber psn;
GetCurrentProcess(&psn);
}
if (!ls_get_current_application_asn_func ||
!ls_set_application_information_item_func ||
!ls_display_name_key) {
return;
}
PrivateLSASN asn = ls_get_current_application_asn_func();
// Constant used by WebKit; what exactly it means is unknown.
const int magic_session_constant = -2;
OSErr err =
ls_set_application_information_item_func(magic_session_constant, asn,
ls_display_name_key,
process_name,
NULL /* optional out param */);
LOG_IF(ERROR, err) << "Call to set process name failed, err " << err;
}
Edit: It's a complex and confusing problem.
On OS X there is no setproctitle(3). One has to write into the argv array (ugly
and a bit dangerous because it is possible to overwrite some environment variables with bogus stuff). Done correctly it works very well.
Additionally Apple has the ActivityMonitor application, something like the Task Manager under Windows. The code above manipulates ActivityMonitor but this manipulation doesn't seem to be encouraged by Apple (hence the use of undocumented functions).
Important: ps and ActivityMonitor don't show the same information.
Also important: ActivityMonitor is not available if you don't have GUI. This can happen if you ssh in to a remote Apple box and there is nobody logged in by GUI. Sadly there is a bug by Apple IMO. Just querying if there is a GUI sends an annoying warning message to stderr.
Summary: If you need to change ActivityMonitor, use the code above. If you have GUI-less situations and and dislike warnings on stderr, redirect stderr temporarily to /dev/null during the call of SetProcessName. If you need to change ps information, write into argv.
You can use the lsappinfo tool which comes with macOS since at least 10.6 and up to present day (10.13.2):
Shell:
lsappinfo setinfo <PID> --name <NAME>
C++:
#include <sstream>
#include <string>
#include <stdlib.h>
void setProcessName (pid_t pid, std::string name)
{
std::ostringstream cmd;
cmd << "/usr/bin/lsappinfo setinfo " << pid;
cmd << " --name \"" << name << "\"";
system (cmd.str().c_str());
}

Getting the current stack trace on Mac OS X

I'm trying to work out how to store and then print the current stack in my C++ apps on Mac OS X. The main problem seems to be getting dladdr to return the right symbol when given an address inside the main executable. I suspect that the issue is actually a compile option, but I'm not sure.
I have tried the backtrace code from Darwin/Leopard but it calls dladdr and has the same issue as my own code calling dladdr.
Original post:
Currently I'm capturing the stack with this code:
int BackTrace(Addr *buffer, int max_frames)
{
void **frame = (void **)__builtin_frame_address(0);
void **bp = ( void **)(*frame);
void *ip = frame[1];
int i;
for ( i = 0; bp && ip && i < max_frames; i++ )
{
*(buffer++) = ip;
ip = bp[1];
bp = (void**)(bp[0]);
}
return i;
}
Which seems to work ok. Then to print the stack I'm looking at using dladdr like this:
Dl_info dli;
if (dladdr(Ip, &dli))
{
ptrdiff_t offset;
int c = 0;
if (dli.dli_fname && dli.dli_fbase)
{
offset = (ptrdiff_t)Ip - (ptrdiff_t)dli.dli_fbase;
c = snprintf(buf, buflen, "%s+0x%x", dli.dli_fname, offset );
}
if (dli.dli_sname && dli.dli_saddr)
{
offset = (ptrdiff_t)Ip - (ptrdiff_t)dli.dli_saddr;
c += snprintf(buf+c, buflen-c, "(%s+0x%x)", dli.dli_sname, offset );
}
if (c > 0)
snprintf(buf+c, buflen-c, " [%p]", Ip);
Which almost works, some example output:
/Users/matthew/Library/Frameworks/Lgi.framework/Versions/A/Lgi+0x2473d(LgiStackTrace+0x5d) [0x102c73d]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x2a006(tart+0x28e72) [0x2b006]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x2f438(tart+0x2e2a4) [0x30438]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x35e9c(tart+0x34d08) [0x36e9c]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x1296(tart+0x102) [0x2296]
/Users/matthew/Code/Lgi/LgiRes/build/Debug/LgiRes.app/Contents/MacOS/LgiRes+0x11bd(tart+0x29) [0x21bd]
It's getting the method name right for the shared object but not for the main app. Those just map to "tart" (or "start" minus the first character).
Ideally I'd like line numbers as well as the method name at that point. But I'll settle for the correct function/method name for starters. Maybe shoot for line numbers after that, on Linux I hear you have to write your own parser for a private ELF block that has it's own instruction set. Sounds scary.
Anyway, can anyone sort this code out so it gets the method names right?
What releases of OS X are you targetting. If you are running on Mac OS X 10.5 and higher you can just use the backtrace() and backtrace_symbols() libraray calls. They are defined in execinfo.h, and there is a manpage with some sample code.
Edit:
You mentioned in the comments that you need to run on Tiger. You can probably just include the implementation from Libc in your app. The source is available from Apple's opensource site. Here is a link to the relevent file.

Resources