Occasional OSX kernel panic in RetrieveINQUIRYData() - macos

I have a SCSI driver which calls IOSCSIPrimaryCommandsDevice::RetrieveINQUIRYData() as part of the startup process. Sometimes, after unplugging, ejecting, re-inserting and generally abusing the device during testing, a kernel panic occurs in there.
The signature of this function is:
bool RetrieveINQUIRYData (
UInt8 EVPD,
UInt8 inquiryPage,
UInt8 * inquiryBuffer,
UInt16 * dataSize);
The routine which calls this starts like this. It is modelled after PublishUnitSerialNumber from IOSCSITargetDevice.cpp
void MyDriverClass::PublishUnitSerialNumber ( IOService * object )
{
bool result = false;
SCSICmd_INQUIRY_Page80_Header * data = NULL;
IOBufferMemoryDescriptor * buffer = NULL;
OSString * string = NULL;
char serialNumber[kINQUIRY_MaximumDataSize] = { 0 };
UInt16 length = 0;
SInt16 serialLength = 0;
DEBUG_LOG("%s[%p]::%s(%p)\n", getName(), this, __FUNCTION__, object);
buffer = IOBufferMemoryDescriptor::withCapacity ( kINQUIRY_MaximumDataSize, kIODirectionIn );
require( buffer != NULL, ErrorExit );
data = (SCSICmd_INQUIRY_Page80_Header*) buffer->getBytesNoCopy();
length = kINQUIRY_MaximumDataSize;
require(data != NULL, ReleaseBuffer);
bzero ( data, kINQUIRY_MaximumDataSize );
result = RetrieveINQUIRYData ( 0x01,
kINQUIRY_Page80_PageCode,
( UInt8 * ) data,
&length ); //PANIC IN HERE!
require ( result, ReleaseBuffer );
require ( ( data->PAGE_CODE == kINQUIRY_Page80_PageCode ), ReleaseBuffer );
This works most of the time, it's only occasionally that it panics. The GDB backtrace is faily unhelpful, as there are no symbols in that function, so all I know is that it's after another function call. I also can't find the source for this function - I think it's part of the closed-source IOKit code.
#0 Debugger (message=0x8001003b <Address 0x8001003b out of bounds>) at /SourceCache/xnu/xnu-1504.15.3/osfmk/i386/AT386/model_dep.c:867
#1 0xffffff8000204d15 in panic (str=0xffffff800057ecb8 "Kernel trap at 0x%016llx, type %d=%s, registers:\nCR0: 0x%016llx, CR2: 0x%016llx, CR3: 0x%016llx, CR4: 0x%016llx\nRAX: 0x%016llx, RBX: 0x%016llx, RCX: 0x%016llx, RDX: 0x%016llx\nRSP: 0x%016llx, RBP: 0x%0"...) at /SourceCache/xnu/xnu-1504.15.3/osfmk/kern/debug.c:303
#2 0xffffff80002d1208 in panic_trap [inlined] () at :1100
#3 0xffffff80002d1208 in kernel_trap (state=<value temporarily unavailable, due to optimizations>) at /SourceCache/xnu/xnu-1504.15.3/osfmk/i386/trap.c:1001
#4 0xffffff80002e3f4a in trap_from_kernel () at pmap.h:215
#5 0xffffff7f808040b6 in ?? ()
#6 0xffffff7f80804b8b in ?? ()
#7 0xffffff7f80f1324c in com_company_driver_myDriver::PublishUnitSerialNumber (this=0xffffff80130e0600, object=0xffffff8015375000) at /Volumes/user/src/driver/MyDriver.cpp:106
#8 0xffffff7f80f13553 in com_company_driver_myDriver::start (this=0xffffff80130e0600, provider=0xffffff8015375000) at /Volumes/user/src/driver/MyDriver.cpp:53
#9 0xffffff800052d5a6 in IOService::startCandidate (this=0x2710, service=0xe) at /SourceCache/xnu/xnu-1504.15.3/iokit/Kernel/IOService.cpp:2879
#10 0xffffff800052dcb1 in IOService::probeCandidates (this=0xffffff8015375000, matches=<value temporarily unavailable, due to optimizations>) at /SourceCache/xnu/xnu-1504.15.3/iokit/Kernel/IOService.cpp:2798
...
...
Is there something I should be checking or doing before calling RetrieveINQUIRYData() that I am not doing in the above code? From what I can see, all the pointers are checked for NULL, the lengths are set to constants and the only other argument is the literal for EVPD.

I'm not familiar with the SCSI stack, but here are a few ideas:
Even the closed-source kexts have basic debug symbols. So if you add-kext the relevant kext from /System/Library/Extensions/ you should at least get function/method names (and signatures, if C++). You won't get line numbers of course, but you'll get function disassembly boundaries, etc.
The panic message is interesting:
#0 Debugger (message=0x8001003b <Address 0x8001003b out of bounds>) at /SourceCache/xnu/xnu-1504.15.3/osfmk/i386/AT386/model_dep.c:867
In the 10.6.8 sources, I can't find where that message is coming from, so I'm not quite sure what bounds are expected and where this requirement is coming from. It looks like it might be a physical address though - can you maybe insert some kprintf debug output to dump your buffer's getPhysicalSegment(0, NULL, 0) result and see if that matches the address that triggers the KP? If that's not it, see if you can figure out what instruction is causing the crash and where the incorrect address is coming from.
Good luck!

Related

Using IRPs for I/O on device object returned by IoGetDeviceObjectPointer()

Can one use IoCallDriver() with an IRP created by IoBuildAsynchronousFsdRequest() on a device object returned by IoGetDeviceObjectPointer()? What I have currently fails with blue screen (BSOD) 0x7E (unhandled exception), which when caught shows an Access Violation (0xc0000005). Same code worked when the device was stacked (using the device object returned by IoAttachDeviceToDeviceStack()).
So what I have is about the following:
status = IoGetDeviceObjectPointer(&device_name, FILE_ALL_ACCESS, &FileObject, &windows_device);
if (!NT_SUCCESS(status)) {
return -1;
}
offset.QuadPart = 0;
newIrp = IoBuildAsynchronousFsdRequest(io, windows_device, buffer, 4096, &offset, &io_stat);
if (newIrp == NULL) {
return -1;
}
IoSetCompletionRoutine(newIrp, DrbdIoCompletion, bio, TRUE, TRUE, TRUE);
status = ObReferenceObjectByPointer(newIrp->Tail.Overlay.Thread, THREAD_ALL_ACCESS, NULL, KernelMode);
if (!NT_SUCCESS(status)) {
return -1;
}
status = IoCallDriver(bio->bi_bdev->windows_device, newIrp);
if (!NT_SUCCESS(status)) {
return -1;
}
return 0;
device_name is \Device\HarddiskVolume7 which exists according to WinObj.exe .
buffer has enough space and is read/writable. offset and io_stat are on stack (also tried with heap, didn't help). When catching the exception (SEH exception) it doesn't blue screen but shows an access violation as reason for the exception. io is IRP_MJ_READ.
Do I miss something obvious? Is it in general better to use IRPs than the ZwCreateFile / ZwReadFile / ZwWriteFile API (which would be an option, but isn't that slower?)? I also tried a ZwCreateFile to have an extra reference, but this also didn't help.
Thanks for any insights.
you make in this code how minimum 2 critical errors.
can I ask - from which file you try read (or write) data ? from
FileObject you say ? but how file system driver, which will handle
this request know this ? you not pass any file object to newIrp.
look for IoBuildAsynchronousFsdRequest - it have no file object
parameter (and impossible get file object from device object - only
visa versa - because on device can be multiple files open). so it
and can not be filled by this api in newIrp. you must setup it
yourself:
PIO_STACK_LOCATION irpSp = IoGetNextIrpStackLocation( newIrp );
irpSp->FileObject = FileObject;
I guess bug was exactly when file system try access FileObject
from irp which is 0 in your case. also read docs for
IRP_MJ_READ - IrpSp->FileObject -
Pointer to the file object that is associated with DeviceObject
you pass I guess local variables io_stat (and offset) to
IoBuildAsynchronousFsdRequest. as result io_stat must be valid
until newIrp is completed - I/O subsystem write final result to it
when operation completed. but you not wait in function until request
will be completed (in case STATUS_PENDING returned) but just exit
from function. as result later I/O subsystem, if operation completed
asynchronous, write data to arbitrary address &io_stat (it became
arbitrary just after you exit from function). so you need or check
for STATUS_PENDING returned and wait in this case (have actually
synchronous io request). but more logical use
IoBuildSynchronousFsdRequest in this case. or allocate io_stat
not from stack, but say in your object which correspond to file. in
this case you can not have more than single io request with this
object at time. or if you want exactly asynchronous I/O - you can do
next trick - newIrp->UserIosb = &newIrp->IoStatus. as result you
iosb always will be valid for newIrp. and actual operation status
you check/use in DrbdIoCompletion
also can you explain (not for me - for self) next code line ?:
status = ObReferenceObjectByPointer(newIrp->Tail.Overlay.Thread, THREAD_ALL_ACCESS, NULL, KernelMode);
who and where dereference thread and what sense in this ?
Can one use ...
we can use all, but with condition - we understand what we doing and deep understand system internally.
Is it in general better to use IRPs than the ZwCreateFile / ZwReadFile
/ ZwWriteFile API
for performance - yes, better. but this require more code and more complex code compare api calls. and require more knowledge. also if you know that previous mode is kernel mode - you can use NtCreateFile, NtWriteFile, NtReadFile - this of course will be bit slow (need every time reference file object by handle) but more faster compare Zw version
Just wanted to add that the ObReferenceObjectByPointer is needed
because the IRP references the current thread which may exit before
the request is completed. It is dereferenced in the Completion
Routine. Also as a hint the completion routine must return
STATUS_MORE_PROCESSING_REQUIRED if it frees the IRP (took me several
days to figure that out).
here you make again several mistakes. how i understand you in completion routine do next:
IoFreeIrp(Irp);
return StopCompletion;
but call simply call IoFreeIrp here is error - resource leak. i advice you check (DbgPrint) Irp->MdlAddress at this point. if you read data from file system object and request completed asynchronous - file system always allocate Mdl for access user buffer in arbitrary context. now question - who free this Mdl ? IoFreeIrp - simply free Irp memory - nothing more. you do this yourself ? doubt. but Irp is complex object, which internally hold many resources. as result need not only free it memory but call "destructor" for it. this "destructor" is IofCompleteRequest. when you return StopCompletion (=STATUS_MORE_PROCESSING_REQUIRED) you break this destructor at very begin. but you must latter again call IofCompleteRequest for continue Irp (and it resources) correct destroy.
about referencing Tail.Overlay.Thread - what you doing - have no sense:
It is dereferenced in the Completion Routine.
but IofCompleteRequest access Tail.Overlay.Thread after it
call your completion routine (and if you not return
StopCompletion). as result your reference/dereference thread lost
sense - because you deference it too early, before system
actually access it.
also if you return StopCompletion and not more call
IofCompleteRequest for this Irp - system not access
Tail.Overlay.Thread at all. and you not need reference it in this
case.
and exist else one reason, why reference thread is senseless. system
access Tail.Overlay.Thread only for insert Apc to him - for call
final part (IopCompleteRequest) of Irp destruction in original
thread context. really this need only for user mode Irp's requests,
where buffers and iosb located in user mode and valid only in
context of process (original thread ). but if thread is terminated -
call of KeInsertQueueApc fail - system not let insert apc to
died thread. as result IopCompleteRequest will be not called and
resources not freed.
so you or dereference Tail.Overlay.Thread too early or you not need do this at all. and reference for died thread anyway not help. in all case what you doing is error.
you can try do next here:
PETHREAD Thread = Irp->Tail.Overlay.Thread;
IofCompleteRequest(Irp, IO_NO_INCREMENT);// here Thread will be referenced
ObfDereferenceObject(Thread);
return StopCompletion;
A second call to IofCompleteRequest causes the I/O manager to resume calling the IRP's completion. here io manager and access Tail.Overlay.Thread insert Apc to him. and finally you call ObfDereferenceObject(Thread); already after system access it and return StopCompletion for break first call to IofCompleteRequest. look like correct but.. if thread already terminated, how i explain in 3 this will be error, because KeInsertQueueApc fail. for extended test - call IofCallDriver from separate thread and just exit from it. and in completion run next code:
PETHREAD Thread = Irp->Tail.Overlay.Thread;
if (PsIsThreadTerminating(Thread))
{
DbgPrint("ThreadTerminating\n");
if (PKAPC Apc = (PKAPC)ExAllocatePool(NonPagedPool, sizeof(KAPC)))
{
KeInitializeApc(Apc, Thread, 0, KernelRoutine, 0, 0, KernelMode, 0);
if (!KeInsertQueueApc(Apc, 0, 0, IO_NO_INCREMENT))
{
DbgPrint("!KeInsertQueueApc\n");
ExFreePool(Apc);
}
}
}
PMDL MdlAddress = Irp->MdlAddress;
IofCompleteRequest(Irp, IO_NO_INCREMENT);
ObfDereferenceObject(Thread);
if (MdlAddress == Irp->MdlAddress)
{
// IopCompleteRequest not called due KeInsertQueueApc fail
DbgPrint("!!!!!!!!!!!\n");
IoFreeMdl(MdlAddress);
IoFreeIrp(Irp);
}
return StopCompletion;
//---------------
VOID KernelRoutine (PKAPC Apc,PKNORMAL_ROUTINE *,PVOID *,PVOID *,PVOID *)
{
DbgPrint("KernelRoutine(%p)\n", Apc);
ExFreePool(Apc);
}
and you must got next debug output:
ThreadTerminating
!KeInsertQueueApc
!!!!!!!!!!!
and KernelRoutine will be not called (like and IopCompleteRequest) - no print from it.
so what is correct solution ? this of course not documented anywhere, but based on deep internal understand. you not need reference original thread. you need do next:
Irp->Tail.Overlay.Thread = KeGetCurrentThread();
return ContinueCompletion;
you can safe change Tail.Overlay.Thread - if you have no any pointers valid only in original process context. this is true for kernel mode requests - all your buffers in kernel mode and valid in any context. and of course you not need break Irp destruction but continue it. for correct free mdl and all irp resources. and finally system call IoFreeIrp for you.
and again for iosb pointer. how i say pass local variable address, if you exit from function before irp completed (and this iosb accessed) is error. if you break Irp destruction, iosb will be not accessed of course, but in this case much better pass 0 pointer as iosb. (if you latter something change and iosb pointer will be accessed - will be the worst error - arbitrary memory corrupted - with unpredictable effect. and research crash of this will be very-very hard). but if you completion routine - you not need separate iosb at all - you have irp in completion and can direct access it internal iosb - for what you need else one ? so the best solution will be do next:
Irp->UserIosb = &Irp->IoStatus;
full correct example how read file asynchronous:
NTSTATUS DemoCompletion (PDEVICE_OBJECT /*DeviceObject*/, PIRP Irp, BIO* bio)
{
DbgPrint("DemoCompletion(p=%x mdl=%p)\n", Irp->PendingReturned, Irp->MdlAddress);
bio->CheckResult(Irp->IoStatus.Status, Irp->IoStatus.Information);
bio->Release();
Irp->Tail.Overlay.Thread = KeGetCurrentThread();
return ContinueCompletion;
}
VOID DoTest (PVOID buf)
{
PFILE_OBJECT FileObject;
NTSTATUS status;
UNICODE_STRING ObjectName = RTL_CONSTANT_STRING(L"\\Device\\HarddiskVolume2");
OBJECT_ATTRIBUTES oa = { sizeof(oa), 0, &ObjectName, OBJ_CASE_INSENSITIVE };
if (0 <= (status = GetDeviceObjectPointer(&oa, &FileObject)))
{
status = STATUS_INSUFFICIENT_RESOURCES;
if (BIO* bio = new BIO(FileObject))
{
if (buf = bio->AllocBuffer(PAGE_SIZE))
{
LARGE_INTEGER ByteOffset = {};
PDEVICE_OBJECT DeviceObject = IoGetRelatedDeviceObject(FileObject);
if (PIRP Irp = IoBuildAsynchronousFsdRequest(IRP_MJ_READ, DeviceObject, buf, PAGE_SIZE, &ByteOffset, 0))
{
Irp->UserIosb = &Irp->IoStatus;
Irp->Tail.Overlay.Thread = 0;
PIO_STACK_LOCATION IrpSp = IoGetNextIrpStackLocation(Irp);
IrpSp->FileObject = FileObject;
bio->AddRef();
IrpSp->CompletionRoutine = (PIO_COMPLETION_ROUTINE)DemoCompletion;
IrpSp->Context = bio;
IrpSp->Control = SL_INVOKE_ON_CANCEL|SL_INVOKE_ON_ERROR|SL_INVOKE_ON_SUCCESS;
status = IofCallDriver(DeviceObject, Irp);
}
}
bio->Release();
}
ObfDereferenceObject(FileObject);
}
DbgPrint("DoTest=%x\n", status);
}
struct BIO
{
PVOID Buffer;
PFILE_OBJECT FileObject;
LONG dwRef;
void AddRef()
{
InterlockedIncrement(&dwRef);
}
void Release()
{
if (!InterlockedDecrement(&dwRef))
{
delete this;
}
}
void* operator new(size_t cb)
{
return ExAllocatePool(PagedPool, cb);
}
void operator delete(void* p)
{
ExFreePool(p);
}
BIO(PFILE_OBJECT FileObject) : FileObject(FileObject), Buffer(0), dwRef(1)
{
DbgPrint("%s<%p>(%p)\n", __FUNCTION__, this, FileObject);
ObfReferenceObject(FileObject);
}
~BIO()
{
if (Buffer)
{
ExFreePool(Buffer);
}
ObfDereferenceObject(FileObject);
DbgPrint("%s<%p>(%p)\n", __FUNCTION__, this, FileObject);
}
PVOID AllocBuffer(ULONG NumberOfBytes)
{
return Buffer = ExAllocatePool(PagedPool, NumberOfBytes);
}
void CheckResult(NTSTATUS status, ULONG_PTR Information)
{
DbgPrint("CheckResult:status = %x, info = %p\n", status, Information);
if (0 <= status)
{
if (ULONG_PTR cb = min(16, Information))
{
char buf[64], *sz = buf;
PBYTE pb = (PBYTE)Buffer;
do sz += sprintf(sz, "%02x ", *pb++); while (--cb); sz[-1]= '\n';
DbgPrint(buf);
}
}
}
};
NTSTATUS GetDeviceObjectPointer(POBJECT_ATTRIBUTES poa, PFILE_OBJECT *FileObject )
{
HANDLE hFile;
IO_STATUS_BLOCK iosb;
NTSTATUS status = IoCreateFile(&hFile, FILE_READ_DATA, poa, &iosb, 0, 0,
FILE_SHARE_VALID_FLAGS, FILE_OPEN, FILE_NO_INTERMEDIATE_BUFFERING, 0, 0, CreateFileTypeNone, 0, 0);
if (0 <= (status))
{
status = ObReferenceObjectByHandle(hFile, 0, *IoFileObjectType, KernelMode, (void**)FileObject, 0);
NtClose(hFile);
}
return status;
}
and output:
BIO::BIO<FFFFC000024D4870>(FFFFE00001BAAB70)
DoTest=103
DemoCompletion(p=1 mdl=FFFFE0000200EE70)
CheckResult:status = 0, info = 0000000000001000
eb 52 90 4e 54 46 53 20 20 20 20 00 02 08 00 00
BIO::~BIO<FFFFC000024D4870>(FFFFE00001BAAB70)
the eb 52 90 4e 54 46 53 read ok

Kernel address poising by clearing upper bits?

Is there some mechanism in Linux which is poisoning addresses by zeroing upper 16 bits?
I am debugging a kernel crash on an Intel x86-64 machine. The instruction which is causing the crash tries to access an address of 0x880139f3da00:
crash> bt
R10: 0000000000000001 R11: 0000000000000001 R12: 0000880139f3da00
~~~~~~~~~~~~~~~~~~~~~
crash> p arp_tbl.nht->hash_buckets[255]
$66 = (struct neighbour *) 0x880139f3da00
crash> p *arp_tbl.nht->hash_buckets[255]
Cannot access memory at address 0x880139f3da00
The hash_buckets table is valid:
crash> p arp_tbl.nht->hash_buckets[253]
$70 = (struct neighbour *) 0xffff88007325ae00
$71 = {
next = 0x0,
tbl = 0xffffffff81abbf20 <arp_tbl>,
Setting upper word to 0xffff makes the address valid and returns a valid data structure:
crash> p *((struct neighbour *)0xffff880139f3da00)
$73 = {
next = 0xffff88006de69a00,
tbl = 0xffffffff81abbf20 <arp_tbl>,
... rest looks reasonable too ...
Structure is updated by RCU operations (e.g. very likely by these in neigh_flush_dev()). So, what could be the reason that the address becomes invalid in such a way?
I can exclude hardware defects (seen on two machines and with different addresses). Systems are running CentOS 7 with kernel 3.10.0-514.6.1.el7.centos.plus.x86_64 till 3.10.0-514.21.2.el7.centos.plus.x86_64.
Update
From another crash dump, I see an skb of an IPv6 packet with
crash> p *((struct sk_buff *)0xffff880070e25e00)
$57 = {
transport_header = 54,
network_header = 14,
mac_header = 0,
...
head = 0xffff880138e28000 "",
data = 0xffff880138e2800e "`",
...
}
This crashes when writing the first 0x8 bytes in
#define HH_DATA_MOD 16
static inline int neigh_hh_output(const struct hh_cache *hh, struct sk_buff *skb)
{
if (likely(hh_len <= HH_DATA_MOD)) {
memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD); <<<<<
This would explain why two bytes are overridden (16 - 14).
can you inspect the memory location this address was read from? typically such a "partial zero" read is a result of memset being run on the area. after this cpu triggered a crash there was possibly enough time for whoever else was modifying the area to finish zeroing and possibly even fill it with other data.
so far there is no reason to suspect rcu plays any role here
this is most definitely not "poisoning" done by the kernel (it would be quite weird to do it in this way). however, if the crash is reproducible (you say it occurred on at least 2 different machines?) then running a debug kernel may be of help, especially with slab debug enabled.

Interrupt performance on linux kernel with RT patches - should be better?

I have bumped into a bit inconsistent IRQ/ISR performance on Freescales imx.233 running linux kernel (3.8.13) with CONFIG_PREEMPT_RT patches.
I am little bit surprised why this processor (ARM9, 454mhz) is unable to keep up even with 74kHz IRQ requests.. ?
In my kernel config I have set following flags:
CONFIG_TINY_PREEMPT_RCU=y
CONFIG_PREEMPT_RCU=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_RT_BASE=y
CONFIG_HAVE_PREEMPT_LAZY=y
CONFIG_PREEMPT_LAZY=y
CONFIG_PREEMPT_RT_FULL=y
CONFIG_PREEMPT_COUNT=y
CONFIG_DEBUG_PREEMPT=y
On the system there is basically nothing running (created by buildroot), and I set PWM to generate a pulse of 74kHz, that serves as interrupt.
Then in the ISR, I just trigger another GPIO output pin, and check the output.
What I find is that sometimes I miss an interrupt -
You can see the missed interrupt here:
And also the the triggering of output pin seems to be a bit inconsistent, the output pin is triggered usually within "5% window", that might still be acceptable. But I worry, that when I start implementing data transfer logic, instead of just triggering the pin, I might run into further problems...
My simple driver code looks like this:
#needed includes
uint16_t INPUT_IRQ = 39;
uint16_t OUTPUT_GPIO = 38;
struct test_device *device;
//Prototypes
void irqtest_exit(void);
int irqtest_init(void);
void free_device(void);
//Default functions
module_init(irqtest_init);
module_exit(irqtest_exit);
//triggering flag
uint16_t pulse = 0x1;
irqreturn_t irq_handle_function(int irq, void *device_id)
{
pulse = !pulse;
gpio_set_value(OUTPUT_GPIO, pulse);
return IRQ_HANDLED;
}
struct test_device {
int huuhaa;
};
void free_device() {
if (device)
kfree(device);
}
int irqtest_init(void) {
int result = 0;
device = kmalloc(sizeof *device, GFP_KERNEL);
device->huuhaa = 10;
printk("IRB/irqtest_init: Inserting IRQ module\n");
printk("IRB/irqtest_init: Requesting GPIO (%d)\n", INPUT_IRQ);
result = gpio_request_one(INPUT_IRQ, GPIOF_IN, "PWM input");
if (result != 0) {
free_device();
printk("IRB/irqtest_init: Failed to set GPIO (%d) as input.. exiting\n", INPUT_IRQ);
return -EINVAL;
}
result = gpio_request_one(OUTPUT_GPIO, GPIOF_OUT_INIT_LOW , "IR OUTPUT");
if (result != 0) {
free_device();
printk("IRB/irqtest_init: Failed to set GPIO (%d) as output.. exiting\n", OUTPUT_GPIO);
return -EINVAL;
}
//Set our desired interrupt line as input
result = gpio_direction_input(INPUT_IRQ);
if (result != 0) {
printk("IRB/irqtest_init: Failed to set IRQ as input.. exiting\n");
free_device();
return -EINVAL;
}
//Set flags for our interrupt, guessing here..
irq_flags |= IRQF_NO_THREAD;
irq_flags |= IRQF_NOBALANCING;
irq_flags |= IRQF_TRIGGER_RISING;
irq_flags |= IRQF_NO_SOFTIRQ_CALL;
//register interrupt
result = request_irq(gpio_to_irq(INPUT_IRQ), irq_handle_function, irq_flags, "irq testing", device);
if (result != 0) {
printk("IRB/irqtest_init: Failed to reserve GPIO 38\n");
return -EINVAL;
}
printk("IRB/irqtest_init: insert success\n");
return 0;
}
void irqtest_exit(void) {
if (device)
kfree(device);
gpio_free(INPUT_IRQ);
gpio_free(OUTPUT_GPIO);
printk("IRB/irqtest_exit: Removing irqtest module\n");
}
int irqtest_open(struct inode *inode, struct file *filp) {return 0;}
int irqtest_release(struct inode *inode, struct file *filp) {return 0;}
In the system, I have following interrupts registered, after the driver is loaded:
# cat /proc/interrupts
CPU0
16: 36379 - MXS Timer Tick
17: 0 - mxs-spi
18: 2103 - mxs-dma
60: 0 gpio-mxs irq testing
118: 0 - mxs-spi
119: 0 - mxs-dma
120: 0 - RTC alarm
124: 0 - 8006c000.serial
127: 68050 - uart-pl011
128: 151 - ci13xxx_imx
Err: 0
I wonder if the flags I declare to my IRQ are good ? I noticed that with this configuration, I can no longer reach console, so kernel seems totally consumed with servicing this 74kHz trigger now.. this can't be right ?
I suppose it's not a big deal for me since this is only during data transfer, but still I feel I'm doing something wrong..
Also, I wonder if it would be more efficient to map the registers with ioremap, and trigger the output with direct memory writes ?
Is there some way I could increase the priority of the interrupt even higher ? Or could I somehow lock the kernel for the duration of the data transfer (~400ms), and generate somehow else my timing for the output ?
Edit: Forgot to add /proc/interrupts output to the question...
What you experience here is interrupt jitter. This is to be expected on Linux, because the kernel regularly disables the interrupts for various tasks (entering a spinlock, handling an interrupt, etc.).
This will happen, regardless wether you have PREEMPT_RT or not, so expecting to generate 74kHz signal with regular interrupts is pretty much unrealistic.
Now, ARM has higher priority interrupts called FIQs, that will never be masked or disabled.
Linux doesn't use FIQ, and is not built to deal with the fact that an FIQ could be used, so you won't be able to use the generic kernel framework.
From Linux driver development point of view however, it's not really different as long as you keep this in mind: you have to write a handler, and associate it to an IRQ. You'll also have to poke into the interrupt controller to make it generate a FIQ for the interrupt you want to use (the details on how to change it are platform-dependant. Some platforms have functions to do that (like imx25 and mxc_set_irq_fiq), some others don't. imx23/28 don't, so you'll have to do it by hand).
The only thing that the functions to setup a fiq handler only work with a assembly-written handler, so you'll have to rewrite your handler in assembly (with your current code, it should be trivial though).
You can grab additional details to the blog post Alexandre posted (http://free-electrons.com/blog/fiq-handlers-in-the-arm-linux-kernel/), where you'll find working code, samples, and explanations on how it all works together.
You can have a look at what my colleague Maxime Ripard did using an FIQ on a similar SoC (i.mx28) :
http://free-electrons.com/blog/fiq-handlers-in-the-arm-linux-kernel/
Try this flags:
int irq_flags;
...
irq_flags = IRQF_TRIGGER_RISING | IRQF_EARLY_RESUME
I had a kernel 3.8.11 and can't find IRQF_NO_SOFTIRQ_CALL define. It's only for 3.8.13?
Also I didn't see irq_flags define. Where is it?

How do I log messages / break execution for alloc, retain, release and dealloc of CFType objects?

I want to be able to log messages (and preferably break to the debugger) each time a specific CFType object (for my current purposes, a CGPDFDocument) is allocated, retained, released or deallocated.
Because there isn't a Create...() method for CGPDFDocument which takes a CFAllocatorRef, I'm trying to change the default allocator temporarily like this:
void MyPDFDocumentCreate()
{
// ...
CFAllocatorRef defaultAllocator = CFAllocatorGetDefault();
CFAllocatorSetDefault(MyLogAllocator());
CGPDFDocumentRef documentRef = CGPDFDocumentCreateWithProvider(provider);
CFAllocatorSetDefault(defaultAllocator);
// ...
}
where MyLogAllocator() is defined as follows:
static void *(*DefaultAllocate)(CFIndex size, CFOptionFlags hint, void *info);
static const void *(*DefaultRetain)(const void *info);
static void (*DefaultRelease)(const void *info);
void *LogAllocate(CFIndex size, CFOptionFlags hint, void *info)
{
fprintf(stderr, "LogAllocate %p", info);
if (DefaultAllocate)
return DefaultAllocate(size, hint, info);
else
return NULL;
}
const void *LogRetain(const void *info)
{
fprintf(stderr, "LogRetain");
if (DefaultRetain)
return DefaultRetain(info);
else
return info;
}
void LogRelease(const void *info)
{
fprintf(stderr, "LogRelease");
if (DefaultRelease)
DefaultRelease(info);
}
static CFAllocatorRef MyLogAllocator()
{
static CFAllocatorRef theLogAllocator = NULL;
if (!theLogAllocator)
{
CFAllocatorContext context;
CFAllocatorRef defaultAllocator = CFAllocatorGetDefault();
CFAllocatorGetContext(defaultAllocator, &context);
DefaultAllocate = context.allocate;
DefaultRetain = context.retain;
DefaultRelease = context.release;
context.allocate = LogAllocate;
context.retain = LogRetain;
context.release = LogRelease;
theLogAllocator = CFAllocatorCreate(kCFAllocatorUseContext, &context);
}
return theLogAllocator;
}
However, it seems that the default allocator (kCFAllocatorSystemDefault as far as I can tell) has NULL for context.retain and context.release, so I don't have any original implementations to call. That may be why, when I try the code above, I get the following stack trace:
#0 0x357ded12 in CFRetain ()
#1 0x357dcb68 in _CFRuntimeCreateInstance ()
#2 0x303fe35e in CGTypeCreateInstanceWithAllocator ()
#3 0x303fe34c in CGTypeCreateInstance ()
#4 0x304b32f4 in CGPDFDocumentCreateWithProvider ()
#5 0x000293f4 in MyPDFDocumentCreate ([...]) at [...]
XCode doesn't actually tell me why it's halting, but if I try to continue I get:
(gdb) continue
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
0x357ded12 in CFRetain ()
(gdb) continue
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
0x357ded12 in CFRetain ()
(gdb)
However many times I continue, I get the same SIGTRAP. I don't know how to interpret it; the only breakpoint I have set up is a symbolic one on objc_exception_throw.
One thing to note is that LogRetain() and LogAllocate() are each called successfully once (in that order) from CFAllocatorCreate():
#0 LogRetain (info=0x1a8000) at [...]
#1 0x358086f2 in CFAllocatorCreate ()
#2 0x00028d58 in MyLogAllocator () at [...]
#3 0x000293e0 in MyPDFDocumentCreate ([...]) at [...]
#0 LogAllocate (size=104, hint=0, info=0x1a8000) at [...]
#1 0x3580882e in CFAllocatorCreate ()
#2 0x00028d58 in MyLogAllocator () at [...]
#3 0x000293e0 in MyPDFDocumentCreate ([...]) at [...]
And then LogAllocate() is again successful from CFAllocatorAllocate():
#0 LogAllocate (size=64, hint=1024, info=0x1a8000) at [...]
#1 0x357dcc06 in CFAllocatorAllocate ()
#2 0x357dcb04 in _CFRuntimeCreateInstance ()
#3 0x303fe35e in CGTypeCreateInstanceWithAllocator ()
#4 0x303fe34c in CGTypeCreateInstance ()
#5 0x304b32f4 in CGPDFDocumentCreateWithProvider ()
#6 0x000293f4 in MyPDFDocumentCreate ([...]) at [...]
before the _CFRuntimeCreateInstance() at #2 calls the problematic CFRetain() detailed above.
Could somebody please help me to understand what's going on here (especially how the default allocator handles retain and release, and why I'm getting the SIGTRAP); how to fix it; and whether there's a better way to do what I'm trying to do?
(I figured I might be able to work out how to use DTrace to probe CFRetain() and CFRelease(), filtered by the CFTypeID for CGPDFDocument, but I wouldn't know what to probe for deallocation (allocation is not so important to track as I know it's done within CGPDFDocumentCreateWithProvider()). Also, I'd prefer to be able to break to the debugger on retain / release / deallocate, which I don't think is possible using DTrace.)
UPDATE: Having now read the source code for CFRelease I realise I misunderstood the purpose of context.retain and context.release -- they are for retaining and releasing context.info. So the entire approach described above is a non-starter. However, perhaps a DTrace/Instruments wizard may still be able to work some magic?!
It's a very interesting problem. Since you've gone as deep as investigating filtering DTrace, and you're diving into the CFRelease source, then you can look at using gdb breakpoint conditions to choose when to break. To determine if a deallocation is going to occur, just use CFGetRetainCount().
That said, I'm guessing you're pulling your hair out tracking down some over-release crash, right? Things to look at that are probably more useful than reverse-engineering CFRelease():
CFZombie
Instruments' Allocations instrument provides full stacks of when objects were retained and released, allocated and destroyed. Turn on the option "Record reference counts."

How do I get an Mac OS Component Manager Component to be visible to other processes?

This is a bit esoteric, but there have to be a few people here who know how OS X's Carbon Component Manager works. I've made a couple of little apps to play around with making Components (see here for some background). Actually, one of the apps is a sample program straight from Apple called 'Fiendishthngs'. It lists all the Components that the Component Manager is making available. My program is a simple little thing that registers a Component, lists all the Components that the Component Manager has, and then waits around indefinately (to avoid purging the Component that it registered).
On my system, the Component Manager is tracking 873 Components (mostly codecs of one sort of another). My program that registers a Component registers it, and then counts 874 Components because it just registered one itself, of course). Here's the source:
void RegisterBasicComponent()
{
ComponentDescription desc;
desc.componentType = kMyComponentType;
desc.componentSubType = kMyComponentSubType;
desc.componentManufacturer = kMyComponentManufacturer;
desc.componentFlags = 0;
desc.componentFlagsMask = cmpIsMissing;
ComponentRoutineUPP MyComponentRoutineUPP
= NewComponentRoutineUPP( &MyComponentRoutineProc );
// Handle name_handle = NewHandle( sizeof( kMyComponentName ) );
//strcpy( *(char**)name_handle, kMyComponentName );
//RegisterComponent( &desc, MyComponentRoutineUPP, registerComponentGlobal, name_handle, NULL, NULL );
Component component = RegisterComponent( &desc, MyComponentRoutineUPP, registerComponentGlobal, NULL, NULL, NULL );
if ( NULL != component )
printf("The registration seems to have worked!\n");
else
printf("Nope - didn't work for some reason.\n");
}
int main( void )
{
RegisterBasicComponent();
ComponentDescription looking;
// OSType componentType; /* A unique 4-byte code indentifying the command set */
// OSType componentSubType; /* Particular flavor of this instance */
// OSType componentManufacturer; /* Vendor indentification */
// UInt32 componentFlags; /* 8 each for Component,Type,SubType,Manuf/revision */
// UInt32 componentFlagsMask; /* Mask for specifying which flags to consider in search, zero during registration */
looking.componentType = kAnyComponentType;
looking.componentSubType = kAnyComponentSubType;
// looking.componentSubType = kComponentResourceType
looking.componentManufacturer = kAnyComponentManufacturer;
looking.componentFlags = 0;
looking.componentFlagsMask = cmpIsMissing;
long numComponents = CountComponents ( &looking );
printf("Found %ld components.\n", numComponents);
Component component = 0;
int i = 0;
while (true)
{
component = FindNextComponent(component, &looking);
if ( 0 == component )
break;
ComponentDescription desc;
Handle componentName = NewHandle(256);
Handle componentInfo = NewHandle(1024);
Handle componentIcon = 0;
OSErr err = GetComponentInfo( component,
&desc,
componentName,
componentInfo,
componentIcon );
if ( err != noErr )
{
printf("Couldn't find any info on component %d of %ld in list!\n", i
, numComponents);
break;
}
printf( "%d of %ld: '%c%c%c%c', '%c%c%c%c', '%c%c%c%c', '%s'\n",
i, numComponents,
SPLAT_WORD( desc.componentManufacturer ),
SPLAT_WORD( desc.componentType ),
SPLAT_WORD( desc.componentSubType ),
*componentName );
RecoverHandle( *componentName );
RecoverHandle( *componentInfo );
++i;
}
while (true)
{
printf("Waiting around for someone to use me...\n");
sleep( 3 );
}
}
Anyways, when I run this, keep it running (so the Component would presumably stay registered with the Component Manager), and then run Fiendishthngs, Fiendishthngs can't see my test Component that I register - it only sees 873 Components. The 'registerComponentGlobal' flag passed into RegisterComponent() is supposed to make the Component available to other processes, but it seems like something is going awry.
Any ideas?
Well, I left this issue behind, resigning myself to the fact that OS X's Component Manager probably doesn't support the 'global' option anymore.
This would make a lot of sense, really. Making your component 'global' from your process to other processes would require out-of-process call marshalling, like with RPC, with OS X. On the other hand, in OS 9 and earlier, it would make perfect sense, given that all processes lived in a common address space. For OS 9 it would be trivial to make a component globally available to all processes.
Anyways, today I was disassembling RegisterComponentFileRefEntries() where the relevant code seems to reside, and sure enough, I saw this in the pre-amble to the function (the comments are mine):
0x9841a026 <+0018> mov eax, DWORD PTR [ebp+0x8] // load param spec
0x9841a029 <+0021> mov DWORD PTR [ebp-0xdc],eax // local spec = param spec
0x9841a02f <+0027> mov edx,DWORD PTR [ebp+0x10] // load param toRegister
0x9841a032 <+0030> mov DWORD PTR [ebp-0xe0],edx // local toRegister = param toRegister
The signature for RegisterComponentFileRefEntries is
extern OSErr
RegisterComponentFileEntries(
const FSSpec * spec,
short global,
const ComponentDescription * toRegister, /* can be NULL */
UInt32 registerCount)
The only 2 parameters that RegisterComponentFileRefEntries bothers with are spec (at ebp+0x8) and toRegister (at ebp+0x10). global (at ebp+0xc) and registerCount (at ebp+0x14) are completely ignored.

Resources