NPP_NewStream: seekable set to 0 (false) for local file - firefox

I am trying to implement an NPAPI plugin with streaming capabilities (NP_SEEK+NPN_RequestRead). No matter what I try the boolean NPBool seekable is always set to 0 (false).
I am starting Firefox (Iceweasel on Debian) from the command line on a local file:
$ iceweasel test1.html
If I attach gdb to the npapi plugin here is what I see:
(gdb)
#2 0x00007f7e9da54e14 in mozilla::plugins::BrowserStreamChild::StreamConstructed (this=0x7f7e925cf310, mimeType=..., seekable=<optimized out>, stype=<optimized out>)
at /tmp/buildd/iceweasel-24.6.0esr/dom/plugins/ipc/BrowserStreamChild.cpp:62
62 &mStream, seekable, stype);
(gdb)
#3 0x00007f7e9da5688e in mozilla::plugins::PluginInstanceChild::AnswerPBrowserStreamConstructor (this=<optimized out>, aActor=<optimized out>, url=...,
length=<optimized out>, lastmodified=<optimized out>, notifyData=<optimized out>, headers=..., mimeType=..., seekable=@0x7fff25ed51df: false, rv=0x7fff25ed51e0,
stype=0x7fff25ed51e2) at /tmp/buildd/iceweasel-24.6.0esr/dom/plugins/ipc/PluginInstanceChild.cpp:2285
2285 ->StreamConstructed(mimeType, seekable, stype);
(gdb) p seekable
$1 = (const bool &) @0x7fff25ed51df: false
(gdb) up
#4 0x00007f7e9da8f77f in mozilla::plugins::PPluginInstanceChild::OnCallReceived (this=0x7f7e925f2c00, __msg=..., __reply=@0x7fff25ed5470: 0x0)
at /tmp/buildd/iceweasel-24.6.0esr/build-xulrunner/ipc/ipdl/PPluginInstanceChild.cpp:2479
warning: Source file is more recent than executable.
2479 if ((!(AnswerPBrowserStreamConstructor(actor, url, length, lastmodified, notifyData, headers, mimeType, seekable, (&(rv)), (&(stype)))))) {
(gdb) list -
2469 if ((!(actor))) {
2470 return MsgValueError;
2471 }
2472 (actor)->mId = RegisterID(actor, (__handle).mId);
2473 (actor)->mManager = this;
2474 (actor)->mChannel = mChannel;
2475 (mManagedPBrowserStreamChild).InsertElementSorted(actor);
2476 (actor)->mState = mozilla::plugins::PBrowserStream::__Start;
2477
2478 int32_t __id = mId;
(gdb) list -
2459 FatalError("Error deserializing 'bool'");
2460 return MsgValueError;
2461 }
2462 (__msg).EndRead(__iter);
2463 if ((!(PPluginInstance::Transition(mState, Trigger(Trigger::Send, PPluginInstance::Msg_PBrowserStreamConstructor__ID), (&(mState)))))) {
2464 NS_WARNING("bad state transition!");
2465 }
2466 NPError rv;
2467 uint16_t stype;
2468 actor = AllocPBrowserStream(url, length, lastmodified, notifyData, headers, mimeType, seekable, (&(rv)), (&(stype)));
(gdb) list -
2449 }
2450 if ((!(Read((&(headers)), (&(__msg)), (&(__iter)))))) {
2451 FatalError("Error deserializing 'nsCString'");
2452 return MsgValueError;
2453 }
2454 if ((!(Read((&(mimeType)), (&(__msg)), (&(__iter)))))) {
2455 FatalError("Error deserializing 'nsCString'");
2456 return MsgValueError;
2457 }
2458 if ((!(Read((&(seekable)), (&(__msg)), (&(__iter)))))) {
(gdb) up
#5 0x00007f7e9da868f0 in mozilla::plugins::PPluginModuleChild::OnCallReceived (this=<optimized out>, __msg=..., __reply=@0x7fff25ed5470: 0x0)
at /tmp/buildd/iceweasel-24.6.0esr/build-xulrunner/ipc/ipdl/PPluginModuleChild.cpp:1023
warning: Source file is more recent than executable.
1023 return (__routed)->OnCallReceived(__msg, __reply);
(gdb) list -
1013 PPluginModuleChild::OnCallReceived(
1014 const Message& __msg,
1015 Message*& __reply)
1016 {
1017 int32_t __route = (__msg).routing_id();
1018 if ((MSG_ROUTING_CONTROL) != (__route)) {
1019 ChannelListener* __routed = Lookup(__route);
1020 if ((!(__routed))) {
1021 return MsgRouteError;
1022 }
(gdb) bt
If I copy test1.html over to /var/www, and then point to http://localhost/test1.html everything works as expected.
However, the documentation says:
seekable
Boolean indicating whether the stream is seekable:
true: Seekable. Stream supports random access through calls to NPN_RequestRead (for example, local files or HTTP servers that support byte-range requests).

The documentation is outright lying.
The seekable flag in the call to NPP_NewStream originates from OnStartBinding, which calls into nsPluginStreamListenerPeer::IsSeekable, which just returns nsPluginStreamListenerPeer::mSeekable.
mSeekable is only ever set to true when all of the following hold:
The stream is http (https, spdy)
The http response has no Content-Encoding
The http response provides a Content-Length.
The http response has Accept-Ranges: bytes (omitting the header is not supported)
For all other stream types (including file://), and for http streams that do not meet these requirements, the seekable flag is therefore always false.
Moreover, NPN_RequestRead is only implemented for http streams; it does not actually consult the seekable flag, nor does it check whether the server answers with 206 (Partial Content).
Conclusion
You can only use NP_SEEK streams with http (https, spdy). This is why it works from http://localhost but not from a local file (file://).
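The four conditions listed above can be restated as a single predicate. This is only a minimal sketch of the rule as described in this answer, not Mozilla's code: StreamInfo and wouldBeSeekable are hypothetical names, spdy is assumed to show up under the https scheme, and the exact-match test on Accept-Ranges is a simplification.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of a stream's response; not a Mozilla type.
struct StreamInfo {
    std::string scheme;        // "http", "https", "file", ...
    bool hasContentEncoding;   // true if a Content-Encoding header was sent
    long long contentLength;   // negative if no Content-Length was sent
    std::string acceptRanges;  // value of Accept-Ranges, empty if absent
};

// Restates the conditions from the answer: http(s) only, no
// Content-Encoding, a known Content-Length, and Accept-Ranges: bytes.
// Anything else (including file://) is treated as non-seekable.
inline bool wouldBeSeekable(const StreamInfo& s) {
    const bool isHttp = (s.scheme == "http" || s.scheme == "https");
    return isHttp
        && !s.hasContentEncoding
        && s.contentLength >= 0
        && s.acceptRanges == "bytes";
}
```

With this rule a file:// stream fails the very first check, which matches the behaviour seen in the question.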


boost::asio triggers a sigsegv in std::type_info::operator==

So, I've got an application that uses boost::asio. Due to the complexity of the project, I sadly cannot share its source code :(
The application uses Boost's asio library to create some web services. When attempting to use it, however, there's a SIGSEGV in std::type_info::operator==, which, as far as I'm aware, should just work.
(gdb) backtrace
#0 0x0000000000457b79 in std::type_info::operator== (
this=0x7ffff6dadf61 <typeinfo for boost::asio::detail::typeid_wrapper<boost::asio::deadline_timer_service<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime> > >+1>, __arg=...)
at /usr/include/c++/4.8.2/typeinfo:123
#1 0x00007ffff6961911 in boost::asio::detail::service_registry::keys_match (key1=..., key2=...) at /usr/include/boost/asio/detail/impl/service_registry.ipp:94
#2 0x00007ffff69619a1 in boost::asio::detail::service_registry::do_use_service (this=0x6bd3c0, key=...,
factory=0x7ffff6988ba8 <boost::asio::detail::service_registry::create<boost::asio::deadline_timer_service<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime> > >(boost::asio::io_service&)>)
at /usr/include/boost/asio/detail/impl/service_registry.ipp:114
#3 0x00007ffff69842b6 in boost::asio::detail::service_registry::use_service<boost::asio::deadline_timer_service<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime> > > (this=0x6bd3c0)
at /usr/include/boost/asio/detail/impl/service_registry.hpp:48
The rest of the backtrace is omitted, as I don't think it would help anyway.
When digging further into what GDB prints, that's what I get:
(gdb) frame 0
#0 0x0000000000457b79 in std::type_info::operator== (
this=0x7ffff6dadf61 <typeinfo for boost::asio::detail::typeid_wrapper<boost::asio::deadline_timer_service<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime> > >+1>, __arg=...)
at /usr/include/c++/4.8.2/typeinfo:123
123 || (__name[0] != '*' &&
(gdb) list
118 : __builtin_strcmp (__name, __arg.__name) < 0; }
119
120 bool operator==(const type_info& __arg) const _GLIBCXX_NOEXCEPT
121 {
122 return ((__name == __arg.__name)
123 || (__name[0] != '*' &&
124 __builtin_strcmp (__name, __arg.__name) == 0));
125 }
126 #else
127 // On some targets we can rely on type_info's NTBS being unique,
(gdb) print __name
$2 = 0xd000007ffff6afbc <Address 0xd000007ffff6afbc out of bounds>
(gdb) print __arg.__name
warning: can't find linker symbol for virtual table for `std::type_info' value
$3 = 0x7ffff6afbc60 <typeinfo name for boost::asio::detail::typeid_wrapper<boost::asio::deadline_timer_service<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime> > >> "N5boost4asio6detail14typeid_wrapperINS0_22deadline_timer_serviceINS_10posix_time5ptimeENS0_11time_traitsIS5_EEEEEE"
(gdb) print __name[0]
Cannot access memory at address 0xd000007ffff6afbc
(gdb) frame 1
#1 0x00007ffff6961911 in boost::asio::detail::service_registry::keys_match (key1=..., key2=...) at /usr/include/boost/asio/detail/impl/service_registry.ipp:94
94 if (*key1.type_info_ == *key2.type_info_)
(gdb) list
89 {
90 if (key1.id_ && key2.id_)
91 if (key1.id_ == key2.id_)
92 return true;
93 if (key1.type_info_ && key2.type_info_)
94 if (*key1.type_info_ == *key2.type_info_)
95 return true;
96 return false;
97 }
98
(gdb) print key1.type_info_
$4 = (const std::type_info *) 0x7ffff6dadf61 <typeinfo for boost::asio::detail::typeid_wrapper<boost::asio::deadline_timer_service<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime> > >+1>
(gdb) print *key1.type_info_
$5 = {_vptr.type_info = 0x6000000000006906, __name = 0xd000007ffff6afbc <Address 0xd000007ffff6afbc out of bounds>}
(gdb) print *key2.type_info_
warning: can't find linker symbol for virtual table for `std::type_info' value
$6 = warning: can't find linker symbol for virtual table for `std::type_info' value
{_vptr.type_info = 0x6906d0 <_ZTVN10__cxxabiv117__class_type_infoE@@CXXABI_1.3+16>,
__name = 0x7ffff6afbc60 <typeinfo name for boost::asio::detail::typeid_wrapper<boost::asio::deadline_timer_service<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime> > >> "N5boost4asio6detail14typeid_wrapperINS0_22deadline_timer_serviceINS_10posix_time5ptimeENS0_11time_traitsIS5_EEEEEE"}
(gdb) print key2.type_info_
$7 = (const std::type_info *) 0x7ffff6dadf60 <typeinfo for boost::asio::detail::typeid_wrapper<boost::asio::deadline_timer_service<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime> > >>
The only theory I have as to why it behaves like that is that the application consists of a main binary, which is statically linked against the necessary libboost*.a, and that it can load .so files as plugins, each also linked against the same libboost*.a. That's my best guess at this point.
Does anybody have an idea why it fails like that, and could you please help me with it?
It looks like the service registry is not finding the service object by typeid.
This implies that parts of the library have not been compiled compatibly.
Either ship all the shared objects/libraries that you require at runtime (and make sure those are the ones loaded, not the system ones), or recompile your application against the library versions on the target system.
(Even differences in compiler flags can cause ABI incompatibilities on some platforms.)
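To make the failing comparison concrete, here is a minimal standalone sketch with the same shape as the keys_match listing shown in frame 1. Key, keysMatch, keyFor, TimerService and SocketService are hypothetical stand-ins for illustration, not Boost types.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical stand-in for the registry key seen in the backtrace.
struct Key {
    const std::type_info* type_info_;  // may be null
    const void* id_;                   // may be null
};

// Same shape as the keys_match listing: compare ids first, then
// fall back to comparing the type_info objects themselves.
inline bool keysMatch(const Key& a, const Key& b) {
    if (a.id_ && b.id_ && a.id_ == b.id_)
        return true;
    // Both pointers are dereferenced here, so a corrupted pointer
    // faults before any type name is actually compared.
    if (a.type_info_ && b.type_info_ && *a.type_info_ == *b.type_info_)
        return true;
    return false;
}

struct TimerService {};
struct SocketService {};

// Builds a key from a type only, leaving the id empty.
template <typename T>
Key keyFor() {
    return Key{&typeid(T), nullptr};
}
```

Note that in the dump above key1.type_info_ is 0x7ffff6dadf61, one byte past the typeinfo symbol that key2 points at (0x7ffff6dadf60). The dereference in keys_match therefore reads a misaligned object, which is consistent with the garbage __name value gdb prints.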

Occasional OSX kernel panic in RetrieveINQUIRYData()

I have a SCSI driver which calls IOSCSIPrimaryCommandsDevice::RetrieveINQUIRYData() as part of the startup process. Sometimes, after unplugging, ejecting, re-inserting and generally abusing the device during testing, a kernel panic occurs in there.
The signature of this function is:
bool RetrieveINQUIRYData (
UInt8 EVPD,
UInt8 inquiryPage,
UInt8 * inquiryBuffer,
UInt16 * dataSize);
The routine which calls this starts like this. It is modelled after PublishUnitSerialNumber from IOSCSITargetDevice.cpp
void MyDriverClass::PublishUnitSerialNumber ( IOService * object )
{
bool result = false;
SCSICmd_INQUIRY_Page80_Header * data = NULL;
IOBufferMemoryDescriptor * buffer = NULL;
OSString * string = NULL;
char serialNumber[kINQUIRY_MaximumDataSize] = { 0 };
UInt16 length = 0;
SInt16 serialLength = 0;
DEBUG_LOG("%s[%p]::%s(%p)\n", getName(), this, __FUNCTION__, object);
buffer = IOBufferMemoryDescriptor::withCapacity ( kINQUIRY_MaximumDataSize, kIODirectionIn );
require( buffer != NULL, ErrorExit );
data = (SCSICmd_INQUIRY_Page80_Header*) buffer->getBytesNoCopy();
length = kINQUIRY_MaximumDataSize;
require(data != NULL, ReleaseBuffer);
bzero ( data, kINQUIRY_MaximumDataSize );
result = RetrieveINQUIRYData ( 0x01,
kINQUIRY_Page80_PageCode,
( UInt8 * ) data,
&length ); //PANIC IN HERE!
require ( result, ReleaseBuffer );
require ( ( data->PAGE_CODE == kINQUIRY_Page80_PageCode ), ReleaseBuffer );
This works most of the time; it only panics occasionally. The GDB backtrace is fairly unhelpful, as there are no symbols in that function, so all I know is that it's after another function call. I also can't find the source for this function - I think it's part of the closed-source IOKit code.
#0 Debugger (message=0x8001003b <Address 0x8001003b out of bounds>) at /SourceCache/xnu/xnu-1504.15.3/osfmk/i386/AT386/model_dep.c:867
#1 0xffffff8000204d15 in panic (str=0xffffff800057ecb8 "Kernel trap at 0x%016llx, type %d=%s, registers:\nCR0: 0x%016llx, CR2: 0x%016llx, CR3: 0x%016llx, CR4: 0x%016llx\nRAX: 0x%016llx, RBX: 0x%016llx, RCX: 0x%016llx, RDX: 0x%016llx\nRSP: 0x%016llx, RBP: 0x%0"...) at /SourceCache/xnu/xnu-1504.15.3/osfmk/kern/debug.c:303
#2 0xffffff80002d1208 in panic_trap [inlined] () at :1100
#3 0xffffff80002d1208 in kernel_trap (state=<value temporarily unavailable, due to optimizations>) at /SourceCache/xnu/xnu-1504.15.3/osfmk/i386/trap.c:1001
#4 0xffffff80002e3f4a in trap_from_kernel () at pmap.h:215
#5 0xffffff7f808040b6 in ?? ()
#6 0xffffff7f80804b8b in ?? ()
#7 0xffffff7f80f1324c in com_company_driver_myDriver::PublishUnitSerialNumber (this=0xffffff80130e0600, object=0xffffff8015375000) at /Volumes/user/src/driver/MyDriver.cpp:106
#8 0xffffff7f80f13553 in com_company_driver_myDriver::start (this=0xffffff80130e0600, provider=0xffffff8015375000) at /Volumes/user/src/driver/MyDriver.cpp:53
#9 0xffffff800052d5a6 in IOService::startCandidate (this=0x2710, service=0xe) at /SourceCache/xnu/xnu-1504.15.3/iokit/Kernel/IOService.cpp:2879
#10 0xffffff800052dcb1 in IOService::probeCandidates (this=0xffffff8015375000, matches=<value temporarily unavailable, due to optimizations>) at /SourceCache/xnu/xnu-1504.15.3/iokit/Kernel/IOService.cpp:2798
...
...
Is there something I should be checking or doing before calling RetrieveINQUIRYData() that I am not doing in the above code? From what I can see, all the pointers are checked for NULL, the lengths are set to constants and the only other argument is the literal for EVPD.
I'm not familiar with the SCSI stack, but here are a few ideas:
Even the closed-source kexts have basic debug symbols. So if you add-kext the relevant kext from /System/Library/Extensions/ you should at least get function/method names (and signatures, if C++). You won't get line numbers of course, but you'll get function disassembly boundaries, etc.
The panic message is interesting:
#0 Debugger (message=0x8001003b <Address 0x8001003b out of bounds>) at /SourceCache/xnu/xnu-1504.15.3/osfmk/i386/AT386/model_dep.c:867
In the 10.6.8 sources, I can't find where that message is coming from, so I'm not quite sure what bounds are expected and where this requirement is coming from. It looks like it might be a physical address though - can you maybe insert some kprintf debug output to dump your buffer's getPhysicalSegment(0, NULL, 0) result and see if that matches the address that triggers the KP? If that's not it, see if you can figure out what instruction is causing the crash and where the incorrect address is coming from.
Good luck!

How to send buffer bigger than 256kByte with boost asio on OSX, works fine on WIN

I have a simple client server approach sending char buffers using tcp socket with boost asio on OSX. When I use a buffer bigger than 256kBytes I get a Memory Access error. It seems to happen on the Server side. But nonetheless, the exact same code with bigger buffer sizes than 256kBytes is working perfectly on Windows!
gdb on OSX tells me
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x00000001002bebb8
[Switching to process 15955 thread 0x2003]
0x0000000100012920 in connectionThread::operator() ()
I'm wondering if there is an OS-specific memory allocation issue I have to be aware of. The Boost Asio docs weren't really helpful.
I do not want to split the buffer! I want to send an entire buffer, which could have a size up to 921600Bytes. Which already works perfectly on windows with boost asio.
Message
char data[307200];
Client side SEND msg
void send( boost::asio::ip::tcp::socket& s ) const {
try {
boost::asio::write( s, boost::asio::buffer( data, data_size));
}
catch( const std::exception& e ) {
throw std::runtime_error("message send error | " + std::string( e.what() ) );
}
}
Server side READ msg
void read( boost::asio::ip::tcp::socket& s ) {
try {
// Read data
boost::asio::read( s, boost::asio::buffer( data, data_size ));
}
catch( const std::exception& e ) {
throw std::runtime_error( "message read error | " + std::string( e.what() ) );
}
}
backtrace
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x00000001002bebb8
[Switching to process 17916 thread 0x2003]
0x0000000100012920 in connectionThread::operator() ()
(gdb) backtrace
#0 0x0000000100012920 in connectionThread::operator() ()
#1 0x000000010000679d in boost::_mfi::mf1<void, connectionThread, boost::shared_ptr<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> > >&>::operator() ()
#2 0x00000001000068ca in boost::_bi::list2<boost::_bi::value<connectionThread*>, boost::_bi::value<boost::shared_ptr<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> > > > >::operator()<boost::_mfi::mf1<void, connectionThread, boost::shared_ptr<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> > >&>, boost::_bi::list0> ()
#3 0x000000010000690a in boost::_bi::bind_t<void, boost::_mfi::mf1<void, connectionThread, boost::shared_ptr<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> > >&>, boost::_bi::list2<boost::_bi::value<connectionThread*>, boost::_bi::value<boost::shared_ptr<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> > > > > >::operator() ()
#4 0x0000000100006928 in boost::detail::thread_data<boost::_bi::bind_t<void, boost::_mfi::mf1<void, connectionThread, boost::shared_ptr<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> > >&>, boost::_bi::list2<boost::_bi::value<connectionThread*>, boost::_bi::value<boost::shared_ptr<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> > > > > > >::run ()
#5 0x000000010006b699 in thread_proxy (param=<value temporarily unavailable, due to optimizations>) at libs/thread/src/pthread/thread.cpp:121
#6 0x00007fff8094cfd6 in _pthread_start ()
#7 0x00007fff8094ce89 in thread_start ()
(gdb)
Thanks for any ideas
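One possible cause, which nothing in the post confirms: if the message object with its char data[...] member is a local variable inside connectionThread::operator(), the whole array lives on that thread's stack. Secondary threads on OS X get a 512 KB stack by default, whereas Windows threads default to 1 MB, which would fit the symptoms. Under that assumption, a minimal sketch of a message type that keeps its payload on the heap could look like this (Message is a hypothetical name, not the poster's class):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical message type: the payload is owned by a std::vector,
// so it is allocated on the heap rather than on the thread's stack.
class Message {
public:
    explicit Message(std::size_t size) : data_(size, 0) {}

    char* data() { return data_.data(); }
    const char* data() const { return data_.data(); }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<char> data_;
};
```

The send and read functions can then stay as they are, passing boost::asio::buffer( msg.data(), msg.size() ) for a Message named msg, so the buffer is still transferred in one piece.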

How do I log messages / break execution for alloc, retain, release and dealloc of CFType objects?

I want to be able to log messages (and preferably break to the debugger) each time a specific CFType object (for my current purposes, a CGPDFDocument) is allocated, retained, released or deallocated.
Because there isn't a Create...() method for CGPDFDocument which takes a CFAllocatorRef, I'm trying to change the default allocator temporarily like this:
void MyPDFDocumentCreate()
{
// ...
CFAllocatorRef defaultAllocator = CFAllocatorGetDefault();
CFAllocatorSetDefault(MyLogAllocator());
CGPDFDocumentRef documentRef = CGPDFDocumentCreateWithProvider(provider);
CFAllocatorSetDefault(defaultAllocator);
// ...
}
where MyLogAllocator() is defined as follows:
static void *(*DefaultAllocate)(CFIndex size, CFOptionFlags hint, void *info);
static const void *(*DefaultRetain)(const void *info);
static void (*DefaultRelease)(const void *info);
void *LogAllocate(CFIndex size, CFOptionFlags hint, void *info)
{
fprintf(stderr, "LogAllocate %p", info);
if (DefaultAllocate)
return DefaultAllocate(size, hint, info);
else
return NULL;
}
const void *LogRetain(const void *info)
{
fprintf(stderr, "LogRetain");
if (DefaultRetain)
return DefaultRetain(info);
else
return info;
}
void LogRelease(const void *info)
{
fprintf(stderr, "LogRelease");
if (DefaultRelease)
DefaultRelease(info);
}
static CFAllocatorRef MyLogAllocator()
{
static CFAllocatorRef theLogAllocator = NULL;
if (!theLogAllocator)
{
CFAllocatorContext context;
CFAllocatorRef defaultAllocator = CFAllocatorGetDefault();
CFAllocatorGetContext(defaultAllocator, &context);
DefaultAllocate = context.allocate;
DefaultRetain = context.retain;
DefaultRelease = context.release;
context.allocate = LogAllocate;
context.retain = LogRetain;
context.release = LogRelease;
theLogAllocator = CFAllocatorCreate(kCFAllocatorUseContext, &context);
}
return theLogAllocator;
}
However, it seems that the default allocator (kCFAllocatorSystemDefault as far as I can tell) has NULL for context.retain and context.release, so I don't have any original implementations to call. That may be why, when I try the code above, I get the following stack trace:
#0 0x357ded12 in CFRetain ()
#1 0x357dcb68 in _CFRuntimeCreateInstance ()
#2 0x303fe35e in CGTypeCreateInstanceWithAllocator ()
#3 0x303fe34c in CGTypeCreateInstance ()
#4 0x304b32f4 in CGPDFDocumentCreateWithProvider ()
#5 0x000293f4 in MyPDFDocumentCreate ([...]) at [...]
Xcode doesn't actually tell me why it's halting, but if I try to continue I get:
(gdb) continue
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
0x357ded12 in CFRetain ()
(gdb) continue
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
0x357ded12 in CFRetain ()
(gdb)
However many times I continue, I get the same SIGTRAP. I don't know how to interpret it; the only breakpoint I have set up is a symbolic one on objc_exception_throw.
One thing to note is that LogRetain() and LogAllocate() are each called successfully once (in that order) from CFAllocatorCreate():
#0 LogRetain (info=0x1a8000) at [...]
#1 0x358086f2 in CFAllocatorCreate ()
#2 0x00028d58 in MyLogAllocator () at [...]
#3 0x000293e0 in MyPDFDocumentCreate ([...]) at [...]
#0 LogAllocate (size=104, hint=0, info=0x1a8000) at [...]
#1 0x3580882e in CFAllocatorCreate ()
#2 0x00028d58 in MyLogAllocator () at [...]
#3 0x000293e0 in MyPDFDocumentCreate ([...]) at [...]
And then LogAllocate() is again successful from CFAllocatorAllocate():
#0 LogAllocate (size=64, hint=1024, info=0x1a8000) at [...]
#1 0x357dcc06 in CFAllocatorAllocate ()
#2 0x357dcb04 in _CFRuntimeCreateInstance ()
#3 0x303fe35e in CGTypeCreateInstanceWithAllocator ()
#4 0x303fe34c in CGTypeCreateInstance ()
#5 0x304b32f4 in CGPDFDocumentCreateWithProvider ()
#6 0x000293f4 in MyPDFDocumentCreate ([...]) at [...]
before the _CFRuntimeCreateInstance() at #2 calls the problematic CFRetain() detailed above.
Could somebody please help me to understand what's going on here (especially how the default allocator handles retain and release, and why I'm getting the SIGTRAP); how to fix it; and whether there's a better way to do what I'm trying to do?
(I figured I might be able to work out how to use DTrace to probe CFRetain() and CFRelease(), filtered by the CFTypeID for CGPDFDocument, but I wouldn't know what to probe for deallocation (allocation is not so important to track as I know it's done within CGPDFDocumentCreateWithProvider()). Also, I'd prefer to be able to break to the debugger on retain / release / deallocate, which I don't think is possible using DTrace.)
UPDATE: Having now read the source code for CFRelease I realise I misunderstood the purpose of context.retain and context.release -- they are for retaining and releasing context.info. So the entire approach described above is a non-starter. However, perhaps a DTrace/Instruments wizard may still be able to work some magic?!
It's a very interesting problem. Since you've gone as deep as investigating filtering DTrace, and you're diving into the CFRelease source, then you can look at using gdb breakpoint conditions to choose when to break. To determine if a deallocation is going to occur, just use CFGetRetainCount().
That said, I'm guessing you're pulling your hair out tracking down some over-release crash, right? Things to look at that are probably more useful than reverse-engineering CFRelease():
CFZombie
Instruments' Allocations instrument provides full stacks of when objects were retained and released, allocated and destroyed. Turn on the option "Record reference counts."

How is pattern matching in Scala implemented at the bytecode level?

How is pattern matching in Scala implemented at the bytecode level?
Is it like a series of if (x instanceof Foo) constructs, or something else? What are its performance implications?
For example, given the following code (from Scala By Example, pages 46-48), what would the equivalent Java code for the eval method look like?
abstract class Expr
case class Number(n: Int) extends Expr
case class Sum(e1: Expr, e2: Expr) extends Expr
def eval(e: Expr): Int = e match {
case Number(x) => x
case Sum(l, r) => eval(l) + eval(r)
}
P.S. I can read Java bytecode, so a bytecode representation would be good enough for me, but it would probably be better for other readers to see what it would look like as Java code.
P.P.S. Does the book Programming in Scala give an answer to this and similar questions about how Scala is implemented? I have ordered the book, but it has not yet arrived.
The low level can be explored with a disassembler, but the short answer is that it's a bunch of if/elses where the predicate depends on the pattern:
case Sum(l,r) // instance of check followed by fetching the two arguments and assigning to two variables l and r but see below about custom extractors
case "hello" // equality check
case _ : Foo // instance of check
case x => // assignment to a fresh variable
case _ => // do nothing, this is the tail else on the if/else
There's much more that you can do with patterns, such as or-patterns and combinations like "case Foo(45, x)", but generally those are just logical extensions of what I just described. Patterns can also have guards, which are additional constraints on the predicates. There are also cases where the compiler can optimize pattern matching, e.g. when there's some overlap between cases it might coalesce things a bit. Advanced patterns and optimization are an active area of work in the compiler, so don't be surprised if the bytecode improves substantially over these basic rules in current and future versions of Scala.
In addition to all that, you can write your own custom extractors in addition to or instead of the default ones Scala uses for case classes. If you do, then the cost of the pattern match is the cost of whatever the extractor does. A good overview is found in http://lamp.epfl.ch/~emir/written/MatchingObjectsWithPatterns-TR.pdf
James (above) said it best. However, if you're curious it's always a good exercise to look at the disassembled bytecode. You can also invoke scalac with the -print option, which will print your program with all Scala-specific features removed. It's basically Java in Scala's clothing. Here's the relevant scalac -print output for the code snippet you gave:
def eval(e: Expr): Int = {
<synthetic> val temp10: Expr = e;
if (temp10.$isInstanceOf[Number]())
temp10.$asInstanceOf[Number]().n()
else
if (temp10.$isInstanceOf[Sum]())
{
<synthetic> val temp13: Sum = temp10.$asInstanceOf[Sum]();
Main.this.eval(temp13.e1()).+(Main.this.eval(temp13.e2()))
}
else
throw new MatchError(temp10)
};
Since version 2.8, Scala has had the @switch annotation. Its goal is to ensure that a pattern match is compiled into a tableswitch or lookupswitch instead of a series of conditional if statements.
To expand on @Zifre's comment: if you are reading this in the future and the Scala compiler has adopted new compilation strategies and you want to know what they are, here's how you find out what it does.
Copy-paste your match code into a self-contained example file. Run scalac on that file. Then run javap -v -c theClassName$.class.
For example, I put the following into /tmp/question.scala:
object question {
abstract class Expr
case class Number(n: Int) extends Expr
case class Sum(e1: Expr, e2: Expr) extends Expr
def eval(e: Expr): Int = e match {
case Number(x) => x
case Sum(l, r) => eval(l) + eval(r)
}
}
Then I ran scalac question.scala, which produced a bunch of *.class files. Poking around a bit, I found the match statement inside question$.class. The javap -c -v question$.class output is available below.
Since we're looking for a conditional control-flow construct, knowing the Java bytecode instruction set suggests that searching for "if" is a good place to start.
In two locations we find a pair of consecutive lines of the form instanceof <something>; ifeq <somewhere>, which means: if the most recently computed value is not an instance of something, then go to somewhere. (ifeq jumps if the value is zero, and instanceof pushes a zero to represent false.)
If you follow the control flow around, you'll see that it agrees with the answer given by @Jorge Ortiz: we do if (blah instanceof something) { ... } else if (blah instanceof somethingelse) { ... }.
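For readers who want to see that control-flow shape outside the JVM, here is a rough C++ analogue of the same eval, with dynamic_cast playing the role of instanceof followed by checkcast. It is only an illustration of the if/else-if chain, not something the Scala compiler produces. The class names mirror the question, and the std::runtime_error stands in for scala.MatchError.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <utility>

struct Expr {
    virtual ~Expr() = default;
};

struct Number : Expr {
    int n;
    explicit Number(int v) : n(v) {}
};

struct Sum : Expr {
    std::shared_ptr<Expr> e1, e2;
    Sum(std::shared_ptr<Expr> l, std::shared_ptr<Expr> r)
        : e1(std::move(l)), e2(std::move(r)) {}
};

// One type test per case, tried in source order, with a final
// "no case matched" error, like the athrow at the end of the bytecode.
inline int eval(const Expr& e) {
    if (const Number* num = dynamic_cast<const Number*>(&e)) {
        return num->n;
    } else if (const Sum* sum = dynamic_cast<const Sum*>(&e)) {
        return eval(*sum->e1) + eval(*sum->e2);
    }
    throw std::runtime_error("MatchError");
}
```

As in the bytecode, the cost grows linearly with the number of cases that have to be tried before one matches.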
Here is the javap -c -v question$.class output:
Classfile /tmp/question$.class
Last modified Nov 20, 2020; size 956 bytes
MD5 checksum cfc788d4c847dad0863a797d980ad2f3
Compiled from "question.scala"
public final class question$
minor version: 0
major version: 50
flags: (0x0031) ACC_PUBLIC, ACC_FINAL, ACC_SUPER
this_class: #2 // question$
super_class: #4 // java/lang/Object
interfaces: 0, fields: 1, methods: 3, attributes: 4
Constant pool:
#1 = Utf8 question$
#2 = Class #1 // question$
#3 = Utf8 java/lang/Object
#4 = Class #3 // java/lang/Object
#5 = Utf8 question.scala
#6 = Utf8 MODULE$
#7 = Utf8 Lquestion$;
#8 = Utf8 <clinit>
#9 = Utf8 ()V
#10 = Utf8 <init>
#11 = NameAndType #10:#9 // "<init>":()V
#12 = Methodref #2.#11 // question$."<init>":()V
#13 = Utf8 eval
#14 = Utf8 (Lquestion$Expr;)I
#15 = Utf8 question$Number
#16 = Class #15 // question$Number
#17 = Utf8 n
#18 = Utf8 ()I
#19 = NameAndType #17:#18 // n:()I
#20 = Methodref #16.#19 // question$Number.n:()I
#21 = Utf8 question$Sum
#22 = Class #21 // question$Sum
#23 = Utf8 e1
#24 = Utf8 ()Lquestion$Expr;
#25 = NameAndType #23:#24 // e1:()Lquestion$Expr;
#26 = Methodref #22.#25 // question$Sum.e1:()Lquestion$Expr;
#27 = Utf8 e2
#28 = NameAndType #27:#24 // e2:()Lquestion$Expr;
#29 = Methodref #22.#28 // question$Sum.e2:()Lquestion$Expr;
#30 = NameAndType #13:#14 // eval:(Lquestion$Expr;)I
#31 = Methodref #2.#30 // question$.eval:(Lquestion$Expr;)I
#32 = Utf8 scala/MatchError
#33 = Class #32 // scala/MatchError
#34 = Utf8 (Ljava/lang/Object;)V
#35 = NameAndType #10:#34 // "<init>":(Ljava/lang/Object;)V
#36 = Methodref #33.#35 // scala/MatchError."<init>":(Ljava/lang/Object;)V
#37 = Utf8 this
#38 = Utf8 e
#39 = Utf8 Lquestion$Expr;
#40 = Utf8 x
#41 = Utf8 I
#42 = Utf8 l
#43 = Utf8 r
#44 = Utf8 question$Expr
#45 = Class #44 // question$Expr
#46 = Methodref #4.#11 // java/lang/Object."<init>":()V
#47 = NameAndType #6:#7 // MODULE$:Lquestion$;
#48 = Fieldref #2.#47 // question$.MODULE$:Lquestion$;
#49 = Utf8 question
#50 = Class #49 // question
#51 = Utf8 Sum
#52 = Utf8 Expr
#53 = Utf8 Number
#54 = Utf8 Code
#55 = Utf8 LocalVariableTable
#56 = Utf8 LineNumberTable
#57 = Utf8 StackMapTable
#58 = Utf8 SourceFile
#59 = Utf8 InnerClasses
#60 = Utf8 ScalaInlineInfo
#61 = Utf8 Scala
{
public static final question$ MODULE$;
descriptor: Lquestion$;
flags: (0x0019) ACC_PUBLIC, ACC_STATIC, ACC_FINAL
public static {};
descriptor: ()V
flags: (0x0009) ACC_PUBLIC, ACC_STATIC
Code:
stack=1, locals=0, args_size=0
0: new #2 // class question$
3: invokespecial #12 // Method "<init>":()V
6: return
public int eval(question$Expr);
descriptor: (Lquestion$Expr;)I
flags: (0x0001) ACC_PUBLIC
Code:
stack=3, locals=9, args_size=2
0: aload_1
1: astore_2
2: aload_2
3: instanceof #16 // class question$Number
6: ifeq 27
9: aload_2
10: checkcast #16 // class question$Number
13: astore_3
14: aload_3
15: invokevirtual #20 // Method question$Number.n:()I
18: istore 4
20: iload 4
22: istore 5
24: goto 69
27: aload_2
28: instanceof #22 // class question$Sum
31: ifeq 72
34: aload_2
35: checkcast #22 // class question$Sum
38: astore 6
40: aload 6
42: invokevirtual #26 // Method question$Sum.e1:()Lquestion$Expr;
45: astore 7
47: aload 6
49: invokevirtual #29 // Method question$Sum.e2:()Lquestion$Expr;
52: astore 8
54: aload_0
55: aload 7
57: invokevirtual #31 // Method eval:(Lquestion$Expr;)I
60: aload_0
61: aload 8
63: invokevirtual #31 // Method eval:(Lquestion$Expr;)I
66: iadd
67: istore 5
69: iload 5
71: ireturn
72: new #33 // class scala/MatchError
75: dup
76: aload_2
77: invokespecial #36 // Method scala/MatchError."<init>":(Ljava/lang/Object;)V
80: athrow
LocalVariableTable:
Start Length Slot Name Signature
0 81 0 this Lquestion$;
0 81 1 e Lquestion$Expr;
20 61 4 x I
47 34 7 l Lquestion$Expr;
54 27 8 r Lquestion$Expr;
LineNumberTable:
line 6: 0
line 7: 2
line 8: 27
line 6: 69
StackMapTable: number_of_entries = 3
frame_type = 252 /* append */
offset_delta = 27
locals = [ class question$Expr ]
frame_type = 254 /* append */
offset_delta = 41
locals = [ top, top, int ]
frame_type = 248 /* chop */
offset_delta = 2
}
SourceFile: "question.scala"
InnerClasses:
public static #51= #22 of #50; // Sum=class question$Sum of class question
public static abstract #52= #45 of #50; // Expr=class question$Expr of class question
public static #53= #16 of #50; // Number=class question$Number of class question
ScalaInlineInfo: length = 0xE (unknown attribute)
01 01 00 02 00 0A 00 09 01 00 0D 00 0E 01
Scala: length = 0x0 (unknown attribute)
