Unmanaged to managed callback much slower when target is in another AppDomain - performance

I'm calling managed code from unmanaged code using a delegate. When I call into managed code in the default AppDomain I measure an average of 5.4 ns per call. When I call into a second AppDomain I measure 194 ns per call (default VS2017 x86 release configuration, not running under the debugger).
Why is performance so much lower when calling into an AppDomain that isn't the default? Since I'm coming from the unmanaged side, which has no knowledge of AppDomains, I would expect to call straight into the target domain. However, the performance hit implies that the delegate calls into the default domain and then marshals to the real target. When stepping through the disassembly I do see UM2MDoADCallBack, which shows up under the WrongAppDomain: label in UMThunkStub.asm.
How can I prevent this unnecessary marshaling and call directly into a specific AppDomain?
The code I'm using to test this is below.
#pragma unmanaged
#include <wtypes.h>
#include <cstdint>
#include <cwchar>
typedef void (__stdcall *ManagedUpdatePtr)();
struct ProfileSample
static uint64_t frequency;
uint64_t startTick;
wchar_t* name;
int count;
ProfileSample(wchar_t* name_, int count_)
name = name_;
count = count_;
LARGE_INTEGER win32_startTick;
startTick = win32_startTick.QuadPart;
LARGE_INTEGER win32_endTick;
uint64_t endTick = win32_endTick.QuadPart;
uint64_t deltaTicks = endTick - startTick;
double nanoseconds = (double) deltaTicks / (double) frequency * 1000000000.0 / count;
wchar_t buffer[128];
swprintf(buffer, _countof(buffer), L"%s - %.4f ns\n", name, nanoseconds);
if (!IsDebuggerPresent())
MessageBoxW(nullptr, buffer, nullptr, 0);
uint64_t ProfileSample::frequency = 0;
LARGE_INTEGER frequency;
ProfileSample::frequency = frequency.QuadPart;
ManagedUpdatePtr GetManagedUpdatePtr();
auto managedUpdate = GetManagedUpdatePtr();
//Warm stuff up
for ( size_t i = 0; i < 100; i++ )
const int num = 10000000;
ProfileSample p(L"ManagedUpdate", num);
for ( size_t i = 0; i < num; i++ )
return 0;
#pragma managed
using namespace System;
using namespace System::Diagnostics;
using namespace System::Runtime::InteropServices;
ref struct ManagedObject : MarshalByRefObject
auto delegate = gcnew Action(this, &ManagedObject::ManagedUpdate);
IntPtr fPtr = Marshal::GetFunctionPointerForDelegate(delegate);
return (ManagedUpdatePtr) fPtr.ToPointer();
void ManagedUpdate()
//Debug::WriteLine("\n\nManagedUpdate ({0})", (Object^) AppDomain::CurrentDomain->FriendlyName);
auto pluginDomain = AppDomain::CreateDomain("Plugin Domain");
auto managedObject = (ManagedObject^) pluginDomain->CreateInstanceAndUnwrap("ManagedHelper", "ManagedObject");
return managedObject->GetManagedUpdatePtr();
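For readers who want to reproduce the measurement outside the Win32 API, the per-call arithmetic in ProfileSample (elapsed time divided by the number of calls) can be sketched with std::chrono. This is only an illustrative stand-in, not the code that produced the numbers above: average_ns_per_call is a hypothetical helper name, and std::chrono::steady_clock takes the place of QueryPerformanceCounter.

```cpp
#include <chrono>
#include <cstddef>

// Calls `fn` `count` times and returns the mean cost per call in nanoseconds.
// (average_ns_per_call is a hypothetical helper name, not part of any API.)
template <typename Fn>
double average_ns_per_call(Fn fn, std::size_t count)
{
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (std::size_t i = 0; i < count; ++i)
        fn();
    const auto stop = clock::now();
    // Convert the elapsed interval to fractional nanoseconds.
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return count == 0 ? 0.0 : elapsed.count() / static_cast<double>(count);
}
```

Calling it once with a small count before the timed run mirrors the warm-up loop in the test program.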


std::atomic on struct bit-fields

I'm modifying an existing open source library that has a struct (call it Node) containing bit-fields, e.g.
struct Node {
    std::atomic<uint32_t> size:30;
    std::atomic<uint32_t> isnull:1;
};
To fit my needs, these fields need to be atomic, so I tried to use std::atomic for them and got a compile-time error:
bit-field 'size' has non-integral type 'std::atomic<uint32_t>'
According to the documentation, only a restricted set of types can be used with std::atomic.
Can anyone advise how to get atomic fields with minimal impact on the existing source code?
Thanks in advance!
I used an unsigned short as an example below.
This is less than ideal, but you could sacrifice 8 bits and overlay a std::atomic_flag on the bit-field with a union. Unfortunately, std::atomic_flag cannot be a one-bit field itself; it occupies at least a whole byte, much like a std::atomic_bool.
The structure can then be spin-locked manually every time you access it. The performance cost should be minimal, unlike creating, locking, unlocking, and destroying a std::mutex with a std::unique_lock.
This code may spend about 10-30 clock cycles per lock attempt to enable low-cost multi-threading.
PS. Make sure the reserved 8 bits below are not affected by the endianness of the processor. You may have to define them at the end of the bit-field on big-endian processors. I only tested this code on an Intel CPU (always little-endian).
#include <iostream>
#include <atomic>
#include <thread>
union Data
std::atomic_flag access = ATOMIC_FLAG_INIT; // one byte
typedef unsigned short ushort;
ushort reserved : 8;
ushort count : 4;
ushort ready : 1;
ushort unused : 3;
} bits;
class SpinLock
inline SpinLock(std::atomic_flag &access, bool locked=true)
: mAccess(access)
if(locked) lock();
inline ~SpinLock()
inline void lock()
while (mAccess.test_and_set(std::memory_order_acquire))
// each attempt will take about 10-30 clock cycles
inline bool try_lock(unsigned int attempts=0)
if (! attempts) return false;
-- attempts;
return true;
inline void unlock()
std::atomic_flag &mAccess;
void aFn(int &i, Data &d)
SpinLock lock(d.access, false);
// manually locking/unlocking can be tighter
if (d.bits.ready)
d.bits.ready ^= true; // alternate each time
int main(void)
Data f;
f.bits.count = 0;
f.bits.ready = true;
std::thread *p[8];
for (int i = 0; i < 8; ++ i)
p[i] = new std::thread([&f] (int i) { aFn(i, f); }, i);
for (int i = 0; i < 8; ++i)
delete p[i];
std::cout << "size: " << sizeof(f) << std::endl;
std::cout << "count: " << f.bits.count << std::endl;
The result is as expected...
size: 2
count: 4
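A lock-free alternative, offered here as a sketch rather than as part of the answer above: keep both fields in a single std::atomic<uint32_t> and mask them out by hand. The accessor names and the bit layout (size in bits 0-29, isnull in bit 30) are assumptions chosen to match the Node example from the question. Callers would have to switch from direct field access to these accessors, so the impact on existing code is not zero.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Both logical fields live in one 32-bit atomic word:
// bits 0..29 hold size, bit 30 holds isnull (layout is an assumption).
class Node {
public:
    static constexpr uint32_t kSizeMask   = (1u << 30) - 1;
    static constexpr uint32_t kIsNullMask = 1u << 30;

    uint32_t size() const { return word_.load(std::memory_order_acquire) & kSizeMask; }
    bool isnull() const { return (word_.load(std::memory_order_acquire) & kIsNullMask) != 0; }

    void set_size(uint32_t value)
    {
        assert(value <= kSizeMask);
        uint32_t old = word_.load(std::memory_order_relaxed);
        // Retry until no other thread changed the word between load and store;
        // on failure `old` is refreshed and the new value is recomputed.
        while (!word_.compare_exchange_weak(old, (old & ~kSizeMask) | value,
                                            std::memory_order_acq_rel,
                                            std::memory_order_relaxed)) {
        }
    }

    void set_isnull(bool value)
    {
        // A single-bit update needs no retry loop.
        if (value)
            word_.fetch_or(kIsNullMask, std::memory_order_acq_rel);
        else
            word_.fetch_and(~kIsNullMask, std::memory_order_acq_rel);
    }

private:
    std::atomic<uint32_t> word_{0};
};
```

Each individual update is atomic, but reading size and isnull through two separate calls is not a consistent snapshot; code that needs both at once would load the word a single time and mask it twice.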

There is no entry for device under /dev even after class_create and device_create

I am writing a simple char driver, and I learnt that there are two ways to get a major number for it: alloc_chrdev_region (or register_chrdev_region) and register_chrdev. I initially started with register_chrdev; it gave me my major number, and the entry in /dev was created as well (using class_create and device_create).
But when I change from register_chrdev to alloc_chrdev_region to acquire the major number (using cdev_init and cdev_add), leaving the rest of the entry function the same, I don't see an entry in /dev. When I create it manually with mknod and run the test application against the driver, it works fine.
Below is the code of the entry point that does not produce the /dev entry:
#include <linux/version.h>
#include <linux/types.h>
#include <linux/kdev_t.h>
#define DEVICE_NAME "myCharDevice"
#define MODULE_NAME "myCharDriver"
#define CLASS_NAME "myCharClass"
static char *bufferMemory;
static int bufferPointer;
static int bufferSize = 15;
static dev_t myChrDevid;
static struct cdev *myChrDevCdev;
static struct class *pmyCharClass;
static struct device *pmyCharDevice;
int majorNumber = 0;
static int charDriverOpen(struct inode *inodep, struct file *filep);
static int charDriverClose(struct inode *inodep, struct file *filep);
static ssize_t charDriverWrite(struct file *filep, const char *buffer, size_t len, loff_t *offset);
static ssize_t charDriverRead(struct file *filep, char *buffer, size_t len, loff_t *offset);
static int charDriverEntry(void);
static void charDriverExit(void);
static ssize_t attrShowData(struct device*, struct device_attribute*, char*);
static ssize_t attrStoreData(struct device*, struct device_attribute*, const char*, size_t);
static ssize_t attrShowBuffer(struct device*, struct device_attribute*, char*);
static ssize_t attrStoreBuffer(struct device*, struct device_attribute*, const char*, size_t);
/* The following function is called when the file placed on the sysfs is accessed for read*/
static ssize_t attrShowData(struct device* pDev, struct device_attribute* attr, char* buffer)
printk(KERN_INFO "MESG: The data has been accessed through the entry in sysfs\n");
if (bufferPointer == 0)
printk(KERN_WARNING "There is no data to read from buffer!\n");
return -1;
strncpy(buffer, bufferMemory, bufferPointer);
/* Note : Here we can directly use strncpy because we are already in kernel space and do not need to translate address*/
return bufferPointer;
static ssize_t attrStoreData(struct device* pDev, struct device_attribute* attr, const char* buffer, size_t length)
printk(KERN_INFO "Writing to attribute\n");
bufferPointer = length;
strncpy(bufferMemory, buffer, length);
return length;
static ssize_t attrShowBuffer(struct device* pDev, struct device_attribute* attr, char* buffer)
int counter;
int temp = bufferSize;
char bufferSizeArray[4] = {0};
counter = 3;
//printk(KERN_INFO "Buffer = %d\n",bufferSize % 10);
bufferSizeArray[counter] = '0' + (bufferSize % 10);
//printk(KERN_INFO "Character at %d is : %c\n",counter,bufferSizeArray[counter]);
bufferSize /= 10;
while(counter != -1);
strncpy(buffer, bufferSizeArray, 4);
bufferSize = temp;
/* Note : Here we can directly use strncpy because we are already in kernel space and do not need to translate address*/
return 4;
static ssize_t attrStoreBuffer(struct device* pDev, struct device_attribute* attr, const char* buffer, size_t length)
int counter;
bufferPointer = length;
//printk(KERN_INFO "Length : %d With first char %c\n",length,buffer[0]);
bufferSize = 0;
for (counter = 0; counter < length-1 ; counter++)
bufferSize = (bufferSize * 10) + (buffer[counter] - '0') ;
//printk(KERN_INFO "Buffer size new : %d\n",bufferSize);
return length;
/* These macros convert the functions into instances named dev_attr_<_name> */
/* Definition of the macro is as follows : DEVICE_ATTR(_name, _mode, _show, _store) */
/* Note the actual implementation of the macro makes an entry in the struct device_attribute. This macro does that for us */
static DEVICE_ATTR(ShowData, S_IRWXU, attrShowData, attrStoreData); // S_IRWXU gives the owner read, write and execute access
static DEVICE_ATTR(Buffer, S_IRWXU, attrShowBuffer, attrStoreBuffer); // S_IRWXU gives the owner read, write and execute access
static struct file_operations fops =
.open = charDriverOpen,
.release = charDriverClose,
.read = charDriverRead,
.write = charDriverWrite,
static int __init charDriverEntry()
int returnValue;
//majorNumber = register_chrdev(0, DEVICE_NAME, &fops);
returnValue = alloc_chrdev_region(&myChrDevid, 0, 1, DEVICE_NAME);
/* This function takes 4 arguments - dev_t address, start of minor number, range/count of minor number, Name; Note - unlike register_chrdev fops have not
yet been tied to the major number */
if (returnValue < 0)
printk(KERN_ALERT "ERROR : can not acquire major number! error %d\n",returnValue);
return -1;
printk(KERN_INFO "Acquired Major Number! : %d\n", MAJOR(myChrDevid));
myChrDevCdev = cdev_alloc();
if (!myChrDevCdev) /* cdev_alloc() returns NULL on failure, not an ERR_PTR */
printk(KERN_ALERT "Failed to allocate space for CharDev struct\n");
unregister_chrdev_region(myChrDevid, 1);
return -1;
myChrDevCdev->owner = THIS_MODULE;
//myChrDevCdev->ops = &fops;/* this function inits the c_dev structure with memset 0 and then does basic konject setup and then adds fops to cdev struct*/
/* this function adds the cdev to the kernel structure so that it becomes available for the users to use it */
// Now we will create class for this device
pmyCharClass = class_create(THIS_MODULE,CLASS_NAME);
if (IS_ERR(pmyCharClass))
printk(KERN_ALERT "Failed to Register Class\n");
unregister_chrdev_region(myChrDevid, 1);
return -1;
printk(KERN_INFO "Class created!\n");
pmyCharDevice = device_create(pmyCharClass, NULL, MKDEV(majorNumber,0),NULL,DEVICE_NAME);
if (IS_ERR(pmyCharDevice))
printk(KERN_ALERT "Failed to create device\n");
unregister_chrdev_region(myChrDevid, 1);
return -1;
printk(KERN_INFO "Device created!\n");
returnValue = cdev_add(myChrDevCdev, myChrDevid, 1);
if (returnValue < 0)
printk(KERN_ALERT "Failed to add chdev \n");
return -1;
/* We have now created the class and we have acquired the major number. But we have not yet tied our created fileops to anything.
We will do that now */
//returnValue = cdev_init(cdev)
printk(KERN_INFO "Now We will create the attribute entry in sysfs\n");
/* the function used is device_create_file(struct device *, struct device_attribute*) */
device_create_file(pmyCharDevice, &dev_attr_ShowData); // The second argument is the structure created by the DEVICE_ATTR macro
device_create_file(pmyCharDevice, &dev_attr_Buffer);
return 0;
static void __exit charDriverExit()
device_remove_file(pmyCharDevice, &dev_attr_Buffer);
device_remove_file(pmyCharDevice, &dev_attr_ShowData);
device_destroy(pmyCharClass, MKDEV(majorNumber,0));
unregister_chrdev_region(myChrDevid, 1);
printk(KERN_INFO "Unmounting module done !\n");
static int charDriverOpen(struct inode *inodep, struct file *filep)
if ((filep->f_flags & O_ACCMODE) != O_RDWR)
printk(KERN_ALERT "WARNING : This driver can only be opened in both read and write mode\n");
return -1;
bufferMemory = kmalloc(bufferSize,GFP_KERNEL);
bufferPointer = 0;
return 0;
static int charDriverClose(struct inode *inodep, struct file *filep)
return 0;
static ssize_t charDriverWrite(struct file *filep, const char *buffer, size_t len, loff_t *offset)
// Here we will only allow to write one byte of data
if (len > bufferSize)
printk(KERN_WARNING "Attempted to write data larger than 15 byte!\n");
return 0;
//bufferMemory[bufferPointer] = *buffer;
copy_from_user(bufferMemory, buffer, len);
bufferPointer += len;
return len;
static ssize_t charDriverRead(struct file *filep, char *buffer, size_t len, loff_t *offset)
if(len > bufferSize || len > bufferPointer)
printk(KERN_WARNING "Attempting to read more than buffer size ! Deny\n");
return 0;
copy_to_user(buffer, bufferMemory, len);
// buffer[0] = bufferMemory[0];
bufferPointer -= len;
return len;
module_param(bufferSize, int, S_IRUGO | S_IWUSR);
MODULE_PARM_DESC(bufferSize, "Buffer Memory Size [15]");
Now if I replace the whole alloc_chrdev_region, cdev_init and cdev_add sequence with just register_chrdev(), the entry in /dev pops up. I am unable to figure out what register_chrdev() does that the former combination does not.
Thank you
Edit: Found the issue.
It was caused by calling MKDEV(majorNumber, 0) without ever storing the allocated major number in the majorNumber variable using MAJOR().
I'm not deleting the question, as someone may find it useful.
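To see why the mismatch hides the /dev node: device_create was handed MKDEV(0, 0), while the cdev was registered under the number returned by alloc_chrdev_region. The kernel macros are not available in user space, so the sketch below uses glibc's makedev(), major() and minor() from <sys/sysmacros.h> as stand-ins for MKDEV, MAJOR and MINOR. The helper names device_number_from and same_device are hypothetical and exist only for this illustration.

```cpp
#include <sys/sysmacros.h>  // glibc: makedev(), major(), minor()
#include <sys/types.h>      // dev_t

// Build the device number from the major that was actually allocated,
// which is what storing MAJOR(myChrDevid) in majorNumber achieves.
dev_t device_number_from(dev_t allocated, unsigned int minor_no)
{
    return makedev(major(allocated), minor_no);
}

// True when the number used to create the device node matches the number
// the character device was registered under.
bool same_device(dev_t a, dev_t b)
{
    return major(a) == major(b) && minor(a) == minor(b);
}
```

With any non-zero allocated major, same_device(makedev(0, 0), allocated) is false, which corresponds to the broken driver; building the number from the allocated value makes the two agree.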

Windows Thumbnail Handler Never Gets Called

I'm writing a thumbnail handler for a custom file type. When I register it, registration reports success. The "thumbnail" is shown as completely blank, but it does not say "no thumbnail available". The problem is that the actual thumbnail handler never gets called. I know this because I put a series of logging statements in my handler under handler::initialize, handler::queryinterface and handler::getthumbnail. I have gone through most of the documentation, but all I can find is that I need to implement Initialize and GetThumbnail. Here is my code:
#pragma once
#include <windows.h>
#include <thumbcache.h> // For IThumbnailProvider
#include <wincodec.h> // Windows Imaging Codecs
#include <fstream>
#include <iostream>
#pragma comment(lib, "windowscodecs.lib")
class ThumbnailProvider :
public IInitializeWithStream,
public IThumbnailProvider
// IUnknown
IFACEMETHODIMP QueryInterface(REFIID riid, void **ppv);
// IInitializeWithStream
IFACEMETHODIMP Initialize(IStream *pStream, DWORD grfMode);
// IThumbnailProvider
~ ThumbnailProvider();
// Reference count of component.
long m_cRef;
// Provided during initialization.
IStream *m_pStream;
std::ofstream output;
void stripImageFrom (IStream *stream, HBITMAP *phbmp);
#include "ThumbnailProvider.h"
#include <Shlwapi.h>
#include <Wincrypt.h> // For CryptStringToBinary.
#include <msxml6.h>
#include <atlimage.h>
#include <fstream>
#pragma comment(lib, "Shlwapi.lib")
#pragma comment(lib, "Crypt32.lib")
#pragma comment(lib, "msxml6.lib")
extern HINSTANCE g_hInst;
extern long g_cDllRef;
ThumbnailProvider:: ThumbnailProvider() : m_cRef(1), m_pStream(NULL)
std::ofstream st;
st.open("C:\\Users\\labs\\Desktop\\Output\\out.txt", std::ios_base::app);
st << "Made provider";
ThumbnailProvider::~ ThumbnailProvider()
#pragma region IUnknown
// Query to the interface the component supported.
IFACEMETHODIMP ThumbnailProvider::QueryInterface(REFIID riid, void **ppv)
std::ofstream st;
st.open("C:\\Users\\labs\\Desktop\\Output\\out.txt", std::ios_base::app);
st << "Querying interface";
static const QITAB qit[] =
QITABENT( ThumbnailProvider, IThumbnailProvider),
QITABENT( ThumbnailProvider, IInitializeWithStream),
{ 0 },
return QISearch(this, qit, riid, ppv);
// Increase the reference count for an interface on an object.
IFACEMETHODIMP_(ULONG) ThumbnailProvider::AddRef()
return InterlockedIncrement(&m_cRef);
// Decrease the reference count for an interface on an object.
IFACEMETHODIMP_(ULONG) ThumbnailProvider::Release()
ULONG cRef = InterlockedDecrement(&m_cRef);
if (0 == cRef)
delete this;
return cRef;
#pragma endregion
#pragma region IInitializeWithStream
// Initializes the thumbnail handler with a stream.
IFACEMETHODIMP ThumbnailProvider::Initialize(IStream *pStream, DWORD grfMode)
std::ofstream st;
st.open("C:\\Users\\labs\\Desktop\\Output\\out.txt", std::ios_base::app);
st << "Got to initialization";
// A handler instance should be initialized only once in its lifetime.
if (m_pStream == NULL)
// Take a reference to the stream if it has not been initialized yet.
hr = pStream->QueryInterface(&m_pStream);
return hr;
#pragma endregion
#pragma region IThumbnailProvider
// Gets a thumbnail image and alpha type. GetThumbnail is called with the
// largest desired size of the image, in pixels. Although the parameter is
// called cx, this is used as the maximum size of both the x and y dimensions.
// If the retrieved thumbnail is not square, then the longer axis is limited
// by cx and the aspect ratio of the original image is respected. On exit,
// GetThumbnail provides a handle to the retrieved image. It also provides a
// value that indicates the color format of the image and whether it has
// valid alpha information.
IFACEMETHODIMP ThumbnailProvider::GetThumbnail(UINT cx, HBITMAP *phbmp,
WTS_ALPHATYPE *pdwAlpha) {
std::ofstream st;
st.open("C:\\Users\\labs\\Desktop\\Output\\out.txt", std::ios_base::app);
st << "Getting thumbnail";
ThumbnailProvider::stripImageFrom (m_pStream, phbmp);
cx = 1024 * 1024;
*pdwAlpha = WTSAT_UNKNOWN;
return S_OK;
#pragma endregion
#pragma region Helper Functions
// The PNG signature is 137 80 78 71 13 10 26 10. This does not make sense to do backward. I will do it forward.
void ThumbnailProvider::stripImageFrom (IStream *stream, HBITMAP *phbmp) {
unsigned long numBytes = 0;
unsigned long *numBytesPtr = &numBytes;
char *chptr = nullptr;
byte vals[8] = { 0 };
STATSTG *stat = nullptr;
DWORD temp = NULL;
stream->Stat(stat, temp);
unsigned long long length = stat->cbSize.QuadPart;
unsigned long long i;
for (i = 0; i<length; i++) {
stream->Read(chptr, 1, numBytesPtr);
if (*chptr == 137) {
vals[0] = 1;
else if (*chptr == 80 && vals[0]) {
vals[1] = 1;
else if (*chptr == 78 && vals[1]) {
vals[2] = 1;
else if (*chptr == 71 && vals[2]) {
vals[3] = 1;
else if (*chptr == 13 && vals[3]) {
vals[4] = 1;
else if (*chptr == 10 && vals[4] && !vals[5]) {
vals[5] = 1;
else if (*chptr == 26 && vals[5]) {
vals[6] = 1;
else if (*chptr == 10 && vals[6]) {
vals[7] = 1;
i -= 7;
else {
memset(vals, 0, 8 * sizeof(vals[0]));
if (vals[7]) {
IStream *imgstream = nullptr;
stream->Read(imgstream, length-i, numBytesPtr);
CImage *img = nullptr;
*phbmp = *img;
#pragma endregion
Everything else is just edited from one of Microsoft's examples, so I am pretty sure that is ok. The example is here.
If you're having this problem, make sure you're building for the right architecture. That solved it for me.
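Separately from the architecture fix, the signature scan in stripImageFrom can be written more compactly once the stream contents are in memory. The following is a minimal sketch under that assumption (the bytes have already been read into a std::vector); find_png_signature is a hypothetical helper name, and the signature bytes are the ones quoted in the question's comment.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// The 8-byte PNG signature quoted in the question: 137 80 78 71 13 10 26 10.
constexpr std::array<unsigned char, 8> kPngSignature = {137, 80, 78, 71, 13, 10, 26, 10};

// Returns the offset of the first PNG signature in `data`, or -1 if absent.
// (find_png_signature is a hypothetical helper, not part of any Windows API.)
std::ptrdiff_t find_png_signature(const std::vector<unsigned char>& data)
{
    auto it = std::search(data.begin(), data.end(),
                          kPngSignature.begin(), kPngSignature.end());
    return it == data.end() ? -1 : it - data.begin();
}
```

Using std::search avoids the hand-rolled state array, so a partial match followed by a restart of the signature is still found correctly.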

Kernel module export many device attributes to userspace

I'm working with a kernel driver for an I2C device and up until now I've been making simple attributes available using the sysfs DEVICE_ATTR helper. Now I need to make a long list of attributes available like /sys/bus/i2c/device/.../param0, .../param1, etc. but it seems inefficient to write a function for each one and maybe even the wrong use of the sysfs system. For example:
static DEVICE_ATTR(param0, S_IRUGO, NULL, foo_set_param0);
static DEVICE_ATTR(param1, S_IRUGO, NULL, foo_set_param1);
static DEVICE_ATTR(param50, S_IRUGO, NULL, foo_set_param50);
The values on the device change frequently and reading them is expensive, so constantly reading them or using one function to read all of them is not really an option. I'm a bit of a C newbie, so maybe there is something totally obvious I'm missing, but can you use a wrapper on the sysfs show callback to take a parameter? Or is there a better system I should use for this? I looked at debugfs and it seems like I'd need to maintain the values in memory for it.
You can use the container_of() macro: embed each attribute in a larger structure that also carries your per-attribute data.
Here is an example that creates 100 attributes, each wrapped in a larger structure, big_kattr. The number of attributes is set by UNIT_NUM.
#include <linux/init.h>
#include <linux/module.h>
#include <linux/slab.h>
#define UNIT_NUM 100
typedef struct {
struct kobj_attribute k_obj;
int num;
} big_kattr;
static struct kobject *register_kobj;
// rw functions
static ssize_t __used store_value(struct kobject *kp, struct kobj_attribute *attr, const char *buf, size_t count){
big_kattr *a = container_of(attr, big_kattr, k_obj);
sscanf(buf, "%d", &a->num);
return count;
static ssize_t show_value(struct kobject *kp, struct kobj_attribute *attr, char *buf) {
big_kattr *a = container_of(attr, big_kattr, k_obj);
return sprintf(buf, "%d\n", a->num);
// put attribute to attribute group
static struct attribute * unit_attrs[UNIT_NUM + 1];
static big_kattr full_unit_attrs[UNIT_NUM];
static struct attribute_group unit_attr_group;
static int hello_init(void){
int i;
memset(full_unit_attrs, 0, sizeof(full_unit_attrs));
memset(unit_attrs, 0, sizeof(unit_attrs));
memset(&unit_attr_group, 0, sizeof(unit_attr_group));
for(i=0; i<UNIT_NUM; i++){
char * str = kmalloc(32, GFP_KERNEL);
sprintf(str, "unit-%03d",i);
full_unit_attrs[i].k_obj.attr.name = str;
full_unit_attrs[i].k_obj.attr.mode = S_IWUSR | S_IRUGO;
full_unit_attrs[i].k_obj.show = show_value;
full_unit_attrs[i].k_obj.store = store_value;
full_unit_attrs[i].num = i;
unit_attrs[i] = &(full_unit_attrs[i].k_obj.attr);
unit_attr_group.attrs = unit_attrs;
// create sysfs object ( /sys/kernel/many directory )
register_kobj = kobject_create_and_add("many", kernel_kobj);
if (!register_kobj)
return -ENOMEM;
//create all attributes (files)
if(sysfs_create_group(register_kobj, &unit_attr_group)){
return -ENOMEM;
return 0;
static void hello_exit(void){
int i;
for(i=0; i<UNIT_NUM; i++)
cat /sys/kernel/many/unit-077
echo 12345 > /sys/kernel/many/unit-088
cat /sys/kernel/many/unit-088
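The pointer arithmetic behind container_of() can be tried out in user space. The sketch below is an illustration only: Attr stands in for the kernel's kobj_attribute, BigAttr mirrors big_kattr (with the member order swapped to show that the embedded member need not come first), and to_big_attr is a hypothetical helper that does what container_of(attr, big_kattr, k_obj) does in the module.

```cpp
#include <cassert>
#include <cstddef>   // offsetof

// Stand-in for the kernel's kobj_attribute, so the sketch builds in user space.
struct Attr {
    const char* name;
};

// The generic attribute is embedded in a larger struct that also carries
// the per-attribute data.
struct BigAttr {
    int  num;
    Attr attr;
};

// Step back from the embedded member to the start of the enclosing struct.
BigAttr* to_big_attr(Attr* member)
{
    assert(member != nullptr);
    return reinterpret_cast<BigAttr*>(
        reinterpret_cast<char*>(member) - offsetof(BigAttr, attr));
}

// One callback serves every attribute; the parameter comes from the container.
int show_value(Attr* attr)
{
    return to_big_attr(attr)->num;
}
```

This is why a single show/store pair is enough for any number of attributes: the callback receives only the embedded attribute pointer and recovers its own parameters from the surrounding structure.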

I am exporting a class from a DLL, but GetProcAddress fails with error 127. How do I resolve it?

#include "Calc.h"
#include <windows.h>
#include <winerror.h>
#include <stdio.h>   // printf
#include <stdlib.h>  // malloc
typedef void (WINAPI * PCTOR) ();
int main()
HMODULE hMod = LoadLibrary (L"Calci.dll");
if (NULL == hMod)
printf ("LoadLibrary failed\n");
return 1;
CCalc *pCCalc = (CCalc *) malloc (sizeof (CCalc));
if (NULL == pCCalc)
printf ("memory allocation failed\n");
return 1;
PCTOR pCtor = (PCTOR) GetProcAddress (hMod, "CCalc");//127 error
int err = GetLastError();
if (NULL == pCtor)
printf ("GetProcAddress failed\n");
return 1;
__asm { MOV ECX, pCCalc };
pCtor ();
return 0;
//dll file
#include <tchar.h>
#define CALC_API __declspec (dllexport)
#define CALC_API __declspec (dllimport)
#define SOME_INSTN_BUF 260
class CALC_API CCalc
char m_szLastUsedFunc[SOME_INSTN_BUF];
CCalc ();
int Add (int i, int j);
int Sub (int i, int j);
TCHAR* GetLastUsedFunc ();
Use dumpbin.exe to check the exact name of the export in the DLL. Maybe it doesn't exist at all?
If you can link against the import library instead of using the LoadLibrary API, that is the better option.
You're invoking GetProcAddress (hMod, "CCalc"), however "CCalc" isn't the name of a function: it's the name of a class.
You're trying to load the address of the CCalc::CCalc default constructor: to do that, use a tool (e.g. dumpbin) to discover the "decorated" name of the constructor.
However instead of trying to dynamic-load and invoke the constructor, a more usual way to implement this functionality would be to create a static factory method in the DLL, e.g. like this:
class CALC_API CCalc
{
public:
    static CCalc* create() { return new CCalc(); }
private:
    //doesn't need to be public because users instantiate this class using
    //the static create method
    CCalc();
public:
    virtual int Add (int i, int j);
    virtual int Sub (int i, int j);
    virtual TCHAR* GetLastUsedFunc ();
    virtual ~CCalc() {}
};
Then use GetProcAddress to get the address of the static CCalc::create function, which because it's static you can invoke without using assembly to mess with ECX.
You can't use GetProcAddress on a class; that does not work. The only names you can resolve without dealing with name mangling are those of extern "C" functions.
For example:
extern "C" __declspec(dllexport) CCalc *create_calc()
{
    return new CCalc;
}
Now you can resolve it with GetProcAddress(hMod, "create_calc"), because create_calc is not a name-mangled function.
You will also have to provide an abstract API class without implementation and make CCalc inherit from ACalc; otherwise you'll get unresolved symbols when trying to build your application, because the actual member functions are not known to the application.
class ACalc {
public:
    virtual int add(int i,int j) = 0;
    virtual ~ACalc() {}
};
class CCalc : public ACalc {
public:
    virtual int add(int i,int j) { ... };
};
And in the main program
ACalc *ptr= call_for_dll_function
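Putting the pieces of this answer together, a minimal sketch of the interface-plus-factory pattern might look like the following. It is written as portable C++ so it can be compiled anywhere: in a real DLL the two extern "C" functions would also carry __declspec(dllexport) and the application would obtain create_calc through GetProcAddress. destroy_calc and the CreateCalcFn typedef are hypothetical additions for this illustration, as is the trivial body of add.

```cpp
// Abstract interface the application compiles against (ACalc in the answer).
class ACalc {
public:
    virtual int add(int i, int j) = 0;
    virtual ~ACalc() {}
};

// Concrete class that would live inside the DLL; the application never
// needs its definition, only the interface above.
class CCalc : public ACalc {
public:
    int add(int i, int j) override { return i + j; }
};

// Unmangled factory, plus a matching destroy function so that the object is
// freed by the same module that allocated it (destroy_calc is hypothetical).
extern "C" ACalc* create_calc() { return new CCalc; }
extern "C" void destroy_calc(ACalc* calc) { delete calc; }

// Pointer type the application would cast the looked-up address to.
extern "C" typedef ACalc* (*CreateCalcFn)();
```

Because every call after creation goes through the virtual table of ACalc, the application needs no further exports from the DLL and no inline assembly to set up the this pointer.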
