Number of mapped views to a shared memory on Windows - windows

Is there a way to check how many views have been mapped on to a memory mapped file on Windows?
Something like the equivalent of shmctl(... ,IPC_STAT,...) on Linux?

I had the same need to access the number of shared views. So I created this question: Accessing the number of shared memory mapped file views (Windows)
You may find a solution that suits your needs there.
As per Scath comment, I am going to add here the proposed solution, although merit should go to eryksun and RbMm. Making use of NtQueryObject call one can access the HandleCount (although it may not be 100% reliable):
#include <stdio.h>
#include <windows.h>
#include <winternl.h>
typedef NTSTATUS (__stdcall *NtQueryObjectFuncPointer) (
HANDLE Handle,
OBJECT_INFORMATION_CLASS ObjectInformationClass,
PVOID ObjectInformation,
ULONG ObjectInformationLength,
PULONG ReturnLength);
int main(void)
{
_PUBLIC_OBJECT_BASIC_INFORMATION pobi;
ULONG rLen;
// Create the memory mapped file (in system pagefile) (better in global namespace
// but needs SeCreateGlobalPrivilege privilege)
HANDLE hMap = CreateFileMapping(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE|SEC_COMMIT,
0, 1, "Local\\UniqueShareName");
// Get the NtQUeryObject function pointer and then the handle basic information
NtQueryObjectFuncPointer _NtQueryObject = (NtQueryObjectFuncPointer)GetProcAddress(
GetModuleHandle("ntdll.dll"), "NtQueryObject");
_NtQueryObject(hMap, ObjectBasicInformation, (PVOID)&pobi, (ULONG)sizeof(pobi), &rLen);
// Check limit
if (pobi.HandleCount > 4) {
printf("Limit exceeded: %ld > 4\n", pobi.HandleCount);
exit(1);
}
//...
Sleep(30000);
}

Related

Trap memory accesses inside a standard executable built with MinGW

So my problem sounds like this.
I have some platform dependent code (embedded system) which writes to some MMIO locations that are hardcoded at specific addresses.
I compile this code with some management code inside a standard executable (mainly for testing) but also for simulation (because it takes longer to find basic bugs inside the actual HW platform).
To alleviate the hardcoded pointers, i just redefine them to some variables inside the memory pool. And this works really well.
The problem is that there is specific hardware behavior on some of the MMIO locations (w1c for example) which makes "correct" testing hard to impossible.
These are the solutions i thought of:
1 - Somehow redefine the accesses to those registers and try to insert some immediate function to simulate the dynamic behavior. This is not really usable since there are various ways to write to the MMIO locations (pointers and stuff).
2 - Somehow leave the addresses hardcoded and trap the illegal access through a seg fault, find the location that triggered, extract exactly where the access was made, handle and return. I am not really sure how this would work (and even if it's possible).
3 - Use some sort of emulation. This will surely work, but it will void the whole purpose of running fast and native on a standard computer.
4 - Virtualization ?? Probably will take a lot of time to implement. Not really sure if the gain is justifiable.
Does anyone have any idea if this can be accomplished without going too deep? Maybe is there a way to manipulate the compiler in some way to define a memory area for which every access will generate a callback. Not really an expert in x86/gcc stuff.
Edit: It seems that it's not really possible to do this in a platform independent way, and since it will be only windows, i will use the available API (which seems to work as expected). Found this Q here:
Is set single step trap available on win 7?
I will put the whole "simulated" register file inside a number of pages, guard them, and trigger a callback from which i will extract all the necessary info, do my stuff then continue execution.
Thanks all for responding.
I think #2 is the best approach. I routinely use approach #4, but I use it to test code that is running in the kernel, so I need a layer below the kernel to trap and emulate the accesses. Since you have already put your code into a user-mode application, #2 should be simpler.
The answers to this question may provide help in implementing #2. How to write a signal handler to catch SIGSEGV?
What you really want to do, though, is to emulate the memory access and then have the segv handler return to the instruction after the access. This sample code works on Linux. I'm not sure if the behavior it is taking advantage of is undefined, though.
#include <stdint.h>
#include <stdio.h>
#include <signal.h>
#define REG_ADDR ((volatile uint32_t *)0x12340000f000ULL)
static uint32_t read_reg(volatile uint32_t *reg_addr)
{
uint32_t r;
asm("mov (%1), %0" : "=a"(r) : "r"(reg_addr));
return r;
}
static void segv_handler(int, siginfo_t *, void *);
int main()
{
struct sigaction action = { 0, };
action.sa_sigaction = segv_handler;
action.sa_flags = SA_SIGINFO;
sigaction(SIGSEGV, &action, NULL);
// force sigsegv
uint32_t a = read_reg(REG_ADDR);
printf("after segv, a = %d\n", a);
return 0;
}
static void segv_handler(int, siginfo_t *info, void *ucontext_arg)
{
ucontext_t *ucontext = static_cast<ucontext_t *>(ucontext_arg);
ucontext->uc_mcontext.gregs[REG_RAX] = 1234;
ucontext->uc_mcontext.gregs[REG_RIP] += 2;
}
The code to read the register is written in assembly to ensure that both the destination register and the length of the instruction are known.
This is how the Windows version of prl's answer could look like:
#include <stdint.h>
#include <stdio.h>
#include <windows.h>
#define REG_ADDR ((volatile uint32_t *)0x12340000f000ULL)
static uint32_t read_reg(volatile uint32_t *reg_addr)
{
uint32_t r;
asm("mov (%1), %0" : "=a"(r) : "r"(reg_addr));
return r;
}
static LONG WINAPI segv_handler(EXCEPTION_POINTERS *);
int main()
{
SetUnhandledExceptionFilter(segv_handler);
// force sigsegv
uint32_t a = read_reg(REG_ADDR);
printf("after segv, a = %d\n", a);
return 0;
}
static LONG WINAPI segv_handler(EXCEPTION_POINTERS *ep)
{
// only handle read access violation of REG_ADDR
if (ep->ExceptionRecord->ExceptionCode != EXCEPTION_ACCESS_VIOLATION ||
ep->ExceptionRecord->ExceptionInformation[0] != 0 ||
ep->ExceptionRecord->ExceptionInformation[1] != (ULONG_PTR)REG_ADDR)
return EXCEPTION_CONTINUE_SEARCH;
ep->ContextRecord->Rax = 1234;
ep->ContextRecord->Rip += 2;
return EXCEPTION_CONTINUE_EXECUTION;
}
So, the solution (code snippet) is as follows:
First of all, i have a variable:
__attribute__ ((aligned (4096))) int g_test;
Second, inside my main function, i do the following:
AddVectoredExceptionHandler(1, VectoredHandler);
DWORD old;
VirtualProtect(&g_test, 4096, PAGE_READWRITE | PAGE_GUARD, &old);
The handler looks like this:
LONG WINAPI VectoredHandler(struct _EXCEPTION_POINTERS *ExceptionInfo)
{
static DWORD last_addr;
if (ExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_GUARD_PAGE_VIOLATION) {
last_addr = ExceptionInfo->ExceptionRecord->ExceptionInformation[1];
ExceptionInfo->ContextRecord->EFlags |= 0x100; /* Single step to trigger the next one */
return EXCEPTION_CONTINUE_EXECUTION;
}
if (ExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_SINGLE_STEP) {
DWORD old;
VirtualProtect((PVOID)(last_addr & ~PAGE_MASK), 4096, PAGE_READWRITE | PAGE_GUARD, &old);
return EXCEPTION_CONTINUE_EXECUTION;
}
return EXCEPTION_CONTINUE_SEARCH;
}
This is only a basic skeleton for the functionality. Basically I guard the page on which the variable resides, i have some linked lists in which i hold pointers to the function and values for the address in question. I check that the fault generating address is inside my list then i trigger the callback.
On first guard hit, the page protection will be disabled by the system, but i can call my PRE_WRITE callback where i can save the variable state. Because a single step is issued through the EFlags, it will be followed immediately by a single step exception (which means that the variable was written), and i can trigger a WRITE callback. All the data required for the operation is contained inside the ExceptionInformation array.
When someone tries to write to that variable:
*(int *)&g_test = 1;
A PRE_WRITE followed by a WRITE will be triggered,
When i do:
int x = *(int *)&g_test;
A READ will be issued.
In this way i can manipulate the data flow in a way that does not require modifications of the original source code.
Note: This is intended to be used as part of a test framework and any penalty hit is deemed acceptable.
For example, W1C (Write 1 to clear) operation can be accomplished:
void MYREG_hook(reg_cbk_t type)
{
/** We need to save the pre-write state
* This is safe since we are assured to be called with
* both PRE_WRITE and WRITE in the correct order
*/
static int pre;
switch (type) {
case REG_READ: /* Called pre-read */
break;
case REG_PRE_WRITE: /* Called pre-write */
pre = g_test;
break;
case REG_WRITE: /* Called after write */
g_test = pre & ~g_test; /* W1C */
break;
default:
break;
}
}
This was possible also with seg-faults on illegal addresses, but i had to issue one for each R/W, and keep track of a "virtual register file" so a bigger penalty hit. In this way i can only guard specific areas of memory or none, depending on the registered monitors.

Accessing SuperBlock object of linux kernel in a system call

I am trying to access super block object which is defined in linux/fs.h.
But how to initialize the object so that we can access it's properties.
I found that alloc_super() is used to initialize super but how is it called?
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <errno.h>
#include <linux/fs.h>
int main(){
printf("hello there");
struct super_block *sb;
return 0;
}
The answer is very much file system dependent, since different file systems will have different super block layouts and infact different arrangements of blocks.
For instance, ext2 file systems superblock is in a known location on disk (byte 1024), and has a known size (sizeof(struct superblock) bytes).
So a typical implementation (This is not a working code but with minor modification can be made to work ) of what you want would be:
struct superblock *read_superblock(int fd) {
struct superblock *sb = malloc(sizeof(struct superblock));
assert(sb != NULL);
lseek(fd, (off_t) 1024, SEEK_SET));
read(fd, (void *) sb, sizeof(struct superblock));
return sb;
}
Now, you can alloc superblock using linux/headers, or write your own struct that exactly matches with the ext2/ext3/etc/etc file systems superblock.
Then you must know where to find the superblock (the lseek() comes here).
Also you need to pass the disk file name file_descriptor to the function.
So do a
int fd = open(argv[1], O_RDONLY);
struct superblock * sb = read_superblock(fd);

How can one share depth images between two processes?

I have 4 different depth cameras available to me: Kinect, Xtion, PMD nano, Softkinetic DepthSense.
I have the libraries that know how to read all of them: OpenNI, PMD drivers, Softkinetic drivers.
I would ideally like to make a simple grabber for each kind of camera and then just use it as a plugin into any other program i.e. get fast, non redundant access (i.e. not too many memory copies) to the data stream.
One of the problems is that in many cases I dont have the right library in 32 or 64 bit so I cant compile all grabbers in the same project.
What is the best way to achieve this?
I am a researcher so this idea isnt necessarily useful for production code but given this scenario my best solution has been to create a server process for each type of camera. Each server process knows how to load its own type of camera stream and then throws it into a shared memory space that other processes can read from.
It is obviously possible to use different kind of locking mechanisms but I have left the below code without any locks.
The server process will include the following:
#define BOOST_ALL_NO_LIB
#include <boost/interprocess/shared_memory_object.hpp>
#include <boost/interprocess/mapped_region.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>
#include <boost/interprocess/sync/interprocess_mutex.hpp>
using namespace std;
using namespace boost::interprocess;
struct sharedImage
{
enum { width = 320 };
enum { height = 240 };
enum { dataLength = width*height*sizeof(unsigned short) };
sharedImage(){}
interprocess_mutex mutex;
unsigned short data[dataLength];
};
shared_memory_object shm;
sharedImage * sIm;
mapped_region region;
int setupSharedMemory(){
// Clear the object if it exists
shared_memory_object::remove("ImageMem");
shm = shared_memory_object(create_only /*only create*/,"ImageMem" /*name*/,read_write/*read-write mode*/);
printf("Size:%i\n",sizeof(sharedImage));
//Set size
shm.truncate(sizeof(sharedImage));
//Map the whole shared memory in this process
region = mapped_region(shm, read_write);
//Get the address of the mapped region
void * addr = region.get_address();
//Construct the shared structure in the preallocated memory of shm
sIm = new (addr) sharedImage;
return 0;
}
int shutdownSharedMemory(){
shared_memory_object::remove("ImageMem");
return 0;
}
To start it up call setupSharedMemory() and to shut down call shutdownSharedMemory().
All the values are hard coded in this simple example but its easy to imagine making it more flexible.
Now lets assume that you are using SoftKinetic's DepthSense. So then you could write the following callback for the Depth node.
void onNewDepthSample(DepthNode node, DepthNode::NewSampleReceivedData data) {
//scoped_lock<interprocess_mutex> lock(sIm->mutex);
memcpy(sIm->data, data.depthMap, sIm->dataLength);
}
What this does is simply copies the latest depth map into the shared memory space.
You could also add a timestamp and a lock and anything else you need but this basic code works well enough for me so I will leave it as it is.
Now in some other process you can access the data in a very similar fashion.
The code below is what I use to get the live SoftKinetic DepthSense depth stream into Matlab for real time processing. This method has a huge advantage over trying to write my own mex wrapper specifically for SoftKinetic because I can use the same code for all the other cameras if I write servers for them.
#include <math.h>
#include <windows.h>
#include "mex.h"
#define BOOST_ALL_NO_LIB
#include <boost/interprocess/shared_memory_object.hpp>
#include <boost/interprocess/mapped_region.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>
#include <boost/interprocess/sync/interprocess_mutex.hpp>
#include <iostream>
#include <cstdio>
#include <cstdlib>
using namespace boost::interprocess;
struct sharedImage
{
enum { width = 320 };
enum { height = 240 };
enum { dataLength = width*height*sizeof(short) };
sharedImage(): dirty(true){}
interprocess_mutex mutex;
uint8_t data[dataLength];
bool dirty;
};
void getFrame(unsigned short *D)
{
//Open the shared memory object.
shared_memory_object shm(open_only ,"ImageMem", read_write);
//Map the whole shared memory in this process
mapped_region region(shm ,read_write);
//Get the address of the mapped region
void * addr = region.get_address();
//Construct the shared structure in memory
sharedImage * sIm = static_cast<sharedImage*>(addr);
//scoped_lock<interprocess_mutex> lock(sIm->mutex);
memcpy((char*)D, (char*)sIm->data, sIm->dataLength);
}
void mexFunction(int nlhs, mxArray *plhs[ ], int nrhs, const mxArray *prhs[ ])
{
// Build outputs
mwSize dims[2] = {320, 240};
plhs[0] = mxCreateNumericArray(2, dims, mxUINT16_CLASS, mxREAL);
unsigned short *D = (unsigned short*)mxGetData(plhs[0]);
try
{
getFrame(D);
}
catch (interprocess_exception &ex)
{
mexPrintf("getFrame:%s\n", ex.what());
}
}
which on my computer I compile in Matlab with: mex getSKFrame.cpp -IC:\Development\boost_1_48_0
And then finally to use it in Matlab: D = getSKFrame()'; imagesc(D)

How to get thread stack information on Windows?

I enumerate all threads in a process through the CreateToolhelp32Snapshot function. I would like to get some basic stack information for each thread. More specifically I would like to get stack bottom address and if possible I would like to get current stack top address. Basically this is the information displayed with the ~*k command in WinDbg. So how can I obtain the stack information from the thread's ID or HANDLE?
(Definitions can be found here.)
To get stack boundaries:
THREAD_BASIC_INFORMATION basicInfo;
NT_TIB tib;
// Get TEB address
NtQueryInformationThread(YOUR_THREAD_HANDLE, ThreadBasicInformation, &basicInfo, sizeof(THREAD_BASIC_INFORMATION), NULL);
// Read TIB
NtReadVirtualMemory(YOUR_PROCESS_HANDLE, basicInfo.TebBaseAddress, &tib, sizeof(NT_TIB), NULL);
// Check tib.StackBase and tib.StackLimit
To get the value of esp, simply use GetThreadContext.
An easier way without having to involve the Windows Driver Kit is as so:
NT_TIB* tib = (NT_TIB*)__readfsdword(0x18);
size_t* stackBottom = (size_t*)tib->StackLimit;
size_t* stackTop = (size_t*)tib->StackBase;
__readfsdword() works only for the current thread. So, the variant with NtQueryInformationThread() is more flexible.
Added some declarations which are missed in ntdll.h:
typedef enum _THREADINFOCLASS {
ThreadBasicInformation = 0,
} THREADINFOCLASS;
typedef LONG KPRIORITY;
typedef struct _CLIENT_ID {
HANDLE UniqueProcess;
HANDLE UniqueThread;
} CLIENT_ID;
typedef CLIENT_ID *PCLIENT_ID;
typedef struct _THREAD_BASIC_INFORMATION
{
NTSTATUS ExitStatus;
PVOID TebBaseAddress;
CLIENT_ID ClientId;
KAFFINITY AffinityMask;
KPRIORITY Priority;
KPRIORITY BasePriority;
} THREAD_BASIC_INFORMATION, *PTHREAD_BASIC_INFORMATION;
As fas as I know, Toolhelp works by making a copy of basic information on heaps, modules, processes and threads. This does not include the TEB block that contains the stack bottom address. I think you need to use another API, the debugger engine API, which offers functions to examine the stacks
Here's an easy way for the current thread (portable Win32 x86/x64 version):
#include <intrin.h>
NT_TIB* getTIB() {
#ifdef _M_IX86
return (NT_TIB*)__readfsdword(0x18);
#elif _M_AMD64
return (NT_TIB*)__readgsqword(0x30);
#else
#error unsupported architecture
#endif
}
NT_TIB* tib = getTIB();
void* stackBase = tib->StackBase;
void* stackLimit = tib->StackLimit;
Note: stackLimit < stackBase (as stack grows downwards).
For more details refer to Win32 TIB.

SysInternal's WinObj device listing mechanism

SysInternals's WinObj can list all device objects.
I wonder how it can list the devices.
Is there any open source we can read?(or a code snippet)
What is the most significant function I should know?
WinObj uses the NT system calls NtOpenDirectoryObject and NtQueryDirectoryObject. There is no driver or kernel code needed. You won't see the imports because these NT functions are loaded via LoadLibrary/GetProcAddress.
You don't have to enumerate the entire object namespace. If you're interested in the device objects call NtOpenDirectoryObject with "\Device", then call NtQueryDirectoryObject on the returned handle.
According to SysInternals' web page:
The native NT API provides routines
that allow user-mode programs to
browse the namespace and query the
status of objects located there, but
the interfaces are undocumented.
I've tried looking at WinObj's import table (dumpbin /imports winobj.exe) but there are no obvious suspects :-(
As per the answer from user1575778 you can use NtOpenDirectoryObject and NtQueryDirectoryObject (which from user mode are identical to ZwOpenDirectoryObject and ZwQueryDirectoryObject respectively) to list the objects inside the object manager namespace.
Have a look at objmgr.hpp of NT Objects aka ntobjx, in particular at the class NtObjMgr::Directory (or DirectoryT). It provides the same functionality nicely wrapped into a C++ class. The whole utility is open source under a liberal license (dual-licensed due to WTL-use: MIT and MS-PL), so bits and pieces can be reused however you please, provided you comply with the license terms.
But here's a simple C++ code example catering just your use case:
#include <Windows.h>
#include <tchar.h>
#include <cstdio>
#include <winternl.h>
NTSTATUS (NTAPI* NtOpenDirectoryObject)(PHANDLE, ACCESS_MASK, POBJECT_ATTRIBUTES);
NTSTATUS (NTAPI* NtQueryDirectoryObject)(HANDLE, PVOID, ULONG, BOOLEAN, BOOLEAN, PULONG, PULONG);
VOID (NTAPI* RtlInitUnicodeString_)(PUNICODE_STRING, PCWSTR);
NTSTATUS (NTAPI* NtClose_)(HANDLE);
#define DIRECTORY_QUERY (0x0001)
#define DIRECTORY_TRAVERSE (0x0002)
typedef struct _OBJECT_DIRECTORY_INFORMATION {
UNICODE_STRING Name;
UNICODE_STRING TypeName;
} OBJECT_DIRECTORY_INFORMATION, *POBJECT_DIRECTORY_INFORMATION;
#ifndef STATUS_SUCCESS
#define STATUS_SUCCESS ((NTSTATUS)0x00000000L) // ntsubauth
#endif // STATUS_SUCCESS
#ifndef STATUS_MORE_ENTRIES
#define STATUS_MORE_ENTRIES ((NTSTATUS)0x00000105L)
#endif // STATUS_MORE_ENTRIES
#ifndef STATUS_NO_MORE_ENTRIES
#define STATUS_NO_MORE_ENTRIES ((NTSTATUS)0x8000001AL)
#endif // STATUS_NO_MORE_ENTRIES
int PrintDevices()
{
NTSTATUS ntStatus;
OBJECT_ATTRIBUTES oa;
UNICODE_STRING objname;
HANDLE hDeviceDir = NULL;
RtlInitUnicodeString_(&objname, L"\\Device");
InitializeObjectAttributes(&oa, &objname, 0, NULL, NULL);
ntStatus = NtOpenDirectoryObject(&hDeviceDir, DIRECTORY_QUERY | DIRECTORY_TRAVERSE, &oa);
if(NT_SUCCESS(ntStatus))
{
size_t const bufSize = 0x10000;
BYTE buf[bufSize] = {0};
ULONG start = 0, idx = 0, bytes;
BOOLEAN restart = TRUE;
for(;;)
{
ntStatus = NtQueryDirectoryObject(hDeviceDir, PBYTE(buf), bufSize, FALSE, restart, &idx, &bytes);
if(NT_SUCCESS(ntStatus))
{
POBJECT_DIRECTORY_INFORMATION const pdilist = reinterpret_cast<POBJECT_DIRECTORY_INFORMATION>(PBYTE(buf));
for(ULONG i = 0; i < idx - start; i++)
{
if(0 == wcsncmp(pdilist[i].TypeName.Buffer, L"Device", pdilist[i].TypeName.Length / sizeof(WCHAR)))
{
_tprintf(_T("%s\n"), pdilist[i].Name.Buffer);
}
}
}
if(STATUS_MORE_ENTRIES == ntStatus)
{
start = idx;
restart = FALSE;
continue;
}
if((STATUS_SUCCESS == ntStatus) || (STATUS_NO_MORE_ENTRIES == ntStatus))
{
break;
}
}
(void)NtClose_(hDeviceDir);
return 0;
}
_tprintf(_T("Failed NtOpenDirectoryObject with 0x%08X"), ntStatus);
return 1;
}
int _tmain(int /*argc*/, _TCHAR** /*argv*/)
{
HMODULE hNtDll = ::GetModuleHandle(_T("ntdll.dll"));
*(FARPROC*)&NtOpenDirectoryObject = ::GetProcAddress(hNtDll, "NtOpenDirectoryObject");
*(FARPROC*)&NtQueryDirectoryObject = ::GetProcAddress(hNtDll, "NtQueryDirectoryObject");
*(FARPROC*)&RtlInitUnicodeString_ = ::GetProcAddress(hNtDll, "RtlInitUnicodeString");
*(FARPROC*)&NtClose_ = ::GetProcAddress(hNtDll, "NtClose");
if (!NtOpenDirectoryObject || !NtQueryDirectoryObject || !RtlInitUnicodeString_ || !NtClose_)
{
_tprintf(_T("Failed to retrieve ntdll.dll function pointers\n"));
return 1;
}
return PrintDevices();
}
Some remarks: This will not delve into subdirectories, it will not list any types other than Device and it will not resolve symbolic links, if any. For any of those features, please look at the aforementioned utility's source code and adjust as needed. winternl.h should be available in any recent Windows SDK.
The functions RtlInitUnicodeString_ and NtClose_ have a trailing underscore to avoid clashes with these native API functions, which are declared in winternl.h, but use __declspec(dllimport).
Disclosure: I am the author of ntobjx.
You can use NtOpenDirectoryObject and NtQueryDirectoryObject to enumarate the objects list in a given directory.
To get the details of the object namespace, you must use the Windows NT Undocumented API. That is also used by the WinObj as it is described here that how WinOBj getting the all results..and for those who are saying that we need a driver to do this please, read these lines on given page.
"One obvious way is to use a driver – in kernel mode everything is accessible – so the client app can get the required information by communicating with its own driver. WinObj does not use a driver, however (this is one reason it’s able to execute without admin privileges, although with admin privileges it shows all objects as opposed to partial results)."
You can start with SetupDiCreateDeviceInfoList and use other related functions to enumerate all the devices. This stuff is painful to use.

Resources