OpenSSL and multi-threads - windows

I've been reading about the requirement that if OpenSSL is used in a multi-threaded application, you have to register a thread identification function (and also a mutex creation function) with OpenSSL.
On Linux, according to the example provided by OpenSSL, a thread is normally identified by registering a function like this:
static unsigned long id_function(void){
return (unsigned long)pthread_self();
}
pthread_self() returns a pthread_t, and this works on Linux since pthread_t is just a typedef of unsigned long.
On Windows pthreads, FreeBSD, and other operating systems, pthread_t is a struct, with the following structure:
struct {
void * p; /* Pointer to actual object */
unsigned int x; /* Extra information - reuse count etc */
}
This can't be simply cast to an unsigned long, and when I try to do so, it throws a compile error. I tried taking the void *p and casting that to an unsigned long, on the theory that the memory pointer should be consistent and unique across threads, but this just causes my program to crash a lot.
What can I register with OpenSSL as the thread identification function when using Windows pthreads or FreeBSD or any of the other operating systems like this?
Also, as an additional question:
Does anyone know if this also needs to be done if OpenSSL is compiled into and used with QT, and if so how to register QThreads with OpenSSL? Surprisingly, I can't seem to find the answer in QT's documentation.

I will just put this code here. It is not panacea, as it doesn't deal with FreeBSD, but it is helpful in most cases when all you need is to support Windows and and say Debian. Of course, the clean solution assumes usage of CRYPTO_THREADID_* family introduced recently. (to give an idea, it has a CRYPTO_THREADID_cmp callback, which can be mapped to pthread_equal)
#include <pthread.h>
#include <openssl/err.h>
#if defined(WIN32)
#define MUTEX_TYPE HANDLE
#define MUTEX_SETUP(x) (x) = CreateMutex(NULL, FALSE, NULL)
#define MUTEX_CLEANUP(x) CloseHandle(x)
#define MUTEX_LOCK(x) WaitForSingleObject((x), INFINITE)
#define MUTEX_UNLOCK(x) ReleaseMutex(x)
#define THREAD_ID GetCurrentThreadId()
#else
#define MUTEX_TYPE pthread_mutex_t
#define MUTEX_SETUP(x) pthread_mutex_init(&(x), NULL)
#define MUTEX_CLEANUP(x) pthread_mutex_destroy(&(x))
#define MUTEX_LOCK(x) pthread_mutex_lock(&(x))
#define MUTEX_UNLOCK(x) pthread_mutex_unlock(&(x))
#define THREAD_ID pthread_self()
#endif
/* This array will store all of the mutexes available to OpenSSL. */
static MUTEX_TYPE *mutex_buf=NULL;
static void locking_function(int mode, int n, const char * file, int line)
{
if (mode & CRYPTO_LOCK)
MUTEX_LOCK(mutex_buf[n]);
else
MUTEX_UNLOCK(mutex_buf[n]);
}
static unsigned long id_function(void)
{
return ((unsigned long)THREAD_ID);
}
int thread_setup(void)
{
int i;
mutex_buf = malloc(CRYPTO_num_locks() * sizeof(MUTEX_TYPE));
if (!mutex_buf)
return 0;
for (i = 0; i < CRYPTO_num_locks( ); i++)
MUTEX_SETUP(mutex_buf[i]);
CRYPTO_set_id_callback(id_function);
CRYPTO_set_locking_callback(locking_function);
return 1;
}
int thread_cleanup(void)
{
int i;
if (!mutex_buf)
return 0;
CRYPTO_set_id_callback(NULL);
CRYPTO_set_locking_callback(NULL);
for (i = 0; i < CRYPTO_num_locks( ); i++)
MUTEX_CLEANUP(mutex_buf[i]);
free(mutex_buf);
mutex_buf = NULL;
return 1;
}

I only can answer the Qt part. Use QThread::currentThreadId(), or even QThread::currentThread() as the pointer value should be unique.

From the OpenSSL doc you linked:
threadid_func(CRYPTO_THREADID *id) is needed to record the currently-executing thread's identifier into id. The implementation of this callback should not fill in id directly, but should use CRYPTO_THREADID_set_numeric() if thread IDs are numeric, or CRYPTO_THREADID_set_pointer() if they are pointer-based. If the application does not register such a callback using CRYPTO_THREADID_set_callback(), then a default implementation is used - on Windows and BeOS this uses the system's default thread identifying APIs, and on all other platforms it uses the address of errno. The latter is satisfactory for thread-safety if and only if the platform has a thread-local error number facility.
As shown providing your own ID is really only useful if you can provide a better ID than OpenSSL's default implementation.
The only fail-safe way to provide IDs, when you don't know whether pthread_t is a pointer or an integer, is to maintain your own per-thread IDs stored as a thread-local value.

Related

how to fix wrong GCC ARM startup code pointer to initialized and zero variables?

[skip to UPDATE2 and save some time :-)]
I use ARM Cortex-M4, with CMSIS 5-5.7.0 and FreeRTOS, compiling using GCC for ARM (10_2021.10)
My variables are not initialized as they should.
My startup code is pretty simple, the entry point is the reset handler (CMSIS declared startup_ARMCM4.s as deprecated and recommend using the C code startup code so this is what I do).
Here is my code:
__attribute__((__noreturn__)) void Reset_Handler(void)
{
DataInit();
SystemInit(); /* CMSIS System Initialization */
main();
}
static void DataInit(void)
{
typedef struct {
uint32_t const* src;
uint32_t* dest;
uint32_t wlen;
} __copy_table_t;
typedef struct {
uint32_t* dest;
uint32_t wlen;
} __zero_table_t;
extern const __copy_table_t __copy_table_start__;
extern const __copy_table_t __copy_table_end__;
extern const __zero_table_t __zero_table_start__;
extern const __zero_table_t __zero_table_end__;
for (__copy_table_t const* pTable = &__copy_table_start__; pTable < &__copy_table_end__; ++pTable) {
for(uint32_t i=0u; i<pTable->wlen; ++i) {
pTable->dest[i] = pTable->src[i];
}
}
for (__zero_table_t const* pTable = &__zero_table_start__; pTable < &__zero_table_end__; ++pTable) {
for(uint32_t i=0u; i<pTable->wlen; ++i) {
pTable->dest[i] = 0u;
}
}
}
__copy_table_start__, __copy_table_end__ etc. have the wrong values an so no data is copied to the appropriate place in RAM.
I tried adding __libc_init_array() before DataInit(), as suggested in this answer, and remove the nostartfiles flag from the linker, but at some point __libc_init_array() jumps to an illegal address and I get a HardFault interrupt.
Is there a different method to fix it? maybe one where I can use the nostartfiles flag?
UPDATE:
Looking at the memory, where __copy_table_start__ is located, I see the data there is valid (even without the use of __libc_init_array()). It seems that pTable doesn't get the correct value.
I tried using __data_start__, __data_end__, __bss_start__, __bss_end__ and __etext instead of the above variables, in the linker file it is said they can be used in code without definition, but they cannot (maybe that's a clue?). In any case they didn't work either.
UPDATE2:
found the actual problem
all struct members get the same value (modifying one changes all others), it happens with every struct. I have no idea how this is possible. In other words the value of __copy_table_start__.src is, for example, 0x14651234, __copy_table_start__.dest is 0x00100000, and __copy_table_start__.wlen is 0x0365. When looking at pTable all members are 0x14651234.

Trap memory accesses inside a standard executable built with MinGW

So my problem sounds like this.
I have some platform dependent code (embedded system) which writes to some MMIO locations that are hardcoded at specific addresses.
I compile this code with some management code inside a standard executable (mainly for testing) but also for simulation (because it takes longer to find basic bugs inside the actual HW platform).
To alleviate the hardcoded pointers, i just redefine them to some variables inside the memory pool. And this works really well.
The problem is that there is specific hardware behavior on some of the MMIO locations (w1c for example) which makes "correct" testing hard to impossible.
These are the solutions i thought of:
1 - Somehow redefine the accesses to those registers and try to insert some immediate function to simulate the dynamic behavior. This is not really usable since there are various ways to write to the MMIO locations (pointers and stuff).
2 - Somehow leave the addresses hardcoded and trap the illegal access through a seg fault, find the location that triggered, extract exactly where the access was made, handle and return. I am not really sure how this would work (and even if it's possible).
3 - Use some sort of emulation. This will surely work, but it will void the whole purpose of running fast and native on a standard computer.
4 - Virtualization ?? Probably will take a lot of time to implement. Not really sure if the gain is justifiable.
Does anyone have any idea if this can be accomplished without going too deep? Maybe is there a way to manipulate the compiler in some way to define a memory area for which every access will generate a callback. Not really an expert in x86/gcc stuff.
Edit: It seems that it's not really possible to do this in a platform independent way, and since it will be only windows, i will use the available API (which seems to work as expected). Found this Q here:
Is set single step trap available on win 7?
I will put the whole "simulated" register file inside a number of pages, guard them, and trigger a callback from which i will extract all the necessary info, do my stuff then continue execution.
Thanks all for responding.
I think #2 is the best approach. I routinely use approach #4, but I use it to test code that is running in the kernel, so I need a layer below the kernel to trap and emulate the accesses. Since you have already put your code into a user-mode application, #2 should be simpler.
The answers to this question may provide help in implementing #2. How to write a signal handler to catch SIGSEGV?
What you really want to do, though, is to emulate the memory access and then have the segv handler return to the instruction after the access. This sample code works on Linux. I'm not sure if the behavior it is taking advantage of is undefined, though.
#include <stdint.h>
#include <stdio.h>
#include <signal.h>
#define REG_ADDR ((volatile uint32_t *)0x12340000f000ULL)
static uint32_t read_reg(volatile uint32_t *reg_addr)
{
uint32_t r;
asm("mov (%1), %0" : "=a"(r) : "r"(reg_addr));
return r;
}
static void segv_handler(int, siginfo_t *, void *);
int main()
{
struct sigaction action = { 0, };
action.sa_sigaction = segv_handler;
action.sa_flags = SA_SIGINFO;
sigaction(SIGSEGV, &action, NULL);
// force sigsegv
uint32_t a = read_reg(REG_ADDR);
printf("after segv, a = %d\n", a);
return 0;
}
static void segv_handler(int, siginfo_t *info, void *ucontext_arg)
{
ucontext_t *ucontext = static_cast<ucontext_t *>(ucontext_arg);
ucontext->uc_mcontext.gregs[REG_RAX] = 1234;
ucontext->uc_mcontext.gregs[REG_RIP] += 2;
}
The code to read the register is written in assembly to ensure that both the destination register and the length of the instruction are known.
This is how the Windows version of prl's answer could look like:
#include <stdint.h>
#include <stdio.h>
#include <windows.h>
#define REG_ADDR ((volatile uint32_t *)0x12340000f000ULL)
static uint32_t read_reg(volatile uint32_t *reg_addr)
{
uint32_t r;
asm("mov (%1), %0" : "=a"(r) : "r"(reg_addr));
return r;
}
static LONG WINAPI segv_handler(EXCEPTION_POINTERS *);
int main()
{
SetUnhandledExceptionFilter(segv_handler);
// force sigsegv
uint32_t a = read_reg(REG_ADDR);
printf("after segv, a = %d\n", a);
return 0;
}
static LONG WINAPI segv_handler(EXCEPTION_POINTERS *ep)
{
// only handle read access violation of REG_ADDR
if (ep->ExceptionRecord->ExceptionCode != EXCEPTION_ACCESS_VIOLATION ||
ep->ExceptionRecord->ExceptionInformation[0] != 0 ||
ep->ExceptionRecord->ExceptionInformation[1] != (ULONG_PTR)REG_ADDR)
return EXCEPTION_CONTINUE_SEARCH;
ep->ContextRecord->Rax = 1234;
ep->ContextRecord->Rip += 2;
return EXCEPTION_CONTINUE_EXECUTION;
}
So, the solution (code snippet) is as follows:
First of all, i have a variable:
__attribute__ ((aligned (4096))) int g_test;
Second, inside my main function, i do the following:
AddVectoredExceptionHandler(1, VectoredHandler);
DWORD old;
VirtualProtect(&g_test, 4096, PAGE_READWRITE | PAGE_GUARD, &old);
The handler looks like this:
LONG WINAPI VectoredHandler(struct _EXCEPTION_POINTERS *ExceptionInfo)
{
static DWORD last_addr;
if (ExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_GUARD_PAGE_VIOLATION) {
last_addr = ExceptionInfo->ExceptionRecord->ExceptionInformation[1];
ExceptionInfo->ContextRecord->EFlags |= 0x100; /* Single step to trigger the next one */
return EXCEPTION_CONTINUE_EXECUTION;
}
if (ExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_SINGLE_STEP) {
DWORD old;
VirtualProtect((PVOID)(last_addr & ~PAGE_MASK), 4096, PAGE_READWRITE | PAGE_GUARD, &old);
return EXCEPTION_CONTINUE_EXECUTION;
}
return EXCEPTION_CONTINUE_SEARCH;
}
This is only a basic skeleton for the functionality. Basically I guard the page on which the variable resides, i have some linked lists in which i hold pointers to the function and values for the address in question. I check that the fault generating address is inside my list then i trigger the callback.
On first guard hit, the page protection will be disabled by the system, but i can call my PRE_WRITE callback where i can save the variable state. Because a single step is issued through the EFlags, it will be followed immediately by a single step exception (which means that the variable was written), and i can trigger a WRITE callback. All the data required for the operation is contained inside the ExceptionInformation array.
When someone tries to write to that variable:
*(int *)&g_test = 1;
A PRE_WRITE followed by a WRITE will be triggered,
When i do:
int x = *(int *)&g_test;
A READ will be issued.
In this way i can manipulate the data flow in a way that does not require modifications of the original source code.
Note: This is intended to be used as part of a test framework and any penalty hit is deemed acceptable.
For example, W1C (Write 1 to clear) operation can be accomplished:
void MYREG_hook(reg_cbk_t type)
{
/** We need to save the pre-write state
* This is safe since we are assured to be called with
* both PRE_WRITE and WRITE in the correct order
*/
static int pre;
switch (type) {
case REG_READ: /* Called pre-read */
break;
case REG_PRE_WRITE: /* Called pre-write */
pre = g_test;
break;
case REG_WRITE: /* Called after write */
g_test = pre & ~g_test; /* W1C */
break;
default:
break;
}
}
This was possible also with seg-faults on illegal addresses, but i had to issue one for each R/W, and keep track of a "virtual register file" so a bigger penalty hit. In this way i can only guard specific areas of memory or none, depending on the registered monitors.

Communication between two kernel kprobes/kretprobes

Is it possible to capture a kernel function's return value using a kretprobe and communicate it to another kretprobe which is hooked on to another kernel function.
One example of how to do this using eBPF & bcc:
#!/usr/bin/env python
from bcc import BPF
BPF(text="""
#include <uapi/linux/ptrace.h>
BPF_HASH(rvalues, u64, unsigned long);
int kretprobe__randomize_stack_top(struct pt_regs *ctx) {
u64 zero = 0;
unsigned long rvalue = PT_REGS_RC(ctx);
rvalues.lookup(&zero);
return 0;
}
int kretprobe__load_elf_binary(struct pt_regs *ctx) {
u64 zero = 0;
unsigned long *rvalue_ptr = rvalues.lookup(&zero);
if (rvalue_ptr) {
unsigned long rvalue = *rvalue_ptr;
bpf_trace_printk("value returned by randomize_stack_top: %d", rvalue);
}
return 0;
}
""").trace_print()
The value returned by randomize_stack_top is saved in the hash map rvalues with the key 0 (it's also possible to use a BPF_ARRAY since the key is fixed here). The value is retrieved in load_elf_binary with a simple lookup on the hash map.
Note: If you have several processes calling these functions, you can use their PID as keys for the hash map to discriminate between different returned values.
bcc offers an higher-level, Python API to load eBPF programs in the kernel and interact with them. eBPF programs can be used instead of kernel modules to instrument kprobes. For more information on bcc, see the tutorial on the repository.

how to use CryptoAPI in the linux kernel 2.6

I have been looking for some time but have not found anywhere near sufficient documentation / examples on how to use the CryptoAPI that comes with linux in the creation of syscalls / in kernel land.
If anyone knows of a good source please let me know, I would like to know how to do SHA1 / MD5 and Blowfish / AES within the kernel space only.
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/crypto.h>
#include <linux/err.h>
#include <linux/scatterlist.h>
#define SHA1_LENGTH 20
static int __init sha1_init(void)
{
struct scatterlist sg;
struct crypto_hash *tfm;
struct hash_desc desc;
unsigned char output[SHA1_LENGTH];
unsigned char buf[10];
int i;
printk(KERN_INFO "sha1: %s\n", __FUNCTION__);
memset(buf, 'A', 10);
memset(output, 0x00, SHA1_LENGTH);
tfm = crypto_alloc_hash("sha1", 0, CRYPTO_ALG_ASYNC);
desc.tfm = tfm;
desc.flags = 0;
sg_init_one(&sg, buf, 10);
crypto_hash_init(&desc);
crypto_hash_update(&desc, &sg, 10);
crypto_hash_final(&desc, output);
for (i = 0; i < 20; i++) {
printk(KERN_ERR "%d-%d\n", output[i], i);
}
crypto_free_hash(tfm);
return 0;
}
static void __exit sha1_exit(void)
{
printk(KERN_INFO "sha1: %s\n", __FUNCTION__);
}
module_init(sha1_init);
module_exit(sha1_exit);
MODULE_LICENSE("Dual MIT/GPL");
MODULE_AUTHOR("Me");
There are a couple of places in the kernel which use the crypto module: the eCryptfs file system (linux/fs/ecryptfs/) and the 802.11 wireless stack (linux/drivers/staging/rtl8187se/ieee80211/). Both of these use AES, but you may be able to extrapolate what you find there to MD5.
Another good example is from the 2.6.18 kernel source in security/seclvl.c
Note: You can change CRYPTO_TFM_REQ_MAY_SLEEP if needed
static int
plaintext_to_sha1(unsigned char *hash, const char *plaintext, unsigned int len)
{
struct crypto_tfm *tfm;
struct scatterlist sg;
if (len > PAGE_SIZE) {
seclvl_printk(0, KERN_ERR, "Plaintext password too large (%d "
"characters). Largest possible is %lu "
"bytes.\n", len, PAGE_SIZE);
return -EINVAL;
}
tfm = crypto_alloc_tfm("sha1", CRYPTO_TFM_REQ_MAY_SLEEP);
if (tfm == NULL) {
seclvl_printk(0, KERN_ERR,
"Failed to load transform for SHA1\n");
return -EINVAL;
}
sg_init_one(&sg, (u8 *)plaintext, len);
crypto_digest_init(tfm);
crypto_digest_update(tfm, &sg, 1);
crypto_digest_final(tfm, hash);
crypto_free_tfm(tfm);
return 0;
}
Cryptodev-linux
https://github.com/cryptodev-linux/cryptodev-linux
It is a kernel module that exposes the kernel crypto API to userspace through /dev/crypto .
SHA calculation example: https://github.com/cryptodev-linux/cryptodev-linux/blob/da730106c2558c8e0c8e1b1b1812d32ef9574ab7/examples/sha.c
As others have mentioned, the kernel does not seem to expose the crypto API to userspace itself, which is a shame since the kernel can already use native hardware accelerated crypto functions internally.
Crypto operations cryptodev supports: https://github.com/nmav/cryptodev-linux/blob/383922cabeea7dca354415e8c590f8e932f4d7a8/crypto/cryptodev.h
Crypto operations Linux x86 supports: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/x86/crypto?id=refs/tags/v4.0
The best place to start is Documentation/crytpo in the kernel sources. dm-crypt is one of the many components that probably uses the kernel crypto API and you can refer to it to get an idea about usage.
how to do SHA1 / MD5 and Blowfish / AES within the kernel space only.
Example of hashing data using a two-element scatterlist:
struct crypto_hash *tfm = crypto_alloc_hash("sha1", 0, CRYPTO_ALG_ASYNC);
if (tfm == NULL)
fail;
char *output_buf = kmalloc(crypto_hash_digestsize(tfm), GFP_KERNEL);
if (output_buf == NULL)
fail;
struct scatterlist sg[2];
struct hash_desc desc = {.tfm = tfm};
ret = crypto_hash_init(&desc);
if (ret != 0)
fail;
sg_init_table(sg, ARRAY_SIZE(sg));
sg_set_buf(&sg[0], "Hello", 5);
sg_set_buf(&sg[1], " World", 6);
ret = crypto_hash_digest(&desc, sg, 11, output_buf);
if (ret != 0)
fail;
One critical note:
Never compare the return value of crypto_alloc_hash function to NULL for detecting the failure.
Steps:
Always use IS_ERR function for this purpose. Comparing to NULL does not capture the error, hence you get segmentation faults later on.
If IS_ERR returns fail, you possibly have a missing crypto algorithm compiled into your kernel image (or as a module). Make sure you have selected the appropriate crypto algo. form make menuconfig.

SysInternal's WinObj device listing mechanism

SysInternals's WinObj can list all device objects.
I wonder how it can list the devices.
Is there any open source we can read?(or a code snippet)
What is the most significant function I should know?
WinObj uses the NT system calls NtOpenDirectoryObject and NtQueryDirectoryObject. There is no driver or kernel code needed. You won't see the imports because these NT functions are loaded via LoadLibrary/GetProcAddress.
You don't have to enumerate the entire object namespace. If you're interested in the device objects call NtOpenDirectoryObject with "\Device", then call NtQueryDirectoryObject on the returned handle.
According to SysInternals' web page:
The native NT API provides routines
that allow user-mode programs to
browse the namespace and query the
status of objects located there, but
the interfaces are undocumented.
I've tried looking at WinObj's import table (dumpbin /imports winobj.exe) but there are no obvious suspects :-(
As per the answer from user1575778 you can use NtOpenDirectoryObject and NtQueryDirectoryObject (which from user mode are identical to ZwOpenDirectoryObject and ZwQueryDirectoryObject respectively) to list the objects inside the object manager namespace.
Have a look at objmgr.hpp of NT Objects aka ntobjx, in particular at the class NtObjMgr::Directory (or DirectoryT). It provides the same functionality nicely wrapped into a C++ class. The whole utility is open source under a liberal license (dual-licensed due to WTL-use: MIT and MS-PL), so bits and pieces can be reused however you please, provided you comply with the license terms.
But here's a simple C++ code example catering just your use case:
#include <Windows.h>
#include <tchar.h>
#include <cstdio>
#include <winternl.h>
NTSTATUS (NTAPI* NtOpenDirectoryObject)(PHANDLE, ACCESS_MASK, POBJECT_ATTRIBUTES);
NTSTATUS (NTAPI* NtQueryDirectoryObject)(HANDLE, PVOID, ULONG, BOOLEAN, BOOLEAN, PULONG, PULONG);
VOID (NTAPI* RtlInitUnicodeString_)(PUNICODE_STRING, PCWSTR);
NTSTATUS (NTAPI* NtClose_)(HANDLE);
#define DIRECTORY_QUERY (0x0001)
#define DIRECTORY_TRAVERSE (0x0002)
typedef struct _OBJECT_DIRECTORY_INFORMATION {
UNICODE_STRING Name;
UNICODE_STRING TypeName;
} OBJECT_DIRECTORY_INFORMATION, *POBJECT_DIRECTORY_INFORMATION;
#ifndef STATUS_SUCCESS
#define STATUS_SUCCESS ((NTSTATUS)0x00000000L) // ntsubauth
#endif // STATUS_SUCCESS
#ifndef STATUS_MORE_ENTRIES
#define STATUS_MORE_ENTRIES ((NTSTATUS)0x00000105L)
#endif // STATUS_MORE_ENTRIES
#ifndef STATUS_NO_MORE_ENTRIES
#define STATUS_NO_MORE_ENTRIES ((NTSTATUS)0x8000001AL)
#endif // STATUS_NO_MORE_ENTRIES
int PrintDevices()
{
NTSTATUS ntStatus;
OBJECT_ATTRIBUTES oa;
UNICODE_STRING objname;
HANDLE hDeviceDir = NULL;
RtlInitUnicodeString_(&objname, L"\\Device");
InitializeObjectAttributes(&oa, &objname, 0, NULL, NULL);
ntStatus = NtOpenDirectoryObject(&hDeviceDir, DIRECTORY_QUERY | DIRECTORY_TRAVERSE, &oa);
if(NT_SUCCESS(ntStatus))
{
size_t const bufSize = 0x10000;
BYTE buf[bufSize] = {0};
ULONG start = 0, idx = 0, bytes;
BOOLEAN restart = TRUE;
for(;;)
{
ntStatus = NtQueryDirectoryObject(hDeviceDir, PBYTE(buf), bufSize, FALSE, restart, &idx, &bytes);
if(NT_SUCCESS(ntStatus))
{
POBJECT_DIRECTORY_INFORMATION const pdilist = reinterpret_cast<POBJECT_DIRECTORY_INFORMATION>(PBYTE(buf));
for(ULONG i = 0; i < idx - start; i++)
{
if(0 == wcsncmp(pdilist[i].TypeName.Buffer, L"Device", pdilist[i].TypeName.Length / sizeof(WCHAR)))
{
_tprintf(_T("%s\n"), pdilist[i].Name.Buffer);
}
}
}
if(STATUS_MORE_ENTRIES == ntStatus)
{
start = idx;
restart = FALSE;
continue;
}
if((STATUS_SUCCESS == ntStatus) || (STATUS_NO_MORE_ENTRIES == ntStatus))
{
break;
}
}
(void)NtClose_(hDeviceDir);
return 0;
}
_tprintf(_T("Failed NtOpenDirectoryObject with 0x%08X"), ntStatus);
return 1;
}
int _tmain(int /*argc*/, _TCHAR** /*argv*/)
{
HMODULE hNtDll = ::GetModuleHandle(_T("ntdll.dll"));
*(FARPROC*)&NtOpenDirectoryObject = ::GetProcAddress(hNtDll, "NtOpenDirectoryObject");
*(FARPROC*)&NtQueryDirectoryObject = ::GetProcAddress(hNtDll, "NtQueryDirectoryObject");
*(FARPROC*)&RtlInitUnicodeString_ = ::GetProcAddress(hNtDll, "RtlInitUnicodeString");
*(FARPROC*)&NtClose_ = ::GetProcAddress(hNtDll, "NtClose");
if (!NtOpenDirectoryObject || !NtQueryDirectoryObject || !RtlInitUnicodeString_ || !NtClose_)
{
_tprintf(_T("Failed to retrieve ntdll.dll function pointers\n"));
return 1;
}
return PrintDevices();
}
Some remarks: This will not delve into subdirectories, it will not list any types other than Device and it will not resolve symbolic links, if any. For any of those features, please look at the aforementioned utility's source code and adjust as needed. winternl.h should be available in any recent Windows SDK.
The functions RtlInitUnicodeString_ and NtClose_ have a trailing underscore to avoid clashes with these native API functions, which are declared in winternl.h, but use __declspec(dllimport).
Disclosure: I am the author of ntobjx.
You can use NtOpenDirectoryObject and NtQueryDirectoryObject to enumarate the objects list in a given directory.
To get the details of the object namespace, you must use the Windows NT Undocumented API. That is also used by the WinObj as it is described here that how WinOBj getting the all results..and for those who are saying that we need a driver to do this please, read these lines on given page.
"One obvious way is to use a driver – in kernel mode everything is accessible – so the client app can get the required information by communicating with its own driver. WinObj does not use a driver, however (this is one reason it’s able to execute without admin privileges, although with admin privileges it shows all objects as opposed to partial results)."
You can start with SetupDiCreateDeviceInfoList and use other related functions to enumerate all the devices. This stuff is painful to use.

Resources