Using Windows fibers in a simple way, but unexplainable bugs occur


I was experimenting with Windows fibers, implementing my own task scheduler, when some odd crashes and undefined behavior appeared.
To keep things simple, I started a new project and wrote a small program that performs the following operations:
1. The main thread creates a bunch of fibers, then launches two threads.
2. The main thread waits until you kill the program.
3. Each worker thread converts itself into a fiber.
4. Each worker thread tries to find a free fiber, then switches to it.
5. Once a thread has switched to a new fiber, it pushes its previous fiber into the free-fibers container.
6. Each worker thread goes back to step 4.
If you are not familiar with the fiber concept, this talk is a good start.
The Data
Each thread has its own ThreadData structure, which stores its previous and current fiber instances and its thread index.
I tried several ways to retrieve the ThreadData structure during execution:
- I used thread-local storage to store a ThreadData pointer.
- I used a container that associates a thread_id with a ThreadData structure.
The Problem
When a fiber is entered for the first time (see the FiberFunc function), the thread using this fiber must push its previous fiber into the free-fibers container. But sometimes the previous fiber is null, which should be impossible.
It should be impossible because, before switching to a new fiber, the thread sets its previous fiber to its current fiber (and then sets its current fiber to the new fiber).
So if a thread enters a brand-new fiber with its previous fiber set to null, it would mean it came from nowhere, which doesn't make any sense.
The only ways a ThreadData could have a null previous fiber when entering a brand-new fiber are that another thread set it to null or that the compiler reordered instructions under the hood.
I checked the assembly, and the compiler does not seem to be responsible.
There are several bugs I can't explain:
- If I use the first GetThreadData() function to retrieve the ThreadData structure, I can get an instance whose index differs from the thread-local index (both indices are set when the threads start). This makes the program assert (assert(threadData->index == localThreadIndex)).
- If I use any other function to retrieve the ThreadData structure, the assert in FiberFunc fires because the previous fiber is null (assert(threadData->previousFiber)).
Do you have any idea why this code doesn't work? I have spent countless hours trying to figure out what is wrong, but I don't see my mistake.
Specification
OS: Windows 10
IDE: Visual Studio 2015 and Visual Studio 2017
Compiler: VC++
Configuration: Release
Note that the bug does not occur in the Debug configuration.
The Code
You may need to run it several times before the assert fires.
#include "Windows.h"
#include <vector>
#include <thread>
#include <mutex>
#include <cassert>
#include <iostream>
#include <atomic>
struct Fiber
{
void* handle;
};
struct ThreadData
{
Fiber* previousFiber{ nullptr };
Fiber* currentFiber{ nullptr };
Fiber fiber{ };
unsigned int index{};
};
//Threads
std::vector<std::pair<std::thread::id, unsigned int>> threadsinfo{};
//threads data container
ThreadData threadsData[8];
//Fibers
std::mutex fibersLock{};
std::vector<Fiber> fibers{};
std::vector<Fiber*> freeFibers{};
thread_local unsigned int localThreadIndex{};
thread_local Fiber* debug_localTheadLastFiber{};
thread_local ThreadData* localThreadData{};
using WindowsThread = HANDLE;
std::vector<WindowsThread> threads{};
//This is the first way to retrieve the current thread's ThreadData structure using thread_id
//ThreadData* GetThreadData()
//{
// std::thread::id threadId( std::this_thread::get_id());
// for (auto const& pair : threadsinfo)
// {
// if (pair.first == threadId)
// {
// return &threadsData[pair.second];
// }
// }
//
// //It is not possible to assert
// assert(false);
// return nullptr;
//}
//This is the second way to retrieve the current thread's ThreadData structure using thread local storage
//ThreadData* GetThreadData()
//{
// return &threadsData[localThreadIndex];
//}
//This is the third way to retrieve the current thread's ThreadData structure using thread local storage
ThreadData* GetThreadData()
{
return localThreadData;
}
//Try to pop a free fiber from the container, thread safe due to mutex usage
bool TryPopFreeFiber(Fiber*& fiber)
{
std::lock_guard<std::mutex> guard(fibersLock);
if (freeFibers.empty()) { return false; }
fiber = freeFibers.back();
assert(fiber);
assert(fiber->handle);
freeFibers.pop_back();
return true;
}
//Try to push a free fiber to the container, thread safe due to mutex usage
bool PushFreeFiber(Fiber* fiber)
{
std::lock_guard<std::mutex> guard(fibersLock);
freeFibers.push_back(fiber);
return true;
}
//the __declspec(noinline) is used to inspect code in release mode, comment it if you want
__declspec(noinline) void _SwitchToFiber(Fiber* newFiber)
{
//You want to switch to another fiber
//You first have to save your current fiber instance to release it once you will be in the new fiber
{
ThreadData* threadData{ GetThreadData() };
assert(threadData->index == localThreadIndex);
assert(threadData->currentFiber);
threadData->previousFiber = threadData->currentFiber;
threadData->currentFiber = newFiber;
debug_localTheadLastFiber = threadData->previousFiber;
assert(threadData->previousFiber);
assert(newFiber);
assert(newFiber->handle);
}
//You switch to the new fiber
//this call will either make you enter in the FiberFunc function if the fiber has never been used
//Or you will continue to execute this function if the new fiber has already been used (note that you will have a different stack so you can't use the old threadData value)
::SwitchToFiber(newFiber->handle);
{
//You must get the current ThreadData* again, because you come from another fiber (the previous statement is a switch), this fiber could have been used by any other thread
ThreadData* threadData{ GetThreadData() };
//Check the pointer before dereferencing it
assert(threadData);
//THIS ASSERT FIRES IF YOU USE THE FIRST GetThreadData METHOD, WHICH SHOULD BE IMPOSSIBLE....
assert(threadData->index == localThreadIndex);
assert(threadData->previousFiber);
//We release the previous fiber
PushFreeFiber(threadData->previousFiber);
debug_localTheadLastFiber = nullptr;
threadData->previousFiber = nullptr;
}
}
void ExecuteThreadBody()
{
Fiber* newFiber{};
if (TryPopFreeFiber(newFiber))
{
_SwitchToFiber(newFiber);
}
}
DWORD __stdcall ThreadFunc(void* data)
{
int const index{ *static_cast<int*>(data)};
threadsinfo[index] = std::make_pair(std::this_thread::get_id(), index);
//setting up the current thread data
ThreadData* threadData{ &threadsData[index] };
threadData->index = index;
void* threadAsFiber{ ConvertThreadToFiber(nullptr) };
assert(threadAsFiber);
threadData->fiber = Fiber{ threadAsFiber };
threadData->currentFiber = &threadData->fiber;
localThreadData = threadData;
localThreadIndex = index;
while (true)
{
ExecuteThreadBody();
}
return DWORD{};
}
//The entry point of all fibers
void __stdcall FiberFunc(void* data)
{
//You enter to the fiber for the first time
ThreadData* threadData{ GetThreadData() };
//Making sure that the thread data structure is the good one
assert(threadData->index == localThreadIndex);
//Here you will assert
assert(threadData->previousFiber);
PushFreeFiber(threadData->previousFiber);
threadData->previousFiber = nullptr;
while (true)
{
ExecuteThreadBody();
}
}
__declspec(noinline) int main()
{
constexpr unsigned int threadCount{ 2 };
constexpr unsigned int fiberCount{ 20 };
threadsinfo.resize(threadCount);
fibers.resize(fiberCount);
for (auto index = 0; index < fiberCount; ++index)
{
fibers[index] = { CreateFiber(0, FiberFunc, nullptr) };
}
freeFibers.resize(fiberCount);
for (auto index = 0; index < fiberCount; ++index)
{
freeFibers[index] = std::addressof(fibers[index]);
}
threads.resize(threadCount);
std::vector<int> threadParamss(threadCount);
for (auto index = 0; index < threadCount; ++index)
{
//threads[index] = new std::thread{ ThreadFunc, index };
threadParamss[index] = index;
threads[index] = CreateThread(NULL, 0, &ThreadFunc, &threadParamss[index], 0, NULL);
assert(threads[index]);
}
while (true);
//I know, it is not clean, it will leak
}

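Well, several months later, I figured out that the variables declared as thread_local were the culprits. If you use fibers, forget about thread_local variables and use the per-fiber memory you allocate when you create the fibers.
I now store my current thread index in the per-fiber structure instance.

You need to compile with the /GT option (fiber-safe thread-local storage) if you want to use thread-local storage with fibers. Without it, the optimizer is allowed to cache the address of a thread's TLS data across a function call, so a fiber that is switched out and resumed on a different thread keeps using the previous thread's copy. That also fits the observation that the bug only shows up in Release builds.
https://learn.microsoft.com/en-us/cpp/build/reference/gt-support-fiber-safe-thread-local-storage?view=msvc-170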

Related

Linux device driver for a Smart Card IC module

I have a smart card IC module, and I want to create a Linux device driver for it. The module uses SPI as the control line and has an interrupt line to indicate whether a card is ready. I know how to create an SPI device in the Linux kernel and how to read data in the kernel when the interrupt happens. But I have no idea how to transfer the data to user space (maybe I need to create a device node for it), or how to notify user space when the interrupt occurs. Does anyone have any suggestions?

One way to go about this is to create a device node and have the interested process open that device and receive asynchronous notifications from the device driver using fasync.
Once you have the notification in user space, you can notify other interested processes by any means you see fit.
Here is a small, trimmed-down example illustrating this feature.
On the driver side
/* Appropriate headers */
static int myfasync(int fd, struct file *fp, int on);
static struct fasync_struct *fasyncQueue;
static struct file_operations fops =
{
.open = charDriverOpen,
.release = charDriverClose,
.read = charDriverRead,
.write = charDriverWrite,
.unlocked_ioctl = charDriverCtrl,
// This will be called when the FASYNC flag is set
.fasync = myfasync,
};
static int __init charDriverEntry(void)
{
// Appropriate init for the driver
// Nothing specific needs to be done here with respect to
// fasync feature.
return 0;
}
static int myfasync(int fd, struct file *fp, int on)
{
// Add the file pointed to by fp to (on != 0), or remove it from (on == 0),
// the list of files to be notified when an event occurs
return fasync_helper(fd, fp, on, &fasyncQueue);
}
// Now to the part where we want to notify the processes listed
// in fasyncQueue when something happens. Here in this example I had
// implemented the timer. Not getting in to the details of timer func
// here
static void send_signal_timerfn(unsigned long data)
{
...
printk(KERN_INFO "timer expired \n");
kill_fasync(&fasyncQueue, SIGIO, POLL_OUT);
...
}
On the user-space process side
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
void my_notifier(int signo, siginfo_t *sigInfo, void *data)
{
printf("Signal received from the driver expected %d got %d \n",SIGIO,signo);
}
int main()
{
struct sigaction signalInfo;
int flagInfo;
signalInfo.sa_sigaction = my_notifier;
signalInfo.sa_flags = SA_SIGINFO;
sigemptyset(&signalInfo.sa_mask);
sigaction(SIGIO, &signalInfo, NULL);
int fp;
fp = open("/dev/myCharDevice",O_RDWR);
if (fp<0) {
printf("Failed to open\n");
return 1;
}
/* Now we take ownership of the device so that we can get the signal from it */
// Make this process the owner of the file descriptor
fcntl(fp, F_SETOWN, getpid());
flagInfo = fcntl(fp, F_GETFL);
// Setting the FASYNC flag triggers the driver's fasync file operation
fcntl(fp, F_SETFL, flagInfo|FASYNC);
...
}
Hope this clears things up.
For more detailed reading I suggest you read this

Efficient message factory and handler in C++

Our company is rewriting most of its legacy C code in C++11 (which also means I am a C programmer learning C++). I need advice on message handlers.
We have a distributed system: a server process sends a packed message over TCP to a client process.
In C code this was being done:
- parse message based on type and subtype, which are always the first 2 fields
- call a handler as handler[type](Message *msg)
- handler creates temporary struct say, tmp_struct to hold the parsed values and ..
- calls subhandler[type][subtype](tmp_struct)
There is only one handler per type/subtype.
We are moving to C++11 and a multi-threaded environment. The basic idea I had was to:
1) Register a processor object for each type/subtype combination. This is
actually a vector of vectors -
vector< vector >
class MsgProcessor {
public:
// Factory function
virtual Message *create();
virtual void Handler(Message *msg);
};
This will be inherited by different message processors
class AMsgProcessor : public MsgProcessor {
public:
Message *create() override;
void Handler(Message *msg) override;
};
2) Get the processor using a lookup into the vector of vectors.
Get the message using the overridden create() factory function,
so that we can keep the actual message and the parsed values inside the message.
3) Now a bit of a hack: this message should be sent to other threads for the heavy processing. To avoid having to look up in the vector again, I added a pointer to the processor inside the message.
class Message {
const MsgProcessor *proc; // set to processor,
// which we got from the first lookup
// to get factory function.
};
So other threads will just do
msg->proc->Handler(msg);
This looks bad, but the hope is that it will help to separate the message handler from the factory. This is for the case where multiple type/subtype combinations want to create the same Message but handle it differently.
I was searching about this and came across:
http://www.drdobbs.com/cpp/message-handling-without-dependencies/184429055?pgno=1
It provides a way to completely separate the message from the handler. But I was wondering whether my simple scheme above would be considered an acceptable design. Or is this the wrong way of achieving what I want?
Efficiency, as in speed, is the most important requirement for this application. We are already doing a couple of memory jumps: 2 vectors plus a virtual function call to create the message. There are 2 dereferences to get to the handler, which I guess is not good from a caching point of view.

Though your requirement is unclear, I think I have a design that might be what you are looking for.
Check out http://coliru.stacked-crooked.com/a/f7f9d5e7d57e6261 for the full example.
It has the following components:
- IMessageProcessor: an interface class for message processors.
- Message: a base class representing a message.
- Registrator: a registration class, essentially a singleton, that stores the message processor corresponding to each (Type, Subtype) pair. It stores the mapping in an unordered_map. You can also tweak it a bit for better performance. All the exposed APIs of Registrator are protected by a std::mutex.
- AMsgProcessor and BMsgProcessor: concrete implementations of IMessageProcessor.
- simulate: a function that shows how it all fits together.
Pasting the code here as well:
/*
* http://stackoverflow.com/questions/40230555/efficient-message-factory-and-handler-in-c
*/
#include <iostream>
#include <vector>
#include <tuple>
#include <mutex>
#include <memory>
#include <cassert>
#include <unordered_map>
class Message;
class IMessageProcessor
{
public:
virtual Message* create() = 0;
virtual void handle_message(Message*) = 0;
virtual ~IMessageProcessor() {};
};
/*
* Base message class
*/
class Message
{
public:
virtual void populate() = 0;
virtual ~Message() {};
};
using Type = int;
using SubType = int;
using TypeCombo = std::pair<Type, SubType>;
using IMsgProcUptr = std::unique_ptr<IMessageProcessor>;
/*
* Registrator class maintains all the registrations in an
* unordered_map.
* This class owns the MessageProcessor instance inside the
* unordered_map.
*/
class Registrator
{
public:
static Registrator* instance();
// Disable other types of construction
Registrator(const Registrator&) = delete;
void operator=(const Registrator&) = delete;
public:
// TypeCombo assumed to be cheap to copy
template <typename ProcT, typename... Args>
std::pair<bool, IMsgProcUptr> register_proc(TypeCombo typ, Args&&... args)
{
// Note: std::make_unique requires C++14
IMsgProcUptr proc = std::make_unique<ProcT>(std::forward<Args>(args)...);
std::lock_guard<std::mutex> _(lock_);
if (registrations_.find(typ) != registrations_.end()) {
// Already registered: return the heap allocated instance
// back to the caller, who now owns the Processor.
// (Moving proc into insert() first would destroy it on failure.)
return std::make_pair(false, std::move(proc));
}
registrations_.emplace(typ, std::move(proc));
return std::make_pair(true, IMsgProcUptr{});
}
// Get the processor corresponding to TypeCombo
// IMessageProcessor passed is non-owning pointer
// i.e the caller SHOULD not delete it or own it
std::pair<bool, IMessageProcessor*> processor(TypeCombo typ)
{
std::lock_guard<std::mutex> _(lock_);
auto fitr = registrations_.find(typ);
if (fitr == registrations_.end()) {
return std::make_pair(false, nullptr);
}
return std::make_pair(true, fitr->second.get());
}
// TypeCombo assumed to be cheap to copy
bool is_type_used(TypeCombo typ)
{
std::lock_guard<std::mutex> _(lock_);
return registrations_.find(typ) != registrations_.end();
}
bool deregister_proc(TypeCombo typ)
{
std::lock_guard<std::mutex> _(lock_);
return registrations_.erase(typ) == 1;
}
private:
Registrator() = default;
private:
std::mutex lock_;
/*
* Should be replaced with a concurrent map if at all this
* data structure is the main contention point (which I find
* very unlikely).
*/
struct HashTypeCombo
{
public:
std::size_t operator()(const TypeCombo& typ) const noexcept
{
return std::hash<decltype(typ.first)>()(typ.first) ^
std::hash<decltype(typ.second)>()(typ.second);
}
};
std::unordered_map<TypeCombo, IMsgProcUptr, HashTypeCombo> registrations_;
};
Registrator* Registrator::instance()
{
static Registrator inst;
return &inst;
/*
* OR some other DCLP based instance creation
* if lifetime or creation of static is an issue
*/
}
// Define some message processors
class AMsgProcessor final : public IMessageProcessor
{
public:
class AMsg final : public Message
{
public:
void populate() override {
std::cout << "Working on AMsg\n";
}
AMsg() = default;
~AMsg() = default;
};
Message* create() override
{
std::unique_ptr<AMsg> ptr(new AMsg);
return ptr.release();
}
void handle_message(Message* msg) override
{
assert (msg);
auto my_msg = static_cast<AMsg*>(msg);
//.... process my_msg ?
//.. probably being called in some other thread
// Who owns the msg ??
(void)my_msg; // only for suppressing warning
delete my_msg;
return;
}
~AMsgProcessor();
};
AMsgProcessor::~AMsgProcessor()
{
}
class BMsgProcessor final : public IMessageProcessor
{
public:
class BMsg final : public Message
{
public:
void populate() override {
std::cout << "Working on BMsg\n";
}
BMsg() = default;
~BMsg() = default;
};
Message* create() override
{
std::unique_ptr<BMsg> ptr(new BMsg);
return ptr.release();
}
void handle_message(Message* msg) override
{
assert (msg);
auto my_msg = static_cast<BMsg*>(msg);
//.... process my_msg ?
//.. probably being called in some other thread
//Who owns the msg ??
(void)my_msg; // only for suppressing warning
delete my_msg;
return;
}
~BMsgProcessor();
};
BMsgProcessor::~BMsgProcessor()
{
}
TypeCombo read_from_network()
{
return {1, 2};
}
struct ParsedData {
};
Message* populate_message(Message* msg, ParsedData& pdata)
{
// Do something with the message
// Calling a dummy populate method now
msg->populate();
(void)pdata;
return msg;
}
void simulate()
{
TypeCombo typ = read_from_network();
bool ok;
IMessageProcessor* proc = nullptr;
std::tie(ok, proc) = Registrator::instance()->processor(typ);
if (!ok) {
std::cerr << "FATAL!!!" << std::endl;
return;
}
ParsedData parsed_data;
//..... populate parsed_data here ....
proc->handle_message(populate_message(proc->create(), parsed_data));
return;
}
int main() {
/*
* TODO: Not making use or checking the return types after calling register
* its a must in production code!!
*/
// Register AMsgProcessor
Registrator::instance()->register_proc<AMsgProcessor>(std::make_pair(1, 1));
Registrator::instance()->register_proc<BMsgProcessor>(std::make_pair(1, 2));
simulate();
return 0;
}
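One possible tweak to the listing above, offered as a sketch rather than as part of the original answer: the XOR in HashTypeCombo is symmetric, so (1, 2) and (2, 1) always hash to the same bucket, and any pair whose type equals its subtype hashes to 0. The hypothetical PackedTypeComboHash below packs both values into one 64-bit key before hashing. It assumes that Type and SubType are ints of at most 32 bits, as in the aliases above.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>

using Type = int;
using SubType = int;
using TypeCombo = std::pair<Type, SubType>;

// Hypothetical replacement for HashTypeCombo (name is illustrative only).
struct PackedTypeComboHash
{
    std::size_t operator()(const TypeCombo& typ) const noexcept
    {
        // Assumption: both values fit in 32 bits, so they can share one
        // 64-bit key without overlapping. The order of the pair now matters.
        const std::uint64_t hi = static_cast<std::uint32_t>(typ.first);
        const std::uint64_t lo = static_cast<std::uint32_t>(typ.second);
        return std::hash<std::uint64_t>()((hi << 32) | lo);
    }
};
```

It can be dropped in as the third template argument of the unordered_map in place of HashTypeCombo.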
UPDATE 1
The major source of confusion here seems to be that the architecture of the event system is unknown.
Any self-respecting event system architecture would look something like this:
- A pool of threads polling on the socket descriptors.
- A pool of threads for handling timer-related events.
- A comparatively small number (depending on the application) of threads to do long blocking jobs.
So, in your case:
- You will get the network event on the thread doing epoll_wait, select, or poll.
- Read the packet completely and get the processor using the Registrator::processor call.
NOTE: the processor call can be made without any locking if one can guarantee that the underlying unordered_map does not get modified, i.e. no new inserts are made once we start receiving events.
- Using the obtained processor, create the Message and populate it.
- Now, this is the part where I am not sure what you want. At this point we have the processor, on which you can call handle_message either from the current thread (i.e. the thread doing epoll_wait) or from another thread, by posting the job (Processor and Message) to that thread's receiving queue.
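The last option, posting the (Processor, Message) pair to another thread's queue, could be sketched roughly as follows. This is only an illustration under assumptions: Job, JobQueue and CountingProcessor are hypothetical names, the Message and IMessageProcessor types are cut-down stand-ins for the ones in the listing above, and handle_message takes a std::unique_ptr here so that the "who owns the msg" question from the comments has an explicit answer.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>
#include <utility>

// Cut-down stand-ins for the answer's types (assumption: ownership of the
// message is passed to the handler through a unique_ptr).
struct Message { virtual ~Message() = default; };
struct IMessageProcessor {
    virtual void handle_message(std::unique_ptr<Message> msg) = 0;
    virtual ~IMessageProcessor() = default;
};

// One unit of work: which processor should handle which message.
struct Job {
    IMessageProcessor* proc;       // non-owning, the registry keeps it alive
    std::unique_ptr<Message> msg;  // owned by the job until it is handled
};

class JobQueue {
public:
    // Called by the network thread after it has populated the message.
    void push(Job job) {
        assert(job.proc && job.msg);
        {
            std::lock_guard<std::mutex> guard(lock_);
            jobs_.push_back(std::move(job));
        }
        ready_.notify_one();
    }
    // Called by a worker thread: blocks until a job exists, then runs it.
    void run_one() {
        Job job = pop();
        job.proc->handle_message(std::move(job.msg));
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> guard(lock_);
        return jobs_.size();
    }
private:
    Job pop() {
        std::unique_lock<std::mutex> guard(lock_);
        ready_.wait(guard, [this] { return !jobs_.empty(); });
        Job job = std::move(jobs_.front());
        jobs_.pop_front();
        return job;
    }
    mutable std::mutex lock_;
    std::condition_variable ready_;
    std::deque<Job> jobs_;
};

// Tiny processor used only to demonstrate the queue.
struct CountingProcessor : IMessageProcessor {
    int handled = 0;
    void handle_message(std::unique_ptr<Message> msg) override {
        if (msg) ++handled;  // msg is destroyed when it goes out of scope
    }
};
```

A worker thread would simply call run_one() in a loop. Because the job already carries the processor pointer, the worker never touches the registry, which is the same effect the question was after with the proc pointer stored inside Message.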

How to detect application terminate in kernel extension, Mac OS X

I am looking for a way to detect an application quitting (e.g. cmd-q) in kernel space, for processing in a network kernel extension.
More precisely:
While a process (e.g. ping in a terminal) is held in an IOLockSleep(... THREAD_ABORTSAFE), ctrl-c is able to release the lock.
Querying proc_issignal() then reports sigmask(SIGINT).
Now I am looking for a way to detect another process quitting, e.g. Firefox (menu bar: quit application, cmd-q).
Here is what I tried:
#define FLAG(X) ((dispatch_source_get_data(src) & DISPATCH_PROC_##X) ? #X" " : "")
struct ProcessInfo {
int pid;
dispatch_source_t source;
};
// function called back on event
void process_termination_event(struct ProcessInfo* procinfo) {
dispatch_source_t src = procinfo->source;
printf("process_termination_event: %d \n", procinfo->pid);
printf("flags: %s%s\n", FLAG(EXIT), FLAG(SIGNAL));
dispatch_source_cancel(procinfo->source);
}
// function called back when the dispatch source is cancelled
void process_termination_finalize(struct ProcessInfo* procinfo) {
printf("process_termination_finalize: %d \n", procinfo->pid);
dispatch_release(procinfo->source);
}
// Monitor a process by pid, for termination
void MonitorTermination(int pid) {
struct ProcessInfo* procinfo = (struct ProcessInfo*)malloc(sizeof(struct ProcessInfo));
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_source_t dsp = dispatch_source_create(DISPATCH_SOURCE_TYPE_PROC, pid, DISPATCH_PROC_EXIT|DISPATCH_PROC_SIGNAL, queue);
procinfo->pid = pid;
procinfo->source = dsp;
dispatch_source_set_event_handler_f(procinfo->source, (dispatch_function_t)process_termination_event);
dispatch_source_set_cancel_handler_f(procinfo->source, (dispatch_function_t)process_termination_finalize);
dispatch_set_context(procinfo->source, procinfo);
dispatch_resume(procinfo->source);
}
int main(int argc, const char * argv[])
{
for (int i = 1; i < argc; ++i) { // argv[0] is the program name, not a pid
pid_t pid = atoi(argv[i]);
printf("MonitorTermination: %d\n", pid);
fflush(stdout);
MonitorTermination(pid);
}
CFRunLoopRun();
return 0;
}
process_termination_event is not invoked after cmd-q as described above, not even after a force quit.
The process itself is held in a loop within the network kernel extension function:
errno_t KEXT::data_out(void *cookie, socket_t so, const struct sockaddr *to, mbuf_t *data, mbuf_t *control, sflt_data_flag_t flags)
{
// at this point I would like to detect the app quit/termination signal.
while(PROCESS_IS_NOT_TERMINATING); // <-- pseudo code, actually held with IOLockSleep...
return 0;
}
I would really appreciate any help! Thanks in advance.

It may not be the way you've been thinking, but if you're in kernel space, then I assume you're writing a kernel extension (kext). With a kernel extension, you can monitor vnodes for executing applications. You may be able to use the file operation scope instead.
In conjunction with a user-level application (daemon), the kext notifies the daemon that a process has begun execution, and the daemon then monitors the termination of the launched application using Grand Central Dispatch functions. If required, the user-level application can notify the kext of the terminated app.
To monitor the termination from a user-level application, you can do something like this when you're notified of an application being executed:
// pid and path provided from vNode scope kext...
void UserLevelApp::MonitorProcessTermination(int pid, const QString &path)
{
ProcessInfo* procinfo = new ProcessInfo;
procinfo->pid = pid;
procinfo->path = path;
procinfo->context = this;
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_source_t dsp = dispatch_source_create(DISPATCH_SOURCE_TYPE_PROC, pid, DISPATCH_PROC_EXIT, queue);
dispatch_source_set_event_handler_f(dsp, (dispatch_function_t)process_termination_event);
dispatch_source_set_cancel_handler_f(dsp, (dispatch_function_t)process_termination_finalize);
procinfo->source = dsp;
dispatch_set_context(dsp, procinfo);
dispatch_resume(dsp);
}
// app terminated call-back function
void UserLevelApp::process_termination_event(struct ProcessInfo* procinfo)
{
dispatch_source_cancel(procinfo->source);
// example of how to use the context to call a class function
procinfo->context->SomeClassFunction(procinfo->pid, procinfo->path);
qDebug("App Terminated: %d, %s\n", procinfo->pid, procinfo->path.toUtf8().data());
}
// finalize callback function
void UserLevelApp::process_termination_finalize(struct ProcessInfo* procinfo)
{
dispatch_release(procinfo->source);
delete procinfo;
}
So each launched application reported by the kext has an event handler associated with it, and when the application terminates, you get called back in the registered functions process_termination_event and process_termination_finalize.
Whilst this method requires a user-level daemon alongside the kext, that's not such a bad thing from a security and stability point of view.

libwebsockets write to all active connections after receive

I am toying around with a libwebsockets tutorial, trying to make it so that, after it receives a message from a connection over a given protocol, it sends a response to all active connections implementing that protocol. I have used the function libwebsocket_callback_all_protocol, but it is not doing what its name suggests (I'm not quite sure what it does from the documentation).
The goal is to have two webpages open and, when info is sent from one, have the result relayed to both. Below is my code. You'll see that libwebsocket_callback_all_protocol is called in main (which currently does nothing, I think):
#include <stdio.h>
#include <stdlib.h>
#include <libwebsockets.h>
#include <string.h>
static int callback_http(struct libwebsocket_context * this,
struct libwebsocket *wsi,
enum libwebsocket_callback_reasons reason, void *user,
void *in, size_t len)
{
return 0;
}
static int callback_dumb_increment(struct libwebsocket_context * this,
struct libwebsocket *wsi,
enum libwebsocket_callback_reasons reason,
void *user, void *in, size_t len)
{
switch (reason) {
case LWS_CALLBACK_ESTABLISHED: // just log message that someone is connecting
printf("connection established\n");
break;
case LWS_CALLBACK_RECEIVE: { // the funny part
// create a buffer to hold our response
// it has to have some pre and post padding. You don't need to care
// what comes there, libwebsockets will do everything for you. For more info see
// http://git.warmcat.com/cgi-bin/cgit/libwebsockets/tree/lib/libwebsockets.h#n597
unsigned char *buf = (unsigned char*) malloc(LWS_SEND_BUFFER_PRE_PADDING + len +
LWS_SEND_BUFFER_POST_PADDING);
int i;
// pointer to `void *in` holds the incoming request
// we're just going to put it in reverse order and put it in `buf` with
// correct offset. `len` holds length of the request.
for (i=0; i < len; i++) {
buf[LWS_SEND_BUFFER_PRE_PADDING + (len - 1) - i ] = ((char *) in)[i];
}
// log what we received and what we're going to send as a response.
// that disco syntax `%.*s` is used to print just a part of our buffer
// (also used for `in`, which is not guaranteed to be NUL-terminated)
// http://stackoverflow.com/questions/5189071/print-part-of-char-array
printf("received data: %.*s, replying: %.*s\n", (int) len, (char *) in, (int) len,
buf + LWS_SEND_BUFFER_PRE_PADDING);
// send response
// just notice that we have to tell where exactly our response starts. That's
// why there's `buf[LWS_SEND_BUFFER_PRE_PADDING]` and how long it is.
// we know that our response has the same length as request because
// it's the same message in reverse order.
libwebsocket_write(wsi, &buf[LWS_SEND_BUFFER_PRE_PADDING], len, LWS_WRITE_TEXT);
// release memory back into the wild
free(buf);
break;
}
default:
break;
}
return 0;
}
static struct libwebsocket_protocols protocols[] = {
/* first protocol must always be HTTP handler */
{
"http-only", // name
callback_http, // callback
0, // per_session_data_size
0
},
{
"dumb-increment-protocol", // protocol name - very important!
callback_dumb_increment, // callback
0, // we don't use any per session data
0
},
{
NULL, NULL, 0, 0 /* End of list */
}
};
int main(void) {
// server url will be http://localhost:9000
int port = 9000;
const char *interface = NULL;
struct libwebsocket_context *context;
// we're not using ssl
const char *cert_path = NULL;
const char *key_path = NULL;
// no special options
int opts = 0;
// create libwebsocket context representing this server
struct lws_context_creation_info info;
memset(&info, 0, sizeof info);
info.port = port;
info.iface = interface;
info.protocols = protocols;
info.extensions = libwebsocket_get_internal_extensions();
info.ssl_cert_filepath = cert_path;
info.ssl_private_key_filepath = key_path;
info.gid = -1;
info.uid = -1;
info.options = opts;
info.user = NULL;
info.ka_time = 0;
info.ka_probes = 0;
info.ka_interval = 0;
/*context = libwebsocket_create_context(port, interface, protocols,
libwebsocket_get_internal_extensions,
cert_path, key_path, -1, -1, opts);
*/
context = libwebsocket_create_context(&info);
if (context == NULL) {
fprintf(stderr, "libwebsocket init failed\n");
return -1;
}
libwebsocket_callback_all_protocol(&protocols[1], LWS_CALLBACK_RECEIVE);
printf("starting server...\n");
// infinite loop, to end this server send SIGTERM. (CTRL+C)
while (1) {
libwebsocket_service(context, 50);
// libwebsocket_service will process all waiting events with their
// callback functions and then wait 50 ms.
// (this is a single threaded webserver and this will keep our server
// from generating load while there are not requests to process)
}
libwebsocket_context_destroy(context);
return 0;
}

I had the same problem: calling libwebsocket_write in LWS_CALLBACK_ESTABLISHED caused random segfaults. On the mailing list, the libwebsockets developer Andy Green told me that the correct way is to use libwebsocket_callback_on_writable_all_protocol; the file test-server/test-server.c in the library source code shows how it is used.
libwebsocket_callback_on_writable_all_protocol(libwebsockets_get_protocol(wsi))
It worked very well for notifying all instances, but it only triggers the writable callback on all connected instances; it does not define the data to send. You need to manage the data yourself. The sample source file test-server.c uses a ring buffer to do that.
http://ml.libwebsockets.org/pipermail/libwebsockets/2015-January/001580.html
Hope it helps.
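To make the "manage the data yourself" part concrete, here is a minimal sketch of a shared ring buffer with one read cursor per connection. It is written in plain C++ without any libwebsockets calls, and it is not the code from test-server.c: BroadcastRing and its member names are hypothetical. The assumed wiring is that the receive callback calls publish() and then libwebsocket_callback_on_writable_all_protocol, and that each connection keeps its cursor in its per-session data and calls next() from its writeable callback.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Fixed-size ring of recent messages shared by every connection.
class BroadcastRing {
public:
    explicit BroadcastRing(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    // Store a message for all connections (call from the receive callback).
    void publish(std::string msg) {
        slots_[head_ % slots_.size()] = std::move(msg);
        ++head_;
    }

    // Cursor value a newly established connection should start from.
    std::size_t head() const { return head_; }

    // Fetch the next pending message for one connection (call from the
    // writeable callback). Returns false when the connection is up to date.
    bool next(std::size_t& cursor, std::string& out) const {
        if (cursor == head_) return false;
        // A connection that fell more than one ring behind has lost the
        // overwritten messages and resumes at the oldest one still stored.
        if (head_ - cursor > slots_.size()) cursor = head_ - slots_.size();
        out = slots_[cursor % slots_.size()];
        ++cursor;
        return true;
    }

private:
    std::vector<std::string> slots_;
    std::size_t head_ = 0;  // total number of messages published so far
};
```

If next() returns true, the connection would copy the message into a buffer with the required pre and post padding, write it, and ask for another writeable callback in case more messages are pending.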

From what I can quickly gather from the documentation, in order to send a message to all clients, you should store somewhere (in a vector, a hashmap, an array, whatever) the struct libwebsocket *wsi that you have access to when your clients connect.
Then, when you receive a message and want to broadcast it, simply call libwebsocket_write on all wsi instances.
That's what I'd do, anyway.

Track the connection count of a boost signal

What I'm trying to achieve is to get an update whenever the number of connections to a boost::signals2::signal object changes.
To give you the whole picture: I'm writing a GUI application which displays data from a remote server. Each "window" in the application should get its data for a specific dataset. If a dataset is to be displayed, it needs to be remotely subscribed from the server. Multiple windows can display the same dataset (with different ordering or filtering). My goal is to subscribe to a specific dataset only ONCE and to unsubscribe once it is no longer needed.
Background: HFT software displaying market data (order books, trades, ...)
My code so far: I got stuck once I tried to implement the "operator()".
enum UpdateCountMethod {
UP = 1,
DOWN = -1
};
/**
* \brief Connection class which holds a Slot as long as an instance of this class "survives".
*/
class Connection {
public:
Connection(const boost::function<void (int)> updateFunc, const boost::signals2::connection conn) : update(updateFunc), connection(conn) {
update(UP); //Increase counter only. Connection was already made.
}
~Connection() {
update(DOWN); //Decrease counter before disconnecting the slot.
connection.disconnect();
}
private:
const boost::function<void(int)> update; // Functor for updating the connection count.
const boost::signals2::connection connection; // Actual boost connection this object belongs to.
};
/**
* \brief This is a Signal/Slot "container" which number of connections can be tracked.
*/
template<typename Signature>
class ObservableSignal{
typedef typename boost::signals2::slot<Signature> slot_type;
public:
ObservableSignal() : count(0) {}
boost::shared_ptr<Connection> connect(const slot_type &t) {
// Create the boost signal connection and return our shared Connection object.
boost::signals2::connection conn = signal.connect(t);
return boost::shared_ptr<Connection>(new Connection(boost::bind(&ObservableSignal::updateCount, this, _1), conn));
}
// This is where I don't know anymore.
void operator() (/* Parameter depend on "Signature" */) {
signal(/* Parameter depend on "Signature" */); //Call the actual boost signal
}
private:
void updateCount(int updown) {
// TODO: Handle subscription if count is leaving or approaching 0.
count += updown;
std::cout << "Count: " << count << std::endl;
}
int count; // current count of connections to this signal
boost::signals2::signal<Signature> signal; // Actual boost signal
};
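One common way to make operator() depend on Signature is to partially specialize the class on R(Args...), so the argument types are available as a parameter pack. The sketch below shows only that technique and deliberately leaves Boost out so that it is self-contained: slots are stored as std::function objects, and the connect/disconnect interface with integer ids is a hypothetical stand-in for boost::signals2::connection. In the real class the same specialization could forward the pack to the wrapped boost::signals2::signal instead.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>

// Primary template is only declared; function signatures are handled by
// the partial specialization below.
template <typename Signature>
class ObservableSignal;

template <typename R, typename... Args>
class ObservableSignal<R(Args...)> {
public:
    using Slot = std::function<R(Args...)>;
    using CountObserver = std::function<void(std::size_t)>;

    // onCountChanged is called with the new connection count after every
    // connect or disconnect (hypothetical hook for subscribe/unsubscribe).
    explicit ObservableSignal(CountObserver onCountChanged = nullptr)
        : onCountChanged_(std::move(onCountChanged)) {}

    // Returns an id that can later be passed to disconnect().
    int connect(Slot slot) {
        assert(slot);  // an empty slot could not be invoked
        const int id = nextId_++;
        slots_.emplace(id, std::move(slot));
        notify();
        return id;
    }

    void disconnect(int id) {
        if (slots_.erase(id) == 1) notify();
    }

    // The parameter list is taken directly from the signature.
    void operator()(Args... args) const {
        for (const auto& entry : slots_) entry.second(args...);
    }

    std::size_t count() const { return slots_.size(); }

private:
    void notify() {
        if (onCountChanged_) onCountChanged_(slots_.size());
    }

    CountObserver onCountChanged_;
    std::map<int, Slot> slots_;
    int nextId_ = 0;
};
```

With this shape, ObservableSignal<void(int)> gets an operator()(int), and the count observer is the natural place to subscribe when the count leaves 0 and to unsubscribe when it returns to 0.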
