I tried to profile the performance of my code, and this is what I get:
I took this code from the Microsoft docs topic on profiling:
#include <iostream>
#include <limits>
#include <mutex>
#include <random>
#include <functional>
#include <thread>
#include <vector>
//.cpp file code:
static constexpr int MIN_ITERATIONS = std::numeric_limits<int>::max() / 1000;
static constexpr int MAX_ITERATIONS = MIN_ITERATIONS + 10000;
long long m_totalIterations = 0;
std::mutex m_totalItersLock;
int getNumber()
{
    std::uniform_int_distribution<int> num_distribution(MIN_ITERATIONS, MAX_ITERATIONS);
    std::mt19937 random_number_engine; // pseudorandom number generator
    auto get_num = std::bind(num_distribution, random_number_engine);
    int random_num = get_num();
    auto result = 0;
    {
        // hold the lock only while updating the shared counter
        std::lock_guard<std::mutex> lock(m_totalItersLock);
        m_totalIterations += random_num;
    }
    // we're just spinning here
    // to increase CPU usage
    for (int i = 0; i < random_num; i++)
    {
        result = get_num();
    }
    return result;
}
void doWork()
{
    std::wcout << L"The doWork function is running on another thread." << std::endl;
    auto x = getNumber();
}
int main()
{
    std::vector<std::thread> threads;
    for (int i = 0; i < 10; ++i) {
        threads.push_back(std::thread(doWork));
        std::cout << "The Main() thread calls this after starting the new thread" << std::endl;
    }
    for (auto& thread : threads) {
        thread.join();
    }
    return 0;
}
Even with this code, I still get different output (or actually no output at all). Can someone help me, please? I'm doing this in Visual Studio Community 2019.


C++11 std::threads not exiting

Could you please check the following code, which does not exit even after the condition becomes false?
I'm trying to print the numbers 1 to 10 from the first thread, 11 to 20 from the second thread, and so on, with 10 threads in total. When count reaches 100, my program should terminate safely by ending all threads. But that is not happening: after printing, it gets stuck, and I don't understand why.
Is there a data race? Please guide me.
std::mutex mu;
int count=1;
bool isDone = true;
std::condition_variable cv;
void Print10(int tid)
std::unique_lock<std::mutex> lock(mu);
cv.wait(lock,[tid](){ return ((count/10)==tid);});
for(int i=0;i<10;i++)
std::cout<<"tid="<<tid<<" count="<<count++<<"\n";
isDone = count<100;//!(count == (((tid+1)*10)+1));
std::cout<<"tid="<<tid<<" isDone="<<isDone<<"\n";
int main()
std::vector<std::thread> vec;
for(int i=0;i<10;i++)
for(auto &th : vec)
I believe the following code should work for you
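#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>
// std:: is spelled out because a global named count can clash with std::count
// under "using namespace std;".
std::mutex mu;
int count=1;
bool isDone = true;
std::condition_variable cv;
void Print10(int tid)
{
    std::unique_lock<std::mutex> lock(mu);
    // Wait until condition --> Wait till count/10 = tid
    while(count/10 != tid)
        cv.wait(lock);
    // Core logic
    for(int i=0;i<10;i++)
        std::cout<<"tid="<<tid<<" count="<<count++<<"\n";
    // Release the current thread and wake the others, thus ensuring serialization
    cv.notify_all();
}
int main()
{
    std::vector<std::thread> vec;
    for(int i=0;i<10;i++)
        vec.push_back(std::thread(Print10, i));
    for(auto &th : vec)
        th.join();
    return 0;
}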
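For anyone who wants to check the ordering without reading console output, here is a minimal sketch of the same turn-taking idea. It is not the code from the question or the answer: run_in_turns, turn and next are names made up for illustration, and the numbers are collected in a vector instead of being printed. Each worker waits for its turn, runs exactly once, hands over to the next worker, and then exits, so every join() returns.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: runs num_threads workers that take turns in id order.
// Worker tid appends per_thread consecutive numbers to the result.
std::vector<int> run_in_turns(int num_threads, int per_thread)
{
    std::mutex mu;
    std::condition_variable cv;
    int turn = 0;   // id of the worker allowed to run next
    int next = 1;   // next number to emit
    std::vector<int> out;

    auto worker = [&](int tid) {
        std::unique_lock<std::mutex> lock(mu);
        // The predicate overload re-checks the condition after every wakeup,
        // so spurious wakeups and early notifications are handled.
        cv.wait(lock, [&] { return turn == tid; });
        for (int i = 0; i < per_thread; ++i)
            out.push_back(next++);
        ++turn;            // hand over to the next worker
        lock.unlock();
        cv.notify_all();   // wake every waiter; only the one whose turn it is proceeds
    };

    std::vector<std::thread> threads;
    for (int tid = 0; tid < num_threads; ++tid)
        threads.emplace_back(worker, tid);
    for (auto& th : threads)
        th.join();         // each worker runs once and returns, so join() completes
    return out;
}
```

Calling run_in_turns(10, 10) should give the numbers 1 to 100 in order. notify_all() is used rather than notify_one() because only one specific waiter can make progress, and notify_one() might wake a different one.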

Why can't I receive UDP packets more than once with Boost ASIO?

HINT: This works if I instantiate the io_context inside the for loop.
I know this code looks a little goofy, but it's a simplified version of code that's bigger and has this structure. Why can't I receive a second packet with the below code? It works fine with bool synch = true;. Here's the output I get:
iteration 0
receive udp
posted receive
got a packet
iteration 1
receive udp
posted receive
I have to hit Ctrl-c to quit. I expect to see "got a packet" a second time.
The receiver:
#include <array>
#include <iostream>
#include <functional>
#include <thread>
#include <boost/asio.hpp>
namespace asio = boost::asio;
namespace ip = asio::ip;
using ip::udp;
using std::cout;
using std::endl;
using boost_ec = boost::system::error_code;
int main() {
asio::io_context ioContext;
std::array<char, 65500> buffer;
auto asioBuffer = asio::buffer(buffer);
bool synch = false;
udp::endpoint remoteEndpoint;
for (unsigned int i = 0; i < 2; ++i) {
cout << "iteration " << i << endl;
auto recvSocket = udp::socket(ioContext,
udp::endpoint(udp::v4(), 9090));
if (synch) {
recvSocket.receive_from(asioBuffer, remoteEndpoint);
cout << "received a packet" << endl;
} else {
std::function<void(const boost_ec&, size_t)> impl =
[&](const boost_ec &, size_t packetSize) {
if (packetSize > 0) {
cout << "got a packet" << endl;
cout << "receive udp" << endl;
cout << "posted receive" << endl;
impl(boost_ec(), 0);
while (ioContext.poll() == 0) {
The sender:
#include <array>
#include <chrono>
#include <iostream>
#include <thread>
#include <boost/asio.hpp>
namespace asio = boost::asio;
namespace ip = asio::ip;
namespace chrono = std::chrono;
using ip::udp;
using std::cout;
using std::endl;
int main() {
std::array<char, 65500> buffer;
asio::io_context ioContext;
auto socket = udp::socket(ioContext);
auto endpoint = udp::endpoint(udp::v4(), 9090);
size_t packetsSent = 0;
size_t bytesSent = 0;
const double APPROX_BYTES_PER_SEC = 1e6;
const auto CHECK_INTERVAL = chrono::microseconds(100);
auto beforeStart = chrono::steady_clock::now();
auto start = beforeStart;
size_t bytesSentSinceStart = 0;
while (true) {
auto now = chrono::steady_clock::now();
auto timePassed = now - start;
if (timePassed > CHECK_INTERVAL) {
auto expectedTime = chrono::duration<double>(bytesSentSinceStart /
if (expectedTime > timePassed) {
std::this_thread::sleep_for(expectedTime - timePassed);
start = chrono::steady_clock::now();
bytesSentSinceStart = 0;
bytesSent += socket.send_to(asio::buffer(buffer), endpoint);
bytesSentSinceStart += buffer.size();
return 0;
I think this is the key:
void restart();
This function must be called prior to any second or later set of invocations of the run(), run_one(), poll() or poll_one() functions when a previous invocation of these functions returned due to the io_context being stopped or running out of work.
So the above code needs to call restart() after every poll() that leaves ioContext.stopped() returning true, which happens once the io_context has no pending work left.

c++ random set seed failed

I am trying to set the seed of the C++ std::default_random_engine:
using namespace std;
void print_rand();
int main() {
for (int i{0}; i < 20; ++i) {
return 0;
void print_rand() {
default_random_engine e;
cout << e() << endl;
It seems that the printed numbers are all the same. How can I set the seed so that the random numbers depend on the time?
You have to seed only once instead of every time the function is called. Then you will get different values. I will move the functionality to main() to demonstrate this.
int main() {
    std::default_random_engine e;
    for (int i{0}; i < 20; ++i) {
        std::cout << e() << std::endl;
    }
    return 0;
}
As @P.W. said, you should seed only once. A minimal change in that direction would be to use a static variable, with the seed passed to its constructor:
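void print_rand();
int main() {
    for (int i{0}; i < 20; ++i) {
        print_rand();
    }
    return 0;
}
void print_rand() {
    // Constructed and seeded once, on the first call; later calls reuse it.
    static std::default_random_engine e(time(0));
    std::cout << e() << std::endl;
}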
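Here is a small sketch of why a fresh engine on every call repeats itself, and of one way to build a time-based seed. draw and time_seed are names made up for this example, and std::chrono is swapped in for time(0); either would do for seeding.

```cpp
#include <chrono>
#include <random>
#include <vector>

// Hypothetical helper: draws n values from an engine that the caller owns.
std::vector<unsigned long> draw(std::default_random_engine& engine, int n)
{
    std::vector<unsigned long> values;
    for (int i = 0; i < n; ++i)
        values.push_back(engine());   // each call advances the engine's state
    return values;
}

// Hypothetical helper: builds a seed from the current time.
// The cast may truncate the tick count; that is harmless for a seed.
std::default_random_engine::result_type time_seed()
{
    auto ticks = std::chrono::system_clock::now().time_since_epoch().count();
    return static_cast<std::default_random_engine::result_type>(ticks);
}
```

Two default-constructed engines start from the same state, so draw() returns the same sequence for both. That is what happens when the engine is a local variable in print_rand(). Constructing one engine with std::default_random_engine e(time_seed()); and reusing it for all 20 draws avoids both the repetition within a run and the identical output across runs.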

Protobuf ParseFromZeroCopyStream incurs high memory usage with repeated field

I have run into high memory usage when using ParseFromZeroCopyStream to load a file to which a large buffer was written. The code snippet below uses more than 60 GB of RAM and then fails, because the system freezes once it reaches its RAM limit.
FYI, I am using protobuf as a DLL.
syntax = "proto3";
package Recipe;
option cc_enable_arenas = true;
message Scene
repeated int32 image_data = 1 [packed=true];
#include <iostream>
#include <fstream>
#include <ostream>
#include <istream>
#include <string>
#include <cstdint>
#include "Scene.pb.h"
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <google/protobuf/io/gzip_stream.h>
#include <google/protobuf/arena.h>
int const _MIN = 0;
int const _MAX = 255;
unsigned int const _SIZE = 1280000000;
//unsigned int const _SIZE = 2000;
unsigned int const _COMPRESSION_LEVEL = 6;
void randWithinUnsignedCharSize(uint8_t * buffer, unsigned int size)
for (size_t i = 0; i < size; ++i)
buffer[i] = i;
using namespace google::protobuf::io;
int main()
google::protobuf::Arena arena;
Recipe::Scene * scene = google::protobuf::Arena::CreateMessage<Recipe::Scene>(&arena);
uint8_t * imageData = new uint8_t[_SIZE];
randWithinUnsignedCharSize(imageData, _SIZE);
scene->mutable_image_data()->Resize(_SIZE, 0);
for (size_t i = 0; i < _SIZE; i++)
scene->set_image_data(i, imageData[i]);
std::cout << "done saving data to repeated field.\n";
std::fstream output("data.txt", std::ios::out | std::ios::trunc | std::ios::binary);
OstreamOutputStream outputFileStream(&output);
GzipOutputStream::Options options;
options.format = GzipOutputStream::GZIP;
options.compression_level = _COMPRESSION_LEVEL;
GzipOutputStream gzipOutputStream(&outputFileStream, options);
if (!scene->SerializeToZeroCopyStream(&gzipOutputStream)) {
std::cerr << "Failed to write scene." << std::endl;
return -1;
delete[] imageData;
std::cout << "Finish serializing into data.txt\n";
google::protobuf::Arena arena1;
Recipe::Scene * scene1 = google::protobuf::Arena::CreateMessage<Recipe::Scene>(&arena1);
std::fstream input("data.txt", std::ios::in | std::ios::binary);
IstreamInputStream inputFileStream(&input);
GzipInputStream gzipInputStream(&inputFileStream);
if (!scene1->ParseFromZeroCopyStream(&gzipInputStream)) {
std::cerr << "Failed to parse scene." << std::endl;
return -1;
std::cout << "scene1->imagedata_size() " << scene1->image_data_size() << std::endl;
return 0;

boost::variant vs. polymorphism, very different performance results with clang and gcc

I'm trying to figure out how much the execution time of boost::variant differs from that of a polymorphism approach. In my first test I got very different results with gcc 4.9.1 and clang+llvm 3.5.
You can find the code below. Here are my results:
polymorphism: 2.16401
boost::variant: 3.83487
polymorphism: 2.46161
boost::variant: 1.33326
I compiled both with -O3.
Is someone able to explain that?
#include <iostream>
#include <memory>
#include <vector>
#include <algorithm>
#include <boost/variant.hpp>
#include <boost/variant/apply_visitor.hpp>
#include <ctime>
struct value_type {
value_type() {}
virtual ~value_type() {}
virtual void inc() = 0;
struct int_type : value_type {
int_type() : value_type() {}
virtual ~int_type() {}
void inc() { value += 1; }
int value = 0;
struct float_type : value_type {
float_type() : value_type() {}
virtual ~float_type() {}
void inc() { value += 1; }
float value = 0;
void dyn_test() {
std::vector<std::unique_ptr<value_type>> v;
for (int i = 0; i < 1024; i++) {
if (i % 2 == 0)
v.emplace_back(new int_type());
v.emplace_back(new float_type());
for (int i = 0; i < 900000; i++) {
std::for_each(v.begin(), v.end(), [](auto &item) { item->inc(); });
struct visitor : boost::static_visitor<> {
template <typename T> void operator()(T &item) { item += 1; }
using mytype = boost::variant<int, float>;
void static_test() {
std::vector<mytype> v;
for (int i = 0; i < 1024; i++) {
if (i % 2 == 0)
visitor vi;
for (int i = 0; i < 900000; i++) {
std::for_each(v.begin(), v.end(), boost::apply_visitor(vi));
template <typename F> double measure(F f) {
  clock_t start = clock();
  f();
  clock_t end = clock();
  float seconds = (float)(end - start) / CLOCKS_PER_SEC;
  return seconds;
}
int main() {
std::cout << "polymorphism: " << measure([] { dyn_test(); }) << std::endl;
std::cout << "boost::variant: " << measure([] { static_test(); }) << std::endl;
return 0;
Clang is known to miscompile some std::vector functions from various Standard libraries, due to some edge cases in its inliner. I don't know if those have been fixed by now, but quite possibly not. Since unique_ptr is smaller and simpler than boost::variant, it's more likely that it does not trigger these edge cases.
The code you posted is practically a demonstration of why boost::variant is great. A dynamic allocation and a random pointer access, in addition to the regular indirections that both approaches perform? That's a heavy hit (relatively).
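When comparing the two compilers, it may also help to time with a monotonic wall clock. The question's measure() uses clock(), which reports processor time. Below is a sketch of an alternative. measure_wall is a made-up name, and std::chrono::steady_clock is swapped in here; it is not what the question used.

```cpp
#include <chrono>
#include <utility>

// Hypothetical variant of the question's measure(): returns the elapsed
// wall-clock seconds for one call of f, using a monotonic clock.
template <typename F>
double measure_wall(F&& f)
{
    auto start = std::chrono::steady_clock::now();
    std::forward<F>(f)();   // run the code under test exactly once
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}
```

It is called the same way as the original, for example measure_wall([] { dyn_test(); }). For a single-threaded benchmark like this one, the two clocks should give similar numbers, so this is a cross-check rather than an explanation of the gap.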
