Different behavior of boost::serialization of strings on text archive - boost

I'm having an issue serializing a std::string with boost::serialization on a text_oarchive. AFAICT, I have two identical pieces of code that behave differently in two different programs.
This is the program that I believe is behaving correctly:
#include <iostream>
#include <string>
#include <sstream>
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
template <typename T>
void serialize_deserialize(const T & src, T & dst)
{
std::string serialized_data_str;
std::cout << "original data: " << src << std::endl;
std::ostringstream archive_ostream;
boost::archive::text_oarchive oarchive(archive_ostream);
oarchive << src;
serialized_data_str = archive_ostream.str();
std::cout << "serialized data: " << serialized_data_str << std::endl;
std::istringstream archive_istream(serialized_data_str);
boost::archive::text_iarchive iarchive(archive_istream);
iarchive >> dst;
}
int main()
{
std::string archived_data_str = "abcd";
std::string restored_data_str;
serialize_deserialize<std::string>(archived_data_str, restored_data_str);
std::cout << "restored data: " << restored_data_str << std::endl;
return 0;
}
And this is its output:
original data: abcd
serialized data: 22 serialization::archive 10 4 abcd
restored data: abcd
(You can compile it with: g++ boost-serialization-string.cpp -o boost-serialization-string -lboost_serialization)
This one, on the other hand, is an excerpt of the program I'm writing (derived from boost_asio/example/serialization/connection.hpp), which serializes std::string data by converting each character to its numeric (decimal) representation:
/// Asynchronously write a data structure to the socket.
template <typename T, typename Handler>
void async_write(const T& t, Handler handler)
{
// Serialize the data first so we know how large it is.
std::cout << "original data: " << t << std::endl;
std::ostringstream archive_stream;
boost::archive::text_oarchive archive(archive_stream);
archive << t;
outbound_data_ = archive_stream.str();
std::cout << "serialized data: " << outbound_data_ << std::endl;
[...]
And this is an excerpt of its output:
original data: abcd
serialized data: 22 serialization::archive 10 5 97 98 99 100 0
The version (10) is the same, right? So that should be the proof that I'm using the same serialization library in both programs.
However, I really can't figure out what's going on here. I've been trying to solve this puzzle for almost an entire work day now, and I'm out of ideas.
For anyone that may want to reproduce this result, it should be sufficient to download the Boost serialization example, add the following line
connection_.async_write("abcd", boost::bind(&client::handle_write, this, boost::asio::placeholders::error));
at line 50 of client.cpp, add the following member function in client.cpp
/// Handle completion of a write operation.
void handle_write(const boost::system::error_code& e)
{
// Nothing to do. The socket will be closed automatically when the last
// reference to the connection object goes away.
}
add this cout:
std::cout << "serialized data: " << outbound_data_ << std::endl;
at connection.hpp:59
and compile with:
g++ -O0 -g3 client.cpp -o client -lboost_serialization -lboost_system
g++ -O0 -g3 server.cpp -o server -lboost_serialization -lboost_system
I'm using g++ 4.8.1 under Ubuntu 13.04 64bit with Boost 1.53
Any help would be greatly appreciated.
P.s. I'm posting this because the deserialization of the std::strings isn't working at all! :)

I see two possible causes of this behavior.
The compiler does not implicitly convert "abcd" to std::string. Because async_write is a template, T is deduced from the string literal as a five-element char array (four characters plus the terminating NUL), so the serialization handles it as an array of bytes and not as a string. That is why the output starts with the count 5 and ends with 0. Changing the call to connection_.async_write(std::string("abcd"), boost::bind(&client::handle_write, this, boost::asio::placeholders::error)); should fix the problem.
Alternatively, the type passed as the t argument of the async_write template method may not be std::string but std::wstring, in which case it is serialized not as an ASCII string ("abcd") but as a vector of unsigned shorts; 97 98 99 100 is the decimal representation of the ASCII characters a, b, c and d.
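The first cause can be checked without Boost. The following is a minimal sketch, assuming a hypothetical helper named deduced_array_extent (it is not part of Boost or of the asio example) whose parameter is declared the same way as async_write's const T& t. It reports how many elements T has when T is deduced as a built-in array, and 0 otherwise.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical helper (not part of Boost): takes its argument the same way
// async_write does, by const T&, and reports the extent of T when T is
// deduced as a built-in array, or 0 when T is any other type.
template <typename T>
std::size_t deduced_array_extent(const T&)
{
    return std::is_array<T>::value ? std::extent<T>::value : 0;
}
```

Called as deduced_array_extent("abcd") this yields 5 (four characters plus the terminating NUL), while deduced_array_extent(std::string("abcd")) yields 0. That is consistent with the element count 5 and the trailing 0 in the second program's output.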


How to test an instance counter by asynchronous run of a boost childprocess?

I have tried to use boost::process::child with an async_pipe, as shown in the code example below. Since there is a wait() method, I expected the call to run() not to wait for the called executable to finish before continuing to the line where I call wait(). My aim is to start the same executable multiple times in order to test, in GTest, an instance-counting method (implemented on top of a Boost managed shared memory segment).
Therefore I need the call to io_service::run() not to wait for the called executable to finish, as it does right now. Can someone tell me where I am using it wrong? Or is this the wrong way to unit test my function? I have been trying to find the solution for quite some time!
Here is a sample of how I call one instance of the executable:
int CallChildProcess_Style9() {
std::string strCmdLine = "E:\\file.exe --Debug MainStartUps_Off --Lock 3";
boost::asio::io_service m_oIOS;
std::vector<char> m_oAsyncBuffer_Out;
bp::async_pipe m_oAsyncPipe_Out(m_oIOS);
std::error_code build_ec;
size_t nReadSize(0);
boost::scoped_ptr<boost::process::child> m_pChildProcess(nullptr);
m_pChildProcess.reset(new bp::child(strCmdLine.data(), bp::std_out > m_oAsyncPipe_Out, build_ec));
m_oAsyncBuffer_Out.resize(1024*8);
boost::asio::async_read(m_oAsyncPipe_Out, boost::asio::buffer(m_oAsyncBuffer_Out),
[&](const boost::system::error_code &ec, std::size_t size) { nReadSize = size; });
size_t iii = m_oIOS.run();
m_pChildProcess->wait();
m_oAsyncBuffer_Out.resize(nReadSize);
std::string strBuf(m_oAsyncBuffer_Out.begin(), m_oAsyncBuffer_Out.begin() + nReadSize);
int result = m_pChildProcess->exit_code();
m_oAsyncPipe_Out.close();
m_oIOS.reset();
return result;
}
Using io_service
To use async_pipe, you need to pass the io_service instance to bp::child as one of its keyword parameters:
#include <boost/asio.hpp>
#include <boost/process.hpp>
#include <boost/process/async.hpp>
#include <boost/scoped_ptr.hpp>
#include <iostream>
#include <string>
#include <system_error>
#include <vector>
namespace bp = boost::process;
int CallChildProcess_Style9() {
std::string strCmdLine = "/bin/cat";
boost::asio::io_service m_oIOS;
std::vector<char> m_oAsyncBuffer_Out;
bp::async_pipe m_oAsyncPipe_Out(m_oIOS);
std::error_code build_ec;
size_t nReadSize(0);
boost::scoped_ptr<boost::process::child> m_pChildProcess(nullptr);
std::vector<std::string> const args = { "/home/sehe/Projects/stackoverflow/test.cpp" };
m_pChildProcess.reset(new bp::child(strCmdLine, args, bp::std_out > m_oAsyncPipe_Out, build_ec, m_oIOS));
std::cout << "Launched: " << build_ec.message() << std::endl;
m_oAsyncBuffer_Out.resize(1024 * 8);
boost::asio::async_read(m_oAsyncPipe_Out, boost::asio::buffer(m_oAsyncBuffer_Out),
[&](const boost::system::error_code &ec, std::size_t size) {
std::cout << "read completion handler: size = " << size << " (" << ec.message() << ")" << std::endl;
nReadSize = size;
});
std::cout << "read started" << std::endl;
size_t iii = m_oIOS.run();
std::cout << "io_service stopped" << std::endl;
std::cout << "initiate child::wait" << std::endl;
m_pChildProcess->wait();
std::cout << "wait completed" << std::endl;
std::string const strBuf(m_oAsyncBuffer_Out.data(), nReadSize);
int result = m_pChildProcess->exit_code();
m_oAsyncPipe_Out.close();
m_oIOS.reset();
return result;
}
int main() {
CallChildProcess_Style9();
}
Prints
http://coliru.stacked-crooked.com/a/8a9bc6bed3dd5e0a
Launched: Success
read started
read completion handler: size = 1589 (End of file)
io_service stopped
initiate child::wait
wait completed
Hanging Up The Child
Even with that fixed, boost::asio::async_read on the async_pipe only reads until the buffer is full or EOF is reached. If the child process outputs more than the buffer size (8k in your sample), it will get stuck and never finish.
E.g.: replacing the command like this:
std::string strCmdLine = "/usr/bin/yes";
Results in
Live On Coliru
Launched: Success
read started
read completion handler: size = 8192 (Success)
io_service stopped
initiate child::wait
At which point it will hang forever. This is not because yes has infinite output: any command with large output will hang (e.g. /bin/cat /etc/dictionaries-common/words hangs in the same way). You can prove this by looking at the strace output:
$ sudo strace -p $(pgrep yes)
strace: Process 21056 attached
write(1, "/home/sehe/Projects/stackoverflo"..., 8170
The easiest way to "fix" this would be to close the output sink after you filled up your output buffer:
boost::asio::async_read(m_oAsyncPipe_Out, boost::asio::buffer(m_oAsyncBuffer_Out),
[&](const boost::system::error_code &ec, std::size_t size) {
std::cout << "read completion handler: size = " << size << " (" << ec.message() << ")" << std::endl;
nReadSize = size;
m_oAsyncPipe_Out.close();
});
This requires you to anticipate that the child may already have exited before you call wait(), so wait() might fail:
Live On Coliru
Launched: Success
read started
read completion handler: size = 8192 (Success)
io_service stopped
initiate child::wait
wait completed (Success)
Taking A Step Back: What Do You Need?
It looks, though, as if you might be overcomplicating things. If you're happy limiting the output to 8k, and all you need is to have multiple copies, why bother with async IO?
Any child is already asynchronous, and you can just pass the buffer:
Live On Coliru
#include <boost/asio.hpp>
#include <boost/process.hpp>
#include <array>
#include <iostream>
#include <memory>
#include <string>
#include <vector>
namespace bp = boost::process;
using Args = std::vector<std::string>;
using Buffer8k = std::array<char, 8192>;
int main() {
auto first_out = std::make_unique<Buffer8k>(),
second_out = std::make_unique<Buffer8k>();
*first_out = {};
*second_out = {};
boost::asio::io_service svc;
bp::child first("/bin/echo", Args{"-n", "first"}, bp::std_out > boost::asio::buffer(*first_out), svc);
bp::child second("/bin/echo", Args{"-n", "second"}, bp::std_out >boost::asio::buffer(*second_out), svc);
std::cout << "Launched" << std::endl;
svc.run();
first.wait();
second.wait();
std::string const strFirst(first_out->data()); // uses NUL-termination (assumes text output)
std::string const strSecond(second_out->data()); // uses NUL-termination (assumes text output)
std::cout << strFirst << "\n";
std::cout << strSecond << "\n";
return first.exit_code();
}
Prints
Launched
first
second
More Examples
Because I can't really be sure about what you need, look at other examples that I wrote to actually show live async IO, where you might need to respond to particular output of one process.
Boost::process output blank lines
Read child process stdout in a separate thread with BOOST process
How to retrieve program output as soon as it printed?

using stl to run length encode a string using std::adjacent_find

I am trying to perform run-length compression on a string for a special protocol that I am using. Runs are considered efficient when the run size of a particular character in the string is >= 3. Can someone help me achieve this? I have a live demo on Coliru. I am pretty sure this is possible with the standard library's std::adjacent_find, with std::not_equal_to<> as the binary predicate to search for run boundaries, and probably std::equal_to<> once I find a boundary. Here is what I have so far, but I am having trouble with the results:
Given the following input text string containing runs of spaces and other characters (in this case, runs of the letter 's'):
"---thisssss---is-a---tesst--"
I am trying to convert the above text string into a vector containing elements that are either pure runs of > 2 characters or mixed characters. The results are almost correct but not quite and I cannot spot the error.
g++ -std=c++14 -O2 -Wall -pedantic -pthread main.cpp && ./a.out
expected the following
======================
---,thi,sssss,---,is-a,---,tesst--,
actual results
==============
---,thi,sssss,---,is-a,---,te,ss,--,
EDIT: I fixed up the previous code to make this version closer to the final solution. Specifically I added explicit tests for the run size to be > 2 to be included. I seem to be having boundary case problems though - the all spaces case and the case where the end of the strings ends in several spaces:
#include <iterator>
#include <iostream>
#include <memory>
#include <string>
#include <vector>
#include <algorithm>
#include <functional>
int main()
{
// I want to convert this string containing adjacent runs of characters
std::string testString("---thisssss---is-a---tesst--");
// to the following
std::vector<std::string> idealResults = {
"---", "thi", "sssss",
"---", "is-a",
"---", "tesst--"
};
std::vector<std::string> tokenizedStrings;
auto adjIter = testString.begin();
auto lastIter = adjIter;
// temporary string used to accumulate characters that
// are not part of a run.
std::unique_ptr<std::string> stringWithoutRun;
while ((adjIter = std::adjacent_find(
adjIter, testString.end(), std::not_equal_to<>())) !=
testString.end()) {
auto next = std::string(lastIter, adjIter + 1);
// append to foo if < run threshold
if (next.length() < 2) {
if (!stringWithoutRun) {
stringWithoutRun = std::make_unique<std::string>();
}
*stringWithoutRun += next;
} else {
// if we have encountered non run characters, save them first
if (stringWithoutRun) {
tokenizedStrings.push_back(*stringWithoutRun);
stringWithoutRun.reset();
}
tokenizedStrings.push_back(next);
}
lastIter = adjIter + 1;
adjIter = adjIter + 1;
}
tokenizedStrings.push_back(std::string(lastIter, adjIter));
std::cout << "expected the following" << std::endl;
std::cout << "======================" << std::endl;
std::copy(idealResults.begin(), idealResults.end(), std::ostream_iterator<std::string>(std::cout, ","));
std::cout << std::endl;
std::cout << "actual results" << std::endl;
std::cout << "==============" << std::endl;
std::copy(tokenizedStrings.begin(), tokenizedStrings.end(), std::ostream_iterator<std::string>(std::cout, ","));
std::cout << std::endl;
}
if (next.length() < 2) {
if (!stringWithoutRun) {
stringWithoutRun = std::make_unique<std::string>();
}
*stringWithoutRun += next;
}
This should be if (next.length() <= 2). You need to add a run of identical characters to the current token if its length is either 1 or 2.
I seem to be having boundary case problems though - the all spaces
case and the case where the end of the strings ends in several spaces
When stringWithoutRun is not empty after the loop finishes, the characters accumulated in it are not added to the array of tokens. You can fix it like this:
// The loop has finished
if (stringWithoutRun)
tokenizedStrings.push_back(*stringWithoutRun);
tokenizedStrings.push_back(std::string(lastIter, adjIter));
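Both fixes can be folded into one function. What follows is a minimal sketch, not the asker's code: the name split_runs and the min_run parameter are hypothetical, and the loop is restructured so that each iteration consumes exactly one run. It still uses std::adjacent_find with std::not_equal_to to locate the run boundary, and it flushes the accumulated non-run characters both before a long run and after the loop.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: splits input into runs of at least min_run identical
// characters and the mixed stretches of shorter runs between them.
std::vector<std::string> split_runs(const std::string& input, std::size_t min_run = 3)
{
    std::vector<std::string> tokens;
    std::string pending;  // characters that are not part of a long-enough run

    auto first = input.begin();
    const auto last = input.end();
    while (first != last) {
        // adjacent_find with not_equal_to stops on the last character of the
        // current run; if no boundary is found, the run extends to the end.
        const auto boundary = std::adjacent_find(first, last, std::not_equal_to<char>());
        const auto run_end = (boundary == last) ? last : boundary + 1;

        if (static_cast<std::size_t>(run_end - first) >= min_run) {
            // Save any accumulated non-run characters before the run itself.
            if (!pending.empty()) {
                tokens.push_back(pending);
                pending.clear();
            }
            tokens.emplace_back(first, run_end);
        } else {
            pending.append(first, run_end);
        }
        first = run_end;
    }
    // Flush whatever is left once the input is exhausted.
    if (!pending.empty()) {
        tokens.push_back(pending);
    }
    return tokens;
}
```

For the test string "---thisssss---is-a---tesst--" this produces the seven tokens listed as the expected results in the question. An input that consists of a single long run comes back as one token, and an empty input yields no tokens.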

String length changes suddenly

In this code, the string length changes suddenly. Before I introduced the char *file variable, strlen(str) was correct. Once I introduced it, the strlen value of the variable str changed.
#include <unistd.h>
#include <iostream>
#include <stdio.h>
#include <string.h>
using namespace std;
int main(){
char buf[BUFSIZ];
if(!getcwd(buf,BUFSIZ)){
perror("ERROR!");
}
cout << buf << endl;
char *str;
str = new char[strlen(buf)];
strcpy(str,buf);
strcat(str,"/");
strcat(str,"input/abcdefghijklmnop");
cout << str << endl;
cout << strlen(str) << endl;
char *file;
file = new char[strlen(str)];
cout << strlen(file) << endl;
strcpy(file,str);
cout << file << endl;
}
Your code has undefined behavior because of buffer overflow. You should be scared.
You should consider using std::string.
std::string sbuf;
{
char cwdbuf[BUFSIZ];
if (getcwd(cwdbuf, sizeof(cwdbuf)))
sbuf = cwdbuf;
else {
perror("getcwd");
exit(EXIT_FAILURE);
}
}
sbuf += "/input/abcdefghijklmnop";
You should compile with all warnings & debug info (e.g. g++ -Wall -Wextra -g) then use the debugger gdb. Don't forget that strings are zero-byte terminated. Your str is much too short. If you insist on avoiding std::string (which IMHO you should not), you need to allocate more space (and remember the extra zero byte).
str = new char[strlen(buf)+sizeof("/input/abcdefghijklmnop")];
strcpy(str, buf);
strcat(str, "/input/abcdefghijklmnop");
Remember that the sizeof some literal string is one byte more than its length (as measured by strlen). For instance sizeof("abc") is 4.
Likewise your file variable is one byte too short (missing space for the terminating zero byte).
file = new char[strlen(str)+1];
BTW on GNU systems (such as Linux) you could use asprintf(3) or strdup(3) (and use free not delete to release the memory) and consider using valgrind.
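If a raw character buffer really is required, the off-by-one can be kept in one place. This is only a sketch under that assumption: duplicate_c_string is a hypothetical helper name, and std::unique_ptr<char[]> is used here so the buffer is released automatically, which is a design choice of this example and not something the code above does.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical helper: returns a heap copy of a NUL-terminated string.
// The allocation is strlen(source) + 1 bytes so the terminator fits.
std::unique_ptr<char[]> duplicate_c_string(const char* source)
{
    const std::size_t length = std::strlen(source);
    std::unique_ptr<char[]> copy(new char[length + 1]);
    std::memcpy(copy.get(), source, length + 1);  // copies the terminator too
    return copy;
}
```

The pointer from copy.get() can be passed to functions such as strlen or printf, and the memory is released with delete[] when the unique_ptr goes out of scope.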

C++ std::unordered_map key custom hashing

I've got the following test.cpp file
#include <string>
#include <functional>
#include <unordered_map>
#include <iostream>
class Mystuff {
public:
std::string key1;
int key2;
public:
Mystuff(std::string _key1, int _key2)
: key1(_key1)
, key2(_key2)
{}
};
namespace std {
template<>
struct hash<Mystuff *> {
size_t operator()(Mystuff * const& any) const {
size_t hashres = std::hash<std::string>()(any->key1);
hashres ^= std::hash<int>()(any->key2);
std::cout << "Hash for find/insert is [" << hashres << "]" << std::endl;
return (hashres);
}
};
}; /* eof namespace std */
typedef std::unordered_map<Mystuff *, Mystuff *>mystuff_map_t;
mystuff_map_t map;
int insert_if_not_there(Mystuff * stuff) {
std::cout << "Trying insert for " << stuff->key1 << std::endl;
if (map.find(stuff) != map.end()) {
std::cout << "It's there already..." << std::endl;
return (-1);
} else {
map[stuff] = stuff;
std::cout << "Worked..." << std::endl;
}
return (0);
}
int main(){
Mystuff first("first", 1);
Mystuff second("second", 2);
Mystuff third("third", 3);
Mystuff third_duplicate("third", 3);
insert_if_not_there(&first);
insert_if_not_there(&second);
insert_if_not_there(&third);
insert_if_not_there(&third_duplicate);
}
You can compile with g++ -o test test.cpp -std=gnu++11.
I don't get what I'm doing wrong with it: the hash keying algorithm is definitely working, but for some reason (which is obviously in the - bad - way I'm doing something), third_duplicate is inserted as well in the map, while I'd wish it wasn't.
What am I doing wrong?
IIRC, unordered containers need operator== as well as std::hash. Without it, I'd expect a compilation error, except that your key is actually Mystuff*: the pointer, not the value.
That means you get the duplicate key stored as a separate item because it's actually not, to unordered_map, a real duplicate - it has a different address, and address equality is how unordered_map is judging equality.
Simple solution: use std::unordered_map<Mystuff, Mystuff> instead. You will need to overload operator== (or specialize std::equal_to, or pass your own equality functor as the map's KeyEqual template argument). You'll also need to change your std::hash specialization to accept the value rather than the pointer.
Don't over-use pointers in C++, especially not raw pointers. For pass-by-reference, prefer references to pointers (that's a C++-specific meaning of "reference" vs. "pointer"). For containers, the normal default is to use the type directly for content, though there are cases where you might want a pointer (or a smart pointer) instead.
I haven't thoroughly checked your code - there may be more issues than I caught.
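A minimal sketch of the value-keyed version follows. It assumes Mystuff is reduced to a plain struct, and the way the two hashes are combined (XOR with a shift) is just one simple choice made for this example. insert_if_not_there is changed to take the map and a value and to return a bool, which differs from the signature in the question.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

struct Mystuff {
    std::string key1;
    int key2;
};

// Value equality: two objects are the same key when both fields match.
inline bool operator==(const Mystuff& lhs, const Mystuff& rhs)
{
    return lhs.key1 == rhs.key1 && lhs.key2 == rhs.key2;
}

namespace std {
// Hash the value, not the pointer, so equal objects land in the same bucket.
template <>
struct hash<Mystuff> {
    std::size_t operator()(const Mystuff& value) const
    {
        return std::hash<std::string>()(value.key1)
             ^ (std::hash<int>()(value.key2) << 1);
    }
};
} // namespace std

// Returns true if stuff was inserted, false if an equal key was already there.
inline bool insert_if_not_there(std::unordered_map<Mystuff, Mystuff>& map,
                                const Mystuff& stuff)
{
    return map.emplace(stuff, stuff).second;
}
```

With this in place, inserting Mystuff{"third", 3} a second time is rejected, because the second object compares equal to the first even though it lives at a different address.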

C++11 Struct definition with atomic attribute

In C++11 I have a struct with lots of attributes like so:
#include <atomic>
struct Foo {
int x;
int y;
// ...
// LOTS of primitive type attributes, followed by...
// ...
std::atomic_bool bar;
};
And I'd like to define an instance like so:
bool bar_value = true;
Foo my_foo = {/*attribute values*/, bar_value};
However, the atomic_bool is throwing a "use of deleted function" error because I think copy constructing is not allowed on atomics. Is there any way around this, short of writing out a constructor or assigning each value individually?
It just seems inconvenient to have to treat this otherwise relatively banal struct in a special way just because one of its many attributes is a special case.
Updates:
Any takers? I've been looking around, but there doesn't seem to be any straightforward way to resolve this.
Try wrapping the initialization of the atomic_bool in its own initializer list. It worked for me in g++ 4.7.
#include <atomic>
#include <iostream>
struct Foo
{
int x;
int y;
std::atomic_bool bar;
};
int main(int, char**)
{
Foo f1 = {1, 2, {true}};
Foo f2 = {3, 4, {false}};
std::cout << "f1 - " << f1.x << " " << f1.y << " "
<< (f1.bar.load()?"true":"false") << std::endl;
std::cout << "f2 - " << f2.x << " " << f2.y << " "
<< (f2.bar.load()?"true":"false") << std::endl;
}
I got the following output:
$ g++ -std=c++11 test.cpp -o test && ./test
f1 - 1 2 true
f2 - 3 4 false
