C++: Get state of linear congruential generator - random

It seems that if I write
#include <random>
std::minstd_rand engine(1);
std::cout << engine;
then this prints out the internal state of the engine (which is a linear congruential generator). Right now the state equals the seed (1), but if I generate a random number and print engine again, it prints some large number, which is probably the state.
How do I actually get the state, in a variable?

Use a string stream instead of stdout. Example:
#include <sstream>
...
std::ostringstream os;
os << engine;
std::string mystate = os.str();
The o in ostringstream is for output.
The state should be the last random number generated, which is why there is no easier way to get it. It's not as convenient as something like int a; a << engine, but it'll have to do. If you need it often, wrap the stringstream operation in a function (perhaps including a conversion from string to integer). You can also pair the engine with an integer that holds the state, and add a couple of methods so the integer is updated on every generation call, if you need the performance.
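A rough sketch of what such a function could look like. The names lcg_state and lcg_restore are made up for illustration, and the sketch assumes the engine's text representation is a single integer, which is what the standard specifies for std::linear_congruential_engine.

```cpp
#include <cassert>
#include <random>
#include <sstream>
#include <string>

// Hypothetical helper: read the engine's current state through its stream
// representation. For std::linear_congruential_engine that representation
// is just the single state value.
template <class Engine>
typename Engine::result_type lcg_state(const Engine& engine) {
    std::ostringstream os;
    os << engine;
    return static_cast<typename Engine::result_type>(std::stoull(os.str()));
}

// Hypothetical counterpart: put a previously saved state back into an engine.
template <class Engine>
void lcg_restore(Engine& engine, typename Engine::result_type state) {
    std::stringstream ss;
    ss << state;
    ss >> engine;
    assert(!ss.fail());  // extraction should succeed for a valid state
}
```

With this, auto s = lcg_state(engine); stores the state in a variable, and lcg_restore(other, s); makes another engine continue from the same point.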
If you don't need the current state and only want a known state to continue from, do
int engineState = engine();
Now you have the state. It is not the same as it was before the call, but that might not matter depending on your use case.
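The idea of pairing the engine with an integer that tracks the state might be sketched like this. TrackedMinstd is a hypothetical name, and the sketch only covers std::minstd_rand, relying on the fact that this engine's output is its new state.

```cpp
#include <cassert>
#include <random>

// Hypothetical wrapper around std::minstd_rand that remembers the state, so
// no stream round-trip is needed.
class TrackedMinstd {
public:
    using result_type = std::minstd_rand::result_type;

    explicit TrackedMinstd(result_type seed = 1u)
        : engine_(seed), state_(seed % std::minstd_rand::modulus) {
        // The standard maps a seed congruent to 0 to state 1 when the
        // increment is 0, as it is for minstd_rand.
        if (state_ == 0) state_ = 1;
    }

    // Generate the next number and record it as the current state.
    result_type operator()() {
        state_ = engine_();
        assert(state_ != 0);  // a multiplicative LCG never reaches 0
        return state_;
    }

    result_type state() const { return state_; }

private:
    std::minstd_rand engine_;
    result_type state_;
};
```

Calling state() is then just a member read, at the cost of one extra store per generated number.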

Output from a linear congruential RNG is its state. Or, as already noted, use operator<< to output the state and convert it.
Code
#include <random>
#include <iostream>
#include <sstream>
int main() {
auto engine = std::minstd_rand{ 1 };
auto q = engine();
auto os = std::ostringstream{};
os << engine;
auto r = std::stoul(os.str()); // use ul to fit output
std::cout << q << " " << os.str() << " " << r << '\n';
return 0;
}
prints
48271 48271 48271
An alternative might be available if a particular implementation implements discard properly in O(log2(N)) time, following the paper by F. Brown https://laws.lanl.gov/vhosts/mcnp.lanl.gov/pdf_files/anl-rn-arb-stride.pdf. In that case you could move one position back, call the RNG again, and get your state as the output.
The compiler and library I use (Visual C++ 2017 15.7) do not implement discard that way, so it is useless for moving back.
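For std::minstd_rand specifically, the logarithmic-time jump can be worked out by hand, independent of how the library implements discard. The sketch below is my own illustration, not library code: pow_mod, minstd_jump and minstd_previous are hypothetical names, and it assumes the increment is 0 and the modulus is prime, which holds for minstd_rand (multiplier 48271, modulus 2^31 - 1). Under those assumptions k steps multiply the state by a^k mod m, and one step back is a jump of m - 2 steps by Fermat's little theorem.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// base^exp mod mod, by repeated squaring: O(log exp) multiplications.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t mod) {
    assert(mod != 0);
    std::uint64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

// State reached from `state` after `steps` calls of minstd_rand.
std::uint64_t minstd_jump(std::uint64_t state, std::uint64_t steps) {
    const std::uint64_t a = std::minstd_rand::multiplier;
    const std::uint64_t m = std::minstd_rand::modulus;
    return state * pow_mod(a, steps, m) % m;
}

// State one step before `state`: a^(m-2) is the inverse of a modulo prime m.
std::uint64_t minstd_previous(std::uint64_t state) {
    const std::uint64_t m = std::minstd_rand::modulus;
    return minstd_jump(state, m - 2);
}
```

All intermediate products stay below 2^62, so 64-bit arithmetic is enough here.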

LCGs consist of a simple state that is represented by a single integer.
This means you can treat a pointer to the engine as a pointer to an integer
(as long as the engine stores nothing else, which the static_assert below checks).
Below, I have provided an example of a template function that gets
the state (seed) of an engine and even works for classes derived from LCGs.
#include <random>
template <class T, T... v>
T getSeed(std::linear_congruential_engine<T, v...>& rand) {
static_assert(sizeof(rand) == sizeof(T), "engine must hold only its state"); // message needed before C++17
return *reinterpret_cast<T*>(&rand);
}
#include <iostream>
int main() {
std::minstd_rand engine(19937);
auto seed = getSeed(engine);
std::cout << sizeof(engine);
std::cout << '\t' << seed;
}
^ This method is far more efficient (about 320 times faster) than serializing through a stream,
or than creating a dummy ostream and specializing std::operator<< for every case.
template<class T, T... v>
using LCG = std::linear_congruential_engine<T, v...>;
#define DummyRandSpec32 uint_fast32_t, 0xDEADBEEF, 0xCAFE, 0xFFFFFFFF
typedef LCG<DummyRandSpec32> DummyRand32; // the same engine type
template<class T, class R>
T* getSeed(R& rand) // getSeed 70:1 nextInt
{ // creating stream is heavy operation
// return rand._M_x; // cannot access private
__dummy_ostream<T> dumdum; // workaround
auto& didey = *reinterpret_cast<DummyRand32*>(&rand);
std::operator<<(dumdum, didey); // specialized
return dumdum.retrieve(); // pointer to state
}
int main() {
std::minstd_rand engine(19937);
std::cout << *getSeed<uint_fast32_t>(engine);
std::cout << std::endl << engine << std::endl;
}
^ Here is my ill-coded first attempt at a solution, if you want to compare.
It is worth mentioning that the field name of the state is implementation-specific.
I purposefully left out std::operator<< and __dummy_ostream.

Related

Moving between two different contiguous containers

I have a std::vector<double> that I have to move to a boost::container::flat_set<double>.
Both containers are contiguous, so after sorting the vector in principle I could move the data from one to the other.
Is there a way to move the whole data between these two different containers?
Please, take into account that I want to move the whole data, not element by element.
I can move data between containers of the same type, but not between different containers.
std::vector<double> v1 = ...
std::sort(v1.begin(), v1.end());
std::vector<double> v2(std::move(v1)); // ok
boost::flat_set<double> f2(v1.begin(), v1.end()); // doesn't move, it copies
boost::flat_set<double> f3(std::move(v1)); // doesn't compile
It seems that for this to work flat_set should have a move constructor from containers with .data(), where the pointer is stolen from the argument.
I believe there is some way to verify that the data alignment in both containers matches, so that memcpy could be used (and the source cleared without destructing its elements), and maybe someone will share it with us. But as long as we want to stick to the STL there is a way: std::move_iterator. It makes the container constructor move elements instead of copying them. It does not remove the elements from the source container, though; it leaves them in a moved-from state (e.g. empty strings, as in the example).
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
#include <boost/container/flat_set.hpp>
int main()
{
std::vector<std::string> v1 = {"a","v","d"};
std::sort(v1.begin(), v1.end());
std::vector<std::string> v2(std::move(v1)); // ok
boost::container::flat_set<std::string> f1(std::make_move_iterator(v2.begin()), std::make_move_iterator(v2.end())); // moves, but does not remove elements from the source container
for(auto& s : v1)
std::cout << "'" << s << "'" << ' ';
std::cout << " <- v1 \n";
for(auto& s : v2)
std::cout << "'" << s << "'" << ' ';
std::cout << " <- v2 \n";
for(auto& s : f1)
std::cout << "'" << s << "'" << ' ';
std::cout << " <- f1 \n";
}
Output
<- v1
'' '' '' <- v2
'a' 'd' 'v' <- f1
Online code: https://wandbox.org/permlink/ZLbocXKdqYHT0zYi
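The same technique works with any container that has an iterator-range constructor. Here is a small standard-library-only sketch, using std::set in place of flat_set so that it compiles without Boost; move_into_set is a name I made up for the example.

```cpp
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Moves the elements of `source` into a new std::set. `source` keeps its
// size; its elements are left in a valid but unspecified state (in practice
// usually empty strings).
std::set<std::string> move_into_set(std::vector<std::string>& source) {
    std::set<std::string> result(std::make_move_iterator(source.begin()),
                                 std::make_move_iterator(source.end()));
    assert(result.size() <= source.size());  // duplicates, if any, collapse
    return result;
}
```

Note that this is still an element-by-element move, so for double it is no cheaper than a copy; it only pays off for element types that own resources.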
It looks like it is not possible without modifying the constructor of boost::container::flat_set.
Without modifying either class, it seems that only a hack would do it, for example using reinterpret_cast.
The solution I found is either to use an alternative implementation of vector or very ugly code.
Before going into my solution, I must say that this is probably a defect of both classes. These classes should have a set of release()/acquire(start, end) functions that respectively return the pointer range to the data while releasing ownership, and take a pointer range and own it from then on. An alternative could be a constructor that moves from any other container that has a data member function.
Solution using reinterpret_cast and a different implementation of vector
It turns out that reinterpret_casting from std::vector to boost::container::flat_set is not possible, because the layout is not compatible.
However it is possible to reinterpret_cast from boost::container::vector to boost::container::flat_set out of the box (that is because they have a common implementation).
#include<cassert>
#include<boost/container/flat_set.hpp>
int main(){
boost::container::vector<double> v = {1.,2.,3.};
boost::container::flat_set<double> fs = std::move(reinterpret_cast<boost::container::flat_set<double>&>(v));
assert(v.size() == 0);
assert(*fs.find(2.) == 2.);
assert(fs.find(4.) == fs.end());
}
So, I can replace std::vector by boost::container::vector and I can move data to a flat_set.
Non-portable solution using std::vector and ugly code
The reason the layout of std::vector and boost::container::vector are different is that boost::container::vector stores metadata in this way:
class boost::container::vector{
pointer m_start;
size_type m_size;
size_type m_capacity;
}
while std::vector (in GCC) is basically pure pointers,
class std::vector{
pointer _M_start;
pointer _M_finish;
pointer _M_end_of_storage;
}
So, my conclusion is that moving is possible only through a hack given that the implementation I use of std::vector is not compatible with boost::container::flat_set.
In an extreme case, one can do this (sorry if this code offends someone, the code is not portable):
template<class T>
boost::container::flat_set<T> to_flat_set(std::vector<T>&& from){
// struct dummy_vector{T* start; T* finish; T* end_storage;}&
// dfrom = reinterpret_cast<dummy_vector&>(from);
boost::container::flat_set<T> ret;
struct dummy_flat_set{T* start; std::size_t size; std::size_t capacity;}&
dret = reinterpret_cast<dummy_flat_set&>(ret);
dret = {from.data(), from.size(), from.capacity()};
// dfrom.start = dfrom.finish = dfrom.end_storage = nullptr;
new (&from) std::vector<T>(); // reset the source without running its destructor, so the buffer is not freed
return ret;
}
int main(){
std::vector<double> v = {1.,2.,3.};
boost::container::flat_set<double> fs = to_flat_set(std::move(v));
assert(v.size() == 0);
assert(*fs.find(2.) == 2.);
assert(fs.find(4.) == fs.end());
}
Note that I am not taking into account allocator issues at all. I am not sure how to handle allocators here.
In retrospect I don't mind using a form of cast for this specific problem, because somehow I have to tell that the vector is sorted before moving to flat_set. (The problem is that this goes to extreme because it is a reinterpret_cast.)
However, this is a secondary issue; there should be a legal way to move from std::vector to boost::container::vector.

C++ std::unordered_map key custom hashing

I've got the following test.cpp file
#include <string>
#include <functional>
#include <unordered_map>
#include <iostream>
class Mystuff {
public:
std::string key1;
int key2;
public:
Mystuff(std::string _key1, int _key2)
: key1(_key1)
, key2(_key2)
{}
};
namespace std {
template<>
struct hash<Mystuff *> {
size_t operator()(Mystuff * const& any) const {
size_t hashres = std::hash<std::string>()(any->key1);
hashres ^= std::hash<int>()(any->key2);
std::cout << "Hash for find/insert is [" << hashres << "]" << std::endl;
return (hashres);
}
};
}; /* eof namespace std */
typedef std::unordered_map<Mystuff *, Mystuff *>mystuff_map_t;
mystuff_map_t map;
int insert_if_not_there(Mystuff * stuff) {
std::cout << "Trying insert for " << stuff->key1 << std::endl;
if (map.find(stuff) != map.end()) {
std::cout << "It's there already..." << std::endl;
return (-1);
} else {
map[stuff] = stuff;
std::cout << "Worked..." << std::endl;
}
return (0);
}
int main(){
Mystuff first("first", 1);
Mystuff second("second", 2);
Mystuff third("third", 3);
Mystuff third_duplicate("third", 3);
insert_if_not_there(&first);
insert_if_not_there(&second);
insert_if_not_there(&third);
insert_if_not_there(&third_duplicate);
}
You can compile with g++ -o test test.cpp -std=gnu++11.
I don't get what I'm doing wrong with it: the hash keying algorithm is definitely working, but for some reason (which is obviously in the - bad - way I'm doing something), third_duplicate is inserted as well in the map, while I'd wish it wasn't.
What am I doing wrong?
IIRC unordered containers need operator== as well as std::hash. Without it, I'd expect a compilation error, except that your key is actually Mystuff* - the pointer, not the value.
That means the duplicate key gets stored as a separate item because, to unordered_map, it is not a real duplicate: it has a different address, and address equality is how unordered_map judges equality of pointer keys.
Simple solution - use std::unordered_map<Mystuff,Mystuff> instead. You will need to overload operator== (or specialize std::equal_to, the template that plays the same role for equality that std::hash plays for hashing). You'll also need to change your std::hash specialization to accept the value rather than the pointer.
Don't over-use pointers in C++, especially not raw pointers. For pass-by-reference, prefer references to pointers (that's a C++-specific meaning of "reference" vs. "pointer"). For containers, the normal default is to use the type directly for content, though there are cases where you might want a pointer (or a smart pointer) instead.
I haven't thoroughly checked your code - there may be more issues than I caught.
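To make the suggestion concrete, here is a minimal sketch with value keys. Item, ItemHash and insert_if_absent are stand-in names of mine, not taken from the question, and the hasher is passed as a template argument instead of specializing std::hash; the XOR combination is kept from the question, although it is not a particularly strong hash.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

struct Item {
    std::string key1;
    int key2;
};

// Equality on the values, so two distinct objects with the same fields match.
bool operator==(const Item& a, const Item& b) {
    return a.key1 == b.key1 && a.key2 == b.key2;
}

// Hash computed from the same fields that operator== compares.
struct ItemHash {
    std::size_t operator()(const Item& item) const {
        return std::hash<std::string>()(item.key1) ^ std::hash<int>()(item.key2);
    }
};

using ItemMap = std::unordered_map<Item, int, ItemHash>;

// Returns true if the item was inserted, false if an equal key already existed.
bool insert_if_absent(ItemMap& map, const Item& item, int value) {
    const bool inserted = map.emplace(item, value).second;
    assert(map.count(item) == 1);  // either way, the key is present now
    return inserted;
}
```

With this, inserting a second Item{"third", 3} is rejected even though it is a different object.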

std::string::assign vs std::string::operator=

I coded in Borland C++ ages ago, and now I'm trying to understand the "new" (to me) C++11 (I know, we're in 2015, there's a C++14 ... but I'm working on a C++11 project).
Now I have several ways to assign a value to a string.
#include <iostream>
#include <string>
int main ()
{
std::string test1;
std::string test2;
test1 = "Hello World";
test2.assign("Hello again");
std::cout << test1 << std::endl << test2;
return 0;
}
They both work. I learned from http://www.cplusplus.com/reference/string/string/assign/ that there are other ways to use assign. But for simple string assignment, which one is better? I have to fill 100+ structs with 8 std::string each, and I'm looking for the fastest mechanism (I don't care about memory, unless there's a big difference).
Both are equally fast, but = "..." is clearer.
If you really want fast though, use assign and specify the size:
test2.assign("Hello again", sizeof("Hello again") - 1); // don't copy the null terminator!
// or
test2.assign("Hello again", 11);
That way, only one allocation is needed. (You could also .reserve() enough memory beforehand to get the same effect.)
I tried benchmarking both the ways.
static void string_assign_method(benchmark::State& state) {
std::string str;
std::string base="123456789";
// Code inside this loop is measured repeatedly
for (auto _ : state) {
str.assign(base, 0, 9); // copy 9 characters starting at position 0
}
}
// Register the function as a benchmark
BENCHMARK(string_assign_method);
static void string_assign_operator(benchmark::State& state) {
std::string str;
std::string base="123456789";
// Code before the loop is not measured
for (auto _ : state) {
str = base;
}
}
BENCHMARK(string_assign_operator);
Comparing the results, both methods seem about equally fast, with the assignment operator doing slightly better.
Use string::assign only if a specific part of the base string has to be assigned.
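One thing to watch with assign is that the meaning of the numeric arguments depends on the type of the first argument: with a const char* the number is a character count, while with a std::string it is a starting position. A small sketch (the helper names take_prefix and take_substring are mine):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// First n characters of a C string: assign(const char*, count).
std::string take_prefix(const char* text, std::size_t n) {
    std::string out;
    out.assign(text, n);
    return out;
}

// Part of a std::string: assign(const std::string&, pos, count).
std::string take_substring(const std::string& base, std::size_t pos, std::size_t len) {
    assert(pos <= base.size());  // assign throws std::out_of_range otherwise
    std::string out;
    out.assign(base, pos, len);
    return out;
}
```

So take_substring(base, 9, ...) on a nine-character string yields an empty string, whereas take_prefix(ptr, 9) copies nine characters.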

why does `vector<int> v{{5,6}};` work? I thought only a single pair {} was allowed?

Given a class A with two constructors, taking initializer_list<int> and initializer_list<initializer_list<int>> respectively, then
A v{5,6};
calls the former, and
A v{{5,6}};
calls the latter, as expected. (clang3.3, apparently gcc behaves differently, see the answers. What does the standard require?)
But if I remove the second constructor, then A v{{5,6}}; still compiles and it uses the first constructor. I didn't expect this.
I thought that A v{5,6} would be the only way to access the initializer_list<int> constructor.
(I discovered this while playing around with std::vector and this question I asked on Reddit, but I created my own class A to be sure that it wasn't just a quirk of the interface for std::vector.)
I think this answer might be relevant.
Yes, this behaviour is intended, according to §13.3.1.7 Initialization by list-initialization:
When objects of non-aggregate class type T are list-initialized (8.5.4), overload resolution selects the constructor in two phases:
- Initially, the candidate functions are the initializer-list constructors (8.5.4) of the class T and the argument list consists of the initializer list as a single argument.
- If no viable initializer-list constructor is found, overload resolution is performed again, where the candidate functions are all the constructors of the class T and the argument list consists of the elements of the initializer list.
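As I read the quoted rule, this is what happens for a class with only the initializer_list<int> constructor: in the first phase {{5,6}} as a single argument cannot be converted to initializer_list<int> (the element {5,6} is not an int), so the second phase runs with the inner {5,6} as the argument, and that matches the same constructor. A small sketch to check it; Bag and the two function names are made up for the example.

```cpp
#include <cstddef>
#include <initializer_list>

// A class whose only user-provided constructor takes initializer_list<int>.
struct Bag {
    std::size_t count;
    Bag(std::initializer_list<int> il) : count(il.size()) {}
};

// Single braces: the list {5, 6} is passed directly (first phase).
std::size_t count_single_braces() {
    Bag b{5, 6};
    return b.count;
}

// Double braces: the inner list {5, 6} ends up as the argument (second phase).
std::size_t count_double_braces() {
    Bag b{{5, 6}};
    return b.count;
}
```

Both functions should report two elements, which matches what the question observed with the second constructor removed.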
In gcc I tried your example. I get this error:
error: call of overloaded 'A(<brace-enclosed initializer list>)' is ambiguous
gcc stops complaining if I use three sets of braces, i.e.:
#include <iostream>
#include <vector>
#include <initializer_list>
struct A {
A (std::initializer_list<int> il) {
std::cout << "First." << std::endl;
}
A (std::initializer_list<std::initializer_list<int>> il) {
std::cout << "Second." << std::endl;
}
};
int main()
{
A a{0}; // first
A a1{{0}}; // compile error in gcc (ambiguous)
A a2{{{0}}}; // second
A a3{{{{0}}}}; // second
}
In an attempt to mirror the vector's constructors, here are my results:
#include <iostream>
#include <vector>
#include <initializer_list>
struct A {
A (std::initializer_list<int> il) {
std::cout << "First." << std::endl;
}
explicit A (std::size_t n) {
std::cout << "Second." << std::endl;
}
A (std::size_t n, const int& val) {
std::cout << "Third." << std::endl;
}
A (const A& x) {
std::cout << "Fourth." << std::endl;
}
};
int main()
{
A a{0};
A a2{{0}};
A a3{1,2,3,4};
A a4{{1,2,3,4}};
A a5({1,2,3,4});
A a6(0);
A a7(0, 1);
A a8{0, 1};
}
main.cpp:23:10: warning: braces around scalar initializer
A a2{{0}};
^~~
1 warning generated.
First.
First.
First.
First.
First.
Second.
Third.
First.

Priority of a priority queue always needs to be integral?

I'm just curious if I can have any other data type to give the priority? Like strings, floats, etc?
In the abstract, any type with a reasonable Strict Weak Ordering can be used as the priority in a priority queue. The language you are using will determine how to define this ordering: in C++, operator< is used in standard containers, in Java, the interface Comparable and function compareTo are typically used. Custom comparison functions are also often supported, which can compare elements in a manner different than the default.
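For instance, in C++ a floating-point priority works out of the box, because std::pair already has operator<. A minimal sketch, where Task and drain_by_priority are names of my own choosing; it assumes no priority is NaN, since NaN would break the strict weak ordering.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <utility>
#include <vector>

using Task = std::pair<double, std::string>;  // (priority, name)

// Returns the task names ordered from highest to lowest priority.
std::vector<std::string> drain_by_priority(const std::vector<Task>& tasks) {
    // std::pair compares by first member, then by second.
    std::priority_queue<Task> queue(tasks.begin(), tasks.end());
    std::vector<std::string> order;
    while (!queue.empty()) {
        order.push_back(queue.top().second);
        queue.pop();
    }
    assert(order.size() == tasks.size());
    return order;
}
```

The same works with std::string as the first member of the pair, in which case the priorities are compared lexicographically.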
No.
The ordering element of a priority queue does not have to be integral.
Yes.
You can use whatever type you want, as long as two values of that type can be compared to determine their inherent ordering.
Basically, you can build a priority queue that uses whatever type you want, even a complex number if you can determine an ordering that makes sense for those.
There is, however, another, unasked, question here, for which the answer is:
Yes, most existing implementations of a priority queue will use an integer as the ordering element as that is the easiest, and most common, value used for this purpose.
Here is a full-blown C++ demo of how to queue SillyJobs, defined as
struct SillyJob
{
std::string description;
std::string priority;
// ...
};
It does so in two ways: using the member operator< (default) and by passing an explicit comparison predicate to the priority_queue constructor.
Let's see the output up-front:
Silly: (by description length)
LOW: very very long description
HIGH: short
------------------------------------------------------------
Not so silly: (by priority value)
HIGH: short
LOW: very very long description
See it live on http://ideone.com/VEEQa
#include <queue>
#include <algorithm>
#include <functional>
#include <iostream>
#include <string>
#include <map>
struct SillyJob
{
std::string description;
std::string priority;
SillyJob(const std::string& d, const std::string& p)
: description(d), priority(p) { }
bool operator<(const SillyJob& sj) const { return description.size() < sj.description.size(); }
friend std::ostream& operator<<(std::ostream& os, const SillyJob& sj)
{ return os << sj.priority << ": " << sj.description; }
};
static bool by_priority(const SillyJob& a, const SillyJob& b)
{
static std::map<std::string, int> prio_map;
if (prio_map.empty())
{
prio_map["HIGH"] = 3;
prio_map["MEDIUM"] = 2;
prio_map["LOW"] = 1;
}
return prio_map[a.priority] < prio_map[b.priority];
}
int main()
{
std::cout << "Silly: (by description length)" << std::endl;
{
// by description length (member operator<)
std::priority_queue<SillyJob> silly_queue;
silly_queue.push(SillyJob("short", "HIGH"));
silly_queue.push(SillyJob("very very long description", "LOW"));
while (!silly_queue.empty())
{
std::cout << silly_queue.top() << std::endl;
silly_queue.pop();
}
}
std::cout << std::string(60, '-') << "\nNot so silly: (by priority value)" << std::endl;
{
// by priority value (explicit comparison function)
typedef bool (*cmpf)(const SillyJob&, const SillyJob&);
typedef std::priority_queue<SillyJob, std::vector<SillyJob>, cmpf> not_so_silly_queue;
not_so_silly_queue queue(by_priority);
queue.push(SillyJob("short", "HIGH"));
queue.push(SillyJob("very very long description", "LOW"));
while (!queue.empty())
{
std::cout << queue.top() << std::endl;
queue.pop();
}
}
}
PS. The by_priority comparison function is quite a good example of bad design, but bear in mind it was for demonstration purposes only :)
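A less silly variant might store the priority as an enumeration instead of a string, so the comparison needs no lookup table. This is only a sketch of one possible design; Priority, Job, LowerPriority, JobQueue and pop_next are hypothetical names, not part of the demo above.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

enum class Priority { Low = 1, Medium = 2, High = 3 };

struct Job {
    std::string description;
    Priority priority;
};

// Orders jobs so that the highest priority ends up on top of the queue.
struct LowerPriority {
    bool operator()(const Job& a, const Job& b) const {
        return a.priority < b.priority;
    }
};

using JobQueue = std::priority_queue<Job, std::vector<Job>, LowerPriority>;

// Removes and returns the highest-priority job.
Job pop_next(JobQueue& queue) {
    assert(!queue.empty());
    Job next = queue.top();
    queue.pop();
    return next;
}
```

Because the comparator is a type here, JobQueue can be default-constructed without passing a function pointer.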
You can use any type for priority if the values of the type can be compared with each other.

Resources