C++ std::unordered_map key custom hashing

I've got the following test.cpp file:
#include <string>
#include <functional>
#include <unordered_map>
#include <iostream>
class Mystuff {
public:
    std::string key1;
    int key2;
public:
    Mystuff(std::string _key1, int _key2)
        : key1(_key1)
        , key2(_key2)
    {}
};
namespace std {
template<>
struct hash<Mystuff *> {
    size_t operator()(Mystuff * const& any) const {
        size_t hashres = std::hash<std::string>()(any->key1);
        hashres ^= std::hash<int>()(any->key2);
        std::cout << "Hash for find/insert is [" << hashres << "]" << std::endl;
        return (hashres);
    }
};
}; /* eof namespace std */
typedef std::unordered_map<Mystuff *, Mystuff *>mystuff_map_t;
mystuff_map_t map;
int insert_if_not_there(Mystuff * stuff) {
    std::cout << "Trying insert for " << stuff->key1 << std::endl;
    if (map.find(stuff) != map.end()) {
        std::cout << "It's there already..." << std::endl;
        return (-1);
    } else {
        map[stuff] = stuff;
        std::cout << "Worked..." << std::endl;
    }
    return (0);
}
int main(){
    Mystuff first("first", 1);
    Mystuff second("second", 2);
    Mystuff third("third", 3);
    Mystuff third_duplicate("third", 3);
    insert_if_not_there(&first);
    insert_if_not_there(&second);
    insert_if_not_there(&third);
    insert_if_not_there(&third_duplicate);
}
You can compile with g++ -o test test.cpp -std=gnu++11.
I don't get what I'm doing wrong: the hash function is definitely working, but for some reason (obviously something I'm doing badly), third_duplicate is inserted into the map as well, and I don't want it to be.
What am I doing wrong?

IIRC unordered containers need operator== as well as std::hash. Without it, I'd expect a compilation error. Except that your key is actually Mystuff* - the pointer, not the value - and pointers already have a built-in operator==, so the code compiles.
That means the duplicate key is stored as a separate item because, to unordered_map, it isn't a duplicate at all: it has a different address, and address equality is how unordered_map judges equality for pointer keys.
Simple solution - use std::unordered_map<Mystuff,Mystuff> instead. You will need to overload operator== for Mystuff (or supply your own equality functor as the map's KeyEqual template parameter, which defaults to std::equal_to<Key>). You'll also need to change your std::hash specialization to accept the value rather than the pointer.
Don't overuse pointers in C++, especially raw pointers. For pass-by-reference, prefer references to pointers (that's the C++-specific meaning of "reference" vs. "pointer"). For containers, the normal default is to store the type directly, though there are cases where you might want a pointer (or a smart pointer) instead.
I haven't thoroughly checked your code - there may be more issues than I caught.

Related

C++: Get state of linear congruential generator

It seems that if I write
#include <random>
std::minstd_rand engine(1);
std::cout << engine;
then this prints out the internal state of the engine (which is a linear congruential generator). Right now the state equals the seed (1), but if I generate a random number and print the engine again, it prints some large number, which is probably the new state.
How do I actually get the state into a variable?

Use a string stream instead of stdout. Example:
#include <sstream>
#include <string>
...
std::ostringstream os;
os << engine;
std::string mystate = os.str();
The o in ostringstream is for output.
For this generator the state is just the last random number generated, which is probably why there is no easier way to get it. It's not as convenient as something like int a; a << engine, but it'll have to do. If you need it often, wrap the stringstream operation in a function (perhaps including a conversion from string to integer). You can also pair the engine with an integer that holds the state, and add a couple of methods so the integer is updated on every generation call if you need the performance.
If you don't care about the current state and only need to know the state from now on, do
int engineState = engine();
Now you have the state. It's not the same as it was before the call, but that might not matter depending on your use case.

The output of a linear congruential RNG is its state. Or, as already noted, use operator<< to write the state out and convert it.
Code:
#include <random>
#include <iostream>
#include <sstream>
int main() {
auto engine = std::minstd_rand{ 1 };
auto q = engine();
auto os = std::ostringstream{};
os << engine;
auto r = std::stoul(os.str()); // use ul to fit output
std::cout << q << " " << os.str() << " " << r << '\n';
return 0;
}
prints
48271 48271 48271
An alternative might apply if a particular implementation implements discard properly in O(log2(N)) time, as described in the paper by F. Brown: https://laws.lanl.gov/vhosts/mcnp.lanl.gov/pdf_files/anl-rn-arb-stride.pdf. In that case you could move one position back, call the RNG again, and get your state as the output.
The compiler and library I use (Visual C++ 2017 15.7) do not implement discard that way, so it is useless for moving back.

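An LCG has a simple state that is represented by a single integer.
On implementations where the engine object stores nothing but that integer, you can treat a pointer to the engine as a pointer to an integer. This relies on the library's object layout, which the standard does not guarantee.
Below, I have provided an example of a template function that gets
the state (seed) of an engine and even works for classes derived from LCGs.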
#include <random>
template <class T, T... v>
T getSeed(std::linear_congruential_engine<T, v...>& rand) {
static_assert(sizeof(rand) == sizeof(T));
return *reinterpret_cast<T*>(&rand);
}
#include <iostream>
int main() {
std::minstd_rand engine(19937);
auto seed = getSeed(engine);
std::cout << sizeof(engine);
std::cout << '\t' << seed;
}
^ This method is far more efficient (about 320 times) than serializing through a stream
or creating a dummy ostream and specializing std::operator<< for every case.
template<class T, T... v>
using LCG = std::linear_congruential_engine<T, v...>;
#define DummyRandSpec32 uint_fast32_t, 0xDEADBEEF, 0xCAFE, 0xFFFFFFFF
typedef LCG<DummyRandSpec32> DummyRand32; // the same engine type
template<class T, class R>
T* getSeed(R& rand) // getSeed 70:1 nextInt
{ // creating stream is heavy operation
// return rand._M_x; // cannot access private
__dummy_ostream<T> dumdum; // workaround
auto& didey = *reinterpret_cast<DummyRand32*>(&rand);
std::operator<<(dumdum, didey); // specialized
return dumdum.retrieve(); // pointer to state
}
int main() {
std::minstd_rand engine(19937);
std::cout << *getSeed<uint_fast32_t>(engine);
std::cout << std::endl << engine << std::endl;
}
^ Here is my poorly coded first attempt at a solution, if you want to compare.
It is worth mentioning that the field name of the state is implementation-specific.
I purposely left out std::operator<< and __dummy_ostream.

What does unique_ptr<T>::operator= do in terms of deallocation

I'm having trouble fully understanding the assignment operator for unique_ptr. I understand that we can only move them, because the copy constructor and copy assignment operator are deleted, but what if a unique_ptr that already owns an allocation is overwritten by a move operation? Is the content previously stored in the smart pointer freed?
#include <iostream>
#include <memory>
class A{
public:
A() = default;
virtual void act() const {
std::cout << "act from A" << std::endl;
}
virtual ~A() {
std::cout << "destroyed A" << std::endl;
}
};
class B : public A {
public:
B() : A{} {}
void act() const override {
std::cout << "act from B" << std::endl;
}
~B() override {
std::cout << "destroyed from B " << std::endl;
}
};
int main() {
    auto pP{std::make_unique<A>()};
    pP->act();
    // ==================== ! =======================
    pP = std::make_unique<B>(); // || std::move(std::make_unique<B>())
    // ==================== ! =======================
    pP->act();
    return 0;
}
When I do
pP = std::make_unique<B>();
does it mean that what was allocated in the first lines for pP (new A()) is destructed automatically?
Or should I opt for:
pP.reset();
pP = std::make_unique<B>();

Yes, see section 20.9.1, paragraph 4 of the C++11 draft standard:
Additionally, u can, upon request, transfer ownership to another unique pointer u2. Upon completion of
such a transfer, the following postconditions hold:
u2.p is equal to the pre-transfer u.p,
u.p is equal to nullptr, and
if the pre-transfer u.d maintained state, such state has been transferred to u2.d.
As in the case of a reset, u2 must properly dispose of its pre-transfer owned object via the pre-transfer
associated deleter before the ownership transfer is considered complete
In other words, it's cleaning up after itself upon assignment like you'd expect.

Yes, replacing the content of a smart pointer will release the previously-held resource. You do not need to call reset() explicitly (nor would anyone expect you to).

A note on this particular example: the polymorphism in it seems to have kept you from drawing clear conclusions from the output:
act from A
destroyed A
act from B
destroyed from B
destroyed A
So let's simplify your example and make it straight to the point:
#include <iostream>
#include <memory>
struct A {
explicit A(int id): id_(id)
{}
~A()
{
std::cout << "destroyed " << id_ << std::endl;
}
int id_;
};
int main() {
std::unique_ptr<A> pP{std::make_unique<A>(1)};
pP = std::make_unique<A>(2);
}
which outputs:
destroyed 1
destroyed 2
I hope this leaves no room for misinterpretation.

Output to logging class via operator<<

I have implemented a logging class TLogFile and now I want to overload the output operator<<.
I want to use the log like this:
TLogFile* log = new TLogFile("some arguments...");
*log << "Hello world."; // (1)
*log << "Hello world." << endl; // (2)
*log << std::hex << setw(2) << setfill('0') << someValue << endl; // (3)
I used ostream as a class member and as a friend. The class looks like this:
namespace app {
class TLogFile
{
public:
app::TLogFile& operator<< (std::string& out);
std::ostream& operator<< (std::ostream& out);
friend std::ostream& operator<< (std::ostream& out, TLogFile& o);
};
} // namespace app
Only plain text (1) works, using the string version. As soon as I use endl (2) or iomanip (3) I get error messages:
../src/main.cpp:164:70: error: no match for 'operator<<' in 'sysdat.app::cSystemData::obj.app::cSystemObjects::applicationLog->app::TLogFile::operator<<((* & std::basic_string(((const char*)"sysdat.obj.applicationLog <<"), ((const std::allocator*)(& std::allocator()))))) << std::endl'
../src/main.cpp:164:70: note: candidates are:
../src/inc/logger.h:85:17: note: app::TLogFile& app::TLogFile::operator<<(const string&)
../src/inc/logger.h:85:17: note: no known conversion for argument 1 from '' to 'const string& {aka const std::basic_string&}'
../src/inc/logger.h:88:17: note: std::ostream& app::TLogFile::operator<<(std::ostream&)
../src/inc/logger.h:88:17: note: no known conversion for argument 1 from '' to 'std::ostream& {aka std::basic_ostream&}'
../src/inc/logger.h:93:23: note: std::ostream& app::operator<<(std::ostream&, app::TLogFile&)
../src/inc/logger.h:93:23: note: no known conversion for argument 1 from 'app::TLogFile' to 'std::ostream& {aka std::basic_ostream&}'
I believed that one of the ostream versions should work.
Does anyone have an idea how to overload the operator so that endl and iomanip can be used?

Your operator<< is able to take only std::ostream& and std::string&
(note: probably it should be const std::string&).
The most elegant solution I can imagine is to write a template:
#include <iostream>
class TLogFile{
protected:
    std::ostream* stream;
public:
    /* default ctor, copy ctor and assignment operator: */
    TLogFile(std::ostream& _stream=std::clog):stream(&_stream){}
    TLogFile (const TLogFile&) =default;
    TLogFile& operator= (const TLogFile&) =default;
    /* std::endl is a function template, so T cannot be deduced for it
     * in the generic overload below.
     * This function handles function pointers, including std::endl
     */
    inline TLogFile& operator<< (std::ostream&(*func)(std::ostream&)){
        (*stream) << func;
        return *this;
    }
    /* should handle everything else */
    template<typename T>
    inline TLogFile& operator<< (const T& t) {
        (*stream) << t;
        return *this;
    }
};
This way your object's operator<< should be able to take anything that std::ostream's can take.
Edit:
Next time, please say that you want a custom std::endl.
I'm not sure that the function with the signature
inline TLogFile& operator<< (std::ostream&(*func)(std::ostream&))
is used only when std::endl is passed to it; other manipulators with the same signature, such as std::flush and std::ends, select it too. My previous solution seems inelegant or even non-working. I'm wondering how to change the behaviour of std::endl when it's passed to an object of a different class.
Notes:
In most cases I'd use '\n' instead of std::endl.
TLogFile* log = new TLogFile("some arguments...");
I think using a raw pointer isn't the best idea here (it's easy to forget the delete),
unless you have to decide explicitly when the object should die.
If the object should die when the current scope ends, make it a local variable:
TLogFile log("some arguments...");
//Usage:
log << "Hello world."; // (1)
log << "Hello world." << endl; // (2)
log << std::hex << setw(2) << setfill('0') << someValue << endl; // (3)
If the object is used in multiple places, and each of the places uses it independently from others, IMO the best solution is to use std::shared_ptr:
#include <memory>
#include <utility>
auto log=std::make_shared<TLogFile>("some arguments...");
//Usage:
*log << "Hello world."; // (1)
*log << "Hello world." << endl; // (2)
*log << std::hex << setw(2) << setfill('0') << someValue << endl; // (3)
This way the object dies when the last shared_ptr does.
I used a pointer in the class to be able to re-assign it. If you don't need re-assignment, you can use a reference instead.

Thanks to GingerPlusPlus. I found out that operator<< (std::ostream&(*func)(std::ostream&)) is called only once for the endl (this assumption may not always hold; please read the remarks/edit by GingerPlusPlus above). I replaced the ostream with a stringstream and write out the contents of the stringstream when that operator is called.
#include <iostream>
#include <sstream>
#include <string>
class TLogFile{
protected:
    std::ostream* stream;
    std::stringstream line;
public:
    /* default ctor; copying is disabled because std::stringstream is not copyable: */
    TLogFile(std::ostream& _stream=std::clog):stream(&_stream){}
    TLogFile (const TLogFile&) =delete;
    TLogFile& operator= (const TLogFile&) =delete;
    void write() {
        // Doing some write stuff
        // ...
        // Empty stringstream buffer
        line.str(std::string());
    }
    /* std::endl is a function template, so T cannot be deduced for it
     * in the generic overload below.
     * This function handles function pointers, including std::endl
     */
    inline TLogFile& operator<< (std::ostream&(*func)(std::ostream&)){
        line << func;
        write();
        return *this;
    }
    /* should handle everything else */
    template<typename T>
    inline TLogFile& operator<< (const T& t) {
        line << t;
        return *this;
    }
};

why does `vector<int> v{{5,6}};` work? I thought only a single pair {} was allowed?

Given a class A with two constructors, taking initializer_list<int> and initializer_list<initializer_list<int>> respectively, then
A v{5,6};
calls the former, and
A v{{5,6}};
calls the latter, as expected. (clang3.3, apparently gcc behaves differently, see the answers. What does the standard require?)
But if I remove the second constructor, then A v{{5,6}}; still compiles and it uses the first constructor. I didn't expect this.
I thought that A v{5,6} would be the only way to access the initializer_list<int> constructor.
(I discovered this while playing around with std::vector and this question I asked on Reddit, but I created my own class A to be sure that it wasn't just a quirk of the interface for std::vector.)

I think this answer might be relevant.
Yes, this behaviour is intended, according to §13.3.1.7, Initialization by list-initialization:
When objects of non-aggregate class type T are list-initialized (8.5.4), overload resolution selects the constructor in two phases:
— Initially, the candidate functions are the initializer-list constructors (8.5.4) of the class T and the argument list consists of
the initializer list as a single argument.
— If no viable initializer-list constructor is found, overload resolution is performed again, where the candidate functions are all
the constructors of the class T and the argument list consists of the
elements of the initializer list.
In gcc I tried your example. I get this error:
error: call of overloaded 'A(<brace-enclosed initializer list>)' is ambiguous
gcc stops complaining if I use three sets of braces, i.e.:
#include <iostream>
#include <vector>
#include <initializer_list>
struct A {
A (std::initializer_list<int> il) {
std::cout << "First." << std::endl;
}
A (std::initializer_list<std::initializer_list<int>> il) {
std::cout << "Second." << std::endl;
}
};
int main()
{
A a{0}; // first
A a1{{0}}; // compile error (ambiguous in gcc)
A a2{{{0}}}; // second
A a3{{{{0}}}}; // second
}
In an attempt to mirror the vector's constructors, here are my results:
#include <iostream>
#include <vector>
#include <initializer_list>
struct A {
A (std::initializer_list<int> il) {
std::cout << "First." << std::endl;
}
explicit A (std::size_t n) {
std::cout << "Second." << std::endl;
}
A (std::size_t n, const int& val) {
std::cout << "Third." << std::endl;
}
A (const A& x) {
std::cout << "Fourth." << std::endl;
}
};
int main()
{
A a{0};
A a2{{0}};
A a3{1,2,3,4};
A a4{{1,2,3,4}};
A a5({1,2,3,4});
A a6(0);
A a7(0, 1);
A a8{0, 1};
}
main.cpp:23:10: warning: braces around scalar initializer
A a2{{0}};
^~~
1 warning generated.
First.
First.
First.
First.
First.
Second.
Third.
First.

Priority of a priority queue always needs to be integral?

I'm just curious whether I can use any other data type for the priority, like strings, floats, etc.?

In the abstract, any type with a reasonable Strict Weak Ordering can be used as the priority in a priority queue. The language you are using will determine how to define this ordering: in C++, operator< is used in standard containers, in Java, the interface Comparable and function compareTo are typically used. Custom comparison functions are also often supported, which can compare elements in a manner different than the default.

No.
The ordering element of a priority queue does not have to be integral.
Yes.
You can use whatever type you want, as long as two values of that type can be compared to determine their inherent ordering.
Basically, you can build a priority queue that uses whatever type you want, even a complex number if you can determine an ordering that makes sense for those.
There is, however, another, unasked, question here, for which the answer is:
Yes, most existing implementations of a priority queue will use an integer as the ordering element as that is the easiest, and most common, value used for this purpose.

Here is a full-blown C++ demo of how to queue SillyJobs, defined as
struct SillyJob
{
std::string description;
std::string priority;
// ...
};
It does so in two ways: using the member operator< (the default) and by passing an explicit comparison predicate to the priority_queue constructor.
Let's see the output up-front:
Silly: (by description length)
LOW: very very long description
HIGH: short
------------------------------------------------------------
Not so silly: (by priority value)
HIGH: short
LOW: very very long description
See it live on http://ideone.com/VEEQa
#include <queue>
#include <algorithm>
#include <functional>
#include <iostream>
#include <string>
#include <map>
struct SillyJob
{
std::string description;
std::string priority;
SillyJob(const std::string& d, const std::string& p)
: description(d), priority(p) { }
bool operator<(const SillyJob& sj) const { return description.size() < sj.description.size(); }
friend std::ostream& operator<<(std::ostream& os, const SillyJob& sj)
{ return os << sj.priority << ": " << sj.description; }
};
static bool by_priority(const SillyJob& a, const SillyJob& b)
{
static std::map<std::string, int> prio_map;
if (prio_map.empty())
{
prio_map["HIGH"] = 3;
prio_map["MEDIUM"] = 2;
prio_map["LOW"] = 1;
}
return prio_map[a.priority] < prio_map[b.priority];
}
int main()
{
std::cout << "Silly: (by description length)" << std::endl;
{
// by description length (member operator<)
std::priority_queue<SillyJob> silly_queue;
silly_queue.push(SillyJob("short", "HIGH"));
silly_queue.push(SillyJob("very very long description", "LOW"));
while (!silly_queue.empty())
{
std::cout << silly_queue.top() << std::endl;
silly_queue.pop();
}
}
std::cout << std::string(60, '-') << "\nNot so silly: (by priority value)" << std::endl;
{
// by priority value (explicit comparison predicate)
typedef bool (*cmpf)(const SillyJob&, const SillyJob&);
typedef std::priority_queue<SillyJob, std::vector<SillyJob>, cmpf> not_so_silly_queue;
not_so_silly_queue queue(by_priority);
queue.push(SillyJob("short", "HIGH"));
queue.push(SillyJob("very very long description", "LOW"));
while (!queue.empty())
{
std::cout << queue.top() << std::endl;
queue.pop();
}
}
}
PS. The by_priority comparison function is quite a good example of bad design, but bear in mind it is for demonstration purposes only :)

You can use any type for priority if the values of the type can be compared with each other.
