Sorted map<CString, CString> possible? - sorting

I have a std::map<CString,CString> which I subsequently iterate after it is populated.
Is it possible to sort this map by the key value? The key is a name. So when I iterate the map I would like the names in A-Z order.

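std::map is a standard C++ container that already keeps its elements sorted by key, so there is no need to sort it after it has been populated; iterating from begin() to end() visits the keys in ascending order. With CString keys the default ordering comes from CString's operator<, which is case-sensitive, so supply a custom comparator if you need a case-insensitive A-Z order. If the same name can occur more than once, use std::multimap, because std::map keeps only one entry per key.
Mixing standard C++ and MFC classes can feel a bit clumsy, but note that MFC's CMap is a hash table and does not iterate in key order, so it is not a substitute when sorted iteration is required.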

Related

Redis stringified array VS array of stringified structs

I need to use Redis from Go as a cache and store an array of structs in it. Since Redis only allows storing lists of strings (correct me if I'm wrong), I'll marshal the items in my array. Should I use a Redis list in which each element is a marshaled struct, or marshal the whole array and store it as a single key<>value pair without using a list? One advantage of the list is that I can fetch a range of items, but scale is not a problem here since I'll be storing fewer than 100 items. What else should I consider?
Thank you!
The answer depends on how you want to use Redis.
For instance, serializing a struct as JSON (or any other serialization format) and storing it under a single key is easy to read and write.
If you need to retrieve or update a single field efficiently, you can store the data in a different layout. However, this scenario is fairly rare and complex to handle.
For instance, you must always write the fields in the same order so that you can calculate the right offset. If you need to add a new field, it will be really difficult to stay 100% backward compatible; you would probably need to create a new type (such as a version 2).
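As a rough illustration of the two layouts from the question, here is a minimal Go sketch that only prepares the strings to be stored. The Item type and its fields are assumptions made for the example, and no Redis client is called: the first result would go into a single key, the second would become the elements of a list.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Item is a hypothetical cached struct; its fields are assumptions.
type Item struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// encodeWhole marshals the entire slice into one string,
// suitable for a single key<>value entry.
func encodeWhole(items []Item) (string, error) {
	b, err := json.Marshal(items)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// encodeEach marshals every item separately, producing one
// string per element, suitable for a list.
func encodeEach(items []Item) ([]string, error) {
	out := make([]string, 0, len(items))
	for _, it := range items {
		b, err := json.Marshal(it)
		if err != nil {
			return nil, err
		}
		out = append(out, string(b))
	}
	return out, nil
}

func main() {
	items := []Item{{ID: 1, Name: "a"}, {ID: 2, Name: "b"}}

	whole, err := encodeWhole(items)
	if err != nil {
		panic(err)
	}
	fmt.Println(whole) // one value for one key

	each, err := encodeEach(items)
	if err != nil {
		panic(err)
	}
	for _, s := range each {
		fmt.Println(s) // one value per list element
	}
}
```

With fewer than 100 items, the single-value layout means one read and one write per update. The per-element layout only pays off if individual items or ranges are read or replaced on their own.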

react-table multisort does not work when providing a custom sort function

Multisort seems to work just fine...as long as you are using built-in sort types (see this example, which loads with the first two columns included in the default sort). As soon as I attempt to use a custom sort function, sorting only seems to take into account the first column specified (see this example, which is identical to the functional first example - although it specifies a custom sort function). I tried looking through the documentation, as well as looking through the source code for the built-in sort types, and I don't see anything different from what I am doing?
(Interestingly enough, the sort icons would make it seem like both columns were sorted; but if you put a breakpoint in the custom sort function, you can clearly see that it never gets called for the second column. Also potentially worth noting is that this only seems to be an issue if both columns use a custom sort. If I alter the second example such that the first column uses one of the built-in sort types - either by manually using one, or just accepting the default - then the multisort appears to function as expected. In the inverse case, using a custom sort for the first column and a built-in sort for the second column, again only the first column specified is actually sorted. I also posted this as a potential bug on the project itself (https://github.com/tannerlinsley/react-table/issues/3512), but am cross-posting it here on the off chance that I just missed something when attempting to implement my custom sort functions.)
From the corresponding project issue:
"This, in fact, seems to be a problem with your custom sort type. You see, in the example you provided, your defaultAlphanumericSort only returns 1 or -1.
However, if you look at the defaultOrderByFn of react-table, you will notice that the secondary (tertiary and so on ... ) sorting is only applied, if the previous sorting functions returned 0. Extend your custom sort type to return 0 on equality and you should be fine."
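The rule described in that reply can be shown outside react-table. The sketch below is written in Go purely as an illustration and is not react-table's code; Row, Comparator and orderBy are hypothetical names. It chains comparators and moves on to the next one only when the current one returns 0, which is why a comparator that returns only 1 or -1 hides every later column.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Row is a hypothetical table row used only for this illustration.
type Row struct{ Last, First string }

// Comparator returns a negative number, zero, or a positive number.
type Comparator func(a, b Row) int

// brokenByLast never returns 0, so equal last names look different
// and a later comparator is never consulted.
func brokenByLast(a, b Row) int {
	if a.Last > b.Last {
		return 1
	}
	return -1
}

// byLast and byFirst return 0 on equality, as a chained sort requires.
func byLast(a, b Row) int  { return strings.Compare(a.Last, b.Last) }
func byFirst(a, b Row) int { return strings.Compare(a.First, b.First) }

// orderBy sorts rows with several comparators, falling through to the
// next comparator only when the current one reports equality (0).
func orderBy(rows []Row, cmps ...Comparator) {
	sort.SliceStable(rows, func(i, j int) bool {
		for _, cmp := range cmps {
			if c := cmp(rows[i], rows[j]); c != 0 {
				return c < 0
			}
		}
		return false
	})
}

func main() {
	rows := []Row{{"Smith", "Zoe"}, {"Smith", "Adam"}, {"Jones", "Bob"}}
	orderBy(rows, byLast, byFirst)
	fmt.Println(rows) // [{Jones Bob} {Smith Adam} {Smith Zoe}]

	// For two rows with the same last name the broken comparator still
	// reports a difference, so byFirst would never get a chance to run.
	fmt.Println(brokenByLast(rows[1], rows[2])) // -1
}
```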

How to produce YAML with keys in insertion order Go?

I'm using Go-Yaml to serialize some maps to YAML. Is there a way to ensure the serialized YAML is written with the keys in the order they were inserted into the Go map? Or will it be necessary to reimplement the Marshal interface myself?
Go maps do not keep track of order of insertion. In order to do this you'd have to implement your own mechanism for reading in the keys and storing the order.
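One way to "store the order" is to keep a slice of keys next to the map. The sketch below uses only the standard library: OrderedMap is a hypothetical helper, not part of Go-Yaml, and its YAML method only handles plain string keys and values that need no quoting. If you are on gopkg.in/yaml.v2, its yaml.MapSlice type is a ready-made alternative that marshals entries in slice order.

```go
package main

import (
	"fmt"
	"strings"
)

// OrderedMap is a hypothetical helper that remembers insertion order
// by keeping a slice of keys alongside the map.
type OrderedMap struct {
	keys   []string
	values map[string]string
}

func NewOrderedMap() *OrderedMap {
	return &OrderedMap{values: make(map[string]string)}
}

// Set stores a value; a key is appended to the order only the first
// time it is seen, so updates keep the original position.
func (m *OrderedMap) Set(key, value string) {
	if _, exists := m.values[key]; !exists {
		m.keys = append(m.keys, key)
	}
	m.values[key] = value
}

// YAML writes "key: value" lines in insertion order.
// Assumption: keys and values are plain scalars that need no quoting.
func (m *OrderedMap) YAML() string {
	var sb strings.Builder
	for _, k := range m.keys {
		sb.WriteString(k)
		sb.WriteString(": ")
		sb.WriteString(m.values[k])
		sb.WriteString("\n")
	}
	return sb.String()
}

func main() {
	m := NewOrderedMap()
	m.Set("zebra", "striped")
	m.Set("apple", "red")
	m.Set("zebra", "black-and-white") // update keeps the first position
	fmt.Print(m.YAML())
}
```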

associate multiple strings to only one

I'm trying to make an algorithm that simplifies and groups synonyms (with mismatches, capitals, acronyms, etc.) into a single form. I suppose there should be a standard way to build a structure that, given a string with possible mismatches, returns a normalized string key if the string exists in the structure. In short, the same concept can sometimes be written in several ways, but I only want to keep the concept.
For instance, suppose I want to normalize or simplify the appearances of
"General Director", "General Manager", "G, Dtor", "Gen Dir", ...
into
"GEN_DIR"
and keep only this result for further reference.
By the way, I suppose that building a Hash with key/value pairs like
hash["General Director"]="GEN_DIR"
hash["General Manager"]="GEN_DIR"
hash["G, Dtor"]="GEN_DIR"
hash["G, Dir"]="GEN_DIR"
could be a solution, but I suspect that there are more elegant or adequate solutions to that.
I would also need a way to persist this associative structure easily without any database, because it should grow as I find more mismatches of the same word or sentence. One possible approach is to define this structure by means of a DSL, but I'm open to suggestions.
Well, there is no rule, at least not a clear one.
My aim is to scrape some "structured" data from the web that is sometimes typed incorrectly or incompletely. Some fields are descriptions and can be left as is. But other fields are supposed to be "sets" and aren't typed correctly (as in my example). A human reading them immediately knows what they mean and can associate each one with its meaning.
I would like to automate as much as possible the process of reducing those mismatches to a single "string" (or symbol) before, for instance, saving it into a database. So what I need is a kind of hash or dictionary, as sawa correctly stated, that I can use to look up any of these dirty strings and get the normalized string or symbol.
It would also be desirable for this hash (or whatever else it could be) to learn from new mismatches and add a new association automatically (possibly based on a distance measure between the mismatched string and the normalized string: if it is lower than X, a new association is built). The whole association (i.e., the hash) should grow as new mismatches and concepts arise, so it should be kept somewhere (possibly in an XML file, or something like what Mori answered below) for future use.
Any new ideas?
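A minimal sketch of that idea in Go, under stated assumptions: Normalizer, canon and the distance threshold are hypothetical names and choices, and Levenshtein edit distance stands in for the unspecified "distance measure". An exact (case- and whitespace-insensitive) lookup is tried first. Otherwise the closest known variant within the threshold is used and the new spelling is remembered.

```go
package main

import (
	"fmt"
	"strings"
)

// canon lower-cases a string and collapses runs of whitespace.
func canon(s string) string {
	return strings.ToLower(strings.Join(strings.Fields(s), " "))
}

func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the edit distance between a and b.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		cur := make([]int, len(rb)+1)
		cur[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(rb)]
}

// Normalizer maps known variants to a normalized key (hypothetical type).
type Normalizer struct {
	aliases map[string]string
	maxDist int // largest edit distance still accepted as a match
}

func NewNormalizer(maxDist int) *Normalizer {
	return &Normalizer{aliases: make(map[string]string), maxDist: maxDist}
}

// Add registers a variant spelling for a normalized key.
func (n *Normalizer) Add(variant, key string) {
	n.aliases[canon(variant)] = key
}

// Lookup returns the normalized key for s. On a fuzzy match it also
// learns s as a new variant, so the table grows over time.
func (n *Normalizer) Lookup(s string) (string, bool) {
	c := canon(s)
	if key, ok := n.aliases[c]; ok {
		return key, true
	}
	best, bestKey, found := n.maxDist+1, "", false
	for alias, key := range n.aliases {
		if d := levenshtein(c, alias); d < best {
			best, bestKey, found = d, key, true
		}
	}
	if !found {
		return "", false
	}
	n.aliases[c] = bestKey
	return bestKey, true
}

func main() {
	n := NewNormalizer(2)
	n.Add("General Director", "GEN_DIR")
	n.Add("G, Dtor", "GEN_DIR")
	fmt.Println(n.Lookup("Genral Director")) // GEN_DIR true
	fmt.Println(n.Lookup("Accountant"))      // not found: prints an empty key and false
}
```

Abbreviations such as "G, Dtor" are too far from the full title for an edit-distance threshold to catch, so they still need explicit entries. Because the table is a plain map of strings, it could be saved to and loaded from a file with any serialization format, for example encoding/json.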

Hash for unordered set?

I am trying to solve a one-way identity problem: a group of authors wants to publish something without revealing their real usernames. Is there an algorithm or library for hashing an unordered set of usernames?
Some people would suggest sorting the set alphabetically, then joining, and finally hashing, but that is not an ideal solution for a dynamically growing array.
Additional questions (not compulsory for the main question):
If such an algorithm exists, can we use the hash to verify whether a username is one of the authors?
If we already know the hash of a group of usernames and a new author is added, can we get a new hash without knowing the previous authors' usernames?
Are you willing to accept a small probability of false positives, that is of names that aren't authors which will be incorrectly identified as authors if anyone checks? (The probability can be made arbitrarily small.)
If you are, then a bloom filter would fit the bill perfectly.
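For illustration, here is a small Bloom filter sketch in Go using only the standard library. The Bloom type, its sizes, and the choice of two FNV hashes combined by double hashing are assumptions made for the example, not a tuned implementation. Names can be added in any order, and a name that was added is never reported as absent. A name that was not added is usually reported as absent, with a small chance of a false positive.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a minimal Bloom filter with m bits and k hash positions.
type Bloom struct {
	bits []bool
	k    int
}

func NewBloom(m, k int) *Bloom {
	return &Bloom{bits: make([]bool, m), k: k}
}

// positions derives k bit positions from two FNV hashes of s
// (double hashing: position i is h1 + i*h2, modulo the bit count).
func (b *Bloom) positions(s string) []int {
	h1 := fnv.New64a()
	h1.Write([]byte(s))
	h2 := fnv.New64()
	h2.Write([]byte(s))
	a, c := h1.Sum64(), h2.Sum64()|1 // force an odd step

	pos := make([]int, b.k)
	for i := range pos {
		pos[i] = int((a + uint64(i)*c) % uint64(len(b.bits)))
	}
	return pos
}

// Add sets the bits for s; the order of additions does not matter.
func (b *Bloom) Add(s string) {
	for _, p := range b.positions(s) {
		b.bits[p] = true
	}
}

// MayContain reports false only if s was definitely never added;
// true means "probably added".
func (b *Bloom) MayContain(s string) bool {
	for _, p := range b.positions(s) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	authors := NewBloom(1024, 3)
	authors.Add("alice")
	authors.Add("bob")
	fmt.Println(authors.MayContain("alice")) // true
	fmt.Println(authors.MayContain("bob"))   // true
}
```

A new author can be added to the published bit array without knowing the earlier usernames, which also covers the second additional question.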
You can always generate a hash, regardless of whether or not you know the other authors' user names. You can't guarantee that it's a unique hash, though.
If you know all the user names in advance, you can generate a minimal perfect hash, but any time you add a user name you'll have to generate a completely new hash table--with different hashes. That's obviously not a good solution.
It depends on what you want your final keys to look like.
One possibility is to assign unique sequential IDs to the user names and then obfuscate those ids so that they don't look like sequential IDs. This is similar to what YouTube does with their IDs--they turn a 64-bit number into an 11-character base64 string. I wrote a little article about that, with code in C#. Check out http://www.informit.com/guides/content.aspx?g=dotnet&seqNum=839.
And, yes, the process is reversible.
It sounds like a single hash won't do you any good. 1. You can't verify that a single username is in the hash; you would need to know all the usernames. 2. You can't add a new user to the hash without knowing something about the unhashed usernames (the order in which you add users to the hash will matter, for all good hash algorithms).
For #2, a partial solution is that you would not keep all the usernames, just keep something like an XOR of all the existing users. When you want to add a new user, XOR it with the existing one and re-hash the result. Then it won't matter which order you added the users in.
But the real solution, I think, is just to have a set of hashes, rather than a hash of a set. Is there a reason you can't do this? Then you can easily keep the set ordered or unordered as you wish, you can easily add users to the set, and easily check to see if a given author is already in the set.
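Both ideas can be sketched in Go with the standard library. The names Digest, Combine, GroupHash and HashSet are hypothetical, and SHA-256 is an arbitrary choice of hash. XOR makes the combined value independent of insertion order, but adding the same name twice cancels it out, and this construction should not be assumed to resist deliberate collisions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Digest is a SHA-256 sized value.
type Digest [sha256.Size]byte

// Combine XORs the hash of name into the running accumulator.
// Only the accumulator is needed to add a user, not earlier names.
func Combine(acc Digest, name string) Digest {
	h := sha256.Sum256([]byte(name))
	for i := range acc {
		acc[i] ^= h[i]
	}
	return acc
}

// GroupHash re-hashes the accumulator to get the published value.
func GroupHash(acc Digest) Digest {
	return sha256.Sum256(acc[:])
}

// HashSet is the "set of hashes" alternative: one digest per user.
type HashSet map[Digest]struct{}

func (s HashSet) Add(name string) {
	s[sha256.Sum256([]byte(name))] = struct{}{}
}

func (s HashSet) Contains(name string) bool {
	_, ok := s[sha256.Sum256([]byte(name))]
	return ok
}

func main() {
	// Hash of a set: the order of Combine calls does not matter.
	var a, b Digest
	for _, n := range []string{"alice", "bob", "carol"} {
		a = Combine(a, n)
	}
	for _, n := range []string{"carol", "alice", "bob"} {
		b = Combine(b, n)
	}
	ha, hb := GroupHash(a), GroupHash(b)
	fmt.Println(ha == hb) // true
	fmt.Println(hex.EncodeToString(ha[:8]))

	// Set of hashes: membership of a single user can be checked.
	set := HashSet{}
	set.Add("alice")
	set.Add("bob")
	fmt.Println(set.Contains("alice"), set.Contains("mallory")) // true false
}
```

The single combined hash cannot answer "is this username an author?", while the set of hashes can, which matches the trade-off described above. Usernames are guessable, so anyone can test a candidate name against either structure.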

Resources