Using Redis to store a structured event log - Ruby

I'm a bit new to Redis, so please forgive me if this is basic.
I'm working on an app that sends automatic replies to users for certain events. I would like to use Redis to store who has received which event.
Essentially, in Ruby, the data structure could look like this: a map of users to events and the dates on which each event was sent.
{
  "mary@example.com" => {
    "sent_comment_reply" => ["12/12/2014", "3/6/2015"],
    "added_post_reply" => ["1/4/2006", "7/1/2016"]
  }
}
What is the best way to represent this in a Redis data structure so you can ask: did Mary get a sent_comment_reply, and if so, when was the latest?
In short, the question is how (if it is possible at all) you can have a hash structure that holds an array in Redis.
The rationale for a hash, as opposed to a set or list with a compound key, is that hash lookups are O(1), whereas reading a list with LRANGE is O(S+N) and reading a set with SMEMBERS is O(N).

One way to structure this in Redis, assuming you know each user's events and want the latest one readily available:
A sorted set per user. Its members are the event codes (sent_comment_reply, added_post_reply), scored so that the most recent event has the highest score. You can use ZRANK, which returns nil for a missing member, to answer the question:
Did Mary get a sent_comment_reply?
A hash per user as well. Each field is an event code such as sent_comment_reply, and its value holds the details of the latest occurrence (body, date, etc.), updated every time the event is sent. This answers the question:
and if so, when was the latest?
Note: Sorted sets are really fast, and in this example the event codes themselves are the members.
With sorted sets you can add, remove, or update elements in a very
fast way (in a time proportional to the logarithm of the number of
elements). Since elements are taken in order and not ordered
afterwards, you can also get ranges by score or by rank (position) in
a very fast way. Accessing the middle of a sorted set is also very
fast, so you can use Sorted Sets as a smart list of non repeating
elements where you can quickly access everything you need: elements in
order, fast existence test, fast access to elements in the middle!
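As a rough sketch of a simpler variation on the idea above: if the score of each member is the Unix timestamp of the latest send, a single ZSCORE lookup answers both questions, and the per-user hash is only needed for extra details. The `events:<user>` key naming, the helper names, and the `InMemorySortedSets` class are all hypothetical; the class is a stand-in that mimics the `zadd(name, mapping)` and `zscore(name, value)` calls of a redis-py style client so the example runs without a server. It assumes events are recorded in chronological order.

```python
import time


class InMemorySortedSets:
    """Hypothetical stand-in for a Redis client; only zadd and zscore are modeled."""

    def __init__(self):
        self._data = {}

    def zadd(self, key, mapping):
        # Like ZADD: insert members or overwrite their scores.
        self._data.setdefault(key, {}).update(mapping)

    def zscore(self, key, member):
        # Like ZSCORE: the member's score, or None if the member is absent.
        return self._data.get(key, {}).get(member)


def record_event(client, user, event, sent_at=None):
    """Store the send time as the score of the event code in the user's sorted set."""
    timestamp = time.time() if sent_at is None else sent_at
    client.zadd(f"events:{user}", {event: timestamp})


def latest_event_time(client, user, event):
    """Return the latest send time, or None if the user never got the event."""
    return client.zscore(f"events:{user}", event)
```

A `None` result means the event was never sent to that user; any other result is the time of the latest send.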

One possible way to represent an array with a hash is as follows:
add_element(key, value):
    len := redis.hlen(key)
    redis.hset(key, len, value)
This maps the element array[i] to field i of the hash stored at key.
This will work for some cases, but I would probably go with the answer suggested in https://stackoverflow.com/a/34886801/2868839
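The pseudocode above could be sketched in Python roughly as follows. `InMemoryHashes` is a hypothetical stand-in that mimics the `hlen`, `hset` and `hget` calls of a redis-py style client so the example runs without a server; with a real client the field comes back as a string as well.

```python
class InMemoryHashes:
    """Hypothetical stand-in for a Redis client; only hlen, hset and hget are modeled."""

    def __init__(self):
        self._data = {}

    def hlen(self, key):
        # Like HLEN: number of fields in the hash (0 if the key is missing).
        return len(self._data.get(key, {}))

    def hset(self, key, field, value):
        # Like HSET: field names are stored as strings.
        self._data.setdefault(key, {})[str(field)] = value

    def hget(self, key, field):
        # Like HGET: the field's value, or None if it does not exist.
        return self._data.get(key, {}).get(str(field))


def add_element(client, key, value):
    """Append value by storing it under the next free integer field."""
    index = client.hlen(key)
    client.hset(key, index, value)
    return index
```

Note that reading the length and writing the field are two separate commands, so two concurrent writers could pick the same index. That is one reason this only works for some cases.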

Related

How to return a viewEntryCollection in random order

I have the following code
var vec:ViewEntryCollection = database.getView("view").getAllEntriesByKey("Mykey",true)
How can I put "vec" in random order using SSJS (or Java) so that I get a new order every time?
How about having a secondary sort column on the view with a formula of @Unique? You would need to refresh the view each time, and performance may not be great if the view is big.
Depending on the average collection size, I would loop through the collection and add each item to a Java list or a JavaScript array.
If you go with Java, you can use Collections.shuffle.
If you go with JavaScript, you can use well-established functions/algorithms.
For better performance, do NOT keep collection entries in memory. First, build a list/array of UNIDs from your view. That will be the slowest part. Then pick the desired number of UNIDs at random from the list/array. Call getDocumentByUnid or initialize (say 10) datasources.
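The "list of UNIDs first, then pick at random" idea is language-independent, so here is a minimal sketch of it in Python rather than SSJS or Java. The function names and the sample UNID strings in the usage are made up for illustration; in the real application the list would be filled from the view, and each picked UNID would then be passed to getDocumentByUnid.

```python
import random


def pick_random_unids(unids, count, rng=random):
    """Return up to `count` distinct UNIDs in random order, leaving `unids` untouched."""
    return rng.sample(unids, min(count, len(unids)))


def shuffled(unids, rng=random):
    """Return a shuffled copy of the whole list (the Collections.shuffle equivalent)."""
    copy = list(unids)
    rng.shuffle(copy)
    return copy
```

For example, `pick_random_unids(["A1", "B2", "C3", "D4"], 2)` returns two of the four strings, in a different order from call to call.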

Hashing table design in C

I have a design issue regarding a hash function.
In my program I am using a hash table of size 2^13, where the slot is calculated from the value of the node (the hash key) that I want to insert.
Now, say each node has two values, |A|B|, but I insert it into the hash table using A.
Later on, I want to search for a particular node by B, not A.
Is it possible to do it that way? If yes, could you highlight some design approaches?
The constraint is that I have to use A as the hash key.
Sorry, I can't share the code. Small example:
Value[] = {Part1, Part2, Part3};
insert(value)
check_for_index(value.part1)
value.part1 to be used to calculate the index of the slot.
Once slot is found then insert the "value"
Later on,
search_in_hash(part2)
check_for_index("But here I need the value.part1 to check for slot index")
So, how can I relate part1, part2 and part3 such that later on I can find the slot by either part2 or part3?
If the problem statement is vague, kindly let me know.
Unless you intend to do a search element-by-element (in which case you don't need a hash, just a plain list), then what you basically ask is - can I have a hash such that hash(X) == hash(Y), but X!=Y, so that you could map to a location using part1 and then map to the same one using part2 or 3. That completely goes against what hashing stands for.
What you should do is (as viraptor also suggested), create 3 structures, each hashed using a different part of the value, and push the full value to all 3. Then when you need to search use the proper hash by the part you want to search by.
For example:
value[] = {part1, part2, part3};
hash1.insert(part1, value)
hash2.insert(part2, value)
hash3.insert(part3, value)
then
hash2.search_in_hash(part2)
or
hash3.search_in_hash(part3)
The above 2 should produce the exact same values.
Also make sure that all data manipulations (removing values, changing them) are done on all 3 structures together. For example:
value = hash2.search_in_hash(part2)
hash1.remove(value.part1)
hash2.remove(part2) // you can assert that part2 == value.part2
hash3.remove(value.part3)
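A minimal sketch of the three-table idea, written in Python for brevity rather than C. The `MultiIndex` class and its method names are hypothetical, and the sketch assumes every part is unique across values, so each table maps one part to one value.

```python
class MultiIndex:
    """Three lookup tables over the same values, one per part of the value."""

    def __init__(self):
        # by_part[i] maps part i of a value to the full value.
        self.by_part = [{}, {}, {}]

    def insert(self, value):
        # Push the full value into all three tables.
        for index, part in zip(self.by_part, value):
            index[part] = value

    def search(self, position, part):
        # Look the value up through the table that matches the part we have.
        return self.by_part[position].get(part)

    def remove(self, position, part):
        # Find the value through one table, then remove it from all three.
        value = self.search(position, part)
        if value is not None:
            for index, own_part in zip(self.by_part, value):
                index.pop(own_part, None)
        return value
```

Keeping insert and remove inside one type makes it harder for the three tables to drift out of sync.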

thrust::sort_by_key: How to store result in separate array?

I am currently sorting values by key the following way
thrust::sort_by_key(thrust::device_ptr<int>(keys),
                    thrust::device_ptr<int>(keys + numKeys),
                    thrust::device_ptr<int>(values));
which sorts the "values" array according to "keys".
Is there a way to leave the "values" array untouched and instead store the result of sorting "values" in a separate array?
Thanks in advance.
There isn't a direct way to do what you are asking. You have two options that achieve the same thing functionally.
The first is to make a copy of the values array before the call, leaving you with a sorted and an unsorted version of the original data. So your example becomes
thrust::device_vector<int> values_sorted(thrust::device_ptr<int>(values),
thrust::device_ptr<int>(values + numKeys));
thrust::sort_by_key(thrust::device_ptr<int>(keys),
thrust::device_ptr<int>(keys + numKeys),
values_sorted.begin());
The second alternative is not to pass the values array to the sort at all. Thrust has a very useful permutation iterator which allows for seamless permuted access to an array without modifying the order in which that array is stored (so an iterator based gather operation, if you will). To do this, create an index vector and sort that by key instead, then instantiate a permutation iterator with that sorted index, something like
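typedef thrust::device_vector<int>::iterator iit;
// index starts out as 0, 1, ..., numKeys-1
thrust::device_vector<int> index(thrust::make_counting_iterator(int(0)),
                                 thrust::make_counting_iterator(int(numKeys)));
// sort the index by key; values is not passed in and stays untouched
thrust::sort_by_key(thrust::device_ptr<int>(keys),
                    thrust::device_ptr<int>(keys + numKeys),
                    index.begin());
// the element iterator type must match the device_ptr passed for values
thrust::permutation_iterator<thrust::device_ptr<int>, iit> perm(thrust::device_ptr<int>(values),
                                                                index.begin());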
Now perm will return values in the key-sorted order held by index without ever changing the order of the original data.
[standard disclaimer: all code written in browser, never compiled or tested. Use at own risk]
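To make the second option concrete, here is the same "sort an index, then gather through it" idea in plain Python. This is a conceptual sketch, not Thrust code: `sorted_gather` is a hypothetical helper, and unlike thrust::sort_by_key it leaves the keys untouched as well.

```python
def sorted_gather(keys, values):
    """Return (values in key order, the sorted index) without modifying either input."""
    # The index plays the role of the counting-iterator vector sorted by key.
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    # Reading values through the index is what the permutation iterator does.
    gathered = [values[i] for i in order]
    return gathered, order
```

The returned index can be reused to gather any other array that shares the same keys.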

Method to determine if keys of a dictionary are in sequence or a range

I'm trying to determine when to remove entries from a SortedDictionary once a sequence is found, i.e. where the keys form a sequence such as 1,2,3,4,5,6,7,8,9,10... etc.
I have:
SortedDictionary<int, string>
It's hard to explain. I'm adding pairs where the key can be any integer value, generally on a random-ish basis. So, the program may add
<2,"jim"> <15,"Jack"> <62,"jill"> and so on.
So when it executes, the dictionary is going to be filled with a sorted list that is not necessarily in sequence, but I want to check whether, say, key values 1..10 are all present in a proper sequence, i.e. 1,2,3,4,5,6,7,8,9,10.
The background is that I've got stuff coming in from a messaging pipe, which is not in order. So it goes into this dictionary, and then on another thread I check the dictionary; if the check returns success for the range I provide, I remove those entries from the dictionary and enqueue them, in order, onto a ConcurrentQueue. Fundamentally it is an unordered-to-ordered exchange.
Any help is appreciated.
Bob.
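If you get the highest and lowest keys, then the count would tell you if you've got a sequence: because the keys are distinct integers, they form an unbroken run exactly when highest - lowest + 1 equals the number of entries.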
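A small sketch of that check, in Python instead of C#. Both function names are hypothetical, and `sorted_keys` is assumed to be an ascending list of distinct integers, which is what the keys of a sorted dictionary give you. The second function applies the same counting argument to a sub-range such as 1..10.

```python
from bisect import bisect_left, bisect_right


def is_contiguous(sorted_keys):
    """True if the keys form an unbroken run such as 4, 5, 6, 7."""
    if not sorted_keys:
        return False
    return sorted_keys[-1] - sorted_keys[0] + 1 == len(sorted_keys)


def range_is_complete(sorted_keys, low, high):
    """True if every integer in low..high is present among the keys."""
    # Count the keys that fall inside [low, high] with two binary searches.
    present = bisect_right(sorted_keys, high) - bisect_left(sorted_keys, low)
    return present == high - low + 1
```

When `range_is_complete(keys, 1, 10)` is true, entries 1 to 10 can be removed and enqueued in order.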

data structure to support lookup based on full key or part of key

I need to be able to look up entries based on the full key or part of the key.
E.g. I might store keys like 10,20,30,40; 11,12,30,40; 12,20,30,40
I want to be able to search for 10,20,30,40 or 20,30,40
What is the best data structure for achieving this, best meaning fastest?
Our programming language is Java. Any pointers to open source projects will be appreciated.
Thanks in advance.
If those were the actual numbers I'd be working with, I'd use an array where a given index contains an array of all records that contain the index. If the actual numbers were larger, I'd use a hash table employed the same way.
So the structure would look like (empty indexes elided, in the case of the array implementation):
10 => ((10,20,30,40)),
11 => ((11,12,30,40)),
12 => ((11,12,30,40), (12,20,30,40)),
20 => ((10,20,30,40), (12,20,30,40)),
30 => ((10,20,30,40), (11,12,30,40), (12,20,30,40)),
40 => ((10,20,30,40), (11,12,30,40), (12,20,30,40)),
It's not clear to me whether your searches are inclusive (OR-based) or exclusive (AND-based), but either way you look up the record groups for each element of the search set; for the inclusive search you find their union, and for the exclusive search you find their intersection.
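The structure above could be sketched as follows, in Python rather than Java. The function names are hypothetical, and records are modeled as tuples so they can be placed in sets. Note that this index ignores the position of each element within the key; it only tracks which records contain it.

```python
from collections import defaultdict


def build_index(records):
    """Map each element to the set of records that contain it."""
    index = defaultdict(set)
    for record in records:
        for element in record:
            index[element].add(record)
    return index


def search_any(index, elements):
    """Inclusive (OR) search: union of the record groups."""
    result = set()
    for element in elements:
        result |= index.get(element, set())
    return result


def search_all(index, elements):
    """Exclusive (AND) search: intersection of the record groups."""
    groups = [index.get(element, set()) for element in elements]
    return set.intersection(*groups) if groups else set()
```

With the three sample keys stored, `search_all(index, [20, 30, 40])` yields the two records that contain 20, 30 and 40.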
Since you seem to care about retrieval time over other concerns (such as space), I suggest you use a hashtable and enter your items several times, once per subkey. So you'd put("10,20,30,40",mydata), then put("20,30,40",mydata) and so on (of course this would be a method; you're not going to call put manually so many times).
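A sketch of such a helper method, in Python rather than Java. It rests on two assumptions that the answer does not spell out: "and so on" is taken to mean every suffix of the key, and each subkey keeps a list of data items, because several full keys can share the same subkey. The name `put_with_suffixes` is hypothetical.

```python
def put_with_suffixes(table, key_parts, data):
    """Store data under the full key and under every shorter suffix of it."""
    for start in range(len(key_parts)):
        subkey = ",".join(str(part) for part in key_parts[start:])
        # Several full keys can share a subkey, so collect the data in a list.
        table.setdefault(subkey, []).append(data)
```

A lookup for "20,30,40" is then a single hashtable access, at the cost of storing each item once per subkey.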
Use a tree structure. Here is an open source project that might help ... written in Java :-)
http://suggesttree.sourceforge.net/
