I need to store strings and associate a unique integer with each one. The integer must be as small as possible. Is that possible in Redis? Basically I need something like SADD, but instead of returning the number of elements in the set, it should return the index of the inserted element (whether newly stored or already existing).
Pseudo code:
// if mystring already exists in myset it returns its index
// otherwise stores it and returns its index.
index := storeOrReturnIndex(myset, mystring)
Would using a hashmap cover what you are looking for?
> HSET hashmap 0 "first string"
(integer) 1
> HSET hashmap 1 "second string"
(integer) 1
> HSET hashmap 2 "third string"
(integer) 1
> HGET hashmap 1
"second string"
> HLEN hashmap
3
You can store the last modified index in a key with:
> SET last_modified 1
Then retrieve it with:
> GET last_modified
You can use the Redis INCR command to atomically acquire a new, unique index.
Pattern: Counter
The counter pattern is the most obvious thing you can do with Redis atomic increment operations. The idea is simply to send an INCR command to Redis every time an operation occurs. For instance, in a web application we may want to know how many page views a user performed on each day of the year.
To do so, the web application may simply increment a key every time the user performs a page view, creating the key name by concatenating the user ID with a string representing the current date.
This simple pattern can be extended in many ways.
So use INCR to get the next unique, smallest index in an atomic way whenever you want to store a new item. Then you can use HSET to store the index associated with your item, and HGET to get the associated index for an item.
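A minimal in-memory sketch of this INCR + HSET/HGET pattern (the dict stands in for the Redis Hash and `itertools.count` for the INCR'd counter key; function and key names are illustrative, not from the original question):

```python
import itertools

# string -> index; in Redis this would be a Hash, and the counter an
# INCR'd key. An HSETNX-style "set only if absent" keeps the index
# stable when the string already exists.
index_of = {}
counter = itertools.count(1)

def store_or_return_index(s):
    if s not in index_of:           # in Redis: check/HSETNX after INCR
        index_of[s] = next(counter)
    return index_of[s]
```

In real Redis you would wrap the check-and-set in a Lua script or MULTI/EXEC so two clients cannot race on the same string.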
Related
I am working on a problem and I need to do the following task.
I want to add pairs (p1,q1),(p2,q2),...,(pn,qn) in such a way that:
(i) A duplicate pair is added only once (like in a set).
(ii) I store a count of how many times each pair was added. For example, the pair (7,2)
will be present in the set only once, but if I add it 3 times the count will be 3.
Which container is efficient for this problem in C++?
A little example would be great!
Please ask if you can't understand my problem, and sorry for my bad English.
How about a std::map<Key, Value> that maps your pairs (the key) to their count (the value)? As you insert, increment the counter.
#include <cstddef>
#include <map>
#include <utility>

// Map each pair (the key) to the number of times it was inserted (the value).
std::map<std::pair<int, int>, std::size_t> pairs_to_count;

std::pair<int, int> p1{7, 2};  // some value
std::pair<int, int> p2{3, 8};  // some other value
pairs_to_count[p1]++;
pairs_to_count[p1]++;
pairs_to_count[p2]++;
pairs_to_count[p2]++;
pairs_to_count[p2]++;
In this code, operator[] will automatically add a key to the map if it does not exist yet, value-initializing the corresponding count to zero. But as you insert, even the first time, that value is incremented.
Already after the first insertion, the count of 1 correctly reflects the number of insertions, and that value gets incremented as you insert more.
Later, retrieving the count is just a matter of calling operator[] again to get the value associated with a given key.
size_t const p2_count = pairs_to_count[p2]; // equals 3
Is there a good way in Redis to get the keys in a hash sorted by their values? I've looked at the documentation and haven't found a straightforward way.
Also, could someone please explain how sorting is achieved in Redis, and what this documentation is trying to say?
I have a very simple hash structure which is something like this:
"salaries" - "employee_1" - "salary_amount"
I'd appreciate a detailed explanation.
You can achieve it by sorting a SET by one of your HASH fields. So you should create an index SET of all of your hash key names, and use the BY option.
You can also use the DESC option to get the results sorted from high to low.
e.g.
localhost:6379> sadd indices h1 h2 h3 h4
(integer) 4
localhost:6379> hset h1 score 3
(integer) 1
localhost:6379> hset h2 score 2
(integer) 1
localhost:6379> hset h3 score 5
(integer) 1
localhost:6379> hset h4 score 1
(integer) 1
localhost:6379> sort indices by *->score
1) "h4"
2) "h2"
3) "h1"
4) "h3"
localhost:6379> sort indices by *->score desc
1) "h3"
2) "h1"
3) "h2"
4) "h4"
From SORT's documentation page:
Returns or stores the elements contained in the list, set or sorted set at key
So you can't really use it to sort the fields by their values in a Hash data structure. To achieve your goal you should either do the sorting in your application's code after getting the Hash's contents or use a Redis-embedded Lua script for that purpose.
Edit: After speaking with @OfirLuzon we realized that there is another, perhaps even preferable, approach, which is to use a more suitable data structure for this purpose. Instead of storing the salaries in a Hash, you should consider using a Sorted Set in which each member is an employee ID and the score is the relevant salary. This gives you ordering, ranges, and paging for "free" :)
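To illustrate the Sorted Set idea, here is a small simulation with a plain dict mapping member to score (the employee names and salaries are made up for the example). In Redis the equivalent would be ZADD salaries <salary> <employee>, then ZRANGE salaries 0 -1 WITHSCORES to read members back ordered by score:

```python
# member -> score; in Redis: ZADD salaries 4000 employee_1, etc.
salaries = {"employee_1": 4000, "employee_2": 2500, "employee_3": 3200}

# Ascending by score, like ZRANGE; reverse=True would mimic ZREVRANGE.
by_salary = sorted(salaries, key=salaries.get)
```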
I want to store key-value pairs(T1,T2) in Redis. Both key and value are unique.
I want to be able to query on both key and value, i.e. HGET(Key) should return corresponding Value and HGET(Value) should return corresponding Key.
A trivial approach would be to create 2 Hashes in Redis (T1,T2) and (T2,T1) and then query on appropriate Hash. Problem with this approach is that insertion, update or deletion of pairs would need updates in both Hashes.
Is there a better way to serve my requirement...
If one of T1, T2 has an integer type you could use a combo like:
1->foo
2->bar
ZADD myset 1 foo
ZADD myset 2 bar
ZSCORE myset foo //returns 1.0 in O(1)
ZSCORE myset bar //returns 2.0 in O(1)
ZRANGEBYSCORE myset 1 1 //returns "foo" in O(log(N)+M)
source
If this is not the case, then it makes sense to maintain 2 separate hashes, preferably updated together within a Lua script so that both stay consistent.
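A minimal in-memory sketch of the two-hash approach: two dicts kept in sync so both key-to-value and value-to-key lookups are O(1). In Redis the same updates would go into two Hashes, ideally inside a single Lua script (or MULTI/EXEC) so they cannot drift apart. The class and method names are illustrative:

```python
class BidirectionalMap:
    def __init__(self):
        self.forward = {}  # T1 -> T2
        self.reverse = {}  # T2 -> T1

    def put(self, key, value):
        # Remove any stale entries first so the two dicts never disagree.
        if key in self.forward:
            del self.reverse[self.forward[key]]
        if value in self.reverse:
            del self.forward[self.reverse[value]]
        self.forward[key] = value
        self.reverse[value] = key

    def get_by_key(self, key):
        return self.forward[key]

    def get_by_value(self, value):
        return self.reverse[value]
```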
What is the best data structure to check if the number of elements of different types of objects is the same?
For example, if I have
2 a's
3 b's
3 c's
The number of elements of the different types of objects is not the same.
If I have
2 a's
2 b's
2 c's
then this is the same.
What is the best data structure that allows you do this in O(1) time and how would you implement it?
One method is to use two dictionaries to be able to do it in O(1) dynamically.
The first maps each type to a count, e.g. {a:2, b:3, c:3}. The second maps each count to the set of types with that count, e.g. {2:{a}, 3:{b,c}}. If the second dictionary has fewer than 2 keys (0 or 1), then all types have the same count; if that were not the case, there would be at least two key-item pairs in that dictionary. This holds as long as both dictionaries are updated whenever a count changes.
Adding a type just means adding it to each dictionary.
Removing a type just means removing it from each dictionary.
Updating a type requires first updating the second dictionary by removing the previous count (obtained from the first dictionary) and adding the current count, after which the first dictionary is updated.
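The two-dictionary structure described above can be sketched in Python like this (class and method names are my own):

```python
from collections import defaultdict

class EqualCountChecker:
    def __init__(self):
        self.count_of = {}                  # type -> count
        self.types_with = defaultdict(set)  # count -> {types with that count}

    def add(self, t):
        old = self.count_of.get(t, 0)
        if old:
            # Move t out of its old count bucket, dropping empty buckets
            # so len(self.types_with) stays meaningful.
            self.types_with[old].discard(t)
            if not self.types_with[old]:
                del self.types_with[old]
        self.count_of[t] = old + 1
        self.types_with[old + 1].add(t)

    def all_equal(self):
        # O(1): all counts are equal exactly when at most one
        # distinct count is in use.
        return len(self.types_with) <= 1
```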
Dictionary<Type, int> typeCounts = new Dictionary<Type, int>();

// read and store type counts
Type type = typeof(A);
if (typeCounts.ContainsKey(type))
{
    typeCounts[type]++;
}
else
{
    typeCounts.Add(type, 1);
}

// populate type counts by type and finally:
if (typeCounts[typeA] == typeCounts[typeB])
{
    // so on
}
I need to find the next available ID (or key) from a fixed list of possible IDs. In this case valid IDs are from 1 to 9999, inclusive. When finding the next available ID we start looking just after the last assigned ID, wrap around at the end - only once, of course - and need to check if each ID is taken before we return it as an available ID.
I have some code that does this but I think it is neither elegant nor efficient and am interested in a simpler way to accomplish the same thing. I'm using Ruby but my question is not specific to the language, so if you'd like to write an answer using any other language I will be just as appreciative of your input!
I have elided some details about checking if an ID is available and such, so just take it as a given that the functions incr_last_id, id_taken?(id), and set_last_id(id) exist. (incr_last_id will add 1 to the last assigned ID in a data store (Redis) and return the result. id_taken?(id) returns a boolean indicating if the ID is available or not. set_last_id(id) updates the data store with the new last ID.)
MaxId = 9999

def next_id
  id = incr_last_id
  # if this ID is taken or out of range, find the next available ID
  if id > MaxId || id_taken?(id)
    id += 1 while id < MaxId && id_taken?(id)
    # wrap around if we've exhausted the ID space
    if id > MaxId
      id = 1
      id += 1 while id < MaxId && id_taken?(id)
    end
    raise NoAvailableIdsError if id > MaxId || id_taken?(id)
    set_last_id(id)
  end
  id
end
I'm not really interested in solutions that require me to build up a list of all possible IDs and then get the set or list difference between the assigned IDs and the available IDs. That doesn't scale. I realize that this is a linear operation no matter how you slice it and that part is fine, I just think the code can be simplified or improved. I don't like the repetition caused by having to wrap around but perhaps there's no way around that.
Is there a better way? Please show me!
Since you've already searched from incr_last_id to MaxId in the first pass, there isn't really a need to cover that range again.
Searching only from 1 up to the starting point on the second pass reduces the search to exactly O(n) instead of a worst case of O(2n).
If you want to do it in a single loop, use modulo arithmetic to wrap around:
MaxId = 9999

def next_id
  id = ((incr_last_id - 1) % MaxId) + 1  # clamp the start into 1..MaxId
  MaxId.times do
    unless id_taken?(id)
      set_last_id(id)
      return id
    end
    id = (id % MaxId) + 1                # advance, wrapping 9999 -> 1
  end
  raise NoAvailableIdsError
end
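The same single-pass, wrap-around scan can be sketched in Python, with the data store replaced by an in-memory set of taken IDs for illustration:

```python
MAX_ID = 9999

def next_id(start, taken):
    """Scan at most MAX_ID candidates beginning at `start`, wrapping
    from MAX_ID back to 1, and return the first free ID."""
    id_ = ((start - 1) % MAX_ID) + 1   # clamp into 1..MAX_ID
    for _ in range(MAX_ID):
        if id_ not in taken:
            return id_
        id_ = (id_ % MAX_ID) + 1       # advance, wrapping 9999 -> 1
    raise RuntimeError("no available IDs")
```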
Using a database table (MySQL in this example):
SELECT id FROM sequences WHERE sequence_name = ? FOR UPDATE
UPDATE sequences SET id = id + 1 WHERE sequence_name = ?
The FOR UPDATE gains an exclusive lock on the selected row, ensuring yours is the only process performing the operation at that moment.
Using an in-memory fixed list:
# somewhere global, done once
@lock = Mutex.new
@ids = (1..9999).to_a

def next_id
  @lock.synchronize { @ids.shift }
end
Using redis:
LPOP list_of_ids
Or just:
INCR some_id
Redis takes care of the concurrency concerns for you.
The usual answer to improve this sort of algorithm is to keep a list of "free objects" handy; you could use just a single object in the list, if you don't really want the extra effort of maintaining a list. (This would reduce the effectiveness of the free object cache, but the overhead of managing a large list of free objects might grow to be a burden. It Depends.)
Because you're wrapping your search around when you've hit MaxId, I presume there is a function give_up_id that returns an ID to the free pool. Instead of simply putting a freed ID back into the big pool, keep track of it with a new variable @most_recently_free or append it to a list @free_ids.
When you need a new id, take one off the list, if the list has one. If the list doesn't have one, begin your search as you currently do.
Here's a sketch in pseudo-code:
def give_up_id(id)
  @free_ids.push(id)
end

def next_id
  if @free_ids.empty?
    old_next_id()
  else
    @free_ids.pop
  end
end
If you allow multiple threads of execution to interact with your id allocation / free routines, you'll need to protect these routines too, of course.