Updating list stored in hash table - scheme

I have an immutable hash table that contains a series of lists as values. I wrote a procedure so I could add an item to one of the lists, returning a new hash:
(define (update hash key item)
  (hash-set hash
            key
            (cons item
                  (hash-ref hash key))))
This seems to work fine, but feels awkward and verbose. Is there a built-in procedure that accomplishes this, or maybe a more idiomatic way of achieving the same thing?

This is as simple as it can get:
(define (update hash key item)
  (hash-update hash key (curry cons item) '()))
Explanation:
hash-update returns a new hash with an updated value for the given key; alternatively, you can use hash-update! to modify a mutable hash in place.
hash and key are self-explanatory.
The third argument is an updater procedure that receives the old value as its parameter. Here it is a procedure that conses the new item onto the old value (which is a list); its result becomes the new value for the given key.
The last argument is the default value used when the key is not found: it is passed to the updater procedure in place of the missing old value.
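For readers who do not use Racket, the same idea (a non-destructive update with a default for missing keys) can be sketched in Python. This is only an analogy, not how Racket implements hash-update; the function name update and the sample data are made up for illustration, and unlike a persistent hash this sketch copies the whole dict on every call.

```python
def update(table, key, item):
    """Return a new dict in which `item` is prepended to the list at `key`."""
    old = table.get(key, [])              # [] plays the role of the '() default
    return {**table, key: [item] + old}   # builds a new dict; `table` is untouched

h1 = {"a": [1]}
h2 = update(h1, "a", 0)   # key present: prepend to the existing list
h3 = update(h2, "b", 9)   # key missing: start from the default empty list
```

The original h1 is unchanged after both calls, which mirrors what the immutable hash in the question gives you.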

Related

How does hashtable read correct values in case of collision?

I have a hashtable. For instance, I have two entities like
john = { 1stname: john, 2ndname: johnson },
eric = { 1stname: eric, 2ndname: ericson }
Then I put them in the hashtable:
ht["john"] = john;
ht["eric"] = eric;
Let's imagine there is a collision and the hashtable uses chaining to resolve it. As a result, there should be a linked list containing these two entities.
How does the hashtable know which entity should be returned for a key? The hash values are the same, and it knows nothing about the structure of the entities. For instance, if I write this: var val = ht["john"]; how does the hashtable (having only the key and its hash) find out that the value should be the john record and not eric?
I think what you are confused about is what is stored at each location in the hashtable's chain (the linked list for a bucket). It seems like you assume that only the value is stored. In fact, the data in each list node is a tuple (key, value).
Once you ask for ht['john'], the hashtable finds the list associated with hash('john') and, if the list is not empty, searches it for the key 'john'. If the key is found as the first element of a tuple, the value (the second element of that tuple) is returned. If the key is not found, the element is not in the hashtable.
To summarize, the key hash is used to quickly identify the cell in which the element should be stored if present. Actual key equality is tested for to decide whether the key exists or not.
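A minimal sketch of this lookup in Python, assuming chaining with plain lists as buckets. The class name ChainedTable, its method names, and the constant hash function used to force a collision are all made up for illustration; real hashtable implementations differ in many details.

```python
class ChainedTable:
    """Toy hash table with chaining; every bucket holds (key, value) tuples."""

    def __init__(self, size=8, hash_fn=hash):
        self.size = size
        self.hash_fn = hash_fn
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, key):
        # The hash only selects the bucket; it does not identify the entry.
        return self.buckets[self.hash_fn(key) % self.size]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                  # same key: overwrite its value
                bucket[i] = (key, value)
                return
        bucket.append((key, value))       # new key: extend the chain

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:                  # key equality decides which entry wins
                return v
        raise KeyError(key)

# A constant hash function forces "john" and "eric" into the same bucket.
ht = ChainedTable(hash_fn=lambda key: 0)
ht.put("john", {"first": "john", "last": "johnson"})
ht.put("eric", {"first": "eric", "last": "ericson"})
```

Both records end up in the same chain, and get still returns the right one because it compares the stored keys, not the hashes.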
Is this what you are asking about? I already put this in the comments, but it seems you did not follow the link.
Collision Resolution in the Hashtable Class
Recall that when inserting an item into or retrieving an item from a hash table, a collision can occur. When inserting an item, an open slot must be found. When retrieving an item, the actual item must be found if it is not in the expected location. Earlier we briefly examined two collision resolution strategies:
Linear probing
Quadratic probing
The Hashtable class uses a different technique referred to as rehashing. (Some sources refer to rehashing as double hashing.)
Rehashing works as follows: there is a set of different hash functions, H1 ... Hn, and when inserting or retrieving an item from the hash table, initially the H1 hash function is used. If this leads to a collision, H2 is tried instead, and onwards up to Hn if needed. The previous section showed only one hash function, which is the initial hash function (H1). The other hash functions are very similar to this function, differing only by a multiplicative factor. In general, the hash function Hk is defined as:
Hk(key) = [GetHash(key) + k * (1 + (((GetHash(key) >> 5) + 1) % (hashsize - 1)))] % hashsize
Mathematical Note: With rehashing it is important that each slot in the hash table is visited exactly once when hashsize number of probes are made. That is, for a given key you don't want Hi and Hj to hash to the same slot in the hash table. With the rehashing formula used by the Hashtable class, this property is maintained if the result of (1 + (((GetHash(key) >> 5) + 1) % (hashsize - 1))) and hashsize are relatively prime. (Two numbers are relatively prime if they share no common factors.) These two numbers are guaranteed to be relatively prime if hashsize is a prime number.
Rehashing provides better collision avoidance than either linear or quadratic probing.
sources here
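To see the relatively-prime property from the quoted note in action, here is a small Python sketch that evaluates the Hk formula for a fixed hash value. The function name probe_sequence is made up, the integer h stands in for GetHash(key), and k is assumed to run from 0 to hashsize - 1 (the quoted text does not pin down the starting index).

```python
def probe_sequence(h, hashsize):
    """Slots given by the quoted Hk formula for k = 0 .. hashsize - 1.

    `h` stands in for GetHash(key); any non-negative integer will do.
    """
    step = 1 + (((h >> 5) + 1) % (hashsize - 1))   # the multiplicative factor
    return [(h + k * step) % hashsize for k in range(hashsize)]

# Prime table size: the step is always relatively prime to hashsize,
# so every slot is visited exactly once.
prime_slots = probe_sequence(12345, 11)

# Non-prime table size: here the step is 2, which shares a factor with 8,
# so the sequence revisits slots and never reaches the odd ones.
composite_slots = probe_sequence(0, 8)
```

With hashsize 11 the sequence is a permutation of all slots, while with hashsize 8 and this particular hash it cycles through only half of them.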

storing values with keys then searching the smartest way, ALGORITHMS

I've got a stream of more than 20 million values, each of which comes with its corresponding key (more than 10 million keys). Each key is linked to one or more values (at most 50000), for example:
... (key1, val1), (key2,val2), (key1, val3), (key2, val4), (key1, val6), (key3,val5)...
I store this stream as follows:
key1 : val1, val3, val6
key2 : val2, val4
key3 : val5
Each time I receive a new value in the stream, I first check whether it already appears in the list of its corresponding key:
If it does not, I add the value at the end of the list.
If the value is already in the list, in the last position, I do nothing.
Finally, if the value is already in the list but not in the last position, I raise a flag.
My question is: what is the most efficient data structure or tool for this process (I want to raise the flag as fast as possible)? I thought of a hash table whose values are linked lists (as in the example above), but scanning the whole linked list each time I add a value does not sound right. Note that I do need this notion of the LAST value.
Thank you
Checking if the new value is in the list is not optimal - it takes O(n) time to check.
You can use a hashtable instead. You can store the last value separately and update it on insert.
So you have a hashtable, where the values are pairs. Each pair consists of a hashtable (used as a set) and an element (the last element in the set).
Your example looks like this:
(key1 -> (val6, (val1->1, val3->1, val6->1)))
(key2 -> (val4, (val2->1, val4->1)))
(key3 -> (val5, (val5->1)))
You can optimize the cases when the set only contains one element, by not storing the last value explicitly.
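A minimal Python sketch of this layout, assuming a dict of per-key records where each record holds a set of seen values plus the last value. The names KeyState and process and the returned status strings are made up for illustration, and the sketch assumes that raising the flag leaves the stored data unchanged, which the question does not specify.

```python
class KeyState:
    """Per-key record: the set of values seen so far and the last one added."""
    __slots__ = ("seen", "last")

    def __init__(self):
        self.seen = set()
        self.last = None

def process(store, key, value):
    """Handle one (key, value) pair from the stream in O(1) expected time."""
    state = store.get(key)
    if state is None:
        state = store[key] = KeyState()
    if value not in state.seen:       # new value: append and remember it as last
        state.seen.add(value)
        state.last = value
        return "added"
    if value == state.last:           # already present in the last position
        return "unchanged"
    return "flag"                     # present, but not in the last position

store = {}
stream = [("key1", "val1"), ("key2", "val2"), ("key1", "val3"),
          ("key1", "val3"), ("key1", "val1")]
results = [process(store, k, v) for k, v in stream]
```

Each incoming pair costs one dict lookup and one set lookup, so no list ever has to be scanned.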

Using cons, list, append in Scheme

I need to write code that takes an element, adds it to a list given as input, and returns a new list instead of the old one. After that I will recurse, and I need the new list. The code below works fine, but I am trying to get rid of every set!, because it confuses me and sometimes I get errors that I cannot solve.
How can I do this operation without set!? I tried just cons, list and append, but none of them does the job.
(set! list (cons element list))
Thank you..
Just (cons element list) is enough.
Your code alters the contents of the variable list. In functional style we normally avoid that; if you really do want to change the variable, set! is the only way, as you did.
But to just return the new list, which has the new element at its front, the call (cons element list) is enough:
...
(let ((newlist (cons element oldlist)))
.....
..... use newlist and oldlist as needed

checking if all values of a hash are correct values (from a predefined value set) in ruby

I have a hash table with multiple values being passed to a function. I don't know the names of the keys, but I know that each value must be one of the characters A, S or X.
How can I easily check that all values in the hash table are equal to one of those characters?
NullUserException's answer is good; you could also do:
match_values = %w(A S X)
hash.values.all? { |value| match_values.include?(value) }

Can you have hash tables in lisp?

Can you have hash tables or dicts in Lisp? I mean the data structure that is a collection of (key, value) pairs, where values can be accessed using keys.
Common Lisp has at least four different ways to do that (key value storage):
property lists (:foo 1 :bar 2)
assoc lists ((:foo . 1) (:bar . 2))
hash tables
CLOS objects (slot-value foo 'bar) to get and (setf (slot-value foo 'bar) 42) to set. The slot name can be stored in a variable: (let ((name 'bar)) (slot-value foo name)) .
For simple usage assoc lists or property lists are fine. With a larger number of elements they tend to get 'slow'. Hash tables are 'faster' but have their own tradeoffs. CLOS objects are used like in many other object systems. The keys are the slot-names defined in a CLOS class. Though it is possible to program variants that can add and remove slots on access.
Of course - Common Lisp has hash tables.
(setq a (make-hash-table))
(setf (gethash 'color a) 'brown)
(setf (gethash 'name a) 'fred)
(gethash 'color a) => brown
(gethash 'name a) => fred
(gethash 'pointy a) => nil
Property lists are good for very small, demonstrative examples, but for any real need their performance is abysmal, so use hash tables.
If you're referring to Common Lisp, hash tables are provided by a type called hash-table.
Using these tables involves creating one with function make-hash-table, reading values with gethash, setting them by using gethash as a place in concert with setf, and removing entries with remhash.
The mapping from key value to hash code is available outside of hash tables with the function sxhash.
Clojure has a built-in map type:
user=> (def m {:foo "bar" :baz "bla"})
#'user/m
user=> (m :foo)
"bar"
See http://clojure.org/data_structures
Sure. Here's the SRFI defining the standard hash table libraries in Scheme:
http://srfi.schemers.org/srfi-69/srfi-69.html
There are built-in hash tables that use a system hash function (typically SXHASH) and let you choose among a few equality checkers (EQ, EQL, EQUAL or EQUALP, depending on what you consider to be "the same" key).
If the built-in hash tables are not good enough, there's also a generic hash table library. It will accept any pair of "hash generator"/"key comparator" and build you a hash table. However, it relies on having a good hash function to work well and that is not necessarily trivial to write.
In Lisp it's usually called a property list.

Resources