Is Guava's HashFunction threadsafe? - thread-safety

Is HashFunction in Guava library Threadsafe?
static HashFunction hashFunction = Hashing.sha256();
private static String getHashCredentials(String String) {
return hashFunction.newHasher()
.putString(String, Charsets.UTF_8).hash()
.toString();
}

Yes, if you're using built-in HashFunctions, they're pure function -- see the documentation page for HashFunction:
A hash function is a collision-averse pure function that maps an
arbitrary block of data to a number called a hash code.
Unpacking this definition:
(...)
pure function: the value produced must depend only on the input bytes, in the order they appear. Input data is never modified.
HashFunction instances should always be stateless, and therefore
thread-safe.
Bear in mind that because HashFunction is an interface, you could create stateful and non-thread-safe implementation, but it would break the contract.

Related

Removing a std::function<()> from a vector c++

I'm building a publish-subscribe class (called SystermInterface), which is responsible to receive updates from its instances, and publish them to subscribers.
Adding a subscriber callback function is trivial and has no issues, but removing it yields an error, because std::function<()> is not comparable in C++.
std::vector<std::function<void()> subs;
void subscribe(std::function<void()> f)
{
subs.push_back(f);
}
void unsubscribe(std::function<void()> f)
{
std::remove(subs.begin(), subs.end(), f); // Error
}
I've came down to five solutions to this error:
Registering the function using a weak_ptr, where the subscriber must keep the returned shared_ptr alive.
Solution example at this link.
Instead of registering at a vector, map the callback function by a custom key, unique per callback function.
Solution example at this link
Using vector of function pointers. Example
Make the callback function comparable by utilizing the address.
Use an interface class (parent class) to call a virtual function.
In my design, all intended classes inherits a parent class called
ServiceCore, So instead of registering a callback function, just
register ServiceCore reference in the vector.
Given that the SystemInterface class has a field attribute per instance (ID) (Which is managed by ServiceCore, and supplied to SystemInterface by constructing a ServiceCore child instance).
To my perspective, the first solution is neat and would work, but it requires handling at subscribers, which is something I don't really prefer.
The second solution would make my implementation more complex, where my implementation looks as:
using namespace std;
enum INFO_SUB_IMPORTANCE : uint8_t
{
INFO_SUB_PRIMARY, // Only gets the important updates.
INFO_SUB_COMPLEMENTARY, // Gets more.
INFO_SUB_ALL // Gets all updates
};
using CBF = function<void(string,string)>;
using INFO_SUBTREE = map<INFO_SUB_IMPORTANCE, vector<CBF>>;
using REQINF_SUBS = map<string, INFO_SUBTREE>; // It's keyed by an iterator, explaining it goes out of the question scope.
using INFSRC_SUBS = map<string, INFO_SUBTREE>;
using WILD_SUBS = INFO_SUBTREE;
REQINF_SUBS infoSubrs;
INFSRC_SUBS sourceSubrs;
WILD_SUBS wildSubrs;
void subscribeInfo(string info, INFO_SUB_IMPORTANCE imp, CBF f) {
infoSubrs[info][imp].push_back(f);
}
void subscribeSource(string source, INFO_SUB_IMPORTANCE imp, CBF f) {
sourceSubrs[source][imp].push_back(f);
}
void subscribeWild(INFO_SUB_IMPORTANCE imp, CBF f) {
wildSubrs[imp].push_back(f);
}
The second solution would require INFO_SUBTREE to be an extended map, but can be keyed by an ID:
using KEY_T = uint32_t; // or string...
using INFO_SUBTREE = map<INFO_SUB_IMPORTANCE, map<KEY_T,CBF>>;
For the third solution, I'm not aware of the limitations given by using function pointers, and the consequences of the fourth solution.
The Fifth solution would eliminate the purpose of dealing with CBFs, but it'll be more complex at subscriber-side, where a subscriber is required to override the virtual function and so receives all updates at one place, in which further requires filteration of the message id and so direct the payload to the intended routines using multiple if/else blocks, which will increase by increasing subscriptions.
What I'm looking for is an advice for the best available option.
Regarding your proposed solutions:
That would work. It can be made easy for the caller: have subscribe() create the shared_ptr and corresponding weak_ptr objects, and let it return the shared_ptr.
Then the caller must not lose the key. In a way this is similar to the above.
This of course is less generic, and then you can no longer have (the equivalent of) captures.
You can't: there is no way to get the address of the function stored inside a std::function. You can do &f inside subscribe() but that will only give you the address of the local variable f, which will go out of scope as soon as you return.
That works, and is in a way similar to 1 and 2, although now the "key" is provided by the caller.
Options 1, 2 and 5 are similar in that there is some other data stored in subs that refers to the actual std::function: either a std::shared_ptr, a key or a pointer to a base class. I'll present option 6 here, which is kind of similar in spirit but avoids storing any extra data:
Store a std::function<void()> directly, and return the index in the vector where it was stored. When removing an item, don't std::remove() it, but just set it to std::nullptr. Next time subscribe() is called, it checks if there is an empty element in the vector and reuses it:
std::vector<std::function<void()> subs;
std::size_t subscribe(std::function<void()> f) {
if (auto it = std::find(subs.begin(), subs.end(), std::nullptr); it != subs.end()) {
*it = f;
return std::distance(subs.begin(), it);
} else {
subs.push_back(f);
return subs.size() - 1;
}
}
void unsubscribe(std::size_t index) {
subs[index] = std::nullptr;
}
The code that actually calls the functions stored in subs must now of course first check against std::nullptrs. The above works because std::nullptr is treated as the "empty" function, and there is an operator==() overload that can check a std::function against std::nullptr, thus making std::find() work.
One drawback of option 6 as shown above is that a std::size_t is a rather generic type. To make it safer, you might wrap it in a class SubscriptionHandle or something like that.
As for the best solution: option 1 is quite heavy-weight. Options 2 and 5 are very reasonable, but 6 is, I think, the most efficient.

JAVA 8 Extract predicates as fields or methods?

What is the cleaner way of extracting predicates which will have multiple uses. Methods or Class fields?
The two examples:
1.Class Field
void someMethod() {
IntStream.range(1, 100)
.filter(isOverFifty)
.forEach(System.out::println);
}
private IntPredicate isOverFifty = number -> number > 50;
2.Method
void someMethod() {
IntStream.range(1, 100)
.filter(isOverFifty())
.forEach(System.out::println);
}
private IntPredicate isOverFifty() {
return number -> number > 50;
}
For me, the field way looks a little bit nicer, but is this the right way? I have my doubts.
Generally you cache things that are expensive to create and these stateless lambdas are not. A stateless lambda will have a single instance created for the entire pipeline (under the current implementation). The first invocation is the most expensive one - the underlying Predicate implementation class will be created and linked; but this happens only once for both stateless and stateful lambdas.
A stateful lambda will use a different instance for each element and it might make sense to cache those, but your example is stateless, so I would not.
If you still want that (for reading purposes I assume), I would do it in a class Predicates let's assume. It would be re-usable across different classes as well, something like this:
public final class Predicates {
private Predicates(){
}
public static IntPredicate isOverFifty() {
return number -> number > 50;
}
}
You should also notice that the usage of Predicates.isOverFifty inside a Stream and x -> x > 50 while semantically the same, will have different memory usages.
In the first case, only a single instance (and class) will be created and served to all clients; while the second (x -> x > 50) will create not only a different instance, but also a different class for each of it's clients (think the same expression used in different places inside your application). This happens because the linkage happens per CallSite - and in the second case the CallSite is always different.
But that is something you should not rely on (and probably even consider) - these Objects and classes are fast to build and fast to remove by the GC - whatever fits your needs - use that.
To answer, it's better If you expand those lambda expressions for old fashioned Java. You can see now, these are two ways we used in our codes. So, the answer is, it all depends how you write a particular code segment.
private IntPredicate isOverFifty = new IntPredicate<Integer>(){
public void test(number){
return number > 50;
}
};
private IntPredicate isOverFifty() {
return new IntPredicate<Integer>(){
public void test(number){
return number > 50;
}
};
}
1) For field case you will have always allocated predicate for each new your object. Not a big deal if you have a few instances, likes, service. But if this is a value object which can be N, this is not good solution. Also keep in mind that someMethod() may not be called at all. One of possible solution is to make predicate as static field.
2) For method case you will create the predicate once every time for someMethod() call. After GC will discard it.

Storing handles to objects in a hashmap or set in Google's V8 engine

I would like to implement this functionality in an embedded JavaScript application that uses v8 engine.
function myFunction1() {
//do stuff
}
function myFunction2() {
//do other stuff
}
myAddon.addCallback(myFunction1);
myAddon.addCallback(myFunction2);
myAddon.removeCallback(myFunction1);
In order to do this I need to store these functions in a std::set like so
void addCallback(const v8::FunctionCallbackInfo<v8::Value>& args) {
v8::HandleScope scope(args.GetIsolate());
v8::Local<v8::Function> cb = v8::Local<v8::Function>::Cast(args[0]);
std::set mySet = this->mySet;
//now how do I insert a reference to this function into mySet so I can retrieve
//it later
}
void removeCallback(const v8::FunctionCallbackInfo<v8::Value>& args) {
v8::HandleScope scope(args.GetIsolate());
v8::Local<v8::Function> cb = v8::Local<v8::Function>::Cast(args[0]);
std::set mySet = this->mySet;
//now how do I remove the element in this set that refers to this function?
}
How does one go about doing this? I don't want to use v8::Object::GetIdentityHash() because the result is not guaranteed to be unique.
I also can't just store the Local in the std::set because the copy constructor is private and it would also get descoped once removeCallback or addCallback return.
Thanks for any help in advance.
Edit: I realize I could write some javascript to do the function hashing for me, and then call one C++ binded function to iteration through all the callbacks, but I'd rather not do this every time I need to store sets or hashes of JavaScript objects.
This is correct that you can't safely store Local<T> handle, because when it gets out of scope, your function object may become available to garbage collection. What you need is a persistent handle. You can construct it out of local like:
v8::Local<v8::Function> cb = v8::Local<v8::Function>::Cast(args[0]);
v8::Persistent<v8::Function, v8::CopyablePersistentTraits<v8::Function>> value(isolate, cb);
Note CopyablePersistentTraits which allows handle copying. There is also NonCopyablePersistentTraits if you would like to prevent that.
Now you can put it in a vector:
std::vector<v8::Persistent<v8::Function, v8::CopyablePersistentTraits<v8::Function>>> v;
v.push_back(value);
Convert back to local:
v8::Local<v8::Function> local = v8::Local<v8::Function>::New(isolate, value);
For std::set you also need to provide comparison function for elements. It also might be a good idea to wrap v8::Persistent<T> into your own class like PersistentWrapper<T> (this is what I am doing in my project) to get the desired behavior.

Avoiding duplicate code when performing operation on different object properties

I have recently run into a problem which has had me thinking in circles. Assume that I have an object of type O with properties O.A and O.B. Also assume that I have a collection of instances of type O, where O.A and O.B are defined for each instance.
Now assume that I need to perform some operation (like sorting) on a collection of O instances using either O.A or O.B, but not both at any given time. My original solution is as follows.
Example -- just for demonstration, not production code:
public class O {
int A;
int B;
}
public static class Utils {
public static void SortByA (O[] collection) {
// Sort the objects in the collection using O.A as the key. Note: this is custom sorting logic, so it is not simply a one-line call to a built-in sort method.
}
public static void SortByB (O[] collection) {
// Sort the objects in the collection using O.B as the key. Same logic as above.
}
}
What I would love to do is this...
public static void SortAgnostic (O[] collection, FieldRepresentation x /* some non-bool, non-int variable representing whether to chose O.A or O.B as the sorting key */) {
// Sort by whatever "x" represents...
}
... but creating a new, highly-specific type that I will have to maintain just to avoid duplicating a few lines of code seems unnecessary to me. Perhaps I am incorrect on that (and I am sure someone will correct me if that statement is wrong :D), but that is my current thought nonetheless.
Question: What is the best way to implement this method? The logic that I have to implement is difficult to break down into smaller methods, as it is already fairly optimized. At the root of the issue is the fact that I need to perform the same operation using different properties of an object. I would like to stay away from using codes/flags/etc. in the method signature if possible so that the solution can be as robust as possible.
Note: When answering this question, please approach it from an algorithmic point of view. I am aware that some language-specific features may be suitable alternatives, but I have encountered this problem before and would like to understand it from a relatively language-agnostic viewpoint. Also, please do not constrain responses to sorting solutions only, as I have only chosen it as an example. The real question is how to avoid code duplication when performing an identical operation on two different properties of an object.
"The real question is how to avoid code duplication when performing an identical operation on two different properties of an object."
This is a very good question as this situation arises all the time. I think, one of the best ways to deal with this situation is to use the following pattern.
public class O {
int A;
int B;
}
public doOperationX1() {
doOperationX(something to indicate which property to use);
}
public doOperationX2() {
doOperationX(something to indicate which property to use);
}
private doOperationX(input ) {
// actual work is done here
}
In this pattern, the actual implementation is performed in a private method, which is called by public methods, with some extra information. For example, in this case, it can be
doOperationX(A), or doOperationX(B), or something like that.
My Reasoning: In my opinion this pattern is optimal as it achieves two main requirements:
It keeps the public interface descriptive and clear, as it keeps operations separate, and avoids flags etc that you also mentioned in your post. This is good for the client.
From the implementation perspective, it prevents duplication, as it is in one place. This is good for the development.
A simple way to approach this I think is to internalize the behavior of choosing the sort field to the class O itself. This way the solution can be language-agnostic.
The implementation in Java could be using an Abstract class for O, where the purpose of the abstract method getSortField() would be to return the field to sort by. All that the invocation logic would need to do is to implement the abstract method to return the desired field.
O o = new O() {
public int getSortField() {
return A;
}
};
The problem might be reduced to obtaining the value of the specified field from the given object so it can be use for sorting purposes, or,
TField getValue(TEntity entity, string fieldName)
{
// Return value of field "A" from entity,
// implementation depends on language of choice, possibly with
// some sort of reflection support
}
This method can be used to substitute comparisons within the sorting algorithm,
if (getValue(o[i], "A")) > getValue(o[j], "A"))
{
swap(i, j);
}
The field name can then be parametrized, as,
public static void SortAgnostic (O[] collection, string fieldName)
{
if (getValue(collection[i], fieldName)) > getValue(collection[j], fieldName))
{
swap(i, j);
}
...
}
which you can use like SortAgnostic(collection, "A").
Some languages allow you to express the field in a more elegant way,
public static void SortAgnostic (O[] collection, Expression fieldExpression)
{
if (getValue(collection[i], fieldExpression)) >
getValue(collection[j], fieldExpression))
{
swap(i, j);
}
...
}
which you can use like SortAgnostic(collection, entity => entity.A).
And yet another option can be passing a pointer to a function which will return the value of the field needed,
public static void SortAgnostic (O[] collection, Function getValue)
{
if (getValue(collection[i])) > getValue(collection[j]))
{
swap(i, j);
}
...
}
which given a function,
TField getValueOfA(TEntity entity)
{
return entity.A;
}
and passing it like SortAgnostic(collection, getValueOfA).
"... but creating a new, highly-specific type that I will have to maintain just to avoid duplicating a few lines of code seems unnecessary to me"
That is why you should use available tools like frameworks or other typo of code libraries that provide you requested solution.
When some mechanism is common that mean it can be moved to higher level of abstraction. When you can not find proper solution try to create own one. Think about the result of operation as not part of class functionality. The sorting is only a feature, that why it should not be part of your class from the beginning. Try to keep class as simple as possible.
Do not worry premature about the sense of having something small just because it is small. Focus on the final usage of it. If you use very often one type of sorting just create a definition of it to reuse it. You do not have to necessary create a utill class and then call it. Sometimes the base functionality enclosed in utill class is fair enough.
I assume that you use Java:
In your case the wheal was already implemented in person of Collection#sort(List, Comparator).
To full fill it you could create a Enum type that implement Comparator interface with predefined sorting types.

Class: Immutability vs Not Extensible

I was reading that there are many reasons for making a class final in SO threads and also in an arcticle
Two of which were
1. To remove extensibility
2. to make class immutable.
Does making a class immutable have the characteristic along with it as being final ( it's methods )? I don't see the difference between the two?
Immutable object does not allow to change his state. Final class does not allow to inherit itself. For example class Foo (see below) is immutable (the state, ie _name is never changed ) and class Bar is mutable (rename method allows to change the state):
final class Foo
{
private String _name;
public Foo(string name)
{
_name = name;
}
public String getName()
{
return _name;
}
}
final class Bar
{
private String _name;
public Bar(string name)
{
_name = name;
}
public String getName()
{
return _name;
}
public void rename(string newName)
{
_name = newName;
}
}
It can sometimes be useful to recognize types as "verifiably deeply immutable", meaning that static analysis can demonstrate that (1) once an instance is constructed, none of its properties will ever change, and (2) every object instance to which it holds a reference is verifiably deeply immutable. Classes which are open to extension cannot be verifiably deeply immutable, because a static analyzer would have no way of knowing whether a mutable subclass might be created, and a reference to that mutable subclass stored within what's supposed to be a verifiably deeply immutable object.
On the other hand, it can sometimes be useful to have abstract (and thus extensible) classes which are specified to be deeply immutable. The abstract class would have no way of forcing derived classes to immutable, but any mutable derived classes should be considered "broken". The situation would be somewhat analogous to the requirement that two object instances which report themselves as "equal" to each other should report the same hash code. It's possible to design classes which violate that requirement, but any errant hash-table behavior that results is the fault of the broken hash-code function, rather than the hash table.
For example, one might have an abstract ImmutableMatrix property with a method to read the element at a given (row,column) location. One possible implementation would be to back an NxM ImmutableMatrix with an array of N*M elements. On the other hand, it may also be useful to define some subclasses like ImmutableDiagonalMatrix, with an array of N elements, where Value(R,C) would yield 0 for R!=C, and Arr[R] for R==C. If a significant fraction of the arrays one is using will be diagonal arrays, one could save a lot of memory for each such instance. While leaving the class extensible would leave open the possibility that someone might extend it in a fashion which is open to mutation, it would also leave open the possibility that a programmer who knew that many of the arrays a program used would fit some particular form could design a class to optimally store that form.

Resources