LevelDB thread safety: reading in one thread, writing through another

I have created one DB instance and would like to use it from different threads: one for reading through an iterator and another for writing. Would that be thread-safe or do I need to protect the object with locks?

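Yes, that is safe. According to the LevelDB documentation, a single leveldb::DB object can be shared by multiple threads without any external synchronization. What is not safe is using the same iterator from different threads without your own locking; the same applies to a WriteBatch.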

Related

Why are Golang maps not safe during read operations? [duplicate]

First, I have already read "How safe are Golang maps for concurrent Read/Write operations?", and my question is not how safe Go maps are during read/write in general but something more specific. In several places in that thread, people say that concurrent reads from maps are fine, and logically that should be the case. However, other resources say that Go maps are not even safe to read concurrently. For example, take a look at the following resources:
https://go.dev/blog/maps
https://www.youtube.com/watch?v=BywIJqYodl4
In one place, https://groups.google.com/g/golang-nuts/c/_XHqFejikBg, people mention that concurrent reads are safe only if they happen after the map has been initialised. Where does sync.Map fit in here? Why are reads not safe if each goroutine works on a single key while writes from other goroutines happen after the map is initialised? It seems they should be safe, because a hash is calculated for each key, and for a given hash no other goroutine ever reads or writes.
Please help to explain it. Thanks
If you create and initialize a map, and then create goroutines that read from it concurrently, it is safe; there is no need to serialize access. The nuance is that the initialization must be complete before the concurrent reads begin. If you complete the initialization before you create the goroutines, then it is safe. If you create the goroutines before the initialization is complete, then you have to make the goroutines wait until it is complete by using a synchronization mechanism such as a channel.

Is NSObject's retain method atomic?

Is NSObject's retain method atomic?
For example, when retaining the same object from two different threads, is it promised that the retain count has gone up twice, or is it possible for the retain count to be incremented just once?
Thanks.
NSObject as well as object allocation and retain count functions are thread-safe; see Appendix A: Thread Safety Summary in the Threading Programming Guide.
Edit: I’ve decided to take a look at the open source part of Core Foundation. In CFRuntime.c, __CFDoExternRefOperation() is the function responsible for updating the retain counters. It tests whether the process has more than one thread and, if there’s more than one thread, it acquires a spin lock before updating the retain count, hence making this operation thread safe.
Interestingly enough, the retain count is not an attribute (or instance variable) of an object in the struct (class) sense. The runtime keeps a separate structure with retain counters. In fact, if I understand it correctly, this structure is an array of hash tables and there’s a spin lock for each hash table. This means that a lock refers to multiple objects that have been placed in the same hash table, i.e., the lock is neither global (for all instances) nor per instance.
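To make the "array of hash tables, one lock per table" idea concrete, here is a minimal C++ sketch of such a side table. It is not the Core Foundation code: RetainTable, the bucket count of 8, and retainFromTwoThreads are hypothetical names chosen for illustration, and it uses std::mutex where the answer describes a spin lock. It also always locks, rather than checking whether the process has more than one thread.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <unordered_map>

// Hypothetical side table: retain counts live outside the objects,
// spread over several hash tables that each have their own lock.
class RetainTable {
public:
    // Increment the count for obj and return the new value.
    std::size_t retain(const void* obj) {
        assert(obj != nullptr);
        Bucket& b = bucketFor(obj);
        std::lock_guard<std::mutex> guard(b.lock);
        return ++b.counts[obj];
    }

    // Read the current count (0 if the object was never retained).
    std::size_t count(const void* obj) {
        Bucket& b = bucketFor(obj);
        std::lock_guard<std::mutex> guard(b.lock);
        auto it = b.counts.find(obj);
        return it == b.counts.end() ? 0 : it->second;
    }

private:
    struct Bucket {
        std::mutex lock;  // guards only this bucket's map
        std::unordered_map<const void*, std::size_t> counts;
    };

    // Objects hashing to the same bucket share one lock.
    Bucket& bucketFor(const void* obj) {
        return buckets_[std::hash<const void*>{}(obj) % buckets_.size()];
    }

    std::array<Bucket, 8> buckets_;  // bucket count is arbitrary here
};

// Retain the same object perThread times from each of two threads
// and return the final count.
std::size_t retainFromTwoThreads(int perThread) {
    RetainTable table;
    int object = 0;  // any address works as a key
    auto work = [&] {
        for (int i = 0; i < perThread; ++i) table.retain(&object);
    };
    std::thread a(work), b(work);
    a.join();
    b.join();
    return table.count(&object);
}
```

Because every increment happens under the bucket's lock, two threads retaining the same object always raise the count twice, which is the guarantee the question asks about.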

how to use stanford parser with threads

Hello, I want to use the Stanford parser with threads, but I don't know how to do that with a thread pool. I want all threads to call this:
LexicalizedParser.apply(Object in)
but I don't want to create a new LexicalizedParser object every time, because each one has to load
lp = new LexicalizedParser("englishPCFG.ser.gz");
and that takes 2 seconds per object.
What can I do?
Thanks!
It's probably too late, but a thread-safe version is available here: http://nlp.stanford.edu/software/lex-parser.shtml
You can use ThreadLocal.
It allows you to keep one instance of parser per thread. Thus any created instance of parser will never be used from more than one thread.
Usually you shouldn't create more instances than the number of CPUs*cores you have.
For me it is ~4-5 instances (if I disable Hyper Threading on my quadcore).
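The one-instance-per-thread idea is not specific to Java's ThreadLocal. As a rough illustration, here is the same pattern in C++ using thread_local. Parser is a hypothetical stand-in for an expensive, non-thread-safe parser (it is not the Stanford API), and parserForThisThread and parseOnWorkers are names made up for this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a parser that is slow to construct
// and not safe to share between threads.
class Parser {
public:
    Parser() { ++instancesCreated; }  // imagine the 2-second model load here
    std::size_t parse(const std::string& sentence) { return sentence.size(); }
    static std::atomic<int> instancesCreated;
};
std::atomic<int> Parser::instancesCreated{0};

// Each thread lazily builds its own Parser on first use, then reuses it.
Parser& parserForThisThread() {
    thread_local Parser parser;
    return parser;
}

// Parse every sentence `repeats` times on each of `workers` threads and
// return how many Parser instances were constructed along the way.
int parseOnWorkers(int workers, int repeats,
                   const std::vector<std::string>& sentences) {
    assert(workers > 0);
    const int before = Parser::instancesCreated.load();
    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([&sentences, repeats] {
            for (int r = 0; r < repeats; ++r)
                for (const auto& s : sentences)
                    parserForThisThread().parse(s);
        });
    }
    for (auto& t : pool) t.join();
    return Parser::instancesCreated.load() - before;
}
```

With 4 workers the load cost is paid 4 times in total, no matter how many sentences each worker parses, and no parser instance is ever touched by two threads.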
P.S. Not related to StanfordNLP. Sometimes poor class implementations contain static fields and modify them in a non-thread-safe way. A general safe parallelization approach for such implementations would be:
move the computation part into a separate process;
launch (CPUs*cores) processes running the computation;
use an IPC technique to communicate between the main and background processes.

Address Book thread safety and performance

My sense from the Address Book documentation and my understanding of the underlying CoreData implementation suggests that Address Book should be thread safe, and making queries from multiple threads should pose no problems. But I'm having trouble finding any explicit discussion of thread safety in the docs. This raises a few questions:
Is it safe to use +sharedAddressBook on multiple threads for read-only access? I believe the answer is yes.
For write-access on background threads, it appears that you should use +addressBook instead (and save your changes manually). Do I understand this correctly?
Has anyone investigated the performance impact of making multiple simultaneous queries to Address Book on multiple threads? This should be very similar to the performance of making multiple CoreData queries on multiple threads. My sense is that I would gain little by making parallel queries since I assume they will serialize when they hit SQLite, but I'm not certain here.
I need to make dozens of queries (some complex) against AddressBook and am doing so on a background thread using NSOperation to avoid blocking the UI (which it currently does). My underlying question is whether it makes sense to set the max concurrent operations to a value larger than 1, and whether there is any danger in doing so if the application may also be writing to AddressBook at the same time on another thread.
Unless an API says it is threadsafe it is not. Even if the current implementation happens to be thread safe it might not be in the future. In other words, do not use AB from multiple threads.
As an aside, what about it being CoreData based makes you think it would be thread safe? CoreData uses a thread confinement model where it is only safe to access a context on a single thread, all the objects from the context must be accessed on the same thread.
That means that sharedAddressBook will not be thread safe if it keeps an NSManagedObjectContext around to use. It would only be safe if AB creates a new context every time it needs to do something and immediately disposes of it, or if it creates a context per thread and always uses the appropriate context (probably by storing a ref to it in the threadDictionary). In either event it would not be safe to store anything as NSManagedObjects since the contexts would be constantly destroyed, which means every ABRecord would have to store an NSManagedObjectID so it could reconstitute the object in the appropriate context whenever it needed it.
Clearly all of that is possible, it may be what is done, but it is hardly the obvious implementation.

Is it safe to manipulate objects that I created outside my thread if I don't explicitly access them on the thread which created them?

I am working on a cocoa software and in order to keep the GUI responsive during a massive data import (Core Data) I need to run the import outside the main thread.
Is it safe to access those objects without using locks, even though I created them in the main thread, if I don't explicitly access them from the main thread while the import thread is running?
With Core Data, you should have a separate managed object context to use for your import thread, connected to the same coordinator and persistent store. You cannot simply throw objects created in a context used by the main thread into another thread and expect them to work. Furthermore, you cannot do your own locking for this; you must at minimum lock the managed object context the objects are in, as appropriate. But if those objects are bound to by your views and controls, there are no "hooks" that you can add that locking of the context to.
There's no free lunch.
Ben Trumbull explains some of the reasons why you need to use a separate context, and why "just reading" isn't as simple or as safe as you might think, in this great post from late 2004 on the webobjects-dev list. (The whole thread is great.) He's discussing the Enterprise Objects Framework and WebObjects, but his advice is fully applicable to Core Data as well. Just replace "EC" with "NSManagedObjectContext" and "EOF" with "Core Data" in the meat of his message.
The solution to the problem of sharing data between threads in Core Data, like the Enterprise Objects Framework before it, is "don't." If you've thought about it further and you really, honestly do have to share data between threads, then the solution is to keep independent object graphs in thread-isolated contexts, and use the information in the save notification from one context to tell the other context what to re-fetch. -[NSManagedObjectContext refreshObject:mergeChanges:] is specifically designed to support this use.
I believe that this is not safe to do with NSManagedObjects (or subclasses) that are managed by a CoreData NSManagedObjectContext. In general, CoreData may do many tricky things with the state of managed objects, including firing faults related to those objects in separate threads. In particular, [NSManagedObject initWithEntity:insertIntoManagedObjectContext:] (the designated initializer for NSManagedObjects as of OS X 10.5), does not guarantee that the returned object is safe to pass to another thread.
Using CoreData with multiple threads is well documented on Apple's dev site.
The whole point of using locks is to ensure that two threads don't try to access the same resource. If you can guarantee that through some other mechanism, go for it.
Even if it's safe, it's not best practice to share data between threads without synchronizing access to it. It doesn't matter which thread created the object; if more than one line of execution (thread/process) accesses the object at the same time, that can lead to data inconsistency.
If you're absolutely sure that only one thread will ever access this object, then it would be safe not to synchronize the access. Even then, I'd rather put synchronization in my code now than wait until a later change in the application adds a second thread that shares the same data without any concern for synchronizing access.
Yes, it's safe. A pretty common pattern is to create an object, then add it to a queue or some other collection. A second "consumer" thread takes items from the queue and does something with them. Here, you'd need to synchronize the queue but not the objects that are added to the queue.
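A minimal C++ sketch of that pattern, assuming a simple blocking queue: only the queue takes a lock, and the items themselves carry no synchronization because exactly one thread owns each item at any moment. Item, WorkQueue and produceAndConsume are hypothetical names for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical work item; it has no lock of its own.
struct Item { int value; };

// Only the queue is synchronized, not the objects stored in it.
class WorkQueue {
public:
    void push(std::unique_ptr<Item> item) {
        assert(item != nullptr);
        {
            std::lock_guard<std::mutex> guard(mutex_);
            items_.push(std::move(item));
        }
        ready_.notify_one();
    }

    // Blocks until an item is available, then hands it to the caller.
    std::unique_ptr<Item> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty(); });
        std::unique_ptr<Item> item = std::move(items_.front());
        items_.pop();
        return item;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::unique_ptr<Item>> items_;
};

// The calling thread creates n items; a consumer thread sums their values.
int produceAndConsume(int n) {
    WorkQueue queue;
    int sum = 0;
    std::thread consumer([&queue, &sum, n] {
        for (int i = 0; i < n; ++i) sum += queue.pop()->value;
    });
    for (int i = 1; i <= n; ++i)
        queue.push(std::make_unique<Item>(Item{i}));
    consumer.join();  // after join, reading sum here is safe
    return sum;
}
```

The queue's mutex also gives the ordering guarantee that matters here: everything the producer wrote to an item before push is visible to the consumer after pop.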
It's NOT a good idea to just synchronize everything and hope for the best. You will need to think very carefully about your design and exactly which threads can act upon your objects.
Two things to consider are:
You must be able to guarantee that the object is fully created and initialised before it is made available to other threads.
There must be some mechanism by which the main (GUI) thread detects that the data has been loaded and all is well. To be thread safe this will inevitably involve locking of some kind.
Yes you can do it, it will be safe
...
until the second programmer comes around and does not understand the assumptions you have made. That second (or 3rd, 4th, 5th, ...) programmer is likely to start using the object in an unsafe way (in the creator thread). The problems caused could be very subtle and difficult to track down. For that reason alone, and because it's so tempting to use this object in multiple threads, I would make the object thread safe.
To clarify, (thanks to those who left comments):
By "thread safe" I mean programmatically devising a scheme to avoid threading issues. I don't necessarily mean devising a locking scheme around your object. You could find a way in your language to make it illegal (or very hard) to use the object in the creator thread. For example, limit the scope, in the creator thread, to the block of code that creates the object. Once created, pass the object over to the user thread, making sure that the creator thread no longer has a reference to it.
For example, in C++:
void CreateObject()
{
    Object* sharedObj = new Object();
    PassObjectToUsingThread(sharedObj); // this function would be system dependent
}
The creating thread then no longer has access to the object once CreateObject returns; responsibility has passed to the using thread.
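In modern C++ the same handoff can be enforced by the type system rather than by convention. The sketch below is one possible way to do it, not the only one: it assumes std::async as the "pass to the using thread" mechanism, and Object, useObject and createAndHandOff are hypothetical names. Moving a std::unique_ptr leaves the creator's pointer null, so the creator cannot touch the object afterwards by accident.

```cpp
#include <cassert>
#include <future>
#include <memory>
#include <utility>

struct Object { int value = 0; };  // hypothetical payload

// Runs on the using thread, which is now the sole owner of the object.
int useObject(std::unique_ptr<Object> obj) {
    assert(obj != nullptr);
    obj->value += 1;
    return obj->value;
}

// Creates the object, hands it off, and returns what the using thread computed.
int createAndHandOff() {
    auto sharedObj = std::make_unique<Object>();
    sharedObj->value = 41;

    // std::move transfers ownership into the task that runs on another thread.
    std::future<int> result =
        std::async(std::launch::async, useObject, std::move(sharedObj));

    // The creator's pointer is now empty; it has no way to reach the object.
    assert(sharedObj == nullptr);

    return result.get();  // wait for the using thread to finish
}
```

Compared with the raw-pointer version, there is nothing left in the creator's scope to misuse, which addresses the "second programmer" concern without adding any lock to the object itself.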
