Does set variable modify variable values across threads? - debugging

Say I have a function foo
void foo () {
bool block = true;
while(block) {}
}
And I have 5 threads running this function, all blocked in the while loop.
If I attach gdb to this process, and jump over to one of these 5 threads, and do the following
(gdb) set variable block=false
My question is what is the full effect of the above set variable statement?
Does it change the value of block to false on all threads, or just the current thread in gdb? Does running the above statement unblock only one or all of the 5 threads stuck in the while loop?

Does it change the value of block to false on all threads, or just the current thread in gdb?
Just the current thread. You have 5 separate and independent stack variables named block; it is unreasonable to expect GDB to change more than one of them.
Does running the above statement unblock only one or all of the 5 threads stuck in the while loop?
It follows that only one thread will be unblocked.
If you want an ability to unblock all threads at once, make the variable global (and volatile, since you are going to be modifying it from "outside" of the program).
P.S. Your spin loops are going to burn a lot of CPU, and compiler is allowed to compile them out. Inserting a usleep() for short duration may help with both problems.

Related

Should InterlockedExchange be used on all setting of a variable?

I'm using InterlockedExchange in Windows and I have two questions that put together are basically my title question.
InterlockedExchange uses type LONG (32-bits). According to Microsoft's documentation Interlocked Variable Access: "simple reads and writes to 32-bit variables are atomic operations without InterlockedExchange". According to function documentation InterlockedExchange: "This function is atomic with respect to calls to other interlocked functions". Is it or is it not atomic to read/write a LONG in Windows without the interlocked functions?
I am reviewing some code in which one thread sets a variable then all further accesses on that variable by that thread or any number of threads it created use InterlockedExchange. To make it simple consider thread running main() creates a thread running other():
LONG foo;
main()
{
foo = TRUE;
createthread(other());
/* do other things */
if(InterlockedExchange(&foo, 0))
{
cleanup()
}
}
other()
{
/* do other things */
if(InterlockedExchange(&foo, 0))
{
cleanup()
}
}
cleanup()
{
/* it's expected this is only called once */
}
Is it possible in this example that foo will not appear TRUE to either of the InterlockedExchange calls? If main is busy doing other things so that the first InterlockedExchange call is made from the other thread, does that mean foo is guaranteed to have been written by the main thread and visible to the other thread at that time?
Sorry if this is unclear I don't know how to phrase it better.
Is it or is it not atomic to read/write a LONG in Windows without the interlocked functions?
Yes to both, assuming default alignment. This follows from the quoted statement "simple reads and writes to properly-aligned 32-bit variables are atomic operations", because LONG is a 32-bit signed integer in Windows, regardless of the bitness of the OS.
Is it possible in this example that foo will not appear TRUE to either of the InterlockedExchange calls?
No, not possible if both calls are reached. That's because within a single thread the foo = TRUE; write is guaranteed to be visible to the InterlockedExchange call that comes after it. So the InterlockedExchange call in main will see either the TRUE value previously set in main, or the FALSE value reset in the other thread. Therefore, one of the InterlockedExchange calls must read the foo value as TRUE.
However, if the /* do other things */ code in main is an infinite loop while(1); then that would leave only one InterlockedExchange outstanding in other, and it may be possible for that call to see foo as FALSE, for the same reason as...
If main is busy doing other things so that the first InterlockedExchange call is made from the other thread, does that mean foo is guaranteed to have been written by the main thread and visible to the other thread at that time?
Not necessarily. The foo = TRUE; write is visible to the main thread at the point the secondary thread is created, but may not necessarily be visible to the other thread when it starts, or even when it gets to the InterlockedExchange call.

How to check which index in a loop is executing without slow down process?

What is the best way to check which index is executing in a loop without too much slow down the process?
For example I want to find all long fancy numbers and have a loop like
for( long i = 1; i > 0; i++){
//block
}
and I want to learn which i is executing in real time.
Several ways I know to do in the block are printing i every time, or checking if(i % 10000), or adding a listener.
Which one of these ways is the fastest. Or what do you do in similar cases? Is there any way to access the value of the i manually?
Most of my recent experience is with Java, so I'd write something like this
import java.util.concurrent.atomic.AtomicLong;
public class Example {
public static void main(String[] args) {
AtomicLong atomicLong = new AtomicLong(1); // initialize to 1
LoopMonitor lm = new LoopMonitor(atomicLong);
Thread t = new Thread(lm);
t.start(); // start LoopMonitor
while(atomicLong.get() > 0) {
long l = atomicLong.getAndIncrement(); // equivalent to long l = atomicLong++ if atomicLong were a primitive
//block
}
}
private static class LoopMonitor implements Runnable {
private final AtomicLong atomicLong;
public LoopMonitor(AtomicLong atomicLong) {
this.atomicLong = atomicLong;
}
public void run() {
while(true) {
try {
System.out.println(atomicLong.longValue()); // Print l
Thread.sleep(1000); // Sleep for one second
} catch (InterruptedException ex) {}
}
}
}
}
Most AtomicLong implementations can be set in one clock cycle even on 32-bit platforms, which is why I used it here instead of a primitive long (you don't want to inadvertently print a half-set long); look into your compiler / platform details to see if you need something like this, but if you're on a 64-bit platform then you can probably use a primitive long regardless of which language you're using. The modified for loop doesn't take much of an efficiency hit - you've replaced a primitive long with a reference to a long, so all you've added is a pointer dereference.
It won't be easy, but probably the only way to probe the value without affecting the process is to access the loop variable in shared memory with another thread. Threading libraries vary from one system to another, so I can't help much there (on Linux I'd probably use pthreads). The "monitor" thread might do something like probe the value once a minute, sleep()ing in between, and so allowing the first thread to run uninterrupted.
To have a null cost reporting (on multi-cpu computers) : set your index as a "global" property (class-wide for instance), and have a separate thread to read and report the index value.
This report could be timer-based (5 times per seconds or so).
Rq : Maybe you'll need also a boolean stating 'are we in the loop ?'.
Volatile and Caches
If you're going to be doing this in, say, C / C++ and use a separate monitor thread as previously suggested then you'll have to make the global/static loop variable volatile. You don't want the compiler decide deciding to use a register for the loop variable. Some toolchains make that assumption anyway, but there's no harm being explicit about it.
And then there's the small issue of caches. A separate monitor thread nowadays will end up on a separate core, and that'll mean that the two separate cache subsystems will have to agree on what the value is. That will unavoidably have a small impact on the runtime of the loop.
Real real time constraint?
So that begs the question of just how real time is your loop anyway? I doubt that your timing constraint is such that you're depending on it running within a specific number of CPU clock cycles. Two reasons, a) no modern OS will ever come close to guaranteeing that, you'd have to be running on the bare metal, b) most CPUs these days vary their own clock rate behind your back, so you can't count on a specific number of clock cycles corresponding to a specific real time interval.
Feature rich solution
So assuming that your real time requirement is not that constrained, you may wish to do a more capable monitor thread. Have a shared structure protected by a semaphore which your loop occasionally updates, and your monitor thread periodically inspects and reports progress. For best performance the monitor thread would take the semaphore, copy the structure, release the semaphore and then inspect/print the structure, minimising the semaphore locked time.
The only advantage of this approach over that suggested in previous answers is that you could report more than just the loop variable's value. There may be more information from your loop block that you'd like to report too.
Mutex semaphores in, say, C on Linux are pretty fast these days. Unless your loop block is very lightweight the runtime overhead of a single mutex is not likely to be significant, especially if you're updating the shared structure every 1000 loop iterations. A decent OS will put your threads on separate cores, but for the sake of good form you'd make the monitor thread's priority higher than the thread running the loop. This would ensure that the monitoring does actually happen if the two threads do end up on the same core.

Set breakpoint on variable value change

I'm just wondering if is it possible to set breakpoint on change of variable value (in any programming language and tool) ?
For example, I want to say: "Stop anywhere, when value of variable 'a' will be changed".
I know that there is ability to set condition breakpoint and to stop execution when a variable have some specific value, but I didn't hear about observing variable changes.
If it is not possible, why ?
In my experience you can achieve this with a "memory breakpoint" or "memory watch point". For example gdb does it like this: Can I set a breakpoint on 'memory access' in GDB?
As far as I've seen with write watchpoints, the break actually triggers when a is written to, regardless of whether the new value is equal to the old value. So if by "changed" you really mean "changed" then there are fewer examples out there. Possibly even none, I'm not sure, although I don't suppose it would be technically difficult to implement change-only watchpoints, assuming that you were implementing write watchpoints.
For some languages it makes a difference what kind of variable a is. For example, in C or C++ variables can be "lifted" into registers when optimization is enabled, in which case hardware memory watchpoints on the address of the variable will not necessarily catch every change.
There's also a limitation with variables on the stack, that if your function exits but the watchpoint is still set, then it could catch access to the same address, now in use for a different variable in a different function. If your function is called again later (or recursively), it's not necessarily starting from the same stack position, and if not then your watchpoint would fail to catch access to the "same" variable at a different location.
"Stop when a particular condition is true at a particular line of code" is in my experience called a "conditional breakpoint". It generally uses a different mechanism --
the debugger will most likely put a breakpoint instruction at that line of code. Each time it triggers the debugger will check the condition and continue execution if it's false.
Some processors support hardware breakpoints which will break when an address is read or written. For example, if I have a 4 byte variable at address 0x10005060, then I can set a hardware breakpoint like this (using windbg): ba w4 0x10005060. The processor will break if any of the 4 bytes are written. The following command instructs the processor to break when any of those 4 bytes a read or written: ba r4 0x10005060.

Kernel threads vs Timers

I'm writing a kernel module which uses a customized print-on-screen system. Basically each time a print is involved the string is inserted into a linked list.
Every X seconds I need to process the list and perform some operations on the strings before printing them.
Basically I have two choices to implement such a filter:
1) Timer (which restarts itself in the end)
2) Kernel thread which sleeps for X seconds
While the filter is performing its stuff nothing else can use the linked list and, of course, while inserting a string the filter function shall wait.
AFAIK timer runs in interrupt context so it cannot sleep, but what about kernel threads? Can they sleep? If yes is there some reason for not to use them in my project? What other solution could be used?
To summarize: my filter function has got only 3 requirements:
1) Must be able to printk
2) When using the list everything else which is trying to access the list must block until the filter function finishes execution
3) Must run every X seconds (not a realtime requirement)
kthreads are allowed to sleep. (However, not all kthreads offer sleepful execution to all clients. softirqd for example would not.)
But then again, you could also use spinlocks (and their associated cost) and do without the extra thread (that's basically what the timer does, uses spinlock_bh). It's a tradeoff really.
each time a print is involved the string is inserted into a linked list
I don't really know if you meant print or printk. But if you're talking about printk(), You would need to allocate memory and you are in trouble because printk() may be called in an atomic context. Which leaves you the option to use a circular buffer (and thus, you should be tolerent to drop some strings because you might not have enough memory to save all the strings).
Every X seconds I need to process the list and perform some operations on the strings before printing them.
In that case, I would not even do a kernel thread: I would do the processing in print() if not too costly.
Otherwise, I would create a new system call:
sys_get_strings() or something, that would dump the whole linked list into userspace (and remove entries from the list when copied).
This way the whole behavior is controlled by userspace. You could create a deamon that would call the syscall every X seconds. You could also do all the costly processing in userspace.
You could also create a new device says /dev/print-on-screen:
dev_open would allocate the memory, and print() would no longer be a no-op, but feed the data in the device pre-allocated memory (in case print() would be used in atomic context and all).
dev_release would throw everything out
dev_read would get you the strings
dev_write could do something on your print-on-screen system

How to wait for one second on an 8051 microcontroller?

I'm supposed to write a program that will send some values to registers, then wait one second, then change the values. The thing is, I'm unable to find the instruction that will halt operations for one second.
How about setting up a timer interrupt ?
Some useful hints and code snippets in this Keil 8051 application note.
There is no such 'instruction'. There is however no doubt at least one hardware timer peripheral (the exact peripheral set depends on the exact part you are using). Get out the datasheet/user manual for the part you are using and figure out how to program the timer; you can then poll it or use interrupts. Typically you'd configure the timer to generate a periodic interrupt that then increments a counter variable.
Two things you must know about timer interrupts: Firstly, if your counter variable is greater than 8-bit, access to it will not be atomic, so outside of the interrupt context you must either temporarily disable interrupts to read it, or read it twice in succession with the same value to validate it. Secondly, the timer counter variable must be declared volatile to prevent the compiler optimising out access to it; this is true of all variables shared between interrupts and threads.
Another alternative is to use a low power 'sleep' mode if supported; you set up a timer to wake the processor after the desired period, and issue the necessary sleep instruction (this may be provided as an 'intrinsic' by your compiler, or you may be controlled by a peripheral register. This is general advice, not 8051 specific; I don't know if your part even supports a sleep mode.
Either way you need to wade through the part specific documentation. If you could tell us the exact part, you may get help with that.
A third solution is to use an 8051 specific RTOS kernel which will provide exactly the periodic delay function you are looking for, as well as multi-threading and IPC.
I would setup a timer so that it interrupts every 10ms. In that interrupt, increment a variable.
You will also need to write a function to disable interrupts and read that variable.
In your main program, you will read the timer variable and then wait until it is 10100 more than it is when you started.
Don't forget to watch out for the timer variable rolling over.

Resources