Wake-up from Deep Power-Down mode causes a reset in LPC1768 - sleep

I need to minimize the current consumption in my board, which uses an LPC1768. I don't have any problem with going into Deep Sleep or Power-Down modes and waking up from those modes. I have configured the RTC to generate an interrupt after some predefined time, which wakes up the MCU correctly and works just fine.
My problem occurs when I want to go into Deep Power-Down mode, which is precisely what I need (it consumes much less power). But after the RTC interrupt is generated, the MCU goes into a reset state and starts execution from the beginning, as if someone had pushed the reset button!
Now why is that? I read in the documents (like this example: AN10915, Using the LPC1700 power modes) that these three routines are pretty much the same.
I don't understand. There should be no problem according to the example.
I really need to do this; otherwise the battery drains sooner than it is supposed to.

UM10360.pdf, chapter 4.8.4 says: "In Deep Power-down mode, power is shut off to the entire chip" [...]
That means all data that is not in the RTC backup registers is lost, and the chip will thus restart with a reset.
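If it helps, the restart can be made effectively transparent. Here is a minimal sketch, assuming the CMSIS LPC17xx.h register names (LPC_RTC->GPREG0 and friends): the RTC's battery-backed general purpose registers stay powered in Deep Power-down, so a marker value left there lets the startup code tell an RTC wake-up apart from a cold boot.

```c
/* Sketch only, assuming the CMSIS LPC17xx.h register definitions.
 * The RTC's backup registers survive Deep Power-down, so a marker value
 * distinguishes a wake-up from a genuine power-on. */
#include "LPC17xx.h"

#define DPD_MAGIC 0xC0FFEEu            /* arbitrary marker value */

void prepare_deep_power_down(void)
{
    LPC_RTC->GPREG0 = DPD_MAGIC;       /* survives Deep Power-down */
    /* ... program the RTC alarm and enter Deep Power-down as usual ... */
}

int woke_from_deep_power_down(void)
{
    if (LPC_RTC->GPREG0 == DPD_MAGIC) {
        LPC_RTC->GPREG0 = 0;           /* consume the marker */
        return 1;                      /* this reset is really an RTC wake-up */
    }
    return 0;                          /* genuine power-on or external reset */
}
```

Early in main() you can then branch on woke_from_deep_power_down() and skip the cold-boot path, restoring whatever small state you stashed in the remaining GPREG registers before powering down.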

Related

ATXmega phantom breakpoint while debugging

I have an ATXmega that I am writing code for. I have IAR Embedded Workbench and an AVR ONE! that connects to the circuit board over PDI.
When debugging, I am able to get the chip to halt. That is, the system clock will halt, as though it has hit a breakpoint. The IAR debugging GUI shows no change when this happens; I click 'reset' and then 'go' and IAR tells me the CPU is running normally, but the chip itself thinks a breakpoint has been hit. If I click 'pause' and then 'play' again, the chip will continue without issue. This state happens much more reliably when there are breakpoints set elsewhere in the code.
I have a debug signal on a spare GPIO pin that toggles every clock cycle, producing a signal that is half the frequency of the external crystal. When the chip enters this halted state, both the clock and debug signals cease.
I have been able to remove and add code at the start of main, and the program will halt in different places, but it always seems to halt after the same amount of time; I have not been able to fully confirm that the time or number of clock cycles is consistent.
This only ever happens when debugging, never when the chip is running with no debugger. I have replaced the chip, I have given the same program to a colleague's computer and used his toolchain to program my board, and I have commented out almost my entire program, and the error persists. My colleagues have never seen this behaviour before.
One final thing: we have a circuit that pulls a few pins high when given a ~8 V signal. These are the only pins that are pulled high. The halted/error state has only been seen when this ~8 V signal is present.
What's going on here? Has anyone else ever seen this sort of phantom breakpoint?
Edit: I would like to add that we have removed the circuit that pulls those pins high and we still see the behaviour.

TEST pin of 8086

According to some online lectures:
The 8086 will enter a wait state after execution of the WAIT instruction and will resume execution only when the TEST input (active low) is driven low by external hardware.
Then what is the use of the TEST pin in the minimum mode of the 8086 microprocessor? Why is it not one of the maximum-mode-specific pins?
The TEST input and the WAIT instruction are used to poll for an external event. This polling capability can be useful in any processor mode. Thus the pin is not specific to maximum mode.
In fact, this limited form of polling is most often useful in very small systems, so its inclusion in minimum mode makes a great deal of sense.
Larger systems tend to use interrupts more for this sort of thing.

If the processor is always spinning, why does it drain more resources sometimes and not others?

I've been curious about this for a long time. How do processors "sleep"? As far as I can tell, the processor spins until a set time, and that's fine. But there's a problem with this: if the processor is always running at full speed, why does it get noisy and heat up when the spinning comes from my program, but stay quiet and not drain power during the OS's idle spinning?
Does this have more to do with shared variable access? Even a spin lock requires checking a register with a shared variable. So would it require an external device I/O to actually heat up, or is there a "different sleep"?
I've always wondered how to put my program into a sleep() and know it won't drain the battery (admittedly this was before I learned about OS scheduling/time slicing).
In short, how does this work, and how can I ensure that my Sleep() calls, or spin locks generally, use the low-power kind of waiting?

Will moving code into kernel space give more precise timing?

Background information:
I presently have a hardware device that connects to the USB port. The hardware device is responsible for sending out precise periodic messages onto the various networks that it, in turn, connects to. Inside the hardware device I have a couple of Microchip dsPICs. There are two modes of operation.
One scenario is where we send simple "jobs" down to the dsPICs, which, in turn, can send out the precise messages with 0.001 ms accuracy. This architecture is not ideal for more complex messaging where we need to send a periodic packet that changes based on events going on within the PC application. So we have a second mode of operation where our PC application sends the periodic messages and the dsPICs simply convert and transmit them in response. All this, by the way, is transparent to the end user of our software. Our hardware device is a test tool used in the automotive field.
Currently, we use a USB to serial chip from FTDI and the FTDI Windows drivers to interface the hardware to our PC software.
The problem is that in mode two, where we send messages from the PC, the best we are able to achieve is around 1 ms accuracy on average. We are subject to Windows kernel preemption. I've tried a number of "tricks" to improve things (roughly sketched in code after the list below), such as:
Making sure our reader and writer threads live on separate CPU affinities when possible.
Increasing the thread priority of the writer while reducing that of the reader.
Informing the user to turn off the screen saver and other applications when using our software.
Replacing CreateThread calls with CreateTimerQueueTimer calls.
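For reference, a rough sketch of the kinds of calls behind those tricks; the affinity mask, priority level, and 1 ms period below are just placeholders:

```c
/* Illustrative only: tuning the writer thread from user mode.
 * timeBeginPeriod() requires linking against winmm.lib. */
#include <windows.h>
#include <mmsystem.h>
#pragma comment(lib, "winmm.lib")

void tune_writer_thread(void)
{
    /* Pin the writer to CPU 1 so it does not share a core with the reader. */
    SetThreadAffinityMask(GetCurrentThread(), 1 << 1);

    /* Raise the writer's priority above normal worker threads. */
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);

    /* Ask for 1 ms timer resolution (affects timeGetTime, Sleep, timer queues).
     * Pair with timeEndPeriod(1) when the thread shuts down. */
    timeBeginPeriod(1);
}
```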
All our software is written in C/C++. I'm very familiar and comfortable with advanced Windows programming, such as I/O completion ports, overlapped I/O, lock-free thread queues (really a design strategy), sockets, threads, semaphores, etc.
However, I know nothing about Windows driver development. I've read through a few papers on KMDF vs. UMDF vs. WDM.
I'm hoping a seasoned Windows kernel mode driver developer will respond here...
The next rev of our hardware has the option to replace the FTDI chip and either use the dsPIC's USB interface or, possibly, port the open source Linux FTDI code to Windows and continue to use the FTDI chip within our own custom driver. I think that by going to a kernel-mode driver on the PC side, I can send out periodic messages at more precise intervals without preemption, and possibly take advantage of DMA.
We have a competitor in our business who, I think, does something very similar with their tools. As far as I know, user-space applications cannot schedule a thread any better than 1 ms. We currently use timeGetTime in a thread. I've experimented with timer queues (via CreateTimerQueueTimer) with no real improvement.
Is a WDM driver the correct approach to achieving more precise timing?
Our competitor is somehow achieving very precise timing from Windows-driven signals to their hardware; they do load a kernel driver (.sys), and their device runs over USB 2.0, as does ours.
If WDM is the way to go, can I get some advice on which kernel functions I should be studying for setting up the timings?
Thanks for reading
In kernel mode, you have the luxury of getting a DPC triggered at intervals specified in multiples of 100 nanoseconds, without dealing with interrupts. A DPC cannot be preempted (i.e., interrupted by the thread scheduler) because the thread scheduler is itself a DPC. An interrupt can still preempt a DPC, though. So an interval value of 10 (i.e., 1 µs) should do the trick for you to get a callback with the utmost precision.
However, at DPC level you don't have access to many features, such as paged memory or a specific thread's memory space, because DPCs run in an arbitrary thread context. It can be useful to defer further processing to your own user-mode process's context using an APC, which has access to more features.
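To illustrate the pattern, here is only a sketch, not production code: a one-shot kernel timer whose DPC re-arms itself with a relative due time given in 100 ns units. The names are placeholders, and the resolution actually delivered is still bounded by the system clock interval unless a high-resolution timer is used.

```c
/* Sketch of a self-re-arming kernel timer + DPC (WDM). Placeholder names;
 * a real driver needs error handling and a clean shutdown path. */
#include <ntddk.h>

static KTIMER PeriodicTimer;
static KDPC   PeriodicDpc;

/* Relative due time: negative value, in 100 ns units (10 => 1 us). */
static const LONGLONG RelativeInterval = -10;

static VOID PeriodicDpcRoutine(PKDPC Dpc, PVOID Context,
                               PVOID SysArg1, PVOID SysArg2)
{
    LARGE_INTEGER due;
    UNREFERENCED_PARAMETER(Dpc);
    UNREFERENCED_PARAMETER(Context);
    UNREFERENCED_PARAMETER(SysArg1);
    UNREFERENCED_PARAMETER(SysArg2);

    /* Runs at DISPATCH_LEVEL in arbitrary thread context:
     * no paged memory, no waiting, keep the work short. */
    /* ... kick off the next outgoing message here ... */

    due.QuadPart = RelativeInterval;
    KeSetTimer(&PeriodicTimer, due, &PeriodicDpc);   /* re-arm */
}

static VOID StartPeriodicTimer(VOID)
{
    LARGE_INTEGER due;
    KeInitializeTimer(&PeriodicTimer);
    KeInitializeDpc(&PeriodicDpc, PeriodicDpcRoutine, NULL);
    due.QuadPart = RelativeInterval;
    KeSetTimer(&PeriodicTimer, due, &PeriodicDpc);
}
```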
Kernel threads don't get any special treatment in terms of priority; they are the same as user threads from the scheduler's perspective. There are a couple of higher priority levels kernel threads can use, but usually no kernel thread uses any of them. I don't think your bottleneck is thread priority: it doesn't matter how big your priority number is, having just one level above everyone else is enough for you to become the "god thread" that receives top priority. Still, having the highest priority doesn't mean you'll get continuous attention; the OS will pause your thread to run others so that quantum starvation does not occur.
Another note on Windows preemption behavior: the Balance Set Manager temporarily boosts a thread's priority when the thread is signaled by an asynchronous event (GUI click, timer trigger, I/O completion) to allow the completion code to finish its processing with less preemption. Using an async timer handler should give enough of a boost to prevent preemption at least for a quantum; I wonder why your code does not fall into that window. In any case, it seems you are not the only one having problems with timer precision: http://www.virtualdub.org/blog/pivot/entry.php?id=272
I agree with Paul on the complexity of driver development, but as long as you have a good justification, it's not rocket science, just more effort.
This is one of the fundamental design aspects of the Windows kernel: code running at passive level (which includes all user-mode code) is subject to DPCs and interrupts taking up time, and if you want 1 µs accuracy, you're probably not going to get it with either a UMDF driver or any other user-mode code.
However, writing a kernel driver is not a light or cheap undertaking; it is very difficult both to write and to ensure that it works on your customers' machines (a lot of testing is required). Getting it right will cost you significant engineering resources.
As a stopgap, I'd look into MMCSS for Vista and later (http://msdn.microsoft.com/en-us/library/windows/desktop/ms684247(v=vs.85).aspx); it may give you enough priority that you can be satisfied.
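For what it's worth, the MMCSS route is only a few lines in the time-critical thread. A rough sketch, assuming the stock "Pro Audio" task category (it needs avrt.h plus Avrt.lib):

```c
/* Sketch: register a worker thread with MMCSS (Vista+). */
#include <windows.h>
#include <avrt.h>
#pragma comment(lib, "avrt.lib")

DWORD WINAPI WriterThread(LPVOID param)
{
    DWORD taskIndex = 0;
    /* "Pro Audio" is one of the task categories registered by default. */
    HANDLE mmcss = AvSetMmThreadCharacteristicsW(L"Pro Audio", &taskIndex);
    if (mmcss) {
        AvSetMmThreadPriority(mmcss, AVRT_PRIORITY_HIGH);
    }

    /* ... periodic transmit loop ... */

    if (mmcss) {
        AvRevertMmThreadCharacteristics(mmcss);
    }
    return 0;
}
```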
If you really want to go down the rabbit hole, KMDF is what you should be using. KMDF is a framework on top of WDM that represents a lot of codified best practices for drivers. Unless you're absolutely forced to do otherwise, KMDF is always the best way to go for drivers. And to be honest, you're almost certainly going to want to either contract with OSR (http://www.osr.com) or hire someone (several people?) experienced in writing Windows drivers.
Your focus on drivers and kernel performance misses the forest for the trees. The elephant in the room is the fact that full-speed USB 2 bus frames happen with a 1 ms period, and high-speed USB 2 microframes happen every 1/8 ms.
When you send data over full-speed USB (like for most FTDI chips), the best your application can hope for is that the data will get to the device sometime during the very next frame. With an unloaded USB bus, the transfer will happen very close to the start-of-frame. You'll observe it as 1ms granularity with small random deviation. This is precisely what you're seeing, and is not bad. For example, since all USB devices attached to the same host will see the frames at the same time, it's a simple way to synchronize multiple device clocks with better than microsecond precision. What your application can do is simply send a message that has not only the data, but some time in the near future when it should be sent out. Another issue with USB is that there are no guarantees as to when your requests for data transmission will be serviced. You're sharing a bus with other devices, after all.
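As a rough illustration of that idea, the message your PC application sends could carry the intended transmit time alongside the payload, leaving the precise timing to the dsPIC firmware. The field names and sizes here are purely hypothetical:

```c
/* Hypothetical wire format for the "schedule it on the device" approach:
 * the PC stamps each packet with a transmit time in device-clock ticks,
 * and the dsPIC queues it until its local timer reaches that tick. */
#include <stdint.h>

#pragma pack(push, 1)
typedef struct {
    uint32_t tx_time_ticks;   /* when to transmit, in device timer ticks */
    uint8_t  network_id;      /* which downstream bus to send on */
    uint8_t  length;          /* number of payload bytes that follow */
    uint8_t  payload[64];     /* message body */
} scheduled_msg_t;
#pragma pack(pop)
```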
I think you need to reengineer your system and not depend on any sort of timing from the PC end. The application that runs on the PC should be assumed to be, timing-wise, limited to the performance of the human that interacts with it. Anything that requires guaranteed real-time performance must be on your dsPIC devices. Even the USB bus doesn't cut it, as you have no guarantees at all as to how soon your request will be scheduled on the bus.
Basically, if you want guaranteed real-time performance on Windows, then there must be no user mode involved -- it must all run in kernel mode, and you must use communications channels that are for your exclusive use (or you make them act that way, e.g. by filtering right on top of the USB host).

How to generate a ~100 kHz clock signal in a Linux kernel module with bit-banging?

I'm trying to generate a clock signal on a GPIO pin (ARM platform, mach-davinci, kernel 2.6.27) at something around 100 kHz, using a tasklet with high priority. The theory is simple: set the GPIO high, udelay for 5 µs, set the GPIO low, wait another 5 µs. But strange problems appear. First of all, I can't get this 5 µs delay, but that's fine; it looks like a hardware performance problem, so I moved to a period of 40 µs (which gives ~25 kHz). The second problem is worse: once per ~10 ms, udelay waits 3x longer than usual. I'm thinking that it's the heartbeat taking this time, but this is unacceptable from the point of view of the protocol that will be implemented on top of this. Is there any way to temporarily disable the heartbeat procedure, let's say, for 500 ms? Or maybe I'm doing it wrong from the beginning? Any comments?
You cannot use a tasklet for this kind of job. Tasklets can be preempted by interrupts, and in some cases your tasklet can even be executed in process context!
If you absolutely have to do it this way, use an interrupt handler - get in, disable interrupts, do whatever you have to do and get out as fast as you can.
Generating the clock asynchronously in software is not the right thing to do. I can think of two alternatives that will work better:
Your processor may have a built-in clock generator peripheral that isn't already being used by the kernel or another driver. When you set one of these up, you tell it how fast to run its clock, and it just starts running out the pulses.
Get your processor's datasheet and study it.
You might not find a peripheral called a "clock" per se, but might find something similar that you can press into service, like a PWM peripheral.
The other device you are talking to may not actually require a regular clock. Some chips that need a "clock" line merely need a line that goes high when there is a bit to read, and then goes low while the data line(s) are changing. If this is the case, the 100 kHz figure you're reading isn't a hard requirement for a clock of exactly that frequency; it is just an upper limit on how fast the clock line (and thus the data line(s)) are allowed to transition.
With a CPU so much faster than the clock, you want to split this into two halves:
The "top half" sets the data line(s) state correctly, then brings the clock line up. Then it schedules the bottom half to run 5 Ξs later, using an interrupt or kernel timer.
In the "bottom half", called by the interrupt or timer, bring the clock line back down, then schedule the top half to run again 5 Ξs later.
Unless you can run your timer tasklet at a higher priority than the kernel timer, you will always be susceptible to this kind of jitter. Do you really have to do this by bit-banging? It would be far easier to use a hardware timer or PWM generator: configure the timer to run at your desired rate, set the pin to output, and you're done.
If you need software control over each bit period, you can try to work around the other tasks by setting your tasklet to run at a shorter period, say three-fourths of your 40 µs delay. In the tasklet, disable interrupts and poll the clock until you get to the correct 40 µs timeslot, set the I/O state, re-enable interrupts, and exit. But this effectively ties up 25% of your system just watching a clock.
