How does Windows calculate remaining time on battery?

I'm testing an app to let users know when to plug in and unplug their laptop to get the most life out of its battery. As part of this I'm trying to replicate the tooltip from the Windows power meter.
It's fairly successful so far, with a couple of differences:
The Windows time-remaining notification, e.g. "X hr XX min (XX%) remaining", doesn't show up until after around a minute.
The Windows time remaining seems more stable under changing battery loads.
These lead me to think that the Windows time-remaining algorithm averages over the past minute or so, but I can't find any documentation of that. Does anyone know exactly what it does, so I can reproduce it?
Here's my implementation (in Python, but the question is language-agnostic). I'm thinking I'll need to average the most recent x discharge rates, polling every y seconds, but I need to know the values for x and y.
import wmi

def remaining_time():
    t = wmi.WMI(moniker="//./root/wmi")
    batts = t.ExecQuery('Select * from BatteryStatus where Voltage > 0')
    time_left = 0.0
    for b in batts:
        # hours = remaining capacity [mWh] / drain rate [mW]
        time_left += float(b.RemainingCapacity) / float(b.DischargeRate)
    hours = int(time_left)
    mins = 60 * (time_left % 1.0)
    return '%i hr %i min' % (hours, mins)
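The missing piece is the smoothing. Here is a minimal sketch of the rolling average I have in mind; the x and y constants are guesses, since the real Windows values are exactly what I'm asking about:

from collections import deque

POLL_SECONDS = 5    # y: polling period in seconds (assumed, undocumented)
WINDOW = 12         # x: readings kept, i.e. about a minute of history (assumed)

recent_rates = deque(maxlen=WINDOW)  # automatically discards the oldest rate

def smoothed_hours_left(remaining_capacity, discharge_rate):
    # average the last WINDOW discharge rates instead of the instantaneous one
    recent_rates.append(float(discharge_rate))
    avg_rate = sum(recent_rates) / len(recent_rates)
    return float(remaining_capacity) / avg_rate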

Windows follows the ACPI specification, and since the specification gives a method of calculating remaining battery time, I'd assume that's how they do it.
Edit: Found a somewhat confirming source.
I'm referring specifically to chapter 3.9.3 "Battery Gas Gauge".
Remaining Battery Percentage [%] = Battery Remaining Capacity [mAh/mWh] / Last Full Charged Capacity [mAh/mWh] * 100
if you need that in hours:
Remaining Battery Life [h]= Battery Remaining Capacity [mAh/mWh] / Battery Present Drain Rate [mA/mW]
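For example (with numbers invented purely for illustration): a battery reporting 43,200 mWh remaining while draining 14,400 mW would show 43,200 / 14,400 = 3.0 hours remaining.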
This essentially represents the current rate of change in charge capacity per unit time; you'll need to look at the ACPI spec to see how Windows implements it specifically.
The variables in question I'd assume would have to be queried from the battery controller and I'd let Windows handle all the compatibility issues there.
For this there exists the Windows Management Instrumentation classes Win32_Battery and (probably more appropriate) Win32_PortableBattery. Upon some further digging, it seems like these classes calculate the remaining time for you and don't expose the current charge of the battery (probably to encourage people to have it calculated only one way/rounding issues, etc). The closest "cool" thing you can do is estimate/calculate battery wear by FullChargeCapacity / DesignCapacity.
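As a minimal sketch of reading those pre-computed values, using the same Python wmi module as the question (note that FullChargeCapacity and DesignCapacity are frequently left unpopulated by drivers, hence the guard):

import wmi

c = wmi.WMI()  # default root\cimv2 namespace, where Win32_Battery lives
for batt in c.Win32_Battery():
    print('Charge remaining: %s%%' % batt.EstimatedChargeRemaining)
    print('Estimated run time: %s min' % batt.EstimatedRunTime)
    if batt.FullChargeCapacity and batt.DesignCapacity:
        wear = 100.0 * (1.0 - float(batt.FullChargeCapacity) / batt.DesignCapacity)
        print('Battery wear: %.1f%%' % wear)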
The next best thing I could find appears to be a lower-level API exposed through IOCTL_BATTERY_QUERY_INFORMATION, but it seems that also doesn't give the current charge capacity in milliwatt-hours.
tl;dr Use the remaining times and percentages calculated for you by the above classes if possible :/
As an aside, some laptop manufacturers bundle their own tools that calculate time remaining by querying the specific microcontroller implementations in their own batteries, and so can make a more informed, non-linear guesstimate about remaining battery life.


Android Beacon Library - Long Search Time

Update: I would still welcome advice, as the start-up time is still slow, though it has been reduced to about 10 seconds from about 2 minutes.
As far as I understand, this library's default beacon scan period is around 1.1 seconds. However, despite setting my beacon broadcast frequency to 10 Hz (iBeacon) and 'didDetermineStateForRegion' reporting that beacons are coming into range, it takes about 1 minute for 'didEnter/ExitRegion' and the range notifier to alert me that a beacon is in range, or to give me a list of beacons in range. Once the alerts start, the response is great: less than 0.5 seconds for a beacon being turned on or off.
What are the possible reasons for and solutions to the issue? I am trying to create an iBeacon attendance app. Many thanks.
I also tried advice given in other posts, like turning off Wi-Fi to minimise interference.
Clement
The time it takes to detect a beacon in the background is largely determined by how the phone scans in low power mode, combined with the beacon transmitter's advertising rate.
Android devices generally put BLE scans into low power mode when the screen is off. The Android Beacon Library does so explicitly when using BackgroundPowerSaver and the OS enforces this anyway on newer Android versions.
Low power mode means the BLE chip is commanded to use a duty cycle when scans are on. On open source Android, this is set to a 5120 ms interval with only a 512 ms window of active scanning, a 10% duty cycle. This cuts battery use by about 90% versus constant scanning, but it delays detections.
private static final int SCAN_MODE_LOW_POWER_WINDOW_MS = 512;
private static final int SCAN_MODE_LOW_POWER_INTERVAL_MS = 5120;
private static final int SCAN_MODE_BALANCED_WINDOW_MS = 1024;
private static final int SCAN_MODE_BALANCED_INTERVAL_MS = 4096;
private static final int SCAN_MODE_LOW_LATENCY_WINDOW_MS = 4096;
private static final int SCAN_MODE_LOW_LATENCY_INTERVAL_MS = 4096;
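From these constants, the duty cycles work out to 512 / 5120 = 10% for low power, 1024 / 4096 = 25% for balanced, and 4096 / 4096 = 100% (continuous scanning) for low latency.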
See the AOSP source.
This is where the transmitter's advertising rate comes in. If the transmitter is advertising at 10 Hz, there should be about 10 packets per second to detect. These are spaced randomly, but on average one arrives every 100 ms, so you might expect to detect about 5 packets during the 512 ms active scan window. In practice you almost never detect that many, as some are lost to noise and collisions in radio space. At close range, an 80 percent receive rate is typical; at longer ranges the receive rate drops further.
If a packet is detected in one scan interval under this scenario, the OS will get a callback in just under 5 seconds. If for some reason no packet is detected in the first scan window but one is detected in the second, the callback will come in just over 9 seconds.
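To make that arithmetic concrete, here is a toy Monte Carlo sketch in Python (assuming the AOSP low-power constants, a 10 Hz advertiser, and the 80% receive rate mentioned above; it illustrates the reasoning only, not how the OS actually schedules scans):

import random

WINDOW_MS = 512      # SCAN_MODE_LOW_POWER_WINDOW_MS
INTERVAL_MS = 5120   # SCAN_MODE_LOW_POWER_INTERVAL_MS
ADV_HZ = 10          # transmitter advertising rate
RECEIVE_PROB = 0.8   # assumed close-range per-packet receive rate

# chance that at least one of the ~5 packets in a window is received
packets_per_window = int(WINDOW_MS * ADV_HZ / 1000)
p_detect = 1.0 - (1.0 - RECEIVE_PROB) ** packets_per_window

def detection_latency_ms():
    # the beacon switches on at a random phase of the scan cycle, so it
    # waits part of an interval before the next scan window even opens
    t = random.uniform(0, INTERVAL_MS)
    while random.random() > p_detect:  # window missed; wait for the next one
        t += INTERVAL_MS
    return t + WINDOW_MS

samples = [detection_latency_ms() for _ in range(10000)]
print('mean %.0f ms, worst %.0f ms' % (sum(samples) / len(samples), max(samples)))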
Improving this time means changing the scan interval to be smaller. On Android, you can only do this by changing the scan mode to high power as the window size is fixed by the OS. This usually means having the screen on.
The numbers above are for open source Android (e.g. Pixel phones). Some manufacturers (secretly) customize these settings. My testing suggests most Samsung devices with Android 6+ set the scan interval to 10 seconds with an unknown active scan duration. This means Samsung devices will give you about the results you describe even under the best conditions. Other manufacturers may vary. Getting the value for your manufacturer is impossible without the source code -- the only alternative is experimentation like you are doing.
Finally, do not confuse the Android Beacon Library's scanPeriod and betweenScanPeriod with the scan window/interval described above. While both have similar goals and effects, the OS scan window is not configurable and is enforced at a much lower level, usually by the Bluetooth chip itself on newer devices.

What’s the meaning of `Duration: 30.18s, Total samples = 26.26s (87.00%)` in go pprof?

As I understand it, pprof stops and samples a Go program every 10 ms. So a 30 s run should yield about 3,000 samples, but what does the 26.26 s mean? How can a sample count be shown as a time duration?
What's more, I have even seen output where the sample time is bigger than the wall time. How can that be?
Duration: 5.13s, Total samples = 5.57s (108.58%)
That confusing wording was reported in google/pprof issue 128
The "Total samples" part is confusing.
Milliseconds are continuous, but samples are discrete — they're individual points, so how can you sum them up into a quantity of milliseconds?
The "sum" is the sum of a discrete number (a quantity of samples), not a continuous range (a time interval).
Reporting the sums makes perfect sense, but reporting discrete numbers using continuous units is just plain confusing.
Please update the formatting of the Duration line to give a clearer indication of what a quantity of samples reported in milliseconds actually means.
Raul Silvera's answer:
Each callstack in a profile is associated to a set of values. What is reported here is the sum of these values for all the callstacks in the profile, and is useful to understand the weight of individual frames over the full profile.
We're reporting the sum using the unit described in the profile.
Would you have a concrete suggestion for this example?
Maybe just a rewording would help, like:
Duration: 1.60s, Samples account for 14.50ms (0.9%)
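To unpack the numbers in the question: at the default 100 Hz sampling rate, each sample stands for 10 ms of CPU time, so 26.26 s corresponds to 2,626 samples. On a multi-core machine, several threads can be sampled during the same wall-clock tick, so the per-core samples can sum past the wall time, which is how totals over 100% (like the 108.58% above) arise.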
There are further pprof improvements discussed in golang/go issue 36821:
pprof CPU profiles lack accuracy (closeness to the ground truth) and precision (repeatability across different runs).
The issue is with the use of OS timers used for sampling; OS timers are coarse-grained and have a high skid.
I will propose a design to extend CPU profiling by sampling CPU Performance Monitoring Unit (PMU) aka hardware performance counters
It includes examples where Total samples exceeds duration.
Dependence on the number of cores and length of test execution:
The results of goroutine.go test depend on the number of CPU cores available.
On a multi-core CPU, if you set GOMAXPROCS=1, goroutine.go will not show a huge variation, since each goroutine runs for several seconds.
However, if you set GOMAXPROCS to a larger value, say 4, you will notice a significant measurement attribution problem.
One reason for this problem is that the itimer samples on Linux are not guaranteed to be delivered to the thread whose timer expired.
Since Go 1.17 (and improved in Go 1.18), you can add pprof labels to know more:
A cool feature of Go's CPU profiler is that you can attach arbitrary key value pairs to a goroutine. These labels will be inherited by any goroutine spawned from that goroutine and show up in the resulting profile.
Let's consider the example below that does some CPU work() on behalf of a user.
By using the pprof.Labels() and pprof.Do() API, we can associate the user with the goroutine that is executing the work() function.
Additionally the labels are automatically inherited by any goroutine spawned within the same code block, for example the backgroundWork() goroutine.
func work(ctx context.Context, user string) {
    labels := pprof.Labels("user", user)
    pprof.Do(ctx, labels, func(_ context.Context) {
        go backgroundWork()
        directWork()
    })
}
How you use these labels is up to you.
You might include things such as user ids, request ids, http endpoints, subscription plan or other data that can allow you to get a better understanding of what types of requests are causing high CPU utilization, even when they are being processed by the same code paths.
That being said, using labels will increase the size of your pprof files. So you should probably start with low cardinality labels such as endpoints before moving on to high cardinality labels once you feel confident that they don't impact the performance of your application.

Speed up without a serial fraction

I ran a set of experiments on a parallel package, say superlu-dist, with different processor counts, e.g. 4, 16, 32, 64.
I got the wall clock time for each experiment, say: 53.17s, 32.65s, 24.30s, 16.03s
The formula for speedup is:

          serial time
Speedup = -------------
          parallel time
But there is no information about the serial fraction.
Can I simply take the reciprocal of the wall clock time?
Can I simply take the reciprocal of the wall clock time?
No, true speedup figures require comparing apples to apples:
This means that an original, pure-[SERIAL] process schedule ought to be compared with any other scenario, where parts may get modified so as to use some sort of parallelism (the parallel fraction may get reorganised so as to run on N CPUs / computing resources, whereas the serial fraction is left as it was).
This obviously means that the original [SERIAL] code gets extended, both in code (#pragma decorators, OpenCL modifications, CUDA { host_to_dev | dev_to_host } tooling, etc.) and in time (to execute these added functionalities, which were not present in the original [SERIAL] code being benchmarked against), so as to add new sections where the (possibly [PARALLEL]) other part of the processing will take place.
This comes at a cost: add-on overheads (to set up, to terminate, and to communicate data from the [SERIAL] part to the [PARALLEL] part and back), all of which add extra [SERIAL] workload (and execution time plus latency).
For more details, feel free to read the section Criticism in the article on the re-formulated Amdahl's Law.
The [PARALLEL] portion seems interesting, yet the principal ceiling on speedup is the duration of the [SERIAL] portion ( s = 1 - p ) in the original code, to which the add-on durations and latency costs accumulated along the way must be added, as the work is "reorganised" from the original pure-[SERIAL] schedule into the wished-for [PARALLEL] execution, if a realistic evaluation is to be achieved.
Running the test on a single processor and setting that as the serial time, as @VictorSong has proposed, sounds easy, but it benchmarks an incoherent system (not the pure-[SERIAL] original) and records a skewed yardstick to compare against.
This is the reason why fair methods ought to be engineered. The pure-[SERIAL] original code execution can be time-stamped, so as to show the real durations of the unchanged parts, but the add-on overhead times have to be incorporated into the add-on extensions of the serial part of the now-parallelised tests.
The re-articulated Amdahl's Law of Diminishing Returns explains this, together with the impact of the add-on overheads and of atomicity-of-processing. Adding more computing resources will not deliver further fictitious speedup if the parallel fraction of the processing does not permit further splitting of task workloads: some form of internal atomicity-of-processing cannot be divided further, in spite of free processors being available.
The simpler of the two re-formulated expressions stands like this:

                1
S = ------------------------- ;   where s, ( 1 - s ), N were defined above
              ( 1 - s )            pSO := [PAR]-Setup-Overhead add-on
    s + pSO + --------- + pTO      pTO := [PAR]-Terminate-Overhead add-on
                  N
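As a quick numeric sketch of that expression in Python (the inputs are invented for illustration; the names mirror the formula):

def speedup(s, N, pSO=0.0, pTO=0.0):
    # s: serial fraction, N: processors,
    # pSO / pTO: [PAR]-setup / [PAR]-terminate overhead fractions
    return 1.0 / (s + pSO + (1.0 - s) / N + pTO)

# a 5% serial fraction with 1% + 1% overheads on 64 CPUs:
print(round(speedup(s=0.05, N=64, pSO=0.01, pTO=0.01), 1))  # ~11.8, far below 64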
Some interactive GUI tools for visualising these add-on overhead costs are available for interactive parametric simulations here: just move the p-slider towards the actual value of ( 1 - s ), i.e. towards a non-zero fraction for the very [SERIAL] part of the original code.
What do you mean when you say "serial fraction"? According to a Google search, superlu-dist is apparently C, so I guess you could just use ctime or chrono and take the time the usual way; it works for me with both manual std::threads and omp.
I'd just run the test on a single processor and set that as the serial time, then do the test again with more processors (just like you said).

PWM transistor heating

I have a Raspberry Pi and an auxiliary PCB with transistors for driving some LED strips.
The strips' datasheet says 12 V, 13.3 W/m. I'll use 3 strips in parallel, 1.8 m each, so 13.3 * 1.8 * 3 = 71.82 W; at 12 V, that's almost 6 A.
I'm using an 8 A transistor, the E13007-2.
In the project I have 5 channels of different LEDs: RGB and 2 types of white.
R, G, B, W1 and W2 are connected directly to the Pi's pins.
The LED strips are connected to 12 V, and to CN3/CN4 for GND (through the transistors).
Transistor schematic.
I know that's a lot of current passing through the transistors, but is there a way to reduce the heating? I think they're reaching 70-100 °C. I already had a problem with one Raspberry Pi, and I think it's getting dangerous for the application. I have some large traces on the PCB, so that's not the problem.
Some thoughts:
1 - A resistor driving the base of the transistor. Maybe it won't reduce heating, but I think it's advisable for short-circuit protection. How can I calculate its value?
2 - The PWM has a frequency of 100 Hz. Is there any difference if I reduce this frequency?
The BJT you're using has a current gain (hFE) of roughly 20. This means the collector current is roughly 20 times the base current, or the base current needs to be 1/20 of the collector current, i.e. 6 A / 20 = 300 mA.
A Raspberry Pi certainly can't supply 300 mA from its IO pins, so you're operating the transistor in the linear region, which causes it to dissipate a lot of heat.
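To put numbers on thought 1: with a base-emitter drop of about 0.7 V, pushing 300 mA of base current from a 3.3 V pin would call for a base resistor of roughly (3.3 - 0.7) / 0.3 ≈ 8.7 ohms, while a Pi GPIO pin can only source on the order of 16 mA, so properly saturating this BJT from the Pi is simply out of reach.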
Change your transistors to MOSFETs with a low enough threshold voltage (say 2.0 V, so there is enough conduction at the 3.3 V IO voltage) to keep it simple.
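For a rough sense of the difference (with illustrative numbers, not a specific part): a logic-level MOSFET with an on-resistance of 50 milliohms would dissipate about 6^2 * 0.05 = 1.8 W at 6 A, whereas a BJT dropping around 1 V in the linear region dissipates about 6 W at the same current.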
Using an N-channel MOSFET will run much cooler if you supply enough gate voltage to fully enhance it. Since this is not a high-volume item, why not simply use a MOSFET gate-driver chip? Then you can use a device with a low RDS(on). Another option is the Siemens BTS660 (BTS50085B, TO-220 package): it is a high-side driver that you drive with an open-collector or open-drain device. It will switch 5 A at room temperature with no heat sink, and it is rated for much more current. It is obsolete but still available, as is its replacement. MOSFETs are voltage-controlled, while BJTs are current-controlled.

Solaris prstat - definition of "recent" time used in percentages

The man page for prstat (on Solaris 10 in my case) notes that the CPU % output is the "percentage of recent CPU time". I am trying to understand in more depth what "recent" means in this context: is it a defined amount of time prior to the sample, does it relate to the sampling interval, etc.? I'd appreciate any insights, particularly with references to supporting documentation. I've searched but haven't been able to find a good answer. Thanks!
Adrian
The kernel maintains, for each process, the data that you see at the bottom: those three numbers.
uptime shows you what those numbers are. They are the 'recent' times for the load average, the line at the bottom of prstat: 1 minute, 5 minutes, and 15 minutes.
Recent == 1 minute's worth of sampling (the last 60 seconds). Those numbers are averages, which is why the numbers and processes usually change when you first start prstat.
On the first pass you may see processes like nscd that have lots of CPU time but have been up for a long time. The first display iteration is completely historical; after that, the numbers reflect recent == the last one-minute average.
You should consider enabling sar sampling to get a much better picture.
Want a reference? Try:
http://www.amazon.com/Solaris-Internals-OpenSolaris-Architecture-Edition/dp/0131482092
