How to programmatically determine Windows' Performance settings with Delphi 2010

The following code is to fade my application on close.
procedure TfrmMain.btnClose1Click(Sender: TObject);
var
  i: Integer;
begin
  for i := 255 downto 0 do begin
    frmMain.AlphaBlendValue := i;
    Application.ProcessMessages;
  end;
  Close;
end;
With Windows performance set to "Let Windows choose…", closing my Delphi app with the above code gives an almost instantaneous fade (maybe ¼ second at most; if I blink I miss the transition).
If I set the performance option to "Adjust for best performance", exiting the same app takes over 12 seconds to fade.
Using the same code but commenting out the AlphaBlendValue change removes the delay.
I tested this on both Delphi 2010 and Delphi XE2 and the results are the same.
This was tested on Windows 7 Ultimate 64-bit, if that makes any difference.
To say the least this behavior puzzles me.
I thought that the form's alpha blending was handled by the GPU and would therefore not be affected by Windows performance settings, which I assumed were targeted at maximizing CPU performance.
So I'm not sure whether this is a Windows 7 bug, a Delphi bug, or just my lack of knowledge.
As far as a fix...
Is there a way to tell if Windows is running in crap graphics/max performance mode so that I can disable Alpha fade effects in my apps?
Edit for clarity:
While I would like to fix the fade what I am really looking for is a way to determine what the Windows performance setting is.
I am looking for how to determine a specific Windows setting - when you go into Windows Performance Options there are 3 tabs. On the first tab "Visual Effects" there are 3 canned options and a 4th option for 'Custom'. Minimally I am trying to determine if the option chosen is 'Adjust for best performance', if I could determine what the settings are on this tab even better.
Appreciate any help.

The fundamental problem with your code is that you are forcing 256 distinct updates irrespective of the performance characteristics of the machine. You don't have to use every single alpha blend value between 255 and 0. You can skip some values and still have a smooth fade.
You need to account for the actual graphics performance of the machine. Since you cannot predict that, you should account for real time in your fade code. Doing so will give you a consistent rate of fade irrespective of the performance characteristics of your machine.
So, here's a simple example to demonstrate tying the fade rate to real time:
procedure TfrmMain.btnClose1Click(Sender: TObject);
var
  Stopwatch: TStopwatch;
  NewAlphaBlendValue: Integer;
begin
  Stopwatch := TStopwatch.StartNew;
  while True do
  begin
    NewAlphaBlendValue := 255 - (Stopwatch.ElapsedMilliseconds div 4);
    if NewAlphaBlendValue > 0 then
      AlphaBlendValue := NewAlphaBlendValue
    else
      break;
  end;
  Close;
end;
The fade has a 1 second duration. You can readily adjust the mathematics to modify the duration to your requirements. This code will produce a smooth fade even on your low performing machine.
I would also comment that you should not use the global variable frmMain in a TfrmMain method. The TfrmMain method already has access to the instance: it is Self, and of course you can omit Self. What's more, the call to ProcessMessages is bad: it allows re-entrant handling of queued input messages, which you don't want. So remove the call to ProcessMessages.
You actually ask about detecting the Adjust for best performance setting. But I think that's the wrong question. For a start you should fix your fade code so that the fade duration is independent of graphics performance.
Having done that you may still wish to disable the fade if the user has asked for lower quality appearance settings. I don't think you should look for one of the 3 canned options that you mention. They are quite possibly Windows version specific. Personally I would base the behaviour on the Animate windows when minimizing and maximizing setting. My rationale is that if the user does not want minimize and maximize to be animated, then presumably they don't want window close to be faded.
Here's how to read that setting:
function GetWindowAnimation: Boolean;
var
  AnimationInfo: TAnimationInfo;
begin
  AnimationInfo.cbSize := SizeOf(AnimationInfo);
  if not SystemParametersInfo(SPI_GETANIMATION, AnimationInfo.cbSize,
    @AnimationInfo, 0) then
    RaiseLastOSError;
  Result := AnimationInfo.iMinAnimate <> 0;
end;
I think that most of the other settings that you may be concerned with can also be read using SystemParametersInfo. You should be able to work out how to do so by following the documentation.
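Many of those settings follow the same pattern. For example, here is a sketch of reading the client-area animation setting (SPI_GETCLIENTAREAANIMATION is Vista and later; the constant may not be declared in older Delphi Windows units, so it is defined locally):
const
  SPI_GETCLIENTAREAANIMATION = $1042; // may be missing from older Windows units

function GetClientAreaAnimation: Boolean;
var
  Enabled: BOOL;
begin
  // Same pattern as above: pvParam points at a BOOL that receives the setting.
  if not SystemParametersInfo(SPI_GETCLIENTAREAANIMATION, 0, @Enabled, 0) then
    RaiseLastOSError;
  Result := Enabled;
end;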

Sorry for the tardy follow-up, but it took me a while to figure out a working answer to my question and some of the issues behind it.
First, a thank you to David Heffernan for the insight on a better way to handle the fade loop and for the pointer to the TStopwatch type in Delphi's Diagnostics unit; much appreciated.
In regards to being able to determine the Windows' Performance settings...
When using the following un-optimized fade loop
procedure TfrmMain.btnFadeNCloseClick(Sender: TObject);
var
  i: Integer;
begin
  for i := 255 downto 0 do
    frmMain.AlphaBlendValue := i;
  Close;
end;
the actual Windows Performance Options settings causing the performance issue are "Enable desktop composition" and "Use visual styles on windows and buttons". If both options are enabled there is no issue; if either setting is not enabled the loop crawls** (about 12 seconds on my system if the form is maximized).
It turns out that turning Aero Glass on or off toggles these same 2 settings. So being able to detect whether Aero Glass is on lets me decide whether or not to enable form effects, such as transition fades and other eye candy, in my apps. Plus now I can also capture that information in my bug reports.
**Note: this appears to be an NVidia issue/bug, or at least an issue that is much more severe on systems with NVidia graphics cards. On 2 different NVidia systems (with recent, if not the latest, drivers) I got similar results for a maximized form fade: less than .001 seconds if Aero Glass is on, around 12 seconds if Aero Glass is off. On a system with an Intel graphics card: less than .001 seconds if Aero Glass is on, about 3.7 seconds if Aero Glass is off. Granted, my test sampling is small, 3 NVidia systems (counting the customer who initially reported the issue) and one non-NVidia system, but if I was using a decent NVidia graphics card I would not bother turning Aero Glass off.
Below is the working code to detect if Aero Glass is enabled via Delphi:
This function has been tested on a Windows 7 64-bit system and works with Delphi 2007, 2010 and XE2 (32- and 64-bit).
All of the various versions of the Delphi function below that I found on the net were broken, along with comments from people complaining about getting Access Violation errors.
What finally shed light on fixing the bad code was Gerry Coll's response to AccessViolationException in Delphi - impossible (check it, unbelievable...), which was about trying to fix AV errors in a function of the same type.
function ISAeroEnabled: Boolean;
type
  _DwmIsCompositionEnabledFunc = function(var IsEnabled: BOOL): HRESULT; stdcall;
var
  Flag: BOOL;
  DllHandle: THandle;
  OsVersion: TOSVersionInfo;
  DwmIsCompositionEnabledFunc: _DwmIsCompositionEnabledFunc;
begin
  Result := False;
  ZeroMemory(@OsVersion, SizeOf(OsVersion));
  OsVersion.dwOSVersionInfoSize := SizeOf(TOSVersionInfo);
  if ((GetVersionEx(OsVersion)) and (OsVersion.dwPlatformId = VER_PLATFORM_WIN32_NT) and
      (OsVersion.dwMajorVersion = 6) and (OsVersion.dwMinorVersion < 2)) then //Vista&Win7 only (no Win8)
  begin
    DllHandle := LoadLibrary('dwmapi.dll');
    try
      if DllHandle <> 0 then
      begin
        @DwmIsCompositionEnabledFunc := GetProcAddress(DllHandle, 'DwmIsCompositionEnabled');
        if @DwmIsCompositionEnabledFunc <> nil then
        begin
          if DwmIsCompositionEnabledFunc(Flag) = S_OK then
            Result := Flag;
        end;
      end;
    finally
      FreeLibrary(DllHandle);
    end;
  end;
end;
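A minimal usage sketch of the function (FadeOutForm is a hypothetical wrapper for the fade loop shown earlier):
procedure TfrmMain.btnFadeNCloseClick(Sender: TObject);
begin
  if ISAeroEnabled then
    FadeOutForm   // run the AlphaBlendValue fade only when composition is on
  else
    Close;        // skip the eye candy when composition is off
end;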

Related

Delphi XE8 Printer VCL and DocumentProperties strange issue

I'm using the vcl.printers unit (delphi XE8) and I'm facing an error when "talking" to a printer.
I traced into the vcl.printers and found this code (written by EMB people):
if OpenPrinter(ADevice, FPrinterHandle, nil) then
begin
  if DeviceMode = 0 then  // alloc new device mode block if one was not passed in
  begin
    DeviceMode := GlobalAlloc(GHND,
      DocumentProperties(0, FPrinterHandle, ADevice, nil, nil, 0));
    if DeviceMode <> 0 then
    begin
      DevMode := GlobalLock(DeviceMode);
      if DocumentProperties(0, FPrinterHandle, ADevice, DevMode^,
        DevMode^, DM_OUT_BUFFER) < 0 then
      begin
        GlobalUnlock(DeviceMode);
        GlobalFree(DeviceMode);
        DeviceMode := 0;
        DevMode := nil;
      end
    end;
  end;
  if DeviceMode <> 0 then
    SetPrinterCapabilities(DevMode^.dmFields);
end;
The call
DocumentProperties(0, FPrinterHandle, ADevice, nil, nil, 0)
returns the correct buffer size the first time (I didn't note the value), but on the second execution it returns 4294967295 bytes, which is really -1 (the return type in the declaration is wrong), meaning an error.
As you can see, the VCL code handles errors very poorly; there is no error check at all!
But what error do I have here, and why?
DocumentProperties lives in winspool.dll.
To recover from it I have to reboot the PC, and I can't keep doing that for every test pass, which makes debugging very annoying.
The printer is simply "PDFCreator".
I tried on another PC and it seems OK even if I run it many times.
I also have two networked laser printers.
Thanks
In the past I had the same problem with two customers. I tracked it down to the printers unit (Printers.pas). It is kind of hard to track down without a debugger on a distant computer in another part of the country.
OK, but I tracked it all the way down to this line in the SetPrinter function of Printers.pas:
DeviceMode := GlobalAlloc(GHND, DocumentProperties(0, FPrinterHandle, ADevice, StubDevMode, StubDevMode, 0));
When I broke it up into two lines, i.e. called DocumentProperties first, stored the result in an integer variable, checked it, and only called GlobalAlloc if the value was greater than 0 (and debugged it by writing the stored values to a debug file), the error was clearly in the DocumentProperties function, from SPOOL.DLL if I remember correctly. That function returned -1 as the size for the device, but only for this customer, on one computer (he uses my program on 4 or 5).
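A sketch of that defensive split, assuming the same nil-pointer call form as the VCL code quoted in the question (variable names are illustrative, not the actual Printers.pas fix):
var
  BufSize: Integer;
begin
  // Ask DocumentProperties for the required DEVMODE size first and
  // validate it before touching GlobalAlloc.
  BufSize := DocumentProperties(0, FPrinterHandle, ADevice, nil, nil, 0);
  if BufSize > 0 then
    DeviceMode := GlobalAlloc(GHND, BufSize)
  else
    raise Exception.CreateFmt(
      'DocumentProperties failed for "%s" (returned %d)', [ADevice, BufSize]);
end;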
Of all my customers (close to 200 clients) I have had this issue on two computers. One of them somehow fixed itself; I never found out how. The other one I was trying to fix just a couple of minutes ago, and in the end I found a solution: a simple change to the shortcut. I ran the automatic compatibility troubleshooter in Windows 10, clicked its "test program" button, and it worked: no error choosing printers or using their properties. Then I tried again with the shortcut alone and the error returned. That made me think "this has to be connected to how Windows runs this program", so I unchecked "Run as administrator" on the shortcut.
And no problem.
On almost every other computer with Windows 10 I do check "Run as administrator" with no problems. I think there was an update, or some issue with spool.dll, that connects these dots.
PS: If you google Delphi DocumentProperties problems, you will find that this is a known issue; some say it is connected to x86 vs x64 mode, but this is what I found.

Delphi REST mac memory leak

I am currently looking for a way around an apparent memory leak in the Mac implementation of the REST client. The code to generate the memory leak is the following (running XE8, update 1):
program mac_REST_leak_test;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils, REST.Client, REST.Types, IPPeerClient;

var
  request: TRestRequest;
  ii, iMax: integer;

begin
  iMax := 1;
  for ii := 0 to iMax do
  begin
    request := TRestRequest.Create(nil);
    // Fake Online REST API for Testing and Prototyping
    request.Client := TRestClient.Create('http://jsonplaceholder.typicode.com/');
    request.Method := rmPOST;
    request.Execute();
    request.Client.Free();
    request.Free();
  end;
end.
This is the smallest block of code that demonstrates the leak. Essentially, I have a synching service that makes REST requests every so often.
When I run this on Windows, using MadExcept, no leaks are found. Examining the running process in ProcessMonitor shows no increase in the amount of memory being used.
When run on a Mac, however, the Activity Monitor shows the memory allocated to the app continue to rise. Further, when run using Instruments, there appear to be leaks dealing with several URL and HTTP classes on mac.
Does anybody know how to resolve this leak?
(As an aside, it would be really helpful to know exactly where the leak is coming from on Mac, but the only Delphi class listed is TMethodImplementationIntercept. I'm led to believe that this is because Delphi doesn't generate a dSYM file for Mac. If anybody knows a way around that, that would be awesome too!)
UPDATE
By varying iMax from 1 to 10 and comparing the FastMM4 output, it appears that the leak is in the class Macapi.ObjectiveC.TConvObjID.XForm. The 10 iteration output contains 9 more leaks with this as a stack trace compared to the 1 iteration. I have reported this to Embarcadero as RSP-12242.
Yes, FastMM4 has OSX leak reporting support in the latest SVN revision. Unfortunately, the "global" leaks from even a simple empty Delphi FMX application make it difficult to analyse the memory log file. A few leaks have been fixed in XE10, but some objects in the MacApi.ObjectiveC bridge still generate leaks; I have reported that in Quality Central and the Quality Portal (QC & QP). So it's difficult to use FastMM4 for leak finding.
Please separate Delphi object leaks from Objective-C leaks; the latter you can find with Instruments.
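On the Windows side, one simple way to get the Delphi-side report for a test program like the one above is the built-in leak report (a sketch; it uses the stock Delphi memory manager's leak registration, not a full FastMM4 install):
begin
  // Added at the top of the test program's begin..end block:
  // report any leaked Delphi objects in a dialog when the program exits.
  ReportMemoryLeaksOnShutdown := True;

  // ... the TRestRequest / TRestClient loop from the question goes here ...
end.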

Pixel modifying code runs quick in main app, really slow in Delphi 6 DirectShow filter with other problems

I have a Delphi 6 application that sends bitmaps to a DirectShow DLL in real-time, 25 frames a second. The DirectShow DLL is my code too and is also written in Delphi 6 using the DSPACK DirectShow component suite. I have a simple block of code that goes through each pixel in the bitmap modifying the brightness and contrast of the image, if a certain flag is set, otherwise the bitmap is pushed out the DirectShow DLL unmodified (push source video filter). The code used to be in the main application and then I just moved it into the DirectShow DLL. When it was in the main application it ran fine. I could see the changes in the bitmap as expected. However, now that the code resides in the DirectShow DLL it has the following problems:
When the code block below is active the DirectShow DLL is really slow. I have a quad core i5 and it's really slow. I can also see a big spike in the CPU consumption. In contrast, the very same code running in the main application ran fine on an old single core P4. It did hit the CPU noticeably on that old machine but the video was smooth and there were no problems. The images are only 352 x 288 pixels in size.
I don't see the expected changes to the visible bitmap. I can trace the code in the DirectShow DLL and see the numerical values of each pixel properly altered by the code, but the viewable image in the Graph Edit ActiveMovie window looks completely unchanged.
If I deactivate the code, which I can do in real-time, the ActiveMovie window shows video that is as smooth as glass, perfectly rendered with the CPU barely touched. If I reactivate the code the video is now really choppy, probably showing only 1 to 2 frames a second with a long delay before the first frame is shown, and the CPU spikes. Not completely, but a lot more than I would expect.
I tried compiling the DirectShow DLL with everything on including range checking, overflow checking, etc. and there were no warnings or errors during run-time. I then tried compiling for fastest speed and it still had the exact same problems listed above. Something is really wrong and I can't figure out what. Note, I do indeed lock the canvas before modifying the bitmap and unlock it after I'm done. If it weren't for the "everything on" compilation run I noted above I'd say it felt like an FPU Exception was being raised and silently swallowed with every pixel computation, but as I said, no errors or Exceptions are occurring.
UPDATE: I am putting this here so that the solution, which is embedded in one of Roman R's comments, is plainly visible. The problem was that I was not setting the PixelFormat property to pf24Bit before accessing the ScanLine property. As Roman suggested, not doing this must make the TBitmap code create a temporary copy of the bitmap. As soon as I added the line of code below, both problems went away: the changes not being visible and the soft page faults. It's an insidious problem because the only thing affected is the pointer you use to access the ScanLine property, since (assumption) it points to a temporary copy of the bitmap. That must be why the subsequent TextOut() call still worked, since it worked on the original copy of the bitmap.
clip.PixelFormat := pf24bit; // The missing code line that fixes the problem.
Here's the code block I've been referring to:
function IntToByte(i: Integer): Byte;
begin
  if i > 255 then
    Result := 255
  else if i < 0 then
    Result := 0
  else
    Result := i;
end;

// ---------------------------------------------------------------

procedure brightnessTurboBoost(var clip: TBitmap; rangeExpansionPowerOf2: integer; shiftValue: Byte);
var
  p0: PByte;
  x, y: Integer;
begin
  if (rangeExpansionPowerOf2 = 0) and (shiftValue = 0) then
    exit; // These parameter settings will not change the pixel values.
  for y := 0 to clip.Height - 1 do
  begin
    p0 := clip.scanline[y];
    // Can't just do the whole buffer as a big block of bytes since the
    // individual scan lines may be padded for CPU alignment.
    for x := 0 to (clip.Width - 1) * 3 do
    begin
      if rangeExpansionPowerOf2 >= 1 then
        p0^ := IntToByte((p0^ shl rangeExpansionPowerOf2) + shiftValue)
      else
        p0^ := IntToByte(p0^ + shiftValue);
      Inc(p0);
    end;
  end;
end;
There are a few things to say about this code snippet.
1. First of all, you are using the Scanline property of the TBitmap class. I have not been dealing with Delphi for many years, so I might be wrong about this, but I am under the impression that Scanline is not actually a thin accessor. It might internally hide things which can dramatically affect performance, such as "if he wants to access the bits of the image, then we have to first convert it to a DIB before returning pointers". So something that looks simple might turn out to be a killer.
2. "if rangeExpansionPowerOf2 >= 1 then" in the inner loop body? You don't really want to evaluate this for every pixel. Either make two separate functions or duplicate the whole loop in two versions, for zero and non-zero rangeExpansionPowerOf2, and do the if only once.
3. "for ... to (clip.Width - 1) * 3 do": I am not really sure that Delphi optimizes the upper-boundary evaluation to happen only once. You might be doing that multiplication three times for every pixel, while you could do it only once for the whole image.
4. For top performance, IntToByte is definitely implemented in MMX to avoid the ifs and process multiple bytes at once.
Still, as you say the images are only 352x288, I would suspect that #1 is ruining the performance.
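A sketch of points 2 and 3 applied to the routine from the question (untested; it also sets PixelFormat up front, per the update above):
procedure brightnessTurboBoostFast(var clip: TBitmap; rangeExpansionPowerOf2: Integer; shiftValue: Byte);
var
  p0: PByte;
  x, y, upperX: Integer;
begin
  if (rangeExpansionPowerOf2 = 0) and (shiftValue = 0) then
    Exit;
  clip.PixelFormat := pf24bit;      // avoid the hidden temporary copy (see the update above)
  upperX := (clip.Width - 1) * 3;   // hoist the upper bound: evaluated once, not per byte
  for y := 0 to clip.Height - 1 do
  begin
    p0 := clip.ScanLine[y];
    if rangeExpansionPowerOf2 >= 1 then
      for x := 0 to upperX do       // one loop per branch: the if is tested once per call
      begin
        p0^ := IntToByte((p0^ shl rangeExpansionPowerOf2) + shiftValue);
        Inc(p0);
      end
    else
      for x := 0 to upperX do
      begin
        p0^ := IntToByte(p0^ + shiftValue);
        Inc(p0);
      end;
  end;
end;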

Switching from debug to release configuration having no effect on performance?

I have tested a couple of benchmarking snippets on Delphi like this one:
uses
  ..., Diagnostics;

procedure TForm2.Button1Click(Sender: TObject);
var
  i, elapsed: integer;
  stopwatch: TStopwatch;
  ff: textfile;
begin
  if FileExists('c:\bench.txt') then
    DeleteFile('c:\bench.txt');
  stopwatch := TStopwatch.Create;
  stopwatch.Reset;
  stopwatch.Start;
  AssignFile(ff, 'c:\bench.txt');
  Rewrite(ff);
  for I := 1 to 999000 do
    write(ff, 'Delphi programmers are ladies men :D');
  CloseFile(ff);
  stopwatch.Stop;
  elapsed := stopwatch.ElapsedMilliseconds;
  ShowMessage(inttostr(elapsed));
end;
It does not matter whether I run/compile under the debug or release configuration; the result is around 900.
When I switch from debug to release in Visual Studio (for both C++ and C#) my programs become MAGICALLY faster. I am using Delphi 2010 and I activate the release configuration from the Project Manager, as well as Project -> Configuration Manager and even Project -> Options -> Delphi Compiler, but with no effect. Why?
If it matters: I am using Windows XP, I have 1 GB of RAM and an Intel Core 2 CPU.
Did you check how the configurations differ? Even if they have names like RELEASE or DEBUG, they are fully configurable. You can even configure them the other way round.
The code you are timing is mostly I/O related. So make sure that the IO checks are turned off in the RELEASE configuration.
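To rule I/O checking out locally, the directive can also be toggled around the timed loop itself; a minimal sketch:
{$IOCHECKS OFF}   // same as {$I-}; no implicit IOResult test after each write
for I := 1 to 999000 do
  write(ff, 'Delphi programmers are ladies men :D');
{$IOCHECKS ON}    // restore the project default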
Delphi still creates fast code even when debugged ;)
In addition to what Uwe said, make sure you do a "Build" after switching the configuration. Doing a simple compile or running the app will not recompile all units with the new settings.
Like the other commenters, I also wouldn't expect too much of a difference between the two configurations given the benchmark used. The real bottleneck is the I/O and that will very likely outbalance any performance differences between DEBUG and RELEASE.
Finally, debugging in Delphi just isn't that much slower than Release builds. Heck, I sometimes run Outlook in the debugger for most of the day (I'm developing Outlook addins) without noticing any perceivable performance difference.
That's a bad test case I think. All you do is write to a file, which means most of the time is spent in Windows code, not in your Delphi code, and hence the compiler settings won't significantly affect total execution time
There's nothing in the main bulk of your code:
for I := 1 to 999000 do
  write(ff, 'Delphi programmers are ladies men :D');
that requires strenuous checks. Your choices are:
Range checking
Overflow checking
I/O checking
Of those three, only I/O checking will apply, and that is probably the equivalent of adding:
for I := 1 to 999000 do
begin
  hresult := Write(ff, 'Delphi programmers are ladies men :D');
  if hresult < 0 then
    raise EIOException.Create('That''s what your mom told me, in bed.');
end;
And the CMP and JNE CPU instructions are not very complicated. They're dwarfed by writing to the hard-drive.
It runs just as fast because it is fast.

Clock drift on Windows

I've developed a Windows service which tracks business events. It uses the Windows clock to timestamp events. However, the underlying clock can drift quite dramatically (e.g. losing a few seconds per minute), particularly when the CPUs are working hard. Our servers use the Windows Time Service to stay in sync with domain controllers, which uses NTP under the hood, but the sync frequency is controlled by domain policy, and in any case even syncing every minute would still allow significant drift. Are there any techniques we can use to keep the clock more stable, other than using hardware clocks?
Clock ticks should be predictable, but on most PC hardware - because they're not designed for real-time systems - other I/O device interrupts have priority over the clock tick interrupt, and some drivers do extensive processing in the interrupt service routine rather than defer it to a deferred procedure call (DPC), which means the system may not be able to serve the clock tick interrupt until (sometimes) long after it was signalled.
Other factors include bus-mastering I/O controllers which steal many memory bus cycles from the CPU, causing it to be starved of memory bus bandwidth for significant periods.
As others have said, the clock-generation hardware may also vary its frequency as component values change with temperature.
Windows does allow the amount of ticks added to the real-time clock on every interrupt to be adjusted: see SetSystemTimeAdjustment. This would only work if you had a predictable clock skew, however. If the clock is only slightly off, the SNTP client ("Windows Time" service) will adjust this skew to make the clock tick slightly faster or slower to trend towards the correct time.
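A minimal Delphi sketch of reading the current adjustment through those APIs (declared in the Windows unit; SysUtils assumed for Format/BoolToStr, and actually changing it requires the SE_SYSTEMTIME_NAME privilege):
procedure ShowTimeAdjustment;
var
  Adjustment, Increment: DWORD;
  Disabled: BOOL;
begin
  // Adjustment = 100 ns units added per clock interrupt,
  // Increment  = nominal interval between interrupts,
  // Disabled   = True when Windows manages the clock itself.
  if not GetSystemTimeAdjustment(Adjustment, Increment, Disabled) then
    RaiseLastOSError;
  Writeln(Format('adjustment=%d increment=%d disabled=%s',
    [Adjustment, Increment, BoolToStr(Disabled, True)]));
  // A known, constant skew could then be compensated with something like:
  //   SetSystemTimeAdjustment(Adjustment + Correction, False);
end;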
I don't know if this applies, but ...
There's an issue with Windows that if you change the timer resolution with timeBeginPeriod() a lot, the clock will drift.
Actually, there is a bug in Java's Thread wait() (and the os::sleep()) function's Windows implementation that causes this behaviour. It always sets the timer resolution to 1 ms before wait in order to be accurate (regardless of sleep length), and restores it immediately upon completion, unless any other threads are still sleeping. This set/reset will then confuse the Windows clock, which expects the windows time quantum to be fairly constant.
Sun has actually known about this since 2006, and hasn't fixed it, AFAICT!
We actually had the clock going twice as fast because of this! A simple Java program that sleeps 1 millisec in a loop shows this behaviour.
The solution is to set the time resolution yourself, to something low, and keep it there as long as possible. Use timeBeginPeriod() to control that. (We set it to 1 ms without any adverse effects.)
For those coding in Java, the easier way to fix this is by creating a thread that sleeps as long as the app lives.
Note that this will fix this issue on the machine globally, regardless of which application is the actual culprit.
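For Delphi code, a minimal sketch of the same idea (pin the multimedia timer resolution for the life of the process via the MMSystem unit):
unit PinTimerResolution;

interface

implementation

uses
  MMSystem;

initialization
  timeBeginPeriod(1);  // request 1 ms resolution once and keep it
finalization
  timeEndPeriod(1);    // every timeBeginPeriod needs a matching timeEndPeriod
end.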
You could run "w32tm /resync" in a scheduled task .bat file. This works on Windows Server 2003.
Other than resyncing the clock more frequently, I don't think there is much you can do, short of getting a new motherboard, as your clock signal doesn't seem to be at the right frequency.
http://www.codinghorror.com/blog/2007/01/keeping-time-on-the-pc.html
PC clocks should typically be accurate to within a few seconds per day. If you're experiencing massive clock drift-- on the order of minutes per day-- the first thing to check is your source of AC power. I've personally observed systems with a UPS plugged into another UPS (this is a no-no, by the way) that gained minutes per day. Removing the unnecessary UPS from the chain fixed the time problem. I am no hardware engineer, but I'm guessing that some timing signal in the power is used by the real-time clock chip on the motherboard.
As already mentioned, Java programs can cause this issue.
Another solution that does not require code modification is adding the VM argument -XX:+ForceTimeHighResolution (found on the NTP support page).
9.2.3. Windows and Sun's Java Virtual Machine
Sun's Java Virtual Machine needs to be started with the -XX:+ForceTimeHighResolution parameter to avoid losing interrupts.
See http://www.macromedia.com/support/coldfusion/ts/documents/createuuid_clock_speed.htm for more information.
From the referenced link (via the Wayback machine - original link is gone):
ColdFusion MX: CreateUUID Increases the Windows System Clock Speed
Calling the createUUID function multiple times under load in Macromedia ColdFusion MX and higher can cause the Windows system clock to accelerate. This is an issue with the Java Virtual Machine (JVM) in which Thread.sleep calls less than 10 milliseconds (ms) causes the Windows system clock to run faster. This behavior was originally filed as Sun Java Bug 4500388 (developer.java.sun.com/developer/bugParade/bugs/4500388.html) and has been confirmed for the 1.3.x and 1.4.x JVMs.
In ColdFusion MX, the createUUID function has an internal Thread.sleep call of 1 millisecond. When createUUID is heavily utilized, the Windows system clock will gain several seconds per minute. The rate of acceleration is proportional to the number of createUUID calls and the load on the ColdFusion MX server. Macromedia has observed this behavior in ColdFusion MX and higher on Windows XP, 2000, and 2003 systems.
Increase the frequency of the re-sync.
If the syncs are with your own main server on your own network there's no reason not to sync every minute.
Sync more often. Look at the Registry entries for the W32Time service, especially "Period". "SpecialSkew" sounds like it would help you.
Clock drift may be a consequence of the temperature; maybe you could try to keep the temperature more constant - using better cooling perhaps? You're never going to lose drift totally, though.
Using an external clock (GPS receiver etc...), and a statistical method to relate CPU time to Absolute Time is what we use here to synch events in distributed systems.
Since it sounds like you have a big business:
Take an old laptop or something which isn't good for much, but seems to have a more or less reliable clock, and call it the Timekeeper. The Timekeeper's only job is to, once every (say) 2 minutes, send a message to the servers telling the time. Instead of using the Windows clock for their timestamps, the servers will put down the time from the Timekeeper's last signal, plus the elapsed time since the signal. Check the Timekeeper's clock by your wristwatch once or twice a week. This should suffice.
What servers are you running? On desktops, the times I've come across this have been with Spread Spectrum FSB enabled, which causes some issues with the interrupt timing that makes the clock tick. You may want to see if this is an option in the BIOS on one of those servers and turn it off if enabled.
Another option you have is to edit the time polling interval and make it much shorter using the following registry key, most likely you'll have to add it (note this is a DWORD value and the value is in seconds, e.g. 600 for 10min):
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NtpClient\SpecialPollInterval
Here's a full workup on it: KB816042
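A sketch of setting that value from Delphi with TRegistry (assumes Registry, SysUtils and Windows in the uses clause; writing under HKLM needs an elevated process):
procedure SetNtpSpecialPollInterval(Seconds: Integer);
var
  Reg: TRegistry;
begin
  Reg := TRegistry.Create(KEY_WRITE);
  try
    Reg.RootKey := HKEY_LOCAL_MACHINE;
    if Reg.OpenKey('SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NtpClient', False) then
      Reg.WriteInteger('SpecialPollInterval', Seconds)  // DWORD, in seconds (600 = 10 min)
    else
      raise Exception.Create('Could not open the NtpClient key');
  finally
    Reg.Free;
  end;
end;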
I once wrote a Delphi class to handle time resynchs. It is pasted below. Now that I see the "w32tm" command mentioned by Larry Silverman, I suspect I wasted my time.
unit TimeHandler;

interface

type
  TTimeHandler = class
  private
    FServerName: widestring;
  public
    constructor Create(servername: widestring);
    function RemoteSystemTime: TDateTime;
    procedure SetLocalSystemTime(settotime: TDateTime);
  end;

implementation

uses
  Windows, SysUtils, Messages;

function NetRemoteTOD(ServerName: PWideChar; var buffer: pointer): integer; stdcall; external 'netapi32.dll';
function NetApiBufferFree(buffer: Pointer): integer; stdcall; external 'netapi32.dll';

type
  //See MSDN documentation on the TIME_OF_DAY_INFO structure.
  PTime_Of_Day_Info = ^TTime_Of_Day_Info;
  TTime_Of_Day_Info = record
    ElapsedDate: integer;
    Milliseconds: integer;
    Hours: integer;
    Minutes: integer;
    Seconds: integer;
    HundredthsOfSeconds: integer;
    TimeZone: LongInt;
    TimeInterval: integer;
    Day: integer;
    Month: integer;
    Year: integer;
    DayOfWeek: integer;
  end;

constructor TTimeHandler.Create(servername: widestring);
begin
  inherited Create;
  FServerName := servername;
end;

function TTimeHandler.RemoteSystemTime: TDateTime;
var
  Buffer: pointer;
  Rek: PTime_Of_Day_Info;
  DateOnly, TimeOnly: TDateTime;
  timezone: integer;
begin
  //if the call is successful...
  if 0 = NetRemoteTOD(PWideChar(FServerName), Buffer) then begin
    //store the time of day info in our special buffer structure
    Rek := PTime_Of_Day_Info(Buffer);
    //windows time is in GMT, so we adjust for our current time zone
    if Rek.TimeZone <> -1 then
      timezone := Rek.TimeZone div 60
    else
      timezone := 0;
    //decode the date from integers into TDateTimes
    //assume zero milliseconds
    try
      DateOnly := EncodeDate(Rek.Year, Rek.Month, Rek.Day);
      TimeOnly := EncodeTime(Rek.Hours, Rek.Minutes, Rek.Seconds, 0);
    except on e: exception do
      raise Exception.Create(
        'Date retrieved from server, but it was invalid!' +
        #13#10 +
        e.Message
      );
    end;
    //translate the time into a TDateTime
    //apply any time zone adjustment and return the result
    Result := DateOnly + TimeOnly - (timezone / 24);
  end //if call was successful
  else begin
    raise Exception.Create('Time retrieval failed from "' + FServerName + '"');
  end;
  //free the data structure we created
  NetApiBufferFree(Buffer);
end;

procedure TTimeHandler.SetLocalSystemTime(settotime: TDateTime);
var
  SystemTime: TSystemTime;
begin
  DateTimeToSystemTime(settotime, SystemTime);
  SetLocalTime(SystemTime);
  //tell windows that the time changed
  PostMessage(HWND_BROADCAST, WM_TIMECHANGE, 0, 0);
end;

end.
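A hypothetical usage of the class above (the server name is illustrative):
var
  TH: TTimeHandler;
begin
  TH := TTimeHandler.Create('\\TIMESERVER');  // illustrative server name
  try
    // pull the remote time and push it into the local clock
    TH.SetLocalSystemTime(TH.RemoteSystemTime);
  finally
    TH.Free;
  end;
end;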
I believe Windows Time Service only implements SNTP, which is a simplified version of NTP. A full NTP implementation takes into account the stability of your clock in deciding how often to sync.
You can get the full NTP server for Windows here.
