I have a Qt GUI that spawns a C++11 server (built with Clang/Xcode on OS X 10.8).
The server does cryptographic proof-of-work mining of a name (single mining thread).
When I launch the .app by double-clicking it, the process takes 4.5 hours.
When I run the exact same executable inside the .app bundle from the terminal, the process takes 30 minutes.
Question: how do I debug this?
thank you
====================================
Even worse:
The mining server is running in a terminal.
If I start the GUI program, which connects to the server and just sends it the "mine" command over IPC: 4 hours.
If I start a command-line UI that connects to the server and just sends it the "mine" command over IPC: 30 minutes.
In both cases the server is mining in a tight loop. Corrupt memory? A single CPU is at 100%, as it should be. I can't figure it out.
=========
This variable is used without locking...
volatile bool running = true;
Server thread:
fut = std::async(&Commissioner::generateName, &comish, name, m_priv.get_public_key());
Server loop:
nonce_t reset = std::numeric_limits<nonce_t>::max() - 1000;
while (running && hit < target) {
    if (nt.nonce >= reset) {
        nt.utc_sec = fc::time_point::now();
        nt.nonce = 0;
    }
    else {
        ++nt.nonce;
    }
    hit = difficulty(nt.id());
}
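If the unsynchronized flag were the culprit: in C++11, volatile provides neither atomicity nor inter-thread ordering, so the portable fix is std::atomic<bool>. A minimal sketch, assuming the flag is used purely as a stop signal for the loop above:

#include <atomic>

std::atomic<bool> running{true};   // replaces: volatile bool running = true;

// Mining loop: relaxed ordering suffices for a pure stop flag,
// since no other data is published through it.
while (running.load(std::memory_order_relaxed) && hit < target) {
    // ... mine as before ...
}

// From the controlling thread, to stop the miner:
running.store(false, std::memory_order_relaxed);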
Evidence is now pointing to deterministic chaotic behavior: the run time is just very sensitive to initial conditions.
The initial condition may be the timestamp data within the object that is hashed during mining.
Mods, please close.
I'm working on a MacBook Pro, OS: Catalina 10.15.7.
At first I was using VS Code to develop in Go, but the fan (I guess?) started sounding like a turbojet after a while, up to the point that the entire OS would shut down on its own. (I do not recall exactly what the message said; it was a black screen with white text saying something like "your CPU utilization was too high and we had to restart the system".)
Today I am trying to run this Python 3 script:
#!/usr/local/bin/python3
import csv
import json
import boto3
import time
from multiprocessing import Pool

dynamodb = boto3.resource('dynamodb', endpoint_url='http://localhost:4566', region_name='us-east-2')
table = dynamodb.Table('myTable')

collection = []
count = 0

with open('items.csv', newline='') as f:
    reader = csv.DictReader(f)
    for row in reader:
        obj = {}
        collection.append({
            "PK": int(row['id']),
            "SK": "product",
            "Name": row['name']
        })

def InsertItem(i):
    table.put_item(Item=i)

if __name__ == '__main__':
    with Pool(processes=25) as pool:
        result = pool.map(InsertItem, collection, 50)
        print(result)
And the same behavior occurs (it does not seem to be related to VS Code now, since I'm running this script directly from the terminal): the fans are extremely noisy, performance drops to almost zero, and I get the spinning "lollipop" mouse pointer of death (which seems to be an omen of the machine being about to restart itself), and the process I described above happens again.
Some hints about what is going on:
I'm not the only one having this problem. A teammate who does React is seeing the same behaviour. (He is using VS Code too, but I think the problem is more generic.)
It seems to appear only with "intensive" tasks. (And please take "intensive" with a grain of salt: I do the very same tasks on my Ubuntu machine with half the RAM and it does not even flinch.)
I have been using Macs for years, and I do not recall having this issue.
So, my question is: is someone else noticing something similar? Is there some workaround for this?
Last note: I tested the Python script above last week, and it did not take even 2 minutes to run. Today, with these issues, it just lingers forever. I can see from the prints I am doing that it attempts to insert items, but it freezes without moving forward.
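For reference, the insert step can also be done in a single process with boto3's batch_writer (a sketch along the lines of the script above; batch_writer buffers put_item calls into batches under the hood), which would take the 25-process pool out of the equation entirely:

#!/usr/local/bin/python3
import csv
import boto3

dynamodb = boto3.resource('dynamodb', endpoint_url='http://localhost:4566', region_name='us-east-2')
table = dynamodb.Table('myTable')

# batch_writer() buffers and flushes writes in batches; no multiprocessing needed.
with open('items.csv', newline='') as f, table.batch_writer() as batch:
    for row in csv.DictReader(f):
        batch.put_item(Item={"PK": int(row['id']), "SK": "product", "Name": row['name']})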
I have an app written (partially) in Go; as part of its operation it spawns an external process (written in C) and begins monitoring it. This external process can take many hours to complete, so I am looking for a way to prevent the machine from sleeping or hibernating while it is processing.
I would then like to be able to relinquish this lock, so that when the process is finished the machine is allowed to sleep/hibernate.
I am initially targeting Windows, but a cross-platform solution would be ideal (does *nix even hibernate?).
Thanks to Anders for pointing me in the right direction - I put together a minimal example in Go (see below).
Note: polling to reset the timer seems to be the only reliable method. I found that when trying to combine it with the continuous flag, it would only take effect for approximately 30 seconds (no idea why). That said, the polling in this example is excessive and could probably be relaxed to every 10 minutes (since the minimum hibernation timeout is 15 minutes).
Also, FYI, this is a Windows-specific example:
package main

import (
    "log"
    "syscall"
    "time"
)

// Execution states for SetThreadExecutionState.
const (
    EsSystemRequired = 0x00000001
    EsContinuous     = 0x80000000
)

var pulseTime = 10 * time.Second

func main() {
    kernel32 := syscall.NewLazyDLL("kernel32.dll")
    setThreadExecStateProc := kernel32.NewProc("SetThreadExecutionState")

    pulse := time.NewTicker(pulseTime)
    log.Println("Starting keep alive poll... (silence)")
    for range pulse.C {
        // Reset the system idle timer. Without ES_CONTINUOUS the effect
        // is one-shot, which is why we re-assert it on every tick.
        setThreadExecStateProc.Call(uintptr(EsSystemRequired))
    }
}
The above is tested on Windows 7 and 10 (not tested on Windows 8 yet, but presumed to work there too).
Any user request to sleep will override this method; this includes actions such as shutting the lid on a laptop (unless power management settings are altered from the defaults).
Those were sensible behaviors for my application.
On Windows, your first step is to try SetThreadExecutionState:
Enables an application to inform the system that it is in use, thereby preventing the system from entering sleep or turning off the display while the application is running
This is not a perfect solution, but I assume that is not an issue for you:
The SetThreadExecutionState function cannot be used to prevent the user from putting the computer to sleep. Applications should respect that the user expects a certain behavior when they close the lid on their laptop or press the power button
The Windows 8 connected standby feature is also something you might need to consider. Looking at the power-related APIs, we find this description of PowerRequestSystemRequired:
The system continues to run instead of entering sleep after a period of user inactivity.
This request type is not honored on systems capable of connected standby. Applications should use PowerRequestExecutionRequired requests instead.
If you are dealing with tablets and other small devices, then you can try calling PowerSetRequest with PowerRequestExecutionRequired to prevent this, although the description of that is also not ideal:
The calling process continues to run instead of being suspended or terminated by process lifetime management mechanisms. When and how long the process is allowed to run depends on the operating system and power policy settings.
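For completeness, here is a minimal Go sketch of the PowerSetRequest route, in the same style as the example above. The constants and the REASON_CONTEXT layout are transcribed from the Windows SDK headers, so treat them as assumptions rather than tested values:

package main

import (
    "syscall"
    "unsafe"
)

const (
    powerRequestContextVersion      = 0   // POWER_REQUEST_CONTEXT_VERSION
    powerRequestContextSimpleString = 0x1 // POWER_REQUEST_CONTEXT_SIMPLE_STRING
    powerRequestExecutionRequired   = 3   // POWER_REQUEST_TYPE value
)

// Mirrors REASON_CONTEXT with a simple reason string.
type reasonContext struct {
    Version            uint32
    Flags              uint32
    SimpleReasonString *uint16
}

func main() {
    kernel32 := syscall.NewLazyDLL("kernel32.dll")
    powerCreateRequest := kernel32.NewProc("PowerCreateRequest")
    powerSetRequest := kernel32.NewProc("PowerSetRequest")
    powerClearRequest := kernel32.NewProc("PowerClearRequest")

    reason, _ := syscall.UTF16PtrFromString("long-running external process")
    ctx := reasonContext{
        Version:            powerRequestContextVersion,
        Flags:              powerRequestContextSimpleString,
        SimpleReasonString: reason,
    }

    h, _, _ := powerCreateRequest.Call(uintptr(unsafe.Pointer(&ctx)))
    powerSetRequest.Call(h, powerRequestExecutionRequired)
    defer powerClearRequest.Call(h, powerRequestExecutionRequired)

    // ... run and monitor the long job here ...
}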
You might also want to use ShutdownBlockReasonCreate, but I'm not sure whether it blocks sleep/hibernate.
This question comes from the recent question "Correct way to cap Mathematica memory use?"
I wonder: is it possible to programmatically restart the MathKernel, keeping the current FrontEnd process connected to the new MathKernel process and evaluating some code in the new MathKernel session? I mean a "transparent" restart which allows a user to continue working with the FrontEnd while having a fresh new MathKernel process with some code from the previous kernel evaluated/evaluating in it.
The motivation for the question is to have a way to automate restarting the MathKernel when it takes too much memory, without breaking the computation. In other words, the computation should be continued automatically in the new MathKernel process without interaction with the user (while keeping the user's ability to interact with Mathematica as before). The details of what code should be evaluated in the new kernel are of course specific to each computational task; I am looking for a general solution for how to automatically continue the computation.
From a comment by Arnoud Buzing yesterday on the Stack Exchange Mathematica chat, quoted in its entirety:
In a notebook, if you have multiple cells you can put Quit in a cell by itself and set this option:
SetOptions[$FrontEnd, "ClearEvaluationQueueOnKernelQuit" -> False]
Then if you have a cell above it and a cell below it, and you select all three and evaluate, the kernel will Quit but the front end's evaluation queue will continue (and restart the kernel for the last cell).
-- Arnoud Buzing
The following approach runs one kernel to open a front-end with its own kernel, which is then closed and reopened, renewing the second kernel.
This file is the MathKernel input, C:\Temp\test4.m
Needs["JLink`"];
$FrontEndLaunchCommand="Mathematica.exe";
UseFrontEnd[
nb = NotebookOpen["C:\\Temp\\run.nb"];
SelectionMove[nb, Next, Cell];
SelectionEvaluate[nb];
];
Pause[8];
CloseFrontEnd[];
Pause[1];
UseFrontEnd[
nb = NotebookOpen["C:\\Temp\\run.nb"];
Do[SelectionMove[nb, Next, Cell],{12}];
SelectionEvaluate[nb];
];
Pause[8];
CloseFrontEnd[];
Print["Completed"]
The demo notebook, C:\Temp\run.nb, contains two cells:
x1 = 0;
Module[{},
    While[x1 < 1000000,
        If[Mod[x1, 100000] == 0, Print["x1=" <> ToString[x1]]]; x1++];
    NotebookSave[EvaluationNotebook[]];
    NotebookClose[EvaluationNotebook[]]]
Print[x1]

x1 = 0;
Module[{},
    While[x1 < 1000000,
        If[Mod[x1, 100000] == 0, Print["x1=" <> ToString[x1]]]; x1++];
    NotebookSave[EvaluationNotebook[]];
    NotebookClose[EvaluationNotebook[]]]
The initial kernel opens a front end and runs the first cell; then it quits the front end, reopens it, and runs the second cell.
The whole thing can be run either by pasting the MathKernel input (in one go) into a kernel session, or from a batch file, e.g. C:\Temp\RunTest2.bat:
@echo off
setlocal
set PATH=C:\Program Files\Wolfram Research\Mathematica\8.0\;%PATH%
echo Launching MathKernel %TIME%
start MathKernel -noprompt -initfile "C:\Temp\test4.m"
ping localhost -n 30 > nul
echo Terminating MathKernel %TIME%
taskkill /F /FI "IMAGENAME eq MathKernel.exe" > nul
endlocal
It's a little elaborate to set up, and in its current form it depends on knowing how long to wait before closing and restarting the second kernel.
Perhaps the parallel computation machinery could be used for this? Here is a crude set-up that illustrates the idea:
Needs["SubKernels`LocalKernels`"]
doSomeWork[input_] := {$KernelID, Length[input], RandomReal[]}
getTheJobDone[] :=
Module[{subkernel, initsub, resultSoFar = {}}
, initsub[] :=
( subkernel = LaunchKernels[LocalMachine[1]]
; DistributeDefinitions["Global`"]
)
; initsub[]
; While[Length[resultSoFar] < 1000
, DistributeDefinitions[resultSoFar]
; Quiet[ParallelEvaluate[doSomeWork[resultSoFar], subkernel]] /.
{ $Failed :> (Print@"Ouch!"; initsub[])
, r_ :> AppendTo[resultSoFar, r]
}
]
; CloseKernels[subkernel]
; resultSoFar
]
This is an over-elaborate setup to generate a list of 1,000 triples of numbers. getTheJobDone runs a loop that continues until the result list contains the desired number of elements. Each iteration of the loop is evaluated in a subkernel. If the subkernel evaluation fails, the subkernel is relaunched. Otherwise, its return value is added to the result list.
To try this out, evaluate:
getTheJobDone[]
To demonstrate the recovery mechanism, open the Parallel Kernel Status window and kill the subkernel from time to time. getTheJobDone will feel the pain and print Ouch! whenever the subkernel dies. However, the overall job continues and the final result is returned.
The error-handling here is very crude and would likely need to be bolstered in a real application. Also, I have not investigated whether really serious error conditions in the subkernels (like running out of memory) would have an adverse effect on the main kernel. If so, then perhaps subkernels could kill themselves if MemoryInUse[] exceeded a predetermined threshold.
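That last idea could be as simple as a guard at the top of the work function - a hypothetical sketch, with the threshold picked arbitrarily:

doSomeWork[input_] := (
    If[MemoryInUse[] > 2*10^9, Exit[]];  (* subkernel kills itself; threshold is arbitrary *)
    {$KernelID, Length[input], RandomReal[]}
)

The main kernel would then see $Failed from the dead subkernel and relaunch it through initsub[], exactly as in the kill-by-hand demonstration above.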
Update - Isolating the Main Kernel From Subkernel Crashes
While playing around with this framework, I discovered that any use of shared variables between the main kernel and subkernel rendered Mathematica unstable should the subkernel crash. This includes the use of DistributeDefinitions[resultSoFar] as shown above, and also explicit shared variables using SetSharedVariable.
To work around this problem, I transmitted the resultSoFar through a file. This eliminated the synchronization between the two kernels with the net result that the main kernel remained blissfully unaware of a subkernel crash. It also had the nice side-effect of retaining the intermediate results in the event of a main kernel crash as well. Of course, it also makes the subkernel calls quite a bit slower. But that might not be a problem if each call to the subkernel performs a significant amount of work.
Here are the revised definitions:
Needs["SubKernels`LocalKernels`"]
doSomeWork[] := {$KernelID, Length[Get[$resultFile]], RandomReal[]}
$resultFile = "/some/place/results.dat";
getTheJobDone[] :=
Module[{subkernel, initsub, resultSoFar = {}}
, initsub[] :=
( subkernel = LaunchKernels[LocalMachine[1]]
; DistributeDefinitions["Global`"]
)
; initsub[]
; While[Length[resultSoFar] < 1000
, Put[resultSoFar, $resultFile]
; Quiet[ParallelEvaluate[doSomeWork[], subkernel]] /.
{ $Failed :> (Print@"Ouch!"; CloseKernels[subkernel]; initsub[])
, r_ :> AppendTo[resultSoFar, r]
}
]
; CloseKernels[subkernel]
; resultSoFar
]
I had a similar requirement when I ran a CUDAFunction in a long loop and CUDALink ran out of available memory (similar to: https://mathematica.stackexchange.com/questions/31412/cudalink-ran-out-of-available-memory). There is no improvement in the memory leak even with the latest Mathematica 10.4. I figured out a workaround and hope you may find it useful. The idea is to use a bash script to call a Mathematica program (run in batch mode) multiple times, passing parameters from the bash script. Here are detailed instructions and a demo (this is for Windows):
To use bash scripts on Windows you need to install Cygwin (https://cygwin.com/install.html).
Convert your Mathematica notebook to a package (.m) so it can be used in script mode. If you save your notebook using "Save As...", all the commands will be converted to comments (this was noted by Wolfram Research), so it is better to create a package (File -> New -> Package) and then copy and paste your commands into that.
Write the bash script using the vi editor (instead of Notepad or gedit on Windows) to avoid the "\r" problem (http://www.linuxquestions.org/questions/programming-9/shell-scripts-in-windows-cygwin-607659/).
Here is a demo of the test.m file:
str = $CommandLine;
len = Length[str];
Do[
    If[str[[i]] == "-start",
        start = ToExpression[str[[i + 1]]];
        Pause[start];
        Print["Done in ", start, " seconds"];
    ];
, {i, 2, len - 1}];
This Mathematica code reads the parameter from the command line and uses it for the calculation.
Here is the bash script (script.sh) that runs test.m several times with different parameters:
#!/bin/bash
for ((i=2; i<10; i+=2))
do
    math -script test.m -start $i
done
In the Cygwin terminal, type "chmod a+x script.sh" to make the script executable; then you can run it by typing "./script.sh".
You can programmatically terminate the kernel using Exit[]. The front end (notebook) will automatically start a new kernel when you next try to evaluate an expression.
Preserving "some code from the previous kernel" is going to be more difficult. You have to decide what you want to preserve. If you think you want to preserve everything, then there's no point in restarting the kernel. If you know what definitions you want to save, you can use DumpSave to write them to a file before terminating the kernel, and then use << to load that file into the new kernel.
On the other hand, if you know what definitions are taking up too much memory, you can use Unset, Clear, ClearAll, or Remove to remove those definitions. You can also set $HistoryLength to something smaller than Infinity (the default) if that's where your memory is going.
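For instance (the symbol name is a placeholder):

ClearAll[hugeIntermediate];  (* drop a specific memory hog *)
$HistoryLength = 10;         (* keep only the last 10 Out[] values instead of all of them *)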
Sounds like a job for CleanSlate.
<< Utilities`CleanSlate`;
CleanSlate[]
From: http://library.wolfram.com/infocenter/TechNotes/4718/
"CleanSlate, tries to do everything possible to return the kernel to the state it was in when the CleanSlate.m package was initially loaded."
We are currently testing a bug fix for an old VB6 application. The initial version of the program would get the PID, store it in an Integer, and then write it to the database. This works fine until your application gets assigned a PID higher than 32767, at which point you get an overflow and the application dies.
We fixed this by changing everything to Long instead of Integer, but now we have a problem testing it. We only see this problem rarely in our production environments (but with devastating effect when it occurs), and never in testing. I've tried to provoke a high PID by spawning a ton of programs, but I never managed to get past PID 25000.
I did find a tool called HighPid (http://winprogger.com/?p=29), but sadly it doesn't seem to deliver on its promises. So does anyone out there have a similar (but working) tool, or some other trick to force a high PID on a Windows server?
Launch 32,767 really lightweight dummy processes? ;-)
100 KB × 32K ≈ 3.2 GB, so RAM should not limit you.
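A throwaway sketch of that idea in Go. The use of waitfor as a cheap, blocking child process is an assumption; any lightweight long-lived command would do:

package main

import (
    "fmt"
    "os/exec"
)

func main() {
    var procs []*exec.Cmd
    for i := 0; i < 32000; i++ {
        // waitfor blocks until the named signal arrives, so each
        // child just sits there occupying a PID.
        cmd := exec.Command("waitfor", "dummySignal")
        if err := cmd.Start(); err != nil {
            fmt.Println("stopped at", i, ":", err)
            break
        }
        procs = append(procs, cmd)
    }
    fmt.Println("spawned", len(procs), "processes; now launch your app and check its PID")
    // Clean up afterwards with: waitfor /si dummySignal (or taskkill /im waitfor.exe)
}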
You could look at it from a different perspective and instrument the function that retrieves the process id, something like (pseudocode):
function GetPID()
{
    // ... retrieve process id
#if DEBUG
    return pid + 32000;
#else
    return pid;
#endif
}
My Win32 application performs numerous disk operations in a designated temporary folder while it runs, and seriously redesigning it is out of the question.
Some clients have antivirus software that scans the same temporary folder (it simply scans everything). We tried to talk them into disabling it; that didn't work, so it's out of the question as well.
Every once in a while (roughly once per thousand file operations) my application tries to perform an operation on a file which at that very moment is open in the antivirus scanner and is therefore locked by the operating system. A sharing violation occurs and causes an error in my application. This happens about once every three minutes on average.
The temporary folder can contain up to 100k files in the most typical scenarios, so I don't like the idea of keeping them all open at all times, because that could cause resource exhaustion in some edge conditions.
Is there some reasonable strategy for my application to react to situations when a needed file is locked? Maybe something like this?
for (int i = 0; i < ReasonableNumber; i++) {
    try {
        performOperation(); // do useful stuff here
        break;
    } catch (...) {
        if (i == ReasonableNumber - 1) {
            throw; // not to hide errors if unlock never happens
        }
    }
    Sleep(ReasonableInterval);
}
Is this a viable strategy? If so, how many times and how often should my application retry? What are better ideas if any?
A virus scanner that locks files while it's scanning them is quite bad. Clients who have virus scanners this bad need to have their brains replaced... ;-)
Okay, enough ranting. If a file is locked by some other process, then you can use a "try again" strategy like the one you suggest. On the other hand, do you really need to close and then re-open those files? Can't you keep them open until your process is done?
One tip: add a delay (sleep) before you try to re-open the file. About 100 ms should be enough. If the virus scanner keeps the file open that long, it's a really bad scanner; clients with scanners that bad deserve the exception message they'll see.
Typically, try up to three times: open; on failure, try again; on a second failure, try again; on a third failure, just crash.
Remember to crash in a user-friendly way.
I've had experience with antivirus software made by both Symantec and AVG that resulted in files being unavailable to open.
A common problem we experienced back in the 2002 time frame with Symantec arose in MSDev6 when a file was updated in this sequence:
1. A file is opened.
2. Contents are modified in memory.
3. The application needs to commit the changes.
4. The application creates a new tmp file with a new copy of the file + changes.
5. The application deletes the old file.
6. The application copies the tmp file to the old file name.
7. The application deletes the tmp file.
The problem would occur between step 5 and step 6. Symantec would do something to slow down the delete, preventing the creation of a file with the same name (CreateFile returned ERROR_DELETE_PENDING). MSDev6 would fail to notice that, meaning step 6 failed. Step 7 still happened, though, and the delete of the original would eventually finish. So the file no longer existed on disk!
With AVG, we have been experiencing intermittent problems opening files that have just been modified.
Our resolution was a try/catch in a reasonable loop, as in the question. Our loop count is 5.
If there is any possibility that some other process - be it the antivirus software, a backup utility, or even the user themselves - can open the file, then you must code for that possibility.
Your solution, while perhaps not the most elegant, will certainly work as long as ReasonableNumber is sufficiently large. In the past I've used 10 as the reasonable number; I certainly wouldn't go any higher, and you could get away with a lower value such as 5.
The value of the sleep? 100 ms, or 200 ms at most.
Bear in mind that most of the time your application will get the file on the first try anyway.
It depends on how big your files are, but for tens to hundreds of KB I find 5 tries with a 100 ms (0.1 second) wait to be sufficient. If you still hit the error once in a while, double the wait; but YMMV.
If you have a few places in the code that need to do this, may I suggest taking a functional approach:
using System;

namespace Retry
{
    class Program
    {
        static void Main(string[] args)
        {
            int i = 0;
            Utils.Retry(() =>
            {
                i = i + 1;
                if (i < 3)
                    throw new ArgumentOutOfRangeException();
            });
            Console.WriteLine(i);
            Console.Write("Press any key...");
            Console.ReadKey();
        }
    }

    class Utils
    {
        public delegate void Retryable();
        static int RETRIES = 5;
        static int WAIT = 100; /* ms */

        static public void Retry(Retryable retryable)
        {
            int retrys = RETRIES;
            int wait = WAIT;
            Exception err;
            do
            {
                try
                {
                    err = null;
                    retryable();
                }
                catch (Exception e)
                {
                    err = e;
                    if (retrys != 1)
                    {
                        System.Threading.Thread.Sleep(wait);
                        wait *= 2; // exponential back-off between attempts
                    }
                }
            } while (--retrys > 0 && err != null);
            if (err != null)
                throw err;
        }
    }
}
Could you change your application so that you don't release the file handle? If you hold a lock on the file yourself, the antivirus application will not be able to scan it.
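A minimal Win32 sketch of that idea (the path and the error handling are placeholders): open the file with a share mode of 0 and keep the handle for as long as you work with the file, so nothing else can open it in between:

#include <windows.h>

// Hypothetical example: hold an exclusive handle on one temp file.
// dwShareMode = 0 means no other process can open the file meanwhile.
HANDLE h = CreateFileW(L"C:\\MyApp\\Temp\\work0001.tmp",   // placeholder path
                       GENERIC_READ | GENERIC_WRITE,
                       0,                       // no sharing
                       NULL,
                       OPEN_ALWAYS,
                       FILE_ATTRIBUTE_NORMAL,
                       NULL);
if (h == INVALID_HANDLE_VALUE) {
    // e.g. sharing violation: fall back to the retry loop discussed above
}
// ... perform all operations on the file through this handle ...
CloseHandle(h);   // release only when you are completely done with the file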
Otherwise a strategy such as yours will help a bit, but only because it reduces the probability; it doesn't solve the problem.
Tough problem. Most of the ideas I have go in a direction that you don't want (e.g. a redesign).
I don't know how many files you have in your directory, but if it's not that many, you may be able to work around the problem by keeping all the files open and locked while your program runs.
That way the virus scanner will have no chance to interrupt your file accesses anymore.