How to stop Wolfram script process? NetModel::dllock: The requested neural network model is currently downloading in another process, - download

I try to run GPT2 with Wolfram script.
In[1]:= NetModel["GPT2 Transformer Trained on WebText Data"]
The download speed is too slow,500M in 6 hours???
Thus, I stop the script with Ctrl+C, and change the Internet connection.
This leads to the following Errors. Restart the computer and network doesn't make sense. I just want to know:
How to stop the wolfram script process after the terminal(even the computer) is closed?
Where is the Default Download PATH of WolframEngine? So that I may delete the Wrong Files.
In[2]:= NetModel["GPT2 Transformer Trained on WebText Data"]
NetModel::dllock:
The requested neural network model is currently downloading in another
process, try again when it is finished. If there is not a download in
progress run DeleteObject[ResourceObject[GPT2 Transformer Trained on
WebText Data]] before trying again.
Out[2]= $Failed
In[3]:= DeleteObject[ResourceObject[GPT2 Transformer Trained on WebText Data]]
ResourceObject::noas:
The argument Data GPT2 on Trained Transformer WebText
should be the name or id of an existing resource or an Association
defining a new resource.
DeleteObject::nim: Cannot delete object $Failed.
Out[3]= $Failed

Related

is there a way to save only the model with huggingface trainer?

I want to keep multiple checkpoints during training to analyse them later but the Trainer also saves other files to resume training. Is there a way to only save the model to save space and writing time?
15K rng_state.pth
906 trainer_state.json
623 scheduler.pt
2,1G optimizer.pt
2,5K training_args.bin
1,1G pytorch_model.bin
900 config.json
I could just delete the optimizer after training but i'm also working with a disk with slower write speed so this is also a consideration
Unfortunately, there is currently no way to disable the saving of single files. There are basically two ways to get your behavior:
The "hacky" way would be to simply disable the line of code in the Trainer source code that stores the optimizer, which (if you train on your local machine) should be this one. However, whenever you update your transformers version, this could lead to ugly behavior, which is why I recommend the second one.
Overwrite the _save_checkpoint() function in your own Trainer object. This way, you always guarantee that the correct files are saved, and don't have to interact with the library's code.
An outline for what this looks like:
from transformers import Trainer
class OwnTrainer(Trainer):
# Don't forget to correctly set up __init__()
# ...
def _save_checkpoint(self, model, trial, metrics=None):
# insert your own behavior here

Vowpal Wabbit: obtaining a readable_model when in --daemon mode

I am trying to stream my data to vw in --daemon mode, and would like to obtain at the end the value of the coefficients for each variable.
Therefore I'd like vw in --daemon mode to either:
- send me back the current value of the coefficients for each line of data I send.
- Write the resulting model in the "--readable_model" format.
I know about the dummy example trick save_namemodel | ... to get vw in daemon mode to save the model to a given file, but it isn't enough as I can't access the coefficient values from that file.
Any idea on how I could solve my problem ?
Unfortunately, on-demand saving of readable models isn't currently supported in the code but it shouldn't be too hard to add. Open source software is there for users to improve according to their needs. You may open a issue on github, or better, contribute the change.
See:
this code line where only the binary regressor is saved using save_predictor(). One could envision a "rsave" or "saver" tag/command to store the regressor in readable form as is being done in this code line
As a work-around you may call vw with --audit and parse every audit line for the feature names and their current weights but this would:
make vw much slower
require parsing every line to get the values rather than on demand

Automatically saving notebook (or other type files in mathematica) files

I have been facing this problem for sometimes now, a laziness caused in part by the fact that Microsoft Office automatically save files you are working on with versions and automatic recovery.
Many times when I am starting a new notebook in mathematica to do some tests or whatever, I often forget to save what I am doing.
Every now and then, depending on the computer I am using, the computer crashes and all the beautiful work I was doing is lost forever...
Is there a way to get around this other that manically saving my files every five minutes? How about file versioning?
BTW: Using MMA V8
Regarding autosaving, you may want to check out the NotebookAutoSave option, which can be set to True through Fromat->Option Inspector. You have to choose "Selected notebook", then go to Notebook Options -> File Options, and set NotebookAutoSave to True. Then, your notebook will be saved after every evaluation. Whether or not this is a satisfactory solution, of course depends on the situation.
But my experience is that the most reliable way is to develop a CTRL+S reflex - this one never lets me down and is working quite well.
As for the versioning, it is much easier with packages, for which you can use WorkBench which has integrated support for CVS and support for SVN via Eclipse plugin. For notebooks, I refer you to this SO thread. You may also find this Mathgroup discussion of some interest.
EDIT
For M8, for auto-saving purposes you can probably also run
RunScheduledTask[NotebookSave[EvaluationNotebook[]],{300}]
But I can not test this code at the moment
EDIT2
I just came across this post in the Toolbag repository - which may also be an alternative for the autosave part of the question (but please see also the discussion in comments on the relative advantages of scheduled tasks vs. Dynamic)
Since you have MMA version 8 you could use:
saveTask = CreateScheduledTask[FrontEndExecute[FrontEndToken["Save"]], 5*60];
StartScheduledTask[saveTask];
to save every 5 minutes (change the term 5*60 for other timings).
To remove the auto-save task use:
RemoveScheduledTask[saveTask];
To save only a fixed, specific notebook, store its handle in nb (finding it using Notebooks, SelectedNotebook, InputNotebook or EvaluationNotebook) and use FrontEndToken[nb,"Save"] instead of just FrontEndToken["Save"]
I have a Mathematica package that provides auto-backup functionality. When enabled, the current notebook--call it "blah.nb"--will be backed up to "blah.nb~" after a configurable amount of time has elapsed. I use it constantly and it has saved me from losing work many, many times. It's better than autosaving since it doesn't touch the actual notebook file: if you screw something up or something gets corrupted you don't want to overwrite your main file. :)
It's on GitHub here.
I've got an autosave routine that saves a copy of every open, modified notebook every 5 minutes (or whatever interval you prefer. It leaves your manually-saved copy alone, and saves a "swap file" in a separate directory that can be easily recovered if need be. The code (to be copied to init.m) is given in this answer: https://mathematica.stackexchange.com/questions/18380/automatic-recovery-after-crash/65852#65852, and copied below:
Motivated by the same concerns, I wrote the following code and added it to my init.m file. There are two main entries you'll want to change to use this. The global variable $SwapDirectory is where the swap files are saved (by swap file, I mean it in the VIm sense; an "extra" copy of your notebook, separate from your manually saved copy that periodically saves any new work). The swap files are organized within the swap directory in a directory structure which "mirrors" their original file locations, and have ".swp" appended to their file names. The other variable you might want to change is the number of seconds between autosaves, indicated by the "300" (corresponding to 5 minutes) near the bottom of the code below. At the appropriate times, this code will (automatically in the background) save swap files for ALL open notebooks, unless they are unmodified from their manually-saved versions (this exception makes the code more efficient, and more importantly, prevents the storage of swap files for documentation notebooks, for example).
In its current form, the code does not filter for only the input cells, but hopefully you can use the other answers to make that modification yourself.
Some things to note:
1) the Mathematica Put command seems to have trouble writing to network drives, even when offline access is enabled. Therefore, it is probably best to choose a SwapDirectory that is on your local machine.
2) Within SwapDirectory, you should create a sub-directory called "Recovery". This is where the AutoSaveSwap routine will make an initial save of any notebooks for which there is NO existing manual save location.
3) Simply evaluate
RecoverSwap["filePath"]
where "filePath" is a string representing the filePath of the MANUALLY-SAVED copy of the file (i.e., not the file that was created by AutoSave). This will then pop up a window containing the most recent auto-saved version of the file. The manually saved version is NEVER overwritten, unless you explicitly choose to do so. Once the recovered version pops up, you can save it whereever you like, or discard it at your discretion.
4) You should probably add this code to the KERNEL version of init.m ($UserBaseDirectory/Kernel/init.m) rather than the frontend version... this way, if you quit and restart the kernel, the autosave feature will also restart. On the other hand, this means that you must evaluate at least one expression after each start or restart to begin auto-saving. Once this initial evaluation is done, you do NOT need to have evaluated a cell for it to be backed up (unlike the built-in autosave utility).
Hope this helps someone! Feel free to respond with any questions, suggestions, or requests for improvement you may have. And, if you find this post useful, upvotes would be most appeciated! Take care.
$SwapDirectory= "C:\\Users\\pacoj\\Swap Files\\";
SaveSwap[nb_NotebookObject]:=Module[
{fileName, swapFileName, nbout, nbdir, nbdirout, recoveryDir},
If[ ! SameQ[Quiet[NotebookFileName[nb]], $Failed],
(* if the notebook is already saved to the file system *)
fileName = Last[ FileNameSplit[ NotebookFileName[nb]] ];
swapFileName = fileName <> ".swp";
nbdir = Rest[FileNameSplit # NotebookDirectory[nb]];
nbdirout= FileNameJoin[ FileNameSplit[$SwapDirectory]~Join~nbdir]<>"\\";
If[!DirectoryQ[nbdirout], CreateDirectory[nbdirout]];
nbout = NotebookGet[nb];
Put[nbout, nbdirout <> swapFileName],
(* else, if the file has never been saved, save as untitled *)
recoveryDir= $SwapDirectory <> "Recovery\\\";
fileName= ("WindowTitle" /. NotebookInformation[nb])<>".nb";
NotebookSave[nb, recoveryDir <> fileName]
]
];
RecoverSwap::noswp= "swap file `1` not found in expected location";
RecoverSwap[nbfilename_String]:=Module[
{fileName, swapFileName, nbin, nbdir, nbdirout},
fileName= Last[ FileNameSplit[ nbfilename] ];
swapFileName= fileName <> ".swp";
nbdir= Most[ Rest[FileNameSplit # nbfilename] ];
nbdirout= FileNameJoin[ FileNameSplit[$SwapDirectory]~Join~nbdir]<>"\\\";
If[ FileNames[swapFileName, {nbdirout}] == {},
Message[RecoverSwap::noswp,nbdirout <> swapFileName]; Return[],
nbin= Get[nbdirout <> swapFileName]; NotebookPut[nbin]
]
];
AutoSaveSwaps= CreateScheduledTask[
SaveSwap /# Select[Notebooks[], "ModifiedInMemory" /. NotebookInformation[#]&],
300
]
StartScheduledTask[AutoSaveSwaps]

How to check whether the FrontEnd considers evaluation still running?

Is there a way to check programmatically whether the FrontEnd considers evaluation still running?
Or even better: is there a way to check whether the FrontEnd has some pending inputs to be sent to the kernel?
P.S. This question has arisen from previous question.
EDIT
When evaluating a Cell in the FrontEnd we usually create a queue of inputs for the kernel.
I need a function that will return True if the FrontEnd has sent to the kernel the last input of the queue of inputs from the EvaluationNotebook[]. Or in other words I need a function that returns True if this current input is the last input of the queue of inputs generated by the FrontEnd.
This should work. Of course, you have to run it in a different kernel than the one that is performing the evaluation you want to check for.
NotebookEvaluatingQ[nb_] := (
SelectionMove[nb, All, Notebook];
Or ## Map["Evaluating" /. # &, Developer`CellInformation[nb]]
)
Obviously, it's best to set things up before hand using a tool like Monitor. For example,
Monitor[
Do[Pause[6], {i, 10}],
i]
will allow you to observe the progress of the index variable i. If you haven't set things up before hand, you might be able to do something using the "Interrupt Evaluation" button under the Evaluation menu. For example, try the following:
Do[Pause[6], {i, 10}]
Now, wait six or more seconds and then select "Interrupt Evaluation". You can then examine the state of i to see how far along it is. You resume evaluation using Continue under "Debugger Controls".

How do I get data from a Simulink block into a MATLAB GUI?

I have a Simulink model that uses an embedded MATLAB function for a block, and I haven't been able to figure out how to move data between the embedded MATLAB block and a GUI in real-time (i.e. while the model is running). I tried to implement a "to workspace" block in my model but I don't know how to correctly use it.
Does anyone know how to move data from a Simulink block into a GUI in real-time?
Non-real-time solution:
If you want to set parameters in a GUI, simulate a model with those parameters, and then display the simulation output in the GUI, there is a good tutorial on blinkdagger.com. One solution they describe is using the SIMSET function to define which workspace the Simulink model interacts with. You should be able to supersede the base workspace so that data is instead sent to and from the workspace of the GUI functions that are calling the Simulink model.
Real-time solution
As suggested by MikeT, you can use a RuntimeObject. You first have to use the get_param function to get the RuntimeObject from the block:
rto = get_param(obj,'RuntimeObject');
Where obj is either a block pathname or a block-object handle. You can get the pathname of the most recently selected block using the GCB function (in which case you can replace obj with gcb). You can then get the block's output with the following:
blockData = rto.OutputPort(1).Data
One additional caveat from the documentation:
To ensure the Data field contains the
correct block output, turn off the
Signal storage reuse option (see
Signal storage reuse) on the
Optimization pane in the Configuration Parameters dialog box.
You would likely end up with a loop or a timer routine running in your GUI that would continuously get the output data from the RuntimeObject for as long as the simulation is running. The documentation also states:
A run-time object exists only while
the model containing the block is
running or paused. If the model is
stopped, get_param returns an empty
handle. When you stop or pause a
model, all existing handles for
run-time objects become empty.
Your loop or timer routine would thus have to keep checking first that the RuntimeObject exists, and either stop (if it doesn't) or get the data from it (if it does). I'm unsure of exactly how to check for existence of a RuntimeObject, but I believe you would either check if the object is empty or if the BlockHandle property of the object is empty:
isempty(rto) % Check if the RuntimeObject is empty
%OR
isempty(rto.BlockHandle) % Check if the BlockHandle property is empty
From your responses, I'm guessing you want to see the results while the simulation is running, is that correct? The blinkdagger.com tutorial lets you view the results of a simulation after it is done, but not while it is running. Do you basically want to embed something like a scope block into your GUI?
There's a few ways to do this, the best is probably using the EML block's runtime object. If you use this, you should be able to look at the output of the EML block while it is running.

Resources