Does automatic hyperparameter tuning in MATLAB stop on its own?

I'm tuning hyperparameters automatically using fitrsvm, and my last 6 iterations have said 'accept' and not 'best'. I am wondering if I am supposed to stop the execution manually or if MATLAB will do it on its own.
Mdl = fitrsvm(X_train, Y_train, 'KernelFunction', 'rbf', ...
    'OptimizeHyperparameters', 'auto', ...
    'HyperparameterOptimizationOptions', struct('AcquisitionFunctionName', ...
    'expected-improvement-plus', 'ShowPlots', true));

Related

Spark Performance tuning / optimization

I have a pretty standard use case and need suggestions on how to improve a Spark (2.4) job:
Dataframe1 (df1) = 10M records and
Dataframe2 (df2) = 50M records
then: join df1 & df2
use windowing functions etc.
Result Dataframe (df3) = 2B records
further processing, i.e. filter and generate 5 different datasets from df3 (this is where the issue starts)
The issue I face: the initial steps work fine in the notebook, but as soon as I reach df3, further processing gets really slow and fails or gets killed.
What would be the best way to optimize this processing? So far I have tried:
an r4.xlarge cluster, and also an r5.16xlarge (500 GB memory) cluster (should I try others such as M4 or C4 clusters, or what would you suggest for this kind of processing?)
Spark conf used:
spark.conf.set("spark.executor.memory", "64g")
spark.conf.set("spark.driver.memory", "64g")
spark.conf.set("spark.executor.memoryOverHead", "24g")
spark.conf.set("spark.driver.memoryOverHead", "24g")
spark.conf.set("spark.executor.cores", "8")
spark.conf.set("spark.paralellism", 100)
spark.conf.set("spark.dynamicAllocation.enabled", "true")
spark.conf.set("spark.sql.broadcastTimeout", "7200")
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
I am caching df1, df2, and df3.
Once memory is used up, I see disk spill, so I tried tuning GC using:
spark.conf.set("spark.driver.extraJavaOptions", "XX:+UseG1GC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
spark.conf.set("spark.executor.extraJavaOptions", "XX:+UseG1GC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
The above steps didn't help much. Please suggest what config, memory, and cluster settings might help, or what other optimization techniques can be used here.
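What often matters more than instance type for a pipeline like this is avoiding a cache of the 2B-row df3 and controlling the shuffles around the join and the window. The sketch below is only illustrative, not a tuned solution; the column names (key, ts), paths, partition counts, and filter conditions are placeholders:

from pyspark.sql import SparkSession, Window, functions as F

spark = (SparkSession.builder
         .appName("df3-pipeline")
         # enough shuffle partitions for a ~2B-row shuffle (placeholder value)
         .config("spark.sql.shuffle.partitions", "2000")
         .getOrCreate())

df1 = spark.read.parquet("s3://bucket/df1")  # placeholder paths
df2 = spark.read.parquet("s3://bucket/df2")

# partition both sides on the join key up front; the window below
# partitions on the same key
joined = df1.repartition("key").join(df2.repartition("key"), "key")

w = Window.partitionBy("key").orderBy("ts")
df3 = joined.withColumn("rn", F.row_number().over(w))

# materialize df3 once instead of caching 2B rows in executor memory,
# then derive the five filtered outputs from the written copy
df3.write.mode("overwrite").parquet("s3://bucket/df3")
df3_on_disk = spark.read.parquet("s3://bucket/df3")

filters = [F.col("rn") == 1, F.col("rn") <= 5]  # placeholder filter conditions
for i, cond in enumerate(filters):
    df3_on_disk.filter(cond).write.mode("overwrite").parquet("s3://bucket/out_" + str(i))

Writing the intermediate result to storage also lets the downstream jobs be rerun independently if one of them fails.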

Parallel hyperparameter optimization with pytorch on a multi-gpu machine

I have access to a multi-GPU machine and I am running a grid search loop for parameter optimisation. I would like to know if I can distribute several iterations of the loop across multiple GPUs at the same time, and if so, how do I do it (what mechanism? threading? how do I gather the results if the loop executes asynchronously? etc.)
Thank you.
I'd suggest using Optuna to handle the hyperparameter search; it should in general perform better than grid search (though you can still use it with grid sampling, as in the sketch below). I have modified Optuna's distributed example to use one GPU per process.
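As an aside, if you really want an exhaustive grid, a rough sketch (the hidden_size values here are just an illustrative assumption) is to create the study with a GridSampler:

import optuna

# illustrative grid for the single hyperparameter used in the script below
search_space = {'hidden_size': [8, 16, 32, 64]}

study = optuna.create_study(
    study_name='distributed-example',
    storage='sqlite:///example.db',
    sampler=optuna.samplers.GridSampler(search_space),
    load_if_exists=True,
)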
Create a training script like:
# optimize.py
import sys

import optuna

import your_model  # placeholder for your own training code

DEVICE = 'cuda:' + sys.argv[1]  # GPU index passed on the command line

def objective(trial):
    hidden_size = trial.suggest_int('hidden_size', 8, 64, log=True)
    # define other hyperparameters here
    return your_model.score(hidden_size=hidden_size, device=DEVICE)

if __name__ == '__main__':
    study = optuna.load_study(study_name='distributed-example', storage='sqlite:///example.db')
    study.optimize(objective, n_trials=100)
In terminal:
pip install optuna
optuna create-study --study-name "distributed-example" --storage "sqlite:///example.db"
Then for every GPU device:
python optimize.py 0
python optimize.py 1
...
Finally, best results can be easily discovered:
import optuna
study = optuna.create_study(study_name='distributed-example', storage='sqlite:///example.db', load_if_exists=True)
print(study.best_params)
print(study.best_value)
Or even visualized.
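For example, a minimal sketch using the built-in plotting helpers (these require plotly to be installed):

import optuna

study = optuna.load_study(study_name='distributed-example', storage='sqlite:///example.db')

# optimization history across all trials collected in the shared storage
fig = optuna.visualization.plot_optimization_history(study)
fig.show()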

Unable to release GPU memory after training a CNN model using keras

In a Google Colab notebook with Keras (2.2.4) and TensorFlow (1.13.1) as the backend, I am trying to tune a CNN. I use a simple table of hyperparameters and run my tests in a set of loops.
My problem is that I can't free the GPU memory after each iteration, and Keras doesn't seem to be able to release GPU memory automatically. So every time I get a Resource Exhausted: Out Of Memory (OOM) error.
I did some digging and ran into this function, which assembles different solutions that have been suggested for this problem (it didn't work for me, though):
for _ in hyper_parameters:
    Run_model(_)
    reset_keras()
my set of hyperparameters being:
IMG_SIZE = 800,1000,3
BATCH_SIZEs = [4,8,16]
EPOCHSs = [5,10,50]
LRs = [0.008,0.01]
MOMENTUMs = [0.04,0.09]
DECAYs = [0.1]
VAL_SPLITs = [0.1]
and the function I used to free the GPU memory:
import gc
import tensorflow as tf
from keras.backend.tensorflow_backend import get_session, set_session, clear_session

def reset_keras():
    sess = get_session()
    clear_session()
    sess.close()
    sess = get_session()
    try:
        del model  # this is from global space - change this as you need
    except:
        pass
    print(gc.collect())  # if it's done something you should see a number being outputted
    # use the same config as you used to create the session
    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 1
    config.gpu_options.visible_device_list = "0"
    set_session(tf.Session(config=config))
The only thing that I didn't fully grasp is the "use the same config as you used to create the session" comment, since with Keras we don't explicitly choose a particular configuration.
I get by for one iteration, sometimes two, but I can't go beyond that. I have already tried changing the batch size, and for the moment I cannot afford a machine with higher performance.
I am using a custom image generator that inherits from keras.utils.Sequence.
I monitor the state of GPU memory using this piece of code:
import os

import psutil
import GPUtil as GPU

def printmm():
    GPUs = GPU.getGPUs()
    gpu = GPUs[0]
    process = psutil.Process(os.getpid())
    return(" Util {0:.2f}% ".format(gpu.memoryUtil*100))

print(printmm())
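A workaround that is not in the snippets above, but that reliably gives the memory back, is to run each configuration in its own process: all GPU allocations are released when the child process exits, so nothing leaks between iterations. This is only a sketch; Run_model and the hyper_parameters list here stand in for your own training function and grid:

from multiprocessing import Process

# stand-ins for your own grid and training entry point
hyper_parameters = [
    {'batch_size': 4, 'epochs': 5, 'lr': 0.008},
    {'batch_size': 8, 'epochs': 5, 'lr': 0.01},
]

def run_one(params):
    # import Keras/TensorFlow inside the child so the GPU is only
    # initialized in the process that will be discarded afterwards
    from my_training_code import Run_model  # placeholder import
    Run_model(params)

if __name__ == '__main__':
    for params in hyper_parameters:
        p = Process(target=run_one, args=(params,))
        p.start()
        p.join()  # GPU memory is freed when the child exits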

Display the current Simulink simulation time

Folks!
I am trying to display the current Simulink simulation time. I should note that, in my case, the system is not viewable, since I use load_system, and it would be very useful to know how the simulation is progressing.
For that, I have read that I should use the function 'ssGetT'. To implement it, I am using an S-Function Builder block, and I succeeded; that is, I was able to get the current simulation time.
However, I am stuck at this point, because I do not know how to display it, whether in a progress bar, a message box, or any other way. Importantly, it has to be displayed from the C environment in the S-Function Builder.
If there is any other way to do it, please tell me. =)
If anybody could help me, I would really appreciate it.
A couple of things to note:
There is no need to use load_system prior to using sim.
As with any MATLAB command, sim blocks further execution of the m-code that follows it (or of the command line) until it has finished executing (which in this case means that the simulation has stopped).
But any m-code within the model will definitely get executed during model execution.
For instance, create a model where you feed the Clock block into a MATLAB Function block. Within the MATLAB Function block have the following code
function fcn(t)
%#codegen
coder.extrinsic('fprintf');

persistent firstTime
if isempty(firstTime)
    firstTime = false;
    fprintf('Starting Now\n');
end

fprintf('time = %.4f\n',t);
This will print the simulation time, at every time step, to the MATLAB Command Window, while the simulation is running (irrespective of how the model is started).
Update:
To display progress status in the Command Window, I took Phil's suggestion.
I implemented this system in Simulink, in which the fcn inputs are the simulation time from a Digital Clock block and the final simulation time.
I define the Sample time of the Digital Clock block as (final simulation time)/steps, where steps is the number of times you want the progress to be updated. In my case, I update it every 5% until 100%, so steps is 20.
The fcn block is:
function fcn(t,tsim)
coder.extrinsic('fprintf');

persistent firstTime
if isempty(firstTime)
    firstTime = false;
    fprintf('\nSimulating...\n\n');
end

prog = 100*t/tsim;
fprintf(' %1.0f%%',prog);

MATLAB: accessing a loaded MAT file is very slow

I'm currently working on a project involving saving/loading quite big MAT files (around 150 MB), and I realized that it was much slower to access a loaded cell array than the equivalent version created inside a script or a function.
I created this example to simulate my code and show the difference:
clear; clc;

disp('Test for computing with loading');

if exist('data.mat', 'file')
    delete('data.mat');
end

n_tests = 10000;
data = {};
for i = 1:n_tests
    data{end+1} = rand(1, 4096);
end

% disp('Saving data');
% save('data.mat', 'data');
% clear('data');
%
% disp('Loading data');
% load('data.mat', '-mat');

for i = 1:n_tests
    tic;
    for j = 1:n_tests
        d = sum((data{i} - data{j}) .^ 2);
    end
    time = toc;
    disp(['#' num2str(i) ' computed in ' num2str(time) ' s']);
end
In this code, no MAT file is saved or loaded. The average time for one iteration over i is 0.75 s. When I uncomment the lines to save/load the file, the computation for one iteration over i takes about 6.2 s (the saving/loading time is not taken into consideration). That is about 8x slower!
I'm using MATLAB 7.12.0 (R2011a) 64 bits with Windows 7 64 bits, and the MAT files are saved with the version v7.3.
Can it be related to the compression of the MAT file? Or to caching of variables?
Is there any way to prevent or avoid this?
I also know this problem. I think it is related to MATLAB's inefficient memory management, and as I remember it does not handle swapping well.
A 150 MB file can easily hold a lot of data, maybe more than can be quickly allocated.
I just made a quick calculation for your example using the information from MathWorks.
In your case, total_size = n_tests*121 + n_tests*(1*4096*8) = 10000*121 + 10000*4096*8 = 328,890,000 bytes, which is about 313 MB.
First, I would suggest saving them in format v7 (instead of v7.3); I have noticed very poor performance when reading this newer format. That alone could be the reason for your slowdown.
Personally I solved this in two ways:
Split the data into smaller sets and then use functions that load the data when needed, or create it on the fly (this can be done elegantly with classes).
Move the data into a database. SQLite and MySQL are great. Both work efficiently with MUCH larger datasets (in the TBs instead of GBs), and SQL is quite efficient for quickly getting subsets to manipulate.
I tested this code on Windows 64-bit with MATLAB 64-bit R2014b.
Without saving and loading, the computation takes around 0.22 s.
Saving the data file with '-v7' and then loading it, the computation takes around 0.2 s.
Saving the data file with '-v7.3' and then loading it, the computation takes around 4.1 s.
So it is related to the compression of the MAT file.
