Unable to allocate 5.15 MiB for an array with shape (20, 150, 150, 3) and data type float32 - image

I am writing code for binary classification of grayscale images. I converted my image data into train and test generators. After evaluation, I'm trying to print the classification report, but I get an error on the third line of this snippet:
pred = model.predict(test_generator)
pred = np.argmax(pred, axis=1)
actual_label = np.argmax(test_generator, axis=1)
print(classification_report(actual_label, pred))
I expected to get a classification report, but the third line (the one that builds actual_label) raises:
MemoryError: Unable to allocate 5.15 MiB for an array with shape (20, 150, 150, 3) and data type float32
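The third line passes the generator object itself to np.argmax, which makes NumPy try to materialise the image batches in memory. A minimal sketch of a workaround, assuming test_generator comes from Keras' flow_from_directory with shuffle=False and the model ends in a single sigmoid unit (neither detail is given in the post):
from sklearn.metrics import classification_report

# predict() consumes the generator batch by batch, so this part is fine.
pred_probs = model.predict(test_generator)

# For a binary model with one sigmoid output, threshold instead of argmax
# (argmax over an (N, 1) array always returns 0).
pred = (pred_probs > 0.5).astype(int).ravel()

# Read the ground-truth labels from the generator instead of converting the
# generator itself to an array; this is what avoids the MemoryError.
# Assumes the generator was created with shuffle=False.
actual_label = test_generator.classes

print(classification_report(actual_label, pred))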

Related

Shap explainer gives an error with ECFP4 fingerprints

I am training a Random Forest with molecular fingerprints and adding a SHAP explainer with the shap package function
explainer = shap.Explainer(forest)
and it gives me the error:
"ExplainerError: Additivity check failed in TreeExplainer! Please ensure the data matrix you passed to the explainer is the same shape that the model was trained on. If your data shape is correct then please report this on GitHub. Consider retrying with the feature_perturbation='interventional' option. This check failed because for one of the samples the sum of the SHAP values was 28208132836061024.000000, while the model output was 0.846000. If this difference is acceptable you can set check_additivity=False to disable this check."
Now, the error seems very straightforward, but here is what I can't make sense of:
I did the exact same thing with MACCS fingerprints, and it worked.
I looked into the shape of the data: it is (2334, 2048) for train and (193, 2048) for validation; with MACCS it was analogous.
the validation set consists only of 1 and 0 as it should
the fingerprints are all the same length, no errors there
I did some external validation with the validation set and there were no problems there.
from sklearn import metrics

roc = metrics.roc_auc_score(labels_val, predicted)
tn, fp, fn, tp = metrics.confusion_matrix(labels_val, predicted).ravel()
I checked that the forest I trained with was the forest I was using.
And yes, I even restarted my computer.
If someone has any idea what could cause this problem, please let me know!
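For reference, the two workarounds that the error message itself suggests would look roughly like this; X_train and X_val are hypothetical names for the (2334, 2048) and (193, 2048) fingerprint matrices:
import shap

# Option 1: interventional feature perturbation, which needs background data
# (here, hypothetically, the training fingerprints).
explainer = shap.TreeExplainer(forest, data=X_train, feature_perturbation="interventional")
shap_values = explainer.shap_values(X_val)

# Option 2: keep the default behaviour but skip the additivity check.
explainer = shap.TreeExplainer(forest)
shap_values = explainer.shap_values(X_val, check_additivity=False)
Neither of these explains why MACCS keys work and ECFP4 does not; they only let the explainer run.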

Passing a geopandas data frame to `folium.Choropleth` (map renders gray)

I'm trying to make a Choropleth map from a GeoPandas data frame, rather than from a geojson file containing only geometry plus a pandas dataframe containing statistical data. Specifically, I would like to adapt this example, merging the shapefiles for US states with another dataset containing their respective unemployment numbers into a single GeoPandas data frame (merged), and then rendering it with folium.Choropleth.
The folium documentation says that the geo_data parameter can be a geopandas object. When I pass the geopandas_data_frame.geometry to it, the map renders. However, when I pass merged["Unemployment"] to the data parameter, each state renders in blue, despite the fact that the numbers vary.
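For context, merged is built roughly like this (a sketch; the file names and the join column are placeholders, not the ones from the original example):
import geopandas as gpd
import pandas as pd

# Join the state geometries with the unemployment figures on a shared state id.
states = gpd.read_file("us_states.shp")            # assumed columns: id, name, geometry
unemployment = pd.read_csv("us_unemployment.csv")  # assumed columns: id, Unemployment
merged = states.merge(unemployment, on="id")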
m = folium.Map(location=[48, -102], zoom_start=3)
folium.Choropleth(
    geo_data=merged,
    name='choropleth',
    data=merged["Unemployment"],
    fill_color='YlGn',
    fill_opacity=0.7,
    line_opacity=1,
    legend_name='Unemployment Rate (%)'
).add_to(m)
folium.LayerControl().add_to(m)
m
I have tried changing the data type of merged["Unemployment"] from float to int to str, as per this question.
Folium uses GeoJSON objects to plot the geometries (the geo_data parameter). You can pass a GeoPandas object, but you'll have to convert it in the function call.
folium.Choropleth(
    geo_data=merged.to_json(),
    name='choropleth',
    data=merged,
    columns=["id", "Unemployment"],
    fill_color='YlGn',
    fill_opacity=0.7,
    line_opacity=1,
    key_on="feature.properties.id",
    legend_name='Unemployment Rate (%)'
).add_to(m)
The key_on parameter is the tricky one: it has to match the structure of the merged.to_json() output, so just print it and check.
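A quick way to do that check (a sketch; it assumes the join column really is called "id" and ends up in the feature properties):
import json

# Look at the first GeoJSON feature and confirm the property that key_on points to.
gj = json.loads(merged.to_json())
print(gj["features"][0]["properties"].keys())  # the column named in columns[0] should be listed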

BagOfFeatures for Image Category Classification in Matlab

Doing this example in Matlab Image Category Classification
I have found an error trying to get the vocabulary of SURF features with this command
bag = bagOfFeatures(trainingSet);
The error is the following
Error using bagOfFeatures/parseInputs (line 1023)
The value of 'imgSets' is invalid. Expected imgSets to be one of these types:
imageSet
Instead its type was matlab.io.datastore.ImageDatastore.
I am using an ImageDatastore input instead of imgSets, but I am following a MathWorks example. Can anyone explain why this is happening and how I can convert trainingSet into an imageSet?
You have to convert the ImageDatastore object to an imageSet object. This can simply be done by using the following line instead:
bagOfFeatures(imageSet(trainingSet.Files));

MATLAB ConnectedComponentLabeler does not work in for loop

I am trying to get a set of binary images' eccentricity and solidity values using the regionprops function. I obtain the label matrix using the vision.ConnectedComponentLabeler function.
This is the code I have so far:
files = getFiles('images');
ecc = zeros(length(files)); % eccentricity values
sol = zeros(length(files)); % solidity values
ccl = vision.ConnectedComponentLabeler;
for i = 1:length(files)
    I = imread(files{i});
    [L, NUM] = step(ccl, I);
    for j = 1:NUM
        L = changem(L==j, 1, j); %*
    end
    stats = regionprops(L, 'all');
    ecc(i) = stats.Eccentricity;
    sol(i) = stats.Solidity;
end
However, when I run this, I get an error pointing at the line marked with *:
Error using ConnectedComponentLabeler/step
Variable-size input signals are not supported when the OutputDataType property is set to 'Automatic'.
I do not understand what MATLAB is talking about, and I have no idea how to get rid of it.
Edit
I have returned back to bwlabel function and have no problems now.
The error is a bit hard to understand, but I can explain what it means. When you use the CVST Connected Component Labeller, it assumes that all of the images you're going to use with the function are the same size. The error happens because it looks like they aren't... hence the mention of "Variable-size input signals".
The 'Automatic' property means that the data type of the output label matrix is determined automatically, so you don't have to worry about whether the output is uint8, uint16, etc. If you want to remove this error, you need to set the OutputDataType property to a fixed type, i.e. choose the output data type of the label matrices produced by this labeller yourself. The available types are uint8, uint16 and uint32. Therefore, assuming uint8 is enough for the number of labels in your images, do this before you run your loop:
ccl = vision.ConnectedComponentLabeler;
ccl.OutputDataType = 'uint8';
Now run your code, and it should work. Bear in mind that the input needs to be logical for this to have any meaningful output.
Minor comment
Why are you using the CVST Connected Component Labeller when the Image Processing Toolbox bwlabel function works exactly the same way? As you are using regionprops, you have access to the Image Processing Toolbox, so this should be available to you. It's much simpler to use and requires no setup: http://www.mathworks.com/help/images/ref/bwlabel.html

Cannot use scatterplot in Octave

I was learning how to do machine learning on mldata.org and was watching a YouTube video on how to use the data (https://www.youtube.com/watch?v=zY0UhXPy8fM) (2:50). Using the same data, I tried to follow exactly what he did and create a scatter plot of the dataset. However, when he used the scatterplot command it worked perfectly on his side, but I cannot get it to work on my side.
Can anyone explain what's wrong and what I should do?
octave:2> load banana_data.octave
octave:3> pkg load communications
octave:4> whos
Variables in the current scope:
Attr Name        Size          Bytes  Class
==== ====        ====          =====  =====
     data        2x5300        84800  double
     label       1x5300        42400  double
Total is 15900 elements using 127200 bytes
octave:5> scatterplot(data, label)
error: scatterplot: real X must be a vector or a 2-column matrix
error: called from:
error: /home/anthony/octave/communications-1.2.0/scatterplot.m at line 69, column 7
The error message says it all. Your data is a 2-row matrix, and not a 2-column matrix as it should be. Just transpose it with .'.
scatterplot(data.')
I dropped the label argument since it is not compatible with the communications toolbox, either in MATLAB or in Octave.
Update:
According to news('communications'),
The plotting functions 'eyediagram' and 'scatterplot' have improved Matlab compatibility
This may be why the behaviour is different. Be ready to find other glitches, as the Octave 3.2.4 used in this course is about 5 years old.
To use the label, you should instead use the standard Octave scatter function.
Colors could be changed by choosing another colormap.
colormap(cool(256))
scatter(data(1,:), data(2,:), 6, label, "filled")
