Using AddExpression / MathExpression in Weka - filter

I am working on a very basic WEKA assignment, and I'm trying to use WEKA to preprocess data from the GUI (most current version). I am trying to do very basic if statements and mathematical statements in the expression box when double clicking on MathExpression and I haven't had any success. For example I want to do
if (a5 == 2 || a5 == 0) then y = 1; else y = 0
Many different variations of this haven't worked for me and I'm also unclear on how to refer to "y" or if it needs a reference within the line.
Another example is -abs(log(a7)-3), which I wasn't able to work out either. Any ideas about how to make these statements work?

From the Javadoc of MathExpression:
The 'A' letter refers to the value of the attribute being processed. Other attribute values (numeric only) can be accessed through the variables A1, A2, A3, ...
MathExpression is applied to every (numeric) attribute of your dataset. For example, if I load the iris dataset and apply the following filter:
weka.filters.unsupervised.attribute.MathExpression -E log(A)
the sepallength values change as follows:
          Before filter   After filter
Minimum   4.3             1.459
Maximum   7.9             2.067
Mean      5.843           1.755
StdDev    0.828           0.141
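Also, if you look at the Javadoc, there is no if/else statement but there is an ifelse function. Therefore you should write something like
ifelse ( (A == 2 || A == 0), 1,0 )
If the parser rejects == or ||, try the single-character forms instead, e.g. ifelse((A = 2) | (A = 0), 1, 0); as far as I can tell, Weka's expression grammar lists =, & and | as its comparison and boolean operators.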
Again, this filter applies to all attributes. If you want to change only one attribute, based on the values of other attributes, you need to use the "ignore range" option and refer to the other attributes as A1, A2, and so on.
If you need to add a new attribute, use AddExpression. Its Javadoc describes it as:
An instance filter that creates a new attribute by applying a mathematical expression to existing attributes.
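To make the two expressions from the question concrete, here is a minimal Python sketch of what they compute for a single row. It is not Weka code: the function names are made up for illustration, and it assumes Weka's log is the natural logarithm, which is consistent with the iris numbers above (ln 4.3 is about 1.459).

```python
import math


def flag_y(a5: float) -> int:
    """Mirror of ifelse((a5 == 2 || a5 == 0), 1, 0): 1 when a5 is 2 or 0, else 0."""
    return 1 if a5 in (2, 0) else 0


def neg_abs_log(a7: float) -> float:
    """Mirror of -abs(log(a7) - 3), with log assumed to be the natural logarithm."""
    return -abs(math.log(a7) - 3)


# Hypothetical rows; only the a5 and a7 attributes matter here.
rows = [{"a5": 2, "a7": 1.0}, {"a5": 1, "a7": 20.0}]
for row in rows:
    print(flag_y(row["a5"]), neg_abs_log(row["a7"]))
```

In Weka terms, the first function corresponds to the value a new attribute y would get from AddExpression, and the second to the value MathExpression would write back into a7.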

Related

Matlab : image region analyzer. Alternative for 'bwpropfilt'?

I'm running basic edge detection to detect windows region based on this http://www.mathworks.com/videos/edge-detection-with-matlab-119353.html
The edge works successfully :
final_edge = edge(gray_I,'sobel');
BW_out = bwareaopen(imfill(final_edge,'holes'),20);
figure;
imshow(BW_out);
However, when I get to the following code, which filters the image based on region properties, my MATLAB R2013a doesn't seem to recognise the bwpropfilt function.
% imageRegionAnalyzer(BW);
% Filter image based on image properties
BW_out = bwpropfilt(BW_out,'Area', [400, 467]);
BW_out = bwpropfilt(BW_out,'Solidity',[0.5, 1]);
It says:
Undefined function 'bwpropfilt' for input arguments of type 'char'.
What can I use as an alternative to bwpropfilt?
bwpropfilt is not available in R2013a (it was added to the Image Processing Toolbox in a later release, R2014b if I recall correctly), but it is easy to reproduce. It looks at the corresponding property returned by regionprops and keeps the objects whose value falls inside the given range, discarding the rest. You can rewrite it by calling regionprops yourself, then building a logical array that indexes into the resulting structure so that only the regions whose property (the second input of bwpropfilt) lies within the range (the third input) are retained. To reconstruct the image after filtering, take the column-major linear indices in the PixelIdxList field of the retained regions, stack them into a single vector, and set those locations to true in a new output image.
Specifically, you can use the following code to reproduce the last two lines of code you have shown:
% Run regionprops and request only the properties that are needed
s = regionprops(BW_out, 'Area', 'Solidity', 'PixelIdxList');
%%% For the first line of code
% Bounds are inclusive, to match bwpropfilt's [low high] range
values = [s.Area];
s = s(values >= 400 & values <= 467);
%%% For the second line of code
values = [s.Solidity];
s = s(values >= 0.5 & values <= 1);
% Stack column major indices
ind = vertcat(s.PixelIdxList);
% Create output image
final_out = false(size(BW_out));
final_out(ind) = true;
final_out contains the filtered image, retaining only the regions whose property values fall within the specified ranges.
Caution
The above logic only works for regionprops properties that return a single scalar value per region. If you examine the properties supported by bwpropfilt, you will see that they are a subset of the full list supported by regionprops. This makes sense: some regionprops properties return a vector or a matrix, and filtering by a range becomes ambiguous when several values characterise a single region.
Minor Note
Being curious, I opened up bwpropfilt to see how it is implemented, as I currently have MATLAB R2016a. Apart from some error checking, the logic above is essentially how bwpropfilt is implemented, so the code I wrote is in line with the function.

SPSS Ranking Data In One Column

I'm still new to SPSS. I have data for the following variables:
Cereals Vegetables Fruit Meat Dairy Fat Sugar Pulses
I have also computed a total score with this formula:
Total FCS = (Cereals*2)+(Vegetables)+(Fruits)+(Meat*4)+(Dairy*4)+(Sugar*0.5)+(Pulses*3)
Now I want to rank the Total FCS values in one column so that I can make a graph from it, as follows:
Rank as :
<28 Poor
>28.5 - <42 Borderline
>42.5 Acceptable
What should I do?
I would use a DO IF statement to assign the ranks. Example below.
DO IF FCS < 28.
COMPUTE RankFCS = 1.
ELSE IF FCS <= 42.5.
COMPUTE RankFCS = 2.
ELSE.
COMPUTE RankFCS = 3.
END IF.
VALUE LABELS RankFCS
1 'Poor'
2 'Borderline'
3 'Acceptable'.
SPSS also has a RECODE command that you can use to create this rank variable. RECODE has two options:
1) Recode into same variables
2) Recode into different variables
I am using the second option because you need to create a new rank variable.
STRING RankFCS (A8).
RECODE FCS (Lowest thru 28='Poor') (28.5 thru 42='Borderline')
(42.5 thru Highest='Acceptable')
INTO RankFCS.
EXECUTE.
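To see how the two approaches behave at the band edges, here is a minimal Python sketch (not SPSS syntax). The function names are hypothetical, and the cut-offs in the cascade (28 and 42 counted in the lower band) are an assumption taken from the RECODE ranges; the question itself leaves values such as 28 or 42.2 unassigned. The range-list version mirrors the RECODE above and returns an empty string when no range matches, which I assume corresponds to the new string variable staying blank.

```python
def band_cascade(fcs: float, poor_max: float = 28.0, borderline_max: float = 42.0) -> str:
    """If / else-if cascade: every value lands in exactly one band."""
    if fcs <= poor_max:
        return "Poor"
    if fcs <= borderline_max:
        return "Borderline"
    return "Acceptable"


def band_ranges(fcs: float) -> str:
    """Explicit range list, as in the RECODE; values between ranges match nothing."""
    ranges = [
        (float("-inf"), 28.0, "Poor"),
        (28.5, 42.0, "Borderline"),
        (42.5, float("inf"), "Acceptable"),
    ]
    for low, high, label in ranges:
        if low <= fcs <= high:  # both ends inclusive, like "a thru b"
            return label
    return ""


for score in (20.0, 28.0, 28.2, 42.0, 42.5):
    print(score, band_cascade(score), repr(band_ranges(score)))
```

If the FCS only ever takes values in steps of 0.5, the gaps never come into play; otherwise the cascade is the safer form.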

What makes Crystal ignore record selection formula?

Crystal 2008. I have a record selection formula ending with
and
( ( "Zero" in {?Credit_Debit} and {V_ARHB_BKT_AGING_DETAIL.AMOUNT} = 0)
or ( "Credit" in {?Credit_Debit} and {V_ARHB_BKT_AGING_DETAIL.AMOUNT} < 0)
or ( "Debit" in {?Credit_Debit} and {V_ARHB_BKT_AGING_DETAIL.AMOUNT} > 0) )
but no matter what combination of values is selected for Credit_Debit the result set is the same.
Also without success, I tried joining the parameter array into a single string and using lines like
or ( {#Cred_Deb_Choices} like "*Credit*" and {V_ARHB_BKT_AGING_DETAIL.AMOUNT} < 0)
Using the first method works in the same formula when the parameter values are integers, as:
and ({?Location ID} = 0 or {V_ARHB_BKT_AGING_DETAIL.LOC_ID} in {?Location ID})
I examined the generated SQL, and saw that the part at the beginning that had no effect was not shown.
I changed a part that tested for a hard-coded value to instead test for a parameter value, and looked at the SQL again. No change.
When a condition compares a field with a value that doesn't match the field's data type, that condition is not reflected in the record selection.
For an integer field, compare against integers in the record selection; for a text field, compare against text.
E.g:
ID=0 and Name='XXX' works
ID='Zero' and Name='XXX' doesn't
This should solve your issue.

DataColumn Expression Divide By Zero

I'm using basic .net DataColumns and the associated Expression property.
I have an application which lets users define which columns to select from a database table. They can also add other columns which perform expressions on the data columns resulting in a custom grid of information.
The problem I have is when they have a calculation column along the lines of "(C2/C3)*100" where C2 and C3 are data columns and the value for C3 is zero. The old "divide by zero" issue.
The simple answer would be to convert the expression to "IIF(C3 = 0, 0, (C2/C3)*100)", however we don't expect the user to know to do that and at compile time I don't know what columns are defined. So I would have to programmatically determine which columns are being used in a division in order to construct the IIF clause. That could get quite tricky.
Is there another way to not throw an error and replace the result with 0 if a "Divide By Zero" error occurs?
OK, I found a way. The key is to use Double rather than Decimal for the column type, e.g. in the example above C3 should be a Double. Dividing by zero then produces Infinity (or NaN) instead of throwing, and that result can be tested for by wrapping the expression as a whole.
E.g.
IIF(CONVERT(([C4] / [C3] )*100, 'System.String') = 'NaN' OR CONVERT(([C4] / [C3] )*100, 'System.String') = 'Infinity' OR CONVERT(([C4] / [C3] )*100, 'System.String') = '-Infinity', 0, ([C4] / [C3] )*100)
Decimal, it seems, has no Infinity value, so this approach does not work with Decimal columns.
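Because the columns are not known at compile time, the guard can be wrapped around whatever expression the user typed instead of working out which column is the divisor. The sketch below shows that string construction in Python for brevity (the application itself is .NET); guard_non_finite is a hypothetical helper, and it assumes Double columns and the 'NaN' / 'Infinity' / '-Infinity' string forms used in the expression above.

```python
def guard_non_finite(expression: str, fallback: str = "0") -> str:
    """Wrap a DataColumn expression so NaN/Infinity results become `fallback`."""
    # The whole user expression is converted to text and compared with the
    # string forms of the non-finite values (assumed to be these three).
    as_text = f"CONVERT({expression}, 'System.String')"
    checks = " OR ".join(
        f"{as_text} = '{token}'" for token in ("NaN", "Infinity", "-Infinity")
    )
    return f"IIF({checks}, {fallback}, {expression})"


# Example with the calculation from the question.
print(guard_non_finite("([C2] / [C3]) * 100"))
```

The returned string would then be assigned to the calculated column's Expression property in place of the raw user input.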

Case Statement in Simulink

I can't figure out how to proceed.
I am trying to build a model with:
4 inputs (Boolean)
1 output (signed 8-bit)
It should do the following:
Based on which input is 1, it outputs the corresponding data rate.
If I had to write it in MATLAB, I would write something like this:
if (portA == 1)
    PSDU_Data_Rate = 1;
elseif (portB == 1)
    PSDU_Data_Rate = 2;
elseif (portC == 1)
    PSDU_Data_Rate = 5.5;
elseif (portD == 1)
    PSDU_Data_Rate = 11;
end
I am attaching the part of the model that I am developing for this functionality:
Any ideas on how to proceed, corrections, or suggestions for improvement would be really helpful.
Thanks
Since you have 4 distinct inputs instead of a single input carrying an enumerated value, use an If block rather than a Switch Case. I'm adding screenshots of how this can be done. Note that the If block also lets you have an Else output if you want to select one of the data rates by default (when none of the inputs is non-zero).
If block settings:
Number of inputs: 4
If expression: u1 ~= 0
Elseif expressions: u2 ~= 0, u3 ~= 0, u4 ~= 0
The model consists of an If block connected to a set of If Action Subsystem blocks. The outputs of the latter can be combined into a single signal using a Merge block.

Resources