SPSS Ranking Data In One Column - ranking

I'm still new to SPSS. I have data for the following variables:
Cereals Vegetables Fruit Meat Dairy Fat Sugar Pulses
I have also computed a new variable with this formula:
Total FCS = (Cereals*2)+(Vegetables)+(Fruits)+(Meat*4)+(Dairy*4)+(Sugar*0.5)+(Pulses*3)
Now I want to rank the Total FCS values in a single column so that I can make a graph from it, as follows:
Rank as:
<28 Poor
>28.5 - <42 Borderline
>42.5 Acceptable
What should I do?

I would use a DO IF statement to assign the ranks. Example below.
DO IF FCS < 28.
COMPUTE RankFCS = 1.
ELSE IF FCS <= 42.5.
COMPUTE RankFCS = 2.
ELSE.
COMPUTE RankFCS = 3.
END IF.
VALUE LABELS RankFCS
1 'Poor'
2 'Borderline'
3 'Acceptable'.

SPSS has a RECODE command that you can use to create this rank variable. RECODE has two options:
1) Recode into same variables
2) Recode into different variables
I am using the second option because you need to create a new rank variable. The string variable is declared as A10 so that the 10-character labels 'Borderline' and 'Acceptable' are not truncated.
STRING RankFCS (A10).
RECODE FCS (Lowest thru 28='Poor') (28.5 thru 42='Borderline')
(42.5 thru Highest='Acceptable')
INTO RankFCS.
EXECUTE.
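For readers who want to check the banding logic outside SPSS, here is a minimal Python sketch. It assumes the same cut-offs as the RECODE answer above (up to 28 is Poor, up to 42 is Borderline, anything higher is Acceptable); the function names are made up for illustration, and the weights come from the Total FCS formula in the question.

```python
def food_consumption_score(cereals, vegetables, fruit, meat, dairy, sugar, pulses):
    """Weighted sum taken from the Total FCS formula in the question."""
    return (cereals * 2 + vegetables + fruit + meat * 4
            + dairy * 4 + sugar * 0.5 + pulses * 3)


def fcs_category(score):
    # Assumed cut-offs (same bands as the RECODE answer): <= 28, <= 42, above 42.
    if score <= 28:
        return "Poor"
    if score <= 42:
        return "Borderline"
    return "Acceptable"


# One computed score plus two boundary values, then their categories.
scores = [food_consumption_score(7, 2, 1, 0, 0, 3, 1), 28.5, 42.5]
categories = [fcs_category(s) for s in scores]
```

If your cut-offs differ (the DO IF answer treats exactly 28 as Borderline, for example), only the two comparisons in fcs_category need to change.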

Related

How do I add noise/variability to a dataset in Python, given the CV?

Given a dataset of blood results, say cholesterol level, and knowing that the instrument that produced those results is subject to a known degree of variability, how would I add that variability back into the dataset? i.e. I want to assume the result in the original dataset is the true/mean value, and then produce new results that are subject to the known variability of the instrument.
In Excel you use =NORM.INV(RAND(), mean, std_dev), where RAND() provides a random value between 0 and 1, "mean" will be the original value and I have the CV so I can calculate the SD. NORM.INV then provides the inverse of the cumulative normal distribution function.
I've done the following to create a new column with my new values, but would like to know if it is valid (i.e., will each row have a different random number between 0 and 1 as the probability, and is this formula equivalent to NORM.INV?).
df8000['HDL_1'] = norm.ppf(random(), loc = df8000['HDL_0'], scale = TAE_df.loc[0,'HDL'])
Thanks in advance!
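A note on the snippet in the question: norm.ppf is the inverse of the normal CDF, so it plays the same role as NORM.INV. If random() is called with no arguments, however, it returns a single number, so every row would share the same probability. The sketch below shows the per-row idea using only the standard library, where statistics.NormalDist(...).inv_cdf(p) is the inverse CDF. The function name, the sample values and the 5 % CV are assumptions for illustration, and the SD is assumed to be CV times the original value.

```python
import random
from statistics import NormalDist


def add_instrument_noise(true_values, cv, rng):
    """Return one noisy value per input, drawn from Normal(value, cv * value)."""
    noisy = []
    for value in true_values:
        sd = cv * value      # assumes cv is a fraction (0.05 means 5 %) and value > 0
        p = rng.random()     # a fresh uniform draw for every row
        while p == 0.0:      # inv_cdf requires 0 < p < 1
            p = rng.random()
        noisy.append(NormalDist(mu=value, sigma=sd).inv_cdf(p))
    return noisy


rng = random.Random(42)        # seeded only so the sketch is repeatable
hdl_0 = [1.2, 1.5, 0.9, 2.1]   # hypothetical "true" HDL results
hdl_1 = add_instrument_noise(hdl_0, cv=0.05, rng=rng)
```

With a DataFrame the same effect should be achievable by passing an array of uniform draws, one per row, instead of a single random() value.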

Most common "denominators" in a two column list in Google Sheets

How can I find the most common 'Code' (Col B) associated with each unique 'Name' in Col A, and find the closest value if the 'Code' in Col B is unique?
The image below shows the shared Google Sheet with the starting data in columns A and B and the desired output in columns C and D. Each unique name has associated codes. Column D displays the most commonly occurring code for each unique name. For example, Buick La Sabre 1 has 3 associated codes in B3, B4 and B5, but D3 shows only 98761 because it appears more frequently than the other 2 codes do in B2:B. I will explain what I mean by the closest value below.
Codes that have a count of 1 are unique, so the output in column D tries to find the closest match.
However, when the count of the code in B2:B is greater than 1, the output in column D is the most frequent code associated with the name.
Approach when there are 2 or more of the same value in column B
Query
I thought I might use a QUERY with an ORDER BY count(B) DESC LIMIT 2, similar to this working formula:
QUERY($A$1:$D$25,"SELECT A, B ORDER BY B DESC Limit 2",1)
but I could not get it to work when I substituted in the Count function.
SORT & INDEX OR VLOOKUP
If the query function can't be fixed to work, then I thought another approach might be to combine a Vlookup/Index after sorting column B in a descending order.
UNIQUE(sort($B$3:$B,if(len($B$3:$B),countif($B$3:$B,$B$3:$B),),0,1,1))
Since a VLOOKUP or INDEX using multiple criteria just pulls the first value it finds, sorting by frequency first means that the first matching value is also the most frequent one.
Approach when there are fewer than 2 of the same value in column B
This is a little more complicated since the values can contain both numbers and letters.
A solution like that seen in the image below could be used if everything were a number. In our case the code will usually be 3 to 5 alphanumeric characters, starting with 0 or 1 letters followed by numbers. I'm not sure what the best way to match a code like A1234 would be. I imagine a solution might be to SPLIT off the letters and try to match those first. For example, A1234 would be split into A | 1234, then matched on the closest letter and then on the closest number. But I really am not sure what the best solution might be within the constraints of Google Sheets.
In the event that a number is equidistant between two numbers, the lower number should be chosen. For example, if 8 is the number and the closest match would be 6 or 10, then 6 should be selected.
In the event that a letter is being used, it should work in a similar fashion. For example, thinking of {A, B, C} as {1, 2, 3}, B should preferentially match to A since it comes before C.
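The split-then-compare idea and the tie-breaking rule described above can be sketched outside Google Sheets as follows. This is only an illustration in Python, not a Sheets formula: it assumes every code is at most one letter followed by digits, ranks a missing letter as 0, and uses made-up function names.

```python
import re


def split_code(code):
    """Split a code such as 'A1234' into (letter rank, number); no letter ranks as 0."""
    match = re.fullmatch(r"([A-Za-z]?)(\d+)", code)
    if match is None:
        raise ValueError(f"unsupported code format: {code!r}")
    letter, digits = match.groups()
    letter_rank = ord(letter.upper()) - ord("A") + 1 if letter else 0
    return letter_rank, int(digits)


def closest_code(target, candidates):
    """Pick the candidate nearest to target: letter first, then number, lower wins ties."""
    target_letter, target_number = split_code(target)

    def distance(code):
        letter, number = split_code(code)
        # The trailing (letter, number) pair makes the lower candidate win a tie.
        return (abs(letter - target_letter), abs(number - target_number), letter, number)

    return min(candidates, key=distance)


numeric_match = closest_code("8", ["6", "10"])   # equidistant, so the lower one
letter_match = closest_code("B1", ["A1", "C1"])  # A comes before C
```

The caller is expected to leave the target itself out of the candidate list.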
In summary, I'm looking for a way to find the code in col B that is most frequently associated with each unique name in col A in this sheet and, where none of the codes in B2:B repeat, a formula that will find the closest match for a numeric or alphanumeric code.
You can use this formula:
=QUERY({range of numerators & denominators}, "select Col2, count(Col2) group by Col2 label Col2 'Denominator', count(Col2) 'Count'")
That outputs something like this:
Denominator   Count
Den 1         Count 1
Den 2         Count 2
use:
=ARRAY_CONSTRAIN(SORTN(QUERY({A3:B},
"select Col1,Col2,count(Col2)
where Col1 is not null
group by Col1,Col2
order by count(Col2) desc,Col2 asc
label count(Col2)''"), 9^9, 2, 1, 1), 9^9, 2)
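To make the grouping logic easier to follow, here is a rough Python equivalent of what the formula above appears to do: count each (name, code) pair, then keep the code with the highest count for every name, breaking ties by the smaller code. The sample rows and the function name are invented for illustration, and codes are compared as text.

```python
from collections import Counter, defaultdict


def most_common_code_per_name(rows):
    """Map each name to its most frequent code; ties go to the smaller code."""
    counts = defaultdict(Counter)
    for name, code in rows:
        if name:  # skip blank names, like "where Col1 is not null"
            counts[name][code] += 1
    result = {}
    for name in sorted(counts):
        # Highest count first, then code ascending (as text).
        best_code, _ = min(counts[name].items(),
                           key=lambda item: (-item[1], str(item[0])))
        result[name] = best_code
    return result


rows = [
    ("Buick La Sabre 1", "98761"),
    ("Buick La Sabre 1", "A1234"),
    ("Buick La Sabre 1", "98761"),
    ("Ford", "555"),
    ("Ford", "444"),
    ("", "000"),
]
summary = most_common_code_per_name(rows)
```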

Is it possible to pre-assign values to decision variables in CPLEX OPL

I have a large number of variables (both binary and continuous). I have therefore worked out some logic to fix a subset of the variables to 0 so that they do not take part in the optimisation.
For example I have a binary decision variable y[b][t]:
where b varies from 1 to 100
and t from 1 to 5.
Using some logic I could determine that y[20][2] through y[100][2] would be 0. I want to assign the fixed value 0 to these variables, thereby reducing the number of variables in my optimisation problem. While y is a binary decision variable, I have other continuous variables as well which I would similarly like to set to 0 in advance.
Is there a way to achieve this? I haven't used Python with CPLEX, but I hear that this can probably be achieved by setting the lower and upper bounds of the variables. Is there a similar method in OPL?
----Added 13th Aug
Maybe I was not very clear, or I could not understand the suggested solution.
What I want is this: say I have the following decision variable Xbmt (I have a few of them).
Originally declared as:
dvar float+ Xbmt[PitBlocks][Plants][TimePeriods];
But for some of the PitBlocks and some time periods I want to fix this decision variable at 0. The block and time period combinations for which I want to set the decision variable to 0 are defined in a tuple set nullVariables. Its block id matches PitBlocks, and its time_period matches TimePeriods. Hence I want something like the line below. But I cannot declare the decision variable twice. I need it to be 0 only for the ids in the nullVariables set.
dvar float+ Xbmt[NullVariablesSet.block_id][Plants][NullVariablesSet.time_period] in 0..0;
How can this be achieved so that some of Xbmt remain decision variables whereas others are removed by fixing them at 0?
See https://github.com/AlexFleischerParis/zooopl/blob/master/zoopreassign.mod
which is part of
Making Decision Optimization Simple:
int nbKids=300;
{int} seats={40,30}; // how many seats, {} means this is a set
float costBus[seats]=[500,400];
// Now let's see how to preassign some decision variables
// Suppose we know that we have exactly 6 buses with 40 seats
{int} preassignedseats={40};
int preassignedvalues[preassignedseats]=[6];
dvar int+ nbBus[s in seats]
  in
  ((s in preassignedseats)?preassignedvalues[s]:0)
  ..
  ((s in preassignedseats)?preassignedvalues[s]:maxint);
minimize sum(b in seats) costBus[b]*nbBus[b];
subject to
{
  sum(b in seats) b*nbBus[b]>=nbKids;
}
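The model above fixes a variable by giving it identical lower and upper bounds, chosen per index with a conditional expression. The same per-index logic, applied to the y[b][t] example from the question, can be sketched in plain Python. This only builds a table of bounds and does not call any solver API; the function name and the dictionary layout are assumptions for illustration.

```python
import math


def build_bounds(blocks, periods, null_pairs):
    """Map each (block, period) index to a (lower, upper) bound pair."""
    bounds = {}
    for b in blocks:
        for t in periods:
            if (b, t) in null_pairs:
                bounds[(b, t)] = (0.0, 0.0)        # fixed at zero
            else:
                bounds[(b, t)] = (0.0, math.inf)   # ordinary non-negative variable
    return bounds


blocks = range(1, 101)
periods = range(1, 6)
# Pairs to fix at zero: blocks 20 to 100 in period 2, as in the question.
null_pairs = {(b, 2) for b in blocks if b >= 20}
bounds = build_bounds(blocks, periods, null_pairs)
```

In OPL the equivalent test would sit inside the "in lower..upper" part of the dvar declaration, as the bus example does with preassignedseats.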

Is there a way to use range with Z3ints in z3py?

I'm relatively new to Z3 and experimenting with it in Python. I've coded a program which returns the order in which different actions are performed, represented by a number. Z3 returns an integer representing the second at which each action starts.
Now I want to look at the model and see if there is an instant where nothing happens. To do this I made a list of 0's, and I want to set the entries to 1 at the times when an action is being executed. For instance, if an action starts at the 5th second and takes 8 seconds to execute, indices 5 to 12 would be set to 1. Doing this for all the actions and then looking for 0's in the list would hopefully give me the instants where nothing happens.
The problem is: I would like to write something like this
list_for_check = [0]*total_time
m = s.model()
for action in actions:
    for index in range(m.evaluate(action.number), m.evaluate(action.number) + action.time_it_takes):
        list_for_check[index] = 1
But I get the error:
'IntNumRef' object cannot be interpreted as an integer
I've understood that Z3 isn't returning normal ints or bools in their models, but writing
if m.evaluate(action.boolean):
works, so I'm assuming the if is overwritten in a way, but this doesn't seem to be the case with range. So my question is: Is there a way to use range with Z3 ints? Or is there another way to do this?
The problem might also be that action.time_it_takes is an integer and adding a Z3int with a "normal" int doesn't work. (Done in the second part of the range).
I've also tried using int(m.evaluate(action.number)), but it doesn't work.
Thanks in advance :)
When you call evaluate, it returns an IntNumRef, which is z3's internal representation of an integer value. You need to call its as_long() method to convert it to a Python number. Here's an example:
from z3 import *
s = Solver()
a = Int('a')
s.add(a > 4)
s.add(a < 7)
if s.check() == sat:
    m = s.model()
    print("a is %s" % m.evaluate(a))
    print("Iterating from a to a+5:")
    av = m.evaluate(a).as_long()
    for index in range(av, av + 5):
        print(index)
When I run this, I get:
a is 5
Iterating from a to a+5:
5
6
7
8
9
which is exactly what you're trying to achieve.
The as_long() method is defined on IntNumRef. Note that there are similar conversion functions for bit-vectors and rationals as well. You can search the z3py API using the interface at: https://z3prover.github.io/api/html/namespacez3py.html
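Once the start times are plain Python integers, the idle-time check from the question becomes ordinary list code. The sketch below assumes each action has already been reduced to a (start, duration) pair, for example with m.evaluate(action.number).as_long(); the function name and the sample data are hypothetical.

```python
def idle_seconds(actions, total_time):
    """Return the seconds in [0, total_time) during which no action is running."""
    busy = [0] * total_time
    for start, duration in actions:
        # Clamp to total_time so a late action cannot index past the list.
        for index in range(start, min(start + duration, total_time)):
            busy[index] = 1
    return [second for second, flag in enumerate(busy) if flag == 0]


# Hypothetical schedule: one action at second 0 for 3 s, one at second 5 for 8 s.
schedule = [(0, 3), (5, 8)]
gaps = idle_seconds(schedule, total_time=15)
```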

how to use while loop in pseudocode

I am trying to add the user inputs using a while loop and If statements. I am having trouble figuring out how to add all the userNumbers to each other. Any help would be appreciated.
//variables
Declare Integer userIn = 0
Declare Integer total = 0
//read numbers and calculate
While decision == decY
    Display “Please enter your numbers: ”
    Input decision
    If UserIn > 0
        Display userNumbers
        Set total = userIn + 1
        Display “Would you like to enter another number Y/N?”
        Input decision
        If decision == decN
            Display “Done reading numbers, your total is ”, total
        End If
    End If
End While
1. Decide on a separator for the input, unless the user is only allowed to enter a single number at a time, in which case you can skip to step 3.
2. Use string splitting to cut the input up, then loop through the resulting list with for, while, do, until, etc.
3. Create a sum total variable and add each input value to it, e.g. sum = sum + split_input[index], or, if only a single input is allowed at a time, sum = sum + input.
Some notes:
Adding a value to a variable can often be shortened to variable += value, which adds the value and assigns the result back to the variable, but not all languages support this syntax.
Not all programming languages start list indices at 0, so be sure to change the starting index accordingly.
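As one possible concrete version of the steps above, here is a short Python sketch. It assumes a comma as the separator and reads the "inputs" from a list of strings rather than the keyboard, so that it can be run without interaction; the function name and sample entries are made up.

```python
def running_total(lines):
    """Sum comma-separated numbers, alternating with Y/N 'continue?' answers."""
    entries = iter(lines)
    total = 0
    decision = "Y"
    while decision == "Y":
        entry = next(entries)              # e.g. "1, 2"
        for token in entry.split(","):     # step 2: split on the separator
            total += int(token)            # step 3: add each value to the total
        # Stop when the user answers N or when there is no further input.
        decision = next(entries, "N").strip().upper()
    return total


answer = running_total(["1, 2", "y", "3", "n"])
```

In an interactive program the two next(...) calls would be replaced by input() prompts.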
