How to manage the number of loops in while loop Pseudocode - algorithm

I'm a beginner and want to ask about while-loop pseudocode. If I want the number of loops to be entered by the user rather than hard-coded, should I write it like this, or do I have to declare SM first?
Thank you
BEGIN
    Student = 0
    WHILE Student < SM
        Get Work Efficiency, Task Completion Effectiveness, Team Work, SM # SM is the number of students’ marks to be entered
        Sum = Work Efficiency + Task Completion Effectiveness + Team Work
        Competency = (Sum / 50) * 100
        If Competency >= 70%
            grade = ‘A’
            display = “Exceed Expectation!”
        else if Competency >= 40% AND Competency < 70%
            grade = ‘B’
            display = “Meet Expectation”
        else if Competency >= 0% AND Competency < 40%
            grade = ‘C’
            display = “Below Expectation”
        else
            display = “Invalid input”
        End if
        Student = Student + 1
    END WHILE
END

As you say, SM is provided by the user's input, so you should state that somewhere, before the loop uses it. There are many ways of writing pseudocode, and the choice mostly depends on your needs, so you could, for example, write:
SM <- integer user input
[rest of your code]
or wrap it in a function (that way you show that the returned value depends on SM's value):
function foo(SM):
[rest of your code]
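A minimal Python sketch of the first option, assuming the marks arrive as a list of tuples (the names `grade` and `process_students` are hypothetical and not from the question). The point it illustrates is that SM already has a value before the WHILE condition is evaluated:

```python
def grade(competency):
    """Map a competency percentage to (grade, message), following the question's thresholds."""
    if competency >= 70:
        return "A", "Exceed Expectation!"
    if competency >= 40:
        return "B", "Meet Expectation"
    if competency >= 0:
        return "C", "Below Expectation"
    return None, "Invalid input"


def process_students(sm, marks):
    """Grade the first `sm` students; `sm` is known before the loop starts."""
    results = []
    student = 0
    while student < sm:
        work_efficiency, task_effectiveness, team_work = marks[student]
        total = work_efficiency + task_effectiveness + team_work
        competency = (total / 50) * 100
        results.append(grade(competency))
        student += 1
    return results


if __name__ == "__main__":
    # In an interactive program, sm could come from int(input(...)) instead.
    for letter, message in process_students(2, [(20, 15, 10), (5, 5, 5)]):
        print(letter, message)
```

Reading SM once, before the loop, corresponds to the `SM <- integer user input` line above; passing it as a parameter corresponds to the function variant.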

Related

DAX IF measure - return fixed value

This should be a very simple requirement. But it seems impossible to implement in DAX.
Data model: a User lookup table joined to many "Cards" linked to each user.
I have a measure setup to count rows in CardUser. That is working fine.
<measureA> = count rows in CardUser
I want to create a new measure,
<measureB> = IF(User.boolean = 1,<measureA>, 16)
If User.boolean = 1, I want to return a fixed value of 16. Effectively, bypassing measureA.
I can't simply put User.boolean = 1 in the IF condition, throws an error.
I can modify measureA itself to return 0 if User.boolean = 1
<measureA> =
CALCULATE (
COUNTROWS(CardUser),
FILTER ( User.boolean != 1 )
)
This works, but I still can't find a way to return 16 ONLY if User.boolean = 1.
That's easy in DAX; you just need to learn the "X" functions (aka "iterators"):
Measure B =
SUMX( VALUES(User.boolean),
IF(User.Boolean, [Measure A], 16))
The VALUES function generates a list of the distinct User.boolean values (1 and 0 in this case). SUMX then iterates over this list and applies the IF logic to each value.
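As a rough analogy only (this is Python, not DAX, and the data layout is an assumption made for illustration), the iterate-over-distinct-values pattern can be sketched like this. `users` maps a hypothetical user id to its boolean flag, and `card_users` lists the user id of each CardUser row:

```python
def measure_a(card_users, users, flag):
    """Count CardUser rows whose user carries the given boolean flag."""
    return sum(1 for user_id in card_users if users[user_id] == flag)


def measure_b(card_users, users):
    """Sum, over each distinct flag value, either the row count or the fixed 16."""
    distinct_flags = set(users.values())  # plays the role of VALUES(User.boolean)
    return sum(                           # plays the role of SUMX
        measure_a(card_users, users, flag) if flag else 16
        for flag in distinct_flags
    )


if __name__ == "__main__":
    users = {"u1": True, "u2": False}
    card_users = ["u1", "u1", "u2"]
    print(measure_b(card_users, users))
```

The sketch mirrors the answer's IF as written: the count is used when the flag is true and 16 otherwise.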

How to calculate original loan amount without year terms?

https://www.moneysmart.gov.au/tools-and-resources/calculators-and-apps/savings-goals-calculator
I want to get result like above calculator when I select:
I want to save: 6000
I want to spend it: As soon as possible
Starting balance: 0
Interest rate : 10%
Regular savings: 1000 Monthly
But I am not getting the correct result with this code:
loan = 6000.0
interest = 10.0
monthly_payment = 1000.0
i =0.0
record = []
count = 1
add_interst = 0.0
while( loan>=0)
i = interest/(100*12)*loan
loan=i+(loan)-(monthly_payment);
add_interst = add_interst + i
end
puts add_interst
I am getting 181.42163384701658 which should be 168. I don't know where I am wrong.
The code doesn't work because you are doing the opposite of what the calculator you reference is doing. It calculates savings interest; you are calculating loan interest.
Basically, this is how you should define the variables. Also, as others have pointed out, it is good to use BigDecimal to calculate money:
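require 'bigdecimal'
require 'bigdecimal/util' # provides Integer#to_d
balance = 0.to_d
interest = 10.to_d / 1200.to_d # monthly rate: 10% per year, divided by 12 months
regular_saving = 1000
goal = 6000
i = 0
added_interest = 0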
So, to correct things, you have to start from the starting balance (i.e. 0) and keep incrementing it. Something like this:
while balance < goal
  balance += regular_saving
  i = balance * interest
  balance += i
  added_interest += i
end
Note also that in the last month you don't need to pay the full savings amount; you only need to pay enough to reach the goal. For that, add a conditional statement that checks whether goal - balance < regular_saving. If it is, the interest should be calculated on the balance that actually needs to be paid (slightly less than the goal).
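A sketch of that loop with the capped final deposit, written in Python with `Decimal` rather than Ruby's BigDecimal. The function name `save_until_goal` is hypothetical. The order in which the deposit and the interest are applied, and the choice to stop as soon as the goal is reached, are assumptions, so the figure may not match the linked calculator exactly:

```python
from decimal import Decimal


def save_until_goal(goal, regular_saving, annual_rate_percent, balance=Decimal("0")):
    """Return (months, added_interest) needed to grow `balance` to `goal`."""
    goal = Decimal(goal)
    regular_saving = Decimal(regular_saving)
    monthly_rate = Decimal(annual_rate_percent) / Decimal(1200)
    added_interest = Decimal("0")
    months = 0
    while balance < goal:
        months += 1
        # Cap the deposit so the final payment only covers what is still missing.
        balance += min(regular_saving, goal - balance)
        if balance >= goal:
            break  # assumption: no interest is credited once the goal is reached
        interest = balance * monthly_rate
        balance += interest
        added_interest += interest
    return months, added_interest


if __name__ == "__main__":
    months, interest = save_until_goal(6000, 1000, 10)
    print(months, round(interest, 2))
```

Moving the interest calculation before the deposit, or crediting interest in the final month as well, changes the total, which is one way to experiment until the result lines up with the calculator's convention.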

How can I increase my python-code speed?

I have a dataframe, df1, that reports courses students have taken, where ID is the student’s id, COURSES is a list of courses taken by the student, and TYPE and MAJOR are student attributes. The dataframe looks like this:
ID  COURSES                                             TYPE      MAJOR
1   ['Intr To Archaeology', 'Statics', 'Circuits I…]    Freshman  EEEL
2   ['Signals & Systems I', 'Instrumentation'…]         Transfer  EEEL
3   ['Keyboard Competence', 'Elementary … ]             Freshman  EEEL
4   ['Cultural Anthro', 'Vector Analysis' … ]           Freshman  EEEL
I created a new dataframe, df2, that reports a dissimilarity measure for each pair of students based on the courses they’ve taken. df2 looks like this:
I created it using the following script, but it runs very slowly (there are thousands of students). Can someone suggest a more efficient way to create df2?
One major problem is that the script below calculates the distance between (student 1 and student 2) and (student 2 and student 1), which is redundant since the distances are the same. However, the condition I created to prevent this:
if (id1 >= id2):
    continue
doesn't work.
Entire script:
for id1, student1 in df.iterrows():
    for id2, student2 in df.iterrows():
        if (id1 >= id2):
            continue
        ID_1 = student1["ID"]
        ID_2 = student2["ID"]
        # courses as list strings
        s1 = student1["COURSES"]
        s2 = student2["COURSES"]
        try:
            # courses as sets
            courses1 = set(ast.literal_eval(s1))
            courses2 = set(ast.literal_eval(s2))
            distance = float(len(courses1.symmetric_difference(courses2)))/(len(courses1) + len(courses2))
        except:
            # Some strings seem to have a different format
            distance = -1
        ID_1_Transfer = 1 if student1["TYPE"] == "Transfer" else 0
        ID_2_Transfer = 1 if student2["TYPE"] == "Transfer" else 0
        df2 = df2.append({'ID_1': ID_1, 'ID_2': ID_2, 'Distance': distance, 'ID_1_Transfer': ID_1_Transfer, 'ID_2_Transfer': ID_2_Transfer}, ignore_index=True)
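One possible way to cut the work, sketched without pandas so that it runs on its own: parse each course list once, visit every unordered pair exactly once with `itertools.combinations`, and collect the rows in a plain list. The helper names `parse_courses` and `pairwise_distances` are hypothetical, and the input is assumed to be a list of (ID, COURSES string, TYPE) tuples taken from the dataframe:

```python
import ast
from itertools import combinations


def parse_courses(raw):
    """Parse a stringified course list into a set; return None if it is malformed."""
    try:
        return set(ast.literal_eval(raw))
    except (ValueError, SyntaxError, TypeError):
        return None


def pairwise_distances(students):
    """Build one record per unordered pair of students."""
    # Parse every course list once, instead of once per pair.
    parsed = [
        (student_id, parse_courses(raw), 1 if student_type == "Transfer" else 0)
        for student_id, raw, student_type in students
    ]
    records = []
    # combinations() yields (a, b) but never (b, a), so no id comparison is needed.
    for (id1, c1, t1), (id2, c2, t2) in combinations(parsed, 2):
        if c1 is None or c2 is None or not (c1 or c2):
            distance = -1  # same sentinel as the original script
        else:
            distance = len(c1 ^ c2) / (len(c1) + len(c2))
        records.append({
            "ID_1": id1, "ID_2": id2, "Distance": distance,
            "ID_1_Transfer": t1, "ID_2_Transfer": t2,
        })
    return records


if __name__ == "__main__":
    sample = [
        (1, "['Statics', 'Circuits I']", "Freshman"),
        (2, "['Circuits I', 'Signals & Systems I']", "Transfer"),
    ]
    print(pairwise_distances(sample))
```

The resulting list of dicts can be passed to the `pandas.DataFrame` constructor once at the end. That avoids calling `append` on the dataframe for every pair, which copies the frame each time and is no longer available in pandas 2.0 and later.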

SUM function in Pig script

I am a student learning how to use Pig scripts with the Hortonworks sandbox. My problem is that I am not able to use the SUM function properly. I have successfully separated the fields of a firewall log, and I am able to perform several queries and use the COUNT function, but I have had no luck with the SUM function, which I really need in one case. This is the code I used:
A = FOREACH logs_base GENERATE device_id,src,src_port,dst,dst_port,tran_ip,tran_port,service,duration,sent,rcvd,sent_pkt,rcvd_pkt,SN,user,group1, REGEX_EXTRACT(date, '\\d{3}-(\\d{2})-\\d{2}', 1) AS(month:chararray);
F1 = FILTER A BY user == 'PR11MS1120' and month == '10';
grpd1 = group F1 by user;
counter = foreach grpd1 {
sum1 = SUM(A.rcvd);
sum2 = SUM(A.sent);
generate sum1, sum2;
};
dump counter;
C = foreach F1 generate rcvd, sent;
dump C;
When I dump just the variable C I get a result displaying many records indicating the amount of data received/sent for the filter applied. eg:
(223,123)
(334,444)
(21,12344)
(...,...)
All I really want to do is add all those records together and show that total amount of received and sent: (?,?).
Note: I have tried changing the variable type to int, long, and chararray with no success either.
Some of the errors I am getting while trying to solve this are:
Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast.
First, make sure that the fields you are summing are of type int.
Use DESCRIBE A; to check the data types.
After that, note that you applied a filter and then grouped F1 by user:
F1 = FILTER A BY user == 'PR11MS1120' and month == '10';
grpd1 = group F1 by user;
So, when summing, you should use F1 instead of A:
counter = foreach grpd1 {
sum1 = SUM(F1.rcvd);
sum2 = SUM(F1.sent);
generate sum1, sum2;
};
Run DESCRIBE grpd1; and you will see what I mean: there is no 'A' in the grouped relation.
I guess this should solve the error. Finally, check the logic of what you want in the result; I have not checked that. Hope this helps.
PS - I am also a student and new to PIG.
A lucky guess here, I'm new to Pig too :)
I'm not sure whether SUM can work on a chararray (that would explain the error), so make rcvd and sent type int and then generate the two sums for the grpd1 bag:
F1 = FILTER A BY user == 'PR11MS1120' and month == '10';
grpd1 = group F1 by user;
C1 = foreach grpd1 generate SUM(F1.rcvd);
dump C1;
C2 = foreach grpd1 generate SUM(F1.sent);
dump C2;
NOTE: More info here.
Hope I helped a little!
Please try the following
A = FOREACH logs_base GENERATE device_id,src,src_port,dst,dst_port,tran_ip,tran_port,service,duration,sent,rcvd,sent_pkt,rcvd_pkt,SN,user,group1, REGEX_EXTRACT(date, '\\d{3}-(\\d{2})-\\d{2}', 1) AS(month:chararray);
F1 = FILTER A BY user == 'PR11MS1120' and month == '10';
grpd1 = group F1 by user;
C = foreach grpd1 generate group, SUM(F1.rcvd), SUM(F1.sent);
dump C;

Regroup By in PigLatin

In Pig Latin, I want to group twice, so as to select rows by two different rules.
I'm having trouble explaining the problem, so here is an example. Let's say I want to get the details of the people whose age is nearest to mine ($my_age) and who have a lot of money.
Relation A has five columns: (name, address, zipcode, age, money)
B = GROUP A BY (address, zipcode); -- group by the address
-- generate the address, the person's age ...
C = FOREACH B GENERATE group, MIN($my_age - age) AS min_age, FLATTEN(A);
D = FILTER C BY min_age == age
--Then group by as to select the richest, group by fails :
E = GROUP D BY group; or E = GROUP D BY (address, zipcode);
-- The end would work
D = FOREACH E GENERATE group, MAX(money) AS max_money, FLATTEN(A);
F = FILTER C BY max_money == money;
I've tried to filter by the nearest and the richest at the same time, but it doesn't work, because the richest people may be much older than I am.
Another, more realistic example:
You have a demands file like: iddem, idopedem, datedem
You have an operations file like: idope, labelope, dateope, idoftheday, infope
I want to return the operations that match the demands, such that:
idopedem matches idope.
dateope must be the nearest to datedem.
If datedem - dateope > 0, then I must select the operation with the max(idoftheday); otherwise I must select the operation with the min(idoftheday).
Relation A is 5 columns (idope,labelope,dateope,idoftheday,infope)
Relation B is 3 columns (iddem, idopedem, datedem)
C = JOIN A BY idope, B BY idopedem;
D = FOREACH E GENERATE iddem, idope, datedem, dateope, ABS(datedem - dateope) AS datedelta, idoftheday, infope;
E = GROUP C BY iddem;
F = FOREACH D GENERATE group, MIN(C.datedelta) AS deltamin, FLATTEN(D);
G = FILTER F BY deltamin == datedelta;
--Then I must group by another time as to select the min or max idoftheday
H = GROUP G BY group; --Does not work when dump
H = GROUP G BY iddem; --Does not work when dump
I = FOREACH H GENERATE group, (datedem - dateope >= 0 ? max(idoftheday) as idofdaysel : min(idoftheday) as idofdaysel), FLATTEN(D);
J = FILTER F BY idofdaysel == idoftheday;
DUMP J;
Data for the second example (note that the dates are already in Unix format):
You have demands file like :
1, 'ctr1', 1359460800000
2, 'ctr2', 1354363200000
You have operations file like :
idope,labelope,dateope,idoftheday,infope
'ctr0','toto',1359460800000,1,'blabla0'
'ctr0','tata',1359460800000,2,'blabla1'
'ctr1','toto',1359460800000,1,'blabla2'
'ctr1','tata',1359460800000,2,'blabla3'
'ctr2','toto',1359460800000,1,'blabla4'
'ctr2','tata',1359460800000,2,'blabla5'
'ctr3','toto',1359460800000,1,'blabla6'
'ctr3','tata',1359460800000,2,'blabla7'
Result must be like :
1, 'ctr1', 'tata',1359460800000,2,'blabla3'
2, 'ctr2', 'toto',1359460800000,1,'blabla4'
Sample input and output would help greatly, but from what you have posted it appears to me that the problem is not so much in writing the Pig script but in specifying what exactly it is you hope to accomplish. It's not clear to me why you're grouping at all. What is the purpose of grouping by address, for example?
Here's how I would solve your problem:
First, design an optimization function that will induce an ordering on your dataset that reflects your own prioritization of money vs. age. For example, to severely penalize large age differences but prefer more money with small ones, you could try:
scored = FOREACH A GENERATE *, money / POW(1+ABS($my_age-age)/10, 2) AS score;
ordered = ORDER scored BY score DESC;
top10 = LIMIT ordered 10;
That gives you the 10 best people according to your optimization function.
Then the only work is to design a function that matches your own judgments. For example, in the function I chose, a person with $100,000 who is your age would be preferred to someone with $350,000 who is 10 years older (or younger). But someone with $500,000 who is 20 years older or younger is preferred to someone your age with just $50,000. If either of those doesn't fit your intuition, then modify the formula. Likely a simple quadratic factor won't be sufficient, but with a little experimentation you can hit upon something that works for you.
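To experiment with the formula outside Pig, here is a small Python sketch of the same score. The function names `score` and `top_n` and the dictionary layout are assumptions made for illustration; the formula itself is the one from the script above:

```python
def score(money, age, my_age):
    """money / (1 + |my_age - age| / 10)^2, the formula used in the Pig script."""
    return money / (1 + abs(my_age - age) / 10) ** 2


def top_n(people, my_age, n=10):
    """Return the n best-scoring people, highest score first."""
    return sorted(
        people,
        key=lambda person: score(person["money"], person["age"], my_age),
        reverse=True,
    )[:n]


if __name__ == "__main__":
    my_age = 30
    people = [
        {"name": "same age, $100,000", "age": 30, "money": 100_000},
        {"name": "10 years older, $350,000", "age": 40, "money": 350_000},
        {"name": "20 years older, $500,000", "age": 50, "money": 500_000},
        {"name": "same age, $50,000", "age": 30, "money": 50_000},
    ]
    for person in top_n(people, my_age):
        print(person["name"], round(score(person["money"], person["age"], my_age)))
```

Changing the exponent or the divisor of 10 and rerunning the comparison is a quick way to check whether a candidate formula matches your own ranking of such cases.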
