How to refactor a simple RSpec test to use subject/let/it and FactoryGirl? - ruby

Basically I have a very simple test where I check whether the account_id of a Grant (i.e. the id of its parent) is equal to the account_id of the Role model.
The test seems OK, but I need to refactor it to use subject/let and FactoryGirl, and I haven't managed to do that even with all the information I found on this subject.
So, the test is fine; I just need help with the refactoring.
This is my first time working with Rails and writing tests for it, so I have to ask this type of question here.
describe "validations" do
it "should have equal account_id" do
first_acc, second_acc = Account.new, Account.new
first_acc.id, second_acc.id = 10, 11
vlada_role, vlada_grant = Role.new, Grant.new
vlada_role1, vlada_grant1 = Role.new, Grant.new
vlada_role.account_id = first_acc.id
vlada_grant.account_id = second_acc.id
vlada_role.id, vlada_grant.id = 8000, 255
rg = RoleGrant.new
rg1 = RoleGrant.new
rg.role, rg1.role = vlada_role, vlada_role1
rg.grant, rg1.grant = vlada_grant, vlada_grant1
expect(rg.role.account_id).not_to eql(rg.grant.account_id)
expect(rg1.role.account_id).to eql(rg1.grant.account_id)
end
end
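One possible shape for the refactor (just a sketch; it assumes FactoryGirl factories named :account, :role, :grant and :role_grant exist — those factory names are illustrative, not taken from the question):

RSpec.describe RoleGrant do
  describe "validations" do
    let(:first_account)  { FactoryGirl.create(:account) }
    let(:second_account) { FactoryGirl.create(:account) }

    # A role that always belongs to the first account.
    let(:role) { FactoryGirl.build(:role, account_id: first_account.id) }

    context "when the role and the grant belong to different accounts" do
      let(:grant) { FactoryGirl.build(:grant, account_id: second_account.id) }
      subject(:role_grant) { FactoryGirl.build(:role_grant, role: role, grant: grant) }

      it "does not have matching account_ids" do
        expect(role_grant.role.account_id).not_to eql(role_grant.grant.account_id)
      end
    end

    context "when the role and the grant belong to the same account" do
      let(:grant) { FactoryGirl.build(:grant, account_id: first_account.id) }
      subject(:role_grant) { FactoryGirl.build(:role_grant, role: role, grant: grant) }

      it "has matching account_ids" do
        expect(role_grant.role.account_id).to eql(role_grant.grant.account_id)
      end
    end
  end
end

With let, each example builds only the records it references, and the named subject keeps the expectations focused on the RoleGrant under test.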

Related

How can I get the score from Question-Answer Pipeline? Is there a bug when Question-answer pipeline is used?

When I run the following code
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
import torch
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
text = r"""
As checked Dis is not yet on boarded to ARB portal, hence we cannot upload the invoices in portal
"""
questions = [
    "Dis asked if it is possible to post the two invoice in ARB.I have not access so I wanted to check if you would be able to do it.",
]

for question in questions:
    inputs = tokenizer.encode_plus(question, text, add_special_tokens=True, return_tensors="pt")
    input_ids = inputs["input_ids"].tolist()[0]
    text_tokens = tokenizer.convert_ids_to_tokens(input_ids)

    answer_start_scores, answer_end_scores = model(**inputs)

    answer_start = torch.argmax(answer_start_scores)  # Get the most likely beginning of the answer with the argmax of the score
    answer_end = torch.argmax(answer_end_scores) + 1  # Get the most likely end of the answer with the argmax of the score

    answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(input_ids[answer_start:answer_end]))

    print(f"Question: {question}")
    print(f"Answer: {answer}\n")
The answer that I get here is:
Question: Dis asked if it is possible to post the two invoice in ARB.I have not access so I wanted to check if you would be able to do it.
Answer: dis is not yet on boarded to ARB portal
How do I get a score for this answer? By "score" I mean something very similar to what I get when I run the question-answering pipeline.
I have to take this approach because the question-answering pipeline gives me a KeyError for the code below:
from transformers import pipeline
nlp = pipeline("question-answering")
context = r"""
As checked Dis is not yet on boarded to ARB portal, hence we cannot upload the invoices in portal.
"""
print(nlp(question="Dis asked if it is possible to post the two invoice in ARB?", context=context))
This is my attempt to get the score. I cannot figure out what feature.p_mask is, so at the moment I cannot exclude the non-context indexes that contribute to the softmax.
# ... assuming question and context are defined as above
import numpy as np
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

model_name = "deepset/roberta-base-squad2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

inputs = tokenizer(question, context,
                   add_special_tokens=True,
                   return_tensors='pt')
input_ids = inputs['input_ids'].tolist()[0]
outputs = model(**inputs)

# used to compute the score
start = outputs.start_logits.detach().numpy()
end = outputs.end_logits.detach().numpy()

# from the pipeline source code:
# Ensure padded tokens & question tokens cannot belong to the set of candidate answers.
# ?? undesired_tokens = np.abs(np.array(feature.p_mask) - 1) & feature.attention_mask

# Generate mask
undesired_tokens = inputs['attention_mask']
undesired_tokens_mask = undesired_tokens == 0.0

# Make sure non-context indexes in the tensor cannot contribute to the softmax
start_ = np.where(undesired_tokens_mask, -10000.0, start)
end_ = np.where(undesired_tokens_mask, -10000.0, end)

# Normalize logits and spans to retrieve the answer
start_ = np.exp(start_ - np.log(np.sum(np.exp(start_), axis=-1, keepdims=True)))
end_ = np.exp(end_ - np.log(np.sum(np.exp(end_), axis=-1, keepdims=True)))

# Compute the score of each tuple (start, end) to be the real answer
outer = np.matmul(np.expand_dims(start_, -1), np.expand_dims(end_, 1))

# Remove candidates with end < start and end - start > max_answer_len
max_answer_len = 15
candidates = np.tril(np.triu(outer), max_answer_len - 1)
scores_flat = candidates.flatten()
idx_sort = [np.argmax(scores_flat)]
start, end = np.unravel_index(idx_sort, candidates.shape)[1:]
end += 1
score = candidates[0, start, end - 1]
start, end, score = start.item(), end.item(), score.item()

print(tokenizer.decode(input_ids[start:end]))
print(score)
See the question-answering pipeline source code for more details.
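If all that is needed is a single confidence value for the extracted span, one option (a rough sketch, not the pipeline's exact computation) is to softmax the start and end logits and multiply the probabilities of the selected positions:

import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# A minimal sketch: approximate the answer score as P(start) * P(end)
# from softmaxed logits. This is close to, but not identical to, what
# the question-answering pipeline reports.
model_name = "bert-large-uncased-whole-word-masking-finetuned-squad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

question = "Dis asked if it is possible to post the two invoice in ARB?"
context = "As checked Dis is not yet on boarded to ARB portal, hence we cannot upload the invoices in portal."

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

start_probs = torch.softmax(outputs.start_logits, dim=-1)
end_probs = torch.softmax(outputs.end_logits, dim=-1)

answer_start = torch.argmax(start_probs, dim=-1).item()
answer_end = torch.argmax(end_probs, dim=-1).item()
score = (start_probs[0, answer_start] * end_probs[0, answer_end]).item()

answer = tokenizer.decode(inputs["input_ids"][0][answer_start:answer_end + 1])
print(f"Answer: {answer}")
print(f"Approximate score: {score:.4f}")

The pipeline additionally masks out question and padding tokens before the softmax, so its reported score can differ slightly from this approximation.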

Using multiple validation sets with keras

I am training a model with keras using the model.fit() method.
I would like to use multiple validation sets that are validated separately after each training epoch, so that I get one loss value per validation set. If possible, they should both be displayed during training and also be returned by the keras.callbacks.History() callback.
I am thinking of something like this:
history = model.fit(train_data, train_targets,
                    epochs=epochs,
                    batch_size=batch_size,
                    validation_data=[
                        (validation_data1, validation_targets1),
                        (validation_data2, validation_targets2)],
                    shuffle=True)
I currently have no idea how to implement this. Is it possible to achieve this by writing my own Callback? Or how else would you approach this problem?
I ended up writing my own Callback based on the History callback to solve the problem. I'm not sure if this is the best approach, but the following Callback records losses and metrics for the training and validation set like the History callback does, as well as losses and metrics for additional validation sets passed to the constructor.
from keras.callbacks import Callback  # or: from tensorflow.keras.callbacks import Callback

class AdditionalValidationSets(Callback):
    def __init__(self, validation_sets, verbose=0, batch_size=None):
        """
        :param validation_sets:
            a list of 3-tuples (validation_data, validation_targets, validation_set_name)
            or 4-tuples (validation_data, validation_targets, sample_weights, validation_set_name)
        :param verbose:
            verbosity mode, 1 or 0
        :param batch_size:
            batch size to be used when evaluating on the additional datasets
        """
        super(AdditionalValidationSets, self).__init__()
        self.validation_sets = validation_sets
        for validation_set in self.validation_sets:
            if len(validation_set) not in [3, 4]:
                raise ValueError()
        self.epoch = []
        self.history = {}
        self.verbose = verbose
        self.batch_size = batch_size

    def on_train_begin(self, logs=None):
        self.epoch = []
        self.history = {}

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        self.epoch.append(epoch)

        # record the same values as History() as well
        for k, v in logs.items():
            self.history.setdefault(k, []).append(v)

        # evaluate on the additional validation sets
        for validation_set in self.validation_sets:
            if len(validation_set) == 3:
                validation_data, validation_targets, validation_set_name = validation_set
                sample_weights = None
            elif len(validation_set) == 4:
                validation_data, validation_targets, sample_weights, validation_set_name = validation_set
            else:
                raise ValueError()

            results = self.model.evaluate(x=validation_data,
                                          y=validation_targets,
                                          verbose=self.verbose,
                                          sample_weight=sample_weights,
                                          batch_size=self.batch_size)

            for metric, result in zip(self.model.metrics_names, results):
                valuename = validation_set_name + '_' + metric
                self.history.setdefault(valuename, []).append(result)
which I am then using like this:
history = AdditionalValidationSets([(validation_data2, validation_targets2, 'val2')])
model.fit(train_data, train_targets,
          epochs=epochs,
          batch_size=batch_size,
          validation_data=(validation_data1, validation_targets1),
          callbacks=[history],
          shuffle=True)
I tested this on TensorFlow 2 and it worked. You can evaluate on as many validation sets as you want at the end of each epoch:
import tensorflow as tf

class MyCustomCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        res_eval_1 = self.model.evaluate(X_test_1, y_test_1, verbose=0)
        res_eval_2 = self.model.evaluate(X_test_2, y_test_2, verbose=0)
        print(res_eval_1)
        print(res_eval_2)
And later:
my_val_callback = MyCustomCallback()
# Your model creation code
model.fit(..., callbacks=[my_val_callback])
Considering the current keras docs, you can pass callbacks to evaluate and evaluate_generator. So you can call evaluate multiple times with different datasets.
I have not tested it, so I would be happy if you shared your experiences with it below.
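As a variation on the callbacks above (a sketch with hypothetical names, not code from the answers), the extra results can also be written into the logs dict in on_epoch_end; depending on callback ordering, they are then picked up by the History callback and appear next to the built-in metrics:

import tensorflow as tf

class ExtraValidation(tf.keras.callbacks.Callback):
    """Evaluate an additional dataset each epoch and merge the results into `logs`.

    `extra_data` is a (features, targets) tuple; `prefix` names the metrics,
    e.g. 'val2_loss'. Both names are illustrative.
    """
    def __init__(self, extra_data, prefix="val2"):
        super().__init__()
        self.extra_data = extra_data
        self.prefix = prefix

    def on_epoch_end(self, epoch, logs=None):
        logs = logs if logs is not None else {}
        x, y = self.extra_data
        # evaluate on the extra validation set and record each metric under a prefixed name
        results = self.model.evaluate(x, y, verbose=0, return_dict=True)
        for name, value in results.items():
            logs[f"{self.prefix}_{name}"] = value

# usage (assuming model, training and validation arrays already exist):
# model.fit(train_data, train_targets,
#           validation_data=(validation_data1, validation_targets1),
#           callbacks=[ExtraValidation((validation_data2, validation_targets2))])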

How to exclude holidays from counting in Rails 4

I am trying to make an app for vacation requests, but I am facing a problem. I have a model called VacationRequest and a VacationRequest view where the result will be shown.
VacationRequest.rb model
def skip_holidays
  count1 = 0
  special_days = Date.parse("2017-05-09", "2017-05-12")
  special_days.each do |sd|
    if ((start_data..end_data) === sd)
      numero = (end_data - start_data).to_i
      numro1 = numero - sd
    else
      numero = (end_data - start_data).to_i
    end
  end
end
VacationRequest show.html.erb
Here I called the method from the model:
@vacation_request.skip_holidays
This is showing errors and is not working. Please help me with that!
My approach to this would be the following:
def skip_holidays
  special_days = ["2017-05-09", "2017-05-12"].map(&:to_date)
  accepted_days = []
  refused_days = []

  (start_data..end_data).each do |requested_date|
    accepted_days << requested_date unless special_days.include?(requested_date)
    refused_days << requested_date if special_days.include?(requested_date)
  end

  accepted_day_count = accepted_days.count
  refused_days_count = refused_days.count
end
This way you iterate over all requested dates (the range) and check whether each date is a special day: if so, refuse it, otherwise accept it.
In the end you can display statistics about accepted and refused dates.
You cannot create a date range by passing multiple arguments to the constructor. Instead, call the constructor twice to create a range:
special_days = Date.parse("2017-05-09") .. Date.parse("2017-05-12")
Or if instead you want only the two dates, do:
special_days = ["2017-05-09", "2017-05-12"].map &:to_date
Date.parse("2017-05-09", "2017-05-12") doesn't create a range; it only returns the first parameter parsed:
#irb
Date.parse("2017-05-09", "2017-05-12")
=> Tue, 09 May 2017
You can do it this way:
initial = Date.parse("2017-05-09")
final = Date.parse("2017-05-12")

(initial..final).each do |date|
  # rules here.
end
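Tying the pieces together, a possible skip_holidays (just a sketch; it assumes start_data and end_data are Date values on the model, as in the original code) that returns the number of requested days excluding the special days:

def skip_holidays
  special_days = ["2017-05-09", "2017-05-12"].map { |d| Date.parse(d) }

  # Count every requested day that is not a special day.
  (start_data..end_data).count { |day| !special_days.include?(day) }
end

In the view this could then be rendered with something like <%= @vacation_request.skip_holidays %>.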

Improve genbank feature addition

I am trying to add more than 70000 new features to a genbank file using biopython.
I have this code:
from Bio import SeqIO
from Bio.SeqFeature import SeqFeature, FeatureLocation
fi = "myoriginal.gbk"
fo = "mynewfile.gbk"
for result in results:
    start = 0
    end = 0
    result = result.split("\t")
    start = int(result[0])
    end = int(result[1])
    for record in SeqIO.parse(fi, "gb"):
        record.features.append(SeqFeature(FeatureLocation(start, end), type="misc_feat"))
        SeqIO.write(record, fo, "gb")
Results is just a list of lists containing the start and end of each one of the features I need to add to the original gbk file.
This solution is extremely costly for my computer and I do not know how to improve the performance. Any good ideas?
You should parse the genbank file just once. Leaving aside what results contains (I do not know exactly, because there are some missing pieces of code in your example), I would guess something like this, modifying your code, would improve performance:
fi = "myoriginal.gbk"
fo = "mynewfile.gbk"
original_records = list(SeqIO.parse(fi, "gb"))
for result in results:
result = result.split("\t")
start = int(result[0])
end = int(result[1])
for record in original_records:
record.features.append(SeqFeature(FeatureLocation(start, end), type = "misc_feat"))
SeqIO.write(record, fo, "gb")
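A further hedged variant of the same idea (a sketch; like the answer above, it assumes every feature should be added to every record): build the SeqFeature objects once, extend each record's feature list in a single call, and write the file a single time.

from Bio import SeqIO
from Bio.SeqFeature import SeqFeature, FeatureLocation

fi = "myoriginal.gbk"
fo = "mynewfile.gbk"

# Build all new features once instead of once per record.
new_features = []
for result in results:
    start, end = result.split("\t")[:2]
    new_features.append(SeqFeature(FeatureLocation(int(start), int(end)), type="misc_feat"))

# Parse once, extend each record once, write once.
records = list(SeqIO.parse(fi, "gb"))
for record in records:
    record.features.extend(new_features)

SeqIO.write(records, fo, "gb")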

Ruby driver tests failing for credit card with luhn algorithm

I worked up working code to check whether a credit card is valid using the Luhn algorithm:
class CreditCard
  def initialize(num)
    @@num_arr = num.to_s.split("")
    raise ArgumentError.new("Please enter exactly 16 digits for the credit card number.") if @@num_arr.length != 16
    @num = num
  end

  def check_card
    final_ans = 0
    i = 0
    while i < @@num_arr.length
      (i % 2 == 0) ? ans = (@@num_arr[i].to_i * 2) : ans = @@num_arr[i].to_i
      if ans > 9
        tens = ans / 10
        ones = ans % 10
        ans = tens + ones
      end
      final_ans += ans
      i += 1
    end
    final_ans % 10 == 0 ? true : false
  end
end
However, when I write driver test code to check it, it doesn't work:
card_1 = CreditCard.new(4563960122001999)
card_2 = CreditCard.new(4563960122001991)
p card_1.check_card
p card_2.check_card
I've been playing around with the code, and I noticed that the driver code works if I do this:
card_1 = CreditCard.new(4563960122001999)
p card_1.check_card
card_2 = CreditCard.new(4563960122001991)
p card_2.check_card
I tried to research why this is happening before posting. Logically, I don't see why the first driver code wouldn't work. Can someone please explain why this is happening?
Thanks in advance!!!
You are using a class variable that starts with @@, which is shared among all instances of CreditCard as well as the class itself (and other related classes). Therefore, the value is overwritten every time you create a new instance. In your first example, the class variable holds the digits of the last instance created, and hence both check_card calls reflect the result for the last instance (card_2).
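A minimal sketch of the fix this answer points to (switch the class variable to an instance variable so each CreditCard keeps its own digits; the Luhn arithmetic is condensed but equivalent):

class CreditCard
  def initialize(num)
    @num_arr = num.to_s.chars
    raise ArgumentError.new("Please enter exactly 16 digits for the credit card number.") if @num_arr.length != 16
  end

  def check_card
    total = @num_arr.each_with_index.sum do |digit, i|
      ans = i.even? ? digit.to_i * 2 : digit.to_i
      # Subtracting 9 is the same as summing the two digits of a doubled value.
      ans > 9 ? ans - 9 : ans
    end
    total % 10 == 0
  end
end

card_1 = CreditCard.new(4563960122001999)
card_2 = CreditCard.new(4563960122001991)
p card_1.check_card
p card_2.check_card

With instance state, the order in which you create the cards and call check_card no longer matters.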
