Does dynamic dispatching decrease the cyclomatic complexity? - ruby

Imagine the following method converting a boolean into an int:
def bool_to_int(bool)
  bool ? 1 : 0
end
bool_to_int(false) # => 0
bool_to_int(true) # => 1
Because of the conditional, the cyclomatic complexity score is two. Does the score drop to one by using refinements in Ruby instead?
module Extension
  refine FalseClass do
    def to_int; 0; end
  end
  refine TrueClass do
    def to_int; 1; end
  end
end
using Extension
false.to_int # => 0
true.to_int # => 1
In other words: does dynamic dispatching reduce the cyclomatic complexity score, or are we just hiding the complexity by letting Ruby do the “heavy lifting”?

Cyclomatic complexity is based on the nodes and edges of the control-flow graph. By switching to dynamic dispatch, you are just expressing the same decision in a different way, so the cyclomatic complexity stays constant.
Edit, based on the comments below:
Yes, the cyclomatic complexity metric is tied tightly to conditional statements, so removing if/else statements does lower the reported score. The number of possible paths, however, stays the same; they are simply more isolated now and no longer driven by explicit conditions. So the answer should be: yes, the measured complexity goes down if you manage to remove the conditional statements, but the number of paths remains constant.
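To picture what changes and what doesn't, here is a minimal sketch of my own (the method names and the per-method counting are illustrative assumptions, not output from any particular metric tool). The refinement-based version moves the decision out of the method body, so each method is a single straight-line path, yet the program as a whole still has two possible executions, now selected by the receiver's class rather than by an if:
module Extension
  refine FalseClass do
    def to_int; 0; end   # the path taken when the receiver is false
  end
  refine TrueClass do
    def to_int; 1; end   # the path taken when the receiver is true
  end
end

using Extension

def branching(bool)
  bool ? 1 : 0    # one method, one decision point: per-method score of 2
end

def dispatching(bool)
  bool.to_int     # one method, no decision point: per-method score of 1
end

dispatching(true)   # => 1
dispatching(false)  # => 0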

Related

Improvement for Selection Sort Algorithm?

I'm a new CS student, currently self-studying intro to programming. I was studying the selection sort algorithm, and I think the change below would make it more efficient. Is this true, or am I missing something?
The change: instead of calling the swap function every time, even when no change was made to the array, we can add a changeMade Boolean variable and use it to call the function only when a change was actually made to the array. Please correct me if I'm wrong.
Declare Integer startScan, i, minValue
Declare Integer minIndex
//the boolean variable
//that could make the algorithm
//more efficient
Declare Boolean changeMade
//Declare the array, declare its size,
//and initialize it
Constant Integer SIZE = 5
Declare Integer array[SIZE] = 1, 4, 8, 2, 5

For startScan = 0 To SIZE-2
    set changeMade = False
    set minValue = array[startScan]
    For i = startScan+1 To SIZE-1
        If array[i] < minValue Then
            set minValue = array[i]
            set minIndex = i
            set changeMade = True //the modification
        End If
    End For
    If changeMade = True Then
        call swap(array[minIndex], array[startScan])
    End If
End For

Module swap(Integer Ref a, Integer Ref b)
    Declare Integer temp
    set temp = a
    set a = b
    set b = temp
End Module
Operations such as swap are essentially ignored when calculating the complexity.
Strictly speaking, all operations are taken into account when calculating time complexity, but because the loops dominate, we ignore the other operations and consider only the dominant ones (for large inputs, the cost of everything else is much smaller than the cost of the dominant operations).
As an example with selection sort: if you take every statement's cost into account, you get a function f(n) = a*n^2 + b*n + c (where a, b and c are constants that depend on the machine architecture). The dominant term is a*n^2, so we say the time complexity of selection sort is O(n^2); we also drop the leading coefficient a, since it does not change the rate of growth.
Have you read about asymptotic analysis and the notations theta, omega, and big O? Have a look at them; they will help you answer your question.
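To make the dominant term concrete, here is a rough Ruby sketch of my own (not code from the question) that counts both kinds of operations in a plain selection sort; the comparison count grows as n^2 while the swap count grows only linearly:
def selection_sort_counts(a)
  a = a.dup
  comparisons = swaps = 0
  (0...a.size - 1).each do |start|
    min_index = start
    ((start + 1)...a.size).each do |i|
      comparisons += 1
      min_index = i if a[i] < a[min_index]
    end
    a[start], a[min_index] = a[min_index], a[start]
    swaps += 1
  end
  [comparisons, swaps]
end

comparisons, swaps = selection_sort_counts(Array.new(1000) { rand(10_000) })
puts "comparisons: #{comparisons}"  # n*(n-1)/2 = 499500 -- the a*n^2 term
puts "swaps:       #{swaps}"        # n-1 = 999          -- a lower-order term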
That sounds like a good idea, but it isn't.
This heuristic only helps for elements that are already at their correct position.
Assume 10% of them are, which is optimistic. While sorting N elements, you will save 0.1*N swaps, but you add a large number of assignments to the flag (up to N^2/2!) and N tests of the flag (conditional instructions are slow).
Unless the swap is really costly, odds are high that the overhead of manipulating the flag will dominate.
It is certainly a better idea to drop the flag and test minIndex != startScan instead, but even then it is not certain that avoiding the swap will make up for the extra comparisons.
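For reference, a minimal Ruby sketch of that flag-free variant (my own translation of the pseudocode, so the names are illustrative):
def selection_sort(a)
  (0...a.size - 1).each do |start_scan|
    min_index = start_scan
    ((start_scan + 1)...a.size).each do |i|
      min_index = i if a[i] < a[min_index]
    end
    # One index comparison per outer iteration instead of a flag
    # maintained inside the inner loop.
    a[start_scan], a[min_index] = a[min_index], a[start_scan] if min_index != start_scan
  end
  a
end

p selection_sort([1, 4, 8, 2, 5])  # => [1, 2, 4, 5, 8]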

Algorithm Determining Time Complexity

For the following algorithm:
Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, determine if s can be segmented into a space-separated sequence of one or more dictionary words. You may assume the dictionary does not contain duplicate words.
For example, given
s = "leetcode",
dict = ["leet", "code"].
Return true because "leetcode" can be segmented as "leet code".
I wrote the following code:
def word_break(s, word_dict)
  can_break?(s, word_dict, memo = {})
end

def can_break?(s, word_dict, memo)
  return true if s.empty?
  (0...s.length).each do |index|
    string_to_pass = s[index + 1...s.length]
    memo[string_to_pass] ||= can_break?(string_to_pass, word_dict, memo)
    return true if word_dict.include?(s[0..index]) && memo[string_to_pass]
  end
  false
end
My understanding of the analysis here is that we have the number of recursive calls scaling linearly with input string size (since we trim down the number of recursive calls using memoization), and each recursive call does N work (to scan through the array). Does this evaluate to O(N^2) time and O(N) space? (note the dictionary is an array, not a hash).
That's a good summary of the rationale; yes, you're correct.
From a theoretical standpoint, with memoization, each character needs to serve as the starting point for m scans, where m is the distance to the nearer end of the string. There are various scalar-level optimizations you can apply, but the complexity remains O(N^2).
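For comparison, here is a bottom-up sketch of the same idea (my own rewrite, not the poster's code). The two nested loops over positions make the O(N^2) structure explicit; with an array dictionary, each include? adds a further factor for the dictionary scan:
def word_break_dp(s, word_dict)
  n = s.length
  # breakable[i] is true when s[0...i] can be segmented into dictionary words
  breakable = Array.new(n + 1, false)
  breakable[0] = true
  (1..n).each do |stop|
    (0...stop).each do |start|
      if breakable[start] && word_dict.include?(s[start...stop])
        breakable[stop] = true
        break
      end
    end
  end
  breakable[n]
end

word_break_dp("leetcode", ["leet", "code"])   # => true
word_break_dp("leetcoder", ["leet", "code"])  # => false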

Why is this imperative style faster than this functional style?

Say we have an array whose equilibrium indices we want to find.
Why is the imperative style faster than the functional style, and what is the logic behind the imperative algorithm?
functional style:
def eq_indices(list)
  list.each_index.select do |i|
    list[0...i].inject(0, :+) == list[i+1..-1].inject(0, :+)
  end
end
imperative style:
def eq_indices(list)
  left, right = 0, list.inject(0, :+)
  equilibrium_indices = []
  list.each_with_index do |val, i|
    right -= val
    equilibrium_indices << i if right == left
    left += val
  end
  equilibrium_indices
end
Because in the functional style the sums of the left and right sides are calculated from scratch for each potential equilibrium index, whereas in the imperative style the total is calculated once and only a single subtraction and addition are performed for each potential equilibrium index.
In this particular instance, the difference comes from the fact that the functional-style solution is O(n^2), while the imperative one is O(2n) = O(n).
In other words, the functional solution runs one loop over the indices, and inside that loop there is another loop to compute the sums. The imperative solution has one loop to assign the sum to right and one to find the indices, but they are not nested.
You already have your answer, but since nobody has spelled this out clearly yet, only mentioned it implicitly, I want to make it explicit:
Why is imperative style faster than functional style
It isn't. The two versions don't implement the same algorithm. The performance difference is due to the difference in algorithms, not the difference in styles.
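To underline that point, here is a sketch of my own (not from the answers): a functional-style version built on the same O(n) idea as the imperative one, computing the running sums once and then selecting the indices where the two sides balance.
def eq_indices_prefix(list)
  total  = list.inject(0, :+)
  # prefix[i] is the sum of list[0...i]
  prefix = list.each_with_object([0]) { |val, acc| acc << acc.last + val }
  list.each_index.select { |i| prefix[i] == total - prefix[i] - list[i] }
end

p eq_indices_prefix([-7, 1, 5, 2, -4, 3, 0])  # => [3, 6]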

Calculating the expected probability that an expression resolves to True

Suppose I have a simple program that simulates a coin toss, with a given probability specified by an expression. It might look something like this:
# This is the probability that you will get heads.
$expr = "rand < 0.5"

def get_result(expr)
  eval(expr)
end

def toss_coin
  if get_result($expr)
    return "Head"
  else
    return "Tail"
  end
end
Now, I also want to tell the user what the probability of getting Head is.
For the given expression
"rand < 0.5"
We can eyeball it and say the probability is 50%, because rand returns a number greater than or equal to 0 and less than 1, and therefore the expression evaluates to true 50% of the time on average.
However, if I decided to provide a rigged coin toss where the expression used to determine the outcome is
"rand < 0.3"
Now, I have a 30% chance of getting Head.
Is it possible to write a method that will take an arbitrary expression (that evaluates to a boolean!) and return the probability that it resolves to true?
def get_expected_probability(expr)
# Returns the probability the `expr` returns true
# `rand < 0.5` would return 0.5
# `rand < 0.3` would return 0.3
# `true` would return 1
# `false` would return 0
end
My guess would be that it would be theoretically possible to write such a method, assuming you restricted yourself to rand and deterministic mathematical functions and had complete knowledge of the system's floating point implementation, etc.
It would be much more straightforward, however, to approximate the probability by executing the expression a large number of times and keeping track of the percentage of times it succeeded.
For simple comparisons to a uniform random number, yes, but in general, no. It depends on the distribution of the expression you're using to determine your boolean, and you could write arbitrarily complex expressions with bizarre distributions. However, it's pretty straightforward to estimate the probability empirically.
Create a Bernoulli (0/1) outcome based on the expression, yielding 1 when the expression is true and 0 when it is false. Generate a large number (n) of them. The long run average of the Bernoulli outcomes will converge to the probability of getting a true. If you call that p-hat and the true value is p, then p-hat should fall within the range p +/- (1.96 * sqrt(p*(1-p)/n)) 95% of the time. You can see from this that the larger the sample size n is, the more precise your estimate is.
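A sketch of that estimate in Ruby (my own illustration, assuming the expression arrives as a string as in the question; following standard practice, p-hat is plugged into the interval in place of the unknown p):
def estimate_probability(expr, n = 100_000)
  successes = n.times.count { eval(expr) }
  p_hat  = successes.to_f / n
  margin = 1.96 * Math.sqrt(p_hat * (1 - p_hat) / n)
  [p_hat, margin]
end

p_hat, margin = estimate_probability("rand < 0.3")
puts "p is roughly #{p_hat} +/- #{margin}"  # e.g. p is roughly 0.2996 +/- 0.0028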
An incredibly slow way of approximating this would be to evaluate the expression a very large number of times and estimate the probability from the observed frequency. The Law of Large Numbers guarantees that, as n approaches infinity, the observed frequency converges to that probability.
$expr = "rand < 0.5"

def get_result(expr)
  eval(expr)
end

n = 1000000
a = Array.new(n)
n.times do |i|
  a[i] = eval($expr)
end
puts a.count(true) / n.to_f
Returned 0.499899 for me.
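Wrapped up as the method the question asks for (still only a Monte Carlo approximation; the default trial count is an arbitrary choice):
def get_expected_probability(expr, trials = 1_000_000)
  trials.times.count { eval(expr) } / trials.to_f
end

get_expected_probability("rand < 0.5")  # => roughly 0.5
get_expected_probability("true")        # => 1.0
get_expected_probability("false")       # => 0.0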

My naive maximal clique finding algorithm runs faster than Bron-Kerbosch's. What's wrong?

In short, my naive code (in Ruby) looks like:
# $seen is a hash to memoize previously seen sets
# $sparse is a hash of usernames to a list of neighboring usernames
# $sets is the list of output clusters
$seen = {}
$sets = []

def subgraph(set, adj)
  hash = (set + adj).sort
  return if $seen[hash]
  $sets.push set.sort.join(", ") if adj.empty? and set.size > 2
  adj.each { |node| subgraph(set + [node], $sparse[node] & adj) }
  $seen[hash] = true
end

$sparse.keys.each do |vertex|
  subgraph([vertex], $sparse[vertex])
end
And my Bron-Kerbosch implementation:
def bron_kerbosch(set, points, exclude)
  $sets.push set.sort.join(', ') if set.size > 2 and exclude.empty? and points.empty?
  points.each_with_index do |vertex, i|
    points[i] = nil
    bron_kerbosch(set + [vertex],
                  points & $sparse[vertex],
                  exclude & $sparse[vertex])
    exclude.push vertex
  end
end
bron_kerbosch [], $sparse.keys, []
I also implemented pivoting and degeneracy ordering, which cut down on bron_kerbosch execution time, but not enough to overtake my initial solution. It seems wrong that this is the case; what algorithmic insight am I missing? Here is a writeup with more detail if you need to see fully working code. I've tested this on pseudo-random sets up to a million or so edges in size.
I don't know how you generate the random graphs for your tests, but I suppose you use a function that generates numbers according to a uniform distribution, and thus you obtain a graph that is very homogeneous. That is a common problem when testing algorithms on graphs: it is very difficult to create good test cases (it is often as hard as solving the original problem).
The max-clique problem is a well-known NP-hard problem, and both algorithms (the naive one and Bron-Kerbosch) have the same complexity, so we can't expect a global improvement on all test cases, only an improvement on some particular cases. But because you used a uniform distribution to generate your graphs, you don't hit those particular cases.
That's why the performance of both algorithms is very similar on your data, and because the Bron-Kerbosch algorithm is a little more complex than the naive one, the naive one is faster.
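One way to build a less homogeneous test case (a sketch of my own, not something from the answer) is a sparse uniform background graph with a planted clique, so that the kind of structure a maximal-clique search is supposed to exploit actually appears in the data:
require 'set'

def planted_clique_graph(n, edge_prob, clique_size)
  adjacency = Hash.new { |h, k| h[k] = Set.new }
  add_edge = ->(u, v) { adjacency[u] << v; adjacency[v] << u }

  # Uniform sparse background edges.
  (0...n).each do |u|
    (u + 1...n).each { |v| add_edge.call(u, v) if rand < edge_prob }
  end

  # Plant a clique on a random subset of vertices.
  clique = (0...n).to_a.sample(clique_size)
  clique.combination(2) { |u, v| add_edge.call(u, v) }

  [adjacency, clique]
end

graph, clique = planted_clique_graph(200, 0.02, 12)
puts "planted clique: #{clique.sort.inspect}"

Graphs like this, or real social-network data, may separate the two implementations more clearly than uniform random graphs do.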
