Permutations of strings takes too long to solve - ruby

I'm creating an array of permutated and unique letters in a string, only to sort them alphabetically and find the middle element in the set.
def middle_permutation(string)
length = string.length
permutation_set = string.split("").permutation(length).to_a.map{|item| item.join}.sort
permutation_set.length.even? ? permutation_set[(permutation_set.length)/2-1] : permutation_set[(permutation_set.length/2)+1]
end
For example:
middle_permutation("zxcvbnmasd") should equal "mzxvsndcba"
Even for small strings (N >=10), the calculations take pretty long to finish, and I can forget about anything double that; is there a quicker way?

I'm assuming the letters are unique, as in the OP's question.
Sort
Pluck the middle letter of the sorted string (rounded down). This is the first letter of the middle permutation.
If the original list had an even number of letters, the rest of the permutation is the reverse sort of the remaining letters.
If not, take the middle letter again. Now the rest of the result is the reverse sort of the remaining letters.
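The steps above could be sketched in Ruby roughly as follows. This is only a minimal sketch that assumes all letters are unique; the method name `middle_permutation_direct` is made up for illustration and does not come from the question.

```ruby
# Hypothetical helper (the name is illustrative): builds the middle
# permutation directly from the sorted letters, assuming unique letters.
def middle_permutation_direct(string)
  letters = string.chars.sort
  return "" if letters.empty?

  # Pluck the middle letter (rounded down); it starts the result.
  result = letters.delete_at((letters.size - 1) / 2)
  # For odd-length input, pluck the middle letter of what remains as well.
  if string.size.odd? && !letters.empty?
    result << letters.delete_at((letters.size - 1) / 2)
  end
  # The rest is the remaining letters in reverse sorted order.
  result + letters.reverse.join
end

middle_permutation_direct("abcd")       #=> "bdca"
middle_permutation_direct("abcde")      #=> "cbeda"
middle_permutation_direct("zxcvbnmasd") #=> "mzxvsndcba"
```

Because it only sorts the letters once, this runs in O(N log N) rather than enumerating N! permutations.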

The method below returns the desired permutation directly, without iterating through permutations.
The asker has stated that the string contains no duplicated letters, which is a requirement for this method. I assume the characters of the string are sorted. If they are not, the creation of a sorted string would be the first step:
str = "ebadc".chars.sort.join
#=> "abcde"
Code
def mid_perm(str)
return mid_perm_even_length_strings(str) if str.size.even?
first_char_index = str.size/2
str[first_char_index] << mid_perm_even_length_strings(str[0,first_char_index] +
str[first_char_index+1..-1])
end
def mid_perm_even_length_strings(str)
first_char_index = str.size/2-1
str[first_char_index] + (str[0,first_char_index] + str[first_char_index+1..-1]).reverse
end
Examples
mid_perm 'abcd'
#=> "bdca"
mid_perm 'abcde'
#=> "cbeda"
mid_perm 'abcdefghijklmnopqrstuvwxyz'
#=> "mzyxwvutsrqponlkjihgfedcba"
Explanation
Let's start by defining a method to produce permutations of the letters of a string.
def perms(str)
str.chars.permutation(str.size).map(&:join)
end
Strings containing an even number of characters
Consider
a = perms "abcd"
#=> ["abcd", "abdc", "acbd", "acdb", "adbc", "adcb",
# "bacd", "badc", "bcad", "bcda", "bdac", "bdca",
# "cabd", "cadb", "cbad", "cbda", "cdab", "cdba",
# "dabc", "dacb", "dbac", "dbca", "dcab", "dcba"]
a contains 4! #=> 4*3*2 => 24 elements, 4 being the length of the string.
Notice that since the characters in perms' argument are sorted, the array returned is also sorted (see footnote 1).
a == a.sort #=>true
As a.size #=> 24, the "middle" element is either a[11] #=> "bdca" or a[12] #=> "cabd" (where 11 = (24-1)/2 and 12 = 24/2), depending on how we want to round. The question stipulates that, for even-length strings, we are to round down, so that would be "bdca".
Now let's slice a into str.size equal arrays, each containing a.size/str.size #=> 24/4 => 6 elements:
b = a.each_slice(a.size/str.size).to_a
#=> [["abcd", "abdc", "acbd", "acdb", "adbc", "adcb"],
# ["bacd", "badc", "bcad", "bcda", "bdac", "bdca"],
# ["cabd", "cadb", "cbad", "cbda", "cdab", "cdba"],
# ["dabc", "dacb", "dbac", "dbca", "dcab", "dcba"]]
The desired element is therefore
b[str.size/2-1][-1]
#=> "bdca"
This value can be computed more directly as follows.
first_char_index = str.size/2-1
#=> 1
first_char = str[first_char_index]
#=> "b"
remaining_chars = (str[0,first_char_index] + str[first_char_index+1..-1]).reverse
#=> "dca"
first_char + remaining_chars
#=> "bdca"
The same logic applies to all strings having an even number of characters. We therefore can write the method mid_perm_even_length_strings shown in the Code section above.
For example (for a 12-character string)
mid_perm_even_length_strings 'abcdefghijkl'
#=> "flkjihgedcba"
Strings containing an odd number of characters
Now consider
str = "abcde"
a = perms str
#=> ["abcde", "abced", "abdce", "abdec", "abecd", "abedc",
# "acbde", "acbed", "acdbe", "acdeb", "acebd", "acedb",
# "adbce", "adbec", "adcbe", "adceb", "adebc", "adecb",
# "aebcd", "aebdc", "aecbd", "aecdb", "aedbc", "aedcb",
# "bacde", "baced", "badce", "badec", "baecd",..., "bedca",
# "cabde", "cabed", "cadbe", "cadeb", "caebd", "caedb",
# "cbade", "cbaed", "cbdae", "cbdea", "cbead", "cbeda",
# "cdabe", "cdaeb", "cdbae", "cdbea", "cdeab", "cdeba",
# "ceabd", "ceadb", "cebad", "cebda", "cedab", "cedba",
# "dabce", "dabec", "dacbe", "daceb", "daebc",..., "decba",
# "eabcd", "eabdc", "eacbd", "eacdb", "eadbc",..., "edcba"]
Here the permutation contains 5! #=> 120 elements, in 5 blocks of 24. (Again, a.each_cons(2).all? { |s1,s2| s1 < s2 } #=> true.)
The middle element of a is clearly the middle element of the block of elements that begin with
str[str.size/2] #=> "c"
That block would be the array
b = a.each_slice(a.size/str.size).to_a[str.size/2]
#=> ["cabde", "cabed", "cadbe", "cadeb", "caebd", "caedb",
# "cbade", "cbaed", "cbdae", "cbdea", "cbead", "cbeda",
# "cdabe", "cdaeb", "cdbae", "cdbea", "cdeab", "cdeba",
# "ceabd", "ceadb", "cebad", "cebda", "cedab", "cedba"]
which would be 'c' plus the middle element of the array
["abde", "abed", "adbe", "adeb", "aebd", "aedb",
"bade", "baed", "bdae", "bdea", "bead", "beda",
"dabe", "daeb", "dbae", "dbea", "deab", "deba",
"eabd", "eadb", "ebad", "ebda", "edab", "edba"]
That array is merely the permutations of the string "abde". Since that string contains an even number of characters, its middle element is
mid_perm_even_length_strings 'abde'
#=> "beda"
It follows that the middle element of the permutations of the letters of "abcde" is
'c' + 'beda'
#=> "cbeda"
This clearly applies to all strings containing an odd number of characters.
1. The doc for Array#permutation states, "The implementation makes no guarantees about the order in which the permutations are yielded." We therefore might need to tack .sort onto the end of the operative line of perms, but with Ruby v2.4 (and, I suspect, earlier versions) that is in fact not necessary here.
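One way to gain confidence in the argument above is to compare the direct construction with a brute-force search on short strings. The sketch below restates the construction compactly under the same assumptions (sorted, unique characters); the method names `brute_force_middle` and `direct_middle` are illustrative only.

```ruby
# Brute force: sort every permutation and take the middle one (rounding down).
def brute_force_middle(str)
  all = str.chars.permutation.map(&:join).sort
  all[(all.size - 1) / 2]
end

# Direct construction, restated compactly; expects sorted, unique characters.
def direct_middle(str)
  return str if str.size < 2
  i = (str.size - 1) / 2
  rest = str[0, i] + str[i + 1..-1]
  # Even length: reverse the rest. Odd length: recurse on the even-length rest.
  str.size.even? ? str[i] + rest.reverse : str[i] + direct_middle(rest)
end

# Lengths for which the two methods disagree (none are expected).
mismatches = (1..7).reject do |n|
  s = ('a'..'z').first(n).join
  brute_force_middle(s) == direct_middle(s)
end
mismatches #=> []
```

The explicit `.sort` in `brute_force_middle` sidesteps the ordering caveat mentioned in the footnote.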

I was able to compact it like this:
def middle_permutation(string)
list = string.chars.permutation.map(&:join).sort
list[list.length / 2 - (list.length.even? ? 1 : 0)]
end
Which yields:
middle_permutation('zxcvbnmasd')
# => "mzxvsndcba"

You don't need to generate all permutations. Just find overall number of permutations as PN = N! where N is string (of different chars) length and calculate only needed PN/2-th permutation by its number - for example, using this approach
public static int[] perm(int n, int k)
{
int i, ind, m=k;
int[] permuted = new int[n];
int[] elems = new int[n];
for(i=0;i<n;i++) elems[i]=i;
for(i=0;i<n;i++)
{
ind=m%(n-i);
m=m/(n-i);
permuted[i]=elems[ind];
elems[ind]=elems[n-i-1];
}
return permuted;
}
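As far as I can tell, the routine above numbers permutations in its own order rather than alphabetically (for n = 3, k = 1 it yields [1, 0, 2], whereas the second permutation in sorted order is [0, 2, 1]). A sketch in Ruby of the same idea with alphabetical numbering, using the factorial number system, might look like this; the method names are made up for illustration and unique characters are assumed.

```ruby
# Returns the k-th (0-based) permutation of chars in alphabetical order,
# assuming all characters are unique.
def kth_permutation(chars, k)
  pool = chars.sort
  result = +""
  fact = (1..pool.size).reduce(1, :*) # pool.size!
  until pool.empty?
    fact /= pool.size                 # permutations per leading character
    index, k = k.divmod(fact)
    result << pool.delete_at(index)
  end
  result
end

# The middle permutation (rounding down) has 0-based rank (N! - 1) / 2.
def middle_permutation_by_rank(string)
  total = (1..string.size).reduce(1, :*)
  kth_permutation(string.chars, (total - 1) / 2)
end

middle_permutation_by_rank("abcd")       #=> "bdca"
middle_permutation_by_rank("zxcvbnmasd") #=> "mzxvsndcba"
```

Each character is chosen with one division, so no permutation other than the requested one is ever built.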

It turns out there are two cases: odd-length strings and even-length strings.
For odd-length strings, take the middle element of the sorted array and the one before it, in that order. That leaves two remaining arrays, one on the left and one on the right, both alphabetically sorted. Append the elements of the right array, starting with its last element, then do the same for the left array.
For even-length strings, do the same but take only one character in the first step: the (N/2)th element.
Here's my solution:
def middle_permutation(string)
string_array = string.chars.sort
mid_string = []
length = string.length
if length.even?
mid_string << string_array[length/2-1]
string_array.delete_at(length/2-1)
(mid_string << string_array.reverse).flatten.join
else
mid_string << string_array[(length/2)-1..length/2].reverse
string_array.slice!((length/2)-1, 2)
(mid_string << string_array.reverse).flatten.join
end
end

Related

Using regular expressions to multiply and sum numeric string characters contained in a hash of mixed numeric strings

Without getting too much into biology, Proteins are made of Amino Acids. Each of the 20 Amino Acids that make up Proteins are represented by characters in a sequence. Each Amino Acid char has a different chemical formula, which I represent as strings. For example, "M" has a formula of "C5H11NO2S"
Given the 20 different formulas (and the varying frequency of each amino acid chars in a protein sequence) I want to compile all 20 of them into a single formula that will yield the total formula for the protein.
So first: multiply each formula by the frequency of its char in the sequence
Second : sum together all multiplied formulas into one formula.
To accomplish this, I first tried multiplying each amino acid char frequency in the sequence by the numbers in the chemical formula. I did this using .tally
sequence ="MGAAARTLRLALGLLLLATLLRPADACSCSPVHPQQAFCNADVVIRAKAVSEKEVDSGNDIYGNPIKRIQYEIKQIKMFKGPEKDIEFI"
sequence.chars.tally #=> {"M"=>2, "G"=>5, "A"=>11, "R"=>5, "T"=>2, "L"=>9, "P"=>5, "D"=>5, "C"=>3, "S"=>4, "V"=>5, "H"=>1, "Q"=>4, "F"=>3, "N"=>3, "I"=>8, "K"=>7, "E"=>5, "Y"=>2}
Then, I listed all the amino acids chars and formulas into a hash
hash_of_formulas = {"A"=>"C3H7NO2", "R"=>"C6H14N4O2", "N"=>"C4H8N2O3", "D"=>"C4H7NO4", "C"=>"C3H7NO2S", "E"=>"C5H9NO4", "Q"=>"C5H10N2O3", "G"=>"C2H5NO2", "H"=>"C6H9N3O2", "I"=>"C6H13NO2", "L"=>"C6H13NO2", "K"=>"C6H14N2O2", "M"=>"C5H11NO2S", "F"=>"C9H11NO2", "P"=>"C5H9NO2", "S"=>"C3H7NO3", "T"=>"C4H9NO3", "W"=>"C11H12N2O2", "Y"=>"C9H11NO3", "V"=>"C5H11NO2"}
An example of what the process for my overall goal is:
In the sequence, "M" occurs twice, so "C5H11NO2S" becomes "C10H22N2O4S2". "C", which has the formula "C3H7NO2S", occurs 3 times in the sequence, so "C3H7NO2S" becomes "C9H21N3O6S3".
Summing "C10H22N2O4S2" and "C9H21N3O6S3" then yields "C19H43N5O10S5".
How can I repeat the process of multiplying each formula by its frequency and then summing together all multiplied formulas?
I know that I could use regex for multiplying a formula by its frequency for an individual string using
formula_multiplied_by_frequency = "C5H11NO2S".gsub(/\d+/) { |x| x.to_i * 4}
But I'm not sure of any methods to use regex on strings embedded within hashes
If I understand correctly, you want to compute the total formula for a given protein sequence. Here's how I'd do it:
NUCLEOTIDES = {"A"=>"C3H7NO2", "R"=>"C6H14N4O2", "N"=>"C4H8N2O3", "D"=>"C4H7NO4", "C"=>"C3H7NO2S", "E"=>"C5H9NO4", "Q"=>"C5H10N2O3", "G"=>"C2H5NO2", "H"=>"C6H9N3O2", "I"=>"C6H13NO2", "L"=>"C6H13NO2", "K"=>"C6H14N2O2", "M"=>"C5H11NO2S", "F"=>"C9H11NO2", "P"=>"C5H9NO2", "S"=>"C3H7NO3", "T"=>"C4H9NO3", "W"=>"C11H12N2O2", "Y"=>"C9H11NO3", "V"=>"C5H11NO2"}
NUCLEOTIDE_COMPOSITIONS = NUCLEOTIDES.each_with_object({}) { |(nucleotide, formula), compositions|
compositions[nucleotide] = formula.scan(/([A-Z][a-z]*)(\d*)/).map { |element, count| [element, count.empty? ? 1 : count.to_i] }.to_h
}
def formula(sequence)
sequence.each_char.with_object(Hash.new(0)) { |nucleotide, final_counts|
NUCLEOTIDE_COMPOSITIONS[nucleotide].each { |element, element_count|
final_counts[element] += element_count
}
}.map { |element, element_count|
"#{element}#{element_count == 1 ? "" : element_count}"
}.join
end
sequence = "MGAAARTLRLALGLLLLATLLRPADACSCSPVHPQQAFCNADVVIRAKAVSEKEVDSGNDIYGNPIKRIQYEIKQIKMFKGPEKDIEFI"
p formula(sequence)
# => "C434H888N120O213S5"
You can't use regexp to multiply things. You can use it to parse a formula, but then it's on you and regular Ruby to do the math. The first job is to prepare a composition lookup by breaking down each nucleotide formula. Once we have a composition hash for each nucleotide, we can iterate over a nucleotide sequence, and add up all the elements of each nucleotide.
BTW, tally is not particularly useful here, since tally will need to iterate over the sequence, and then you have to iterate over tally anyway — and there is no aggregate operation going on that can't be done going over each letter independently.
EDIT: I probably made the regexp slightly more complicated than it needs to be, but it should parse stuff like CuSO4 correctly. (I don't know if it's an accident or not that all nucleotides are only composed of elements with a single-character symbol... :P )
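To make the "parse with a regexp, then do the math in plain Ruby" idea concrete, here is a minimal sketch that reproduces the worked example from the question ("M" twice plus "C" three times). The helper names `parse_formula`, `scale`, `add` and `to_formula` are hypothetical and not part of the answer above.

```ruby
# Hypothetical helpers (names are illustrative, not from the answer).

# "C5H11NO2S" -> {"C"=>5, "H"=>11, "N"=>1, "O"=>2, "S"=>1}
def parse_formula(formula)
  formula.scan(/([A-Z][a-z]*)(\d*)/)
         .map { |element, n| [element, n.empty? ? 1 : n.to_i] }
         .to_h
end

# Multiply every atom count by how often the amino acid occurs.
def scale(composition, times)
  composition.transform_values { |n| n * times }
end

# Add two compositions element by element.
def add(a, b)
  a.merge(b) { |_element, x, y| x + y }
end

# {"C"=>19, "H"=>43} -> "C19H43" (a count of 1 is left implicit)
def to_formula(composition)
  composition.map { |element, n| n == 1 ? element : "#{element}#{n}" }.join
end

m = scale(parse_formula("C5H11NO2S"), 2)
c = scale(parse_formula("C3H7NO2S"), 3)
to_formula(m)         #=> "C10H22N2O4S2"
to_formula(c)         #=> "C9H21N3O6S3"
to_formula(add(m, c)) #=> "C19H43N5O10S5"
```

This follows the tally-then-multiply route the question describes; the answer's per-character accumulation gives the same totals.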
Givens
We are given a string representing a protein comprised of amino acids:
sequence = "MGAAARTLRLALGLLLLATLLRPADACSCSPVHPQQAFCNADVVIR" +
"AKAVSEKEVDSGNDIYGNPIKRIQYEIKQIKMFKGPEKDIEFI"
and a hash that contains the formulas of amino acids:
formulas = {
"A"=>"C3H7NO2", "R"=>"C6H14N4O2", "N"=>"C4H8N2O3", "D"=>"C4H7NO4",
"C"=>"C3H7NO2S", "E"=>"C5H9NO4", "Q"=>"C5H10N2O3", "G"=>"C2H5NO2",
"H"=>"C6H9N3O2", "I"=>"C6H13NO2", "L"=>"C6H13NO2", "K"=>"C6H14N2O2",
"M"=>"C5H11NO2S", "F"=>"C9H11NO2", "P"=>"C5H9NO2", "S"=>"C3H7NO3",
"T"=>"C4H9NO3", "W"=>"C11H12N2O2", "Y"=>"C9H11NO3", "V"=>"C5H11NO2"
}
Obtain counts of atoms in each amino acid
As a first step we can calculate the numbers of each atom in each amino acid:
counts = formulas.transform_values do |s|
s.scan(/[CHNOS]\d*/).
each_with_object({}) do |s,h|
h[s[0]] = s.size == 1 ? 1 : s[1..-1].to_i
end
end
#=> {"A"=>{"C"=>3, "H"=>7, "N"=>1, "O"=>2},
# "R"=>{"C"=>6, "H"=>14, "N"=>4, "O"=>2},
# ...
# "M"=>{"C"=>5, "H"=>11, "N"=>1, "O"=>2, "S"=>1}
# ...
# "V"=>{"C"=>5, "H"=>11, "N"=>1, "O"=>2}}
Compute formula for protein
Then it's simply:
def protein_formula(sequence, counts)
sequence.each_char.
with_object("C"=>0, "H"=>0, "N"=>0, "O"=>0, "S"=>0) do |c,h|
counts[c].each { |aa,cnt| h[aa] += cnt }
end.each_with_object('') { |(aa,nbr),s| s << "#{aa}#{nbr}" }
end
protein_formula(sequence, counts)
#=> "C434H888N120O213S5"
Another example:
protein_formula("MCMPCFTTDHQMARKCDDCCGGKGRGKCYGPQCLCR", counts)
#=> "C158H326N52O83S11"
Explanation of calculation of counts
This calculation:
counts = formulas.transform_values do |s|
s.scan(/[CHNOS]\d*/).each_with_object({}) do |s,h|
h[s[0]] = s.size == 1 ? 1 : s[1..-1].to_i
end
end
uses the method Hash#transform_values. It will return a hash having the same keys as the hash formulas, with the values of those keys in formulas modified by transform_values's block. For example, formulas["A"] ("C3H7NO2") is "transformed" to the hash {"C"=>3, "H"=>7, "N"=>1, "O"=>2} in the hash that is returned, counts.
transform_values passes each value of formulas to the block and sets the block variable equal to it. The first value passed is "C3H7NO2", so it sets:
s = "C3H7NO2"
We can write the block calculation more simply:
h = {}
s.scan(/[CHNOS]\d*/).each do |s|
h[s[0]] = s.size == 1 ? 1 : s[1..-1].to_i
end
h
(Once you understand this calculation, which I explain below, see Enumerable#each_with_object to understand why I used that method in my solution.)
After initializing h to an empty hash, the following calculations are performed:
h = {}
a = s.scan(/[CHNOS]\d*/)
#=> ["C3", "H7", "N", "O2"]
a is computed using String#scan with the regular expression /[CHNOS]\d*/. That regular expression, or regex, matches exactly one character in the character class [CHNOS] followed by zero or more (*) digits (\d). It therefore separates the string "C3H7NO2" into the substrings that are returned in the array shown under the calculation of a above. Continuing,
a.each do |s|
h[s[0]] = s.size == 1 ? 1 : s[1..-1].to_i
end
changes h to the following:
h #=> {"C"=>3, "H"=>7, "N"=>1, "O"=>2}
The block variable s is initially set equal to the first element of a that is passed to each's block:
s = "C3"
then we compute:
h[s[0]] = s.size == 1 ? 1 : s[1..-1].to_i
h["C"] = 2 == 1 ? 1 : "3".to_i
= false ? 1 : 3
= 3
This is repeated for each element of a.
Explanation of construction of formula for the protein
We can simplify the following code (see footnote 1):
sequence.each_char.with_object("C"=>0, "H"=>0, "N"=>0, "O"=>0, "S"=>0) do |c,h|
counts[c].each { |aa,cnt| h[aa] += cnt }
end.each_with_object('') { |(aa,nbr),s| s << "#{aa}#{nbr}" }
to more or less the following:
h = { "C"=>0, "H"=>0, "N"=>0, "O"=>0, "S"=>0 }
ch = sequence.chars
#=> ["M", "G", "A",..., "F", "I"]
ch.each do |c|
counts[c].each { |aa,cnt| h[aa] += cnt }
end
h #=> {"C"=>434, "H"=>888, "N"=>120, "O"=>213, "S"=>5}
When the first value of ch ("M") is passed to each's block (when h = { "C"=>0, "H"=>0, "N"=>0, "O"=>0, "S"=>0 }), the following calculations are performed:
c = "M"
g = counts[c]
#=> {"C"=>5, "H"=>11, "N"=>1, "O"=>2, "S"=>1}
g.each { |aa,cnt| h[aa] += cnt }
h #=> {"C"=>5, "H"=>11, "N"=>1, "O"=>2, "S"=>1}
Lastly, (when h #=> {"C"=>434, "H"=>888, "N"=>120, "O"=>213, "S"=>5})
s = ''
h.each { |aa,nbr| s << "#{aa}#{nbr}" }
s #=> "C434H888N120O213S5"
When aa = "C" and nbr = 434,
"#{aa}#{nbr}"
#=> "C434"
is appended to the string s.
1. ("C"=>0, "H"=>0, "N"=>0, "O"=>0, "S"=>0) is shorthand for ({"C"=>0, "H"=>0, "N"=>0, "O"=>0, "S"=>0}).
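As a possible variation on the accumulator described above, a hash with a default value of 0 avoids listing every element up front. The sketch below is only an illustration; `sum_atoms` is a made-up name and the `counts` hash is a two-entry excerpt of the one computed earlier.

```ruby
# Hypothetical variant: Hash.new(0) makes missing keys read as 0,
# so no element has to be pre-seeded in the accumulator.
def sum_atoms(sequence, counts)
  sequence.each_char.with_object(Hash.new(0)) do |c, totals|
    counts[c].each { |atom, cnt| totals[atom] += cnt }
  end
end

# A two-entry excerpt of the counts hash, for illustration.
counts = {
  "M" => { "C" => 5, "H" => 11, "N" => 1, "O" => 2, "S" => 1 },
  "G" => { "C" => 2, "H" => 5,  "N" => 1, "O" => 2 }
}

sum_atoms("MGM", counts)
#=> {"C"=>12, "H"=>27, "N"=>3, "O"=>6, "S"=>2}
```

One trade-off: elements appear in the result in the order they are first encountered, rather than in a fixed C, H, N, O, S order.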

Ruby - How to write a method that returns an array of strings?

I've tried different ways and this is probably the closest that I got to it. I am trying to write a method that takes in an array of strings and returns it containing the strings that are at least 5 characters long and end with "y".
I'm a beginner and this is my second problem I've come across with, and I've tried multiple if statements and using a while loop, however I could not get to it and now this is where I am at. Thank you!
def phrases(arr1, arr2)
arr1 = ["funny", "tidy", "fish", "foogiliously"]
arr2 = ["happily", "lovely", "hello", "multivitaminly"]
if (arr1.length > 5 && arr1.length == "y")
return arr1
elsif (arr2.length > 5 && arr2.length == "y")
return arr2
end
end
puts phrases(["funny", "tidy", "fish", "foogiliously"])
puts phrases(["happily", "lovely", "hello", "multivitaminly"])
If I'm understanding your question correctly, you want to return a subset of the passed in array matching your conditions (length ≥ 5 and last character = 'y'). In that case:
def phrases(words)
words.grep(/.{4}y\z/)
end
What that regex does:
.{4} means 4 of any character
y is the letter y
\z is the end of the string, so we don't match in the middle of a long word
The docs for Enumerable#grep are here (an Array is an Enumerable).
Output:
> phrases(["funny", "tidy", "fish", "foogiliously"])
=> ["funny", "foogiliously"]
> phrases(["happily", "lovely", "hello", "multivitaminly"])
=> ["happily", "lovely", "multivitaminly"]
If you only want word characters, rather than any character, you'd use this regex instead: /\A\w{4,}y\z/. In that case, \A means the start of the string, and \w{4,} means at least 4 word characters.
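The difference between the two patterns shows up as soon as a string contains a non-word character such as a space. A small sketch, with method names invented for illustration:

```ruby
# Any four characters followed by a final "y".
def phrases_any_char(words)
  words.grep(/.{4}y\z/)
end

# Only word characters (letters, digits, underscore) before the final "y".
def phrases_word_chars(words)
  words.grep(/\A\w{4,}y\z/)
end

words = ["funny", "tidy", "oh my", "foogiliously"]
phrases_any_char(words)   #=> ["funny", "oh my", "foogiliously"]
phrases_word_chars(words) #=> ["funny", "foogiliously"]
```

"tidy" is rejected by both patterns because it has only three characters before the final "y".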
If, when given an array and inclusion criterion, one wishes to construct an array that contains those elements of the first array that satisfy the inclusion criterion, one generally uses the method Array#select or Array#reject, whichever is more convenient.
Suppose arr is a variable that holds the given array and include_element? is a method that takes one argument, an element of arr, and returns true or false, depending on whether the inclusion criterion is satisfied for that element. For example, say the array comprises the integers 1 through 6 and the inclusion criterion is that the number is even (2, 4 and 6). We could write:
arr = [1,2,3,4,5,6]
def include_element?(e)
e.even?
end
include_element?(2)
#=> true
include_element?(3)
#=> false
arr.select { |e| include_element?(e) }
#=> [2, 4, 6]
The method include_element? is so short we probably would substitute it out and just write:
arr.select { |e| e.even? }
Array#select passes each element of its receiver, arr, to select's block, assigns the block variable e to that value and evaluates the expression in the block (which could be many lines, of course). Here that expression is just e.even?, which returns true or false. (See Integer#even? and Integer#odd?.)
If that expression evaluates as a truthy value, the element e is to be included in the array that is returned; if it evaluates as a falsy value, e is not to be included. Falsy values (logical false) are nil and false; truthy values (logical true) are all other Ruby objects, which of course includes true.
Notice that we could instead write:
arr.reject { |e| e.odd? }
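To illustrate the truthy/falsy rule described above, here is a small sketch in which the block simply returns the element itself; the array contents are arbitrary examples.

```ruby
values = [1, nil, false, 0, "", :a]

# select keeps every element for which the block returns a truthy value.
# Note that 0 and "" are truthy in Ruby; only nil and false are falsy.
values.select { |e| e } #=> [1, 0, "", :a]

# reject keeps the elements for which the block returns a falsy value.
values.reject { |e| e } #=> [nil, false]
```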
Sometimes the inclusion criterion consists of a compound expression. For example, suppose the inclusion criterion were to keep elements of arr that are both even numbers and are at least 4. We would write:
arr.select { |e| e.even? && e >= 4 }
#=> [4, 6]
With other criteria we might write:
arr.select { |e| e.even? || e >= 4 }
#=> [2, 4, 5, 6]
or
arr.select { |e| e < 2 || (e > 3 && e < 6) }
#=> [1, 4, 5]
&& (logical 'and') and || (logical 'or') are operators (search "operator expressions"). As explained at the link, most Ruby operators are actually methods, but these two are among a few that are not.
Your problem now reduces to the following:
arr.select { |str| <length of str is at least 5> && <last character of str is 'y'> }
You should be able to supply code for the <...> bits.
I think you are trying to write a function that works on a single array at a time. Also, you are taking in an array, and retaining only those elements that satisfy your conditions: at least 5 characters long, and ends with y. This is a filtering operation. Read about the methods available for ruby's Array class here.
def phrases(array)
...
filtered_array
end
Now the condition you are using is this arr1.length > 5 && arr1.length == "y".
The first half should check if the string length is at least 5, not the array length itself. The second half is an indexing operation, and your code for that is incorrect. Basically you are checking if the last character in the string is y.
Usually strings are indexed in this manner: string[index]. In your case you can use string[-1]=='y' or string[string.length - 1]=='y'. This is because arrays and strings are zero indexed in ruby. The first element has an index of 0, the second has an index of 1, and the last one, therefore, will have an index of length-1. If you use negative indexes then the array is indexed from the end, so string[-1] is a quick way to get to the last element.
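The indexing rules just described can be checked quickly in irb. A short sketch, using words from the question as sample data:

```ruby
word = "funny"

word[0]               #=> "f"  (first character, index 0)
word[word.length - 1] #=> "y"  (last character, index length-1)
word[-1]              #=> "y"  (negative indexes count from the end)

# The two halves of the condition, applied to single strings:
word.length >= 5 && word[-1] == 'y'     #=> true
"tidy".length >= 5 && "tidy"[-1] == 'y' #=> false ("tidy" has only 4 characters)
```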
Considering this, the function will take the following structure:
def phrases(array)
  filtered_array = [] # an empty array
  # loop through the input array
  array.each do |element|
    # keep the element if it is at least 5 characters long and ends with 'y'
    filtered_array << element if element.length >= 5 && element[-1] == 'y'
  end
  # once the loop is done, return the filtered array
  filtered_array
end
Read about ruby arrays, and the methods push, filter and select in the above linked documentation, to get a better idea. I'd also recommend the Codecademy ruby tutorial.
Edit: Both halves of the condition are incorrect. I had overlooked a mistake in my earlier answer. arr1.length refers to the length of the array. You want to check the length of each string in the array. So in your for loop you should check the length of the loop variable, if that is greater than 5.
You may want to spend some time reading about the methods in the core library, especially String#end_with? and Enumerable#select. You could then write a method that'd contain something like this:
['abc', 'qwerty', 'asdfghjk', 'y'].select{|s| s.length >= 5}.select{|s| s.end_with? 'y'}
#=> ["qwerty"]

Finding Longest Substring No Duplicates - Help Optimizing Code [Ruby]

So I've been trying to solve a Leetcode Question, "Given a string, find the length of the longest substring without repeating characters."
For example
Input: "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.
Currently I optimized my algorithm when it comes to figuring out if the substring is unique by using a hash table. However my code still runs in O(n^2) runtime, and as a result exceeds the time limit during submissions.
What i try to do is to essentially go through every single possible substring and check if it has any duplicate values. Am I as efficient as it gets when it comes to the brute force method here? I know there's other methods such as a sliding window method but I'm trying to get the brute force method down first.
# @param {String} s
# @return {Integer}
def length_of_longest_substring(s)
max_length = 0
max_string = ""
n = s.length
for i in (0..n-1)
for j in (i..n-1)
substring = s[i..j]
#puts substring
if unique(substring)
if substring.length > max_length
max_length = substring.length
max_string = substring
end
end
end
end
return max_length
end
def unique(string)
hash = Hash.new(false)
array = string.split('')
array.each do |char|
if hash[char] == true
return false
else
hash[char] = true
end
end
return true
end
Approach
Here is a way of doing that with a hash that maps characters to indices. For a string s, suppose the characters in the substring s[j..j+n-1] are unique, and therefore the substring is a candidate for the longest unique substring. The next element is therefore e = s[j+n]. We wish to determine if s[j..j+n-1] includes e. If it does not, we can append e to the substring, keeping it unique.
If s[j..j+n-1] includes e, we determine if n (the size of the substring) is greater than the length of the previously-known substring, and update our records if it is. To determine if s[j..j+n-1] includes e, we could perform a linear search of the substring, but it is faster to maintain a hash c_to_i whose key-value pairs are s[i]=>i, i = j..j+n-1. That is, c_to_i maps the characters in the substring to their indices in the full string s. That way we can merely evaluate c_to_i.key?(e) to see if the substring contains e. If the substring includes e, we use c_to_i to determine its index in s and add one: j = c_to_i[e] + 1. The new substring is therefore s[j..j+n-1] with the new value of j. Note that several characters of s may be skipped in this step.
Regardless of whether the substring contained e, we must now append e to the (possibly-updated) substring, so that it becomes s[j..j+n].
Code
def longest_no_repeats(str)
c_to_i = {}
longest = { length: 0, end: nil }
str.each_char.with_index do |c,i|
j = c_to_i[c]
if j
longest = { length: c_to_i.size, end: i-1 } if
c_to_i.size > longest[:length]
c_to_i.reject! { |_,k| k <= j }
end
c_to_i[c] = i
end
c_to_i.size > longest[:length] ? { length: c_to_i.size, end: str.size-1 } :
longest
end
Example
a = ('a'..'z').to_a
#=> ["a", "b",..., "z"]
str = 60.times.map { a.sample }.join
#=> "ekgdaxxzlwbxixhlfbpziswcoelplhobivoygmupdaexssbuuawxmhprkfms"
longest = longest_no_repeats(str)
#=> {:length=>14, :end=>44}
str[0..longest[:end]]
#=> "ekgdaxxzlwbxixhlfbpziswcoelplhobivoygmupdaexs"
str[longest[:end]-longest[:length]+1,longest[:length]]
#=> "bivoygmupdaexs"
Efficiency
Here is a benchmark comparison to @mechnicov's code:
require 'benchmark/ips'
a = ('a'..'z').to_a
arr = 50.times.map { 1000.times.map { a.sample }.join }
Benchmark.ips do |x|
x.report("mechnicov") { arr.sum { |s| max_non_repeated(s)[:length] } }
x.report("cary") { arr.sum { |s| longest_no_repeats(s)[:length] } }
x.compare!
end
displays:
Comparison:
cary: 35.8 i/s
mechnicov: 0.0 i/s - 1198.21x slower
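If only the length is needed, as in the LeetCode problem, the same sliding-window idea can be sketched without deleting hash entries, by remembering where the current window starts. This is a minimal sketch rather than the code benchmarked above; the name `longest_unique_length` is made up.

```ruby
# Sliding window: `start` is the index where the current duplicate-free
# window begins; `last_seen` maps each character to its latest index.
def longest_unique_length(str)
  last_seen = {}
  start = 0
  best = 0
  str.each_char.with_index do |c, i|
    prev = last_seen[c]
    # A repeat inside the window moves the window start just past it.
    start = prev + 1 if prev && prev >= start
    last_seen[c] = i
    best = [best, i - start + 1].max
  end
  best
end

longest_unique_length("abcabcbb") #=> 3
longest_unique_length("pwwkew")   #=> 3
longest_unique_length("bbbbb")    #=> 1
```

Each character is visited once, so the running time is linear in the length of the string.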
From your link:
Input: "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3.
That means you need the first of the longest non-repeating substrings.
I suggest a method like this:
def max_non_repeated(string)
max_string = string.
each_char.
map.with_index { |_, i| string[i..].split('') }.
map do |v|
ary = []
v.each { |l| ary << l if ary.size == ary.uniq.size }
ary.uniq.join
end.
max_by(&:length)
{
string: max_string,
length: max_string.length
}
end
max_non_repeated('pwwkew')[:string] #=> "wke"
max_non_repeated('pwwkew')[:length] #=> 3
In Ruby < 2.6 use [i..-1] instead of [i..]

Sort Integer Array Ruby

Have the function PermutationStep (num) take the num parameter being passed and return the next number greater than num using the same digits. For example: if num is 123 return 132, if it's 12453 return 12534. If a number has no greater permutations, return -1 (ie. 999)
Here's my code. I'd like to sort an array of large integers in numerical order. Using the regular sort method doesn't give the right order for some numbers. Is there a sort_by structure that I can replace 'sort' with in my code below?
def PermutationStep(num)
num = num.to_s.split('').map {|i| i.to_i}
permutations = num.permutation.to_a.sort #<= I want to sort by numerical value here
permutations.each_with_index do |n, idx|
if n == num
if n == permutations[-1]
return -1
else
return permutations[idx+1].join.to_i
end
end
end
end
For example, 11121. When I run the code it gives me 11121. I want the next highest permutation, which should be 11211.
Also, when I try { |a,b| b <=> a }, I also get errors.
You can pass a block to sort.
num.permutation.to_a.sort { |x, y| x.join.to_i <=> y.join.to_i }
This SO thread may be of some assistance: How does Array#sort work when a block is passed?
num.permutation.to_a is an array of arrays, not an array of integers, which is why the result is not what you expected.
Actually you don't need to sort since you only need the minimum integer that is bigger than the input.
def PermutationStep(num)
nums = num.to_s.split('')
permutations = nums.permutation.map{|a| a.join.to_i}
permutations.keep_if{|n| n > num}.min || -1
end
puts PermutationStep(11121) # 11211
puts PermutationStep(999) # -1
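The "array of arrays" point can be seen on a tiny input. The sketch below also shows that repeated digits produce duplicate permutations, which would explain why the question's code returned 11121 itself: the entry at idx+1 can be an identical copy of the input.

```ruby
perms = [1, 1, 2].permutation.to_a

perms.first.class #=> Array (each permutation is an array of digits)
perms.size        #=> 6
perms.uniq.size   #=> 3    (repeated digits give duplicate permutations)
perms.sort
#=> [[1, 1, 2], [1, 1, 2], [1, 2, 1], [1, 2, 1], [2, 1, 1], [2, 1, 1]]
```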
Call to_i before you sort the permutations. Once that is done, sort the array and pick the first element greater than your number:
def PermutationStep(num)
numbers = num.to_s.split('')
permutations = numbers.permutation.map { |p| p.join.to_i }.sort
permutations.detect { |p| p > num } || -1
end
You don't need to consider permutations of digits to obtain the next higher number.
Consider the number 126531.
Going from right to left, we look for the first decrease in the digits. That would be 2 < 6. Clearly we cannot obtain a higher number by permuting only the digits after the 2, but we can obtain a higher number merely by swapping 2 and 6. This will not be the next higher number, however.
We therefore look for the smallest digit to the right of 2 that is greater than 2, which would be 3. Clearly, the next higher number will begin 13 and will have the remaining digits ordered smallest to largest. Therefore, the next higher number will be 131256.
You can easily see that the next higher number for 123 is 132, and for 12453 is 12534.
The proof that the procedure is correct is easily established by induction, first showing that it is correct for numbers with two digits, then assuming it is correct for numbers with n>=2 digits, showing it is correct for numbers with n+1 digits.
It can be easily implemented in code:
def next_highest(n)
a = n.to_s.reverse.split('').map(&:to_i)
last = -Float::INFINITY
x,ndx = a.each_with_index.find { |d,i| res = d<last; last=d; res }
return nil unless x
swap_val = a[ndx]
swap_ndx = (0...ndx).select { |i| a[i] > swap_val }.min_by{ |i| a[i] }
a[ndx], a[swap_ndx] = a[swap_ndx], swap_val
a[0...ndx] = a[0...ndx].sort.reverse
a.join.reverse
end
next_highest(126531) #=> "131256"
next_highest(109876543210) #=> "110023456789"
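For comparison, here is a sketch of the same right-to-left procedure that returns an Integer, or -1 when no larger number exists, as the original problem statement asks. The name `permutation_step` is made up, and the sketch works on the digit characters directly instead of reversing the number.

```ruby
def permutation_step(num)
  digits = num.to_s.chars
  # Rightmost position whose digit is smaller than its right neighbour.
  pivot = (digits.size - 2).downto(0).find { |i| digits[i] < digits[i + 1] }
  return -1 unless pivot

  # The digits after the pivot are non-increasing, so the rightmost one
  # larger than the pivot digit is also the smallest such digit.
  swap = (digits.size - 1).downto(pivot + 1).find { |i| digits[i] > digits[pivot] }
  digits[pivot], digits[swap] = digits[swap], digits[pivot]

  # Reversing the non-increasing tail sorts it smallest to largest.
  (digits[0..pivot] + digits[pivot + 1..-1].reverse).join.to_i
end

permutation_step(123)    #=> 132
permutation_step(12453)  #=> 12534
permutation_step(11121)  #=> 11211
permutation_step(126531) #=> 131256
permutation_step(999)    #=> -1
```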

method evaluating a word with vowels in alphabetical order

This piece of code returns true or false depending on whether a word's vowels are in alphabetical order.
def ordered_vowel_word?(word)
vowels = ['a','e','i','o','u']
y = word.split('')
x = y.select { |l| vowels.include?(l) }
(0...(x.length - 1)).all? do |i|
x[i] <= x[i + 1]
end
end
However I don't quite understand how it does that. in particular I don't understand the very last line x[i] <= x[i + 1]. I'm also not very familiar with the .all method. why not just use .each instead?
Aren't the values of x[i] or x[i+1] letters? How can we compare letters values with less than or equal to? It doesn't make sense.
.all? will return true if the given block evaluates to true for all the elements in the enumeration.
Here you are checking if all the elements are lesser than or equal to its next element.
The less than or equal uses the Integer ordinal of the character for comparison.
eg:
> 'a'.ord
# => 97
> 'b'.ord
# => 98
So 'b' is greater than 'a'
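The comparisons used in the question can be tried directly in irb. A short sketch with single lowercase letters, as in the vowel example:

```ruby
'a'.ord     #=> 97
'e'.ord     #=> 101

'a' <= 'e'  #=> true   ("a" sorts before "e")
'o' <= 'e'  #=> false  ("o" sorts after "e")
'a' <=> 'b' #=> -1     (left-hand side is smaller)
```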
Here you first split the argument word into a character array and store it in y.
Then you select all the vowels using the include? method and store them in x. Then, using the .all? method, you check every index from 0 up to, but excluding, x.length-1. You could also use .each here, but .each returns its receiver rather than true or false, so you would have to track the result yourself.
There is also the spaceship method (<=>), which can be used to compare two strings by their alphabetical ranking. The <=> method returns 0 if the strings are identical, -1 if the left-hand string is less than the right-hand string, and 1 if it is greater.
Hope this helps!
I believe the naming is the most confusing part of this method. Let's break it down:
y = word.split('')
should really be my_letters = word.split(''), since it returns an array with each letter as a string (like ["s", "o", "m", "e"])
x = y.select { |l| vowels.include?(l) }
should be my_vowels = my_letters.select { |l| vowels.include?(l) }. This will return an array of vowels from your word (like ["o", "e"])
(0...(my_vowels.length - 1)).all? do |i|
my_vowels[i] <= my_vowels[i + 1]
end
all? will only return true if the given block is true for all elements (See Enumerable#all?)
We have two elements in our array ["o", "e"], so the block is evaluated once, as shown by (0...1).all?. It passes the index 0 to the block and compares that string with the one after it.
my_vowels[0] <= my_vowels[1] represents "o" <= "e" in our example, which evaluates to false.
<= (see Comparable docs) will be true so long as the first string has a lesser or equal value than the other. For strings, value is determined by an Integer ordinal which you can find by calling .ord on the string.
And just as a reference, this is how I would write this function in a more Rubyish way:
def ordered_vowel_word?(word)
vowels = ->char{'aeiou'[char.downcase]}
vowels_in_word = word.chars.select &vowels
vowels_in_word == vowels_in_word.sort
end
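Another way to express the "each vowel is not greater than the next one" check from the question is Enumerable#each_cons, which yields neighbouring pairs without any index arithmetic. A minimal sketch, assuming lowercase a, e, i, o, u are the only vowels of interest; the method name is made up.

```ruby
def vowels_in_order?(word)
  vowels = word.downcase.scan(/[aeiou]/)
  # each_cons(2) yields ["a", "o"], ["o", "e"], ... for neighbouring vowels.
  vowels.each_cons(2).all? { |a, b| a <= b }
end

vowels_in_order?("almost")     #=> true
vowels_in_order?("some")       #=> false
vowels_in_order?("abstemious") #=> true
vowels_in_order?("rhythm")     #=> true (no vowels, so nothing is out of order)
```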
