Confusion in String Manipulation - wolfram-mathematica

InputForm[{a, b, c, d, e, f}] gives {a, b, c, d, e, f}
InputForm[Characters["SOMETHING"]] gives {"S", "O", "M", "E", "T", "H", "I", "N", "G"}
But why doesn't Drop[InputForm[Characters["SOMETHING"]],1] give {"O", "M", "E", "T", "H", "I", "N", "G"}?
Instead it gives InputForm[] and nothing else.
How can I achieve this?
Thank You

When you evaluate
InputForm[Characters["SOMETHING"]]
Mathematica internally produces the result
InputForm[List["S","O","M","E","T","H","I","N","G"]]
i.e. it's an expression with InputForm as its head, which contains List["S","O","M","E","T","H","I","N","G"] as its first subexpression. You don't see the InputForm head when Mathematica displays the expression, because the front end only uses it as a hint as to how the expression should be shown, but it's still there behind the scenes.
Then when you use Drop[..., 1], it looks at the expression it's given, picks out the first subexpression, which is List["S","O","M","E","T","H","I","N","G"], and discards it. That leaves just InputForm[].
To make an analogy: if you evaluated
Drop[List[List["S","O","M","E","T","H","I","N","G"]], 1]
you would understand why you'd get an empty list back, right? It's the same thing going on.

Related

Ruby sort subarray in place

If I have an array in Ruby, foo, how can I sort foo[i..j] in-place?
I tried calling foo[i..j].sort! but it didn't sort the original array, just returned a sorted part of it.
If you want to sort part of an array you need to reinject the sorted parts. The in-place modifier won't help you here because foo[i..j] returns a copy. You're sorting the copy in place, which really doesn't mean anything to the original array.
So instead, replace the original slice with a sorted version of same:
test = %w[ z b f d c h k z ]
test[2..6] = test[2..6].sort
# => ["c", "d", "f", "h", "k"]
test
# => ["z", "b", "c", "d", "f", "h", "k", "z"]
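To make the copy-versus-original point concrete, here is a minimal sketch. The helper name sort_range! is hypothetical (it is not a built-in Array method); it just wraps the slice assignment shown above.

```ruby
# Hypothetical helper: sort the elements of `array` covered by `range`,
# writing the sorted slice back into the original array.
def sort_range!(array, range)
  array[range] = array[range].sort
  array
end

foo = [9, 8, 7, 6, 5, 4]

copy = foo[1..3]   # slicing returns a new array
copy.sort!         # sorts only the copy
foo                # still [9, 8, 7, 6, 5, 4]

sort_range!(foo, 1..3)
foo                # now [9, 6, 7, 8, 5, 4]
```

The slice assignment replaces the elements in the range, so the array keeps its length as long as the sorted slice has the same size as the range it replaces.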

How to find all longest words in a string?

If I have a string with no spaces in it, just a concatenation like "hellocarworld", I want to get back an array of the largest dictionary words, so I would get ['hello','car','world']. I would not get back words such as 'a' because that belongs in 'car'.
The dictionary words can come from anywhere such as the dictionary on unix:
words = File.readlines("/usr/share/dict/words").collect{|x| x.strip}
string= "thishasmanywords"
How would you go about doing this?
I would suggest the following.
Code
For a given string and dictionary, dict:
string_arr = string.chars
string_arr.size.downto(1).with_object([]) { |n,arr|
  string_arr.each_cons(n) { |a|
    word = a.join
    arr << word if (dict.include?(word) && !arr.any? {|w| w.include?(word) })}}
Examples
dict = File.readlines("/usr/share/dict/words").collect{|x| x.strip}
string = "hellocarworld"
#=> ["hello", "world", "loca", "car"]
string= "thishasmanywords"
#=> ["this", "hish", "many", "word", "sha", "sma", "as"]
"loca" is the plural of "locus". I'd never heard of "hish", "sha" or "sma". They all appear to be slang words, as I could only find them in something called the "Urban Dictionary".
Explanation
string_arr = "hellocarworld".chars
#=> ["h", "e", "l", "l", "o", "c", "a", "r", "w", "o", "r", "l", "d"]
string_arr.size
#=> 13
so for this string we have:
13.downto(1).with_object([]) { |n,arr|...
where arr is an initially-empty array that will be computed and returned. For n => 13,
enum = string_arr.each_cons(13)
#<Enumerator: ["h","e","l","l","o","c","a","r","w","o","r","l","d"]:each_cons(13)>
which enumerates over an array consisting of the single array string_arr:
enum.size #=> 1
enum.first == string_arr #=> true
That single array is assigned to the block variable a, so we obtain:
word = enum.first.join
#=> "hellocarworld"
We find
dict.include?(word) #=> false
so this word is not added to the array arr. If it were in the dictionary, we would check to make sure it was not a substring of any word already in arr, all of which are the same size or longer.
Next we compute:
enum = string_arr.each_cons(12)
#<Enumerator: ["h","e","l","l","o","c","a","r","w","o","r","l","d"]:each_cons(12)>
which we can see enumerates two arrays:
enum = string_arr.each_cons(12).to_a
#=> [["h", "e", "l", "l", "o", "c", "a", "r", "w", "o", "r", "l"],
# ["e", "l", "l", "o", "c", "a", "r", "w", "o", "r", "l", "d"]]
corresponding to the words:
enum.first.join #=> "hellocarworl"
enum.last.join #=> "ellocarworld"
neither of which is in the dictionary. We continue in this fashion, until we reach n => 1:
string_arr.each_cons(1).to_a
#=> [["h"], ["e"], ["l"], ["l"], ["o"], ["c"],
# ["a"], ["r"], ["w"], ["o"], ["r"], ["l"], ["d"]]
We find only "a" in the dictionary, but as it is a substring of "loca" or "car", which are already elements of the array arr, we do not add it.
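Because the output depends on which words file is installed, it can help to try the same logic against a small in-memory dictionary. The following is only a sketch: the method name longest_words and the five-word dictionary are assumptions made for illustration, and a Set is used so that include? lookups stay fast.

```ruby
require "set"

# Hypothetical wrapper around the approach described above.
# `dict` is any collection that responds to include?.
def longest_words(string, dict)
  chars = string.chars
  chars.size.downto(1).with_object([]) do |n, found|
    chars.each_cons(n) do |slice|
      word = slice.join
      next unless dict.include?(word)
      # Skip words already contained in a word of equal or greater length.
      found << word unless found.any? { |w| w.include?(word) }
    end
  end
end

# Tiny stand-in dictionary, assumed for illustration only.
dict = Set.new(%w[a car hello loca world])
result = longest_words("hellocarworld", dict)
# result is ["hello", "world", "loca", "car"]; "a" is dropped because
# it is a substring of "loca" and "car".
```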
This can be a bit tricky if you're not familiar with the technique. I often lean heavily on regular expressions for this:
words = File.readlines("/usr/share/dict/words").collect(&:strip).reject(&:empty?)
regexp = Regexp.new(words.sort_by(&:length).reverse.join('|'))
phrase = "hellocarworld"
equiv = [ ]
while (m = phrase.match(regexp)) do
  phrase.gsub!(m[0]) do
    equiv << m[0]
    '*'
  end
end
equiv
# => ["hello", "car", "world"]
Update: Strip out blank strings which would cause the while loop to run forever.
Starting at the beginning of the input string, find the longest word in the dictionary. Chop that word off the beginning of the input string and repeat.
Once the input string is empty, you are done. If the string is not empty but no word was found, remove the first character and continue the process.
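The greedy procedure just described could be sketched roughly as follows. The method name greedy_split and the small dictionary are assumptions for illustration; a real run would load the words file instead. Note that a greedy split is not guaranteed to find the best overall segmentation, since taking the longest word first can leave a remainder that no longer splits cleanly.

```ruby
require "set"

# Greedy longest-prefix split: repeatedly take the longest dictionary word
# that starts the remaining string; if none does, drop one character.
def greedy_split(string, dict)
  max_len = dict.map(&:length).max || 0
  words = []
  rest = string.dup
  until rest.empty?
    # Try the longest candidate prefix first.
    limit = [max_len, rest.length].min
    len = limit.downto(1).find { |n| dict.include?(rest[0, n]) }
    if len
      words << rest[0, len]
      rest = rest[len..-1]
    else
      rest = rest[1..-1] # no word starts here, so skip this character
    end
  end
  words
end

# Tiny stand-in dictionary, assumed for illustration only.
dict = Set.new(%w[a car hell hello world])
split = greedy_split("hellocarworld", dict)
# split is ["hello", "car", "world"]
```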

How do I create every combination of single elements selected from multiple arrays?

I have 5 arrays:
["A", "B", "C"]
["A", "B", "C", "D", "E"]
["A"]
["A", "B", "C", "D", "E", "F"]
["A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"]
I would like to create a list of each combination as such:
["AAAAA","AAAAB","AAAAC", "AAAAD"...
"BAAAA","BAAAB","BAAAC", "BAAAD"...]
a = [
["A", "B", "C"],
["A", "B", "C", "D", "E"],
["A"],
["A", "B", "C", "D", "E", "F"],
["A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"]
]
a.inject(&:product).map(&:join)
# => ["AAAAA", "AAAAB", "AAAAC", ..., "CEAFM", "CEAFN", "CEAFO"]
Thanks to bluexuemei for the improved answer. The original solution was a.shift.product(*a).map(&:join).
A More Traditional Solution
With such a convenient library, these Ruby one-liners seem almost like cheating.
Here is a more traditional way to solve this common problem that can be readily coded into other programming languages:
N = a.reduce(1) { |product,list| product * list.size } # 1350
combinations = []
0.upto(N-1) do |q|
  combo = []
  a.reverse.each do |list|
    q, r = q.divmod list.size
    combo << list[r]
  end
  combinations.push combo.reverse.join
end
combinations
# => ["AAAAA", "AAAAB", "AAAAC", ..., "CEAFM", "CEAFN", "CEAFO"]
The basic idea is to first calculate the total number of combinations N, which is just the product of the lengths of all the lists. Each integer from 0 to N-1 then encodes all the information needed to provide unique indices into each list to produce each combination. One way to think of it is that the index variable q can be expressed as a 5-digit number in which each digit has its own base, namely the size of the corresponding list. That is, the first digit is base-3, the second digit is base-5, the 3rd is base-1 (always 0), the 4th is base-6, and the 5th is base-15. Extracting these digits from q is just a series of repeated integer divisions and remainders, as done in the inner loop. Naturally this requires some homework, perhaps looking at simpler examples, to fully digest.
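As a worked example of that digit extraction, the sketch below decodes a single value of q into one index per list. The helper name mixed_radix_digits is hypothetical; the lists are the five arrays from the question.

```ruby
# Hypothetical helper: decode the integer q into one index per list,
# treating q as a mixed-radix number whose digit bases are the list sizes.
def mixed_radix_digits(q, sizes)
  digits = sizes.reverse.map do |base|
    q, r = q.divmod(base) # the remainder is the digit for this position
    r
  end
  digits.reverse
end

lists = [
  %w[A B C],
  %w[A B C D E],
  %w[A],
  %w[A B C D E F],
  %w[A B C D E F G H I J K L M N O]
]
sizes = lists.map(&:size) # [3, 5, 1, 6, 15]

digits = mixed_radix_digits(1349, sizes)
# digits is [2, 4, 0, 5, 14]
combo = digits.each_with_index.map { |d, i| lists[i][d] }.join
# combo is "CEAFO", the last of the 1350 combinations
```

Working from the right: 1349 divided by 15 leaves remainder 14, the quotient 89 divided by 6 leaves 5, then 14 divided by 1 leaves 0, 14 divided by 5 leaves 4, and finally 2 divided by 3 leaves 2.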
a.reduce(&:product).map(&:join).size

Algorithm for matching all items with another item in same list, where some have restrictions

Given array [a, b, c, d, e, f]
I want to match each letter with any other letter except itself, resulting in something like:
a - c
b - f
d - e
The catch is that each letter may be restricted to being matched with one or more of the other letters.
So let's say for example,
a cannot be matched with c, d
c cannot be matched with e, f
e cannot be matched with a
Any guidance on how to go about this? I'm using Ruby, but any pseudo code would be helpful.
Thanks!
The problem you are describing is a graph problem called maximum matching (or more specifically perfect matching). Each letter is a vertex, and each restriction corresponds to a pair of vertices that have no edge between them.
You can use Edmonds' matching algorithm (also known as the blossom algorithm).
Let's assume for now that a solution exists. It may not.
Pick one of your elements, and try to match it.
If the match breaks one of your rules, try again until you find one that doesn't.
Choose another element, and try to match that. If you run through all other elements and break a rule each time, then go back, unmatch your previous match, and try another one.
Continue until all of your elements are used up.
If you don't know whether a solution exists or not, then you'll need to keep track of your attempts and figure out when you've tried them all. Or, use some checking at the beginning to see if there are any obvious contradictions in your rule set.
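The backtracking steps above could be sketched in Ruby along these lines. The method name match_all and the shape of the forbidden hash are assumptions for illustration; restrictions are treated as symmetric, and the items are assumed to be unique. Returning nil when every attempt fails also covers the case where no solution exists.

```ruby
# Backtracking search for a complete pairing. Returns an array of pairs,
# or nil when no pairing satisfies the restrictions.
# `forbidden` maps an item to the items it must not be paired with.
def match_all(items, forbidden)
  return [] if items.empty?
  first, *rest = items
  rest.each do |partner|
    next if forbidden.fetch(first, []).include?(partner) ||
            forbidden.fetch(partner, []).include?(first)
    # Tentatively pair first with partner, then try to pair everything else.
    tail = match_all(rest - [partner], forbidden)
    return [[first, partner]] + tail if tail
  end
  nil # every partner for `first` failed, so the caller backtracks
end

forbidden = { "a" => %w[c d], "c" => %w[e f], "e" => %w[a] }
pairs = match_all(%w[a b c d e f], forbidden)
# pairs is [["a", "b"], ["c", "d"], ["e", "f"]]
```

This explores pairings exhaustively in the worst case, so it suits small lists; for large inputs a dedicated matching algorithm is the better fit.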
I'm not sure I understand the problem, but this seems to fit the question:
%w[a b c d e f].combination(2).to_a - [%w[a c],%w[a d],%w[c e],%w[c f],%w[a e]]
# => [["a", "b"], ["a", "f"], ["b", "c"], ["b", "d"], ["b", "e"], ["b", "f"], ["c", "d"], ["d", "e"], ["d", "f"], ["e", "f"]]
$letters = array('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j');
$exclusions = array('a' => array('e', 'd', 'c'), 'b' => array('a','b', 'c','d'));
$match = array();
foreach ($letters as $matching) {
    // Letters without an entry in $exclusions have no restrictions.
    $excluded = isset($exclusions[$matching]) ? $exclusions[$matching] : array();
    foreach ($letters as $search) {
        if (!in_array($search, $excluded)) {
            if ($search != $matching) {
                $match[$matching][] = $search;
            }
        }
    }
}
print_r($match);
The innermost if condition could be merged into the next outer one.
You can see this in action at
http://craigslist.fatherstorm.com/stackoverflow2.php

Generate all possible dna sequences from a few given sets

I have been trying to wrap my head around this for a while now but have not been able to come up with a good solution. Here goes:
Given a number of sets:
set1: A, T
set2: C
set3: A, C, G
set4: T
set5: G
I want to generate all possible sequences from a list of sets. In this example the length of the sequence is 5, but it can be any length up to around 20. For position 1 the possible candidates are 'A' and 'T', for position 2 the only option is 'C', and so on.
The answer for the example above would be:
ACATG, ACCTG, ACGTG, TCATG, TCCTG, TCGTG
I am doing this in ruby and I have the different sets as arrays within a master array:
[[A, T], [C], [A, C, G], [T], [G]]
At first I thought a recursive solution would be best but I was unable to figure out how to set it up properly.
My second idea was to create another array of the same size with an index for each set. So 00000 would correspond to the first sequence above 'ACATG' and 10200 would correspond to 'TCGTG'. Beginning with 00000 I would increase the last index by one and modulo it with the length of the set in question (2 for set1, 1 for set2 above) and if the counter wrapped around I would zero it and increase the previous one by one.
But the more I thought about this solution it seemed too complex for this very small problem. There must be a more straight-forward solution that I am missing. Could anyone help me out?
/Nick
As of Ruby 1.8.7, the Array class has an Array#product method, which returns the Cartesian product of the receiver and the arrays passed to it.
irb(main):001:0> ['A', 'T'].product(['C'], ['A', 'C', 'G'], ['T'], ['G'])
=> [["A", "C", "A", "T", "G"], ["A", "C", "C", "T", "G"], ["A", "C", "G", "T", "G"], ["T", "C", "A", "T", "G"], ["T", "C", "C", "T", "G"], ["T", "C", "G", "T", "G"]]
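When the sets already live in one master array, as in the question, the same call can be written with a splat, and the results joined into strings. For comparison, the index-counter idea from the question is sketched next to it; the method name sequences_by_counter is hypothetical.

```ruby
sets = [%w[A T], %w[C], %w[A C G], %w[T], %w[G]]

# Splat all but the first set into Array#product, then join each result.
first, *others = sets
via_product = first.product(*others).map(&:join)

# The counter idea from the question, written as an odometer:
# one index per set, advanced from the right with carry.
def sequences_by_counter(sets)
  return [] if sets.empty? || sets.any?(&:empty?)
  counter = Array.new(sets.size, 0)
  results = []
  loop do
    results << counter.each_with_index.map { |idx, pos| sets[pos][idx] }.join
    pos = sets.size - 1
    # Advance the rightmost position; on wrap-around, carry to the left.
    while pos >= 0
      counter[pos] += 1
      break if counter[pos] < sets[pos].size
      counter[pos] = 0
      pos -= 1
    end
    break if pos < 0 # the leftmost position wrapped, so we are done
  end
  results
end

via_counter = sequences_by_counter(sets)
# Both give ["ACATG", "ACCTG", "ACGTG", "TCATG", "TCCTG", "TCGTG"]
```

Both versions produce the sequences in the same order, with the last position varying fastest.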
