How to find the size of a set in Python?

import sys
  
# sample Sets
Set1 = {"A", 1, "B", 2, "C", 3}
Set2 = {"Geek1", "Raju", "Geek2", "Nikhil", "Geek3", "Deepanshu"}
Set3 = {(1, "Lion"), ( 2, "Tiger"), (3, "Fox")}
  
# print the sizes of sample Sets
print("Size of Set1: " + str(sys.getsizeof(Set1)) + " bytes")
print("Size of Set2: " + str(sys.getsizeof(Set2)) + " bytes")
print("Size of Set3: " + str(sys.getsizeof(Set3)) + " bytes")
A set is an unordered collection data type that is iterable, mutable, and has no duplicate elements. Python's set class represents the mathematical notion of a set. The size of a set here means the amount of memory (in bytes) occupied by a set object.
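"Size" can mean two different things, so a short sketch may help: `len()` gives the number of elements, while `sys.getsizeof()` gives the bytes used by the set object itself and does not include the objects it refers to. The variable names below are made up for this example, and the byte counts vary by Python version and platform.

```python
import sys

sample = {"A", 1, "B", 2, "C", 3}  # hypothetical sample set

element_count = len(sample)              # number of elements in the set
container_bytes = sys.getsizeof(sample)  # bytes used by the set object only

# Rough total: the container plus each element's own size (one level deep only)
rough_total = container_bytes + sum(sys.getsizeof(item) for item in sample)

print(f"{element_count} elements, {container_bytes} bytes for the container, "
      f"about {rough_total} bytes including the elements")
```

If the number of elements is all that is needed, `len()` alone is enough.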

Related

How to count the number of unique absolute values in an array with Ruby?

You are given a non-empty array consisting of N values. The array is sorted in ascending order.
The task is to write a function that counts the number of unique absolute values in the array.
For example, the array A is:
   A [0] = -5
   A [1] = -3
   A [2] = -1
   A [3] = 0
   A [4] = 3
   A [5] = 6
The count of unique absolute values is 5, because there are 5 unique absolute values: 0, 1, 3, 5, 6.
Assume that:
N is an integer within the range [1..100,000]; array A is sorted in ascending order.
So far I've done this:
class AbsoluteUnique
def initialize int_array
puts "Enter content for the array:"
5.times do
int_array << gets
if A.size == 5
find_count
end
end
end
def find_count
end
end
Obviously that isn't enough, but I find it difficult to figure out the rest of the function. Could you please help me find those 5 absolute, unique values?
Thank you.
Uniq takes an optional block - the result of the block is compared for uniqueness instead of the value itself - so you can do
array.uniq(&:abs).length
If you have an integer array a like so:
a = [-5, -3, -1, 0, 3, 6]
This should do it:
a.map {|n| n.abs}.uniq.length
If it needs to take an array from standard input, this would be the way to do that:
a = []
puts "Enter 5 integers separated by newlines: "
until(a.length == 5)
a << gets.chomp.to_i
end
puts "There are #{a.map {|n| n.abs}.uniq.length} unique absolute values in your list."
This should work:
array.map(&:abs).uniq.count
[-5, -3, -1, 0, 3, 6].group_by(&:abs).length
# => 5
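For readers more comfortable with Python, the same idea (take absolute values, keep the distinct ones, count them) might be sketched as follows. The function name is hypothetical and not part of any answer above.

```python
def count_unique_abs(values):
    """Return how many distinct absolute values occur in values."""
    # A set comprehension discards duplicates such as abs(-3) and abs(3)
    return len({abs(v) for v in values})


print(count_unique_abs([-5, -3, -1, 0, 3, 6]))  # prints 5
```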

How to generate a power set of a given set?

I am studying for an interview and I stumbled upon this question online under the "Math" category.
Generate power set of given set:
int A[] = {1,2,3,4,5};
int N = 5;
int Total = 1 << N;
for ( int i = 0; i < Total; i++ ) {
    for ( int j = 0; j < N; j++) {
        if ( (i >> j) & 1 )
            cout << A[j];
    }
    cout << endl;
}
Please I do not want an explicit answer. I just want clarifications and hints on how to approach this problem.
I checked power set algorithm on google and I still do not understand how to address this problem.
Also, could someone reiterate what the question is asking for.
Thank you.
Power set of a set A is the set of all of the subsets of A.
Not the most friendly definition in the world, but an example will help :
Eg. for {1, 2}, the subsets are : {}, {1}, {2}, {1, 2}
Thus, the power set is {{}, {1}, {2}, {1, 2}}
To generate the power set, observe how you create a subset : you go to each element one by one, and then either retain it or ignore it.
Let this decision be indicated by a bit (1/0).
Thus, to generate {1}, you will pick 1 and drop 2 (10).
On similar lines, you can write a bit vector for all the subsets :
{} -> 00
{1} -> 10
{2} -> 01
{1,2} -> 11
To reiterate: a subset is formed by including some or all of the elements of the original set. Thus, to create a subset, you go to each element and decide whether to keep it or drop it. This means that for each element you have 2 choices. Thus, for a set of N elements, you can end up with 2^N different combinations of choices, corresponding to 2^N different subsets.
See if you can pick it up from here.
Create a power-set of: {"A", "B", "C"}.
Pseudo-code:
val set = {"A", "B", "C"}
val sets = {}
for item in set:
    for subset in copy of sets:
        sets.add(subset + item)
    sets.add({item})
sets.add({})
Algorithm explanation:
1) Initialise sets to an empty set: {}.
2) Iterate over each item in {"A", "B", "C"}.
3) Iterate over each subset already in sets.
3.1) Create a new set which is a copy of that subset.
3.2) Append the item to the new set.
3.3) Append the new set to sets.
4) Add the set containing only the item to sets.
5) Iteration is complete. Add the empty set to sets.
Walkthrough:
Let's look at the contents of sets after each iteration:
Iteration 1, item = "A":
sets = {{"A"}}
Iteration 2, item = "B":
sets = {{"A"}, {"A", "B"}, {"B"}}
Iteration 3, item = "C":
sets = {{"A"}, {"A", "B"}, {"B"}, {"A", "C"}, {"A", "B", "C"}, {"B", "C"}, {"C"}}
Iteration complete, add empty set:
sets = {{"A"}, {"A", "B"}, {"B"}, {"A", "C"}, {"A", "B", "C"}, {"B", "C"}, {"C"}, {}}
The size of the sets is 2^|set| = 2^3 = 8 which is correct.
Example implementation in Java:
public static <T> List<List<T>> powerSet(List<T> input) {
    List<List<T>> sets = new ArrayList<>();
    for (T element : input) {
        for (ListIterator<List<T>> setsIterator = sets.listIterator(); setsIterator.hasNext(); ) {
            List<T> newSet = new ArrayList<>(setsIterator.next());
            newSet.add(element);
            setsIterator.add(newSet);
        }
        sets.add(new ArrayList<>(Arrays.asList(element)));
    }
    sets.add(new ArrayList<>());
    return sets;
}
Input: [A, B, C]
Output: [[A], [A, C], [A, B], [A, B, C], [B], [B, C], [C], []]
The power set is simply the set of all subsets of a given set, including the empty set. It is well known that it has 2^N elements, where N is the number of elements in the original set.
To build the power set, the following approach can be used:
Loop over all integers from 0 to 2^N - 1.
Take the binary representation of each integer.
Each binary representation is a string of N bits (for smaller numbers, add leading zeros). Each bit indicates whether the corresponding set member is included in the current subset.
Example, 3 numbers: a, b, c
number  binary  subset
0       000     {}
1       001     {c}
2       010     {b}
3       011     {b,c}
4       100     {a}
5       101     {a,c}
6       110     {a,b}
7       111     {a,b,c}
Well, you need to generate all subsets. For a set of size n, there are 2^n subsets.
One way would be to iterate over the numbers from 0 to 2^n - 1
and convert each to a list of binary digits, where 0 means exclude
that element and 1 means include it.
Another way would be with recursion, divide and conquer.
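The recursive alternative mentioned above could look roughly like this in Python. This is one possible reading of "recursion, divide and conquer", not necessarily what the answer's author had in mind, and the function name `power_set` is hypothetical.

```python
def power_set(items):
    """Return all subsets of items (a list) as a list of lists."""
    if not items:
        return [[]]  # the only subset of the empty set is the empty set
    first, rest = items[0], items[1:]
    # Solve the smaller problem, then either skip or include the first element
    without_first = power_set(rest)
    with_first = [[first] + subset for subset in without_first]
    return without_first + with_first


for subset in power_set([1, 2, 3]):
    print(subset)
```

Each call doubles the number of subsets returned by the smaller problem, which is where the 2^n total comes from.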
Generate all combinations of a set by either including or excluding each item.
To explain by example:
Take 3 items in a set (or list). The possible subsets are:
000
100
010
001
110
101
011
111
The number of subsets is 2^(number of elements in the set).
As such, we can generate all combinations of N items (in Python) as follows:
def powerSet(items):
    N = len(items)
    for i in range(2**N):
        comb = []
        for j in range(N):
            if (i >> j) % 2 == 1:
                comb.append(items[j])
        yield comb

for x in powerSet([1,2,3]):
    print(x)
You get something like this by implementing the top-rated answer.
import math

def printPowerSet(items, set_size):
    # The power set of a set with set_size elements has 2**set_size members
    pow_set_size = int(math.pow(2, set_size))
    # Run the counter from 000..0 to 111..1
    for counter in range(pow_set_size):
        for j in range(set_size):
            # If the jth bit of the counter is set,
            # print the jth element of the input
            if counter & (1 << j):
                print(items[j], end="")
        print()
C# Solution
Time Complexity and Space Complexity: O(n*2^n)
using System.Collections.Generic;

public class Powerset
{
    /*
    P[1,2,3] = [[],[1],[2],[3],[1,2],[1,3],[2,3],[1,2,3]]
    */
    public List<List<int>> PowersetSoln(List<int> array)
    {
        /*
        Start with only the empty subset.
        Loop through the numbers in the array.
        For each number, loop through the subsets generated so far
        and add a copy of each one extended with that number.
        */
        var subsets = new List<List<int>>();
        subsets.Add(new List<int>());
        for (int i = 0; i < array.Count; i++)
        {
            int subsetLen = subsets.Count;
            for (int innerSubset = 0; innerSubset < subsetLen; innerSubset++)
            {
                var newSubset = new List<int>(subsets[innerSubset]);
                newSubset.Add(array[i]);
                subsets.Add(newSubset);
            }
        }
        return subsets;
    }
}
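Sample Java code:
void printPowerSetHelper(String s, String r) {
    if (s.length() > 0) {
        // Either include the first character of s or leave it out
        printPowerSetHelper(s.substring(1), r + s.charAt(0));
        printPowerSetHelper(s.substring(1), r);
    } else if (r.length() > 0) {
        // Print only once all characters have been decided,
        // so that each non-empty subset is printed exactly once
        System.out.println(r);
    }
}
void printPowerSet(String s) {
    printPowerSetHelper(s, "");
}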

Algorithm to find the most common substrings in a string

Is there any algorithm that can be used to find the most common phrases (or substrings) in a string? For example, the following string would have "hello world" as its most common two-word phrase:
"hello world this is hello world. hello world repeats three times in this string!"
In the string above, the most common substring (after the empty string, which repeats an infinite number of times) would be the space character.
Is there any way to generate a list of common substrings in this string, from most common to least common?
This task is similar to the Nussinov algorithm and is actually even simpler, as we do not allow any gaps, insertions, or mismatches in the alignment.
For a string A of length N, define a table F[-1 .. N-1, -1 .. N-1] and fill it in using the following rules:
for i = 0 to N - 1
    for j = 0 to N - 1
        if i != j
        {
            if A[i] == A[j]
                F[i,j] = F[i-1,j-1] + 1;
            else
                F[i,j] = 0;
        }
For instance, for B A O B A B:
This runs in O(n^2) time. The largest values in the table now point to the end positions of the longest self-matching substrings (i is the end of one occurrence, j of another). The array is assumed to be zero-initialized at the start. I have added a condition to exclude the diagonal, which is the longest but probably uninteresting self-match.
Thinking about it more, this table is symmetric about the diagonal, so it is enough to compute only half of it. Also, the array is zero-initialized, so assigning zero is redundant. What remains is
for i = 0 to N - 1
    for j = i + 1 to N - 1
        if A[i] == A[j]
            F[i,j] = F[i-1,j-1] + 1;
Shorter but potentially more difficult to understand. The computed table contains all matches, short and long. You can add further filtering as you need.
In the next step, you need to recover the strings by following the diagonal up and to the left from the non-zero cells. During this step it is also trivial to use a hashmap to count the number of self-similarity matches for the same string. With a normal string and a normal minimal length, only a small number of table cells will be processed through this map.
I think that using hashmap directly actually requires O(n^3) as the key strings at the end of access must be compared somehow for equality. This comparison is probably O(n).
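A minimal Python sketch of the table-filling idea described above, under the assumption that only the single longest repeated substring is wanted. The function names are hypothetical, and the table is shifted by one index so that row and column 0 play the role of the -1 border.

```python
def self_match_table(text):
    """Build the half table: f[i + 1][j + 1] is the length of the matching
    run that ends at text[i] and at text[j], for i < j."""
    n = len(text)
    f = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(i + 1, n):
            if text[i] == text[j]:
                f[i + 1][j + 1] = f[i][j] + 1
    return f


def longest_repeat(text):
    """Return the longest substring that occurs at two different positions."""
    f = self_match_table(text)
    best_len, best_end = 0, 0
    for row in f:
        for j, value in enumerate(row):
            if value > best_len:
                best_len, best_end = value, j
    # best_end is a shifted column index, so it is an exclusive end in text
    return text[best_end - best_len:best_end]


print(longest_repeat("BAOBAB"))  # prints BA
```

As in the description, the two occurrences found this way may overlap; filtering those out would need an extra check on the distance between i and j.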
Python. This is somewhat quick and dirty, with the data structures doing most of the lifting.
from collections import Counter

accumulator = Counter()
text = 'hello world this is hello world.'
for length in range(1, len(text) + 1):
    # + 1 so that the substring ending at the last character is included
    for start in range(len(text) - length + 1):
        accumulator[text[start:start + length]] += 1
The Counter structure is a hash-backed dictionary designed for counting how many times you've seen something. Adding to a nonexistent key will create it, while retrieving a nonexistent key will give you zero instead of an error. So all you have to do is iterate over all the substrings.
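To get the list "from most common to least common" that the question asks for, `Counter.most_common()` can be applied to the counts. The sketch below wraps the counting loop in a hypothetical helper, `substring_counts`, with an assumed `min_length` parameter to skip very short substrings; neither name comes from the answer above.

```python
from collections import Counter


def substring_counts(text, min_length=1):
    """Count every substring of text that has at least min_length characters."""
    counts = Counter()
    for length in range(min_length, len(text) + 1):
        for start in range(len(text) - length + 1):
            counts[text[start:start + length]] += 1
    return counts


counts = substring_counts("hello world this is hello world.", min_length=2)
# most_common() yields (substring, count) pairs from most to least frequent
for substring, count in counts.most_common(5):
    print(repr(substring), count)
```

This enumerates on the order of n^2 substrings, so it is only practical for short texts.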
This is just pseudocode, and maybe it isn't the most beautiful solution, but I would solve it like this:
function separateWords(String incomingString) returns StringArray{
    //Code
}
function findMax(Map map) returns String{
    //Code
}
function mainAlgorithm(String incomingString) returns String{
    StringArray sArr = separateWords(incomingString);
    Map<String, Integer> map; //init with no content
    for(word: sArr){
        Integer count = map.get(word);
        if(count == null){
            map.put(word, 1);
        } else {
            //remove if necessary
            map.put(word, count + 1);
        }
    }
    return findMax(map);
}
Where map can contain a key, value pairs like in Java HashMap.
Since for every substring of a String of length >= 2 the text contains at least one substring of length 2 at least as many times, we only need to investigate substrings of length 2.
val s = "hello world this is hello world. hello world repeats three times in this string!"
val li = s.sliding (2, 1).toList
// li: List[String] = List(he, el, ll, lo, "o ", " w", wo, or, rl, ld, "d ", " t", th, hi, is, "s ", " i", is, "s ", " h", he, el, ll, lo, "o ", " w", wo, or, rl, ld, d., ". ", " h", he, el, ll, lo, "o ", " w", wo, or, rl, ld, "d ", " r", re, ep, pe, ea, at, ts, "s ", " t", th, hr, re, ee, "e ", " t", ti, im, me, es, "s ", " i", in, "n ", " t", th, hi, is, "s ", " s", st, tr, ri, in, ng, g!)
val uniques = li.toSet
uniques.toList.map (u => li.count (_ == u))
// res18: List[Int] = List(1, 2, 1, 1, 3, 1, 5, 1, 1, 3, 1, 1, 3, 2, 1, 3, 1, 3, 2, 3, 1, 1, 1, 1, 1, 3, 1, 3, 3, 1, 3, 1, 1, 1, 3, 3, 2, 4, 1, 2, 2, 1)
uniques.toList(6)
res19: String = "s "
Perl, O(n²) solution
my $str = "hello world this is hello world. hello world repeats three times in this string!";
my @words = split(/[^a-z]+/i, $str);
my ($display,$ix,$i,%ocur) = 10;
# calculate
for ($ix=0 ; $ix<=$#words ; $ix++) {
    for ($i=$ix ; $i<=$#words ; $i++) {
        $ocur{ join(':', @words[$ix .. $i]) }++;
    }
}
# display
foreach (sort { my $c = $ocur{$b} <=> $ocur{$a} ; return $c ? $c : split(/:/,$b)-split(/:/,$a); } keys %ocur) {
    print "$_: $ocur{$_}\n";
    last if !--$display;
}
This displays the 10 best scores of the most common substrings (in case of a tie, the longest chain of words is shown first). Change $display to 1 to get only the top result. There are n(n+1)/2 iterations.

How to generate cross product of sets in specific order

Given some sets (or lists) of numbers, I would like to iterate through the cross product of these sets in the order determined by the sum of the returned numbers. For example, if the given sets are { 1,2,3 }, { 2,4 }, { 5 }, then I would like to retrieve the cross-products in the order
<3,4,5>,
<2,4,5>,
<3,2,5> or <1,4,5>,
<2,2,5>,
<1,2,5>
I can't compute all the cross-products first and then sort them, because there are way too many. Is there any clever way to achieve this with an iterator?
(I'm using Perl for this, in case there are modules that would help.)
For two sets A and B, we can use a min heap as follows.
Sort A.
Sort B.
Push (0, 0) into a min heap H with priority function (i, j) |-> A[i] + B[j]. Break ties preferring small i and j.
While H is not empty, pop (i, j), output (A[i], B[j]), insert (i + 1, j) and (i, j + 1) if they exist and don't already belong to H.
For more than two sets, use the naive algorithm and sort to get down to two sets. In the best case (which happens when each set is relatively small), this requires storage for O(√#tuples) tuples instead of Ω(#tuples).
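The two-set procedure above can be sketched in Python as follows. This is an illustration only: the function name `pairs_by_sum` is hypothetical, it yields pairs in ascending order of their sum (as the min heap naturally does), and it remembers every index pair ever pushed rather than only those currently in the heap.

```python
from heapq import heappop, heappush


def pairs_by_sum(a, b):
    """Yield pairs (x, y) with x from a and y from b in ascending order of x + y."""
    a, b = sorted(a), sorted(b)
    if not a or not b:
        return
    heap = [(a[0] + b[0], 0, 0)]  # entries are (sum, i, j); ties favour small i, then small j
    seen = {(0, 0)}
    while heap:
        _, i, j = heappop(heap)
        yield a[i], b[j]
        # Push the two neighbours, skipping out-of-range and already-seen index pairs
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(a) and nj < len(b) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heappush(heap, (a[ni] + b[nj], ni, nj))


print(list(pairs_by_sum([1, 2, 3], [2, 4])))
```

For descending order, as in the question, one option is to sort both lists in reverse and push the negated sum.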
Here's some Python to do this. It should transliterate reasonably straightforwardly to Perl. You'll need a heap library from CPAN and to convert my tuples to strings so that they can be keys in a Perl hash. The set can be stored as a hash as well.
from heapq import heappop, heappush

def largest_to_smallest(lists):
    """
    >>> print(list(largest_to_smallest([[1, 2, 3], [2, 4], [5]])))
    [(3, 4, 5), (2, 4, 5), (3, 2, 5), (1, 4, 5), (2, 2, 5), (1, 2, 5)]
    """
    for lst in lists:
        lst.sort(reverse=True)
    num_lists = len(lists)
    index_tuples_in_heap = set()
    min_heap = []

    def insert(index_tuple):
        if index_tuple in index_tuples_in_heap:
            return
        index_tuples_in_heap.add(index_tuple)
        minus_sum = 0  # compute -sum because it's a min heap, not a max heap
        for i in range(num_lists):  # 0, ..., num_lists - 1
            if index_tuple[i] >= len(lists[i]):
                return
            minus_sum -= lists[i][index_tuple[i]]
        heappush(min_heap, (minus_sum, index_tuple))

    insert((0,) * num_lists)
    while min_heap:
        minus_sum, index_tuple = heappop(min_heap)
        elements = []
        for i in range(num_lists):
            elements.append(lists[i][index_tuple[i]])
        yield tuple(elements)  # this is where the tuple is returned
        for i in range(num_lists):
            neighbor = []
            for j in range(num_lists):
                if i == j:
                    neighbor.append(index_tuple[j] + 1)
                else:
                    neighbor.append(index_tuple[j])
            insert(tuple(neighbor))

Algorithm to find words spelled out by a number

I'm trying to find a way to determine all possible words that can be spelled out by a given number, given a mapping of letters to values.
I eventually want to find a solution that works for any 1- or 2- digit value mapping for a letter, but for illustration, assume A=1, B=2, ... Z=26.
Example: 12322 can be equal to abcbb (1,2,3,2,2), lcbb (12,3,2,2), awbb (1,23,2,2), abcv (1,2,3,22), awv (1,23,22), or lcv (12,3,22).
Here's what I have thought of so far:
I will build a tree of all possible words using the number.
To do this, I will start out with a tree with one root node with dummy data.
I will then parse the number digit-by-digit starting from the least significant digit.
At each step, I will take the last digit of the remaining part of the number and insert it into the left subtree of the current node, and remove that digit from the number for that node's left subtree. For the same node, I will then check if the last TWO digits together form a valid letter, and if so, I will put them into the right subtree (and remove the 2 digits from the number for that node's right subtree).
I will then repeat the above steps recursively for each node, using the part of the number that's left, until there are no more digits left.
To illustrate, for 12322 my tree will look something like this:
                *
              /   \
            /       \
           2         22
          /         /  \
         2         3    23
        / \       / \   /
       3   23    2   12 1
      / \  /    /
     2  12 1   1
    /
   1
To get the words then, I will traverse all possible paths from the leaves to the root.
This seems to be an overly complex solution for what I thought would be a fairly simple problem, and I'm trying to find if there's a simpler way to solve this.
You need not actually construct a tree - just recurse:
Take a single digit. See if we can form a word considering it as a letter in itself, and recurse.
When we return from the recursion, try adding another digit (if we were 1 or 2 previously), and re-recursing.
Suppose you already have all the possible combinations of [2, 3, 2, 2]. What would be the combinations of [1, 2, 3, 2, 2] (adding [1] to the head)? It is not difficult to deduce that they are:
A1: put [1] at the head of all_the_combinations_of[2,3,2,2], and
A2: put [1*10 + 2] at the head of all_the_combinations_of[3,2,2] if [1*10 + 2 <= 26]
Once we have this, the rest should be easy. I implemented a Ruby version with the recursion trace for your reference.
def comb a
c = []
puts a.inspect
return [a] if a.length <= 1
c = comb(a[1..-1]).map {|e| [a[0]] + e}
if a[0] * 10 + a[1] <= 26
c += comb(a[2..-1]).map { |f| [a[0] * 10 + a[1]] + f }
end
c
end
h = Hash[*(1..26).to_a.zip(('A'..'Z').to_a).flatten]
#h.keys.sort.each {|k| puts "#{k}=>#{h[k]}"}
comb([1,2,3,2,2]).each do |comb|
puts comb.map {|k| h[k]}.join
end
[1, 2, 3, 2, 2]
A1 [2, 3, 2, 2]
[3, 2, 2]
[2, 2]
[2]
[]
[2, 2]
[2]
[]
A2 [3, 2, 2]
[2, 2]
[2]
[]
ABCBB
ABCV
AWBB
AWV
LCBB
LCV
A brute-force solution would be to dynamically fill an array from 1 to N, where element a[i] contains the set of strings that the first i digits s[1]s[2]...s[i] can expand to. You can fill a[1] from scratch, then fill a[2] based on the a[1] set and the second character in the string. Then you fill a[3], etc. At each step you only have to look back to a[i-1] and a[i-2] (and to s[i-1] and s[i], where s is your number sequence).
Finally, after you fill a[n], it will contain the answer.
For the example '12322', the sequence becomes:
a[1] = { "a" }
a[2] = { a + 'b' | a in a[1] } union { "l" } = { "ab", "l" }
a[3] = { a + 'c' | a in a[2] } union { a + 'w' | a in a[1] } = { "abc", "lc", "aw" }
a[4] = { a + 'b' | a in a[3] } union { } = { "abcb", "lcb", "awb" }
a[5] = { a + 'b' | a in a[4] } union { a + 'v' | a in a[3] } = { "abcbb", "lcbb", "awbb", "abcv", "lcv", "awv" }
This is essentially the dynamic programming version of the recursive solution above.
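The table-filling sequence above might be sketched in Python like this, assuming the A=1 ... Z=26 mapping from the question and lowercase output as in the example. The names `letter` and `decode_all` are hypothetical.

```python
def letter(value):
    """Map 1..26 to 'a'..'z'."""
    return chr(ord("a") + value - 1)


def decode_all(digits):
    """Return the set of all words that the digit string can spell."""
    # words[i] holds every decoding of the first i digits
    words = [set() for _ in range(len(digits) + 1)]
    words[0] = {""}
    for i in range(1, len(digits) + 1):
        one = int(digits[i - 1])
        if 1 <= one <= 9:  # a single digit 1..9 is a letter on its own
            words[i] |= {w + letter(one) for w in words[i - 1]}
        if i >= 2:
            two = int(digits[i - 2:i])
            if 10 <= two <= 26:  # a two-digit value must lie in 10..26
                words[i] |= {w + letter(two) for w in words[i - 2]}
    return words[len(digits)]


print(sorted(decode_all("12322")))
```

Only `words[i - 1]` and `words[i - 2]` are read at each step, so the earlier entries could be discarded if memory matters.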
An alternative way to do this would be to reverse the problem:
Given a dictionary of words, calculate the numeric strings that would be generated, and store this data in a map/dictionary structure, i.e. table['89'] = 'hi'
For each of the first x digits of the number you are looking up, see if it's in the table, i.e. table.ContainsKey('1'), table.ContainsKey('12'), ...
If you're trying to find the word sequences, generate the words that start at each location in the numeric string, and then do a recursive lookup to find all phrases from that.
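A rough Python sketch of this reversed approach, again assuming the A=1 ... Z=26 mapping. The helper names and the tiny word list are invented for illustration; a real dictionary would be loaded from a file. Because different words can produce the same digit string, each table entry holds a list of words.

```python
def word_to_digits(word):
    """Encode a word as the digit string it would generate (a=1 ... z=26)."""
    return "".join(str(ord(ch) - ord("a") + 1) for ch in word.lower())


def build_table(words):
    """Map each digit string to the list of dictionary words that produce it."""
    table = {}
    for word in words:
        table.setdefault(word_to_digits(word), []).append(word)
    return table


def phrases(digits, table):
    """Return every sequence of dictionary words that spells the digit string."""
    if not digits:
        return [[]]
    results = []
    # Try every prefix of the remaining digits as a possible word
    for length in range(1, len(digits) + 1):
        for word in table.get(digits[:length], []):
            for rest in phrases(digits[length:], table):
                results.append([word] + rest)
    return results


table = build_table(["hi", "he"])  # hypothetical two-word dictionary
print(phrases("8985", table))
```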
