BoundsError when using DistributedArrays - multiprocessing

Apparently I must have a fundamental misunderstanding about DistributedArrays.jl.
I have set up an MWE of something similar to what I have to do:
using Distributed
using DistributedArrays
addprocs()
@everywhere using Distributed, DistributedArrays
a = distribute(zeros(5))
@sync @distributed for i in 1:5
    a_l = localpart(a)
    a_l[i] = 100 * i
end
And then I run into the following error:
ERROR: TaskFailedException:
On worker 2:
BoundsError: attempt to access 1-element Array{Float64,1} at index [2]
setindex! at ./array.jl:847
macro expansion at /home/user/test.jl:36 [inlined]
#17 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/macros.jl:301
#160 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/macros.jl:87
#103 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:290
run_work_thunk at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:79
run_work_thunk at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:88
#96 at ./task.jl:356
...and 3 more exception(s).
Stacktrace:
[1] sync_end(::Channel{Any}) at ./task.jl:314
[2] (::Distributed.var"#159#161"{var"#17#18",UnitRange{Int64}})() at ./task.jl:333
Stacktrace:
sync_end(::Channel{Any}) at ./task.jl
top-level scope at task.jl
Using a = dzeros((5,1), workers()) also gives the same error. Any help is appreciated!

There are two problems:
localpart is indexed starting from 1
the number of workers is greater than the loop size, and since iterations are randomly assigned to workers, some iterations land on workers whose localpart is empty.
Let us consider this code:
a = distribute(zeros(5));
@sync @distributed for i in 1:5
    for j in keys(a[:L])
        a[:L][j] = 100 * i + myid()
    end
end
While this solves the first issue, the second is still there:
julia> a
5-element DArray{Float64, 1, Vector{Float64}}:
402.0
503.0
0.0
0.0
0.0
Why does it not work as expected? Because addprocs() adds a worker for every available core, so there are now 8 workers while the loop has only 5 iterations.
Perhaps the simplest solution is to replace the range 1:5 with 1:max(5, nworkers()). This makes sure that each localpart gets processed.
julia> @sync @distributed for i in 1:max(5, nworkers())
           @show i, myid(), length(a[:L])
           for j in keys(a[:L])
               a[:L][j] = 100 * i + myid()
           end
       end
From worker 9: (i, myid(), length(a[:L])) = (6, 9, 0)
From worker 7: (i, myid(), length(a[:L])) = (4, 7, 0)
From worker 2: (i, myid(), length(a[:L])) = (7, 2, 1)
From worker 8: (i, myid(), length(a[:L])) = (5, 8, 0)
From worker 3: (i, myid(), length(a[:L])) = (8, 3, 1)
From worker 4: (i, myid(), length(a[:L])) = (1, 4, 1)
From worker 5: (i, myid(), length(a[:L])) = (2, 5, 1)
From worker 6: (i, myid(), length(a[:L])) = (3, 6, 1)
Task (done) @0x0000000073e09f50
This run clearly shows what happens when you loop over 5 elements using 8 workers.
The result is now as expected (bearing in mind that iterations are randomly allocated across workers):
julia> a
5-element DArray{Float64, 1, Vector{Float64}}:
702.0
803.0
104.0
205.0
306.0

Related

Data Structure to convert a stream of integers into a list of ranges

A function is given with a method to get the next integer from a stream of integers. The numbers are fetched sequentially from the stream. How would we go about producing a summary of the integers encountered so far?
Given the numbers seen so far, the summary consists of the ranges of numbers. Example: if the list so far is [1,5,4,2,7], then the summary is [[1-2],[4-5],7].
A number is put into a range if the values are contiguous.
My Thoughts:
Approach 1:
Maintain the sorted numbers. So when we fetch a new number from a stream, we can use binary search to find the location of the number in the list and insert the element so that the resulting list is sorted. But since this is a list, I think inserting the element will be an O(N) operation.
Approach 2:
Use balanced binary search trees like red-black or AVL trees. Each insertion will be O(log N),
and an in-order traversal will yield the sorted array, from which one can compute the ranges in O(N).
Approach 2 looks better if I am not making any mistakes. I am unsure whether there is a better way to solve this.
I'd not keep the original numbers, but aggregate them to ranges on the fly. This has the potential to reduce the number of elements by quite some factor (depending on the ordering and distribution of the incoming values). The task itself seems to imply that you expect contiguous ranges of integers to appear quite frequently in the input.
Then a newly incoming number can fall into one of a few cases:
It is already contained in some range: then simply ignore the number (this is only relevant if duplicate inputs can happen).
It is adjacent to none of the ranges so far: create a new single-element range.
It is adjacent to exactly one range: extend that range by 1, downward or upward.
It is adjacent to two ranges (i.e. fills the gap): merge the two ranges.
For the data structure holding the ranges, you want a good performance for the following operations:
Find the place (position) for a given number.
Insert a new element (range) at a given place.
Merge two (neighbor) elements. This can be broken down into:
Remove an element at a given place.
Modify an element at a given place.
Depending on the expected number and sparsity of ranges, a sorted list of ranges might do. Otherwise, some kind of search tree might turn out helpful.
Anyway, start with the most readable approach, measure performance for typical cases, and decide whether some optimization is necessary.
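To make the sorted-list option concrete, here is a minimal sketch (my own illustration, not from the answer above; names are made up): a sorted list of disjoint [lo, hi] pairs, binary search to find the place, and the four cases handled explicitly.
import bisect

def add(ranges, x):
    # ranges is a sorted list of disjoint [lo, hi] pairs
    i = bisect.bisect_right(ranges, [x, float('inf')])  # first range with lo > x
    if i > 0 and ranges[i - 1][1] >= x:    # case 1: already contained
        return
    left = i > 0 and ranges[i - 1][1] == x - 1
    right = i < len(ranges) and ranges[i][0] == x + 1
    if left and right:                     # case 4: fills the gap, merge
        ranges[i - 1][1] = ranges[i][1]
        del ranges[i]
    elif left:                             # case 3: extend left neighbour upward
        ranges[i - 1][1] = x
    elif right:                            # case 3: extend right neighbour downward
        ranges[i][0] = x
    else:                                  # case 2: new single-element range
        ranges.insert(i, [x, x])

ranges = []
for x in [1, 5, 4, 2, 7]:
    add(ranges, x)
print(ranges)  # [[1, 2], [4, 5], [7, 7]]
Finding the place is O(log k), while inserting or removing a range is O(k) because of list shifts, which is exactly why a search tree may pay off when there are many ranges.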
I suggest maintaining a hashmap that maps each integer seen so far to the interval it belongs to.
Make sure that two numbers that are part of the same interval will point to the same interval object, not to copies; so that if you update an interval to extend it, all numbers can see it.
All operations are O(1), except the operation "merge two intervals" that happens if the stream produces integer x when we have two intervals [a, x - 1] and [x + 1, b]. The merge operation is proportional to the length of the shortest of these two intervals.
As a result, for a stream of n integers, the algorithm's complexity is O(n) in the best case (where at most a few big merges happen) and O(n log n) in the worst case (when we keep merging lots of intervals): an integer can be on the shorter side of a merge only O(log n) times, since the interval containing it at least doubles in size each time.
In Python:
def add_element(intervals, x):
    if x in intervals:  # do not do anything
        pass
    elif x + 1 in intervals and x - 1 in intervals:  # merge two intervals
        i = intervals[x - 1]
        j = intervals[x + 1]
        if i[1] - i[0] > j[1] - j[0]:  # j is shorter: update i, and make everything in j (and x itself) point to i
            i[1] = j[1]
            for y in range(j[0] - 1, j[1] + 1):
                intervals[y] = i
        else:  # i is shorter: update j, and make everything in i (and x itself) point to j
            j[0] = i[0]
            for y in range(i[0], i[1] + 2):
                intervals[y] = j
    elif x + 1 in intervals:  # extend one interval to the left
        i = intervals[x + 1]
        i[0] = x
        intervals[x] = i
    elif x - 1 in intervals:  # extend one interval to the right
        i = intervals[x - 1]
        i[1] = x
        intervals[x] = i
    else:  # add a singleton
        intervals[x] = [x, x]
    return intervals

from random import shuffle

def main():
    stream = list(range(10)) * 2
    shuffle(stream)
    print(stream)
    intervals = {}
    for x in stream:
        intervals = add_element(intervals, x)
        print(x)
        print(set(map(tuple, intervals.values())))  # this line terribly inefficient because I'm lazy

if __name__ == '__main__':
    main()
Output:
[1, 5, 8, 3, 9, 6, 7, 9, 3, 0, 6, 5, 8, 1, 4, 7, 2, 2, 0, 4]
1
{(1, 1)}
5
{(1, 1), (5, 5)}
8
{(8, 8), (1, 1), (5, 5)}
3
{(8, 8), (1, 1), (5, 5), (3, 3)}
9
{(8, 9), (1, 1), (5, 5), (3, 3)}
6
{(3, 3), (1, 1), (8, 9), (5, 6)}
7
{(5, 9), (1, 1), (3, 3)}
9
{(5, 9), (1, 1), (3, 3)}
3
{(5, 9), (1, 1), (3, 3)}
0
{(0, 1), (5, 9), (3, 3)}
6
{(0, 1), (5, 9), (3, 3)}
5
{(0, 1), (5, 9), (3, 3)}
8
{(0, 1), (5, 9), (3, 3)}
1
{(0, 1), (5, 9), (3, 3)}
4
{(0, 1), (3, 9)}
7
{(0, 1), (3, 9)}
2
{(0, 9)}
2
{(0, 9)}
0
{(0, 9)}
4
{(0, 9)}
You could use a Disjoint Set Forest implementation for this. If well implemented, it gives a near-linear time complexity for inserting 𝑛 elements into it. The amortized running time of each insert operation is Θ(α(𝑛)) where α(𝑛) is the inverse Ackermann function. For all practical purposes we cannot distinguish this from O(1).
The extraction of the ranges can have a time complexity of O(𝑘), where 𝑘 is the number of ranges, provided that the disjoint set maintains the set of root nodes. If the ranges need to be sorted, the extraction has a time complexity of O(𝑘 log 𝑘), as it then just performs a sort on them.
Here is an implementation in Python:
class Node:
    def __init__(self, value):
        self.low = value
        self.parent = self
        self.size = 1

    def find(self):  # Union-Find: Path splitting
        node = self
        while node.parent is not node:
            node, node.parent = node.parent, node.parent.parent
        return node

class Ranges:
    def __init__(self):
        self.nums = dict()
        self.roots = set()

    def union(self, a, b):  # Union-Find: Size-based merge
        a = a.find()
        b = b.find()
        if a is not b:
            if a.size > b.size:
                a, b = b, a
            self.roots.remove(a)  # Keep track of roots
            a.parent = b
            b.low = min(a.low, b.low)
            b.size = a.size + b.size

    def add(self, n):
        if n not in self.nums:
            self.nums[n] = node = Node(n)
            self.roots.add(node)
            if (n + 1) in self.nums:
                self.union(node, self.nums[n + 1])
            if (n - 1) in self.nums:
                self.union(node, self.nums[n - 1])

    def get(self):
        return sorted((node.low, node.low + node.size - 1) for node in self.roots)

# example run
ranges = Ranges()
for n in 4, 7, 1, 6, 2, 9, 5:
    ranges.add(n)
print(ranges.get())  # [(1, 2), (4, 7), (9, 9)]

Detect outlier in repeating sequence

I have a repeating sequence of say 0~9 (but may start and stop at any of these numbers). e.g.:
3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2
And it has outliers at random location, including 1st and last one, e.g.:
9,4,5,6,7,8,9,0,1,2,3,4,8,6,7,0,9,0,1,2,3,4,1,6,7,8,9,0,1,6
I need to find & correct the outliers; in the above example, I need to correct the first "9" to "3", the "8" to "5", and so on.
What I came up with is to construct an outlier-free sequence of the desired length. But since I don't know which number the sequence starts with, I'd have to construct 10 sequences, each starting from "0", "1", "2" ... "9". Then I can compare these 10 sequences with the given sequence and find the one that matches it the most. However, this gets very inefficient when the repeating pattern is large (say if the repeating pattern is 0~99, I'd need to create 100 sequences to compare).
Assuming there won't be consecutive outliers, is there a way to find & correct these outliers efficiently?
edit: added some explanation and added the algorithm tag. Hopefully it is more appropriate now.
I'm going to propose a variation of @trincot's fine answer. Like that one, it doesn't care how many outliers there may be in a row; unlike that one, it also doesn't care how many in a row aren't outliers.
The base idea is just to let each sequence element "vote" on what the first sequence element "should be". Whichever gets the most votes wins. By construction, this maximizes the number of elements left unchanged: after the 1-liner loop ends, votes[i] is the number of elements left unchanged if i is picked as the starting point.
def correct(numbers, mod=None):
    # this part copied from @trincot's program
    if mod is None:  # if argument is not provided:
        # Make a guess what the range is of the values
        mod = max(numbers) + 1
    votes = [0] * mod
    for i, x in enumerate(numbers):
        # which initial number would make x correct?
        votes[(x - i) % mod] += 1
    winning_count = max(votes)
    winning_numbers = [i for i, v in enumerate(votes)
                       if v == winning_count]
    if len(winning_numbers) > 1:
        raise ValueError("ambiguous!", winning_numbers)
    winning_number = winning_numbers[0]
    for i in range(len(numbers)):
        numbers[i] = (winning_number + i) % mod
    return numbers
Then, e.g.,
>>> correct([9,4,5,6,7,8,9,0,1,2,3,4,8,6,7,0,9,0,1,2,3,4,1,6,7,8,9,0,1,6])
[3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
but
>>> correct([1, 5, 3, 7, 5, 9])
...
ValueError: ('ambiguous!', [1, 4])
That is, it's impossible to guess whether you want [1, 2, 3, 4, 5, 6] or [4, 5, 6, 7, 8, 9]. They both have 3 numbers "right", and despite that there are never two adjacent outliers in either case.
I would do a first scan of the list to find the longest sublist in the input that maintains the right order. We will then assume that those values are all correct, and calculate backwards what the first value would have to be to produce those values in that sublist.
Here is how that would look in Python:
def correct(numbers, mod=None):
    if mod is None:  # if argument is not provided:
        # Make a guess what the range is of the values
        mod = max(numbers) + 1
    # Find the longest slice in the list that maintains order
    start = 0
    longeststart = 0
    longest = 1
    expected = -1
    for last in range(len(numbers)):
        if numbers[last] != expected:
            start = last
        elif last - start >= longest:
            longest = last - start + 1
            longeststart = start
        expected = (numbers[last] + 1) % mod
    # Get from that longest slice what the starting value should be
    val = (numbers[longeststart] - longeststart) % mod
    # Repopulate the list starting from that value
    for i in range(len(numbers)):
        numbers[i] = val
        val = (val + 1) % mod

# demo use
numbers = [9,4,5,6,7,8,9,0,1,2,3,4,8,6,7,0,9,0,1,2,3,4,1,6,7,8,9,0,1,6]
correct(numbers, 10)  # for 0..9 provide 10 as argument, ...etc
print(numbers)
The advantage of this method is that it even gives a good result if two consecutive values are wrong, provided of course that there are enough correct values in the list. And it still runs in linear time.
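To illustrate that claim (a quick check of my own; the values are chosen arbitrarily), two adjacent outliers at the start are still repaired, because the longest ordered slice wins:
numbers = [9, 9, 5, 6, 7, 8, 9, 0, 1, 2]
correct(numbers, 10)
print(numbers)  # [3, 4, 5, 6, 7, 8, 9, 0, 1, 2]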
Here is another way using groupby and count from Python's itertools module:
from itertools import count, groupby
def correct(lst):
    groupped = [list(v) for _, v in groupby(lst, lambda a, b=count(): a - next(b))]
    # Check if all groups are singletons
    if all(len(k) == 1 for k in groupped):
        raise ValueError('All groups are singletons!')
    for k, v in zip(groupped, groupped[1:]):
        if len(k) < 2:
            out = v[0] - 1
            if out >= 0:
                yield out
            else:
                yield from k
        else:
            yield from k
    # check last element of the groupped list
    if len(v) < 2:
        yield k[-1] + 1
    else:
        yield from v
lst = "9,4,5,6,7,8,9,0,1,2,3,4,8,6,7,0,9,0,1,2,3,4,1,6,7,8,9,0,1,6"
lst = [int(k) for k in lst.split(',')]
out = list(correct(lst))
print(out)
Output:
[3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
Edit:
For the case of [1, 5, 3, 7, 5, 9] this solution will return something inaccurate, because it can't tell which values you want to modify. This is why the best solution is to check and raise a ValueError when all groups are singletons.
Like this?
numbers = [9,4,5,6,7,8,9,0,1,2,3,4,8,6,7,0,9,0,1,2,3,4,1,6,7,8,9,0,1,6]
i = 0
for n in numbers[:-1]:
    i += 1
    if n > numbers[i] and n > 0:
        numbers[i-1] = numbers[i] - 1
    elif n > numbers[i] and n == 0:
        numbers[i-1] = 9
n = numbers[-1]
if n > numbers[0] and n > 0:
    numbers[-1] = numbers[0] - 1
elif n > numbers[0] and n == 0:
    numbers[-1] = 9
print(numbers)

How do I find the smallest combination of 2 numbers to get closest to another number

I have two numbers: 6 & 10
I want to use a combination of these 2 numbers to get as close as possible to another number.
For example, to get to 9 I need 1 six with 3 remaining.
Other examples:
6: [6]
10: [10]
12: [6, 6]
18: [6, 6, 6]
20: [10, 10]
24: [6, 6, 6, 6]
26: [10, 10, 6]
28: [10, 6, 6, 6]
30: [10, 10, 10]
32: [10, 10, 6, 6]
I need an algorithm that can find the smallest combination for any given number, preferring the combination with the smallest remainder, i.e.
38: [10, 10, 10, 6] - 2 remaining
38: [10, 10, 6, 6, 6] - no remainder, so preferred result
I hope I've explained this clearly, let me know if I need to clarify.
UPDATE:
To clarify, this is a real-world problem dealing with physical goods. The numbers 6 & 10 correspond to package cartons that contain multiples of a product, in quantities of either 6 or 10. We accept orders for these products in any amount, and want to calculate the smallest number of cartons that can make up the order, adding the remainder as individual quantities.
A customer may want to order a qty of 39, so I need to know the smallest number of 6/10 qty cartons to make up the order, with the smallest number of remainders being the priority.
A customer may also order qtys of 1,2,3,4,5,6..... up to a max of about 300.
Considering that you want to factorize n into factors of a and b, you just need to try the two possible orders and check which one gives the smaller remainder.
So, you can do something like this:
def factorize(a, b, n):
    # integer division is needed here (the original was Python 2 style)
    return n // a, (n % a) // b, (n % a) % b

def factorize_min(a, b, n):
    na1, nb1, r1 = factorize(a, b, n)
    nb2, na2, r2 = factorize(b, a, n)
    return (na1, nb1) if r1 < r2 else (na2, nb2)

def factorize_min_list(a, b, n):
    na, nb = factorize_min(a, b, n)
    return [a] * na + [b] * nb
And use it like this:
for n in (6, 10, 12, 18, 20, 24, 26, 28, 30, 32):
    print(factorize_min_list(6, 10, n))
This would give you:
[6]
[10]
[6, 6]
[6, 6, 6]
[10, 10]
[6, 6, 6, 6]
[6, 10, 10]
[6, 10, 10]
[10, 10, 10]
[10, 10, 10]
This is a kind of change-making problem that can be solved effectively with dynamic programming. Note that if the needed sum cannot be produced exactly (like 9 in your example), you check the lower neighbouring cells of the DP table; see the sketch below.
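A minimal sketch of that DP (my own illustration; the function name and structure are assumptions, not from the comment above): for every total up to the target, store the fewest cartons that reach it exactly, then walk down from the target to the nearest reachable total.
def best_cartons(target, sizes=(6, 10)):
    INF = float('inf')
    best = [0] + [INF] * target      # best[t] = fewest cartons summing exactly to t
    choice = [None] * (target + 1)   # choice[t] = last carton size used for total t
    for t in range(1, target + 1):
        for s in sizes:
            if t >= s and best[t - s] + 1 < best[t]:
                best[t] = best[t - s] + 1
                choice[t] = s
    t = target
    while best[t] == INF:            # "check lower neighbor cells"
        t -= 1
    cartons = []
    while t > 0:                     # backtrack the chosen carton sizes
        cartons.append(choice[t])
        t -= choice[t]
    return sorted(cartons, reverse=True), target - sum(cartons)

print(best_cartons(38))  # ([10, 10, 6, 6, 6], 0)
print(best_cartons(9))   # ([6], 3)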
This is more or less the coin problem. Above the Frobenius number, all values can be built. Here we have 6 and 10, which aren't coprime, so we can divide by the greatest common divisor (2) to get 3 and 5, which are coprime. That gives the Frobenius number 3*5 - 5 - 3 = 7, which means all even values > 14 can be built using 6 and 10 "coins". The remaining values are so few that you can just list them:
3 (% 3 = 0), coins (1, 0)
5 (% 3 = 2), coins (0, 1)
6 (% 3 = 0), coins (2, 0)
8 (% 3 = 2), coins (1, 1)
9 (% 3 = 0), coins (3, 0)
10 (% 3 = 1), coins (0, 2)
So the algorithm would be as follows:
divide the input by 2
if it is smaller than 11, return the closest of the 6 values in the list
otherwise compute input % 3, take the corresponding configuration (8, 9 or 10 from the list), subtract it from the input, divide by 3 and add the result to the count of 3-coins (i.e. 6s); a sketch in code follows the example below
Example for 32 (or 33):
32 / 2 = 16
16 >= 11
16 % 3 = 1, we take 10 (0, 2), (16 - 10) / 3 = 2, add it to 3 coins => (2, 2)
check: 2 * 6 + 2 * 10 = 32
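In code, that recipe could look like this (a sketch of my own based on the steps above; the lookup tables and names are my assumptions, and odd inputs simply lose their remainder of 1 in the halving step):
# value-after-halving -> (number of 3s i.e. 6s, number of 5s i.e. 10s)
BASE = {3: (1, 0), 5: (0, 1), 6: (2, 0), 8: (1, 1), 9: (3, 0), 10: (0, 2)}
BY_MOD = {2: 8, 0: 9, 1: 10}   # remainder mod 3 -> base configuration

def build(n):
    half = n // 2                      # divide input by 2 (the gcd of 6 and 10)
    if half < 11:
        nearest = min(BASE, key=lambda v: abs(v - half))  # closest listed value
        sixes, tens = BASE[nearest]
    else:
        base = BY_MOD[half % 3]
        sixes, tens = BASE[base]
        sixes += (half - base) // 3    # pad the rest with 3s, i.e. 6-cartons
    return sixes, tens

print(build(32))  # (2, 2) -> 2 * 6 + 2 * 10 = 32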
The following is a C version that solves your problem with dynamic programming. After you compile it, you can run it with the total number of items as a parameter; it will output the package sizes 10 and 6 separated by tabs, followed by their sum.
I took the algorithm from the German Wikipedia page on the knapsack problem:
https://de.m.wikipedia.org/wiki/Rucksackproblem
It is quoted (translated to English) in the comment at the beginning:
/*
U = { 10, 6 } n = 2
w = { 10, 6 }
v = { 2, 1 }
Input: U, B, w, v as described above
R := matrix [1…(n+1), 0…B], with all entries 0
FOR i = n … 1
  FOR j = 1 … B
    IF w(i) <= j
      R[i,j] := max( v(i) + R[i+1, j-w(i)], R[i+1,j] )
    ELSE
      R[i,j] := R[i+1,j]
Output: R[1,B]
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int n = 2;

int main(int argc, char** argv) {
    if (argc < 2) return 1;
    int w[3] = { -1, 10, 6 };
    int v[3] = { -1, 10, 6 };
    int B = atoi(argv[1]);
    /* C has no R[i,j] syntax: the table has n+2 rows and B+1 columns,
       flattened so that entry (i,j) lives at R[i*(B+1) + j]. */
    size_t sz = (size_t)(n + 2) * (B + 1) * sizeof(int);
    int* R = malloc(sz);
    memset(R, 0, sz);
#define IDX(i, j) ((i) * (B + 1) + (j))
    for (int i = n; i > 0; i--) {
        for (int j = 1; j <= B; j++) {
            int b = R[IDX(i + 1, j)];
            if (w[i] <= j) {
                /* max( v(i) + R[i+1, j-w(i)], R[i+1,j] ) */
                int a = v[i] + R[IDX(i + 1, j - w[i])];
                R[IDX(i, j)] = a > b ? a : b;
            } else {
                R[IDX(i, j)] = b;
            }
        }
    }
    /* Reconstruct one packing: repeatedly find a package size whose
       removal leaves an exactly reachable total. */
    int k = R[IDX(1, B)];
    while (R[IDX(1, k)] > 0) {
        int j = R[IDX(1, k)];
        for (int i = n; i > 0; i--) {
            if (k - w[i] < 0) continue;
            int t = R[IDX(1, k - w[i])];
            if (t == k - w[i]) {
                j = w[i];
            }
        }
        printf("%i\t", j);
        k = k - j;
    }
    printf("\n%i\n", R[IDX(1, B)]);
    free(R);
    return 0;
}
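For example, assuming the file is saved as knapsack.c (a filename of my choosing), compiling with gcc knapsack.c -o knapsack and running ./knapsack 38 should print the carton sizes 10 10 6 6 6 separated by tabs, followed by the reachable total 38, matching the preferred packing from the question.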

How many times a number appears as a leaf node?

Suppose you have an array of n elements:
A = {1,2,3,4,5}
A total of 5! binary search trees is possible (not necessarily distinct). Now my question is: in how many of the trees does 1 appear as a leaf node, in how many does 2 appear as a leaf node, and so on?
What I have tried:
I've seen that for A = {1,2,3}:
2 appears 6/3 = 2 times
1 appears 2+1 = 3 times
3 appears 2+1 = 3 times
Can I generalise that and say that, if A = {1,2,3,4}:
2 appears 24/4 = 6 times
3 appears 24/4 = 6 times
1 appears 6+1 = 7 times
4 appears 6+1 = 7 times
4 = 6+1 = 7 times
We can generalize, but not in that way.
You can try permuting the array to produce all possible BSTs. A brute-force approach that returns the answer in a map/dictionary data structure shouldn't be that hard. First write a function that, given one of the permuted arrays, finds all leaves. It takes the first element as root, sends all elements less than the root to the left and all greater ones to the right, calls itself recursively on both parts, and returns the combined values.
In the end, combine the values over all possible permutations.
A possible approach in Python:
from itertools import permutations
def func(arr):
    if not arr: return set()  # empty set, not an empty dict
    if len(arr) == 1: return {arr[0]}
    ans = set()
    left = func([v for v in arr[1:] if v < arr[0]])
    right = func([v for v in arr[1:] if v >= arr[0]])
    ans.update(left)
    ans.update(right)
    return ans

arr = [1,2,3,4]
ans = {i: 0 for i in arr}
for a in permutations(arr):
    dic = func(a)
    print(a, ":", dic)
    for k in dic:
        ans[k] += 1
print(ans)
for [1,2,3] it outputs:
(1, 2, 3) : {3}
(1, 3, 2) : {2}
(2, 1, 3) : {1, 3}
(2, 3, 1) : {1, 3}
(3, 1, 2) : {2}
(3, 2, 1) : {1}
{1: 3, 2: 2, 3: 3}
for [1,2,3,4], only the last line, i.e. the answer, is:
{1: 12, 2: 8, 3: 8, 4: 12}
for [1,2,3,4,5], it is:
{1: 60, 2: 40, 3: 40, 4: 40, 5: 60}
Can you see the pattern? Well, one last example. For [1,2,3,4,5,6] it is:
{1: 360, 2: 240, 3: 240, 4: 240, 5: 240, 6: 360}
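The counts above fit a known property of BSTs built from random permutations (my own generalisation of the outputs, stated here for completeness): value k ends up as a leaf exactly when it is inserted last among its value-neighbours, which happens with probability 1/2 for the two extremes (last of a pair) and 1/3 for interior values (last of {k-1, k, k+1}). A quick sketch:
from math import factorial

def leaf_counts(n):
    # extremes: n!/2; interior values: n!/3
    return {k: factorial(n) // (2 if k in (1, n) else 3) for k in range(1, n + 1)}

print(leaf_counts(5))  # {1: 60, 2: 40, 3: 40, 4: 40, 5: 60}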

Pseudocode to find the longest run within an array

I know that a run is a sequence of adjacent repeated values. How would you write pseudocode for computing the length of the longest run in an array? e.g.
5 would be the value with the longest run (four in a row) in this array of integers:
1 2 4 4 3 1 2 4 3 5 5 5 5 3 6 5 5 6 3 1
Any idea would be helpful.
def longest_run(array):
    result = None
    prev = None
    size = 0
    max_size = 0
    for element in array:
        if element == prev:
            size += 1
            if size > max_size:
                result = element
                max_size = size
        else:
            size = 0
        prev = element
    return result
EDIT
Wow. Just wow! This pseudocode is actually working:
>>> longest_run([1,2,4,4,3,1,2,4,3,5,5,5,5,3,6,5,5,6,3,1])
5
max_run_length = 0;
current_run_length = 1;
loop through the array storing the current index's value and the previous index's value:
    if the value is the same as the previous one, current_run_length++;
    otherwise {
        if current_run_length > max_run_length : max_run_length = current_run_length
        current_run_length = 1;
    }
after the loop, compare current_run_length with max_run_length once more, in case the longest run ends at the last element
Here is a different, functional approach in Python (Python looks like pseudocode). This code works only with Python 3.3+; otherwise you must replace "return" with "raise StopIteration".
I'm using a generator to yield a tuple with the quantity of the element and the element itself. It's more universal: you can use this for infinite sequences too, but if you want to get the longest repeated element, the sequence must be finite.
def group_same(iterable):
    iterator = iter(iterable)
    last = next(iterator)
    counter = 1
    while True:
        try:
            element = next(iterator)
            if element == last:  # compare by equality, not identity
                counter += 1
                continue
            else:
                yield (counter, last)
                counter = 1
                last = element
        except StopIteration:
            yield (counter, last)
            return
If you have a list like this:
li = [0, 0, 2, 1, 1, 1, 1, 1, 5, 5, 6, 7, 7, 7, 12, 'Text', 'Text', 'Text2']
Then you can make a new list of it:
list(group_same(li))
Then you'll get a new list:
[(2, 0),
(1, 2),
(5, 1),
(2, 5),
(1, 6),
(3, 7),
(1, 12),
(2, 'Text'),
(1, 'Text2')]
To get longest repeated element, you can use the max function.
gen = group_same(li) # Generator, does nothing until iterating over it
grouped_elements = list(gen) # iterate over the generator until it's exhausted
longest = max(grouped_elements, key=lambda x: x[0])
Or as a one liner:
max(list(group_same(li)), key=lambda x: x[0])
The function max gives us the biggest element in a list. In this case the elements are tuples, so the argument key is used to compare them by their first element (the count); you still get back the whole tuple.
In : max(list(group_same(li)), key=lambda x: x[0])
Out: (5, 1)
The element 1 occurred 5 times repeatedly.
#include <iostream>
using namespace std;

int main()
{
    int a[20] = {1, 2, 4, 4, 3, 1, 2, 4, 3, 5, 5, 5, 5, 3, 6, 5, 5, 6, 3, 1};
    int c = 0;
    for (int i = 0; i < 19; i++)
    {
        if (a[i] == a[i+1])  // count adjacent equal pairs across the whole array
        {
            c++;
        }
    }
    cout << c - 1;
    return 0;
}
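For completeness, a short sketch of my own (not part of the answer above) that returns the run length itself rather than the repeated value; for the example array it returns 4, the four consecutive 5s:
def longest_run_length(array):
    best = 0
    current = 0
    prev = object()              # sentinel that equals nothing in the array
    for element in array:
        current = current + 1 if element == prev else 1
        best = max(best, current)
        prev = element
    return best

print(longest_run_length([1,2,4,4,3,1,2,4,3,5,5,5,5,3,6,5,5,6,3,1]))  # 4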
