python appending issues, function keeps changing values of list - sorting

I was trying to visualize bubblesort by making an animated plot on some unsorted list, say np.random.permutation(10)
so naturally I would append the list every time it's altered within the bubblesort function until it's completely sorted. Here's the code
def bubblesort(A):
instant = []
for i in range(len(A)-1):
lindex=0
while lindex+1<len(A):
if A[lindex]> A[lindex+1]:
swap(A,lindex,lindex+1)
lindex+=1
else:
lindex+=1
instant.append(A)
return instant
The problem is though, instant only returns
[array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])]
which is obviously not right. What has gone wrong? Thanks!

A is being operated on in-place, and bubblesort is returning a list of references to this array. Notice that if you check A now, it is also sorted.
Changing
if A[lindex]> A[lindex+1]:
swap(A,lindex,lindex+1)
to
if A[lindex]> A[lindex+1]:
A = A.copy()
swap(A,lindex,lindex+1)
making a copy before changing anything, should show the progress of the sort.

Related

Creating a nested dictionary comprehension for year and month in python

I would like to create a nested dictionary with dict comprehension but I am getting syntax error.
years = [2016, 2017, 2018, 2019]
months = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
my_Dict = {i:{j: for j in months}, for i in years}
I am not sure how to declare this nested dict comprehension without getting a syntax error.
In this case, the correct way would be to use a dictionary with a nested list comprehension because if you use a nested dictionary, it will replace old values. The correct syntax, in this case, would be this one:
years = [2016, 2017, 2018, 2019]
months = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
my_Dict = {
year: [month for month in months] for year in years
}
print(my_Dict)
>>> {2016: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
2017: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
2018: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
2019: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]}
Made some slight changes to the code above and this works
key_years = {2016, 2017, 2018, 2019}
key_months = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
myDict = {i:{j for j in key_months} for i in key_years}
print (myDict)
Output: {2016: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}, 2017: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}, 2018: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}, 2019: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}}

Linux script using a Hardware (True) Random number generator

I'd like to use the built in hardware random number generator in my RPI3 for a project. Currently I'm only able to use /dev/hwrng to save binary dumps with
dd if=/dev/hwrng of=data.bin bs=25 count=1
What I need for my project is to read 200 bit long data chunks from the random source (/dev/hwrng) with a frequency of 1 reading/second and count the 1's in it and write the result as decimal into a text file with a timestamp, like this:
datetime, value
11/20/2018 12:48:09, 105
11/20/2018 12:48:10, 103
11/20/2018 12:48:11, 97
The decimal number should be always close to 100, since it is a random data source and the expected number of 1's and 0's should be the same.
Any help is appreciated....
I did come up wit a perl script that is close to what I wan't, so let me share it. I'm sure it could be done in a much cleaner way though...
#!/usr/bin/perl
use strict;
use warnings;
use DateTime;
my #bitcounts = (
0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, 1, 2, 2, 3, 2, 3,
3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4,
3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 1, 2,
2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5,
3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5,
5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 1, 2, 2, 3,
2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4,
4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4,
4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6,
5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5,
5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8
);
for (my $i=0; $i <= 10; $i++) {
system("dd if=/dev/hwrng of=temprnd.bin bs=25 count=1 status=none");
my $filename = 'temprnd.bin';
open(my $fh, '<', $filename) or die "$!";
binmode $fh;
my $count = 0;
my $byte = 0;
while ( read $fh, $byte, 1 ) {
$count += $bitcounts[ord($byte)];
}
my $dt = DateTime->now;
print join ',', $dt->ymd, $dt->hms,"$count\n";
system("rm temprnd.bin");
sleep 1;
}
__END__
Try running the following code
for ((n=0; n<200; ++n)); do echo $(date '+%m/%d/%Y %H:%M:%S'), $(od -vAn -N1 -tu1 < /dev/hwrng); sleep 1; done
If you want to save it to a file, add simple redirect in the end
> somefile
Updating on the new request, try running the following code
for ((n=0; n<10; ++n)); do
count=0
for ((s=0; s<200; ++s)); do
if (( $(od -vAn -N1 -tu1 < /dev/hwrng) > 127 )); then ((++count)); fi
done
echo $(date '+%m/%d/%Y %H:%M:%S'), $count
sleep 1
done

Performance analysis of openmp code (parallel) vs serial code

How do I compare the performance of parallel code(using OpenMP) vs serial code? I am using the following method
int arr[1000] = {1, 6, 1, 3, 1, 9, 7, 3, 2, 0, 5, 0, 8, 9, 8, 4, 4, 4, 0, 9, 6, 5, 9, 5, 9, 2, 5, 8, 6, 1, 0, 7, 7, 3, 2, 8, 3, 2, 3, 7, 2, 0, 7, 2, 9, 5, 8, 6, 2, 8, 5, 8, 5, 6, 3, 5, 8, 1, 3, 7, 2, 6, 6, 2, 1, 9, 0, 6, 1, 6, 3, 5, 6, 3, 0, 8, 0, 8, 4, 2, 7, 1, 0, 2, 7, 6, 9, 7, 7, 5, 4, 9, 3, 1, 1, 4, 2, 4, 1, 5, 2, 6, 0, 8, 9, 2, 6, 0, 1, 0, 2, 0, 3, 3, 4, 0, 1, 4, 8, 8, 1, 4, 9, 4, 7, 3, 8, 9, 9, 1, 4, 1, 8, 7, 9, 9, 9, 8, 9, 0, 0, 4, 2, 4, 9, 7, 6, 0, 3, 4, 8, 6, 1, 9, 0, 8, 2, 0, 8, 1, 2, 4, 2, 2, 1, 4, 1, 1, 4, 3, 3, 4, 9, 8, 0, 8, 7, 7, 8, 0, 3, 8, 8, 4, 7, 8, 5, 2, 0, 3, 3, 4, 9, 8, 6, 1, 4, 0, 4, 8, 5, 9, 4, 4, 7, 5, 2, 4, 2, 2, 6, 5, 2, 4, 2, 1, 4, 7, 3, 5, 2, 7, 9, 1, 7, 8, 4, 3, 0, 8, 1, 5, 8, 7, 1, 7, 2, 5, 2, 6, 9, 8, 2, 1, 5, 4, 2, 9, 1, 6, 6, 5, 5, 8, 6, 4, 6, 1, 7, 8, 1, 0, 3, 9, 7, 6, 7, 2, 1, 1, 8, 2, 9, 2, 3, 6, 8, 7, 8, 9, 5, 4, 4, 2, 2, 3, 6, 8, 4, 5, 6, 5, 7, 1, 7, 7, 9, 6, 9, 2, 7, 9, 4, 8, 2, 7, 5, 0, 7, 3, 2, 2, 9, 8, 7, 2, 3, 5, 2, 9, 1, 1, 5, 8, 4, 4, 5, 4, 0, 6, 6, 9, 8, 1, 7, 0, 0, 4, 2, 7, 9, 6, 2, 9, 7, 9, 1, 0, 4, 3, 0, 7, 6, 7, 8, 1, 1, 5, 5, 3, 4, 3, 2, 2, 4, 1, 2, 7, 6, 6, 4, 5, 3, 8, 4, 2, 9, 7, 2, 6, 3, 4, 3, 9, 1, 1, 0, 4, 9, 5, 7, 3, 9, 1, 5, 5, 5, 9, 2, 3, 5, 9, 8, 0, 9, 5, 2, 9, 4, 7, 5, 7, 1, 0, 7, 5, 4, 7, 9, 3, 5, 9, 8, 6, 2, 3, 1, 7, 2, 6, 0, 9, 7, 1, 2, 6, 8, 4, 5, 2, 3, 2, 2, 7, 3, 9, 2, 9, 6, 3, 2, 3, 2, 2, 9, 7, 5, 3, 4, 9, 9, 7, 8, 6, 0, 0, 4, 0, 7, 2, 4, 0, 4, 6, 9, 9, 5, 1, 0, 4, 5, 4, 7, 9, 6, 9, 6, 1, 2, 3, 0, 3, 2, 1, 1, 4, 1, 5, 4, 0, 7, 8, 3, 4, 5, 2, 5, 2, 6, 6, 6, 1, 0, 6, 2, 9, 5, 1, 0, 9, 6, 3, 4, 8, 4, 5, 2, 7, 2, 8, 8, 2, 6, 1, 6, 3, 5, 3, 6, 1, 1, 4, 4, 2, 0, 7, 1, 7, 0, 3, 8, 6, 6, 2, 6, 2, 7, 0, 0, 2, 8, 0, 4, 6, 3, 2, 0, 8, 5, 8, 2, 7, 2, 6, 1, 5, 5, 4, 4, 5, 9, 3, 3, 8, 7, 9, 0, 7, 1, 2, 9, 1, 2, 3, 8, 7, 5, 0, 8, 0, 8, 0, 9, 2, 6, 0, 7, 2, 6, 4, 9, 6, 7, 3, 4, 6, 4, 6, 3, 6, 9, 2, 7, 3, 5, 7, 1, 2, 7, 9, 5, 7, 1, 4, 0, 7, 7, 9, 1, 3, 3, 1, 1, 2, 4, 5, 9, 0, 4, 4, 6, 3, 7, 6, 8, 4, 3, 1, 7, 1, 2, 2, 8, 3, 6, 0, 1, 5, 0, 2, 1, 5, 5, 2, 0, 9, 0, 1, 0, 4, 5, 8, 7, 2, 4, 7, 7, 0, 9, 6, 1, 1, 8, 1, 5, 6, 4, 8, 2, 4, 0, 3, 1, 6, 5, 1, 7, 7, 4, 9, 1, 0, 0, 0, 4, 6, 8, 3, 6, 7, 9, 9, 0, 9, 3, 5, 6, 7, 3, 8, 3, 6, 3, 4, 4, 0, 8, 1, 8, 2, 3, 1, 4, 3, 2, 9, 1, 0, 4, 8, 9, 4, 9, 9, 3, 2, 7, 1, 9, 0, 1, 4, 8, 4, 9, 2, 7, 9, 6, 5, 1, 1, 6, 8, 4, 0, 9, 7, 2, 3, 5, 1, 9, 7, 3, 5, 9, 0, 6, 1, 2, 8, 5, 1, 4, 6, 5, 1, 5, 3, 8, 9, 4, 7, 7, 0, 9, 6, 8, 2, 9, 3, 5, 9, 2, 8, 4, 2, 0, 2, 5, 3, 2, 2, 6, 7, 9, 3, 0, 6, 7, 1, 5, 1, 0, 2, 2, 9, 0, 2, 1, 2, 7, 7, 3, 0, 7, 9, 4, 8, 1, 9, 3, 4, 1, 1, 3, 2, 6, 3, 9, 3, 6, 6, 7, 6, 1, 1, 6, 1, 3, 9, 3, 2, 6, 8, 2, 6, 7, 6, 4, 1, 5, 9, 5, 9, 2, 0, 3, 8, 5, 2, 4, 2, 9, 3, 8, 0, 6, 6, 3, 1, 6, 9, 3, 2, 7, 6, 0, 7, 2, 6, 8, 0, 5, 5, 9, 9, 5, 4, 8, 0, 7, 4, 2, 8, 9, 3, 0, 5, 9, 3, 6, 5, 4, 9, 0, 2, 7, 2, 9, 0, 9, 9, 2, 6, 4, 3, 6, 9, 7, 6, 1, 6, 0, 6, 4, 9, 9, 6, 6, 0, 2, 2, 6, 6, 3, 8, 8, 1, 0, 9, 3, 9, 8, 5, 6, 4, 8, 4, 3, 5, 0, 7, 2, 2, 3, 8, 3, 2, 5, 9, 2, 7, 1, 0, 5, 6, 0, 4};
clock_t begin, end;
double time_spent;
begin = clock();
/* here, do your time-consuming job */
#pragma omp parallel for private(temp)
for(j=0;j<1000;j++){
temp = arr[j];
for(i=0;i<temp;temp--)
result[j]=result[j]*temp;
}
end = clock();
time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
printf("\n\n%f",time_spent);
But every time I run the code I get a different output. I want to see how the performance of the code differs for openmp and serial code. What method I should use to achieve the same?
The time the code takes to run will change a little bit due to computer/server usage; however, if you run both the parallel and serial versions you should see a difference in the amount of run time between the two. Also, the size of your parallel operation is pretty small. But you should see and improvement.
int arr[1000] = {1, 6, 1, 3, 1, 9, 7, 3, 2, 0, 5, 0, 8, 9, 8, 4, 4, 4, 0, 9, 6, 5, 9, 5, 9, 2, 5, 8, 6, 1, 0, 7, 7, 3, 2, 8, 3, 2, 3, 7, 2, 0, 7, 2, 9, 5, 8, 6, 2, 8, 5, 8, 5, 6, 3, 5, 8, 1, 3, 7, 2, 6, 6, 2, 1, 9, 0, 6, 1, 6, 3, 5, 6, 3, 0, 8, 0, 8, 4, 2, 7, 1, 0, 2, 7, 6, 9, 7, 7, 5, 4, 9, 3, 1, 1, 4, 2, 4, 1, 5, 2, 6, 0, 8, 9, 2, 6, 0, 1, 0, 2, 0, 3, 3, 4, 0, 1, 4, 8, 8, 1, 4, 9, 4, 7, 3, 8, 9, 9, 1, 4, 1, 8, 7, 9, 9, 9, 8, 9, 0, 0, 4, 2, 4, 9, 7, 6, 0, 3, 4, 8, 6, 1, 9, 0, 8, 2, 0, 8, 1, 2, 4, 2, 2, 1, 4, 1, 1, 4, 3, 3, 4, 9, 8, 0, 8, 7, 7, 8, 0, 3, 8, 8, 4, 7, 8, 5, 2, 0, 3, 3, 4, 9, 8, 6, 1, 4, 0, 4, 8, 5, 9, 4, 4, 7, 5, 2, 4, 2, 2, 6, 5, 2, 4, 2, 1, 4, 7, 3, 5, 2, 7, 9, 1, 7, 8, 4, 3, 0, 8, 1, 5, 8, 7, 1, 7, 2, 5, 2, 6, 9, 8, 2, 1, 5, 4, 2, 9, 1, 6, 6, 5, 5, 8, 6, 4, 6, 1, 7, 8, 1, 0, 3, 9, 7, 6, 7, 2, 1, 1, 8, 2, 9, 2, 3, 6, 8, 7, 8, 9, 5, 4, 4, 2, 2, 3, 6, 8, 4, 5, 6, 5, 7, 1, 7, 7, 9, 6, 9, 2, 7, 9, 4, 8, 2, 7, 5, 0, 7, 3, 2, 2, 9, 8, 7, 2, 3, 5, 2, 9, 1, 1, 5, 8, 4, 4, 5, 4, 0, 6, 6, 9, 8, 1, 7, 0, 0, 4, 2, 7, 9, 6, 2, 9, 7, 9, 1, 0, 4, 3, 0, 7, 6, 7, 8, 1, 1, 5, 5, 3, 4, 3, 2, 2, 4, 1, 2, 7, 6, 6, 4, 5, 3, 8, 4, 2, 9, 7, 2, 6, 3, 4, 3, 9, 1, 1, 0, 4, 9, 5, 7, 3, 9, 1, 5, 5, 5, 9, 2, 3, 5, 9, 8, 0, 9, 5, 2, 9, 4, 7, 5, 7, 1, 0, 7, 5, 4, 7, 9, 3, 5, 9, 8, 6, 2, 3, 1, 7, 2, 6, 0, 9, 7, 1, 2, 6, 8, 4, 5, 2, 3, 2, 2, 7, 3, 9, 2, 9, 6, 3, 2, 3, 2, 2, 9, 7, 5, 3, 4, 9, 9, 7, 8, 6, 0, 0, 4, 0, 7, 2, 4, 0, 4, 6, 9, 9, 5, 1, 0, 4, 5, 4, 7, 9, 6, 9, 6, 1, 2, 3, 0, 3, 2, 1, 1, 4, 1, 5, 4, 0, 7, 8, 3, 4, 5, 2, 5, 2, 6, 6, 6, 1, 0, 6, 2, 9, 5, 1, 0, 9, 6, 3, 4, 8, 4, 5, 2, 7, 2, 8, 8, 2, 6, 1, 6, 3, 5, 3, 6, 1, 1, 4, 4, 2, 0, 7, 1, 7, 0, 3, 8, 6, 6, 2, 6, 2, 7, 0, 0, 2, 8, 0, 4, 6, 3, 2, 0, 8, 5, 8, 2, 7, 2, 6, 1, 5, 5, 4, 4, 5, 9, 3, 3, 8, 7, 9, 0, 7, 1, 2, 9, 1, 2, 3, 8, 7, 5, 0, 8, 0, 8, 0, 9, 2, 6, 0, 7, 2, 6, 4, 9, 6, 7, 3, 4, 6, 4, 6, 3, 6, 9, 2, 7, 3, 5, 7, 1, 2, 7, 9, 5, 7, 1, 4, 0, 7, 7, 9, 1, 3, 3, 1, 1, 2, 4, 5, 9, 0, 4, 4, 6, 3, 7, 6, 8, 4, 3, 1, 7, 1, 2, 2, 8, 3, 6, 0, 1, 5, 0, 2, 1, 5, 5, 2, 0, 9, 0, 1, 0, 4, 5, 8, 7, 2, 4, 7, 7, 0, 9, 6, 1, 1, 8, 1, 5, 6, 4, 8, 2, 4, 0, 3, 1, 6, 5, 1, 7, 7, 4, 9, 1, 0, 0, 0, 4, 6, 8, 3, 6, 7, 9, 9, 0, 9, 3, 5, 6, 7, 3, 8, 3, 6, 3, 4, 4, 0, 8, 1, 8, 2, 3, 1, 4, 3, 2, 9, 1, 0, 4, 8, 9, 4, 9, 9, 3, 2, 7, 1, 9, 0, 1, 4, 8, 4, 9, 2, 7, 9, 6, 5, 1, 1, 6, 8, 4, 0, 9, 7, 2, 3, 5, 1, 9, 7, 3, 5, 9, 0, 6, 1, 2, 8, 5, 1, 4, 6, 5, 1, 5, 3, 8, 9, 4, 7, 7, 0, 9, 6, 8, 2, 9, 3, 5, 9, 2, 8, 4, 2, 0, 2, 5, 3, 2, 2, 6, 7, 9, 3, 0, 6, 7, 1, 5, 1, 0, 2, 2, 9, 0, 2, 1, 2, 7, 7, 3, 0, 7, 9, 4, 8, 1, 9, 3, 4, 1, 1, 3, 2, 6, 3, 9, 3, 6, 6, 7, 6, 1, 1, 6, 1, 3, 9, 3, 2, 6, 8, 2, 6, 7, 6, 4, 1, 5, 9, 5, 9, 2, 0, 3, 8, 5, 2, 4, 2, 9, 3, 8, 0, 6, 6, 3, 1, 6, 9, 3, 2, 7, 6, 0, 7, 2, 6, 8, 0, 5, 5, 9, 9, 5, 4, 8, 0, 7, 4, 2, 8, 9, 3, 0, 5, 9, 3, 6, 5, 4, 9, 0, 2, 7, 2, 9, 0, 9, 9, 2, 6, 4, 3, 6, 9, 7, 6, 1, 6, 0, 6, 4, 9, 9, 6, 6, 0, 2, 2, 6, 6, 3, 8, 8, 1, 0, 9, 3, 9, 8, 5, 6, 4, 8, 4, 3, 5, 0, 7, 2, 2, 3, 8, 3, 2, 5, 9, 2, 7, 1, 0, 5, 6, 0, 4};
clock_t begin, end;
double time_spent_omp;
double time_spent;
begin = omp_get_wtime();
/* here, do your time-consuming job */
#pragma omp parallel for private(temp)
for(j=0;j<1000;j++){
temp = arr[j];
for(i=0;i<temp;temp--)
result[j]=result[j]*temp;
}
end = omp_get_wtime();
time_spent_omp = (double)(end - begin) / CLOCKS_PER_SEC;
begin = omp_get_wtime();
/* here, do your time-consuming job */
for(j=0;j<1000;j++){
temp = arr[j];
for(i=0;i<temp;temp--)
result[j]=result[j]*temp;
}
end = omp_get_wtime();
time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
printf("\n\n Time to process: %f --- Time to process with OPENMP %f",time_spent, time_spent_omp);
This should give you a better idea about how it is working.

ElasticSearch: minimum_should_match and length of terms list

Using ElasticSearch I'm trying to use the minimum_should_match option on a Terms Query to find documents that have a list of longs that is X% similar to the list of longs I'm querying with.
e.g:
{
"filter": {
"fquery": {
"query": {
"terms": {
"mynum": [1, 2, 3, 4, 5, 6, 7, 8, 9, 13],
"minimum_should_match": "90%",
"disable_coord": False
}
}
}
}
}
will match two documents with a mynum list of:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
and:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12]
This works and is correct since the first document has a 10 at the end while the query contained a 13 and the second document contained an 11 where again the query contained a 13.
Which means that 1 ou of 10 numbers in my query's list is different in the returned document and amounts to the allowed 90% similarity (minimum_should_match) value in the query.
Now the issue that I have is that I would like the behaviour to be different in the sense that since the second document is longer and has 11 numbers in place of 10, the difference level should ideally have been higher since it has actually two values 11 and 12 that are not in the query's list. e.g:
Instead of computing the intersection of:
(list1) [1, 2, 3, 4, 5, 6, 7, 8, 9, 13]
with:
(list2) [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12]
which is a 10% difference
it should say that since list2 is longer than list1, the intersection should be:
(list2) [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12]
with:
(list1) [1, 2, 3, 4, 5, 6, 7, 8, 9, 13]
which is a 12% difference
Is this possible ?
If not, how could I weight in the length of the list besides using a dense vector rather than a sparse one ? e.g:
using
[1, 2, 3, 4, 5, 6, 7, 8, 9, , , , 13]
rather than:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 13]

recursive removal of elements in array

Given an array of n elements, remove any adjacent pair of elements which are equal. Repeat this operation until there are no more adjacent pairs to remove; that will be the final array.
For e.g 1 2 2 3 4 should return the array 1 3 4.
please note array need not to be sorted.
check this test case also: 1,2,2,3,4,4,3,5 o/p should be 1,5.
(2,2) and (4,4) gets removed, then (3,3) which became adjacent after the removal of (4,4)
Any time you remove a pair of elements, you also need to see if you generated another pair that you want to remove.
The algorithm should follow naturally from that observation.
In Python:
>>> l=[1,2,2,3,4,4,3,5]
>>> [x for x in l if not l.count(x) > 1]
[1, 5]
This removes all integers that occur more than once in the list. This is a correct result for your example but I think that you are really trying to state something different. I think you are saying:
list:=(an unsorted list of integers)
while adjacent_pairs(list) is True:
remove_adjacent_pairs(list)
Once again, in Python:
#!/usr/bin/env python
def dedupe_adjacent(l):
for i in xrange(len(l) - 1, 0, -1):
if l[i] == l[i-1]:
del l[i-1:i+1]
return True
return False
def process_list(l):
print "input list: ",l
i=1
while(dedupe_adjacent(l)):
print " loop ",i,":",l
i+=1
print "processed list=",l
print
process_list([1,2,2,3,4,4,3,5])
process_list([1,2,2,3,4,4,6,3,5])
Output:
input list: [1, 2, 2, 3, 4, 4, 3, 5]
loop 1 : [1, 2, 2, 3, 3, 5]
loop 2 : [1, 2, 2, 5]
loop 3 : [1, 5]
processed list= [1, 5]
input list: [1, 2, 2, 3, 4, 4, 6, 3, 5]
loop 1 : [1, 2, 2, 3, 6, 3, 5]
loop 2 : [1, 3, 6, 3, 5]
processed list= [1, 3, 6, 3, 5]
The following:
function compress(arr) {
var prev, res = [];
for (var i in arr) {
if (i == 0 || (arr[i] != arr[i - 1]))
res.push(arr[i]);
}
return res;
}
compress([1, 2, 2, 3, 3, 3, 3, 4, 3, 3, 5, 6, 7, 8, 8]);
Returns:
[1, 2, 3, 4, 3, 5, 6, 7, 8]
Also (JavaScript 1.6 solution):
[1, 2, 2, 3, 3, 3, 3, 4, 3, 3, 5, 6, 7, 8, 8].filter(function(el, i, arr) {
return i == 0 || (el != arr[i - 1]);
})
Edit: Removing any item that appears in the array more than once requires a different solution:
function dedup(arr) {
var res = [], seen = {};
for (var i in arr)
seen[arr[i]] = seen[arr[i]] ? ++seen[arr[i]] : 1;
for (var j in arr) {
if (seen[arr[j]] == 1)
res.push(arr[j]);
}
return res;
}
The following:
dedup([1, 2, 2, 3, 4, 4, 3, 5]);
Produces:
[1, 5]
I have a solution to this in Java. You need to use replaceAll method in String class in Java. You can use regular expession to remove such adjacent redundant characters:
public class MyString {
public static void main(String[] args) {
String str = "12234435";
while(!str.replaceAll("(\\w)\\1+", "").equalsIgnoreCase(str))
str = str.replaceAll("(\\w)\\1+", "");
System.out.println(str);
}
}
You can find how to give a regular expression here
I would:
Sort the array.
From the start of the array, until you are at the last element of the array do:
`count` = count the number of array[i] elements.
remove the first `count` elements of the array if `count` > 1.
The following Python 3 code will remove duplicates from a list (array). It does this by scanning the array from start towards end and compares the target element with the element one larger. If they are the same they are removed. If the element pointer is not pointing at 0, then it is reduced by 1 in order to catch nested pairs. If the two compared elements are different then the pointer is incremented.
I'm sure there's a more pythonic way to remove two adjacent elements from a list, but I'm new to Python and haven't figured that out yet. Also, you'll want to get rid of the print(indx, SampleArray) statement--I left it in there to let you follow the progress in the output listing below.
# Algorithm to remove duplicates in a semi-sorted list
def CompressArray(SampleArray):
indx=0
while(indx < len(SampleArray)-1):
print(indx, SampleArray)
if(SampleArray[indx]==SampleArray[indx+1]):
del(SampleArray[indx])
del(SampleArray[indx])
if(indx>0):
indx-=1
else:
indx+=1
return SampleArray
Here are sample runs for:
[1, 2, 2, 3, 4]
[1, 2, 2, 3, 4, 4, 3, 5]
[1, 2, 2, 3, 3, 3, 3, 4, 3, 3, 5, 6, 7, 8, 8]
[1, 2, 2, 3, 4, 6, 7, 7, 6, 4, 3, 8, 8, 5, 9, 10, 10, 9, 11]
[1, 1, 2, 3, 3, 2, 4, 5, 6, 6, 5, 7, 8, 8, 7, 4, 9]
================================
0 [1, 2, 2, 3, 4]
1 [1, 2, 2, 3, 4]
0 [1, 3, 4]
1 [1, 3, 4]
[1, 3, 4]
================================
0 [1, 2, 2, 3, 4, 4, 3, 5]
1 [1, 2, 2, 3, 4, 4, 3, 5]
0 [1, 3, 4, 4, 3, 5]
1 [1, 3, 4, 4, 3, 5]
2 [1, 3, 4, 4, 3, 5]
1 [1, 3, 3, 5]
0 [1, 5]
[1, 5]
================================
0 [1, 2, 2, 3, 3, 3, 3, 4, 3, 3, 5, 6, 7, 8, 8]
1 [1, 2, 2, 3, 3, 3, 3, 4, 3, 3, 5, 6, 7, 8, 8]
0 [1, 3, 3, 3, 3, 4, 3, 3, 5, 6, 7, 8, 8]
1 [1, 3, 3, 3, 3, 4, 3, 3, 5, 6, 7, 8, 8]
0 [1, 3, 3, 4, 3, 3, 5, 6, 7, 8, 8]
1 [1, 3, 3, 4, 3, 3, 5, 6, 7, 8, 8]
0 [1, 4, 3, 3, 5, 6, 7, 8, 8]
1 [1, 4, 3, 3, 5, 6, 7, 8, 8]
2 [1, 4, 3, 3, 5, 6, 7, 8, 8]
1 [1, 4, 5, 6, 7, 8, 8]
2 [1, 4, 5, 6, 7, 8, 8]
3 [1, 4, 5, 6, 7, 8, 8]
4 [1, 4, 5, 6, 7, 8, 8]
5 [1, 4, 5, 6, 7, 8, 8]
[1, 4, 5, 6, 7]
================================
0 [1, 2, 2, 3, 4, 6, 7, 7, 6, 4, 3, 8, 8, 5, 9, 10, 10, 9, 11]
1 [1, 2, 2, 3, 4, 6, 7, 7, 6, 4, 3, 8, 8, 5, 9, 10, 10, 9, 11]
0 [1, 3, 4, 6, 7, 7, 6, 4, 3, 8, 8, 5, 9, 10, 10, 9, 11]
1 [1, 3, 4, 6, 7, 7, 6, 4, 3, 8, 8, 5, 9, 10, 10, 9, 11]
2 [1, 3, 4, 6, 7, 7, 6, 4, 3, 8, 8, 5, 9, 10, 10, 9, 11]
3 [1, 3, 4, 6, 7, 7, 6, 4, 3, 8, 8, 5, 9, 10, 10, 9, 11]
4 [1, 3, 4, 6, 7, 7, 6, 4, 3, 8, 8, 5, 9, 10, 10, 9, 11]
3 [1, 3, 4, 6, 6, 4, 3, 8, 8, 5, 9, 10, 10, 9, 11]
2 [1, 3, 4, 4, 3, 8, 8, 5, 9, 10, 10, 9, 11]
1 [1, 3, 3, 8, 8, 5, 9, 10, 10, 9, 11]
0 [1, 8, 8, 5, 9, 10, 10, 9, 11]
1 [1, 8, 8, 5, 9, 10, 10, 9, 11]
0 [1, 5, 9, 10, 10, 9, 11]
1 [1, 5, 9, 10, 10, 9, 11]
2 [1, 5, 9, 10, 10, 9, 11]
3 [1, 5, 9, 10, 10, 9, 11]
2 [1, 5, 9, 9, 11]
1 [1, 5, 11]
[1, 5, 11]
================================
0 [1, 1, 2, 3, 3, 2, 4, 5, 6, 6, 5, 7, 8, 8, 7, 4, 9]
0 [2, 3, 3, 2, 4, 5, 6, 6, 5, 7, 8, 8, 7, 4, 9]
1 [2, 3, 3, 2, 4, 5, 6, 6, 5, 7, 8, 8, 7, 4, 9]
0 [2, 2, 4, 5, 6, 6, 5, 7, 8, 8, 7, 4, 9]
0 [4, 5, 6, 6, 5, 7, 8, 8, 7, 4, 9]
1 [4, 5, 6, 6, 5, 7, 8, 8, 7, 4, 9]
2 [4, 5, 6, 6, 5, 7, 8, 8, 7, 4, 9]
1 [4, 5, 5, 7, 8, 8, 7, 4, 9]
0 [4, 7, 8, 8, 7, 4, 9]
1 [4, 7, 8, 8, 7, 4, 9]
2 [4, 7, 8, 8, 7, 4, 9]
1 [4, 7, 7, 4, 9]
0 [4, 4, 9]
[9]
================================
I love Java, but functional solutions should get more time on this site.
In Haskell, doing things the way the question asks:
compress lst = if (length lst == length b) then lst else (compress b) where
b = helper lst
helper [] = []
helper [x] = [x]
helper (x:y:xs) = if (x == y) then (helper xs) else (x:helper (y:xs))
You can solve this problem in O(n) time, although it is a bit more complicated
compress' lst = reverse (helper [] lst) where
helper xs [] = xs
helper [] (x:xs) = helper [x] xs
helper (a:as) (x:xs)
| a == x = helper as xs
| otherwise = helper (x:a:as) xs
I think we could use a stack to check adjacent duplicated elements.
Scan the array. For each new element, if it is equal to the top element in the stack, drop it and pop the top element from the stack. Otherwise, push it into the stack.
Here is the stack based algorithm based upon the edited question.
// pseudo code, not tested
void RemoveDupp(vector<int> & vin, vector<int> & vout)
{
int i = 0, int j = -1;
vout.resize(vin.size());
while (i < vin.size())
{
if (j == -1 || vout[j] != vin[i])
vout[++j] = vin[i++]; //push
else
j--, i++; //pop
}
vout.resize(j + 1);
}

Resources