Replace multiple If statements with For statement - for-loop

My script to find where a value in Array $c[] is in Array $a[] works, but nested If statements are too long. Can this be shortened using a For statement using a variable for array $c[] with the following rules?
#include <Array.au3>
#include <MsgBoxConstants.au3>
Local $a[]=[33,5,3,4,4,'a4',2,22,66,234,'a4',234,31,34,55,'a4',22,44,55,66]
Local $c[]=['a4',22,44,55,66]
For $b=0 To 19
If $c[0] == $a[$b] Then
If $c[1] == $a[$b+1] Then
If $c[2] == $a[$b+2] Then
$k=$b
EndIf
EndIf
EndIf
Next
#include <Array.au3>
#include <MsgBoxConstants.au3>
Local $a[]=[33,5,3,4,4,'a4',2,22,66,234,'a4',234,31,34,55,'a4',22,44,55,66]
Local $c[]=['a4',22,44,55,66]
Local $k[]
$e=0
For $b=0 To 19
If $c[$e] == $a[$b] Then
$k[$e]=$b
$e+=1
EndIf
Next
_ArrayDisplay($k,"dispay")
MsgBox($MB_SYSTEMMODAL, "", $k &"th value" )
I tried one nested If statement using the For statement, but it doesn't work.

Here is another approach:
#include <Array.au3>
Local $aA[]=[33,5,3,4,4,'a4',2,22,66,234,'a4',234,31,34,55,'a4',22,44,55,66]
Local $aC[]=['a4',22,44,55,66]
Local $aIndex=_ArrayFindAll($aA,$aC[0],0,ubound($aA)-ubound($aC)) ; positions where the first element of $aC is found. Reduces number of comparisons
for $i in $aIndex ; for each of those positions
$aSubA=_ArrayExtract($aA,$i,$i+UBound($aC)-1) ; extract a subarray of $aA with the same length as $aC
if _ArrayToString($aSubA) = _ArrayToString($aC) then ConsoleWrite("found C in A on position "&$i & @CRLF) ; arrays can't be compared directly with If, only strings
Next

"Can this be shortened using a For statement using a variable for array $c[] …"
Yes:
Global Const $g_bGetAll = False
Global Const $g_aArrayA = [33, 5, 3, 4, 4, 'a4', 2, 22, 66, 234, 'a4', 234, 31, 34, 55, 'a4', 22, 44, 55, 66], _
$g_aArrayC = ['a4', 22, 44, 55, 66]
For $i1 = 0 To (UBound($g_aArrayA) - 1) - (UBound($g_aArrayC) - 1)
For $i2 = 0 To UBound($g_aArrayC) - 1
If Not ($g_aArrayC[$i2] == $g_aArrayA[$i1 + $i2]) Then _
ContinueLoop 2
Next
ConsoleWrite(StringFormat('$g_aArrayC matches $g_aArrayA[%i] to $g_aArrayA[%i]\n', $i1, $i1 + UBound($g_aArrayC) - 1))
If Not $g_bGetAll Then _
ExitLoop
Next
Returns:
$g_aArrayC matches $g_aArrayA[15] to $g_aArrayA[19]
Returns all (multiple) matches if $g_bGetAll = True.
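For comparison, the same sliding-window idea can be sketched in Python. This is only an illustration of the technique, not code AutoIt can run; the function name find_subarray and the find_all flag are made up for this example (find_all plays the role of $g_bGetAll).

```python
def find_subarray(haystack, needle, find_all=False):
    """Return the start indices where needle occurs as a contiguous run in haystack."""
    matches = []
    # Only start positions that leave room for the whole needle are tried.
    for start in range(len(haystack) - len(needle) + 1):
        if haystack[start:start + len(needle)] == needle:
            matches.append(start)
            if not find_all:
                break  # stop at the first match, like $g_bGetAll = False
    return matches


a = [33, 5, 3, 4, 4, 'a4', 2, 22, 66, 234, 'a4', 234, 31, 34, 55, 'a4', 22, 44, 55, 66]
c = ['a4', 22, 44, 55, 66]
print(find_subarray(a, c))  # [15]
```

The slice comparison replaces the inner loop: it compares all elements of the window against the needle in one step.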

Related

Comparing csv files using field values

I have some csv files with the following format in the same folder:
Name - Value - Number - Key
I want to compare these files pairwise and give them a score in this way. If not all of the names in the Name column of the first file appear in the corresponding column of the second file, the score will be 0.
Instead, if they are, the score will be given as shown in this example:
File1.csv
NameA, ValueA, NumberA, KeyA
Jack, 10, 9, 12
Alex, 30, 2, 16
Mark, 15, 3, 18
File2.csv
NameB, ValueB, NumberB, KeyB
Jack, 13, 4, 11
Alex, 22, 5, 18
Bill, 67, 6, 20
Mark, 18, 8, 26
Score = abs(11 - 10)/9 + abs(18 - 30)/2 + abs(26 - 15)/3
So it will be given by the summation of the abs(KeyB - ValueA)/NumberA scores, where abs is the absolute value of the subtraction.
How can I do this?
First of all, based on your formula, abs(KeyB - ValueA)/NumberA, you should have
Score = abs(11 - 10)/9 + abs(18 - 30)/2 + abs(26 - 15)/3 = 9.7778
instead of
Score = abs(11 - 10)/9 + abs(18 - 30)/2 + abs(26 - 15)/18
the awk command can be
awk -F, 'function abs(x){return ((x < 0.0) ? -x : x)}
BEGIN {while (getline < "file1.csv" ) { f[$1] = $2 ; g[$1] = $3; h[$1] = $4 } }
{ if (g[$1] != 0 ) score+= abs($4 - f[$1])/g[$1] } END { print score } ' file2.csv
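The awk command sums the per-name terms but does not apply the rule that the score is 0 when a name from the first file is missing in the second. A rough Python sketch that includes that rule could look like this; the function name score is made up here, and the file contents are passed as strings only to keep the example self-contained.

```python
import csv
import io


def score(file1_text, file2_text):
    """Sum abs(KeyB - ValueA) / NumberA over the names of file 1; 0 if a name is missing in file 2."""
    def rows(text):
        reader = csv.reader(io.StringIO(text), skipinitialspace=True)
        next(reader)  # skip the header row
        return {row[0].strip(): [float(x) for x in row[1:4]] for row in reader if row}

    first, second = rows(file1_text), rows(file2_text)
    if not set(first) <= set(second):
        return 0.0  # at least one name of file 1 is absent from file 2
    total = 0.0
    for name, (value_a, number_a, _key_a) in first.items():
        key_b = second[name][2]
        if number_a != 0:  # same division guard as the awk version
            total += abs(key_b - value_a) / number_a
    return total


FILE1 = "NameA, ValueA, NumberA, KeyA\nJack, 10, 9, 12\nAlex, 30, 2, 16\nMark, 15, 3, 18\n"
FILE2 = "NameB, ValueB, NumberB, KeyB\nJack, 13, 4, 11\nAlex, 22, 5, 18\nBill, 67, 6, 20\nMark, 18, 8, 26\n"
print(round(score(FILE1, FILE2), 4))  # 9.7778
```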

Hand tracing a pseudo code

I have this pseudo code that I need to hand trace:
begin
count <- 1
while count < 11
t <- (count ^ 2) - 1
output t
count <- count + 1
endwhile
end
I am unsure what <- means and I don't really understand what to do with the t. I also keep getting 1,1,1, etc. every time I go through. Any help would be appreciated!
First off the operator <- means "gets", as in an assignment. So:
count <- count + 1
Means to set the variable count to the value count + 1.
Second, the program will output the first 10 values of x^2 - 1, so:
t <- count^2 - 1
will evaluate to:
0, 3, 8, 15, 24, 35, 48, 63, 80, 99
for the values of count
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
respectively.
here is the code for it in C++, hope it helps:
int count = 1; // count <- 1
int t;
while ( count < 11 ){ // while count < 11
t = count * count - 1; // t <- (count ^ 2) - 1
std::cout<<t<<std::endl; // output t
count ++; // count <- count + 1
} // endwhile
and as said in the previous answer:
count takes the values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
and t will take the values: 0, 3, 8, 15, 24, 35, 48, 63, 80, 99
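If it helps to check a hand trace mechanically, the loop can also be written as a small Python sketch that records each (count, t) pair. The function name trace is made up for this example, and `**` is Python's power operator standing in for the pseudocode's ^.

```python
def trace():
    """Yield (count, t) pairs in the order the pseudocode loop produces them."""
    count = 1                  # count <- 1
    while count < 11:          # while count < 11
        t = count ** 2 - 1     # t <- (count ^ 2) - 1
        yield count, t         # output t
        count += 1             # count <- count + 1


for count, t in trace():
    print(f"count={count:2d}  t={t}")
```

Each printed row corresponds to one pass through the loop, which is exactly what a hand-trace table records.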

detect sequence of repeating elements

I have an array of random numbers, let's say:
[10,11,12,14,15,16,17,18,19,20, 11, 12, 14,25,25,26,27,28,29]
I have to detect repeated sequences (which are in fact errors)
with length bigger than a specific number (2).
Is there any good algorithm for this?
what I have for now:
int minLenght = 3;
int[] data = {1,2,3};
for(int i = 0; i < data.length; i++){
for(int j = 0; j < data.length; j++){
if ( data[i] == data[j]){
int l = 0;
int ii = i;
int jj = j;
while(data[ii] == data[jj]){
ii++;
jj++;
l++;
}
if(l >= minLenght){
print('['+i+'-'+ii+'] same as ['+j+'-'+jj+']');
}
}
}
}
One way is to store sequences of length L (one bigger than your specific length) in a hash table.
If you ever find a sequence is already in the hash table, then you have found a repeat of length >= L.
e.g. Python code
A=[10,11,12,14,15,16,17,18,19,20,11,12,14,25,25,26,27,28,29]
S=set()
L=2+1
for i in range(len(A)-L+1):
    key=tuple(A[i:i+L])
    if key in S:
        print(i)
    else:
        S.add(key)
This prints out the locations of repeated sequences with length greater than 2.
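A small variation, sketched here under the same hash-table idea, keeps the index where each window was first seen, so both the original and the repeat can be reported. The name repeated_windows is made up for this example.

```python
def repeated_windows(data, min_len):
    """Return (first_index, repeat_index) pairs for windows of min_len items already seen."""
    first_seen = {}  # window contents -> index of its first occurrence
    repeats = []
    for i in range(len(data) - min_len + 1):
        key = tuple(data[i:i + min_len])
        if key in first_seen:
            repeats.append((first_seen[key], i))
        else:
            first_seen[key] = i
    return repeats


A = [10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 11, 12, 14, 25, 25, 26, 27, 28, 29]
print(repeated_windows(A, 3))  # [(1, 10)]
```

A repeat longer than min_len shows up as several consecutive pairs, one per overlapping window.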
You could use regexps, except the list format shown is irregular. I use Python below and 'regularise' the list format turning it into a string before applying a regular expression looking for duplicated sequences of numbers/non-numbers:
>>> import re
>>> numbers = [10,11,12,14,15,16,17,18,19,20, 11, 12, 14,25,25,26,27,28,29]
>>> sep = ', '
>>> txt = sep + sep.join(str(x) for x in numbers) + sep
>>> txt
', 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 11, 12, 14, 25, 25, 26, 27, 28, 29, '
>>> re.search(r'\D((?:\d+\D+){2,}).*\1', txt).groups()
('11, 12, 14, ',)
I normally try and minimize my use of regexps but this does detect the duplication.
I don't know if there is a special algorithm for this, but my proposal would be this:
loop1 over array[i]:
loop2 over array[j] starting with i+1:
dist=array[j]-array[i];
if dist==specific_number:
array_result.append(array[i] +""+""+array[j])
This would be my simple logic.

Partition a collection into "k" close-to-equal pieces (Scala, but language agnostic)

Defined before this block of code:
dataset can be a Vector or List
numberOfSlices is an Int denoting how many "times" to slice dataset
I want to split the dataset into numberOfSlices slices, distributed as evenly as possible. By "split" I guess I mean "partition" (intersection of all should be empty, union of all should be the original) to use the set theory term, though this is not necessarily a set, just an arbitrary collection.
e.g.
dataset = List(1, 2, 3, 4, 5, 6, 7)
numberOfSlices = 3
slices == ListBuffer(Vector(1, 2), Vector(3, 4), Vector(5, 6, 7))
Is there a better way to do it than what I have below? (which I'm not even sure is optimal...)
Or perhaps this is not an algorithmically feasible endeavor, in which case any known good heuristics?
val slices = new ListBuffer[Vector[Int]]
val stepSize = dataset.length / numberOfSlices
var currentStep = 0
var looper = 0
while (looper != numberOfSlices) {
if (looper != numberOfSlices - 1) {
slices += dataset.slice(currentStep, currentStep + stepSize)
currentStep += stepSize
} else {
slices += dataset.slice(currentStep, dataset.length)
}
looper += 1
}
If the behavior of xs.grouped(xs.size / n) doesn't work for you, it's pretty easy to define exactly what you want. The quotient is the size of the smaller pieces, and the remainder is the number of the bigger pieces:
def cut[A](xs: Seq[A], n: Int) = {
val (quot, rem) = (xs.size / n, xs.size % n)
val (smaller, bigger) = xs.splitAt(xs.size - rem * (quot + 1))
smaller.grouped(quot) ++ bigger.grouped(quot + 1)
}
The typical "optimal" partition calculates an exact fractional length after cutting and then rounds to find the actual number to take:
def cut[A](xs: Seq[A], n: Int):Vector[Seq[A]] = {
val m = xs.length
val targets = (0 to n).map{x => math.round((x.toDouble*m)/n).toInt}
def snip(xs: Seq[A], ns: Seq[Int], got: Vector[Seq[A]]): Vector[Seq[A]] = {
if (ns.length<2) got
else {
val (i,j) = (ns.head, ns.tail.head)
snip(xs.drop(j-i), ns.tail, got :+ xs.take(j-i))
}
}
snip(xs, targets, Vector.empty)
}
This way your longer and shorter blocks will be interspersed, which is often more desirable for evenness:
scala> cut(List(1,2,3,4,5,6,7,8,9,10),4)
res5: Vector[Seq[Int]] =
Vector(List(1, 2, 3), List(4, 5), List(6, 7, 8), List(9, 10))
You can even cut more times than you have elements:
scala> cut(List(1,2,3),5)
res6: Vector[Seq[Int]] =
Vector(List(1), List(), List(2), List(), List(3))
Here's a one-liner that does the job for me, using the familiar Scala trick of a recursive function that returns a Stream. Notice the use of (x+k/2)/k to round the chunk sizes, which intersperses the smaller and larger chunks in the final list; chunk sizes differ by at most one element. If you round up instead, with (x+k-1)/k, you move the smaller blocks to the end, and x/k moves them to the beginning.
def k_folds(k: Int, vv: Seq[Int]): Stream[Seq[Int]] =
if (k > 1)
vv.take((vv.size+k/2)/k) +: k_folds(k-1, vv.drop((vv.size+k/2)/k))
else
Stream(vv)
Demo:
scala> val indices = scala.util.Random.shuffle(1 to 39)
scala> for (ff <- k_folds(7, indices)) println(ff)
Vector(29, 8, 24, 14, 22, 2)
Vector(28, 36, 27, 7, 25, 4)
Vector(6, 26, 17, 13, 23)
Vector(3, 35, 34, 9, 37, 32)
Vector(33, 20, 31, 11, 16)
Vector(19, 30, 21, 39, 5, 15)
Vector(1, 38, 18, 10, 12)
scala> for (ff <- k_folds(7, indices)) println(ff.size)
6
6
5
6
5
6
5
scala> for (ff <- indices.grouped((indices.size+7-1)/7)) println(ff)
Vector(29, 8, 24, 14, 22, 2)
Vector(28, 36, 27, 7, 25, 4)
Vector(6, 26, 17, 13, 23, 3)
Vector(35, 34, 9, 37, 32, 33)
Vector(20, 31, 11, 16, 19, 30)
Vector(21, 39, 5, 15, 1, 38)
Vector(18, 10, 12)
scala> for (ff <- indices.grouped((indices.size+7-1)/7)) println(ff.size)
6
6
6
6
6
6
3
Notice how grouped does not try to even out the size of all the sub-lists.
Here is my take on the problem:
def partition[T](items: Seq[T], partitionsCount: Int): List[Seq[T]] = {
val minPartitionSize = items.size / partitionsCount
val extraItemsCount = items.size % partitionsCount
def loop(unpartitioned: Seq[T], acc: List[Seq[T]], extra: Int): List[Seq[T]] =
if (unpartitioned.nonEmpty) {
val (splitIndex, newExtra) = if (extra > 0) (minPartitionSize + 1, extra - 1) else (minPartitionSize, extra)
val (newPartition, remaining) = unpartitioned.splitAt(splitIndex)
loop(remaining, newPartition :: acc, newExtra)
} else acc
loop(items, List.empty, extraItemsCount).reverse
}
It's more verbose than some of the other solutions but hopefully more clear as well. reverse is only necessary if you want the order to be preserved.
As Kaito mentions grouped is exactly what you are looking for. But if you just want to know how to implement such a method, there are many ways ;-). You could for example do it like this:
def grouped[A](xs: List[A], size: Int) = {
def grouped[A](xs: List[A], size: Int, result: List[List[A]]): List[List[A]] = {
if(xs.isEmpty) {
result
} else {
val (slice, rest) = xs.splitAt(size)
grouped(rest, size, result :+ slice)
}
}
grouped(xs, size, Nil)
}
I'd approach it this way: given n elements and m partitions (n > m), either n mod m == 0, in which case each partition has n/m elements, or n mod m == y with y > 0, in which case each partition gets n/m elements (integer division) and you have to distribute the remaining y elements over some of the m partitions.
You'll have y slots with n/m+1 elements and (m-y) slots with n/m. How you distribute them is your choice.
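As a rough illustration of that arithmetic, here is a Python sketch that puts the y larger slots first. That placement is just one possible choice, and the names partition_sizes and partition are made up for this example.

```python
def partition_sizes(n, m):
    """Sizes of m partitions of n items: y = n mod m slots get one extra element."""
    quot, rem = divmod(n, m)
    return [quot + 1] * rem + [quot] * (m - rem)


def partition(items, m):
    """Cut items into m consecutive pieces using the sizes above."""
    out, start = [], 0
    for size in partition_sizes(len(items), m):
        out.append(items[start:start + size])
        start += size
    return out


print(partition([1, 2, 3, 4, 5, 6, 7], 3))  # [[1, 2, 3], [4, 5], [6, 7]]
```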

Algorithm to split an array into N groups based on item index (should be something simple)

I feel that it should be something very simple and obvious but just stuck on this for the last half an hour and can't move on.
All I need is to split an array of elements into N groups based on element index.
For example we have an array of 30 elements [e1,e2,...e30], that has to be divided into N=3 groups like this:
group1: [e1, ..., e10]
group2: [e11, ..., e20]
group3: [e21, ..., e30]
I came up with nasty mess like this for N=3 (pseudo language, I left multiplication on 0 and 1 just for clarification):
for(i=0;i<array_size;i++) {
    if(i>=0*(array_size/3) && i<1*(array_size/3)) {
        print "group1";
    } else if(i>=1*(array_size/3) && i<2*(array_size/3)) {
        print "group2";
    } else if(i>=2*(array_size/3) && i<3*(array_size/3)) {
        print "group3";
    }
}
But what would be the proper general solution?
Thanks.
What about something like this?
for(i=0;i<array_size;i++) {
print "group" + (Math.floor(i/(array_size/N)) + 1)
}
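The formula above relies on array_size/N being a floating-point division. A variant that stays in integer arithmetic, sketched here in Python with a made-up function name group_of, multiplies before dividing so it also behaves when the size is not a multiple of N.

```python
def group_of(i, size, n):
    """1-based group number for index i when size items are split into n groups."""
    return i * n // size + 1


print([group_of(i, 10, 3) for i in range(10)])  # [1, 1, 1, 1, 2, 2, 2, 3, 3, 3]
```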
Here's a little function which will do what you want - it presumes you know the number of groups you want to make:
function arrayToGroups(source, groups) {
//This is the array of groups to return:
var grouped = [];
//work out the size of the group
var groupSize = Math.ceil(source.length/groups);
//clone the source array so we can safely splice it (splicing modifies the array)
var queue = source.slice(0);
for (var r=0;r<groups;r++) {
//Grab the next groupful from the queue, and append it to the array of groups
grouped.push(queue.splice(0, groupSize));
}
return grouped;
}
And you use it like:
var herbs = ['basil', 'marjoram', 'aniseed', 'parsely', 'chives', 'sage', 'fennel', 'oregano', 'thyme', 'tarragon', 'rosemary'];
var herbGroups = arrayToGroups(herbs, 3);
which returns:
herbGroups[0] = ['basil', 'marjoram', 'aniseed', 'parsely']
herbGroups[1] = ['chives', 'sage', 'fennel', 'oregano']
herbGroups[2] = ['thyme', 'tarragon', 'rosemary']
It doesn't do any sanity checking to make sure you pass in an array and a number, but you could add that easily enough. You could probably also add it to JavaScript's Array prototype, which would give you a handy 'toGroups' method on arrays.
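One property of this ceiling-based approach is worth seeing on a second input: because every group takes ceil(length/groups) items, trailing groups can come out short or even empty. A Python sketch of the same logic (the name array_to_groups is only an illustrative port, not part of the answer) shows both cases.

```python
import math


def array_to_groups(source, groups):
    """Split source into `groups` consecutive chunks of ceil(len/groups) items each."""
    group_size = math.ceil(len(source) / groups)
    # Slicing past the end of a list simply yields a shorter (or empty) chunk.
    return [source[r * group_size:(r + 1) * group_size] for r in range(groups)]


print([len(g) for g in array_to_groups(list(range(11)), 3)])  # [4, 4, 3]
print([len(g) for g in array_to_groups(list(range(9)), 4)])   # [3, 3, 3, 0]
```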
Using a vector language makes this task simple, right tool and all that. Just thought I'd throw this out there to let folks check out an alternative methodology.
The explained version in K (an APL descendant):
split:{[values;n] / define function split with two parameters
enum:!n / ! does enumerate from 0 through n exclusive, : is assign
floor:_(#values)%n / 10 for this sample, % is divide, _ floor, # count
cut:floor*enum / 0 10 20 for this sample data, * multiplies atom * vector
:cut _ values / cut the values at the given cutpoints, yielding #cut lists
}
values:1+!30 / generate values 1 through 30
n:3 / how many groups to split into
groups:split[values;n] / set the groups
yields the expected output:
(1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30)
The short version in K :
split:{((_(#x)%y)*!y)_ x}
groups:split[1+!30;3]
yields the same output:
(1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30)
I modified Beejamin's function above and just wanted to share it.
function arrayToGroups($source, $pergroup) {
$grouped = array();
$groupCount = ceil(count($source)/$pergroup);
$queue = $source;
for ($r=0; $r<$groupCount; $r++) {
array_push($grouped, array_splice($queue, 0, $pergroup));
}
return $grouped;
}
This PHP version asks how many items to have per group instead of how many groups in total.
const int g = 3; // number of groups
const int n = (array_size + g - 1)/g; // elements per group, rounded up
for (i=0,j=1; i<array_size; ++i) {
    if (i >= j*n) // index i is past the end of group j, so move to the next group
        ++j;
    printf("Group %d\n", j);
}
int group[3][10];
int groupIndex = 0;
int itemIndex = 0;
for(i = 0; i < array_size; i++)
{
group[groupIndex][itemIndex] = big_array[i];
itemIndex++;
if (itemIndex == 10)
{
itemIndex = 0;
groupIndex++;
}
}
There's probably an infinite number of ways to do this.
I'd suggest: for each group, create a base pointer and count.
struct group {foo * ptr; size_t count };
group * pgroups = new group [ngroups];
size_t objects_per_group = array_size / ngroups;
for (unsigned u = 0; u < ngroups; ++u ) {
group & g = pgroups[u];
size_t index = u * objects_per_group;
g.ptr = & array [index];
g.count = min (objects_per_group, array_size - index); // last group may have less!
}
...
for (unsigned u = 0; u < ngroups; ++u) {
// group "g" is an array at pgroups[u].ptr, dimension pgroups[u].count
group & g = pgroups[u];
// enumerate the group:
for (unsigned v = 0; v < g.count; ++v) {
fprintf (stdout, "group %u, item %u, %s\n",
(unsigned) u, (unsigned) v, (const char *) g.ptr[v]->somestring);
} }
delete[] pgroups;
I think the problem is a little more complicated, and if you only look at groups as a one-dimensional problem, you are going to get a very odd view of what groups actually are.
Firstly, the problem is dimensional according to the number of group primes and group combinations you are dealing with. In mathematics, this is represented as n to the power of n, or n^n, which can be translated to n! (n factorial).
If I have 5 groups arrayed as (1, 2, 3, 4, 5) and I want to represent them as certain groups or combinations of groups according to a factorial expression, then the number of combinations gets bigger:
Group 1x1 = 1,2,3,4,5
Group 2x1 = 12, 23, 45, 13, 14, 15, 21, 24, 25, 31, 32, 34, 35, 41, 42, 43, 45, 51, 52, 53, 54
so the strategy creates a systematic branch (easy enough)
12, 13, 14, 15
21, 22, 23, 24
31, 32, 34, 35
41, 42, 43, 45
51, 52, 53, 55
Group 1 + 2x2x1 = (1, 23, 45), (2, 13, 45), (3, 12, 45), (4, 12, 35), (1, 24, 35), (1, 25, 35), (1, 32, 45), (1, 34, 25), (1, 35, 24), ... etc
As you can see, when you begin to add factorial sets, it becomes harder to create a mathematical reference that expresses the combinations. It gets worse once the base set is longer than 3 or 4 elements.
If I am understanding your question: you want to express in generic terms an algorithm that allows you to create grouping strategies programmatically?
This is a complicated set, and is best represented in calculus, as set theory. Otherwise all you're doing is handling a two-dimensional array:
the first array expresses the grouping strategy;
the second array expresses the grouping elements.
I don't think this is what you're being asked to do, because the term "group" in mathematics has a very specific meaning. You should not use the term group; rather, express it as a set (set1, set2) if that is what you are doing.
Set1 contains elements of set2, and therefore this is handled with the same mathematics used for sets and unions. Look up "Venn diagrams" and "union"; avoid using the term group unless you are representing the factor of a set.
http://en.wikipedia.org/wiki/Group_(mathematics)
I think what you are trying to express is the groups within a known set or table; this is the D2 example on wikipedia.org.
In which case you have to look at the problem like a Rubik's cube, and it gets complicated.
I'm working on the same problem in JavaScript; when I am done I might publish it ;). It's very complicated.
