For loop being ignored - image

I've written this script, but every time I click run, all I get is a blank image in the output. When I try to print lists or variables within the For loops, nothing prints. If I remove the last line, I get nothing.
It seems like my For loops aren't running. Can anyone see what I have done wrong?
people = 100;
c = 3; (* number of connections to add to graph per person added*)
d = 1; (* Number of people assigned randomly*)
ct =3; (* number of traits person must have in common *)
region = 11; (* Position of region percentage break down list *)
m={{0.48,0.52},{0.19,0.25,0.40,0.16},{0.36,30,34},
{0.26,0.39,0.35},{0.18,32,32,18},{0.05,0.95},
{0.13,0.87},{0.34,0.49,0.17},{0.27,0.23,0.01,0.24,0.03,0.07,0.15},
{0.7,0.12,0.04,0.03,0.11},
{0.19,0.23,0.37,0.21}}; (* List of list of % breakdowns of traits*)
randomchoice[matr_]:=Table[RandomChoice[matr[[i]]
-> Range[Length[matr[[i]]]]],{i,Length[matr]}];
PickTraits := RandomChoice[Range[Length[m]-1],ct];
listoflistoftraits = Table[randomchoice[m],people];
adjmat = {};
For[j,j<Range[Length[people]],1,
commontraits = Table[listoflistoftraits[[j]],
{i,PickTraits}];
ccount = 0;
possibleconnections = {};
possibleregionconnections = {};
emptyc = {};
For[k,k<people,1,
If[Count[listoflistoftraits[[k]],commontraits] = ct,
Append[possibleconnections,k]
,Unevaluated[Sequence[]]
];
If[listoflistoftraits[[k,Length[listoflistoftraits,region]]] =
listoflistoftraits[[j,region]],
Append[possibleregionconnections,k]
,Unevaluated[Sequence[]]
]
];
selected = RandomChoice[possibleconnections,c-d];
randomselect = RandomChoice[possibleregionconnections,d];
connectionindices = Join[selected,randomselect];
emptyc = Table[If[MemberQ[connectionindices,i],1,0],
{i,people}];
Append[adjmat,emptyc]
]
AdjacencyGraph[adjmat]

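@murksiuke is right. people is an Integer, so Range[Length[people]] == {}. As written, the test j < {} never evaluates to True, so the loop body never runs. In addition, j is not initialized to any value and is not incremented anywhere, so even with a valid test the For loop would never terminate. The same applies to the loop over k. You probably want something like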
For[j = 1, j <= people, j++, ...]
The second argument to Count is a pattern, commontraits is a list of Integer, so it will never match a list. Maybe you intended
Count[{listoflistoftraits[[k]]}, commontraits]
Length only takes one argument, two are being passed
Length[listoflistoftraits, region]
Not sure what you intended.

Related

get pairs / triple / quadruple... of elements from vector by function

I have a vector with a couple of elements and I want to write a function that returns me all combinations of x items from this vector.
The following code produces the right output for the case x=2 or x=3 or x=4.
However, I can not implement a solution for every possible x following this idea.
values = {'A','B','C','D','E'};
n = length(values);
data2 = {}; % case x=2
for i = 1:n
for j = i+1:n
data2{end+1} = {values{i}, values{j}};
fprintf('%s %s\n',values{i}, values{j})
end
end
data3 = {}; % case x=3
for i = 1:n
for j = i+1:n
for k = j+1:n
data3{end+1} = {values{i}, values{j}, values{k}};
fprintf('%s %s %s\n',values{i}, values{j}, values{k})
end
end
end
data4 = {}; % case x=4
for i = 1:n
for j = i+1:n
for k = j+1:n
for l = k+1:n
data4{end+1} = {values{i}, values{j}, values{k}, values{l}};
fprintf('%s %s %s %s\n',values{i}, values{j}, values{k}, values{l})
end
end
end
end
What would a function that returns my data variable look like?
data = getCombinations(values, x) %values is vector with elements, x is integer value
EDIT
The following code comes pretty close:
data = perms(values)
data = data(:,1:x)
data = unique(data,'rows')
but it still produces output like A,B and B,A
EDIT2
This fixed it somehow but it is not very nice to look at and it does not work for text entries in cells but only for numbers
data = perms(values)
data = data(:,1:x)
data = sort(data,2)
data = unique(data,'rows')
EDIT3
This did it but it is not very nice to look at... Maybe there is a better solution?
function [data] = getCombinations(values,x)
i = 1:length(values);
d = perms(i);
d = d(:,1:x);
d = sort(d,2);
d = unique(d,'rows');
data = values(d);
end
If you don't want repetitions (and your example suggests you don't) then try nchoosek as nchoosek(1:n, x) to give indices:
values = {'A','B','C','D','E'};
n = length(values);
x = 3;
C = nchoosek(1:n, x);
data = values(C)
In the above, each row is a unique combination of 3 of the 5 elements of values.
Alternatively pass in the values directly:
data = nchoosek(values, x);
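For comparison, here is the same idea outside MATLAB. This is a minimal Python sketch, assuming the standard-library itertools.combinations; the name get_combinations is made up here to mirror the getCombinations signature asked for in the question.

```python
from itertools import combinations

def get_combinations(values, x):
    """Return all x-element combinations of values, in input order, without repeats."""
    return [list(combo) for combo in combinations(values, x)]

values = ['A', 'B', 'C', 'D', 'E']
pairs = get_combinations(values, 2)
print(len(pairs))   # 10, i.e. "5 choose 2"
print(pairs[:3])    # [['A', 'B'], ['A', 'C'], ['A', 'D']]
```

As with nchoosek, each combination appears once, so A,B is listed but B,A is not.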

Deleting duplicate characters from array

Got asked this question in an interview and couldn't find a solution.
Given an array of characters delete all the characters that got repeated k or more times consecutively and add '#' in the end of the array for every deleted character.
Example:
"xavvvarrrt"->"xaat######"
O(1) memory and O(n) time without writing to the same cell twice.
The tricky part for me was that I am not allowed to overwrite a cell more than once, which means I need to know exactly where each character will move after deleting the duplicates.
The best I could come up with is iterating once on the array and saving in a map the occurrences of each character, and when iterating again and checking if the current character is not deleted then move it to the new position according to the offset, if it is deleted then update an offset variable.
The problem with this approach is that it won't work in this scenario:
"aabbaa" because 'a' appears at two different places.
So when I thought about saving an array of occurrences in the map but now it won't use O(1) memory.
Thanks
This seems to work with your examples, although it seems a little complicated to me :) I wonder if we could simplify it. The basic idea is to traverse from left to right, keeping a record of how many places in the current block of duplicates are still available to replace, while the right pointer looks for more blocks to shift over.
JavaScript code:
function f(str){
str = str.split('')
let r = 1
let l = 0
let to_fill = 0
let count = 1
let fill = function(){
while (count > 0 && (to_fill > 0 || l < r)){
str[l] = str[r - count]
l++
count--
to_fill--
}
}
for (; r<str.length; r++){
if (str[r] == str[r-1]){
count++
} else if (count < 3){
if (to_fill)
fill()
count = 1
if (!to_fill)
l = r
} else if (!to_fill){
to_fill = count
count = 1
} else {
count = 1
}
}
if (count < 3)
fill()
while (l < str.length)
str[l++] = '#'
return str.join('')
}
var str = "aayyyycbbbee"
console.log(str)
console.log(f(str)) // "aacee#######"
str = "xavvvarrrt"
console.log(str)
console.log(f(str)) // "xaat######"
str = "xxaavvvaarrrbbsssgggtt"
console.log(str)
console.log(f(str))
Here is a version similar to the other JS answer, but a bit simpler:
function repl(str) {
str = str.split("");
var count = 1, write = 0;
for (var read = 0; read < str.length; read++) {
if (str[read] == str[read+1])
count++;
else {
if (count < 3) {
for (var i = 0; i < count; i++)
str[write++] = str[read];
}
count = 1;
}
}
while (write < str.length)
str[write++] = '#';
return str.join("");
}
function demo(str) {
console.log(str + " ==> " + repl(str));
}
demo("a");
demo("aa");
demo("aaa");
demo("aaaaaaa");
demo("aayyyycbbbee");
demo("xavvvarrrt");
demo("xxxaaaaxxxaaa");
demo("xxaavvvaarrrbbsssgggtt");
/*
Output:
a ==> a
aa ==> aa
aaa ==> ###
aaaaaaa ==> #######
aayyyycbbbee ==> aacee#######
xavvvarrrt ==> xaat######
xxxaaaaxxxaaa ==> #############
xxaavvvaarrrbbsssgggtt ==> xxaaaabbtt############
*/
The idea is to keep one index for reading the next character and one for writing, as well as a count of consecutive repeated characters. If the following character is equal to the current one, we just increase the counter. Otherwise the run has ended: if its count is below 3 we copy its characters to the write index, advancing it accordingly, and in either case we reset the counter to 1.
At the end of reading, anything from the current write index up to the end of the array is the number of repeated characters we have skipped. We just fill that with hashes now.
As we only store 3 values, memory consumption is O(1); we read each array cell twice, so O(n) time (the extra reads on writing could be eliminated by another variable); and each write index is accessed exactly once.
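The read/write-index idea described above can be sketched in Python as well. This is only an illustration: the function name remove_runs and the parameter k (defaulting to 3, as in the examples) are assumptions, and the input is taken to be a mutable list of characters.

```python
def remove_runs(chars, k=3):
    """Drop every run of k or more equal characters in place, padding the tail with '#'."""
    n = len(chars)
    write = 0       # next cell to write a kept character into
    run_start = 0   # index where the current run of equal characters began
    for read in range(n):
        # A run ends at the last cell or where the next character differs.
        if read + 1 == n or chars[read + 1] != chars[read]:
            run_len = read - run_start + 1
            if run_len < k:
                # Short run: keep it. write never exceeds the source index,
                # so no unread character is overwritten.
                for i in range(run_start, read + 1):
                    chars[write] = chars[i]
                    write += 1
            run_start = read + 1
    # Everything after the kept characters is filled exactly once.
    for i in range(write, n):
        chars[i] = '#'
    return chars

print(''.join(remove_runs(list("xavvvarrrt"))))  # xaat######
```

Each cell is written exactly once: either during the copy phase or during the final fill.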

R - Using a While() loop inside a FOR() loop

I am rebuilding a VBA code inside R, that counts transitions from a rating to another based on different conditions:
It is as follows:
## attach the relevant data table
attach(cohort)
# define the matrices that will contain all the counting information
ni = matrix(0,nrow = 1, ncol = classes - 1)
nij = matrix(0, nrow = classes-1, ncol = classes+1)
for (k in 1:obs)
{
# define the year of the kth observation
t = apply(data.frame(date[k],ystart),1,max, na.rm = F)
#t = year(as.Date(t))
while (t < yend)
{
# if this observation and the second one belong to the same id and year, break and move to the next one
if (id[k] == id[k+1] & date[k] == date[k+1]) {break}
# if the rating of this observation is 0 (not rated) or in default, then leave it
if (rating[k] == classes | rating[k] == 0) {break}
# add to the group of customers with rating = rating_k, 1 observation
rating_k = rating[k]
ni[rating_k] = ni[rating_k]+1
# determine the rating from end of next year
if (id[k] != id[k+1] | date[k+1] > (t+1))
{newrat = rating_k}
else
{
kn = k +1
while (date[kn]==date[kn+1] & id[kn]==id[kn+1])
{
if (rating[kn]==classes) {break}
Kn = kn+1
}
newrat = rating[kn]
}
nij[rating_k, newrat] = (nij[rating_k, newrat] + 1)
if(newrat!=rating[k]) {break}
else
{t = (t+1)}
}
print (k)
}
At the end of my code, if the condition "if(newrat!=rating[k])" is met, I want my code to break and move to the next k. Otherwise, I have t = t + 1, and the code should go back to the condition of the while (t < yend) loop.
I added "print(k)" at the end to see at which step of the for loop the code stops, and it always stops at k = 9, after k = 1 to 8 have been printed. In total I have 4000 observations, but only 8 are processed; the loop never stops and R keeps running.

Best Practices with Initialization or Pre-allocation - MATLAB

My question doesn't depend expressly on one snippet of code, but is more conceptual.
Unlike some programming languages, MATLAB doesn't require variables to be initialized expressly before they're used. For example, this is perfectly valid to have halfway through a script file to define 'myVector':
myVector = vectorA .* vectorB
My question is: Is it faster to initialize variables (such as 'myVector' above) to zero and then assign values to them, or to keep initializing things throughout the program?
Here's a direct comparison of what I'm talking about:
Initializing throughout:
varA = 8;
varB = 2;
varC = varA - varB;
varD = varC * varB;
Initializing at start:
varA = 8;
varB = 2;
varC = 0;
varD = 0;
varC = varA - varB;
varD = varC * varB;
On one hand, it seems a bit of a waste to have these extra lines of code for no reason. On the other hand, though, it makes a little bit of sense that it would be faster to allocate all the memory for a program at once instead of spread out over the runtime.
Does anyone have a little insight?
Copy and paste your "Initializing at start" code into the MATLAB Editor window and you will get a warning.
If you go into its Details, you will read this:
Explanation
The code does not appear to use the assignment to the indicated variable. This situation occurs when any of the following are true:
Another assignment overwrites the value of the variable before an operation uses it.
The specified argument value contains a typographical error, causing it to appear unused.
The code does not use all values returned by a function call...
In our case, the reason for this warning is the first one: another assignment overwrites the value of the variable before any operation uses it. So, this clarifies that initialization/pre-allocation won't help for that case.
When should we pre-allocate?
From my experience, pre-allocation helps when you need to later on index into part of it.
Thus, if you need to index into a portion of varC to store the results, pre-allocation would help. Hence, this would make more sense -
varC = zeros(...)
varD = zeros(...)
varC(k,:) = varA - varB;
varD(k,:) = varC * varB;
Again, while indexing if you are going beyond the size of varC, MATLAB would spend time trying to allocate more memory space for it, so that would slow things a bit. So, pre-allocate output variables to the maximum size which you think would be used for storing results. But, if you don't know the size of results, you are in a catch there and have to append results into the output variable(s) and that would slow down things for sure.
Alright! I've done some tests, and here are the results.
This is the code I used for the "throughout" variable assignments:
tic;
a = 1;
b = 2;
c = 3;
d = 4;
e = a - b;
f = e + c;
g = f - a;
h = g * c;
i = h - g;
j = 9 * i;
k = [j i h];
l = any(k);
b2(numel(b2) + 1) = toc
Here's the code for the "At Start" variable assignments:
tic;
a = 1;
b = 2;
c = 3;
d = 4;
e = 0;
f = 0;
g = 0;
h = 0;
i = 0;
j = 0;
k = 0;
l = 0;
e = a - b;
f = e + c;
g = f - a;
h = g * c;
i = h - g;
j = 9 * i;
k = [j i h];
l = any(k);
b1(numel(b1) + 1) = toc
I saved the time in the vectors 'b1' and 'b2'. Each was run with only MATLAB and Chrome open, and was the only script file open inside MATLAB. Each was run 201 times. Because the first time a program is run it compiles, I disregarded the first time value for both (I'm not interested in compile time).
To find the average, I used
mean(b1(2:201))
and
mean(b2(2:201))
The results:
"Throughout": 1.634311562062418e-05 seconds (0.000016343)
"At Start": 2.832598989758290e-05 seconds (0.000028326)
Interestingly (or perhaps not, who knows) defining variables only when needed, spread throughout the program was almost twice as fast.
I don't know whether this is because of the way MATLAB allocates memory (maybe it just grabs a huge chunk and doesn't need to keep allocating more every time you define a variable?) or if the allocation speed is just so fast that it's eclipsed by the extra lines of code.
NOTE: As Divakar points out, mileage may vary when using arrays. My testing should hold true for when the size of variables doesn't change, however.
tl;dr Setting variables to zero only to change them later is slow

Number distribution

Problem: We have x checkboxes and we want to check y of them evenly.
Example 1: select 50 checkboxes of 100 total.
[-]
[x]
[-]
[x]
...
Example 2: select 33 checkboxes of 100 total.
[-]
[-]
[x]
[-]
[-]
[x]
...
Example 3: select 66 checkboxes of 100 total:
[-]
[x]
[x]
[-]
[x]
[x]
...
But we're having trouble coming up with a formula to check them in code, especially once you go to 11/111 or something similar. Does anyone have an idea?
Let's first assume y is divisible by x. Then we denote p = y/x and the solution is simple. Go through the list, every p elements, mark 1 of them.
Now, let's say r = y%x is nonzero. Still p = y/x, where / is integer division. So, you need to:
In the first p-r elements, mark 1 elements
In the last r elements, mark 2 elements
Note: This depends on how you define evenly distributed. You might want to spread the r sections with x+1 elements in between the p-r sections with x elements, which is again the same problem and could be solved recursively.
Alright so it wasn't actually correct. I think this would do though:
Regardless of divisibility:
if y > 2*x, then mark 1 element every p = y/x elements, x times.
if y < 2*x, then mark all, and do the previous step unmarking y-x out of y checkboxes (so like in the previous case, but x is replaced by y-x)
Note: This depends on how you define evenly distributed. You might want to change between p and p+1 elements for example to distribute them better.
Here's a straightforward solution using integer arithmetic:
void check(char boxes[], int total_count, int check_count)
{
int i;
for (i = 0; i < total_count; i++)
boxes[i] = '-';
for (i = 0; i < check_count; i++)
boxes[i * total_count / check_count] = 'x';
}
total_count is the total number of boxes, and check_count is the number of boxes to check.
First, it sets every box to unchecked. Then, it checks check_count boxes, scaling the counter to the number of boxes.
Caveat: this is left-biased rather than right-biased like in your examples. That is, it prints x--x-- rather than --x--x. You can turn it around by replacing
boxes[i * total_count / check_count] = 'x';
with:
boxes[total_count - (i * total_count / check_count) - 1] = 'x';
Correctness
Assuming 0 <= check_count <= total_count, and that boxes has space for at least total_count items, we can prove that:
No check marks will overlap. i * total_count / check_count increments by at least one on every iteration, because total_count >= check_count.
This will not overflow the buffer. The subscript i * total_count / check_count
Will be >= 0. i, total_count, and check_count will all be >= 0.
Will be < total_count. When n > 0 and d > 0:
(n * d - 1) / d < n
In other words, if we take n * d / d, and nudge the numerator down, the quotient will go down, too.
Therefore, (check_count - 1) * total_count / check_count will be less than total_count, with the assumptions made above. A division by zero won't happen because if check_count is 0, the loop in question will have zero iterations.
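The two claims above (no overlapping marks, no out-of-range index) can also be checked by brute force. The following Python sketch mirrors the C subscript with floor division; check_positions is a hypothetical helper name, and the size limit of 60 is arbitrary.

```python
def check_positions(total_count, check_count):
    """Indices that the integer-arithmetic loop marks with 'x'."""
    return [i * total_count // check_count for i in range(check_count)]

# Exhaustively test small sizes with 0 <= check_count <= total_count.
for total in range(1, 60):
    for checks in range(0, total + 1):
        pos = check_positions(total, checks)
        assert len(set(pos)) == checks            # no two marks overlap
        assert all(0 <= p < total for p in pos)   # every index is in range

print(check_positions(10, 3))  # [0, 3, 6]
```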
Say number of checkboxes is C and the number of Xes is N.
Your example states that having C=111 and N=11 is your most troublesome case.
Try this: divide C/N. Call it D. Have index in the array as double number I. Have another variable as counter, M.
double D = (double)C / (double)N;
double I = 0.0;
int M = N;
while (M > 0) {
if (checkboxes[Round(I)].Checked) { // if we selected it, skip to next
I += 1.0;
continue;
}
checkboxes[Round(I)].Checked = true;
M --;
I += D;
if (Round(I) >= C) { // wrap around the end
I -= C;
}
}
Please note that Round(x) should return nearest integer value for x.
This one could work for you.
I think the key is to keep count of how many boxes you expect to have per check.
Say you want 33 checks in 100 boxes. 100 / 33 = 3.030303..., so you expect to have one check every 3.030303... boxes. That means every 3.030303... boxes, you need to add a check. 66 checks in 100 boxes would mean one check every 1.51515... boxes, 11 checks in 111 boxes would mean one check every 10.090909... boxes, and so on.
double count = 0;
for (int i = 0; i < boxes; i++) {
  count += 1;
  if (count >= boxes/checks) {
    checkboxes[i] = true;
    count -= boxes/checks; // subtract one full interval, keeping the leftover fraction to track "partial boxes" so far
  }
}
You might rather use decimal as opposed to double for count, or there's a slight chance the last box will get skipped due to rounding errors.
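To sidestep the rounding concern entirely, the same bookkeeping can be done with exact rational numbers. This is a Python sketch under the assumption that 0 <= checks <= boxes; mark_evenly is a made-up name, and fractions.Fraction stands in for the decimal type suggested above.

```python
from fractions import Fraction

def mark_evenly(boxes, checks):
    """Return a list of booleans with `checks` entries set, spread over `boxes`."""
    marked = [False] * boxes
    if checks == 0:
        return marked
    interval = Fraction(boxes, checks)  # expected boxes per check, kept exact
    count = Fraction(0)
    for i in range(boxes):
        count += 1
        if count >= interval:
            marked[i] = True
            count -= interval  # carry the fractional remainder forward
    return marked

print(''.join('x' if m else '-' for m in mark_evenly(6, 4)))  # -xx-xx
```

Because the arithmetic is exact, the counter reaches the interval precisely on the last box, so the final check cannot be lost to rounding.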
A Bresenham-like algorithm is suitable for distributing checkboxes evenly. An output of 'x' corresponds to a change in the Y-coordinate. It is possible to choose the initial err as a random value in the range [0..places) to avoid biasing.
def Distribute(places, stars):
    err = places // 2
    res = ''
    for i in range(0, places):
        err = err - stars
        if err < 0:
            res = res + 'x'
            err = err + places
        else:
            res = res + '-'
    print(res)

Distribute(24,17)
Distribute(24,12)
Distribute(24,5)
output:
x-xxx-xx-xx-xxx-xx-xxx-x
-x-x-x-x-x-x-x-x-x-x-x-x
--x----x----x---x----x--
Quick html/javascript solution:
<html>
<body>
<div id='container'></div>
<script>
var cbCount = 111;
var cbCheckCount = 11;
var cbRatio = cbCount / cbCheckCount;
var buildCheckCount = 0;
var c = document.getElementById('container');
for (var i=1; i <= cbCount; i++) {
// make a checkbox
var cb = document.createElement('input');
cb.type = 'checkbox';
test = i / cbRatio - buildCheckCount;
if (test >= 1) {
// check the checkbox we just made
cb.checked = 'checked';
buildCheckCount++;
}
c.appendChild(cb);
c.appendChild(document.createElement('br'));
}
</script>
</body></html>
Adapt code from one question's answer or another answer from earlier this month. Set N = x = number of checkboxes and M = y = number to be checked and apply formula (N*i+N)/M - (N*i)/M for section sizes. (Also see Joey Adams' answer.)
In Python 2, the adapted code is:
N=100; M=33; p=0;
for i in range(M):
    k = (N+N*i)/M
    for j in range(p,k-1): print "-",
    print "x",
    p=k
which produces
- - x - - x - - x - - x - - [...] x - - x - - - x where [...] represents 25 --x repetitions.
With M=66 the code gives
x - x x - x x - x x - x x - [...] x x - x x - x - x where [...] represents mostly xx- repetitions, with one x- in the middle.
Note, in C or java: Substitute for (i=0; i<M; ++i) in place of for i in range(M):. Substitute for (j=p; j<k-1; ++j) in place of for j in range(p,k-1):.
Correctness: Note that M = x boxes get checked because print "x", is executed M times.
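For readers on Python 3, the section-size formula can be written with explicit floor division. This is a sketch rather than a port of the linked answers; section_sizes and layout are names invented for illustration, and it assumes 1 <= M <= N.

```python
def section_sizes(n, m):
    """Sizes of the m consecutive sections; each section ends with one checked box."""
    return [(n * i + n) // m - (n * i) // m for i in range(m)]

def layout(n, m):
    """Render the boxes as a string of '-' and 'x'."""
    return ''.join('-' * (size - 1) + 'x' for size in section_sizes(n, m))

print(section_sizes(10, 3))  # [3, 3, 4]
print(layout(10, 3))         # --x--x---x
```

The sizes always sum to N because the floor terms telescope, so exactly N boxes are produced and M of them are checked.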
What about using a Fisher–Yates shuffle?
Make an array, shuffle it, and pick the first n elements. You do not need to shuffle the whole array, just its first n positions. Shuffling can be found in most language libraries.
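A partial shuffle of this kind might look as follows in Python. Note that it selects boxes at random rather than at even spacing. The function name pick_random and the optional rng parameter are assumptions made for this sketch.

```python
import random

def pick_random(total, n, rng=random):
    """Choose n distinct indices out of range(total) with a partial Fisher-Yates shuffle."""
    items = list(range(total))
    for i in range(n):
        j = rng.randrange(i, total)  # swap partner from the not-yet-fixed tail
        items[i], items[j] = items[j], items[i]
    return items[:n]

picked = pick_random(100, 33, random.Random(0))
print(len(picked), len(set(picked)))  # 33 33
```

Only the first n positions are fixed, so the cost is proportional to n swaps plus building the index list.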
