I have a problem, or rather a strange question: I have an algorithm with a few "for" loops, and now I need to draw a block scheme of this algorithm.
I know how to draw a "while" loop, but is it acceptable to represent a "for" loop using a "while" loop, and so introduce a difference between the source code and the diagram? Of course, I'm assuming that all the "for" loops are in the right place, and that any other kind of loop would produce unnecessary code, which I avoided by using "for" loops.
I'm guessing that it is impossible (at least I can't imagine how) to simply draw a "for" loop, but maybe there is a way.
Thanks in advance.
Here's a flow chart that illustrates a for loop:
The equivalent C code would be
for(i = 2; i <= 6; i = i + 2) {
    printf("%d\t", i + 1);
}
I found this and several other examples on one of Tenouk's C Laboratory practice worksheets.
What's a "block scheme"?
If I were drawing it, I might draw a box with "for each x in y" written in it.
If you're drawing a flowchart, there's always a loop with a decision box.
Nassi-Schneiderman diagrams have a loop construct you could use.
The algorithm for the given flow chart:
Step 01: Start
Step 02: [Variable initialization]
Set counter: i <- K [where K is a positive number]
Step 03: [Condition check]
If the condition is true, do your task, set i = i + N [where N is a positive number], and go to Step 03.
If the condition is false, go to Step 04.
Step 04: Stop
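The steps above map directly onto a while loop, which is why a flowchart can draw a "for" loop as an initialization box, a decision box, the body, and an increment. Here is a minimal Python sketch of that mapping, assuming the values from the C example (start 2, limit 6, step 2); the function name and its parameters are made up for illustration:

```python
def run_counted_loop(start, limit, step):
    """Trace the flowchart: initialize, test, do the task, increment, repeat."""
    outputs = []
    i = start                    # Step 02: variable initialization
    while i <= limit:            # Step 03: condition check
        outputs.append(i + 1)    # the task (the C example prints i + 1)
        i = i + step             # increment, then go back to the check
    return outputs               # Step 04: stop


print(run_counted_loop(2, 6, 2))  # prints [3, 5, 7]
```

The while form and the for form do the same work; the diagram only makes the three parts of the for header explicit.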
I'm trying to calculate the Kolmogorov-Smirnov statistic in R. I have the following sample, which clearly comes from a random variable that follows a long-tailed distribution.
Download link
https://drive.google.com/file/d/1hIgqikX7p343zdyc-Goq34THUpsZA63n/view?usp=sharing
As you may know, the Kolmogorov-Smirnov statistic requires the calculation of the empirical cumulative distribution function and the presumed cumulative distribution function. For both calculations I take the following approach: first, I create a vector with the same length as the length of the sample, and then I modify each of the components of the vector so as for it to contain the empirical cdf (or presumed cdf) of the corresponding observation of the sample.
For the sake of illustration, I'll show you the code I wrote in order to calculate the empirical cdf.
I'm assuming that the data has been read and stored in a dataframe called data.
ecdf = vector("numeric", length(data$logueos))
for (i in 1:length(data$logueos)) {
  ecdf[i] = sum(data$logueos <= data$logueos[i]) / length(data$logueos)
}
The code I wrote for the calculation of the presumed cdf is analogous to the preceding one; the only difference is that I set each component of the pcdf vector equal to $P(X \le t)$, where t is the corresponding observation of the sample, according to the distribution that I'm assuming.
The problem is that this 'for' loop never ends. If I force it to end by clicking RStudio's stop button it works: it makes the vector store what I want it to store. But, if I press Ctrl+Shift+k in order to render my notebook and preview it, the load gets stuck when trying to execute the first chunk encountered that contains one of those loops.
First of all, your loop is not endless. It will finish, eventually.
You start by initializing a vector with as many elements as there are observations (1,245,888, which is a lot of iterations). This vector is initially full of zeros.
Your loop then replaces each zero with the value of sum (data$logueos <= data$logueos[i])/length(data$logueos). Each iteration compares one observation against the whole sample, so the total work grows with the square of the sample size. If you stop the execution, you will see that the first values of your vector are between 0 and 1, while the last values are still 0 (because the loop hasn't reached them yet).
So you will simply have to wait longer.
To make the execution faster, you could consider parallelizing the loop. Standard loops run sequentially, one iteration at a time; parallelization runs several iterations at once (for example four at a time, depending on your computer's capabilities). Here you'll find some information about it: https://nceas.github.io/oss-lessons/parallel-computing-in-r/parallel-computing-in-r.html
So, my proposal:
if(!require(foreach)){install.packages("foreach")}; require(foreach)
if(!require(doParallel)){install.packages("doParallel")}; require(doParallel)
registerDoParallel(detectCores() - 1)
# %dopar% runs the iterations on the registered workers;
# .combine = c collects the per-iteration results into one vector
ecdf = foreach (i=1:length(data$logueos), .combine = c) %dopar% {
  sum (data$logueos <= data$logueos[i])/length(data$logueos)
}
The first line installs (if necessary) and loads the foreach package, which you need for parallelization.
detectCores() - 1 uses all the cores that your computer has except one (to avoid freezing your machine) for computing this loop. It should be noticeably faster.
The registerDoParallel function tells foreach how many cores to use.
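Independently of parallelization, the quadratic cost itself can be avoided: once the sample is sorted, the number of observations less than or equal to a given value can be found with a binary search. The following Python sketch illustrates that idea. It is a different technique from the loop in the question, and `empirical_cdf` is a hypothetical name chosen for this example:

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return, for each observation, the fraction of the sample <= that observation."""
    ordered = sorted(sample)          # one O(n log n) sort
    n = len(ordered)
    # bisect_right counts how many sorted values are <= x, in O(log n) per lookup
    return [bisect_right(ordered, x) / n for x in sample]


print(empirical_cdf([3, 1, 2, 2]))  # prints [1.0, 0.25, 0.75, 0.75]
```

In R, the built-in ecdf function from the stats package computes the same quantity, so a hand-written loop may not be needed at all.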
I need to run a great many tests of the form v<0, where v is a (relatively short) vector. I am currently doing it with
all(v<0)
Is there a faster way?
Not sure which one will be faster (that may depend on the machine and Matlab version), but here are some alternatives to all(v<0). They agree with all(v<0) as long as v contains no NaN values:
~any(v>=0)
nnz(v>=0)==0 %// Or ~nnz(v>=0)
sum(v>=0)==0 %// Or ~sum(v>=0)
isempty(find(v>=0, 1)) %// Or isempty(find(v>=0))
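These alternatives rely on the fact that "every element is negative" means the same as "no element is non-negative". The complement of v<0 is therefore v>=0, not v>0, which would mishandle zeros. A small Python sketch of that equivalence, with hypothetical helper names:

```python
def all_negative(values):
    return all(x < 0 for x in values)


def none_non_negative(values):
    # Negated form: there is no element that is zero or positive.
    return not any(x >= 0 for x in values)


def count_non_negative_is_zero(values):
    # Counting form, analogous to comparing a count of matches with 0.
    return sum(1 for x in values if x >= 0) == 0


# The three forms agree, including on a vector containing zero and on an empty one.
for sample in ([-3, -1, -2], [-1, 0, -2], [-1, 4], []):
    assert all_negative(sample) == none_non_negative(sample) == count_non_negative_is_zero(sample)
```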
I think the issue is that the comparison is evaluated on all elements of the array first, and only then is the result tested. That is, for the test "any(v<0)", I believe Matlab does the following:
Step 1: compute v<0 for every element of v
Step 2: search through the results of step 1 for a true value
So even if the first element of v is less than zero, the conditional was first computed for all elements, hence wasting a lot of time. I think this is also true for any of the alternative solutions offered above.
I don't know of a faster way to do it easily, but wish I did. In some cases, breaking the array v up into smaller chunks and testing incrementally could speed things up, particularly if the condition is common. For example:
function result = anyLessThanZero(v)
    w = v(:);
    result = true;
    for i=1:numel(w)
        if ( w(i) < 0 )
            return;
        end
    end
    result = false;
end
but that can be very inefficient if the condition is rare. (If you were to really do this, there is probably a better way than I illustrate above to handle any condition, not just <0, but I show it this way to make it clear).
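For comparison, here is the same early-exit idea as a Python sketch. The counter is there only to show how many elements were examined before the answer was known; the function name is reused from the Matlab example and the return shape is an assumption made for illustration:

```python
def any_less_than_zero(values):
    """Return (found, examined), stopping at the first negative element."""
    examined = 0
    for x in values:
        examined += 1
        if x < 0:
            return True, examined   # early exit: the rest is never looked at
    return False, examined


print(any_less_than_zero([5, -1, 7, 9]))  # prints (True, 2)
print(any_less_than_zero([5, 1, 7, 9]))   # prints (False, 4)
```

In Python, the built-in any applied to a generator expression, as in any(x < 0 for x in values), stops at the first true value in the same way.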
I am having trouble constructing my own nested selection statements (ifs) and repetition statements (for loops, whiles and do-whiles). I can understand what most simple repetition and selection statements are doing, and although it takes me a bit longer to process nested statements, I can still get the general gist of the code (keeping count of the control variables and such). The real problem is constructing these statements: I just can't for the life of me write statements that properly align with the pseudo-code.
I'm quite new to programming in general, so I don't know if this is an experience thing or if I just genuinely lack a very logical mind. It is VERY demoralising when it takes me about an hour to complete one question in a book when I feel it should take a fraction of that time.
Can you people give me some pointers on how I can develop proper nested selection and repetition statements?
First of all, you need to understand this:
An if statement defines behavior for when **a decision is made**.
A loop (for, while, do-while) signifies **repetitive (iterative) work being done** (such as counting things).
Once you understand these, the next step, when confronted with a problem, is to try to break that problem up into small, manageable components: decisions, which give you the possible code paths, and the various pieces of work you need to do, much of which will end up being repetitive, hence the need for one or more loops.
For instance, say we have this simple problem:
Given a positive number N: if N is even, count (display numbers) from 0 (zero) to N in steps of 2; if N is odd, count from 0 to N in steps of 1.
First, let's break up this problem.
Step 1: Get the value of N. For this we don't need any decision, simply get it using the preferred method (from file, read console, etc.)
Step 2: Make a decision: is N odd or even?
Step 3: According to the decision made in Step 2, do work (count) - we will iterate from 0 to N, in steps of 1 or 2, depending on N's parity, and display the number at each step.
Now, we code:
#include <iostream>
using namespace std;

int main()
{
    //read N
    int N;
    cin >> N;
    //make decision, get the 'step' value
    int step = 0;
    if (N % 2 == 0) step = 2;
    else step = 1;
    //do the work
    for (int i = 0; i <= N; i = i + step)
    {
        cout << i << endl;
    }
    return 0;
}
These principles, in my opinion, apply to all programming problems, although of course, usually it is not so simple to discern between concepts.
Actually, the complicated phase is usually the problem break-up. That is where you actually think.
Coding is just translating your thinking so the computer can understand you.
I'm doing some simple logic with a for loop and an if statement, and I was wondering which of the following two arrangements is better, or whether there is a significant performance difference between the two.
Case 1:
if condition-is-true:
    for loop of length n:
        common code
        do this
else:
    another for loop of length n:
        common code
        do that
Case 2:
for loop of length n:
    common code
    if condition-is-true:
        do this
    else:
        do that
Basically, I have a for loop that needs to be executed slightly differently based on a condition, but there is certain stuff that needs to happen in the for loop no matter what. I would prefer the second one because I don't have to repeat the common code twice, but I'm wondering if case 1 would perform significantly better?
I know that in terms of big-O notation it doesn't really matter, because the if-else statement is a constant-time operation anyway, but I'm wondering whether the two cases make a realistic difference on a dataset that is not too big (maybe n = a few thousand).
Thank you!
The first one is better for performance, because the condition is checked only once, whereas in the second case it is checked on every iteration. The cost is that your code gets longer. If code size matters, put the common code into a method and call that method from both loops instead of repeating the block.
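The two arrangements can be sketched in Python as follows, with the common code factored into a function as suggested. The names `common`, `run_case1` and `run_case2`, and the arithmetic inside them, are placeholders invented for this example:

```python
def common(x):
    # Hypothetical stand-in for the code shared by both branches.
    return x * 2


def run_case1(data, flag):
    """Case 1: test the condition once, then run one of two loops."""
    out = []
    if flag:
        for x in data:
            out.append(common(x) + 1)   # "do this"
    else:
        for x in data:
            out.append(common(x) - 1)   # "do that"
    return out


def run_case2(data, flag):
    """Case 2: a single loop that tests the condition on every iteration."""
    out = []
    for x in data:
        y = common(x)
        out.append(y + 1 if flag else y - 1)
    return out


print(run_case1([1, 2, 3], True))   # prints [3, 5, 7]
print(run_case2([1, 2, 3], True))   # prints [3, 5, 7]
```

Both versions produce the same result. Whether the extra check per iteration is measurable for a few thousand elements is best settled by timing both on the real workload, for example with the standard timeit module.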
I'm fairly new to vectorization. I have tried to do this myself but couldn't. Can somebody help me vectorize this code, and give a short explanation of how you do it, so that I can learn the thought process too? Thanks.
function [result] = newHitTest (point,Polygon,r,tol,stepSize)
%This function calculates whether a point is allowed.
%First a quick test is done by calculating the distance from point to
%each point of the polygon. If that distance is smaller than range "r",
%the point is not allowed. This will slow down the algorithm at some
%points, but will greatly speed it up in others because fewer calls to the
%circleTest routine are needed.
polySize=size(Polygon,1);
testCounter=0;
for i=1:polySize
    d = sqrt(sum((Polygon(i,:)-point).^2));
    if d < tol*r
        testCounter=1;
        break
    end
end
if testCounter == 0
    circleTestResult = circleTest (point,Polygon,r,tol,stepSize);
    testCounter = circleTestResult;
end
result = testCounter;
Given the information that Polygon is 2 dimensional, point is a row vector and the other variables are scalars, here is the first version of your new function (scroll down to see that there are lots of ways to skin this cat):
function [result] = newHitTest (point,Polygon,r,tol,stepSize)
result = 0;
linDiff = Polygon-repmat(point,size(Polygon,1),1);
testLogicals = sqrt( sum( ( linDiff ).^2 ,2 )) < tol*r;
if any(testLogicals); result = circleTest (point,Polygon,r,tol,stepSize); end
The thought process for vectorization in Matlab involves trying to operate on as much data as possible using a single command. Most of the basic builtin Matlab functions operate very efficiently on multi-dimensional data. Using a for loop is the reverse of this, as you are breaking your data down into smaller segments for processing, each of which must be interpreted individually. By resorting to data decomposition using for loops, you potentially lose some of the massive performance benefits associated with the highly optimised code behind the Matlab builtin functions.
The first thing to think about in your example is the conditional break in your main loop. You cannot break from a vectorized process. Instead, calculate all possibilities, make an array of the outcome for each row of your data, then use the any function to see if any of your rows have signalled that the circleTest function should be called.
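The "calculate everything, then ask any" pattern can be sketched in plain Python as follows. This only illustrates the shape of the computation, not Matlab-style vectorized performance; `near_any_vertex` is a hypothetical name, and the polygon is assumed to be a list of (x, y) tuples:

```python
import math


def near_any_vertex(point, polygon, r, tol):
    """Return True if any polygon vertex lies closer than tol * r to point."""
    # Step 1: compute the distance for every row, with no early exit.
    distances = [math.dist(vertex, point) for vertex in polygon]
    # Step 2: turn the distances into one logical flag per row.
    flags = [d < tol * r for d in distances]
    # Step 3: reduce the flags to a single answer.
    return any(flags)


print(near_any_vertex((0, 0), [(3, 4), (10, 10)], 5, 1.5))  # prints True
print(near_any_vertex((0, 0), [(3, 4), (10, 10)], 5, 1.0))  # prints False
```

The first vertex is at distance 5 from the origin, so it passes the test for a threshold of 7.5 but not for a threshold of 5.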
NOTE: It is not easy to break out of a calculation conditionally and efficiently in Matlab. However, as you are just computing a form of Euclidean distance in the loop, you'll probably see a performance boost by using the vectorized version and calculating all possibilities. If the computation in your loop were more expensive, the input data were large, and you wanted to break out as soon as you hit a certain condition, then a Matlab extension written in a compiled language could potentially be much faster than a vectorized version where you might be performing needless calculation. However, this assumes that you know how to write code that matches the performance of the Matlab builtins in a language that compiles to native code.
Back on topic ...
The first thing to do is to take the linear difference (linDiff in the code example) between Polygon and your row vector point. To do this in a vectorized manner, the dimensions of the 2 variables must be identical. One way to achieve this is to use repmat to copy each row of point to make it the same size as Polygon. However, bsxfun is usually a superior alternative to repmat (as described in this recent SO question), making the code ...
function [result] = newHitTest (point,Polygon,r,tol,stepSize)
result = 0;
linDiff = bsxfun(@minus, Polygon, point);
testLogicals = sqrt( sum( ( linDiff ).^2 ,2 )) < tol*r;
if any(testLogicals); result = circleTest (point,Polygon,r,tol,stepSize); end
I rolled your d value into a column of d by summing across the 2nd axis (note the removal of the array index from Polygon and the addition of ,2 in the sum command). I then went further and evaluated the logical array testLogicals inline with the calculation of the distance measure. You will quickly see that a downside of heavy vectorisation is that it can make the code less readable to those not familiar with Matlab, but the performance gains are worth it. Comments are pretty necessary.
Now, if you want to go completely crazy, you could argue that the test function is so simple now that it warrants use of an 'anonymous function' or 'lambda' rather than a complete function definition. The test for whether or not it is worth doing the circleTest does not require the stepSize argument either, which is another reason for perhaps using an anonymous function. You can roll your test into an anonymous function and then just use circleTest in your calling script, making the code self-documenting to some extent . . .
doCircleTest = @(point,Polygon,r,tol) any(sqrt( sum( bsxfun(@minus, Polygon, point).^2, 2 )) < tol*r);
if doCircleTest(point,Polygon,r,tol)
result = circleTest (point,Polygon,r,tol,stepSize);
else
result = 0;
end
Now everything is vectorised, the use of function handles gives me another idea . . .
If you plan on performing this at multiple points in the code, the repetition of the if statements would get a bit ugly. To stay DRY (don't repeat yourself), it seems sensible to put the test with the conditional function into a single function, just as you did in your original post. However, the utility of that function would be very narrow: it would only test if the circleTest function should be executed, and then execute it if need be.
Now imagine that after a while, you have some other conditional functions, just like circleTest, with their own equivalent of doCircleTest. It would be nice to reuse the conditional switching code maybe. For this, make a function like your original that takes a default value, the boolean result of the computationally cheap test function, and the function handle of the expensive conditional function with its associated arguments ...
function result = conditionalFun( default, cheapFunResult, expensiveFun, varargin )
if cheapFunResult
result = expensiveFun(varargin{:});
else
result = default;
end
end %//of function
You could call this function from your main script with the following . . .
result = conditionalFun(0, doCircleTest(point,Polygon,r,tol), @circleTest, point,Polygon,r,tol,stepSize);
...and the beauty of it is you can use any test, default value, and expensive function. Perhaps a little overkill for this simple example, but it is where my mind wandered when I brought up the idea of using function handles.
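The same dispatch pattern can be sketched in Python, where a function is passed as an ordinary value and the remaining arguments are forwarded with *args. `slow_square` and the `calls` list are hypothetical stand-ins for an expensive function such as circleTest, included only to show when it actually runs:

```python
def conditional_fun(default, cheap_result, expensive_fun, *args):
    """Call expensive_fun(*args) only when the cheap test passed."""
    if cheap_result:
        return expensive_fun(*args)
    return default


calls = []  # records every invocation of the expensive function


def slow_square(x):
    # Hypothetical expensive function; here it just squares its input.
    calls.append(x)
    return x * x


print(conditional_fun(0, False, slow_square, 7))  # prints 0
print(conditional_fun(0, True, slow_square, 7))   # prints 49
print(calls)                                      # prints [7]
```

As in the Matlab version, the cheap test is evaluated by the caller before the call, so only the expensive function is skipped.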