I am running a model written in Fortran (an executable). In some runs it started to deliver constant errors and apparently incoherent results. When I closely checked the results file (a text file with n columns of data), I realized that when the concentration of a certain mineral is very low, say 2.9984199E-306, the code omits the 'E' and the number is printed as 2.9984199-306, which of course causes problems. Since I have no access to the source code of the executable, is there a way to avoid this problem on Windows? I have seen that on other computers these numbers are replaced directly by zero, but I was not able to find the specific configuration to achieve this.
You will need access to the source code to change the output formatting, or you will need to post-process your output. You are seeing standard-conforming Fortran behavior. Consider the simple program
program foo
  implicit none
  real(8) :: x
  integer :: i
  x = 1
  do i = 1, 10
    x = x / 5.4321e11   ! shrink x until its exponent needs three digits
    write(*,'(ES15.7)') x
  end do
end program foo
Its output is
1.8409086E-12
3.3889446E-24
6.2387373E-36
1.1484945E-47
2.1142735E-59
3.8921843E-71
7.1651557E-83
1.3190397E-94
2.4282316-106
4.4701525-118
See the Fortran 2018 standard, 13.7.2.3.3 "E and D editing", in particular Table 13.1: with a plain Ew.d (or ESw.d) edit descriptor, a three-digit exponent is written without the exponent letter; only an explicit exponent width, as in ESw.dE3, guarantees that the E is printed.
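Since the executable cannot be changed, one workaround is to post-process the results file. Below is a minimal sketch (the file names are hypothetical, and it assumes for illustration a single value per record; a real file with n columns would read n values per line). It relies on two standard behaviors: list-directed input accepts exponents written without the E, and the ESw.dEe form noted above always prints the letter.
program fixup
  implicit none
  real(8) :: x
  integer :: ios
  open(10, file='results.txt', status='old')   ! hypothetical input name
  open(11, file='results_fixed.txt')           ! hypothetical output name
  do
    ! list-directed input happily reads forms like 2.9984199-306
    read(10, *, iostat=ios) x
    if (ios /= 0) exit
    ! the explicit exponent width (E3) forces the exponent letter
    write(11, '(ES16.7E3)') x
  end do
end program fixup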
I am new to using Halide and I am playing around with implementing algorithms first. I am trying to write a function which, depending on the value of the 8 pixels around it, either skips to the next pixel or does some processing and then moves on to the next pixel. When trying to write this I get the following compiler error:
84:5: error: value of type 'Halide::Expr' is not contextually convertible to 'bool'
if(input(x,y) > 0)
I have done all the tutorials and have seen that the select function is an option, but is there a way to either compare the values of a function or store them somewhere?
I also may be thinking about this problem wrong or might not be implementing it with the right "Halide mindset", so any suggestions would be great. Thank you in advance for everything!
The underlying issue here is that, although they are syntactically interleaved, and Halide code is constructed by running C++ code, Halide code is not C++ code and vice versa. Halide code is entirely defined by the Halide::* data structures you build up inside Funcs. if is a C control flow construct; you can use it to conditionally build different Halide programs, but you can't use it inside the logic of the Halide program (inside an Expr/Func). select is to Halide (an Expr which conditionally evaluates to one of two values) as if/else is to C (a statement which conditionally executes one of two sub-statements).
Rest assured, you're hardly alone in having this confusion early on. I want to write a tutorial specifically addressing how to think about staged programming inside Halide.
Until then, the short, "how do I do what I want" answer is as you suspected and as Khouri pointed out: use a select.
Since you've provided no code other than the one line, I'm assuming input is a Func and both x and y are Vars. If so, the result of input(x,y) is an Expr that you cannot evaluate with an if, as the error message indicates.
For the scenario that you describe, you might have something like this:
Var x, y;
Func input; input(x,y) = ...;
Func output; output(x,y) = select
// examine surrounding values
( input(x-1,y-1) > 0
&& input(x+0,y-1) > 0
&& ...
&& input(x+1,y+1) > 0
// true case
, ( input(x-1,y-1)
+ input(x+0,y-1)
+ ...
+ input(x+1,y+1)
) / 8
// false case
, input(x,y)
);
Working in Halide definitely requires a different mindset. You have to think in a more mathematical form: a definition a(x,y) = b(x,y) will be enforced for all values of x and y.
Algorithm and scheduling should be separate, although the algorithm may need to be tweaked to allow for better scheduling.
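As a minimal sketch of that separation (reusing the output, x, and y names from the snippet above; the vector width 8 is illustrative, not tuned), the schedule is applied to the Func after the algorithm has been defined:
// the algorithm above stays untouched; the schedule is bolted on afterwards
output.vectorize(x, 8).parallel(y);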
I'm learning the Julia language and followed some tutorials to test OLS (ordinary least squares) estimation in Julia. First, I need to simulate a dataset of a dependent variable ("Y"), independent variables ("X"), error terms (epsilon), and parameters. The script looks like:
# ols_simulate :generate necessary data
using Distributions
N=100000
K=3
genX = MvNormal(eye(K))
X = rand(genX,N)
X = X'
X_noconstant = X
constant = ones(N)
X = [constant X]
genEpsilon = Normal(0, 1)
epsilon = rand(genEpsilon,N)
trueParams = [0.1,0.5,-0.3,0.]
Y = X*trueParams + epsilon
and then I defined an OLS function
function OLSestimator(y, x)
    estimate = inv(x'*x)*(x'*y)
    return estimate
end
What I planned to do first was to simulate the data from the terminal with the command:
ols_simulate
and hope that this step generates and stores the data properly, so that I could then call OLSestimator. But after trying this, when I typed mean(Y) in the Julia REPL, it gave me an error message like
ERROR: UndefVarError: Y not defined
It seems the data are not stored properly. More generally, if I have multiple scripts (scripts and functions), how can I use the data generated by one script from the others in the terminal?
Thank you.
Each time you run the Julia REPL (the Julia "command line"), it begins with a fresh memory workspace. Thus, to define variables and then use them, you should do both within a single session of the interpreter.
If I understand correctly, you have multiple scripts which do parts of the calculations. To run a script in the REPL and stay in it with all the global variables still defined, you can use
include("scriptname.jl")
(with scriptname changed to appropriate .jl filename).
In this case, the workflow could look like:
include("ols_simulate.jl")
estimate = OLSestimator(Y,X)
mean(Y)
In general, it's best to stay in the REPL, unless you want to clear everything and start fresh and then quitting and restarting is the way to go.
You need to save the script in a separate file and then load it into Julia. Say you already saved it with the name "ols_simulate.jl" in directory "dir1"; then navigate to that directory in the terminal and start up Julia. Once in Julia, you have to load "ols_simulate.jl", after which you can calculate the mean of Y and do whatever you want:
include("ols_simulate.jl")
mean(Y)
OLSestimator(Y, X)
For the kind of stuff that I think you are doing, you could find it useful to use a notebook interface like Jupyter.
When I compile the following Fortran code with gfortran and run it, it gives me 'signal SIGBUS: Access to undefined portion of a memory object' whenever n>=180. I'm running this on Mac OS X Mavericks.
PROGRAM almatrix
  IMPLICIT NONE
  INTEGER :: i,j,n
  REAL,ALLOCATABLE :: a(:,:)
  READ(*,*)n
  ALLOCATE(a(n+1,n+1))
  DO i=0,n
    DO j=0,n
      a(i,j)=0.0
    END DO
  END DO
  DEALLOCATE(a)
END PROGRAM almatrix
I understood that instead of
ALLOCATE(a(n+1,n+1))
this
ALLOCATE(a(n+1,n+1),STAT=err)
IF(err /= 0) STOP
would prevent crashing. It didn't, however. Why?
I tried to look at similar problems, but so far they haven't helped.
I tried to compile with -Wall, -g, -fcheck=all, as suggested in another answer, but those didn't give me warnings.
I've also noticed before, that unlike with C, Fortran usually does not give bus errors when using small dynamic arrays and not deallocating them.
The problem isn't directly with the allocate statement, but with accessing the resulting array. [Note also that an array of 181x181 elements is not "large".] As there is nothing wrong with the allocation, err will indeed be zero.
From that allocate one is left with an array a which has elements a(1,1), a(2,1), ..., a(n+1,1), ..., a(n+1,n+1). So, a(0,0) (the first access in the loop) is not valid.
There are two options: request that the array elements be a(0,0) to a(n,n) as the loop wants, or change the loop:
allocate(a(0:n,0:n))
or
do i=1,n+1
  do j=1,n+1
    a(j,i) = 0 ! Note I've changed the order to Fortran-friendly
  end do
end do
Finally, those loops aren't even necessary:
allocate(a(0:n,0:n))
a = 0.
or even
allocate(a(0:n,0:n), source=0.)
if you have a compiler that supports Fortran 2003's sourced allocation.
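Putting that together, a minimal corrected sketch of the original program (keeping the zero-based bounds the loops assume; the source= form needs a Fortran 2003 compiler):
program almatrix
  implicit none
  integer :: n
  real, allocatable :: a(:,:)
  read(*,*) n
  allocate(a(0:n,0:n), source=0.0)  ! bounds 0..n match the intended accesses
  ! ... work with a(0,0) through a(n,n) here ...
  deallocate(a)
end program almatrix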
I have been playing with Julia because it seems syntactically similar to Python (which I like) but claims to be faster. However, I tried making a script similar to something I have in Python for testing whether the values within a text file are numerical, which uses this function:
function isFloat(s)
try:
float64(s)
return true
catch:
return false
end
end
For some reason, this takes a great deal of time for a text file with a reasonable number of rows of text (~500,000).
Why would this be? Is there a better way to do this? What general feature of the language can I understand from this to apply to other languages?
Here are the two exact scripts I ran, with the times for reference:
python: ~0.5 seconds
import numpy as np
import time

def is_number(s):
    try:
        np.float64(s)
        return True
    except ValueError:
        return False
start = time.time()
file_data = open('SMW100.asc').readlines()
file_data = map(lambda line: line.rstrip('\n').replace(',',' ').split(), file_data)
bools = [(all(map(is_number, x)), x) for x in file_data]
print time.time() - start
julia: ~73.5 seconds
start = time()
function isFloat(s)
try:
float64(s)
return true
catch:
return false
end
end
x = map(x-> split(replace(x, ",", " ")), open(readlines, "SMW100.asc"))
u = [(all(map(isFloat, i)), i) for i in x]
print(time() - start)
Note that you can use the float64_isvalid function in the standard library to (a) check whether a string is a valid floating-point value and (b) return the value.
Note also that the colons (:) after try and catch in your isFloat code are wrong in Julia (this is a Pythonism).
A much faster version of your code should be:
const isFloat2_out = [1.0]
isFloat2(s::String) = float64_isvalid(s, isFloat2_out)
function foo(L)
x = split(L, ",")
(all(isFloat2, x), x)
end
u = map(foo, open(readlines, "SMW100.asc"))
On my machine, for a sample file with 100,000 rows and 10 columns of data, 50% of which are valid numbers, your Python code takes 4.21 seconds and my Julia code takes 2.45 seconds.
This is an interesting performance problem that might be worth submitting to julia-users to get more focused feedback than SO will probably provide. At a first glance, I think you're hitting problems because (1) try/catch is just slightly slow to begin with and then (2) you're using try/catch in a context where there's a very considerable amount of type uncertainty because of lots of function calls that don't return stable types. As a result, the Julia interpreter spends its time trying to figure out the types of objects rather than doing your computation. It's a bit hard to tell exactly where the big bottlenecks are because you're doing a lot of things that are not very idiomatic in Julia. Also, you seem to be doing your computations in the global scope, where Julia's compiler can't perform many meaningful optimizations due to additional type uncertainty.
Python is oddly ambiguous on the subject of whether using exceptions for control flow is good or bad. See Python using exceptions for control flow considered bad?. But even in Python, the consensus is that user code shouldn't use exceptions for control flow (although for some reason generators are allowed to do this). So basically, the simple answer is that you should not be doing that – exceptions are for exceptional situations, not for control flow. That is why almost zero effort has been put into making Julia's try/catch construct faster – you shouldn't be using it like that in the first place. Of course, we will probably get around to making it faster at some point.
That said, the onus is on us as the designers of Julia's standard library to make sure that we provide APIs that never force you to use exceptions for control flow. In this case, you need a function that allows you to try to parse something as a floating-point value and indicate whether that was possible or not, not by throwing an exception, but rather by returning normal values. We don't provide such an API, so this is ultimately a shortcoming of Julia's standard library as it exists right now. I've opened an issue to discuss this API design question: https://github.com/JuliaLang/julia/issues/5704. We'll see how it pans out.
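For later readers: that issue did eventually lead to a non-throwing API. In modern Julia (1.x), the equivalent check is a one-liner along these lines (the function name here is arbitrary):
isFloat3(s::AbstractString) = tryparse(Float64, s) !== nothing  # nothing signals "not parseable"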
As the title says, I'm curious about the difference between "call-by-reference" and "call-by-value-return". I've read about it in some literature and tried to find additional information on the internet, but I've only found comparisons of "call-by-value" and "call-by-reference".
I do understand the difference at memory level, but not at the "conceptual" level, between the two.
The called subroutine will have its own copy of the actual parameter value to work with, but will, when it finishes executing, copy the new local value (bound to the formal parameter) back to the actual parameter of the caller.
When is call-by-value-return actually preferable to "call-by-reference"? Any example scenario? All I can see is that it takes extra memory and execution time due to the copying of values between memory cells.
As a side question, is "call-by-value-return" implemented in 'modern' languages?
Call-by-value-return, from Wikipedia:
This variant has gained attention in multiprocessing contexts and Remote procedure call: if a parameter to a function call is a reference that might be accessible by another thread of execution, its contents may be copied to a new reference that is not; when the function call returns, the updated contents of this new reference are copied back to the original reference ("restored").
So, in more practical terms, it's entirely possible that a variable is in some undesired state in the middle of the execution of a function. With parallel processing this is a problem, since you can attempt to access the variable while it has this value. Copying it to a temporary value avoids this problem.
As an example:
policeCount = 0

everyTimeSomeoneApproachesOrLeaves()
    calculatePoliceCount(policeCount)

calculatePoliceCount(count)
    count = 0
    for each police official
        count++

goAboutMyDay()
    if policeCount == 0
        doSomethingIllegal()
    else
        doSomethingElse()
Assume everyTimeSomeoneApproachesOrLeaves and goAboutMyDay are executed in parallel.
So if you pass by reference, you could end up reading policeCount right after it was set to 0 in calculatePoliceCount, even if there are police officials around; then you'd end up doing something illegal and probably going to jail, or at least coughing up some money for a bribe. If you pass by value-return, this won't happen.
Supported languages?
In my search, I found that Ada and Fortran support this. I don't know of others.
Suppose you have a call by reference function (in C++):
void foobar(int &x, int &y) {
while (y-->0) {
x++;
}
}
and you call it thusly:
int z = 5;
foobar(z, z);
It will never terminate, because x and y are the same reference: each time the loop condition decrements y, that is subsequently undone by the increment of x (since they are both really z under the hood).
By contrast using call-by-value-return (in rusty Fortran):
subroutine foobar(x, y)
    integer, intent(inout) :: x, y
    do while (y > 0)
        y = y - 1
        x = x + 1
    end do
end subroutine foobar
If you call this routine with the same variable:
integer :: z = 5
call foobar(z,z)
it will still terminate, and at the end z will have a value of either 10 or 0, depending on which result is copied back first (I don't remember if a particular order is required, and I can't find any quick answers to the question online).
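To make the copy-in/copy-out mechanics concrete, here is a small C++ simulation of value-return semantics (the helper name and the copy-out order are illustrative; swapping the last two assignments flips the result between 0 and 10):
#include <iostream>

// simulate call-by-value-return on top of C++ references:
// copy the arguments in, work on locals, copy the results back out
void foobar_value_return(int &x, int &y) {
    int xl = x, yl = y;               // copy in
    while (yl > 0) { --yl; ++xl; }
    x = xl;                           // copy out; when x and y alias the
    y = yl;                           // same variable, the last write wins
}

int main() {
    int z = 5;
    foobar_value_return(z, z);        // terminates, unlike the by-reference version
    std::cout << z << '\n';           // prints 0 with this copy-out order
}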
The program at the following link can give you a practical idea of the difference between these two:
Difference between call-by-reference and call-by-value