Related
I have been playing with Julia because it seems syntactically similar to python (which I like) but claims to be faster. However, I tried making a similar script to something I have in python for tesing where numerical values are within a text file which uses this function:
function isFloat(s)
try:
float64(s)
return true
catch:
return false
end
end
For some reason, this takes a great deal of time for a text file with a reasonable amount of rows of text (~500000).
Why would this be? Is there a better way to do this? What general feature of the language can I understand from this to apply to other languages?
Here are the two exact scripts i ran with the times for reference:
python: ~0.5 seconds
def is_number(s):
try:
np.float64(s)
return True
except ValueError:
return False
start = time.time()
file_data = open('SMW100.asc').readlines()
file_data = map(lambda line: line.rstrip('\n').replace(',',' ').split(), file_data)
bools = [(all(map(is_number, x)), x) for x in file_data]
print time.time() - start
julia: ~73.5 seconds
start = time()
function isFloat(s)
try:
float64(s)
return true
catch:
return false
end
end
x = map(x-> split(replace(x, ",", " ")), open(readlines, "SMW100.asc"))
u = [(all(map(isFloat, i)), i) for i in x]
print(start - time())
Note also that you can use the float64_isvalid function in the standard library to (a) check whether a string is a valid floating-point value and (b) return the value.
Note also that the colons (:) after try and catch in your isFloat code are wrong in Julia (this is a Pythonism).
A much faster version of your code should be:
const isFloat2_out = [1.0]
isFloat2(s::String) = float64_isvalid(s, isFloat2_out)
function foo(L)
x = split(L, ",")
(all(isFloat2, x), x)
end
u = map(foo, open(readlines, "SMW100.asc"))
On my machine, for a sample file with 100,000 rows and 10 columns of data, 50% of which are valid numbers, your Python code takes 4.21 seconds and my Julia code takes 2.45 seconds.
This is an interesting performance problem that might be worth submitting to julia-users to get more focused feedback than SO will probably provide. At a first glance, I think you're hitting problems because (1) try/catch is just slightly slow to begin with and then (2) you're using try/catch in a context where there's a very considerable amount of type uncertainty because of lots of function calls that don't return stable types. As a result, the Julia interpreter spend its time trying to figure out the types of objects rather than doing your computation. It's a bit hard to tell exactly where the big bottlenecks are because you're doing a lot of things that are not very idiomatic in Julia. Also you seem to be doing your computations in the global scope, where Julia's compiler can't perform many meaningful optimizations due to additional type uncertainty.
Python is oddly ambiguous on the subject of whether using exceptions for control flow is good or bad. See Python using exceptions for control flow considered bad?. But even in Python, the consensus is that user code shouldn't use exceptions for control flow (although for some reason generators are allowed to do this). So basically, the simple answer is that you should not be doing that – exceptions are for exceptional situations, not for control flow. That is why almost zero effort has been put into making Julia's try/catch construct faster – you shouldn't be using it like that in the first place. Of course, we will probably get around to making it faster at some point.
That said, the onus is on us as the designers of Julia's standard library to make sure that we provide APIs that never force you to use exceptions for control flow. In this case, you need a function that allows you to try to parse something as a floating-point value and indicate whether that was possible or not – not by throwing an exception, but rather by returning normal values. We don't provide such an API, so this ultimately a shortcoming of Julia's standard library – as it exists right now. I've opened an issue to discuss this API design question: https://github.com/JuliaLang/julia/issues/5704. We'll see how it pans out.
I am developing a (large) package which does not load properly anymore.
This happened after I changed a single line of code.
When I attempt to load the package (with Needs), the package starts loading and then one of the setdelayed definitions “comes alive” (ie. Is somehow evaluated), gets trapped in an error trapping routine loaded a few lines before and the package loading aborts.
The error trapping routine with abort is doing its job, except that it should not have been called in the first place, during the package loading phase.
The error message reveals that the wrong argument is in fact a pattern expression which I use on the lhs of a setdelayed definition a few lines later.
Something like this:
……Some code lines
Changed line of code
g[x_?NotGoodQ]:=(Message[g::nogood, x];Abort[])
……..some other code lines
g/: cccQ[g[x0_]]:=True
When I attempt to load the package, I get:
g::nogood: Argument x0_ is not good
As you see the passed argument is a pattern and it can only come from the code line above.
I tried to find the reason for this behavior, but I have been unsuccessful so far.
So I decided to use the powerful Workbench debugging tools .
I would like to see step by step (or with breakpoints) what happens when I load the package.
I am not yet too familiar with WB, but it seems that ,using Debug as…, the package is first loaded and then eventually debugged with breakpoints, ect.
My problem is that the package does not even load completely! And any breakpoint set before loading the package does not seem to be effective.
So…2 questions:
can anybody please explain why these code lines "come alive" during package loading? (there are no obvious syntax errors or code fragments left in the package as far as I can see)
can anybody please explain how (if) is possible to examine/debug
package code while being loaded in WB?
Thank you for any help.
Edit
In light of Leonid's answer and using his EvenQ example:
We can avoid using Holdpattern simply by definying upvalues for g BEFORE downvalues for g
notGoodQ[x_] := EvenQ[x];
Clear[g];
g /: cccQ[g[x0_]] := True
g[x_?notGoodQ] := (Message[g::nogood, x]; Abort[])
Now
?g
Global`g
cccQ[g[x0_]]^:=True
g[x_?notGoodQ]:=(Message[g::nogood,x];Abort[])
In[6]:= cccQ[g[1]]
Out[6]= True
while
In[7]:= cccQ[g[2]]
During evaluation of In[7]:= g::nogood: -- Message text not found -- (2)
Out[7]= $Aborted
So...general rule:
When writing a function g, first define upvalues for g, then define downvalues for g, otherwise use Holdpattern
Can you subscribe to this rule?
Leonid says that using Holdpattern might indicate improvable design. Besides the solution indicated above, how could one improve the design of the little code above or, better, in general when dealing with upvalues?
Thank you for your help
Leaving aside the WB (which is not really needed to answer your question) - the problem seems to have a straightforward answer based only on how expressions are evaluated during assignments. Here is an example:
In[1505]:=
notGoodQ[x_]:=True;
Clear[g];
g[x_?notGoodQ]:=(Message[g::nogood,x];Abort[])
In[1509]:= g/:cccQ[g[x0_]]:=True
During evaluation of In[1509]:= g::nogood: -- Message text not found -- (x0_)
Out[1509]= $Aborted
To make it work, I deliberately made a definition for notGoodQ to always return True. Now, why was g[x0_] evaluated during the assignment through TagSetDelayed? The answer is that, while TagSetDelayed (as well as SetDelayed) in an assignment h/:f[h[elem1,...,elemn]]:=... does not apply any rules that f may have, it will evaluate h[elem1,...,elem2], as well as f. Here is an example:
In[1513]:=
ClearAll[h,f];
h[___]:=Print["Evaluated"];
In[1515]:= h/:f[h[1,2]]:=3
During evaluation of In[1515]:= Evaluated
During evaluation of In[1515]:= TagSetDelayed::tagnf: Tag h not found in f[Null]. >>
Out[1515]= $Failed
The fact that TagSetDelayed is HoldAll does not mean that it does not evaluate its arguments - it only means that the arguments arrive to it unevaluated, and whether or not they will be evaluated depends on the semantics of TagSetDelayed (which I briefly described above). The same holds for SetDelayed, so the commonly used statement that it "does not evaluate its arguments" is not literally correct. A more correct statement is that it receives the arguments unevaluated and does evaluate them in a special way - not evaluate the r.h.s, while for l.h.s., evaluate head and elements but not apply rules for the head. To avoid that, you may wrap things in HoldPattern, like this:
Clear[g,notGoodQ];
notGoodQ[x_]:=EvenQ[x];
g[x_?notGoodQ]:=(Message[g::nogood,x];Abort[])
g/:cccQ[HoldPattern[g[x0_]]]:=True;
This goes through. Here is some usage:
In[1527]:= cccQ[g[1]]
Out[1527]= True
In[1528]:= cccQ[g[2]]
During evaluation of In[1528]:= g::nogood: -- Message text not found -- (2)
Out[1528]= $Aborted
Note however that the need for HoldPattern inside your left-hand side when making a definition is often a sign that the expression inside your head may also evaluate during the function call, which may break your code. Here is an example of what I mean:
In[1532]:=
ClearAll[f,h];
f[x_]:=x^2;
f/:h[HoldPattern[f[y_]]]:=y^4;
This code attempts to catch cases like h[f[something]], but it will obviously fail since f[something] will evaluate before the evaluation comes to h:
In[1535]:= h[f[5]]
Out[1535]= h[25]
For me, the need for HoldPattern on the l.h.s. is a sign that I need to reconsider my design.
EDIT
Regarding debugging during loading in WB, one thing you can do (IIRC, can not check right now) is to use good old print statements, the output of which will appear in the WB's console. Personally, I rarely feel a need for debugger for this purpose (debugging package when loading)
EDIT 2
In response to the edit in the question:
Regarding the order of definitions: yes, you can do this, and it solves this particular problem. But, generally, this isn't robust, and I would not consider it a good general method. It is hard to give a definite advice for a case at hand, since it is a bit out of its context, but it seems to me that the use of UpValues here is unjustified. If this is done for error - handling, there are other ways to do it without using UpValues.
Generally, UpValues are used most commonly to overload some function in a safe way, without adding any rule to the function being overloaded. One advice is to avoid associating UpValues with heads which also have DownValues and may evaluate -by doing this you start playing a game with evaluator, and will eventually lose. The safest is to attach UpValues to inert symbols (heads, containers), which often represent a "type" of objects on which you want to overload a given function.
Regarding my comment on the presence of HoldPattern indicating a bad design. There certainly are legitimate uses for HoldPattern, such as this (somewhat artificial) one:
In[25]:=
Clear[ff,a,b,c];
ff[HoldPattern[Plus[x__]]]:={x};
ff[a+b+c]
Out[27]= {a,b,c}
Here it is justified because in many cases Plus remains unevaluated, and is useful in its unevaluated form - since one can deduce that it represents a sum. We need HoldPattern here because of the way Plus is defined on a single argument, and because a pattern happens to be a single argument (even though it describes generally multiple arguments) during the definition. So, we use HoldPattern here to prevent treating the pattern as normal argument, but this is mostly different from the intended use cases for Plus. Whenever this is the case (we are sure that the definition will work all right for intended use cases), HoldPattern is fine. Note b.t.w., that this example is also fragile:
In[28]:= ff[Plus[a]]
Out[28]= ff[a]
The reason why it is still mostly OK is that normally we don't use Plus on a single argument.
But, there is a second group of cases, where the structure of usually supplied arguments is the same as the structure of patterns used for the definition. In this case, pattern evaluation during the assignment indicates that the same evaluation will happen with actual arguments during the function calls. Your usage falls into this category. My comment for a design flaw was for such cases - you can prevent the pattern from evaluating, but you will have to prevent the arguments from evaluating as well, to make this work. And pattern-matching against not completely evaluated expression is fragile. Also, the function should never assume some extra conditions (beyond what it can type-check) for the arguments.
Before jumping into python, I had started with some Objective-C / Cocoa books. As I recall, most functions required keyword arguments to be explicitly stated. Until recently I forgot all about this, and just used positional arguments in Python. But lately, I've ran into a few bugs which resulted from improper positions - sneaky little things they were.
Got me thinking - generally speaking, unless there is a circumstance that specifically requires non-keyword arguments - is there any good reason NOT to use keyword arguments? Is it considered bad style to always use them, even for simple functions?
I feel like as most of my 50-line programs have been scaling to 500 or more lines regularly, if I just get accustomed to always using keyword arguments, the code will be more easily readable and maintainable as it grows. Any reason this might not be so?
UPDATE:
The general impression I am getting is that its a style preference, with many good arguments that they should generally not be used for very simple arguments, but are otherwise consistent with good style. Before accepting I just want to clarify though - is there any specific non-style problems that arise from this method - for instance, significant performance hits?
There isn't any reason not to use keyword arguments apart from the clarity and readability of the code. The choice of whether to use keywords should be based on whether the keyword adds additional useful information when reading the code or not.
I follow the following general rule:
If it is hard to infer the function (name) of the argument from the function name – pass it by keyword (e.g. I wouldn't want to have text.splitlines(True) in my code).
If it is hard to infer the order of the arguments, for example if you have too many arguments, or when you have independent optional arguments – pass it by keyword (e.g. funkyplot(x, y, None, None, None, None, None, None, 'red') doesn't look particularly nice).
Never pass the first few arguments by keyword if the purpose of the argument is obvious. You see, sin(2*pi) is better than sin(value=2*pi), the same is true for plot(x, y, z).
In most cases, stable mandatory arguments would be positional, and optional arguments would be keyword.
There's also a possible difference in performance, because in every implementation the keyword arguments would be slightly slower, but considering this would be generally a premature optimisation and the results from it wouldn't be significant, I don't think it's crucial for the decision.
UPDATE: Non-stylistical concerns
Keyword arguments can do everything that positional arguments can, and if you're defining a new API there are no technical disadvantages apart from possible performance issues. However, you might have little issues if you're combining your code with existing elements.
Consider the following:
If you make your function take keyword arguments, that becomes part of your interface.
You can't replace your function with another that has a similar signature but a different keyword for the same argument.
You might want to use a decorator or another utility on your function that assumes that your function takes a positional argument. Unbound methods are an example of such utility because they always pass the first argument as positional after reading it as positional, so cls.method(self=cls_instance) doesn't work even if there is an argument self in the definition.
None of these would be a real issue if you design your API well and document the use of keyword arguments, especially if you're not designing something that should be interchangeable with something that already exists.
If your consideration is to improve readability of function calls, why not simply declare functions as normal, e.g.
def test(x, y):
print "x:", x
print "y:", y
And simply call functions by declaring the names explicitly, like so:
test(y=4, x=1)
Which obviously gives you the output:
x: 1
y: 4
or this exercise would be pointless.
This avoids having arguments be optional and needing default values (unless you want them to be, in which case just go ahead with the keyword arguments! :) and gives you all the versatility and improved readability of named arguments that are not limited by order.
Well, there are a few reasons why I would not do that.
If all your arguments are keyword arguments, it increases noise in the code and it might remove clarity about which arguments are required and which ones are optionnal.
Also, if I have to use your code, I might want to kill you !! (Just kidding), but having to type the name of all the parameters everytime... not so fun.
Just to offer a different argument, I think there are some cases in which named parameters might improve readability. For example, imagine a function that creates a user in your system:
create_user("George", "Martin", "g.m#example.com", "payments#example.com", "1", "Radius Circle")
From that definition, it is not at all clear what these values might mean, even though they are all required, however with named parameters it is always obvious:
create_user(
first_name="George",
last_name="Martin",
contact_email="g.m#example.com",
billing_email="payments#example.com",
street_number="1",
street_name="Radius Circle")
I remember reading a very good explanation of "options" in UNIX programs: "Options are meant to be optional, a program should be able to run without any options at all".
The same principle could be applied to keyword arguments in Python.
These kind of arguments should allow a user to "customize" the function call, but a function should be able to be called without any implicit keyword-value argument pairs at all.
Sometimes, things should be simple because they are simple.
If you always enforce you to use keyword arguments on every function call, soon your code will be unreadable.
When Python's built-in compile() and __import__() functions gain keyword argument support, the same argument was made in favor of clarity. There appears to be no significant performance hit, if any.
Now, if you make your functions only accept keyword arguments (as opposed to passing the positional parameters using keywords when calling them, which is allowed), then yes, it'd be annoying.
I don't see the purpose of using keyword arguments when the meaning of the arguments is obvious
Keyword args are good when you have long parameter lists with no well defined order (that you can't easily come up with a clear scheme to remember); however there are many situations where using them is overkill or makes the program less clear.
First, sometimes is much easier to remember the order of keywords than the names of keyword arguments, and specifying the names of arguments could make it less clear. Take randint from scipy.random with the following docstring:
randint(low, high=None, size=None)
Return random integers x such that low <= x < high.
If high is None, then 0 <= x < low.
When wanting to generate a random int from [0,10) its clearer to write randint(10) than randint(low=10) in my view. If you need to generate an array with 100 numbers in [0,10) you can probably remember the argument order and write randint(0, 10, 100). However, you may not remember the variable names (e.g., is the first parameter low, lower, start, min, minimum) and once you have to look up the parameter names, you might as well not use them (as you just looked up the proper order).
Also consider variadic functions (ones with variable number of parameters that are anonymous themselves). E.g., you may want to write something like:
def square_sum(*params):
sq_sum = 0
for p in params:
sq_sum += p*p
return sq_sum
that can be applied a bunch of bare parameters (square_sum(1,2,3,4,5) # gives 55 ). Sure you could have written the function to take an named keyword iterable def square_sum(params): and called it like square_sum([1,2,3,4,5]) but that may be less intuitive, especially when there's no potential confusion about the argument name or its contents.
A mistake I often do is that I forget that positional arguments have to be specified before any keyword arguments, when calling a function. If testing is a function, then:
testing(arg = 20, 56)
gives a SyntaxError message; something like:
SyntaxError: non-keyword arg after keyword arg
It is easy to fix of course, it's just annoying. So in the case of few - lines programs as the ones you mention, I would probably just go with positional arguments after giving nice, descriptive names to the parameters of the function. I don't know if what I mention is that big of a problem though.
One downside I could see is that you'd have to think of a sensible default value for everything, and in many cases there might not be any sensible default value (including None). Then you would feel obliged to write a whole lot of error handling code for the cases where a kwarg that logically should be a positional arg was left unspecified.
Imagine writing stuff like this every time..
def logarithm(x=None):
if x is None:
raise TypeError("You can't do log(None), sorry!")
My code relies on version of Element which works like MemberQ, but when I load Combinatorica, Element gets redefined to work like Part. What is the easiest way to fix this conflict? Specifically, what is the syntax to remove Combinatorica's definition from DownValues? Here's what I get for DownValues[Element]
{HoldPattern[
Combinatorica`Private`a_List \[Element] \
{Combinatorica`Private`index___}] :>
Combinatorica`Private`a[[Combinatorica`Private`index]],
HoldPattern[Private`x_ \[Element] Private`list_List] :>
MemberQ[Private`list, Private`x]}
If your goal is to prevent Combinatorica from installing the definition in the first place, you can achieve this result by loading the package for the first time thus:
Block[{Element}, Needs["Combinatorica`"]]
However, this will almost certainly make any Combinatorica features that depend upon the definition fail (which may or may not be of concern in your particular application).
You can do several things. Let us introduce a convenience function
ClearAll[redef];
SetAttributes[redef, HoldRest];
redef[f_, code_] := (Unprotect[f]; code; Protect[f])
If you are sure about the order of definitions, you can do something like
redef[Element, DownValues[Element] = Rest[DownValues[Element]]]
If you want to delete definitions based on the context, you can do something like this:
redef[Element, DownValues[Element] =
DeleteCases[DownValues[Element],
rule_ /; Cases[rule, x_Symbol /; (StringSplit[Context[x], "`"][[1]] ===
"Combinatorica"), Infinity, Heads -> True] =!= {}]]
You can also use a softer way - reorder definitions rather than delete:
redef[Element, DownValues[Element] = RotateRight[DownValues[Element]]]
There are many other ways of dealing with this problem. Another one (which I already recommended) is to use UpValues, if this is suitable. The last one I want to mention here is to make a kind of custom dynamic scoping construct based on Block, and wrap it around your code. I personally find it the safest variant, in case if you want strictly your definition to apply (because it does not care about the order in which various definitions could have been created - it removes all of them and adds just yours). It is also safer in that outside those places where you want your definitions to apply (by "places" I mean parts of the evaluation stack), other definitions will still apply, so this seems to be the least intrusive way. Here is how it may look:
elementDef[] := Element[x_, list_List] := MemberQ[list, x];
ClearAll[elemExec];
SetAttributes[elemExec, HoldAll];
elemExec[code_] := Block[{Element}, elementDef[]; code];
Example of use:
In[10]:= elemExec[Element[1,{1,2,3}]]
Out[10]= True
Edit:
If you need to automate the use of Block, here is an example package to show one way how this can be done:
BeginPackage["Test`"]
var;
f1;
f2;
Begin["`Private`"];
(* Implementations of your functions *)
var = 1;
f1[x_, y_List] := If[Element[x, y], x^2];
f2[x_, y_List] := If[Element[x, y], x^3];
elementDef[] := Element[x_, list_List] := MemberQ[list, x];
(* The following part of the package is defined at the start and you don't
touch it any more, when adding new functions to the package *)
mainContext = StringReplace[Context[], x__ ~~ "Private`" :> x];
SetAttributes[elemExec, HoldAll];
elemExec[code_] := Block[{Element}, elementDef[]; code];
postprocessDefs[context_String] :=
Map[
ToExpression[#, StandardForm,
Function[sym,DownValues[sym] =
DownValues[sym] /.
Verbatim[RuleDelayed][lhs_,rhs_] :> (lhs :> elemExec[rhs])]] &,
Select[Names[context <> "*"], ToExpression[#, StandardForm, DownValues] =!= {} &]];
postprocessDefs[mainContext];
End[]
EndPackage[]
You can load the package and look at the DownValues for f1 and f2, for example:
In[17]:= DownValues[f1]
Out[17]= {HoldPattern[f1[Test`Private`x_,Test`Private`y_List]]:>
Test`Private`elemExec[If[Test`Private`x\[Element]Test`Private`y,Test`Private`x^2]]}
The same scheme will also work for functions not in the same package. In fact, you could separate
the bottom part (code-processing package) to be a package on its own, import it into any other
package where you want to inject Block into your functions' definitions, and then just call something like postprocessDefs[mainContext], as above. You could make the function which makes definitions inside Block (elementDef here) to be an extra parameter to a generalized version of elemExec, which would make this approach more modular and reusable.
If you want to be more selective about the functions where you want to inject Block, this can also be done in various ways. In fact, the whole Block-injection scheme can be made cleaner then, but it will require slightly more care when implementing each function, while the above approach is completely automatic. I can post the code which will illustrate this, if needed.
One more thing: for the less intrusive nature of this method you pay a price - dynamic scope (Block) is usually harder to control than lexically-scoped constructs. So, you must know exactly the parts of evaluation stack where you want that to apply. For example, I would hesitate to inject Block into a definition of a higher order function, which takes some functions as parameters, since those functions may come from code that assumes other definitions (like for example Combinatorica` functions relying on overloaded Element). This is not a big problem, just requires care.
The bottom line of this seems to be: try to avoid overloading built-ins if at all possible. In this case you faced this definitions clash yourself, but it would be even worse if the one who faces this problem is a user of your package (may be yourself a few months later), who wants to combine your package with another one (which happens to overload same system functions as yours). Of course, it also depends on who will be the users of your package - only yourself or potentially others as well. But in terms of design, and in the long term, you may be better off assuming the latter scenario from the start.
To remove Combinatorica's definition, use Unset or the equivalent form =.. The pattern to unset you can grab from the Information output you show in the question:
Unprotect[Element];
Element[a_List, {index___}] =.
Protect[Element];
The worry would be, of course, that Combinatorica depends internally on this ill-conceived redefinition, but you have reason to believe this to not be the case as the Information output from the redefined Element says:
The use of the function
Element in Combinatorica is now
obsolete, though the function call
Element[a, p] still gives the pth
element of nested list a, where p is a
list of indices.
HTH
I propose an entirely different approach than removing Element from DownValues. Simply use the full name of the Element function.
So, if the original is
System`Element[]
the default is now
Combinatorica`Element[]
because of loading the Combinatorica Package.
Just explicitly use
System`Element[]
wherever you need it. Of course check that System is the correct Context using the Context function:
Context[Element]
This approach ensures several things:
The Combinatorica Package will still work in your notebook, even if the Combinatorica Package is updated in the future
You wont have to redefine the Element function, as some have suggested
You can use the Combinatorica`Element function when needed
The only downside is having to explicitly write it every time.
When debugging a function I usually use
library(debug)
mtrace(FunctionName)
FunctionName(...)
And that works quite well for me.
However, sometimes I am trying to debug a complex function that I don't know. In which case, I can find that inside that function there is another function that I would like to "go into" ("debug") - so to better understand how the entire process works.
So one way of doing it would be to do:
library(debug)
mtrace(FunctionName)
FunctionName(...)
# when finding a function I want to debug inside the function, run again:
mtrace(FunctionName.SubFunction)
The question is - is there a better/smarter way to do interactive debugging (as I have described) that I might be missing?
p.s: I am aware that there where various questions asked on the subject on SO (see here). Yet I wasn't able to come across a similar question/solution to what I asked here.
Not entirely sure about the use case, but when you encounter a problem, you can call the function traceback(). That will show the path of your function call through the stack until it hit its problem. You could, if you were inclined to work your way down from the top, call debug on each of the functions given in the list before making your function call. Then you would be walking through the entire process from the beginning.
Here's an example of how you could do this in a more systematic way, by creating a function to step through it:
walk.through <- function() {
tb <- unlist(.Traceback)
if(is.null(tb)) stop("no traceback to use for debugging")
assign("debug.fun.list", matrix(unlist(strsplit(tb, "\\(")), nrow=2)[1,], envir=.GlobalEnv)
lapply(debug.fun.list, function(x) debug(get(x)))
print(paste("Now debugging functions:", paste(debug.fun.list, collapse=",")))
}
unwalk.through <- function() {
lapply(debug.fun.list, function(x) undebug(get(as.character(x))))
print(paste("Now undebugging functions:", paste(debug.fun.list, collapse=",")))
rm(list="debug.fun.list", envir=.GlobalEnv)
}
Here's a dummy example of using it:
foo <- function(x) { print(1); bar(2) }
bar <- function(x) { x + a.variable.which.does.not.exist }
foo(2)
# now step through the functions
walk.through()
foo(2)
# undebug those functions again...
unwalk.through()
foo(2)
IMO, that doesn't seem like the most sensible thing to do. It makes more sense to simply go into the function where the problem occurs (i.e. at the lowest level) and work your way backwards.
I've already outlined the logic behind this basic routine in "favorite debugging trick".
I like options(error=recover) as detailed previously on SO. Things then stop at the point of error and one can inspect.
(I'm the author of the 'debug' package where 'mtrace' lives)
If the definition of 'SubFunction' lives outside 'MyFunction', then you can just mtrace 'SubFunction' and don't need to mtrace 'MyFunction'. And functions run faster if they're not 'mtrace'd, so it's good to mtrace only as little as you need to. (But you probably know those things already!)
If 'MyFunction' is only defined inside 'SubFunction', one trick that might help is to use a conditional breakpoint in 'MyFunction'. You'll need to 'mtrace( MyFunction)', then run it, and when the debugging window appears, find out what line 'MyFunction' is defined in. Say it's line 17. Then the following should work:
D(n)> bp( 1, F) # don't bother showing the window for MyFunction again
D(n)> bp( 18, { mtrace( SubFunction); FALSE})
D(n)> go()
It should be clear what this does (or it will be if you try it).
The only downsides are: the need to do it again whenever you change the code of 'MyFunction', and; the slowing-down that might occur through 'MyFunction' itself being mtraced.
You could also experiment with adding a 'debug.sub' argument to 'MyFunction', that defaults to FALSE. In the code of 'MyFunction', then add this line immediately after the definition of 'SubFunction':
if( debug.sub) mtrace( SubFunction)
That avoids any need to mtrace 'MyFunction' itself, but does require you to be able to change its code.