PyQGIS - RuntimeError: wrapped C/C++ object of type QgsVectorLayer has been deleted when editing the layer

I'm currently developing a QGIS plug-in.
When I start editing a layer, either with edit(QgsVectorLayer) or with QgsVectorLayer.startEditing(), this RuntimeError occurs on the majority of runs: RuntimeError: wrapped C/C++ object of type QgsVectorLayer has been deleted. I can run the script 10 times with no error, then run it another 10 times and get the error 10 times in a row. It feels completely random.
As I understand from reading posts such as Understanding the "underlying C/C++ object has been deleted" error, it might be a garbage-collection problem on the C++ side. But none of the posts I saw were about QgsVectorLayer, so I'm not really sure it applies.
It annoys me to the point where I have started creating empty layers to store modified features instead of editing.
I tried moving the start of the edit session before the loop, thinking that continually starting an edit session and committing changes for each feature might cause the issue, but the error still appears.
Then I thought it might be the use of break at the end, but removing it doesn't resolve the error.
As this is the first time I have really used PyQGIS, I spent some time reading the developer cookbook and searching online (Anita Graser - creating and editing a new vector layer), but I could not find any solution.
I tried different versions, LTR or not, and even another computer out of desperation, but the issue is still there.
I also read somewhere that the progress bar could be the issue, so I removed the feedback from my script, also without success.
Here is a code example:
nodesLayer = self.parameterAsVectorLayer(parameters, self.INPUT_NODE, context)
arcsLayer = self.parameterAsVectorLayer(parameters, self.INPUT_LINE, context)

# Fill node Id_line_x
# Create spatial index
index = QgsSpatialIndex(nodesLayer.getFeatures())

for line in arcsLayer.getFeatures():
    # Construct a geometry engine to speed up spatial relationship tests
    engine = QgsGeometry.createGeometryEngine(line.geometry().constGet())
    engine.prepareGeometry()
    # Get potential neighbours
    candidateIds = index.intersects(line.geometry().boundingBox())
    request = QgsFeatureRequest().setFilterFids(candidateIds)
    for node in nodesLayer.getFeatures(request):
        # Get real neighbours
        if engine.intersects(node.geometry().constGet()):
            # Fill the Id_line fields up to the number of neighbours
            for fld in range(1, node["Nb_seg"] + 1):
                if node["fk_Id_line_%d" % fld] == NULL:
                    with edit(nodesLayer):
                        node["fk_Id_line_%d" % fld] = line["Id_line"]
                        nodesLayer.updateFeature(node)
                    break
And the exact error:
Traceback (most recent call last):
File "/some/path/to/a/file.py", line 331, in processAlgorithm
nodesLayer.updateFeature(node)
RuntimeError: wrapped C/C++ object of type QgsVectorLayer has been deleted
I hope the example is enough. The goal of the code is for the nodes to be aware of their surroundings without going through the lines. It's just for processing, and those fields would be removed from the final output.
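For reference, here is a sketch of what avoiding per-feature edit sessions could look like, reusing the field names above (Nb_seg, fk_Id_line_N, Id_line); QgsVectorDataProvider.changeAttributeValues() takes a {feature id: {field index: value}} mapping, so all changes can be written in one call without ever toggling the layer's edit state:
# Sketch: collect every attribute change first, then write once through the provider
changes = {}  # {feature id: {field index: new value}}
for line in arcsLayer.getFeatures():
    engine = QgsGeometry.createGeometryEngine(line.geometry().constGet())
    engine.prepareGeometry()
    candidateIds = index.intersects(line.geometry().boundingBox())
    request = QgsFeatureRequest().setFilterFids(candidateIds)
    for node in nodesLayer.getFeatures(request):
        if engine.intersects(node.geometry().constGet()):
            pending = changes.setdefault(node.id(), {})
            for fld in range(1, node["Nb_seg"] + 1):
                idx = nodesLayer.fields().indexFromName("fk_Id_line_%d" % fld)
                # The committed value may still be NULL, so check pending edits too
                if idx not in pending and node["fk_Id_line_%d" % fld] == NULL:
                    pending[idx] = line["Id_line"]
                    break

# One bulk write; no edit session is opened on the layer object
nodesLayer.dataProvider().changeAttributeValues(changes)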

Related

gensim/models/ldaseqmodel.py:217: RuntimeWarning: divide by zero encountered in double_scalars

# Dynamic topic model
def run_dtm(num_topics=18):
    docs, years, titles = preprocessing(datasetType=2)
    # Re-sort documents by year
    Z = zip(years, docs)
    Z = sorted(Z, reverse=False)
    years_new, docs_new = zip(*Z)
    # Generate time slices
    time_slice = Counter(years_new).values()
    for year in Counter(years_new):
        print year, ' --- ', Counter(years_new)[year]
    print '********* data set loaded ********'
    dictionary = corpora.Dictionary(docs_new)
    corpus = [dictionary.doc2bow(text) for text in docs_new]
    print '********* train lda seq model ********'
    ldaseq = ldaseqmodel.LdaSeqModel(corpus=corpus, id2word=dictionary, time_slice=time_slice, num_topics=num_topics)
    print '********* lda seq model done ********'
    ldaseq.print_topics(time=1)
Hey guys, I'm using the dynamic topic models in the gensim package for topic analysis, following this tutorial: https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/ldaseqmodel.ipynb. However, I always get the same unexpected error. Can anyone give me some guidance? I'm really puzzled, even though I have tried different datasets for generating the corpus and dictionary.
The error is like this:
/Users/Barry/anaconda/lib/python2.7/site-packages/gensim/models/ldaseqmodel.py:217: RuntimeWarning: divide by zero encountered in double_scalars
convergence = np.fabs((bound - old_bound) / old_bound)
The np.fabs warning means it is NumPy that is encountering the problem. What NumPy and gensim versions are you using?
NumPy no longer supports Python 2.7, and Ldaseq was added to gensim in 2016, so you might just not have a compatible version available. If you are recoding a Python 3+ tutorial into a 2.7 variant, you obviously understand a little bit about the version differences - try running it in, say, a 3.6.8 environment (you will have to upgrade sometime anyway; 2020 is the end of 2.7 support from Python itself). That might already help; I've gone through the tutorial and did not encounter this with my own data.
That being said, I have encountered the same error before when running LdaMulticore, and it was caused by an empty corpus.
Instead of running your code fully in a function, can you try to go through it line by line (or look at your DEBUG-level log) and check whether your output has the expected properties: for example, that your corpus is not empty (and contains no empty documents)?
If that happens, fix the preprocessing steps and try again - that at least helped me, and it helped with the same ldamodel error on the mailing list.
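For example, a quick sanity check along these lines (just a sketch, using the corpus built in your run_dtm) would flag empty documents before training:
# Sketch: find documents that came out of preprocessing empty
empty = [i for i, doc in enumerate(corpus) if len(doc) == 0]
if empty:
    print('%d empty documents, e.g. at indices %s' % (len(empty), empty[:10]))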
PS: not commenting because I lack the reputation, feel free to edit this.
This is an issue with the source code of ldaseqmodel.py itself.
With the latest gensim package (version 3.8.3) I am getting the same error at line 293:
ldaseqmodel.py:293: RuntimeWarning: divide by zero encountered in double_scalars
convergence = np.fabs((bound - old_bound) / old_bound)
Now, if you go through the code, you will see that they divide the difference between bound and old_bound by old_bound (which is also visible from the warning).
If you analyze further, you will see that at line 263 old_bound is initialized with zero, and this is the main reason you are getting the divide-by-zero warning.
For further information, I put a print statement at line 294:
print('bound = {}, old_bound = {}'.format(bound, old_bound))
The output I received confirmed that old_bound is 0 when the warning is raised.
So, in a single line: you are getting this warning because of the source code of the package's ldaseqmodel.py, not because of any empty document. (Although if you do not remove empty documents from your corpus, you will receive another warning.) So I suggest that if there are any empty documents in your corpus you remove them, and just ignore this divide-by-zero warning.
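If the warning is noisy, you can silence just this division around the training call. This is a sketch using NumPy's standard errstate context manager, which only affects the wrapped block:
import numpy as np

# Sketch: ignore only divide-by-zero floating-point warnings during training
with np.errstate(divide='ignore'):
    ldaseq = ldaseqmodel.LdaSeqModel(corpus=corpus, id2word=dictionary,
                                     time_slice=time_slice, num_topics=num_topics)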

OpenMDAO v1.x: output of sub Group in ParallelGroup does not exist

When running in parallel I am unable to connect unknowns of subgroups in a ParallelGroup() even though I can connect to the subgroups' params. The code causing the problem (with names changed for clarity) is below. This code is within a group of a larger structure, but is the only place where MPI is being used:
for i in range(0, nTasks):
    self.connect('comp_a.output%i' % i, 'parallel_group.sub_group%i.param_a' % i)
    self.connect('input_param%i' % i, 'parallel_group.sub_group%i.param_b' % i)
    self.connect('parallel_group.sub_group%i.output' % i, 'comp_b.input%i' % i)
The first two connections seem to work fine, but the last one throws an error:
NameError: Source 'parallel_group.sub_group0.output' cannot be connected to target 'comp_b.input0': 'parallel_group.sub_group0.output' does not exist.
Also, if I comment out the offending line, then the first line in the loop fails for the second process with the same error message:
NameError: Source 'comp_a.output1' cannot be connected to target 'parallel_group.sub_group1.param_a': 'parallel_group.sub_group1.param_a' does not exist.
All the connections work fine with our serial version of the code. The serial version is the same except that the sub_groups are added directly to the group this code is in rather than being wrapped in parallel_group.
I have tried to look over the tutorials and examples but have not been able to figure out what might be wrong. I would really appreciate any suggestions of what to check or what may be wrong. Sorry for not posting a complete code sample.
It's a little unclear, but it sounds like you've added a new group in the parallel version of the code, named "parallel_group". When you did this, did you promote anything (or everything) from that group? If so, then you shouldn't include the parallel group in the variable name path for the connection.
That seems like the only thing likely to trip you up. I could try to debug a bit more if you can come up with a sample code that you can post here showing the problem.
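Roughly, the two styles look like this; a v1.x sketch using the names from your snippet (whether anything was promoted is the thing to check):
# Sketch, inside your top Group's __init__ (OpenMDAO v1.x API)
from openmdao.api import ParallelGroup

# If the sub-groups' variables are promoted out of parallel_group...
self.add('parallel_group', ParallelGroup(), promotes=['*'])
# ...they live at the promoted (un-prefixed) names, so connect like this:
self.connect('sub_group0.output', 'comp_b.input0')

# Without promotion, the full path is required:
# self.connect('parallel_group.sub_group0.output', 'comp_b.input0')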

VS2008 to VS2010 - Problematic Configuration and Upgrade - Newbie

I have updated this question with an executive summary at the start below. Then, extensive details follow, if needed. Thanks for the suggestions.
Exec Summary:
I am a novice with VS. I have a problem with some inherited code. The code builds and executes fine in VS2008 (XP64). The same code will either build and not run, or fail to build, on XP64 or W7 with VS2008 and/or VS2010. After changing some compiler options, I managed to get it to run without issue in VS2010 on XP64; on W7, however, no luck.
I eventually discovered that the heap is getting corrupted.
Unhandled exception at 0x76e540f2 (ntdll.dll) in ae312i3.3.exe: 0xC0000374: A heap has been corrupted.
I am not familiar with how to go about fixing a heap problem; perhaps there is an issue with pointers in the existing code pointing to memory in use by another thread or program, a corrupted ntdll.dll file, or something else?
Rebooting the PC to check whether ntdll.dll was corrupted didn't help. I changed the debug settings and received the following feedback:
HEAP[ae312i3.3.exe]: Invalid address specified to RtlSizeHeap( 0000000000220000, 000000002BC8BE58 )
Windows has triggered a breakpoint in ae312i3.3.exe.
This may be due to a corruption of the heap, which indicates a bug in ae312i3.3.exe or any of the DLLs it has loaded. This may also be due to the user pressing F12 while ae312i3.3.exe has focus.
It appears that when it crashes, C++ is returning a boolean variable to an expression of the form
While (myQueryFcn(inputvars))
QUESTIONS:
So, is it failing to return a C++ bool to a VB Boolean? I do believe that the two have different representations (one uses True/False, the other an integer?). Could this be an issue? If so, why was it NOT an issue in VS2008?
Or perhaps the C++ code has written past its allocated memory, and upon returning to VB, it crashes?
I have recently learned of 'Insure++' and will be trying to use it to track down the issue. Any suggestions on its use, or other possible insight?
I would appreciate any suggestions. Thanks again.
DETAILS THAT LED TO THE ABOVE SUMMARY:
I am a novice with VS2010; I am familiar with programming at an engineering-application level (Python, Fortran; it has been decades since I used C++ extensively), but I am not a professional programmer.
I have a solution that consists of multiple projects, all in VS2008. Projects are:
Reader (C++ project; utilizes 3rd party DLLs)
Query (C++ project; depends upon Reader)
Main (VB; depends upon Reader and Query).
The following applies to XP64 OS.
The solution and projects were written, built, and released by someone other than myself.
I have taken the existing files, made a copy, placed it in a directory of my choice, and simply opened it in VS2010 (VS2008 is not installed on my PC). I was able to build successfully (with many warnings though - more on that later); but when I ran the executable, it would reach a point and crash. After much trial and error, I discovered that modifying the following compiler settings resolved the issue for me:
It would build and execute in the Debug configuration, but not the Release. I found that in the Query project, Property Page / Configuration Properties / C++ / Optimization / Optimization was set to 'Maximize Speed (/O2)' in the Release (x64) configuration while Debug used 'Disabled (/Od)', so I switched Release to 'Disabled (/Od)'.
Also, Query's Property Page / Configuration Properties / General / Whole Program Optimization needed to be set to 'Use Link Time Code Generation'.
The above built and ran successfully on XP64 in VS2010.
Next, I simply copied the files to a W7 machine with VS2010. I opened the solution in VS2010, and it 'upgraded' the files automatically. When I launch VS2010, it automatically reports the 4 following warnings:
Operands of type Object used for operator '&'; runtime errors could occur. In file 'CobraIFile.vb', Line 1845, Column 37.
The identical warning again at a second location.
Access of shared member, constant member, enum member or nested type through an instance; qualifying expression will not be evaluated. In file 'FileWriter.vb', Line 341, Column 51.
Operands of type Object used for operator '='; use the 'Is' operator to test object identity. In file 'FormMain.vb', Line 4173, Column 32.
Code for warnings 1 and 2 is as follows:
ValueStr = String.Empty
For iCols = 0 To DGrid.Columns.Count - 1
    ValueStr &= DGrid.Item(iCols, iRows).Value & ";"  ' THIS IS THE WARNING LINE!!!
Next
Code for warning 3:
With FormMain
    WriteComment("")
    WriteComment("Generated by :")
    WriteComment("")
    WriteComment(" Program : " & .PROGRAM.ToUpper)  ' THIS IS THE WARNING LINE!!!
Code for warning 4:
' Compare material against the material table
For iRowMat As Integer = 0 To matCount - 1
    ' Ignore new row
    If Not .Rows(iRowMat).IsNewRow Then
        ' Check material description
        ' LINE BELOW IS THE WARNING LINE!!!
        If .Item("ColMatDesc", iRowMat).Value = matDesc Then
            DataGridMatProp.Item("ColMatIdx", iRow).Value = .Item("ColMatFile", iRowMat).Value
            Exit For
        End If ' Check description
    End If ' Check new row
Next iRowMat
When I build the solution, it builds successfully without errors (but with many warnings), and when I run the executable, it successfully loads the GUI but at some point crashes while executing either the Query or Reader project code (after taking actions with GUI buttons), with the following information:
C:\Users\mcgrete\AppData\Local\Temp\WER5D31.tmp.WERInternalMetadata.xml
C:\Users\mcgrete\AppData\Local\Temp\WER68E6.tmp.appcompat.txt
C:\Users\mcgrete\AppData\Local\Temp\WER722A.tmp.mdmp
I was unable to make use of the information in the three files above (I do not know how to).
The warnings I receive on W7 are very similar, if not identical, to those on XP64; they are along the lines of the following types, and there are over 1,600 of them, in addition to the 4 warnings listed earlier. Given my success running on XP64, and not on W7, I was assuming/hoping that these would not each need to be addressed individually, since they are only warnings.
Warning C4267: 'argument' : conversion from 'size_t' to 'int', possible loss of data. C:\Users\mcgrete\Documents\iCOBRA\pts\p312\exec\win64\6111\include\atr_StringBase.h 351 1 Reader
Warning C4018: '<' : signed/unsigned mismatch C:\Users\mcgrete\Documents\iCOBRA\pts\p312\exec\win64\6111\include\omi_BlkBitVectTrav.h 69 1 Reader
Warning C4244: 'initializing' : conversion from 'double' to 'float', possible loss of data. C:\Users\mcgrete\Documents\iCOBRA\pts\p312\exec\win64\6111\include\g3d_Vector.h 76 1 Reader
Warning C4244: 'initializing' : conversion from 'double' to 'float', possible loss of data. C:\Users\mcgrete\Documents\iCOBRA\pts\p312\exec\win64\6111\include\g3d_Vector.h 76 1 Reader
Warning C4800: 'int' : forcing value to bool 'true' or 'false' (performance warning). C:\Users\mcgrete\Documents\iCOBRA\pts\p312\exec\win64\6111\include\rgnC_Region.h 219 1 Reader
Warning LNK4006: "public: class ddr_ShortcutImpl const & __cdecl cow_COW,struct cow_Virtual > >::ConstGet(void)const " (?ConstGet#?$cow_COW#V?$ddr_ShortcutImpl#VkmaC_Material####U?$cow_Virtual#V?$ddr_ShortcutImpl#VkmaC_Material########QEBAAEBV?$ddr_ShortcutImpl#VkmaC_Material####XZ) already defined in ABQDDB_Odb_import.lib(ABQDDB_Odb.dll); second definition ignored C:\Users\mcgrete\Documents\iCOBRA\pts\p312\source\312i3.3\Reader\ABQSMAOdbCore_import.lib(ABQSMAOdbCore.dll) Reader
Warning LNK4221: This object file does not define any previously undefined public symbols, so it will not be used by any link operation that consumes this library. C:\Users\mcgrete\Documents\iCOBRA\pts\p312\source\312i3.3\Reader\ABQSMAOdbCore_import.lib(ABQSMAOdbCore.dll) Reader
Warning C4996: 'sprintf': This function or variable may be unsafe. Consider using sprintf_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details. C:\Users\mcgrete\Documents\iCOBRA\pts\p312\source\312i3.3\Query\Query.cpp 271 1 Query
Warning MSB8004: Output Directory does not end with a trailing slash. This build instance will add the slash as it is required to allow proper evaluation of the Output Directory. C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\Microsoft.CppBuild.targets 299 6 Query
Now to my request for help:
I must clarify that I am willing to dig into the warnings above in detail; however, I have not done so yet: before investing that effort in code I did not write, I am trying to understand the true root cause and then focus my efforts in that direction.
I was disappointed by the XP64 issues I experienced, and was unsure whether the configuration changes I made were genuinely required, or whether they were only a 'work-around' for an unidentified problem.
I expected that once the XP64 VS2010 version of the solution was operable, it would transfer to W7 without an issue, since the software built and ran fine with VS2008 on XP64. Is that a poor assumption? What might I be missing?
Should I consider modifying the configurations again, or is the root cause more likely associated with the warnings indicated above? If the warnings, why were they apparently non-issues in VS2008 - did changes in VS2010 effectively turn them into actual runtime errors where VS2008 luckily 'spared' me the pain?
I would appreciate any guidance and insight on how to proceed; from my limited experience, searches on the web suggest there were numerous compiler bugs or related issues in VS2010. I am not sure whether any are related to my issues, whether the numerous warnings are a real problem and the code needs quite a bit of cleaning up, or whether there are simply some configuration issues to deal with.
FYI - the latest update/SP to VS2010 that I have installed is VS10SP1-KB2736182.exe. I have also tried to use the debugger, but was unable to get it to stop at breakpoints in my Query or Reader project code, even while running VS2010 as administrator. W7 does have the .NET Framework 4.0 Multi-Targeting Pack installed, and my solution is configured to use the .NET Framework 4.0 Client Profile.
Thanks in advance!
UPDATE March 18, 2013
I didn't know how to reply to my own question, so here is an update.
I still could not get the debugger working, so I did it the old-fashioned way - I added various MessageBoxes to find where it was crashing.
A. The Main.vb program calls a function in the 'Query' project:
OdbQueryGetIncrement(str_out, vec_ptr)
B. Then the function executes all the way through, attempting to return a boolean... here is the code with some old-fashioned debugging added:
// Gets the next item in a list.
// Returns false if the vector is empty.
// NOTE: Once an element is returned it is removed from the list.
bool __stdcall OdbQueryGetItem(
    char* &str_out,   // RETURN Next item in list.
    void *vec_ptr,    // Pointer to the vector of pointers.
    int index)        // Index of the pointers vector to return the next item of.
{
    // Cast the pointer into an array of pointers
    std::vector<std::string>* *vec_temp = (std::vector<std::string>* *) vec_ptr;
    bool bool_out = false;
    char vectempsize[1000];
    int TEM1;
    char temp[1000];
    TEM1 = vec_temp[index]->size();
    // Check vector is valid
    if (vec_temp) {
        if (vec_temp[index]->size() >= index)
        {
            sprintf(temp, "value: %d\n", (int)bool_out);
            ::MessageBoxA(0, (LPCSTR) temp, (LPCSTR) "OdbQuery.dll - bool_out", MB_ICONINFORMATION);
            sprintf(temp, "value: %d\n", (int)index);
            ::MessageBoxA(0, (LPCSTR) temp, (LPCSTR) "OdbQuery.dll - index", MB_ICONINFORMATION);
            sprintf(vectempsize, "value: %d\n", (int)TEM1);
            ::MessageBoxA(0, (LPCSTR) temp, (LPCSTR) "OdbQuery.dll - index", MB_ICONINFORMATION);
        }
        if (!vec_temp[index]->empty()) {
            // Get the next item in the list
            std::string item = vec_temp[index]->front();
            // Initialise output string
            str_out = (char*)malloc( item.size()*sizeof(char) );
            sprintf(str_out, "%s", item.c_str());
            ::MessageBoxA(0, (LPCSTR) str_out, (LPCSTR) "hello", 0);
            // Remove the first item from the vector
            vec_temp[index]->erase(vec_temp[index]->begin());
            bool_out = true;
        }
    }
    sprintf(temp, "value: %d\n", (int)bool_out);
    ::MessageBoxA(0, (LPCSTR) temp, (LPCSTR) "OdbQuery.dll - bool_out", MB_ICONINFORMATION);
    return bool_out;
}
The code starts out with bool_out = false as expected (verified with the MessageBox value=0 output).
The code reads and outputs index = 2 with the MessageBox...
The code reads and outputs TEM1=vec_temp[index]->size() as a value=2 with the MessageBox...
The code outputs bool_out as true (value=1) with the MessageBox...
Then the code crashes. A MessageBox placed immediately after the line that calls the code above is never executed.
The output from VS2010 is "The program '[6892] ae312i3.3.exe: Managed (v4.0.30319)' has exited with code -2147483645 (0x80000003)."
I am lost as to why the execution would die while returning from this function.
Is there some possible issue with compiler settings or bugs?
Any help is appreciated!
MORE INFORMATION
Hello, I modified some settings on the Properties Page to attempt to get the debugger to give me more information. This has resulted in more information as follows:
Unhandled exception at 0x76e540f2 (ntdll.dll) in ae312i3.3.exe: 0xC0000374: A heap has been corrupted.
I am not familiar with how to go about fixing a heap problem; perhaps there is an issue with pointers in the existing code pointing to memory in use by another thread or program, a corrupted ntdll.dll file, or something else?
I tried rebooting the PC to see if that would help, though I had little hope for it... it didn't help.
I found an option in the debugger settings to 'Enable unmanaged code debugging'; I checked it, cleaned, rebuilt, and ran with debug...
The output was more descriptive:
HEAP[ae312i3.3.exe]: Invalid address specified to RtlSizeHeap( 0000000000220000, 000000002BC8BE58 )
Windows has triggered a breakpoint in ae312i3.3.exe.
This may be due to a corruption of the heap, which indicates a bug in ae312i3.3.exe or any of the DLLs it has loaded. This may also be due to the user pressing F12 while ae312i3.3.exe has focus.
It appears that when it crashes, C++ is returning a boolean variable to an expression of the form
While (myQueryFcn(inputvars))
So, is it failing to return a C++ bool to a VB Boolean? I do believe that the two have different representations (one uses True/False, the other an integer?). Could this be an issue? If so, why was it NOT an issue in VS2008?
I solved my own problem; the root cause was as follows.
Root Cause:
VisualBasic (VB) called C++.
VB created a string and sent it to C++. The previous developer also allocated memory in C++ for the same string.
When execution of the C++ code ended, C++ appears to have torn down the memory allocation established by both VB and C++.
Solution:
1. Removed the memory allocation in the C++ code (below):
str_out = (char*)malloc( (item.size()+1)*sizeof(char) );
2. Modified the VB code to use a StringBuilder type rather than a String:
Dim str_out As StringBuilder = New StringBuilder(5120)
See: return string from c++ function to VB .Net

What is your favorite R debugging trick? [duplicate]

I get an error when using an R function that I wrote:
Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: algorithm did not converge
What I have done:
Step through the function
Adding print to find out at what line the error occurs suggests two functions that should not use glm.fit. They are window() and save().
My general approaches include adding print and stop commands, and stepping through a function line by line until I can locate the exception.
However, it is not clear to me using those techniques where this error comes from in the code. I am not even certain which functions within the code depend on glm.fit. How do I go about diagnosing this problem?
I'd say that debugging is an art form, so there's no clear silver bullet. There are good strategies for debugging in any language, and they apply here too (e.g. read this nice article). For instance, the first thing is to reproduce the problem...if you can't do that, then you need to get more information (e.g. with logging). Once you can reproduce it, you need to reduce it down to the source.
Rather than a "trick", I would say that I have a favorite debugging routine:
When an error occurs, the first thing that I usually do is look at the stack trace by calling traceback(): that shows you where the error occurred, which is especially useful if you have several nested functions.
Next I will set options(error=recover); this immediately switches into browser mode where the error occurs, so you can browse the workspace from there.
If I still don't have enough information, I usually use the debug() function and step through the script line by line.
The best new trick in R 2.10 (when working with script files) is to use the findLineNum() and setBreakpoint() functions.
As a final comment: depending upon the error, it is also very helpful to set try() or tryCatch() statements around external function calls (especially when dealing with S4 classes). That will sometimes provide even more information, and it also gives you more control over how errors are handled at run time.
These related questions have a lot of suggestions:
Debugging tools for the R language
Debugging lapply/sapply calls
Getting the state of variables after an error occurs in R
R script line numbers at error?
The best walkthrough I've seen so far is:
http://www.biostat.jhsph.edu/%7Erpeng/docs/R-debug-tools.pdf
Anybody agree/disagree?
As was pointed out to me in another question, Rprof() and summaryRprof() are nice tools to find slow parts of your program that might benefit from speeding up or moving to a C/C++ implementation. This probably applies more if you're doing simulation work or other compute- or data-intensive activities. The profr package can help visualizing the results.
I'm on a bit of a learn-about-debugging kick, so another suggestion from another thread:
Set options(warn=2) to treat warnings like errors
You can also use options to drop you right into the heat of the action when an error or warning occurs, using your favorite debugging function of choice. For instance:
Set options(error=recover) to run recover() when an error occurs, as Shane noted (and as is documented in the R debugging guide), or any other handy function you would find useful to have run.
And another two methods from one of @Shane's links:
Wrap an inner function call with try() to return more information on it.
For *apply functions, use .inform=TRUE (from the plyr package) as an option to the apply command
@JoshuaUlrich also pointed out a neat way of using the conditional abilities of the classic browser() command to turn debugging on and off:
Put browser(expr=isTRUE(getOption("myDebug"))) inside the function you might want to debug
And set the global option with options(myDebug=TRUE)
You could even wrap the browser call: myBrowse <- function() browser(expr=isTRUE(getOption("myDebug"))) and then call it with myBrowse(), since it uses globals.
Then there are the new functions available in R 2.10:
findLineNum() takes a source file name and line number and returns the function and environment. This seems to be helpful when you source() a .R file and it returns an error at line #n, but you need to know what function is located at line #n.
setBreakpoint() takes a source file name and line number and sets a breakpoint there
The codetools package, and particularly its checkUsage function can be particularly helpful in quickly picking up syntax and stylistic errors that a compiler would typically report (unused locals, undefined global functions and variables, partial argument matching, and so forth).
setBreakpoint() is a more user-friendly front-end to trace(). Details on the internals of how this works are available in a recent R Journal article.
If you are trying to debug someone else's package, once you have located the problem you can overwrite their functions with fixInNamespace and assignInNamespace, but do not use these in production code.
None of this should preclude the tried-and-true standard R debugging tools, some of which are above and others of which are not. In particular, the post-mortem debugging tools are handy when you have a time-consuming bunch of code that you'd rather not re-run.
Finally, for tricky problems which don't seem to throw an error message, you can use options(error=dump.frames) as detailed in this question:
Error without an error being thrown
At some point, glm.fit is being called. That means one of the functions you call, or one of the functions called by those functions, is using either glm or glm.fit.
Also, as I mention in my comment above, that is a warning, not an error, which makes a big difference. You can't trigger any of R's debugging tools from a warning (with default options, before someone tells me I am wrong ;-).
If we change the options to turn warnings into errors then we can start to use R's debugging tools. From ?options we have:
‘warn’: sets the handling of warning messages. If ‘warn’ is
negative all warnings are ignored. If ‘warn’ is zero (the
default) warnings are stored until the top-level function
returns. If fewer than 10 warnings were signalled they will
be printed otherwise a message saying how many (max 50) were
signalled. An object called ‘last.warning’ is created and
can be printed through the function ‘warnings’. If ‘warn’ is
one, warnings are printed as they occur. If ‘warn’ is two or
larger all warnings are turned into errors.
So if you run
options(warn = 2)
then run your code, R will throw an error. At which point, you could run
traceback()
to see the call stack. Here is an example.
> options(warn = 2)
> foo <- function(x) bar(x + 2)
> bar <- function(y) warning("don't want to use 'y'!")
> foo(1)
Error in bar(x + 2) : (converted from warning) don't want to use 'y'!
> traceback()
7: doWithOneRestart(return(expr), restart)
6: withOneRestart(expr, restarts[[1L]])
5: withRestarts({
.Internal(.signalCondition(simpleWarning(msg, call), msg,
call))
.Internal(.dfltWarn(msg, call))
}, muffleWarning = function() NULL)
4: .signalSimpleWarning("don't want to use 'y'!", quote(bar(x +
2)))
3: warning("don't want to use 'y'!")
2: bar(x + 2)
1: foo(1)
Here you can ignore the frames marked 4: and higher. We see that foo called bar and that bar generated the warning. That should show you which functions were calling glm.fit.
If you now want to debug this, we can turn to another option to tell R to enter the debugger when it encounters an error, and as we have made warnings errors we will get a debugger when the original warning is triggered. For that you should run:
options(error = recover)
Here is an example:
> options(error = recover)
> foo(1)
Error in bar(x + 2) : (converted from warning) don't want to use 'y'!
Enter a frame number, or 0 to exit
1: foo(1)
2: bar(x + 2)
3: warning("don't want to use 'y'!")
4: .signalSimpleWarning("don't want to use 'y'!", quote(bar(x + 2)))
5: withRestarts({
6: withOneRestart(expr, restarts[[1]])
7: doWithOneRestart(return(expr), restart)
Selection:
You can then step into any of those frames to see what was happening when the warning was thrown.
To reset the above options to their default, enter
options(error = NULL, warn = 0)
As for the specific warning you quote, it is highly likely that you need to allow more iterations in the code. Once you've found out what is calling glm.fit, work out how to pass it the control argument using glm.control - see ?glm.control.
So browser(), traceback() and debug() walk into a bar, but trace() waits outside and keeps the motor running.
By inserting browser somewhere in your function, the execution will halt and wait for your input. You can move forward using n (or Enter), run the entire chunk (iteration) with c, finish the current loop/function with f, or quit with Q; see ?browser.
With debug, you get the same effect as with browser, but this stops the execution of a function at its beginning. Same shortcuts apply. This function will be in a "debug" mode until you turn it off using undebug (that is, after debug(foo), running the function foo will enter "debug" mode every time until you run undebug(foo)).
A more transient alternative is debugonce, which will remove the "debug" mode from the function after the next time it's evaluated.
traceback will give you the flow of execution of functions all the way up to where something went wrong (an actual error).
You can insert code bits (i.e. custom functions, for example browser) into functions using trace. This is useful for functions from packages when you're too lazy to get at the nicely folded source code.
My general strategy looks like:
Run traceback() to look for obvious issues
Set options(warn=2) to treat warnings like errors
Set options(error=recover) to step into the call stack on error
After going through all the steps suggested here I just learned that setting .verbose = TRUE in foreach() also gives me tons of useful information. In particular foreach(.verbose=TRUE) shows exactly where an error occurs inside the foreach loop, while traceback() does not look inside the foreach loop.
Mark Bravington's debugger which is available as the package debug on CRAN is very good and pretty straight forward.
library(debug);
mtrace(myfunction);
myfunction(a,b);
#... debugging, can query objects, step, skip, run, breakpoints etc..
qqq(); # quit the debugger only
mtrace.off(); # turn off debugging
The code pops up in a highlighted Tk window so you can see what's going on and, of course you can call another mtrace() while in a different function.
HTH
I like Gavin's answer: I did not know about options(error = recover). I also like to use the 'debug' package that gives a visual way to step through your code.
require(debug)
mtrace(foo)
foo(1)
At this point it opens up a separate debug window showing your function, with a yellow line showing where you are in the code. In the main window the code enters debug mode, and you can keep hitting enter to step through the code (and there are other commands as well), and examine variable values, etc. The yellow line in the debug window keeps moving to show where you are in the code. When done with debugging, you can turn off tracing with:
mtrace.off()
Based on the answer I received here, you should definitely check out the options(error=recover) setting. When this is set, upon encountering an error, you'll see text on the console similar to the following (traceback output):
> source(<my filename>)
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf
Enter a frame number, or 0 to exit
1: source(<my filename>)
2: eval.with.vis(ei, envir)
3: eval.with.vis(expr, envir, enclos)
4: LinearParamSearch(data = dataset, y = data.frame(LGD = dataset$LGD10), data.names = data
5: LinearParamSearch.R#66: plot(x = x, y = y.data, xlab = names(y), ylab = data.names[i])
6: LinearParamSearch.R#66: plot.default(x = x, y = y.data, xlab = names(y), ylab = data.nam
7: LinearParamSearch.R#66: localWindow(xlim, ylim, log, asp, ...)
8: LinearParamSearch.R#66: plot.window(...)
Selection:
At which point you can choose which "frame" to enter. When you make a selection, you'll be placed into browser() mode:
Selection: 4
Called from: stop(gettextf("replacement has %d rows, data has %d", N, n),
domain = NA)
Browse[1]>
And you can examine the environment as it was at the time of the error. Type c to bring you back to the frame selection menu; when you're done, as it tells you, type 0 to exit.
I gave this answer to a more recent question, but am adding it here for completeness.
Personally I tend not to use functions to debug. I often find that this causes as much trouble as it solves. Also, coming from a Matlab background I like being able to do this in an integrated development environment (IDE) rather than doing this in the code. Using an IDE keeps your code clean and simple.
For R, I use an IDE called RStudio (http://www.rstudio.com), which is available for Windows, Mac, and Linux and is pretty easy to use.
Versions of RStudio since about October 2013 (0.98ish?) can add breakpoints in scripts and functions: just click on the left margin of the file to add a breakpoint. You can set a breakpoint and then step through from that point on. You also have access to all of the data in that environment, so you can try out commands.
See http://www.rstudio.com/ide/docs/debugging/overview for details. If you already have RStudio installed, you may need to upgrade - this is a relatively new (late 2013) feature.
You may also find other IDEs that have similar functionality.
Admittedly, if it's a built-in function you may have to resort to some of the suggestions made by other people in this discussion. But, if it's your own code that needs fixing, an IDE-based solution might be just what you need.
To debug Reference Class methods without an instance reference:
ClassName$trace(methodName, browser)
I am beginning to think that not printing the error line number - a most basic requirement - BY DEFAULT is some kind of a joke in R/RStudio. The only reliable method I have found to locate where an error occurred is to make the additional effort of calling traceback() and looking at the top line.
