I am trying to debug an intermittent parallel build issue in my cmake build system around some generated files. It is however difficult to reliably test or reproduce the issue.
Does anyone know any way to exacerbate or sensitise such issues? Or other strategies for debugging them?
It is likely a missing add_dependencies to force one target to build completely before another begins, or an add_custom_command output that is used in more than one library.
If both libraries start building at the same time, and they both trigger running the custom command at the same time, then you'll get two competing custom commands running, and they may overwrite each other's results, or intermingle results.
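One way to sensitise a race like this is to rebuild from clean many times at a high job count and stop at the first failure. The following is only a sketch: the function name stress_build, the build directory, and the job count are assumptions, and it presumes the clean target removes the generated files.

```python
import subprocess


def stress_build(build_cmd, clean_cmd, runs=20):
    """Clean and rebuild up to `runs` times.

    Returns (run_number, combined_output) for the first failing build,
    or None if every run succeeded.
    """
    for run in range(1, runs + 1):
        # Start from a clean tree so the generated files are produced again.
        subprocess.run(clean_cmd, capture_output=True, text=True)
        result = subprocess.run(build_cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return run, result.stdout + result.stderr
    return None


# Hypothetical invocation (needs CMake 3.12+ for --parallel):
#   stress_build(
#       ["cmake", "--build", "build", "--parallel", "32"],
#       ["cmake", "--build", "build", "--target", "clean"],
#   )
```

Raising the job count well above the core count, or running the loop on a loaded machine, changes the scheduling and may make the collision more frequent. A failure that never shows up this way is not proof that the race is gone.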
Is your code public? Can you post it for others to inspect?
One good strategy is simply exposing it to other developers for "more eyes"...
I have been using Clojure, ClojureScript, lein, shadow-cljs, re-frame, reagent, Emacs, and CIDER to work on a Clojure/ClojureScript dynamic web app project.
Usually, I build the project by executing shadow-cljs watch app. It works fine. I can use the application and watch changes.
Currently, I am working on a Continuous Integration project via GitHub Actions. In this new environment, I want to use a different command: shadow-cljs compile app.
But I am having problems, both in the GitHub Actions environment and in my local environment (which reproduces the steps run on GitHub Actions).
The result returned by the command is quite weird. First, it says the compilation went well (Build completed), and the terminal displays some harmless warnings:
shadow-cljs compile app
shadow-cljs - config: /Users/pedro/projects/my_project/shadow-cljs.edn
WARNING: random-uuid already refers to: #'clojure.core/random-uuid in namespace: portal.runtime.browser, being replaced by: #'portal.runtime.browser/random-uuid
[:app] Compiling ...
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
WARNING: abs already refers to: #'clojure.core/abs in namespace: day8.re-frame-10x.inlined-deps.garden.v1v3v10.garden.color, being replaced by: #'day8.re-frame-10x.inlined-deps.garden.v1v3v10.garden.color/abs
WARNING: abs already refers to: #'clojure.core/abs in namespace: garden.color, being replaced by: #'garden.color/abs
[:app] Build completed. (1121 files, 27 compiled, 8 warnings, 45.85s)
------ WARNING #1 - :redef -----------------------------------------------------
(... omitted ...)
Unfortunately, the terminal stays like this forever. It behaves like the watch command... But, compile is not supposed to be like this.
The terminal does not finish the process. There is no exit. Even though the build is complete, as stated in the last line before WARNING.
Feels like a deadlock situation. The terminal is still responsive because I can cancel the process with Control-c. But, besides canceling it, it is frozen. This can be catastrophic on GitHub actions since you are going to be paying for an ongoing process.
In shadow-cljs.edn, I tried commenting out :compiler-options:
;:compiler-options {:optimizations :none}
But it did not make any difference. The same problem happens with shadow-cljs release app.
After finding a GitHub issue with a similar problem, I also tried:
:js-options {:resolve {"highlight.js" {:target :npm :require "highlight.js/lib/core"}}}
One of my hypotheses is that highlight.js is causing some trouble while parsing strings and/or files...
I am also afraid the problem could be some REPL being fired up by some dependency and blocking the process for the exit... But I am not sure where to look for it.
Why is this happening? What could be causing this? How can I solve it?
;;;;
UPDATE:
After @thomasheller's kind answer, I tried some things. I believe all of the hints can be discarded for the present situation, except for the macros:
1 - The project does not use user.clj
2 - Build hooks is the default setting which is basically a comment:
:build-hooks [;; this will create a build report for every release build
              ;; which includes a detailed breakdown of the included sources
              ;; and how much they each contributed to the overall size:
              #_(shadow.cljs.build-report/hook
                 {:output-to "build-reports/report.html"})]
3 - There is no shadow-cljs start, server, or watch process running before I try to compile. The endless compile app problem happens right after restarting my MacBook, on the CI, and on other people's PCs... So, it is unrelated.
Ok. Now, let's talk about the macros... The project's main repository has 6 defmacros - according to a git grep (I have not checked the dependencies).
I am suspicious about one of the macros which is involved in reading files:
(defmacro slurp [file & [default]]
  (if (.exists (io/file file))
    (clojure.core/slurp file)
    default))
Why does this exist? Well, it is being used inside another macro that reads config files related to environment variables:
(defmacro read-open-config [env-var]
  (clojure.edn/read-string (slurp (str "config/" (System/getenv env-var) "-open.edn"))))
We have multiple config files for different purposes (I know it is not the standard practice...).
It feels like too much dynamicity...
First of all a few clarifications.
compile produces a development build, no optimizations are applied. Setting anything related to that does nothing.
CI systems should likely use release. This avoids including all the development-related code (e.g. highlight.js, re-frame-10x).
A dependency cannot "fire up a REPL". The shadow-cljs "server mode" provides it, which compile or release don't normally enter.
highlight.js errors would fail earlier with a visible error, you would not get to "Build completed."
This could be happening for a variety of reasons, very hard to debug without seeing the build config/setup. A few guesses:
You have a user.clj on the classpath, which starts additional "stuff" when the process is launched and this namespace is loaded? Clojure will load this unconditionally and shadow-cljs cannot prevent whatever that may do.
You have configured :build-hooks that launch additional stuff. One of them could launch an additional CSS watch process that doesn't exit?
You use a macro that launches additional stuff without shutting it down?
You previously started a shadow-cljs start, server or watch process? Coupled with the above that may then wait for the server to stop. On your local machine is the watch maybe still running? You can prevent the attempted server connect via npx shadow-cljs release build --force-spawn, but that is only useful if it is actually running.
Normally shadow-cljs will shut down after compile or release. I'm not aware of anyone ever having any issues with this after a successful build. The issue you linked is not related to yours, since that never got to a completed build.
After a long investigation, it was discovered that the problem came from tests inside a private dependency. Apparently, some tests involving asynchronous processes were keeping the compilation process from finishing.
After commenting out some old tests, the compilation finished as expected.
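The thread does not say how the tests kept the process alive. One common mechanism, shown here as an illustration in Python rather than on the JVM, is a background thread that is not marked as a daemon: the runtime waits for it even after the main work has finished. The names CHILD and exits_within are made up for this sketch.

```python
import subprocess
import sys

# Child program: starts one background thread that blocks forever,
# then lets the main thread finish immediately.
CHILD = """
import sys, threading
stop = threading.Event()
worker = threading.Thread(target=stop.wait, daemon=(sys.argv[1] == "daemon"))
worker.start()
print("main finished", flush=True)
"""


def exits_within(mode, seconds):
    """Return True if the child process exits on its own within `seconds`."""
    try:
        subprocess.run(
            [sys.executable, "-c", CHILD, mode],
            timeout=seconds,
            capture_output=True,
        )
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising.
        return False
```

With mode "daemon" the child exits right after printing; with any other mode it prints the same line and then hangs, which resembles a build that reports completion but never returns to the shell.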
When using CSScript.Evaluator.Reset(), will this reset anyone else that is currently using the same script or build?
Another way of asking my question.
script = CSScript.Evaluator.LoadCode(scriptString);
Suppose another user came through and called the same code, but an error occurred.
Would the original compile still be good and safe?
Also CSScript.Evaluator.AutoResetEvaluatorOnError = true, this does not seem to be working.
After I cause an error in the code and then fix it, the compile will not work until I reset my app or use Reset().
Using Reset works, but that's the reason for my initial question.
I am using CSScript.Evaluator.LoadCode and looking at CSScript.Evaluator.LoadMethod, but getting the same issues.
These are not huge scripts, but may run in large batches or loops.
I am not against unique naming, because I will have a build for every run anyway.
But I'm not sure the cache is working either.
Where is the location of the cache folder, when CSScript.CacheEnabled is enabled?
Would the original compile still be good and safe?
Yes, it would: once you hold the reference to the compiled object, it is YOURS. It is good even if you destroy the compiler.
Also CSScript.Evaluator.AutoResetEvaluatorOnError = true, this does not seem to be working.
This setting triggers a so-called SoftReset, which differs from Reset only in the re-referencing of the assemblies and the re-creation of the CompilerSettings object. At the time of the initial implementation, SoftReset was sufficient to fully clear Mono.Evaluator. I will need to check; maybe that is no longer the case. I will let you know the outcome.
Where is the location of the cache folder, when CSScript.CacheEnabled is enabled?
The caching doesn't cover the Mono Evaluator, as all the assemblies are in memory and cannot be cached.
Everything accessed through CSScript.Evaluator.* applies to the Mono compiler, and everything accessed through CSScript.* uses the CodeDOM compiler, which does implement a caching mechanism.
In my attempt to compile GCC I noticed that while ./configure doesn't yield error messages and returns an error code of 0, there are still errors logged in config.log, which do later on cause make to fail. So, why doesn't configure fail already? Or does make modify config.log later on?
config.log contains the output of all configure probes. Some of them are expected to fail. For example, frequently Autoconf probes for several different possible alternative implementations of particular functionality, and some of them are expected to fail depending on the characteristics of your system.
It's therefore up to the author of the Autoconf configure.ac script to explicitly fail the configure step if the results are not viable. Some people do this when writing their configure.ac and some don't. Sometimes it can be quite hard to know at configure time whether a particular set of findings are viable. There's also a reasonable argument that it's easier to diagnose problems during the build, later on, than to issue an error message from configure and make people search through config.log for the details. That's particularly the case if the problems are relatively obscure.
The short answer is that configure didn't fail because the people who wrote the configure script you're running didn't program it to fail for the specific errors that you're seeing, for one reason or another.
I was wondering if there is a quick and effective way to remove all the unused variables (local, instance, even properties) in xcode... I am doing a code cleanup on my app and if I knew a quick way for code refactoring it would help me a lot...
Thanks...
It's been a long time since you asked your question and maybe you found an answer already, but from an answer to a related question:
For static analysis, I strongly recommend the Clang Static Analyzer (which is happily built into Xcode 3.2 on Snow Leopard). Among all its other virtues, this tool can trace code paths and identify chunks of code that cannot possibly be executed, and should either be removed or the surrounding code should be fixed so that it can be called.
For dynamic analysis, I use gcov (with unit testing) to identify which code is actually executed. Coverage reports (read with something like CoverStory) reveal un-executed code, which, coupled with manual examination and testing, can help identify code that may be dead. You do have to tweak some settings and run gcov manually on your binaries. I used this blog post to get started.
Both approaches do what you want: they detect unused code (both variables and methods) so that it can be removed.
I have a workspace for running an H.263 Video Encoder in a loop 31 times, i.e. main is executed 31 times to generate 31 different encoded bit streams. This MS Visual Studio 2005 workspace has all C source files. When I create a "DEBUG" configuration for the workspace, then build and execute it, it runs fine, i.e. it generates all 31 output files as expected.
But when I set the configuration of the workspace to "RELEASE" mode and repeat the process, the encoder crashes at some test case run.
To debug this, I did the following:
Analyzed the code to see if any variable initialization was being missed in every run of the encoder.
Checked the various workspace (solution) options in both modes (DEBUG and RELEASE).
There are some obvious differences, but I explicitly set the optimization-related options to be the same in both modes.
But I still could not nail the problem and find a fix for it. Any pointers?
-Ajit.
It's hard to say what the problem might be without carefully inspecting the code. However...
One of the differences between debug and release builds is how the function call stack frame is set up. There are certain classes of bad things you can do (like calling a function with the wrong number of arguments) that are not fatal in a debug build but crash horribly in a release build. Perhaps you could try changing the stack frame related options (I forget what they're called, sorry) in the release build to the same as the debug build and see whether that helps.
Another thing might be to enable all the warnings you possibly can, and fix them all.
Could be a concurrency problem of two threads. The DEBUG configuration slows the execution down, so the problem does not occur. But, only a guess.
Interesting problem.. Are you sure you have no conditional compilation code lurking around that is not being compiled in release mode? i.e:
#if (DEBUG)
// Debug Code here
#else
// Release Code here
#endif
That's the only thing I can really think of. I have never experienced anything like this myself.
Can you add the debug symbols to the release build and run it in the debugger to see where and why it crashed?
Yeah, those bastard crashes are the hardest to fix. Fortunately, there are some steps you can take that will give you clues before you resort to manually reading the code and hoping to find the needle.
When does it crash? At every test? At a specific test? What does that test do that the others don't?
What's the error? If it's an access violation, is there a pattern to where it happens? If the addresses are low, it might mean there is an uninitialised pointer somewhere.
Is the program crashing with Debug configuration but without the debugger attached? If so, it's most likely a thread synchronisation problem as John Smithers pointed out.
Have you tried running the code through an analyser such as Purify? It's slow but it's usually worth the wait.
Try to debug the release configuration anyway. It will only show disassembly, but that can still give you an indication of what happens, such as whether the code pointer jumps into the middle of garbage or hits a breakpoint in an external library.
Are you on an Intel architecture? If not, watch for memory alignment errors: they hard-crash without warning on some architectures, and codec algorithms tend to create those situations a lot since they are heavily optimized.
Are you sure there are no preprocessor directives that, say, exclude some really important code in Release mode but include it in Debug?
Also, have you implemented any logging that might point out to the precise assembly that's throwing the error?
I would look at the crash in more detail - if it's crashing in a test case, then it sounds pretty easily reproducible, which is usually most of the challenge.
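Another thing to consider: in debug mode (with MSVC's run-time checks enabled), uninitialized local variables are filled with 0xCCCCCCCC, whereas in release mode they are left uninitialized and hold whatever happened to be in memory, which is not necessarily zero. Code that accidentally depends on either value can behave differently between the two builds, and that might have some nasty side effects.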
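To make the fill pattern concrete, here is a small sketch that reinterprets four 0xCC bytes the way a 32-bit little-endian C program would read them. The helper names are illustrative only.

```python
import struct

# Four bytes of the debug fill pattern, as they would sit in memory.
FILL = b"\xcc" * 4


def as_int32(raw):
    """Interpret 4 little-endian bytes as a signed 32-bit integer."""
    return struct.unpack("<i", raw)[0]


def as_uint32(raw):
    """Interpret 4 little-endian bytes as an unsigned 32-bit integer."""
    return struct.unpack("<I", raw)[0]


def as_float32(raw):
    """Interpret 4 little-endian bytes as an IEEE 754 single-precision float."""
    return struct.unpack("<f", raw)[0]


# as_int32(FILL)   -> -858993460
# as_uint32(FILL)  -> 3435973836 (0xCCCCCCCC, a non-null "pointer")
# as_float32(FILL) -> -107374176.0
```

None of these is a plausible counter, size, or valid pointer, so an uninitialized variable tends to misbehave in the same recognizable way on every debug run, while a release build sees whatever the previous stack contents were.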