I have an application in which each test should run in a VM whose configuration has been conditioned by a test-specific System property (which can just be based on the test class name).
Something like this sort of works:
test {
forkEvery 1
maxParallelForks Runtime.runtime.availableProcessors()
beforeSuite { TestDescriptor descriptor ->
systemProperty( 'test.class.name', descriptor.getClassName() )
}
}
But it doesn't quite work. The names the forked JVMs see do change, but they aren't the ones I expect, and I suspect they aren't even deterministic. It seems as though there is one shared JavaForkOptions item, and that calls to beforeSuite aren't in a deterministic and exclusive way linked to the forking of a process for that suite, so the name a process gets might not match up with the one set in "its" beforeSuite call.
Is there a better way to do this, or some way to get more precise control over the forking process so that System properties could be set on a fork-specific data structure?
Thanks for any help!
Related
In order to discover Linux namespaces under certain conditions my open source Golang package lxkns needs to re-execute the application it is used in as a new child process in order to be able to switch mount namespaces before the Golang runtime spins up. The way Linux mount namespaces work makes it impossible to switch them from Golang applications after the runtime has spun up OS threads.
This means that the original process "P" re-runs a copy of itself as a child "C" (reexec package), passing a special indication via the child's environment which signals to the child to only run a specific "action" function belonging to the included "lxkns" package (see below for details), instead of running the whole application normally (avoiding endless recursively spawning children).
forkchild := exec.Command("/proc/self/exe")
forkchild.Start()
...
forkchild.Wait()
At the moment, I invoke the coverage tests from VisualStudio Code, which runs:
go test -timeout 30s -coverprofile=/tmp/vscode-goXXXXX/go-code-cover github.com/thediveo/lxkns
So, "P" re-executes a copy "C" of itself, and tells it to run some action "A", print some result to stdout, and then to immediately terminate. "P" waits for "C"'s output, parses it, and then continues in its program flow.
The module test uses Ginkgo/Gomega and a dedicated TestMain in order to catch when the test gets re-executed as a child in order to run only the requested "action" function.
package lxkns
import (
"os"
"testing"
. "github.com/onsi/ginkgo"
. "github.com/onsi/gomega"
"github.com/thediveo/gons/reexec"
)
func TestMain(m *testing.M) {
// Ensure that the registered handler is run in the re-executed child. This
// won't trigger the handler while we're in the parent, because the
// parent's Arg[0] won't match the name of our handler.
reexec.CheckAction()
os.Exit(m.Run())
}
func TestLinuxKernelNamespaces(t *testing.T) {
RegisterFailHandler(Fail)
RunSpecs(t, "lxkns package")
}
I would like to also create code coverage data from the re-executed child process.
Is it possible to enable code coverage from within the program under test itself, and how so?
Is it possible to then append the code coverage data written by the child to the coverage data of the parent process "P"?
Does the Golang runtime only write the coverage data at exit and does it overwrite the file specified, or does it append? (I would already be glad for a pointer to the corresponding runtime sources.)
Note: switching mount namespaces won't conflict with creating coverage files in the new mount namespaces in my test cases. The reason is that these test mount namespaces are copies of the initial mount namespaces, so creating a new file will show up also in the filesystem normally.
After #Volker's comment on my Q I knew I had to take the challenge and went straight for the source code of Go's testing package. While #marco.m's suggestion is helpful in many cases, it cannot handle my admittedly slightly bizare usecase. testing's mechanics relevant to my original question are as follows, heavily simplified:
cover.go: implements coverReport() which writes a coverage data file (in ASCII text format); if the file already exists (stale version from a previous run), then it will be truncated first. Please note that coverReport() has the annoying habit of printing some “statistics” information to os.Stdout.
testing.go:
gets the CLI arguments -test.coverprofile= and -test.outputdir= from os.Args (via the flags package). If also implements toOutputDir(path) which places cover profile files inside -test.outputdir if specified.
But when does coverReport() get called? Simply spoken, at the end of testing.M.Run().
Now with this knowledge under the belt, a crazy solutions starts to emerge, kind of "Go-ing Bad" ;)
Wrap testing.M in a special re-execution enabled version reexec.testing.M: it detects whether it is running with coverage enabled:
if it is the "parent" process P, then it runs the tests as normal, and then it collects coverage profile data files from re-executed child processes C and merges them into P's coverage profile data file.
while in P and when just about to re-execute a new child C, a new dedicated coverage profile data filename is allocated for the child C. C then gets the filename via its "personal" -test.coverprofile= CLI arg.
when in C, we run the desired action function. Next, we need to run an empty test set in order to trigger writing the coverage profile data for C. For this, the re-execution function in P adds a test.run= with a very special "Bielefeld test pattern" that will most likely result in an empty result. Remember, P will -- after it has run all its tests -- pick up the individual C coverage profile data files and merge them into P's.
when coverage profiling isn't enabled, then no special actions need to be taken.
The downside of this solution is that it depends on some un-guaranteed behavior of Go's testing with respect to how and when it writes code coverage reports. But since a Linux-kernel namespace discovery package already pushes Go probably even harder than Docker's libnetwork, that's just a quantum further over the edge.
To a test developer, the whole enchilada is hidden inside an "enhanced" rxtst.M wrapper.
import (
"testing"
rxtst "github.com/thediveo/gons/reexec/testing"
)
func TestMain(m *testing.M) {
// Ensure that the registered handler is run in the re-executed child.
// This won't trigger the handler while we're in the parent. We're using
// gons' very special coverage profiling support for re-execution.
mm := &rxtst.M{M: m}
os.Exit(mm.Run())
}
Running the whole lxkns test suite with coverage, preferably using go-acc (go accurate code coverage calculation), then shows in the screenshot below that the function discoverNsfsBindMounts() was run once (1). This function isn't directly called from anywhere in P. Instead, this function is registered and then run in a re-executed child C. Previously, no code coverage was reported for discoverNsfsBindMounts(), but now with the help of package github.com/thediveo/gons/reexec/testing code coverage for C is transparently merged in P's code coverage.
Recently our build times have been fluxuating wildly. Our baseline was between 1-2mins, but now it is sometimes taking up to 20 mins. I have been trying to track down the cause, but am having some trouble. It feels like the changes that trigger fluxuation in the build time are completely arbitrary, or at least I can not find the underlying connection between them.
Here is what the current build time looks like:
Now this obviously looks like the issue lies in Task Execution time, so lets look at that in more detail:
This indicates that the test task for the umlgen-transformer project is the culprit. The only difference between this project and the one that built in ~1min, is one tiny xml file. That xml file is read in by each of the unit tests as input. Here is what a typical test looks like:
#Test
def void testComponentNamesUpdated() {
loadModel("/testData/ProducerConsumer.uml")
val trafo = new ComponentTransformation
trafo.execute(model)
val componentNames = model.allOwnedElements.filter(Component).map[name]
assertThat(componentNames).contains(#{"ProducerComponent", "ConsumerComponent"})
}
(the .uml file is just a UML model stored as xml). There are several reasons why I don't think this is the underlying cause of our increases in build times: 1) the xml file is incredily small, ~200kb 2) the behavior in the unit tests is extremly simple and executes quite fast 3) when the unit tests are run through eclipse they execute in seconds 4) the build times of unrelated aspects, such as the time to compile kotlin code, have also seen a significant increase 5) this is not the only project we are having this issue with.
What I have tried:
Setting org.gradle.parallel = true
Increasing the memory of gradle deamon to 3gb
Updating to gradle 5
Clearing all gradle caches
Building without the gradle daemon
Simplifying the build.gradle scripts. I have removed several plugins and tasks to rule them out as possible causes
I have only been using Gradle for a couple months now, so I am hoping to get feedback on more conclusive ways to find out what gradle is doing (or to find some way to cross it off the list of possible causes). Please let me know if any additional information would be helpful.
My Gradle build has two task:
findRevision(type: SvnInfo)
buildWAR(type: MavenExec, dependsOn: findRevision)
Both tasks are configuration based, but the buildWAR task depends on a project property that is only defined in the execution phase of the findRevision task.
This breaks the process, as Gradle cannot find said property at the time it tries to configure the buildWAR task.
Is there any way to delay binding or configuration until another task has executed?
In this specific case I can make use of the mavenexec method instead of the MavenExec task type, but what should be done in similar scenarios where no alternative method exists?
Depending on what configuration option exactly you want to change, you might change it in the execution phase of the task with buildWAR.doFirst { }. But generally this is a really bad idea. If you e. g. change something that influences the result of the UP-TO-DATE checks like input files for example, the task might execute though it would not be necessary or even worse do not execute thoug it would be necessary. You can of course make the task always execute to overcome this with outputs.upToDateWhen { false }, but there might be other problems and also this way you disable one of Gradles biggest strenghts.
It is a much better idea to redesign your build so that this is not necessary. For example determining the revision at configuration time already. Depending on whether the task needs much time this might be a viable solution or not. Also depending on what you want to do with the revision, you might consider the suggestion of #LanceJava and make your findRevision task generate a file with the revision in it that is then packaged into the WAR and used at runtime.
After building my final output file with Gradle I want to do 2 things. Update a local version.properties file and copy the final output final to some specific directory for archiving. Let's assume I already have 2 methods implemented that do exactly what I just described, updateVersionProperties() and archiveOutputFile().
I'm know wondering what's the best way to do this...
Alternative A:
assembleRelease.doLast {
updateVersionProperties()
archiveOutputFile()
}
Alternative B:
task myBuildTask(dependsOn: assembleRelease) << {
updateVersionProperties()
archiveOutputFile()
}
And here I would call myBuildTask instead of assembleRelease as in alternative A.
Which one is the recommended way of doing this and why? Is there any advantage of one over the other? Would like some clarification please... :)
Whenever you can, model new activities as separate tasks. (In your case, you might add two more tasks.) This has many advantages:
Better feedback as to which activity is currently executing or failed
Ability to declare task inputs and outputs (reaping all benefits that come from this)
Ability to reuse existing task types
More possibilities for Gradle to execute tasks in parallel
Etc.
Sometimes it isn't easily possible to model an activity as a separate task. (One example is when it's necessary to post-process the outputs of an existing task in-place. Doing this in a separate task would result in the original task never being up-to-date on subsequent runs.) Only then the activity should be attached to an existing task with doLast.
Too often people write tests that don't clean up after themselves when they mess with state. Often this doesn't matter since objects tend to be torn down and recreated for most tests, but there are some unfortunate cases where there's global state on objects that persist for the entire test run, and when you run tests, that depend on and modify that global state, in a certain order, they fail.
These tests and possibly implementations obviously need to be fixed, but it's a pain to try to figure out what's causing the failure when the tests that affect each other may not be the only things in the full test suite. It's especially difficult when it's not initially clear that the failures are order dependent, and may fail intermittently or on one machine but not another. For example:
rspec test1_spec.rb test2_spec.rb # failures in test2
rspec test2_spec.rb test1_spec.rb # no failures
In RSpec 1 there were some options (--reverse, --loadby) for ordering test runs, but those have disappeared in RSpec 2 and were only minimally helpful in debugging these issues anyway.
I'm not sure of the ordering that either RSpec 1 or RSpec 2 use by default, but one custom designed test suite I used in the past randomly ordered the tests on every run so that these failures came to light more quickly. In the test output the seed that was used to determine ordering was printed with the results so that it was easy to reproduce the failures even if you had to do some work to narrow down the individual tests in the suite that were causing them. There were then options that allowed you to start and stop at any given test file in the order, which allowed you to easily do a binary search to find the problem tests.
I have not found any such utilities in RSpec, so I'm asking here: What are some good ways people have found to debug these types of order dependent test failures?
There is now a --bisect flag that will find the minimum set of tests to run to reproduce the failure. Try:
$ rspec --bisect=verbose
It might also be useful to use the --fail-fast flag with it.
I wouldn't say I have a good answer, and I'd love to here some better solutions than mine. That said...
The only real technique I have for debugging these issues is adding a global (via spec_helper) hook for printing some aspect of database state (my usual culprit) before and after each test (conditioned to check if I care or not). A recent example was adding something like this to my spec_helper.rb.
Spec::Runner.configure do |config|
config.before(:each) do
$label_count = Label.count
end
config.after(:each) do
label_diff = Label.count - $label_count
$label_count = Label.count
puts "#{self.class.description} #{description} altered label count by #{label_diff}" if label_diff != 0
end
end
We have a single test in our Continuous Integration setup that globs the spec/ directory of a Rails app and runs each of them against each other.
Takes a lot of time but we found 5 or 6 dependencies that way.
Here is some quick dirty script I wrote to debug order-dependent failure - https://gist.github.com/biomancer/ddf59bd841dbf0c448f7
It consists of 2 parts.
First one is intended to run rspec suit multiple times with different seed and dump results to rspec_[ok|fail]_[seed].txt files in current directory to gather stats.
The second part is iterating through all these files, extracts test group names and analyzes their position to the affected test to make assumptions about dependencies and forms some 'risk' groups - safe, unsafe, etc. The script output explains other details and group meanings.
This script will work correctly only for simple dependencies and only if the affected test is failing for some seeds and passes for another ones, but I think it's still better than nothing.
In my case it was complex dependency when effect could be cancelled by another test but this script helped me to get directions after running its analyze part multiple times on different sets of dumps, specifically only on the failed ones (I just moved 'ok' dumps out of current directory).
Found my own question 4 years later, and now rspec has a --order flag that lets you set random order, and if you get order dependent failures reproduce the order with --seed 123 where the seed is printed out on every spec run.
https://www.relishapp.com/rspec/rspec-core/v/2-13/docs/command-line/order-new-in-rspec-core-2-8
Its most likely some state persisting between tests so make sure your database and any other data stores (include class var's and globals) are reset after every test. The database_cleaner gem might help.
Rspec Search and Destroy
is meant to help with this problem.
https://github.com/shepmaster/rspec-search-and-destroy