How do I avoid Liquibase running multiple times during JUnit test cases - Spring

I have an application and I'm managing its database with Liquibase. While looking at the package process during build jobs, I noticed that Liquibase is being run multiple times to set up an H2 database. I'm trying to optimize my application code so that the build finishes faster. I've already implemented parallel test execution, which reduced the build time from 18 minutes to 10 minutes, but I would still like it to be faster. How can I set up the test cases so that Liquibase runs only once and all test cases run against the resulting database?

Related

Hibernate: DB not reliably rolled back at end of UnitTests in spite of @Transactional

We have a large application using Spring for application setup, initialisation and "wiring" and Hibernate as persistence framework. For that application we have a couple of unit tests which are causing us headaches because they again and again run "red" when executing them on our Jenkins build server.
These UnitTests execute and verify some rather complex and lengthy core-operations of our application and thus we considered it too complex and too much effort to mock the DB. Instead these UTs run against a real DB. Before the UTs are executed we create the objects required (the "pre-conditions"). Then we run a Test and then we verify the creation of certain objects, their status and values etc. All plain vanilla...
Since we run multiple tests in sequence which all need the same starting point, these tests derive from a common parent class that has an @Transactional annotation. Its purpose is to roll back the DB after each unit test so that the subsequent test can start from the same baseline.
That approach is working perfectly and reliably when executing the unit-tests "locally" (i.e. running a "mvn verify" on a developer's workstation). However, when we execute the very same tests on our Jenkins, then - not always but very often - these tests fail because there are too many objects being found after a test or due to constraint violations because certain objects already exist that shouldn't yet be there.
As we found out by adding lots of log statements (because it's otherwise impossible to observe code running on Jenkins), the reason for these failures is that the DB is occasionally not properly rolled back after a prior test. Thus there are leftovers from the previous test(s) in the DB, and these then cause issues during subsequent tests.
What's puzzling us most is:
Why are these tests failing ONLY when we execute them on Jenkins, but never when we run the very same tests locally? We are using the exact same Maven command line and code, as well as the same Java version, Maven version, etc.
We are by now sure that this has nothing to do with UTs being executed in parallel, as we initially suspected. We disabled every option the Maven Surefire plugin offers for running UTs in parallel. Our log statements also clearly show that the tests are perfectly serialized, yet again and again objects "pile up": the objects that were supposed to be removed or rolled back at the end of each test method are still there, and their number increases with each test.
We also observed a certain "randomness" with this effect. Often, the Jenkins builds run fine for several commits and then suddenly (even without any code change, just by retriggering a new build of the same branch) start to run red. The DB, however, is getting re-initialized before each build & test-run, so that can not be the source of this effect.
Does anyone have an idea what could cause this? Why do the DB rollbacks that are supposed to be triggered by the @org.springframework.transaction.annotation.Transactional annotation work reliably on our laptops but not on our build server? Has anyone had similar experiences or findings?

How to retry only failed tests in the CI job run on Gitlab?

Our automation tests run in gitlab CI environment. We have a regression suite of around 80 tests.
If a test fails due to some intermittent issue, the CI job fails and since the next stage is dependent on the Regression one, the pipeline gets blocked.
We retry the job to rerun regression suite expecting this time it will pass, but some other test fails this time.
So, my question is:
Is there any capability using which on retrying the failed CI job, only the failed tests run (Not the whole suite)?
You can use the retry keyword when you specify the parameters for a job, to define how many times the job can be automatically retried: https://docs.gitlab.com/ee/ci/yaml/#configuration-parameters
[Retry Only Failed Scenarios]
Yes, but it depends. Let me explain. Below are the pseudo-steps that can be performed to retry only failed scenarios. The steps are specific to pytest, but can be adapted to other test runners.
Execute the test scenarios with --last-failed. On the first run, all 80 scenarios will be executed.
The test runner creates a metadata file containing a list of failed tests. For example, pytest creates a .pytest_cache folder containing a lastfailed file with the list of failed scenarios.
Add the .pytest_cache folder to the GitLab cache with key=<gitlab-pipeline-id>.
The user sees that there are 5 failures and reruns the failed job.
When the job is retried, it finds the .pytest_cache folder in the GitLab cache and copies it to the test-running directory. (This step shouldn't fail if the cache doesn't exist, so that the first execution is handled.)
Execute the same test cases with the same --last-failed parameter to run only the tests that failed earlier.
In the rerun, 5 test cases will be executed.
Assumptions:
The test runner you are using creates a metadata file like pytest.
POC Required:
I have not done a POC for this, but in theory it looks possible. The only doubt I have is how GitLab parses the results. Ideally, in the final result all 80 scenarios should pass. If it doesn't work out this way, then we need 2 jobs: execute tests -> [manual] execute failed tests, to get 2 parsed results. I am sure that with 2 stages it will definitely work.
You can use Retry Analyser. This will help you definitely.

Integration test execution should wait until server is ready

I have written Selenium tests which should be executed during the build process of a web application. I am using the maven-failsafe-plugin to execute the integration tests and the tomcat7-maven-plugin to start up a Tomcat server in the pre-integration-test phase; after the tests have run, the server is stopped in the post-integration-test phase. This works fine.
The problem is that the tomcat server is caching some data when started up to improve the search speed. Some of my tests rely on that data, so the integration tests should wait for the server to finish caching the data.
How can I make that happen?
I added a progress bar to show the loading progress. Once loading is complete, the progress bar is no longer rendered and the data table is rendered instead. This way, in the tests that depend on the data table being loaded, I can add this line of code:
longWait.until(ExpectedConditions.presenceOfElementLocated(By.id("dataTablePanel")));
Additionally, I am using org.junit.runners.Suite as a runner so that I can specify the order in which my test classes are executed. That way I can run the tests that do not rely on the data first and then the ones that need it. To ensure that the data is present without checking it in every test case, I created a test class that only checks for the presence of the data and is executed before all test cases that depend on it.
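The wait-until-ready idea behind the longWait.until(...) call can be sketched without Selenium as a plain polling helper. This is only an illustration: ReadinessWait and waitUntil are hypothetical names, and the condition used in main is a stand-in for "the server has finished caching" (in the real tests it would be the presence of the dataTablePanel element).

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not part of Selenium or JUnit.
class ReadinessWait {

    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition became true in time, false on timeout
     * or if the waiting thread was interrupted.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in readiness check: becomes true roughly 200 ms after start.
        long readyAt = System.nanoTime() + Duration.ofMillis(200).toNanos();
        boolean ready = waitUntil(
                () -> System.nanoTime() - readyAt >= 0,
                Duration.ofSeconds(5),
                Duration.ofMillis(50));
        System.out.println("ready = " + ready);
    }
}
```

Running such a check once in a dedicated "is the data there" test class, scheduled before the data-dependent classes, keeps the individual tests free of repeated waits.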

Run Scenarios with time constraint on Jenkins using Ruby Cucumber framework

I have two scenarios. Scenario 1 should run at 5:00 PM every day, gather the DB results, and save them to a YAML file. I have to use these results to run scenario 2, which needs to run the next day at 8 AM. How can I implement this on Jenkins?
Right now I am trying to implement this through a rake task. My logic is: if the YAML file is present, run scenario 2; otherwise run scenario 1.
Is there a better way to do it?
I need these two scenarios in the same Jenkins job.

Spring Batch: Horizontal scaling of Job Repository

I have read a lot about how to enable parallel processing and chunking of an individual job using the Master/Slave paradigm. Consider an already implemented Spring Batch solution that was intended to run on a standalone server. With minimal refactoring I would like to enable it to scale horizontally and be more resilient in production operation. Speed and efficiency are not a goal.
http://www.mkyong.com/spring-batch/spring-batch-hello-world-example/
In the example linked above, a Job Repository is used that connects to and initializes a database schema for the Job Repository. Job initiation requests are fed to a message queue that a single server, with a single Java process, is listening on via Spring JMS. When it receives a request, it launches a new Java process that is the Spring Batch job. If the job has not been started according to the Job Repository, it will begin. If the job had failed, it will pick up where it left off. If the job is in process, the request is ignored.
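The three cases described above can be written down as a small decision function. This is a sketch only: JobStartDecision, RecordedStatus and Action are hypothetical names invented for illustration and are not Spring Batch classes; in the real setup the status would come from the Job Repository.

```java
// Hypothetical types for illustration; these are not Spring Batch classes.
class JobStartDecision {

    // Status of the job as recorded in the Job Repository (assumed names).
    enum RecordedStatus { NOT_STARTED, FAILED, IN_PROCESS }

    // What the listening process does with an incoming initiation request.
    enum Action { START, RESUME, IGNORE }

    static Action decide(RecordedStatus status) {
        switch (status) {
            case NOT_STARTED:
                return Action.START;   // job has never run: begin it
            case FAILED:
                return Action.RESUME;  // job failed earlier: pick up where it left off
            case IN_PROCESS:
                return Action.IGNORE;  // job is already running: do nothing
            default:
                throw new IllegalArgumentException("Unknown status: " + status);
        }
    }

    public static void main(String[] args) {
        for (RecordedStatus status : RecordedStatus.values()) {
            System.out.println(status + " -> " + decide(status));
        }
    }
}
```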
The single point of failure is the single server and single listening process for job initiation. I would like to increase resiliency by horizontally scaling identical server instances all competing for who can first grab the job initiation message when it first appears in the queue. That server instance will now attempt to run the job.
I was conceiving that all instances of the JobRepository would share the same schema, so they can all query for when the status is currently in process and decide what they will do. I am unsure though if this schema or JobRepository implementation is meant to be utilized by multiple instances.
Is there a risk that this approach could deadlock the database? There are other constraints that prevent the Partition features of Spring Batch from working for my application.
I decided to build a prototype to test whether the Spring Batch Job Repository schema and SimpleJobRepository can be used in a load-balanced way with multiple Spring Batch Java processes running concurrently. I was afraid that deadlocks might occur at the database, leaving all running job processes stuck.
My Test
I started with the mkyong Spring Batch HelloWorld example and made some changes to it where it could be packaged into a Jar that can be executed from the command line. I also removed the initialize database step defined in the database.config file and manually established a local MySQL server with the proper schema elements. I added a Job parameter for time to be the current time in millis so that each job instance would be unique.
Next, I wrote a separate Java main class that used the Apache Commons Exec framework to create 50 subprocesses with no wait between them. Each of these processes also has a Thread.sleep of 1 second within its Processor objects, so that a number of processes all kick off at the same time and all attempt to access the database at the same time.
Results
After running this test a number of times in a row I see that all 50 Spring batch processes consistently complete successfully and update the same database schema correctly. I don't see any indication that if there were multiple Spring Batch job processes running on multiple servers connecting to the same database that they would interfere with each other on the schema nor do I see any indication that a deadlock could happen at this time.
So it sounds as if load balancing of Spring Batch jobs without the use of advanced Master/Slave and Step Partitioning approaches is a valid use case.
If anybody would like to comment on my test or suggest ways to improve it I would appreciate it.
Here is an excerpt from the Spring Batch docs on how Spring Batch handles database updates for its repository:
Spring Batch employs an optimistic locking strategy when dealing with updates to the database. This means that each time a record is 'touched' (updated) the value in the version column is incremented by one. When the repository goes back to save the value, if the version number has changed it throws an OptimisticLockingFailureException, indicating there has been an error with concurrent access. This check is necessary, since, even though different batch jobs may be running in different machines, they all use the same database tables.
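The version-column check described in that excerpt can be illustrated with a small in-memory sketch. It is not Spring Batch's implementation: VersionedStore, Row and StaleVersionException are hypothetical names, and a HashMap stands in for the shared database table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a table with a version column.
class VersionedStore {

    static final class Row {
        final String status;
        final int version;

        Row(String status, int version) {
            this.status = status;
            this.version = version;
        }
    }

    // Plays the role that OptimisticLockingFailureException plays in the excerpt.
    static final class StaleVersionException extends RuntimeException {
        StaleVersionException(String message) {
            super(message);
        }
    }

    private final Map<Long, Row> rows = new HashMap<>();

    synchronized void insert(long id, String status) {
        rows.put(id, new Row(status, 0));
    }

    synchronized Row read(long id) {
        return rows.get(id);
    }

    // Updates only if the caller still holds the current version, then bumps it by one.
    synchronized Row update(long id, String newStatus, int expectedVersion) {
        Row current = rows.get(id);
        if (current == null) {
            throw new IllegalArgumentException("No row with id " + id);
        }
        if (current.version != expectedVersion) {
            throw new StaleVersionException(
                    "expected version " + expectedVersion + " but found " + current.version);
        }
        Row updated = new Row(newStatus, current.version + 1);
        rows.put(id, updated);
        return updated;
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.insert(1L, "STARTING");

        // Two processes read the same row before either of them writes.
        Row seenByA = store.read(1L);
        Row seenByB = store.read(1L);

        store.update(1L, "STARTED", seenByA.version); // succeeds and increments the version
        try {
            store.update(1L, "FAILED", seenByB.version); // B's version is now stale
        } catch (StaleVersionException e) {
            System.out.println("conflict detected: " + e.getMessage());
        }
    }
}
```

The second writer is rejected instead of blocked, which is consistent with the test above not showing deadlocks: conflicting updates surface as exceptions rather than as processes waiting on each other.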

Resources