How to loop a Maven execution? - maven

On a maven project, on process-test-resources phase I set up the database schemas with sql-maven-plugin. On this project that are N database shards which I set up with N repeated with exactly the same content bar the database name. Everything works as expected.
Problem here is that with a growing number of shards the number of similar blocks grows, which is cumbersome and makes maintenance annoying (since, per definition, all of those databases are literally the same). I would like to be able to define a "list" of database names and let sql-maven-plugin run once for each, without having to define the whole block many times.
I'm not looking for changes in the test setup as I positively want to setup as many shards as needed on the test environment. I need solely some "maven sugar" for conveniently define the over which values the executions should "loop".
I understand that maven itself does not support iteration by itself and am looking for alternatives or ideas of how to better achieve this. Things that come to my mind are:
Using/writing a "loop" plugin that manages the multiple parameterized executions
Extending sql-maven-plugin to support my use case
???
Does anyone has a better/cleaner solution?
Thanks in advance.

In this case i would recommend to use the maven-antrun-plugin to handle this situation, but of course it also possible to implement a particular maven plugin for this kind of purpose.

Related

The Clean Architecture, usecase dependencies

Recently, I found my way to The Clean Architecture post by Uncle Bob. But when I tried to apply it to a current project, I got stuck when a usecase needed to depend on another usecase.
For example, my Domain Model is Goal and Task. One Goal can have many Tasks. When I update a Task, it needs to update the information of its parent Goal. In other words, UpdateTask usecase will have UpdateGoal usecase as a dependecy. I am not sure if this is acceptable, or, if we should avoid usecase level dependencies.
A use case is related to a functionality of your application. Generally when we need to invoke from one use case to another there is something that does not work.
When you update a goal in isolation, it is not the same scenario as when you update it by a change in a task, in fact, it is sure that not all data is updated, but a part.
Surely you will have to use the goal repository and the goal entity but it is a completely different scenario. In your case you are not duplicating logic, only calls to the repository or the entity, saving code lines can be expensive in the future.
In short, it is not a good idea to have dependence between use cases.

Can you define the step order in the spring-batch dynamically?

I don't speak English that well. Please understand.
I use the Spring-boot and Spring-batch. It continues a project.
I know that beans related to a batch while all beans go up to the server early days same time go up together.
it was defined as A->B->C. If it is the case.
Is there the way in which I change into B->A->C this type and I can put into practice?
By using JobExecutionDecider, I was going to look for the way but as a step increases, the problem that the coding range is expanded is predicted.
When there are the StepA, StepB, and StepC for example.
A->B->C, A->C->B, B->A->C, B->C->A, C->A->B, C->B->A The number of 3!(=6) If I happen and get to do this for hanger.
.start(decider).on("A").to(StepA())
.next(decider).on("B").to(StepB()).next(StepC()).end()
.from(decider).on("C").to(StepC()).next(StepB()).end()
.from(decider).on("B").to(StepB())
.....
.....
Should you describe all case of 3!(=6) in this way?
The more if that happens, the number of Step increases, I increase by an exponential...
Doesn't something have the better way?

gradle has no task ordering so how to achieve this

So if we have a drop db task and a create db task and a start server task and a runqatest task and we want to
have independent tasks so I can just call gradle dropdb by itself(or the others as well)
have the runqatest depend on dropdb, createdb, populatedb, startserver
Number 2 above obviously needs to be ordered or will break and gradle does not abide by any order like ant does. How to achieve this? (I have read plenty about this on this post
http://markmail.org/thread/wn6ifkng6k7os4qn#query:+page:1+mid:hxibzgim5yjdxl7q+state:results
though the one user is wrong on it not being deterministic when you have
1. e depend on c and d
2. c depend on b,a
3. d depend on a,b
since e decides c will be first, the build would run b,a,c,d so it is completely deterministic. I do agree that parallelizing a build is much harder if you have order though like ant does as you can't just run c and d in parallel as order matters(and it's worse as from a user perspective, it does not matter most of the time).
If only they would add a dependsOnOrdered so we can do order when absolutely necessary.
OR does anyone know what is the way we are supposed to do this? The issue was filed against gradle in 2009!!!! I still see no documentation in gradle on how to do ordered stuff when needed.
Dean
Here is one solution:
if (gradle.startParameter.taskNames.contains("qatest") {
qatest.dependsOn startServer
startServer.dependsOn populatedb
populatedb.dependsOn createdb
createdb.dependson dropdb
}
The limitation of this approach is that it only works if qatest is part of the initial tasks provided on the command line. Sometimes this is good enough, and you can add a check to make sure that users don't go wrong.
If you need this more often, you can add a little helper method that makes it easier to declare such a workflow. Something like workflow(qatest, dropdb, createdb, populatedb, startserver).
Another approach is to create "clones" of the tasks, and add task dependencies (only) between the clones. Again, you could hide this behind a little abstraction. For example, createWorkflowTask("startServer") { ... } could create and configure both a startServer and a startServerWorkflow task.
In summary, the programmability of Gradle makes it possible to overcome the problem that "workflow" isn't yet a first-class concept in Gradle.
Gradle 1.6 added an alternative solution, but it's still incubating: mustRunAfter. See the release notes.

Unit Testing highly interdependent code

So I have some challenging code I would like to refactor. The challenge is that it depends on Database queries, EJB and Java serverFaces. Not simultaneously but close to it.
A good example would be a geocoder. Getting meaningful results depending on multiple queries to the DB depending on the data entered and stored. The code might also reference other helper classes and look them up via the JSF framework.
What are the best strategies for testing this sort of code? Should I try to separate out my code as much as possible? Should I use mocking instead? What has worked for other people?
Well, the short answer is "yes".
You're going to need, first of all, to factor the code sufficiently to construct unit tests at all. What you're describing is excessively complicated to apply the usual unit test methods, and what you would get in any case is more like a higher-level acceptance test.
Now, as far as that factoring goes, you have several possible approaches, and you will probably use them all.
Test the data base queries themselves, using an external script.
Construct an appropriate mock for the components directly accessing the DB, in order to see what happens against known results.
Build unit tests using a JUnit like framework for units of functionality.
Examine the state of the art to see if you can usefully test the output HTML against unit tests.

Is it bad practice to run tests on a database instead of on fake repositories?

I know what the advantages are and I use fake data when I am working with more complex systems.
What if I am developing something simple and I can easily set up my environment in a real database and the data being accessed is so small that the access time is not a factor, and I am only running a few tests.
Is it still important to create fake data or can I forget the extra coding and skip right to the real thing?
When I said real database I do not mean a production database, I mean a test database, but using a real live DBMS and the same schema as the real database.
The reasons to use fake data instead of a real DB are:
Speed. If your tests are slow you aren't going to run them. Mocking the DB can make your tests run much faster than they otherwise might.
Control. Your tests need to be the sole source of your test data. When you use fake data, your tests choose which fakes you will be using. So there is no chance that your tests are spoiled because someone left the DB in an unfamiliar state.
Order Independence. We want our tests to be runnable in any order at all. The input of one test should not depend on the output of another. When your tests control the test data, the tests can be independent of each other.
Environment Independence. Your tests should be runnable in any environment. You should be able to run them while on the train, or in a plane, or at home, or at work. They should not depend on external services. When you use fake data, you don't need an external DB.
Now, if you are building a small little application, and by using a real DB (like MySQL) you can achieve the above goals, then by all means use the DB. I do. But make no mistake, as your application grows you will eventually be faced with the need to mock out the DB. That's OK, do it when you need to. YAGNI. Just make sure you DO do it WHEN you need to. If you let it go, you'll pay.
It sort of depends what you want to test. Often you want to test the actual logic in your code not the data in the database, so setting up a complete database just to run your tests is a waste of time.
Also consider the amount of work that goes into maintaining your tests and testdatabase. Testing your code with a database often means your are testing your application as a whole instead of the different parts in isolation. This often result in a lot of work keeping both the database and tests in sync.
And the last problem is that the test should run in isolation so each test should either run on its own version of the database or leave it in exactly the same state as it was before the test ran. This includes the state after a failed test.
Having said that, if you really want to test on your database you can. There are tools that help setting up and tearing down a database, like dbunit.
I've seen people trying to create unit test like this, but almost always it turns out to be much more work then it is actually worth. Most abandoned it halfway during the project, most abandoning ttd completely during the project, thinking the experience transfer to unit testing in general.
So I would recommend keeping tests simple and isolated and encapsulate your code good enough it becomes possible to test your code in isolation.
As far as the Real DB does not get in your way, and you can go faster that way, I would be pragmatic and go for it.
In unit-test, the "test" is more important than the "unit".
I think it depends on whether your queries are fixed inside the repository (the better option, IMO), or whether the repository exposes composable queries; for example - if you have a repository method:
IQueryable<Customer> GetCustomers() {...}
Then your UI could request:
var foo = GetCustomers().Where(x=>SomeUnmappedFunction(x));
bool SomeUnmappedFunction(Customer customer) {
return customer.RegionId == 12345 && customer.Name.StartsWith("foo");
}
This will pass for an object-based fake repo, but will fail for actual db implementations. Of course, you can nullify this by having the repository handle all queries internally (no external composition); for example:
Customer[] GetCustomers(int? regionId, string nameStartsWith, ...) {...}
Because this can't be composed, you can check the DB and the UI independently. With composable queries, you are forced to use integration tests throughout if you want it to be useful.
It rather depends on whether the DB is automatically set up by the test, also whether the database is isolated from other developers.
At the moment it may not be a problem (e.g. only one developer). However (for manual database setup) setting up the database is an extra impediment for running tests, and this is a very bad thing.
If you're just writing a simple one-off application that you absolutely know will not grow, I think a lot of "best practices" just go right out the window.
You don't need to use DI/IOC or have unit tests or mock out your db access if all you're writing is a simple "Contact Us" form. However, where to draw the line between a "simple" app and a "complex" one is difficult.
In other words, use your best judgment as there is no hard-and-set answer to this.
It is ok to do that for the scenario, as long as you don't see them as "unit" tests. Those would be integration tests. You also want to consider if you will be manually testing through the UI again and again, as you might just automated your smoke tests instead. Given that, you might even consider not doing the integration tests at all, and just work at the functional/ui tests level (as they will already be covering the integration).
As others as pointed out, it is hard to draw the line on complex/non complex, and you would usually now when it is too late :(. If you are already used to doing them, I am sure you won't get much overhead. If that were not the case, you could learn from it :)
Assuming that you want to automate this, the most important thing is that you can programmatically generate your initial condition. It sounds like that's the case, and even better you're testing real world data.
However, there are a few drawbacks:
Your real database might not cover certain conditions in your code. If you have fake data, you cause that behavior to happen.
And as you point out, you have a simple application; when it becomes less simple, you'll want to have tests that you can categorize as unit tests and system tests. The unit tests should target a simple piece of functionality, which will be much easier to do with fake data.
One advantage of fake repositories is that your regression / unit testing is consistent since you can expect the same results for the same queries. This makes it easier to build certain unit tests.
There are several disadvantages if your code (if not read-query only) modifies data:
- If you have an error in your code (which is probably why you're testing), you could end up breaking the production database. Even if you didn't break it.
- if the production database changes over time and especially while your code is executing, you may lose track of the test materials that you added and have a hard time later cleaning it out of the database.
- Production queries from other systems accessing the database may treat your test data as real data and this can corrupt results of important business processes somewhere down the road. For example, even if you marked your data with a certain flag or prefix, can you assure that anyone accessing the database will adhere to this schema?
Also, some databases are regulated by privacy laws, so depending on your contract and who owns the main DB, you may or may not be legally allowed to access real data.
If you need to run on a production database, I would recommend running on a copy which you can easily create during of-peak hours.
It's a really simple application, and you can't see it growing, I see no problem running your tests on a real DB. If, however, you think this application will grow, it's important that you account for that in your tests.
Keep everything as simple as you can, and if you require more flexible testing later on, make it so. Plan ahead though, because you don't want to have a huge application in 3 years that relies on old and hacky (for a large application) tests.
The downsides to running tests against your database is lack of speed and the complexity for setting up your database state before running tests.
If you have control over this there is no problem in running the tests directly against the database; it's actually a good approach because it simulates your final product better than running against fake data. The key is to have a pragmatic approach and see best practice as guidelines and not rules.

Resources