Apache Storm 1.1.0 - apache-storm

I have a topology that was created using Apache Storm. I have several bolts and spouts in place to perform various activities, and I am planning to create an automated test suite for testing the functionality of the topology.
Can someone who has done something similar help me, or suggest a tool/language that would help me achieve it? Currently, my topology is in Java.

The fastest tests are going to be unit tests. If you can write most of your business logic in a way that is decoupled from Storm's APIs, you can write your tests as regular JUnit tests, maybe with Mockito or a similar tool to stub collaborators. Basically your standard Java unit tests.
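A minimal sketch of that decoupling, in plain Java with no Storm dependency. The class name SentenceSplitter and its behaviour are hypothetical, chosen only for illustration; the point is that the logic lives in a class a bolt can delegate to, so it can be checked without starting a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/**
 * Hypothetical business logic with no Storm imports.
 * In the topology, a thin bolt would delegate to it from execute(Tuple input),
 * e.g. by emitting one tuple per word returned for input.getString(0).
 */
final class SentenceSplitter {
    /** Splits a sentence on whitespace and lower-cases each word. */
    List<String> split(String sentence) {
        List<String> words = new ArrayList<>();
        if (sentence == null) {
            return words;
        }
        for (String token : sentence.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                words.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return words;
    }
}

/** Stand-in for a JUnit test method; a real suite would use assertEquals instead. */
final class SentenceSplitterDemo {
    public static void main(String[] args) {
        System.out.println(new SentenceSplitter().split("Hello  Storm world"));
        // prints [hello, storm, world]
    }
}
```

Because the bolt itself then contains almost no logic, the slower LocalCluster tests only need to cover the wiring.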
For integration testing where you need to check whether you're using Storm correctly, or need to do a full end to end test of your topology, you can look at the org.apache.storm.Testing class, which helps you start a LocalCluster. There are some examples at https://github.com/xumingming/storm-lib/blob/master/src/jvm/storm/TestingApiDemo.java. The basic idea is to boot up Storm in the same JVM as the test, and then deploy the topology into it.
As of 2.0.0 there's a LocalCluster builder class making it a little easier to instantiate LocalCluster from Java.
Just to give an idea of what LocalCluster offers:
Can run your topology in the same process as your tests
You can enable tuple tracking, which causes the cluster to track all emitted tuples from all components. This lets you e.g. assert that a certain tuple was emitted from a certain component.
Lets you replace your spouts with stubs. This can let you inject specific tuples into the topology easily, e.g. using a FixedTupleSpout or FeederSpout.
Lets you do assertions about which tuples were acked/failed
Some stubbed spouts are completable spouts, which means they have an API to indicate to Storm when they have emitted and acked/failed all tuples. This can let you e.g. start a topology in the test and ask for Storm to run it until all tuples are processed. This makes it easier to write tests that are not flaky, since you don't need to know how long it takes to finish processing your tuples.
For other examples of how to use LocalCluster, you can take a look at some of our own integration tests at https://github.com/apache/storm/blob/8f49e06998abb4dfc50f51d78b6784ebd04844fb/storm-core/test/jvm/org/apache/storm/integration/TopologyIntegrationTest.java. Please ignore the way the topologies are wired in these tests, you should just use TopologyBuilder in your own tests.

Related

Is it worth implementing service integration tests in Spring Boot application?

I have come across multiple articles on integration testing of Spring Boot applications. Given that the application follows the three-layer pattern (Web Layer - Service Layer - Repository Layer), I have not seen a single article that integration tests the application up to just the service layer (omitting the web layer), where all the business logic is contained. All of the integration tests seem like controller unit tests, mostly verifying only request and response payloads, parameters, etc.
What I would like however is to verify the business logic using service integration tests. Since the web layer is responsible only for taking the results from services and exchanging them with the client I think this makes much more sense. Such tests could also contain some database state verifications after running services to e.g. ensure that there are no detached leftovers.
Since I have never seen such a test, is it a good practice to implement one? If no, then why?
There is no one true proper way to test Spring applications. A general approach is as you described:
slice tests (@DataJpaTest, @WebMvcTest, etc.) for components that rely heavily on Spring
unit tests for domain classes and the service layer
a small number of e2e tests (@SpringBootTest) to see if everything is working together properly
Spotify engineers, on the other hand, wrote about how they do almost no unit testing and cover everything with integration tests.
There is nothing stopping you from using @SpringBootTest to test your service layer with all underlying components. There are things you need to consider:
it is harder to prepare test data (or put system under certain state), as you need to put them into the database
you need to clean the database yourself, as @SpringBootTest does not roll back transactions
it is harder to test edge cases
you need to mock external HTTP services with things like Wiremock - which is also harder than using regular Mockito
you need to watch how many application contexts you create during tests: first, creating them is slow; second, each application context will connect to the database, so you will create X connections per context and can eventually reach the limits of your database server.
This is borderline opinion-based, but still, I will share my take on this.
I usually follow Mike Cohn's original test pyramid: many unit tests at the base, fewer service tests in the middle, and the fewest UI tests at the top.
The reason is that unit tests are not only easier to write but also faster, and they most likely cover much more than the other, coarser-grained tests.
Then we come to the service or integration tests, the ones you mention in your question. They are usually harder to write, simply because you are now testing the whole application and not only a single class, and they take longer to run. The benefit is that you are able to test a given scenario, and most probably they do not require as much maintenance as the unit tests when you need to change something in your code.
However, and here comes the opinion part, I usually prefer to focus much more on writing good and extensive unit tests (but not too much on test coverage and more on what I expect from that class) than on fully-fledged integration tests. What I do like to do is take advantage of Spring Slice Tests, which in the pyramid would be placed between the Unit Tests and the Service Tests. They allow you to focus on a specific class (a Controller, for example), but they also allow you to test some integration with the underlying Spring Framework or infrastructure. This is for me the best of both worlds. You can still focus on a single class but also test some relevant components of your application. You can test your web layer with @WebMvcTest or @WebFluxTest (so that you can test JSON deserialization and serialization, bean validation, etc.), or you can focus on your persistence layer with @DataJpaTest, @JdbcTest or @DataMongoTest (so that you can test the actual persistence and retrieval of data).
Wrapping up, I usually write a bunch of Unit Tests and then web layer tests to check my Controllers and also some persistence layer tests against a real database.
You can read more in the following interesting online resources:
https://martinfowler.com/articles/practical-test-pyramid.html
https://www.baeldung.com/spring-tests

Is it advisable to have sequential integration tests?

I'm new to integration tests, and currently doing it with SpringBootTest.
Roughly what I'm gathering from examples is that each method would be one integration test (corresponds to one REST call).
But what if I want to test a scenario where it's a sequence of steps? Like Create User->Update User->Delete User.
Maybe that's not called an integration test? And if so, how do I chain these inside SpringBootTest?
Well, it is okay to have a testing order at that level of testing. What I mean by level is this:
Unit Testing -> Component Testing -> Integration Testing -> end to end testing.
As you move to the right, the tests are more complex to set up and execute.
In my opinion, the tests you describe are integration tests, so having an order is fine. But you should try to avoid adding complexity; for instance, using an in-memory database like H2 and populating it while you are testing helps a lot.
As the database is in memory, you won't need to take care of cleaning or restoring its state; the data will simply be gone after the tests finish.
Now, you need to take care of the order of the test methods. JUnit 5 uses a new annotation, @TestMethodOrder, and JUnit 4 uses @FixMethodOrder, which is not very customizable.
And finally, I suggest using something more BDD-related, like Cucumber, for that kind of test.

Activiti vs Spring batch

I have a use case to implement. It's basically a workflow kind of use case. Below are the requirements:
Extract and import data from an external db to an internal db
Make this imported data into different formats and supply it to multiple external systems and invoke some script there. The external interfaces are SFTP, SOAP, JDBC, Python over CORBA. There are around 14 external systems with one of these interfaces.
Interface transactions are executed in around 15 steps, with the ability to run some steps in parallel
These steps should be configurable. ie, a particular flow may execute 10 of these 15 steps and another flow executes 15 of 15 steps
Should have the ability to restart each step individually or restart from a particular step
There are some steps that are manual and completion of manual step should trigger next step
The volume of data is not that large. Total data size is around 400k records, but the process runs on around 30k records at a time. Development time is limited, and we are looking for a lightweight solution that is easy to learn and implement.
We are looking for Spring based or Spring integratable solutions.
The solutions we considered are
For workflow:
Activiti, Spring Batch
For interfaces:
Spring Integration
My questions are:
1) Can Spring Batch be considered for managing a workflow kind of use case? I don't think it's a best-fit use case for Spring Batch, but as it's simple and easy to implement, we looked at its scope. We considered modelling each interface interaction as a step in a batch job and doing the Spring Integration calls to the external interfaces inside the tasklet. The issues, as far as I understand, are:
a) Dynamic step configuration can be done with Java configuration, but how flexible is it, and is it recommended?
b) Manual step processing is not possible in Spring Batch.
Is there any workaround for this? Are there any other issues or performance impacts in doing this?
2) Activiti seems to be a solution. Can you please provide some feedback on Activiti with Spring and Spring Integration for this use case, the ease of implementing it, and the support available for Activiti?
3) Can Activiti workflows be restarted from a particular task? Can a task be rolled back?
Any suggestions are welcome!
1) For managing workflows, Activiti would be a great choice. They have created a really good process engine, which should meet your needs for delegating your tasks as well as calling your custom logic. Moreover, it is based entirely on the Spring Framework, so integration with your logic would be easy.
2) I've covered this in the first answer.
3) No, you will have to create a new workflow for that. And yes, a task can be rolled back.

Transaction Management while performing Functional Testing Spring Rest Interface

I am trying to write a functional testing suite. The tests use a bunch of REST calls to execute workflows (the testing is black-box testing, using the REST interface). The application under test is Spring 3 and uses Spring's transaction management (DataSourceTransactionManager). To avoid individual setup and tear-down methods, I was thinking of making the transactions rollback-able. This is accomplished by using @TransactionConfiguration(defaultRollback = true) when doing unit/integration testing, but I am not aware of a straightforward way of doing it while performing functional testing over REST (since the steps are individual REST calls).
The application under test is not single-threaded, and multiple concurrent test suites might be running at the same time against the same database instance/application.
My preliminary analysis leads me to believe that I should force Spring to use the same rollback-able transaction for all the methods in a test suite (for example, using a factory method that returns a transaction based on a unique identifier, then passing a unique request parameter and using AOP to somehow inject a transaction for this thread).
Have any of you done anything similar? I would really appreciate some ideas.
Thank you.
Good question.
I am planning to use transactions in my JUnit tests too.
Please see if the following works for you:
@Test
@Transactional
@Rollback(true)
It will take some time for me to implement this in my project, but I hope this helps you before I need it myself.
One more thing I read is that the program is multithreaded.
Do you not wish to use the isolation levels provided by Spring? Still, I think it is the developers who should take care of this.

In TDD, why OpenEJB and why Arquillian?

I'm a web developer who ended up in some Java EE development (RichFaces, Seam 2, EJB 3.1, JPA). To test JPA I use Hypersonic and Mockito. But I lack deeper EJB knowledge.
Some may argue that we should use OpenEJB and Arquillian, but for what?
When do I need to do container dependent tests? What are the possible test scenarios where I need OpenEJB and Arquillian?
Please enlighten me :)
There are two aspects in this case.
Unit tests. These are intended to be very fast (executing the whole test suite in seconds). They test very small chunks of your code, e.g. one method. To achieve this kind of granularity, you need to mock the whole environment using e.g. Mockito. You're not interested in:
invoking EntityManager and putting entities into the database,
testing transactions,
making asynchronous invocations,
hitting the JMS Endpoint, etc.
You mock this whole environment and just test each method separately. Unit tests are fine-grained and blazingly fast. That matters because it lets you execute them each time you make an important change in the code. If they were more complex and time-consuming, developers wouldn't hit the 'test' button as often as they should.
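To make the unit-test side concrete, here is a minimal sketch in plain Java. All names (AccountRepository, WithdrawalService, InMemoryAccountRepository) are hypothetical, and a hand-written in-memory stub is used in place of Mockito so that the example needs nothing beyond the JDK. The business method is exercised with no container, EntityManager or transaction involved.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical collaborator that would normally be backed by an EntityManager. */
interface AccountRepository {
    Optional<Long> findBalance(String accountId);
    void saveBalance(String accountId, long balance);
}

/** Hypothetical business class under test; it knows nothing about the container. */
final class WithdrawalService {
    private final AccountRepository accounts;

    WithdrawalService(AccountRepository accounts) {
        this.accounts = accounts;
    }

    void withdraw(String accountId, long amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        long balance = accounts.findBalance(accountId)
                .orElseThrow(() -> new IllegalStateException("unknown account " + accountId));
        if (balance < amount) {
            throw new IllegalStateException("insufficient funds");
        }
        accounts.saveBalance(accountId, balance - amount);
    }
}

/** Hand-written stub standing in for the real, database-backed repository. */
final class InMemoryAccountRepository implements AccountRepository {
    private final Map<String, Long> balances = new HashMap<>();

    @Override
    public Optional<Long> findBalance(String accountId) {
        return Optional.ofNullable(balances.get(accountId));
    }

    @Override
    public void saveBalance(String accountId, long balance) {
        balances.put(accountId, balance);
    }
}

/** Stand-in for a JUnit test method. */
final class WithdrawalServiceDemo {
    public static void main(String[] args) {
        InMemoryAccountRepository repo = new InMemoryAccountRepository();
        repo.saveBalance("acc-1", 100L);
        new WithdrawalService(repo).withdraw("acc-1", 30L);
        System.out.println(repo.findBalance("acc-1").get()); // prints 70
    }
}
```

What this kind of test cannot tell you is whether the real repository participates in the right transaction or whether a rollback actually happens, which is exactly the gap the container-based tests described next are meant to fill.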
Integration tests. These are slower, as you want to test the integration between your modules. You want to test whether they 'talk' to each other appropriately, e.g.:
are the transactions propagated in the way you expect,
what happens if you invoke your business method with no transaction at all,
do the changes sent from your web services client really hit your endpoint method, and does it add the data to the database?
what if my JMS endpoint throws an ApplicationException - will it properly roll back all the changes?
As you can see, integration tests are coarse-grained, and as they're executed in the container (or basically, in a production-like environment) they're much slower. These tests are normally not executed by the developer after each code change.
Of course, you can run the EJB container in embedded mode, just as you can run JPA in Java SE. The point is that the artificial environment gives you the basic services, but you'll end up tweaking it and still end up with less flexibility than in the real container.
Arquillian gives you the ability to create the production environment on the container of your choice and just execute tests in this environment (using the datasources, JMS destinations, and a whole lot of other configuration you expect to see in the production environment).
Hope it helps.
I attended Devoxx this year and got a chance to ask the JBoss dudes this question.
Some of the test scenarios (the stuff I managed to scribble down):
Configuration of the container
Container integration
Transaction boundaries
Entity callback methods
Integration tests
Selenium recordings
