Spring batch meta-data tables in DynamoDB


I have a Spring Boot batch application that pulls data from Redshift and dumps it into Elasticsearch.
A DynamoDB instance is also available to the application.
My question is:
Is it possible to store the Spring Batch meta-data tables in DynamoDB, or to avoid the meta-data tables altogether?
Here is what I have tried so far.
I tried to avoid creating the meta-data tables by using ResourcelessTransactionManager:
public ResourcelessTransactionManager resourcelessTransactionManager() {
    return new ResourcelessTransactionManager();
}
or by overriding setDataSource to do nothing:
public void setDataSource(DataSource dataSource) {
    // do nothing
}
In both cases, the job no longer reads data from Redshift.
I also could not find any reference on using DynamoDB to store the Spring Batch meta-data tables.
So, is there any way to store the meta-data schema in DynamoDB, or to avoid these tables without affecting how the ItemReader reads from Redshift?
My ItemReader code:
public ItemReader<MyPojo> redshiftReader(@Qualifier("redShiftDataSource") DataSource dataSource) {
    JdbcCursorItemReader<MyPojo> databaseReader = new JdbcCursorItemReader<>();
    databaseReader.setDataSource(dataSource);
    databaseReader.setSql(QUERY_TO_READ_DATA);
    databaseReader.setRowMapper(new BeanPropertyRowMapper<MyPojo>(MyPojo.class));
    databaseReader.setFetchSize(10000);
    return databaseReader;
}
Spring Boot version: 2.2.2.RELEASE

Related

Spring Data Mongo Db transactional annotation error handling of TransientTransactionError

I have multiple operations on Mongo which I want to be considered as one operation so I enabled transactions. I'm using Mongo version 5.
The configuration class contains the following bean definition:
@Bean(name = "transactionManager")
MongoTransactionManager transactionManager(MongoDatabaseFactory dbFactory) {
    return new MongoTransactionManager(dbFactory);
}
My business logic is similar to the snippet below. Assume that this method changes several different collections.
@Transactional
public MyResult create(String payload) {
    // operations omitted for brevity
}
Based on the Mongo documentation (https://www.mongodb.com/docs/v5.0/core/transactions/) I quote:
The new callback API also incorporates retry logic for TransientTransactionError or UnknownTransactionCommitResult commit errors.
See also: https://github.com/mongodb/specifications/blob/master/source/transactions/transactions.rst#error-labels
Does Spring Data MongoDB know how to handle TransientTransactionError (including the retry described in the docs)? If not, what are the alternatives?
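If the retry turns out not to be handled for you, one alternative is to retry the whole transactional call yourself, from outside the @Transactional method so that each attempt starts a fresh transaction. The sketch below is plain Java and only shows the shape of such a loop. LabeledException, runWithRetry and the attempt limit are hypothetical names chosen for illustration. In a real application the predicate would presumably inspect the driver exception's error labels (for example via MongoException.hasErrorLabel, possibly on a wrapped cause), and that wiring is an assumption that is not shown here.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

class TransientRetrySketch {

    // Hypothetical stand-in for a driver exception that carries an error label.
    static class LabeledException extends RuntimeException {
        final String label;

        LabeledException(String label) {
            super(label);
            this.label = label;
        }
    }

    // Runs the whole unit of work again while the failure is classified as transient.
    // Non-transient failures are rethrown immediately; the last transient failure is
    // rethrown once maxAttempts is exhausted.
    static <T> T runWithRetry(Supplier<T> work, Predicate<RuntimeException> isTransient, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.get();
            } catch (RuntimeException e) {
                if (!isTransient.test(e)) {
                    throw e;
                }
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        Predicate<RuntimeException> isTransient = e ->
                e instanceof LabeledException
                        && "TransientTransactionError".equals(((LabeledException) e).label);

        // Simulated work: fails twice with a transient label, then succeeds.
        String result = runWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new LabeledException("TransientTransactionError");
            }
            return "created";
        }, isTransient, 5);

        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In the Spring setup from the question, the Supplier would wrap the call to create(payload) on the injected bean, so that the retry sits around the transactional proxy and not inside it.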

@InboundChannelAdapter in Spring Integration is not running continuously?

I am working with Spring Cloud Data Flow. My scenario is to read from the database and send the data to a Kafka topic using @InboundChannelAdapter.
This is the strategy I followed:
-> Create a common list and store the objects in it if the list is empty.
-> If the list already has data, do not poll.
-> Send the values to Kafka one by one by index, removing each element after it is sent.
If I keep the @Bean, only the first object in the list is inserted into the Kafka topic:
{"id":101443442,"name":"Mobile1","price":8000}
If I remove the @Bean, only empty data is inserted into Kafka:
{}
public static List<Product> products;
@Bean
public void initList() {
    products = new ArrayList<>();
}
@Bean
@InboundChannelAdapter(channel = TbeSource.PR1)
public MessageSource<Product> addProducts() {
    if (products.size() == 0) {
        products.add(new Product(101443442, "Mobile1", 8000));
        products.add(new Product(102235434, "book111", 6000));
    }
    MessageBuilder<Product> message = MessageBuilder.withPayload(products.get(0));
    products.remove(0);
    return message::build;
}
What am I doing wrong?
I need to send the data frequently, reading it from the DB.

It is really not clear what you are asking.
If you are talking about JDBC, you may consider using the JDBC Source from the out-of-the-box applications for Data Flow.
If you implement the logic to fetch data from the database yourself, you may consider using a JdbcPollingChannelAdapter from Spring Integration as the MessageSource behind the same @InboundChannelAdapter.
The rest of your logic with that list is not clear. It is strange to see @Bean on a void method. If you need to initialize products and access it from the MessageSource implementation, you just need private List<Product> products = new ArrayList<>();. Making the field public is really bad practice.
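Why only the first object reaches Kafka can be illustrated without Spring. The body of the @Bean method appears to run once, when the bean is created, and the returned message::build keeps building messages from the single payload captured at that moment. The following plain-Java sketch uses Supplier<String> as a stand-in for MessageSource.receive(); the class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

class PollingSourceSketch {

    // Mirrors the question: the list is read once, outside the lambda,
    // so every later poll returns the same captured element.
    static Supplier<String> capturedOnce(List<String> products) {
        String first = products.remove(0);
        return () -> first;
    }

    // Reads from the list inside the lambda, so every poll sees the current state.
    // Returns null when there is nothing left to send.
    static Supplier<String> polledEachTime(List<String> products) {
        return () -> products.isEmpty() ? null : products.remove(0);
    }

    public static void main(String[] args) {
        Supplier<String> once = capturedOnce(new ArrayList<>(Arrays.asList("Mobile1", "book111")));
        System.out.println(once.get()); // Mobile1
        System.out.println(once.get()); // Mobile1 again

        Supplier<String> each = polledEachTime(new ArrayList<>(Arrays.asList("Mobile1", "book111")));
        System.out.println(each.get()); // Mobile1
        System.out.println(each.get()); // book111
        System.out.println(each.get()); // null
    }
}
```

In the Spring version, the equivalent would be to return a MessageSource lambda that reads from the (private) list on every call and returns null when there is nothing to send, which Spring Integration treats as "no message for this poll".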

Pass data from one writer to another writer after reading from DB

I have to create a batch job that fetches data from one DB and, after processing, dumps that data into another DB, where an auto-generated ID is assigned to the persisted data. I need to send that data, along with the generated ID, to a Solace queue.
Reader(DB1) --data1--> Processor --data2--> Writer (DB2) --data3--> Writer (Solace Publisher)
I am using spring boot-2.2.5.RELEASE and spring-boot-starter-batch.
I have created a job having 1 step that read data from DB1 and write data to DB2 via RepositoryItemReader and RepositoryItemWriter respectively. This is working fine.
The next task is to send the persisted data, including the generated ID, to a Solace stream (using spring-cloud-starter-stream-solace).
I have the questions below. Please assist, as I am totally new to Spring Batch.
How can I get the complete record after it has been saved to DB2, based on some parameter? Do I have to write my own RepositoryItemWriter that has access to the StepExecution context, or can I somehow use the existing RepositoryItemWriter?
Once I have the record, I need to use the Solace stream, where I have a publish method that expects the record to be published as its argument. I think I again need to write my own ItemWriter. Should I use the record passed from the RepositoryItemWriter above through the StepExecution context, or should I query DB2 directly from here based on some parameter?
In either case I need to use the StepExecution context, but can I use the available RepositoryItemWriter or do I have to write my own?
Is there any other concept that would be handy in this case, instead of the approaches above?

Passing data to future steps is a common pattern in Spring Batch. According to the documentation (https://docs.spring.io/spring-batch/docs/current/reference/html/common-patterns.html#passingDataToFutureSteps), you can use the StepExecution to store and retrieve your generated IDs. In your case the writers also act as listeners, with methods annotated with @BeforeStep. For example:
public class DB2ItemWriter implements ItemWriter<Object> {
    private StepExecution stepExecution;
    public void write(List<? extends Object> items) throws Exception {
        // ...
        ExecutionContext stepContext = this.stepExecution.getExecutionContext();
        stepContext.put("generatedIds", ids);
    }
    @BeforeStep
    public void saveStepExecution(StepExecution stepExecution) {
        this.stepExecution = stepExecution;
    }
}
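and then you retrieve the IDs in the writer of the next step:
public class SolacePublisherItemWriter implements ItemWriter<Object> {
    private Object generatedIds;
    public void write(List<? extends Object> items) throws Exception {
        // ...
    }
    @BeforeStep
    public void retrieveGeneratedIds(StepExecution stepExecution) {
        // Read from the job-level context, as in the linked documentation. This assumes
        // the "generatedIds" key was promoted from the first step's context, for example
        // by an ExecutionContextPromotionListener registered on that step.
        JobExecution jobExecution = stepExecution.getJobExecution();
        ExecutionContext jobContext = jobExecution.getExecutionContext();
        this.generatedIds = jobContext.get("generatedIds");
    }
}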

I have created a job having 1 step that read data from DB1 and write data to DB2 via RepositoryItemReader and RepositoryItemWriter respectively. This is working fine.
I would add a second step that reads the data from the table (in which the records were persisted by step 1, with their IDs generated) and pushes it to Solace using a custom writer.
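If both writes should stay in a single step, another option that may be worth considering is to chain the two writers, so that the publisher receives the same item instances the database writer has just saved. Spring Batch ships CompositeItemWriter for this kind of delegation. The plain-Java sketch below only mimics the idea with Consumer, and it assumes that saving populates the generated ID on the same object instance (typical for a newly persisted JPA entity, but worth verifying for your repository). Item, the ID sequence and the publisher are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class ChainedWriterSketch {

    // Hypothetical record type; id stays null until the first writer "persists" it.
    static class Item {
        final String payload;
        Long id;

        Item(String payload) {
            this.payload = payload;
        }
    }

    // Calls each delegate in order with the same chunk of items.
    static Consumer<List<Item>> chain(List<Consumer<List<Item>>> delegates) {
        return items -> delegates.forEach(delegate -> delegate.accept(items));
    }

    public static void main(String[] args) {
        long[] sequence = {0};
        List<String> published = new ArrayList<>();

        // Stand-in for the DB2 writer: assigns a generated ID to each item.
        Consumer<List<Item>> dbWriter = items -> items.forEach(item -> item.id = ++sequence[0]);
        // Stand-in for the Solace publisher: sees the IDs assigned just before.
        Consumer<List<Item>> publisher = items -> items.forEach(item -> published.add(item.id + ":" + item.payload));

        chain(Arrays.asList(dbWriter, publisher))
                .accept(Arrays.asList(new Item("a"), new Item("b")));

        System.out.println(published); // [1:a, 2:b]
    }
}
```

The trade-off compared with a second step is that a failure in the publisher happens inside the same chunk as the database write, so the restart and rollback behaviour of the two approaches differs.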

Multi-tenancy: Managing multiple datasources with Spring Data JPA

I need to create a service that can manage multiple datasources.
These datasources do not necessarily exist when the app first starts. In fact, an endpoint will create new databases, and I would like to be able to switch to them and create data.
For example, let's say that I have 3 databases, A, B and C, then I start the app, I use the endpoint that creates D, then I want to use D.
Is that possible?
I know how to switch to other datasources if those exist, but I can't see any solutions for now that would make my request possible.
Have you got any ideas?
Thanks

To implement multi-tenancy with Spring Boot we can use AbstractRoutingDataSource as base DataSource class for all 'tenant databases'.
It has one abstract method, determineCurrentLookupKey, that we have to override. It tells the AbstractRoutingDataSource which of the tenant datasources it has to provide at the moment. Because it works in a multi-threaded environment, the chosen tenant should be stored in a ThreadLocal variable.
The AbstractRoutingDataSource stores the tenant datasources in its private Map<Object, Object> targetDataSources. The key of this map is a tenant identifier (for example, a String) and the value is the tenant datasource. To put our tenant datasources into this map we have to use its setter, setTargetDataSources.
The AbstractRoutingDataSource will not work without a 'default' datasource, which we have to set with the method setDefaultTargetDataSource(Object defaultTargetDataSource).
After we set the tenant datasources and the default one, we have to invoke the method afterPropertiesSet() to tell the AbstractRoutingDataSource to update its state.
So our 'MultiTenantManager' class can be like this:
@Configuration
public class MultiTenantManager {
    private final ThreadLocal<String> currentTenant = new ThreadLocal<>();
    private final Map<Object, Object> tenantDataSources = new ConcurrentHashMap<>();
    private final DataSourceProperties properties;
    private AbstractRoutingDataSource multiTenantDataSource;
    public MultiTenantManager(DataSourceProperties properties) {
        this.properties = properties;
    }
    @Bean
    public DataSource dataSource() {
        multiTenantDataSource = new AbstractRoutingDataSource() {
            @Override
            protected Object determineCurrentLookupKey() {
                return currentTenant.get();
            }
        };
        multiTenantDataSource.setTargetDataSources(tenantDataSources);
        multiTenantDataSource.setDefaultTargetDataSource(defaultDataSource());
        multiTenantDataSource.afterPropertiesSet();
        return multiTenantDataSource;
    }
    public void addTenant(String tenantId, String url, String username, String password) throws SQLException {
        DataSource dataSource = DataSourceBuilder.create()
                .driverClassName(properties.getDriverClassName())
                .url(url)
                .username(username)
                .password(password)
                .build();
        // Check that new connection is 'live'. If not - throw exception
        try (Connection c = dataSource.getConnection()) {
            tenantDataSources.put(tenantId, dataSource);
            multiTenantDataSource.afterPropertiesSet();
        }
    }
    public void setCurrentTenant(String tenantId) {
        currentTenant.set(tenantId);
    }
    private DriverManagerDataSource defaultDataSource() {
        DriverManagerDataSource defaultDataSource = new DriverManagerDataSource();
        defaultDataSource.setDriverClassName("org.h2.Driver");
        defaultDataSource.setUrl("jdbc:h2:mem:default");
        defaultDataSource.setUsername("default");
        defaultDataSource.setPassword("default");
        return defaultDataSource;
    }
}
Brief explanation:
the map tenantDataSources is our local tenant datasource storage, which we pass to the setTargetDataSources setter;
DataSourceProperties properties is used to get the database driver class name of the tenant database from spring.datasource.driverClassName in 'application.properties' (for example, org.postgresql.Driver);
the method addTenant is used to add a new tenant and its datasource to our local tenant datasource storage. We can do this on the fly, thanks to the method afterPropertiesSet();
the method setCurrentTenant(String tenantId) is used to 'switch' to the datasource of the given tenant. We can use this method, for example, in a REST controller when handling a request that works with the database. The request should contain the 'tenantId', for example in the X-TenantId header, which we can retrieve and pass to this method;
defaultDataSource() is built with an in-memory H2 database to avoid using the default database on the working SQL server.
Note: you must set the spring.jpa.hibernate.ddl-auto parameter to none to prevent Hibernate from changing the database schema. You have to create the schema of the tenant databases beforehand.
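The routing rule described above (look up the current thread's tenant key, fall back to the default when no key is set or the key is unknown) can be sketched in plain Java. This is only an illustration of the lookup logic, not of AbstractRoutingDataSource itself: strings stand in for datasources, and TenantRoutingSketch, resolve and clearCurrentTenant are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TenantRoutingSketch {

    private final ThreadLocal<String> currentTenant = new ThreadLocal<>();
    private final Map<String, String> targets = new ConcurrentHashMap<>();
    private final String defaultTarget;

    TenantRoutingSketch(String defaultTarget) {
        this.defaultTarget = defaultTarget;
    }

    // New tenants can be registered at any time, like addTenant above.
    void addTenant(String tenantId, String target) {
        targets.put(tenantId, target);
    }

    void setCurrentTenant(String tenantId) {
        currentTenant.set(tenantId);
    }

    // Clearing the key avoids leaking a tenant to the next task on a pooled thread.
    void clearCurrentTenant() {
        currentTenant.remove();
    }

    // A missing or unknown key resolves to the default target.
    String resolve() {
        String key = currentTenant.get();
        return key == null ? defaultTarget : targets.getOrDefault(key, defaultTarget);
    }

    public static void main(String[] args) throws InterruptedException {
        TenantRoutingSketch router = new TenantRoutingSketch("default-db");
        router.addTenant("D", "db-D");

        router.setCurrentTenant("D");
        System.out.println(router.resolve()); // db-D

        // Another thread has its own (empty) ThreadLocal value.
        Thread other = new Thread(() -> System.out.println(router.resolve())); // default-db
        other.start();
        other.join();

        router.clearCurrentTenant();
        System.out.println(router.resolve()); // default-db
    }
}
```

In a web application, the set and clear calls would typically sit at the start and end of request handling, for example around the point where the X-TenantId header is read.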
You can find a full example of this class, and more, in my repo.
UPDATED
This branch demonstrates an example of using a dedicated database to store tenant DB properties instead of property files (see the question of @MarcoGustavo below).

Implementation of DynamoDB for Spring Boot

I am trying to implement a backend DynamoDB for my Spring Boot application. But AWS recently updated their SDKs for DynamoDB. Therefore, almost all of the tutorials available on the internet, such as http://www.baeldung.com/spring-data-dynamodb, aren't directly relevant.
I've read through Amazon's SDK documentation regarding the DynamoDB class. Specifically, the way the object is instantiated and endpoints/regions set have been altered. In the past, constructing and setting endpoints would look like this:
@Bean
public AmazonDynamoDB amazonDynamoDB() {
    AmazonDynamoDB amazonDynamoDB = new AmazonDynamoDBClient(amazonAWSCredentials());
    if (!StringUtils.isEmpty(amazonDynamoDBEndpoint)) {
        amazonDynamoDB.setEndpoint(amazonDynamoDBEndpoint);
    }
    return amazonDynamoDB;
}
@Bean
public AWSCredentials amazonAWSCredentials() {
    return new BasicAWSCredentials(amazonAWSAccessKey, amazonAWSSecretKey);
}
However, the setEndpoint() method is now deprecated, and the AWS documentation states that we should construct the DynamoDB object through a builder:
AmazonDynamoDBClient() Deprecated. use AmazonDynamoDBClientBuilder.defaultClient()
This other StackOverflow post recommends using this strategy to instantiate the database connection object:
DynamoDB dynamoDB = new DynamoDB(
        AmazonDynamoDBClientBuilder.standard()
                .withEndpointConfiguration(new EndpointConfiguration("http://localhost:8000", "us-east-1"))
                .build());
Table table = dynamoDB.getTable("Movies");
But IntelliJ reports an error that DynamoDB is abstract and cannot be instantiated, and I cannot find any documentation on the proper class to extend.
In other words, I've scoured through tutorials, SO, and the AWS documentation, and haven't found what I believe is the correct way to create my client. Can someone provide an implementation that works? I'm specifically trying to set up a client with a local DynamoDB (endpoint at localhost port 8000).

I think I can take a stab at answering my own question. Using the developer guide here for DynamoDB Mapper you can implement a DynamoDB Mapper object that takes in your client and performs data services for you, like loading, querying, deleting, saving (essentially CRUD?). Here's the documentation I found helpful.
I created my own class called DynamoDBMapperClient with this code:
private AmazonDynamoDB amazonDynamoDB = AmazonDynamoDBClientBuilder.standard()
        .withEndpointConfiguration(new EndpointConfiguration(amazonDynamoDBEndpoint, amazonAWSRegion))
        .build();
private AWSCredentials awsCredentials = new AWSCredentials() {
    @Override
    public String getAWSAccessKeyId() {
        return null;
    }
    @Override
    public String getAWSSecretKey() {
        return null;
    }
};
private DynamoDBMapper mapper = new DynamoDBMapper(amazonDynamoDB);
public DynamoDBMapper getMapper() {
    return mapper;
}
It basically takes the endpoint and region configuration from a properties file, then instantiates a new mapper that is exposed through a getter.
I know this may not be the complete answer, so I'm leaving this unanswered, but at least it's a start and you guys can tell me what I'm doing wrong!
