I am reading up on AsyncCassandraOperations to perform asynchronous inserts and improve performance, as suggested in another post here, but I can't find much help on Google or in the Spring Data documentation.
Previously I used a Cassandra repository for all reads and inserts/updates, which I found to be very slow. Following that recommendation, I now use AsyncCassandraOperations for the insert operation alone, but it fails with the error "required a bean of type 'org.springframework.data.cassandra.core.AsyncCassandraOperations'".
What is the correct way to use AsyncCassandraOperations?
@Autowired private MyRepository repository_name;
@Autowired private AsyncCassandraOperations acops;
public void persist(List<POJO> l_POJO)
{
System.out.println("Enter Persist: "+new java.util.Date());
List<POJO> l_POJO_stale = repository_name.findBycol1AndStale("sample",false);
l_POJO_stale.forEach(s -> s.setStale(true));
l_POJO_stale.forEach(s -> acops.update(s));
try
{
acops.insert(l_POJO);
}
catch (Exception e)
{
System.out.println("Error in persisting new data");
}
}
It isn't clear whether you are using Spring Boot. If you are, an AsyncCassandraOperations bean (AsyncCassandraTemplate is the implementation class) should be created automatically.
If the error says an AsyncCassandraOperations bean is required, the most direct fix is to declare one yourself, as shown below.
@Bean
AsyncCassandraTemplate asyncCassandraTemplate(Session session) {
return new AsyncCassandraTemplate(session);
}
Since you are using the Spring Data Repository interface, you can also use the ReactiveCrudRepository interface to update or insert entity objects in Cassandra, as shown in this Spring Data example project, as an alternative to using the AsyncCassandraTemplate class.
If you use ReactiveCrudRepository for what you want to do, your code needs the following changes.
Change the return type of WRRepository.findByCol1AndCol2AndCol3(String, boolean, String) from List<WRpojo> to Flux<WRpojo>, in order to fully utilize the reactive functionality.
Change the return type of persist(List<WRpojo>) from boolean to Mono<Void>, making the result non-blocking too.
Change your persist(List<WRpojo>) to the following.
public Mono<Void> persist(List<WRpojo> l_wr) {
Flux<WRpojo> l_old_wr = objWRRepository.findByCol1AndCol2AndCol3("1", false, "2").doOnNext(s -> s.setStale(true));
return objWRRepository.saveAll(l_old_wr).thenMany(objWRRepository.saveAll(l_wr)).then();
}
In reactive programming we generally don't block, so the returned Mono<Void> has to be subscribed to somewhere downstream; otherwise nothing runs. If you do want to block and wait for all operations to complete, you can call block() on the Mono<Void>, although that is not recommended.
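The difference between returning a handle and blocking on it can be sketched with the JDK alone. This is only an analogy: CompletableFuture stands in for Mono<Void> so the example runs without Reactor, and FakeStore with its methods is a hypothetical name, not a Spring Data API. Unlike a Mono, a CompletableFuture starts running as soon as it is created, so the "nothing happens until subscribed" behaviour is not modelled here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// JDK-only stand-in: CompletableFuture plays the role of Mono<Void>.
// FakeStore is hypothetical; it only records what was "saved".
class FakeStore {
    final List<String> saved = Collections.synchronizedList(new ArrayList<>());

    // Pretend asynchronous save that completes on another thread.
    CompletableFuture<Void> saveAll(List<String> rows) {
        return CompletableFuture.runAsync(() -> saved.addAll(rows));
    }

    // Chain the two saves; nothing in this method waits for the result.
    CompletableFuture<Void> persist(List<String> stale, List<String> fresh) {
        return saveAll(stale).thenCompose(ignored -> saveAll(fresh));
    }
}

class PersistDemo {
    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        CompletableFuture<Void> done = store.persist(List.of("old-1"), List.of("new-1", "new-2"));
        // Non-blocking style: attach a callback to run when both saves finish.
        CompletableFuture<Void> logged =
                done.thenRun(() -> System.out.println("saved " + store.saved.size() + " rows"));
        // Blocking style, the counterpart of block(): kept at the very edge of the program.
        logged.join();
    }
}
```

The caller of persist decides whether to wait; the method itself never does, which is the property the Mono<Void> return type is meant to preserve.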
I'm trying to migrate my project to Quarkus Reactive with Hibernate Reactive Panache and I'm not sure how to deal with caching.
My original method looked like this
@Transactional
@CacheResult(cacheName = "subject-cache")
public Subject getSubject(@CacheKey String subjectId) throws Exception {
return subjectRepository.findByIdentifier(subjectId);
}
The Subject is loaded from the cache, if available, by the cache key "subjectId".
Migrating to Mutiny would look like this
@CacheResult(cacheName = "subject-cache")
public Uni<Subject> getSubject(@CacheKey String subjectId) {
return subjectRepository.findByIdentifier(subjectId);
}
However, it can't be right to store the Uni object in the cache.
There is also the option to inject the cache as a bean; however, the fallback function does not support returning a Uni:
@Inject
@CacheName("subject-cache")
Cache cache;
//does not work, cache.get function requires return type Subject, not Uni<Subject>
public Uni<Subject> getSubject(String subjectId) {
return cache.get(subjectId, s -> subjectRepository.findByIdentifier(subjectId));
}
//This works, needs blocking call to repo, to return response wrapped in new Uni
public Uni<Subject> getSubject(String subjectId) {
return cache.get(subjectId, s -> subjectRepository.findByIdentifier(subjectId).await().indefinitely());
}
Can the @CacheResult annotation be used with Uni / Multi, with everything handled correctly under the hood?
Your example with @CacheResult on a method that returns Uni should actually work. The implementation will automatically "strip" the Uni type and only store the Subject in the cache.
The problem with caching Unis is that, depending on how the Uni is created, multiple subscriptions can trigger some code multiple times. To avoid this you have to memoize the Uni like this:
@CacheResult(cacheName = "subject-cache")
public Uni<Subject> getSubject(@CacheKey String subjectId) {
return subjectRepository.findByIdentifier(subjectId)
.memoize().indefinitely();
}
This will ensure that every subscription to the cached Uni will always return the same value (item or failure) without re-executing anything of the original Uni flow.
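The effect of memoizing can be illustrated without Mutiny. The sketch below models a lazy asynchronous value as a Supplier of CompletableFuture, where each call to get() plays the role of a new subscription. LazyLookup, lookup and memoize are hypothetical names chosen for this example; this is not how Mutiny implements memoize().

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// JDK-only sketch: Supplier<CompletableFuture<T>> models a lazy async value.
class LazyLookup {
    // Counts how often the underlying lookup actually ran.
    static final AtomicInteger loads = new AtomicInteger();

    // Every get() re-runs the lookup, like re-subscribing to a plain Uni.
    static Supplier<CompletableFuture<String>> lookup(String id) {
        return () -> CompletableFuture.supplyAsync(() -> {
            loads.incrementAndGet();
            return "subject-" + id;
        });
    }

    // Runs the source at most once and hands the same future to every caller.
    static <T> Supplier<CompletableFuture<T>> memoize(Supplier<CompletableFuture<T>> source) {
        return new Supplier<CompletableFuture<T>>() {
            private CompletableFuture<T> cached;

            @Override
            public synchronized CompletableFuture<T> get() {
                if (cached == null) {
                    cached = source.get();
                }
                return cached;
            }
        };
    }
}
```

Calling get() twice on the plain supplier runs the lookup twice, while the memoized supplier runs it once and replays the same result (value or failure) to later callers.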
I am working with Spring Cloud Data Flow. I have a scenario where I read from the database and send the data to a Kafka topic using @InboundChannelAdapter.
Below is the strategy I followed.
-> Created a common list to store the objects if the list is empty
-> If the list has data, I don't poll
-> I send the values to Kafka one by one by index, and then remove that index
If I keep the @Bean, only the first object in the list is inserted into the Kafka topic.
{"id":101443442,"name":"Mobile1","price":8000}
If I remove the @Bean, only empty data is inserted into Kafka.
{}
public static List<Product> products;
@Bean
public void initList() {
products = new ArrayList<>();
}
@Bean
@InboundChannelAdapter(channel = TbeSource.PR1)
public MessageSource<Product> addProducts() {
if (products.size() == 0) {
products.add(new Product(101443442, "Mobile1", 8000));
products.add(new Product(102235434, "book111", 6000));
}
MessageBuilder<Product> message = MessageBuilder.withPayload(products.get(0));
products.remove(0);
return message::build;
}
What am I doing wrong?
I need to send the data frequently by reading it from the DB.
It is really not clear what you are asking.
If you are talking about JDBC, you may consider using the JDBC Source from the out-of-the-box applications for Data Flow.
If you are writing the logic to read from the database yourself, you may consider using a JdbcPollingChannelAdapter from Spring Integration for the same @InboundChannelAdapter purpose.
The rest of your logic with that list is not clear. It is strange to see @Bean on a void method. If you need to initialize products and access it from the MessageSource implementation, you just need private List<Product> products = new ArrayList<>();. Making the property public is really bad practice.
I'm using JPA, Spring Data and entity listeners to audit my entities, specifically on postUpdate, postPersist and postRemove.
This is pseudocode of my entity listener class:
public class EntityListener extends AuditingEntityListener {
@PostUpdate
public void postPersist(Object auditedEntity) {
writer.saveEntity(auditedEntity,"UPDATE");
}
}
This is the pseudocode of the Writer class:
public class Writer {
@Async
public void saveEntity(Object auditedEntity, String action) {
try {
//some code to prepare the history entity
historyDAO.save(entity);
} catch (Exception e) {
// the exception is swallowed here
}
}
}
When an exception is thrown in the Writer class, the auditedEntity is still updated or inserted, but the historyEntity where I store the audit action is not.
The problem is that I need to invoke the saveEntity method in another thread for performance reasons (@Async), but in that case a new transaction is opened instead of the one that was already open being reused.
How can I solve the rollback issue for both transactions, so that when an exception is thrown neither historyEntity nor auditedEntity is persisted?
I understand that you want to rollback both the child and the parent transaction when an exception is thrown from within Writer.saveEntity.
The problem is that the thread with the original transaction would still need to wait for all these complicated operations to finish before it could mark the transaction as committed. You can't easily span a transaction across multiple threads, either.
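Why a transaction does not follow the work onto another thread can be shown with plain thread-local storage, which is what Spring uses to bind the current transaction's resources to a thread (via TransactionSynchronizationManager). The sketch below is JDK-only; ThreadBoundContext and CURRENT_TX are hypothetical names standing in for that binding.

```java
import java.util.concurrent.CompletableFuture;

// JDK-only sketch: a ThreadLocal stands in for the thread-bound transaction holder.
class ThreadBoundContext {
    static final ThreadLocal<String> CURRENT_TX = new ThreadLocal<>();

    // Reads the holder from a brand-new thread; that thread has its own empty slot.
    static String seenOnOtherThread() {
        return CompletableFuture
                .supplyAsync(CURRENT_TX::get, task -> new Thread(task).start())
                .join();
    }
}
```

If the calling thread sets CURRENT_TX to some value, seenOnOtherThread() still returns null, which mirrors an asynchronous method not seeing the caller's transaction and starting its own.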
The only thing you could probably do to speed things up is you could run the logic of generating the history entities in parallel, and then save them all just before the transaction commits.
One way of doing that that I can think of is using a Hibernate interceptor:
public class AuditInterceptor extends EmptyInterceptor {
private List<Callable<BaseEntity>> historyEntries;
private ExecutorService executor;
...
public void beforeTransactionCompletion(Transaction tx) {
List<Future<BaseEntity>> futures = executor.invokeAll(historyEntries);
if (executor.awaitTermination(/* some timeout here */)) {
futures.stream().map(Future::get).forEach(entity -> session.save(entity));
} else {
/* rollback */
}
}
}
Your listener code then becomes:
@PostUpdate
public void postPersist(Object auditedEntity) {
interceptor.getHistoryEntries().add(new Callable<BaseEntity>() {
/* history entry generation logic goes here */
});
}
(note that the above code is greatly simplified, you could use any other asynchronous execution API, the basic idea is that you need to block in AuditInterceptor.beforeTransactionCompletion, waiting for all the history entries to be generated)
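The "generate in parallel, save just before commit" idea can be sketched with the JDK alone. In this sketch history entries are plain strings and saving means appending to a list; HistoryBatch, enqueue and flush are hypothetical names, not Hibernate types. It relies on ExecutorService.invokeAll, which already blocks until every task has finished, so no separate awaitTermination call is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// JDK-only sketch of deferring history generation until just before commit.
class HistoryBatch {
    private final List<Callable<String>> pending = new ArrayList<>();
    final List<String> saved = new ArrayList<>();

    // Called from the listener: queue the (possibly expensive) generation work.
    void enqueue(String entityName, String action) {
        pending.add(() -> action + ":" + entityName);
    }

    // Called once before commit: generate in parallel, then save on the calling thread.
    void flush() {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            // invokeAll blocks until all tasks are done and keeps the task order.
            List<Future<String>> futures = executor.invokeAll(pending);
            for (Future<String> future : futures) {
                // get() rethrows a task failure wrapped in ExecutionException.
                saved.add(future.get());
            }
            pending.clear();
        } catch (InterruptedException | ExecutionException e) {
            // Surface the failure so the surrounding transaction can be rolled back.
            throw new IllegalStateException("history generation failed", e);
        } finally {
            executor.shutdown();
        }
    }
}
```

Because the saves happen on the thread that calls flush(), they stay inside that thread's transaction, and any generation failure propagates to it as an exception.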
However, I would strongly advise against using the above technique, as it is rather complicated and error prone.
If you look here: https://docs.jboss.org/hibernate/orm/5.1/userguide/html_single/chapters/events/Events.html, you'll find that Hibernate interceptors have more interesting methods that could help you gather auditing info, and that perhaps your implementation could make use of them, possibly avoiding the need for complicated logic altogether (Hibernate already does track changes to fields of individual entities, so you get that information for free).
Why reinvent the wheel, though? If you dig even deeper, you'll find the Hibernate Envers module (http://hibernate.org/orm/envers/, works for both JPA and pure Hibernate) which gives you business auditing out of the box. Envers already digs into the above mechanism, so hopefully the performance issue would go away.
Final note: have you measured how long history entry generation takes? I would guess that executing for loops and if statements might be cheaper than database access operations. If I were you, I wouldn't do any of the above unless I was absolutely sure that's where the performance bottleneck was.
I am trying to implement a backend DynamoDB for my Spring Boot application. But AWS recently updated their SDKs for DynamoDB. Therefore, almost all of the tutorials available on the internet, such as http://www.baeldung.com/spring-data-dynamodb, aren't directly relevant.
I've read through Amazon's SDK documentation regarding the DynamoDB class. Specifically, the way the object is instantiated and endpoints/regions set have been altered. In the past, constructing and setting endpoints would look like this:
@Bean
public AmazonDynamoDB amazonDynamoDB() {
AmazonDynamoDB amazonDynamoDB
= new AmazonDynamoDBClient(amazonAWSCredentials());
if (!StringUtils.isEmpty(amazonDynamoDBEndpoint)) {
amazonDynamoDB.setEndpoint(amazonDynamoDBEndpoint);
}
return amazonDynamoDB;
}
@Bean
public AWSCredentials amazonAWSCredentials() {
return new BasicAWSCredentials(
amazonAWSAccessKey, amazonAWSSecretKey);
}
However, the setEndpoint() method is now deprecated, and the AWS documentation states that we should construct the DynamoDB object through a builder:
AmazonDynamoDBClient() Deprecated. use
AmazonDynamoDBClientBuilder.defaultClient()
This other StackOverflow post recommends using this strategy to instantiate the database connection object:
DynamoDB dynamoDB = new DynamoDB(AmazonDynamoDBClientBuilder.standard().withEndpointConfiguration(new EndpointConfiguration("http://localhost:8000", "us-east-1")).build());
Table table = dynamoDB.getTable("Movies");
However, IntelliJ reports that DynamoDB is abstract and cannot be instantiated, and I cannot find any documentation on the proper class to extend.
In other words, I've scoured through tutorials, SO, and the AWS documentation, and haven't found what I believe is the correct way to create my client. Can someone provide an implementation that works? I'm specifically trying to set up a client with a local DynamoDB (endpoint at localhost port 8000).
I think I can take a stab at answering my own question. Using the developer guide here for DynamoDB Mapper you can implement a DynamoDB Mapper object that takes in your client and performs data services for you, like loading, querying, deleting, saving (essentially CRUD?). Here's the documentation I found helpful.
I created my own class called DynamoDBMapperClient with this code:
private AmazonDynamoDB amazonDynamoDB = AmazonDynamoDBClientBuilder.standard().withEndpointConfiguration(
new EndpointConfiguration(amazonDynamoDBEndpoint, amazonAWSRegion)).build();
private AWSCredentials awsCredentials = new AWSCredentials() {
@Override
public String getAWSAccessKeyId() {
return null;
}
@Override
public String getAWSSecretKey() {
return null;
}
};
private DynamoDBMapper mapper = new DynamoDBMapper(amazonDynamoDB);
public DynamoDBMapper getMapper() {
return mapper;
}
It basically takes the endpoint and region configuration from a properties file, then instantiates a new mapper that is accessed through a getter.
I know this may not be the complete answer, so I'm leaving this unanswered, but at least it's a start and you guys can tell me what I'm doing wrong!
I currently have a setup where data is inserted into a database, as well as indexed into Solr. These two steps are wrapped in a Spring-managed transaction via the @Transactional annotation. What I've noticed is that spring-data-solr issues an update with the following parameters whenever the transaction is closed: params{commit=true&softCommit=false&waitSearcher=true}
@Transactional
public void save(Object toSave){
dbRepository.save(toSave);
solrRepository.save(toSave);
}
The rate of commits into Solr is fairly high, so ideally I'd like to send data to the Solr index and have Solr auto-commit at regular intervals. I have autoCommit (and autoSoftCommit) set in my solrconfig.xml, but since spring-data-solr is sending those commit parameters, it does a hard commit every time.
I'm aware that I can drop down to the SolrTemplate API and issue commits manually, but I would like to keep the Solr repository.save call within a Spring-managed transaction if possible. Is there a way to modify the parameters that are sent to Solr on commit?
After putting in an IDE debug breakpoint in org.springframework.data.solr.repository.support.SimpleSolrRepository here:
private void commitIfTransactionSynchronisationIsInactive() {
if (!TransactionSynchronizationManager.isSynchronizationActive()) {
this.solrOperations.commit(solrCollectionName);
}
}
I discovered that marking my code as @Transactional (plus the other details needed for the framework to actually begin and end a transaction) doesn't achieve what we expect with "Spring Data for Apache Solr". The stack trace shows the proxy and transaction interceptor classes for our code's transactional scope, but it also shows the framework starting its own nested transaction with another proxy and transaction interceptor of its own. When the framework exits the CrudRepository.save() method that my code calls, the commit to Solr is performed by the framework's nested transaction, before our outer transaction exits. So the attempt to batch many saves with one commit at the end, instead of one commit per save, is futile. It seems that, for this area of my code, I'll have to use SolrJ to save (update) my entities to Solr and then follow the exit of "my" transaction with a commit.
If you are using Spring Solr, I found that the SolrTemplate bean lets you 'batch' updates when adding data to the Solr index. With the SolrTemplate bean you can use the "saveBeans" method, which adds a collection to the index and does not commit until the end of the transaction. In my case, I started out using solrClient.add() while iterating over my collection, and it took up to 4 hours to save it to the index, because it committed after every single save. With solrTemplate.saveBeans(Collection<?>), it finishes in just over 1 second, as the commit covers the entire collection. Here is a code snippet:
@Resource
SolrTemplate solrTemplate;
public void doReindexing(List<Image> images) {
if (images != null) {
/* CMSSolrImage is a class with @SolrDocument mappings.
* the List<Image> images is a collection pulled from my database
* I want indexed in Solr.
*/
List<CMSSolrImage> sImages = new ArrayList<CMSSolrImage>();
for (Image image : images) {
CMSSolrImage sImage = new CMSSolrImage(image);
sImages.add(sImage);
}
solrTemplate.saveBeans(sImages);
}
}
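The cost difference described above comes down to how many commits are issued. A JDK-only sketch with a hypothetical in-memory index (CountingIndex is not a Solr client and only counts commits) makes the two patterns explicit:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical in-memory index that records documents and counts commits.
class CountingIndex {
    final List<String> docs = new ArrayList<>();
    int commits = 0;

    void add(String doc) {
        docs.add(doc);
    }

    void commit() {
        commits++;
    }

    // One commit per document: the slow pattern.
    void saveOneByOne(Collection<String> batch) {
        for (String doc : batch) {
            add(doc);
            commit();
        }
    }

    // Add the whole collection, then commit once.
    void saveBatch(Collection<String> batch) {
        batch.forEach(this::add);
        commit();
    }
}
```

For a collection of N documents the first method issues N commits and the second issues one, which is where the savings would be expected to come from when each commit is expensive.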
The way I've done something similar is to create a custom repository implementation of the save methods.
Interface for the repository:
public interface FooRepository extends SolrCrudRepository<Foo, String>, FooRepositoryCustom {
}
Interface for the custom overrides:
public interface FooRepositoryCustom {
public Foo save(Foo entity);
public Iterable<Foo> save(Iterable<Foo> entities);
}
Implementation of the custom overrides:
public class FooRepositoryImpl implements FooRepositoryCustom {
private SolrOperations solrOperations;
public FooRepositoryImpl(SolrOperations fooSolrOperations) {
this.solrOperations = fooSolrOperations;
}
@Override
public Foo save(Foo entity) {
Assert.notNull(entity, "Cannot save 'null' entity.");
registerTransactionSynchronisationIfSynchronisationActive();
this.solrOperations.saveBean(entity, 1000);
commitIfTransactionSynchronisationIsInactive();
return entity;
}
@Override
public Iterable<Foo> save(Iterable<Foo> entities) {
Assert.notNull(entities, "Cannot insert 'null' as a List.");
if (!(entities instanceof Collection<?>)) {
throw new InvalidDataAccessApiUsageException("Entities have to be inside a collection");
}
registerTransactionSynchronisationIfSynchronisationActive();
this.solrOperations.saveBeans((Collection<Foo>) entities, 1000);
commitIfTransactionSynchronisationIsInactive();
return entities;
}
private void registerTransactionSynchronisationIfSynchronisationActive() {
if (TransactionSynchronizationManager.isSynchronizationActive()) {
registerTransactionSynchronisationAdapter();
}
}
private void registerTransactionSynchronisationAdapter() {
TransactionSynchronizationManager.registerSynchronization(SolrTransactionSynchronizationAdapterBuilder
.forOperations(this.solrOperations).withDefaultBehaviour());
}
private void commitIfTransactionSynchronisationIsInactive() {
if (!TransactionSynchronizationManager.isSynchronizationActive()) {
this.solrOperations.commit();
}
}
}
You also need to provide a SolrOperations bean for the right Solr core:
@Configuration
public class FooSolrConfig {
@Bean
public SolrOperations getFooSolrOperations(SolrClient solrClient) {
return new SolrTemplate(solrClient, "foo");
}
}
Footnote: auto commit is (to my mind) conceptually incompatible with a transaction. An auto commit is a promise from Solr that it will try to start writing the data to disk within a certain time limit. Many things might stop that from actually happening, however: an ill-timed power or hardware failure, mismatches between the document and the schema, etc. But the client won't know that Solr failed to keep its promise, and the transaction will see a success when it actually failed.