Should I use ALLOW FILTERING in Cassandra to delete associated entities in a multi-tenant app? - spring-boot

I have a spring-boot project and I am using Cassandra as the database. My application is tenant-based and all my tables include the tenantId. It is always part of the partition key of every table, but the partition keys also contain other columns.
The problem is that I want to remove a specific tenant from my database, but I can't do it with a direct delete, because I would also need the other part of the partition key.
I have two solutions in mind:
1. Enable ALLOW FILTERING, select all the tenant-specific entities, and then remove them one by one in the application.
2. Use the findAll() method to fetch all the data, filter it in the application, and then delete all the tenant-specific data.
Example:
public class DeleteTenant {

    @Autowired
    private MyRepository myRepo;

    public void cleanTenantWithoutDbFiltering(String tenantId) {
        myRepo.findAll()
              .stream()
              .filter(entity -> entity.getTenantId().equals(tenantId)) // ??
              .forEach(myRepo::delete);
    }

    public void cleanTenantWithDbFiltering(String tenantId) {
        myRepo.getTenantSpecificData(tenantId)
              .forEach(myRepo::delete);
    }
}
My getTenantSpecificData(String tenantId) query would look like this:

@AllowFiltering
@Query("SELECT * FROM myTable WHERE tenantId = ?1 ALLOW FILTERING")
public List<MyEntity> getTenantSpecificData(String tenantId);
Do you have any other ideas? If not, which one do you think would be more efficient: filtering in the application itself, or in Cassandra?
Thanks in advance for your answers!

It isn't clear to me how you've modelled your data because you haven't provided examples of your schema, but in any case ALLOW FILTERING is never going to be a good idea: unless the tenant ID on its own is the partition key, the query has to do a full table scan of every relevant table.
You will need to come up with a different approach, such as writing a Spark app that efficiently goes through the tables to identify the partitions/rows to delete. Cheers!
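For illustration, here is a rough sketch of that kind of Spark job using the Spark Cassandra Connector and the DataStax Java driver. The keyspace, table, and column names (my_keyspace, my_table, tenantid, otherkey) and the connection setup are assumptions for the example, not something from the question:

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;
import org.apache.spark.api.java.function.ForeachPartitionFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.col;

public class TenantCleaner {

    // Assumed schema: my_keyspace.my_table with partition key (tenantid, otherkey)
    public void deleteTenant(SparkSession spark, String tenantId) {
        Dataset<Row> keys = spark.read()
                .format("org.apache.spark.sql.cassandra")
                .option("keyspace", "my_keyspace")
                .option("table", "my_table")
                .load()
                .filter(col("tenantid").equalTo(tenantId))
                .select("tenantid", "otherkey")   // only the partition key columns
                .distinct();

        // Delete partition by partition with the Java driver on each executor.
        keys.foreachPartition((ForeachPartitionFunction<Row>) rows -> {
            // Connection details omitted; the driver picks them up from its configuration.
            try (CqlSession session = CqlSession.builder().build()) {
                PreparedStatement delete = session.prepare(
                        "DELETE FROM my_keyspace.my_table WHERE tenantid = ? AND otherkey = ?");
                rows.forEachRemaining(row ->
                        session.execute(delete.bind(row.getAs("tenantid"), row.getAs("otherkey"))));
            }
        });
    }
}

The idea is to pay for the full scan once, in a batch job, instead of running ALLOW FILTERING queries against the cluster from the online application.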

Related

Spring Data- how to tell spring what entities to retrieve

If I have several entities, let's say:
@Entity
class Book {
    String name;
    Author author;
}

@Entity
class Author {
    String name;
    City hometown;
}

@Entity
class City {
    String cityName;
}
If I want to retrieve all the books using a classic JPA repository with Spring Data and just call findAll(), it will return all the books with all the authors and all their hometowns. I know I can use @JsonIgnore, but I think that only prevents what is being returned, not what is being looked up in the database. I also have methods that DO want to return both books and authors, so @JsonIgnore-ing does not work for me. Is there anything like this, a way to tell Spring Data what to look up and what to return? Any links, guides, or methods I don't know of would be appreciated.
Spring Data has the concept of 'projections' which allow you to return different representations of the same Entity.
Official Documentation:
Spring Data query methods usually return one or multiple instances of the aggregate root managed by the repository. However, it might sometimes be desirable to create projections based on certain attributes of those types. Spring Data allows modeling dedicated return types, to more selectively retrieve partial views of the managed aggregates.
https://docs.spring.io/spring-data/jpa/docs/current/reference/html/#projections
Where a projection is a 'closed' projection (a projection interface whose accessor methods all match properties of the target aggregate), the documentation additionally notes:
Spring Data can optimize the query execution [to select only the relevant fields], because we know about all the attributes that are needed to back the projection proxy
https://docs.spring.io/spring-data/jpa/docs/current/reference/html/#projections.interfaces.closed
Spring Data also allows for Projections to be specified dynamically at runtime. See further:
https://github.com/spring-projects/spring-data-commons/blob/master/src/main/asciidoc/repository-projections.adoc#dynamic-projections
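For example, a closed projection for the Book/Author entities from the question might look roughly like this (the interface and method names are illustrative, and Book is assumed to have a Long id):

import java.util.List;
import org.springframework.data.jpa.repository.JpaRepository;

// Closed projection: every accessor matches a property of Book or of the nested Author.
interface BookSummary {

    String getName();

    AuthorSummary getAuthor();

    interface AuthorSummary {
        String getName();
    }
}

interface BookRepository extends JpaRepository<Book, Long> {

    // Static projection: Spring Data returns BookSummary proxies instead of full Book entities.
    List<BookSummary> findAllProjectedBy();

    // Dynamic projection: the caller chooses the representation at runtime.
    <T> List<T> findByNameContaining(String name, Class<T> type);
}

Calling findAllProjectedBy() then yields proxies exposing only the book name and the author's name, without exposing the City association.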
First, mark your relations as LAZY.
Then specify what data needs to be fetched on a per-query basis.
See for example:
https://vladmihalcea.com/eager-fetching-is-a-code-smell/
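A rough sketch of that approach for the same Book example, reusing the Author and City entities from the question (with @Id fields assumed added to them); the fetch-join query is illustrative:

import java.util.List;
import javax.persistence.*;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;

@Entity
class Book {

    @Id
    @GeneratedValue
    Long id;

    String name;

    // @ManyToOne is EAGER by default, so the lazy fetch type has to be made explicit.
    @ManyToOne(fetch = FetchType.LAZY)
    Author author;
}

interface LazyBookRepository extends JpaRepository<Book, Long> {

    // findAll() now loads books only; authors stay as uninitialized proxies.

    // An explicit fetch join for the call sites that really need authors and their cities.
    @Query("select b from Book b join fetch b.author a join fetch a.hometown")
    List<Book> findAllWithAuthorAndCity();
}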

Method in Entity needs to load all data from the aggregate, how to optimize this?

I have a problem with an aggregate that will grow over time.
One day there will be thousands of records and performance is going to suffer.
@Entity
public class Serviceman ... {

    @ManyToMany(mappedBy = "servicemanList")
    private List<ServiceJob> services = new ArrayList<>();

    ...

    public Optional<ServiceJob> firstServiceJobAfterDate(LocalDateTime dateTime) {
        return services.stream()
                .filter(i -> i.getStartDate().isAfter(dateTime))
                .min(Comparator.comparing(ServiceJob::getStartDate));
    }
}
The method loads all ServiceJob entities just to get one of them.
Maybe I should delegate this method to a service with a native SQL query?
You have to design small aggregates instead of large ones.
This essay explains in detail how to do it: http://dddcommunity.org/library/vernon_2011/. It explains how to decompose your aggregates to smaller ones so you can manage the complexity.
In your case, instead of having an aggregate consisting of two entities (Serviceman and ServiceJob, with Serviceman being the aggregate root), you can decompose it into two smaller aggregates with a single entity each. ServiceJob will reference Serviceman by ID, and you can use a ServiceJobRepository to make queries.
In your example you would have ServiceJobRepository.firstServiceJobAfterDate(UUID servicemanId, LocalDateTime date).
This way, if you have a lot of entities and need to scale, you can store the ServiceJob entities on another DB server.
If for some reason Serviceman or ServiceJob need references to each other to do their work, you can use a service that loads both aggregates through ServicemanRepository and ServiceJobRepository and passes them to one another so they can do their work.
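A sketch of what that repository method could look like with Spring Data, assuming the decomposed ServiceJob entity carries a servicemanId and a startDate field (the names are illustrative):

import java.time.LocalDateTime;
import java.util.Optional;
import java.util.UUID;
import org.springframework.data.jpa.repository.JpaRepository;

interface ServiceJobRepository extends JpaRepository<ServiceJob, UUID> {

    // Lets the database find the earliest matching ServiceJob instead of
    // loading the whole aggregate into memory and filtering it there.
    Optional<ServiceJob> findFirstByServicemanIdAndStartDateAfterOrderByStartDateAsc(
            UUID servicemanId, LocalDateTime dateTime);
}

With that in place, the Serviceman aggregate no longer needs to hold the services collection at all.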

How to establish one to many relationship in Dynamo DB?

I have 2 different JSON files, one with user details and the other with order details. The order details table has a user_id column identifying the user the order belongs to. I have to create a DynamoDB table that has the order details nested inside the user details and insert the values from the JSON files into this table using a spring-boot app. Can someone help me with this? Is there any example code?
DynamoDB is NOT a relational DB so you can't have relations per se.
However, you have two ways (at least those come to my mind) to achieve what you want.
1) Have two tables, User and Order, the latter with a userId field. When you load an Order, take its userId and also load the corresponding User by that id.
2) In your User.java you can have a field List<Order> orders. Then you need to create Order.java and annotate that class with @DynamoDBDocument. This enables you to have custom objects in your @DynamoDBTable classes. Also, do not forget the getters and setters, since they are required.
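A minimal sketch of option 2 with the AWS SDK v1 DynamoDBMapper annotations (the table, class, and attribute names are illustrative):

import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBAttribute;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBDocument;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;
import java.util.List;

@DynamoDBTable(tableName = "User")
class User {

    private String userId;
    private List<Order> orders;   // order details are stored nested inside the user item

    @DynamoDBHashKey(attributeName = "user_id")
    public String getUserId() { return userId; }
    public void setUserId(String userId) { this.userId = userId; }

    @DynamoDBAttribute(attributeName = "orders")
    public List<Order> getOrders() { return orders; }
    public void setOrders(List<Order> orders) { this.orders = orders; }
}

@DynamoDBDocument
class Order {

    private String orderId;

    @DynamoDBAttribute(attributeName = "order_id")
    public String getOrderId() { return orderId; }
    public void setOrderId(String orderId) { this.orderId = orderId; }
}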

Joining tables in two separate databases with .Net Core 2.1 / EF Core

I have a .Net Core 2.1 Web API which talks to two MySQL databases. Therefore I have two DbContexts, each with a connection string pointing to the relevant database.
In one of my controller actions, I need to return data which requires a join between two tables, one from each database. Is it possible to do this?
As an example, a simple controller action to retrieve data might look something like this:
[HttpGet]
public IEnumerable<Employee> GetEmployees()
{
    return _context.Employees
        .Include(e => e.Departments);
}
That example uses only one DbContext, because both the employee and department tables are in the same database, and therefore both their DbSets are defined in the same DbContext.
But what if the employee table was in one database and the department table was in another? Then the DbSets for employee and department would be defined in different DbContexts. How could I handle the join in that case, so that the "Include" in the example above works properly?
I would imagine that I would have to inject both DbContexts into this controller. But I'm not sure where to go from there...
In my case, both databases are MySQL databases, and both are on the same server, so that is the only scenario I'm interested in.
After more research, it looks like I could maybe use raw SQL to achieve this. However, what I ended up doing is creating a view on the server which does all the necessary joins, and then I simply call this view...

Database specific queries in a Spring Hibernate application

In a DAO class implementation, I want to use a different SQL query depending on the underlying database. Since my SQL query is complex (it selects from a database view, uses the UNION keyword, and uses database-specific functions), I cannot use JPQL (or HQL). I did some searching on Stack Overflow and the threads suggest that a good way would be to find out the dialect used by the application. Can anyone provide a code example?
EDIT: My apologies, I did not explain my question well enough. In my DAO class implementation, I want to determine the database (say MySQL or Oracle) on which my application is running and then execute the appropriate query. I need something like jdbcTemplate.findOutTheDialect().
JPA has native queries for that.
You can find an example here.
You can use Spring's JdbcTemplate to connect to and query your database.
For example:
String query = "SELECT COUNTRY_NAME FROM COUNTRY WHERE MCC_COUNTRY NOT IN ('Reserved') ORDER BY COUNTRY_NAME";
jdbcTemplate.query(query, new ObjectRowMapper());
Here "ObjectRowMapper" would be a RowMapper implementation that maps the returned ResultSet rows to a list of objects.
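If the goal is to pick the query based on the database vendor (the EDIT above), one way is to read the JDBC metadata through the JdbcTemplate's DataSource. A rough sketch, with placeholder SQL and an assumed CountryDao class:

import java.sql.Connection;
import java.sql.SQLException;
import java.util.List;
import org.springframework.jdbc.core.JdbcTemplate;

public class CountryDao {

    private final JdbcTemplate jdbcTemplate;

    public CountryDao(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    public List<String> findCountryNames() {
        // Choose the vendor-specific SQL; the statements themselves are placeholders.
        String sql = isOracle()
                ? "SELECT country_name FROM country_view_oracle"
                : "SELECT country_name FROM country_view_mysql";
        return jdbcTemplate.queryForList(sql, String.class);
    }

    private boolean isOracle() {
        try (Connection connection = jdbcTemplate.getDataSource().getConnection()) {
            // DatabaseMetaData.getDatabaseProductName() returns e.g. "MySQL" or "Oracle".
            String product = connection.getMetaData().getDatabaseProductName();
            return product.toLowerCase().contains("oracle");
        } catch (SQLException e) {
            throw new IllegalStateException("Could not determine database vendor", e);
        }
    }
}

Spring also ships JdbcUtils.extractDatabaseMetaData in org.springframework.jdbc.support if you prefer not to manage the Connection yourself.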
