How to avoid exposing IDs of JPA entities to end users


I have a Spring MVC application which, as it stands, exposes the IDs of JPA entities to users (in hidden HTML inputs or browser URLs).
This could allow a malicious user to perform operations on entities belonging to another user using their browser.
Can anyone please suggest a solution to this security problem?
Is encrypting/decrypting IDs a good solution?
If so, in which layer (web, service, repository) is it appropriate to do this?
Which encryption solution is recommended (symmetric/asymmetric)?
Is there a better solution?

There is a better solution. You can keep your existing IDs as primary keys for other purposes, but for this particular purpose I would suggest creating an extra column in every table that needs it (e.g. called IDENTIFIER) and generating a strong random ID for it. I use this to generate the IDs:
public static String generateId() {
return UUID.randomUUID().toString().replaceAll("-", "").toUpperCase();
}
Then you can use these identifiers in your views. I also wrote a generic method for JPA to find entities which have these kind of columns:
public T findByGeneratedId(String generatedId) {
    CriteriaBuilder cb = this.entityManager.getCriteriaBuilder();
    // Typed query, so no unchecked cast is needed on the result
    CriteriaQuery<T> cq = cb.createQuery(entityClass);
    Root<T> entity = cq.from(entityClass);
    cq.select(entity).where(cb.equal(entity.get("generatedId"), generatedId));
    try {
        return this.entityManager.createQuery(cq).getSingleResult();
    } catch (NoResultException e) {
        // javax.persistence.NoResultException: no entity carries this identifier
        return null;
    }
}
Note that my column is called GENERATED_ID and all entities have this field:
@Column(name = "GENERATED_ID", nullable = false, unique = true)
private String generatedId = generateId();
This guarantees uniqueness and safety across your entities, and there is no need for any complex encoding/decoding.

In my opinion, encrypting IDs is not a good idea; it only hides the real problem. It would probably be quite tricky to do cleanly, and a malicious user could still intercept another user's requests and use the encrypted IDs to perform attacks.
The real solution is to implement some kind of access control in your business logic, and refuse attempts to access unauthorized resources, such as an entity belonging to another user.
You could implement this logic yourself if it is simple (no shared entities belonging to several users, no groups, just entities belonging to one user, that should be quite straightforward).
You could implement it as a sort of interceptor (using aspect-oriented programming: add an aspect to your DAO or service methods, for example) in order to do it automatically and avoid too much repetitive boilerplate code.
You could also use Spring Security which has some mechanisms for Access Control.
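For the simple case mentioned above (each entity belongs to exactly one user), a hand-written check in the service layer might look roughly like this. It is only a sketch: Document, DocumentService and ForbiddenException are made-up names, and the in-memory map stands in for whatever JPA repository you actually use.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain object: every document belongs to exactly one user.
final class Document {
    final long id;
    final long ownerId;
    final String title;

    Document(long id, long ownerId, String title) {
        this.id = id;
        this.ownerId = ownerId;
        this.title = title;
    }
}

// Hypothetical exception; Spring Security ships its own AccessDeniedException.
class ForbiddenException extends RuntimeException {
    ForbiddenException(String message) {
        super(message);
    }
}

// Service layer: the only place callers can load a Document by ID.
final class DocumentService {
    // Stands in for a JPA repository in this sketch.
    private final Map<Long, Document> store = new HashMap<>();

    void save(Document document) {
        store.put(document.id, document);
    }

    // Returns the document only if it exists and belongs to the caller.
    Document findForUser(long documentId, long currentUserId) {
        Document document = store.get(documentId);
        if (document == null || document.ownerId != currentUserId) {
            // Same error in both cases, so the response does not reveal whether the ID exists.
            throw new ForbiddenException("No such document");
        }
        return document;
    }
}
```

With a check like this in place, it no longer matters much whether the real ID is visible: a request for somebody else's entity is refused regardless of how the ID was obtained.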
If the needs are more complex, Spring Security can be used to implement a full ACL (Access Control List) system on your domain objects. This is more complex because ACLs are stored separately, so it needs some extra infrastructure in the database, and it seems quite complex to configure right, but it is the more flexible and scalable solution in my opinion. I haven't implemented ACLs myself though, so I can't offer much concrete advice on this.
If you insist on hiding the IDs from the users, I suggest you don't really encrypt the IDs but use a per-session correspondence table between the real IDs and some randomly generated temporary ones. This way you avoid frequent encrypting/decrypting of IDs and make one visible ID totally useless for another user.
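A per-session correspondence table could be sketched along these lines. SessionIdMap is a hypothetical helper (not part of Spring or JPA); the assumption is that one instance is stored in each user's HTTP session and that views only ever see the tokens.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper; one instance would live in each user's HTTP session.
final class SessionIdMap {
    private static final SecureRandom RANDOM = new SecureRandom();

    private final Map<String, Long> tokenToId = new ConcurrentHashMap<>();
    private final Map<Long, String> idToToken = new ConcurrentHashMap<>();

    // Returns the temporary token for a real ID, creating one on first use.
    String tokenFor(long realId) {
        return idToToken.computeIfAbsent(realId, id -> {
            String token = newToken();
            tokenToId.put(token, id);
            return token;
        });
    }

    // Resolves a token back to the real ID; null if this session never issued it.
    Long resolve(String token) {
        return token == null ? null : tokenToId.get(token);
    }

    private static String newToken() {
        byte[] bytes = new byte[16]; // 128 random bits
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}
```

A token copied from one session resolves to nothing in another session, which is the property described above. Note that this only obscures the IDs; it does not replace the access control check.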
Hope this helps.

Related

Entity/Domain purity dilemma in the clean architecture/Domain driven design

I'm working on an eCommerce system in which I try to implement the clean architecture.
But currently I'm a little bit stuck.
So I have a use case called CreateItemUseCase, in which I create an Item (alias product) for the shop.
In this use case I call a method (createItemEntity()) of an entity called ItemEntity.
This method creates just a data object with data like:
userId
itemTitle
itemDescription
...
So now I need another method in the ItemEntity which validates the userId.
To create an Item the user needs to have a userId, so the method in the ItemEntity would be called:
validateUserId()
This method should check if the user has a userId in the database, and if not, the Item creation would be impossible.
Now my question:
How do I validate the userId?
Should I have the validateUserId() method take an array as a parameter, in which all the user IDs are saved... something like this:
validateUserId(toBeValidated: Int, allUserIds: Array[Int])
{
// loop through the allUserIds to see if toBeValidated is in there ...
}
Or should I query the data in the method (which I'm pretty sure would violate the dependency rule) like this:
validateUserId(toBeValidated: Int)
{
// get all user IDs through a query, and check if toBeValidated is in there ...
}
Or should I do it completely differently?

In general, entities should only contain logic that is operating on information (data) that is within the entity's scope. Knowing how to query if a user with a certain user id exists or not is not in the scope of the item entity.
I think your motivation to keep all the logic for validation together is reasonable, but on the other hand you should not introduce infrastructure dependencies (like talking to the database or user repository) to the entity.
Or should I query the data in the method (which I'm pretty sure would violate the dependency rule) like this
Exactly, that's why it's usually best to avoid that and keep entities free from such dependencies. Introducing such dependencies can easily get out of hand and also makes such entities harder to test. If you need to do that, it should be a well-thought-out decision with a clear justification.
Should I have the validateUserId() method take an array as a parameter, in which all the user IDs are saved... something like this
This is not such a bad idea in general, because you would not make the entity dependent on infrastructure and provide the entity with all the data it needs for decision making. But on the other hand now you can run into another problem: bad performance.
Now you would retrieve all user IDs every time you create an item. If you do the check for the user's existence somewhere else, this can be optimized much better.
I suggest asking the user repository whether the user exists before performing the entity creation, including all the other potentially required validations inside the item entity that make sense there. The user repository could have a query that is optimized for just checking the existence of this user by ID.
In case these two operations (asking for the user's existence and creating the new item) only happen at one place of the application I'd be pragmatic and perform the user existence check directly in the use case. If this would occur from different places in your application you can extract that logic into a separate (domain) service (e.g. item service) which deals with the repetitive flow operations working with the user repository and item entity.
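As a rough sketch of the pragmatic variant (existence check directly in the use case), something like the following could work. It is written in Java for illustration; UserRepository.existsById, Item and the constructor validation are assumptions of mine and not taken from the question.

```java
// Hypothetical port owned by the domain; infrastructure would implement it
// with a query that only checks for existence.
interface UserRepository {
    boolean existsById(int userId);
}

// The entity validates only what it can see: its own data.
final class Item {
    final int userId;
    final String title;

    Item(int userId, String title) {
        if (title == null || title.trim().isEmpty()) {
            throw new IllegalArgumentException("itemTitle must not be blank");
        }
        this.userId = userId;
        this.title = title;
    }
}

// The use case coordinates the repository check and the entity creation.
final class CreateItemUseCase {
    private final UserRepository users;

    CreateItemUseCase(UserRepository users) {
        this.users = users;
    }

    Item execute(int userId, String title) {
        // Infrastructure-backed check stays outside the entity.
        if (!users.existsById(userId)) {
            throw new IllegalArgumentException("Unknown user: " + userId);
        }
        return new Item(userId, title);
    }
}
```

The entity stays free of infrastructure dependencies, and in a test the repository can be replaced by a lambda such as `id -> id == 1`.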
What you are dealing with here is a trade-off between domain model purity, domain model completeness and performance considerations. In this great blog this is named the Domain-Driven Design Trilemma. I suggest going through the reasoning in the article; I'm pretty sure it will help you come to a final decision.

I think this is a side case of what we call Business Gerunds.
Details: https://www.forbes.com/sites/forbestechcouncil/2022/05/19/10-best-practices-for-event-streaming-success/
If Item has to validate the user, just see what common attributes there are between the entities and who is responsible for changing them. Then a segregation can be done in the DDD representation, and by using a composite via translation, outside-world entities can continue to exist as they are.

'Existing Entity' constraint

I'm reading some data from an excel file, and hydrating it into an object of class A. Now I have to make sure that one of the fields of the data corresponds to the Id of a specific Entity. i.e:
class A{
protected $entityId;
}
I have to make sure that $entityId is an existing id of a specific entity (let's call it Foo). Now this can be achieved using the choice constraint, by supplying the choices option as all of the existing ids of Foo. However this will obviously cause a performance overhead. Is there a standard/better way to do this?

I'm a bit confused about what you are doing, since you seem to talk about Excel parsing, but at the same time you mention choices, which in my opinion relate to Forms.
IMO you should handle the relationship to your entity directly, instead of only its id. Most of the time it is better to have the related entity itself as an attribute of your class A rather than only the id, and Symfony handles such cases pretty well.
Then just have your Excel parser do something like this:
$relatedEntity = $this->relatedEntityRepository->find($entityId);
if (!$relatedEntity) {
throw new \Exception();
}
$entity->setRelatedEntity($relatedEntity);
After doing this, since you were talking about Forms, you can then use an EntityType field which will automatically perform the request in database. Use query_builder if you need to filter the results.

DDD and Entity Framework, Filters

So I am struggling with the approach DDD has to follow when we talk about filtering and queries. From this SO question Is it okay to bypass the repository pattern for complex queries? I can see the filtering by User should be done after getting all the Products. The piece of code from the accepted answer is:
Products products = /* get Products repository implementation */;
IList<Product> res = products.BoughtByUser(user);
But wait, and if the database has 1 million Products? Isn't the best approach to do this filter directly in the database like so:
productsRepository.Find(p => p.User.Id == userId);
But from my actual knowledge of DDD this would be wrong, because this logic should be inside the Product itself.
Therefore, how to handle this scenario?

I agree with Yorro's answer. According to the comment, products is indeed a repository.
The question around performance of the underlying datastructure vs keeping the domain knowledge in the application could be explored further though.
Databases are great at filtering and querying data, they are optimized to do so, and for us to ignore that simply to "keep our knowledge in the domain" is naive.
Your example shows Repository Specialization, which is fine albeit verbose.
The logic of that search is encapsulated by that call, and as long as the interface for calling that method is in the domain, and the implementation in the data-layer, everything is fine.
Indeed the call could be to a stored-procedure that performs a very complex operation. (In this case, yes some of your logic has escaped the domain, but you make it as a conscious decision, and should you introduce another data technology, you would have to implement that functionality again.)
There is another option...
We can encapsulate the logic of the search in a Specification (http://en.wikipedia.org/wiki/Specification_pattern) and pass the specification from our domain logic code to our Repository, which would interpret the specification and do the query.
That makes our domain oblivious of how the underlying data structure works, but it puts it in control of what the search criteria is.
I usually find myself implementing a blend of Repository Specialization, and having a base repository that accepts an ISpecification for more lightweight queries.
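To make the Specification idea a bit more concrete, here is a minimal sketch, written in Java rather than C# purely for illustration. All names (Specification, BoughtByUser, InMemoryProductRepository) are hypothetical. The repository here filters in memory only to keep the example runnable; a database-backed repository would translate the specification into a query so that the filtering happens in the database.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain type for illustration.
final class Product {
    final String name;
    final int buyerId;

    Product(String name, int buyerId) {
        this.name = name;
        this.buyerId = buyerId;
    }
}

// The domain decides WHAT is searched for, not how the search is executed.
interface Specification<T> {
    boolean isSatisfiedBy(T candidate);

    // Combines two specifications; both must hold.
    default Specification<T> and(Specification<T> other) {
        return candidate -> isSatisfiedBy(candidate) && other.isSatisfiedBy(candidate);
    }
}

// Domain-side criterion: "products bought by this user".
final class BoughtByUser implements Specification<Product> {
    private final int userId;

    BoughtByUser(int userId) {
        this.userId = userId;
    }

    @Override
    public boolean isSatisfiedBy(Product product) {
        return product.buyerId == userId;
    }
}

// In-memory stand-in for a repository that interprets specifications.
final class InMemoryProductRepository {
    private final List<Product> products = new ArrayList<>();

    void add(Product product) {
        products.add(product);
    }

    List<Product> find(Specification<Product> spec) {
        List<Product> matches = new ArrayList<>();
        for (Product product : products) {
            if (spec.isSatisfiedBy(product)) {
                matches.add(product);
            }
        }
        return matches;
    }
}
```

The calling code then reads as `repository.find(new BoughtByUser(userId))`, and more specific criteria can be built with `and(...)` without adding a new repository method each time.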

Based on your link, the Products class is the repository, just named without the "repository" suffix.
You are correct that the filtering should be in the database, you just don't see it because you are in the domain.
The first and second approaches are the same. The difference is that the first is more aligned with DDD because of its proper use of the ubiquitous language.
// First example
// Take note, the products IS the repository
IList<Product> productsByUser = products.BoughtByUser(user);
// Second example
IList<Product> productsByUser = productsRepository.Find(p => p.User.Id == userId);
If you dive in the data access layer, you can see the filtering that you are talking about.
public IList<Product> BoughtByUser(User user)
{
    // The filter is translated to SQL and executed in the database
    IList<Product> products = this.dbContext.Products.Where(p => p.User.Id == user.Id).ToList();
    return products;
}

This is not a direct answer to your question (Yorro's answer is right), but maybe it helps you to better understand DDD. This is a "wrong way, turn back" answer.
Your views don't need domain rules, nor aggregates with a million children or a million entities. So you don't need to "bypass" the product repository, because you should have "View Services" with "View Repositories" which allow you to query (with paging, etc.) denormalized data from persistence for your views.
You should apply domain rules using aggregates/entities when an update/insert/delete is needed.
Once the user selects one or several products from the 1 million list and pushes, for example, the delete button, you should use the product repository to retrieve the aggregates/entities of the selected products, apply the delete rules and invariants, and save to persistence.

How do I implement fine-grained access control in Spring-Data-Rest?

TL;DR: How do I implement fine-grained access control in the flattened REST api approach that Spring-Data-Rest gives us?
So - I'm making an API using Spring-Data-Rest where there are three main access levels:
1) The admin - can see/update all groups
2) An owner of a group - can see/update the group and everything under it
3) An owner of a sub-group - can see/update only his group. No recursive nesting, just one sub-level allowed.
And 'group' is exposed as a resource (has a crud repository).
So far so good - and I've implemented some access control for modification using a Repository Event Handler - so on the create/write/delete side I think I'm fine.
Now I need to get to the point of limiting visibility of some of the items. This is ok for getting a single item since I can use Pre/Post Authorize annotations and reference the principal.
The problem lies in the findAll() methods - I don't have an easy hook to filter out the specific instances I don't want exposed based on the current principal. For example - a sub-group owner could see all groups by doing GET /groups. They should ideally have the items they don't have access to not even be visible at all.
To me this sounds like writing custom @Query() annotations on the repository interfaces, but that doesn't seem doable because:
I need to reference the principal in the query. SpEL is supposed to be supported, but doesn't seem to work at all with ?# expressions (despite this blog post suggesting otherwise: https://spring.io/blog/2014/07/15/spel-support-in-spring-data-jpa-query-definitions). I am using spring-boot with 1.1.8.RELEASE and the Evans-RELEASE train for spring-data generally.
The kind of query I need to write is going to be different depending on the access level, which can't realistically be encompassed in a single JPQL statement (if admin select all groups, else get all (sub)groups associated with the principal's user).
Therefore it sounds like I need to write some custom repository implementations for that and just reference the principal in code. Well that's ok - but it seems like a lot of work for each repository that I need to control the access to (I think this will be almost all of them). This applies to findAll and various custom search methods.
Am I approaching this wrong? Is there another approach to dynamically limiting item visibility based on the currently logged-in user that would work better? In a flat namespace like spring-data-rest exposes, I would imagine this would be a common problem.
In a prior design I just solved it by exposing everything under /api/groups/{groupId}/... and had a sub-resource locator act as a single pinch-point to control access to anything under it. No such luck in spring-data-rest.
Update: now stumbling with a custom method overriding findAll() (this works for other methods defined on my custom interface). Though this might be a separate question - I'm blocked right now. Spring-data is just not calling this when I do a GET /groups, but calling the original. Oddly enough it does use my query if I define one on the interface and mark it with @Query (perhaps custom overrides of built-in methods aren't supported anymore?).
public interface GroupRepository extends JpaRepository<Group, Long>, GroupCustomRepository {}
public interface GroupCustomRepository {
Page<Group> findAll(Pageable pageable);
}
public class GroupCustomRepositoryImpl extends SimpleJpaRepository<Group, Long> implements GroupCustomRepository {
@Inject
public GroupCustomRepositoryImpl(EntityManager em) {
super(Group.class, em);
}
@Override
public Page<Group> findAll(Pageable pageable) {
MyPrincipal principal = (MyPrincipal) SecurityContextHolder.getContext().getAuthentication().getPrincipal();
Page<Group> result;
if (principal.isAdmin()) {
result = super.findAll(pageable); // call the inherited implementation; an unqualified call would recurse into this override
} else {
Specification<Group> spec = (root, query, cb) -> cb.or(
cb.equal(root, principal.getGroup()),
cb.and(cb.isNotNull(root.get(Group_.parentGroup)), cb.equal(root.get(Group_.parentGroup), principal.getGroup()))
);
result = findAll(spec, pageable);
}
return result;
}
}
Update 2: Since I can't access the principal in the @Query, and I can't override it with a custom method, I'm at a brick wall. @PostFilter doesn't work either because the return object is a Page rather than a collection.
I've decided to just wall-off the /groups to admins only, and have everyone else use different approaches (/groups/search/somethingSpecific) with @PostFilters/@PostAuthorizations.
This doesn't seem like it meshes very well with the HAL approach though. Interested in how other people are solving these kinds of issues with Spring-data-rest.

We ended up approaching this as follows:
We created a custom aspect which sits in front of the CRUD methods on a repository. It then looks up and calls an associated 'authorization handler' which is annotated on the repository that dynamically manages authorization details.
We had to be pretty heavy-handed when it came to limiting results in a findAll() query (eg: looking at /users) - essentially, only admins could list all of anything sensitive. Otherwise limited users had to use query methods for specific items.
We created some reusable authorization-related classes, and use those in certain scenarios - particularly custom queries, eg:
@PreAuthorize("@authorizations.systemAdminRead()")
@Query("select u FROM User u where ...")
List<User> findAll();
@PostAuthorize("@otherAuthorizationHandler.readAllowed(returnObject)")
ResponseObject someQuery();
All in all, it works - but it feels very clunky, and it's easy to miss things. I do wish this was baked-in to the framework more, even being able to dynamically adjust the default queries would be useful (when I was attempting this, I wasn't able to have the queries updated appropriately with @Query).
We happen to be using PostgreSQL, so the upcoming row level security (http://michael.otacoo.com/postgresql-2/postgres-9-5-feature-highlight-row-level-security/) would have fit the bill nicely, assuming we could feed it the proper authorization details via the DB connection.

LINQ To SQL entity objects as domain objects

Clearly separation of concerns is a desirable trait in our code and the first obvious step most people take is to separate data access from presentation. In my situation, LINQ To SQL is being used within data access objects for the data access.
My question is, where should the use of the entity object stop? To clarify, I could pass the entity objects up to the domain layer but I feel as though an entity object is more than just a data object - it's like passing a bit of the DAL up to the next layer too.
Let's say I have a UserDAL class, should it expose an entity User object to the domain when a method GetByID() is called, or should it spit out a plain data object purely for storing the data and nothing more? (seems like wasteful duplication in this case)
What have you guys done in this same situation? Is there an alternative method to this?
Hope that wasn't too vague.
Thanks a lot,
Martin.

I return IQueryable of POCOs from my DAL (which uses LINQ2SQL), so no Linq entity object ever leaves the DAL. These POCOs are returned to the service and UI layers, and are also used to pass data back into the DAL for processing. Linq handles this very well:
IQueryable<MyObjects.Product> products = from p in linqDataContext.Products
select new MyObjects.Product //POCO
{
ProductID = p.ProductID
};
return products;

For most projects, we use LINQ to SQL entities as our business objects.
The LINQ to SQL designer allows you to control the accessibility of the classes and properties that it generates, so you can restrict access to anything that would allow the consumer to violate the business rules and provide suitable public alternatives (that respect the business rules) in partial classes.
There's even an article on implementing your business logic this way on the MSDN.
This saves you from writing a lot of tedious boilerplate code and you can even make your entities serialisable if you want to return them from a web service.
Whether or not you create a separate layer for the business logic really depends on the size of your project (with larger projects typically having greater variation between the business logic and data access layers).
I believe LINQ to Entities attempts to provide a one-stop solution to this conundrum by maintaining two separate models (a conceptual schema for your business logic and a storage schema for your data access).

I personally don't like my entities to spread across the layers. My DAL returns POCOs (of course, it often means extra work, but I found this much cleaner - maybe this will be simpler in the next .NET version ;-)).
The question is not so simple and there are lots of different schools of thought on the subject (I keep asking myself the same question you are).
Maybe you could take a look at the MVC Storefront sample app : I like the essence of the concept (the mapping that occurs in the data layer especially).
Hope this helps.

There is a similar post here, however, I see your question is more about what you should do, rather than how you should do it.
In small applications I find a second POCO implementation to be wasteful, in larger applications (particularly those that implement web services) the POCO object (usually a Data Transfer Object) is useful.
If your app falls into the latter case, you may want to look at ADO.Net Data Services.
Hope that helps!

I have actually struggled with this as well. Using plain vanilla LINQ to SQL, I quickly abandoned the DBML tooling, because it bound the entities too tightly to the DAL. I was striving for a higher level of persistence ignorance, although Microsoft didn't make it very easy.
What I ended up doing was hand-writing the persistence ignorance layer, by having the DAL objects inherit from my POCOs. The inherited objects exposed the same properties as the POCOs they inherit from, so inside the persistence ignorance layer I could use attributes to map to the objects. The caller could then cast the inherited object back to its base type, or have the DAL do that for them. I preferred the latter, because it lessened the amount of casting that needed to be done. Granted, this was a primarily read-only implementation, so I would have to revisit it for more complex update scenarios.
The amount of manual coding for this is rather large, because I also have to manually maintain (after coding, to begin with) the context and provider for each data source, on top of the object inheritance and mappings. If this project wasn't being deprecated, I would definitely move to a more robust solution.
Looking forward to the Entity Framework, persistence ignorance is a commonly requested feature according to the design blogs for the EF team. In the meantime, if you decide to go the EF route, you could always look at a pre-rolled persistence ignorance tool, like the EFPocoAdapter project on MSDN, to help.

I use a custom LinqToSQL generator, built upon one I found on the Internet, in place of the default MSLinqToSQLGenerator.
To make my upper layers independent of such Linq objects, I create interfaces to represent each one of them and then use such interfaces in these layers.
Example:
public interface IConcept {
long Code { get; set; }
string Name { get; set; }
bool IsDefault { get; set; }
}
public partial class Concept : IConcept { }
[Table(Name="dbo.Concepts")]
public partial class Concept
{
private long _Code;
private string _Name;
private bool _IsDefault;
partial void OnCreated();
public Concept() { OnCreated(); }
[Column(Storage="_Code", DbType="BigInt NOT NULL IDENTITY", IsPrimaryKey=true)]
public long Code
{
//***
}
[Column(Storage="_Name", DbType="VarChar(50) NOT NULL")]
public string Name
{
//***
}
[Column(Storage="_IsDefault", DbType="Bit NOT NULL")]
public bool IsDefault
{
//***
}
}
Of course there is much more than this, but that's the idea.

Please keep in mind that Linq to SQL is not a forward looking technology. It was released, it's fun to play with, but Microsoft is not taking it anywhere. I have a feeling it won't be supported forever either. Take a look at the Entity Framework (EF) by Microsoft which incorporates some of the Linq to SQL goodness.
