Architecture/Technical Challenges in Handling Authentication/Permissions in Elixir Channels/Sockets - websocket

So I have decided to rewrite an application I was building in Node.js in Elixir, because Elixir gives me out of the box much of what made the Node version complex.
My problem is something I never quite got right in Node, and it is becoming just as complex in Elixir; I am not entirely sure how to approach it.
I am trying to recreate much of how Discord does permissions. I am essentially building a CRM system with roles like "Sales Manager", "Sales", and "Customer Service Rep", each able to do different things based on their role.
Some things I need to be able to do: update a permission on the fly for a person or role. Maybe the "Sales Manager" role can't look at company financial data the way an "Accountant" can, but we need to give one specific manager access for a few days. Or we give the entire "Customer Service Rep" role the ability to add things to a calendar. I would also like the ability to kill sessions.
There are a few approaches I've seen suggested around the Elixir forums:
Using Guardian. I like the idea of tokens, and not having to hit the database on every request sounds wonderful, but I don't think it's practical here unless there is a good way to update tokens on the fly, which I haven't found.
Giving each person their own process, and killing and restarting that process whenever their permissions change. This seems neat, but I'd rather not kill processes unless there is an actual error, and I suspect this would bring its own problems, such as making tracing harder. I'm not familiar enough with Elixir to know whether that's true, or whether it's a bad idea for other reasons.
Using Guardian with Guardian_DB, which largely defeats the purpose of tokens, but at least gives me a trackable session. My only concern is that I plan to use a load balancer, so that if a socket connection dies I can reconnect it to the same server, and I'm not sure whether that is possible with tokens or whether the socket itself has a session attached. This isn't a big issue, though, and is close to what I had in Node.js.
Using Redis, which I'd like to stay away from: update session data in Redis keyed by user_id whenever changes occur, and hit Redis on every request to check the user's permissions. I eventually plan to run this across multiple servers, which rules out plain ETS unless I can pin socket connections to a server with the load balancer, as I could in Node.js.
So I guess my questions are:
Can I attach sessions to sockets? Is this a bad idea?
Should I still use a token, and just use Redis to check the token on every request?
Is a token still a better choice than a session?
Is there a much better/easier solution that I have not even mentioned?
I'm sorry this was so drawn out and long; I've never had to build anything as permission-bound as this professionally, and I'm pretty new to Elixir.

Phoenix channels are stateful. You can put data in the assigns field and it stays there for the duration of the connection. That is where you normally put your user_id after authenticating the user on join.
I also use the channel assigns to store client state that I need on the server.
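For example, a minimal socket sketch, assuming a hypothetical MyApp.Auth.verify/1 token check (a Guardian or Phoenix.Token verification would slot in there):

defmodule MyAppWeb.UserSocket do
  use Phoenix.Socket

  channel "crm:*", MyAppWeb.CrmChannel

  # Verify the token once when the socket connects and stash the user_id
  # in assigns; it is then available in every channel joined on this socket.
  def connect(%{"token" => token}, socket, _connect_info) do
    case MyApp.Auth.verify(token) do
      {:ok, user_id} -> {:ok, assign(socket, :user_id, user_id)}
      {:error, _reason} -> :error
    end
  end

  def connect(_params, _socket, _connect_info), do: :error

  # Identifying sockets per user also answers the "kill sessions" question:
  # MyAppWeb.Endpoint.broadcast("user_socket:#{user_id}", "disconnect", %{})
  # closes every socket that user has open.
  def id(socket), do: "user_socket:#{socket.assigns.user_id}"
end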
Regarding the role-to-permissions question, I'm doing exactly this. I load the role permissions from the database on startup and build an ETS table from them; you can do the same from a Task or a GenServer. If the permissions change for a given role, I update both the database and the ETS table.
My user model supports a list of roles for each user.
When I need to validate the permissions for a given user, I call the Permission model API, e.g. Permission.has_permission?("create-room", user, scope). I have two levels of permissions, global and per room; that is what the scope is used for.
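A minimal sketch of that setup, with hypothetical MyApp.Repo helpers standing in for the real persistence calls:

defmodule MyApp.Permissions do
  use GenServer

  @table :role_permissions

  def start_link(_opts), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  def init(nil) do
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    load_from_db()
    {:ok, nil}
  end

  # Reads hit ETS directly from the calling process; no GenServer round-trip.
  def has_permission?(permission, user, scope \\ :global) do
    Enum.any?(user.roles, fn role ->
      case :ets.lookup(@table, {role, scope}) do
        [{_key, perms}] -> permission in perms
        [] -> false
      end
    end)
  end

  # Writes go through the GenServer (the table owner) and persist to the DB.
  def update_role(role, scope, perms),
    do: GenServer.call(__MODULE__, {:update, role, scope, perms})

  def handle_call({:update, role, scope, perms}, _from, state) do
    # Hypothetical persistence call; swap in your own Repo update.
    MyApp.Repo.save_role_permissions!(role, scope, perms)
    :ets.insert(@table, {{role, scope}, perms})
    {:reply, :ok, state}
  end

  defp load_from_db do
    # Hypothetical loader returning [{{role, scope}, [permission, ...]}] tuples.
    for entry <- MyApp.Repo.all_role_permissions(), do: :ets.insert(@table, entry)
  end
end

Reads stay cheap because callers hit the ETS table directly; only updates serialize through the GenServer that owns the table.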

Related

Microservices and isolated persistence - how should the data be stored/fetched?

At my company, we're about to move to a microservices architecture. I've read a lot about it, and while there are plenty of grey areas specific to each project, one point seems to get everyone to agree: microservices need isolated persistence, or, to put it another way, each needs its own database.
Now I love the idea: every microservice has its own database schema and its own domain objects, and is 100% independent of any other microservice's data structure.
There are things I don't quite understand though.
The "Customer" service is obviously central to the application, and basically every other microservice will need some data about the user at some point, whether it's the user's credit amount, their ID, or their name.
But since other microservices can't read directly from the Customer service's database, they'll need to query that service over and over again. This is fine (I guess) for simple things like getting the name of the currently logged-in user, but when we need to display 60 users on a page and can't do any SQL join, it feels like we're missing something. It's even worse when microservices depend on tons of other microservices.
I found out that some people actually query other microservices X times a day to copy data into their own.
So if the "Search" microservice needs data from "Product" and "Customer", it queries those microservices and persists the data in its own data structure.
The question I have is should it be "Search" that queries "Product" and "Customer", or should "Product" and "Customer" send data to "Search"?
The first option looks a bit easier to do: the logic lives on one side only, the side where the data is needed. But the data will only be as fresh as the last poll, which is not very smart, though it could definitely work.
The second option looks a bit more difficult but more scalable too, because the data can be pushed the moment it changes at the source, and it could also be more granular.
I think you correctly identified downsides to the microservices approach! And there are no elegant solutions to these specific problems. You will have to eat the additional work and architecture deterioration that this brings.
Concretely addressing your question now:
The question I have is should it be "Search" that queries "Product" and "Customer", or should "Product" and "Customer" send data to "Search"?
You seem to be looking for a data synchronization service. You want to decide between push and pull. You are concerned about data freshness and logic duplication.
The key point here is that the source service cannot know about its consumers. This is to prevent an unwanted reverse dependency. This would break architectural isolation. Any data sync process that maintains this is fine. You can do what is most convenient.
For example, you could make the data source expose two APIs:
An API to get the whole data set. This would be called periodically by the destination (e.g. nightly). It can also be used to seed the destination at will and to fix data errors there.
A feed of changes in the source database keyed by the date and time the change occurred. The destination can now poll that change feed very frequently (e.g. every few seconds or minutes) and apply the small delta that occurred.
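A sketch of what the destination's polling loop might look like, as a GenServer (the endpoint, payload shape, and cursor handling are all illustrative):

defmodule Search.ProductSync do
  use GenServer

  @poll_interval :timer.seconds(30)

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  def init(_opts) do
    schedule_poll()
    # Start from the epoch; a real sync would persist this cursor across restarts.
    {:ok, %{last_seen: ~U[1970-01-01 00:00:00Z]}}
  end

  def handle_info(:poll, %{last_seen: since} = state) do
    # Pull only the delta that occurred since the last successful poll.
    changes = fetch_changes(since)
    Enum.each(changes, &apply_change/1)
    schedule_poll()
    {:noreply, %{state | last_seen: advance_cursor(changes, since)}}
  end

  defp schedule_poll, do: Process.send_after(self(), :poll, @poll_interval)

  # Hypothetical: GET /changes?since=... against the Product service with your
  # HTTP client of choice; each change carries an :occurred_at timestamp.
  defp fetch_changes(_since), do: []

  # Hypothetical: upsert/delete the matching document in Search's own store.
  defp apply_change(_change), do: :ok

  defp advance_cursor([], since), do: since
  defp advance_cursor(changes, _since),
    do: changes |> Enum.map(& &1.occurred_at) |> Enum.max(DateTime)
end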
You can even build a realtime change feed through publish-subscribe middleware; many message queue products can do this. The source just sends its changes to the middleware.
Building all of this is conceptually simple but takes a lot of work. It also creates lots of ongoing work and increases the potential for bugs. Debugging becomes much harder. I have worked on systems like that.
I'm going to add a subjective note: Microservices are not well understood by many teams. The downsides are often ignored. You identified a few of the downsides correctly and they are nasty! Given what I read on the web I believe many teams do not realize the mess they are getting themselves into. Managing disparate data stores can be a nightmare. This is not a one-time "mess" but an ongoing one.
As an alternative I'd recommend using a common data store and building services simply as classes or projects that live in the same process. This gives you the microservices code structuring with the convenience of normal development. It also leaves a few of the upsides of microservices on the table.
Your identification of the problem is correct.
But the right solution will vary from use case to use case.
In your search example, the product and customer services should publish their events to Kafka or similar messaging, and the search service should listen to them and update itself.
In other cases, say an order service that wants to check that a customer exists while creating an order, you might call the customer service's synchronous API; but there are various other approaches for that too, which I've covered in my answer to Microservices and allowing for one to be unavailable.
From my perspective, synchronous communication between services should be avoided; there are ways around it, and the link above should help.
You can use the domain-driven design philosophy to correctly draw the boundaries between your services and their contracts.
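A sketch of the listening side of that event flow, using Phoenix.PubSub as a stand-in for Kafka or similar middleware (topic names, event shapes, and the MyApp.PubSub server are all illustrative; Phoenix.PubSub spans a connected Erlang cluster rather than separate services, but the shape is the same):

defmodule Search.EventListener do
  use GenServer

  def start_link(_), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  def init(nil) do
    # Product and Customer broadcast on these topics; Search only subscribes,
    # so no reverse dependency from the sources to the consumer is created.
    :ok = Phoenix.PubSub.subscribe(MyApp.PubSub, "product:events")
    :ok = Phoenix.PubSub.subscribe(MyApp.PubSub, "customer:events")
    {:ok, nil}
  end

  # Each event updates Search's own copy of the data, in its own schema.
  def handle_info({:product_updated, product}, state) do
    upsert_document(:product, product)
    {:noreply, state}
  end

  def handle_info({:customer_updated, customer}, state) do
    upsert_document(:customer, customer)
    {:noreply, state}
  end

  # Hypothetical: write into the search index / local store.
  defp upsert_document(_type, _data), do: :ok
end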

How to avoid abusing roles and permissions for user interface?

Are there any approaches or architecture design patterns to implement secure, clean role/permission based access control and UI conveniences without coupling them together?
The long story.
I have seen ambiguous use of roles and permissions in many web applications and I have often experienced how these ambiguities have caused misunderstandings and implementation difficulties.
Here is a simplified example.
Business requirements say that permission set for some specific role should deny access to some part of the system that displays a full list of addresses. But at the same time, users of this role will need to read the addresses for an autocomplete list on some other web page.
I have seen how reckless developers create a permission entry to disable access to addresses, and later they discover that users actually need to read the addresses from other parts of the system. Then they invent another specific permission for special cases where addresses can be read.
But to me this seems an ambiguous and potentially risky situation. If a user has no access to some specific data, then they shouldn't be able to access it at all. Period. Adding a special permission just for dropdown lists seems like a deliberate security hole. If the user loads the list through an async request and the server uses the same controller action to return it (and it should, to avoid code duplication), then how will the server know when it should not return the addresses, if they are sometimes forbidden?
This situation raises the question: "why shouldn't users of some specific role see the full list of addresses in the first place, if they have access to the list through some other means?" And the answer I often get from business analysts is something like "Well, the address list is not forbidden for data security reasons, but just because users of this particular role are not expected to do anything with the address list, and it would be a redundant item in their workspace".
So, now the problem seems clear to me: some permissions exist just for controlling the UI and not strictly for controlling access to some data. Such (ab)use of permissions feels wrong to me. Therefore the question which was given at the very beginning.
Good write-up! It rather feels like you already have your answer.
IMO user profiling and user access are not the same thing. Access rights should be handled at as low a level as possible (e.g. whether or not a user has read access to a specific SQL table), and profiling in this case should only apply at the UI level ("what the user actually wants or needs to see").
When we talk about an application that has some kind of access control, there's almost always some kind of "engine" behind the UI that actually holds all the data. The WORST thing you can ever do is implement security anywhere other than the engine itself. The data must never be accessible in any way except through the engine's own access control, or it isn't access control at all - it's UI restriction.
But that's the perfect world :/ In reality, as in all areas of work, software development has been driven towards being ever more cost-effective, agile and responsive to the client. Not surprisingly, this pushes people toward fast and cheap decisions... like "hell, let's just make another SQL procedure that pulls the data out as an admin" instead of "we need to re-evaluate user access rights, and/or possibly redesign our tables to stay consistent with the access privileges". It's always a short-term (bad) solution WHEN it's done, but some solutions are definitely bigger NO-NOs than others.
As a guideline I'd say that if you're not 110% sure what you're doing, it's the biggest NO-NO there is.
TL;DR: If some data should be accessible in even one place, it's not restricted by access control. If it's unnecessary to show accessible data somewhere, use user/application profiling to filter it.
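A minimal sketch of that separation, with hypothetical module, role, and item names:

defmodule MyApp.Authorization do
  # Data-level access control: one question, answered the same way everywhere.
  def can?(user, :read, :addresses), do: "address-reader" in user.roles
  def can?(_user, _action, _resource), do: false
end

defmodule MyApp.UiProfile do
  # UI profiling: only decides what to show, never what may be accessed.
  def show?(user, :address_list_page), do: "back-office" in user.roles
  def show?(_user, _item), do: true
end

The controller action that returns addresses, whether for the full list page or the autocomplete, checks MyApp.Authorization and nothing else; the menu builder checks MyApp.UiProfile. Hiding a page never grants or revokes access.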

Session: Why use mode="SQLServer"?

I'm really looking for feedback here. Why would you want to use
<sessionState mode="SQLServer" ... blaw blaw blaw....
Here, the session is loaded from a database... it can allow a user to, say, recover from a power outage: the user comes back to the web application and their current state is retrieved, provided the session hasn't passed its expiry time...
Why not just make up a class and load it on ResolveRequestCache and save it on UpdateRequestCache?
Why go to the trouble of perhaps even setting up a separate SQL server to use Session attached to a database?
Saving session information to the database adds support for multiple web servers running as a farm, since they then have shared storage between them. So it really just depends on what you're trying to do. If a single box will keep you happy forever and ever, stick with something local that will likely take less work to implement. If you're worried about scalability, go with something that will let you scale when you need to.

Should I make my CouchDB database server public-facing?

I'm new to CouchDB and am trying to understand how to use it properly. I'm coming from MongoDB, where I would always write a web layer and put it in front of Mongo so that users could access the data inside it; in fact, that's how I've used every database for every web site I've ever written. Looking at Couch, I see that its native API is HTTP and that it has built-in features like OAuth support, which hints that perhaps I should no longer have my code layer sitting in front of Couch, but should instead write views and hand out Couch accounts to my users. I'm thinking in terms of an HTTP-based API for a site of mine, something through which users would consume my data. Opening Couch up like this seems odd to me, though. Is Couch's OAuth meant more for remote access by software that I write and run "officially" inside my own network, or is it literally meant for end users?
I know there may be things that can only be done through a code layer on top of CouchDB, like when additional non-database work has to happen during API requests. Thinking along those lines, I suspect I'll still need a code layer anyway.
Dealer's choice.
Nodejitsu has a great writeup on this sort of topic here.
Not knowing your application specifics I'll take a broad approach...
Back-end
If you want to prevent users from ever seeing your database then make it back-end. You can pipe everything through something like node.js and present only what the user needs to see and they'll never know anything about the database.
See Resource View Presenter
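As a rough sketch of such a layer (shown in Elixir rather than node.js, but the shape is the same; assumes the HTTPoison and Jason packages, and the database name and field list are made up):

defmodule Gateway.Articles do
  @couch "http://127.0.0.1:5984/articles"
  @public_fields ["title", "body", "author"]

  # Fetch a document through CouchDB's HTTP API, but hand the client only
  # the whitelisted fields; the database itself is never exposed directly.
  def public_doc(id) do
    {:ok, %HTTPoison.Response{status_code: 200, body: body}} =
      HTTPoison.get("#{@couch}/#{id}")

    body
    |> Jason.decode!()
    |> Map.take(@public_fields)
  end
end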
Front-end
If you are not concerned about data security, you can host an entire app on CouchDB; see CouchApp. This approach has the benefit of using the replication mechanism to control publishing your site/data. The drawback here is that you will almost certainly run into some technical limitations that will require moving CouchDB closer to the backend.
Bl-end
Have the app server present the interface and the client pull the data from the database separately. This gives the most flexibility but can be a bag of hurt because even with good design this could lead to supportability and scalability issues.
My recommendation
Use CouchDB on the backend. If you need mobile clients to synchronize then use a secondary DB publicly exposed for this purpose and selectively sync this data to wherever it needs to go.
Simply put, no.
There's no way to secure Couch properly on a public-facing site. There's no way to discriminate access at a fine enough level of granularity: if someone has access to any of the data, they have access to all of it.
Not all data on a site is meant for public consumption, save for the most trivial of sites.

What are the benefits of a stateless web application?

It seems some web architects aim to have a stateless web application. Does that mean basically not storing user sessions? Or is there more to it?
If it is just the user session storing, what is the benefit of not doing that?
Reduces memory usage. Imagine if Google stored session information about every one of its users.
Easier to support server farms. If you need session data and you have more than 1 server, you need a way to sync that session data across servers. Normally this is done using a database.
Reduce session expiration problems. Sometimes expiring sessions cause issues that are hard to find and test for. Sessionless applications don't suffer from these.
URL linkability. Some sites store the ID of what the user is looking at in the session. This makes it impossible for users to simply copy and paste the URL or send it to friends.
NOTE: session data is really cached data. That is what it should be used for. If you have an expensive query whose result will be reused, save it into the session. Just remember that you cannot assume it will be there when you try to get it later: always check that it exists before retrieving it, as in the sketch below.
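For example, a small check-before-use helper (shown here with Elixir's Plug session API; the idea is the same on any stack, and the module name is made up):

defmodule MyAppWeb.SessionCache do
  import Plug.Conn, only: [get_session: 2, put_session: 3]

  # Session data is a cache: look it up, recompute on a miss, and never
  # assume a value survived from an earlier request.
  def fetch(conn, key, compute_fun) do
    case get_session(conn, key) do
      nil ->
        value = compute_fun.()
        {put_session(conn, key, value), value}

      cached ->
        {conn, cached}
    end
  end
end

Usage: {conn, stats} = MyAppWeb.SessionCache.fetch(conn, :stats, &MyApp.Reports.expensive_query/0), where the report module is again hypothetical.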
From a developer's perspective, statelessness can help make an application more maintainable and easier to work with. If I know a website I'm working on is stateless, I need not worry about things being correctly initialized in the session before loading a particular page.
From a user's perspective, statelessness allows resources to be linkable. If a page is stateless, then when I link a friend to that page, I know that they'll see what I'm seeing.
For the scaling and performance perspective, see tster's answer.
