I have a WebApp where users log in with their database credentials, and the backend runs prefab reports on a production database using the user's credentials. Company policy does not allow a technical user in this special case.
Since a DataSource is tied to a single user, I use a plain JDBC connection:
java.sql.Connection c = DriverManager.getConnection(aUrl, aUsername, aPassword);
This works, but is this the preferred way to do this in an application server? Somehow it does not seem right.
This approach will make your database run out of available open connections and result sets (open cursors) as soon as user concurrency reaches a certain threshold.
The usual way to do this is to define a database connection pool for a single technical user with the appropriate grants. This pool should have configuration settings your DBA is comfortable with, and should keep its open connections within limits acceptable to your database, so that you never run into problems under excessive concurrency (you should not be hitting database limits with, say, 250 concurrent users, which is likely to happen with the method you describe in your post).
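For illustration, a minimal sketch of the pooled approach, assuming the pool has been defined in the application server under the hypothetical JNDI name jdbc/ReportDS with the technical user's credentials:

import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

public class ReportRunner {
    // The JNDI name is an assumption; it must match the pool defined in the app server.
    private static final String POOL_JNDI_NAME = "jdbc/ReportDS";

    public void runReport() throws NamingException, SQLException {
        DataSource ds = (DataSource) new InitialContext().lookup(POOL_JNDI_NAME);
        // close() returns the connection to the pool instead of physically closing it
        try (Connection c = ds.getConnection()) {
            // ... run the prefab report with c ...
        }
    }
}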
The way to achieve this is to give your database folks sound arguments for reviewing the company policy on database users, in terms of:
robustness: the initial implementation will surely take your database down with its first few hundred concurrent users; this is not merely possible but certain
performance: a connection pool will always outperform the per-user approach, because opening a new connection is a very expensive operation
ease of monitoring and administration: a single technical user lets your DBAs instantly identify the queries coming into their database from the Java application servers and make better decisions (for example, on tablespace sizing and the like)
security: the matter of who may order which report is actually a business-logic problem that should be delegated to an upper tier; once this is solved, just let the Java application order its reports
Good luck with this!
I have one database with one domain, but three websites that could use that database. I want my second website to publish to that database as well. Is that possible?
You might want to make sure that you're not violating the terms of service of the company hosting your database. Having many outside domains hitting an inside database may put undue stress on that server that the company is not counting on, or eat up more bandwidth than is allotted for that machine.
In the same breath, though, if you set up some kind of data-layer web service to connect through, your other domains are no longer hitting the database directly yet accomplish essentially the same thing, with a more orderly, predictable pattern of database calls. This may not be what you're looking for, but set up correctly it could make developing against your database much easier.
Correct me if I'm wrong, but from my understanding, "database caches" are usually implemented with an in-memory database that is local to the web server (the same machine as the web server). Also, these "database caches" store the actual results of queries. I have also read up on the various caching strategies: Cache-Aside, Read-Through, Write-Through, Write-Behind, and Write-Around.
For some context: in the Write-Through strategy the application writes to the cache and the cache then writes to the database, while in the Cache-Aside strategy the application talks to both the cache and the database itself.
I believe that the "Application" refers to a backend server with a REST API.
My first question is, in the Write Through strategy (application writes to cache, cache then writes to database), how does this work? From my understanding, the most commonly used database caches are Redis or Memcached - which are just key-value stores. Suppose you have a relational database as the main database, how are these key-value stores going to write back to the relational database? Do these strategies only apply if your main database is also a key-value store?
In a Write Through (or Read Through) strategy, the cache sits in between the application and the database. How does that even work? How do you get the cache to talk to the database server? From my understanding, the web server (the application) is always the one facilitating the communication between the cache and the main database - which is basically a Cache Aside strategy. Unless Redis has some kind of functionality that allows it to talk to another database, I don't quite understand how this works.
Isn't it possible to mix and match caching strategies? From how I see it, Cache Aside and Read Through are caching strategies for application reads (user wants to read data), while Write Through and Write Behind are caching strategies for application writes (user wants to write data). Couldn't you have a strategy that uses both Cache Aside and Write Through? Why do most articles always seem to portray them as independent strategies?
What happens if you have a cluster of web servers? Does each have its own local in-memory database that acts as a cache?
Could you implement a cache using a normal (not in-memory) database? I suppose this would still be somewhat useful since you do not need to make an additional network hop to the database server (since the cache lives on the same machine as the web server)?
Introduction & clarification
I think you have one point misunderstood: the cache is NOT necessarily stored on the same server as the web server. Sometimes not even the database is separated from the web server onto its own server. If you think of APIs, like HTTP REST APIs, you can use caching to avoid spending too many resources on database connections and queries. Generally, you want to use as few database connections and queries as possible. Now imagine the following setting:
You have a web server that serves your application, and a REST API used by the web server to work with some resources. Those resources come from a database (let's say a relational database) stored on the same server. Now there is one endpoint that serves, for example, a list of posts (like blog posts), and every user can fetch all posts (to keep the example simple). One could say that this API request should be cached, so that users don't keep hitting the database just to query the same resources (via the REST API) over and over again. This is where caching comes in. Redis is one of many tools that can be used for caching. Since Redis is a simple in-memory key-value store, you can simply put all of your posts (remember the REST API) into the cache after the first DB query. All future requests for the posts list would first check whether the posts are already cached. If they are, the API returns the cached content for that request.
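A rough sketch of that posts-list flow, assuming the Jedis client library and a hypothetical loadPostsFromDatabase() helper:

import redis.clients.jedis.Jedis;

public class PostsEndpoint {
    // Host and port are assumptions for the sketch
    private final Jedis jedis = new Jedis("localhost", 6379);

    public String getPostsJson() {
        String cached = jedis.get("posts:all");
        if (cached != null) {
            return cached;                        // cache hit: no database query at all
        }
        String json = loadPostsFromDatabase();    // cache miss: one real DB query
        jedis.setex("posts:all", 60, json);       // keep the result for 60 seconds
        return json;
    }

    private String loadPostsFromDatabase() {
        // hypothetical: query the relational database and serialize the posts to JSON
        return "[]";
    }
}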
This is one simple example of what caching can be used for.
Answers to your questions
My first question is, why would you ever write to a cache?
To reduce the amount of database connections and queries.
how is writing to these key-value stores going to help with updating the relational database?
It does not help you with updating; instead it helps you spend fewer resources. It can also serve as a "temporary backup" of some data, but only as a minor side effect, and there are more attractive solutions for that (Redis is not persistent by default, although it does support persistence).
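To make the write side concrete: with a plain key-value cache like Redis, the application itself usually keeps the cache and the database in sync. A hedged sketch in the same style as above (the database call is hypothetical):

import redis.clients.jedis.Jedis;

public class PostsWriter {
    private final Jedis jedis = new Jedis("localhost", 6379);

    public void savePost(String postJson) {
        insertIntoDatabase(postJson);   // 1. write to the system of record first
        jedis.del("posts:all");         // 2. drop the stale cached list;
                                        //    the next read repopulates it
    }

    private void insertIntoDatabase(String postJson) {
        // hypothetical: INSERT into the relational posts table
    }
}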
Do these cache writing strategies only apply if your main database is also a key-value store?
No, it does not matter which database you use, whether it's a NoSQL or SQL DB. It strongly depends on what you want to cache and on how the database and its tables are set up. Do you have frequent changes in your resources? Do resources get updated manually, or only through user-initiated actions? These questions lead you to the right caching implementation.
Isn't it possible to mix and match caching strategies?
I am not an expert at caching strategies, but let me try:
I guess it is possible, but it highly depends on what you are doing in your DB and what kind of application you have. Once you know what kind of application you are building, you will know which strategy to use. I would also guess that mixing those strategies is not recommended, because each strategy is coupled to a particular application type; in other words, it will not work out well.
What happens if you have a cluster of web servers? Do they each have their own local in-memory database that acts as a cache?
I guess both are possible. Usually you have one database, maybe clustered or synchronized with copies, to which your web servers (e.g. REST APIs) make their requests. Each of your API servers could then have its own cache, so that it does not have to query the database at all (in cloud-based applications the database may also be on another, separate server, so another "hop" in terms of networking). Alternatively (which I can also imagine), you could have another middleware layer between your APIs (clustered) and your DB (maybe also clustered), but I guess nobody would do that because of the extra network traffic: it would result in a higher response time, which you usually want to prevent.
Could you implement a cache using a normal (not in-memory) database?
Yes, you could, but it would be much slower. A machine can access in-memory data faster than it can open another (local) connection to a database and query the cached entries. An on-disk database also has to write the entries into files on your machine to persist the data.
Conclusion
All in all, it is about keeping response times low and avoiding unnecessary network traffic. I hope I could help you out a little.
I would like to store user profile information. After researching a bit online, I am confused between the following options:
Use an LDAP server (example: OpenDJ). I can write Java clients that interact with the LDAP server using LDAP APIs.
Store the user profile in a database as a JSON document (like in Elastic DB). The NoSQL database can then index the documents to improve lookup time.
What are the factors that I should keep in mind before selecting one of the approaches?
For a start, if you are storing passwords, then using LDAP is a no brainer IMO. See http://smart421.com/smart-identity-and-fraud/why-bother-with-an-ldap-anyway/ .
Otherwise I would recommend you do a PoC with each solution (do not forget to add indexes for OpenDJ, and you might also use Rest2LDAP) and see how they fit your needs. Both products are open source, so it's easy to get started.
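For the Java side of such a PoC, here is a minimal sketch using the JDK's built-in JNDI LDAP provider; the URL, base DN, and filter are assumptions (1389 is a common OpenDJ default port):

import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

public class LdapProfileLookup {
    public static void main(String[] args) throws Exception {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://localhost:1389");
        InitialDirContext ctx = new InitialDirContext(env);

        SearchControls controls = new SearchControls();
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
        // Search for one user's profile entry under an assumed base DN
        NamingEnumeration<SearchResult> results =
                ctx.search("dc=example,dc=com", "(uid=jdoe)", controls);
        while (results.hasMore()) {
            System.out.println(results.next().getNameInNamespace());
        }
        ctx.close();
    }
}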
If your user population is a known group that may already have accounts in an existing LDAP repository, or where user account information needs to be shared between systems, then it makes sense to use and add on to the existing LDAP repository.
If you are starting out from scratch and have mainly external, unknown users who have no other interaction with your infrastructure beyond this one application, then LDAP is not a good choice IMO because of the overhead of creating and managing the server. A lightweight JSON approach then seems better suited (even though the L in LDAP stands for "lightweight").
The number of expected users is less of a consideration; you need to tread carefully with very large populations in either scenario.
See this question as well for additional insights: Reasons to store users' data in LDAP instead of RDBMS
I have a fully coded application. Now the only part missing is making it multi-tenant.
I want to allow clients to register on my application's website and get an instance of the application with a completely empty database just for that account.
I've thought to play with environments, but I'm not sure if this is a good approach:
config
- user1
- database.php
- user2
- database.php
- ...
I've also thought about a single config file containing the database information for every account, and setting the database connection based on the subdomain name. Something like I've seen in this post:
Multi-tenant in Laravel4
Any other idea or better approach to do this part?
Your solution requires 1000 folders for 1000 users: 1000 databases, and a thousand migrations if anything changes during the application life cycle.
You don't want this, trust me.
Instead, simply said, create one database and use flags/foreign keys to assign data to users.
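The question is about Laravel, but the pattern is language-agnostic; a minimal Java/JDBC sketch with an invented schema (every tenant-owned table carries an account_id foreign key):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class TenantScopedQuery {
    // One shared database: every query is filtered by the tenant's account_id
    public ResultSet findProjects(Connection c, long accountId) throws SQLException {
        PreparedStatement ps =
                c.prepareStatement("SELECT * FROM projects WHERE account_id = ?");
        ps.setLong(1, accountId);   // scope the result set to the current tenant
        return ps.executeQuery();
    }
}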
As Andreyco points out, having 1000 users with 1000 databases quickly becomes a joke, but if your user accounts (clients) will be a much smaller number, then this is not such an issue.
The best approach is to have one "master database" which contains all of your generic client information, controlled via a "Super Admin" panel that you have access to. This master database lists the database configuration details for the other accounts, so store the connection information for the other databases in a table there.
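A hedged sketch of that lookup, again in Java/JDBC for illustration (the clients table layout is invented):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class TenantConnectionFactory {
    private final Connection master;   // connection to the "master database"

    public TenantConnectionFactory(Connection master) {
        this.master = master;
    }

    // Look up the tenant's connection details by subdomain, then open
    // a connection to that tenant's own database.
    public Connection connectForSubdomain(String subdomain) throws SQLException {
        PreparedStatement ps = master.prepareStatement(
                "SELECT db_url, db_user, db_password FROM clients WHERE subdomain = ?");
        ps.setString(1, subdomain);
        try (ResultSet rs = ps.executeQuery()) {
            if (!rs.next()) {
                throw new SQLException("Unknown tenant: " + subdomain);
            }
            return DriverManager.getConnection(
                    rs.getString("db_url"),
                    rs.getString("db_user"),
                    rs.getString("db_password"));
        }
    }
}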
It's a little less secure, but it essentially means that somebody has to hack the main database to get into the other databases, which is unlikely. You should also lock down the firewalls of those databases, so that even an attacker inside the main DB can't do anything without hacking into one of your web servers and SSHing from there onto the secondary DBs.
What should we take care of before moving an application from a single WebSphere Application Server to a WebSphere cluster?
This is my list from experience. It is not complete but should cover the most common problem areas:
Plan ahead the distributed session management configuration (i.e. will you use memory-to-memory or database-based replication?). Note that if you are still on a 32-bit platform, the resource overhead of clustering might cause instability issues if your application already uses a lot of memory.
Make sure that everything you put into user sessions can be serialized with the default serializer (implements Serializable); otherwise you may run into problems with distributed sessions (see the sketch after this list).
The same goes for everything you put into DynaCache. Make sure everything serializes properly.
Specify and make sure all resource definitions (JDBC providers etc.) are made at the proper scope. I would usually recommend the actual Cluster scope for everything used by the applications installed to the cluster. That ensures the test features work from the proper points, and that you don't create conflicting definitions.
Make sure your application uses relative paths for resources in web interfaces. Once you start load balancing, you can run into serious problems if a lot of paths are hard-coded.
If you have any sort of timers, make sure they work well with clusters. With Quartz that probably means you should use the JDBC store for timer tasks. With EJB timers, make sure you register the timers only once (it is possible to corrupt the WAS timer database if several nodes attempt the registration at exactly the same time) and make sure you install them at Cluster scope.
Make sure you use the WAS-provided SSO mechanisms. If you have a custom implementation, make sure it handles moving a user between servers in the cluster well.
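On the session-serialization point above, a small sketch of what "everything Serializable" means in practice (the class and its fields are invented):

import java.io.Serializable;

// Everything placed in the HTTP session (or DynaCache) must serialize cleanly,
// including every non-transient field, or distributed replication will fail.
public class CartItem implements Serializable {
    private static final long serialVersionUID = 1L;

    private String productId;
    private int quantity;

    // Non-serializable resources must be excluded from serialization
    private transient java.sql.Connection connection; // better: never put these in sessions

    public CartItem(String productId, int quantity) {
        this.productId = productId;
        this.quantity = quantity;
    }
}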
Keep it simple: depending on your requirements, try configuring your load balancer to use sticky sessions and avoid holding state in your HTTP session. That way you don't need resource-hungry in-memory session replication.
Single Sign On isn't an issue for a single cluster as your HTTP clients will not be moving off the same http://server.acme.com/... host domain name.
Most of your testing should focus on database contention. If you have a highly transactional application (i.e. many writes to the same table), make sure you look at your database isolation levels so that locks are not held unnecessarily. The same goes for your transaction demarcation: keep transactions as brief as possible. If you don't have database skills yourself, get a database analyst to help you monitor the database while you test.
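To illustrate the isolation and demarcation point, a hedged JDBC sketch (the accounts table is invented) that keeps the transaction window as small as possible:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class ShortTransaction {
    public void updateBalance(Connection c, long accountId, long delta) throws SQLException {
        // READ_COMMITTED avoids holding read locks longer than necessary
        c.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
        c.setAutoCommit(false);
        try (PreparedStatement ps = c.prepareStatement(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?")) {
            ps.setLong(1, delta);
            ps.setLong(2, accountId);
            ps.executeUpdate();
            c.commit();        // commit promptly so row locks are released
        } catch (SQLException e) {
            c.rollback();
            throw e;
        }
    }
}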
It is also good advice to raise a PMR with IBM Support ahead of any major change, such as this one or an upgrade to new versions. Raise it as a "Software Usage Question" and they can provide you with feedback from their knowledge database, based on other customers' input. The same applies to any product you have a support agreement for: ask support before problems occur.