How do people manage the dbml file in a more scalable manner?
Do you have just one DataClasses1.dbml and drag every table into it?
Do you have separate files for separate logical groupings, e.g. Accounts and HR? If so, how do you see the foreign key relationships visually when one table links to a table in another dbml file?
Thanks.
It is better to use a single DBML file for all your tables, so that you can see all your relationships (foreign keys and so on) together. But it depends entirely on your requirements.
Using Entity Framework (same for linq-to-sql) I like to use separate context classes for distinct parts of the database.
But what is "distinct"?
In most cases, everything related to the core business of an application is too interrelated for a separate context to be meaningful. But almost every application has lateral tasks like authorization, translation, auditing and so on. These are good candidates for separate contexts.
There will still be connections to the business logic though. As you probably know, you cannot join classes from separate contexts in a way that the join is translated to SQL. Only in memory. So it is useful to duplicate some entities in several contexts. So, for instance, both the business context and the authorization context will contain User entities. One context should be responsible for maintenance of the entity and the other one(s) should use it read-only.
Edit
By duplication of entities I mean that two (or more) contexts can have an entity that maps to the same table in the database. Like User. If you like, the business context could be for creating and updating users, the authorization context is (for instance) for adding roles to a specific user, without modifying the user itself.
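The idea of two contexts mapping the same table can be sketched roughly as follows. This is an illustration in Python with sqlite3, not Entity Framework or LINQ to SQL code; the class names BusinessContext and AuthorizationContext, the table names and the method names are all assumptions made for the example.

```python
import sqlite3


def create_schema(conn):
    # One shared database; both contexts map the users table.
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE user_roles (
            user_id INTEGER NOT NULL REFERENCES users(id),
            role TEXT NOT NULL
        );
    """)


class BusinessContext:
    """Owns the users table: the only context that creates or updates users."""

    def __init__(self, conn):
        self.conn = conn

    def add_user(self, name):
        cur = self.conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        return cur.lastrowid

    def rename_user(self, user_id, name):
        self.conn.execute("UPDATE users SET name = ? WHERE id = ?", (name, user_id))


class AuthorizationContext:
    """Maps the same users table read-only; it writes to user_roles only."""

    def __init__(self, conn):
        self.conn = conn

    def get_user(self, user_id):
        return self.conn.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()

    def add_role(self, user_id, role):
        if self.get_user(user_id) is None:
            raise LookupError(f"unknown user {user_id}")
        self.conn.execute(
            "INSERT INTO user_roles (user_id, role) VALUES (?, ?)", (user_id, role)
        )

    def roles_of(self, user_id):
        rows = self.conn.execute(
            "SELECT role FROM user_roles WHERE user_id = ? ORDER BY role", (user_id,)
        ).fetchall()
        return [row[0] for row in rows]
```

Because the authorization context has no method that writes to users, responsibility for maintaining the entity stays in one place.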
For a multi-tenancy architecture for a web application using a document-oriented database I can see two conceivable options:
Having one database per tenant, and the collections logically separate different kinds of object.
Having one collection per tenant, and all user data is stored in one database, with some kind of flag or object type identifier on each record.
Have there been any studies or has any documentation been produced regarding these two options and the differences between them?
Is there a particular standard or good reason why someone designing a web application which allows multiple users to store vastly different kinds of data would choose one over the other?
Aside from speed/efficiency issues, are there any other things to be said about this that would influence the decision?
EDIT I'm aware some of the terminology might be database specific, so for all wondering I am specifically referring to MongoDB.
I wouldn't want tenant-specific collections. In my applications, I usually hard-code collection names, the same way I'd hard-code table names if I were using SQL tables. There'd be one comments collection that stores all comments for a blog. I would not want to deal with collection names like comments_tenant_1 and comments_tenant_2, because 1) that feels error-prone, 2) it would make the application code more complicated (collection names would have to be replaced with functions that compute the collection name), and 3) the number of collections in a single database could grow huge, which would make a list of all collections look daunting. MongoDB also isn't built for having very many collections (see the link in the comment below your question, which David B posted, https://docs.mongohq.com/use-cases/multi-tenant.html).
However, database names aren't coupled to application data structures, and you can grant permissions on databases (but, at least in older MongoDB versions, not on single collections). So one database per tenant could be reasonable. So could a per-document tenant_id field in a single database for all tenants (see the above-mentioned link).
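The two reasonable options can be sketched in plain Python without a MongoDB driver. Everything here is hypothetical: tenant_database_name, tenant_filter and the tiny in-memory find stand in for whatever your driver and naming scheme actually provide.

```python
COMMENTS = "comments"  # collection name stays hard-coded in both options


def tenant_database_name(tenant_id):
    """Option 1: one database per tenant; only the database name varies."""
    return f"tenant_{tenant_id}"


def tenant_filter(tenant_id, query=None):
    """Option 2: one shared database; every query is scoped by tenant_id."""
    scoped = dict(query or {})
    scoped["tenant_id"] = tenant_id
    return scoped


def find(documents, query):
    """In-memory stand-in for a collection lookup: exact match on every key."""
    return [
        doc for doc in documents
        if all(doc.get(key) == value for key, value in query.items())
    ]
```

With the shared-database option, the risk is forgetting the tenant filter on some query, so routing every query through one helper like tenant_filter keeps that in a single place.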
In our project we're trying to apply the Bounded Context approach, and we've run into a fairly obvious performance problem. For example, we have different classes (in different contexts) representing a user in the system: Person in our core domain's context and User in the security context. So, we have two different repositories for each of the aggregates, but they are using the same table in DB and sometimes accessing the same data.
Is there a common solution to minimize DB round trips in this case? Are there ORMs that deal with it, or should we write a caching system ourselves?
Update: the DB comes from a legacy app, and we have to use it as is.
"So, we have two different repositories for each of the aggregates, but they are using the same table in DB and sometimes accessing the same data."
The fact that you have two aggregates stored in the same table is an indication of a problem with the design. In this case, it seems you have two bounded contexts - a BC for the core domain (Person is here) and an identity/access BC (User is here). The BCs are related and the latter can be seen as upstream from the former. A Person in the core domain has a corresponding User in the identity BC, but they are not exactly the same thing.
Beyond this relationship between the BCs, there are questions regarding ownership of behavior. For example, both a Person and a User may have a name, and what has to be determined is who owns the behavior of changing a name. This can be implemented in several ways. Person may have its own name, with changes propagated to the identity BC. Alternatively, User may own changes to the name, in which case they must be propagated to Person via a synchronization mechanism.
Overall, your problem could be addressed in two ways. First, you can store Person and User aggregates in different tables. Any given query should only use one of these tables, and they can be synchronized in an eventually consistent manner. Another approach is to decouple the behavioral domain model from a model designed for queries (a read model). This way, you can create a read model designed to serve one or more specific screens and have a customized query, perhaps even outside of an ORM.
If all Users are also Persons (sometimes external services are modeled as special users too), the only data that User and Person should share in the database are their identifiers.
Indeed, each entity in a domain model should hold references only to the data it needs to ensure its invariants.
Moreover, I guess that Users are identified by username and Persons are identified by something else (a VAT code or similar).
Thus, the simplest optimization technique is to avoid encapsulating in an entity any information that is not required to ensure its invariants.
Furthermore you simply need an effective context mapping technique to easily pass from User to Person when needed. I use shared identifiers for this.
As an example you can expose the Person's identifier in the User class, so that a simple query to the Person's repository can provide you the data you need.
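A minimal sketch of the shared-identifier mapping, assuming an in-memory repository. The field names (person_id, username, full_name), PersonRepository and person_for are made up for illustration and are not taken from the question.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class User:
    """Security context: holds only what authentication needs."""
    username: str
    person_id: int  # shared identifier exposed for context mapping


@dataclass(frozen=True)
class Person:
    """Core domain context: holds the business data."""
    person_id: int
    full_name: str


class PersonRepository:
    """Hypothetical repository; a real one would query the database."""

    def __init__(self):
        self._by_id: Dict[int, Person] = {}

    def add(self, person: Person) -> None:
        self._by_id[person.person_id] = person

    def get(self, person_id: int) -> Optional[Person]:
        return self._by_id.get(person_id)


def person_for(user: User, people: PersonRepository) -> Optional[Person]:
    # One lookup by identifier is all it takes to go from User to Person.
    return people.get(user.person_id)
```

User never loads Person's data unless a caller asks for it, which is what keeps the round trips down.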
Finally, I recommend Vaughn Vernon's series on Aggregate Root Design.
I have two databases, MasterData and ProductData.
I store the Users and Employees in MasterData, and I store the Tasks in ProductData.
A Task entity has a User property. It shows which user created the Task.
If I used just one database and one DataContext, I could define a one-to-many relationship between the two entities. But I must use two databases and two DataContexts.
Is there any way to define a relationship between two entities that live in different databases and DataContexts?
Thanks in advance.
This is not a full blown answer, but it might get you to think of another solution.
Depending on the DBMS you are using, you might be able to create synonyms or updateable views (or something similar) from one database to the other. That way, your DataContext can contain the synonyms/views as well as the tables.
In sql-server:
http://msdn.microsoft.com/en-us/library/ms177544.aspx
Well, unless I missed something, there is no way to join two entities from different contexts/databases, regardless of whether it's L2S or EF. The alternative is pulling all possibly relevant data from the two contexts and using in-memory LINQ for the relational operations, but that certainly poses the performance problem of loading too much data.
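The in-memory alternative can be sketched like this, using Python and two separate sqlite3 connections in place of two DataContexts. The table and column names (tasks, created_by, users) and the function load_tasks_with_creators are assumptions for the example. To limit the amount of data loaded, it fetches only the users that the tasks actually reference.

```python
import sqlite3


def load_tasks_with_creators(master, product):
    """Join tasks (product DB) to users (master DB) in application memory."""
    tasks = product.execute(
        "SELECT id, title, created_by FROM tasks ORDER BY id"
    ).fetchall()

    # Collect only the user ids the tasks refer to, then load those users.
    user_ids = sorted({created_by for _, _, created_by in tasks})
    if not user_ids:
        return []
    placeholders = ",".join("?" for _ in user_ids)
    users = dict(
        master.execute(
            f"SELECT id, name FROM users WHERE id IN ({placeholders})", user_ids
        ).fetchall()
    )

    # No database enforces the link, so a missing creator comes back as None.
    return [(title, users.get(created_by)) for _, title, created_by in tasks]
```

Nothing stops a task from pointing at a user id that does not exist in the other database, so the caller has to handle that case itself.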
Here's a "novel" idea, why not use DataSet? Different table adapters can use different connection strings. It is rather archaic next to L2S/EF but it will offer you most bells & whistles of relationships.
I do have one question: if you keep users and their tasks in separate databases, how do you handle referential integrity?
A synonym is a good solution, but EF does not support it yet:
http://data.uservoice.com/forums/72025-ado-net-entity-framework-ef-feature-suggestions/suggestions/1052345-support-for-multiple-databases?ref=title
Thanks again!
I have predefined tables in the database, and I have to develop a web application based on them.
Should I base my model classes on the structure of the data in the tables?
The problem is that the tables are very poorly defined and contain a lot of redundant data (which I cannot change!).
For example, three columns are the same in these two tables:
Table: Student_details
Student_id, Name, Age, Class, School
Table: Student_address
Student_id, Name, Age, Street1, Street2, City
I think you should make your models in a way that would be best suited for how they will be used. Don't worry about how the data is stored or where it is stored... otherwise why go through the trouble of layering your code. Why not just do the direct DB query right in your view? So if you are going to create an abstraction of your data... "model" ... make one that is designed around how it will be used... not how it will be or is persisted.
This seems like a risky project. Presumably, there's another application somewhere that populates these tables. As the data model is not very sound from a relational point of view, I'm guessing there's a bunch of business/data logic glued into that app, for instance putting the student age into the Student_address table.
I'd support jsobo in recommending that you build your business logic independently of the underlying persistence mechanism, and that you try to keep your models as domain-focused as possible, without too much emphasis on how the database happens to be structured.
You should, however, plan on spending a certain amount of time translating your domain models into their respective data representations and dealing with whatever quirks the data model imposes. I'd strongly recommend containing all this stuff in a separate translation layer - don't litter it throughout the rest of the application.
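A rough sketch of such a translation layer for the two tables in the question, assuming rows arrive as dictionaries keyed by column name. The Student and Address classes and the to_student function are hypothetical, and treating Student_details as the authoritative source for the duplicated Name and Age columns is an assumption, not something the question states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    street1: str
    street2: str
    city: str


@dataclass(frozen=True)
class Student:
    """Domain model: each fact appears once, whatever the tables do."""
    student_id: int
    name: str
    age: int
    school_class: str
    school: str
    address: Address


def to_student(details_row, address_row):
    """Translate one Student_details row and one Student_address row."""
    if details_row["Student_id"] != address_row["Student_id"]:
        raise ValueError("rows belong to different students")
    # Assumption: Name and Age from Student_details win over the copies
    # stored in Student_address.
    return Student(
        student_id=details_row["Student_id"],
        name=details_row["Name"],
        age=details_row["Age"],
        school_class=details_row["Class"],
        school=details_row["School"],
        address=Address(
            street1=address_row["Street1"],
            street2=address_row["Street2"],
            city=address_row["City"],
        ),
    )
```

The rest of the application only ever sees Student, so the redundancy and any reconciliation rules stay inside this one function.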
I have a Users table, Events table, and a mapping of UserEvents. In some parts of my code, I just need user-based stuff. In other parts, I need all of this information. (Especially: given a user, what are the details of each event they are subscribed to?)
If I have one repository just for users and another for users + events + userevents, then the auto-created users object is duplicated and the code won't compile until I rename one of them. This is possible but inconvenient. On the other hand, if I only have one repository with all 3 tables, when I just want user info, will it be expensive due to linq getting all the associated data with that user id?
In Linq2Sql, is it more expensive if you have more tables in a single dbml/repository?
Linq2Sql uses lazy loading to get additional information. I believe it can be configured to fetch related data all at once (through DataLoadOptions), but that is not the default behavior. If you ask for a user, you will not get events unless you specifically ask for them.
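The lazy-loading behavior described above can be illustrated conceptually in Python. This is not LINQ to SQL's API; LazyUser and the load_events callback are invented for the example, and the loader stands in for the query that would run against the database.

```python
class LazyUser:
    """Loads the user's events only when they are first accessed."""

    def __init__(self, user_id, name, load_events):
        self.user_id = user_id
        self.name = name
        self._load_events = load_events  # callable: user_id -> list of events
        self._events = None

    @property
    def events(self):
        # The related data is fetched on first access and cached afterwards.
        if self._events is None:
            self._events = self._load_events(self.user_id)
        return self._events
```

Code that only reads the user's own fields never triggers the loader, so having the related tables in the same model costs nothing for those call sites.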
I have a project with 100+ tables in the dbml. As far as I can tell, this does not affect the time to instantiate the DataContext class.