Best practice architecture for professional Microsoft.Net websites
(Update: this article deals with back-end architecture. If you’re looking for HTML/JS architecture, check out this article.)
Architecture technology is constantly evolving and there is a lot to keep up with. Over my career, I’ve reviewed a large number of approaches and today I’d like to draw a line in the sand and present to you all my current ‘best’ architecture.
This solution has been used more-or-less in my last three major applications (most notably www.rate-it.co) and is the result of many hundreds of hours of investigating, comparing, and performance testing, as well as tens of thousands of hours supporting real-world websites.
There are a few reasons why I’m sharing it:
- there’s no loss to me if I can help other people build their sites
- I’m only as good as the knowledge I can attain. By sharing mine, I hope that others will share theirs and thereby help me evolve my craft
- this architecture will probably be obsolete within six months or so. It will be an interesting record for me to see how my approach changes over time
The project architecture
I have created a working application (Visual Studio 2012, .Net 4.5) which summarizes the concepts I’m about to explain. To install:
- download the code from here
- run the included setup.sql file to generate the database structure expected by the code
- update the BlackBallArchitecture.Web\Connectionstrings.config file as appropriate
First of all, let’s note the dependencies between the various projects:
There was a time when I built web applications with two layers – a front-end ‘web’ project with my ASPX and ASCX files etc., and a data-access layer which basically executed stored procedures and returned the results as data tables. This architecture was flawed for a number of reasons, but its biggest fault was that there was no separation between UI and business logic, nor even between UI and the data source (you need to know column names if you are accessing a data table).
These days, I use the structure above. To quickly summarize:
- Common.dll is used to store generic functions (such as Extension methods) which are of use to the entire application.
- Data.dll provides access to a Sql Server database using the Entity Framework Code-First Model (see below). Data storage is often synonymous with business logic (e.g. saving a person record into the database); however, their separation is essential if we are to unit test later.
- Contracts.dll is where I record the ‘shape’ of the application. It holds an interface to every business logic class (in Logic.dll), as well as all my entity definitions, such as Person or SystemLog.
- Logic.dll is where I store all my business logic, such as email validation and accessing the data store (although note that the Logic layer does not actually have a reference to the data store)
- Dependency.dll is used by Unity (see below) to map the Contracts to the actual business logic
- Web.dll is the front-end web application, containing HTML files etc.
- In addition, the application contains Test.dll and CodeGen.dll assemblies, which I have excluded from this diagram to avoid clutter. I will explain them later.
So, quite a few assemblies for an application that essentially just saves a person record to a database – in fact, I’ve gone from 2 to 6. Why?
Dependency Injection using Microsoft Unity
The first thing you may notice above is that the Web assembly has no knowledge of either the data store or the Logic assembly. In fact, all it has is knowledge of the logic structure (via the Contracts assembly) and a mechanism for accessing these contracts (via the Dependency assembly).
I have done this using Microsoft Unity, which allows me to use a little trick called Dependency Injection.
If you refer to the Home/Welcome method, this means that instead of getting my list of people via…
var people = new Logic.DataManagers.PersonManager().GetPerson(null);
…I instead use:
var people = Dependency.Resolve<IPersonManager>().GetPerson(null);
The unity.config file (in the demo project) tells Unity that IPersonManager should be mapped to my PersonManager class. By doing this, I have de-coupled the site’s dependency on our actual Logic implementation, instead using just an interface to gain access to my data. This has a number of gnarly benefits:
- if I find a bug, I can deploy another assembly with a class that implements IPersonManager, with the bug fixed in it
- I can unit-test the code in total isolation from the actual business logic – an essential tenet of unit testing
- I can create different implementations of IPersonManager, based on varying project requirements. For example, I once built a site which had two implementations – one in New Zealand and one in England. The New Zealand site showed everybody’s full name, whereas England had stronger privacy controls and would only show the first name. I simply created two classes implementing IPersonManager, each with its own FormatName(). New Zealand’s implementation was:
public string FormatName(Person p){ return (p.FirstName + " " + p.LastName).Trim(); }
and England’s was:
public string FormatName(Person p){ return p.FirstName.Trim(); }
I included both the implementations in the same Logic.dll assembly, and simply modified the unity.config file of each website to suit.
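For reference, the Dependency class itself can be little more than a thin wrapper around the Unity container. A minimal sketch (an assumption of mine, not the demo project’s exact code – and note the config-loading call varies between Unity versions):

using Microsoft.Practices.Unity;
using Microsoft.Practices.Unity.Configuration;

public static class Dependency
{
    private static readonly IUnityContainer container = new UnityContainer();

    static Dependency()
    {
        // Read the type mappings (e.g. IPersonManager -> PersonManager) from the unity configuration section
        container.LoadConfiguration();
    }

    public static T Resolve<T>()
    {
        return container.Resolve<T>();
    }
}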
Another great use of Unity is when you want to temporarily replace some business logic without a full redeploy. For example, I sometimes litter my code with logging functionality; in the PersonManager.SavePerson() function, you can see this:
Dependency.Resolve<ILogger>().Log(firstName + " record updated");
But my unity.config file has TWO concrete implementations of ILogger:
<typeAlias alias="ILogger" type="BlackBallArchitecture.Contracts.ILogger, BlackBallArchitecture.Contracts" />
<typeAlias alias="DatabaseLogger" type="BlackBallArchitecture.Logic.DatabaseLogger, BlackBallArchitecture.Logic" />
<typeAlias alias="NoLogger" type="BlackBallArchitecture.Logic.NoLogger, BlackBallArchitecture.Logic" />
To save database stress, I turn logging off in production for the most part:
<type type="ILogger" mapTo="NoLogger" />
However, if there is a problem I can easily switch it on by modifying the file to:
<type type="ILogger" mapTo="DatabaseLogger" />
No recompile or redeploy required!
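For completeness, the two concrete loggers can be trivially simple. A sketch (the SystemLog property name Summary is hypothetical; the rest mirrors the contracts above):

using System;
using BlackBallArchitecture.Contracts;
using BlackBallArchitecture.Contracts.DataManagers;
using BlackBallArchitecture.Contracts.Entities.Data;

public class NoLogger : ILogger
{
    // Deliberately does nothing - mapped in production to save database stress
    public void Log(string message) { }
}

public class DatabaseLogger : ILogger
{
    public void Log(string message)
    {
        // Persist the message as a SystemLog record via the data layer
        var log = new SystemLog { Summary = message, WhenOccurred = DateTime.Now }; // Summary is a hypothetical property
        Dependency.Resolve<ISystemLogManager>().SaveLog(log);
    }
}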
A misuse of Unity?
The other thing that Unity allows me to do is create what would otherwise be circular references between my assemblies. For example, from my Logic layer, I am able to access the current user’s PersonID even though it is stored in a website cookie:
private int Log(SystemLog logSummary)
{
// We can use Unity to call back to whichever storage mechanism we are using for the 'current user' info, without actually knowing what it is
var personID = Dependency.Resolve<ICurrentUser>().PersonID;
// Log
logSummary.PersonID = personID;
logSummary.WhenOccurred = DateTime.Now;
var svc = Dependency.Resolve<BlackBallArchitecture.Contracts.DataManagers.ISystemLogManager>();
svc.SaveLog(logSummary);
return logSummary.SystemLogID;
}
ICurrentUser, meanwhile, is implemented in my Web.dll assembly as follows:
public class AuthenticationManager : BlackBallArchitecture.Contracts.Security.ICurrentUser
{
public int? PersonID
{
get
{
if (!IsAuthenticated) { return null; }
return int.Parse(HttpContext.Current.User.Identity.Name);
}
}
}
I know what you’re thinking – who is this handsome renegade?! Well, there is a distinct difference between the above and another method which I’ve seen in some Logic layers:
var personID = int.Parse(HttpContext.Current.User.Identity.Name);
When developers use this method, they are making an assumption that their business logic will always be run from a web application. If they were to later reference this logic from another application (such as Silverlight or Windows Forms), it would fall apart.
However, using Unity, if we were to reference this from a new application, we could just implement a new version of ICurrentUser and pull the PersonID variable from a different location such as in-memory or from file etc. Very cool.
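For example, a Windows Forms host or a unit test could register something like this instead (a hypothetical sketch):

public class InMemoryCurrentUser : BlackBallArchitecture.Contracts.Security.ICurrentUser
{
    // Hypothetical: set once at login, rather than parsed from a web cookie
    public int? PersonID { get; set; }
}

Map ICurrentUser to InMemoryCurrentUser in that application’s unity.config and the Logic layer carries on, none the wiser.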
Data access using LINQ2SQL and the Entity Framework Code First Model
LINQ2SQL and I got off to a bad start because I moved into so many brown-field applications where it had been naively implemented and was killing the database. For example, with ignorant use of foreign key relationships and a Repeater control, I once saw a web page making 50,000 database requests.
If one is working on a project by themselves, they can learn and remember the limitations of LINQ and it works well. However, I usually need to plan for a team of developers – experienced and inexperienced – as well as anticipate the day when I leave and others come in and have to pick up the project. For these reasons, I didn’t adopt LINQ until mid 2010, instead opting for stored procedures, which at least gave me 100% control over my SQL.
That was until I discovered the Entity Framework Code First Model, probably the biggest game-changer for my architectures in the last few years.
Non-data-bound Entities
EF CF allows me to design my own entities (POCOs) and bind my LINQ queries back to them. Previously, LINQ returned its own entity types, which meant that if you used them in your web layer (for example, to render a list of people), your project ended up with a SQL dependency right from the web layer. It also meant that you had to keep a (static) database connection open throughout the duration of the page call, so that changes to your entity could be ‘remembered’ by LINQ and saved back to the data store (this is most often done by storing a LINQ ‘context’ in your Global.asax class and opening it in Application.BeginRequest). Urgh.
With EF, you declare your entities completely separately from the data-access. In my example project, I have declared them in the Contracts.Entities.Data namespace. Note that there are not even any attributes involved here – there is nothing to suggest that they are going to be used in LINQ queries (actually, EF does support various attributes, I just avoided them).
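To illustrate, an entity in that namespace can be as plain as this (a sketch; the demo project’s actual properties may differ):

namespace BlackBallArchitecture.Contracts.Entities.Data
{
    // A plain POCO - no EF attributes, no base class, no open connection
    public class Person
    {
        public int PersonID { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }
}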
I then reference these entities back in the Data.dll assembly, using them as return types for my data context.
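A minimal sketch of such a context, using EF’s DbContext API (the context name here is illustrative, not the demo project’s):

using System.Data.Entity;
using BlackBallArchitecture.Contracts.Entities.Data;

namespace BlackBallArchitecture.Data
{
    public class BlackBallContext : DbContext
    {
        // Each DbSet returns the POCO entities declared in Contracts
        public DbSet<Person> People { get; set; }
        public DbSet<SystemLog> SystemLogs { get; set; }
    }
}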
The reason I put my entities in a separate assembly was so that I could use them from the web layer (they are return types from my logic layer) without requiring a reference to the Data.dll assembly, making it impossible for the developer to (inadvertently or not) call the data store without going through our Logic layer.
Caching
Data entities which are completely independent of the data source are also essential for caching – after all, you can’t store an open database connection on disk on another server. See the section on Caching below for more details.
Force your developers to explicitly call for more data
This harks back to the example before about the 50,000 database hits in one page. By removing the sub-objects and sub-collections (which are what foreign key relationships resolve to in LINQ), you change this code…
foreach(var person in people) {
Response.Write(person.FirstName + " has " + person.Orders.Count() + " orders");
}
…to this:
foreach(var person in people) {
var totalOrders = new OrderManager().GetOrdersForPerson(person.PersonID);
Response.Write(person.FirstName + " has " + totalOrders + " orders");
}
Both of these examples result in the database being called to get the total orders for each person, so if there are 50,000 people in your list, you will call the database 50,000 times. The difference is that this behavior is not immediately apparent when looking at the first example. With the second method, a good developer will realize what is happening and take actions to resolve it (perhaps by getting all order counts in one call before the loop starts).
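For example, the second loop can be reduced to a single up-front query. A sketch, assuming a hypothetical GetOrderCountsForPeople() method that returns a Dictionary<int, int> keyed by PersonID:

// One database call to fetch all the counts... (GetOrderCountsForPeople is hypothetical)
var orderCounts = new OrderManager().GetOrderCountsForPeople(people.Select(x => x.PersonID).ToList());

// ...then the loop never touches the database
foreach (var person in people)
{
    int totalOrders;
    orderCounts.TryGetValue(person.PersonID, out totalOrders);
    Response.Write(person.FirstName + " has " + totalOrders + " orders");
}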
When I first started implementing this design for my clients, their developers were fairly unimpressed, and rightly so – it adds work. But I consider it absolutely essential if you want to use LINQ2SQL in a multi-developer environment.
T4 Templates for Code Generation
One thing that soon becomes apparent when using the Code First model and Unity is that for each database entity you are dealing with, you have quite a number of classes:
- an entity (e.g. Person)
- a data-access class
- an interface for the data-access class
- ObjectContexts and Configurations for Code First
- any number of other ‘helpers’ like my GetOrCreatePerson() method
Typing these by hand would be very time-consuming and error-prone, so I use T4 templates to do the heavy lifting for me. T4 templates have been around for a long time, but were little known until things like the Entity Framework came out. They are awesome in their simplicity. T4 templates will only get you so far, however. In order to truly unleash the power of code generation, I also use a fantastic third-party tool called the T4 Toolbox, which allows you to:
- re-use and parameterize your templates, in much the same way you would with a regular C# class
- generate multiple files from a single generator – no more files with 100,000 lines
- deploy your generated files across multiple projects – an essential requirement for my design, although I suppose you could do it using multiple templates (one for each project) if you enjoyed wasting time
The result is the CodeGen.dll assembly you have in the example project. Every time the database structure changes, just open and save the Controller.tt file. It queries the database and generates the files based off the structure.
Of course, you may not wish your code to match the database structure – no problem, just write the various classes by hand.
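If you haven’t seen T4 before, the basic shape of a template looks like this – a toy sketch only (the real Controller.tt derives the entity names by querying the database schema rather than hard-coding them):

<#@ template language="C#" #>
<#@ output extension=".cs" #>
<#
    // Hard-coded here for illustration; Controller.tt reads these from the database
    var entities = new[] { "Person", "SystemLog" };
    foreach (var entity in entities)
    {
#>
public partial interface I<#= entity #>Manager
{
}
<#
    }
#>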
Caching
The first thing to remember about caching is that SQL Server is a very good cache. Its sole purpose is to remember things and return them to your application.
However, once your site traffic picks up, you will come to realize that those fancy data queries are awfully time consuming for what results in just a couple of objects returned, so at this point, you need to cache.
The attached project structure just uses the HttpApplication cache built into Microsoft.Net. This cache is actually fantastically fast (see my previous post comparing cache performance), but it has limitations: you can’t access it between web applications, nor distribute it to other servers, etc. I’m not going to compare caches in this blog, but just remember:
- you can’t cache a LINQ2SQL object that is still connected to its data store (hence my Code First POCO model is excellent for this)
- caching should be done in the Logic layer – the web layer should have no idea where the objects come from (see the sketch below)
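Here’s a sketch of what a cached call in the Logic layer might look like, assuming a hypothetical LoadPeopleFromStore() method that performs the actual LINQ query (requires System.Web and System.Web.Caching):

public List<Person> GetPeople()
{
    // The web layer just gets a List<Person> back - it has no idea a cache is involved
    var people = HttpRuntime.Cache["AllPeople"] as List<Person>;
    if (people == null)
    {
        people = LoadPeopleFromStore(); // hypothetical - the real query against the data store
        // Detached POCOs can be cached safely; a connected LINQ2SQL object could not be
        HttpRuntime.Cache.Insert("AllPeople", people, null,
            DateTime.UtcNow.AddMinutes(5), Cache.NoSlidingExpiration);
    }
    return people;
}

(HttpRuntime.Cache, unlike HttpContext.Current.Cache, is safe to call outside a web request, which is exactly what a Logic layer needs.)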
Unit Testing
Now, I actually find unit testing extremely boring and often it is a waste of time. However, when working in a multi-developer environment, tools such as unit testing are an invaluable insurance policy against sloppy development and miscommunication.
A core tenet of unit testing is to isolate (and test) the minimum amount of functionality at once, and this means reducing dependencies between classes as much as possible. For example, consider this function which returns a requested person…
public Person GetPerson(int personID){
var person = new LinqContext().Persons.FirstOrDefault(x => x.PersonID == personID);
return person;
}
A very small and simple function, and easy to test. Except that it depends on the database (LinqContext). This means that in order to test this function, you must first set up a database, including inserting appropriate test data. Although possible, it kind of defeats the aforementioned tenet.
So this brings me back to Dependency Injection and my beloved Code First model. In the attached example project, I simply replace the database-bound IRepository with a memory-bound IRepository. Then, when I call IPersonManager.GetPerson(), instead of using LINQ2SQL on my database, it uses LINQ2Entities on my in-memory collection. My logic is tested without any database required.
This is actually amazing – unit testing aside, I’ve actually switched out my data store from a database to in-memory without changing a single line of my data-access code. In theory, I could switch in an XML-based data store or shove things directly in the cloud (if somebody would build a LINQ2Cloud provider). This is surely the whole purpose of LINQ, but it wasn’t practical until Microsoft gave us the Code First Model and POCO binding.
Almost makes me want to actually write unit tests.
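To make the swap concrete, here’s a sketch (names are illustrative; the demo project’s IRepository differs in detail):

using System.Collections.Generic;
using System.Linq;

// The contract exposes queryable sets, with no mention of a database
public interface IRepository
{
    IQueryable<Person> Persons { get; }
}

// Memory-bound implementation, registered only in the Test project's unity.config
public class MemoryRepository : IRepository
{
    private readonly List<Person> persons = new List<Person>
    {
        new Person { PersonID = 1, FirstName = "Jane", LastName = "Doe" }
    };

    public IQueryable<Person> Persons
    {
        get { return persons.AsQueryable(); }
    }
}

PersonManager.GetPerson() then runs its LINQ query against the in-memory list above, and the test never touches SQL Server.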
Other Notes
Broken Project References
Above, I mentioned that our web layer has no reference to the Data or Logic layers, which is great. Unfortunately, this means that when you compile the project, Visual Studio does not copy the DLLs into the bin folder, and so when you run it you get errors where the site can’t reference a requested class. To circumvent this, I simply added some post-build commands to copy the files manually:
copy "..\..\BlackBallArchitecture.Logic\bin\$(ConfigurationName)\BlackBallArchitecture.Logic.dll" "$(TargetDir)"
copy "..\..\BlackBallArchitecture.Data\bin\$(ConfigurationName)\BlackBallArchitecture.Data.dll" "$(TargetDir)"
Resharper
Ah Resharper, what would we do without you? With all these files and interfaces flying around, I find Resharper an absolute must for quickly navigating and refactoring my code. Just spend the money guys.
MVC
This article dealt mainly with the solution layers behind the UI – business logic, testing and data access. Unfortunately that’s only part of the story and the web layer still needs work. In particular, it would be nice to see an MVC architecture separate the UI from the ‘UI logic’ (or UX, I suppose) – the code-behind model restricts the re-usability of our UX by tying it to ASPX markup (update: I’ve got a front-end architecture now).
Summary – see it in action
I’ve tried to be as clear as possible, but reading back, it’s pretty hard to explain everything all at once. If you’re like me, the best way is to download the code and play around. Even then, the true benefits of the model may not be apparent until you build a large system, or invite other developers to work alongside you. Remember, you can also see it in action at www.rate-it.co.
Anyway, I hope it’s been thought-provoking and hopefully of some use to some of you. Again, if you have any improvements or even (especially) if you hate the architecture, please leave a comment below.