Information is king: December 2008

Wednesday, December 31, 2008

Essential Software Concepts: Fail fast

Software development is hard. Not only is development itself hard, the environment and hardware are not exactly fault-proof either. Let's not forget the users: you can be sure they will do something you didn't think about. Then there is the human factor in deployment, database updates and the like. So one of the few truths about software is that things will fail. The question is what you do about the failures.

You generally have two (or three) options when dealing with errors: you can try to make the system recover, or you can fail at once. (The third is to swallow the error and cross your fingers that it will work out in the end.)

Unless it is essential that the system never fails, I like to fail fast. By that I mean: if something doesn't work properly, fail the operation as soon as possible. Failing doesn't mean crashing the application; you should still provide an informative message to the user. Failing like this has, as usual, both advantages and disadvantages. The main advantage is related to tracking and fixing bugs. One of the major efforts in fixing a bug is understanding where and why it happened. If you fail early rather than late, you'll most likely have an easier time tracking the bug down. You don't have to worry about the system having kept running in an inconsistent state, and there's a better chance that the stack trace actually shows where the problem is located.
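To make the idea concrete, here is a minimal sketch in Python. The names (TransferError, transfer, handle_request) and the account example are made up for illustration. The operation checks all of its inputs before it touches any state, and the caller turns a failure into a message for the user instead of letting the application crash.

```python
class TransferError(ValueError):
    """Raised when a transfer request is rejected up front."""


def transfer(accounts, source, target, amount):
    # Validate everything before changing any balance, so a bad request
    # fails here and never leaves the accounts half-updated.
    if amount <= 0:
        raise TransferError(f"amount must be positive, got {amount}")
    if source not in accounts:
        raise TransferError(f"unknown source account: {source!r}")
    if target not in accounts:
        raise TransferError(f"unknown target account: {target!r}")
    if accounts[source] < amount:
        raise TransferError(f"insufficient funds in {source!r}")

    accounts[source] -= amount
    accounts[target] += amount


def handle_request(accounts, source, target, amount):
    # Failing the operation is not the same as crashing the application:
    # the error becomes an informative message for the user.
    try:
        transfer(accounts, source, target, amount)
    except TransferError as exc:
        return f"Transfer rejected: {exc}"
    return "Transfer completed"
```

Because the checks run first, a rejected transfer leaves the balances exactly as they were, and the error message points at the actual cause rather than at some later symptom.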

The main disadvantage, or rather the question to ask yourself, is: "Is failing early fine in production?" If you can't say yes to this, then know that you'll have a much harder time tracking bugs ahead of you. It's not an easy question, though. A strict policy of failing early can sometimes mean that users are unable to use important functionality. Furthermore, a system that keeps failing is not good for building customer trust. In some cases it is perhaps better to recover and fix the problem in the background. Whatever you choose, it is certainly not a black-and-white question.

Tuesday, December 30, 2008

Essential Software Concepts: Broken windows

The concept of broken windows is one of the concepts I remember best from The Pragmatic Programmer. It is one of those simple concepts that you need to be aware of as a developer. But even though it is known to most developers (at least in principle), it is still violated all over the place - just to crank out a little more functionality. You will be paying off the technical debt you build up that way for a long time.

I'll explain the concept, but if you don't know it, you really should go ahead and read the Pragmatic Programmer book. A quick and good read.

The concept can be explained like this: Say you have a house. As long as the house stands firm, there's a good chance it will stay that way. But as soon as a window is broken and nothing is done about it, the downward spiral begins. Another window is broken, vandals run free, and eventually you get to the point where restoring the house to its former self is more costly than just replacing it.

This certainly holds true for software projects as well. If you keep the architecture sound, continuously work to keep the implementation maintainable (by separating concerns, avoiding duplication, and so on), and fix problems as soon as you find them, you'll have a much better chance of keeping things maintainable than if you don't.

Taking the time to implement something properly will benefit you in the long run (and I'm talking weeks and months here, not years). The additional time it takes to implement something properly is small compared to the cost of being bogged down in poorly architected code. Making changes (again, we're talking weeks and months here) will be more costly in such code, and you don't need to pay that price.

Monday, December 29, 2008

Essential Database Practices: Parameterized queries

When you access a relational database the old-fashioned way (no ORMs or LINQ), you generally have three options: you can call stored procedures, you can send queries manually as a concatenated text block, or you can send queries manually as a parameterized query.

Options one and three are both valid ways of doing it, but if you ever build queries as concatenated text blocks WITHOUT parameters, you need to stop. I'll explain why by explaining why parameterized queries are good for you.

You should use parameterized queries for three reasons:

SQL injection
Query plan caching
Query plan cache size

SQL injection

This is the most obvious one. It is hopefully known to all, as much focus has been put on it in recent years. If not, look it up on Wikipedia.
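For a quick illustration, here is a small sketch in Python. It uses the standard library's sqlite3 module with an in-memory database rather than SQL Server, purely so that it runs anywhere; the users table and the function names are made up for the example, and the placeholder syntax (? here) differs between database drivers.

```python
import sqlite3

# In-memory SQLite database standing in for a real server (an assumption
# made so the example is self-contained).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)])


def find_user_concatenated(name):
    # UNSAFE: the input becomes part of the SQL text itself,
    # so a quote character in it can change the query.
    sql = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(sql).fetchall()


def find_user_parameterized(name):
    # Safe: the SQL text is fixed and the input is passed separately,
    # so it is only ever treated as a value.
    sql = "SELECT name FROM users WHERE name = ?"
    return conn.execute(sql, (name,)).fetchall()


malicious = "nobody' OR '1'='1"
leaked = find_user_concatenated(malicious)     # matches every row
nothing = find_user_parameterized(malicious)   # matches no row
```

With the concatenated version the WHERE clause becomes name = 'nobody' OR '1'='1', which is true for every row. With the parameterized version the whole string is compared against the name column, and no user has that name.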

Query plan caching

Every time a query is executed in SQL Server, it goes through two main steps: parsing/compilation and execution. During the parsing/compilation step, SQL Server creates a plan for how it will run the query. Once this is completed, the plan is put in the query plan cache and the query is executed.

In the parsing/compilation step, before doing anything, SQL Server will look in the cache to see if the query plan has already been created. If it has, there is no need to create the plan again. Depending on the query and how many times the query is run, you can save a significant amount of time if the query is cached.

The cache check is done by a lookup in a hash table where the key is the entire text of the SQL query, formatting included. The reason I list query plan caching as a benefit of parameterized queries is that the contents of the parameters do not matter, so running the same query with a number of different parameter values will use the same plan each time. If you concatenate your queries manually, a new plan will generally be made for each distinct query text. Note that this is also important to remember for the dynamic SQL you build in your stored procedures. Use sp_executesql instead of plain exec to execute those queries; then you can add parameters to your dynamic SQL as well.
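The effect of keying the cache on the query text can be sketched with a toy model in Python. This is not how SQL Server is implemented; ToyPlanCache, the orders table and the @id parameter are made-up names, and the "plan" is just a label. It only shows why concatenated values produce one cache entry per value while a parameterized query produces a single entry.

```python
class ToyPlanCache:
    """Toy stand-in for a plan cache: a hash table keyed by the exact query text."""

    def __init__(self):
        self.plans = {}
        self.compilations = 0

    def get_plan(self, sql_text):
        # A lookup miss means the query has to be compiled (counted here).
        if sql_text not in self.plans:
            self.compilations += 1
            self.plans[sql_text] = f"plan #{self.compilations}"
        return self.plans[sql_text]


def run_concatenated(cache, customer_ids):
    # The value is baked into the text, so every value is a new key.
    for customer_id in customer_ids:
        cache.get_plan("SELECT * FROM orders WHERE customer_id = " + str(customer_id))


def run_parameterized(cache, customer_ids):
    # The text is identical on every call; only the parameter value differs.
    for _ in customer_ids:
        cache.get_plan("SELECT * FROM orders WHERE customer_id = @id")


concatenated_cache = ToyPlanCache()
run_concatenated(concatenated_cache, [1, 2, 3])

parameterized_cache = ToyPlanCache()
run_parameterized(parameterized_cache, [1, 2, 3])
```

In this model the concatenated run compiles three plans and the parameterized run compiles one. Since the key is the raw text, even a difference in whitespace or letter case would count as a different query.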

Query plan cache size

Each query plan created increases the size of the cache. Plans are evicted from the cache when they are not reused, but this can take some time on a busy server. I don't think this usually has a big impact, but creating a large number of query plans does fill up the cache. That means they use up an unnecessary amount of your server's RAM. Shame on you.

If you want to have a look at how much time the parsing/compilation step takes for a given query, run SET STATISTICS TIME ON before running the query. The actual cache can be accessed via the sys.dm_exec_cached_plans view.

New blog series: Essential this, Interesting that

I recently got the idea for a new form of blog post that I think will be interesting. Basically it involves covering topics that I find to be either essential or interesting within a particular area, and labeling them visibly as such.

I hope you'll find value in it.

Happy holidays :)

Monday, December 15, 2008

Good stuff on Botnets, DDoS and Scripting

Rob Conery has a nice blog post titled "The Perfect Storm Botnet".

A good, fairly short read - a bit like kids' horror stories for computer geeks, except that these are true, of course.

So just watch out for all those *-injection attacks, kids. It's seldom pretty.
