SQL Coding Guidelines

The goal of these guidelines are to:

  • Help us develop quality code that will scale well.
  • Make life easier on your fellow developers and those who will pick up this code in three years to tweak something.

To accomplish these objectives, we all need to follow these guidelines. Most are not specific to any specific database technology.

Database Tables:

  • Normalize data based on the 3rd Normalization Form (http://en.wikipedia.org/wiki/Third_normal_form). Do not attempt to over-normalize data or de-normalize data until you can demonstrate a specific problem you think it will solve. Then you will need to prove with testing that the changes will help.
  • Table names should be Pascal case.
  • Table names should be real words. Preferably a noun. Car, Project, Person, Employee, etc. Unless an abbreviation is well-known and new developers will understand it immediately, do not use abbreviations.
  • Tables that create a many-to-many relationship between two tables and store no other values should be named “TableName1″ + “TableName2″ + “Map”. ProjectCategoryMap for example.
  • Do not use “_” or spaces in table names.
  • Do not preface table names with anything.
  • Table names cannot be a reserved word in SQL. If you are not sure, go type the word in a Query window of SQL Management Studio and if it changes colors, you cannot use it.
  • Table names should be singular. If the name you want to use is a reserved word in singular form but not in plural form, find another name. User is reserved, Users is not but plural so find something else. Like AppUser which violates abbreviations, but it’s common enough so we’ll deal with it.
  • All tables should be owned by dbo. Explicitly establish the owner when creating tables by prefacing the name with dbo, Ex. create table dbo.Test
  • Unless you are a DBA, do not use triggers. Ever. For any reason. Ever.
  • Use a single Lookup table for all single-value structures. For example, a Person may have a Hair Color and an Eye Color. You may want to put the options in a table for populating lists in the application and create a relationship to the data so the value can change over time.

    Many would request you create a HairColor table with HairColorId and HairColorName as columns, and an EyeColor table with EyeColorId and EyeColorName as columns.

    We need you to use a single lookup table capable of storing simple lookups for many types of data. The Lookup table is named AppLookup and has AppLookupId, AppLookupTypeName, and AppLookupName columns. For the example above, you would insert a row into Lookup for each Hair Color or Eye Color as the AppLookupName with LookupTypeName set to either HairColorType or EyeColorType.

Columns:

  • Column names should be Pascal case.
  • Do not use “_” or spaces in column names.
  • The first column in every table should be an identity column named “TableName” + Id. If you have a Project table, its first column should be ProjectId.
  • Id columns should be Id, not ID
  • If the data represented by the table has a primary owner – a Project is owned by a Client for example – the next set of columns should be the things this table refers to that own it. ClientId for example.
  • If the table represents something that has a name or a label, the column should be named “TableName” + Name. ProjectName for example.
  • Unless the database is going to have a lot of data – and we mean a lot of data – limit the number of data types you use. Make all decimal values Decimal(12,3) instead of using Money, Float and all of the other various data types. This will keep other developers in three years from having to figure out what data type you used.
  • Do not use “n” data types (nvarchar, nchar, etc.) unless you have to.
  • Do not use text or image data types. Use varchar(max) for large text blobs. Store images in the file system and store the filename in the database. Do not store the full path to the file in the database. Store the current root-directory’s path in the application’s configuration, and the relative path/filename in the database.

Keys/Indexes/Constraints:

  • One or more columns should be specified as the PK. By default, it should be either the identity column for the table, or the non-identity columns that make up a Map table. In pre-production load testing, we may come back in and adjust the PK.
  • Do not create any indexes during development. We may add indexes in pre-production load testing.
  • Create Foreign Keys for any column that refers to a column in another table.
  • If the table has a FK to another table, the column name should match the Id column of the referred-to table. ClientId in the Project table is the same name as ClientId in the Client table. Another developer can look at the column ClientId and assume that it is a reference to the Client table. If it was named OwnerId in the Project table, another developer would have to figure out where it goes by digging into the keys.
  • If the table has a FK to the Lookup table, the column name should match the type of the Lookup, HairColorTypeId, EyeColorTypeId, to carry on the example from above. If another developer sees a column in a table ending in TypeId, she will know that it refers to the Lookup table and the subset of rows are those with the LookupTypeName of HairColorType.
  • Constraints should be used where appropriate

Stored Procedures:

  • All interaction between the application and the database should be done with stored procedures. No exceptions. Never embed SQL statements in your application. Ever.
  • Stored procedures should use the following naming convention: “app” + “Table” + “_” + “Verb” + “Conditions”
  • “app” is a lower case, two-character application-specific prefix. This becomes useful as a database ages and more than one application or team is developing against it. It will make it easier for the team to isolate the stored procedures it has developed if it uses a unique preface.
  • “Table” is the name of the core table whose data is being requested.
  • “Verb” helps the developer understand what the stored procedure is doing and should be something like “Get”, “GetAll”, “Save”, “Delete”
  • “Conditions” can represent required parameters or other information helpful for another developer to understand what the stored procedure does without opening it. ncProject_GetForUser would return Projects for a user for example. You will learn that _Get returns a single row that matches the Id column of a table. ncProject_Get would expect ProjectId as input and return the one row it matches. Can you guess what ncClient_Get does?
  • All input parameter names should match a column name. If you are writing a stored procedure to return Projects for a Client, I would expect it to be named ncProject_GetForClient, and have a @ClientId int parameter since that’s what the underlying column name is in the table. If you write the procedure with the parameter @Client int, you’ll be adding a step for another developer to figure out what the parameters are named.
  • Do not use “sp” or “sp_” as a prefix for stored procedures.
  • The first statement in all stored procedures should be SET NOCOUNT ON
  • In general, do not use output parameters or return values. To return a single value stored in a variable, just do select @var as Name
  • Do not trap errors in a stored procedure unless you are providing an alternate path and will not raise the error again. Errors are handled by the framework at the connection level and will report/expose database errors just like other application errors. You should focus on preventing errors by testing input and data for conditions that will cause an error, not catching errors after the fact.
  • Stored procedures should be very small and most times, should focus on doing one thing or contain one primary statement.
  • Do not use transactions in stored procedures. If you think a transaction is needed, consult with a development manager because a cleaner solution is probably available.
  • Stored procedures should return only one resultset. If you think you need multiple resultsets, you probably need a single but more complex select statement or multiple stored procedures.

Query Writing/SQL Syntax

  • Do not use select *. Always explicitly list column names.
  • Column names should always be on a new line.
  • All queries that return more than one row should have a default sort order. Additional sorting will be handled in the application layer.
  • In your editor, set tabs to be replaced by four spaces so formatting will be retained regardless of editor.
  • Format statements with major clause left-aligned and column/table names indented four characters.
  • When commas are used to separate column names, place the comma at the beginning of the line, not and end. This makes it easier to comment out rows while developing and testing.
  • Use NOLOCK and ROWLOCK as much as possible.
  • Handle optional parameters by accepting the parameter as null and testing for null and a valid value in the where clause. Parameter definition would be @ProjectId int = null … where clause would be (@ProjectId is null or ProjectId = @ProjectId). It is important that the @ProjectId is null is tested first so the condition will be abandoned before scanning the table if one is not provided.
  • Assign column name aliases only where there will be a conflict with other columns in the same output. A column named ProjectName should always be ProjectName in all query output to the application. Do not give it an alias such as select ProjectName as Project from Project … Aliases make is more difficult for other developers to reuse your queries without looking at the stored procedure. They also make it difficult to populate resultsets into properties of the domain classes.

This statement illustrates much of what is above:

select
    c.ChamberId
    , c.ChamberName
    , ...
from
    Chamber c with (nolock)
    inner join State s with (nolock) on s.StateId = c.StateId
where
    (@StateId is null or c.StateId = @StateId)
order by
    s.StateName asc
    , c.CityName
    , c.ChamberName

Do not do:

Select *
From Chamber, State
Where State.StateId = Chamber.StateId
And State.StateId = IsNull(@StateId, State.StateId)

Or:

Select ChamberId as Id, ChamberName as Name, ...
From Chamber, State on
Where State.StateId = Chamber.StateId
And State.StateId = IsNull(@StateId, State.StateId)

Problems with the above:

  • Poorly formatted and hard to read
  • First example does not specify columns, second statement formats columns as one big hard-to-read block
  • Both use old-style joins with no aliases for tables
  • Second one uses alias for column names
  • “Name” is a reserved word in SQL and should not be used for anything other than SQL’s intended purpose
  • Using IsNull() for optional parameter performs poorly compared with (@var is null or column = @var)
  • No default sort order

More Query Writing/SQL Syntax rules:

  • Join to a derived query in the from clause instead of using sub queries in the select or where clause of a statement. Avoid using where column IN(select…) and NOT IN(select …).
  • Use joins in complex Delete statement instead of using the IN() clause.

Do:

delete p
from
   Client c
   join Project p on p.ClientId = c.ClientId
where
   c.ExpirationDate < @today

Instead of:

delete
   Project
where
   ClientId in (select ClientId
                from Client
                where ExpirationDate < @today)
  • Always explicitly list column names in insert statements.
  • Do not use cursors. If you feel the need to use a cursor, see a development manager to review the specific problem for other solutions. If no alternative can be found, we might use cursors.
  • By default, splitting a record set into multiple pages will be handled in the application layer. Paging a record set in the database should only be done for very large recordsets and requires the approval of a development manager. Most likely, you need more/required filters because the user will never use or page through a recordset of 50K rows.
  • If a statement will have multiple tables and multiple filter criteria, the first table should be the one where a filter will always be provided and it will result in the smallest number of rows to be joined.

Do:

delete
    Project
from
    Client c
    join Project p on p.ClientId = c.ClientId
where
    c.ExpirationDate < @today

Instead of:

delete
    Project
from
    Project p
    join Client c on c.ClientId = p.ClientId
where
    c.ExpirationDate < @today

The first statement will reduce the number of Clients to only those who match Expiration Date criteria before attempting to read/filter the Project table.

If we reverse there From/Join order as is done in the second statement, SQL is likely to join all rows of the Project table to the Client table resulting in a much larger resultset to be filtered.

I haven’t tested this approach in a while, but as far as I know it is still valid and is a design practice you need to adhere to.

Share It:
  • Digg
  • del.icio.us
  • Facebook
  • Google
  • E-mail this story to a friend!
  • Netvibes
  • Ping.fm
  • Reddit
  • StumbleUpon
  • Technorati