back to main page

the leading-commas convention in SQL

SQL is a whitespace-insensitive language. You’re free to arrange tokens however you want on the screen. Some may quibble about capitalization or whether to put a newline directly after keywords; these are stylistic differences without much decide between them. But everyone recognizes in principle that consistent indentation is an objectively good practice, i.e. everyone thinks that

select
    foo,
    bar,
    some_function(baz) as new_baz,
    bax,
    ... -- lots more stuff here
from
    some_table
where
    foo > bar
    and not bax

is easier to work with than

select foo, bar, some_function(baz) as new_baz, bax, ...
from some_table where foo > bar and not bax

On the other hand, the leading commas convention is not so obvious to people. It does not seem natural to write code like this:

select
    foo
    , bar
    , some_function(baz) as new_baz
    , bax

In English we are used to seeing commas glued to the ends of words, so it looks a little uncanny when they stand alone. Nevertheless, there are objective benefits to this style.

The main reason is that SQL doesn’t let you write a trailing comma, i.e. this is not allowed:

select
    foo,
    bar,
    some_function(baz) as new_baz,
    bax,        -- syntax error
from
    some_table

This limitation is a source of endless syntax errors. It’s especially common when working interactively, trying to refine you query to get what you want. If you start out like this:

select
    foo,
    bar,
    some_function(baz) as new_baz,
    bax
from
    some_table

and you need to temporarily comment out the final bax line, you might carelessly forget to remove the comma on the preceding line:

select
    foo,
    bar,
    some_function(baz) as new_baz,
    -- bax
from
    some_table

And even if you don’t forget, it’s a chore to move the cursor around more just to satisfy the parser.

If you put your commas at the front, this problem goes away:

select
    foo
    , bar
    , some_function(baz) as new_baz
    -- , bax
from
    some_table

You might object that this has merely shifted the problem to the first line, rather than the last. And yes, now it is a syntax error to comment out the foo without also deleting the comma in front of bar. But in practice, it’s way more common to make adjustments to later lines than earlier lines.

Think about what happens when you write a query. Typically, you have some columns you know for a fact you’re going to need, often a primary key or a name or just *, and those are the ones you write down first. After that come columns you’re less sure about, or ones with more complicated subqueries or windowing functions. Those take more attempts to get right, and you often go back and forth, tweaking and commenting and uncommenting until it works. Those edits happen disproportionately near the end of the select clause. The trailing-commas convention punishes you here, whereas the leading-commas convention is forgiving.

It is easier at a glance to spot missing leading commas than missing trailing commas. They are always in the same horizontal position, so you just have to scan your eye down a vertical column. Missing ones stick out:

select
    something
    , something_else
    sum(blah)                           -- missing comma is obvious
        over (
            partition by something_or_other
            order by banana
        )
        as some_fancypants_thing
    , another_thing
    , and_something_else
from
    some_table

But with trailing-commas your eye must follow and unpredictable zig-zag because of the inconsistency in line lengths:

select
    something,
    something_else,
    sum(blah)
        over (
            partition by something_or_other
            order by banana
        )
        as some_fancypants_thing        -- missing comma may be overlooked
    another_thing,
    and_something_else
from
    some_table

Similar advantages apply to version control diffs: adding a column produces a one-line diff in the leading-comma convention, compared to a two-line diff in the trailing-comma convention.