SQL is a whitespace-insensitive language. You’re free to arrange tokens however you want on the screen. Some may quibble about capitalization or whether to put a newline directly after keywords; these are stylistic differences without much decide between them. But everyone recognizes in principle that consistent indentation is an objectively good practice, i.e. everyone thinks that
select
foo,
bar,as new_baz,
some_function(baz)
bax,... -- lots more stuff here
from
some_tablewhere
> bar
foo and not bax
is easier to work with than
select foo, bar, some_function(baz) as new_baz, bax, ...
from some_table where foo > bar and not bax
On the other hand, the leading commas convention is not so obvious to people. It does not seem natural to write code like this:
select
foo
, baras new_baz
, some_function(baz) , bax
In English we are used to seeing commas glued to the ends of words, so it looks a little uncanny when they stand alone. Nevertheless, there are objective benefits to this style.
The main reason is that SQL doesn’t let you write a trailing comma, i.e. this is not allowed:
select
foo,
bar,as new_baz,
some_function(baz) -- syntax error
bax, from
some_table
This limitation is a source of endless syntax errors. It’s especially common when working interactively, trying to refine you query to get what you want. If you start out like this:
select
foo,
bar,as new_baz,
some_function(baz)
baxfrom
some_table
and you need to temporarily comment out the final bax
line, you might carelessly forget to remove the comma on the preceding
line:
select
foo,
bar,as new_baz,
some_function(baz) -- bax
from
some_table
And even if you don’t forget, it’s a chore to move the cursor around more just to satisfy the parser.
If you put your commas at the front, this problem goes away:
select
foo
, baras new_baz
, some_function(baz) -- , bax
from
some_table
You might object that this has merely shifted the problem to the
first line, rather than the last. And yes, now it is a syntax error to
comment out the foo
without also deleting the comma in
front of bar
. But in practice, it’s way more common to make
adjustments to later lines than earlier lines.
Think about what happens when you write a query. Typically, you have
some columns you know for a fact you’re going to need, often a primary
key or a name or just *
, and those are the ones you write
down first. After that come columns you’re less sure about, or ones with
more complicated subqueries or windowing functions. Those take more
attempts to get right, and you often go back and forth, tweaking and
commenting and uncommenting until it works. Those edits happen
disproportionately near the end of the select
clause. The
trailing-commas convention punishes you here, whereas the leading-commas
convention is forgiving.
It is easier at a glance to spot missing leading commas than missing trailing commas. They are always in the same horizontal position, so you just have to scan your eye down a vertical column. Missing ones stick out:
select
something
, something_elsesum(blah) -- missing comma is obvious
over (
partition by something_or_other
order by banana
)as some_fancypants_thing
, another_thing
, and_something_elsefrom
some_table
But with trailing-commas your eye must follow and unpredictable zig-zag because of the inconsistency in line lengths:
select
something,
something_else,sum(blah)
over (
partition by something_or_other
order by banana
)as some_fancypants_thing -- missing comma may be overlooked
another_thing,
and_something_elsefrom
some_table
Similar advantages apply to version control diffs: adding a column produces a one-line diff in the leading-comma convention, compared to a two-line diff in the trailing-comma convention.