I’m a fairly new .NET developer. I worked with the framework a bit in 2005–6,
but hadn’t really touched it since then. In the meantime, I’ve been sticking
to the Linux ecosystem, and a little OS X, as mentioned in a previous
article.
So, time zones. I know they’re a sore point for many environments, but most
seem to dig themselves out of the hole and provide something that is, in the
end, usable.
Ruby’s builtin Time is actually pretty darn good, and if you use Rails,
ActiveSupport makes it even better. pytz seems .. alright. Databases
generally have their heads screwed on straight. A lot of the time you can get
away with just storing seconds since the epoch and call it a day, because
there’s nothing more intrinsic built into the system.
Then I got my new job, and it was time get back into
.NET. A lot has changed since 2006; it had only hit 2.0 then, mind.
So I felt confident I was using the latest, modern stuff. We target 4.0 and
4.5 across our projects, and there’s plenty nice about it.
Then I had to work with System.DateTime. Oh. Oh, gosh.
I quote the manual at you.
DateTime.Kind Property
Gets a value that indicates whether the time represented by this instance is
based on local time, Coordinated Universal Time (UTC), or neither.
Kind is the lone field on a DateTime which has anything to do with time
zones. It can take the value Local, Universal, or Unspecified. What
does that even MEAN. Note that Kind is ignored in comparisons, too, which
can only mean more fun for application developers.
It would be remiss of me to fail to note the paragraph in the docs which state:
An alternative to the DateTime structure for working with date and time
values in particular time zones is the DateTimeOffset structure. The
DateTimeOffset structure stores date and time information in a private
DateTime field and the number of minutes by which that date and time differs
from UTC in a private Int16 field. This makes it possible for a
DateTimeOffset value to reflect the time in a particular time zone, whereas a
DateTime value can unambiguously reflect only UTC and the local time zone’s
time. For a discussion about when to use the DateTime structure or the
DateTimeOffset structure when working with date and time values, see
Choosing Between DateTime, DateTimeOffset, and
TimeZoneInfo.
The linked page states that “although the DateTimeOffset type includes most of
the functionality of the DateTime type, it is not intended to replace the
DateTime type in application development.”
Intention or not, it should ALWAYS be used. It lists as a suitable
application for the DateTimeOffset:
Uniquely and unambiguously identify a single point in time.
Because we don’t want that at any other time? When do you want a DateTime
which non-specifically and ambiguously identifies several points in time?
On the other hand, listed as suitable for DateTime:
Retrieve date and time information from sources outside the .NET Framework,
such as SQL databases. Typically, these sources store date and time
information in a simple format that is compatible with the DateTime
structure.
No fucking comment.
It continues:
Unless a particular DateTime value represents UTC, that date and time value
is often ambiguous or limited in its portability. For example, if a DateTime
value represents the local time, it is portable within that local time zone
(that is, if the value is deserialized on another system in the same time
zone, that value still unambiguously identifies a single point in time).
Outside the local time zone, that DateTime value can have multiple
interpretations. If the value’s Kind property is DateTimeKind.Unspecified, it
is even less portable: it is now ambiguous within the same time zone and
possibly even on the same system on which it was first serialized. Only if a
DateTime value represents UTC does that value unambiguously identify a single
point in time regardless of the system or time zone in which the value is
used.
Completely useless. So, we’ll use DateTimeOffset in our application code,
right?
Only the ecosystem hasn’t caught up.
Enter Npgsql, a Postgres driver for
.NET with a frightening amount of code. It only works with DateTime objects
when sending or receiving timestamps to or from Postgres.
Postgres has two column types: timestamp with time zone and timestamp
without time zone (or timestamptz and timestamp, respectively). The
former is about as good as a DateTime, but without trying to be more than it
can: it doesn’t have Kind, which improves its usability by an order of
magnitude. You can make a policy decision like “we’ll always store UTC
timestamps”, and you’ve solved timezones in your application. They mark a
specific point in time unambiguously.
Or you can just use timestamptz and they still unambiguously mark a specific
point in time. It’s magic!
So how does Npgsql deal with this?
The genesis of this post was because we were noting strange behaviour: we had
read a timestamptz out of the database, and then later SELECTed all rows
where that column was strictly less than the value we read out. And yet that
same row would be included in the result. It made no sense.
Turns out it really did make no sense.
The rest of this blog is a sequence of test cases which demonstrate just how
bad the situation is.