dateutil Is My New Go-To Time Zone Library


Paul Ganssle, current maintainer of dateutil, asked me to review my 2014 post “The Case for pytz over dateutil” because he believes that dateutil now addresses the concerns I had in my post that caused me to give a slight nod to pytz over dateutil. He’s right: I don’t see any particular reason to use pytz anymore for my applications.

Paul pointed me at his post from earlier this year, “pytz: The Fastest Footgun in the West”. It is a good post that summarizes how dateutil and pytz compare when it comes to time zones, so I recommend that you go read it.

In my original post (which I have hidden away, since it is inaccurate today) I said I would err on the side of using pytz because it handled times in the folded hour better. See below for more about that, but the TL;DR is that, with the implementation of PEP 495 in Python 3.6 and dateutil, there is now a standard way to handle times that fall right around a time change, and dateutil supports it.

As Paul’s post says, dateutil is slower than pytz in some cases, but pytz arguably makes it easier to make mistakes when dealing with time zones. For that reason, I think I’ll be using dateutil for my Python time zone needs in the future. No more pytz dependencies for me!

And because I feel it’s only responsible, let me remind everyone: Convert times to UTC at your inputs and only convert back to local at your outputs! Your code should always deal with UTC internally. If you try to do things like arithmetic on local times you are likely going to introduce errors.
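
To make that reminder concrete, here is a minimal sketch of why arithmetic on local times goes wrong around a time change. It uses only the standard library, and the two fixed-offset zones (EDT and EST) are an assumption that stands in for a real time zone library so the example stays self-contained. It assumes Python 3.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets stand in for a real New York tzinfo here.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# Two local timestamps from the night the clocks went back in 2002.
start_local = datetime(2002, 10, 27, 0, 30, tzinfo=EDT)
end_local = datetime(2002, 10, 27, 1, 30, tzinfo=EST)

# Input boundary: convert to UTC as soon as the values arrive.
start_utc = start_local.astimezone(timezone.utc)
end_utc = end_local.astimezone(timezone.utc)

# Arithmetic in UTC reflects the time that really elapsed.
elapsed = end_utc - start_utc

# Arithmetic on the bare wall-clock values silently loses the repeated hour.
wall_elapsed = end_local.replace(tzinfo=None) - start_local.replace(tzinfo=None)

# Output boundary: convert back to local only for display.
end_display = end_utc.astimezone(EST).strftime("%Y-%m-%d %H:%M:%S %Z (%z)")

print(elapsed, wall_elapsed, end_display)
```

The UTC difference is two hours, while the wall-clock difference is only one, because 1 AM happened twice that night.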

The Folded Hour

In my 2014 post, I noted that pytz handled the folded hour correctly whereas dateutil did not. The folded hour that I’m most concerned about is the one that happens in the United States, where most places in the country set their clocks backward one hour at the end of daylight saving time. This means that, for example, 1 AM “happens twice” in New York City: once when daylight saving time is in effect, and then a second time an hour later when daylight saving ends. The clocks in New York go from 01:59:59 EDT to 01:00:00 EST.

To quote from my 2014 post, here’s what I saw at the time from pytz (though here actually replicated using pytz 2018.5 under Python 2.7.15):

>>> from datetime import datetime, timedelta
>>> import pytz
>>> fmt = '%Y-%m-%d %H:%M:%S %Z (%z)'
>>> pytz_eastern = pytz.timezone("America/New_York")
>>> utc_1am_est = datetime(2002, 10, 27, 6, tzinfo=pytz.utc)
>>> utc_1am_est.astimezone(pytz_eastern).strftime(fmt)
'2002-10-27 01:00:00 EST (-0500)'
>>> utc_1am_edt = datetime(2002, 10, 27, 5, tzinfo=pytz.utc)
>>> utc_1am_edt.astimezone(pytz_eastern).strftime(fmt)
'2002-10-27 01:00:00 EDT (-0400)'

This is correct behavior. Now if we rewind to an old version of dateutil, like 2.2, which is the one I might have been using in my 2014 post, we see the problem I picked out in that old post:

>>> import dateutil
>>> dateutil.__version__
'2.2'
>>> from dateutil import tz
>>> datu_eastern = tz.gettz("America/New_York")
>>> utc_1am_est = datetime(2002, 10, 27, 6, tzinfo=tz.tzutc())
>>> utc_1am_est.astimezone(datu_eastern).strftime(fmt)
'2002-10-27 01:00:00 EST (-0500)'
>>> utc_1am_edt = datetime(2002, 10, 27, 5, tzinfo=tz.tzutc())
>>> utc_1am_edt.astimezone(datu_eastern).strftime(fmt)
'2002-10-27 01:00:00 EST (-0500)'

Yeah, that’s wrong. Now let’s try the latest dateutil (still in Python 2.7):

>>> import dateutil
>>> dateutil.__version__
'2.7.3'
>>> from dateutil import tz
>>> datu_eastern = tz.gettz("America/New_York")
>>> utc_1am_est = datetime(2002, 10, 27, 6, tzinfo=tz.tzutc())
>>> utc_1am_est.astimezone(datu_eastern).strftime(fmt)
'2002-10-27 01:00:00 EST (-0500)'
>>> utc_1am_edt = datetime(2002, 10, 27, 5, tzinfo=tz.tzutc())
>>> utc_1am_edt.astimezone(datu_eastern).strftime(fmt)
'2002-10-27 01:00:00 EDT (-0400)'

Woo, that’s correct! Now check this out:

>>> type(utc_1am_est.astimezone(datu_eastern))
<class 'dateutil.tz._common._DatetimeWithFold'>

I believe this is dateutil backporting PEP 495’s “fold” concept into Python 2.7. PEP 495, and dateutil’s implementation thereof, is what basically fixed this problem for dateutil, as far as I can tell. If we try the above with the same version of dateutil, but under Python 3.6, the Python version where PEP 495 was first implemented, there’s nothing special:

>>> type(utc_1am_est.astimezone(datu_eastern))
<class 'datetime.datetime'>
>>> type(utc_1am_edt.astimezone(datu_eastern))
<class 'datetime.datetime'>

The difference is in the new fold attribute:

>>> utc_1am_est.astimezone(datu_eastern).fold
1
>>> utc_1am_edt.astimezone(datu_eastern).fold
0

The fold attribute is 0 when the value is the earlier “of the two moments with the same wall time representation”, whereas it is 1 when the value is the later of those two moments.
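
Here is a small standard-library sketch of how the fold attribute behaves on its own, without any time zone attached. It assumes Python 3.6 or later, and it reflects my reading of PEP 495 rather than anything specific to dateutil.

```python
from datetime import datetime

# fold defaults to 0: the earlier of the two moments with this wall time.
first_1am = datetime(2002, 10, 27, 1)

# fold=1 marks the later of the two moments with the same wall time.
second_1am = datetime(2002, 10, 27, 1, fold=1)

folds = (first_1am.fold, second_1am.fold)

# Between naive datetimes (or datetimes sharing one tzinfo), comparisons
# ignore fold, so these two compare equal even though fold differs.
same_wall_time = first_1am == second_1am

# replace() is the standard way to move a value to the other side of the fold.
flipped = first_1am.replace(fold=1)

print(folds, same_wall_time, flipped.fold)
```

In other words, fold only carries meaning once a tzinfo uses it to pick an offset, which is what dateutil does in the sessions above.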

Producing Times in the Folded Hour

Also in my 2014 post, I said that dateutil made it very hard, if not impossible, to construct a datetime instance in the folded hour, whereas pytz provided a way to do it. Here’s how pytz handles producing times in the folded hour:

>>> naive_1am = datetime(2002, 10, 27, 1)
>>> pytz_eastern.localize(naive_1am, is_dst=False).strftime(fmt)
'2002-10-27 01:00:00 EST (-0500)'
>>> pytz_eastern.localize(naive_1am, is_dst=True).strftime(fmt)
'2002-10-27 01:00:00 EDT (-0400)'
>>> pytz_eastern.localize(naive_1am, is_dst=None).strftime(fmt)
Traceback (most recent call last):
  File "<ipython-input-12-6e97a68309e9>", line 1, in <module>
    pytz_eastern.localize(naive_1am, is_dst=None).strftime(fmt)
  File ".../lib/python2.7/site-packages/pytz/tzinfo.py", line 349, in localize
    raise AmbiguousTimeError(dt)
AmbiguousTimeError: 2002-10-27 01:00:00

The catch here is that pytz does require you to use the is_dst argument of its non-standard localize function. Thanks to PEP 495, we now have a standard way to construct instances on either side of the fold when using dateutil:

>>> naive_1am.replace(tzinfo=datu_eastern).strftime(fmt)
'2002-10-27 01:00:00 EDT (-0400)'
>>> naive_1am.replace(tzinfo=datu_eastern, fold=1).strftime(fmt)
'2002-10-27 01:00:00 EST (-0500)'

BTW, just for fun, the above with pytz’s tzinfos:

>>> naive_1am.replace(tzinfo=pytz_eastern).strftime(fmt)
'2002-10-27 01:00:00 LMT (-0456)'
>>> naive_1am.replace(tzinfo=pytz_eastern, fold=1).strftime(fmt)
'2002-10-27 01:00:00 LMT (-0456)'

The two times are just wrong because I’m not using pytz the way it is supposed to be used, with its own (non-standard!) methods and such. This is not surprising, though it is a bit amusing. After looking through its sources, I think pytz does not do anything with the fold attribute in any case.
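
Coming back to dateutil: if you miss the strictness of pytz’s is_dst=None, something similar can be sketched with dateutil’s own helpers. The function name localize_strict below is hypothetical (it is not part of dateutil), while tz.datetime_exists, tz.datetime_ambiguous and tz.enfold are dateutil functions that I believe require dateutil 2.7.0 or later. Treat this as a sketch under those assumptions, not a recommended API.

```python
from datetime import datetime, timedelta
from dateutil import tz


def localize_strict(naive, zone, fold=None):
    """Hypothetical helper: attach `zone` to a naive datetime, refusing to guess.

    Raises ValueError for wall times that never happened (the spring-forward
    gap) and for ambiguous wall times when no explicit `fold` is given.
    """
    if not tz.datetime_exists(naive, zone):
        raise ValueError("non-existent local time: %s" % naive)
    if tz.datetime_ambiguous(naive, zone):
        if fold is None:
            raise ValueError("ambiguous local time: %s" % naive)
        # enfold() sets fold in a way that also works on Python 2.7.
        return tz.enfold(naive.replace(tzinfo=zone), fold=fold)
    return naive.replace(tzinfo=zone)


eastern = tz.gettz("America/New_York")
naive_1am = datetime(2002, 10, 27, 1)

first = localize_strict(naive_1am, eastern, fold=0)   # first 1 AM (daylight time)
second = localize_strict(naive_1am, eastern, fold=1)  # second 1 AM (standard time)
print(first.utcoffset(), second.utcoffset())
```

Calling localize_strict(naive_1am, eastern) without a fold raises ValueError, much as pytz raises AmbiguousTimeError for is_dst=None, while unambiguous times pass straight through.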

Premature Optimizations Revisited

Finally, my 2014 post jokingly had a section “Premature Optimizations” where I made observations that pytz is a larger library than dateutil primarily because dateutil keeps its tz database compressed whereas pytz doesn’t. I didn’t check that again today.

However, in the original post I also pointed out that pytz instances are smaller when pickled. This is still the case:

>>> import pickle
>>> now = datetime.utcnow()
>>> len(pickle.dumps(now, -1))
53
>>> len(pickle.dumps(now.replace(tzinfo=pytz.UTC), -1))
72
>>> len(pickle.dumps(now.replace(tzinfo=tz.tzutc()), -1))
83

It’s especially true with non-UTC instances:

>>> len(pickle.dumps(pytz_eastern.localize(now), -1))
105
>>> len(pickle.dumps(now.replace(tzinfo=datu_eastern), -1))
3456

I am guessing that this might be because pytz picks the fixed-offset tzinfo instance it should attach just once, when you call its localize method. That would make operations fast after construction, possibly (I believe) at the cost of surprising results in later computations unless you also use pytz’s normalize method. (I learned that by reading Paul’s excellent article!) In contrast, I gather that dateutil’s tzinfo instances carry around all the possible information for the time zone, meaning that you can use the standard Python datetime methods without resorting to any special methods as in pytz.

Update: Paul emailed me shortly after I published this to point out that pytz tzinfo instances are smaller because they more-or-less just store the time zone name, whereas dateutil’s tzinfo objects do indeed carry all of the necessary time zone information, even when pickled. I did some further investigation of pickling pytz and dateutil tzinfo instances in a sort of follow-up post.