When it comes to dealing with dates and times in Python, my first choice is the standard library’s
datetime module. However, while
datetime supports time zones in general, it has no knowledge of any specific time zones. It lacks a time zone database, information about all the world’s time zones past and present. PEP 431 would add such a database to the standard library, but it seems to be stalled.
I am aware of two libraries you can use when you need time zone information in Python, pytz and dateutil. pytz seems to be the most popular time zone library by far. However, dateutil has the same time zone information and a lot of other useful date and time functionality besides. Is there any reason to use pytz when dateutil seems to be a superset of it? As it turns out, yes, there is at least one good reason to prefer pytz.
Before I divulge that reason, I should tell you that I am a firm believer that the software you write should internally use only UTC. Convert all times to UTC on input, and convert to local times only when producing output for the user. To do otherwise is to invite errors.
I mention this because pytz and dateutil have some differences that are only relevant if you, for example, perform arithmetic on non-UTC times. I’m not addressing those differences here because I will (hopefully) never need to care about them. I avoid them by simply using UTC whenever possible.
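To make the “UTC internally” rule concrete, here is a minimal sketch that uses only the standard library. The helper names, the input string and its format, and the fixed -4 hour offset standing in for Eastern Daylight Time are all assumptions for illustration; a real application would take the user's zone from a time zone library.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed-offset stand-in for Eastern Daylight Time.
EDT = timezone(timedelta(hours=-4), "EDT")


def parse_to_utc(text):
    """Parse a timestamp that carries its own UTC offset and convert it to UTC."""
    aware = datetime.strptime(text, "%Y-%m-%d %H:%M:%S %z")
    return aware.astimezone(timezone.utc)


def format_for_user(utc_dt, user_tz):
    """Convert to local time only at the output boundary."""
    return utc_dt.astimezone(user_tz).strftime("%Y-%m-%d %H:%M:%S %Z (%z)")


stored = parse_to_utc("2002-04-07 03:00:00 -0400")  # convert on input
later = stored + timedelta(hours=12)                # arithmetic happens in UTC
shown = format_for_user(later, EDT)                 # convert on output
print(shown)  # 2002-04-07 15:00:00 EDT (-0400)
```

Everything between the two helper calls only ever sees UTC values, which is the whole point.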
Lest you think this advice regarding UTC is unsubstantiated, here’s an example: The United States’ Eastern time zone entered daylight saving time[1] on April 7, 2002 at 7:00 a.m. UTC. Local clocks went from 1:59:59 a.m. EST straight to 3:00:00 a.m. EDT. In this example I’ll do some arithmetic on a local time.
    >>> from datetime import datetime, timedelta
    >>> import pytz
    >>> fmt = '%Y-%m-%d %H:%M:%S %Z (%z)'
    >>> pytz_eastern = pytz.timezone("America/New_York")
    >>> utc_ny_dst_start = datetime(2002, 4, 7, 7, tzinfo=pytz.utc)
    >>> local_ny_dst_start = utc_ny_dst_start.astimezone(pytz_eastern)
    >>> local_ny_dst_start.strftime(fmt)
    '2002-04-07 03:00:00 EDT (-0400)'
    >>> (local_ny_dst_start - timedelta(minutes=1)).strftime(fmt)
    '2002-04-07 02:59:00 EDT (-0400)'
Oops! 2:59 a.m. never happened in the Eastern time zone on April 7, 2002. I used pytz here, but if I had used time zones from dateutil instead I would get exactly the same results. (pytz’s normalize method can actually fix this problem, but just use UTC whenever possible, OK?)
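One way to see what went wrong is to compare local arithmetic with UTC arithmetic side by side. The sketch below uses only the standard library, with two fixed-offset zones that I picked by hand as stand-ins for EST and EDT (an assumption; a time zone library would choose the offset for you). The mislabeled result still points at the right instant; only its wall-clock label is one that never existed.

```python
from datetime import datetime, timedelta, timezone

fmt = '%Y-%m-%d %H:%M:%S %Z (%z)'

# Assumption: fixed offsets standing in for the two halves of Eastern time.
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")

utc_dst_start = datetime(2002, 4, 7, 7, tzinfo=timezone.utc)

# Local arithmetic keeps the EDT offset and yields a label that never existed.
mislabeled = utc_dst_start.astimezone(EDT) - timedelta(minutes=1)

# UTC arithmetic first, conversion second: the offset matches the final instant.
relabeled = (utc_dst_start - timedelta(minutes=1)).astimezone(EST)

print(mislabeled.strftime(fmt))  # 2002-04-07 02:59:00 EDT (-0400)
print(relabeled.strftime(fmt))   # 2002-04-07 01:59:00 EST (-0500)
print(mislabeled == relabeled)   # True: same instant, different labels
```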
I was most recently reminded that I should use UTC internally by Taavi Burns’ excellent PyCon 2012 presentation, What You Need to Know about
datetimes, which itself quotes this advice from Armin Ronacher’s “Dealing with Timezones in Python.”
My prior example demonstrated a time that “never occurred,” 2:59 a.m. on April 7, 2002 in the US Eastern time zone. When daylight saving time ends, though, you have a different problem: times that “happen twice.” Later on in 2002, daylight saving time ended in the Eastern time zone on October 27 at 6:00 a.m. UTC. The local clocks went from 1:59:59 a.m. EDT to 1:00:00 a.m. EST, and all times between 1:00:00 a.m. and 1:59:59 a.m. “happened twice”: once in Eastern Daylight Time and then again in Eastern Standard Time. Therefore a time such as “1:30 a.m. on October 27, 2002” is ambiguous. Did I mean 1:30 a.m. EDT or 1:30 a.m. EST?
datetime’s API has problems with these ambiguous times. To demonstrate this problem, let’s say you read October 27, 2002 6:00 a.m. UTC from your database, and now you want to display this date to the user in his or her local time zone, which is Eastern time.
    >>> from dateutil import tz
    >>> datu_eastern = tz.gettz("America/New_York")
    >>> utc_1am_est = datetime(2002, 10, 27, 6, tzinfo=tz.tzutc())
    >>> utc_1am_est.astimezone(datu_eastern).strftime(fmt)
    '2002-10-27 01:00:00 EST (-0500)'
Great, daylight saving time ended at 6:00 a.m. UTC and the output is 1:00 a.m. EST as expected. Now, what if you did the same thing with the hour before that, 5:00 a.m. UTC? That should still be in EDT.
    >>> utc_1am_edt = datetime(2002, 10, 27, 5, tzinfo=tz.tzutc())
    >>> utc_1am_edt.astimezone(datu_eastern).strftime(fmt)
    '2002-10-27 01:00:00 EST (-0500)'
Oops! 5:00 a.m. UTC on October 27, 2002 was 1:00 a.m. EDT, not EST. Both the time zone abbreviation and the time zone offset are wrong.
This problem can be traced back to
datetime’s API, which documents this problem:
[…] the tzinfo.dst() method must consider times in the “repeated hour” to be in standard time. […] Applications that can’t bear such ambiguities should avoid using hybrid tzinfo subclasses; there are no ambiguities when using UTC, or any other fixed-offset tzinfo subclass (such as a class representing only EST (fixed offset -5 hours), or only EDT (fixed offset -4 hours)).
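For reference, the kind of fixed-offset tzinfo subclass that passage describes can be sketched in a few lines. The class name FixedOffset and its constructor arguments are my own choices for illustration, not anything the quoted documentation prescribes.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class FixedOffset(tzinfo):
    """A tzinfo with one constant UTC offset, so no wall-clock time is ambiguous."""

    def __init__(self, hours, name):
        self._offset = timedelta(hours=hours)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # A fixed-offset zone never switches, so it reports no DST adjustment.
        return timedelta(0)

    def tzname(self, dt):
        return self._name


EST = FixedOffset(-5, "EST")
EDT = FixedOffset(-4, "EDT")
fmt = '%Y-%m-%d %H:%M:%S %Z (%z)'

# The two 1:00 a.m. instants on October 27, 2002 stay distinguishable
# because each carries its own offset.
first_1am = datetime(2002, 10, 27, 5, tzinfo=timezone.utc).astimezone(EDT)
second_1am = datetime(2002, 10, 27, 6, tzinfo=timezone.utc).astimezone(EST)
print(first_1am.strftime(fmt))   # 2002-10-27 01:00:00 EDT (-0400)
print(second_1am.strftime(fmt))  # 2002-10-27 01:00:00 EST (-0500)
```

The catch is that the caller has to know which of the two instances applies to a given instant, which is exactly the knowledge a time zone database provides.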
pytz, however, gets this right:
    >>> utc_1am_est.astimezone(pytz_eastern).strftime(fmt)
    '2002-10-27 01:00:00 EST (-0500)'
    >>> utc_1am_edt.astimezone(pytz_eastern).strftime(fmt)
    '2002-10-27 01:00:00 EDT (-0400)'
This difference in correctness is why I think
pytz should be preferred to
dateutil when you need to work with time zones. It may seem unlikely that your application will ever hit the problem I’ve demonstrated here, but in many cases I bet you can imagine how it is possible, and that’s enough for me. I prefer to err on the side of correctness.
In case you’re wondering how
pytz works here while
dateutil does not, it seems that
pytz actually uses two separate
tzinfo instances for the same time zone, one for standard time and another for daylight saving time:
    >>> utc_12am_edt = datetime(2002, 10, 27, 4, tzinfo=pytz.utc)
    >>> one_hour = timedelta(hours=1)
    >>> local_12am_edt = utc_12am_edt.astimezone(pytz_eastern)
    >>> local_1am_edt = (utc_12am_edt + one_hour).astimezone(pytz_eastern)
    >>> local_1am_est = (utc_12am_edt + one_hour*2).astimezone(pytz_eastern)
    >>> local_2am_est = (utc_12am_edt + one_hour*3).astimezone(pytz_eastern)
    >>> local_12am_edt.tzinfo is local_1am_edt.tzinfo
    True
    >>> local_1am_edt.tzinfo is local_1am_est.tzinfo
    False
    >>> local_1am_est.tzinfo is local_2am_est.tzinfo
    True
Even though I passed in the same
pytz_eastern to every
astimezone call, the
tzinfo attached to each
datetime instance is different depending on whether or not the
datetime should be in standard time or daylight saving time.[2]
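To illustrate the general idea (and only the idea; this is not pytz's actual implementation), here is a rough standard-library sketch in which a conversion function picks one of two fixed-offset tzinfo instances from a transition table. The table covers 2002 only, and the names TRANSITIONS_2002 and to_eastern_2002 are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")

# Hypothetical transition table for 2002 only:
# (UTC instant of the switch, tzinfo in effect from then on), in ascending order.
TRANSITIONS_2002 = [
    (datetime(2002, 4, 7, 7, tzinfo=timezone.utc), EDT),
    (datetime(2002, 10, 27, 6, tzinfo=timezone.utc), EST),
]


def to_eastern_2002(utc_dt):
    """Attach whichever fixed-offset tzinfo is in effect at this UTC instant."""
    tz = EST  # standard time is in effect before the first transition
    for instant, new_tz in TRANSITIONS_2002:
        if utc_dt >= instant:
            tz = new_tz
    return utc_dt.astimezone(tz)


utc_12am_edt = datetime(2002, 10, 27, 4, tzinfo=timezone.utc)
one_hour = timedelta(hours=1)
local_12am_edt = to_eastern_2002(utc_12am_edt)
local_1am_edt = to_eastern_2002(utc_12am_edt + one_hour)
local_1am_est = to_eastern_2002(utc_12am_edt + one_hour * 2)

print(local_12am_edt.tzinfo is local_1am_edt.tzinfo)  # True
print(local_1am_edt.tzinfo is local_1am_est.tzinfo)   # False
```

Because each result carries a fixed offset, the repeated hour is never ambiguous once the conversion has happened.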
pytz’s documentation actually says that you should call its time zone instances’
normalize methods on the result of
astimezone when converting to a non-UTC time zone, but I have my doubts whether this is necessary. I have not yet found any circumstances where
normalize was necessary when merely converting from one time zone to another via astimezone.
datetimes in a Repeated Hour
If you ever need to construct a local time directly, perhaps as a result of parsing a string, it’s nigh impossible to get the DST version of the hour where daylight saving time ends with dateutil:
    >>> datetime(2002, 10, 27, 1, tzinfo=datu_eastern).strftime(fmt)
    '2002-10-27 01:00:00 EST (-0500)'
That’s it. EST is your only option. There’s no way to tell the
datetime library which 1 a.m. you meant, EDT or EST. As stated in the passage quoted from the
datetime documentation, above, you will always get standard time.
pytz, on the other hand, gives you a way out of this problem using the
localize methods of its time zone instances:
    >>> naive_1am = datetime(2002, 10, 27, 1)
    >>> pytz_eastern.localize(naive_1am, is_dst=False).strftime(fmt)
    '2002-10-27 01:00:00 EST (-0500)'
    >>> pytz_eastern.localize(naive_1am, is_dst=True).strftime(fmt)
    '2002-10-27 01:00:00 EDT (-0400)'
    >>> pytz_eastern.localize(naive_1am, is_dst=None).strftime(fmt)
    Traceback (most recent call last):
      File "<ipython-input-12-6e97a68309e9>", line 1, in <module>
        pytz_eastern.localize(naive_1am, is_dst=None).strftime(fmt)
      File ".../lib/python2.7/site-packages/pytz/tzinfo.py", line 349, in localize
        raise AmbiguousTimeError(dt)
    AmbiguousTimeError: 2002-10-27 01:00:00
In that last case,
is_dst=None means, “I don’t know if this is supposed to be daylight saving time or not, so raise an error if it’s ambiguous.”
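If you are curious how a three-way is_dst flag like this could be resolved, here is a rough sketch using only the standard library. It is not pytz's code: the function localize_2002, the AmbiguousLocalTime exception, and the hard-coded 2002 transition instants are all hypothetical. The idea is to try both offsets and keep the ones that are consistent with the DST period.

```python
from datetime import datetime, timedelta, timezone

EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")

# Assumption: only the 2002 transitions for US Eastern time are modeled.
DST_START_UTC = datetime(2002, 4, 7, 7, tzinfo=timezone.utc)
DST_END_UTC = datetime(2002, 10, 27, 6, tzinfo=timezone.utc)


class AmbiguousLocalTime(ValueError):
    """Raised when a wall-clock time happened twice and no hint was given."""


def localize_2002(naive, is_dst=None):
    """Attach EST or EDT to a naive 2002 wall-clock time."""
    candidates = []
    for tz, in_dst in ((EDT, True), (EST, False)):
        aware = naive.replace(tzinfo=tz)
        # Keep this reading only if the instant really falls in that period.
        if (DST_START_UTC <= aware < DST_END_UTC) == in_dst:
            candidates.append((aware, in_dst))
    if not candidates:
        raise ValueError("wall-clock time never occurred: %s" % naive)
    if len(candidates) == 1:
        return candidates[0][0]
    if is_dst is None:
        raise AmbiguousLocalTime(str(naive))
    return next(aware for aware, in_dst in candidates if in_dst == bool(is_dst))


fmt = '%Y-%m-%d %H:%M:%S %Z (%z)'
naive_1am = datetime(2002, 10, 27, 1)
print(localize_2002(naive_1am, is_dst=True).strftime(fmt))   # 2002-10-27 01:00:00 EDT (-0400)
print(localize_2002(naive_1am, is_dst=False).strftime(fmt))  # 2002-10-27 01:00:00 EST (-0500)
```

Times outside the repeated hour have a single consistent reading, so the is_dst hint is ignored for them in this sketch.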
Based on the preceding argument, I feel pretty strongly that I should use
pytz instead of
dateutil for my time zone needs. That said, I have found an argument in
dateutil’s favor: support for Windows’ built-in time zone data.
pytz and dateutil use the well-known “tz database”. Both Python libraries include a copy of this database, but on *nix systems both libraries will prefer to use your system’s database if available. I think the assumption is that your OS’s time zone information is more likely to be up-to-date.
Windows doesn’t use the tz database, though. It has its own time zone database stored in the Windows Registry.
dateutil will use Windows’ database before falling back to its included tz database. In contrast,
pytz doesn’t know how to use the Windows time zone database.
pytz can only use the tz database.
This is probably an argument in
dateutil’s favor, but I find it to be a particularly weak argument compared with the fact that common time zone usage with
dateutil may produce an incorrect result. For the vast majority of applications, no matter the platform, I suspect an up-to-date
pytz is a superior choice.
For fun, here are a few other points of comparison between these libraries.
While we’re on the topic of included time zone databases, I’ll mention that, as of this writing,
pytz’s included tz database seems to be 2.3 MiB. In contrast,
dateutil keeps a compressed tarball of the tz database, weighing in at just 208 KiB. That could be a meaningful difference if you’re tight on storage space (e.g. an embedded system). Perhaps the
pytz author would accept a patch!
On the other hand, Armin Ronacher’s blog entry mentions that
datetime instances with
tzinfo, “often cause much larger pickles.” For your consideration:
    >>> import pickle
    >>> now = datetime.utcnow()
    >>> len(pickle.dumps(now, 2))
    44
    >>> len(pickle.dumps(now.replace(tzinfo=pytz.utc), 2))
    61
    >>> len(pickle.dumps(now.replace(tzinfo=tz.tzutc()), 2))
    73
pytz’s UTC is a little smaller than
dateutil’s. That could make a difference if, for example, you wanted to store aware (as opposed to naïve)
datetime instances. (But really, how hard is it to call
.replace(tzinfo=pytz.utc) when reading in time stamps?)
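Here is a small sketch of that store-naive, reattach-on-read approach. I use the standard library's timezone.utc as a stand-in for pytz.utc (an assumption made so the snippet has no third-party dependencies), and I leave out the byte counts because they vary by Python version.

```python
import pickle
from datetime import datetime, timezone

aware = datetime(2002, 10, 27, 6, tzinfo=timezone.utc)

# Store the naive UTC value; the tzinfo is implied by convention.
naive = aware.replace(tzinfo=None)
naive_size = len(pickle.dumps(naive, 2))
aware_size = len(pickle.dumps(aware, 2))
print(naive_size < aware_size)  # True: the tzinfo adds bytes to every pickle

# Reattach UTC when reading the time stamp back in.
restored = pickle.loads(pickle.dumps(naive, 2)).replace(tzinfo=timezone.utc)
print(restored == aware)  # True
```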
Speaking of UTC, it seems like
dateutil makes a new
tzinfo object for UTC every time you ask for it:
    >>> tz.tzutc() is tz.tzutc()
    False
Contrast this with pytz, which just has its single utc member rather than creating a new object every time you want to use it.
[1] Not “daylight savings time”! I am disappointed that she doesn’t think there is a clear standard for capitalizing time zone names. I usually prefer Chicago, but their time zone capitalization rules are too complicated for my tastes, so I’m choosing the AP Style Guide’s rules for time zones.
[2] Incidentally, neither of those tzinfo attributes is the same object as pytz_eastern, but each tzinfo attribute, as well as pytz_eastern, is an instance of datetime.tzinfo according to isinstance.