Dealing with Timezones in Python (pocoo.org)
158 points by samrat on July 15, 2011 | hide | past | favorite | 25 comments



a few more gotchas:

* when storing a UTC timestamp and you need to get a datetime object back in order to add dates or do some calculations, make sure you use datetime.datetime.utcfromtimestamp(), e.g.

    >>> datetime.datetime.utcfromtimestamp(1310727758.472)
    datetime.datetime(2011, 7, 15, 11, 2, 38, 472000)
* Python timestamps are in seconds (time.time() returns a float). In JavaScript a timestamp is the number of milliseconds since the epoch, so you need to divide or multiply by 1000 depending on which way you are going, e.g.

    > new Date().valueOf()
    1310728084389

    >>> datetime.datetime.utcfromtimestamp(1310728084389)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: year is out of range
    >>> datetime.datetime.utcfromtimestamp(1310728084389 // 1000)
    datetime.datetime(2011, 7, 15, 11, 8, 4)
* always set your server to UTC, always use the utc* functions (e.g. utctimetuple, utcfromtimestamp, utcnow etc.), store as a long, and when passing back to the client send the current UTC timestamp as well rather than relying on the client system clock, which is frequently wrong. You can also use:

http://json-time.appspot.com/time.json?tz=UTC

or implement something similar. I resorted to always outputting the current UTC time into a javascript variable and using that to calibrate the local computer time (again, never trusting client time)

* never use time.time() to get a timestamp back because it will munge the value and re-add tz info

* in Javascript, to go from your server-provided UTC timestamp to a local date object that you can then work with, use:

    > new Date(1310728084 * 1000).toUTCString()
    "Fri, 15 Jul 2011 11:08:04 GMT"
* If you are providing an API or a feed, please include a UTC timestamp in the output. Parsing all the different date formats and timestamp info and attempting to work out when something happened (such as when a post was published) is a real pain in the ass. Store the timezone in a separate field called 'utc_offset' or similar.
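
A minimal sketch of the seconds/milliseconds conversion from the list above, assuming Python 3 and timezone-aware datetimes. The helper names from_js_millis and to_js_millis are made up for illustration. It uses integer timedelta arithmetic so no precision is lost on the way in or out.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch as an aware UTC datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def from_js_millis(ms):
    """Convert a JavaScript-style millisecond timestamp to an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)

def to_js_millis(dt):
    """Convert an aware datetime back to whole milliseconds since the epoch."""
    return (dt - EPOCH) // timedelta(milliseconds=1)

dt = from_js_millis(1310728084389)  # the value from new Date().valueOf() above
print(dt.isoformat())               # 2011-07-15T11:08:04.389000+00:00
print(to_js_millis(dt))             # 1310728084389
```

Because the result carries tzinfo=UTC, it cannot be silently reinterpreted as local time later on.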


> * never use time.time() to get a timestamp back because it will munge the value and re-add tz info

Not sure I understand this one. I thought time.time() returned the seconds since Epoch?


it does, but it assumes that the timestamp you are giving it is from your current timezone. compare:

    >>> time.mktime(datetime.datetime.now().utctimetuple())
    1310737652.0
    >>> int(datetime.datetime.utcnow().strftime('%s'))
    1310701659
my local timezone is +10, and mktime is adding it back in, hence why I always use strftime
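
For comparison, a sketch (assuming Python 3) of turning a UTC datetime into an epoch value without going through local time. calendar.timegm() reads a time tuple as UTC, whereas time.mktime() reads it as local time. As far as I know '%s' is not a documented Python strftime directive; it is a platform extension that also goes through the local timezone, so it is left out here.

```python
import calendar
import time
from datetime import datetime, timezone

# A fixed UTC instant taken from the thread: 2011-07-15 11:08:04 UTC.
utc_dt = datetime(2011, 7, 15, 11, 8, 4, tzinfo=timezone.utc)

# calendar.timegm() interprets the tuple as UTC, so the result does not
# depend on the machine's local timezone.
epoch_utc = calendar.timegm(utc_dt.utctimetuple())
print(epoch_utc)  # 1310728084

# An aware datetime can also report its own epoch value directly.
print(int(utc_dt.timestamp()))  # 1310728084

# time.mktime() interprets the same tuple as local time, so the difference
# reflects the machine's UTC offset (0 on a server set to UTC).
epoch_via_mktime = int(time.mktime(utc_dt.utctimetuple()))
print(epoch_via_mktime - epoch_utc)
```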


So I've been coding for about a year and a half now and had to learn all this the hard way. Which isn't a bad thing at all.

It's little stuff like this that really makes it true that you should try to build an app or two like a blog-n-whatnot so that one's ignorance of these issues can be exposed.


To be fair, even a lot of experienced programmers are tripped up by these issues.


Another good time read is "Time, What Every Programmer Should Know" (http://unix4lyfe.org/time/) -- in short, it's usually best to store time/dates as Unix time (it's UTC and a single number).


Storing time as Unix timestamps is often not possible. Often you need to be able to answer questions like "did this moment fall on a weekday?", "what day was it 3 hours after that moment" and so on.

My rule is to either work with integer Unix timestamps or timezone-aware datetime objects. A naive datetime object is, well, naive. If it's the number of seconds since the Epoch, represent it as a number, not an object.


> My rule is to either work with integer Unix timestamps or timezone-aware datetime objects. A naive datetime object is, well, naive. If it's the number of seconds since the Epoch, represent it as a number, not an object.

The object is basically just a wrapper around that number anyways. From the programmer's perspective there is not really a difference.


>Storing time as Unix timestamps is often not possible.

If only there were mechanisms that could use a stored set of predefined rules enabling those other values to be derived when given a unix timestamp.


You can convert unix time to whatever format you want.


But you frequently need to store the original timezone, to be able to do date calculations after storing it.

For instance, a user added an appointment and her timezone is America/Los_Angeles. Then she moved to China and changed her timezone in your app to Asia/Shanghai. Then you run analytics to see how many users had had appointments on weekdays.

When storing date information in persistent stores (like databases), it's always advisable to store the time and the timezone.
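
A rough sketch of the appointment example, assuming Python 3.9+ with the zoneinfo module and system timezone data available. The function name is_weekday_in and the timestamp are made up for illustration. The same Unix timestamp falls on a different day of the week depending on which zone you ask about, which is why the original zone has to be stored next to it.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

def is_weekday_in(unix_ts, tz_name):
    """Return True if the instant falls on Monday-Friday in the given zone."""
    local = datetime.fromtimestamp(unix_ts, tz=ZoneInfo(tz_name))
    return local.weekday() < 5  # Monday is 0, Sunday is 6

# 2011-07-16 01:00:00 UTC, stored as a plain Unix timestamp.
appointment_ts = 1310778000

# Zone the user was in when she created the appointment: Friday 18:00.
print(is_weekday_in(appointment_ts, "America/Los_Angeles"))  # True

# Zone she switched to later: Saturday 09:00.
print(is_weekday_in(appointment_ts, "Asia/Shanghai"))        # False
```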


Even in that situation it's a terrible idea to store local time alone. In the calendar situation (which is the exception, not the rule!) I personally would store the time in UTC, the local time, and the name of the country, each separately. I would not store the local time alone because what do you show in the calendar if you are in a different country?

If the assumption is that the problem is a theoretical one because nobody has appointments in the middle of the night when DST changes come, I present you with sysadmins, rescue workers and other people that do have these things scheduled at odd times.

I know that around DST changes many people explicitly do not work overnight to avoid these issues. Same with year changes and other situations where computing and time do not work properly. For instance, overnight trains in Europe just wait for an hour and do nothing.


You store the local time and timezone. You show the value in whatever timezone the user configured the application.

If you store only Unix time, you don't have enough information. If you store both the Unix timestamp, the local time and the timezone, you have too much information. You can derive Unix time from local time and the timezone.

Ergo: store time with time zone.


> You store the local time and timezone. You show the value in whatever timezone the user configured the application.

Because local time is ambiguous. During DST changes the same local time can occur twice. Your calendar would have to show 4 AM twice, for instance, and if you want to make an appointment there you would have to select which one you actually mean, thus UTC + local time.

But granted, it does not happen often that people schedule appointments during DST switches, but certain professions have to so certain people already have to live with that.

> If you store both the Unix timestamp, the local time and the timezone, you have too much information.

Nope. That is incorrect. If I store local time only + timezone and someone sets a date during that transition window it's ambiguous. If I store the date in UTC only and remember the timezone that date was intended for I will not be able to account for changes in the timezone. For instance the legislation of the country could deactivate or change DST settings (which is not uncommon). In that case the timezone information of my operating system will change again and my appointment will move without my consent.

If I store both I can see if they are still in sync and if not, prefer the local time and rebase as necessary.
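
To make the two points above concrete, here is a sketch assuming Python 3.9+ with zoneinfo. It uses the 2011 US transition (01:30 on November 6 happened twice in Los Angeles) as the ambiguous time, and the helper still_in_sync is a made-up name for the "are UTC and local time still in agreement" check.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

tz = ZoneInfo("America/Los_Angeles")

# The same wall-clock time maps to two different instants; Python's
# fold attribute selects the first (0) or second (1) occurrence.
first = datetime(2011, 11, 6, 1, 30, tzinfo=tz)
second = datetime(2011, 11, 6, 1, 30, fold=1, tzinfo=tz)
print(first.astimezone(timezone.utc))   # 2011-11-06 08:30:00+00:00
print(second.astimezone(timezone.utc))  # 2011-11-06 09:30:00+00:00

def still_in_sync(stored_utc, stored_local, tz_name):
    """Check whether today's timezone rules still map the stored UTC
    instant to the stored (naive) local wall-clock time."""
    current_local = stored_utc.astimezone(ZoneInfo(tz_name)).replace(tzinfo=None)
    return current_local == stored_local

stored_utc = datetime(2011, 11, 6, 9, 30, tzinfo=timezone.utc)
stored_local = datetime(2011, 11, 6, 1, 30)
print(still_in_sync(stored_utc, stored_local, "America/Los_Angeles"))  # True
```

If the check ever returns False after a timezone database update, the stored local time can be treated as the source of truth and the UTC value recomputed from it.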


Oh, I see what you're referring to.

Yes, storing the local time and the timezone is not quite enough. All that time I was thinking about the PostgreSQL timestamp with time zone type, which is preferable to storing the Unix timestamp.

I guess we've been talking past each other, with me being the main culprit :)


Why not store the Unix timestamp and the timezone that was set when created? Then you can just as easily determine what the original local time was and also convert to whatever the current local time is.

And you save yourself the misery of endless conversion when trying to compare anything internally.


Local time + timezone is not enough, because timezones change. For time in the past, including measurements of the current time, store something equivalent to UTC, e.g. time_t or ISO 8601. For times in the future, store local time and location, and do the location -> timezone lookup at run time. Add an earlier/later disambiguation bit to deal with DST transitions and other ambiguities. Sadly no-one does it this way, which is why you have problems like needing to fix all the copies of the TZ information in your Exchange server when TZ rules change. Because of this, the location to TZ database is an exercise for the reader...
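
One way that scheme for future times might look, as a sketch only, assuming Python 3.9+ with zoneinfo. FutureEvent and LOCATION_TO_ZONE are hypothetical names, and the one-entry dict stands in for the real location-to-timezone database that is still left as an exercise.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

# Hypothetical stand-in for a real location -> timezone database.
LOCATION_TO_ZONE = {"Los Angeles, US": "America/Los_Angeles"}

@dataclass(frozen=True)
class FutureEvent:
    local_time: datetime  # naive wall-clock time, as the user entered it
    location: str         # where the event takes place
    later: bool = False   # disambiguation bit for repeated local times

    def resolve_utc(self):
        """Look up the zone at run time and convert to a UTC instant,
        so that updated timezone rules are picked up automatically."""
        zone = ZoneInfo(LOCATION_TO_ZONE[self.location])
        aware = self.local_time.replace(tzinfo=zone, fold=1 if self.later else 0)
        return aware.astimezone(timezone.utc)

event = FutureEvent(datetime(2011, 11, 6, 1, 30), "Los Angeles, US", later=True)
print(event.resolve_utc())  # 2011-11-06 09:30:00+00:00
```

Nothing derived from the timezone rules is stored, so a rule change only requires updating the timezone data, not the stored events.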

http://fanf.livejournal.com/104586.html


And yet another good read is Erik Naggum's "The Long, Painful History of Time" http://naggum.no/lugm-time.html which describes a better way of storing date(time)?s.


It seems such a shame that every new language has to learn this. I read a similar post relating to .NET a few years ago.


I just got bitten by this today: I was trying to get the time zone offset in Python to print the local time for an app.

Doing

    date +%z
in console gives me correct offset

    +0800
of China (we do not have DST anymore) while doing

    import time
    time.strftime('%z', time.localtime())
gives me

    +0706
WTF??!!! I can understand Python might not be able to get the correct timezone info from the system, but giving this kind of obviously incorrect offset is just dumb. Still searching for a solution other than manually configuring the Python code. If anyone knows why, I would be really glad for your help. Thanks in advance!


Use pytz.

  import pytz
  from datetime import datetime

  tz = pytz.timezone('Asia/Shanghai')
  tz.normalize(pytz.utc.localize(datetime.utcnow()).astimezone(tz))


Thanks for the response! I'm looking for a solution such that Python will go and load the system's time zone and use that. It's a nightmare to deploy an app to multiple servers located in different timezones when I have to change the timezone in Python manually.
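
In case it helps, a sketch of picking up the system's timezone without hard-coding a zone name, assuming Python 3.6 or newer. Calling astimezone() with no argument converts an aware datetime to the system local timezone. This is not the pytz approach from the reply above, just the standard library.

```python
from datetime import datetime, timezone

# Start from an aware UTC "now", then convert to whatever zone the
# operating system is configured for.
utc_now = datetime.now(timezone.utc)
local_now = utc_now.astimezone()

# The offset and zone name come from the system settings, not from code.
print(local_now.strftime("%z"), local_now.tzname())
print(local_now.isoformat())
```

The same code should then report a different offset on each server, following that server's own configuration.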


I'd just like to point out in reference to his Galileo quote that there would still be a need for time zones even if the sun revolved around the earth.


In Fantom we model DateTimes with both an absolute time in nanosecond ticks and a relative timezone:

http://fantom.org/doc/docLang/DateTime.html#dateTime

Works great for dealing with timestamps in absolute terms, but also when you need to deal with local time.


I have never heard any convincing argument (other than "we're already doing it") to use timezones or DSTs in the modern world. Can someone offer some articles or arguments?



