A data file shows a student's application was filed on "14-Apr-2020". So, was that application filed on or before April 14?

That seems like it should be an easy question to answer .. obviously yes, right? But if you're in the US, your browser might say "No" .

Interpreting and comparing date values is surprisingly nuanced. There are many common date string formats, these formats vary by country, browsers differ in how they parse date strings, and they even consider the user's time zone, so results may vary depending on when and where the date string was parsed.

This article details some practical complexities and presents a simple function to convert date strings you might receive in a data file to a clean date string you can consistently use for comparisons.

Date formats vary widely

The web comic XKCD nicely hints at the issue: XKCD on dates
ISO 8601 was published on 06/05/88 and recently amended on 12/01/04.

The core idea in this comic is that there is only one internationally recognized standard date format ("YYYY-MM-DD"). And yet this so-called standard format is unlike anything most people use in daily life.

Date strings need to be interpreted

Because data files often represent dates as string values, we need to parse the date strings to interpret them as Date values.

The challenge is that Date values are precise, in that they also require some assumption about the time and the time zone.

There is no single date format that works in all browsers and Excel

Surprisingly, as of the time of this article, Microsoft does not read or write dates in ISO standard form.

Further browsers vary in their support for other date formats. And in some notations, US and Europe differ in whether the month or day comes first.

  • "2020-04-14T09:12:00.000Z" parses in Chrome and Firefox as midnight UTC, but not Excel
  • "2020-04-14" parses in Chrome, Excel and Firefox, as midnight am London time
  • "2020-04-14 09:12" parses in Chrome and Excel in local time zone, but not Firefox
  • "04/14/2020 09:12" parses in Chrome, Excel and Firefox but assumes US Locale
  • "14-Apr-2020 09:12" parses in Firefox but as year -2020 during the bronze age.
  • "Apr 14, 2020 09:12" parses in Excel, Chrome and Firefox ... but US only??

One format does work pretty well across browsers, Excel, even SPSS

The format "YYYY/MM/DD HH:mm" does work pretty well across many platforms. Here the hours are in 24 hour time so "2020/04/14 15:30" is 3:30 pm.

It's interpretable and writable by most major browsers, Excel, and SPSS If you could choose one date string to use this would be it. Unfortunately, it's not an ISO standard date. But it works for now.

What time? What time zone?

The string "14-Apr-2020" references a date but not a time or time zone. Chrome interprets the string "14-Apr-2020" as date 2020-04-14T00:00:00.000Z which is just past midnight am "Zulu" or Coordinated Universal Time often referred to as Greenwich Mean Time (GMT).

Technically there is a difference between GMT and UTC but per XKCD you can ignore any sentence that starts with the word "technically".

Browsers consider the users' own time zone

So was this on or after April 14? It might depend where the browser is.

If the date string is an non-standard date format, as is often the case in data files, the browser will assume the time zone is the browser location's local time, and if no time is specified it will assume the start of the day or hour 00:00:

> new Date("14-Apr-2020")
< Tue Apr 14 2020 00:00:00 GMT-0400 (Eastern Daylight Time)

However, if the deadline is specified in a different date format, the browser may assume the time is start of day Coordinated Universal Time:

 > new Date("2020-04-14")
 < Mon Apr 13 2020 20:00:00 GMT-0400 (Eastern Daylight Time)

If the browser is in Boston, local time is 4 or 5 hours earlier than London. And midnight in London on April 14 is 7:00pm on April 13.

So, the date string "14-Apr-2020" may be interpreted as on date April 14 in browsers located East of the Atlantic (but on April 14 in Europe and Asia) and the cutoff date interpreted as on April 13, so it may appear that the application came after the deadline.

To get dates correct, we need to be more explicit about time and time zone.

Idea: assume American Samoa time zone instead of GMT

In the absence of a time or time zone, a simple idea is to assume the time is 12:01am that day in American Samoa, just east of the International Date Line. The time zone in American Samoa time is GMT-11 hours.

For instance, "14-Apr-2020 00:00GMT-11" is on April 14 (almost) everywhere in the world:

  • 0:00am April 14, 2020 in American Samoa is...
  • 1:00am April 14, 2020 in Honolulu,
  • 4:00am April 14, 2020 in California
  • 7:00am April 14, 2020 in Boston
  • 12:00pm April 14, 2020 in London
  • 1:00pm April 14, 2020 in Barcelona
  • 2:00pm April 14, 2020 in Moscow
  • 4:30pm April 14, 2020 in Delhi
  • 7:00pm April 14, 2020 in Bejing
  • 8:00pm April 14, 2020 in Tokyo
  • 11:00pm April 14, 2020 in New Zealand
  • 11:00pm April 14, 2020 in Fiji

All of these on the same day, April 14, 2020.

This solution is not quite perfect, though. A short distance west across a narrow strait is the island of Samoa, a separate island and nation entirely, in Apia Standard Time GMT+13 where the above time is:

  • 00:00am April 15, 2020 in Samoa

Turning date strings into ISO date strings

ISO date strings are nice because you can compare them alphabeticall y and you get a result that makes sense conceptually.

You can't compare just any date string. For instance, "13-Dec-2020" < "20-Apr-2020" alphabetically but not conceptually.

But ISO date strings are neatly comparable, for instance, "2020-12-13" > "2020-04-20" both conceptually and alphabetically.

So if we can convert an arbitrary date string to an ISO date string we can reliably compare dates no matter the time zone.

Solution

So a possible solution to allow rigorous simple comparison of dates in a way that makes intuitive sense is to

  • parse date strings into Date objects,
  • assume the time zone is American Samoa,
  • convert the date to an ISO Date/Time string
  • keep just the date part

Code

The following function converts a wide range of date strings to a simple ISO Date string.

/**
 * Convert date string into a simple date string
 * @param date    Arbitrary date string, e.g. "14-Apr-2020"
 * @returns {*}   ISO date string, e.g. "2020-04-14"
 */
function toSimpleISODateString: function (date) {
  if (date instanceof Date) date = new Date(date)
  if (date.getTime() !== date.getTime() ) return null; // invalid date
  if (date.getFullYear() < 0) {
    date.setFullYear(-date.getFullYear()); // Firefox
  }
  else if (date.getFullYear() < 2000) {
    date.setFullYear(date.getFullYear() + 100);
  }

date.setUTCHours(12) date.setMinutes(0) date.setSeconds(0)

return date.toISOString().substr(0, 10) }

The above code handles only the conversion to a comparable date string. If your program is receiving a variety of date strings, it may be useful to first use the browser's own date parser, and if that fails, then fall back to using (moment.js)[momentjs.com] to try other date formats:

function parseDateString(dateStr, fallbackFormat) {
    var dt = new Date(date)
    if (dt.getTime() != dt.getTime()) {
        if (fallbackFormat) dt = moment(dateStr, fallbackFormat) 
    }
    return dt;
}

Example

> toSimpleISODateString("14-Apr-2020")
< "2020-04-14"

toSimpleISODateString("14-Apr-2020") < "2020-04-14"

toSimpleISODateString("14-Apr-2020 02:10") < "2020-04-14"

< toSimpleISODateString("14-Apr-2020 12:10")

"2020-04-14"

toSimpleISODateString("14-Apr-2020 12:10pm") < "2020-04-14"