I've finally got the initial design of Chrono 0.2 working. It took so much time, partially because I'm working on dozens of other projects including Rust libraries. It solves one of the major annoyance with Chrono, so I'm glad that this new design is promising.
Time zones in Chrono 0.1
As always, I start by pointing to Erik Naggum's excellent essay about date and time. (Yes, I'm terrible at story telling and I won't say a lot about that.) Most importantly, the core aspect of time zone is that it is ultimately a political creation rather than a natural necessity. This complicates lots of things, though Author David Olson and others have done a tremendously important work in this regard.
There are several problems with time zones:
- There is no reliable way to handle the local date in the future.
- There can exist a local date which occurred in two or more instants.
- There can exist a local date which never has been occurred.
- The conversion process itself is seriously annoying and easy to make a mistake.
(If you are interested in the timekeeping, you'll realize that this list equally applies to leap seconds. I had to deal with them in Chrono as well, and chose to make them invisible to the normal usage.)
In fact, Chrono's original Offset
design
does explicitly account for these problems.
In Chrono, the local date is a concept
only meaningful to accessors and formatting routines,
eliminating the very source of 1.
The possibility of 2 and 3 is handled via the LocalResult
enum,
while 4 is handled via... delegating everything to the Offset
.
This decision is partly because
we would only have a handful number of Offset
implementations,
so we have to implement them anyway.
In the end, we had something like this:
pub trait Offset: Clone + Show {
fn local_minus_utc(&self) -> Duration;
fn from_local_date(&self, local: &NaiveDate) -> LocalResult<Date<Self>>;
fn from_local_time(&self, local: &NaiveTime) -> LocalResult<Time<Self>>;
fn from_local_datetime(&self, local: &NaiveDateTime) -> LocalResult<DateTime<Self>>;
fn to_local_date(&self, utc: &NaiveDate) -> NaiveDate;
fn to_local_time(&self, utc: &NaiveTime) -> NaiveTime;
fn to_local_datetime(&self, utc: &NaiveDateTime) -> NaiveDateTime;
fn ymd(&self, year: i32, month: u32, day: u32) -> Date<Self> { ... }
// other constructors follow
}
pub struct DateTime<Off> {
datetime: NaiveDateTime,
offset: Off
}
// Date and Time follows
This sounds good.
You can put the offset data into the timezone-aware DateTime
,
and use it to convert to the local date (to_local_date
).
In the converse, DateTime
has to be created from Offset
so that it converts to the internal representation in UTC
(from_local_date
).
But you might wonder:
why is local_minus_utc
separate from to_local_datetime
?
The latter can be implemented via the former, right?
Yes! In the current implementation the latter is redundant.
And this redundancy suggests a bigger problem.
Originally, to_local_datetime
was to be used
in the absence of the exact offset to UTC.
This alone is enough for converting UTC to the local time,
and it is still used in the Offset
conversion
where the original value is converted to UTC then to the target time zone.
But this is inefficient,
especially if we have a large table of zone transitions.
Therefore we have local_minus_utc
for caching the calculated current offset.
The caller was expected to call local_minus_utc
for converting the current value to UTC
and to_local_datetime
etc. for other cases.
For the fixed-offset time zones like UTC
these methods will be largely a simple arithmetic,
so we only pay what we use.
In the reality this scheme didn't work well.
Local
was an example of the glaring problem;
since it has to cache the value, it should have a field,
but that meant that we need to create a Local
instance
every time we convert to the local date!
This defies the simple interface like dt.with_offset(UTC)
,
and since such Local
instance doesn't know about the exact offset,
we have a separate flag indicating the offset is correct or not.
(In Rust, this would translate to Option<FixedOffset>
.)
This even breaks the original premise of
"only paying what we actually use".
After I realized this problem, I'd tried several solutions and spectacularly failed. It was clear that we really need two kinds of types, but I was not sure how to do that.
Time zones in Chrono 0.2
Associated types, originally proposed in RFC #195, were a game changer for this problem. They allow for two types to be smoothly connected in the compile time. The resulting design is as follows:
pub trait Offset: Sized + Clone + fmt::Show {
fn local_minus_utc(&self) -> Duration;
}
pub trait TimeZone: Sized {
type Offset: Offset;
fn from_offset(offset: &Self::Offset) -> Self;
fn offset_from_local_date(&self, local: &NaiveDate) -> LocalResult<Self::Offset>;
fn offset_from_local_time(&self, local: &NaiveTime) -> LocalResult<Self::Offset>;
fn offset_from_local_datetime(&self, local: &NaiveDateTime) -> LocalResult<Self::Offset>;
fn offset_from_utc_date(&self, utc: &NaiveDate) -> Self::Offset;
fn offset_from_utc_time(&self, utc: &NaiveTime) -> Self::Offset;
fn offset_from_utc_datetime(&self, utc: &NaiveDateTime) -> Self::Offset;
// helpers and constructors follow
}
pub struct DateTime<Tz: TimeZone> {
datetime: NaiveDateTime,
offset: Tz::Offset,
}
// Date and Time follows
This new design directly shows that
we are dealing with two different types!
TimeZone
creates an Offset
which is a storage-oriented type,
which can be converted back to the TimeZone
via from_offset
.
TimeZone
is used for creating date and time values,
while Offset
is used for converting to the local time.
I originally tried to avoid separate trait for local_minus_utc
,
but ultimately abandoned that plan
to make dt.offset().local_minus_utc()
possible.
There are a set of TimeZone
s and Offset
s available:
TimeZone instance |
Offset instance |
---|---|
UTC |
UTC |
FixedOffset |
FixedOffset |
Local |
FixedOffset |
TzFile (*) |
TzFileOffset (*) |
TzRule (*) |
TzRuleOffset (*) |
(* Some instances are under the development.)
One can note that some time zones are their own offsets,
as they do not have an additional data (i.e. cache) for the storage,
and Local
reuses the FixedOffset
as Local
itself has no state.
Other time zones have to cache its offset and the transition data,
hence separate offset types.
TzFile
and TzRule
is a part of tzdata support;
they will be typically used as a reference-counted version,
such as Rc<TzFile>
or Arc<TzRule>
,
to avoid the deep copying (which is common in Chrono).
TzRule
implements the rule string in POSIX TZ
environment variable,
which is not that useful by its own (since we can use Local
anyway),
but it's TzFile
's solution to the aforementioned problem 1:
the future zone transition is encoded as a form of TzRule
,
and we need to implement it.
At the end this new design is quite promising,
but even after #![feature(associated_types)]
is gone
associated types are still cutting edge features.
I had to disable debuginfo due to the Rust issue #21010, for example.
Hopefully though, this shall not affect the validity of this design.