Opened 7 years ago

Closed 3 weeks ago

#96 closed task (fixed)

times according to ISO 8601

Reported by: milosch
Owned by: Olly Betts
Priority: trivial
Milestone: 1.4.13
Component: cavern
Version:
Keywords:
Cc:

Description

Hello Olly, this is not really a bug, but I miss support for dates written according to ISO 8601. https://en.wikipedia.org/wiki/ISO_8601#Calendar_dates

When working with multiple programs, special cases like *date 2017.01.12 easily get in the way, just like the old way of stating the declination.

Standardization was invented for that. Greetings, Milosch

Change History (6)

comment:1 Changed 7 years ago by Olly Betts

The non-standard date format is not ideal, but unfortunately not trivial to fix cleanly at this point without changing the meaning of existing files since '-' indicates a range of dates currently and incomplete dates are supported (sometimes you only have an approximate date for an old survey). For example:

*date 1911-12

By ISO-8601 that's December 1911, but currently that actually means "date range 1911-01-01 to 1912-12-31".

The current situation is liable to confuse, as I suspect many people will read the above as December 1911, but before changing it I think we need a plan which doesn't itself cause confusion.
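To make the two readings concrete, here is a minimal Python sketch. It is not Survex code: the function names legacy_range and iso_month are made up for illustration, and it only handles the shape of the example above (a 4 digit year, a "-", then a 2 digit field).

```python
import calendar
from datetime import date


def legacy_range(text):
    """Legacy reading: 'YYYY-YY' is a range of years, 2 digit year assumed 19xx."""
    start_s, end_s = text.split("-")
    start_year = int(start_s)
    end_year = 1900 + int(end_s)  # assumption: 2 digit years mean 19xx
    return date(start_year, 1, 1), date(end_year, 12, 31)


def iso_month(text):
    """ISO 8601 reading: 'YYYY-MM' is a single calendar month."""
    year_s, month_s = text.split("-")
    year, month = int(year_s), int(month_s)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)


print(legacy_range("1911-12"))
print(iso_month("1911-12"))
```

The first call covers 1911-01-01 to 1912-12-31, the second only December 1911, which is the mismatch described above.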

comment:2 Changed 6 years ago by Olly Betts

Since 1.2.33 (released 2018-03-22) my 1911-12 example results in a warning:

test.svx:1:12: warning: Assuming 2 digit year is 1912
 *date 1911-12
            ^~

As NEWS notes, 2 digit years are only getting more ambiguous with time:

  + Warn about 2 digit years.  We can't change the assumption that these are
    19xx without risking breaking existing datasets, but the further we get
    into this century, the more likely such an assumption is to catch someone
    out.  The warning can easily be quashed by explicitly adding the assumed
    "19".

I think the end goal is to be able to use "-" inside a date and a different character or characters (maybe "--" or "/" which are options ISO-8601 allows) between the two dates when there's a date range. Looks like Therion just allows multiple space-separated dates if I read the manual correctly, though I'm not sure what it does with them. Survex's 3d format currently allows a single date or a continuous range.

The only cases which are currently valid and get a different meaning are ones with a four digit start year, then a "-", then a 2 digit end year, and the end year must be between 1 and 12 to be a valid month - e.g. 1995-99 is currently a valid range, but not a valid ISO date. So to be a case that changes meaning, the start year must be 1912 or earlier and the end year between 1901 and 1912, which is rather old for cave survey data.

Ideally I'd like to come up with a way to handle this which doesn't break or change the meaning of existing datasets, but the dates involved do at least mean that affected datasets should be very rare.

It would also be useful to support different date types - the most obvious two are "exploration" and "survey". The survey date matters for automatic declination calculations and the like, but the exploration date is more interesting for looking at how the known cave evolved over time.

We could have "*date explored 1911-12" with the ISO interpretation (i.e. December 1911) while "*date 1911-12" keeps the current interpretation (i.e. 1911-1912), but I worry that's going to catch people out. Perhaps that plus a warning for "*date 1911-12" noting it's being interpreted as "1911-1912" for compatibility, and to rewrite as "*date explored surveyed 1911-12" if December 1911 is meant, or something like that.

comment:3 Changed 5 months ago by Olly Betts

Milestone: 1.4.11

The affected examples such as 1911-12 have now warned for over 6 years, so I think changing the meaning is reasonable at this point. If someone has an old dataset which hasn't been updated, they'll get such dates handled wrongly. How wrongly? Well the affected cases all seem to be of the form:

<year-between-1900-and-1912>-<two-digit-year-between-start-year-and-1912>

(Dates before 1900 are not supported and give a warning, end date before start date is an error.)
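The pattern above can be checked mechanically. The following Python sketch is an illustration only, not how cavern decides anything: the name changes_meaning is hypothetical, and it assumes that a range whose end year equals its start year is accepted.

```python
import re


def changes_meaning(text):
    """Return True if text is valid as a legacy year range AND as an ISO month."""
    m = re.fullmatch(r"(\d{4})-(\d{2})", text)
    if m is None:
        return False
    start_year = int(m.group(1))
    field = int(m.group(2))
    end_year = 1900 + field  # legacy assumption: 2 digit years are 19xx
    # Legacy reading: start must be supported (>= 1900) and not after the end.
    # Assumption: end year equal to start year is allowed.
    valid_legacy_range = start_year >= 1900 and end_year >= start_year
    # ISO reading: the second field must be a month number.
    valid_iso_month = 1 <= field <= 12
    return valid_legacy_range and valid_iso_month


for example in ["1911-12", "1900-12", "1995-99", "1913-12"]:
    print(example, changes_meaning(example))
```

Of these examples, only 1911-12 and 1900-12 are flagged: 1995-99 is not a valid ISO month, and 1913-12 would be a range ending before it starts.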

Survey data from these years is really uncommon to start with.

We would now interpret this as a month in the first year.

Survex uses the middle date of the range specified (for automatic declination calculations and also what's stored in the .3d file).

So the most wrong case seems to be 1900-12 which currently uses the middle of 1906 but would instead use the middle of December 1900 - i.e. ~5.5 years wrong.

The 1911-12 example currently uses the start of 1912, and would instead use the middle of December 1911, so only half a month wrong.
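The arithmetic behind those two figures can be reproduced with a short Python sketch. The helper names (midpoint, legacy_mid, iso_mid) are invented here, and taking the midpoint as the start plus half the span, rounded down to a whole day, is an assumption rather than a description of Survex's actual code.

```python
import calendar
from datetime import date


def midpoint(start, end):
    # date + timedelta ignores any fractional day, so this rounds down.
    return start + (end - start) / 2


def legacy_mid(year, two_digit):
    """Middle of the legacy range: 1 Jan of year to 31 Dec of 19xx."""
    return midpoint(date(year, 1, 1), date(1900 + two_digit, 12, 31))


def iso_mid(year, month):
    """Middle of the given calendar month (ISO 8601 reading)."""
    last_day = calendar.monthrange(year, month)[1]
    return midpoint(date(year, month, 1), date(year, month, last_day))


for year, field in [(1900, 12), (1911, 12)]:
    old, new = legacy_mid(year, field), iso_mid(year, field)
    print(f"{year}-{field:02d}: legacy {old}, ISO {new}, shift {(old - new).days} days")
```

For 1911-12 this gives a legacy midpoint of 1912-01-01 against 1911-12-16 under the ISO reading, a shift of 16 days; for 1900-12 the shift is roughly five and a half years.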

The IGRF model we use to calculate magnetic declinations also gets less accurate the further back you go.

I'm thinking we encourage use of - in dates but accept . for compatibility. For the range case we could just take two dates separated by blanks, e.g.:

*date 2020-10-31 2020-11-01

I'm thinking to add support for date types at the same time, and that *date 1911-12 should warn that its meaning has changed, and to suppress that warning you can use *date surveyed 1911-12 (assuming you meant December 1911) or *date 1911 1912 if you really did mean 1911 to 1912.

Specifying a date type could require you to use - as the separator. Not entirely sure about that.
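As a rough illustration of the syntax being considered here, the Python sketch below tokenises the arguments of a *date command: an optional date type, then one date or two blank-separated dates, with either - or . between the fields. Everything in it is hypothetical (the function name parse_date_args, the DATE_TYPES set, the returned shape); it does no range or calendar validation and is not the implemented cavern behaviour.

```python
import re

# Hypothetical set of date type keywords, taken from the discussion above.
DATE_TYPES = {"surveyed", "explored"}


def parse_date_args(args):
    """Split '*date' arguments into (date_type, list of date field tuples)."""
    tokens = args.split()
    date_type = None
    if tokens and tokens[0] in DATE_TYPES:
        date_type = tokens.pop(0)
    if not 1 <= len(tokens) <= 2:
        raise ValueError("expected one date or two blank-separated dates")
    dates = []
    for token in tokens:
        # Accept '-' (preferred) or '.' (compatibility) between fields.
        parts = re.split(r"[-.]", token)
        if not 1 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
            raise ValueError(f"bad date: {token!r}")
        dates.append(tuple(int(p) for p in parts))
    return date_type, dates


print(parse_date_args("2020-10-31 2020-11-01"))
print(parse_date_args("surveyed 1911-12"))
print(parse_date_args("1911 1912"))
```

Under this sketch "surveyed 1911-12" yields a single year and month pair, while "1911 1912" yields two separate years, so the two intentions are no longer spelled the same way.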

comment:4 Changed 4 months ago by Olly Betts

Milestone: 1.4.11 → 1.4.12

Ticket retargeted after milestone closed

comment:5 Changed 2 months ago by Olly Betts

Milestone: 1.4.12 → 1.4.13

Ticket retargeted after milestone closed

comment:6 Changed 3 weeks ago by Olly Betts

Component: Other → cavern
Resolution: fixed
Status: new → closed