
Lee Maguire: webslog


2004-05-19

X-Know-Archive

A Socratic correspondent writes:

From today I'll be forwarding most of my mail to my gmail.com account. Note that gmail doesn't support TLS so your mail to me will be sent in the clear. And of course, all our correspondence will be stored in a database with access controlled by a US-based company, who (regardless of their commitments to "not being evil") will almost certainly roll on NSLs.

So, just thought I'd give you a courtesy heads up. Gmail has some privacy info at http://gmail.google.com/gmail/help/more.html

I imagine that for every person who informs me they're storing their correspondence and/or addressbook in a commercial, third-party, US-based computer system, there are several who don't. The privacy of an email is entrusted to the recipient, but disclosure more often impacts the sender. (e.g. those regrettable email correspondences between London investment bankers on Monday, as far as Hong Kong by Tuesday, and in the London tabloids by Wednesday morning.) Yet people's choices and compromises are mainly made with regard to their own privacy.

Some of the issues we see with the introduction of gmail are similar to the issues resulting from the introduction of DejaNews, a searchable web-based Usenet archive (which later became Google Groups). Advertising linked to postings was a hot topic, as was people's unease at their once-transitory drunken missives waiting to be rediscovered.

The solution was to allow an opt-out. A "nuke" function was made available to erase old embarrassments from the archive, along with a technical means to opt out of long-term storage: the famous "X-No-Archive:" header.

Does gmail honour X-No-Archive? Should it? Unlike public Usenet postings, the trustee of the mail should still be the recipient. (As things stand, mail forwarding systems could easily decline to forward mail carrying that header to gmail. But that probably wouldn't go down well with users.) And yet, when I mail an individual (who may have mail forwarding in place) I'm not necessarily cognizant of the privacy policies that will be applied to it. I still believe that personal email (even unencrypted email) should have the same expectation of privacy that a physical letter would have.
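To make the parenthetical above concrete, here is a minimal sketch of the check a forwarding system might apply before relaying mail to a third-party archive. It assumes the usual Usenet convention of "X-No-Archive: yes"; the function name may_forward_to_archive is hypothetical, and this is an illustration rather than a description of how any real forwarder behaves.

```python
from email import message_from_string


def may_forward_to_archive(raw_message):
    """Return False when the sender has opted out via X-No-Archive.

    Assumption: the opt-out is signalled by the value "yes"
    (case-insensitive), as was conventional on Usenet.
    """
    msg = message_from_string(raw_message)
    value = msg.get("X-No-Archive", "")
    return value.strip().lower() != "yes"


opted_out = "From: joe@example.org\nX-No-Archive: Yes\n\nPlease don't keep this.\n"
ordinary = "From: joe@example.org\n\nNothing to hide.\n"

print(may_forward_to_archive(opted_out))
print(may_forward_to_archive(ordinary))
```

A forwarder using a check like this would have to decide what to do with the withheld mail (hold it locally, bounce it, or tell the user), which is exactly where it stops going down well with users.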

Personally, I think the first step in reassuring people would be to provide an optional mechanism that lets them know when their mail is, or isn't, stored in a third-party archive.

My suggestion (and this is off the top of my head, I haven't researched this, or looked for other proposals) would be to extend the syntax of RFC 3798 (Message Disposition Notifications, or MDNs) with an "x-archive" option, processed only by third-party archives such as gmail.

An example of a mail header fragment might be:

From: Joe Example <joe@example.org>
To: Belinda Example <belinda@example.com>
Subject: My confession
Disposition-Notification-To: Joe Example <joe@example.org>
Disposition-Notification-Options: x-archive=optional, stored, purged, policy;

So x-archive would be a non-standard extension to "Disposition-Notification-Options:". (The "x-" prefix denotes that it is not registered with IANA; a registered version might be "archive".) The importance should be listed as "optional" if a notification is still wanted from systems that do not understand the x-archive option. As per the RFC, a system that does not understand an option marked "required" may send nothing other than a "failed" notice.
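As a rough illustration of how an archive might read the proposed option, the sketch below parses the example header using Python's standard email module. It follows the general "attribute=importance, value, value" shape of the RFC's options syntax; the helper name parse_notification_options is hypothetical, and the x-archive attribute is only the suggestion made here, not anything a real system implements.

```python
from email import message_from_string

RAW = """\
From: Joe Example <joe@example.org>
To: Belinda Example <belinda@example.com>
Subject: My confession
Disposition-Notification-To: Joe Example <joe@example.org>
Disposition-Notification-Options: x-archive=optional, stored, purged, policy;

Body text.
"""


def parse_notification_options(header_value):
    """Return {attribute: (importance, [values])} for each parameter."""
    options = {}
    # Parameters are separated by ";" and each one looks like
    # attribute=importance, value, value, ...
    for parameter in header_value.split(";"):
        parameter = parameter.strip()
        if not parameter:
            continue  # tolerate the trailing ";" in the example
        attribute, _, rest = parameter.partition("=")
        parts = [p.strip() for p in rest.split(",")]
        importance, values = parts[0].lower(), parts[1:]
        options[attribute.strip().lower()] = (importance, values)
    return options


msg = message_from_string(RAW)
options = parse_notification_options(msg.get("Disposition-Notification-Options", ""))
importance, events = options.get("x-archive", (None, []))
print(importance, events)  # optional ['stored', 'purged', 'policy']
```

A system that finds no "x-archive" entry, or does not know the attribute at all, simply gets (None, []) here, which matches the intent of marking the option "optional".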

Note that the variables for this option are concerned with an instance of a mail in an archive, and not with the user interaction with the mail. They merely provide a mechanism for informing the sender of the status of the mail in an archive (and provide a vehicle for promoting the relevant privacy policy). Examples of variables might be:

  • "stored": send an MDN when the mail is stored in the archive
  • "policy": send an MDN when changes are made to the privacy policy
  • "purged": send an MDN when the mail is purged from the archive

There is a difference between "deleted", a user action (which in MDN terms does not preclude the un-deletion of a mail), and "purged", which is an indication of an unrecoverable deletion from an archive. Therefore a user deleting mail from an archive may (depending on the configuration) result in two different MDNs, but a mail being automatically purged from the archive without user interaction may result in just one.
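The distinction can be sketched as a small decision function. Everything here is an assumption for illustration: the event names "user-delete" and "auto-purge", the purge_on_delete configuration flag, and the function mdns_for are all hypothetical, and the sketch ignores whether the recipient has consented to sending the ordinary "deleted" MDN.

```python
def mdns_for(event, requested, purge_on_delete=True):
    """List the notifications an archive might send for one event.

    event: "user-delete" or "auto-purge" (hypothetical event names).
    requested: the x-archive values the sender asked for.
    purge_on_delete: whether this archive erases mail irrecoverably
    as soon as the user deletes it (a configuration assumption).
    """
    sent = []
    if event == "user-delete":
        sent.append("deleted")  # the ordinary MDN for a user action
        if purge_on_delete and "purged" in requested:
            sent.append("purged")  # the archive-level notification
    elif event == "auto-purge":
        if "purged" in requested:
            sent.append("purged")  # no user action, so no "deleted"
    return sent


print(mdns_for("user-delete", {"stored", "purged"}))
print(mdns_for("auto-purge", {"stored", "purged"}))
```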

These options might not be useful as default headers. For example, a wide use of "policy" may cause a large number of small MDNs to be delivered to the senders of any mail in the archive when a policy change is made. Someone without correct filters in place to interpret them might end up with a flood of mail. Would one "MDN" per email address suffice? Similarly, what if an entire user archive is purged?
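One way to picture the "one MDN per email address" idea is to group the archived mails by sender before notifying anyone of a policy change. This is only a sketch under assumed data: the tuple layout, the message ids, and the function coalesce_policy_notices are hypothetical, and nothing in the RFC defines such a combined notification.

```python
from collections import defaultdict


def coalesce_policy_notices(archived):
    """Group archived mails by sender so each sender gets one notice.

    archived: iterable of (sender_address, message_id, requested_values).
    Returns {sender_address: [message_id, ...]} covering only mails
    whose sender asked for "policy" notifications.
    """
    per_sender = defaultdict(list)
    for sender, message_id, requested in archived:
        if "policy" in requested:
            per_sender[sender.lower()].append(message_id)
    return dict(per_sender)


archive = [
    ("joe@example.org", "<1@example.org>", {"stored", "policy"}),
    ("Joe@example.org", "<2@example.org>", {"policy"}),
    ("ann@example.net", "<3@example.net>", {"stored"}),
]

notices = coalesce_policy_notices(archive)
print(notices)
```

With this grouping, a policy change produces one notice per sender listing the affected mails, rather than one per archived mail. The same grouping could in principle cover the purge of an entire user archive.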

Are there any groups looking into this issue?

Unclassified: posted at 15:18,