Currently:
- Sounds Pretty Feasible >>
-
I was a bit sceptical. But since reading the current draft for the proposed Sender Permitted From anti-spoofing (and, by extension, anti-spam) scheme, I'm still a little sceptical, but find myself thinking "blimey, this one might actually stand a chance". It's still early days, but it almost feels like the ghost of a lost RFC - fragments of an early submission that was somehow dropped behind Jon Postel's desk, never to be recovered.
It's flexible to a fault: Almost all of my technical queries have been answered with a brain boggling mixture of network ACL, mail and RBL-like syntax. The bread and butter of the renaissance sysadmin.
It's not trying to overreach: It's not trying to provide authentication of individual addresses or "users" - that's the job of crypto in the utopian techno-future. This just provides information to allow a mailserver to determine if mail should be coming from a particular IP address based on its "Envelope From" (if you're not familiar with SMTP, note: this is not necessarily the address in the From: line).
It doesn't require a global atomic change: Users don't have to do anything. In fact nobody has to do anything. The magic pixies who run the Internet will start rolling it out for their own use, no existing standards need to be messed with. Others can join in when they feel like it. The idea, essentially, is to build a big distributed database of which sending servers are used by which domains. And for the hostmasters who don't join in, alternative records can be made available elsewhere. (Hopefully a unique DNS RR will eventually be granted, and the world's nameservers and resolvers will be updated to recognise it, falling back to the TXT records if not.)
It doesn't require precise, public, information disclosure: Domains who, for whatever reason, can't (or won't) provide precise information can still provide better-than-nothing information to begin with. Even an odd rule such as "!ptr:kr" might be useful for spotting spoofing. (It means mail from addresses in this domain will never come from a server whose validated "reverse DNS" PTR is in the domain hierarchy of South Korea.)
Even if it's not fully adopted it still has significant value: Even if mail servers aren't going to act based on SPF information, the fact that there's a DNS-based database out there that can be used by client-side filters is a big win on its own. This would be the DNS whitelist to balance out all of those blacklists. It may be this proposal's saving grace.
Now obviously some knowledge gathering about how addresses are used is going to be necessary for rules to be published. But prime candidates for early listings would be the domains exclusively used by mailing lists.
To give you an example - the NTK mailouts, and associated list admin mails, are the only mails that should be seen with an Envelope From with the domain "ntk.net". And in fact these mails are only ever sent out from a server that is also a listed MX for it. Therefore the relevant record added to the zone might be:
ntk.net. IN TXT "v=spf1 mx default=deny"
In fact that'll probably be the case for the majority of domains/subdomains that are dedicated to mailing lists. (As it is, this record might require a second query for the MX records. Now a more network efficient way would have been to specify the IP addresses, but the example serves to illustrate the potential for simplicity.)
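To make the ntk.net example concrete, here is a minimal sketch in Python of how a receiving server might evaluate such a record. It assumes the draft-era syntax quoted above, understands only the `mx` mechanism and the `default=` modifier, and takes the MX addresses as an argument rather than querying DNS. The function name `check_spf` and the 192.0.2.x addresses are hypothetical.

```python
def check_spf(record, client_ip, mx_ips):
    """Return 'pass' or the record's default for a connecting client IP.

    Illustrative only: understands just the 'mx' mechanism and the
    'default=' modifier from the draft-era record quoted above.
    mx_ips stands in for the result of an MX (plus address) lookup.
    """
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        return "unknown"          # not a record this sketch understands
    default = "unknown"
    for term in terms[1:]:
        if term == "mx" and client_ip in mx_ips:
            return "pass"         # sender is one of the domain's listed MXes
        if term.startswith("default="):
            default = term.split("=", 1)[1]
    return default


# Hypothetical addresses from the documentation range 192.0.2.0/24.
ntk_mx = {"192.0.2.10"}
record = "v=spf1 mx default=deny"
print(check_spf(record, "192.0.2.10", ntk_mx))   # a listed MX
print(check_spf(record, "192.0.2.99", ntk_mx))   # any other host
```

Passing the MX addresses in keeps the sketch self-contained; a real checker would have to make the second DNS query mentioned above.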
The SPF draft suggests a ludicrously optimistic "global sunrise period" ending on April 12th 2004 (as being the 10 year anniversary of Usenet spam). I'm sceptical it'll be that much further than a draft RFC by that point but, since publishing just involves the inclusion of a standard DNS record type, I can easily see rapid adoption once the standard has stabilised. By mid-2004 it might be common to see "Received-SPF:" headers appearing atop mails, and utilised as client-side heuristic tests. And not long after, the inclusion of fake Received-SPF: headers in spam (although lower in the headers than they should be). SPF, for a while, will be known as "that spamassassin thing" and by the time support is folded into the likes of Mail.app and Firebird, it may already have built up a critical mass of record publishers.
New iterations of mailserver software will be "SPF aware", but aside from some smaller servers, most of the big sites will hesitate to "pull the switch" and use it for transaction time rejection, instead leaving it as a client-side tool. The reason will be the ".forward" problem, a close cousin of the mailing-list problem.
Essentially the problems mailing-lists pose to anti-spam schemes are (with some generalisations):
- Mailing-lists are not "personally" addressed to the receiver. They have the properties of any bulk mailing.
- The "From:" addresses may have no direct connection to the machine that is passing them on.
- Any "charge" applied to the sending of one mail must be multiplied by the size of the list (which may be tens of thousands).
- A mailing-list might not be a commercial tool, there may be no opportunity for solutions that involve a greater expenditure of money or resources.
- Any machine that sends mail is potentially a legitimate list server.
- The receiving mailserver doesn't keep records of which lists each client is subscribed to. Ultimately, only the client can separate the solicited from the unsolicited.
- Spam sometimes infiltrates legitimate mailing lists - so applying measures against "the source" may do more harm than good.
The mailing-list problem is the sanity check I use when reading anti-spam proposals. Often the authors of schemes that don't have room for them will tack on a hand-waving addendum about how some unspecified "special arrangements" will need to be made. Others will even lament that the idea of free mailing lists will have to end. "I think it's a small price to pay for a spam-free inbox". Well, I don't.
SPF side-steps the mailing list problem by concentrating on the "envelope sender" not the "From:" header, which for most modern mailing-lists is an address used by some automatic management process (e.g. majordomo@, list-admin@, or a VERP address) which is tied to the list's domain.
The problem is how mail forwarding works. When an email is automatically forwarded (e.g. via a .forward or aliases rewrite) the envelope sender (the address error reports are to be sent to) is preserved and used for the subsequent transaction (although some broken software will use the From: address). In the eyes of SPF checkers this makes them likely forgeries.
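A small sketch shows why a forwarded message looks forged. The domains, the IP addresses and the `PERMITTED` table are hypothetical stand-ins for published SPF data; the only thing modelled is that the envelope sender survives the forward while the connecting IP changes.

```python
# Hypothetical stand-in for published SPF data: domain -> permitted sending IPs.
PERMITTED = {"example.org": {"192.0.2.10"}}


def looks_forged(envelope_from, client_ip):
    """True if the domain publishes data and client_ip is not permitted by it."""
    domain = envelope_from.rsplit("@", 1)[1]
    permitted = PERMITTED.get(domain)
    if permitted is None:
        return False              # no published data, so nothing to contradict
    return client_ip not in permitted


envelope_from = "alice@example.org"

# Hop 1: example.org's own server delivers to the forwarding host.
direct = looks_forged(envelope_from, "192.0.2.10")

# Hop 2: the forwarding host (198.51.100.7) relays to the final mailbox,
# keeping the original envelope sender so bounces still reach alice.
forwarded = looks_forged(envelope_from, "198.51.100.7")

print(direct, forwarded)
```

The second hop is perfectly legitimate, yet it is indistinguishable from a spoof unless the final server knows to trust the forwarder.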
While the bulk of the SPF proposal is a neutral, optional specification that doesn't conflict with the existing mail infrastructure, this problem may actually require real (although relatively minor) changes in how mailservers work for SPF to function properly.
As far as I can see, the only satisfactory way around this in the short term is to use distributed whitelists for the well known mail forwarding servers, and local whitelists controlled by clients. This probably means that only servers with a small number of users (i.e. where whitelist information, if needed, can be feasibly managed) will feel confident enough to use SPF (independent of other tests) for transaction time rejection.
The current SPF proposed system for dealing with this issue, SRS, not only appears to be a horrible kludge, it also introduces new problems into the existing infrastructure. Most mailservers would need to be updated just in order to understand how to process bounce messages. I'm sceptical that this approach as it stands now will ever find favour.
So, while I can see SPF TXT records being adopted in the short term, I think it's going to take a little longer to implement the full system, and fulfil its promise. I imagine, if this proposal goes through the IETF wringer, we'll eventually end up with a standard SPF ESMTP extension being advertised and recognised by "SPF2" compliant mailservers. This will be the hook to communicate the additional information for dealing with forwards and bounces. Servers that do not advertise or recognise the extension would work as normal, and be candidates for a local whitelist.
Which is not to say that the "bang path"-like chain of addresses for error reporting is necessarily a bad thing. The concept of chained bounces sounds like anathema to postmasters in the age of spam. But, if address forgery were a (mostly) solved problem, it might be a preferable design.
I consider it a potential weakness of SMTP that when mail forwarding fails, the sender gets back more information about where the mail went than necessary. It's been the case in the past that I've sent a mail to some Important.Person@respectable-company.ltd.uk and just gotten back a bounce message from a free webmail service telling me that "user spunkymunky69 is over quota". It's potentially confusing for the sender and in some cases undesirable for the (non) receiver. The opportunity to massage bounce messages may prove to be beneficial.
So, if SPF were to be deployed there are things I hope to happen (a drastic reduction in spam, wider knowledge of the difference between header and envelope "from", greater support of SMTP AUTH and port 587) and some things I hope won't (the overloading of PTR records, undelivered bounce messages, service provider "lock-in"). I've often thought that the schemes more likely to succeed would be the ones that ask for the least in sacrifice. SPF seems to be asking us to make hardly any sacrifice at all.
(posted 2003-10-30T11:50, link )
- And another two cents >>
-
Since it was linked from last week's NTK, people other than Googlebot have read a post I made in May about tightening SMTP standards, and I've begun to get some feedback.
One correspondent questioned the validity of using SMS-spam as some indication that micropayment systems wouldn't necessarily deter spam. It's a fair point, I suppose. Message charges are set (and collected) by the sender's service provider and can traverse to recipients on other networks. Bulk message senders are free to shop for the cheapest service provider and thus pay a fraction of the usual cost. Surely any charging system designed for email would feature negotiation (automatic or not) directly between the sender and the receiver?
Well, yes. A well thought-out proposal could feature this. But I don't think you'd ever see it adopted. There's a lot of discussion and a lot of proposals flying around, and I'm not au fait with them all, but I haven't seen anything to convince me against the following:
- any widely deployed currency-based per-message charging will ultimately be set and collected by the sender's service provider
- currency-based charging will never be successfully integrated into the existing global SMTP infrastructure
It's not that I don't think it's technically feasible to implement a fair sender/receiver currency-based charging system globally, I just don't think it's politically feasible.
Imagine all the email you get in a day. Subtract all of the mailing list traffic. Subtract all of the mail that comes from addresses in your address book, addresses you have mailed, and addresses from which you have received legitimate mail in the past. This is the mail that can be white-listed (ignoring the issue of spam infiltrating mailing lists for now). What you have left is the problem - probably 99% of it spam, and what's left after that is legitimate "initial contact" mail from new addresses. The problem isn't separating the spam from the mail, the problem is separating the "initial contact" mails from the spam.
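The subtraction described above can be sketched as a simple triage. Everything here is hypothetical: the message shape, the `is_list` flag and the sample addresses. It only illustrates that whitelisting leaves a residue in which spam and genuine first-contact mail sit side by side.

```python
def triage(messages, known_addresses):
    """Split messages into whitelisted mail and the unknown-sender residue.

    messages: list of dicts with 'sender' and 'is_list' keys (hypothetical shape).
    known_addresses: address book plus anyone previously corresponded with.
    """
    whitelisted, residue = [], []
    for msg in messages:
        if msg["is_list"] or msg["sender"].lower() in known_addresses:
            whitelisted.append(msg)
        else:
            residue.append(msg)   # spam and first-contact mail, not yet separable
    return whitelisted, residue


inbox = [
    {"sender": "list-admin@lists.example", "is_list": True},
    {"sender": "friend@example.org", "is_list": False},
    {"sender": "stranger@example.net", "is_list": False},
    {"sender": "pills@example.com", "is_list": False},
]
known = {"friend@example.org"}

ok, unknown = triage(inbox, known)
print(len(ok), len(unknown))
```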
(Yes, the ability to forge identities can reduce the effectiveness of a whitelist approach. The solution to that problem isn't payment, but rather the adoption of signing. Public-key based message signing at the client side, TLS on the delivery side - where some use can eventually be made of certificates for authentication and trust relationships.)
You wouldn't want your friends and contacts to be charged for the privilege of emailing you (well I wouldn't), so that leaves the spam and the "initial" mails. The ideal charging level is that which doesn't dissuade legitimate correspondence but does dissuade unwelcome commercial messages. Does such a level exist? I have no doubt that any cost barriers that are introduced will have a direct effect on the amount of unwanted commercial messages, but I don't know that there's a level that produces a neat line separating the two. I have always received more (paper) junk mail than normal mail, thankfully nowhere near the depressing ratio of spam to email, yet what value of postage stamp would dissuade me from, say, informing a random webmaster of a minor mistake on their site?
I consider the idea that some middle-ground would thrive - that, for example, busy executives would charge large amounts to receive email - to be fanciful. It's usually the fear of the odd overlooked gem that has rendered anti-spam techniques impotent. A salutation from a long lost friend with the subject "Hi", an important business mail sent out-of-hours from the kid's computer, that domain renewal reminder. Most people would apply no charge on the things they want to read, and a bajillion dollars on spam. And if there's mail you don't want to read but have to? Chances are you're being paid to read them already - get back to work.
So if you price too high for the spammers, no money changes hands. And if you set a price of nothing for your friends, no money changes hands. Whoa, hold on! To the Internet Service Providers this is a potential Critical Revenue Source. If they're going to have to develop and deploy these systems (assuming they aren't out-of-band charging systems) their slice of the pie is going to have to be worth something.
In fact, why wouldn't it be their pie? Why can't your Service Provider bill for email like the phone company bills for calls, or SMS messages? The analogy with the phone company, while possibly flawed in the network sense, is clear. It's their systems that will be handling the mail, their systems that will likely negotiate the financial transactions, and it's they who receive the complaints when things go wrong. If anyone's going to benefit from email charges it's going to be them. And while we're at it, why would charges be the "nominal" amounts that proposals promise? Why wouldn't they be whatever the market will accept (which will be some point between significant and insignificant - more than nominal by definition)?
It's often said of phone companies that it costs them more to bill for a call than it does to facilitate it. I can guarantee that will be the case, many times over, for email.
One of the effects of the big players arriving late to the email party is that, while it's difficult for any of them to unilaterally dictate the community's technical standards, it's easy for them to stymie the adoption of new ones. If your proposal doesn't sit well with the big boys, the likes of Microsoft (Hotmail) and AOL, it doesn't stand a chance of being adopted in any useful sense.
And if email is left to continue its descent into a cesspool of spam and Windows viruses? Well, it'll still be to the likes of Microsoft that the world will come begging for an alternative.
If you want a rough idea what a replacement for email might look like, keep an eye on the IM networks over the next couple of years. On the one hand is what I would consider the "traditional" approach to Internet services, favoured by Jabber, of decentralisation and open standards. On the other hand is the commercial approach of MSN, AOL, and Yahoo: proprietary protocols, user lock-in, software lock-in, and possible advertising or subscription revenue. As far as I know there's no way to send messages between these commercial networks. Currently the best solution is a single application managing multiple separate accounts, but - when these networks begin to monetise - these temporary freedoms may prove short lived.
But maybe we won't see this in email. Maybe the infrastructure won't be built for the benefit of the big commercial players. Maybe a payment infrastructure would be built that, if it actually worked, would hardly ever be used.
I can't see the world's tax-men ignoring this for long.
While the frequent warnings about impending email taxes are usually variations on a long running hoax, it doesn't mean that government bodies haven't seriously considered it (especially in Europe). As recently as 1999 the UN proposed an email tax purely as a way of generating revenue (in this case for third world infrastructure development). I imagine it was mainly the lack of suitable infrastructure coupled with widespread suspicion of Government meddling in an emerging industry that eventually shelved these ideas.
Back in the mid-nineties the Canadian economist Arthur Cordell was proposing a "bit tax" of .000001 cents/bit. This might sound reasonable if you work out how much that is per email, but at the time I remember working out that (even if you were to discount the data used for routing and transmission control) the network delivery of a single MPEG-2 encoded 2-hour movie would accrue around $300 in tax. There's clearly no direct correlation between bits and "value" - a single floppy disk's worth of text can be more valuable than the bulk of Charlie Sheen's filmography.
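The arithmetic is easy to reproduce. The sketch below assumes a 10 KB email and a 4 GB file for the movie; both sizes are round-number assumptions for illustration, not figures from Cordell's proposal.

```python
TAX_CENTS_PER_BIT = 0.000001   # the proposed rate quoted above


def bit_tax_dollars(size_bytes):
    """Tax in dollars on transferring size_bytes at the quoted per-bit rate."""
    return size_bytes * 8 * TAX_CENTS_PER_BIT / 100


email_bytes = 10 * 1000        # assumed 10 KB message
movie_bytes = 4 * 1000**3      # assumed 4 GB MPEG-2 encode of a 2-hour film

print(f"email: ${bit_tax_dollars(email_bytes):.4f}")
print(f"movie: ${bit_tax_dollars(movie_bytes):.2f}")
```

Under these assumptions the email costs less than a tenth of a cent, while the movie comes to roughly $320, in line with the figure above.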
Emails are attractive as taxable electronic "items". There's still the potential for growth (especially in a spam-free environment), a fairly consistent idea of what a single email message is, and there's usually some level of personal or commercial value in them that makes sub-penny level taxes seem insignificant. And now, thanks to spam, tax proposals can be dressed up as positive measures to benefit users.
When you introduce money into a system you'll end up attempting to implement a solution that, as well as dealing with the primary problem of spam, also has to satisfy a host of conflicting secondary agendas. This is why I don't think we'll see currency-based charging in SMTP: the resistance to change would be too great. And I expect any proposed commercial replacement for SMTP/RFC-standard based email systems will incorporate charging from day one. And I don't expect users to adopt them at anywhere near the penetration of current email.
But what of other "payment" schemes, ones that rely on non-currency based charging? Certainly, I think they've got more chance of adoption than money-based charges, but I'm not convinced yet. I'll probably address these in a future submission.
(posted 2003-10-27T21:36, link )
- Night of the long queue >>
-
Should any photos surface that show me attending the "Night of the Panther" launch last night (hint: I'll be the only one not wearing black) let me just assure you I haven't become one of those "switchers". The evangelical apple-turnovers. The iPod-people.
No, my serious philosophical commitment to the ideals of free-as-in-freedom software would prevent me from ever being tempted by the dark side. That and the fact that I've never been in a financial position to afford an Apple machine. I've been happy enough using Macs at work and ever since OSX, I've thought of Macs as being the machines I'd prefer other people were using. Young, rich, sophisticated urbanites with well-honed aesthetic tastes. Those I am wont to mock but secretly envy. Conversely, it is relatively easy not to aspire to owning a machine capable of running the latest Microsoft offering.
Perhaps it's a shame that you'll never see a long queue of people along Tottenham Court Road awaiting the latest release of Debian, buzzing with raw geek-energy and anticipation little removed from that of a sci-fi sequel premiere. For the hard core, the "Debian unstable", every night is an upgrade. Every apt-get promises the thrill (or the shock) of the new.
Drawn, chiefly by the prospect of beer 'n' pizza, I was present at the sacred rite of "installation" that followed - but fear not, I maintained my cool cynical distance by occasionally pointing out how the emperor was clad rather scantily on that chilly October night:
Me: I don't think I'd happily pay another 100 quid just for a point release.
Nick: Ahh, but it's much more than just a point release.
Me: So what sort of new stuff does it have?
Matt: It doesn't really have new stuff, it has significant improvements on previous features. Better bindings for python and perl, that sort of thing.
Me: Ahh, so it's a sort of... update. Like a... point release?
Tom: I don't think I like you.
(posted 2003-10-25T22:30, link )
- How many jamirs in a kadam? >>
-
While watching my newly acquired Indiana Jones DVDs I was reminded of something that annoyed me in the days before the web. There's a scene in Raiders of the Lost Ark where an old man is reading the inscription on the (Egyptian?) staff headpiece.
Imam: This was the old way. This means six kadam high.
Sallah: About seventy-two inches.
Imam: Wait.
Imam: "And take back one kadam to honor the Hebrew God whose Ark it is."
So if 6 kadam is 72 inches that means one kadam is roughly equal to one foot. And to remove 1 kadam would make the staff 5 feet long (1.52m). But when we see the staff later in the movie it looks to be at least a foot taller than Indy.
So how long is a kadam? Ten years ago I wouldn't have had sufficient motivation to research it. Now I have google.
Kadam (קדם) seems to be the Hebrew for precede but sadly the only references I can find that refer to it as a form of measurement are pages referring to the goof in the movie. It appears in other languages. There's an entry in a Malay dictionary that gives the definition as "sole of foot", which would tie in nicely with the movie's 1 kadam = 1 foot definition.
So it's probably a goof - but whose? Was Sallah's definition of kadam incorrect? Was it the propmaker? I googled for the script to Raiders and found the "revised third draft" dated August 1979. In section 81:
AMIR: It says it is...ten jamirs high...
SALLAH: About seventy-five inches.
AMIR: Wait! I am not finished...
Amir's finger moves across the break as the markings continue on the sun medallion.
AMIR (reading): "And one jamir to honor the Hebrew God whose Ark this is."
Indy, still holding the date, exchanges a long look with Sallah.
INDY: You said their top section was blank. Are you absolutely sure?
Sallah nods.
INDY: Belloq's staff is seven and a half inches short. They're digging in the wrong spot!
So the script originally called for a staff of around 83 inches (or around 210 cm), and in fact later says:
"Indy carries a smooth wooden staff almost seven feet tall."
I guess we can blame a last-minute rewrite for this screw up, and I can watch Raiders again without being distracted by my own stupid nit-picking. Except... what was wrong with "jamirs" anyway?
(posted 2003-10-23T18:10, link )
- Old TLDs never die >>
-
When I was at college I briefly used an address of the form lee@cs.mycollege.ac.uk (CS being the subdomain used by the Department of Computer Science). Anyone familiar with the JANET big-endian domain naming convention will have spotted the mistake I made there. And will understand why meddling with top-level DNS can have unexpected effects.
Even though our college had changed over to the little-endian convention, some mail software, somewhere else in the UK, would assume that my address was written in big-endian notation (since the first part, "cs", was a valid TLD) and helpfully translate my address "back" into the little-endian form lee@uk.ac.mycollege.cs before passing it to a server which would attempt a DNS lookup for "mycollege.cs". An NXDOMAIN would be returned and the mail would bounce. The convention of using the alias lee@dcs.mycollege.ac.uk suddenly started making sense.
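The failure mode can be sketched as follows. This is a guess at the kind of heuristic such software applied, not a reconstruction of any real mailer; the function name and the tiny `KNOWN_TLDS` set are hypothetical.

```python
# A deliberately tiny, hypothetical list of top-level domains.
KNOWN_TLDS = {"uk", "cs", "com", "org", "net"}


def normalise_to_little_endian(address):
    """Reverse the domain if its first label looks like a TLD.

    Mimics a mailer that treats 'tld.x.y' as JANET big-endian order
    and flips it to DNS order 'y.x.tld'.
    """
    local, domain = address.rsplit("@", 1)
    labels = domain.split(".")
    if labels[0].lower() in KNOWN_TLDS:
        labels.reverse()
    return f"{local}@{'.'.join(labels)}"


# Already little-endian, but 'cs' is a TLD, so it gets mangled.
print(normalise_to_little_endian("lee@cs.mycollege.ac.uk"))
# 'dcs' is not a TLD, so the alias passes through untouched.
print(normalise_to_little_endian("lee@dcs.mycollege.ac.uk"))
# A genuinely big-endian address is converted as intended.
print(normalise_to_little_endian("lee@uk.ac.mycollege.dcs"))
```

The heuristic is only safe while no subdomain label collides with a TLD, which is why a newly delegated TLD can break addresses that used to work.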
.cs was assigned to Czechoslovakia in 1990, but by the time I started college the country had split into the Czech Republic (.cz) and Slovakia (.sk) and the TLD was obsolete. By 1995 .cs wasn't used at all.
And I imagine, beyond references in archives, the last vestiges of the old JANET naming would have been purged from the Internet by now. But who knows, maybe there's a crufty old hack, still running on a server somewhere that assumes a big-endian format and only converts to little endian if no match is found. It could have been running unchecked for a decade, undetectable but for a greater number of failed DNS queries. Possibly still reliant on the continuing non-existence of domains under .cs.
Well, we'll find out, 'cause CS is coming back. ISO have redelegated CS to Serbia and Montenegro. It's very probable that IANA (or rather ICANN) will re-assign the cs TLD.
It's controversial, but I don't think it's a bug in ISO's practice (there are only so many alpha-2 representations of a country's name that fit, and usually most are already used), and I don't think it's a flaw of DNS. The problem is that DNS domains don't realistically provide a level of permanence seemingly required by the URL/URI schemes in use right now. Nominet introduced the second-level domain .ltd.uk a few years ago, but received advice that PLCs could not legally refer to themselves as "Ltd". The solution was to also introduce ".plc.uk". This means that if a Ltd company becomes a PLC all of its documents named using the DNS derived namespace must also be renamed.
If it's controversial that a TLD would get recycled after (almost) 10 years of non-use - I'd love to see what happens when they try to pull the plug on SU (the code used by the Soviet Union/CCCP).
After the creation of .ru (Russian Federation) in 1994, the number of domains actually in .su went up and it took on a new status as a sort of autonomous ccTLD, the identifier of a non-existent state. IANA considers it "being phased out". But according to FID.su there are still over 2000 zones that say otherwise. Perhaps the Russians are just "keeping it warm"?
(posted 2003-10-06T14:00, link )
- Bizarre spelling >>
-
I notice the current version of the SVG Tiny spec lists one of the authors as "Benoît Bézaire, Corel Corporation". Normally I'd assume this is a corruption in my browser of "Benoît Bézaire" caused by ignoring standard HTML entities. But wait! This is the W3 we're talking about.
Checking the code shows: Benoît Bézaire. It may be nonsense, but at least it's properly formatted, unambiguous nonsense.
I wonder, does HTML have a standard way of marking up deliberate misspellings? A <span lang="gibberish"> or <sic> tag?
(posted 2003-10-03T16:10, link )
- Scorched-Email Policy >>
-
Salon currently has an article (in its premium section) in which Jakob ("I have stopped using e-mail and hired staff to do it for me") Nielsen proposes his dump-SMTP solution to spam and viruses:
It would really mean to stop accepting e-mail according to all the existing protocols. I think that the only way to do that is if you know enough important people that you want to talk to who stop using it.
My thought for how to implement this: a number of sufficiently big organizations — AOL, Microsoft, the federal government — would have to announce that two years from now no more e-mail will be accepted.
All the companies around the world would have to upgrade.
The reason it's impossible to really upgrade e-mail is that everybody has to upgrade at the same time. The beauty of e-mail — and it has worked fairly well for a long time — is that it's fairly ubiquitous.
I think that it would have to be a system that has built-in security and authentication that you can always track down. You know where it's coming from, and it's always encrypted and always secure.
As I have pointed out previously, you don't need to shift en masse to something new. You can start encrypting and signing email right now. You can stop reading unsigned email whenever you want. You can set up autoresponders that, instead of some challenge-response system, tell senders that un-signed mail is automatically de-prioritised.
SMTP mailservers out there are already using TLS for transport security. If AOL or Microsoft (Hotmail) were to announce that two years hence they would only accept SMTP mail using TLS then that might be the catalyst for better SMTP. And certainly more feasible than replacing it, atomically, with something else.
(posted 2003-10-03T11:22, link )