Alastair’s Place

Software development, Cocoa, Objective-C, life. Stuff like that.

File Ownership and ACLs

So, you ask, these ACLs… how can I read them from my own code? Well, you might think that the acl_get() family of functions would be a good choice, and indeed, they do let you get at the ACL. Unfortunately, what isn’t mentioned is that, in addition to the usual UNIX uid/gid, files with ACLs may well have UUID-based owner and group information. There’s no acl family function to get that, but you’re going to need it if you want to understand what the ACL actually means.

If you look at Apple’s code (for instance, copyfile), you’ll see all kinds of undocumented functions, including such tasty looking morsels as statx_np() and chmodx_np(). Sadly, these rely on further undocumented functions relating to the type filesec_t. All of them are visible (in <sys/stat.h> and <sys/fcntl.h>), but none of them is documented.

At first sight it doesn’t seem that there is a documented way to access this data. But there is. getattrlist() allows you to retrieve ATTR_CMN_EXTENDED_SECURITY, ATTR_CMN_UUID and ATTR_CMN_GRPUUID (and setattrlist() lets you set them as well). Sadly it isn’t at all obvious how you go about getting from the thing getattrlist() returns to something you can manipulate as an ACL with documented APIs. The getattrlist() documentation unhelpfully notes that you get

A variable-length object (thus an attrreference structure) containing a kauth_filesec structure, of which only the ACL entry is used.

Well, you can see that structure in <sys/kauth.h>, but since it’s a kernel data structure, poking around inside it is naughty and you’ll probably get burned when Apple changes it.

How, then, can we get to an acl_t from this thing, without knowing what it is? Well, it turns out that the functions acl_copy_int() and acl_copy_int_native() expect their data in just this format. This is not documented — and it probably should be, given that Apple states in the kernel sources (in vfs_attrlist.c) that

/*
 * We have a kauth_acl_t but we will be returning a kauth_filesec_t.
 *
 * XXX This needs to change at some point; since the blob is opaque in
 * user-space this is OK.
 */

The other thing that it would be helpful to know is that the ACL data returned by ATTR_CMN_EXTENDED_SECURITY is in native endianness. So the function you need is acl_copy_int_native(), and not acl_copy_int(). Again, not documented.

So, to obtain this extended security information, we need to do

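#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/attr.h>
#include <sys/acl.h>
#include <sys/kauth.h>

struct attrlist attrList;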
struct {
  uint32_t        length;
  attribute_set_t returned_attrs;
  attrreference_t extended_security;
  guid_t          owner;
  guid_t          group;
  char            buffer[4096];
} attrs;
int ret;

memset (&attrList, 0, sizeof (attrList));
attrList.bitmapcount = ATTR_BIT_MAP_COUNT;
attrList.commonattr = (ATTR_CMN_RETURNED_ATTRS
                       | ATTR_CMN_EXTENDED_SECURITY
                       | ATTR_CMN_UUID
                       | ATTR_CMN_GRPUUID);

ret = getattrlist (path, &attrList, &attrs, sizeof (attrs), 0);

if (ret < 0) {
  // Handle errors
  ...
}

if (attrs.length > sizeof (attrs)) {
  // Need to use a bigger buffer in this case
  ...
}

acl_t acl = NULL;

if (attrs.returned_attrs.commonattr & ATTR_CMN_EXTENDED_SECURITY) {
  void *data = ((uint8_t *)&attrs.extended_security
                + attrs.extended_security.attr_dataoffset);
  acl = acl_copy_int_native (data);
}

at which point we have the ACL in acl, the owner’s UUID in attrs.owner (if attrs.returned_attrs.commonattr has ATTR_CMN_UUID set), and the group’s UUID in attrs.group (again, if ATTR_CMN_GRPUUID is set).

Likewise, if you want to set an ACL using setattrlist(), you’re somehow expected to know that you can create the blob of data you need with the acl_copy_ext_native() API.

Note that all of these functions and constants are documented, therefore in theory safe to use, though how you are supposed to use them together is not, and because Apple is using undocumented functions there is no good example showing how to use only documented APIs to find this information. Worse, the only example code I could find outside of the OS X sources, namely acl_api_fragment.c, makes use of the undocumented filesec and openx_np functions and is, therefore, a prime example of what not to do.

UTIs Are Better Than You Think — and Here’s Why

If you’re a Mac or iOS developer, you’ve probably heard of UTIs; if you come from some other platform, you probably haven’t, so I’ll start with a quick overview.

UTI stands for Uniform Type Identifier, and it’s used in a similar way to a MIME type (it’s a name for a particular kind of data), but unlike MIME types, UTIs are based on reverse DNS syntax strings (with two notable exceptions, which we’ll get to in a minute). If you want a new MIME type, you either have to register it with IETF, which is time-consuming and a source of uncertainty (they might not agree that you need one), or you can stick an x- prefix on it (but this might create clashes). If you want a UTI, however, all you need do is own an Internet domain; if you own example.com, you also own all of the UTI namespace starting com.example, so you could use com.example.MyDataType.

If you’re familiar with MIME types, you’ll also know that they’re split into two pieces, separated by a ‘/’ character. For instance, text/plain or image/jpeg. The part before the ‘/’ is supposed to tell you what kind of data you’re dealing with. This is useful, but it really doesn’t give you that much information; for instance, if you have a MIME type starting text/, you know it’s some kind of text, but you don’t really know how to decode it, or even if it’s compatible with US-ASCII.

UTIs, on the other hand, have a conformance hierarchy, so it’s possible to tell the system that your wonderful new data format, com.example.MyDataType, is actually a kind of video file. Moreover, you can be specific about it; maybe it’s actually a special kind of MPEG file. How does this work? Well, Apple has reserved all UTIs starting public., and has defined a large set of standard UTIs in that space. For instance, public.jpeg conforms to public.image, which in turn conforms to public.data, which, in turn, conforms to public.item. There’s actually more than one hierarchy (e.g. public.text conforms to both public.data and public.content), and Apple provides APIs that allow you to test any given UTI for conformance with any other UTI.

Additionally, every UTI has associated with it a set of tags. These are things like file extensions, MIME types and Mac OS OSType codes, and each class of tag is identified by (you guessed it) a UTI. So, the UTI for a file extension is public.filename-extension, for a MIME type it’s public.mime-type and for an OSType it’s com.apple.ostype. Again, APIs are provided that allow you to ask, given a UTI, e.g. what the file extension might be for that UTI.

Finally, Apple’s platform allows applications to declare their own UTIs by adding information to their Info.plist files, which makes them accessible to all applications on the system.

So what, you ask? Well, here’s the really clever part; there’s an API that, given a tag class and a tag, will find you the most appropriate UTI. For instance, if you give it public.filename-extension and jpeg, it will return public.jpeg. However, sometimes you’ll have a file that the system doesn’t recognize; you might know something about it (e.g. its file extension), but it isn’t in the system’s UTI database, and no application has mentioned it in their Info.plist. What happens now? Well, if, for instance, you ask for the UTI for a file you’ve found with the extension frob, you’ll get back a rather cryptic looking answer: dyn.age80q6xtqk.

The neat part about this odd-looking UTI is that the information you gave the system is still there. If you ask any Mac or iOS device for the file extension for dyn.age80q6xtqk, it will immediately tell you: frob. You could transmit this UTI across a network, and it would still tell you when you ask, that the extension is frob.

This runs slightly deeper, though. When you ask for a UTI, you can specify that it must conform to a particular type. Maybe you know that your frob file is really some kind of public.text? In that case, you might instead get the answer: dyn.ah62d4r34ge80q6xtqk. You might be able to guess that this new UTI also knows that it conforms to public.text, and again, any Mac or iOS device presented with this UTI will be able to tell you so.

Why is this feature useful? Well, web server administrators will have seen the unusual behaviour of various client systems when their server doesn’t know the correct MIME type for a file; some systems try to guess from the file extension, while others throw their hands up and treat the file as raw data (which is probably but not necessarily the default on any given web server). If the web used UTIs instead, the UTI would preserve all of the information the server did have, which would result in an improved experience in some cases.

What Apple doesn’t tell you is that these dynamic UTIs can actually hold more than just a single tag class. For example, the UTI

dyn.ah62d4r34gq81k3p2su1zuppgsm10esvvhzxhe55c

has a MIME type (text/X-frob), and a file extension (frob) as well as stipulating conformance to public.text. There’s no API on Mac OS X or iOS that will construct this UTI for you, mind… so how did I do it?

Well, the first thing you need to know is that all dynamic UTIs currently start with the string dyn.a. The first part, dyn. identifies the fact that it’s a dynamically generated UTI; the ‘a’ is, I’m guessing, a format identifier (so if you’re parsing these UTIs yourself, you must check that it’s an ‘a’ and not any other character; if it isn’t an ‘a’, you should indicate that you don’t understand the UTI).
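As a minimal sketch of that check in C (the function name is my own invention for illustration, not an Apple API), something like the following refuses anything that does not start with dyn.a and otherwise hands back the encoded part:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the name is illustrative, not part of any Apple API.
 * Returns a pointer to the encoded payload of a dynamic UTI, or NULL if the
 * string is not a dynamic UTI using the 'a' format identifier. */
const char *dyn_uti_payload(const char *uti)
{
    static const char prefix[] = "dyn.a";
    const size_t prefix_len = sizeof(prefix) - 1;

    assert(uti != NULL);
    if (strncmp(uti, prefix, prefix_len) != 0)
        return NULL;    /* not dynamic, or a format we do not understand */
    return uti + prefix_len;
}
```

A caller would treat a NULL result as "I don't understand this UTI" rather than trying to guess.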

Strip that off, and we have a funny looking string, which, it turns out, is encoded in a custom form of base-32 encoding using the encoding vector

abcdefghkmnpqrstuvwxyz0123456789

Let’s decode the dynamic UTI above and see what it contains; feeding the string

h62d4r34gq81k3p2su1zuppgsm10esvvhzxhe55c

through a custom base-32 decoder, we get

?0=7:3=text/X-frob:1=frob
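For the curious, here is a rough sketch of such a decoder in C. It assumes the characters are packed five bits at a time, most significant bit first, and that any left-over bits at the end are padding to be thrown away; that assumption is consistent with the examples in this post, but the function name and interface are mine, not Apple's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The encoding vector quoted above. */
static const char kDynUTIAlphabet[] = "abcdefghkmnpqrstuvwxyz0123456789";

/* Decode `in` (the part of the UTI after "dyn.a") into `out`, which is
 * NUL-terminated on success. Returns the number of decoded bytes, or -1 if
 * `in` contains a character outside the alphabet or `out` is too small. */
long dyn_uti_base32_decode(const char *in, char *out, size_t out_size)
{
    unsigned int acc = 0;   /* bit accumulator */
    int bits = 0;           /* number of valid bits held in acc */
    size_t n = 0;

    assert(in != NULL && out != NULL && out_size > 0);
    for (; *in != '\0'; ++in) {
        const char *p = strchr(kDynUTIAlphabet, *in);
        if (p == NULL)
            return -1;
        acc = (acc << 5) | (unsigned int)(p - kDynUTIAlphabet);
        bits += 5;
        if (bits >= 8) {
            bits -= 8;
            if (n + 1 >= out_size)
                return -1;              /* leave room for the final NUL */
            out[n++] = (char)((acc >> bits) & 0xFF);
            acc &= (1u << bits) - 1;    /* keep only the unused low bits */
        }
    }
    out[n] = '\0';                      /* left-over bits are discarded */
    return (long)n;
}
```

Feeding it ge80q6xtqk (the frob example from earlier, minus the dyn.a) gives 1=frob.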

Hmmm. Let’s look at another example; say we have a custom tag class com.example.SpecialType… let’s generate a UTI with that. Here it is:

dyn.ah62d4r34qr104pxftbu046dqqy1fg6dfqry0c5cytf2gntpgr710e2pw

Yikes! That’s a lot longer. What happened? Well, decoding it, we see

?0=7:com.example.SpecialType=foobar

Aha. So these are key-value pairs with the tag class and tag respectively. What about that funny ?0=7 on the front? Well, when I created this UTI I said it conformed to public.text. What if it conforms instead to com.example.Borked? Now I get

dyn.ah62d425try1gn8dbrz2g23mskm11e45fqu7gg55rf3w1u2prsb0gnpwxsbw0g4pbrvnhw6dfhzxg855cqf3a

which decodes to

?0=com.example.Borked:com.example.SpecialType=foobar

Hmmm. So ?0 means “the UTI we conform to”, right? So ‘7’ must mean public.text.

Well, it turns out that the system has a built-in list of UTIs that get abbreviated to the hexadecimal digits ‘0’ through ‘F’. They are:

0: UTTypeConformsTo
1: public.filename-extension
2: com.apple.ostype
3: public.mime-type
4: com.apple.nspboard-type
5: public.url-scheme
6: public.data
7: public.text
8: public.plain-text
9: public.utf16-plain-text
A: com.apple.traditional-mac-plain-text
B: public.image
C: public.video
D: public.audio
E: public.directory
F: public.folder

Now we can understand the string we saw earlier;

?0=7:3=text/X-frob:1=frob

expands to

?UTTypeConformsTo=public.text:public.mime-type=text/X-frob:public.filename-extension=frob
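That expansion can be sketched in C along the following lines. This is only an illustration under a couple of assumptions of mine: keys are always expanded, values are expanded only for the conformance key (0), and the input contains no escaped characters. The names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The abbreviation table listed above, indexed by hexadecimal digit. */
static const char *const kAbbrev[16] = {
    "UTTypeConformsTo", "public.filename-extension", "com.apple.ostype",
    "public.mime-type", "com.apple.nspboard-type", "public.url-scheme",
    "public.data", "public.text", "public.plain-text",
    "public.utf16-plain-text", "com.apple.traditional-mac-plain-text",
    "public.image", "public.video", "public.audio", "public.directory",
    "public.folder"
};

/* Full name for a one-character abbreviation, or NULL if it is not one. */
static const char *expand_token(const char *tok, size_t len)
{
    if (len != 1) return NULL;
    if (tok[0] >= '0' && tok[0] <= '9') return kAbbrev[tok[0] - '0'];
    if (tok[0] >= 'A' && tok[0] <= 'F') return kAbbrev[tok[0] - 'A' + 10];
    return NULL;
}

/* Append len bytes of s to out, keeping it NUL-terminated. */
static int append(char *out, size_t size, size_t *pos, const char *s, size_t len)
{
    if (*pos + len >= size) return -1;
    memcpy(out + *pos, s, len);
    *pos += len;
    out[*pos] = '\0';
    return 0;
}

/* Expand a decoded dynamic UTI string. Returns 0, or -1 if out is too small.
 * Assumption: escaped characters are not handled. */
int dyn_uti_expand(const char *in, char *out, size_t out_size)
{
    size_t pos = 0;
    int in_value = 0;       /* 0 while reading a key, 1 while reading values */
    int conforms_key = 0;   /* 1 while the current key is "0" */

    assert(in != NULL && out != NULL && out_size > 0);
    out[0] = '\0';
    if (*in == '?') {
        if (append(out, out_size, &pos, "?", 1) != 0) return -1;
        ++in;
    }
    while (*in != '\0') {
        size_t len = strcspn(in, "=,:");    /* token ends at a separator */
        const char *full = NULL;
        int rc;

        if (!in_value) {
            full = expand_token(in, len);
            conforms_key = (len == 1 && in[0] == '0');
        } else if (conforms_key) {
            full = expand_token(in, len);
        }
        rc = full ? append(out, out_size, &pos, full, strlen(full))
                  : append(out, out_size, &pos, in, len);
        if (rc != 0) return -1;
        in += len;
        if (*in == '\0') break;
        if (*in == '=') in_value = 1;       /* values follow */
        else if (*in == ':') in_value = 0;  /* next key follows */
        if (append(out, out_size, &pos, in, 1) != 0) return -1;
        ++in;                               /* ',' stays in the value list */
    }
    return 0;
}
```

Whether the real system also abbreviates values under other keys is something I haven't established, which is why the sketch leaves them alone.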

There are, as ever, a couple of niggles. The first is that tags might in general contain special characters (for instance, ‘=’ signs) that would mess us up. What does the UTI system do about these? It escapes them, that’s what. The set of characters that are escaped is:

, : = \ NUL

The ‘,’ is interesting; it turns out that each of the keys in the string can have more than one value associated with them. That is, it’s legal for the UTI to encode something like

?0=7,B:3=text/X-frob,image/X-frob:1=frob

resulting in something like

dyn.ah62d4r3qkk7dgtpyqz6hkp42fzxhe55cfvy042phqy1zuppgsm10esvvhzxhe55c
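Going in the other direction, building such a UTI by hand might look something like the sketch below. It assumes the encoding is simply the inverse of the base-32 decoding described above, with the final group padded out with zero bits (which matches dyn.age80q6xtqk), and it does no escaping at all, so the caller would have to deal with special characters first. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The encoding vector quoted earlier in the post. */
static const char kDynUTIAlphabet[] = "abcdefghkmnpqrstuvwxyz0123456789";

/* Encode `text` (e.g. "?0=7:1=frob") as "dyn.a" followed by its base-32
 * form. Returns 0 on success, or -1 if `out` is too small.
 * Assumption: `text` is already escaped where necessary. */
int dyn_uti_encode(const char *text, char *out, size_t out_size)
{
    static const char prefix[] = "dyn.a";
    size_t n = sizeof(prefix) - 1;
    unsigned int acc = 0;   /* bit accumulator */
    int bits = 0;           /* number of valid bits held in acc */

    assert(text != NULL && out != NULL);
    if (out_size <= n)
        return -1;
    memcpy(out, prefix, n);

    for (; *text != '\0'; ++text) {
        acc = (acc << 8) | (unsigned char)*text;
        bits += 8;
        while (bits >= 5) {             /* emit every complete 5-bit group */
            bits -= 5;
            if (n + 1 >= out_size)
                return -1;
            out[n++] = kDynUTIAlphabet[(acc >> bits) & 0x1F];
        }
        acc &= (1u << bits) - 1;        /* keep only the unused low bits */
    }
    if (bits > 0) {                     /* pad the last group with zero bits */
        if (n + 1 >= out_size)
            return -1;
        out[n++] = kDynUTIAlphabet[(acc << (5 - bits)) & 0x1F];
    }
    out[n] = '\0';
    return 0;
}
```

Given ?0=7:1=frob, this produces dyn.ah62d4r34ge80q6xtqk, the same answer the system gave earlier for a frob file conforming to public.text.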

If you ask the system about this UTI, you’ll find that it conforms to both public.text and public.image, in spite of the fact that neither type conforms to the other. Unfortunately, Apple only provides a way to copy the preferred tag given a tag class and a UTI, so if you ask for its MIME type, you’ll only get text/X-frob (the first one, as per the documentation).

So, in summary:

  • We’ve seen that UTIs can encapsulate any kind of type information, not just the obvious ones like file extensions. You can define your own type information, entirely orthogonal to the set Apple uses, if you like.

  • We’ve seen that conformance information allows programs using UTIs to determine that (for instance) they know how to handle data with a UTI they’ve never even seen before because it conforms to a UTI they do understand.

  • We’ve seen that it’s possible to generate dynamic UTIs that will remember the information with which they were created even when passed to another system.

  • We’ve seen how these dynamic UTIs are generated, and the mechanism by which they hold the information given when they were created.

If you’re interested in reading Apple’s documentation, you might find their Uniform Type Identifiers Overview informative. Note that the format of the dynamic UTIs is undocumented. If you are going to rely on the information I presented above, make sure you check that they start dyn.a and do not assume that you understand any dynamic UTI that has a different format identifier.

Dear Adobe

Dear Adobe,

Signing up for Creative Cloud was not a pleasant experience. Here is why:

  1. When I tried to go through your payment process, I was left with some flashing boxes and nothing else happened. This is apparently because I made the “error” of using Safari (Apple’s default web browser) to attempt to purchase from you.

    It is evident that you didn’t test your payment process in Safari properly (at least the last time you changed it) before you made it live. This is not acceptable.

  2. You then had me download an installer for “Adobe Application Manager”. This is fine, but it failed, giving the error message

    Leaving aside for the moment the stray capital letter ‘p’ on the word “Please”, this message does not mean anything useful and it is not obvious what to do about it.

  3. When I go to your support website and choose “Troubleshoot Creative Cloud Installation and Download”, I get this:

  4. On further investigation, I was able to locate the installer that had been downloaded, which makes the following claim:

    Leaving aside the presentational inconsistency that the text in this window appears to be white rather than black, and the fact that in 2012 it is not reasonable (especially given the cost of your products) to merely throw your hands up when presented with a case-sensitive filesystem, the fact is that my filesystem is not case sensitive.

    Let me repeat for the hard of hearing:

    MY FILESYSTEM IS NOT CASE SENSITIVE

    Since you (Adobe) most likely won’t take my word for it (though I can’t imagine why), here’s a quick test in a Terminal window to demonstrate:

Now, having tried your software before, and having discovered in the past that the installers are totally broken, I was aware that the fact that I normally log in to the machine as a network user (albeit one with local administrator privileges — as a developer, that’s pretty much a necessity) was most likely going to cause your software to fail. Note: that is not any kind of excuse. It’s your shoddy work that makes it fail, not my choice to operate my computer system in an entirely reasonable, Apple-supported configuration.

I did eventually get your software to install, however:

  • It is abundantly clear that you do not test your installers sufficiently well on Mac OS X. In particular, you need to make sure you address the following situations:

    1. When the user doing the install is not, themselves, an administrator.

    2. When the user doing the install is a network user whose home directory is on a fileserver somewhere.

    3. When the user doing the install has a case-sensitive home directory, but the rest of the filesystem is case-insensitive. THIS IS AN APPLE SUPPORTED CONFIGURATION and is quite common in set-ups where the fileserver is running some other flavour of Unix.

    There is no reason you should not be able to install successfully in all of these cases, even if you can’t be bothered to properly support case sensitive filesystems for your application bundles.

  • It is also apparent that you did not properly test your purchase form with Safari, which is the dominant web browser on Mac OS X.

  • You need to stop throwing your hands in the air when presented with a case-sensitive filesystem. It may not be supported for boot, but it is supported (at least) for people’s home directories and for other disks on the system. Some of these people might want your software installed in one of these other locations, and you should make sure that it works.

    If you are so utterly incompetent that you cannot work out that doing this is actually no more effort than making it work for case-insensitive systems, I have a library for you. Link against it, and all your case-sensitivity woes will go away. Though you may have some security bugs instead, if you don’t think too hard about it.

A result of all of this is that the purchase process was not smooth. As a developer, I was able to figure out what to do to get things to work; most normal users would not have been able to.

Get a Job!

There is a story on the BBC website about a family who will be affected by the benefit cap, presumably to illustrate the argument of those who think that capping benefits is in some way wrong.

I have a couple of observations:

  1. “Ray” is a software developer. Assuming he cannot get a job writing software, he must be a reasonably intelligent chap and should therefore be capable of getting himself all kinds of office work, never mind unskilled labour of one sort or another. Instead, apparently, he has been jobless since 2001.

  2. The breakdown of their spending includes the following items, which, I submit, are luxuries that they have no right to expect the state (in the form of you, me, and everyone else) to provide for them. Namely:

    • Shows. This is listed, but not explained, under “Other”.

    • Entertainment. What Ray does on a Friday night is up to him, but if he doesn’t have the money, he doesn’t have the money. Maybe his friends would care to buy his beer for him, instead of expecting us to buy it?

    • Sky TV (perplexingly listed separately from “Entertainment”). This is justified with the wonderful “We get the Sky Movies package because we’re stuck in the house all week – otherwise we wouldn’t have any entertainment”.

      Of course (a) Ray does not have to be stuck in all week; he could get a job; (b) there is a perfectly good free TV and radio service; (c) there are always books and board games; and (d) if all else fails, there is the public library! I might add that public libraries often lend out films as well as books, just in case Ray has forgotten how to read while sitting on his arse.

    • Mobile phones. I don’t care that Ray says his teenagers will whinge at him if they don’t have them. They can’t afford them because their dad can’t be bothered to find himself a job.

    • “24 cans of lager, 200 cigarettes and a large pouch of tobacco”. Really? 200 cigarettes cost at least £50, right there.

  3. The amount that they will lose in benefits if this cap comes in is less than they are spending on tobacco and alcohol every week (I estimate this at the £20 “Entertainment”, plus £50 for 200 cigarettes, plus £15 for 24 cans of cheap lager and another £8 or so for the pouch of tobacco, which is £93).

The story ends with a quote from Ray: “I see eight people here having to choose between eating and heating.” Personally I see a lazy scrounger who can’t be bothered to go out and get himself a job. Any job. I don’t care if it’s as a software developer or as a damned toilet cleaner, quite frankly.

Stealing Porsches Is a Net Gain to Society (Honest)

Earlier today, Stephen Fry linked to an article by Matthew Yglesias that posits that a little copyright infringement may actually be good for society.

The article makes the usual arguments about the over-estimation of economic loss to copyright holders, who, of necessity, talk about opportunity loss rather than concrete losses. Of course, in practice it’s impossible to come up with a definitive figure owing to the nature of copyright infringement—simply put, infringers don’t tell the copyright owner about their infringement (the only case where that can really happen is with software, and software that makes any kind of effort to do that usually upsets the privacy lobby). My perspective, as a copyright holder, is usually that even if we assume conservatively that only 10% of pirates would pay, given sufficient incentive, it would still represent a sizeable loss on any reasonable estimate of the number of pirated copies of our software.

However, the article does make a few more interesting claims. First, it claims (giving the example of a pirated TV show) that the loss, however large, from infringement is offset by the “$15 to $85 worth of enjoyment” that watching a pirated TV show would create. This, it seems to me, is a bogus argument. A car enthusiast may get £100,000 worth of enjoyment from driving his Porsche; that does not mean that stealing one from the dealer is no longer a loss to society. And it certainly does not make stealing a £50,000 Porsche a net gain to society of £50,000.

It also points out that the loss to the copyright holder is not necessarily an economic loss to society overall, as the infringer may use the money saved to (for example) visit a pizzeria. Again, this argument is suspicious; it seems to me that it would apply equally to mugging… for instance, if I am mugged for £100, which is then spent on burgers, I have lost £100, the burger joint and its suppliers have gained, and by Matthew Yglesias’ argument, society has not lost out overall.

Taken together, these arguments are even more suspect. Not only can I steal a Porsche and have my £100,000 worth of enjoyment, but the £50,000 I saved on buying it can now be spent as well! Society doesn’t lose out at all, and I can claim (as Yglesias does) that the entire £100,000 worth of enjoyment was a gain for society too. Win-win, right?

Yglesias then goes on to say that, because the BBC has yet to release the second series of Sherlock in the United States, he has been downloading it illegally over BitTorrent. Leaving aside for a moment my irritation that, as a U.K. license fee payer, Yglesias has just admitted stealing from me, it seems to me to be difficult to take him seriously when he talks about the pros and cons of copyright infringement if he is also indulging in it himself.

The article proceeds to claim that there’s a “considerable” benefit in forcing copyright holders to compete with “free-but-illegal downloads”, citing the existence of iTunes and Hulu as examples of legal options that he feels might not exist without pressure from piracy. Again, I find the argument rather thin; piracy is essentially identical, economically, to having a competitor who is engaging in dumping. I have yet to hear an economist argue that it would be good if goods and services were stolen and dumped in order to depress the market price. On the contrary, the usual view is that price dumping of any sort tends to force competitors out of the market, and in the case of piracy, the competitors are the people making all of the content that is being dumped.

As for whether or not there’s a problem on the consumer side—as distinct from commercial pirates—I think Yglesias’ analysis is facile. First, the current situation, where there is an excess of entertainment available to the consumer, is a hang-over from the previous situation in which making music and movies was a highly profitable business. There is still a lot of that money in the system and it will take time to drain away.

Second, there is a tendency to under-estimate the scale of the problem of consumer infringement. Talking to ordinary people (and even celebrities like Stephen Fry, actually, whose own income is dependent to some extent on copyright), will rapidly disabuse you of the notion that piracy is not a widespread thing. Many people I have spoken to boast openly about how clever they are to get things for free rather than paying for them. Ordinary people. Not computer whizz-kids, not stay-at-home living-in-mum’s-basement types. Yet everyone always assumes that “it’s just one copy”, “it’s just me”, “the movie/music/software company is rich enough anyway” and so on. In a way, Yglesias has demonstrated that himself; he apparently feels that it’s socially acceptable enough to tell us that he’s illegally downloaded the BBC’s Sherlock.

When piracy was just a case of sharing something with your friends, it was less of an issue for copyright holders. Of course, many of them protested the illegality of doing so, but I think even they knew that it wasn’t hurting them that much overall. The problem is that the Internet has changed “sharing with your friends” to “sharing with anyone who cares to”; the scale has increased out of all proportion.

Finally, I think consumers fail to understand the motives of some of the players in this argument, and many of them end up—effectively—astroturfing on behalf of big corporations who are making a profit from others’ piracy. There is a reason that Google searches for The Pirate Bay still work. There is a reason that registrars providing WHOIS hiding services refuse to stop hiding the details of their customers even when they are egregiously infringing the rights of others. There is a reason that ISPs refuse to enforce their own Terms of Service. None of these things happen in the case of child pornography, but all of them happen for copyright infringement, even when it is blatant.

It is certainly the case that advertising and donations on dedicated piracy sites make money for their operators. Money that should, rightly, be going to the people who produced the copyrighted content that they help to distribute, but which, right now, is going to line the pockets of the operators of the site, of their ISPs and registrars, of payment processors and of advertising networks. SOPA, above all else, appears to be an attempt to curtail that flow of money, and so it is hardly surprising that many of the companies involved are protesting about it, though their PR departments have obviously concluded that it’s far better for their respective images to frame it as a stance on the moral high ground of opposition to censorship rather than admitting their somewhat baser motives.

Wikipedia Blackout

Everyone is probably aware that Wikipedia is blacking out its site today in protest at some new legislation proposed in the United States to discourage copyright infringement (namely SOPA and PIPA).

There are lots of breathless claims all over the Internet about the degree to which these bills will cause harm to the Internet, just as there were with DMCA before them. Indeed, people are talking about how any site might be taken down without notice, how payment providers and advertising networks might be forced to stop providing revenue streams and so on and so forth.

Most of these complaints are from people who have not bothered to read the full text of the bills, and are really just parroting what they have heard elsewhere. The result is that while they may be aware that SOPA could in principle be used to take down a site, they are unaware of the conditions attached to this, namely that:

  • The owner or operator must be committing or facilitating the commission of criminal violations under sections 2318, 2319, 2319A, 2319B or 2320, or chapter 90 of title 18 USC.

  • The site would be subject to seizure in the United States as a result of these violations if its owners or operators were located in the United States.

That is, in order for a site to be subject to take-down, it must already be breaking United States copyright law, and it would already be subject to take-down if its owners and/or operators were in U.S. territory. So, really, this part is just extending existing provisions in U.S. law so that they apply where the domain registrant is overseas. That seems fair enough, frankly, particularly as U.S. registrants might otherwise pretend to be overseas to escape the existing legislation.

Another thing that SOPA and PIPA do that is causing consternation is that they provide a mechanism for those whose rights are being infringed to notify payment processors and advertising networks that they must not process transactions for or make payments to the alleged infringer. This requires a notice similar to the ones specified by DMCA, and, just like DMCA, it is possible for the affected site to file a counter notice. And just like DMCA, if a counter notice is filed, it is the courts that must be used to decide what happens next.

They also create a limited immunity for anyone acting voluntarily to prevent copyright infringement; potential liability to their own customers has been used as an excuse, historically, by registrars, payment processors and others, for continuing to allow their customers to egregiously infringe others’ rights even when their own Terms of Service explicitly ban such behaviour.

There have been all kinds of claims about the technical consequences of SOPA and PIPA, though most of these have been (as far as I can tell) baseless, since neither act makes any kind of stipulation about the technical measures that may or may not be used in its enforcement. I tend to think these are really a case of special pleading from a group of people who are making not inconsiderable sums of money from other people’s copyright infringement and/or are worried that enforcement might create additional costs for their businesses. These are not disinterested parties.

Anyway, regardless of your views on SOPA or PIPA, the blackout by Wikipedia is childish, affects countries other than the United States, whose citizens have no say whatsoever in whether or not the U.S. Congress or Senate pass their respective bills, and in addition has been done in a half-assed way.

For anyone who wishes to browse Wikipedia with Safari today, here’s a Safari extension that undoes the blackout.

Penelope

Today, I became a dad. Welcome to the world, little Penelope.

Congratulations - You Broke the ’Net

It should not have escaped the attention of any U.K.-based website operator or web developer that ICO has been banging its drum about the changes to The Privacy and Electronic Communications (EC Directive) Regulations 2003 and in particular section 6, which has been amended to say

Confidentiality of communications

6.—(1) Subject to paragraph (4), a person shall not use an electronic communications network to store information, or to gain access to information stored, in the terminal equipment of a subscriber or user unless the requirements of paragraph (2) are met.

(2) The requirements are that the subscriber or user of that terminal equipment—

(a) is provided with clear and comprehensive information about the purposes of the storage of, or access to, that information; and

(b) is given the opportunity to refuse the storage of or access to that information.

(3) Where an electronic communications network is used by the same person to store or access information in the terminal equipment of a subscriber or user on more than one occasion, it is sufficient for the purposes of this regulation that the requirements of paragraph (2) are met in respect of the initial use.

(4) Paragraph (1) shall not apply to the technical storage of, or access to, information—

(a) for the sole purpose of carrying out or facilitating the transmission of a communication over an electronic communications network; or

(b) where such storage or access is strictly necessary for the provision of an information society service requested by the subscriber or user.

ICO is emphasising the impact of these rules on cookies, but as you can see from the text of the actual regulations, above, they also cover any “information stored”. This would seem to include

  • The User-Agent string

  • The Accept-Language header

  • The URL itself (which may be covered by the exception in (4)(b), but not if it happens to contain session data)

  • Various information that is accessible to JavaScript on the client side, but which may be of interest to the server merely to improve the end-user experience: for instance, the end user’s display size or colour depth, whether or not Adobe Flash or Java is installed and enabled, whether or not the end user is using a screen reader, and so on.
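To make the first three items concrete, here is a minimal Python sketch of what a server can read from a single request before any cookie has been set. The request itself is made up for illustration (the host, the browser name and the session parameter in the URL are all hypothetical), and whether reading these values counts as “access to information stored” is my reading of the regulation, not something ICO has confirmed.

```python
import io
import http.client

# A hypothetical request, roughly as a browser might send it by default:
# no cookies, no script running, nothing the user was asked about.
RAW_REQUEST = (
    b"GET /products?session=abc123 HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"User-Agent: ExampleBrowser/1.0\r\n"
    b"Accept-Language: en-GB,en;q=0.8\r\n"
    b"\r\n"
)


def default_information(raw):
    """Return what the server learns from the request line and headers alone."""
    stream = io.BytesIO(raw)
    # The first line is "METHOD target VERSION"; the target is the URL path.
    _method, target, _version = stream.readline().decode("ascii").split()
    # parse_headers() reads header lines up to the blank line.
    headers = http.client.parse_headers(stream)
    return {
        "url": target,
        "user_agent": headers.get("User-Agent"),
        "accept_language": headers.get("Accept-Language"),
    }


info = default_information(RAW_REQUEST)
```

Nothing here required the site to store anything on the user’s machine; the browser volunteered all of it.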

It is difficult to argue that the exceptions in (4) apply to all of this information, yet in most cases it would be unreasonable to demand explicit consent from the end user for any of it.

Further questions surround the use of services like Google AdWords’ conversion tracking functionality; websites using this feature of AdWords are relying on the AdWords system setting a cookie on the end user’s machine when they click on an advert. This cookie isn’t actually under the control of the site operator; instead, it’s set by Google (via the googleadservices.com server). How is “informed consent” supposed to operate in that case? It isn’t as if AdWords conversion tracking is the kind of thing that anyone should be worried about; all it does is tell the person paying for the advertising how much each advert-driven sale is costing them.

ICO also rightly points out that the legislation applies to session cookies. Yes, you did read that right. And looking at ICO’s updated guidance it’s hard to get the impression that they plan on ignoring that fact.
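For anyone unsure what separates a session cookie from any other kind, the difference is small: a session cookie is simply one sent without an expiry, so the browser drops it when it closes. Here is a rough sketch using Python’s standard http.cookies module; the cookie names and values are invented for the example.

```python
from http.cookies import SimpleCookie


def make_cookie(name, value, max_age=None):
    """Build the value of a Set-Cookie header.

    With no max_age the cookie has no expiry attribute, which makes it
    a session cookie; with max_age it persists for that many seconds.
    """
    cookie = SimpleCookie()
    cookie[name] = value
    if max_age is not None:
        cookie[name]["max-age"] = max_age
    return cookie[name].OutputString()


# Hypothetical cookies: a login session and a one-year display preference.
session_cookie = make_cookie("sid", "abc123")
persistent_cookie = make_cookie("prefs", "dark", max_age=31536000)
```

As far as the text of regulation 6 goes, both of these are “information stored” in the user’s terminal equipment, which is why the session cookie gets no special treatment.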

Ironically, the regulations are actually worse for free services than they are for paid-for services, because the definition given for an “information society service” in The Electronic Commerce (EC Directive) Regulations 2002 is

“any service normally provided for remuneration, at a distance, by means of electronic equipment for the processing…”

and so exception (4)(b) doesn’t apply where remuneration would not normally be expected!

There’s some very woolly language in the ICO guidelines about what ICO considers would and would not fall within the exception, but even if ICO doesn’t think something is worth pursuing, there’s nothing stopping some crazy privacy campaigner from pursuing a private prosecution.

ICO does quite clearly say that you can’t rely on the availability of the “Do Not Track” header and associated browser preferences, contrary to the previous mutterings coming out of government on the issue.

I tried writing a letter to government to suggest some changes to the legislation that would provide some sanity, for instance by explicitly permitting the use of information sent by default by the user’s browser (like the User-Agent string), along with exemptions for session cookies and non-identifying properties of the user’s terminal equipment. In response, I was told that it wasn’t possible to change the law because that would require renegotiating at EU level—not an option at present, apparently. (I note, by the way, that the Danes apparently do not agree that additional work at EU level is necessary, since they have explicitly exempted session cookies, which cures at least one of the problems.)

At present, then, the European Union has broken the web. It turns out that most EU countries have been so slow at implementing the law that this hasn’t been a problem so far, but that situation won’t persist forever.

All of this could have been avoided had the EU actually consulted someone with sufficient technical expertise before changing the law. I made that point in my letter, and was told that various industry players had been consulted (the response listed Google, Apple and others), but it seems to me highly improbable that any competent technical expert would not have objected to the wording from the EU Directive, so my guess is that this consultation was after the fact.

Right now, the simplest thing seems to be to incorporate outside the European Union, and have the new entity run your company website. That would place both the site and the entity operating it outside of this idiotic piece of legislation and the regulators whose job it is to enforce it.

Jo

Oh, where to begin…

I’ve been very quiet about all of this; partly for reasons I’ll explain below, partly because I’ve been busy, and partly because, well, it’s just too amazing for words.

Software development is, as has been decried many times, horribly male-dominated. Even more so than engineering in general, and that is pretty male-dominated itself. All of us would love for this not to be the case, but it is, and for the moment at least, however much effort all of us put in to changing the demographics, we have to live with it.

One result of this is that we’re very unlikely to meet anyone at work, and worse, those women we do meet are very probably quite fed up of the unwanted attention they reportedly get from some of our number, are therefore on the defensive from the outset, and in any event are quite probably already spoken for.

Anyway, the upshot is that a fair number of us end up on online dating sites. In my case, the first one I tried was match.com, but I came to the conclusion that it’s basically a rip-off; yes, I had a few actual dates, and even an actual relationship, but even using the site makes my skin crawl… sharp practice doesn’t even begin to describe the way match.com and others treat their customers. In many cases you won’t get replies to your messages for the simple reason that the person you sent them to hasn’t paid to be able to send e-mail. That’s right, both parties have to pony up in order to get a response. Either that, or you can buy the right to get replies from people who haven’t paid up, but that, as you might imagine, tends to be an expensive extra. To my mind, the sector needs regulations to protect customers from this kind of thing; if you pay to send messages, it should include the right to receive replies, end of story. Anything else is bilking the customer.

On Twitter, I’d heard about another dating site, OkCupid, which doesn’t force you to part with large sums of money before you can contact one another and which treats its members more reasonably. Obviously, since it’s free, it’s ad-funded (i.e. you are the product), though you have the option to pay to disable the ads if you find them objectionable. I should say, since I’m singing their praises somewhat, that OkCupid has since been bought by match.com, which may or may not have had an impact. It did result in the post about paid online dating that I linked above being removed, although OkCupid insisted that that was because it wasn’t accurate, rather than because match.com had asked for it.

So, around the end of last year, I had an e-mail on OkCupid from Jo, to which I replied asking if she’d like to go out for a meal with me at the excellent La Regatta in Southampton. Very convenient as she was living on the Isle of Wight at the time, and it’s right opposite the ferry terminal. Owing to the bad weather, we postponed our date until the New Year, but it’s something of an understatement to say we got along like a house on fire; we basically forgot to eat, we were so busy chattering. Our second date wasn’t really much different, though we did actually manage to eat something! I really don’t have the words to describe how much I love Jo; she’s the best friend I’ve ever had, the most wonderful company, and I don’t know now what I’d do without her.

Anyway, a couple of months later, I proposed, and she accepted.

So the first piece of news (to readers of my blog) is that I’m getting married, next July.

The second piece of news — in some ways even more amazing — is that Jo and I should be having our first child some time over the next couple of weeks. We’re both over the moon about this (Jo especially, as she wasn’t sure it was possible), and hopefully there will be another blog post soon enough to welcome a new life into the world.

(In an ideal world, I would have preferred to do this the other way around, but when I met Jo at the start of this year, she was separated but not yet divorced. This is also why I haven’t mentioned anything up to now — neither of us wanted to do anything that might upset the divorce proceedings.)

Jo, if you’re reading this, I love you so much.

3-D Secure — How Not to Do It.

A typical Verified by Visa form

If you’ve done any shopping on the Internet in recent years, chances are that you’ve happened across the joy that is 3‑D Secure (aka Verified by Visa or MasterCard SecureCode). This is a system that can be adopted by your bank, supposedly to provide you with additional reassurance that your card details cannot be used fraudulently by a third party to make purchases on Internet sites.

You’ll know if your bank has “enrolled” your card for this scheme because when you make your purchase you’ll very likely be presented with a screen like the one on the right.

Unfortunately, 3‑D Secure is still, in 2011, ten years after it was first launched, a total disaster. Why? Well:

  • Some banks don’t tell their customers about it, but have still signed all their cardholders up to the scheme.

  • Some banks’ implementations ask cardholders for things they frankly shouldn’t (for instance, in the United States, the customer’s Social Security Number). This frightens cardholders, because they have been told never to enter these details into a website because of the risk of identity theft.

  • Typically there is no way to proceed with the purchase without using the 3‑D Secure form; all you can do is use it or cancel. This is often the case even when the user is being prompted to sign up for 3‑D Secure, and as a result some customers abandon their purchase.

  • Banks generally outsource their side of 3‑D Secure, which means that the end user is seeing a page from a third party. Of course, current recommendations from Visa and MasterCard say to use an HTML iframe anyway, so maybe they don’t see that, but if they do have the inclination to check it out, they may very well panic anyway.

  • Customers simply don’t expect to suddenly see a page displaying their bank’s logo while trying to pay for something. This is, of course, made substantially worse by their bank not mentioning to them that this will happen.

  • Some banks’ 3‑D Secure forms are not as concise as the example above and even in some cases require that the cardholder re-enter(!) all of the information they have already given to the site trying to sell them something. Yes, I did say re-enter.

But, more pertinently, passwords are a terrible way to verify customers’ identities. Even assuming the cardholder doesn’t choose the same password they use everywhere else, they’re likely to forget their password (which is very frustrating), and in any event it is susceptible to phishing or keylogger-based attempts to capture the necessary information.

The sad part is that 3‑D Secure itself is actually able to provide any authentication technique your bank cares to use. There is nothing stopping your bank from choosing something a little more human-friendly – for example, showing you pictures of faces and asking you to choose the correct one – or even providing a card reader and allowing your bank card to directly communicate its physical presence to the bank.