As you’ll know if you read the post I linked to above, iDefrag and iPartition are very much in decline; they have been for some time, actually, and I’ve been scratching around trying to work out what I’m going to do instead. Well, back in 2013, I thought I’d found a great new product idea: a product that didn’t yet exist, and that I thought I had the skills to develop.
I was sat upstairs in the study in my old house when my elderly Harman Kardon SoundSticks (the original USB kind, not the newer analogue variety) died. This made me sad - the SoundSticks sounded great, and although the built-in speakers in my iMac weren’t bad, they didn’t sound half as good as decent external speakers with a subwoofer.
It then occurred to me that I had a spare A/V receiver in the attic, along with a full 5.1 speaker set that I wasn’t using. It also occurred to me that the headphone socket on my iMac was actually a mini-TOSLINK port, and since the A/V receiver had optical inputs, I could hook my Mac up to it using an optical fibre and — maybe — get surround sound!
Excited, I got all the hardware together, rearranged my workspace to fit everything in, and turned it all on. And got stereo. Very nice sounding stereo — way better than the SoundSticks, never mind the iMac’s built-in speakers — but stereo nonetheless.
I was, of course, being naïve. It isn’t possible to send 5.1 channel raw PCM over a standard S/PDIF interface, so my Mac was doing the best it could and sending two channel stereo.
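A quick back-of-envelope calculation (mine, not from the original post) shows why: consumer S/PDIF frames carry exactly two PCM channels, and 5.1 raw PCM needs roughly three times that payload, while a compressed AC-3 stream fits comfortably.

```python
# Back-of-envelope: audio payload rates vs. what consumer S/PDIF can carry.
# An S/PDIF (IEC 60958) frame holds exactly two PCM samples, so at
# 48 kHz / 16-bit the payload is fixed at two channels' worth.

def pcm_bitrate(channels, sample_rate_hz, bits_per_sample):
    """Raw PCM payload in bits per second."""
    return channels * sample_rate_hz * bits_per_sample

stereo = pcm_bitrate(2, 48_000, 16)    # 1,536,000 bit/s -- fits
surround = pcm_bitrate(6, 48_000, 16)  # 4,608,000 bit/s -- three times too big

AC3_MAX = 640_000  # AC-3's maximum bitrate; consumer streams are often 448k

print(f"stereo PCM:  {stereo:>9,} bit/s")
print(f"5.1 raw PCM: {surround:>9,} bit/s")
print(f"5.1 as AC-3: {AC3_MAX:>9,} bit/s max")
```

Hence the trick: wrap the compressed AC-3 stream in the two-channel S/PDIF framing, and a receiver that recognises it will decode all six channels.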
That was when I had my bright idea — I could write a Dolby Digital (aka AC-3) encoder that took 5.1 channel audio from Core Audio on my Mac, compressed it in real time, and squirted it out over the optical interface. I managed to find the necessary specifications (not too hard, because AC-3 is part of various other published standards), and started work.
I was, of course, aware that I’d have to license the AC-3 codec from Dolby Laboratories, so I also started talking to them about that while I worked on my encoder.
Well…
The folks from Dolby were very nice, and quite helpful, though it was clear that they weren’t really set up for license applications for the kind of product I wanted to make. Months passed, and we were still talking to each other; meanwhile I had the software side of things pretty much working. Eventually, I was given a license agreement to look through, and that’s where things really unravelled.
To explain: AC-3, like the competing DTS standard, is a non-optional part of various other standards, including ATSC, the DVD and Blu-ray standards and so on. As a result, it is licensed under terms described as “Reasonable and Non-Discriminatory” (aka RAND).
Now, that sounds great, right? It means, surely, that the terms are reasonable and that I, as a small software developer, will get the same terms as (say) Sony. Well, no, not quite.
What it actually means is that the terms are reasonable for the set of licensees who were expected to apply when the agreement was written, and that everyone is offered the same agreement; it doesn’t mean that the same terms within that agreement apply equally to everyone.
There were two problems; the first was that the license agreement tried to distinguish between “professional” use (i.e. content creation software and hardware, which is typically very expensive) and “mass market” use (i.e. people who make DVD players and the like), by charging different amounts per unit depending on the volume of units shipped. Sounds reasonable, right? Well, yes, until some upstart comes along with the idea I’d had, expecting to ship relatively small numbers at a relatively low cost. I can’t be specific about the licensing costs (they’re under NDA), but the numbers didn’t work.
The second problem was that the license fees increase every year with U.S. inflation. So even if I could just about stomach the initial per-unit fee (and to do that I’d have had to charge a lot more than the $20 per unit I had envisaged), in a few years’ time I’d simply have to stop selling the product because it wouldn’t make economic sense. And in the meantime, Dolby Laboratories would see almost all of the profit from my work.
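To see why an inflation-indexed per-unit fee is fatal to a fixed-price product, here’s a toy calculation. The $8 starting fee and 2.5% inflation figure are invented for illustration (the real figures are under NDA); only the $20 price comes from the text above.

```python
# Illustration with invented numbers (the real fees are under NDA): a
# per-unit royalty indexed to inflation, against a price held at $20.

def fee_after(years, initial_fee, inflation=0.025):
    """Per-unit fee after compounding annual inflation uplifts."""
    return initial_fee * (1 + inflation) ** years

PRICE = 20.00  # the envisaged selling price (from the post)
FEE_0 = 8.00   # hypothetical initial per-unit royalty, made up for illustration

for year in (0, 5, 10, 15):
    fee = fee_after(year, FEE_0)
    print(f"year {year:2d}: fee ${fee:5.2f}, gross margin ${PRICE - fee:5.2f}")
```

The fee compounds while the price, in a competitive software market, can’t simply rise to match, so the margin is squeezed a little more every year.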
At this point, I had a functioning piece of software, which worked really nicely for me in my study, but I couldn’t even give it away because it infringed Dolby’s patents, and I couldn’t license those patents because the cost was prohibitive. I asked Dolby if there was any way they could vary the terms to make it work, and, to their credit, they did go away and think about it, but eventually came back with the answer that they were unable to do so because of their “Non-Discriminatory” obligation — they could only offer me the same license they offered everyone else.
I managed to salvage some of the work I’d done — notably the new image-based licensing system, which was included in newer versions of iDefrag and iPartition — but most of it languished on my disk. It had taken me about a year’s work to get to this point.
I was upset. Now, I’d made a mistake in that I’d worked on the product before finalising the licensing — but then if it had worked out, that would absolutely have been the right choice, as I’d have been in a position to release it the moment the license was signed.
In retrospect, I should perhaps have realised that this problem existed; I had heard that some PC sound card vendors had AC-3 encoding support in their hardware, and that they had started trying to charge their customers extra to enable it.
Anyway, fast-forward to 2018. Sales of iDefrag and iPartition are falling away, and it’s getting to the point where I can’t pare my company back much more without actually shutting it down. And that’s what I was considering doing, as recently as two weeks ago; I’d intended to get to the end of this financial year (31st March) and then close down. It was looking like the end of my 14-year run of working for myself — as I have a family to support now, not to mention a wife doing a Master’s degree, there didn’t seem much option.
And then I saw a tweet. Just a small thing, noting that AC-3 was no longer “patent encumbered”. My heart leapt. Sure enough, I found evidence that the core AC-3 patents expired on the 20th of March 2017. I could ship!
And so, finally, Aura was released, today, some four years after I had something I could have shipped, but was stopped by “Reasonable and Non-Discriminatory” licensing from doing so.
It’s been quite a journey, this one.
Hah. Apparently Apple has quietly phased out the optical outputs on its newer models (anything made in 2016 and later, by the look of things). Figures.
Understandably, today, there’s a lot of anger on social media, particularly on the left, about the tax evasion and tax avoidance that the BBC’s Panorama current affairs programme detailed in its latest episode. The thing is that a lot of the anger is misplaced, and, to be perfectly blunt about it, a lot of people are being manipulated by politicians greedy for more government cash into thinking that this is about schools or hospitals closing versus tax revenue being collected.

Those same politicians don’t mention, of course, that they spend taxpayers’ money on guns, bombs, five star hotels, art for their offices, expensive office furniture and so on. I’m not necessarily opposed to those things, I might add, but they’re a lot less popular with the public than schools and hospitals, and so if you put yourself into the mindset of a politician who would like extra money to spend on his or her personal priorities (which may or may not include a fancy office chair, for instance; or extra “trade envoys” so their chums can enjoy a few foreign junkets at taxpayers’ expense), you’re always going to play the schools and hospitals card here.

You might, if you’re a political leftie, also go on about “cuts” to people’s benefits — regardless of whether or not the current government has cut benefits, some people are likely to be receiving less than they were for whatever reason (rule changes, changes in personal circumstances, etcetera), and those people will suck up your argument, even if it is basically untrue.
Another fact that isn’t often mentioned, unless you talk to an economist anyway, is that there’s an underlying assumption that the money would be better used if paid in taxes to government. A case in point here is Apple. I won’t for one moment claim that how Apple currently arranges its tax affairs is anything short of outrageous — though I have a different view on it to the one being loudly expressed across the Internet today, which I’ll go into below — but in purely objective terms, the goal here should be to maximise the benefit to citizens, across all measures.

So, for instance, one could argue that Apple having a lot of cash benefits millions of people because Apple spends a lot of that money on innovation, and that innovation makes the lives of many millions of people, the world over, better. Yes, it may also enrich Apple shareholders, directors and employees, but I’d argue that’s a much smaller effect. On the other hand, if governments had confiscated it as taxation, it’s quite unlikely they would choose to spend it that way. Yes, some of them might spend some of the money on schools and hospitals, or alleviating homelessness or poverty, but let’s be honest here — quite a bit of it would instead be spent on things the public isn’t so keen on.
Note: I’m not trying to defend Apple here. Apple can do that itself. I’m just trying to fill in some of the missing parts of this.
On the subject of Corporation Tax, which is what the fuss about Apple is all about, and which is also at the heart of some of the other tax avoidance schemes we’re talking about here: CT is just a bad tax, pure and simple. The OECD even published a paper about taxation at one point showing that Corporation Tax was the only widely deployed form of taxation that was negatively correlated with growth. It’s also too easy to manipulate, because it’s based on profit calculations — that is, companies can reduce it by appearing on their profit and loss sheets to have spent money or to have lost money on their assets through depreciation or other kinds of loss.

Additional confusion is caused to the public by the fact that companies only exist within the legal jurisdiction in which they were incorporated. In a very real sense, for instance, there is no such company as “Apple”. Rather, there is Apple, Inc (which is in the United States), Apple Europe Limited (in the United Kingdom), Apple Operations International (Ireland), Apple Sales International (Ireland), Apple Distribution International (Ireland), as well as a host of other entities. All of them are separate companies, and therefore separate legal entities, though some may hold shares in others and they likely share some directors too. The thing the public thinks of as “Apple” is not, in a legal sense, real — but instead is projected by the actions of a number of co-operating legal entities in various different jurisdictions. You might say this is a sleight of hand, but it’s how the world works because it’s how the laws passed by our politicians work.
The upshot of this fact is that it’s possible for (for instance) Apple, Inc to pay money to Apple Operations International, which doesn’t change the amount of money “Apple” (the ephemeral thing the public thinks of) has, but does change things for tax purposes because Apple, Inc pays Corporation Tax at the United States rate (high by global standards), while Apple Operations International is taxed in Ireland. Of course, as the Paradise Papers make clear, things are not quite that simple, and some other steps are involved that reduce the Irish tax bill by, basically, paying money to another company in Jersey. But you get the idea.
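As a toy illustration of the mechanics (all numbers invented; these are not Apple’s figures, and real transfer-pricing rules constrain what can actually be charged between related companies):

```python
# A deliberately simplified sketch of profit shifting. The group's total
# cash is unchanged by an intra-group payment, but the tax bill isn't,
# because each company is taxed separately in its own jurisdiction.

def group_tax(profit_us, profit_ie, rate_us=0.35, rate_ie=0.125):
    """Total tax across a two-company group (US and Irish headline rates)."""
    return profit_us * rate_us + profit_ie * rate_ie

# Scenario A: all $100m of profit booked in the US company.
before = group_tax(100e6, 0)
# Scenario B: the US company pays $80m in (say) IP royalties to the Irish
# company, moving that profit into the lower-rate jurisdiction.
after = group_tax(20e6, 80e6)

print(f"tax before shifting: ${before/1e6:.1f}m")  # i.e. 35.0m
print(f"tax after shifting:  ${after/1e6:.1f}m")   # i.e. 17.0m
```

The $80m royalty is the interesting part: it is a genuine cost to one company and genuine income to the other, so both sets of accounts are individually truthful, even though the group’s overall tax bill has halved.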
Now, you might say, as Donald Trump does, that this is all outrageous, that Apple is a U.S. company, and that the billions it has “stashed away” outside of the United States should have been taxable at 35% in the United States and that Something Must Be Done. Or, you might say, as some here in the United Kingdom do, that Apple makes a lot of money here, but doesn’t seem to pay very much tax here, and so some of those billions are “ours” in some sense. But that simply isn’t how the law works.
Nor, and I’m going to be controversial here, is that how we should want it to work. Let me take another example. Imagine you operate a delivery and warehousing company in the United Kingdom, whereby overseas suppliers can pay you to stock goods for them and deliver them to addresses in the United Kingdom. Now imagine that there is a website that sells goods of all types, all shapes and all sizes, and that that website does so in the United Kingdom by hiring your delivery company to hold stock and deliver goods; in other territories they do much the same thing, but with different delivery and warehousing companies. Clearly both the delivery company and the website company will be able to calculate a profit figure (essentially sales minus costs), and so Corporation Tax will be paid at UK rate on the profit made by the delivery company and at some other rate depending on where the website company is incorporated on its profits. Now, let’s say the website company can choose where it incorporates — after all, it’s a website and the Internet is everywhere. So let’s pick somewhere with low tax rates. Luxembourg, say.
Now, the delivery and warehousing company is free to charge whatever it pleases. Obviously if it goes too high, it will lose the website company’s business, and if it goes too low, it will lose money and eventually go out of business, so there are limits, both low and high. Moreover, the high limit, according to standard economic theory, will tend towards the low limit as the level of competition increases — assuming perfect competition, the delivery and warehousing company will be making a profit of £0 on its operations.
All of this is fine and dandy, and nobody would question the right of the owners of the distribution company to operate at the lowest possible cost and even to make no profit at all if that’s what they wish to do. It’s their company, and there is no innate requirement that a company run at a profit (they could even choose to fund its operations by constantly shovelling money at it, though that strategy can’t last indefinitely, as the owners will eventually run out of money).
I’m sure some people can see where I’m going with the situation I just described. So let’s cut to the chase. Let’s call the UK company “Amazon UK Limited”, and let’s call the website company “Amazon EU SARL”. Just, you know, for the sake of argument. And since corporate ownership is rarely straightforward, let’s say the two companies share some directors and shareholders. We might even imagine (though this isn’t how it happened in Amazon’s case) that the website company might eventually decide simply to buy the distribution and warehousing company (in which case, the two separate companies still exist — it’s just that one owns the other). Why does this make it unreasonable for the distribution and warehousing company to run at zero profit all of a sudden? It was fine before I gave them both similar-sounding names, and before they had shared directors and shareholders. Why is it suddenly not OK now?
“Nobody would run a business for zero profit”, I hear you say. Are you sure? What about a family business where the owners are employed by the business? Running at zero profit in that case might make sense — subject to tax legislation not making it much worse to pay salary rather than dividend.
The fact is that “multinational” companies are largely a fiction — legally speaking, they are really groups of national companies that happen to co-operate for whatever reason, often but probably not always because they have the same (or overlapping) ownership or directors. Each of these separate legal entities is separately taxable, in the jurisdiction in which it exists, on its profits, and there’s little you can do to prevent them from paying one another for services, intellectual property licensing and so on, thereby reducing the profit in one jurisdiction and increasing it in another. There are some rules governing payments between companies with shared ownership, so for instance you can’t have company A sell parts to company B at hugely inflated prices in order to reduce profits at company B and increase them at company A, but of course it’s very difficult to prove a value for intellectual property, especially things like brand names, so this is something of a losing battle for tax authorities as long as corporations’ accountants are on the ball.
A final nail in the coffin for Corporation Tax is that while the intent is that it should fall on the owners of corporations, in practice some fraction is, for understandable reasons, borne by their customers and employees instead, in the form of higher prices and lower wages respectively.
What should we do instead? Well, CT is a non-starter. It doesn’t work in a globalised world, it isn’t an efficient tax, it’s poorly understood by the public (which causes resentment when they hear that e.g. Amazon isn’t “paying its fair share”), it forces governments to get involved in and to legislate about the calculation of corporations’ profits, and it is just, in general, not a good idea.
But let’s think for a moment; the goal here was to impose a tax on the owners of the corporation. How do those owners benefit from a corporation’s profits? Well, in two ways:
Through appreciation in the value of their shares.
Through dividend payments and other distributions.
In the former case, we tax the rise in value when they sell the shares. Currently in the United Kingdom, this would be covered by Capital Gains Tax, which is levied at a lower rate than Income Tax, which may or may not be desirable (it’s notionally to encourage investment in businesses). We might consider instead taxing at Income Tax rates, but taking into account inflation when calculating the taxable gain, if any.
In the latter case, these are often taxed through Income Tax, though presently here in the UK we have some special rules for dividends that make them a little more tax efficient. We could abolish those and instead tax them as ordinary income — the original justification for the different treatment was that the money had already been subject to Corporation Tax and that taxing it twice was, essentially, double taxation.
The elephant in the room here is probably overseas distributions or capital gains, and the solution there is quite straightforward: a withholding tax. So, for instance, if a company pays a dividend to an entity (a person or a company) that is outside of the United Kingdom, a UK company should apply income tax at full rate to that money at source. That tax could then be claimed back by the foreign entity if it can show that it has been taxed on the money. There are variations we could consider (for instance, perhaps at most the amount paid in tax in the foreign jurisdiction could be claimed back, even if the UK payment was higher), but the central idea is that you make it impossible to take the money out without paying some tax on it.
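A sketch of how that withholding might work, with illustrative figures (the 35% rate and the amounts are assumptions of mine for the example, not a statement of any existing rule):

```python
# Sketch of the withholding idea described above: tax is deducted at source
# on a dividend leaving the UK, and the recipient can reclaim it, capped at
# the tax actually paid in the foreign jurisdiction.

def withhold(dividend, uk_rate=0.35):
    """Tax deducted at source before the money leaves the country."""
    return round(dividend * uk_rate, 2)

def reclaimable(withheld, foreign_tax_paid):
    """The variation discussed: refund at most what was taxed abroad."""
    return min(withheld, foreign_tax_paid)

withheld = withhold(100_000)            # 35,000 held back at source
refund = reclaimable(withheld, 20_000)  # foreign jurisdiction taxed 20,000
net_uk_take = withheld - refund         # 15,000 stays with the UK
print(withheld, refund, net_uk_take)
```

The key property is visible in `reclaimable`: if the foreign jurisdiction taxed the money at a decent rate, the recipient gets most of the withholding back; if it was routed somewhere that taxed it at zero, nothing is reclaimable and the UK keeps the lot.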
TL;DR:
Corporation Tax is negatively correlated with growth.
Corporation Tax should fall on the owners (typically shareholders) of corporations, but in practice is partially borne by customers and employees.
Corporation Tax is based on profits, and as such is easy to manipulate, particularly for “multinationals”.
To try to prevent manipulation, the rules have become increasingly complex in many jurisdictions — for instance, banning “depreciation” (here in the UK we have “capital allowances” instead), attempting to regulate the prices of “intra-company” transfers (i.e. sales between related legal entities) to prevent “profit-shifting” and so on. This complexity is good for accountants and lawyers, but it’s very bad for smaller businesses (and therefore bad for competition, and thereby consumers) and makes it more likely that loopholes are inadvertently introduced.
The notion that businesses will “play fair” is a nonsense. Big businesses, in particular, have every incentive not to; they have adequate funds and staff to challenge the tax authorities, and the sums involved can be colossal. They can also afford to employ the best and brightest — wages in public service are typically more restricted, and worse, if someone is really good, business may eventually poach them. On the other hand, small businesses are more likely to play fair because they don’t have those resources and want to concentrate on their business, so treating everyone the same way is likely to be advantageous to big business (whether you’re going to come down hard on everyone or not).
It is often asserted that it is “immoral” not to pay your “fair share” of tax. But what is your fair share? Let’s think about that for a moment.
Let’s start by assuming that we all agree that everyone should pay some tax. Not everyone thinks that, and there are details like whether the very poor should pay any tax at all, but we’ll take it as given that, in principle, we should pay some tax.
So, how much tax should you pay? Should it be based on your income? Or on your wealth? What if your income varies substantially from year to year? Is it fair to tax someone who earns £70,000 one year, but will only earn £20,000 the next, in the same way as someone who earns £70,000 every year? What if you have no income but are very wealthy? How does that change if your wealth is illiquid (maybe you own a large country house, or even a small house somewhere expensive like London)? And when we decide what to tax, at what rate should we set the tax? 13%? 35%? 50%? Higher? Should the tax rate increase (or decrease) with the overall amount? Why? Should anyone be exempt from tax for any reason? Should there be an amount you can earn or hold before you start having to pay tax? How should this interact with things like benefits or indeed voting rights?
My goal here is to make you think. This isn’t simple. It isn’t like the question of whether you should cheat on your wife (you shouldn’t, in case you’re wondering).
So what we’ve chosen to do, at least in most countries, is as follows: we elect people, who form a government. They, in conjunction with some kind of assembly, debate the matter and come up with a set of rules, which they pass as legislation. The legislation answers the above questions, at least as far as that country is concerned.
That is, the amount of tax you should pay is defined by the law. If the law says you should only pay £1 in tax, then that is what you should pay. There is no “moral” case for paying more than that amount. Paying less than that amount is evasion, and that is both illegal and wrong — because it’s unfair that everyone else has to comply with the law and you don’t.
“But tax avoiders…” I hear you say. Well, what is tax avoidance? Tax avoidance is really just where you notice that the law says you could pay less tax than someone else thinks you should. Note: it isn’t paying less tax than you owe; it’s really just where the amount of tax you owe is surprising for some reason. Now, aggressive tax avoidance can involve doing all kinds of things that you wouldn’t ordinarily have done (typically this involves companies owning things that you would normally have owned yourself; loans being made where none were necessary; low tax rate investments like pensions investing in assets you sell to them, and so on), solely to leave you in a position where the law says you owe little or nothing in tax. The UK, in common with other jurisdictions, has passed legislation to prevent that, namely the General Anti-Abuse Rule (or GAAR).
I’m in two minds about GAAR. On the one hand, I’m not really a fan of aggressive tax avoidance; yes, it’s legal, but I think where it’s obvious that you’re using the law in a manner different to that which Parliament intended, there’s an ethical problem with that. I won’t do it, even sometimes in cases where my accountant is convinced I should. And I’ve been offered avoidance schemes, which I’ve turned down. On the other hand, it essentially amounts to allowing the tax authorities to decide that the law doesn’t matter, and what does matter is their view of how much you should pay. I’m no fan of that either, and while there are checks and balances in place, my view is simple: it’s up to Parliament to get the law right in the first place.
Much of the aggressive avoidance is caused by Parliament complicating the tax system for political reasons. There are many examples; my favourite recent one is the legislation passed to make the UK an attractive place to make films. This was intentionally designed to provide tax breaks for investors, but many of the vehicles that accountants and tax planners constructed to take advantage of the break have fallen foul of GAAR, apparently because someone didn’t realise quite how large a tax advantage had been handed out.

Yes, there were loans involved — but honestly, that’s quite normal; if you know you’re going to make a profit on your investment, you might well take out a loan in order to make a bigger investment than you could out of your own capital. The problem here was that in doing so, the investors were entitled to a much larger tax break, and could in some cases write off very large amounts of tax. I don’t think you can really argue that this wasn’t what Parliament intended; if it didn’t intend that, then those responsible for the legislation were spectacularly inept.

And, I might add, an unfortunate consequence is that many much less wealthy people became involved, and for them the tax consequences are dire, because the penalties being imposed are based on the total investment, including the money they borrowed, which was often many times the amount they put in themselves. Nor was this “wheeze” failing to result in films — the vehicles in question were responsible for a number of blockbusters, so the tax break certainly encouraged precisely what the government wanted it to. Basically, I’m no fan of Jimmy Carr, and I wouldn’t have invested in the scheme he did, but I’m a little uneasy about saying that people who did were doing anything other than what Parliament intended.
But even simple things like deliberately imposing high marginal rates on “wealthier” individuals create the scope for avoidance. Why? Because you’ve increased the value to those individuals of not paying that money in tax. That can make it worthwhile to do something unusual that wouldn’t ordinarily be viable because of the extra cost — like paying yourself through a limited company (as, it turned out, even civil servants and BBC staff were doing).
So what should we do? Well, the tax system needs to be fair, but we need to recognise that it’s just as unfair to confiscate very large portions of a rich person’s income as it is to do the same to a poorer person, but that, unlike poorer folk, the very rich are in a position to do something about it. Thus, we should resist the “soak the rich” mentality of the hard left, while nevertheless making sure that the matter of calculating the tax that is due is as simple as possible so that there is little room for manoeuvre. So we should also resist the urge of some to craft all kinds of exemptions, special schemes and complicated rules to encourage this and discourage that. Simplification should be the order of the day.
Personally I’m in favour of a flat tax at, say, 35%, with a large personal allowance to cover basic living costs, which could be combined with someone else’s so that a household with one person earning £X pays the same in taxes as a household with two people earning, between them, £X. At the same time, I’d abolish National Insurance, which is far too complicated and is contributing to the business of people unnecessarily being paid through companies. I’d get rid of Capital Gains Tax as a special case (but allow inflation to be used to reduce gains, and the same on savings in the bank, so that you don’t get taxed on inflation), and I’d get rid of the special treatment of dividends too. Similarly, I’d probably look to abolish VCTs, EIS, ISAs and all the other complications — I’d rather people invested because they had money I hadn’t taken off them, instead of investing to stop me taking money away from them, which is what those schemes encourage. And at the same time I’d be looking carefully at the benefit system to see how it could be reformed (I find universal basic income an interesting suggestion in this area, though I’m not sure how it would work in practice).
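As a sketch of that proposal (the 35% rate is from the paragraph above; the £12,000 allowance is a figure I’ve invented purely for illustration):

```python
# A flat tax with a transferable personal allowance: each adult in the
# household contributes an allowance, the allowances are pooled, and the
# remainder of the household's income is taxed at one flat rate.

ALLOWANCE = 12_000  # invented figure, for illustration only
RATE = 0.35

def household_tax(incomes):
    """Tax for a household; one allowance per adult, pooled across incomes."""
    taxable = max(0, sum(incomes) - ALLOWANCE * len(incomes))
    return round(taxable * RATE, 2)

# A single-earner couple and a dual-earner couple on the same total
# income pay the same amount:
print(household_tax([60_000, 0]))       # → 12600.0
print(household_tax([30_000, 30_000]))  # → 12600.0
```

Note how the pooling does the work: without it, the single-earner household would waste the non-earner’s allowance and pay more tax on the same household income, which is exactly the asymmetry the proposal is trying to remove.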
TL;DR:
“Evasion” is where you don’t pay the tax the law says you should. That’s wrong, pure and simple.
“Avoidance” is where the law says you need to pay an amount of tax that somebody finds surprisingly low for some reason. It isn’t illegal, and if something is at fault, it’s the law. You should focus your anger about this on the politicians who get to choose what the law says, not on those people who pay less tax than you think they might otherwise.
Aggressive avoidance is already tackled in many places through a General Anti-Abuse Rule (GAAR). These are problematic, though, because they effectively allow a body to decide that the law itself doesn’t matter as much as their opinion. I’m not a huge fan of GAAR overall — I’d rather the underlying law didn’t provide opportunities that create a need for it.
If someone starts bleating about “schools and hospitals”, you’re being manipulated. Governments spend money on lots of things you probably don’t approve of, in addition to the schools and hospitals everyone likes. The notion that tax avoidance or even evasion is responsible for school closures is nonsense, even at the outside estimates of the amount of tax that is avoided or evaded every year.
It’s also worth being a little more sceptical about some of the more outspoken voices you may hear in the media on this. Margaret Hodge, for instance, the former chair of the Public Accounts Committee, appears to have a very large holding in Stemcor, much of it held in trusts. The facts and figures in that letter are disputed — both Hodge and Stemcor used the word “libellous” in their response — but the fact is that Stemcor is owned and controlled by the Oppenheimer family, of which Hodge is a member, and there is certainly some evidence that it engages in the kind of tax planning of which Hodge has been so publicly critical.
Panorama made a point of mentioning Lewis Hamilton’s jet, on which he apparently hasn’t paid any VAT despite it allegedly being used personally as well as for business. I’m sure he isn’t the only one doing this, I might add — it’s just that Panorama singled him out.
Now, again, this is more complicated than it seems on the face of it. If I were Lewis or his accountant or lawyer, I’d probably point out that much of what Lewis does, including posting pictures of himself on Instagram (or wherever) having a very nice time apparently on holiday, could be construed as “business”, in that Lewis Hamilton is, himself, a brand (in the same way as, for instance, David Beckham). That, and the itinerant nature of Formula 1, makes it quite difficult to separate personal use of his aeroplane from business use, and I imagine that’s what the Isle of Man’s tax officials had in mind when they allowed him to pay no VAT on its import into the European Union. It’s also, as I understand it, not unusual to disregard a small amount of personal use on some items, though the rules are very complicated and I don’t know how they apply to aeroplanes in practice, and it may well be that the Isle of Man got it wrong when it did this.
I’m not sure VAT avoidance or VAT is really in-scope for this piece, as it only formed a minor part of the information coming out from the Paradise Papers, so I’m not going to talk more about it here.
As always with these things, I’d say it’s a good idea to be more sceptical about what you read or hear in the media. Rather than blaming “the rich” or “corporations” for the problem of tax avoidance, you should look to your politicians. They, not the rich and not the corporations, are responsible for setting the law, and a lot of this is caused by baroque legislation made for political purposes and a failure to grasp the nettle of tax simplification.
This turned out to be quite a chore; I didn’t want to use Rez, because it’s deprecated, and the hdiutil udifrez and hdiutil udifderez options are, well, not particularly well documented (not to mention asymmetric unless you’re using the undocumented XML format).
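For anyone attempting the same round-trip, the invocation looks roughly like this, as I understand it; the image and file names here are purely illustrative:

```shell
# Dump the resources attached to an existing image in the (undocumented)
# XML format, then re-attach them -- e.g. a licence agreement -- to a
# freshly built image. The empty string is the resource file name argument.
hdiutil udifderez -xml Original.dmg > resources.xml
hdiutil udifrez -xml resources.xml '' New.dmg
```

Note the asymmetry mentioned above: only the XML format survives a round trip through udifderez and back into udifrez.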
Anyway, it turns out that the documentation on the legacy MacOS resource fork format is in the book More Macintosh Toolbox, though that doesn’t actually define the format of the resources themselves (for that you have to look elsewhere). I haven’t split out the resource fork parsing/generating code, unlike the Alias/Bookmark code, because this really is the only place that the resource fork format is being used now.
]]>To my American friends, I say this: take a deep breath. The fact is that America electing Trump does not mean you’re surrounded by racists, misogynists or homophobes; yes, those people doubtless voted Trump, but they won’t be a majority of his voters any more than you’d imagine that people who voted for Obama were black supremacists.
I would have preferred that you elect Hillary1.
But it will be OK with Trump, however awful he is. Your Constitution was designed to check the power of the executive, and the bulk of the GOP, which now holds a majority in both the House and the Senate, was not united behind Trump. He will face plenty of opposition in both places from Democrats and Republicans alike, and I am sure, and I hope, that the House, the Senate and the Supreme Court will do their best to make sure that Trump’s presidency is not the disaster many fear it will be, either for the United States or for the rest of the world.
Some people will find this surprising; I am definitely right-wing, though I identify as a classical liberal rather than a conservative in the traditional sense, and I disagree with many of the things Trump said during his campaign. If I had my pick of the candidates for the GOP nomination, I’d probably have picked Rand Paul, though I don’t share all of his views either (notably we differ on abortion). ↩
During this interview, Janie was asked to write a linked list; this is probably the second simplest data structure after an array, and her response to being asked about it was to tell the interviewer that she was
“a hacker who learned programming by writing applications rather than learning algorithms and data structures you only use to pass code interviews at corporate entities”
and was slightly incensed when the interviewer responded
“Oh, so you’re not a programmer. You’re more of a management type.”
I think one of the reasons Janie got a bit of push back here (which she talks about in her most recent blog post) is that while she’s right that it’s quite unlikely in run-of-the-mill programming jobs that you’ll find yourself needing to implement a linked list, the implication of her response is that this stuff is hard, that it needs a great deal of learning, and that it will be a waste of her time.
None of that is true.
Put another way: there is a reason they teach this stuff in Computer Science degrees. (I do have a CS degree - well, Information Systems Engineering, which included CS and Electronic Engineering - but I learned a lot of this stuff on my own before starting my degree.)
Let’s deal with the linked list thing first. Even if you know what one is, the chances are very good that it’s the wrong data structure to use. On modern microprocessors, in 99% of cases cache locality is more important than being able to manipulate lists using pointers, so you should use an array instead. Or a CFArray. Or a Python list. Or a C++ std::vector.
If I ever interview you and ask you about a linked list, it’s because you said you had a CS degree and quite probably you failed to answer a question about a more sophisticated data structure I asked you about. Either that, or I’m going to get you to reason about it somehow and the list itself isn’t really what the question is about, and in that case, if you said you didn’t know what one was, provided you didn’t study CS, I’d show you because the point wasn’t the list, right? (If you did study CS and don’t know what a linked list is, you just failed the interview; regardless of whether you’ve ever used one or not in a real program, you were taught about it and you really should know.)
For the benefit of those who don’t know what a linked list is, imagine you want to store the integers 2, 4, 6, 8, 10. You could use an array, but if you wanted to insert, say, 7, into the array, you’d have to resize it and copy data around. On modern architectures, in most cases, that’s actually the right way to implement this, but on older systems, on the less powerful hardware used in embedded systems, or in certain special cases you might instead choose to store the numbers as a chain of linked nodes.
Each number is now stored in a structure with two elements (traditionally called a node); the first is the number, while the second is a pointer to the next structure in the list. This is called a singly-linked list, and it should be apparent that inserting 7 into it is just a matter of allocating a new list node, putting 7 into it, setting its pointer to point at the node containing 8, and then updating the pointer in the node containing 6 to point at it.
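As a concrete sketch, here is that structure and the insertion of 7, in Python (the class and function names are mine, purely for illustration):

```python
class Node:
    """A singly-linked list node: a value plus a reference to the next node."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def insert_after(prev, value):
    """Allocate a new node and splice it in after `prev` -- an O(1) operation."""
    prev.next = Node(value, prev.next)
    return prev.next

def to_list(head):
    """Walk the list from `head`, collecting the values in order."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

# Build the list 2 -> 4 -> 6 -> 8 -> 10 (pushing onto the head, so in reverse).
head = None
for v in (10, 8, 6, 4, 2):
    head = Node(v, head)

# Find the node containing 6, then splice 7 in after it.
node = head
while node.value != 6:
    node = node.next
insert_after(node, 7)
```

Note that no existing node moves in memory; only two references change.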
Obviously with a singly-linked list, if you have a pointer to a node, you can easily obtain a pointer to the next node, but you have no way to go backwards through the list; this also makes it hard to remove a node given just a pointer. The desire to go either way through the list, and also to make node removal as easy as node insertion, leads to the idea of the doubly-linked list, in which each node also holds a pointer to the previous node.
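A doubly-linked node, and the O(1) removal it enables, might look like this in Python (again, the names are mine):

```python
class DNode:
    """A doubly-linked list node: a value plus previous and next references."""
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

def insert_after(node, new):
    """Splice `new` in immediately after `node`, fixing up both directions."""
    new.prev, new.next = node, node.next
    if node.next is not None:
        node.next.prev = new
    node.next = new

def remove(node):
    """Unlink `node` given only a reference to it -- something a singly-linked
    list can't do without first walking to the node's predecessor."""
    if node.prev is not None:
        node.prev.next = node.next
    if node.next is not None:
        node.next.prev = node.prev

# Build 1 <-> 2 <-> 3, then remove the middle node.
a, b, c = DNode(1), DNode(2), DNode(3)
insert_after(a, b)
insert_after(b, c)
remove(b)
```

The price of the convenience is an extra pointer per node and a little more bookkeeping on every insertion.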
There’s also a smart-ass variant of the above where there’s only one “pointer” per node, which consists of the exclusive-or of the pointers to the previous and next nodes, which is neat but unless you’re on a memory-restricted microcontroller you really shouldn’t use it.
By the way, there is a nice variant that I haven’t seen in any textbooks, namely the circular list, which lets you quickly add elements at either end of the list and also simplifies bookkeeping because there are never any null pointers.
In a singly-linked circular version, the last node’s pointer points back at the first node, and we keep a pointer to the last element rather than the first; to insert at the head of the list, we update the last element’s pointer but not the tail pointer, whereas to insert at the end of the list, we also update the tail pointer.
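A minimal Python sketch of that scheme (class and method names are mine):

```python
class CNode:
    """A node in a singly-linked circular list."""
    def __init__(self, value):
        self.value = value
        self.next = None

class CircularList:
    """Singly-linked circular list; `tail.next` is the head, so there are
    never any null pointers once the list is non-empty."""
    def __init__(self):
        self.tail = None

    def push_front(self, value):
        """O(1) insertion at the head: update the tail's pointer only."""
        node = CNode(value)
        if self.tail is None:
            node.next = node          # a one-element ring points at itself
            self.tail = node
        else:
            node.next = self.tail.next
            self.tail.next = node     # new head; tail pointer unchanged

    def push_back(self, value):
        """O(1) insertion at the end: same splice, then advance the tail."""
        self.push_front(value)
        self.tail = self.tail.next

    def items(self):
        """Yield the values from head to tail."""
        if self.tail is None:
            return
        node = self.tail.next
        while True:
            yield node.value
            if node is self.tail:
                return
            node = node.next
```

Using the same splice for both operations and merely advancing the tail for push_back is part of the bookkeeping simplification mentioned above.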
If you ever have cause to implement a linked list algorithm, I strongly recommend using the circular variant. And if you are unlucky enough to turn up for an interview where someone really does want you to show them a linked list, draw that kind and explain to them what the benefits are (no null pointers, simplified manipulation, fast insertion/removal at either end with only a single tail pointer to manage). Well, if you want the job, anyway.
Note that I said “learn about”, not “learn”. You do not need to be able to write a Quicksort or Shell sort routine from scratch and I would never ask someone to in an interview; if you need to do that, you’ll be able to look it up.
The main thing to understand here is the idea of algorithmic complexity. Usually we’re talking time complexity but occasionally someone might care about space complexity too. Complexity is a measure of how expensive the algorithm is, and we typically express it using “big O notation”. Some examples:
Notation | Meaning
---|---
O(1) | The algorithm takes constant time (best possible)
O(log n) | The algorithm takes time proportional to the logarithm of the size of the input (good)
O(n) | The algorithm takes time proportional to the size of the input (OK)
O(n²) | The algorithm takes time proportional to the square of the size of the input (not great)
O(2ⁿ) | The algorithm takes exponential time (bad)
O(n!) | The algorithm takes time proportional to the factorial of the size of the input (really bad)
You may also see people talk about worst case, amortised worst case and average case. Worst case and average case are fairly easy; amortised worst case is where you consider the overall cost of an algorithm over a set of inputs - the idea being that the amortised worst case will be lower if the worst case is hit less frequently.
It’s also important to understand that, in addition to their complexity, many algorithms have a fixed cost, and that there is a general trend towards higher fixed costs for algorithms and data structures with lower time complexity.
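A classic example of this trade-off is searching: a linear scan is O(n) but each step is cheap and it needs no preparation, while binary search is O(log n) but requires the data to be kept sorted and does a little more work per step, so for very small inputs the “worse” algorithm can win. A quick sketch in Python (function names mine):

```python
from bisect import bisect_left

def linear_search(items, target):
    """O(n): examine each element in turn; no preconditions on `items`."""
    for i, x in enumerate(items):
        if x == target:
            return i
    return -1

def binary_search(items, target):
    """O(log n): repeatedly halve the search range, but `items` must
    already be sorted for this to work at all."""
    i = bisect_left(items, target)
    if i < len(items) and items[i] == target:
        return i
    return -1
```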
How is this useful? Well, many languages and runtime libraries make you choose what kind of container to use to hold your data, and this choice can have a noticeable – and sometimes extreme – impact on your program’s run time and memory usage. To help you make an informed choice, the documentation will hopefully tell you the algorithmic complexity (or cost) of the operations on the container. For instance, looking at std::vector::operator[], we can see that its complexity is listed as “constant” (i.e. O(1)), whereas std::map::operator[] lists its complexity as “logarithmic in the size of the container” (i.e. O(log n)).
The C++ STL also has a few other types you could use instead of std::vector, for instance std::deque or std::list. It makes you, the developer, choose, and to make that choice you need some idea of which will be better for your particular application.
That’s a bit painful, and on iOS and macOS, we’re very lucky – Core Foundation’s containers are smart and automatically use an appropriate implementation for the number of items they contain. So, for instance, a small CFArray is basically just a C array, but as it grows it changes into a somewhat more sophisticated data structure that allows fast insertion and deletion in spite of the number of elements it holds. That said, there will still sometimes be occasions where you need to choose between a CFArray and a CFDictionary, and there may be occasions when you need a tree rather than a hash, in which case you might end up rolling your own.
Learning this stuff will take months? You can learn the basics very quickly (hopefully reading the above was quite useful).
I could more profitably spend my time learning Core Data? Yes, maybe, though this stuff will have applications there too.
Those algorithms textbooks are huge and hard to read :-( Well, some of them are, yes. I’d recommend you pick up a copy of Sedgewick’s Algorithms in <language>. It’s available in a variety of different language flavours (I have a C++ copy, but I’ve seen C, Pascal, and Java, and there are probably others too), and it’s short and accessible (lots of pictures and short example programs). Even skimming it will give you at least some idea of where to look when you need to.
If you go for an interview for a job as a programmer, it isn’t unreasonable to expect that someone will ask some questions relating to fundamental algorithms or data structures. If someone does ask, they aren’t trying to discriminate against the underprivileged; they’re trying to discriminate between job applicants on grounds of competence. Even if the question seems irrelevant to what you’re going to do, it’s a good bet that someone who gives a good answer is going to be better at doing the simpler work where you don’t need to know this, and that is something that will factor in to the decision about who to hire. (Of course, that somebody may also be more expensive to hire, so bear that in mind too.)
Now, as I said, I wouldn’t ask in an interview about linked lists per se, unless you say you have a CS degree and you’ve just failed to answer a question I think you should know the answer to, in which case I’m probably trying to decide whether you lied about your degree.
I might ask you to show me how you would search a string (but I don’t expect you to know the best answer off the top of your head; the point is to work through it and see how you react). I might ask about the merits of hash tables (e.g. std::unordered_map or CFDictionary) versus trees (e.g. std::map).
I would, however, take into account your background when thinking about your answer, and if you didn’t know about something I might explain a bit and see what you had to say. The point, often, is about testing your reasoning skills, not about whether you know the answer and can rattle it off.
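If you want to play with that hash-versus-tree trade-off yourself, Python’s dict is a hash table, and the standard bisect module lets a sorted list stand in for an ordered, tree-like container (the standard library has no direct std::map equivalent); names below are mine:

```python
from bisect import insort, bisect_left

# Hash table: average O(1) insertion and lookup, but no useful key ordering.
ages = {}
for name, age in [("carol", 41), ("alice", 29), ("bob", 35)]:
    ages[name] = age

# Ordered container: O(log n) lookup via binary search (insertion here is
# O(n); a real balanced tree would be O(log n)), but the keys stay sorted,
# so in-order traversal and range queries come for free.
ordered = []
for name in ages:
    insort(ordered, name)

def contains(sorted_items, key):
    """Binary-search membership test on the sorted list."""
    i = bisect_left(sorted_items, key)
    return i < len(sorted_items) and sorted_items[i] == key
```

Which one is “better” depends on whether you ever need the keys in order, which is exactly the kind of reasoning an interviewer is fishing for.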
One final word of advice: if you respond to a question in an interview, however silly you feel it is, with snark, you probably aren’t going to get the job. Part of the reason for interviewing people is for both parties to decide whether they’d like to work together, and snark is going to put people off.
]]>OK, some background first. Owing to the increasing level of card-not-present fraud committed via the Internet, and the generally lax security standards of some of the websites involved, the Payment Card Industry Security Standards Council (PCI SSC) was formed and tasked with creating and maintaining a set of security standards called the Payment Card Industry Data Security Standard (PCI DSS).
The idea is a good one, as are many of the rules themselves, though I think it’s legitimate to criticise PCI-DSS for demanding things of smaller businesses that are simply unrealistic. The upshot of this is that smaller companies, and the payment processors who serve their market, wish to avoid the burden of being PCI compliant, but because they know that conversion rates are strongly impacted by being sent to a third-party site for payment, they would also like to design payment flows where a small business is able to take card payments on its own website.
The first attempt at this was to use client-side Javascript to securely encrypt the user’s payment data, and then the payment form itself would be submitted to the merchant’s system, but with only the encrypted blob rather than the original payment details. The downside of this approach is that if something goes wrong with the Javascript code and the HTML form isn’t carefully written, payment details go to the merchant’s server anyway and they are dragged into the scope of PCI compliance.
This method of avoiding having to be fully PCI compliant was “dealt with” in PCI DSS 3.0, which specifically imposes a compliance burden on sites doing the above.
However, PCI DSS 3.0 does allow payment processors to host parts of the payment form on their own servers instead, such that the merchant can embed those parts into the merchant’s own form using HTML iframe tags. This provides the same visual effect, but at reduced risk, because it no longer relies on client-side Javascript to keep the payment data away from the merchant’s servers.
So, that’s the background.
Now, on Troy Hunt’s blog, in the comments, I happened across some remarks from Craig Francis:
This Stripe implementation is insecure as well.
They use an iframe, which is trivial for a malicious hacker to replace if the original website is hacked (often possible as they use old software, FTP, bad passwords, etc - which all gets missed at the basic level of PCI checking, that Regpack also seem to suggest is acceptable).
Troy is right to suggest that you should go to the payment gateway directly to enter your details, at least customers will know who has them.
I’m currently working with Christine at Google to pressure the PCI council into doing something about this.
Craig then linked to this piece on his blog which advocates extending full PCI compliance (technically SAQ-A-EP) to those businesses who are using iframe-based payment systems.
This would, in my opinion, be a huge mistake.
The claim, basically, is that an iframe-based system is insecure because a third party could edit the page in which the iframe is embedded and make it point somewhere else. This is true, and it is a genuine vulnerability.
But what are the alternatives for smaller businesses? Well, the alternative being suggested is that they should send their customers off to a third-party payment processor’s website, have the details filled in there, and then come back again. Those of us who run small businesses that take card details will tell you for nothing that this causes two problems:
Our conversion rate drops. Instantly. Customers don’t like being bumped to another website, which they probably don’t recognise anyway, to make a card payment.
We actually get people e-mailing us to tell us they think they might be being defrauded. Wait, what? Yes, that’s right. Customers don’t expect to be suddenly redirected elsewhere; when it happens, they think something dodgy is going on.
Now, if your goal is to destroy small business and make the huge advantages experienced by big businesses even bigger, that’s a great idea. What it won’t do is improve security. Why? Because passing customers off to a third-party payment website has the exact same vulnerability we were just talking about. The web page that does it could be edited by a malicious third party, and pointed at a different page.
OK, you might say, but in that case you’ll see it in your browser’s address bar. Sure. Do you know the names of every payment gateway on the Internet? No, me neither. So how do you know that the page you’re looking at is a genuine payment processor? If you’re about to utter the words “they have an EV SSL certificate” or “because my address bar is green”, I have news for you: it’s easy to get an EV certificate. Even if we assume that certificate authorities can’t be convinced to issue EV certificates in error, all the certificate really says is that it belongs to the party listed in the certificate details. It doesn’t tell you they’re trustworthy.
So Craig’s assertion that merchants using the iframe approach should be forced to use SAQ A-EP, the more onerous compliance route, is clearly a non-starter. It doesn’t improve security in practice, and has a significant impact on lots of small businesses, most of whom will be forced to use third-party payment gateways, which is not only bad for business but is annoying for their customers too.
It’s also worth pointing out that, assuming we did tighten up this aspect of PCI DSS, there is still nothing stopping someone from setting up a website with a similar name, copying its appearance from a given merchant’s site, and defrauding customers that way. This is exactly the same kind of fraud we’re worrying about here – customers are being sent to a site other than the one they should be being sent to – only now it would be happening via Google, instead of from the merchant’s own (hacked) page. Should Google search suddenly be dragged into scope for PCI DSS somehow? I don’t think anyone sensibly argues that.
This is a hard problem, and the iframe solution is not perfect, but it is an improvement over the client-side Javascript approach and it isn’t significantly less secure than redirecting to a third-party website to perform the payment.
The way forward is probably services like Apple Pay, which is now available in Safari 10, where the browser is responsible for capturing the payment information and sending it securely to the payment processor. Even that is not perfect – hackers could still change the merchant’s site to point at a different payment processor and try to collect money that way.
No.
Nor are completely PCI compliant systems necessarily secure.
PCI DSS compliance means that the system in question ticks all the relevant checkboxes in the latest PCI DSS standard, meets any audit requirements and has the appropriate paperwork in place. There’s a good chance that systems that are PCI DSS compliant are secure, but it isn’t guaranteed.
Why, if your system is secure, would you not want the burden of PCI DSS compliance? Well, unless you think that all small businesses’ websites (and we’re talking about sites here that explicitly avoid touching payment data) need automated audit logs, two factor authentication, sophisticated penetration testing, incident response plans, written security policies, written change control procedures, separate logging servers, and so on, I think you already know the answer to that question.
This is, and has been for some time, the conventional wisdom. It is wrong.
Why do I say this? Simple. The conventional wisdom implies that we should all be using the exact same code behind the scenes (this is often accompanied by claims of the superiority of Open Source implementations as they will be reviewed by many more people). For many people, and for many applications, this thinking leads to using OpenSSL, as it is “tried and tested”, and is Open Source so lots of people must have looked over the code and decided it was good, right? Well, let’s take a look at the huge list of vulnerabilities that have been found in that library, or the comments that the founder of OpenBSD, Theo de Raadt, made about it after deciding to fork it and create LibreSSL instead.
(Fine, you might say, use LibreSSL, or Botan, or Secure Transport, or CryptoAPI, or…; well, yes, that’s kind of my point. But I wouldn’t want to recommend that everyone should use LibreSSL, or Botan, or Secure Transport either. It’s much safer if there’s a mix of software performing this task.)
Heartbleed was only such a big problem because everyone was using the single implementation that contained that bug. Well, almost everyone; some software was using Apple’s Secure Transport, or Microsoft’s implementation (via CryptoAPI), or one of the various other implementations that are floating about. But the overwhelming majority uses OpenSSL, and as a result, a single vulnerability affected everyone, everywhere, simultaneously.
Another implication of this “thou shalt not implement crypto” view is that the set of implementations we presently have should be fixed. Maybe even some of them should go away. After all, nobody should be implementing crypto software (the only exception seems to be if the person quoting this rule knows your name, in which case you’re probably D.J. Bernstein or Bruce Schneier or some such). But that will make matters worse, not better. It will increase the reliance on OpenSSL and make the monoculture worse; and everyone switching wholesale to LibreSSL won’t help in that regard (it might be better in other respects, but that’s another matter). Indeed, it even implies that you shouldn’t be submitting any fixes to OpenSSL, because you can’t possibly be a suitable person to be tampering with cryptographic software.
Now, do I think you, dear reader, should immediately go out and roll your own RSA implementation? No, absolutely not. I am categorically not in favour of everyone implementing their own crypto (or, worse, rolling their own cryptographic algorithm). It isn’t something you can throw together in an afternoon, without carefully researching the subject first, and it certainly isn’t something you should be doing without adequate testing to make sure you haven’t slipped up. There are lots of gotchas in this area that you won’t appreciate unless you go and learn about it first. But what I don’t like about the conventional wisdom on the subject is that it has tended to discourage people who are competent to do so from writing additional implementations, and has created an atmosphere where you’re likely to be yelled at for merely suggesting that it might be a good idea for that to happen.
When I went looking for information about measuring code coverage with clang, I ended up feeling rather confused. I’m sure I’m not the only one. The reason for the confusion is fairly simple: clang supports two different coverage tools, one of which uses a tool with a name that used to be used by the other one!
About half of the posts seem to indicate that the right way to get coverage information is to use the --coverage argument to clang:
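Something along these lines, for example (the file and program names here are illustrative):

```shell
# Pass --coverage at both compile and link time, then run the program.
# Compiling drops a .gcno file; running drops a .gcda file; gcov then
# combines the two into an annotated source listing.
clang --coverage -c mytest.c -o mytest.o
clang --coverage -o mytest mytest.o
./mytest
gcov mytest.c
```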
This appears to produce (approximately) GCOV-format data which can then be used with the gcov command, noting that this is really LLVM’s gcov, not GNU gcov, though it appears to be designed to be broadly compatible with the latter. Older versions of LLVM apparently used to call this tool llvm-cov rather than replacing gcov with it, but that name is now used for a newer, separate tool.
The rest of the posts, including some on the LLVM site, instead recommend using the -fprofile-instr-generate and -fcoverage-mapping options:
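A sketch of the corresponding commands (again, file names illustrative):

```shell
# Build with the newer instrumentation at both compile and link time,
# then run the program; this writes default.profraw to the current directory.
clang -fprofile-instr-generate -fcoverage-mapping -c mytest.c -o mytest.o
clang -fprofile-instr-generate -o mytest mytest.o
./mytest
```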
Instead of outputting GCOV data, this generates a file default.profraw, which can be used with llvm-profdata and llvm-cov. The way to use this file is to do something like:
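Roughly the following, assuming the binary and the default.profraw produced above:

```shell
# Index the raw profile, then use it to show line-by-line coverage.
llvm-profdata merge -sparse default.profraw -o default.profdata
llvm-cov show ./mytest -instr-profile=default.profdata
```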
In case you were wondering: you must pass the raw profile data through llvm-profdata. It isn’t in the format llvm-cov wants, and apparently the “merge” operation does more than just merging.
Also, you can change the name of the output file, either by setting the LLVM_PROFILE_FILE environment variable or by compiling your code with -fprofile-instr-generate=<filename>. This is mentioned in the help output from the clang command, but doesn’t seem to be anywhere in the clang documentation itself.
In both cases, you need to pass the coverage options to the clang or clang++ driver when you are linking as well as when you are compiling. This will cause clang to link with any libraries required by the profiling system. You do not need to explicitly link with a profiling library when using clang.
One final remark: on Mac OS X, gcov will likely be in your path, but llvm-profdata and llvm-cov will not – instead, you can access them via Xcode’s xcrun tool.
Let me start by saying that it has always been the case that most conferences reserved the right to eject you if you were in some way disruptive. As private events, they’re within their rights to do so (at least in Common Law countries), and if they have the appropriate wording in their Terms & Conditions they may not even have to refund your money.
Let me also say that I am not in favour of allowing harassment or other bad behaviour by conference attendees, and I realise that there will be situations (e.g. where there are children present) where the organisers might want to draw attention to the fact that attendees should keep to their best behaviour.
So what is this Code of Conduct thing about? Well, a fair overview is this FAQ by Ashe Dryden, and there’s an example of the kind of thing we’re talking about on this website. To save time, I’d recommend that you go and read those now, then come back if you’re still interested in what I have to say.
OK, you’re back. So why would anyone object to these things?
We’re grown-ups, right? We should all, by now, know how to behave around other people, and for those who don’t, we already have a set of rules that we’ve collectively agreed upon that cover the worst kinds of harassment and bad behaviour, namely the law, plus — as I already mentioned — most conferences already reserve the right to remove you if you’re being disruptive.
I accept, for what it’s worth, that some people might find an explicit set of rules reassuring. Others, me included, do not. Quite the opposite, in fact, for reasons I’ll elucidate below.
It’s commonly asserted that a problem with leaving this up to the law is that the police “don’t have a great history of responding positively”, that complainants may not wish to involve the police and that as a result it might be better for conference organisers to deal with things themselves.
Except… conference organisers are not trained to deal with these types of situations. A lot of this is going to boil down to one person’s word against another, and it’s very easy to allow your own personal biases to determine your response. Police officers are trained not to do that (not always successfully, for sure, but they are at least trained); of course, that does sometimes make people unhappy when they complain to the police, because the police don’t seem to believe them — but that’s a misunderstanding. The function of the police is not to believe or to disbelieve, but to investigate, and where there is evidence, to bring it before a court for prosecution.
That courts of law require high standards of evidence — at least in Common Law countries — is undeniable, and that’s because we’ve collectively agreed that the principle should be that people are innocent until proven guilty.
This is particularly important in some of the areas we’re concerned with here, because of the reputational impact on people subject to allegations of sexism, racism or (worse) sexual assault, and the notion that the response to allegations of that nature might be decided by conference organisers on the basis of a low standard of evidence, without any right of appeal, really worries me.
I know it’s also asserted that “false accusations are… incredibly rare”. I’m happy to believe that. But there is a whole grey area of allegations that may seem true from one point of view yet not from another, and there are even situations where the accused and accusing parties themselves simply don’t know what happened.
Ashe Dryden asserts that a “code of conduct should apply to any event where your attendees may congregate”. This seems generally problematic.
Certainly there are situations where conference organisers might need to get involved; I accept that. But it seems hard to justify extending the Code of Conduct to all activities outside of those organised by the conference organisers.
So, for instance, if someone misbehaves in a bar right outside the conference venue, where there are a lot of conference attendees present, it is totally appropriate for conference organisers to have words with that person. Or, actually, for anyone present to have words with that person. But unless they have broken the law, or upset the bar owner, you won’t be able to ban them from hanging around in that bar, even if you kick them out of your conference. And, furthermore, to the extent that you feel the Code of Conduct may constrain their behaviour, it certainly won’t if you have invoked it to bar them from the rest of your conference.
Equally, it seems preposterous to argue that the Code of Conduct should extend to a shopping trip to a supermarket halfway across town. Or to e.g. a group of attendees who decide to visit a strip club (not my cup of tea, but some people clearly enjoy that kind of thing, and it’s very likely effectively banned in the code of conduct you were thinking of using).
And then there are all kinds of questions about whether the Code of Conduct protects people who are not conference attendees at all, or indeed how it protects people who are conference attendees against those who are not (hint: it doesn’t).
What should be banned? http://confcodeofconduct.com suggests that “harassment includes offensive verbal comments related to… technology choices”! So you could, in theory, be evicted from a conference for making rude remarks about PHP (or, I suppose, for calling someone an idiot for using it). That seems a step too far, for sure.
In fact, while we’re about it, what constitutes an “offensive verbal comment”? Does it have to meet a reasonable person test? Would it be inappropriate to reproduce the cartoons of the Prophet Mohammed? In all circumstances? Are you sure? Does the whole of the community agree? Or if not, does everyone agree to compromise somehow?
And what exactly is “harassing photography”? Some people are very sensitive about having their photograph taken (even accidentally), and others much less so. Who decides? Is there a right of appeal? How many photographs does one have to take before it becomes harassment?
It’s also worth reflecting that quite a bit of that code of conduct would ban many well-respected and enjoyable comedy acts outright.
Again, please don’t misunderstand — I am all for conference staff taking someone aside and explaining that they’re upsetting someone, asking them to please be sensitive to that person’s concerns, and even if necessary warning them that they will be ejected if they continue with their behaviour. What I’m trying to tease out here is that there is a lot of subjective judgement involved, and attempting to codify this in a Code of Conduct is fraught with danger.
You might think that having a Code of Conduct would create some legal certainty for organisers when they do decide to act, but if they use the one at http://confcodeofconduct.com they could be in for a nasty shock. For instance, as it’s currently worded, it bans “offensive verbal comments related to… sexual images in public spaces”, rather than banning sexual images in public spaces as I’m sure its author intended. Granted, it says “harassment includes…”, so we can be certain that the definition is not exhaustive, but in cases where contracts are unclear, Common Law takes the view that they should not be interpreted in a way that favours the party that drafted them. My guess is that in court you’d find that they chose to use the legal definition of “harassment” (whatever that may be) and then added in anything in the “includes” list, in which case if you evicted someone for “following” and that person sued to recover their conference fees (and potentially travel and legal expenses), you might well find yourself out of luck and out of pocket.
Maybe that’s an argument for getting a lawyer to look over them, but IMO, it would have been much better to just put into the Terms and Conditions that the organisers reserve the right to eject attendees for behaviour that the organisers determine to be detrimental to other attendees or to the conference as a whole. I think you’d also want to make it clear what the procedure for doing that should be — who had the right to make the decision(s), whether there was an appeal process, under what circumstances attendees’ money might be refunded and so on. And on that subject, trying to keep hold of the entire conference fee regardless of circumstances is probably a bad idea; the attendee’s credit card issuer is very likely to side with them, so if you’re going to try to keep hold of it you’ll only want to do so in cases where you have solid evidence of their misbehaviour.
Some people may be unpopular, or may hold views that are unpopular. You can certainly discuss this with them in advance if you think it will be a problem, and ask them not to raise their unpopular views at your conference. If they aren’t relevant to the conference itself, they might even agree to that.
Anyway, there are two problems here; the first is that some people appear to claim that the mere expression of a view with which they strongly disagree is some form of harassment, in and of itself. Indeed, there have even been demands to ban certain people from certain conferences on the grounds that people are aware that (or think that) they hold certain views, even if they have promised not to express them at the conference.
The second problem is that unpopular people (or those with unpopular views) are far more likely to be the targets of false — or at least questionable — allegations. I don’t want to pick individual people as examples, so I’ll stick to generalising here: if a well-known feminist makes a joke about men, it’s quite unlikely that anyone will complain, and even if they do, quite unlikely that anyone will do anything about it. If, however, a similar joke about women were made by a man, I would expect there to be complaints, and I would expect that Something Would Be Done. (I’m not trying to be anti-feminist here; I’m just observing that, right now, at least in tech circles, a fairly muscular form of feminism is popular, and making any remark that conflicts with or disagrees with that is not.)
It’s also worth reflecting that the first problem includes things like the views Roman Catholics or Muslims hold about homosexuality, which certainly for some people meet the definition of “offensive verbal comments related to sexual orientation”. While one might argue that people who hold those views should keep them to themselves for politeness’ sake (and indeed most do), if someone knows that they hold those kinds of views, they might be tempted to try to goad them into expressing them in order to trigger the Code of Conduct and get rid of those people from the conference.
The irony here is that the intention of advocates of Codes of Conduct is generally to protect minorities, but that in practice they may in some cases achieve the opposite.
Sometimes it might actually be appropriate to prioritise freedom of speech over someone else’s right to not be offended. Sometimes it’s better to let people debate points of view that they may find challenging or even downright offensive.
I grant you that at most technology-related conferences, this won’t be relevant, but I find Ashe Dryden’s assertion that this point can be addressed by stating that “free speech laws do not apply to harassment” overly simplistic, even leaving aside the obvious point that the United States Constitution, wonderful as it is, doesn’t actually apply over most of the surface of the Earth. It occasionally does all of us good to hear views we don’t like or agree with, even views we find offensive, if only because it makes us think.
(FWIW, I can imagine that this might become a problem if you wanted to have a conference talk or panel about gender politics in technology, which is something of a live issue at the moment; it’s very likely going to involve, one way or another, things that someone or other feels are “offensive verbal comments related to gender”. If you think not, imagine inviting e.g. Milo Yiannopoulos to debate with Brianna Wu, assuming you could get them to sit on the same stage.)
All of this is only my opinion, and hopefully I’ve explained above why I think this way.
Organisers certainly should have procedures to deal with poor behaviour by attendees, or with situations where one attendee is upsetting another somehow.
It would be wise to put these procedures in the Terms and Conditions.
It would be wise to train conference staff to follow these procedures (e.g. insisting that they report complaints up the chain to organisers until they reach someone you trust to deal with them sensibly).
Trying to codify what constitutes good or bad behaviour creates problems, and it’s probably better to use very general language in your Ts & Cs, instead of trying to write an explicit Code of Conduct.
If someone breaks the law, or is alleged to have done so, you really should consider letting the police deal with it, whatever your opinion of their effectiveness might be.
Attendees outside of your venue will be exposed to people who are not at your conference and are not subject to your Code of Conduct at all anyway (this potentially includes anyone you kick out for violating your Code of Conduct). As such, Codes of Conduct do not “protect” attendees. At best, if carefully drafted, they may protect conference organisers from future lawsuits.
Whether you have a Code of Conduct or not, you should consult a lawyer to avoid creating problems for yourself down the road.
There is nothing wrong with telling your attendees you expect them to behave themselves, drawing to their attention the fact that there are children present, telling them that you expect them not to stream pornography over the conference WiFi and so on. This is not the same as having a formal Code of Conduct.
Codes of Conduct mainly protect the conference organiser (and only if they are carefully worded); they don’t protect attendees. Defining what is and is not acceptable is hard, and boils down to subjective judgement anyway. Better to put procedures in place, stick them in your Ts & Cs, and train conference staff appropriately.
But on OS X, this doesn’t work. Moreover, the `symbolicatecrash` Perl script that iOS developers could use as an alternative doesn’t understand OS X crash logs and so will refuse to process them.
You could try using Peter Hosey’s Symbolicator package, but it’s a bit buggy — looking at the code, Peter has misunderstood the “slide”, and it also can’t cope with Xcode archives containing multiple dSYMs. I did contemplate fixing it and submitting a patch, but while I don’t want to be unkind to Peter, I think I’d end up rewriting rather too much of it in the process.
You could also try LLDB’s symbolicator, which you use like this:
$ lldb
(lldb) command script import lldb.macosx.crashlog
"crashlog" and "save_crashlog" command installed, use the "--help" option for detailed help
"malloc_info", "ptr_refs", "cstr_refs", and "objc_refs" commands have been installed, use the "--help" options on these commands for detailed help.
(lldb) crashlog /path/to/crash.log
This is actually really rather neat, or it would be if it worked. Unlike other symbolicators, it annotates the backtrace with your actual source code (and/or in some cases disassembly) so that you can see where the crash took place. Additionally, if you run it as above, within lldb itself, it will set up the memory map as if your program was loaded. Very cool.
You will note that I said if it worked. Because, out of the box, it does not. The first problem is that the version shipped by Apple relies on a script, `dsymForUUID`, that is not provided and whose behaviour is not documented anywhere. I wrote something that should be suitable and put it up on PyPI so you can install it with e.g.
$ sudo -H pip install dsymForUUID
(But wait… you might not need to.)
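For the curious, the core job of a `dsymForUUID`-style script is conceptually simple: given a UUID, locate a matching dSYM bundle. Here is a minimal, hypothetical sketch of that lookup using Spotlight via `mdfind` (macOS only; the real tool also emits a property list with considerably more detail, and the function name here is my own invention):

```python
import subprocess

def find_dsym(uuid):
    """Ask Spotlight for a dSYM bundle whose UUID matches.

    Returns the first matching path, or None if nothing is found
    (or if we're not on macOS, where mdfind doesn't exist)."""
    query = 'com_apple_xcode_dsym_uuids == "%s"' % uuid.upper()
    try:
        out = subprocess.run(["mdfind", query],
                             capture_output=True, text=True, check=False)
    except FileNotFoundError:
        # mdfind is not available; not running on macOS.
        return None
    paths = [p for p in out.stdout.splitlines() if p]
    return paths[0] if paths else None

print(find_dsym("00000000-0000-0000-0000-000000000000"))
```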
The second problem is that it’s also a bit broken. It chokes on some crash logs because they contain tab characters rather than spaces, and it also only loads the `__TEXT` segment in the correct place, which makes for a bit of fun if you need to poke around in one of the other segments.
Anyway, I filed a bug report today about all of this, with a patch attached to it that fixes these problems. I’ve also put a copy of the fixed crashlog.py file here so you can download and use it.
In addition to the usage shown on the lldb website, you can, in fact, invoke it directly from the Terminal prompt, e.g.
$ crashlog.py /path/to/crash.log
which is a very convenient way to use it in many cases. Likewise, if you want to use this version rather than the built-in one, you just need to make sure it’s in your `PYTHONPATH`, then you can do
$ lldb
(lldb) command script import crashlog
to use it in lldb.
The fixed version does not require `dsymForUUID`, and indeed it’s rather faster without it, but it can use a `dsymForUUID` script if you happen to have one (e.g. because you work at Apple). To use it with your custom `dsymForUUID`, you need to set the `DSYMFORUUID` environment variable to the full path of your script.
I found an interesting bug in the symbolicator; I’ve uploaded a new crashlog.py script that fixes it.
I’ve moved crashlog.py to Bitbucket, and added support for symbolicating the output of the `sample` command.
For the past few months I’ve been working hard on a new release of iDefrag, version 5, and as part of this I’m rewriting the documentation. Rather than using hand-written HTML like I did before, I’ve chosen this time around to use a documentation generator, Sphinx. The advantages of this approach include:
Built-in support for indexing and cross-referencing.
The ability to write the documentation in plain text.
Keeps the presentation details separate from the content (via theming and templates).
Supports multiple output formats, not just HTML.
The current version of Sphinx doesn’t directly support building Apple Help Books, but I’ve submitted a pull request to fix that so hopefully by the time you read this you’ll be able to do
$ sphinx-quickstart
fill in some fields and then do
$ make applehelp
to generate a help book.
(If you do do that, you’ll want to edit your `conf.py` file quite a bit, and you probably don’t want to use the default theme either.)
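As a rough sketch of the sort of `conf.py` edits I mean, the Apple Help builder takes its bundle metadata from configuration options (the option names below follow the applehelp builder; the values are placeholders, not iDefrag’s actual settings):

```python
# conf.py fragment: minimal metadata for the Apple Help builder.
# Values are placeholders; substitute your own bundle ID and title.
applehelp_bundle_id = "com.example.surfwriter.help"
applehelp_title = "SurfWriter Help"
```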
Anyway, all of the Sphinx related stuff was fine, and worked as documented. Unlike Apple Help, which doesn’t. I spent an entire day struggling to make a help book that actually worked, and most of that is because of problems with the documentation.
Let’s start with the Info.plist. Apple gives this not particularly helpful table:
| Key | Exact or sample value |
|---|---|
| CFBundleDevelopmentRegion | en_us |
| CFBundleIdentifier | com.mycompany.surfwriter.help |
| CFBundleInfoDictionaryVersion | 6.0 |
| CFBundleName | SurfWriter |
| CFBundlePackageType | BNDL |
| CFBundleShortVersionString | 1 |
| CFBundleSignature | hbwr |
| CFBundleVersion | 1 |
| HPDBookAccessPath | SurfWriter.html |
| HPDBookIconPath | shrd/SurfIcn.png |
| HPDBookIndexPath | SurfWriter.helpindex |
| HPDBookKBProduct | surfwriter1 |
| HPDBookKBURL | https://mycompany.com/kbsearch.py?p='product'&q='query'&l='lang' |
| HPDBookRemoteURL | https://help.mycompany.com/snowleopard/com.mycompany.surfwriter.help/r1 |
| HPDBookTitle | SurfWriter Help |
| HPDBookType | 3 |
There are two serious problems with the table above. The first is that some of it is wrong(!), and the second is that it doesn’t indicate which values are sample values and which are required.
Here’s what you actually need:
| Key | Value |
|---|---|
| CFBundleDevelopmentRegion | en-us |
| CFBundleIdentifier | your help bundle identifier |
| CFBundleInfoDictionaryVersion | 6.0 |
| CFBundlePackageType | BNDL |
| CFBundleShortVersionString | your short version string, e.g. 1.2.3 (108) |
| CFBundleSignature | hbwr |
| CFBundleVersion | your version, e.g. 108 |
| HPDBookAccessPath | _access.html (see below) |
| HPDBookIndexPath | the name of your help index file |
| HPDBookTitle | the title of your help file |
| HPDBookType | 3 |
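Put together, a minimal `Info.plist` built from the table above might look like the following (the identifier, titles and file names are placeholders you would substitute your own values for):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>CFBundleDevelopmentRegion</key>     <string>en-us</string>
  <key>CFBundleIdentifier</key>            <string>com.mycompany.surfwriter.help</string>
  <key>CFBundleInfoDictionaryVersion</key> <string>6.0</string>
  <key>CFBundlePackageType</key>           <string>BNDL</string>
  <key>CFBundleShortVersionString</key>    <string>1.2.3 (108)</string>
  <key>CFBundleSignature</key>             <string>hbwr</string>
  <key>CFBundleVersion</key>               <string>108</string>
  <key>HPDBookAccessPath</key>             <string>_access.html</string>
  <key>HPDBookIndexPath</key>              <string>SurfWriter.helpindex</string>
  <key>HPDBookTitle</key>                  <string>SurfWriter Help</string>
  <key>HPDBookType</key>                   <string>3</string>
</dict>
</plist>
```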
The first thing to note is that `CFBundleDevelopmentRegion` should have a hyphen, not an underscore. Apple’s utilities generate this properly, but the documentation is wrong.
The second thing to note is that in spite of the documentation implying that you can use your help bundle identifier to refer to your help bundle (which would, admittedly, make sense), you can’t. You need to use the `HPDBookTitle` value. Oh, and ignore any references to `AppleTitle` meta tags. You don’t need those.
The third thing relates to `HPDBookAccessPath`. The file referred to there must be a valid XHTML file. In particular, it cannot be an HTML5 document — that will simply not work, and the error messages you get on the system console are completely uninformative.
The best solution I’ve come up with for this particular problem, as I want to generate modern HTML output, is to make a file called `_access.html` and put the following in it:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Title Goes Here</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="robots" content="noindex" />
<meta http-equiv="refresh" content="0;url=index.html" />
</head>
<body>
</body>
</html>
This means that both `helpd` and the help indexer (`hiutil`) are happy, and I can write my index page using modern HTML. Incidentally, Apple appears to be using a similar trick in the help for the current version of Mail. Obviously you can change the `index.html` in the above to whatever you need.
In your application bundle, you need to fill in the following keys:
| Key | Value |
|---|---|
| CFBundleHelpBookFolder | The path of your help book relative to Resources, e.g. SurfWriter.help |
| CFBundleHelpBookName | The value from HPDBookTitle, above |
Note that while the `HPDBookTitle` is displayed to the user, it can be localised using `InfoPlist.strings`. Note also that you absolutely cannot, contrary to what the documentation implies, give a bundle ID here. It just doesn’t work. You could, however, if you wanted, write an `InfoPlist.strings` file like this:
HPDBookTitle = "SurfWriter Help"
then put the bundle ID in as the `HPDBookTitle` in the `Info.plist`.
Oh, and if you think you’re going to be able to double-click a help book to preview it, think again. That won’t work. Instead, you need either to use it from within your application, or you can put it in `~/Library/Documentation/Help` (you might have to make that folder) and double-click it in there. Why? Because help files are indexed and you can only open them if they’re registered in the index.
One other thing that isn’t really documented at all is what exactly the `HPDBookRemoteURL` will do for you. There’s some handwaving about being able to offer remote content updates, but how the URL is used is skirted over. Well, if you do set `HPDBookRemoteURL`, Help Viewer will essentially expect it to point at a copy of the `Resources` folder of your bundle; so if you have `HPDBookRemoteURL` set to `http://example.com/foo/bar/`, then you’re going to get requests like `http://example.com/foo/bar/en.lproj/index.html` (and so on).
Useful update (Feb 29th 2016)
You may have noticed that Help Viewer has a button to toggle the table of contents in your help file. Matt Shepherd did a bit of work looking into this and it turns out that it’s controlled by a JavaScript API — see Matt’s gist for more information.
It isn’t particularly clear from the petition, but the problem being raised is that in order to register for the Mini One Stop Shop in the UK, you currently need to be registered for UK VAT. This is something that we have been talking to HMRC about, and I have the impression that HMRC is amenable, in principle, to allowing non-VAT-registered entities to use the Mini One Stop Shop system, though the details of that have not been worked out.
Note also that your sales here in the UK will continue to be subject to ordinary UK VAT, and will not be reported through MOSS, and even if your UK-only sales are below the UK VAT threshold, it’s likely that you have expenditure in the UK that involves an element of VAT, so you might want to consider a voluntary registration in any event, in order to reclaim your input tax.
(There is a related issue within the Mini One Stop Shop itself, in that there are no thresholds for amounts reported via MOSS. HMRC did try to negotiate a threshold, but other member states didn’t support the idea and it was dropped.)
It is also worth pointing out that the Mini One Stop Shop is optional. You don’t have to use it. The alternatives are:
Use a digital “marketplace” (e.g. Apple’s App Store, Google Play, Paddle). Marketplace operators, as of the 1st of January 2015, are required by law to deal with EU VAT for you. You will only need to deal with B2B transactions between you and the store operator.
Register for VAT in EU member states into which you are selling. This will mean filing multiple VAT returns and complying fully with (up to) 28 different sets of VAT legislation.
Use a distributor in EU member states you wish to sell into. The distributor is a business, so you only need worry about a B2B sale; B2C sales will be made by the distributor within the member state(s) in which it operates.
Stop selling to other EU member states.
For a lot of digital micro-businesses, the best approach is likely to be to use a digital marketplace. MOSS gets you a single return and a single payment; unlike using a marketplace or a distributor, it does not free you from the need to comply with up to 28 different sets of VAT rules, though it makes doing so considerably simpler in a number of ways.
As regards determining whether your sale is in the EU or not, with very few exceptions (mostly having to do with e.g. mobile network operators, where there is an obvious way to tell where the customer is) you need to keep two non-contradictory pieces of information that identify your customer’s location. These might include, for instance, the customer’s billing address, the IP address of their device, the location of their bank, or the country code of the SIM card they are using.
If those two pieces of information say your customer is outside the EU, then it doesn’t matter (from your perspective) if the customer was really stood in the middle of Brussels at the time; the rules say that you have done what is expected of you.
Apparently, Bash allows subshells to inherit exported function definitions, which it implements by passing environment variables with those functions’ names through to subshells, with the value of the variable containing the function definition. For instance
outer$ function hello {
> echo "Hello World"
> }
outer$ export -f hello
outer$ PS1="inner$ " /bin/bash
inner$ hello
Hello World
inner$ exit
outer$ export -nf hello
In this case, the outer shell has exported the function `hello` to the inner shell, by setting an environment variable `hello` to the string `() { echo "Hello World"; }`. We can test this:
outer$ export hello='() { echo "Hello World"; }'
outer$ PS1="inner$ " /bin/bash
inner$ hello
Hello World
inner$ exit
outer$ export -n hello
On its own, this feature is only harmful if a user can specify the name and content of an environment variable, and only then if some program is foolishly trying to run commands without specifying their full path. For example:
outer$ ls='() { echo "No way, Jose"; }' PS1="inner$ " /bin/bash
inner$ ls
No way, Jose
inner$ /bin/ls
foo.txt bar.txt
inner$ exit
However, current versions of Bash contain a bug that causes Bash to execute trailing statements on environment variables of this form, so for example
outer$ naughty='() { :;}; echo "Oh dear, oh dear"' PS1="inner$ " /bin/bash
Oh dear, oh dear
inner$ exit
In the above example, the inner shell runs the `echo` command. It shouldn’t.
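If you want to test a machine for this, the same trick can be driven from any language that lets you control a child process’s environment. Here is a small sketch in Python (assuming `/bin/bash` exists); on a vulnerable Bash it prints "vulnerable" before "done", while a patched Bash prints only "done" (possibly with a warning on stderr):

```python
import subprocess

# Put a function-style definition with a trailing command into the
# environment, then start a child Bash. Only a vulnerable Bash will
# execute the trailing "echo vulnerable" while parsing its environment.
env = {"PATH": "/usr/bin:/bin",
       "x": '() { :;}; echo vulnerable'}
result = subprocess.run(["/bin/bash", "-c", "echo done"],
                        env=env, capture_output=True, text=True)
print(result.stdout, end="")
```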
Now, this is potentially a major security hole, but only in certain circumstances, namely:
If a user can set the value of an environment variable, and
Where a program passes control to a Bash shell and passes that value through.
The two most common cases that you might find that allow remote exploitation of this bug are CGI scripts (the old-fashioned kind, not FastCGI, and not anything run via Apache’s mod_php, mod_perl or mod_python) and OpenSSH if you were relying on the `ForceCommand` feature to provide restricted SSH access. `sudo`, fortunately, already strips out Bash exported functions (and has done since 2004), so is not affected.
Put another way, unless you have very old code running on your web servers, and unless you are doing something like running a public SSH server that allows restricted log-ins (e.g. to run Git or Subversion via SSH, but nothing else), the chances are that you aren’t vulnerable to remote exploits based on this. You should check, but you should not panic.
Why do I say this? Well, because it seems there are people out there who confuse Twitter with services like Glassboard, and think that people they don’t know shouldn’t respond to their tweets. Or maybe it’s just people who disagree with them; it’s unclear.
There are a few important facts that such people need to be made aware of:
People who follow them may retweet their tweets. As a result they may very well be seen by people who do not follow them, who they do not know and who might disagree with whatever opinion they’ve expressed.
By default, your tweets are public. That being the case, tweeting is like standing on a soap box at Hyde Park Corner, talking loudly to all who will listen. You don’t get to pick your audience.
If you say something on Twitter (or indeed from a soap box at Hyde Park Corner), and someone who sees your tweet (or is listening to you) finds it interesting or controversial, they have every right to reply. Your “conversation” is not private in any way, shape or form; indeed, it is not actually a conversation.
If you don’t like the above facts, Twitter has a mode for you; set your account to “protected” tweet mode. At that point, you do get to screen your followers, who can’t retweet you.
Yes, there are downsides to protected tweet mode. If you don’t like the way Twitter works, and you don’t want to protect your tweets, post to a blog instead and turn comments off. Or use a private group chat system like Glassboard. Alternatively, you will simply have to live with it.
Finally, if you ask on Twitter why people are replying to you when you don’t want them to, and someone points out all of the above, there is absolutely no excuse for threatening or abusing them.
Let me summarise my conclusion first, and then explain why I came to it a long time ago, and why it’s relevant to Swift.
If you are using Unicode strings, they should be (or at least look like they are) encoded in UTF-16.
“But code-points!”, I hear you cry.
Sure. If you use UTF-16, you can’t straightforwardly index into the string on a code-point basis. But why would you want to do that? The only justification I’ve ever heard is based around the notion that code-points somehow correspond to characters in a useful way. Which they don’t.
Now, someone is going to object that UTF-16 means that all their English language strings are twice as large as they need to be. But if you do what Apple did in Core Foundation and allow strings to be represented in ASCII (or more particularly in ISO Latin-1 or any subset thereof), converting to UTF-16 on the fly at the API level is trivial.
What about UTF-8? Why not use that? Well, if you stick to ASCII, UTF-8 is compact. If you include ISO Latin-1, UTF-8 is never larger than UTF-16. The problem comes with code-points that are inside the BMP, but have code-point values of `0x800` and above. Those code-points take three bytes to encode in UTF-8, but only two in UTF-16. For the most part this affects Oriental and Indic languages, though Eastern European languages and Greek are affected to some degree, as is mathematics and various shape and dingbat characters.
So, first off, UTF-8 is not necessarily any smaller than UTF-16.
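This is easy to check. For instance, Devanagari code points sit above U+0800, so each takes three bytes in UTF-8 but only two in UTF-16 (Python used here purely as a convenient way to count bytes):

```python
# Two Devanagari code points: KA (U+0915) and SSA (U+0937).
s = "\u0915\u0937"
assert len(s.encode("utf-8")) == 6       # three bytes per code point
assert len(s.encode("utf-16-le")) == 4   # two bytes per code point
print("UTF-8:", len(s.encode("utf-8")), "bytes;",
      "UTF-16:", len(s.encode("utf-16-le")), "bytes")
```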
Second, and this is an important one too, UTF-8 permits a variety of invalid encodings that can create security holes or cause other problems if not dealt with. For instance, you can encode NUL (code-point 0) in any of the following ways:
00
c0 80
e0 80 80
f0 80 80 80
Some older decoders may also accept
f8 80 80 80 80
fc 80 80 80 80 80
Officially, only the first encoding (`00`) is valid, but you as a developer need to check for and reject the other encodings. Additionally, any encoding of the code-points `d800` through `dfff` is invalid and should be rejected — a lot of software fails to spot these and lets them through.
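A well-behaved decoder will refuse all of these; Python’s strict UTF-8 codec is a handy way to see the expected behaviour:

```python
# Overlong encodings of NUL, plus an encoded surrogate (U+D800).
# A strict UTF-8 decoder must reject every one of these sequences.
bad_sequences = [b"\xc0\x80", b"\xe0\x80\x80", b"\xf0\x80\x80\x80",
                 b"\xed\xa0\x80"]
for seq in bad_sequences:
    try:
        seq.decode("utf-8")
        raise AssertionError("invalid sequence accepted: %r" % seq)
    except UnicodeDecodeError:
        pass  # correctly rejected

# The single valid encoding of NUL is the one-byte form.
assert b"\x00".decode("utf-8") == "\u0000"
print("all invalid sequences rejected")
```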
Finally, if you start in the middle of a UTF-8 string, you may need to move a variable number of bytes to find the character you’re in, and you can’t tell in advance how many that will be.
For UTF-16, the story is much simpler. Once you’ve settled on the byte order, you really only need to watch out for broken surrogate pairs (i.e. use of `d800` through `dfff` that doesn’t comply with the rules). Otherwise, you’re in pretty much the same boat as you would be if you’d picked UCS-4, except that in the majority of cases you’re using 2 bytes per code-point, and at most you’re using 4, so you never use more than UCS-4 would to encode the same string.
If you have a pointer into a UTF-16 string, you may at most need to move one code unit back, and that only happens if the code unit you’re looking at is between `dc00` and `dfff`. That’s a much simpler rule than the one for UTF-8.
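That resynchronisation rule is short enough to write down in full; a sketch, operating on a plain list of UTF-16 code units:

```python
def code_point_start(units, i):
    """Given a list of UTF-16 code units, return the index at which the
    code point containing units[i] starts. We back up at most one unit,
    and only when units[i] is a low (trailing) surrogate."""
    if 0xDC00 <= units[i] <= 0xDFFF:
        return i - 1
    return i

# "A" followed by U+1F600, which encodes as a surrogate pair:
units = [0x0041, 0xD83D, 0xDE00]
assert code_point_start(units, 0) == 0  # already at a boundary
assert code_point_start(units, 1) == 1  # high surrogate starts the pair
assert code_point_start(units, 2) == 1  # low surrogate: back up one unit
```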
I can hear someone at the back still going “but code-points…”. So let’s compare code-points with what the end user thinks of as characters and see how we get on, shall we?
Let’s start with some easy cases:
0 - U+0030
A - U+0041
e - U+0065
OK, they’re straightforward. How about
é - U+00E9
Seems OK, doesn’t it? But it could also be encoded
é - U+0065 U+0301
Someone is now muttering about how “you could deal with that with normalisation”. And they’re right. But you can’t deal with this with normalisation:
ē̦ - U+0065 U+0304 U+0326
because there isn’t a precomposed variant of that character.
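You can verify this for yourself; a quick illustration using Python’s standard `unicodedata` module (purely for convenience):

```python
import unicodedata

s1 = "\u00e9"    # é, precomposed
s2 = "e\u0301"   # é, written as e + combining acute accent
assert s1 != s2 and len(s1) == 1 and len(s2) == 2
# Normalisation does rescue this case:
assert unicodedata.normalize("NFC", s2) == s1

# e + combining macron + combining comma below: NFC can compose the e
# with the macron, but no single precomposed code point covers the
# whole cluster, so the result is still more than one code point.
s3 = "e\u0304\u0326"
assert len(unicodedata.normalize("NFC", s3)) > 1
```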
“Yeah”, you say, “but nobody would ever need that”. Really? It’s a valid encoding, and someone somewhere probably would like to be able to use it. Nevertheless, to deal with that objection, consider this:
בְּ - U+05D1 U+05B0 U+05BC
That character is in use in Hebrew. And there are other examples, too:
कू - U+0915 U+0942
कष - U+0915 U+0937
The latter case is especially interesting, because whether you see a single glyph or two depends on the font and on the text renderer that your browser is using(!)
The fact is that code-points don’t buy you much. The end user is going to expect all of these examples to count as a single “character” (except, possibly for the last one, depending on how it’s displayed to them on screen). They are not interested in the underlying representation you have to deal with, and they will not accept that you have any right to define the meaning of the word “character” to mean “Unicode code-point”. The latter simply does not mean anything to a normal person.
Now, sadly, the word “character” has been misused so widely that the Unicode consortium came up with a new name for the-thing-that-end-users-might-regard-as-a-unit-of-text. They call these things grapheme clusters, and in general they consist of a sequence of code-points of essentially arbitrary length.
Note that the reason people think using code-points will help them is that they are under the impression that a code-point maps one-to-one with some kind of “character”. It does not. As a result, you already have to deal with the fact that one “character” does not take up one code unit, even if you chose to use the Unicode code-point itself as your code unit. So you might as well use UTF-16: it’s no more complicated for you to implement, and it’s never larger than UCS-4.
It’s worth pointing out at this point that this is the exact choice that the developers of ICU (the Unicode reference implementation) and Java (whose string implementation derives from the same place) made. It’s also the choice that was made in Objective-C and Core Foundation. And it’s the right choice. UTF-8 is more complicated to process and is not, actually, smaller for many languages. If you want compatibility with ASCII, you can always allow some strings to be Latin-1 underneath and expand them to UTF-16 on the fly. UCS-4 is always larger and actually no easier to process because of combining character sequences and other non-spacing code-points.
Why is this relevant to Swift? Because in Matt Galloway’s article, it says:
Another nugget of good news is there is now a builtin way to calculate the true length of a string.
Only what Matt Galloway means by this is that it can calculate the number of code-points, which is a figure that is almost completely useless for any practical purpose I can think of. The only time you might care about that is if you were converting to UCS-4 and wanted to allocate a buffer of the correct size.
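To make the point concrete, here is the same two-code-point é counted three different ways (again, Python purely for illustration); none of the three counts matches the single “character” an end user would report seeing:

```python
s = "e\u0301"   # é written as e + combining acute accent
assert len(s) == 2                            # code points
assert len(s.encode("utf-16-le")) // 2 == 2   # UTF-16 code units
assert len(s.encode("utf-8")) == 3            # UTF-8 bytes
# The end user sees one grapheme cluster, a count none of the above gives.
```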
…`Int`, because of a problem with the current version of the Swift compiler. The second problem is that you can’t use it from the main thread in a Cocoa or Cocoa Touch program, because `await` blocks.
As I mentioned previously on Twitter, to make it work really well involves some shenanigans with the stack. Anyway, I’m pleased to announce that I’ve been merrily hacking away and as a result you can download a small framework project that implements async/await from BitBucket.
I’m quite pleased with the syntax I’ve managed to construct for this as well; it looks almost as if it’s a native language feature:
Now, to date I haven’t actually tried it on iOS; I think it should work, but it’s possible that it will crash horribly. It is certainly working on OS X, though.
How does it work? Well, behind the scenes, when you use the `async` function, a new (very small) stack is created for your code to run in. The C code then uses `_setjmp()` and `_longjmp()` to switch between different contexts when necessary. If you want to cringe slightly now, be my guest :-)
Possible improvements when I get the time:
- Getting rid of the `T[]` hack that we’re using instead of declaring the result type in the `Task<T>` object as `T?`. The latter presently doesn’t work because of a compiler limitation.

Anyway, here goes:
Now, obviously if Swift supported continuations, this might be done more efficiently (i.e. without any background threads or semaphores), but that’s an implementation detail.
There are also some syntax changes that would make it cleaner, notably if it were permissible to remove the { return and } from the async function declarations. I did briefly try to see whether I was allowed to assign to a function, à la

func Test(var a : Int) -> Task = async { }

but that syntax isn’t allowed (if it were, async would obviously need to return a block).
If you are a software developer selling software in the European Union, these changes matter to you. There has been very little publicity thus far about these changes (that will change as we get closer to the end of the year), but given that you may need to make changes to your website, it seems like a good idea to tell you about them now.
So, what’s changing? Currently, if you are established in the European Union and you sell downloadable software to a customer who is also in the European Union, you always charge VAT in your country, following the rules in your country, and you pay it to the tax authority in your country. This is simple, because there is only one set of rules to follow, and it’s the one for your country.
As of the 1st of January, the VAT will instead be due in the customer’s country. If there were no other changes to the rules, you would therefore be obliged to register for VAT in other member states, according to their rules, and submit multiple returns every quarter (or at whatever interval they specify). That means you might have to register with up to 28 member states, apply 28 different rates under 28 different sets of rules, make 28 times as many VAT returns, and send 28 separate payments in different currencies (with currency conversions and rounding following different rules in different jurisdictions). For a small software company or an independent developer, this is clearly not going to work.
There are two other changes that are also coming in at the same time that mitigate this problem. The first is that app stores will be responsible for charging and remitting consumer VAT. Apple already does this, but some other app stores may not. Under the new rules, they will have to, so you will only have to deal with VAT as it applies to transactions between you and the app store provider.
If you sell direct to consumers, that doesn’t really help, though. What will help is that EU member states are going to operate a system known as the Mini One Stop Shop (or MOSS for short). This is similar to the scheme that has been operating for businesses outside of the EU selling to EU customers, whereby you can register with a single tax authority, submit a single return to that tax authority, and pay all of the tax due to that one place. You are still required to charge VAT at the rate applicable in the customer’s country, and in various respects the rules in that country will still apply — with some simplifications. Registration for this new scheme starts in October, and, unless you plan on only selling via an app store, you will probably want to register for it.
The other slight complication is that after the 1st of January, you will need to keep two non-conflicting pieces of evidence identifying the location of your customer. HMRC has indicated, at least in the case of the U.K., that they will be fairly relaxed about this evidence — so, for instance, they realise that IP geolocation may not be 100% accurate, and that some customers may lie and give you false details. It also does not matter if you hold further data that conflicts with your two non-conflicting pieces of evidence; all you need is those two. Note, however, that this affects all of your sales, not just those to customers in the EU, because the same evidence underpins any decision not to charge VAT on the grounds that a customer is not in any EU member state.
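The “two non-conflicting pieces of evidence” test can be sketched in a few lines of Python. This is a toy illustration, not legal or accounting advice, and the rates table is made up for the example; real EU VAT rates vary by country and change over time.

```python
# Illustrative VAT rates only; do NOT use these figures for real sales.
RATES = {"UK": 0.20, "DE": 0.19, "FR": 0.20}

def place_of_supply(evidence):
    """Return the customer's country once two pieces of evidence agree,
    or None if no two pieces match (further checks would then be needed)."""
    seen = {}
    for source, country in evidence:
        seen[country] = seen.get(country, 0) + 1
        if seen[country] >= 2:
            return country
    return None

evidence = [("billing_address", "DE"),
            ("ip_geolocation", "FR"),     # conflicting extra data is OK
            ("bank_country", "DE")]
country = place_of_supply(evidence)
print(country)            # DE: two pieces agree
print(RATES[country])     # charge the rate for the customer's country
```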
Why am I telling you about this? Because I’m a member of H.M. Revenue and Customs’ MOSS Joint SME Business/HMRC Working Group. Those of you who are in the UK, if you have queries about the scheme, or issues you would like to raise with HMRC, please do get in touch and I’ll try to help out. (If you are a member of TIGA, they have a couple of representatives on the working group also, so you can talk to them too.)
Finally, I will add that the law changes are already made — back in 2008 — so the scope for changing the rules at this stage is very limited. What we can influence to some extent is how they’re enforced and whether HMRC is aware of problems the new rules may cause us.
I’ll be posting some more on this topic over the coming weeks and months.
dmgbuild automates the creation of (nice looking) disk images from the command line. No GUI tools are necessary; there is no AppleScript, and it doesn’t rely on Finder or on any deprecated APIs.
Why use this approach? Well, because everything about your disk image is defined in a plain text file, you’ll get the same results every time; not only that, but the resulting image will be the same no matter what version of Mac OS X you build it on.
If you’re interested, the Python package is up on PyPI, so you can just do pip install dmgbuild to get the program (if you don’t have pip, do easy_install pip first; or download it from PyPI, extract it, then run python setup.py install).
You can also read the documentation, or see the code.
It’s really easy to use; all you need do is make a settings file (see the documentation for an example) then from the command line enter something like
dmgbuild -s my-settings.py "My Disk Image" output.dmg
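To give a flavour, here is a minimal sketch of a settings file. The key names follow dmgbuild’s documented settings, but the file name, paths, and icon positions here are made up for illustration; see the documentation for the full list.

```python
# my-settings.py -- a minimal dmgbuild settings file (illustrative).

files = ["MyApp.app"]                          # contents of the image
symlinks = {"Applications": "/Applications"}   # drag-to-install target

icon_size = 128
icon_locations = {
    "MyApp.app": (140, 120),
    "Applications": (400, 120),
}
```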
The code for editing .DS_Store files and for generating Mac aliases has been split out into two other modules, ds_store and mac_alias, for those who are interested in such things. The ds_store module should be fully portable to other platforms; the mac_alias module relies on some OS X specific functions to fill out a proper alias record, and on other systems those would need to be replaced somehow. The dmgbuild tool itself relies on hdiutil and SetFile, so it will only work on Mac OS X.
Is this a thing? Really? Well, no, not really.
Very early on, disks and tapes were relatively unreliable and so there have basically always been checksums of some description to let you know if data you read is corrupted. Historically, we’re talking about some kind of per-block cyclic redundancy check, which is why one of the error codes you can receive at a disk hardware interface is “CRC error”.
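The detection half of this is easy to sketch with a CRC, here using Python’s zlib.crc32 as a toy illustration; real drives compute per-sector codes in firmware, not in software like this.

```python
import zlib

block = bytearray(b"some sector contents")
stored_crc = zlib.crc32(block)   # checksum written alongside the data

block[3] ^= 0x01                 # flip a single bit "on the disk surface"

ok = zlib.crc32(block) == stored_crc
print(ok)                        # False: the corruption is detected
```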
Modern disks actually use error-correcting codes such as Reed-Solomon encoding or low-density parity check (LDPC) codes. A single random bit error under such schemes can be corrected, end of story. They may be able to correct multiple bit errors too, and these codes can detect more errors than they are able to correct.
The upshot is that a single bit flip on a disk surface won’t cause a read error; in fact, the software in your computer won’t even notice it because the hard disk will correct it and rewrite the data on its own.
It takes multiple flipped bits to cause a problem, and in most cases this will result in the drive reporting a failure to the operating system when it tries to read the block in question. The probability of a multi-bit failure getting past Reed-Solomon or LDPC codes is tiny.
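The correction idea can be sketched with a classic Hamming(7,4) code in Python. This is a textbook toy, vastly simpler than the Reed-Solomon or LDPC codes real drives use, but it shows how a single flipped bit is located and silently repaired.

```python
def hamming_encode(nibble):
    """Encode 4 data bits [d1,d2,d3,d4] as 7 bits with 3 parity bits."""
    d1, d2, d3, d4 = nibble
    p1 = d1 ^ d2 ^ d4            # parity over positions 1,3,5,7
    p2 = d1 ^ d3 ^ d4            # parity over positions 2,3,6,7
    p3 = d2 ^ d3 ^ d4            # parity over positions 4,5,6,7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming_decode(code):
    """Correct any single flipped bit, then return the 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 1-based position of the bad bit
    if syndrome:
        c[syndrome - 1] ^= 1          # flip it back
    return [c[2], c[4], c[5], c[6]]

data = [1, 0, 1, 1]
code = hamming_encode(data)
code[4] ^= 1                          # a single "surface" bit flip
print(hamming_decode(code) == data)   # True: corrected transparently
```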
The author then goes on to make a ludicrous claim that RAID won’t be able to deal with this kind of event, and “demonstrates” by flipping “a single bit” on one of his disks to make his point. Unfortunately, this is a completely bogus test. He has, in fact, flipped many more bits than just the one, and he’s done so by writing to the disk, which will encode his data using its error-correcting code, resulting in a block that reads back correctly because he has actually stored the wrong data there deliberately.
The fact is that, in practice, when an unrecoverable data corruption occurs on a disk surface, the disk returns an error when something tries to read that block. If a RAID controller gets such an error, it will attempt to rebuild the data using parity (or whatever other redundancy mechanism it’s using).
So RAID really does protect you from changes that occur on the disk itself.
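The rebuild step is simple to sketch with XOR parity, as used in RAID 5. This is illustrative Python, not a real RAID implementation: the parity block is the XOR of the data blocks, so any single lost block can be reconstructed from the survivors.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-sized blocks together byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, byte_tuple)
                 for byte_tuple in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]   # three data "disks"
parity = xor_blocks(data)            # the parity "disk"

# Disk 1 dies and returns read errors; rebuild its contents from
# the surviving data disks plus the parity disk.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])            # True
```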
Where RAID does not protect you is on the computer side of the equation. It doesn’t prevent random bit flips in RAM, or in the logic inside your machine. Some components in some computers have their own built-in protection against these events — for instance, ECC memory uses error-correcting codes to prevent random bit errors from corrupting data, while some data buses themselves use error correction. If you are seeing random bit flips in files that otherwise read OK, it’s much more likely they were introduced in the electronics, or even via software bugs, and written in their corrupted form to your storage device.
An aside: programmers generally use the term “bit rot” to refer to the fact that unmaintained code will often at some point stop working because of apparently unrelated changes in other parts of a large program. Such modules are said to be suffering from “bit rot”. I’ve never heard it used in the context of data storage before.