Merry Christmas, everyone.
Hope you’re all having a good time, wherever you are.
I just had someone suggest that we should take part in a scheme like MacHeist, but where we don’t get paid at all!
The cheek of it!
We’re expected to provide technical support (a significant drain on our time—and therefore a significant cost—as we sell disk utility software), and users will expect discounts on future upgrades.
Oh, and let’s not forget the 250MB+ bootable CD template images that each of these users would want to download. Quite a significant bandwidth cost, that; we already transfer hundreds of GBs a month, without thousands of people suddenly hitting us for free copies.
What’s even worse, the people in question weren’t sure whether they were organising one “heist” or many, and didn’t really seem to know how many people they were going to give the software away to. In fact, they hadn’t really thought about it at all.
About the only saving grace is that they claim they won’t be making any money out of it.
Who are they? I’m not going to give them any publicity by saying. But I’d advise other developers to stay well away from this kind of deal. If you want publicity, go buy some advertising. It may look expensive but it’ll be cheaper in the long run, plus you’ll attract people who will actually pay for your apps.
I’ve been reading with some interest John Gruber’s comments about MacHeist and MacZot, as well as Gus Mueller’s take on the situation.
JG’s analysis is, I think, spot on. MacHeist and MacZot are basically mechanisms for fleecing independent developers. I have no idea why anyone would agree to sell their app for a week for a fixed price rather than a percentage, especially when the organisers of the promotion are taking a percentage cut. It seems like bad business to me, especially when the developers will have to deal with the additional support that all of those users create.
And as for the argument that MacHeist and MacZot are in any way similar to a proper distributor, let me tell you: our Japanese distributor does way more work, including:
And they don’t ask for the kind of share of the profits that MacHeist are reportedly taking!
It’s patently ridiculous to argue that there is any benefit here to the apps’ developers. For the most part they aren’t going to get significant upgrade sales, and for many of the apps it isn’t even as if they need significant extra publicity in the marketplace.
There are some similarly silly arguments about how $5,000 is a lot of money for a single week’s sales. It actually isn’t, particularly when you consider that this isn’t ordinary sales; it’s a promotion. If it works properly, it should result in a lot of sales over a short period, and might actually cause a “sales drought” for a period thereafter (when you do these kinds of things, you, of course, hope to attract business that you wouldn’t otherwise have got… but many of these apps are already very competitively priced so I don’t see that happening here).
Someone even tried to compare it to conventional advertising, remarking “When was the last time you were paid to place an ad?” and commenting that:
“The people against MacHeist and similar things have never tried to buy exposure or ads or build a brand using traditional methods. MacHeist is not a new concept by any means, its [sic] just new to the Mac dev community who have, apparently, be [sic] squirreled away from the big bad world of real business.”
Hmmm. Well I’ll correct one thing straight up. I’m against MacHeist and MacZot, and I have bought ads—in fact, I recently spent thousands of dollars on an ad. campaign. I’m also involved with proper distributors who stick things in boxes on shelves, so I’m indirectly paying for “proper” advertising there too.
Anyway, the idea that MacHeist or MacZot are an ad. campaign is simply nonsense. When was the last time Jaguar ran an advertising campaign where they gave their cars away at an 80% discount to everyone who asked? <sarcasm>What’s that? They haven’t. Oh. I wonder why, I mean, it’s such an effective idea, right? And someone will even pay them $5,000 to do it for a week! That’s cheap advertising, right? I mean, it costs tens of thousands at least for a full-page advert in a broadsheet.</sarcasm>
If you believe that kind of tripe, you really don’t have a clue about business (ironically the person who wrote it finished their remarks by calling someone else’s business strategy “cr#p”). Advertising is advertising and promotions are promotions. MacHeist and MacZot are promotions, and—for the developers involved—assuming JG’s figures are right, bad ones.
One of the more unpleasant things that I’m cursed with is migraines. In my case they aren’t always hugely painful (they do hurt, though), but the thing that really upsets me is the “visual disturbance” that accompanies them (usually preceding the actual headache part). Worse, during the visual disturbance, I sometimes feel confused, which—for someone like me—is very frustrating indeed.
I have one going on at the moment, which is making it very difficult to write this post (I can hardly read it), and pretty much impossible to work :–(
lmh had posted not one, but two advisories about dmg-related vulnerabilities in Mac OS X.
The first I’ve already dealt with, but since some people had combined the two issues into a single overall advisory, I thought I’d address the second one (CVE-2006-6062) as well.
This is much simpler than the first, which in a way makes it even less excusable to claim memory corruption when it doesn’t exist. However, in this case lmh didn’t claim that it allowed arbitrary code execution or make similar exaggerations. So, what is wrong here?
Well, what he’s done is he’s corrupted the Catalog File, which is a critical piece of filesystem metadata that tells HFS+ where things are on the disk. And the corruption in question means that it can’t find the root folder’s record. So whilst the call to VFS_MOUNT() in the mount() system call will succeed, when checkdirs() is subsequently called and tries to get a vnode for the filesystem root, VFS_ROOT() returns ENOENT, which triggers a kernel panic with the reason “mount: lost mount”.
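If you want to see the shape of that failure without wading through xnu, here’s a tiny user-space model of the control flow. To be clear, this is just an illustration of my own: fake_vfs_mount(), fake_vfs_root() and fake_panic() are stand-ins, not the real VFS_MOUNT(), VFS_ROOT() or the kernel’s panic().

// A minimal user-space model of the failure path described above.
// The "fake_" functions are illustrative stand-ins, not kernel interfaces.

#include <cerrno>
#include <cstdio>
#include <cstdlib>

static int fake_vfs_mount(void)
{
    // HFS+ accepts the volume header, so the mount itself "succeeds",
    // even though the Catalog File's root folder record is damaged.
    return 0;
}

static int fake_vfs_root(void)
{
    // Looking up the root folder in the corrupted Catalog File fails.
    return ENOENT;
}

static void fake_panic(const char *why)
{
    std::fprintf(stderr, "panic: %s\n", why);
    std::abort();
}

int main()
{
    if (fake_vfs_mount() != 0)
        return 1;                        // not the path the bad dmg takes

    if (fake_vfs_root() != 0)
        fake_panic("mount: lost mount"); // this is the path the bad dmg takes

    return 0;
}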
Interestingly this particular problem is probably specific to HFS+, as most other filesystems wouldn’t be able to lose their root folder so easily.
So again, I don’t see any memory corruption. All you can do is trigger a kernel panic.
Let me be clear, though: lmh didn’t claim that this was as serious as the other one. One thing he didn’t mention about this bug, however (because he didn’t investigate it properly), is that it doesn’t just affect UDTO disk images… the bug will trigger with suitably corrupted filesystems on any type of media, which could be annoying for an end user if it happens on their hard disk.
It seems I was wrong to blame Secunia for stating that they had verified or confirmed the dmg problem. So, my apologies to Secunia for that.
In my defence, the articles on which I based that remark (particularly the neowin.net post) look like and read like press releases, and until I pointed it out to them, I didn’t see any denial of the “validation” from Secunia, which is a shame.
Anyway, they have now denied having validated or confirmed the exploit and would like to point out that they did not, in fact, issue a press release about the issue.
On 20th of this month, lmh of MoKB fame posted a security advisory claiming that:
Mac OS X com.apple.AppleDiskImageController fails to properly handle corrupted DMG image structures, leading to an exploitable memory corruption condition with potential kernel-mode arbitrary code execution by unprivileged users.
(the bold highlighting is his, not mine, and is clearly intended to shock.)
A supposedly more complete description of the problem was posted here, and was subsequently “validated” by a third-party security company, Secunia, who rate it as a highly critical security flaw, allowing privilege escalation, denial of service and compromise of a vulnerable system. [Secunia tell me that they did not validate the issue, though they have been reported as having done so by others.]
This issue has been given a variety of numbers by various people, including:
Many of the above links are to organisations that people trust to provide authoritative security information. Indeed, so serious, apparently, is this bug, that it’s even made the BBC News site, not once, but twice.
It is a shame, therefore, that lmh didn’t do his homework properly before writing his advisory. It is even more of a shame that all of these people trusted what was written by someone who couldn’t be bothered to analyse the kernel panic he’d triggered properly or to work out what the flaw was that he’d found.
I doubt this would normally even have been noticed. But lmh made a mistake when he accused me of being someone who “doesn’t bother reading, checking and… [being] willing to say something that doesn’t make sense at all” and then subsequently ignored my advice not to mistake me for someone who didn’t know what they were doing.
As a result of his comments on his site, and with some cajoling from Matasano Security’s Thomas Ptacek, I decided to investigate further.
Guess what I found? Not only is lmh’s diagnosis completely incorrect, but the problem isn’t a security flaw at all, let alone a critical, highly critical, or warn-everyone-via-the-BBC type event.
Now, I should say that I’m wary of suggesting that disk images are totally safe. There’s a lot of code involved in mounting and reading/writing a disk image, and quite a bit of that runs in kernel mode. But I am pretty peeved at the way that this issue has been so widely publicised, attracting a great deal of attention for lmh and MoKB, when in actual fact there is no such security flaw. I’m also upset that none of the security companies listed above, especially Secunia who put out a press release saying that they’d “verified” the problem [Secunia tell me that this is untrue.], actually did sufficient analysis to realise that. Hopefully in future they will do their jobs properly and actually go through the kernel code like I did.
“So,” you ask, “this dmg file… it does cause a kernel panic, right? So if lmh is wrong, where’s your evidence—after all, he’s published a very technical-looking report with a lot of gobbledegook from gdb.”
Yes, he has, hasn’t he. Unfortunately he didn’t interpret it properly, which is where the problems started.
It’s taken me the best part of three days’ work to figure out what is really going on here, which gives some idea of how difficult it is (particularly without the source code for the disk image driver) to trace everything back through and come up with a proper, definitive explanation of the problem. lmh obviously wasn’t interested in doing that—after all, why do proper research when you get just as much publicity from guesswork and scaremongering?
What happens when you double-click this dmg file? Well, the first thing that happens is that the system runs a program called diskimages-helper. This program is responsible for a number of tasks related to disk image support in Mac OS X, one of which is mounting disk images.
(All of the discussion that follows is a slight simplification, because there are lots of different types of disk image and the handling differs for some of them. I’m talking specifically about raw read-write disk images, of the kind that lmh posted.)
diskimages-helper talks to the kernel by way of an IOUserClient implementation in the disk image driver, IOHDIXController.kext. When you try to mount a disk image, the helper sends a message to this component asking it to do the actual work.
Next, the kernel component instantiates a number of objects and attaches them to the IOKit’s I/O registry, which causes the IOKit to begin probing for partition maps. For this part, we have source code, in the form of the file IOApplePartitionScheme.cpp; the lines we are interested in are:
// Determine the official block size to use to scan the partition entries.
dpmeBlockSize = mediaBlockSize; // (natural block size)
if ( OSSwapBigToHostInt16(driverMap->sbSig) == BLOCK0_SIGNATURE )
{
    dpmeBlockSize = OSSwapBigToHostInt16(driverMap->sbBlkSize);
    // Increase the probe score when a driver map is detected, since we are
    // more confident in the match when it is present. This will eliminate
    // conflicts with FDisk when it shares the same block as the driver map.
    *score += 2000;
}
This probably looks like gobbledegook, so let me explain what’s going on here. The IOApplePartitionScheme object is looking to see if there is a Block0 structure at the start of the image. If it finds one, then the partition map’s location depends on the block size stored in the Block0, rather than on the natural block size of the disk. It’s relevant at this point to note that disk images always have a natural block size of 512 bytes (this is hard-coded in the kernel).
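For reference, the start of the driver descriptor map (“Block0”) looks roughly like this. I’ve only sketched the two fields the probe above actually reads (both stored big-endian); the struct name here is mine, while the field names follow Apple’s Block0 definition.

// Abbreviated sketch of the start of the driver descriptor map ("Block0").
// Only the fields used by the probe above are shown; both are big-endian.

#include <cstdint>

struct Block0Start {
    std::uint16_t sbSig;      // signature, 'ER' (0x4552) == BLOCK0_SIGNATURE
    std::uint16_t sbBlkSize;  // block size used to locate the partition entries
    // ...remaining fields omitted...
};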
Then, further down in the same file, the code loops through the entries in the partition table, as follows:
// Scan the media for Apple partition entries.
for ( dpmeID = 1, dpmeCount = 1; dpmeID <= dpmeCount; dpmeID++ )
{
    UInt32 partitionBlockSize = dpmeBlockSize;
    // Determine whether we've exhausted the current buffer of entries.
    if ( dpmeID * dpmeBlockSize + sizeof(dpme) > bufferReadAt + bufferSize )
    {
        // Read the next partition entry into our buffer.
        bufferReadAt = dpmeID * dpmeBlockSize;
        status = media->read(this, bufferReadAt, buffer);
        if ( status != kIOReturnSuccess ) goto scanErr;
    }
Notice how it’s using the dpmeBlockSize variable from above to work out where to read from. In lmh’s broken disk image, the start of the image looks like this:
00000000: 4552 023d 0000 2f00 0000 0000 0000 0000  ER.=../.........
00000010: 0000 0000 0000 9c00 0000 0000 0000 0000  ................
The second pair of bytes, 023d, are the interesting ones here; they tell the system that the block size for this image (that is, the value in dpmeBlockSize) is not 512, but 573 bytes. Therefore, the first partition map entry should be at byte offset 573 into the image, which is what bufferReadAt gets set to, and it’s at that offset that the IOApplePartitionScheme code tries to read a partition map entry.
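If you want to check that arithmetic for yourself, here’s a little stand-alone snippet that mimics the calculation in the loop above for lmh’s image; the variable names echo the IOApplePartitionScheme code, but this is just a model of the sums, not the driver itself.

// Model of the offset calculation for lmh's image: the forged block size
// of 0x023d (573) puts the first partition entry at a non-block-aligned
// offset, which is what sets the rest of this story in motion.

#include <cstdint>
#include <cstdio>

int main()
{
    const std::uint32_t mediaBlockSize = 512;    // disk images are always 512
    const std::uint32_t dpmeBlockSize  = 0x023d; // 573, read from the Block0
    const std::uint32_t dpmeID         = 1;      // first partition entry

    std::uint64_t bufferReadAt = std::uint64_t(dpmeID) * dpmeBlockSize;

    std::printf("bufferReadAt      = %llu\n",
                (unsigned long long)bufferReadAt);                    // 573
    std::printf("offset into block = %llu\n",
                (unsigned long long)(bufferReadAt % mediaBlockSize)); // 61
    return 0;
}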
Now, normally, partition map entries are at integer multiples of the device block size, so the various bits of code in IOStorageFamily.kext (which I won’t bore you with, though I went through them to work this out) would simply pass the buffer created by the IOApplePartitionScheme object through to the code that actually does the read.
However, in this case, the read wasn’t aligned on a block boundary. The system, of course, has code to cope with this situation. What it does is it creates an IODeblocker object, whose job is to read the data in chunks and then copy the correct bits of it into the buffer it’s supposed to read into.
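If “deblocker” is an unfamiliar term, the sketch below shows the basic idea in user-space terms: read the whole blocks that cover the misaligned range, then copy out just the slice the caller wanted. It’s an illustration of the concept (assuming 512-byte blocks), not IOKit’s actual IODeblocker class.

// Rough user-space illustration of "deblocking": to satisfy a read that
// isn't aligned to the device block size, read the whole blocks covering
// the requested range into a scratch buffer, then copy out just the bytes
// the caller asked for.  A model of the idea, not IOKit's IODeblocker.

#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static const std::size_t kBlockSize = 512;

// Pretend "device" read: must be block-aligned and a whole number of blocks.
static void readBlocks(const std::vector<std::uint8_t> &device,
                       std::size_t byteOffset, std::size_t byteCount,
                       std::uint8_t *dst)
{
    std::memcpy(dst, &device[byteOffset], byteCount);
}

// Deblocked read: handles an arbitrary (misaligned) offset and length.
static void deblockedRead(const std::vector<std::uint8_t> &device,
                          std::size_t offset, std::size_t length,
                          std::uint8_t *dst)
{
    std::size_t firstBlock = offset / kBlockSize;
    std::size_t lastBlock  = (offset + length + kBlockSize - 1) / kBlockSize;
    std::size_t spanBytes  = (lastBlock - firstBlock) * kBlockSize;

    std::vector<std::uint8_t> scratch(spanBytes);
    readBlocks(device, firstBlock * kBlockSize, spanBytes, scratch.data());

    // Copy out just the misaligned slice the caller wanted.
    std::memcpy(dst, &scratch[offset - firstBlock * kBlockSize], length);
}

int main()
{
    std::vector<std::uint8_t> device(8 * kBlockSize, 0xAA);
    std::vector<std::uint8_t> entry(512);

    deblockedRead(device, 573, entry.size(), entry.data()); // offset 573: misaligned
    return 0;
}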
It is this IODeblocker object that finally makes its way to the IOHDIXHDDriveInKernel object’s kernelIOThread() method. (IOHDIXHDDriveInKernel is an object instantiated by the in-kernel disk image code to handle I/O for the disk image itself.) Normally at this point, it would see an IOBufferMemoryDescriptor object instead, but because of the misalignment of the read request, it gets the IODeblocker.
The first thing that the IOHDIXHDDriveInKernel object tries to do with the buffer when it gets it is to map it into the kernel’s memory space (it doesn’t know at this point, but the buffer is already in the kernel’s memory map because this particular read request came from the kernel itself).
To do this, IOHDIXHDDriveInKernel calls the buffer object’s IOMemoryDescriptor::map() method, which, in turn calls IOMemoryDescriptor::makeMapping(). That function checks for an existing mapping, decides in this case that there isn’t one, and therefore creates a new _IOMemoryMap object.
(You can see all of this in IOMemoryDescriptor.cpp, which you can find in the xnu sources at xnu/iokit/Kernel.)
This object is then initialised using its initWithDescriptor() method, which in turn calls the buffer object’s doMap() method to actually do the work of updating the virtual memory map with the new mappings.
The doMap() method eventually finds its way to a call to IOMemoryDescriptorMapAlloc():
logical = *atAddress;
if( options & kIOMapAnywhere)
    // vm_map looks for addresses above here, even when VM_FLAGS_ANYWHERE
    ref.mapped = 0;
else {
    ref.mapped = trunc_page_32( logical );
    if( (logical - ref.mapped) != pageOffset) {
        err = kIOReturnVMError;
        continue;
    }
}
if( ref.sharedMem && (addressMap == kernel_map) && (kIOMemoryBufferPageable & _flags))
    err = IOIteratePageableMaps( ref.size, &IOMemoryDescriptorMapAlloc, &ref );
else
    err = IOMemoryDescriptorMapAlloc( addressMap, &ref );
which fills in the address at which the memory is mapped in the target task (in this case, the kernel) in the ref.mapped member of the IOMemoryDescriptorMapAllocRef structure, ref. (For those who are wondering, options does indeed contain the flag kIOMapAnywhere at this point.)
Now this is important: we don’t have a normal IOBufferMemoryDescriptor at this point. We have an IODeblocker instead. An IODeblocker has more than one buffer, so when the code in IOMemoryDescriptorMapAlloc() asks for the address of the buffer, it is given the address of the buffer for the first chunk.
All of this is fine, and at the bottom of the function, it makes a call to handleFault():
if( !ref.sharedMem || pager )
err = handleFault( pager, addressMap, ref.mapped, sourceOffset, length, options );
The important thing to note here is that the address argument to handleFault() points at the start of the buffer for the first chunk in our IODeblocker.
Now, we look at the handleFault() method. It clearly needs to be able to handle situations where the data isn’t contiguous from the kernel’s perspective, and indeed if you look at it, you’ll see that it is written as a big loop:
IOReturn IOMemoryDescriptor::handleFault(
    void *           _pager,
    vm_map_t         addressMap,
    IOVirtualAddress address,
    IOByteCount      sourceOffset,
    IOByteCount      length,
    IOOptionBits     options )
{
    ...
    physAddr = getPhysicalSegment64( sourceOffset, &segLen );
    assert( physAddr );
    pageOffset = physAddr - trunc_page_64( physAddr );
    pagerOffset = sourceOffset;
    size = length + pageOffset;
    physAddr -= pageOffset;
    segLen += pageOffset;
    bytes = size;
    do {
        ...
        sourceOffset += segLen - pageOffset;
        address += segLen;
        bytes -= segLen;
        pageOffset = 0;
    } while( bytes
             && (physAddr = getPhysicalSegment64( sourceOffset, &segLen )));
    if( bytes)
        err = kIOReturnBadArgument;
    return( err );
}
Anyone spot the mistake? If handleFault() were guaranteed to be called with multi-segment buffers that were actually a single chunk in user-land, everything would be fine. But here, we’re calling it with a multi-segment buffer (in our case an IODeblocker) that started off in the kernel. So the value in the address variable on the second pass through this loop will be wrong.
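To see why that’s a problem, here’s a toy user-space illustration: two separately allocated chunks are almost never contiguous in virtual memory, so advancing the first chunk’s address by the first segment’s length does not land you on the second chunk. Again, this is an analogy of my own, not the kernel code.

// Toy illustration of the flawed assumption in the loop above: that the
// next segment's mapping starts at (previous address + previous length).
// That holds for a single contiguous user-land buffer, but not for a
// descriptor built from several separately allocated kernel buffers,
// which is what an IODeblocker effectively is.

#include <cstdio>
#include <vector>

int main()
{
    // Two "segments" that belong to one logical I/O, but live in two
    // separately allocated buffers (like the deblocker's head chunk and
    // the caller's real buffer).
    std::vector<char> chunk0(451);   // misaligned head, e.g. 0x1c3 bytes
    std::vector<char> chunk1(512);   // the rest of the transfer

    char *addr0 = chunk0.data();
    char *addr1 = chunk1.data();

    // What handleFault() computes for the "second segment's address":
    char *assumed = addr0 + chunk0.size();

    std::printf("real second segment:    %p\n", (void *)addr1);
    std::printf("assumed second segment: %p\n", (void *)assumed);
    std::printf("assumption holds: %s\n", (assumed == addr1) ? "yes" : "no");
    return 0;
}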
That wouldn’t have mattered, as it turns out, except that, because of a bug in the nVidia video driver, the following code was added as a “Temporary Workaround”:
/* *** ALERT *** */
/* *** Temporary Workaround *** */
/* This call to vm_fault causes an early pmap level resolution */
/* of the mappings created above. Need for this is in absolute */
/* violation of the basic tenet that the pmap layer is a cache. */
/* Further, it implies a serious I/O architectural violation on */
/* the part of some user of the mapping. As of this writing, */
/* the call to vm_fault is needed because the NVIDIA driver */
/* makes a call to pmap_extract. The NVIDIA driver needs to be */
/* fixed as soon as possible. The NVIDIA driver should not */
/* need to query for this info as it should know from the doMap */
/* call where the physical memory is mapped. When a query is */
/* necessary to find a physical mapping, it should be done */
/* through an iokit call which includes the mapped memory */
/* handle. This is required for machine architecture independence.*/
if(!(kIOMemoryRedirected & _flags)) {
    vm_fault(addressMap,
             (vm_map_offset_t)address,
             VM_PROT_READ|VM_PROT_WRITE,
             FALSE, THREAD_UNINT, NULL,
             (vm_map_offset_t)0);
}
/* *** Temporary Workaround *** */
/* *** ALERT *** */
Because of this, on the second pass through this loop, vm_fault() gets called with address set to point to a non-aligned address inside the first buffer held by the IODeblocker.
Note that it isn’t possible for it to point outside of this buffer, because the segment length, held in segLen, has to be in the range 0x1 ≤ segLen ≤ 0x1ff, as it’s set up by the IODeblocker to buffer the misaligned part at the beginning of the read. The buffer itself is larger than that, so there’s no chance of any mischief, because the ability of a potential attacker to set the address at which vm_fault() gets called is extremely restricted.
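For the concrete numbers in lmh’s image, and assuming (this is my reading, not something spelled out anywhere) that the head segment simply runs from the misaligned start of the transfer up to the next 512-byte boundary, the numbers come out comfortably inside that range:

// Worked example of the segLen bound for lmh's image, on the assumption
// (mine) that the deblocker's head segment runs from the misaligned start
// of the transfer up to the next 512-byte block boundary.

#include <cstdio>

int main()
{
    const unsigned blockSize  = 512;
    const unsigned readOffset = 573;                     // 0x023d, from the Block0
    const unsigned misalign   = readOffset % blockSize;  // 61 bytes
    const unsigned headLen    = blockSize - misalign;    // 451 == 0x1c3

    std::printf("misalignment = %u\n", misalign);
    std::printf("head segment = %u (0x%x)\n", headLen, headLen);
    // 0x1 <= 0x1c3 <= 0x1ff, consistent with the bound described above.
    return 0;
}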
Anyway, it is the call to vm_fault() with a non-aligned address that is causing the kernel panic.
So, what have we learned:
It is not a memory overwrite bug.
It is not exploitable, except in that you can kernel panic a machine if you can persuade a user to double-click a damaged dmg file.
It is not, therefore, possible to use this bug for privilege elevation or to execute arbitrary code in the kernel.
In fact, all lmh has found here is a bug that causes a kernel panic. Not a security flaw. Not a memory corruption bug. Just a completely orderly kernel panic. There aren’t even any processor exceptions involved; the path to the panic is perfectly normal non-exceptional code using ordinary function calls. (All of the preceding notwithstanding, you should still turn off Safari’s “Open ‘safe’ files after downloading” option; that way, you can’t be tricked into clicking a link that could kernel panic your machine, or indeed fall foul of any other problems with the algorithm that determines how safe a file might be.)
The moral of this story? Well, to my mind there are a few:
Security research needs to be done properly and responsibly. It isn’t good enough to panic the user community over something if you haven’t done your research properly. Investigating this bug took me three days. What lmh appears to have done would have taken minutes, and is no way to verify a security flaw.
Don’t “confirm” or “verify” the findings of others without actually investigating. Particularly if you’re going to release press releases telling everyone about it. That means you, Secunia. [Secunia inform me that in fact, though they were reported by others as having verified the issue, they have not done so, and did not issue any press releases.]
Don’t accuse random people on the Internet of not knowing their stuff, like lmh did with me. Every so often you’ll find someone who really does, and you’re going to look very silly.
Credit is due to Thomas Ptacek of Matasano Security for casting an eye over my findings before I wrote this post.
I’ve just exchanged some U.S. Dollars for Pounds Sterling at a rate of two dollars to the pound. That is, one United States Dollar is now worth 50 pence.
I know that in the past it has been worth even less than that. I think the peak may have been about 13 dollars to the pound some time in 1864, but for much of my adult life it’s been somewhere in the 1.5 to 1.75 range.
Most normal people don’t really care about the exchange rates. Unfortunately, however, I’m running a software business from the United Kingdom, and a large proportion of my customers are in the United States. That makes my business extremely sensitive to the exchange rate; the difference in takings caused by exchange rate fluctuations can be literally thousands of pounds. I really can see the difference it’s making to my bottom line… my company has to sell up to 14% more in order to make the same money it was making this time last year!
If things keep going the way they are, we’ll very likely have to increase our U.S. Dollar prices.
It’s ironic that MoKB have replied to my post about dmg kernel panics by accusing me of not:
reading, checking and…[being] willing to say something that doesn’t make sense at all.
They also continue to imply that dmg crashes really are a new discovery. Apparently it’s not enough that I’m a disk utility developer and therefore stand quite a considerable chance of having seen one. No, it seems a crash only counts if I announced it publicly, which I didn’t.
In fact, I reported such a bug to Apple so long ago that it’s now no longer listed in the “My Originated Problems” page on their Bug Reporter. I think I even sent them a dmg file that triggered a kernel panic when it opened, though it’s possible that I just documented the bug in question. The particular bug that I reported was certainly fixed, I know that much.
I also quite enjoy the way that they quote my remark about crashes not necessarily being (or implying the existence of) exploits and use it to imply that I’m some kind of Mac zealot. What they don’t say is that I mentioned that this criticism wasn’t operating system specific. I’m just as unhappy about similar leaps of faith (“Oh, it crashed… well that must be an exploit then!”) on the Windows side.
They then go on to say that “the definition of a ‘crash’ in kernel-land has quite a few possible meanings”. Ignoring the ungrammatical nature of that sentence, they proceed to conflate types of crash with causes of crashes, then fail to define the term “exploit” in any kind of meaningful way.
But let’s be clear, and let’s define “crash” and “exploit” (something MoKB’s post said it was going to do, then didn’t):
- crash: …to break down or cause (a computer system or program) to break down completely…
- exploit: vt to gain control of a computer system or program (usually without authorisation) by taking advantage of a flaw in the design of that system or program. — n a tool or technique for gaining control of a computer system or program.
The former definition comes from The Chambers Dictionary. The latter is my own. I haven’t included all the other meanings of the two words, since they aren’t in dispute here. It should be quite clear from the definitions that crashes and exploits are quite distinct things, contrary to what Brian Krebs’ article implies.
Now, it is true that some crashes are indicative of an exploitable bug, and that still others can be misused as part of a deliberate denial of service attack. But a crash is not an exploit. It doesn’t even mean that an exploit is possible! Interestingly, despite disputing this fact when I originally mentioned it, MoKB go on to state it themselves, saying that
Basically exploiting a bug a [sic] in kernel-land requires some conditions to be met:
and then listing a few such conditions (there are more that they don’t mention).
Anyway, MoKB people, please understand that I have a lot of experience with low-level issues. You aren’t going to steamroller your way through me by claiming that I’m just some random Mac blogger (which isn’t true) or that I “don’t understand”. I do understand; I used to work as an embedded systems developer and even before that I was patching bugs in software I owned, breaking into computer games so I could cheat, re-writing bits of my machine’s operating system so they worked better, etc… These days, I write disk utilities, though I still tinker with other things.
Trying to mischaracterise me as someone who “doesn’t know what they’re talking about” because I happen not to agree with you is just a very good way to make yourselves look foolish. But go right ahead, if you’re determined.
Here’s my latest attempt at Saturn. It looks sharper through the eyepiece, so I’m sure I should be able to do better.

I’m thinking about buying myself a 5x Powermate, which will mean fewer things hanging off the back of the telescope, meaning that the camera will be in a better position for focusing (presently I have to have my Barlow lens almost hanging out of the back of the ’scope in order to get the CMOS sensor in the focal plane—well, to the extent that it actually is, anyway).