Alastair’s Place

Software development, Cocoa, Objective-C, life. Stuff like that.

Why Core Data Is a Bad Idea

OK, so I’m being a little mischievous here; it’s possible that, for your application, Core Data is a good fit. But carry on reading, because I want to highlight something that you may still wish to think about when deciding to use Core Data as a persistence mechanism in your app.

One of the first things you might want to consider when thinking about using Core Data is what type of persistent backing store you wish to use. Apple’s framework provides four built-in implementations (three on iOS), namely:

  • XML (OS X only)
  • Atomic
  • SQLite
  • In-memory (and so, not, in fact, persistent)

Of these, the XML store is clearly intended for debugging, so let’s discard that, and the in-memory store, while useful, is not really persistent, so we’ll ignore that for now too.

So the options are the Atomic store and the SQLite store. Both are documented as “fast”, but the Atomic store only supports reading or writing the entire object graph, which seems like quite a bit of overhead.

A lot of people therefore plump for the SQLite store.

Now, SQLite was designed to provide DBMS-style ACID consistency guarantees, and in applications where that matters, it’s important that the database does not become corrupted. As a result, it uses either a rollback journal or write-ahead logging, and it also at various points needs to guarantee that changes have genuinely been flushed to non-volatile storage, which it does using the fsync() or fcntl(fd, F_FULLFSYNC) calls.
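To make the difference between those two calls concrete, here is a rough sketch of the kind of flush involved. The helper name flush_to_disk and its full parameter are mine, purely for illustration; this is not SQLite's actual code. On OS X, fsync() hands the data to the drive, which may still be holding it in its own cache, whereas F_FULLFSYNC additionally asks the drive to flush that cache to permanent storage.

```objc
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not part of SQLite): push everything written to fd
// out to storage. Returns 0 on success, -1 on failure.
static int flush_to_disk(int fd, int full)
{
    (void)full;  // unused on platforms without F_FULLFSYNC
#ifdef F_FULLFSYNC
    // OS X: ask the drive itself to empty its write cache.
    if (full && fcntl(fd, F_FULLFSYNC) != -1)
        return 0;
    // Not every file system supports F_FULLFSYNC, so fall back to fsync().
#endif
    return fsync(fd);
}
```

Either way, the call does not return until the data has gone wherever it was asked to go, and that wait is where the cost comes from.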

If you need this level of consistency guarantee, this is a Good Thing, but it is not without overhead. As the SQLite manual notes, “some operations are as much as 50 or more times faster” when SQLite does not need to call fsync()!

Sadly, a lot of applications that use Core Data have chosen the SQLite store but do not need this kind of consistency guarantee, and the highly synchronous behaviour simply creates a performance problem. The situation is much worse on a networked set-up where users’ home folders are on a server, because the fsync() causes the server to flush data to disk… imagine how much additional (and unnecessary) disk traffic that creates if you have 20 users all logged in and using your application.

I don’t want to name and shame here, but if your application’s Core Data store is really just an index to some other data (e.g. for an e-mail client), or if it is something you could easily reconstruct (e.g. in an RSS reader), you really don’t want to burden your users with the overhead of unnecessary synchronous I/O.

Now, on OS X 10.4, the only thing you could really do about this problem was to use a different persistent store. There is a defaults setting that disables synchronous behaviour, but it’s system-wide rather than per application, so telling people to change it is just a bad idea. For many applications, the Atomic store would be just fine, but for some applications you would have had to write your own store type.
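For what it's worth, switching store type is a small change. Here is a minimal sketch, assuming that by the Atomic store we mean Core Data's built-in binary store, NSBinaryStoreType. The helper name AddAtomicStore and its parameters are made up for the example.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: attach the built-in atomic (binary) store.
// Assumption: "the Atomic store" corresponds to NSBinaryStoreType.
static NSPersistentStore *AddAtomicStore(NSPersistentStoreCoordinator *coordinator,
                                         NSURL *storeURL,
                                         NSError **outError)
{
    // An atomic store loads the whole object graph when it is added and
    // rewrites the whole file on every save; there are no SQLite pragmas.
    return [coordinator addPersistentStoreWithType:NSBinaryStoreType
                                     configuration:nil
                                               URL:storeURL
                                           options:nil
                                             error:outError];
}
```

Whether this is actually faster than SQLite for your data is something to measure rather than assume, since every save writes the entire graph.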

Thankfully, on OS X 10.5, Apple added the NSSQLitePragmasOption store option, which allows you to tell SQLite exactly what kind of behaviour you expect from it. For applications like e-mail clients and RSS readers, you probably just want to turn synchronous behaviour off completely, e.g.

NSDictionary *pragmaOptions = @{ @"synchronous": @"OFF" };
NSDictionary *storeOptions = @{ NSSQLitePragmasOption: pragmaOptions };
NSPersistentStore *store;
NSError *error = nil;

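// storeURL is a placeholder for the NSURL of your store file.
store = [myCoordinator addPersistentStoreWithType:NSSQLiteStoreType
                                    configuration:nil
                                              URL:storeURL
                                          options:storeOptions
                                            error:&error];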

The worst that can happen, if the machine loses power or the operating system crashes part-way through a write, is that your database will be corrupted, but you can easily rebuild it from other data. Your users will thank you, because your application will run significantly faster and even if they have to rebuild, it will be fairly quick.
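If you go down this route, you need some way to notice a broken store and start again. What follows is only a sketch under a big assumption: that any failure to open the store can be treated as corruption, which is reasonable only when the contents really are disposable. The function name OpenRebuildableStore is hypothetical, and repopulating the store afterwards is left to your application.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: open a disposable SQLite store with synchronous
// writes off; if it cannot be opened, delete it and create a fresh one.
static NSPersistentStore *OpenRebuildableStore(NSPersistentStoreCoordinator *coordinator,
                                               NSURL *storeURL,
                                               NSError **outError)
{
    NSDictionary *options =
        @{ NSSQLitePragmasOption: @{ @"synchronous": @"OFF" } };

    NSPersistentStore *store =
        [coordinator addPersistentStoreWithType:NSSQLiteStoreType
                                  configuration:nil
                                            URL:storeURL
                                        options:options
                                          error:outError];
    if (store)
        return store;

    // Assumption: the failure means the file is unusable. Remove the
    // database and any journal or write-ahead log files left beside it.
    NSFileManager *fm = [NSFileManager defaultManager];
    NSString *path = [storeURL path];
    for (NSString *suffix in @[ @"", @"-journal", @"-wal", @"-shm" ])
        [fm removeItemAtPath:[path stringByAppendingString:suffix] error:NULL];

    if (outError)
        *outError = nil;  // discard the error from the first attempt

    // The new store is empty; the caller must rebuild its contents.
    return [coordinator addPersistentStoreWithType:NSSQLiteStoreType
                                     configuration:nil
                                               URL:storeURL
                                           options:options
                                             error:outError];
}
```

Bear in mind that a damaged database may open without complaint and only fail later, during a fetch or a save, so a real application would probably want the same delete-and-rebuild path available there too.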

According to the SQLite docs, the default setting for PRAGMA synchronous is FULL, which is probably not necessary in 90% of cases — even if you really don’t want your data file to get corrupted, NORMAL may well give sufficient guarantees for your application. Also, Apple’s documentation indicates that the related PRAGMA fullfsync option is disabled by default as of OS X 10.5, so we needn’t worry too much about that.

An additional option you may wish to contemplate is the PRAGMA journal_mode setting. In particular:

  • The SQLite default setting, DELETE, is slower than the TRUNCATE setting, and so you might care to specify TRUNCATE on systems prior to 10.7.

  • As of OS X 10.7/iOS 5, you could set it to WAL to enable write-ahead logging rather than using a rollback journal, which will improve performance, especially if you set PRAGMA synchronous to NORMAL rather than FULL.

You can see the documentation for PRAGMA journal_mode on the SQLite website.
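Putting the two pragmas together, one could imagine a small helper that builds the store options from how much you care about the data. The enum, the function name and the walAvailable flag are all invented for illustration; only NSSQLitePragmasOption and the pragma names and values come from Core Data and SQLite.

```objc
#import <CoreData/CoreData.h>

// Hypothetical durability levels; the names are illustrative only.
typedef enum {
    StoreDurabilityRebuildable,  // an index or cache you can regenerate
    StoreDurabilityNormal,       // real data, NORMAL guarantees suffice
    StoreDurabilityFull          // SQLite's default behaviour
} StoreDurability;

// Hypothetical helper: build the options dictionary for a SQLite store.
// Pass walAvailable = YES only on OS X 10.7/iOS 5 or later.
static NSDictionary *StoreOptionsForDurability(StoreDurability durability,
                                               BOOL walAvailable)
{
    NSString *synchronous;
    switch (durability) {
        case StoreDurabilityRebuildable: synchronous = @"OFF";    break;
        case StoreDurabilityNormal:      synchronous = @"NORMAL"; break;
        default:                         synchronous = @"FULL";   break;
    }

    // Use write-ahead logging where it exists, TRUNCATE otherwise.
    NSString *journalMode = walAvailable ? @"WAL" : @"TRUNCATE";

    NSDictionary *pragmas = @{ @"synchronous":  synchronous,
                               @"journal_mode": journalMode };
    return @{ NSSQLitePragmasOption: pragmas };
}
```

The resulting dictionary is what you would pass as the options: argument when adding the store to your coordinator.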

To summarise:

  1. If you’re using Core Data, your application may not need the SQLite data store. You might actually be better off with the Atomic store.

  2. If you are using the SQLite data store, there is a very good chance that the default behaviour is overkill for your application. In many cases, you could reasonably disable synchronous disk behaviour by setting PRAGMA synchronous to OFF, and in the vast majority of cases you could make your application run faster by setting it to NORMAL without a substantial increase in the risk of data loss.

  3. You may also wish to consider altering the PRAGMA journal_mode setting.