Core Data from Scratch: Concurrency

If you're developing a small or simple application, then you probably don't see the benefit of running Core Data operations in the background. However, what would happen if you imported hundreds or thousands of records on the main thread during the first launch of your application? The consequences could be dramatic. For example, your application could be killed by Apple's watchdog for taking too long to launch.

In this article, we take a look at the dangers of using Core Data on multiple threads and we explore several solutions to tackle the problem.

1. Thread Safety

When working with Core Data, it's important to always remember that Core Data isn't thread safe. Core Data expects to be run on a single thread. This doesn't mean that every Core Data operation needs to be performed on the main thread, which is true for UIKit, but it does mean that you need to be aware which operations are executed on which threads. It also means that you need to be careful how changes from one thread are propagated to other threads.

Working with Core Data on multiple threads is actually very simple from a theoretical point of view. NSManagedObject, NSManagedObjectContext, and NSPersistentStoreCoordinator aren't thread safe, and instances of these classes should only be accessed from the thread they were created on. As you can imagine, this becomes a bit more complex in practice.

`NSManagedObject`

We already know that the NSManagedObject class isn't thread safe, but how do you access a record from different threads? An NSManagedObject instance has an objectID method that returns an instance of the NSManagedObjectID class. The NSManagedObjectID class is thread safe and an instance contains all the information a managed object context needs to fetch the corresponding managed object.

// Object ID of the Managed Object
NSManagedObjectID *objectID = [managedObject objectID];

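In the following code snippet, we ask a managed object context for the managed object that corresponds with objectID. The objectWithID: and existingObjectWithID:error: methods return a local version, local to the current thread, of the corresponding managed object. The difference is that objectWithID: always returns an object and assumes the record exists, whereas existingObjectWithID:error: returns nil and populates the error if the record cannot be found.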

// Fetch Managed Object
NSManagedObject *managedObject = [self.managedObjectContext objectWithID:objectID];

// OR

// Fetch Managed Object
NSError *error = nil;
NSManagedObject *managedObject = [self.managedObjectContext existingObjectWithID:objectID error:&error];

// Check the return value, not the error pointer, to detect failure
if (!managedObject) {
    NSLog(@"Unable to fetch managed object with object ID, %@.", objectID);
    NSLog(@"%@, %@", error, error.localizedDescription);
}

The basic rule to remember is to not pass the NSManagedObject instance from one thread to another. Instead, pass the managed object's objectID and ask the thread's managed object context for a local version of the managed object.
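To make the rule concrete, here's a minimal sketch of handing a record over to a background queue. The TSPInspectObjectInBackground function and its completion block are hypothetical names made up for this example. The sketch also assumes that the managed object has already been saved, because the background context can only find records that exist in the persistent store.

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical helper: looks up a record on a background queue.
// Only the thread-safe NSManagedObjectID crosses the thread boundary.
static void TSPInspectObjectInBackground(NSManagedObject *managedObject,
                                         void (^completion)(BOOL found)) {
    // Capture the object ID and the coordinator on the calling thread
    NSManagedObjectID *objectID = [managedObject objectID];
    NSPersistentStoreCoordinator *coordinator = managedObject.managedObjectContext.persistentStoreCoordinator;

    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        // Create a managed object context that is only used inside this block
        NSManagedObjectContext *backgroundContext = [[NSManagedObjectContext alloc] init];
        [backgroundContext setPersistentStoreCoordinator:coordinator];

        // Ask the background context for its own version of the record
        NSError *error = nil;
        NSManagedObject *localObject = [backgroundContext existingObjectWithID:objectID error:&error];

        if (!localObject) {
            NSLog(@"Unable to fetch managed object with object ID, %@. %@", objectID, error);
        }

        if (completion) {
            completion(localObject != nil);
        }
    });
}
```

Note that the completion block is invoked on the background queue, so it shouldn't touch the main managed object context directly.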

`NSManagedObjectContext`

Because the NSManagedObjectContext class isn't thread safe, we could create a managed object context for every thread that interacts with Core Data. This strategy is often referred to as thread confinement.

A common approach is to store the managed object context in the thread's dictionary, a dictionary to store thread-specific data. Take a look at the following example to see how this works in practice.

// Add Object to Thread Dictionary
NSThread *currentThread = [NSThread currentThread];
[[currentThread threadDictionary] setObject:managedObjectContext forKey:@"managedObjectContext"];
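Reading the managed object context back is the other half of this pattern. The following is a rough sketch of a get-or-create helper, assuming the application passes in its persistent store coordinator. The TSPContextForCurrentThread function and the TSPThreadContextKey constant are hypothetical and not part of Core Data.

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical key under which the context is stored in the thread dictionary
static NSString * const TSPThreadContextKey = @"managedObjectContext";

// Hypothetical helper: returns the calling thread's managed object context,
// creating and caching one the first time the thread asks for it.
static NSManagedObjectContext *TSPContextForCurrentThread(NSPersistentStoreCoordinator *coordinator) {
    NSMutableDictionary *threadDictionary = [[NSThread currentThread] threadDictionary];
    NSManagedObjectContext *context = threadDictionary[TSPThreadContextKey];

    if (!context) {
        // The context is created on, and confined to, the current thread
        context = [[NSManagedObjectContext alloc] init];
        [context setPersistentStoreCoordinator:coordinator];
        threadDictionary[TSPThreadContextKey] = context;
    }

    return context;
}
```

Because every thread has its own dictionary, two threads calling this helper never end up sharing a managed object context.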

`NSPersistentStoreCoordinator`

What about the persistent store coordinator? Do you need to create a separate persistent store coordinator for every thread? While this is possible and one of the strategies Apple used to recommend, it isn't necessary.

The NSPersistentStoreCoordinator class was designed to support multiple managed object contexts, even if those managed object contexts were created on different threads. Because the NSManagedObjectContext class locks the persistent store coordinator while accessing it, it is possible for multiple managed object contexts to use the same persistent store coordinator even if those managed object contexts live on different threads. This makes a multithreaded Core Data setup much more manageable and less complex.

2. Concurrency Strategies

So far, we've learned that you need multiple managed object contexts if you perform Core Data operations on multiple threads. The caveat, however, is that managed object contexts are unaware of each other's existence. Changes made to a managed object in one managed object context are not automatically propagated to other managed object contexts. How do we solve this problem?

There are two popular strategies that Apple recommends: notifications and parent/child managed object contexts. Let's look at each strategy and investigate its pros and cons.

The scenario we'll take as an example is an NSOperation subclass that performs work in the background and accesses Core Data on the operation's background thread. This example will show you the differences and advantages of each strategy.

Strategy 1: Notifications

Earlier in this series, I introduced you to the NSFetchedResultsController class and you learned that a managed object context posts three types of notifications:

NSManagedObjectContextObjectsDidChangeNotification: this notification is posted when one of the managed objects of the managed object context has changed
NSManagedObjectContextWillSaveNotification: this notification is posted before the managed object context performs a save operation
NSManagedObjectContextDidSaveNotification: this notification is posted after the managed object context performs a save operation

When a managed object context saves its changes to a persistent store, via the persistent store coordinator, other managed object contexts may want to know about those changes. This is very easy to do and it's even easier to include or merge the changes into another managed object context. Let's talk code.

We create a non-concurrent operation that does some work in the background and needs access to Core Data. The header would look similar to the one shown below.

#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

@interface TSPImportOperation : NSOperation

@property (strong, nonatomic) NSManagedObjectContext *mainManagedObjectContext;

@end

The operation's interface is very simple as it only contains a property for the application's main managed object context. There are several reasons for keeping a reference to the application's main managed object context. This becomes clear when we inspect the implementation of the TSPImportOperation class.

We first declare a private property, privateManagedObjectContext, of type NSManagedObjectContext. This is the managed object context that the operation will use internally to perform Core Data tasks.

#import "TSPImportOperation.h"

@interface TSPImportOperation ()

@property (strong, nonatomic) NSManagedObjectContext *privateManagedObjectContext;

@end

Because we're implementing a non-concurrent NSOperation subclass, we need to implement the main method. This is what it looks like.

- (void)main {
    // Initialize Managed Object Context
    self.privateManagedObjectContext = [[NSManagedObjectContext alloc] init];
    
    // Configure Managed Object Context
    [self.privateManagedObjectContext setPersistentStoreCoordinator:self.mainManagedObjectContext.persistentStoreCoordinator];
    
    // Add Observer
    NSNotificationCenter *nc = [NSNotificationCenter defaultCenter];
    [nc addObserver:self selector:@selector(managedObjectContextDidSave:) name:NSManagedObjectContextDidSaveNotification object:self.privateManagedObjectContext];
    
    // Do Some Work
    // ...
    
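    if ([self.privateManagedObjectContext hasChanges]) {
        // Save Changes
        NSError *error = nil;
        if (![self.privateManagedObjectContext save:&error]) {
            NSLog(@"Unable to save private managed object context: %@, %@", error, error.localizedDescription);
        }
    }
    
    // Remove Observer (the notification is posted synchronously during the save)
    [nc removeObserver:self name:NSManagedObjectContextDidSaveNotification object:self.privateManagedObjectContext];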
}

There are a few important details that need to be clarified. We initialize the private managed object context and set its persistent store coordinator property using the mainManagedObjectContext object. This is perfectly fine, because we don't access the mainManagedObjectContext, we only ask it for its reference to the application's persistent store coordinator. We don't violate the thread confinement rule.

It is essential to initialize the private managed object context in the operation's main method, because this method is executed on the background thread on which the operation runs. Can't we initialize the managed object context in the operation's init method? The answer is no. The operation's init method is run on the thread on which the TSPImportOperation is initialized, which is most likely the main thread. This would defeat the purpose of a private managed object context.

In the operation's main method, we add the TSPImportOperation instance as an observer of any NSManagedObjectContextDidSaveNotification notifications posted by the private managed object context.

We then do the work the operation was created for and save the changes of the private managed object context, which will trigger an NSManagedObjectContextDidSaveNotification notification. Let's take a look at what happens in the managedObjectContextDidSave: method.

#pragma mark -
#pragma mark Notification Handling
- (void)managedObjectContextDidSave:(NSNotification *)notification {
    dispatch_async(dispatch_get_main_queue(), ^{
        [self.mainManagedObjectContext mergeChangesFromContextDidSaveNotification:notification];
    });
}

As you can see, its implementation is short and simple. We call mergeChangesFromContextDidSaveNotification: on the main managed object context, passing in the notification object. The notification contains the changes (inserts, updates, and deletes) of the private managed object context. It is key to call this method on the thread the main managed object context was created on, the main thread. That's why we dispatch this call to the main queue.
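If you're curious what the notification actually carries, its userInfo dictionary can be inspected before merging. The sketch below relies on the NSInsertedObjectsKey, NSUpdatedObjectsKey, and NSDeletedObjectsKey keys that Core Data defines. The TSPSummaryOfSaveNotification function is a hypothetical helper written for illustration.

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical helper: summarizes the changes a did-save notification carries.
static NSString *TSPSummaryOfSaveNotification(NSNotification *notification) {
    NSDictionary *userInfo = notification.userInfo;

    // Each key maps to a set of managed objects; a missing key means no changes of that kind
    NSSet *inserted = userInfo[NSInsertedObjectsKey];
    NSSet *updated = userInfo[NSUpdatedObjectsKey];
    NSSet *deleted = userInfo[NSDeletedObjectsKey];

    return [NSString stringWithFormat:@"inserted: %lu, updated: %lu, deleted: %lu",
            (unsigned long)inserted.count,
            (unsigned long)updated.count,
            (unsigned long)deleted.count];
}
```

The managed objects in those sets belong to the managed object context that was saved, so the helper only counts them and never reads their properties from another thread.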

Putting the TSPImportOperation class to use is as simple as initializing an instance, setting its mainManagedObjectContext property, and adding the operation to an operation queue.

// Initialize Import Operation
TSPImportOperation *operation = [[TSPImportOperation alloc] init];

// Configure Import Operation
[operation setMainManagedObjectContext:self.managedObjectContext];

// Add to Operation Queue
[self.operationQueue addOperation:operation];

Strategy 2: Parent/Child Managed Object Contexts

Since iOS 5, there's an even better, more elegant strategy. Let's revisit the TSPImportOperation class and leverage parent/child managed object contexts. The concept behind parent/child managed object contexts is simple but powerful. Let me explain how it works.

A child managed object context is dependent on its parent managed object context for saving its changes to the corresponding persistent store. In fact, a child managed object context doesn't have access to a persistent store coordinator. Whenever a child managed object context is saved, the changes it contains are pushed to the parent managed object context. There's no need to use notifications to manually merge the changes into the main or parent managed object context.

Another benefit is performance. Because the child managed object context doesn't have access to the persistent store coordinator, the changes aren't pushed to the latter when the child managed object context is saved. Instead, the changes are pushed to the parent managed object context, dirtying it. The changes are not automatically propagated to the persistent store coordinator.

Managed object contexts can be nested. A child managed object context can have a child managed object context of its own. The same rules apply. However, it's important to remember that the changes that are pushed up to the parent managed object context are not pushed down to any other child managed object contexts. If child A pushed its changes to its parent, then child B is unaware of these changes.

Creating a child managed object context is slightly different from what we've seen so far. A child managed object context uses a different initializer, initWithConcurrencyType:. The concurrency type the initializer accepts defines the managed object context's threading model. Let's look at each concurrency type.

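NSMainQueueConcurrencyType: The managed object context is associated with the main queue and should only be accessed from the main thread. Accessing it from any other thread is a programming error, and you shouldn't count on Core Data throwing an exception to warn you.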
NSPrivateQueueConcurrencyType: When creating a managed object context with a concurrency type of NSPrivateQueueConcurrencyType, the managed object context is associated with a private queue and it can only be accessed from that private queue.
NSConfinementConcurrencyType: This is the concurrency type that corresponds with the thread confinement concept we explored earlier. If you create a managed object context using the init method, the concurrency type of that managed object context is NSConfinementConcurrencyType.
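A managed object context exposes the concurrency type it was created with through its concurrencyType property. As a small sketch, the hypothetical TSPNameOfConcurrencyType function below maps that value to a readable name, which can be handy when debugging a setup with several managed object contexts.

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical helper: returns the name of a context's concurrency type.
static NSString *TSPNameOfConcurrencyType(NSManagedObjectContext *context) {
    switch (context.concurrencyType) {
        case NSMainQueueConcurrencyType:
            return @"NSMainQueueConcurrencyType";
        case NSPrivateQueueConcurrencyType:
            return @"NSPrivateQueueConcurrencyType";
        case NSConfinementConcurrencyType:
            return @"NSConfinementConcurrencyType";
    }

    return @"Unknown";
}
```

For example, passing in a managed object context created with initWithConcurrencyType:NSPrivateQueueConcurrencyType returns @"NSPrivateQueueConcurrencyType".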

There are two key methods that were added to the Core Data framework when Apple introduced parent/child managed object contexts, performBlock: and performBlockAndWait:. Both methods will make your life much easier. When you call performBlock: on a managed object context and pass in a block of code to execute, Core Data makes sure that the block is executed on the correct queue. In the case of the NSPrivateQueueConcurrencyType concurrency type, this means that the block is executed on the private queue of that managed object context.

The difference between performBlock: and performBlockAndWait: is simple. The performBlock: method doesn't block the current thread. It accepts the block, schedules it for execution on the correct queue, and continues with the execution of the next statement.

The performBlockAndWait: method, however, is blocking. The thread from which performBlockAndWait: is called waits for the block that is passed to the method to finish before executing the next statement. The advantage is that nested calls to performBlockAndWait: are executed in order.
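The difference is easiest to see side by side. The following sketch records the order in which things happen. The TSPPerformBlockOrder function is hypothetical, and it assumes that it's handed a managed object context created with NSPrivateQueueConcurrencyType. The semaphore is only there so the example can wait for the asynchronous block before returning.

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical helper: records the order of events for both methods.
// Assumes a private queue context; with a main queue context called from
// the main thread, waiting on the semaphore below would deadlock.
static NSArray *TSPPerformBlockOrder(NSManagedObjectContext *context) {
    NSMutableArray *events = [NSMutableArray array];
    dispatch_semaphore_t done = dispatch_semaphore_create(0);

    // performBlockAndWait: blocks the caller until the block has finished
    [context performBlockAndWait:^{
        [events addObject:@"synchronous block"];
    }];
    [events addObject:@"after synchronous block"];

    // performBlock: returns immediately, the block runs later on the context's queue
    [context performBlock:^{
        [events addObject:@"asynchronous block"];
        dispatch_semaphore_signal(done);
    }];

    // Wait for the asynchronous block so that the result is complete
    dispatch_semaphore_wait(done, DISPATCH_TIME_FOREVER);

    return [events copy];
}
```

In a real application, you wouldn't wait on a semaphore like this. You'd let performBlock: do its work and continue, for example, by notifying the main queue from inside the block.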

To end this article, I'd like to refactor the TSPImportOperation class to take advantage of parent/child managed object contexts. You'll quickly notice that it greatly simplifies the TSPImportOperation subclass.

The header remains unchanged, but the main method changes quite a bit. Take a look at its updated implementation below.

- (void)main {
    // Initialize Managed Object Context
    self.privateManagedObjectContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    
    // Configure Managed Object Context
    [self.privateManagedObjectContext setParentContext:self.mainManagedObjectContext];
    
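    // Run the work on the context's private queue and wait for it to finish
    [self.privateManagedObjectContext performBlockAndWait:^{
        // Do Some Work
        // ...
        
        if ([self.privateManagedObjectContext hasChanges]) {
            // Save Changes
            NSError *error = nil;
            if (![self.privateManagedObjectContext save:&error]) {
                NSLog(@"Unable to save private managed object context: %@, %@", error, error.localizedDescription);
            }
        }
    }];
}

That's it. The main managed object context is the parent of the private managed object context. This requires the main managed object context to have been created with a queue-based concurrency type, NSMainQueueConcurrencyType in this case. Because the private managed object context is tied to its own queue, we access it from within a block that we pass to performBlockAndWait:. Note that we don't set the persistentStoreCoordinator property of the private managed object context and we don't add the operation as an observer for NSManagedObjectContextDidSaveNotification notifications. When the private managed object context is saved, the changes are automatically pushed to its parent managed object context. Core Data ensures that this happens on the correct queue. It's up to the main managed object context, the parent managed object context, to push the changes to the persistent store coordinator.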
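That last step is easy to forget, so here's a minimal sketch of saving a child managed object context and then its parent. The TSPSaveContext and TSPSaveChildAndParent functions are hypothetical helpers, and the sketch assumes both managed object contexts were created with a queue-based concurrency type.

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical helper: saves a queue-based context on its own queue.
static BOOL TSPSaveContext(NSManagedObjectContext *context) {
    __block BOOL success = YES;

    [context performBlockAndWait:^{
        NSError *error = nil;
        if ([context hasChanges] && ![context save:&error]) {
            NSLog(@"Unable to save managed object context: %@, %@", error, error.localizedDescription);
            success = NO;
        }
    }];

    return success;
}

// Hypothetical helper: pushes the child's changes to its parent and then
// asks the parent to push them on, ultimately to the persistent store.
static BOOL TSPSaveChildAndParent(NSManagedObjectContext *childContext) {
    if (!TSPSaveContext(childContext)) {
        return NO;
    }

    NSManagedObjectContext *parentContext = childContext.parentContext;
    return parentContext ? TSPSaveContext(parentContext) : YES;
}
```

The helper uses performBlockAndWait: to keep the example easy to follow. If the parent is the main managed object context, make sure the main thread isn't waiting for the background work to finish, or use performBlock: for the parent's save instead.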

Conclusion

Concurrency isn't easy to grasp or implement, but it's naive to think that you'll never come across a situation in which you need to perform Core Data operations on a background thread.

In the next two articles, I'll tell you about iOS 8 and Core Data. Apple introduced a number of new APIs in iOS 8 and OS X 10.10, including batch updating and asynchronous fetching.
