Core Data and Swift: Concurrency

If you're developing a small or simple application, then you probably don't see the benefit of running Core Data operations in the background. However, what would happen if you imported hundreds or thousands of records on the main thread during the first launch of your application? The consequences could be dramatic. For example, your application could be killed by Apple's watchdog for taking too long to launch.

In this article, we take a look at the dangers of using Core Data on multiple threads and explore several solutions to tackle the problem.

1. Thread Safety

When working with Core Data, it's important to always remember that Core Data isn't thread safe. Core Data expects to be run on a single thread. This doesn't mean that every Core Data operation needs to be performed on the main thread, as is the case for UIKit, but it does mean that you need to be mindful of which operations are executed on which threads. It also means that you need to be careful about how changes from one thread are propagated to other threads.

Working with Core Data on multiple threads is actually very simple from a theoretical point of view. NSManagedObject, NSManagedObjectContext, and NSPersistentStoreCoordinator aren't thread safe. Instances of these classes should only be accessed from the thread they were created on. As you can imagine, this becomes a bit more complex in practice.

`NSManagedObject`

We already know that NSManagedObject isn't thread safe, but how do you access a record from different threads? An NSManagedObject instance has an objectID property that returns an instance of the NSManagedObjectID class. The NSManagedObjectID class is thread safe and an instance of this class contains all the information a managed object context needs to fetch the corresponding managed object.

// Object ID Managed Object
let objectID = managedObject.objectID

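In the following code snippet, we ask a managed object context for the managed object that corresponds with objectID. The objectWithID(_:) and existingObjectWithID(_:) methods both return a version of the managed object that is local to that managed object context, and therefore to the current thread. The difference is that objectWithID(_:) always returns an object, even if no matching record exists, whereas existingObjectWithID(_:) throws an error if the record cannot be found.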

// Fetch Managed Object
let managedObject = managedObjectContext.objectWithID(objectID)

// OR

do {
    let managedObject = try managedObjectContext.existingObjectWithID(objectID)
} catch {
    let fetchError = error as NSError
    print("\(fetchError), \(fetchError.userInfo)")
}

The basic rule to remember is not to pass the NSManagedObject instance from one thread to another. Instead, pass the managed object's objectID and ask the thread's managed object context for a local version of the record.
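To make the rule concrete, here is a minimal sketch of handing a record to another managed object context. It uses performBlock(_:), which is covered later in this article, and the same Swift syntax as the other snippets. The useLocalCopy(_:inContext:handler:) helper and its parameter names are hypothetical. The sketch assumes that the record has already been saved, so that its object ID is permanent, and that both managed object contexts share the same persistent store coordinator.

```swift
import CoreData

// Hypothetical helper, not part of Core Data. `context` is assumed to be a
// private queue context that shares a persistent store coordinator with the
// context that owns the record identified by `objectID`.
func useLocalCopy(objectID: NSManagedObjectID, inContext context: NSManagedObjectContext, handler: (NSManagedObject) -> Void) {
    // Run on the queue that belongs to `context`
    context.performBlock {
        do {
            // Ask the context for its own, local version of the record
            let localObject = try context.existingObjectWithID(objectID)

            // The handler is invoked on the context's queue
            handler(localObject)
        } catch {
            let fetchError = error as NSError
            print("\(fetchError), \(fetchError.userInfo)")
        }
    }
}
```

A caller reads managedObject.objectID on the thread that owns managedObject and passes only that object ID to useLocalCopy(_:inContext:handler:). The managed object itself never crosses a thread boundary.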

`NSManagedObjectContext`

Because the NSManagedObjectContext class isn't thread safe, we could create a managed object context for every thread that interacts with Core Data. This strategy is often referred to as thread confinement.

A common approach is to store the managed object context in the thread's dictionary, a dictionary for storing data that is specific to the thread. Take a look at the following example to see how this works in practice.

// Add Object to Thread Dictionary
let currentThread = NSThread.currentThread()
currentThread.threadDictionary.setObject(managedObjectContext, forKey: "managedObjectContext")

Not too long ago, Apple recommended this approach. Even though it works fine, there is another and better option that Apple recommends nowadays. We'll look at this option in a few moments.

`NSPersistentStoreCoordinator`

What about the persistent store coordinator? Do you need to create a separate persistent store coordinator for every thread? While this is possible, and one of the strategies Apple used to recommend, it isn't necessary.

The NSPersistentStoreCoordinator class was designed to support multiple managed object contexts, even if those managed object contexts were created on different threads. Because the NSManagedObjectContext class locks the persistent store coordinator while accessing it, it is possible for multiple managed object contexts to use the same persistent store coordinator even if those managed object contexts live on different threads. This makes a multithreaded Core Data setup much more manageable and less complex.

2. Concurrency Strategies

So far, we've learned that you need multiple managed object contexts if you perform Core Data operations on multiple threads. The caveat, however, is that managed object contexts are unaware of each other's existence. Changes made to a managed object in one managed object context are not automatically propagated to other managed object contexts. How do we solve this problem?

There are two popular strategies that Apple recommends: notifications and parent/child managed object contexts. Let's look at each strategy and investigate its pros and cons.

The scenario we'll take as an example is an NSOperation subclass that performs work in the background and accesses Core Data on the operation's background thread. This example will show you the differences and advantages of each strategy.

Strategy 1: Notifications

Earlier in this series, I introduced you to the NSFetchedResultsController class and you learned that a managed object context posts three types of notifications:

NSManagedObjectContextObjectsDidChangeNotification: This notification is posted when one of the managed objects of the managed object context has changed.
NSManagedObjectContextWillSaveNotification: This notification is posted before the managed object context performs a save operation.
NSManagedObjectContextDidSaveNotification: This notification is posted after the managed object context performs a save operation.

When a managed object context saves its changes to a persistent store, via the persistent store coordinator, other managed object contexts may want to know about those changes. This is very easy to do and it's even easier to include or merge the changes into another managed object context. Let's talk code.

We create a non-concurrent operation that does some work in the background and needs access to Core Data. This is what the implementation of the NSOperation subclass could look like.

import UIKit
import CoreData

class Operation: NSOperation {

    let mainManagedObjectContext: NSManagedObjectContext
    var privateManagedObjectContext: NSManagedObjectContext!
    
    init(managedObjectContext: NSManagedObjectContext) {
        mainManagedObjectContext = managedObjectContext
        
        super.init()
    }
    
    override func main() {
        // Initialize Managed Object Context
        privateManagedObjectContext = NSManagedObjectContext(concurrencyType: .PrivateQueueConcurrencyType)
        
        // Configure Managed Object Context
        privateManagedObjectContext.persistentStoreCoordinator = mainManagedObjectContext.persistentStoreCoordinator
        
        // Add Observer
        let notificationCenter = NSNotificationCenter.defaultCenter()
        notificationCenter.addObserver(self, selector: "managedObjectContextDidSave:", name: NSManagedObjectContextDidSaveNotification, object: privateManagedObjectContext)
        
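        // Do Some Work on the Private Queue
        // A private queue context may only be used from within
        // performBlock(_:) or performBlockAndWait(_:)
        privateManagedObjectContext.performBlockAndWait {
            // ...
            
            if self.privateManagedObjectContext.hasChanges {
                do {
                    try self.privateManagedObjectContext.save()
                } catch {
                    // Error Handling
                    // ...
                }
            }
        }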
    }

}

There are a few important details that need to be clarified. We initialize the private managed object context and set its persistent store coordinator property using the mainManagedObjectContext object. This is perfectly fine, because we don't access the mainManagedObjectContext, we only ask it for its reference to the application's persistent store coordinator. We don't violate the thread confinement rule.

We initialize the private managed object context in the operation's main() method, which is executed on the background thread on which the operation runs. Can't we initialize the managed object context in the operation's init(managedObjectContext:) method? For a managed object context that relies on thread confinement, the answer is no. The operation's init(managedObjectContext:) method is run on the thread on which the Operation instance is initialized, which is most likely the main thread, and that would tie the managed object context to the wrong thread. A managed object context with a private queue is less strict. It may be created on any thread, as long as every interaction with it goes through performBlock(_:) or performBlockAndWait(_:).

In the operation's main() method, we add the Operation instance as an observer of any NSManagedObjectContextDidSaveNotification notifications posted by the private managed object context.

We then do the work the operation was created for and save the changes of the private managed object context, which triggers an NSManagedObjectContextDidSaveNotification notification. Let's take a look at what happens in the managedObjectContextDidSave(_:) method.

// MARK: -
// MARK: Notification Handling
func managedObjectContextDidSave(notification: NSNotification) {
    dispatch_async(dispatch_get_main_queue()) { () -> Void in
        self.mainManagedObjectContext.mergeChangesFromContextDidSaveNotification(notification)
    }
}

As you can see, its implementation is short and simple. We call mergeChangesFromContextDidSaveNotification(_:) on the main managed object context, passing in the notification object. As I mentioned earlier, the notification contains the changes, inserts, updates, and deletes, of the managed object context that posted the notification.

It is key to call this method on the thread the main managed object context was created on, the main thread. That's why we dispatch this call to the queue of the main thread. To make this easier and more transparent, you can use performBlock(_:) or performBlockAndWait(_:) to ensure merging the changes takes place on the queue of the managed object context. We'll talk more about these methods later in this article.

// MARK: -
// MARK: Notification Handling
func managedObjectContextDidSave(notification: NSNotification) {
    mainManagedObjectContext.performBlock { () -> Void in
        self.mainManagedObjectContext.mergeChangesFromContextDidSaveNotification(notification)
    }
}

Putting the Operation class to use is as simple as initializing an instance, passing a managed object context, and adding the operation to an operation queue.

// Initialize Import Operation
let operation = Operation(managedObjectContext: managedObjectContext)

// Add to Operation Queue
operationQueue.addOperation(operation)

Strategy 2: Parent/Child Managed Object Contexts

Since iOS 5, there's an even better, more elegant strategy. Let's revisit the Operation class and leverage parent/child managed object contexts. The concept behind parent/child managed object contexts is simple but powerful. Let me explain how it works.

A child managed object context is dependent on its parent managed object context for saving its changes to the corresponding persistent store. In fact, a child managed object context doesn't have access to a persistent store coordinator. Whenever a child managed object context is saved, the changes it contains are pushed to the parent managed object context. There's no need to use notifications to manually merge the changes into the main or parent managed object context.

Another benefit is performance. Because the child managed object context doesn't have access to the persistent store coordinator, the changes aren't pushed to the latter when the child managed object context is saved. Instead, the changes are pushed to the parent managed object context, making it dirty. The changes are not automatically propagated to the persistent store coordinator.

Using Parent/Child Managed Object Contexts

Managed object contexts can be nested. A child managed object context can have a child managed object context of its own. The same rules apply. However, it's important to remember that the changes that are pushed up to the parent managed object context are not pushed down to any other child managed object contexts. If child A pushes its changes to its parent, then child B is unaware of these changes.

Creating a child managed object context is only slightly different from what we've seen so far. We initialize a child managed object context by invoking init(concurrencyType:). The concurrency type the initializer accepts defines the managed object context's threading model. Let's look at each concurrency type.

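MainQueueConcurrencyType: The managed object context is associated with the main queue and should only be accessed from the main thread. Accessing it from any other thread is a programming error that can lead to crashes or corrupted data. Core Data only flags such access reliably when concurrency debugging is enabled with the -com.apple.CoreData.ConcurrencyDebug 1 launch argument.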
PrivateQueueConcurrencyType: When creating a managed object context with a concurrency type of PrivateQueueConcurrencyType, the managed object context is associated with a private queue and it can only be accessed from that private queue.
ConfinementConcurrencyType: This is the concurrency type that corresponds with the thread confinement concept we explored earlier. If you create a managed object context using init(), the concurrency type of that managed object context is ConfinementConcurrencyType. Apple has deprecated this concurrency type as of iOS 9. This also means that init() is deprecated as of iOS 9.

Two key methods were added to the Core Data framework when Apple introduced parent/child managed object contexts: performBlock(_:) and performBlockAndWait(_:). Both methods will make your life much easier. When you call performBlock(_:) on a managed object context and pass in a block of code to execute, Core Data makes sure that the block is executed on the correct queue. In the case of the PrivateQueueConcurrencyType concurrency type, this means that the block is executed on the private queue of that managed object context.

The difference between performBlock(_:) and performBlockAndWait(_:) is simple. The performBlock(_:) method doesn't block the current thread. It accepts the block, schedules it for execution on the correct queue, and continues with the execution of the next statement.

The performBlockAndWait(_:) method, however, is blocking. The thread from which performBlockAndWait(_:) is called waits for the block that is passed to the method to finish before executing the next statement. The advantage is that nested calls to performBlockAndWait(_:) are executed in order.
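The difference is easiest to see in the order in which statements run. The following sketch, written in the same Swift syntax as the other snippets, records that order in an array. The demonstrateBlocks(_:) function is hypothetical and only serves as an illustration. It expects a managed object context with a private queue.

```swift
import CoreData

// Hypothetical demonstration function, not part of Core Data.
// `context` is assumed to be a private queue managed object context.
func demonstrateBlocks(context: NSManagedObjectContext) -> [String] {
    var events = [String]()

    // Blocking: the block has finished by the time the call returns
    context.performBlockAndWait {
        events.append("inside performBlockAndWait")
    }
    events.append("after performBlockAndWait")

    // Non-blocking: the block is only scheduled on the private queue,
    // it runs at some later point, possibly after this function returns
    context.performBlock {
        print("inside performBlock")
    }
    events.append("after performBlock")

    return events
}
```

The returned array always contains "inside performBlockAndWait", "after performBlockAndWait", and "after performBlock", in that order. The message printed from within performBlock(_:) may appear before or after the function returns, which is why the sketch doesn't record it in the array.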

To end this article, I'd like to refactor the Operation class to take advantage of parent/child managed object contexts. You'll quickly notice that it greatly simplifies the NSOperation subclass we created. The main() method changes quite a bit. Take a look at its updated implementation below.

override func main() {
    // Initialize Managed Object Context
    privateManagedObjectContext = NSManagedObjectContext(concurrencyType: .PrivateQueueConcurrencyType)
    
    // Configure Managed Object Context
    privateManagedObjectContext.parentContext = mainManagedObjectContext
    
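    // Do Some Work on the Private Queue
    // A private queue context may only be used from within
    // performBlock(_:) or performBlockAndWait(_:)
    privateManagedObjectContext.performBlockAndWait {
        // ...
        
        if self.privateManagedObjectContext.hasChanges {
            do {
                try self.privateManagedObjectContext.save()
            } catch {
                // Error Handling
                // ...
            }
        }
    }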
}

That's it. The main managed object context is the parent of the private managed object context. Note that we don't set the persistentStoreCoordinator property of the private managed object context and we don't add the operation as an observer for NSManagedObjectContextDidSaveNotification notifications. When the private managed object context is saved, the changes are automatically pushed to its parent managed object context. Core Data ensures that this happens on the correct thread. It's up to the main managed object context, the parent managed object context, to push the changes to the persistent store coordinator.
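That last step is easy to forget. A minimal sketch of what it could look like is shown below, in the same Swift syntax as the other snippets. The saveToPersistentStore(_:) helper is hypothetical, and the sketch assumes that the parent managed object context is tied directly to the persistent store coordinator, as is the case for the main managed object context in this article.

```swift
import CoreData

// Hypothetical helper, not part of Core Data. Saves `childContext` and then
// its parent. The parent is assumed to be connected to the persistent store
// coordinator, so saving it writes the changes to the persistent store.
func saveToPersistentStore(childContext: NSManagedObjectContext) {
    childContext.performBlock {
        do {
            // Push the child's changes up to its parent
            if childContext.hasChanges {
                try childContext.save()
            }
        } catch {
            let saveError = error as NSError
            print("\(saveError), \(saveError.userInfo)")
            return
        }

        guard let parentContext = childContext.parentContext else { return }

        // Save the parent on its own queue
        parentContext.performBlock {
            guard parentContext.hasChanges else { return }

            do {
                try parentContext.save()
            } catch {
                let saveError = error as NSError
                print("\(saveError), \(saveError.userInfo)")
            }
        }
    }
}
```

Both saves are scheduled with performBlock(_:), which means that the calling thread is never blocked and each managed object context is only touched on its own queue.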

Conclusion

Concurrency isn't easy to grasp or implement, but it's naive to think that you'll never come across a situation in which you need to perform Core Data operations on a background thread.
