If you're developing a small or simple application, then you probably don't see the benefit of running Core Data operations in the background. However, what would happen if you imported hundreds or thousands of records on the main thread during the first launch of your application? The consequences could be dramatic. For example, your application could be killed by Apple's watchdog for taking too long to launch.
In this article, we take a look at the dangers of using Core Data on multiple threads and explore several solutions to the problem.
1. Thread Safety
When working with Core Data, it's important to always remember that Core Data isn't thread safe. Core Data expects to be run on a single thread. This doesn't mean that every Core Data operation needs to be performed on the main thread, as is the case for UIKit, but it does mean that you need to be mindful of which operations are executed on which threads. It also means that you need to be careful how changes from one thread are propagated to other threads.
Working with Core Data on multiple threads is actually very simple from a theoretical point of view. NSManagedObject, NSManagedObjectContext, and NSPersistentStoreCoordinator aren't thread safe. Instances of these classes should only be accessed from the thread they were created on. As you can imagine, this becomes a bit more complex in practice.
NSManagedObject
We already know that NSManagedObject isn't thread safe, but how do you access a record from different threads? An NSManagedObject instance has an objectID property that returns an instance of the NSManagedObjectID class. The NSManagedObjectID class is thread safe and an instance of this class contains all the information a managed object context needs to fetch the corresponding managed object.
// Object ID Managed Object
let objectID = managedObject.objectID
In the following code snippet, we ask a managed object context for the managed object that corresponds with objectID. The objectWithID(_:) and existingObjectWithID(_:) methods return a local version (local to the current thread) of the corresponding managed object.
// Fetch Managed Object
let managedObject = managedObjectContext.objectWithID(objectID)

// OR
do {
    let managedObject = try managedObjectContext.existingObjectWithID(objectID)
} catch {
    let fetchError = error as NSError
    print("\(fetchError), \(fetchError.userInfo)")
}
The basic rule to remember is not to pass the NSManagedObject instance from one thread to another. Instead, pass the managed object's objectID and ask the thread's managed object context for a local version of the record.
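The hand-off described above can be sketched end to end. This is only an illustration under a few assumptions: it uses the current Swift spelling of the API (existingObject(with:) and performAndWait(_:) instead of existingObjectWithID(_:) and performBlockAndWait(_:)), an in-memory store, and a hypothetical Note entity with a title attribute that is built in code by a hypothetical helper, makeInMemoryCoordinator().

```swift
import CoreData

// Hypothetical helper: a one-entity model ("Note" with a "title" string)
// backed by an in-memory store, so the sketch needs no model file.
func makeInMemoryCoordinator() throws -> NSPersistentStoreCoordinator {
    let title = NSAttributeDescription()
    title.name = "title"
    title.attributeType = .stringAttributeType
    title.isOptional = true

    let note = NSEntityDescription()
    note.name = "Note"
    note.properties = [title]

    let model = NSManagedObjectModel()
    model.entities = [note]

    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)
    _ = try coordinator.addPersistentStore(ofType: NSInMemoryStoreType,
                                           configurationName: nil,
                                           at: nil,
                                           options: nil)
    return coordinator
}

let coordinator = try makeInMemoryCoordinator()

// Context used on the main thread.
let mainContext = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
mainContext.persistentStoreCoordinator = coordinator

let note = NSEntityDescription.insertNewObject(forEntityName: "Note", into: mainContext)
note.setValue("Hello", forKey: "title")
try mainContext.save()

// Only the object ID crosses the thread boundary, never the managed object.
let noteID = note.objectID

// Second context on the same coordinator, used for background work.
let backgroundContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
backgroundContext.persistentStoreCoordinator = coordinator

var titleSeenInBackground: String?
backgroundContext.performAndWait {
    // Ask the background context for its own, local version of the record.
    let localNote = try? backgroundContext.existingObject(with: noteID)
    titleSeenInBackground = localNote?.value(forKey: "title") as? String
}
```

The object ID is read after the save because a newly inserted object only has a temporary ID until it has been saved.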
NSManagedObjectContext
Because the NSManagedObjectContext class isn't thread safe, we could create a managed object context for every thread that interacts with Core Data. This strategy is often referred to as thread confinement.
A common approach is to store the managed object context in the thread's dictionary, a dictionary to store data that is specific to the thread. Take a look at the following example to see how this works in practice.
// Add Object to Thread Dictionary
let currentThread = NSThread.currentThread()
currentThread.threadDictionary.setObject(managedObjectContext, forKey: "managedObjectContext")
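The reason the thread dictionary suits thread confinement is that every thread has its own dictionary. The sketch below illustrates just that property in current Swift syntax. A plain string stands in for the managed object context, and the Observation class is a hypothetical helper for reading the result back safely.

```swift
import Foundation

// Hypothetical helper: thread-safe holder for what the worker thread observes.
final class Observation: @unchecked Sendable {
    private let lock = NSLock()
    private var found = false

    func set(_ value: Bool) {
        lock.lock()
        found = value
        lock.unlock()
    }

    var workerFoundEntry: Bool {
        lock.lock()
        defer { lock.unlock() }
        return found
    }
}

let key = "managedObjectContext"

// Placeholder value standing in for a managed object context (assumption).
Thread.current.threadDictionary.setObject("context owned by the calling thread",
                                          forKey: key as NSString)

let observation = Observation()
let finished = DispatchSemaphore(value: 0)

let worker = Thread {
    // The worker thread looks the key up in its own thread dictionary.
    let entry = Thread.current.threadDictionary.object(forKey: key)
    observation.set(entry != nil)
    finished.signal()
}
worker.start()
finished.wait()

// The calling thread still finds the entry it stored earlier.
let callerStillHasEntry = Thread.current.threadDictionary.object(forKey: key) != nil
```

Because the worker never finds the caller's entry, each thread has to create and store its own managed object context, which is exactly what thread confinement asks for.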
Not too long ago, Apple recommended this approach. Even though it works fine, there is another, better option that Apple recommends nowadays. We'll look at this option in a few moments.
NSPersistentStoreCoordinator
What about the persistent store coordinator? Do you need to create a separate persistent store coordinator for every thread? While this is possible and one of the strategies Apple used to recommend, it isn't necessary.
The NSPersistentStoreCoordinator class was designed to support multiple managed object contexts, even if those managed object contexts were created on different threads. Because the NSManagedObjectContext class locks the persistent store coordinator while accessing it, it is possible for multiple managed object contexts to use the same persistent store coordinator even if those managed object contexts live on different threads. This makes a multithreaded Core Data setup much more manageable and less complex.
2. Concurrency Strategies
So far, we've learned that you need multiple managed object contexts if you perform Core Data operations on multiple threads. The caveat, however, is that managed object contexts are unaware of each other's existence. Changes made to a managed object in one managed object context are not automatically propagated to other managed object contexts. How do we solve this problem?
There are two popular strategies that Apple recommends: notifications and parent/child managed object contexts. Let's look at each strategy and investigate its pros and cons.
The scenario we'll take as an example is an NSOperation subclass that performs work in the background and accesses Core Data on the operation's background thread. This example will show you the differences and advantages of each strategy.
Strategy 1: Notifications
Earlier in this series, I introduced you to the NSFetchedResultsController class and you learned that a managed object context posts three types of notifications:
- NSManagedObjectContextObjectsDidChangeNotification: This notification is posted when one of the managed objects of the managed object context has changed.
- NSManagedObjectContextWillSaveNotification: This notification is posted before the managed object context performs a save operation.
- NSManagedObjectContextDidSaveNotification: This notification is posted after the managed object context performs a save operation.
When a managed object context saves its changes to a persistent store, via the persistent store coordinator, other managed object contexts may want to know about those changes. This is very easy to do and it's even easier to include or merge the changes into another managed object context. Let's talk code.
We create a non-concurrent operation that does some work in the background and needs access to Core Data. This is what the implementation of the NSOperation subclass could look like.
import UIKit
import CoreData

class Operation: NSOperation {

    let mainManagedObjectContext: NSManagedObjectContext
    var privateManagedObjectContext: NSManagedObjectContext!

    init(managedObjectContext: NSManagedObjectContext) {
        mainManagedObjectContext = managedObjectContext

        super.init()
    }

    override func main() {
        // Initialize Managed Object Context
        privateManagedObjectContext = NSManagedObjectContext(concurrencyType: .PrivateQueueConcurrencyType)

        // Configure Managed Object Context
        privateManagedObjectContext.persistentStoreCoordinator = mainManagedObjectContext.persistentStoreCoordinator

        // Add Observer
        let notificationCenter = NSNotificationCenter.defaultCenter()
        notificationCenter.addObserver(self, selector: "managedObjectContextDidSave:", name: NSManagedObjectContextDidSaveNotification, object: privateManagedObjectContext)

        // A private queue context should only be touched on its own queue,
        // so the work and the save are wrapped in performBlockAndWait(_:)
        privateManagedObjectContext.performBlockAndWait {
            // Do Some Work
            // ...

            if self.privateManagedObjectContext.hasChanges {
                do {
                    try self.privateManagedObjectContext.save()
                } catch {
                    // Error Handling
                    // ...
                }
            }
        }
    }

}
There are a few important details that need to be clarified. We initialize the private managed object context and set its persistent store coordinator property using the mainManagedObjectContext object. This is perfectly fine, because we don't access the mainManagedObjectContext itself; we only ask it for its reference to the application's persistent store coordinator. We don't violate the thread confinement rule.
It is essential to initialize the private managed object context in the operation's main() method, because this method is executed on the background thread on which the operation runs. Can't we initialize the managed object context in the operation's init(managedObjectContext:) method? The answer is no. The operation's init(managedObjectContext:) method is run on the thread on which the Operation instance is initialized, which is most likely the main thread. This would defeat the purpose of a private managed object context.
In the operation's main() method, we add the Operation instance as an observer of any NSManagedObjectContextDidSaveNotification notifications posted by the private managed object context.
We then do the work the operation was created for and save the changes of the private managed object context, which will trigger an NSManagedObjectContextDidSaveNotification notification. Let's take a look at what happens in the managedObjectContextDidSave(_:) method.
// MARK: -
// MARK: Notification Handling
func managedObjectContextDidSave(notification: NSNotification) {
    dispatch_async(dispatch_get_main_queue()) { () -> Void in
        self.mainManagedObjectContext.mergeChangesFromContextDidSaveNotification(notification)
    }
}
As you can see, its implementation is short and simple. We call mergeChangesFromContextDidSaveNotification(_:) on the main managed object context, passing in the notification object. As I mentioned earlier, the notification contains the changes (inserts, updates, and deletes) of the managed object context that posted the notification.
It is key to call this method on the thread the main managed object context was created on, the main thread. That's why we dispatch this call to the queue of the main thread. To make this easier and more transparent, you can use performBlock(_:) or performBlockAndWait(_:) to ensure merging the changes takes place on the queue of the managed object context. We'll talk more about these methods later in this article.
// MARK: -
// MARK: Notification Handling
func managedObjectContextDidSave(notification: NSNotification) {
    mainManagedObjectContext.performBlock { () -> Void in
        self.mainManagedObjectContext.mergeChangesFromContextDidSaveNotification(notification)
    }
}
Putting the Operation class to use is as simple as initializing an instance, passing a managed object context, and adding the operation to an operation queue.
// Initialize Import Operation
let operation = Operation(managedObjectContext: managedObjectContext)

// Add to Operation Queue
operationQueue.addOperation(operation)
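To see what the merge actually does, it helps to look at it in isolation, outside of an operation. The following sketch is not the Operation class from this article. It assumes the current Swift spelling of the API (mergeChanges(fromContextDidSave:) and performAndWait(_:)), an in-memory store, a hypothetical Note entity built by a hypothetical helper, makeInMemoryCoordinator(), and a small NotificationBox class for holding on to the notification.

```swift
import CoreData

// Hypothetical helper: one "Note" entity with a "title" string, in memory.
func makeInMemoryCoordinator() throws -> NSPersistentStoreCoordinator {
    let title = NSAttributeDescription()
    title.name = "title"
    title.attributeType = .stringAttributeType
    title.isOptional = true

    let note = NSEntityDescription()
    note.name = "Note"
    note.properties = [title]

    let model = NSManagedObjectModel()
    model.entities = [note]

    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)
    _ = try coordinator.addPersistentStore(ofType: NSInMemoryStoreType,
                                           configurationName: nil, at: nil, options: nil)
    return coordinator
}

// Hypothetical holder for the did-save notification of the background context.
final class NotificationBox: @unchecked Sendable {
    var notification: Notification?
}

let coordinator = try makeInMemoryCoordinator()

let mainContext = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
mainContext.persistentStoreCoordinator = coordinator

let backgroundContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
backgroundContext.persistentStoreCoordinator = coordinator

// The main context creates and saves a record and keeps a reference to it.
let note = NSEntityDescription.insertNewObject(forEntityName: "Note", into: mainContext)
note.setValue("draft", forKey: "title")
try mainContext.save()
let noteID = note.objectID

// Capture the notification that the background context posts when it saves.
let box = NotificationBox()
let token = NotificationCenter.default.addObserver(
    forName: .NSManagedObjectContextDidSave, object: backgroundContext, queue: nil
) { notification in
    box.notification = notification
}

// The background context updates its local version of the record and saves.
backgroundContext.performAndWait {
    guard let localNote = try? backgroundContext.existingObject(with: noteID) else { return }
    localNote.setValue("final", forKey: "title")
    do {
        try backgroundContext.save()
    } catch {
        print("Background save failed: \(error)")
    }
}
NotificationCenter.default.removeObserver(token)

// The main context doesn't know about the update until the changes are merged.
let titleBeforeMerge = note.value(forKey: "title") as? String

if let notification = box.notification {
    mainContext.performAndWait {
        mainContext.mergeChanges(fromContextDidSave: notification)
    }
}
let titleAfterMerge = note.value(forKey: "title") as? String
```

The merge is deliberately performed after the background block has returned, which keeps this sketch free of any cross-thread dispatching. In the operation, the same call is made from the notification handler on the queue of the main managed object context.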
Strategy 2: Parent/Child Managed Object Contexts
Since iOS 5, there's an even better, more elegant strategy. Let's revisit the Operation class and leverage parent/child managed object contexts. The concept behind parent/child managed object contexts is simple but powerful. Let me explain how it works.
A child managed object context is dependent on its parent managed object context for saving its changes to the corresponding persistent store. In fact, a child managed object context doesn't have access to a persistent store coordinator. Whenever a child managed object context is saved, the changes it contains are pushed to the parent managed object context. There's no need to use notifications to manually merge the changes into the main or parent managed object context.
Another benefit is performance. Because the child managed object context doesn't have access to the persistent store coordinator, the changes aren't pushed to the latter when the child managed object context is saved. Instead, the changes are pushed to the parent managed object context, making it dirty. The changes are not automatically propagated to the persistent store coordinator.
Managed object contexts can be nested. A child managed object context can have a child managed object context of its own. The same rules apply. However, it's important to remember that the changes that are pushed up to the parent managed object context are not pushed down to any other child managed object contexts. If child A pushes its changes to its parent, then child B is unaware of these changes.
Creating a child managed object context is only slightly different from what we've seen so far. We initialize a child managed object context by invoking init(concurrencyType:). The concurrency type the initializer accepts defines the managed object context's threading model. Let's look at each concurrency type.
- MainQueueConcurrencyType: The managed object context is only accessible from the main thread. Accessing it from any other thread violates Core Data's threading rules and can lead to crashes or corrupted data.
- PrivateQueueConcurrencyType: When creating a managed object context with a concurrency type of PrivateQueueConcurrencyType, the managed object context is associated with a private queue and it can only be accessed from that private queue.
- ConfinementConcurrencyType: This is the concurrency type that corresponds with the thread confinement concept we explored earlier. If you create a managed object context using init(), the concurrency type of that managed object context is ConfinementConcurrencyType. Apple has deprecated this concurrency type as of iOS 9. This also means that init() is deprecated as of iOS 9.
There are two key methods that were added to the Core Data framework when Apple introduced parent/child managed object contexts, performBlock(_:) and performBlockAndWait(_:). Both methods will make your life much easier. When you call performBlock(_:) on a managed object context and pass in a block of code to execute, Core Data makes sure that the block is executed on the correct thread. In the case of the PrivateQueueConcurrencyType concurrency type, this means that the block is executed on the private queue of that managed object context.
The difference between performBlock(_:) and performBlockAndWait(_:) is simple. The performBlock(_:) method doesn't block the current thread. It accepts the block, schedules it for execution on the correct queue, and continues with the execution of the next statement.
The performBlockAndWait(_:) method, however, is blocking. The thread from which performBlockAndWait(_:) is called waits for the block that is passed to the method to finish before executing the next statement. The advantage is that nested calls to performBlockAndWait(_:) are executed in order.
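The difference between the two methods can be made visible with a small experiment. The sketch below uses the current Swift names, perform(_:) and performAndWait(_:), and a hypothetical EventLog class that records the order in which things happen. Two semaphores hold the asynchronous block back until the caller has moved on, so the recorded order doesn't depend on timing.

```swift
import CoreData

// Hypothetical helper: thread-safe list of events in the order they occurred.
final class EventLog: @unchecked Sendable {
    private let lock = NSLock()
    private var entries: [String] = []

    func record(_ entry: String) {
        lock.lock()
        entries.append(entry)
        lock.unlock()
    }

    var all: [String] {
        lock.lock()
        defer { lock.unlock() }
        return entries
    }
}

// A private queue context is enough for this experiment, no store is needed.
let context = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
let log = EventLog()

let callerMovedOn = DispatchSemaphore(value: 0)
let asyncBlockFinished = DispatchSemaphore(value: 0)

// perform(_:) only schedules the block and returns immediately.
context.perform {
    callerMovedOn.wait()          // held back until the caller has continued
    log.record("perform block ran")
    asyncBlockFinished.signal()
}
log.record("caller continued")    // reached while the block is still pending
callerMovedOn.signal()
asyncBlockFinished.wait()

// performAndWait(_:) returns only after the block has finished.
context.performAndWait {
    log.record("performAndWait block ran")
}
log.record("caller resumed")

print(log.all)
```

If perform(_:) were blocking, this program would never get past the first call, because the block waits for a signal that the caller only sends after continuing.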
To end this article, I'd like to refactor the Operation class to take advantage of parent/child managed object contexts. You'll quickly notice that it greatly simplifies the NSOperation subclass we created. The main() method changes quite a bit. Take a look at its updated implementation below.
override func main() {
    // Initialize Managed Object Context
    privateManagedObjectContext = NSManagedObjectContext(concurrencyType: .PrivateQueueConcurrencyType)

    // Configure Managed Object Context
    privateManagedObjectContext.parentContext = mainManagedObjectContext

    // A private queue context should only be touched on its own queue,
    // so the work and the save are wrapped in performBlockAndWait(_:)
    privateManagedObjectContext.performBlockAndWait {
        // Do Some Work
        // ...

        if self.privateManagedObjectContext.hasChanges {
            do {
                try self.privateManagedObjectContext.save()
            } catch {
                // Error Handling
                // ...
            }
        }
    }
}
That's it. The main managed object context is the parent of the private managed object context. Note that we don't set the persistentStoreCoordinator property of the private managed object context and we don't add the operation as an observer for NSManagedObjectContextDidSaveNotification notifications. When the private managed object context is saved, the changes are automatically pushed to its parent managed object context. Core Data ensures that this happens on the correct thread. It's up to the main managed object context, the parent managed object context, to push the changes to the persistent store coordinator.
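That last point is easy to overlook, so here is a sketch that makes it observable. It rests on the same assumptions as before: the current Swift spelling of the API (parent instead of parentContext, performAndWait(_:) instead of performBlockAndWait(_:)), an in-memory store, and a hypothetical Note entity. The helpers makeInMemoryCoordinator() and storedNoteCount(in:) are made up for this illustration.

```swift
import CoreData

// Hypothetical helper: one "Note" entity with a "title" string, in memory.
func makeInMemoryCoordinator() throws -> NSPersistentStoreCoordinator {
    let title = NSAttributeDescription()
    title.name = "title"
    title.attributeType = .stringAttributeType
    title.isOptional = true

    let note = NSEntityDescription()
    note.name = "Note"
    note.properties = [title]

    let model = NSManagedObjectModel()
    model.entities = [note]

    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)
    _ = try coordinator.addPersistentStore(ofType: NSInMemoryStoreType,
                                           configurationName: nil, at: nil, options: nil)
    return coordinator
}

// Hypothetical helper: counts the notes that have reached the persistent
// store, using a fresh context that talks to the coordinator directly.
func storedNoteCount(in coordinator: NSPersistentStoreCoordinator) -> Int {
    let probe = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    probe.persistentStoreCoordinator = coordinator

    var count = -1
    probe.performAndWait {
        let request = NSFetchRequest<NSManagedObject>(entityName: "Note")
        count = (try? probe.count(for: request)) ?? -1
    }
    return count
}

let coordinator = try makeInMemoryCoordinator()

// The parent owns the connection to the persistent store coordinator.
let parentContext = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
parentContext.persistentStoreCoordinator = coordinator

// The child only knows its parent.
let childContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
childContext.parent = parentContext

childContext.performAndWait {
    let note = NSEntityDescription.insertNewObject(forEntityName: "Note", into: childContext)
    note.setValue("Created in the child", forKey: "title")
    do {
        try childContext.save()   // pushes the insert up to the parent only
    } catch {
        print("Child save failed: \(error)")
    }
}

// The parent is now dirty, but nothing has been written to the store yet.
let parentIsDirty = parentContext.hasChanges
let countAfterChildSave = storedNoteCount(in: coordinator)

// Saving the parent pushes the changes to the persistent store.
try parentContext.save()
let countAfterParentSave = storedNoteCount(in: coordinator)
```

In a real application, this means that saving the child in the operation is only half the job. The main managed object context still needs to be saved at some point for the imported records to survive a relaunch.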
Conclusion
Concurrency isn't easy to grasp or implement, but it's naive to think that you'll never come across a situation in which you need to perform Core Data operations on a background thread.