Protocol Extensions and Polymorphism

August 6, 2015

Update March 7, 2016: Doug Gregor wrote about this problem on swift-evolution

This post is also available as a playground.

Seemingly the most-talked-about session from this year’s WWDC was Session 408: “Protocol-Oriented Programming in Swift”. Since Apple made all WWDC videos available to the public this year, you can watch it on Apple’s website, or read a transcribed version on ASCIIwwdc. As promised, it’s a really interesting talk about how most abstractions should be modelled in Swift as protocols instead of base classes, and how that meshes well with generics to produce nicely architected code. Lots has been written on the subject, so I won’t repeat it all here; you can watch the session or read any number of blog posts on it.

However, as I was watching the talk, I noticed a quirk in how protocol extensions work that I think is going to confuse people and lead to difficult-to-track-down bugs. I tweeted a screenshot of a playground illustrating this. But unless you’ve been following along closely with Swift’s development, it probably doesn’t make much sense:

More @SwiftLang weirdness, discovered in the Protocol-Oriented Programming talk from WWDC. Powerful and confusing. pic.twitter.com/qhVopRatbN

— Jason Sadler (@sadlerjw) July 14, 2015

So here’s the more in-depth explanation.

The setup

Apple’s vision of protocol-oriented programming is built on top of Swift’s protocol extensions and default protocol implementations. (If you’re coming from a non-Apple development background, substitute “interface” for “protocol.”)

As an aside, the main feature that enables Apple’s vision of protocol-oriented programming (default protocol implementations) would have solved a big architectural problem for us at work. We use Core Data for many of our data objects, but not all of them. Therefore we have two base classes for model objects: one that inherits from NSObject and one from NSManagedObject. These two base classes provide almost identical base implementations for a bunch of functionality, which we’ve formalized by making both of them conform to a Model protocol. Each of the required methods from that protocol is implemented, usually identically, in the two base classes that conform to it. Default protocol implementations would have saved an awful lot of copy and paste in this situation!

Let’s say we have a protocol for a piece of our app that prints out textual representations of emotions:

protocol EmotionPrinter {
    func printHappiness() -> Void
    func printSadness() -> Void
}

This allows various implementations of the protocol to decide:

what they’re going to print
how they’re going to print it

But we want to be nice to implementers. We want them to be able to ignore one (or both) of these decisions if they don’t care about it. Let’s provide some default implementations:

extension EmotionPrinter {
    func printHappiness() -> Void {
        output("HAPPY")
    }
    func printSadness() -> Void {
        output("SAD")
    }

    func output(emotion: String) -> Void {
        print(emotion)
    }
}

To check it works, we can create an implementation that just uses the default implementations:

class BasicEmotionPrinter : EmotionPrinter {}
let basicEmotionPrinter = BasicEmotionPrinter()

basicEmotionPrinter.printHappiness() // Prints "HAPPY"

Now we’ll create a custom implementation which wants control over both what is printed and how it’s printed.

I’m too lazy here to invent an alternate method of printing. You could imagine rendering this in a UILabel or something. I’m just prepending some text and using print in this implementation of output.

class EmojiEmotionPrinter : EmotionPrinter {
    func printHappiness() -> Void {
        output("😀")
    }
    func printSadness() -> Void {
        output("😭")
    }

    func output(emotion: String) -> Void {
        print("Emoji Printer 3000 says \(emotion)")
    }
}

let emojiPrinter = EmojiEmotionPrinter()

emojiPrinter.printSadness()    // Prints "Emoji Printer 3000 says 😭"

Additionally (as expected), even if the compiler doesn’t know the concrete type of an EmotionPrinter, the right version of output still gets called from inside each implementation of printHappiness or printSadness. Here the compiler types printer as EmotionPrinter.

let printerArray : [EmotionPrinter] = [basicEmotionPrinter, emojiPrinter]
for printer in printerArray {
    printer.printHappiness()
}

results in:

HAPPY
Emoji Printer 3000 says 😀

The confusing bit

Okay, so everything has been pretty straightforward so far. We have a default implementation that does one thing, and an implementation that overrides the defaults, and successfully does its thing, too.

Here’s where things get tricky. Let’s say we do something funky, like calling output directly on an instance of a type that implements EmotionPrinter. (And here’s where my example falls down a bit. In this example, it would take a very foolish programmer to call myEmotionPrinter.output("blah") but we’re going to do exactly that.)

for printer in printerArray {
    printer.output("I am a banana 🍌")
}

I’d expect this code to produce the output:

I am a banana 🍌
Emoji Printer 3000 says I am a banana 🍌

But instead the output is:

I am a banana 🍌
I am a banana 🍌

Here’s an even weirder way to illustrate the same thing:

let castEmojiPrinter : EmotionPrinter = emojiPrinter

emojiPrinter.output("🍌")      // Prints Emoji Printer 3000 says 🍌
castEmojiPrinter.output("🍌")  // Prints 🍌

emojiPrinter and castEmojiPrinter are two references to the same object in memory. I find it very surprising that what the compiler knows about the type (as opposed to the actual runtime type) can cause method dispatch to behave differently.

Apple’s rationale

As Dave Abrahams says in the WWDC talk:

You might ask, ‘what does it mean to have a requirement that’s also fulfilled immediately in an extension?’ Good question.

The answer is that a protocol requirement creates a customization point.

A protocol requirement is anything declared in the protocol itself, like printHappiness and printSadness in my example, and is considered a customization point. Anything that appears only in a protocol extension, like output, is not. The latter is something that the default implementation wants to maintain control over.
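To see the two kinds of member side by side, here is a minimal sketch. The names Greeter, CustomGreeter and describe are made up for illustration, and it is written in Swift 3+ syntax rather than the Swift 2 used elsewhere in this post. It suggests that the same rule applies inside generic code constrained to the protocol, not just to values typed as the protocol.

```swift
// Hypothetical names, for illustration only.
protocol Greeter {
    func greeting() -> String          // requirement: a customization point
}

extension Greeter {
    // Default implementation of the requirement.
    func greeting() -> String { return "default greeting" }
    // Extension-only method: not a requirement, so not a customization point.
    func farewell() -> String { return "default farewell" }
}

struct CustomGreeter: Greeter {
    func greeting() -> String { return "custom greeting" }
    func farewell() -> String { return "custom farewell" }
}

// Generic code constrained to Greeter only knows what the protocol declares.
func describe<T: Greeter>(_ greeter: T) -> [String] {
    return [greeter.greeting(), greeter.farewell()]
}

let concrete = CustomGreeter()
let erased: Greeter = concrete         // same value, typed as the protocol

let viaConcrete = [concrete.greeting(), concrete.farewell()]
let viaProtocol = [erased.greeting(), erased.farewell()]
let viaGeneric = describe(concrete)

print(viaConcrete)  // both custom
print(viaProtocol)  // custom greeting, default farewell
print(viaGeneric)   // custom greeting, default farewell
```

Only the call made on the concrete type reaches CustomGreeter's farewell. The requirement, greeting, reaches the custom version in all three cases.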

This makes sense on the face of it. When you’re doing class-based polymorphism, you mark some methods as public or protected to allow subclasses to customize behaviour, and others as private or final to prevent that. In many cases what Apple’s done here is desired behaviour, and I think it forms an important feature in their approach to protocol-oriented programming.

But it’s a subtle and important difference from how the rest of the type system works in Swift. Everywhere else we have some form of dynamic dispatch, where the runtime type determines which implementation of a method gets run. Here the compile-time type makes that decision.
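For contrast, here is a small sketch of the class-based behaviour I have in mind. BasePrinter and LoudPrinter are hypothetical classes, not part of the example above, and the methods return strings (rather than printing) so the results are easy to compare.

```swift
// Hypothetical class hierarchy, for contrast only.
class BasePrinter {
    func output(_ emotion: String) -> String {
        return emotion
    }
}

class LoudPrinter: BasePrinter {
    override func output(_ emotion: String) -> String {
        return "LOUD: " + emotion
    }
}

let loud = LoudPrinter()
let upcast: BasePrinter = loud   // same instance, statically typed as the base class

let fromSubclassType = loud.output("hi")
let fromBaseType = upcast.output("hi")

// Both calls reach the override, because the runtime type decides.
print(fromSubclassType)  // LOUD: hi
print(fromBaseType)      // LOUD: hi
```

With an overridable class method, changing the static type of the reference makes no difference. That is the expectation the protocol extension case breaks.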

We made a mistake

With this “customization point” rationale in mind (that protocol requirements are customization points and methods in a protocol extension are not), it’s clear that we made a mistake in modeling our protocol. I said that we wanted implementations to be able to decide what to print and how to print it. That means that output is a customization point and should have been included in the protocol definition:

protocol EmotionPrinter {
    func printHappiness() -> Void
    func printSadness() -> Void
    func output(emotion: String) -> Void
}

Try it out in the playground – it works the way you’d expect.
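If you don’t have the playground handy, here is a trimmed-down sketch of the corrected version. It makes a few assumptions of its own: it uses Swift 3+ syntax (hence the `_` in front of the parameter name), it drops printSadness for brevity, and the methods return strings instead of calling print so the results can be inspected.

```swift
// Sketch only: return values replace print so the results can be checked.
protocol EmotionPrinter {
    func printHappiness() -> String
    func output(_ emotion: String) -> String   // now a requirement
}

extension EmotionPrinter {
    func printHappiness() -> String { return output("HAPPY") }
    func output(_ emotion: String) -> String { return emotion }
}

// Relies entirely on the default implementations.
class BasicEmotionPrinter: EmotionPrinter {}

// Customizes only how things are printed.
class EmojiEmotionPrinter: EmotionPrinter {
    func output(_ emotion: String) -> String {
        return "Emoji Printer 3000 says \(emotion)"
    }
}

let printers: [EmotionPrinter] = [BasicEmotionPrinter(), EmojiEmotionPrinter()]

// Calling output directly through the protocol type.
let bananas = printers.map { $0.output("I am a banana 🍌") }
// The default printHappiness also picks up the customized output.
let happiness = printers.map { $0.printHappiness() }

print(bananas)
print(happiness)
```

Because output is now a requirement, the second element of each array comes from EmojiEmotionPrinter’s implementation, even though the array is typed as [EmotionPrinter].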

Language design is hard

If I were designing Swift (which, I should make clear, I should not be allowed to do) I think I would favour consistency with the rest of the type system over the expressiveness of “customization points” mixed in with other methods.

My first thought would be to mark the protocol extension’s implementation of output as final. That way by default everything is a customization point, similar to traditional subclassing, except for things specifically marked as “you can’t override this.”

But that doesn’t work because of the nature of protocol extensions. It’s conceivable that the EmotionPrinter protocol and EmojiEmotionPrinter implementation could be provided as part of a system library or 3rd party framework. Then I might have written the protocol extension on EmotionPrinter, providing default implementations for any types I was writing that conformed to that protocol.

And here’s the crux: by definition, any type that conforms to a protocol was designed with the knowledge of the protocol’s requirements. But it may not have been designed with the knowledge of any extensions on that protocol. EmojiEmotionPrinter implements a method output. But if I add a protocol extension with a final implementation of a method called output, that would have to cause a compile error, since EmojiEmotionPrinter is now overriding a method marked as final. And I think Apple chose to avoid this situation because it’s nonsense for code that a consumer of a framework writes to cause compile errors in the framework on which that code depends.

That situation would be both confusing and destructive. Code X that doesn’t depend on code Y shouldn’t fail to compile because of some change in code Y.
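Here is a rough sketch of that scenario under the rules Swift actually has, squeezed into one file. Logger, FrameworkLogger, ClientLogger and flush are all hypothetical names, and the “framework” and “consumer” halves are only marked by comments. It can’t show the imagined final rule, only that the real rules let both halves compile.

```swift
// --- Imagine this part ships in a framework (hypothetical names) ---
protocol Logger {
    func log(_ message: String) -> String
}

struct FrameworkLogger: Logger {
    func log(_ message: String) -> String { return "[framework] " + message }
    // Written with no knowledge of any extension on Logger.
    func flush() -> String { return "framework flush" }
}

// --- Imagine this part is written later by a consumer of the framework ---
extension Logger {
    // Happens to reuse a name the framework type already has.
    func flush() -> String { return "extension flush" }
}

struct ClientLogger: Logger {
    func log(_ message: String) -> String { return "[client] " + message }
}

let frameworkFlush = FrameworkLogger().flush()          // the type's own method
let clientFlush = ClientLogger().flush()                // the extension's method
let erasedFlush = (FrameworkLogger() as Logger).flush() // the extension's method

print(frameworkFlush, clientFlush, erasedFlush)
```

The framework type keeps compiling and keeps its own flush when used as FrameworkLogger. The price is the static dispatch quirk described earlier when the same value is typed as Logger.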

Epilogue

In a traditional subclassing pattern, EmojiEmotionPrinter would inherit from a base class that provided default implementations, and by definition would be aware of the existence of an output method in its superclass. But in this case it’s being modified after the fact by a third party, to conform to a protocol and its default implementation. Apple’s rules on dynamic vs static dispatch here are just one of many equally-or-more-confusing solutions to this problem. In the end, it’s just another quirk of Swift (and there are many!) that we’re going to have to keep an eye out for.
