Let polymorphism

I want to do a post about another unexpected typing quirk, but I think I need to write this as background to it, first.

If, having some passing familiarity with an ML-like language, and knowing what unification was, you were asked to run off and write a type inference algorithm, you’d probably come up with something like the Hindley-Milner algorithm, but with one little exception.

let id x = x
 in (id "hi", id False)

It’d probably raise a type error on the above code, unless you already knew otherwise.  The problem is that the type of “id” would be refined to “String -> String”, and then it’d raise an error when it’s given a Boolean.

To someone who has never had this explicitly pointed out to them (or tried to implement type inference), it’s easy to think that this should just work, and not realize at first that anything special has to be done to accommodate it.

The trick that’s employed is to give let special meaning (as opposed to the sugared view that let is just an abstraction and application), where all new type variables in the definition’s inferred type are quantified and it’s given a type schema, instead of a plain old type.

That is, even though we write

id :: a -> a
we really mean
id :: forall a. a -> a

This is achieved by generalization, and is what we call “let polymorphism.”  The result is that there are two kinds of bindings in the environment: binding names to non-schematic types, and binding names to type schemas, which come with a set of generalized variables.

Then, every where in the let body where ‘id’ appears, we freshen those quantified variables, and use that as the plain old (non-schematic) type of the identifier.  So schemas end up sort of isolated in the environment, never appearing elsewhere. (Until we add more features to the type system like rank 2 (or higher) types!)

On the face of it, this is sort of uninteresting. But it’s also the source of endless subtleties, one of which will be the subject of my next post.

Some other interesting notes:

The reason “a -> a”, when written in a type signature, is equivalent to “forall a. a -> a” is that the “a” is really a new Skolem constant, and unless it’s a type variable that’s already in scope, then it’s guaranteed to meet the “newness” criteria and end up being generalized.  Generalization isn’t really “turned off” until a “forall” appears in the type annotation.

While just requiring the type variable to be new (i.e. not appear anywhere in the typing context) in order to generalize it is enough for simple HM style type systems, when the constraints get more complicated, generalization gets a lot harder and more complicated.  To the point where some have advocated for its removal for nested lets. (Since top-level lets don’t really have the newness issue to worry about.)

This entry was posted in Typing Tidbits. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s