When I first learned Haskell, I very quickly came to love parametric polymorphism. It's a delightfully simple idea that works astonishingly well. The whole "if it compiles it usually works right" thing is mostly due to parametric polymorphism, IMHO.

But the other day, something occurred to me. I can write foo as a polymorphic function. But when bar calls foo, it will do so with a specific set of argument types. Or, if bar itself is polymorphic, then its caller will assign definite types. By induction, it seems that if you were to take any valid Haskell program and analyse the entire codebase, you can statically determine the type of every single thing in the entire program.

This, in a sense, is a bit like C++ templates. There is no run-time polymorphism, only compile-time polymorphism. A Haskell compiler could choose to generate separate machine code for every type at which each polymorphic function is called. Most Haskell compilers don't, but you could implement one if you wanted to.
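
To make the induction concrete, here is a minimal sketch of what such a specializing compiler might do; the swap_* names are hypothetical, not anything GHC actually produces:

-- One polymorphic definition...
swap :: (a, b) -> (b, a)
swap (x, y) = (y, x)

-- ...used at two fully determined types.
caller :: ((Bool, Int), (String, Char))
caller = (swap (1 :: Int, True), swap ('c', "str"))

-- A specializing compiler could, in principle, emit one monomorphic copy
-- per type the function is used at (hypothetical names):
swap_Int_Bool :: (Int, Bool) -> (Bool, Int)
swap_Int_Bool (x, y) = (y, x)

swap_Char_String :: (Char, String) -> (String, Char)
swap_Char_String (x, y) = (y, x)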

Only if you start adding Haskell extensions (ExistentialQuantification is the obvious one) do you start to get real run-time polymorphism, where you have values whose type cannot be statically computed.
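
For instance, here is a minimal sketch (my own, assuming the ExistentialQuantification extension) of a value whose hidden type depends on run-time input:

{-# LANGUAGE ExistentialQuantification #-}

data Showable = forall a. Show a => Showable a

-- The type packed inside the Showable depends on what the user types in,
-- so no static analysis of the program can recover it.
pick :: Bool -> Showable
pick True  = Showable (42 :: Int)
pick False = Showable "forty-two"

main :: IO ()
main = do
  b <- readLn
  case pick b of
    Showable x -> print x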

Oh, yeah, my questions:

  1. Are the statements above actually correct?

  2. Is there a widely-used name for this property?


3 Answers


Haskell (with no extensions) permits polymorphic recursion, and this feature alone makes it impossible to statically specialize a program to a completely monomorphic one. Here is a program that will print an N-deep nested list, where N is a command-line parameter:

import System.Environment (getArgs)

foo :: Show a => Int -> a -> IO ()
foo 0 x = print x
foo n x = foo (n-1) [x]

main = do [num_lists] <- getArgs
          foo (read num_lists) 0

In the first call to foo, a has type Int. In the next recursive call, it has type [Int], then [[Int]], and so forth.

If polymorphic recursion is prohibited, then I believe it's possible to statically specialize a program.

  • Wow. OK, now can you think up a useful example of polymorphic recursion? ;-) [I'm guessing that's going to end up being quite complex...] May 10, 2012 at 14:03
  • @MathematicalOrchid The standard examples involve working with perfectly balanced trees: data BTree a = Leaf a | Fork (BTree (a,a)). (A sketch based on this follows after these comments.) May 10, 2012 at 14:38
  • Still, just because you can't make a monomorphic version doesn't mean the types exist at runtime: GHC gets away with type erasure completely for most programs. (Though it doesn't always get away with typeclass dictionary erasure -- especially in a program like this one!) May 10, 2012 at 14:39
  • Even if polymorphic recursion were prohibited in Haskell (as it once was), it is possible to encode polymorphic recursion using a class. So just prohibiting polymorphic recursion is not enough to guarantee static specialization.
    – augustss
    May 10, 2012 at 16:16
  • @MathematicalOrchid, if you gave up when it was exceeded, surely that counts as not working? May 17, 2012 at 15:43
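
Expanding on the balanced-tree comment above, a minimal sketch (the Fork constructor and the size function are my own illustration):

-- A perfectly balanced tree: every Fork doubles the element type.
data BTree a = Leaf a | Fork (BTree (a, a))

-- Counting leaves requires polymorphic recursion: the recursive call is at
-- type BTree (a, a), not BTree a, so the type signature is mandatory and no
-- finite set of monomorphic copies of size can cover all the call types.
size :: BTree a -> Int
size (Leaf _) = 1
size (Fork t) = 2 * size t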

Yep, I've thought about this too. Basically, the idea is that it seems like you could implement Haskell 98, but not some of the language extensions to it, using polymorphism-by-multiinstantiation instead of polymorphism-by-boxing.

You can get some insight into this by trying to implement some Haskell features as C++ libraries (as you note, C++ does polymorphism-by-multiinstantiation). What you find is that you can do everything that Haskell can do, except that it's impossible to have polymorphic values, which includes references to polymorphic functions.

What this looks like is that if you have

template<typename T>
void f(T);           // f :: a -> IO ()

you can take the address of a particular instantiation to pass around as a function pointer at runtime:

&f<int>

but you cannot take the address of a template (&f). This makes sense: templates are a purely compile-time construct. It also makes sense that if you're doing polymorphism by multiinstantiation, you can have a pointer to any particular instantiation, but you cannot have a pointer to the polymorphic function itself, because at the machine code level, there isn't one.

So where does Haskell use polymorphic values? At first glance it seems like a good rule of thumb is "anywhere you have to write an explicit forall". So PolymorphicComponents, Rank2Types, RankNTypes, and ImpredicativeTypes are obvious no-nos. You can't translate this to C++:

data MkList = MkList (forall a. a -> [a])
singleton = MkList (\x -> [x])
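
A hedged illustration of why (applyBoth is my own addition, and the MkList above needs RankNTypes to compile): the function stored inside MkList must still work at every type its consumers choose, so there is no single instantiation to store and pass around at run time.

-- Reuses the MkList defined just above.
-- The wrapped function is used at two different types here, chosen by the
-- consumer, so the stored value cannot be any one instantiation.
applyBoth :: MkList -> ([Int], [Bool])
applyBoth (MkList f) = (f 1, f True)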

On the other hand, ExistentialQuantification is doable in at least some cases: it means having a non-template class with a template constructor (or more generally, a class whose constructor is templated on more things than the class itself).

If in Haskell you have:

data SomeShow = forall a. Show a => SomeShow a
instance Show SomeShow where show (SomeShow a) = show a

you can implement this in C++ as:

// a function which takes a void*, casts it to the given type, and
// calls the appropriate show() function (statically selected based
// on overload resolution rules)
template<typename T>
String showVoid(void *x)
{
    return show(*(T*)x);
}

class SomeShow
{
    private:
        void *m_data;
        String (*m_show)(void*); // m_show :: Any -> String

    public:
        template<typename T>
        SomeShow(T x)
            : m_data(new T(x)) // memory management issues here, but that's orthogonal
            , m_show(&showVoid<T>)
        {
        }

        String show()
        {
            // alternately we could declare the top-level show() as a friend and
            // put this there
            return m_show(m_data);
        }
};

// C++ doesn't have type classes per se, but it has overloading, which means
// that interfaces are implicit: where in Haskell you would write a class and
// instances, in C++ you just write a function with the same name for each type
String show(SomeShow x)
{
    return x.show();
}

In both languages you have a non-polymorphic type with a polymorphic constructor.
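
As a small usage sketch on the Haskell side (the xs list and main are mine, reusing the SomeShow type from above), the run-time polymorphism shows up as a heterogeneous list:

-- Each element hides a different type behind the same SomeShow wrapper.
xs :: [SomeShow]
xs = [SomeShow (1 :: Int), SomeShow "two", SomeShow [3.0 :: Double]]

main :: IO ()
main = mapM_ print xs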

So we have shown that there are some language extensions you can implement and some you can't, but what about the other side of the coin: is there anything in Haskell 98 that you can't implement? Judging by the fact that you need a language extension (ExplicitForAll) to even write a forall, you would think that the answer is no. And you would almost be right, but there are two wrinkles: type classes and polymorphic recursion. Type classes are typically implemented using dictionary passing: each instance declaration results in a record of functions, which are implicitly passed around wherever they're needed.

So for Monad for example you would have:

data MonadDict m = MonadDict { 
    return :: forall a. a -> m a,
    (>>=)  :: forall a b. m a -> (a -> m b) -> m b
}

Well would you look at those foralls! You can't write them explicitly, but in dictionary-passing implementations, even in Haskell 98, classes with polymorphic methods result in records containing polymorphic functions. Which, if you're trying to implement the whole thing using multiinstantiation, is obviously going to be a problem.

You can almost get away without dictionary passing because, if you stick to Haskell 98, instances are almost always global and statically known. Each instance results in some polymorphic functions, but because which one to call is almost always known at compile time, you almost never need to pass references to them around at runtime (which is good, because you can't). The tradeoff is that you need to do whole-program compilation, because otherwise instances are no longer statically known: they might be in a different module.

And the exception is polymorphic recursion, which practically requires you to build up a dictionary at runtime. See the other answer for more details on that. Polymorphic recursion kills the multiinstantiation approach even without type classes: see the comment about BTrees. (Also, ExistentialQuantification *plus* classes with polymorphic methods is no longer doable, because you would have to again start storing pointers to polymorphic functions.)
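
To make the dictionary-passing picture concrete, here is a hand-written sketch of the translation. The field names are mine (renamed to avoid clashing with the Prelude), pair2 is a made-up example rather than real GHC output, and the sketch needs RankNTypes to compile, even though GHC builds the equivalent records internally for plain Haskell 98 code:

{-# LANGUAGE RankNTypes #-}

-- A hand-rolled Monad dictionary: note that each field is itself a
-- polymorphic function.
data MonadDict m = MonadDict
  { dictReturn :: forall a. a -> m a
  , dictBind   :: forall a b. m a -> (a -> m b) -> m b
  }

-- After the translation, a (Monad m) constraint becomes an explicit
-- dictionary argument, and class methods become field selections.
pair2 :: MonadDict m -> m a -> m b -> m (a, b)
pair2 d ma mb =
  dictBind d ma (\a ->
  dictBind d mb (\b ->
  dictReturn d (a, b)))

-- One concrete dictionary, built from the real Maybe instance.
maybeDict :: MonadDict Maybe
maybeDict = MonadDict { dictReturn = Just, dictBind = (>>=) }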

  • Do you mean ExistentialQuantification instead of ExplicitQuantification?
    – is7s
    May 10, 2012 at 16:16
  • Obviously, C++ can do more than just polymorphism-by-multiinstantiation. The ExistentialQuantification stuff can be done with virtual inheritance, which sadly is not automatically compatible with the template approach and closer to Java than to Haskell. May 10, 2012 at 22:51
  • @is7s, right, I was in a bit of a hurry and didn't have time to proof read. I'll do that now.
    – glaebhoerl
    May 11, 2012 at 2:13
  • @leftroundabout, can you elaborate on that? It sounds interesting. I haven't really wrapped my head around what virtual inheritance means beyond it being a not-so-satisfactory solution to the diamond inheritance problem.
    – glaebhoerl
    May 11, 2012 at 2:13
  • @glaebhoerl sorry, my comment does look like a mess. I had a picture of the solution in my head when I wrote it, but after 5 months I had to decipher it myself too :) Here's a gist: gist.github.com/enobayram/dce92b468423a8ef6882
    – enobayram
    Oct 28, 2015 at 8:10

Whole program compilers take advantage of global access to type information to make very aggressive optimizations, as you describe above. Examples include JHC and MLton. GHC with inlining is partially "whole program" as well, for similar reasons. Other techniques that take advantage of global information include supercompilation.

Note that you can massively increase code size by specializing polymorphic functions at all the types they're used at -- this then needs heavy inlining to bring code size back down to normal. Managing this is a challenge.
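
For example, GHC lets you opt into this trade-off per function and per type with a pragma; a small sketch (polySum is a made-up function, not from the answer):

module Sums (polySum) where

-- Compiled once, this passes a Num dictionary at run time.
polySum :: Num a => [a] -> a
polySum = foldr (+) 0

-- Ask GHC to also emit monomorphic copies for these types: faster call
-- sites, at the cost of extra object code that inlining then has to tame.
{-# SPECIALIZE polySum :: [Int] -> Int #-}
{-# SPECIALIZE polySum :: [Double] -> Double #-}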

  • Right. So it is as static as I suspected then? I find it interesting that this fact is hardly ever a problem. :-D And yes, I think the way GHC implements this (with the specialize pragma and all) is really nice. May 10, 2012 at 14:01
  • There are still unknown function calls. Polymorphic recursion, other kinds of dynamic dispatch. May 10, 2012 at 14:05
  • Even if you specialize all polymorphic functions to their actual types you'll find that many of them end up with identical code. E.g., all specializations of a polymorphic function where the actual types are boxed will end up the same. This will reduce code volume.
    – augustss
    May 10, 2012 at 16:20
