Exploring mutually recursive Variants

natefaubion · October 16, 2018, 12:52am

With Variant, we get extensible sums, but one drawback is you can’t have recursive data types. If you try to use recursion with type synonyms and records, you’ll get an error. The compiler would attempt to expand the recursive alias into an infinite type. The same restriction applies to Variant.

type List a = Variant (nil :: Unit, cons :: Tuple a (List a))

We would have to newtype this, which defeats the point, since we lose extensibility. However, with VariantF we can utilize fixed-point combinators to introduce a recursive variant.

type NIL r = (nil :: FProxy (Const Unit) | r)
type CONS a r = (cons :: FProxy (Tuple a) | r)
type ListF r a = VariantF (NIL + CONS a + r)
type List r a = Mu (LiftF r a)

This gives us a self-recursive extensible data type. But often times (like when writing types for language ASTs), we actually want mutually recursive data types. As an example, here is a small AST for adding numbers that also support let bindings. We’ve separated Binding out into a different data type because we might want to add more than 1 binding form.

data Expr
  = Lit Int
  | Var String
  | Add Expr Expr
  | Let Binding Expr

data Binding
  = Bound String Expr

For a language like PureScript, there are several such data types that form its mutually recursive AST. So now, why does the Mu VariantF approach fall flat with something like this? Translating Add is straightforward:

data Add a = Add a a
type ADD r = (add :: FProxy Add | r)
derive instance functorAdd :: Functor Add

But if you look at something like Let, it discriminates two points of recursion (Expr and Binding) which forms a mutual binding group. A naive attempt to translate Let is clearly wrong:

data Let a = Let a a

Since we are only using one variable a, it would be self-recursive in both arguments. We could potentially add two variables for each point of recursion:

data Let a b = Let a b

But this has a different kind, and does not fit uniformly into our framework. Since the particular branch we want to take in our binding group is determined by argument position, all of our data types would have to have the same number of arguments, and use the same slots to represent each data type. That means if we wanted to add another branch, then we’d have to go back and change all the data types. Not very extensible!

We can refactor this though and get something that is both uniform and can give us an infinite supply of types. At the value level, say we have a function that takes two arguments:

foo :: Int -> Int -> Unit

This is essentially a product (Int * Int). We can refactor this using an exponent (Int^2). In the algebra of data types, this is the -> operator. We just need a type with 2 inhabitants: Boolean = True | False.

foo :: (Boolean -> Int) -> Unit

So now instead of taking two parameter, I can take a single function parameter and feed it an index (true or false) to get a value for a particular branch. We can apply this same trick to our functor families, and since we ultimately want to use Variant, we will index it by Symbols.

data Add f = Add (f (SProxy "expr")) (f (SProxy "expr"))
data Let f = Let (f (SProxy "binding")) (f (SProxy "expr"))

Or, in a prettier way:

type Expr f = f (SProxy "expr")
type Binding f = f (SProxy "binding")

data Add f = Add (Expr f) (Expr f)
data Let f = Let (Binding f) (Expr f)

However, now our type parameter is of kind Type -> Type instead of Type, so we need a different abstraction than Functor.

class NFunctor f where
  nmap :: forall a b. (a ~> b) -> f a -> f b

Where we use a natural transformation (~>) between functors. Implementing this is mostly just boilerplate for our data types.

data NFProxy (f :: (Type -> Type) -> Type) = NFProxy

type Expr f = f (SProxy "expr")
type Binding f = f (SProxy "binding")

data Lit (f :: Type -> Type) = Lit Int
type LIT r = (lit :: NFProxy Lit | r)
instance nfunctorLit :: NFunctor Lit where nmap f (Lit a) = Lit a

data Var (f :: Type -> Type) = Var String
type VAR r = (var :: NFProxy Var | r)
instance nfunctorVar :: NFunctor Var where nmap f (Var a) = Var a

data Add f = Add (Expr f) (Expr f)
type ADD r = (add :: NFProxy Add | r)
instance nfunctorAdd :: NFunctor Add  where nmap f (Add a b) = Add (f a) (f b)

data Let f = Let (Binding f) (Expr f)
type LET r = (let :: NFProxy Let | r)
instance nfunctorLet :: NFunctor Let where nmap f (Let a b) = Let (f a) (f b)

data Bound f = Bound String (Expr f)
type BOUND r = (bound :: NFProxy Bound | r)
instance nfunctorBound :: NFunctor Bound where nmap f (Bound a b) = Bound a (f b)

With this we can start assembling a new Variant type for this shape:

data VariantNF (r :: # Type) (f :: Type -> Type)

inj ::
  forall r rx sym f a.
  Row.Cons sym (NFProxy f) rx r =>
  IsSymbol sym =>
  NFunctor f =>
  SProxy sym ->
  f a ->
  VariantNF r a

lit :: forall f r. Int -> VariantNF (LIT + r) f
lit = inj (SProxy :: SProxy "lit") <<< Lit

var :: forall f r. String -> VariantNF (VAR + r) f
var = inj (SProxy :: SProxy "var") <<< Var

add :: forall f r. Expr f -> Expr f -> VariantNF (ADD + r) f
add a b = inj (SProxy :: SProxy "add") $ Add a b

letIn :: forall f r. Binding f -> Expr f -> VariantNF (LET + r) f
letIn a b = inj (SProxy :: SProxy "let") $ Let a b

bound :: forall f r. String -> Expr f -> VariantNF (BOUND + r) f
bound a b = inj (SProxy :: SProxy "bound") $ Bound a b

So far this is mostly the same as normal VariantF, just with a different kind. Now we can implement a new fixpoint type which lets us tie the mutual binding group together, much like what Mu does for VariantF. We are going to index it by our binding identifiers, in the same vein as Variant.

data Tied (r :: # Type) a

tie ::
  forall r rx rec sym.
  Row.Cons sym (RProxy r) rx rec =>
  SProxy sym ->
  VariantNF r (Tied rec) ->
  Tied rec (SProxy sym)

untie ::
  forall r rx rec sym.
  Row.Cons sym (RProxy r) rx rec =>
  Tied rec (SProxy sym) ->
  VariantNF r (Tied rec)

This Tied type takes the spine for the entire binding group, and fixes it to a particular branch, as indicated by the SProxy argument.

type EXPR f r = (expr :: RProxy f | r)
type BINDING f r = (binding :: RProxy f | r)

expr ::
  forall f r.
  VariantNF f (Tied (EXPR f + r)) ->
  Tied (EXPR f + r) (SProxy "expr")
expr = tie (SProxy :: SProxy "expr")

binding ::
  forall f r.
  VariantNF f (Tied (BINDING f + r)) ->
  Tied (BINDING f + r) (SProxy "binding")
binding = tie (SProxy :: SProxy "binding")

And with this we can build our mutually recursive tree.

test :: _
test =
  expr
    (letIn
      (binding
        (bound "foo"
          (expr (lit 42))))
      (expr
        (add
          (expr (var "foo"))
          (expr (lit 42)))))

Which infers beautifully.

Wildcard type definition has the inferred type

Tied
    ( expr :: RProxy
                ( "let" :: NFProxy Let
                , add :: NFProxy Add
                , lit :: NFProxy Lit
                , var :: NFProxy Var
                | t0
                )
    , binding :: RProxy
                    ( bound :: NFProxy Bound
                    | t1
                    )
    | t2
    )
    (SProxy "expr")

The NFunctor type is suspiciously similar to the more general and known HFunctor:

class HFunctor f where
  hmap :: forall a b. (a ~> b) -> f a ~> f b

Where f is (Type -> Type) -> Type -> Type instead of (Type -> Type) -> Type. I’m not sure exactly if there’s an added benefit to using this. I don’t immediately see a benefit to the extra type argument, except that Add can also be a Functor, which is nice.

type Expr f a = f (SProxy "expr") a
data Add f a = Add (Expr f a) (Expr f a)

derive instance functorAdd :: Functor (Add f)
instance hfunctorAdd :: HFunctor Add where hmap f (Add a b) = Add (hmap a) (hmap b)

But I’d be interested to see if this admits some neat stuff.