When we do the 0.16.0
ecosystem update, it’s my understanding that we’ll update libraries from a Proxy
-based syntax to a Visible Type Application-based one. Below, I’m proposing a few guidelines to follow so as to minimize future breaking changes and keep the resulting APIs consistent across the libraries.
The outline for this post:
- What counts as a breaking change for VTAs?
- What order should type variables be in?
- Which type variables get the
@
treatment? - Naming conventions
First, it’s helpful to clarify what would be a breaking change when it comes to VTAs. Suppose we have the following example:
foo :: forall @a b @f g r. Class f => (a -> g r) -> f a -> g b -> g r
The following changes to the above type signature would be a breaking change:
- Dropping the
@
in front of thea
type var becausefoo @SomeType
would now unify withf
, not witha
. Moreover,foo @_ @myF
would no longer compile because there’s now only 1 VTA. - Adding an
@
in front of theb
type var becausefoo @aType @fType
andfoo @_ @fType
would causefType
to unify withb
, notf
. - Swapping the position of
@a
with@f
becausefoo @aType @fType
would causeaType
to unify withf
andfType
to unify witha
.
The following changes would not be breaking changes:
- Swapping the position of
@a
withb
or@f
withb
. Because the first VTA would still apply toa
and the second still applies tof
, either of these changes has no noticeable effect. - Adding an
@
in front ofg
because, while this increases the number of VTAs, it does not causefoo @aType
,foo @_ @fType
orfoo @aType @fType
to unify with the wrong variables. However, this would requires a minor version change because one would not be able to downgrade without code breaking.
In short, a breaking change happens when:
- the number of VTAs decreases
- the order of VTAs changes
- an insertion between two existing VTAs occurs
A minor change occurs when:
- a new VTA is added
Thus, there are two questions to ask:
- Which type variables should get the
@
treatment? - What should the order of the type variables be?"
Second, in answering question 2 above, I’d propose the following general guidelines for the order of type variables within a forall
, whether they get the @
treatment or not:
- type variables for higher-order types (e.g.
f
,g
,h
, etc.) should appear before first-order types (e.g.a
,b
, etc.) - type variables with type class constraints should appear earlier than those without
- if two type variables are mentioned in type class constraints, the type variable that appears in an earlier type class constraints should appear earlier in the
forall
- prioritize what is likely the most common VTA usage of the function/value (i.e. shortest number of VTAs used with the most important ones appearing first).
As an example, we’ll use foo
from above
-- Original definition (excluding `@` vars)
foo :: forall a b f g r. Class f => (a -> g r) -> f a -> g b -> g r
-- Put higher order first
foo :: forall g f a b r. Class f => (a -> g r) -> f a -> g b -> g r
-- put those with type classes first
foo :: forall f g a b r. Class f => (a -> g r) -> f a -> g b -> g r
-- while `g` appears within the first argument, we don't put it first
-- because of the type class constraint on `f` which `g` doesn't have
-- so `g` takes precedence here. If `g` had a type class,
-- I would propose putting that type class before `f`'s type class
-- and putting `g` before `f` in the `forall`.
foo :: forall f g a b r. Class f => (a -> g r) -> f a -> g b -> g r
Third, in answering Question 1 above, type variables get the @
treatment if
- they have a type class constraint
- one will often want/need to clarify which type with which a type variable unifies
Using foo
as an example:
-
f
gets the@
treatment because of the type class constraint -
g
,a
, andr
are judgment calls.-
a
likely does not need the@
treatment because of its containerf
already getting it -
g
likely does not because it appears twice, once in a function’s returned value and another as an argument -
r
might be useful to add@
because it can serve as both a clarification and an assertion when row type errors aren’t always helpful
-
Fourth, how should type variables be named? At the present, most libraries use a single letter to name them: a
-e
for first-order types and f
-h
for higher-order types, m
-n
for monadic higher-order types, p
for arrow-like types, and r
for record types or row kinds.
Arguably, names in the present implementation do not matter because everything is positional. However, if we did support non-positional VTA syntax via @{ f: Array, r: { aRecord :: Int } }
, then names would come into play here. So, unless such names are important for reading and understanding the code (e.g. type-level programming comes to mind), when doing this VTA update, I think any type variable names not using single-letter names should be renamed to single-letter names when possible. And I think this should apply only to type variables in functions and values, not type class heads.
Thoughts?