Readability of class and instance declarations

timjs · June 1, 2021, 8:43am

I’m a little bit concerned about the direction we’re taking with respect to class and instance declarations. It is mainly about readability and intention of the code. Let me first summarise two current issues and state my concerns, after which I’ll introduce a possible way to ease the pain.

Kind annotations and class inheritance

First, since PureScript 0.14, we can add kind annotations to datatypes, newtypes and classes.
When browsing some source code, I stumbled upon this (there are more examples):

class Category :: forall k. (k -> k -> Type) -> Constraint
class Semigroupoid a <= Category a where
  identity :: forall t. a t t

I don’t know if I’m alone in this, but reading this code from top to bottom boggles my mind. It goes like this:

“Ah, we’re declaring a Category class.”
“Oh wait! It’s the Semigroupoid class!”
“Oh sorry. Semigroupoid is the superclass of the Category class.”

I know, writing down the superclass before the new class is the way we have always done it, and the way Haskell has done it for decades, but…

Now that we have kind annotations, they don’t align with the declaration.
It makes me wonder what we are declaring here.
It reminds me of reading C, where type annotations read backwards.

Instance declarations and `forall`

Second, a PR has been merged which removes the need for explicit instance names. The compiler can generate them. Wonderful! We’re also discussing the addition of forall to instance declarations, which helps avoid problems with type variable scoping. Hurray! I guess the end result will be something like this:

instance forall a b. (Show a, Show b) => Show (Tuple a b) where ...

Some observations:

We’re repeating ourselves. We have to state that we’re declaring an instance for all a and b, but the part after the => implicitly states the same thing.
Just as with class declarations, it reads backwards. We’re declaring an instance of Show on Tuples, and for that we need two other Show constraints. (But again, we have done this, and Haskell has done this, for decades now…)
The introduction of an explicit forall for instances also raises the question: Shouldn’t we treat class declarations the same and add an explicit forall there too to aid type variable scoping?
```
class Category :: forall k. (k -> k -> Type) -> Constraint
class forall a. Semigroupoid a <= Category a where ...
```

Consistency with `type` and `data` declarations

For type and data declarations we do not need an explicit forall, as the type variables are introduced together with the data declaration. When, for example, we write data Tuple a b = ..., it is clear that we are introducing type variables a and b, and we’re only allowed to use a and b after the equals sign. We don’t have to repeat ourselves! Why shouldn’t we do the same thing with class and instance declarations?

So, what if we make the experience of entity declaration simpler and more uniform by starting every declaration with the name we are declaring, together with the variables that can be used in that declaration?
So:

type and data declarations stay untouched;
class declarations start with the class we’re declaring together with its type variables, followed by its superclasses, which can use the type variables introduced earlier (as is customary in, I think, every language with some kind of OO class or interface/protocol/trait inheritance except Haskell and Idris);
do the same for instance declarations;
drop the explicit forall.

data Tuple a b = ...

type Usually a = ...

class Category :: forall k. (k -> k -> Type) -> Constraint
class Category a | Semigroupoid a where ...

instance Show (Tuple a b) | Show a, Show b where ...

I’m curious whether other people’s minds have to do the same yoga when reading current class and instance declarations, and whether the above proposal would improve the developer experience.

ajnsit · June 1, 2021, 10:17am

Putting the superclasses first makes it harder for me to read the declaration as well.

It all looks very clean! Using | instead of => or <= also has the benefit of not having to worry about the direction of the arrow, i.e. whether the superclass is implied or needed, which is an unnecessary complication IMO.

hdgarrood · June 1, 2021, 10:33am

I don’t find the argument that we should avoid forall in instance declarations because we are repeating ourselves convincing. Class and instance declarations are different because the class declaration itself (which is analogous to a data declaration) introduces the type variables by saying what its parameters are, whereas an instance declaration (which is more analogous to a function declaration) doesn’t, so you need some other mechanism to introduce the type variables, namely forall.

Perhaps this example will convince you that these things should be treated differently - class Foo a a should be an error, whereas instance forall a. Foo a a is fine.
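The contrast can be sketched with syntax that compiles today. The class `Same` and the instance name `sameRefl` are hypothetical names made up for this illustration, and the instance is written in the current named form without `forall`, since the `forall` syntax is still only under discussion.

```purescript
module SameSketch where

-- A class head binds its parameters, much like `data Tuple a b` does,
-- so the names have to be distinct. As the post above says,
-- `class Same a a` should be an error.
class Same a b

-- An instance head only mentions type variables, so repeating one is fine.
-- This instance applies exactly when both arguments are the same type.
instance sameRefl :: Same a a
```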

One benefit of requiring forall on instances is that it catches typos, whereas implicit quantification wouldn’t. This isn’t necessary on classes because a) all of the parameter names must be distinct, and b) you can’t refer to type variables which are not parameter names in the superclasses or member functions unless you’ve introduced them with another forall. In short: for instances, removing forall means quantifying implicitly.

hdgarrood · June 1, 2021, 10:50am

While I see the benefit of having the name of the class being defined come first in a class declaration, I’m less keen on rearranging instance declarations so that the instance comes first and the constraints come afterwards. It would obscure the runtime representation of the instance: at the moment, the => arrow means “this is a function whose argument is provided by the compiler,” and being able to easily see what is going on in these cases can be quite important, especially in conjunction with strictness, and avoiding infinite loops or ensuring that things get evaluated when you want them to. I think being able to mentally “desugar” class and instance declarations into dictionary data types and functions which construct them is very useful in situations where the solver is behaving in a way that you find surprising, and so I think it’s appropriate that instance declarations do look like function declarations.
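A minimal sketch of that mental desugaring, assuming a simplified dictionary representation: the names `ShowDict`, `showInt` and `showTuple` are hypothetical and only stand in for what the compiler generates, which may differ in detail. Each constraint to the left of `=>` becomes an ordinary function argument.

```purescript
module DictSketch where

import Prelude

data Tuple a b = Tuple a b

-- Hypothetical dictionary type standing in for the `Show` class.
type ShowDict a = { show :: a -> String }

-- Stands in for an instance without constraints: a plain value.
showInt :: ShowDict Int
showInt = { show: \n -> show n }

-- Stands in for `(Show a, Show b) => Show (Tuple a b)`:
-- a function from the two required dictionaries to the new one.
showTuple :: forall a b. ShowDict a -> ShowDict b -> ShowDict (Tuple a b)
showTuple da db =
  { show: \(Tuple x y) -> "(Tuple " <> da.show x <> " " <> db.show y <> ")" }

-- The solver's job is roughly to build this application for you.
example :: String
example = (showTuple showInt showInt).show (Tuple 1 2)
```

Read this way, a constrained instance is not a constant but something that gets applied, which is where the evaluation-order questions mentioned above come from.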

jy14898 · June 1, 2021, 12:06pm

I don’t know if others would agree with me, but if I were making a Haskell-like language I would probably just lump the superclass constraints in with the instance members:

class Category a where
  class Semigroupoid a
  identity :: forall t. a t t

instance Category (->) where
  identity x = x

My reason is that extensions such as associated types from type families have had to do the same thing; it seems more extensible not to have them in the ‘head’.

EDIT: On second thought, probably better to use the instance keyword rather than class, as it appears we’re making a new class when we’re not

JordanMartinez · June 1, 2021, 2:02pm

Couldn’t we also write this using just one forall?

class forall k (a :: k -> k -> Type). Semigroupoid a <= Category a where ...

JordanMartinez · June 1, 2021, 2:11pm

Would this syntax work?

-- Category is a class whose super class is Semigroupoid
-- and `a` has explicit quantification
class Category :: forall k. k -> Constraint
class Category a <= Semigroupoid a . forall a

-- MonadState is a class whose super class
-- is Monad and it has a functional dependency
-- and `m` and `s` are both explicitly quantified
class MonadState :: (Type -> Type) -> Type -> Constraint
class MonadState m s <= Monad m | m -> s . forall m s

-- The full syntax
class SomeClass 
  :: forall k. (k -> Type) -> (Type -> k) -> Constraint
class SomeClass m n 
  <= (Class1 m, Class2 n, Class3 m m) 
   | m -> n, n -> m
   . forall m n

In other words, does the order of the components in the syntax matter? Or do we just need to ensure that all the parts of the syntax are there?

I realize that this breaks convention from mathematics which typically writes forall x before some equation.

hdgarrood · June 1, 2021, 3:09pm

That’s unappealing to me for two reasons: firstly, it breaks quite significantly from the syntax of PureScript (I don’t really care about convention in mathematics, but consistency within the language is very important). Secondly, there is no reason to include forall in class declarations anyway, because the class SomeClass m n part is already binding the type variables m and n, in the same way that data SomeType m n = ... binds type variables m and n.

timjs · June 1, 2021, 5:25pm

Yes! Good point. I find that a really good counterargument to my issues with instance declarations. That only leaves the class case.

I understand your argument. But then, functions taking dictionaries and instances taking constraints don’t look the same either: the first uses C a => D b => ... while the second uses (C a, D b) => .... I think there was an issue about this, but can’t find it any more…
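The two spellings can be put side by side in a small sketch. `Pair`, `showBoth` and `showPair` are hypothetical names made up for this illustration.

```purescript
module ConstraintSketch where

import Prelude

data Pair a b = Pair a b

-- Function signature: constraints are chained with `=>`.
showBoth :: forall a b. Show a => Show b => a -> b -> String
showBoth x y = show x <> " " <> show y

-- Instance declaration: constraints are grouped in parentheses.
instance showPair :: (Show a, Show b) => Show (Pair a b) where
  show (Pair x y) = "(Pair " <> showBoth x y <> ")"
```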

Doesn’t make it more readable, does it? I think separating function definitions from their type annotations is one of the great appeals of Haskell-like syntax. For me, the same holds for type/data/class declarations and their kind annotations.

natefaubion · June 1, 2021, 8:52pm

Classes use <= for superclass implication, which means that if you wanted the class head to be first, then we should use => and flip the order. I’m afraid this might make things more confusing though. I think this is just one of those “it is what it is” cases, unfortunately. I agree that it’s weird, but other solutions require drastically new syntax or introducing soft-keywords/symbols which make the grammar and parser complicated (just dealing with the ad-hoc <= is kind of excruciating), and I’m not sure it’s worth all that trouble unless it’s just “wow this is just so much better”, and I don’t think anything suggested so far meets that bar (personal opinion, of course).

hdgarrood · June 1, 2021, 9:49pm

This one? https://github.com/purescript/purescript/issues/2871

JordanMartinez · June 2, 2021, 12:41am

Haha… yeah, not really

In other words, the order of these syntax parts is arbitrary, not necessary, but changing the order would break a lot of code that would be tedious to fix for really no reason. If we could go back and rebuild PureScript, then that could have been different.

hdgarrood · June 2, 2021, 1:45am

No, that’s not quite what I’m saying. I’m saying that it would be best not to introduce new inconsistencies (regarding whether the forall goes at the start or the end of something) while trying to improve the syntax of class declarations.

timjs · June 2, 2021, 8:00am

Yes, that’s the one! So the main argument against it is this I think?

Constraints in types can be partially applied, but constraints in instances can’t.

Although I think this is more of an implementation detail, I think you don’t agree with me on that, given the comment below?

Can you explain why you think instance declarations should look like function declarations? I actually never look at them that way, for me they are separate entities which group functions. Compare it with Rust’s impl syntax for example, there is no hint to functions at all.

I agree with you that flipping the order and the arrow is not a solution. Especially people coming from Haskell would be boggled when reading a class declaration which looks the same but declares something different.

However, I don’t agree with you here. The | in the proposed syntax is already a special symbol in the grammar, whereas <= isn’t. Also, <= is commonly used for less than, and ligature fonts like Fira Code and Hasklig render it that way, and not as an implication arrow. Therefore I think flipping class declarations and using | makes the grammar, the parser, and the language simpler and more consistent. Actually, the | is already there, because fundeps are currently declared after a |.

About readability: This is the link I was looking for about the clockwise/spiral rule in C for reading types. A nice read! Thanks to @milesfrain for pointing this out!

hdgarrood · June 2, 2021, 11:23am

I don’t really have much to add beyond the part you quoted already - that was my explanation for it. The “right” way of thinking of instances, I’d argue, is of functions which accept other instance dictionaries as arguments and then return an instance dictionary, because this is what instance declarations actually are and in my experience it is also the best way of understanding their behaviour in more complex scenarios.

natefaubion · June 2, 2021, 2:01pm

Your example curiously leaves out how this interacts with fundeps and what the grammar is though. You can’t just say it simplifies the grammar without stating how in terms of the actual grammar. My opinion is that you have not demonstrated any proof of the fuzzy adjectives you brought up.

natefaubion · June 2, 2021, 3:54pm

I don’t know if others would agree with me, but if I were making a Haskell-like language I would probably just lump the superclass constraints in with the instance members:

FWIW, I agree with you (both in the approach and using instance). I like this approach because it closely matches the denotation, and the quantification isn’t backwards. I think the downside to it with PS as it exists is that the inertia for this change is quite high. While it’s essentially the same implementation, conceptually it’s very different from how we currently talk about “super classes”. So much so that, to me, on the surface, it appears to be a completely separate language feature. It’s very difficult to overcome that inertia in both the ecosystem and in documentation.

timjs · June 3, 2021, 7:12am

True, my idea is to first list super classes, then fundeps. Possibly separating them with an extra | if you like. Even better: add named fundeps at the same time.

class Index c i | Fold c, c -> i
class Index c i | Fold c | c -> i
class Index c i | Fold c where
  type Idx c = i

So where is PureScript’s official grammar, so that every change proposal can be accompanied by a grammar change? Or do you propose to send in a patch for the Happy parser definition?

With or without fuzziness, maybe I shouldn’t make a claim about the simplicity of the grammar and only about readability of the language. However, my point is that <= is already special and “excruciating”, as you already pointed out. So why not think about a grammar to take that out of the way too?

Adrielus · June 3, 2021, 7:33am

What about an implies keyword:

class Index c i implies Fold c | c -> i

jy14898 · June 3, 2021, 12:33pm

A hack for today:

class            {- => -} Semigroupoid a <=
      Category a                            where
  identity :: forall t. a t t

This might be sarcasm, I’m not sure
