Help with code comprehension

Hello again. I am having great difficulty understanding some of the code I am managing, which I believe is causing the bug I am trying to fix. However, since I don’t fully understand what is going on and there are no comments, I am hesitant to start changing things. I would like to try to figure it out myself, so I am just going to ask a lower-level question that will hopefully put me on the right track.

I have the following class:

class (AllenInterval a, Ord x) ⇐ PointedInterval a x | a → x where
  within ∷ x → x → a

I am unsure how to actually read what this is doing. Primarily I am unfamiliar with the symbols and would appreciate any tips.

Thanks

1 Like

So PointedInterval a x is the class being defined, with the within member. That ⇐ symbol (it’s just the Unicode spelling of <=, the same way ∷ and → are the Unicode spellings of :: and ->) means it has a “superclass” - well, two in this case: AllenInterval and Ord. That means in order to have a PointedInterval a x instance for some a and some x, a must also be an instance of AllenInterval, and x must also be an instance of Ord.
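To make that concrete, here is a minimal sketch - the AllenInterval member and the IntRange type are made up, since I can’t see your real definitions:

module Sketch where

import Prelude

-- Stand-in for your real AllenInterval class (I don't know its members,
-- so this one just gets a dummy method).
class AllenInterval a where
  isEmpty :: a -> Boolean

-- Your class, written with ASCII symbols.
class (AllenInterval a, Ord x) <= PointedInterval a x | a -> x where
  within :: x -> x -> a

-- A made-up interval type over Int end points.
data IntRange = IntRange Int Int

instance AllenInterval IntRange where
  isEmpty (IntRange lo hi) = hi < lo

-- This instance is only accepted because both superclass constraints hold:
-- IntRange has an AllenInterval instance and Int has an Ord instance.
instance PointedInterval IntRange Int where
  within lo hi = IntRange lo hi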

Then you have that | a -> x part. That indicates a “functional dependency”, which means that if the compiler knows what a is, it should be able to figure out what x is without being explicitly told, because there’s only one possible answer. I don’t know the rules the compiler then uses for figuring out what x is; to me it feels like magic :wink:. Maybe someone more knowledgeable can explain how x gets calculated.

Hopefully that’s enough to get you started. Are there other parts of that class definition that still don’t make sense?

2 Likes

Thanks, that really helps. Some days I look at the code and it’s just alphabet soup. Your description is already helping a lot.

3 Likes

I like to think of classes with two type parameters as relations (in the mathematical sense) - a fun-dep like the one above turns that relation into a (partial) function: the type checker makes sure that whenever you have instances PointedInterval a x1 and PointedInterval a x2, then x1 ~ x2, and because of that guarantee it can go on to infer what the x has to be whenever the a is known.

In this case the fun-dep a -> x lets the result type of within fix its argument type: once the compiler knows which interval type a it is supposed to produce, it also knows which point type x the arguments must have.


So with this, when you see

within lo hi :: MyInterval

for some concrete interval type MyInterval (I am making that name up) and you have an instance PointedInterval MyInterval Int, the compiler can use the fun-dep to infer lo :: Int and hi :: Int - without the fun-dep it would be stuck with the unsolved constraint

PointedInterval MyInterval x

for an unknown x (because without the fun-dep someone could define another instance PointedInterval MyInterval String and the type-checker needs to allow for this possibility)
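Here is a compilable version of that, with made-up stand-ins for your types (I don’t know what AllenInterval actually requires, so it has no members here):

module WithinDemo where

import Prelude

-- stand-in for the real AllenInterval class
class AllenInterval a

class (AllenInterval a, Ord x) <= PointedInterval a x | a -> x where
  within :: x -> x -> a

-- made-up interval type over Int end points
data MyInterval = MyInterval Int Int

instance AllenInterval MyInterval

instance PointedInterval MyInterval Int where
  within = MyInterval

-- the annotation fixes a ~ MyInterval; the fun-dep then tells the compiler
-- that x has to be Int, which is how bottom and top get resolved to Int
example :: MyInterval
example = within bottom top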

2 Likes

I think this explanation just made the remaining puzzle pieces fit together for me. I was wondering why within sometimes outputs a BracketedInterval and sometimes outputs an EnumInterval. That’s really cool. Thank you.

1 Like

Hold on a second. Is this just algebra? Solve one side to get the answer to the other when given the proper context? At a very basic level it is like “x + y = a” with the context “x = y”. If given “a”, there is only one value “x and y” could be, or if given “x or y”, there is only one value “a” could be. Please correct me if I am wrong, because I don’t want to internalize this comparison if it is incorrect.

Not sure if I get your analogy, but you have to be careful - in algebra (or this kind of algebraic manipulation of equations) you often assume properties like “there is an additive inverse” (implying there is a neutral element) and “if I add something to both sides the equation is not changed”, and almost none of these (I cannot think of one that holds right now, though there are probably some) are true for the property of two types a and b “being an instance of TC a b” for some type-class TC.

And then there is the issue of how it is resolved - for a long time I had this instinct that if a type-class or instance has a “prerequisite” (the part before the <= or =>), then that prerequisite should be checked first, but that’s not at all how it works: if the other side matches a known instance, those “prerequisites” are simply added to the list of constraints that still have to be solved - which is kind of the opposite of what my intuition was.
In PureScript you have instance-chains, which somewhat relax this, but in Haskell this does still bite me on occasion.
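A tiny made-up example of what I mean (the Describe class and the Boxed wrapper are just for illustration):

module Resolution where

import Prelude

class Describe a where
  describe :: a -> String

newtype Boxed a = Boxed a

-- When the compiler needs Describe (Boxed t) it matches this instance by its
-- head first, and only then adds Describe t to the list of constraints it
-- still has to solve - it does not go looking for the "prerequisite" up front.
instance Describe a => Describe (Boxed a) where
  describe (Boxed a) = "Boxed (" <> describe a <> ")"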

Maybe it’s better to see type-classes/instances as their own thing and not generalize too much :wink:

Yeah, when first starting out it can be tempting to gloss over all of the constraints and classes stuff as being mathematically solved by the compiler, but if you’re trying to build up an understanding of how this actually works, it’s best to learn the actual algorithm used, which is far more limited in power than high-school algebra.

First, start with single-parameter type classes. When a constraint needs to be solved, and the class of the constraint has one type parameter, the compiler looks at what it knows about the type given to that parameter—it might be entirely unknown, or it might have a head—for example, the head of the type Array Int is Array, and the head of the type String -> Maybe Int is Function (because String -> Maybe Int is just the infix notation for Function String (Maybe Int)). The compiler then looks in the module of the class and the module of the head of the parameter, if any, for instances that could match the parameter type. If a single unambiguous match is found, it is used, and any constraints on that instance become new constraints that must themselves be solved. Otherwise, an error is raised; no attempt is made to backtrack or otherwise refine the search. (I’m ignoring instance chains for the moment but they don’t complicate the story much.)
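Here is a small throwaway example of that lookup (the Size class and its instances are made up purely for illustration):

module Heads where

import Prelude
import Data.Array as Array
import Data.Maybe (Maybe)

-- A throwaway single-parameter class, just to show where instances are found.
class Size a where
  size :: a -> Int

-- Solving Size (Array Int): the head of Array Int is Array, so the compiler
-- looks for instances in the class's module (here) and in the head's module;
-- this instance, living next to the class, is the unambiguous match.
instance Size (Array Int) where
  size = Array.length

-- String -> Maybe Int is infix notation for Function String (Maybe Int),
-- so its head is Function.
instance Size (String -> Maybe Int) where
  size _ = 1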

Next, graduate to multi-parameter type classes, which are solved similarly, except now there are multiple parameters to consider. Every parameter may or may not contribute a head, and every head contributed has its module searched. An instance needs to match all of the parameters in order to be considered. For example, if you’re looking for a solution to Foo Int t, where t is an unknown type, and there is an instance of type Foo Int String, the compiler won’t consider that a match, because t may or may not actually be String—it’s unknown! Again, if there is a single unambiguous match, the compiler can proceed, and otherwise it raises an error.
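In code, that situation looks roughly like this (the class and instance bodies are made up, and the rejected definition is left as a comment):

module NoFunDep where

import Prelude

class Foo a b where
  foo :: a -> b

instance Foo Int String where
  foo = show

-- This is rejected: it needs Foo Int t where t is an unknown type, and the
-- compiler will not guess that t must be String, so the constraint stays
-- unsolved:
--
--   oops :: Int -> String
--   oops n = show (foo n)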

Now we can explain functional dependencies, which only apply to multi-parameter type classes. The main practical effect of a functional dependency is to change the above algorithm such that not all of the parameters in a constraint need to match—only enough parameters to cover all the others with fundeps! So to revisit the previous example, if the Foo class was defined as class Foo a b | a -> b, that means that a is sufficient when looking for a match, because b can then be covered by the a -> b fundep. Foo Int t now can be solved with a Foo Int String instance, and the compiler will unify t with String and proceed. (Of course, unifying t and String may not end up being possible in the context of the rest of the program, which will cause a different error to be raised.) In order for this to not break instance coherence—roughly, the idea that there’s only ever one way to use instances to solve constraints—functional dependencies also add an additional restriction on how instances can be defined. In the running example, it’s now illegal to define both a Foo Int String instance and a Foo Int Boolean instance, whereas that would have been legal without the fundep.
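And the same made-up class with the fun-dep added - the definition that failed above now compiles, and the conflicting instance is what gets rejected instead:

module WithFunDep where

import Prelude

class Foo a b | a -> b where
  foo :: a -> b

instance Foo Int String where
  foo = show

-- Foo Int t now matches the instance above: a alone covers b via the fun-dep,
-- so the compiler commits to the instance and unifies t with String.
ok :: Int -> String
ok n = show (foo n)

-- The price is the coherence restriction: a second instance for the same a
-- is now illegal, e.g.
--
--   instance Foo Int Boolean where
--     foo n = n > 0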

If all this stuff reminds you of how you solve algebraic equations or how mathematical relations work, that can be a good source of intuition—but don’t expect the compiler to solve anything other than by following the rote steps outlined above!

8 Likes

That’s a very helpful explanation of the algorithm used. Thank you!

Seconding @ntwilson here, that’s a lovely explanation

I’ve just bookmarked this because this makes a lot more sense to me, and I’m sure I’ll benefit from re-reading this the next time I overthink/confuse myself somehow.