Main Ideas / Diagrams of Category Theory, Part 1, Videos 1.1 to 3.1

JordanMartinez · January 9, 2019, 3:33am

I’ve summarized the points/diagrams of Bartosz Milewski’s Category Theory Part 1, Videos 1.1 to 3.1 in the below image for easier reading / faster understanding. The formatting is a bit crude, but it works, nonetheless.

github.com

JordanMartinez/purescript-jordans-reference/blob/category-theory-work/21-Hello-World/01-FP-Philosophical-Foundations/1x-CT-1--Video-1.1-3.1.svg

<?xml version="1.0" encoding="UTF-8"?><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" fill-opacity="1" color-rendering="auto" color-interpolation="auto" text-rendering="auto" stroke="black" stroke-linecap="square" width="1216" stroke-miterlimit="10" shape-rendering="auto" stroke-opacity="1" fill="black" stroke-dasharray="none" font-weight="normal" stroke-width="1" height="19683" font-family="'Dialog'" font-style="normal" stroke-linejoin="miter" font-size="12px" stroke-dashoffset="0" image-rendering="auto">
  <!--Generated by ySVG 2.5-->
  <defs id="genericDefs"/>
  <g>
    <defs id="defs1">
      <clipPath clipPathUnits="userSpaceOnUse" id="clipPath1">
        <path d="M0 0 L1216 0 L1216 19683 L0 19683 L0 0 Z"/>
      </clipPath>
      <clipPath clipPathUnits="userSpaceOnUse" id="clipPath2">
        <path d="M500 -5686 L1716 -5686 L1716 13997 L500 13997 L500 -5686 Z"/>
      </clipPath>
    </defs>
    <g fill="white" text-rendering="geometricPrecision" shape-rendering="geometricPrecision" transform="translate(-500,5686)" stroke="white">
      <rect x="500" width="1216" height="19683" y="-5686" clip-path="url(#clipPath2)" stroke="none"/>
    </g>
    <g fill="rgb(245,245,245)" text-rendering="geometricPrecision" shape-rendering="geometricPrecision" transform="matrix(1,0,0,1,-500,5686)" stroke="rgb(245,245,245)">
      <rect x="1162.8466" y="2463.8597" clip-path="url(#clipPath2)" width="110.8333" rx="4" ry="4" height="227.5532" stroke="none"/>
      <rect x="1162.8466" y="2463.8597" clip-path="url(#clipPath2)" fill="rgb(235,235,235)" width="110.8333" height="38.9219" stroke="none"/>
    </g>
    <g font-size="15px" stroke-linecap="butt" transform="matrix(1,0,0,1,-500,5686)" text-rendering="geometricPrecision" font-family="sans-serif" shape-rendering="geometricPrecision" stroke-miterlimit="1.45">

This file has been truncated. show original

YouTube playlist:

hdgarrood · January 9, 2019, 5:13am

Your laws aren’t quite right. For the composition law, if you have f : A -> B and g : B -> C, there is indeed an arrow g . f : A -> C; up to there, it’s fine. However, the next line down, which says that “g . f is different from other similar arrows ‘x’ and ‘y’” doesn’t really make sense. Presumably, by ‘similar,’ you mean that they are also arrows from A to C? If you’re saying that there exist other arrows from A to C which are distinct from g . f, this is not true - in fact there are lots of categories in which there is at most one arrow between any two objects (I think such categories even have a name but I can’t remember it right now). If you’re saying that g . f is different from x and y because g . f is defined via a composition whereas x and y aren’t, that’s not quite how it works: you can’t take an arrow and ask ‘was this defined via composition’. To give an example, take the functions ‘double’ and ‘quadruple’, which double and quadruple their arguments respectively, and consider them as arrows Z -> Z in the category of sets. Then quadruple = double . double. The left hand side is “not defined via composition,” but the right hand side is. Since the left hand side and right hand side are equal, the only conclusion we can draw is that “defined via composition” doesn’t make sense as a property of arrows.

Your identity law is not quite right either - in fact the identity law says that for each object A, there is an identity arrow id : A -> A. There may be other (non-identity) arrows A -> A. For example, the ‘double’ and ‘quadruple’ above are non-identity arrows Z -> Z; in the category of sets, the identity arrow is the identity function.

hdgarrood · January 9, 2019, 5:26am

Also, while this sort of resource is useful as a cheat sheet, I really don’t think it should be positioned as suitable for beginners, and especially not for someone’s first encounter with the topic. You’ll get through this summary a lot faster than you will these videos, sure, but when it’s trimmed down to this extent I doubt beginners will be able to take anything from it.

Incidentally Bartosz Milewski has also written a series of articles on category theory if you don’t like learning from videos: https://bartoszmilewski.com/2014/10/28/category-theory-for-programmers-the-preface/amp/

garyb · January 9, 2019, 2:19pm

I’m also a bit curious about who the intended audience is for this too - I’d be a bit concerned about it furthering the myth that you need to know category theory to be able to use PS/Haskell, if this is intended to go in your PS learning resources repo.

chexxor · January 9, 2019, 4:28pm

That’s a neat way of taking notes! Very cool! Thanks for sharing, also.

I’m interested in category theory to figure out why it plays such a prominent role in PureScript, prominent because it’s in the recommended Prelude and many basic libraries are happy to place references to category theory words in type signatures.

It’s interesting to me how many authors and users of these libraries say similar to “it’s not necessary to understand category theory”. If so, then why refer to it’s concepts in type signatures? Because it’s helpful for program design? If so, then I should know more about it when designing my own programs. It’s contradictory - doing one thing and saying another.

JordanMartinez · January 9, 2019, 5:06pm

Thank you all for responding to this. It’s been helpful reading through your comments.

Before addressing your comments, I’d like to give context and explain why this image was even created.

The issue in my learning repo that made me think I should learn Category Theory more was More on Category Theory - Especially Monad. (@hdgarrood I am already aware of his blog posts and their PDF-ified version: https://github.com/hmemcpy/milewski-ctfp-pdf )
I was reading through this PDF back in November and I think I started getting lost when he explained Limits. Sometime after that, I worked on fixing my Application Structure lessons in my repo and am finally returning to CT. (Since it’s been so long, I can’t remember whether the videos were a better learning experience than reading the book.)

Rather than continuing with the book, I decided to look at his videos. I don’t know recall why. I appreciated the fact that he draws some arrows, then explains things, then draws more arrows. Since I can see the diagram in various states before its full completion, it helped me interpret those arrows correctly. In other examples, the full complete diagram is drawn, so one does not see the ‘problems’ in an complete version that one ‘solves’ by making the diagram more complete.

Still, I realized that there are a few issues with these videos:

they take a LOT of time to watch (even on 2x speed)
some of his initial drawings are later corrected, or are so tiny that they are not as easy to see
he sometimes refers to things (e.g. partial order or total order) that aren’t explained in the video.

So, my goal was to create a file with diagrams and more context. Since yEd made diagramming easy and SVG reduces the file size, I went with that approach.

Now to address your comments

Also, while this sort of resource is useful as a cheat sheet, I really don’t think it should be positioned as suitable for beginners, and especially not for someone’s first encounter with the topic. You’ll get through this summary a lot faster than you will these videos, sure, but when it’s trimmed down to this extent I doubt beginners will be able to take anything from it.

I’m also a bit curious about who the intended audience is for this too - I’d be a bit concerned about it furthering the myth that you need to know category theory to be able to use PS/Haskell, if this is intended to go in your PS learning resources repo.

The concerns for both of you are ‘Who is the intended audience?’ Initially, it was for new beginners/learners. With your pushback, I’m thinking it should be those who are curious in learning CT. Or, perhaps these should be nothing more than ‘cheatsheets.’

The “intended audience is new beginners” goal is likely based on believing in that myth that “one needs to know CT to use PS/Haskell.” Echoing @chexxor’s comment, I would appreciate clarification on this myth because I have heard similar statements of “oh, you don’t need to know it, but it ‘helps’ in a way.” It would help me greatly in knowing answers to these questions:

Addressing the pink elephant in the room, what should new beginners understand when they hear this myth? If a new learner comes to my repo, how can I bring awareness to this myth, quickly explain the roots behind the myth, and then explain why the myth isn’t true in plain non-jargony English? (If you only have time to answer one of these questions, please answer this one)
In what way does CT ‘help’ me?
- Program design?
- Thinking about how to define my types?
- Understanding why ‘Functor’ is defined in the way that it is as opposed to something else?
- Coming across some problem and realizing “hey, that has the shape of a [CT concept], which means I can solve it by using [function from CT concept]?”
Should I ever learn CT?
- If not, why?
- If I should, at what point in my learning process should it be done? Why then and not sooner/later?
What aspects of CT should I learn (i.e. regardless of what I do, I should know [these CT concepts])?
What aspects can be left for specialization (i.e. only if I do X should I then learn [these CT concepts])?

Thanks for the corrections. What I wrote was merely a summary of what I understood Bartosz to be saying. So, either I misunderstood him or he was giving a “working definition,” so people understood enough to move onto more advanced concepts. I’m guessing the former.

hdgarrood · January 10, 2019, 2:16pm

It’s interesting to me how many authors and users of these libraries say similar to “it’s not necessary to understand category theory”. If so, then why refer to it’s concepts in type signatures? Because it’s helpful for program design? If so, then I should know more about it when designing my own programs. It’s contradictory - doing one thing and saying another.

This is a good point. I don’t think it’s contradictory, and I’ll try to explain why.

I am very confident saying that it’s not necessary to understand category theory to make use of the Functor-Applicative-Monad hierarchy in languages like Haskell or PureScript, because I personally have been writing programs making use of this hierarchy since around 2013, whereas I only really got my head around what a Functor really is (in the category theory sense) last year, when I took a CT course during my maths degree. I’m sure I’m not the only one who was making use of this hierarchy without having studied the ideas behind it, too.

The key point is that a PureScript or Haskell Functor is a very specific example of the category theoretical functor. In category theory, if C and D are categories, a functor F : C -> D is a mapping between them; each object A in C is mapped to an object F(A) in D, and each arrow f : A -> B in C is mapped to an arrow F(f) : F(A) -> F(B) in D, such that F(f ∘ f’) = F(f) ∘ F(f’) for any pair of arrows f, f’ (provided that they are compatible i.e. it makes sense to compose them), and also F(id) = id. Functors for which the source and target category are the same are called endofunctors. So if F : C -> C, we would say that F is an endofunctor on C.

Haskellers often speak of the category Hask, whose objects are the Haskell types of kind * (aka Type) and whose arrows are Haskell functions. A Haskell Functor, therefore, is (more or less) an endofunctor on Hask.

Of all the functors which could exist, the functors which we can realise as instances of Functor are but a very restricted subset. To make use of the hierarchy in Haskell or PureScript, you only need to get your head around this subset. This is a much easier task than properly getting your head around the much wider concept of Functor in the CT sense; there are many examples of category theoretic functors which will at first seem like they have absolutely nothing in common with Haskell functors. For example, there are a few structures from abstract algebra which can be regarded as categories, such as monoids, groups, and posets. Functors between two monoids regarded as categories amount to monoid homomorphisms, and similarly, functors between groups and posets regarded as categories amount to group homomorphisms and order-preserving maps respectively.

You could argue that the Haskell Functor should really be called HaskEndofunctor… but you can probably immediately see why we don’t call it that.

To address the question of why we refer to its concepts in type signatures: because even though you don’t need to get your head around category theory to use them, it’s useful to indicate that this is where these concepts have come from in case you do want to dig deeper.

Category theory can certainly be useful for program design; the Functor-Applicative-Monad hierarchy itself is a great example. I think this talk is a great example too:

However, studying CT is far from the only way to improve your program design skills. For most people, there are going to be many other things you can study which will be more effective in improving your skill in designing programs, in terms of the investment / payoff ratio.

garyb · January 10, 2019, 3:44pm

This is kinda a tangent, but between CT and abstract algebra, I’d argue that the AA stuff is easier to understand and more immediately applicable/recognisable too, as even though it is still abstract, it’s more directly relevant when it comes to the behaviour of things in the numeric hierarchy, almost everything is at least a Semigroup, etc.

It seems like the average user is more likely to be dealing with data structures that can benefit from AA than they are inventing new control structures (that’s certainly true for me anyway).

JordanMartinez · January 11, 2019, 6:32pm

In short, learning CT is about the opportunity cost: will learning CT make me a better programmer than spending that same time learning [some other technology]? Often times, it’s “no”
because most of the CT I’ll use is a subset of CT. My abilities greatly increase the more I learn that subset, but once it’s learned, the increase to my abilities plateau as I learn more CT.

As an analogy (putting this into my own words), using and learning Category Theory is comparable to an old person using a computer: they don’t need to understand how it works (hardware -> boot process -> OS -> Applications, etc.) in order to use it. However, when they do learn how it works, they appreciate the beauty/complexity of it all, but it does not greatly extend their abilities to use the computer in newer ways. If anything, it only empowers them somewhat more. Rather than learning about how the computer works, it makes more sense for the person to learn how to use new programs (e.g. Excel).
Before learning about how computers work, they used an email program to do their emails. After learning about it, they still use their email program to do their emails, but now they realize that deleting old emails or reorganizing the folders may speed up the program (helpful but minor benefits in most cases). In terms of productivity, they would have benefited more if they had invested time learning a money-tracking/budgeting program rather than how the computer works.

Similarly, an FP programmer will have to use and understood the Functor-Applicative-Monad hierarchy, but one does not need to know all their CT foundations in order to use them. They will appreciate more about those concepts by learning their CT foundations, but the opportunity cost may not be worth it: learning or improving one’s skill in X might be better overall than learning more about CT.

I think this talk is a great example too:

Nice video! That was a very interesting talk.

JordanMartinez · January 11, 2019, 6:37pm

It seems like the average user is more likely to be dealing with data structures that can benefit from AA than they are inventing new control structures (that’s certainly true for me anyway).

Is CT mainly for control structures whereas AA is mainly for data structures?

I’d argue that the AA stuff is easier to understand and more immediately applicable/recognisable too

Thanks. I think this gives me guidance as to the order of things to learn.

garyb · January 11, 2019, 7:03pm

It’d be worth seeing what a real mathematician (@hdgarrood ) thinks too, as that’s just like my opinion maan. And yeah, I kinda see CT : control as AA : data broadly speaking.

edit: CT is a generalisation of AA too, for what it’s worth.

hdgarrood · January 12, 2019, 1:39am

I’m about to undermine my status as a ‘real mathematician’ (although I very much enjoy being described as one)…

To be totally honest, I don’t think I know enough category theory to confidently take any position on the idea that category theory is for control structures and abstract algebra is for data structures. (I have only read Tom Leinster’s book up to the chapter on limits). I think this is partially due to the fact that I’m not really sure how to define ‘control structures’. Of course Monad would be one example, but are there any other category-theory inspired ones?

hdgarrood · January 13, 2019, 1:55pm

Of course Monad would be one example, but are there any other category-theory inspired ones?

Come to think of it I suppose Comonad, Free, and Cofree all count as category-theory-inspired control structures as well.

chexxor · January 13, 2019, 5:19pm

I think of those as data structures. But in a Pure language like PureScript data structures are control structures. The only differentiation is how they are used, I would say. It’s one of the reasons the Data. and Control. module prefixes from Haskell slightly disappoint me.

JordanMartinez · January 15, 2019, 2:11am

Maybe I should ask this question in the FP Slack’s #categorytheory channel?

coot · March 8, 2019, 8:58pm

Category theory has been controversial even in mathematics. It is also called abstract nonsense by a reason ;). But it was an amazingly powerful language to discover hidden links between distinct fields. One example is Grothendieck-Galois theory. Galois (the inventor of group theory to large extend) solved the problem how to solve polynomial equations of order higher than 4, Grothendieck showed that’s it’s the same theorem as classification of the so called covering spaces (one of landmarks of topology theory). This connects algebra and topology. There were even further advances in this which are very abstract and thus very broad. And that’s the power of category theory. I don’t thing that anything but basics (which might explain the funny names) are necessary, and category theories often search for thing beyond a given fenomenon. For functional programming this might be the Curry-Howard-Lambek theorem which connects three seemingly distinct fields (as in Grothendieck-Galois theory).

With the advent of algebraic data types which includes recursive types, we get a very powerful language. One way to tame it is to study what are the possible constructions and what algebraic constructions plays well and are useful for expresing evolution of a program / computation. We get monoids, monads (which are kind perusal f recursive function) or even categories. Interestingly, category theory also studies internal structures, internal monoids to some category, internal functors, internal categories … Some of them are more useful than the other. But one had to try to check how they are useful in a programming context.

Control structures are most likely lazy, one doesn’t want to evaluate whole program to just start it. That’s probably the best line between data structures and control ones. Control ones are often recursive as well (hence the connection with monads, e.g. the free monad is a basic recursion scheme, and every monad is a ‘quotient’ of a free monad [this is not precise]). Also monads embrace sequencing which is another nice feature of one would like to model (this is more important in a lazy language). And sequencing via recursion rather than a list like monoid is more compelling in pure functional languages.