A guide to `purescript-coroutines`

JordanMartinez · March 22, 2020, 10:19pm

I’m reposting this here, so that I don’t spam the original thread where I posted it. Sorry about the double post, but I thought it’d be better not to side-track the original thread:

Here’s my basic understanding of how purescript-coroutines work based on my viewing yesterday. It isn’t complete (and I hope to finish it later this week), but it should be enough to get people 80% of what they need.

What Problem does `purescript-coroutines` Solve?

An Overview of the Problem

Based on an example provided by the library, it seems the library is trying to model functional reactive programming.
In its simplest version, functional reactive programming models “data pipelines” in a declarative way rather than an imperative way. Through it, one can write,

(simple) “When the event X occurs, do Y.”
(complex) “When EITHER (event A occurs, and B has not happened in the past 3 seconds, and user is in context C) OR (event D occurs and E is not false), do Z.”
(simple) “Pull the next 10 characters from a file and do X with them.”

“Data pipelines” work the same way as using the pipe/| character to compose multiple Linux bash programs together.

command1 | command2 | command3 | ... | finalCommand

It works similar to using >>> to compose multiple functions together.

To model this pipeline-like concept, coroutines provides 4 different concepts: Producers, Consumers, Transformers, and Cotransformers. We’ll cover the first two before describing the latter two.

The Basic Concepts

Producers represent a way to produce some initial data. For example, an apple tree produces apples. An HTML tag “produces” click events. A paper tray in a printer can provide paper when someone needs to print something. Producers don’t know or care what happens to the things they produce; they simply produce it.

Consumers represent a way to consume some final data. For example, one might eat an apple. Or a click event might be handled by alerting the user of some information. Or a printer might use the paper to print out a document. Consumers don’t know or care where the data comes from; they simply consume it once they get it.

Thus, the shortest program one can build is a Producer (e.g. apple tree) that is connected with a Consumer (e.g. a person who likes eating apples). When the Producer “emits” or produces some data, the Consumer “eats” or consumes that data. Put visually:

Producer --- (data) ---> Consumer

However, if all we had were Producers and Consumers, things would get pretty boring. This style of programming gets more interesting when we introduce Transformers and Cotransformers.

`Transformer`, a `Functor` for `Producer` and a `Contravariant` for `Consumer`

Transformer has similarities to the Functor and Contravariant type classes. In short, a Transformer changes the type emitted by Producer or consumed by Consumer.

Put visually, if Producer produces/emits/outputs values of the a type, and Transformer knows how to change values of the a type to values of the b type, and Consumer consumes values of the b type, then we would draw this like so:

Producer --- (a) --> Transformer ---- (b) ---> Consumer

If we combine the Producer with the Transformer, then the Transformer has a Functor-like affect on the Producer. In other words, it acts like map/<$> on the Producer's output.

| ------ the whole Producer ------ |
( Producer --- (a) --> Transformer ) --- (b) ---> Consumer

If we combine the Consumer with the Transformer, the Transformer has a Contravariant-like affect on the Consumer. In other words, it acts like contrmap/>$< on the Consumer's input.

                     | ------ the whole Consumer ------  |
Producer --- (a) --> ( Transformer --- (b) ---> Consumer )

`Cotransformer`, a `Producer`-`Consumer` with a transformative “hook”

Cotransformer is a way of combining both a Producer and Consumer together in an underlying “black box.” However, this black box exposes a “hook” that can accomplish the same Functor/Contravariant-like affect of a Transformer.

Put visually, the below box represents the Cotransformer and the function, a -> b, is the hook.

                       (a -> b)
                          ↓↓
                      +---++---+
                      |   ||   |
+---------------------+   ||   +------------------+
|                         ↓↓                      |
| Producer --- (a) ---> "hook" (b) ---> Consumer  |
|                                                 |
+-------------------------------------------------+

This is useful when you have the same Producer-Consumer relationship, but you need to change the Producer's output depending on other contextual information.

Data Pipelines: Recipes in the Making

In our original example of a producer, we mentioned that an apple tree produces apples. Then, we said that one apple can be transformed into applesauce or apple pie. Using this example, one might think that an apple can only be “consumed” once.

In reality, we can consume one apple multiple times. We’re dealing with data and binary: things that can be copied or referenced. Thus, we might have one Producers that emits a value that is consumed by multiple Consumers (e.g. 1-to-many). Or we might have multiple Producers whose outputs are eventually consumed by one Consumer (e.g. many-to-1). Or it might be multpile Producers whose outputs are consumed by multiple Consumers (e.g. many-to-many). The “food recipe” idea to functional programming definitely applies here

To Push or to Pull?

Once we have a data pipeline constructed, the question becomes, “Who controls the flow of data: the Producer or the Consumer?”

Program description starts with the phrase:	Controller	Pipeline Type
When a new event A occurs… then…	`Producer`	push-based
When I am ready for the next … then…	`Consumer`	pull-based

For example, when the apple tree produces apples and the apple-lover eats the apple, the apple-lover (i.e. Consumer) depends upon the apple tree (Producer). Thus, this is a push-based relationship because the Producer must “push” the data to the Consumer.

When the printer needs to print a new document, the printer (i.e. Consumer) will pull one sheet of paper from the paper tray (i.e. Producer). Thus, this is a pull-based relationships because the Consumer must “pull” data from the Producer.

The Internals of `purescript-coroutines`

Overviewing the Types

Looking at the library’s documentation, we have the following types:

Co
Emit
Producer
Await
Consumer
Transformer
Cotransformer
Process

If we look at the initial 5 types, we’ll see the following (and their desugared definitions):

type Co = FreeT
-- desugars into
type Co f m a = FreeT f m a

data Emit output a = Emit output a
type Producer output = Co (Emit output)
-- desugars into
type Producer output monad a = FreeT (Emit output) monad a

data Await input a = Await (input -> a)
type Consumer input = Co (Await input)
-- desugars into
type Consumer input monad a = FreeT (Await input) monad a

type Process = Co Identity
-- desugars into
type Process monad a = FreeT Identity monad a

`FreeT` and Its Usage

data Free functor output = -- implementation not shown

createAST :: Free functor output
createAST = -- implementation not shown

interpretAST
  :: forall functor m output
  --            (functor        ~> m       ) -- a NaturalTransformation
   . Monad m => (functor output -> m output) -> Free functor output -> m output
interpretAST = -- implementation not shown

Given that functor is a data structure that has a Funtor instance, Free functor output enables one to build a pure, side-effect-free and useless abstract syntax tree (AST) using data structures and then to interpret that AST into an impure, side-effectful but useful program. This interpretation works via a NaturalTransformation from the functor to the side-effectful monad.

FreeT changes this slightly by adding another type to the mix:

data FreeT functor baseMonad output = -- implementation not shown

FreeT has the shape of a monad transformer. A monad transformer is a way to “augment” a base monad (e.g. Effect or Aff) with some additional capabilities. It enables us to write “pure, side-effect-free” business logic that is decoupled from the impure, side-effectful implementation of that logic. For example, one writes their logic using a type class (e.g. MonadReader, for getting a configuration value at any point in the computation) and then uses a monad transformer to implement the requirements of that type class (e.g. ReaderT).

FreeT is a way of merging the Free-based notion of a pure AST data structure with the impure, side-effectful a monad transformer that implements the program. In other words, FreeT does not need a type class to achieve the same thing that a regular “type class and monad transformer” combination does. Thus, FreeT functor baseMonad output says,

If you give me a domain-specific language in terms of data structures that have a Functor instance (e.g. purescript-coroutines “commands:” Emit a value and Await for a value), and you give me a NaturalTransformation from those Functor types to a monadic data type like Aff (e.g. something the end-user of coroutines provides), then I can use those pure commands to run an impure program that produces output."

Reviewing `purescript-coroutines`’ types

Producer

Thus, looking back at our types

type Co f m a = FreeT f m a

data Emit output a = Emit output a

-- Produces values of type, `output`. These values can be produced
-- via side-effects if the `monad` type supports that.
type Producer output monad a = FreeT (Emit output) monad a

-- Create a producer that doesn't use side effects to produce its output.
-- It will always produce the same value each time in a pull-based consumer
-- or only once in a push-based consumer
emit :: forall m o. Monad m => o -> Producer o m Unit

-- Create a producer that uses side effects to produce its output.
-- If this side-effect returns `Left o`, the producer emits the `o`.
-- If it returns `Right r`, the producer stops emitting values.
producer :: forall o m r. Monad m => m (Either o r) -> Producer o m r

Consumer

type Co f m a = FreeT f m a

data Await input a = Await (input -> a)

-- Consume values of type, `input`. When consuming the input,
-- the Consumer can run side-effectful computations if the `monad` type
-- supports it.
type Consumer input monad a = FreeT (Await input) monad a

-- Create a consumer that never 'closes.' It will always respond to new
-- values produced by the Producer.
await :: forall m i. Monad m => Consumer i m i

-- Create a consumer that can close. Once closed, it will stop consuming
-- values produced by the Producer.
consumer :: forall i m r. Monad m => (i -> m (Maybe r)) -> Consumer i m r

Transformer

-- Lift a function into a coroutines type, but don't yet use it to transform
-- a Producer or Consumer.
transform :: forall m i o. Monad m => (i -> o) -> Transformer i o m Unit

-- Transform a Producer.
transformProducer :: forall i o f m a. MonadRec m => Parallel f m => Producer i m a -> Transformer i o m a -> Producer o m a

infixr 2 transformProducer as $~

-- Transform a Consumer.
transformConsumer :: forall i o f m a. MonadRec m => Parallel f m => Transformer i o m a -> Consumer o m a -> Consumer i m a

infixr 2 transformConsumer as ~$

-- Compose Transformers
composeTransformers :: forall i j k f m a. MonadRec m => Parallel f m => Transformer i j m a -> Transformer j k m a -> Transformer i k m a

infixr 2 composeTransformers as ~~

`Process`: a way to run a `Producer`-`Consumer` relationship

While not covered above, a Process is the result of connecting a Producer to a Consumer, so that the data pipeline is fully formulated and now ready to be executed.

-- Run a `Process` to completion.
runProcess :: forall m a. MonadRec m => Process m a -> m a

-- Create a Process by connecting a Producer and Consumer together using a
-- "push-based" relationship (i.e. Producer is in control).
-- The process ends when the Producer closes.
connect :: forall o f m a. MonadRec m => Parallel f m => Producer o m a -> Consumer o m a -> Process m a

infixr 2 connect as $$

-- Create a Process by connecting a Producer and Consumer together using a
-- "pull-based" relationship (i.e. Consumer is in control).
-- The process ends when the Consumer closes.
pullFrom :: forall o m a. MonadRec m => Consumer o m a -> Producer o m a -> Process m a

Infix Notation Cheatsheet

In the infixes above, the ~ indicates which side the transformer is on while the $ indicates which side the producer/consumer is on:

producesA      $~ transformsAToB == producesB
transformsAToB ~$ consumesB      == consumesA
transformsAToB ~~ transformsBToC == transformsAToC
producesA      $$ consumesA      == process

flip101 · March 25, 2020, 4:40pm

Thanks for this, great explanation. A lot of what was written reminds me of haskell’s conduit library.

However what i find a bit odd is that in other languages coroutines exist but there is not as a strong emphasis on pipes (or not even mentioning pipes at all !). I feel because of this i miss another point of view what the purescript coroutine library is. I associate “coroutine” also with async and concurrency, which was not mentioned. Maybe in the end the pipes-explanation is all what one needs to know to work with this technology … i don’t know.

thomashoneyman · March 25, 2020, 7:48pm

@JordanMartinez We should definitely add this to the library when you’re comfortable!

JordanMartinez · March 25, 2020, 8:13pm

For context, I once wrote code in Java. When I was reading through the code and examples in purescript-coroutines, it greatly reminded me of the concepts I had previously documented in ReactFX (see the docs I wrote live in the EventStreams section in the wiki). Most of the guide I wrote above was just an adaption of that prior work but for this particular library. Thus, if this library can be used in ways other than this “event stream” concept, I’m not aware of it and would not be able to update the guide to show that insight.
I know I’ve heard of pipes and other similar libraries but I haven’t used them myself or looked into them further.

JordanMartinez · March 25, 2020, 8:19pm

See https://github.com/purescript-contrib/purescript-coroutines/pull/31

flip101 · March 25, 2020, 8:39pm

Ok no problem. I think you already captured a lot of value with this explanation.

To clarify further i will quote some sources.

Coroutines are computer program components that generalize subroutines for non-preemptive multitasking, by allowing execution to be suspended and resumed. Coroutines are well-suited for implementing familiar program components such as cooperative tasks, exceptions, event loops, iterators, infinite lists and pipes.

From https://en.wikipedia.org/wiki/Coroutine

I think the “pipes” in that list of things that can be implemented are closest to the long text which was written about producers and consumers.

Coroutines are a general control structure whereby flow control is cooperatively passed between two different routines without returning.

The ‘yield’ statement in Python is a good example. It creates a coroutine. When the 'yield ’ is encountered the current state of the function is saved and control is returned to the calling function. The calling function can then transfer execution back to the yielding function and its state will be restored to the point where the ‘yield’ was encountered and execution will continue.

From https://stackoverflow.com/a/553745/1833322

All i’m saying is that the library is called “purescript-coroutines” and the text is saying “the library is trying to modal functional reactive programming”. When it seems that coroutines are a bit more broader primitive. But perhaps the purescript-coroutine library does specifically offer the frp / conduit / pipe functionality and it’s not to be taken in the sense of the primitive “control flow yielding piece of code”.

It’s certainly not meant to critique your text, i think it’s very good. It’s just my surprise that i read about FRP / conduit / pipe concept because i hadn’t really connected it to the term “coroutine” as strongly before.

natefaubion · March 26, 2020, 3:16am

I think at a high-level this is very good. Thanks for working on this! Some comments:

FRP (Conal-style) is not really related. There’s may be overlap for some use cases, but coroutines is not attempting to implement FRP (maybe look at purescript-events and purescript-behaviors for this).
Coroutines does not really implement push or pull. The primary means of fusing these uses Parallel to run all coroutines in parallel up until they hit a fusion boundary. I think this is an interesting take, but my experience has been that this is unintuitive if you are expecting more traditional stream processing. The consequences of this is that you need a Parallel instance, which is rare (it will probably be backed by Aff or maybe ContT Effect). This means it’s not possible to implement a pure pipeline with the built-in fusion combinators. You can get limited pull-based fusion with pullFrom, but this always creates a Process so you can’t create pull-based pipelines in general with what exists. We added it for Halogen because the Parallel semantics were broken for our use case. Push-based semantics are a whole other ball of wax (see this comment).
Processes end when anything stops with pure. There’s nothing to prevent a Producer/Consumer pullFrom to be halted by the Producer, but this isn’t really unique to coroutines.
You’ve omitted CoTransformers (which is understandable). If a Transform takes an input to some output, then a CoTransform is an output that can resume once provided an input. Fusing a Transform and CoTransform also creates a Process, where the initial output feeds into the Transform, then that feeds back into the CoTransform.

Tangentially related, the most common way of trying to integrate event-based stuff with coroutines (purescript-aff-coroutines) has some subtle problems and will result in leaked event sources that buffer their values indefinitely if you don’t use it just right. Halogen has all the EventSource stuff on top to account for this, and I don’t recommend using aff-coroutines directly.

garyb · March 26, 2020, 12:35pm

Halogen is moving to Event for its event subscriptions in v6 also, since they’re much simpler and meet the needs of that use case at least as well as coroutines did.

I have used coroutines in anger somewhat recently though, and they really shone in that use case: we have a web socket that’s providing data that is updating the UI, but for performance reasons the UI is only updating on demand from user interaction. The thing that made them so useful there is the ability for either side to pause - the socket side is paused when waiting for new messages from the server, and the UI side is paused until the user starts scrolling again. It worked out really nicely.

natefaubion · March 26, 2020, 1:57pm

The Parallel semantics work out if your Process is infinite, but if either side halts then you will get into the case where a side has been run for effects, but the only thing that can be done is just ignore the result. This can leave you in an inconsistent state. It’s similar to the issue with push-semantics where the producer will outrun the consumer.

@garyb Could this be solved by using a queue as well? I don’t know if you’re using aff-coroutines, but it essentially uses an AVar as a queue, where emit is a put, and await is a take. I personally recommend handling this pattern explicitly if you’re using Aff.

garyb · March 26, 2020, 2:22pm

It’s not using aff-coroutines, no. It is using an AVar to block on the UI side, and it could perhaps have been done entirely with an AVar, since the queuing behaviour is quite close to what this thing does, but I created a mutable array backed buffer for it, since there was some looking ahead and things required also.

Also, just curious, this is getting a little off topic now, but do you think AVar queueing would still be performant when it has like 100k things backed up in it? The buffer I’m dealing with can get pretty massive.

JordanMartinez · March 26, 2020, 2:25pm

“Broader primitive” makes sense! As I was writing the guide, part of me thought, “I wonder what would happen if one used something other than Await, Emit, Transform, etc. for the functor in the FreeT. Since Co SomeOtherFunctor would still fit the base type, perhaps there is more one could do than what this library currently exposes.”

Ah, no worries! Thanks for pointing out this kind of stuff. I guess you could say I had the opposite reaction. I thought, “Hey! That reminds me of ReactFX! …Huh? What’s a coroutine?”

I was wondering what the difference was between this and purescript-events as I thought the latter was actual FRP. I stated FRP above as the solution because it seemed too similar to FRP, so that was my understanding at the time (albeit with a bit of confusion for why then was it called ‘coroutines’). Regardless, thanks for further affirming that correction.

That was an assumption on my part. Thanks for correcting that.

Yeah… I ran out of time when I was writing the above guide. So, I figured I would cover Process and get back to it later since that’s how one runs it. Even if I had, it looks like my overview would have been wrong anyways.

Nice!

So, here’s some lingering questions I have:

Nate and Gary both implied that coroutines could be improved in various ways to make it more user-friendly (at least, that’s how I read your statements). What would such improvements look like? Could these be incorporated into purescript-coroutines or should that be a separate library?
Besides the things Nate mentioned above, what else would need to be covered before this guide is complete?
Besides Gary’s example, what other problems would be easily solved (and should be solved) using purescript-coroutines (rather than purescript-event, etc.)?

natefaubion · March 26, 2020, 3:43pm

There are no performance concerns with the size of the producer queue. It’s implemented as a mutable doubly-linked list, and all the operations are constant time. Honestly I wish I had just this exposed with a PureScript API . The only concern might be memory? There is the cell for the list, and then there is the object for the value/callback, so it will probably use more memory than a given Array. The AVar queue doesn’t need to copy as it grows, so there are some tradeoffs.

Nate and Gary both implied that coroutines could be improved in various ways to make it more user-friendly (at least, that’s how I read your statements).

My main issue with it is that it isn’t clear what the vision or goal of the library is. What workflows and use-cases is it trying to support? Without knowing that, it’s not possible to say whether the semantics are correct or not, or are particularly useful, or what really needs to be added or changed. Most users coming to it have slightly different needs, and I’ve found that it satisfies a very small number of them, so I’m inclined to recommend other solutions.

thomashoneyman · June 22, 2020, 1:34am

This has been added to the coroutines library and can be iterated on from there. Thanks, Jordan!

JordanMartinez · June 22, 2020, 8:13pm

Thanks for finishing it, @thomashoneyman!