In my experience, Data.List.Lazy
often doesn’t quite suffice to allow for the same sort of lazy streaming tricks as in Haskell. While the data structure is indeed lazy, using it in idiomatic PureScript will involve pushing it through some non-lazy classes like Semigroup
and Foldable
. Careful handling is required to preserve the laziness in such cases.
The following is an approach you could take to get a truly lazy collection type that would suffice for the example you’ve given. It employs a few tricks that are worth knowing for more general use, in my opinion.
The first trick is that applying a continuation-passing-style (CPS) transformation to your data structures will effectively convert them from call-by-value to call-by-name. It gives you explicit control over evaluation order. Go from
newtype Wrapper a = Wrapper a
to
newtype LazyWrapper a = LazyWrapper (forall r. (a -> r) -> r)
and now you can evaluate the a
-typed value only once a continuation has been provided to the wrapper.
Adding constraints to the result type r
will effectively give capabilities to your wrapper type—when the context provides your wrapper with extra operations on the result type, the wrapper can do more interesting things than simply call the continuation once and return the result. In our case, we want our wrapper to be a sort of ‘free Foldable
’, so we want to be able to call the continuation multiple times and combine the results. Thus:
newtype FreeFoldable a = FreeFoldable (forall r. Monoid r => (a -> r) -> r)
In order to use sequence
with this FreeFoldable
as the inner effect, it needs to be an Applicative
functor. This is a straightforward exercise in type chasing:
instance Functor FreeFoldable where
map f (FreeFoldable ff) = FreeFoldable (lcmap (lcmap f) ff)
instance Apply FreeFoldable where
apply (FreeFoldable ff1) (FreeFoldable ff2) = FreeFoldable \k -> ff1 \f -> ff2 \a -> k (f a)
instance Applicative FreeFoldable where
pure a = FreeFoldable \k -> k a
Your example uses range
, and so we’ll need an Unfoldable1
instance for it as well.
instance Unfoldable1 FreeFoldable where
unfoldr1 f b = FreeFoldable \k -> let go = f >>> \(Tuple a more) -> k a <> foldMap go more) in go b
Here we see the first real use of the Monoid
constraint inside FreeFoldable
. The FreeFoldable
produced by unfoldr1
will call its provided continuation for every element yielded by f
, and the results of these continuations are fused with <>
, a.k.a. append
.
(Exercise: Could this instance be written with only a Semigroup
constraint inside of FreeFoldable
in place of Monoid
? How would that change what FreeFoldable
represents conceptually?)
Here, however, we run into the sort of problem I was alluding to earlier. append
is a strict operation, so this version of unfoldr1
will produce a FreeFoldable
that immediately runs the continuation it receives on all of the values represented by the FreeFoldable
. While it may not be immediately obvious that this results in running out of heap on your example, you can see that it does if you complete this example without the following modification.
We instead want append
to be lazy in its second argument, so that the obligations of the first argument can be discharged before the second argument is computed. To accomplish this, I will use defer
:
instance Unfoldable1 LazyFreeFoldable where
unfoldr1 f b = LazyFreeFoldable \k -> let go = f >>> \(Tuple a more) -> k a <> defer \_ -> foldMap go more in go b
But note that defer
requires the result value of the continuation to implement the Lazy
class, so we will now switch from the FreeFoldable
definition given previously to the following:
newtype LazyFreeFoldable a = LazyFreeFoldable (forall r. Lazy r => Monoid r => (a -> r) -> r)
(Everything else we’ve done to this point can be migrated from FreeFoldable
to LazyFreeFoldable
without any alterations.)
Again, adding a constraint to the result type grants a capability to the CPS wrapper, in this case the capability to suspend evaluating a continuation until it’s actually needed, not just once the continuation is provided.
At this point, we can write:
example :: LazyFreeFoldable (Array Int)
example = sequence $ replicate 8 $ range 0 7
Construction of this value is near-instantaneous, but of course, we can’t do much with it yet. Let’s say we want to be able to take its length
, which means we’ll need to implement a Foldable
instance. As the name implies, the function wrapped by LazyFreeFoldable
matches very closely to the foldMap
we’ll need:
foldMap' :: forall m a. Lazy m => Monoid m => (a -> m) -> LazyFreeFoldable a -> m
foldMap' f (LazyFreeFoldable lff) = lff f
This is almost suitable for the foldMap
required by Foldable
, except for that pesky Lazy
constraint. We can be semi-clever about satisfying that, though:
instance Foldable LazyFreeFoldable where
foldMap f (LazyFreeFoldable lff) = (lff \a _ -> f a) unit
Why is this only semi-clever? Well, we may be giving the inner function a continuation that returns a defer
able value, but then we immediately force the deferred computation! This will cause problems for us later, but for now let’s continue on to foldr
and foldl
:
instance Foldable LazyFreeFoldable where
foldMap f (LazyFreeFoldable lff) = (lff \a _ -> f a) unit
foldl op z (LazyFreeFoldable lff) = unwrap (unwrap (lff (\a -> Dual $ Endo (_ `op` a)))) z
foldr op z (LazyFreeFoldable lff) = unwrap (lff (\a -> Endo (a `op` _))) z
(When attempting to maximize laziness, we can get better results implementing members like this by hand instead of using foldlDefault
and the like, which often don’t have the best run-time behavior.)
Again, this almost works. Endo
and Dual
are two very useful Monoid
s that can be dropped right into our continuation functions. The trouble is, they don’t have Lazy
instances. (Maybe they should! This might make a nice introductory PR over at purescript-control.)
As PureScript doesn’t support orphan instances, we’re going to have to make a wrapper. Let’s roll up our sleeves and write some boilerplate:
newtype LazyWrapper a = LazyWrapper a
derive instance Newtype (LazyWrapper a) _
derive newtype instance Semigroup a => Semigroup (LazyWrapper a)
derive newtype instance Monoid a => Monoid (LazyWrapper a)
We want Newtype
so that we can use unwrap
with this wrapper the same as we do with Dual
and Endo
. Semigroup
and Monoid
are important for obvious reasons. But the point of this wrapper, of course, is:
instance (Newtype a b, Lazy b) => Lazy (LazyWrapper a) where
defer = coerce (defer :: (Unit -> b) -> b)
The use of coerce
here is a slick trick. The Newtype
constraint both helps the compiler identify what a
is wrapping (i.e., the meaning of b
) and ensures that a
and b
have the same run-time representation—which means that a defer :: (Unit -> b) -> b
can be put to use as a defer :: (Unit -> a) -> a
at run time without any overhead.
Now we can write:
instance Foldable LazyFreeFoldable where
foldMap f (LazyFreeFoldable lff) = (lff \a _ -> f a) unit
foldl op z (LazyFreeFoldable lff) = coerce (lff \a -> LazyWrapper $ Dual $ LazyWrapper $ Endo (_ `op` a)) z
foldr op z (LazyFreeFoldable lff) = coerce (lff \a -> LazyWrapper $ Endo (a `op` _)) z
(The unwrap
s were getting pretty gnarly, so I switched to a single coerce
per member. coerce
will unwrap any number of layers of newtypes. Be careful not to use it in places where the expected type isn’t obvious.)
And now length example
will work without blowing up the heap!
One last thing, though. Having gotten to this point, you might expect to be able to run for_ example logShow
or something to print every array in the example, one at a time. Alas, this explodes the heap again. So does foldMap logShow example
, which uses the Semigroup
instance to combine Effect
s instead of the Apply
instance as for_
does. Why do these examples fail? It comes down to the fact that an Effect
is represented by a thunk in JavaScript, and append
ing (or (*>)
ing) two Effect Unit
s effectively composes those thunks—which means a new thunk is constructed with references to the previous two. So even though we’ve constructed our LazyFreeFoldable
using a pattern that defers the second argument of append
until it’s needed, the effect of append
on Effect Unit
means that once it is needed, it just gets stuffed into an in-memory thunk tree that grows until the heap is exhausted.
The members of Foldable
can’t help us here, but we can use the foldMap'
function we defined earlier to get what we want. The almost-works version is this:
foldMap' logShow example
And it doesn’t work because Effect Unit
isn’t Lazy
. Once again, it could be! And once again, we’re stuck writing a wrapper type instead. We’ll need a little more this time:
newtype LazyEffect a = LazyEffect (Effect a)
derive instance Newtype (LazyEffect a) _
derive newtype instance Functor LazyEffect
derive newtype instance Apply LazyEffect
derive newtype instance Applicative LazyEffect
derive newtype instance Bind LazyEffect
derive newtype instance Semigroup a => Semigroup (LazyEffect a)
derive newtype instance Monoid a => Monoid (LazyEffect a)
instance Lazy (LazyEffect a) where
defer = (u >>= _) where u = pure unit
At last, at long last, you can watch your computer count to 16.7 million in base 8:
unwrap $ foldMap' (logShow >>> LazyEffect) example
I hope this has demonstrated a few new tricks for working in PureScript. The big takeaway, though, is that in a call-by-value language, getting laziness to work smoothly throughout even a relatively simple computation can be a big production, with lots of places where things can subtly go wrong and you won’t notice until you get a performance issue. A big part of this issue in PureScript is that fundamental type classes like Semigroup
and Foldable
simply aren’t typed to support lazy operations in general. Designing an alternate prelude with class signatures that encourage lazy-first programming is left as an exercise for the very industrious reader.
For working with lazy computations in the ecosystem as it exists today, consider the following:
- Do you really need lazy computations?
- Can you program directly on the existing lazy types (
Data.List.Lazy
, Data.Lazy
) without needing the power of the rest of the ecosystem?
- Try CPS-ing your computation and/or data types.
- Try writing wrapper types that add
Control.Lazy
support to types you use.
- Try writing specifically lazy versions of library functions that are too general as-is.
Full working code for the running example is below:
module Main where
import Prelude
import Control.Lazy (class Lazy, defer)
import Data.Array (replicate)
import Data.Foldable (class Foldable, foldMap, length)
import Data.Monoid.Dual (Dual(..))
import Data.Monoid.Endo (Endo(..))
import Data.Newtype (class Newtype, unwrap)
import Data.Profunctor (lcmap)
import Data.Traversable (sequence)
import Data.Tuple (Tuple(..))
import Data.Unfoldable1 (class Unfoldable1, range)
import Effect (Effect)
import Effect.Console (logShow)
import Safe.Coerce (coerce)
newtype LazyWrapper a = LazyWrapper a
derive instance Newtype (LazyWrapper a) _
derive newtype instance Semigroup a => Semigroup (LazyWrapper a)
derive newtype instance Monoid a => Monoid (LazyWrapper a)
instance (Newtype a b, Lazy b) => Lazy (LazyWrapper a) where
defer = coerce (defer :: (Unit -> b) -> b)
newtype LazyEffect a = LazyEffect (Effect a)
derive instance Newtype (LazyEffect a) _
derive newtype instance Functor LazyEffect
derive newtype instance Apply LazyEffect
derive newtype instance Applicative LazyEffect
derive newtype instance Bind LazyEffect
derive newtype instance Semigroup a => Semigroup (LazyEffect a)
derive newtype instance Monoid a => Monoid (LazyEffect a)
instance Lazy (LazyEffect a) where
defer = (u >>= _) where u = pure unit
newtype LazyFreeFoldable a = LazyFreeFoldable (forall r. Lazy r => Monoid r => (a -> r) -> r)
instance Functor LazyFreeFoldable where
map f (LazyFreeFoldable lff) = LazyFreeFoldable (lcmap (lcmap f) lff)
instance Apply LazyFreeFoldable where
apply (LazyFreeFoldable lff1) (LazyFreeFoldable lff2) = LazyFreeFoldable \k -> lff1 \f -> lff2 \a -> k (f a)
instance Applicative LazyFreeFoldable where
pure a = LazyFreeFoldable \k -> k a
instance Unfoldable1 LazyFreeFoldable where
unfoldr1 f b = LazyFreeFoldable \k -> let go = f >>> \(Tuple a more) -> k a <> defer \_ -> foldMap go more in go b
foldMap' :: forall m a. Lazy m => Monoid m => (a -> m) -> LazyFreeFoldable a -> m
foldMap' f (LazyFreeFoldable lff) = lff f
instance Foldable LazyFreeFoldable where
foldMap f (LazyFreeFoldable lff) = (lff \a _ -> f a) unit
foldl op z (LazyFreeFoldable lff) = coerce (lff \a -> LazyWrapper $ Dual $ LazyWrapper $ Endo (_ `op` a)) z
foldr op z (LazyFreeFoldable lff) = coerce (lff \a -> LazyWrapper $ Endo (a `op` _)) z
example :: LazyFreeFoldable (Array Int)
example = sequence $ replicate 8 $ range 0 7
main :: Effect Unit
main = do
unwrap $ foldMap' (logShow >>> LazyEffect) example
logShow (length example :: Int)