When to use Control.Lazy, Data.Lazy or an explicit lazy function

sherman3ero · June 7, 2021, 6:21pm

Last week I ran into a performance issue while trying to Decode a recursive data type. With the help of the spy function I noticed that the problem was that we were eagerly trying to decode all options. I knew that the solution was to make the decoding lazy and my first attempt was to try and use Control.Lazy, but I couldn’t make head or tails of it. Luckly, I posted my issue in slack and with the help of @monoidmusician and @natefaubion I was able to fix the issue.

I decided to cross post fragments of the discussion so we don’t lose the information.

sherman3ero
does somebody have a good resource for understanding the Lazy typeclass in Control.Lazy? I understand the difference between eager and lazy but I don’t know exactly how to add the typeclass in the mix. For context, I’m having some performance issues with the Decode class from Foreign.Generic.Class as the decoder is eagerly trying more than it should.

To get my feet wet I’m trying to modify this example to use the typeclass but I can’t make it work https://jordanmartinez.github.io/purescript-jordans-reference-site/content/02-FP-Philosophical-Foundations/04-Lazy-vs-Strict.html

neither of these expressions compile

lazilyCompute' :: forall l. Lazy (l Int) => Int -> l Int
lazilyCompute' x = defer $ \_ -> strictlyCompute x

forceThunk' :: forall l. Lazy (l Int) => l Int -> Int

forceThunk' thunk = thunk unit

monoidmusician
note that that Lazy typeclass is very different from the Lazy datatype
I think you probably don’t need the Lazy typeclass unless you’re designing a library

are your issues with the decoders use of <|>? those might be hard to work around without writing your own decoders

sherman3ero
I’m defining a decoder for recursive type, and depending on the value, it takes a loooooong time. I managed to notice (with the use of spies), that even if the <|> operator is lazy, the expression seems to decode all paths per level (of the recursion)
I am writing my own decoders as we can’t use the generic representation

monoidmusician
how are you making <|> lazy?

sherman3ero
I’m not, looking at the definition I thought it was… So, I’m using decode, which returns F a, and that ends up being an ExceptT, and from what I understood, the alt implementation first try the first computation, and only if it fails it executes the second
https://github.com/purescript/purescript-transformers/blob/v5.1.0/src/Control/Monad/Except/Trans.purs#L83

monoidmusician
ah but F is Except e = ExceptT Identity e, the inner monad is not lazy so it doesn’t make a difference
making things lazy in PS is tricky

sherman3ero
doesn’t Identity have the Lazy instance?

monoidmusician
I don’t think so?
it might have a Lazy a => Lazy (Identity a) instance, but that just means it inherits laziness, it doesn’t add any laziness of its own
( that’s why Lazy is a bad typeclass for teaching)

I would suggest writing your own <|> operator
with type F a -> (Unit -> F a) -> F a
something like

altLazy :: forall a. F a -> (Unit -> F a) -> F a
altLazy (ExceptT (Identity (Right a)) _ = pure a
altLazy _ b = b unit

you can use typeclasses if you want, but I think just directly writing it out is simplest and the best way to get it correct

sherman3ero
it worked like a charm! Thanks a lot @monoidmusician @natefaubion
I’m still wondering what’s the intuition on when to use the Control.Lazy typeclass, the Data.Lazy type or other solutions like the one you posted…

natefaubion
Data.Lazy is for haskell-like lazy (memoized) evaluation. It’s useful when you have a computation/producer with potentially 0 or many consumers. This lets you save work by either not performing it or by sharing it. Control.Lazy is an abstraction for lazy (co)data/control structures. Examples are parsers and lazy lists. Essentially anything that can have an infinite unfolding, and so you can’t evaluate it’s whole spine up front. Therefore you need a way to defer portions of it to avoid non-termination. These may use Data.Lazy internally, or may not, and so may or may not share work. For example, lazy lists will share work, but parsers usually don’t. Unit -> a is equivalent to call-by-name deferred evaluation. There is no shared work, so it’s good for one-shot control structures.

I think that Control.Lazy should really be Control.Defer
“lazy” is a really overloaded term, and it’s super confusing to have both Data and Control.
I reach for Unit -> a 98% of the time.
purescript-call-by-name is a bit of a hack to auto-lift terms into Unit -> a closures. It’s equivalent to Swift’s @autoclosure annotation. This just means you don’t have to write \_ -> ... everywhere.
So with the magic <|> it will auto-lift the right-hand side. so foo <|> something expensive gets elaborated into foo <|> (\_ -> something expensive)
And then your expensive right hand side won’t be evaluated strictly.