Last week I ran into a performance issue while trying to Decode a recursive data type. With the help of the
spy function I noticed that the problem was that we were eagerly trying to decode all options. I knew that the solution was to make the decoding lazy and my first attempt was to try and use
Control.Lazy, but I couldn’t make head or tails of it. Luckly, I posted my issue in slack and with the help of @monoidmusician and @natefaubion I was able to fix the issue.
I decided to cross post fragments of the discussion so we don’t lose the information.
does somebody have a good resource for understanding the Lazy typeclass in Control.Lazy? I understand the difference between eager and lazy but I don’t know exactly how to add the typeclass in the mix. For context, I’m having some performance issues with the Decode class from Foreign.Generic.Class as the decoder is eagerly trying more than it should.
To get my feet wet I’m trying to modify this example to use the typeclass but I can’t make it work https://jordanmartinez.github.io/purescript-jordans-reference-site/content/02-FP-Philosophical-Foundations/04-Lazy-vs-Strict.html
neither of these expressions compile
lazilyCompute' :: forall l. Lazy (l Int) => Int -> l Int lazilyCompute' x = defer $ \_ -> strictlyCompute x forceThunk' :: forall l. Lazy (l Int) => l Int -> Int forceThunk' thunk = thunk unit
note that that Lazy typeclass is very different from the Lazy datatype
I think you probably don’t need the Lazy typeclass unless you’re designing a library
are your issues with the decoders use of <|>? those might be hard to work around without writing your own decoders
I’m defining a decoder for recursive type, and depending on the value, it takes a loooooong time. I managed to notice (with the use of spies), that even if the <|> operator is lazy, the expression seems to decode all paths per level (of the recursion)
I am writing my own decoders as we can’t use the generic representation
how are you making <|> lazy?
I’m not, looking at the definition I thought it was… So, I’m using decode, which returns F a, and that ends up being an ExceptT, and from what I understood, the alt implementation first try the first computation, and only if it fails it executes the second
ah but F is Except e = ExceptT Identity e, the inner monad is not lazy so it doesn’t make a difference
making things lazy in PS is tricky
doesn’t Identity have the Lazy instance?
I don’t think so?
it might have a
Lazy a => Lazy (Identity a)instance, but that just means it inherits laziness, it doesn’t add any laziness of its own
( that’s why
Lazyis a bad typeclass for teaching)
I would suggest writing your own <|> operator
with type F a -> (Unit -> F a) -> F a
altLazy :: forall a. F a -> (Unit -> F a) -> F a altLazy (ExceptT (Identity (Right a)) _ = pure a altLazy _ b = b unit
you can use typeclasses if you want, but I think just directly writing it out is simplest and the best way to get it correct
it worked like a charm! Thanks a lot @monoidmusician @natefaubion
I’m still wondering what’s the intuition on when to use the Control.Lazy typeclass, the Data.Lazy type or other solutions like the one you posted…
Data.Lazy is for haskell-like lazy (memoized) evaluation. It’s useful when you have a computation/producer with potentially 0 or many consumers. This lets you save work by either not performing it or by sharing it. Control.Lazy is an abstraction for lazy (co)data/control structures. Examples are parsers and lazy lists. Essentially anything that can have an infinite unfolding, and so you can’t evaluate it’s whole spine up front. Therefore you need a way to defer portions of it to avoid non-termination. These may use Data.Lazy internally, or may not, and so may or may not share work. For example, lazy lists will share work, but parsers usually don’t.
Unit -> ais equivalent to call-by-name deferred evaluation. There is no shared work, so it’s good for one-shot control structures.
I think that Control.Lazy should really be
“lazy” is a really overloaded term, and it’s super confusing to have both Data and Control.
I reach for
Unit -> a98% of the time.
purescript-call-by-name is a bit of a hack to auto-lift terms into
Unit -> aclosures. It’s equivalent to Swift’s
@autoclosureannotation. This just means you don’t have to write
\_ -> ...everywhere.
So with the magic
<|>it will auto-lift the right-hand side. so
foo <|> something expensivegets elaborated into
foo <|> (\_ -> something expensive)
And then your expensive right hand side won’t be evaluated strictly.