Solving problems the FP way aka Purescript AdventofCode

razcore-rad · February 13, 2023, 2:09pm

Purescript v0.15.7

I didn’t really know how to title this, but anyway long story short, I regained my motivation so learn Purescript yet again and for the past month or so I went back to https://adventofcode.com to solve some puzzles from the beginning, i.e. 2015.

I made some progress up to 2015 - day 17 where I got stuck because of memory limitations.

I was pretty surprised to see that Purescript doesn’t have a combinatorics-related package out there so for the purposes of solving these problems I went to the Python docs and stackoverflow to get some examples of implementations in Javascript and Python.

So I tried this code for cartesianProduct:

cartesianProduct :: ∀ f a. Foldable f => f (Array a) -> Array (Array a)
cartesianProduct = foldl (\acc pool -> concatMap (\as -> snoc as <$> pool) acc) [[]]

The problem with it is that it easily blows the heap memory. On my PC, cartesianProduct $ replicate 8 (range 0 7) blows the heap.

The input for the problem on day 17 is of length 20, i.e. the equivalent of range 0 19 so I need to start with cartesianProduct $ replicate 20 input essentially.

What could I do in this case? Is there a way to make this function not blow up the heap?

For reference:

here’s a link to how the solution looks in JavaScript: Advent of Code 2015 — Explanation ⸱ Blog ⸱ Eugene Obrezkov
the heap error in spago repl for cartesianProduct $ replicate 8 (range 0 7):

> cartesianProduct $ replicate 8 (range 0 7)

<--- Last few GCs --->

[447659:0x6a042a0]    12023 ms: Scavenge 2022.7 (2064.9) -> 2018.5 (2065.9) MB, 8.1 / 0.0 ms  (average mu = 0.176, current mu = 0.121) allocation failure; 
[447659:0x6a042a0]    12035 ms: Scavenge 2023.7 (2065.9) -> 2020.2 (2068.1) MB, 9.5 / 0.0 ms  (average mu = 0.176, current mu = 0.121) allocation failure; 
[447659:0x6a042a0]    12044 ms: Scavenge 2026.0 (2068.1) -> 2022.2 (2085.4) MB, 7.5 / 0.0 ms  (average mu = 0.176, current mu = 0.121) allocation failure; 


<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0xbe3c50 node::Abort() [/home/razcore-rad/.local/apps/node/bin/node]
 2: 0xaf08b4  [/home/razcore-rad/.local/apps/node/bin/node]
 3: 0xdcf7c0 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [/home/razcore-rad/.local/apps/node/bin/node]
 4: 0xdcfb76 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [/home/razcore-rad/.local/apps/node/bin/node]
 5: 0xfce5f5  [/home/razcore-rad/.local/apps/node/bin/node]
 6: 0xfceba6 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [/home/razcore-rad/.local/apps/node/bin/node]
 7: 0xfe0e96  [/home/razcore-rad/.local/apps/node/bin/node]
 8: 0xfe1ad5 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/home/razcore-rad/.local/apps/node/bin/node]
 9: 0xfe41b8 v8::internal::Heap::HandleGCRequest() [/home/razcore-rad/.local/apps/node/bin/node]
10: 0xf5e427 v8::internal::StackGuard::HandleInterrupts() [/home/razcore-rad/.local/apps/node/bin/node]
11: 0x13b1765 v8::internal::Runtime_StackGuard(int, unsigned long*, v8::internal::Isolate*) [/home/razcore-rad/.local/apps/node/bin/node]
12: 0x18383f9  [/home/razcore-rad/.local/apps/node/bin/node]

CarstenKoenig · February 13, 2023, 2:44pm

8 to the power of 8 is quite a large number (16,777,216) - are you sure you need that many arrays?
This was the tetris-problem right? Why do you need to enumerate anything here? Part 1 was checking if you implemented the “simulation” right and part 2 was a bit more involved (I don’t want to spoiler it here but I don’t think you need to compute combinations in that range - or at all).

Btw: You can do most “combinatorics” algorithms by using the monad-instance for arrays.

CarstenKoenig · February 13, 2023, 2:51pm

Oh sorry - I guess you are solving 2015/Day17 not 2022 - you don’t have to generate all combinations there either - while you look for combinations you should look at the volume you have to fill still.

razcore-rad · February 13, 2023, 2:51pm

Sorry, wasn’t clear, I meant adventofcode 2015 day 17. Anyway generalities like these aren’t helpful to me cause I don’t know Purescript that well or another way of saying it: how would an experienced Purescript developer solve 2015-D17?

CarstenKoenig · February 13, 2023, 2:53pm

yeah sorry - so do you mind spoilers then? One moment I’ll translate my Haskell solution …

razcore-rad · February 13, 2023, 2:54pm

Sure no problem, I already know how to solve the problem in something like Python it’s just the barrier with the language and it’s memory etc. limitations.

CarstenKoenig · February 13, 2023, 3:00pm

Ok here you go:

this is my part 1 (with my input).

In case you are wondering: this is more or less a dynamic programming kind of solution: You consider each container and you have two options: use it or don’t - if you use it you continue with a recursive sub-problem where the remaining volume you have to find is lowered by the picked container - if not you go on considering the rest containers with the same volume.
Now the only other part is the guard such that you don’t pick containers that are too large.

And then it goes on to just count the solutions - part two can be build upon this one (hint: sort)

module Day17 where

import Prelude

import Data.Array ((:))
import Data.Array as Array
import Data.Maybe (Maybe(..))

combinations :: Int -> Array Int -> Array (Array Int)
combinations 0 _ = [ [] ]
combinations vol arr =
  case Array.uncons arr of
    Just { head: c, tail: cs } ->
      let
        without = combinations vol cs
        with = (c : _) <$> combinations (vol - c) cs
      in
        if c <= vol then with <> without else without
    Nothing -> []

myInput :: Array Int
myInput =
  [ 43
  , 3
  , 4
  , 10
  , 21
  , 44
  , 4
  , 6
  , 47
  , 41
  , 34
  , 17
  , 17
  , 44
  , 36
  , 31
  , 46
  , 9
  , 27
  , 38
  ]

usage:

> import Data.Array as A
> import Day17 as D
> A.length $ D.combinations 150 D.myInput
1638

Which is the right answer - no heap/stack problems either.

razcore-rad · February 13, 2023, 4:00pm

I see well you went the smart route, I actually was thinking of this solution but I had no idea how to implement it in Purescript :)) thanks.

I’d still be interested if there’s a solution for cartesianProduct so it doesn’t give a heap error like maybe a lazy version or something because the brute force solution can still be used in other programming languages so I feel like something like this should work in Purescript as well. Like the solution I mentioned from JavaScript: Advent of Code 2015 — Explanation ⸱ Blog ⸱ Eugene Obrezkov

CarstenKoenig · February 13, 2023, 4:02pm

I don’t think you want cartesian product - you want to find all subsets/sublists and the algorithm for that is very similar to the one I’ve shown here (just remove all the volume stuff - the idea with take it or don’t is the same)

For cartesian product your code seems to be fine - you can probably tell node you want more heap also (see here)

rhendric · February 14, 2023, 8:53pm

This doesn’t directly help with your how-to-do-an-inefficient-thing-efficiently question, but it’s my pleasure to point out that cartesianProduct is one of the many names of sequence, should you find yourself in need of that operation in the future.

razcore-rad · February 16, 2023, 5:23am

Didn’t know what, that’s cool. Well it’s not that I want to solve it inefficiently, I guess I’m curious how one would use sequence with a lazy type class or something else so it doesn’t blow up the heap.

CarstenKoenig · February 16, 2023, 7:27am

That depends on what you actually want - if you want non-strict/lazy behaviour you might be better of using Haskell.

I guess you want to have a way to iterate through all in this case LazyList might do the job for you (did not try it though)

rhendric · February 17, 2023, 6:06am

In my experience, Data.List.Lazy often doesn’t quite suffice to allow for the same sort of lazy streaming tricks as in Haskell. While the data structure is indeed lazy, using it in idiomatic PureScript will involve pushing it through some non-lazy classes like Semigroup and Foldable. Careful handling is required to preserve the laziness in such cases.

The following is an approach you could take to get a truly lazy collection type that would suffice for the example you’ve given. It employs a few tricks that are worth knowing for more general use, in my opinion.

The first trick is that applying a continuation-passing-style (CPS) transformation to your data structures will effectively convert them from call-by-value to call-by-name. It gives you explicit control over evaluation order. Go from

newtype Wrapper a = Wrapper a

to

newtype LazyWrapper a = LazyWrapper (forall r. (a -> r) -> r)

and now you can evaluate the a-typed value only once a continuation has been provided to the wrapper.

Adding constraints to the result type r will effectively give capabilities to your wrapper type—when the context provides your wrapper with extra operations on the result type, the wrapper can do more interesting things than simply call the continuation once and return the result. In our case, we want our wrapper to be a sort of ‘free Foldable’, so we want to be able to call the continuation multiple times and combine the results. Thus:

newtype FreeFoldable a = FreeFoldable (forall r. Monoid r => (a -> r) -> r)

In order to use sequence with this FreeFoldable as the inner effect, it needs to be an Applicative functor. This is a straightforward exercise in type chasing:

instance Functor FreeFoldable where
  map f (FreeFoldable ff) = FreeFoldable (lcmap (lcmap f) ff)

instance Apply FreeFoldable where
  apply (FreeFoldable ff1) (FreeFoldable ff2) = FreeFoldable \k -> ff1 \f -> ff2 \a -> k (f a)

instance Applicative FreeFoldable where
  pure a = FreeFoldable \k -> k a

Your example uses range, and so we’ll need an Unfoldable1 instance for it as well.

instance Unfoldable1 FreeFoldable where
  unfoldr1 f b = FreeFoldable \k -> let go = f >>> \(Tuple a more) -> k a <> foldMap go more) in go b

Here we see the first real use of the Monoid constraint inside FreeFoldable. The FreeFoldable produced by unfoldr1 will call its provided continuation for every element yielded by f, and the results of these continuations are fused with <>, a.k.a. append.

(Exercise: Could this instance be written with only a Semigroup constraint inside of FreeFoldable in place of Monoid? How would that change what FreeFoldable represents conceptually?)

Here, however, we run into the sort of problem I was alluding to earlier. append is a strict operation, so this version of unfoldr1 will produce a FreeFoldable that immediately runs the continuation it receives on all of the values represented by the FreeFoldable. While it may not be immediately obvious that this results in running out of heap on your example, you can see that it does if you complete this example without the following modification.

We instead want append to be lazy in its second argument, so that the obligations of the first argument can be discharged before the second argument is computed. To accomplish this, I will use defer:

instance Unfoldable1 LazyFreeFoldable where
  unfoldr1 f b = LazyFreeFoldable \k -> let go = f >>> \(Tuple a more) -> k a <> defer \_ -> foldMap go more in go b

But note that defer requires the result value of the continuation to implement the Lazy class, so we will now switch from the FreeFoldable definition given previously to the following:

newtype LazyFreeFoldable a = LazyFreeFoldable (forall r. Lazy r => Monoid r => (a -> r) -> r)

(Everything else we’ve done to this point can be migrated from FreeFoldable to LazyFreeFoldable without any alterations.)

Again, adding a constraint to the result type grants a capability to the CPS wrapper, in this case the capability to suspend evaluating a continuation until it’s actually needed, not just once the continuation is provided.

At this point, we can write:

example :: LazyFreeFoldable (Array Int)
example = sequence $ replicate 8 $ range 0 7

Construction of this value is near-instantaneous, but of course, we can’t do much with it yet. Let’s say we want to be able to take its length, which means we’ll need to implement a Foldable instance. As the name implies, the function wrapped by LazyFreeFoldable matches very closely to the foldMap we’ll need:

foldMap' :: forall m a. Lazy m => Monoid m => (a -> m) -> LazyFreeFoldable a -> m
foldMap' f (LazyFreeFoldable lff) = lff f

This is almost suitable for the foldMap required by Foldable, except for that pesky Lazy constraint. We can be semi-clever about satisfying that, though:

instance Foldable LazyFreeFoldable where
  foldMap f (LazyFreeFoldable lff) = (lff \a _ -> f a) unit

Why is this only semi-clever? Well, we may be giving the inner function a continuation that returns a deferable value, but then we immediately force the deferred computation! This will cause problems for us later, but for now let’s continue on to foldr and foldl:

instance Foldable LazyFreeFoldable where
  foldMap f (LazyFreeFoldable lff) = (lff \a _ -> f a) unit
  foldl op z (LazyFreeFoldable lff) = unwrap (unwrap (lff (\a -> Dual $ Endo (_ `op` a)))) z
  foldr op z (LazyFreeFoldable lff) = unwrap (lff (\a -> Endo (a `op` _))) z

(When attempting to maximize laziness, we can get better results implementing members like this by hand instead of using foldlDefault and the like, which often don’t have the best run-time behavior.)

Again, this almost works. Endo and Dual are two very useful Monoids that can be dropped right into our continuation functions. The trouble is, they don’t have Lazy instances. (Maybe they should! This might make a nice introductory PR over at purescript-control.)

As PureScript doesn’t support orphan instances, we’re going to have to make a wrapper. Let’s roll up our sleeves and write some boilerplate:

newtype LazyWrapper a = LazyWrapper a
derive instance Newtype (LazyWrapper a) _
derive newtype instance Semigroup a => Semigroup (LazyWrapper a)
derive newtype instance Monoid a => Monoid (LazyWrapper a)

We want Newtype so that we can use unwrap with this wrapper the same as we do with Dual and Endo. Semigroup and Monoid are important for obvious reasons. But the point of this wrapper, of course, is:

instance (Newtype a b, Lazy b) => Lazy (LazyWrapper a) where
  defer = coerce (defer :: (Unit -> b) -> b)

The use of coerce here is a slick trick. The Newtype constraint both helps the compiler identify what a is wrapping (i.e., the meaning of b) and ensures that a and b have the same run-time representation—which means that a defer :: (Unit -> b) -> b can be put to use as a defer :: (Unit -> a) -> a at run time without any overhead.

Now we can write:

instance Foldable LazyFreeFoldable where
  foldMap f (LazyFreeFoldable lff) = (lff \a _ -> f a) unit
  foldl op z (LazyFreeFoldable lff) = coerce (lff \a -> LazyWrapper $ Dual $ LazyWrapper $ Endo (_ `op` a)) z
  foldr op z (LazyFreeFoldable lff) = coerce (lff \a -> LazyWrapper $ Endo (a `op` _)) z

(The unwraps were getting pretty gnarly, so I switched to a single coerce per member. coerce will unwrap any number of layers of newtypes. Be careful not to use it in places where the expected type isn’t obvious.)

And now length example will work without blowing up the heap!

One last thing, though. Having gotten to this point, you might expect to be able to run for_ example logShow or something to print every array in the example, one at a time. Alas, this explodes the heap again. So does foldMap logShow example, which uses the Semigroup instance to combine Effects instead of the Apply instance as for_ does. Why do these examples fail? It comes down to the fact that an Effect is represented by a thunk in JavaScript, and appending (or (*>)ing) two Effect Units effectively composes those thunks—which means a new thunk is constructed with references to the previous two. So even though we’ve constructed our LazyFreeFoldable using a pattern that defers the second argument of append until it’s needed, the effect of append on Effect Unit means that once it is needed, it just gets stuffed into an in-memory thunk tree that grows until the heap is exhausted.

The members of Foldable can’t help us here, but we can use the foldMap' function we defined earlier to get what we want. The almost-works version is this:

foldMap' logShow example

And it doesn’t work because Effect Unit isn’t Lazy. Once again, it could be! And once again, we’re stuck writing a wrapper type instead. We’ll need a little more this time:

newtype LazyEffect a = LazyEffect (Effect a)
derive instance Newtype (LazyEffect a) _
derive newtype instance Functor LazyEffect
derive newtype instance Apply LazyEffect
derive newtype instance Applicative LazyEffect
derive newtype instance Bind LazyEffect
derive newtype instance Semigroup a => Semigroup (LazyEffect a)
derive newtype instance Monoid a => Monoid (LazyEffect a)

instance Lazy (LazyEffect a) where
  defer = (u >>= _) where u = pure unit

At last, at long last, you can watch your computer count to 16.7 million in base 8:

unwrap $ foldMap' (logShow >>> LazyEffect) example

I hope this has demonstrated a few new tricks for working in PureScript. The big takeaway, though, is that in a call-by-value language, getting laziness to work smoothly throughout even a relatively simple computation can be a big production, with lots of places where things can subtly go wrong and you won’t notice until you get a performance issue. A big part of this issue in PureScript is that fundamental type classes like Semigroup and Foldable simply aren’t typed to support lazy operations in general. Designing an alternate prelude with class signatures that encourage lazy-first programming is left as an exercise for the very industrious reader.

For working with lazy computations in the ecosystem as it exists today, consider the following:

Do you really need lazy computations?
Can you program directly on the existing lazy types (Data.List.Lazy, Data.Lazy) without needing the power of the rest of the ecosystem?
Try CPS-ing your computation and/or data types.
Try writing wrapper types that add Control.Lazy support to types you use.
Try writing specifically lazy versions of library functions that are too general as-is.

Full working code for the running example is below:

module Main where

import Prelude

import Control.Lazy (class Lazy, defer)
import Data.Array (replicate)
import Data.Foldable (class Foldable, foldMap, length)
import Data.Monoid.Dual (Dual(..))
import Data.Monoid.Endo (Endo(..))
import Data.Newtype (class Newtype, unwrap)
import Data.Profunctor (lcmap)
import Data.Traversable (sequence)
import Data.Tuple (Tuple(..))
import Data.Unfoldable1 (class Unfoldable1, range)
import Effect (Effect)
import Effect.Console (logShow)
import Safe.Coerce (coerce)

newtype LazyWrapper a = LazyWrapper a
derive instance Newtype (LazyWrapper a) _
derive newtype instance Semigroup a => Semigroup (LazyWrapper a)
derive newtype instance Monoid a => Monoid (LazyWrapper a)

instance (Newtype a b, Lazy b) => Lazy (LazyWrapper a) where
  defer = coerce (defer :: (Unit -> b) -> b)


newtype LazyEffect a = LazyEffect (Effect a)
derive instance Newtype (LazyEffect a) _
derive newtype instance Functor LazyEffect
derive newtype instance Apply LazyEffect
derive newtype instance Applicative LazyEffect
derive newtype instance Bind LazyEffect
derive newtype instance Semigroup a => Semigroup (LazyEffect a)
derive newtype instance Monoid a => Monoid (LazyEffect a)

instance Lazy (LazyEffect a) where
  defer = (u >>= _) where u = pure unit


newtype LazyFreeFoldable a = LazyFreeFoldable (forall r. Lazy r => Monoid r => (a -> r) -> r)

instance Functor LazyFreeFoldable where
  map f (LazyFreeFoldable lff) = LazyFreeFoldable (lcmap (lcmap f) lff)

instance Apply LazyFreeFoldable where
  apply (LazyFreeFoldable lff1) (LazyFreeFoldable lff2) = LazyFreeFoldable \k -> lff1 \f -> lff2 \a -> k (f a)

instance Applicative LazyFreeFoldable where
  pure a = LazyFreeFoldable \k -> k a

instance Unfoldable1 LazyFreeFoldable where
  unfoldr1 f b = LazyFreeFoldable \k -> let go = f >>> \(Tuple a more) -> k a <> defer \_ -> foldMap go more in go b

foldMap' :: forall m a. Lazy m => Monoid m => (a -> m) -> LazyFreeFoldable a -> m
foldMap' f (LazyFreeFoldable lff) = lff f

instance Foldable LazyFreeFoldable where
  foldMap f (LazyFreeFoldable lff) = (lff \a _ -> f a) unit
  foldl op z (LazyFreeFoldable lff) = coerce (lff \a -> LazyWrapper $ Dual $ LazyWrapper $ Endo (_ `op` a)) z
  foldr op z (LazyFreeFoldable lff) = coerce (lff \a -> LazyWrapper $ Endo (a `op` _)) z

example :: LazyFreeFoldable (Array Int)
example = sequence $ replicate 8 $ range 0 7

main :: Effect Unit
main = do
  unwrap $ foldMap' (logShow >>> LazyEffect) example
  logShow (length example :: Int)

razcore-rad · February 17, 2023, 9:08am

I knew the implementation isn’t as straightforward, cause even if we write the Lazy classes and all we could still get non-lazy run due to missing something less obvious.

Thank you so much for the detailed explanation. I am somewhat capable of following along the explanation (I’m still working on my jargon) and it looks like a very nice complement to Functional Programming Made Easier. There’s a lot of advanced features here that go beyond what’s in the book.

Well, going back to the drawing board