gregberns [10:39 PM]
hey I’m doing a PS POC to prove out performance compared to C#/raw JS, and having some pretty significant performance issues with Map, specifically when trying to insert 1 million times into a Map. I’m assuming the creation of new arrays each time is creating the issue. Are there any optimized libs that are faster? ST performance is OK, and I might be able to do some FFI work but not excited about it.
Any thoughts on other structures…. really anything to get performance times up??
John Mendonça [10:42 PM]
I know there is https://github.com/purescript/purescript-foreign-object but i dont know about its performance, etc
gregberns [10:43 PM]
I was looking at that and I’ll go that direction if I need to, but it seems like some of my type support disappears.
monoidmusician [10:46 PM]
it has the same amount of type safety, it’s just a different implementation of Map String
, essentially
it has a nice mutable API through ST, and obviously quick lookups through the native JS operation, but it scales worse at other operations (immutable insertion is O(n)
I believe, and there’s no sharing done – but I suspect you don’t care about that)
natefaubion [10:50 PM]
Are you just doing a benchmark, or do you really need a million items in a Map?
gregberns [10:51 PM]
hmmm… I can transform my record to Array (Tuple String Record)
which gives me my keys and records. Then convert that to the Object. I’m kinda hesitant (probably cause I havent worked with it) just because it might look challenging to others coming in
@natefaubion right now its a POC doing benchmarking. but the organization is totally new to PS and so were a little hesitant to need to do anything too drastic or difficult to understand to get performance gains. After 2 days of perf work, were at 2 minutes compared to 7-10 seconds with C# and it looks like its all about insertion into Map
natefaubion [10:54 PM]
I don’t think you can compare JS objects with C# let alone PS
They are in totally different leagues
If your usage is predicated on insertion of millions of items into a Map, you are using the wrong platform
gregberns [10:55 PM]
haha well… yea I get that, but… with raw JS 1M insertions into an object vs a C# dict is about the same
natefaubion [10:56 PM]
I would use ST then
which is going to be roughly the same as the JS version that mutates an object
But make sure you use unsafeFreeze
gregberns [10:56 PM]
We used ST and got some gains, but its definitely still not there
natefaubion [10:56 PM]
Then use JS. That’s why the FFI exists
You should not expect PS to be as fast as raw JS
Map is an immutable tree data structure, so GC churn for a million items is going to be insane
gregberns [10:57 PM]
please dont get me wrong, I do get that, just struggling to figure out a better solution.
haha yea!
natefaubion [10:58 PM]
How are you using ST?
vs the JS solution?
I think if you are ever in the situation where you need to insert a million items as fast as possible, you should use JS. I would be very surprised if that is the case, and in that case I would question why you want to do it in PS with a JS backend
gregberns [11:00 PM]
we’ve been looking at using the FFI and I think I have something decent for array, but once the insertions are complete I’d really like to return a Map instead of our own object
natefaubion [11:00 PM]
There is no way you can build a Map with a million items in reasonable time
gregberns [11:01 PM]
We are totally open to Haskell, but… record syntax.
natefaubion [11:01 PM]
For a million items there is probably a better data structure than PS Map
You could try the HashMap implementation
gregberns [11:02 PM]
For the POC there are two parts: first build an initial Map, which is just insertions based on Key. The second part is to do a lookup, modify the object, and insert it back in.
natefaubion [11:02 PM]
But anything immutable is going to be rough
for initial construction
If you want fast initial construction you absolutely need a data structure that supports an ST API
gregberns [11:04 PM]
I didnt know about HashMap. Any idea if its lookup perf will be slower?
natefaubion [11:04 PM]
Map does not support that. Building a Map from an Array of Tuples is no different than just calling a bunch of inserts
gregberns [11:04 PM]
(I will definitely try it)
natefaubion [11:05 PM]
gregberns [11:05 PM]
Building a Map from an Array of Tuples is no different than just calling a bunch of inserts
Yea that’s what I realized after looking at the source code
natefaubion [11:06 PM]
However if they are already ordered, I would just use binary search on the array
I think someone wrote an API for that
https://pursuit.purescript.org/packages/purescript-sorted-arrays/0.2.0
gregberns [11:07 PM]
Thats an interesting idea as well
Here’s the ST test we wrote (note: I knew nothing about ST 2 days ago)
test :: Effect Unit
test = do
let
inner :: forall r. Array Int -> ST r (Object Int)
inner arr = do
map' <- new :: ST r (STObject r Int)
_ <- foreach arr (\i -> poke (show i) (i * 10) map' *> pure unit)
freezeST map'
let arr = range 1 1000000
log "start"
t0 <- now
let a = run (inner arr)
t1 <- now
log $ "End " <> show (t1 - t0)
log ""
natefaubion [11:10 PM]
You should use unsafeFreeze
freezeST
will copy the object
which will just build it all over again
gregberns [11:10 PM]
I was wondering about that…
natefaubion [11:11 PM]
you should also use do syntax instead of *> if you are really serious about eking out performance
natefaubion [11:11 PM]
since in 0.12.1 it’s optimized to statements
gregberns [11:11 PM]
… what do I need to change to do that??
natefaubion [11:11 PM]
foreach
foreach arr \i -> do
_ <- poke ...
pure unit
magic-do does not optimize applicative operators
it probably should
gregberns [11:13 PM]
ah alright I’ll give that a try too
natefaubion [11:14 PM]
I would expect that to still be a good bit slower though, because it will still allocate thunks for the effects
Unfortunately there’s no EffectFn
analog for ST
gregberns [11:16 PM]
I think that test took something like 7 seconds vs another similar example that was at 14 sec (w/o ST), so we did cut it in half. I’ll re-run the test with those two changes.
Unfortunately there’s no EffectFn
analog for ST
yea that’s what we were wondering, if there could be a way to do all the mutations in an Effect.
Oh here was another question I had. If we do just use ST, is there any way to get away from using STObject/ get away from returning a generic Object and having the rest of the system work on that?
natefaubion [11:22 PM]
ST is just an FFI type, you can do anything in ST if you write FFI to do it
gregberns [11:22 PM]
ah, ok that makes sense
natefaubion [11:23 PM]
That HashMap implementation supposedly has an efficient implementation of fromArray
that uses a mutable API
I can’t speak for the library though
its backed by a lot of JS so hard to verify
gregberns [11:29 PM]
gottcha, alright I’ll take a look at that instead. I cant think of any reason to dislike HashMap at this point.
Thanks for all your help. We’ve been working the last 2 weeks in PS and its the deepest dive into a Haskell-like lang and I am loving it and am about willing to do anything to stick with it. Love the “it compiles, it works”… coming from JS and C#, my life is just better
natefaubion [11:31 PM]
I would encourage you to stick with it, but be aware that is a relatively naive language, runtime-wise (at least the primary JS backend)
It gives you a few tools, but it is not really optimized at all
gregberns [11:36 PM]
Were doing all we can to prove it out. The CIO is concerned with the performance of Node vs C#, and I think we can decently prove the Node/C# is generally neglagible. We just need to mitigate any serious concerns about PS’s performance. I dont think we need to be equal, but I think were 20 times slower now. We’re just trying to get that down as much as possible… and not use too many ugly tricks (ST).
(ugly tricks: hard for noobs to understand)
gregberns [12:12 AM]
@natefaubion the ST changes you suggested halved the time.
– 727.0 w/ unsafeFreeze and do
– 984.0 w/ freeze and do
– 1574.0 w/ freeze and NOT do
– 1298.0 w/ unsafeFreeze and NOT do
The HashMap impl is worse, but the implementation ‘looks’ very clean and readable
– 2936.0 w/ HashMap
gregberns [12:26 AM]
HashMap just 3x it with our process (1m entries)
Insertion with Map: 10138.0
Insertions with HashMap: 2857.0