Untagged union types

No, OCaml’s polymorphic variants are tagged. They differ from sum types in the same way that PureScript’s records differ from product types, and both are driven by row types.

I was thinking about the class based approach to solving this problem:

class Foo a

instance fooInt :: Foo Int
instance fooArray :: Foo (Array a)

foreign import fooInternal :: forall a. a -> Int

foo :: forall a. Foo a => a -> Int
foo = fooInternal

where you throw away any dictionaries passed via a helper definition. What if we could throw away dictionaries without having to introduce a new definition?

class Foo a

instance fooInt :: Foo Int
instance fooArray :: Foo (Array a)

foreign import foo :: forall a. Foo a ? a -> Int

I’m not sure what to call it (Predicate?) but the ? operator would act exactly like a constraint, but guarantee that the runtime representation is the same as the RHS. I picked ? and not something like an arrow as it’s not deferring execution like a function or constraint might. When used it requires that the dictionary is available at the use site, but doesn’t pass it in. I’m not sure exactly how it interacts when mixed with constraints - if a predicate is used, do we infer a normal constraint or a predicate at the top level?

EDIT: I suppose that requiring the dictionary to be available at the use site means that using a predicate would infer a normal constraint, but that only follows from this specific way of defining it. I’m sure there are other ways.

That’s a neat idea. I wonder if it would also be useful for, say, removing useless “Partial” dictionaries?

I have to disagree on this. If I’m calling a JS function that takes some record that looks like

{ a :: { b :: { c :: Int | String } } }

Then for me to add the tags to make this

{ a :: { b :: { c :: Either Int String } } }

and then all the conversions from one form to the other just feels like busy work. I don’t see how it adds any value to add the tags.
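To make the busy work concrete, here is a minimal JavaScript sketch of the two conversions for the record above. The `{ tag, value }` shape is a hypothetical stand-in for `Either Int String` (it is not how the compiler represents `Either`), and `tagC`, `untagC`, `mapC`, `fromJs` and `toJs` are names invented for illustration.

```javascript
// Hypothetical tagged form: { tag: "Left" | "Right", value }.
// It only stands in for Either Int String in this sketch.
function tagC(c) {
  if (Number.isInteger(c)) return { tag: "Left", value: c };
  if (typeof c === "string") return { tag: "Right", value: c };
  throw new TypeError("expected an integer or a string");
}

// Dropping the tag recovers the untagged value.
function untagC(tagged) {
  return tagged.value;
}

// Rebuild the nested record { a: { b: { c } } } with f applied to c.
function mapC(f, rec) {
  return { a: { b: { c: f(rec.a.b.c) } } };
}

// Untagged (what the JS function expects) -> tagged.
function fromJs(rec) {
  return mapC(tagC, rec);
}

// Tagged -> untagged, needed before every call into the JS function.
function toJs(rec) {
  return mapC(untagC, rec);
}
```

Every additional union-typed field would need its own pair of conversions like these, which is the overhead being objected to.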

Now this is blending a bit with the “What PureScript Needs” topic, but while I’m arguing for why untagged unions are nice for interoping with JS code, I’m not arguing that there be any language changes (though I think both @Adrielus’s and @jy14898’s ideas are really neat!). I think this is a mostly solved problem already, by the untagged-union package and other options. If anything, I’m just arguing that it should be easier to discover the existing library-based solutions, because I believe it’s common for newcomers to wonder about untagged unions and optional fields (which is a special case of untagged union) as soon as they start doing FFI, but might not know what it’s called or how to search for it.

It feels incomplete without a way to destructure the datatype in PureScript. I think being able to use row information to destructure is a pretty good start (as done in the untagged-union package).

Yes! It might seem like a minor thing but it would be very nice for the compiler to optimise away empty dictionaries and have zero overhead.

I gave untagged-union a try. I’d say that the untagged unions problem is mostly solved. There are still a few rough edges, but nothing very important.

One big thing that’s missing is serialisation of ADTs in the form of plain tagged records instead of the currently implemented class based approach. But I understand that it’s something the core team is looking at already.

Could you expound on this idea? I feel like the class-based approach is actually kind of nice, because with a bit of JS FFI, you can use instanceof to do runtime type reflection. (I’m like 95% sure this is highly discouraged since it’s not guaranteed that ADTs will always translate to JS in the same fashion). How would serialization of ADTs as plain tagged records help the untagged union situation?
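For reference, the `instanceof` trick could look roughly like the sketch below. The constructor functions are hand-written stand-ins that assume the compiler emits one JS constructor per data constructor with fields named `value0`, `value1`, and so on. That output is an implementation detail and not guaranteed. `Shape`, `Circle`, `Square` and `describeShape` are hypothetical names.

```javascript
// Hand-written stand-ins for the assumed compiled form of
//   data Shape = Circle Number | Square Number
// The real compiler output may differ between versions.
function Circle(value0) {
  this.value0 = value0;
}
function Square(value0) {
  this.value0 = value0;
}

// Hypothetical FFI helper that reflects on the constructor at runtime.
function describeShape(shape) {
  if (shape instanceof Circle) return "circle with radius " + shape.value0;
  if (shape instanceof Square) return "square with side " + shape.value0;
  throw new Error("not a Shape");
}
```

If ADTs were instead serialised as plain tagged records, a check like this would have to inspect a tag field rather than the prototype chain.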

Here’s what a type-class based approach would look like:

module Main where

import Prelude

import Effect (Effect)
import Effect.Console (log)

class IntStringArray a
instance intStringArrayInt :: IntStringArray Int
instance intStringArrayString :: IntStringArray String
instance intStringArrayArray :: IntStringArray (Array a)

ps_to_ffi_boundary :: forall a. IntStringArray a => a -> String
ps_to_ffi_boundary = ffi_code

foreign import ffi_code :: forall a. a -> String

main :: Effect Unit
main = do
  log $ ps_to_ffi_boundary 4
  log $ ps_to_ffi_boundary "foo"
  log $ ps_to_ffi_boundary [1, 2, 3]

with FFI code as

"use strict";

exports.ffi_code = function (intStringArray) {
  if (Number.isInteger(intStringArray)) {
    return "Got an integer";
  } else if (typeof intStringArray === 'string' || intStringArray instanceof String) {
    return "Got a string";
  } else if (Array.isArray(intStringArray)) {
    return "Got an array of values";
  } else {
    throw new Error("This should never happen unless I checked the argument's type incorrectly");
  }
};

The outputted JavaScript code would be this:

"use strict";
var $foreign = require("./foreign.js");
var Effect_Console = require("../Effect.Console/index.js");
var IntStringArray = {};
var ps_to_ffi_boundary = function (dictIntStringArray) {
    return $foreign.ffi_code;
};
var intStringArrayString = IntStringArray;
var intStringArrayInt = IntStringArray;
var intStringArrayArray = IntStringArray;
var main = function __do() {
    Effect_Console.log(ps_to_ffi_boundary()(4))();
    Effect_Console.log(ps_to_ffi_boundary()("foo"))();
    return Effect_Console.log(ps_to_ffi_boundary()([ 1, 2, 3 ]))();
};
module.exports = {
    IntStringArray: IntStringArray,
    ps_to_ffi_boundary: ps_to_ffi_boundary,
    main: main,
    intStringArrayInt: intStringArrayInt,
    intStringArrayString: intStringArrayString,
    intStringArrayArray: intStringArrayArray,
    ffi_code: $foreign.ffi_code
};

So a special compiler rule that said, “Anything that extends the Erased type class will have its dictionary removed from compiled output,” would make this idea feasible. I’m not sure whether that can be done or how hard it would be.

If it could be done and was, perhaps this would be the output:

"use strict";
var $foreign = require("./foreign.js");
var Effect_Console = require("../Effect.Console/index.js");
var main = function __do() {
    Effect_Console.log($foreign.ffi_code(4))();
    Effect_Console.log($foreign.ffi_code("foo"))();
    return Effect_Console.log($foreign.ffi_code([ 1, 2, 3 ]))();
};
module.exports = {
    main: main,
    ffi_code: $foreign.ffi_code
};

Incidentally there is already a dictionary erasure optimization in play here, eg:

ps_to_ffi_boundary()("foo")

Note that we aren’t supplying a value for the dictionary argument: the compiler knows that the dictionary is empty and skips passing anything. Of course you still have to pay the cost of the thunk: that’s maybe something that an uncurrying optimization or an optimization which pulled out and saved the result of ps_to_ffi_boundary() might be able to solve, and I think that’s closer to what you have in mind. I think we already have an open PR for the latter, as it happens.
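A rough sketch of the difference between the current call shape and the “pull out and save the result” idea, written by hand rather than taken from compiler output. The local `ffi_code` is a stand-in for the foreign module, and `ps_to_ffi_boundary1` is a made-up name for the saved result.

```javascript
// Stand-in for the foreign module's ffi_code from the example above.
function ffi_code(value) {
  if (Number.isInteger(value)) return "Got an integer";
  if (typeof value === "string") return "Got a string";
  if (Array.isArray(value)) return "Got an array of values";
  throw new Error("unexpected argument");
}

// Same shape as the compiled output: the dictionary parameter is ignored.
var ps_to_ffi_boundary = function (dictIntStringArray) {
  return ffi_code;
};

// Current call shape: an extra function call at every use site.
var perCall = ps_to_ffi_boundary()("foo");

// Hypothetical optimized shape: apply the empty dictionary once and reuse it.
var ps_to_ffi_boundary1 = ps_to_ffi_boundary();
var hoisted = ps_to_ffi_boundary1("foo");
```

Both forms return the same result. The second only avoids repeating the outer call.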

I remember a PR being merged that erased dictionaries if the type class was “necessarily empty.” However, I wasn’t sure how that played out in compiled JS.

Regardless, it sounds like this approach works well now and will be better in the future after that PR you mentioned is merged. I’m now wondering how well this approach works with Record arguments.

Untagged unions would be nice for TypeScript interop. A few of us wanted to use React Material UI, which uses untagged unions heavily. We code-gen’d a library over the top of react-basic that works well, but the type signatures are insane:

checkbox ::
  forall given optionalGiven optionalMissing props required.
  Nub' (CheckboxReqPropsRow (MUI.Core.IconButton.IconButtonReqPropsRow (MUI.Core.ButtonBase.ButtonBaseReqPropsRow ()))) required =>
  Prim.Row.Union required optionalGiven given =>
  Nub' (CheckboxPropsRow (MUI.Core.IconButton.IconButtonPropsRow (MUI.Core.ButtonBase.ButtonBasePropsRow React.Basic.DOM.Props_button))) props =>
  Prim.Row.Union given optionalMissing props =>
  { | given } -> JSX
checkbox ps = element _Checkbox ps

I don’t think a human could really write that out and maintain their sanity on a large scale, and I don’t know how a noob would use our library with confidence.

It seems the argument against untagged unions is that they wouldn’t improve the language, and I agree, but they would be HUGE for interop.

I’m not sure if this is an appropriate place to discuss these implementation details, but they are closely related to the topic. They show an attempt we made to handle untagged unions in a large, real-life setting: react-basic Material UI bindings (the code snippet posted above by @dtwhitney presents the current solution in a nutshell). They also show my failure to do this in a sane manner :wink:
Please be aware that this does not mean the approach itself is a failure, because @jvliwanag was able to write fully usable bindings for other large JS libraries using untagged-union.
Here are the details of our attempt:

  • I’ve prototyped optional field handling using untagged-union.
  • When I found some performance problems, I decided to simplify the strategy and used an even simpler version of the above lib (caution: self plug), undefined-is-not-a-problem, which handles only a | undefined unions.
  • Signatures were very nice with both strategies.
  • In the case of react-basic, rows are really large, and even trivial processing of such a row takes a significant amount of time (for details, please see the topic “`RowList` iteration seems to be relatively slow”).
  • Because of this slowdown the library became unusable: the processing time for every property row added up, so recompiling even a trivial piece of JSX written this way took a long time.
  • I have switched back to a Union / Nub based approach to represent required fields (we have this info from the TypeScript types, so we wanted to just use it).
  • This solution is not composable at all and is full of “buggy approximations”. When you inherit and override properties from parent components (which is quite common in MUI), it is really difficult to represent this using Union / Nub.
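The a | undefined case from the list above is the shape that optional props take on the JavaScript side. A small sketch, using a made-up `describeButton` function with made-up prop names and defaults (this is not MUI’s API):

```javascript
// Hypothetical JS function: `label` is required, while `disabled` and `size`
// are optional, i.e. each has the untagged type `a | undefined`.
function describeButton(props) {
  var disabled = props.disabled === undefined ? false : props.disabled;
  var size = props.size === undefined ? "medium" : props.size;
  return props.label + " (" + size + (disabled ? ", disabled" : "") + ")";
}
```

A caller may leave the optional fields out entirely, which is what the bindings have to express at the type level on the PureScript side.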

P.S. @dtwhitney I was about to propose dropping this required-field handling altogether so we can simplify our signatures. I have an emotional attachment to these pieces because I’ve invested a lot of time improving @srghma’s PR and experimenting with it… but I think the weight-to-power ratio is clearly too high.

Interesting background information on how ReScript/BuckleScript/ReasonML supports a form of untagged union types using GADTs and an unboxed attribute. Conclusion (emphasis mine):

[T]hanks to unboxed attributes and the module language, we introduce a systematic way to convert values from union types (untagged union types) to algebraic data types (tagged union types). This sort of conversion relies on user level knowledge and has to be reviewed carefully.

Hey, I really like using our library and I’ve built a pretty large application with it. I wasn’t trying to send out any negative vibes - just mildly advocating for untagged unions :slight_smile:
