Untagged union types

You might be right about that. Honestly, the performance concerns haven’t been noticeable to me. I doubt the unboxing PureScript has to do - even in an example like that - is going to add any noticeable overhead.

My main reason to use untagged unions is the ergonomics of it. To echo @ajnsit’s sentiment, FFI can be “tedious” if you’re working with a library with a lot of untagged unions, and can make it significantly more difficult to just try out some npm library that doesn’t have PureScript bindings than it is in a language like TypeScript or even BuckleScript. You could spin up a new JS project to try out some library, but sometimes you want to try it out in your own project without spending a whole lot of time writing a bunch of type coercions first.

I have used TypeScript/JS a lot, and I would say that such flexible and “unobtrusive” (or obtrusive?) JS APIs bring more evil than good. It all stems from the dynamic nature of JavaScript. I like that Haskell/PureScript actually forces you to think and make the API static and more straightforward.

As for the wrapping issue, it should be considered a necessary evil. An API should be converted to idiomatic PS style (imagine how this API would look in a PS-based library) rather than reflect JS style in PS. Even if that takes more time, I’m sure it would be beneficial in the long run.

Untagged unions often just offer different forms of representing the same value; use the fullest and most uniform form that you need.

I think you are conflating flexibility with not being able to reason about code. The former is desirable, the latter is not.

Why? FFI interop is a real world feature that is important to real world projects. I haven’t seen any reason for us to compromise on this yet.

you are conflating flexibility with not being able to reason about code

Not only I: library authors do this too often when creating “flexible APIs”. Out of a desire to simplify things for the end user, they make their own API/library harder to reason about and more error-prone.

Why? FFI interop is a real world feature that is important to real world projects.

I think it is obvious: any FFI interop brings unsafety and related issues. It is necessary, but one should not just try to transparently translate a JS API to PureScript; such APIs are mostly designed without FP style in mind.

I propose an in-between solution:

  • Easy to implement (just syntactic sugar)
  • Should ease writing FFI code
  • Would not weaken the type system

How would it look?

exports.foo = (input) => {
  if (Array.isArray(input))
    return input.length

  return Math.sqrt(input)
}

-- Note: each branch of the union can have one and only one argument
union SingleOrArray a
  = Single a
  | Multi (Array a)

foreign import foo :: SingleOrArray Int -> Int

translates to:

foreign import data SingleOrArray :: Type -> Type

Single :: forall a. a -> SingleOrArray a
Single = unsafeCoerce

Multi :: forall a. (Array a) -> SingleOrArray a
Multi = unsafeCoerce

foreign import foo :: SingleOrArray Int -> Int

This would not implement pattern matching on unions, but should still greatly improve the FFI experience.

What do y’all think about this?
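
At runtime, the desugaring above would amount to identity functions. Here is a minimal JavaScript sketch of that, assuming `unsafeCoerce` behaves as an identity function; `Single` and `Multi` mirror the hypothetical `union` declaration and are not part of any existing library:

```javascript
// Hypothetical runtime view of the proposed `union` sugar.
// Assumption: both constructors are plain identity functions,
// so no wrapper object is ever allocated.
const Single = (x) => x;
const Multi = (xs) => xs;

// The FFI function from the proposal, unchanged.
const foo = (input) => {
  if (Array.isArray(input)) return input.length;
  return Math.sqrt(input);
};

console.log(foo(Single(16)));       // 4
console.log(foo(Multi([1, 2, 3]))); // 3
```

Since the constructors leave the value untouched, the JS side sees exactly the number or array it expects.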

No, OCaml’s polymorphic variants are tagged. Their difference from sum types is the same as the difference between PureScript’s records and product types; both are driven by row types.

I was thinking about the class based approach to solving this problem:

class Foo a

instance fooInt :: Foo Int
instance fooArray :: Foo (Array a)

foreign import fooInternal :: forall a. a -> Int

foo :: forall a. Foo a => a -> Int
foo = fooInternal

where you throw away any dictionaries passed via a helper definition. What if we could throw away dictionaries without having to introduce a new definition?

class Foo a

instance fooInt :: Foo Int
instance fooArray :: Foo (Array a)

foreign import foo :: forall a. Foo a ? a -> Int

I’m not sure what to call it (Predicate?) but the ? operator would act exactly like a constraint, but guarantee that the runtime representation is the same as the RHS. I picked ? and not something like an arrow as it’s not deferring execution like a function or constraint might. When used it requires that the dictionary is available at the use site, but doesn’t pass it in. I’m not sure exactly how it interacts when mixed with constraints - if a predicate is used, do we infer a normal constraint or a predicate at the top level?

EDIT: I suppose that requiring the dictionary to be available at the use site would mean that using a predicate infers a normal constraint, but that only follows from this specific way of defining it; I’m sure there are other ways.

That’s a neat idea, I wonder if that would also be useful for say removing useless “Partial” dictionaries?

I have to disagree on this. If I’m calling a JS function that takes some record that looks like

{ a :: { b :: { c :: Int | String } } }

Then for me to add the tags to make this

{ a :: { b :: { c :: Either Int String } } }

and then all the conversions from one form to the other just feels like busy work. I don’t see how it adds any value to add the tags.
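
To make the “busy work” concrete, here is a rough JavaScript sketch of the conversion layer that the tagged version requires. The `{ tag, value }` encoding and the helper names `left`, `right`, `untag`, and `toForeign` are illustrative assumptions, not the compiler’s actual representation of `Either`:

```javascript
// Hypothetical tagged representation of `Either Int String`:
// { tag: "Left", value: 1 } or { tag: "Right", value: "x" }.
// (Illustrative encoding only, not the compiler's actual output.)
const left = (value) => ({ tag: "Left", value });
const right = (value) => ({ tag: "Right", value });

// Strip the tag so the JS library sees a plain number or string.
const untag = (either) => either.value;

// The conversion has to rebuild every level of the record by hand.
const toForeign = (rec) => ({
  a: { b: { c: untag(rec.a.b.c) } },
});

const tagged = { a: { b: { c: right("hello") } } };
console.log(JSON.stringify(toForeign(tagged))); // {"a":{"b":{"c":"hello"}}}
```

Every additional union-typed field adds another line to `toForeign`, which is the overhead being objected to here.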

Now this is blending a bit with the “What PureScript Needs” topic, but while I’m arguing for why untagged unions are nice for interoping with JS code, I’m not arguing that there be any language changes (though I think both @Adrielus’s and @jy14898’s ideas are really neat!). I think this is a mostly solved problem already, by the untagged-union package and other options. If anything, I’m just arguing that it should be easier to discover the existing library-based solutions, because I believe it’s common for newcomers to wonder about untagged unions and optional fields (which is a special case of untagged union) as soon as they start doing FFI, but might not know what it’s called or how to search for it.

It feels incomplete without a way to destructure the datatype in PureScript. I think being able to use row information to destructure is a pretty good start (as done in the untagged-union package).

Yes! It might seem like a minor thing but it would be very nice for the compiler to optimise away empty dictionaries and have zero overhead.

I gave untagged-union a try. I’d say that the untagged unions problem is mostly solved. There are still a few rough edges, but nothing very important.

One big thing that’s missing is serialisation of ADTs in the form of plain tagged records instead of the currently implemented class based approach. But I understand that it’s something the core team is looking at already.

Could you expound on this idea? I feel like the class-based approach is actually kind of nice, because with a bit of JS FFI, you can use instanceof to do runtime type reflection. (I’m like 95% sure this is highly discouraged since it’s not guaranteed that ADTs will always translate to JS in the same fashion). How would serialization of ADTs as plain tagged records help the untagged union situation?
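
For readers unfamiliar with the `instanceof` trick, here is a small JavaScript sketch. It assumes that each data constructor of an ADT is emitted as its own JS constructor function; `Circle`, `Square`, and `describe` are hypothetical names, and as noted above this representation is not guaranteed to stay stable:

```javascript
// Assumption: a PureScript ADT such as
//   data Shape = Circle Number | Square Number
// is emitted as one JS constructor function per data constructor.
// This layout may change between compiler versions.
function Circle(value0) { this.value0 = value0; }
function Square(value0) { this.value0 = value0; }

// FFI-side reflection: pick a branch by checking the constructor.
const describe = (shape) => {
  if (shape instanceof Circle) return "circle with radius " + shape.value0;
  if (shape instanceof Square) return "square with side " + shape.value0;
  throw new Error("unknown constructor");
};

console.log(describe(new Circle(2))); // circle with radius 2
console.log(describe(new Square(3))); // square with side 3
```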

Here’s what a type-class based approach would look like:

module Main where

import Prelude

import Effect (Effect)
import Effect.Console (log)

class IntStringArray a
instance intStringArrayInt :: IntStringArray Int
instance intStringArrayString :: IntStringArray String
instance intStringArrayArray :: IntStringArray (Array a)

ps_to_ffi_boundary :: forall a. IntStringArray a => a -> String
ps_to_ffi_boundary = ffi_code

foreign import ffi_code :: forall a. a -> String

main :: Effect Unit
main = do
  log $ ps_to_ffi_boundary 4
  log $ ps_to_ffi_boundary "foo"
  log $ ps_to_ffi_boundary [1, 2, 3]

with FFI code as

"use strict";

exports.ffi_code = function (intStringArray) {
  if (Number.isInteger(intStringArray)) {
    return "Got an integer";
  } else if (typeof intStringArray === 'string' || intStringArray instanceof String) {
    return "Got a string";
  } else if (Array.isArray(intStringArray)) {
    return "Got an array of values";
  } else {
    throw new Error("This should never happen unless I checked the argument's type incorrectly");
  }
};

The outputted JavaScript code would be this:

"use strict";
var $foreign = require("./foreign.js");
var Effect_Console = require("../Effect.Console/index.js");
var IntStringArray = {};
var ps_to_ffi_boundary = function (dictIntStringArray) {
    return $foreign.ffi_code;
};
var intStringArrayString = IntStringArray;
var intStringArrayInt = IntStringArray;
var intStringArrayArray = IntStringArray;
var main = function __do() {
    Effect_Console.log(ps_to_ffi_boundary()(4))();
    Effect_Console.log(ps_to_ffi_boundary()("foo"))();
    return Effect_Console.log(ps_to_ffi_boundary()([ 1, 2, 3 ]))();
};
module.exports = {
    IntStringArray: IntStringArray,
    ps_to_ffi_boundary: ps_to_ffi_boundary,
    main: main,
    intStringArrayInt: intStringArrayInt,
    intStringArrayString: intStringArrayString,
    intStringArrayArray: intStringArrayArray,
    ffi_code: $foreign.ffi_code
};

So, a special compiler rule that said, “Anything that extends the Erased type class will have its dictionary removed from compiled output,” would make this idea feasible. I’m not sure whether that can be done or how hard it would be to do.

If it could be done and was, perhaps this would be the output:

"use strict";
var $foreign = require("./foreign.js");
var Effect_Console = require("../Effect.Console/index.js");
var main = function __do() {
    Effect_Console.log($foreign.ffi_code(4))();
    Effect_Console.log($foreign.ffi_code("foo"))();
    return Effect_Console.log($foreign.ffi_code([ 1, 2, 3 ]))();
};
module.exports = {
    main: main,
    ffi_code: $foreign.ffi_code
};

Incidentally, there is already a dictionary erasure optimization in play here, e.g.:

ps_to_ffi_boundary()("foo")

Note that we aren’t supplying a value for the dictionary argument: the compiler knows that the dictionary is empty and skips passing anything. Of course you still have to pay the cost of the thunk: that’s maybe something that an uncurrying optimization or an optimization which pulled out and saved the result of ps_to_ffi_boundary() might be able to solve, and I think that’s closer to what you have in mind. I think we already have an open PR for the latter, as it happens.
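
The difference between the current call sites and the “pull out and save the result” idea can be sketched in plain JavaScript. This is only an illustration of the shape of the output, not what any compiler version is known to emit; `ffi_code` here is a stand-in for the real foreign function and `boundary` is a hypothetical name:

```javascript
// Stand-in for the foreign function; the real one lives in foreign.js.
const ffi_code = (x) =>
  Array.isArray(x) ? "Got an array of values" : "Got something else";

// Shape of the current output: a function that ignores its dictionary
// argument and returns the foreign function.
const ps_to_ffi_boundary = function (dictIntStringArray) {
  return ffi_code;
};

// Current call sites re-run the thunk on every use.
const perCall = ps_to_ffi_boundary()([1, 2, 3]);

// Hypothetical optimized output: evaluate the thunk once and reuse it.
const boundary = ps_to_ffi_boundary();
const hoisted = boundary([1, 2, 3]);

console.log(perCall === hoisted); // true
```

Both forms call the same foreign function; the hoisted version only saves the extra function call per use.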

I remember a PR being merged that erased dictionaries if the type class was “necessarily empty.” However, I wasn’t sure how that played out in compiled JS.

Regardless, it sounds like this approach works well now and will be better in the future after that PR you mentioned is merged. I’m now wondering how well this approach works with Record arguments.

Untagged unions would be nice for TypeScript interop. A few of us wanted to use React Material UI, which uses untagged unions heavily. We code-gen’d a library over the top of react-basic that works well, but the type signatures are insane:

checkbox ::
  forall given optionalGiven optionalMissing props required.
  Nub' (CheckboxReqPropsRow (MUI.Core.IconButton.IconButtonReqPropsRow (MUI.Core.ButtonBase.ButtonBaseReqPropsRow ()))) required =>
  Prim.Row.Union required optionalGiven given =>
  Nub' (CheckboxPropsRow (MUI.Core.IconButton.IconButtonPropsRow (MUI.Core.ButtonBase.ButtonBasePropsRow React.Basic.DOM.Props_button))) props =>
  Prim.Row.Union given optionalMissing props =>
  { | given } -> JSX
checkbox ps = element _Checkbox ps

I don’t think a human could really write that out and maintain their sanity on a large scale, and I don’t know how a noob would use our library with confidence.

It seems the argument against untagged unions is that they wouldn’t improve the language, and I agree, but they would be HUGE for interop.

I’m not sure this is the right place to discuss these implementation details, but they are related to the topic. They describe our attempt to handle untagged unions in a large, real-life setting: the react-basic Material UI bindings (the code snippet posted above by @dtwhitney shows the current solution in a nutshell). They also show my failure to do this in a sane manner ;)
Please be aware that this does not mean the approach itself fails: @jvliwanag was able to write fully usable bindings for large JS libraries using untagged-union.
Here are the details of our attempt:

  • I prototyped optional field handling using untagged-union.
  • When I ran into performance problems, I simplified the strategy and switched to an even simpler version of that library (caution: self plug), undefined-is-not-a-problem, which handles only a | undefined unions.
  • Signatures were very nice with both strategies.
  • In the case of react-basic, rows are really large, and even trivial processing of such a row takes a significant amount of time (for details, see: `RowList` iteration seems to be relatively slow).
  • Because of this slowdown the library became unusable: the processing time for every property row added up, so recompiling even a trivial piece of JSX written this way took a long time.
  • I switched back to the Union / Nub based approach to represent required fields (we have this information from the TypeScript types, so we wanted to just use it).
  • This solution is not composable at all and is full of “buggy approximations”. When you inherit and override properties from parent components (which is quite common in MUI), it is really difficult to represent this using Union / Nub.
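
For context on the a | undefined case mentioned in the list above, here is a small JavaScript sketch of what an optional field looks like on the JS side. `renderButton` and its props are hypothetical and unrelated to the actual MUI or undefined-is-not-a-problem APIs:

```javascript
// Hypothetical component-style function whose props may omit fields.
// A missing field is simply `undefined` at runtime, so an `a | undefined`
// union needs no wrapper: the JS side fills in defaults.
const renderButton = (props) => {
  const label = props.label === undefined ? "OK" : props.label;
  const disabled = props.disabled === undefined ? false : props.disabled;
  return label + (disabled ? " (disabled)" : "");
};

console.log(renderButton({}));                                // OK
console.log(renderButton({ label: "Save", disabled: true })); // Save (disabled)
```

This is why optional fields can be treated as a special case of untagged unions: the absent and present cases share one runtime representation.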

P.S. @dtwhitney I was about to propose dropping this required field handling altogether so we can simplify our signatures. I have an emotional attachment to these pieces because I’ve invested a lot of time improving @srghma’s PR and experimenting with it… but I think the weight-to-power ratio is clearly too large.

Interesting background information on how ReScript/BuckleScript/ReasonML supports a form of untagged union types using GADTs and an unboxed attribute. Conclusion (emphasis by me):

[T]hanks to unboxed attributes and the module language, we introduce a systematic way to convert values from union types (untagged union types) to algebraic data types (tagged union types). This sort of conversion relies on user level knowledge and has to be reviewed carefully.
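
The kind of conversion the quote describes, from an untagged value to a tagged one, can be sketched in JavaScript as follows. This is not ReScript’s mechanism, only an illustration of why the step relies on user-level knowledge; `classify` and the `{ tag, value }` shape are hypothetical:

```javascript
// Hypothetical classifier for an untagged `Int | String` value.
// The checks encode user-level knowledge about which runtime shapes
// can occur; nothing verifies that the list of cases is complete.
const classify = (value) => {
  if (Number.isInteger(value)) return { tag: "Int", value };
  if (typeof value === "string") return { tag: "String", value };
  throw new Error("value is neither an integer nor a string");
};

console.log(classify(42).tag);    // Int
console.log(classify("foo").tag); // String
```

If the runtime checks are wrong or overlap, the tagged result is wrong too, which is why such conversions need careful review.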

Hey, I really like using our library and I’ve built a pretty large application with it. I wasn’t trying to send out any negative vibes - just mildly advocating for untagged unions :)
