I’ve been doing some thinking lately about type class instances, specifically about how libraries should be providing type class instances for the type classes and types they define.
Currently there are three approaches (that I am aware of) for providing type class instances:
1. The library providing the type class provides the instances
In this approach, the library that defines the type class provides instances for it. For example, the `EncodeJson` and `DecodeJson` type classes come with a variety of instances for types that will come up frequently.
This approach works well for ubiquitous types, like primitive values (`Int`, `Boolean`, `String`, etc.), common collection types (`Array`, `Map`), and other common types like `Maybe` and `Either`.
However, it does not scale well to the total set of possible types that may exist. The library that provides the type class would have to be aware of each type that someone may want to have a type class instance for in order to provide it.
2. The library providing the type provides the instances
In this approach, the library that provides a type also provides the instances for that type. For example, `CaseInsensitiveString` could provide an `EncodeJson` instance to make it usable with Argonaut out of the box.
While this scales better than the first approach in terms of spreading out the responsibility for providing instances across the ecosystem, it comes with the trade-off of each library needing to pull in extra dependencies to provide the type classes.
In the case of `CaseInsensitiveString`, it probably doesn’t make sense for the `strings` package to take a dependency on `argonaut-codecs` to provide an `EncodeJson` instance, as not everyone using `strings` is going to need it.
3. Library consumers use `newtype` wrapping to provide their own instances
In this approach, the library consumer is responsible for providing their own type class instances for the types that require them. Since orphan instances are not permitted, this requires wrapping the type in some way (e.g., with a `newtype`) in order to provide instances for the type.
For example, say I want to deserialize a `CaseInsensitiveString` from JSON. Since `strings` does not provide a `DecodeJson` instance, I now have to wrap the type so I can provide my own instance of `DecodeJson`.
This is a worse user experience than if `CaseInsensitiveString` just provided the instance out of the box.
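Rust's orphan rule imposes the same restriction on trait implementations, so the wrapping step can be sketched there using only the standard library. This is an analogy rather than Argonaut code: `Vec<String>` stands in for a type from someone else's library, `fmt::Display` for a type class from someone else's library, and `CommaSeparated` and `render_tags` are hypothetical names made up for the illustration.

```rust
use std::fmt;

// Both `Vec<String>` and `fmt::Display` are defined outside this crate, so
// writing `impl fmt::Display for Vec<String>` here would be rejected as an
// orphan implementation.

/// Hypothetical local wrapper whose only job is to carry the implementation.
pub struct CommaSeparated(pub Vec<String>);

impl fmt::Display for CommaSeparated {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Delegate to the wrapped value; the wrapper adds no data of its own.
        write!(f, "{}", self.0.join(", "))
    }
}

/// Hypothetical call site: the value must be wrapped to reach the impl.
pub fn render_tags(tags: Vec<String>) -> String {
    CommaSeparated(tags).to_string()
}
```

The friction shows up at every call site: the consumer has to wrap the value before using the implementation and unwrap it again to get back to the type the rest of their code expects.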
---
What we’re left with is a matrix that looks something like this:
Approach | Scalable? | Efficient? | UX |
---|---|---|---|
1 | No | Yes | Good |
2 | Yes | No | Good |
3 | Yes | Yes | Poor |
Legend
- Approach - The number of the approach (from the previous section)
- Scalable - Whether the approach scales well to large cross-products of type class-providing libraries and type-providing libraries
- Efficient - Whether the approach is efficient in the sense that it avoids libraries having to take dependencies that not all consumers might need
- UX - The user experience the consumer has when using the library
What we can see is that approaches 1 and 2 provide the best user experience, as the following two statements hold true (at least in the ideal case):
- When using a library that provides a type class, an instance is provided for the types I want to use
- When using a library that provides a type, an instance is provided for the type classes I want to use
However, this improved user experience comes at a cost: approach 1 does not scale, and approach 2 adds potentially unneeded dependencies.
Of these three approaches, the second, in which the library that provides a type also provides its instances, shows the most promise:
- It is scalable, as the libraries providing the types can bring their own instances without upstream changes in the library providing the type class
- It provides the great user experience of things working out of the box
Is there a way we can make approach 2 efficient (by avoiding library consumers having to take on additional dependencies that they don’t need)?
---
There is some prior art in this space from the Rust ecosystem. Rust’s trait implementations are analogous to type class instances, and Rust also prohibits orphaned instances.
Rust solves this through the use of `cargo` features. These features can be used to conditionally compile parts of the code based on which features are present.
It is relatively common for crates to have a `serde` feature that provides trait implementations for `serde` for consumers who want to serialize/deserialize the types from the crate.
The `uuid` crate, for example, can be consumed like so:

```toml
[dependencies]
uuid = { version = "0.8", features = ["serde"] }
```
Internally, the crate uses the `cfg` attribute to put the `serde_support` module that provides the `serde` trait implementations behind this feature:

```rust
#[cfg(feature = "serde")]
mod serde_support;
```
Likewise, `serde` and `serde_json` are marked as optional dependencies in `Cargo.toml`, meaning that they are only pulled in when using the `serde` feature.
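To make the mechanism a little more concrete, here is a minimal sketch of what a type-providing crate might look like on the inside. Everything in it is hypothetical: the `json` feature, the `UserId` type, and the `ToJson` trait, which stands in for a trait that would really come from an optional dependency such as `serde`.

```rust
// Sketch of a type-providing crate with an opt-in integration.
// Assumption: the crate's Cargo.toml declares a feature named `json`.

/// The type this hypothetical crate exists to provide.
pub struct UserId(pub u32);

/// Stand-in for a trait that would really live in an optional dependency.
pub trait ToJson {
    fn to_json(&self) -> String;
}

// This module, and the implementation inside it, is compiled only when the
// consumer enables the `json` feature.
#[cfg(feature = "json")]
mod json_support {
    use super::{ToJson, UserId};

    impl ToJson for UserId {
        fn to_json(&self) -> String {
            // A bare number is already valid JSON.
            self.0.to_string()
        }
    }
}

/// Reports whether the integration was compiled into this build.
pub fn json_support_enabled() -> bool {
    cfg!(feature = "json")
}
```

Consumers who never enable the feature get the type and nothing else; consumers who do enable it get the implementation without having to wrap anything.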
---
To wrap things up, I’m curious to hear what others have to say about the current state of affairs for providing type class instances for libraries.
Does having something like Cargo features in Spago sound desirable? Feasible?
Are there other solutions or ideas in this space that you’ve thought of?