Unbiasing the Semigroup instance for Map

I propose another solution to this problem:

v0.14.0:

  • We reimplement ordered-collectionsMap as a completely new module Data.Map.Unbiased that has the updated instances. Those who want to “opt-in” to the new instances can use the new module.
  • The original Data.Map module’s Map one will have Warn constraints put on its instances that complain loudly about their upcoming change in v0.15.0.
  • we define a function that is effectively coerce between the two data types (i.e. Data.Map k v -> Data.Map.Biased k v and vice-versa), so that the ecosystem doesn’t have to create two things for each implementation.

v0.15.0

  • The original Map instances will be updated to use the “opt-in” version’s implementation of the instances. However, it’s Warn constraints haven’t been removed. Now they say, “These instances are now unbiased and are the same as the instances in Data.Map.Biased.”

v0.16.0

  • The Warn constraints from the original Map are removed. This Map is effectively the same as the Biased Map in v0.14.0
  • The biased Map module is deprecated, users of this code in v0.14.0 are now told to migrate back to the original implementation.

v0.17.0

  • The biased Map module is removed

This approach has the following benefits at the cost of how long this migration approach takes:

  • v0.14.0 users can immediately use the unbiased versions. Those who are currently depending on the biased versions are loudly notified of the issue
  • Those who “jump” two versions are still notified of the issue.

I don’t think there is any way of preventing users from running into this if they skip over major versions. @JordanMartinez with your proposal, people jumping from 0.13.x (or earlier) to 0.16.x (or later) run into the same issue; any code using Map’s Semigroup instance where the values have a Semigroup instance will silently change behaviour. It’s true that skipping over two major version series in a row will be rarer, though.

Rereading through this issue, we seem to have the following parties represented:

  1. Those who have 10,000s LOC backing real production-level apps that depend on the biased instance. Upgrading a codebase to v0.14.0 is already going to be somewhat tedious and this change is going to add to their workload and steal developer time away from other things that add value to the business.
  2. Those who do not have production-level apps and who pay no cost if the change is made
  3. Those who fall somewhere in-between 1 and 2
  4. The unknown group of people who will upgrade to v0.14.0 or some future breaking PS release sometime after this point

Reading through this again, it seems like we have a few options:

  1. do not change the instance and provide a newtype to get the unbiased version
  2. provide some multi-major-release solution that migrates from the current instance to the changed one
  3. change the instance to an unbiased one now

#1 seems to satisfy most people’s concerns.

  • Those who want the unbiased version can use coerce the Map to a newtype that specifically does that. Assuming they do not have a lot of code, that’s not hard to do.
  • Those who want to gain the benefits of other v0.14.0 changes do not have to be concerned about this otherwise “silent” change. Thus, they are not penalized by having a real-world app.
  • Anyone who upgrades to v0.14.0, which has a LOT of other breaking changes, won’t need to remember to read through the release notes of Map. They won’t experience a “silent” change, nor will they need to fix tooling via psa to remove UserDefinedWarnings if we did some sort of Warn-based constraint to notify people of it.

Normally, it is our preference for pushing breaking changes in core libraries when we make breaking PS releases. However, due to the “silent” change isue, perhaps this is a situation where we should just make a new library release that’s a few months after the v0.14.0. Then, the change would be less “silent” because there is less going on and people are more likely to read the release notes. We could also address the package set issue that Phil raised above by creating tooling that shows all the breaking changes across packages in a set from set A to set B.
Lastly, deciding not to make this change now would also give the community (or at least those who have the most to lose by this change occurring) time to develop a tool that could immediately highlight all usages of the original biased instance, notifying them that a decision needs to be made (e.g. coerce the value to Map k (First a)).

Put differently, perhaps this change should not be done until we have better tooling that prevents “silent” changes from occurring.

1 Like

I suppose, now that we have coerce, we can make an unbiased Map module with the exact same API, and each implementation just coerces the corresponding one from the other module. That sounds ideal, to reduce code duplication. (We just have to remember to keep them in sync later.)

Just thought to clarify why I voted against.

I really appreciate how the PureScript community usually deals with mistakes and new insights in that when it is necessary breaking code changes are not avoided. Some people don’t like this, but to me it is a necessity in that without it, the debt will be increasing and it just wouldn’t be possible for both the PureScript language and the library ecosystem to be in such an excellent state.

Also in this case, it feels to me that just changing the code (in any of the proposed ways) would leave the PS standard libs in a better end state, i.e. the unbiased version should be the one to take the Data.Map namespace.

However, this case is interesting in that the cost is so big compared to the benefit that I think it would be better to leave the existing Map unchanged for now, introduce an unbiased Map next to it and then explore ways how in the future these kind of changes can be made with less cost.

2 Likes

Whenever the issue of backward compatibility comes up, I always feel the need for a tool to perform mechanical source code updates automatically. In this case, automatically moving all existing code to use a First wrapper would provide the best of both worlds.

Some prior art - Go provides a go fix tool for this.

Is there interest in building something like this? We at Juspay would be happy to sponsor such an effort.

7 Likes

I would really love syntactic refactoring tools regardless, so I think this is a good direction, but I think this sort of migration is more complicated since it involves working with instantiations of Monoid (Map k a) which would require elaboration.

1 Like

Yes! This has been on my wishlist for a while. Created another topic for this here: Automated upgrades for breaking changes

2 Likes

I’m also on board for making an unbiased Map module with the exact same API. It seems ideal to have this module live alongside Data.Map in the same library, since it’s clearly something that plenty of people would find useful, and it would probably be easier to keep the interfaces in sync this way. That also allows us to make some progress on this without needing to break anything.

The more I think about it the more I like @JordanMartinez’s idea of having one release series where Map does have the unbiased instance, but it also still has a warning saying “this instance has changed”. I think skipping over a major version is probably already rare; skipping over two at a time is probably so rare that I think it becomes difficult to argue that people in this category should hold up the task of being able to provide a better instance.

Automatically rewriting code Map k a to Map k (First a) would be nice, but I think that’s likely to be extremely difficult to do well, to the extent that I don’t see it happening. I’d love to be proved wrong, of course.

perhaps this is a situation where we should just make a new library release that’s a few months after the v0.14.0. Then, the change would be less “silent” because there is less going on and people are more likely to read the release notes.

I’m not sure this solves the issue either, really, because we can’t make assumptions about when people will upgrade. I suspect it’s not all that uncommon for people to only upgrade their package set when updating to a new compiler version. If I update to 0.14.0 and then leave my package set alone until 0.15.0, this change could just as easily get lost among all of the other changes I make during the upgrade to 0.15.0.

We could also address the package set issue that Phil raised above by creating tooling that shows all the breaking changes across packages in a set from set A to set B.

Yes! This would be really good to have too.

1 Like

I’ve opened this PR for the time being: https://github.com/purescript/purescript-ordered-collections/pull/38

3 Likes

@kritzcreek points out in https://github.com/purescript/purescript-ordered-collections/pull/38 that duplicating the API does have a disadvantage, in that it could be confusing for people searching Pursuit or asking their tooling to import Map-related functions for them that there are now two different Map APIs to choose from, and they have to learn what an “unbiased” Map is (since that isn’t really a concept which exists anywhere else). That’s quite persuasive to me: I think there are probably at least a few places where people are using Maps without making use of the Semigroup instance at all.

I’m now leaning more towards the approach of providing just the newtype wrappers so that people can get the Semigroup behaviour they want while this migration is ongoing, instead of duplicating the whole API.

3 Likes

To follow up with this, here are the changes we would make if we went with the approach above described by Harry:

  • drop the Semigroup and Monoid instance for Data.Map
  • define a newtype in Data.Map called SemigroupMap. Its name makes it more obvious why it exists. Its Semigroup instance would be unbiased. It would have all the type class instances of Data.Map. However, it does not have any of the API of Data.Map (e.g. lookup, etc.). Thus, you would only use it when you need Data.Map to have a Semigroup instance.

This comes with the following advantages

  • Those who do rely on the left-bias nature of the current Map will not experience a “silent” change. Rather, they will be forced to update their code. In situations where they are using this behavior, they will have to coerce the Map k v to Map k (First v) and then wrap it in the SemigroupMap newtype to get the same behavior.
  • Those who want the flexibility of the unbiased Semigroup instance can have it now rather than later.
  • We don’t have two Map types that may confuse beginners or make it harder to import the correct function.
  • In v0.15.0, I believe we would deprecate the SemigroupMap newtype and add the unbiased Semigroup and Monoid instance back to Data.Map with a Warn that says something like “Instance differs from the instance defined in PS releases <= 0.13.x.” In v0.16.0, we would remove the SemigroupMap newtype and remove the Warn constraint on the Semigroup instance.
  • Those who jump two major releases would be notified of the change via the Warn constraint. Those who jump three major releases would not be notified, but that is likely very rare.
4 Likes

We still have the option of adding a warning to the Semigroup (Map k v) instance when we add it back in 0.15 with this approach, right?

Yeah. We could add it back in saying, “This instance differs from the implementation defined in the PureScript v0.13.x release or earlier” or something along those lines.

So, the “two-release-jump” people would still be notified. I’ve updated my response above to include that part.

I might misunderstand the use of the Warn class, but couldn’t it be used in order to print a deprecation warning (if the team will go that direction for the next release)? I don’t see why it would be difficult for people upgrading to see the differences if there will be a deprecation warning on build or run.

I’m sure people have been busy with holiday stuff in the past couple of weeks. If there is any thing else anyone would like to say, please do so soon.

@razcore-rad I’m not sure.

This is another breaking change release that should be made in v0.15.0.

Per the my comment above:

In v0.15.0, I believe we would deprecate the SemigroupMap newtype and add the unbiased Semigroup and Monoid instance back to Data.Map with a Warn that says something like “Instance differs from the instance defined in PS releases <= 0.13.x.” In v0.16.0, we would remove the SemigroupMap newtype and remove the Warn constraint on the Semigroup instance.

I don’t think we should do what was originally described exactly. If we have a Warn on both the SemigroupMap and Map instances, then everyone will get compiler warnings for the rest of v0.15.0. In v0.15.0, I’d rather not having to see a new MonadZero-like warning for the rest of the release. I would rather we put a Warn on Map so that those who jump two breaking changes are still notified, and NOT put a Warn on SemigroupMap's instances.

In other words:

  • v0.14.0
    • remove Map's Semigroup and Monoid instance
    • add SemigroupMap with unbiased Semigroup and Monoid instance
  • v0.15.0
    • add unbiased Semigroup and Monoid instances back to Map with a Warn in case people jump two breaking changes
    • SemigroupMap's instances are untouched.
  • v0.16.0
    • remove Warn on Map's Semigroup and Monoid instances.
    • add Warn on SemigroupMap's Semigroup and Monoid instances.
    • add deprecation notice to SemigroupMap
  • v0.17.0
    • remove SemigroupMap
4 Likes

Hi,

I observe a compiler warning in the program for the quoted snippet below:

component ::
  forall query output m. H.MonadAff m => Navigate m =>
  H.Component query TrainingInput output m
component =
  H.mkComponent

A custom warning occurred while solving type class constraints:

Data.Map's `Semigroup` instance is now unbiased and differs from the left-biased instance defined in PureScript releases <= 0.13.x.

How to resolve the warning properly?
I don’t see a way how these highlighted lines are related to Data.Map or Semigroup, because Halogen component doesn’t have Maps and MonadAff, Component, Navigate definitions neither.

So besides the warning itself, I would like to pay attention to the situation where compiler reports a problem, but developer cannot follow the hint. Intermediate lines/transitive symbols should be mentioned in the report between the highlighted line and the pragma warning.

You don’t. That warning is there to ensure people don’t accidentally rely upon the old Semigroup instance. If you are using the new spago though, you can filter out this warning.

  # Optional section to further customise the build.
  build_opts:
    # Specify whether to censor warnings coming from the compiler
    # for files in the `.spago` directory`.
    censor_library_warnings:
      # Note: when using `by_prefix`, use the `>` for block-string:
      # see https://yaml-multiline.info/
      - by_prefix: >
        Data.Map's `Semigroup` instance
1 Like

This is not good, because warning is a signal to a developer, that something goes wrong. Unconditional warning looks strange.
PureScript is positioned as type safe language. This kind of warning I would expect to experience in plain JS based library.

1 Like