Proposal on PureScript package management

wclr · April 2, 2021, 10:12am

Hey, PS community! Recently I’ve been pondering the issues related to PureScript package management in general and working on a proposal that may help to resolve them. Now it is ready to be presented. It is a bit wordy, but I tried to give the rationale behind the suggested ideas in detail. I want to thank some members of the core team who had conversations with me on the topic and helped me come to certain conclusions. Have a nice and thoughtful read!

Proposal

The goal of the current proposal is to put forward a solution to problems that relate primarily to the package management part of the PureScript ecosystem. Those issues are mainly raised in the issues of the main GitHub repo and concern the compiler’s package awareness and problems with module namespaces in consumed packages (e.g., duplication of module names).

This proposal assumes re-thinking and re-evaluating the current understanding of a packaged entity, its relationship with included modules and publishing infrastructure. These changes, if accepted and implemented, should presumably lead to a clearer and more concise design of packages, and to simpler consumption and management in general.

The proposal does not contain any particular low-level implementation details, but rather a higher-level overview of the proposed model and justifications for it. It is absolutely open to changes and corrections.

Packages and module namespaces

A package should be an encapsulated module namespace like Data.String or Some.Platform. A single package includes and exposes only one namespace. This namespace should be the name of the package and should serve as its primary identity.

A package may include and expose nested namespaces. Say package Data.String may include modules Data.String.RegEx and Data.String.NonEmpty, and in this case even having the Data.String module file itself might not be required, because nested namespaces should be available when just importing the parent namespace: for example, if we import Data.String as S then we get access to the nested S.RegEx.test for free. I believe this should be enabled by convention. It allows implementing what has been discussed without adding extra re-export syntax, and it will expose only what should be exposed according to the structure of a package.
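
To make the namespace rule concrete, here is a minimal sketch in Python (used only for illustration, not as a proposed implementation). The function names exposes and resolve_qualified are hypothetical, and the sketch assumes that importing a package under an alias brings all of its nested namespaces into scope.

```python
def exposes(package: str, module: str) -> bool:
    """True if `module` is the package namespace itself or is nested under it."""
    return module == package or module.startswith(package + ".")


def resolve_qualified(package: str, alias: str, reference: str) -> str:
    """Expand an alias-qualified name such as S.RegEx.test into a full path.

    Assumes `import <package> as <alias>` exposed the whole package namespace.
    """
    prefix = alias + "."
    if not reference.startswith(prefix):
        raise ValueError(f"{reference!r} is not qualified with {alias!r}")
    return package + "." + reference[len(prefix):]
```

With this rule, Data.String.RegEx belongs to the Data.String package, while a similarly spelled namespace such as Data.StringExtra does not, because the match is on whole dot-separated segments.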

A package has a version. Packages may depend on other packages with particular versions or version ranges (nothing special about versioning here). A package is fully opaque to a consumer, who has no access to its dependencies or internal implementation but only to an exposed namespace.

This approach of putting the module namespace up front forces one to think about module names and the namespace hierarchy in the first place. It gives a much clearer notion of what the package’s purpose and the author’s intentions are. And this definitely should lead to better design decisions in general and will help to establish more order in the ecosystem through better standardization and categorization that will be partly embedded in package names.

The current approach, in which the actual relation between package naming and inner module namespaces is absolutely unregulated and hidden, only leads to additional cognitive load, weird and poor design decisions, and issues like duplicate module names, unrelated namespaces and modules in the same package, and bloated interfaces.

In this changed paradigm, package authors need to start thinking about module names and take them seriously. Modules should not be “just a bunch of files”, and this attitude should be conveyed throughout the ecosystem. “How do I better name my package (what namespace should I use for it)?” should become a common question for package authors in the community.

In more abstract terms what we get with the proposed change on packages and modules is:

Removing what is not needed without losing anything (made-up package names).
Hiding what should be hidden (inner implementation details).
Exposing what was concealed but should be explicit and visible (module namespaces).
Decomposing what was compound and complected (bloated packages).

It all may seem to be a big or significant change, but actually it is not: it is just a restructuring of packages and removing unnecessary labels by replacing them with a single exposed module namespace. But the impact of it may be huge and positive.

In the next paragraphs, it will be shown how to solve the issues that arise (or become more explicit) with the new approach.

User namespaces

The registry should be divided into user namespaces (or domains, or scopes). This would allow users to have a place for the packages they own, where they don’t interfere with others. A user may publish their own packages without any particular limitations.

Such namespaces can be used not only as personal dev accounts but for teams and groups as well. This username-based organization can also help to communicate better the kind of package origin, level of trust, etc.

Important user namespaces like core, contrib, etc. should be defined and taken. Namespaces should be immutable and treated as a constant secondary part of package identity in the ecosystem; they cannot be discontinued or changed at a user’s request.

Why is it important to have user namespaces? In the proposed design this is actually necessary. Currently developer’s freedom is achieved by allowing one to come up with an arbitrary name for a new package, and inside this package an author may place whatever one wishes. For example, a user decides to publish package “purescript-task”, then another user with a similar or related idea would like to do the same, but “task” is taken, so the user has to come up with some other name: “task2”, “better-task”? Maybe one could come up with and publish “purescript-work”? That looks really weird and clumsy, definitely a source of ambiguity and miscommunication and probably even a point of frustration for people.

I wonder why anyone would want to have such an explicit point of conflict and to be engaged in additional problems and resolution processes like package name squatting, name reassigning, disputes, reservations, etc. I see a registry system of that type without user namespaces as an app with a mutable shared state at scale: the more actors involved, the more obvious and painful it becomes, and this feels wrong.

Another type of communication between package authors should be encouraged: resolving module namespace conflicts in codebases when necessary (see below), or collaborating. That will be a whole different level of discussion, not just “he took my package name”.

User namespaces solve problems and create future design opportunities. I’ve been advocating them in general, but as has been said, they are a necessary part of the proposed approach, and this makes it even more valuable because there is no need to make a separate decision on them.

From the implementation standpoint, introducing user namespaces should not require much work or any significant effort. We should just add the username to the package name, and treat it as a package identifier when discovering a package. It definitely should not be seen as a complication because it is not (we already have a username that is coupled with a package anyway), but as a simple solution to the identity and uniqueness problem, without shifting this onto users’ heads.
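
As a rough illustration of “username plus package name” as an identifier, here is a small Python sketch. The "user/Namespace" spelling, the PackageId type, and the users alice and bob are assumptions made up for this example; the proposal itself does not fix any syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageId:
    user: str       # user namespace, e.g. "core" (hypothetical spelling)
    namespace: str  # module namespace that names the package

    def __str__(self) -> str:
        return f"{self.user}/{self.namespace}"


def parse_package_id(text: str) -> PackageId:
    """Parse the assumed "user/Namespace" form into a PackageId."""
    user, sep, namespace = text.partition("/")
    if not sep or not user or not namespace:
        raise ValueError(f"expected 'user/Namespace', got {text!r}")
    return PackageId(user, namespace)
```

Two users can then publish a package with the same module namespace, for instance alice/Control.Task and bob/Control.Task, and the registry still sees two distinct identities.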

Continuity, unity, and stability of the package ecosystem are achieved on another level by other means, like package-sets and relying on packages from trusted users/teams, rather than by just squashing everything into one big pile of names. Not all people/users can be headed in the same right/better direction, but they should be given freedom and space to experiment with their own ideas and approaches without interfering with others.

With package-sets introduced and relied on as a standard, it is actually possible to get the best of both worlds: one part of the system where we have and need namespaces and another part where we have a flat conflict-free set of packages, without compromises and trade-offs.

Package bundles

There should be bundles of packages. A package bundle is an entity that just holds a list of packages (potentially with version ranges) that are included in this bundle. Different bundles may include the same packages, so packages are fully decoupled from bundles.

Bundles have simple names (lower case, like package names currently) and are published inside user namespaces, but can include packages from any user namespace, as long as there are no module namespace conflicts.

For example, a prelude bundle could include packages that are needed for a start: Prelude, Effect, Effect.Console. And a node-http bundle may include the Node.HTTP package, as well as Node.Process, Node.Path, Node.Url, etc. to help in fulfilling common tasks related to using HTTP in Node.js. There may even be a node-all bundle or something. Or a user may always choose a granular approach if needed and install individual packages instead of using bundles.

Though a bundle is an immutable, versioned registry entity (as it contains a list of packages with version ranges), a consumer should not depend on it. It should be just a helper to simplify the installation of a bunch of modules that are supposed to be useful in achieving the consumer’s goals. It may also be involved in uninstalling/managing tasks, but there should be no direct tight dependency between a bundle and a codebase.

The advantages of such a bundle model over the current approach with packages, in terms of distributing related modules in a chunk, are quite obvious. Bundles don’t intertwine, couple, or complicate anything, but give a much more flexible way to implement distribution and encourage thinking in terms of users’ tasks and scenarios.
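
The idea that a bundle is only an installation helper can be sketched in Python as follows. The Bundle type, the install_bundle function, and the version ranges are hypothetical. The sketch assumes that installing a bundle simply copies its package list into the project’s own dependencies, and that ranges the project already declares win.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Bundle:
    name: str                 # simple lower-case name, e.g. "prelude"
    packages: Dict[str, str]  # package name -> version range


def install_bundle(project_deps: Dict[str, str], bundle: Bundle) -> Dict[str, str]:
    """Return the project's dependencies with the bundle's packages added.

    Only packages are recorded; the bundle name never appears in the result,
    so the project does not depend on the bundle itself. Packages the project
    already lists keep their existing range (an assumption of this sketch).
    """
    merged = dict(bundle.packages)
    merged.update(project_deps)
    return merged
```

After installation the project manifest lists Prelude, Effect, and so on directly, so removing or republishing the bundle later would not affect the codebase.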

Package sets

A package is bad if not in the package set. Maybe just not good enough yet.

Package sets are great. This approach obviously should be a foundation and de-facto standard of package and dependency management in the PureScript ecosystem. It ensures stability and consistency, ease of dependency management across the ecosystem and in user codebases.

Generally, package authors should be encouraged to add their packages to the set, but there should also be an emphasis on considering when users should strive to get on board and when it is worth standing by for a moment.

The package set should be a place for useful and commonly/widely used packages with well thought out design, structure, interface, and documentation, clearly communicated purpose and intention. It should not be just a pile of whatever exists (and accidentally compiles). The registry with user namespaces makes it easy and unobtrusive (to other users in the ecosystem) to present to the world one’s creative efforts. Anyone can consume any packages of any user. And if a package is really in demand and other packages and users want to depend on it, then it is worth adding it to the set.

Management of official and standard package sets can be largely automated, but it would still require manual effort that should be spent reasonably.

Adding a package to the package-set should be a point where users are helped with the analysis and review of the design of their packages. And it should be aligned with the core community’s vision and understanding of what should go where. For example, if a user wants to create a module for working with a certain database and share it, he/she should be guided on what package namespace is better to use, etc. Developers should feel responsible for what they are doing and especially for being accepted in the package-set.

Dependency management and conflicts

In a codebase (compilation target) there should be no conflicting module namespaces (packages that expose modules with the same name). This is known as the flat dependencies model and this is the currently accepted (and only possible) way of doing things, and I’m quite sure this should not be changed.
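
The flat-dependency rule can be stated as a simple check. Below is a minimal Python sketch; find_module_conflicts and the package identifiers in the usage note are hypothetical, and the input is assumed to be a map from each package to the module names it exposes.

```python
from typing import Dict, List


def find_module_conflicts(exposed: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return every module name that is exposed by more than one package.

    `exposed` maps a package identifier to the module names it exposes.
    """
    owners: Dict[str, List[str]] = {}
    for package, modules in exposed.items():
        for module in modules:
            owners.setdefault(module, []).append(package)
    return {m: sorted(pkgs) for m, pkgs in owners.items() if len(pkgs) > 1}
```

A compilation target is acceptable under the flat model only when this function returns an empty result. If, say, core/Data.String and alice/Data.String both expose Data.String, that module name is reported together with both packages.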

With the current proposal implemented, name collisions and duplications should be less likely to happen because the potential surface for a conflict (per package) becomes smaller. Another, more important reason is that a package name based on a module namespace will more clearly convey the intention, and conflicts with the existing ecosystem should not be the intention of an author.

But a situation where a user needs a conflicting package may happen: if one needs to use a package with a conflicting namespace, or another version of the same package. To deal with this we should either invent some sophisticated way to make it possible to compile code that references the same module name from different packages, or propose a workflow that users can follow to achieve their goal without introducing any complications into the compilation process. The latter option is definitely simpler and more appropriate.

A concrete viable option in this case seems to be the following: if a user really needs a conflicting package, they should fork it, change the package namespace, update deps if needed, publish it in their own user namespace, and then add it to the project as a new non-conflicting package. The workflow may vary and for basic cases could even be fully automated, but it should not remove the responsibility from the user.

This option seems to be absolutely viable because a conflicting situation should not be seen and treated as normal. If you need an older version you should then strive to eventually update to the newer version (supported by the package-set). If you need a newer version that is not currently in your package-set you will eventually update the whole of your project to the newer version. If you need a conflicting package from another user, you should somehow try to resolve this issue with the package author, who should understand that his or her package conflicts with the stable/another part of the ecosystem. Demanding “out of the box” flexibility in this area from the compiler will potentially lead to much worse problems for a user in the long run, making one’s codebase more fragile.
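
The “change the package namespace” step of such a fork is mostly a mechanical rename of module names. Here is a minimal Python sketch of that single step; rename_namespace and the target namespace in the usage note are hypothetical. A real tool would also have to rewrite import statements and the package manifest, which is not shown.

```python
def rename_namespace(module: str, old: str, new: str) -> str:
    """Move `module` from the `old` package namespace to `new`.

    Modules outside the `old` namespace are returned unchanged.
    """
    if module == old:
        return new
    if module.startswith(old + "."):
        return new + module[len(old):]
    return module
```

For example, moving a fork of Data.String to Legacy.Data.String turns Data.String.RegEx into Legacy.Data.String.RegEx and leaves Data.Char untouched.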

The compiler package awareness

With all that is proposed above, the compiler doesn’t seem to require many or really complicated changes. Currently, the input is a list of globs of *.purs files; in addition to it the compiler would need to:

know the list of direct dependencies available to a compilation target, to expose only allowed modules and hide transitive deps from it.
understand where a package is (by locating a package manifest) and be able to expose a single namespace from the package.

Some additional complication would arise if a single package is to be treated as a compilation unit. But all this seems to be quite straightforward and potentially brings no breaking changes to the consumer side with properly updated metadata.
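
The two requirements above amount to a visibility check on imports. Here is a minimal Python sketch, assuming the compiler is handed a map from each compilation target to the namespaces of its direct dependencies. The can_import function, the Node.HTTP.Client module, and the dependency between Node.HTTP and Node.Url in the usage note are all hypothetical.

```python
from typing import Dict, List


def can_import(target: str, module: str, direct_deps: Dict[str, List[str]]) -> bool:
    """True if `module` lies in the namespace of one of the target's direct dependencies.

    Transitive dependencies are never consulted, so they stay hidden.
    """
    return any(
        module == namespace or module.startswith(namespace + ".")
        for namespace in direct_deps.get(target, [])
    )
```

If an app depends on Node.HTTP, and Node.HTTP in turn depends on Node.Url, the app may import Node.HTTP.Client but not Node.Url, unless it declares Node.Url as its own direct dependency.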

As for the deal of building against pre-compiled dependencies, I don’t see the reason why such packages could not be potentially precompiled by an appropriate version of the compiler and used in end-user projects.

Implementation plan

Here is a kind of high-level layout of what should be accomplished to bring the proposed changes to life; details should be devised and corrected if the proposal is accepted in general.

The registry. This is probably the biggest part. The registry is currently still in the planning and intermediate implementation stage, so in addition to planned features, it would need to support: user namespaces, package bundles, and automatic versioning via analysis of package API changes (which I think should be a strong point because it is very important for providing consistency of the ecosystem).

Package management tools. This is tightly coupled with the registry implementation and requirements.

The compiler. As discussed above, the compiler should be updated to deal with and expose packages. This doesn’t seem to be a huge change.

Redesign of core/contrib packages. This too is quite a big task that would require significant effort from the core team. It will probably require organizing new repositories (monorepos), transferring the code, etc. This should be planned beforehand and started after the registry implementation reaches some stable and working stage. But it is important to understand that this transition actually can be planned and made smooth and gradual, because there is no reason why the compiler and tools could not support both kinds of packages, the old and the new, at the same time.

New version of Pursuit. This change will require creating a new version of the Pursuit site designed around the new package vision. It can be made more ergonomic and API-oriented; this has already been discussed.

Ecosystem and users transfer. After the registry is ready, the compiler and tools are updated, and core/basic packages are published, users may start to transfer their packages and projects to the new version of the compiler and package infrastructure. From the user’s perspective there are two kinds of updates:

Upgrade the metadata for consuming the new package infrastructure. This can be mostly automated as old packages (which are bunches of modules) could be translated to new packages in a quite straightforward way.
Update the PureScript source code, which is affected only by the redesign of namespace/module names, something that happens with each new compiler version.

As for package authors, I don’t think there can be any way to make an “automatic” transition; most of the existing packages have to be restructured into a new form and published in the registry.

This also implies that the compiler and tools should facilitate the transition by supporting both old and new ways of installing and compiling dependencies. I believe it is totally possible, as under the hood we are still dealing with separate module files. The point of all this is to re-construct the notion of the package entity, which is still a pack of related modules but more granular and concrete.

In conclusion

In the paragraphs above we discussed the benefits of the proposed model and some drawbacks of the current approach. But to conclude we need to answer two more questions:

What would be the cons and trade-offs of the proposed approach if accepted?
What would be the benefits of staying with the current approach?

As for the first question, among points of concern, I would note the following:

It requires a more granular approach to development. There will be more independent packages, and it will not be viable or efficient to use a separate git repository for each. This will require embracing monorepo dev practices for closely related groups of packages. In my practice, monorepos are great for reducing complexity and eventually making the management of related releases easier.
It will also require a more granular approach to dependencies and version management (for some packages). For example, now we have a package prelude which includes multiple modules that are developed, versioned and published as a single unit, and that may feel easier to manage. With the proposed approach there will be multiple packages Prelude, Control, Data.Eq, etc., and the Prelude package will depend on them; if one of those packages is updated, this may require also updating the Prelude package. But there wouldn’t be many packages like this with a large number of dependencies which previously were in a single package. I don’t see this as a complication, but as sophistication that can expose hidden coupling between packages. In terms of dev experience, the monorepo approach discussed in the previous paragraph will definitely help to deal with this.
It requires more breaking changes to come, maybe. Though I don’t see how those changes may seem to be intimidating for the language and tools implementers or disruptive for the end-users, at least there are all the means to prevent it. And I doubt that it even can be called a significant breaking change because even if the language and core libraries may be considered as a mature part, the package management part is far from a satisfactory state and any enhancement to it should be considered and perceived as beneficial and not breaking at the current stage.

As for the second question:

I don’t really see any strong motivation and reason to stick with and promote the model based on the current approach with contrived package names and arbitrary modules included, and a flat repository of those package names because it brings with it so many problems and complexities already exposed and discussed.

In my view, the main reason why this problem still hasn’t seen actual progress is that people intuitively feel that this primordial status-quo situation with a flat ungrouped list of package names and compound package entities along with the simple and lean Haskell-style module system just doesn’t have an elegant resolution if the goal is to have a robust, flexible and stable package ecosystem.

I’ve seen people discussing much more radical and breaking ideas for the module system, which are still around packages in the current understanding. So I believe it may be hard for people to imagine that such “familiar packages” with fancy composed names where one can put anything “that is needed” are not here anymore. People will need time to consider and ponder on it. But keeping up with the current approach to packages will only bring more problems in the long run. The Haskell ecosystem is fighting with the mess and trying to invent some workaround solutions, but they cannot afford to introduce more or less radical changes and diverge from bad design decisions made in the past, or it is much harder for them to think about it. I believe PureScript should strive to make more correct and eventually right decisions in all the areas and try to avoid mimicking the bad parts of Haskell’s traditional approaches.
Another reason may be comments like: “This will require a lot of work and resources we don’t have.” - so we are trying to save time and energy and keep ourselves on the taken track. This could be a valid and fair argument, but only in one case: if the path is really chosen, everyone involved in making the decision agreed and there is a perception that it is really right and correct, then we do not need to change anything.

But if we have some hesitations and unresolved questions, we probably should deeply consider the alternatives. In this case, we should answer the question “we don’t have resources for what”? We need to answer it with specifics: we should estimate whether the alternative path of change really requires more resources and work and gives less or nothing (maybe it just takes and harms), or whether it just requires some amount of mental and emotional energy to make a switch and get on the right track eventually.

In any case, I believe people, esp. those who are responsible, should think in the first place about the needs of the community and future of the Language and the ecosystem, and only in the second place about their personal preferences and self-regarding aspects.

Nevertheless, I also assume that I may be failing to observe some significant and important drawbacks of the proposed approach compared to what is happening right now and where things are going. So if one generally disagrees with the proposal, I think it is reasonable to answer those two questions when formulating and presenting the judgment. It really would be nice to see and hear more concrete technical and design considerations about the possible invalidity, irrelevance, or difficulty of implementing the proposed changes, and not just general words.

Related links to consider

Here are some links to the posts I found resonating and inspiring. Actually, I found many more resources useful in composing this proposal, but these are just a few:

This is an insightful and beautiful comment by joneshf about a fundamental issue of the relationship between packages and modules that pushed me to think further on the problem.
There is a comment by paf31, which proposes that each package like “foldable-traversable” could have its own conventional namespace like “FoldableTraversable”. So why not just get rid of the former name if possible? It also says about restricting conventions for module namespaces which are not discussed in this proposal because it is impossible to implement such an idea with the current design of namespaces.
hdgarrood’s Discourse comment on user namespaces/scopes and also hdgarrood’s blog posts on package management in general.
The post of a haskeller that talks about the flaws of having “Internal” modules conventions and that each module that presents its own functionality should be shared as a separate package.

hdgarrood · April 2, 2021, 12:58pm

There’s definitely some interesting topics to discuss here. Before I address those though, there’s one thing I want to respond to first.

In any case, I believe people, esp. those who are responsible, should think in the first place about the needs of the community and future of the Language and the ecosystem, and only in the second place about their personal preferences and self-regarding aspects.

This is not how this works - you just don’t get to ask maintainers to do this. I’m objecting to this on two levels: firstly, on a moral level - maintainers are people too, who need breaks and time for themselves just as much as anyone else does. Suppose you take the utilitarian view that because x hours of work done by a maintainer might save y hours of work by 100 or 1000 users, the maintainer should do that work (just for the sake of illustrating the argument - I’m not saying this is your view). The logical conclusion of that is that the maintainer is morally obliged to work themselves to the bone, which clearly isn’t right. My other objection is more practical; speaking from experience, if we take this view, it’s less likely we’ll manage to ship the features we want. Instead we’ll just get burnt-out maintainers who don’t want to work on the project any more.

Adrielus · April 2, 2021, 2:15pm

I wonder if a tool could be made to help automate the namespacing process

wclr · April 2, 2021, 6:46pm

I have probably been misinterpreted here. I didn’t mean to push people or impose additional work and responsibilities that they are not willing or able to take, or something like that. I’m just talking about the attitude that one should first think about the needs of the project and its future. No one here should be in haste or overloaded; this is implied and obvious, as this is not a startup or business endeavor with abnormal deadlines. By “self-regarding aspects” I meant, for example, a possibly limited personal understanding, which poorly regards the interests of the whole. The whole (the community, users, projects) is not interested in overloaded and burned-out maintainers, and stuff like that, but it is interested in the best technical solutions and in the best engineering decisions made, and that is what people who are at the core of the ecosystem are responsible for. Though this is definitely implied, I still wanted to emphasize it. This is what I meant, and I probably should apologize if I sounded too pushy here.

wclr · April 2, 2021, 6:53pm

What do you mean “to automate the namespacing process”?

Adrielus · April 2, 2021, 7:21pm

Idk, translating from the old to the new module system

wclr · April 2, 2021, 8:18pm

There is no new module system assumed. The Haskell-style module system is really simple and lean, and probably the best (though this statement would need some proof) for the language. It is all just about reducing the package entity to a module namespace, which should simplify the whole packaging architecture. And module namespaces are not proposed to be subject to any changes. So there would be no need for any modifications of the existing code. The whole deal is about packages and code distribution.

hdgarrood · April 3, 2021, 1:18pm

I think there are a few separate proposals here and I think you’d find that you’d generate more useful discussion if you separated them, because there is a lot of text here and it makes this difficult to respond to in its current form.

User/org scoped package names have been proposed and discussed already, for example, and I am not really persuaded by your claim that they need to be considered alongside the other things you mention. Scoped packages also have definite downsides, and your proposal would be stronger if you addressed them. See eg https://samsieber.tech/posts/2020/09/registry-structure-influence/
I am 100% on board with unifying module names and package names, although I (perhaps unsurprisingly) prefer my own proposal where the package name is the one we keep around and the module names become less important. Firstly because package names are already designed to work well in a registry, since they can’t use uppercase (case sensitive package names can be very annoying) and also because some of the conventions we have in module names should imo be discarded - Data.This and Control.That don’t buy us anything imo. I also prefer the approach of making the current package name the primary way of referring to a package/module because it means that we don’t have to rebuild the existing registry which tells us what names refer to what repositories. Finally, it’s easier (for me at least) to see how to migrate third party packages which exist under other packages’ namespaces with my proposal, such as Halogen.DayPicker or Data.String.Extra.
I’m not convinced by package bundles, and especially not if that requires building features into the registry. I think having “starter” repos for certain project types, like a halogen or react-basic app, fills the same need and doesn’t require any registry changes. I think it’s notable that we have tried to do this before with the purescript-base package - this package was eventually abandoned because frankly it was a pain to maintain and wasn’t really providing value. (At least this is my understanding - I wasn’t a maintainer of it.)
Reviewing packages on submission to package sets would be a massive ongoing amount of work which would only increase over time, and as far as I’m aware no other OSS ecosystem does this. It could also very easily deter would-be authors if they don’t want to go through a process where they have to defend their package’s right to exist in public. I think it could be great for some kind of more carefully curated package ecosystem to exist, but I feel quite strongly that this should be separate from the official package set.

I’m also not really convinced by your estimations of how much effort these things will require to implement. I also want to note that making packaging more complex is not just a one-time cost, it’s an ongoing cost as we will have to investigate and fix bugs which pop up which wouldn’t have been possible with a simpler system, account for all of the different ways people are using the feature set when making other changes, and so on.

wclr · April 3, 2021, 3:09pm

I don’t think I understand what you mean by unifying module names and package names? Could you elaborate on this?

In the proposal, it is meant that we can no longer have one package which exposes both the Data.String and Data.Char namespaces, and this is the main premise for the proposed change. If you agree with that, which I could not really tell from your answer, should we just have “data-string” and “data-char” names for those packages, or do you mean something else?

hdgarrood · April 3, 2021, 4:10pm

The proposal I am referring to is https://github.com/purescript/purescript/issues/2437. Essentially yes: each package would use its own package name as a namespace, and that namespace would be entirely reserved for that package’s use. For example, the existing strings package would add a strings prefix to all of its modules, and you might have, say, a module strings which replaces the current Data.String, and a module strings.Char replacing the current Data.Char.

JordanMartinez · April 3, 2021, 4:23pm

Just wanted to say a from me. I saw this post, started reading through it, saw just how long it was, and suddenly had “better” things to do

wclr · April 3, 2021, 5:01pm

If I manage to follow you, this is actually quite a different thing from what I’m talking about.

I propose not to introduce any breaking change to the module system (because it is beautiful) or to existing code (because it is nice). If I understand you correctly, you are talking about just renaming current packages and changing the importing style:

import strings.Char (...) as Char -- this comes from package strings
import strings as String -- this comes from package strings

I don’t see here the idea of re-thinking and restructuring packages, etc., which is what I’m talking about:

import Data.Char.Gen (...) as Gen -- this comes from package Data.Char
import Data.String.CodeUnits as S -- this comes from package Data.String

Data.Char and Data.String here are different packages. Why are they different packages? Because they represent different domains of functionality. Well, maybe the Data.Char module is too small, and not important and independent enough to have its own domain and package; then it should just live inside a more powerful domain, for example as Data.String.Char, and go along within the Data.String package.

That is the very simple and logical model that I proposed. And this is actually the main, and the only, thing worth discussing in the first place; everything else in this long-read document is a logical consequence of, and justification for, this changed approach.

wclr · April 3, 2021, 5:04pm

This can hardly add anything useful to the discussion. Anyway, if you (and other busy readers) care about the topic, in a couple of my last replies in this thread I summarized and emphasized the main idea worth discussing very clearly, I believe, in a few words.

wclr · April 4, 2021, 7:34am

Thought about it more. I believe I follow you, and we are definitely talking about similar things here, but we are approaching them from different perspectives. You also want to simplify and reduce some existing module namespaces, but you also want to significantly change the semantics of imports and of the module system:

import strings.Char (...) as Char -- this comes from package strings
import strings as S -- this comes from package strings

This would be equivalent to my proposal if the namespace Data.String could be replaced with just String (I would be fine with that, because Data.String seems a little bit excessive, at least for starters, though I am still thinking about what it buys us as a notion of the concrete functional domain):

import String.Char (...) as Char -- this comes from package String
import String.CodeUnits as S -- this comes from package String

The difference between those two may seem to be kind of cosmetic, but I believe it is much deeper and more important. I’ll try to explain why.

If we look at import strings.Char (...) as Char, there are two kinds of semantics here, strings and Char (if you imagine a name like strings-extra this becomes more obvious), and this definitely doesn’t seem to be a simplification. The existing semantics of imports and module names are very simple and expressive. I see this complication (even if it is just visual) as a clear sign that something is wrong here: we invented a crutch to fix an issue we were unable to deal with elegantly. Why do we want to couple and mix the module system with package names that have a different form and semantics? I have a strong feeling that this should be avoided.

How are we going to declare the module name in this case: module strings.Char where, or would it be module Char where inside the strings package, which is then imported with import strings.Char? Something is clearly wrong and mixed up here; at the very least there is a strange (semantic) smell.

But the problem is not purely visual. Haskell-style module names are supposed to express a functional domain using very clear, strict, and concise semantics. What we can do to solve the problem is not to break and ruin that but, on the contrary, to explicitly lift this functional domain to the level of a package. This would be a pure and functional solution, not something made up that breaks the simplicity and elegance of the module system and even of the language itself.

So I think PureScript has a fair advantage in this realm over many other languages and their packaging ecosystems - a great and simple module system, abstracted from the file and packaging systems - and that is a good thing that should not be hidden but, on the contrary, lifted and exposed so that its simplicity propagates to the upper levels.

I’ve looked at the code of the registry CI, and first of all I may say that I’m ready to participate if needed. It is not rocket science, and if you say that it is hard to change the kind of package names the registry handles, or even to make something more sophisticated like adding a new repo entity (e.g. package bundles), I would say that the registry is lacking a good architecture, and this is not a good sign. But this can be changed if there is an understanding and a desire to make things better, not just to follow the track already taken, even if it may be wrong.

I think I have good answers for concrete questions on the design side, e.g. what we should do with case-dependent names and user scopes, how to avoid potential “inconveniences” while getting the advantages and opportunities it gives, and why there is a risk of getting a registry system that is designed to be simpler than it potentially needs to be in order to be great.

I also still assume that I may be kind of wrong with this idea of lifting the functional domain and module namespace to the package level. So I would like to hear why it may be improper. And why it is better to stick with the current approach to package names that has no strong relations with included modules and why for achieving this it is even worth breaking the current semantics of the module system.

f-f · April 4, 2021, 8:55am

Would you like to elaborate more on this? What’s lacking exactly and how? Have you looked at the ongoing work?

hdgarrood · April 4, 2021, 12:41pm

When I say it will be hard to change, I don’t mean it will be hard to change the code in the registry and pursuit etc to support your proposed package name scheme, I mean it will be hard to avoid irritating large numbers of PureScript users when they have to learn a new way of referring to packages, and they have to update all of their package manifest files and other places they are referring to packages for what appears to them to be a cosmetic change. I understand that you don’t feel that this is a cosmetic change but I think we would struggle to persuade the wider community.

joneshf · April 4, 2021, 3:29pm

This kind of discussion is what I hoped would come out of that comment. I’m glad it helped here and thank you for moving the discussion forward.

If this were a year ago, I’d have more motivation to be involved. These days I don’t have the mental fortitude to take part in the discussion.

One thing I’ll throw over the wall (in case you hadn’t considered it) is that npm 7 supports strict peer dependencies. If you’re unfamiliar with peer dependencies, they behave like normal dependencies in all other ecosystems–including dependencies in the PureScript ecosystem. Peer dependencies weren’t a viable option before npm 7 because there was no way to require a single set of versions for everything. That’s now the default behavior.

I’m lobbing this over the wall because:

I don’t have the motivation to present a complete argument with ups, downs, and caveats. I’m not saying we should definitely do this. I don’t even know if this would really work.
It might still not actually work out in practice, but it makes sense theoretically. I plan to explore this idea at some point, but I’ve more important things to deal with at the moment.
It seems to address some of your points above:
- User namespaces can be addressed by scopes (as you pointed out above).
- Package sets seem like they could be implemented by depending on the package set package at a specific version and all of the other packages at any version:
```
{
    "peerDependencies": {
        "@purescript/package-set": "psc-0.14.0-20210402",
        "@purescript/prelude": "*",
        "@purescript/record": "*"
    }
}
```
  Or something like that. I mean, that’s basically how the package sets currently work.
- Registry effort can be abandoned since npm exists. People can either redirect their energy toward something else, or get back some of their life.

Anyway, like mentioned I’m not trying to present a full-fledged argument, but something to chew on. And I’m probably glossing over some intricacies. But the point is more that the state of the world has changed, and there might be more options that we once thought couldn’t be used.

FWIW, I think the PackageImports extension is one of the best extensions GHC has. I know it was made to address conflicting module names, but its knock-on effects are better than its original intention. In an ecosystem where it’s incredibly hard to find the package a module comes from, it makes understanding a codebase way easier.
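For readers unfamiliar with the extension, here is a minimal Haskell sketch of what a package-qualified import looks like. It only uses modules from the base package so that it compiles with a plain GHC install; the helper name `shout` is made up for illustration and is not taken from the thread.

```haskell
{-# LANGUAGE PackageImports #-}
module Main where

-- The string literal names the package each module is taken from,
-- so a reader can see the origin of every import at a glance.
import "base" Data.Char (toUpper)
import qualified "base" Data.List as List

-- Hypothetical helper: join the words with ", " and upper-case the result.
shout :: [String] -> String
shout = map toUpper . List.intercalate ", "

main :: IO ()
main = putStrLn (shout ["data", "char"])
```

If two installed packages exposed a module with the same name, the package string is what would tell GHC which one is meant.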

In any case, thanks for continuing the discussion and best of luck!

joneshf · April 4, 2021, 3:39pm

Sorry, one more thing. It might be useful to re-emphasize this point:

That mental change is going to provide the most benefit no matter how modules or the compiler change. That mental change can happen today without any changes to compiler, modules, packages, or anything else. But if we can’t make that mental change in the current world, I doubt any technical change is going to help.

wclr · April 4, 2021, 3:54pm

I said that if it were hard to add or change such features in the current implementation, I would say that it is lacking good architecture. Maybe I’ve misinterpreted what @hdgarrood meant about the difficulty of the possible and necessary changes. So I didn’t mean that the current architecture is no good or anything like that. A good architecture is one that helps to defer important design decisions and, obviously, allows flexible changes. So, as the changes that could follow from this proposal are not actually huge or difficult, I can claim that a good architecture should support them with ease.

And in any case, even if you didn’t plan to include, for example, a user namespaces feature, it would be more correct to design and architect the system so that it could be introduced at some point; this would be a better architecture. But I’m not claiming that the current one has no such capabilities, even if the authors didn’t anticipate them (this is often the case in good architectures).

garyb · April 5, 2021, 12:37am

I’m really not following this example about Data.String, Strings, Char, and this “Chart” module (is that just a Char typo?). I’ve tried reading both the posts on it several times but I don’t understand what is being suggested/argued against.

Could you try and phrase it another way or concoct another example perhaps?