You could argue that JS is correct here, since the Char type is defined to be
Right now the C, Erlang/Elixir, Scheme, and seemingly the C++/Go backends all use UTF-8-encoded strings. Even though that’s not according to the spec, I think it’s the right thing to do, and that we should change the spec.
One could argue that there are lots of subtle places where this will break existing code, and that’s a valid point. But in the current state there’s also a lot of code that is already subtly broken because of this, so I think it’s better to change this sooner rather than later.
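To make the "already subtly broken" point concrete, here is a minimal TypeScript sketch of what happens on a JS runtime, where string length and slicing count UTF-16 code units. The helper names (`isSurrogatePairAt`, `codePointLength`, `takeCodePoints`) are made up for illustration and are not taken from any PureScript library; only standard JS string methods are used.

```typescript
// "a", U+1F600 (written as its UTF-16 surrogate pair), "b"
const sample = "a\uD83D\uDE00b";

// True if the code unit at index i is a high surrogate followed by a low surrogate.
function isSurrogatePairAt(str: string, i: number): boolean {
  const hi = str.charCodeAt(i);
  const lo = str.charCodeAt(i + 1); // NaN when i + 1 is out of range
  return hi >= 0xd800 && hi <= 0xdbff && lo >= 0xdc00 && lo <= 0xdfff;
}

// Counts code points by walking the string and stepping over surrogate pairs.
function codePointLength(str: string): number {
  let count = 0;
  for (let i = 0; i < str.length; i += isSurrogatePairAt(str, i) ? 2 : 1) {
    count++;
  }
  return count;
}

// Takes the first n code points without cutting a surrogate pair in half.
function takeCodePoints(n: number, str: string): string {
  let i = 0;
  for (let taken = 0; taken < n && i < str.length; taken++) {
    i += isSurrogatePairAt(str, i) ? 2 : 1;
  }
  return str.slice(0, i);
}

// Code-unit view: 4 units, and slicing at 2 leaves a lone high surrogate.
const brokenPrefix = sample.slice(0, 2);
// Code-point view: 3 code points, and the prefix keeps the whole pair.
const wholePrefix = takeCodePoints(2, sample);
```

Here `sample.length` is 4 while `codePointLength(sample)` is 3, and `brokenPrefix` ends in an unpaired surrogate, which is the kind of bug that only shows up once someone passes in a character outside the Basic Multilingual Plane.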
There’s also a valid argument about performance if you’re writing performance-critical code: slicing or indexing into a string wouldn’t be as efficient anymore. I’m fairly confident that this is not an issue for most code, but I’m sure it will affect someone.
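As a rough illustration of where that cost comes from, here is a TypeScript sketch of finding the n-th code point in a UTF-8 byte buffer. It assumes well-formed UTF-8 and uses a hypothetical helper name (`byteOffsetOfCodePoint`); it is not how any particular backend implements indexing, just the general shape of the problem: without extra bookkeeping, you have to scan from the start.

```typescript
// UTF-8 bytes for "a", U+1F600, "b": 1 byte + 4 bytes + 1 byte.
const utf8Sample = new Uint8Array([0x61, 0xf0, 0x9f, 0x98, 0x80, 0x62]);

// Byte offset where the code point with the given index starts,
// or -1 if the index is out of range. Assumes well-formed UTF-8.
function byteOffsetOfCodePoint(bytes: Uint8Array, index: number): number {
  let seen = 0;
  for (let offset = 0; offset < bytes.length; offset++) {
    // Continuation bytes look like 10xxxxxx; every other byte starts a code point.
    const startsCodePoint = (bytes[offset] & 0xc0) !== 0x80;
    if (startsCodePoint) {
      if (seen === index) return offset;
      seen++;
    }
  }
  return -1;
}
```

With `utf8Sample`, code point 2 (the `b`) starts at byte offset 5, and getting there means walking over every earlier byte. That linear scan is the trade-off compared with constant-time indexing by code unit.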
This was, afaict, originally suggested by Nate in purescript/purescript issue #3662, "Consider changing `Char` to represent a code point rather than a UTF-16 code unit", so do go there and read the feedback people have posted over the last 3 years.