Peculiar indentation rules for let in do block?

Yesterday I was struggling with some code in a do-block where the type checker kept complaining about Unexpected or mismatched indentation. I was staring myself blind at the problem and eventually just pulled the code out into its own function instead, and things worked fine…

Today I ran into a similar problem again, and randomly I decided to give the code a little more indentation, and then the type checker magically stopped complaining.

A simplified silly example to show the problem:

fourspaces :: Effect Unit
fourspaces = do
  let stuff =
      [40,41,42,43,44] -- this line is indented with four spaces
        # Array.filter (\x -> x >= 42)
        <#>  (\x -> x + 1)
  log $ show stuff

The type checker reports

  Unable to parse module:
  Unexpected or mismatched indentation

Same example, but with 6 spaces:

sixspaces :: Effect Unit
sixspaces = do
  let stuff =
        [40,41,42,43,44] -- this line is indented with six spaces
          # Array.filter (\x -> x >= 42)
          <#>  (\x -> x + 1)
  log $ show stuff

The type checker has no problems with the version with 6 spaces…

Is this a bug? Or is there some logic to why only the six space indentation works?
It's certainly a bit confusing to a beginner :sweat_smile:


The entire expression bound to bindingName must be indented to the right of the column in which bindingName first appears. So, everything must be to the right of the | symbol below. x represents an invalid indentation, whereas ✓ represents a valid position:

foo = do
  let bindingName =
x     |
 x    |
  x   |  
   x  |  
    x |  
     x|  
      |
      |✓
      | ✓

People usually newline-then-indent the reference name for expressions that span multiple lines because it reduces the amount of indentation needed for the expression (5-6 spaces below vs 7-8 spaces above):

foo = do
  let
    refName =
x   |
 x  |
  x |
   x|
    |
    |✓
    | ✓
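
Applied to the example from the question, that layout might look like the sketch below. The module header and imports are assumptions about the surrounding file; the binding itself is the one from the original post.

```purescript
module Main where

import Prelude

import Data.Array as Array
import Effect (Effect)
import Effect.Console (log)

main :: Effect Unit
main = do
  let
    -- `stuff` starts in column 4, so its expression only needs
    -- to start somewhere to the right of that.
    stuff =
      [ 40, 41, 42, 43, 44 ]
        # Array.filter (\x -> x >= 42)
        <#> (\x -> x + 1)
  log $ show stuff
```

Every line of the expression is indented further than the name stuff, so this should parse, and running it logs [43,44,45].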

To elaborate a bit on why that’s the case: it’s because PureScript has a concept of binding groups. Multiple bindings under a single let form a single binding group, where each binding in the group is in scope within the entire group.

example = do
  let
    f = ...
    g = ...
  ...

So f and g are in scope within the definitions of f and g. However,

example = do
  let f = ...
  let g = ...
  ...

These are separate binding groups, where f is in scope for g, but g is not in scope for f. This is in contrast to something like JavaScript, which treats an entire block as a single binding group (for let and const), or a function body as a binding group (for var).

function example () {
  let f = ...;
  let g = ...;
}

Here f and g are similar to the first example, as they are both in scope for both bindings, though they may still be undefined depending on when they are accessed.
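
Back on the PureScript side, a small sketch of what a single binding group allows. The names isEven and isOdd are made up for illustration, and the functions assume a non-negative argument.

```purescript
module Main where

import Prelude

import Effect (Effect)
import Effect.Console (log)

main :: Effect Unit
main = do
  let
    -- One `let`, one binding group: each function can refer to the other.
    isEven :: Int -> Boolean
    isEven n = if n == 0 then true else isOdd (n - 1)

    isOdd :: Int -> Boolean
    isOdd n = if n == 0 then false else isEven (n - 1)
  log $ show (isEven 10)
```

This logs true. If the two functions were split over two separate let lines, isOdd would not be in scope inside isEven, so I would expect the compiler to reject it as an unknown value.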


Thank you, that makes sense!
Hope authors of educational material remember to mention it explicitly. I did see code with lots of indentation after let bindings in the book I’m reading but never really thought about it and just assumed any indentation would do fine :slight_smile:


I believe this is mentioned in the documentation repo

Am I correct in saying that this rule is necessary to disambiguate the grammar? Seems like otherwise the case

let
  x = e
    A y = ee

is ambiguous because it could either be x = e A; y = ee or x = e; A y = ee.

Are there other ambiguities avoided by this parser rule?

(Just curious, and it looks like the linked documentation doesn’t already mention it.)

Yes, it avoids ambiguity in the grammar. All whitespace-sensitive languages have the same basic rule (which is generally referred to as the offside rule). At an implementation level, the lexer scans the input token stream, inserting implicit delimiters for whitespace (we call this layout). You can see some of the layout golden tests, which make these implicit delimiters visible by printing them as braces and semicolons.
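
To make the idea of implicit delimiters a bit more concrete, here is a rough sketch. The code at the top is ordinary PureScript; the commented part at the bottom is only a hand-written schematic of where the braces and semicolons conceptually go. It is not valid PureScript source and not copied from the golden tests.

```purescript
module Main where

import Prelude

import Effect (Effect)
import Effect.Console (log)

-- Written by hand, relying on layout:
main :: Effect Unit
main = do
  let
    x = 1
    y = 2
  log $ show (x + y)

-- Schematic view with the implicit delimiters written out
-- (illustration only, not real compiler output):
--
--   main = do {
--     let {
--       x = 1;
--       y = 2 };
--     log $ show (x + y) }
--
-- A line starting in the same column as the first token of a block
-- begins a new item (a semicolon); a line starting further left
-- closes the block (a closing brace).
```

Read this way, the four-space version from the question fails because the line with the array literal starts in the same column as stuff, so layout treats it as the start of a new binding rather than as a continuation of the expression.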
