Should removeDuplicates from Chapter 3 have arguments for Addressentries when called

nospace · December 15, 2020, 4:13pm

I’m currently stuck here https://book.purescript.org/chapter3.html at the last question to remove duplicates from an address book using nubBy. I tried to search for examples where where nubBy is used but I mostly find it with results for solving this chapter and nothing else. I didn’t want to have the answer directly but skimming though, there are two arguments passed within that function.

The current setup I have for the function is this:

removeDuplicates :: AddressBook -> AddressBook
removeDuplicates = head $ nubBy $ filter filterEntry
  where
  filterEntry :: Entry -> Entry -> Boolean
  filterEntry e1 e2 = e1.firstName == e2.firstName
    && e1.lastName == e2.lastName

which resulted in this error message:

$ spago test
Compiling Data.AddressBook
Error found:
in module Data.AddressBook
at src/Data/AddressBook.purs:54:42 - 54:53 (line 54, column 42 - line 54, column 53)

  Could not match type
                                   
    { address :: { city :: String  
                 , state :: String 
                 , street :: String
                 }                 
    , firstName :: String          
    , lastName :: String           
    }                              
    -> Boolean                     
                                   
  with type
           
    Boolean
           

while checking that type { address :: { city :: String     
                                      , state :: String    
                                      , street :: String   
                                      }                    
                         , firstName :: String             
                         , lastName :: String              
                         }                                 
                         -> { address :: { city :: String  
                                         , state :: String 
                                         , street :: String
                                         }                 
                            , firstName :: String          
                            , lastName :: String           
                            }                              
                            -> Boolean                     
  is at least as general as type t0 -> Boolean
while checking that expression filterEntry
  has type t0 -> Boolean
in value declaration removeDuplicates

where t0 is an unknown type

See https://github.com/purescript/documentation/blob/master/errors/TypesDoNotUnify.md for more information,
or to contribute content related to this error.


[error] Failed to build.

What other arguments should be placed into the function?

ntwilson · December 15, 2020, 4:40pm

You’re actually jumping the gun a little bit. The error you’re seeing is about your use of filter, and not nubBy. filter says it has signature:
filter :: forall a. (a -> Boolean) -> Array a -> Array a
So it’s expecting 2 arguments: first a function of one argument returning a boolean, and then an array. It returns an array.
You’re passing in 1 argument, which is a function of 2 arguments! That error message is saying it can’t use your filterEntry function as an input to filter because it takes in an extra argument.

Then you have the issue that filter is expecting 2 arguments. Now it looks like you’re trying to write this as a point-free function, where you define removeDuplicates as the composition of other functions rather than explicitly making it a function where you provide the arguments. But that operator you’re using there ($) isn’t the composition operator. You’d have to switch to using <<<.
Saying

removeDuplicates = head <<< nubBy <<< filter filterEntry

is the same thing as saying

removeDuplicates allEntries = head $ nubBy $ filter filterEntry allEntries

where the first is built through composition, never explicitly listing the allEntries argument, while the second is “normal” syntax.

Finally, nubBy says it’s signature is:
nubBy :: forall a. (a -> a -> Ordering) -> Array a -> Array a
So it takes 2 arguments. The first is a function of 2 arguments that returns an Ordering, and the 2nd is the array you’re going to eliminate the duplicates from.
Your filterEntry function almost matches that first argument a -> a -> Ordering, it just returns a Boolean instead of Ordering. Note there’s also a nubByEq function where your filterEntry function would match perfectly, though it says it doesn’t perform quite as well as nubBy.

Does that help?

milesfrain · December 16, 2020, 1:37am

Edit: Data.List and that exercise in Ch3 of the book, have been updated to use nubByEq instead of nubBy.

The following text (signatures, etc.) is now out of date.

Sorry about the less-than-ideal docs for Data.List.nubBy. It turns out that only Array’s nubBy has this helpful example:

nubBy compare [1, 3, 4, 2, 2, 1] == [1, 3, 4, 2]

After 0.14 is released, I’d like to go through all the purescript libs and ensure we have examples for everything.

Anyway, Data.List.nubBy actually has this signature:

nubBy :: forall a. (a -> a -> Boolean) -> List a -> List a

and so the first parameter (a -> a -> Boolean) is compatible with your filterEntry :: Entry -> Entry -> Boolean function.

Not for List. But this raises a good point about a potential source of confusion. List.nubBy and Array.nubByEq share the same predicate signature and poor performance (?), but have different names:

name	predicate	runtime
`List.nubBy`	`a -> a -> Boolean`	`O(n^2)`
`Array.nubByEq`	`a -> a -> Boolean`	`O(n^2)` ?
`Array.nubBy`	`a -> a -> Ordering`	`O(n log n)` ?

I think we should make these use the same name by either:

For List, rename nubBy to nubByEq
For Array, rename:
- nubByEq to nubBy
- nubBy to nubByCmp

Option 1 is the most deprecation-warning-friendly.

hdgarrood · December 16, 2020, 4:47am

Please don’t apologise for missing docs. It suggests some kind of failing on the part of the library author, which is not the case. (Obviously it would be better if everything were more clearly documented with examples, and I appreciate your work towards this goal.)

Anyway, thanks for flagging that inconsistency. We should change List to follow what Array does, because the default way of using nub should be via Ord, not Eq, because if you do it via Eq the performance is terrible, as you note.

nospace · December 20, 2020, 3:41pm

Just to be sure of another thing, does point-free function only not work when using the ‘$’ operator?

milesfrain · December 20, 2020, 10:30pm

Sometimes you can follow that rule of thumb, but I think a better strategy is to note that $ lets you eliminate parenthesis, while <<< “composes” functions of one argument together.

For example, all these foo functions are equivalent, and agree with your point-free claim.

foo1 :: Int -> Int
foo1 x = mul 10 (add 4 x)

foo2 :: Int -> Int
foo2 x = mul 10 $ add 4 x

foo3 :: Int -> Int
foo3 = mul 10 <<< add 4

But these bar functions are a counter-example of your claim that point-free is incompatible with $.

bar1 :: List Int -> List Int
bar1 xs = map (add 4) xs

bar2 :: List Int -> List Int
bar2 = map (add 4)

bar3 :: List Int -> List Int
bar3 = map $ add 4

The way I remember $ is to see it as being drawn as two crossed-out parens, so I think “replaces parens”.

nospace · December 22, 2020, 5:13pm

I did change the function composition to

removeDuplicates = head <<< nubBy <<< filter filterEntry

now, but I still get the exact same error as before.

It does exactly still mention t0 as variable and the iteration might still be off. Is the finding of duplicate entries still correct?

milesfrain · December 22, 2020, 6:38pm

There’s some extra stuff in your version that you don’t need. Let’s take a step back and look at the types.

You want something with this signature:

removeDuplicates :: AddressBook -> AddressBook

I wonder if part of the confusion is due to the type alias:

type AddressBook = List Entry

So you can also consider it with that type substituted, which might be clearer:

removeDuplicates :: List Entry -> List Entry

Now one of the building-blocks you have to work with is:

nubBy :: (a -> a -> Boolean) -> List a -> List a

If you let a be your Entry record, then you have something really close to removeDuplicates:

nubBy :: (Entry -> Entry -> Boolean) -> List Entry -> List Entry

It’s just missing something for (Entry -> Entry -> Boolean). Can you identify the function that you already wrote that has this signature? If you plug that into nubBy as the first argument, you’ll end up with List Entry -> List Entry (a.k.a. AddressBook -> AddressBook), which is exactly what you’re trying to create for removeDuplicates.

I’m now seeing that the book doesn’t do a great job explaining curried functions and partial application. We should improve that section, since those concepts are pretty important for solving this exercise.

nospace · December 23, 2020, 5:14pm

The closest thing I have to Entry -> Entry -> Boolean would be filterEntry declared in isInBook and findEntryByStreet. But it has Entry -> Boolean instead.

There have to be two address entries present at the argument which gives a boolean depending on condition. I implicitly pass the addressbook (List Entry) into the filterEntry which appears to iterate over the List of entries as seen in the other functions.

I tried to remove filterEntry and filter entirely, but both with the error that t0 and t1 have an undefined type.

Did I miss something with retrieving the differences in the Addressbook?

milesfrain · December 23, 2020, 5:38pm

Yep. That’s correct.
So you just need to pass that function into nubBy.

Note that functions can take other functions as arguments. That’s what’s going on here. nubBy needs to be given another function that tells it how to determine if two elements are equal.

This idea of passing functions around as values is available in most other languages, but might not be as common of a pattern. See the table in the First-class functions wiki page. If you share details about what other languages you have a background in, we can make an equivalent to nubBy and filterEntry in that language, which may make things clearer.

nospace · January 5, 2021, 6:59pm

I am familar with python to some extent.

So I guess filterEntry gets passed to nubBy

milesfrain · January 5, 2021, 7:36pm

Yes, exactly. Kinda like how you can pass a custom key function as an argument to Python’s sort function.

nospace · January 9, 2021, 5:04pm

What I can roughly get with the python example next to it, is that it should be without filter, but instead like this?

removeDuplicates = head <<< nubBy <<< filterEntry

but at the and an AddressBook should be returned, unlike a search result

nospace · January 9, 2021, 5:13pm

Maybe the point is to pass it to insertEntry after nubBy, but Insertentry also needs another empty AddressBook to store deduplicated entries first?

milesfrain · January 9, 2021, 8:13pm

You’re very close! Two things to change:

You don’t need the function composition operator <<< to pass filterEntry to nubBy.
You want to return the entire List Entry (aka AddressBook), but head just grabs a single Entry (technically Maybe Entry) from that list.

systems · January 11, 2021, 4:04am

Here is my solution that does not use nubBy
and my solution uses the showEntry for comparison which means it will show all addresses

removeDups :: AddressBook -> AddressBook
removeDups Nil = emptyBook
removeDups book  =
  Cons bookHead ( removeDups (filter filterHead $ fromMaybe emptyBook $ getBookTail book))
  where 
    bookHead = fromMaybe emptyEntry $ getEntry book
    filterHead bh = ((showEntry bookHead) /= (showEntry bh))

else where i define getEntry , emptyEntry and getBookTail as

getEntry :: AddressBook -> Maybe Entry
getEntry = head 

emptyEntry :: Entry
emptyEntry = { firstName: "", lastName: "", address: { street: "", city: "", state: "" }}

getBookTail :: AddressBook -> Maybe AddressBook
getBookTail = tail

I have a question about nubBy, how does it iterate over every possible combination in the list , it is passed a function that takes two entry and compare them, but I dont know how does iterate over every possible combination, and I wish this part was clarified in the doc (not how it does it, but that it does it, that does it check every possible combination)

afc · January 11, 2021, 8:23am

I’m sure you know this but in case you don’t, for any of the libraries that are in Pursuit you can answer the question of “how is this implemented?” by clicking on “Source”…

(i mean obviously you can look at the source of any public lib but i mean, it’s an easy and very worthwhile step to take)

In this case i believe the answer to your question is that you will see that the implementation sorts the array first, guaranteeing that the duplicates are clustered for “nubbing”.

milesfrain · January 11, 2021, 9:05am

True for arrays, but not yet true for lists (which is what this exercise involves). Will be soon though (PR). When that happens, this problem will require using nubByEq instead.

nospace · January 13, 2021, 2:49pm

Considering those points and the other previous function that outputs an AddressBook, I changed it to this

removeDuplicates = Cons <<< nubBy filterEntry

milesfrain · January 13, 2021, 7:11pm

Does your version with Cons compile? The solution should just be:

removeDuplicates = nubBy filterEntry

Or if you’d rather not write it in point-free form:

removeDuplicates book = nubBy filterEntry book