I’m currently stuck here https://book.purescript.org/chapter3.html at the last question to remove duplicates from an address book using nubBy. I tried to search for examples where where nubBy is used but I mostly find it with results for solving this chapter and nothing else. I didn’t want to have the answer directly but skimming though, there are two arguments passed within that function.
The current setup I have for the function is this:
$ spago test
Compiling Data.AddressBook
Error found:
in module Data.AddressBook
at src/Data/AddressBook.purs:54:42 - 54:53 (line 54, column 42 - line 54, column 53)
Could not match type
{ address :: { city :: String
, state :: String
, street :: String
}
, firstName :: String
, lastName :: String
}
-> Boolean
with type
Boolean
while checking that type { address :: { city :: String
, state :: String
, street :: String
}
, firstName :: String
, lastName :: String
}
-> { address :: { city :: String
, state :: String
, street :: String
}
, firstName :: String
, lastName :: String
}
-> Boolean
is at least as general as type t0 -> Boolean
while checking that expression filterEntry
has type t0 -> Boolean
in value declaration removeDuplicates
where t0 is an unknown type
See https://github.com/purescript/documentation/blob/master/errors/TypesDoNotUnify.md for more information,
or to contribute content related to this error.
[error] Failed to build.
What other arguments should be placed into the function?
You’re actually jumping the gun a little bit. The error you’re seeing is about your use of filter, and not nubBy. filter says it has signature: filter :: forall a. (a -> Boolean) -> Array a -> Array a
So it’s expecting 2 arguments: first a function of one argument returning a boolean, and then an array. It returns an array.
You’re passing in 1 argument, which is a function of 2 arguments! That error message is saying it can’t use your filterEntry function as an input to filter because it takes in an extra argument.
Then you have the issue that filter is expecting 2 arguments. Now it looks like you’re trying to write this as a point-free function, where you define removeDuplicates as the composition of other functions rather than explicitly making it a function where you provide the arguments. But that operator you’re using there ($) isn’t the composition operator. You’d have to switch to using <<<.
Saying
removeDuplicates = head <<< nubBy <<< filter filterEntry
is the same thing as saying
removeDuplicates allEntries = head $ nubBy $ filter filterEntry allEntries
where the first is built through composition, never explicitly listing the allEntries argument, while the second is “normal” syntax.
Finally, nubBy says it’s signature is: nubBy :: forall a. (a -> a -> Ordering) -> Array a -> Array a
So it takes 2 arguments. The first is a function of 2 arguments that returns an Ordering, and the 2nd is the array you’re going to eliminate the duplicates from.
Your filterEntry function almost matches that first argument a -> a -> Ordering, it just returns a Boolean instead of Ordering. Note there’s also a nubByEq function where your filterEntry function would match perfectly, though it says it doesn’t perform quite as well as nubBy.
nubBy :: forall a. (a -> a -> Boolean) -> List a -> List a
and so the first parameter (a -> a -> Boolean) is compatible with your filterEntry :: Entry -> Entry -> Boolean function.
Not for List. But this raises a good point about a potential source of confusion. List.nubBy and Array.nubByEq share the same predicate signature and poor performance (?), but have different names:
name
predicate
runtime
List.nubBy
a -> a -> Boolean
O(n^2)
Array.nubByEq
a -> a -> Boolean
O(n^2) ?
Array.nubBy
a -> a -> Ordering
O(n log n) ?
I think we should make these use the same name by either:
For List, rename nubBy to nubByEq
For Array, rename:
nubByEq to nubBy
nubBy to nubByCmp
Option 1 is the most deprecation-warning-friendly.
Please don’t apologise for missing docs. It suggests some kind of failing on the part of the library author, which is not the case. (Obviously it would be better if everything were more clearly documented with examples, and I appreciate your work towards this goal.)
Anyway, thanks for flagging that inconsistency. We should change List to follow what Array does, because the default way of using nub should be via Ord, not Eq, because if you do it via Eq the performance is terrible, as you note.
Sometimes you can follow that rule of thumb, but I think a better strategy is to note that $ lets you eliminate parenthesis, while <<< “composes” functions of one argument together.
For example, all these foo functions are equivalent, and agree with your point-free claim.
foo1 :: Int -> Int
foo1 x = mul 10 (add 4 x)
foo2 :: Int -> Int
foo2 x = mul 10 $ add 4 x
foo3 :: Int -> Int
foo3 = mul 10 <<< add 4
But these bar functions are a counter-example of your claim that point-free is incompatible with $.
bar1 :: List Int -> List Int
bar1 xs = map (add 4) xs
bar2 :: List Int -> List Int
bar2 = map (add 4)
bar3 :: List Int -> List Int
bar3 = map $ add 4
The way I remember $ is to see it as being drawn as two crossed-out parens, so I think “replaces parens”.
There’s some extra stuff in your version that you don’t need. Let’s take a step back and look at the types.
You want something with this signature:
removeDuplicates :: AddressBook -> AddressBook
I wonder if part of the confusion is due to the type alias:
type AddressBook = List Entry
So you can also consider it with that type substituted, which might be clearer:
removeDuplicates :: List Entry -> List Entry
Now one of the building-blocks you have to work with is:
nubBy :: (a -> a -> Boolean) -> List a -> List a
If you let a be your Entry record, then you have something really close to removeDuplicates:
nubBy :: (Entry -> Entry -> Boolean) -> List Entry -> List Entry
It’s just missing something for (Entry -> Entry -> Boolean). Can you identify the function that you already wrote that has this signature? If you plug that into nubBy as the first argument, you’ll end up with List Entry -> List Entry (a.k.a. AddressBook -> AddressBook), which is exactly what you’re trying to create for removeDuplicates.
I’m now seeing that the book doesn’t do a great job explaining curried functions and partial application. We should improve that section, since those concepts are pretty important for solving this exercise.
The closest thing I have to Entry -> Entry -> Boolean would be filterEntry declared in isInBook and findEntryByStreet. But it has Entry -> Boolean instead.
There have to be two address entries present at the argument which gives a boolean depending on condition. I implicitly pass the addressbook (List Entry) into the filterEntry which appears to iterate over the List of entries as seen in the other functions.
I tried to remove filterEntry and filter entirely, but both with the error that t0 and t1 have an undefined type.
Did I miss something with retrieving the differences in the Addressbook?
Yep. That’s correct.
So you just need to pass that function into nubBy.
Note that functions can take other functions as arguments. That’s what’s going on here. nubBy needs to be given another function that tells it how to determine if two elements are equal.
This idea of passing functions around as values is available in most other languages, but might not be as common of a pattern. See the table in the First-class functions wiki page. If you share details about what other languages you have a background in, we can make an equivalent to nubBy and filterEntry in that language, which may make things clearer.
I have a question about nubBy, how does it iterate over every possible combination in the list , it is passed a function that takes two entry and compare them, but I dont know how does iterate over every possible combination, and I wish this part was clarified in the doc (not how it does it, but that it does it, that does it check every possible combination)
I’m sure you know this but in case you don’t, for any of the libraries that are in Pursuit you can answer the question of “how is this implemented?” by clicking on “Source”…
(i mean obviously you can look at the source of any public lib but i mean, it’s an easy and very worthwhile step to take)
In this case i believe the answer to your question is that you will see that the implementation sorts the array first, guaranteeing that the duplicates are clustered for “nubbing”.
True for arrays, but not yet true for lists (which is what this exercise involves). Will be soon though (PR). When that happens, this problem will require using nubByEq instead.