Prefer do notation over Applicative operators when assembling records (2024)

Original link: https://haskellforall.com/2024/05/prefer-do-notation-over-applicative

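## Prefer `do` notation when assembling Haskell records

When constructing a record in Haskell, **`do` notation is usually preferable to the Applicative operators (`<$>`, `<*>`)**. This holds both for types that implement `Monad` (such as `IO`) and for types that implement only `Applicative` (such as the parsers from `optparse-applicative`); for an Applicative that is not a Monad, you need to enable the `ApplicativeDo` extension. `do` notation has several advantages: **better ergonomics**, which keeps the code concise and readable, especially when adding prompts or more complex logic; **order insensitivity**, meaning that reordering the record's fields does not break the code; and **better error messages** when a field is missing, which directly name the field that needs to be initialized. Applicative operators can lead to unwieldy multi-line expressions and obscure error messages about function types. The clearer structure of `do` notation simplifies debugging and is easier for beginners to understand. This advice applies only to records defined with record syntax; it does not apply to positional arguments. Even so, it is a valuable practice for building records in a maintainable and understandable way.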

Original text

This is a short post explaining why you should prefer do notation when assembling a record, instead of using Applicative operators (i.e. (<$>)/(<*>)). This advice applies both for type constructors that implement Monad (e.g. IO) and also for type constructors that implement Applicative but not Monad (e.g. the Parser type constructor from the optparse-applicative package). The only difference is that in the latter case you would need to enable the ApplicativeDo language extension.

The guidance is pretty simple. Instead of doing this:

data Person = Person
    { firstName :: String
    , lastName :: String
    }

getPerson :: IO Person
getPerson = Person <$> getLine <*> getLine

… you should do this:

{-# LANGUAGE RecordWildCards #-}

{-# OPTIONS_GHC -Werror=missing-fields #-}

data Person = Person
    { firstName :: String
    , lastName :: String
    }

getPerson :: IO Person
getPerson = do
    firstName <- getLine
    lastName <- getLine
    return Person{..}

Why is the latter version better? There are a few reasons.

Ergonomics

It’s more ergonomic to assemble a record using do notation because you’re less pressured to try to cram all the logic into a single expression.

For example, suppose we wanted to explicitly prompt the user to enter their first and last name. The typical way people would extend the former example using Applicative operators would be something like this:

getPerson :: IO Person
getPerson =
        Person
    <$> (putStrLn "Enter your first name:" *> getLine)
    <*> (putStrLn "Enter your last name:"  *> getLine)

The expression gets so large that you end up having to split it over multiple lines, but if we’re already splitting it over multiple lines then why not use do notation?

getPerson :: IO Person
getPerson = do
    putStrLn "Enter your first name:"
    firstName <- getLine

    putStrLn "Enter your last name:"
    lastName <- getLine

    return Person{..}

Wow, much clearer! Also, the version using do notation doesn’t require that the reader is familiar with all of the Applicative operators, so it’s more approachable to Haskell beginners.

Order insensitivity

Suppose we take that last example and then change the Person type to reorder the two fields:

data Person = Person
    { lastName :: String
    , firstName :: String
    }

… then the former version using Applicative operators would silently break: the first name and last name would now be read in the wrong order. The latter version (using do notation) is unaffected.

More generally, the approach using do notation never breaks or changes its behavior if you reorder the fields in the datatype definition. It’s completely order-insensitive.
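To make the difference concrete without reading from stdin, here is a minimal sketch that swaps `IO` for `Maybe`. The names `askFirstName`, `askLastName`, `viaOperators`, and `viaDo` are hypothetical stand-ins introduced only for this illustration; the `Person` type uses the reordered field order from above.

```haskell
{-# LANGUAGE RecordWildCards #-}

-- Fields are declared in the reordered order discussed in the text.
data Person = Person
    { lastName :: String
    , firstName :: String
    } deriving (Eq, Show)

-- Hypothetical stand-ins for the two getLine calls.
askFirstName :: Maybe String
askFirstName = Just "Ada"

askLastName :: Maybe String
askLastName = Just "Lovelace"

-- Positional: the first result fills the first declared field,
-- which is now lastName, so the two names end up swapped.
viaOperators :: Maybe Person
viaOperators = Person <$> askFirstName <*> askLastName

-- By name: RecordWildCards matches the bound variables to the fields,
-- so the declaration order does not matter.
viaDo :: Maybe Person
viaDo = do
    firstName <- askFirstName
    lastName <- askLastName
    return Person{..}

main :: IO ()
main = do
    print (fmap firstName viaOperators)  -- prints Just "Lovelace"
    print (fmap firstName viaDo)         -- prints Just "Ada"
```

Both definitions type-check, because both fields are `String`; only the do-notation version still puts each name in the right field.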

Better error messages

If you add a new argument to the Person constructor, like this:

data Person = Person
    { alive :: Bool
    , firstName :: String
    , lastName :: String
    }

… and you don’t make any other changes to the code then the former version will produce two error messages, neither of which is great:

Example.hs:
    • Couldn't match type ‘String -> Person’ with ‘Person’
      Expected: Bool -> String -> Person
        Actual: Bool -> String -> String -> Person
    • Probable cause: ‘Person’ is applied to too few arguments
      In the first argument of ‘(<$>)’, namely ‘Person’
      In the first argument of ‘(<*>)’, namely ‘Person <$> getLine’
      In the expression: Person <$> getLine <*> getLine
  |
  | getPerson = Person <$> getLine <*> getLine
  |             ^^^^^^

Example.hs:
    • Couldn't match type ‘[Char]’ with ‘Bool’
      Expected: IO Bool
        Actual: IO String
    • In the second argument of ‘(<$>)’, namely ‘getLine’
      In the first argument of ‘(<*>)’, namely ‘Person <$> getLine’
      In the expression: Person <$> getLine <*> getLine
  |
  | getPerson = Person <$> getLine <*> getLine
  |                        ^^^^^^^

… whereas the latter version produces a much more direct error message:

Example.hs:…
    • Fields of ‘Person’ not initialised:
        alive :: Bool
    • In the first argument of ‘return’, namely ‘Person {..}’
      In a stmt of a 'do' block: return Person {..}
      In the expression:
        do putStrLn "Enter your first name: "
           firstName <- getLine
           putStrLn "Enter your last name: "
           lastName <- getLine
           ....
   |
   |     return Person{..}
   |            ^^^^^^^^^^

… and that error message more clearly suggests to the developer what needs to be fixed: the alive field needs to be initialized. The developer doesn’t have to understand or reason about curried function types to fix things.
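One way to act on that error is to bind a variable named `alive` before the final `return`. The sketch below is only an illustration: the helper `getPersonFrom` (which takes the line-reading action as a parameter so it can be run without stdin) and the choice of `True` as the value are assumptions, not part of the original example, and the prompts are omitted for brevity.

```haskell
{-# LANGUAGE RecordWildCards #-}

{-# OPTIONS_GHC -Werror=missing-fields #-}

data Person = Person
    { alive :: Bool
    , firstName :: String
    , lastName :: String
    } deriving (Eq, Show)

-- Hypothetical helper: the line-reading action is a parameter so that the
-- sketch can be exercised with a pure Monad such as Maybe.
getPersonFrom :: Monad m => m String -> m Person
getPersonFrom readLine = do
    firstName <- readLine
    lastName <- readLine
    -- Assumption for illustration: every newly entered person is alive.
    let alive = True
    return Person{..}

-- The IO version from the text, expressed through the helper.
getPerson :: IO Person
getPerson = getPersonFrom getLine

main :: IO ()
main = print (fmap alive (getPersonFrom (Just "Ada")))  -- prints Just True
```

Once `alive` is in scope, `Person{..}` picks it up by name and the missing-field error goes away; no other line of the do block has to change.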

Caveats

This advice obviously only applies for datatypes that are defined using record syntax. The approach I’m advocating here doesn’t work at all for datatypes with positional arguments (or arbitrary functions).

However, this advice does still apply for type constructors that are Applicatives and not Monads; you just need to enable the ApplicativeDo language extension. For example, this means that you can use this same trick for defining command-line Parsers from the optparse-applicative package:

{-# LANGUAGE ApplicativeDo #-}
{-# LANGUAGE RecordWildCards #-}

{-# OPTIONS_GHC -Werror=missing-fields #-}

import Options.Applicative (Parser, ParserInfo)

import qualified Options.Applicative as Options

data Person = Person
    { firstName :: String
    , lastName :: String
    } deriving (Show)

parsePerson :: Parser Person
parsePerson = do
    firstName <- Options.strOption
        (   Options.long "first-name"
        <>  Options.help "Your first name"
        <>  Options.metavar "NAME"
        )

    lastName <- Options.strOption
        (   Options.long "last-name"
        <>  Options.help "Your last name"
        <>  Options.metavar "NAME"
        )

    return Person{..}

parserInfo :: ParserInfo Person
parserInfo =
    Options.info parsePerson
        (Options.progDesc "Parse and display a person's first and last name")

main :: IO ()
main = do
    person <- Options.execParser parserInfo

    print person
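The same pattern can be tried without installing optparse-applicative. The following is a minimal sketch under stated assumptions: `Validation`, `required`, and `validatePerson` are hypothetical names defined here for illustration, and `Validation` is a hand-rolled error-accumulating type that has an `Applicative` instance but deliberately no `Monad` instance.

```haskell
{-# LANGUAGE ApplicativeDo #-}
{-# LANGUAGE RecordWildCards #-}

{-# OPTIONS_GHC -Werror=missing-fields #-}

data Person = Person
    { firstName :: String
    , lastName :: String
    } deriving (Eq, Show)

-- Hypothetical Applicative that collects every error message.
-- It has no Monad instance, so the do block below relies on ApplicativeDo.
data Validation a = Failure [String] | Success a
    deriving (Eq, Show)

instance Functor Validation where
    fmap _ (Failure errors) = Failure errors
    fmap f (Success a) = Success (f a)

instance Applicative Validation where
    pure = Success
    Failure l <*> Failure r = Failure (l ++ r)
    Failure l <*> Success _ = Failure l
    Success _ <*> Failure r = Failure r
    Success f <*> Success a = Success (f a)

-- Hypothetical check: reject empty input, labelling the error message.
required :: String -> String -> Validation String
required label "" = Failure [label ++ " must not be empty"]
required _ value = Success value

validatePerson :: String -> String -> Validation Person
validatePerson rawFirst rawLast = do
    firstName <- required "first name" rawFirst
    lastName <- required "last name" rawLast
    return Person{..}

main :: IO ()
main = do
    print (validatePerson "Ada" "Lovelace")
    print (validatePerson "" "")
```

Because the two bindings are independent and the block ends in `return`, GHC can desugar it using only `fmap` and `<*>`, which is why both errors are reported for the second call rather than just the first.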