compiler/hsSyn/HsExpr.hs

Note [CmdSyntaxTable]

Used only for arrow-syntax stuff (HsCmdTop), the CmdSyntaxTable keeps track of the methods needed for a Cmd.

  • Before the renamer, this list is an empty list

  • After the renamer, it takes the form @[(std_name, HsVar actual_name)]@. For example, for the ‘arr’ method:

    • normal case: (GHC.Control.Arrow.arr, HsVar GHC.Control.Arrow.arr)
    • with rebindable syntax: (GHC.Control.Arrow.arr, arr_22)
      where @arr_22@ is whatever ‘arr’ is in scope
  • After the type checker, it takes the form [(std_name, <expression>)] where <expression> is the evidence for the method. This evidence is instantiated with the class, but is still polymorphic in everything else. For example, in the case of ‘arr’, the evidence has type

    forall b c. (b->c) -> a b c

    where ‘a’ is the ambient type of the arrow. This polymorphism is important because the desugarer uses the same evidence at multiple different types.
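
The need for polymorphic evidence can be illustrated outside GHC with a small sketch. The names ArrEvidence, stdArr and useTwice are hypothetical and are not GHC's own data structures; the sketch only assumes that a table entry behaves like a value of type forall b c. (b->c) -> a b c, which is then used at two different types.

```haskell
{-# LANGUAGE RankNTypes #-}

import Control.Arrow (arr, (>>>))

-- Hypothetical stand-in for one table entry: evidence for 'arr' that is
-- fixed to the arrow type 'a' but still polymorphic in 'b' and 'c'.
newtype ArrEvidence a = ArrEvidence (forall b c. (b -> c) -> a b c)

-- Evidence for the function arrow, built from the standard 'arr'.
stdArr :: ArrEvidence (->)
stdArr = ArrEvidence arr

-- The same piece of evidence is used at two different types:
-- once at Int -> Int and once at Int -> String.
useTwice :: ArrEvidence (->) -> Int -> String
useTwice (ArrEvidence arrE) = arrE (+ 1) >>> arrE show
```

If the evidence were fully instantiated at a single pair of types, useTwice would not typecheck, which is roughly the situation the desugarer would be in.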

This is Less Cool than what we normally do for rebindable syntax, which is to make a fully-instantiated piece of evidence at every use site. The Cmd way is Less Cool because:

  • The renamer has to predict which methods are needed. See the tedious RnExpr.methodNamesCmd.
  • The desugarer has to know the polymorphic type of the instantiated method. This is checked by Inst.tcSyntaxName, but is less flexible than the rest of rebindable syntax, where the type is less pre-ordained. (And this flexibility is useful; for example we can typecheck do-notation with (>>=) :: m1 a -> (a -> m2 b) -> m2 b.)

Note [OutOfScope and GlobalRdrEnv]

To understand why we bundle a GlobalRdrEnv with an out-of-scope variable, consider the following module:

module A where
foo :: ()
foo = bar
bat :: [Double]
bat = [1.2, 3.4]
$(return [])
bar = ()
bad = False

When A is compiled, the renamer determines that bar is not in scope in the declaration of foo (since bar is declared in the following inter-splice group). Once it has finished typechecking the entire module, the typechecker then generates the associated error message, which specifies both the type of bar and a list of possible in-scope alternatives:

A.hs:6:7: error:
  • Variable not in scope: bar :: ()
  • ‘bar’ (line 13) is not in scope before the splice on line 11
    Perhaps you meant ‘bat’ (line 9)

When it calls RnEnv.unknownNameSuggestions to identify these alternatives, the typechecker must provide a GlobalRdrEnv. If it provided the current one, which contains top-level declarations for the entire module, the error message would incorrectly suggest the out-of-scope bar and bad as possible alternatives for bar (see #11680). Instead, the typechecker must use the same GlobalRdrEnv the renamer used when it determined that bar is out-of-scope.

To obtain this GlobalRdrEnv, can the typechecker simply use the out-of-scope bar’s location to either reconstruct it (from the current GlobalRdrEnv) or to look it up in some global store? Unfortunately, no. The problem is that location information is not always sufficient for this task. This is most apparent when dealing with the TH function addTopDecls, which adds its declarations to the FOLLOWING inter-splice group. Consider these declarations:

ex9 = cat               -- cat is NOT in scope here

$(do -------------------------------------------------------------
    ds <- [d| f = cab     -- cat and cap are both in scope here
              cat = ()
            |]
    addTopDecls ds
    [d| g = cab           -- only cap is in scope here
        cap = True
      |])

ex10 = cat              -- cat is NOT in scope here

$(return []) -----------------------------------------------------

ex11 = cat              -- cat is in scope

Here, both occurrences of cab are out-of-scope, and so the typechecker needs the GlobalRdrEnvs which were used when they were renamed. These GlobalRdrEnvs are different (cat is present only in the GlobalRdrEnv for f’s `cab`), but the locations of the two `cab`s are the same (they are both created in the same splice). Thus, we must include some additional information with each `cab` to allow the typechecker to obtain the correct GlobalRdrEnv. Clearly, the simplest information to use is the GlobalRdrEnv itself.
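
The idea of bundling the environment with the out-of-scope occurrence can be sketched with a toy model. Everything below is hypothetical: Env, OutOfScope, similar and suggestions are illustrative names, the environment is just a list of strings rather than a GlobalRdrEnv, and the similarity test is a crude stand-in for whatever the real suggestion code does.

```haskell
-- Toy model only: GHC's real environment type is GlobalRdrEnv.
type Env = [String]            -- names in scope

-- An out-of-scope occurrence bundled with the environment it was renamed in.
data OutOfScope = OutOfScope
  { oosName :: String
  , oosEnv  :: Env
  }

-- Hypothetical similarity test: same length, at most one differing character.
similar :: String -> String -> Bool
similar a b =
  length a == length b && length (filter id (zipWith (/=) a b)) <= 1

-- Suggestions are drawn from the bundled environment, not the final one.
suggestions :: OutOfScope -> [String]
suggestions (OutOfScope n env) = [ c | c <- env, similar n c ]
```

With the environment the renamer saw for foo, suggestions (OutOfScope "bar" ["foo", "bat"]) offers only bat; with the whole module's names, ["foo", "bat", "bar", "bad"], it would also offer bar and bad, which is the wrong behaviour described above.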

Note [Parens in HsSyn]

HsPar (and ParPat in patterns, HsParTy in types) is used as follows

  • HsPar is required; the pretty printer does not add parens.
  • HsPars are respected when rearranging operator fixities. So a * (b + c) means what it says (where the parens are an HsPar)
  • For ParPat and HsParTy the pretty printer does add parens but this should be a no-op for ParsedSource, based on the pretty printer round trip feature introduced in https://phabricator.haskell.org/rGHC499e43824bda967546ebf95ee33ec1f84a114a7c
  • ParPat and HsParTy are pretty printed as ‘( .. )’ regardless of whether or not they are strictly necessary. This should be addressed when #13238 is completed, to be treated the same as HsPar.

Note [Sections in HsSyn]

Sections should always appear wrapped in an HsPar, thus
HsPar (SectionR …)

The parser parses sections in a wider variety of situations (See Note [Parsing sections]), but the renamer checks for those parens. This invariant makes pretty-printing easier; we don’t need a special case for adding the parens round sections.

Note [Rebindable if]

The rebindable syntax for ‘if’ is a bit special, because when rebindable syntax is off we do not want to treat

(if c then t else e)

as if it was an application (ifThenElse c t e). Why not? Because we allow an ‘if’ to return unboxed results, thus

if blah then 3# else 4#

whereas that would not be possible using a call to a polymorphic function (because you can’t call a polymorphic function at an unboxed type).

So we use Nothing to mean “use the old built-in typing rule”.
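
As a small illustration of the built-in rule, the following sketch (with hypothetical function names pick and pickBoxed) uses an ‘if’ whose branches are unboxed. It assumes only the MagicHash extension and the Int# type and I# constructor exported by GHC.Exts.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), Int#)

-- The built-in typing rule lets both branches have the unboxed type Int#.
pick :: Bool -> Int#
pick b = if b then 3# else 4#

-- Box the result so it can be printed or compared.
pickBoxed :: Bool -> Int
pickBoxed b = I# (pick b)
```

An ordinary user-defined ifThenElse :: Bool -> a -> a -> a could not be applied here, since its type variable ranges over lifted types only.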

Note [Record Update HsWrapper]

There is a wrapper in RecordUpd which is used for the required constraints for pattern synonyms. This wrapper is created in the typechecking and is then directly used in the desugaring without modification.

For example, if we have the record pattern synonym P,

pattern P :: (Show a) => a -> Maybe a
pattern P{x} = Just x

foo = (Just True) { x = False }

then foo desugars to something like

foo = case Just True of
  P x -> P False

hence we need to provide the correct dictionaries to P’s matcher on the RHS so that we can build the expression.

Note [Located RdrNames]

A number of syntax elements have seemingly redundant locations attached to them. This is deliberate, to allow transformations making use of the API Annotations to easily correlate a Located Name in the RenamedSource with a Located RdrName in the ParsedSource.

There are unfortunately enough differences between the ParsedSource and the RenamedSource that the API Annotations cannot be used directly with RenamedSource, so this allows a simple mapping to be used based on the location.

Note [m_ctxt in Match]

A Match can occur in a number of contexts, such as a FunBind, HsCase, HsLam and so on.

In order to simplify tooling processing and pretty print output, the provenance is captured in an HsMatchContext.

This is particularly important for the API Annotations for a multi-equation FunBind.

The parser initially creates a FunBind with a single Match in it for every function definition it sees.

These are then grouped together by getMonoBind into a single FunBind, where all the Matches are combined.

In the process, all the original FunBind fun_id’s bar one are discarded, including the locations.

This causes a problem for source to source conversions via API Annotations, so the original fun_ids and infix flags are preserved in the Match, when it originates from a FunBind.

Example infix function definition requiring individual API Annotations

(&&&  ) [] [] =  []
xs    &&&   [] =  xs
(  &&&  ) [] ys =  ys

Note [The type of bind in Stmts]

Some Stmts, notably BindStmt, keep the (>>=) bind operator. We do NOT assume that it has type

(>>=) :: m a -> (a -> m b) -> m b

In some cases (see #303, #1537) it might have a more exotic type, such as

(>>=) :: m i j a -> (a -> m j k b) -> m i k b

So we must be careful not to make assumptions about the type. In particular, the monad may not be uniform throughout.
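
A minimal sketch of such an exotic bind, using RebindableSyntax so that do-notation picks up whatever (>>=) is in scope. The IxState type and the get, put and example definitions are hypothetical and exist only for this illustration.

```haskell
{-# LANGUAGE RebindableSyntax #-}

import Prelude hiding ((>>=), (>>), return)

-- An indexed state computation: the state type may change from i to j.
newtype IxState i j a = IxState { runIxState :: i -> (a, j) }

(>>=) :: IxState i j a -> (a -> IxState j k b) -> IxState i k b
m >>= f = IxState $ \i -> let (a, j) = runIxState m i in runIxState (f a) j

(>>) :: IxState i j a -> IxState j k b -> IxState i k b
m >> n = m >>= \_ -> n

return :: a -> IxState i i a
return a = IxState $ \i -> (a, i)

get :: IxState i i i
get = IxState $ \i -> (i, i)

put :: j -> IxState i j ()
put j = IxState $ \_ -> ((), j)

-- The state starts as an Int and ends as a String, so the "monad"
-- is not uniform across the statements of this do block.
example :: IxState Int String Int
example = do
  n <- get
  put (show n)
  return (n + 1)
```

Here each statement of the do block has a different pair of state indices, so nothing that handles BindStmt can assume the ordinary Monad type for the bind operator.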

Note [TransStmt binder map]

The [(idR,idR)] in a TransStmt behaves as follows:

  • Before renaming: []

  • After renaming:

    [ (x27,x27), …, (z35,z35) ]

    These are the variables bound by the stmts to the left of the ‘group’ and used either in the ‘by’ clause, or in the stmts following the ‘group’. Each item is a pair of identical variables.

  • After typechecking:

    [ (x27:Int, x27:[Int]), …, (z35:Bool, z35:[Bool]) ]

    Each pair has the same unique, but different types.
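
The change of type across a ‘group’ can be seen in user code. The following sketch assumes the TransformListComp extension together with groupWith and the from GHC.Exts; the sales and totals names are made up for the example.

```haskell
{-# LANGUAGE TransformListComp #-}

import GHC.Exts (groupWith, the)

sales :: [(String, Int)]
sales = [("tea", 1), ("milk", 2), ("tea", 3)]

-- Before the 'group', name :: String and qty :: Int.
-- After it, the same names are rebound at list types:
-- name :: [String] and qty :: [Int].
totals :: [(String, Int)]
totals =
  [ (the name, sum qty)
  | (name, qty) <- sales
  , then group by name using groupWith
  ]
```

Both name and qty are bound to the left of the ‘group’ and used after it, so they are the kind of variable that would appear in the binder map.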

Note [BodyStmt]

BodyStmts are a bit tricky, because what they mean depends on the context. Consider the following contexts:

  • A do expression of type (m res_ty)

    BodyStmt E any_ty: do { ….; E; … }

    E :: m any_ty

    Translation: E >> …

  • A list comprehension of type [elt_ty]

    BodyStmt E Bool: [ .. | …. E ]
                     [ .. | …, E, … ]
                     [ .. | …. | …, E | … ]

    E :: Bool

    Translation: if E then … else fail

  • A guard list, guarding a RHS of type rhs_ty

    BodyStmt E Bool: f x | …, E, … = …rhs…

    E :: Bool

    Translation: if E then … else fail

  • A monad comprehension of type (m res_ty)

    BodyStmt E Bool: [ .. | …. E ]

    E :: Bool

    Translation: guard E >> …

Array comprehensions are handled like list comprehensions.
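
The first three contexts can be compared side by side in ordinary Haskell. This is only a sketch of the source-level meaning, with hypothetical names; the hand-written versions are equivalent expressions, not GHC's actual desugarer output.

```haskell
-- do expression: the BodyStmt is sequenced with (>>).
doBody, doBodyHand :: Maybe Int
doBody     = do { Just 'x'; Just 1 }
doBodyHand = Just 'x' >> Just 1

-- List comprehension: the BodyStmt is a Bool filter.
evens, evensHand :: [Int] -> [Int]
evens     xs = [ x | x <- xs, even x ]
evensHand xs = concatMap (\x -> if even x then [x] else []) xs

-- Guard list on a right-hand side: the BodyStmt is again a Bool test.
classify :: Int -> String
classify n
  | n > 0, even n = "positive even"
  | otherwise     = "other"
```

In the do expression the statement has a monadic type, whereas in the other two contexts the same syntactic form must have type Bool.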

Note [How RecStmt works]

Example:

HsDo [ BindStmt x ex
     , RecStmt { recS_rec_ids   = [a, c]
               , recS_stmts     = [ BindStmt b (return (a,c))
                                  , LetStmt a = ...b...
                                  , BindStmt c ec ]
               , recS_later_ids = [a, b] }
     , return (a b) ]

Here, the RecStmt binds a,b,c; but
  • Only a,b are used in the stmts following the RecStmt,
  • Only a,c are used in the stmts inside the RecStmt before their bindings

Why do we need both rec_ids and later_ids? For monads they could be combined into a single set of variables, but not for arrows. That follows from the types of the respective feedback operators:

mfix :: MonadFix m => (a -> m a) -> m a
loop :: ArrowLoop a => a (b,d) (c,d) -> a b c
  • For mfix, the ‘a’ covers the union of the later_ids and the rec_ids
  • For ‘loop’, ‘c’ is the later_ids and ‘d’ is the rec_ids
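
The difference between the two feedback operators can be tried directly. This sketch uses only mfix from Control.Monad.Fix and loop from Control.Arrow (at the function arrow); onesM and firstThree are hypothetical names.

```haskell
import Control.Arrow (loop)
import Control.Monad.Fix (mfix)

-- mfix: a single type 'a' is both fed back and returned,
-- so the result necessarily contains everything that is fed back.
onesM :: Maybe [Int]
onesM = mfix (\xs -> Just (1 : xs))

-- loop: 'd' (the infinite stream) is fed back, while 'c' (a finite
-- prefix) is the only part visible to the rest of the program.
firstThree :: Int -> [Int]
firstThree = loop (\(b, d) -> (take 3 d, b : d))
```

With loop the fed-back value and the output are separate components of the pair, which is why the two sets of variables have to be tracked separately for arrows.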

Note [Typing a RecStmt]

A (RecStmt stmts) types as if you had written

(v1, ..., vn, _, ..., _) <- mfix (\~(_, ..., _, r1, ..., rm) ->
                                    do { stmts
                                       ; return (v1, ..., vn, r1, ..., rm) })

where v1..vn are the later_ids and r1..rm are the rec_ids.
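
A concrete instance of this scheme, written against the Maybe monad. It is a sketch by hand, not the output of GHC; viaRec and viaMfix are hypothetical names. In the rec block below, xs is the only later_id and ys is the only rec_id (it is used before its binding).

```haskell
{-# LANGUAGE RecursiveDo #-}

import Control.Monad.Fix (mfix)

-- Written with a rec block.
viaRec :: Maybe [Int]
viaRec = do
  rec xs <- Just (1 : ys)
      ys <- Just (2 : xs)
  return (take 4 xs)

-- Written by hand following the typing scheme: later_ids come first in
-- the tuple, rec_ids second. Note the space in "\ ~", which real source
-- code needs so that the two characters are not read as one operator.
viaMfix :: Maybe [Int]
viaMfix = do
  (xs, _) <- mfix (\ ~(_, ys) -> do
                      xs' <- Just (1 : ys)
                      ys' <- Just (2 : xs')
                      return (xs', ys'))
  return (take 4 xs)
```

The lazy pattern matters: without it, mfix would force the tuple before it has been produced.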

Note [Monad Comprehensions]

Monad comprehensions require separate functions like ‘return’ and ‘>>=’ for desugaring. These functions are stored in the statements used in monad comprehensions. For example, the ‘return’ of the ‘LastStmt’ expression is used to lift the body of the monad comprehension:

[ body | stmts ]
 =>
stmts >>= \bndrs -> return body

In transform and grouping statements (‘then ..’ and ‘then group ..’) the ‘return’ function is required for nested monad comprehensions, for example:

[ body | stmts, then f, rest ]
 =>
f [ env | stmts ] >>= \bndrs -> [ body | rest ]

BodyStmts require the ‘Control.Monad.guard’ function for boolean expressions:

[ body | exp, stmts ]
 =>
guard exp >> [ body | stmts ]

Parallel statements require the ‘Control.Monad.Zip.mzip’ function:

[ body | stmts1 | stmts2 | .. ]
 =>
mzip stmts1 (mzip stmts2 (..)) >>= \(bndrs1, (bndrs2, ..)) -> return body

In any other context than ‘MonadComp’, the fields for most of these ‘SyntaxExpr’s stay bottom.
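
Two of these translations can be checked against hand-written code. The sketch assumes the MonadComprehensions and ParallelListComp extensions and the MonadZip instance for Maybe from Control.Monad.Zip; the function names are hypothetical, and the hand-written versions are equivalent expressions rather than GHC's literal output.

```haskell
{-# LANGUAGE MonadComprehensions, ParallelListComp #-}

import Control.Monad (guard)
import Control.Monad.Zip (mzip)

-- A Bool statement in a monad comprehension becomes a call to guard.
compGuard, handGuard :: Maybe Int -> Maybe Int
compGuard mx = [ x * 2 | x <- mx, x > 0 ]
handGuard mx = mx >>= \x -> guard (x > 0) >> return (x * 2)

-- Parallel branches are combined with mzip.
compZip, handZip :: Maybe Int -> Maybe Int -> Maybe Int
compZip ma mb = [ a + b | a <- ma | b <- mb ]
handZip ma mb = mzip ma mb >>= \(a, b) -> return (a + b)
```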

Note [Applicative BodyStmt]

(#12143) For the purposes of ApplicativeDo, we treat any BodyStmt as if it was a BindStmt with a wildcard pattern. For example,

do
  x <- A
  B
  return x

is transformed as if it were

do
  x <- A
  _ <- B
  return x

so it transforms to

(\(x,_) -> x) <$> A <*> B

But we have to remember when we treat a BodyStmt like a BindStmt, because in error messages we want to emit the original syntax the user wrote, not our internal representation. So ApplicativeArgOne has a Bool flag that is True when the original statement was a BodyStmt, so that we can pretty-print it correctly.
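
The effect can be observed with a type that is Applicative but not a Monad. The sketch below assumes the ApplicativeDo extension and ZipList from Control.Applicative; viaDo and byHand are hypothetical names, and the hand-written version uses a curried lambda so that it typechecks as ordinary Haskell.

```haskell
{-# LANGUAGE ApplicativeDo #-}

import Control.Applicative (ZipList (..))

-- ZipList has no Monad instance, so this do block is only accepted
-- because the middle BodyStmt is treated like "_ <- ZipList "ab"".
viaDo :: ZipList Int
viaDo = do
  x <- ZipList [1, 2, 3]
  ZipList "ab"
  pure x

-- The applicative expression the block corresponds to.
byHand :: ZipList Int
byHand = (\x _ -> x) <$> ZipList [1, 2, 3] <*> ZipList "ab"
```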

Note [Pending Splices]

When we rename an untyped bracket, we name and lift out all the nested splices, so that when the typechecker hits the bracket, it can typecheck those nested splices without having to walk over the untyped bracket code. So for example

[| f $(g x) |]

looks like

HsBracket (HsApp (HsVar "f") (HsSpliceE _ (g x)))

which the renamer rewrites to

HsRnBracketOut (HsApp (HsVar f) (HsSpliceE sn (g x)))
               [PendingRnSplice UntypedExpSplice sn (g x)]
  • The ‘sn’ is the Name of the splice point, the SplicePointName

  • The PendingRnSplice gives the splice that splice-point name maps to; and the typechecker can now conveniently find these sub-expressions

  • The other copy of the splice, in the second argument of HsSpliceE in the renamed first arg of HsRnBracketOut, is used only for pretty printing
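
The renamer's rewrite can be modelled on a toy expression type. Everything here is hypothetical (Expr, Pending, liftSplices and renameBracket are not GHC's types or functions, and splice names are plain Ints); the sketch only shows the shape of the transformation: name each splice point, leave a copy in place, and collect the named splices in a side list.

```haskell
-- Toy expression language with splice points.
data Expr
  = Var String
  | App Expr Expr
  | Splice (Maybe Int) Expr   -- splice point name (Nothing before renaming)
  deriving (Eq, Show)

-- A pending splice: the splice point name and the spliced expression.
data Pending = Pending Int Expr
  deriving (Eq, Show)

-- Name every splice and collect it, threading a fresh-name counter.
-- The body of a splice is not traversed in this toy model.
liftSplices :: Int -> Expr -> (Int, Expr, [Pending])
liftSplices n (Var v) = (n, Var v, [])
liftSplices n (App f x) =
  let (n1, f', ps) = liftSplices n f
      (n2, x', qs) = liftSplices n1 x
  in (n2, App f' x', ps ++ qs)
liftSplices n (Splice _ e) = (n + 1, Splice (Just n) e, [Pending n e])

-- The renamed bracket body together with its pending splices.
renameBracket :: Expr -> (Expr, [Pending])
renameBracket e = let (_, e', ps) = liftSplices 0 e in (e', ps)
```

A later pass can then process the list of pending splices without walking the bracket body again, while the in-place copy remains available for printing.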

There are four varieties of pending splices generated by the renamer, distinguished by their UntypedSpliceFlavour:

  • Pending expression splices (UntypedExpSplice), e.g.,

    [|$(f x) + 2|]

    UntypedExpSplice is also used for
    • quasi-quotes, where the pending expression expands to

      $(quoter "…blah…")

      (see RnSplice.makePending, HsQuasiQuote case)

    • cross-stage lifting, where the pending expression expands to

      $(lift x)

      (see RnSplice.checkCrossStageLifting)

  • Pending pattern splices (UntypedPatSplice), e.g.,

    [| \ $(f x) -> x |]

  • Pending type splices (UntypedTypeSplice), e.g.,

    [| f :: $(g x) |]

  • Pending declaration splices (UntypedDeclSplice), e.g.,

    [| let $(f x) in … |]

There is a fifth variety of pending splice, which is generated by the type checker:

  • Pending typed expression splices (PendingTcSplice), e.g.,
    [||1 + $$(f 2)||]

It would be possible to eliminate HsRnBracketOut and use HsBracketOut for the output of the renamer. However, when pretty printing the output of the renamer, e.g., in a type error message, we do not want to print out the pending splices. In contrast, when pretty printing the output of the type checker, we do want to print the pending splices. So splitting them up seems to make sense, although I hate to add another constructor to HsExpr.