compiler/basicTypes/MkId.hs
===========================

Note [Wired-in Ids]
~~~~~~~~~~~~~~~~~~~

A "wired-in" Id can be referred to directly in GHC (e.g. 'voidPrimId')
rather than by looking up its name in some environment or fetching it
from an interface file.

There are several reasons why an Id might appear in the wiredInIds:

* ghcPrimIds: see Note [ghcPrimIds (aka pseudoops)]

* magicIds: see Note [magicIds]

* errorIds, defined in coreSyn/MkCore.hs.
  These error functions (e.g. rUNTIME_ERROR_ID) are wired in
  because the desugarer generates code that mentions them directly.

In all cases except ghcPrimIds, there is a definition site in a
library module, which may be called (e.g. in higher order situations);
but the wired-in version means that the details are never read from
that module's interface file; instead, the full definition is right
here.

Note [ghcPrimIds (aka pseudoops)]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ghcPrimIds

* Are exported from GHC.Prim

* Can't be defined in Haskell, and hence have no Haskell binding site,
  but have perfectly reasonable unfoldings in Core

* Either have a CompulsoryUnfolding (hence always inlined), or
  an EvaldUnfolding and void representation (e.g. void#)

* Are (or should be) defined in primops.txt.pp as 'pseudoop'.
  Reason: that's how we generate documentation for them

Note [magicIds]
~~~~~~~~~~~~~~~

The magicIds

* Are exported from GHC.Magic

* Can be defined in Haskell (and are, in ghc-prim:GHC/Magic.hs).
  This definition at least generates Haddock documentation for them.

* May or may not have a CompulsoryUnfolding.

* But have some special behaviour that can't be done via an
  unfolding from an interface file

Note [Wrappers for data instance tycons]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the case of data instances, the wrapper also applies the coercion
turning the representation type into the family instance type to cast
the result of the wrapper.
For example, consider the declarations

::

    data family Map k :: * -> *
    data instance Map (a, b) v = MapPair (Map a (Pair b v))

The tycon to which the datacon MapPair belongs gets a unique internal
name of the form :R123Map, and we call it the representation tycon.
In contrast, Map is the family tycon (accessible via
tyConFamInst_maybe). A coercion allows you to move between
representation and family type. It is accessible from the
representation tycon :R123Map via tyConFamilyCoercion_maybe and has kind

::

    Co123Map a b v :: {Map (a, b) v ~ :R123Map a b v}

The wrapper and worker of MapPair get the types

::

    -- Wrapper
    $WMapPair :: forall a b v. Map a (Pair b v) -> Map (a, b) v
    $WMapPair a b v = MapPair a b v `cast` sym (Co123Map a b v)

    -- Worker
    MapPair :: forall a b v. Map a (Pair b v) -> :R123Map a b v

This coercion is conditionally applied by wrapFamInstBody.

It's a bit more complicated if the data instance is a GADT as well!

::

    data instance T [a] where
      T1 :: forall b. b -> T [Maybe b]

Hence we translate to

::

    -- Wrapper
    $WT1 :: forall b. b -> T [Maybe b]
    $WT1 b v = T1 (Maybe b) b (Maybe b) v
                 `cast` sym (Co7T (Maybe b))

    -- Worker
    T1 :: forall c b. (c ~ Maybe b) => b -> :R7T c

    -- Coercion from family type to representation type
    Co7T a :: T [a] ~ :R7T a

Newtype instances throw an additional wrinkle into the mix. Consider the
following example (adapted from #15318, comment:2):

::

    data family T a
    newtype instance T [a] = MkT [a]

Within the newtype instance, there are three distinct types at play:

1. The newtype's underlying type, [a].
2. The instance's representation type, TList a (where TList is the
   representation tycon).
3. The family type, T [a].

We need two coercions in order to cast from (1) to (3):

(a) A newtype coercion axiom:

    ::

        axiom coTList a :: TList a ~ [a]

    (Where TList is the representation tycon of the newtype instance.)

(b) A data family instance coercion axiom:

    ::

        axiom coT a :: T [a] ~ TList a

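The three types listed above can be seen from the source side in a small program. This is only a minimal sketch, not GHC's own code: the helper names `wrap` and `unwrap` are hypothetical, and the representation tycon and the two coercion axioms stay internal to GHC, so they do not appear in the source at all.

```haskell
{-# LANGUAGE TypeFamilies #-}
module Main (main) where

-- The declarations from the Note.
data family T a
newtype instance T [a] = MkT [a]

-- Goes from the underlying type (1) to the family type (3).
-- 'wrap' is a hypothetical helper name; GHC inserts the casts in Core.
wrap :: [a] -> T [a]
wrap = MkT

-- Goes back from the family type (3) to the underlying type (1).
-- 'unwrap' is likewise a hypothetical helper name.
unwrap :: T [a] -> [a]
unwrap (MkT xs) = xs

main :: IO ()
main = print (unwrap (wrap [1, 2, 3 :: Int]))
```

Because `MkT` is a newtype constructor, neither helper is expected to allocate after optimisation; the Core shown in this Note expresses both directions purely as casts.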
When we translate the newtype instance to Core, we obtain:

::

    -- Wrapper
    $WMkT :: forall a. [a] -> T [a]
    $WMkT a x = MkT a x |> Sym (coT a)

    -- Worker
    MkT :: forall a. [a] -> TList a
    MkT a x = x |> Sym (coTList a)

Unlike for data instances, the worker for a newtype instance is actually an
executable function which expands to a cast, but otherwise, the general
strategy is essentially the same as for data instances. Also note that we have
a wrapper, which is unusual for a newtype, but we make GHC produce one anyway
for symmetry with the way data instances are handled.

Note [Newtype datacons]
~~~~~~~~~~~~~~~~~~~~~~~

The "data constructor" for a newtype should always be vanilla. At one
point this wasn't true, because the newtype arising from

::

    class C a => D a

looked like

::

    newtype T:D a = D:D (C a)

so the data constructor for T:D had a single argument, namely the
predicate (C a). But now we treat that as an ordinary argument, not
part of the theta-type, so all is well.

Note [Compulsory newtype unfolding]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Newtype wrappers, just like workers, have compulsory unfoldings.
This is needed so that two optimizations involving newtypes have the same
effect whether a wrapper is present or not:

(1) Case-of-known constructor.
    See Note [beta-reduction in exprIsConApp_maybe].

(2) Matching against the map/coerce RULE. Suppose we have the RULE

    ::

        {-# RULE "map/coerce" map coerce = ... #-}

    As described in Note [Getting the map/coerce RULE to work],
    the occurrence of 'coerce' is transformed into:

    ::

        {-# RULE "map/coerce" forall (c :: T1 ~R# T2).
                              map ((\v -> v) `cast` c) = ... #-}

    We'd like 'map Age' to match the LHS. For this to happen, Age
    must be unfolded, otherwise we'll be stuck. This is tested in T16208.

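To make item (2) concrete, here is a minimal source-level sketch of the kind of program the map/coerce RULE is aimed at. `Age` is a hypothetical newtype over Int (the Note mentions 'Age' without defining it), and the two helper names are illustrative only. The program shows that `map Age` and `coerce` agree on results; whether the RULE actually fires depends on the optimisation level and is not observable from this output.

```haskell
module Main (main) where

import Data.Coerce (coerce)

-- Hypothetical newtype standing in for the 'Age' mentioned in the Note.
newtype Age = Age Int
  deriving (Eq, Show)

-- Written with 'map': this is the shape that the map/coerce RULE
-- would like to match once the newtype constructor has been unfolded.
agesViaMap :: [Int] -> [Age]
agesViaMap = map Age

-- The same conversion written directly with 'coerce'.
agesViaCoerce :: [Int] -> [Age]
agesViaCoerce = coerce

main :: IO ()
main = do
  let xs = [30, 41, 52]
  print (agesViaMap xs)                      -- [Age 30,Age 41,Age 52]
  print (agesViaMap xs == agesViaCoerce xs)  -- True
```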
Note [Inline partially-applied constructor wrappers]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We allow the wrapper to inline when partially applied to avoid
boxing values unnecessarily. For example, consider

::

    data Foo a = Foo !Int a

::

    instance Traversable Foo where
      traverse f (Foo i a) = Foo i <$> f a

This desugars to

::

    traverse f foo = case foo of
      Foo i# a -> let i = I# i#
                  in map ($WFoo i) (f a)

If the wrapper `$WFoo` is not inlined, we get a fruitless reboxing of `i`.
But if we inline the wrapper, we get

::

    map (\a. case i of I# i# -> Foo i# a) (f a)

and now case-of-known-constructor eliminates the redundant allocation.

Note [Activation for data constructor wrappers]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Activation on a data constructor wrapper allows it to inline only in
Phase 0. This way rules have a chance to fire if they mention a data
constructor on the left

::

    RULE "foo" f (K a b) = ...

Since the LHS of rules are simplified with InitialPhase, we won't
inline the wrapper on the LHS either.

On the other hand, this means that exprIsConApp_maybe must be able to deal
with wrappers so that case-of-constructor is not delayed; see
Note [exprIsConApp_maybe on data constructors with wrappers] for details.

It used to activate in phase 2 (afterInitial) and later, but that made it
awkward to write a RULE[1] with a constructor on the left: it would work if
the constructor has no wrapper, but whether a constructor has a wrapper
depends, for instance, on the order of the type arguments of that
constructor. Therefore changing the order of type arguments could make
previously working RULEs fail.

See also https://gitlab.haskell.org/ghc/ghc/issues/15840 .

Note [Bangs on imported data constructors]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We pass Maybe [HsImplBang] to mkDataConRep to make use of HsImplBangs
from imported modules.

- Nothing <=> use HsSrcBangs
- Just bangs <=> use HsImplBangs

For imported types we can't work it all out from the HsSrcBangs,
because we want to be very sure to follow what the original module
(where the data type was declared) decided, and that depends on what
flags were enabled when it was compiled. So we record the decisions in
the interface file.

The HsImplBangs passed are in 1-1 correspondence with the
dataConOrigArgTys of the DataCon.

Note [Data con wrappers and unlifted types]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Consider

::

    data T = MkT !Int#

We certainly do not want to make a wrapper

::

    $WMkT x = case x of y { DEFAULT -> MkT y }

For a start, it's silly to generate a no-op. But worse, since wrappers
are currently injected at TidyCore, we don't even optimise it away!
So the stupid case expression stays there. This actually happened for
the Integer data type (see #1600 comment:66)!

Note [Data con wrappers and GADT syntax]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Consider these two very similar data types:

::

    data T1 a b = MkT1 b

::

    data T2 a b where
      MkT2 :: forall b a. b -> T2 a b

Despite their similar appearance, T2 will have a data con wrapper but T1 will
not. What sets them apart? The types of their constructors, which are:

::

    MkT1 :: forall a b. b -> T1 a b
    MkT2 :: forall b a. b -> T2 a b

MkT2's use of GADT syntax allows it to permute the order in which `a` and `b`
would normally appear. See Note [DataCon user type variable binders] in DataCon
for further discussion on this topic.

The worker data cons for T1 and T2, however, both have types such that `a` is
expected to come before `b` as arguments. Because MkT2 permutes this order, it
needs a data con wrapper to swizzle around the type variables to be in the
order the worker expects.

A somewhat surprising consequence of this is that *newtypes* can have data con
wrappers!
After all, a newtype can also be written with GADT syntax:

::

    newtype T3 a b where
      MkT3 :: forall b a. b -> T3 a b

Again, this needs a wrapper data con to reorder the type variables. It does
mean that this newtype constructor requires another level of indirection when
being called, but the inliner should make swift work of that.

Note [HsImplBangs for newtypes]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Most of the time, we use the dataConSrcToImplBang function to decide what
strictness/unpackedness to use for the fields of a data type constructor. But
there is an exception to this rule: newtype constructors. You might not think
that newtypes would pose a challenge, since newtypes are seemingly forbidden
from having strictness annotations in the first place. But consider this
(from #16141):

::

    {-# LANGUAGE StrictData #-}
    {-# OPTIONS_GHC -O #-}
    newtype T a b where
      MkT :: forall b a. Int -> T a b

Because StrictData (plus optimization) is enabled, invoking
dataConSrcToImplBang would sneak in and unpack the field of type Int to Int#!
This would be disastrous, since the wrapper for `MkT` uses a coercion involving
Int, not Int#.

Bottom line: dataConSrcToImplBang should never be invoked for newtypes. In the
case of a newtype constructor, we simply hardcode its dcr_bangs field to
[HsLazy].

Note [Unpacking GADTs and existentials]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There is nothing stopping us unpacking a data type with equality
components, like

::

    data Equal a b where
      Equal :: Equal a a

And it'd be fine to unpack a product type with existential components
too, but that would require a bit more plumbing, so currently we don't.

So for now we require: null (dataConExTyCoVars data_con)

See #14978

Note [Unpack one-wide fields]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The flag UnboxSmallStrictFields ensures that any field that can
(safely) be unboxed to a word-sized unboxed field, should be so unboxed.
For example:

::

    data A = A Int#
    newtype B = B A
    data C = C !B
    data D = D !C
    data E = E !()
    data F = F !D
    data G = G !F !F

All of these should have an Int# as their representation, except
G which should have two Int#s.

However

::

    data T = T !(S Int)
    data S a = S !a

Here we can represent T with an Int#.

Note [Recursive unboxing]
~~~~~~~~~~~~~~~~~~~~~~~~~

Consider

::

    data R = MkR {-# UNPACK #-} !S Int
    data S = MkS {-# UNPACK #-} !Int

The representation arguments of MkR are the *representation* arguments
of S (plus Int); the rep args of MkS are Int#. This is all fine.

But be careful not to try to unbox this!

::

    data T = MkT {-# UNPACK #-} !T Int

Because then we'd get an infinite number of arguments.

Here is a more complicated case:

::

    data S = MkS {-# UNPACK #-} !T Int
    data T = MkT {-# UNPACK #-} !S Int

Each of S and T must decide independently whether to unpack
and they had better not both say yes. So they must both say no.

Also behave conservatively when there is no UNPACK pragma

::

    data T = MkS !T Int

with -funbox-strict-fields or -funbox-small-strict-fields
we need to behave as if there was an UNPACK pragma there.

But it's the *argument* type that matters. This is fine:

::

    data S = MkS S !Int

because Int is non-recursive.

Note [Dict funs and default methods]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Dict funs and default methods are *not* ImplicitIds. Their definition
involves user-written code, so we can't figure out their strictness etc
based on fixed info, as we can for constructors and record selectors (say).

NB: See also Note [Exported LocalIds] in Id

Note [Unsafe coerce magic]
~~~~~~~~~~~~~~~~~~~~~~~~~~

We define a *primitive* GHC.Prim.unsafeCoerce#
and then in the base library we define the ordinary function

::

    Unsafe.Coerce.unsafeCoerce :: forall (a:*) (b:*). a -> b
    unsafeCoerce x = unsafeCoerce# x

Notice that unsafeCoerce has a civilized (albeit still dangerous)
polymorphic type, whose type args have kind *.
So you can't use it on unboxed values (unsafeCoerce 3#).

In contrast unsafeCoerce# is even more dangerous because you *can* use
it on unboxed things, (unsafeCoerce# 3#) :: Int. Its type is

::

    forall (r1 :: RuntimeRep) (r2 :: RuntimeRep) (a: TYPE r1) (b: TYPE r2). a -> b

Note [seqId magic]
~~~~~~~~~~~~~~~~~~

'GHC.Prim.seq' is special in several ways.

a) In source Haskell its second arg can have an unboxed type

   ::

       x `seq` (v +# w)

   But see Note [Typing rule for seq] in TcExpr, which
   explains why we give seq itself an ordinary type

   ::

       seq :: forall a b. a -> b -> b

   and treat it as a language construct from a typing point of view.

b) Its fixity is set in LoadIface.ghcPrimIface

c) It has quite a bit of desugaring magic.
   See DsUtils.hs Note [Desugaring seq (1)] and (2) and (3)

d) There is some special rule handling: Note [User-defined RULES for seq]

Note [User-defined RULES for seq]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Roman found situations where he had

::

    case (f n) of _ -> e

where he knew that f (which was strict in n) would terminate if n did.
Notice that the result of (f n) is discarded. So it makes sense to
transform to

::

    case n of _ -> e

Rather than attempt some general analysis to support this, I've added
enough support that you can do this using a rewrite rule:

::

    RULE "f/seq" forall n. seq (f n) = seq n

You write that rule. When GHC sees a case expression that discards
its result, it mentally transforms it to a call to 'seq' and looks for
a RULE. (This is done in Simplify.trySeqRules.) As usual, the
correctness of the rule is up to you.

VERY IMPORTANT: to make this work, we give the RULE an arity of 1, not 2.
If we wrote

::

    RULE "f/seq" forall n e. seq (f n) e = seq n e

with rule arity 2, then two bad things would happen:

- The magical desugaring done in Note [seqId magic] item (c)
  for saturated application of 'seq' would turn the LHS into
  a case expression!
- The code in Simplify.rebuildCase would need to actually supply
  the value argument, which turns out to be awkward.

Note [lazyId magic]
~~~~~~~~~~~~~~~~~~~

::

    lazy :: forall a?. a? -> a?   (i.e. works for unboxed types too)

'lazy' is used to make sure that a sub-expression, and its free variables,
are truly used call-by-need, with no code motion. Key examples:

* pseq:

  ::

      pseq a b = a `seq` lazy b

  We want to make sure that the free vars of 'b' are not evaluated
  before 'a', even though the expression is plainly strict in 'b'.

* catch:

  ::

      catch a b = catch# (lazy a) b

  Again, it's clear that 'a' will be evaluated strictly (and indeed
  applied to a state token) but we want to make sure that any exceptions
  arising from the evaluation of 'a' are caught by the catch (see
  #11555).

Implementing 'lazy' is a bit tricky:

* It must not have a strictness signature: by being a built-in Id,
  all the info about lazyId comes from here, not from GHC.Base.hi.
  This is important, because the strictness analyser will spot it as
  strict!

* It must not have an unfolding: it gets "inlined" by a HACK in
  CorePrep. It's very important to do this inlining *after* unfoldings
  are exposed in the interface file. Otherwise, the unfolding for
  (say) pseq in the interface file will not mention 'lazy', so if we
  inline 'pseq' we'll totally miss the very thing that 'lazy' was
  there for in the first place. See #3259 for a real world
  example.

* Suppose CorePrep sees (catch# (lazy e) b). At all costs we must
  avoid using call by value here:

  ::

      case e of r -> catch# r b

  Avoiding that is the whole point of 'lazy'. So in CorePrep (which
  generates the 'case' expression for a call-by-value call) we must
  spot the 'lazy' on the arg (in CorePrep.cpeApp), and build a 'let'
  instead.

* lazyId is defined in GHC.Base, so we don't *have* to inline it. If it
  appears un-applied, we'll end up just calling it.

Note [noinlineId magic]
~~~~~~~~~~~~~~~~~~~~~~~

::

    noinline :: forall a. a -> a

'noinline' is used to make sure that a function f is never inlined,
e.g., as in 'noinline f x'. Ordinarily, the identity function with NOINLINE
could be used to achieve this effect; however, this has the unfortunate
result of leaving a (useless) call to noinline at runtime. So we have
a little bit of magic to optimize away 'noinline' after we are done
running the simplifier.

'noinline' needs to be wired-in because it gets inserted automatically
when we serialize an expression to the interface format. See
Note [Inlining and hs-boot files] in ToIface

Note [The oneShot function]
~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the context of making left-folds fuse somewhat okish (see ticket #7994
and Note [Left folds via right fold]) it was determined that it would be useful
if library authors could explicitly tell the compiler that a certain lambda is
called at most once. The oneShot function allows that.

'oneShot' is levity-polymorphic, i.e. the type variables can refer to unlifted
types as well (#10744); e.g.

::

    oneShot (\x:Int# -> x +# 1#)

Like most magic functions it has a compulsory unfolding, so there is no need
for a real definition somewhere. We have one in GHC.Magic for the convenience
of putting the documentation there.

It uses `setOneShotLambda` on the lambda's binder. That is the whole magic:

A typical call looks like

::

    oneShot (\y. e)

after unfolding the definition `oneShot = \f \x[oneshot]. f x` we get

::

    (\f \x[oneshot]. f x) (\y. e)
    --> \x[oneshot]. ((\y.e) x)
    --> \x[oneshot] e[x/y]

which is what we want.

It is only effective if the one-shot info survives as long as possible; in
particular it must make it into the interface in unfoldings. See Note [Preserve
OneShotInfo] in CoreTidy.

Also see https://gitlab.haskell.org/ghc/ghc/wikis/one-shot.

Note [magicDictId magic]
~~~~~~~~~~~~~~~~~~~~~~~~

The identifier `magicDict` is just a place-holder, which is used to
implement a primitive that we cannot define in Haskell but we can write
in Core.
It is declared with a place-holder type:

::

    magicDict :: forall a. a

The intention is that the identifier will be used in a very specific way,
to create dictionaries for classes with a single method. Consider a class
like this:

::

    class C a where
      f :: T a

We are going to use `magicDict`, in conjunction with a built-in Prelude
rule, to cast values of type `T a` into dictionaries for `C a`. To do
this, we define a function like this in the library:

::

    data WrapC a b = WrapC (C a => Proxy a -> b)

    withT :: (C a => Proxy a -> b)
          ->  T a -> Proxy a -> b
    withT f x y = magicDict (WrapC f) x y

The purpose of `WrapC` is to avoid having `f` instantiated.
Also, it avoids impredicativity, because `magicDict`'s type
cannot be instantiated with a forall. The field of `WrapC` contains
a `Proxy` parameter which is used to link the type of the constraint,
`C a`, with the type of the `WrapC` value being made.

Next, we add a built-in Prelude rule (see prelude/PrelRules.hs),
which will replace the RHS of this definition with the appropriate
definition in Core. The rewrite rule works as follows:

::

    magicDict @t (wrap @a @b f) x y
    ---->
    f (x `cast` co a) y

The `co` coercion is the newtype-coercion extracted from the type-class.
The type class is obtained by looking at the type of wrap.

-------------------------------------------------------------

realWorld# used to be a magic literal, void#. If things get
nasty as-is, change it back to a literal (Literal).

voidArgId is a Local Id used simply as an argument in functions
where we just want an arg to avoid having a thunk of unlifted type.
E.g.

::

    x = \ void :: Void# -> (# p, q #)

This comes up in strictness analysis

Note [evaldUnfoldings]
~~~~~~~~~~~~~~~~~~~~~~

The evaldUnfolding makes it look as if some primitive value is
evaluated, which in turn makes Simplify.interestingArg return True,
which in turn makes INLINE things applied to said value likely to be
inlined.