
Small types in RyuJit IR

The IR, for simplicity and efficiency reasons, largely follows the IL model, where only 32-bit and 64-bit integers are tracked as distinct types, while smaller integers exist only for storage locations and are implicitly widened on load and narrowed on store.
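By way of analogy, here is a minimal C sketch of that contract (illustrative names, not JIT code): loads from a small-typed location widen to a 32-bit value, and stores narrow back down.

```c
#include <stdint.h>

int16_t location; /* a small-typed storage location */

/* Loads implicitly widen: the value "on the evaluation stack" is an INT. */
int32_t load(void)
{
    return location; /* sign-extended to 32 bits */
}

/* Stores implicitly narrow: only the low 16 bits reach the location. */
void store(int32_t value)
{
    location = (int16_t)value;
}
```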

Thus, you will not see primitive arithmetic operations of, for example, type SHORT in the IR - they only exist for INT and LONG (ignoring BYREFs).
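For instance, a SHORT addition is really an INT addition in the IR, with the narrowing happening only at the store - the same shape C gives this function (an analogy, with illustrative names):

```c
#include <stdint.h>

int16_t add_shorts(int16_t a, int16_t b)
{
    /* The addition itself happens at INT width, roughly
       ADD(CAST<int>(a), CAST<int>(b)); only the store narrows. */
    return (int16_t)(a + b);
}
```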

The following is a list of IR nodes known to use small types and what semantics they have:

  1. INDs on the RHS of an assignment (and always in LIR): specify the width of the indirection. Signedness of the type determines whether the load will use sign extension or zero extension. These nodes always produce INTs (see the first C sketch after this list).
  2. INDs on the LHS of an assignment (STOREIND in LIR): specify the width of the storage location.
  3. Relops (EQ/NE/LE/GE/LT/GT) of type UBYTE - an xarch-specific lowering optimization signifying that the result does not need to be zero-extended (currently, this happens for a STOREIND user).
  4. CALLs - yet to be investigated. Interesting case: callees that do not normalize the return (i. e. native calls).
  5. ASGs - use the type of the LHS. The type of an ASG largely does not matter, except when the ASG is a setup arg, where it maintains the illusion that the non-late arg "produces a value".
  6. LCL_VARs that are NormalizeOnLoad: arguments, address-exposed locals, and promoted struct fields.
    • On the RHS: get wrapped in a CAST to the small type by morph and retyped as INTs. Also always produce INTs.
      • One interesting detail of NormalizeOnLoad variables is that, if the variable is used from a memory location, the "load" can be "out of bounds": the location assigned to such a variable can be narrower than sizeof(int), while the load is performed by the LCL_VAR node, which is typed as INT (the width of the LCL_VAR determines the width of the load). This should not cause problems: the value is immediately extended, so the upper bits are never seen by anything other than the cast, and reading 2 or 3 extra bytes from a stack location (where such a variable lives) ought not to have other ill effects (except for perhaps confusing the debugger once in a while). Still, optimizations operating on NormalizeOnLoad variables need to be mindful of this fact (see the second C sketch after this list).
      • Because of this detail, the value numbers given to NormalizeOnLoad variables are "wrong" in the sense that the LCL_VAR tree does not compute the narrow number it was given at its definition (which is how they are numbered now), but rather that number plus whatever random bytes happen to be next to the local on the stack at the point of use.
    • On the LHS: remain typed small, mirroring the "width of the store" semantic that INDs on the LHS have.
  7. LCL_VARs that are NormalizeOnStore:
    • On the RHS: get retyped as INT.
    • On the LHS: sometimes they do not get retyped (CSE), sometimes they do (importing IL locals); they produce INT, but without the extension - it is implicit that the upper bits are in a "good" state.
      • NormalizeOnStore variables are always assigned a location size of sizeof(int) bytes, whether they are used from memory or from a register (see the third C sketch after this list).
  8. Lowering (see Lowering::OptimizeConstCompare) transforms relops with casts to UBYTE/BOOL as the first operands in a unique way: it gets rid of the cast and retypes its source to the small type. Unlike all other known cases in the IR, this "small node" actually has a "true" small type, i. e. there are no implicit extensions allowed for it - they would in fact be incorrect. This is because the original cast would have zero-extended from the source, but now the value is used "as-is", with the upper bits preserved in case it ends up in a register. In theory this optimization could be applied to most operands, but currently it is enabled only for a few select ones. It has one undesirable side effect: it results in a larger encoding for the test instruction when the operands end up in RSI, RDI, RBP or RSP on Amd64, because accessing their low byte registers (SIL, DIL, BPL, SPL) requires a REX prefix. This optimization is also the reason why decomposition has to insert "redundant" casts to small types (via DecomposeLongs::EnsureIntSized) - they are no longer redundant, the types are actually small.
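A few C sketches for the list above (all analogies with illustrative names, not JIT code). First, the extension semantics of small-typed loads from item 1: the signedness of the type picks between sign and zero extension, and the result is always a full INT.

```c
#include <stdint.h>

/* IND<byte> (signed): the load sign-extends, producing an INT. */
int32_t load_signed(int8_t *p)
{
    return *p; /* e.g. movsx on xarch */
}

/* IND<ubyte> (unsigned): the load zero-extends, producing an INT. */
int32_t load_unsigned(uint8_t *p)
{
    return *p; /* e.g. movzx on xarch */
}
```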
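Second, the "out of bounds" load described under item 6. The pointer cast below stands in for the INT-typed LCL_VAR reading a narrower stack slot; the C is deliberately non-portable, since the point is the machine-level behavior.

```c
#include <stdint.h>

int32_t read_normalize_on_load(int16_t *slot)
{
    /* The LCL_VAR is typed INT, so 4 bytes are read even though the
       slot is only 2 bytes wide - the extra bytes are garbage.      */
    int32_t wide = *(int32_t *)slot;

    /* The CAST inserted by morph immediately discards the garbage,
       so nothing downstream ever observes the upper bits.           */
    return (int16_t)wide;
}
```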
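Third, the NormalizeOnStore contract from item 7: the location is always sizeof(int) bytes, the extension happens at the store, and loads can therefore use the value as-is.

```c
#include <stdint.h>

int32_t slot; /* NormalizeOnStore locals always get a 4-byte location */

void store_short(int32_t value)
{
    /* Narrowing and re-extension happen at the store, so the full
       4-byte slot always holds a correctly extended value.          */
    slot = (int16_t)value;
}

int32_t load_short(void)
{
    /* Loads need no extension: the slot is already normalized. */
    return slot;
}
```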