- Proposed
- Prototype: Not Started
- Implementation: Not Started
- Specification: Not Started
We allow deconstructing an instance into its constituent properties/fields in a way paralleling how property patterns can conditionally deconstruct an instance, and positional deconstruction can deconstruct instances with a suitable Deconstruct method.
We similarly allow deconstructing a collection into its constituent elements in a way paralleling list patterns.
It is common to want to extract a number of fields/properties from an instance. Currently this is possible to do declaratively using property patterns, but the fields/properties are only assigned when the pattern matches. This forces you to put your code within an if statement if you want to use pattern matching to declaratively extract a number of properties from an instance. In order to keep this brief I will link to a motivating example from an earlier discussion: dotnet/csharplang#3546.
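For illustration, here is a minimal sketch of the status quo in current C#. The `TextPoint` and `TextRange` records and the `StatusQuo` class are hypothetical stand-ins (modelled on the names used in the examples later in this proposal), not types from any library.

```csharp
using System;

// Hypothetical types, assumed for illustration only.
public record TextPoint(long Line, long Column);
public record TextRange(TextPoint Start, TextPoint End);

public static class StatusQuo
{
    public static string Describe(TextRange textRange)
    {
        // A property pattern extracts all four values declaratively, but the
        // variables are only definitely assigned where the pattern matched,
        // so the code that uses them has to live inside the `if`.
        if (textRange is
            {
                Start: { Line: var startLine, Column: var startColumn },
                End: { Line: var endLine, Column: var endColumn }
            })
        {
            return $"{startLine}:{startColumn}-{endLine}:{endColumn}";
        }

        // Only reachable when textRange, Start or End is null.
        throw new ArgumentException("Incomplete range", nameof(textRange));
    }
}
```

The `if` and the fallback branch exist only to satisfy definite assignment; the caller may already know the value is complete.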
Additionally there's an aspect of symmetry in the language (see #3107 for more on this theme):
There is currently a parallelism in two dimensions between positional data, nominal data, and collections on one axis, and declaration, construction, deconstruction, and pattern matching on the other.
You can declare types positionally using positional records/primary constructors. You can construct an instance positionally using a constructor, deconstruct it using positional deconstruction, and pattern match it using a positional pattern.
You can declare types nominally through properties/fields. You can construct an instance nominally through an object initializer, and you can pattern match it using property patterns.
You can construct a collection using a collection initializer, and you will likely soon be able to pattern match it using list patterns.
This proposal fills in two of the three missing squares here by introducing nominal and sequence deconstructions.
We have 3 aims which inform this design:
- Make the most common cases as easy as possible.
- Maintain symmetry with existing constructs (positional deconstructions, and patterns).
- Don't block ourselves from making enhancements in future language versions.
The most common case is to simply want to declare a bunch of variables. Here we take a cue from positional deconstruction, which allows you to preface a deconstruction with `var` to automatically declare locals for all identifiers within the deconstruction:
```csharp
var {
    Start: { Line: startLine, Column: startColumn },
    End: { Line: endLine, Column: endColumn },
} = textRange;
```

This declares 4 variables: `startLine`, `startColumn`, `endLine`, and `endColumn`.
Positional deconstruction also allows you to specify the type explicitly, and to assign to arbitrary l-values, so we allow that by leaving off the `var`:

```csharp
{
    Start: { Line: long startLine, Column: C.Column },
    End: var { Line: endLine, Column: endColumn },
} = textRange;
```

Patterns can contain any arbitrary pattern, so we allow nesting any deconstruction in any other, e.g.:
```csharp
var ({ A: [a, b, c] }, d) = (new { A = new[] { 1, 2, 3 } }, 4);
```

Patterns can assign a pattern to a variable, even if the pattern itself contains other nested patterns, so we allow that:
```csharp
var {
    Start: { Line: startLine, Column: startColumn } start,
    End: { Line: endLine, Column: endColumn } end,
} = textRange;
```

It's useful to be able to assign such a variable to an existing local, like so:
```csharp
TextPoint start;
{ Start: { ... } start } = textRange;
```

On the other hand, we want to be able to declare a new local. We can't do so by putting `var` beforehand, since that makes all nested identifiers declare new locals. We don't want to do so by putting a type beforehand, since that would lead to a confusing difference between `var` and other types. Instead we say that `{} identifier` declares a new local if one does not exist, and otherwise assigns to the existing local. This is very different from how C# works so far and may be reconsidered.
We apply all these principles to positional and sequence deconstructions as well, which results in a very cohesive grammar and set of features.
Unlike patterns, deconstruction does no null checking or bounds checking, and will throw a `NullReferenceException` or an `IndexOutOfRangeException` if these are violated. As ever, the compiler will warn you if you deconstruct a maybe-null reference.
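To make the difference concrete, here is a small sketch in current C# of what `var [a, b, c] = values;` would roughly amount to, next to a hand-written equivalent of the checking a list pattern performs. The class and method names are hypothetical, and the array case is assumed because arrays are what throw `IndexOutOfRangeException`.

```csharp
using System;

public static class UncheckedSequenceLowering
{
    // Roughly what `var [a, b, c] = values;` would amount to under this
    // proposal: plain indexer reads with no null or length test in front.
    public static (int A, int B, int C) FirstThree(int[] values)
    {
        var t = values;   // NullReferenceException on the next line if null
        var a = t[0];
        var b = t[1];
        var c = t[2];     // IndexOutOfRangeException if fewer than 3 elements
        return (a, b, c);
    }

    // A pattern has to test before it binds; written here by hand so the
    // sketch does not depend on list-pattern support in the compiler.
    public static bool TryFirstThree(int[] values, out (int A, int B, int C) result)
    {
        if (values is null || values.Length != 3)
        {
            result = default;
            return false;
        }
        result = (values[0], values[1], values[2]);
        return true;
    }
}
```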
```antlr
statement
    : ...
    | declaration_statement
    ;

declaration_statement
    : declaration_target '=' expression ';'
    | type single_variable_designation ('=' expression)? (',' single_variable_designation ('=' expression)?)* ';'
    ;

foreach_statement
    : ...
    | 'foreach' '(' declaration 'in' expression ')' embedded_statement
    ;

declaration
    : 'var' var_variable_designation
    | type single_variable_designation
    ;

variable_designation
    : var_variable_designation
    | single_variable_designation
    | discard_designation
    ;

var_variable_designation
    : parenthesized_variable_designation
    | nominal_variable_designation // new
    | sequence_variable_designation // new
    ;

parenthesized_variable_designation
    : '(' variable_designation ',' variable_designation (',' variable_designation)* ')' identifier?
    ;

nominal_variable_designation
    : '{' named_variable_designation (',' named_variable_designation)* ','? '}' identifier?
    ;

sequence_variable_designation
    : '[' variable_designation (',' variable_designation)* ']' identifier?
    ;

named_variable_designation
    : identifier ':' variable_designation
    ;

single_variable_designation
    : identifier
    ;

discard_designation
    : '_'
    ;

declaration_target
    : declaration
    | deconstruction
    ;

deconstruction
    : positional_deconstruction
    | nominal_deconstruction // new
    | sequence_deconstruction // new
    ;

declaration_target_or_expression
    : declaration_target
    | expression // see https://github.com/dotnet/roslyn/blob/fbf1583ed659db06e903d877b35c3cbd45eb7e1d/src/Compilers/CSharp/Portable/Generated/CSharp.Generated.g4#L685 for complete list
    ;

positional_deconstruction
    : '(' declaration_target_or_expression ',' declaration_target_or_expression (',' declaration_target_or_expression)* ')' identifier?
    ;

nominal_deconstruction
    : '{' nominal_deconstruction_element (',' nominal_deconstruction_element)* ','? '}' identifier?
    ;

sequence_deconstruction
    : '[' declaration_target_or_expression (',' declaration_target_or_expression)* ']' identifier?
    ;

nominal_deconstruction_element
    : identifier ':' declaration_target_or_expression
    ;
```

Examples:
```csharp
// Short-hand deconstruction
var (x, y) = e;
var [x, y] = e;
var { A: a } = e;

// Recursive deconstruction
(var x, var y) = e;
[var x, var y] = e;
{ A: var a } = e;

// Bind to an existing l-value
(x, y) = e;
[x, y] = e;
{ A: a } = e;
```

A `var_variable_designation` is lowered recursively as follows:
- Every `var_variable_designation` has a unique target `t`, which is a temporary variable of type `T` inferred from the expression that is assigned to `t`. If the `var_variable_designation` is the top level `var_variable_designation` in a `declaration_statement` we assign `expression` to `t`. If the `var_variable_designation` is the top level `var_variable_designation` in a `foreach_statement` we assign `enumerator.Current` to `t`. Else `t` is defined recursively below.
- If a `var_variable_designation` defines an `identifier` `i`, we declare a local of type `T?` and name `i` and the same scope as the scope of the `declaration_statement`/`foreach_statement`, and assign `t` to `i`.
- Assuming the `var_variable_designation` has `n` child `variable_designation`s `v0` to `vn - 1`, we produce a set of child temps `t0` to `tn - 1` as follows.
  - If the `var_variable_designation` is a `parenthesized_variable_designation` we look for a suitable deconstructor on `T` to deconstruct `t` into `t0` to `tn - 1`. See the spec for more details.
  - If the `var_variable_designation` is a `nominal_variable_designation`, for each `named_variable_designation` with identifier `ix`, `t` must have an accessible property or field `ix`, and we assign `t.ix` to `tx` (this should match the spec for property patterns).
  - If the `var_variable_designation` is a `sequence_variable_designation`, `t` must have an indexer accepting a single parameter of type `int`, and we assign `t[x]` to `tx` (this should match and keep up to date with the spec for collection patterns, e.g. we may allow use of `GetEnumerator` here).
- For each child `variable_designation` `vx`:
  - If `vx` is a `var_variable_designation` we lower `vx` as specified here, using `tx` as `t` for `vx`.
  - If `vx` is a `single_variable_designation` with `identifier` `ix` we declare a local of type `Tx?` and name `ix` and the same scope as the scope of the `declaration_statement`/`foreach_statement`, and assign `tx` to `ix`.
  - If `vx` is a `discard_designation` we do nothing.
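As a rough sketch of these steps (an illustration in current C#, not what a compiler would necessarily emit), the earlier `var { Start: { Line: startLine, Column: startColumn }, End: { Line: endLine, Column: endColumn } } = textRange;` example might lower to something like the following. `TextPoint`, `TextRange`, the class name and the temp names are all hypothetical.

```csharp
// Hypothetical types, assumed for illustration only.
public record TextPoint(long Line, long Column);
public record TextRange(TextPoint Start, TextPoint End);

public static class VarDesignationLowering
{
    public static (long, long, long, long) Lower(TextRange textRange)
    {
        // Top-level target t receives the right-hand expression.
        var t = textRange;

        // Nominal designation: one child temp per named member.
        // No null check is made, so a null `t` throws here.
        var t0 = t.Start;
        var t1 = t.End;

        // First child is itself a nominal designation, lowered
        // recursively with t0 as its target.
        var t00 = t0.Line;
        var t01 = t0.Column;
        var startLine = t00;    // leaf: declare a local, assign the temp
        var startColumn = t01;

        // Second child, lowered the same way with t1 as its target.
        var t10 = t1.Line;
        var t11 = t1.Column;
        var endLine = t10;
        var endColumn = t11;

        return (startLine, startColumn, endLine, endColumn);
    }
}
```

A real implementation would presumably elide most of these temps; they are kept here only to line up with the steps one to one.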
A `deconstruction` is lowered recursively as follows:

- Every `deconstruction` has a unique target `t`, which is a temporary variable of type `T` inferred from the expression that is assigned to `t`. If the `deconstruction` is the top level `deconstruction` in a `declaration_statement` we assign `expression` to `t`. Else `t` is defined recursively below.
- If a `deconstruction` defines an `identifier` `i`:
  - If there is a local in scope with name `i` we assign `t` to `i`.
  - Else we declare a local of type `T?` and name `i` and the same scope as the scope of the `declaration_statement`, and assign `t` to `i`.
- Assuming the `deconstruction` has `n` child `declaration_target_or_expression`s `d0` to `dn - 1`: if this is a top level `deconstruction`, then for each `declaration_target_or_expression` `dx`:
  - If `dx` is an `expression`, it must be a valid l-value as defined by the spec, and we evaluate as much of `dx` as is evaluated before the RHS of an assignment operator as defined by the spec. The result of this evaluation is stored in a temp `dtx`.
  - If `dx` is a `deconstruction` we perform this step recursively to evaluate as much of its child `expression`s as is necessary.
- We produce a set of child temps `t0` to `tn - 1` as follows.
  - If the `deconstruction` is a `positional_deconstruction` we look for a suitable deconstructor on `T` to deconstruct `t` into `t0` to `tn - 1`. See the spec for more details.
  - If the `deconstruction` is a `nominal_deconstruction`, for each `nominal_deconstruction_element` with identifier `ix`, `t` must have an accessible property or field `ix`, and we assign `t.ix` to `tx` (this should match the spec for property patterns).
  - If the `deconstruction` is a `sequence_deconstruction`, `t` must have an indexer accepting a single parameter of type `int`, and we assign `t[x]` to `tx` (this should match and keep up to date with the spec for collection patterns, e.g. we may allow use of `GetEnumerator` here).
- For each child `declaration_target_or_expression` `dx`:
  - If `dx` is an `expression` we assign `tx` to `dtx` as specified by the spec on simple assignment. The assignment must be valid according to the rules specified there.
  - If `dx` is a `declaration`:
    - If the `declaration` is a `var_variable_designation` we lower it as specified above, using `tx` as its `t`.
    - If the `declaration` is a `single_variable_designation` with `identifier` `ix` and `type` `Tx` we declare a local of type `Tx` and name `ix` and the same scope as the scope of the `declaration_statement`, and assign `tx` to `ix`. If `type` is `var`, `Tx` is inferred from `tx`.
    - If the `declaration` is a `discard_designation` we do nothing.
  - If `dx` is a `deconstruction` we lower `dx` as specified here, using `tx` as `t` for `dx`.
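A minimal sketch of how these rules might play out for `{ Line: cursor.Line, Column: var column } = point;`, written in current C#. The `TextPoint` and `Cursor` types, the class name and the temp names are hypothetical. The relative order of evaluating the right-hand side and the l-value receiver is an assumption here: the sketch simply follows the order in which the steps are listed, and both are side-effect free so the result does not depend on it.

```csharp
// Hypothetical types, assumed for illustration only.
public record TextPoint(long Line, long Column);

public class Cursor
{
    public long Line { get; set; }
}

public static class DeconstructionLowering
{
    // Sketch of lowering: { Line: cursor.Line, Column: var column } = point;
    public static long Lower(Cursor cursor, TextPoint point)
    {
        // Target t receives the right-hand expression.
        var t = point;

        // The first element is an expression, so it must be an l-value.
        // Evaluate the part that an assignment evaluates before its
        // right-hand side (here, the receiver) and keep it in a temp.
        var dt0 = cursor;

        // Nominal deconstruction: one child temp per named member.
        var t0 = t.Line;
        var t1 = t.Column;

        // Expression element: simple assignment of the temp to the l-value.
        dt0.Line = t0;

        // Declaration element with `var`: the type is inferred from t1.
        var column = t1;

        return column;
    }
}
```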
This is a significant set of enhancements to deconstruction. Deconstruction is far less common than pattern matching, so it may be that the benefit from this set of enhancements is not considered sufficient to pay for itself.
In order to distinguish between a `nominal_deconstruction` and a block, we need to parse till we reach a `,`, a `;` or the closing brace (at which point we can check if it's followed by a `=` or not). This lookahead may be expensive.
In order to distinguish between a `sequence_deconstruction` and an attribute on a local function, we need to parse till we reach the closing `]` and check to see if it's followed by a `=` or not. This may also be expensive.
If expression blocks are added in the future, this may possibly lead to genuine ambiguities even at a semantic level. E.g. `{ P : (condition ? ref a : ref b) } = e;` could be a nominal deconstruction, or an assignment to an expression block containing the label `P`.
There are a number of simplifications to this spec we could consider:
- Only allow the `var` form of the patterns as the most common.
- Don't allow mixing the different forms of deconstruction.
- Don't allow declaring a local as well as a deconstruction.

etc.
There are also many axes along which the exact grammar/semantics could be adjusted. I hope my high-level overview made clear why I made the decisions I did, but I will not be surprised if others come to different conclusions.
How do we modify the spec I've given above to allow target typing?