eisenwave/deducing_this_concisely.md

    Deducing this concisely

P0847: Deducing this introduced explicit object member functions.
These greatly empower the developer, but come at a cost of added verbosity.
This proposal offers two independent ways of reducing this verbosity.
Motivation

struct Self {
    void impl() &;              // implicit object member function
    void expl(this Self& self); // explicit object member function
    void prop(this&);           // proposed (non-template)
};
Explicit object parameters force the developer to repeat themselves in two ways:

self and this are redundant.
It is already clear that self is an explicit object parameter due to the this keyword.
The class name Self must be repeated in the parameter.
The longer the class name, the more noticeable this becomes.

This redundancy punishes developers who prefer explicit object parameters as a style.
One of C++'s aims has always been not to restrict the style in which C++ developers write code.
Impeding a style through severe verbosity is not in the spirit of this noble goal.
Furthermore, explicit object parameters offer an opportunity to fix one of C++'s oldest design issues.
References were added relatively late in the development of C++ (C++ 3.0), long after the this pointer,
which originated in C with Classes.
Due to backwards compatibility, this was not made a reference, which has been lamented by numerous C++ developers.
This has been especially problematic since C++11 introduced rvalue ref-qualifiers, which arguably should affect what this is.
P0847 has left open a window of opportunity to remedy this historical issue without breaking changes.
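To make the ref-qualifier point concrete, here is a minimal sketch in standard C++11; the Widget type and its member names are hypothetical and only serve as illustration. Even in the &&-qualified function, this is still a pointer and *this is still an lvalue:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical example type; not part of the proposal.
struct Widget {
    // Selected for lvalue objects.
    bool on_lvalue() & {
        return std::is_pointer<decltype(this)>::value
            && std::is_lvalue_reference<decltype(*this)>::value;
    }
    // Selected for rvalue objects, yet `this` is still a pointer and
    // `*this` is still an lvalue inside the body.
    bool on_rvalue() && {
        return std::is_pointer<decltype(this)>::value
            && std::is_lvalue_reference<decltype(*this)>::value;
    }
};

inline void demo_ref_qualifiers() {
    Widget w;
    assert(w.on_lvalue());
    assert(Widget{}.on_rvalue());
}
```

The ref-qualifier only takes part in overload resolution; it does not change what this means inside the function body.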
Last but not least, the type of the current object cannot always be named, making it beneficial if it can be omitted:

In a lambda expression, one must use a generic lambda with this auto, even if a non-template is desired.
In macros, the current type isn't always known, making it more difficult to generate member functions.

Proposed Syntax

Unnamed Explicit Object Parameters

The redundancy of this and self can be eliminated by treating unnamed this parameters specially.
Namely, an unnamed this parameter changes what this names, rather than making this a pointer:
struct Self {
    int i = 0;
    
    int f1(this Self&&) { return i; } // ill-formed
    int f2(this Self&&) { return this.i; } // OK
    int f3(this Self&& self) {
        return this.i;  // ill-formed
        return this->i; // ill-formed
        return self.i;  // OK
    }
};
In f1 and f2, this is an lvalue of type Self, and the type of f1 and f2 is int(Self&&).
Note that P0847 has examined three other options:

If there is an explicit object parameter self, all access must be through self (proposed by P0847).
Implicit access as in f1 is disallowed, but the this pointer may be used as usual.
Implicit access and access through the this pointer is allowed.

The proposed syntax is a fourth option where an unnamed parameter changes what this names.
If the explicit object parameter is unnamed, this names that parameter.
Due to this being inaccessible in explicit object member functions in C++23, this is not a breaking change.
This approach has several advantages:

self (or other) names become unnecessary in almost all cases.
The developer opts into this changing meaning, so that they are not surprised by this no longer being a pointer.
If the developer consistently uses unnamed explicit object parameters, this is always used to access the current object.
This is consistent, teachable, and tooling-friendly, due to this being highlighted as a keyword in editors, unlike self.

The downside is that this is no longer a pointer in every context, which makes the language slightly more complex and context-dependent.
Interplays with capturing [this] and [*this] in lambdas

With the meaning of this changed, it needs to be examined how lambda captures are impacted.
The proposed behavior makes this name the function parameter, not the this pointer.
Therefore:

[this] would capture the object parameter by copy,
[*this] would be ill-formed,
[&this] would capture the object parameter by reference,
[&] can implicitly capture the current object by reference,
[=] can implicitly capture the current object by value.

This approach has two downsides:

The meaning of lambda expressions changes depending on the scope they are located in, despite them introducing their own scope.
There is now a second set of rules for this in lambda captures.

I see the first downside as harmless, since refactoring to use this. to access members is very simple.
The second issue is also benign, since this set of rules is not special, but rather the set of rules for regular parameters.
The overwhelming upside is that the developer has consistent behavior for unnamed explicit object parameters,
i.e. this. works in the lambda and outside of it.
Relocating code in and out of the lambda becomes easy.
Note that an unnamed explicit object parameter cannot be used for both the member function and a lambda inside it which captures this:
void foo(this Self&) {
    [&this](this auto&) {}; // error: a lambda parameter cannot shadow an explicitly captured entity
}
This example should be ill-formed, and would require the developer to disambiguate by giving one of the two parameters a name.
Implicit-Type Explicit Object Parameters

The second redundancy is having to repeat the name of the current class in the parameter.
This type name is obviously redundant, since implicit object member functions work without it.
Avoiding it with this auto is often undesirable, since it turns the member function into an abbreviated function template.
struct Self {
    void f1(this Self& self); // current
    void f2(this& self); // equivalent to f1, implicit-type explicit object parameter
    void f3(this&); // equivalent to f1, with unnamed explicit object parameter
    
    void f4(this auto&& self); // current, abbreviated function template
    void f5(this auto&&); // equivalent to f4, with unnamed explicit object parameter
};
While this T offers the flexibility that T can be a base class, or even a fundamental type,
the most obvious use case for member functions is that T is the current class.
This could simply be made the default, i.e. the type that is used if none is provided.
Note that there is an ambiguity with this approach.
It is sometimes unclear whether the type or the parameter name has been omitted.
void f(this self&); // unambiguous: the self must be the name of the type
void f(this self);  // ambiguous: self could also be the parameter name
The latter parameter could be equivalent to:

this self (current, unnamed explicit object parameter of type self), or to
this Self self (explicit object parameter of type Self named self).

I don't propose to change the current behavior.
This would unnecessarily break existing code, and with this proposal, there are two unambiguous alternatives:

void f(this) declares an unnamed, implicit-type explicit object parameter.
void f(this Self self) declares a typed explicit object parameter named self.

Impact on Lambda Expressions

This proposal allows the developer to express something which they were unable to express previously:
a non-generic lambda expression in which the current object can be accessed.
[](this auto& self) { return self(); } // C++23
[](this& self) { return self(); }      // proposed, with implicit-type explicit object parameter
[](this&) { return this(); }           // proposed, with implicit-type unnamed explicit object parameter
Not Proposed: Value Category Flexibility

Explicit object parameters are currently not as capable as implicit object parameters in one case.
Namely, the member function
void foo();
can be called with both lvalues and rvalues, and it can mutate the implicit object.
This behavior is commonly desirable for classes such as builder classes.
A Builder::add_x() member function should be callable with both builder. and Builder{}.,
and it mutates the implicit object.
Explicit object member functions don't allow this, unless the user resorts to a function template with a forwarding reference.
It is not always desirable to use a template for this purpose.
This proposal could have suggested some special form such as this* to recreate this behavior:
void foo() { this->x = 0; }
// or
void foo(this* self) { self->x = 0; }
// or
void foo(this*) { this->x = 0; }
Another possible workaround is a utility class template any_ref:
template <typename T>
struct any_ref {
    T& ref;
    any_ref(T&& r) : ref(r) {}
    any_ref(T& r) : ref(r) {}
    T& operator*() const { return ref; }
    T* operator->() const { return &ref; }
};

struct Self {
    void f(this any_ref<Self>) {
        return this->g();
    }
    void g();
};
However, a special form such as this* would add both wording effort and implementation effort, and similar to P0847,
I don't believe that this effort would be worth it, since implicit object parameters can still be used in this instance.
Impact on Existing Code

This proposal does not make any existing code invalid.
It only makes code valid which would have been syntactically invalid previously.
Implementation Experience

None.
Proposed Wording

WIP.
Unnamed Explicit Object Parameters

Replace [expr.prim.this] p1 as follows:

The keyword this names

a pointer to the object for which an implicit object member function ([class.mfct.non.static]) is invoked,
a pointer to the object for which a non-static data member's initializer ([class.mem]) is evaluated, or
the object for which an explicit object member function with an unnamed explicit object parameter ([dcl.fct]) is invoked.


Update [expr.prim.this] p3 as follows:
 If a declaration declares
-a member function or member function template,
+an implicit object member function or implicit object member function template
 of a class X, the expression this is a prvalue of type “pointer to cv-qualifier-seq X”
 wherever X is the current class between the optional cv-qualifier-seq and the end of the
 function-definition, member-declarator, or declarator.
-It shall not appear within the declaration of either a static member function
-or an explicit object member function of the current class
-(although its type and value category are defined within such member functions
-as they are within an implicit object member function).
Add the following paragraph to subclause [expr.prim.this]:

If a declaration declares an explicit object member function with an unnamed explicit object parameter,
the expression this names the unnamed explicit object parameter,
unless it appears in a decltype-specifier in the explicit-object-parameter-declaration ([dcl.fct]).
[Example:
struct Self {
  int x;
  auto f(this Self&) -> decltype(this.x) { return this.x; }    // OK, the return type of f is int
  auto g(this Self&) { return [&this] { return this.x; }(); }  // OK, equivalent to return this.x
  void h(this decltype(this)) { }                              // ill-formed
  void j(this Self, decltype(this)& out) { out = this; }       // OK
  Self k(this()) { return this(); }                            // OK, this is a pointer to a function returning Self
};
- end example]

Update the grammatical rule simple-capture as follows:
simple-capture:
  identifier ...opt
  &identifier ...opt
  this
+ & this
  * this
Update [expr.prim.lambda.capture] p2 as follows:
 If a lambda-capture includes a capture-default that is &,
-no identifier
+neither this nor any identifier
 in a simple-capture of that lambda-capture shall be preceded by &.
 If a lambda-capture includes a capture-default that is =,
 each simple-capture of that lambda-capture shall be of the form “& identifier ...opt”, “this”,
+“& this”,
 or “* this”.

 [Note: The form [&,this]
+outside explicit object member functions with an unnamed explicit object parameter
 is redundant but accepted for compatibility with ISO C++ 2014. — end note]
Update [expr.prim.lambda.capture] p4 as follows:
 The identifier in a simple-capture shall denote a local entity ([basic.lookup.unqual], [basic.pre]).
+In an explicit object member function with an unnamed explicit object parameter,
+the simple-captures this and & this denote the unnamed explicit object parameter,
+and the simple-capture * this shall not appear.
+Otherwise, the
-The
 simple-captures this and * this denote the local entity *this
-.
+, and the simple-capture & this shall not appear.
 An entity that is designated by a simple-capture is said to be explicitly captured.
Update [expr.prim.lambda.capture] p7.2 as follows:
-A this expression potentially references *this.
+In an explicit object member function with an unnamed explicit object parameter,
+a this expression potentially references the unnamed explicit object parameter.
+Otherwise, a this expression potentially references *this.
Update [expr.prim.lambda.capture] p10 as follows:
An entity is captured by copy if
  * it is implicitly captured, the capture-default is =, and the captured entity is
-   not
+   neither &this nor
    *this, or
  * it is explicitly captured with a capture that is not of the form
-   this,
    & identifier, & identifier initializer,
+   & this, or
+   this outside of an explicit object member function with an unnamed explicit object parameter.
Implicit-Type Explicit Object Parameters

Update the grammatical rule parameter-declaration as follows:
parameter-declaration:
- attribute-specifier-seq_opt this_opt decl-specifier-seq declarator
+ attribute-specifier-seq_opt this_opt decl-specifier-seq_opt declarator
  attribute-specifier-seq_opt decl-specifier-seq declarator = initializer-clause
- attribute-specifier-seq_opt this_opt decl-specifier-seq abstract-declarator_opt
+ attribute-specifier-seq_opt this_opt decl-specifier-seq_opt abstract-declarator_opt
Add the following paragraph to [dcl.fct]:

An explicit-object-parameter-declaration where

the decl-specifier-seq contains no simple-type-specifier, elaborated-type-specifier, or typename-specifier, or
the decl-specifier-seq is absent

has the same type as an explicit-object-parameter-declaration where the decl-specifier-seq contains T,
where T is the name of the current class.
[Example:
struct T {
    void f1(this);               // explicit object parameter is of type "T"
    void f2(this T);
    void f3(this T self);
    
    void c1(this const);         // explicit object parameter is of type "const T"
    void c2(this const T);
    void c3(this const T self);
    
    void r1(this&);              // explicit object parameter is of type "lvalue-reference to T"
    void r2(this T&);
    void r3(this& self);
    void r4(this T& self);
};
- end example]

Criticisms and FAQ


Changing the meaning of the this expression now seems a little suspect.

It is a change in meaning, but an opt-in change.
If you refactor a parameter to be an unnamed explicit object parameter, you have to update the function body as well.
I believe this is reasonable, and the challenge of refactoring is simply replacing -> with ., or adding this. in most cases.

What about a this(self) parameter? Is this equivalent to this Self self or this Self(self)?

This form is equivalent to this Self(self), i.e. an unnamed parameter of type "function taking self and returning Self".
This interpretation is not very useful; this(self) could instead have been used to disambiguate this self so that self is a parameter name.
However, this would not be consistent with other language rules, and with the relatively simple proposed wording.
this, and this Self self offer viable alternatives, so it is not worth dedicating effort to this case.

What about a this decltype(this) self parameter?

This is ill-formed, since this cannot be used in an explicit object member function with a named parameter.

What about a this decltype(this) parameter?

This form is not very useful and it should be disallowed.
By the conventional rules in this proposal, it would mean that decltype(this) is whatever this has been redefined to.
However, that would require look-ahead, since this form appears prior to the (abstract) declarator which gives this meaning.
I don't believe that this form offers enough value to justify the implementation effort.