Functions
Overview
A function is defined by writing a function signature after the : and a statement (an expression or a { } compound statement) after the =. After the optional template parameters available for all declarations, a function signature consists of a possibly-empty parameter list and one or more optional return values.
For example, the minimal function named func that takes no parameters and returns nothing (void) is:
func: () -> void = { }
Function signatures: Parameters, returns, and using function types
Overview
There are six kinds of function parameters, and two of them are also the kinds of function returns:
Kind | Parameter | Return |
---|---|---|
in | ⭐ | |
inout | ✅ | |
out | ✅ | |
copy | ✅ | |
move | ✅ | ✅ |
forward | ✅ | ⭐ |
The two cases marked ⭐ can automatically pass/return by value or by reference, and so they can optionally be written with _ref to require pass/return by reference and not by value (i.e., in_ref, -> forward_ref).
That's it. For details, see below.
Parameters
The parameter list is a list enclosed by ( ) parentheses. Each parameter is declared using the same unified syntax as used for all declarations. For example:
func: (
x: i32, // parameter x is a 32-bit int
y: std::string, // parameter y is a std::string
z: std::map<i32, std::string> // parameter z is a std::map
)
= {
// ...
}
The parameter type can be deduced by writing _ (the default, so it can be omitted). You can use is to declare a type constraint (e.g., a concept) that a deduced type must match, in which case _ is required. For example:
// ordinary generic function, x's type is deduced
print: (x: _) = { std::cout << x; }
print: (x) = { std::cout << x; } // same, using the _ default
// number's type is deduced, but must match the std::integral concept
calc: (number: _ is std::integral) = { /*...*/ }
There are six ways to pass parameters, which together cover all use cases. The kind is written before the parameter name:
Parameter kind | "Pass an x the function ______" | Accepts arguments that are | Special semantics | kind x: X compiles to Cpp1 as |
---|---|---|---|---|
in (default) | can read from | anything | always const; automatically passes by value if cheaply copyable | X const x or X const& x |
copy | gets a copy of | anything | acts like a normal local variable initialized with the argument | X x |
inout | can read from and write to | lvalues | | X& x |
out | writes to (including construct) | lvalues (including uninitialized) | must = assign/construct before other uses | cpp2::impl::out<X> |
move | moves from (consume the value of) | rvalues | automatically moves from every definite last use | X&& |
forward | forwards | anything | automatically forwards from every definite last use | auto&&, and if a specific type is named also a requires-constraint requiring convertibility to that type |
Note: All parameters and other objects in Cpp2 are const by default, except for local variables. For details, see Design note: const objects by default.
For example:
append_x_to_y: (
x : i32, // an i32 I can read from (i.e., const)
inout y : std::string // a string I can read from and write to
)
= {
y = y + to_string(x); // read x, read and write y
}
wrap_f: (
forward x // a generic value of deduced type I can forward
) // (omitting x's type means the same as ': _')
= {
global_counter += x; // ok to read x
f(x); // last use: automatically does 'std::forward<T>(x)'
}
Return values
A function can return either a single anonymous return value, or a return parameter list containing named return value(s). The default is -> void.
Single anonymous return values
Write -> kind X to return a single unnamed value of type X using the same kinds as in the parameters syntax, but where the only legal kinds are move (the default) or forward (with optional forward_ref; see below). The type can be -> void to signify the function has no return value. If X is not void, the function body must have a return /*value*/; statement that returns a value of type X on every path that exits the function, or must be a single expression of type X.
To deduce the return type, write _:
- -> _ deduces by-value return.
- -> forward _ deduces by-value return (if the function returns a prvalue or type member object) or by-reference return (everything else), based on the decltype of the returned expression.
- -> forward_ref _ deduces by-reference return only.
A function whose body is a single expression = expr; defaults to -> forward _ = { return expr; }.
For example:
// A function returning no value (void)
increment_in_place: (inout a: i32) -> void = { a++; }
// Or, using syntactic defaults, the following has identical meaning:
increment_in_place: (inout a: i32) = { a++; }
// A function returning a single value of type i32
add_one: (a: i32) -> i32 = { return a+1; }
// Or, using syntactic defaults, the following has identical meaning:
add_one: (a: i32) -> i32 = a+1;
// A generic function returning a single value of deduced type
add: <T: type, U: type> (a:T, b:U) -> forward _ = { return a+b; }
// Or, using syntactic defaults, the following have identical meaning:
add: (a, b) -> forward _ = a+b;
add: (a, b) a+b;
// A generic function expression returning a single value of deduced type
vec.std::ranges::sort( :(x:_, y:_) -> forward _ = { return y<x; } );
// Or, using syntactic defaults, the following has identical meaning:
vec.std::ranges::sort( :(x,y) = y<x );
// Both are identical to this, which uses the most verbose possible syntax:
vec.std::ranges::sort( :<X:type, Y:type> (x:X, y:Y) -> forward _ = { return y<x; } );
Return parameter lists: Nameable return value(s)
Write -> ( /* parameter list */ ) to return a list of named return parameters using the same parameters syntax, but where the only needed kinds are out (the default, which moves where possible) or forward. The function body must initialize the value of each return parameter ret the same way as any other local variable. An explicit return statement is written just return; and returns the named values; the function has an implicit return; at the end. If only a single return parameter is in the list, it is emitted in the lowered Cpp1 code the same way as a single anonymous return value above, so its name is only available inside the function body.
For example:
divide: (dividend: int, divisor: int) -> (quotient: int, remainder: int) = {
    if divisor == 0 {
        quotient = 0;                       // constructs quotient
        remainder = 0;                      // constructs remainder
    }
    else {
        quotient = dividend / divisor;      // constructs quotient
        remainder = dividend % divisor;     // constructs remainder
    }
}
main: () = {
    div := divide(11, 5);
    std::cout << "(div.quotient)$, (div.remainder)$\n";
}
// Prints:
// 2, 1
This next example declares a member function with multiple return values in a type named set:
set: <Key> type = {
    container: std::set<Key>;
    iterator : type == std::set<Key>::iterator;
    // A std::set::insert-like function using named return values
    // instead of just a std::pair/tuple
    insert: (inout this, value: Key) -> (where: iterator, inserted: bool) = {
        set_returned := container.insert(value);
        where = set_returned.first;
        inserted = set_returned.second;
    }
    ssize: (this) -> i64 = std::ssize(container);
    // ...
}
use_inserted_position: (_) = { }
main: () = {
    m: set<std::string> = ();
    ret := m.insert("xyzzy");
    if ret.inserted {
        use_inserted_position( ret.where );
    }
    assert( m.ssize() == 1 );
}
Function outputs are not implicitly discardable
A function's outputs are its return values, and the "out" state of any out and inout parameters.
Function outputs cannot be silently discarded. To explicitly discard a function output, assign it to _. For example:
f: () -> void = { }
g: () -> int = { return 10; }
h: (inout x: int) -> void = { x = 20; }
main: ()
= {
    f();                    // ok, no return value
    std::cout << g();       // ok, use return value
    _ = g();                // ok, explicitly discard return value
    g();                    // ERROR, return value is ignored
    {
        x := 0;
        h( x );             // ok, x is referred to again...
        std::cout << x;     // ... here, so its new value is used
    }
    {
        x := 0;
        h( x );             // ok, x is referred to again...
        _ = x;              // ... here, where its value is explicitly discarded
    }
    {
        x := 0;
        h( x );             // ERROR, this is a definite last use of x
    }                       // so x is not referred to again, and its
                            // 'out' value can't be implicitly discarded
}
Cpp2 imbues Cpp1 code with nondiscardable semantics, while staying fully compatible as usual:
- A function written in Cpp2 syntax that returns something other than void is always compiled to Cpp1 with [[nodiscard]].
- A function call written in Cpp2 x.f() member call syntax always treats a non-void return type as not discardable, even if the function was written in Cpp1 syntax that did not write [[nodiscard]].
For details and rationale, see Design note: Explicit discard.
Using function types
The same function parameter/return syntax can be used as a function type, for example to instantiate std::function or to declare a pointer-to-function variable. For example:
decorate_int: (i: i32) -> std::string = "--> (i)$ <--";
main: () = {
pf1: std::function< (i: i32) -> std::string > = decorate_int&;
std::cout << "pf1(123) returned \"(pf1(123))$\"\n";
pf2: * (i: i32) -> std::string = decorate_int&;
std::cout << "pf2(456) returned \"(pf2(456))$\"\n";
}
// Prints:
// pf1(123) returned "--> 123 <--"
// pf2(456) returned "--> 456 <--"
Control flow
if, else: Branches
if and else work as they do in C++, except that ( ) parentheses around the condition are not required. Instead, { } braces around a branch body are required. For example:
if vec.ssize() > 100 {
do_general_algorithm( container );
}
else {
do_linear_scan( vec );
}
for, while, do: Loops
do and while work as they do in C++, except that ( ) parentheses around the condition are not required. Instead, { } braces around the loop body are required.
A for range do (e) statement says "for each element in range, call it e and perform the statement." The loop parameter (e) is an ordinary parameter that can be passed using any parameter kind; as always, the default is in, which is read-only and expresses a read-only loop. The statement is not required to be enclosed in braces.
Every loop can have a next clause, which is performed at the end of each loop body execution. This makes it easy to have a counter for any loop, including a range for loop.
Note: Whitespace is just a stylistic choice. This documentation's style generally puts each keyword on its own line and lines up what follows.
For example:
words: std::vector<std::string> = ("Adam", "Betty");
i := 0;
while i < words.ssize() // while this condition is true
next i++ // and increment i after each loop body is run
{ // do this loop body
std::cout << "word: (words[i])$\n";
}
// prints:
// word: Adam
// word: Betty
do { // do this loop body
std::cout << "**\n";
}
next i-- // and decrement i after each loop body is run
while i > 0; // while this condition is true
// prints:
// **
// **
for words // for each element in 'words'
next i++ // and increment i after each loop body is run
do (inout word) // declare via 'inout' the loop can change the contents
{ // do this loop body
word = "[" + word + "]";
std::cout << "counter: (i)$, word: (word)$\n";
}
// prints:
// counter: 0, word: [Adam]
// counter: 1, word: [Betty]
There is no special "select" or "where" to perform the loop body for only a subset of matches, because this can naturally be expressed with if
. For example:
// Continuing the previous example
i = 0;
for words
next i++
do (word)
if i % 2 == 1 // if i is odd
{ // do this loop body
std::cout << "counter: (i)$, word: (word)$\n";
}
// prints:
// counter: 1, word: [Betty]
Here is the equivalent of the Cpp1 code for ( int i = 0; i < 10; ++i ){ std::cout << i; }
:
(copy i := 0)
while i < 10
next i++ {
std::cout << i;
}
Line by line:
- (copy i := 0): Any statement can have statement-local parameters, and this declares i as an int that's local to the loop. Parameters are const by default, and for not-cheap-to-copy types they bind to the original value; because we want to modify i, we say copy to explicitly declare that this is the loop's own mutable scratch variable.
- while i < 10: The termination condition.
- next i++: The end-of-loop-iteration statement. Note ++ is always postfix in Cpp2.
Loop names, break, and continue
Loops can be named using the usual name : syntax that introduces all names, and break and continue can refer to those names. For example:
outer: while i<M next i++ {             // loop named "outer"
    // ...
    inner: while j<N next j++ {         // loop named "inner"
        // ...
        if something() {
            continue inner;             // continue the inner loop
        }
        // ...
        if something_else() {
            break outer;                // break the outer loop
        }
        // ...
    }
    // ...
}
Move/forward from definite last use
In a function body, a definite last use of a local name is a single use of that name in a statement that is not in a loop, where no control flow path after that statement mentions the name again.
For each definite last use:
- If the name is a copy or move parameter or is a local object whose name does not start with guard, we know the object will not be used again before being destroyed, and so the object is automatically treated as an rvalue (move candidate). If the expression that contains the last use is able to move from the rvalue, the move will happen automatically.
- If the name is a forward parameter, the object is automatically forwarded to preserve its constness and value category (std::forward-ed).
Note: This gives language meaning to a naming convention of guard as a name prefix for "guard" stack objects, such as local std::scoped_lock objects, whose destructors are always the object's real last use.
For example:
In this example:
- x has a definite last use on one path, but not another. Line 13 is a definite last use that automatically treats x as an rvalue. However, if the else is taken, x gets no special automatic handling. Line 9 is not a definite last use because x could be used again where it is mentioned later on line 13.
- y has a definite last use on every path, in this case the same on all executions of the function. Line 19 is a definite last use that automatically treats y as an rvalue.
- z has a definite last use on every path, but unlike y it can be a different last use on different executions of the function. That's fine; each of lines 13 and 16 is a definite last use that automatically forwards the constness and value category of z.
- w has a definite last use on every path, in this case the same on all executions of the function. Line 21 is a definite last use that automatically treats w as an rvalue.
Generality note: Summary of function defaults
There is a single function syntax, designed so we can just omit the parts we're not currently using.
For example, let's express in full verbose detail that equals is a function template that has two type parameters T and U, two ordinary in parameters a and b of type T and U respectively, and a deduced return type, and its body returns the result of a == b:
equals: <T: type, U: type> (in a: T, in b: U) -> _ = { return a == b; }
We can write all that, but we don't have to.
First, : type is the default for template parameters, so we can omit it since that's what we want:
equals: <T, U> (in a: T, in b: U) -> _ = { return a == b; }
So far, the return type is already using one common default available throughout Cpp2: the wildcard _ (pronounced "don't care"). Since this function's body doesn't actually use the parameter type names T and U, we can just use wildcards for the parameter types too:
equals: (in a: _, in b: _) -> _ = { return a == b; }
Next, : _ is also the default parameter type, so we don't need to write even that:
equals: (in a, in b) -> _ = { return a == b; }
Next, in is the default parameter kind. So we can use that default too:
equals: (a, b) -> _ = { return a == b; }
We already saw that { return ... ; } is the default for a single-expression function body that deduces its return type:
equals: (a, b) -> _ = a == b;
Next, -> forward _ (fully deduced return type) is the default for single-expression functions that return something, and in this case will have the same meaning as -> _:
equals: (a, b) = a == b;
Finally, at expression scope (aka "lambda/temporary") functions/objects aren't named, and the trailing ; is optional:
:(a, b) = a == b
Here are some additional examples of unnamed function expressions:
std::ranges::for_each( a, :(x) = std::cout << x );
std::ranges::transform( a, std::back_inserter(b), :(x) = x+1 );
where_is = std::ranges::find_if( b, :(x) = x == waldo$ );
Note: Cpp2 doesn't have a separate "lambda" syntax; you just use the regular function syntax at expression scope to write an unnamed function, and the syntactic defaults are chosen to make such function expressions convenient to write. And because in Cpp2 every local variable capture (for example, waldo$ above) is written in the body, it doesn't affect the function syntax.