Macro Implementation
This section describes how to define and use Cangjie macros, which are classified into non-attribute macros and attribute macros. It also explains how macros behave when they are nested.
Non-attribute Macros
Non-attribute macros receive only code and reject other parameters (attributes). They are defined in the following format:
import std.ast.*
public macro MacroName(args: Tokens): Tokens {
... // Macro body
}
A macro can be called in the following format:
@MacroName(...)
A macro is called with (). The content in the parentheses can be any valid tokens or blank.
When a macro is applied to a declaration, the parentheses can usually be omitted. Some examples follow.
@MacroName func name() {} // Before a FuncDecl
@MacroName struct name {} // Before a StructDecl
@MacroName class name {} // Before a ClassDecl
@MacroName var a = 1 // Before a VarDecl
@MacroName enum e {} // Before an Enum
@MacroName interface i {} // Before an InterfaceDecl
@MacroName extend e <: i {} // Before an ExtendDecl
@MacroName mut prop i: Int64 {} // Before a PropDecl
@MacroName @AnotherMacro(input) // Before a macro call
Valid Tokens in the parentheses must meet the following requirements:
- The input must be a sequence of valid Tokens. Symbols such as "#", "`", and "\" cannot be used on their own as input, because they are not valid Cangjie Tokens.
- If the input contains unmatched parentheses, escape them with the escape character ("\").
- If the at sign "@" needs to be included as an input Token, escape it with the escape character ("\").
The following example illustrates these special requirements.
// Illegal input Tokens
@MacroName(#) // Not a whole Token
@MacroName(`) // Not a whole Token
@MacroName(() // ( and ) not match
@MacroName(\[) // Escape for unsupported symbol
// Legal input Tokens
@MacroName(#"abc"#)
@MacroName(`class`)
@MacroName([)
@MacroName([])
@MacroName(\()
@MacroName(\@)
Macro expansion operates on the Cangjie syntax tree. After a macro is expanded, the compiler continues with the rest of the compilation, so the expanded code must still be valid Cangjie code. Otherwise, compilation errors may occur. When a macro is applied to a declaration and the parentheses are omitted, the macro input must be a syntactically valid declaration. In this case, the IDE also provides syntax checking and highlighting.
Here are some typical examples of macro application.
- Example 1
Macro definition file macro_definition.cj
macro package macro_definition
import std.ast.*
public macro testDef(input: Tokens): Tokens {
    println("I'm in macro body")
    return input
}
Macro call file macro_call.cj
package macro_calling
import macro_definition.*
main(): Int64 {
    println("I'm in function body")
    let a: Int64 = @testDef(1 + 2)
    println("a = ${a}")
    return 0
}
For details about how to compile the preceding code, see Compiling and Using Macros.
The macro definition is evaluated while macro_call.cj is being compiled, so the line I'm in macro body from the macro definition is printed at compile time. The macro call point is also expanded. Take the following line as an example:
let a: Int64 = @testDef(1 + 2)
The compiler inserts the Tokens returned by the macro into the syntax tree at the call point, which yields the following code:
let a: Int64 = 1 + 2
That is, the code in the executable program effectively becomes the following:
main(): Int64 {
    println("I'm in function body")
    let a: Int64 = 1 + 2
    println("a = ${a}")
    return 0
}
The value of a is calculated to be 3, so a is printed as 3. The execution result of the preceding program is therefore:
I'm in function body
a = 3
A more meaningful use of macros is processing functions. In the following example, the ModifyFunc macro adds a parameter (id: Int64 in the code below) to MyFunc and inserts code before and after counter++.
- Example 2
Macro definition file macro_definition.cj
// file macro_definition.cj
macro package macro_definition
import std.ast.*
public macro ModifyFunc(input: Tokens): Tokens {
    println("I'm in macro body")
    let funcDecl = FuncDecl(input)
    return quote(
        func $(funcDecl.identifier)(id: Int64) {
            println("start ${id}")
            $(funcDecl.block.nodes)
            println("end")
        })
}
Macro call file macro_call.cj
package macro_calling
import macro_definition.*
var counter = 0
@ModifyFunc
func MyFunc() {
    counter++
}
func exModifyFunc() {
    println("I'm in function body")
    MyFunc(123)
    println("MyFunc called: ${counter} times")
    return 0
}
main(): Int64 {
    exModifyFunc()
}
Likewise, the preceding two code segments are located in different files. The macro definition file macro_definition.cj is compiled first, and then the macro call file macro_call.cj is compiled to generate an executable file.
In this example, the input of the ModifyFunc macro is a function declaration, so the parentheses can be omitted.
@ModifyFunc func MyFunc() { counter++ }
The following code is obtained after macro expansion:
func MyFunc(id: Int64) {
    println("start ${id}")
    counter++
    println("end")
}
MyFunc is called (through exModifyFunc, which main invokes) with an argument that matches the added parameter, so a valid Cangjie program is formed. The following information is printed during program execution:
I'm in function body
start 123
end
MyFunc called: 1 times
Attribute Macros
Compared with a non-attribute macro, an attribute macro adds to its definition another input parameter of the Tokens type, which allows developers to provide additional information. For example, a developer who wants to use different macro expansion strategies in different call scenarios can use the attribute to tag a setting. The attribute parameter can also accept arbitrary tokens, which can be combined with the code that the macro modifies. The following is a simple example.
// Macro definition with attribute
public macro Foo(attrTokens: Tokens, inputTokens: Tokens): Tokens {
return attrTokens + inputTokens // Concatenate attrTokens and inputTokens.
}
As shown in the preceding macro definition, an attribute macro takes two input parameters, both of the Tokens type. Transformations such as combination and concatenation can be performed on attrTokens and inputTokens, and a new Tokens is returned at the end.
Macros with and without attributes are called in similar ways. When an attribute macro is called, the additional input parameter attrTokens is passed through []. The call forms are as follows:
// attribute macro with parentheses
var a: Int64 = @Foo[1+](2+3)
// attribute macro without parentheses
@Foo[public]
struct Data {
var count: Int64 = 100
}
- The preceding code shows calls to the Foo macro. In the first call, the parameter 2+3 is concatenated with the attribute 1+ in [], and var a: Int64 = 1+2+3 is obtained after macro expansion.
- In the second call, the parameter is struct Data, which is concatenated with the attribute public in []. The following is obtained after macro expansion:
public struct Data {
    var count: Int64 = 100
}
Be mindful of the following points about attribute macros:
- Attribute macros can modify the same AST nodes as non-attribute macros. They can be regarded as non-attribute macros with an enhanced set of input parameters.
- The validity requirements for the parameters in the parentheses of attribute macros are the same as those for non-attribute macros.
- Valid parameters (attributes) in the square brackets of attribute macros must meet the following requirements:
  - The input must be a sequence of valid Tokens. Symbols such as "#", "`", and "\" cannot be used on their own as input, because they are not valid Cangjie Tokens.
  - If the input contains unmatched square brackets, escape them with the escape character ("\").
  - If the at sign "@" needs to be included as an input Token, escape it with the escape character ("\").
// Illegal attribute Tokens
@MacroName[#]() // Not a whole Token
@MacroName[`]() // Not a whole Token
@MacroName[@]() // Not escape for @
@MacroName[[]() // [ and ] not match
@MacroName[\(]() // Escape for unsupported symbol
// Legal attribute Tokens
@MacroName[#"abc"#]()
@MacroName[`class`]()
@MacroName[(]()
@MacroName[()]()
@MacroName[\[]()
@MacroName[\@]()
- The macro definition must be consistent with the call type. If a macro definition has two input parameters, it is an attribute macro, so [] must be present at the call site, although its content can be left blank. If the macro definition has one input parameter, it is a non-attribute macro, and [] cannot be used at the call site.
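The consistency rule can be sketched with the Foo attribute macro defined earlier and a hypothetical non-attribute macro Bar. Bar, the package names, and the variable names are assumptions introduced only for illustration. The two parts below belong to separate files, and the commented-out lines show the call forms that the rule rejects.

```cangjie
// ---- File 1: macro definitions (compiled first) ----
macro package macro_definition
import std.ast.*

// Attribute macro: two Tokens parameters, so [] is required at the call site.
public macro Foo(attrTokens: Tokens, inputTokens: Tokens): Tokens {
    return attrTokens + inputTokens
}

// Hypothetical non-attribute macro: one Tokens parameter, so [] is not allowed.
public macro Bar(input: Tokens): Tokens {
    return input
}

// ---- File 2: macro calls ----
package macro_calling
import macro_definition.*

let x: Int64 = @Foo[](1 + 2)      // empty attribute: expands to 1 + 2
let y: Int64 = @Foo[1 +](2)       // attribute tokens are prepended: expands to 1 + 2
let z: Int64 = @Bar(1 + 2)        // non-attribute call, no []

// let e1: Int64 = @Foo(1 + 2)    // rejected: Foo is an attribute macro and needs []
// let e2: Int64 = @Bar[](1 + 2)  // rejected: Bar is a non-attribute macro
```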
Nested Macros
The Cangjie language does not support nested macro definitions. Under certain conditions, however, macro calls can be nested inside macro definitions and inside other macro calls.
Nesting Macro Calls in a Macro Definition
The following is an example of a macro definition containing other macro calls.
The macro package pkg1 defines the getIdent macro.
macro package pkg1
import std.ast.*
public macro getIdent(attr:Tokens, input:Tokens):Tokens {
return quote(
let decl = (parseDecl(input) as VarDecl).getOrThrow()
let name = decl.identifier.value
let size = name.size - 1
let $(attr) = Token(TokenKind.IDENTIFIER, name[0..size])
)
}
The macro package pkg2 defines the Prop macro, in which the getIdent macro is called.
macro package pkg2
import std.ast.*
import pkg1.*
public macro Prop(input:Tokens):Tokens {
let v = parseDecl(input)
@getIdent[ident](input)
return quote(
$(input)
public prop $(ident): $(decl.declType) {
get() {
this.$(v.identifier)
}
}
)
}
The pkg3 package calls the Prop macro.
package pkg3
import pkg2.*
class A {
@Prop
private let a_: Int64 = 1
}
main() {
let b = A()
println("${b.a}")
}
Note that the preceding three files must be compiled in the order pkg1, pkg2, pkg3, because a macro definition must be compiled before its call point. The definition of the Prop macro in pkg2 is as follows:
public macro Prop(input:Tokens):Tokens {
let v = parseDecl(input)
@getIdent[ident](input)
return quote(
$(input)
public prop $(ident): $(decl.declType) {
get() {
this.$(v.identifier)
}
}
)
}
Before compilation, it is expanded into the following code:
public macro Prop(input: Tokens): Tokens {
let v = parseDecl(input)
let decl = (parseDecl(input) as VarDecl).getOrThrow()
let name = decl.identifier.value
let size = name.size - 1
let ident = Token(TokenKind.IDENTIFIER, name[0 .. size])
return quote(
$(input)
public prop $(ident): $(decl.declType) {
get() {
this.$(v.identifier)
}
}
)
}
Nesting Macro Calls in a Macro Call
A common scenario of macro nesting is that macros are called in the code blocks modified by a macro. Here is an example:
The pkg1 package defines the Foo and Bar macros.
macro package pkg1
import std.ast.*
public macro Foo(input: Tokens): Tokens {
return input
}
public macro Bar(input: Tokens): Tokens {
return input
}
The pkg2 package defines the addToMul macro.
macro package pkg2
import std.ast.*
public macro addToMul(inputTokens: Tokens): Tokens {
var expr: BinaryExpr = match (parseExpr(inputTokens) as BinaryExpr) {
case Some(v) => v
case None => throw Exception()
}
var op0: Expr = expr.leftExpr
var op1: Expr = expr.rightExpr
return quote(($(op0)) * ($(op1)))
}
The pkg3 package uses the three macros defined above:
package pkg3
import pkg1.*
import pkg2.*
@Foo
struct Data {
let a = 2
let b = @addToMul(2+3)
@Bar
public func getA() {
return a
}
public func getB() {
return b
}
}
main(): Int64 {
let data = Data()
var a = data.getA() // a = 2
var b = data.getB() // b = 6
println("a: ${a}, b: ${b}")
return 0
}
As shown in the preceding code, the macro Foo modifies struct Data, in which the addToMul and Bar macros are called. In such a nesting scenario, the rule for code transformation is to expand the inner macros (addToMul and Bar) within the nested structure before expanding the outer macro (Foo). The same rule applies when macros are nested in multiple layers.
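To make the expansion order concrete, the sketch below traces how struct Data from pkg3 would be rewritten step by step. It is a hand-written illustration derived from the macro definitions above, not actual compiler output.

```cangjie
// Stage 1: the inner macros are expanded first.
//   @addToMul(2+3) rewrites the addition into a multiplication: (2) * (3)
//   @Bar returns its input unchanged, so getA stays as written.
// The outer macro Foo has not been expanded yet.
@Foo
struct Data {
    let a = 2
    let b = (2) * (3)
    public func getA() {
        return a
    }
    public func getB() {
        return b
    }
}

// Stage 2: the outer macro Foo is expanded. Foo also returns its input
// unchanged, so the final declaration is the struct above without @Foo.
```

This matches the comments in main: getA() returns 2 and getB() returns 6.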
Macros can be nested in macro calls with or without parentheses. The two types of macro calls can be combined only if there is no ambiguity and the macro expansion sequence is clear.
var a = @foo(@foo1(2 * 3)+@foo2(1 + 3)) // foo1, foo2 have to be defined.
@Foo1 // Foo2 expands first, then Foo1 expands.
@Foo2[attr: struct] // Attribute macro can be used in nested macro.
struct Data{
@Foo3 @Foo4[123] var a = @bar1(@bar2(2 + 3) + 3) // bar2, bar1, Foo4, Foo3 expands in order.
public func getA() {
return @foo(a + 2)
}
}
Message Transfer Between Nested Macros
Here we focus on the message transfer between macros nested in macro calls.
An inner macro can call the assertParentContext library function to ensure that it is nested in a specific outer macro call. If the inner macro calls the function but is not nested in the given outer macro call, the function throws an error. The insideParentContext library function performs the same check but returns a Boolean value instead of throwing an error. The following is a simple example.
The macros are defined as follows:
public macro Outer(input: Tokens): Tokens {
return input
}
public macro Inner(input: Tokens): Tokens {
assertParentContext("Outer")
return input
}
The macros are called as follows:
@Outer var a = 0
@Inner var b = 0 // Error, The macro call 'Inner' should with the surround code contains a call 'Outer'.
As shown in the preceding code, the Inner macro uses the assertParentContext function to check whether it is called within the Outer macro. In the call example, no such nesting relationship exists between the two macros, so the compiler reports an error.
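Where a hard error is not wanted, the Boolean check can be used instead. The following is a minimal sketch, assuming the function is spelled insideParentContext and takes the outer macro name as a String in the same way as assertParentContext. The package name define and the printed messages are illustrative only.

```cangjie
macro package define
import std.ast.*

public macro Outer(input: Tokens): Tokens {
    return input
}

public macro Inner(input: Tokens): Tokens {
    // Assumption: insideParentContext(String) returns true when this call
    // is nested in a call to the named outer macro, and false otherwise.
    if (insideParentContext("Outer")) {
        println("Inner is nested in Outer")
    } else {
        println("Inner is used outside Outer")
    }
    // The input is returned unchanged in both cases.
    return input
}
```

Because the println calls sit in the macro body, the messages would appear while the calling file is compiled, not when the program runs.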
Inner macros can communicate with the outer macro by sending key/value pairs. When an inner macro is executed, it calls the standard library function setItem to send a message to the outer macro. When the outer macro is executed afterwards, it calls the standard library function getChildMessages to receive the messages (a set of key/value mappings) sent by each inner macro. The following is a simple example.
The macro is defined as follows:
macro package define
import std.ast.*
public macro Outer(input: Tokens): Tokens {
let messages = getChildMessages("Inner")
let getTotalFunc = quote(public func getCnt() {
)
for (m in messages) {
let identName = m.getString("identifierName")
// let value = m.getString("key") // Receive multiple groups of messages
getTotalFunc.append(Token(TokenKind.IDENTIFIER, identName))
getTotalFunc.append(quote(+))
}
getTotalFunc.append(quote(0))
getTotalFunc.append(quote(}))
let funcDecl = parseDecl(getTotalFunc)
let decl = (parseDecl(input) as ClassDecl).getOrThrow()
decl.body.decls.append(funcDecl)
return decl.toTokens()
}
public macro Inner(input: Tokens): Tokens {
assertParentContext("Outer")
let decl = parseDecl(input)
setItem("identifierName", decl.identifier.value)
// setItem("key", "value") // Transfer multiple groups of messages through different key values
return input
}
The macro is called as follows:
import define.*
@Outer
class Demo {
@Inner var state = 1
@Inner var cnt = 42
}
main(): Int64 {
let d = Demo()
println("${d.getCnt()}")
return 0
}
In the preceding code, Outer receives the variable names sent by the two Inner macros and automatically adds the following content to the class:
public func getCnt() {
state + cnt + 0
}
In detail, each Inner macro sends a message to the outer macro through setItem. The Outer macro receives the group of message objects sent by the Inner macros through the getChildMessages function (Inner can be called multiple times inside Outer) and reads the value stored under a given key in each message object through the getString function.