Macro Implementation

This section describes how to define and use Cangjie macros, which fall into two categories: non-attribute macros and attribute macros. It also describes how nested macros behave.

Non-attribute Macros

Non-attribute macros receive only code and reject other parameters (attributes). They are defined in the following format:

import std.ast.*

public macro MacroName(args: Tokens): Tokens {
    ... // Macro body
}

A macro can be called in the following format:

@MacroName(...)

A macro is called with (). The parentheses can contain any sequence of valid tokens or be left empty.

When a macro is applied to a declaration, the parentheses can usually be omitted. Some examples follow.

@MacroName func name() {}        // Before a FuncDecl
@MacroName struct name {}        // Before a StructDecl
@MacroName class name {}         // Before a ClassDecl
@MacroName var a = 1             // Before a VarDecl
@MacroName enum e {}             // Before an EnumDecl
@MacroName interface i {}        // Before an InterfaceDecl
@MacroName extend e <: i {}      // Before an ExtendDecl
@MacroName mut prop i: Int64 {}  // Before a PropDecl
@MacroName @AnotherMacro(input)  // Before a macro call

Valid Tokens in the parentheses must meet the following requirements:

  • The input must be a sequence composed of valid Tokens. Symbols like "#", "`", and "\" cannot be used individually as input, as they are not valid Cangjie Tokens.

  • If the input contains unmatched parentheses, use the escape character ("\") to escape them.

  • If the at sign "@" needs to be included as the input Token, use the escape character ("\") to escape it.

The following example illustrates these special requirements.

// Illegal input Tokens
@MacroName(#)    // Not a complete Token
@MacroName(`)    // Not a complete Token
@MacroName(()    // Unmatched ( and )
@MacroName(\[)   // Escaping is not supported for this symbol

// Legal input Tokens
@MacroName(#"abc"#)
@MacroName(`class`)
@MacroName([)
@MacroName([])
@MacroName(\()
@MacroName(\@)

Macro expansion operates on the Cangjie syntax tree. After a macro is expanded, the compiler proceeds with the remaining compilation steps, so the expanded code must still be valid Cangjie code; otherwise, compilation errors may occur. When a macro is applied to a declaration and the parentheses are omitted, the macro input must be a syntactically valid declaration. In this case, the IDE also provides syntax checking and highlighting.

Here are some typical examples of macro application.

  • Example 1

    Macro definition file macro_definition.cj

    macro package macro_definition
    
    import std.ast.*
    
    public macro testDef(input: Tokens): Tokens {
        println("I'm in macro body")
        return input
    }
    

    Macro call file macro_call.cj

    package macro_calling
    
    import macro_definition.*
    
    main(): Int64 {
        println("I'm in function body")
        let a: Int64 = @testDef(1 + 2)
        println("a = ${a}")
        return 0
    }
    

    For details about how to compile the preceding code, see Compiling and Using Macros.

    The string I'm in macro body from the macro definition is printed while macro_call.cj is being compiled, because that is when the macro definition is evaluated. At the same time, the macro call point is expanded. Take the following line as an example:

    let a: Int64 = @testDef(1 + 2)
    

    The compiler inserts the Tokens returned by the macro into the syntax tree at the call point, producing the following code:

    let a: Int64 = 1 + 2
    

    That is, the actual code in the executable program turns into the following:

    main(): Int64 {
        println("I'm in function body")
        let a: Int64 = 1 + 2
        println("a = ${a}")
        return 0
    }
    

    The value of a is calculated to be 3, so the program prints the following when executed:

    I'm in function body
    a = 3
    

A more meaningful example of processing a function with a macro follows. The ModifyFunc macro adds the id parameter to MyFunc and inserts code before and after counter++.

  • Example 2

    Macro definition file macro_definition.cj

    // file macro_definition.cj
    macro package macro_definition
    
    import std.ast.*
    
    public macro ModifyFunc(input: Tokens): Tokens {
        println("I'm in macro body")
        let funcDecl = FuncDecl(input)
        return quote(
        func $(funcDecl.identifier)(id: Int64) {
            println("start ${id}")
            $(funcDecl.block.nodes)
            println("end")
        })
    }
    

    Macro call file macro_call.cj

    package macro_calling
    
    import macro_definition.*
    
    var counter = 0
    
    @ModifyFunc
    func MyFunc() {
        counter++
    }
    
    func exModifyFunc() {
        println("I'm in function body")
        MyFunc(123)
        println("MyFunc called: ${counter} times")
        return 0
    }
    
    main(): Int64 {
        exModifyFunc()
    }
    

    As in Example 1, the two code segments are located in different files. The macro definition file macro_definition.cj must be compiled before the macro call file macro_call.cj, which is then compiled into an executable file.

    In this example, the input to the ModifyFunc macro is a function declaration, so the parentheses can be omitted.

    @ModifyFunc
    func MyFunc() {
        counter++
    }
    

    The following code is obtained after macro expansion:

    func MyFunc(id: Int64) {
        println("start ${id}")
        counter++
        println("end")
    }
    

    MyFunc is called in exModifyFunc with the argument 123, which matches the id parameter added by the macro, so the expanded program is valid Cangjie code. The following is printed during program execution:

    I'm in function body
    start 123
    end
    MyFunc called: 1 times
    

Attribute Macros

Compared with a non-attribute macro, an attribute macro adds an extra input parameter of the Tokens type to its definition, which allows developers to provide additional information. For example, a developer who wants different macro expansion strategies in different call scenarios can use the attribute to tag a setting. The attribute parameter can also accept any tokens, which can be combined with the code that the macro modifies. The following is a simple example.

// Macro definition with attribute
public macro Foo(attrTokens: Tokens, inputTokens: Tokens): Tokens {
    return attrTokens + inputTokens  // Concatenate attrTokens and inputTokens.
}

As the definition shows, an attribute macro takes two input parameters, both of the Tokens type. attrTokens and inputTokens can be transformed in various ways, such as being combined or concatenated, and a new Tokens value is returned at the end.

Macros with and without attributes are called in similar ways. When an attribute macro is called, the additional parameter attrTokens is passed in []. The call forms are as follows:

// attribute macro with parentheses
var a: Int64 = @Foo[1+](2+3)

// attribute macro without parentheses
@Foo[public]
struct Data {
    var count: Int64 = 100
}

  • In the first call, the parameter 2+3 is concatenated with the attribute 1+ in [], so the macro expands to var a: Int64 = 1+2+3.

  • In the second call, the parameter is struct Data, which is concatenated with the attribute public in []. The following is obtained after macro expansion:

    public struct Data {
        var count: Int64 = 100
    }
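
The idea of using the attribute to select an expansion strategy can be sketched as follows. This is a minimal illustration, not part of the original examples: the Log macro and the verbose tag are hypothetical names, and the sketch assumes that Tokens offers size and index access and that Token exposes value, as provided by std.ast. Like ModifyFunc in Example 2, it only handles functions without parameters.

```cangjie
macro package macro_definition

import std.ast.*

// Hypothetical attribute macro: the attribute decides how the function is expanded.
public macro Log(attrTokens: Tokens, inputTokens: Tokens): Tokens {
    // Strategy 1: the attribute `verbose` wraps the function body in log output.
    if (attrTokens.size == 1 && attrTokens[0].value == "verbose") {
        let funcDecl = FuncDecl(inputTokens)
        return quote(
        func $(funcDecl.identifier)() {
            println("enter")
            $(funcDecl.block.nodes)
            println("leave")
        })
    }
    // Strategy 2: any other attribute (or none) leaves the declaration unchanged.
    return inputTokens
}

// Possible call sites in another package (shown as comments):
// @Log[verbose]
// func work() { counter++ }   // body is wrapped with "enter"/"leave"
//
// @Log[]
// func quiet() { counter++ }  // declaration is returned as is
```

Keeping the strategy in the attribute means that the code being modified stays untouched at the call site; only the tag in [] changes.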
    

Be mindful of the following points about attribute macros:

  • Attribute macros can modify the same AST nodes as non-attribute macros; they only extend what can be passed in as input.

  • The validity requirements for the parameters in the parentheses in attribute macros are the same as those in non-attribute macros.

  • Valid parameters (attributes) in the square brackets in attribute macros must meet the following requirements:

    • The input must be a sequence composed of valid Tokens. Symbols like "#", "`", and "\" cannot be used individually as input, as they are not valid Cangjie Tokens.

    • If the input contains unmatched square brackets, use the escape character ("\") to escape them.

    • If the at sign "@" needs to be included as the input Token, use the escape character ("\") to escape it.

    // Illegal attribute Tokens
    @MacroName[#]()    // Not a complete Token
    @MacroName[`]()    // Not a complete Token
    @MacroName[@]()    // @ is not escaped
    @MacroName[[]()    // Unmatched [ and ]
    @MacroName[\(]()   // Escaping is not supported for this symbol
    
    // Legal attribute Tokens
    @MacroName[#"abc"#]()
    @MacroName[`class`]()
    @MacroName[(]()
    @MacroName[()]()
    @MacroName[\[]()
    @MacroName[\@]()
    
  • The macro definition must match the call form. A macro definition with two input parameters is an attribute macro, so [] must be present at the call site, although its content can be left empty. A macro definition with one input parameter is a non-attribute macro, so [] cannot be used in the call.
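
The last rule can be illustrated with a short sketch. It assumes that the two-parameter Foo macro defined earlier in this section is imported; the variable names are chosen only for illustration.

```cangjie
// Foo is declared with (attrTokens, inputTokens), so it is an attribute macro.
var a: Int64 = @Foo[1+](2+3)   // attribute is 1+; expands to 1+2+3
var b: Int64 = @Foo[](2+3)     // attribute left empty; expands to 2+3
// var c: Int64 = @Foo(2+3)    // not allowed: an attribute macro must be called with []
```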

Nested Macros

The Cangjie language does not support nested macro definitions. However, under certain conditions, macro calls can be nested in macro definitions and in other macro calls.

Nesting Macro Calls in a Macro Definition

The following is an example of a macro definition containing other macro calls.

The macro package pkg1 defines the getIdent macro.

macro package pkg1

import std.ast.*

public macro getIdent(attr:Tokens, input:Tokens):Tokens {
    return quote(
        let decl = (parseDecl(input) as VarDecl).getOrThrow()
        let name = decl.identifier.value
        let size = name.size - 1
        let $(attr) = Token(TokenKind.IDENTIFIER, name[0..size])
    )
}

The macro package pkg2 defines the Prop macro, in which the getIdent macro is called.

macro package pkg2

import std.ast.*
import pkg1.*

public macro Prop(input:Tokens):Tokens {
    let v = parseDecl(input)
    @getIdent[ident](input)
    return quote(
        $(input)
        public prop $(ident): $(decl.declType) {
            get() {
                this.$(v.identifier)
            }
        }
    )
}

The pkg3 package calls the Prop macro.

package pkg3

import pkg2.*
class A {
    @Prop
    private let a_: Int64 = 1
}

main() {
    let b = A()
    println("${b.a}")
}

Note that the three files must be compiled in the order pkg1, pkg2, pkg3, because a macro definition must be compiled before any point where the macro is called. The definition of the Prop macro in pkg2 is as follows:

public macro Prop(input:Tokens):Tokens {
    let v = parseDecl(input)
    @getIdent[ident](input)
    return quote(
        $(input)
        public prop $(ident): $(decl.declType) {
            get() {
                this.$(v.identifier)
            }
        }
    )
}

Before compilation, the nested getIdent call is expanded, and the definition becomes the following code:

public macro Prop(input: Tokens): Tokens {
    let v = parseDecl(input)

    let decl = (parseDecl(input) as VarDecl).getOrThrow()
    let name = decl.identifier.value
    let size = name.size - 1
    let ident = Token(TokenKind.IDENTIFIER, name[0 .. size])

    return quote(
        $(input)
        public prop $(ident): $(decl.declType) {
            get() {
                this.$(v.identifier)
            }
        }
    )
}

Nesting Macro Calls in a Macro Call

A common scenario of macro nesting is that macros are called in the code blocks modified by a macro. Here is an example:

The pkg1 package defines the Foo and Bar macros.

macro package pkg1

import std.ast.*

public macro Foo(input: Tokens): Tokens {
    return input
}

public macro Bar(input: Tokens): Tokens {
    return input
}

The pkg2 package defines the addToMul macro.

macro package pkg2

import std.ast.*

public macro addToMul(inputTokens: Tokens): Tokens {
    var expr: BinaryExpr = match (parseExpr(inputTokens) as BinaryExpr) {
        case Some(v) => v
        case None => throw Exception()
    }
    var op0: Expr = expr.leftExpr
    var op1: Expr = expr.rightExpr
    return quote(($(op0)) * ($(op1)))
}

The pkg3 package uses the three macros defined above:

package pkg3

import pkg1.*
import pkg2.*
@Foo
struct Data {
    let a = 2
    let b = @addToMul(2+3)

    @Bar
    public func getA() {
        return a
    }

    public func getB() {
        return b
    }
}

main(): Int64 {
    let data = Data()
    var a = data.getA() // a = 2
    var b = data.getB() // b = 6
    println("a: ${a}, b: ${b}")
    return 0
}

As shown in the preceding code, the macro Foo modifies struct Data, in which the addToMul and Bar macros are called. In such a nesting scenario, the rule for code transformation is to expand the inner macros (addToMul and Bar) within the nested structure before expanding the outer macro (Foo). The same rule applies to the scenario where multi-layer macros are nested.
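
Applied to the pkg3 example, the inner-first rule can be traced step by step. The following sketch shows the struct as it would look once all three macros have expanded, derived from the macro definitions above (Foo and Bar return their input unchanged, and addToMul rewrites a + b as (a) * (b)). The exact token layout produced by the compiler may differ.

```cangjie
// Step 1: inner macros expand first.
//   @addToMul(2+3) becomes (2) * (3)
//   @Bar returns getA unchanged
// Step 2: the outer @Foo receives the already expanded struct and returns it unchanged.
struct Data {
    let a = 2
    let b = (2) * (3)

    public func getA() {
        return a
    }

    public func getB() {
        return b
    }
}
```

This is why data.getB() yields 6 rather than 5 in main.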

Macros can be nested in macro calls with or without parentheses. The two types of macro calls can be combined only if there is no ambiguity and the macro expansion sequence is clear.

var a = @foo(@foo1(2 * 3)+@foo2(1 + 3))  // foo1 and foo2 must be defined.

@Foo1 // Foo2 expands first, then Foo1.
@Foo2[attr: struct] // Attribute macros can be used in nested macros.
struct Data{
    @Foo3 @Foo4[123] var a = @bar1(@bar2(2 + 3) + 3)  // bar2, bar1, Foo4, and Foo3 expand in that order.
    public func getA() {
        return @foo(a + 2)
    }
}

Message Transfer Between Nested Macros

Here we focus on the message transfer between macros nested in macro calls.

An inner macro can call the assertParentContext library function to ensure that it is nested in a specific outer macro call. If the inner macro calls this function but is not nested in the given outer macro call, the function throws an error. The insideParentContext library function performs the same check but returns a Boolean value instead. The following is a simple example.

The macros are defined as follows:

public macro Outer(input: Tokens): Tokens {
    return input
}

public macro Inner(input: Tokens): Tokens {
    assertParentContext("Outer")
    return input
}

The macros are called as follows:

@Outer var a = 0
@Inner var b = 0 // Error, The macro call 'Inner' should with the surround code contains a call 'Outer'.

As shown in the preceding code, the Inner macro calls assertParentContext to check whether it is nested in an Outer macro call. In the call example, Inner is not nested in Outer, so the compiler reports an error.
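
For comparison, the Boolean check can be sketched as follows. This variant of Inner is hypothetical and is not part of the original example; it assumes that insideParentContext takes the outer macro name as a String, in the same way as assertParentContext. The println calls run during macro expansion, so their output appears at compile time.

```cangjie
macro package define

import std.ast.*

public macro Outer(input: Tokens): Tokens {
    return input
}

// Hypothetical variant of Inner that tolerates being used outside Outer.
public macro Inner(input: Tokens): Tokens {
    if (insideParentContext("Outer")) {
        println("Inner is nested in Outer")
    } else {
        // No error is raised here; the macro simply takes another path.
        println("Inner is used on its own")
    }
    return input
}
```

With this variant, the call @Inner var b = 0 outside of Outer would compile instead of being rejected, which is useful when a macro should work both nested and standalone.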

Inner macros can communicate with an outer macro by sending key/value pairs. During execution, an inner macro calls the standard library function setItem to send a message to the outer macro. The outer macro, when executed, calls the standard library function getChildMessages to receive the messages (a set of key/value mappings) sent by each inner macro. The following is a simple example.

The macros are defined as follows:

macro package define

import std.ast.*

public macro Outer(input: Tokens): Tokens {
    let messages = getChildMessages("Inner")

    let getTotalFunc = quote(public func getCnt() {
                       )
    for (m in messages) {
        let identName = m.getString("identifierName")
        // let value = m.getString("key")            // Receive multiple groups of messages
        getTotalFunc.append(Token(TokenKind.IDENTIFIER, identName))
        getTotalFunc.append(quote(+))
    }
    getTotalFunc.append(quote(0))
    getTotalFunc.append(quote(}))
    let funcDecl = parseDecl(getTotalFunc)

    let decl = (parseDecl(input) as ClassDecl).getOrThrow()
    decl.body.decls.append(funcDecl)
    return decl.toTokens()

}

public macro Inner(input: Tokens): Tokens {
    assertParentContext("Outer")
    let decl = parseDecl(input)
    setItem("identifierName", decl.identifier.value)
    // setItem("key", "value")                      // Transfer multiple groups of messages through different keys
    return input
}

The macros are called as follows:

import define.*

@Outer
class Demo {
    @Inner var state = 1
    @Inner var cnt = 42
}

main(): Int64 {
    let d = Demo()
    println("${d.getCnt()}")
    return 0
}

In the preceding code, Outer receives the variable names sent by two Inner macros and automatically adds the following content to the class:

public func getCnt() {
    state + cnt + 0
}

The process is as follows: each Inner macro sends a message to the outer macro through setItem. The Outer macro receives the group of message objects sent by the Inner macros through getChildMessages (Inner can be called multiple times within Outer) and reads the corresponding value from each message object through getString.
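
The commented-out lines in the example hint that one message can carry several key/value pairs. A minimal sketch of that case follows. The second key, role, and its value are hypothetical and chosen only for illustration; the sketch uses only the functions that already appear above (setItem, getChildMessages, getString).

```cangjie
macro package define

import std.ast.*

public macro Outer(input: Tokens): Tokens {
    // One message object per Inner call; each object holds every pair that call sent.
    for (m in getChildMessages("Inner")) {
        let name = m.getString("identifierName")
        let role = m.getString("role")
        println("Inner sent: ${name} (${role})")   // printed at compile time
    }
    return input
}

public macro Inner(input: Tokens): Tokens {
    assertParentContext("Outer")
    let decl = parseDecl(input)
    setItem("identifierName", decl.identifier.value)
    setItem("role", "field")                        // second pair under a different key
    return input
}
```

Using a distinct key for each piece of information lets the outer macro read exactly the values it needs without parsing them out of a single string.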