Static Compilation Optimization

The Cangjie compiler is modular: an intermediate representation (IR) carries the program between compilation stages, so individual optimizations do not interfere with one another, and both the optimizations and the compilation pipeline can be adapted flexibly. Cangjie is statically compiled: Cangjie programs and core library code are compiled into machine code, which speeds up program execution.

GC Optimizations

Cangjie static compilation adds many optimizations designed jointly with the runtime, for example for reading and writing heap objects, creating heap objects, and the signal mechanism of heap memory management. Together, static analysis and these joint optimizations speed up object creation, object reads and writes, and member function calls in Cangjie programs.

When objects are read and written, the Cangjie static backend uses vectorized access to sustain data read/write and compute throughput, minimizing the performance impact of function calls. Analyzing the live scope of heap objects also lets the static backend decide where each object is allocated. Whether an object ends up in the heap, on the stack, or in the constant area, the static backend can optimize the allocation based on the object's characteristics.

Because references on the stack are recorded precisely, Cangjie can collect GC information quickly. Precise stack recording shrinks the garbage collection root set and avoids redundant checks of whether an address is an object pointer. In the scanning and fixing phases, it also ensures that the collector runs effectively. Building on these GC functions, Cangjie optimizes the fast paths for object creation, reading, and writing. As shown in the following figure, when a memory access operation is compiled, the compiler generates a fast path together with an instruction that efficiently decides whether the fast path can be taken, which reduces performance overhead.

[Figure: gcoptimization]
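The fast-path idea can be sketched at source level, although the real mechanism lives in compiler-generated machine code and the runtime. In this minimal model every name (`gcActive`, `slowPathHits`, `storePayload`, `slowStore`) is hypothetical and is not a Cangjie runtime API; a single cheap flag test stands in for the instruction that selects the fast path.

```cangjie
class Payload {}

class Holder {
    var payload: Payload = Payload()
}

// Hypothetical flag: a collector would set it while it needs to observe stores.
var gcActive: Bool = false

// Hypothetical counter, present only to make slow-path use visible.
var slowPathHits: Int64 = 0

// Slow path: extra bookkeeping before the store (modeled here as a counter).
func slowStore(holder: Holder, value: Payload): Unit {
    slowPathHits += 1
    holder.payload = value
}

// Models what a compiled reference store could look like:
// one cheap check, then a plain store on the fast path.
func storePayload(holder: Holder, value: Payload): Unit {
    if (gcActive) {
        slowStore(holder, value)
    } else {
        holder.payload = value // fast path: direct store
    }
}

main() {
    let holder = Holder()
    storePayload(holder, Payload()) // takes the fast path
    gcActive = true
    storePayload(holder, Payload()) // takes the slow path
    println(slowPathHits)
}
```

The point of the shape is that the common case costs one predictable branch plus the store itself, while the bookkeeping is kept out of line in `slowStore`.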

Escape Analysis

For global analysis and optimization, Cangjie adds escape analysis of references. For a value of reference type, the compiler analyzes the lifecycle of the reference. A reference that does not escape the function in which it is created can be allocated on the stack. The comments in the following code give the escape analysis result for each allocation.

class B {}

class A {
    var a: Int64 = 0
    var b: B = B()
}

var ga: A = A()

func test1(a: A) {
    a.a = 10
}

func test2(a: A) {
    ga = a // a escapes to a global variable
}

func test3(a: A, b: B) {
    a.b = b
}

main() {
    var instance: A = A() // allocated on the stack: does not escape this function
    instance.a = 10
    var instance1: A = A() // allocated on the stack: test1 does not let parameter a escape
    test1(instance1)
    var instance2: A = A() // GC-allocated on the heap: test2 lets parameter a escape
    test2(instance2)
    var instance3: B = B() // allocated on the stack: stored into instance1, but instance1 does not escape
    test3(instance1, instance3)
    var instance4: B = B() // GC-allocated on the heap: stored into instance2, which escapes to a global
    test3(instance2, instance4)
}

Stack allocation directly reduces the GC pressure of automatic memory management: fewer objects are allocated on the heap, so garbage collection runs less often. Because the object lives on the stack, heap read/write barriers can also be replaced with direct loads and stores, which speeds up memory access. Once an object is allocated on the stack, optimizations such as scalar replacement of aggregates (SROA) and dead store elimination (DSE) can be applied to the stack memory to reduce the number of memory reads and writes.
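The following sketch shows by hand roughly what scalar replacement of aggregates (SROA) and dead store elimination (DSE) can achieve once an object is known to stay on the stack. `Point`, `sumBefore`, and `sumAfter` are illustrative names, and `sumAfter` is a hand-written approximation of the optimized form, not actual compiler output.

```cangjie
class Point {
    var x: Int64 = 0
    var y: Int64 = 0
}

// As written: p never leaves the function, so it is a stack allocation candidate.
func sumBefore(): Int64 {
    let p = Point()
    p.x = 1 // dead store: overwritten below before it is ever read
    p.x = 3
    p.y = 4
    return p.x + p.y
}

// Hand-written equivalent after SROA and DSE.
func sumAfter(): Int64 {
    let x: Int64 = 3 // field x became a local; the dead store of 1 is gone
    let y: Int64 = 4 // field y became a local
    return x + y
}

main() {
    println(sumBefore())
    println(sumAfter())
}
```

Both functions compute the same value, but the second one performs no object field accesses at all, which is the reduction in memory reads and writes described above.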

Type Analysis and Devirtualization

Cangjie supports static analysis of global types and profile-based type prediction. The language supports type inheritance, virtual function calls, and interface function calls. Compared with direct calls, virtual and interface function calls incur extra lookup and access overhead. For global, local, and interprocedural references, Cangjie uses static analysis to turn some virtual function calls into direct calls, which speeds up the calls and creates further optimization opportunities such as function inlining. In profile-guided optimization (PGO) mode, Cangjie collects statistics on the types and counts of virtual function calls. The hot types and hot call sites identified from the profile are then used for conservative devirtualization, accelerating function calls and program execution.

[Figure: typeanalysis]
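As a hand-written sketch of what devirtualization achieves, assume a hypothetical `Shape` hierarchy in which profile data shows `Square` to be the hot type. `areaVirtual` is the call as written; `areaGuarded` approximates by hand the type-checked direct call that a conservative, profile-driven devirtualization could produce. It is not actual compiler output.

```cangjie
open class Shape {
    public open func area(): Int64 {
        return 0
    }
}

class Square <: Shape {
    var side: Int64 = 3
    public override func area(): Int64 {
        return side * side
    }
}

class Rect <: Shape {
    var width: Int64 = 2
    var height: Int64 = 5
    public override func area(): Int64 {
        return width * height
    }
}

// As written: the target of area() is looked up at run time.
func areaVirtual(s: Shape): Int64 {
    return s.area()
}

// Hand-written approximation of the devirtualized call.
func areaGuarded(s: Shape): Int64 {
    match (s) {
        case sq: Square => sq.side * sq.side // hot type: body of Square.area inlined
        case _ => s.area()                   // any other type: ordinary virtual call
    }
}

main() {
    let square: Shape = Square()
    let rect: Shape = Rect()
    println(areaVirtual(square) == areaGuarded(square))
    println(areaVirtual(rect) == areaGuarded(rect))
}
```

The fallback branch is what makes the transformation conservative: a receiver of an unexpected type still goes through normal dispatch, so behavior is unchanged even when the profile's prediction is wrong.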