Names, Scopes, Variables, and Modifiers

This chapter first describes names, scopes, and shadowing. It then introduces variables, one kind of named entity, including their definition and initialization, and finally introduces modifiers.

Names

In Cangjie, names are used to identify entities such as variables, functions, types, packages, and modules.

Names must be valid [Identifiers].

In Cangjie, keywords, variables, functions, types (including class, interface, struct, enum, and type alias), generic parameters, package names, and module names share the same namespace. That is, entities declared or defined in the same scope must have different names (except the names that constitute overloading). Entities declared or defined in different scopes can have the same name, but this may lead to shadowing.

let f2 = 77

func f2() { // Error, function name is the same as the variable f2
    print("${f2}")
}     

// Variable, function, type names cannot be the same as keywords 
let Int64 = 77   // Error: 'Int64' is a keyword

func class() {  // Error: class is a keyword
    print("${f2}")
}    

main(): Int64 {  
    print("${f2}")   // Print 77

    let f2 = { => // Shadowed the global variable f2
       print("10")
    }
    f2()
    return 1
}

Scopes

An entity can be accessed by name within the scope of its name without needing a prefix qualifier. Scopes can be nested: a scope includes itself and every scope nested within it. A name that is not shadowed or overridden in a nested scope can also be accessed directly there.

Blocks

In Cangjie, a block is a pair of matching curly braces enclosing an optional sequence of expressions and declarations. Blocks are everywhere in Cangjie. For example, the body of a function definition, the two branches of an if expression, and the loop body of a while expression are all blocks. Each block introduces a new scope.

The syntax of a block is defined as follows:

block
    : '{' expressionOrDeclarations '}'
    ;

A block has a value, which is determined by the expressions and declarations within it. A block is evaluated by processing its expressions and variable declarations in the order in which they appear.

If the last item of a block is an expression, the value of that expression is the value of the block once it is evaluated.

{
    let a = 1
    let b = 2
    a + b
} // The value of the block is a + b

If the last item of a block is a declaration, the value of the block is () after the declaration is processed.

{
    let a = 1
} // The value of the block is ()

If a block does not contain any expression or declaration, the value of the block is ().

{ } // The value of this empty block is ()
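
The same value rules apply to blocks that appear as function bodies and as if branches. The following is a minimal sketch; the names sum and n are made up for illustration and do not come from the specification.

```cangjie
func sum(): Int64 {
    let a = 1
    let b = 2
    a + b               // Last item is an expression, so the body block evaluates to a + b
}

main(): Int64 {
    // Each branch is a block whose last item is an expression.
    let n = if (sum() > 2) {
        10
    } else {
        20
    }
    println(n)          // sum() is 3, so n is 10
    return 0
}
```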

Scope Levels

If duplicate names exist in the nested multi-level scope, the name introduced by the inner scope shadows the name of the outer scope, meaning the inner scope has a higher level than the outer one.

The comparison of scope levels in nested scopes is defined as follows:

Names introduced by import have the lowest scope level.
Top-level names in a package have a higher scope level than names introduced by import.
Names introduced within a type, function definition, or expression, typically defined within a pair of curly braces {} (that is, in blocks), have a higher scope level than names outside {}.
For classes and interfaces, names in a subclass have a higher scope level than those in the superclass and may shadow or override the latter.

import p1.a   // a has the lowest scope level

var x = 1   // x's scope level is higher than p1.a

open class A {    // A's scope level is the same as that of x at line 3
    var x = 3     // This x has higher scope level than the x at line 3
    var y = 'a'  
    func f(a: Int32, b: Float64): Unit {

    }
}

class B <: A {
    var x = 5   // This x has higher scope level than the x at line 6

    func f(x!: Int32 = 7) { // This x's scope level is higher than the x at line 14

    }
}

Scope Principles

Based on the location of name introduction, there are three types of scopes: top-level, local-level, and type-internal. The following describes the different principles of the three types of scopes.

Top-level

Names introduced at the top level adhere to the following scope principles:

The scope of top-level functions and types is the entire package, and their names are visible to the entire package. These types include class, interface, enum, struct, and type.
Names of top-level variables, introduced by let, var, and const, have a scope that begins after their definition (including initialization) is complete and does not include the range from the start of the file to the variable declaration. These names are visible to other files in the package. Because initializing a variable may have side effects, a variable must be declared and initialized before it is used.

/* Global variables can be accessed only after defined. */
let x = y         //Error: y is used before being defined
let y = 4
let a = b         //Error: b is used before being defined 
let b = a         
let c = c         //Error: unresolved identifier 'c' (in the right of '=')

//Function names are visible in the entire package
func test() { test2() }   // OK

func f() { test() } // OK

func test2(): Int64 {
    var x = 99
    return x
}

Local-level

Names declared or defined within a function definition or expression have a local-level scope. Variables defined in a block have a higher scope level than those defined outside the block.

The scope of a local variable ranges from its declaration to the end of the current scope. A local variable must be defined and initialized before use. The shadowing of a local variable starts after the declaration or definition of the variable name.
The scope of a local function ranges from its declaration to the end of the current scope. Recursive definition is supported, but mutual recursion is not supported.
The scope of a parameter or a generic parameter of a function ranges from the parameter name declaration to the end of the function body. The scope level is the same as that of the variables defined in the function body.
- The function definition func f(x: Int32, y!: Int32 = x) {} is valid.
- The function definition func f(x!: Int32 = x) {} is invalid.
Generic parameters introduced in generic type declarations or extensions have a scope that starts from the declaration of the parameter names to the end of the type body or extension body. The scope level of the generic parameter is the same as the names defined in the type.
- In the generic type definition class C<T> {}, the scope of T ranges from its declaration to the end of the class C declaration.
- In the generic type extension extend<T> C<T> {}, the scope of T ranges from its declaration to the end of the extension definition.
Similar to a function, the scope of the parameter name of the lambda expression is the lambda expression function body. The scope level can be considered as the same as that of a variable defined in the function body of the lambda expression.
The parameter names of the main function and of constructors are considered to be introduced by the function body block. Introducing a name that is the same as a parameter name in the function body will trigger a redefinition error.
Names introduced in the condition of an if-let expression are considered to be introduced by the if block. If the same name is introduced again in the if block, a redefinition error is triggered.
Names introduced by patterns in the match cases of a match expression have a higher scope level than the surrounding match expression, with a scope ranging from their introduction to the end of that match case. Each match case has an independent scope. Names introduced in the pattern binding of a match case are considered to be introduced by the scope following the fat arrow =>. If the same name is introduced again after the =>, a redefinition error is triggered.
For all three types of loops, the scope level of the loop condition is the same as that of the loop block, meaning names introduced in them cannot shadow each other. In addition, loop conditions cannot reference variables defined in the loop body. Therefore:
- In a for-in expression, the loop body can reference variables introduced in the loop condition.
- In while and do-while expressions, their loop conditions cannot reference variables introduced in their loop bodies, even if the do-while condition follows the loop body.
Variables introduced in the loop condition of a for-in loop cannot be used in expressions following the in keyword.
For try exception handling, the scope of the block following try and that of each catch block are independent of each other. Names introduced by catch patterns are viewed as introduced by the block following catch. Introducing the same name in the catch block will trigger a redefinition error.
In the try-with-resources expression, names introduced between the try keyword and the {} are considered to be introduced by the try block. Introducing the same name in the try block will trigger a redefinition error.

// a: The scope of a local variable begins after the declaration
let x = 4
func f(): Unit {
    print("${x}") // Print 4

    let x = 99    
    print("${x}") // Print 99
}

let y = 5
func g(): Unit {
    let y = y     // 'y' in the right of '=' is the global variable 'y'
    print("${y}") // Print 5 

    let z = z     // Error: unresolved identifier 'z' (in the right of '=')
}

// b: The scope of a local function begins after definition
func test1(): Unit {
    func test2(): Unit {
        print("test2")
        test3() // Error, test3's scope begins after definition
    }
    func test3(): Unit {
        test2()
    }

    test2()    
}

let score: Int64 = 90
let good = 70
var scoreResult: String = match (score) {
    case 60 => "pass" // constant pattern.
    case 100 => "Full" // constant pattern.
    case good => "good" // binding pattern: this good has a higher scope level than the good at line 2
}
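
Two of the other local-level rules, the scope of parameters and the names introduced by an if-let condition, can be sketched in the same way. This is an illustrative sketch only; the functions area and describe are hypothetical names.

```cangjie
// The default value of a named parameter may refer to an earlier parameter,
// because the scope of width starts at its declaration.
func area(width: Int64, height!: Int64 = width): Int64 {
    width * height
}

func describe(v: Option<Int64>): String {
    if (let Some(n) <- v) {
        // let n = 0   // Error: n is considered to be introduced by the if block
        "some ${n}"
    } else {
        "none"
    }
}

main(): Int64 {
    println(area(3))                        // 9
    println(area(3, height: 4))             // 12
    println(describe(Some(5)))              // some 5
    println(describe(Option<Int64>.None))   // none
    return 0
}
```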

Type-internal

The scope of members in class, interface, struct, or enum is the entire definition of that class, interface, struct, or enum.
The scope of constructors defined within an enum is the entire enum definition. For specific rules regarding access to constructor names within an enum, see [Enum Type].
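
Because the scope of a member is the entire type definition, a member function can refer to member variables that are declared after it. A minimal sketch, using a hypothetical class Counter:

```cangjie
class Counter {
    // next() uses count and step, which are declared further down in the class.
    public func next(): Int64 {
        count += step
        return count
    }

    private var count = 0
    private let step = 2
}

main(): Int64 {
    let c = Counter()
    println(c.next())   // 2
    println(c.next())   // 4
    return 0
}
```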

Shadowing

Generally, if the same name is introduced into two overlapping scopes with different levels, shadowing occurs. The name with a higher scope level shadows the name with a lower scope level. As a result, the name with a lower scope level either needs to be prefixed with qualifiers or cannot be accessed. Shadowing is removed when the scope of the name with a higher scope level ends. Specifically, when the scope levels are different:

If the name C with a higher scope level is of a type, shadowing occurs directly.

// == in package a ==
public class C {} // ver 1

// == in package b ==
import a.*

class C {} // ver 2

let v = C() // will use ver 2

If the name x with a higher scope level is a variable, shadowing occurs directly. For details about the shadowing rules of member variables, see [Classes and Interfaces].
```
let x = 1

func foo() {
    let x = 2
    println(x) // will print 2
}
```

If the name p with a higher scope level is a package, shadowing occurs directly.

// == in package a ==
public class b {
    public static let c = 1
}

// == in package a.b ==
public let c = 2

// == in package test ==
import a.*
import a.b.*

let v = a.b.c // will be 1

If the name f with a higher scope level is a member function, the overloading rules are used to determine whether f is overloaded. If f is not an overload, it may be an override or a redefinition; if it is neither an overload nor a valid override or redefinition, an error is reported. For details, see [Overriding] and [Redefinition].
```
open class A {
    public open func foo() {
        println(1)
    }
}

class B <: A {
    public override func foo() {  // override
        println(2)
    }

    public func foo() {  // error, conflicting definitions
        println(3)
    }
}
```
If the name f with a higher scope level is a non-member function, the overloading rules are used to determine whether f is overloaded. If f is not an overload, it shadows the name with the lower scope level.
```
func foo() { 1 }

func test() {
    func foo() { 2 } // shadows
    func foo(x: Int64) { 3 } // overloads

    println(foo()) // will print 2
}
```

The following example shows the shadowing relationship between multiple names (including types and variables) with different scope levels.

func g(): Unit {
    f(1, 2)  // OK. f is a top-level definition, which is visible in the whole file
}

func f(x: Int32, y: Int32): Unit {  
    // var x = 1 // Error, x was introduced by parameter
    var i = 1       // OK

    for (i in 1..3) { // OK, a new i in the block
        let v = 1   
    }              // i and v disappear

    print("${i}")     // OK. This i is the variable defined before the for-in loop
    // print("${v}") // Error, v disappeared and cannot be found
}

enum Maze {
    C1 | C2
    // | C1 // Error. The C1 has been used
}

The following example shows the shadowing relationship between names (including types and variables) in different packages.

// File a.cj
package p1

class A {}
var v: Int32 = 10

// File b.cj
package p2
import p1.A        // A belongs to the package p1
import p1.v        // v belongs to the package p1

// p2 has defined its own v, whose scope level is higher than the v from package p1
var v: Int32 = 20     

func displayV1(): Unit {
    // According to the scope level, p2.v shadows p1.v, therefore access p2.v here 
    print("${v}")          // output: 20
}

var a = A()         // Invoke A in p1

The following example shows the shadowing relationship in inheritance.

var x = 1   

open class A {
    var x = 3   // this x shadows the top level x
}

class B <: A {
    var x = 5   // error: a member of the subtype must not shadow a member of the supertype.

    func f(x!: Int32 = 7): Unit { // this x shadows all the previous x

    }
}

Variables

As a statically typed language, Cangjie requires that the type of each variable be determined during compilation.

Variables can be classified into three types based on whether they can be modified: immutable variables (whose values cannot be changed once initialized), mutable variables (whose values can be changed), and const variables (which must be initialized during compilation and cannot be changed).

Definition of Variables

The syntax of defining variables is as follows:

variableDeclaration
    : variableModifier* ('let' | 'var' | 'const') patternsMaybeIrrefutable (((':' type)? ('=' expression)) | (':' type))
    ;

patternsMaybeIrrefutable
    : wildcardPattern
    | varBindingPattern
    | tuplePattern
    | enumPattern
    ;

The definition of a variable consists of four parts: modifier, the let, var, or const keywords, patternsMaybeIrrefutable, and variable type, which are described as follows:

Modifiers
- The modifiers of the top-level variable are public, protected, private, and internal.
- Local variables cannot have modifiers.
- The modifiers of the member variables of the class type are public, protected, private, internal, and static.
- The modifiers of member variables of the struct type are public, private, internal, and static.
Keywords let, var, and const
- let: defines immutable variables whose values cannot be changed once initialized.
- var: defines mutable variables.
- const: defines const variables.
patternsMaybeIrrefutable
- let (or var, const) can only be followed by patterns that must or may be irrefutable (see [Classification of Patterns]). During semantic check, the system checks whether the pattern is irrefutable. If the pattern is not irrefutable, a compilation error is reported.
- The new variables introduced in the pattern after let (or var, const) are all variables modified by let (or var).
- When defining member variables in a class or struct, you can only use the binding patterns (see [Binding Patterns]).
The variable type is optional. If the variable type is not declared, an initial value must be assigned to the variable, and the compiler will attempt to infer the variable type from the initial value.
Variables can be defined at the top-level, within expressions, or in the class or struct types.

Note that:

(1) Use a colon (:) to separate the pattern and variable type. The pattern type must match the type after the colon.

(2) One of the keywords let, var, or const is required, and so is the pattern.

(3) In addition to the preceding syntax definition, local variables are introduced in the following scenarios:

Patterns between for and in in a for-in loop expression. For details, see [For-in Expression].
Parameters in function and lambda definitions. For details, see [Parameters].
ResourceSpecifications in the try-with-resources expression. For details, see [Exceptions].
Patterns after case in a match expression. For details, see [Pattern Matching Expressions].

(4) You can use a pair of backquotes (`) to change a keyword to a valid identifier, such as `open` and `throw`.

The following are some examples of variable definition:

let b: Int32                // Define read-only variable b with type Int32.
let c: Int64                // Define read-only variable c with type Int64.
var bb: String              // Define writeable variable bb with type String.
var (x, y): (Int8, Int16)   // Define two writeable variables: x with type Int8, y with type Int16.
var `open` = 1              // Define a variable named `open` with value 1.
var `throw` = "throw"       // Define a variable named `throw` with value "throw".
const d: Int64 = 0          // Define a const variable named d with value 0.
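
Patterns can also be combined with an initial value, in which case the declared type may be omitted and inferred. The following is a small sketch under that assumption; the variable names are illustrative only.

```cangjie
main(): Int64 {
    let (a, b) = (1, "two")                 // Tuple pattern; types inferred as Int64 and String
    var (_, c): (Int64, Int64) = (0, 3)     // Wildcard pattern discards the first element
    c = c + a                               // c is mutable because it was introduced by var
    // a = 5                                // Error: a was introduced by let
    println("${a} ${b} ${c}")               // 1 two 4
    return 0
}
```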

Initializing Variables

Immutable and Mutable Variables

Both immutable variables and mutable variables can be initialized in either of two ways: at definition or after definition. Note that each variable must be initialized before being used. Otherwise, a compilation error is reported. The following is an example of variable initialization:

func f() {
    let a = 1               // Define and initialize immutable variable a.
    let b: Int32            // Define immutable variable b without initialization.
    b = 10                  // Initialize variable b.
    var aa: Float32 = 3.14  // Define and initialize mutable variable aa.
}

An immutable variable defined by let can be assigned a value only once (that is, at initialization). If the variable is assigned a value more than once, a compilation error is reported. Mutable variables defined by var can be assigned values multiple times.

func f() {
    let a = 1             // Define and initialize immutable variable a.
    a = 2                 // error: immutable variable a cannot be reassigned.
    var b: Float32 = 3.14 // Define and initialize mutable b.
    b = 3.1415            // ok: mutable variable b can be reassigned.
}

class C {
    let m1: Int64
    init(a: Int64, b: Int64) {
        m1 = a
        if (b > 0) {
            m1 = a * b  // OK: immutable variable can be reassigned in constructor.
        }
    }
}

Global and Static Variables

Global variables are defined at the top level. Static variables include those defined in class or struct. The initialization of global and static variables must meet the following rules:

Global variables must be initialized immediately when they are declared. Otherwise, an error is reported. That is, an initialization expression must be provided during declaration.
Static variables must be initialized immediately when they are declared. They can be initialized in the same way as global variables or in the static initializer. For details, see [Static Initializers].
- Note that a static variable cannot be initialized inside the initialization expression of another static variable.
```
class Foo {
    static let x: Int64
    static let y = (x = 1) // it's forbidden
}
```
The initialization expression e cannot depend on uninitialized global variables or static variables. The compiler performs conservative analysis. If e may access an uninitialized global variable or static variable, an error is reported. The detailed analysis depends on the implementation of the compiler and is not specified in the specification.
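
The two permitted ways of initializing a static variable, at the declaration or in the static initializer, might look as follows. This is a minimal sketch with a hypothetical class Limits; see [Static Initializers] for the authoritative rules.

```cangjie
class Limits {
    static let low: Int64 = 1      // Initialized at the declaration
    static let high: Int64         // Initialized in the static initializer below

    static init() {
        high = 100
    }
}

main(): Int64 {
    println(Limits.low)            // 1
    println(Limits.high)           // 100
    return 0
}
```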

The initialization time and sequence of global/static variables are as follows:

All global/static variables are initialized before main (program entry).
For global/static variables declared in the same file, the initialization sequence is from top to bottom based on the variable declaration sequence. If a static initializer is used, the initialization sequence is based on the initialization sequence rules of the static initializer. For details, see [Static Initializers].
For global/static variables declared in different files of the same package or different packages, their initialization sequence depends on the dependency between the global/static variables in the files or packages.
If the global/static variables in the A.cj file are directly or indirectly accessed during the initialization of the global/static variables in the B.cj file, the initialization of the global/static variables in the B.cj file depends on that of the global/static variables in the A.cj file. The reverse is also true.
If the initialization processes of global/static variables in two or more files depend on each other, a cyclic dependency is formed, and the compiler reports an error. If there is no dependency between these files, the initialization sequence is uncertain and determined by the compiler implementation.
If the variable, function, or type defined in the A package is directly or indirectly used in the B package, the B package depends on the A package. The reverse is also true.

If there are cyclic dependencies between packages, the compiler reports an error. If there is no dependency between packages, their initialization sequence is uncertain and determined by the compiler implementation.

/* The initialization of the global variable cannot depend on the global  
   variables defined in other files of the same package. */
// a.cj
let x = 2
let y = z       // OK, b.cj does not depend on this file directly or indirectly.
let a = x       // OK.
let c = A()

/* c.f is an open function, the compiler cannot statically determine whether the
   function meets the initialization rules of global variables, and an error may 
   be reported. */
let d = c.f()

open class A {
    // static var x = A.z     // Error, A.z is used before its initialization.
    // static var y = B.f     // Error, B.f is used before its initialization.
    static var z = 1
    public open func f(): Int64 {
        return 77
    }
}

class B {
    static var e = A.z    // OK.
    static var f = x      // OK.     
}

// b.cj      
let z = 10      
// let y = 10   // Error, y is already defined in a.cj.

// main.cj
main(): Int64 {
    print("${x}")
    print("${y}")
    print("${z}")
    return 1
}
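
The top-to-bottom order within a single file, and the rule that all global variables are initialized before main, can be observed with initializers that have a visible side effect. This is an illustrative sketch; trace is a hypothetical helper function.

```cangjie
// Prints a message when called, then returns the given value.
func trace(name: String, value: Int64): Int64 {
    println("init ${name}")
    value
}

let first = trace("first", 1)
let second = trace("second", first + 1)   // first is already initialized here

main(): Int64 {
    println("main: ${first + second}")
    return 0
}
```

Under the rules above, this should print "init first", then "init second", then "main: 3".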

const Variables

For details, see [Constant Evaluation].

Modifiers

Cangjie provides many modifiers, which are classified into the following types:

Access modifier
Non-access modifier

Modifiers are usually placed at the beginning of a definition to indicate that the definition has certain features.

Access Modifier

For details, see Access Modifiers in [Packages and Modules].

Non-access Modifier

Cangjie provides many non-access modifiers to support various functionalities.

open: The instance member can be overridden by a subclass, or the class can be inherited by a subclass. For details, see [Classes].
override: The member overrides that of the superclass. For details, see [Classes].
redef: The static member redefines that of the superclass. For details, see [Classes].
static: The member is a static member and cannot be accessed through an instance object. For details, see [Classes].
abstract: The class is an abstract class. For details, see [Classes].
foreign: The member is an external member. For details, see [Cross-Language Interoperability].
unsafe: The context for interoperability with the C language. For details, see [Cross-Language Interoperability].
sealed: The class or interface can be inherited or implemented only in the current package. For details, see [Classes].
mut: The member has variable semantics. For details, see [Functions].

For details about these modifiers, see the corresponding sections.
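
Several of the non-access modifiers listed above can be seen together in a short sketch. The types Shape, Circle, and Point are hypothetical and serve only to illustrate where each modifier is written; the referenced sections remain the authoritative description.

```cangjie
abstract class Shape {
    public func name(): String                       // Abstract member function: no body
    public open func describe(): String {            // open: may be overridden
        "shape ${name()}"
    }
    public static func kind(): String { "Shape" }    // static: called on the type
}

class Circle <: Shape {
    public func name(): String { "circle" }
    public override func describe(): String { "round ${name()}" }
    public redef static func kind(): String { "Circle" }
}

struct Point {
    var x = 0
    public mut func moveBy(dx: Int64): Unit {        // mut: may modify the instance
        x += dx
    }
}

main(): Int64 {
    let s: Shape = Circle()
    println(s.describe())    // round circle
    println(Circle.kind())   // Circle
    var p = Point()          // Must be var so that the mut function can be called
    p.moveBy(3)
    println(p.x)             // 3
    return 0
}
```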