Expressions

In many programming languages, an expression consists of operands and operators, always implying a calculation process with a result. If an expression contains only operands, the result is the operand itself. If it includes operators, the result is the value derived from applying the operators to the operands. Such expressions are also known as arithmetic expressions.

In Cangjie, the definition of expressions is both simplified and extended: all evaluable language elements are considered expressions. This means that, in addition to arithmetic expressions, Cangjie includes conditional expressions, loop expressions, and try expressions, all of which can be evaluated and used as values—such as function arguments or initial variable assignments. Furthermore, as a strongly-typed language, every Cangjie expression is not only evaluable but also has a well-defined type, which is the same as that of its result.

The expressions of Cangjie will be described in the subsequent sections. This section describes the most commonly used condition expressions, loop expressions, and some control transfer expressions (break and continue).

The execution process of programs involves only three basic structures: sequential structure, branching structure, and loop structure. The branching structure and loop structure are achieved through instructions that control jumps in the current sequential execution flow, allowing programs to express more complex logic. In Cangjie, the language elements used to control the execution flow are condition expressions and loop expressions.

In Cangjie, condition expressions are classified into if and if-let expressions. Their values and types are determined by the context in which they are used. There are four types of loop expressions: for-in, while, do-while, and while-let. Their type is always Unit, which has the unique value (). Of these expressions, if-let and while-let are related to pattern matching. For details, see if-let Expression and while-let Expression. This section focuses on the remaining expressions: if, while, do-while, and for-in.

In Cangjie, a group of expressions enclosed in braces ({}) is called a code block. A code block represents a sequential flow of execution, with expressions being evaluated in the order they appear. If the code block contains one or more expressions, its value and type are determined by the last expression. If the code block is empty, its type is Unit, so its value is ().

Note:

A code block itself is not an expression and cannot be used independently. It depends on the execution and evaluation of functions, condition expressions, and loop expressions.
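As a small illustration of how a code block takes its value and type from its last expression, the following sketch uses code blocks as the branches of an if expression (described in the next section). The names side and area are chosen only for this example.

```cangjie
main() {
    let side = 3
    // Each branch is a code block. The first one contains two expressions;
    // its value and type come from the last one, side * side (Int64).
    let area = if (side > 0) {
        println("computing area")
        side * side
    } else {
        0
    }
    println(area)
}
```

With side set to 3, this sketch should print "computing area" followed by 9.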

if Expressions

The basic form of if expressions is as follows:

if (Condition) {
  Branch 1
} else {
  Branch 2
}

In the preceding form, "Condition" is a Boolean expression, and "Branch 1" and "Branch 2" are code blocks. An if expression is executed as follows:

  1. Evaluate the "Condition" expression. If the value is true, go to step 2. If the value is false, go to step 3.
  2. Perform "Branch 1" and go to step 4.
  3. Perform "Branch 2" and go to step 4.
  4. Continue to execute the code following the if expression.

In some scenarios, we may only focus on what to do when the condition is met, so else and the corresponding code block can be omitted.
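A minimal sketch of an if expression without an else branch might look as follows. The variable temperature and the threshold 37 are illustrative assumptions only.

```cangjie
main() {
    let temperature = 38
    // No else branch: when the condition is false, nothing is executed
    // and control moves on to the code after the if expression.
    if (temperature > 37) {
        println("fever")
    }
    println("done")
}
```

Here the condition is true, so the sketch should print "fever" and then "done". If temperature were 36, only "done" would be printed.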

The following program demonstrates the basic usage of if expressions:

import std.random.*

main() {
    let number: Int8 = Random().nextInt8()
    println(number)
    if (number % 2 == 0) {
        println("even")
    } else {
        println("odd")
    }
}

In this program, we use the random package of the standard library to generate a random integer. We then use the if expression to determine whether the integer can be exactly divided by 2, and print "even" or "odd" in different condition branches.

Cangjie is strongly typed. The condition of an if expression must be of the Boolean type; it cannot be an integer or a floating-point number. Unlike in C, an if expression does not select a branch based on whether the condition value is 0. For example, the following program fails to compile:

main() {
    let number = 1
    if (number) { // Error, mismatched types
        println("nonzero")
    }
}
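One way to express the intended check, sketched here as an assumption about what the author of such code would want, is to compare the integer explicitly so that the condition is a Boolean expression:

```cangjie
main() {
    let number = 1
    // The comparison number != 0 has type Bool, so it is a valid condition.
    if (number != 0) {
        println("nonzero")
    }
}
```

This version should compile and print "nonzero".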

In many scenarios, when a condition is not met, one or more further conditions need to be checked before the corresponding action is executed. For this purpose, else can be followed by another if expression, which supports multi-level condition checks and branch execution. For example:

import std.random.*

main() {
    let speed = Random().nextFloat64() * 20.0
    println("${speed} km/s")
    if (speed > 16.7) {
        println("Third cosmic speed, meeting on the magpie bridge across the Milky Way")
    } else if (speed > 11.2) {
        println("Second cosmic speed, flying to the moon")
    } else if (speed > 7.9) {
        println("First cosmic speed, riding clouds")
    } else {
        println("Down to earth, looking up to the starry sky")
    }
}

The value and type of an if expression are determined based on the usage form and scenario.

  • When an if expression containing one or more else branches is used as a value, the type of the if expression needs to be determined based on the evaluation context.

    • If the context requires that the value type be T, the type of the code block in each branch of the if expression must be a subtype of T. In this case, the type of the if expression is determined as T. If the subtype constraint is not met, an error is reported during compilation.
    • If the context does not have specific type requirements, the type of the if expression is the minimum common parent type of the code block type of each branch. If the minimum common parent type does not exist, an error is reported during compilation.

    If the compilation is successful, the value of the if expression is the value of the code block of the executed branch.

  • If an if expression containing one or more else branches is not used as a value, the developer typically only wants to perform different operations in different branches and is not concerned with the value and type of the last expression in each branch. In this scenario, the type of the if expression is Unit, its value is (), and the branches are not subject to the preceding type check.

  • For an if expression that does not contain any else branches, the if branch may not be executed. Therefore, the type of such if expressions is Unit and the value is ().

For example, the following program calculates a simple analog-to-digital conversion process based on if expression evaluation:

main() {
    let zero: Int8 = 0
    let one: Int8 = 1
    let voltage = 5.0
    let bit = if (voltage < 2.5) {
        zero
    } else {
        one
    }
}

In the preceding program, the if expression is used as the initial value of a variable definition. Because the variable bit has no type annotation, its type must be inferred from the initial value, so the type of the if expression is the minimum common parent type of the code block types of the two branches. According to the preceding description of code blocks, the code block type of both branches is Int8. Therefore, the type of the if expression is Int8, and its value is the value of the executed branch. Because voltage is 5.0, the condition voltage < 2.5 is false and the else branch is executed. As a result, the type of the variable bit is Int8, and its value is 1.
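The other evaluated case, in which the context requires a specific type, can be sketched as follows. This variant is an illustration only: it reuses the names voltage and bit from the example above, and the Int64 annotation is an assumption chosen to show a context type.

```cangjie
main() {
    let voltage = 5.0
    // The declared type Int64 is the type required by the context,
    // so the code block of every branch must have a subtype of Int64.
    let bit: Int64 = if (voltage < 2.5) {
        0
    } else {
        1
    }
    println(bit)
}
```

Because the condition is false, the else branch is executed and the sketch should print 1. Replacing one branch with a string such as "high" would violate the subtype constraint and cause a compilation error.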

while Expressions

The basic form of while expressions is as follows:

while (Condition) {
  Loop body
}

In the preceding form, "Condition" is a Boolean expression, and "Loop body" is a code block. A while expression is executed as follows:

  1. Evaluate the "Condition" expression. If the value is true, go to step 2. If the value is false, go to step 3.
  2. Execute "Loop body" and go to step 1.
  3. End the loop and continue to execute the code following the while expression.

For example, the following program uses a while expression to approximate the square root of 2 based on the bisection method:

main() {
    var root = 0.0
    var min = 1.0
    var max = 2.0
    var error = 1.0
    let tolerance = 0.1 ** 10

    while (error ** 2 > tolerance) {
        root = (min + max) / 2.0
        error = root ** 2 - 2.0
        if (error > 0.0) {
            max = root
        } else {
            min = root
        }
    }
    println("The square root of 2 is approximately equal to: ${root}")
}

Running the preceding program produces the output:

The square root of 2 is approximately equal to: 1.414215

do-while Expressions

The basic form of do-while expressions is as follows:

do {
  Loop body
} while (Condition)

In the preceding form, "Condition" is a Boolean expression, and "Loop body" is a code block. A do-while expression is executed as follows:

  1. Execute "Loop body" and go to step 2.
  2. Evaluate the "Condition" expression. If the value is true, go to step 1. If the value is false, go to step 3.
  3. End the loop and continue to execute the code following the do-while expression.

For example, the following program uses a do-while expression to approximate the value of pi based on the Monte Carlo algorithm:

import std.random.*

main() {
    let random = Random()
    var totalPoints = 0
    var hitPoints = 0

    do {
        // Randomly select a point in the square ((0, 0), (1, 1)).
        let x = random.nextFloat64()
        let y = random.nextFloat64()
        // Determine whether the point falls inside the circle inscribed in the square.
        if ((x - 0.5) ** 2 + (y - 0.5) ** 2 < 0.25) {
            hitPoints++
        }
        totalPoints++
    } while (totalPoints < 1000000)

    let pi = 4.0 * Float64(hitPoints) / Float64(totalPoints)
    println("The value of pi is approximately equal to: ${pi}")
}

Running the preceding program produces the output:

The value of pi is approximately equal to: 3.141872

NOTE

The algorithm involves random numbers. Therefore, the output value may be different each time the program is executed, but the output value is approximately equal to 3.14.
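The execution steps of while and do-while differ in when the condition is first evaluated. The following sketch contrasts the two when the condition is false from the start. The variable count and the bound 10 are illustrative assumptions only.

```cangjie
main() {
    var count = 10
    // The condition is false from the start, so the while body never runs.
    while (count < 10) {
        println("while: ${count}")
        count++
    }
    // The do-while body runs once before the condition is first evaluated.
    do {
        println("do-while: ${count}")
        count++
    } while (count < 10)
}
```

This sketch should print only "do-while: 10", because a do-while loop body is always executed at least once.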

for-in Expressions

A for-in expression can traverse type instances that extend the iterator interface Iterable<T>. The basic form of for-in expressions is as follows:

for (Iteration variable in Sequence) {
  Loop body
}

In the preceding form, "Loop body" is a code block. "Iteration variable" is a single identifier or a tuple consisting of multiple identifiers. It is used to bind the data pointed to by the iterator in each round of traversal and can be used as a local variable in "Loop body". "Sequence" is an expression that is evaluated only once. The traversal is performed based on the value of the expression. The type of "Sequence" must extend the iterator interface Iterable<T>. A for-in expression is executed as follows:

  1. Evaluate the "Sequence" expression, take its value as the traversal object, and initialize the iterator of the traversal object.
  2. Update the iterator. If the iterator terminates, go to step 4. Otherwise, go to step 3.
  3. Bind the data pointed to by the current iterator to "Iteration variable", execute the loop body, and go to step 2.
  4. End the loop and continue to execute the code following the for-in expression.

NOTE

The Iterable<T> interface has been extended by the built-in interval and array types.
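The execution steps above can be approximated by hand with an explicit iterator. The sketch below is only an illustration of the steps, not a statement of how the compiler translates for-in. It assumes that the array's Iterable<T> implementation provides an iterator() function and that the resulting iterator provides a next() function returning an Option value, and it uses the while-let expression mentioned earlier in this section.

```cangjie
main() {
    let numbers = [1, 2, 3]
    // Step 1: obtain an iterator from the traversal object
    // (iterator() is assumed here, as noted above).
    let it = numbers.iterator()
    // Steps 2 and 3: advance the iterator; stop when it yields None,
    // otherwise bind the element to n and run the loop body.
    while (let Some(n) <- it.next()) {
        println(n)
    }
    // Step 4: execution continues here after the iterator terminates.
}
```

Under these assumptions, the loop visits the same elements as for (n in numbers) would.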

For example, the following program uses a for-in expression to traverse the array noumenonArray consisting of the 12 Chinese earthly branches. The heavenly stem and earthly branch for each month of the lunar year 2024 are output.

main() {
    let metaArray = [r'甲', r'乙', r'丙', r'丁', r'戊',
        r'己', r'庚', r'辛', r'壬', r'癸']
    let noumenonArray = [r'寅', r'卯', r'辰', r'巳', r'午', r'未',
        r'申', r'酉', r'戌', r'亥', r'子', r'丑']
    let year = 2024
    // Heavenly stem index corresponding to the year
    let metaOfYear = ((year % 10) + 10 - 4) % 10
    // Heavenly stem index corresponding to the first month of the year
    var index = (2 * metaOfYear + 3) % 10 - 1
    println("Heavenly stem and earthly branch for each month of the lunar year 2024:")
    for (noumenon in noumenonArray) {
        print("${metaArray[index]}${noumenon} ")
        index = (index + 1) % 10
    }
}

Running the program outputs:

Heavenly stem and earthly branch for each month of the lunar year 2024:
丙寅 丁卯 戊辰 己巳 庚午 辛未 壬申 癸酉 甲戌 乙亥 丙子 丁丑

Traversing the Interval Type

A for-in expression can traverse interval type instances. For example:

main() {
    var sum = 0
    for (i in 1..=100) {
        sum += i
    }
    println(sum)
}

Running the program outputs:

5050

For details about the interval type, see Range.
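The examples in this section use two forms of interval: a..b, which excludes the upper bound, and a..=b, which includes it. The following sketch, with bounds chosen only for illustration, shows the difference:

```cangjie
main() {
    // 0..3 excludes the upper bound: visits 0, 1, 2.
    for (i in 0..3) {
        print("${i} ")
    }
    println("")
    // 0..=3 includes the upper bound: visits 0, 1, 2, 3.
    for (i in 0..=3) {
        print("${i} ")
    }
    println("")
}
```

The first line of output should contain 0 1 2, and the second 0 1 2 3.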

Traversing Sequences Composed of Tuples

If the elements of a sequence are of the tuple type, the iteration variable can be written in the tuple form when the for-in expression is used for traversal to deconstruct the sequence elements. For example:

main() {
    let array = [(1, 2), (3, 4), (5, 6)]
    for ((x, y) in array) {
        println("${x}, ${y}")
    }
}

Run the preceding program and the following information will be displayed:

1, 2
3, 4
5, 6

Iteration Variables Cannot Be Modified

In the loop body of the for-in expression, the iteration variable cannot be modified. For example, the following program reports an error during compilation:

main() {
    for (i in 0..5) {
        i = i * 10 // Error, cannot assign to value which is an initialized 'let' constant
        println(i)
    }
}
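If a derived value is needed inside the loop body, one possible approach is to bind it to a new local variable instead of assigning to the iteration variable. The name scaled below is an illustrative assumption.

```cangjie
main() {
    for (i in 0..5) {
        // Bind the derived value to a new variable; i itself stays unchanged.
        let scaled = i * 10
        println(scaled)
    }
}
```

This version should compile and print 0, 10, 20, 30, and 40 on separate lines.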

Using a Wildcard (_) to Replace the Iteration Variable

In some application scenarios, if you need to perform certain operations cyclically without using an iteration variable, you can use a wildcard _ to replace the iteration variable. For example:

main() {
    var number = 2
    for (_ in 0..5) {
        number *= number
    }
    println(number)
}

Running the program outputs:

4294967296

Note:

In this scenario, if you use a common identifier to define an iteration variable, the "unused variable" warning will be generated during compilation. You can use the wildcard _ to prevent receiving this warning.

where Condition

In some loop traversal scenarios, you may need to skip iteration variables with specific values and move on to the next iteration. Although this logic can be implemented with an if expression and a continue expression in the loop body, the where keyword offers a more concise form: it introduces a Boolean expression after the traversed sequence. This expression is evaluated each time before the loop body is executed. If its value is true, the loop body is executed. Otherwise, the next iteration starts. The following is an example:

main() {
    for (i in 0..8 where i % 2 == 1) { // The loop body is executed only if i is an odd number.
        println(i)
    }
}

Run the preceding program and the following information will be displayed:

1
3
5
7
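For comparison, the same filtering can be sketched with an if expression and a continue expression in the loop body (continue is described in the next section). This is an equivalent formulation written for illustration only.

```cangjie
main() {
    for (i in 0..8) {
        if (i % 2 != 1) {
            continue // Skip even values, as the where condition does.
        }
        println(i)
    }
}
```

This sketch should print the same four lines: 1, 3, 5, and 7.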

break and continue Expressions

In programs with loop structures, there are situations where it is necessary to exit a loop early or skip an iteration based on specific conditions. To address this, Cangjie introduces the break and continue expressions, which can be used within the body of a loop. The break expression immediately terminates the loop and transfers control to the code following the loop. On the other hand, the continue expression skips the current iteration and proceeds to the next iteration of the loop. Both break and continue expressions have the Nothing type.

For example, the following program uses the for-in and break expressions to find the first number that is divisible by 5 in a given integer array:

main() {
    let numbers = [12, 18, 25, 36, 49, 55]
    for (number in numbers) {
        if (number % 5 == 0) {
            println(number)
            break
        }
    }
}

When for-in iterates to the third number (25) of the numbers array, println followed by break in the if branch are executed, because 25 is divisible by 5. break terminates the for-in loop, and the subsequent numbers in numbers are not traversed. Therefore, the preceding program will output the following information:

25

The following program uses the for-in and continue expressions to print the odd numbers in a given integer array:

main() {
    let numbers = [12, 18, 25, 36, 49, 55]
    for (number in numbers) {
        if (number % 2 == 0) {
            continue
        }
        println(number)
    }
}

During iteration, when the value of number is even, the continue expression is executed. This ends the current iteration immediately, and the loop proceeds to the next iteration. As a result, println is not called for even values of number. Therefore, the program will output the following:

25
49
55
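break and continue are not limited to for-in loops. As a further sketch, the following program uses break to leave a while loop whose condition is always true. The variable power and the threshold 1000 are illustrative assumptions only.

```cangjie
main() {
    var power = 1
    while (true) {
        if (power > 1000) {
            break // Leave the loop as soon as the threshold is exceeded.
        }
        power *= 2
    }
    println(power)
}
```

The loop doubles power until it exceeds 1000, so this sketch should print 1024.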