Identifiers
In Cangjie, developers can assign names to program elements; these names are called identifiers. Identifiers fall into two categories, common identifiers and raw identifiers, each with its own naming rules.
A common identifier cannot be a keyword in Cangjie. It must be formed from one of the following two types of character sequences:
- Starting with an XID_Start character, followed by any number of XID_Continue characters.
- Starting with an underscore (_), followed by at least one XID_Continue character.
For definitions of XID_Start and XID_Continue, see the Unicode Standard. Cangjie uses version 15.0.0 of the Unicode Standard.
In Cangjie, all identifiers are treated as being in Normalization Form C (NFC). Two identifiers that are equal after normalization to NFC are considered the same identifier.
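The effect of NFC normalization can be illustrated outside Cangjie. The following is a minimal Python sketch, not Cangjie code and not the compiler's implementation. The helper name same_identifier and the sample strings are assumptions made for illustration. It compares two spellings of the same word, one using a precomposed "é" and one using "e" followed by a combining acute accent.

```python
import unicodedata

def same_identifier(a: str, b: str) -> bool:
    """Return True if a and b are equal after NFC normalization.

    Hypothetical helper for illustration; it only mirrors the rule
    stated in the text, not the Cangjie compiler's internals.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# "é" as one precomposed code point (U+00E9) ...
precomposed = "caf\u00e9"
# ... versus "e" followed by a combining acute accent (U+0301).
decomposed = "cafe\u0301"

print(precomposed == decomposed)                 # False: the code points differ
print(same_identifier(precomposed, decomposed))  # True: identical after NFC
```

Under the rule above, two such spellings would name the same program element.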
For example, all of the following strings are valid common identifiers:
abc
_abc
abc_
a1b2c3
a_b_c
a1_b2_c3
仓颉
__こんにちは
The following character strings are invalid common identifiers:
ab&c // The invalid character "&" is used.
3abc // An identifier cannot start with a digit.
while // An identifier cannot be a keyword in Cangjie.
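The rules for common identifiers can be approximated with a short checker. The sketch below is written in Python rather than Cangjie, and it rests on several assumptions. The function name is_common_identifier is hypothetical. The keyword set contains only the two keywords this section mentions, while the real list is longer. Python's str.isidentifier() is used because it tests XID_Start (or an underscore) for the first character and XID_Continue for the rest, but the Unicode version bundled with a given Python release (see unicodedata.unidata_version) may not be 15.0.0.

```python
import unicodedata

# Assumption: only the keywords named in this section. The real Cangjie
# keyword list is longer and would need to be filled in.
KEYWORDS = {"if", "while"}

def is_common_identifier(name: str) -> bool:
    """Approximate the common-identifier rules described above."""
    # Identifiers are compared in NFC, so normalize first.
    name = unicodedata.normalize("NFC", name)
    # A common identifier cannot be a keyword.
    if name in KEYWORDS:
        return False
    # A leading underscore needs at least one XID_Continue character after it.
    if name == "_":
        return False
    # str.isidentifier() accepts "_" or an XID_Start character first,
    # then any number of XID_Continue characters. It does not reject
    # Python's own keywords, so "while" is handled by the set above.
    return name.isidentifier()

valid = ["abc", "_abc", "abc_", "a1b2c3", "a_b_c", "a1_b2_c3", "仓颉", "__こんにちは"]
invalid = ["ab&c", "3abc", "while"]

for name in valid + invalid:
    print(name, is_common_identifier(name))
```

With this sketch, every string in the valid list is accepted and every string in the invalid list is rejected, matching the examples above.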
A raw identifier is a common identifier or a Cangjie keyword enclosed in backticks (`). It is primarily used when a keyword needs to be treated as an identifier.
For example, all of the following strings are valid raw identifiers:
`abc`
`_abc`
`a1b2c3`
`if`
`while`
`à֮̅̕b`
In each of the following character strings, the part enclosed in backticks is an invalid common identifier. Therefore, they are all invalid raw identifiers.
`ab&c`
`3abc`
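The raw-identifier rule can be sketched in the same way. This is again Python rather than Cangjie, offered as an approximation under stated assumptions. The name is_raw_identifier is hypothetical, the keyword set lists only the keywords mentioned in this section, and Python's bundled Unicode data may differ from version 15.0.0.

```python
import unicodedata

# Assumption: only the keywords named in this section, not the full list.
KEYWORDS = {"if", "while"}

def is_raw_identifier(text: str) -> bool:
    """Approximate the raw-identifier rule: backticks around a
    common identifier or a keyword."""
    # The text must be wrapped in backticks with something in between.
    if len(text) < 3 or not (text.startswith("`") and text.endswith("`")):
        return False
    inner = unicodedata.normalize("NFC", text[1:-1])
    # A keyword inside backticks is a valid raw identifier.
    if inner in KEYWORDS:
        return True
    # Otherwise the inner part must be a valid common identifier:
    # a lone underscore does not match either permitted form.
    if inner == "_":
        return False
    return inner.isidentifier()

for text in ["`abc`", "`_abc`", "`a1b2c3`", "`if`", "`while`", "`ab&c`", "`3abc`"]:
    print(text, is_raw_identifier(text))
```

The only difference from a common-identifier check is that keywords are accepted once they are enclosed in backticks.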