Cangjie-C Interoperability
To ensure compatibility with existing ecosystems, Cangjie supports calling C language functions and allows C language functions to call Cangjie functions.
Cangjie Calling C Functions
To call a C function in Cangjie, declare the function in Cangjie using the @C and foreign modifiers. @C can be omitted if foreign is present.
For example, if you want to call the rand and printf functions of C, their function signatures are as follows:
// stdlib.h
int rand();
// stdio.h
int printf (const char *fmt, ...);
In this case, one way to call the two functions in Cangjie is as follows:
// declare the function by `foreign` keyword, and omit `@C`
foreign func rand(): Int32
foreign func printf(fmt: CString, ...): Int32
main() {
// call this function by `unsafe` block
let r = unsafe { rand() }
println("random number ${r}")
unsafe {
var fmt = LibC.mallocCString("Hello, No.%d\n")
printf(fmt, 1)
LibC.free(fmt)
}
}
Note that:
- foreign: used to modify a function declaration, indicating that the function is an external function. A function modified by foreign can have only a function declaration but not a function implementation.
- Functions declared with foreign must have parameters and return types that comply with the mapping between C and Cangjie data types. For details, see [Type Mapping](./cangjie-c.md#type-mapping).
- Functions on the C side may cause unsafe operations. Therefore, a call to a function modified by foreign must be wrapped in an unsafe block. Otherwise, a compilation error occurs.
- The foreign keyword modified by @C can only be used to modify function declarations and cannot be used to modify other declarations. Otherwise, a compilation error occurs.
- @C supports only the foreign function, non-generic functions in the top-level scope, and struct types.
- The foreign function does not support named parameters or default parameter values. The foreign function allows variable-length parameters, which are expressed by ... and can be used only at the end of the parameter list. Variable-length parameters must meet the CType constraint, but they do not need to be of the same type.
- Although Cangjie (CJNative backend) provides the stack capacity expansion capability, it cannot detect the actual stack size used by the C-side function. Therefore, stack overflow may still occur after the FFI calls the C-side function. You need to modify the cjStackSize configuration based on the actual situation.
Sample code for invalid foreign declarations is as follows:
foreign func rand(): Int32 { // compiler error
return 0
}
@C
foreign var a: Int32 = 0 // compiler error
@C
foreign class A{} // compiler error
@C
foreign interface B{} // compiler error
CFunc
In Cangjie, CFunc refers to the function that can be called by C language code. There are three forms:
- foreign function modified by @C.
- Cangjie function modified by @C.
- The lambda expression of the CFunc type. Unlike a common lambda expression, the CFunc lambda expression cannot capture variables.
// Case 1
foreign func free(ptr: CPointer<Int8>): Unit
// Case 2
@C
func callableInC(ptr: CPointer<Int8>) {
print("This function is defined in Cangjie.")
}
// Case 3
let f1: CFunc<(CPointer<Int8>) -> Unit> = { ptr =>
print("This function is defined with CFunc lambda.")
}
The type of functions declared or defined in the preceding three forms is CFunc<(CPointer<Int8>) -> Unit>. CFunc corresponds to the function pointer type of the C language. This type is a generic type. Its generic parameter indicates the type of the CFunc input parameter and return value. The usage is as follows:
foreign func atexit(cb: CFunc<() -> Unit>): Int32
Similar to the foreign function, the parameters and return types of other CFunc functions must meet the CType constraint and do not support named parameters and default parameter values.
When CFunc is called in Cangjie code, it must be in the unsafe context.
The Cangjie language can convert a variable of the CPointer<T> type to a specific CFunc. The generic parameter T of CPointer can be any type that meets the CType constraint. The method is as follows:
main() {
var ptr = CPointer<Int8>()
var f = CFunc<() -> Unit>(ptr)
unsafe { f() } // core dumped when running, because the pointer is nullptr.
}
Note:
It is dangerous to forcibly convert a pointer to CFunc and call a function. You need to ensure that the pointer points to an available function address. Otherwise, a runtime error occurs.
inout Parameter
When CFunc is called in Cangjie, the parameter can be modified by the inout keyword to form a reference value transfer expression. In this case, the parameter is transferred by reference. The type of the referenced value transfer expression is CPointer<T>, where T is the type of the expression modified by inout.
The value transfer expression by reference has the following restrictions:
- It can only be used to call CFunc.
- The type of the modified object must meet the CType constraint, but cannot be CString.
- The modified object cannot be a variable defined with let, nor a temporary value such as a literal, an input parameter, or the value of another expression.
- The pointer transferred to the C side by using the value transfer expression on the Cangjie side is valid only during the function call. That is, in this scenario, the C side should not save the pointer for future use.
Variables modified by inout can be variables defined in top-level scope, local variables, and member variables in struct types, but cannot be directly or indirectly derived from instance member variables of class types.
The following is an example:
foreign func foo1(ptr: CPointer<Int32>): Unit
@C
func foo2(ptr: CPointer<Int32>): Unit {
let n = unsafe { ptr.read() }
println("*ptr = ${n}")
}
let foo3: CFunc<(CPointer<Int32>) -> Unit> = { ptr =>
let n = unsafe { ptr.read() }
println("*ptr = ${n}")
}
struct Data {
var n: Int32 = 0
}
class A {
var data = Data()
}
main() {
var n: Int32 = 0
unsafe {
foo1(inout n) // OK
foo2(inout n) // OK
foo3(inout n) // OK
}
var data = Data()
var a = A()
unsafe {
foo1(inout data.n) // OK
foo1(inout a.data.n) // Error, n is derived indirectly from instance member variables of class A
}
}
Note:
When the macro extension feature is used, the inout parameter feature cannot be used in the macro definition.
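For reference, a C-side definition matching the foo1 declaration in the example above could look like the following. This is only a sketch: the source does not show the C implementation, so the body (incrementing the value) is an assumption made for illustration. It follows the rule that the pointer is used only during the call and is never stored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C-side body for foo1; the increment is illustrative only.
 * The pointer produced by `inout` is valid only while this call runs,
 * so it is dereferenced immediately and never saved for later use. */
void foo1(int32_t *ptr) {
    if (ptr == NULL) {
        return; /* nothing to update */
    }
    *ptr += 1; /* the Cangjie variable passed with `inout` sees this change */
}
```

With a body like this, the variable n on the Cangjie side would hold the incremented value after foo1(inout n) returns.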
unsafe
Interoperability with C also brings many of C's unsafe factors into Cangjie. Therefore, Cangjie uses the unsafe keyword to mark the unsafe behaviors of cross-language calls to C.
The unsafe keyword is described as follows:
- unsafe can be used to modify functions, expressions, or a scope.
- Functions modified by @C must be called in the unsafe context.
- When CFunc is called, it must be used in the unsafe context.
- When a foreign function is called in Cangjie, the call must be in the unsafe context.
- When the called function is modified by unsafe, the call must be in the unsafe context.
The method is as follows:
foreign func rand(): Int32
@C
func foo(): Unit {
println("foo")
}
var foo1: CFunc<() -> Unit> = { =>
println("foo1")
}
main(): Int64 {
unsafe {
rand() // Call foreign func.
foo() // Call @C func.
foo1() // Call CFunc var.
}
0
}
Note that a common lambda expression cannot carry the unsafe attribute. When a lambda expression that performs unsafe operations escapes, it can be called directly outside an unsafe context without any compilation error. To call an unsafe function inside a lambda expression, you are advised to wrap the call in an unsafe block. For details, see the following case:
unsafe func A(){}
unsafe func B(){
var f = { =>
unsafe { A() } // Avoid calling A() directly without unsafe in a normal lambda.
}
return f
}
main() {
var f = unsafe{ B() }
f()
println("Hello World")
}
Calling Conventions
Function calling conventions describe how the caller and callee call functions (for example, how parameters are transferred and who clears the stack). The caller and callee must use the same calling conventions to run properly. The Cangjie programming language uses @CallingConv to indicate various calling conventions. The supported calling conventions are as follows:
- CDECL: The default calling conventions used by the C compiler of Clang on different platforms.
- STDCALL: The calling conventions used by the Win32 API.
If a C function is called using the C language interoperability mechanism and no calling convention is specified, the default CDECL calling convention is used. The following is an example of calling the rand function in the C standard library:
@CallingConv[CDECL] // Can be omitted in default.
foreign func rand(): Int32
main() {
println(unsafe { rand() })
}
@CallingConv can only be used to modify the foreign block, a single foreign function, and a CFunc function in the top-level scope. When @CallingConv modifies the foreign block, the same @CallingConv modification is added to each function in the foreign block.
Type Mapping
Base Types
The Cangjie and C languages support the mapping of basic data types. The general principles are as follows:
- The Cangjie type does not contain references pointing to the managed memory.
- The Cangjie type and the C type have the same memory layout.
For example, some basic type mapping relationships are as follows:
| Cangjie Type | C Type | Size (byte) |
|---|---|---|
| Unit | void | 0 |
| Bool | bool | 1 |
| UInt8 | char | 1 |
| Int8 | int8_t | 1 |
| UInt8 | uint8_t | 1 |
| Int16 | int16_t | 2 |
| UInt16 | uint16_t | 2 |
| Int32 | int32_t | 4 |
| UInt32 | uint32_t | 4 |
| Int64 | int64_t | 8 |
| UInt64 | uint64_t | 8 |
| IntNative | ssize_t | platform dependent |
| UIntNative | size_t | platform dependent |
| Float32 | float | 4 |
| Float64 | double | 8 |
Note:
Due to the uncertainty of the int and long types on different platforms, programmers need to specify the corresponding Cangjie programming language type. In C interoperability scenarios, similar to the C language, the Unit type can only be used as the return type in CFunc and the generic parameter of CPointer.
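The sizes in the table can be checked on the C side before choosing the Cangjie types for a binding. The following is a minimal sketch, not part of any Cangjie tooling; the function names fixed_width_sizes_match and native_size_bytes are hypothetical. It relies on the fixed-width types from stdint.h because, unlike int and long, their sizes do not change between platforms.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when every fixed-width C type has the size listed in the
 * mapping table, 0 otherwise. (Hypothetical helper for illustration.) */
int fixed_width_sizes_match(void) {
    return sizeof(bool) == 1
        && sizeof(int8_t) == 1 && sizeof(uint8_t) == 1
        && sizeof(int16_t) == 2 && sizeof(uint16_t) == 2
        && sizeof(int32_t) == 4 && sizeof(uint32_t) == 4
        && sizeof(int64_t) == 8 && sizeof(uint64_t) == 8
        && sizeof(float) == 4 && sizeof(double) == 8;
}

/* Size of size_t (UIntNative on the Cangjie side) on this platform;
 * this is the "platform dependent" entry of the table. */
size_t native_size_bytes(void) {
    return sizeof(size_t);
}
```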
Cangjie also supports the mapping with the structures and pointer types of the C language.
Structure
For the structure type, Cangjie uses struct modified by @C. For example, the C language has the following structure:
typedef struct {
long long x;
long long y;
long long z;
} Point3D;
The corresponding Cangjie type can be defined as follows:
@C
struct Point3D {
var x: Int64 = 0
var y: Int64 = 0
var z: Int64 = 0
}
If the C language contains such a function:
Point3D addPoint(Point3D p1, Point3D p2);
Accordingly, the function can be declared in Cangjie as follows:
foreign func addPoint(p1: Point3D, p2: Point3D): Point3D
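The source only gives the prototype of addPoint. As an illustration of what the C side might contain, the sketch below assumes that the function adds the two points component by component; that behavior is an assumption, not something the source states. Both arguments and the result are passed by value, which works because the @C struct has the same memory layout as the C structure.

```c
/* Same layout as the Cangjie @C struct Point3D: three 64-bit fields. */
typedef struct {
    long long x;
    long long y;
    long long z;
} Point3D;

/* Hypothetical implementation: component-wise sum of two points. */
Point3D addPoint(Point3D p1, Point3D p2) {
    Point3D sum;
    sum.x = p1.x + p2.x;
    sum.y = p1.y + p2.y;
    sum.z = p1.z + p2.z;
    return sum;
}
```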
The struct modified by @C must meet the following requirements:
- The type of a member variable must meet the CType constraint.
- interface types cannot be implemented or extended.
- It cannot be used as an associated value type of enum.
- Closure capture is not allowed.
- Generic parameters are not allowed.
The struct modified by @C automatically meets the CType constraint.
Pointer
For the pointer type, Cangjie provides the CPointer<T> type to correspond to the pointer type on the C side. The generic parameter T must meet the CType constraint. For example, the signature of the malloc function in C is as follows:
void* malloc(size_t size);
In Cangjie, it can be declared as follows:
foreign func malloc(size: UIntNative): CPointer<Unit>
The CPointer can be used for read and write, offset calculation, null check, and pointer conversion. For details about the API, see "Cangjie Programming Language Library API". Read, write, and offset calculation are unsafe behaviors. When invalid pointers call these functions, undefined behaviors may occur. These unsafe functions need to be called in unsafe blocks.
The following is an example of using CPointer:
foreign func malloc(size: UIntNative): CPointer<Unit>
foreign func free(ptr: CPointer<Unit>): Unit
@C
struct Point3D {
var x: Int64
var y: Int64
var z: Int64
init(x: Int64, y: Int64, z: Int64) {
this.x = x
this.y = y
this.z = z
}
}
main() {
let p1 = CPointer<Point3D>() // create a CPointer with null value
if (p1.isNull()) { // check if the pointer is null
print("p1 is a null pointer")
}
let sizeofPoint3D: UIntNative = 24
var p2 = unsafe { malloc(sizeofPoint3D) } // malloc a Point3D in heap
var p3 = unsafe { CPointer<Point3D>(p2) } // pointer type cast
unsafe { p3.write(Point3D(1, 2, 3)) } // write data through pointer
let p4: Point3D = unsafe { p3.read() } // read data through pointer
let p5: CPointer<Point3D> = unsafe { p3 + 1 } // offset of pointer
unsafe { free(p2) }
}
Cangjie supports forcible type conversion between CPointer. The generic parameter T of CPointer before and after the conversion must meet the constraints of CType. The method is as follows:
main() {
var pInt8 = CPointer<Int8>()
var pUInt8 = CPointer<UInt8>(pInt8) // CPointer<Int8> convert to CPointer<UInt8>
0
}
Cangjie can convert a variable of the CFunc type to a specific CPointer. The generic parameter T of CPointer can be any type that meets the CType constraint. The method is as follows:
foreign func rand(): Int32
main() {
var ptr = CPointer<Int8>(rand)
0
}
Note:
It is safe to forcibly convert a CFunc to a pointer. However, no read or write operation should be performed on the converted pointer, because doing so may cause runtime errors.
Array
Cangjie uses the VArray type to map to the array type of C. The VArray type can be used as a function parameter or @C struct member. When the element type T in VArray<T, $N> meets the CType constraint, the VArray<T, $N> type also meets the CType constraint.
As a function parameter type:
When VArray is used as a parameter of CFunc, the corresponding parameter type in the CFunc signature can only be CPointer<T> or VArray<T, $N>. If the parameter type in the function signature is VArray<T, $N>, the argument is still passed as a CPointer<T>.
The following is an example of using VArray as a parameter:
foreign func cfoo1(a: CPointer<Int32>): Unit
foreign func cfoo2(a: VArray<Int32, $3>): Unit
The corresponding C-side function definition may be as follows:
void cfoo1(int *a) { ... }
void cfoo2(int a[3]) { ... }
When calling CFunc, you need to use inout to modify the variable of the VArray type.
var a: VArray<Int32, $3> = [1, 2, 3]
unsafe {
cfoo1(inout a)
cfoo2(inout a)
}
VArray cannot be used as the return value type of CFunc.
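The C-side bodies of cfoo1 and cfoo2 are elided in the source. The sketch below fills them with an assumed behavior (doubling each element in place) purely to show why both declarations receive the same thing: in C, an array parameter decays to a pointer. The helper double_elements is hypothetical, and int32_t is used instead of int so that the element size matches Int32 on every platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: doubles n elements in place. */
static void double_elements(int32_t *a, size_t n) {
    for (size_t i = 0; i < n; i++) {
        a[i] *= 2;
    }
}

/* Receives a plain pointer; the length (3) is a convention that the
 * caller and callee must agree on, because it is not passed along. */
void cfoo1(int32_t *a) {
    double_elements(a, 3);
}

/* An array parameter decays to a pointer in C, so `a` is an int32_t *
 * here as well, and writes are visible in the caller's array. */
void cfoo2(int32_t a[3]) {
    double_elements(a, 3);
}
```

Because both functions write through the pointer, the VArray passed with inout on the Cangjie side would observe the modified elements after each call.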
As a member of @C struct:
When VArray is a member of @C struct, its memory layout is the same as the structure layout on the C side. Ensure that the declaration length and type on the Cangjie side are the same as those on the C side.
struct S {
int a[2];
int b[0];
};
In Cangjie, the following structure can be declared to correspond to the C code:
@C
struct S {
var a = VArray<Int32, $2>(item: 0)
var b = VArray<Int32, $0>(item: 0)
}
Note:
In the C language, the last field of a structure can be an array whose length is not specified. The array is called a flexible array. Cangjie does not support the mapping of structures that contain flexible arrays.
Character String
Particularly, for a string type in the C language, a CString type is designed in Cangjie. To simplify operations on C language strings, CString provides the following member functions:
- init(p: CPointer<UInt8>): constructs a CString through CPointer.
- func getChars(): obtains the address of the character string. The type is CPointer<UInt8>.
- func size(): Int64: calculates the length of the character string.
- func isEmpty(): Bool: checks that the length of the string is 0. If the pointer of the string is null, true is returned.
- func isNotEmpty(): Bool: checks that the length of the string is not 0. If the pointer of the string is null, false is returned.
- func isNull(): Bool: checks whether the pointer of the character string is null.
- func startsWith(str: CString): Bool: checks whether the character string starts with str.
- func endsWith(str: CString): Bool: checks whether the character string ends with str.
- func equals(rhs: CString): Bool: checks whether the character string is equal to rhs.
- func equalsLower(rhs: CString): Bool: checks whether the character string is equal to rhs. The comparison is case-insensitive.
- func subCString(start: UInt64): CString: truncates a substring from start and stores the returned substring in the newly allocated space.
- func subCString(start: UInt64, len: UInt64): CString: truncates a substring whose length is len from start and stores the returned substring in the newly allocated space.
- func compare(str: CString): Int32: compares this string with str and returns the same result as strcmp(this, str) in the C language.
- func toString(): String: constructs a new String object using this string.
- func asResource(): CStringResource: obtains the resource type of CString.
In addition, the mallocCString function in LibC can be called to convert String to CString. After the conversion is complete, CString needs to be released.
The following is an example of using CString:
foreign func strlen(s: CString): UIntNative
main() {
var s1 = unsafe { LibC.mallocCString("hello") }
var s2 = unsafe { LibC.mallocCString("world") }
let t1: Int64 = s1.size()
let t2: Bool = s2.isEmpty()
let t3: Bool = s1.equals(s2)
let t4: Bool = s1.startsWith(s2)
let t5: Int32 = s1.compare(s2)
let length = unsafe { strlen(s1) }
unsafe {
LibC.free(s1)
LibC.free(s2)
}
}
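Since compare is described as returning the same result as strcmp, it helps to recall what C guarantees about that result: only its sign is defined, not its magnitude. The sketch below uses a hypothetical helper, compare_sign, to normalize the result; for the strings in the example above, "hello" sorts before "world", so the sign is negative.

```c
#include <string.h>

/* Hypothetical helper: reduces strcmp's result to -1, 0, or 1.
 * The C standard only fixes the sign of the result, not its magnitude. */
int compare_sign(const char *lhs, const char *rhs) {
    int r = strcmp(lhs, rhs);
    return (r > 0) - (r < 0);
}
```

Under that description, code that inspects t5 would be more portable if it tested the sign of the value rather than comparing it with a specific number.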
sizeOf/alignOf
Cangjie also provides the sizeOf and alignOf functions to obtain the memory usage and memory alignment values (in bytes) of the preceding C interoperability types. The function declaration is as follows:
public func sizeOf<T>(): UIntNative where T <: CType
public func alignOf<T>(): UIntNative where T <: CType
Example:
@C
struct Data {
var a: Int64 = 0
var b: Float32 = 0.0
}
main() {
println(sizeOf<Data>())
println(alignOf<Data>())
}
If you run the program on a 64-bit machine, the following information is displayed:
16
8
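The same numbers can be reproduced on the C side, which is a quick way to confirm that a @C struct and its C counterpart agree on layout. The sketch below assumes a typical 64-bit platform where int64_t is 8-byte aligned; the function names data_size and data_align are hypothetical. The 8-byte field is followed by a 4-byte float and 4 bytes of trailing padding, which gives a size of 16 and an alignment of 8.

```c
#include <stddef.h>
#include <stdint.h>

/* C counterpart of the @C struct Data shown above. */
typedef struct {
    int64_t a; /* 8 bytes, 8-byte aligned on typical 64-bit platforms */
    float b;   /* 4 bytes, followed by 4 bytes of trailing padding */
} Data;

/* Hypothetical helpers mirroring sizeOf<Data>() and alignOf<Data>(). */
size_t data_size(void) {
    return sizeof(Data);
}

size_t data_align(void) {
    return _Alignof(Data);
}
```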
CType
In addition to the types that are mapped to C-side types provided in "Type Mapping", Cangjie provides a CType interface. The interface does not contain any method and can be used as the parent type of all types supported by C interoperability for easy use in generic constraints.
Note that:
- The CType interface itself does not meet the CType constraint.
- The CType interface cannot be inherited or extended.
- The CType interface does not break the usage restrictions of subtypes.
The following is an example of using CType:
func foo<T>(x: T): Unit where T <: CType {
match (x) {
case i32: Int32 => println(i32)
case ptr: CPointer<Int8> => println(ptr.isNull())
case f: CFunc<() -> Unit> => unsafe { f() }
case _ => println("match failed")
}
}
main() {
var i32: Int32 = 1
var ptr = CPointer<Int8>()
var f: CFunc<() -> Unit> = { => println("Hello") }
var f64 = 1.0
foo(i32)
foo(ptr)
foo(f)
foo(f64)
}
The result is as follows:
1
true
Hello
match failed
C Calling Cangjie Functions
Cangjie provides the CFunc type to correspond to the function pointer type on the C side. The function pointer on the C side can be transferred to Cangjie, and Cangjie can also construct and transfer a variable corresponding to the function pointer on the C side.
Assume that a C library API is as follows:
typedef void (*callback)(int);
void set_callback(callback cb);
Correspondingly, the function in Cangjie can be declared as follows:
foreign func set_callback(cb: CFunc<(Int32) -> Unit>): Unit
Variables of the CFunc type can be transferred from the C side or constructed on the Cangjie side. There are two methods to construct the CFunc type on the Cangjie side. One is to use the function modified by @C, and the other is to use the closure marked as CFunc.
The function modified by @C indicates that its function signature meets the calling rules of C and the definition is still written in Cangjie. The function modified by foreign is defined on the C side.
Note:
For named CFunc functions, that is, functions modified by foreign or @C, you are advised not to use CJ_ (case-insensitive) as a name prefix. Otherwise, the names may conflict with internal compiler symbols, such as those of the standard library and runtime, resulting in undefined behavior.
Example:
@C
func myCallback(s: Int32): Unit {
println("handle ${s} in callback")
}
main() {
// the argument is a function qualified by `@C`
unsafe { set_callback(myCallback) }
// the argument is a lambda with `CFunc` type
let f: CFunc<(Int32) -> Unit> = { i => println("handle ${i} in callback") }
unsafe { set_callback(f) }
}
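The source shows only the prototype of set_callback. To make the flow concrete, here is a sketch of what the C library might do with the function pointer it receives. Everything beyond the callback typedef and the set_callback prototype is an assumption: fire_callback, c_stand_in, and demo_roundtrip are hypothetical names, and c_stand_in merely plays the role that the Cangjie function myCallback would play in the real setup.

```c
#include <stddef.h>

typedef void (*callback)(int);

/* The registered callback; NULL until set_callback is called. */
static callback g_cb = NULL;

void set_callback(callback cb) {
    g_cb = cb;
}

/* Hypothetical trigger: invokes the registered callback, if any.
 * Returns 1 when a callback ran and 0 when none is registered. */
int fire_callback(int value) {
    if (g_cb == NULL) {
        return 0;
    }
    g_cb(value);
    return 1;
}

/* C-only stand-in for the Cangjie @C function, used to exercise the flow. */
static int last_seen = 0;

static void c_stand_in(int s) {
    last_seen = s;
}

/* Registers the stand-in, fires it once, and reports what it received. */
int demo_roundtrip(void) {
    set_callback(c_stand_in);
    fire_callback(7);
    return last_seen;
}
```

From the library's point of view, a function pointer that originates from a Cangjie @C function or a CFunc lambda is called in the same way as c_stand_in here.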
Assume that the C functions are compiled into a library named "libmyfunc.so". Run the cjc -L. -lmyfunc test.cj -o test.out compilation command so that the Cangjie compiler links against the library and generates the desired executable program.
In addition, when compiling the C code, enable the -fstack-protector-all/-fstack-protector-strong stack protection option. By default, Cangjie code has overflow checks and stack protection. Once C code is introduced, you must also guard against overflows in the code reached through unsafe blocks.
Compiler Options
To use C interoperability, you need to manually link the C library. The Cangjie compiler provides corresponding options.
- --library-path <value>, -L <value>, -L<value>: specifies the directory of the library file to be linked.
  --library-path <value>: adds the specified path to the library file search paths of the linker. The path specified by the environment variable LIBRARY_PATH will also be added to the library file search paths of the linker. The path specified by --library-path has a higher priority than the path specified by LIBRARY_PATH.
- --library <value>, -l <value>, -l<value>: specifies the library file to be linked.
  The specified library file is directly transferred to the linker. The library file name must be in the lib[arg].[extension] format.
For details about all compilation options supported by the CJC compiler, see [CJC Compilation Options](../Appendix/compile_options_OHOS.md).
Example
Assume that there is a C library libpaint.so whose header file is as follows:
#include <stdint.h>
typedef struct {
int64_t x;
int64_t y;
} Point;
typedef struct {
int64_t x;
int64_t y;
int64_t r;
} Circle;
int32_t DrawPoint(const Point* point);
int32_t DrawCircle(const Circle* circle);
The sample code for using the C library in the Cangjie code is as follows:
// main.cj
foreign {
func DrawPoint(point: CPointer<Point>): Int32
func DrawCircle(circle: CPointer<Circle>): Int32
func malloc(size: UIntNative): CPointer<Int8>
func free(ptr: CPointer<Int8>): Unit
}
@C
struct Point {
var x: Int64 = 0
var y: Int64 = 0
}
@C
struct Circle {
var x: Int64 = 0
var y: Int64 = 0
var r: Int64 = 0
}
main() {
let SIZE_OF_POINT: UIntNative = 16
let SIZE_OF_CIRCLE: UIntNative = 24
let ptr1 = unsafe { malloc(SIZE_OF_POINT) }
let ptr2 = unsafe { malloc(SIZE_OF_CIRCLE) }
let pPoint = CPointer<Point>(ptr1)
let pCircle = CPointer<Circle>(ptr2)
var point = Point()
point.x = 10
point.y = 20
unsafe { pPoint.write(point) }
var circle = Circle()
circle.r = 1
unsafe { pCircle.write(circle) }
unsafe {
DrawPoint(pPoint)
DrawCircle(pCircle)
free(ptr1)
free(ptr2)
}
}
Run the following command to compile the Cangjie code (using the CJNative backend as an example):
cjc -L . -l paint ./main.cj
In the compilation command, -L . indicates that the library is searched for in the current directory (assuming that libpaint.so exists in the current directory), and -l paint specifies the name of the library to link. After the compilation succeeds, the binary file main is generated by default. The command for running the binary file is as follows:
LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH ./main
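The example assumes that libpaint.so already exists. For experimenting without a real drawing library, a stand-in implementation could look like the sketch below. The function bodies are assumptions: they only print the received values, and the return codes (0 for success, -1 for a null pointer) are chosen for illustration rather than taken from the source.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    int64_t x;
    int64_t y;
} Point;

typedef struct {
    int64_t x;
    int64_t y;
    int64_t r;
} Circle;

/* Hypothetical body: prints the point instead of drawing it. */
int32_t DrawPoint(const Point *point) {
    if (point == NULL) {
        return -1; /* assumed error code */
    }
    printf("point at (%" PRId64 ", %" PRId64 ")\n", point->x, point->y);
    return 0;
}

/* Hypothetical body: prints the circle instead of drawing it. */
int32_t DrawCircle(const Circle *circle) {
    if (circle == NULL) {
        return -1; /* assumed error code */
    }
    printf("circle at (%" PRId64 ", %" PRId64 "), r = %" PRId64 "\n",
           circle->x, circle->y, circle->r);
    return 0;
}
```

Assuming the file is saved as paint.c (a hypothetical name), a command such as gcc -shared -fPIC -fstack-protector-strong paint.c -o libpaint.so would build the shared library with one of the stack protection options recommended earlier.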