Cangjie-C Interoperability
To ensure compatibility with existing ecosystems, Cangjie supports calling C language functions and allows C language functions to call Cangjie functions.
Cangjie Calling C Function
To call a C function in Cangjie, declare the function in Cangjie with the `@C` and `foreign` modifiers. `@C` can be omitted when `foreign` is present.
For example, suppose you want to call the C functions `rand` and `printf`, whose signatures are as follows:
// stdlib.h
int rand();
// stdio.h
int printf (const char *fmt, ...);
These two functions can be called from Cangjie as follows:
// declare the functions with the `foreign` keyword and omit `@C`
foreign func rand(): Int32
foreign func printf(fmt: CString, ...): Int32

main() {
    // call the functions in `unsafe` blocks
    let r = unsafe { rand() }
    println("random number ${r}")
    unsafe {
        var fmt = LibC.mallocCString("Hello, No.%d\n")
        printf(fmt, 1)
        LibC.free(fmt)
    }
}
Note that:
- `foreign`: used to modify a function declaration, indicating that the function is an external function. A function modified by `foreign` can have only a declaration, not an implementation.
- Functions declared with `foreign` must have parameter and return types that comply with the mapping between C and Cangjie data types. For details, see [Type Mapping](./cangjie-c.md#type-mapping).
- Functions on the C side may perform unsafe operations. Therefore, a call to a function modified by `foreign` must be wrapped in an `unsafe` block. Otherwise, a compilation error occurs.
- When modified by `@C`, the `foreign` keyword can be used only on function declarations, not on other declarations. Otherwise, a compilation error occurs.
- `@C` supports only `foreign` functions, non-generic functions in the top-level scope, and `struct` types.
- `foreign` functions do not support named parameters or default parameter values. They allow variable-length parameters, which are written as `...` and can appear only at the end of the parameter list. Variable-length arguments must meet the `CType` constraint, but they do not need to be of the same type.
- Although Cangjie (CJNative backend) can expand the stack, it cannot detect how much stack a C-side function actually uses. Therefore, a stack overflow may still occur after an FFI call enters a C-side function. Adjust the `cjStackSize` configuration to suit your situation.
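To make the variable-length parameter rule concrete, the following is a minimal C-side sketch. The function `sum_ints` is hypothetical and not part of any library. On the Cangjie side it could presumably be declared as `foreign func sum_ints(count: Int32, ...): Int32`. Note that C applies default argument promotions to variadic arguments, so the callee reads each one as `int`.

```c
#include <stdarg.h>
#include <stdint.h>

/* Hypothetical C function: adds up `count` variadic int arguments. */
int32_t sum_ints(int32_t count, ...) {
    va_list args;
    va_start(args, count);
    int32_t total = 0;
    for (int32_t i = 0; i < count; i++) {
        /* Variadic arguments are promoted, so they are fetched as int. */
        total += (int32_t)va_arg(args, int);
    }
    va_end(args);
    return total;
}
```

The callee has no way to check how many arguments were really passed, which is one reason such calls must sit in an `unsafe` block on the Cangjie side.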
Examples of invalid `foreign` declarations are as follows:
foreign func rand(): Int32 { // compiler error
return 0
}
@C
foreign var a: Int32 = 0 // compiler error
@C
foreign class A{} // compiler error
@C
foreign interface B{} // compiler error
CFunc
In Cangjie, `CFunc` refers to a function that can be called by C code. It takes three forms:
- A `foreign` function modified by `@C`.
- A Cangjie function modified by `@C`.
- A lambda expression of the `CFunc` type. Unlike an ordinary lambda expression, a `CFunc` lambda expression cannot capture variables.
// Case 1
foreign func free(ptr: CPointer<Int8>): Unit
// Case 2
@C
func callableInC(ptr: CPointer<Int8>) {
print("This function is defined in Cangjie.")
}
// Case 3
let f1: CFunc<(CPointer<Int8>) -> Unit> = { ptr =>
print("This function is defined with CFunc lambda.")
}
The type of the functions declared or defined in the preceding three forms is `CFunc<(CPointer<Int8>) -> Unit>`. `CFunc` corresponds to the function pointer type of C. It is a generic type whose generic parameter specifies the parameter types and return type of the function. It is used as follows:
foreign func atexit(cb: CFunc<() -> Unit>): Int32
Like `foreign` functions, other `CFunc` functions must have parameter and return types that meet the `CType` constraint, and they do not support named parameters or default parameter values.
When a `CFunc` is called in Cangjie code, the call must be in an `unsafe` context.
Cangjie can convert a variable of the `CPointer<T>` type to a specific `CFunc`. The generic parameter `T` of `CPointer` can be any type that meets the `CType` constraint. For example:
main() {
var ptr = CPointer<Int8>()
var f = CFunc<() -> Unit>(ptr)
unsafe { f() } // core dumped when running, because the pointer is nullptr.
}
Note:
Forcibly converting a pointer to a `CFunc` and calling it is dangerous. You must ensure that the pointer holds a valid function address. Otherwise, a runtime error occurs.
inout Parameter
When a `CFunc` is called in Cangjie, an argument can be modified by the `inout` keyword to form a pass-by-reference expression. In this case, the argument is passed by reference. The type of a pass-by-reference expression is `CPointer<T>`, where `T` is the type of the expression modified by `inout`.
A pass-by-reference expression has the following restrictions:
- It can be used only in a call to a `CFunc`.
- The type of the modified object must meet the `CType` constraint and cannot be `CString`.
- The modified object cannot be defined with `let`, and cannot be a temporary value such as a literal, an input parameter, or the value of another expression.
- The pointer passed to the C side through a pass-by-reference expression is valid only for the duration of the function call. The C side must not save the pointer for later use.
Variables modified by `inout` can be variables defined in the top-level scope, local variables, or member variables of `struct` types, but they cannot be derived, directly or indirectly, from instance member variables of `class` types.
The following is an example:
foreign func foo1(ptr: CPointer<Int32>): Unit
@C
func foo2(ptr: CPointer<Int32>): Unit {
let n = unsafe { ptr.read() }
println("*ptr = ${n}")
}
let foo3: CFunc<(CPointer<Int32>) -> Unit> = { ptr =>
let n = unsafe { ptr.read() }
println("*ptr = ${n}")
}
struct Data {
var n: Int32 = 0
}
class A {
var data = Data()
}
main() {
var n: Int32 = 0
unsafe {
foo1(inout n) // OK
foo2(inout n) // OK
foo3(inout n) // OK
}
var data = Data()
var a = A()
unsafe {
foo1(inout data.n) // OK
foo1(inout a.data.n) // Error, n is derived indirectly from instance member variables of class A
}
}
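For illustration, a C-side definition of `foo1` might look like the sketch below. The body is an assumption, since the original only declares the function. It reads and updates the value through the pointer while the call is running and deliberately does not store the pointer, because the pointer is valid only for the duration of the call.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed C definition matching: foreign func foo1(ptr: CPointer<Int32>): Unit */
void foo1(int32_t *ptr) {
    if (ptr == NULL) {
        return; /* nothing to do for a null pointer */
    }
    printf("*ptr = %d\n", (int)*ptr);
    *ptr += 1; /* the update is visible to the caller after the call returns */
    /* Do not save `ptr` in a global: it is only valid during this call. */
}
```

After `foo1(inout n)` returns on the Cangjie side, `n` would hold the incremented value under these assumptions.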
Note:
When the macro expansion feature is used, the `inout` parameter feature cannot be used in macro definitions.
unsafe
Interoperability with C also brings in many of the unsafe factors of C. Cangjie therefore uses the `unsafe` keyword to mark unsafe behaviors in calls across the C boundary.
The `unsafe` keyword is described as follows:
- `unsafe` can be used to modify functions, expressions, or a scope.
- A function modified by `@C` must be called in an `unsafe` context.
- A `CFunc` must be called in an `unsafe` context.
- A `foreign` function must be called in an `unsafe` context.
- A function modified by `unsafe` must be called in an `unsafe` context.
For example:
foreign func rand(): Int32
@C
func foo(): Unit {
println("foo")
}
var foo1: CFunc<() -> Unit> = { =>
println("foo1")
}
main(): Int64 {
unsafe {
rand() // Call foreign func.
foo() // Call @C func.
foo1() // Call CFunc var.
}
0
}
Note that an ordinary lambda expression cannot carry the `unsafe` attribute. Once a lambda expression that performs unsafe operations escapes, it can be called directly outside an `unsafe` context without any compilation error. To call an `unsafe` function in a lambda expression, you are advised to wrap the call in an `unsafe` block. For details, see the following case:
unsafe func A(){}
unsafe func B(){
var f = { =>
unsafe { A() } // Avoid calling A() directly without unsafe in a normal lambda.
}
return f
}
main() {
var f = unsafe{ B() }
f()
println("Hello World")
}
Calling Conventions
Function calling conventions describe how a caller and a callee cooperate in a function call (for example, how parameters are passed and who cleans up the stack). The caller and callee must use the same calling convention to run properly. Cangjie uses `@CallingConv` to specify the calling convention. The supported calling conventions are as follows:
- CDECL: the default calling convention used by the Clang C compiler on each platform.
- STDCALL: the calling convention used by the Win32 API.
If a C function is called through the C interoperability mechanism and no calling convention is specified, the `CDECL` calling convention is used by default. The following example calls the `rand` function in the C standard library:
@CallingConv[CDECL] // Can be omitted because CDECL is the default.
foreign func rand(): Int32

main() {
    println(unsafe { rand() })
}
`@CallingConv` can be used only to modify a `foreign` block, a single `foreign` function, or a `CFunc` function in the top-level scope. When `@CallingConv` modifies a `foreign` block, the same `@CallingConv` is applied to every function in the block.
Type Mapping
Base Types
Cangjie and C support mapping between basic data types. The general principles are as follows:
- The Cangjie type does not contain references pointing to the managed memory.
- The Cangjie type and the C type have the same memory layout.
For example, some basic type mapping relationships are as follows:
Cangjie Type | C Type | Size (byte) |
---|---|---|
Unit | void | 0 |
Bool | bool | 1 |
UInt8 | char | 1 |
Int8 | int8_t | 1 |
UInt8 | uint8_t | 1 |
Int16 | int16_t | 2 |
UInt16 | uint16_t | 2 |
Int32 | int32_t | 4 |
UInt32 | uint32_t | 4 |
Int64 | int64_t | 8 |
UInt64 | uint64_t | 8 |
IntNative | ssize_t | platform dependent |
UIntNative | size_t | platform dependent |
Float32 | float | 4 |
Float64 | double | 8 |
Note:
Because the sizes of the `int` and `long` types vary across platforms, programmers need to choose the corresponding Cangjie type themselves. In C interoperability scenarios, as with `void` in C, the `Unit` type can be used only as the return type of a `CFunc` and as the generic parameter of `CPointer`.
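As a rough way to check the sizes in the table on a given platform, the C sketch below compares the fixed-width types against the listed byte counts and prints the platform-dependent ones. The function names are made up for this example. The `float` and `double` sizes hold on mainstream platforms but are not guaranteed by the C standard.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 when the C types have the sizes listed in the mapping table. */
int fixed_widths_match_table(void) {
    return sizeof(int8_t) == 1 && sizeof(uint8_t) == 1 &&
           sizeof(int16_t) == 2 && sizeof(uint16_t) == 2 &&
           sizeof(int32_t) == 4 && sizeof(uint32_t) == 4 &&
           sizeof(int64_t) == 8 && sizeof(uint64_t) == 8 &&
           sizeof(float) == 4 && sizeof(double) == 8;
}

/* Prints the sizes that differ between platforms. */
void print_platform_sizes(void) {
    printf("int: %zu, long: %zu, size_t: %zu\n",
           sizeof(int), sizeof(long), sizeof(size_t));
}
```

If `long` turns out to be 8 bytes on the target, `Int64` would be the matching Cangjie type there, while a platform with a 4-byte `long` would call for `Int32`.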
Cangjie also supports the mapping with the structures and pointer types of the C language.
Structure
For structure types, Cangjie uses a `struct` modified by `@C`. For example, suppose C has the following structure:
typedef struct {
long long x;
long long y;
long long z;
} Point3D;
The corresponding Cangjie type can be defined as follows:
@C
struct Point3D {
var x: Int64 = 0
var y: Int64 = 0
var z: Int64 = 0
}
If the C language contains such a function:
Point3D addPoint(Point3D p1, Point3D p2);
Accordingly, the function can be declared in Cangjie as follows:
foreign func addPoint(p1: Point3D, p2: Point3D): Point3D
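The original gives only the prototype of `addPoint`. A plausible C definition, assumed here to add the points component by component, is sketched below. Both arguments and the result are passed by value, which works because the `@C struct` has the same memory layout as the C structure.

```c
typedef struct {
    long long x;
    long long y;
    long long z;
} Point3D;

/* Assumed implementation: component-wise sum of two points, by value. */
Point3D addPoint(Point3D p1, Point3D p2) {
    Point3D result;
    result.x = p1.x + p2.x;
    result.y = p1.y + p2.y;
    result.z = p1.z + p2.z;
    return result;
}
```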
A `struct` modified by `@C` must meet the following requirements:
- The types of its member variables must meet the `CType` constraint.
- It cannot implement or extend `interface` types.
- It cannot be used as the associated value type of an `enum`.
- It cannot be captured by closures.
- It cannot have generic parameters.
A `struct` modified by `@C` automatically meets the `CType` constraint.
Pointer
For pointer types, Cangjie provides the `CPointer<T>` type, which corresponds to the pointer type on the C side. The generic parameter `T` must meet the `CType` constraint. For example, the signature of the `malloc` function in C is as follows:
void* malloc(size_t size);
In Cangjie, it can be declared as follows:
foreign func malloc(size: UIntNative): CPointer<Unit>
`CPointer` supports reading and writing, offset calculation, null checks, and pointer conversion. For details about the API, see "Cangjie Programming Language Library API". Reading, writing, and offset calculation are unsafe: performing them on an invalid pointer may cause undefined behavior, so these functions must be called in `unsafe` blocks.
The following is an example of using `CPointer`:
foreign func malloc(size: UIntNative): CPointer<Unit>
foreign func free(ptr: CPointer<Unit>): Unit
@C
struct Point3D {
var x: Int64
var y: Int64
var z: Int64
init(x: Int64, y: Int64, z: Int64) {
this.x = x
this.y = y
this.z = z
}
}
main() {
let p1 = CPointer<Point3D>() // create a CPointer with null value
if (p1.isNull()) { // check if the pointer is null
print("p1 is a null pointer")
}
let sizeofPoint3D: UIntNative = 24
var p2 = unsafe { malloc(sizeofPoint3D) } // malloc a Point3D in heap
var p3 = unsafe { CPointer<Point3D>(p2) } // pointer type cast
unsafe { p3.write(Point3D(1, 2, 3)) } // write data through pointer
let p4: Point3D = unsafe { p3.read() } // read data through pointer
let p5: CPointer<Point3D> = unsafe { p3 + 1 } // offset of pointer
unsafe { free(p2) }
}
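For comparison, the same sequence of pointer operations can be sketched in C. The helper names are hypothetical. The sketch also shows where the hard-coded size of 24 comes from: three 8-byte fields, which is also the distance covered by an offset of one element.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int64_t x;
    int64_t y;
    int64_t z;
} Point3D;

/* Allocates a Point3D, writes (1, 2, 3), reads it back, and frees it. */
int64_t write_then_read_sum(void) {
    Point3D *p = malloc(sizeof(Point3D)); /* like malloc + pointer cast */
    if (p == NULL) {
        return -1;
    }
    *p = (Point3D){1, 2, 3};              /* like p3.write(...) */
    Point3D copy = *p;                    /* like p3.read() */
    free(p);
    return copy.x + copy.y + copy.z;
}

/* Byte distance covered by an offset of one element (p + 1). */
size_t point3d_stride(void) {
    Point3D slot[2];
    return (size_t)((char *)(slot + 1) - (char *)slot);
}
```

As in C, the pointer produced by the offset in the Cangjie example points past the allocated object and should not be read or written.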
Cangjie supports forcible type conversion between `CPointer` types. The generic parameter `T` of `CPointer` must meet the `CType` constraint both before and after the conversion. For example:
main() {
var pInt8 = CPointer<Int8>()
var pUInt8 = CPointer<UInt8>(pInt8) // CPointer<Int8> convert to CPointer<UInt8>
0
}
Cangjie can convert a variable of the `CFunc` type to a specific `CPointer`. The generic parameter `T` of `CPointer` can be any type that meets the `CType` constraint. For example:
foreign func rand(): Int32
main() {
var ptr = CPointer<Int8>(rand)
0
}
Note:
Forcibly converting a `CFunc` to a pointer is safe. However, do not perform `read` or `write` operations on the converted pointer, because doing so may cause runtime errors.
Array
Cangjie uses the `VArray` type to map to the array type of C. `VArray` can be used as a function parameter or as a member of an `@C struct`. When the element type `T` in `VArray<T, $N>` meets the `CType` constraint, the `VArray<T, $N>` type also meets the `CType` constraint.
As a function parameter type:
When a `VArray` is passed as an argument to a `CFunc`, the corresponding parameter type in the `CFunc` signature can only be `CPointer<T>` or `VArray<T, $N>`. If the parameter type in the signature is `VArray<T, $N>`, the argument is still passed as a `CPointer<T>`.
The following is an example of using `VArray` as a parameter:
foreign func cfoo1(a: CPointer<Int32>): Unit
foreign func cfoo2(a: VArray<Int32, $3>): Unit
The corresponding C-side function definition may be as follows:
void cfoo1(int *a) { ... }
void cfoo2(int a[3]) { ... }
When calling the `CFunc`, use `inout` to modify the variable of the `VArray` type.
var a: VArray<Int32, $3> = [1, 2, 3]
unsafe {
cfoo1(inout a)
cfoo2(inout a)
}
`VArray` cannot be used as the return type of a `CFunc`.
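The C definitions of `cfoo1` and `cfoo2` are elided in the original. The bodies below are assumptions, written only to show that both parameter spellings receive a pointer to the caller's array. This is standard C behavior (an array parameter is adjusted to a pointer), and it is consistent with a `VArray<T, $N>` argument being passed as `CPointer<T>`.

```c
/* Assumed body: negates the first element through the pointer. */
void cfoo1(int *a) {
    a[0] = -a[0];
}

/* Assumed body: doubles all three elements. `int a[3]` is adjusted to
   `int *a`, so the writes land in the caller's array. */
void cfoo2(int a[3]) {
    for (int i = 0; i < 3; i++) {
        a[i] *= 2;
    }
}
```

With these bodies, the changes made by `cfoo1` and `cfoo2` are visible in the caller's array after each call, and the C side cannot verify that the array really has three elements.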
As a member of @C struct:
When `VArray` is a member of an `@C struct`, its memory layout is the same as the structure layout on the C side. Ensure that the declared length and element type on the Cangjie side match those on the C side.
struct S {
    int a[2];
    int b[0];
};
In Cangjie, the following structure can be declared to correspond to the C code:
@C
struct S {
var a = VArray<Int32, $2>(item: 0)
var b = VArray<Int32, $0>(item: 0)
}
Note:
In the C language, the last field of a structure can be an array whose length is not specified. The array is called a flexible array. Cangjie does not support the mapping of structures that contain flexible arrays.
Character String
For the string type in C, Cangjie provides the `CString` type. To simplify operations on C strings, `CString` provides the following member functions:
- `init(p: CPointer<UInt8>)`: constructs a `CString` from a `CPointer`.
- `func getChars()`: obtains the address of the string. The return type is `CPointer<UInt8>`.
- `func size(): Int64`: calculates the length of the string.
- `func isEmpty(): Bool`: checks whether the length of the string is 0. Returns `true` if the string pointer is null.
- `func isNotEmpty(): Bool`: checks whether the length of the string is not 0. Returns `false` if the string pointer is null.
- `func isNull(): Bool`: checks whether the string pointer is null.
- `func startsWith(str: CString): Bool`: checks whether the string starts with `str`.
- `func endsWith(str: CString): Bool`: checks whether the string ends with `str`.
- `func equals(rhs: CString): Bool`: checks whether the string is equal to `rhs`.
- `func equalsLower(rhs: CString): Bool`: checks whether the string is equal to `rhs`, ignoring case.
- `func subCString(start: UInt64): CString`: extracts the substring beginning at `start` and stores it in newly allocated space.
- `func subCString(start: UInt64, len: UInt64): CString`: extracts the substring of length `len` beginning at `start` and stores it in newly allocated space.
- `func compare(str: CString): Int32`: compares the string with `str` and returns the same result as `strcmp(this, str)` in C.
- `func toString(): String`: constructs a new `String` object from the string.
- `func asResource(): CStringResource`: obtains the resource type of the `CString`.
In addition, the `mallocCString` function in `LibC` can be called to convert a `String` to a `CString`. The `CString` must be released after use.
The following is an example of using `CString`:
foreign func strlen(s: CString): UIntNative
main() {
var s1 = unsafe { LibC.mallocCString("hello") }
var s2 = unsafe { LibC.mallocCString("world") }
let t1: Int64 = s1.size()
let t2: Bool = s2.isEmpty()
let t3: Bool = s1.equals(s2)
let t4: Bool = s1.startsWith(s2)
let t5: Int32 = s1.compare(s2)
let length = unsafe { strlen(s1) }
unsafe {
LibC.free(s1)
LibC.free(s2)
}
}
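To show what results to expect from `startsWith` and `compare` in the example, here is a C sketch of the equivalent checks on plain `char *` strings. This is not the `CString` implementation; the helper names are invented, and both helpers assume non-null, NUL-terminated inputs.

```c
#include <stdbool.h>
#include <string.h>

/* True when `s` begins with `prefix` (both must be valid C strings). */
bool starts_with(const char *s, const char *prefix) {
    size_t n = strlen(prefix);
    return strncmp(s, prefix, n) == 0;
}

/* Normalizes the strcmp result to -1, 0, or 1. */
int compare_sign(const char *a, const char *b) {
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

For "hello" and "world", `starts_with` is false and the comparison is negative, so `t4` would be `false` and `t5` would be a negative value if `compare` follows `strcmp` as described.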
sizeOf/alignOf
Cangjie also provides the `sizeOf` and `alignOf` functions, which return the memory size and memory alignment (in bytes) of the preceding C interoperability types. The functions are declared as follows:
public func sizeOf<T>(): UIntNative where T <: CType
public func alignOf<T>(): UIntNative where T <: CType
Example:
@C
struct Data {
var a: Int64 = 0
var b: Float32 = 0.0
}
main() {
println(sizeOf<Data>())
println(alignOf<Data>())
}
If you run the program on a 64-bit computer, the following output is displayed:
16
8
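The same numbers can be cross-checked on the C side with `sizeof` and `_Alignof` (C11). The sketch below assumes a typical 64-bit platform, where `int64_t` is 8-byte aligned: the 8-byte field plus the 4-byte field take 12 bytes, and the size is padded to 16 so that it is a multiple of the alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* C counterpart of the Cangjie `Data` struct above. */
typedef struct {
    int64_t a;
    float b;
} Data;

size_t data_size(void) {
    return sizeof(Data);   /* 16 on typical 64-bit platforms */
}

size_t data_align(void) {
    return _Alignof(Data); /* 8 on typical 64-bit platforms */
}
```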
CType
In addition to the types mapped to C-side types in "Type Mapping", Cangjie provides the `CType` interface. The interface contains no methods and serves as the parent type of all types supported by C interoperability, which makes it convenient to use in generic constraints.
Note that:
- The `CType` interface itself does not meet the `CType` constraint.
- The `CType` interface cannot be inherited or extended.
- The `CType` interface does not lift the usage restrictions of its subtypes.
The following is an example of using `CType`:
func foo<T>(x: T): Unit where T <: CType {
match (x) {
case i32: Int32 => println(i32)
case ptr: CPointer<Int8> => println(ptr.isNull())
case f: CFunc<() -> Unit> => unsafe { f() }
case _ => println("match failed")
}
}
main() {
var i32: Int32 = 1
var ptr = CPointer<Int8>()
var f: CFunc<() -> Unit> = { => println("Hello") }
var f64 = 1.0
foo(i32)
foo(ptr)
foo(f)
foo(f64)
}
The result is as follows:
1
true
Hello
match failed
C Calling Cangjie Functions
Cangjie provides the `CFunc` type, which corresponds to the function pointer type on the C side. A function pointer can be passed from the C side to Cangjie, and Cangjie can also construct a value corresponding to a C function pointer and pass it to the C side.
Assume that a C library API is as follows:
typedef void (*callback)(int);
void set_callback(callback cb);
Correspondingly, the function in Cangjie can be declared as follows:
foreign func set_callback(cb: CFunc<(Int32) -> Unit>): Unit
Variables of the `CFunc` type can be passed in from the C side or constructed on the Cangjie side. There are two ways to construct a `CFunc` on the Cangjie side: use a function modified by `@C`, or use a closure marked as `CFunc`.
A function modified by `@C` has a signature that follows the C calling rules, but its definition is still written in Cangjie. A function modified by `foreign` is defined on the C side.
Note:
For functions modified by `foreign` or `@C` (both kinds are referred to as `CFunc`), you are advised not to use `CJ_` (case-insensitive) as a name prefix. Otherwise, the names may conflict with internal compiler symbols, such as those of the standard library and runtime, resulting in undefined behavior.
Example:
@C
func myCallback(s: Int32): Unit {
println("handle ${s} in callback")
}
main() {
// the argument is a function qualified by `@C`
unsafe { set_callback(myCallback) }
// the argument is a lambda with `CFunc` type
let f: CFunc<(Int32) -> Unit> = { i => println("handle ${i} in callback") }
unsafe { set_callback(f) }
}
Assume that the C functions are compiled into the library "libmyfunc.so". Run the compilation command `cjc -L. -lmyfunc test.cj -o test.out` so that the Cangjie compiler links against the library and generates the desired executable program.
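The original does not show the C side of `set_callback`. A minimal sketch of what such a library could contain is given below. Only `callback` and `set_callback` come from the API above; `fire_callback`, `remember_value`, and `last_value` are hypothetical helpers added so the behavior can be observed, with `remember_value` standing in for a callback supplied by Cangjie.

```c
#include <stddef.h>

typedef void (*callback)(int);

static callback g_cb = NULL; /* the most recently registered callback */
static int g_last = -1;      /* value recorded by the stand-in callback */

/* Stores the function pointer handed over by the caller. */
void set_callback(callback cb) {
    g_cb = cb;
}

/* Hypothetical helper: invokes the callback; returns 0 if none is set. */
int fire_callback(int value) {
    if (g_cb == NULL) {
        return 0;
    }
    g_cb(value);
    return 1;
}

/* Stand-in for a Cangjie `@C` function or `CFunc` lambda. */
void remember_value(int value) {
    g_last = value;
}

int last_value(void) {
    return g_last;
}
```

Because the library keeps the function pointer after `set_callback` returns, the registered function must stay valid for as long as the library may invoke it.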
In addition, enable the `-fstack-protector-all` or `-fstack-protector-strong` stack protection option when compiling the C code. Cangjie code has overflow checks and stack protection by default. After C code is introduced, you need to ensure that overflows cannot occur in `unsafe` blocks.
Compiler Options
To use C interoperability, you need to link the C library manually. The Cangjie compiler provides the following options:
- `--library-path <value>`, `-L <value>`, `-L<value>`: specifies the directory of the library file to be linked.
  `--library-path <value>` adds the specified path to the library search paths of the linker. The path specified by the environment variable `LIBRARY_PATH` is also added to the library search paths of the linker. The path specified by `--library-path` has a higher priority than the path specified by `LIBRARY_PATH`.
- `--library <value>`, `-l <value>`, `-l<value>`: specifies the library file to be linked.
  The specified library file is passed directly to the linker. The library file name must be in the `lib[arg].[extension]` format.
For details about all compilation options supported by the CJC compiler, see [CJC Compilation Options](../Appendix/compile_options_OHOS.md).
Example
Assume that there is a C library `libpaint.so` whose header file is as follows:
#include <stdint.h>
typedef struct {
int64_t x;
int64_t y;
} Point;
typedef struct {
int64_t x;
int64_t y;
int64_t r;
} Circle;
int32_t DrawPoint(const Point* point);
int32_t DrawCircle(const Circle* circle);
The sample code for using the C library in the Cangjie code is as follows:
// main.cj
foreign {
func DrawPoint(point: CPointer<Point>): Int32
func DrawCircle(circle: CPointer<Circle>): Int32
func malloc(size: UIntNative): CPointer<Int8>
func free(ptr: CPointer<Int8>): Unit
}
@C
struct Point {
var x: Int64 = 0
var y: Int64 = 0
}
@C
struct Circle {
var x: Int64 = 0
var y: Int64 = 0
var r: Int64 = 0
}
main() {
let SIZE_OF_POINT: UIntNative = 16
let SIZE_OF_CIRCLE: UIntNative = 24
let ptr1 = unsafe { malloc(SIZE_OF_POINT) }
let ptr2 = unsafe { malloc(SIZE_OF_CIRCLE) }
let pPoint = CPointer<Point>(ptr1)
let pCircle = CPointer<Circle>(ptr2)
var point = Point()
point.x = 10
point.y = 20
unsafe { pPoint.write(point) }
var circle = Circle()
circle.r = 1
unsafe { pCircle.write(circle) }
unsafe {
DrawPoint(pPoint)
DrawCircle(pCircle)
free(ptr1)
free(ptr2)
}
}
Run the following command to compile the Cangjie code (using the CJNative backend as an example):
cjc -L . -l paint ./main.cj
In the compilation command, `-L .` tells the linker to search for libraries in the current directory (assuming that `libpaint.so` is in the current directory), and `-l paint` specifies the name of the library to link. After compilation succeeds, the binary file `main` is generated by default. Run the binary file with the following command:
LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH ./main
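The source of `libpaint.so` is not shown in the original. To try the example end to end, a stand-in implementation such as the following sketch could be used. The behavior (printing the shape and returning 0, or -1 for a null pointer) is an assumption made for illustration only.

```c
#include <stdint.h>
#include <stdio.h>

typedef struct {
    int64_t x;
    int64_t y;
} Point;

typedef struct {
    int64_t x;
    int64_t y;
    int64_t r;
} Circle;

/* Assumed behavior: print the point; return 0 on success, -1 on null. */
int32_t DrawPoint(const Point *point) {
    if (point == NULL) {
        return -1;
    }
    printf("Point(%lld, %lld)\n", (long long)point->x, (long long)point->y);
    return 0;
}

/* Assumed behavior: print the circle; return 0 on success, -1 on null. */
int32_t DrawCircle(const Circle *circle) {
    if (circle == NULL) {
        return -1;
    }
    printf("Circle(%lld, %lld, r=%lld)\n", (long long)circle->x,
           (long long)circle->y, (long long)circle->r);
    return 0;
}
```

Assuming GCC and a file named `paint.c` (both assumptions), the shared library could be built with `gcc -shared -fPIC paint.c -o libpaint.so` before running the `cjc` command above.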