Speeding Up Python Apps With CFFI

Calling C/C++ Code from Python

Introduction

Python is a powerful and versatile programming language, but it can be slow compared to compiled languages like C or C++. However, Python can call C/C++ code, either to speed up an application or to use functionality not available in Python. The C Foreign Function Interface (CFFI) lets you call C code from Python. Large Python applications can take advantage of the speed of C/C++ by writing performance-critical parts of the code in C/C++ and calling that code from Python. For CPU-bound hot spots, this can give your application a significant speed boost!

In this article, we will explore how to use CFFI to call C code from Python and provide several examples of how this can be used to speed up your Python applications. The same techniques work in C++, provided you define your C++ code as extern “C”.

In C++, the extern "C" syntax is used to declare a function or variable as having “C” linkage. This means that the name of the function or variable is not mangled by the C++ compiler, allowing it to be used from other programming languages that follow the C calling convention.

C++ uses a technique called name mangling to encode additional information about functions and methods in the compiled code. This encoding makes it possible to support features like function overloading, but it can also make it difficult to use C++ code from other programming languages. By using extern "C", you can tell the C++ compiler to use the same naming convention as C, which makes it easier to interoperate with other languages.

Other programming languages that support something similar to extern "C" include D, Rust, and Swift. In D, you can use the extern(C) linkage attribute to declare functions with C linkage. In Rust, declaring a function as extern "C" gives it the C calling convention, and the #[no_mangle] attribute keeps its symbol name unmangled. In Swift, you can use the @_cdecl("functionName") attribute to specify the C-compatible name of a function.

Any language that supports C linkage can be used from Python, provided you follow that language's procedures for compiling the functions you wish to call from Python with C linkage. However, today I will focus on using C with Python. You will find the techniques are similar for all other languages that provide C linkage.

Installing CFFI

Before we can start using CFFI, we need to install it. You can install CFFI using pip, the Python package manager, like this:

pip install cffi

Writing a Simple C Function

Example 1: Passing integer data to C and back to Python:

Let’s start with a simple C function that takes two integer arguments and returns their sum:

int add(int x, int y){
    return x + y; 
}

Save this code to a file called example.c.

Compiling the C Code

Before we can use this C function from Python, we need to compile it into a shared library that Python can load. On Linux, we can do this using GCC:

gcc -shared -o example.so example.c

This will create a shared library called example.so that contains the add function.

On Windows, with a GCC port such as MinGW-w64, only the output filename changes:

gcc -shared -o example.dll example.c

This will create a shared library called example.dll that contains the add function.

On macOS, the output filename changes again:

gcc -shared -o example.dylib example.c

This will create a shared library called example.dylib that contains the add function.

Using CFFI to Call a C Function from Python

Now that we have a shared library containing the add function, we can use CFFI to call it from Python. Here’s how:

import cffi

ffi = cffi.FFI()
lib = ffi.dlopen("./example.so")  # or "./example.dll" on Windows, or "./example.dylib" on macOS

ffi.cdef("""
    int add(int x, int y);
""")

result = lib.add(2, 3)
print(result)  # Output: 5

Let’s break this down:

  1. We import the cffi module.
  2. We create a new FFI object, which we will use to interface with the C code.
  3. We load the shared library containing the add function using ffi.dlopen(). We use a different filename depending on the platform we are running on.
  4. We use ffi.cdef() to define the C function signature. This tells CFFI what arguments the add function takes and what type of value it returns.
  5. We call the add function using lib.add(). This is equivalent to calling the add function in C.
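Step 3 mentions using a different filename on each platform. One way to avoid editing the path by hand is a small helper keyed on sys.platform. This is only a sketch: the helper name library_filename is made up for this illustration, and it assumes the library was built with the filenames used above.

```python
import sys


def library_filename(stem: str = "example", platform: str = sys.platform) -> str:
    """Return the shared-library path for the given platform string."""
    if platform.startswith("win"):
        suffix = ".dll"      # Windows
    elif platform == "darwin":
        suffix = ".dylib"    # macOS
    else:
        suffix = ".so"       # Linux and most other Unix-like systems
    return f"./{stem}{suffix}"


print(library_filename())
```

With this in place, the loading line could become lib = ffi.dlopen(library_filename()).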

IMPORTANT NOTE: In the examples below, be sure to add the function signature to the ffi.cdef() each time you add a function to the example.c. Also, don’t forget to recompile example.c every time you edit it.

The add function took two integer values and returned an integer. But what if you have a function that takes floats?

Example 2: Passing & Returning Floats

Let’s add a new function to our example.c file:

/**
 * multiply(float x, float y) - Multiplies two float numbers
 * param: x : float - multiplicand float value
 * param: y : float - multiplier float value
 * return product : float
 */
float multiply(float x, float y) {
    return x * y;
}

Our new multiply function takes two floats and returns a float value. A C function can take and return many other types as well. Later we will see an example of passing complex types.

The code to call our new function from Python is simple:

def multiply(x: float, y: float) -> float:
    return lib.multiply(x, y)

This should look pretty similar to the add function. Only the types have changed. We still call the function as lib.<function_name>, in this case lib.multiply.
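One detail worth knowing: a Python float is a 64-bit double, while a C float is normally 32-bit single precision, so values passed to multiply are narrowed on the way in. The sketch below uses the standard struct module to show that narrowing without calling any C code. The helper name as_c_float is invented for this example, and it assumes the usual 32-bit IEEE float found on mainstream platforms.

```python
import struct


def as_c_float(value: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit C float."""
    return struct.unpack("f", struct.pack("f", value))[0]


print(as_c_float(2.5) == 2.5)   # True: 2.5 is exactly representable in 32 bits
print(as_c_float(0.1) == 0.1)   # False: 0.1 rounds differently in 32 bits
```

If that loss of precision matters, declaring the C function with double instead of float avoids it.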

Example 3: Passing & Returning Strings

While integers and floats pass between Python and C with no extra work on our part, some types, like strings, need special handling. Let’s see how to handle passing strings back and forth. For this we will create a new function in our example.c file named “greet”. The greet function will take a string and return a string.

/**
 * greet(char *name) - Given a name returns a greeting
 * param: name : char* - the name to greet
 * return: greeting : char*
 */
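char* greet(char* name) {
    static char message[50];
    /* snprintf never writes more than sizeof(message) bytes, so a long
       name is truncated instead of overflowing the buffer. */
    snprintf(message, sizeof(message), "Hello, %s!", name);
    return message;
}

Our new greet function formats its result into a static buffer named message. It takes a string called name and combines it with “Hello, ” using the snprintf() function, which stops writing at the end of the buffer, so an overly long name is truncated rather than overflowing it. Because message is static, it is still valid after greet returns, but it is overwritten by the next call.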

Let’s now see how we call this from Python:

def greet(name: str) -> str:
    b_string = name.encode() # Note the name string must be converted to bytes.
    result = lib.greet(b_string)
    bstring = ffi.string(result)
    return bstring.decode("utf8")

Here, the Python function takes a str and encodes it to a bytes object, which is stored in b_string (encode() uses UTF-8 by default). This bytes object is passed to the lib.greet() function, where C sees it as an ordinary NUL-terminated char array. The returned value, however, is a C char pointer, and must be converted back to a str for use in Python. First we pass the result to ffi.string(), which copies the NUL-terminated C string into a Python bytes object. That object is still just raw bytes, so we call decode("utf8") on bstring to interpret those bytes as UTF-8 text. Now we have a str we can use in Python.
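Because C works with bytes rather than characters, the length that matters for the 50-byte message buffer is the encoded length of the name, not len(name). The following sketch shows the difference. The names GREET_BUFFER, MAX_NAME_BYTES and encoded_length are made up for illustration, and the 50 comes from the message array in greet.

```python
GREET_BUFFER = 50                       # size of message[] in greet()
OVERHEAD = len("Hello, !") + 1          # fixed text plus the trailing NUL
MAX_NAME_BYTES = GREET_BUFFER - OVERHEAD


def encoded_length(name: str, encoding: str = "utf-8") -> int:
    """Number of bytes C will see for this name (excluding the trailing NUL)."""
    return len(name.encode(encoding))


# "Jos\u00e9" is four characters but five bytes in UTF-8.
for name in ("George", "Jos\u00e9"):
    n_bytes = encoded_length(name)
    print(name, len(name), n_bytes, n_bytes <= MAX_NAME_BYTES)
```

A check like this on the Python side is a cheap way to reject names that would not fit before they ever reach the C function.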

One thing to be aware of here is that many types require conversion both when passing data in to C and when returning the result to Python. For some types this conversion is not trivial and takes time, and that overhead is paid on every call. If the C function does very little work, the overhead can outweigh the gain. So, it is best to use C functions only when the overhead is a small part of the actual processing time. Always test on the target hardware under “real” conditions and evaluate the results to determine whether using C is warranted. In some cases you may actually be making your code slower by using C.
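A quick way to check whether a call is worth moving to C is to time both versions with the standard timeit module. The helper below is a minimal sketch; seconds_per_call and py_add are names invented for this example, and the commented-out line assumes the lib object from earlier is loaded.

```python
import timeit


def seconds_per_call(func, *args, repeat: int = 5, number: int = 10_000) -> float:
    """Best-of-`repeat` average time for one call of func(*args), in seconds."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def py_add(x: int, y: int) -> int:
    return x + y


print(f"py_add: {seconds_per_call(py_add, 2, 3):.3e} s per call")
# With the library loaded, the same helper could time the C version:
# print(f"lib.add: {seconds_per_call(lib.add, 2, 3):.3e} s per call")
```

For a function as small as add, do not be surprised if the pure-Python version comes out ahead; the comparison becomes more interesting as the work done per call grows.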

Passing and Returning Complex Data Types

Example 4: Passing an Array

OK, so far we’ve seen arguments passed as integers, floats, and strings. But what if we need to pass something a little more complex, like a Python list? Well, that’s not hard either. Let’s add a new function to example.c:

/**
 * sum(int *array, int length) : Sums all integers in array
 * param: array: int - Array containing integer values to sum.
 * param: length: int - Length of array.
 * return int - sum of all integers in array.
 */
int sum(int *array, int length) {
    int result = 0;
    for (int i = 0; i < length; i++) {
        result += array[i];
    }
    return result;
}

Our new sum() function takes a list (array) of integers and returns an integer result, which is the sum of all the integer values in the list.

Now let’s see the Python code to call this function:

def sum(arr: list[int]) -> int:
    # convert the Python list to a C array
    c_array = ffi.new("int[]", arr)

    # call the C function with the C array and the length of the array
    result = lib.sum(c_array, len(arr))

    # convert the result back to a Python int
    return int(result)

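As you can see, this is straightforward. A call to ffi.new() allocates a new C array and copies the values from arr into it. We pass the type of the C array, “int[]”, and the data to be stored in the array (arr). The memory is owned by the returned object, c_array, and is released when that object is garbage-collected, so keep a reference to it for as long as C uses the array. Then we call lib.sum() to run the C function. CFFI already returns a C int as a Python int, so the final int(result) is not strictly required; it simply makes the intent explicit.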
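Python integers have arbitrary precision, but a C int has a fixed width, and overflowing a signed int in C is undefined behavior. A small guard on the Python side can catch inputs that the C sum function cannot handle. This is a sketch under the assumption that the C code uses plain int, as above; safe_for_c_sum is a hypothetical helper name.

```python
import struct

# Size of a C int on this platform, in bits (32 on mainstream platforms).
INT_BITS = struct.calcsize("i") * 8
INT_MIN = -(2 ** (INT_BITS - 1))
INT_MAX = 2 ** (INT_BITS - 1) - 1


def safe_for_c_sum(values: list[int]) -> bool:
    """True if every element and every running total fits in a C int."""
    total = 0
    for v in values:
        total += v
        if not (INT_MIN <= v <= INT_MAX and INT_MIN <= total <= INT_MAX):
            return False
    return True


print(safe_for_c_sum([1, 2, 3]))        # True
print(safe_for_c_sum([INT_MAX, 1]))     # False: the total would overflow
```

The running total is checked at each step because the C loop accumulates in the same order, so an intermediate overflow matters even if the final result would fit.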

Example 5: Using C code to speed up matrix multiplication in Python

Matrix multiplication is a common operation in scientific computing that can benefit greatly from C code optimization. Here’s an example of using CFFI to implement matrix multiplication in C and call it from Python:

// Add to example.c
#include <stdio.h>
#include <stdlib.h>

void matrix_multiply(double* A, double* B, double* C, int m, int n, int p) {
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < p; j++) {
            double sum = 0;
            for (int k = 0; k < n; k++) {
                sum += A[i * n + k] * B[k * p + j];
            }
            C[i * p + j] = sum;
        }
    }
}

The Python code to call it:

def matrix_multiply():
    q = 3
    A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    B = [[9.0, 8.0, 7.0], [6.0, 5.0, 4.0], [3.0, 2.0, 1.0]]
    C = [[0.0] * q for _ in range(q)]

    print("Matrix Multiply...")
    print(f"A: {A}\n\nB: {B}\n\nC: {C}\n\n")

    # Convert to C types
    _A = ffi.new("double[]", __builtins__.sum(A, []))
    _B = ffi.new("double[]", __builtins__.sum(B, []))
    _C = ffi.new("double[]", __builtins__.sum(C, []))

    # A is m x n, B is n x p, C is m x p; all three are q x q here
    m, n, p = q, q, q

    lib.matrix_multiply(_A, _B, _C, m, n, p)
    C = [[_C[i * q + j] for j in range(q)] for i in range(q)]

    print(f"Result:\n\nA: {A}\n\nB: {B}\n\nC: {C}\n\n")

In this example, we define a C function matrix_multiply that takes three double arrays A, B, and C, and their dimensions m, n, and p, respectively. The function performs matrix multiplication on A and B, and stores the result in C. We then use CFFI to define the C interface of the function, load the C library, and call the function from Python using lists.
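The C function treats each matrix as a flat, row-major array, so element (i, k) of an m x n matrix lives at index i * n + k. A pure-Python mirror of the same loop makes that indexing easy to experiment with, and gives a reference result to compare the C output against. The name matmul_flat is made up for this sketch.

```python
def matmul_flat(a: list[float], b: list[float], m: int, n: int, p: int) -> list[float]:
    """Multiply an m x n matrix by an n x p matrix, both stored row-major."""
    c = [0.0] * (m * p)
    for i in range(m):
        for j in range(p):
            total = 0.0
            for k in range(n):
                # Element (i, k) of A lives at a[i * n + k];
                # element (k, j) of B lives at b[k * p + j].
                total += a[i * n + k] * b[k * p + j]
            c[i * p + j] = total
    return c


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
b = [9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
print(matmul_flat(a, b, 3, 3, 3))
# [30.0, 24.0, 18.0, 84.0, 69.0, 54.0, 138.0, 114.0, 90.0]
```

For the 3 x 3 matrices used in this example, the C function should produce the same nine values when called with m, n, and p all equal to 3.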

Also, note the use of __builtins__ to refer explicitly to the built-in sum. This is only needed because we defined our own function named sum above, and it hides the Python built-in sum function. Be aware that __builtins__ is a CPython implementation detail that is only guaranteed to be a module in the main script; the portable spelling is to import builtins and call builtins.sum(). We use sum() to flatten the 2D lists into 1D lists, which ffi.new() then copies into C arrays of double. Since A and B are lists of lists, passing [] as the start value makes sum() concatenate the rows into a single list.
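Flattening with sum() builds a new list at every step, which gets slow for large matrices. An alternative that runs in linear time, and does not depend on the name sum at all, is itertools.chain.from_iterable from the standard library. The helper name flatten is chosen for this sketch.

```python
from itertools import chain


def flatten(rows: list[list[float]]) -> list[float]:
    """Concatenate the rows of a 2D list into one flat, row-major list."""
    return list(chain.from_iterable(rows))


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
print(flatten(A))
# [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
```

The flat list can be passed to ffi.new("double[]", ...) exactly as before.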

Example 6: Using C code to read and write files in Python

Parsing large text files of numbers in pure Python can be slow. Here’s an example of using CFFI to implement file I/O in C and call it from Python:

int read_file(char* filename, double* data, int size) {
    FILE *file = fopen(filename, "r");
    if (file == NULL) {
        return -1;
    }

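    for (int i = 0; i < size; i++) {
        /* fscanf returns the number of values converted; anything
           other than 1 means end of file or unparsable input. */
        if (fscanf(file, "%lf", &data[i]) != 1) {
            break;
        }
    }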

    fclose(file);
    return 0;
}

int write_file(char* filename, double* data, int size) {
    FILE *file = fopen(filename, "w");
    if (file == NULL) {
        return -1;
    }

    for (int i = 0; i < size; i++) {
        fprintf(file, "%lf\n", data[i]);
    }

    fclose(file);
    return 0;
}

Before we can use our new file I/O routines we need a data file to read from. Below is a short script placed in the data_gen.py file. This script will generate 100,000 random numerical values and save them in the data.txt file. We will use the file with our new read file function.

# File: data_gen.py
import random

def main():
    with open('data.txt', 'w') as f:
        for i in range(100_000):
            x = random.uniform(-10000, 10000)
            f.write(str(x) + '\n')


if __name__ == '__main__':
    main()

Create a new Python file named data_gen.py and add the code above. Run the script and check that the data.txt file now exists.
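Before moving the file parsing into C, it can be worth having a pure-Python reader to compare against, both for timing and for checking that the C routine returns the same values. The sketch below is one such baseline. The name read_floats is made up for this illustration, and it assumes one value per line, as data_gen.py writes. The demo at the end uses a small temporary file so that it runs on its own.

```python
import os
import tempfile


def read_floats(path: str, limit: int) -> list[float]:
    """Read up to `limit` floats from a text file with one value per line."""
    values: list[float] = []
    with open(path) as f:
        for line in f:
            if len(values) >= limit:
                break
            line = line.strip()
            if line:                      # skip blank lines
                values.append(float(line))
    return values


# Small self-contained demo using a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")
    with open(demo_path, "w") as f:
        f.write("1.5\n-2.25\n\n3.0\n")
    print(read_floats(demo_path, 10))     # [1.5, -2.25, 3.0]
```

Against the real data, the call would be read_floats("data.txt", 100_000).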

Now let’s see how to call our C file functions from Python. Add the following code to our main.py file:

def read_data():
    # read data from file
    size = 100_000
    data = ffi.new("double[]", size)
    filename = ffi.new("char[]", b"data.txt")
    retval = lib.read_file(filename, data, size)

    # check if read was successful
    if retval == 0:
        print("Data read successfully:")
        for i in range(size):
            print(data[i])
    else:
        print("Error reading data from file.")
    return data


def write_data(data_list: list):
    size = len(data_list)
    data = ffi.new("double[]", data_list)

    # write data to file
    filename = ffi.new("char[]", b"output.txt")
    retval = lib.write_file(filename, data, size)

    # check if write was successful
    if retval == 0:
        print("Data written successfully.")
    else:
        print("Error writing data to file.")

This all looks pretty simple, right? Nothing out of the ordinary here. Now, let’s see the entire C and Python files:

/**************************************************
 *    C function file for Python CFFI Demo        *
 **************************************************
 * File: example.c
 * Compile: gcc -shared -o example.so example.c // on Linux
 * See notes for other platforms
 * Author: R. Morgan <rmorgan62@gmail.com>
 * Date: 2023-05-05
 * Version: 0.0.1
 */

#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>


/**
 * add(x, y) - Simply adds two integers.
 * param: x: int
 * param: y: int
 * return int
 */
int add(int x, int y) {
    return x + y;
}

/**
 * multiply(float x, float y) - Multiplies two float numbers
 * param: x : float - multiplicand float value
 * param: y : float - multiplier float value
 * return product : float
 */
float multiply(float x, float y) {
    return x * y;
}

/**
 * greet(char *name) - Given a name returns a greeting
 * param: name : char* - the name to greet
 * return: greeting : char*
 */
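char* greet(char* name) {
    static char message[50];
    /* snprintf never writes more than sizeof(message) bytes, so a long
       name is truncated instead of overflowing the buffer. */
    snprintf(message, sizeof(message), "Hello, %s!", name);
    return message;
}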


/**
 * sum(int *array, int length) : Sums all integers in array
 * param: array: int - Array containing integer values to sum.
 * param: length: int - Length of array.
 * return int - sum of all integers in array.
 */
int sum(int *array, int length) {
    int result = 0;
    for (int i = 0; i < length; i++) {
        result += array[i];
    }
    return result;
}


/**
 * callback(int (*f)(int))
 */
void callback(int (*f)(int)) {
    f(42);
}

/**
 * Struct to hold point values
 */
typedef struct {
    int x;
    int y;
} Point;

/**
 * move_point( Point p, int dx, int dy)
 * param: p : Point
 * param: dx : int
 * param: dy : int
 * return: p - modified by (dx, dy)
 */
Point move_point(Point p, int dx, int dy) {
    p.x += dx;
    p.y += dy;
    return p;
}

/**
 * bool is_even(int x) - returns true if x is even.
 * param: x : int - integer value to test
 * return bool - True if x is divisible by two, without
 *               remainder, false otherwise.
 */
bool is_even(int x) {
    return x % 2 == 0;
}

/**
 * Matrix multiplication()
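 * param: A : double* - Multiplicand, m x n, row-major
 * param: B : double* - Multiplier, n x p, row-major
 * param: C : double* - Result, m x p, row-major
 * param: m : int - number of rows in A and C
 * param: n : int - number of columns in A and rows in B
 * param: p : int - number of columns in B and C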
 */
void matrix_multiply(double* A, double* B, double* C, int m, int n, int p) {
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < p; j++) {
            double sum = 0;
            for (int k = 0; k < n; k++) {
                sum += A[i * n + k] * B[k * p + j];
            }
            C[i * p + j] = sum;
        }
    }
}


/**
 * read_file(char* filename, double* data, int size)
 * param: filename : char* - pointer to string containing
 * param: data : double* - pointer to data array
 * param: size : int - size of data array
 * return: int : success code, 0 on success, -1 on error
 */
int read_file(char* filename, double* data, int size) {
    FILE *file = fopen(filename, "r");
    if (file == NULL) {
        return -1;
    }

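    for (int i = 0; i < size; i++) {
        /* fscanf returns the number of values converted; anything
           other than 1 means end of file or unparsable input. */
        if (fscanf(file, "%lf", &data[i]) != 1) {
            break;
        }
    }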

    fclose(file);
    return 0;
}

/**
 * write_file(char* filename, double* data, int size)
 * param: filename : char* - pointer to C string
 * param: data : double* - pointer to data array
 * param: size : int - size of data array
 * return: int : success code, 0 on success, -1 otherwise
 */
int write_file(char* filename, double* data, int size) {
    FILE *file = fopen(filename, "w");
    if (file == NULL) {
        return -1;
    }

    for (int i = 0; i < size; i++) {
        fprintf(file, "%lf\n", data[i]);
    }

    fclose(file);
    return 0;
}

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

#=======================================
# Python Application to Demo Using CFFI
#    to call C functions from Python
#=======================================
# File: main.py

__author__ = "Randall Morgan"
__copyright__ = "Copyright 2023, SensorNet"
__credits__ = ["Randall Morgan"]
__license__ = "GPL"
__version__ = "1.0.0"
__maintainer__ = "Randall Morgan"
__email__ = "rmorgan@sensornet.us"
__status__ = "Beta"

import cffi

ffi = cffi.FFI()
lib = ffi.dlopen("./example.so")  # or "./example.dll" on Windows, or "./example.dylib" on macOS

# Add C function Signatures in ffi.cdef()
ffi.cdef("""
    int add(int x, int y);
    int sum(int *array, int length);
    float multiply(float x, float y);
    char* greet(char* name);
    void callback(int (*f)(int));
    typedef struct {
        int x;
        int y;
    } Point;
    Point move_point(Point p, int dx, int dy);
    bool is_even(int num);
    void matrix_multiply(double* A, double* B, double* C, int m, int n, int p);
    int read_file(char* filename, double* data, int size);
    int write_file(char* filename, double* data, int size);
""")


def add(x: int, y: int) -> int:
    return lib.add(x, y)


def mult(x: float, y: float) -> float:
    return lib.multiply(x, y)


def greet(name: str) -> str:
    b_string = name.encode()  # Note the name string must be converted to bytes.
    result = lib.greet(b_string)
    bstring = ffi.string(result)
    return bstring.decode("utf8")


def sum(arr: list[int]) -> int:
    # convert the Python list to a C array
    c_array = ffi.new("int[]", arr)

    # call the C function with the C array and the length of the array
    result = lib.sum(c_array, len(arr))

    # convert the result back to a Python int
    return int(result)


def move_point(p: dict[str, int], dx: int, dy: int):
    # Build a C Point from the dict, then pass it by value to move_point().
    # The result is a CFFI Point struct with .x and .y attributes.
    c_point = ffi.new("Point*", p)
    result = lib.move_point(c_point[0], dx, dy)
    return result


def is_even(x: int) -> bool:
    result = lib.is_even(x)
    return bool(result)


def matrix_multiply():
    q = 3
    A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    B = [[9.0, 8.0, 7.0], [6.0, 5.0, 4.0], [3.0, 2.0, 1.0]]
    C = [[0.0] * q for _ in range(q)]

    print("Matrix Multiply...")
    print(f"A: {A}\n\nB: {B}\n\nC: {C}\n\n")

    # Convert to C types
    _A = ffi.new("double[]", __builtins__.sum(A, []))
    _B = ffi.new("double[]", __builtins__.sum(B, []))
    _C = ffi.new("double[]", __builtins__.sum(C, []))

    # A is m x n, B is n x p, C is m x p; all three are q x q here
    m, n, p = q, q, q

    lib.matrix_multiply(_A, _B, _C, m, n, p)
    C = [[_C[i * q + j] for j in range(q)] for i in range(q)]

    print(f"Result:\n\nA: {A}\n\nB: {B}\n\nC: {C}\n\n")


def read_data():
    # read data from file
    size = 100_000
    data = ffi.new("double[]", size)
    filename = ffi.new("char[]", b"data.txt")
    retval = lib.read_file(filename, data, size)

    # check if read was successful
    if retval == 0:
        print("Data read successfully:")
        for i in range(size):
            print(data[i])
    else:
        print("Error reading data from file.")
    return data


def write_data(data_list: list):
    size = len(data_list)
    data = ffi.new("double[]", data_list)

    # write data to file
    filename = ffi.new("char[]", b"output.txt")
    retval = lib.write_file(filename, data, size)

    # check if write was successful
    if retval == 0:
        print("Data written successfully.")
    else:
        print("Error writing data to file.")


def main():
    x, y = 2, 3
    result = add(x, y)
    print(f"Add({x}, {y}) = {result}")  # Output: 5

    # build a list of values to sum
    arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
    # sum the values in the array and print the results
    print(f"The sum of 1-12 is: {sum(arr)}")

    # Multiply two float values and print the result
    x, y = 2.5, 3.5
    print(f"multiply({x}, {y}) = {mult(x, y)}")

    # Pass a name (string) to a C function and get a greeting
    name = "George"
    print(f'Greeting: {greet(name)}')

    # Create a point and delta values
    p = {"x": 12, "y": 21}
    dx = 10
    dy = 9
    # call move_point()
    result = move_point(p, dx, dy)
    # show result
    print(f"move_point: ({result.x}, {result.y})")

    # Handling boolean values
    x = 27
    print(f"{x} is even. This statement is {is_even(x)}")
    x = 32
    print(f"{x} is even. This statement is {is_even(x)}")

    # Do matrix multiply
    matrix_multiply()

    # Read file
    num_data = read_data()
    print(num_data)

    # Write file
    write_data(list(num_data))


if __name__ == '__main__':
    main()

We’ve covered a lot of ground for a simple demo of CFFI. However, we have barely scratched the surface of what can be done with CFFI. You can learn more about CFFI at: https://cffi.readthedocs.io/en/latest/

I encourage you to play around with CFFI and C. Try passing a Python function to C to be used as a callback, and any other scenarios you can come up with. CFFI is powerful and arguably easier to use than ctypes, the built-in Python module for calling C from Python. Using CFFI often results in cleaner, less complicated code.
