NumPy#

Buffer protocol#

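Python supports an extremely general and convenient approach for exchanging data between plugin libraries. Types can expose a buffer view [1], which provides fast direct access to the raw internal data representation. Suppose we want to bind the following simplistic Matrix class: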

class Matrix {
public:
    Matrix(size_t rows, size_t cols) : m_rows(rows), m_cols(cols) {
        m_data = new float[rows*cols];
    }
    float *data() { return m_data; }
    size_t rows() const { return m_rows; }
    size_t cols() const { return m_cols; }
private:
    size_t m_rows, m_cols;
    float *m_data;
};

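The following binding code exposes the Matrix contents as a buffer object, making it possible to cast Matrices into NumPy arrays. It is even possible to completely avoid copy operations with Python expressions like np.array(matrix_instance, copy = False).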

py::class_<Matrix>(m, "Matrix", py::buffer_protocol())
   .def_buffer([](Matrix &m) -> py::buffer_info {
        return py::buffer_info(
            m.data(),                               /* Pointer to buffer */
            sizeof(float),                          /* Size of one scalar */
            py::format_descriptor<float>::format(), /* Python struct-style format descriptor */
            2,                                      /* Number of dimensions */
            { m.rows(), m.cols() },                 /* Buffer dimensions */
            { sizeof(float) * m.cols(),             /* Strides (in bytes) for each index */
              sizeof(float) }
        );
    });

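Supporting the buffer protocol in a new type involves specifying the special py::buffer_protocol() tag in the py::class_ constructor and calling the def_buffer() method with a lambda function that creates a py::buffer_info description record on demand, describing a given matrix instance. The contents of py::buffer_info mirror the Python buffer protocol specification.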

struct buffer_info {
    void *ptr;
    py::ssize_t itemsize;
    std::string format;
    py::ssize_t ndim;
    std::vector<py::ssize_t> shape;
    std::vector<py::ssize_t> strides;
};
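
The same fields can be inspected from the Python side through the built-in memoryview type. The following is a minimal sketch using only the standard library; the array contents and the 2x3 shape are illustrative assumptions and are not part of the Matrix binding above.

```python
from array import array

# Six C floats stored contiguously, standing in for a 2x3 matrix.
data = array("f", [1, 2, 3, 4, 5, 6])

# memoryview.cast() only converts to or from a byte format,
# so go through "B" before reinterpreting the data as 2x3 floats.
view = memoryview(data).cast("B").cast("f", shape=[2, 3])

print(view.format)    # struct-style format descriptor
print(view.itemsize)  # size of one scalar in bytes
print(view.ndim)      # number of dimensions
print(view.shape)     # buffer dimensions
print(view.strides)   # strides in bytes for each index
```

Each printed attribute corresponds to one field of the buffer_info record: format, itemsize, ndim, shape, and strides.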

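To create a C++ function that can take a Python buffer object as an argument, simply use the type py::buffer as one of its arguments. Buffers can exist in a great variety of configurations, hence some safety checks are usually necessary in the function body. Below is a basic example of how to define a custom constructor for the Eigen double precision matrix (Eigen::MatrixXd) type, which supports initialization from compatible buffer objects (e.g. a NumPy matrix).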

/* Bind MatrixXd (or some other Eigen type) to Python */
typedef Eigen::MatrixXd Matrix;

typedef Matrix::Scalar Scalar;
constexpr bool rowMajor = Matrix::Flags & Eigen::RowMajorBit;

py::class_<Matrix>(m, "Matrix", py::buffer_protocol())
    .def(py::init([](py::buffer b) {
        typedef Eigen::Stride<Eigen::Dynamic, Eigen::Dynamic> Strides;

        /* Request a buffer descriptor from Python */
        py::buffer_info info = b.request();

        /* Some basic validation checks ... */
        if (info.format != py::format_descriptor<Scalar>::format())
            throw std::runtime_error("Incompatible format: expected a double array!");

        if (info.ndim != 2)
            throw std::runtime_error("Incompatible buffer dimension!");

        auto strides = Strides(
            info.strides[rowMajor ? 0 : 1] / (py::ssize_t)sizeof(Scalar),
            info.strides[rowMajor ? 1 : 0] / (py::ssize_t)sizeof(Scalar));

        auto map = Eigen::Map<Matrix, 0, Strides>(
            static_cast<Scalar *>(info.ptr), info.shape[0], info.shape[1], strides);

        return Matrix(map);
    }));
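
The validation steps in the constructor above can be mirrored in pure Python to see what each check inspects. This is only a sketch under stated assumptions: check_double_matrix is a hypothetical helper name, and memoryview stands in for the descriptor that b.request() returns on the C++ side.

```python
import struct

def check_double_matrix(obj):
    """Return (rows, cols, row_stride, col_stride) in elements, or raise."""
    info = memoryview(obj)  # request a buffer descriptor
    if info.format != "d":
        raise TypeError("Incompatible format: expected a double array!")
    if info.ndim != 2:
        raise TypeError("Incompatible buffer dimension!")
    # Strides are reported in bytes; divide by the item size to count elements.
    row_stride, col_stride = (s // info.itemsize for s in info.strides)
    return info.shape[0], info.shape[1], row_stride, col_stride

# Hypothetical input: six doubles viewed as a dense 2x3 row-major matrix.
raw = struct.pack("6d", *range(6))
matrix = memoryview(raw).cast("d", shape=[2, 3])
print(check_double_matrix(matrix))
```

Passing a plain bytes object instead fails the format check, because its format is "B" rather than "d".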

For reference, the def_buffer() call for this Eigen data type should look as follows:

.def_buffer([](Matrix &m) -> py::buffer_info {
    return py::buffer_info(
        m.data(),                                /* Pointer to buffer */
        sizeof(Scalar),                          /* Size of one scalar */
        py::format_descriptor<Scalar>::format(), /* Python struct-style format descriptor */
        2,                                       /* Number of dimensions */
        { m.rows(), m.cols() },                  /* Buffer dimensions */
        { sizeof(Scalar) * (rowMajor ? m.cols() : 1),
          sizeof(Scalar) * (rowMajor ? 1 : m.rows()) }
                                                 /* Strides (in bytes) for each index */
    );
 })
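
The stride expressions above depend only on the shape, the item size, and the storage order. A small sketch of the same arithmetic follows; dense_strides is a hypothetical helper name introduced here for illustration.

```python
def dense_strides(rows, cols, itemsize, row_major):
    """Byte strides (per row index, per column index) of a dense 2-D array."""
    if row_major:
        # Moving down one row skips a full row of `cols` items.
        return (itemsize * cols, itemsize)
    # Column-major: moving right one column skips a full column of `rows` items.
    return (itemsize, itemsize * rows)

print(dense_strides(2, 3, 8, row_major=True))
print(dense_strides(2, 3, 8, row_major=False))
```

Eigen::MatrixXd is column-major by default, so the second form is the one that would typically apply to it.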

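For a much easier approach to binding Eigen types (although with some limitations), refer to the section on Eigen.

See also

The file tests/test_buffers.cpp contains a complete example that demonstrates using the buffer protocol with pybind11 in more detail.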

[1] http://docs.python.org/3/c-api/buffer.html

Arrays#

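By exchanging py::buffer with py::array in the above snippet, we can restrict the function so that it only accepts NumPy arrays (rather than any type of Python object satisfying the buffer protocol).

In many situations, we want to define a function which only accepts a NumPy array of a certain data type. This is possible via the py::array_t<T> template. For instance, the following function requires the argument to be a NumPy array containing double precision values.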

void f(py::array_t<double> array);

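When it is invoked with a different type (e.g. an integer or a list of integers), the binding code will attempt to cast the input into a NumPy array of the requested type. This feature requires the pybind11/numpy.h header to be included. Note that pybind11/numpy.h does not depend on the NumPy headers, and thus can be used without declaring a build-time dependency on NumPy; NumPy>=1.7.0 is a runtime dependency.

Data in NumPy arrays is not guaranteed to be packed in a dense manner; furthermore, entries can be separated by arbitrary column and row strides. Sometimes, it can be useful to require a function to only accept dense arrays using either the C (row-major) or Fortran (column-major) ordering. This can be accomplished via a second template argument with values py::array::c_style or py::array::f_style.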

void f(py::array_t<double, py::array::c_style | py::array::forcecast> array);

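The py::array::forcecast argument is the default value of the second template parameter. It ensures that non-conforming arguments are converted into an array satisfying the specified requirements instead of trying the next function overload.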
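
The difference between dense and strided data can be observed with the standard library alone. The sketch below is an analogy rather than a description of how pybind11 performs the conversion: the byte values are made up, and the explicit tobytes() copy only stands in for the kind of dense copy that a forced conversion would have to produce.

```python
raw = bytes(range(8))

# A dense 2x4 view: rows are stored one after another (C order).
dense = memoryview(raw).cast("B", shape=[2, 4])
print(dense.c_contiguous, dense.f_contiguous)

# Taking every other element leaves gaps, so the view is no longer dense.
every_other = memoryview(raw)[::2]
print(every_other.c_contiguous, every_other.strides)

# Copying the elements out yields a new, densely packed buffer.
packed = every_other.tobytes()
print(memoryview(packed).c_contiguous, list(packed))
```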

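There are several methods on arrays; the methods listed below under references work, as well as the following functions based on the NumPy API:

  • .dtype() returns the type of the contained values.

  • .strides() returns a pointer to the strides of the array (optionally pass an integer axis to get a number).

  • .flags() returns the flag settings. .writeable() and .owndata() are directly available.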

  • .offset_at() returns the offset (optionally pass indices).

  • .squeeze() returns a view with length-1 axes removed.

  • .view(dtype) returns a view of the array with a different dtype.

  • .reshape({i, j, ...}) returns a view of the array with a different shape. .resize({...}) is also available.

  • .index_at(i, j, ...) gets the count from the beginning to a given index.

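There are also several methods for getting references (described below).

Structured types#

In order for py::array_t to work with structured (record) types, we first need to register the memory layout of the type. This can be done via the PYBIND11_NUMPY_DTYPE macro, called in the plugin definition code, which expects the type followed by field names: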

struct A {
    int x;
    double y;
};

struct B {
    int z;
    A a;
};

// ...
PYBIND11_MODULE(test, m) {
    // ...

    PYBIND11_NUMPY_DTYPE(A, x, y);
    PYBIND11_NUMPY_DTYPE(B, z, a);
    /* now both A and B can be used as template arguments to py::array_t */
}

The structure should consist of fundamental arithmetic types, std::complex, previously registered substructures, and arrays of any of the above.
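
To make "memory layout" concrete, the sketch below rebuilds A and B with the standard ctypes module and lists each field's byte offset and size. It is an illustration only: layout is a hypothetical helper, and the exact offsets depend on the platform's alignment rules, so no specific numbers are claimed here.

```python
import ctypes

class A(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_double)]

class B(ctypes.Structure):
    _fields_ = [("z", ctypes.c_int), ("a", A)]

def layout(struct_type):
    """List (field name, byte offset, byte size) for each field."""
    return [
        (name, getattr(struct_type, name).offset, getattr(struct_type, name).size)
        for name, _ in struct_type._fields_
    ]

print(layout(A), ctypes.sizeof(A))
print(layout(B), ctypes.sizeof(B))
```

The nested field a in B occupies exactly sizeof(A) bytes, which is why A has to be registered before B.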

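Vectorizing functions#

Suppose we want to bind a function with the following signature to Python so that it can process arbitrary NumPy array arguments (vectors, matrices, general N-D arrays) in addition to its normal arguments: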

double my_func(int x, float y, double z);

After including the pybind11/numpy.h header, this is extremely simple:

m.def("vectorized_func", py::vectorize(my_func));

Invoking the function like below causes 4 calls to be made to my_func with each of the array elements. The significant advantage of this compared to solutions like numpy.vectorize() is that the loop over the elements runs entirely on the C++ side and can be crunched down into a tight, optimized loop by the compiler. The result is returned as a NumPy array of type numpy.dtype.float64.

>>> x = np.array([[1, 3], [5, 7]])
>>> y = np.array([[2, 4], [6, 8]])
>>> z = 3
>>> result = vectorized_func(x, y, z)

The scalar argument z is transparently replicated 4 times. The input arrays x and y are automatically converted into the right types (they are of type numpy.dtype.int64 but need to be numpy.dtype.int32 and numpy.dtype.float32, respectively).
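
The element-wise behaviour described above can be imitated in pure Python to make the call pattern visible. This is a conceptual sketch, not how py::vectorize is implemented: it works on nested lists instead of NumPy arrays, and the body of my_func (a simple sum) is an assumption, since the source only gives its signature.

```python
def vectorize(func):
    """Apply func element-wise over nested lists, repeating scalar arguments."""
    def apply(*args):
        lists = [a for a in args if isinstance(a, list)]
        if not lists:
            return func(*args)  # all arguments are scalars: one real call
        length = len(lists[0])
        if any(len(a) != length for a in lists):
            raise ValueError("shape mismatch")
        return [
            apply(*[a[i] if isinstance(a, list) else a for a in args])
            for i in range(length)
        ]
    return apply

calls = []

def my_func(x, y, z):
    # Hypothetical body; it records each call so the calls can be counted.
    calls.append((x, y, z))
    return float(x + y + z)

vectorized_func = vectorize(my_func)
result = vectorized_func([[1, 3], [5, 7]], [[2, 4], [6, 8]], 3)
print(result, len(calls))
```

The scalar 3 is reused for every element, and my_func is called once per element, four times in total.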

Note

Only arithmetic, complex, and POD types passed by value or by const & reference are vectorized; all other arguments are passed through as-is. Functions taking rvalue reference arguments cannot be vectorized.

In cases where the computation is too complicated to be reduced to vectorize, it will be necessary to create and access the buffer contents manually. The following snippet contains a complete example that shows how this works (the code is somewhat contrived, since it could have been done more simply using vectorize).

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>

namespace py = pybind11;

py::array_t<double> add_arrays(py::array_t<double> input1, py::array_t<double> input2) {
    py::buffer_info buf1 = input1.request(), buf2 = input2.request();

    if (buf1.ndim != 1 || buf2.ndim != 1)
        throw std::runtime_error("Number of dimensions must be one");

    if (buf1.size != buf2.size)
        throw std::runtime_error("Input shapes must match");

    /* No pointer is passed, so NumPy will allocate the buffer */
    auto result = py::array_t<double>(buf1.size);

    py::buffer_info buf3 = result.request();

    double *ptr1 = static_cast<double *>(buf1.ptr);
    double *ptr2 = static_cast<double *>(buf2.ptr);
    double *ptr3 = static_cast<double *>(buf3.ptr);

    for (py::ssize_t idx = 0; idx < buf1.shape[0]; idx++)
        ptr3[idx] = ptr1[idx] + ptr2[idx];

    return result;
}

PYBIND11_MODULE(test, m) {
    m.def("add_arrays", &add_arrays, "Add two NumPy arrays");
}

See also

The file tests/test_numpy_vectorize.cpp contains a complete example that demonstrates using vectorize() in more detail.

Direct access#

For performance reasons, particularly when dealing with very large arrays, it is often desirable to directly access array elements without internal checking of dimensions and bounds on every access when indices are known to be already valid. To avoid such checks, the array class and array_t<T> template class offer an unchecked proxy object that can be used for this unchecked access through the unchecked<N> and mutable_unchecked<N> methods, where N gives the required dimensionality of the array:

m.def("sum_3d", [](py::array_t<double> x) {
    auto r = x.unchecked<3>(); // x must have ndim = 3; can be non-writeable
    double sum = 0;
    for (py::ssize_t i = 0; i < r.shape(0); i++)
        for (py::ssize_t j = 0; j < r.shape(1); j++)
            for (py::ssize_t k = 0; k < r.shape(2); k++)
                sum += r(i, j, k);
    return sum;
});
m.def("increment_3d", [](py::array_t<double> x) {
    auto r = x.mutable_unchecked<3>(); // Will throw if ndim != 3 or flags.writeable is false
    for (py::ssize_t i = 0; i < r.shape(0); i++)
        for (py::ssize_t j = 0; j < r.shape(1); j++)
            for (py::ssize_t k = 0; k < r.shape(2); k++)
                r(i, j, k) += 1.0;
}, py::arg().noconvert());

To obtain the proxy from an array object, you must specify both the data type and number of dimensions as template arguments, such as auto r = myarray.mutable_unchecked<float, 2>().

If the number of dimensions is not known at compile time, you can omit the dimensions template parameter (i.e. calling arr_t.unchecked() or arr.unchecked<T>()). This will give you a proxy object that works in the same way, but results in less optimizable code and thus a small efficiency loss in tight loops.

Note that the returned proxy object directly references the array’s data, and only reads its shape, strides, and writeable flag when constructed. You must take care to ensure that the referenced array is not destroyed or reshaped for the duration of the returned object, typically by limiting the scope of the returned instance.
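
Conceptually, unchecked element access reduces to stride arithmetic on the raw bytes. The sketch below shows that arithmetic with the standard library; the 2x3x4 shape and the data are made-up assumptions, element is a hypothetical helper, and the code illustrates the idea rather than pybind11's actual proxy implementation.

```python
import struct

shape = (2, 3, 4)
raw = struct.pack("24d", *range(24))          # 24 doubles, densely packed
view = memoryview(raw).cast("d", shape=list(shape))
s0, s1, s2 = view.strides                     # byte strides, read once up front

def element(i, j, k):
    # The byte offset is a plain dot product of indices and strides;
    # the indices are not validated against the shape.
    return struct.unpack_from("d", raw, i * s0 + j * s1 + k * s2)[0]

total = 0.0
for i in range(shape[0]):
    for j in range(shape[1]):
        for k in range(shape[2]):
            total += element(i, j, k)
print(total)
```

Because the strides are captured once, the loop would silently read the wrong bytes if the underlying buffer were reshaped afterwards, which is the hazard the paragraph above warns about.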

The returned proxy object supports some of the same methods as py::array so that it can be used as a drop-in replacement for some existing, index-checked uses of py::array:

  • .data(1, 2, ...) and .mutable_data(1, 2, ...) return a pointer to the const T or T data, respectively, at the given indices. The latter is only available to proxies obtained via a.mutable_unchecked().

  • .itemsize() returns the size of an item in bytes, i.e. sizeof(T).

  • .ndim() returns the number of dimensions.

  • .shape(n) returns the size of dimension n.

  • .size() returns the total number of elements (i.e. the product of the shapes).

  • .nbytes() returns the number of bytes used by the referenced elements (i.e. itemsize() times size()).

See also

The file tests/test_numpy_array.cpp contains additional examples demonstrating the use of this feature.

Ellipsis#

Python provides a convenient ... ellipsis notation that is often used to slice multidimensional arrays. For instance, the following snippet extracts the middle dimensions of a tensor with the first and last index set to zero.

a = ...  # a NumPy array
b = a[0, ..., 0]

The py::ellipsis() function can be used to perform the same operation on the C++ side:

py::array a = /* A NumPy array */;
py::array b = a[py::make_tuple(0, py::ellipsis(), 0)];
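
It may help to see what the indexing expression actually passes to the array. In Python, a[0, ..., 0] hands a tuple containing the Ellipsis singleton to __getitem__, which is the same key that the py::make_tuple call builds. The ShowKey class below is a hypothetical stand-in that simply returns the key it receives.

```python
class ShowKey:
    """Return the index key unchanged so it can be inspected."""
    def __getitem__(self, key):
        return key

key = ShowKey()[0, ..., 0]
print(key)
```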

Memory view#

When we simply want to provide direct access to a C/C++ buffer without a concrete class object, we can return a memoryview object. Suppose we wish to expose a memoryview for a 2x4 uint8_t array; we can do the following:

const uint8_t buffer[] = {
    0, 1, 2, 3,
    4, 5, 6, 7
};
m.def("get_memoryview2d", []() {
    return py::memoryview::from_buffer(
        buffer,                                    // buffer pointer
        { 2, 4 },                                  // shape (rows, cols)
        { sizeof(uint8_t) * 4, sizeof(uint8_t) }   // strides in bytes
    );
});

This approach is meant for providing a memoryview for a C/C++ buffer not managed by Python. The user is responsible for managing the lifetime of the buffer. Using a memoryview created in this way after deleting the buffer on the C++ side results in undefined behavior.
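
For contrast, the sketch below shows what a 2x4 byte memoryview looks like from Python, and how a buffer that Python does manage is protected while a view exists. It uses a bytearray as a stand-in for the C array, so it illustrates the Python-managed case only and says nothing about buffers exposed through from_buffer.

```python
buffer = bytearray(range(8))
base = memoryview(buffer)
view = base.cast("B", shape=[2, 4])

# Capture the attributes before the views are released.
shape, strides, item = view.shape, view.strides, view[1, 2]
print(shape, strides, item)

refused = False
try:
    buffer.append(8)          # resizing would invalidate the live views
except BufferError:
    refused = True
print(refused)

# Once every view is released, the bytearray can be resized again.
view.release()
base.release()
buffer.append(8)
print(len(buffer))
```

A raw C/C++ buffer has no such safeguard, which is why its lifetime has to be managed by hand.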

We can also use memoryview::from_memory for a simple 1D contiguous buffer:

m.def("get_memoryview1d", []() {
    return py::memoryview::from_memory(
        buffer,               // buffer pointer
        sizeof(uint8_t) * 8   // buffer size
    );
});

Changed in version 2.6: memoryview::from_memory added.