Efficient model inference pipelines are crucial for machine learning applications: they enable real-time predictions, lower serving costs, scalability, and a better user experience. Large-scale inference pipelines, however, often move from Python to C++ because of Python's performance and memory overhead, as well as the need to target specialized hardware.
PyTorch provides a C++ API called LibTorch that can be used to build inference pipelines in C++. To use LibTorch from other languages such as Golang (or Rust, Elixir), developers can write language-specific bindings to the C++ code. This can be a complex task that requires a solid understanding of both LibTorch and Golang. CGO is the part of the Go toolchain that lets Go programs call C code. It cannot call C++ directly, so C++ libraries such as LibTorch are exposed to Golang through a thin C wrapper, which is the approach used here for model inference.
The motivation for the blog post is to provide a simple guide for developers who want to use LibTorch C++ models in their Golang applications using CGO.
To create a TorchScript module, a PyTorch model is converted to a serialized script module that can be executed independently of the Python runtime. My previous blog post gives a detailed guide on how to create a TorchScript module, which involves defining the model’s forward function using PyTorch’s scripting language and compiling the script module using the torch.jit.script function. The resulting script module can be saved and loaded for efficient model inference in C++ or Python applications.
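To create a simple C++ program that loads a pre-trained PyTorch model and performs inference, you can follow these steps:
1. Load the serialized script module using the torch::jit::load function.
2. Prepare the input data in the format required by the model.
3. Perform inference using the module.forward function.
#include <torch/script.h> // One-stop header.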
#include <iostream>
#include <memory>
int main() {
    torch::jit::script::Module module;
    try {
        // Load the serialized script module using torch::jit::load function
        module = torch::jit::load("model.pt");
    }
    catch (const c10::Error& e) {
        std::cerr << "error loading the model\n";
        return -1;
    }
    // Prepare the input data in the appropriate format required by the model
    auto input = torch::ones({1, 3, 224, 224});
    // Perform inference using module.forward function
    auto output = module.forward({input}).toTensor();
    std::cout << output << '\n';
}
You can refer to the PyTorch documentation for detailed instructions and examples on how to use the PyTorch C++ library: https://pytorch.org/cppdocs.
For this experiment, you can run make deps after cloning the source code to install the LibTorch and Golang dependencies.
Here’s a very simple example for understanding CGO:
Suppose you have a C++ library that contains a function add which takes two integers as input and returns their sum. Here’s how you can call it from Golang.
Write a C-compatible header add.h and its implementation in add.cpp:
// add.h
#ifdef __cplusplus
extern "C" {
#endif
int add(int a, int b);
#ifdef __cplusplus
}
#endif
// add.cpp
#include "add.h" // gives add() C linkage so that CGO can resolve the symbol

int add(int a, int b) {
    return a + b;
}
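Compile the implementation into a shared object (-fPIC produces the position-independent code that shared libraries need on most platforms):
g++ -fPIC -c add.cpp -o add.o
g++ -shared -o libadd.so add.o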
// main.go
package main

// #include "add.h"
// #cgo LDFLAGS: -L. -ladd
import "C"
import "fmt"

func main() {
	a := 1
	b := 2
	sum := C.add(C.int(a), C.int(b))
	fmt.Printf("%d + %d = %d\n", a, b, sum)
}
In this example, we declared the add function with C linkage in a header, compiled the C++ implementation into a shared object, and linked it into the Golang program using CGO. We then called the add function from Golang, passing the two integers as parameters, and printed the result. At run time the dynamic loader must be able to find libadd.so, for example via LD_LIBRARY_PATH=. on Linux.
Here’s the header of the C wrapper around the LibTorch C++ code, which loads the module, runs inference, and returns the result. Note that this code will change depending on the data type of your C++ output. In this experiment, the generated TorchScript module returns a dictionary {"label": score}, so the GetResult function contains code to return those values to Golang in a readable format.
#ifndef __GOLIBTORCH_H__
#define __GOLIBTORCH_H__
#ifdef __cplusplus
extern "C" {
#endif // __cplusplus
typedef void *mModel;
typedef struct {
    char** labels;
    float* scores;
} Result;
mModel NewModel(char *modelFile);
Result *GetResult(mModel model, float *inputData, int *channels, int *width, int *height);
void DeleteModel(mModel model);
#ifdef __cplusplus
}
#endif // __cplusplus
#endif // __GOLIBTORCH_H__
You can find the C++ implementation of the Header file here.
In your Golang code, you can include the following comments for CGO:
// #cgo LDFLAGS: -lstdc++ -L/usr/lib/libtorch/lib -ltorch_cpu -lc10
// #cgo CXXFLAGS: -std=c++17 -I${SRCDIR} -g -O3
// #cgo CFLAGS: -D_GLIBCXX_USE_CXX11_ABI=1
// #include <stdio.h>
// #include <stdlib.h>
// #include "golibtorch.h"
import "C"
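Comment lines 1 to 3 are #cgo directives that tell the compiler and linker where to find LibTorch and which C++ standard to use.
Comment lines 4 to 6 include the headers needed by your C wrapper. The comment block has to sit directly above import "C", with no blank line in between, for CGO to treat it as the preamble.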
You can find the Go implementation here.
In your Golang code, you can access the C wrapper variables and functions using C.xxx:
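type Model struct {
	model C.mModel
}

func NewModel(modelFile string) (*Model, error) {
	// C.CString allocates with malloc, so the copy must be freed explicitly
	// (this assumes the C wrapper does not keep the pointer after returning)
	cPath := C.CString(modelFile)
	defer C.free(unsafe.Pointer(cPath))

	m := C.NewModel(cPath)
	// assumes the C wrapper returns NULL when the model cannot be loaded
	if m == nil {
		return nil, errors.New("error loading the model")
	}
	return &Model{model: m}, nil
}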
In Python, it is very easy to preprocess images with TorchVision. In Golang, the image read from disk needs to be converted into bytes and then into a float array that can back a Tensor. You can refer to the preProcess function in golibtorch.go, which converts the image to a float32 array and returns the size of the image. This is also a good opportunity to explore scriptable transforms or to create a TorchVision wrapper for Golang.
In CGO, data is exchanged between C and Golang mostly through pointers, which let both languages read and write the same block of memory. Arguments to a C function are passed by value, so to share a buffer you pass its memory address instead. Similarly, when returning data from C to Golang, the C function can return a pointer to memory it allocated or write the result into a buffer supplied by the caller. Memory allocated on the C side is not managed by Go's garbage collector, so it has to be freed explicitly.
You can prepare the inputs which can be passed to the CGO functions:
inputPtr := (*C.float)(unsafe.Pointer(&data[0]))
channelPtr := (*C.int)(unsafe.Pointer(&vals[0]))
widthPtr := (*C.int)(unsafe.Pointer(&vals[1]))
heightPtr := (*C.int)(unsafe.Pointer(&vals[2]))
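Here, we are converting the float32 slice in Go to a C float array pointer, and the number of channels and the size of the image to C int pointers. These casts assume that vals holds 32-bit integers (for example []int32 or []C.int), since a C int is 32 bits wide on common platforms.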
Now, we can run the forward pass by passing the required input pointers like:
cResult := C.GetResult(m.model, inputPtr, channelPtr, widthPtr, heightPtr)
if cResult == nil {
	return nil, errors.New("error in getting result")
}
// release the result struct once its contents have been copied into Go values
defer C.free(unsafe.Pointer(cResult))
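Note that it’s recommended to always use C.free to release the memory taken by the result pointers. C.free releases only the block it is given, so any arrays that the struct points to have to be freed as well, for example by a cleanup function in the C wrapper.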
Similarly, you can now use the result pointer to parse the output into a Go-specific format:
// length of the slices for labels and scores
const TopK = 5
// create Go slices of strings for labels and of floats for scores
labels := make([]string, TopK)
scores := make([]float32, TopK)
// view the C float array as a Go slice (unsafe.Slice needs Go 1.17+) and copy it
scoresView := unsafe.Slice((*float32)(unsafe.Pointer(cResult.scores)), TopK)
copy(scores, scoresView)
// view the C array of char pointers as a Go slice and copy each string into Go memory
labelsView := unsafe.Slice(cResult.labels, TopK)
for i, labelPtr := range labelsView {
	labels[i] = C.GoString(labelPtr)
}
This blog post has explained how to call a LibTorch C++ model from Golang using CGO. By following the steps outlined in this post, developers can create fast and efficient inference pipelines using Golang and LibTorch C++.
If you liked this experiment, go check it out on GitHub. If you have an idea or a suggestion for improvement, feel free to contribute via Issues/Pull Requests!
Source Code: GitHub