FastDeploy: Rapid Deployment of AI Models

2023-02-06 tuwei312

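FastDeploy is an all-scenario, easy-to-use, flexible, and highly efficient AI inference deployment toolkit. It provides an out-of-the-box deployment experience across cloud, edge, and device, supports more than 150 Text, Vision, Speech, and cross-modal models, and delivers end-to-end inference performance optimization. Covered tasks include image classification, object detection, image segmentation, face detection, face recognition, keypoint detection, matting, OCR, NLP, and TTS, so that developers can meet industrial deployment needs across multiple scenarios, hardware types, and platforms.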

The official open-source repository for PaddlePaddle FastDeploy is https://github.com/PaddlePaddle/FastDeploy. For all updates, the official releases are authoritative.

Building the Jetson deployment library

On Jetson, FastDeploy currently supports only three inference backends: ONNX Runtime (CPU), TensorRT (GPU), and Paddle Inference.
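In the Python API, each of these backends is selected through calls on a RuntimeOption object. The following is a minimal sketch rather than an official recipe: the helper configure_backend and its short backend names ("ort", "trt", "paddle") are hypothetical, and running Paddle Inference on the GPU is an assumption about a typical Jetson setup. The helper only calls methods on the object it receives, so it does not import FastDeploy itself.

```python
def configure_backend(option, backend, device_id=0):
    """Configure a fastdeploy.RuntimeOption-like object for one Jetson backend.

    `backend` is a hypothetical short name chosen for this sketch:
    "ort" (ONNX Runtime, CPU), "trt" (TensorRT, GPU), "paddle" (Paddle Inference).
    """
    if backend == "ort":
        option.use_cpu()
        option.use_ort_backend()
    elif backend == "trt":
        option.use_gpu(device_id)
        option.use_trt_backend()
    elif backend == "paddle":
        # Assumption: Paddle Inference is run on the Jetson GPU here.
        option.use_gpu(device_id)
        option.use_paddle_infer_backend()
    else:
        raise ValueError(f"unsupported backend on Jetson: {backend!r}")
    return option


# Usage, with a Jetson build of FastDeploy installed:
#   import fastdeploy as fd
#   option = configure_backend(fd.RuntimeOption(), "trt")
```

Keeping the choice in one function makes it easy to switch backends when comparing latency on the same model.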

Building and installing the C++ SDK

The build requires:

gcc/g++ >= 5.4 (8.2 recommended)

cmake >= 3.10.0

jetpack >= 4.6.1
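The gcc and cmake minimums in the list above can be checked before starting a long build. This is a small sketch using only the Python standard library. The helper names (parse_version, meets_minimum, check_tool) are hypothetical, and it assumes that each tool prints its version on the first line of its --version output. JetPack is not covered because it has no equivalent command.

```python
import re
import subprocess

# Minimum versions taken from the requirements list.
MINIMUMS = {"gcc": (5, 4), "cmake": (3, 10, 0)}


def parse_version(text):
    """Return the first dotted version number in `text` as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(found, minimum):
    """Compare two version tuples after padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    found = found + (0,) * (width - len(found))
    minimum = minimum + (0,) * (width - len(minimum))
    return found >= minimum


def check_tool(name):
    """Run `<name> --version` and compare the result with MINIMUMS[name]."""
    result = subprocess.run(
        [name, "--version"],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # works on Python 3.6 as well
        check=True,
    )
    first_line = result.stdout.splitlines()[0]
    return meets_minimum(parse_version(first_line), MINIMUMS[name])
```

Calling check_tool("gcc") and check_tool("cmake") on the Jetson board returns True when the installed tool satisfies the minimum.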

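To integrate the Paddle Inference backend, download the Jetpack C++ package that matches your development environment from the Paddle Inference prebuilt libraries page, then extract it.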

git clone https://github.com/PaddlePaddle/FastDeploy.git

cd FastDeploy

mkdir build && cd build

# ENABLE_PADDLE_BACKEND is optional; set it to OFF (and drop PADDLEINFERENCE_DIRECTORY) if the Paddle Inference backend is not needed
cmake .. -DBUILD_ON_JETSON=ON \
  -DENABLE_VISION=ON \
  -DENABLE_PADDLE_BACKEND=ON \
  -DPADDLEINFERENCE_DIRECTORY=/Download/paddle_inference_jetson \
  -DCMAKE_INSTALL_PREFIX=${PWD}/installed_fastdeploy

make -j8

make install

Once the build finishes, the C++ inference library is generated in the directory specified by CMAKE_INSTALL_PREFIX.

C++ inference

#include <cstdlib>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

#include "fastdeploy/runtime.h"
namespace fd = fastdeploy;
int main(int argc, char* argv[]) {
  std::string model_file = "mobilenetv2/inference.pdmodel";
  std::string params_file = "mobilenetv2/inference.pdiparams";
  // setup option
  fd::RuntimeOption runtime_option;
  runtime_option.SetModelPath(model_file, params_file, fd::ModelFormat::PADDLE);
  runtime_option.UseOrtBackend();
  runtime_option.SetCpuThreadNum(12);
  // init runtime
  std::unique_ptr<fd::Runtime> runtime =
      std::unique_ptr<fd::Runtime>(new fd::Runtime());
  if (!runtime->Init(runtime_option)) {
    std::cerr << "--- Init FastDeploy Runtime Failed! "
              << "\n--- Model:  " << model_file << std::endl;
    return -1;
  } else {
    std::cout << "--- Init FastDeploy Runtime Done! "
              << "\n--- Model:  " << model_file << std::endl;
  }
  // init input tensor shape
  fd::TensorInfo info = runtime->GetInputInfo(0);
  info.shape = {1, 3, 224, 224};
 
  std::vector<fd::FDTensor> input_tensors(1);
  std::vector<fd::FDTensor> output_tensors(1);
  std::vector<float> inputs_data;
  inputs_data.resize(1 * 3 * 224 * 224);
  for (size_t i = 0; i < inputs_data.size(); ++i) {
    inputs_data[i] = std::rand() % 1000 / 1000.0f;
  }
  input_tensors[0].SetExternalData({1, 3, 224, 224}, fd::FDDataType::FP32, inputs_data.data());
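  // bind the tensor to the model's input name
  input_tensors[0].name = info.name;
  // run inference and stop if it fails
  if (!runtime->Infer(input_tensors, &output_tensors)) {
    std::cerr << "--- Inference Failed!" << std::endl;
    return -1;
  }
  output_tensors[0].PrintInfo();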
  return 0;
}

Building and installing the Python package

The Python build likewise requires:

gcc/g++ >= 5.4 (8.2 recommended)

cmake >= 3.10.0

jetpack >= 4.6.1

python >= 3.6

Python packaging depends on wheel, so run pip install wheel before building.

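To integrate the Paddle Inference backend, download the Jetpack C++ package that matches your development environment from the Paddle Inference prebuilt libraries page, then extract it.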

All build options are passed in through environment variables.

git clone https://github.com/PaddlePaddle/FastDeploy.git

cd FastDeploy/python

export BUILD_ON_JETSON=ON

export ENABLE_VISION=ON

# ENABLE_PADDLE_BACKEND and PADDLEINFERENCE_DIRECTORY are optional

export ENABLE_PADDLE_BACKEND=ON

export PADDLEINFERENCE_DIRECTORY=/Download/paddle_inference_jetson

python setup.py build

python setup.py bdist_wheel

When the build finishes, the compiled wheel package is generated in the FastDeploy/python/dist directory and can be installed directly with pip install.
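Because every build option is read from the environment, the export and setup.py steps can also be driven from a script. The sketch below is one way to do that under stated assumptions: jetson_build_env and build_wheel are hypothetical helper names, and the script is assumed to run with the same Python interpreter that the wheel is built for.

```python
import os
import subprocess
import sys


def jetson_build_env(paddle_inference_dir=None, base_env=None):
    """Return a copy of the environment with the FastDeploy build options set."""
    env = dict(os.environ if base_env is None else base_env)
    env["BUILD_ON_JETSON"] = "ON"
    env["ENABLE_VISION"] = "ON"
    if paddle_inference_dir is not None:
        # Optional: only needed for the Paddle Inference backend.
        env["ENABLE_PADDLE_BACKEND"] = "ON"
        env["PADDLEINFERENCE_DIRECTORY"] = paddle_inference_dir
    return env


def build_wheel(python_dir, env):
    """Run the two setup.py steps inside the FastDeploy/python directory."""
    for step in ("build", "bdist_wheel"):
        subprocess.run(
            [sys.executable, "setup.py", step],
            cwd=python_dir,
            env=env,
            check=True,  # stop at the first failing step
        )
```

For example, build_wheel("FastDeploy/python", jetson_build_env("/Download/paddle_inference_jetson")) mirrors the commands shown above without changing the variables of the calling shell.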

If you change any build options, delete the build and .setuptools-cmake-build subdirectories under FastDeploy/python before rebuilding, so that cached results do not affect the new build.
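The cleanup step can be scripted as well. This is a minimal sketch with a hypothetical helper name, clean_build_cache. It removes only the two cache directories named above and leaves everything else, including dist, untouched.

```python
import shutil
from pathlib import Path

# The two cache directories named in the text.
CACHE_DIRS = ("build", ".setuptools-cmake-build")


def clean_build_cache(python_dir):
    """Delete the cached build directories under `python_dir`, if present.

    Returns the names of the directories that were actually removed.
    """
    removed = []
    for name in CACHE_DIRS:
        target = Path(python_dir) / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed
```

Calling clean_build_cache("FastDeploy/python") before a rebuild returns an empty list when there is nothing to clean.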

Python inference

import fastdeploy as fd
import numpy as np

model_url = "https://bj.bcebos.com/fastdeploy/models/mobilenetv2.tgz"
fd.download_and_decompress(model_url, path=".")

option = fd.RuntimeOption()
option.set_model_path("mobilenetv2/inference.pdmodel",
                      "mobilenetv2/inference.pdiparams")

# **** CPU configuration ****
option.use_cpu()
option.use_ort_backend()
option.set_cpu_thread_num(12)

# Initialize the runtime
runtime = fd.Runtime(option)

# Get the model's input name
input_name = runtime.get_input_info(0).name

# Run inference on random data
results = runtime.infer({
    input_name: np.random.rand(1, 3, 224, 224).astype("float32")
})

print(results[0].shape)
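The script above only prints the output shape. To read a prediction from it, the scores can be ranked. The sketch below assumes that results[0] holds one row of per-class scores, which is typical for a MobileNetV2 classifier but is not stated in the original. The helper top_k is hypothetical and works on any plain sequence of numbers.

```python
import heapq


def top_k(scores, k=5):
    """Return the k highest (class_index, score) pairs, best first."""
    return heapq.nlargest(k, enumerate(scores), key=lambda pair: pair[1])


# Usage with the runtime output (assumes shape (1, num_classes)):
#   best = top_k(results[0][0].tolist(), k=5)
```

With random input data the ranking has no meaning; it becomes useful once a preprocessed image is fed in.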
