FEATURES & FUNCTIONALITY
CHOOSING THE RIGHT TECHNOLOGY
When it comes to deployment, there are several platforms to choose from: CPU, GPU or FPGA. Each target has its own advantages and disadvantages. CPUs are the most flexible: they allow you to design and deploy a large variety of models, as well as implement custom data preprocessing before feeding data into the network. GPUs are generally the most powerful deployment devices, but at the cost of higher power consumption. FPGAs shine in cases where a small physical footprint and low power consumption are required.
FPGAs can provide up to a 5x performance boost while consuming less power than embedded CPUs. An inference performance comparison of the YOLO (You Only Look Once) tiny version 2 object detection model on CPU and FPGA is presented below.
CPU - Intel Core i7-5650U (NI IC-3173)
FPGA - Xilinx Kintex-7 XC7K160T (NI IC-3173)
Convolutional kernel size: 3x3
Maximum image size: 512x512px
Maximum number of layers (Conv-Activate-Pool): 16
Supported NI platforms: Kintex-7, Zynq-7000 (NI IC-3173, NI sbRIO-9607)
Deep Neural Network Accelerator for FPGAs employs 8-bit fixed-point data precision for storing the activations and weights, and is capable of achieving accuracy similar to that of the single-precision floating-point format.
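The exact fixed-point format used by the Accelerator is not specified here; as an illustration of the general idea, below is a minimal sketch of signed 8-bit fixed-point quantization (an assumed Q2.6-style scaling, i.e. 6 fractional bits), where floats are stored as small integers and recovered by dividing by the scale:

```python
import numpy as np

FRAC_BITS = 6            # assumed number of fractional bits, for illustration only
SCALE = 2 ** FRAC_BITS   # one fixed-point step = 1/64

def quantize_fixed_point(x):
    """Map a float array to signed 8-bit fixed-point integers.

    Values are scaled by 2**FRAC_BITS, rounded to the nearest integer,
    and clipped to the int8 range [-128, 127].
    """
    return np.clip(np.round(x * SCALE), -128, 127).astype(np.int8)

def dequantize(q):
    """Recover approximate float values from the int8 representation."""
    return q.astype(np.float32) / SCALE

weights = np.array([0.75, -0.5, 0.123, -1.0], dtype=np.float32)
q = quantize_fixed_point(weights)
recovered = dequantize(q)
# For in-range values, the quantization error is bounded by half a step (2**-7)
```

With enough fractional bits and weights kept in range, the per-value rounding error stays within half a quantization step, which is why 8-bit inference can closely track floating-point accuracy for many networks.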
The Accelerator comes as a precompiled bit-file for a particular FPGA, with an accompanying API for deploying pre-trained neural networks.
This product works with the Deep Learning Toolkit only; download the free trial if you haven't already: