In the automotive industry, OEMs and Tier 1s commonly send requests for information (RFIs) and requests for quotation (RFQs) to vendors to understand a solution’s compute performance and system resource requirements. This is a crucial step in assessing whether a sourced part is suitable for the next-generation compute platform. Traditionally, relatively simple benchmarks have been used, but as software complexity has grown with higher levels of automation, these benchmarks no longer provide sufficient information for important design and sourcing decisions. In this paper, AVCC presents recommendations on how to benchmark deep neural networks (DNNs) for automated and assisted driving use cases.

Figure 3: Real Vehicle Pipeline versus AVCC Benchmark Pipeline

AVCC has created run rules that dictate how the DNN benchmark must be executed for the results to be deemed valid. These run rules cover which optimizations are allowed (e.g., re-training and compression), what is considered part of pre/post-processing, model accuracy requirements, and more. To enable fair comparisons while still letting vendors show their optimal performance, two sets of run rules have been defined, creating two so-called divisions: baseline and optimized. The baseline division creates common ground for OEMs and Tier 1s and enables apples-to-apples comparisons, while the optimized division lets each vendor showcase its product’s maximum performance.
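To make the idea of division-specific run rules concrete, the sketch below shows one way such rules might be encoded and checked. This is a minimal illustration, not the AVCC specification: the field names, the accuracy threshold, and the validity check are all assumptions.

```python
# Hypothetical sketch of division-specific run rules; names and
# thresholds are illustrative, not the AVCC specification.
from dataclasses import dataclass

@dataclass(frozen=True)
class RunRules:
    division: str                 # "baseline" or "optimized"
    allow_retraining: bool        # may the vendor re-train the model?
    allow_compression: bool       # may the vendor prune/compress the model?
    min_relative_accuracy: float  # accuracy floor relative to the reference model

BASELINE = RunRules("baseline", allow_retraining=False,
                    allow_compression=False, min_relative_accuracy=0.99)
OPTIMIZED = RunRules("optimized", allow_retraining=True,
                     allow_compression=True, min_relative_accuracy=0.99)

def result_is_valid(rules: RunRules, used_retraining: bool,
                    used_compression: bool, relative_accuracy: float) -> bool:
    """Return True if a submitted result complies with the division's rules."""
    if used_retraining and not rules.allow_retraining:
        return False
    if used_compression and not rules.allow_compression:
        return False
    return relative_accuracy >= rules.min_relative_accuracy
```

In this sketch, a submission that used re-training would be rejected in the baseline division but could still be valid in the optimized division, mirroring the intent of the two rule sets.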

In addition to the DNN benchmark inference result, further information must be reported, such as memory bandwidth utilization. This additional reporting allows OEMs and Tier 1s to better understand the full system impact of the solution under consideration.
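As an illustration of what such a report might contain, the sketch below defines a simple result record that pairs the inference result with system-level metrics. The field names, units, and example values are hypothetical; the actual reporting format is defined by the AVCC run rules.

```python
# Illustrative benchmark report record; fields, units, and the example
# values are assumptions, not the AVCC reporting format.
from dataclasses import dataclass, asdict
import json

@dataclass
class BenchmarkReport:
    division: str                    # "baseline" or "optimized"
    throughput_inferences_s: float   # sustained inference rate
    latency_ms_p99: float            # 99th-percentile latency
    accuracy: float                  # measured model accuracy
    mem_bandwidth_gb_s: float        # average memory bandwidth utilization
    mem_footprint_mb: float          # peak memory footprint of the workload

# Example with made-up numbers, serialized for submission.
report = BenchmarkReport(
    division="baseline",
    throughput_inferences_s=312.5,
    latency_ms_p99=8.4,
    accuracy=0.991,
    mem_bandwidth_gb_s=21.7,
    mem_footprint_mb=640.0,
)
print(json.dumps(asdict(report), indent=2))
```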

Work is ongoing, and certain aspects of DNN benchmarking are not covered in this paper. These topics include the models and datasets to be used, the scenario in which the benchmark is executed (e.g., how samples are distributed and the number of parallel streams), and how to measure power. These important topics will be addressed in future papers.