In early 2013, Qualcomm entered the automotive processor market. Its first product, the 602A, appeared at the end of 2013 and was officially launched in early 2014, but the market response was flat. In-vehicle systems have a much longer development cycle than mobile phones, so a chip has to be designed for the market at least five years out. Qualcomm launched the 820A in early 2016 in two versions (the 820A without a modem and the 820Am with a modem), and its leading performance immediately won praise.
Jaguar Land Rover was the first to use the 820A as a cockpit domain controller. In the second half of 2017, the 820A went into production Land Rover vehicles and the Jaguar I-PACE, followed by Volkswagen, Honda, Geely, PSA, BYD, and Maserati, all using the 820A as a cockpit controller. The US version of Honda's 10th-generation Accord also uses the 820A and is already on sale; in China, the launch of the 10th-generation Accord was delayed by the oil-level increase problem with the 1.5T engine.
The picture above shows the cockpit of the Land Rover Range Rover. It uses four screens: the full-LCD instrument cluster is 12.3 inches, the infotainment screen is 10 inches, and the onboard information display is also 10 inches. There are two large aluminum knobs on the screen, and there is also a 3.1-inch display used by the HUD. The system may be built by Panasonic or Harman. The infotainment screen tilts forward automatically after start-up to make it easier for the driver to see. The information display below the infotainment screen controls the air conditioning, seats, and vehicle settings. Apart from the air conditioning temperature knob and the volume knob, everything is touch operated. The instrument system runs QNX, and the infotainment system uses a Linux+Android interface. It may support Android Auto in the future.
Jaguar's first pure electric vehicle, the I-PACE, uses a similar design, except that the onboard information display is replaced by a 5.5-inch unit. The system is fairly conventional; its only highlight is the laser HUD. Jaguar Land Rover is the first automaker to use a laser HUD, which offers high contrast and bright colors. Although it cannot be regarded as an AR HUD in the strict sense, it is very good and can communicate with the ADAS and the navigation map system.
The 820A has several advantages over the NXP i.MX8, the Renesas R-Car H3, and the Texas Instruments Jacinto EX:
Most of the 820A's development costs have already been absorbed by mobile phone manufacturers, so there is plenty of room for price reductions.
The 820 was originally designed for mobile devices, so its low power consumption goes without saying.
It has a powerful CPU and GPU and was designed specifically with Android in mind.
The biggest advantage is that the 820Am includes a 4G modem, so no separate communication module is needed.
The disadvantage of the 820A is that it was not originally aimed at the automotive market. It was not developed under an ISO 26262 process, so it cannot reach even ASIL A and is qualified only to AEC-Q100 Grade 3. The R-Car H3 achieves ASIL B. The i.MX8 has not yet applied for an ASIL rating, but given NXP's capabilities it should be able to reach ASIL B. Of course, the overall system can still meet ISO 26262 by adding other components, but that is extra trouble.
Compared with the Mercedes-Benz MBUX, the 820Am cockpit domain controller did not bring many surprises. The 820Am also had another strong opponent, the Nvidia Parker. Parker targeted the automotive market from the start, with an ASIL B safety architecture, a lock-step R5 core, and memory error correction. Another advantage of Parker is its deep learning capability. The 820A's hardware is also capable of deep learning, but Qualcomm started a little late and only released the NPE SDK in July 2017, so applications in this area have not yet appeared and will probably have to wait for the next generation of 820 cockpit systems. Future 820 domain controllers may take over some ADAS functions, such as 360-degree surround view, lane and pedestrian recognition, forward collision warning, and lane departure warning. These should be limited to warning systems, though, and not used in systems that actively control the vehicle: the chip was not designed with functional safety in mind, so relying on it is still a little worrying.
In addition, deep learning does not have much theoretical foundation and works more like a brute-force search: very deep networks with tens of millions or even billions of parameters, which are adjusted until they fit the inputs to the outputs. The result is an uninterpretable black box. That is acceptable on a mobile phone, but not in a car, where there must be enough interpretability to assess safety risks. Even so, deep learning is almost the best method for semantic-level recognition. Although the auto industry does not like this kind of black box, it still has to pay close attention to deep learning.
Qualcomm's NPE mainly supports Caffe2 and TensorFlow. Caffe2 is used mostly for image tasks, while TensorFlow can also handle speech tasks. Internally the 820 uses a heterogeneous architecture with three kinds of computing units (CPU, GPU, and DSP), each suited to different types of applications.
The Hexagon 680 DSP inside the 820 has built-in 1024-bit SIMD vector data registers, which Qualcomm calls Hexagon Vector Extensions (HVX). HVX can issue four VLIW vector instructions at a time and process up to 4096 bits of data per cycle. Note that the data elements in typical applications are much narrower than the maximum width the DSP supports, but with SIMD a single instruction can operate on many data elements at once, so many elements can be packed into each operation for maximum efficiency.
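The arithmetic behind these figures can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration using the two numbers quoted in the text (1024-bit registers, four vector instructions per cycle); the function names are hypothetical, and real throughput depends on the instruction mix and memory behavior.

```python
# Figures taken from the text: 1024-bit vector registers and four
# vector instructions issued per cycle. Everything else is illustrative.
VECTOR_REGISTER_BITS = 1024
VECTOR_INSTRUCTIONS_PER_CYCLE = 4


def lanes_per_register(element_bits: int) -> int:
    """Number of elements of the given width that fit in one vector register."""
    if element_bits <= 0 or VECTOR_REGISTER_BITS % element_bits != 0:
        raise ValueError("element width must evenly divide the register width")
    return VECTOR_REGISTER_BITS // element_bits


def peak_bits_per_cycle() -> int:
    """Upper bound on the bits touched per cycle if every issue slot is a full vector op."""
    return VECTOR_REGISTER_BITS * VECTOR_INSTRUCTIONS_PER_CYCLE


def peak_elements_per_cycle(element_bits: int) -> int:
    """Upper bound on the elements processed per cycle at the given element width."""
    return lanes_per_register(element_bits) * VECTOR_INSTRUCTIONS_PER_CYCLE


if __name__ == "__main__":
    print("peak bits per cycle:", peak_bits_per_cycle())
    for width in (8, 16, 32):
        print(width, "bit elements:",
              lanes_per_register(width), "per register,",
              peak_elements_per_cycle(width), "per cycle at most")
```

With 8-bit elements, one register holds 128 values, which is why narrow quantized data makes the best use of such wide vectors.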
In addition, HVX provides 32 vector registers for context switching. In terms of specifications, HVX supports 32-bit fixed-point numbers, although INT8 is the usual choice; it does not support floating-point calculation, since cost still has to be considered. HVX has L1 data and instruction caches, four parallel VLIW scalar processing units, a 500 MHz operating frequency, and a shared L2 cache.
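To show what working without floating-point hardware looks like in practice, here is a minimal sketch of fixed-point multiplication using integer operations only. The Q15 format (16-bit values with 15 fractional bits) and all function names are assumptions chosen for illustration; the text does not say which fixed-point formats HVX code actually uses.

```python
FRACTION_BITS = 15            # Q15 format: an assumption for this sketch
SCALE = 1 << FRACTION_BITS    # 32768
INT16_MIN, INT16_MAX = -32768, 32767


def saturate(value: int) -> int:
    """Clamp to the signed 16-bit range instead of wrapping around."""
    return max(INT16_MIN, min(INT16_MAX, value))


def to_q15(x: float) -> int:
    """Convert a real number in roughly [-1, 1) to its Q15 integer code."""
    return saturate(round(x * SCALE))


def from_q15(q: int) -> float:
    """Convert a Q15 integer code back to a real number (for inspection only)."""
    return q / SCALE


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values with integer arithmetic only."""
    product = a * b                       # intermediate has 30 fractional bits
    product += 1 << (FRACTION_BITS - 1)   # add half an LSB to round to nearest
    return saturate(product >> FRACTION_BITS)


if __name__ == "__main__":
    half = to_q15(0.5)
    print(half, q15_mul(half, half), from_q15(q15_mul(half, half)))
```

The multiply itself needs only an integer multiplier, an adder, and a shifter, which is the kind of hardware saving the cost argument above refers to.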
In addition, HVX contains two independent sets of vector units. This design is meant for multi-threaded workloads, such as processing audio and images at the same time, and each vector unit can compute independently. ADAS vision tasks such as 360-degree surround view and lane recognition use quantized 8-bit data, which is better suited to the DSP, whose energy efficiency is almost the best. Unquantized 32-bit data is processed by the CPU or GPU.
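As an illustration of what "quantized 8-bit data" means, the following sketch maps floating-point values to INT8 codes with a single shared scale factor and back again. It is a generic symmetric quantization scheme written for this article, not the NPE's actual algorithm, and the function names and the [-127, 127] range are assumptions.

```python
from typing import List, Tuple

INT8_MAX = 127  # symmetric range [-127, 127]; an assumption of this sketch


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to INT8 codes using one shared scale for the whole list."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / INT8_MAX if max_abs > 0 else 1.0
    codes = [max(-INT8_MAX, min(INT8_MAX, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes: List[int], scale: float) -> List[float]:
    """Recover approximate float values from INT8 codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    activations = [-1.27, 0.0, 0.64, 1.27]
    codes, scale = quantize_int8(activations)
    restored = dequantize_int8(codes, scale)
    print("codes:", codes)
    print("restored:", [round(r, 4) for r in restored])
```

Each value loses at most half a quantization step of precision, and in exchange the data takes a quarter of the storage of 32-bit values and fits four times as many elements into each vector register.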
The figure above shows the NPE workflow.
Deep learning in the cockpit will probably focus mainly on speech recognition and natural language processing (NLP), combining local offline NLP with the cloud. Offline NLP demands little processing power from a high-end CPU or GPU. The key issue is storing the speech library models: the cost is high, and rising memory prices are a headache. There may also be intellectual property problems, since the offline data packages could be cracked.
Pictured above is a complete deep learning platform. It is fair to say that deep learning is built on NVIDIA's GPUs; without them, deep learning would not have reached today's level. Today, the training part of most deep learning work is done with GPU acceleration. NVIDIA also planned for the long term: when it launched CUDA in 2007, the idea was already to build an Intel-like ecosystem around it. Although the libraries in the official CUDA Toolkit are not always the most efficient implementations, there is a certain knowledge "black hole": ordinary users cannot get past the libraries' performance ceiling however they optimize their own CUDA C programs.
The officially released libraries, from the early cuBLAS and cuFFT to the later cuDNN for deep learning, are not written in CUDA C but are built with NVIDIA's internal compiler (a non-public version). Clearly, this lets NVIDIA not only sell hardware but also stay ahead in software and increase user stickiness. From the user's point of view, a highly encapsulated library lowers the threshold for development and debugging: users can implement their own algorithms by calling the C API directly, without knowing the details of CUDA C. Even Google chose CUDA rather than OpenCL as the back end for TensorFlow.
The problem is that Qualcomm's GPU naturally cannot use CUDA, only OpenCL. CUDA has a strong ecosystem, especially for deep learning training, and is much easier to use than OpenCL. OpenCL is syntactically similar to CUDA but puts more emphasis on low-level operations, which makes it harder to use; for the same reason, however, OpenCL can run across platforms. CUDA, which is based on C, is packaged so that code is easy to write, so even researchers unfamiliar with chip architecture can use the CUDA tools to write practical programs. Programmers prefer CUDA.
Of course, OpenCL and CUDA are not competitors in the strict sense. CUDA is a parallel computing architecture that includes an instruction set architecture and the corresponding hardware engine, whereas OpenCL is an application programming interface (API) for parallel computing. CUDA C is a high-level language that non-specialists with little hardware knowledge can pick up easily. OpenCL is a hardware-oriented development interface that gives programmers more control over the hardware but is harder to learn and develop with. Using Qualcomm's GPU for deep learning is therefore very difficult. Fortunately, Qualcomm has a DSP. Although this DSP can only perform fixed-point operations, it is still useful, for example for removing background noise in voice processing.
Nvidia's Parker has a clear advantage in performance, and an even clearer one in deep learning, but it is probably not as good as the Qualcomm 820Am in power consumption. Qualcomm has not given an exact TDP figure, and neither has Nvidia, which has quoted 7.5 W and also up to 21 W. TDP is a somewhat outdated metric that makes it hard to assess a chip's power consumption accurately, but power requirements for mobile phones are certainly stricter than for cars. After the complaints about severe overheating in the Qualcomm 810 in particular, Qualcomm would not have taken power lightly with the 820, and the automotive version probably also reduces CPU performance somewhat to lower power consumption.
Against the background of the Sino-US trade dispute, Qualcomm's acquisition of NXP may not receive approval. In the future, Qualcomm's 820A and Nvidia's Parker will pose a powerful threat to the established vendors: NXP's i.MX8 (mass-production samples of the i.MX8QM and i.MX8QP will not be available until the third quarter of 2018, and they still use a 28 nm FD-SOI process), Renesas's R-Car H3, and Texas Instruments' Jacinto 6 Plus. The 820A may squeeze the i.MX8 in the mid-range market, while Parker may dominate the high end and squeeze the R-Car H3.