On May 23rd, at the 103-year-old Palace of Fine Arts in San Francisco, Intel's new technology event, the Artificial Intelligence Developer Conference ("AIDC"), arrived on schedule. This time, Intel focused on broadening the artificial intelligence ecosystem.
Against the backdrop of Roman-style architecture and a high-tech AI stage, Intel's AI chief Naveen Rao presented Intel's combined artificial intelligence software and hardware strategy. The most important news concerned the Nervana neural network chip: according to plan, Intel's latest AI chip, the Nervana NNP L-1000, will officially reach the market in 2019 as Intel's first commercial neural network processor product.
Two years ago, Naveen Rao was CEO and co-founder of deep-learning startup Nervana Systems. After the company was acquired by Intel, Nervana became the flagship of Intel's artificial intelligence effort, the Nervana NNP series was born, and Naveen Rao was appointed general manager of the Artificial Intelligence Products Group.
Carey Kloss, vice president of Intel's Artificial Intelligence Products Group and a member of the Nervana team, said in an exclusive interview with a 21st Century Business Herald reporter: "We started developing Lake Crest (the code name of the first Nervana NNP chip) when the company was founded. At that time, our entire team was about 45 people, and we were building the largest die (silicon chip) we could. We developed Neon (deep-learning software) and also built a cloud stack, all with small teams. But that was also the challenge: small teams have growing pains, and it took us a long time to bring out the first products. Nervana was founded in 2014, and the chip was not actually delivered until last year."
After joining Intel, however, Nervana could draw on Intel's resources. "Of course, mobilizing those resources is not easy, but Intel has extensive experience in bringing products to market. Intel also has the best post-silicon bring-up and architecture analysis I have seen so far," Carey Kloss told the 21st Century Business Herald reporter. "We have hundreds of systems running the chip at the same time. Longtime Nervana employees and members who joined only six months ago are working together day and night." In his view, Nervana is now moving at a reasonable pace and has all the ingredients for success next year.
In addition to Nervana, Intel's major AI acquisitions include vision-chip maker Movidius, FPGA (Field Programmable Gate Array) giant Altera, and autonomous-driving company Mobileye. In fact, since 2011 Intel has continuously invested in artificial-intelligence-related companies, including China's Cambricon and Horizon Robotics.
At the same time, Intel's competitors are also growing. Nvidia's GPUs have made great strides in artificial intelligence; Google recently released the third generation of its TPU AI chip, optimized for its TensorFlow deep-learning framework, and offers developers TPU-based and other underlying services; last year, Baidu joined with ARM, Unisoc Spreadtrum, and Hanfeng Electronics to release DuerOS smart chips providing voice-interaction solutions; Facebook and Alibaba have also entered the chip field, with Alibaba's DAMO Academy developing a neural network chip called Ali-NPU, intended mainly for scenarios such as image and video recognition and cloud computing.
In this melee over artificial intelligence chips, how will Intel respond?
Three factions vie for dominance
From an overall perspective, the global artificial intelligence landscape is not yet settled: companies are fighting local battles over their own technical explorations and have not yet entered an all-out melee. Artificial intelligence is a general concept, specific application scenarios differ greatly, and each company's focus differs. Classified by technology and business model, the world's companies can be divided into three factions.
The first is the system-application faction, most typically represented by Google and Facebook. They not only develop system-level frameworks for artificial intelligence, such as Google's famous TensorFlow and Facebook's PyTorch, but also deploy AI at scale. Google, for example, has invested heavily in autonomous-driving R&D and has launched consumer services such as translation, while Facebook applies artificial intelligence across its social network in areas such as image processing and natural language processing.
The second category is the chip faction, which mainly provides computing power; the biggest players are Intel and Nvidia. Nvidia's GPUs caught the critical moment in demand for computing devices, performing well in graphics rendering, artificial intelligence, and blockchain computation, and putting pressure on Intel in these businesses. At the same time, Nvidia's ambition appears to go beyond Intel's "Intel Inside" model: it hopes to become a true computing platform, and has successfully launched its own CUDA platform.
On May 30th, Nvidia released the world's first computing platform to combine artificial intelligence and high-performance computing, backed by its largest GPU-based system to date, the DGX-2.
Intel, the traditional leader in computing power, is naturally not to be outdone. The 50-year-old company shows no sign of fading and has in recent years launched a series of heavyweight AI acquisitions: in 2015 it acquired FPGA (Field Programmable Gate Array) giant Altera for $16.7 billion, laying a foundation for future computing trends, since FPGAs hold great potential in cloud computing, the Internet of Things, and edge computing; in 2016 it acquired Nervana, planning to use the company's deep-learning capability to compete against GPUs, and in the same year acquired vision-processing-chip startup Movidius; in 2017 it acquired Israeli autonomous-driving company Mobileye for $15.3 billion, aiming to enter the field of automated driving.
Besides the system-application and chip factions, the third category is the technology-application faction, to which most remaining companies belong. Although these companies all claim deep or even unique technical accumulation in deep learning and artificial intelligence, they mostly build on the system and chip players' platforms; their applications simply sit closer to end users, spanning autonomous driving, image recognition, and enterprise software. Objectively speaking, the technology-application faction exemplifies Xunzi's saying that "the gentleman is adept at making use of external things."
Judging from the current competitive landscape, the system-application faction has gradually gained the overall advantage and holds the most core competitiveness in artificial intelligence. In the traditional computer and mobile-phone eras, systems and chips were more of a partnership, with chips if anything dominant. In the PC market, for example, Intel completely dominated computing power across both Windows PCs and Apple's Mac: on the system side, Windows and macOS each had their merits and could not replace each other, but neither could replace their common supplier, Intel. In the mobile-phone era, although the computing protagonist changed from Intel to Qualcomm, the chip remained central, sharing importance equally with the operating system.
In the past year or two, the situation has changed rapidly. Apple has announced plans to design and produce its own Mac chips, and Intel's share price has fallen. In artificial intelligence the trend is even more pronounced: because computing needs differ greatly across scenarios, it has become both necessary and technically feasible for a company like Google to develop mature chips for its own needs. If Intel wants to customize chips for different scenarios, it means fully transitioning into a 2B company. Compared with its previous 2B2C model, pure 2B services clearly cast Intel in the role of contractor ("Party B"), and the complexity of its business lines would increase dramatically. Historically, when a company shifts from 2C to 2B, it is generally because it has lost its core dominance in the industry and has been forced to retreat.
Betting on Nervana NNP
So, amid fierce competition, how can Intel further strengthen its chip business?
After joining Intel, Naveen Rao became a vice president and head of the Artificial Intelligence Products Group (AIPG), leading the launch of the Intel Nervana NNP family of chips. At this AIDC conference, he proposed providing developers with software tools, hardware, and an ecosystem. In the industry's view, with Intel's technical strength, software tools and hardware are not a problem, but the ecosystem remains an open question. In the PC era the chip was the core of the ecosystem, so Intel could entrench itself at its center. In the artificial intelligence era, however, the AI system is the core of the ecosystem, and the chips supplying computing power are only one part of it: CPUs can provide the computing power, but so can GPUs; Intel can make them, but so can Nvidia, and even Google and Apple can make their own.
Currently, in data science and deep-learning computing, Intel's chip lineup mainly comprises the Xeon processor series, Movidius's VPU vision chips, the Nervana NNP series, and FPGAs (Field Programmable Gate Arrays). These product lines correspond to different subdivided application scenarios.
The Nervana NNP series is a neural network processor. Of deep learning's two phases, training and inference, the Nervana NNP mainly targets the computation of the training phase. According to Intel's plan, deep-learning (DL) training performance will be increased 100-fold by 2020. The neural network processor was designed by Intel together with Facebook, so the chip can be expected to support PyTorch, Facebook's machine-learning framework, well; after all, PyTorch's ambition is plainly a showdown with Google's TensorFlow. However, the latest chips will not be commercially available until 2019, and what the deep-learning landscape will look like by then is unpredictable.
Naveen Rao wrote in his blog: "We are developing the first commercial neural network processor product, the Intel Nervana NNP-L1000 (code-named Spring Crest), which is planned for release in 2019. Compared with the first-generation Lake Crest product, we expect the Intel Nervana NNP-L1000 to achieve 3-4 times the training performance. The Intel Nervana NNP-L1000 will also support bfloat16, a numerical format widely used for neural networks. Going forward, Intel will extend bfloat16 support across its artificial-intelligence product lines, including Intel Xeon processors and Intel FPGAs."
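The bfloat16 format Rao mentions keeps float32's sign bit and full 8-bit exponent but only the top 7 mantissa bits, trading precision for float32's full dynamic range at half the storage. A minimal sketch of the idea (truncating rather than rounding, for simplicity; this is an illustration of the format, not Intel's implementation):

```python
import struct

def to_bfloat16(x: float) -> float:
    """Reduce a float32 value to bfloat16 precision.

    bfloat16 is simply the top 16 bits of a float32: same sign bit and
    8-bit exponent, but only 7 mantissa bits. Here we truncate the low
    16 bits, so dynamic range is preserved while precision is coarsened.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # float32 -> uint32
    bits &= 0xFFFF0000                                   # drop low mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# Same dynamic range as float32, but much coarser precision:
print(to_bfloat16(3.141592653589793))  # → 3.140625
print(to_bfloat16(1e38))               # huge values remain representable
```

Because gradients in training rarely need more than a few significant digits but do need wide dynamic range, this format suits training hardware, which is presumably why the NNP line and, later, Xeon and FPGA products adopted it.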
In fact, rumors that Spring Crest would launch at the end of 2018 had already circulated, so the officially announced 2019 date looks like a slight delay. Carey Kloss explained to reporters: "To move to a more modern process node, we integrated more dies (silicon chips) to achieve faster processing speeds. But it takes time to manufacture the silicon, and more time to turn that silicon into a new neural network processor. That is the reason for the delay."
On the difference between the two generations of chips, he said: "Lake Crest, as the first-generation processor, achieved very good compute utilization on GEMM (matrix multiplication) and convolutional neural networks. This is not just a 96% theoretical-throughput figure: in most cases, without full customization, we achieved more than 80% actual compute utilization on GEMM. If we can maintain that high utilization as we develop the next-generation chip, the new product will deliver a 3-4x performance improvement."
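The utilization figure Kloss cites is the ratio of the FLOP/s a workload actually achieves to the chip's theoretical peak. A small sketch of that arithmetic, using made-up numbers purely for illustration (neither the matrix size, runtime, nor peak rate below is a published Lake Crest specification):

```python
def gemm_flops(m: int, n: int, k: int) -> int:
    """A GEMM C = A(m x k) @ B(k x n) performs m*n*k multiply-adds,
    conventionally counted as 2*m*n*k floating-point operations."""
    return 2 * m * n * k

def utilization(m: int, n: int, k: int, runtime_s: float, peak_flops: float) -> float:
    """Fraction of the chip's peak throughput actually achieved."""
    return gemm_flops(m, n, k) / runtime_s / peak_flops

# Hypothetical example: a 4096^3 GEMM finishing in 4 ms
# on a part with a 40 TFLOP/s peak.
u = utilization(4096, 4096, 4096, 4.0e-3, 40e12)
print(f"{u:.0%}")  # → 86%
```

Keeping this ratio high on a next-generation part is what turns a raw clock or die-size increase into the 3-4x end-to-end gain he describes.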
On competition, Carey Kloss said: "I don't know what our competitors' roadmaps are, but we respond relatively quickly, so I don't think we will be at a disadvantage in neural network processing. For example, bfloat16 has been around for a while and has recently grown more popular; many customers asked for it, and we have gradually moved to support it." Comparing with Google's TPU, he said the second-generation TPU is roughly comparable to Lake Crest, and the third-generation TPU to Spring Crest.
Attacking on all fronts
In addition to the much-anticipated Nervana NNP, Intel's Xeon chips target servers and large computing systems. China's Tianhe-1 and Tianhe-2 supercomputers, for example, use Intel Xeon six-core processors.
In vision chips, Intel's business has grown rapidly. Movidius VPU chips have long been used in emerging hardware markets such as cars and drones, appearing in drones, Tesla vehicles, and Google Clips cameras.
Gary Brown, marketing director at Movidius, told 21st Century Business Herald: "At Movidius, the chip we developed is called a vision processing unit, or VPU. The VPU is a chip that combines computer vision and smart-camera processing. So our chips do three types of processing: ISP processing, that is, image signal processing of what the camera captures; computer vision; and deep learning."
He cited specific usage scenarios including VR products, robotics, smart homes, industrial cameras, AI cameras, and surveillance and security. Among them, "surveillance and security is a huge market, especially in China, where the security-camera market is particularly large and big companies such as Hikvision and Dahua develop surveillance cameras."
Gary Brown also noted that the smart-home field, though still a small market, is developing rapidly. "Many companies are developing smart devices such as home security systems, personal home assistants, smart doorbells, and access control for apartments and houses. But in the home it is very challenging to be low-cost, low-power, long on battery life, and highly accurate, because, for example, a shadow moving outdoors could trigger a burglar alarm. So a very low false-alarm rate is essential, and accuracy must be good."
One of the company's challenges is how to keep producing high-performance chips. "We have strategies such as using a front-end algorithm to reduce power consumption: we can shut down most of the chip and run only a small, optimized face-detection block. When a face appears, the rest of the chip wakes up. This keeps the face-monitoring system on at all times. We also have many energy-saving algorithms that let a smart home camera last about six months on a battery," Gary Brown explained.
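The power-gating scheme Brown describes is a two-stage cascade: a cheap always-on detector screens every frame, and the expensive processing path is woken only when a face is likely present. A minimal sketch of the control flow (the detectors here are placeholder stand-ins, not Movidius APIs; in a real VPU both stages would be hardware blocks):

```python
def cheap_face_score(frame: dict) -> float:
    """Stand-in for a tiny always-on detector. In a real camera this
    would be a low-power, hardware-optimized face-detection block."""
    return frame.get("face_likelihood", 0.0)

def expensive_analysis(frame: dict) -> str:
    """Stand-in for the full vision pipeline that is normally powered down."""
    return f"analyzed frame {frame['id']}"

def process_stream(frames, wake_threshold: float = 0.5):
    """Run the cheap detector on every frame; wake the expensive stage
    only when a face is likely. Returns results and the wake-up count."""
    results, wakeups = [], 0
    for frame in frames:
        if cheap_face_score(frame) >= wake_threshold:
            wakeups += 1  # main processing powered up for this frame
            results.append(expensive_analysis(frame))
        # otherwise the expensive stage stays asleep, saving power
    return results, wakeups

# Simulated stream: a face is visible in 1 frame out of every 10.
frames = [{"id": i, "face_likelihood": 0.9 if i % 10 == 0 else 0.1}
          for i in range(100)]
results, wakeups = process_stream(frames)
print(wakeups)  # → 10 (the expensive stage ran on only 10 of 100 frames)
```

If the heavy stage dominates power draw, duty-cycling it like this is what stretches battery life from days to the months Brown mentions.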
In addition, Altera carries the FPGA line. With the arrival of the 5G wave, demand for IoT data analysis and computing will surge: the number of Internet of Things access nodes will reach at least the tens of billions, one to two orders of magnitude more than mobile phones. A typical IoT requirement is flexibility in the face of changing algorithms, which is the strength of FPGAs: they can adapt to customized computing scenarios by reconfiguring their own architecture. This makes it possible for Intel to offer more efficient chips for different types of devices in the future. The $16.7 billion price tag makes clear that Intel was buying more than Altera's immediate value.
A fast break into enterprise scenarios
According to a recent Intel survey, more than 50% of U.S. enterprise customers are turning to cloud solutions based on existing Intel Xeon processors to meet their initial artificial-intelligence needs. Several Intel executives told reporters that no single solution fits all AI scenarios; Intel will match technology to business according to customer needs, for example pairing Xeon with FPGAs, or Xeon with Movidius, to achieve higher-performance artificial intelligence.
For Intel, these enhanced AI capabilities will be widely used in enterprise scenarios. Naveen Rao said: "We need to provide a comprehensive enterprise-class solution to accelerate the transition to a future of artificial-intelligence-driven computing. This means our solutions offer the widest range of computing power and can support multiple architectures, from milliwatts to kilowatts."
Carey Kloss further explained the application scenarios of AI chips to the 21st Century Business Herald reporter: "Spring Crest is, so to speak, the highest-end Nervana neural processor architecture, so its customers include hyperscale computing centers, large enterprises that already do substantial data-science work, and governments. If you need low latency and small models, Xeon can help you; it bridges data from the cloud to the edge."
More specifically, Intel has also explored scenarios such as health care, driverless vehicles, new retail, and the Internet of Things. In medicine, for example, Intel is reportedly working with Novartis to use deep neural networks to accelerate high-content screening, a key element of early drug development. The collaboration cut the time needed to train an image-analysis model from 11 hours to 31 minutes, an efficiency gain of more than 20 times.
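The "more than 20 times" figure follows directly from the two training times the article reports:

```python
# Training times reported for the Intel-Novartis high-content-screening work:
hours_before = 11     # training time before the collaboration
minutes_after = 31    # training time after

speedup = hours_before * 60 / minutes_after
print(f"{speedup:.1f}x")  # → 21.3x, i.e. "more than 20 times"
```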
In unattended stores, Intel provides cloud computing for JD.com's unmanned convenience stores, and the technology has already been deployed in multiple smart stores (the Sinopec Express Store, JD Daojia) and smart vending machines. On the algorithm side, JD.com said the machine-learning algorithms its unattended stores use concentrate on three directions: recognizing people, recognizing goods, and understanding the scene. Because unmanned stores involve both online and offline data, unstructured data such as video must be converted into structured data. For example, the now-popular field of machine vision requires CNN (convolutional neural network) algorithms, while the intelligent supply chain uses traditional machine-learning algorithms such as SVMs, statistical linear regression, and logistic regression. When network conditions are good, most video data can be processed in the cloud with larger models; when the network is poor, edge computing on devices such as phones handles it with smaller networks. The hardware used includes Intel's edge servers.
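To make concrete what JD.com means by "traditional" algorithms alongside CNNs, here is a minimal logistic-regression classifier trained by plain batch gradient descent; the toy data is invented for illustration and has nothing to do with JD's actual features:

```python
import numpy as np

def train_logistic_regression(X, y, lr=0.1, steps=2000):
    """Batch-gradient-descent logistic regression: the kind of classic,
    lightweight algorithm the article contrasts with CNNs for vision."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid probabilities
        grad_w = X.T @ (p - y) / len(y)         # gradient of log-loss w.r.t. w
        grad_b = np.mean(p - y)                 # gradient w.r.t. bias
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Threshold the sigmoid output at 0.5 to get 0/1 labels."""
    return (1.0 / (1.0 + np.exp(-(X @ w + b))) >= 0.5).astype(int)

# Toy, linearly separable data: class 0 near the origin, class 1 farther out.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
              [2.0, 2.0], [2.0, 3.0], [3.0, 2.0], [3.0, 3.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

w, b = train_logistic_regression(X, y)
print(predict(X, w, b))  # recovers all eight labels
```

Models this small can run comfortably at the edge, which is exactly the cloud-versus-edge split the paragraph describes: large video models in the cloud when bandwidth allows, compact classic models on edge hardware when it does not.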
Although Intel faces strong rivals on multiple fronts, its pace of transformation and expansion is firm. On R&D spending alone, according to IC Insights, the top 10 semiconductor manufacturers spent a combined US$35.9 billion on R&D in 2017, with Intel ranking first: its 2017 R&D expenditure was US$13.1 billion, about 36% of the top 10's combined total and roughly one-fifth of Intel's 2017 sales.
With every player investing heavily, the battle over AI chips will only intensify.