
Review of Tencent AI Lab in 2020

via: 博客园     time: 2021/1/4 17:00:44

The year 2020 was not a peaceful one: outbreaks of COVID-19 kept emerging throughout the year. Even so, Tencent AI Lab, adhering to its belief in "technology for good" and its vision of "Make AI Everywhere", achieved many valuable results this year, including applications of AI in epidemic prevention and control, pathological screening, and other medical settings. At the same time, Tencent AI Lab made further important progress in the virtual-real integrated world, virtual humans, robots, intelligent drug research and development, intelligent agriculture, data security, and other fields. In addition to sharing research results with the community through papers and open source projects, Tencent AI Lab is actively cooperating with universities, enterprises, and research institutions to explore the potential of AI technology.

1. Two parallel tracks toward general AI: the virtual-real integrated world and robots

Artificial general intelligence (AGI) has been the core long-term goal of Tencent AI Lab since its founding: to create AI systems that can perceive and understand the real world and effectively perform a variety of tasks. Achieving this goal requires not only breakthroughs in software but also iterative innovation in hardware, as well as the effective integration of software and hardware, which is generally lacking in the industry.

In 2020, Dr. Zhang Zhengyou, director of Tencent AI Lab and Robotics X lab, proposed a new concept: the Integrated Physical-Digital World (IPhD), that is, a virtual-real integrated world. It brings together current visions from the fields of AI, virtual reality (VR), augmented reality (AR), and mixed reality (MR), as well as ideas from the Internet and the Internet of Things, and on this basis envisions hardware, artificial intelligence, and humans interwoven everywhere. At present, all the research of Tencent AI Lab can be placed within the overall framework of the virtual-real integrated world. For a more detailed description of this concept, please refer to Dr. Zhang's speech.

Within the framework of the virtual-real integrated world, four development directions, namely reality virtualization, virtual realization, the holographic Internet, and intelligent agents, will guide the future work of Tencent AI Lab and Robotics X lab.

Here are two major breakthroughs made by Tencent AI Lab and Robotics X lab in 2020: virtual humans and robots. They can be described as the software and hardware intelligent agents of the virtual-real integrated world, and they also draw on core technologies from the other three directions. For example, modeling a face as a digital version is an achievement in reality virtualization.

Virtual humans: virtual avatars of humans, or natives of the digital world

Virtual humans are a multimodal technology, involving computer vision, speech recognition and generation, natural language understanding and generation, and other technologies. By origin, virtual humans can be roughly divided into two categories: digital models of real humans, and virtual humans native to the virtual world.

In building digital models of humans, Tencent announced Siren, a model based on actress Jiang Bingjie, as early as 2018. Siren's motion and expressions are very high fidelity, but because it relies on the industry's top motion capture and real-time rendering technology, Siren is also relatively costly to produce.

In creating virtual humans native to the virtual world, the multimodal virtual human "AI Ailing" developed by Tencent AI Lab met the public in May 2020 and performed the new song "Lighting Up" with young actor and singer Wang Junkai and children from Xiong'an on Children's Day. AI Ailing's 24-hour performance can also be watched in her BiliBili studio.



AI Ailing is the product of Tencent AI Lab's many years of research on vision, speech, natural language, human-computer interaction, and other multimodal areas. For example, the DurIAN speech synthesis framework [2] it uses draws on Tencent AI Lab's years of deep experience in speech. DurIAN not only achieves accurate and robust speech synthesis but can also generate high-quality facial expressions synchronized with the synthesized speech.


Schematic diagram of the DurIAN workflow

Virtual humans native to the virtual world have broad application prospects in virtual idols, virtual assistants, online education, digital content generation, and other fields. For example, AI Ailing is a virtual anchor and virtual singer who also writes her own songs. The SongNet lyrics generation model she uses can generate text that conforms to any given format and template. Combined with the DurIAN model above, AI Ailing can sing her own songs in a natural and beautiful voice.

Virtual humans are an important part of the virtual-real integrated world. To achieve the long-term goal of such a world, we also need to build high-speed, real-time Internet of Things infrastructure, build high-precision models of the real world and more interesting and useful virtual worlds, and create safer and more efficient autonomous machines. Tencent is forging ahead toward this goal.

To enable the quadruped robot Jamoca to walk on plum-blossom poles, Tencent Robotics X lab built an intelligent brain for it that handles complex environments, based on self-developed robot control technology. This brain allows Jamoca to walk, trot, and jump, and gives it the ability to localize itself and avoid obstacles on its own. This demonstrates the lab's capabilities in the core technology areas of robot perception, motion planning, and control, as well as in whole-machine system design and construction.

In research on self-balancing, Tencent Robotics X lab developed a two-wheeled mobile robot that can keep its own balance, the lab's first self-developed robot. A momentum wheel and its motor drive system are added to a traditional wheeled mobile robot so that the robot can keep its balance both when stationary and when moving. Two research papers based on this mobile robot platform were accepted as oral presentations at IROS 2020, a top international robotics conference. The project can be regarded as a milestone in Tencent's mechanical design and whole-machine system design and construction capabilities.


Tencent Robotics X self-balancing wheeled mobile robot


Mobile robots are one of the core components of the "intelligent agent" direction in the framework of the virtual-real integrated world, and an important route toward the ultimate goal of artificial general intelligence. Going forward, besides continuing to enable mobile robots to understand their surroundings more accurately and to act in a timely and reasonable way, Tencent Robotics X lab and Tencent AI Lab will continue to study how to integrate multimodal AI capabilities with robots, so as to create intelligent robots that can take a close part in human work and life, and even act as colleagues and friends.

2. Applications of science and technology

Tencent's core mission is technology for good. As a member of the Tencent family, Tencent AI Lab also adheres to this mission. At the same time, as a frontier explorer of AI technology, Tencent AI Lab is well aware of the potential of AI to transform the world. Therefore, while actively exploring the most cutting-edge AI technology, Tencent AI Lab is also committed to turning that potential into practical applications that better serve users and benefit society.

In July 2020, Academician Zhong Nanshan's team and Tencent AI Lab jointly released research on using AI to predict the probability that a COVID-19 patient will become critically ill. The model predicts the probability of critical illness within 5 days, 10 days, and 30 days respectively, which helps with reasonable early triage of patients. The research was published in Nature Communications, a journal in the Nature family. Tencent AI Lab also open-sourced the relevant code and built a free online query service, contributing to the fight against the COVID-19 epidemic.


Deep learning survival model calculation tool for the early triage of severe COVID-19 patients

In April, the smart microscope jointly developed by Tencent AI Lab obtained an NMPA registration certificate, becoming the first smart microscope product approved for clinical use in China. The product integrates the latest pathological analysis and diagnosis technology and has been iterated many times to fit pathologists' workflow and habits. Tests show that the smart microscope can effectively improve pathologists' efficiency and the accuracy and consistency of pathological analysis, and it is expected to ease the shortage of pathologists and the lack of experience in hospitals (especially primary hospitals). It is also a good example of precision medicine moving from research to real-world deployment.

In July 2020, Tencent AI Lab launched its first AI-driven drug discovery platform, "Yunshen Zhiyao". The platform combines the strengths of Tencent AI Lab and Tencent Cloud in cutting-edge algorithms, optimized databases, and computing resources, and provides five modules covering the preclinical drug discovery process: protein structure prediction, virtual screening, molecular design and optimization, ADMET property prediction, and synthesis route planning.


Address of the Yunshen Zhiyao platform: drug.ai.tencent.com

For protein structure prediction, Yunshen Zhiyao adopts the championship-winning protein structure prediction technology developed by Tencent AI Lab, which includes two key technical breakthroughs: a protein folding method based on self-supervised learning and an iterative method based on deep learning. The technology has won the monthly championship five times in six months on CAMEO, described as the only automatic evaluation platform for protein structure prediction in the world, ahead of many internationally known research teams. In November, Tencent AI Lab published a paper in Nature Communications, a journal in the Nature family, presenting how a "de novo folding" protein structure prediction method helped resolve the crystal structure of SRD5A2 and revealing the inhibition mechanism of finasteride, a drug molecule for the treatment of alopecia and benign prostatic hyperplasia.

For virtual screening, the virtual screening module of the Yunshen Zhiyao platform is the first to apply meta-learning and deep neural networks to the LBDD (ligand-based drug design) task, using AI to "transfer" knowledge learned from other targets (such as the influence of local molecular structure on target binding strength) to improve the model's prediction accuracy. At present, the algorithm's median prediction accuracy (the correlation between predicted activity and experimentally measured activity) across thousands of experimental datasets has risen from the previous record of 0.36 to 0.42, and the percentage of models usable for screening has risen from 56% to 60%, surpassing the industry standard.
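The two headline metrics above (a median correlation across datasets, and the share of datasets whose model is good enough to screen with) can be illustrated with a minimal sketch. This is not the platform's evaluation code: the choice of Pearson correlation, the function names, and the `threshold` value are all assumptions made for illustration, since the article does not specify them.

```python
from statistics import median


def pearson(predicted, measured):
    """Pearson correlation between predicted and measured activities.

    Assumption: the article only says "correlation"; Pearson is used here
    purely as an illustrative choice.
    """
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        return 0.0  # correlation is undefined for constant data; treat as 0
    return cov / (var_p * var_m) ** 0.5


def median_correlation(datasets):
    """Median of the per-dataset correlations.

    `datasets` is a list of (predicted, measured) pairs, one per target.
    """
    return median(pearson(p, m) for p, m in datasets)


def usable_fraction(datasets, threshold):
    """Share of datasets whose correlation reaches a hypothetical threshold."""
    usable = sum(1 for p, m in datasets if pearson(p, m) >= threshold)
    return usable / len(datasets)
```

Reporting the median rather than the mean keeps a few very easy or very hard targets from dominating the summary when the per-target scores are spread widely.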

For molecular generation, the Yunshen Zhiyao algorithm uses AI to learn the relationship between the structural information of small molecules and targets in existing databases, and from this learns a molecular space. The existing models currently support molecular generation for 319 kinases and 52 GPCR targets. During generation, the algorithm samples the region of the molecular space that a given target maps to, so as to generate new molecules that may be active against that target.

For ADMET property prediction, the Yunshen Zhiyao platform also performs very well: its ADMET prediction module for small drug molecules outperforms the best existing academic models by 3% to 11% on multiple datasets, and according to partner feedback, the platform's self-developed algorithm is 6% to 37% more accurate than existing commercial software. Yunshen Zhiyao also uses attention and other mechanisms to visualize how substructures in a molecule influence the result, which makes the model interpretable. In addition, the platform offers flexible deployment options, such as a local version, to protect users' data security.

Tencent AI Lab continues to advance AI-based drug discovery technology and to add more and broader functions to the Yunshen Zhiyao platform. For more information about the platform, please refer to the project website.

In addition, Tencent AI Lab has developed GROVER, a large-scale self-supervised pre-training model for molecular graphs. GROVER is described as the industry's first open source large-scale pre-training model for graph data based on deep graph neural networks. Researchers can quickly apply it as a basic component in drug R&D research that needs to encode small molecules, supporting tasks such as molecular property prediction and virtual screening.

GROVER model: https://drug.ai.tencent.com/cn/news/5


In June, Tencent AI Lab and Wageningen University & Research (WUR), a world-famous agricultural university, jointly organized the second international smart greenhouse planting challenge. The teams used leading-edge technologies such as AI and IoT to optimize greenhouse planting. The harvests of all five AI teams in the second round exceeded those of the group of agricultural planting experts with 20 years of experience. Among them, the champion team, Automatoes, earned full marks, reducing resource consumption per mu by 16% and increasing net profit by 121%, which fully demonstrated the technical value of intelligent agricultural decision-making and automatic greenhouse control, and their future potential to reduce farmers' burden.

In addition, drawing on the AI algorithms and technical experience developed in the first competition, Tencent AI Lab joined hands with the Tencent TEG Architecture Platform Department to create the cloud-native "Tencent AIoT intelligent planting solution iGrow", which was launched in 2020 in Liaoning, a major agricultural province in China. The first-phase tomato pilot brought a "small bumper harvest", with net profit per mu increasing by several thousand yuan per season. The commercial value of iGrow has been preliminarily verified.


The iGrow solution in a greenhouse in Liaoning Province

On November 27, the Tencent Cloud (Shen County) agricultural digital economy industrial base opened, the first agricultural digital economy industrial base of the Tencent group in China. In the new year, the iGrow solution developed by Tencent AI Lab will be further studied and applied at the base.

After chemical fertilizers, pesticides, and large-scale mechanized planting, AI and the Internet of Things are expected to move agriculture further away from the traditional mode of depending on the weather for a harvest. By analyzing and predicting changes in weather, temperature and humidity, and carbon dioxide concentration, and dynamically adjusting the planting strategy, yield can be optimized. In the future, combined with new agricultural technologies such as automated greenhouses and vertical farms, agricultural production efficiency is expected to make a qualitative leap and may even extend to areas unsuitable for farming, helping to eliminate the hunger that human society has yet to solve.

In April, Tencent AI Lab's Go AI "Fine Art" (Jueyi) served China's national Go team as its "top coach", exploring more ways to support the development of Go in China.

Based on the hugely popular mobile game "Honor of Kings", Tencent AI Lab has developed the strategic collaborative AI "Jue Wu". In 2020, through open challenges and professional competitions, Tencent AI Lab showcased its achievements in decision-making in complex environments, multi-agent cooperation and competition, and strategy prediction and planning.

From May 1 to 4, 2020, Jue Wu was opened to players on a large scale for the first time. During this period, a large number of players at different skill levels, from professional players to game streamers to ordinary amateurs, challenged Jue Wu and experienced its abilities in tactical planning, player behavior prediction, multi-hero cooperation, and more.

On November 28, Jue Wu entered the King's Canyon for a three-day public trial running from November 28 to 30. Unlike the version opened in May, the "complete form" version of Jue Wu lifted the restrictions on the hero pool and mastered all the skills of all heroes. Many other strategies were also optimized. The related research was accepted by the top AI conference NeurIPS 2020 and the top journal TNNLS.

Official website of the "Enlightenment" platform: aiarena.tencent.com

To let Jue Wu master all heroes, Tencent AI Lab proposed a new method: curriculum self-play learning (CSPL). This is a progressive method that lets the AI learn from easy to difficult: first, "teacher" models are introduced, each AI teacher being trained through deep reinforcement learning to become proficient in a single lineup; then an AI student is introduced to imitate and learn from all the AI teachers; finally, Jue Wu masters all the skills of all heroes and becomes a grandmaster.
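The middle step of this curriculum, one student imitating several specialized teachers, can be sketched as a toy distillation loop. This is only an illustration under strong assumptions, not Jue Wu's training code: the teachers here are fixed action distributions standing in for policies that would really be trained with deep reinforcement learning, the student is a small table of logits rather than a neural network, and the names `distill`, `lineup_a`, and `lineup_b` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distill(teachers, n_actions, steps=2000, lr=0.5):
    """Train one student to imitate every teacher.

    teachers: dict mapping a lineup name to that teacher's action
              distribution (a stand-in for an RL-trained teacher policy).
    Returns a dict mapping each lineup to the student's action distribution.
    """
    student_logits = {lineup: [0.0] * n_actions for lineup in teachers}
    for _ in range(steps):
        for lineup, target in teachers.items():
            probs = softmax(student_logits[lineup])
            # Gradient of the cross-entropy between the teacher distribution
            # and the student softmax, with respect to the student logits,
            # is (probs - target); take one gradient-descent step.
            student_logits[lineup] = [
                z - lr * (p - t)
                for z, p, t in zip(student_logits[lineup], probs, target)
            ]
    return {lineup: softmax(z) for lineup, z in student_logits.items()}


if __name__ == "__main__":
    # Two hypothetical teachers, each an expert on one lineup.
    teachers = {"lineup_a": [0.7, 0.2, 0.1], "lineup_b": [0.1, 0.1, 0.8]}
    student = distill(teachers, n_actions=3)
    for lineup, probs in student.items():
        print(lineup, [round(p, 3) for p in probs])
```

The point of the sketch is the structure: each teacher only has to solve a narrow problem, and the single student absorbs all of them by matching their action distributions.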


CSPL flow chart

Design idea: task from easy to difficult, model from simple to complex, knowledge layer by layer

MOBA games such as "Honor of Kings" are very complex and involve a variety of cooperative and adversarial game modes. They are therefore well suited as a development platform for strategic AI, for developing general AI technologies that apply to different scenarios. Such technologies are also of great value in many real-world settings, such as coordinating autonomous vehicles on complex city roads and planning delivery areas and routes for couriers or delivery drones.

In addition, in December 2020, the Tencent AI Lab Jue Wu team developed the football AI WeKick with the help of the "Enlightenment" platform and entered it in a football AI competition on Kaggle. The competition uses Google Research Football, a reinforcement learning environment developed by Google Brain based on the open source football game Gameplay Football. This Kaggle competition is also the first of its kind. Unlike "Honor of Kings", a football AI match involves the cooperation of 11 agents and a confrontation with another 11 agents. Compared with MOBA games, rewards are also sparser.


WeKick plays football

Even so, WeKick won the championship with a result significantly better than that of the runner-up. This reflects the generality of Jue Wu's underlying technologies and frameworks.

Although both fall under real-time strategy (RTS) games, StarCraft requires controlling units of many different types and in varying numbers. These units each have their own movement and attack characteristics, so the action space is larger and the strategy space is richer. Tencent Robotics X open-sourced TLeague [3], described as the first general large-scale multi-agent game training framework, and trained a strong AI, TStarBot-X, which can beat master-level players. This StarCraft AI uses only one-fifth of AlphaStar's computing power.

Tencent TranSmart is described as the only Internet machine translation product that supports human-computer interaction. After three years of development, its functions cover the whole manual translation process, from keystrokes, words, phrases, and sentences to translation memory. In 2020, TranSmart embarked on commercial exploration and has been well received by industry partners:

The number one translation engine of Qianwen Publishing Group in China will be ranked by the number one translation engine of Qianwen publishing group.

Huatai Securities: one of the top five securities companies in China, whose analysts efficiently release bilingual research reports through translation memory fusion and interactive translation;

Tencent Cloud official website: when translating the international version of the official website and technical documents, the customized translation engine accurately processes Markdown, XML, and other marked-up text, efficiently reuses language assets such as terminology and bilingual sentence pairs, and helps hundreds of Tencent Cloud products expand overseas.

TranSmart inherits and develops the technical concept of interactive translation. While keeping the human as the principal in translation, its customized, personalized machine translation supports the human translation process in every respect:

Automatic translation quality: in the target scene, through corpus enhancement and model optimization, the quality of automatic translation is at the forefront of the industry;

Real-time translation suggestions: intelligent recommendation of translation fragments and complete sentences significantly reduces the trouble of repeatedly correcting wrong translations and greatly improves the human translation experience;

Translation memory fusion: dynamically combine the user's completed bilingual sentence pairs to generate a more desirable automatic translation, which is significantly better than the traditional static and incremental training machine translation;

Translation input method: referring to the context of the original text and machine translation knowledge, it can achieve accurate word formation and speed up the input efficiency in the process of manual translation.
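The translation memory idea in the list above can be made concrete with a small retrieval sketch. It only shows the lookup half: finding the user's previously translated sentence closest to a new source sentence, which a system could then use as a hint. How TranSmart actually fuses such matches into its neural translation is not described in the article, so the function name `best_memory_match`, the `min_ratio` threshold, and the use of a character-level similarity ratio are assumptions for illustration.

```python
from difflib import SequenceMatcher


def best_memory_match(source, memory, min_ratio=0.6):
    """Return the closest (source, target) pair in the translation memory.

    memory: list of (source_sentence, target_sentence) pairs the user has
            already translated.
    Returns (pair, ratio), or (None, 0.0) if nothing reaches `min_ratio`,
    a hypothetical cutoff below which a match is considered unhelpful.
    """
    best_pair = None
    best_ratio = min_ratio
    for mem_source, mem_target in memory:
        ratio = SequenceMatcher(None, source, mem_source).ratio()
        if ratio >= best_ratio:
            best_pair, best_ratio = (mem_source, mem_target), ratio
    if best_pair is None:
        return None, 0.0
    return best_pair, best_ratio


if __name__ == "__main__":
    memory = [
        ("The server is down.", "服务器已宕机。"),
        ("Restart the server.", "重启服务器。"),
    ]
    print(best_memory_match("The server is down again.", memory))
```

Because the memory is consulted at translation time, a sentence pair the user confirmed a moment ago can influence the very next suggestion, which is the practical difference from retraining a static model.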

3. Advances in frontier research

As a leading, world-class enterprise AI laboratory in China, Tencent AI Lab has adhered to the principle of open cooperation, exploring the frontier of AI technology together with universities and research institutions worldwide.

In 2020, Tencent AI Lab's university cooperation project, the "Rhino-Bird Focused Research Program", completed its third annual cycle and published more than 50 high-level papers. Many of the project's results have been applied to intelligent voice interaction products, a live automatic simultaneous interpretation system, and visual recognition systems. In the new year, we will continue to look for challenging problems in frontier research and carry out original research. At the same time, we will explore industry applications of new technologies and build a win-win ecosystem for industry-university-research cooperation and a platform for turning research results into practice.

In addition, Tencent AI Lab began building the ecosystem of "Enlightenment", an open research platform for AI multi-agent systems and complex decision-making. It launched the first Honor of Kings "Enlightenment" AI academic exchange competition and invited teachers and students from 18 institutions, including Tsinghua University, Peking University, and the Chinese Academy of Sciences, to a dedicated training program and competition for 100 participants, laying a good foundation for opening the platform to universities on a larger scale in the future.

In terms of academic achievements, in 2020 Tencent AI Lab and Robotics X lab made industry-leading contributions in computer vision, speech, natural language processing, multimodality, knowledge graphs, machine learning, robotics, and many other AI fields, and shared these results through academic conferences, journals, and open platforms. Tencent AI Lab and Robotics X lab made significant contributions to ACL, Interspeech, IROS, NeurIPS, AAAI, and other top academic conferences, and their overall number of published papers is at the forefront of domestic enterprise laboratories.

According to AceMap academic map statistics from Shanghai Jiao Tong University, Tencent's 2020 papers in the AI field (a large part of them from Tencent AI Lab) ranked eighth by number among universities and institutions worldwide, and fifth by h-index. Among universities and institutions in China, Tencent ranked fourth in the number of AI papers and second in h-index, significantly ahead of other domestic enterprises.


Tencent's 2020 papers in the AI field rank eighth in the world by number and fifth by h-index. Source: https://www.acemap.info/ranking
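For readers unfamiliar with the h-index used in this ranking: a set of papers has h-index h if h of them have at least h citations each. A minimal sketch of the standard definition follows; the function name and the sample citation counts are illustrative only and are not AceMap data.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


if __name__ == "__main__":
    # Hypothetical citation counts for five papers.
    print(h_index([10, 8, 5, 4, 3]))
```

Unlike a raw paper count, the h-index rewards a body of work that is both large and well cited, which is why the two rankings above can differ.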

Next, we will sort out some important research results of Tencent AI Lab in 2020.

Multimodal research

The goal of multimodal research is to enable an AI or robot to understand its environment and make judgments by integrating signals from different sources, such as vision, radar, GPS, voice, language, and Internet data. Multimodal research is therefore of great value to the two long-term visions of artificial general intelligence and the virtual-real integrated world. Important as it is, there is no top conference or top journal dedicated to multimodal research in the AI field, so Tencent AI Lab's multimodal research results are published across different academic conferences and journals.

In 2020, Tencent AI Lab's multimodal research focused mainly on multimodal learning over audio, video, images, and text. In addition to the virtual human results introduced earlier in this article, Tencent AI Lab also proposed a new method for learning the cross-modal interaction between temporal sentence localization and event captioning in video [4]. The method learns pairwise modal interactions and thereby improves performance on both tasks.

In addition, Tencent AI Lab studied how to generate natural language descriptions based on scene graph decomposition [5] and how to improve the matching of vision and natural language through recursive sub-query construction [6], and proposed a new visual-text matching model [7].


A video text multimodal learning framework for describing and locating video events

Besides video-text multimodality, Tencent AI Lab has also achieved results in video-audio multimodality. For example, in an Interspeech 2020 study, Tencent AI Lab proposed a method that assists disordered speech recognition using cross-domain visual feature generation [8]. The method can be trained on a large amount of out-of-domain audio-visual data, so as to generate visual features for speakers with limited or no visual data. The speech recognition technology proposed in this project is expected to enable some important "technology for good" applications.


Multimodal speech separation framework

In addition, for multimodal human-computer interaction, Tencent AI Lab proposed multimodal speaker diarization [9], multimodal speech separation [10], and multimodal speech recognition [11]. Together they form an integrated human-computer interaction solution for complex scenes such as a cocktail party, combining multiple modalities such as audio, video, voiceprint, and spatial information.


Joint training framework for multimodal speech separation and recognition

Tencent AI Lab also proposed a new deep multimodal fusion framework: the channel-exchanging network (CEN) [12]. By dynamically exchanging the features of specific channels between modalities in a self-guided way during training, the framework maintains sufficient feature learning within each modality while promoting feature interaction between modalities.
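The channel exchange idea can be sketched for two modalities as follows. This is an illustrative simplification, not the CEN implementation: it assumes each channel carries a learned importance score (for example, a normalization layer's scaling factor) and that a channel whose score falls below a threshold is replaced by the same channel from the other modality. The function name, the `threshold` value, and the plain-list feature layout are hypothetical.

```python
def exchange_channels(feat_a, feat_b, scale_a, scale_b, threshold=0.01):
    """Swap in the other modality's channel where a channel's score is tiny.

    feat_a, feat_b:   per-channel feature lists for two modalities,
                      e.g. feat_a[c] holds the values of channel c.
    scale_a, scale_b: per-channel importance scores (assumed to be learned
                      during training; supplied directly in this sketch).
    """
    out_a, out_b = [], []
    for ch_a, ch_b, s_a, s_b in zip(feat_a, feat_b, scale_a, scale_b):
        # A low-importance channel in modality A is replaced by B's channel,
        # and vice versa; important channels are kept unchanged.
        out_a.append(list(ch_b) if abs(s_a) < threshold else list(ch_a))
        out_b.append(list(ch_a) if abs(s_b) < threshold else list(ch_b))
    return out_a, out_b


if __name__ == "__main__":
    rgb = [[1, 1], [2, 2]]
    depth = [[9, 9], [8, 8]]
    print(exchange_channels(rgb, depth, [0.5, 0.001], [0.0, 0.7]))
```

Only channels a modality has effectively stopped using are overwritten, so each modality keeps its own informative features while borrowing from the other where it has spare capacity.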

Machine learning

Machine learning is the core process and hallmark capability of AI. The recent surge in AI development stems from breakthroughs in deep learning. Lately, besides optimizing deep learning methods and expanding their range of application, research in machine learning has also been actively exploring combinations with other learning paradigms, which has given rise to successful technologies such as deep reinforcement learning and generative adversarial networks. In addition, deep graph learning, which is good at untangling network relationships, has become a hot research direction in the field.

In 2020, Tencent AI Lab obtained important research results in many machine learning directions and also contributed to the theoretical analysis of machine learning models, such as interpretability and robustness. These results appear in top AI conferences such as NeurIPS 2020 and top journals such as Nature Communications.

Among these, deep reinforcement learning is a core research direction of Tencent AI Lab. Working with Go, "Honor of Kings", and other video games, Tencent AI Lab has reached a world-leading level in deep reinforcement learning. The Go AI "Fine Art" (Jueyi), developed with this technology, has been applied in the training of the Chinese national Go team, and the "Honor of Kings" AI "Jue Wu" has evolved into its "complete form", which was tested by a large number of players in the first large-scale performance test of a MOBA AI agent. The success of the complete form rests on Tencent AI Lab's effective combination of new and mature methods, including curriculum self-play learning, multi-head value estimation, policy distillation, Monte Carlo tree search, and off-policy adaptation.

Tencent AI Lab has also made great achievements in deep graph learning, including the self-supervised graph neural network framework GROVER mentioned above. By designing self-supervised tasks at the atom, chemical bond, and molecule levels, GROVER can learn rich structural and semantic information from a large number of unlabeled molecules. To encode the huge amount of complex information in molecules, GROVER also combines a message passing network with the Transformer to obtain GTransformer, a graph neural network model with stronger expressive power. It has broad application potential in drug research and development. Tencent AI Lab also proposed a graph variational autoencoder framework based on the Dirichlet distribution [13] and proved the equivalence between the framework and the classical balanced graph cut method. In addition, Tencent AI Lab explored the application of deep graph learning in chemistry through molecular retrosynthesis analysis [14]. At the ACM SIGKDD conference in 2020, Tencent AI Lab, Tsinghua University, the Chinese University of Hong Kong, and other institutions jointly organized a one-day tutorial that systematically explained graph neural networks.

Tencent AI Lab also has a research result that combines deep reinforcement learning with graph learning: a deep reinforcement learning algorithm for text-based games built on a stacked hierarchical attention mechanism [15]. In this study, a knowledge graph is used for explicit reasoning in decision-making, so that the agent's decisions are generated and supported by an interpretable reasoning procedure. With a new hierarchical attention mechanism, an explicit representation of the reasoning process can be constructed from the structure of the knowledge graph.


Hierarchical stacked attention network architecture

Tencent AI Lab has also made progress in network architecture search. Compared with designing network architectures by hand, automatic architecture search is more efficient and can find structures that are difficult for humans to conceive. The technology is now widely used in many fields. To improve the computational efficiency of architecture search, Tencent AI Lab proposed a transitional affine parameter sharing training strategy [16], which quantifies the degree of parameter sharing and dynamically balances the speed of search training against the distinguishability of candidate network structures, thereby improving the efficiency and accuracy of the search.

In combination with multi-task learning, Tencent AI Lab uses a task-aware architecture controller to generate a tailored network structure for each task, and uses meta-learning so that the network parameters adapt quickly to new tasks [17].

Tencent AI Lab has also contributed to related theoretical analysis, including a method for evaluating the interpretability of neural machine translation [18], which helps open the black box of deep learning. Tencent AI Lab also studied how a selective mechanism improves self-attention networks [19], and explained the main contributions of this mechanism to word order encoding and structure modeling, which offers useful guidance for further improving self-attention networks.

Finally, an ECCV 2020 paper from Tencent AI Lab proposes a new convolution method inspired by neuroscience research: context-gated convolution [20]. This is a lightweight component that can be plugged into existing convolutional neural networks and significantly improves the performance of existing models in image recognition, video understanding and machine translation.
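
The general idea of letting global context modulate a convolution kernel can be sketched as follows. This is a deliberately simplified illustration, not the module from the paper: the one-dimensional signal, the single scalar gate per input channel, and the names `gate_weight` and `gate_bias` are assumptions made for this example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def context_gated_conv1d(signal, kernel, gate_weight, gate_bias):
    """Simplified context-gated 1D convolution (illustrative sketch only).

    signal: list of input channels, each a list of floats of equal length.
    kernel: list of input channels, each a list of filter taps of equal length.
    Returns (output, gates) for a single output channel.
    """
    # 1. Global context: the mean activation of each input channel.
    context = [sum(ch) / len(ch) for ch in signal]
    # 2. One gate per input channel, computed from that global context.
    gates = [sigmoid(gate_weight * c + gate_bias) for c in context]
    # 3. The gates rescale the kernel, so the filter depends on the whole input.
    gated_kernel = [[g * w for w in taps] for g, taps in zip(gates, kernel)]
    # 4. Ordinary "valid" convolution (cross-correlation) with the gated kernel.
    k = len(kernel[0])
    output = []
    for t in range(len(signal[0]) - k + 1):
        acc = 0.0
        for ch, taps in zip(signal, gated_kernel):
            for i in range(k):
                acc += ch[t + i] * taps[i]
        output.append(acc)
    return output, gates

signal = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0]]
kernel = [[1.0, 1.0], [1.0, 1.0]]

# With a zero gate weight every gate is 0.5, so the result is half a plain convolution.
neutral_out, neutral_gates = context_gated_conv1d(signal, kernel, 0.0, 0.0)
# With a positive gate weight the more active channel receives the larger gate.
active_out, active_gates = context_gated_conv1d(signal, kernel, 1.0, 0.0)
```

The point of the sketch is step 3: unlike a standard convolution, whose kernel is fixed after training, the effective kernel here changes with the global content of each input.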


Context-gated convolution diagram

Natural language processing

With the emergence of large-scale transformer-based language models such as BERT and OpenAI GPT, some experts believe that natural language processing will see major breakthroughs in the next decade. Tencent AI Lab is carrying out research to advance natural language processing technology. At ACL 2020, the top conference in the field, held in July 2020, Tencent AI Lab contributed 20 papers, placing it among the leading corporate research institutions in China.

In text understanding, Tencent AI Lab opened the TexSmart system to the public in April 2020. It can analyze Chinese and English texts at the morphological, syntactic and semantic levels. Compared with other open text understanding tools, TexSmart not only supports common functions such as word segmentation, part-of-speech tagging, coarse-grained named entity recognition (NER), syntactic analysis and semantic role labeling, but also provides features such as fine-grained named entity recognition, semantic association and deep semantic representation. TexSmart won the best system demonstration award at the China National Conference on Computational Linguistics 2020 (CCL). In dialogue understanding, Tencent AI Lab put forward conversational semantic role labeling, which represents the semantics of a dialogue as multiple "predicate-argument" structures. It can handle the common problems of omitted information and coreference in dialogue at the same time, and effectively improves dialogue understanding and the performance of downstream tasks such as dialogue rewriting [21] and dialogue generation. Tencent AI Lab also combines this technology with other dialogue understanding technologies.
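
As an illustration of what a "predicate-argument" structure that spans dialogue turns might look like as data, here is a small Python sketch. The class names, the PropBank-style role labels (`ARG0`, `ARG1`) and the example dialogue are assumptions for this example; they are not taken from Tencent AI Lab's system.

```python
from dataclasses import dataclass, field

@dataclass
class Argument:
    role: str   # assumed PropBank-style label, e.g. "ARG0" (agent), "ARG1" (patient)
    text: str   # the resolved argument text
    turn: int   # index of the dialogue turn in which this text appears

@dataclass
class PredicateStructure:
    predicate: str
    turn: int                                   # turn containing the predicate
    arguments: list = field(default_factory=list)

def cross_turn_arguments(structure):
    """Arguments found in a different turn than the predicate (omitted or referred to)."""
    return [a for a in structure.arguments if a.turn != structure.turn]

def render(structure):
    """Rewrite the predicate and its arguments as a self-contained clause."""
    by_role = {a.role: a.text for a in structure.arguments}
    parts = [by_role.get("ARG0"), structure.predicate, by_role.get("ARG1")]
    return " ".join(p for p in parts if p)

dialogue = ["Have you seen Inception?", "I loved it."]

# "it" in turn 1 refers back to "Inception" in turn 0.
loved = PredicateStructure(
    predicate="loved",
    turn=1,
    arguments=[Argument("ARG0", "I", 1), Argument("ARG1", "Inception", 0)],
)

rewritten = render(loved)
borrowed = [a.text for a in cross_turn_arguments(loved)]
```

Because each argument records the turn it came from, the same structure covers both coreference ("it") and omitted information, and a downstream rewriting step can simply read the resolved arguments back out.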

In addition, Tencent AI Lab has made research progress in long-text reading comprehension [22], generalization from high-resource languages to low-resource languages [23], and dialogue-based relation extraction [24].

In language generation and dialogue, in addition to the previously described SongNet [25], which generates lyrics and poetry in a controllable format, Tencent AI Lab also studies how to better understand dialogue context, how to build a dialogue robot that is personalized for each user, how to integrate common sense and other knowledge, and how to generate logical natural language while ensuring fluency. Related research results include semantic role labeling and dialogue rewriting for multi-turn dialogues [26], enhancing the understanding of multi-turn dialogues with gray data [27], knowledge-fused dialogue generation [28], logical natural language generation based on open-domain tables [29], and a three-stage generation model for improving dialogue consistency [30].

In machine translation, we are committed to improving the quality of translation models. Our data rejuvenation [31] and multi-domain general translation model [32] make more effective use of large-scale, mixed multi-domain training data. At the same time, we continue working to understand and improve the transformer model, including understanding the importance of the selective mechanism for self-attention networks [33], research on confidence calibration at inference time, and evaluating the interpretability of neural machine translation [34]. Our Chinese-English translation system continued to rank first in Chinese-English translation (t2020) and second in the international competition.

Computer vision

In 2020, Tencent AI Lab made strong progress in computer vision. At CVPR and ECCV, Tencent AI Lab had 11 and 18 papers accepted respectively, covering multimodal learning, video content understanding, adversarial attack and defense, image editing based on generative models, and more. In addition, a number of related papers were accepted at NeurIPS 2020.

First, vulnerability to adversarial attacks is a core weakness of computer vision models based on deep neural networks, and it is the final hurdle for many practical computer vision applications. Naturally, it is also an important research topic at Tencent AI Lab. In 2020, Tencent AI Lab proposed several new adversarial attack strategies. One targets deep clustering: it mines samples that easily cause prediction errors in the clustering layer without affecting the performance of the deep embedding network. The resulting unsupervised adversarial clustering network improves the robustness of deep clustering networks through adversarial attack and defense training [35]. Another, published at ECCV, proposes a new approach to sparse adversarial attacks based on perturbation decomposition [36].
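
The notion of decomposing a sparse perturbation can be illustrated with a toy example in Python. The sketch writes the perturbation as an element-wise product of a binary mask (which positions to change) and a magnitude vector (how much to change them). The linear scoring model, the top-k selection rule and the names `k` and `eps` are assumptions for illustration; the optimization procedure used in [36] is not reproduced here.

```python
def score(weights, x):
    """Toy linear model standing in for a classifier's confidence score."""
    return sum(w * xi for w, xi in zip(weights, x))

def sparse_perturbation(weights, k, eps):
    """Build delta = mask * magnitude with at most k non-zero entries.

    mask:      1 at the k positions the score is most sensitive to, else 0.
    magnitude: a step of size eps in the direction that lowers the score.
    """
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    chosen = set(order[:k])
    mask = [1 if i in chosen else 0 for i in range(len(weights))]
    magnitude = [-eps if w > 0 else eps for w in weights]
    delta = [m * g for m, g in zip(mask, magnitude)]
    return mask, magnitude, delta

weights = [0.5, -2.0, 0.1, 1.5]
x = [1.0, 1.0, 1.0, 1.0]

mask, magnitude, delta = sparse_perturbation(weights, k=2, eps=0.5)
x_adv = [xi + d for xi, d in zip(x, delta)]

clean_score = score(weights, x)
adv_score = score(weights, x_adv)
```

Separating the mask from the magnitude makes the two questions explicit: sparsity is controlled entirely by how many mask entries are 1, while the size of each change is bounded independently by `eps`.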


Example of a sparse adversarial attack

At the same time, Tencent AI Lab has also proposed techniques for defending against adversarial attacks, including a robust object tracking method [37] that takes temporal information into account when generating lightweight adversarial perturbations, thereby improving the robustness of the model.

Tencent AI Lab has also achieved strong results in image deblurring and super-resolution. For example, among its accepted ECCV papers, two address how to remove rain from visual scenes: a binocular rain removal method based on semantic understanding [38] and an image rain removal technique based on the analysis of rain streaks and rain mist [39]. In super-resolution, Tencent AI Lab proposed a face super-resolution algorithm [40] that incorporates a 3D face structure prior, making full use of face structure and identity information to help handle difficult changes in face pose.

Tencent AI Lab has also made research progress in automatic sign language translation. It proposed a hierarchical feature learning method for sign language translation based on multi-granularity video clips [43]. The method adaptively uses temporal information at multiple granularities to model video semantics both locally and globally, which greatly reduces the need for gesture segmentation and improves translation quality. It is hoped that this research can be further turned into an application of "technology for the good".


Recognition of high-quality speech data is basically a solved problem, but in real-world applications speech technology still faces the cocktail party problem and the colloquial style of people's free conversation. In speech synthesis, producing highly natural and expressive synthetic speech still requires continued effort.

In 2020, Interspeech, the top speech technology conference, accepted 16 Tencent AI Lab papers. They include further exploration of cutting-edge speech technology, theoretical research and analysis, and applied results in "technology for the good" and cultural heritage protection.

Among them, Tencent AI Lab has proposed several potential solutions to the cocktail party problem. The first is to use visual data to assist recognition, as introduced in the earlier part on multimodal learning. The second is to learn from sound source data with strong interference [44], which "forces" the model to acquire sufficient discriminative and generalization ability under very adverse interference conditions. The third is to keep improving multi-channel speech enhancement beamforming: a newly proposed beamforming method based on recurrent neural networks [45] goes beyond traditional beamforming for the first time and achieves the best performance simultaneously on objective metrics such as PESQ and on speech recognition metrics such as WER. The fourth is an end-to-end multi-channel speech separation technology [46], which improves on traditional multi-channel techniques by 10%.

In speech recognition, Tencent AI Lab focuses on improving recognition performance under complex conditions. By effectively combining separation and recognition technologies, the accuracy of speech recognition in the presence of background music and interfering voices has improved by 20%. This technology is widely used in video content understanding and in subtitle generation for short videos and live video.

In speech synthesis, DurIAN is the result of many years of research at Tencent AI Lab and the core component of Tencent's virtual human speech system. DurIAN can synthesize not only more natural and fluent speech but also singing. Tencent AI Lab even explored its application to Peking Opera synthesis [47], offering a technical direction for protecting and passing on traditional Chinese culture. After end-to-end synthesis was deployed in industrial use in 2020, Tencent AI Lab's speech synthesis technology has continued to evolve toward higher goals. The number of speaker timbres has increased significantly, and each timbre can now synthesize speech in a variety of emotions and styles, which can be blended to express different texts more naturally in different scenarios. On this basis, Tencent AI Lab has also achieved fine-grained control at the prosodic word and character level: the tone and emotion of individual characters and words can be adjusted flexibly, allowing rich variation within a single sentence and greatly improving the expressiveness and appeal of synthetic speech. This fine-grained control is now being deployed in game commentary and novel narration, which demand higher expressiveness and appeal.

4、 Summary and Prospect

The year 2020 is bound to be written into the history books. How to make the world better has become a question that more and more people are actively thinking about and exploring, and science and technology will play a vital role in answering it.

Continuing to uphold the belief of "technology for the good" and the vision of "make AI everywhere", and guided by the long-term goals of general artificial intelligence and the virtual-real integrated world, Tencent AI Lab made further application and research contributions this year, covering virtual humans, multi-agent systems, agriculture, medical care, drug research and development, robotics and many other fields.

Facing an unknown future, Tencent AI Lab will continue to forge ahead and strive to use science and technology to solve the large and small problems of daily life. In the new year, we will not only continue our exploration of cutting-edge technology but also expand the application of AI technology to more industries.

If you have ever fought with Juewu in Wangzhe Canyon, if you have ever requested a song in front of AI Ailing's stage and listened to her sing, or if you have ever developed your own project with ideas from Tencent AI Lab, you might as well share your experience with us.

Happy New Year 2021!

Open source projects

Hifi3dface: creating high-fidelity 3D virtual humans quickly and at low cost


SongNet: generates text (such as lyrics and poetry) that follows any given format and template. The project also released a pre-trained Chinese model and a fine-tuned Song Ci model.


Grover: large-scale self-supervised molecular graph pre-training model (can be used for ADMET molecular property prediction and other tasks)


Deep learning survival model for early identification of critically ill COVID-19 patients

https://github.com/cojocchen/covid19_critically_ill

Logicnlg: logical natural language generation based on open domain tables


Graph2tree: graph-to-tree learning for automatically solving math word problems


Recurrent transformer: memory-augmented recurrent transformer for generating more coherent video descriptions


InfECE: research on confidence calibration at the inference stage of neural machine translation


SSAN: selective self-attention network


Data rejuvenation: reviving inactive samples in neural machine translation


Metahypernymy: a meta-learning-based method for hypernymy detection in low-resource languages


DialogRE: a dialogue-based relation extraction dataset



AMR multiview: structured information retention in graph to text generation


Lab ZP joint: joint training of zero pronoun recovery and resolution based on a multi-task training framework


Sub GC: natural language description generation based on scene graph decomposition


Featherwave: an efficient multi band parallel high quality speech synthesizer


Tspnet: hierarchical feature learning of sign language translation based on temporal semantic pyramid


Alrdc: robust deep clustering based on adversarial learning


ProxyGML: deep graph metric learning method with fewer proxies


CEN: channel exchanging network


Tstarbot-x: pure machine learning StarCraft II strong AI

https://github.com/tencent-ailab/tleague_projpage

Tleague: a general training framework for large-scale multi-agent game

https://github.com/tencent-ailab/tleague_projpage

Open project

Enlightenment: explore general artificial intelligence with games, which is now open to colleges and universities


Query service for the deep learning survival model for early identification of critically ill COVID-19 patients


The first security attack matrix of AI in the industry: a practical security attack matrix

