Can one person master every martial art in Jin Yong's novels? In reality it is hard, but
On November 28, Tencent announced a strategy-collaboration AI developed jointly by Tencent AI Lab and Honor of Kings
This means that the capability of Tencent's strategy-collaboration AI algorithms has improved further and reached an internationally leading level.
Upgraded
Currently, this
The related research has also been accepted at NeurIPS 2020, a top AI conference, and in top journals, demonstrating Tencent's world-class AI research and application capabilities.
Reinforcement learning research accepted at NeurIPS 2020, a top AI conference
Game AI research is Tencent AI's ultimate research problem
Unlock
In Honor of Kings, if you have four heroes at purple proficiency in each class, you can unlock it
This is also a huge challenge for AI: different heroes share a single set of model parameters. Learning to master a single hero from scratch is relatively easy, but with multi-hero lineups and incomplete map information, the differences in each hero's combat strategy and the need for coordination among heroes make the difficulty grow geometrically. Moreover, multi-hero combinations bring AI
But
The first is to build an optimized AI model that draws on the strengths of many basic machine learning components, so that the model fits MOBA tasks, has strong expressive power, and models hero operations in fine detail.
The second is to develop CSPL (Curriculum Self-Play Learning), a progressive learning method that lets the AI master all of the required abilities step by step, from easy to difficult.
The third is to build a large-scale training platform
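As a rough illustration of the easy-to-difficult idea behind CSPL, the sketch below trains on a sequence of stages and moves on only once an evaluation score reaches a threshold. It is not Tencent's method: the stage definition (a hero pool size), the `ToyLearner` class, and the threshold are all hypothetical and exist only to make the loop runnable.

```python
class ToyLearner:
    """Hypothetical stand-in for a self-play agent: skill grows by 1 per step."""

    def __init__(self):
        self.skill = 0

    def train_step(self, pool_size):
        # A real system would run self-play games here; this toy just counts steps.
        self.skill += 1

    def evaluate(self, pool_size):
        # Toy score in [0, 1]: larger hero pools need more skill for the same score.
        return min(1.0, self.skill / pool_size)


def run_curriculum(stages, train_step, evaluate, threshold=0.6,
                   max_steps_per_stage=1000):
    """Train stage by stage, advancing once evaluate(stage) reaches threshold.

    Returns a list of (stage, steps_spent) pairs.
    """
    history = []
    for stage in stages:
        steps = 0
        while steps < max_steps_per_stage:
            train_step(stage)
            steps += 1
            if evaluate(stage) >= threshold:
                break
        history.append((stage, steps))
    return history


learner = ToyLearner()
# Stages are hero pool sizes, ordered from easy (small pool) to hard (large pool).
history = run_curriculum([2, 4, 8], learner.train_step, learner.evaluate,
                         threshold=0.5)
print(history)
```

Because skills learned in earlier stages carry over, the later and harder stages start from a trained agent rather than from scratch, which is the point of ordering the tasks.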
With a strategist's assistance
In a game, the key is not just to have
The BP (ban/pick) phase of ranked matches in Honor of Kings is a key step that can affect the outcome of the match. The simple approach is to adopt
Inspired by Go-playing AI, the team adopted an automatic BP model that combines Monte Carlo tree search (MCTS) with a neural network and can quickly and accurately select the hero with the greatest long-term value.
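To make the MCTS idea concrete, here is a minimal sketch of a UCT search over a toy draft in which two teams alternate picks. It is an illustration under stated assumptions, not the team's model: the hero pool, the strength numbers, and `value_for_first_team` (a hand-written stand-in for the learned neural network) are all hypothetical.

```python
import math
import random

# Hypothetical toy data: hero name -> strength. Not taken from the game.
HEROES = {"a": 9, "b": 5, "c": 4, "d": 3, "e": 2, "f": 1}
PICKS_PER_TEAM = 2


def legal_picks(draft):
    return [h for h in HEROES if h not in draft]


def is_terminal(draft):
    return len(draft) == 2 * PICKS_PER_TEAM


def value_for_first_team(draft):
    """Stand-in for a value network: 1 = first team wins, 0.5 = tie, 0 = loss."""
    first = sum(HEROES[h] for h in draft[0::2])
    second = sum(HEROES[h] for h in draft[1::2])
    return 1.0 if first > second else 0.5 if first == second else 0.0


class Node:
    def __init__(self, draft, parent=None):
        self.draft = draft      # tuple of picks so far, teams alternating
        self.parent = parent
        self.children = {}      # hero -> Node
        self.visits = 0
        self.value_sum = 0.0    # reward from the view of the team that just picked

    def untried(self):
        return [h for h in legal_picks(self.draft) if h not in self.children]


def uct_child(node, c=1.4):
    # Mean value plus an exploration bonus that shrinks as a child is visited.
    return max(node.children.values(),
               key=lambda ch: ch.value_sum / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))


def mcts_pick(draft, iterations=2000, rng=None):
    rng = rng or random.Random(0)
    root = Node(tuple(draft))
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not is_terminal(node.draft) and not node.untried():
            node = uct_child(node)
        # 2. Expansion: add one untried pick.
        if not is_terminal(node.draft):
            hero = rng.choice(node.untried())
            node.children[hero] = Node(node.draft + (hero,), parent=node)
            node = node.children[hero]
        # 3. Evaluation: finish the draft randomly, then score it.
        rollout = list(node.draft)
        while not is_terminal(rollout):
            rollout.append(rng.choice(legal_picks(rollout)))
        v_first = value_for_first_team(rollout)
        # 4. Backpropagation: store each value from the last picker's view.
        while node is not None:
            last_picker_is_first = len(node.draft) % 2 == 1
            node.visits += 1
            node.value_sum += v_first if last_picker_is_first else 1.0 - v_first
            node = node.parent
    # Recommend the most visited pick at the root.
    return max(root.children, key=lambda h: root.children[h].visits)


print(mcts_pick(()))
```

In this toy pool, hero `a` is strong enough that whichever team takes it cannot lose, so the search should settle on `a` for whoever picks next. A real BP model would replace the random rollout and hand-written score with a trained network's estimate.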
vs Human BP Test
In addition to the common single-round BP, the AI coach also learned the multi-round BP format common in Honor of Kings KPL matches. In this mode heroes cannot be picked more than once, which places higher demands on selection strategy. A trained BP model based on
At this point, with a roster of strong fighters in front and a strategist assisting from behind, completely
Evolution of Jue Wu AI's abilities, from MOBA novice to top professional level
Tencent has also developed a supervised learning (SL) method that models macro-level awareness and micro-level operation strategies at the same time, giving the AI both strong long-term planning and fast in-the-moment execution and bringing it to the level of top non-professional players. The opened levels 1 through 19 were trained with supervised learning methods. In December 2018, the related technical results were publicly demonstrated against human players, and the work was accepted by TNNLS, a top journal.
Supervised learning method paper: https://arxiv.org/abs/2011.12582
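One common way to model macro-level and micro-level decisions at the same time is to train two output heads with a single combined loss. The sketch below shows that idea in plain Python. It is an assumption-laden illustration, not the paper's actual network or loss: the function names, the two-head split, and the weights are hypothetical.

```python
import math


def softmax_cross_entropy(logits, target_index):
    """Cross-entropy of one label under softmax(logits), computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target_index]


def joint_imitation_loss(macro_logits, macro_label, micro_logits, micro_label,
                         macro_weight=1.0, micro_weight=1.0):
    """Weighted sum of two imitation losses for one recorded human decision.

    macro_*: hypothetical "where to go next" head (e.g. a map region index).
    micro_*: hypothetical "which action now" head (e.g. a skill or move index).
    """
    macro_loss = softmax_cross_entropy(macro_logits, macro_label)
    micro_loss = softmax_cross_entropy(micro_logits, micro_label)
    return macro_weight * macro_loss + micro_weight * micro_loss


# An untrained model with uniform outputs over 2 regions and 4 actions.
print(joint_imitation_loss([0.0, 0.0], 0, [0.0, 0.0, 0.0, 0.0], 3))
```

Minimizing one summed loss pushes a shared model to fit both the human player's long-term intent and the immediate action in the same training step.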