On September 10, Tencent AI Lab announced that it will open-source the "Tencent ML-Images" project at the end of September. The project consists of the multi-label image dataset ML-Images and ResNet-101, a deep residual network that the lab says is the most accurate deep learning model of its kind currently in the industry.
By open-sourcing the project, Tencent AI Lab is releasing basic capabilities it has accumulated in the field of computer vision. The project gives researchers and engineers in artificial intelligence ample high-quality training data and an easy-to-use, powerful deep learning model, and is intended to promote the common development of the artificial intelligence industry.
The ML-Images dataset released by Tencent AI Lab contains 18 million images and more than 11,000 common object categories. As the largest multi-label image dataset in the industry, it is large enough to cover the usage scenarios of general research institutions and small and medium-sized enterprises. In addition, Tencent AI Lab will provide ResNet-101, a deep residual network trained on ML-Images. The model has excellent visual representation and generalization performance, and the highest accuracy among models of its kind in the industry. It will provide strong support for visual tasks involving images and videos, and help raise the technical level of image classification, object detection, object tracking, and semantic segmentation.
Deep learning, represented by deep neural networks, has demonstrated excellent capabilities in many fields, especially computer vision, where it handles important tasks such as the classification, understanding, and generation of images and videos. However, to make full use of the visual representation capability of deep learning, it must be built on a set of basic capabilities: sufficient high-quality training data, a good model structure and training method, and strong computing resources.
Major technology companies place great emphasis on building basic artificial intelligence capabilities and have built large image datasets for internal use only, such as Google's JFT-300M and Facebook's Instagram dataset. However, neither these datasets nor the models trained on them have been made public. For general research institutions and small and medium-sized enterprises, these basic capabilities have a very high barrier to entry.
The largest multi-label image dataset currently available in the industry is Google's Open Images, which includes 9 million training images and more than 6,000 object categories. The ML-Images dataset open-sourced by Tencent AI Lab includes 18 million training images and more than 11,000 common object categories, and may become the new industry benchmark dataset. In addition to the dataset, the Tencent AI Lab team will also describe the following in detail in this open-source project:
1) A method for constructing a large-scale multi-label image dataset, covering image sources, the candidate category set, semantic relationships between categories, and image annotation. While building ML-Images, the team made use of the semantic relationships between categories to help annotate the images accurately.
2) A training method for deep neural networks based on ML-Images. The team's carefully designed loss function and training method effectively suppress the negative impact that category imbalance in large-scale multi-label datasets has on model training.
3) The ResNet-101 model trained on ML-Images, which has excellent visual representation and generalization performance. Through transfer learning, the model achieved 80.73% top-1 classification accuracy on the ImageNet validation set, exceeding the accuracy of Google's model of the same type (also in transfer learning mode). It is worth noting that ML-Images is only about 1/17 the size of JFT-300M. This fully demonstrates the high quality of ML-Images and the effectiveness of the training method. See the table below for a detailed comparison.
Note: The Microsoft ResNet-101 model was trained in non-transfer-learning mode, i.e., its 1.2M pre-training images are the images of the original ImageNet dataset.
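Item 2 above mentions a loss function that suppresses the effect of category imbalance, but the article does not give its form. As a rough illustration only, and not Tencent AI Lab's actual loss, the sketch below shows one common way to counter imbalance in multi-label training: a sigmoid cross-entropy in which positives of rare categories are up-weighted. The function names, the negatives-to-positives weighting rule, and the `max_weight` cap are all assumptions made for this example.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def positive_weights(label_matrix, max_weight=10.0):
    """Return one weight per category, larger for rarer categories.

    Hypothetical rule for illustration: weight = #negatives / #positives,
    clipped to the range [1.0, max_weight].
    """
    n_images = len(label_matrix)
    n_categories = len(label_matrix[0])
    weights = []
    for c in range(n_categories):
        positives = sum(row[c] for row in label_matrix)
        negatives = n_images - positives
        ratio = negatives / max(positives, 1)  # avoid division by zero
        weights.append(min(max(ratio, 1.0), max_weight))
    return weights


def weighted_multilabel_loss(logits, labels, pos_weights, eps=1e-12):
    """Mean weighted binary cross-entropy over the categories of one image.

    Each category is an independent yes/no decision (multi-label), and the
    positive term of category c is scaled by pos_weights[c].
    """
    total = 0.0
    for z, y, w in zip(logits, labels, pos_weights):
        p = sigmoid(z)
        total -= w * y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(logits)


# Toy data: 4 images, 2 categories; category 1 is rare.
toy_labels = [[1, 0], [1, 0], [1, 0], [1, 1]]
toy_weights = positive_weights(toy_labels)
toy_loss = weighted_multilabel_loss([0.0, -2.0], [1, 1], toy_weights)
```

With this weighting, a missed positive for the rare category contributes more to the loss than it would in an unweighted cross-entropy, so frequent categories dominate the gradient less.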
Tencent AI Lab's open-source "Tencent ML-Images" project demonstrates Tencent's efforts in building basic artificial intelligence capabilities and its vision of promoting the common development of the industry by opening up those capabilities.
The deep learning model of the “Tencent ML-Images” project has already played an important role in many of Tencent's businesses, for example in the “image quality evaluation and recommendation” feature.
As shown in the figure below, the quality of the cover images in Daily Express News has improved significantly.
Figure: cover images before optimization and after optimization.
In addition, the Tencent AI Lab team has transferred the ResNet-101 model trained on Tencent ML-Images to many other visual tasks, including image object detection, image semantic segmentation, video object segmentation, and video object tracking. These transfer tasks further validate the model's powerful visual representation capability and excellent generalization performance. “Tencent ML-Images” will continue to play an important role in more vision-related products in the future.
Tencent first released open-source projects on GitHub (https://github.com/Tencent) in 2016. It has since accumulated 57 projects covering artificial intelligence, mobile development, mini programs, and other fields. To contribute further to the open-source community, Tencent has joined Hyperledger, LF Networking, and the Open Networking Foundation, and has become a founding member of the LF Deep Learning Foundation and a Platinum Member of the Linux Foundation. As the embodiment of Tencent's “Open” strategy in the technical field, Tencent Open Source will continue to promote shared, reusable, and open-source technology research and development, release Tencent's research and development capabilities, provide technical support for open-source communities at home and abroad, and inject vitality into research and development.