On Monday, Alibaba DAMO Academy announced the latest version of its multi-modal large model, M6, whose parameter count has jumped from 1 trillion to 10 trillion. That far exceeds the trillion-parameter models previously released by Google and Microsoft and makes M6 the world’s largest AI pre-training model.
According to the company, M6 sets an industry benchmark for low-carbon, high-efficiency training: a usable 10-trillion-parameter model was trained on 512 GPUs within 10 days. Compared with GPT-3, a large model released last year, M6 consumes only 1% of the energy at the same parameter scale.
M6 is a general-purpose AI model developed by DAMO Academy, with multi-modal and multi-task capabilities. Its cognitive and creative abilities surpass those of traditional AI, and it is especially good at design, writing and Q&A. It can be used widely in e-commerce, manufacturing, literature and art, scientific research and other fields. Compared with traditional AI, the large model has hundreds or thousands of times as many “neurons” and is pre-trained on large amounts of data, giving it a human-like ability to learn by “drawing inferences from others.”
According to Alibaba, as the first commercialized multi-modal large model in China, M6 has been applied in over 40 scenarios, with a daily call volume of hundreds of millions.
At the same time, DAMO Academy launched MUGE, currently the largest Chinese multi-modal evaluation data set. It covers image captioning, text-to-image generation and cross-modal retrieval, filling the gap caused by a lack of authoritative Chinese evaluation benchmarks.
Zhou Jingren, Head of the Data Analytics and Intelligence Lab at DAMO Academy, said, “Next, we will study the cognitive mechanisms of the brain in depth and strive to raise the cognitive ability of M6 to a level close to that of human beings. On the one hand, we will build the underlying framework of general AI algorithms by simulating how humans extract and understand knowledge across modalities. On the other hand, we will keep enhancing the creativity of M6 in different scenarios to deliver excellent application value.”