The Chinese Academy of Sciences has launched a new-generation AI model, “Zidong Taichu 2.0”, which supports multiple data modalities, including video and 3D. The news was announced at a conference held by the Institute of Automation of the Chinese Academy of Sciences in Shanghai on June 16, 2023.
Compared with the first generation, the new model’s decision-making and judgment capabilities have improved significantly, marking a leap from perception and cognition to decision-making. It is expected to play a larger role in fields such as healthcare, transportation, and industrial production.
As previously reported, the first generation of “Zidong Taichu”, released in 2021, was jointly developed by the Institute of Automation of the Chinese Academy of Sciences and Huawei. It was touted as the “world’s first multi-modal AI model with hundreds of billions of parameters”.
Unlike most existing language models, which focus primarily on text, “Zidong Taichu” was designed from the ground up with a multi-modal approach at its core. It learns from a variety of data types, including images, sound, and text, achieving “unified representation” and “mutual generation” among image, text, and voice data.
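To illustrate the idea behind “unified representation” (this is a hypothetical sketch, not Zidong Taichu’s actual architecture or code): each modality gets its own encoder, but all encoders project into one shared embedding space, so outputs from different modalities become directly comparable. The random linear projections below stand in for trained neural encoders.

```python
import math
import random

random.seed(0)

EMBED_DIM = 8  # dimensionality of the shared embedding space (illustrative)

def make_projection(input_dim, output_dim=EMBED_DIM):
    """Random linear projection standing in for a trained modality encoder."""
    return [[random.gauss(0, 1) for _ in range(input_dim)]
            for _ in range(output_dim)]

def encode(features, projection):
    """Map raw modality features into the shared embedding space."""
    vec = [sum(w * x for w, x in zip(row, features)) for row in projection]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # L2-normalize for cosine comparison

def similarity(a, b):
    """Cosine similarity between two unit-normalized embeddings."""
    return sum(x * y for x, y in zip(a, b))

# Each modality has its own encoder with its own input size,
# but every embedding lands in the same shared space.
text_encoder  = make_projection(input_dim=5)
image_encoder = make_projection(input_dim=12)
audio_encoder = make_projection(input_dim=7)

text_vec  = encode([0.2, 0.9, 0.1, 0.4, 0.3], text_encoder)
image_vec = encode([random.random() for _ in range(12)], image_encoder)
audio_vec = encode([random.random() for _ in range(7)], audio_encoder)

# Cross-modal comparison is now a plain vector operation.
print(len(text_vec) == len(image_vec) == len(audio_vec) == EMBED_DIM)
```

In a real system the projections would be deep networks trained jointly, and the same shared space is what makes “mutual generation” possible: a decoder for one modality can be conditioned on an embedding produced from another.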
“Zidong Taichu 2.0” was built on Huawei’s fully domestic software and hardware stack, Ascend AI hardware and the MindSpore framework, and was jointly developed by the Institute of Automation of the Chinese Academy of Sciences and the Wuhan Artificial Intelligence Research Institute.
In addition to text, images, and audio, “Zidong Taichu 2.0” can integrate further data modalities, including 3D, video, and sensor signals. It deepens the cognitive fusion of voice, video, and text and improves capabilities such as commonsense computing, thereby further “breaking the interactive barrier of perception, cognition, and decision-making”.