ChatGPT developed by OpenAI ignited new interest in AI around the world. Now, however, Qiu Xipeng’s team from the School of Computer Science at Fudan University have released their own ChatGPT-like model, MOSS, on February 20. Xinmin Evening News, a state-owned newspaper, introduced the MOSS in detail.
The most exciting day for the team was January 19. Sun Tianxiang, the main developer of the project and a doctoral student from the School of Computer Science, asked MOSS a Chinese question during the test, but MOSS answered it correctly in English. At that time, the version of MOSS was still very rudimentary, and the Chinese corpus accounted for less than 0.1% of all training data.
The team hadn’t yet taught it machine translation, so they were thrilled by the potential that MOSS showed. Qiu Xipeng compared MOSS to a “smart child”. Even though it is not good at writing poems, solving problems or any one thing in specific, it has shown the potential to become a framework of artificial general intelligence (AGI).
Qiu began to research machine learning when he was a PhD student and entered the natural language processing research field after becoming a teacher at Fudan University. He and his team have developed many innovative research results on the basic model and algorithm of natural language processing. His book “Neural Networks and Deep Learning” is on the list of many “must-read books for artificial intelligence majors”.
Qiu describes ChatGPT as having as many as 175 billion parameters, while MOSS has only about 1/10 of that. At present, some users who have participated in the alpha test phase said that, although MOSS has less parameters than ChatGPT and the coverage of factual problems is not comprehensive enough, the basic functions of ChatGPT have been realized. Qiu believes that in the near future, large language models such as MOSS will become as conventional as search engines, providing help for all aspects of people’s lives. The new model is expected to be finished by the end of March.
MOSS was not developed overnight. Since 2021, the team has started to make Chinese generative pre-training models, which are available for others to download. Later, the concept of “language model as a service” was put forward and training in large language models started last year. They then took another half a year to study how to make large language models understand human instructions and have the ability to talk.
OpenAI announced on March 1 that it’s now letting third-party developers integrate ChatGPT into their apps and services via an API and that doing so will be significantly cheaper than using its existing language models. The company also is making Whisper, its AI-powered speech-to-text model, available for use through an API. The Whisper API is accessible through transcriptions or translation endpoints, which can transcribe or translate the source language into English.