Li Yan, former head of multimedia understanding at Chinese short video platform Kuaishou, established Yuanshi Technology in the second half of 2022, 36Kr reported. The AI company focuses mainly on the research and development of multimodal large models.
Li, who graduated from the Institute of Computing Technology of the Chinese Academy of Sciences, was the core figure in AI technology research and development at Kuaishou. In November 2015, with the support of Su Hua, then CEO of Kuaishou, Li set up the company's first internal deep learning group, with the goal of building an algorithm model to identify illegal video content.
As Kuaishou accumulated AI technology, it needed the ability to screen video content. In 2016, Li renamed the team from deep learning to multimedia understanding. Besides solving security compliance problems, the team also researched and developed algorithm models for various forms of content, such as voice, text and image.
Several people familiar with the matter said that Li offered to leave Kuaishou in 2021 before establishing Yuanshi Technology in the second half of 2022. This time, Li settled on what he saw as the most suitable technical path for content understanding: the multimodal large model, an artificial intelligence algorithm that learns from and is trained on text, image, video, audio and other modal data.
As early as 2018, Li publicly emphasized the importance of multimodal technology. “Video is a comprehensive form of information, incorporating vision, hearing and text, with user behavior another type of modal data, so video itself is a multimodal issue. Therefore, multimodal research is a very important subject for Kuaishou.”
Driven by the recent release of ChatGPT, AI competition among Chinese enterprises has become even fiercer. Baidu, ByteDance and other Internet giants are competing with each other, while some startups are leveraging their own data advantages to edge out rivals. The latest move is the launch of Dialogue Writing Cat, a Chinese version of ChatGPT developed by Meta Sota, an AI service provider in the natural language processing field. At present, invited users can register and use the chatbot on a web page and in mini programs. The platform demonstrates stronger capabilities in the Chinese language and in its training data, and is expected to make a breakthrough in the legal field.