**Zhipu AI: Let machines think like humans in the future**
Letting machines think like humans is a beautiful vision shared by many artificial intelligence (AI) practitioners, and a track that many investors are bullish on.
In September this year, Zhipu AI announced that it had raised hundreds of millions of RMB in Series B financing, jointly led by Legend Capital and Qiming Venture Partners. The funds will be used to continue building high-performance, inclusive large models at the hundred-billion-parameter scale.
Zhou Zhifeng, partner at Qiming Venture Partners, said: “In the next decade, artificial intelligence will move toward cognitive intelligence. Pre-trained large models are its core technical driver and key infrastructure, allowing AI to absorb more knowledge in order to understand and reason, and ultimately to approach human-level cognition. At the same time, pre-trained large models move AI from modeling that relies on manual parameter tuning to a stage of industrialization that can be replicated at scale.”
Recently, the large pre-trained language model ChatGPT broke into the mainstream. It can write poems, draft press releases, and even generate code on demand, drawing wide attention to the wave of innovation around large models. A reporter from “China Science Daily” therefore interviewed Wang Shaolan, president of Zhipu AI, about the future of AI technology and large models.
Building a domestic open-source large model
In June 2020, the artificial intelligence company OpenAI released the GPT-3 language model. Its hundred-billion-parameter scale and powerful language-processing capabilities were an unprecedented shock to the AI world. Earlier this year, OpenAI fine-tuned GPT-3 into InstructGPT, reducing untruthful and biased outputs. Now OpenAI has further upgraded it into ChatGPT, which has shown astonishing language ability in public tests.
Unfortunately, GPT-3’s model parameters are not open source; the model is offered to overseas users only through a paid API (application programming interface), which is not available in China. This raises a barrier for researchers who want to study the model in depth.
At present, although the Internet company Meta has open-sourced the large model OPT and the AI startup Hugging Face has open-sourced BLOOM, users need at least one A100 (80G × 8) server just to run inference, so most ordinary researchers are still shut out by the hardware threshold.
By contrast, in August this year, the large Chinese-English pre-trained language model GLM-130B, jointly developed by the Knowledge Engineering Group (KEG) of Tsinghua University and Zhipu AI, was officially released, and anyone can download and use it for free. The team invested heavily in model quantization, so users can run inference on a single A100 (40G × 8) or V100 (32G × 8) server.
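The quantization mentioned above works by storing weights in a low-precision integer format plus a small number of scaling factors, cutting memory so that inference fits on smaller servers. The sketch below shows a generic symmetric int8 scheme for illustration only; it is not a description of GLM-130B’s actual quantization method.

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-row int8 quantization: each row is scaled so its
    # largest absolute value maps to 127, then rounded to int8.
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float32 weights from int8 values and scales.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

print(w.nbytes // q.nbytes)                    # int8 storage is 4x smaller than float32
print(float(np.abs(w - w_hat).max()) < 0.05)   # reconstruction error stays small
```

The memory ratio is exactly 4× here because only the weights shrink from 32-bit to 8-bit; a production scheme must also keep the per-row scales and choose the quantization granularity to balance accuracy against memory.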
Not long ago, Percy Liang, director of Stanford University’s Center for Research on Foundation Models, led a study comparing many of the world’s large models; GLM-130B achieved the best results on robustness and accuracy. Moreover, the evaluation used only English tasks, while GLM-130B supports both Chinese and English.
Starting in December 2021, Tsinghua University’s KEG, PACMAN (Parallel and Distributed Computer Systems), and NLP (Natural Language Processing) laboratories began discussing the training of a dense model with hundreds of billions of parameters. As the work progressed, however, the research team could not find sufficient, stable computing resources for training.
In April this year, after learning that Tsinghua University’s KEG laboratory lacked the computing resources to train the hundred-billion-parameter GLM model, Zhipu AI decided to provide free computing support for the project.
After coordination among many parties, Zhipu AI rented nearly a hundred A100 servers, providing the KEG laboratory with the computing power needed for training, and committed to open-sourcing the result so that both the research community and industry could understand and use large models at ultra-low cost.
“Computing power at this scale, with its monthly rental cost, is no small expense for a start-up, but the company decided to support the project,” Wang Shaolan said. “We hope that in this way more people will use large models directly, which in turn will drive more people to understand and recognize them. Ultimately, we want large-model technology to become infrastructure for information and intelligent systems, like cloud computing and big data, and to empower every industry.”
Digital Humans Empowered by Large Models
The 22nd FIFA World Cup was recently held in Qatar. In the World Cup broadcasts of several video apps, a familiar figure kept appearing in the lower-left corner of the screen.
She is an AI sign-language digital human developed by Zhipu AI. She provides professional, accurate sign-language commentary on sporting events, conveying the “voice” of the football pitch to hearing-impaired viewers.
As early as the Beijing Winter Olympics and Winter Paralympics, Hua appeared on the program “Good Morning Beijing”, presenting “Winter Olympics Highlights” and “Watching the Winter Olympics Together” in sign language.
“Using a digital human for sign-language broadcasts not only reduced the operating costs of the Winter Olympics but also showcased its technological innovation,” Wang Shaolan explained. The smart sign-language product line developed by Zhipu AI includes sign-language broadcasting, sign-language translation, and a sign-language dictionary, covering scenarios such as information broadcasting, real-time translation and communication, and sign-language learning.
“Digital humans empowered by large pre-trained models embody Zhipu AI’s commitment to public welfare and give technology more warmth,” Wang Shaolan said. “With hundred-billion-parameter pre-trained large models at the core, our digital humans have already been applied in scenarios such as AI virtual interviewers, virtual hosts, intelligent customer service, and chatbots. Next, we will continue to expand digital-human application scenarios, build an ecosystem of digital-human partnerships, and accelerate putting ‘digital humans’ into practice.”
The reporter learned that, based on the open-source hundred-billion-parameter bilingual pre-trained model GLM, Zhipu AI has launched the chatbots XDAI and ChatGLM, which let machines simulate human thinking patterns and realize a dialogue system grounded in concrete knowledge.
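A dialogue system that grounds its replies in concrete knowledge can be pictured, in highly simplified form, as retrieve-then-respond: look up relevant facts, then compose an answer from them. The toy knowledge base and keyword matching below are invented purely for illustration; they are not the actual XDAI or ChatGLM implementations, which generate replies with the pre-trained GLM model rather than rules.

```python
# Toy retrieve-then-respond loop (illustrative only; real systems such as
# XDAI/ChatGLM generate replies with a pre-trained language model).
KNOWLEDGE = {
    "glm-130b": "GLM-130B is an open bilingual Chinese-English model "
                "with 130 billion parameters.",
    "quantization": "Quantization compresses model weights so inference "
                    "runs on smaller servers.",
}

def retrieve(query: str) -> list[str]:
    # Return every stored fact whose key appears in the query.
    q = query.lower()
    return [fact for key, fact in KNOWLEDGE.items() if key in q]

def respond(query: str) -> str:
    # Ground the reply in retrieved facts; fall back when nothing matches.
    facts = retrieve(query)
    return " ".join(facts) if facts else "I have no knowledge about that yet."

print(respond("What is GLM-130B?"))
```

The design point the sketch makes is the grounding step: the reply is assembled from stored knowledge rather than free-form generation, which is the property the article describes as “concretizing knowledge”.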
At the same time, building on its large-model technology, Zhipu AI has proposed the market concept of Model as a Service (MaaS): offering model co-training services, model licensing services, and an open API platform, and working with upstream and downstream partners to build a large-model ecosystem.
On the ecosystem side, Zhipu AI and the China Computer Federation (CCF) jointly launched the CCF-Zhipu Large Model Fund, which funds research on pre-trained large-model theory, algorithms, models, and applications. The goal is to lower the threshold of large-model research so that every researcher in the computing field has a chance to take part, promoting innovation in large-model technology and applications.
An original aspiration that has stood the test of time
No success is accidental, especially when technology leaves the laboratory and enters the market.
Zhipu AI was founded in 2019 as a spin-off of the technical achievements of Tsinghua University’s KEG laboratory. In its core team, CEO Zhang Peng graduated from Tsinghua’s Department of Computer Science; chairman Liu Debing studied under Gao Wen, an academician of the Chinese Academy of Engineering; and Wang Shaolan holds a leading-innovation doctorate from Tsinghua University.
As early as 2006, the KEG laboratory began research on the science-and-technology information analysis engine ArnetMiner (hereinafter AMiner); more than a decade passed before it was truly industrialized with the founding of Zhipu AI. Along the way, the research team won the Test-of-Time Award at the top international conference SIGKDD, the second prize of the National Science and Technology Progress Award, and the first prize of the Beijing Invention Patent Award.
“To industrialize a technology, you need insight into market demand, continuous innovation and promotion, and the ability to empower the industrial ecosystem,” Wang Shaolan recalled. “At the time, we set up a branch in Nanjing just to capture and clean data, a team of more than 40 people, moving from initial manual labeling to gradually established technical rules and then to the flexible application of AI algorithms.”
After years of refinement, today’s AMiner system has indexed 330 million papers, 110 million patents, and 2.8 million scientific research projects from more than 100 million scholars and 380,000 institutions worldwide, and has built a billion-scale high-precision knowledge graph covering 8 million knowledge concepts and 1.1 billion relationships across 40 disciplines, attracting more than 30 million unique-IP visits a year from 220 countries and regions.
Over the years, starting from laboratory technology, Zhipu AI has stayed true to its original aspiration and kept learning. “Integrating knowledge with large models requires a strong combination of industry, academia, and research, and the creation of research, hardware, intelligent-computing, application, and organizational ecosystems,” Wang Shaolan told “China Science Daily”. “We hope to build the underlying AI architecture supporting intelligent applications across different scenarios and directions, empowering thousands of industries. ‘Making machines think like humans’ will be a reachable future.”
https://news.sciencenet.cn/htmlnews/2022/12/490914.shtm