In recent years, the Chinese company iFLYTEK has been gaining momentum in
speech recognition and computer vision. Hundreds of millions of people have
benefited from its Chinese translation and recognition applications. Cong Liu,
the vice president of iFLYTEK's AI Research Institute, plays a crucial role in
the development of these technologies.
Liu has been working on speech recognition and related fields at iFLYTEK since he
was a junior at the University of Science and Technology of China. Early in
his career, he focused primarily on Chinese recognition and translation and on
improving their performance. One of his ingenious creations was
the world's first Chinese dialect recognition tool, supporting up to 22 different
dialects.
Another of Liu’s breakthrough innovations is the DFCNN (Deep Fully Convolutional Neural
Network) model. Compared to a traditional CNN, it better captures long-term
information by stacking many convolutional layers and directly modeling each
sentence as a whole rather than word by word. The development of this model helped Liu and
his team win three competitions at the 4th CHiME challenge in 2016.
After applying DFCNN to real-world applications, they discovered that it was capable
of boosting the performance of the iFLYTEK speech recognition engine by approximately 30% per year. The engine's
speech recognition accuracy now reaches up to 98%.
Liu has served as vice president of iFLYTEK's AI Research Institute since 2014. Computer vision then became a new research area for him. Under his
leadership, the company successfully migrated its sophisticated deep learning
models from speech recognition to computer vision.
“I see a bridge connecting the two areas. That bridge is deep learning,” explained Liu.
Under this vision, he has been leading the team into medical imaging,
video monitoring, and image-text recognition. iFLYTEK Computer-Aided Diagnosis in Medical Imaging, which has already been deployed in
more than 50 hospitals, is one of their recent achievements. This AI-powered
system improves doctors’ efficiency by performing certain imaging
diagnoses within one second.
Liu’s future research topics include cross-modal information integration and
customized speech recognition. “I am so fortunate to be able to live in this
era where core technological breakthroughs prosper. And I want to use them to
build more real-world applications,” he said.