
Exclusive | PolyU’s top AI scientist Yang Hongxia seeks to revolutionise LLM development in Hong Kong


At present, Yang said, LLM development has largely relied on deploying advanced and costly graphics processing units (GPUs), from the likes of Nvidia and Advanced Micro Devices, in data centres for projects involving vast amounts of raw data, which has given deep-pocketed Big Tech companies and well-funded start-ups a significant advantage.
The entrance to the Hung Hom campus of Hong Kong Polytechnic University, where artificial intelligence scientist Yang Hongxia serves as a professor in the Department of Computing. Photo: Sun Yeung

Yang said she and her colleagues propose a “model-over-models” approach to LLM development. That calls for a decentralised paradigm in which developers train smaller models across thousands of specific domains, including code generation, advanced data analysis and specialised AI agents.

These smaller models would then evolve into a large and comprehensive LLM, also known as a foundation model. Yang pointed out that this approach could reduce the computational demands at each stage of LLM development.

Domain-specific models that are typically capped at 13 billion parameters – a machine-learning term for the variables present in an AI system during training, which help establish how data prompts yield the desired output – can deliver performance that is on par with or exceeds OpenAI’s latest GPT-4 models, while using far fewer GPUs, from around 64 to 128 cards.

That paradigm could make LLM development more accessible to university labs and small companies, according to Yang. An evolutionary algorithm then evolves over these domain-specific models to eventually build a comprehensive foundation model, she said.
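The article does not spell out how that evolutionary step would work. As a rough, hypothetical illustration of the general idea only – evolving mixing weights that merge several small domain-specific models into one candidate foundation model – the following Python sketch uses toy parameter vectors in place of real model weights and a made-up `fitness` score standing in for evaluation on domain benchmarks.

```python
# Hypothetical sketch: evolve mixing weights that merge several small
# domain-specific models (here: toy parameter vectors) into a single
# candidate "foundation" model. A real system would merge full networks
# and evaluate on held-out benchmarks; this only shows the evolutionary loop.
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for small domain models (e.g. code, data analysis, AI agents).
domain_models = [rng.normal(size=1_000) for _ in range(3)]

def merge(weights):
    """Weighted average of the domain models' parameters."""
    w = np.maximum(weights, 0)
    w = w / (w.sum() + 1e-9)
    return sum(wi * m for wi, m in zip(w, domain_models))

def fitness(params):
    """Placeholder for benchmark evaluation (higher is better)."""
    target = 0.5 * domain_models[0] + 0.5 * domain_models[1]
    return -np.linalg.norm(params - target)

# Simple (mu + lambda) evolutionary loop over the mixing weights.
population = [rng.random(len(domain_models)) for _ in range(20)]
for generation in range(50):
    ranked = sorted(population, key=lambda w: fitness(merge(w)), reverse=True)
    parents = ranked[:5]
    children = [p + rng.normal(scale=0.1, size=p.shape)
                for p in parents for _ in range(3)]
    population = parents + children

best = max(population, key=lambda w: fitness(merge(w)))
best = np.maximum(best, 0)
print("best mixing weights:", np.round(best / best.sum(), 3))
```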

Successfully initiating such LLM development in Hong Kong would count as a big win for the city, as it looks to become an innovation and technology hub.

Yang Hongxia, a leading artificial intelligence scientist, previously worked on AI models at TikTok owner ByteDance in the US and at Alibaba Group Holding’s research arm Damo Academy. Photo: PolyU
Hong Kong’s dynamic environment, as well as its access to AI talent and resources, makes the city an ideal place to conduct research into this new development paradigm, Yang said. She added that PolyU president Teng Jin-guang shares this vision.

According to Yang, her team has already verified that small AI models, once put together, can outperform the most advanced LLMs in specific domains.

“There is also a growing consensus in the industry that with high-quality, domain-specific data and continuous pretraining, surpassing GPT-4/4V is highly achievable,” she said. The multimodal GPT-4V analyses image inputs provided by a user, and is the latest capability OpenAI has made broadly available.

Yang said the next step is to build a more inclusive infrastructure platform to attract more talent into the AI community, so that some releases can be made by the end of this year or early next year.

“In the future, while a few cloud-based large models will dominate, small models across various domains will also flourish,” she said.

Yang, who received her PhD from Duke University in North Carolina, has published more than 100 papers in top-tier conferences and journals, and holds more than 50 patents in the US and mainland China. She played a key role in developing Alibaba’s 10-trillion-parameter M6 multimodal AI model.


