- AI Singapore and Google collaborate on Project SEALD.
- The initiative aims to improve large language models and bridge communication gaps.
- Project SEALD will contribute to AISG’s Sea-Lion model and make findings publicly available.
AI Singapore (AISG) and Google Research have embarked on a collaborative effort to improve datasets for training and evaluating large language models (LLMs) in Southeast Asian languages.
The initiative, named Project SEALD (Southeast Asian Languages in One Network Data), aims to enhance the capabilities of LLMs, making them more useful across the region for the benefit of society.
The project will initially focus on Indonesian, Thai, Tamil, Filipino, and Burmese, creating rich and diverse datasets for languages spoken in Southeast Asia.
Bridging communication gaps
One of the key objectives of Project SEALD is to bridge communication gaps with underrepresented migrant worker communities in Singapore, where individuals may be more fluent in regional languages than in English.
By training LLMs with culturally relevant data, the project seeks to foster stronger engagement between the Singapore government, employers, and the migrant worker population.
Yolyn Ang, vice president of Asia-Pacific business development at Google, stated, “This [project] will open new opportunities and make AI more inclusive, accessible, and helpful for individuals and businesses throughout the region.”
Contributing to the Sea-Lion model
In addition to improving language datasets, Project SEALD will contribute to the development of AISG’s Sea-Lion (Southeast Asian Languages in One Network) model.
The collaboration between AISG and Google also involves making the datasets and findings from Project SEALD available to the public through open-source channels, promoting transparency and accessibility in the field of AI research.
Project SEALD is not Google’s first foray into language-focused AI initiatives in Asia. The tech giant has a similar partnership in India called Project Vaani, which focuses on diverse speech data from across the country’s 773 districts.
As AISG and Google continue to work together on Project SEALD, their efforts are expected to significantly advance the development of large language models in Southeast Asia, paving the way for more inclusive and effective AI solutions in the region.
To read the original article in its entirety: https://www.techinasia.com/ai-singapore-google-partner-boost-large-language-models-sea