BRIN, KORIKA, GDP Venture, and AI Singapore Develop Indonesian LLM Through SEA-LION Platform

JAKARTA - The National Research and Innovation Agency (BRIN), the Collaboration in Research and Innovation of Artificial Intelligence (KORIKA), and two GDP Venture portfolio companies, Glair.ai and Datasaur.ai, are working together to build an Indonesian large language model (LLM).

This LLM will be offered through the SEA-LION platform developed by AI Singapore. The hope is that many parties will use the platform, so that technology and applied science in the Indonesian language can advance.

AI Singapore's Head of Strategy, Partnership, and AI Growth, Darius Liu, said that generative AI models such as GPT are already available in almost every country, but that they perform best only in Western languages such as English.

"We have to work together to solve this problem now. AI Singapore has prepared an LLM program to achieve three things, among them training a small family of LLMs focused specifically on Southeast Asia," Darius said while announcing the collaboration among the five institutions on Thursday, November 30.

Given these obstacles, AI Singapore is working to bring an Indonesian LLM to its platform. Although still in development, SEA-LION has been tested several times and compared with two large models, GPT-4 and Llama 2.

In AI Singapore's tests, SEA-LION repeatedly gave better responses in Indonesian than GPT and Llama, on tasks ranging from simple questions, such as what gotong royong means, to sentiment analysis.

Although the three chatbots have been compared, AI Singapore could not yet state how accurate SEA-LION is relative to GPT and Llama. However, it already has the percentage figures in hand.

"In a few days we will publish the comparison: what percentage SEA-LION scores and what percentage Llama scores," said Singapore's Head of Artificial Intelligence William Tjhi when asked by VOI.

William also explained that benchmarks are easy to build for English, but not for Southeast Asian languages. AI Singapore therefore has to build its own benchmarks from scratch.

Meanwhile, when asked about security, William said it is a main focus alongside developing the platform. However, securing the platform requires cooperation from all the institutions involved, including BRIN and KORIKA.

"The intention is for it to be accessible to the wider community, because this is open source. However, we as developers must be responsive. We must make sure that this LLM does not repeat something harmful for the Indonesian nation," explained William.