Google AI Researchers Develop Innovative Method to Improve Performance of Large Language Models
JAKARTA - Artificial intelligence (AI) researchers from Google Research and Google DeepMind have developed a method by which large language models (LLMs) can be augmented with other language models.
The approach addresses one of the biggest remaining problems with LLMs by allowing developers to give existing models new capabilities without starting from scratch or running expensive retraining or fine-tuning sessions.
According to the Google Research team, augmenting an LLM with another language model not only improves performance on existing tasks but also enables tasks that the model could not achieve on its own.
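The paper describes this composition only at a high level, but the general idea of attaching a small, frozen specialist model to a larger frozen base model through a lightweight trainable bridge can be sketched in a few lines of PyTorch. The sketch below is purely illustrative: the toy TinyLM classes, the single cross-attention fusion point, and all dimensions are assumptions made for demonstration, not the architecture the Google team actually used.

```python
# Hypothetical sketch of composing two language models: a frozen "anchor"
# model and a frozen smaller "augmenting" model are fused by a small
# trainable cross-attention adapter. All names and sizes are illustrative.
import torch
import torch.nn as nn


class TinyLM(nn.Module):
    """Stand-in for a frozen pretrained language model (anchor or augmenting)."""

    def __init__(self, vocab_size: int, d_model: int, n_layers: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def hidden_states(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.encoder(self.embed(tokens))            # (batch, seq, d_model)


class CrossAttentionAdapter(nn.Module):
    """Trainable bridge: anchor hidden states attend to augmenting hidden states."""

    def __init__(self, d_anchor: int, d_aug: int):
        super().__init__()
        self.project = nn.Linear(d_aug, d_anchor)          # align dimensions
        self.attn = nn.MultiheadAttention(d_anchor, num_heads=4, batch_first=True)

    def forward(self, anchor_h: torch.Tensor, aug_h: torch.Tensor) -> torch.Tensor:
        aug_h = self.project(aug_h)
        fused, _ = self.attn(query=anchor_h, key=aug_h, value=aug_h)
        return anchor_h + fused                            # residual fusion


class ComposedLM(nn.Module):
    """Frozen anchor + frozen augmenting model + small trainable adapter."""

    def __init__(self, anchor: TinyLM, augmenting: TinyLM):
        super().__init__()
        self.anchor, self.augmenting = anchor, augmenting
        for p in list(anchor.parameters()) + list(augmenting.parameters()):
            p.requires_grad = False                        # base models stay untouched
        self.adapter = CrossAttentionAdapter(
            anchor.embed.embedding_dim, augmenting.embed.embedding_dim
        )
        self.lm_head = anchor.lm_head                      # reuse the frozen output head

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        fused = self.adapter(
            self.anchor.hidden_states(tokens),
            self.augmenting.hidden_states(tokens),
        )
        return self.lm_head(fused)                         # next-token logits


if __name__ == "__main__":
    vocab = 1000
    anchor = TinyLM(vocab, d_model=256, n_layers=4)        # "large" general model
    augmenting = TinyLM(vocab, d_model=64, n_layers=2)     # small specialized model
    model = ComposedLM(anchor, augmenting)
    tokens = torch.randint(0, vocab, (2, 16))              # dummy token batch
    print(model(tokens).shape)                             # torch.Size([2, 16, 1000])
```

The property the sketch tries to preserve is the one the article highlights: neither base model is retrained, and only the small adapter learns, which is what would keep the cost far below training or fine-tuning a full model.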
Teaching old chatbots new tricks
The research was conducted using Google's PaLM2-S LLM, a model described as the equivalent of GPT-4, the artificial intelligence on which OpenAI's ChatGPT is based.
In the team's experiments, PaLM2-S was first tested on its own and then tested again after being augmented with a smaller, more specialized language model. The tasks included translation, where the augmented version showed improvements of up to 13% over the baseline, and programming.
When tested on programming tasks, the hybrid model showed significant improvements, as the paper explains:
"Similarly, when PaLM2-S was augmented with a programming-specific model, we saw a relative improvement of 40% compared to the base model for code generation and explanation tasks – comparable to the fully augmented model."
Potentially huge implications
The demonstrated performance improvements could have an immediate impact on the artificial intelligence sector. The gains on translation tasks, for example, were greatest when translating low-resource languages into English. This remains an unsolved problem in machine learning, and Google's research here has the potential to make a significant contribution.
More broadly, however, this line of research could also address the legal issues looming over many companies in the artificial intelligence sector: legal issues that could undermine the very foundations of chatbots like ChatGPT.
The creators of some of the most popular large language models are currently defendants in numerous lawsuits that hinge on allegations that these AI systems were trained on copyrighted data.
The question that lawmakers and courts must answer is whether for-profit companies can legally use this data to train their language models. If a court decides that developers cannot use the data and that models trained with copyrighted material must be removed, it may be technically impossible or uneconomic to continue offering the affected services.
Essentially, due to the high costs of training large language models and their reliance on massive amounts of data, products like ChatGPT, as currently built, may not be viable in a more heavily regulated United States artificial intelligence landscape.
However, if Google's new LLM augmentation scheme proves successful with further development, many of the requirements and costs of training an LLM from scratch or fine-tuning an existing one could be reduced.