Apple Study: LLM-Based AI Models Still Struggle Because They Cannot Reason Logically

JAKARTA - A new study by Apple's artificial intelligence researchers found that systems based on large language models (LLMs), such as those developed by Meta and OpenAI, still lack basic reasoning capabilities.

Apple is proposing a new benchmark called GSM-Symbolic to help measure the reasoning abilities of these models.

In preliminary testing, small changes to the wording of a question could produce very different answers, which undermines the reliability of the models. The study highlights the "fragility" of the models' mathematical reasoning, in which adding contextual information that should not affect the computation leads to different results.

In particular, the performance of all models decreased when only the numerical values in a question were changed in the GSM-Symbolic benchmark. The research also shows that the more clauses a question contains, and so the more complex it becomes, the worse the models perform.
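
The idea of changing only the numbers in a question can be illustrated with a short sketch. This is not the benchmark's actual code: the template text, the names, the value ranges, and the function names `make_variant` and `make_variants` are all hypothetical and chosen only for illustration. Each variant has the same structure and the same solution method, so a system that truly reasons should answer all of them equally well.

```python
import random

# Hypothetical question template; wording and names are illustrative only,
# not taken from the GSM-Symbolic benchmark.
TEMPLATE = (
    "{name} picks {a} apples on Monday and {b} apples on Tuesday. "
    "How many apples does {name} have?"
)

def make_variant(rng):
    """Fill the template with random values and compute the true answer."""
    name = rng.choice(["Sara", "Budi", "Liam"])
    a = rng.randint(2, 50)
    b = rng.randint(2, 50)
    return {
        "question": TEMPLATE.format(name=name, a=a, b=b),
        "a": a,
        "b": b,
        "answer": a + b,  # the ground truth follows directly from the template
    }

def make_variants(n, seed=0):
    """Generate n variants reproducibly from a fixed seed."""
    rng = random.Random(seed)
    return [make_variant(rng) for _ in range(n)]

if __name__ == "__main__":
    for variant in make_variants(3):
        print(variant["question"], "->", variant["answer"])
```

Comparing a model's accuracy across many such variants, rather than on one fixed wording, is one way to see whether its answers depend on the specific numbers and names.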

In one example, Apple's team tested a simple mathematical problem whose answer should not be affected by any additional information. However, models from OpenAI and Meta mistakenly factored the irrelevant information into their answers, which suggests that the models do not really understand the problem and instead rely on language patterns.
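
The effect of an irrelevant clause can be sketched with a toy example. Everything here is an assumption made for illustration: the question, the distractor sentence, and the helper names `add_distractor` and `naive_sum_solver` are hypothetical, and the "solver" is a deliberately crude pattern matcher standing in for a system that does not reason, not a real LLM.

```python
import re

def add_distractor(question, distractor):
    """Insert an irrelevant sentence just before the final question sentence."""
    body, sep, final = question.rpartition(". ")
    if not sep:
        # Single-sentence question: put the distractor in front.
        return f"{distractor} {question}"
    return f"{body}. {distractor} {final}"

def naive_sum_solver(text):
    """Toy pattern matcher: adds up every integer it sees in the text."""
    return sum(int(token) for token in re.findall(r"\d+", text))

# Hypothetical problem; the correct answer is 12 + 8 = 20 in both versions.
question = (
    "Sara picks 12 apples on Monday and 8 apples on Tuesday. "
    "How many apples does Sara have?"
)
distractor = "5 of the apples are slightly smaller than average."
perturbed = add_distractor(question, distractor)

if __name__ == "__main__":
    print(perturbed)
    print("original: ", naive_sum_solver(question))   # 20, correct
    print("perturbed:", naive_sum_solver(perturbed))  # 25, misled by the extra number
```

The size of the apples has no bearing on how many there are, yet the pattern matcher changes its answer because it treats every number as relevant.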

The study concludes that current LLMs lack critical reasoning capabilities and tend to rely on pattern matching that is sensitive to even simple changes in wording. Apple plans to introduce its own, more sophisticated version of AI, starting with iOS 18.1, to overcome the limitations of current LLMs.
