Meta joins AI chatbot race with own large language model for researchers

Meta is making LLaMA available at several sizes (7 billion, 13 billion, 33 billion, and 65 billion parameters).

New Delhi: After the Microsoft-backed ChatGPT and Google’s Bard, Meta is joining the AI chatbot race with its own state-of-the-art foundational large language model, designed to help researchers advance their work in artificial intelligence.

However, unlike the ChatGPT-driven Bing, Meta’s Large Language Model Meta AI (LLaMA) cannot yet converse with humans; for now, it is intended to help researchers.

“Smaller, more performant models such as LLaMA enable others in the research community who don’t have access to large amounts of infrastructure to study these models, further democratising access in this important, fast-changing field,” Meta said in a statement.

Large language models, which are natural language processing (NLP) systems with billions of parameters, have shown new capabilities to generate creative text, prove mathematical theorems, predict protein structures, answer reading comprehension questions, and more.

“They are one of the clearest cases of the substantial potential benefits AI can offer at scale to billions of people,” said Meta.

Smaller models trained on more tokens — which are pieces of words — are easier to retrain and fine-tune for specific potential product use cases.

Meta has trained LLaMA 65 billion and LLaMA 33 billion on 1.4 trillion tokens.

“Our smallest model, LLaMA 7B, is trained on one trillion tokens,” said the company.

Like other large language models, LLaMA works by taking a sequence of words as input and predicting the next word, then repeating the process on the extended sequence to generate text.
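The predict-and-append loop described above can be illustrated with a toy sketch in Python. This is not Meta's code and not how LLaMA is implemented: it swaps the neural network for a simple table of word-pair counts, and the names `build_bigram_table`, `generate` and the sample `corpus` are hypothetical, chosen only for illustration. A real model would work on tokens rather than whole words and would score candidates using the entire preceding sequence, not just the last word.

```python
from collections import Counter, defaultdict


def build_bigram_table(text):
    """Count, for each word, which words follow it in the sample text."""
    table = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table


def generate(table, prompt, max_new_words=5):
    """Predict the next word, append it, and repeat on the longer sequence."""
    sequence = prompt.split()
    if not sequence:
        return ""
    for _ in range(max_new_words):
        # Toy assumption: only the last word is used to pick the next one.
        candidates = table.get(sequence[-1])
        if not candidates:
            break  # no known continuation for the last word
        # Greedy choice: take the most frequent follower seen in the text.
        next_word = candidates.most_common(1)[0][0]
        sequence.append(next_word)
    return " ".join(sequence)


# Hypothetical sample text standing in for training data.
corpus = (
    "llama predicts the next word and "
    "llama predicts the next token and "
    "llama predicts the next word"
)
table = build_bigram_table(corpus)
print(generate(table, "llama", max_new_words=4))
```

Each pass through the loop feeds the output of the previous step back in as input, which is the sense in which the text is generated step by step.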

“To train our model, we chose text from the 20 languages with the most speakers, focusing on those with Latin and Cyrillic alphabets,” Meta said.

The company said that, to maintain integrity and prevent misuse, it is for now releasing the model under a noncommercial license focused on research use cases.
