
Introducing Llama 3.2: Meta’s New AI Model

Overview of Llama 3.2

Just two months after its last release, Meta has introduced Llama 3.2, its first open-source multimodal AI model. It can process both text and images, including visuals such as tables and charts, and can generate image captions.

Advanced AI Applications

The new Llama 3.2 allows developers to create advanced AI applications, for instance virtual reality apps, visual search engines, and document analysis tools. Because the model handles text and image data simultaneously, developers can build tools that reason directly over visual content such as photos, scans, and charts.

Keeping Pace with Competitors

With multimodal models already on the market from companies like OpenAI and Google, Meta is moving to keep pace. The addition of image processing is also important for its future hardware plans, especially the Ray-Ban Meta smart glasses.

Model Variants

Llama 3.2 comes in two vision models (with 11 billion and 90 billion parameters) and two lightweight text-only models (with 1 billion and 3 billion parameters). The smaller text models are designed to run on Qualcomm and MediaTek hardware and other Arm-based devices, so Meta may also bring these models directly to smartphones.
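
To illustrate what working with the vision variants looks like, here is a minimal sketch of image-plus-text inference using the Hugging Face transformers library. The checkpoint name, image URL, and prompt are illustrative assumptions, not details from Meta’s announcement; running it requires accepting Meta’s license for the gated checkpoint and a recent transformers version with Llama 3.2 vision support.

```python
# Minimal sketch: image + text inference with a Llama 3.2 vision variant.
# Assumes gated access to the (assumed) checkpoint on Hugging Face, a recent
# transformers release, and enough GPU memory; repo id, URL, and prompt are
# illustrative assumptions only.
import requests
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"  # assumed repo id
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Load an image (e.g. a chart or a scanned document page) from a URL of your choice.
image = Image.open(requests.get("https://example.com/chart.png", stream=True).raw)

# Build a chat-style prompt that interleaves the image with a text question.
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "What trend does this chart show?"},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
```

The same chat-style prompt format works for document analysis or visual search prototypes: swap the image and the question, and the model returns a text answer grounded in the picture.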

Competitive Performance

Meta asserts that Llama 3.2 competes strongly in image recognition, performing well against models such as Claude 3 Haiku from Anthropic and GPT-4o mini from OpenAI. Additionally, it outperforms models like Google’s Gemma and Microsoft’s Phi-3.5-mini at following instructions, summarizing content, and rewriting prompts.

Availability

Currently, these models are available on the Llama.com website and through Meta’s partner platforms like Hugging Face.
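
For the lightweight text-only variants, getting started from Hugging Face takes only a few lines with the transformers text-generation pipeline. This is a minimal sketch under stated assumptions: the checkpoint name and prompt are illustrative, and the gated model must be requested and approved on Hugging Face before it can be downloaded.

```python
# Minimal sketch: running a lightweight Llama 3.2 text model via transformers.
# The repo id and prompt are illustrative assumptions; access to the gated
# checkpoint must be granted on Hugging Face first.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-1B-Instruct",  # assumed repo id
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Summarize the key points of Llama 3.2 in two sentences."}
]
result = generator(messages, max_new_tokens=80)

# The pipeline returns the full chat; the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```

Because the 1B and 3B variants are small, the same code can run on modest hardware, which fits Meta’s stated goal of bringing these models to phones and other Arm-based devices.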