Apple Quietly Reveals MM1, a Multimodal LLM


Researchers from Apple quietly published a paper describing the company’s work on MM1, a family of multimodal LLMs (large language models) designed for captioning images, answering visual questions, and natural language inference. The paper suggests that Apple, which had remained silent on AI while the rest of the industry seized on it as the next wave, has made real advances and could soon play a major role.

“In this work, we discuss building performant Multimodal Large Language Models (MLLMs),” the description of MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training on arxiv.org reads. “We demonstrate that for large-scale multimodal pre-training using a careful mix of image-caption, interleaved image-text, and text-only data is crucial for achieving state-of-the-art few-shot results across multiple benchmarks, compared to other published pre-training results.”
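The "careful mix" the researchers describe means drawing pre-training batches from several data sources at chosen ratios rather than from a single corpus. The sketch below illustrates that idea only; the source names, mixing weights, and record formats are hypothetical, since the paper's exact pipeline and ratios are not reproduced here.

```python
import random

# Hypothetical stand-ins for the three data types the MM1 paper names.
# The example records and weights below are illustrative, not Apple's.
DATA_SOURCES = {
    "image_caption": ["<img_001> A dog on a beach.", "<img_002> A red bicycle."],
    "interleaved_image_text": ["Intro text <img_003> then more text <img_004>."],
    "text_only": ["A plain text document.", "Another text-only document."],
}

# Assumed mixing ratios; the paper's point is that the mix itself matters.
MIX_WEIGHTS = {
    "image_caption": 0.45,
    "interleaved_image_text": 0.45,
    "text_only": 0.10,
}

def sample_batch(batch_size: int, seed: int = 0):
    """Draw a pre-training batch, picking each record's source by weight."""
    rng = random.Random(seed)
    names = list(MIX_WEIGHTS)
    weights = [MIX_WEIGHTS[n] for n in names]
    batch = []
    for _ in range(batch_size):
        source = rng.choices(names, weights=weights, k=1)[0]
        batch.append((source, rng.choice(DATA_SOURCES[source])))
    return batch

batch = sample_batch(8)
```

In a real training run each entry would be a tokenized sequence rather than a string, but the weighted-sampling structure is the same.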


The paper describes MM1 as a family of multimodal models with up to 30 billion parameters that “achieve competitive performance after supervised fine-tuning on a range of established multimodal benchmarks.” As the Apple researchers put it, MLLMs (multimodal large language models) have emerged as “the next frontier in foundation models” after traditional LLMs, and they “achieve superior capabilities.”

The Apple researchers believe they’ve made a breakthrough when it comes to training models with both images and text, and that these findings will help others trying to scale these models to ever-larger sets of data with better performance and reliability. Of course, for now, all we have to go on is the paper, as MM1 is not available for testing.

And it may never be: Apple is rumored to be working on an LLM framework code-named “Ajax” as part of a $1 billion AI R&D push. And the firm allegedly acquired the DarwinAI startup earlier this year to help goose those efforts.

“We view AI and machine learning as fundamental technologies, and they’re integral to virtually every product that we ship,” Apple CEO Tim Cook said during a post-earnings conference call in February after a year of silence on the topic. “We’re excited to share the details of our ongoing work in that space later this year.”

Since then, the company has also highlighted the AI prowess of its recently announced MacBook Air M3 refresh. But the big push will likely come in June, when Apple is expected to host the next edition of its annual WWDC developer show. It’s reasonable to expect that event to focus on AI, as the upcoming Google (I/O) and Microsoft (Build) developer shows likely will.




Thurrott © 2024 Thurrott LLC