‘Test-Time Matching’ method lets AI models improve without additional training data


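Researchers at the University of California, Riverside have developed a new method called Test-Time Matching (TTM) that lets AI models improve their reasoning using only the test questions, with no additional training data, particularly when interpreting relationships between text and images.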


Making AI smarter without more training data

‘Test-Time Matching’ method lets AI models improve with use


A study led by UC Riverside researchers offers a practical fix to one of artificial intelligence’s toughest challenges by enabling AI systems to reason more like humans—without requiring new training data beyond test questions.

In a pre-print paper titled “Test-Time Matching: Unlocking Compositional Reasoning in Multimodal Models,” assistant professor Yinglun Zhu and students introduce a novel method called Test-Time Matching, or TTM. The approach significantly improves how AI systems interpret relationships between text and images, especially when presented with unfamiliar combinations.


“Compositional reasoning is about generalizing in the way humans do and understanding new combinations based on known parts,” said Zhu, who led the study and is a member of the Department of Electrical and Computer Engineering at the Bourns College of Engineering. “It’s essential for developing AI that can make sense of the world, not just memorize patterns.”

Today’s leading AI models perform well on many tasks, but they can falter when asked to align visual scenes with language under compositional stress—such as when familiar objects and relationships are rearranged and described in new ways.

Researchers use specialized tests to evaluate whether AI models can integrate concepts in the way people do. Yet models often perform no better than chance, suggesting they struggle to grasp the nuanced relationships between words and images.

Zhu’s team found that existing evaluations may unfairly penalize models.

Widely used evaluation metrics rely on isolated pairwise comparisons, which impose extra constraints that can obscure the best overall matching between images and captions, Zhu said.

To address this, the team created a new evaluation metric that identifies the best overall matching across a group of image-caption pairs. This metric improved scores and revealed previously unseen model capabilities.
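
The difference between the two scoring styles can be illustrated with a small sketch. This is not the paper's code: the function names are hypothetical, the similarity numbers are made up, and the pairwise rule is an assumption modeled on group-style scoring in which every image must prefer its own caption and every caption its own image. The matching rule instead asks only whether the highest-scoring one-to-one assignment is the correct one.

```python
from itertools import permutations

def pairwise_group_correct(sim):
    """Assumed pairwise rule: sim[i][i] must beat every other entry
    in its row (wrong captions) and in its column (wrong images)."""
    k = len(sim)
    for i in range(k):
        for j in range(k):
            if i == j:
                continue
            if sim[i][i] <= sim[i][j]:  # image i prefers wrong caption j
                return False
            if sim[i][i] <= sim[j][i]:  # caption i prefers wrong image j
                return False
    return True

def best_matching(sim):
    """Brute-force the one-to-one image->caption assignment with the
    highest total similarity (fine for the small groups used here)."""
    k = len(sim)
    return max(
        permutations(range(k)),
        key=lambda p: sum(sim[i][p[i]] for i in range(k)),
    )

def matching_group_correct(sim):
    """Matching rule: the best overall assignment is the identity."""
    return best_matching(sim) == tuple(range(len(sim)))

# Illustrative scores: rows are images, columns are captions,
# and the correct pairing is image i with caption i.
sim = [
    [0.90, 0.50],
    [0.95, 0.60],
]
print(pairwise_group_correct(sim))  # caption 0 scores higher with image 1
print(matching_group_correct(sim))  # yet the best overall assignment is correct
```

In this toy group the pairwise rule marks the model wrong because one isolated comparison fails, while the matching rule credits it because the correct pairing still has the highest total score.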


Building on this insight, the researchers then developed Test-Time Matching, or TTM, a technique that allows AI systems to improve with use without any external supervision.

The method works by having the AI model predict matches between images and captions, selecting its most confident predictions. The model then fine-tunes itself using those predictions, repeating the process to refine its performance. This self-improvement process mimics how people use context to reason more effectively.

The researchers tested their method on SigLIP-B16, a relatively small vision-language model designed to understand and connect visual and textual information. With TTM, SigLIP-B16 significantly improved its performance on compositional reasoning benchmarks, achieving or exceeding previous state-of-the-art results.

In one test, TTM boosted SigLIP-B16’s performance on a benchmark dataset known as MMVP-VLM to 89.4%, surpassing GPT-4.1.

“Even smaller models have the capacity for strong reasoning,” Zhu said. “We just need to unlock it with better evaluation and smarter test-time methods.”

The study suggests that test-time adaptation strategies like TTM could become essential as AI expands into real-world settings such as robotics, autonomous vehicles, and healthcare—domains where systems must quickly adapt to new situations.

Zhu’s findings challenge the prevailing assumption that larger models are always better. Instead, he calls for rethinking how AI systems are evaluated and deployed.

“Sometimes, the problem isn’t the model. It’s how we’re using it,” he said.

The full paper, co-authored with UCR’s Jiancheng Zhang and Fuzhi Tang, is available on arXiv.

