Releases

DINOv3

DINOv3 is a generalist computer vision foundation model that scales self-supervised learning (SSL) to produce high-resolution visual features without the need for labeled data.

Figure 1: DINOv3 benchmarks

Intern-S1: A Scientific Multimodal Foundation Model

Intern-S1 is a large-scale multimodal Mixture-of-Experts (MoE) foundation model released by Shanghai AI Laboratory. It is designed to close the gap between general-purpose open-source models and expert-level closed-source models in scientific domains. The model has 28 billion active parameters and 241 billion total parameters, and it was pretrained on 5T tokens. The authors used Mixture-of-Rewards (MoR), a novel reinforcement-learning technique, to train simultaneously on more than 1,000 tasks.
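The gap between 28B active and 241B total parameters reflects how MoE models work: all experts are stored, but only a few are routed to per token. The sketch below illustrates that accounting with made-up numbers — the expert count, per-expert size, and top-k routing are hypothetical, not Intern-S1's actual architecture.

```python
# Illustrative sketch: active vs. total parameters in a Mixture-of-Experts model.
# All configuration numbers are hypothetical, NOT Intern-S1's real layout.

def moe_param_counts(shared_params, n_experts, params_per_expert, top_k):
    """Return (total, active) parameter counts for a toy MoE model.

    shared_params:     parameters used for every token (attention, embeddings, ...)
    n_experts:         experts stored per MoE layer stack
    params_per_expert: parameters in one expert
    top_k:             experts each token is routed to
    """
    total = shared_params + n_experts * params_per_expert
    active = shared_params + top_k * params_per_expert
    return total, active

# Toy configuration (hypothetical):
total, active = moe_param_counts(
    shared_params=4_000_000_000,      # 4B shared across all tokens
    n_experts=64,                     # all 64 experts kept in memory
    params_per_expert=3_700_000_000,  # 3.7B each
    top_k=6,                          # only 6 experts run per token
)
print(f"total:  {total / 1e9:.1f}B")
print(f"active: {active / 1e9:.1f}B")
```

The point is that memory cost scales with the total count while per-token compute scales roughly with the active count, which is how MoE models reach very large capacity at moderate inference cost.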

Ovis2.5

Ovis2.5 is an open-source multimodal model released by Alibaba that introduces native-resolution vision and reflective reasoning. It achieves state-of-the-art performance on STEM, chart-analysis, and broader multimodal benchmarks.

Thyme: Think Beyond Image

Thyme enables multimodal LLMs to autonomously generate code for image manipulation and math; with its GRPO-ATS training strategy, it achieves strong gains on high-resolution perception and complex reasoning benchmarks.

Organization Highlight

Polymathic AI

  • Hugging Face Organization card: Polymathic-AI

  • Mission: To usher in a new class of machine learning for scientific data, building models that can leverage shared concepts across disciplines. We aim to develop, train, and release such foundation models for use by researchers worldwide.

  • @PolymathicAI - X

    Figure 2: Polymathic-AI: Advancing Science through Multi‑Disciplinary AI

  • Datasets released:

    • The Well
      • A 15TB collection of physics simulation datasets.

Notable Papers

Repositories

Textbooks