Releases

DINOv3

DINOv3 is a generalist computer vision foundation model that scales self-supervised learning (SSL) to produce high-resolution visual features without the need for labeled data.

Figure 1: DINOv3 benchmarks

Intern-S1: A Scientific Multimodal Foundation Model

Intern-S1 is a large-scale multimodal Mixture-of-Experts (MoE) foundation model released by Shanghai AI Laboratory. It is designed to close the gap between general-purpose open-source models and expert-level closed-source models in scientific domains. The model has 28 billion active parameters and 241 billion total parameters, and it was pretrained on 5T tokens. The authors used Mixture-of-Rewards (MoR), a novel reinforcement-learning technique, to train simultaneously on more than 1,000 tasks.
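The gap between 28B active and 241B total parameters reflects how MoE models work: all experts are stored, but only a few are routed to per token. The sketch below illustrates that accounting with made-up numbers — the expert count, per-expert size, and top-k routing are hypothetical, not Intern-S1's actual architecture.

```python
# Illustrative sketch: active vs. total parameters in a Mixture-of-Experts model.
# All configuration numbers are hypothetical, NOT Intern-S1's real layout.

def moe_param_counts(shared_params, n_experts, params_per_expert, top_k):
    """Return (total, active) parameter counts for a toy MoE model.

    shared_params:     parameters used for every token (attention, embeddings, ...)
    n_experts:         experts stored per MoE layer stack
    params_per_expert: parameters in one expert
    top_k:             experts each token is routed to
    """
    total = shared_params + n_experts * params_per_expert
    active = shared_params + top_k * params_per_expert
    return total, active

# Toy configuration (hypothetical):
total, active = moe_param_counts(
    shared_params=4_000_000_000,      # 4B shared across all tokens
    n_experts=64,                     # all 64 experts kept in memory
    params_per_expert=3_700_000_000,  # 3.7B each
    top_k=6,                          # only 6 experts run per token
)
print(f"total:  {total / 1e9:.1f}B")
print(f"active: {active / 1e9:.1f}B")
```

The point is that memory cost scales with the total count while per-token compute scales roughly with the active count, which is how MoE models reach very large capacity at moderate inference cost.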

Ovis2.5

Ovis2.5 is an open-source multimodal model released by Alibaba that introduces native-resolution vision and reflective reasoning. It achieves state-of-the-art performance on STEM, chart-analysis, and broader multimodal benchmarks.

Thyme: Think Beyond Image

Thyme enables multimodal LLMs to autonomously generate code for image manipulation and math; with its GRPO-ATS training strategy, it achieves strong gains on high-resolution perception and complex reasoning benchmarks.

Organization Highlight

Polymathic AI

  • Hugging Face Organization card: Polymathic-AI

  • Mission: To usher in a new class of machine learning for scientific data, building models that can leverage shared concepts across disciplines. We aim to develop, train, and release such foundation models for use by researchers worldwide.

  • @PolymathicAI - X

    Figure 2: Polymathic-AI: Advancing Science through Multi‑Disciplinary AI

  • Datasets released:

    • The Well
      • A 15TB collection of physics simulation datasets.

Notable Papers

Repositories

Textbooks