Newsletter

🔋 Pilas: y25-w34

Releases

DINOv3
Paper: Hugging Face – Arxiv
Hugging Face Collection
Blog post: DINOv3: Self-supervised learning for vision at unprecedented scale
Website: Self-supervised learning for vision at unprecedented scale

DINOv3 is a generalist computer vision foundation model that scales self-supervised learning (SSL) and produces high-resolution visual features, eliminating the need for labeled data.

Figure 1: DINOv3 benchmarks

Intern-S1: A Scientific Multimodal Foundation Model
Paper: Hugging Face – Arxiv
Models: Intern-S1 – Intern-S1-mini

Intern-S1 is a large-scale multimodal Mixture-of-Experts (MoE) foundation model released by the Shanghai AI Laboratory. It is designed to close the gap between general-purpose open-source models and expert-level closed-source models in scientific domains. The model has 28 billion active parameters and 241 billion total parameters, and it was pretrained on 5T tokens. The authors used Mixture-of-Rewards (MoR), a novel RL technique, to train simultaneously on more than 1000 tasks.

...

🔋 Pilas: y25-w20

Models/Systems

INTELLECT-2 - Prime Intellect - 05/12/25
Organization: Prime Intellect
Paper: INTELLECT-2: A Reasoning Model Trained Through Globally Decentralized Reinforcement Learning
Blog: INTELLECT-2 Release: The First 32B Parameter Model Trained Through Globally Distributed Reinforcement Learning
Hugging Face Model Card: INTELLECT-2

Releasing INTELLECT-2: We’re open-sourcing the first 32B parameter model trained via globally distributed reinforcement learning:
• Detailed Technical Report
• INTELLECT-2 model checkpoint
https://t.co/iHDDHRyKN2
— Prime Intellect (@PrimeIntellect) May 12, 2025

SWE-1 - Windsurf - 05/15/25
Announcement: SWE-1: Our First Frontier Models

The Open Molecules 2025 (OMol25) Dataset, Evaluations, and Models
Announcement: Sharing new breakthroughs and artifacts supporting molecular property prediction, language processing, and neuroscience
Hugging Face model card: facebook/OMol25, facebook/UMA
Hugging Face collections: FAIR Chemistry
Paper: UMA: A Family of Universal Models for Atoms; The Open Molecules 2025 (OMol25) Dataset, Evaluations, and Models

Announcing the newest releases from Meta FAIR. We’re releasing new groundbreaking models, benchmarks, and datasets that will transform the way researchers approach molecular property prediction, language processing, and neuroscience. 1️⃣ Open Molecules 2025 (OMol25): A dataset… pic.twitter.com/PAmnNgTVnB

...

🔋 Pilas: y25-w19

Models/Systems

D-FINE: realtime object detector - 05/05/25
Organization: University of Science and Technology of China
Paper: D-FINE: REDEFINE REGRESSION TASK IN DETRS AS FINE-GRAINED DISTRIBUTION REFINEMENT
Hugging Face space

A real-time object detector much faster and accurate than YOLO with Apache 2.0 license just landed to @huggingface transformers 🔥 D-FINE is the sota real-time object detector that runs on T4 (free Colab) 🤩 Keep reading for the paper explainer, notebooks & demo 👀 pic.twitter.com/GNj2MMa8sK
— merve (@mervenoyann) May 5, 2025

Kevin-32B: Multi-Turn RL for Writing CUDA Kernels - 05/06/25
Organization: Stanford University, Cognition AI
Announcement

Figure 1: Kevin-32B correctness and performance results

Gemini 2.5 Pro (I/O Edition) - 05/06/25
Organization: Google
Announcement

Very excited to share the best coding model we’ve ever built! Today we’re launching Gemini 2.5 Pro Preview 'I/O edition' with massively improved coding capabilities. Ranks no.1 on LMArena in Coding and no.1 on the WebDev Arena Leaderboard. It’s especially good at building… pic.twitter.com/9vRaP6RTTo

...

🔋 Pilas: y25-w18

Models/Systems

Qwen3 - 04/29/25
Released by: Model - Alibaba unveils Qwen3, a family of ‘hybrid’ AI reasoning models
⭐️ https://qwenlm.github.io/blog/qwen3/

Introducing Qwen3! We release and open-weight Qwen3, our latest large language models, including 2 MoE models and 6 dense models, ranging from 0.6B to 235B. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general… pic.twitter.com/JWZkJeHWhC
— Qwen (@Alibaba_Qwen) April 28, 2025

Byte Latent Transformer (blt) - Meta - 04/30/25
Hugging Face model card: facebook/blt
Paper: Byte Latent Transformer: Patches Scale Better Than Tokens
Code: facebookresearch/blt

Phi-4 - Microsoft - 04/30/25
Announcement: One year of Phi: Small language models making big leaps in AI
Article: Microsoft’s most capable new Phi 4 AI model rivals the performance of far larger systems
Paper: Phi-4-reasoning Technical Report

Mellum - JetBrains - 04/30/25
Announcement: Mellum Goes Open Source: A Purpose-Built LLM for Developers, Now on Hugging Face
Hugging Face model card: JetBrains/Mellum-4b-base

OLMo 2 - AllenAI - 05/01/25
Project page: OLMo 2
Hugging Face Collection: OLMo 2
Paper: 2 OLMo 2 Furious

Llama-Nemotron: Efficient Reasoning Models - 05/02/25
Paper
NVIDIA Llama Nemotron Ultra Open Model Delivers Groundbreaking Reasoning Accuracy
Hugging Face space

F Lite - freepik - 04/29/25

Agents

AMIE gains vision: A research AI agent for multimodal diagnostic dialogue

Papers

OLMOTRACE: Tracing Language Model Outputs Back to Trillions of Training Tokens
The Leaderboard Illusion - 04/29/2
Phi-4-reasoning Technical Report - 04/30/25
Byte Latent Transformer: Patches Scale Better Than Tokens
All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning - 05/03/25
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
Talk Before You Retrieve: Agent-Led Discussions for Better RAG in Medical QA
Practical Efficiency of Muon for Pretraining - 05/04/25

Articles

Why We Think by Lilian Weng

Lectures

Yann LeCun: Models of SSL - 04/29/25