Paper Title
Authors
High-Dimensional Calibration from Swap Regret
Maxwell Fishelson, Noah Golowich, Mehryar Mohri, Jon Schneider
 
High-dimensional neuronal activity from low-dimensional latent dynamics: a solvable model
Valentin Schmutz, Ali Haydaroğlu, Shuqi Wang, Yixiao Feng, Matteo Carandini, Kenneth D. Harris
 
Memory Mosaics at scale
Jianyu Zhang, Leon Bottou
 
SAGE-Eval: Evaluating LLMs for Systematic Generalizations of Safety Facts
Chen Yueh-Han, Guy Davidson, Brenden Lake
 
SciArena: An Open Evaluation Platform for Non-Verifiable Scientific Literature-Grounded Tasks
Yilun Zhao, Kaiyan Zhang, Tiansheng Hu, Sihong Wu, Ronan Le Bras, Yixin Liu, Xiangru Tang, Joseph Chee Chang, Jesse Dodge, Jonathan Bragg, Chen Zhao, Hannaneh Hajishirzi, Doug Downey, Arman Cohan
 
Abstain Mask Retain Core: Time Series Prediction by Adaptive Masking Loss with Representation Consistency
Renzhao Liang, Sizhe Xu, Chenggang Xie, Jingru Chen, Feiyang Ren, Shu Yang, Takahiro Yabe
 
Axial Neural Networks for Dimension-Free Foundation Models
Hyunsu Kim, Jonggeon Park, Joan Bruna, Hongseok Yang, Juho Lee
 
Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training
William Merrill, Shane Arora, Dirk Groeneveld, Hannaneh Hajishirzi
 
Depth-Width Tradeoffs for Transformers on Graph Tasks
Gilad Yehudai, Clayton Sanford, Maya Bechler-Speicher, Orr Fischer, Ran Gilad-Bachrach, Amir Globerson
 
Error Forcing in Recurrent Neural Networks
A Erdem Sağtekin, Colin Bredenberg, Cristina Savin
 
Estimating cognitive biases with attention-aware inverse planning
Sounak Banerjee, Daphne Cornelisse, Deepak Edakkattil Gopinath, Emily Sumner, Jonathan DeCastro, Guy Rosman, Eugene Vinitsky, Mark K Ho
 
From Shortcut to Induction Head: How Data Diversity Shapes Algorithm Selection in Transformers
Ryotaro Kawata, Yujin Song, Alberto Bietti, Naoki Nishikawa, Taiji Suzuki, Samuel Vaiter, Denny Wu
 
Mesh-RFT: Enhancing Mesh Generation via Fine-grained Reinforcement Fine-Tuning
Jian Liu, Jing Xu, Song Guo, Jing Li, Guojingfeng, Jiaao Yu, Haohan Weng, Biwen Lei, Xianghui Yang, Zhuo Chen, Fangqi Zhu, Tao Han, Chunchao Guo
 
Multitask Learning with Stochastic Interpolants
Hugo Negrel, Florentin Coeurdoux, Michael Samuel Albergo, Eric Vanden-Eijnden
 
Precise Asymptotics and Refined Regret of Variance-Aware UCB
Yingying Fan, Yuxuan Han, Jinchi Lv, Xiaocong Xu, Zhengyuan Zhou
 
QSVD: Efficient Low-rank Approximation for Unified Query-Key-Value Weight Compression in Low-Precision Vision-Language Models
Yutong Wang, Haiyu Wang, Sai Qian Zhang
 
Scaling can lead to compositional generalization
Florian Redhardt, Yassir Akram, Simon Schug
 
Signal and Noise: A Framework for Reducing Uncertainty in Language Model Evaluation
David Heineman, Valentin Hofmann, Ian Magnusson, Yuling Gu, Noah A. Smith, Hannaneh Hajishirzi, Kyle Lo, Jesse Dodge
 
Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Bootstrapping
Pu Yang, Yunzhen Feng, Ziyuan Chen, Yuhang Wu, Zhuoyuan Li
 
The Generative Leap: Tight Sample Complexity for Efficiently Learning Gaussian Multi-Index Models
Alex Damian, Jason D. Lee, Joan Bruna
 
Towards Physics-informed Spatial Intelligence with Human Priors: An Autonomous Driving Pilot Study
Guanlin Wu, Boyan Su, Yang Zhao, Pu Wang, Yichen Lin, Hao Frank Yang
 
Transfer Faster, Price Smarter: Minimax Dynamic Pricing under Cross-Market Preference Shift
Yi Zhang, Elynn Chen, Yujun Yan
 
UniTok: a Unified Tokenizer for Visual Generation and Understanding
Chuofan Ma, Yi Jiang, Junfeng Wu, Jihan Yang, Xin Yu, Zehuan Yuan, Bingyue Peng, Xiaojuan Qi
 
When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs
Xiaomin Li, Zhou Yu, Zhiwei Zhang, Xupeng Chen, Ziji Zhang, Yingying Zhuang, Narayanan Sadagopan, Anurag Beniwal
 
All that structure matches does not glitter
Maya Martirossyan, Thomas Egg, Philipp Höllmer, George Karypis, Mark Transtrum, Adrian Roitberg, Mingjie Liu, Richard Hennig, Ellad B. Tadmor, Stefano Martiniani
 
ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models
Liyan Tang, Grace Kim, Xinyu Zhao, Thom Lake, Wenxuan Ding, Fangcong Yin, Prasann Singhal, Manya Wadhwa, Zeyu Leo Liu, Zayne Rea Sprague, Ramya Namuduri, Bodun Hu, Juan Diego Rodriguez, Puyuan Peng, Greg Durrett
 
DGCBench: A Deep Graph Clustering Benchmark
Benyu Wu, Yue Liu, Qiaoyu Tan, Xinwang Liu, Wei Du, Jun Wang, Guoxian Yu
 
Do You Really Need Public Data? Surrogate Public Data for Differential Privacy on Tabular Data
Shlomi Hod, Lucas Rosenblatt, Julia Stoyanovich
 
Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia
Chandler Smith, Marwa Abdulhai, Manfred Diaz, Marko Tesic, Rakshit Trivedi, Sasha Vezhnevets, Lewis Hammond, Jesse Clifton, Minsuk Chang, Edgar A. Duéñez-Guzmán, John P Agapiou, Jayd Matyas, Danny Karmon, Beining Zhang, Jim Dilkes, Akash Kundu, Jord Nguyen, Emanuel Tewolde, Jebish Purbey, Ram Mohan Rao Kadiyala, Siddhant Gupta, Aliaksei Korshuk, Buyantuev Alexander, Ilya Makarov, Gang Zhao, Rolando Fernandez, Zhihan Wang, Caroline Wang, Jiaxun Cui, Lingyun Xiao, Di Yang Shi, Yoonchang Sung, Arrasy Rahman, Peter Stone, Yipeng Kang, Hyeonggeun Yun, Ananya Ananya, Taehun Cha, Zhiqiang Wu, Elizaveta Tennant, Olivia Macmillan-Scott, Marta Emili García Segura, Diana Riazi, Fuyang Cui, Sriram Ganapathi Subramanian, Toryn Q. Klassen, Nico Schiavone, Mogtaba Alim, Sheila A. McIlraith, Manuel Sebastian Rios Beltran, Oswaldo Peña, Carlos Saith Rodriguez Rojas, Manuela Chacon-Chamorro, Ruben Manrique, Luis Felipe Giraldo, Nicanor Quijano, Yiding Wang, Yuxuan Chen, Fangwei Zhong, Mengmeng Wang, Wenming Tu, Zhaowei Zhang, Ziang Chen, Zixia Jia, Xue Feng, Zilong Zheng, Chichen Lin, Weijian Fan, Chenao Liu, Sneheel Sarangi, Ziyan Wang, Shuqing Shi, Yali Du, Avinaash Anand Kulandaivel, Yang Liu, Wu Ruiyang, Chetan Talele, 陆孙嘉, Gema Parreño Piqueras, Shamika Dhuri, Bain McHale, Tim Baarslag, Dylan Hadfield-Menell, Natasha Jaques, Jose Hernandez-Orallo, Joel Z Leibo
 
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
Zihan Zheng, Zerui Cheng, Zeyu Shen, Shang Zhou, Kaiyuan Liu, Hansen He, Dongruixuan Li, Stanley Wei, Hangyi Hao, Jianzhu Yao, Peiyao Sheng, Zixuan Wang, Wenhao Chai, Aleksandra Korolova, Peter Henderson, Sanjeev Arora, Pramod Viswanath, Jingbo Shang, Saining Xie
 
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions
Weizhe Yuan, Jane Yu, Song Jiang, Karthik Padthe, Yang Li, Dong Wang, Ilia Kulikov, Kyunghyun Cho, Yuandong Tian, Jason E Weston, Xian Li
 
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
Ling Fu, Zhebin Kuang, Jiajun Song, Mingxin Huang, Biao Yang, Yuzhe Li, Linghao Zhu, Qidi Luo, Xinyu Wang, Hao Lu, Zhang Li, Guozhi Tang, Bin Shan, Chunhui Lin, Qi Liu, Binghong Wu, Hao Feng, Hao Liu, Can Huang, Jingqun Tang, Wei Chen, Lianwen Jin, Yuliang Liu, Xiang Bai
 
RBench-V: A Primary Assessment for Visual Reasoning Models with Multimodal Outputs
Meng-Hao Guo, Xuanyu Chu, Qianrui Yang, Zhe-Han Mo, Yiqing Shen, Pei-lin Li, Xinjie Lin, Jinnian Zhang, Xin-Sheng Chen, Yi Zhang, Kiyohiro Nakayama, Zhengyang Geng, Houwen Peng, Han Hu, Shi-min Hu
 
VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification
Patrick Yubeaton, Andre Nakkab, Weihua Xiao, Luca Collini, Ramesh Karri, Chinmay Hegde, Siddharth Garg
 
Embracing Contradiction: Theoretical Inconsistency Will Not Impede the Road of Building Responsible AI Systems
Gordon Dai, Yunze Xiao
 
LLM Generated Persona is a Promise with a Catch
Ang Li, Haozhe Chen, Hongseok Namkoong, Tianyi Peng
 
A Data-Driven Prism: Multi-View Source Separation with Diffusion Model Priors
Sebastian Wagner-Carena, Aizhan Akhmetzhanova, Sydney Erickson
 
A Latent Multilayer Graphical Model For Complex, Interdependent Systems
Martin Ondrus, Ivor Cribben, Yang Feng
 
A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers
William Merrill, Ashish Sabharwal
 
A Unified Framework for Provably Efficient Algorithms to Estimate Shapley Values
Tyler Chen, Akshay Seshadri, Mattia Jacopo Villani, Pradeep Niroula, Shouvanik Chakrabarti, Archan Ray, Pranav Deshpande, Romina Yalovetzky, Marco Pistoia, Niraj Kumar
 
AION-1: Omnimodal Foundation Model for Astronomical Sciences
Liam Holden Parker, Francois Lanusse, Jeff Shen, Ollie Liu, Tom Hehir, Leopoldo Sarra, Lucas Thibaut Meyer, Micah Bowles, Sebastian Wagner-Carena, Helen Qu, Siavash Golkar, Alberto Bietti, Hatim Bourfoune, Pierre Cornette, Keiya Hirashima, Geraud Krawezik, Ruben Ohana, Nicholas Lourie, Michael McCabe, Rudy Morel, Payel Mukhopadhyay, Mariel Pettee, Kyunghyun Cho, Miles Cranmer, Shirley Ho
 
Adaptive Time Encoding for Irregular Multivariate Time-Series Classification
Sangho Lee, Kyeongseo Min, Youngdoo Son, Hyungrok Do
 
AgentAuditor: Human-level Safety and Security Evaluation for LLM Agents
Hanjun Luo, Shenyu Dai, Chiming Ni, Xinfeng Li, Guibin Zhang, Kun Wang, Tongliang Liu, Hanan Salam
 
Any Large Language Model Can Be a Reliable Judge: Debiasing with a Reasoning-based Bias Detector
Haoyan Yang, Runxue Bao, Cao Xiao, Jun Ma, Parminder Bhatia, Shangqian Gao, Taha Kass-Hout
 
Architectural and Inferential Inductive Biases for Exchangeable Sequence Modeling
Daksh Mittal, Ang Li, Thomson Yen, C. Daniel Guetta, Hongseok Namkoong
 
Auto-Connect: Connectivity-Preserving RigFormer with Direct Preference Optimization
Guojingfeng, Jian Liu, Jinnan Chen, Shiwei Mao, Changrong Hu, Puhua Jiang, Junlin Yu, Jing Xu, Qi Liu, LiXin Xu, Zhuo Chen, Chunchao Guo
 
BlockScan: Detecting Anomalies in Blockchain Transactions
Jiahao Yu, Xian Wu, Hao Liu, Wenbo Guo, Xinyu Xing
 
CSGO: Content-Style Composition in Text-to-Image Generation
Peng Xing, Haofan Wang, Yanpeng Sun, wangqixun, Baixu, Hao Ai, Jen-Yuan Huang, Zechao Li
 
ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference
Xiang Liu, Zhenheng Tang, Peijie Dong, Zeyu Li, Liuyue, Bo Li, Xuming Hu, Xiaowen Chu
 
CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring
Benjamin Arnav, Pablo Bernabeu-Perez, Nathan Helm-Burger, Timothy Kostolansky, Hannes Whittingham, Mary Phuong
 
ComPO: Preference Alignment via Comparison Oracles
Peter Chen, Xi Chen, Wotao Yin, Tianyi Lin
 
Compositional Reasoning with Transformers, RNNs, and Chain of Thought
Gilad Yehudai, Noah Amsel, Joan Bruna
 
Contrastive Self-Supervised Learning As Neural Manifold Packing
Guanming Zhang, David Heeger, Stefano Martiniani
 
DISC: Dynamic Decomposition Improves LLM Inference Scaling
Jonathan Light, Wei Cheng, Benjamin Riviere, Yue Wu, Masafumi Oyamada, Mengdi Wang, Yisong Yue, Santiago Paternain, Haifeng Chen
 
Dense Backpropagation Improves Training for Sparse Mixture-of-Experts
Ashwinee Panda, Vatsal Baherwani, Zain Sarwar, Benjamin Thérien, Sambit Sahu, Tom Goldstein, Supriyo Chakraborty
 
Differentiable extensions with rounding guarantees for combinatorial optimization over permutations
Robert R Nerem, Zhishang Luo, Akbar Rafiey, Yusu Wang
 
Distance Adaptive Beam Search for Provably Accurate Graph-Based Nearest Neighbor Search
Yousef Al-Jazzazi, Haya Diwan, Jinrui Gou, Cameron N Musco, Christopher Musco, Torsten Suel
 
Do different prompting methods yield a common task representation in language models?
Guy Davidson, Todd M. Gureckis, Brenden Lake, Adina Williams
 
Dynamic Test-Time Compute Scaling in Control Policy: Difficulty-Aware Stochastic Interpolant Policy
Inkook Chun, Seungjae Lee, Michael Samuel Albergo, Saining Xie, Eric Vanden-Eijnden
 
ESCA: Enabling Seamless Codec Avatar Execution through Algorithm and Hardware Co-Optimization for Virtual Reality
Mingzhi Zhu, Ding Shang, Sai Qian Zhang
 
Efficient semantic uncertainty quantification in language models via diversity-steered sampling
Ji Won Park, Kyunghyun Cho
 
Emergence and scaling laws in SGD learning of shallow neural networks
Yunwei Ren, Eshaan Nichani, Denny Wu, Jason D. Lee
 
Emergence of Linear Truth Encodings in Language Models
Shauli Ravfogel, Gilad Yehudai, Tal Linzen, Joan Bruna, Alberto Bietti
 
Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits
Yuzhou Gu, Yanjun Han, Jian Qian
 
Exact Expressive Power of Transformers with Padding
William Merrill, Ashish Sabharwal
 
FEAT: Free energy Estimators with Adaptive Transport
Yuanqi Du, Jiajun He, Francisco Vargas, Yuanqing Wang, Carla P Gomes, José Miguel Hernández-Lobato, Eric Vanden-Eijnden
 
Feature-Based Instance Neighbor Discovery: Advanced Stable Test-Time Adaptation in Dynamic World
Qinting Jiang, Chuyang Ye, Dongyan Wei, Bingli Wang, Yuan Xue, Jingyan Jiang, Zhi Wang
 
FlowMixer: A Depth-Agnostic Neural Architecture for Interpretable Spatiotemporal Forecasting
Fares B. Mehouachi, Saif Eddin Jabari
 
Geometric Algorithms for Neural Combinatorial Optimization with Constraints
Nikolaos Karalias, Akbar Rafiey, Yifei Xu, Zhishang Luo, Behrooz Tahmasebi, Connie Jiang, Stefanie Jegelka
 
HOComp: Interaction-Aware Human-Object Composition
Dong Liang, Jinyuan Jia, Yuhao Liu, Rynson W. H. Lau
 
Hankel Singular Value Regularization for Highly Compressible State Space Models
Paul Schwerdtner, Jules Berman, Benjamin Peherstorfer
 
How Does Label Noise Gradient Descent Improve Generalization in the Low SNR Regime?
Wei Huang, Andi Han, Yujin Song, Yilan Chen, Denny Wu, Difan Zou, Taiji Suzuki
 
How to Auto-optimize Prompts for Domain Tasks? Adaptive Prompting and Reasoning through Evolutionary Domain Knowledge Adaptation
Yang Zhao, Pu Wang, Hao Frank Yang
 
How to Scale Second-Order Optimization
Zixi Chen, Shikai Qiu, Hoang Phan, Qi Lei, Andrew Gordon Wilson
 
How to build a consistency model: Learning flow maps via self-distillation
Nicholas Matthew Boffi, Michael Samuel Albergo, Eric Vanden-Eijnden
 
Implicit Generative Property Enhancer
Pedro O. Pinheiro, Pan Kessel, Aya Abdelsalam Ismail, Sai Pooja Mahajan, Kyunghyun Cho, Saeed Saremi, Natasa Tagasovska
 
Improved Balanced Classification with Theoretically Grounded Loss Functions
Corinna Cortes, Mehryar Mohri, Yutao Zhong
 
Improved Confidence Regions and Optimal Algorithms for Online and Offline Linear MNL Bandits
Yuxuan Han, Jose Blanchet, Zhengyuan Zhou
 
Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay
Yifan Sun, Jingyan Shen, Yibin Wang, Tianyu Chen, Zhendong Wang, Mingyuan Zhou, Huan Zhang
 
Language Models Are Capable of Metacognitive Monitoring and Control of Their Internal Activations
Li Ji-An, Hua-Dong Xiong, Robert Wilson, Marcelo G Mattar, Marcus K. Benna
 
Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models
Vlad Sobal, Wancong Zhang, Kyunghyun Cho, Randall Balestriero, Tim G. J. Rudner, Yann LeCun
 
Learning normalized image densities via dual score matching
Florentin Guth, Zahra Kadkhodaie, Eero P Simoncelli
 
Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws
Gerard Ben Arous, Murat A Erdogdu, Nuri Mert Vural, Denny Wu
 
Length Generalization via Auxiliary Tasks
Pranjal Awasthi, Anupam Gupta, Ravi Kumar
 
Making Classic GNNs Strong Baselines Across Varying Homophily: A Smoothness–Generalization Perspective
Ming Gu, Zhuonan Zheng, Sheng Zhou, Meihan Liu, Jiawei Chen, Qiaoyu Tan, Liangcheng Li, Jiajun Bu
 
MaterialRefGS: Reflective Gaussian Splatting with Multi-view Consistent Material Inference
Wenyuan Zhang, Jimin Tang, Weiqi Zhang, Yi Fang, Yu-Shen Liu, Zhizhong Han
 
MetaFind: Scene-Aware 3D Asset Retrieval for Coherent Metaverse Scene Generation
Zhenyu Pan, Yucheng Lu, Han Liu
 
Modeling Neural Activity with Conditionally Linear Dynamical Systems
Victor Geadah, Amin Nejatbakhsh, David Lipshutz, Jonathan W. Pillow, Alex H Williams
 
NeedleInATable: Exploring Long-Context Capability of Large Language Models towards Long-Structured Tables
Lanrui Wang, Mingyu Zheng, Hongyin Tang, Zheng Lin, Yanan Cao, Jingang Wang, Xunliang Cai, Weiping Wang
 
Neurons as Detectors of Coherent Sets in Sensory Dynamics
Joshua L. Pughe-Sanford, Xuehao Ding, Jason J Moore, Anirvan M. Sengupta, Charles Epstein, Philip Greengard, Dmitri Chklovskii
 
OSVI-WM: One-Shot Visual Imitation for Unseen Tasks using World-Model-Guided Trajectory Generation
Raktim Gautam Goswami, Prashanth Krishnamurthy, Yann LeCun, Farshad Khorrami
 
Optimal Estimation of the Best Mean in Multi-Armed Bandits
Takayuki Osogami, Junya Honda, Junpei Komiyama
 
Parsimonious Predictions for Strategyproof Scheduling
Richard Cole, Anupam Gupta, Pranav Jangir
 
Performative Risk Control: Calibrating Models for Reliable Deployment under Performativity
Victor Li, Baiting Chen, Yuzhen Mao, Qi Lei, Zhun Deng
 
PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement
Teng Hu, Zhentao Yu, Zhengguang Zhou, Jiangning Zhang, Yuan Zhou, Qinglin Lu, Ran Yi
 
Predicting Empirical AI Research Outcomes with Language Models
Jiaxin Wen, Chenglei Si, Chen Yueh-Han, He He, Shi Feng
 
Predicting Functional Brain Connectivity with Context-Aware Deep Neural Networks
Alexander Ratzan, Sidharth Goel, Junhao Wen, Christos Davatzikos, Erdem Varol
 
Predicting partially observable dynamical systems via diffusion models with a multiscale inference scheme
Rudy Morel, Francesco Pio Ramunno, Jeff Shen, Alberto Bietti, Kyunghyun Cho, Miles Cranmer, Siavash Golkar, Olexandr Gugnin, Geraud Krawezik, Tanya Marwah, Michael McCabe, Lucas Thibaut Meyer, Payel Mukhopadhyay, Ruben Ohana, Liam Holden Parker, Helen Qu, François Rozet, K.D. Leka, Francois Lanusse, David Fouhey, Shirley Ho
 
Preserving Task-Relevant Information Under Linear Concept Removal
Floris Holstege, Shauli Ravfogel, Bram Wouters
 
Principled Model Routing for Unknown Mixtures of Source Domains
Christoph Dann, Yishay Mansour, Teodor Vanislavov Marinov, Mehryar Mohri
 
Procurement Auctions with Predictions: Improved Frugality for Facility Location
Eric Balkanski, Nicholas DeFilippis, Vasilis Gkatzelis, Xizhi Tan
 
Regression-adjusted Monte Carlo Estimators for Shapley Values and Probabilistic Values
R. Teal Witter, Yurong Liu, Christopher Musco
 
Robust Contextual Pricing
Anupam Gupta, Guru Guruganesh, Renato Paes Leme, Jon Schneider
 
Scalable inference of functional neural connectivity at submillisecond timescales
Arina Medvedeva, Edoardo Balzani, Alex H Williams, Stephen L Keeley
 
Self supervised learning for in vivo localization of microelectrode arrays using raw local field potential
Tianxiao He, Malhar Patel, Chenyi Li, Anna Maslarova, Mihály Vöröslakos, Nalini Ramanathan, Wei-Lun Hung, Gyorgy Buzsaki, Erdem Varol
 
Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation is Wasteful
Martin Marek, Sanae Lotfi, Aditya Somasundaram, Andrew Gordon Wilson, Micah Goldblum
 
Soft-consensual Federated Learning for Data Heterogeneity via Multiple Paths
Sheng Huang, Lele Fu, Fanghua Ye, Tianchi Liao, Bowen Deng, zhangchuanfu, Chuan Chen
 
Solving Continuous Mean Field Games: Deep Reinforcement Learning for Non-Stationary Dynamics
Lorenzo Magnino, Kai Shao, Zida Wu, Jiacheng Shen, Mathieu Lauriere
 
Sparta Alignment: Collectively Aligning Multiple Language Models through Combat
Yuru Jiang, Wenxuan Ding, Shangbin Feng, Greg Durrett, Yulia Tsvetkov
 
Spectral Analysis of Representational Similarity with Limited Neurons
Hyunmo Kang, Abdulkadir Canatar, SueYeon Chung
 
Split Gibbs Discrete Diffusion Posterior Sampling
Wenda Chu, Zihui Wu, Yifan Chen, Yang Song, Yisong Yue
 
SplitFlow: Flow Decomposition for Inversion-Free Text-to-Image Editing
Sung-Hoon Yoon, Minghan Li, Gaspard Beaudouin, Congcong Wen, Muhammad Rafay Azhar, Mengyu Wang
 
Test Time Scaling for Neural Processes
Hyungi Lee, Moonseok Choi, Hyunsu Kim, Kyunghyun Cho, Rajesh Ranganath, Juho Lee
 
The Lighthouse of Language: Enhancing LLM Agents via Critique-Guided Improvement
Ruihan Yang, Fanghua Ye, Jian Li, Siyu Yuan, Yikai Zhang, Zhaopeng Tu, Xiaolong Li, Deqing Yang
 
The Rise of Parameter Specialization for Knowledge Storage in Large Language Models
Yihuai Hong, Yiran Zhao, Wei Tang, Yang Deng, Yu Rong, Wenxuan Zhang
 
ThermalGen: Style-Disentangled Flow-Based Generative Models for RGB-to-Thermal Image Translation
Jiuhong Xiao, Roshan Nayak, Ning Zhang, Daniel Toertei, Giuseppe Loianno
 
Thinking vs. Doing: Improving Agent Reasoning by Scaling Test-Time Interaction
Junhong Shen, Hao Bai, Lunjun Zhang, Yifei Zhou, Amrith Setlur, Shengbang Tong, Diego Caples, Nan Jiang, Tong Zhang, Ameet Talwalkar, Aviral Kumar
 
Tight Lower Bounds and Improved Convergence in Performative Prediction
Pedram Khorsandi, Rushil Gupta, Mehrnaz Mofakhami, Simon Lacoste-Julien, Gauthier Gidel
 
U-CAN: Unsupervised Point Cloud Denoising with Consistency-Aware Noise2Noise Matching
Junsheng Zhou, XingYu Shi, Haichuan Song, Yi Fang, Yu-Shen Liu, Zhizhong Han
 
Understanding outer learning rates in Local SGD
Ahmed Khaled, Satyen Kale, Arthur Douillard, Chi Jin, Rob Fergus, Manzil Zaheer
 
Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding
Zaiquan Yang, Yuhao Liu, Gerhard Petrus Hancke, Rynson W. H. Lau
 
VeriLoC: Line-of-Code Level Prediction of Hardware Design Quality from Verilog Code
Raghu Vamshi Hemadri, Jitendra Bhandari, Andre Nakkab, Johann Knechtel, Badri P Gopalan, Ramesh Narayanaswamy, Ramesh Karri, Siddharth Garg
 
When Are Concepts Erased From Diffusion Models?
Kevin Lu, Nicky Kriplani, Rohit Gandikota, Minh Pham, David Bau, Chinmay Hegde, Niv Cohen
 
When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective
Alireza Mousavi-Hosseini, Clayton Sanford, Denny Wu, Murat A Erdogdu
 
Whole-Body Conditioned Egocentric Video Prediction
Yutong Bai, Danny Tran, Amir Bar, Yann LeCun, Trevor Darrell, Jitendra Malik
 
Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation
Sungmin Cha, Kyunghyun Cho
 
Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion
Alan Nawzad Amin, Nate Gruver, Andrew Gordon Wilson