| Paper Title | Authors | |
|---|---|---|
| High-Dimensional Calibration from Swap Regret | Maxwell Fishelson, Noah Golowich, Mehryar Mohri, Jon Schneider | |
| High-dimensional neuronal activity from low-dimensional latent dynamics: a solvable model | Valentin Schmutz, Ali Haydaroğlu, Shuqi Wang, Yixiao Feng, Matteo Carandini, Kenneth D. Harris | |
| Memory Mosaics at scale | Jianyu Zhang, Leon Bottou | |
| SAGE-Eval: Evaluating LLMs for Systematic Generalizations of Safety Facts | Chen Yueh-Han, Guy Davidson, Brenden Lake | |
| SciArena: An Open Evaluation Platform for Non-Verifiable Scientific Literature-Grounded Tasks | Yilun Zhao, Kaiyan Zhang, Tiansheng Hu, Sihong Wu, Ronan Le Bras, Yixin Liu, Xiangru Tang, Joseph Chee Chang, Jesse Dodge, Jonathan Bragg, Chen Zhao, Hannaneh Hajishirzi, Doug Downey, Arman Cohan | |
| Abstain Mask Retain Core: Time Series Prediction by Adaptive Masking Loss with Representation Consistency | Renzhao Liang, Sizhe Xu, Chenggang Xie, Jingru Chen, Feiyang Ren, Shu Yang, Takahiro Yabe | |
| Axial Neural Networks for Dimension-Free Foundation Models | Hyunsu Kim, Jonggeon Park, Joan Bruna, Hongseok Yang, Juho Lee | |
| Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training | William Merrill, Shane Arora, Dirk Groeneveld, Hannaneh Hajishirzi | |
| Depth-Width Tradeoffs for Transformers on Graph Tasks | Gilad Yehudai, Clayton Sanford, Maya Bechler-Speicher, Orr Fischer, Ran Gilad-Bachrach, Amir Globerson | |
| Error Forcing in Recurrent Neural Networks | A Erdem Sağtekin, Colin Bredenberg, Cristina Savin | |
| Estimating cognitive biases with attention-aware inverse planning | Sounak Banerjee, Daphne Cornelisse, Deepak Edakkattil Gopinath, Emily Sumner, Jonathan DeCastro, Guy Rosman, Eugene Vinitsky, Mark K Ho | |
| From Shortcut to Induction Head: How Data Diversity Shapes Algorithm Selection in Transformers | Ryotaro Kawata, Yujin Song, Alberto Bietti, Naoki Nishikawa, Taiji Suzuki, Samuel Vaiter, Denny Wu | |
| Mesh-RFT: Enhancing Mesh Generation via Fine-grained Reinforcement Fine-Tuning | Jian Liu, Jing Xu, Song Guo, Jing Li, Guojingfeng, Jiaao Yu, Haohan Weng, Biwen Lei, Xianghui Yang, Zhuo Chen, Fangqi Zhu, Tao Han, Chunchao Guo | |
| Multitask Learning with Stochastic Interpolants | Hugo Negrel, Florentin Coeurdoux, Michael Samuel Albergo, Eric Vanden-Eijnden | |
| Precise Asymptotics and Refined Regret of Variance-Aware UCB | Yingying Fan, Yuxuan Han, Jinchi Lv, Xiaocong XU, Zhengyuan Zhou | |
| QSVD: Efficient Low-rank Approximation for Unified Query-Key-Value Weight Compression in Low-Precision Vision-Language Models | Yutong Wang, Haiyu Wang, Sai Qian Zhang | |
| Scaling can lead to compositional generalization | Florian Redhardt, Yassir Akram, Simon Schug | |
| Signal and Noise: A Framework for Reducing Uncertainty in Language Model Evaluation | David Heineman, Valentin Hofmann, Ian Magnusson, Yuling Gu, Noah A. Smith, Hannaneh Hajishirzi, Kyle Lo, Jesse Dodge | |
| Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Bootstrapping | Pu Yang, Yunzhen Feng, Ziyuan Chen, Yuhang Wu, Zhuoyuan Li | |
| The Generative Leap: Tight Sample Complexity for Efficiently Learning Gaussian Multi-Index Models | Alex Damian, Jason D. Lee, Joan Bruna | |
| Towards Physics-informed Spatial Intelligence with Human Priors: An Autonomous Driving Pilot Study | Guanlin Wu, Boyan Su, Yang Zhao, Pu Wang, Yichen Lin, Hao Frank Yang | |
| Transfer Faster, Price Smarter: Minimax Dynamic Pricing under Cross-Market Preference Shift | Yi Zhang, Elynn Chen, Yujun Yan | |
| UniTok: a Unified Tokenizer for Visual Generation and Understanding | Chuofan Ma, Yi Jiang, Junfeng Wu, Jihan Yang, Xin Yu, Zehuan Yuan, BINGYUE PENG, XIAOJUAN QI | |
| When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs | Xiaomin Li, Zhou Yu, Zhiwei Zhang, Xupeng Chen, Ziji Zhang, Yingying Zhuang, Narayanan Sadagopan, Anurag Beniwal | |
| All that structure matches does not glitter | Maya Martirossyan, Thomas Egg, Philipp Höllmer, George Karypis, Mark Transtrum, Adrian Roitberg, Mingjie Liu, Richard Hennig, Ellad B. Tadmor, Stefano Martiniani | |
| ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models | Liyan Tang, Grace Kim, Xinyu Zhao, Thom Lake, Wenxuan Ding, Fangcong Yin, Prasann Singhal, Manya Wadhwa, Zeyu Leo Liu, Zayne Rea Sprague, Ramya Namuduri, Bodun Hu, Juan Diego Rodriguez, Puyuan Peng, Greg Durrett | |
| DGCBench: A Deep Graph Clustering Benchmark | Benyu Wu, Yue Liu, Qiaoyu Tan, Xinwang Liu, Wei Du, Jun Wang, Guoxian Yu | |
| Do You Really Need Public Data? Surrogate Public Data for Differential Privacy on Tabular Data | Shlomi Hod, Lucas Rosenblatt, Julia Stoyanovich | |
| Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia | Chandler Smith, Marwa Abdulhai, Manfred Diaz, Marko Tesic, Rakshit Trivedi, Sasha Vezhnevets, Lewis Hammond, Jesse Clifton, Minsuk Chang, Edgar A. Duéñez-Guzmán, John P Agapiou, Jayd Matyas, Danny Karmon, Beining Zhang, Jim Dilkes, Akash Kundu, Jord Nguyen, Emanuel Tewolde, Jebish Purbey, Ram Mohan Rao Kadiyala, Siddhant Gupta, Aliaksei Korshuk, Buyantuev Alexander, Ilya Makarov, Gang Zhao, Rolando Fernandez, Zhihan Wang, Caroline Wang, Jiaxun Cui, Lingyun Xiao, Di Yang Shi, Yoonchang Sung, Arrasy Rahman, Peter Stone, Yipeng Kang, Hyeonggeun Yun, Ananya Ananya, Taehun Cha, Zhiqiang Wu, Elizaveta Tennant, Olivia Macmillan-Scott, Marta Emili García Segura, Diana Riazi, Fuyang Cui, Sriram Ganapathi Subramanian, Toryn Q. Klassen, Nico Schiavone, Mogtaba Alim, Sheila A. McIlraith, Manuel Sebastian Rios Beltran, Oswaldo Peña, Carlos Saith Rodriguez Rojas, Manuela Chacon-Chamorro, Ruben Manrique, Luis Felipe Giraldo, Nicanor Quijano, Yiding Wang, Yuxuan Chen, Fangwei Zhong, Mengmeng Wang, Wenming Tu, Zhaowei Zhang, Ziang Chen, Zixia Jia, Xue Feng, Zilong Zheng, Chichen Lin, Weijian Fan, Chenao Liu, Sneheel Sarangi, Ziyan Wang, Shuqing Shi, Yali Du, Avinaash Anand Kulandaivel, Yang Liu, Wu Ruiyang, Chetan Talele, 陆孙嘉, Gema Parreño Piqueras, Shamika Dhuri, Bain McHale, Tim Baarslag, Dylan Hadfield-Menell, Natasha Jaques, Jose Hernandez-Orallo, Joel Z Leibo | |
| LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? | Zihan Zheng, Zerui Cheng, Zeyu Shen, Shang Zhou, Kaiyuan Liu, Hansen He, Dongruixuan Li, Stanley Wei, Hangyi Hao, Jianzhu Yao, Peiyao Sheng, Zixuan Wang, Wenhao Chai, Aleksandra Korolova, Peter Henderson, Sanjeev Arora, Pramod Viswanath, Jingbo Shang, Saining Xie | |
| NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions | Weizhe Yuan, Jane Yu, Song Jiang, Karthik Padthe, Yang Li, Dong Wang, Ilia Kulikov, Kyunghyun Cho, Yuandong Tian, Jason E Weston, Xian Li | |
| OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning | Ling Fu, Zhebin Kuang, Jiajun Song, Mingxin Huang, Biao Yang, Yuzhe Li, Linghao Zhu, Qidi Luo, Xinyu Wang, Hao Lu, Zhang Li, Guozhi Tang, Bin Shan, Chunhui Lin, Qi Liu, Binghong Wu, Hao Feng, Hao Liu, Can Huang, Jingqun Tang, Wei Chen, Lianwen Jin, Yuliang Liu, Xiang Bai | |
| RBench-V: A Primary Assessment for Visual Reasoning Models with Multimodal Outputs | Meng-Hao Guo, Xuanyu Chu, Qianrui Yang, Zhe-Han Mo, Yiqing Shen, Pei-lin Li, Xinjie Lin, Jinnian Zhang, Xin-Sheng Chen, Yi Zhang, Kiyohiro Nakayama, Zhengyang Geng, Houwen Peng, Han Hu, Shi-min Hu | |
| VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification | Patrick Yubeaton, Andre Nakkab, Weihua Xiao, Luca Collini, Ramesh Karri, Chinmay Hegde, Siddharth Garg | |
| Embracing Contradiction: Theoretical Inconsistency Will Not Impede the Road of Building Responsible AI Systems | Gordon Dai, Yunze Xiao | |
| LLM Generated Persona is a Promise with a Catch | Ang Li, Haozhe Chen, Hongseok Namkoong, Tianyi Peng | |
| A Data-Driven Prism: Multi-View Source Separation with Diffusion Model Priors | Sebastian Wagner-Carena, Aizhan Akhmetzhanova, Sydney Erickson | |
| A Latent Multilayer Graphical Model For Complex, Interdependent Systems | Martin Ondrus, Ivor Cribben, Yang Feng | |
| A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers | William Merrill, Ashish Sabharwal | |
| A Unified Framework for Provably Efficient Algorithms to Estimate Shapley Values | Tyler Chen, Akshay Seshadri, Mattia Jacopo Villani, Pradeep Niroula, Shouvanik Chakrabarti, Archan Ray, Pranav Deshpande, Romina Yalovetzky, Marco Pistoia, Niraj Kumar | |
| AION-1: Omnimodal Foundation Model for Astronomical Sciences | Liam Holden Parker, Francois Lanusse, Jeff Shen, Ollie Liu, Tom Hehir, Leopoldo Sarra, Lucas Thibaut Meyer, Micah Bowles, Sebastian Wagner-Carena, Helen Qu, Siavash Golkar, Alberto Bietti, Hatim Bourfoune, Pierre Cornette, Keiya Hirashima, Geraud Krawezik, Ruben Ohana, Nicholas Lourie, Michael McCabe, Rudy Morel, Payel Mukhopadhyay, Mariel Pettee, Kyunghyun Cho, Miles Cranmer, Shirley Ho | |
| Adaptive Time Encoding for Irregular Multivariate Time-Series Classification | Sangho Lee, Kyeongseo Min, Youngdoo Son, Hyungrok Do | |
| AgentAuditor: Human-level Safety and Security Evaluation for LLM Agents | Hanjun Luo, Shenyu Dai, Chiming Ni, Xinfeng Li, Guibin Zhang, Kun Wang, Tongliang Liu, Hanan Salam | |
| Any Large Language Model Can Be a Reliable Judge: Debiasing with a Reasoning-based Bias Detector | Haoyan Yang, Runxue Bao, Cao Xiao, Jun Ma, Parminder Bhatia, Shangqian Gao, Taha Kass-Hout | |
| Architectural and Inferential Inductive Biases for Exchangeable Sequence Modeling | Daksh Mittal, Ang Li, Thomson Yen, C. Daniel Guetta, Hongseok Namkoong | |
| Auto-Connect: Connectivity-Preserving RigFormer with Direct Preference Optimization | Guojingfeng, Jian Liu, Jinnan Chen, Shiwei Mao, Changrong Hu, Puhua Jiang, Junlin Yu, Jing Xu, Qi Liu, LiXin Xu, Zhuo Chen, Chunchao Guo | |
| BlockScan: Detecting Anomalies in Blockchain Transactions | Jiahao Yu, Xian Wu, Hao Liu, Wenbo Guo, Xinyu Xing | |
| CSGO: Content-Style Composition in Text-to-Image Generation | Peng Xing, Haofan Wang, Yanpeng Sun, wangqixun, Baixu, Hao Ai, Jen-Yuan Huang, Zechao Li | |
| ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference | Xiang Liu, Zhenheng Tang, Peijie Dong, Zeyu Li, Liuyue, Bo Li, Xuming Hu, Xiaowen Chu | |
| CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring | Benjamin Arnav, Pablo Bernabeu-Perez, Nathan Helm-Burger, Timothy Kostolansky, Hannes Whittingham, Mary Phuong | |
| ComPO: Preference Alignment via Comparison Oracles | Peter Chen, Xi Chen, Wotao Yin, Tianyi Lin | |
| Compositional Reasoning with Transformers, RNNs, and Chain of Thought | Gilad Yehudai, Noah Amsel, Joan Bruna | |
| Contrastive Self-Supervised Learning As Neural Manifold Packing | Guanming Zhang, David Heeger, Stefano Martiniani | |
| DISC: Dynamic Decomposition Improves LLM Inference Scaling | Jonathan Light, Wei Cheng, Benjamin Riviere, Yue Wu, Masafumi Oyamada, Mengdi Wang, Yisong Yue, Santiago Paternain, Haifeng Chen | |
| Dense Backpropagation Improves Training for Sparse Mixture-of-Experts | Ashwinee Panda, Vatsal Baherwani, Zain Sarwar, Benjamin Thérien, Sambit Sahu, Tom Goldstein, Supriyo Chakraborty | |
| Differentiable extensions with rounding guarantees for combinatorial optimization over permutations | Robert R Nerem, Zhishang Luo, Akbar Rafiey, Yusu Wang | |
| Distance Adaptive Beam Search for Provably Accurate Graph-Based Nearest Neighbor Search | Yousef Al-Jazzazi, Haya Diwan, Jinrui Gou, Cameron N Musco, Christopher Musco, Torsten Suel | |
| Do different prompting methods yield a common task representation in language models? | Guy Davidson, Todd M. Gureckis, Brenden Lake, Adina Williams | |
| Dynamic Test-Time Compute Scaling in Control Policy: Difficulty-Aware Stochastic Interpolant Policy | Inkook Chun, Seungjae Lee, Michael Samuel Albergo, Saining Xie, Eric Vanden-Eijnden | |
| ESCA: Enabling Seamless Codec Avatar Execution through Algorithm and Hardware Co-Optimization for Virtual Reality | Mingzhi Zhu, Ding Shang, Sai Qian Zhang | |
| Efficient semantic uncertainty quantification in language models via diversity-steered sampling | Ji Won Park, Kyunghyun Cho | |
| Emergence and scaling laws in SGD learning of shallow neural networks | Yunwei Ren, Eshaan Nichani, Denny Wu, Jason D. Lee | |
| Emergence of Linear Truth Encodings in Language Models | Shauli Ravfogel, Gilad Yehudai, Tal Linzen, Joan Bruna, Alberto Bietti | |
| Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits | Yuzhou Gu, Yanjun Han, Jian Qian | |
| Exact Expressive Power of Transformers with Padding | William Merrill, Ashish Sabharwal | |
| FEAT: Free energy Estimators with Adaptive Transport | Yuanqi Du, Jiajun He, Francisco Vargas, Yuanqing Wang, Carla P Gomes, José Miguel Hernández-Lobato, Eric Vanden-Eijnden | |
| Feature-Based Instance Neighbor Discovery: Advanced Stable Test-Time Adaptation in Dynamic World | Qinting Jiang, Chuyang Ye, Dongyan Wei, Bingli Wang, Yuan Xue, Jingyan Jiang, Zhi Wang | |
| FlowMixer: A Depth-Agnostic Neural Architecture for Interpretable Spatiotemporal Forecasting | Fares B. Mehouachi, Saif Eddin Jabari | |
| Geometric Algorithms for Neural Combinatorial Optimization with Constraints | Nikolaos Karalias, Akbar Rafiey, Yifei Xu, Zhishang Luo, Behrooz Tahmasebi, Connie Jiang, Stefanie Jegelka | |
| HOComp: Interaction-Aware Human-Object Composition | Dong Liang, Jinyuan Jia, Yuhao LIU, Rynson W. H. Lau | |
| Hankel Singular Value Regularization for Highly Compressible State Space Models | Paul Schwerdtner, Jules Berman, Benjamin Peherstorfer | |
| How Does Label Noise Gradient Descent Improve Generalization in the Low SNR Regime? | Wei Huang, Andi Han, Yujin Song, Yilan Chen, Denny Wu, Difan Zou, Taiji Suzuki | |
| How to Auto-optimize Prompts for Domain Tasks? Adaptive Prompting and Reasoning through Evolutionary Domain Knowledge Adaptation | Yang Zhao, Pu Wang, Hao Frank Yang | |
| How to Scale Second-Order Optimization | Zixi Chen, Shikai Qiu, Hoang Phan, Qi Lei, Andrew Gordon Wilson | |
| How to build a consistency model: Learning flow maps via self-distillation | Nicholas Matthew Boffi, Michael Samuel Albergo, Eric Vanden-Eijnden | |
| Implicit Generative Property Enhancer | Pedro O. Pinheiro, Pan Kessel, Aya Abdelsalam Ismail, Sai Pooja Mahajan, Kyunghyun Cho, Saeed Saremi, Natasa Tagasovska | |
| Improved Balanced Classification with Theoretically Grounded Loss Functions | Corinna Cortes, Mehryar Mohri, Yutao Zhong | |
| Improved Confidence Regions and Optimal Algorithms for Online and Offline Linear MNL Bandits | Yuxuan Han, Jose Blanchet, Zhengyuan Zhou | |
| Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay | Yifan Sun, Jingyan Shen, Yibin Wang, Tianyu Chen, Zhendong Wang, Mingyuan Zhou, Huan Zhang | |
| Language Models Are Capable of Metacognitive Monitoring and Control of Their Internal Activations | Li Ji-An, Hua-Dong Xiong, Robert Wilson, Marcelo G Mattar, Marcus K. Benna | |
| Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models | Vlad Sobal, Wancong Zhang, Kyunghyun Cho, Randall Balestriero, Tim G. J. Rudner, Yann LeCun | |
| Learning normalized image densities via dual score matching | Florentin Guth, Zahra Kadkhodaie, Eero P Simoncelli | |
| Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws | Gerard Ben Arous, Murat A Erdogdu, Nuri Mert Vural, Denny Wu | |
| Length Generalization via Auxiliary Tasks | Pranjal Awasthi, Anupam Gupta, Ravi Kumar | |
| Making Classic GNNs Strong Baselines Across Varying Homophily: A Smoothness–Generalization Perspective | Ming Gu, Zhuonan Zheng, Sheng Zhou, Meihan Liu, Jiawei Chen, Qiaoyu Tan, Liangcheng Li, Jiajun Bu | |
| MaterialRefGS: Reflective Gaussian Splatting with Multi-view Consistent Material Inference | Wenyuan Zhang, Jimin Tang, Weiqi Zhang, Yi Fang, Yu-Shen Liu, Zhizhong Han | |
| MetaFind: Scene-Aware 3D Asset Retrieval for Coherent Metaverse Scene Generation | Zhenyu Pan, Yucheng Lu, Han Liu | |
| Modeling Neural Activity with Conditionally Linear Dynamical Systems | Victor Geadah, Amin Nejatbakhsh, David Lipshutz, Jonathan W. Pillow, Alex H Williams | |
| NeedleInATable: Exploring Long-Context Capability of Large Language Models towards Long-Structured Tables | Lanrui Wang, Mingyu Zheng, Hongyin Tang, Zheng Lin, Yanan Cao, Jingang Wang, Xunliang Cai, Weiping Wang | |
| Neurons as Detectors of Coherent Sets in Sensory Dynamics | Joshua L. Pughe-Sanford, Xuehao Ding, Jason J Moore, Anirvan M. Sengupta, Charles Epstein, Philip Greengard, Dmitri Chklovskii | |
| OSVI-WM: One-Shot Visual Imitation for Unseen Tasks using World-Model-Guided Trajectory Generation | Raktim Gautam Goswami, Prashanth Krishnamurthy, Yann LeCun, Farshad Khorrami | |
| Optimal Estimation of the Best Mean in Multi-Armed Bandits | Takayuki Osogami, Junya Honda, Junpei Komiyama | |
| Parsimonious Predictions for Strategyproof Scheduling | Richard Cole, Anupam Gupta, Pranav Jangir | |
| Performative Risk Control: Calibrating Models for Reliable Deployment under Performativity | Victor Li, Baiting Chen, Yuzhen Mao, Qi Lei, Zhun Deng | |
| PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement | Teng Hu, Zhentao Yu, Zhengguang Zhou, Jiangning Zhang, Yuan Zhou, Qinglin Lu, Ran Yi | |
| Predicting Empirical AI Research Outcomes with Language Models | Jiaxin Wen, Chenglei Si, Chen Yueh-Han, He He, Shi Feng | |
| Predicting Functional Brain Connectivity with Context-Aware Deep Neural Networks | Alexander Ratzan, Sidharth Goel, Junhao Wen, Christos Davatzikos, Erdem Varol | |
| Predicting partially observable dynamical systems via diffusion models with a multiscale inference scheme | Rudy Morel, Francesco Pio Ramunno, Jeff Shen, Alberto Bietti, Kyunghyun Cho, Miles Cranmer, Siavash Golkar, OLEXANDR GUGNIN, Geraud Krawezik, Tanya Marwah, Michael McCabe, Lucas Thibaut Meyer, Payel Mukhopadhyay, Ruben Ohana, Liam Holden Parker, Helen Qu, François Rozet, K.D. Leka, Francois Lanusse, David Fouhey, Shirley Ho | |
| Preserving Task-Relevant Information Under Linear Concept Removal | Floris Holstege, Shauli Ravfogel, Bram Wouters | |
| Principled Model Routing for Unknown Mixtures of Source Domains | Christoph Dann, Yishay Mansour, Teodor Vanislavov Marinov, Mehryar Mohri | |
| Procurement Auctions with Predictions: Improved Frugality for Facility Location | Eric Balkanski, Nicholas DeFilippis, Vasilis Gkatzelis, Xizhi Tan | |
| Regression-adjusted Monte Carlo Estimators for Shapley Values and Probabilistic Values | R. Teal Witter, Yurong Liu, Christopher Musco | |
| Robust Contextual Pricing | Anupam Gupta, Guru Guruganesh, Renato Paes Leme, Jon Schneider | |
| Scalable inference of functional neural connectivity at submillisecond timescales | Arina Medvedeva, Edoardo Balzani, Alex H Williams, Stephen L Keeley | |
| Self supervised learning for in vivo localization of microelectrode arrays using raw local field potential | Tianxiao He, Malhar Patel, Chenyi Li, Anna Maslarova, Mihály Vöröslakos, Nalini Ramanathan, Wei-Lun Hung, Gyorgy Buzsaki, Erdem Varol | |
| Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation is Wasteful | Martin Marek, Sanae Lotfi, Aditya Somasundaram, Andrew Gordon Wilson, Micah Goldblum | |
| Soft-consensual Federated Learning for Data Heterogeneity via Multiple Paths | Sheng Huang, Lele Fu, Fanghua Ye, Tianchi Liao, Bowen Deng, zhangchuanfu, Chuan Chen | |
| Solving Continuous Mean Field Games: Deep Reinforcement Learning for Non-Stationary Dynamics | Lorenzo Magnino, Kai Shao, Zida Wu, Jiacheng Shen, Mathieu Lauriere | |
| Sparta Alignment: Collectively Aligning Multiple Language Models through Combat | Yuru Jiang, Wenxuan Ding, Shangbin Feng, Greg Durrett, Yulia Tsvetkov | |
| Spectral Analysis of Representational Similarity with Limited Neurons | Hyunmo Kang, Abdulkadir Canatar, SueYeon Chung | |
| Split Gibbs Discrete Diffusion Posterior Sampling | Wenda Chu, Zihui Wu, Yifan Chen, Yang Song, Yisong Yue | |
| SplitFlow: Flow Decomposition for Inversion-Free Text-to-Image Editing | Sung-Hoon Yoon, Minghan Li, Gaspard Beaudouin, Congcong Wen, Muhammad Rafay Azhar, Mengyu Wang | |
| Test Time Scaling for Neural Processes | Hyungi Lee, Moonseok Choi, Hyunsu Kim, Kyunghyun Cho, Rajesh Ranganath, Juho Lee | |
| The Lighthouse of Language: Enhancing LLM Agents via Critique-Guided Improvement | Ruihan Yang, Fanghua Ye, Jian Li, Siyu Yuan, Yikai Zhang, Zhaopeng Tu, Xiaolong Li, Deqing Yang | |
| The Rise of Parameter Specialization for Knowledge Storage in Large Language Models | Yihuai Hong, Yiran Zhao, Wei Tang, Yang Deng, Yu Rong, Wenxuan Zhang | |
| ThermalGen: Style-Disentangled Flow-Based Generative Models for RGB-to-Thermal Image Translation | Jiuhong Xiao, Roshan Nayak, Ning Zhang, Daniel Toertei, Giuseppe Loianno | |
| Thinking vs. Doing: Improving Agent Reasoning by Scaling Test-Time Interaction | Junhong Shen, Hao Bai, Lunjun Zhang, Yifei Zhou, Amrith Setlur, Shengbang Tong, Diego Caples, Nan Jiang, Tong Zhang, Ameet Talwalkar, Aviral Kumar | |
| Tight Lower Bounds and Improved Convergence in Performative Prediction | Pedram Khorsandi, Rushil Gupta, Mehrnaz Mofakhami, Simon Lacoste-Julien, Gauthier Gidel | |
| U-CAN: Unsupervised Point Cloud Denoising with Consistency-Aware Noise2Noise Matching | Junsheng Zhou, XingYu Shi, Haichuan Song, Yi Fang, Yu-Shen Liu, Zhizhong Han | |
| Understanding outer learning rates in Local SGD | Ahmed Khaled, Satyen Kale, Arthur Douillard, Chi Jin, Rob Fergus, Manzil Zaheer | |
| Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding | Zaiquan Yang, Yuhao LIU, Gerhard Petrus Hancke, Rynson W. H. Lau | |
| VeriLoC: Line-of-Code Level Prediction of Hardware Design Quality from Verilog Code | Raghu Vamshi Hemadri, Jitendra Bhandari, Andre Nakkab, Johann Knechtel, Badri P Gopalan, Ramesh Narayanaswamy, Ramesh Karri, Siddharth Garg | |
| When Are Concepts Erased From Diffusion Models? | Kevin Lu, Nicky Kriplani, Rohit Gandikota, Minh Pham, David Bau, Chinmay Hegde, Niv Cohen | |
| When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective | Alireza Mousavi-Hosseini, Clayton Sanford, Denny Wu, Murat A Erdogdu | |
| Whole-Body Conditioned Egocentric Video Prediction | Yutong Bai, Danny Tran, Amir Bar, Yann LeCun, Trevor Darrell, Jitendra Malik | |
| Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation | Sungmin Cha, Kyunghyun Cho | |
| Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion | Alan Nawzad Amin, Nate Gruver, Andrew Gordon Wilson | |