Briefs

Short paper notes

  • Elastic Queries Reinforcement Learning: Self-Aware Policy Execution for VLA Models 2026-06-15
    An elastic VLA execution framework that leaves the frozen flow-based VLA untouched and has a lightweight RL adaptor dynamically choose, at every query, the latent steering w, the number of denoising steps K, and the execution chunk length C, so that hard states get more compute and frequent replanning while easy states get less compute and longer open-loop execution.
    Korean inference-time success-rate VLA scheduler-training auxiliary-module-training
  • ReactVLA: Fast and Lightweight Reactive Robot Manipulation via Improved Mean Flow Action Generation 2026-06-15
    A low-latency reactive robot manipulation policy that reduces the inference-latency bottleneck of diffusion / flow-based VLA policies by replacing action generation with improved Mean Flow (iMF) based one-to-few-step continuous action chunk generation, combined with an Attention Residuals (AttnRes) Transformer.
    Korean inference-time VLA component-scratch-training
  • WAM4D: Fast 4D World Action Model via Spatial Register Tokens 2026-06-15
    Instead of producing 4D geometry directly as an inference-time output, WAM4D has training-time spatial register tokens predict future depth, distilling a geometric foundation prior into a causal video-action WAM; at deployment the geometry branch is removed so that action chunks are generated quickly.
    Korean success-rate WAM fine-tuning auxiliary-module-training component-scratch-training
  • µ0: A Scalable 3D Interaction-Trace World Model 2026-06-15
    A 3D trace-space world model that, during pretraining, learns semantic 3D interaction traces extracted from heterogeneous videos without any action-labeled robot data and, downstream, injects the hidden features of the frozen trace world model into an action expert to build a robot policy.
    Korean success-rate WAM foundation-model training-data component-scratch-training
  • EgoEngine: From Egocentric Human Videos to High-Fidelity Dexterous Robot Demonstrations 2026-06-12
    A human-video-to-robot-demo data engine that converts egocentric human manipulation videos through a digital twin, jointly generating robot observation videos and executable robot action trajectories, and uses them to train real-robot dexterous visuomotor policies.
    Korean success-rate training-data auxiliary-module-training
  • Improving Robotic Generalist Policies via Flow Reversal Steering 2026-06-12
    A training-free steering method that maps a coarse semantic action to latent noise through the reverse ODE of a frozen flow-matching VLA and then denoises it again, invoking finer-grained action modes inside the generalist policy prior.
    Korean success-rate inference-time VLA auxiliary-module-training training-free
  • Ambient Diffusion Policy: Imitation Learning from Suboptimal Data in Robotics 2026-06-11
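    An imitation learning method that, instead of simply mixing suboptimal / OOD robot demonstrations into Diffusion Policy training, restricts the “usable range” of each demonstration by diffusion timestep, so that only its useful global plan or local motion primitives are extracted.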
    Korean success-rate diffusion-policy scratch-training training-data
  • Dynamic Execution Horizon Prediction for Chunk-based Robot Policies 2026-06-11
    A lightweight execution-horizon predictor that keeps the action generator of a pretrained action-chunking robot policy fully frozen and uses PPO to learn, from the current observation and the predicted action chunk, how many steps to execute open-loop on each query.
    Korean inference-time success-rate diffusion-policy scheduler-training auxiliary-module-training
  • Efficient-WAM: A 1B-Parameter World-Action Model with Low-Cost Future Imagination 2026-06-10
    Redefines the WAM's future video prediction as low-cost, coarse future guidance that aids action generation rather than as photorealistic video generation, and combines a compact video expert, low-resolution future latents, and asymmetric video-action denoising to bring real-world policy inference latency down to about 98 ms/chunk at roughly 1B parameters.
    Korean inference-time success-rate WAM fine-tuning component-scratch-training
  • SARM2: Multi-Task Stage Aware Reward Modeling for Self Improving Robotic Manipulation 2026-06-10
    For self-improvement of VLA policies in long-horizon robotic manipulation, builds a dense reward/value model from an action-primitive stage estimator and a multi-gate MoE value head, and integrates it into SPIRAL's offline-to-online residual RL data flywheel.
    Korean success-rate VLA fine-tuning auxiliary-module-training MoE