Beomyoung Kim 김범영

Applied Scientist — NAVER Cloud (Image Vision)

Computer Vision · Perception · VLMs · Visual Agents

Email · 📍 Seoul, Korea
Google Scholar · GitHub · LinkedIn · DBLP

Summary

I am an Applied Scientist at NAVER Cloud and a Ph.D. student at KAIST, working at the intersection of computer vision and multimodal AI. My research journey began in core visual recognition — semantic and instance segmentation, object detection, and image matting — where I focused on making perception models both accurate and practical through label-efficient (weakly- and semi-supervised) and continual learning. This line of work has produced 14+ publications at top venues including CVPR, ICCV, ECCV, NeurIPS, and AAAI (572 citations, h-index 9), with an ICCV 2025 Highlight.

Today, I am extending that perception expertise into vision-language models and visual reasoning agents. I research grounded VLMs that resolve region-level queries directly from pixel evidence rather than unsupported text, and training-free agentic program synthesis that composes specialist vision models into query-specific workflows for spatial and temporal reasoning. I am especially driven by research that reaches the real world: at NAVER Cloud, I have repeatedly carried ideas from paper to product, including a zero-shot image-matting foundation model behind CLOVA-X's image editing, a foreground-segmentation API that outperforms leading commercial APIs in internal evaluation, and a face-authentication system serving millions of users. My goal is to build AI that perceives, understands, and reasons about the visual world as richly and reliably as people do.

Highlights

7 first-author papers at top-tier venues — CVPR · ICCV · ECCV · NeurIPS
5+ years as an Applied Scientist at NAVER Cloud with a proven research → product track record
Ph.D. pursued in parallel with full-time research
Continuously pushing new frontiers — Visual Reasoning Agents & grounded VLMs

Experience

Applied Scientist — NAVER Cloud, Image Vision

Jan 2021 – Present

Seongnam, Korea · formerly NAVER CLOVA, reorganized into NAVER Cloud

Grounded Multimodal AI (ongoing) — Researching grounded VLMs that connect language reasoning with pixel- and region-level visual evidence, so models can resolve region-based queries and reason more verifiably instead of relying on unsupported textual descriptions.
Visual Reasoning Agents (ongoing) — Researching training-free agentic program synthesis that dynamically builds query-specific, executable workflows from specialist vision models, targeting multi-step spatial and temporal reasoning that remains difficult for end-to-end VLMs, especially where task-specific data or post-training is impractical.
Created ZIM, a promptable zero-shot image-matting foundation model (ICCV 2025 Highlight), by improving SAM through architecture, loss, and data-pipeline design on an ~1M-image curated dataset; open-sourced with a public demo and integrated into the CLOVA-X Image Editing service.
Developed the model and dataset for a foreground-segmentation API now served internally, outperforming Photoroom, Remove.bg, and Adobe segmentation APIs in internal evaluation.
Built the face anti-spoofing model behind FaceSign, a government-certified face-authentication service that replaces payment and employee-badge tagging.
Independently developed an efficient on-device (mobile-CPU) human-segmentation model, deployed via ONNX-based in-house serving, achieving ~10 ms inference with robust quality under tight mobile compute budgets.
Advanced label-efficient and continual segmentation, producing first-author CVPR papers — ECLIPSE (continual panoptic), PointWSSIS (point-supervised instance), and BESTIE (weakly-supervised instance) — while developing annotation-efficient methods and widely adopted open-source codebases.

Research Internships

2018 – 2020

NAVER CLOVA Visual AI (FACE), 2020 — face recognition / understanding research.
Hyundai Mobis, Autonomous Driving Advanced Development, 2019 — perception for autonomous driving.
NAVER CLOVA Vision (OCR), 2018 — optical character recognition research.

Education

Ph.D., Kim Jaechul Graduate School of AI — KAIST

MLAI Lab, advised by Prof. Sung Ju Hwang (in parallel with full-time work)

2022 – Present

M.S., Electrical Engineering (Future Vehicle) — KAIST

SIIT Lab, advised by Prof. Junmo Kim

2019 – 2021

B.S., Information and Communication Engineering — Inha University

2013 – 2019

Under Review

Under review · NeurIPS 2026: Visual Reasoning Agents. A diagnostic framework for 3D spatial reasoning that turns silent perception failures in visual program synthesis into typed diagnoses, driving targeted program repair to rival frontier VLMs without task-specific training.

Selected Publications (★ = first author · full list on Scholar)

★ Learning from Adversity: Semantic-Aware Mask Refinement through Adversarial Perturbation. Beomyoung Kim, Sung Ju Hwang. ECCV 2026 · paper & code coming soon
★ ZIM: Zero-Shot Image Matting for Anything. Beomyoung Kim, Chanyong Shin, Joonhyun Jeong, Hyungsik Jung, Se-Yun Lee, Sewhan Chun, Dong-Hyun Hwang, Joonsang Yu. ICCV 2025 Highlight · paper · code · project
★ Towards Label-Efficient Human Matting: A Simple Baseline for Weakly Semi-Supervised Trimap-Free Human Matting. Beomyoung Kim, Myeong Yeon Yi, Joonsang Yu, Young Joon Yoo, Sung Ju Hwang. arXiv 2024 · paper · code
★ Rethinking Saliency-Guided Weakly-Supervised Semantic Segmentation. Beomyoung Kim, Donghyeon Kim, Sung Ju Hwang. arXiv 2024 · paper · code
★ ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning. Beomyoung Kim, Joonsang Yu, Sung Ju Hwang. CVPR 2024 · paper · code
EResFD: Rediscovery of the Effectiveness of Standard Convolution for Lightweight Face Detection. Joonhyun Jeong, Beomyoung Kim, Joonsang Yu, Youngjoon Yoo. WACV 2024 · paper · code
★ The Devil is in the Points: Weakly Semi-Supervised Instance Segmentation via Point-Guided Mask Representation. Beomyoung Kim, Joonhyun Jeong, Dongyoon Han, Sung Ju Hwang. CVPR 2023 · paper · code
★ Beyond Semantic to Instance Segmentation: Weakly-Supervised Instance Segmentation via Semantic Knowledge Transfer and Self-Refinement. Beomyoung Kim, Youngjoon Yoo, Chaeeun Rhee, Junmo Kim. CVPR 2022 · paper · code
Learning Features with Parameter-Free Layers. Dongyoon Han, Youngjoon Yoo, Beomyoung Kim, Byeongho Heo. ICLR 2022 · paper · code
★ TricubeNet: 2D Kernel-Based Object Representation for Weakly-Occluded Oriented Object Detection. Beomyoung Kim, Janghyeon Lee, Sihaeng Lee, Doyeon Kim, Junmo Kim. WACV 2022 · paper · code
★ SSUL: Semantic Segmentation with Unknown Label for Exemplar-based Class-Incremental Learning. Sungmin Cha*, Beomyoung Kim*, Youngjoon Yoo, Taesup Moon (* equal contribution). NeurIPS 2021 · paper · code
★ Discriminative Region Suppression for Weakly-Supervised Semantic Segmentation. Beomyoung Kim, Sangeun Han, Junmo Kim. AAAI 2021 · paper · code
★ 3D Point Cloud Upsampling and Colorization using GAN. Beomyoung Kim, Sangeun Han, Eojindl Yi, Junmo Kim. MIWAI 2021 · paper
Fully Automated Valet Parking System Based on Infrastructure Sensing. Hyunjee Ryu, Beomyoung Kim, Heecheol Yoo, Jungwon Lee. RiTA 2020 · paper

Honors & Awards

2025: ICCV 2025 Highlight (top ~3% of accepted papers) — ZIM

Academic Service — Reviewer

2026: CVPR · ECCV · NeurIPS · ICLR · TPAMI

2025: CVPR · ICCV · NeurIPS · ICLR · TMLR

2024: CVPR · ECCV · NeurIPS · ICLR · AAAI

2023: CVPR · ICCV · NeurIPS · WACV

Invited Talks

2025: Centum Digital Week — "Next Code 2025: Beyond AI, Into Agents"
2024: TEAM NAVER DAN 24 — CLOVA-X Image Editing
2022: Jinhaksa Catch Career-Con — AI Research Engineer career
2022: Inha University — Weakly-Supervised Instance Segmentation
2021: NeurIPS 2021 Social: ML in Korea — SSUL

Technical Skills

Research areas: Image Segmentation & Detection · Image Matting · Vision Foundation Models · Multimodal / Vision-Language Models · Visual Reasoning Agents · Label-Efficient & Continual Learning

Frameworks & tools: PyTorch · TensorFlow · large-scale / distributed training

Programming: Python · C++ · C · Java

Languages

Korean (native) · English (intermediate — conversational working proficiency)