Session
Research-Track Oral Presentation: R10: Security and Privacy
Grand Ballroom 2
Blueprint, Bootstrap, and Bridge: A Security Look at NVIDIA GPU Confidential Computing
Zhongshu Gu ⋅ Salman Ahmed ⋅ Julian James Stephen ⋅ Shixuan Zhao ⋅ Zhiqiang Lin
GPU Confidential Computing (GPU-CC), introduced with the NVIDIA Hopper architecture, extends confidential computing protections from CPUs to GPUs, enabling secure execution of AI workloads. For end users, enabling GPU-CC is seamless and requires no modifications to existing applications. However, behind this ease of adoption lies a proprietary and highly complex system whose opacity presents significant challenges for early adopters and system researchers seeking to understand its architecture and security landscape. In this work, we provide a security-focused look at GPU-CC by reconstructing a coherent view of the system. Our analysis begins with GPU-CC’s blueprint, focusing on the specialized architectural engines that underpin its security design. We then investigate GPU-CC’s bootstrap process, which orchestrates hardware and software components to establish core security mechanisms. Finally, we conduct targeted experiments to evaluate whether, under GPU-CC’s threat model, data transfers via different data paths remain secure when they cross the bridge between the trusted CPU and GPU domains. All security findings presented in this paper have been responsibly reported to the NVIDIA Product Security Incident Response Team (PSIRT).
Reinforcement learning is a promising approach to autonomous and adaptive security management in networked systems. However, current reinforcement learning solutions for security management are mostly limited to simulation environments, and it is unclear how they generalize to operational systems. In this paper, we address this limitation by presenting CSLE: a reinforcement learning platform for autonomous security management that enables experimentation under semi-operational conditions. Conceptually, CSLE encompasses two systems. First, it includes an emulation system that replicates key components of the target system in a virtualized environment. We use this system to gather measurements and logs, based on which we identify a system model, such as a Markov decision process. Second, it includes a simulation system where security strategies are efficiently learned through simulations of the system model. The learned strategies are then evaluated and refined in the emulation system to close the gap between theoretical and operational performance. We demonstrate CSLE through four use cases: flow control, replication control, segmentation control, and recovery control. Through these use cases, we show that CSLE enables near-optimal security management in a semi-operational environment.
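The emulate-then-simulate loop described above can be sketched in miniature: once measurements from the emulated system have been fit to a Markov decision process, a strategy can be learned by solving that model. The sketch below is hypothetical (it is not CSLE's actual API) and uses plain value iteration on a toy two-state, two-action MDP standing in for the identified model; the action labels are illustrative assumptions.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Solve the identified MDP: P[a][s, s'] are transition probabilities
    under action a, R[a][s] the rewards. Returns (greedy policy, values)."""
    n_states = P[0].shape[0]
    V = np.zeros(n_states)
    while True:
        # Q[a, s] = expected return of taking action a in state s
        Q = np.stack([R[a] + gamma * P[a] @ V for a in range(len(P))])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return Q.argmax(axis=0), V_new
        V = V_new

# Toy model "identified" from emulation logs: states {healthy, compromised}
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),   # action 0, e.g. "monitor"
     np.array([[0.5, 0.5], [0.1, 0.9]])]   # action 1, e.g. "recover"
R = [np.array([1.0, -1.0]), np.array([0.0, 0.5])]
policy, values = value_iteration(P, R)
```

In CSLE's terms, the learned `policy` would then be evaluated back in the emulation system to close the gap between the model and the operational environment.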
G-HEMP: Fast Multi-GPU Private Inference for Large-Scale GCNs with Homomorphic Encryption
Ran Ran ⋅ Zhaoting Gong ⋅ Zhaowei Li ⋅ Xianting Lu ⋅ Jiajia Li ⋅ Wujie Wen
Homomorphic Encryption (HE) offers a promising solution for privacy-preserving Graph Convolutional Networks (GCN) inference in untrusted cloud environments by enabling computation directly on encrypted data. This capability is particularly valuable in applications such as recommendation systems, financial analysis, and bioinformatics, where the data is subject to strict privacy requirements. However, applying HE to large-scale GCN inference introduces substantial computational and memory overhead, which significantly limits scalability and runtime performance. Although prior works have demonstrated promising results with CPU-based implementations, these approaches remain constrained in terms of throughput and scalability due to redundant HE operations and high memory demands. In this work, we present G-HEMP, the first framework that leverages the power of multi-GPU systems to accelerate large-scale private GCN inference. G-HEMP introduces two key innovations: (i) a block-diagonal parallel packing technique that eliminates redundant data replication for encrypted adjacency matrices, achieving up to 4.41× latency speedup over traditional feature-wise packing; and (ii) a multi-GPU workload partitioning strategy that reduces peak memory usage by 50% and improves inference latency by up to 1.98×. By combining these techniques, the number of HE operations is significantly reduced, and the encrypted computation can be partitioned and efficiently distributed across multiple GPUs to maximize throughput and hardware utilization. Our G-HEMP framework is model-agnostic and scales seamlessly with large GCN inference tasks. Together, these contributions enable scalable and efficient privacy-preserving GCN inference, advancing the practicality of HE-based GCN analytics on modern heterogeneous hardware.
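The packing idea can be illustrated in plaintext. A standard way to avoid replicating a matrix across ciphertexts is the diagonal encoding used for homomorphic matrix-vector products: each generalized diagonal is stored once, and the product is assembled with slot rotations. The layout below is an illustrative analogue of that technique, not G-HEMP's exact scheme, which operates on encrypted adjacency blocks with CKKS-style ciphertexts.

```python
import numpy as np

def pack_diagonals(A):
    """Return the n generalized diagonals of an n x n matrix,
    where diagonal d holds A[i, (i + d) % n] for each row i."""
    n = A.shape[0]
    return [np.array([A[i, (i + d) % n] for i in range(n)]) for d in range(n)]

def diag_matvec(diags, x):
    """Compute A @ x by the diagonal method: sum_d diags[d] * rotate(x, d).
    In HE, np.roll corresponds to a ciphertext slot rotation."""
    n = len(x)
    return sum(diags[d] * np.roll(x, -d) for d in range(n))

A = np.arange(16, dtype=float).reshape(4, 4)   # stand-in adjacency block
x = np.array([1.0, 2.0, 3.0, 4.0])             # stand-in feature vector
assert np.allclose(diag_matvec(pack_diagonals(A), x), A @ x)
```

Because each diagonal is packed exactly once, the matrix is never duplicated per feature column, which is the kind of redundancy the abstract's block-diagonal packing removes.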
Privatar: Scalable Privacy-preserving Multi-user VR via Secure Offloading
Jianming Tong ⋅ Hanshen Xiao ⋅ Hao Kang ⋅ Ashish Sirasao ⋅ Ziqi Zhang ⋅ G. Edward Suh ⋅ Tushar Krishna
Multi-user virtual reality (VR) applications such as football and concert experiences rely on real-time avatar reconstruction to enable immersive interaction. However, rendering avatars for numerous participants on each headset incurs prohibitive computational overhead, fundamentally limiting scalability. This work introduces Privatar, a framework that offloads avatar reconstruction from the headset to untrusted devices within the same local network while safeguarding sensitive facial features against adversaries capable of intercepting offloaded data. Privatar builds on two insights. (1) **System level**. We observe that identity-bearing information in facial inputs is highly skewed across frequencies, and propose **Horizontal Partitioning (HP)** to keep the most identifying frequency components on-device and offload only low-identifiability components. HP offloads local computation while preserving privacy against expression identification attacks. (2) **Privacy accounting level**. For **individually** offloaded, **multi-dimensional** signals without aggregation, worst-case local Differential Privacy requires prohibitive noise, ruining utility. We observe that each user’s expression distribution is **stable over time**, and hence propose Distribution-Aware Minimal Perturbation (DAMP). DAMP minimizes noise based on each user’s expression distribution to significantly reduce its effects on utility and accuracy, while retaining a formal privacy guarantee. On a Meta Quest Pro, Privatar supports up to 2.37$\times$ more concurrent users at 5.7~6.5\% higher reconstruction loss and ~9\% energy overhead, providing a better Pareto frontier in Throughput-Loss over SotA quantization, sparsity, and local reconstruction baselines. Privatar further provides a provable privacy guarantee and remains robust against both empirical attacks and an NN-based Expression Identification Attack, demonstrating its resilience in practice. Our code is open-sourced at https://anonymous.4open.science/r/Privatar-372A.
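The Horizontal Partitioning insight can be sketched with a one-dimensional signal: transform to the frequency domain, keep the identity-bearing components on-device, and offload the complementary remainder. Which band carries identity, the transform, and the 25% split are all illustrative assumptions here, not Privatar's measured frontier.

```python
import numpy as np

def horizontal_partition(signal, keep_frac=0.25):
    """Split DFT coefficients into an on-device part (assumed identity-bearing,
    low-index band in this toy) and an offloaded remainder. Either half alone
    reconstructs the input poorly; together they are exactly complementary."""
    coeffs = np.fft.rfft(signal)
    k = max(1, int(len(coeffs) * keep_frac))
    on_device, offloaded = coeffs.copy(), coeffs.copy()
    on_device[k:] = 0.0   # band retained locally on the headset
    offloaded[:k] = 0.0   # band sent to the untrusted helper device
    return on_device, offloaded

rng = np.random.default_rng(0)
sig = np.sin(np.linspace(0, 4 * np.pi, 64)) + 0.1 * rng.standard_normal(64)
local, remote = horizontal_partition(sig)
# Complementarity: the two halves sum back to the full spectrum.
assert np.allclose(np.fft.irfft(local + remote, n=64), sig)
```

In the real system, only the offloaded half would leave the device, with DAMP adding distribution-aware noise on top before transmission.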
Toward Principled LLM Safety Testing: Solving the Jailbreak Oracle Problem
Shuyi Lin ⋅ Anshuman Suri ⋅ Alina Oprea ⋅ Cheng Tan
As large language models (LLMs) become increasingly deployed in safety-critical applications, the lack of systematic methods to assess their vulnerability to jailbreak attacks presents a critical security gap. We introduce the \emph{jailbreak oracle problem}: given a model, prompt, and decoding strategy, determine whether a jailbreak response can be generated with likelihood exceeding a specified threshold. This formalization enables a principled study of jailbreak vulnerabilities. Answering the jailbreak oracle problem poses significant computational challenges, as the search space grows exponentially with response length. We present BOA, the first system designed to solve the jailbreak oracle problem efficiently. BOA employs a two-phase search strategy: (1) breadth-first sampling to identify easily accessible jailbreaks, followed by (2) depth-first priority search guided by fine-grained safety scores to systematically explore promising yet low-probability paths. BOA enables rigorous security assessments, including systematic defense evaluation, standardized comparison of red team attacks, and model certification under extreme adversarial conditions.
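The two-phase strategy can be sketched abstractly: cheap sampling of complete responses first, then a best-first search over partial responses that expands the prefix with the highest fine-grained safety score. All interfaces below (`sample`, `extend`, `score`, `is_jailbreak`) are hypothetical stand-ins, not BOA's implementation, and the toy instantiation over 'a'/'b' strings stands in for token sequences.

```python
import heapq
import itertools

def two_phase_search(sample, extend, score, is_jailbreak,
                     n_samples=8, budget=64):
    # Phase 1: breadth-first sampling of full responses (easy jailbreaks).
    for _ in range(n_samples):
        resp = sample()
        if is_jailbreak(resp):
            return resp
    # Phase 2: priority search over partial responses; the heap pops the
    # highest-scoring prefix first (negated score => min-heap acts as max-heap).
    counter = itertools.count()                  # tie-breaker for equal scores
    frontier = [(-score(""), next(counter), "")]
    for _ in range(budget):
        if not frontier:
            break
        _, _, prefix = heapq.heappop(frontier)
        for nxt in extend(prefix):               # candidate continuations
            if is_jailbreak(nxt):
                return nxt
            heapq.heappush(frontier, (-score(nxt), next(counter), nxt))
    return None                                  # budget exhausted: oracle says "no"

# Toy instantiation: sampling never finds the target, priority search does.
found = two_phase_search(
    sample=lambda: "aaa",
    extend=lambda p: [p + "a", p + "b"] if len(p) < 3 else [],
    score=lambda s: s.count("b"),
    is_jailbreak=lambda s: s == "bbb",
)
```

The score-guided phase is what lets the search reach low-probability paths a sampler alone would miss, which mirrors the abstract's motivation for phase (2).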
ZK-APEX: Zero-Knowledge Approximate Personalized Unlearning with Executable Proofs
Mohammad M Maheri ⋅ Hamed Haddadi
Machine unlearning removes the influence of specified data from trained models to satisfy privacy, copyright, and safety requirements (e.g., the “right to be forgotten”). In practice, providers distribute a global model to edge devices, each of which locally personalizes the model on its private data. However, since clients may ignore or falsify deletion requests, providers must verify correct unlearning for these distributed models without accessing private parameters. This is particularly challenging for personalized models, which must forget designated samples without degrading local utility, while ensuring that verification remains efficient and scalable on resource-constrained edge devices. We formalize personalized unlearning and develop a zero-shot approximate unlearning algorithm that works directly on the personalized model without retraining. Our novel method, ZK-APEX, combines provider-side sparse masking for targeted removal with client-side Group-OBS compensation computed from a block-wise empirical Fisher. This technique yields a curvature-aware update designed for low-overhead execution and proof generation. Using modern Halo2 ZK-SNARKs, we prove operator compliance by showing that the unlearned model exactly matches the committed output of the prescribed transformation, without revealing personalized model parameters or data. On Vision Transformer (ViT) classification models, our approach recovers approximately 99\% Top-1 personalization accuracy while enforcing effective forgetting. We further evaluate the unlearning algorithm on a generative model, OPT125M, trained on the CodeParrot code dataset, achieving $\sim$70\% recovery of original accuracy. ZK-SNARK proof generation for the ViT case completes in $\approx$2~hours, which is more than $10^7\times$ faster than retraining-based verification, with peak memory under 0.7~GB and proof sizes about 400~MB.
Together, these results establish the first verifiable personalized unlearning framework practical for deployment on resource-constrained edge devices.
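The compensation step can be sketched with the textbook OBS (Optimal Brain Surgeon) update generalized to a weight group: after masking a group to zero, the remaining weights are adjusted using the inverse of a (here, block-wise empirical) Fisher so the quadratic loss increase is minimized. This is the classic formula, offered as an assumption about the flavor of Group-OBS the abstract names, not ZK-APEX's exact block-wise pipeline.

```python
import numpy as np

def obs_compensate(w, F_inv, idx):
    """Zero the weights at `idx` and compensate the rest.
    Constrained minimizer of 0.5 * dw^T F dw subject to dw[idx] = -w[idx]:
        dw = -F_inv[:, idx] @ inv(F_inv[idx, idx]) @ w[idx]."""
    idx = np.asarray(idx)
    B = F_inv[np.ix_(idx, idx)]                 # block of the inverse Fisher
    dw = -F_inv[:, idx] @ np.linalg.solve(B, w[idx])
    w_new = w + dw
    w_new[idx] = 0.0                            # enforce exact removal
    return w_new

# Tiny demo: remove weight 1 from a 3-weight block and compensate the others.
H = np.array([[2.0, 0.5, 0.1],                  # stand-in Fisher block (SPD)
              [0.5, 1.5, 0.2],
              [0.1, 0.2, 1.0]])
w = np.array([1.0, -2.0, 0.5])
w_unlearned = obs_compensate(w, np.linalg.inv(H), [1])
```

Because the update is the constrained minimizer, the curvature-weighted loss of compensated removal is never worse than naively zeroing the same weights, which is the property that lets unlearning preserve personalization accuracy.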