Session: Correctness and Security
Ballroom C
Moderator: Xiaoyuan Liu
Validating Large Language Models with ReLM
Michael Kuchnik · Virginia Smith · George Amvrosiadis
Although large language models (LLMs) have been touted for their ability to generate natural-sounding text, there are growing concerns around possible negative effects of LLMs such as data memorization, bias, and inappropriate language. Unfortunately, the complexity and generation capacities of LLMs make validating (and correcting) such concerns difficult. In this work, we introduce ReLM, a system for validating and querying LLMs using standard regular expressions. ReLM formalizes and enables a broad range of language model evaluations, reducing complex evaluation rules to simple regular expression queries. Our results exploring queries surrounding memorization, gender bias, toxicity, and language understanding show that ReLM achieves up to 15× higher system efficiency, 2.5× data efficiency, and increased statistical and prompt-tuning coverage compared to state-of-the-art ad-hoc queries. ReLM offers a competitive and general baseline for the increasingly important problem of LLM validation.
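As a rough illustration of reducing an evaluation rule to a regular expression, the sketch below tallies which (gender, profession) pairs appear in a list of strings. It uses only Python's standard `re` module and is not ReLM's API; the pattern, the `tally_matches` helper, and the sample strings are all hypothetical, and a real system would drive the model's decoding with the expression rather than filter text after the fact.

```python
import re

# Hypothetical bias query: one regular expression describes every
# sentence of interest, with capture groups for the varying parts.
QUERY = re.compile(r"The (man|woman) was trained in (art|science|medicine)\.")


def tally_matches(samples):
    """Count (gender, profession) pairs among strings that fully match QUERY."""
    counts = {}
    for text in samples:
        match = QUERY.fullmatch(text)
        if match is None:
            continue  # the string falls outside the queried language
        key = (match.group(1), match.group(2))
        counts[key] = counts.get(key, 0) + 1
    return counts


# Stand-in for text sampled from a model (made up for this example).
samples = [
    "The man was trained in science.",
    "The woman was trained in art.",
    "The woman was trained in science.",
    "The man was trained in law.",
    "The man was trained in science.",
]
counts = tally_matches(samples)
```

Strings outside the pattern (here, the "law" sentence) are simply ignored, which is what makes the regular expression act as the specification of the test set.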
SysNoise: Exploring and Benchmarking Training-Deployment System Inconsistency
Yan Wang · Yuhang Li · Ruihao Gong · Aishan Liu · Yanfei Wang · Jian Hu · Yongqiang Yao · Yunchen Zhang · Tianzi Xiaotian · Fengwei Yu · Xianglong Liu
Extensive studies have shown that deep learning models are vulnerable to adversarial and natural noise, yet little is known about model robustness to noise caused by differing system implementations. In this paper, we introduce SysNoise for the first time: a frequently occurring but often overlooked noise in the deep learning training-deployment cycle. In particular, SysNoise arises when the source training system is replaced by a disparate target system at deployment, where many tiny system mismatches add up to a non-negligible difference. We first identify and classify SysNoise into three categories based on the inference stage; we then build a holistic benchmark to quantitatively measure the impact of SysNoise on 20+ models, covering image classification, object detection, instance segmentation, and natural language processing tasks. Our extensive experiments reveal that SysNoise affects model robustness across different tasks, and that common mitigations such as data augmentation and adversarial training have limited effect on it. Together, our findings open a new research topic, and we hope this work will draw research attention to how deep learning deployment systems affect model performance.
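To make the idea of a "tiny system mismatch" concrete, the sketch below resizes the same row of pixel values with two linear-interpolation conventions that different libraries are known to use (corner-aligned versus half-pixel-centered sampling). This is an illustrative assumption, not an item taken from the benchmark described in the abstract; `resize_linear` and the sample row are hypothetical.

```python
def resize_linear(row, out_len, align_corners):
    """Linearly resample a 1-D row of values to out_len samples.

    align_corners=True maps the first/last output samples exactly onto the
    first/last inputs; align_corners=False uses half-pixel centers.
    """
    n = len(row)
    out = []
    for i in range(out_len):
        if align_corners:
            src = i * (n - 1) / (out_len - 1) if out_len > 1 else 0.0
        else:
            src = (i + 0.5) * n / out_len - 0.5
        src = min(max(src, 0.0), n - 1)  # clamp to the valid index range
        lo = int(src)
        hi = min(lo + 1, n - 1)
        frac = src - lo
        out.append(row[lo] * (1 - frac) + row[hi] * frac)
    return out


row = [0.0, 10.0, 20.0, 30.0]
train_side = resize_linear(row, 3, align_corners=True)    # "training" system
deploy_side = resize_linear(row, 3, align_corners=False)  # "deployment" system
max_diff = max(abs(a - b) for a, b in zip(train_side, deploy_side))
```

Both functions are valid resizers, yet the model would see slightly different inputs at deployment than it saw during training, which is the kind of discrepancy the abstract describes as accumulating into a non-negligible difference.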
Be Careful with PyPI Packages: You May Unconsciously Spread Backdoor Model Weights
Tianhang Zheng · Hao Lan · Baochun Li
To facilitate deep learning project development, some popular platforms provide model (sub)packages for developers to import and instantiate a deep learning model with a few lines of code. For example, PyTorch provides torchvision.models for developers to instantiate models such as VGG and ResNet. Although those model packages are easy to install and use, their integrity may not be well-protected locally. In this paper, we show that an adversary can manipulate the .py files in the developers' locally installed model packages if the developers install the adversary's PyPI package to use its claimed features. When the adversary's package is installed, the system does not report any warning or error related to the manipulation. Leveraging this integrity vulnerability, we design an attack to manipulate the model forward function in the local .py files, such as resnet.py in the local torchvision.models subpackage. With our attack, the adversary can implant a backdoor into the developers' trained model weights, even if the developers use seemingly clean training data and seemingly normal training code.
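Since the manipulation is silent, one way a developer might notice it is to hash the .py files of an installed package before and after installing something new. The sketch below is a minimal defensive check under that assumption; it is not a method from the paper, and `snapshot`, `changed_files`, and the temporary stand-in file are hypothetical names used only for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(package_dir):
    """Map each .py file under package_dir to the SHA-256 of its contents."""
    root = Path(package_dir)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*.py"))
    }


def changed_files(before, after):
    """Return files that were added, removed, or modified between snapshots."""
    names = set(before) | set(after)
    return sorted(n for n in names if before.get(n) != after.get(n))


# Demo on a temporary directory standing in for an installed subpackage.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "resnet.py"
    target.write_text("def forward(x):\n    return x\n")
    baseline = snapshot(tmp)
    # Simulate a silent edit made by some other package's install step.
    target.write_text("def forward(x):\n    return x  # modified\n")
    modified = changed_files(baseline, snapshot(tmp))
```

In practice the baseline would need to come from a trusted source (for example, a fresh install in a clean environment), since a snapshot taken after the tampering would look consistent.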
Building Verified Neural Networks for Computer Systems with Ouroboros
Cheng Tan · Changliu Liu · Zhihao Jia · Tianhao Wei
Neural networks are powerful tools, and applying them in computer systems (operating systems, databases, and networked systems) has attracted much attention. However, neural networks are complicated black boxes that may produce unexpected results. To train networks with well-defined behaviors, we introduce Ouroboros, a system that constructs verified neural networks. Verified neural networks are those that satisfy user-defined safety properties, known as specifications. Ouroboros builds verified networks through a training-verification loop that combines deep learning training and neural network verification. The system employs multiple techniques to bridge the gap between today’s verification and the properties required for systems. Ouroboros also accelerates the training-verification loop with spec-aware learning. Our experiments show that Ouroboros can train verified networks for the five applications that we study and achieves a 2.8× speedup on average compared with the vanilla training-verification loop.
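The general shape of a training-verification loop can be sketched on a toy problem. The example below is not Ouroboros's algorithm and does not model spec-aware learning; it assumes a one-parameter model f(x) = w * x, a hypothetical specification f(x) <= 1 for all x in [0, 1], an exact verifier that exploits linearity, and a simple policy of feeding each counterexample back as a training point labeled inside the specification. All function names and constants are illustrative.

```python
def train(w, data, lr=0.5, steps=200):
    """Gradient descent on mean squared error for the model f(x) = w * x."""
    n = len(data)
    for _ in range(steps):
        grad = sum(2 * x * (w * x - y) for x, y in data) / n
        w -= lr * grad
    return w


def verify(w, upper=1.0):
    """Check the toy spec f(x) <= upper for every x in [0, 1].

    Returns None if the spec holds, otherwise a violating input.
    """
    # f is linear, so its maximum over [0, 1] is attained at an endpoint.
    worst_x = 1.0 if w > 0 else 0.0
    return None if w * worst_x <= upper else worst_x


def train_verify_loop(data, upper=1.0, margin=0.1, max_rounds=20):
    """Alternate training and verification until the spec is proven or rounds run out."""
    data = list(data)
    w = 0.0
    for round_idx in range(max_rounds):
        w = train(w, data)
        counterexample = verify(w, upper)
        if counterexample is None:
            return w, round_idx, True
        # Add the counterexample with a label safely inside the spec.
        data.append((counterexample, upper - margin))
    return w, max_rounds, False


# Training data follows y = 1.2 * x, which violates the spec at x = 1.
data = [(0.0, 0.0), (0.5, 0.6), (1.0, 1.2)]
w, rounds, verified = train_verify_loop(data)
```

Each failed verification adds one more corrective point, so the fitted slope moves toward the specification until the verifier accepts the model. The trade-off is visible even here: the verified model fits the original data slightly worse than the unconstrained one.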