

Talk in Workshop: SysML4Health: Scalable Systems for ML-driven Analytics in Healthcare

Maya Gokhale, "Efficient memory-mapped search in persistent memory/storage"



Abstract:

Memory mapping is the standard technique for accessing external data objects as if they were in memory. With the increasing diversity of storage options, such as persistent memory, locally attached high-performance SSDs, and network storage, memory mapping presents a uniform interface for applications to access out-of-core data sets. In this talk, I will discuss two approaches to efficient memory-mapped search of genetic data in persistent memory/storage.
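
As a rough illustration of the uniform interface described above, the following Python sketch reads a byte range from a file through the operating system's standard memory mapping. The helper name `read_slice_mapped` and the small temporary file are assumptions made for this example only; they do not come from the talk.

```python
import mmap
import os
import tempfile


def read_slice_mapped(path, offset, length):
    """Return `length` bytes starting at `offset`, read through a memory mapping."""
    with open(path, "rb") as f:
        # A length of 0 maps the whole file; pages are loaded on first access.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # The mapping is sliced like an in-memory bytes object.
            return bytes(mm[offset:offset + length])


# Hypothetical data file standing in for an out-of-core data set.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"ACGTACGTTTGACCA")
    demo_path = tmp.name

fragment = read_slice_mapped(demo_path, 8, 4)
os.remove(demo_path)
```

The same slicing code works whether the file lives on persistent memory, a local SSD, or network storage; only the access latency behind the page faults differs.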

The UMap library provides a memory mapping interface to external data sets. As a user-level library, UMap can be easily adapted to application-specific access patterns and to storage characteristics. This flexibility is not possible with system-wide services such as mmap, which are optimized for generality. UMap has been integrated into the Livermore Metagenomics Analysis Toolkit (LMAT) and improves performance by 15% over system mmap.

The second approach creates a hardware pipeline to efficiently search an in-memory key/value store, and I will discuss its use to find k-mers. K-mer search is the first step in LMAT's metagenomic analysis, collecting the taxonomy information associated with k-mers found in a metagenomic sample. We find that hardware acceleration can speed up k-mer lookup by 4X to 10X over software. Using an FPGA emulator, we can assess the performance impact of higher-latency persistent memory on this important processing step.
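
For readers unfamiliar with k-mer lookup, the sketch below shows one simple software form of the operation: a binary search over a memory-mapped table of sorted key/value records. It is not LMAT's index format and not the hardware pipeline from the talk. The record layout (`RECORD`), the 2-bit base encoding, the function names, and the taxonomy IDs are all assumptions chosen for illustration.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout: 64-bit k-mer key followed by a 32-bit taxonomy ID.
RECORD = struct.Struct("<QI")
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_kmer(kmer):
    """Pack a k-mer (k <= 32) into an integer, 2 bits per base."""
    key = 0
    for base in kmer:
        key = (key << 2) | CODE[base]
    return key


def build_table(path, entries):
    """Write (k-mer, tax_id) pairs as fixed-width records sorted by key."""
    rows = sorted((encode_kmer(k), t) for k, t in entries)
    with open(path, "wb") as f:
        for key, tax_id in rows:
            f.write(RECORD.pack(key, tax_id))


def lookup(mm, kmer):
    """Binary-search the mapped table; return the taxonomy ID or None."""
    target = encode_kmer(kmer)
    lo, hi = 0, len(mm) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        # Each probe touches one record; the OS pages it in on demand.
        key, tax_id = RECORD.unpack_from(mm, mid * RECORD.size)
        if key == target:
            return tax_id
        if key < target:
            lo = mid + 1
        else:
            hi = mid
    return None


# Illustrative table; all k-mers share the same k, and the IDs are examples only.
path = os.path.join(tempfile.mkdtemp(), "kmers.bin")
build_table(path, [("ACGT", 562), ("TTGA", 9606), ("GGCC", 1280)])
with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
    hit = lookup(mm, "TTGA")
    miss = lookup(mm, "AAAA")
```

Each probe in the binary search is a dependent random access, so lookup time in a sketch like this is dominated by memory or storage latency rather than by computation.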