Demonstration of Ballet: A Framework for Open-Source Collaborative Feature Engineering
Micah J Smith · Kelvin Lu · Kalyan Veeramachaneni

Mon Mar 04:00 PM -- 07:00 PM PST @ Ballroom B + C

Feature engineering is a critical part of end-to-end learning pipelines in many practical supervised learning settings. While the most predictive features often build on diverse domain expertise and human intuition, rarely are more than a handful of data scientists and researchers involved in this process. Ballet addresses this problem by providing a framework for scaling feature engineering collaborations in an open-source setting. In our approach, collaborators incrementally submit patches containing standalone feature definitions to a central source code repository. Our framework provides functionality for composing the separate features into an executable end-to-end pipeline, evaluating feature submissions in a streaming fashion, and automating project management tasks for maintainers. In this demonstration, audience participants will collaborate in real time on a feature engineering task using a complex, real-world dataset.
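The workflow described in the abstract can be illustrated with a minimal, self-contained Python sketch. It does not use Ballet's API: the `Feature` class, `build_pipeline`, `evaluate_stream`, the toy dataset, and the correlation thresholds are all hypothetical names and values chosen for illustration, and the correlation-based acceptance rule is only a stand-in for whatever evaluation the framework actually performs.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]


@dataclass(frozen=True)
class Feature:
    """Hypothetical standalone feature definition: input columns plus a transform."""
    name: str
    inputs: Sequence[str]
    transform: Callable[..., float]


def column(feature: Feature, rows: List[Row]) -> List[float]:
    """Compute one feature's values over all rows."""
    return [feature.transform(*(row[c] for c in feature.inputs)) for row in rows]


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def evaluate_stream(submissions, rows, target,
                    min_relevance=0.3, max_redundancy=0.95):
    """Consider submissions one at a time, in arrival order.

    Assumed toy criterion (not Ballet's): accept a feature if it correlates
    with the target and is not nearly collinear with an accepted feature.
    """
    accepted: List[Feature] = []
    for feat in submissions:
        values = column(feat, rows)
        relevant = abs(pearson(values, target)) >= min_relevance
        redundant = any(
            abs(pearson(values, column(a, rows))) > max_redundancy
            for a in accepted
        )
        if relevant and not redundant:
            accepted.append(feat)
    return accepted


def build_pipeline(features: List[Feature]):
    """Compose separately defined features into one row-to-vector pipeline."""
    def pipeline(rows: List[Row]) -> List[List[float]]:
        return [
            [f.transform(*(row[c] for c in f.inputs)) for f in features]
            for row in rows
        ]
    return pipeline


# Toy data standing in for a real dataset.
rows = [
    {"sqft": 50.0, "age": 40.0},
    {"sqft": 80.0, "age": 10.0},
    {"sqft": 120.0, "age": 30.0},
    {"sqft": 200.0, "age": 5.0},
]
target = [2 * r["sqft"] - r["age"] for r in rows]

# Each entry plays the role of one collaborator's submitted patch.
submissions = [
    Feature("area", ["sqft"], lambda sqft: sqft),
    Feature("area_doubled", ["sqft"], lambda sqft: 2 * sqft),  # redundant
    Feature("age", ["age"], lambda age: age),
    Feature("constant", ["sqft"], lambda sqft: 1.0),           # uninformative
]

accepted = evaluate_stream(submissions, rows, target)
pipeline = build_pipeline(accepted)
print([f.name for f in accepted])
print(pipeline(rows[:1]))
```

Keeping each feature as a small, independent object is what lets submissions be judged one at a time and then assembled mechanically; in this sketch the redundant and uninformative submissions are rejected without any change to the features already accepted.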

Author Information

Micah J Smith (MIT)
Kelvin Lu (MIT)
Kalyan Veeramachaneni (MIT)