A vector database for machine learning embeddings.
Featureform is a virtual feature store. It enables data scientists to define, manage, and serve their ML model’s features. Featureform sits atop your existing infrastructure and orchestrates it to work like a traditional feature store.
By using Featureform, a data science team can solve the following organizational problems:
Use your existing data infrastructure. Featureform does not replace your existing infrastructure. Rather, it turns that infrastructure into a feature store. Because Featureform is infrastructure-agnostic, teams can pick the right data infrastructure for their processing problems while Featureform provides a feature store abstraction above it. Featureform orchestrates and manages transformations rather than computing them itself; the computations are offloaded to the organization’s existing data infrastructure. In this way, Featureform is more akin to a framework and workflow than an additional piece of data infrastructure.
Designed for both single data scientists and large enterprise teams. Whether you’re a single data scientist or part of a large enterprise organization, Featureform allows you to document and push your transformation, feature, and training set definitions to a centralized repository. It works everywhere from a laptop to a large heterogeneous cloud deployment.
Native embeddings support. Featureform was built from the ground up with embeddings in mind. It supports vector databases as both inference and training stores. Transformer models can be used as transformations, so that embedding tables can be versioned and reliably regenerated. We even created and open-sourced a popular vector database, Embeddinghub.
Open-source. Featureform is free to use under the Mozilla Public License 2.0.
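To make the point about orchestrating rather than computing more concrete, here is a minimal sketch in plain Python. It is not Featureform's API: `Orchestrator`, `register`, `materialize`, and `read` are hypothetical names chosen for illustration, and an in-memory SQLite database stands in for "your existing infrastructure". The orchestrator only records the transformation and hands the query to the provider, which does the actual work.

```python
import sqlite3

# Stand-in for existing data infrastructure: an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (user_id TEXT, price REAL)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [("a", 10.0), ("a", 30.0), ("b", 5.0)],
)


class Orchestrator:
    """Hypothetical orchestrator (not Featureform's API).

    It stores transformation definitions and delegates their execution
    to the provider connection it was given.
    """

    def __init__(self, provider):
        self.provider = provider
        self.transformations = {}

    def register(self, name, sql):
        # Only the definition is recorded here; nothing is computed yet.
        self.transformations[name] = sql

    def materialize(self, name):
        # The query runs inside the provider (SQLite), not in this class.
        sql = self.transformations[name]
        self.provider.execute(f"CREATE TABLE {name} AS {sql}")

    def read(self, name):
        # Fetch the materialized (entity, value) rows back as a dict.
        rows = self.provider.execute(f"SELECT * FROM {name}").fetchall()
        return dict(rows)


orc = Orchestrator(conn)
orc.register(
    "avg_purchase_price",
    "SELECT user_id, AVG(price) AS value FROM purchases GROUP BY user_id",
)
orc.materialize("avg_purchase_price")
features = orc.read("avg_purchase_price")
```

Swapping SQLite for another engine would only change the provider object in this sketch; the registered definition stays the same, which is the sense in which the abstraction sits above the infrastructure.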
In reality, a feature’s definition is split across different pieces of infrastructure: the data source, the transformations, the inference store, the training store, and all their underlying data infrastructure. However, a data scientist thinks of a feature in its logical form, something like “a user’s average purchase price”. Featureform allows data scientists to define features in their logical form through transformations, providers, labels, and training set resources. Featureform then orchestrates the actual underlying components to reach the data scientist’s desired state.
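As a rough illustration of the gap between a logical definition and the components behind it, the sketch below declares "a user's average purchase price" once and then expands it into the steps each underlying component would need to perform. Everything here is an assumption made for illustration: the `Feature`, `Label`, and `TrainingSet` classes, the `plan` function, and the store names are hypothetical and are not taken from Featureform's SDK.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Feature:
    name: str             # logical name, e.g. "avg_purchase_price"
    entity: str           # what the feature describes, e.g. "user"
    transformation: str   # name of the transformation that produces it
    inference_store: str  # hypothetical store used for online serving


@dataclass(frozen=True)
class Label:
    name: str
    entity: str


@dataclass(frozen=True)
class TrainingSet:
    name: str
    label: Label
    features: Tuple[Feature, ...]
    training_store: str   # hypothetical store used for offline training


def plan(ts: TrainingSet) -> List[str]:
    """Expand a logical definition into ordered steps on the components."""
    steps = []
    for f in ts.features:
        steps.append(f"run transformation '{f.transformation}'")
        steps.append(f"write feature '{f.name}' to '{f.inference_store}'")
    names = ", ".join(f.name for f in ts.features)
    steps.append(
        f"join [{names}] with label '{ts.label.name}' "
        f"into '{ts.training_store}'"
    )
    return steps


avg_price = Feature(
    name="avg_purchase_price",
    entity="user",
    transformation="average_user_purchase",
    inference_store="example-inference-store",
)
is_fraud = Label(name="is_fraud", entity="user")
fraud_training = TrainingSet(
    name="fraud_training",
    label=is_fraud,
    features=(avg_price,),
    training_store="example-training-store",
)

steps = plan(fraud_training)
for step in steps:
    print(step)
```

The data scientist only writes the declarations at the bottom; the list returned by `plan` stands in for the work an orchestrator would carry out to reach that desired state.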
Featureform can be run locally on files or in Kubernetes with your existing infrastructure.
Featureform on Kubernetes can connect to your existing cloud infrastructure and can also be run locally on Minikube. To see how to run it in the cloud, follow our Kubernetes deployment guide.
To try Featureform in a single Docker container, follow our Docker quickstart guide.
Please help us by reporting any issues you may have while using Featureform.