
Feature Store

Definition

A feature store is a centralized repository for storing, managing, and serving ML features (the processed input variables used for model training and inference), ensuring consistency between training and production environments.

Why It Matters

Feature engineering is often the highest-impact work in ML, yet teams repeatedly rebuild the same features from scratch. A feature store solves this by making features reusable, discoverable, and consistently computed across training and serving.

The training-serving skew problem is particularly nasty. Your model trains on features computed one way in a batch job, but serves predictions using features computed differently in real time. The model then behaves differently in production than it did in training, and debugging the discrepancy is painful.
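A hypothetical sketch of how skew creeps in: the table, column, and function names below are made up, and the two code paths quietly disagree about the time window and about refunds.

```python
# Hypothetical illustration of skew: the "same" feature implemented twice.

# Batch pipeline (training data): 30-day average order value, computed in SQL.
TRAINING_FEATURE_SQL = """
SELECT user_id, AVG(order_value) AS avg_order_value
FROM orders
WHERE order_ts >= CURRENT_DATE - INTERVAL '30' DAY
GROUP BY user_id
"""

# Serving path: reimplemented in application code, but this version only
# looks at the last 7 orders and skips refunds -- silent skew.
def avg_order_value(recent_orders: list[dict]) -> float:
    kept = [o["value"] for o in recent_orders[-7:] if not o.get("refunded")]
    return sum(kept) / len(kept) if kept else 0.0
```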

For AI engineers working with LLMs, feature stores become relevant when building hybrid systems. Combining LLM outputs with traditional ML features (user behavior, product attributes, historical patterns) requires consistent feature management.
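As a sketch of such a hybrid system, the snippet below merges online-store features with an LLM-derived signal into one model input. The store and LLM interfaces are assumptions expressed as Protocols, not any particular library's API, and the feature names are illustrative.

```python
from typing import Protocol


class OnlineStore(Protocol):
    """Assumed interface: fetch latest feature values for an entity."""
    def get(self, entity: dict, names: list[str]) -> dict: ...


class LLMClient(Protocol):
    """Assumed interface: score free text with an LLM."""
    def score_sentiment(self, text: str) -> float: ...


def build_model_input(store: OnlineStore, llm: LLMClient,
                      user_id: str, product_id: str, review_text: str) -> dict:
    # Traditional ML features served from the online store.
    features = store.get(
        entity={"user_id": user_id, "product_id": product_id},
        names=["user_7d_click_rate", "product_avg_rating"],
    )
    # LLM-derived signal computed at request time, merged into the same vector.
    features["review_sentiment"] = llm.score_sentiment(review_text)
    return features
```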

Implementation Basics

Feature stores typically have two components:

1. Offline Store: Batch storage for training data. Contains historical feature values for building training datasets. Usually backed by data warehouses (BigQuery, Snowflake) or object storage (S3, GCS). Used for: model training, batch predictions, historical analysis.

2. Online Store: Low-latency storage for real-time serving. Contains the latest feature values for each entity (user, product, transaction). Usually backed by Redis, DynamoDB, or specialized feature-serving databases. Used for: real-time predictions during inference (a retrieval sketch for both stores follows).
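Here is roughly what retrieval from the two stores looks like with Feast's Python SDK. The feature view (`user_stats`), entity key (`user_id`), and feature names are assumptions, and a Feast repository defining them must already exist in the working directory.

```python
from datetime import datetime

import pandas as pd
from feast import FeatureStore  # pip install feast

# Assumes a Feast repo with a "user_stats" feature view keyed by "user_id".
store = FeatureStore(repo_path=".")

# Offline store: build a point-in-time correct training set.
entity_df = pd.DataFrame({
    "user_id": [1001, 1002],
    "event_timestamp": [datetime(2024, 5, 1), datetime(2024, 5, 3)],
})
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=["user_stats:avg_order_value", "user_stats:purchase_count"],
).to_df()

# Online store: fetch the latest values at inference time.
online_features = store.get_online_features(
    features=["user_stats:avg_order_value", "user_stats:purchase_count"],
    entity_rows=[{"user_id": 1001}],
).to_dict()
```

The same feature references drive both calls: get_historical_features joins against the offline store as of each training timestamp, while get_online_features reads the latest values from the online store.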

Key capabilities to implement:

  • Feature definitions: Declarative specs for how features are computed
  • Point-in-time joins: Prevent data leakage by joining features with the correct timestamps (sketched after this list)
  • Versioning: Track feature definition changes over time
  • Monitoring: Detect drift between training and serving distributions
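Feature stores implement point-in-time joins for you, but the idea is easy to sketch in pandas (column names below are illustrative): each label row is matched to the most recent feature value at or before its own timestamp, never a later, leaky one.

```python
import pandas as pd

# Labels: each row is a prediction target at a specific point in time.
labels = pd.DataFrame({
    "user_id": [1, 1, 2],
    "event_timestamp": pd.to_datetime(["2024-05-01", "2024-05-10", "2024-05-07"]),
    "converted": [0, 1, 1],
})

# Feature snapshots: values as they were known at each computation time.
features = pd.DataFrame({
    "user_id": [1, 1, 2],
    "feature_timestamp": pd.to_datetime(["2024-04-28", "2024-05-05", "2024-05-06"]),
    "avg_order_value": [42.0, 55.0, 17.5],
})

# Point-in-time join: for each label, take the most recent feature value
# at or before the label's timestamp (merge_asof requires sorted keys).
training_df = pd.merge_asof(
    labels.sort_values("event_timestamp"),
    features.sort_values("feature_timestamp"),
    left_on="event_timestamp",
    right_on="feature_timestamp",
    by="user_id",
    direction="backward",
)
```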

Popular options: Feast (open source), Tecton, Amazon SageMaker Feature Store, Databricks Feature Store.

For most AI engineering work, start with simple feature computation in code. Adopt a feature store when you have multiple models sharing features, strict consistency requirements, or team collaboration needs.
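A minimal version of "feature computation in code" is a single shared function imported by both the training pipeline and the serving endpoint, so the computation cannot drift. The function and field names below are illustrative.

```python
# features.py -- one shared definition used by training and serving alike.
def order_features(orders: list[dict], window_days: int = 30) -> dict:
    """Compute order features from a user's order history (illustrative)."""
    values = [o["value"] for o in orders if o["age_days"] <= window_days]
    return {
        "order_count_30d": len(values),
        "avg_order_value_30d": sum(values) / len(values) if values else 0.0,
    }
```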

Source

Feature stores provide a central hub for feature management, enabling feature sharing, versioning, and consistent serving for both training and inference.

https://www.featurestore.org/