@Checkpoint annotation enables automatic persistence of MongoDB Change Stream resume tokens. When placed on a class alongside @ChangeStream, the framework saves the resume token after processing events and restores it on restart — so your stream picks up exactly where it left off.
Basic Usage
Attributes
| Attribute | Type | Default | Description |
|---|---|---|---|
saveEveryN | int | 1 | Save a checkpoint every N successfully processed events |
saveIntervalSeconds | int | 5 | Periodic checkpoint interval in seconds (heartbeat). Set to 0 to disable |
startPosition | StartPosition | RESUME | Where to start consuming when the stream starts |
onHistoryLost | OnHistoryLost | FAIL | Strategy when the saved resume token has expired from the oplog |
Checkpoints are stored in the
_fw_checkpoints collection in your MongoDB database. The collection name is not configurable.saveEveryN
Controls how often checkpoints are saved based on event count. Setting this to a higher value reduces write pressure on MongoDB at the cost of potentially replaying more events after a crash.
saveIntervalSeconds
A periodic timer that saves the latest resume token at the specified interval, even when no events are arriving. This acts as a heartbeat, ensuring the checkpoint stays fresh for idle streams.
startPosition
Determines where the stream starts consuming when it is first created or when no checkpoint exists.
| Value | Description |
|---|---|
RESUME | Resume from the last persisted checkpoint (event-based or heartbeat). If no checkpoint exists (first-ever start, or saveIntervalSeconds = 0 with no events processed yet), starts from now. (default) |
LATEST | Ignore any existing checkpoint and start from now. Useful to skip a backlog after a long outage. |
onHistoryLost
Determines the recovery strategy when the saved resume token has expired from the MongoDB oplog. This happens when a stream is stopped for longer than the oplog retention window.
| Strategy | Events in gap | Behavior |
|---|---|---|
FAIL | Unknown | Stream refuses to start. Throws HistoryLostException. Operator must intervene. (default) |
RESUME_FROM_OPLOG_START | Partially lost | Resumes from the oldest available oplog entry. Falls back to RESUME_FROM_NOW if the oplog is inaccessible. |
RESUME_FROM_NOW | All lost | Starts from the current moment. The entire gap is skipped — no replay. |
HistoryLostException
HistoryLostException
When The exception message includes the stream name, the last checkpoint timestamp, and suggestions for recovery strategies.
onHistoryLost = FAIL, the framework throws a HistoryLostException with an actionable message:Comprehensive Example
How It Works
FlowWarden uses a dual-token checkpoint model internally:| Token | Description |
|---|---|
lastSeenToken | Resume token of the last event received from MongoDB |
lastProcessedToken | Resume token of the last event successfully handled |
lastProcessedToken, ensuring the event is re-delivered.
Interaction with @Pipeline
Interaction with @Pipeline
When a
@Pipeline is present, it filters events server-side, creating a gap between the actual oplog position and the last event your handler processed. The lastSeenToken tracks the oplog position independently (via periodic sampling), so on restart, MongoDB resumes from the most recent position — not from the last processed event.Without this mechanism, a restart would force MongoDB to re-scan the entire oplog from lastProcessedToken, potentially scanning tens of thousands of filtered-out events.See the @Pipeline reference for details.Checkpoint Storage (SPI)
TheCheckpointStore interface is the SPI that backs @Checkpoint. The MongoDB implementation is auto-configured, but you can provide your own by registering a CheckpointStore bean:
| Implementation | When active |
|---|---|
MongoCheckpointStore | Imperative mode (flowwarden.default-mode=IMPERATIVE) |
ReactiveMongoCheckpointStore | Reactive mode (flowwarden.default-mode=REACTIVE) |
_fw_checkpoints collection with the same document format, so switching between imperative and reactive mode is transparent.
Checkpoint record structure
Checkpoint record structure
Best Practices
- Keep
saveEveryN = 1for critical streams (e.g., order processing, billing) to minimize event replay on crash. - Increase
saveEveryNfor high-throughput streams where replaying a few events is acceptable — this reduces checkpoint write pressure. - Do not disable
saveIntervalSecondsunless you have a specific reason. The heartbeat checkpoint ensures idle streams stay recoverable. - Use
startPosition = LATESTonly for streams where historical events are irrelevant (e.g., cache invalidation).
See Also
Checkpoint & Resume Guide
Step-by-step guide to configuring checkpoint and resume behavior
@ChangeStream
The main annotation for declaring Change Stream handlers
@Pipeline
Server-side filtering and its interaction with dual-token checkpointing
@RetryPolicy
Configure retry behavior for failed event processing

