@Pipeline defines a server-side aggregation pipeline sent to MongoDB when the Change Stream starts. Only matching events are transmitted over the network, reducing traffic and application load. The pipeline is evaluated once at stream startup — not per event.

Basic Usage

Annotate a method in your @ChangeStream class. At most one @Pipeline method per class. The method must take no parameters.
@ChangeStream(collection = "orders")
public class OrderHandler {

    @Pipeline
    List<Bson> pipeline() {
        return List.of(
            Aggregates.match(Filters.in("fullDocument.status", "PAID", "SHIPPED", "CANCELLED"))
        );
    }

    @OnInsert
    void handle(ChangeStreamContext<Order> ctx) {
        // only receives events where status is PAID, SHIPPED, or CANCELLED
    }
}

Supported Return Types

Return type                | Description
List<Bson>                 | Raw MongoDB BSON pipeline stages
List<AggregationOperation> | Spring Data aggregation operations
Aggregation                | Spring Data Aggregation object
// From PipelineBsonCapture sample
@Pipeline
List<Bson> pipeline() {
    return List.of(
        Aggregates.match(Filters.in("operationType", "insert", "update"))
    );
}
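The Spring Data return types express the same match with AggregationOperation objects. A minimal sketch, inside a @ChangeStream class (Aggregation.match returns a MatchOperation, which implements AggregationOperation):

```java
// Sketch: the same operation-type match, using the Spring Data variant.
@Pipeline
List<AggregationOperation> pipeline() {
    return List.of(
        Aggregation.match(Criteria.where("operationType").in("insert", "update"))
    );
}
```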

Combining with @Filter

@Pipeline can coexist with @Filter on the same @ChangeStream, forming a double funnel: pre-filter server-side, then refine with Java logic. The following example, from the PipelineFilterCapture sample, pre-filters by operation type server-side and then checks the order status application-side:
@ChangeStream(documentType = Order.class, fullDocument = FullDocumentMode.UPDATE_LOOKUP)
public class PipelineFilterCapture {

    @Pipeline
    Aggregation pipeline() {
        return Aggregation.newAggregation(
            Aggregation.match(Criteria.where("operationType").in("insert", "update"))
        );
    }

    @Filter
    Predicate<ChangeStreamContext<?>> filter() {
        return ctx -> ((ChangeStreamContext<Order>) ctx).getFullDocument(Order.class)
            .map(order -> "CONFIRMED".equals(order.getStatus()))
            .orElse(false);
    }

    @OnInsert
    void onInsert(Order doc, ChangeStreamContext<Order> ctx) { ... }
}
Use @Pipeline to eliminate events you never need (reduce network traffic), then @Filter for logic that requires Spring beans, external service calls, or dynamic conditions.

Checkpoint Interaction

When @Pipeline is present, dualCheckpoint=true (the default) is essential for fast restart.

The pipeline filters events server-side, creating a gap between the oplog position and the last event your handler processed (lastProcessedToken). Without dual checkpoint, a restart forces MongoDB to re-scan the entire oplog from lastProcessedToken — potentially tens of thousands of filtered events.

With dualCheckpoint=true, FlowWarden maintains a separate lastSeenToken (advanced via oplog sampling every 5 seconds) that tracks the actual oplog position, regardless of which events matched the pipeline.

A warning is logged at startup if @Pipeline is present with dualCheckpoint=false.
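Assuming dualCheckpoint is an attribute on @ChangeStream (verify the flag's exact location against your FlowWarden version), the recommended pairing looks like:

```java
// Sketch: keep dual checkpoint enabled (the default) whenever a
// @Pipeline is present. The attribute placement is an assumption.
@ChangeStream(collection = "orders", dualCheckpoint = true)
public class OrderHandler {

    @Pipeline
    List<Bson> pipeline() {
        return List.of(
            Aggregates.match(Filters.eq("fullDocument.status", "PAID"))
        );
    }
}
```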

Oplog Internals and Limitations

Change Streams Cannot Use Indexes

MongoDB does not support creating indexes on the oplog. All Change Stream pipelines perform a sequential scan (COLLSCAN) of the oplog to evaluate your $match filters. This is a fundamental MongoDB limitation, not a FlowWarden one.

Pushdown-Eligible Fields

Not all filters are equal. Empirical testing by the MongoDB community (this behavior is not officially documented by MongoDB) suggests that only a handful of fields allow optimized evaluation during the oplog scan:
Field                     | Pushdown
operationType             | Yes
ns.db                     | Yes
ns.coll                   | Yes
documentKey._id           | Yes
fullDocument.*            | No — forces full deserialization
updateDescription.*       | No — forces full deserialization
Computed or nested fields | No — forces full deserialization
Filtering on fullDocument fields (like fullDocument.status) forces MongoDB to deserialize and evaluate every oplog entry. This matters most during restarts, when MongoDB must scan the oplog forward from the resume token.

Best Practice: Layer Your Filters

Put pushdown-eligible filters first in your pipeline to reduce the number of entries that require full deserialization:
@Pipeline
List<Bson> pipeline() {
    return List.of(
        // Stage 1: pushdown-eligible — fast
        Aggregates.match(Filters.in("operationType", "insert", "update")),
        // Stage 2: requires deserialization — but fewer entries reach this stage
        Aggregates.match(Filters.eq("fullDocument.status", "PAID"))
    );
}

Performance Characteristics

  • Normal operation: No performance concern. MongoDB evaluates the pipeline on each new oplog entry as it arrives — there is no bulk scan of historical data.
  • At restart: MongoDB scans the oplog from the resume token forward. If the pipeline filters out most events, this scan can be slow — this is why dualCheckpoint=true exists.
  • Oplog retention: The oplog is a capped collection with a fixed size. If the application is down longer than the oplog retention window, the resume token expires and the stream cannot resume. Use @Checkpoint(onHistoryLost = ...) to handle this case.

Allowed Aggregation Stages

MongoDB restricts which aggregation stages can be used in a Change Stream pipeline:
Stage                       | Allowed
$match                      | Yes
$project / $addFields       | Yes
$replaceRoot / $replaceWith | Yes
$redact                     | Yes
$set / $unset               | Yes
$group                      | No
$sort                       | No
$limit / $skip              | No
$lookup                     | No
$out / $merge               | No
Using an unsupported stage causes a MongoDB error at stream startup (fail-fast).
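Beyond $match, an allowed $project stage can also shrink each transmitted event. The following sketch (the projected field names are illustrative) matches on a pushdown-eligible field first, then ships only the fields the handler needs; note that change stream events must keep their _id (the resume token), which Projections.include preserves by default:

```java
// Sketch: $match on a pushdown-eligible field, then $project to
// trim the payload. The projected field names are illustrative.
@Pipeline
List<Bson> pipeline() {
    return List.of(
        Aggregates.match(Filters.in("operationType", "insert", "update")),
        Aggregates.project(Projections.include(
            "operationType", "documentKey", "fullDocument.status"))
    );
}
```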

Constraints

Constraint     | Detail
One per class  | At most one @Pipeline method per @ChangeStream class
No parameters  | The method must take zero arguments
Evaluated once | The pipeline is resolved at stream startup, not per event
Immutable      | Changing the pipeline requires restarting the stream

See Also

@Filter

Application-side Java predicate filtering

Filtering Events

When to use @Pipeline vs @Filter

Checkpoint & Resume

How dual checkpoint interacts with @Pipeline

How it Works

Full event processing pipeline architecture