
Batch - Streaming - Micro-Batch

Batch Processing

Batch means collect first, process later.

  • Works on large chunks of accumulated data
  • High throughput; usually cheaper and simpler to run than streaming
  • Results are not real-time
  • Results typically lag by minutes, hours, or days

Examples:

  • Daily or weekly sales reports
  • End-of-day stock portfolio reconciliation
  • Monthly billing cycles
  • ETL pipelines that refresh a data warehouse

When to use

  • Immediate action is not required
  • A delay of minutes or more is acceptable
  • You are working with large historical datasets
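
As a minimal sketch of "collect first, process later", the snippet below totals a day's accumulated sales in a single pass. The function name `run_daily_sales_batch`, the store names, and the amounts are hypothetical and only illustrate the pattern; a real job would read the accumulated records from files or a warehouse table.

```python
from collections import defaultdict


def run_daily_sales_batch(records):
    """Process all accumulated (store, amount) records in one pass."""
    totals = defaultdict(float)
    for store, amount in records:
        totals[store] += amount
    return dict(totals)


# Hypothetical data: records pile up during the day and are processed afterwards.
accumulated = [("north", 120.0), ("south", 80.0), ("north", 30.5)]
report = run_daily_sales_batch(accumulated)
print(report)
```

Nothing is produced until the whole input is available, which is why results lag behind the events.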

Stream Processing

Streaming means process events the moment they arrive.

  • Low-latency (milliseconds to seconds)
  • Continuous, event-by-event processing
  • Ideal for real-time analytics and alerting
  • Often stateful: the system keeps event history or running context (for example, counters or windows) between events

Examples:

  • Stock price updates
  • Fraud detection for credit cards
  • Real-time gaming leaderboards
  • IoT sensor monitoring

When to use

  • You need instant reactions
  • Delays cause risk, loss, or bad UX
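
The event-by-event, stateful style described above can be sketched roughly as follows. The class `RunningSpendMonitor`, the spend limit, and the card events are hypothetical; a production system would receive events from a message broker rather than a Python list, and real fraud rules are far more involved.

```python
from collections import defaultdict


class RunningSpendMonitor:
    """Keeps running spend per card and reacts to each event as it arrives."""

    def __init__(self, limit):
        self.limit = limit
        self.totals = defaultdict(float)  # state kept between events

    def on_event(self, card_id, amount):
        """Handle one event immediately; return True if it should raise an alert."""
        self.totals[card_id] += amount
        return self.totals[card_id] > self.limit


monitor = RunningSpendMonitor(limit=1000.0)
# Hypothetical event stream: each event is handled the moment it is seen.
for card_id, amount in [("c1", 400.0), ("c2", 50.0), ("c1", 700.0)]:
    if monitor.on_event(card_id, amount):
        print(f"ALERT: {card_id} exceeded limit")
```

The decision is made per event, using only the state accumulated so far, instead of waiting for a batch to fill.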

Micro-Batch Processing

Micro-batching means small batches processed very frequently.

  • Latency: roughly 0.5 seconds to a few seconds
  • Not true real-time, but close
  • Simpler to build and operate than full streaming
  • Common in systems like Spark Structured Streaming, which runs queries as micro-batches by default

In short: micro-batching is batch pretending to be streaming.
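
To make the idea concrete, here is a small sketch that groups timestamped events into fixed-length windows and hands each window over as one small batch. The helper `micro_batches`, the 1-second interval, and the sample events are assumptions for illustration; this is not how any particular engine implements micro-batching.

```python
def micro_batches(events, interval_s=1.0):
    """Group (timestamp, payload) events into consecutive fixed-length windows.

    Assumes events are sorted by timestamp. Empty windows are skipped.
    """
    batch, window_end = [], None
    for ts, payload in events:
        if window_end is None:
            window_end = ts + interval_s  # first window starts at the first event
        while ts >= window_end:
            if batch:
                yield batch
                batch = []
            window_end += interval_s
        batch.append(payload)
    if batch:
        yield batch


# Hypothetical events as (seconds, payload) pairs.
events = [(0.0, "a"), (0.4, "b"), (1.1, "c"), (3.2, "d")]
for small_batch in micro_batches(events):
    print(small_batch)
```

Each small batch is then processed with ordinary batch logic, which is why latency is bounded below by the interval length.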


Examples

Fraud Detection (Streaming)

  • Decision must be immediate
  • Millisecond latency required
  • Delay = financial loss

Payment Posting (Micro-Batch)

  • Small delay is acceptable
  • Updates can lag slightly
  • No immediate risk

Monthly Statements (Batch)

  • No urgency
  • Process large volumes at once
  • Cost-efficient

STREAMING     > Event         > Process > Output   (ms latency)
MICRO-BATCH   > Small windows > Process            (seconds)
BATCH         > Accumulate    > Process            (minutes+)
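
The comparison above can be turned into a rough rule of thumb. In the sketch below, the function name `choose_mode` and the cut-offs (0.5 seconds and 60 seconds) are assumptions loosely derived from the latency ranges in this page, not fixed rules; real decisions also weigh cost, data volume, and operational complexity.

```python
def choose_mode(max_acceptable_delay_s):
    """Suggest a processing mode from the delay the use case can tolerate.

    Thresholds are illustrative assumptions:
      under 0.5 s  -> streaming
      under 60 s   -> micro-batch
      otherwise    -> batch
    """
    if max_acceptable_delay_s < 0.5:
        return "streaming"
    if max_acceptable_delay_s < 60:
        return "micro-batch"
    return "batch"


# Hypothetical tolerances for the three examples in this section.
print(choose_mode(0.05))          # fraud detection
print(choose_mode(5))             # payment posting
print(choose_mode(30 * 24 * 3600))  # monthly statements
```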

#batch #streaming #kafka #realtime

Ver 2.1.1

Last change: 2026-04-08