Building Compose CI/CD Pipelines: Visual Regression and Performance Assurance

By Hang Li · DevOps Architect • Published: 2025-10-12 • Updated: 2025-10-24 • 14 min read

CI/CD • Testing • Performance

Once a codebase adopts Compose, traditional screenshot comparison and Espresso tests struggle to cover its declarative UI. The pipeline needs to be redesigned around a tiered testing matrix that combines fast feedback with deep benchmarking.

The pipeline should include static checks (ktlint, Detekt, Compose Compiler Metrics), visual regression (Paparazzi or Shot), performance benchmarks (Macrobenchmark), and accessibility validation (Accessibility Test Framework).

Pipeline Topology

  • Stage 1 - Static Analysis: Run `./gradlew lint ktlintCheck detekt` on every pull request and enable Compose Compiler Metrics output;
  • Stage 2 - Visual Regression: Have Paparazzi render Compose component snapshots and upload the diff images as build artifacts;
  • Stage 3 - Performance & Accessibility: Run Macrobenchmark and the accessibility scripts on a nightly schedule and sync the results to DataDog/Grafana;
  • Stage 4 - Deployment & Rollback: Build an internal beta and distribute it to QA and business stakeholders via Firebase App Distribution.
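
The split between PR-level and nightly stages can be expressed with workflow triggers. The fragment below is a minimal sketch, not a complete workflow: the cron time is an assumption, the `android-large` runner label is taken from the example in this article, and JDK/Gradle setup steps are omitted for brevity.

```yaml
# Sketch only: trigger layout for the staged pipeline.
on:
  pull_request:            # Stages 1-2 run on every PR
  schedule:
    - cron: '0 2 * * *'    # Stage 3 nightly at 02:00 UTC (time is an assumption)

jobs:
  compose-performance:
    # Skip the expensive benchmark job on PR runs.
    if: github.event_name == 'schedule'
    runs-on: android-large # assumed custom runner label with a device attached
    steps:
      - uses: actions/checkout@v4
      - run: ./gradlew :benchmark:connectedCheck
```

Gating the job with `if:` keeps a single workflow file while still letting PR runs finish quickly.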

GitHub Actions Example

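The following YAML snippet shows a typical GitHub Actions configuration. Paparazzi and Macrobenchmark are split into separate jobs so that each can be scaled or moved to a different runner independently; the `needs` keys gate each stage on the one before it.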

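jobs:
  compose-static-checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: '17'   # JDK version is an assumption; match your project
      # gradle-build-action's `arguments` input is deprecated; set up Gradle, then call the wrapper
      - uses: gradle/actions/setup-gradle@v4
      - run: ./gradlew lint ktlintCheck detekt

  compose-visual-regression:
    needs: compose-static-checks
    # Paparazzi renders on the JVM, so no emulator is needed
    runs-on: macos-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: '17'
      - uses: gradle/actions/setup-gradle@v4
      - run: ./gradlew verifyPaparazziDebug

  compose-performance:
    needs: compose-visual-regression
    # Custom runner label; connectedCheck needs an attached device or emulator
    runs-on: android-large
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: '17'
      - uses: gradle/actions/setup-gradle@v4
      - run: ./gradlew :benchmark:connectedCheck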
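
Stage 2 also calls for uploading the diff images. A possible way to do that, sketched here under a few assumptions, is an `actions/upload-artifact` step that runs only when the job has failed. The artifact name is hypothetical, the path assumes Paparazzi's default `build/paparazzi/failures` output directory in each module, and JDK/Gradle setup steps are omitted for brevity.

```yaml
jobs:
  compose-visual-regression:
    runs-on: macos-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./gradlew verifyPaparazziDebug
      # Runs only when an earlier step in this job failed,
      # for example when a snapshot no longer matches its golden image.
      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: paparazzi-failures            # artifact name is an assumption
          path: '**/build/paparazzi/failures/'
          if-no-files-found: ignore
```

Reviewers can then download the artifact from the failed run and inspect the diffs without rerunning the build locally.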

Visualization and Rollback Strategy

We inject the release version number and Git commit hash during App Startup and ship logs to ELK. Dashboards break down availability, dropped frames, and accessibility pass rates by component. When an anomaly appears, the rollback playbook lets us switch back to the last stable version quickly.

Note: We recommend pointing the Compose Compiler's `reportsDestination` at a dedicated CI artifact so that fluctuations in the `skippable` and `restartable` counts can be tracked over time.
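
One way to keep those reports around is to publish them as a per-commit artifact. The job below is a sketch under stated assumptions: the job name is hypothetical, the build is assumed to already write Compose Compiler reports to `build/compose_compiler` in each module (that directory is whatever `reportsDestination` points to in your Gradle configuration), and the 90-day retention is an example value.

```yaml
jobs:
  compose-compiler-reports:                   # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Assumes this build emits Compose Compiler reports as a side effect.
      - run: ./gradlew assembleRelease
      - uses: actions/upload-artifact@v4
        with:
          # One artifact per commit so runs can be compared later.
          name: compose-compiler-reports-${{ github.sha }}
          path: '**/build/compose_compiler/'  # assumed reportsDestination
          retention-days: 90                  # example value
```

For tracking beyond the artifact retention window, the same files could be pushed to external storage instead.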