What is test-as-code?

Diego Salinas
Enterprise Content Manager

What is test as code?

Test as code is the practice of writing, managing, and executing performance tests using real programming languages and standard software development workflows instead of GUI-based tools or proprietary formats. It treats test scenarios as first-class code artifacts — version-controlled, peer-reviewed, and automated alongside the application they validate.

For engineering teams shipping software continuously, this approach solves a persistent problem: performance testing that lives outside the development workflow. When tests are locked in GUI tools or XML configurations, they're hard to version, hard to review, and easy to forget. Test as code brings performance validation into the same loop as your application code — same repo, same pipelines, same review process.

Gatling was built around this idea from day one. As a Continuous Performance Intelligence Platform, Gatling lets you write load test scenarios in Java, Kotlin, Scala, JavaScript, and TypeScript — then run them in CI/CD, set assertions as guardrails, and scale to enterprise-grade distributed testing when you're ready.

Test as code, explained

At its core, test as code means your performance tests follow the same principles as your application code:

  • Version control — tests live in Git alongside the code they validate, so every change is tracked and reversible
  • Automation — tests run as part of CI/CD pipelines, not as manual one-off exercises
  • Collaboration — teammates review test logic through pull requests, catching issues before they merge
  • Reproducibility — anyone on the team can clone the repo and run the exact same test with the exact same parameters

This is a fundamentally different model from GUI-based tools like JMeter, where test plans are stored as XML files that are difficult to diff, merge, or review. If you've ever tried to resolve a merge conflict in a JMeter .jmx file, you know the pain.

Here's a quick comparison:

Code-based vs GUI-based load testing

| | Code-based (Gatling) | GUI-based (JMeter) |
| --- | --- | --- |
| Test format | Real code: Java, Kotlin, JavaScript, and more | XML configurations |
| Version control | Clean diffs and easy merges | XML diffs are noisy and error-prone |
| Code review | Standard pull request workflows | Impractical for XML blobs |
| CI/CD integration | Native: runs like any build step | Requires extra tooling and plugins |
| IDE support | Full autocompletion, debugging, and refactoring | Separate GUI application |
| Reusability | Functions, classes, and shared libraries | Copy-paste between test plans |
| Learning curve | Requires programming knowledge | Lower initial barrier, higher long-term ceiling |

The tradeoff is real: GUI tools have a gentler on-ramp. But as teams scale and testing becomes a continuous activity rather than a periodic event, the benefits of code-based approaches compound.

You can read more about how this comparison plays out in practice in our Gatling vs JMeter breakdown.

Why engineering teams adopt test as code

The shift toward code-based testing isn't happening in isolation. It's part of a broader movement in how engineering organizations think about quality and performance.

Shift-left performance testing

Shift-left testing — moving validation earlier in the development lifecycle — is one of the most prominent trends in software testing heading into 2026, according to industry analyses from Xray, Parasoft, and others. Instead of discovering performance issues in staging or production, teams catch them during development, right alongside unit and integration tests.

Shift-left performance testing means developers own performance from the start, running load tests locally or in CI before code reaches shared environments.

Test as code makes this practical. When your load test is a Java class or a TypeScript file in the same repo as your application, there's no friction to running it early and often.

CI/CD as the backbone

Modern teams don't ship code manually, and they shouldn't test performance manually either. Test as code lets you embed load testing into CI/CD pipelines as a standard build step. Every pull request, every deploy, every release candidate gets a performance check — automatically.

This matters at scale. According to the World Quality Report 2025-26, 94% of organizations now review real production data to inform their testing strategy. That feedback loop only works when performance tests run continuously, not quarterly.

Collaboration through code review

When tests are code, they benefit from the same review culture as application code. A senior engineer can spot an unrealistic ramp-up pattern in a PR. A teammate can suggest a better assertion threshold. The test logic becomes a shared artifact that the whole team understands and improves over time.

Maintainability at scale

According to TestGrid's 2026 industry analysis, the global software testing market is projected to grow from $48.17 billion in 2025 to $93.94 billion by 2030 (a 14.29% CAGR). As testing becomes a larger part of engineering budgets, maintainability matters more than ever.

Code-based tests are modular, refactorable, and composable — you can extract common patterns into shared libraries, organize scenarios by feature or team, and apply the same engineering standards you'd use anywhere else.

How Gatling implements test as code

Gatling's approach is built around three ideas: write in real languages, run in real pipelines, and assert on real metrics.

Write scenarios in real programming languages

Gatling supports five languages: Java, Kotlin, Scala, JavaScript, and TypeScript. You're not learning a DSL that only works inside Gatling — you're writing real code with full IDE support, autocompletion, and debugging.

Here's a basic Gatling simulation in Java that tests a web application's homepage and API endpoint:

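package simulations; // matches the simulations.BasicSimulation class name used in the CI step

import static io.gatling.javaapi.core.CoreDsl.*;
import static io.gatling.javaapi.http.HttpDsl.*;

import io.gatling.javaapi.core.*;
import io.gatling.javaapi.http.*;

public class BasicSimulation extends Simulation {

  // Shared HTTP settings for every request in this simulation
  HttpProtocolBuilder httpProtocol = http
      .baseUrl("https://your-app.example.com")
      .acceptHeader("application/json");

  // One user journey: load the homepage, pause 2 seconds, then call the API
  ScenarioBuilder scenario = scenario("Homepage and API")
      .exec(http("Load homepage").get("/"))
      .pause(2)
      .exec(http("Fetch users").get("/api/users"));

  {
    setUp(
        // Open model: start 100 users spread over 60 seconds
        scenario.injectOpen(rampUsers(100).during(60))
    ).protocols(httpProtocol)
     // percentile3 is the 95th percentile in Gatling's default configuration
     .assertions(global().responseTime().percentile3().lt(800));
  }
}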

This simulation ramps up 100 virtual users over 60 seconds, hits two endpoints, and asserts that the 95th percentile response time stays under 800ms. It's a regular Java class — you can organize it alongside your application tests, extract shared logic into utility classes, and refactor it with standard IDE tools.

Run tests like you run builds

Because Gatling tests are code, they fit naturally into CI/CD. You can trigger them on every push, every merge to main, or on a schedule. Here's what a Gatling step looks like in a GitHub Actions pipeline:

- name: Run Gatling load test
  run: |
    mvn gatling:test -Dgatling.simulationClass=simulations.BasicSimulation
  env:
    GATLING_ENTERPRISE_API_TOKEN: ${{ secrets.GATLING_TOKEN }}

- name: Publish JUnit results
  if: always()  # still publish the report when the load test step fails
  uses: dorny/test-reporter@v1
  with:
    name: Gatling Results
    path: target/gatling/*/js/assertions.xml
    reporter: java-junit

Gatling exports results in JUnit format, so your CI system can treat performance test failures the same as any other test failure. No special integrations needed — just standard load testing in CI/CD.

Assertions as performance guardrails

Assertions turn your tests from "generate some traffic and look at charts" into automated pass/fail gates. You define the performance thresholds your application must meet, and Gatling enforces them:

// 95th percentile response time under 500ms
global().responseTime().percentile3().lt(500)

// Error rate below 1%
global().failedRequests().percent().lt(1.0)

// Throughput above 1000 requests/second
global().requestsPerSec().gte(1000.0)

These assertions run as part of your build. If your app's p95 latency drifts above 500ms after a code change, the pipeline fails — before that regression reaches production.
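To make the pass/fail idea concrete, here is a minimal plain-Java sketch of what a percentile gate boils down to. It is not Gatling's implementation: the class name `LatencyGate`, its method names, and the nearest-rank percentile definition are all assumptions chosen for illustration.

```java
import java.util.Arrays;

/** Illustrative only: a hand-rolled percentile gate, not Gatling's internal logic. */
final class LatencyGate {

    /** Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it. */
    static long percentile(long[] samplesMs, double pct) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samplesMs.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct * sorted.length / 100.0); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Passes only when the 95th percentile is strictly below the threshold, like lt(...). */
    static boolean p95Below(long[] samplesMs, long thresholdMs) {
        return percentile(samplesMs, 95.0) < thresholdMs;
    }
}
```

A build step would call something like `LatencyGate.p95Below(samples, 500)` and fail when it returns `false`. Note that a handful of slow outliers above the 95th percentile do not trip the gate, which is usually what you want from a latency guardrail.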

Model real-world traffic, not synthetic loads

A test is only as useful as its resemblance to reality. Gatling gives you precise control over how virtual users behave, so your results reflect what actually happens in production.

Open and closed workload models

Gatling supports both open and closed workload models. Open models inject users at an arrival rate you define, regardless of how the system responds. This simulates real internet traffic, where new users keep arriving whether your server is keeping up or not. Closed models maintain a fixed number of concurrent users, which is better for capacity planning scenarios.

Choosing the right model matters. An open model will expose saturation points that a closed model masks, because a closed model slows its own request rate down whenever the system slows down. Most production web traffic follows an open model pattern, which is why the simulation above uses injectOpen.
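The difference is easier to see with numbers. The sketch below is a deliberately crude, one-second-tick simulation in plain Java, not Gatling code. The class `WorkloadModels`, the fixed server capacity, and the zero think time in the closed model are simplifying assumptions made for illustration.

```java
/** Illustrative only: a toy queue model contrasting open and closed workloads. */
final class WorkloadModels {

    /** Open model: arrivals keep coming at a fixed rate; returns the queue length after each second. */
    static int[] openModelQueue(int arrivalsPerSec, int capacityPerSec, int seconds) {
        int[] queue = new int[seconds];
        int waiting = 0;
        for (int t = 0; t < seconds; t++) {
            waiting += arrivalsPerSec;                     // new users arrive no matter what
            waiting -= Math.min(waiting, capacityPerSec);  // the server drains what it can
            queue[t] = waiting;
        }
        return queue;
    }

    /** Closed model: a fixed pool of users, each waiting for its response before sending again. */
    static int[] closedModelQueue(int users, int capacityPerSec, int seconds) {
        int[] queue = new int[seconds];
        int waiting = users;                               // every user starts with one request
        for (int t = 0; t < seconds; t++) {
            int served = Math.min(waiting, capacityPerSec);
            waiting -= served;
            queue[t] = waiting;                            // requests left unserved this second
            waiting += served;                             // served users immediately send again
        }
        return queue;
    }
}
```

With 120 arrivals per second against a capacity of 100 per second, `openModelQueue(120, 100, 5)` yields a queue of 20, 40, 60, 80, 100: the backlog grows without bound. `closedModelQueue(120, 100, 5)` stays at 20 every second, because users who are still waiting cannot send more requests. That bounded queue is exactly how a closed model can hide a saturated system.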

Data-driven testing with feeders

Hardcoded test data produces unrealistic cache hit ratios and database query patterns. Gatling's feeder system lets you parameterize requests with external data from CSV files, JSON files, or programmatic sources. Each virtual user gets different data, just like real users behave differently.

This approach lets you test with realistic distributions of product IDs, user credentials, search terms, and geographic data — so your results reflect actual usage patterns rather than synthetic best-case scenarios.
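As a rough mental model of what a feeder does, here is a small plain-Java sketch that hands out one record per call and wraps around at the end of the data. It is not Gatling's feeder implementation: the class `CircularFeeder`, the column names in the usage note, and the naive comma splitting (no quoting or escaping) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a tiny wrap-around feeder over CSV-like text. */
final class CircularFeeder implements Iterator<Map<String, String>> {

    private final List<Map<String, String>> records = new ArrayList<>();
    private int cursor = 0;

    /** Parses simple comma-separated text; the first line is the header row. */
    CircularFeeder(String csv) {
        String[] lines = csv.trim().split("\\R");
        String[] header = lines[0].split(",");
        for (int i = 1; i < lines.length; i++) {
            String[] cells = lines[i].split(",");
            Map<String, String> record = new LinkedHashMap<>();
            for (int c = 0; c < header.length; c++) {
                record.put(header[c].trim(), cells[c].trim());
            }
            records.add(record);
        }
        if (records.isEmpty()) {
            throw new IllegalArgumentException("feeder needs at least one record");
        }
    }

    @Override
    public boolean hasNext() {
        return true; // wraps around, so it never runs out
    }

    @Override
    public Map<String, String> next() {
        Map<String, String> record = records.get(cursor);
        cursor = (cursor + 1) % records.size();
        return record;
    }
}
```

Given the text `username,productId` followed by the rows `alice,42` and `bob,7`, the first virtual user would receive alice's record, the second bob's, and the third alice's again. Whether to wrap around, shuffle, or stop when the data runs out is a design choice that depends on whether your records can safely be reused.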

Test types for every scenario

Code-based tests make it straightforward to model different traffic patterns for different purposes:

  • Spike testing: sudden surges like a product launch or viral post, useful for testing auto-scaling and circuit breakers
  • Soak testing: sustained load over hours to expose memory leaks, connection pool exhaustion, and gradual degradation
  • Stress testing: gradually increasing load to find the breaking point, so you know your limits before users do
  • Volume testing: large data sets to validate database performance and storage behavior under realistic conditions
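The first three patterns differ mainly in the shape of the arrival rate over time, which can be sketched in plain Java as per-second target rates. This is not Gatling code: the class `LoadProfiles` and its parameters are hypothetical, and volume testing is left out because it varies the size of the data set rather than the traffic shape.

```java
import java.util.Arrays;

/** Illustrative only: per-second arrival rates for three common load shapes. */
final class LoadProfiles {

    /** Spike: a flat baseline with a short burst at a higher rate. */
    static int[] spike(int seconds, int baseRate, int spikeRate, int spikeStart, int spikeLength) {
        int[] rate = new int[seconds];
        for (int t = 0; t < seconds; t++) {
            boolean inSpike = t >= spikeStart && t < spikeStart + spikeLength;
            rate[t] = inSpike ? spikeRate : baseRate;
        }
        return rate;
    }

    /** Soak: the same moderate rate held for the whole (long) duration. */
    static int[] soak(int seconds, int rate) {
        int[] profile = new int[seconds];
        Arrays.fill(profile, rate);
        return profile;
    }

    /** Stress: the rate climbs in equal steps until it reaches the peak. */
    static int[] stress(int steps, int secondsPerStep, int peakRate) {
        int[] rate = new int[steps * secondsPerStep];
        for (int t = 0; t < rate.length; t++) {
            int step = t / secondsPerStep + 1;   // 1-based step index
            rate[t] = peakRate * step / steps;   // the last step equals peakRate
        }
        return rate;
    }
}
```

For example, `LoadProfiles.stress(4, 2, 200)` holds 50, 100, 150, and then 200 users per second for two seconds each. Keeping shapes like these as small functions makes it easy to reuse the same user journey under several different traffic patterns.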

You can learn more about these approaches in our "What is load testing?" guide and in the documentation on how to simulate real users.

Challenges and best practices

Test as code isn't without friction. Here's what teams run into and how to handle it.

The learning curve for non-developers

Not every tester writes code daily. Gatling addresses this with multiple on-ramps: the Gatling Recorder captures browser interactions and generates test scripts automatically. Gatling Studio provides a no-code visual builder for creating scenarios without writing a line. And if you already have API collections, Postman import converts them into Gatling simulations directly.

These tools lower the barrier without sacrificing the benefits of code — you can start visually and graduate to hand-written code as your comfort grows.

Test maintenance as codebases grow

Like any code, tests can become unwieldy. A few practices keep things manageable:

  • keep individual scenarios focused and readable — one scenario per user journey, not a monolithic test that covers everything
  • extract common request chains and configurations into shared modules
  • use descriptive naming that tells you what the test validates without reading the implementation
  • review test code with the same rigor as application code

Getting started friction

The first test is always the hardest. Gatling provides project templates for every supported language, VS Code integration for a familiar editing experience, and starter tutorials in the documentation. Start with a single endpoint, get the pipeline working, and expand from there.

Scale with Gatling Enterprise

Once you've outgrown what a single machine can simulate, Gatling Enterprise provides the infrastructure to run tests at scale without managing it yourself.

Distributed execution across regions

Gatling Enterprise runs load generators across multiple cloud regions simultaneously. You can simulate traffic from North America, Europe, and Asia in a single test — matching the geographic distribution of your actual users. This reveals latency issues and CDN behavior that single-region tests miss entirely.

Real-time dashboards and run trends

Advanced dashboards show response times, throughput, and error rates as the test runs. But the real value is in trends: comparing results across runs to catch gradual regressions that single-run analysis misses. When your p95 latency creeps up by 20ms over ten releases, you want to see that trend before it becomes a problem.
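Here is why run-over-run comparison alone is not enough, as a minimal plain-Java sketch. It is not how Gatling Enterprise computes trends: the class `TrendCheck`, the choice of the first run as the baseline, and the percentage tolerance are assumptions made for illustration.

```java
/** Illustrative only: two ways to flag a p95 regression across a series of runs. */
final class TrendCheck {

    /** True when the latest run is more than tolerancePct slower than the run just before it. */
    static boolean regressedVsPrevious(double[] p95PerRun, double tolerancePct) {
        int n = p95PerRun.length;
        if (n < 2) {
            return false; // nothing to compare against yet
        }
        return p95PerRun[n - 1] > p95PerRun[n - 2] * (1 + tolerancePct / 100.0);
    }

    /** True when the latest run is more than tolerancePct slower than the first (baseline) run. */
    static boolean regressedVsBaseline(double[] p95PerRun, double tolerancePct) {
        int n = p95PerRun.length;
        if (n < 2) {
            return false; // nothing to compare against yet
        }
        return p95PerRun[n - 1] > p95PerRun[0] * (1 + tolerancePct / 100.0);
    }
}
```

Take a p95 that starts at 200ms and grows by 2ms per release for ten releases, ending at 220ms. With a 5% tolerance, `regressedVsPrevious` never fires, because each step is only about 1% slower than the last. `regressedVsBaseline` does fire on the final run, since 220ms exceeds 200ms by more than 5%.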

Configuration as code

Enterprise configuration — test parameters, load injector setup, scheduling — is defined in YAML. This means your test infrastructure is versioned alongside your test code, reviewed in the same PRs, and reproducible across environments.

Observability integrations

Performance test results don't exist in isolation. Gatling Enterprise integrates with Datadog, Dynatrace, New Relic, and OpenTelemetry — so you can correlate load test metrics with application telemetry, infrastructure metrics, and traces in the tools your team already uses.

Security and governance

For enterprise teams, Gatling Enterprise includes SSO integration, role-based access control (RBAC), and audit logs. DevOps and platform teams get the controls they need to manage testing at organizational scale.

How Sophos made test as code work across teams

Sophos, a global cybersecurity company, uses Gatling Enterprise to decentralize performance testing across multiple development teams. Instead of routing all load tests through a single QA team, each team writes and runs their own Gatling tests within their CI/CD pipelines.

The result: performance issues get caught earlier, testing scales with the engineering organization, and teams own their performance outcomes directly. By embedding test-as-code practices into their development workflow, Sophos reduced the feedback loop from days to minutes.

Getting started with test as code

Here's a practical path from zero to automated performance testing:

  1. Pick your language: choose from Java, Kotlin, Scala, JavaScript, or TypeScript. Go with whatever your team already uses. VS Code support is built in for a familiar development experience.
  2. Write your first test: use the Gatling Recorder to capture a browser flow, import a Postman collection, start from a project template, or build visually with Gatling Studio. Don't aim for comprehensive coverage — start with one critical user journey.
  3. Codify expectations: add assertions for response time, error rate, and throughput. These turn your test into a pass/fail gate instead of a report you have to manually review.
  4. Automate in CI/CD: wire your test into your pipeline so it runs on every merge or deploy. Export results in JUnit format for native integration with your CI system. Check out our CI/CD best practices guide for pipeline examples.
  5. Scale with Enterprise: when you need distributed load generation, trend analysis, observability integrations, or team-level governance, Gatling Enterprise picks up where open source leaves off.

FAQ

What does “test-as-code” mean in load testing?

Test-as-code means defining test scenarios using programming languages rather than using graphical user interfaces. Instead of dragging and dropping test steps, you write them in code—allowing for version control, code reviews, automation, and reuse. In load testing, this makes it easier to model realistic traffic, simulate dynamic user journeys, and scale testing via CI/CD.

Why is Gatling considered a test-as-code tool?

Gatling was built from the ground up to support test-as-code. Its DSLs (in Java, Kotlin, Scala, JavaScript, and TypeScript) let you write load test scenarios directly in code. These scripts are just text files, easily maintained in Git, reviewed in pull requests, and triggered from your CI/CD workflows. This aligns perfectly with modern DevOps practices.

What are the benefits of using test-as-code with Gatling vs GUI-based tools?

Compared to GUI-based tools like JMeter, Gatling's code-centric model offers:

  • Easier collaboration via version control and code review
  • Precise modeling of user logic with loops, conditions, and data feeders
  • Seamless CI/CD integration and automated performance gating
  • Cleaner maintenance and easier refactoring

Legacy tools often suffer from brittle test plans, poor diffing in Git, and limited reusability. Gatling eliminates those pain points.

Can non-developers use Gatling if it's code-based?

Yes—Gatling provides a Recorder that converts user sessions into code, and Gatling Enterprise includes a no-code scenario builder that exports test definitions as code. This enables QA engineers and testers without deep programming skills to contribute to performance testing—while keeping code as the source of truth.
