How engineering leaders can embrace Continuous Testing to accelerate, not restrict, AI-powered teams.
Your team just shipped three major features this week, with the help of coding assistants that generated 30% of the new feature codebase – which was of course reviewed (meticulously) by the engineering team. Your infrastructure provisioned itself. Your deployment pipeline hummed along without intervention. Birds sang.
Then your phone buzzes at 2 AM. Production is down. A “simple” AI-generated refactor broke a critical integration that your test suite never caught.
AI tools let us build features faster, but our quality practices haven’t kept pace, which creates a bottleneck that at best slows delivery, and at worst compromises reliability.
The quality challenge in AI-accelerated development
Having spent over two decades in API development and testing, I’ve witnessed many technological shifts. AI-accelerated development is one of the most dramatic changes I’ve seen. But while AI tools are fundamentally transforming how we build software by automating routine tasks and accelerating coding, our quality practices remain anchored in pre-AI approaches.
We’ve seen customer teams follow the same consistent pattern: they report dramatic increases in coding velocity, but their existing testing practices and infrastructure struggle to keep pace. Without adapting their quality practices to the new AI-normal, many teams experience more production issues as they ship faster. Their traditional testing model breaks down when AI-generated code changes ripple across microservices and applications faster than manual reviews and testing can keep up.
The AI-optimists will tell you that “the models will get better and introduce fewer bugs over time.” While that is probably true, your end-users need things to work. Now.
A Continuous Testing approach for AI-powered velocity
Traditionally, automated test execution has been tightly coupled to CI/CD pipelines, but as organizations strive to shift left and migrate to cloud-native approaches for both applications and their delivery, testing is shifting to be continuous across the entire delivery pipeline.
This means:
- Local development environments allow for pre-commit/build test execution (see the sketch after this list).
- Ephemeral environments provisioned during the code-commit/pull-request process allow for testing before CI/CD pipelines kick in.
- Running tests as part of CI/CD (as before) helps ensure both the functional and non-functional compliance of components being built and deployed with corresponding requirements.
- Running tests on a schedule in pre-production environments (decoupled from CI/CD pipelines) ensures that nothing falls through the cracks as new components get updated independently by different teams.
- Running tests asynchronously in reaction to system or infrastructure events ensures that even minor configuration or code changes don’t compromise application performance or functionality.
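As a minimal sketch of the first point above, a pre-commit hook can run only the tests related to the files being committed. It assumes a pytest-based project and a simple src/ to tests/ naming convention; both are illustrative and should be adapted to your own layout.

```python
#!/usr/bin/env python3
"""Pre-commit sketch: run only the tests related to staged files.

Assumes a pytest project where src/foo/bar.py maps to tests/foo/test_bar.py;
this convention is an example, not a standard; adjust to your repo layout.
"""
import subprocess
import sys
from pathlib import Path


def staged_python_files() -> list[Path]:
    # Ask git which Python files are staged for the current commit.
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [Path(p) for p in out.splitlines() if p.endswith(".py")]


def matching_tests(changed: list[Path]) -> list[str]:
    # Map each changed source file to its test file, if one exists.
    tests = []
    for path in changed:
        if path.parts and path.parts[0] == "src":
            candidate = Path("tests", *path.parts[1:-1], f"test_{path.name}")
            if candidate.exists():
                tests.append(str(candidate))
    return tests


if __name__ == "__main__":
    tests = matching_tests(staged_python_files())
    if not tests:
        sys.exit(0)  # nothing test-relevant changed; let the commit through
    # Fail the commit if the targeted tests fail.
    sys.exit(subprocess.run(["pytest", "-q", *tests]).returncode)
```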
Let’s overlay some Continuous Testing best practices applicable to all of the above for the world of AI-generated code:
- Intelligent Test Orchestration – running the right tests for each change
- Real-time Test Scheduling – running tests as early as possible
- Parallelized Test Execution – running tests efficiently
- Centralized Test Observability – all test results in one place
- Testing as a Platform – testing is a platform capability
That gives us ORPOP – not the catchiest of acronyms – but don’t judge an acronym by its catchiness!
1. Intelligent test orchestration: Test what matters, when it matters
Not every code, deployment, or infrastructure change requires you to run every test – it requires you to run the right test(s) to ensure all ripple effects of the change are tested before going into production. And in this context, AI itself can help in targeting the right tests for each change:
- Code or configuration diff analysis determines which tests need to run based on the actual changes
- Historical failure patterns predict which tests are most likely to catch regressions
- Business impact scoring prioritizes critical user journeys during high-velocity periods
- Automated orchestration and execution of identified tests in applicable infrastructure
For test orchestration to be automated through AI, corresponding data must be made available to your AI agent(s) for analysis and action:
- Access to code repositories for both the components under test and the tests themselves
- A catalog of available automated tests labeled with metadata for targeting and prioritization
- Infrastructure events to understand changes to deployments and configurations
- Test observability (logs, results, artifacts) for previous test executions
The output of such an AI analysis is preferably some kind of test-execution plan in a declarative format that can be processed by corresponding tooling to run the actual tests for each iteration.
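The exact shape of that plan depends on your tooling; the sketch below models one as a plain Python structure that an AI analysis step could emit and a test runner could consume. The field names are illustrative, not an established schema.

```python
from dataclasses import dataclass, field


@dataclass
class TestSelection:
    """One entry in a declarative test-execution plan (illustrative fields)."""
    suite: str                      # e.g. "checkout-api/integration"
    reason: str                     # why the analysis selected it (diff, history, risk)
    priority: int = 3               # 1 = critical user journey, 5 = nice to have
    environment: str = "ephemeral"  # where the runner should execute it


@dataclass
class TestExecutionPlan:
    """The full plan for one code or infrastructure change."""
    change_ref: str                                    # commit SHA or PR identifier
    selections: list[TestSelection] = field(default_factory=list)


# A plan as an AI analysis step might emit it for a single pull request.
plan = TestExecutionPlan(
    change_ref="pr-1423",
    selections=[
        TestSelection("payments/unit", reason="files under src/payments changed", priority=1),
        TestSelection("checkout/e2e", reason="historically fails when payments changes", priority=2),
    ],
)
```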
2. Real-time scheduling: Testing that never sleeps
While traditional testing waits for “the right moment” (merge of code, before/after deployment, during the release candidate phase), continuous testing runs continuously, triggered by events rather than schedules or builds.
Pushing this further in the context of AI-generated code, one should strive for:
- Unit and Component Tests fire automatically on every commit, not just pull/merge requests
- Pull/Merge requests kick off E2E tests in dynamically provisioned (ephemeral) infrastructure instead of just unit/component tests.
- Application or Service deployments into testing and staging environments automatically kick off integration tests across dependent services
- Infrastructure updates trigger tests to validate compatibility with existing workloads
A genuine concern is increased resource cost and pipeline latency as more tests run, which is why aggressive real-time execution needs to be paired with Intelligent Test Orchestration and Parallel Processing.
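To make the event-driven part concrete, here is a minimal sketch of a handler that reacts to deployment and infrastructure events by kicking off matching test suites. The event shape, the suite names, and the run_suite helper are all assumptions standing in for your own event bus and test platform.

```python
import json


def run_suite(suite: str, target_env: str) -> None:
    # Placeholder for your test platform's trigger API (CLI call, HTTP request, etc.).
    print(f"triggering {suite} against {target_env}")


# Illustrative mapping from event types to the tests they should provoke.
REACTIONS = {
    "service.deployed":       ["integration/dependent-services"],
    "infrastructure.updated": ["smoke/platform-compat"],
    "config.changed":         ["regression/config-sensitive"],
}


def handle_event(raw: str) -> None:
    """Consume one event from a webhook or message bus and kick off matching tests."""
    event = json.loads(raw)
    for suite in REACTIONS.get(event["type"], []):
        run_suite(suite, target_env=event.get("environment", "staging"))


# Example: a deployment event for the payments service landing in staging.
handle_event('{"type": "service.deployed", "environment": "staging", "service": "payments"}')
```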
3. Parallel processing: Breaking the sequential bottleneck
Traditional CI/CD runs tests sequentially, where integration tests wait for unit tests, performance tests wait for functional tests, etc. While there can certainly be sequential dependencies when running individual tests, parallelization should be harnessed aggressively to cut down on overall test execution times.
Key strategies:
- Service-level parallelization: Each microservice runs its test suite independently.
- Test-level parallelization: Unit, integration, and performance tests are parallelized/sharded across multiple nodes.
- Environment parallelization: Multiple versions of test environments handle different scenarios.
- Geographic parallelization: Distribute test execution across regions to reduce latency.
- Platform parallelization: The same test is run in parallel on different target platforms, i.e. browsers, operating systems, and devices.
In this context, both Intelligent Test Orchestration and fail-fast circuit-breakers are key to countering increased resource utilization/cost during intensive testing periods and to shortening overall test execution times.
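A minimal sketch of what that can look like in practice: shards fan out across worker threads (each spawning its own test process), and a fail-fast check cancels not-yet-started shards once one fails. The shard names and the pytest command are placeholders for your own runner.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed

# Placeholder shards; in practice these come from your test catalog or orchestrator.
SHARDS = ["unit/payments", "unit/checkout", "integration/api", "e2e/smoke"]


def run_shard(shard: str) -> tuple[str, int]:
    # Each shard runs as an independent process (or a separate node/pod in a real setup).
    result = subprocess.run(["pytest", "-q", f"tests/{shard}"])
    return shard, result.returncode


def run_all(max_workers: int = 4) -> bool:
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_shard, s): s for s in SHARDS}
        for future in as_completed(futures):
            shard, code = future.result()
            if code != 0:
                # Fail fast: cancel shards that have not started yet.
                pool.shutdown(wait=False, cancel_futures=True)
                print(f"{shard} failed; cancelling remaining shards")
                return False
    return True


if __name__ == "__main__":
    print("all shards passed" if run_all() else "test run failed")
```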
4. Centralized test observability
Getting insight into the behaviour and output of all components involved in executing complex E2E or performance tests in a cloud-native environment is key to understanding why tests failed, both for manual troubleshooting and for AI automation.
As testing moves from CI/CD to being continuous, logs and artifacts from test executions are (at best) collected into some centralized storage via manual scripting, but rarely exposed to end users through any testing-focused interface for interaction. Furthermore, logs and metrics from the system(s) under test are usually captured elsewhere (if at all), for example in an APM solution.
This “dissonance” across different tools collecting different artifacts related to test executions results in a lack of a centralized and contextualised view of testing activities – which makes efficient troubleshooting and reporting for tests extremely difficult – both for manual and automated/AI processes.
Therefore, ensuring contextual collection of all testing-related artifacts into centralized storage is key to building efficient pipelines. This data includes:
- Log output from both the system-under-test and the tests themselves – for troubleshooting of error messages and stack traces.
- Artifacts (reports, videos, screenshots, etc) generated by the testing tool during test execution – for troubleshooting user interactions and test failures.
- Resource usage for both the test execution itself and any system-under-test component – for troubleshooting operational aspects of test-execution, and troubleshooting test flakiness related to resource availability.
- System events that occurred during (or before) the execution of a test – although not directly related to the test itself – can give valuable insight into why a test behaved the way it did.
Once aggregated, this data needs to be made available both to end-users (dev, QA) for manual troubleshooting and to (AI) agents for automated workflows related to tasks like Intelligent Test Orchestration (described above), automated troubleshooting/remediation, etc.
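As a small illustration of what contextual collection can mean, the sketch below copies test artifacts into a centralized location together with a manifest that ties them to the run, commit, and environment. The storage backend and field names are assumptions; in practice this is typically handled by whatever observability or test-reporting solution you standardize on.

```python
import json
import shutil
import time
from pathlib import Path


def collect_artifacts(run_id: str, commit: str, environment: str,
                      artifacts: list[Path], dest_root: Path) -> Path:
    """Copy test artifacts into centralized storage with a contextual manifest.

    dest_root stands in for whatever centralized storage you use
    (object store, test observability platform, shared volume).
    """
    dest = dest_root / run_id
    dest.mkdir(parents=True, exist_ok=True)
    for artifact in artifacts:
        shutil.copy2(artifact, dest / artifact.name)
    manifest = {
        "run_id": run_id,
        "commit": commit,
        "environment": environment,
        "collected_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "artifacts": [a.name for a in artifacts],
    }
    (dest / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return dest


# Example: collecting a JUnit report and a failure screenshot from an E2E run.
# collect_artifacts("run-001", "abc123", "staging",
#                   [Path("report.xml"), Path("failure.png")], Path("/mnt/test-results"))
```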
5. Testing as a platform capability
A key aspect of implementing continuous testing is to treat test orchestration and execution as platform capabilities – not CI/CD functionality. What this means is that the internal developer platform one builds for engineering teams should come with all aspects of continuous testing “built in”:
- Templates/workflows for how to define automated tests and their execution should be part of blueprints for new components
- Test triggering should be closely aligned with delivery pipelines, be it CI/CD, GitOps or asynchronous delivery.
- Test execution should be done by leveraging platform infrastructure itself and not from external tools (unless mandated for specific types of tests) – for scalability, security, and consistency across test executions and results.
- The test observability described above should be automated with platform-level tools/components.
Depending on the size of your organization, you might have a dedicated Platform Engineering team responsible for all of the above, or this might be managed by the application team itself. In either case, embracing Continuous Testing as a platform capability is key, rather than expecting each engineering team to decide on and implement its own testing strategy.
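One way to make the “built in” idea tangible is to bake default test workflows into the blueprint every new component is scaffolded from. The sketch below expresses such a blueprint as a Python structure purely for illustration; in a real platform this would live in your scaffolding templates, and all names shown are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class TestWorkflow:
    """Illustrative: a test workflow shipped with a component blueprint."""
    name: str
    trigger: str       # e.g. "pull-request", "deployment", "schedule"
    suite: str         # which tests the platform should run
    environment: str   # where the platform should run them


@dataclass
class ComponentBlueprint:
    """What a new component starts with when scaffolded from the platform."""
    language: str
    delivery: str                                     # e.g. "gitops", "ci-cd"
    test_workflows: list[TestWorkflow] = field(default_factory=list)


# Every scaffolded service gets sensible defaults; teams extend rather than reinvent.
default_service = ComponentBlueprint(
    language="python",
    delivery="gitops",
    test_workflows=[
        TestWorkflow("fast-feedback", trigger="pull-request", suite="unit", environment="ephemeral"),
        TestWorkflow("integration", trigger="deployment", suite="integration", environment="staging"),
    ],
)
```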
Continuous Testing at AI speed
The power of Continuous Testing for AI-powered development comes together when all pillars work in concert:
- Intelligent Orchestration ensures that the right tests are run for every code or infrastructure change.
- Real-time Scheduling ensures immediate feedback on AI-generated changes.
- Parallel Execution eliminates traditional bottlenecks, while intelligent targeting optimizes resource usage and feedback quality.
- Centralized Test Observability ensures all data related to test execution is available for manual and automated workflows.
- Testing as a Platform Capability ensures consistent, secure, and scalable test orchestration and execution.
Getting started
Getting all this into place is obviously no small feat. Here’s how to get started:
- Find one team/component in your organization that is embracing AI for code generation and is willing to embrace (or already embraces) a Continuous Testing approach.
- Start with testing as a platform capability:
  - Ensure existing tests are run efficiently with existing infrastructure for test execution
- Start optimizing test executions by pursuing parallel execution opportunities for your tests and adopting a real-time scheduling approach:
  - Decouple test execution from CI/CD to make it available across the entire SDLC
  - Insert event-based triggers for tests across your pipelines (including CI of course)
  - Parallelize tests as applicable to cut down overall execution times
- Strive for centralized test observability
  - Ensure that logs, results, artifacts, and metrics for executed tests (and the system-under-test) are available via centralized solution(s), such as Grafana, Datadog, Testkube, Allure, etc.
- Now the fun (and hard) part: infuse AI for Intelligent Test Orchestration
  - Ensure the different components of your testing infrastructure are exposed to AI tooling via MCP Servers or corresponding interfaces. This includes:
    - Your centralized test result repository
    - Your test orchestration/execution platform
    - Your test catalog, with applicable metadata
    - Your source-code repositories – both for your tests and the system-under-test
    - Your infrastructure events and configuration
  - Start by defining a basic use-case and building the corresponding manual AI prompts that utilize one or more of these sources (based on which ones you have at your disposal) – for example, start with “Given this code change and based on previous test results, which tests should I run?” (see the sketch after this list).
  - Once happy with their outcome, encapsulate these prompts into automated AI Agents using appropriate tooling and insert them into your continuous testing pipelines.
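As a minimal sketch of that first manual use-case, the snippet below only assembles the prompt from a diff, recent results, and the test catalog; the ask_model placeholder stands in for whichever model or agent tooling you eventually wire it to.

```python
def build_test_selection_prompt(diff: str, recent_results: str, test_catalog: str) -> str:
    """Assemble the manual prompt for the basic use-case described above."""
    return (
        "Given this code change and based on previous test results, "
        "which tests should I run?\n\n"
        f"## Code change (diff)\n{diff}\n\n"
        f"## Recent test results\n{recent_results}\n\n"
        f"## Available tests\n{test_catalog}\n\n"
        "Answer with a prioritized list of test suites and a one-line reason for each."
    )


def ask_model(prompt: str) -> str:
    # Placeholder for your model or agent interface (API client, CLI, MCP client).
    raise NotImplementedError("wire this to your LLM or agent tooling")


if __name__ == "__main__":
    prompt = build_test_selection_prompt(
        diff="src/payments/charge.py: +12 -3 ...",
        recent_results="checkout/e2e flaky on 2 of last 10 runs ...",
        test_catalog="payments/unit, checkout/e2e, smoke/platform ...",
    )
    print(prompt)  # review the output manually before automating it as an agent
```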
The path forward
AI has fundamentally changed how we build software, but most organizations are still testing like it’s 2015. The gap between development velocity and quality assurance continues to widen as teams embrace AI-generated code without expanding their automated testing efforts accordingly.
Continuous Testing offers a practical approach to closing this gap. By proactively implementing the practices outlined above, engineering teams can maintain quality standards while embracing AI both for accelerated development and for intelligent Continuous Testing across delivery pipelines throughout their organization.