Engineering Stories · #AI #Governance #LLM #Automation #Testing #Leadership #PyTest

From Governance to Automation: Building a Safe AI Engineering Culture

Gerald M
15 min read
2025-05-15

In 2024, developers across our organization began adopting free-tier LLM tools — ChatGPT, GitHub Copilot, various open models — without any governance framework. Proprietary code snippets and client data were being pasted into prompts with no guardrails. The risk was real, but there was no formal mandate to address it.

I decided to act anyway. And what started as a defensive governance exercise eventually unlocked something much more valuable: a foundation for using AI safely and powerfully at scale — including a test automation pipeline that cut script development time by 70%.


Part 1 — Identifying the Risk

I conducted a survey across 30+ developers to map how AI tools were actually being used. The findings were concerning:

  • 72% had pasted production code into external LLMs at least once
  • 45% had included client-identifying information in prompts
  • Zero had any formal guidance on what was safe to share

The gap wasn't intentional — developers wanted to be productive. They just didn't have guidelines.


Part 2 — Building the Framework

The Prompt Engineering Handbook

I authored a comprehensive handbook covering:

  • Approved tools vs. restricted tools (with reasoning)
  • Prohibited data categories — PII, client credentials, internal API keys
  • Prompt templates — pre-vetted patterns for common tasks
  • Code review requirements — what AI-generated code needs before merging
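
As an illustration of how the prohibited data categories above could be checked automatically, here is a minimal sketch of a prompt screen. It is not the handbook's actual tooling: the pattern set, the category names, and the functions find_prohibited and check_prompt are all hypothetical, and real patterns would need tuning to an organization's own data and secret formats.

```python
import re

# Hypothetical patterns, for illustration only. A real deployment would
# need patterns matched to its own PII and secret formats.
PROHIBITED_PATTERNS = {
    "email address (PII)": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "credential assignment": re.compile(r"(?i)\b(password|passwd|secret)\s*[:=]\s*\S+"),
    "API key": re.compile(r"(?i)\b(api[_-]?key|token)\s*[:=]\s*\S+"),
}


def find_prohibited(prompt: str) -> list[str]:
    """Return the names of the prohibited categories detected in a prompt."""
    return [name for name, pattern in PROHIBITED_PATTERNS.items() if pattern.search(prompt)]


def check_prompt(prompt: str) -> None:
    """Raise ValueError if the prompt appears to contain prohibited data."""
    hits = find_prohibited(prompt)
    if hits:
        raise ValueError("Prompt blocked; found: " + ", ".join(hits))
```

A check like this only catches obvious cases, so it would complement the handbook's rules and human review rather than replace them.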

The Tooling: yodlee-cursor-skills

Policy alone doesn't change behavior. To make governance practical, I created the yodlee-cursor-skills repository — a versioned collection of domain-specific AI skill files that embed:

  • Secure context (API schemas, test patterns)
  • Lint rules for AI-generated code
  • Test skeleton generators

All of these are scoped to approved enterprise boundaries.

Hands-on Enablement

Policy without training is just documentation. I ran workshops for 40+ engineers demonstrating:

  • How to use the handbook in daily workflows
  • How the skill files work with Cursor AI
  • How to keep all inference within approved boundaries

Part 3 — Turning Guardrails into a Launchpad

With a secure foundation in place, the team could finally lean into AI tooling with confidence. The most impactful application: LLM-powered test automation.

The Problem with Traditional Test Automation

Test development is time-consuming. Writing Selenium scripts, maintaining page object models, and managing test data all consume significant engineering bandwidth. Traditional automation entails:

  • Deep knowledge of Selenium/Playwright APIs
  • Maintenance overhead as the UI changes
  • Test data management complexity
  • Long feedback loops

The Solution: LLM Agents in the Test Pipeline

By integrating LLMs into the automation framework — governed by the same prompt templates and safety guardrails from the handbook — I enabled testers to write test cases as natural language descriptions:

"Given user is on login page, When they enter valid credentials and click submit,
Then they should see the dashboard"

The LLM agent then:

  1. Understands intent — Parses natural language test steps
  2. Generates selectors — Uses page understanding to locate elements
  3. Executes actions — Translates to Selenium/Playwright commands
  4. Validates results — Asserts outcomes
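
The first step above, turning a natural language description into structured intent, can be sketched as follows. This is a deterministic regex stand-in, not the LLM agent itself: it only shows the kind of structure an agent might produce before generating selectors and actions. The Scenario class and parse_scenario function are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass
class Scenario:
    """Structured form of a 'Given ..., When ..., Then ...' description."""
    given: str
    when: str
    then: str


_PATTERN = re.compile(
    r"Given\s+(?P<given>.+?),\s*When\s+(?P<when>.+?),?\s*\bThen\s+(?P<then>.+)",
    re.IGNORECASE,
)


def parse_scenario(text: str) -> Scenario:
    """Split a single Given/When/Then sentence into its three clauses."""
    # Collapse line breaks and repeated spaces so the pattern sees one line.
    normalized = " ".join(text.split())
    match = _PATTERN.fullmatch(normalized)
    if match is None:
        raise ValueError("Expected 'Given ..., When ..., Then ...'")
    return Scenario(
        given=match["given"].strip(),
        when=match["when"].strip(),
        then=match["then"].strip(" ."),
    )
```

Applied to the login example above, this yields a Scenario whose given, when, and then fields hold the three clauses. An LLM handles far looser phrasing than a regex can, which is the point of using one in the pipeline.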

Implementation Stack

Layer             Technology
Framework         Pytest + Selenium + Cursor AI
Workflow          Cursor AI Rules and Skills + prompt engineering
State Management  Context window for multi-step workflows
Governance        Prompt templates and safety guardrails from the Cursor rules

The key insight: LLMs are excellent at translating between human intent and machine execution. The governance framework didn't slow this down — it made it trustworthy enough to deploy at scale.


The Combined Impact

Outcome                               Result
Data-leak incidents post-rollout      Zero
Test script development time          ↓ 70%
Automation code bugs                  ↓ 50% (LLM validation)
QA automation speed (test skeletons)  ↑ 60%
Recognition                           Company-wide best practice

Lessons Learned

  1. Don't wait for a mandate — When you see a risk, build a lightweight, enforceable process
  2. Make compliance easy — If the secure path is harder than the insecure path, people will take shortcuts
  3. Combine policy with tooling — A handbook alone doesn't change behavior; integrated tools do
  4. Governance is a prerequisite, not a blocker — Safe guardrails are what allow you to move faster, not slower
  5. Start small, scale fast — One team's success story becomes the template for the organization