Agent 1: Business Logic Extractor

Source Code → Business Requirements

Role

The Business Logic Extractor is the first agent in the Phoenix pipeline. It reads the entire legacy codebase — COBOL, Java, .NET, Python, whatever the stack — and produces a structured catalog of every business rule, workflow, decision tree, validation, calculation, and edge case.

The key distinction: it extracts what the code does, not how the code is written. Implementation details are discarded. Business intent is preserved.


Inputs

  • Complete legacy source code repository
  • Database schemas and stored procedures
  • Configuration files and environment settings
  • Batch job definitions and scheduled task configurations
  • Integration point specifications (APIs, file transfers, message queues)

Outputs

Business Rules Catalog

A structured registry of every rule the system enforces:

| Rule ID | Domain | Description | Source Location | Confidence |
| --- | --- | --- | --- | --- |
| BR-001 | Pricing | Discount capped at 15% for standard accounts | pricing/calc.cob:L340-L367 | High |
| BR-002 | Validation | Customer age must be 18+ for account creation | customer/validate.cob:L89 | High |
| BR-003 | Workflow | Orders over $10K require manager approval | orders/approval.cob:L201-L245 | Medium |

Each rule includes:

  • A natural language description of the business intent
  • The source code location where it was found
  • A confidence level (such as High or Medium) indicating how certain the extraction is
  • Cross-references to related rules
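One way to picture a catalog entry is as a small typed record. The sketch below is illustrative only: the class and field names, and the `LOW` confidence level, are assumptions rather than the extractor's actual schema. The three sample rows come from the table above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Confidence(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"  # assumed level; only High and Medium appear in the sample table


@dataclass(frozen=True)
class BusinessRule:
    rule_id: str
    domain: str
    description: str             # natural language statement of business intent
    source_location: str         # file and line range where the rule was found
    confidence: Confidence
    related_rules: Tuple[str, ...] = ()  # cross-references, left empty in this sketch


catalog = [
    BusinessRule("BR-001", "Pricing",
                 "Discount capped at 15% for standard accounts",
                 "pricing/calc.cob:L340-L367", Confidence.HIGH),
    BusinessRule("BR-002", "Validation",
                 "Customer age must be 18+ for account creation",
                 "customer/validate.cob:L89", Confidence.HIGH),
    BusinessRule("BR-003", "Workflow",
                 "Orders over $10K require manager approval",
                 "orders/approval.cob:L201-L245", Confidence.MEDIUM),
]


def needs_review(rules):
    """Return the IDs of rules whose extraction confidence is below High."""
    return [r.rule_id for r in rules if r.confidence is not Confidence.HIGH]
```

A helper like `needs_review` is one way a reviewer could decide which rules to check first against stakeholder knowledge.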

Workflow Maps

Visual and structured representations of how business processes flow through the system. Decision points, branches, loops, exception paths — all captured as directed graphs.
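As a rough illustration of a workflow captured as a directed graph, the sketch below models the approval rule BR-003 as nodes joined by labelled edges. The node names, the edge conditions, and the rejection branch are hypothetical and are not taken from the legacy system.

```python
# Each node maps to a list of (condition, next_node) edges.
# All names and conditions here are illustrative assumptions.
workflow = {
    "order_received": [("always", "check_total")],
    "check_total": [
        ("total > 10000", "manager_approval"),
        ("total <= 10000", "fulfil_order"),
    ],
    "manager_approval": [
        ("approved", "fulfil_order"),
        ("rejected", "order_rejected"),
    ],
    "fulfil_order": [],
    "order_rejected": [],
}


def all_paths(graph, start):
    """Enumerate every path from start to a terminal node (one with no edges)."""
    paths = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        edges = graph[node]
        if not edges:
            paths.append(path)
            continue
        for _condition, target in edges:
            if target not in path:  # do not follow a loop back into the current path
                stack.append((target, path + [target]))
    return paths
```

Enumerating the paths makes the exception branches explicit: in this toy graph there are three ways an order can travel, one of which ends in rejection.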

Data Entity Model

A semantic model of the system's data — not the database schema, but the business entities and their relationships. What does a "customer" mean? What's an "order"? How do they relate?

Dependency Graph

Maps which business rules depend on which other rules, which workflows trigger other workflows, and which data entities are consumed by which processes.
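A dependency graph of this kind can answer the question "what is affected if this changes?". The following is a minimal sketch, assuming a plain dictionary of "depends on" edges; the specific edges shown are invented for illustration and do not describe a real system.

```python
from collections import deque

# node -> the things it depends on (illustrative edges only)
depends_on = {
    "workflow:order_approval": {"rule:BR-003", "entity:Order"},
    "rule:BR-003": {"entity:Order"},
    "rule:BR-001": {"entity:Customer"},
    "entity:Order": {"entity:Customer"},
    "entity:Customer": set(),
}


def impacted_by(graph, changed):
    """Return every node that depends, directly or transitively, on `changed`."""
    # Invert the edges so we can walk from a node to its dependents.
    dependents = {node: set() for node in graph}
    for node, deps in graph.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(node)

    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen
```

For example, `impacted_by(depends_on, "entity:Order")` returns the approval rule and the approval workflow, which is the set a reviewer would want to re-check after changing what an "order" means.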

Edge Case Registry

The most valuable output. Edge cases are the hard-won institutional knowledge buried in conditional branches, special-case handlers, and validation rules that nobody remembers adding. The extractor catalogs every one.


Methodology

The Extractor works in three passes:

Pass 1: Structural Analysis

Parse the codebase into an abstract representation. Identify modules, functions, classes, data structures, entry points, and call chains. Build the code-level dependency graph (which units call or reference which).
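To make the structural pass concrete, here is a small sketch that uses Python's standard `ast` module on a toy Python snippet. Python stands in for the legacy language purely for illustration; a COBOL or Java codebase would need its own parser, and the function names in `SOURCE` are made up.

```python
import ast

# Toy source standing in for a legacy module (hypothetical functions).
SOURCE = '''
def validate_customer(age):
    return age >= 18

def create_account(age):
    if validate_customer(age):
        return open_account()
    return None

def open_account():
    return "opened"
'''


def build_call_graph(source):
    """Map each function name to the set of plain function names it calls."""
    tree = ast.parse(source)
    graph = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            calls = set()
            for inner in ast.walk(node):
                # Only direct calls by name; method calls are ignored in this sketch.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    calls.add(inner.func.id)
            graph[node.name] = calls
    return graph


def entry_points(graph):
    """Functions that no other function in the graph calls."""
    called = set().union(*graph.values()) if graph else set()
    return sorted(name for name in graph if name not in called)
```

Here `entry_points(build_call_graph(SOURCE))` yields only `create_account`, since the other two functions are reached through it.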

Pass 2: Semantic Extraction

For each code unit, determine the business intent. Strip away language-specific implementation details. A COBOL PERFORM loop and a Java forEach loop that do the same thing produce the same business rule.
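The idea that two implementations collapse into one rule can be sketched with a simpler case than a loop: a single comparison. In the example below, a COBOL-style condition and a Java-style condition are normalised into the same canonical record. The alias tables, the field name `CUST-AGE`, and the accessor `customer.getAge()` are assumptions made for illustration; a real extractor would need far richer analysis than a lookup table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalRule:
    subject: str      # business-level field name
    operator: str     # language-neutral comparison operator
    threshold: int


# Hypothetical alias tables mapping language-specific spellings to neutral ones.
OPERATOR_ALIASES = {
    "NOT LESS THAN": ">=",
    "GREATER THAN OR EQUAL TO": ">=",
    ">=": ">=",
}
FIELD_ALIASES = {
    "CUST-AGE": "customer.age",
    "customer.getAge()": "customer.age",
}


def normalize(raw_field, raw_operator, threshold):
    """Strip language-specific spelling and keep only the business condition."""
    return CanonicalRule(
        FIELD_ALIASES[raw_field],
        OPERATOR_ALIASES[raw_operator],
        threshold,
    )


cobol_finding = normalize("CUST-AGE", "NOT LESS THAN", 18)
java_finding = normalize("customer.getAge()", ">=", 18)

# Equal canonical records collapse into one entry when stored in a set.
unique_rules = {cobol_finding, java_finding}
```

Because the record is frozen and compared by value, the two findings deduplicate to a single rule, which mirrors the rule BR-002 in the sample catalog.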

Pass 3: Cross-Referencing

Link business rules to workflows, workflows to data entities, and data entities back to rules. Identify orphaned code (rules that are never triggered) and dead paths (branches that are never reached).
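Finding orphaned rules can be framed as a reachability check over the linked graph: anything that cannot be reached from an entry point is never triggered. The sketch below assumes a simple adjacency dictionary; the entry point name and the rule `rule:LEGACY-X` are hypothetical.

```python
def reachable(graph, roots):
    """Return every node reachable from the given roots via depth-first search."""
    seen = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return seen


# Illustrative links between entry points, workflows, rules, and entities.
links = {
    "entry:create_order": ["workflow:order_approval"],
    "workflow:order_approval": ["rule:BR-003", "entity:Order"],
    "rule:BR-003": ["entity:Order"],
    "rule:LEGACY-X": ["entity:Order"],  # hypothetical rule that nothing triggers
}

all_rules = {node for node in links if node.startswith("rule:")}
live = reachable(links, ["entry:create_order"])
orphaned = sorted(all_rules - live)
```

In this toy graph `orphaned` contains only `rule:LEGACY-X`. The same traversal applied at branch level would surface dead paths.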


Human Validation Gate

Before passing outputs to Agent 2, the AI Software Lead:

  • Reviews the business rules catalog against stakeholder knowledge
  • Flags each rule as obsolete or essential
  • Identifies rules the extractor may have missed (tribal knowledge)
  • Confirms the data entity model matches the organization's understanding

Next: Agent 2 — Interface Archaeologist →