Agent 1: Business Logic Extractor
Source Code → Business Requirements
Role
The Business Logic Extractor is the first agent in the Phoenix pipeline. It reads the entire legacy codebase — COBOL, Java, .NET, Python, whatever the stack — and produces a structured catalog of every business rule, workflow, decision tree, validation, calculation, and edge case.
The key distinction: it extracts what the code does, not how the code is written. Implementation details are discarded. Business intent is preserved.
Inputs
- Complete legacy source code repository
- Database schemas and stored procedures
- Configuration files and environment settings
- Batch job definitions and scheduled task configurations
- Integration point specifications (APIs, file transfers, message queues)
Outputs
Business Rules Catalog
A structured registry of every rule the system enforces:
| Rule ID | Domain | Description | Source Location | Confidence |
|---|---|---|---|---|
| BR-001 | Pricing | Discount capped at 15% for standard accounts | pricing/calc.cob:L340-L367 | High |
| BR-002 | Validation | Customer age must be 18+ for account creation | customer/validate.cob:L89 | High |
| BR-003 | Workflow | Orders over $10K require manager approval | orders/approval.cob:L201-L245 | Medium |
Each rule includes:
- A natural language description of the business intent
- The source code location where it was found
- A confidence rating (for example High or Medium) indicating how certain the extraction is
- Cross-references to related rules
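As a minimal sketch of what one catalog entry might look like in code, the record below mirrors the table columns and the list of fields above. The class names, the `LOW` level, and the cross-reference on BR-003 are assumptions made for illustration, not part of the Phoenix specification.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"  # assumed level; the table above shows only High and Medium


@dataclass(frozen=True)
class BusinessRule:
    rule_id: str
    domain: str
    description: str          # natural language statement of business intent
    source_location: str      # where the rule was found in the legacy code
    confidence: Confidence
    related_rules: tuple[str, ...] = ()  # cross-references by rule ID


def to_table_row(rule: BusinessRule) -> str:
    """Render a rule as one row of the catalog table."""
    cells = [rule.rule_id, rule.domain, rule.description,
             rule.source_location, rule.confidence.value]
    return "| " + " | ".join(cells) + " |"


br_003 = BusinessRule(
    rule_id="BR-003",
    domain="Workflow",
    description="Orders over $10K require manager approval",
    source_location="orders/approval.cob:L201-L245",
    confidence=Confidence.MEDIUM,
    related_rules=("BR-001",),  # hypothetical cross-reference
)
print(to_table_row(br_003))
```

Keeping the record immutable (`frozen=True`) is one possible design choice: later pipeline stages can reference a rule by ID without worrying that its description changed underneath them.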
Workflow Maps
Visual and structured representations of how business processes flow through the system. Decision points, branches, loops, exception paths — all captured as directed graphs.
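One way to picture a workflow map as a directed graph is an adjacency list whose edges carry the branch condition. The sketch below is illustrative only: the node names and conditions are hypothetical (loosely based on rule BR-003), and for brevity it assumes the workflow has no loops, which a real map would need to handle.

```python
# Hypothetical order workflow. Each node maps to (condition, next_node) edges.
WORKFLOW = {
    "receive_order": [
        ("total over $10K", "manager_approval"),
        ("otherwise", "fulfil_order"),
    ],
    "manager_approval": [
        ("approved", "fulfil_order"),
        ("rejected", "reject_order"),  # exception path
    ],
    "fulfil_order": [],
    "reject_order": [],
}


def enumerate_paths(graph, start):
    """Yield every path from start to a terminal node (assumes no cycles)."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        edges = graph[path[-1]]
        if not edges:
            yield path  # terminal node reached
            continue
        for _condition, target in edges:
            stack.append(path + [target])


for path in sorted(enumerate_paths(WORKFLOW, "receive_order")):
    print(" -> ".join(path))
```

Enumerating the paths this way gives reviewers a flat list of every route an order can take, including the exception path.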
Data Entity Model
A semantic model of the system's data — not the database schema, but the business entities and their relationships. What does a "customer" mean? What's an "order"? How do they relate?
Dependency Graph
Maps which business rules depend on which other rules, which workflows trigger other workflows, and which data entities are consumed by which processes.
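A dependency graph of this kind can be explored with standard tooling. The sketch below uses Python's `graphlib` to produce an order in which every rule appears after the rules it depends on. The dependencies shown are invented for illustration; the source does not state how BR-001 through BR-003 relate.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each rule maps to the rules it depends on.
DEPENDS_ON = {
    "BR-001": set(),
    "BR-002": set(),
    "BR-003": {"BR-001", "BR-002"},
}

# static_order() yields each rule only after all of its dependencies.
# It raises graphlib.CycleError if the rules depend on each other circularly.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(order)
```

An ordering like this is useful when rules have to be reviewed or reimplemented incrementally, since nothing is handled before its prerequisites.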
Edge Case Registry
The most valuable output. Edge cases are the hard-won institutional knowledge buried in conditional branches, special-case handlers, and validation rules that nobody remembers adding. The extractor catalogs every one.
Methodology
The Extractor works in three passes:
Pass 1: Structural Analysis
Parse the codebase into an abstract representation. Identify modules, functions, classes, data structures, entry points, and call chains. Build the dependency graph.
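To make the structural pass concrete, here is a small sketch for the case where the legacy code happens to be Python, using the standard `ast` module to find functions, call chains, and entry points. It is not the Extractor's actual implementation, and a COBOL or Java codebase would need its own parser. The sample source and function names are hypothetical.

```python
import ast

# Hypothetical legacy module used only to exercise the sketch.
LEGACY_SOURCE = '''
def validate_customer(age):
    return age >= 18

def create_account(age):
    if not validate_customer(age):
        raise ValueError("customer must be 18 or older")
    return open_ledger()

def open_ledger():
    return "ledger"
'''


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the module functions it calls directly."""
    tree = ast.parse(source)
    functions = [n for n in tree.body if isinstance(n, ast.FunctionDef)]
    defined = {fn.name for fn in functions}
    graph = {}
    for fn in functions:
        calls = {
            node.func.id
            for node in ast.walk(fn)
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
        }
        graph[fn.name] = calls & defined  # ignore builtins such as ValueError
    return graph


def entry_points(graph: dict[str, set[str]]) -> list[str]:
    """Functions that no other function in the module calls."""
    called = set().union(*graph.values())
    return sorted(name for name in graph if name not in called)


call_graph = build_call_graph(LEGACY_SOURCE)
print(call_graph)
print(entry_points(call_graph))
```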
Pass 2: Semantic Extraction
For each code unit, determine the business intent. Strip away language-specific implementation details. A COBOL PERFORM loop and a Java forEach loop that do the same thing produce the same business rule.
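The idea that two implementations collapse into one rule can be sketched with a simpler case than loops: a condition written two ways. The example assumes an earlier step has already parsed each condition into a `Predicate` (a hypothetical type), and shows a COBOL-style `CUSTOMER-AGE NOT < 18` and a Java-style `customerAge >= 18` normalizing to the same canonical form.

```python
import re
from dataclasses import dataclass

# Operator that results from negating each comparison.
NEGATED = {"<": ">=", "<=": ">", ">": "<=", ">=": "<", "==": "!=", "!=": "=="}


@dataclass(frozen=True)
class Predicate:
    field: str
    op: str
    value: int
    negated: bool = False


def canonical_field(name: str) -> str:
    """Normalize CUSTOMER-AGE, customerAge and customer_age to customer_age."""
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)  # split camelCase
    return name.replace("-", "_").lower()


def canonicalize(p: Predicate) -> Predicate:
    """Fold negation into the operator and normalize the field name."""
    op = NEGATED[p.op] if p.negated else p.op
    return Predicate(canonical_field(p.field), op, p.value)


from_cobol = Predicate("CUSTOMER-AGE", "<", 18, negated=True)
from_java = Predicate("customerAge", ">=", 18)

print(canonicalize(from_cobol) == canonicalize(from_java))
```

Real semantic extraction involves far more than syntactic normalization, but a canonical form like this is one way to detect that two code units express the same rule (here, BR-002).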
Pass 3: Cross-Referencing
Link business rules to workflows, workflows to data entities, and data entities back to rules. Identify orphaned rules (rules that no workflow ever triggers) and dead paths (branches that are never reached).
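Finding rules that are never triggered is essentially a reachability check over the cross-reference graph. The sketch below walks the graph breadth-first from the known entry points and reports everything it could not reach. The link data, the `workflow:` prefix, and rule BR-099 are hypothetical.

```python
from collections import deque

# Hypothetical cross-reference edges: "A": ["B"] means A triggers B.
LINKS = {
    "workflow:create_account": ["BR-002"],
    "workflow:place_order": ["BR-001", "BR-003"],
    "BR-003": ["workflow:manager_approval"],
    "BR-099": [],  # hypothetical rule that nothing triggers
}
ENTRY_POINTS = ["workflow:create_account", "workflow:place_order"]


def reachable(links, entry_points):
    """Breadth-first search from the entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        node = queue.popleft()
        for target in links.get(node, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def find_orphans(links, entry_points):
    """Nodes that appear in the graph but cannot be reached from any entry point."""
    all_nodes = set(links)
    for targets in links.values():
        all_nodes.update(targets)
    return sorted(all_nodes - reachable(links, entry_points))


print(find_orphans(LINKS, ENTRY_POINTS))
```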
Human Validation Gate
Before passing outputs to Agent 2, the AI Software Lead:
- Reviews the business rules catalog against stakeholder knowledge
- Classifies each rule as obsolete or essential
- Identifies rules the extractor may have missed (tribal knowledge)
- Confirms the data entity model matches the organization's understanding
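The gate could be enforced mechanically along the following lines. This is a sketch under stated assumptions: the `Verdict` values, the function name, and the policy of blocking hand-off while any rule is unreviewed and forwarding only essential rules are illustrative choices, not behavior described by the source. The verdicts assigned to BR-001 through BR-003 are likewise made up.

```python
from enum import Enum


class Verdict(Enum):
    ESSENTIAL = "essential"
    OBSOLETE = "obsolete"
    UNREVIEWED = "unreviewed"


def release_to_agent_2(reviews: dict[str, Verdict]) -> list[str]:
    """Return the rule IDs to hand off, refusing while any rule is unreviewed."""
    pending = sorted(r for r, v in reviews.items() if v is Verdict.UNREVIEWED)
    if pending:
        raise ValueError(f"rules awaiting review: {', '.join(pending)}")
    # Assumed policy: only rules marked essential move forward.
    return sorted(r for r, v in reviews.items() if v is Verdict.ESSENTIAL)


# Hypothetical review outcome.
reviews = {
    "BR-001": Verdict.ESSENTIAL,
    "BR-002": Verdict.ESSENTIAL,
    "BR-003": Verdict.OBSOLETE,
}
print(release_to_agent_2(reviews))
```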