How a Leading Financial Services Firm Reclaimed
Control of 2.1M Lines of Legacy Code
Implementing LORE — an Agentic AI Reverse Engineering system — to decode undocumented monoliths, build a living knowledge base, and enable safe software enhancements across five technology stacks whose institutional knowledge had all but vanished.
The Client
The client is a mid-market financial services company processing over 400,000 transactions daily across loan origination, payment settlement, and regulatory reporting systems. Their technology estate had grown organically over 18 years, accumulating critical business logic in applications that no engineer still at the firm fully understood.
Organization Profile
With 1,200 employees and a 45-person engineering team, the organization operates 14 interconnected applications that form the backbone of their financial operations. These systems collectively handle regulatory compliance for three jurisdictions, integrate with 22 external partners, and process an average settlement volume of $1.8 billion per month.
Technology Landscape
Five technology stacks — Java, C++, .NET, PHP, and Python — accumulated organically over 18 years. Infrastructure managed via Ansible and Puppet scripts. Minimal CI/CD automation. Source code in Git with limited branching discipline.
A Perfect Storm of Legacy Risk
The client faced a convergence of compounding risks that threatened their ability to operate, comply with regulations, and evolve their products. Every challenge amplified the others, creating a situation where even minor changes carried disproportionate risk.
Vanishing Institutional Knowledge
Over the past four years, 60% of the senior engineers who originally built the core systems had left the organization through attrition and retirement. Critical business logic — particularly in the settlement engine and regulatory modules — existed only in the minds of departed staff. The remaining team operated through trial-and-error when modifying unfamiliar modules.
Documentation Bankruptcy
Of the 14 applications, only 3 had any form of documentation, and those documents were 4–7 years out of date. Architecture decision records did not exist. API contracts were informally communicated via Slack messages and tribal knowledge. New hires spent an average of 14 weeks before making their first meaningful code contribution.
Change Paralysis
The team had accumulated a backlog of 340+ Jira tickets for enhancements and bug fixes. Engineers feared making changes because they could not reliably predict the blast radius of any modification. A routine configuration change in 2024 caused a 6-hour production outage affecting settlement processing, costing the client $2.3M in penalties and operational losses.
Regulatory Pressure
An upcoming regulatory mandate required the client to implement new reporting fields across their transaction processing pipeline within 9 months. Without understanding the full data flow across their interconnected systems, the engineering team estimated 18 months for the change — a gap that risked regulatory non-compliance and potential license revocation.
Multi-Stack Complexity
The five different technology stacks, accumulated organically over nearly two decades, meant that no single engineer could work across the full system. Cross-stack integration points were particularly fragile — the Java-to-C++ settlement bridge and the PHP-to-.NET reporting handoff were single points of failure understood by no current team member.
No Safety Net
Test coverage averaged 12% across the portfolio. Many applications had zero automated tests. Database schemas had accumulated 340+ tables with undocumented foreign key relationships and 47 stored procedures containing embedded business rules. Manual deployments via shell scripts introduced human error at every release.
LORE: Legacy Ontology & Reverse Engineering
We designed and deployed LORE — an agentic AI platform built on Google Agent Development Kit (ADK) with Gemini models — structured around three pillars that progressively build understanding, knowledge, and operational capability over the legacy estate.
System Architecture
Documentation Generation Engine
Automated comprehension and living documentation

The first pillar automated the most time-consuming aspect of legacy management: understanding what the code actually does. Three specialized agents work in concert to analyze, document, and keep documentation synchronized with ongoing changes.
CodeScanner Agent
Uses tree-sitter to parse every source file across all five language stacks into Abstract Syntax Trees. Extracts classes, methods, call graphs, import chains, and inheritance hierarchies. Ingests database schemas, API specs, Ansible configs, and Jira tickets as supplementary context. Operates incrementally — after the initial full scan, subsequent runs process only changed files via git diff.
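The incremental mode can be sketched as a thin wrapper around `git diff`: after the initial full scan, only changed files in one of the five supported languages are re-parsed. This is an illustrative sketch, not LORE's actual API — the function names and extension list are assumptions.

```python
import subprocess

# Hypothetical sketch of CodeScanner's incremental mode. The extension list
# stands in for the five stacks (Java, C++, .NET, PHP, Python); real C++ and
# .NET projects would need more extensions than shown here.
SUPPORTED_EXTENSIONS = {".java", ".cpp", ".cs", ".php", ".py"}

def filter_source_files(paths: list[str]) -> list[str]:
    """Keep only files belonging to one of the supported language stacks."""
    return [p for p in paths if any(p.endswith(ext) for ext in SUPPORTED_EXTENSIONS)]

def changed_source_files(repo_path: str, since_commit: str) -> list[str]:
    """List source files changed since the last scanned commit."""
    out = subprocess.run(
        ["git", "-C", repo_path, "diff", "--name-only", since_commit, "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return filter_source_files(out.splitlines())
```

Only the filtered list is handed to the parser, which is what keeps re-scans proportional to the size of the change rather than the size of the codebase.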
DocGenerator Agent
Produces four tiers of documentation: system-level architecture overviews for managers, module-level dependency maps for architects, method-level logic annotations for developers, and cross-cutting concern analysis (security, error handling, database patterns) for QA. Each tier uses specialized prompt templates with AST-grounded context fed into Gemini 2.5 Pro.
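The four tiers can be pictured as a small registry mapping each tier to its audience and prompt template; the tier names and template text below are illustrative stand-ins, not LORE's actual prompts.

```python
# Hypothetical registry of documentation tiers. Each tier pairs an audience
# with a prompt template that is later grounded with AST context.
DOC_TIERS = {
    "system":   {"audience": "managers",   "template": "Summarize the architecture of {module}."},
    "module":   {"audience": "architects", "template": "Map the dependencies of {module}."},
    "method":   {"audience": "developers", "template": "Annotate the logic of {module} method by method."},
    "crosscut": {"audience": "QA",         "template": "Analyze security and error handling in {module}."},
}

def build_prompt(tier: str, module: str, ast_context: str) -> str:
    """Assemble an AST-grounded prompt for the requested tier."""
    template = DOC_TIERS[tier]["template"]
    return template.format(module=module) + "\n\nAST context:\n" + ast_context
```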
DocSync Agent
Monitors git commits and triggers AST-diff analysis to detect semantic code changes (not just text diffs). Regenerates only affected documentation sections. A Critic sub-agent validates updated docs against the code to prevent drift. Changes are propagated to MkDocs, in-repo Markdown, and Confluence pages automatically, producing a Documentation Change Report for each update cycle.
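The semantic-versus-textual distinction is the key idea: a reformatted file or a changed comment should not trigger regeneration. A minimal single-language illustration using Python's stdlib `ast` (LORE itself diffs tree-sitter ASTs across all five stacks) looks like this:

```python
import ast

# Illustrative AST-diff: the parser discards comments and whitespace, so two
# sources compare equal unless the program's structure actually changed.
def semantically_changed(old_src: str, new_src: str) -> bool:
    """True only when the change survives parsing, i.e. is semantic."""
    return ast.dump(ast.parse(old_src)) != ast.dump(ast.parse(new_src))
```

A commit that only rewrites comments produces identical ASTs and leaves the documentation untouched; a logic change triggers regeneration of just the affected sections.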
Result: Within 3 weeks, LORE generated comprehensive documentation for all 14 applications — producing 2,847 pages of structured content that would have taken the team an estimated 18 person-months to write manually. Documentation coverage went from 8% to 94%.
Knowledge Base & Ontology Agent
GraphRAG-powered institutional memory exposed via MCP

The second pillar transforms raw code understanding into a queryable, relationship-aware knowledge graph. Built on Neo4j with native vector indexing, the ontology captures not just what the code is, but how it connects, why it exists, and what would break if it changes.
Code Ontology in Neo4j
The GraphBuilder Agent constructs a property graph with 12 node types and 14 relationship types, linking code entities to database tables, API endpoints, configuration entries, and Jira tickets. Gemini enriches each node with business domain classification, design pattern identification, and natural language summaries. Vector embeddings enable semantic similarity search alongside graph traversal.
For the client, the graph contained 184,000+ nodes and 620,000+ relationships, mapping every function call, inheritance chain, database access, and configuration dependency across the entire portfolio.
Ontology MCP Server
The knowledge graph is exposed as a Model Context Protocol server, making it accessible to any MCP-compatible client — including Cursor IDE, custom scripts, and the SDLC agents themselves. Nine MCP tools provide capabilities ranging from semantic code search to impact analysis to call graph traversal.
The KBQuery Agent adds a conversational layer, enabling engineers to ask questions like “What database tables does the PaymentProcessor write to, and which other modules read from those tables?” — getting precise, relationship-aware answers in seconds instead of hours of manual code tracing.
Result: The impact analysis MCP tool became the team's most-used capability. Before any code change, engineers query the blast radius in under 3 seconds. This single capability eliminated the class of “surprise cascade failures” that had previously caused three production incidents per quarter.
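At its core, a blast-radius query is a reverse-dependency traversal. The sketch below is a toy in-memory version under assumed names (the real graph lives in Neo4j with 14 relationship types): edges map each entity to the entities that depend on it, and the traversal collects everything transitively affected.

```python
from collections import deque

# Illustrative blast-radius computation. `reverse_deps` maps an entity to the
# set of entities that read from, call, or otherwise depend on it.
def blast_radius(reverse_deps: dict[str, set[str]], changed: str) -> set[str]:
    """Return every entity transitively affected by changing `changed`."""
    affected: set[str] = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dependent in reverse_deps.get(node, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected
```

The node names in any usage example are hypothetical, but the shape matches the question engineers ask before every change: "if I touch this, what else moves?"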
Agentic SDLC for Safe Enhancements
AI-driven propose → review → approve workflow

The third pillar enables the client to safely execute enhancements on their legacy applications using an agentic workflow where AI agents plan, implement, test, and review changes while humans retain approval authority at every critical junction.
Context Engineering
The Coder Agent does not generate code in a vacuum. Before writing a single line, it queries the Knowledge Base to retrieve the target codebase's actual conventions — naming patterns, error handling strategies, logging formats, and architectural patterns. Generated code matches the existing style so closely that in blind reviews, the client's engineers could not distinguish agent-generated code from human-written code 68% of the time.
Inter-Agent Handoff (A2A)
All agents communicate via Google's Agent-to-Agent protocol with full context transfer at each handoff. If the Planner Agent identifies cross-stack impact (e.g., a Java change affecting the C++ settlement engine), the Orchestrator Agent spawns parallel Coder Agent instances for each stack, coordinating through the shared Knowledge Base. Session state is persisted in Vertex AI Session Service, enabling resume-from-checkpoint if any agent encounters an error.
Result: In the first 6 weeks of Pillar 3 operation, the agentic SDLC processed 47 enhancement tickets from the backlog. Average time from Jira ticket to merged PR dropped from 23 days to 4.2 days. Zero production incidents resulted from agent-assisted changes.
12-Week Delivery Timeline
LORE was delivered in three phases over 12 weeks, with each phase building on the previous one. A team of 6 engineers executed the implementation alongside the client's existing staff, who were progressively trained on the platform.
Team Composition
1 Platform Engineer — GCP infrastructure, Cloud Run, Neo4j Aura setup and ongoing management
2 Agent Engineers — ADK agent development, prompt engineering, A2A protocol implementation
1 Knowledge Engineer — Neo4j schema design, graph construction pipeline, MCP server development
1 Frontend Engineer — Web dashboard, documentation site, human approval gate UI
1 Integration Engineer — Jira, Confluence, Git integration, webhook and polling setup
Key decision: Starting with Java and C++ in Phase 1 allowed the team to validate the architecture on the two most critical stacks (settlement engine + core platform) before expanding to .NET, PHP, and Python. Adding new language support required only tree-sitter grammar configuration — no agent code changes.
Measurable Impact Across the Organization
LORE's impact was measured across four dimensions: developer productivity, operational risk, regulatory readiness, and organizational resilience. Results were tracked for the first 90 days following full deployment.
| Metric | Before LORE | After LORE | Change |
|---|---|---|---|
| New developer onboarding time | 14 weeks | 3.8 weeks | 73% reduction |
| Documentation coverage | 8% | 94% | 12x increase |
| Avg. time from ticket to merged PR | 23 days | 4.2 days | 82% reduction |
| Production incidents from code changes | 3.2 per quarter | 0 in first quarter | 100% elimination |
| Impact analysis time (per change) | 2–6 hours manual | 3 seconds automated | >99% reduction |
| Backlog tickets processed (first 6 weeks) | ~8 tickets/6 weeks | 47 tickets/6 weeks | 5.9x throughput |
| Cross-stack dependency visibility | None (undocumented) | Full graph coverage | Complete visibility |
| Regulatory field implementation estimate | 18 months (manual) | 4.5 months (with LORE) | 75% acceleration |
> "We went from being terrified to touch any code in the settlement engine to having an AI agent that could tell us, in three seconds, exactly what would be affected by any change. LORE didn't just document our systems — it gave us back the institutional knowledge we thought was permanently lost when our senior engineers left. For the first time in years, we're making forward progress on our backlog instead of just keeping the lights on."
What Made LORE Different
Several architectural and methodological decisions distinguished LORE from conventional code analysis tools and set the foundation for its effectiveness.
GraphRAG over Pure Vector RAG
By combining Neo4j graph traversal with vector similarity search in a single database, LORE answers relationship-aware questions that pure embedding-based retrieval cannot. “What calls this function?” is a graph query. “Find code similar to this pattern” is a vector query. LORE does both in the same request, producing contextually richer answers.
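The hybrid pattern can be shown with a toy retriever under made-up embeddings and call edges: rank candidates by vector similarity, then expand the top hits one graph hop outward so structurally related code rides along even when its embedding is dissimilar.

```python
import math

# Toy GraphRAG retrieval; embeddings, dimensions, and node names are all
# invented for illustration. Real LORE runs this inside Neo4j.
def cosine(a: tuple, b: tuple) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def hybrid_query(query_vec, embeddings: dict, calls: dict, k: int = 1) -> set:
    """Top-k by vector similarity, expanded with direct graph neighbors."""
    ranked = sorted(embeddings, key=lambda n: cosine(query_vec, embeddings[n]), reverse=True)
    hits = set(ranked[:k])
    for node in list(hits):
        hits |= calls.get(node, set())  # the graph hop a pure-vector system lacks
    return hits
```

The final line is the point: `SettlementBridge` might embed nowhere near a payments query, but the call edge from `PaymentProcessor` still pulls it into the answer.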
Context Engineering, Not Generic Prompts
The Coder Agent retrieves the target codebase's actual conventions from the Knowledge Base before generating code. It is primed with real examples of how this specific codebase handles error flows, naming, logging, and architectural patterns — not generic best practices. This produces code that is stylistically indistinguishable from human-written code in the same project.
AST-Grounded Documentation
Every generated documentation statement traces back to a specific AST node in the parsed code. The Critic sub-agent validates this traceability on every generation and update cycle, preventing hallucinated documentation from entering the system. This achieved a verified accuracy rate of 96.3% on a random sample of 500 documentation statements.
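A simplified version of the Critic's traceability check, for Python sources only (LORE validates against tree-sitter ASTs across all five stacks): every function name a documentation statement claims to describe must resolve to an actual definition in the parsed code.

```python
import ast

# Illustrative traceability check: documented names that match no AST node
# are flagged as potential hallucinations before the docs are published.
def undocumented_claims(source: str, documented_functions: set[str]) -> set[str]:
    """Return documented function names with no matching definition."""
    defined = {
        node.name
        for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    return documented_functions - defined
```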
MCP as Universal Integration Layer
Exposing the Knowledge Base as an MCP server meant that the client's developers could query institutional knowledge directly from their Cursor IDE, custom scripts could run impact analysis in CI pipelines, and the SDLC agents themselves used the same MCP tools as human developers. One interface, every consumer.
Episodic Memory for Institutional Knowledge
Every enhancement processed through the Agentic SDLC is recorded as a ChangeEvent in the Knowledge Base, preserving what changed, why, and who approved it. This creates a growing institutional memory that compounds over time — the system gets smarter with every change, unlike documentation that degrades.
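The shape of such a record can be sketched as an append-only log of events capturing the what, why, and who of each change; the field names below are illustrative, not LORE's actual ChangeEvent schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical ChangeEvent shape; real records live as nodes in the
# Knowledge Base graph, linked to the code entities they touched.
@dataclass
class ChangeEvent:
    ticket: str           # e.g. the Jira key
    summary: str          # what changed
    rationale: str        # why it changed
    approved_by: str      # who approved it
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def record_change(log: list, event: ChangeEvent) -> None:
    """Append-only episodic memory: past events are never rewritten."""
    log.append(event)
```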
Risk-Calibrated Human Gates
The system automatically assesses the risk level of every proposed change based on the impact analysis scope. High-risk changes (cross-stack, touching settlement logic) require senior engineer approval. Low-risk changes (documentation updates, config tweaks) can be auto-approved via policy. This kept humans focused on decisions that mattered.
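One plausible shape for such a policy, assuming the impact analysis yields the touched modules and stacks (module names and thresholds below are invented for illustration):

```python
# Illustrative risk-calibration policy. The sensitive-module list and the
# gate names are assumptions, not LORE's actual configuration.
SENSITIVE_MODULES = {"settlement", "regulatory"}

def required_gate(touched_modules: set[str], stacks: set[str]) -> str:
    """Map the impact-analysis scope of a proposed change to an approval gate."""
    if touched_modules & SENSITIVE_MODULES or len(stacks) > 1:
        return "senior-engineer-approval"  # high risk: settlement logic or cross-stack
    if touched_modules:
        return "peer-review"               # ordinary code change
    return "auto-approve"                  # docs- or config-only change
```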
Insights for Other Organizations
This engagement surfaced several insights applicable to any organization managing significant legacy application portfolios.
- Start with understanding, not modernization. The impulse to rewrite or refactor legacy code is strong, but premature without deep understanding. LORE's documentation-first approach ensured that every subsequent decision was informed by actual system behavior, not assumptions about what the code should do.
- Graph databases unlock questions you didn't know to ask. The client's team discovered 23 undocumented cross-stack integration points during the initial graph construction — dependencies that no team member was aware of. Vector search alone could not have surfaced these structural relationships.
- Incremental ingestion is non-negotiable for adoption. Initial full-codebase analysis took 4 hours per application. Without incremental updates (processing only changed files via git diff), the system would have been too slow for daily use. The DocSync agent's AST-diff approach kept ongoing updates under 30 seconds per commit.
- Human gates build trust, and trust enables autonomy. By requiring human approval at every decision point initially, the team built confidence in the agents' output quality. After 4 weeks, the client voluntarily enabled auto-approval for low-risk changes — a decision they would never have made without the trust earned through transparent review cycles.
- Multi-persona interfaces are essential for organization-wide adoption. Engineers, architects, QA, and managers all interact with the same knowledge base but need fundamentally different views. The KBQuery agent's persona-based responses meant a manager asking about “payment system risk” received a business-level summary, while a developer asking the same question received specific function names and call paths.
- MCP creates an extensible future. Three months after deployment, the client's own engineers built two additional MCP tools on top of the ontology — a compliance audit tool and a dead-code detector — without any changes to the core LORE platform. The protocol-based architecture turned the knowledge base into a platform rather than a product.
Roadmap Beyond Initial Deployment
With LORE operational, the client is now pursuing three follow-on initiatives that build on the platform's foundation.
Self-Improving Agent Loop
When human reviewers modify agent-generated code during the approval process, those corrections are fed back into the Knowledge Base as training signal. Over time, the Coder Agent's output increasingly matches the team's preferences without explicit prompt tuning.
Modernization Planning
With full dependency graphs and business domain classification now available, the client is using LORE's impact analysis to identify clean extraction boundaries for a strangler-fig modernization of their core platform — decomposing the monolith into services with confidence in the boundaries.
Compliance Automation
The regulatory reporting fields that triggered the original engagement are now being implemented through the Agentic SDLC pipeline. Projected completion: 4.5 months — well within the 9-month regulatory deadline and 13.5 months ahead of the original manual estimate.