Enterprise Knowledge Graph Platform
Transforming hyper-complex EPC documentation into deterministic semantic intelligence.
Role & Ownership
- Owned complete semantic graph architecture and zero-orphan topology validation.
- Architected deterministic spaCy-based NLP extraction engine.
- Designed 13-specialist MCP server infrastructure for multi-agent validation.
- Formulated data lineage preservation strategy: raw text → extracted entities → graph nodes.
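The zero-orphan check and the lineage chain listed above can be illustrated with a minimal sketch. It assumes a graph stored as a list of nodes plus a list of edge pairs; the `GraphNode` fields, the tag values, and the `find_orphans` helper are hypothetical names chosen for illustration, not the platform's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphNode:
    """A graph node that keeps a pointer back to the raw text it came from."""
    node_id: str
    label: str
    source_doc: str  # hypothetical lineage field: originating document
    start: int       # character offset where the entity begins in that document
    end: int         # character offset where the entity ends


def find_orphans(nodes, edges):
    """Return the IDs of nodes that appear in no edge (degree zero)."""
    connected = set()
    for src, dst in edges:
        connected.add(src)
        connected.add(dst)
    return sorted(n.node_id for n in nodes if n.node_id not in connected)


# Illustrative data only.
nodes = [
    GraphNode("P-101", "PUMP", "spec_A.txt", 12, 17),
    GraphNode("V-205", "VALVE", "spec_A.txt", 40, 45),
    GraphNode("T-300", "TANK", "spec_B.txt", 3, 8),
]
edges = [("P-101", "V-205")]  # T-300 is not connected to anything

orphans = find_orphans(nodes, edges)
print(orphans)  # ['T-300']
```

A validation gate built this way would reject the graph until `find_orphans` returns an empty list, and each rejected node can be traced to its source document through the lineage fields.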
The Challenge
Engineering, Procurement, and Construction (EPC) projects produce hyper-complex documentation. Traditional RAG fails on it because flat vector embeddings encode neither the topology that connects entities nor the regulatory rules that constrain them. We needed to transform ambiguous text into auditable infrastructure.
NLP-First Extraction Pipeline
Replaced zero-shot LLM extraction with a hardened processing pipeline built on custom spaCy components for the engineering domain, including NER for engineering taxonomies.
- Deterministic extraction rules
- Typed immutable artifacts
- 100% data auditability
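As a rough illustration of the three properties above, the sketch below pairs fixed extraction rules with frozen dataclasses and character offsets. It deliberately uses standard-library regular expressions as a stand-in for spaCy rule-based matching so that it runs without extra dependencies; in a spaCy pipeline a rule-based component such as the `EntityRuler` would typically play this role. The tag formats, labels, and the `extract` function are assumptions made for the example.

```python
import re
from dataclasses import dataclass

# Hypothetical tag formats: "P-101" for pumps, "V-205" for valves.
RULES = (
    ("PUMP", re.compile(r"\bP-\d{3}\b")),
    ("VALVE", re.compile(r"\bV-\d{3}\b")),
)


@dataclass(frozen=True)
class Entity:
    """Immutable extraction artifact; offsets point back into the raw text."""
    label: str
    text: str
    start: int
    end: int


def extract(text):
    """Apply every rule in a fixed order; the same input yields the same output."""
    found = []
    for label, pattern in RULES:
        for m in pattern.finditer(text):
            found.append(Entity(label, m.group(), m.start(), m.end()))
    return sorted(found, key=lambda e: e.start)


doc = "Pump P-101 feeds valve V-205."
entities = extract(doc)
for e in entities:
    # Audit check: the stored offsets reproduce the extracted text exactly.
    assert doc[e.start:e.end] == e.text
    print(e)
```

Because `Entity` is frozen, downstream stages cannot silently alter an extraction, and the stored offsets let a reviewer trace any artifact back to the exact span of source text.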
MCP Infrastructure & Service Layer
Built 13 specialized semantic tools exposed via a dedicated MCP server. Decoupled graph logic from the consumption layer to provide a safe query interface.
- Graph traversal algorithms
- Automated validation rules
- Dynamic entity discovery
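The decoupling described above can be sketched as a small tool registry in front of the graph. This is plain Python rather than an MCP SDK, and the `GRAPH` data, the `tool` decorator, the `downstream` tool, and `call_tool` are all hypothetical names used only to show the shape of the idea: consumers call named tools and never touch the graph structure directly.

```python
from collections import deque

# Toy adjacency list standing in for the knowledge graph (illustrative data).
GRAPH = {
    "P-101": ["V-205"],
    "V-205": ["T-300"],
    "T-300": [],
}

TOOLS = {}


def tool(name):
    """Register a function as a named, read-only tool."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("downstream")
def downstream(start):
    """Breadth-first traversal: every node reachable from `start`."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        for nxt in GRAPH.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def call_tool(name, **kwargs):
    """Single entry point for consumers; unknown tool names are rejected."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)


print(call_tool("downstream", start="P-101"))  # ['V-205', 'T-300']
```

Routing every request through one entry point is what keeps the query surface safe: an agent can only invoke what has been registered, and the graph storage can change without affecting callers.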
Performance Comparison
| Metric | Before | After | Change |
|---|---|---|---|
| Latency | ~248 ms | ~9 ms | 96% lower |
| Operating cost | $6.8M/run | $0.02/project | >99% savings |
| Schema | 73.7% | 100% | +26.3 pp |
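The Change column can be rechecked from the Before and After values. The short calculation below covers the Latency and Schema rows, reading the Schema row as a percentage score whose change is a difference in percentage points. The cost row is left out because its two values use different units (per run and per project). Variable names are illustrative.

```python
latency_before_ms = 248
latency_after_ms = 9

# Relative reduction in latency and the equivalent speed-up factor.
latency_reduction = (latency_before_ms - latency_after_ms) / latency_before_ms
speedup = latency_before_ms / latency_after_ms

schema_before = 73.7
schema_after = 100.0

# Difference between two percentages, expressed in percentage points.
schema_gain_pp = schema_after - schema_before

print(f"{latency_reduction:.1%}")   # 96.4%
print(f"{speedup:.1f}x")            # 27.6x
print(f"{schema_gain_pp:.1f} pp")   # 26.3 pp
```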
Strategic Key Takeaways
- The NLP-first architecture eliminated hallucinations through deterministic parsing.
- The foundation (graph topology) matters more than the retrieval algorithm.
- Agentic validation catches errors before they propagate to production.
- The MCP abstraction improved scalability and multi-agent safety.