Industrial RFQ Matching: Reference Architecture

A production approach to matching customer RFQs across large industrial catalogs

The Challenge

Industrial MRO distributors process RFQs (50-50,000 items) where customers use different part numbering systems. The matching system must map these to a large catalog (10M-14M+ products).

Critical Metric: Fill Rate vs Accuracy

Fill Rate (Primary Goal)

Definition: Percentage of RFQ items successfully quoted

Industry Reality: <50% for many distributors

Why it matters: Half of customer requests go unfilled = lost revenue

Accuracy (Confidence on Matches)

Definition: Confidence score on items that ARE matched

Typical Target: 85%+ on matched items

Why it matters: Wrong match = returns, safety issues, trust loss

The Goal: Maximize fill rate (get to 50%+ quote rate) while maintaining high accuracy (85%+ confidence) on matches. These are different metrics addressing different problems.
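To make the distinction concrete, here is a minimal sketch that computes both metrics from the same set of quote lines. The `QuoteLine` record and its field names are assumptions for illustration, and "accuracy" is read here as the mean confidence over matched items only, which is one possible interpretation of the definition above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QuoteLine:
    """One RFQ item after matching (hypothetical record shape)."""
    rfq_part: str
    matched_sku: Optional[str]  # None when the item could not be quoted
    confidence: float = 0.0     # only meaningful when matched_sku is set


def fill_rate(lines: List[QuoteLine]) -> float:
    """Share of RFQ items that received a match (quoted / total)."""
    if not lines:
        return 0.0
    quoted = sum(1 for line in lines if line.matched_sku is not None)
    return quoted / len(lines)


def mean_confidence_on_matched(lines: List[QuoteLine]) -> float:
    """Average confidence over matched items; unmatched items are excluded."""
    matched = [line.confidence for line in lines if line.matched_sku is not None]
    if not matched:
        return 0.0
    return sum(matched) / len(matched)


lines = [
    QuoteLine("6208-2RS", "SKU-1", 1.0),
    QuoteLine("FUSE 30A", "SKU-2", 0.95),
    QuoteLine("40B18", "SKU-3", 0.70),
    QuoteLine("UNKNOWN-XYZ", None),
]

print(f"fill rate: {fill_rate(lines):.0%}")
print(f"mean confidence on matched: {mean_confidence_on_matched(lines):.2f}")
```

The unmatched item lowers the fill rate but has no effect on the confidence figure, which is why the two numbers have to be tracked separately.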

The Solution: Category-Based Routing + Progressive Cascade

Core Architecture

graph TD
    A[RFQ Input<br/>50-50K items] --> B[TIER 1: CATEGORY CLASSIFIER]
    B --> B1[Pattern-based<br/>bearing 6208]
    B --> B2[Keyword-based<br/>desc: FUSE]
    B --> B3[Pre-categorization<br/>Motion IDs]
    B --> C1[Bearing Matcher<br/>~1M products<br/>Threshold: 0.85]
    B --> C2[Sprocket Matcher<br/>~500K products<br/>Threshold: 0.85]
    B --> C3[Fuse Matcher<br/>~200K products<br/>Threshold: 0.90]
    B --> C4[Hydraulic Matcher<br/>~2M products<br/>Threshold: 0.85]
    C1 --> D[Category-Specific<br/>Progressive Cascade]
    C2 --> D
    C3 --> D
    C4 --> D
    style A fill:#667eea,stroke:#667eea,color:#fff
    style B fill:#764ba2,stroke:#764ba2,color:#fff
    style C3 fill:#f56565,stroke:#f56565,color:#fff
    style D fill:#48bb78,stroke:#48bb78,color:#fff
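The routing step in the diagram can be sketched roughly as follows. This is an illustrative outline, not the production classifier: the regex, the keyword table, and the order in which the three signals are checked (upstream label first, then pattern, then keyword) are all assumptions. The thresholds are the ones shown in the diagram.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern rule: 4-digit designations such as "6208".
PATTERN_RULES = [
    (re.compile(r"\b6[0-4]\d{2}\b"), "bearing"),
]

# Hypothetical keyword rules, matched against the upper-cased description.
KEYWORD_RULES = {
    "FUSE": "fuse",
    "SPROCKET": "sprocket",
    "HYDRAULIC": "hydraulic",
    "BEARING": "bearing",
}

# Per-category match thresholds taken from the diagram.
THRESHOLDS = {"bearing": 0.85, "sprocket": 0.85, "fuse": 0.90, "hydraulic": 0.85}


def classify(part_number: str, description: str = "",
             pre_category: Optional[str] = None) -> Optional[str]:
    """Return a category name, or None when no rule fires."""
    # 1. Trust an upstream category label when it names a known category.
    if pre_category in THRESHOLDS:
        return pre_category
    # 2. Pattern-based rules on the part number.
    for pattern, category in PATTERN_RULES:
        if pattern.search(part_number):
            return category
    # 3. Keyword-based rules on the description.
    upper = description.upper()
    for keyword, category in KEYWORD_RULES.items():
        if keyword in upper:
            return category
    return None


def route(part_number: str, description: str = "",
          pre_category: Optional[str] = None) -> Optional[Tuple[str, float]]:
    """Pick the category matcher and the threshold it should apply."""
    category = classify(part_number, description, pre_category)
    if category is None:
        return None
    return category, THRESHOLDS[category]


print(route("6208-2RS"))
print(route("F-100", "FUSE, 30A 250V"))
print(route("ZZZ"))
```

Items that no rule can place return `None`; how those are handled (a default matcher, or straight to review) is left open here.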

Progressive 11-Strategy Cascade

Progressive Cascade Flow

graph LR
    A[Query Part] --> B{Layer 1<br/>Verified Data<br/>70% coverage}
    B -->|Match| Z[Return Result<br/>confidence 0.85-1.0]
    B -->|No Match| C{Layer 2<br/>Semantic Patterns<br/>+15% coverage}
    C -->|Match| Z
    C -->|No Match| D{Layer 3<br/>AI Embeddings<br/>+8% coverage}
    D -->|Match| Z
    D -->|No Match| E{Layer 4<br/>Agentic Research<br/>+2% coverage}
    E -->|Match| Z
    E -->|No Match| F[Manual Review]
    style B fill:#48bb78,stroke:#48bb78,color:#fff
    style C fill:#4299e1,stroke:#4299e1,color:#fff
    style D fill:#9f7aea,stroke:#9f7aea,color:#fff
    style E fill:#ed8936,stroke:#ed8936,color:#fff
    style Z fill:#667eea,stroke:#667eea,color:#fff
    style F fill:#cbd5e0,stroke:#a0aec0,color:#1a1a1a
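The control flow of the cascade can be sketched as a list of layers tried in order, returning the first hit that clears the threshold. The layer functions below are stubs with made-up lookup tables and confidence values, and the embedding and agentic layers are placeholders; only the fall-through structure is meant to mirror the diagram.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class Match(NamedTuple):
    sku: str
    confidence: float
    layer: str  # which layer produced the match, for explainability


Hit = Optional[Tuple[str, float]]
Layer = Tuple[str, Callable[[str], Hit]]

# Toy data standing in for verified interchanges and learned base numbers.
VERIFIED = {"6208-2RS": "SKU-BRG-6208-2RS"}
BASE_NUMBERS = {"6208": "SKU-BRG-6208"}


def verified_layer(query: str) -> Hit:
    sku = VERIFIED.get(query)
    return (sku, 1.0) if sku else None


def pattern_layer(query: str) -> Hit:
    base = query.split("-")[0]  # drop the suffix, keep the base number
    sku = BASE_NUMBERS.get(base)
    return (sku, 0.88) if sku else None  # 0.88 is an illustrative value


def embedding_layer(query: str) -> Hit:
    return None  # placeholder for a vector similarity search


def agentic_layer(query: str) -> Hit:
    return None  # placeholder for exception research


LAYERS: List[Layer] = [
    ("verified", verified_layer),
    ("semantic_pattern", pattern_layer),
    ("embedding", embedding_layer),
    ("agentic", agentic_layer),
]


def run_cascade(query: str, layers: List[Layer] = LAYERS,
                threshold: float = 0.85) -> Optional[Match]:
    """Try each layer in order; None means the item goes to manual review."""
    for name, lookup in layers:
        hit = lookup(query)
        if hit is not None and hit[1] >= threshold:
            return Match(hit[0], hit[1], name)
    return None


print(run_cascade("6208-2RS"))
print(run_cascade("6208-ZZ"))
print(run_cascade("NOPE"))
```

Recording the layer name on each match is one way to keep the confidence score explainable, since a reviewer can see whether a result came from verified data or from a later, less certain layer.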

Key Architectural Decisions

Category-First, Not Search-First

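Searching ~1M products per category instead of a 14M monolithic catalog shrinks the search space up to 14x, with a corresponding speedup for exhaustive search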

Evidence Before AI

70% solved via verified data (100% accurate) before any ML

Multi-Dimensional Thresholds

Safety-critical (0.90+) vs general (0.85) vs commodity (0.80)

Category-Specific Logic

Bearings: base numbers. Sprockets: teeth count. Fuses: amperage
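As a rough illustration of category-specific logic, the sketch below pulls one distinguishing attribute out of each part type. The regular expressions and the input formats they assume (for example sprocket numbers shaped like `40B18`, meaning chain size, hub type, tooth count) are simplifications chosen for the example, not the rules used in the architecture.

```python
import re
from typing import Optional


def bearing_base_number(part_number: str) -> Optional[str]:
    """Return the first 4-5 digit run, e.g. '6208' from '6208-2RS'."""
    m = re.search(r"\d{4,5}", part_number)
    return m.group(0) if m else None


def sprocket_teeth(part_number: str) -> Optional[int]:
    """Return the tooth count from numbers shaped like '40B18'."""
    m = re.fullmatch(r"(\d+)([ABC])(\d+)", part_number.strip().upper())
    return int(m.group(3)) if m else None


def fuse_amperage(description: str) -> Optional[float]:
    """Return the amp rating from text like 'FUSE, 30A 250V'."""
    m = re.search(r"(\d+(?:\.\d+)?)\s*A(?:MPS?)?\b", description.upper())
    return float(m.group(1)) if m else None


print(bearing_base_number("6208-2RS"))
print(sprocket_teeth("40B18"))
print(fuse_amperage("FUSE, 30A 250V"))
```

Two parts from different manufacturers that share the extracted attribute become candidates for the same match, which is the kind of domain knowledge a generic text search does not capture.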

Common Pain Points in Industrial RFQ Matching

How this architecture addresses typical challenges

Pain #1: Large RFQs (50K-200K items) cause performance issues
Solution: Category routing reduces search space 10-14x (1M per category vs 10M-14M total). Batch: 397 items/sec throughput validated.
Pain #2: No visibility into per-category fill rates
Solution: Per-category reporting built-in. Example: "Bearings 95%, Hydraulics 78%, Electrical 85%" shows data gaps clearly.
Pain #3: Confidence scores don't reflect data quality
Solution: Progressive cascade provides explainable confidence. Exact match = 1.0, verified interchange = 0.95, AI = 0.65-0.90.
Pain #4: One-size-fits-all approach misses domain patterns
Solution: Category-specific strategies extract domain knowledge (bearing base numbers, sprocket teeth, fuse amperage).
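The per-category reporting mentioned under Pain #2 could look something like the sketch below. The input shape, a list of `(category, matched)` pairs, is an assumption for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def fill_rate_by_category(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Map each category to the share of its items that were matched."""
    totals: Dict[str, int] = defaultdict(int)
    filled: Dict[str, int] = defaultdict(int)
    for category, matched in results:
        totals[category] += 1
        if matched:
            filled[category] += 1
    return {category: filled[category] / totals[category] for category in totals}


results = [
    ("Bearings", True),
    ("Bearings", True),
    ("Hydraulics", True),
    ("Hydraulics", False),
    ("Electrical", True),
]

for category, rate in sorted(fill_rate_by_category(results).items()):
    print(f"{category}: {rate:.0%}")
```

A category whose rate sits well below the others points to a data gap in that part of the catalog rather than a problem with the matcher as a whole.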

Technology Stack

graph TB
    subgraph Input
        A[RFQ Items]
    end
    subgraph Classification
        B[Category Classifier<br/>Pattern + Keyword]
    end
    subgraph Verified
        C1[Postgres DB<br/>V1 Verified Data<br/>Strategies 0-3]
    end
    subgraph AI_Layer
        D1[Sentence-BERT<br/>Fine-tuned BGE-small]
        D2[FAISS Indexes<br/>Category-specific<br/>Strategy 6-7]
    end
    subgraph LLM_Layer
        E1[Claude Haiku<br/>Extraction]
        E2[Claude Sonnet<br/>Validation]
        E3[Gemini Flash<br/>Explanations]
    end
    subgraph Agentic
        F[Claude Sonnet<br/>Computer Use<br/>Web Research]
    end
    A --> B
    B --> C1
    C1 -->|70% coverage| G[Results]
    C1 -->|No match| D1
    D1 --> D2
    D2 -->|+8% coverage| G
    D2 -->|No match| F
    F -->|+2% coverage| G
    G --> E1
    G --> E2
    G --> E3
    style A fill:#667eea,stroke:#667eea,color:#fff
    style B fill:#764ba2,stroke:#764ba2,color:#fff
    style C1 fill:#48bb78,stroke:#48bb78,color:#fff
    style D2 fill:#9f7aea,stroke:#9f7aea,color:#fff
    style G fill:#667eea,stroke:#667eea,color:#fff

Embeddings: Similarity Search (Strategy 6)

Model: Sentence-BERT (BGE-small-en-v1.5) fine-tuned on industrial parts

Purpose: Learn part number patterns from verified interchanges

Cost: $0 (local inference, no API calls)

Index: FAISS IndexFlatIP (<1ms queries, 96.3% Recall@10, category-specific indexes)
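A flat inner-product index performs an exhaustive dot-product comparison between the query vector and every stored vector. The sketch below reproduces that operation in plain Python as a stand-in for FAISS, with one small index per category. The class name, the three-dimensional vectors, and the SKUs are invented for the example; real embeddings would come from the fine-tuned model.

```python
import math
from typing import Dict, List, Tuple


def normalize(vector: List[float]) -> List[float]:
    """Scale to unit length so the inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


class FlatInnerProductIndex:
    """Exhaustive inner-product search (pure-Python stand-in, not FAISS)."""

    def __init__(self) -> None:
        self.ids: List[str] = []
        self.vectors: List[List[float]] = []

    def add(self, item_id: str, vector: List[float]) -> None:
        self.ids.append(item_id)
        self.vectors.append(normalize(vector))

    def search(self, query: List[float], k: int = 10) -> List[Tuple[str, float]]:
        q = normalize(query)
        scored = [
            (item_id, sum(a * b for a, b in zip(q, vec)))
            for item_id, vec in zip(self.ids, self.vectors)
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


# One index per category, so a bearing query never scans fuse vectors.
indexes: Dict[str, FlatInnerProductIndex] = {
    "bearing": FlatInnerProductIndex(),
    "fuse": FlatInnerProductIndex(),
}
indexes["bearing"].add("SKU-BRG-1", [1.0, 0.0, 0.0])
indexes["bearing"].add("SKU-BRG-2", [0.0, 1.0, 0.0])
indexes["fuse"].add("SKU-FUSE-1", [0.9, 0.1, 0.0])

hits = indexes["bearing"].search([0.9, 0.1, 0.0], k=1)
print(hits)
```

Because the search is exhaustive, its cost grows linearly with the number of stored vectors, which is why splitting the catalog into category-specific indexes reduces query time.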

LLMs: Validation & Explanation (Future Layer)

Not used for retrieval (embeddings are faster and cheaper). Used for complex reasoning, validation, and explanation.

Total AI Cost: <$40/month for 200K items (about 0.01% of the $4.7M in annual savings)
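The 0.01% figure follows from annualizing the monthly cost and dividing by the savings. A quick check, using the upper bound of $40/month from the line above:

```python
monthly_ai_cost = 40.0        # upper bound stated above, USD per month
annual_savings = 4_700_000.0  # USD per year, as stated above

annual_ai_cost = monthly_ai_cost * 12        # 480.0 USD per year
cost_share = annual_ai_cost / annual_savings

print(f"annual AI cost: ${annual_ai_cost:,.0f}")
print(f"share of savings: {cost_share:.4%}")  # prints 0.0102%
```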

Agentic AI: Exception Research (2-5% of parts)

Use: Claude Sonnet with computer use for unknown manufacturers, discontinued parts, high-value RFQs

Cost: $0.50-$2.00 per part (exceptions only, economically justified)

Full Stack

Sentence-BERT (BGE-small), FAISS, Postgres, FastAPI, Claude (Haiku/Sonnet), Gemini Flash, Python

Key Takeaways

For Supply Chain Operations

For AI Engineering Teams