Industrial RFQ Matching | Reference Architecture

The Challenge

Industrial MRO distributors process RFQs of 50 to 50,000 line items, and each customer uses its own part numbering system. The matching system must map every item to a large catalog (10M-14M+ products).

Critical Metric: Fill Rate vs Accuracy

Fill Rate (Primary Goal)

Definition: % of RFQ items successfully quoted

Industry Reality: <50% for many distributors

Why it matters: At a fill rate below 50%, more than half of requested items go unquoted, which is lost revenue

Accuracy (Confidence on Matches)

Definition: Confidence score on items that ARE matched

Typical Target: 85%+ on matched items

Why it matters: Wrong match = returns, safety issues, trust loss

The Goal: Maximize fill rate (get to 50%+ quote rate) while maintaining high accuracy (85%+ confidence) on matches. These are different metrics addressing different problems.
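The two metrics can be computed separately from the same result set. The following is a minimal sketch, assuming a hypothetical `LineResult` record per RFQ line; the field names and the sample data are illustrative and not taken from a real system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LineResult:
    rfq_part: str
    matched_sku: Optional[str]  # None when the item could not be quoted
    confidence: float = 0.0     # confidence score of the match, if any


def fill_rate(results: List[LineResult]) -> float:
    """Share of RFQ items that were quoted at all."""
    if not results:
        return 0.0
    quoted = sum(1 for r in results if r.matched_sku is not None)
    return quoted / len(results)


def mean_match_confidence(results: List[LineResult]) -> float:
    """Average confidence over matched items only (unmatched items are ignored)."""
    scores = [r.confidence for r in results if r.matched_sku is not None]
    return sum(scores) / len(scores) if scores else 0.0


sample = [
    LineResult("6208-2RS", "SKU-1001", 0.95),
    LineResult("FUSE 30A", "SKU-2002", 0.85),
    LineResult("UNKNOWN-1", None),
    LineResult("UNKNOWN-2", None),
]
print(f"fill rate: {fill_rate(sample):.0%}")
print(f"mean confidence on matches: {mean_match_confidence(sample):.2f}")
```

Adding unmatched items lowers the fill rate but leaves the confidence figure unchanged, which is why the two numbers have to be tracked separately.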

The Solution: Category-Based Routing + Progressive Cascade

Core Architecture

graph TD
    A[RFQ Input<br/>50-50K items] --> B[TIER 1: CATEGORY CLASSIFIER]
    B --> B1[Pattern-based<br/>bearing 6208]
    B --> B2[Keyword-based<br/>desc: FUSE]
    B --> B3[Pre-categorization<br/>Motion IDs]
    B --> C1[Bearing Matcher<br/>~1M products<br/>Threshold: 0.85]
    B --> C2[Sprocket Matcher<br/>~500K products<br/>Threshold: 0.85]
    B --> C3[Fuse Matcher<br/>~200K products<br/>Threshold: 0.90]
    B --> C4[Hydraulic Matcher<br/>~2M products<br/>Threshold: 0.85]
    C1 --> D[Category-Specific<br/>Progressive Cascade]
    C2 --> D
    C3 --> D
    C4 --> D
    style A fill:#667eea,stroke:#667eea,color:#fff
    style B fill:#764ba2,stroke:#764ba2,color:#fff
    style C3 fill:#f56565,stroke:#f56565,color:#fff
    style D fill:#48bb78,stroke:#48bb78,color:#fff
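The routing step in the diagram can be sketched as a small rule-based classifier. This is an illustrative sketch only: the regular expression, the keyword table, and the order in which the three signals are checked are assumptions, and only the per-category thresholds come from the diagram.

```python
import re
from typing import Optional

# Thresholds as shown in the diagram.
THRESHOLDS = {"bearing": 0.85, "sprocket": 0.85, "fuse": 0.90, "hydraulic": 0.85}

# Hypothetical pattern rule: 4-5 digit bearing base number with optional suffixes.
PATTERN_RULES = [
    ("bearing", re.compile(r"^\d{4,5}(-?(2RS|ZZ|2Z|C3))*$", re.IGNORECASE)),
]

# Hypothetical keyword rules applied to the item description.
KEYWORD_RULES = {
    "FUSE": "fuse",
    "SPROCKET": "sprocket",
    "BEARING": "bearing",
    "HYDRAULIC": "hydraulic",
}


def classify(part_number: str, description: str = "",
             precategory: Optional[str] = None) -> Optional[str]:
    """Return a category name, or None when no rule fires."""
    # 1. Trust an upstream category label when one is supplied (assumed order).
    if precategory in THRESHOLDS:
        return precategory
    # 2. Pattern-based: the part number itself looks like a known family.
    for category, pattern in PATTERN_RULES:
        if pattern.match(part_number.strip()):
            return category
    # 3. Keyword-based: scan the free-text description.
    upper = description.upper()
    for keyword, category in KEYWORD_RULES.items():
        if keyword in upper:
            return category
    return None


category = classify("6208")
print(category, THRESHOLDS.get(category))
```

Items that return `None` would need a fallback path (for example a global search or manual review); the source does not specify one.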

Progressive 11-Strategy Cascade (nine strategies, 0-8, are listed here)

Layer 1: Verified Data (70% coverage, <1ms)
- Strategy 0: Exact match (confidence 1.0)
- Strategy 1: V1 interchange (confidence 0.95)
- Strategy 2: Fuzzy match (confidence 0.90)
- Strategy 3: Prefix match (confidence 0.85)
Layer 2: Semantic Patterns (+15% coverage, 5-10ms)
- Strategy 4: Bearing base numbers
- Strategy 5: Category-specific extraction
Layer 3: AI Embeddings (+8% coverage, <1ms FAISS)
- Strategy 6: Fine-tuned Sentence-BERT
- Strategy 7: Category-specific FAISS indexes
Layer 4: Agentic Research (+2% coverage, high-value only)
- Strategy 8: Web research for unknowns
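The cascade above amounts to trying strategies in order and returning the first result that clears the category threshold. Below is a minimal sketch with three toy strategies (exact, interchange, prefix); the lookup tables, SKU values, and function names are hypothetical, and fuzzy matching and the later layers are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Match:
    sku: str
    confidence: float
    strategy: str


Strategy = Callable[[str], Optional[Match]]

# Hypothetical verified data.
CATALOG = {"6208-2RS": "SKU-1001"}
INTERCHANGE = {"SKF-6208": "SKU-1001"}


def exact_match(part: str) -> Optional[Match]:
    sku = CATALOG.get(part)
    return Match(sku, 1.0, "exact") if sku else None


def v1_interchange(part: str) -> Optional[Match]:
    sku = INTERCHANGE.get(part)
    return Match(sku, 0.95, "v1_interchange") if sku else None


def prefix_match(part: str) -> Optional[Match]:
    for known, sku in CATALOG.items():
        if part.startswith(known):
            return Match(sku, 0.85, "prefix")
    return None


def run_cascade(part: str, strategies: Sequence[Strategy],
                threshold: float) -> Optional[Match]:
    """Return the first match at or above the threshold, else None."""
    for strategy in strategies:
        match = strategy(part)
        if match is not None and match.confidence >= threshold:
            return match
    return None  # falls through to the next layer or manual review


STRATEGIES = [exact_match, v1_interchange, prefix_match]
print(run_cascade("6208-2RS/C3", STRATEGIES, threshold=0.85))
```

Because strategies are ordered from cheapest and most reliable to most expensive, most items exit early and the confidence of a result is explained by the strategy that produced it.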

Progressive Cascade Flow

graph LR
    A[Query Part] --> B{Layer 1<br/>Verified Data<br/>70% coverage}
    B -->|Match| Z[Return Result<br/>confidence 0.85-1.0]
    B -->|No Match| C{Layer 2<br/>Semantic Patterns<br/>+15% coverage}
    C -->|Match| Z
    C -->|No Match| D{Layer 3<br/>AI Embeddings<br/>+8% coverage}
    D -->|Match| Z
    D -->|No Match| E{Layer 4<br/>Agentic Research<br/>+2% coverage}
    E -->|Match| Z
    E -->|No Match| F[Manual Review]
    style B fill:#48bb78,stroke:#48bb78,color:#fff
    style C fill:#4299e1,stroke:#4299e1,color:#fff
    style D fill:#9f7aea,stroke:#9f7aea,color:#fff
    style E fill:#ed8936,stroke:#ed8936,color:#fff
    style Z fill:#667eea,stroke:#667eea,color:#fff
    style F fill:#cbd5e0,stroke:#a0aec0,color:#1a1a1a

Key Architectural Decisions

Category-First, Not Search-First

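Searching ~1M products in one category instead of a 14M monolithic catalog shrinks the search space about 14x, and exhaustive search time shrinks roughly in proportion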

Evidence Before AI

About 70% of items are resolved from verified data before any ML runs; exact matches carry confidence 1.0, and the other Layer 1 strategies carry 0.85-0.95

Multi-Dimensional Thresholds

Safety-critical (0.90+) vs general (0.85) vs commodity (0.80)

Category-Specific Logic

Bearings: base numbers. Sprockets: teeth count. Fuses: amperage
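Category-specific extraction of this kind is often done with small, targeted parsers. The sketch below shows one possible form for the three attributes named above; the regular expressions are assumptions chosen for illustration and would need tuning against real part data.

```python
import re
from typing import Optional


def bearing_base_number(part: str) -> Optional[str]:
    """First standalone run of 4-5 digits, e.g. '6208' in 'SKF 6208-2RS1'."""
    m = re.search(r"(?<!\d)(\d{4,5})(?!\d)", part)
    return m.group(1) if m else None


def sprocket_teeth(description: str) -> Optional[int]:
    """Tooth count written as '15T', '15 TEETH' or '15 TOOTH'."""
    m = re.search(r"(\d+)\s*(?:T\b|TEETH|TOOTH)", description, re.IGNORECASE)
    return int(m.group(1)) if m else None


def fuse_amperage(description: str) -> Optional[float]:
    """Current rating written as '30A', '2.5 AMP' or '10 AMPS'."""
    m = re.search(r"(\d+(?:\.\d+)?)\s*(?:A\b|AMPS?\b)", description, re.IGNORECASE)
    return float(m.group(1)) if m else None


print(bearing_base_number("SKF 6208-2RS1"))
print(sprocket_teeth("SPROCKET 40B 15 TEETH"))
print(fuse_amperage("FUSE 30A 250V"))
```

Two parts from different manufacturers that share the extracted attribute (same base number, tooth count, or amperage) become candidates for a match even when their part numbers look unrelated.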

Common Pain Points in Industrial RFQ Matching

How this architecture addresses typical challenges

Pain #1: Large RFQs (50K-200K items) cause performance issues

Solution: Category routing reduces search space 10-14x (1M per category vs 10M-14M total). Batch: 397 items/sec throughput validated.

Pain #2: No visibility into per-category fill rates

Solution: Per-category reporting built-in. Example: "Bearings 95%, Hydraulics 78%, Electrical 85%" shows data gaps clearly.
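A per-category report like the one quoted above only needs the category and a quoted/unquoted flag for each item. A minimal sketch, assuming results arrive as hypothetical `(category, was_quoted)` pairs:

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def fill_rate_by_category(rows: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Map each category to the share of its items that were quoted."""
    totals: Dict[str, int] = defaultdict(int)
    quoted: Dict[str, int] = defaultdict(int)
    for category, was_quoted in rows:
        totals[category] += 1
        if was_quoted:
            quoted[category] += 1
    return {category: quoted[category] / totals[category] for category in totals}


# Illustrative data only.
rows = [
    ("Bearings", True), ("Bearings", True), ("Bearings", True), ("Bearings", False),
    ("Hydraulics", True), ("Hydraulics", False),
]
for category, rate in fill_rate_by_category(rows).items():
    print(f"{category}: {rate:.0%}")
```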

Pain #3: Confidence scores don't reflect data quality

Solution: Progressive cascade provides explainable confidence. Exact match = 1.0, verified interchange = 0.95, AI = 0.65-0.90.

Pain #4: One-size-fits-all approach misses domain patterns

Solution: Category-specific strategies extract domain knowledge (bearing base numbers, sprocket teeth, fuse amperage).

Technology Stack

graph TB
    subgraph Input
        A[RFQ Items]
    end
    subgraph Classification
        B[Category Classifier<br/>Pattern + Keyword]
    end
    subgraph Verified
        C1[Postgres DB<br/>V1 Verified Data<br/>Strategies 0-3]
    end
    subgraph AI_Layer
        D1[Sentence-BERT<br/>Fine-tuned BGE-small]
        D2[FAISS Indexes<br/>Category-specific<br/>Strategy 6-7]
    end
    subgraph LLM_Layer
        E1[Claude Haiku<br/>Extraction]
        E2[Claude Sonnet<br/>Validation]
        E3[Gemini Flash<br/>Explanations]
    end
    subgraph Agentic
        F[Claude Sonnet<br/>Computer Use<br/>Web Research]
    end
    A --> B
    B --> C1
    C1 -->|70% coverage| G[Results]
    C1 -->|No match| D1
    D1 --> D2
    D2 -->|+8% coverage| G
    D2 -->|No match| F
    F -->|+2% coverage| G
    G --> E1
    G --> E2
    G --> E3
    style A fill:#667eea,stroke:#667eea,color:#fff
    style B fill:#764ba2,stroke:#764ba2,color:#fff
    style C1 fill:#48bb78,stroke:#48bb78,color:#fff
    style D2 fill:#9f7aea,stroke:#9f7aea,color:#fff
    style G fill:#667eea,stroke:#667eea,color:#fff

Embeddings: Similarity Search (Strategy 6)

Model: Sentence-BERT (BGE-small-en-v1.5) fine-tuned on industrial parts

Purpose: Learn part number patterns from verified interchanges

Cost: $0 (local inference, no API calls)

Index: FAISS IndexFlatIP, category-specific indexes

Query latency: <1ms

Retrieval quality: 96.3% Recall@10
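A flat inner-product index performs an exact, exhaustive search: every stored vector is scored against the query and the top k are returned. The sketch below reproduces that behaviour in plain Python as a stand-in (it does not use the FAISS API) and shows one common way to compute Recall@k; whether the 96.3% figure was measured exactly this way is an assumption, and the vectors are toy data.

```python
from typing import Dict, List, Sequence, Tuple


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def top_k(query: Sequence[float], index: Dict[str, Sequence[float]],
          k: int) -> List[str]:
    """Exhaustive search: score every vector, return the k best SKUs."""
    ranked = sorted(index, key=lambda sku: inner_product(query, index[sku]),
                    reverse=True)
    return ranked[:k]


def recall_at_k(labelled: List[Tuple[Sequence[float], str]],
                index: Dict[str, Sequence[float]], k: int = 10) -> float:
    """Share of queries whose known-correct SKU appears in the top k."""
    hits = sum(1 for query, expected in labelled
               if expected in top_k(query, index, k))
    return hits / len(labelled)


# Toy two-dimensional "embeddings" for one category index.
bearing_index = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [0.7, 0.7]}
labelled_queries = [([1.0, 0.0], "A"), ([0.0, 1.0], "A")]

print(top_k([1.0, 0.0], bearing_index, k=2))
print(recall_at_k(labelled_queries, bearing_index, k=1))
```

Keeping one index per category means each query scans only that category's vectors, which is what keeps exhaustive search fast enough without an approximate index.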

LLMs: Validation & Explanation (Future Layer)

LLMs are not used for retrieval, because embeddings are faster and cheaper. They are used for complex reasoning, validation, and explanation:

Claude Haiku: Attribute extraction ($0.025/1K parts)
Claude Sonnet: Compatibility validation ($0.30/1K validations)
Gemini Flash: Match explanations ($0.0004/100 explanations)

Total AI Cost: <$40/month for 200K items. That is under $480/year, or about 0.01% of the $4.7M annual savings.
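The cost ratio can be checked with a few lines of arithmetic. The inputs are the figures quoted above; the monthly cap is treated as an upper bound, so the results are upper bounds as well.

```python
MONTHLY_AI_COST_USD = 40.0          # quoted upper bound for 200K items/month
ANNUAL_SAVINGS_USD = 4_700_000.0    # quoted annual savings
ITEMS_PER_MONTH = 200_000
HAIKU_COST_PER_1K_PARTS = 0.025     # quoted attribute-extraction price

annual_ai_cost = MONTHLY_AI_COST_USD * 12
cost_share = annual_ai_cost / ANNUAL_SAVINGS_USD
haiku_monthly = ITEMS_PER_MONTH / 1000 * HAIKU_COST_PER_1K_PARTS

print(f"annual AI cost: ${annual_ai_cost:,.0f}")
print(f"share of savings: {cost_share:.4%}")
print(f"Haiku extraction for all items: ${haiku_monthly:.2f}/month")
```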

Agentic AI: Exception Research (2-5% of parts)

Use: Claude Sonnet with computer use for unknown manufacturers, discontinued parts, high-value RFQs

Cost: $0.50-$2.00 per part (exceptions only, economically justified)
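"Economically justified" can be made concrete with a simple gate in front of the research agent. The sketch below is one possible rule, not the documented policy: the margin rate and the use of the upper cost bound are assumptions.

```python
def should_escalate(line_value_usd: float,
                    est_margin_rate: float = 0.20,    # assumed gross margin
                    research_cost_usd: float = 2.00   # upper bound quoted above
                    ) -> bool:
    """Escalate only when the expected margin on the line exceeds research cost."""
    return line_value_usd * est_margin_rate > research_cost_usd


print(should_escalate(500.0))  # expected margin $100 vs $2 cost
print(should_escalate(5.0))    # expected margin $1 vs $2 cost
```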

Full Stack

Sentence-BERT (BGE-small), FAISS, Postgres, FastAPI, Claude (Haiku/Sonnet), Gemini Flash, Python

Key Takeaways

For Supply Chain Operations

What this solves: Matches customer RFQs (varying formats) to large industrial catalogs (10M-14M+ products)
Why category-based routing: 10-14x smaller search space, prevents cross-category errors, enables per-category reporting
Fill rate vs accuracy: Fill rate = % quoted, accuracy = confidence on matches. This architecture addresses both.
Operational transparency: Per-category metrics show where data gaps exist

For AI Engineering Teams

Architecture principles: Evidence before AI, category-first, progressive cascade, category-specific thresholds
Tech stack: Sentence-BERT (fine-tuned) + FAISS + Postgres + Claude (validation) + FastAPI
Key learnings: Embeddings = fallback not primary, one threshold ≠ all use cases, category-specific beats global
Validated results: 96.3% Recall@10, 8.3x improvement, <1ms queries (single-category proof)
Critical path: Training data acquisition determines timeline to multi-category deployment
