Oracle - Sales Prediction ML | Aaron J. Spurlock

Overview

Oracle is a custom-trained machine learning system that predicts sale outcomes for incoming leads. Unlike products that simply call the OpenAI API or rely on off-the-shelf ML services, Oracle is trained entirely on Midtown's proprietary historical data, so it is tuned to our specific market, customer base, and sales patterns.

This isn't "we added AI to our product." This is a custom ML model built from scratch that directly affects revenue through intelligent lead routing.

What Oracle Does

Sale Prediction

Given a new lead, Oracle predicts the probability of a successful sale:

Analyzes lead characteristics against historical patterns
Returns a probability score (0-100%)
Factors in timing, location, product interest, and more

Rep Matching

Oracle ranks which sales rep has the best chance of closing each lead:

Considers rep's historical performance in similar scenarios
Factors in territory familiarity, product expertise, workload
Enables intelligent lead routing beyond simple round-robin
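
A minimal sketch of how rep ranking could work, assuming a pairwise model that scores each (lead, rep) combination. The function names, feature names, and the toy scoring formula are all hypothetical stand-ins, not Oracle's actual code or model.

```python
from typing import Callable, Dict, List, Tuple


def rank_reps(
    lead: Dict[str, float],
    reps: Dict[str, Dict[str, float]],
    score_fn: Callable[[Dict[str, float]], float],
) -> List[Tuple[str, float]]:
    """Score every (lead, rep) pair and return reps sorted best-first."""
    scored = []
    for rep_id, rep_features in reps.items():
        # Merge lead and rep features into one row, as a pairwise model would see it.
        row = {**lead, **rep_features}
        scored.append((rep_id, score_fn(row)))
    # Highest predicted close probability first; ties broken by rep id.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def toy_score(row: Dict[str, float]) -> float:
    # Stand-in for the trained model; the weights are invented for illustration.
    p = (
        0.2
        + 0.5 * row["territory_fit"]
        + 0.3 * row["product_expertise"]
        - 0.05 * row["open_leads"]
    )
    return max(0.0, min(1.0, p))


# Hypothetical lead and rep features.
lead = {"lead_score": 0.7}
reps = {
    "rep_a": {"territory_fit": 0.9, "product_expertise": 0.5, "open_leads": 4},
    "rep_b": {"territory_fit": 0.4, "product_expertise": 0.9, "open_leads": 1},
    "rep_c": {"territory_fit": 0.2, "product_expertise": 0.2, "open_leads": 0},
}
ranking = rank_reps(lead, reps, toy_score)
for rep_id, probability in ranking:
    print(f"{rep_id}: {probability:.2f}")
```

In this toy example the workload penalty drops rep_a below rep_b despite the stronger territory fit, which is the kind of trade-off round-robin assignment cannot express.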

Lift Calculation

Measures the improvement Oracle provides over baseline:

Compares Oracle-routed leads vs. random assignment
Quantifies the revenue impact of intelligent routing
Provides confidence metrics for predictions
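
One way the lift comparison could be computed, shown as a hedged sketch. It assumes lift is the ratio of close rates between Oracle-routed and baseline leads, and it uses a normal-approximation interval on the rate difference as one possible confidence metric. The counts below are made up for illustration and are not Oracle results.

```python
import math
from typing import Tuple


def close_rate(sales: int, leads: int) -> float:
    """Fraction of leads that ended in a sale."""
    if leads <= 0:
        raise ValueError("leads must be positive")
    return sales / leads


def relative_lift(routed: Tuple[int, int], baseline: Tuple[int, int]) -> float:
    """Routed close rate divided by baseline close rate (1.0 means no improvement)."""
    base_rate = close_rate(*baseline)
    if base_rate == 0:
        raise ValueError("baseline close rate is zero; lift is undefined")
    return close_rate(*routed) / base_rate


def rate_difference_interval(
    routed: Tuple[int, int], baseline: Tuple[int, int], z: float = 1.96
) -> Tuple[float, float]:
    """Normal-approximation interval for (routed rate - baseline rate)."""
    p1, n1 = close_rate(*routed), routed[1]
    p2, n2 = close_rate(*baseline), baseline[1]
    std_err = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * std_err, diff + z * std_err


# Made-up counts for illustration: (sales, leads) per group.
routed_group = (60, 200)
baseline_group = (40, 200)
lift = relative_lift(routed_group, baseline_group)
low, high = rate_difference_interval(routed_group, baseline_group)
print(f"lift: {lift:.2f}, 95% interval on rate difference: [{low:.3f}, {high:.3f}]")
```

An interval that excludes zero suggests the routed group's higher close rate is unlikely to be noise, provided the two groups were otherwise comparable.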

Training Data Sources

Oracle's power comes from the richness of its training data:

Historical Appointment Data

Thousands of historical appointments with known outcomes
Sale price, close rate, time-to-close patterns
Rep performance across different scenarios

Geolocation Data

Property location and neighborhood context
Proximity to branches and service areas
Local market characteristics

Soft Credit Data

Financial signals indicating purchase capacity
Risk scoring for financing considerations
Correlated with historical close rates

Market Data

Local trends and seasonality
Competitive landscape factors
Economic indicators

Technical Architecture

Feature Engineering Pipeline

Raw data is transformed into ML-ready features including base lead score, neighborhood context, credit band normalization, days since inquiry, product demand indices, rep-territory fit scores, rep product expertise, and seasonality factors.
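
To make a few of these transformations concrete, here is a small sketch covering days since inquiry, credit band normalization, and a cyclical seasonality encoding. The field names, the band labels, and the sine/cosine encoding are assumptions for illustration; the actual pipeline's definitions are not described here.

```python
import math
from datetime import date
from typing import Any, Dict

# Hypothetical ordinal bands; the real band definitions are not given in this write-up.
CREDIT_BANDS = ["poor", "fair", "good", "excellent"]


def credit_band_normalized(band: str) -> float:
    """Map an ordinal credit band onto the range [0, 1]."""
    return CREDIT_BANDS.index(band) / (len(CREDIT_BANDS) - 1)


def seasonality_features(d: date) -> Dict[str, float]:
    """Encode the month on a circle so December and January sit next to each other."""
    angle = 2 * math.pi * (d.month - 1) / 12
    return {"month_sin": math.sin(angle), "month_cos": math.cos(angle)}


def build_features(raw: Dict[str, Any], today: date) -> Dict[str, float]:
    """Turn one raw lead record into a flat dict of numeric features."""
    features = {
        "days_since_inquiry": float((today - raw["inquiry_date"]).days),
        "credit_band": credit_band_normalized(raw["credit_band"]),
    }
    features.update(seasonality_features(raw["inquiry_date"]))
    return features


# Hypothetical raw lead record.
raw_lead = {"inquiry_date": date(2024, 1, 10), "credit_band": "good"}
features = build_features(raw_lead, today=date(2024, 1, 15))
print(features)
```

Passing `today` in explicitly, rather than reading the clock inside the function, keeps the features reproducible when historical leads are rebuilt for training.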

Model Training

Algorithm: XGBoost gradient boosting
Validation: Time-based cross-validation (each fold trains only on data older than its validation window, so no future outcomes leak into training)
Metrics: AUC-ROC, precision/recall at various thresholds
Retraining: Periodic retraining as new data accumulates
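
The validation scheme and the AUC-ROC metric can be sketched in plain Python as follows. This is an illustrative expanding-window split and a pairwise AUC, not the project's actual training code; in practice scikit-learn's `TimeSeriesSplit` and `roc_auc_score` cover the same ground, and an XGBoost model would be fit inside each fold. The fold counts and sample data are made up.

```python
from typing import Iterator, List, Sequence, Tuple


def time_based_splits(
    n_samples: int, n_folds: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Expanding-window splits over rows already sorted oldest-first."""
    fold_size = n_samples // (n_folds + 1)
    if fold_size == 0:
        raise ValueError("not enough samples for the requested number of folds")
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        # The last fold absorbs any leftover rows.
        valid_end = n_samples if k == n_folds else train_end + fold_size
        # Training rows always come strictly before validation rows.
        yield list(range(train_end)), list(range(train_end, valid_end))


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


splits = list(time_based_splits(n_samples=10, n_folds=4))
for train_idx, valid_idx in splits:
    print(f"train rows 0..{train_idx[-1]}, validate rows {valid_idx[0]}..{valid_idx[-1]}")

# Made-up labels and scores for illustration.
example_auc = auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(f"AUC: {example_auc:.2f}")
```

The pairwise AUC is quadratic in the number of examples, which is fine for a sketch but is one reason to prefer a library implementation on real data.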

Production Integration

Oracle exposes a REST API consumed by Evergreen. The CRM sends a lead ID and extracted features, and receives back a probability score (0-100%), a ranked list of recommended reps, and a lift calculation showing the improvement over baseline.
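
The request and response shapes described above might look roughly like this. Every field name, the handler function, and the stub predictor are hypothetical; the actual API contract is not documented here, and the stub's return values are placeholders rather than model output.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class PredictionRequest:
    lead_id: str
    features: Dict[str, float]


@dataclass
class PredictionResponse:
    lead_id: str
    sale_probability_pct: float  # 0-100
    recommended_reps: List[str]  # best match first
    lift: float  # improvement over baseline


# A predictor returns (probability in [0, 1], ranked rep ids, lift).
Predictor = Callable[[Dict[str, float]], Tuple[float, List[str], float]]


def handle_request(body: str, predict: Predictor) -> str:
    """Parse a JSON request body, call the predictor, and return a JSON response body."""
    request = PredictionRequest(**json.loads(body))
    probability, reps, lift = predict(request.features)
    if not 0.0 <= probability <= 1.0:
        raise ValueError("predictor must return a probability between 0 and 1")
    response = PredictionResponse(
        lead_id=request.lead_id,
        sale_probability_pct=round(probability * 100, 1),
        recommended_reps=reps,
        lift=lift,
    )
    return json.dumps(asdict(response))


def stub_predict(features: Dict[str, float]) -> Tuple[float, List[str], float]:
    # Placeholder standing in for the trained model; values are not real predictions.
    return 0.5, ["rep_b", "rep_a"], 1.0


request_body = json.dumps(
    {"lead_id": "lead-123", "features": {"days_since_inquiry": 5.0}}
)
reply = json.loads(handle_request(request_body, stub_predict))
print(reply)
```

Keeping the predictor behind a plain callable means the same handler can be exercised with a stub in tests and with the real model in production.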

Portfolio Significance

Oracle demonstrates end-to-end ML engineering:

Data Engineering: Building ETL pipelines for diverse data sources
Feature Engineering: Transforming raw data into predictive signals
Model Training: Selecting algorithms, tuning hyperparameters
Validation: Proper time-based splits, avoiding data leakage
Deployment: Production API serving real-time predictions
Integration: Seamless connection with the CRM workflow
Monitoring: Tracking model performance over time

This isn't a weekend project or a tutorial exercise. It's a production ML system that influences real business decisions and measurably impacts revenue.

Results

Oracle's predictions directly improve sales outcomes:

Intelligent Routing: Leads matched to reps most likely to close
Priority Optimization: High-probability leads get faster attention
Rep Development: Insights into what makes successful matches
Continuous Learning: Model improves as more data accumulates

Lessons Learned

Building Oracle taught valuable lessons about production ML:

Data quality matters more than algorithm choice: Clean, representative data beats fancy models
Feature engineering is the real work: Most of the time was spent understanding and transforming data
Time-based validation is critical: Prevents overly optimistic performance estimates
Integration is half the battle: The best model is useless if it's not embedded in workflows
