Building a Production-Grade Database-Driven AI Model Selection System

January 12, 2026


When building enterprise AI applications, one of the most critical architectural decisions is how to manage AI model selection at scale. After months of iteration, we've arrived at a system that eliminates hardcoded models entirely, provides strict tenant isolation, and enables users to switch providers without code changes.

The Problem with Hardcoded Models

Most AI applications start with hardcoded model names scattered throughout the codebase:

// ❌ The old way - hardcoded everywhere
const model = getModel("gpt-4o");
const chatModel = getModel("llama-3.3-70b-versatile");
const toolModel = getModel("openai/gpt-oss-120b");

This approach creates several critical problems:

  1. Configuration drift: Model names duplicated across 50+ files
  2. Provider lock-in: Switching providers requires code changes
  3. No user control: Users can't choose their preferred models
  4. Silent failures: Hardcoded fallbacks mask configuration errors
  5. Tenant isolation issues: Personal and organization contexts share models
  6. Deployment complexity: Model changes require code deployments

Our Solution: Database-Driven Model Catalog

We built a system where every AI model configuration lives in the database. No hardcoded models. No fallbacks. No exceptions.

This post walks through the architecture that powers Fabric's production AI infrastructure.


Architecture Overview: Database as Single Source of Truth

The system is built on five core database tables that form a complete AI model catalog; the full schema is covered in the Database Schema section below.


Core Principles

  1. Zero hardcoded models: Every model name comes from the database
  2. Strict tenant isolation: Personal and organization contexts are completely separate
  3. Clear preference hierarchy: User Override → Org Override → System Default → Error
  4. No silent fallbacks: Throws clear errors if configuration is missing
  5. Provider flexibility: Switch providers without code changes
  6. Single source of truth: All provider metadata centralized in one module

Centralized Provider Configuration

One of the most important architectural decisions was eliminating all duplicated provider configuration. In many AI applications, provider metadata ends up scattered across multiple files—display names here, capability flags there, URL mappings somewhere else. This leads to configuration drift and bugs when providers are added or modified.

The Problem with Scattered Configuration

Before centralization, we had provider information duplicated in three places:

  • Backend API handlers
  • Frontend settings components
  • Model resolution logic

When we added a new provider or changed a capability flag, we had to update multiple files and hope we didn't miss one. Inevitably, bugs crept in—providers showing as "embedding capable" in the UI but failing at runtime.

The Single Source of Truth Approach

We consolidated all provider metadata into a centralized configuration module. This module defines:

Provider Categories: Gateways (Vercel, OpenRouter, Cloudflare), cloud platforms (Azure, AWS, Google), direct providers (OpenAI, Anthropic, Groq, Cerebras, etc.), and special providers (Hybrid, Custom).

Provider Metadata: Display names, descriptions, and capability flags for all 21 supported providers.

Capability Functions: Helper functions that answer questions like "Can this provider support embedding models?" or "Is this a gateway provider?"
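
For illustration, here is a minimal sketch of what such a module can look like. The provider names, categories, and capability flags below are examples for this post, not the actual module contents, and only a few of the 21 providers are shown:

// Illustrative sketch of a centralized provider config module.
// Names, flags, and descriptions are examples, not the real module contents.
export type AiProvider = "OPENAI" | "GROQ" | "CEREBRAS" | "VERCEL_GATEWAY";

export type ProviderCategory = "GATEWAY" | "CLOUD_PLATFORM" | "DIRECT" | "SPECIAL";

export interface ProviderMetadata {
  displayName: string;
  description: string;
  category: ProviderCategory;
  supportsEmbeddings: boolean;
  supportsToolCalling: boolean;
}

export const PROVIDER_METADATA: Record<AiProvider, ProviderMetadata> = {
  OPENAI: {
    displayName: "OpenAI Direct",
    description: "Direct access to the OpenAI API",
    category: "DIRECT",
    supportsEmbeddings: true,
    supportsToolCalling: true
  },
  GROQ: {
    displayName: "Groq",
    description: "Fast inference for open models",
    category: "DIRECT",
    supportsEmbeddings: false,
    supportsToolCalling: true
  },
  CEREBRAS: {
    displayName: "Cerebras",
    description: "Ultra-fast inference for open models",
    category: "DIRECT",
    supportsEmbeddings: false,
    supportsToolCalling: true
  },
  VERCEL_GATEWAY: {
    displayName: "Vercel AI Gateway",
    description: "Gateway routing to many upstream providers",
    category: "GATEWAY",
    supportsEmbeddings: true,
    supportsToolCalling: true
  }
};

// Capability helpers used throughout the backend (and, via the mirrored copy,
// in client components).
export function supportsEmbeddings(provider: AiProvider): boolean {
  return PROVIDER_METADATA[provider].supportsEmbeddings;
}

export function isGatewayProvider(provider: AiProvider): boolean {
  return PROVIDER_METADATA[provider].category === "GATEWAY";
}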

The Frontend/Backend Split

One interesting challenge: frontend client components can't import from the database package because it pulls in server-side dependencies (Prisma, PostgreSQL drivers). These don't run in the browser.

Our solution is a two-file architecture:

  • Backend module: The authoritative source in the database package
  • Frontend module: A client-safe copy in the web app's settings module

Both files define the same constants and functions. When we add a new provider, we update both. It's a small trade-off for the benefit of having clean, importable helper functions on both sides.
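
In practice, usage looks something like this (the import paths are hypothetical, not the actual package layout):

// In a server action or API handler (authoritative module; hypothetical path):
import { supportsEmbeddings } from "@repo/database";

// In a client component (mirrored, browser-safe copy; hypothetical path):
import { supportsEmbeddings } from "@saas/settings/ai-provider-config";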

Benefits of Centralization

Before:

  • 250+ lines duplicated across 3 files
  • Provider capabilities scattered across codebase
  • Easy to miss updates when adding providers
  • Runtime capability bugs from inconsistent data

After:

  • Single source of truth (+ frontend copy for browser)
  • Centralized capability functions
  • Clear update checklist for new providers
  • Compile-time type safety

This architecture makes adding new providers straightforward: update the centralized config, mirror to frontend, and everything just works.


The AI Model Catalog

At the heart of the system is the AiModel table, which contains canonical definitions for 50+ AI models:

// Example: Llama 3.3 70B in the catalog
{
  id: "model_abc123",
  canonicalName: "llama-3.3-70b",
  displayName: "Llama 3.3 70B",
  family: "llama",
  vendor: "Meta",
  capabilities: ["TEXT", "CODE", "REASONING"],
  contextWindow: 131072,
  speedTier: "FAST",
  qualityTier: "PREMIUM",
  suitableForTasks: ["CHAT", "COMPLEX", "TOOL_CALLING"]
}

Provider-Specific Model Mappings

The same canonical model has different IDs on different providers. The AiModelProviderMapping table handles this translation automatically.

Example: Llama 3.3 70B across providers

  • Canonical name: llama-3.3-70b
  • Cerebras: llama-3.3-70b
  • Groq: llama-3.3-70b-versatile
  • Vercel Gateway: groq/llama-3.3-70b-versatile

Example: GPT-4o (only available via gateway)

  • Canonical name: gpt-4o
  • Cerebras: not available
  • Groq: not available
  • Vercel Gateway: openai/gpt-4o

This mapping layer is what enables seamless provider switching—users change their provider in Settings, and the system automatically resolves to the correct model ID.

// Example: Provider mappings for llama-3.3-70b
{
  modelId: "model_abc123",
  provider: "CEREBRAS",
  providerModelId: "llama-3.3-70b"
},
{
  modelId: "model_abc123",
  provider: "GROQ",
  providerModelId: "llama-3.3-70b-versatile"
},
{
  modelId: "model_abc123",
  provider: "VERCEL_GATEWAY",
  providerModelId: "groq/llama-3.3-70b-versatile"
}

When a user switches from Groq to Cerebras, the system automatically selects the correct provider-specific model ID. No code changes required.
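
Under the hood, this is a single lookup against the mapping table. Here's a minimal sketch, assuming Prisma and a `model` relation on the mapping record (both are assumptions for illustration; field names follow the example records above):

// A hedged sketch of the provider-specific model ID lookup.
import type { PrismaClient } from "@prisma/client";

async function getProviderModelId(
  db: PrismaClient,
  canonicalName: string,
  provider: string
): Promise<string> {
  const mapping = await db.aiModelProviderMapping.findFirst({
    where: {
      provider,
      model: { canonicalName }  // relation back to the AiModel catalog entry
    }
  });

  if (!mapping) {
    // No silent fallback: the model simply isn't available on this provider
    throw new Error(`Model "${canonicalName}" is not available on ${provider}`);
  }

  return mapping.providerModelId;
}

// getProviderModelId(db, "llama-3.3-70b", "GROQ")           → "llama-3.3-70b-versatile"
// getProviderModelId(db, "llama-3.3-70b", "VERCEL_GATEWAY") → "groq/llama-3.3-70b-versatile"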


Dynamic Model Resolution Flow

When a user makes an AI request, the system dynamically resolves the model configuration at runtime.


Preference Hierarchy

The system follows a strict three-level hierarchy, falling through to an explicit error when nothing is configured (a sketch of the resolution logic follows the list):

  1. User Override (UserModelPreference table)

    • User's explicit choice for this task type + provider
    • Highest priority
    • Example: User wants GPT-4o for CHAT tasks on OpenAI
  2. Organization Override (OrganizationModelPreference table)

    • Organization's default for this task type + provider
    • Only applies in organization context
    • Example: Org mandates Claude Sonnet 4.5 for all COMPLEX tasks
  3. System Default (AiTaskModelDefault table)

    • Provider-specific defaults seeded into the database by the seed script
    • Example: Cerebras defaults to llama-3.3-70b for CHAT
  4. Error (NO hardcoded fallbacks)

    • Throws clear error: "No model configured for CEREBRAS + TOOL_CALLING"
    • Prevents silent failures with wrong models
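
To make the hierarchy concrete, here's a simplified sketch of the resolution logic. It assumes Prisma, the tables described in this post, and a `modelString` column (an assumed field name); the real getConfiguredModelString also resolves the user's configured provider and applies provider mappings and capability checks:

// Simplified sketch of the preference hierarchy; field names are assumptions.
import type { PrismaClient } from "@prisma/client";

type TaskType =
  | "SIMPLE" | "COMPLEX" | "CHAT" | "TOOL_CALLING"
  | "REASONING" | "EMBEDDING" | "IMAGE" | "AUDIO";

interface ResolveContext {
  userId: string;
  organizationId: string | null;  // null = personal context
  provider: string;               // e.g. "CEREBRAS", resolved from AiProviderApiKey
}

async function resolveModelString(
  db: PrismaClient,
  taskType: TaskType,
  ctx: ResolveContext
): Promise<string> {
  // 1. User override (tenant-isolated: organizationId is null in personal context)
  const userPref = await db.userModelPreference.findFirst({
    where: {
      userId: ctx.userId,
      organizationId: ctx.organizationId,
      provider: ctx.provider,
      taskType
    }
  });
  if (userPref) return userPref.modelString;

  // 2. Organization override (only applies in organization context)
  if (ctx.organizationId) {
    const orgPref = await db.organizationModelPreference.findFirst({
      where: { organizationId: ctx.organizationId, provider: ctx.provider, taskType }
    });
    if (orgPref) return orgPref.modelString;
  }

  // 3. System default seeded into AiTaskModelDefault
  const systemDefault = await db.aiTaskModelDefault.findFirst({
    where: { provider: ctx.provider, taskType }
  });
  if (systemDefault) return systemDefault.modelString;

  // 4. No hardcoded fallback: fail loudly with an actionable message
  throw new Error(
    `No model configured for ${ctx.provider} + ${taskType}. ` +
      "Configure a model in Settings > AI Providers or run the AI model seed."
  );
}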

Task Types and System Defaults

The AiTaskModelDefault table defines optimized models for each task type per provider. Here's how the defaults are configured (a seed-data sketch follows the list):

SIMPLE — Fast tasks like title generation and summarization

  • Cerebras: llama3.1-8b | Groq: llama-3.1-8b-instant | OpenAI: gpt-4o-mini

COMPLEX — Detailed generation like documents and analysis

  • Cerebras: llama-3.3-70b | Groq: llama-3.3-70b-versatile | OpenAI: gpt-4o

CHAT — Conversational AI

  • Cerebras: llama-3.3-70b | Groq: llama-3.3-70b-versatile | OpenAI: gpt-4o

TOOL_CALLING — Function calling and MCP tools

  • Cerebras: gpt-oss-120b | Groq: openai/gpt-oss-120b | OpenAI: gpt-4o

REASONING — Deep analysis and problem-solving

  • Cerebras: gpt-oss-120b | Groq: deepseek-r1-distill-llama-70b | OpenAI: o1

EMBEDDING — Vector generation for RAG

  • All providers: text-embedding-3-small

IMAGE / AUDIO — Media generation and transcription

  • Image: dall-e-3 | Audio: whisper-1
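
For reference, here's a hedged sketch of how these defaults might look as seed data. The shape and the `modelString` field name are assumptions; the values mirror the defaults listed above:

// Illustrative seed entries for AiTaskModelDefault (shape is an assumption).
interface TaskDefaultSeed {
  provider: "CEREBRAS" | "GROQ" | "OPENAI";
  taskType: "SIMPLE" | "COMPLEX" | "CHAT" | "TOOL_CALLING" | "REASONING" | "EMBEDDING";
  modelString: string;  // provider-specific model ID
}

const taskDefaults: TaskDefaultSeed[] = [
  { provider: "CEREBRAS", taskType: "SIMPLE", modelString: "llama3.1-8b" },
  { provider: "CEREBRAS", taskType: "CHAT", modelString: "llama-3.3-70b" },
  { provider: "CEREBRAS", taskType: "TOOL_CALLING", modelString: "gpt-oss-120b" },
  { provider: "GROQ", taskType: "CHAT", modelString: "llama-3.3-70b-versatile" },
  { provider: "GROQ", taskType: "TOOL_CALLING", modelString: "openai/gpt-oss-120b" },
  { provider: "GROQ", taskType: "REASONING", modelString: "deepseek-r1-distill-llama-70b" },
  { provider: "OPENAI", taskType: "CHAT", modelString: "gpt-4o" },
  { provider: "OPENAI", taskType: "REASONING", modelString: "o1" },
  { provider: "OPENAI", taskType: "EMBEDDING", modelString: "text-embedding-3-small" }
];

// The seed script upserts each entry so re-running it stays idempotent.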

Why Different Models for Different Tasks?

SIMPLE tasks use smaller, faster models (8B parameters) for quick responses:

  • Title generation
  • Text summarization
  • Simple Q&A

COMPLEX tasks use larger, more capable models (70B+ parameters):

  • Document generation
  • Detailed analysis
  • Code generation

TOOL_CALLING tasks require models with reliable function calling:

  • gpt-oss-120b: OpenAI's open-weight model with native tool calling
  • Available on Groq and Cerebras for fast inference
  • More reliable than Llama models for structured outputs

REASONING tasks use specialized models:

  • DeepSeek R1: Chain-of-thought reasoning
  • OpenAI o1: Advanced problem-solving

Strict Tenant Isolation

One of the most critical aspects of the system is strict tenant isolation between personal and organization contexts.

The XOR Pattern

Every database query uses an exclusive OR (XOR) pattern to ensure data never leaks between contexts:

// ✅ CORRECT - XOR pattern
const tenantFilter = organizationId
  ? { organizationId, userId }           // Org context
  : { organizationId: null, userId };    // Personal context (null is REQUIRED)

const models = await db.userModelPreference.findMany({
  where: tenantFilter
});

// ❌ WRONG - Leaks data between contexts
const models = await db.userModelPreference.findMany({
  where: { OR: [{ userId }, { organizationId }] }  // NEVER DO THIS
});

Context-Aware Model Resolution

When resolving models, the system always includes the tenant context:

// Personal context
const modelString = await getConfiguredModelString("CHAT", {
  userId: "user_123",
  organizationId: null  // Explicitly null for personal
});

// Organization context
const modelString = await getConfiguredModelString("CHAT", {
  userId: "user_123",
  organizationId: "org_456"  // Org context
});

This ensures:

  • User's personal models are NEVER visible in org context
  • Org A's models are NEVER visible to Org B
  • No accidental data leakage between tenants

Real-World Example: Switching Providers

Let's walk through what happens when a user switches from Groq to Cerebras.

Initial State (Groq)

// User's configuration in database
{
  userId: "user_123",
  provider: "GROQ",
  apiKey: "encrypted_groq_key",
  baseUrl: "https://api.groq.com/openai/v1"
}

// System defaults for GROQ
CHAT → llama-3.3-70b-versatile
TOOL_CALLING → openai/gpt-oss-120b
COMPLEX → llama-3.3-70b-versatile

User Changes Provider in Settings

The user navigates to Settings > AI Providers and selects Cerebras as their default provider.

New State (Cerebras)

// Updated configuration in database
{
  userId: "user_123",
  provider: "CEREBRAS",
  apiKey: "encrypted_cerebras_key",
  baseUrl: "https://api.cerebras.ai/v1"
}

// System defaults for CEREBRAS (automatically resolved)
CHAT → llama-3.3-70b
TOOL_CALLING → gpt-oss-120b
COMPLEX → llama-3.3-70b

What Changed Automatically

  1. Provider-specific model IDs: llama-3.3-70b-versatile → llama-3.3-70b
  2. Base URL: Groq API → Cerebras API
  3. API key: Groq key → Cerebras key
  4. Model format: Gateway format → Direct format

Zero code changes. Zero configuration files. Everything from the database.


Implementation: Core Functions

1. Get Configured Model String

The primary function for resolving models:

import { getConfiguredModelString } from "@repo/ai";

// Get model for a task type
const modelString = await getConfiguredModelString("CHAT", {
  userId: "user_123",
  organizationId: "org_456"  // or null for personal context
});

// Returns: "llama-3.3-70b" (for Cerebras)
// or "openai/gpt-4o" (for Vercel Gateway)

This function:

  1. Checks UserModelPreference for user override
  2. Checks OrganizationModelPreference for org override (if in org context)
  3. Falls back to AiTaskModelDefault for system default
  4. Throws error if no configuration found

2. Resolve Model with Provider

Get complete model configuration including provider and API key:

import { resolveModelWithProvider } from "@repo/ai";
import { decryptApiKey } from "@repo/utils";

const config = await resolveModelWithProvider("TOOL_CALLING", {
  userId: "user_123",
  organizationId: "org_456"
}, {
  requiresToolCalling: true  // Validate model supports tool calling
});

// Returns:
// {
//   modelString: "gpt-oss-120b",
//   provider: "CEREBRAS",
//   apiKey: "encrypted_key",
//   baseUrl: "https://api.cerebras.ai/v1"
// }

3. Create Model Instance

Use the resolved configuration to create a model:

import { getModel } from "@repo/ai";

const model = getModel(config.modelString, {
  userId: "user_123",
  organizationId: "org_456",
  apiKey: decryptApiKey(config.apiKey),
  provider: config.provider
});

// Returns: LanguageModel instance ready for AI SDK

4. Execute AI Operation

import { generateText } from "ai";

const result = await generateText({
  model,
  system: "You are a helpful assistant",
  prompt: "Write a blog post about AI"
});

Complete End-to-End Example

Here's a real-world example of generating a document with dynamic model resolution:

// apps/web/modules/saas/documents/actions/generate-document.ts
import { getConfiguredModelString, getModel } from "@repo/ai";
import { getAiProviderApiKey } from "@repo/database";
import { decryptApiKey } from "@repo/utils";
import { generateText } from "ai";

export async function generateDocument(params: {
  userId: string;
  organizationId: string | null;
  prompt: string;
}) {
  // Step 1: Get user's provider configuration
  const providerConfig = await getAiProviderApiKey({
    userId: params.userId,
    organizationId: params.organizationId
  });

  if (!providerConfig) {
    throw new Error("No AI provider configured. Please configure in Settings.");
  }

  // Step 2: Get model for COMPLEX task type
  const modelString = await getConfiguredModelString("COMPLEX", {
    userId: params.userId,
    organizationId: params.organizationId
  });

  // Step 3: Create model instance
  const model = getModel(modelString, {
    userId: params.userId,
    organizationId: params.organizationId,
    apiKey: decryptApiKey(providerConfig.apiKey),
    provider: providerConfig.provider
  });

  // Step 4: Generate document
  const result = await generateText({
    model,
    system: "You are a professional document writer.",
    prompt: params.prompt,
    temperature: 0.7,
    maxTokens: 4000
  });

  return result.text;
}

What Happens Under the Hood

  1. Provider lookup: Queries AiProviderApiKey table for user's default provider
  2. Model resolution: Checks UserModelPreference → OrganizationModelPreference → AiTaskModelDefault
  3. Provider mapping: Looks up provider-specific model ID in AiModelProviderMapping
  4. Model creation: Creates appropriate SDK instance (OpenAI, Anthropic, Groq, etc.)
  5. API call: Executes with correct base URL, API key, and model ID

All of this happens dynamically at runtime. No hardcoded models. No configuration files.


Database Schema

The system is built on five interconnected tables:


Table Descriptions

AiModel: Canonical model definitions

  • 50+ models from OpenAI, Anthropic, Meta, Google, DeepSeek, etc.
  • Includes capabilities, context window, speed/quality tiers
  • Seeded from seed-ai-models.ts

AiModelProviderMapping: Provider-specific model IDs

  • Maps canonical names to provider-specific IDs
  • Example: llama-3.3-70b → llama-3.3-70b-versatile (Groq)
  • Enables automatic provider switching

AiTaskModelDefault: System defaults per task type

  • Optimized models for each task type per provider
  • Example: CEREBRAS + CHAT → llama-3.3-70b
  • Seeded with production-tested defaults

UserModelPreference: User overrides

  • User's explicit choice for task type + provider
  • Tenant-isolated (personal vs organization contexts; see the write-path sketch below)
  • Highest priority in resolution

OrganizationModelPreference: Organization overrides

  • Organization's default for task type + provider
  • Only applies in organization context
  • Second priority in resolution
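
To make the tenant scoping concrete, here's a hedged sketch of the write path for a user override. It assumes Prisma, a `modelString` column, and uniqueness over userId + organizationId + provider + taskType; the actual helper names and constraints may differ:

// Illustrative write path for a user override (names and constraint are assumptions).
import type { PrismaClient } from "@prisma/client";

async function setUserModelOverride(
  db: PrismaClient,
  params: {
    userId: string;
    organizationId: string | null;  // null = personal context (XOR rule)
    provider: string;
    taskType: string;
    modelString: string;
  }
) {
  // findFirst + create/update keeps the sketch independent of the exact
  // composite-unique key generated by Prisma.
  const existing = await db.userModelPreference.findFirst({
    where: {
      userId: params.userId,
      organizationId: params.organizationId,
      provider: params.provider,
      taskType: params.taskType
    }
  });

  if (existing) {
    return db.userModelPreference.update({
      where: { id: existing.id },
      data: { modelString: params.modelString }
    });
  }

  return db.userModelPreference.create({ data: params });
}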

Migration from Hardcoded Models

We completed a major refactoring to eliminate all hardcoded models and duplicated provider configuration. This was one of the most impactful architectural changes we made.

The State Before Migration

Our codebase had accumulated technical debt in several forms:

Hardcoded model names everywhere: Over 100 instances of model names like "gpt-4o" or "llama-3.3-70b-versatile" scattered across 50+ files. When OpenAI deprecated a model or Groq changed their naming convention, we had to hunt through the entire codebase.

Duplicated provider metadata: The same 250+ lines of provider configuration existed in three different files. Adding a new provider meant updating all three and hoping you didn't miss anything.

Silent fallbacks masking errors: When a model wasn't configured, the system would silently fall back to a hardcoded default. Users had no idea they were getting the wrong model.

The Migration Strategy

We took a systematic approach:

  1. Audit: Found and cataloged every hardcoded model and provider constant
  2. Centralize: Created the single-source-of-truth modules for provider configuration
  3. Database: Moved all model defaults to database tables with proper seeding
  4. Validate: Added capability validation (like embedding support checks)
  5. Error: Replaced silent fallbacks with clear, actionable error messages

The Results

Before → After:

  • Hardcoded models: 100+ instances → 0
  • Deprecated constants: 5 major constants → 0
  • Duplicated provider config: 250+ lines × 3 files → 2 files (backend + frontend)
  • Provider metadata: Scattered → Single source of truth
  • Provider switching: Requires code changes → Automatic
  • Tenant isolation: Partial → Complete
  • Error handling: Silent fallbacks → Clear errors
  • Embedding validation: Manual runtime checks → Centralized capability functions

The most satisfying outcome: adding a new provider now takes minutes instead of hours, and we haven't had a "wrong model" bug since the migration.


Adding a New AI Provider

The system is designed to make adding new providers straightforward—a direct benefit of the centralized architecture.

The Seven-Step Process

  1. Schema: Add the new provider to the database enum
  2. Backend Config: Add provider metadata to the centralized configuration module (category, display name, description, capabilities)
  3. Frontend Config: Mirror the same metadata in the client-safe module
  4. Base URL: Configure the API endpoint if using OpenAI-compatible protocol
  5. Model Mappings: Add provider-specific model IDs to the seed script
  6. Task Defaults: Configure which models to use for each task type
  7. Database Seed: Run the seeding command to populate the database

What Makes This Fast

The key insight is that most of this is configuration, not code. You're not writing new API handlers or modifying business logic—you're just declaring metadata and mappings.
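
As an illustration, here's roughly what those declarations look like for a made-up OpenAI-compatible provider ("ACME_AI" and every value below are hypothetical):

// Hypothetical entries for a new provider; the real declarations live in the
// centralized config module and the seed script.

// 1. Provider metadata (added to the backend config, mirrored to the frontend copy)
const ACME_AI = {
  displayName: "Acme AI",
  description: "OpenAI-compatible inference endpoint",
  category: "DIRECT",
  supportsEmbeddings: false,
  supportsToolCalling: true,
  baseUrl: "https://api.acme.example/v1"
};

// 2. Provider-specific model mappings (added to the seed script)
const acmeMappings = [
  { canonicalName: "llama-3.3-70b", provider: "ACME_AI", providerModelId: "llama-3.3-70b" },
  { canonicalName: "gpt-oss-120b", provider: "ACME_AI", providerModelId: "gpt-oss-120b" }
];

// 3. Task defaults for the new provider (also seeded)
const acmeTaskDefaults = [
  { provider: "ACME_AI", taskType: "CHAT", modelString: "llama-3.3-70b" },
  { provider: "ACME_AI", taskType: "TOOL_CALLING", modelString: "gpt-oss-120b" }
];

After adding the enum value, mirroring the metadata to the frontend copy, and running pnpm --filter @repo/database seed:ai-models, the provider appears in Settings with working defaults.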

For an OpenAI-compatible provider (which most are these days), the entire process takes about 15 minutes:

  • 5 minutes to add the schema and config entries
  • 5 minutes to configure model mappings
  • 5 minutes to seed and test

No Deployment Required

Once the database is seeded, users can immediately:

  • Select the new provider in their Settings
  • Configure their API key
  • Start using it for all task types

The application code doesn't need to change. The provider routing, model resolution, and API key management all work automatically because they're driven by database configuration, not hardcoded logic.


Production Lessons Learned

1. Eliminate All Hardcoded Values

Every hardcoded model name was a potential bug. We found 100+ instances scattered across the codebase. The database-driven approach eliminated all of them.

Key insight: If it can change, it belongs in the database, not in code.

2. Fail Loudly, Not Silently

Hardcoded fallbacks masked configuration errors. Users would get wrong models without knowing why.

// ❌ Bad: Silent fallback
const model = modelString ?? "gpt-4o";  // User has no idea this happened

// ✅ Good: Clear error
if (!modelString) {
  throw new Error(
    "No model configured for CEREBRAS + TOOL_CALLING. " +
    "Please configure in Settings > AI Providers."
  );
}

3. Tenant Isolation is Non-Negotiable

We had several bugs where personal models leaked into organization contexts. The XOR pattern eliminated all of them.

Key insight: Use organizationId: null explicitly for personal context. Never use OR patterns.
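
One pattern that helps enforce this is encoding the rule in a single helper and routing every preference query through it. A minimal sketch (the helper name is made up; `db` is the Prisma client as in the examples above):

// Illustrative helper that encodes the XOR tenant rule in one place.
type TenantContext = { userId: string; organizationId: string | null };

function tenantWhere({ userId, organizationId }: TenantContext) {
  return organizationId
    ? { userId, organizationId }          // organization context
    : { userId, organizationId: null };   // personal context: null is required
}

// Every query spreads the helper's output instead of hand-writing the filter
const prefs = await db.userModelPreference.findMany({
  where: { ...tenantWhere({ userId, organizationId }), taskType: "CHAT" }
});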

4. Provider Compatibility Must Be Validated

Returning groq/gpt-4o when the user has Groq configured is wrong - GPT-4o isn't available on Groq. The AiModelProviderMapping table ensures only compatible models are returned.

5. Database Seeding is Critical

The system is only as good as its seed data. We invested heavily in comprehensive seed scripts with:

  • 50+ canonical models
  • Provider mappings for all major providers
  • Optimized task defaults per provider
  • Production-tested configurations

6. Clear Error Messages Save Time

Instead of generic "Model not found" errors, we provide actionable messages:

Error: No model configured for provider "CEREBRAS" and task "TOOL_CALLING".

To fix:
1. Run: pnpm --filter @repo/database seed:ai-models
2. Or configure a custom model in Settings > AI Providers

User Experience: Settings UI

Users configure their AI providers through a clean Settings interface:

1. Select Default Provider

Settings > AI Providers > Default Provider

[ ] OpenAI Direct
[ ] Anthropic Direct
[x] Cerebras
[ ] Groq
[ ] Vercel AI Gateway
[ ] OpenRouter

2. Configure API Key

Cerebras API Key: [••••••••••••••••••••] [Save]

Get your API key: https://cloud.cerebras.ai/

3. Optional: Override Models per Task Type

Advanced Settings > Model Overrides

Task Type: CHAT
Provider: Cerebras
Model: [llama-3.3-70b ▼]
       - llama-3.3-70b (Default)
       - llama3.1-8b
       - gpt-oss-120b

[Save Override]

4. Organization Settings (Admins Only)

Organization admins can set defaults for all members:

Organization Settings > AI Providers

Default Provider: Cerebras
Organization API Key: [••••••••••••••••••••]

Model Overrides:
- CHAT: llama-3.3-70b
- COMPLEX: llama-3.3-70b
- TOOL_CALLING: gpt-oss-120b

[Save Organization Defaults]

Performance and Reliability

Database Query Optimization

All model resolution queries are optimized with:

  • Indexed lookups on userId, organizationId, taskType, provider
  • Single query to resolve model (no N+1 problems)
  • Cached provider configurations (Redis)
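
As an illustration of the caching layer, here's a hedged cache-aside sketch. It assumes ioredis; the key format and TTL are invented for the example, and the cached API key stays in its encrypted form:

// Illustrative cache-aside wrapper around the provider-config lookup.
import Redis from "ioredis";
import { getAiProviderApiKey } from "@repo/database";

const redis = new Redis(process.env.REDIS_URL ?? "redis://localhost:6379");

async function getCachedProviderConfig(userId: string, organizationId: string | null) {
  const key = `ai:provider-config:${organizationId ?? "personal"}:${userId}`;

  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  const config = await getAiProviderApiKey({ userId, organizationId });
  if (config) {
    // Short TTL so provider changes made in Settings propagate quickly;
    // the API key is cached encrypted, exactly as stored in the database.
    await redis.set(key, JSON.stringify(config), "EX", 60);
  }
  return config;
}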

Error Handling

The system handles errors gracefully:

try {
  const modelString = await getConfiguredModelString("CHAT", {
    userId,
    organizationId
  });
} catch (error) {
  // `error` is `unknown` in strict TypeScript, so narrow before reading .message
  if (error instanceof Error && error.message.includes("No model configured")) {
    // Show user-friendly message with link to Settings
    return {
      error: "Please configure your AI provider in Settings",
      settingsUrl: "/app/settings/ai-providers"
    };
  }
  throw error;
}

Monitoring

We track:

  • Model resolution time (avg: 5ms)
  • Provider API latency
  • Error rates per provider
  • Cost per model per user

What's Next?

Planned Enhancements

  1. Cost-based routing: Automatically select cheaper providers for simple tasks
  2. Latency optimization: Route based on real-time response time metrics
  3. Usage analytics: Detailed per-user and per-provider cost tracking
  4. Model quality evaluation: A/B testing framework for model comparison
  5. Automatic failover: Fallback to secondary provider if primary fails
  6. Rate limit handling: Automatic retry with exponential backoff

Future Provider Support

  • Google Vertex AI: Enterprise-grade AI with data residency
  • Azure OpenAI: Microsoft's managed OpenAI service
  • AWS Bedrock: Amazon's managed AI service
  • Replicate: Community models and fine-tuned variants

Conclusion

Building a production-grade AI model selection system requires careful attention to:

  1. Zero hardcoded values: Everything in the database
  2. Strict tenant isolation: XOR pattern for personal vs organization contexts
  3. Clear error handling: Fail loudly with actionable messages
  4. Provider flexibility: Switch providers without code changes
  5. User control: Let users choose their models and providers

The database-driven approach eliminated 100+ hardcoded models, improved tenant isolation, and made adding new providers trivial. Most importantly, it gave users complete control over their AI infrastructure.


For implementation details, see: AI Model Configuration Documentation