Key Takeaways
- Model Context Protocol (MCP) server support is now the integration path that matters most for agentic workloads. Platforms without maintained MCP servers force teams to write and own integration glue indefinitely.
- The four requirements for an AI-ready source of truth are MCP-ready data access, an API-first design with bulk operations, an open and extensible data model, and operational scale for your environment.
- A meaningful evaluation takes five steps: define the data model first, test the API at real query rates, validate MCP integration support, confirm migration tooling, and build a managed agent reference implementation.
- NetBox is the default choice for network-first SoT needs with the cleanest MCP integration available today, while Nautobot fits teams with internal Python skill and custom workflow requirements.
- CI/CD discipline and managed agent reference implementations are not optional features. They are the foundation that determines whether the platform delivers value after deployment.
The source-of-truth problem surfaces when automation stalls
Most network teams discover their source-of-truth problem only after their first automation project stalls. Spreadsheets, wikis, and stale CMDB records cannot drive deterministic automation, and they certainly cannot ground an AI agent that needs to answer questions about real infrastructure. Once a team commits to AI-assisted operations, the platform decision becomes harder, not easier, because the bar moves from "can this hold data" to "can an MCP server expose this data to an agent under tenant-scoped permissions at the rate automation will drive."
The patterns below are the four failure modes we see most often in initial SoT engagements:
Four requirements for an AI-ready source of truth
The platforms worth considering in this category have all converged on roughly the same feature set for traditional IPAM. Subnet management, IP reservation, device records, VLAN tracking, and circuit modeling are table stakes everywhere. The dimensions that matter now are different, and they map directly to whether AI agents and managed automation can use the platform without constant human translation.
The four requirements below should be evaluated together. A platform that scores well on three and poorly on one will block your automation roadmap at the weakest dimension.
MCP-Ready Data Access
Model Context Protocol (MCP) server support, either shipped by the vendor or maintained as a first-party open-source project, is now the integration path that matters most for agentic workloads. An MCP server exposes SoT data through the protocol so that AI agents can read prefixes, devices, and circuits, and in some cases write back changes, under tenant-scoped authorization. NetBox Labs ships a maintained MCP server for NetBox. The Nautobot community has MCP work in active development. Infoblox and BlueCat MCP coverage varies sharply by product line and version. Platforms with no MCP server force teams to write and own integration glue that breaks with each model and platform upgrade.
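As a rough illustration of what tenant-scoped authorization means for an agent's view of the data, the sketch below filters a small in-memory inventory by the tenants a caller is allowed to read. It is not how any vendor's MCP server is implemented; the `Prefix` type, the `visible_prefixes` function, and the sample tenants are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prefix:
    """Hypothetical SoT record: a prefix owned by exactly one tenant."""
    cidr: str
    tenant: str


def visible_prefixes(prefixes, allowed_tenants):
    """Return only the prefixes whose tenant the caller may read."""
    allowed = set(allowed_tenants)
    return [p for p in prefixes if p.tenant in allowed]


# Illustrative data only.
INVENTORY = [
    Prefix("10.10.0.0/24", "retail"),
    Prefix("10.20.0.0/24", "finance"),
    Prefix("10.30.0.0/24", "retail"),
]

# An agent scoped to the "retail" tenant never sees the finance prefix.
agent_view = visible_prefixes(INVENTORY, {"retail"})
print([p.cidr for p in agent_view])
```

The point of the sketch is that scoping happens on the server side of the protocol, before any record reaches the agent, rather than being left to the prompt or the client.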
API-First With Bulk Operations
Every object must be readable, writable, and queryable through a REST or GraphQL API, including bulk reads that scale to thousands of records per second. The right test is not "does the API exist" but "can it sustain the query pattern of a working agent." A common failure mode is a platform whose API returns small pages with low rate limits, which forces agents into hundreds of paginated requests for a single reasoning step. If automation has to scrape a UI, paginate through hundreds of small responses, or wait on async batch jobs to get current state, the platform is not ready for agentic workloads at production scale.
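The pagination failure mode is easy to quantify before any load test. The sketch below is a back-of-the-envelope calculation, assuming a fixed page size and a flat rate limit; the function names and the sample numbers are illustrative, not measurements from any platform.

```python
import math


def requests_needed(total_records: int, page_size: int) -> int:
    """Number of paginated requests required to read total_records."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_records / page_size)


def read_time_seconds(total_records: int, page_size: int,
                      rate_limit_per_second: float) -> float:
    """Lower bound on wall-clock time when the rate limit is the bottleneck."""
    if rate_limit_per_second <= 0:
        raise ValueError("rate_limit_per_second must be positive")
    return requests_needed(total_records, page_size) / rate_limit_per_second


# Hypothetical case: an agent needs 20,000 IP records for one reasoning step.
print(requests_needed(20_000, 50))         # small pages
print(read_time_seconds(20_000, 50, 5))    # small pages, 5 requests/second
print(requests_needed(20_000, 1_000))      # bulk pages
print(read_time_seconds(20_000, 1_000, 5)) # bulk pages, same rate limit
```

With these assumed numbers, the small-page case needs 400 requests and at least 80 seconds, while the bulk case needs 20 requests and about 4 seconds for the same data.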
Open and Extensible Data Model
Custom fields, custom object types, and relationship modeling matter more than out-of-the-box features. The environments that need a SoT the most are the ones that do not look exactly like the vendor's reference architecture. The right platform lets you add object types for tenants, services, applications, and physical assets without paying for professional services or waiting on a vendor roadmap. Watch for hidden limits: maximum custom fields per object type, restrictions on relationships between custom types, and whether custom fields are exposed through the API and the MCP server the same way native fields are. Half-exposed extensions are worse than no extensions because they create silent data gaps that automation cannot see.
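One way to catch half-exposed extensions during evaluation is to fetch the same object through both paths and compare the fields that come back. The sketch below assumes each response has already been parsed into a flat dictionary; the field names and the `exposure_gaps` helper are hypothetical, and real payloads will usually need flattening first.

```python
def exposure_gaps(expected_fields, api_record, mcp_record):
    """Report which expected fields are absent from each access path."""
    expected = set(expected_fields)
    return {
        "missing_from_api": sorted(expected - set(api_record)),
        "missing_from_mcp": sorted(expected - set(mcp_record)),
    }


# Illustrative records: two custom fields are visible through the API
# but not through the MCP server.
EXPECTED = ["name", "site", "cost_center", "owner_team"]
api_record = {"name": "edge-rtr1", "site": "dc1",
              "cost_center": "4410", "owner_team": "netops"}
mcp_record = {"name": "edge-rtr1", "site": "dc1"}

print(exposure_gaps(EXPECTED, api_record, mcp_record))
```

Any non-empty list in the result is a data gap that an agent would not know to ask about.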
Operational Scale for Your Environment
Scale is not a feature; it is a constraint. Match the platform's tested ceilings to your environment plus three years of growth across devices, prefixes, IP addresses, VRFs, sites, and especially the sustained query rate that automation and MCP traffic will drive against the API. NetBox at large scale comfortably handles millions of IP addresses and tens of thousands of devices, but the query rate ceiling depends on database tuning and infrastructure sizing that defaults will not give you. Infoblox at high query rates often requires Grid sizing that the original deployment did not plan for. Test under load with the same query patterns the agents will produce, not the patterns the UI produces.
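The three-year sizing rule can be turned into a quick check per dimension. The sketch below assumes compound annual growth, which is a simplification; the growth rate, the counts, and the tested ceilings are placeholder values, not figures for any platform.

```python
def projected_count(current: float, annual_growth: float, years: int = 3) -> float:
    """Project a count forward assuming compound annual growth."""
    return current * (1 + annual_growth) ** years


def within_ceiling(current: float, annual_growth: float,
                   tested_ceiling: float, years: int = 3) -> bool:
    """True when the projected count still fits under the tested ceiling."""
    return projected_count(current, annual_growth, years) <= tested_ceiling


# Hypothetical environment: 8,000 devices growing 15 percent per year.
print(round(projected_count(8_000, 0.15)))
print(within_ceiling(8_000, 0.15, tested_ceiling=10_000))
print(within_ceiling(8_000, 0.15, tested_ceiling=20_000))
```

Run the same check for every dimension, including sustained requests per second, and size against the one that fails first.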
A five-step evaluation
Run every candidate platform through the same evaluation. The goal is not to pick a winner on paper but to validate that the platform can carry your automation roadmap and your AI agent workloads for the next three years. Each step below produces a concrete artifact that the evaluation team can refer back to. Skip steps at your own cost: the platforms that look identical in a vendor demo can score very differently once they are exercised against real query patterns.
1. Define the data model first, platform second
List every object type you need (devices, prefixes, VRFs, circuits, contacts, services, tenants, applications), the relationships between them, the custom fields specific to your environment, and the tenancy model that authorization will follow. Most evaluations start with the platform demo, which is backwards. The data model decides whether the platform fits. Sketch it on paper or in a draft schema document before any vendor call. Vendors who push back on schema-first conversations are signaling something about the engagement that follows.
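A draft schema does not need a platform to be useful. The sketch below is one possible shape for it, assuming a team wants to list object types and relationships in code and catch relationships that point at types nobody has defined yet. The class names, fields, and sample entries are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectType:
    """One object type in the draft data model."""
    name: str
    custom_fields: tuple = ()
    tenant_scoped: bool = True


@dataclass(frozen=True)
class Relationship:
    """A directed relationship between two object types."""
    source: str
    target: str
    kind: str  # for example "one-to-many"


def undefined_references(object_types, relationships):
    """Return type names used in relationships but never defined."""
    defined = {ot.name for ot in object_types}
    used = {name for rel in relationships for name in (rel.source, rel.target)}
    return sorted(used - defined)


# Illustrative draft: "application" is referenced but not yet defined.
TYPES = [
    ObjectType("tenant", tenant_scoped=False),
    ObjectType("device", custom_fields=("owner_team",)),
    ObjectType("prefix", custom_fields=("cost_center",)),
]
RELATIONSHIPS = [
    Relationship("tenant", "device", "one-to-many"),
    Relationship("tenant", "prefix", "one-to-many"),
    Relationship("application", "device", "many-to-many"),
]

print(undefined_references(TYPES, RELATIONSHIPS))
```

A draft like this gives the evaluation team a concrete list to hold up against each platform's native and custom object types.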
2. Test the API at real query rates
Write a script that simulates the query pattern of a working automation pipeline: read prefixes for a site, reserve next available IP, write back changes, repeat. Run it at the rate your roadmap will require, which is typically 50 to 500 requests per second for a mid-sized environment and higher for AI agent workloads that explore the data model. Many platforms slow to a crawl past a few requests per second on default configurations. Capture p95 and p99 response times, not averages. A platform with a 100 ms median and a 4 second p99 will break agents that chain calls.
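The percentile capture can be sketched without any platform-specific client. The code below assumes the caller supplies a zero-argument function that performs one request; `measure` only times sequential calls and does not generate concurrent load, so a real test at 50 to 500 requests per second would need a proper load tool on top. The nearest-rank percentile method and all names are choices made for this example.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def measure(call, iterations):
    """Time `call` sequentially and return per-call latencies in seconds."""
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        call()
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Report the median and tail latencies."""
    return {name: percentile(latencies, pct)
            for name, pct in (("p50", 50), ("p95", 95), ("p99", 99))}


# Synthetic latencies: 97 fast responses and 3 slow ones.
samples = [0.1] * 97 + [4.0] * 3
print(summarize(samples))  # {'p50': 0.1, 'p95': 0.1, 'p99': 4.0}
```

In the synthetic sample the median and p95 both look healthy at 100 ms, and only the p99 reveals the 4 second tail that would stall an agent chaining several calls.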
3. Validate Model Context Protocol (MCP) integration support
Ask for the MCP server itself, the integration documentation, and a working example that connects to a sandbox instance and queries real data. If the answer is a roadmap commitment with no shipping code, treat that as a one to two year delay in your AI initiatives. Vendors who already ship MCP server implementations have already done the architectural work to expose their data under tenant-scoped permissions. Vendors who do not are asking you to wait for them, or to write the MCP server yourselves, which is a build project you almost certainly should not take on.
4. Confirm migration and bulk import tooling
Existing data lives in spreadsheets, an old IPAM platform, or a CMDB. Migration is the most expensive part of every SoT engagement and the place where projects stall. Platforms with strong bulk import tooling, conflict reconciliation, dry-run modes, and idempotent import scripts cut migration time by half or more. Ask for the actual tools, run an import against a sample of your data, and time it. If migration of 10,000 records takes hours rather than minutes, scale that out to your actual volume and the project plan changes.
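To make "dry-run" and "idempotent" concrete, the sketch below plans an import against an in-memory dictionary standing in for the platform. It assumes each record has a unique natural key, here a hypothetical `address` field; a real import would call the platform's bulk API instead of mutating a dict.

```python
def plan_import(existing: dict, incoming: list, key: str = "address") -> dict:
    """Classify incoming records without changing anything."""
    plan = {"create": [], "update": [], "unchanged": []}
    for record in incoming:
        current = existing.get(record[key])
        if current is None:
            plan["create"].append(record)
        elif current != record:
            plan["update"].append(record)
        else:
            plan["unchanged"].append(record)
    return plan


def apply_import(existing: dict, incoming: list, key: str = "address",
                 dry_run: bool = True) -> dict:
    """Return the plan; write to `existing` only when dry_run is False."""
    plan = plan_import(existing, incoming, key)
    if not dry_run:
        for record in plan["create"] + plan["update"]:
            existing[record[key]] = dict(record)
    return plan


# Illustrative data: one record already present, one changed, one new.
store = {
    "10.0.0.1": {"address": "10.0.0.1", "dns_name": "core-sw1"},
    "10.0.0.2": {"address": "10.0.0.2", "dns_name": "core-sw2"},
}
batch = [
    {"address": "10.0.0.1", "dns_name": "core-sw1"},
    {"address": "10.0.0.2", "dns_name": "core-sw2-new"},
    {"address": "10.0.0.3", "dns_name": "edge-rtr1"},
]

preview = apply_import(store, batch, dry_run=True)   # nothing written
first = apply_import(store, batch, dry_run=False)    # changes applied
second = apply_import(store, batch, dry_run=False)   # re-run is a no-op
print(len(first["create"]), len(first["update"]), len(second["unchanged"]))
```

The second run classifying every record as unchanged is the property to look for in a vendor's tooling: re-running a failed or partial import should never create duplicates.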
5. Build a managed agent reference implementation
Before signing a multi-year contract, prove the platform can drive at least one production automation: next-available-IP, subnet provisioning, or device record reconciliation against a real source. The reference implementation should sit behind the MCP server, use Git for version control, run through a CI pipeline before deployment, and include a rollback path. The agent is what proves the platform was the right buy. Vendors who cannot help you stand this up in a proof-of-concept window are vendors whose platform will not carry your automation roadmap.
What a source of truth engagement should produce
A platform purchase by itself does not solve the SoT problem. The work that turns the platform into an operational system is what makes it valuable. The deliverables below define a complete IVI engagement. Each is a concrete artifact, not a checklist of activity, and each survives the handoff so the customer team can operate and extend the platform without ongoing dependency.
Data Model Specification
A documented schema covering every object type, custom field, tenancy boundary, and relationship, validated against the actual environment and the planned automation use cases. The specification is version-controlled and updated through the same change process as the platform configuration itself, so the schema and the platform never drift from one another.
Migrated and Reconciled Data Set
Initial data load from existing systems, reconciled against the live network through discovery and validation passes to catch the drift between what was tracked and what actually exists. Reconciliation almost always surfaces 5 to 20 percent of records that are wrong or stale, and the engagement closes those gaps before the platform goes live as an authoritative source.
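A reconciliation pass reduces to a three-way comparison between what the SoT holds and what discovery finds. The sketch below assumes both sides have been normalized to a mapping of IP address to hostname, which is a simplification; the function name and sample data are illustrative, and the toy data set is far too small to say anything about typical drift rates.

```python
def reconcile(sot: dict, discovered: dict) -> dict:
    """Compare tracked records against discovered state."""
    sot_keys, live_keys = set(sot), set(discovered)
    stale = sorted(sot_keys - live_keys)        # tracked, not on the network
    untracked = sorted(live_keys - sot_keys)    # on the network, not tracked
    mismatched = sorted(k for k in sot_keys & live_keys
                        if sot[k] != discovered[k])
    # Share of SoT records that are wrong or stale.
    drift_pct = 100 * (len(stale) + len(mismatched)) / len(sot) if sot else 0.0
    return {"stale": stale, "untracked": untracked,
            "mismatched": mismatched, "drift_pct": drift_pct}


# Illustrative data only.
sot = {"10.0.0.1": "core-sw1", "10.0.0.2": "core-sw2",
       "10.0.0.3": "edge-rtr1", "10.0.0.4": "old-fw"}
discovered = {"10.0.0.1": "core-sw1", "10.0.0.2": "core-sw2-new",
              "10.0.0.3": "edge-rtr1", "10.0.0.5": "ap-17"}

print(reconcile(sot, discovered))
```

Untracked records are reported separately because they are additions to the SoT rather than errors in it, so they do not count toward the drift percentage in this sketch.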
Managed Agent Reference Implementation
At least one production automation built on top of the SoT and exposed through the MCP server where applicable, with proper CI/CD discipline including version control, automated tests, peer review, staged deployment, and a defined rollback path. The reference implementation is the template the customer team uses to build every subsequent agent.
Operational Runbook and Handoff
Documentation covering platform operations, MCP server health, API rate limits, backup and restore procedures, on-call escalation, and the change process for both data and schema. The handoff also includes training sessions for the customer team and a defined operating model, so the platform can be extended without depending on external help for routine work.
Platform comparison
Each option below is evaluated across six fields: the platform name, its category, how it works in practice, the environment where it is the strongest choice, the real costs and limitations of choosing it, and where it sits in our engagement model.
NetBox Cloud or NetBox Enterprise
Managed open-source SoT
NetBox is the open-source standard for network source of truth and the most widely deployed SoT platform in the IVI customer base. NetBox Labs offers a managed Cloud version and an Enterprise support tier on top of the open-source core. The data model is mature, the REST and GraphQL APIs are well-documented and performant at scale, and NetBox Labs ships and maintains a Model Context Protocol (MCP) server that exposes NetBox data to AI agents under tenant-scoped permissions. For a team that wants to start with the largest community, the cleanest MCP integration story available today, and a clear path to managed operations, NetBox is the default choice.
Best fit: Mid-market and enterprise teams with a network-first SoT need, an automation roadmap that depends on Python tooling and APIs, a requirement for working MCP server integration today rather than on a vendor roadmap, and a willingness to operate or pay for a managed instance of the platform.
Tradeoffs: NetBox is strong on network objects but lighter on broader IT and service modeling than enterprise CMDBs. Customizing the data model beyond a certain point requires plugin development or careful use of custom fields, and plugins are a maintenance commitment over time. The Cloud offering is newer than the long-running self-hosted product, and although feature parity is improving quickly, it is not always complete.
IVI recommendation: This is our default recommendation for network-led SoT engagements. IVI deploys NetBox Cloud or NetBox Enterprise with full data model design, migration from existing systems, MCP server integration, and a managed agent reference implementation in 8 to 12 weeks.
Nautobot
Extensible Python-native platform
Nautobot started as a NetBox fork and has evolved into a platform built around Network Automation Apps and Jobs that extend the core data model and run workflows directly inside the platform. Nautobot is the right choice when the SoT needs to host significant custom logic, scheduled jobs, and integrations as first-class platform features rather than as external automation code. The Apps model gives engineering teams a structured place to extend behavior, and the Jobs framework lets the platform itself execute automation triggered by events, schedules, or API calls.
Best fit: Teams with internal Python skill and a willingness to invest in platform engineering, environments that need custom workflows tightly coupled to the SoT, and organizations that prefer one platform that holds both data and automation logic rather than splitting them across systems.
Tradeoffs: The added power of Apps and Jobs comes with operational complexity. Each App and Job is code the customer team owns, tests, and maintains, and the platform upgrade path includes validating those extensions. The Model Context Protocol (MCP) integration story for Nautobot is moving but trails NetBox slightly at this writing, with community work in progress rather than a fully maintained vendor server.
IVI recommendation: We recommend Nautobot when the customer has the engineering capacity to own and extend the platform long-term. IVI deploys Nautobot, builds the initial App and Job set, integrates the available MCP tooling, and establishes the CI/CD pipeline in a 10 to 14 week engagement.
Infoblox or BlueCat
Enterprise DDI platforms
Infoblox and BlueCat are enterprise platforms that combine IPAM with managed DNS and DHCP services, often called DDI (DNS, DHCP, IPAM). The integrated control plane is the main attraction. A single platform manages the records that DNS resolves, the addresses that DHCP hands out, and the IPAM allocations that drive both. Teams that already operate Infoblox or BlueCat for DDI services often choose to extend usage into the SoT role rather than introduce a second platform alongside their existing investment.
Best fit: Large enterprises with existing Infoblox or BlueCat deployments, strong needs for managed DDI services tightly coupled with SoT, regulatory or operational preferences for vendor-supported platforms over open source, and teams without the engineering capacity to own a Python-native platform.
Tradeoffs: Data models are less flexible than NetBox or Nautobot, and custom object types are limited or require professional services engagements to implement. API performance at high sustained query rates varies by Grid sizing and product line, and load testing is essential before committing to agent workloads. Model Context Protocol (MCP) support is uneven across product lines and versions, with no fully maintained MCP server available at the level NetBox provides today.
IVI recommendation: We recommend extending Infoblox or BlueCat into the SoT role when DDI integration is the dominant requirement and the customer already operates the platform at scale. IVI delivers the SoT extension, API integration, and automation enablement as a 6 to 10 week engagement, with MCP integration handled as a custom build where the product line does not yet ship a server.
How IVI engages
Most platform deployments end with a working install and a frustrated operations team. IVI's SoT engagements are structured around the handoff. The platform is one deliverable. The operating discipline is the other. The two items below describe how that gets built in practice across every engagement, regardless of which platform the customer selects.
CI/CD discipline built in from day one
Operational maturity
Version control, automated tests, peer review, and a rollback path are not optional features of an automation program. They are the foundation. IVI sets them up before the first managed agent ships, even for teams that have never run CI/CD before, because retrofitting discipline onto a running automation program is far more expensive than building it in from the start.
How It Works: Every change to the SoT data model, every agent, and every integration is tracked in a Git repository with pipeline-driven deployment through a CI tool such as GitHub Actions or GitLab CI, automated test coverage for the change itself, and a defined rollback procedure that has been exercised at least once in a non-production window. We coach the customer team through the workflow until it is the natural way they work.
Why It Matters: Teams that adopt CI/CD discipline during their first SoT engagement carry that discipline into every subsequent project. Teams that skip it spend the next two years rebuilding what they shipped without it, and they accumulate the kind of brittle scripts and untested production changes that erode trust in automation across the whole organization.
Managed agent reference implementation
The build layer
The SoT is the platform you buy. The managed agents are the value you build on top. Every IVI SoT engagement ships at least one production agent so the customer sees the buy-and-build pattern working end-to-end before we hand off, with the agent exposed through the Model Context Protocol (MCP) server where the platform supports it.
Typical First Agents: Next-available-IP allocation, subnet provisioning workflows triggered by ticket or chat, device record reconciliation against live network state through ongoing discovery, and managed handoff of SoT data to downstream automation platforms for configuration generation and change validation.
ROI Discipline: We only recommend building agents where the projected return on investment is under 12 months. Anything longer is a sign the team should buy a commercial agent, wait for the market to mature, or rescope the work into something smaller. The ROI rule keeps the build layer honest and prevents the engagement from accumulating speculative work.
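The 12-month rule can be written down as a simple payback check. The sketch below reads "return on investment under 12 months" as a payback period, meaning build cost divided by monthly savings, which is an assumption about how the rule is applied; the cost and savings figures are placeholders.

```python
def payback_months(build_cost: float, monthly_savings: float) -> float:
    """Months until cumulative savings cover the build cost."""
    if monthly_savings <= 0:
        return float("inf")  # the agent never pays for itself
    return build_cost / monthly_savings


def recommend_build(build_cost: float, monthly_savings: float,
                    threshold_months: float = 12) -> bool:
    """True when the projected payback falls under the threshold."""
    return payback_months(build_cost, monthly_savings) < threshold_months


# Hypothetical agents with the same build cost and different savings.
print(payback_months(60_000, 7_500), recommend_build(60_000, 7_500))
print(payback_months(60_000, 4_000), recommend_build(60_000, 4_000))
```

With these placeholder numbers the first agent pays back in 8 months and clears the bar, while the second takes 15 months and would be rescoped, bought, or deferred.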