When AI belongs on-prem (and when it doesn't)

AI on-prem: why your 125 A single-phase closet
is about to become a 208 V three-phase project.

Someone in the building just put GPU servers on a PO. Now the call comes to facilities: can we power them? Here's the conversation that follows — with the math, the code, and the honest scope of the work.

The call we've been getting more than any other lately starts with some version of the same sentence: "We ordered some GPU servers and I need to know if we can plug them in." Sometimes the servers are already on the loading dock. Sometimes the PO hasn't been cut yet. Either way, by the time we get there, IT has picked the workload, finance has approved the budget, and nobody's run the math with facilities. Here's what we usually find when we walk the closet — and why the answer is almost always the same.

What's actually in those boxes

A modern AI server is not a rack-and-stack 2U file server with a couple of extra Ethernet ports. It's a dense, power-hungry specialized appliance designed to run GPUs at maximum sustained draw for hours at a time. The scale has moved fast:

  • An NVIDIA H100 GPU runs up to 700 W by itself. A DGX H100 server — 8 GPUs plus CPUs, NVSwitch fabric, InfiniBand NICs, and DPUs — draws roughly 10–11 kW per chassis at full load. [1]
  • The newer DGX H200 sits in a similar power envelope at roughly 10.2 kW, and NVIDIA's own documentation specifies it requires a 200–240 V three-phase configuration — a standard wall outlet is explicitly called out as unsuitable. At 208 V three-phase, it draws about 28.3 A per phase at max load. [2]
  • The DGX B200 (Blackwell) pushes this to about 14.3 kW per chassis, with each B200 GPU running 1,000 W air-cooled or 1,200 W liquid-cooled. [1][3]
  • The GB200 NVL72 rack-scale system — 72 Blackwell GPUs in a single rack — demands up to 120–140 kW per rack, with mandatory direct-to-chip liquid cooling. [1]

For context: a traditional enterprise server rack — the kind your existing closet was probably designed around — typically lives in the 5–15 kW range per rack. [4] A single DGX B200 is already at the top of that range. A rack of them is somewhere none of your existing infrastructure was asked to go.
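To make the gap concrete, here's a minimal sketch of that comparison in Python. The per-chassis figures are the rough numbers cited above, not nameplate values for any specific configuration — treat them as placeholders to be replaced with your own equipment's ratings.

```python
# Rough sustained per-chassis draw in kW (figures cited above -- substitute
# the actual nameplate values for your specific configuration).
CHASSIS_KW = {
    "DGX H100": 10.5,
    "DGX H200": 10.2,
    "DGX B200": 14.3,
}

TRADITIONAL_RACK_KW = (5, 15)  # typical enterprise rack design range

def rack_load_kw(counts: dict[str, int]) -> float:
    """Sum sustained chassis draw for one rack."""
    return sum(CHASSIS_KW[model] * qty for model, qty in counts.items())

# Example: two DGX H200 plus one DGX B200 in a single rack.
load = rack_load_kw({"DGX H200": 2, "DGX B200": 1})
print(f"Rack load: {load:.1f} kW")    # ~34.7 kW
print(f"Traditional design range: {TRADITIONAL_RACK_KW[0]}-{TRADITIONAL_RACK_KW[1]} kW")
```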

Why 125 A single-phase hits a ceiling fast

Many mid-enterprise buildings have exactly the kind of setup this article is named for: a data closet fed by a 125 A single-phase 208 V circuit — two legs of the building's 208Y/120V three-phase service, pulled into a subpanel in the IDF or MDF. It's been fine for years. For workloads up to maybe 15 kW per rack, it's still fine. But the AI server changes the ceiling.

Let's run the math, because this is where the reality shows up.

A single-phase 208 V, 125 A circuit delivers a theoretical 26 kVA of apparent power (V × I). But NEC 210.19(A)(1) and 210.20(A) require branch-circuit conductors and overcurrent devices to be sized at 125% of the continuous load — equivalently, continuous loads are limited to 80% of the branch-circuit rating. [5] A server running GPU workloads at sustained high utilization is the textbook definition of a continuous load, which the NEC defines as a load expected to run at its maximum current for three hours or more.

Apply the 80% rule and that 26 kVA circuit gives you about 20–21 kW of usable continuous capacity. That's enough for one DGX H200 running hard (10.2 kW), with a little headroom for networking gear and a small UPS — and not much else. Add a second DGX and you're over the line.

And that's before we talk about cooling load, PDU overhead, the cost of balancing two circuits that must not trip each other under startup surge, or the fact that large single-phase loads start to create noticeable voltage imbalance in your building's 208Y/120V panel as a whole.
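As a quick check on the branch-circuit arithmetic alone, here's a minimal sketch of the 80%/125% rule in Python, assuming a power factor near 1 so kVA and kW are treated interchangeably:

```python
def usable_continuous_kw(breaker_amps: float, volts: float) -> float:
    """Single-phase continuous capacity after the NEC 80% rule (PF ~= 1)."""
    return volts * breaker_amps * 0.80 / 1000

def min_breaker_amps(continuous_kw: float, volts: float) -> float:
    """Smallest breaker rating for a continuous single-phase load (125% of load current)."""
    return (continuous_kw * 1000 / volts) * 1.25

print(usable_continuous_kw(125, 208))   # ~20.8 kW on the existing feeder
print(min_breaker_amps(10.2, 208))      # ~61 A of breaker for one DGX H200 alone
```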

Why three-phase is the answer (not just "more amps")

Here's the number that makes the conversation different. That same 125 A at 208 V, delivered as three-phase, gives you roughly 45 kVA of apparent power — because three-phase power is computed as √3 × V(line-to-line) × I, not just V × I. [6] After the NEC 80% continuous-load derate, that's about 36 kW usable — nearly 75% more capacity out of the same ampere rating on the breaker.

You get that capacity by adding a third current-carrying conductor rather than by upsizing the others, which is a meaningfully cheaper copper bill than going from 125 A single-phase to 200+ A single-phase. And you get it in a form that's better for the equipment, too: three-phase power delivered to a rack is naturally load-balanced across three legs, reducing voltage drop, reducing the neutral-conductor heating that single-phase UPS installations have to plan for, and giving you cleaner power at the GPU's input stage.

125 A at 208 V — single-phase vs three-phase

| Characteristic | Single-phase 208 V | Three-phase 208Y/120V |
| --- | --- | --- |
| Power formula | V × I | √3 × V(line-to-line) × I |
| Theoretical capacity at 125 A | 26 kVA | 45 kVA |
| Usable (80% continuous) | ~21 kW | ~36 kW |
| Conductors in feeder | 2 hot + neutral + ground | 3 hot + neutral + ground |
| Typical rack PDU plug | L6-30, L14-30, etc. | L21-30, L22-30, IEC 60309 |
| Practical rack density served | ~2.8 kVA (120 V) – 7.4 kVA (230 V) per PDU circuit | 10–22 kW per rack, scalable |
| Can also deliver 120 V circuits | Only from one leg to neutral | Yes — each of three legs to neutral gives 120 V |

Figures synthesized from industry references [7][6]; any actual installation should be load-calculated against the specific equipment nameplates and reviewed by a licensed electrical engineer.
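The capacity figures above and in the table reduce to a few lines of arithmetic. Here's a minimal sketch in Python — apparent power only, power factor assumed near 1, and no allowance for anything else on the circuit:

```python
import math

def apparent_power_kva(volts_ll: float, amps: float, three_phase: bool) -> float:
    """Apparent power of a feeder: V*I single-phase, sqrt(3)*V_LL*I three-phase."""
    factor = math.sqrt(3) if three_phase else 1.0
    return factor * volts_ll * amps / 1000

def usable_kw(volts_ll: float, amps: float, three_phase: bool) -> float:
    """Continuous capacity after the NEC 80% derate (PF ~= 1, so kVA ~ kW)."""
    return 0.80 * apparent_power_kva(volts_ll, amps, three_phase)

single = usable_kw(208, 125, three_phase=False)   # ~20.8 kW
three  = usable_kw(208, 125, three_phase=True)    # ~36.0 kW
print(f"single-phase: {single:.1f} kW, three-phase: {three:.1f} kW")
print(f"gain: {100 * (three / single - 1):.0f}%")  # ~73% more from the same 125 A
```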

Why it's almost always 208 V (and when it isn't)

In the US commercial building stock, the default three-phase service is 208Y/120V — three hot legs at 208 V line-to-line, 120 V from each leg to neutral. It's what most office buildings have at the service entrance, and it's what most mid-enterprise AI deployments will land on. Critically, every major AI server platform — DGX, HGX, and the big Supermicro/Dell/Lenovo reference designs — accepts 200–240 V input natively. [2]

The next step up is 480Y/277V, which is what you see in large data centers and purpose-built colocation space. It reduces conductor sizes further and supports the 100+ kW-per-rack densities that a GB200 NVL72 requires. But 480V requires dedicated step-down transformers for anything running at 208/120V (your lights, your office equipment, your existing PDUs), and it brings a different tier of electrical code attention. For most SMB, SLED, and mid-enterprise AI deployments — single DGX, small clusters, specialty rendering or inference racks — 208Y/120V three-phase is the right answer. 480V is a different project.

What else changes in the closet

Moving to three-phase isn't a single-line-item change. Here's the honest scope of what usually comes with it:

The PDU

Your existing single-phase rack PDU doesn't accept three-phase input. You're looking at a new three-phase rack PDU, typically a 30 A or 60 A unit with a receptacle like L21-30, L22-30, or an IEC 60309 plug. Good news: a three-phase rack PDU can still deliver 120 V convenience circuits for non-server gear, because each phase-to-neutral leg is 120 V. [6] You don't lose flexibility; you gain capacity.
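The same 80% continuous derate applies at the PDU plug. Here's a rough sketch of what common three-phase 208 V PDU ratings carry continuously — illustrative only; check the derated capacity published for the specific PDU you're buying:

```python
import math

def pdu_usable_kw(volts_ll: float, plug_amps: float) -> float:
    """Continuous capacity of a three-phase PDU after the 80% derate (PF ~= 1)."""
    return math.sqrt(3) * volts_ll * plug_amps * 0.80 / 1000

for amps in (30, 60):
    print(f"{amps} A, 208 V three-phase PDU: ~{pdu_usable_kw(208, amps):.1f} kW continuous")
# 30 A -> ~8.6 kW; 60 A -> ~17.3 kW
```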

The UPS

Single-phase UPS systems top out well below the capacities AI workloads demand. A three-phase UPS — typically a double-conversion design in the 20–60 kVA range for closet-scale AI, up to 200 kVA+ for dedicated AI rooms — is almost always the right path. If you're still on VRLA batteries and the system is end-of-life, this is a natural moment to have the Li-Ion vs VRLA conversation as part of the upgrade, because lithium UPS systems are built to handle the higher power densities and tighter footprints that AI deployments need.
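UPS sizing starts from the same load number plus headroom. A minimal sketch follows, with the power factor, growth margin, and frame sizes all placeholder assumptions rather than vendor guidance:

```python
def ups_kva_needed(load_kw: float, power_factor: float = 0.95, growth: float = 0.25) -> float:
    """Rough UPS sizing: convert kW to kVA and add a growth margin.

    power_factor and growth are placeholder assumptions, not vendor guidance.
    """
    return load_kw / power_factor * (1 + growth)

STANDARD_SIZES_KVA = [20, 30, 40, 60, 80, 100]  # illustrative frame sizes only

need = ups_kva_needed(24.5)   # hypothetical rack load in kW from the earlier calculation
frame = next(s for s in STANDARD_SIZES_KVA if s >= need)
print(f"Need ~{need:.1f} kVA -> next standard frame: {frame} kVA")
```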

The feeder

The conductors feeding your existing 125 A single-phase subpanel are probably not reusable for a new three-phase feeder — you need a third hot conductor, your conduit fill changes, and your existing breaker is single-phase. In most buildings where 208Y/120V three-phase is already available at the main panel, the path is extending a new three-phase feeder from the main panel into the closet, landing on a new three-phase subpanel. It's an electrical contractor job, needs a permit, and needs to be coordinated with a facilities outage. Plan on a half-day to two-day scope of work depending on conduit runs, panel capacity, and whether any AHJ complications surface.
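One question the contractor will answer along the way is voltage drop on the new feeder run. The estimate itself is a one-liner; the per-1000-ft conductor resistance is the part you look up for the wire size actually being pulled, so it's left as an input in this sketch rather than baked in. A rough planning check, not a substitute for the contractor's load calculation:

```python
import math

def feeder_voltage_drop_pct(volts_ll: float, amps: float, one_way_feet: float,
                            ohms_per_kft: float) -> float:
    """Approximate three-phase voltage drop as a percent of line-to-line voltage.

    ohms_per_kft is the conductor resistance per 1000 ft for the wire size you
    plan to pull -- look it up for the actual conductor; it is not built in here.
    """
    drop_volts = math.sqrt(3) * amps * ohms_per_kft * (one_way_feet / 1000)
    return 100 * drop_volts / volts_ll

# Example: 100 A of load on a 150 ft run, resistance supplied for the chosen conductor.
print(f"{feeder_voltage_drop_pct(208, 100, 150, ohms_per_kft=0.12):.2f}% drop")
```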

The cooling question — and when it stops being optional

Power is half the conversation. The heat goes somewhere, and most existing closets weren't designed to move it. As a rough guide:

  • Under ~10 kW/rack: Standard room cooling is usually sufficient, as long as your airflow and hot/cold aisle setup are reasonable.
  • 10–30 kW/rack: In-row cooling or rear-door heat exchangers become the right tool. Traditional perimeter CRAC units start to struggle with the concentrated heat load.
  • 30–50 kW/rack: The upper end of what air cooling can realistically deliver in an enterprise environment. Above this, the physics starts to fight you. [4]
  • Above ~50 kW/rack: Liquid cooling — direct-to-chip cold plates or rear-door heat exchangers with coolant distribution units (CDUs) — becomes necessary, not optional. NVIDIA's own GB200 NVL72 ships with mandatory direct liquid cooling. [1]

For most mid-enterprise AI deployments — one to four DGX-class systems in a single rack — you're in the 10–45 kW/rack range and air cooling is still workable, provided the room has the BTU capacity and the airflow is engineered properly. If you're going larger than that in a single room, the liquid cooling conversation is a project of its own, and it usually means a purpose-built room rather than a retrofit of an existing closet.
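On the heat side, it's the same number in different units — roughly 3,412 BTU/hr of heat to reject for every kW of IT load. Here's a minimal sketch mapping per-rack load to the rough tiers above:

```python
BTU_PER_KW = 3412  # 1 kW of IT load ~= 3,412 BTU/hr of heat to reject

def cooling_tier(rack_kw: float) -> str:
    """Map per-rack load to the rough cooling tiers described above."""
    if rack_kw < 10:
        return "standard room cooling (with sane airflow)"
    if rack_kw < 30:
        return "in-row cooling or rear-door heat exchangers"
    if rack_kw < 50:
        return "upper limit of air cooling -- engineer carefully"
    return "liquid cooling (direct-to-chip or RDHx with CDUs)"

for kw in (8, 14.3, 34.7, 120):
    print(f"{kw:>6.1f} kW/rack -> {kw * BTU_PER_KW:>8,.0f} BTU/hr -> {cooling_tier(kw)}")
```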

"The decision to deploy AI on-prem is almost always made upstairs. The decision about whether the building can actually run it — that conversation usually starts the day after the purchase order lands."
— KAT-5 Field Experience

The planning order that actually works

We've walked enough of these projects now to have a clear sequence that keeps them on track. It looks like this:

  1. GPU platform and count first. How many DGX- or HGX-class systems, in what racks, running what workloads. This determines every downstream decision.
  2. Per-rack load calculation. Sum the nameplate, apply a reasonable diversity factor (GPUs running inference rarely pull 100% sustained, training often does), add cooling overhead, add PDU and monitoring. Get the honest kW per rack (see the sketch below).
  3. Single-phase vs three-phase call. If you're above ~15 kW per rack, you're in three-phase territory. Below that, honestly evaluate whether you're planning to grow — if you'll cross the line in two years, do the three-phase work now.
  4. Feeder and panel review. Does the main panel have capacity? What's the route from main panel to IDF/MDF? Is there a permit question?
  5. PDU and UPS selection. Match to the load and the input voltage. Don't buy a single-phase UPS for a deployment that's obviously going three-phase in two years.
  6. Cooling capacity check. Does the room have the BTU capacity? If not, does in-row cooling make sense, or is the heat load pushing you toward a liquid cooling conversation?
  7. Schedule the work. Electrical, cooling, and equipment installation all in the right sequence, with the right downtime windows.

Doing these in order means every downstream decision is grounded in the ones above it. Doing them out of order — which is what happens when the GPU servers show up first and the facilities conversation starts second — means you pay for infrastructure twice, or you get a deployment that works for six months and then starts tripping breakers.
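Step 2 in that list is where most of the arithmetic lives. Here's a minimal sketch of one way to get to the honest kW per rack — the diversity factor, cooling overhead, and fixed PDU/monitoring allowance are placeholder assumptions to replace with your own measurements:

```python
def honest_rack_kw(nameplate_kw: float, diversity: float,
                   cooling_overhead: float = 0.10,
                   pdu_and_monitoring_kw: float = 0.5) -> float:
    """Per-rack planning load: nameplate x diversity, plus overheads.

    diversity: 1.0 for sustained training, lower for bursty inference.
    cooling_overhead: in-rack fans/RDHx fed from the same circuits (assumed 10%).
    pdu_and_monitoring_kw: fixed allowance for PDUs, switches, monitoring.
    """
    it_load = nameplate_kw * diversity
    return it_load * (1 + cooling_overhead) + pdu_and_monitoring_kw

# Example: two DGX H200 training hard vs. serving inference.
nameplate = 2 * 10.2
print(f"training:  {honest_rack_kw(nameplate, diversity=1.0):.1f} kW")   # ~22.9 kW
print(f"inference: {honest_rack_kw(nameplate, diversity=0.7):.1f} kW")   # ~16.2 kW
```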

References
  1. IntuitionLabs (2026). NVIDIA HGX Platform: Data Center Physical Requirements Guide. intuitionlabs.ai/articles/nvidia-hgx-data-center-requirements
  2. Uvation (2025). NVIDIA DGX H200 Power Consumption: Key Facts & Requirements. uvation.com
  3. GPU.fm (2026). NVIDIA B200 GPU: Complete Pricing, Specs & Buyer's Guide. gpu.fm/blog/nvidia-b200-complete-buyers-guide-2026
  4. Introl. Liquid Cooling vs Air: The 50kW GPU Rack Guide. introl.com/blog/liquid-cooling-gpu-data-centers-50kw-thermal-limits-guide
  5. EE Power (NEC Commentary). National Electrical Code Basics: Sizing and Protecting Branch-circuit Conductors (NEC 210.19, 210.20, 310). eepower.com
  6. Enconnex. Examining Single-Phase vs. 3-Phase Power & PDUs. blog.enconnex.com
  7. Tripp Lite / Eaton. Power Distribution Units (PDUs) — single-phase vs three-phase capacity reference. tripplite.eaton.com

GPU servers on the way?

Let us walk the closet before the boxes arrive. We'll give you the honest scope — what fits today, what needs to change, and what order to do the work in.

Request a Site Assessment