Concept Guide

Addressing the Memory Bottleneck in AI Model Training for Healthcare
Table 1. Memory requirement for training 3D U-Net.
Image size  | Batch size | Training outcome | Server system memory | Server CPU family                            | Server tag
------------+------------+------------------+----------------------+----------------------------------------------+-------------------
128x128x128 | 16         | Fail             | 192 GB               | 1st Generation Intel Xeon Scalable Processor | dev server
144x144x144 | 8          | Success          | 384 GB               | 1st Generation Intel Xeon Scalable Processor | standard server
240x240x144 | 16         | -                | 1.5 TB               | 2nd Generation Intel Xeon Scalable Processor | memory-rich server
Table 2. Provisioning training infrastructure for 3D U-Net. We used random pixel values as input
tensors. Our development server failed when executing just the 3D convolution-kernel part of
the full 3D U-Net architecture.
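
To give a feel for why image size and batch size drive the memory requirement, the sketch below estimates the size of a single activation tensor for each configuration in the table. It is a back-of-the-envelope illustration, not the method used in this guide: the float32 data type and the channel count of 32 are assumptions chosen for the example, and the function name feature_map_gib is hypothetical.

```python
# Back-of-the-envelope size of ONE 3D activation tensor.
# Assumptions (not from the guide): float32 values, 32 feature channels.

BYTES_PER_FLOAT32 = 4
ASSUMED_CHANNELS = 32  # hypothetical channel count for illustration


def feature_map_gib(shape, batch_size, channels, bytes_per_value=BYTES_PER_FLOAT32):
    """Return the size in GiB of a tensor of shape (batch, D, H, W, channels)."""
    depth, height, width = shape
    voxels = depth * height * width
    return batch_size * voxels * channels * bytes_per_value / 2**30


# (image size, batch size) pairs taken from the table above.
configs = [
    ((128, 128, 128), 16),
    ((144, 144, 144), 8),
    ((240, 240, 144), 16),
]

for shape, batch in configs:
    size = feature_map_gib(shape, batch, ASSUMED_CHANNELS)
    label = "x".join(str(s) for s in shape)
    print(f"{label}, batch {batch}: {size:.2f} GiB per feature map")
```

Under these assumptions a single feature map already occupies several GiB. Training keeps many such tensors alive at once (activations at every layer are retained for the backward pass, alongside gradients and workspace buffers), so the total footprint is a large multiple of the per-tensor figure. The exact multiple depends on the network and framework and is not estimated here.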