Concept Guide

Addressing the Memory Bottleneck in AI Model Training for Healthcare
Motivation
Healthcare data sets often consist of large, multi-dimensional modalities. Deep learning (DL) models developed from these data sets require both high accuracy and high confidence levels to be useful in clinical practice. Researchers employ advanced hardware and software to speed up this process, which is both data- and computation-intensive.
Medical image analytics tasks, such as semantic segmentation, are particularly challenging because the model is trained to automatically classify individual voxels from large volumetric images [1]. The 3D (and sometimes 4D) nature of this data demands increased memory capacity and processing power when training the model. Consequently, researchers resort to workarounds such as downsizing and tiling images to fit within available system memory, or adopt shallower neural network topologies to lower the processing requirement. Ultimately, most researchers choose a model based on the memory limitations of the hardware rather than on the best possible model design.
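The tiling workaround mentioned above can be illustrated with a short sketch. The snippet below is not taken from the guide; the NumPy helper, the patch shape, and the stride are illustrative assumptions. It splits a large volumetric scan into fixed-size overlapping patches so that each patch, rather than the whole volume, has to fit in memory during training.

    import numpy as np

    def extract_tiles(volume, tile_shape=(64, 64, 64), stride=(32, 32, 32)):
        # Yield overlapping 3D patches from a volumetric image of shape (H, W, D).
        # Edge padding is omitted for brevity, so trailing voxels may be skipped.
        h, w, d = volume.shape
        th, tw, td = tile_shape
        sh, sw, sd = stride
        for z in range(0, max(h - th, 0) + 1, sh):
            for y in range(0, max(w - tw, 0) + 1, sw):
                for x in range(0, max(d - td, 0) + 1, sd):
                    yield volume[z:z + th, y:y + tw, x:x + td]

    # Example: a synthetic single-channel scan is decomposed into 64^3 patches,
    # each small enough to train on when the full volume would not fit in memory.
    scan = np.random.rand(240, 240, 155).astype(np.float32)
    tiles = list(extract_tiles(scan))
    print(len(tiles), tiles[0].shape)

In practice the tiles (and the corresponding label patches) would be fed to the segmentation model in batches, and predictions stitched back together at inference time, which is part of the complexity this workaround introduces.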
A high-memory CPU-based server solution, such as the 2nd Generation Intel Xeon Scalable Processor, presents an attractive architecture for addressing the compute and memory requirements of 3D semantic segmentation algorithms such as the 3D U-Net model. With more than 1 TB of system memory available, the 2nd Generation Intel Xeon Scalable Processor allows researchers to develop DL models that can be several orders of magnitude larger than those that can be trained on DL accelerators.
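As an illustration of the CPU-side setup this implies, the sketch below shows threading settings commonly used when training with Intel-optimized TensorFlow on a high-core-count Xeon server. The thread counts are placeholders that would need to match the physical core count of the actual machine; this is a sketch of common practice, not a configuration taken from the guide.

    import os

    # OpenMP / oneDNN settings commonly recommended for Intel-optimized TensorFlow.
    # Values are illustrative; tune them to the server's physical core count.
    # They are set before TensorFlow is imported so the runtime picks them up.
    os.environ["OMP_NUM_THREADS"] = "28"                        # e.g. cores per socket
    os.environ["KMP_BLOCKTIME"] = "1"
    os.environ["KMP_AFFINITY"] = "granularity=fine,compact,1,0"

    import tensorflow as tf

    # Map TensorFlow's thread pools onto the hardware before building the model.
    tf.config.threading.set_intra_op_parallelism_threads(28)   # parallelism within an op
    tf.config.threading.set_inter_op_parallelism_threads(2)    # concurrently running ops

    # With a terabyte-scale memory pool, full-resolution multimodal volumes can be
    # used directly (no tiling), e.g. an input of shape (batch, 240, 240, 155, 4).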
Multimodal Brain Tumor Analysis
Multimodal brain tumor analysis is an important diagnostic process in the healthcare industry. A brain tumor occurs when abnormal cells form within the brain. Gliomas are the most frequent primary brain tumors in adults, presumably originating from glial cells and infiltrating the surrounding tissues [2]. Current imaging techniques used in clinical studies are limited to basic assessments, indicating, for example, the presence of gliomas, or limited to non-holistic coverage of the scan as a result of the reliance on rudimentary measurement techniques [3]. By
“These models were only moderate size, and we require more GPU or CPU memory to be able to train larger models...”

“Our estimations are based on our current GPU hardware specifications. We hope that switching to a CPU-based model (and using Intel-optimized TensorFlow) will make training large models more feasible.”

- NEUROMOD / Université de Montréal