Concept Guide

5 Addressing the Memory Bottleneck in AI Model Training for Healthcare
By replacing current assessments with highly accurate and reproducible measurements, AI and DL
techniques can automatically analyze brain tumor scans, offering enormous potential for
improved diagnosis, treatment planning, and patient follow-up.
A typical MRI scan of the brain may contain 4D volumes with multimodal, multisite MRI data
(FLAIR, T1w, T1gd, T2w). With appropriate training data sets, an AI-based brain tumor analysis
solution should perform segmentation on the images, annotating regions of interest as
necrotic/active tumor, oedema, or benign tissue.
Figure 1. AI-based glioma segmentation.
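To make the data sizes concrete, the following sketch models a single multimodal training sample as a NumPy array. The four modalities and the 240x240x155 voxel grid are illustrative assumptions (typical of public brain-tumor data sets), not dimensions stated in this guide:

```python
import numpy as np

# Hypothetical multimodal brain MRI sample: 4 modalities
# (FLAIR, T1w, T1gd, T2w), each a 240x240x155 voxel grid,
# stacked into one 4D volume as the network input.
volume = np.zeros((4, 240, 240, 155), dtype=np.float32)

# Matching segmentation labels, one class ID per voxel
# (e.g. background, necrotic/active tumor, oedema, benign).
labels = np.zeros((240, 240, 155), dtype=np.uint8)

print(volume.nbytes / 2**20)  # ~136 MiB for a single sample
```

Even one float32 sample of this shape occupies roughly 136 MiB before any batching, which is why accelerator memory becomes a concern so quickly for volumetric medical data.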
Computing Challenges
While the high processing requirement of medical data analysis may be addressed with hardware
accelerators, such as GPUs, addressing the memory requirement is not straightforward. As an
example, a GPU accelerator has between 8 GB and 32 GB of memory. Although convolutional
neural networks may have only several million trainable parameters, the actual memory footprint
of these models is not due solely to those parameters. Instead, most of the memory footprint of
these models comes from the activation (feature) maps in the model (Figure 2, green boxes).
These activation maps, essentially copies of the original images, are a function of the size of
the input to the network. Therefore, models that use large-batch, high-resolution, high-dimensional
image inputs often require more memory than the accelerator card can accommodate. As a
simple example, a ResNet-50 topology that can train successfully on a 224x224x3 RGB input
image may report an out-of-memory (OOM) error when training on the 4096x2160x3 input images
common to 4K video streams.
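The scaling argument above can be checked with simple arithmetic. The sketch below estimates the memory of a single activation map for a ResNet-style first convolution layer (64 output channels at half the input resolution); the layer shape is an illustrative assumption, not an exact ResNet-50 accounting:

```python
def activation_bytes(h, w, c, batch=1, dtype_bytes=4):
    """Bytes held by one activation (feature) map of shape
    batch x h x w x c, stored as float32 by default."""
    return batch * h * w * c * dtype_bytes

# First conv layer of a ResNet-style network: 64 channels,
# spatial size = half the input resolution.
small = activation_bytes(112, 112, 64)    # from a 224x224x3 input
large = activation_bytes(2048, 1080, 64)  # from a 4096x2160x3 input

print(small / 2**20)  # ~3 MiB
print(large / 2**30)  # ~0.5 GiB, ~176x larger, from input size alone
```

Since every layer in the network produces activations scaled this way, and activations are retained for the backward pass, the 4K input blows past a card's 8 GB to 32 GB budget long before parameter storage does.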
To compensate for the memory constraints of accelerator cards, researchers use the following
“tricks”: