Administrator Guide

9 RAPIDS Scaling on Dell EMC PowerEdge Servers
1.5 E2E NYC-Taxi Notebook
This End-to-End (E2E) notebook example is extracted from the NVIDIA rapidsai/notebooks-contrib GitHub
repository. The workflow consists of three core phases performed on the NYC-Taxi dataset:
Extract-Transform-Load (ETL), machine learning training, and inference. The notebook focuses on
showing how to use cuDF with Dask and XGBoost to scale GPU DataFrame ETL-style operations and
model training out to multiple GPUs on multiple nodes; see Figure 6 below. In this notebook we will see
how RAPIDS, Dask, and XGBoost work together.
Figure 6. NYC-Taxi Notebook Workflow
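The three phases of the workflow can be sketched as follows. This is a minimal outline, not the notebook's exact code: it assumes a machine with NVIDIA GPUs and the RAPIDS libraries (dask-cuda, dask_cudf, XGBoost with GPU support) installed, and the CSV path, column names, and filter are placeholders.

```python
# Sketch of the ETL -> training -> inference phases with Dask, cuDF, and XGBoost.
# Placeholder path and column names; requires NVIDIA GPUs and RAPIDS installed.
from dask_cuda import LocalCUDACluster
from dask.distributed import Client
import dask_cudf
import xgboost as xgb

cluster = LocalCUDACluster()            # one Dask worker per local GPU
client = Client(cluster)

# --- ETL: load and transform the taxi data as a distributed GPU DataFrame ---
df = dask_cudf.read_csv("nyc_taxi_*.csv")     # placeholder path
df = df[df["fare_amount"] > 0]                # example cleanup filter
X = df.drop(columns=["fare_amount"])
y = df["fare_amount"]

# --- Training: distributed XGBoost over the GPU-resident partitions ---
dtrain = xgb.dask.DaskDMatrix(client, X, y)
output = xgb.dask.train(
    client,
    {"tree_method": "gpu_hist", "objective": "reg:squarederror"},
    dtrain,
    num_boost_round=100,
)

# --- Inference: predict with the trained booster across the cluster ---
preds = xgb.dask.predict(client, output["booster"], X)
```

Because the DataFrame partitions stay in GPU memory end to end, the ETL output feeds XGBoost training without a round trip through host memory, which is the main point of combining cuDF with Dask and XGBoost.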
1.6 RAPIDS Memory Manager (RMM)
According to NVIDIA's definition, RAPIDS Memory Manager (RMM) is a central place for all device memory
allocations in cuDF (C++ and Python) and other RAPIDS libraries. In addition, it is a replacement allocator
for CUDA device memory (and CUDA managed memory) and a pool allocator that makes CUDA device
memory allocation and deallocation faster and asynchronous.
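The pool-allocator idea can be illustrated with a toy sketch in plain Python (the class and its fields are hypothetical, purely for illustration; RMM's real implementation manages CUDA device memory in C++). The pool reserves one large region up front, so each subsequent "allocation" is a cheap bookkeeping step rather than a call into the underlying allocator:

```python
# Toy illustration of the pool-allocator idea behind RMM (hypothetical
# names, not RMM's API). One large region is reserved up front; blocks
# are handed out by bumping an offset and recycled via a free list.
class ToyMemoryPool:
    def __init__(self, capacity):
        self.capacity = capacity   # total bytes reserved up front
        self.offset = 0            # next unused byte (bump pointer)
        self.free_list = []        # recycled (offset, size) blocks

    def allocate(self, size):
        # Reuse a recycled block if one is large enough.
        for i, (off, sz) in enumerate(self.free_list):
            if sz >= size:
                del self.free_list[i]
                return off
        if self.offset + size > self.capacity:
            raise MemoryError("pool exhausted")
        off = self.offset
        self.offset += size
        return off

    def deallocate(self, offset, size):
        # Return the block to the free list; no call back into the
        # underlying allocator, so deallocation is cheap.
        self.free_list.append((offset, size))

pool = ToyMemoryPool(capacity=1024)
a = pool.allocate(256)     # carved from the pool at offset 0
b = pool.allocate(128)     # carved at offset 256
pool.deallocate(a, 256)
c = pool.allocate(200)     # reuses the recycled 256-byte block at offset 0
```

In RAPIDS itself, the pool is enabled from Python with, for example, `rmm.reinitialize(pool_allocator=True)`, after which cuDF allocations are served from the pool.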