Administrator Guide

9 RAPIDS Scaling on Dell EMC PowerEdge Servers
1.5 E2E NYC-Taxi Notebook
This End-to-End (E2E) notebook example is extracted from the NVIDIA rapidsai/notebooks-contrib GitHub
repository. The workflow consists of three core phases performed on the NYC-Taxi dataset:
Extract-Transform-Load (ETL), machine learning training, and inference. The notebook focuses on
showing how to use cuDF with Dask and XGBoost to scale GPU DataFrame ETL-style operations and
model training out to multiple GPUs on multiple nodes; see Figure 6 below. In this notebook we will see
how RAPIDS, Dask, and XGBoost work together.
Figure 6. NYC-Taxi Notebook Workflow
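The three phases of the workflow can be sketched as follows. This is a minimal outline, not the notebook's exact code: it assumes a machine with NVIDIA GPUs and the RAPIDS libraries (dask-cuda, dask_cudf, XGBoost with GPU support) installed, and the CSV path, column names, and filter are placeholders.

```python
# Sketch of the ETL -> training -> inference phases with Dask, cuDF, and XGBoost.
# Placeholder path and column names; requires NVIDIA GPUs and RAPIDS installed.
from dask_cuda import LocalCUDACluster
from dask.distributed import Client
import dask_cudf
import xgboost as xgb

cluster = LocalCUDACluster()            # one Dask worker per local GPU
client = Client(cluster)

# --- ETL: load and transform the taxi data as a distributed GPU DataFrame ---
df = dask_cudf.read_csv("nyc_taxi_*.csv")     # placeholder path
df = df[df["fare_amount"] > 0]                # example cleanup filter
X = df.drop(columns=["fare_amount"])
y = df["fare_amount"]

# --- Training: distributed XGBoost over the GPU-resident partitions ---
dtrain = xgb.dask.DaskDMatrix(client, X, y)
output = xgb.dask.train(
    client,
    {"tree_method": "gpu_hist", "objective": "reg:squarederror"},
    dtrain,
    num_boost_round=100,
)

# --- Inference: predict with the trained booster across the cluster ---
preds = xgb.dask.predict(client, output["booster"], X)
```

Because the DataFrame partitions stay in GPU memory end to end, the ETL output feeds XGBoost training without a round trip through host memory, which is the main point of combining cuDF with Dask and XGBoost.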
1.6 RAPIDS Memory Manager (RMM)
According to NVIDIA's definition, RAPIDS Memory Manager (RMM) is a central place for all device memory
allocations in cuDF (C++ and Python) and other RAPIDS libraries. In addition, it is a replacement allocator
for CUDA device memory (and CUDA managed memory) and a pool allocator that makes CUDA device
memory allocation and deallocation faster and asynchronous.
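The pool-allocator idea can be illustrated with a toy sketch in plain Python (the class and its fields are hypothetical, purely for illustration; RMM's real implementation manages CUDA device memory in C++). The pool reserves one large region up front, so each subsequent "allocation" is a cheap bookkeeping step rather than a call into the underlying allocator:

```python
# Toy illustration of the pool-allocator idea behind RMM (hypothetical
# names, not RMM's API). One large region is reserved up front; blocks
# are handed out by bumping an offset and recycled via a free list.
class ToyMemoryPool:
    def __init__(self, capacity):
        self.capacity = capacity   # total bytes reserved up front
        self.offset = 0            # next unused byte (bump pointer)
        self.free_list = []        # recycled (offset, size) blocks

    def allocate(self, size):
        # Reuse a recycled block if one is large enough.
        for i, (off, sz) in enumerate(self.free_list):
            if sz >= size:
                del self.free_list[i]
                return off
        if self.offset + size > self.capacity:
            raise MemoryError("pool exhausted")
        off = self.offset
        self.offset += size
        return off

    def deallocate(self, offset, size):
        # Return the block to the free list; no call back into the
        # underlying allocator, so deallocation is cheap.
        self.free_list.append((offset, size))

pool = ToyMemoryPool(capacity=1024)
a = pool.allocate(256)     # carved from the pool at offset 0
b = pool.allocate(128)     # carved at offset 256
pool.deallocate(a, 256)
c = pool.allocate(200)     # reuses the recycled 256-byte block at offset 0
```

In RAPIDS itself, the pool is enabled from Python with, for example, `rmm.reinitialize(pool_allocator=True)`, after which cuDF allocations are served from the pool.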