int outputIndex = engine.getBindingIndex("CheXNet_sigmoid_tensor");
//3-Set up a buffer array pointing to the input and output buffers on the GPU, using the indexes:
void* buffers[2];
buffers[inputIndex] = inputBuffer;
buffers[outputIndex] = outputBuffer;
//4-TensorRT™ execution is typically asynchronous, so enqueue the kernels on a CUDA stream:
context.enqueue(batchSize, buffers, stream, nullptr);
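For context, the fragments above can be assembled into a minimal end-to-end sketch, shown below. The stream creation, device allocations, host pointers (hostInput, hostOutput), and the buffer sizes (a 3x256x256 input and 14 CheXNet class scores per image) are illustrative assumptions, not code from this paper.
#include <cuda_runtime.h>
#include "NvInfer.h"

// Hedged sketch of the execution path above: sizes and host pointers are assumed.
void runInference(nvinfer1::ICudaEngine& engine, nvinfer1::IExecutionContext& context,
                  const float* hostInput, float* hostOutput, int batchSize)
{
    // Look up the binding indexes by tensor name, as in the steps above.
    int inputIndex = engine.getBindingIndex("input_tensor");
    int outputIndex = engine.getBindingIndex("CheXNet_sigmoid_tensor");

    // Assumed sizes: 3x256x256 input image, 14 class scores per image.
    size_t inputSize = (size_t)batchSize * 3 * 256 * 256 * sizeof(float);
    size_t outputSize = (size_t)batchSize * 14 * sizeof(float);

    // Allocate the GPU buffers that the bindings point to.
    void* buffers[2];
    cudaMalloc(&buffers[inputIndex], inputSize);
    cudaMalloc(&buffers[outputIndex], outputSize);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Copy the input to the GPU, enqueue inference, and copy the result back,
    // all asynchronously on the same CUDA stream.
    cudaMemcpyAsync(buffers[inputIndex], hostInput, inputSize,
                    cudaMemcpyHostToDevice, stream);
    context.enqueue(batchSize, buffers, stream, nullptr);
    cudaMemcpyAsync(hostOutput, buffers[outputIndex], outputSize,
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    cudaFree(buffers[inputIndex]);
    cudaFree(buffers[outputIndex]);
    cudaStreamDestroy(stream);
}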
Command line used to run trtexec:
./trtexec \
--uff=/home/chest-x-ray/output_convert_to_uff/chexnet_frozen_graph_1541777429.uff \
--output=chexnet_sigmoid_tensor \
--uffInput=input_tensor,3,256,256 \
--iterations=40 \
--int8 \
--batch=1 \
--device=0 \
--avgRuns=100
Docker image used for native TRT: nvcr.io/nvidia/tensorrt:18.11-py3
Where:
--uff: location of the UFF model file
--output: name of the output tensor
--uffInput: name and dimensions (in CHW format) of the input tensor for the UFF parser
--iterations: run N iterations
--int8: run in INT8 precision mode
--batch: set the batch size
--device: run on CUDA device N
--avgRuns: performance is measured as the average over N runs
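As a hedged illustration of how the Docker image noted above can be launched, the command below mounts the model directory into the container; the --runtime=nvidia flag (the nvidia-docker2 convention for 18.xx NGC images) and the mount path are assumptions, not taken from this paper:
docker run --runtime=nvidia -it --rm \
  -v /home/chest-x-ray:/home/chest-x-ray \
  nvcr.io/nvidia/tensorrt:18.11-py3
In these images the trtexec binary typically lives under /workspace/tensorrt/bin, which matches the ./trtexec invocation above.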