int outputIndex = engine.getBindingIndex("CheXNet_sigmoid_tensor");
//3-Set up a buffer array pointing to the input and output buffers on the GPU, using the indexes:
void* buffers[2];
buffers[inputIndex] = inputBuffer;
buffers[outputIndex] = outputBuffer;
//4-TensorRT™ execution is typically asynchronous, so enqueue the kernels on a CUDA stream:
context.enqueue(batchSize, buffers, stream, nullptr);
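For context, the fragments above can be assembled into a minimal end-to-end sketch, shown below. The stream creation, device allocations, host pointers (hostInput, hostOutput), and the buffer sizes (a 3x256x256 input and 14 CheXNet class scores per image) are illustrative assumptions, not code from this paper.
#include <cuda_runtime.h>
#include "NvInfer.h"

// Hedged sketch of the execution path above: sizes and host pointers are assumed.
void runInference(nvinfer1::ICudaEngine& engine, nvinfer1::IExecutionContext& context,
                  const float* hostInput, float* hostOutput, int batchSize)
{
    // Look up the binding indexes by tensor name, as in the steps above.
    int inputIndex = engine.getBindingIndex("input_tensor");
    int outputIndex = engine.getBindingIndex("CheXNet_sigmoid_tensor");

    // Assumed sizes: 3x256x256 input image, 14 class scores per image.
    size_t inputSize = (size_t)batchSize * 3 * 256 * 256 * sizeof(float);
    size_t outputSize = (size_t)batchSize * 14 * sizeof(float);

    // Allocate the GPU buffers that the bindings point to.
    void* buffers[2];
    cudaMalloc(&buffers[inputIndex], inputSize);
    cudaMalloc(&buffers[outputIndex], outputSize);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Copy the input to the GPU, enqueue inference, and copy the result back,
    // all asynchronously on the same CUDA stream.
    cudaMemcpyAsync(buffers[inputIndex], hostInput, inputSize,
                    cudaMemcpyHostToDevice, stream);
    context.enqueue(batchSize, buffers, stream, nullptr);
    cudaMemcpyAsync(hostOutput, buffers[outputIndex], outputSize,
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    cudaFree(buffers[inputIndex]);
    cudaFree(buffers[outputIndex]);
    cudaStreamDestroy(stream);
}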
Command line used to run trtexec:
./trtexec \
--uff=/home/chest-x-ray/output_convert_to_uff/chexnet_frozen_graph_1541777429.uff \
--output=chexnet_sigmoid_tensor \
--uffInput=input_tensor,3,256,256 \
--iterations=40 \
--int8 \
--batch=1 \
--device=0 \
--avgRuns=100
Docker image used for native TRT: nvcr.io/nvidia/tensorrt:18.11-py3
Where:
--uff: location of the UFF model file
--output: name of the output tensor
--uffInput: name and dimensions (in CHW format) of the input tensor for the UFF parser
--iterations: run N iterations
--int8: run in INT8 precision mode
--batch: set the batch size
--device: run on CUDA device N
--avgRuns: performance is measured as the average over N runs
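As a hedged illustration of how the Docker image noted above can be launched, the command below mounts the model directory into the container; the --runtime=nvidia flag (the nvidia-docker2 convention for 18.xx NGC images) and the mount path are assumptions, not taken from this paper:
docker run --runtime=nvidia -it --rm \
  -v /home/chest-x-ray:/home/chest-x-ray \
  nvcr.io/nvidia/tensorrt:18.11-py3
In these images the trtexec binary typically lives under /workspace/tensorrt/bin, which matches the ./trtexec invocation above.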