Algorithm Acceleration Study

Step 1 : Introduction to GPGPU/CUDA


Overview of GPGPU/CUDA

Thread and Execution Model


Step 2 : CUDA Memory Model


CUDA Memory Hierarchy

Memory Model & Performance

Using Shared Memory


Step 3 : Maximizing Memory Throughput


Global Memory

Shared Memory


Step 4 : Synchronization & Concurrent Execution


Synchronization

CUDA Streams & Concurrent Execution

CUDA Events


Step 5 : Algorithm Implementation and Acceleration


Algorithm Implementation with MATLAB

Algorithm Implementation with C/C++

Algorithm Acceleration with GPGPU/CUDA