Using CUDA, we need to write a routine that transforms a high-dimensional data set into a low-dimensional one. The fundamental operations are: computing the matrix p from the input data, initializing a random matrix Y, computing the matrix q from Y, and then updating Y using the given equation until the values in Y converge.
Please see the attached PDF for detailed instructions. The code has to run on input matrices of size 100,000 × 100 and must exploit CUDA parallelism using the cuBLAS library.
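To make the "update Y until it converges" step concrete, here is a minimal sketch of how the outer loop and its stopping test could be arranged with cuBLAS. Everything in it is an assumption for illustration: the helper names (update_norm, placeholder_update, iterate_until_converged), the stopping rule (Euclidean norm of the change in Y falling below a tolerance), the tolerance, and the tiny demo size are not taken from the PDF. In particular, placeholder_update is NOT the real update equation; it only stands in for it so the sketch can run end to end.

```cuda
#include <cstdio>
#include <utility>
#include <vector>
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Hypothetical helper: returns ||Y_new - Y_old||_2 for two device buffers of
// `len` floats. d_scratch (also `len` floats) is overwritten.
// CUDA/cuBLAS status checks are omitted to keep the sketch short.
float update_norm(cublasHandle_t h, const float* d_new, const float* d_old,
                  float* d_scratch, int len) {
    const float minus_one = -1.0f;
    float norm = 0.0f;
    cublasScopy(h, len, d_new, 1, d_scratch, 1);              // scratch = Y_new
    cublasSaxpy(h, len, &minus_one, d_old, 1, d_scratch, 1);  // scratch -= Y_old
    cublasSnrm2(h, len, d_scratch, 1, &norm);                 // result lands on the host
    return norm;
}

// Placeholder for the update equation in the PDF (NOT the real rule):
// Y_new = 0.5 * Y, chosen only because it visibly converges.
void placeholder_update(cublasHandle_t h, const float* d_Y, float* d_Y_new, int len) {
    const float half = 0.5f;
    cublasScopy(h, len, d_Y, 1, d_Y_new, 1);
    cublasSscal(h, len, &half, d_Y_new, 1);
}

// Repeats the update until the change in Y drops below `tol` (an assumed
// criterion) or `max_iter` is reached. On return, d_Y points at the latest Y.
int iterate_until_converged(cublasHandle_t h, float*& d_Y, float*& d_Y_new,
                            float* d_scratch, int len, float tol, int max_iter) {
    for (int it = 1; it <= max_iter; ++it) {
        placeholder_update(h, d_Y, d_Y_new, len);
        float delta = update_norm(h, d_Y_new, d_Y, d_scratch, len);
        std::swap(d_Y, d_Y_new);  // the newest Y becomes the current Y
        if (delta < tol) return it;
    }
    return max_iter;
}

int main() {
    const int rows = 4, cols = 2, len = rows * cols;  // tiny demo size, not 100,000 x 100
    std::vector<float> h_Y(len, 1.0f);                // stand-in for the random initial Y

    float *d_Y, *d_Y_new, *d_scratch;
    cudaMalloc(&d_Y, len * sizeof(float));
    cudaMalloc(&d_Y_new, len * sizeof(float));
    cudaMalloc(&d_scratch, len * sizeof(float));
    cudaMemcpy(d_Y, h_Y.data(), len * sizeof(float), cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);
    int iters = iterate_until_converged(handle, d_Y, d_Y_new, d_scratch, len, 1e-3f, 1000);
    cudaMemcpy(h_Y.data(), d_Y, len * sizeof(float), cudaMemcpyDeviceToHost);
    std::printf("stopped after %d iterations, Y[0] = %g\n", iters, h_Y[0]);

    cublasDestroy(handle);
    cudaFree(d_Y);
    cudaFree(d_Y_new);
    cudaFree(d_scratch);
    return 0;
}
```

If saved under a hypothetical name such as sketch.cu, this should build with nvcc sketch.cu -lcublas. In a real implementation the placeholder would be replaced by the p, q, and Y-update computations from the PDF, and the stopping rule by whatever the PDF specifies.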
A reference sequential MATLAB implementation can be developed if absolutely needed.
I expect this to take 2 hours of an experienced CUDA programmer's time, so please price your bids accordingly.