A few updates on the native and GPU deconvolution experiments…
I’ve added a JavaCPP wrapper to the YacuDecu CUDA Richardson-Lucy implementation by Bob Pepin.
Ops Experiments now has
cuda sub-directories. Each has native source code and an associated CMakeLists.txt file.
To make it work, you build the native code outside the
ops-experiments directory structure. If you inspect the javacpp plugin section of the pom, you can see where JavaCPP expects the native headers and libraries.
The wrapper files (e.g. CudaRicharsonLucyWrapper) specify where JavaCPP looks for third-party libraries (such as CUDA and MKL). Only 64-bit Windows is supported right now.
There is a test that runs Ops, MKL, and Cuda deconvolution, and benchmarks them.
The test has debugging statements, as I am still tracking down memory corruption issues.
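As a rough illustration of how such a benchmark can be structured, here is a minimal sketch in plain Java. It is not the actual ops-experiments test: the class name `DeconvolutionBenchmarkSketch`, the `benchmark` and `timeSeconds` helpers, and the `busyWork` placeholder are all hypothetical, and the placeholders stand in for the real Ops, MKL, and CUDA deconvolution calls.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical timing harness; not the actual ops-experiments test. */
final class DeconvolutionBenchmarkSketch {

    /** Runs the task once and returns the elapsed wall-clock time in seconds. */
    static double timeSeconds(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1e9;
    }

    /** Times each named implementation once, in insertion order. */
    static Map<String, Double> benchmark(Map<String, Runnable> implementations) {
        Map<String, Double> seconds = new LinkedHashMap<>();
        for (Map.Entry<String, Runnable> entry : implementations.entrySet()) {
            seconds.put(entry.getKey(), timeSeconds(entry.getValue()));
        }
        return seconds;
    }

    /** Placeholder workload standing in for a real deconvolution call. */
    static double busyWork(int n) {
        double sum = 0.0;
        for (int i = 1; i <= n; i++) {
            sum += Math.sqrt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Runnable> implementations = new LinkedHashMap<>();
        // Assumption: each entry would wrap the corresponding deconvolution call.
        implementations.put("Ops", () -> busyWork(30_000_000));
        implementations.put("MKL", () -> busyWork(3_000_000));
        implementations.put("CUDA", () -> busyWork(300_000));

        benchmark(implementations).forEach(
            (name, s) -> System.out.printf("%s - %.3f seconds%n", name, s));
    }
}
```

A single timed run like this ignores JIT warm-up and, for the GPU case, device initialisation and host-to-device transfers, so it is only suitable for coarse comparisons.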
Preliminary times on my machine for 100 iterations of Richardson Lucy…
Ops - 260 seconds
MKL - ~16 seconds
CUDA - ~4 seconds
Note: I am profiling the Ops version to see why it is so slow. I would expect it to be slower than the MKL version, but not by a factor of about 16.
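For readers unfamiliar with the algorithm being timed, the following is a minimal 1D sketch of the standard Richardson-Lucy update in plain Java. It is not the Ops, MKL, or YacuDecu code: the class `RichardsonLucySketch` and its methods are hypothetical, it assumes an odd-length PSF that sums to 1, and it uses direct circular convolution rather than the FFTs a real implementation would typically use.

```java
import java.util.Arrays;

/** Hypothetical 1D Richardson-Lucy sketch; not taken from ops-experiments. */
final class RichardsonLucySketch {

    /** Circular convolution; the kernel is centred at index kernel.length / 2. */
    static double[] convolve(double[] signal, double[] kernel) {
        int n = signal.length;
        int half = kernel.length / 2;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int k = 0; k < kernel.length; k++) {
                int j = Math.floorMod(i - (k - half), n);
                sum += signal[j] * kernel[k];
            }
            out[i] = sum;
        }
        return out;
    }

    /** Reverses the kernel (assumes odd length so the centre stays in place). */
    static double[] flip(double[] kernel) {
        double[] flipped = new double[kernel.length];
        for (int k = 0; k < kernel.length; k++) {
            flipped[k] = kernel[kernel.length - 1 - k];
        }
        return flipped;
    }

    /** Runs the multiplicative Richardson-Lucy update for a fixed iteration count. */
    static double[] deconvolve(double[] observed, double[] psf, int iterations) {
        int n = observed.length;
        double[] flippedPsf = flip(psf);

        // Start from a flat estimate equal to the mean of the observed signal.
        double[] estimate = new double[n];
        Arrays.fill(estimate, Arrays.stream(observed).sum() / n);

        for (int it = 0; it < iterations; it++) {
            double[] blurred = convolve(estimate, psf);
            double[] ratio = new double[n];
            for (int i = 0; i < n; i++) {
                // Guard against division by zero where the blurred estimate vanishes.
                ratio[i] = observed[i] / Math.max(blurred[i], 1e-12);
            }
            double[] correction = convolve(ratio, flippedPsf);
            for (int i = 0; i < n; i++) {
                estimate[i] *= correction[i];
            }
        }
        return estimate;
    }

    public static void main(String[] args) {
        double[] truth = new double[16];
        truth[8] = 1.0;                        // a single point source
        double[] psf = {0.25, 0.5, 0.25};      // normalised blur kernel
        double[] observed = convolve(truth, psf);
        double[] estimate = deconvolve(observed, psf, 50);
        System.out.println(Arrays.toString(estimate));
    }
}
```

Each iteration costs two convolutions plus a pointwise divide and multiply, so the speed of the convolution step largely determines the total run time.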