Optimizing Raytracing Algorithm Using CUDA

Sayed Ahmadreza Razian, Hossein MahvashMohammadi


Many programs already generate images with the raytracing algorithm, running on a CPU or GPU in single-threaded or multi-threaded mode. This paper presents an optimized raytracing implementation that generates images on a CPU or GPU using multiple threads.

The algorithm traces light to a depth of 8 to generate images. It is optimized by changing the pixel traversal priority and the mapping of light rays to threads, by dedicating the depth function to idle threads, and by using optimized functions from the MSDN library. The code is written in C++ and CUDA. To evaluate its performance, we compare builds under different compiler modes, vary the number of threads, examine different resolutions, and investigate data bandwidth.
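
As a rough illustration of two ideas named above (a trace bounded at depth 8 and a fixed mapping from pixels to threads), the following C++ sketch may help. It is not the authors' code: the two-sphere scene, the 0.8 reflectance, the sky shading, the strided pixel assignment, and all names (`kScene`, `trace`, `render`) are assumptions made for illustration, and `std::thread` workers stand in for CUDA threads.

```cpp
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

struct Vec3 { float x, y, z; };
Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 normalize(Vec3 v) { return v * (1.0f / std::sqrt(dot(v, v))); }

struct Sphere { Vec3 center; float radius; };

constexpr int kMaxDepth = 8;  // depth bound stated in the abstract

// Hypothetical scene (an assumption): two mirror spheres side by side.
const Sphere kScene[] = {{{-1.1f, 0.0f, -4.0f}, 1.0f},
                         {{1.1f, 0.0f, -4.0f}, 1.0f}};

// Nearest hit distance along a unit-length ray, or -1 on a miss.
float intersect(const Sphere& s, Vec3 o, Vec3 d) {
    Vec3 oc = o - s.center;
    float b = dot(oc, d);
    float c = dot(oc, oc) - s.radius * s.radius;
    float disc = b * b - c;
    if (disc < 0.0f) return -1.0f;
    float t = -b - std::sqrt(disc);
    return t > 1e-3f ? t : -1.0f;  // small epsilon avoids self-intersection
}

// Iterative trace: a loop bounded by kMaxDepth replaces recursion, which
// keeps per-thread stack use fixed. Optionally reports surface hits.
float trace(Vec3 o, Vec3 d, int* bounces = nullptr) {
    float weight = 1.0f, radiance = 0.0f;
    int depth = 0;
    for (; depth < kMaxDepth; ++depth) {
        float tNear = -1.0f;
        const Sphere* hit = nullptr;
        for (const Sphere& s : kScene) {
            float t = intersect(s, o, d);
            if (t > 0.0f && (hit == nullptr || t < tNear)) { tNear = t; hit = &s; }
        }
        if (hit == nullptr) {  // ray escapes: shade with a simple sky gradient
            radiance += weight * (0.5f * (d.y + 1.0f));
            break;
        }
        o = o + d * tNear;                  // move to the hit point
        Vec3 n = normalize(o - hit->center);
        d = d - n * (2.0f * dot(d, n));     // mirror reflection
        weight *= 0.8f;                     // assumed reflectance
    }
    if (bounces != nullptr) *bounces = depth;
    return radiance;
}

// Each worker handles pixels tid, tid + numThreads, ... so every pixel's
// primary ray belongs to exactly one thread and no locking is needed.
std::vector<float> render(int width, int height, unsigned numThreads) {
    std::vector<float> image(static_cast<std::size_t>(width) * height, 0.0f);
    auto worker = [&](unsigned tid) {
        for (std::size_t idx = tid; idx < image.size(); idx += numThreads) {
            int px = static_cast<int>(idx % width);
            int py = static_cast<int>(idx / width);
            float u = (px + 0.5f) / width * 2.0f - 1.0f;
            float v = 1.0f - (py + 0.5f) / height * 2.0f;
            image[idx] = trace({0.0f, 0.0f, 0.0f}, normalize({u, v, -1.0f}));
        }
    };
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < numThreads; ++t) pool.emplace_back(worker, t);
    for (auto& th : pool) th.join();
    return image;
}
```

Calling `render(1280, 720, 4)` would produce one brightness value per pixel of a 720p frame. In a CUDA version of this sketch, the flat index `idx` would typically be derived from the block and thread indices rather than from a strided loop.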

The results show that the program generates at least 11 frames per second at HD (720p) resolution on the GPU of a GT 840M graphics card, using the trace method. With a more capable graphics card, this algorithm and program could be used to generate real-time animation.


CUDA; Raytracing; GPU; Modeling; Parallel Processing.




DOI: 10.28991/ijse-01119



Copyright (c) 2017 Sayed Ahmadreza Razian, Hossein MahvashMohammadi