Generating the Julia Set using C++ and OpenCL

For the sake of keeping things short as possible I will skip redundant explanations what Julia Set and fractals are in general. There is plenty documentation on internet about it and let’s face it you are probably here just for the codez.  Just to make sure we are on the same page this is the picture of Julia set and what we will generate with our implementation. Colors are completely arbitrary. 

JuliaSet

Task here is to generate 2D image of Julia set where each individual pixel in image is represented with following equation:

zk+1 = zk 2 + c

Z and c represent 2D points with (x,y) values.  Each pixel is given an initial value for z and we will refer to as z0. This initial value is determined by the pixel’s position in the range z spans.

The value of c is constant for all the pixels. The above equation is applied to each pixel and is evaluated iteratively (in a loop). In each iteration, z is updated according to the above equation (the list of values zk refers to the orbit, or dwell of z). This creates a sequence of z values for each pixel. The first 5 values in a sequence are…

z0

z1 = z0 2 + c

z2 = z1 2 + c

z3 = z2 2 + c

z4 = z3 2 + c

Once we’ve calculated a new value for z in a given iteration, we calculate the length of z. If the length is greater than some threshold (usually 2.0), we stop iterating and use the index of the current iteration (k) as an index to look-up an array of colours to determine the colour for the pixel being processed. If we never exceed the threshold and k reaches a given maximum number of iterations then we colour the pixel black.

The image plane (x, y) is actually a complex plane and z and c are complex numbers. Complex numbers are like 2D points (with x and y values as noted above), but the rules for multiplying complex numbers is different. Multiplying “real” numbers (as you’ve done in maths before) simply scales the number along the “number line”, that is, the real axis. However, multiplying complex numbers (2D points on the plane) actually rotates the points in the plane as well as scales their position.

So this is basic intro into what code actually does. I will assume reader is familiar with setting up OpenCL, both for ATI/AMD and Nvidia graphic cards. Very briefly, for AMD cards you should download OpenCL SDK from here and for Nvidia ones CUDA toolkit form here. If we are talking about Windows and Visual Studio as C++ IDE then for Nvidia cards you should add following to “Additional include directories”:

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\include

and

C:\Program Files (x86)\AMD APP SDK\2.9\include

for AMD based cards. This is where your OpenCL header files will live after SDK inhalation. Same goes for “Library directories”

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\lib\Win32

and

C:\Program Files (x86)\AMD APP SDK\2.9\lib\x86

Of course paths can differ if your version and installation path is different then mine.

Last “setup” thing to do is to link against “OpenCL.lib”, actual OpenCL library. This is the point where you are good to go and write, compile and execute your OpenCL kernels.

In my particular example I am using “Windows Imaging Component” to manage bitmaps (load, save to file , etc.).  Basically entire application, if I may call it that, is split into three files, main.cpp where application logic is, julia_kernels.cl – OpenCL implementation for generating Julia set images and imageio.cpp which is basically wrapper around aforementioned  “WIC”.

Application allows to execute code both on CPU and GPU to be able to measure speed gains when using OpenCL over standard linear CPU implementation.

main.cpp

#include 
#include 
#include 
#include 
#include "imageio.h"
#include "setup_cl.h"
#include 

#include 
#include 

using namespace std;


//
// support structs (reflect vector types in OpenCL)
//

struct float2 {

	float x, y;

	float2(const float _x, const float _y) : x(_x), y(_y) {}
};

struct float3 {

	float x, y, z;

	float3(const float _x, const float _y, const float _z) : x(_x), y(_y), z(_z) {}
};

struct float4 {

	float x, y, z, w;

	float4(const float _x, const float _y, const float _z, const float _w) : x(_x), y(_y), z(_z), w(_w) {}
};


struct pixel_color {
	float R;
	float G;
	float B;
	float A;
};

pixel_color colors[16];

void InitPixelColors()
{
	pixel_color black = { 0.0f, 0.0f, 0.0f, 255.0f };
	pixel_color blue = { 9.0f, 26.0f, 236.0f, 255.0f };
	pixel_color brown = { 163.0f, 107.0f, 7.0f, 255.0f };
	pixel_color green = { 8.0f, 240.0f, 6.0f, 255.0f };
	pixel_color magenta = { 250.0f, 32.0f, 130.0f, 255.0f };
	pixel_color orange = { 250.0f, 107.0f, 6.0f, 255.0f };
	pixel_color red = { 250.0f, 0.0f, 0.0f, 255.0f };
	pixel_color darkGrey = { 50.0f, 50.0f, 50.0f, 255.0f };
	pixel_color lightBlue = { 50.0f, 170.0f, 200.0f, 255.0f };
	pixel_color lightGreen = { 50.0f, 240.0f, 175.0f, 255.0f };
	pixel_color lightYellow = { 245.0f, 250.0f, 140.0f, 255.0f };
	pixel_color violetRed = { 245.0f, 150.0f, 180.0f, 255.0f };
	pixel_color ivory = { 255.0f, 250.0f, 240.0f, 255.0f };
	pixel_color yellow = { 235.0f, 255.0f, 15.0f, 255.0f };
	pixel_color cyan = { 0.0f, 240.0f, 240.0f, 255.0f };
	pixel_color lime = { 191.0f, 255.0f, 0.0f, 255.0f };

	colors[0] = black;
	colors[1] = blue;
	colors[2] = brown;
	colors[3] = green;
	colors[4] = magenta;
	colors[5] = orange;
	colors[6] = red;
	colors[7] = darkGrey;
	colors[8] = lightBlue;
	colors[9] = lightGreen;
	colors[10] = lightYellow;
	colors[11] = violetRed;
	colors[12] = ivory;
	colors[13] = yellow;
	colors[14] = cyan;
	colors[15] = lime;
}

pixel_color getPixelColor(int val){

	return colors[val];
}

void generate_julia_set_image(
	cl_context context, 
	cl_device_id device, 
	cl_command_queue commandQueue, 
	cl_program program, 
	float c_component_real_part, 
	float c_component_imaginary_part,
	int imageWidth,
	int imageHeight,
	int numIterations
	)
{
	cl_int err;
	cl_kernel kernel = 0;
	cl_mem outputImage = 0;
	BGRA8 *result;
	pixel_color pixelColor;
	pixelColor.R = 1.0;
	pixelColor.G = 1.0;
	pixelColor.B = 0.0;
	pixelColor.A = 1.0;
	cl_mem pixelColorBuffer;

	float2 C_component(c_component_real_part, c_component_imaginary_part);
	cl_mem c_componentBuffer;

	try
	{
		kernel = clCreateKernel(program, "generate_image", 0);
		if (!kernel)
			throw(string("could not create kernel"));

		// setup output image
		cl_image_format outputFormat;

		outputFormat.image_channel_order = CL_BGRA; // store a BGRA image
		outputFormat.image_channel_data_type = CL_UNORM_INT8;// Each component is 8 bits in the range [0, 1]

		outputImage = clCreateImage2D(context, CL_MEM_WRITE_ONLY, &outputFormat, imageWidth, imageHeight, 0, 0, &err);

		pixelColorBuffer = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(pixel_color), &pixelColor, 0);
		c_componentBuffer = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(float2), &C_component, 0);

		if (!pixelColorBuffer){
			throw(string("pixel buffer failed"));
		}

		clSetKernelArg(kernel, 0, sizeof(cl_mem), &pixelColorBuffer);
		clSetKernelArg(kernel, 1, sizeof(cl_mem), &outputImage);
		clSetKernelArg(kernel, 2, sizeof(cl_mem), &c_componentBuffer);
		clSetKernelArg(kernel, 3, sizeof(int), &numIterations);


		// setup worksize arrays
		size_t globalWorkSize[2] = { imageWidth, imageHeight };

		// setup event (for profiling)
		cl_event doneEvent;

		// enqueue kernel
		err = clEnqueueNDRangeKernel(commandQueue, kernel, 2, 0, globalWorkSize, 0, 0, 0, &doneEvent);

		// block until kernel finishes and report time taken to run the kernel
		clWaitForEvents(1, &doneEvent);
		cl_ulong startTime = (cl_ulong)0;
		cl_ulong endTime = (cl_ulong)0;

		clGetEventProfilingInfo(doneEvent, CL_PROFILING_COMMAND_START, sizeof(cl_ulong), &startTime, 0);
		clGetEventProfilingInfo(doneEvent, CL_PROFILING_COMMAND_END, sizeof(cl_ulong), &endTime, 0);

		double tdelta = (double)(endTime - startTime);

		cout << "time taken (in seconds) to create the image = " << (tdelta * 1.0e-9) << endl;

		// extract the resulting image from OpenCL
		result = (BGRA8*)malloc(imageWidth * imageHeight * sizeof(BGRA8));
		if (!result)
			throw("cannot create output image buffer to save to disk");

		size_t origin[3] = { 0, 0, 0 };
		size_t region[3] = { imageWidth, imageHeight, 1 };

		err = clEnqueueReadImage(commandQueue, outputImage, CL_TRUE, origin, region, 0, 0, result, 0, 0, 0);

		wstring filename(L"gpu_" + to_wstring(imageWidth) + L"x" + to_wstring(imageHeight) + L"_" + to_wstring(255) + L".bmp");
		saveImageBGRA8(imageWidth, imageHeight, result, filename);
	}
	catch (std::string err)
	{
		cout << "Error has occured: " << err << std::endl;
	}
	
	/* Deallocate resources */
	clReleaseMemObject(pixelColorBuffer);
	clReleaseMemObject(outputImage);
	clReleaseMemObject(c_componentBuffer);
	clReleaseKernel(kernel);
	
}



double GetComplexLenght(complex input)
{
	return std::sqrt(std::pow((std::real(input)), 2) + std::pow((std::imag(input)), 2));
}

void RunOnCPU(int imageWidth, int imageHeight, int numIterations, double c_component_real_part, double c_component_imaginary_part)
{
	std::chrono::time_point start, end;
	start = std::chrono::system_clock::now();

	cout << "\n\nRunnig  on CPU\n" << endl;

	BGRA8* imageBuffer = (BGRA8*)malloc(imageWidth * imageHeight * sizeof(BGRA8));

	complex Z(0.0, 0.0);
	complex C(c_component_real_part, c_component_imaginary_part);


	for (int i = 0; i < imageHeight; i++)
	{
		for (int j = 0; j < imageWidth; j++)
		{
			Z = complex((-imageWidth / 2 + j) / (imageWidth / 4.0), (-imageHeight / 2 + i) / (imageHeight / 4.0));
			int iteration = 0;

			for (int k = 0; k < numIterations; k++)
			{
				Z = std::pow(Z, 2) + C;
				iteration++;
				if (GetComplexLenght(Z) > 2)
				{
					break;
				}
			}

			int pixelIndex = j + (i * imageWidth);
			if (iteration == numIterations)
			{
				imageBuffer[pixelIndex].r = 0;
				imageBuffer[pixelIndex].g = 0;
				imageBuffer[pixelIndex].b = 0;
				imageBuffer[pixelIndex].a = 255;
			}
			else
			{
				pixel_color pc = getPixelColor(iteration % 16);
				imageBuffer[pixelIndex].r = (BYTE)pc.R;
				imageBuffer[pixelIndex].g = (BYTE)pc.G;
				imageBuffer[pixelIndex].b = (BYTE)pc.B;
				imageBuffer[pixelIndex].a = (BYTE)pc.A;
			}
		}
	}

	end = std::chrono::system_clock::now();

	std::chrono::duration elapsed_seconds = end - start;
	std::time_t end_time = std::chrono::system_clock::to_time_t(end);

	std::cout << "\nFinished computation on CPU at " << std::ctime(&end_time)
		<< "elapsed time: " << elapsed_seconds.count() << "s\n\n\n";

	wstring filename(L"cpu_" + to_wstring(imageWidth) + L"x" + to_wstring(imageHeight) + L"_" + to_wstring(numIterations) + L".bmp");
	saveImageBGRA8(imageWidth, imageHeight, imageBuffer, filename);
	free(imageBuffer);
}

int RunOnGPU(int imageWidth, int imageHeight, int numIterations, double c_component_real_part, double c_component_imaginary_part)
{
	cout << "\n\nRunnig  on GPU\n" << endl;
	// 1. Create and validate OpenCL context
	cl_context context = createContext();

	if (!context)
	{
		cout << "cl context not created\n";
		shutdownCOM();
		return 1;
	}

	// 2. Get a list of devices associated with the context
	size_t deviceBufferSize;
	cl_int errNum = clGetContextInfo(context, CL_CONTEXT_DEVICES, 0, 0, &deviceBufferSize);
	cl_device_id* contextDevices = (cl_device_id*)malloc(deviceBufferSize);
	errNum = clGetContextInfo(context, CL_CONTEXT_DEVICES, deviceBufferSize, contextDevices, 0);

	// use the first device in the list - should be GPU since we requested this during context setup
	cl_device_id device = contextDevices[0];

	// 3. Create and vailedate program object based on julia_kernels.cl
	cl_program program = createProgram(context, device, "julia_kernels.cl");

	if (!program)
	{
		cout << "Could not create program object";
		shutdownCOM();
		return 1;
	}

	// 4. create and validate the command queue (for first device in context)
	// Note: add profiling flag so we can get timing data from events
	cl_command_queue commandQueue = clCreateCommandQueue(context, device, CL_QUEUE_PROFILING_ENABLE, 0);
	if (!commandQueue) {

		cout << "command queue not created\n";
		shutdownCOM();
		return 1;
	}

	// call function to carry out task
	generate_julia_set_image(context, device, commandQueue, program, c_component_real_part, c_component_imaginary_part, imageWidth, imageHeight, numIterations);

	/* Deallocate resources */
	clReleaseCommandQueue(commandQueue);
	clReleaseProgram(program);
	clReleaseContext(context);
}

void Test_800x600_255(double c_component_real_part, double c_component_imaginary_part)
{
	int imageWidth = 800;
	int imageHeight = 600;
	int numIterations = 255; 
	RunOnCPU(imageWidth, imageHeight, numIterations, c_component_real_part, c_component_imaginary_part);
	RunOnGPU(imageWidth, imageHeight, numIterations, c_component_real_part, c_component_imaginary_part);
}

void Test_1024x768_255(double c_component_real_part, double c_component_imaginary_part)
{
	int imageWidth = 1024;
	int imageHeight = 768;
	int numIterations = 255;
	RunOnCPU(imageWidth, imageHeight, numIterations, c_component_real_part, c_component_imaginary_part);
	RunOnGPU(imageWidth, imageHeight, numIterations, c_component_real_part, c_component_imaginary_part);
}

void Test_1680x1050_255(double c_component_real_part, double c_component_imaginary_part)
{
	int imageWidth = 1680;
	int imageHeight = 1050;
	int numIterations = 255;
	RunOnCPU(imageWidth, imageHeight, numIterations, c_component_real_part, c_component_imaginary_part);
	RunOnGPU(imageWidth, imageHeight, numIterations, c_component_real_part, c_component_imaginary_part);
}

void Test_6400x3200_255(double c_component_real_part, double c_component_imaginary_part)
{
	int imageWidth = 6400;
	int imageHeight = 3200;
	int numIterations = 255;
	RunOnCPU(imageWidth, imageHeight, numIterations, c_component_real_part, c_component_imaginary_part);
	RunOnGPU(imageWidth, imageHeight, numIterations, c_component_real_part, c_component_imaginary_part);
}

void TestScalability()
{
	int imageWidth = 6400;
	int imageHeight = 3200;
	double c_component_real_part = -0.805;
	double c_component_imaginary_part = 0.156;
	int numIterations = 255;

	while (RunOnGPU(imageWidth, imageHeight, numIterations, c_component_real_part, c_component_imaginary_part) == 0)
	{
		imageWidth += 1000;
		imageHeight += 1000;
	}
}

void RunAutomaticTests()
{
	cout << "Runnig automatic tests!" << endl;
	double c_component_real_part = -0.805;
	double c_component_imaginary_part = 0.156;

	// Run speed comapre tests
	Test_800x600_255(c_component_real_part, c_component_imaginary_part);
	Test_1024x768_255(c_component_real_part, c_component_imaginary_part);
	Test_1680x1050_255(c_component_real_part, c_component_imaginary_part);
	Test_6400x3200_255(c_component_real_part, c_component_imaginary_part);

	// Test Scalability - How big images can we create
	TestScalability();

	cout << "Test run finished!" << endl;
}

int main(int argc, char **argv)
{
	// initialise COM so we can export image data using WIC
	initCOM();

	InitPixelColors();

	// clear screen
	system("cls");
	cout << "***************************************************************\n";
	cout << "*             Julia set generator                             *\n";
	cout << "***************************************************************\n\n\n";

	cout << "Do you want to run automatic tests both on GPU and CPU (y/n)? ";
	char answer;
	cin >> answer;

	if (answer == 'y' || answer == 'Y'){
		RunAutomaticTests();
	}
	else
	{
		cout << "Enter image width: ";
		int imageWidth = 800;
		cin >> imageWidth;
		cout << "\nEnter image height: ";
		int imageHeight = 600;
		cin >> imageHeight;

		float c_component_real_part;
		cout << "\nEnter real part of C component: ";
		cin >> c_component_real_part;

		float c_component_imaginary_part;
		cout << "\nEnter imaginary part of C component: ";
		cin >> c_component_imaginary_part;

		int numIterations = 255;
		cout << "\nEnter the number of iterations: ";
		cin >> numIterations;

		RunOnCPU(imageWidth, imageHeight, numIterations, c_component_real_part, c_component_imaginary_part);

		RunOnGPU(imageWidth, imageHeight, numIterations, c_component_real_part, c_component_imaginary_part);
	}
	//done
	shutdownCOM();

	return 0;
}

And this is actual OpenCL kernel implementation:

// Julia set structs and kernels
typedef struct pixel_color {
	float		R; 
	float		G; 
	float		B;
	float		A;
} pixel_color;

//2 component vector to hold the real and imaginary parts of a complex number:
typedef float2 cfloat;

#define I ((cfloat)(0.0, 1.0))
#define M_PI 3.14159265358979323846

/*
 * Return Real (Imaginary) component of complex number:
 */
inline float real(cfloat a){
     return a.x;
}
inline float imag(cfloat a){
     return a.y;
}

/*
 * Get the modulus of a complex number (its length):
 */
inline float cmod(cfloat a){
    return (sqrt(a.x*a.x + a.y*a.y));
}

// Add two complex numbers
inline cfloat cadd(cfloat a, cfloat b){
	return (cfloat)( a.x + b.x, a.y + b.y);
}

/*
 * Get the argument of a complex number (its angle):
 * http://en.wikipedia.org/wiki/Complex_number#Absolute_value_and_argument
 */
inline float carg(cfloat a){
    if(a.x > 0){
        return atan(a.y / a.x);

    }else if(a.x < 0 && a.y >= 0){
        return atan(a.y / a.x) + M_PI;

    }else if(a.x < 0 && a.y < 0){
        return atan(a.y / a.x) - M_PI;

    }else if(a.x == 0 && a.y > 0){
        return M_PI/2;

    }else if(a.x == 0 && a.y < 0){
        return -M_PI/2;

    }else{
        return 0;
    }
}


/*
 * Multiply two complex numbers:
 *
 *  a = (aReal + I*aImag)
 *  b = (bReal + I*bImag)
 *  a * b = (aReal + I*aImag) * (bReal + I*bImag)
 *        = aReal*bReal +I*aReal*bImag +I*aImag*bReal +I^2*aImag*bImag
 *        = (aReal*bReal - aImag*bImag) + I*(aReal*bImag + aImag*bReal)
 */
inline cfloat  cmult(cfloat a, cfloat b){
    return (cfloat)( a.x*b.x - a.y*b.y, a.x*b.y + a.y*b.x);
}

/*
 *  Square root of complex number.
 *  Although a complex number has two square roots, numerically we will
 *  only determine one of them -the principal square root, see wikipedia
 *  for more info: 
 *  http://en.wikipedia.org/wiki/Square_root#Principal_square_root_of_a_complex_number
 */
 inline cfloat csqrt(cfloat a){
     return (cfloat)( sqrt(cmod(a)) * cos(carg(a)/2),  sqrt(cmod(a)) * sin(carg(a)/2));
 }

 inline pixel_color getPixelColor(int val){
	pixel_color colors[16];
	pixel_color black = { 0.0f, 0.0f, 0.0f, 1.0f };
	pixel_color blue = { 9.0f, 26.0f, 236.0f, 1.0f };
	pixel_color brown = { 163.0f, 107.0f, 7.0f, 1.0f };
	pixel_color green = { 8.0f, 240.0f, 6.0f, 1.0f };
	pixel_color magenta = { 250.0f, 32.0f, 130.0f, 1.0f };
	pixel_color orange = { 250.0f, 107.0f, 6.0f, 1.0f };
	pixel_color red = { 250.0f, 0.0f, 0.0f, 1.0f };
	pixel_color darkGrey = { 50.0f, 50.0f, 50.0f, 1.0f };
	pixel_color lightBlue = { 50.0f, 170.0f, 200.0f, 1.0f };
	pixel_color lightGreen = { 50.0f, 240.0f, 175.0f, 1.0f };
	pixel_color lightYellow = { 245.0f, 250.0f, 140.0f, 1.0f };
	pixel_color violetRed = { 245.0f, 150.0f, 180.0f, 1.0f };
	pixel_color ivory = { 255.0f, 250.0f, 240.0f, 1.0f };
	pixel_color yellow = { 235.0f, 255.0f, 15.0f, 1.0f };
	pixel_color cyan = { 0.0f, 240.0f, 240.0f, 1.0f };
	pixel_color lime = { 191.0f, 255.0f, 0.0f, 1.0f };
	
	colors[0] = black;
	colors[1] = blue;
	colors[2] = brown;
	colors[3] = green;
	colors[4] = magenta;
	colors[5] = orange;
	colors[6] = red;
	colors[7] = darkGrey;
	colors[8] = lightBlue;
	colors[9] = lightGreen;
	colors[10] = lightYellow;
	colors[11] = violetRed;
	colors[12] = ivory;
	colors[13] = yellow;
	colors[14] = cyan;
	colors[15] = lime;

	pixel_color normalizedColorVal = { colors[val].R / 255.0f, colors[val].G / 255.0f, colors[val].B / 255.0f, 1.0f };

	return normalizedColorVal;
 }

__kernel void generate_image(const global pixel_color *pixelColor, write_only image2d_t outputImage , const global float2 *c_component,  int numIterations)
{
	// get id of element in array
	int x = get_global_id(0);
	int y = get_global_id(1);
	int w = get_global_size(0);
	int h = get_global_size(1);

	pixel_color pc = *pixelColor;
	//pc.R = c_component[0].x;
	//pc.G = c_component[0].y;

	cfloat Z = { ( -w / 2 + x) / (w/4.0f) , ( -h / 2 + y) / (h/4.0f) };
	cfloat C = { c_component[0].x, c_component[0].y };
	int iteration = 0;

	while (iteration < numIterations)
	{
		 cfloat Zpow2 = cmult(Z, Z); 
		 cfloat Zn = cadd(Zpow2, C);
		 Z.x = Zn.x;
		 Z.y = Zn.y;
		 iteration++;
		 if(cmod(Z) > 2)
		 {
			break;
		 }
	}

	// threshold reached mark pixel as black
	if (iteration == numIterations)
	{
		pc.R = 0.0f;
		pc.G = 0.0f;
		pc.B = 0.0f;
	}
	else
	{
		pc = getPixelColor(iteration % 16);
	}

	// RGBA
	float4 color = (float4)(pc.R, pc.G, pc.B, pc.A);

	write_imagef(outputImage, (int2)(x, y), color);
}

Here is the example application running.

juliaset generator

I don't want to get into details about how each line of code works. I’ve tried to comment on important parts of code. If you have any questions about some parts of the code hit me in comments and I’ll be glad to clear things up.

Here is the entire solution available for download.

JuliaSet.zip (19.74 kb)

Fix for Intel C++ Compiler XE 13.0 error: unable to obtain mapped memory (see pch_diag.txt)

 

CPlusPlusStudio_productVery annoying error when trying to compile C++ project with Visual Studio 2010 and Intel Compiler XE  13.0 (btw, it’s also reproducible with Intel Compiler 12 and 14). With Microsoft compilers projected compiles without warning but with Intel this error message comes up: 

error : unable to obtain mapped memory (see pch_diag.txt)

Intel suggest here that this is the problem with precompiled headers but it turns out problems is in icl.exe (Intel compiler). It needs admin rights. You can either give it explicitly to icl.exe or start Visual Studio as Administrator.

Starting with Ogre3D on Mac OS X 10.9.2 and Xcode 5.1.1

ogre-logo-wetfloorOgre3D is one of the most popular open-source graphics rendering engines. One of the powerful  advantages it offers is abstraction of rendering engine. It can run same code ("game") on OpenGL, Direct3D9,Direct3D10 or Direct3D11 by just changing application configuration file. Above  all it abstracts very efficiently many of the core game concepts like loading models, composing scenes, working with cameras etc. What attracted me in particular to Ogre is it's portability. Ogre can be compiled and is available as cross-platform SDK, available on Windows, OS X and some Linux distributions. I've got up to speed with Ogre with this excellent book. Even though book says it is platform independent and you can use any c++ compiler you want it is in essence very much tied to Windows and Visual Studio. Ogre has great wiki with a lot of examples and help pages to get you started but I had a ruff first encounter on OS X platform getting SDK and samples even to compile. In essence there are couple of prerequisites before you even compile and information about those is a bit scattered.

My idea here is simple "Hello World" or in this case "Hello Sinbad" example application from ground up. From downloading and installing prerequisites to actually writing and compiling code with Xcode.

At the time of writing current stable SDK version is 1.9. Even though it says stable it would not compile with Xcode because some include paths are not correctly set. It’s probably easy fix but for sake of keeping things simple and because the only good beginner book out there “OGRE 3D 1.7 Beginner's Guide” is using version 1.7 I will use it too. OS X is version 10.9.2 Mavericks with Xcode 5.1.1.

Installing prerequisites

  1. CMake for OS X – Used for compiling SDK  and sample code. You can download it here. At the end of the setup you should allow installation to place aliases in /usr/bin because SDK build will use Cmake from .sh scripts and effectively calling “cmake” without absolute path to your /Applications folder where Cmake will be installed.
  2. Nvidia CG Toolkit -   Installs Cg framework system wide and can be downloaded here.

Downloading and compiling Ogre3D SDK

Go to ogre3d.org and download OS X SDK version 1.7 or just click here. After downloading and mounting SDK .dmg file you should see content similar as shown on this picture. ogreSDK Simply copy entire OgreSDK to your local folder where you have read and write permissions. Important thing to notice when you are copying SDK locally is that you should not have spaces in your path to “OgreSDK”. In my case I had folder structure something like this: /Users/bbizic/Projects/3D Project/Ogre3D/OgreSDK. This one space in “3D Projects” caused SDK build to fails so beware of that when copying SDK files.

Next step is to compile Ogre SDK. As OS X user I was expecting like with other projects simple ./configre make steps but that’s not how it works here. From Terminal/shell navigate to your OgreSDK folder and execute following command.

cmake -GXcode

After this step navigate to your OgreSDK local folder and open OGRE.xcodeproj. You should see similar structure as this.

ogre xcodeI will be using debug version of libraries so make sure from Product –> Edit Scheme debug version was set and as scheme “ALL_BUILD” is checked. After that just click Product –> Build. It will take a while until all the projects are complied.

Notice: Ogre3D uses Boost libraries for threading. If for some reason SDK fails to compile boost libraries as part of the SDK build the reason could be that boost is targeting lover version OS X SDK, namely OS X 10.5. To be able to compile against and for older versions of OS X Xcode/gcc must have copy of OS X SDK either under /Applciations/Xcode.app/Contents/Developer/Platforms/MacOSX.Platform/Developer/SDKs/MacOSX10.5.sdk and/or under /Develper/SDKs/MacOSX10.5.sdk. Let me know if you have problems I will write more details about it. 

Sample Hello Sinbad World Example

 

Finally it’s time to build sample application. Fire up Xcode –> New Project and choose Command Line Tool. xcode commandline 

On next screen give some project name and as type choose C++. We could of gone with Cocoa application to make use of built in bundling system but for sake of simplicity we will build only our executable with correct libraries and then we will make use of SampleBrowser application which we built when we first time build the OgreSDK. In runtime Ogre needs to access it’s libraries from correct location and also few configuration files are needed to be properly set up. To avoid the hustle of configuration we will simple replace SampleBrowser executable with our new one.

Place project in folder next to OgreSDK so it will simplify search paths for include and lib files. Once project was created add new C++ file named main.cpp. and place following content in. You can omit my comments.

//
//  main.cpp
//  HelloSinbad
//
//  Created by Bojan Bizic on 12/05/14.
//  Copyright (c) 2014 BojanBizic. All rights reserved.
//

#include "Ogre/ExampleApplication.h"

class Example1 : public ExampleApplication
{
public:
    void createScene()
    {
        // Set the scene's ambient light
        mSceneMgr->setAmbientLight(Ogre::ColourValue(0.5f, 0.5f, 0.5f));
        
        // Create an Entity
        Ogre::Entity* ogreHead = mSceneMgr->createEntity("Head", "ogrehead.mesh");
        
        // Create a SceneNode and attach the Entity to it
        Ogre::SceneNode* headNode = mSceneMgr->getRootSceneNode()->createChildSceneNode("HeadNode");
        headNode->attachObject(ogreHead);
        
        headNode->setPosition(0.0, 0.0, 400.0);
        
        // Create a Light and set its position
        Ogre::Light* light = mSceneMgr->createLight("MainLight");
        light->setPosition(20.0f, 80.0f, 500.0f);
    }
};

int main(void)
{

    // Create application object
    Example1 app;
    
    try
    {
        app.go();
    } catch( Ogre::Exception& e ) {
        std::cerr << "An exception has occured: " << e.getFullDescription().c_str() << std::endl;
    }
    
    return 0;
}

Next step is to configure project settings namely include and library paths. First we need to make Boost happy and disable C++11.

cpp settings 

Next, header search paths.

header search paths

You need to add path to OgreSDK/include , OdreSDK/boost_1_46_1 and in case of our Mac OSX OgreSDK/include/OGRE/OSX otherwise compiler want be able to find macUtils.h file.

Next step, library search paths.

library search paths

We need OgreSDK/lib/ and OgreSDK/boost_1_46_1/lib.

Framework paths is also need because on Mac OS X Ogre is bundled as framework and should be referenced as such. Picture below shows setting on my system.framework paths

 

In next step switch over to the Build Phases tab and references as show on image below.

framework referecnces

 

If everything was setup correctly as shown above you should be able to compile at this point. Only thing left to do now is to copy our “HelloSinbad” (in my case this is the name of the project) and copy it to OgreSDK/bin/Debug/SampleBrowser/Content/MacOS and start it from there.

Final result

You can open up SampleBrowser app by right clicking on it and then on “Show Package Content” and place your executable in MacOS folder. Double clicking on executable should bring up Ogre configuration window.

ogre config windowSince we are on Mac OS X on available rendering Subsystem shown should be OpenGL. On Windows we would also see options for Direct3D. By clicking on OK button settings are saved in Resources/ogre.cfg file and our Sinbad is finally shown on screen.

sinbad  

 

 

That was it. Not so straight forward as on Windows but on other hand not less rewarding. Most of the work lies on setup and deployment of Ogre SDK files. Coding part was pretty simple and short. Writing similar application which loads model, sets up lighting and cameras with lower level abstractions such as OpenGL or Direct3D would involve at least thousand lines of C++ code plus GLSL/HLSL counter parts for Shader pipelines, not to mention problems with portability.

Hope this helps and if you have questions regarding this post please post it in comments bellow I will be more then happy to answer them.

MFC + 64Bit OS + VS2010 + Debugging = System.AccessViolationException

Once again getting my hands dirty with old C++ MFC code and I ran into strange problem.

It looks like this has been well-known issue since VS 6.0 over VS2005 up to newest 2010 version.

If you are trying to debug C++ MFC application on 64Bit OS using Visual Studio 2010 you will most likely get  System.AccessViolationException in “Unknow Module”.

 

AccessViolation

Did some research over MSDN and as it turns out only solution to this problem is to disable RPC debugging.

 

DisableRPC

In Visual Studio go to Tools->Options->Debugging->Native and just uncheck “Enable RPC debugging”. This setting is always enabled by default.

I will defiantly find my own post search for this solution in near future Smile.

Hope it helps you and please don’t use MFC any more Winking smile.

About me

Bizic Bojan is Co-Founder of Amida IT-Services GmbH and Software Architect with focus on .NET, C++, Python and Cloud Native solutions. 

 

Disclaimer:

The opinions expressed herein are my own personal opinions and do not represent my employer’s view in any way.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.