This weekend I managed to set up my application to finally render a cube, so here is an overview of what is required to render a basic 3D cube. The first thing I need is to create the vertex buffer and index buffer of the cube.

```swift
var vertex: [VertexAttrib] = [
    VertexAttrib(position: float4(x: -1.0, y: -1.0, z: -1.0, w: 1.0), color: float4(x: 1.0, y: 0.0, z: 0.0, w: 1.0)),
    VertexAttrib(position: float4(x: -1.0, y: -1.0, z:  1.0, w: 1.0), color: float4(x: 0.0, y: 1.0, z: 0.0, w: 1.0)),
    VertexAttrib(position: float4(x:  1.0, y: -1.0, z:  1.0, w: 1.0), color: float4(x: 0.0, y: 0.0, z: 1.0, w: 1.0)),
    VertexAttrib(position: float4(x:  1.0, y: -1.0, z: -1.0, w: 1.0), color: float4(x: 1.0, y: 1.0, z: 0.0, w: 1.0)),
    VertexAttrib(position: float4(x: -1.0, y:  1.0, z: -1.0, w: 1.0), color: float4(x: 1.0, y: 0.0, z: 1.0, w: 1.0)),
    VertexAttrib(position: float4(x: -1.0, y:  1.0, z:  1.0, w: 1.0), color: float4(x: 0.0, y: 1.0, z: 1.0, w: 1.0)),
    VertexAttrib(position: float4(x:  1.0, y:  1.0, z:  1.0, w: 1.0), color: float4(x: 0.0, y: 0.0, z: 0.0, w: 1.0)),
    VertexAttrib(position: float4(x:  1.0, y:  1.0, z: -1.0, w: 1.0), color: float4(x: 1.0, y: 1.0, z: 1.0, w: 1.0))
]

let index: [u_short] = [
    0, 3, 2, 2, 1, 0, // bottom
    1, 2, 6, 6, 5, 1, // front
    4, 5, 6, 6, 7, 4, // top
    3, 0, 4, 4, 7, 3, // back
    0, 1, 5, 5, 4, 0, // left
    2, 3, 7, 7, 6, 2  // right
]
```

So I created two buffers: one for the vertex attributes (position and color), and one for the indices of the vertices that get drawn for each face of the cube. Similar to the quad, a vertex descriptor needs to be created. What is different is that we now need the MVP (Model, View, Projection) matrix to specify the world-space position of the cube, the camera position and orientation, and the perspective projection of the camera. We specify this MVP matrix as one 4×4 floating point matrix per object in the scene, and hence we need a uniform buffer set up for the MVP matrix.
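For reference, a vertex descriptor matching the interleaved position/color layout of `VertexAttrib` might look like the following. This is only a sketch: the attribute indices and the buffer index 0 are assumptions that have to match the shader's `[[attribute(n)]]` and buffer bindings.

```swift
// Sketch: describe the interleaved position + color layout of VertexAttrib.
// Attribute and buffer indices here are assumptions that must match the shader.
let vertexDescriptor = MTLVertexDescriptor()
vertexDescriptor.attributes[0].format = .float4                       // position
vertexDescriptor.attributes[0].offset = 0
vertexDescriptor.attributes[0].bufferIndex = 0
vertexDescriptor.attributes[1].format = .float4                       // color
vertexDescriptor.attributes[1].offset = MemoryLayout<float4>.stride
vertexDescriptor.attributes[1].bufferIndex = 0
vertexDescriptor.layouts[0].stride = MemoryLayout<VertexAttrib>.stride
pipelineStateDescriptor.vertexDescriptor = vertexDescriptor
```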

```swift
var matrix = ObjectAttrib(transform: float4x4.init())
var transformBuffer = device.makeBuffer(
    length: MemoryLayout<float4x4>.size,
    options: MTLResourceOptions.storageModeShared)
```

ObjectAttrib is just a struct wrapping the `float4x4` transform matrix for now; later on I will split the MVP into two separate matrices (M and VP), as I will most likely need the model matrix for other calculations in world space. Following Apple's Metal best practices guide, I am using the `storageModeShared` option for my buffer's resource options, since my buffer is relatively small. In the future I will try to measure its performance against `storageModeManaged` with `didModifyRange()`.

In the update loop, we update the camera, compute the projection matrix using the viewport information along with some assumptions (60 degrees for the vertical field of view, 0.1 near plane, and 100 far plane), then update the matrix and `memcpy` it into the `transformBuffer` to update the MVP matrix of the object. Because we are using `storageModeShared`, the GPU accesses system memory directly, so we don't have to notify Metal to transfer anything from system memory to device memory.

```swift
let viewMatrix: float4x4 = camera.transform.inverse()
let projectionMatrix: float4x4 = perspective_matrix(
    fovY: float_t(60.0 * M_PI / 180.0),
    aspect: float_t(viewport.width) / float_t(viewport.height),
    nearZ: 0.1,
    farZ: 100.0)
matrix.transform = projectionMatrix * viewMatrix * modelMatrix
memcpy(transformBuffer?.contents(), &matrix, MemoryLayout<float4x4>.size)
```

I have implemented the view matrix as the inverse of the transformation matrix of the camera object. Below is the camera matrix code, implemented in the camera class. I am using a right-handed coordinate system, so the forward vector of the camera points in the -Z direction by default. Using the look-at position, the camera position, and the up vector of the camera, we can get the three axes of the camera to form a rotation matrix, and then apply a translation using the camera position to form the camera transformation. The view matrix is simply the inverse of this matrix:

```swift
func update() {
    let forward = normalize(target - pos)
    let right = normalize(cross(forward, up))
    up = normalize(cross(right, forward))
    // transform = translation * rotation (columns of a column-major float4x4)
    transform = float4x4.init(
        float4(1.0, 0.0, 0.0, 0.0),
        float4(0.0, 1.0, 0.0, 0.0),
        float4(0.0, 0.0, 1.0, 0.0),
        float4(pos.x, pos.y, pos.z, 1.0))
        * float4x4.init(
            float4(right.x, right.y, right.z, 0.0),
            float4(up.x, up.y, up.z, 0.0),
            float4(-forward.x, -forward.y, -forward.z, 0.0),
            float4(0.0, 0.0, 0.0, 1.0))
}
```
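As a side note, because the camera transform is a rigid-body transform (a rotation $R$ followed by a translation $T(\mathbf{p})$), the view matrix does not actually require a general matrix inversion; it can be computed cheaply as

$$
V = \bigl(T(\mathbf{p})\,R\bigr)^{-1} = R^{-1}\,T(\mathbf{p})^{-1} = R^{\top}\,T(-\mathbf{p}),
$$

since the inverse of a rotation matrix is its transpose and the inverse of a translation simply negates the offset.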

The projection matrix is the perspective projection matrix multiplied by a correction matrix that remaps NDC z by scaling it by 0.5 and shifting it by +0.5. The correction matrix is required in Metal because the NDC cube is 2×2×1 rather than 2×2×2 as in OpenGL: z runs from 0 to 1, so its centre is at z = 0.5 rather than z = 0.

```swift
func perspective_matrix(
    fovY: float_t, aspect: float_t,
    nearZ: float_t, farZ: float_t) -> float4x4 {
    let yscale: float_t = 1.0 / tanf(fovY * 0.5)
    let xscale: float_t = 1.0 / aspect * yscale
    // OpenGL-style right-handed projection mapping z to [-1, 1];
    // the correction matrix below then remaps this to Metal's [0, 1].
    let a = -(farZ + nearZ) / (farZ - nearZ)
    let b = -(2.0 * farZ * nearZ) / (farZ - nearZ)
    let m: float4x4 = float4x4.init(
        float4(xscale, 0.0, 0.0, 0.0),
        float4(0.0, yscale, 0.0, 0.0),
        float4(0.0, 0.0, a, -1.0),
        float4(0.0, 0.0, b, 0.0))
    let correction = float4x4.init(
        float4(1.0, 0.0, 0.0, 0.0),
        float4(0.0, 1.0, 0.0, 0.0),
        float4(0.0, 0.0, 0.5, 0.0),
        float4(0.0, 0.0, 0.5, 1.0))
    return correction * m
}
```
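The correction step can be sanity-checked with a short derivation. An OpenGL-style projection with $a = -\frac{f+n}{f-n}$ and $b = -\frac{2fn}{f-n}$ produces, after the perspective divide by $w = -z$,

$$
z_{ndc} = \frac{a z + b}{-z}, \qquad z_{ndc}(-n) = -1, \qquad z_{ndc}(-f) = 1,
$$

so NDC depth spans $[-1, 1]$. Applying the correction $z' = 0.5\,z_{ndc} + 0.5$ maps $-1 \mapsto 0$ and $1 \mapsto 1$, giving exactly the $[0, 1]$ depth range of Metal's 2×2×1 NDC cube.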

The cube's draw method is different due to the use of the index buffer. You don't have to bind the index buffer with an explicit call the way `setVertexBuffer()` binds the vertex buffer; instead, `drawIndexedPrimitives()` takes arguments for you to fill in the details of your index buffer. Note that at the beginning I declared my index buffer as a ushort array, so I am using `uint16` here with 0 offset.

```swift
// create render commands for the cube with the render command encoder
encoder.setVertexBuffer(_vertexBuffer, offset: 0, at: 0)
encoder.drawIndexedPrimitives(
    type: MTLPrimitiveType.triangle,
    indexCount: index.count,
    indexType: MTLIndexType.uint16,
    indexBuffer: _indexBuffer!,
    indexBufferOffset: 0)
```

This is what we get so far. We are still missing depth testing, and we are also drawing both sides of every face of the cube.
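The second problem can be addressed with back-face culling on the render command encoder. A minimal sketch, assuming the counter-clockwise front-face winding used by the index buffer above:

```swift
// Cull back faces so only the outward-facing side of each face is drawn.
// Assumes front faces are wound counter-clockwise (as in the index buffer above).
encoder.setFrontFacing(MTLWinding.counterClockwise)
encoder.setCullMode(MTLCullMode.back)
```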

To solve the first problem, we need to create a depth buffer and enable depth testing. We have to create a descriptor for the depth test so that nothing that is occluded gets drawn over the objects that are meant to occlude it. This is done using a depth comparison function (less than), so that we only draw fragments that are in front of the fragments we have already drawn, from the perspective of the camera.

```swift
// create depth descriptor
let depthDescriptor = MTLDepthStencilDescriptor()
depthDescriptor.depthCompareFunction = MTLCompareFunction.less
depthDescriptor.isDepthWriteEnabled = true
depthStencilState = device.makeDepthStencilState(descriptor: depthDescriptor)
```

We also need to create a depth texture, and a descriptor for creating it. I'm using `depth32Float` with my viewport's width and height, and the same sample count as my viewport, hence I had to declare its `textureType` as `type2DMultisample`. Lastly, the `resourceOptions` can be set to `storageModePrivate`, since the texture is only written and read on the GPU.

```swift
// create depth texture
let depthTexDescriptor = MTLTextureDescriptor()
depthTexDescriptor.pixelFormat = MTLPixelFormat.depth32Float
depthTexDescriptor.width = Int(view.drawableSize.width)
depthTexDescriptor.height = Int(view.drawableSize.height)
depthTexDescriptor.mipmapLevelCount = 1
depthTexDescriptor.sampleCount = view.sampleCount
depthTexDescriptor.resourceOptions = MTLResourceOptions.storageModePrivate
depthTexDescriptor.textureType = MTLTextureType.type2DMultisample
depthTexDescriptor.usage = MTLTextureUsage.renderTarget
depthTexture = device.makeTexture(descriptor: depthTexDescriptor)
```

The pipeline state descriptor has to be modified to specify the format of the depth buffer we are using with our pipeline.

```swift
pipelineStateDescriptor.depthAttachmentPixelFormat = MTLPixelFormat.depth32Float
```

I see some people using `CAMetalLayer`, but I am just using the render pass descriptor provided by `MTKView`. The `MTKView`'s render pass descriptor requires the depth attachment to be set to activate depth testing. Here we also specify the clear value of depth to be 1.0 (objects closer to the camera have smaller depth values), along with the load and store actions for the depth attachment.

```swift
// inside render loop
if let d = view.currentRenderPassDescriptor {
    d.depthAttachment.texture = depthTexture
    d.depthAttachment.clearDepth = 1.0
    d.depthAttachment.loadAction = MTLLoadAction.clear
    d.depthAttachment.storeAction = MTLStoreAction.store
    // create encoder ...
}
```

During the creation of the render command encoder, we have to set the depth stencil state, so that the `depthCompareFunction` we configured is used and depth testing is enabled.

```swift
encoder.setDepthStencilState(depthStencilState)
```