Apex GPU rendering
Introduction
Apex is a technology that enables Mako DOM content, typically loaded from PDF, to be rendered at high speed on a GPU.
The implementation in Mako is via a new rendering class, IApexRenderer. It performs the same rendering functions as Mako’s native IJawsRenderer, except for screening (halftoning). Rendering with Apex is very similar to working with Jaws, but the arguments are presented in a different and more organized fashion. The following section compares the two approaches.
Common code
This code instantiates Mako, then creates a reference to an object that can be rendered – in this case, a complete page – from the hierarchy of parent objects (assembly, document, page). It also gets the cropbox, a rectangle that defines the viewable (and often, the printable) area of the page. Finally, a device CMYK colorspace is created that will be used to specify the render colorspace.
const auto mako = IJawsMako::create();
IJawsMako::enableAllFeatures(mako);
const auto assembly = IInput::create(mako, eFFPDF)->open(testFilePath + "Cheshire Cat.pdf");
const auto document = assembly->getDocument();
const auto page = document->getPage(0);
const auto cropBox = page->getCropBox();
const auto fixedPage = page->getContent();
const auto cmykSpace = IDOMColorSpaceDeviceCMYK::create(mako);
Rendering code
Jaws
The Jaws render() method has many parameters that default to suitable values, greatly simplifying this example.
// Render with Jaws at 300 dpi, CMYK
const auto jawsRenderer = IJawsRenderer::create(mako);
const auto image = jawsRenderer->render(fixedPage, 300, 8, cmykSpace);
IDOMTIFFImage::encode(mako, image, IOutputStream::createToFile(mako, "FromJaws.tif"));
Apex
Apex is a little more bureaucratic, but this allows for a more consistent approach to render calls for different use cases.
// Render with Apex at 300 dpi, CMYK
auto imageRenderSpec = CImageRenderSpec();
imageRenderSpec.width = static_cast<uint32>(lround(cropBox.dX / 96.0 * 300));
imageRenderSpec.height = static_cast<uint32>(lround(cropBox.dY / 96.0 * 300));
imageRenderSpec.processSpace = cmykSpace;
imageRenderSpec.sourceRect = cropBox;
const auto apexRenderer = IApexRenderer::create(mako);
apexRenderer->render(fixedPage, &imageRenderSpec);
IDOMTIFFImage::encode(mako, imageRenderSpec.result, IOutputStream::createToFile(mako, "FromApex.tif"));
Calling the renderer: The IJawsRenderer class offers various rendering methods (render(), renderToFrameBuffer() etc.) to suit different use cases. Each has its own set of parameters, which are numerous and can be confusing. For Apex, we have taken a different approach, whereby you first populate a CRenderSpec (or rather, one of the derived classes such as CImageRenderSpec or CFrameBufferRenderSpec) with the required information. The minimum that can be specified is shown in the above example: width, height, colorspace and the bounds of the area to be rendered.
In a future version of Mako, IJawsRenderer will also accept CRenderSpec-derived classes to specify rendering parameters.
Apex has no explicit resolution setting: With Apex, you specify a width and height in pixels, and the buffer is filled accordingly, allowing for non-square rendering. This approach matches IJawsRenderer::renderToFrameBuffer() or IJawsRenderer::renderToFrameBuffers(); it’s only IJawsRenderer::render() that allows a resolution to be specified.
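Because Apex takes pixel dimensions rather than a resolution, the conversion from page units to pixels happens in user code. The sketch below isolates that arithmetic, using the 1/96-inch page units implied by the Apex example above. Note that pixelsForExtent is a hypothetical helper name chosen for illustration; it is not part of Mako.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not a Mako API): converts an extent measured in
// 1/96-inch page units to a pixel count at the requested resolution.
inline std::uint32_t pixelsForExtent(double extentInUnits, double dpi)
{
    assert(extentInUnits >= 0.0 && dpi > 0.0);
    return static_cast<std::uint32_t>(std::lround(extentInUnits / 96.0 * dpi));
}
```

For a US Letter cropbox (816 x 1056 units), pixelsForExtent(816.0, 300.0) gives 2550 and pixelsForExtent(1056.0, 300.0) gives 3300. Passing a different resolution for each axis is one way to arrive at the non-square rendering mentioned above.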
Threading pattern
Mako’s encapsulation of the Jaws RIP in the IJawsRenderer class makes possible multithreaded rendering whereby a page is divided into tiles or bands, then multiple instantiations of the renderer can be called on their own thread to render the page regions simultaneously. User code is required to implement this pattern.
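The band-splitting part of that pattern is plain arithmetic. Here is a minimal sketch, assuming horizontal bands of near-equal height. Band and splitIntoBands are hypothetical names, and the step that hands each band to its own renderer instance is left out because it depends on the Mako calls you choose.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical type (not a Mako API): one horizontal band of the output raster.
struct Band
{
    std::uint32_t top;     // first pixel row of the band
    std::uint32_t height;  // number of rows in the band
};

// Splits `pixelHeight` rows into `bandCount` contiguous bands. Remainder rows
// are spread one each over the first bands, so heights differ by at most 1.
inline std::vector<Band> splitIntoBands(std::uint32_t pixelHeight, std::uint32_t bandCount)
{
    assert(bandCount > 0);
    std::vector<Band> bands;
    bands.reserve(bandCount);
    const std::uint32_t base = pixelHeight / bandCount;
    const std::uint32_t extra = pixelHeight % bandCount;
    std::uint32_t top = 0;
    for (std::uint32_t i = 0; i < bandCount; ++i)
    {
        const std::uint32_t height = base + (i < extra ? 1u : 0u);
        bands.push_back({top, height});
        top += height;
    }
    return bands;
}
```

Each band can then be rendered on its own thread and the results stacked in order of top to rebuild the full page raster.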
For Apex, the model is slightly different. Each GPU requires a single instance of the Apex renderer.
When you make a render() call to an IApexRenderer instance, this is the sequence of events:
Internally, Apex breaks up the rendering area into tiles (due to the size of images that GPUs can work on to keep within VRAM limitations) and prepares a command buffer with images and geometry, then queues that to the GPU.
When the GPU becomes available, the tile is rendered.
The raster data for that tile is retrieved – this can usually be done while the next render (the previous step) is already underway.
When rendering to a FrameBuffer, Apex will work on the preparation simultaneously on several threads; this benefits performance because although the GPU can work only on one tile at a time, it does so very quickly, so it’s important to feed data to it as fast as possible to keep it busy. Therefore, in user code, it may be advantageous to submit multiple renders to a single Apex instance on different threads, again to ensure the input “hopper” is kept full. This can be done with some light threading.
To see a fully-worked example of threading with the Apex renderer, please look at simpleapexrender (found in the MakoApps folder or the Mako Apex distribution folder).
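The "light threading" described above can be as simple as a few worker threads that pull page indexes from a shared counter and submit each render to the same renderer instance. The sketch below shows only that submission pattern. StubRenderer is a stand-in invented for illustration and is not a Mako class; the simpleapexrender sample remains the reference for real code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Stand-in for the shared renderer (hypothetical; not a Mako class). In real
// code this role is played by the single IApexRenderer instance for the GPU.
struct StubRenderer
{
    std::atomic<std::size_t> rendered{0};
    void render(std::size_t /*pageIndex*/) { ++rendered; }
};

// Submits `pageCount` renders to one renderer from `workerCount` threads.
// Each worker claims the next unrendered page index from a shared atomic counter.
inline void renderAllPages(StubRenderer &renderer, std::size_t pageCount, unsigned workerCount)
{
    assert(workerCount > 0);
    std::atomic<std::size_t> nextPage{0};
    std::vector<std::thread> workers;
    workers.reserve(workerCount);
    for (unsigned i = 0; i < workerCount; ++i)
    {
        workers.emplace_back([&renderer, &nextPage, pageCount]
        {
            for (std::size_t page = nextPage.fetch_add(1); page < pageCount;
                 page = nextPage.fetch_add(1))
            {
                renderer.render(page);
            }
        });
    }
    for (auto &worker : workers)
    {
        worker.join();
    }
}
```

A shared counter keeps every worker busy until the pages run out, which suits the goal of keeping the renderer's input queue full. The best worker count is likely to depend on the hardware and the content, so treat it as a value to tune.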
Apex: Rendering to a FrameBuffer
The examples above render to an IDOMImage, a convenient Mako object that allows for image operations with built-in methods for color conversion, downsampling, adding a bleed and much more. However, behind the scenes Mako will be making one or more copies of the raster data, as required. A more efficient approach is to render to a buffer that is then reused, by an image encoder for example. The sample simpleapexrender takes this approach.
Render
IPagePtr page = document->getPage(pageNum);
// Fetch the content
IDOMFixedPagePtr content = page->getContent();
// Decide the size of the resulting rendered image. Here we'll just
// render the crop box area.
FRect cropBox = page->getCropBox();
uint32 pixelWidth = (uint32) lround(cropBox.dX * renderParams.m_xResolution / 96.0);
uint32 pixelHeight = (uint32) lround(cropBox.dY * renderParams.m_yResolution / 96.0);
// Allocate an 8bpc frame buffer
uint8 numChannels = renderParams.m_finalSpace->getNumComponents();
if (renderParams.m_alpha)
{
    numChannels++;
}
uint32 stride = pixelWidth * numChannels * renderParams.m_depth / 8;
std::unique_ptr<uint8[]> frameBuffer(new uint8[(uint64) stride * (uint64) pixelHeight]);
// Render
CFrameBufferRenderSpec renderSpec;
renderSpec.processSpace = renderParams.m_processSpace;
renderSpec.finalSpace = renderParams.m_finalSpace;
renderSpec.width = pixelWidth;
renderSpec.height = pixelHeight;
renderSpec.depth = renderParams.m_depth;
renderSpec.aaFactor = renderParams.m_aaFactor;
renderSpec.alpha = renderParams.m_alpha;
renderSpec.sourceRect = cropBox;
renderSpec.hostEndian = false;
renderSpec.buffer = frameBuffer.get();
renderSpec.rowStride = (int32) stride;
renderSpec.optionalContent = optionalContent;
renderSpec.optionalContentEvent = renderParams.m_optionalContentEvent;
renderer->render(content, &renderSpec);
// Done with the page
page->revert();
page->release();
This leaves us with a buffer ready to encode.
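The buffer sizing in that listing is worth isolating, because the stride depends on the channel count, the bit depth and whether alpha is included. Below is a small sketch of the same arithmetic. FrameBufferLayout and computeLayout are hypothetical names, and the restriction to 8 or 16 bits per channel is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type (not a Mako API) mirroring the arithmetic in the sample.
struct FrameBufferLayout
{
    std::uint32_t stride;      // bytes per scanline, with no row padding
    std::uint64_t totalBytes;  // bytes needed for the whole buffer
};

inline FrameBufferLayout computeLayout(std::uint32_t pixelWidth, std::uint32_t pixelHeight,
                                       std::uint8_t colorChannels, bool alpha,
                                       std::uint8_t depth)
{
    // Assumption of this sketch: only 8 or 16 bits per channel.
    assert(depth == 8 || depth == 16);
    const std::uint32_t channels = colorChannels + (alpha ? 1u : 0u);
    const std::uint32_t stride = pixelWidth * channels * depth / 8u;
    // Widen before multiplying so large pages do not overflow 32 bits.
    return {stride, static_cast<std::uint64_t>(stride) * pixelHeight};
}
```

For a 2550 x 3300 pixel CMYK page at 8 bits per channel without alpha, this gives a stride of 10200 bytes and a buffer of 33,660,000 bytes.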
Encode
To encode from a buffer, we use the IImageFrameWriter::writeScanLine() method, passing in the address of the start of each scanline, as found in the rendered buffer.
// Create the path to the output
char outPath[MAX_PATH];
edlSnprintfE(outPath, sizeof(outPath), renderParams.m_outputFilePath.c_str(), pageNumInFile);
// Create the output image frame
IRAInputStreamPtr readStream = IInputStream::createFromFile(jawsMako, outPath);
IRAOutputStreamPtr writeStream = IOutputStream::createToFile(jawsMako, outPath);
IImageFrameWriterPtr frame;
eImageExtraChannelType extraChannel = renderParams.m_alpha ? eIECAlpha : eIECNone;
if (renderParams.m_renderedFormat == eRFPNG)
{
    IDOMImagePtr image = IDOMPNGImage::createWriterAndImage(jawsMako,
                                                            frame,
                                                            renderParams.m_finalSpace,
                                                            pixelWidth, pixelHeight,
                                                            renderParams.m_depth,
                                                            renderParams.m_xResolution,
                                                            renderParams.m_yResolution,
                                                            extraChannel,
                                                            readStream, writeStream);
}
else if (renderParams.m_renderedFormat == eRFTIFF)
{
    // TIFF
    IDOMImagePtr image = IDOMTIFFImage::createWriterAndImage(jawsMako,
                                                             frame,
                                                             renderParams.m_finalSpace,
                                                             pixelWidth, pixelHeight,
                                                             renderParams.m_depth,
                                                             renderParams.m_xResolution,
                                                             renderParams.m_yResolution,
                                                             IDOMTIFFImage::eTCAuto,
                                                             IDOMTIFFImage::eTPNone,
                                                             extraChannel,
                                                             renderParams.m_bigTIFF,
                                                             readStream, writeStream);
}
else
{
    // Can't write with an extra channel
    if (extraChannel != eIECNone)
    {
        throwEDLError(JM_ERR_GENERAL, L"Can't write JPEG with an alpha channel");
    }
    if (renderParams.m_depth == 16)
    {
        throwEDLError(JM_ERR_GENERAL, L"Can't write JPEG with 16 bit depth");
    }
    // JPEG - middling quality
    IDOMImagePtr image = IDOMJPEGImage::createWriterAndImage(jawsMako,
                                                             frame,
                                                             renderParams.m_finalSpace,
                                                             pixelWidth, pixelHeight, 8,
                                                             renderParams.m_xResolution,
                                                             renderParams.m_yResolution,
                                                             3,
                                                             readStream, writeStream);
}
// Out with it
for (uint32 y = 0; y < pixelHeight; y++)
{
    const uint8 *scanline = frameBuffer.get() + (uint64) y * (uint64) stride;
    frame->writeScanLine(scanline);
}
frame->flushData();