(v13) Further tuning

This page applies to Harlequin v13.1r0 and later, and to Harlequin Core but not Harlequin MultiRIP.

Once you have an initial configuration for an optimal system, and especially if there are additional significant processes on the same server, you may find tools such as Process Monitor and Intel® VTune™ Amplifier useful for understanding which areas are the biggest bottlenecks (such as in-cache flushing and memory contention).

As an example of the kind of evaluation that you might find useful, this document sets out a Global Graphics analysis of the limits on performance when multiple Harlequin RIPs were run on a single server, primarily addressing the question of:

why the increase in throughput is not linear across the whole of the range where the product of the number of RIPs and the number of threads per RIP is below the number of physical processing cores available.

It is a high-level analysis, noting our own experience and suggesting a reasonable methodology. It is not intended to define a canonically correct process for identifying an optimal configuration.
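
To make the range referred to in the question above concrete, the following minimal Python sketch lists every combination of RIP count and threads per RIP whose product is below a given number of physical cores. The function name `configs_below_core_count` and the `physical_cores` parameter are hypothetical and used only for illustration; they are not part of any Harlequin API, and the sketch says nothing about which of these combinations performs best.

```python
def configs_below_core_count(physical_cores: int) -> list[tuple[int, int]]:
    """Return (rips, threads_per_rip) pairs whose product is below physical_cores.

    physical_cores is an assumption supplied by the caller; this sketch does
    not try to detect the hardware itself.
    """
    configs = []
    for rips in range(1, physical_cores):
        for threads_per_rip in range(1, physical_cores):
            # Strictly "below" the core count, matching the wording of the question.
            if rips * threads_per_rip < physical_cores:
                configs.append((rips, threads_per_rip))
    return configs


if __name__ == "__main__":
    # Example: enumerate candidate configurations for a hypothetical 8-core server.
    for rips, threads_per_rip in configs_below_core_count(8):
        print(f"{rips} RIP(s) x {threads_per_rip} thread(s) = {rips * threads_per_rip}")
```

A list like this only identifies the configurations in which the RIPs are not competing for more threads than there are physical cores; whether throughput scales linearly within that range is the subject of the analysis.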