Hands-On High Performance Programming with Qt 5

Instrumenting profilers

Code instrumentation means adding code to our existing code base that will measure performance and output performance data. This can be done automatically by some tools, or manually, by simply writing some printf statements. The venerable prof and gprof GNU profilers used to work that way: the programmer had to specify a special compiler switch (-p or -pg), and the GNU compiler would add the necessary code to each of the functions. As a more modern example, the commercial Rational Quantify tool uses object-code insertion technology to instrument the executable being tested by dynamically inserting instrumentation code. On the other hand, RAD Game Tools' Telemetry provides instrumentation functions to be manually inserted into code, and Intel VTune provides such functions in its instrumentation and tracing technology (ITT) API library.
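To make the manual variant concrete, here is a minimal sketch of hand-written instrumentation using only the C++ standard library. The ScopedTimer class and the sumOfSquares workload are hypothetical names invented for this illustration; they are not part of Qt or of any of the profilers mentioned above. The timer simply measures the lifetime of a scope and prints the result with fprintf:

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>

// Hypothetical helper (an assumption for illustration, not a Qt or profiler API):
// measures how long the enclosing scope lives and reports it on destruction.
class ScopedTimer {
public:
    // 'sink' is optional; if given, it receives the elapsed microseconds.
    explicit ScopedTimer(const char* label, long long* sink = nullptr)
        : label_(label), sink_(sink), start_(std::chrono::steady_clock::now())
    {
        assert(label != nullptr);
    }

    ~ScopedTimer()
    {
        const auto end = std::chrono::steady_clock::now();
        const long long us =
            std::chrono::duration_cast<std::chrono::microseconds>(end - start_).count();
        if (sink_)
            *sink_ = us;
        // The "output performance data" part: a plain printf-style statement.
        std::fprintf(stderr, "[profile] %s: %lld us\n", label_, us);
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    const char* label_;
    long long* sink_;
    std::chrono::steady_clock::time_point start_;
};

// Example workload, instrumented by hand with a single extra line.
long long sumOfSquares(int n, long long* elapsedUs = nullptr)
{
    ScopedTimer timer("sumOfSquares", elapsedUs);
    long long sum = 0;
    for (int i = 1; i <= n; ++i)
        sum += static_cast<long long>(i) * i;
    return sum;
}
```

Calling sumOfSquares(1000) would print one line to stderr with the measured time. Note that the timer itself costs a few clock reads and one formatted write per call, which is exactly the kind of added overhead that instrumentation brings with it.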

Because they change the code they are supposed to be measuring, such techniques are also called invasive (for your inner geek, remember the Heisenberg principle?). Their advantage is high-precision data; their disadvantage is that they change the runtime behavior of the program, which diminishes the accuracy of the measurement.

The simpler but more time-consuming version of instrumentation, namely manually inserting the measurement and output code, also has its merits:

  • Profile builds: In addition to the usual release and debug build configurations, you can add a configuration in which the profiling code is enabled, and activate it when a performance problem has to be investigated or the application's performance has to be checked. For example, in Telemetry's case, you set #define NTELEMETRY 1 and recompile, and the instrumentation will be gone.
  • Custom visualizations: Another advantage of manual instrumentation is that we can format the output as we wish, thus making it compatible with our preferred visualization tools. For example, Intel's ITT API functions can generate output in the formats of several visualizers, depending on the value of an environment variable. We will see some examples of this approach later in this chapter.
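Both points can be sketched in a few lines of standard C++. Everything in the following example is an assumption made for illustration: the NPROFILE switch is only modeled loosely on the NTELEMETRY idea, and PROFILE_SCOPE, ProfileScope, formatCsvRecord, and the CSV layout (name, start in microseconds, duration in microseconds) are hypothetical names and choices, not part of Telemetry, the ITT API, or Qt. When NPROFILE is defined, the macro expands to nothing and the measurement disappears from the build; otherwise, each scope writes one row in a format we control:

```cpp
#include <cassert>
#include <chrono>
#include <ostream>
#include <sstream>
#include <string>

// Custom output format (an assumption): one CSV row per measurement,
// laid out as name,start_us,duration_us.
std::string formatCsvRecord(const std::string& name, long long startUs, long long durationUs)
{
    std::ostringstream row;
    row << name << ',' << startUs << ',' << durationUs << '\n';
    return row.str();
}

// Hypothetical scope-based probe that writes a CSV row when it is destroyed.
class ProfileScope {
    using Clock = std::chrono::steady_clock;

public:
    ProfileScope(const char* name, std::ostream& out)
        : name_(name), out_(out), start_(Clock::now())
    {
        assert(name != nullptr);
    }

    ~ProfileScope()
    {
        using std::chrono::duration_cast;
        using std::chrono::microseconds;
        const auto end = Clock::now();
        const long long startUs =
            duration_cast<microseconds>(start_.time_since_epoch()).count();
        const long long durationUs = duration_cast<microseconds>(end - start_).count();
        out_ << formatCsvRecord(name_, startUs, durationUs);
    }

    ProfileScope(const ProfileScope&) = delete;
    ProfileScope& operator=(const ProfileScope&) = delete;

private:
    const char* name_;
    std::ostream& out_;
    Clock::time_point start_;
};

// Hypothetical build switch: compile with -DNPROFILE to remove all probes.
#ifndef NPROFILE
#define PROFILE_SCOPE(name, out) ProfileScope profileScope_((name), (out))
#else
#define PROFILE_SCOPE(name, out) ((void)(out))
#endif

// Example workload carrying a single probe.
int countPrimes(int limit, std::ostream& profileOut)
{
    PROFILE_SCOPE("countPrimes", profileOut);
    int count = 0;
    for (int n = 2; n <= limit; ++n) {
        bool prime = true;
        for (int d = 2; d * d <= n; ++d) {
            if (n % d == 0) {
                prime = false;
                break;
            }
        }
        if (prime)
            ++count;
    }
    return count;
}
```

In a profile build, we would pass a file stream as profileOut and load the resulting CSV file into a spreadsheet or plotting tool. Swapping formatCsvRecord for another formatter is all it takes to target a different visualizer, which is the flexibility that manual instrumentation buys us.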