How to use perf to measure part of a program quickly

This post summarizes how to get started with perf quickly. Specifically, it shows how to use perf to measure only part of a program's execution, ignoring the parts you don’t care about.

  1. Basics
    1. perf is a command-line tool (https://perf.wiki.kernel.org/index.php/Main_Page)
      1. To measure the performance of a program with it, simply run
        1. $ perf stat <perf-options> <program_name> <program_arguments>
          1. For example, “perf stat ls” runs “ls” and prints a short list of measurements for it.
        2. <perf-options> are optional flags, such as -e to choose which events to count and -p to attach to a running process. Both are used in the command at the end of this post.
  2. A good tutorial
    1. https://perf.wiki.kernel.org/index.php/Tutorial#multiplexing_and_scaling_events

The key part I want to talk about is how to use perf to benchmark only part of a program quickly. This idea was suggested to me by Vladimir Kiriansky in the COMMIT group. Put a “getchar()” (or anything else that blocks) in the program just before the part you want to measure. While the program is stopped there, look up its process ID; then, from a separate terminal, attach perf to that process ID. This way, you can easily skip the parts you don’t want to measure. I have tried this and it works fine.
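The pause itself can be sketched as follows. This is a minimal illustration in Python rather than C, where a blocking readline() plays the role of getchar(). The names pause_and_report_pid, setup and region_of_interest are made up for this example, and the two workload functions are only stand-ins. Note that for a Python program the counters also include the interpreter's own work.

```python
import os
import sys


def pause_and_report_pid(stdin=sys.stdin, stdout=sys.stdout):
    """Print this process's PID, then block until a line arrives on stdin."""
    pid = os.getpid()
    print(f"PID {pid}: press Enter to continue", file=stdout, flush=True)
    stdin.readline()  # blocks here, like getchar() in C
    return pid


def setup():
    # Hypothetical stand-in for the part you do NOT want to measure.
    return list(range(1_000_000))


def region_of_interest(data):
    # Hypothetical stand-in for the part you DO want to measure.
    return sum(x * x for x in data)


if __name__ == "__main__":
    data = setup()
    pause_and_report_pid()  # note the PID, get perf ready in another terminal
    print(region_of_interest(data))
```

The printed PID is what goes after -p in the perf command, so there is no need to search for the process with ps or top.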

Of course, there are heavier-weight methods in which the program itself invokes perf for more precise measurement. But the trick above is really useful for quick prototyping and measurement.
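As a rough idea of what one heavier-weight method could look like, here is a sketch in which the program launches perf against its own PID around the region of interest and then interrupts it, just as Ctrl-C would. This is an assumption about one possible approach, not the method from the tutorial. The helper names self_attach_command and perf_region and the 0.5 second settle delay are made up for illustration. It needs perf on the PATH and enough permissions to attach to your own process.

```python
import os
import signal
import subprocess
import time
from contextlib import contextmanager


def self_attach_command(events):
    """argv for a perf stat that attaches to the current process."""
    return ["perf", "stat", "-e", ",".join(events), "-p", str(os.getpid())]


@contextmanager
def perf_region(events, settle_seconds=0.5):
    """Count `events` for this process while the with-block runs (hypothetical helper)."""
    perf = subprocess.Popen(self_attach_command(events))
    time.sleep(settle_seconds)  # assumption: gives perf enough time to attach
    try:
        yield
    finally:
        perf.send_signal(signal.SIGINT)  # like Ctrl-C: perf stops and prints its counts
        perf.wait()


if __name__ == "__main__":
    data = list(range(1_000_000))  # setup, started before perf attaches
    with perf_region(["LLC-loads", "LLC-load-misses"]):
        print(sum(x * x for x in data))  # region of interest
```

The fixed delay is the weak point of this sketch: if perf attaches late, the start of the region is missed, and if it attaches early, a little idle time is included in the window.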

You can find the commands to attach perf to a specific process here

https://perf.wiki.kernel.org/index.php/Tutorial#multiplexing_and_scaling_events

A few points to note

  1. To be precise, I started perf after I resumed the execution, so perf did not monitor the program while it was waiting for my input.
  2. I used sleep to specify how long I wanted perf to monitor the process. This way, perf stops monitoring before the program finishes executing, rather than running until the program exits.
  3. The specific perf command I used is the following (similar to the tutorial’s command for attaching to a process). The -p option attaches perf to process 41498, and the sleep at the end specifies how long to monitor the process, here 4 seconds. In my experience this works.
    1. perf stat -e LLC-loads,LLC-load-misses,dTLB-load-misses -p 41498 sleep 4
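
If you run this often, the command can be assembled by a small script instead of retyped. The sketch below only rebuilds the same command line from a PID, a list of events and a duration. build_perf_command is a made-up helper name, and the script assumes perf is installed and on the PATH.

```python
import subprocess
import sys


def build_perf_command(pid, events, seconds):
    """Build the argv for: perf stat -e <events> -p <pid> sleep <seconds>."""
    return ["perf", "stat", "-e", ",".join(events), "-p", str(pid), "sleep", str(seconds)]


if __name__ == "__main__":
    pid = int(sys.argv[1])  # PID of the paused program, passed on the command line
    events = ["LLC-loads", "LLC-load-misses", "dTLB-load-misses"]
    cmd = build_perf_command(pid, events, 4)
    print(" ".join(cmd))    # show the exact command before running it
    subprocess.run(cmd, check=False)
```

Saved under any file name you like and run with 41498 as its only argument, this prints and runs exactly the command above.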