ruby-prof

Build Status

Overview

ruby-prof is a profiler for MRI Ruby. Its features include:

Why ruby-prof?

ruby-prof is helpful if your program is slow and you don't know why. It can help you track down methods that are either slow, allocate a large number of objects or allocate objects with high memory usage. Often times the results will be surprising - when profiling what you think you know almost always turns out to be wrong.

Since ruby-prof is built using ruby's C tracepoint api, it knows a lot about your program. However, using ruby-prof also comes with two caveats:

Most of the time, these two caveats are acceptable. But if you need to determine why a program running in production is slow or hung, a sampling profiler will be a better choice. Excellent choices include stackprof or rbspy.

If you are just interested in memory usage, you may also want to checkout the memory_profiler gem (although ruby-prof provides similar information).

Installation

The easiest way to install ruby-prof is by using Ruby Gems. To install:

gem install ruby-prof

If you are running Linux or Unix you'll need to have a C compiler installed so the extension can be built when it is installed. If you are running Windows, then you should install the Windows specific gem or install install devkit.

ruby-prof requires Ruby 2.7.0 or higher.

!!!WARNING!!!

Ruby versions 2.7.0 and 2.7.1 have a bug that can cause ruby-prof to return incorrect results.


Getting Started

There are three ways to use ruby-prof:

Command Line

The easiest way to use ruby-prof is via the command line, which requires no modifications to your program.  The basic usage is:

ruby-prof [options] <script.rb> [--] [script-options]"

Where script.rb is the program you want to profile.

For a full list of options, see RubyProf::Cmd documentation or execute the following command:

ruby-prof -h

Convenience API

The second way to use ruby-prof is via its convenience API. This requires small modifications to the program you want to profile:

require 'ruby-prof'

profile = RubyProf::Profile.new

# profile the code
profile.start
# ... code to profile ...
result = profile.stop

# print a flat profile to text
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT)

Alternatively, you can use a block to tell ruby-prof what to profile:

require 'ruby-prof'

# profile the code
result = RubyProf::Profile.profile do
  # ... code to profile ...
end

# print a graph profile to text
printer = RubyProf::GraphPrinter.new(result)
printer.print(STDOUT, {})
ruby-prof also supports pausing and resuming profiling runs.
require 'ruby-prof'

profile = RubyProf::Profile.new

# profile the code
profile.start
# ... code to profile ...

profile.pause
# ... other code ...

profile.resume
# ... code to profile ...

result = profile.stop

Note that resume will only work if start has been called previously. In addition, resume can also take a block:

require 'ruby-prof'

profile = RubyProf::Profile.new

# profile the code
profile.start
# ... code to profile ...

profile.pause
# ... other code ...

profile.resume do
  # ... code to profile...
end

result = profile.stop

With this usage, resume will automatically call pause at the end of the block.

The RubyProf.profile method can take various options, which are described in the Profiling Options below.

Core API

The convenience API is a wrapper around the the Ruby::Profile class. Using the Profile class directly provides addition functionality, such as method exclusion.

To create a new profile:

require 'ruby-prof'

profile = RubyProf::Profile.new(options)
result = profile.profile do
           ...
         end
Once a profile is completed, you can either generate a report via a printer or save the results for later analysis. For a list of profiling options, please see the Profiling Options section.

Advanced Usage

This section describes advanced usage of ruby-prof. Additional documentation for every class is also available.

Profiling Options

ruby-prof understands the following options when profiling code:

measure_mode
What ruby-prof should measure. For more information see the Measurement Mode section.
track_allocations
Tracks each object location, including the object class, memory size and source file location. For more information see the Allocation Tracking section.
exclude_threads
Array of threads which should not be profiled. For more information see the Thread Inclusion/Exclusion section.
include_threads
Array of threads which should be profiled. All other threads will be ignored. For more information see the Thread Inclusion/Exclusion section.

Measurement Mode

The measurement mode determines what ruby-prof measures when profiling code. Supported measurements are:

RubyProf::WALL_TIME

Wall time measures the real-world time elapsed between any two moments in seconds. If there are other processes concurrently running on the system that use significant CPU or disk time during a profiling run then the reported results will be larger than expected. On Windows, wall time is measured using GetTickCount(), on MacOS by mach_absolute_time, on Linux by clock_gettime and otherwise by gettimeofday.

RubyProf::PROCESS_TIME

Process time measures the time used by a process between any two moments in seconds. It is unaffected by other processes concurrently running on the system. Remember with process time that calls to methods like sleep will not be included in profiling results. On Windows, process time is measured using GetProcessTimes and on other platforms by clock_gettime.

RubyProf::ALLOCATIONS

Object allocations measures show how many objects each method in a program allocates. Measurements are done via Ruby's GC.stat api.

RubyProf::MEMORY

Memory measures how much memory each method in a program uses in bytes. Measurements are done via Ruby's TracePoint api.

To set the measurement mode:

RubyProf.measure_mode = RubyProf::WALL_TIME
RubyProf.measure_mode = RubyProf::PROCESS_TIME
RubyProf.measure_mode = RubyProf::ALLOCATIONS
RubyProf.measure_mode = RubyProf::MEMORY

The default value is RubyProf::WALL_TIME. You may also specify the measure mode by using the RUBY_PROF_MEASURE_MODE environment variable:

export RUBY_PROF_MEASURE_MODE=wall
export RUBY_PROF_MEASURE_MODE=process
export RUBY_PROF_MEASURE_MODE=allocations
export RUBY_PROF_MEASURE_MODE=memory

Allocation Tracking

ruby-prof also has the ability to track object allocations. This functionality can be turned on via the track_allocations option:

require 'ruby-prof'

RubyProf.profile(:track_allocations => true) do
  ...
end

Note the RubyProf::ALLOCATIONS measure mode is slightly different than tracking allocations. The measurement mode provides high level information about the number of allocations performed in each method. In contrast, tracking allocations provides detailed information about the type, number, memory usage and location of each allocation. Currently, to see allocations results you must use the RubyProf::GraphHtmlPrinter.

Thread Inclusion/Exclusion

ruby-prof can profile multiple threads. Sometimes this can be overwhelming. For example, assume you want to determine why your tests are running slowly.  If you are using minitest, it will run your tests in parallel by spawning tens of worker threads (note to tell minitest to use a single thread set the N environmental variable like this ENV = “0”). Thus, ruby-prof provides two options to specify which threads should be profiled:

exclude_threads::
    Array of threads which should not be profiled.

include_threads::
    Array of threads which should be profiled. All other threads will
    be ignored.

Method Exclusion

ruby-prof supports excluding specific methods and threads from profiling results. This is useful for reducing connectivity in the call graph, making it easier to identify the source of performance problems when using a graph printer. For example, consider Integer#times: it's hardly ever useful to know how much time is spent in the method itself. We are more interested in how much the passed in block contributes to the time spent in the method which contains the Integer#times call. The effect on collected metrics are identical to eliminating methods from the profiling result in a post process step.

profile = RubyProf::Profile.new(...)
profile.exclude_methods!(Integer, :times, ...)
profile.start

A convenience method is provided to exclude a large number of methods which usually clutter up profiles:

profile.exclude_common_methods!

However, this is a somewhat opinionated method collection. It's usually better to view it as an inspiration instead of using it directly (see github.com/ruby-prof/ruby-prof/blob/e087b7d7ca11eecf1717d95a5c5fea1e36ea3136/lib/ruby-prof/profile/exclude_common_methods.rb).

Merging Threads and Fibers

A common design pattern is to have a main thread delegate work to background threads or fibers (the rest of this section will just refer to worker threads to keep the text simpler). Examples include web servers, such as Puma and Falcon.

Understanding profiling results for worker threads can be very difficult because there may be hundreds or thousands of them. To help with this, ruby-prof includes the ability to merge results for worker threads that start with the same parent method. In the best case, this can collapse results into two threads - one for the parent thread and one for all worker threads.

Note the collapsed threads show the sum of times for all merged threads. For example, assume there are 10 worker threads that each took 5 seconds to run. The single merged thread will show a total time of 50 seconds.

To merge threads:

profile = RubyProf.profile do
            ...
          end
profile.merge!

Saving Results

It can be helpful to save the results of a profiling run for later analysis.  Results can be saved using Ruby's marshal library.

profile_1 = RubyProf.profile do
  ...
end

# Save the results
data = Marshal.dump(profile_1)

# Sometime later load the results profile_2 = Marshal.load(data)
!!!WARNING!!! - Only load ruby-prof profiles that you know are safe. Demarshaling data can lead to arbitrary code execution and thus can be dangerous.

Reports

Once you have completed a profiling run, ruby-prof provides a number of reports that you can use to analyze the results. Reports are created via the use of printers:

profile = RubyProf.profile do 
...
end
printer = RubyProf::GraphPrinter.new(profile) printer.print(STDOUT, :min_percent => 2)

The first parameter is any writable IO object such as STDOUT or a file. The second parameter specifies the minimum percentage a method must take to be printed. Percentages should be specified as integers in the range 0 to 100. For more information please see the documentation for the different printers.

The different types of reports, and their associated printers, are:

Flat Report (RubyProf::FlatPrinter)
The flat report show the overall time spent in each method. Its is a good way of quickly identifying which methods take the most time. An example of a flat report can be found in examples/primes.flat.txt.
Graph Reports (RubyProf::GraphPrinter)
The graph report shows the overall time spent in each method. In addition, it also show which methods call the current method and which methods its calls. Thus they are good for understanding how methods gets called and provide insight into the flow of your program. An example graph report is located at examples/primes.graph.txt.
HTML Graph Reports (RubyProf::GraphHtmlPrinter)
HTML Graph profiles are the same as graph reports, except output is generated in hyper-linked HTML. Since graph reports can be quite large, the embedded links make it much easier to navigate the results. An example html graph profile is located at examples/primes.graph.html.
Call stack reports (RubyProf::CallStackPrinter)
Call stack reports produce a HTML visualization of the time spent in each execution path of the profiled code. An example can be found at examples/primes.stack.html.
Call graphs (RubyProf::CallTreePrinter)
Call graphs output results in the calltree profile format which is used by KCachegrind. More information about the format can be found at the KCachegrind site. An example can be found at examples/primes.callgrind.out.20124.
Graphviz reports (RubyProf::DotPrinter)
The graphviz report is designed to be opened by Graphviz to create visualization of profile results. An example can be found at examples/primes.dot.
Multiple reports (RubyProf::MultiPrinter)
MultiPrinter which can generate several reports in one profiling run. MultiPrinter requires a directory path and a profile basename for the files they produce
printer = RubyProf::MultiPrinter.new(result)
printer.print(:path => ".", :profile => "profile")

Performance

Significant effort has been put into reducing ruby-prof's overhead. Our tests show that the overhead associated with profiling code varies considerably with the code being profiled. Most programs will run approximately twice as slow while highly recursive programs (like the fibonacci series test) will run up to five times slower.

Profiling Rails

To profile a Rails application it is vital to run it using production like settings (cache classes, cache view lookups, etc.). Otherwise, Rail's dependency loading code will overwhelm any time spent in the application itself (our tests show that Rails dependency loading causes a roughly 6x slowdown). The best way to do this is create a new Rails environment, profile.rb.

To profile Rails:

  1. Create a new profile.rb environment. Make sure to turn on cache_classes and cache_template_loading. Otherwise your profiling results will be overwhelmed by the time Rails spends loading required files. You should likely turn off caching.

  2. Add the ruby-prof to your gemfile:

    group :profile do
      gem 'ruby-prof'
    end
    
  3. Add the ruby prof rack adapter to your middleware stack. The Rails documentation describes several ways to do this. One way is to add the following code to application.rb:

    if Rails.env.profile?
      config.middleware.use Rack::RubyProf, :path => './tmp/profile'
    end
    

    The path is where you want profiling results to be stored. By default the rack adapter will generate a html call graph report and flat text report.

  4. Now make a request to your running server. New profiling information will be generated for each request. Note that each request will overwrite the profiling reports created by the previous request!

Architecture

To learn more about how ruby-prof works internally please see the architecture page.

Version 1.0

ruby-prof's development stretches all the way back to 2005. Fourteen years later, it seems time for version 1.0! Version 1.0 is a significant update that greatly improves the internals of ruby-prof and adds a number of improvements:

At the same time, it was also time to remove old code and deprecated methods. These changes include:

API Documentation

API documenation for each class is available.

License

See LICENSE for license information.

Development

Code is located at github.com/ruby-prof/ruby-prof