ruby-prof
Overview
ruby-prof is a profiler for MRI Ruby. Its features include:
-
Speed - it is a C extension and therefore many times faster than the standard Ruby profiler.
-
Measurement Modes - ruby-prof can measure program wall time, process time, object allocations and memory usage
-
Reports - ruby-prof can generate a variety of text and cross-referenced html reports
-
Threads - supports profiling multiple threads simultaneously
Why ruby-prof?
ruby-prof is helpful if your program is slow and you don't know why. It can help you track down methods that are either slow, allocate a large number of objects or allocate objects with high memory usage. Often times the results will be surprising - when profiling what you think you know almost always turns out to be wrong.
Since ruby-prof is built using ruby's C tracepoint api, it knows a lot about your program. However, using ruby-prof also comes with two caveats:
-
To use ruby-prof you generally need to include a few lines of extra code in your program (although see command line usage)
-
Using ruby-prof will cause your program to run slower (see Performance section)
Most of the time, these two caveats are acceptable. But if you need to determine why a program running in production is slow or hung, a sampling profiler will be a better choice. Excellent choices include stackprof or rbspy.
If you are just interested in memory usage, you may also want to checkout the memory_profiler gem (although ruby-prof provides similar information).
Installation
The easiest way to install ruby-prof is by using Ruby Gems. To install:
gem install ruby-prof
If you are running Linux or Unix you'll need to have a C compiler installed so the extension can be built when it is installed. If you are running Windows, then you should install the Windows specific gem or install install devkit.
ruby-prof requires Ruby 2.7.0 or higher.
!!!WARNING!!!
Ruby versions 2.7.0 and 2.7.1 have a bug that can cause ruby-prof to return incorrect results.
Getting Started
There are three ways to use ruby-prof:
- command line
- convenience API
- core API
Command Line
The easiest way to use ruby-prof is via the command line, which requires no modifications to your program. The basic usage is:
ruby-prof [options] <script.rb> [--] [script-options]"
Where script.rb is the program you want to profile.
For a full list of options, see RubyProf::Cmd documentation or execute the following command:
ruby-prof -h
Convenience API
The second way to use ruby-prof is via its convenience API. This requires small modifications to the program you want to profile:
require 'ruby-prof' profile = RubyProf::Profile.new # profile the code profile.start # ... code to profile ... result = profile.stop # print a flat profile to text printer = RubyProf::FlatPrinter.new(result) printer.print(STDOUT)
Alternatively, you can use a block to tell ruby-prof what to profile:
require 'ruby-prof' # profile the code result = RubyProf::Profile.profile do # ... code to profile ... end # print a graph profile to text printer = RubyProf::GraphPrinter.new(result) printer.print(STDOUT, {})ruby-prof also supports pausing and resuming profiling runs.
require 'ruby-prof' profile = RubyProf::Profile.new # profile the code profile.start # ... code to profile ... profile.pause # ... other code ... profile.resume # ... code to profile ... result = profile.stop
Note that resume will only work if start has been called previously. In addition, resume can also take a block:
require 'ruby-prof' profile = RubyProf::Profile.new # profile the code profile.start # ... code to profile ... profile.pause # ... other code ... profile.resume do # ... code to profile... end result = profile.stop
With this usage, resume will automatically call pause at the end of the block.
The RubyProf.profile
method can take various options,
which are described in the Profiling
Options below.
Core API
The convenience API is a wrapper around the the Ruby::Profile
class. Using the Profile class directly provides addition
functionality, such as method exclusion.
To create a new profile:
require 'ruby-prof' profile = RubyProf::Profile.new(options) result = profile.profile do ... endOnce a profile is completed, you can either generate a report via a printer or save the results for later analysis. For a list of profiling options, please see the Profiling Options section.
Advanced Usage
This section describes advanced usage of ruby-prof. Additional documentation for every class is also available.
Profiling Options
ruby-prof understands the following options when profiling code:
- measure_mode
- What ruby-prof should measure. For more information see the Measurement Mode section.
- track_allocations
- Tracks each object location, including the object class, memory size and source file location. For more information see the Allocation Tracking section.
- exclude_threads
- Array of threads which should not be profiled. For more information see the Thread Inclusion/Exclusion section.
- include_threads
- Array of threads which should be profiled. All other threads will be ignored. For more information see the Thread Inclusion/Exclusion section.
Measurement Mode
The measurement mode determines what ruby-prof measures when profiling code. Supported measurements are:
- RubyProf::WALL_TIME
-
Wall time measures the real-world time elapsed between any two moments in seconds. If there are other processes concurrently running on the system that use significant CPU or disk time during a profiling run then the reported results will be larger than expected. On Windows, wall time is measured using
GetTickCount()
, on MacOS bymach_absolute_time
, on Linux byclock_gettime
and otherwise bygettimeofday
. - RubyProf::PROCESS_TIME
-
Process time measures the time used by a process between any two moments in seconds. It is unaffected by other processes concurrently running on the system. Remember with process time that calls to methods like sleep will not be included in profiling results. On Windows, process time is measured using
GetProcessTimes
and on other platforms byclock_gettime
. - RubyProf::ALLOCATIONS
-
Object allocations measures show how many objects each method in a program allocates. Measurements are done via Ruby's
GC.stat
api. - RubyProf::MEMORY
-
Memory measures how much memory each method in a program uses in bytes. Measurements are done via Ruby's TracePoint
api.
To set the measurement mode:
RubyProf.measure_mode = RubyProf::WALL_TIME RubyProf.measure_mode = RubyProf::PROCESS_TIME RubyProf.measure_mode = RubyProf::ALLOCATIONS RubyProf.measure_mode = RubyProf::MEMORY
The default value is RubyProf::WALL_TIME
. You may also
specify the measure mode by using the RUBY_PROF_MEASURE_MODE
environment variable:
export RUBY_PROF_MEASURE_MODE=wall export RUBY_PROF_MEASURE_MODE=process export RUBY_PROF_MEASURE_MODE=allocations export RUBY_PROF_MEASURE_MODE=memory
Allocation Tracking
ruby-prof also has the ability to track object allocations. This functionality can be turned on via the track_allocations option:
require 'ruby-prof' RubyProf.profile(:track_allocations => true) do ... end
Note the RubyProf::ALLOCATIONS
measure mode is slightly
different than tracking allocations. The measurement mode provides
high level information about the number of allocations performed in
each method. In contrast, tracking allocations provides detailed
information about the type, number, memory usage and location of each
allocation. Currently, to see allocations results you must use the RubyProf::GraphHtmlPrinter
.
Thread Inclusion/Exclusion
ruby-prof can profile multiple threads. Sometimes this can be overwhelming. For example, assume you want to determine why your tests are running slowly. If you are using minitest, it will run your tests in parallel by spawning tens of worker threads (note to tell minitest to use a single thread set the N environmental variable like this ENV = 0). Thus, ruby-prof provides two options to specify which threads should be profiled:
exclude_threads:: Array of threads which should not be profiled. include_threads:: Array of threads which should be profiled. All other threads will be ignored.
Method Exclusion
ruby-prof supports excluding specific methods and threads from
profiling results. This is useful for reducing connectivity in the
call graph, making it easier to identify the source of performance
problems when using a graph printer. For example, consider Integer#times
:
it's hardly ever useful to know how much time is spent in the method
itself. We are more interested in how much the passed in block
contributes to the time spent in the method which contains the Integer#times
call. The effect on collected metrics are identical to eliminating
methods from the profiling result in a post process step.
profile = RubyProf::Profile.new(...) profile.exclude_methods!(Integer, :times, ...) profile.start
A convenience method is provided to exclude a large number of methods which usually clutter up profiles:
profile.exclude_common_methods!
However, this is a somewhat opinionated method collection. It's usually better to view it as an inspiration instead of using it directly (see github.com/ruby-prof/ruby-prof/blob/e087b7d7ca11eecf1717d95a5c5fea1e36ea3136/lib/ruby-prof/profile/exclude_common_methods.rb).
Merging Threads and Fibers
A common design pattern is to have a main thread delegate work to background threads or fibers (the rest of this section will just refer to worker threads to keep the text simpler). Examples include web servers, such as Puma and Falcon.
Understanding profiling results for worker threads can be very difficult because there may be hundreds or thousands of them. To help with this, ruby-prof includes the ability to merge results for worker threads that start with the same parent method. In the best case, this can collapse results into two threads - one for the parent thread and one for all worker threads.
Note the collapsed threads show the sum of times for all merged threads. For example, assume there are 10 worker threads that each took 5 seconds to run. The single merged thread will show a total time of 50 seconds.
To merge threads:
profile = RubyProf.profile do ... end profile.merge!
Saving Results
It can be helpful to save the results of a profiling run for later analysis. Results can be saved using Ruby's marshal library.
profile_1 = RubyProf.profile do ... end # Save the results data = Marshal.dump(profile_1)!!!WARNING!!! - Only load ruby-prof profiles that you know are safe. Demarshaling data can lead to arbitrary code execution and thus can be dangerous.
# Sometime later load the results profile_2 = Marshal.load(data)
Reports
Once you have completed a profiling run, ruby-prof provides a number of reports that you can use to analyze the results. Reports are created via the use of printers:
profile = RubyProf.profile do
...
end
printer = RubyProf::GraphPrinter.new(profile) printer.print(STDOUT, :min_percent => 2)
The first parameter is any writable IO object such as STDOUT
or a file. The second parameter specifies the minimum percentage a
method must take to be printed. Percentages should be specified as
integers in the range 0 to 100. For more information please see the
documentation for the different printers.
The different types of reports, and their associated printers, are:
- Flat Report (RubyProf::FlatPrinter)
- The flat report show the overall time spent in each method. Its is a good way of quickly identifying which methods take the most time. An example of a flat report can be found in examples/primes.flat.txt.
- Graph Reports (RubyProf::GraphPrinter)
- The graph report shows the overall time spent in each method. In
addition, it also show which methods call the current method and
which methods its calls. Thus they are good for understanding how
methods gets called and provide insight into the flow of your
program. An example graph report is located at examples/primes.graph.txt.
- HTML Graph Reports (RubyProf::GraphHtmlPrinter)
- HTML Graph profiles are the same as graph reports, except output is generated in hyper-linked HTML. Since graph reports can be quite large, the embedded links make it much easier to navigate the results. An example html graph profile is located at examples/primes.graph.html.
- Call stack reports (RubyProf::CallStackPrinter)
- Call stack reports produce a HTML visualization of the time spent in each execution path of the profiled code. An example can be found at examples/primes.stack.html.
- Call graphs (RubyProf::CallTreePrinter)
- Call graphs output results in the calltree profile format which is used by KCachegrind. More information about the format can be found at the KCachegrind site. An example can be found at examples/primes.callgrind.out.20124.
- Graphviz reports (RubyProf::DotPrinter)
- The graphviz report is designed to be opened by Graphviz to create visualization of profile results. An example can be found at examples/primes.dot.
- Multiple reports (RubyProf::MultiPrinter)
- MultiPrinter which can generate several reports in one profiling
run.
MultiPrinter
requires a directory path and a profile basename for the files they produceprinter = RubyProf::MultiPrinter.new(result) printer.print(:path => ".", :profile => "profile")
Performance
Significant effort has been put into reducing ruby-prof's overhead. Our tests show that the overhead associated with profiling code varies considerably with the code being profiled. Most programs will run approximately twice as slow while highly recursive programs (like the fibonacci series test) will run up to five times slower.
Profiling Rails
To profile a Rails application it is vital to run it using production like settings (cache classes, cache view lookups, etc.). Otherwise, Rail's dependency loading code will overwhelm any time spent in the application itself (our tests show that Rails dependency loading causes a roughly 6x slowdown). The best way to do this is create a new Rails environment, profile.rb.
To profile Rails:
-
Create a new profile.rb environment. Make sure to turn on
cache_classes
andcache_template_loading
. Otherwise your profiling results will be overwhelmed by the time Rails spends loading required files. You should likely turn off caching. -
Add the ruby-prof to your gemfile:
group :profile do gem 'ruby-prof' end
-
Add the ruby prof rack adapter to your middleware stack. The Rails documentation describes several ways to do this. One way is to add the following code to
application.rb
:if Rails.env.profile?
config.middleware.use
Rack::RubyProf, :path => './tmp/profile' endThe path is where you want profiling results to be stored. By default the rack adapter will generate a html call graph report and flat text report.
-
Now make a request to your running server. New profiling information will be generated for each request. Note that each request will overwrite the profiling reports created by the previous request!
Architecture
To learn more about how ruby-prof works internally please see the architecture page.
Version 1.0
ruby-prof's development stretches all the way back to 2005. Fourteen years later, it seems time for version 1.0! Version 1.0 is a significant update that greatly improves the internals of ruby-prof and adds a number of improvements:
-
Profiling is significantly faster (5x in some cases)
- Recursive profiles are finally handled correctly. Yeah!!!
- Redesigned reports.
-
The ability to measure allocations and memory usage using a standard (unpatched) version of ruby
-
The ability to save and reload profiling results for later analysis
-
The ability track object allocations
- New documentation website.
At the same time, it was also time to remove old code and deprecated methods. These changes include:
-
Drop support for old versions of ruby. Currently 2.4 and up are required.
-
Remove support for method elimination (instead use method exclusion).
-
Remove the CPU_TIME measurement mode since it duplicates the PROCESS_TIME mode and required inline assembly code
-
Remove the GC_TIME and GC_RUNS measurement modes since they required patched ruby versions. In modern ruby it is possible to reimplement this functionality, but it does not seem of that much value (shout out if you want it back).
- Merge the
RubyProf::FlatPrinterWithLineNumbers
printer intoRubyProf::FlatPrinter
. - Removed support for RUBY_PROF_EDITOR_URI environment variable that specified the link scheme. This features was more confusing then helpful.
API Documentation
API documenation for each class is available.
License
See LICENSE for license information.
Development
Code is located at github.com/ruby-prof/ruby-prof