Pull Request Analytics: How to Visualize Cycle Time / Lead Time and Get Insights for Improvement

March 30, 2021
#Reporting#How To#Bitbucket
12 min

Cycle Time / Lead Time is one of the most important metrics for software development. It can tell a lot about the efficiency of the development process and the teams’ speed and capacity. In the previous article, we showed you how to get a detailed report with the pull request statistics and Cycle Time / Lead Time calculated on the repository level. 

Today we’ll tell you how to use this report:

  • How to visualize the pull request data.
  • What things to pay attention to.
  • What insights you can get to improve performance.

Please note that we define Cycle Time / Lead Time as the time between the developer’s first commit and the moment the pull request is merged, and we will refer to it as Cycle Time throughout the article.

Analyzing your codebase

First, you need to understand the current state of affairs and how it compares to the industry standards. According to Code Climate’s research, the industry-wide median for Cycle Time is 3.4 days, with only the top 25% managing to keep it as low as 1.8 days and the bottom 25% having a Cycle Time of 6.2 days.

© Code Climate
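
As a rough illustration, the figures quoted above can be turned into a small lookup that places your own Cycle Time relative to them. This is a minimal sketch: the function name and the bucket labels are hypothetical, and the thresholds are simply the Code Climate numbers cited in this article.

```python
# Thresholds in days, taken from the Code Climate figures quoted above.
TOP_QUARTILE_DAYS = 1.8
MEDIAN_DAYS = 3.4
BOTTOM_QUARTILE_DAYS = 6.2


def benchmark_bucket(cycle_time_days):
    """Place a Cycle Time (in days) relative to the quoted industry figures."""
    if cycle_time_days <= TOP_QUARTILE_DAYS:
        return 'top 25%'
    if cycle_time_days <= MEDIAN_DAYS:
        return 'better than median'
    if cycle_time_days < BOTTOM_QUARTILE_DAYS:
        return 'worse than median'
    return 'bottom 25%'


# Hypothetical value: a team whose median Cycle Time is 3 days.
bucket = benchmark_bucket(3.0)
```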

To get a better understanding of the development process, it might be helpful to look at the teams’ dynamics and monitor the changes over time. The following chart shows how the average Cycle Time changes month after month with a trend line, so you can see objectively whether the development process is getting faster or slower and check how your rates compare to the industry average. Follow the instructions to build this chart.
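
If you prefer to compute the same numbers outside Confluence, the monthly averages and the trend can be derived from the CSV report directly. The sketch below assumes the report has the `created` and `cycle_time_d` columns produced by the script from the previous article; the function names and the sample data are hypothetical.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def monthly_average_cycle_time(csv_file):
    """Return {'YYYY-MM': average cycle_time_d} for PRs that have a Cycle Time."""
    totals = defaultdict(lambda: [0.0, 0])  # month -> [sum of days, PR count]
    for row in csv.DictReader(csv_file):
        if not row['cycle_time_d']:
            continue  # open PRs have no closing date, so no Cycle Time
        month = datetime.fromisoformat(row['created']).strftime('%Y-%m')
        totals[month][0] += float(row['cycle_time_d'])
        totals[month][1] += 1
    return {month: round(total / count, 2)
            for month, (total, count) in sorted(totals.items())}


def trend_slope(values):
    """Least-squares slope of values against their position (days per month)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical sample in the same layout as the report.
sample = io.StringIO(
    'created,cycle_time_d\n'
    '2021-01-05 10:00:00,2.0\n'
    '2021-01-20 09:30:00,4.0\n'
    '2021-02-03 14:15:00,5.0\n'
    '2021-02-10 08:00:00,\n'
)
averages = monthly_average_cycle_time(sample)
slope = trend_slope(list(averages.values()))
```

A positive slope means the average Cycle Time is growing from month to month. Note that this sketch treats the months as evenly spaced positions, so a month without any closed pull requests is simply skipped.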

For a more precise analysis and evaluation of the current code base, you can also use the Cycle Time distribution chart that provides pull request statistics aggregated by their Cycle time value, making it easy to spot the outliers for further investigation. Learn how to build this chart.
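
The aggregation behind such a distribution chart can be sketched in a few lines. This is an illustration only, assuming you already have a list of Cycle Time values in days (for example, the `cycle_time_d` column of the report); the function name and the one-day bucket size are hypothetical choices.

```python
from collections import Counter


def cycle_time_distribution(cycle_times_days, bucket_size_days=1):
    """Count pull requests per Cycle Time bucket, e.g. 0-1 d, 1-2 d, ..."""
    counts = Counter(int(value // bucket_size_days) for value in cycle_times_days)
    return {
        f'{index * bucket_size_days}-{(index + 1) * bucket_size_days} d': counts[index]
        for index in sorted(counts)
    }


# Hypothetical values; the last one is the kind of outlier worth investigating.
distribution = cycle_time_distribution([0.4, 0.9, 1.5, 2.2, 2.8, 14.3])
```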

In addition to the Cycle Time, Awesome Graphs for Bitbucket lets you analyze the pull request resolution time out-of-the-box. Using the Resolution Time Distribution report, you can see how long it takes pull requests to merge or decline, find the shortest and longest pull requests, and predict the resolution time of future pull requests with the historical data.

Cycle Time is a great indicator of success: by keeping it low, you can increase the output and efficiency of your teams. However, it’s not diagnostic by itself and can’t tell you what you are doing right or wrong. To understand why it is high or low, you’ll need to dig deeper into the metrics it consists of. The chart below gives you a general overview of the pull requests on the repository level and shows the Cycle Time along with the percentage each stage contributes to it (we’ll discuss the stages in detail in the following paragraphs). You can build a chart like this using the Chart from Table macro, available in the Table Filter and Charts app.
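
For reference, the stage percentages can also be calculated from the report rows themselves. The following is a minimal sketch, assuming rows shaped like the CSV report (columns `time_to_open_d`, `time_to_review_d`, `time_to_approve_d`, `time_to_merge_d`); the function name and the sample rows are hypothetical.

```python
STAGES = ('time_to_open_d', 'time_to_review_d', 'time_to_approve_d', 'time_to_merge_d')


def stage_shares(rows):
    """Return each stage's percentage of the summed stage time across rows.

    Rows with any missing stage are skipped so that the four shares always
    describe the same set of pull requests.
    """
    totals = dict.fromkeys(STAGES, 0.0)
    for row in rows:
        if any(row.get(stage) in (None, '') for stage in STAGES):
            continue
        for stage in STAGES:
            totals[stage] += float(row[stage])
    grand_total = sum(totals.values())
    if grand_total == 0:
        return dict.fromkeys(STAGES, 0.0)
    return {stage: round(100 * total / grand_total, 1) for stage, total in totals.items()}


# Hypothetical rows; the third one has no review yet and is ignored.
rows = [
    {'time_to_open_d': '1.0', 'time_to_review_d': '0.5',
     'time_to_approve_d': '0.5', 'time_to_merge_d': '2.0'},
    {'time_to_open_d': '3.0', 'time_to_review_d': '0.5',
     'time_to_approve_d': '1.5', 'time_to_merge_d': '1.0'},
    {'time_to_open_d': '2.0', 'time_to_review_d': '',
     'time_to_approve_d': '', 'time_to_merge_d': ''},
]
shares = stage_shares(rows)
```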

Breaking down the Cycle Time

We break down Cycle Time into four stages:

  • Time to open (from the first commit to open)
  • Time waiting for review (from open to the first comment)
  • Time to approve (from the first comment to approved)
  • Time to merge (from approved to merge)

Now we’ll go through each of these stages, discussing the things to pay attention to.

Time to Open

This metric is arguably the most important of all, as it influences all the later stages and, according to the research, pull requests that open faster tend to merge faster.
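
You can check whether this relationship holds for your own repository by correlating Time to Open with Cycle Time. The sketch below is one possible way to do it with a plain Pearson correlation; the function name and the sample values are hypothetical, and the inputs are assumed to be the `time_to_open_d` and `cycle_time_d` columns of the report.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError('need two sequences of equal length with at least 2 items')
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError('correlation is undefined for a constant sequence')
    return covariance / (spread_x * spread_y)


# Hypothetical values in days for four merged pull requests.
time_to_open = [0.5, 1.0, 2.0, 4.0]
cycle_time = [1.5, 2.5, 4.0, 9.0]
correlation = pearson_correlation(time_to_open, cycle_time)
```

Keep in mind that Time to Open is itself a part of Cycle Time, so some positive correlation is expected by construction. Correlating Time to Open with the sum of the remaining stages gives a stricter check.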

A long Time to Open might indicate that the developer had to switch tasks and/or that the code was rewritten, which might also result in large batch sizes. In one of the previous articles, we described how you can check the size of your pull requests in Bitbucket, so you can also use it for a deeper analysis.

One way to improve your Time to Open is to keep pull requests to no more than 200 to 400 lines of code. This affects every stage of the cycle, as smaller pull requests are more likely to be reviewed thoroughly and approved sooner.

Time to Review

Time to Review is a great metric to understand if your teams adopted Code Review as part of the daily routine. If it’s high, then it might not be part of their habit, and you’ll need to foster this culture. Another reason might be that the pull requests are not review-friendly and the reviewers procrastinate dealing with them. You can change this, once again, by keeping the pull request size small and by writing a reasonable description so it’s easier to get started with them. If the long Time to Review rate is caused by organizational issues, then it might require reprioritization.

Time to Approve

This is the stage you don’t really want to minimize but rather make consistent by reducing inefficiencies in the code review process. While there are many strategies for Code Review, there is hardly any industry standard for Code Review metrics, so you’ll need to focus on the organization of the process and try to find a way to get constructive feedback.

Time to Merge

A long Time to Merge might indicate that there are obstacles in the delivery workflow. To improve it, you need to find out if there are any blockers in the process, including manual deployment, and check if your tooling satisfies your current needs.

Wrapping up

Cycle Time’s importance is difficult to overestimate: this metric can tell a lot about the way you work, and by controlling it, you can optimize the development process and deliver faster.

Once again, we built the initial pull request report with the help of the Awesome Graphs for Bitbucket app as a data provider and used the Table Filter and Charts for Confluence app to aggregate and visualize the data.

These are just a few examples, but you can get much more even from this one report. Check out the other guides for charts based on data from Bitbucket. Share your feedback and ideas in the comments, and we’ll try to cover them in future posts.

Pull Request Analytics: How to Get Pull Request Cycle Time / Lead Time for Bitbucket

March 23, 2021
#Reporting#How To#Bitbucket
13 min

What Cycle Time is and why it is important

Pull Request Cycle Time / Lead Time is a powerful metric to look at while evaluating the engineering teams’ productivity. It helps track the development process from the first moment the code was written in a developer’s IDE and up to the time it’s deployed to production.

Please note that we define Cycle Time / Lead Time as the time between the developer’s first commit and the moment the pull request is merged, and we will refer to it as Cycle Time throughout the article.

Having this information, you can get an unbiased view of the engineering department’s speed and capacity and find the points to drive improvement. It can also be an indicator of business success as, by controlling the Cycle Time, you can increase the output and efficiency to deliver products faster.

This article will show you how to get a detailed pull request report with the Cycle Time and the related metrics calculated on the repository level. The metrics include:

  • Time to open (from the first commit to open)
  • Time waiting for review (from open to the first comment)
  • Time to approve (from the first comment to approved)
  • Time to merge (from approved to merge)

How to get Time to Open, Time to Review, Time to Approve, and Time to Merge metrics

We can get all the necessary pull request data from Awesome Graphs for Bitbucket and its REST API combined with Bitbucket’s REST API resources. We’ll use Python to make requests to the APIs, calculate and aggregate the data, and then save it as a CSV file.

The following script will do all this work for us:

import sys
import requests
import csv
from dateutil import parser
from datetime import datetime
 
bitbucket_url = sys.argv[1]
login = sys.argv[2]
password = sys.argv[3]
project = sys.argv[4]
repository = sys.argv[5]
since = sys.argv[6]
until = sys.argv[7]
 
s = requests.Session()
s.auth = (login, password)
 
 
class PullRequest:
 
    def __init__(self, pr_id, title, author, state, created, closed):
        self.pr_id = pr_id
        self.title = title
        self.author = author
        self.state = state
        self.created = created
        self.closed = closed
 
 
def parse_date_ag_rest(date):
    return parser.isoparse(date).replace(tzinfo=None, microsecond=0)
 
 
def get_date_from_timestamp(timestamp):
    return datetime.fromtimestamp(timestamp / 1000).replace(microsecond=0)
 
 
def subtract_dates(minuend, subtrahend):
    if minuend is None or subtrahend is None:
        return None
    else:
        return round(((minuend - subtrahend).total_seconds() / 86400), 2)
 
 
def get_pull_requests():
 
    pull_request_list = []
 
    get_prs_url = bitbucket_url + '/rest/awesome-graphs-api/latest/projects/' + project + '/repos/' + repository \
        + '/pull-requests'
 
    is_last_page = False
 
    while not is_last_page:
 
        response = s.get(get_prs_url, params={'start': len(pull_request_list), 'limit': 1000,
                                              'sinceDate': since, 'untilDate': until}).json()
 
        for pr_details in response['values']:
 
            pd_id = pr_details['id']
            title = pr_details['title']
            author = pr_details['author']['user']['emailAddress']
            state = pr_details['state']
            created = parse_date_ag_rest(pr_details['createdDate'])
 
            if pr_details['closed'] is True:
                closed = parse_date_ag_rest(pr_details['closedDate'])
            else:
                closed = None
 
            pull_request_list.append(PullRequest(pd_id, title, author, state, created, closed))
 
        is_last_page = response['isLastPage']
 
    return pull_request_list
 
 
def get_first_commit_time(pull_request):
 
    commit_dates = []
 
    commits_url = bitbucket_url + '/rest/api/latest/projects/' + project + '/repos/' + repository + '/pull-requests/' \
        + str(pull_request.pr_id) + '/commits'
 
    is_last_page = False
 
    while not is_last_page:
 
        commits_response = s.get(commits_url, params={'start': len(commit_dates), 'limit': 500}).json()
 
        for commit in commits_response['values']:
            commit_timestamp = commit['authorTimestamp']
            commit_dates.append(get_date_from_timestamp(commit_timestamp))
 
        is_last_page = commits_response['isLastPage']
 
    if not commit_dates:
        first_commit = None
    else:
        first_commit = commit_dates[-1]
 
    return first_commit
 
 
def get_pr_activities(pull_request):
 
    counter = 0
    comment_dates = []
    approval_dates = []
 
    pr_url = bitbucket_url + '/rest/api/latest/projects/' + project + '/repos/' + repository + '/pull-requests/' \
        + str(pull_request.pr_id) + '/activities'
 
    is_last_page = False
 
    while not is_last_page:
 
        pr_response = s.get(pr_url, params={'start': counter, 'limit': 500}).json()
 
        for pr_activity in pr_response['values']:
 
            counter += 1
 
            if pr_activity['action'] == 'COMMENTED':
                comment_timestamp = pr_activity['comment']['createdDate']
                comment_dates.append(get_date_from_timestamp(comment_timestamp))
            elif pr_activity['action'] == 'APPROVED':
                approval_timestamp = pr_activity['createdDate']
                approval_dates.append(get_date_from_timestamp(approval_timestamp))
 
        # Update once per page (outside the for loop) so that an empty page cannot cause an endless loop.
        is_last_page = pr_response['isLastPage']
 
    if not comment_dates:
        first_comment_date = None
    else:
        first_comment_date = comment_dates[-1]
 
    if not approval_dates:
        approval_time = None
    else:
        approval_time = approval_dates[0]
 
    return first_comment_date, approval_time
 
 
print('Collecting a list of pull requests from the repository', repository)
 
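# mode='w' overwrites a report left by a previous run instead of appending a second header and duplicate rows to it.
with open(f'{project}_{repository}_prs_cycle_time_{since}_{until}.csv', mode='w', newline='') as report_file: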
    report_writer = csv.writer(report_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    report_writer.writerow(['id',
                            'title',
                            'author',
                            'state',
                            'first_commit',
                            'created',
                            'first_comment',
                            'approved',
                            'closed',
                            'cycle_time_d',
                            'time_to_open_d',
                            'time_to_review_d',
                            'time_to_approve_d',
                            'time_to_merge_d'])
 
    for pull_request in get_pull_requests():
 
        print('Processing pull request', pull_request.pr_id)
 
        first_commit_time = get_first_commit_time(pull_request)
 
        first_comment, approval = get_pr_activities(pull_request)
 
        cycle_time = subtract_dates(pull_request.closed, first_commit_time)
 
        time_to_open = subtract_dates(pull_request.created, first_commit_time)
 
        time_to_review = subtract_dates(first_comment, pull_request.created)
 
        time_to_approve = subtract_dates(approval, first_comment)
 
        time_to_merge = subtract_dates(pull_request.closed, approval)
 
        report_writer.writerow([pull_request.pr_id,
                                pull_request.title,
                                pull_request.author,
                                pull_request.state,
                                first_commit_time,
                                pull_request.created,
                                first_comment,
                                approval,
                                pull_request.closed,
                                cycle_time,
                                time_to_open,
                                time_to_review,
                                time_to_approve,
                                time_to_merge])
 
print('The resulting CSV file is saved to the current folder.')

To make this script work, you’ll need to pre-install the requests and dateutil modules (the latter is published on PyPI as python-dateutil). The csv, sys, and datetime modules are available in Python out of the box. You need to pass the following arguments to the script when executed:

  • the URL of your Bitbucket, 
  • login, 
  • password, 
  • project key, 
  • repository slug, 
  • since date (to include PRs created after), 
  • until date (to include PRs created before).

Here’s an example:

py script.py https://bitbucket.your-company-name.com login password PRKEY repo-slug 2020-11-30 2021-02-01

Once the script has finished, the resulting file will be saved to the current working directory (the folder you ran the script from).

What to do with the report

After you have generated the CSV file, you can process it in analytics tools such as Tableau, PowerBI, Qlik, or Looker, visualize the data on your Confluence pages with the Table Filter and Charts for Confluence app, or integrate it into any custom solution of your choice for further analysis.

An example of the data visualized with Table Filter and Charts for Confluence.

By measuring Cycle Time, you can:

  • See objectively whether the development process is getting faster or slower.
  • Analyze the correlation of the specific metrics with the overall cycle time (e.g., pull requests that open faster, merge faster).
  • Compare the results of the particular teams and users within the organization or across the industry.
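
As an example of the last point, the report can be grouped by author with a few lines of Python. This is a minimal sketch, assuming the `author`, `state`, and `cycle_time_d` columns written by the script above and that merged pull requests carry the state `MERGED` (compared case-insensitively here); the function name and the sample data are hypothetical.

```python
import csv
import io
from collections import defaultdict
from statistics import median


def median_cycle_time_by_author(csv_file):
    """Return {author: median Cycle Time in days} for merged pull requests."""
    by_author = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Declined and still-open pull requests are left out of the comparison.
        if row['state'].upper() == 'MERGED' and row['cycle_time_d']:
            by_author[row['author']].append(float(row['cycle_time_d']))
    return {author: median(values) for author, values in sorted(by_author.items())}


# Hypothetical sample in the same layout as the report.
sample = io.StringIO(
    'author,state,cycle_time_d\n'
    'ann@example.com,MERGED,2.0\n'
    'ann@example.com,MERGED,4.0\n'
    'bob@example.com,MERGED,1.5\n'
    'bob@example.com,DECLINED,9.0\n'
    'bob@example.com,OPEN,\n'
)
medians = median_cycle_time_by_author(sample)
```

The median is used instead of the average so that a single long-running pull request doesn’t dominate an author’s result.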

What’s next?

The report described in this article is built with the help of the Awesome Graphs for Bitbucket app as a data provider, available for Bitbucket Server and Data Center. Using it, you can gain more visibility into the development process to analyze patterns and find bottlenecks.

If you want to learn more about how to use Cycle Time and the related metrics, write in the comments below and upvote this post, and in future posts we’ll show you how to visualize the data, what to look at, and how to get insights from it!

How to Export Commit and Pull Request Data from Bitbucket to CSV

November 26, 2020
#How To#Bitbucket#Reporting
11 min

Being a universal file type, CSV serves as a go-to format for integrations between applications. It allows for transferring a large amount of data across systems, even if the integration is not supported natively. However, you can’t export commit and pull request data from Bitbucket out of the box. The good news is that Awesome Graphs for Bitbucket gives you the capability to export to CSV in different ways.

In this article, we’ll show you how you can use the app to export engineering data to CSV for further integration, organization, and processing in analytics tools and custom solutions.

What you will get

The described ways of exporting will give you two kinds of generated CSV files, depending on the type of data exported. 

In the case of commit data, you’ll get a list of commits with their details:

list of commits with details

And the resulting CSV with a list of pull requests will look like this:

export commit and pull request data to csv

Exporting from the People page

You can export raw commit and pull request data to CSV directly from Bitbucket. When you click All users in the People dropdown menu at the header, you’ll get to the People page with a global overview of developers’ activity in terms of commits or pull requests.

At the top-right corner, you’ll notice the Export menu, where you can choose CSV.

By default, the page shows contributions made within a month, but you can choose a longer period up to a quarter. The filtering applies not only to the GUI but also to the data exported, so if you don’t change the timespan, you’ll get a list of commits or pull requests for the last 30 days.

Exporting via the REST API resources

Beginning with version 5.5.0, Awesome Graphs REST API allows you to retrieve and export commit and pull request data to CSV on global, project, repository, and user levels, using the dedicated resources. This functionality is intended to automate the processes you used to handle manually and streamline the existing workflows.

You can access the in-app documentation (accessible to Awesome Graphs’ users) by choosing Export → REST API on the People page or go to our documentation website.

We’ll show you two examples of the resources and how they work: one for exporting commits and another for pull requests. You’ll be able to use the rest of the resources in the same way, as they follow the same model.

Export commits to CSV

This resource exports a list of commits with their details from all Bitbucket projects and repositories to a CSV file.

Here is the curl request example:

curl -X GET -u username:password "https://bitbucket.your-company-name.com/rest/awesome-graphs-api/latest/commits/export/csv" --output commits.csv

Alternatively, you can use any REST API client like Postman or put the URL directly into your browser’s address bar (you need to be authenticated in Bitbucket in this browser), and you’ll get a generated CSV file.

By default, it exports the data for the last 30 days. You can set a timeframe for exported data up to one year (366 days) with sinceDate / untilDate parameters:

curl -X GET -u username:password "https://bitbucket.your-company-name.com/rest/awesome-graphs-api/latest/commits/export/csv?sinceDate=2020-10-01&untilDate=2020-10-13" --output commits.csv
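
If you need more history than a single request allows, one option is to split the period into several windows and export each of them separately. The sketch below only computes the windows; the function name is hypothetical, and it assumes that both sinceDate and untilDate are inclusive, which you should verify against the resource documentation.

```python
from datetime import date, timedelta

MAX_SPAN_DAYS = 366  # limit quoted above for a single export


def split_date_range(since, until, max_span_days=MAX_SPAN_DAYS):
    """Split [since, until] into consecutive windows of at most max_span_days."""
    windows = []
    start = since
    while start <= until:
        # Each window covers at most max_span_days calendar days, both ends included.
        end = min(start + timedelta(days=max_span_days - 1), until)
        windows.append((start.isoformat(), end.isoformat()))
        start = end + timedelta(days=1)
    return windows


# Two years of history end up in two windows.
windows = split_date_range(date(2019, 1, 1), date(2020, 12, 31))
```

Each pair can then be passed as the sinceDate and untilDate values of a separate export request.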

For commit resources, you can also use the query parameters such as merges to filter merge/non-merge commits or order to specify the order to return commits in.

Read more about the resource and its parameters.

Export pull requests to CSV

The pull request resources work similarly, so to export a list of pull requests with their details from all Bitbucket projects and repositories to a CSV file, make the following curl request:

curl -X GET -u username:password "https://bitbucket.your-company-name.com/rest/awesome-graphs-api/latest/pull-requests/export/csv" --output pullrequests.csv

The sinceDate / untilDate parameters can also be applied to state the timespan up to a year, but here you have an additional parameter dateType, allowing you to choose either the creation date or the date of the last update as a filtering criterion. So, if you set dateType to created, only the pull requests created during the stated period will be returned, while dateType set to updated will include the pull requests that were updated within the time frame.

Another pull request specific parameter is state, which allows you to filter the response to only include open, merged, or declined pull requests.

For example, the following request will return a list of open pull requests, which were updated between October 1st and October 13th:

curl -X GET -u username:password "https://bitbucket.your-company-name.com/rest/awesome-graphs-api/latest/pull-requests/export/csv?dateType=updated&state=open&sinceDate=2020-10-01&untilDate=2020-10-13" --output pullrequests.csv
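
When you generate such requests from a script, it can help to build the URL from named parameters and reject unsupported values early. This is a minimal sketch: the function name is hypothetical, the path is the pull request export resource shown earlier in this section, and the allowed values are the ones listed in this article.

```python
from urllib.parse import urlencode

ALLOWED_STATES = {'open', 'merged', 'declined'}
ALLOWED_DATE_TYPES = {'created', 'updated'}


def build_pr_export_url(base_url, since_date, until_date, date_type, state=None):
    """Build the URL of the global pull request CSV export with filters."""
    if date_type not in ALLOWED_DATE_TYPES:
        raise ValueError(f'unsupported dateType: {date_type}')
    params = {'dateType': date_type, 'sinceDate': since_date, 'untilDate': until_date}
    if state is not None:
        if state not in ALLOWED_STATES:
            raise ValueError(f'unsupported state: {state}')
        params['state'] = state
    path = '/rest/awesome-graphs-api/latest/pull-requests/export/csv'
    return base_url.rstrip('/') + path + '?' + urlencode(params)


# Open pull requests updated between October 1st and October 13th.
url = build_pr_export_url('https://bitbucket.your-company-name.com',
                          '2020-10-01', '2020-10-13',
                          date_type='updated', state='open')
```

The resulting URL can be passed to curl, as in the requests above, or to any HTTP client of your choice.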

Learn more about this resource.

Integrate intelligently

While CSV is supported by many systems and is easy to manage, it is not the only integration option the Awesome Graphs for Bitbucket app offers. Using the REST API, you can make the data flow between the applications and automate the workflow, eliminating manual work. We want to make this easier for you and save you time.

Let us know what integrations you are interested in, and we’ll try to bring them to you, so you don’t have to spend time and energy creating workarounds.