Pull Request Analytics: How to Get Pull Request Cycle Time / Lead Time for Bitbucket

March 23, 2021
#Bitbucket #Reporting #How To
13 min

What Cycle Time is and why it is important

Pull Request Cycle Time / Lead Time is a powerful metric for evaluating an engineering team’s productivity. It helps track the development process from the moment code is first written in a developer’s IDE until it’s deployed to production.

Please note that in this article we define Cycle Time / Lead Time as the time between the developer’s first commit and the moment the pull request is merged, and we’ll refer to it as Cycle Time throughout.

With this information, you can get an unbiased view of the engineering department’s speed and capacity and find areas for improvement. It can also be an indicator of business success: by controlling the Cycle Time, you can increase output and efficiency and deliver products faster.

This article will show you how to get a detailed pull request report with the Cycle Time and the related metrics calculated on the repository level. The metrics include:

  • Time to open (from the first commit to open)
  • Time waiting for review (from open to the first comment)
  • Time to approve (from the first comment to approved)
  • Time to merge (from approved to merge)
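
To make these intervals concrete, here is a minimal sketch that computes them for a single pull request. The timestamps are made up for illustration, and `days_between` is a hypothetical helper name; the only assumption is that each event is available as a Python `datetime`.

```python
from datetime import datetime


def days_between(later, earlier):
    """Return the interval in days (2 decimal places), or None if an event is missing."""
    if later is None or earlier is None:
        return None
    return round((later - earlier).total_seconds() / 86400, 2)


# Hypothetical event times for one pull request
first_commit = datetime(2021, 1, 4, 9, 0)
opened = datetime(2021, 1, 4, 21, 0)
first_comment = datetime(2021, 1, 5, 9, 0)
approved = datetime(2021, 1, 6, 9, 0)
merged = datetime(2021, 1, 6, 15, 0)

time_to_open = days_between(opened, first_commit)        # 0.5
time_to_review = days_between(first_comment, opened)     # 0.5
time_to_approve = days_between(approved, first_comment)  # 1.0
time_to_merge = days_between(merged, approved)           # 0.25
cycle_time = days_between(merged, first_commit)          # 2.25
```

In this example the four intervals add up to the Cycle Time. If an event never happened (for instance, nobody commented), the corresponding intervals are left empty rather than guessed.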

How to get Time to Open, Time to Review, Time to Approve, and Time to Merge metrics

We can get all the necessary pull request data from Awesome Graphs for Bitbucket and its REST API combined with Bitbucket’s REST API resources. We’ll use Python to make requests to the APIs, calculate and aggregate this data, and then save it as a CSV file.

The following script will do all this work for us:

import sys
import requests
import csv
from dateutil import parser
from datetime import datetime, timezone
 
bitbucket_url = sys.argv[1]
login = sys.argv[2]
password = sys.argv[3]
project = sys.argv[4]
repository = sys.argv[5]
since = sys.argv[6]
until = sys.argv[7]
 
s = requests.Session()
s.auth = (login, password)
 
 
class PullRequest:
 
    def __init__(self, pr_id, title, author, state, created, closed):
        self.pr_id = pr_id
        self.title = title
        self.author = author
        self.state = state
        self.created = created
        self.closed = closed
 
 
def parse_date_ag_rest(date):
    # Normalize to naive UTC so these dates are comparable with the Bitbucket timestamps
    return parser.isoparse(date).astimezone(timezone.utc).replace(tzinfo=None, microsecond=0)
 
 
def get_date_from_timestamp(timestamp):
    # Bitbucket returns epoch milliseconds; convert to naive UTC instead of local time
    return datetime.fromtimestamp(timestamp / 1000, tz=timezone.utc).replace(tzinfo=None, microsecond=0)
 
 
def subtract_dates(minuend, subtrahend):
    if minuend is None or subtrahend is None:
        return None
    else:
        return round(((minuend - subtrahend).total_seconds() / 86400), 2)
 
 
def get_pull_requests():
 
    pull_request_list = []
 
    get_prs_url = bitbucket_url + '/rest/awesome-graphs-api/latest/projects/' + project + '/repos/' + repository \
        + '/pull-requests'
 
    is_last_page = False
 
    while not is_last_page:
 
        response = s.get(get_prs_url, params={'start': len(pull_request_list), 'limit': 1000,
                                              'sinceDate': since, 'untilDate': until}).json()
 
        for pr_details in response['values']:
 
            pr_id = pr_details['id']
            title = pr_details['title']
            author = pr_details['author']['user']['emailAddress']
            state = pr_details['state']
            created = parse_date_ag_rest(pr_details['createdDate'])
 
            if pr_details['closed'] is True:
                closed = parse_date_ag_rest(pr_details['closedDate'])
            else:
                closed = None
 
            pull_request_list.append(PullRequest(pr_id, title, author, state, created, closed))
 
        is_last_page = response['isLastPage']
 
    return pull_request_list
 
 
def get_first_commit_time(pull_request):
 
    commit_dates = []
 
    commits_url = bitbucket_url + '/rest/api/latest/projects/' + project + '/repos/' + repository + '/pull-requests/' \
        + str(pull_request.pr_id) + '/commits'
 
    is_last_page = False
 
    while not is_last_page:
 
        commits_response = s.get(commits_url, params={'start': len(commit_dates), 'limit': 500}).json()
 
        for commit in commits_response['values']:
            commit_timestamp = commit['authorTimestamp']
            commit_dates.append(get_date_from_timestamp(commit_timestamp))
 
        is_last_page = commits_response['isLastPage']
 
    # This assumes the API lists commits newest first, so the last element is the earliest commit
    if not commit_dates:
        first_commit = None
    else:
        first_commit = commit_dates[-1]
 
    return first_commit
 
 
def get_pr_activities(pull_request):
 
    counter = 0
    comment_dates = []
    approval_dates = []
 
    pr_url = bitbucket_url + '/rest/api/latest/projects/' + project + '/repos/' + repository + '/pull-requests/' \
        + str(pull_request.pr_id) + '/activities'
 
    is_last_page = False
 
    while not is_last_page:
 
        pr_response = s.get(pr_url, params={'start': counter, 'limit': 500}).json()
 
        for pr_activity in pr_response['values']:
 
            counter += 1
 
            if pr_activity['action'] == 'COMMENTED':
                comment_timestamp = pr_activity['comment']['createdDate']
                comment_dates.append(get_date_from_timestamp(comment_timestamp))
            elif pr_activity['action'] == 'APPROVED':
                approval_timestamp = pr_activity['createdDate']
                approval_dates.append(get_date_from_timestamp(approval_timestamp))
 
        # Update once per page, outside the loop, so an empty page cannot cause an endless loop
        is_last_page = pr_response['isLastPage']
 
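    # This assumes activities are listed newest first:
    # the last comment in the list is the earliest one
    if not comment_dates:
        first_comment_date = None
    else:
        first_comment_date = comment_dates[-1]
 
    # ...and the first approval in the list is the most recent one
    if not approval_dates:
        approval_time = None
    else:
        approval_time = approval_dates[0]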
 
    return first_comment_date, approval_time
 
 
print('Collecting a list of pull requests from the repository', repository)
 
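# Open in write mode so that re-running the script replaces the report instead of appending to it
with open(f'{project}_{repository}_prs_cycle_time_{since}_{until}.csv', mode='w', newline='',
          encoding='utf-8') as report_file: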
    report_writer = csv.writer(report_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    report_writer.writerow(['id',
                            'title',
                            'author',
                            'state',
                            'first_commit',
                            'created',
                            'first_comment',
                            'approved',
                            'closed',
                            'cycle_time_d',
                            'time_to_open_d',
                            'time_to_review_d',
                            'time_to_approve_d',
                            'time_to_merge_d'])
 
    for pull_request in get_pull_requests():
 
        print('Processing pull request', pull_request.pr_id)
 
        first_commit_time = get_first_commit_time(pull_request)
 
        first_comment, approval = get_pr_activities(pull_request)
 
        cycle_time = subtract_dates(pull_request.closed, first_commit_time)
 
        time_to_open = subtract_dates(pull_request.created, first_commit_time)
 
        time_to_review = subtract_dates(first_comment, pull_request.created)
 
        time_to_approve = subtract_dates(approval, first_comment)
 
        time_to_merge = subtract_dates(pull_request.closed, approval)
 
        report_writer.writerow([pull_request.pr_id,
                                pull_request.title,
                                pull_request.author,
                                pull_request.state,
                                first_commit_time,
                                pull_request.created,
                                first_comment,
                                approval,
                                pull_request.closed,
                                cycle_time,
                                time_to_open,
                                time_to_review,
                                time_to_approve,
                                time_to_merge])
 
print('The resulting CSV file is saved to the current folder.')
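
The script repeats the same paging loop for pull requests, commits, and activities. As a rough sketch of that pattern in isolation, the loop could be factored into one generator. Here `iterate_pages` and `fetch_page` are hypothetical names, not part of either API; the only thing assumed about a response is that it has the `values` and `isLastPage` keys the script already reads.

```python
def iterate_pages(fetch_page, limit=500):
    """Yield every item from a paged resource.

    fetch_page(start, limit) is a hypothetical callable returning a dict
    with 'values' and 'isLastPage' keys.
    """
    start = 0
    while True:
        page = fetch_page(start, limit)
        values = page['values']
        yield from values
        start += len(values)
        # Stop on the last page, and also on an empty page to avoid looping forever
        if page['isLastPage'] or not values:
            break


# Stand-in for a real request so the sketch runs without a Bitbucket instance
def fake_fetch(start, limit):
    data = list(range(7))
    chunk = data[start:start + limit]
    return {'values': chunk, 'isLastPage': start + limit >= len(data)}


items = list(iterate_pages(fake_fetch, limit=3))
print(items)  # [0, 1, 2, 3, 4, 5, 6]
```

With a real session, `fetch_page` could be a small function that calls `s.get(url, params={'start': start, 'limit': limit}).json()`, as the script does.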

To make this script work, you’ll need to pre-install the requests and dateutil modules (the latter is published on PyPI as python-dateutil). The csv, sys, and datetime modules are available in Python out of the box. You need to pass the following arguments to the script when executed:

  • the URL of your Bitbucket, 
  • login, 
  • password, 
  • project key, 
  • repository slug, 
  • since date (to include PRs created after), 
  • until date (to include PRs created before).

Here’s an example:

py script.py https://bitbucket.your-company-name.com login password PRKEY repo-slug 2020-11-30 2021-02-01

Once the script has finished, the resulting file will be saved to the current working directory, that is, the folder you ran the script from.

What to do with the report

After you’ve generated the CSV file, you can process it in analytics tools such as Tableau, PowerBI, Qlik, or Looker, visualize this data on your Confluence pages with the Table Filter and Charts for Confluence app, or integrate it into any custom solution of your choice for further analysis.

An example of the data visualized with Table Filter and Charts for Confluence.

By measuring Cycle Time, you can:

  • See objectively whether the development process is getting faster or slower.
  • Analyze the correlation of the specific metrics with the overall cycle time (e.g., pull requests that open faster, merge faster).
  • Compare the results of the particular teams and users within the organization or across the industry.
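
If you’d rather start with a quick look in Python before moving to a BI tool, a minimal sketch like the one below could summarize the report. It assumes the column names the script writes; `summarize` is a hypothetical helper, and the sample rows are made up (and trimmed to a few columns) for illustration.

```python
import csv
import io
from statistics import mean, median

METRICS = ['cycle_time_d', 'time_to_open_d', 'time_to_review_d',
           'time_to_approve_d', 'time_to_merge_d']


def summarize(report_file):
    """Return {metric: (mean, median)} in days, skipping empty cells."""
    values = {metric: [] for metric in METRICS}
    for row in csv.DictReader(report_file):
        for metric in METRICS:
            if row[metric]:  # an empty cell means the event never happened
                values[metric].append(float(row[metric]))
    return {metric: (round(mean(v), 2), round(median(v), 2))
            for metric, v in values.items() if v}


# Build a small made-up report in memory instead of reading a real file
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['id', 'state'] + METRICS)
writer.writerow([1, 'MERGED', 2.0, 0.5, 0.5, 0.75, 0.25])
writer.writerow([2, 'MERGED', 4.0, 1.5, 1.0, 1.25, 0.25])
writer.writerow([3, 'OPEN', None, 1.0, None, None, None])
buffer.seek(0)

summary = summarize(buffer)
for metric, (avg, med) in summary.items():
    print(f'{metric}: mean {avg} d, median {med} d')
```

To run it on a real report, open the generated file instead of the in-memory buffer, for example `with open('PRKEY_repo-slug_prs_cycle_time_2020-11-30_2021-02-01.csv', newline='') as f: summarize(f)`. The median is included because a few long-running pull requests can skew the mean.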

What’s next?

The report described in this article is built with the help of the Awesome Graphs for Bitbucket app as a data provider, available for Bitbucket Server and Data Center. Using it, you can gain more visibility into the development process to analyze patterns and find bottlenecks.

If you want to learn more about how to use Cycle Time and the related metrics, write in the comments below and upvote this post, and in future posts we’ll show you how to visualize the data, what to look at, and how to get insights from it!
