Pull Request Analytics: How to Visualize Cycle Time / Lead Time and Get Insights for Improvement

March 30, 2021
#Reporting#How To#Bitbucket
12 min

Cycle Time / Lead Time is one of the most important metrics for software development. It can tell a lot about the efficiency of the development process and the teams’ speed and capacity. In the previous article, we showed you how to get a detailed report with the pull request statistics and Cycle Time / Lead Time calculated on the repository level. 

Today we’ll tell you how to use this report:

  • How to visualize the pull request data.
  • What things to pay attention to.
  • What insights you can get to improve performance.

Please note that we define Cycle Time / Lead Time as the time between the developer’s first commit and the moment the pull request is merged, and we will refer to it as Cycle Time throughout the article.

Analyzing your codebase

First, you need to understand the current state of affairs and how it compares to the industry standards. According to Code Climate’s research, the industry-wide median for Cycle Time is 3.4 days, with only the top 25% managing to keep it as low as 1.8 days and the bottom 25% having a Cycle Time of 6.2 days.

© Code Climate

To get a better understanding of the development process, it might be helpful to look at the teams’ dynamics and monitor the changes over time. The following chart shows how the average Cycle Time changes month after month with a trend line, so you can see objectively whether the development process is getting faster or slower and check how your rates compare to the industry average. Follow the instructions to build this chart.
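
If you prefer to explore the same data outside Confluence, here’s a minimal pandas/matplotlib sketch of the monthly trend, assuming the CSV produced by the script from the previous article (the file name is an example; the closed and cycle_time_d columns come from that report):

import pandas as pd
import matplotlib.pyplot as plt

# Load the pull request report built in the previous article (example file name)
df = pd.read_csv('PRKEY_repo-slug_prs_cycle_time_2020-11-30_2021-02-01.csv',
                 parse_dates=['closed'])
df = df.dropna(subset=['closed', 'cycle_time_d'])

# Average Cycle Time per month of pull request closure
monthly = df.set_index('closed')['cycle_time_d'].resample('M').mean()
ax = monthly.plot(marker='o')
ax.axhline(3.4, linestyle='--', color='grey', label='industry median (Code Climate)')
ax.set_ylabel('Average Cycle Time, days')
ax.legend()
plt.show()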

For a more precise analysis and evaluation of the current codebase, you can also use the Cycle Time distribution chart, which provides pull request statistics aggregated by their Cycle Time value, making it easy to spot the outliers for further investigation. Learn how to build this chart.
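
A quick way to sketch the same distribution with pandas, assuming the same CSV as above (example file name, real column name):

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('PRKEY_repo-slug_prs_cycle_time_2020-11-30_2021-02-01.csv')

# Bucket pull requests by their Cycle Time to spot long-running outliers
df['cycle_time_d'].dropna().plot(kind='hist', bins=20)
plt.xlabel('Cycle Time, days')
plt.ylabel('Number of pull requests')
plt.show()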

In addition to the Cycle Time, Awesome Graphs for Bitbucket lets you analyze the pull request resolution time out-of-the-box. Using the Resolution Time Distribution report, you can see how long it takes pull requests to merge or decline, find the shortest and longest pull requests, and predict the resolution time of future pull requests with the historical data.

While Cycle Time serves as a great indicator of success, and keeping it low helps you increase the output and efficiency of your teams, it’s not diagnostic by itself and can’t really tell you what you are doing right or wrong. To understand why it is high or low, you’ll need to dig deeper into the metrics it consists of. The chart below gives you a general overview of the pull requests on the repository level and shows the Cycle Time with the percentage of the stages it’s comprised of (which we’ll discuss in detail in the following paragraphs). You can build a chart like this using the Chart from Table macro, available in the Table Filter and Charts app.

Breaking down the Cycle Time

We break down Cycle Time into four stages:

  • Time to open (from the first commit to open)
  • Time waiting for review (from open to the first comment)
  • Time to approve (from the first comment to approved)
  • Time to merge (from approved to merge)
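
In terms of the timestamps recorded for a pull request, the stages chain together like this (a sketch; the names match the columns of the report from the previous article):

# How the stages add up for a single pull request (datetime values)
time_to_open    = created       - first_commit    # first commit -> pull request opened
time_to_review  = first_comment - created         # opened -> first comment
time_to_approve = approved      - first_comment   # first comment -> approved
time_to_merge   = closed        - approved        # approved -> merged
cycle_time      = closed        - first_commit    # the whole span, i.e. the sum of the four stages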

Now we’ll go through each of these stages, discussing the things to pay attention to.

Time to Open

This metric is arguably the most important of all, as it influences all the later stages, and research shows that pull requests that open faster tend to merge faster.

Long Time to Open might indicate that the developer had to switch tasks and/or that code was rewritten, which might also result in large batch sizes. In one of the previous articles, we described how you can check the size of your pull requests in Bitbucket, so you can also use it for a deeper analysis.

One of the things you can do to improve Time to Open is to keep the pull request size to no more than 200 to 400 lines of code. This will influence every stage of the cycle, as smaller pull requests are more likely to be reviewed thoroughly and approved sooner.

Time to Review

Time to Review is a great metric for understanding whether your teams have adopted Code Review as part of their daily routine. If it’s high, reviewing might not yet be a habit, and you’ll need to foster this culture. Another reason might be that the pull requests are not review-friendly and reviewers put off dealing with them. You can change this, once again, by keeping the pull request size small and by writing a reasonable description, so it’s easier to get started with them. If a long Time to Review is caused by organizational issues, it might require reprioritization.

Time to Approve

This is the stage you don’t really want to minimize but rather make consistent by reducing inefficiencies in the code review process. While there are many strategies for Code Review, there is hardly any industry standard for Code Review metrics, so you’ll need to focus on the organization of the process and find a way to get constructive feedback.

Time to Merge

Long Time to Merge might be an indicator that there are obstacles in the delivery workflow. To improve it, you need to find out if there are any blockers in the process, including manual deployment, and check if your tooling satisfies your current needs.

Wrapping up

Cycle Time’s importance is difficult to overestimate, as this metric can tell a lot about the way you work, and by keeping it under control you can optimize the development process and deliver faster.

Once again, we built the initial pull request report with the help of the Awesome Graphs for Bitbucket app as a data provider and used the Table Filter and Charts for Confluence app to aggregate and visualize the data.

These are just a few examples, but you can get much more even from this one report. Check out the other guides for charts based on data from Bitbucket. Share your feedback and ideas in the comments, and we’ll try to cover them in future posts.

Pull Request Analytics: How to Get Pull Request Cycle Time / Lead Time for Bitbucket

March 23, 2021
#Reporting#How To#Bitbucket
13 min

What Cycle Time is and why it is important

Pull Request Cycle Time / Lead Time is a powerful metric to look at when evaluating engineering teams’ productivity. It helps track the development process from the moment the code is first written in a developer’s IDE up to the time it’s deployed to production.

Please note that we define Cycle Time / Lead Time as the time between the developer’s first commit and the moment the pull request is merged, and we will refer to it as Cycle Time throughout the article.

Having this information, you can get an unbiased view of the engineering department’s speed and capacity and find the points to drive improvement. It can also be an indicator of business success as, by controlling the Cycle Time, you can increase the output and efficiency to deliver products faster.

This article will show you how to get a detailed pull request report with the Cycle Time and the related metrics calculated on the repository level. The metrics include:

  • Time to open (from the first commit to open)
  • Time waiting for review (from open to the first comment)
  • Time to approve (from the first comment to approved)
  • Time to merge (from approved to merge)

How to get Time to Open, Time to Review, Time to Approve, and Time to Merge metrics

We can get all the necessary pull request data from Awesome Graphs for Bitbucket and its REST API, combined with Bitbucket’s own REST API resources. We’ll use Python to make requests to the APIs, calculate and aggregate the data, and then save it as a CSV file.

The following script will do all this work for us:

import sys
import requests
import csv
from dateutil import parser
from datetime import datetime
 
bitbucket_url = sys.argv[1]
login = sys.argv[2]
password = sys.argv[3]
project = sys.argv[4]
repository = sys.argv[5]
since = sys.argv[6]
until = sys.argv[7]
 
s = requests.Session()
s.auth = (login, password)
 
 
class PullRequest:
 
    def __init__(self, pr_id, title, author, state, created, closed):
        self.pr_id = pr_id
        self.title = title
        self.author = author
        self.state = state
        self.created = created
        self.closed = closed
 
 
# Convert an ISO 8601 date string returned by the Awesome Graphs REST API into a naive datetime
def parse_date_ag_rest(date):
    return parser.isoparse(date).replace(tzinfo=None, microsecond=0)
 
 
# Convert a Bitbucket millisecond timestamp into a datetime
def get_date_from_timestamp(timestamp):
    return datetime.fromtimestamp(timestamp / 1000).replace(microsecond=0)
 
 
# Return the difference between two datetimes in days, rounded to two decimal places
def subtract_dates(minuend, subtrahend):
    if minuend is None or subtrahend is None:
        return None
    else:
        return round(((minuend - subtrahend).total_seconds() / 86400), 2)
 
 
def get_pull_requests():
 
    pull_request_list = []
 
    get_prs_url = bitbucket_url + '/rest/awesome-graphs-api/latest/projects/' + project + '/repos/' + repository \
        + '/pull-requests'
 
    is_last_page = False
 
    while not is_last_page:
 
        response = s.get(get_prs_url, params={'start': len(pull_request_list), 'limit': 1000,
                                              'sinceDate': since, 'untilDate': until}).json()
 
        for pr_details in response['values']:
 
            pr_id = pr_details['id']
            title = pr_details['title']
            author = pr_details['author']['user']['emailAddress']
            state = pr_details['state']
            created = parse_date_ag_rest(pr_details['createdDate'])

            if pr_details['closed'] is True:
                closed = parse_date_ag_rest(pr_details['closedDate'])
            else:
                closed = None

            pull_request_list.append(PullRequest(pr_id, title, author, state, created, closed))
 
        is_last_page = response['isLastPage']
 
    return pull_request_list
 
 
def get_first_commit_time(pull_request):
 
    commit_dates = []
 
    commits_url = bitbucket_url + '/rest/api/latest/projects/' + project + '/repos/' + repository + '/pull-requests/' \
        + str(pull_request.pr_id) + '/commits'
 
    is_last_page = False
 
    while not is_last_page:
 
        commits_response = s.get(commits_url, params={'start': len(commit_dates), 'limit': 500}).json()
 
        for commit in commits_response['values']:
            commit_timestamp = commit['authorTimestamp']
            commit_dates.append(get_date_from_timestamp(commit_timestamp))
 
        is_last_page = commits_response['isLastPage']
 
    # Commits are returned newest first, so the last element is the first commit
    if not commit_dates:
        first_commit = None
    else:
        first_commit = commit_dates[-1]
 
    return first_commit
 
 
def get_pr_activities(pull_request):
 
    counter = 0
    comment_dates = []
    approval_dates = []
 
    pr_url = bitbucket_url + '/rest/api/latest/projects/' + project + '/repos/' + repository + '/pull-requests/' \
        + str(pull_request.pr_id) + '/activities'
 
    is_last_page = False
 
    while not is_last_page:
 
        pr_response = s.get(pr_url, params={'start': counter, 'limit': 500}).json()
 
        for pr_activity in pr_response['values']:
 
            counter += 1
 
            if pr_activity['action'] == 'COMMENTED':
                comment_timestamp = pr_activity['comment']['createdDate']
                comment_dates.append(get_date_from_timestamp(comment_timestamp))
            elif pr_activity['action'] == 'APPROVED':
                approval_timestamp = pr_activity['createdDate']
                approval_dates.append(get_date_from_timestamp(approval_timestamp))
 
        is_last_page = pr_response['isLastPage']
 
    # Activities are returned newest first, so the last comment collected is the first
    # one left on the pull request, and the first approval collected is the latest one
    if not comment_dates:
        first_comment_date = None
    else:
        first_comment_date = comment_dates[-1]

    if not approval_dates:
        approval_time = None
    else:
        approval_time = approval_dates[0]
 
    return first_comment_date, approval_time
 
 
print('Collecting a list of pull requests from the repository', repository)
 
with open(f'{project}_{repository}_prs_cycle_time_{since}_{until}.csv', mode='a', newline='') as report_file:
    report_writer = csv.writer(report_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    report_writer.writerow(['id',
                            'title',
                            'author',
                            'state',
                            'first_commit',
                            'created',
                            'first_comment',
                            'approved',
                            'closed',
                            'cycle_time_d',
                            'time_to_open_d',
                            'time_to_review_d',
                            'time_to_approve_d',
                            'time_to_merge_d'])
 
    for pull_request in get_pull_requests():
 
        print('Processing pull request', pull_request.pr_id)
 
        first_commit_time = get_first_commit_time(pull_request)
 
        first_comment, approval = get_pr_activities(pull_request)
 
        cycle_time = subtract_dates(pull_request.closed, first_commit_time)
 
        time_to_open = subtract_dates(pull_request.created, first_commit_time)
 
        time_to_review = subtract_dates(first_comment, pull_request.created)
 
        time_to_approve = subtract_dates(approval, first_comment)
 
        time_to_merge = subtract_dates(pull_request.closed, approval)
 
        report_writer.writerow([pull_request.pr_id,
                                pull_request.title,
                                pull_request.author,
                                pull_request.state,
                                first_commit_time,
                                pull_request.created,
                                first_comment,
                                approval,
                                pull_request.closed,
                                cycle_time,
                                time_to_open,
                                time_to_review,
                                time_to_approve,
                                time_to_merge])
 
print('The resulting CSV file is saved to the current folder.')

To make this script work, you’ll need to pre-install the requests and dateutil modules. The csv, sys, and datetime modules are available in Python out of the box. You need to pass the following arguments to the script when executed:

  • the URL of your Bitbucket, 
  • login, 
  • password, 
  • project key, 
  • repository slug, 
  • since date (to include PRs created after), 
  • until date (to include PRs created before).

Here’s an example:

py script.py https://bitbucket.your-company-name.com login password PRKEY repo-slug 2020-11-30 2021-02-01
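
While it runs, the script prints its progress to the console, so the output will look roughly like this (the pull request IDs depend on your repository):

Collecting a list of pull requests from the repository repo-slug
Processing pull request 1
Processing pull request 2
...
The resulting CSV file is saved to the current folder.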

Once the script’s executed, the resulting file will be saved to the same folder as the script.

What to do with the report

After you’ve generated a CSV file, you can process it in analytics tools such as Tableau, Power BI, Qlik, or Looker, visualize the data on your Confluence pages with the Table Filter and Charts for Confluence app, or integrate it into any custom solution of your choice for further analysis.

An example of the data visualized with Table Filter and Charts for Confluence.

By measuring Cycle Time, you can:

  • See objectively whether the development process is getting faster or slower.
  • Analyze how specific metrics correlate with the overall Cycle Time (e.g., whether pull requests that open faster also merge faster), as in the sketch below.
  • Compare the results of the particular teams and users within the organization or across the industry.
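
For example, here’s a minimal pandas sketch for the second point, checking how each stage correlates with the overall Cycle Time (the file name is an example; the column names come from the script above):

import pandas as pd

df = pd.read_csv('PRKEY_repo-slug_prs_cycle_time_2020-11-30_2021-02-01.csv')

# Correlation of each stage with the overall Cycle Time
stages = ['time_to_open_d', 'time_to_review_d', 'time_to_approve_d', 'time_to_merge_d']
print(df[stages + ['cycle_time_d']].corr()['cycle_time_d'])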

What’s next?

The report described in this article is built with the help of the Awesome Graphs for Bitbucket app as a data provider, available for Bitbucket Server and Data Center. Using it, you can gain more visibility into the development process to analyze patterns and find bottlenecks.

If you want to learn more about how to use Cycle Time and the related metrics, write in the comments below and upvote this post, and we’ll show you how to visualize the data, what to look at, and how to get insights from it in future posts!

How to Get the Number of Commits and Lines of Code in Pull Requests

February 4, 2021
#How To#Bitbucket#Reporting
9 min

According to research conducted by the Cisco Systems programming team to determine the best practices for code review, a pull request should include no more than 200 to 400 lines of code. Keeping the size of your pull requests within these limits not only speeds up the review but also keeps the amount of information at a level the brain can process effectively at a time.

If you’d like to analyze your current codebase, counting lines of code manually for each pull request could take years, so we suggest automating this process with the help of Awesome Graphs for Bitbucket and Python. This article will show you how to build a report with pull request size statistics in terms of lines of code and commits on the repository level.

What you will get

As a result, you’ll get a CSV file containing a detailed list of pull requests created during the specified period with the number of commits, lines of code added and deleted in them.

How to get it

To get the report described above, we’ll run the following script that will make requests into the REST API, and do all the calculations and aggregation for us.

import requests
import csv
import sys

bitbucket_url = sys.argv[1]
login = sys.argv[2]
password = sys.argv[3]
project = sys.argv[4]
repository = sys.argv[5]
since = sys.argv[6]
until = sys.argv[7]

get_prs_url = bitbucket_url + '/rest/awesome-graphs-api/latest/projects/' + project + '/repos/' + repository \
            + '/pull-requests'

s = requests.Session()
s.auth = (login, password)


class PullRequest:

    def __init__(self, title, pr_id, author, created, closed):
        self.title = title
        self.pr_id = pr_id
        self.author = author
        self.created = created
        self.closed = closed


class PullRequestWithCommits:

    def __init__(self, title, pr_id, author, created, closed, commits, loc_added, loc_deleted):
        self.title = title
        self.pr_id = pr_id
        self.author = author
        self.created = created
        self.closed = closed
        self.commits = commits
        self.loc_added = loc_added
        self.loc_deleted = loc_deleted


def get_pull_requests():

    pull_request_list = []

    is_last_page = False

    while not is_last_page:

        response = s.get(get_prs_url, params={'start': len(pull_request_list), 'limit': 1000,
                                      'sinceDate': since, 'untilDate': until}).json()

        for pr_details in response['values']:

            title = pr_details['title']
            pr_id = pr_details['id']
            author = pr_details['author']['user']['emailAddress']
            created = pr_details['createdDate']
            # closedDate may be missing for open pull requests, so fall back to None
            closed = pr_details.get('closedDate')

            pull_request_list.append(PullRequest(title, pr_id, author, created, closed))

        is_last_page = response['isLastPage']

    return pull_request_list


def get_commit_statistics(pull_request_list):

    pr_list_with_commits = []

    for pull_request in pull_request_list:

        print('Processing Pull Request', pull_request.pr_id)

        commit_ids = []

        is_last_page = False

        while not is_last_page:

            url = bitbucket_url + '/rest/api/latest/projects/' + project + '/repos/' + repository \
                + '/pull-requests/' + str(pull_request.pr_id) + '/commits'
            response = s.get(url, params={'start': len(commit_ids), 'limit': 25}).json()

            for commit in response['values']:
                commit_ids.append(commit['id'])

            is_last_page = response['isLastPage']

        commits = 0
        loc_added = 0
        loc_deleted = 0

        for commit_id in commit_ids:

            commits += 1

            url = bitbucket_url + '/rest/awesome-graphs-api/latest/projects/' + project + '/repos/' + repository \
                + '/commits/' + commit_id
            response = s.get(url).json()

            # Skip commits for which the API returns an error instead of LOC statistics
            if 'errors' not in response:
                loc_added += response['linesOfCode']['added']
                loc_deleted += response['linesOfCode']['deleted']

        pr_list_with_commits.append(PullRequestWithCommits(pull_request.title, pull_request.pr_id, pull_request.author,
                                                           pull_request.created, pull_request.closed, commits,
                                                           loc_added, loc_deleted))

    return pr_list_with_commits


with open('{}_{}_pr_size_stats_{}_{}.csv'.format(project, repository, since, until), mode='a', newline='') as report_file:

    report_writer = csv.writer(report_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    report_writer.writerow(['title', 'id', 'author', 'created', 'closed', 'commits', 'loc_added', 'loc_deleted'])

    for pr in get_commit_statistics(get_pull_requests()):
        report_writer.writerow([pr.title, pr.pr_id, pr.author, pr.created, pr.closed, pr.commits, pr.loc_added, pr.loc_deleted])

print('The resulting CSV file is saved to the current folder.')

To make this script work, you’ll need to install the requests module in advance; the csv and sys modules are available in Python out of the box. Then you need to pass seven arguments to the script when executed: the URL of your Bitbucket, login, password, project key, repository name, since date, until date. Here’s an example:

py script.py https://bitbucket.your-company-name.com login password PRKEY repo-name 2020-11-30 2021-02-01

As you’ll see at the end of the execution, the resulting file will be saved to the same folder as the script.
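
As a follow-up, here’s a small pandas sketch of how you could use the resulting file to flag pull requests that exceed the 200 to 400 lines of code guideline mentioned above (the file name is an example; the columns match the script’s output):

import pandas as pd

df = pd.read_csv('PRKEY_repo-name_pr_size_stats_2020-11-30_2021-02-01.csv')

# Total lines of code touched by each pull request
df['loc_total'] = df['loc_added'] + df['loc_deleted']

# Pull requests larger than the 400 LOC guideline, biggest first
oversized = df[df['loc_total'] > 400].sort_values('loc_total', ascending=False)
print(oversized[['id', 'title', 'author', 'loc_total']])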

Want more?

The Awesome Graphs for Bitbucket app, and its REST API in particular, allows you to get much more than described here, and we want to help you get the most out of it. If you have an idea in mind or a problem that you’d like us to solve, write here in the comments or create a request in our Help Center, and we’ll cover it in future posts! In fact, the idea for this very article was brought to us by our customers, so there is a high chance that your case will be the next one.

Here are a few how-tos that you can read right now: