How to Get the Number of Commits and Lines of Code in Pull Requests

May 16, 2024
#Reporting#How To#Bitbucket
10 min
How to Get the Number of Commits and Lines of Code in Pull Requests

Counting lines of code manually for each pull request to analyze your current Bitbucket database could take years. As a solution, we suggest automating this process with Awesome Graphs for Bitbucket and Python. This article will show you how to get the number of commits and lines of code in pull requests from Bitbucket Data Center and build a pull request size report on the repository level.

Why pull request size matters

According to the research conducted by the Cisco Systems programming team, where they tried to determine the best practices for code review, they found out that the pull request size should not include more than 200 to 400 lines of code. Maintaining the size of pull requests within these limits is helpful for:

  • speeding up the review process 
  • enhancing code readability
  • keeping reviewers focused and attentive to details, as this amount of information is optimal for the brain to process effectively at a time
  • contributing to more thorough and efficient reviews. 

All this, in turn, enhances overall code quality and streamlines delivery.

What pull request report you will get

With the help of Awesome Graphs for Bitbucket and Python, you can get a CSV file containing a list of pull requests created during the specified period, along with the number of commits and lines of code added and deleted in them. The report will also contain the authors’ emails and the date of creation and closure of each pull request.

how to get the number of commits and lines of code in pull requests from Bitbucket Data Center and build a pull request size report

How to get a pull request size report

To get the report described above, we’ll run the following script that will make requests into the REST API, and do all the calculations and aggregation for us.

import requests
import csv
import sys

bitbucket_url = sys.argv[1]
login = sys.argv[2]
password = sys.argv[3]
project = sys.argv[4]
repository = sys.argv[5]
since = sys.argv[6]
until = sys.argv[7]

get_prs_url = bitbucket_url + '/rest/awesome-graphs-api/latest/projects/' + project + '/repos/' + repository \
            + '/pull-requests'

s = requests.Session()
s.auth = (login, password)


class PullRequest:

    def __init__(self, title, pr_id, author, created, closed):
        self.title = title
        self.pr_id = pr_id
        self.author = author
        self.created = created
        self.closed = closed


class PullRequestWithCommits:

    def __init__(self, title, pr_id, author, created, closed, commits, loc_added, loc_deleted):
        self.title = title
        self.pr_id = pr_id
        self.author = author
        self.created = created
        self.closed = closed
        self.commits = commits
        self.loc_added = loc_added
        self.loc_deleted = loc_deleted


def get_pull_requests():

    pull_request_list = []

    is_last_page = False

    while not is_last_page:

        response = s.get(get_prs_url, params={'start': len(pull_request_list), 'limit': 1000,
                                      'sinceDate': since, 'untilDate': until}).json()

        for pr_details in response['values']:

            title = pr_details['title']
            pd_id = pr_details['id']
            author = pr_details['author']['user']['emailAddress']
            created = pr_details['createdDate']
            closed = pr_details['closedDate']

            pull_request_list.append(PullRequest(title, pd_id, author, created, closed))

        is_last_page = response['isLastPage']

    return pull_request_list


def get_commit_statistics(pull_request_list):

    pr_list_with_commits = []

    for pull_request in pull_request_list:

        print('Processing Pull Request', pull_request.pr_id)

        commit_ids = []

        is_last_page = False

        while not is_last_page:

            url = bitbucket_url + '/rest/api/latest/projects/' + project + '/repos/' + repository \
                + '/pull-requests/' + str(pull_request.pr_id) + '/commits'
            response = s.get(url, params={'start': len(commit_ids), 'limit': 25}).json()

            for commit in response['values']:
                commit_ids.append(commit['id'])

            is_last_page = response['isLastPage']

        commits = 0
        loc_added = 0
        loc_deleted = 0

        for commit_id in commit_ids:

            commits += 1

            url = bitbucket_url + '/rest/awesome-graphs-api/latest/projects/' + project + '/repos/' + repository \
                + '/commits/' + commit_id
            response = s.get(url).json()

            if 'errors' not in response:
                loc_added += response['linesOfCode']['added']
                loc_deleted += response['linesOfCode']['deleted']
            else:
                pass

        pr_list_with_commits.append(PullRequestWithCommits(pull_request.title, pull_request.pr_id, pull_request.author,
                                                           pull_request.created, pull_request.closed, commits,
                                                           loc_added, loc_deleted))

    return pr_list_with_commits


with open('{}_{}_pr_size_stats_{}_{}.csv'.format(project, repository, since, until), mode='a', newline='') as report_file:

    report_writer = csv.writer(report_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    report_writer.writerow(['title', 'id', 'author', 'created', 'closed', 'commits', 'loc_added', 'loc_deleted'])

    for pr in get_commit_statistics(get_pull_requests()):
        report_writer.writerow([pr.title, pr.pr_id, pr.author, pr.created, pr.closed, pr.commits, pr.loc_added, pr.loc_deleted])

print('The resulting CSV file is saved to the current folder.')

To make this script work, you’ll need to install the requests module in advance, the csv and sys modules are available in Python out of the box. Then, you need to pass seven arguments to the script when executed: the URL of your Bitbucket, login, password, project key, repository name, since date, until date. Here’s an example:

py script.py https://bitbucket.your-company-name.com login password PRKEY repo-name 2023-11-31 2024-02-01

At the end of the execution, the resulting file will be saved to the same folder next to the script.

Want more?

The Awesome Graphs for Bitbucket app and its REST API, in particular, allow you to get much more than described here, and we want to help you get the most out of it. Exclusively for our Data Center clients, we offer a Premium Support Subscription where our tech team will help you write custom scripts for our REST API to get the data you need. Contact us if you have an issue you’d like to solve, and we will assist you.

Here are a few how-tos that may be of help right now: