How to Get the Number of Commits and Lines of Code in Pull Requests
Counting lines of code manually for each pull request to analyze your current Bitbucket database could take years. As a solution, we suggest automating this process with Awesome Graphs for Bitbucket and Python. This article will show you how to get the number of commits and lines of code in pull requests from Bitbucket Data Center and build a pull request size report on the repository level.
Why pull request size matters
A code review study conducted by the Cisco Systems programming team to determine best practices found that a pull request should contain no more than 200 to 400 lines of code. Maintaining the size of pull requests within these limits is helpful for:
- speeding up the review process
- enhancing code readability
- keeping reviewers focused and attentive to details, as this amount of information is optimal for the brain to process effectively at a time
- contributing to more thorough and efficient reviews.
All this, in turn, enhances overall code quality and streamlines delivery.
What pull request report you will get
With the help of Awesome Graphs for Bitbucket and Python, you can get a CSV file containing a list of pull requests created during the specified period, along with the number of commits and lines of code added and deleted in them. The report also contains each author’s email and the creation and closure dates of each pull request.
How to get a pull request size report
To get the report described above, we’ll run the following script, which sends requests to the REST API and does all the calculations and aggregation for us.
import requests
import csv
import sys

# Positional command-line arguments
bitbucket_url = sys.argv[1]
login = sys.argv[2]
password = sys.argv[3]
project = sys.argv[4]
repository = sys.argv[5]
since = sys.argv[6]
until = sys.argv[7]

get_prs_url = bitbucket_url + '/rest/awesome-graphs-api/latest/projects/' + project + '/repos/' + repository \
              + '/pull-requests'

# One authenticated session is reused for every request
s = requests.Session()
s.auth = (login, password)


class PullRequest:

    def __init__(self, title, pr_id, author, created, closed):
        self.title = title
        self.pr_id = pr_id
        self.author = author
        self.created = created
        self.closed = closed


class PullRequestWithCommits:

    def __init__(self, title, pr_id, author, created, closed, commits, loc_added, loc_deleted):
        self.title = title
        self.pr_id = pr_id
        self.author = author
        self.created = created
        self.closed = closed
        self.commits = commits
        self.loc_added = loc_added
        self.loc_deleted = loc_deleted


def get_pull_requests():
    # Page through the pull requests created between the since and until dates
    pull_request_list = []
    is_last_page = False

    while not is_last_page:
        response = s.get(get_prs_url, params={'start': len(pull_request_list), 'limit': 1000,
                                              'sinceDate': since, 'untilDate': until}).json()

        for pr_details in response['values']:
            title = pr_details['title']
            pr_id = pr_details['id']
            author = pr_details['author']['user']['emailAddress']
            created = pr_details['createdDate']
            # A pull request that is still open may have no closedDate
            closed = pr_details.get('closedDate', '')

            pull_request_list.append(PullRequest(title, pr_id, author, created, closed))

        is_last_page = response['isLastPage']

    return pull_request_list


def get_commit_statistics(pull_request_list):
    pr_list_with_commits = []

    for pull_request in pull_request_list:
        print('Processing Pull Request', pull_request.pr_id)

        # Collect the IDs of all commits in the pull request, page by page
        commit_ids = []
        is_last_page = False

        while not is_last_page:
            url = bitbucket_url + '/rest/api/latest/projects/' + project + '/repos/' + repository \
                  + '/pull-requests/' + str(pull_request.pr_id) + '/commits'
            response = s.get(url, params={'start': len(commit_ids), 'limit': 25}).json()

            for commit in response['values']:
                commit_ids.append(commit['id'])

            is_last_page = response['isLastPage']

        commits = 0
        loc_added = 0
        loc_deleted = 0

        # Sum the lines of code added and deleted across the commits
        for commit_id in commit_ids:
            commits += 1

            url = bitbucket_url + '/rest/awesome-graphs-api/latest/projects/' + project + '/repos/' + repository \
                  + '/commits/' + commit_id
            response = s.get(url).json()

            # A commit whose statistics request returns errors is still counted, but adds no lines
            if 'errors' not in response:
                loc_added += response['linesOfCode']['added']
                loc_deleted += response['linesOfCode']['deleted']

        pr_list_with_commits.append(PullRequestWithCommits(pull_request.title, pull_request.pr_id, pull_request.author,
                                                           pull_request.created, pull_request.closed, commits,
                                                           loc_added, loc_deleted))

    return pr_list_with_commits


# mode='w' overwrites an earlier report for the same period instead of appending a second header row
with open('{}_{}_pr_size_stats_{}_{}.csv'.format(project, repository, since, until), mode='w', newline='') as report_file:
    report_writer = csv.writer(report_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    report_writer.writerow(['title', 'id', 'author', 'created', 'closed', 'commits', 'loc_added', 'loc_deleted'])

    for pr in get_commit_statistics(get_pull_requests()):
        report_writer.writerow([pr.title, pr.pr_id, pr.author, pr.created, pr.closed, pr.commits, pr.loc_added, pr.loc_deleted])

print('The resulting CSV file is saved to the current folder.')
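Both loops in the script follow the same paging pattern: request a page with start and limit, collect the items in values, and stop once isLastPage is true. The sketch below isolates that pattern as a generator so it can be tried without a Bitbucket instance. The names iterate_pages, fake_fetch, and FAKE_COMMITS are hypothetical and not part of the script or of any API; fake_fetch simply stands in for the s.get(...).json() call.

```python
from typing import Callable, Dict, Iterator, List


def iterate_pages(fetch_page: Callable[[int, int], Dict], limit: int = 25) -> Iterator[Dict]:
    """Yield every item from a start/limit paged source until isLastPage is true."""
    start = 0
    while True:
        page = fetch_page(start, limit)
        values = page['values']
        yield from values
        start += len(values)
        # Stop on the last page, or on an empty page to avoid looping forever.
        if page['isLastPage'] or not values:
            break


# Hypothetical stand-in for the HTTP call: serves slices of a local list.
FAKE_COMMITS: List[Dict] = [{'id': 'commit{}'.format(i)} for i in range(60)]


def fake_fetch(start: int, limit: int) -> Dict:
    chunk = FAKE_COMMITS[start:start + limit]
    return {'values': chunk, 'isLastPage': start + limit >= len(FAKE_COMMITS)}


commit_ids = [commit['id'] for commit in iterate_pages(fake_fetch)]
print(len(commit_ids))
```

With a real session, fetch_page could wrap a request that passes start and limit as query parameters, as the script does.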
To make this script work, install the requests module in advance (for example, with pip install requests); the csv and sys modules ship with Python. Then pass seven arguments to the script when you run it: the URL of your Bitbucket, login, password, project key, repository name, since date, and until date. Here’s an example:
py script.py https://bitbucket.your-company-name.com login password PRKEY repo-name 2023-11-30 2024-02-01
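Because the script reads its arguments straight from sys.argv, a mistyped or impossible date (such as 2023-02-30) is only noticed once the API is called. As a possible alternative, the sketch below parses the same seven positional arguments with argparse and checks both dates up front. It assumes the YYYY-MM-DD format seen in the example command; iso_date and build_parser are hypothetical helper names, not part of the original script.

```python
import argparse
from datetime import datetime


def iso_date(value: str) -> str:
    """Return value unchanged if it parses as a calendar date with %Y-%m-%d."""
    try:
        datetime.strptime(value, '%Y-%m-%d')
    except ValueError:
        raise argparse.ArgumentTypeError('not a valid YYYY-MM-DD date: ' + value)
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description='Build a pull request size report.')
    # Same order as the sys.argv indexes used by the script.
    for name in ('bitbucket_url', 'login', 'password', 'project', 'repository'):
        parser.add_argument(name)
    parser.add_argument('since', type=iso_date)
    parser.add_argument('until', type=iso_date)
    return parser


# Placeholder values in the same order as the example command.
args = build_parser().parse_args([
    'https://bitbucket.your-company-name.com', 'login', 'password',
    'PRKEY', 'repo-name', '2023-11-30', '2024-02-01',
])
print(args.project, args.since, args.until)
```

Calling parse_args() with no list would read the real command line instead of the placeholder values.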
When the script finishes, the resulting file is saved to the current working directory, which is the script’s folder if you run the script from there.
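Once the report exists, it can be checked against the 200 to 400 line range mentioned earlier. The following is a minimal sketch rather than a definitive analysis: it assumes that pull request size means loc_added plus loc_deleted, uses 400 as the cutoff, and reads made-up sample rows from memory instead of a real report. The function name oversized_pull_requests is hypothetical; the column names match the header row the script writes.

```python
import csv
import io

MAX_LINES = 400  # upper bound of the 200 to 400 line range (assumed cutoff)


def oversized_pull_requests(report_file, max_lines=MAX_LINES):
    """Return (id, size) pairs for pull requests whose added + deleted lines exceed max_lines."""
    oversized = []
    for row in csv.DictReader(report_file):
        size = int(row['loc_added']) + int(row['loc_deleted'])
        if size > max_lines:
            oversized.append((row['id'], size))
    return oversized


# Made-up sample rows with the same header the script writes.
sample = io.StringIO(
    'title,id,author,created,closed,commits,loc_added,loc_deleted\n'
    'Fix login,1,a@example.com,0,0,2,120,30\n'
    'Rewrite parser,2,b@example.com,0,0,14,900,450\n'
)
print(oversized_pull_requests(sample))
```

To run it on a real report, pass a file opened with open(path, newline='') in place of sample. Keep in mind that the script sums statistics per commit, so a line added in one commit and removed in a later one counts toward both totals.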
Want more?
The Awesome Graphs for Bitbucket app and its REST API, in particular, allow you to get much more than described here, and we want to help you get the most out of it. Exclusively for our Data Center clients, we offer a Premium Support Subscription where our tech team will help you write custom scripts for our REST API to get the data you need. Contact us if you have an issue you’d like to solve, and we will assist you.
Here are a few how-tos that may be of help right now: