How to Get the Number of Commits and Lines of Code in Pull Requests

February 4, 2021
#How To#Bitbucket#Reporting
9 min

According to the research conducted by the Cisco Systems programming team, where they tried to determine the best practices for code review, they found out that the pull request size should not include more than 200 to 400 lines of code. Keeping the size of your pull requests within these limits not only will speed up the review but also this amount of information is optimal for the brain to process effectively at a time.

In case you’d like to analyze your current database, counting lines of code manually for each pull request could take years, so we suggest automating this process with the help of Awesome Graphs for Bitbucket and Python. This article will show you how you can build a report with pull request size statistics in terms of lines of code and commits on the repository level.

What you will get

As a result, you’ll get a CSV file containing a detailed list of pull requests created during the specified period with the number of commits, lines of code added and deleted in them.

How to get it

To get the report described above, we’ll run the following script that will make requests into the REST API, and do all the calculations and aggregation for us.

import requests
import csv
import sys

bitbucket_url = sys.argv[1]
login = sys.argv[2]
password = sys.argv[3]
project = sys.argv[4]
repository = sys.argv[5]
since = sys.argv[6]
until = sys.argv[7]

get_prs_url = bitbucket_url + '/rest/awesome-graphs-api/latest/projects/' + project + '/repos/' + repository \
            + '/pull-requests'

s = requests.Session()
s.auth = (login, password)


class PullRequest:

    def __init__(self, title, pr_id, author, created, closed):
        self.title = title
        self.pr_id = pr_id
        self.author = author
        self.created = created
        self.closed = closed


class PullRequestWithCommits:

    def __init__(self, title, pr_id, author, created, closed, commits, loc_added, loc_deleted):
        self.title = title
        self.pr_id = pr_id
        self.author = author
        self.created = created
        self.closed = closed
        self.commits = commits
        self.loc_added = loc_added
        self.loc_deleted = loc_deleted


def get_pull_requests():

    pull_request_list = []

    is_last_page = False

    while not is_last_page:

        response = s.get(get_prs_url, params={'start': len(pull_request_list), 'limit': 1000,
                                      'sinceDate': since, 'untilDate': until}).json()

        for pr_details in response['values']:

            title = pr_details['title']
            pd_id = pr_details['id']
            author = pr_details['author']['user']['emailAddress']
            created = pr_details['createdDate']
            closed = pr_details['closedDate']

            pull_request_list.append(PullRequest(title, pd_id, author, created, closed))

        is_last_page = response['isLastPage']

    return pull_request_list


def get_commit_statistics(pull_request_list):

    pr_list_with_commits = []

    for pull_request in pull_request_list:

        print('Processing Pull Request', pull_request.pr_id)

        commit_ids = []

        is_last_page = False

        while not is_last_page:

            url = bitbucket_url + '/rest/api/latest/projects/' + project + '/repos/' + repository \
                + '/pull-requests/' + str(pull_request.pr_id) + '/commits'
            response = s.get(url, params={'start': len(commit_ids), 'limit': 25}).json()

            for commit in response['values']:
                commit_ids.append(commit['id'])

            is_last_page = response['isLastPage']

        commits = 0
        loc_added = 0
        loc_deleted = 0

        for commit_id in commit_ids:

            commits += 1

            url = bitbucket_url + '/rest/awesome-graphs-api/latest/projects/' + project + '/repos/' + repository \
                + '/commits/' + commit_id
            response = s.get(url).json()

            if 'errors' not in response:
                loc_added += response['linesOfCode']['added']
                loc_deleted += response['linesOfCode']['deleted']
            else:
                pass

        pr_list_with_commits.append(PullRequestWithCommits(pull_request.title, pull_request.pr_id, pull_request.author,
                                                           pull_request.created, pull_request.closed, commits,
                                                           loc_added, loc_deleted))

    return pr_list_with_commits


with open('{}_{}_pr_size_stats_{}_{}.csv'.format(project, repository, since, until), mode='a', newline='') as report_file:

    report_writer = csv.writer(report_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    report_writer.writerow(['title', 'id', 'author', 'created', 'closed', 'commits', 'loc_added', 'loc_deleted'])

    for pr in get_commit_statistics(get_pull_requests()):
        report_writer.writerow([pr.title, pr.pr_id, pr.author, pr.created, pr.closed, pr.commits, pr.loc_added, pr.loc_deleted])

print('The resulting CSV file is saved to the current folder.')

To make this script work, you’ll need to install the requests module in advance, the csv and sys modules are available in Python out of the box. Then you need to pass seven arguments to the script when executed: the URL of your Bitbucket, login, password, project key, repository name, since date, until date. Here’s an example:

py script.py https://bitbucket.your-company-name.com login password PRKEY repo-name 2020-11-31 2021-02-01

As you’ll see at the end of the execution, the resulting file will be saved to the same folder next to the script.

Want more?

The Awesome Graphs for Bitbucket app and its REST API, in particular, allow you to get much more than described here, and we want to help you to get the most of it. If you have an idea in mind or a problem that you’d like us to solve, write here in the comments or create a request in our Help Center, and we’ll cover it in future posts! In fact, the idea for this very article was brought to us by our customers, so there is a high chance that your case will be the next one. 

Here are a few how-tos that you can read right now:

How to Export Commit and Pull Request Data from Bitbucket to CSV

November 26, 2020
#How To#Bitbucket#Reporting
11 min

Being a universal file type, CSV serves as a go-to format for integrations between the applications. It allows for transferring a large amount of data across the systems, even if the integration is not supported natively.  However, you can’t export commit and pull request data from Bitbucket out of the box. The good news is that Awesome Graphs for Bitbucket gives you the capability to export to CSV in different ways.

In this article, we’ll show you how you can use the app to export engineering data to CSV for further integration, organization, and processing in analytics tools and custom solutions.

What you will get

The described ways of exporting will give you two kinds of generated CSV files, depending on the type of data exported. 

In the case of commit data, you’ll get a list of commits with their details:

list of commits with details

And the resulting CSV with a list of pull requests will look like this:

export commit and pull request data to csv

Exporting from the People page

You can export raw commit and pull request data to CSV directly from Bitbucket. When you click All users in the People dropdown menu at the header, you’ll get to the People page with a global overview of developers’ activity in terms of commits or pull requests.

At the top-right corner, you’ll notice the Export menu, where you can choose CSV.

export pull request data from bitbucket

By default, the page shows contributions made within a month, but you can choose a longer period up to a quarter. The filtering applies not only to the GUI but also to the data exported, so if you don’t change the timespan, you’ll get a list of commits or pull requests for the last 30 days.

Exporting via the REST API resources

Beginning with version 5.5.0, Awesome Graphs REST API allows you to retrieve and export commit and pull request data to CSV on global, project, repository, and user levels, using the dedicated resources. This functionality is aimed to automate the processes you used to handle manually and streamline the existing workflows.

You can access the in-app documentation (accessible to Awesome Graphs’ users) by choosing Export → REST API on the People page or go to our documentation website.

We’ll show you two examples of the resources and how they work: one for exporting commits and another for pull requests. You’ll be able to use the rest of the resources as they follow the model.

Export commits to CSV

This resource exports a list of commits with their details from all Bitbucket projects and repositories to a CSV file.

Here is the curl request example:

curl -X GET -u username:password "https://bitbucket.your-company-name.com/rest/awesome-graphs-api/latest/commits/export/csv" --output commits.csv

Alternatively, you can use any REST API client like Postman or put the URL directly into your browser’s address bar (you need to be authenticated in Bitbucket in this browser), and you’ll get a generated CSV file.

By default, it exports the data for the last 30 days. You can set a timeframe for exported data up to one year (366 days) with sinceDate / untilDate parameters:

curl -X GET -u username:password "https://bitbucket.your-company-name.com/rest/awesome-graphs-api/latest/commits/export/csv?sinceDate=2020-10-01&untilDate=2020-10-13" --output commits.csv

For commit resources, you can also use the query parameters such as merges to filter merge/non-merge commits or order to specify the order to return commits in.

Read more about the resource and its parameters.

Export pull requests to CSV

The pull request resources work similarly, so to export a list of pull requests with their details from all Bitbucket projects and repositories to a CSV file, make the following curl request:

curl -X GET -u username:password "https://bitbucket.your-company-name.com/rest/awesome-graphs-api/latest/pull-requests/export/csv" --output pullrequests.csv

The sinceDate / untilDate parameters can also be applied to state the timespan up to a year, but here you have an additional parameter dateType, allowing you to choose either the creation date or the date of the last update as a filtering criterion. So, if you set dateType to created, only the pull requests created during the stated period will be returned, while dateType set to updated will include the pull requests that were updated within the time frame.

Another pull request specific parameter is state, which allows you to filter the response to only include openmerged, or declined pull requests.

For example, the following request will return a list of open pull requests, which were updated between October 1st and October 13th:

curl -X GET -u username:password "https://bitbucket.your-company-name.com/rest/awesome-graphs-api/latest/commits/export/csv?dateType=updated&state=open&sinceDate=2020-10-01&untilDate=2020-10-13" --output pullrequests.csv

Learn more about this resource.

Integrate intelligently

While CSV is supported by many systems and is quite comfortable to manage, it is not the only way for software integrations the Awesome Graphs for Bitbucket app offers. Using the REST API, you can make the data flow between the applications and automate the workflow, eliminating manual work. And we want to make it easier for you and save your time.
Let us know what integrations you are interested in, and we’ll try to bring them to you, so you don’t have to spend time and energy creating workarounds.