21 December 2021

Working with Google Sheets using Python

Recently at Louder, there has been a need for tools that can read user input data and operate upon that data. This includes a tool that sends calendar invitations to the designated support portal lead each day, a tool that creates Data Transfers for each new onboarded client and much more.

I’d considered the pros and cons of creating a web app and API to manage these tools and allowing Louder team members to use a nice user interface to provide information. However, all of our team members feel comfortable using spreadsheets, and since we use Google’s tech stack for the most part, I thought why not use Google Sheets to collect user input?

The approach

Google already has a lot of client libraries for Python, however they can be a little cumbersome to initialise. This becomes especially prevalent writing the same client initialisation and authentication code over and over again for different projects. So, it was decided to write a wrapper around those client libraries.

Where to start?

When designing something like this, start by thinking about what functionality is needed from the tool. What immediately came to mind was the ability to read from, and write to a spreadsheet. In this case, the formatting options accessible through the API are unnecessary, so this wrapper will be fairly simple.

Data usually comes out as a list of lists containing values, when using Google Sheets data via Apps Script and this doesn’t make for a particularly enjoyable experience when working with data. Creating a function to parse the incoming data into a list of dictionaries is be a valuable addition, leaving data stored in key-value pairs that is easy to work with down the line.

Lastly, consider how the data needs to be formatted before being written. Again, Google Sheets expects a list of lists, so a function that converts a list of dictionaries into the correct format is the best way to approach this. Keep in mind, that it is likely adjustments will be made to an existing sheet and, therefore, the header row may not be required writing your data.

Approaching the code

Classes

Now that the objectives have been defined, start writing some code. There are a couple of options around designing the Class(es) to contain the code. Building a SheetReader and SheetWriter Class is one way to store methods for reading and writing separately. However, instantiating two Classes to obtain a read and write operation defeats the purpose of writing a wrapper to cut down on boilerplate code. So, with that in mind, creating one Class of Sheets is the better alternative.

Methods

Simple code is always best, so this method is quite straightforward. Take a step back to think about what is required to read and write data from a Google Sheet:

  1. First, authenticate against the Sheets API. There are two ways to do this - one is as a user, and the other is as a service account. There are great use-cases for both, so set up have two methods (one for each).

  2. Next, there should be a method that constructs a spreadsheet object that can be queried. This will call the googleapiclient.discovery.build() method, and return the build.spreadsheets() method. Using the above spreadsheet method, create both a read, and a write method.

  3. Lastly, to tackle the considerations of data handling, create two methods to transform data extracted and data to be written. These should be a method to convert the extracted data into a list of dictionaries, and a method to convert a list of dictionaries into the correct datatype to write to a spreadsheet (list of lists).

Packaging a Python library

So, all the code is ready, but well written code can, and should be reusable. The best way to do this, is to package it up. What is required to do this? A sane file structure -

  1. In the root of the working directory, create a directory with the name of the library that needs to be built.

  2. Within that directory, create a Python file and write the code there.

  3. Using the Google Sheets wrapper as an example, call the directory sheets_manager and the file, sheets_manager.py.

  4. Add another file into this directory and call it __init__.py. This file will expose the Class, in order to instantiate it in other files.

Here is what it needs to contain:

from .sheets_manager import Sheets

__all__ = ["Sheets"]

5. Finally, create a setup file. This is a file that will sit in the root directory of the project, called setup.py, and will allow the library to be installed via pip.

from setuptools import setup, find_packages
import pathlib
setup(
    name="your_lib_name_here",
    version="1.0",
    description="What does your code do?",
    url="Where is your code hosted?",
    author="You!",
    author_email="Your email",
    license="MIT",
    long_description=None,
    platforms=[],
    install_requires=[ # put your requirements from requirements.txt into this list
        "google-api-python-client==2.24.0",
        "google-auth-httplib2==0.1.0",
        "google-auth-oauthlib==0.4.6",
        "oauth2client==4.1.3",
    ],
    packages=["your_lib_name_here"]
    + ["your_lib_name_here." + pkg for pkg in find_packages("your_lib_name_here")],
)

Now, that this is all set up, start using your code. Simply run pip install path/to/this/dir and it will install into your current development environment.

Alternatively, if the code is stored in a Git repository, install it by running pip install git+ssh://your-repo-url. Note, if using https to clone the repositories, replace ssh with https.

Using the library

I wouldn’t just write an article about building a wrapper around the Google Sheets Client Libraries if I didn’t actually go ahead and do it in the first place. So, here it is, available for anyone with a Google Cloud Platform project to use.

You can install the Library using pip by running pip install git+ssh://github.com/mr-strawberry66/sheets-manager, then import it into a Python file with from sheets_manager import Sheets.

Examples

Read from a sheet

from sheets_manager import Sheets

spreadsheet = Sheets(
    sheet_id="your-sheet-id-here",
    credentials="path/to/your/credentials.json",
    default_range="your_named_range_in_sheets",
)

data = spreadsheet.read_data().to_dict()

Please note that the to_dict() method should be omitted, if it is expected that the data will have jagged rows. This will require further processing before converting to a list of dictionaries.

Write to a sheet

from sheets_manager import Sheets

spreadsheet = Sheets(
    sheet_id="your-sheet-id-here",
    credentials="path/to/your/credentials.json",
    default_range="your_named_range_in_sheets",
)

data = [
    {"Name": "Fred", "Age": 31},
    {"Name": "Julie", "Age": 28},
]

# Write data with the header row
resp = spreadsheet.to_list(data=data).write_data()
"""
Output in Sheet:

Name    Age
Fred    31
Julie   28
"""

# Write data with no header row
resp = spreadsheet.to_list(data=data, header=False).write_data()
"""
Output in Sheet:

Fred    31
Julie   28
"""

About Sam Kenney

Sam is a data engineer. In his spare time he plays guitar for the UK-based alternative band, Worst Case Scenario.