Learn how to write and test a custom library to interact with an HTTP API.

Introduction

Most websites we use provide an HTTP API to enable developers
to access their data from their own applications. For developers utilizing
the API, this usually involves making some HTTP requests to the service, and using
the responses in their applications.
However, this may get tedious since you have to write HTTP requests for each API
endpoint you intend to use. Furthermore, when a part of the API changes, you
have to edit all the individual requests you have written.

A better approach would be to use a library in your language of choice that helps
you abstract away the API’s implementation details. You would access the API
through calling regular methods provided by the library, rather than constructing
HTTP requests from scratch. These libraries also have the advantage of returning
data as familiar data structures provided by the language, hence enabling
idiomatic ways to access and manipulate this data.

In this tutorial, we are going to write a Python library to help us communicate with
The Movie Database’s API from Python code.

By the end of this tutorial, you will learn:

  • How to create and test a custom library which communicates with a third-party API and
  • How to use the custom library in a Python script.

Prerequisites

Before we get started, ensure you have one of the following Python
versions installed:

  • Python 2.7, 3.3, 3.4, or 3.5

We will also make use of the Python packages listed below:

  • requests – We will use this to make HTTP requests,
  • vcrpy – This will help
    us record HTTP responses during tests and test those responses, and
  • pytest – We will use this as our
    testing framework.

Project Setup

We will organize our project as follows:

.
├── requirements.txt
├── tests
│   ├── __init__.py
│   ├── test_tmdbwrapper.py
│   └── vcr_cassettes
└── tmdbwrapper
    ├── __init__.py
    └── tv.py

This sets up a folder for our wrapper and one for holding the tests. The
vcr_cassettes subdirectory inside tests will store our recorded HTTP
interactions with The Movie Database’s API.

Our project will be organized around the functionality we expect to provide in
our wrapper. For example, methods related to TV functionality will be in the
tv.py file under the tmdbwrapper directory.
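If you would rather not create the files by hand, the layout can be scaffolded with a few lines of Python. This is only a minimal sketch, assuming Python 3; the scaffold helper is a hypothetical name of our own, and the demo runs in a throwaway directory.

```python
import os
import tempfile


def scaffold(root):
    """Create the empty folders and files of the project layout (hypothetical helper)."""
    for folder in ("tests/vcr_cassettes", "tmdbwrapper"):
        os.makedirs(os.path.join(root, *folder.split("/")), exist_ok=True)
    for name in ("requirements.txt",
                 "tests/__init__.py",
                 "tests/test_tmdbwrapper.py",
                 "tmdbwrapper/__init__.py",
                 "tmdbwrapper/tv.py"):
        path = os.path.join(root, *name.split("/"))
        # Append mode creates a missing file and never truncates an existing one.
        with open(path, "a"):
            pass


# Demonstrate in a temporary directory; pass "." to scaffold the current folder.
demo_root = tempfile.mkdtemp()
scaffold(demo_root)
created = sorted(os.listdir(demo_root))
print(created)
```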

We need to list our dependencies in the requirements.txt file as follows.
At the time of writing, these are the latest versions. Update the version numbers
if later versions have been published by the time you are reading this.

requests==2.11.1
vcrpy==1.10.3
pytest==3.0.3

Finally, let’s install the requirements and get started:

pip install -r requirements.txt

Test-driven Development

Following the test-driven development practice, we will write the tests for our
application first, then implement the functionality to make the tests pass.

For our first test, let’s test that our module will be able to fetch a TV show’s
info from TMDb successfully.

# tests/test_tmdbwrapper.py

from tmdbwrapper import TV

def test_tv_info():
    """Tests an API call to get a TV show's info"""

    tv_instance = TV(1396)
    response = tv_instance.info()

    assert isinstance(response, dict)
    assert response['id'] == 1396, "The ID should be in the response"

In this initial test, we are demonstrating the behavior we expect our complete
module to exhibit. We expect that our tmdbwrapper package will contain a TV
class, which we can then instantiate with a TMDb TV ID.
Once we have an instance of the class, when we call the info method, it should
return a dictionary containing the TMDb TV ID we provided under the 'id' key.

To run the test, execute the py.test command from the root directory.
As expected, the test will fail with an error message that should contain
something similar to the following snippet:

    ImportError while importing test module '/Users/kevin/code/python/tmdbwrapper/tests/test_tmdbwrapper.py'.
    'cannot import name TV'
    Make sure your test modules/packages have valid Python names.

This is because the tmdbwrapper package is empty right now. From now on, we will
write the package as we go, adding new code to fix the failing tests, adding
more tests and repeating the process until we have all the functionality we need.

Implementing Functionality in Our API Wrapper

To start with, the minimal functionality we can add at this stage is creating the TV class inside our package.

Let’s go ahead and create the class in the tmdbwrapper/tv.py file:

# tmdbwrapper/tv.py

class TV(object):
    pass

Additionally, we need to import the TV class in the tmdbwrapper/__init__.py file,
which will enable us to import it directly from the package.

# tmdbwrapper/__init__.py

from .tv import TV

At this point, we should re-run the tests to see if they pass.
You should now see the following error message:

    >        tv_instance = TV(1396)
    E       TypeError: object() takes no parameters

We get a TypeError. This is good. We seem to be making some progress.
Reading through the error, we can see that it occurs when we try to instantiate
the TV class with a number.
Therefore, what we need to do next is implement a constructor for the TV class
that takes a number. Let’s add it as follows:

# tmdbwrapper/tv.py

class TV(object):
    def __init__(self, id):
        pass

As we just need the minimal viable functionality right now, we will leave the
constructor empty, but ensure that it receives self and id as parameters.
This id parameter will be the TMDb TV ID that will be passed in.

Now, let’s re-run the tests and see if we made any progress. We should see the
following error message now:

>       response = tv_instance.info()
E       AttributeError: 'TV' object has no attribute 'info'

This time around, the problem is that we are calling the info method on tv_instance,
and this method does not exist. Let’s add it.

# tmdbwrapper/tv.py

class TV(object):
    def __init__(self, id):
        pass

    def info(self):
        pass

After running the tests again, you should see the following failure:

    >       assert isinstance(response, dict)
    E       assert False
    E        +  where False = isinstance(None, dict)

For the first time, it’s the actual test failing, and not an error in our code.
To make this pass, we need to make the info method return a dictionary. Let’s
also pre-empt the next failure we expect. Since we know that the returned
dictionary should have an id key, we can return a dictionary with an
'id' key whose value will be the TMDb TV ID provided when the class is initialized.

To do this, we have to store the ID as an instance variable, in order to access
it from the info method.

# tmdbwrapper/tv.py

class TV(object):
    def __init__(self, id):
        self.id = id

    def info(self):
        return {'id': self.id}

If we run the tests again, we will see that they pass.

Writing Foolproof Tests

You may be asking yourself why the tests are passing, since we clearly have not
fetched any info from the API. Our tests were not exhaustive enough.
We need to actually ensure that the correct info that has been fetched from the
API is returned.

If we take a look at the TMDb documentation
for the TV info method, we can see that there are many additional fields
returned from the TV info response, such as poster_path, popularity, name,
overview, and so on.

We can add a test to check that the correct fields are returned in the response,
and this would in turn help us ensure that our tests are indeed checking for a correct
response object back from the info method.

For this case, we will select a handful of these properties and ensure that they
are in the response. We will use pytest fixtures for setting up the list of keys we expect to be included in the response.

Our test will now look as follows:

# tests/test_tmdbwrapper.py

from pytest import fixture
from tmdbwrapper import TV

@fixture
def tv_keys():
    # Responsible only for returning the test data
    return ['id', 'origin_country', 'poster_path', 'name',
              'overview', 'popularity', 'backdrop_path',
              'first_air_date', 'vote_count', 'vote_average']

def test_tv_info(tv_keys):
    """Tests an API call to get a TV show's info"""

    tv_instance = TV(1396)
    response = tv_instance.info()

    assert isinstance(response, dict)
    assert response['id'] == 1396, "The ID should be in the response"
    assert set(tv_keys).issubset(response.keys()), "All keys should be in the response"

Pytest fixtures help us create test data that we can then use in other tests.
In this case, we create the tv_keys fixture which returns a list of some of the
properties we expect to see in the TV response.
The fixture helps us keep our code clean, and explicitly separate the scope of the two
functions.

You will notice that the test_tv_info function now takes tv_keys as a parameter.
In order to use a fixture in a test, the test has to receive the fixture name as
an argument. pytest then calls the fixture and passes in its return value, so we
can make assertions using the test data.
The new assertion ensures that the keys from our fixture are a subset of the
keys present in the response.

This makes it a lot harder for us to cheat in our tests in future, as we did before.

Running our tests again should now fail with a constructive error message,
because our response does not contain all the expected keys.
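To see why the stub no longer satisfies the test, we can evaluate the subset check by hand. The sketch below copies the key list from the fixture and uses the dictionary our placeholder info method returns; the set difference is an extra step of our own that names the missing keys.

```python
# Keys the test expects (copied from the tv_keys fixture).
tv_keys = ['id', 'origin_country', 'poster_path', 'name',
           'overview', 'popularity', 'backdrop_path',
           'first_air_date', 'vote_count', 'vote_average']

# What the placeholder info() method returns for TV(1396).
stub_response = {'id': 1396}

# The same check the test performs.
passes = set(tv_keys).issubset(stub_response.keys())

# The set difference lists the keys that are still missing.
missing = sorted(set(tv_keys) - set(stub_response))

print(passes)
print(missing)
```

Only the 'id' key is present, so the check evaluates to False and the remaining nine keys are reported as missing.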

Fetching Data from TMDb

To make our tests pass, we will have to construct a dictionary object
from the TMDb API response and return that in the info method.

Before we proceed, please ensure you have obtained an API key from TMDb. You can
request one after registering an account on TMDb.
All the available info provided by the API can be viewed in the
API Overview page, and all methods need an API key.

First, we need a requests session
that we will use for all HTTP interactions.
Since the api_key parameter is required for all requests, we will attach it to
this session object so that we don’t have to specify it every time we need to make an
API call. For simplicity, we will write this in the package’s __init__.py
file.

# tmdbwrapper/__init__.py

import os
import requests

TMDB_API_KEY = os.environ.get('TMDB_API_KEY', None)

class APIKeyMissingError(Exception):
    pass

if TMDB_API_KEY is None:
    raise APIKeyMissingError(
        "All methods require an API key. See "
        "https://developers.themoviedb.org/3/getting-started/introduction "
        "for how to retrieve an authentication token from "
        "The Movie Database"
    )

session = requests.Session()
session.params = {}
session.params['api_key'] = TMDB_API_KEY

# Imported last because tv.py imports session from this module.
from .tv import TV

We define a TMDB_API_KEY variable which gets the API key from the
TMDB_API_KEY environment variable. Then, we go ahead and initialize a requests
session and provide the API key in the params object. This means that it will
be appended as a parameter to each request we make with this session object.
If the API key is not provided, we will raise a custom APIKeyMissingError with
a helpful error message to the user.
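To picture what "appended as a parameter to each request" means for the final URL, here is a small standard-library sketch, assuming Python 3. The with_default_params helper is hypothetical and only imitates the visible effect of session.params; it is not how requests implements it.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_default_params(url, defaults):
    """Return url with defaults merged into its query string (hypothetical helper)."""
    parts = urlsplit(url)
    query = dict(defaults)
    # In this sketch, parameters already present in the URL win over the defaults.
    query.update(parse_qsl(parts.query))
    return urlunsplit(parts._replace(query=urlencode(query)))


url = with_default_params('https://api.themoviedb.org/3/tv/1396',
                          {'api_key': 'your-tmdb-api-key'})
print(url)
```

The printed URL carries the key as a query parameter, which is worth remembering once we start saving HTTP interactions to disk.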

Next, we need to make the actual API request in the info method as follows:

# tmdbwrapper/tv.py

from . import session

class TV(object):

    def __init__(self, id):
        self.id = id

    def info(self):
        path = 'https://api.themoviedb.org/3/tv/{}'.format(self.id)
        response = session.get(path)
        return response.json()

First of all, we import the session object that we defined in the package root.
We then need to send a GET request to the TV info URL that returns details about a single TV show, given its ID.
The resulting response object is then returned as a dictionary by calling the
.json() method on it.
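The flow of the info method can be checked without touching the network by swapping in a stand-in session. The following is a sketch under stated assumptions: FakeSession and FakeResponse are hypothetical classes of our own, and this variant of TV receives its session as an argument, which the tutorial's class does not do.

```python
class FakeResponse(object):
    """Hypothetical stand-in for the object returned by session.get."""

    def __init__(self, payload):
        self._payload = payload

    def json(self):
        return self._payload


class FakeSession(object):
    """Hypothetical stand-in for requests.Session that records requested URLs."""

    def __init__(self, payload):
        self.payload = payload
        self.requested = []

    def get(self, url):
        self.requested.append(url)
        return FakeResponse(self.payload)


class TV(object):
    # Assumption: the session is injected here so that the fake can be used.
    def __init__(self, id, session):
        self.id = id
        self.session = session

    def info(self):
        path = 'https://api.themoviedb.org/3/tv/{}'.format(self.id)
        response = self.session.get(path)
        return response.json()


fake = FakeSession({'id': 1396})
result = TV(1396, fake).info()
print(fake.requested)
print(result)
```

The recorded URL shows that the ID is interpolated into the path, and the returned value is whatever the response's json method produces.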

There’s one more thing we need to do before wrapping this up. Since we are now
making actual API calls, we need to take into account some API best practices.
We don’t want to make the API calls to the actual TMDb API every time we run our
tests, since this can get you rate limited.

A better way would be to save the HTTP response the first time a request is made,
then reuse this saved response on subsequent test runs. This way, we minimize
the number of requests we need to make to the API and ensure that our tests still
have access to the correct data. To accomplish this, we will use the vcrpy package,
which is imported as vcr:

# tests/test_tmdbwrapper.py
import vcr

@vcr.use_cassette('tests/vcr_cassettes/tv-info.yml')
def test_tv_info(tv_keys):
    """Tests an API call to get a TV show's info"""

    tv_instance = TV(1396)
    response = tv_instance.info()

    assert isinstance(response, dict)
    assert response['id'] == 1396, "The ID should be in the response"
    assert set(tv_keys).issubset(response.keys()), "All keys should be in the response"

We just need to tell vcr where to store the HTTP response for the
request that will be made for any specific test. See vcr’s docs for
detailed usage information.
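The record-then-replay idea itself can be sketched with the standard library. This is not how vcrpy is implemented (it works at the HTTP layer and stores cassettes as YAML by default); replay_or_record and fake_fetch are hypothetical names used only to illustrate the behavior.

```python
import json
import os
import tempfile


def replay_or_record(cassette_path, fetch):
    """Return the saved response if present; otherwise call fetch and save it."""
    if os.path.exists(cassette_path):
        with open(cassette_path) as cassette:
            return json.load(cassette)
    data = fetch()
    with open(cassette_path, 'w') as cassette:
        json.dump(data, cassette)
    return data


calls = []


def fake_fetch():
    # Stands in for the real HTTP request and counts how often it is invoked.
    calls.append(1)
    return {'id': 1396}


cassette_path = os.path.join(tempfile.mkdtemp(), 'tv-info.json')
first = replay_or_record(cassette_path, fake_fetch)   # records to disk
second = replay_or_record(cassette_path, fake_fetch)  # replays from disk
print(first == second, len(calls))
```

Both calls return the same data, but the fetch function runs only once, which is the property that keeps our test suite from hitting the API on every run.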

At this point, running our tests requires that we have a TMDB_API_KEY environment
variable set, or else we’ll get an APIKeyMissingError.
One way to do this is by setting it right before running the tests,
i.e. TMDB_API_KEY='your-tmdb-api-key' py.test.

Running the tests with a valid API key should have them passing.

Adding More Functions

Now that we have our tests passing, let’s add some more functionality to our
wrapper. Let’s add the ability to return a list of the most popular TV
shows on TMDb. We can add the following test:

# tests/test_tmdbwrapper.py

@vcr.use_cassette('tests/vcr_cassettes/tv-popular.yml')
def test_tv_popular(tv_keys):
    """Tests an API call to get popular TV shows"""

    response = TV.popular()

    assert isinstance(response, dict)
    assert isinstance(response['results'], list)
    assert isinstance(response['results'][0], dict)
    assert set(tv_keys).issubset(response['results'][0].keys())

Note that we are instructing vcr to save the API response in a different file.
Each API response needs its own file.

For the actual test, we need to check that the response is a dictionary
and contains a results key, which contains a list of TV show dictionary objects.
Then, we check the first item in the results list to ensure it is a
valid TV info object, with a test similar to the one we used for the info method.

To make the new tests pass, we need to add the popular method to the TV class.
It should make a request to the popular TV shows path, and then return
the response serialized as a dictionary.
Let’s add the popular method to the TV class as follows:

# tmdbwrapper/tv.py

class TV(object):

    # __init__ and info stay as before

    @staticmethod
    def popular():
        path = 'https://api.themoviedb.org/3/tv/popular'
        response = session.get(path)
        return response.json()

Also, note that this is a staticmethod, which means the class doesn’t need
to be instantiated for it to be used. This is because it doesn’t use any instance
variables, and it’s called directly on the class.
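The difference between an instance method and a static method is easy to see in isolation. The Show class below is a hypothetical example of our own, unrelated to the wrapper's code.

```python
class Show(object):
    """Hypothetical class used only to illustrate the decorator."""

    def __init__(self, id):
        self.id = id

    def describe(self):
        # Instance method: needs self to read per-instance state.
        return 'show {}'.format(self.id)

    @staticmethod
    def category():
        # Static method: receives neither the instance nor the class.
        return 'tv'


print(Show.category())        # called on the class, no instance required
print(Show(1396).describe())  # requires an instance
print(Show(1396).category())  # also callable on an instance
```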

All our tests should now be passing.

Taking Our API Wrapper for a Spin

Now that we’ve implemented an API wrapper, let’s check if
it works by using it in a script. To do this, we will write a program that
lists out all the popular TV shows on TMDb along with their popularity rankings.
Create a file in the root folder of our project. You can name the file anything you like — ours is called testrun.py.

# testrun.py

from __future__ import print_function
from tmdbwrapper import TV

popular = TV.popular()

for number, show in enumerate(popular['results'], start=1):
    print("{num}. {name} - {pop}".format(num=number,
                                         name=show['name'], pop=show['popularity']))

If everything is working correctly, you should see an ordered list of the current
popular TV shows and their popularity rankings on The Movie Database.
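Since the live output depends on whatever is popular at the moment, the formatting logic can also be tried on sample data. In this sketch, format_ranking is a hypothetical helper and the sample dictionary is made up; it only mimics the shape of the 'results' list and is not real TMDb data.

```python
def format_ranking(results):
    """Build one display line per show, numbered from 1 (hypothetical helper)."""
    return ["{num}. {name} - {pop}".format(num=number,
                                           name=show['name'],
                                           pop=show['popularity'])
            for number, show in enumerate(results, start=1)]


# Made-up sample shaped like the API's 'results' list; not real TMDb data.
sample = {'results': [{'name': 'Show A', 'popularity': 12.5},
                      {'name': 'Show B', 'popularity': 8.0}]}

lines = format_ranking(sample['results'])
for line in lines:
    print(line)
```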

Filtering Out the API Key

Since we are saving our HTTP interactions to files on disk, we risk exposing
our API key to other people, which is a Very Bad Idea™,
since they might use it for malicious purposes. To deal with this, we
need to filter out the API key from the saved interactions.
To do this, we add a filter_query_parameters keyword argument to the
vcr.use_cassette decorators as follows:

@vcr.use_cassette('tests/vcr_cassettes/tv-popular.yml', filter_query_parameters=['api_key'])

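This will still save the API responses, but it will leave the api_key query
parameter out of the recorded requests. Cassettes recorded before the filter was
added still contain the key, so delete them and re-run the tests to record them again.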
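Conceptually, the filter removes the named parameters from each request URL before it is written to the cassette. The sketch below imitates that effect with the standard library, assuming Python 3; strip_query_params is a hypothetical helper, not part of vcrpy, and the page parameter is included only for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_query_params(url, names):
    """Return url without the named query parameters (hypothetical helper)."""
    parts = urlsplit(url)
    kept = [(key, value) for key, value in parse_qsl(parts.query)
            if key not in names]
    return urlunsplit(parts._replace(query=urlencode(kept)))


recorded = strip_query_params(
    'https://api.themoviedb.org/3/tv/popular?api_key=your-tmdb-api-key&page=1',
    ['api_key'])
print(recorded)
```

The key no longer appears in the URL that would be stored, while the other parameters are kept.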

Continuous Testing on Semaphore CI

Lastly, let’s add continuous testing to our application using Semaphore CI.

We want to ensure that our package works on various platforms and that we don’t
accidentally break functionality in future versions. We do this through continuous automatic testing.

Ensure you’ve committed everything on Git, and push your repository to GitHub or
Bitbucket, which will enable Semaphore to fetch your code.
Next, sign up for a free Semaphore account, if you don’t have one already.
Once you’ve confirmed your email, it’s time to create a new project.

Follow these steps to add the project to Semaphore:

  1. Once you’re logged into Semaphore, navigate to your list of projects and click the “Add New Project” button:

    Add New Project Screen

  2. Next, select the account where you wish to add the new project.

    Select Account Screen

  3. Select the repository that holds the code you’d like to build:

    Select Repository Screen

  4. Configure your project as shown below:

    Project Configuration Screen

Finally, wait for the first build to run.

It should fail, since as we recall, the TMDB_API_KEY environment variable is required
for the tests to run.

Navigate to the Project Settings page of your application and add your API key
as an environment variable as shown below:

Add environment variable screen

Make sure to check the Encrypt content checkbox when adding the key to ensure
the API key will not be publicly visible.
Once you’ve added that and re-run the build, your tests should be passing again.

Conclusion

We have learned how to write a Python wrapper for an HTTP API by writing one ourselves.
We have also seen how to test such a library, along with some best practices for doing so,
such as not exposing our API keys publicly when recording HTTP responses.

Adding more methods and functionality to our API wrapper should be straightforward,
since we have set up methods that should guide us if we need to add more. We
encourage you to check out the API
and implement one or two extra methods to practice.
This should be a good starting point for writing a Python wrapper for any API
out there.

Please reach out with any questions or feedback that you may have in the comments
section below. You can also check out the complete code and contribute on GitHub.


