Content from Introduction
Last updated on 2025-02-04
Overview
Questions
- What is a Software Management Plan (SMP)?
- Why is an SMP important?
- How is an SMP useful?
Objectives
- Understand the role of a Software Management Plan (SMP).
- Understand that the stage and scope of your software can determine that some parts of the SMP are not relevant (yet).
- Understand that no matter the scope of your software, an SMP is always relevant.
Introduction
A Software Management Plan (SMP) is a formal document explaining how software is written and managed both during and after a research project. It is a living document and will evolve with the boundary conditions of your project and software. While it is encouraged to write an SMP before starting to develop code, it is never too late to create one for existing projects.
Importance
SMPs provide value both inside and outside your organization. The things that you and your organization might find valuable in an SMP are:
- Writing the plan encourages you to think about the roles and responsibilities within the project, thus defining tasks and responsibilities early on.
- Once a plan has been filled out, it can also be used to give guidance to new team members, thus reducing the time needed for onboarding.
- Writing an SMP will guide you through the best practices that you can apply to your software based on the size and scope of your project. Following the best practices outlined in an SMP will make it easier for others to use or cite your software.
Outside your organization, research funders in particular have become more aware of SMPs and are starting to require them for all of the above reasons. From the funder's perspective, a well-thought-out SMP also demonstrates the feasibility and reliability of the project. It is thus a good idea to prepare a well-written SMP when submitting a funding proposal.
Key Points
- An SMP is valuable at any stage of your project
- It outlines how the software supports the vision of your project
- It encourages you to follow best practices based on the scope of your project
Content from Intermediate software testing
Last updated on 2024-12-04
Overview
Questions
- How can I make changes to my research code while being sure existing functionality still works?
- How can I execute the same test with multiple parameters?
- What is code coverage and how can it help me verify the functionality of my code?
- How do I create independent testing for my code without having to instantiate all the software?
- How can I avoid depending on external systems during my tests?
- How can I check if a given change improves the performance of my code?
- How do I make sure that the application I have deployed for another party still works?
- How can I make sure my program stops when impossible cases are found?
Objectives
- Use pytest to write tests.
- Use parameterized tests.
- Use code coverage to get an idea of how confident you can be that the system still works when changing the code.
- Use mocks to mock out complex paths of code.
- Use stubs to stub out complex paths of code.
- Use a performance testing tool to see if the speed of the code conforms to our demands.
- Use smoke tests to do a quick check if the application should still run.
- Use runtime testing to guard against unexpected cases in production.
Introduction
In this episode we are going to take a look at a few different types of automated testing. We will also see how we can use code coverage to increase our confidence that everything still works when we make a change to the code. We assume you have worked through the earlier material on this website.
1. Improve testing
Add code coverage
What is code coverage? Code coverage is the percentage of your research / production code that is covered by unit tests. With a very high percentage, a change that breaks the code is very likely to be caught before it reaches your users; with a low percentage, the chances of catching such bugs are low, or you have to do a lot of manual testing. When working with Python and pytest there are packages to easily get the test coverage of your application. The one used most is pytest-cov.
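As a quick sketch of how this could look (assuming a package named mypackage, a hypothetical name, with its tests in tests/), you can install the plugin and produce a coverage report like this:
BASH
# Install the coverage plugin for pytest
pip install pytest-cov

# Run the test suite and report which lines of mypackage are not covered
pytest --cov=mypackage --cov-report=term-missing tests/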
Add parameterized tests
When writing tests it sometimes happens that you want a lot of tests for the same function. You could write many test functions with the same setup, each calling the function under test with different parameters. A cleaner way, which leaves you less code to maintain afterwards, is to use parameterized tests. With these you add the different parameters as inputs to your test function. An example looks like this:
PYTHON
@pytest.mark.parametrize(
    ("onset", "phenomenon", "expected"),
    [
        (
            "2024-12-09T11:31:14Z",
            "snow-ice",
            "Monday 9 December: chance of snow/road icing",
        ),
        (
            "2025-01-04T00:00:00Z",
            "low-temperature",
            "Saturday 4 January: chance of cold",
        ),
    ],
    ids=["special_case", "normal_case"],
)
def test_get_english_headline(onset: str, phenomenon: str, expected: str) -> None:
    """test generation of english headline"""
    assert _get_english_headline({"onset": onset, "phenomenon": phenomenon}) == expected
As you can see, even the expected result is now an input of the test. We can use the ids parameter to give each parameter set a name. With this name you can also run the test for only one of the ids, as shown below. For more information on parameterized tests you can read this how-to guide.
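For example, assuming the test above lives in a file called test_headline.py (a hypothetical name), you could run only the special_case variant like this:
BASH
# Run only the parameter set with the id "special_case"
pytest "test_headline.py::test_get_english_headline[special_case]"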
2. Testing a unit of software without having to instantiate all the code
Sometimes you want to test a function, but that function uses a lot of complex objects (and those objects in turn need other objects…). One way to deal with this is to pass those complex objects as inputs to the function. You can then use a mock to avoid having to create all those objects yourself. In the code below we see the complex class being mocked and then given an implementation for when its method is called. This way we don't need to create input_one and input_two with all of their possible inputs. This type of test double tests both state and behaviour.
PYTHON
from unittest.mock import MagicMock


class Complex:
    def __init__(self, input_one, input_two):
        self.input_one = input_one
        self.input_two = input_two

    def execute(self):
        """do complex things"""
        pass


def function_under_test(my_complex_object_with_multiple_inputs):
    return my_complex_object_with_multiple_inputs.execute()


def test_function_under_test():
    inputs = MagicMock()
    inputs.execute = MagicMock(return_value=3)
    result = function_under_test(inputs)
    expected = 3
    assert result == expected
    assert inputs.execute.call_count == 1
For more information on mocking you can read this quick guide.
3. Working with external systems during a test
When writing code you do not always have the data on your machine; sometimes you need to download data over HTTP. A lot of the time people use the requests library for this (when you have async code, aiohttp is a nice alternative). In your unit tests, however, you don't want to be dependent on the network, because it is unreliable and can make your tests fail for no reason. One way around this is to split the HTTP call into its own function and use a fake response when testing the surrounding logic. The following code calls the German weather open data platform to get thunderstorm data. The page's data gets updated a lot, but the format stays the same. The actual API calls can then be tested in an integration test, which can also cover the error handling. More information about integration testing can be found at the Turing Way.
PYTHON
import requests
from bs4 import BeautifulSoup
from unittest.mock import patch


def get_konrad3d_data(url):
    response = requests.get(url)
    return response.text


def extract_latest_file(overview_page):
    soup = BeautifulSoup(overview_page, features="html.parser")
    urls = soup.find_all('a')
    latest_file = urls[-1].get('href')
    return latest_file


def get_latest_file_konrad3d():
    url = "https://opendata.dwd.de/weather/radar/konrad3d/"
    overview_page = get_konrad3d_data(url)
    latest_file = extract_latest_file(overview_page)
    return latest_file


data = '<html><head><title>Index of /weather/radar/konrad3d/</title></head><body><h1>Index of /weather/radar/konrad3d/</h1><hr><pre><a href="../">../</a><a href="KONRAD3D_20241116T093000.xml">KONRAD3D_20241116T093000.xml</a> 16-Nov-2024 09:34 3895<a href="KONRAD3D_20241118T092500.xml">KONRAD3D_20241118T092500.xml</a> 18-Nov-2024 09:30 3938</pre><hr></body></html>'


def test_download_latest_data_konrad3d():
    # The patch target assumes this code lives in a module called "faketest";
    # adjust it to the name of your own module.
    with patch("faketest.get_konrad3d_data", return_value=data):
        result = get_latest_file_konrad3d()
        expected = "KONRAD3D_20241118T092500.xml"
        assert result == expected
Another way to avoid these API calls is to use the requests_mock library to mock requests API calls. This makes you dependent on another library and still does not show you whether things work in reality. It is used by a lot of people, but personally I prefer fewer dependencies and write integration tests for the integration with external systems. Mocking at this level can give you a false sense of security, as happened with the CrowdStrike outage and their testing. If you want to use it, an example can be found below.
PYTHON
import requests
import requests_mock


def get_konrad3d_data(url):
    response = requests.get(url)
    return response.text


def test_download_latest_data_konrad3d():
    data = '<html><head><title>Index of /weather/radar/konrad3d/</title></head><body><h1>Index of /weather/radar/konrad3d/</h1><hr><pre><a href="../">../</a><a href="KONRAD3D_20241116T093000.xml">KONRAD3D_20241116T093000.xml</a> 16-Nov-2024 09:34 3895<a href="KONRAD3D_20241118T092500.xml">KONRAD3D_20241118T092500.xml</a> 18-Nov-2024 09:30 3938</pre><hr></body></html>'
    url = 'https://opendata.dwd.de/weather/radar/konrad3d/'
    with requests_mock.Mocker() as m:
        m.get(url, text=data)
        result = get_konrad3d_data(url)
        assert result == data
4. Performance testing of functions
There are moments when the performance of your function matters a lot: you might not want a single function to ever execute slower than x seconds. To test this you can write tests for the specific functions that should stay fast. The idea is that you run a function a number of times, and the maximum duration of any run must not be higher than x seconds. A useful library for these kinds of tests in Python is pytest-benchmark. It can also be used to check whether performance has improved between versions of the code.
PYTHON
import time


def function_to_test(duration=1):
    time.sleep(duration)
    return 123


def test_my_function(benchmark):
    allowed_speed = 1.000002
    result = benchmark.pedantic(function_to_test, iterations=5)
    assert benchmark.stats.stats.max < allowed_speed
    assert result == 123
This code can be run with the following command: pytest -v -s file_this_is_in.py::test_my_function. It will run the code 5 times, and none of the calls is allowed to be slower than the allowed_speed.
When you write APIs you can also have performance requirements. For this another type of tool is used; one of the most used tools in Python is locust. For more information about this tool you can look at their documentation.
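As a minimal sketch of what a locust load test looks like (the endpoint and host below are assumptions, not from the original material):
PYTHON
from locust import HttpUser, task


class ApiUser(HttpUser):
    """Simulated user that repeatedly requests the root endpoint."""

    @task
    def get_root(self):
        # Each simulated user issues GET requests to "/" on the host
        # given on the command line.
        self.client.get("/")
Saved as locustfile.py, this could be run with, for example, locust -f locustfile.py --host http://localhost:8000.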
5. Smoke testing to see if your application is still doing its basic functionality
There are moments when you want to start an application, but the application has some prerequisites that need to be met before you can say that it's good and allowed to run. For this you can use a smoke test. For example, when you have an application that reads configuration files from a file system whenever a user calls it, the check could be whether the files exist at the correct location and the format is as expected. Maybe someone manually moved the files, and this could break the whole system. So when the files are not there, there is smoke, and thus if it's production we could get a fire. The example below shows how to test something like this inside the same application. However, most of the time those checks would live in another script that runs before you start this one (or, if you use Kubernetes, in an init container).
PYTHON
from pathlib import Path

# Example location of the required config file (adjust to your application).
CONFIG_FILE = Path("config.yaml")


def config_file_is_found():
    # Check whether the config file exists at the expected location.
    return CONFIG_FILE.exists()


def main():
    # Application logic.
    pass


if __name__ == '__main__':
    if not config_file_is_found():
        raise FileNotFoundError("our config file is not found")
    main()
More information about smoke tests can be found at the Turing Way.
6. Runtime testing
When software is in production and you introduce a new path inside the code, you might want to run it for a while without actually enforcing the behaviour inside that code path. An example: when we implemented an extra validation for our public data platform, we first added the validation in a mode that allowed everything, like before, but executed the new logic and logged everything unexpected that happened. This gave us a lot of information about what would happen when we turned the feature on for real. One important thing we found out was that inside our network some HTTP requests would only reach their destination after 10+ seconds; the application would already have given the users an error by then, and that's not what we wanted. Because of this information we could add a solution, so that when we eventually brought our check live no users got an error.
An example of a check like this can be found below.
PYTHON
import logging

logger = logging.getLogger(__name__)


def my_new_validation_logic_to_external_api():
    print("do an external api call")
    return True


def get_observation_data():
    return "observation data"


def give_the_user_observation_data():
    try:
        is_allowed = my_new_validation_logic_to_external_api()
        if not is_allowed:
            # Log instead of blocking: the new check runs, but does not
            # change the behaviour the user sees yet.
            logger.warning("for user with id x we get not allowed back")
            is_allowed = True
    except Exception as exc:
        logger.warning("We got the following exception: %s", str(exc))
        is_allowed = True
    if is_allowed:
        return get_observation_data()
An example of where you might want to do this in a research project is reinforcement learning where steps take too long: running the new code path in logging-only mode first tells you whether enforcing it is the most cost-effective option at that moment. More information on runtime testing can be found at the Turing Way.
7. Closing words
In the previous parts we have looked at quite a few different types of tests, with examples, as well as some ways of making tests more reusable and of improving their quality. We would like to end with a few more resources where you can find information about different types of tests and testing tools:
Content from Continuous Integration
Last updated on 2025-02-12
Overview
Questions
- What is CI/CD?
- Why do we use CI/CD?
- What does CI/CD have to do with version control?
- What is a CI/CD pipeline?
- What is docker and how does it relate to CI/CD?
Objectives
- Be able to explain the basic concepts of Continuous Integration and Continuous Delivery.
- Be able to explain (identify three reasons) why Continuous Integration and Continuous Delivery should be used.
- Be able to identify freely available tools and services to implement these concepts in a research context.
- Be able to explain the relationship between CI/CD and version control.
- Be able to build a simple CI/CD pipeline or workflow.
The main goal is to create awareness of the concept of Continuous Integration and Continuous Delivery and some of the tools to support it.
Continuous Integration and Continuous Deployment / Delivery
Introduction
The Turing way explains the concept very well.
Continuous Integration should not be confused with DevOps: CI/CD is not DevOps, but DevOps effectively requires CI/CD. DevOps is the concept where a team is responsible for the entire life cycle of a software product or software component, from development to deployment to operating and maintaining it in production. Continuous Integration and Continuous Delivery and/or Deployment play a large role in this. But DevOps is not required for leveraging CI/CD.
We should distinguish Continuous Integration, Continuous Delivery and Continuous Deployment.
For more information about DevOps see the guide at GitHub.
Continuous Integration
Continuous integration is the practice of integrating all your code changes into the main branch of a shared source code repository early and often, automatically testing each change when you commit or merge them, and automatically kicking off a build. With continuous integration, errors and security issues can be identified and fixed more easily, and much earlier in the development process. – gitlab.com
Continuous Delivery
Continuous delivery is a software development practice that works in conjunction with CI to automate the infrastructure provisioning and application release process. – gitlab.com
This can be understood as creating a Docker container, a PyPI package for Python, a jar file for Java, or equivalents for programming languages like R, C/C++ and Fortran. This will be done in an automated way every time a change is pushed to the git repository on GitHub, GitLab or some other platform.
Continuous Deployment
Continuous deployment enables organizations to deploy their applications automatically, eliminating the need for human intervention. – gitlab.com
When talking about deployment, we mean that the software is running on a server and the services it provides are available for consumption by other software components. For research software that is the focus of this course, that is rarely the case, so we ignore Continuous Deployment and focus on Integration and Delivery.
Why is Continuous Integration and Continuous Delivery recommended?
CI/CD should be used when work is done in a collaborative project where changes created by different contributors need to be merged and tested. The earlier in the process this is done, the easier any bugs or merge conflicts are to solve.
However, even in projects with a single developer, utilizing CI/CD tools can be very beneficial. It enables users of the software to have early access to it, so bugs are discovered sooner. It also enables the developer to run unit tests in an automated way to discover bugs early in the process.
- The earlier conflicts and bugs are discovered, the easier they are to fix.
- Deliver value to the user of the software quickly
Containers
Nowadays software is often distributed or deployed using containers. These containers hold every dependency an application needs and can run anywhere. This is especially useful for applications that run as a service on a cloud provider such as Google or Amazon Web Services, where it is not known beforehand where an application will run. Containers are lightweight operating system images in which the application is stored together with everything it needs. CI/CD is extremely useful for automatically building such container images, as explained below in the section on pipelines and workflows. Running applications in containers tends to enforce decoupling from external dependencies and communicating with external services through well-defined and stable interfaces. Building and distributing containers is generally done using docker, but there are others such as podman. You can find more information about containers here.
Publishing your application as a docker image is advised if the application has a lot of dependencies or if the build process is very complicated. If the application is simple with only a few dependencies, then creating a docker image probably creates too much overhead. Docker is not suitable for publishing a module or library. Here is a tutorial on how to build docker images. Docker is a commercial application with a community edition that can be used free of charge; usually the community edition provides more than enough functionality. Podman is a free and open source alternative to Docker if you prefer.
The docker website has a list of guides on creating docker images for all kinds of purposes.
Pipelines or workflows
A CI/CD pipeline is an automated process utilized by software development teams to streamline the creation, testing and deployment of applications. – gitlab.com
Pipeline stages
- verification, testing
- build
- deployment (often less relevant for researchers, who typically do not operate continuously running services)
- perhaps publishing packages to PyPI or similar
Simple github workflow
Github workflows and alternatives from other suppliers enable you to trigger certain jobs on certain events. For instance, you can configure it to only update the documentation if the only changes pushed to the repository are in the documentation. Alternatively you can trigger it to build a release package only when merging to the main branch. Check the Understanding GitHub Actions guide for all the possibilities.
Github Pages is a workflow that works out of the box once you have configured it on the repository's settings page.
Workflows are defined in the .github/workflows directory in a git repository. Github has a quick start tutorial to get you started with workflows. The example below will update the repository's GitHub Pages site when a change is pushed to the gh-pages branch. As you can see, it will only build on pushes to the gh-pages branch.
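A minimal sketch of such a workflow could look like this (the action versions and the site path . are assumptions, not the original example):
YAML
name: Deploy GitHub Pages
on:
  push:
    branches: [gh-pages]   # only run on pushes to the gh-pages branch
permissions:
  contents: read
  pages: write
  id-token: write
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - uses: actions/checkout@v4
      # Package the site content as a Pages artifact and deploy it
      - uses: actions/upload-pages-artifact@v3
        with:
          path: .
      - id: deployment
        uses: actions/deploy-pages@v4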
Build containers
- Reproducible: all dependencies and packages included.
Github[11] and docker[12] have tutorials on building docker container images in GitHub workflows.
[11] https://github.com/actions/starter-workflows/blob/main/ci/docker-image.yml [12] https://docs.docker.com/guides/gha/
And more
Github provides a collection of starter workflows that you can build on.
Demo: From zero to published package in 15 minutes
Step 1: Create a Python project
The first step is to create a Python project on your personal computer. There are several tools to help you with this; in this tutorial we'll be using Poetry. Next, turn the project into a git repository and add the project's contents to it.
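A minimal sketch of these commands (assuming Poetry is already installed; the project name matches the repository created in the next step):
BASH
# Create a new Python project skeleton with Poetry
poetry new demo-tdcc-nes
cd demo-tdcc-nes

# Turn the project into a git repository
git init

# Add the project's contents and make a first commit
git add .
git commit -m "Initial commit"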
Step 2: Create a github repository and push the project to it
- Go to https://github.com and log in.
- Click on "New" to create a new project and give it the name demo-tdcc-nes.
- Click "Create repository" at the bottom right. No need to change anything else.
Then connect the local repository to GitHub and push the project.
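A sketch of the commands (replace <your-username> with your own GitHub user name):
BASH
# Point the local repository at the new GitHub repository
git remote add origin git@github.com:<your-username>/demo-tdcc-nes.git
# Push the main branch and set it as the upstream branch
git push -u origin main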
Refresh the page in the browser and you will see the contents of your project.
Step 3: Create a python module and a test
With a text editor create a file hello.py in the demo_tdcc_nes/ subfolder of the repository and add a hello function to it.
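A minimal sketch of hello.py, inferred from the test created later in this step (the exact original contents may differ):
PYTHON
def hello(thing: str) -> str:
    """Return a greeting for the given thing."""
    return f"Hello {thing}"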
Next, to create a test, create a folder tests in the repository's top-level directory.
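One way to do this (the empty __init__.py matches the repository layout shown below):
BASH
mkdir tests
touch tests/__init__.py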
So, the repository should look like this:
.
├── demo_tdcc_nes
│ ├── hello.py
│ └── __init__.py
├── pyproject.toml
├── README.md
└── tests
└── __init__.py
Now create a file test_hello.py in the tests folder and paste the following content:
PYTHON
import pytest

import demo_tdcc_nes
import demo_tdcc_nes.hello


@pytest.mark.parametrize('thing, expected', [("TDCC-NES", "Hello TDCC-NES")])
def test_hello(thing: str, expected: str) -> None:
    result = demo_tdcc_nes.hello.hello(thing)
    assert result == expected
Install the pytest package:
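For example (with Poetry you could instead use poetry add --group dev pytest):
BASH
pip install pytest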
Then run the tests as follows:
And examine the result of a passing test:
TXT
$ pytest tests/
================================================== test session starts ===================================================
platform linux -- Python 3.11.9, pytest-8.3.3, pluggy-1.5.0
rootdir: /home/user/projects/TDCC/demo-tdcc-nes
configfile: pyproject.toml
collected 1 item
tests/test_hello.py . [100%]
=================================================== 1 passed in 0.01s ====================================================
$
Step 4: Commit the changes to git and push them to github
Type git status to show the files that have been added or have had their contents changed since the last commit.
$ git status
On branch main
Your branch is up to date with 'origin/main'.
Untracked files:
(use "git add <file>..." to include in what will be committed)
demo_tdcc_nes/hello.py
tests/test_hello.py
nothing added to commit but untracked files present (use "git add" to track)
Stage the changes, commit, and push them to github.
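A minimal sketch of these commands (the commit message is just an example):
BASH
# Stage the new files and commit them
git add demo_tdcc_nes/hello.py tests/test_hello.py
git commit -m "Add hello module and test"

# Push the changes to github
git push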
Step 5: Add a github actions workflow
The next step is to add a workflow to trigger github into creating a package and make it available.
Create the .github/workflows subfolder in the repository.
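One way to create it:
BASH
mkdir -p .github/workflows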
The repository now should look like this:
.
├── demo_tdcc_nes
│ ├── hello.py
│ ├── __init__.py
│ └── __pycache__
│ ├── hello.cpython-311.pyc
│ └── __init__.cpython-311.pyc
├── .github
│ └── workflows
├── pyproject.toml
├── README.md
└── tests
├── __init__.py
├── __pycache__
│ ├── __init__.cpython-311.pyc
│ └── test_hello.cpython-311-pytest-8.3.3.pyc
└── test_hello.py
In the .github/workflows folder create the file build-demo-tdcc-nes.yml. The name of the file is not very important; it is helpful, however, to choose a name that identifies what it does. It should have the extension .yml or .yaml so it can be identified as a yaml file. Add the content below to the .github/workflows/build-demo-tdcc-nes.yml file:
YAML
name: Upload Python Package
on: [push]
permissions:
  contents: read
jobs:
  release-build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Build release distributions
        run: |
          # NOTE: put your own distribution build steps here.
          python -m pip install build
          python -m build
      - name: Upload distributions
        uses: actions/upload-artifact@v4
        with:
          name: release-dists
          path: dist/
  pypi-publish:
    runs-on: ubuntu-latest
    needs:
      - release-build
    steps:
      - name: Retrieve release distributions
        uses: actions/download-artifact@v4
        with:
          name: release-dists
          path: dist/
Stage the change, commit, and push.
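A sketch of these commands (the commit message is an example):
BASH
git add .github/workflows/build-demo-tdcc-nes.yml
git commit -m "Add package build workflow"
git push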
Click on the "Actions" tab on the github web page to see the workflow. Then click on the workflow (it has the title of the git commit message). A small graph is shown, and at the bottom a "release-dists" link is provided. Click on that to download the package.
That’s it! Package published!
Step 6: Publishing the package to PyPi
You can publish the package to PyPI if you have an account at https://pypi.org/. Follow the steps in the Python Packaging User Guide to make the package available through PyPI.
Considerations
CI/CD pipelines are not very suitable if your tests require a lot of static data. Running large integration tests inside a CI/CD pipeline is therefore not recommended, as pipelines generally have limited space and time; prefer small and fast unit tests that run automatically inside the pipeline. It is therefore helpful to practice software engineering best practices such as decoupling, since that leads to more easily testable code. Larger integration tests can still be run in CI/CD as long as they don't require more than a few hundred megabytes of space and can complete within, say, 30 minutes. Check the resource limits your CI/CD infrastructure provider (e.g. github or gitlab) imposes on CI/CD pipelines. If you expect your tests to require more time and space than the CI/CD platform of your choice allows, consider alternative approaches such as blue/green deployments. Another alternative is to host your own pipeline runners that can be associated with your project on github or gitlab.