Setting up your dev environment
In order to contribute to Great Expectations, you will need the following:
A GitHub account. This is sufficient if you only want to contribute to the documentation.
If you want to contribute code, you will also need a working version of Git on your computer. Please refer to the Git setup instructions for your environment.
We also recommend going through the SSH key setup process on GitHub for easier authentication.
Fork and clone the repository
1. Fork the Great Expectations repo
Go to the Great Expectations repo on GitHub.
Click the Fork button in the top right. This will make a copy of the repo in your own GitHub account.
GitHub will take you to your forked version of the repository.
2. Clone your fork
Click the green Clone button and choose the SSH or HTTPS URL depending on your setup.
Copy the URL and run git clone <url> in your local terminal.
This will clone the develop branch of the great_expectations repo. Please use develop (not master!) as the starting point for your work.
Atlassian has a nice tutorial for developing on a fork.
3. Add the upstream remote
On your local machine, cd into the great_expectations repo you cloned in the previous step, then run:
git remote add upstream git@github.com:great-expectations/great_expectations.git
This sets up a remote called upstream to track changes to the main Great Expectations repository.
4. Create a feature branch to start working on your changes.
git checkout -b feature/my-feature-name
We do not currently follow a strict naming convention for branches. Please pick something clear and self-explanatory, so that it will be easy for others to get the gist of your work.
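While you work, the develop branch of the main repo will keep moving. The sketch below shows one way to refresh your local develop from the upstream remote set up in step 3 before creating a feature branch. The function name start_feature is hypothetical, and the fast-forward-only merge is an assumption about workflow, not a project rule.

```shell
# Hypothetical helper, not part of the Great Expectations repo.
# Usage: start_feature my-feature-name
start_feature() {
  name="$1"
  if [ -z "$name" ]; then
    echo "usage: start_feature <name>" >&2
    return 1
  fi
  # Fetch the latest commits from the main repo.
  git fetch upstream || return 1
  # Switch to the local develop branch and fast-forward it to upstream/develop.
  git checkout develop || return 1
  git merge --ff-only upstream/develop || return 1
  # Create the feature branch on top of the refreshed develop.
  git checkout -b "feature/$name"
}
```

Run it from inside your clone, for example start_feature my-feature-name. If the fast-forward fails, your local develop has commits that upstream does not, and you will need to sort that out by hand.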
Install Python dependencies
5. Create a new virtual environment
Make a new virtual environment (e.g. using virtualenv or conda) and name it “great_expectations_dev” or similar.
virtualenv great_expectations_dev; source great_expectations_dev/bin/activate
This is not required, but highly recommended.
6. Install dependencies from requirements-dev.txt
pip install -r requirements-dev.txt
This will ensure that you have the right libraries installed in your Python environment.
7. Install great_expectations from your cloned repo
pip install -e .
The -e flag installs Great Expectations in “editable” mode. This is not required, but is often very convenient as a developer.
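Before moving on, it can help to confirm that a virtual environment is actually active, since installing into the system Python by accident is an easy mistake. The sketch below assumes that activation sets VIRTUAL_ENV (as virtualenv does) or CONDA_PREFIX (as conda does). The function name env_is_active is hypothetical.

```shell
# Hypothetical helper, not part of the Great Expectations repo.
# Succeeds if a virtualenv or conda environment appears to be active,
# judged only by the environment variables that activation sets.
env_is_active() {
  [ -n "${VIRTUAL_ENV:-}" ] || [ -n "${CONDA_PREFIX:-}" ]
}

if env_is_active; then
  echo "Environment active: ${VIRTUAL_ENV:-$CONDA_PREFIX}"
else
  echo "No virtual environment detected; packages would go into the default Python." >&2
fi
```

This only inspects environment variables, so it cannot tell whether the environment is the one you intended. Printing great_expectations.__file__ from Python after step 7 is one way to confirm that the package resolves to your clone.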
(Optional) Configure resources for testing and documentation
Depending on which features of Great Expectations you want to work on, you may want to configure different backends for local testing, such as postgresql and Spark. Also, there are a couple of extra steps if you want to build documentation locally.
If you want to develop against local postgresql:
To simplify setup, the repository includes a docker-compose file that can stand up a local postgresql container. To use it, you’ll need to have docker installed.
Navigate to the directory in your great_expectations repo that contains the docker-compose file and run
docker-compose up -d
Within the same directory, you can run docker-compose ps to verify that the container is running. You should see something like:
Name                     Command                         State   Ports
----------------------------------------------------------------------------------------
postgresql_travis_db_1   docker-entrypoint.sh postgres   Up      0.0.0.0:5432->5432/tcp
Once you’re done testing, you can shut down your postgresql container by running docker-compose down from the same directory.
Caution: If another service is using port 5432, docker may start the container but silently fail to set up the port. In that case, you will probably see errors like this:
psycopg2.OperationalError: could not connect to server: Connection refused
    Is the server running on host "localhost" (::1) and accepting
    TCP/IP connections on port 5432?
could not connect to server: Connection refused
    Is the server running on host "localhost" (127.0.0.1) and accepting
    TCP/IP connections on port 5432?
Or this:
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL: database "test_ci" does not exist
(Background on this error at: http://sqlalche.me/e/e3q8)
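Because the port conflict described above fails silently, one option is to check whether something is already listening on port 5432 before starting the container. The sketch below is a convenience under stated assumptions, not something shipped in the repository: port_in_use is a hypothetical name, and it relies on bash's /dev/tcp redirection, which plain sh does not provide.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the Great Expectations repo.
# Succeeds if a TCP connection to 127.0.0.1:<port> can be opened.
# Uses bash's built-in /dev/tcp pseudo-device; the subshell keeps the
# file descriptor (and any failure of exec) from affecting the caller.
port_in_use() {
  local port="$1"
  (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null
}

if port_in_use 5432; then
  echo "Port 5432 is already in use; stop the other service before starting the container." >&2
else
  echo "Port 5432 looks free."
fi
```

The check only probes the IPv4 loopback address, so treat a "looks free" result as a hint rather than a guarantee.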
If you want to develop against local Spark:
In most cases, pip install -r requirements-dev.txt should set up pyspark for you.
If you don’t have Java installed, you will probably need to install it and set your JAVA_HOME environment variable appropriately.
You can find official installation instructions for Spark here.
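A misconfigured JAVA_HOME is a common reason for pyspark failing to start. As a rough sanity check, the sketch below tests whether JAVA_HOME points at a directory with an executable bin/java, which is the usual layout of a Java installation. The function name java_home_ok is hypothetical.

```shell
# Hypothetical helper, not part of the Great Expectations repo.
# Succeeds if JAVA_HOME is set and contains an executable bin/java.
java_home_ok() {
  [ -n "${JAVA_HOME:-}" ] && [ -x "${JAVA_HOME}/bin/java" ]
}

if java_home_ok; then
  echo "JAVA_HOME looks usable: ${JAVA_HOME}"
else
  echo "JAVA_HOME is unset or does not contain bin/java." >&2
fi
```

Passing this check does not guarantee that the Java version is one your Spark version supports; it only rules out the most basic misconfiguration.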
If you want to build documentation locally:
pip install -r docs/requirements.txt
To build documentation, the command is
cd docs; make html
Documentation will be generated with index.html as the index page.
Run tests to confirm that everything is working
You can run all tests by running pytest in the great_expectations directory root. Please see Testing for testing options and details.
At this point, you have everything you need to start coding!