Process for contributing to the docs

Here is how you can contribute to the fastai documentation in just 4 steps.

Step 1. Create a fastai git branch

The process of creating a branch (with a fork) and submitting the PR, including a program that creates the branch for you in one step, is explained in detail in How to Make a Pull Request (PR).

Step 2. Setup

From the fastai repo or your forked branch checkout folder, install the required developer modules:

pip install -e ".[dev]"

If you’re on Windows, you also need to convert the Unix symlink between docs_src\imgs and docs\imgs. You will need to (1) remove docs_src\imgs, (2) execute cmd.exe as administrator, and (3) finally, in the docs_src folder, execute:

cd docs_src
mklink /d imgs ..\docs\imgs

If you followed the fastai-specific instructions explained here, you’re all set. If you made a PR branch in some other way, it’s crucial that you execute:

tools/run-after-git-clone  # or python tools\run-after-git-clone on windows

in that branch once. You can read more about it here.

Step 3. Edit the documents

There are two types of source files: *ipynb and *md files.

  1. *ipynb notebook files, located under the directory docs_src, are the sources for most of the *html files on fastai1.fast.ai. For example, https://fastai1.fast.ai/data_block.html is generated from the docs_src/data_block.ipynb.

    While you can use a normal editor for editing this type of file, it’s difficult to edit json-format files and it’s very easy to break them. Instead, edit *ipynb files by opening them in your Jupyter Notebook environment.

    If you did edit a notebook in a text editor, make sure to validate its format when you are done by loading it in Jupyter Notebook.

    When you finish editing a notebook, remember to save it before doing git commit!

    You don’t need to convert your work to HTML: we will do it after your PR is accepted and merged.

    Note: JupyterLab is currently not supported. If you missed this warning and have already edited .ipynb files in JupyterLab, you can fix them.

  2. *md text files, located at docs/*.md and docs/*/*.md, require no Jupyter environment, i.e. they contain plain text formatted using Markdown. Note that, unlike the *ipynb files, these are located in the docs directory. For example, the source of https://fastai1.fast.ai/troubleshoot.html is docs/troubleshoot.md.

    Edit these files in your editor. To validate the Markdown, use grip or any other Markdown rendering/validating tool of your choice.

Do not edit the docs/*html files because they are autogenerated. If you change them, your changes will get overwritten. We autogenerate the docs about once a day, so your change will become visible on docs.fast.ai then.
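If you edited a notebook in a text editor (see item 1 above), a quick structural check can catch broken JSON before you load the file in Jupyter Notebook. The following is a minimal sketch and not part of the fastai tooling: the helper name check_notebook_text is hypothetical, and it assumes the nbformat 4 layout (a top-level "cells" list and an "nbformat" field). It is less thorough than actually opening the notebook.

```python
import json


def check_notebook_text(text):
    """Return a list of problems found in the raw text of an .ipynb file.

    Hypothetical helper: it assumes the nbformat 4 layout and only checks
    the overall structure, not the contents of individual cells.
    """
    try:
        nb = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err}"]
    if not isinstance(nb, dict):
        return ["top-level JSON value is not an object"]
    problems = []
    if not isinstance(nb.get("cells"), list):
        problems.append("missing or malformed 'cells' list")
    if "nbformat" not in nb:
        problems.append("missing 'nbformat' version field")
    return problems
```

To use it, read the file yourself, e.g. check_notebook_text(open("docs_src/data_block.ipynb", encoding="utf-8").read()). An empty list means no structural problem was detected.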

Step 4. Submit a PR with your changes

See Submit Your PR.

You’re done. You don’t need to do anything more at this point other than check on the status of your PR. If everything is good, it will be merged. If there are issues with it, you may be asked to make some changes, which you commit and push just like your initial commit. The PR will be updated automatically with your new changes.

There is also a visual step-by-step demo of how Daniel did his first PR that improved fastai documentation.

Down the road, once you get comfortable submitting documentation PRs, you can explore the rest of the document, but leave it for later, so you don’t get confused.

How the docs are created - a Visual diagram

To help you better understand the documentation creation process, here is a diagram of the stages each type of document goes through after it has been edited and before the changes appear on the website:

1. edit    2. tools/build-docs  3. jekyll (githubpages)
     |            |                   |
docs_src/*.ipynb ----> docs/*.html -----> docs.fast.ai/*html
docs/*.md ------------------------------> docs.fast.ai/*html

At the end of stages 1 and 2, git commit and git push are needed; the 3rd stage happens automatically. *md files require no stage 2.

After your PR is merged, we rebuild the docs (using tools/build-docs) and then we commit/push the rebuilt docs. Then, GitHub Pages usually updates the website automatically within a few minutes. Therefore, if your changes aren’t visible on the website, despite your PR being merged, it’s because the second stage hasn’t been done. It happens once a day or so, so please be patient.
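The mapping in the diagram can be sketched in a few lines of Python. This is only an illustration, not part of the build tooling: the function name published_url is hypothetical, and the SITE constant assumes the https://fastai1.fast.ai address used in the examples earlier in this document.

```python
from pathlib import PurePosixPath

# Assumption: the site address used in this document's examples.
SITE = "https://fastai1.fast.ai"


def published_url(source):
    """Map a documentation source path to the page it becomes on the website."""
    path = PurePosixPath(source)
    parts = path.parts
    if path.suffix == ".ipynb" and parts[0] == "docs_src":
        # Stage 2 (tools/build-docs) turns docs_src/X.ipynb into docs/X.html.
        rel = PurePosixPath(*parts[1:]).with_suffix(".html")
    elif path.suffix == ".md" and parts[0] == "docs":
        # *md files skip stage 2; docs/X.md is served as X.html.
        rel = PurePosixPath(*parts[1:]).with_suffix(".html")
    else:
        raise ValueError(f"not a documentation source file: {source}")
    return f"{SITE}/{rel}"
```

For example, published_url("docs_src/data_block.ipynb") gives https://fastai1.fast.ai/data_block.html, matching the example in Step 3.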

If you find this section’s instructions unclear or difficult to follow, please let us know in this thread so that we can improve them.

The rest of this document goes into more depth about the documentation generation functionality. You don’t need to read or understand any of it to make a successful contribution to the existing documents. If you just want to add some text or correct a typo, make a PR with the changed notebook, and we’ll take care of the rest.

Syncing added/updated API with docs

If you’re just improving the documentation prose, and not the API documentation, edit the desired source files directly: docs_src/*.ipynb, docs/*.md and docs/dev/*.md. The following instructions are only needed if you tweak the API and need to make those changes visible/up-to-date in the documentation files.

The bulk of the process and setup is explained in gen_doc.gen_notebooks, but its primary purpose is the heavy lifting of documenting new modules. Here you will find the minimal instructions needed to do a simple synchronization of newly added functions/classes, or their updates, for fastai modules that already have corresponding documentation files.

Prerequisites

Install the prerequisites:

pip install -e ".[dev]"

Install the Hide Input Jupyter Notebook extension:

  1. start jupyter notebook
  2. go to http://localhost:8888/nbextensions
  3. enable the Hide Input extension

Initial Synchronization

Let’s say we want to make some changes to the docs for data_block.py:

First, run an update to sync any API changes that happened before your work but were not synchronized with the docs. The code is usually ahead of the docs, and the docs don’t get updated all the time.

tools/build-docs --update-nb-links docs_src/data_block.ipynb

Assuming the build was successful, commit the changes.

While you don’t have to do this first, it’s very helpful, since now when you re-run the build you will be able to see only the changes you introduced and not potentially hundreds of changes that have nothing to do with your modifications.

Now, you can start working on the docstrings of the new or updated functions and classes, and extra prose that you’d like to add to the documentation.

Adding a new function/class

Say you added a method to data_block.py:

def foobar(self, times:int=1)->'str':
    "This function returns FooBar * times"
    return "FooBar" * times

While the docstring can be of any length, the fastai coding style asks that it be no longer than 120 characters.

Any extra notes should be placed inside the corresponding entry in the .ipynb and, if they are really important, perhaps also as comments in the .py, following the docstring.
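The 120-character guideline above can be checked mechanically. Here is a minimal sketch using Python’s standard ast module; the function name long_docstrings is hypothetical and this is not an existing fastai tool.

```python
import ast

MAX_DOCSTRING_LEN = 120  # limit quoted from the fastai coding style above


def long_docstrings(source, limit=MAX_DOCSTRING_LEN):
    """Return (name, length) pairs for functions/classes whose docstring exceeds `limit`."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # clean=False keeps the docstring exactly as written in the source.
            doc = ast.get_docstring(node, clean=False)
            if doc is not None and len(doc) > limit:
                found.append((node.name, len(doc)))
    return found
```

Pass it the text of a module, e.g. the contents of data_block.py. An empty list means every docstring fits within the limit.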

Now run:

tools/build-docs --document-new-fns docs_src/data_block.ipynb

and if you now load docs_src/data_block.ipynb in Jupyter Notebook and scroll down to the very end of the notebook, you will find a new method entry was added under the header New Methods - Please document or move to the undocumented section.

Take that cell and move it up to where it belongs in the document—most likely in the same position it’s found in the source code .py file. Note that if you select this cell and hit Toggle cell input display (the Hide Input extension) in the menu, you will see something like:

show_doc(LabelLists.foobar)

If something is not displayed correctly, look up the show_doc function, where you can adjust the arguments to make things look right. For example, you can pass the full_name='foobar' argument to adjust a function name (usually helpful with functions that start with _), or you can pass title_level=3 if you want it to show up at header level 3.

When you’re done tweaking the show_doc(...) input for that entry, remember to hit the Toggle cell input display button again to hide it, so that on the final docs site the input is not displayed but the output it generates is.

If you need to add any extra comments, examples or instructions, create one or more cells and add what you need (markup and code if need be).

When satisfied, first make sure to save the notebook (since it auto-saves only every few minutes), and then convert this notebook into html with:

tools/build-docs docs_src/data_block.ipynb

It can be a good idea to run git diff to check your changes, although this can be tricky since the output format is not very human-friendly. Still, it will show you if you messed something up, e.g. if you deleted something unintentionally.

Finally, commit the modified .ipynb and the corresponding .html file:

git commit docs_src/data_block.ipynb docs/data_block.html

and then push the changes into the repo.

Several minutes after the push, you will see the updated documents at https://fastai1.fast.ai/data_block.html.

Updating an existing function/class

To take care of updating any changes in the API’s arguments and docstrings that are already in the API docs, execute the following command. (We are using data_block as an example.)

tools/build-docs --update-nb-links docs_src/data_block.ipynb

Then, as in the previous section, check the diff, commit, and push.

Creating a new documentation notebook from existing module

If a fastai.* Python module already exists, but there is no associated documentation notebook (docs_src/*.ipynb), you can auto-generate one by running the following:

tools/build-docs fastai.subpackage.module

This will create a skeleton documentation notebook, docs_src/subpackage.module.ipynb. It will be populated with all the module methods, which will then need to be documented.

To change the default header levels (e.g. h3 or h4 instead of h2), adjust them with an explicit title_level argument in the corresponding show_doc() entry. For example:

show_doc(...., title_level=4)

See the documentation for show_doc for more options.

Borked rendering

If, after git pull, you load e.g. docs_src/data_block.ipynb in Jupyter Notebook and get a bunch of cryptic entries like:

<IPython.core.display.Markdown object>
<IPython.core.display.Markdown object>
<IPython.core.display.Markdown object>

instead of the API documentation entries, executing:

tools/build-docs --update-nb-links docs_src/data_block.ipynb

and then reloading the notebook fixes the problem.

Building the documentation website

The https://fastai1.fast.ai/ website is composed of documentation notebooks converted to .html, .md files, jekyll metadata, and jekyll templates (including the sidebar).

  • .md files are automatically converted by GitHub Pages (requires no extra action)
  • the sidebar and other jekyll templates under docs/_data/ are automatically deployed by GitHub Pages (requires no extra action)
  • changes in jekyll metadata require a rebuild of the affected notebooks
  • changes in .ipynb nbs require a rebuild of the affected notebooks

Updating sidebar

  1. edit docs_src/sidebar/sidebar_data.py
  2. python tools/make_sidebar.py
  3. check docs/_data/sidebars/home_sidebar.yml
  4. git commit docs_src/sidebar/sidebar_data.py docs/_data/sidebars/home_sidebar.yml

See also the jekyll sidebar documentation.

Updating notebook metadata

In order to pass the right settings to the website version of the docs, each notebook has a custom entry which, if you look at the notebook’s source, looks like:

 "metadata": {
  "jekyll": {
   "keywords": "fastai",
   "toc": "false",
   "title": "Welcome to fastai"
  },
  [...]

Do not edit this entry manually, or your changes will be overwritten in the next metadata update.

The only correct way to change any notebook’s metadata is by opening docs_src/jekyll_metadata.ipynb, finding the notebook you want to change the metadata for, changing it, and running the notebook, then saving and committing it and the resulting changes.
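To inspect a notebook’s current jekyll entry without opening the raw JSON by hand, a read-only sketch like the following can help. The function name jekyll_metadata is hypothetical. It only reads the entry; changes must still go through docs_src/jekyll_metadata.ipynb as described above.

```python
import json


def jekyll_metadata(notebook_text):
    """Return the 'jekyll' metadata dict of a notebook, or {} if it has none.

    Read-only: this never writes to the notebook.
    """
    nb = json.loads(notebook_text)
    return nb.get("metadata", {}).get("jekyll", {})
```

For a notebook containing the entry shown above, jekyll_metadata(text)["title"] returns "Welcome to fastai".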

Updating notebooks

Use this section only when you have added a new function that you want to document, or modified an existing function.

Here is how to build/update the documentation notebooks to reflect changes in the library.

For most cases, run the full doc sync (which also updates the test registry):

make docs

To update only the modified notebooks under docs_src run:

python tools/build-docs

To update specific *ipynb nbs:

python tools/build-docs docs_src/notebook1.ipynb docs_src/notebook2.ipynb ...

To update specific fastai.* modules:

python tools/build-docs fastai.subpackage1.module1 fastai.subpackage2.module2 ...

To force a rebuild of all notebooks and not just the modified ones, use the -f option.

python tools/build-docs -f

To scan a module and add any new module functions to the documentation notebook:

python tools/build-docs --document-new-fns

To automatically append new fastai methods to their corresponding documentation notebook:

python tools/build-docs --update-nb-links

Use -h for more options.

Alternatively, update_notebooks can be run from the notebook.

To update all notebooks under docs_src run:

update_notebooks('.')

To update a specific notebook only:

update_notebooks('gen_doc.gen_notebooks.ipynb', update_nb=True)

update_nb=True inserts newly added module methods into the docs that haven’t already been documented.

Alternatively, you can update a specific module:

update_notebooks('fastai.gen_doc.gen_notebooks', dest_path='fastai/docs_src')

Updating html only

If you are not synchronizing the code base with its documentation, but made some manual changes to the documentation notebooks, then you don’t need to update the notebooks, but just convert them to .html:

To convert docs_src/*ipynb to docs/*html:

  • only the modified *ipynb:
python tools/build-docs -l
  • specific *ipynbs:
python tools/build-docs -l docs_src/notebook1.ipynb docs_src/notebook2.ipynb ...
  • force to rebuild all *ipynbs:
python tools/build-docs -fl

After you commit doc changes, please validate that all the links and #anchors are correct.

If it’s the first time you are about to run the link checker, install the prerequisites first.

After committing the new changes, first, wait a few minutes for GitHub Pages to sync, otherwise you’ll be testing an outdated live site.

Then, do:

cd tools/checklink
./checklink-docs.sh

The script will be silent and only report problems as it finds them.

Remember that it tests the live website, so if you detect problems and make any changes, first commit the changes and wait a few minutes before re-testing.

You can also test the site locally before committing your changes; please see the README.

To test the course-v3.fast.ai site, do:

./checklink-course-v3.sh
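The checklink scripts test the live site. As a complementary local sanity check for #anchors, the sketch below looks for same-page links (href="#...") in a single HTML file that have no matching id or name target. It is not a substitute for the checklink scripts and does not follow links to other pages; the name broken_fragments is hypothetical.

```python
from html.parser import HTMLParser


class _AnchorCollector(HTMLParser):
    """Collect anchor targets and same-page fragment links from one HTML page."""

    def __init__(self):
        super().__init__()
        self.targets = set()   # values of id="..." and <a name="...">
        self.fragments = []    # same-page links such as href="#section"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.targets.add(attrs["id"])
        if tag == "a" and attrs.get("name"):
            self.targets.add(attrs["name"])
        href = attrs.get("href") or ""
        if tag == "a" and href.startswith("#") and len(href) > 1:
            self.fragments.append(href[1:])


def broken_fragments(html_text):
    """Return same-page fragments that do not match any id/name in the page."""
    collector = _AnchorCollector()
    collector.feed(html_text)
    return [f for f in collector.fragments if f not in collector.targets]
```

Run it on the text of a generated page, e.g. the contents of docs/data_block.html. An empty list means every same-page anchor has a target.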

Working with Markdown

Preview

If you work on Markdown (.md) files, it helps to validate that the layout is not broken. grip seems to work quite well for this purpose. To install grip, run:

pip install grip

Here is an example of how to use grip:

grip -b docs/dev/release.md

It will open a browser with the Markdown rendered as HTML. It uses the GitHub API, so this is exactly how it’ll look on GitHub once you commit it. And here is a handy alias:

alias grip='grip -b'

so you don’t need to remember the flag.

Markdown Tips

  • If you use numbered items and their number goes beyond 9, you must switch to a 4-space indentation for the paragraphs belonging to each item. For numbers up to 9, or with *, you need a 3-space leading indentation.
  • When building tables make sure to use --|-- and not --+-- to separate the headers: GitHub will not render it properly otherwise.
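The table-separator tip can be checked automatically. The following is a minimal sketch, assuming a separator row consists only of dashes, optional colons, and the separating characters; the name plus_separator_lines is hypothetical.

```python
import re

# A header separator row that uses '+' between runs of dashes, e.g. "--+--".
PLUS_SEPARATOR = re.compile(r"^\s*\|?\s*:?-+:?\s*(\+\s*:?-+:?\s*)+\|?\s*$")


def plus_separator_lines(markdown_text):
    """Return 1-based line numbers of separator rows that use '+' instead of '|'."""
    return [
        number
        for number, line in enumerate(markdown_text.splitlines(), start=1)
        if PLUS_SEPARATOR.match(line)
    ]
```

Any line number it reports is a separator that should be rewritten with --|--.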

Testing site locally

Install prerequisites:

  • Debian/Ubuntu:

     sudo apt install ruby-bundler

    The next command will ask for your user’s password (it basically runs a sudo operation):

     bundle install jekyll

  • Mac OS:

     gem install bundler
     cd docs
     bundle install

Start the website:

cd docs
bundle exec jekyll serve

It will tell you which localhost URL to open to see the site.

Cleanups

Check whether we have any docs/*html orphans that no longer have docs_src/*ipynb source files:

perl -le 'm#docs/(.*?)\.html# && !-e "docs_src/$1.ipynb" && print for @ARGV' docs/*html

and remove them from git.
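For those who prefer Python to Perl, the same orphan check can be sketched as follows. This is a rough equivalent of the one-liner above, not an existing fastai tool; the names find_orphans and orphaned_html are hypothetical.

```python
from pathlib import Path


def find_orphans(html_paths, notebook_paths):
    """Return the HTML paths whose stem has no notebook with the same stem."""
    notebook_stems = {Path(p).stem for p in notebook_paths}
    return sorted(str(p) for p in html_paths if Path(p).stem not in notebook_stems)


def orphaned_html(repo_root="."):
    """List docs/*.html files that have no matching docs_src/*.ipynb source."""
    root = Path(repo_root)
    return find_orphans(
        (root / "docs").glob("*.html"),
        (root / "docs_src").glob("*.ipynb"),
    )
```

Calling orphaned_html() from the repository root returns the candidate files. Review the list before removing anything from git.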