
Imagine that you want to develop a non-trivial end-user desktop (not web) application in Python. What is the best way to structure the project's folder hierarchy?

Desirable features are ease of maintenance, IDE-friendliness, suitability for source control branching/merging, and easy generation of install packages.

In particular:

  1. Where do you put the source?
  2. Where do you put application startup scripts?
  3. Where do you put the IDE project cruft?
  4. Where do you put the unit/acceptance tests?
  5. Where do you put non-Python data such as config files?
  6. Where do you put non-Python sources such as C++ for pyd/so binary extension modules?
1
  • 2
    Can any of those SO individuals that keep voting for "question is not focused enough" please step up and propose a more "focused" question?
    – conny
    Nov 28, 2023 at 13:23

8 Answers

509

Doesn't too much matter. Whatever makes you happy will work. There aren't a lot of silly rules because Python projects can be simple.

  • /scripts or /bin for that kind of command-line interface stuff
  • /tests for your tests
  • /lib for your C-language libraries
  • /doc for most documentation
  • /apidoc for the Epydoc-generated API docs.

And the top-level directory can contain README's, Config's and whatnot.

The hard choice is whether or not to use a /src tree. Python doesn't have a distinction between /src, /lib, and /bin like Java or C has.

Since a top-level /src directory is seen by some as meaningless, your top-level directory can be the top-level architecture of your application.

  • /foo
  • /bar
  • /baz

I recommend putting all of this under the "name-of-my-product" directory. So, if you're writing an application named quux, the directory that contains all this stuff is named /quux.

Another project's PYTHONPATH, then, can include /path/to/quux/foo to reuse the QUUX.foo module.

In my case, since I use Komodo Edit, my IDE cruft is a single .KPF file. I actually put that in the top-level /quux directory, and omit adding it to SVN.

13
  • 38
    Any open source python projects you would recommend emulating their directory structure? Oct 5, 2009 at 17:02
  • 6
    Look at Django for a good example.
    – S.Lott
    Oct 5, 2009 at 19:08
  • 54
    I don't tend to consider Django a good example -- playing tricks with sys.path is an instant DQ in my book. Oct 29, 2010 at 16:20
  • 26
    re "tricks": Django adds the parent of the root project folder to the sys.path, so that modules can be imported as either "from project.app.module import klass" or "from app.module import klass". Nov 7, 2011 at 7:12
  • 4
    Oh, I love this trick and am using it now. I want to put the shared module in another directory, and I do not want to install module system-wide, nor do I want to ask people to modify PYTHONPATH manually. Unless people propose something better, I think this is actually the cleanest way to go.
    – Yongwei Wu
    Jul 5, 2017 at 9:39
319

According to Jean-Paul Calderone's Filesystem structure of a Python project:

Project/
|-- bin/
|   |-- project
|
|-- project/
|   |-- test/
|   |   |-- __init__.py
|   |   |-- test_main.py
|   |   
|   |-- __init__.py
|   |-- main.py
|
|-- setup.py
|-- README
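
For readers who want to try this layout, here is a minimal sketch that creates the skeleton in a scratch directory. The helper name create_skeleton and all file contents are illustrative assumptions; the blog post only names the files.

```python
import tempfile
from pathlib import Path

# Relative paths from the tree above, mapped to placeholder contents.
# The contents are assumptions made for this sketch.
LAYOUT = {
    "bin/project": "#!/usr/bin/env python\nfrom project.main import main\n\nmain()\n",
    "project/__init__.py": "",
    "project/main.py": "def main():\n    print('hello from project')\n",
    "project/test/__init__.py": "",
    "project/test/test_main.py": "",
    "setup.py": "",
    "README": "Project\n",
}


def create_skeleton(root):
    """Create every file listed in LAYOUT under root and return the paths."""
    root = Path(root)
    created = []
    for relative, content in LAYOUT.items():
        path = root / relative
        # Make intermediate directories such as project/test/ on demand.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
        created.append(path)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        for path in create_skeleton(Path(scratch) / "Project"):
            print(path.relative_to(scratch))
```

The setup.py and test files are left empty here on purpose; they are placeholders to fill in, not working examples.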
17
  • 54
    how does executable file in the bin folder reference the project module? (I don't think python syntax allows ../ in an include statement) Jun 8, 2014 at 6:46
  • 4
    @ThorSummoner That only works when remaining inside a single package. For the relative import to work here you'd need an __init__.py file in both the bin folder and the Project top folder.
    – Yonatan
    Nov 2, 2014 at 14:13
  • 12
    @ThorSummoner Simple. You install the package! (pip install -e /path/to/Project)
    – Kroltan
    Dec 19, 2014 at 17:27
  • 39
    It would be awesome if someone would zip up a sample of this layout with hello.py and hello-test.py and make it available for us newbs. Jan 15, 2015 at 14:36
  • 12
    @Bloke The core is the -e flag, which installs the package as an editable package, that is, installs it as links to the actual project folder. The executable can then just import project to have access to the module.
    – Kroltan
    Mar 14, 2016 at 12:15
304

This blog post by Jean-Paul Calderone is commonly given as an answer in #python on Freenode.

Filesystem structure of a Python project

Do:

  • name the directory something related to your project. For example, if your project is named "Twisted", name the top-level directory for its source files Twisted. When you do releases, you should include a version number suffix: Twisted-2.5.
  • create a directory Twisted/bin and put your executables there, if you have any. Don't give them a .py extension, even if they are Python source files. Don't put any code in them except an import of and call to a main function defined somewhere else in your project. (Slight wrinkle: since on Windows, the interpreter is selected by the file extension, your Windows users actually do want the .py extension. So, when you package for Windows, you may want to add it. Unfortunately there's no easy distutils trick that I know of to automate this process. Considering that on POSIX the .py extension is only a wart, whereas on Windows the lack is an actual bug, if your userbase includes Windows users, you may want to opt to just have the .py extension everywhere.)
  • If your project is expressible as a single Python source file, then put it into the directory and name it something related to your project. For example, Twisted/twisted.py. If you need multiple source files, create a package instead (Twisted/twisted/, with an empty Twisted/twisted/__init__.py) and place your source files in it. For example, Twisted/twisted/internet.py.
  • put your unit tests in a sub-package of your package (note - this means that the single Python source file option above was a trick - you always need at least one other file for your unit tests). For example, Twisted/twisted/test/. Of course, make it a package with Twisted/twisted/test/__init__.py. Place tests in files like Twisted/twisted/test/test_internet.py.
  • add Twisted/README and Twisted/setup.py to explain and install your software, respectively, if you're feeling nice.

Don't:

  • put your source in a directory called src or lib. This makes it hard to run without installing.
  • put your tests outside of your Python package. This makes it hard to run the tests against an installed version.
  • create a package that only has a __init__.py and then put all your code into __init__.py. Just make a module instead of a package, it's simpler.
  • try to come up with magical hacks to make Python able to import your module or package without having the user add the directory containing it to their import path (either via PYTHONPATH or some other mechanism). You will not correctly handle all cases and users will get angry at you when your software doesn't work in their environment.
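
The advice about keeping executables thin can be sketched as follows. The module path project.main and the greeting logic are hypothetical. The only point is that all real code lives in an importable main function, so the file in bin/ needs nothing beyond an import and a call.

```python
import sys


def main(argv=None):
    """Hypothetical entry point that would live in the package, e.g. project/main.py."""
    # Fall back to the real command line when no argument list is passed in.
    args = sys.argv[1:] if argv is None else list(argv)
    name = args[0] if args else "world"
    print(f"hello, {name}")
    return 0  # used as the process exit status


# The launcher in bin/ would then contain only an import and a call:
#
#     import sys
#     from project.main import main
#     sys.exit(main())
```

Because main accepts an explicit argument list, it can be called directly from a unit test without spawning a process.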
15
  • 38
    This was exactly what I needed. "DONT try to come up with magical hacks to make Python able to import your module or package without having the user add the directory containing it to their import path." Good to know! Feb 28, 2013 at 5:18
  • 21
    Confused about "put your source in a directory called src or lib. This makes it hard to run without installing.". What would be installed? Is it the dir name that causes the issue, or the fact that it is a subdir at all? Feb 3, 2015 at 22:25
  • 14
    "This makes it hard to run without installing." -- that's the point
    – Nick T
    Jul 11, 2019 at 21:23
  • 27
    I find it ironic that the example uses Twisted as the project name, since the official Twisted library now uses a src layout, which contradicts the first "Don't" recommendation: "put your source in a directory called src or lib. This makes it hard to run without installing." This is the whole point (see Ionel Cristian Mărieș's article).
    – Géry Ogam
    Oct 22, 2019 at 13:09
  • 11
    Do: "put your source in a directory called src or lib." Jun 21, 2020 at 9:15
157

Check out Open Sourcing a Python Project the Right Way.

Let me excerpt the project layout part of that excellent article:

When setting up a project, the layout (or directory structure) is important to get right. A sensible layout means that potential contributors don't have to spend forever hunting for a piece of code; file locations are intuitive. Since we're dealing with an existing project, it means you'll probably need to move some stuff around.

Let's start at the top. Most projects have a number of top-level files (like setup.py, README.md, requirements.txt, etc). There are then three directories that every project should have:

  • A docs directory containing project documentation
  • A directory named with the project's name which stores the actual Python package
  • A test directory in one of two places
    • Under the package directory containing test code and resources
    • As a stand-alone top level directory

To get a better sense of how your files should be organized, here's a simplified snapshot of the layout for one of my projects, sandman:

$ pwd
~/code/sandman
$ tree
.
|- LICENSE
|- README.md
|- TODO.md
|- docs
|   |-- conf.py
|   |-- generated
|   |-- index.rst
|   |-- installation.rst
|   |-- modules.rst
|   |-- quickstart.rst
|   |-- sandman.rst
|- requirements.txt
|- sandman
|   |-- __init__.py
|   |-- exception.py
|   |-- model.py
|   |-- sandman.py
|   |-- test
|       |-- models.py
|       |-- test_sandman.py
|- setup.py

As you can see, there are some top level files, a docs directory (generated is an empty directory where sphinx will put the generated documentation), a sandman directory, and a test directory under sandman.

5
  • 7
    I do this, but more so: I have a toplevel Makefile with an 'env' target that automates 'virtualenv env ; ./env/bin/pip install -r requirements.txt ; ./env/bin/python setup.py develop', and also usually a 'test' target that depends on env and also installs test dependencies and then runs py.test.
    – pjz
    Nov 10, 2014 at 3:12
  • @pjz Could you please expand on your idea? Are you talking about putting the Makefile at the same level as setup.py? So if I understand you correctly, make env automates creating a new venv and installing the packages into it... ?
    – St.Antario
    Nov 28, 2019 at 7:01
  • 1
    @St.Antario exactly. As mentioned I generally also have a 'test' target to run the tests, and sometimes a 'release' target that looks at the current tag and builds a wheel and sends it to pypi.
    – pjz
    Jan 9, 2020 at 2:08
  • In this structure, how does any file inside /code/sandman/sandman/ import something in /code/sandman/docs/? Say, I want to import config.py from sandman.py. How should I do that? Apr 22, 2021 at 13:13
  • 2
    This answer is outdated and not recommended anymore. Please don't reproduce this today. See packaging.python.org/en/latest/tutorials/packaging-projects
    – buhtz
    Aug 8, 2022 at 18:51
56

The "Python Packaging Authority" has a sampleproject:

https://github.com/pypa/sampleproject

It is a sample project that exists as an aid to the Python Packaging User Guide's Tutorial on Packaging and Distributing Projects.

2
32

Try starting the project using the python_boilerplate template. It largely follows the best practices (e.g. those here), but is better suited in case you find yourself willing to split your project into more than one egg at some point (and believe me, with anything but the simplest projects, you will. One common situation is where you have to use a locally-modified version of someone else's library).

  • Where do you put the source?

    • For decently large projects it makes sense to split the source into several eggs. Each egg would go as a separate setuptools-layout under PROJECT_ROOT/src/<egg_name>.
  • Where do you put application startup scripts?

    • The ideal option is to have application startup script registered as an entry_point in one of the eggs.
  • Where do you put the IDE project cruft?

    • Depends on the IDE. Many of them keep their stuff in PROJECT_ROOT/.<something> in the root of the project, and this is fine.
  • Where do you put the unit/acceptance tests?

    • Each egg has a separate set of tests, kept in its PROJECT_ROOT/src/<egg_name>/tests directory. I personally prefer to use py.test to run them.
  • Where do you put non-Python data such as config files?

    • It depends. There can be different types of non-Python data.
      • "Resources", i.e. data that must be packaged within an egg. This data goes into the corresponding egg directory, somewhere within package namespace. It can be used via the pkg_resources package from setuptools, or since Python 3.7 via the importlib.resources module from the standard library.
      • "Config-files", i.e. non-Python files that are to be regarded as external to the project source files, but have to be initialized with some values when the application starts running. During development I prefer to keep such files in PROJECT_ROOT/config. For deployment there can be various options. On Windows one can use %APPDATA%/<app-name>/config, on Linux, /etc/<app-name> or /opt/<app-name>/config.
      • Generated files, i.e. files that may be created or modified by the application during execution. I would prefer to keep them in PROJECT_ROOT/var during development, and under /var during Linux deployment.
  • Where do you put non-Python sources such as C++ for pyd/so binary extension modules?
    • Into PROJECT_ROOT/src/<egg_name>/native
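
One way to encode the config-file locations listed above is a small lookup function. This is only a sketch under stated assumptions: the function name config_dir, the <APP>_CONFIG_DIR override variable, and the Windows fallback when APPDATA is unset are all hypothetical choices, not an established convention.

```python
import os
import sys
from pathlib import Path


def config_dir(app_name, platform=None, environ=None, project_root=None):
    """Pick a config directory following the conventions listed above.

    Passing project_root signals a development checkout. platform and
    environ default to the values of the running interpreter.
    """
    platform = sys.platform if platform is None else platform
    environ = os.environ if environ is None else environ

    # An explicit override wins. The variable name is an assumption.
    override = environ.get(f"{app_name.upper()}_CONFIG_DIR")
    if override:
        return Path(override)

    # Development: keep config files next to the sources.
    if project_root is not None:
        return Path(project_root) / "config"

    # Deployment on Windows: %APPDATA%/<app-name>/config.
    if platform.startswith("win"):
        base = environ.get("APPDATA")
        if not base:
            # Assumed fallback for the rare case where APPDATA is unset.
            base = Path.home() / "AppData" / "Roaming"
        return Path(base) / app_name / "config"

    # Deployment on Linux and other POSIX systems: /etc/<app-name>.
    return Path("/etc") / app_name
```

Taking platform and environ as parameters keeps the function testable on any machine. For example, config_dir("quux", platform="linux", environ={}) yields /etc/quux.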

Documentation would typically go into PROJECT_ROOT/doc or PROJECT_ROOT/src/<egg_name>/doc (this depends on whether you regard some of the eggs to be separate large projects). Some additional configuration will be in files like PROJECT_ROOT/buildout.cfg and PROJECT_ROOT/setup.cfg.
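
For the "Resources" case above, reading a data file that ships inside a package might look like the sketch below. It uses importlib.resources.files, which needs Python 3.9 or later. The package name demo_egg and the file defaults.cfg are made up for the demonstration, and the throwaway package is built in a temporary directory only so the example is self-contained.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def read_resource_text(package, name):
    """Read a text file shipped inside an importable package (Python 3.9+)."""
    resource = importlib.resources.files(package).joinpath(name)
    return resource.read_text(encoding="utf-8")


def demo():
    """Build a throwaway package with one data file and read it back."""
    with tempfile.TemporaryDirectory() as scratch:
        package_dir = Path(scratch) / "demo_egg"  # hypothetical package name
        package_dir.mkdir()
        (package_dir / "__init__.py").write_text("", encoding="utf-8")
        (package_dir / "defaults.cfg").write_text(
            "[ui]\ntheme = dark\n", encoding="utf-8"
        )

        # Make the scratch directory importable for the duration of the demo.
        sys.path.insert(0, scratch)
        importlib.invalidate_caches()
        try:
            return read_resource_text("demo_egg", "defaults.cfg")
        finally:
            sys.path.remove(scratch)
            sys.modules.pop("demo_egg", None)
```

In a real project only read_resource_text would be needed, called with the name of your own installed package.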

8
  • Thanks for a great answer! You clarified many things for me! I just have one question: Can eggs be nested?
    – matanc1
    Aug 19, 2014 at 16:02
  • No, you can't "nest" eggs in the sense of storing .egg files within other .egg files and hoping this would be of much use [unless you're up to something really weird]. What you can do, though, is create "virtual" eggs - empty packages that do not provide any useful code, but list other packages in their dependency lists. This way, when a user attempts to install such a package he will recursively install many dependent eggs.
    – KT.
    Sep 11, 2014 at 17:08
  • @KT can you elaborate a bit about how you handle generated data? In particular, how do you (in code) distinguish between development and deployment? I imagine you have some base_data_location variable, but how do you set it appropriately?
    – cmyr
    Apr 3, 2016 at 21:20
  • 1
    I presume you are speaking about "runtime data" - something people would often put under /var/packagename or ~/.packagename/var, or whatnot. Most of the time those choices are sufficient as a default that your users don't care to change. If you want to allow this behaviour to be tuned, options are rather abundant and I do not think there is a single fits-all best practice. Typical choices: a) ~/.packagename/configfile, b) export MY_PACKAGE_CONFIG=/path/to/configfile c) command-line options or function parameters d) combination of those.
    – KT.
    Apr 3, 2016 at 21:44
  • Note that it is quite usual to have a singleton Config class somewhere, which handles your favourite config loading logic for you and perhaps even lets the user modify the settings at runtime. In general, though, I think this is an issue worth a separate question (which might have been asked before somewhere here).
    – KT.
    Apr 3, 2016 at 21:44
20

In my experience, it's just a matter of iteration. Put your data and code wherever you think they go. Chances are, you'll be wrong anyway. But once you get a better idea of exactly how things are going to shape up, you're in a much better position to make these kinds of guesses.

As far as extension sources, we have a Code directory under trunk that contains a directory for python and a directory for various other languages. Personally, I'm more inclined to try putting any extension code into its own repository next time around.

With that said, I go back to my initial point: don't make too big a deal out of it. Put it somewhere that seems to work for you. If you find something that doesn't work, it can (and should) be changed.

1
  • Yep. I try to be "Pythonic" about it: explicit is better than implicit. Directory hierarchies are read/inspected more than they are written. Etc.
    – eric
    Apr 30, 2015 at 0:19
15

Non-python data is best bundled inside your Python modules using the package_data support in setuptools. One thing I strongly recommend is using namespace packages to create shared namespaces which multiple projects can use -- much like the Java convention of putting packages in com.yourcompany.yourproject (and being able to have a shared com.yourcompany.utils namespace).

Re branching and merging, if you use a good enough source control system it will handle merges even through renames; Bazaar is particularly good at this.

Contrary to some other answers here, I'm +1 on having a src directory top-level (with doc and test directories alongside). Specific conventions for documentation directory trees will vary depending on what you're using; Sphinx, for instance, has its own conventions which its quickstart tool supports.

Please, please leverage setuptools and pkg_resources; this makes it much easier for other projects to rely on specific versions of your code (and for multiple versions to be simultaneously installed with different non-code files, if you're using package_data).
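
As a rough illustration of package_data, the keyword arguments passed to setuptools might look like this. The project name quux, the version, and the file patterns are hypothetical. The console_scripts entry point is included only to show where a startup script would be declared alongside the data files.

```python
# Sketch of the keyword arguments a setup.py might pass to setuptools.setup().
# Project, package, and file names are hypothetical.
SETUP_KWARGS = {
    "name": "quux",
    "version": "0.1.0",
    "packages": ["quux"],
    # Ship non-Python files that live inside the quux package directory.
    "package_data": {"quux": ["data/*.json", "templates/*.txt"]},
    # Generate a `quux` command that calls quux.main:main after installation.
    "entry_points": {"console_scripts": ["quux = quux.main:main"]},
}

# In an actual setup.py this would be followed by:
#
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```

The patterns in package_data are relative to the package directory, so data/*.json refers to files under quux/data/.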
