🚀 Installing dcbench

This section describes how to install the dcbench Python package.

pip install dcbench

Optional

Some parts of dcbench rely on optional dependencies. If you know which ones you need, install them by naming the extra in brackets, for example pip install "dcbench[dev]" (the quotes keep shells such as zsh from interpreting the brackets). See setup.py for the full list of optional dependencies.
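If dcbench is already installed, the extras it declares can also be read from the installed package metadata rather than from setup.py. The sketch below uses only the standard library; declared_extras is a hypothetical helper name, not part of dcbench, and it returns an empty list when the package is not installed.

```python
from importlib import metadata


def declared_extras(dist_name):
    """Return the extras a distribution declares, or [] if it is not installed."""
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return []
    # "Provides-Extra" is the core-metadata field that lists extras.
    return sorted(meta.get_all("Provides-Extra") or [])


print(declared_extras("dcbench"))
```

Any name printed here can be used inside the brackets of the pip command.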

Installing from branch

To install from a specific branch, use the command below, replacing main with the name of any branch in the dcbench repository.

pip install "dcbench @ git+https://github.com/data-centric-ai/dcbench@main"

Installing from clone

You can install from a clone of the dcbench repo with:

git clone https://github.com/data-centric-ai/dcbench.git
cd dcbench
pip install -e .

⚙️ Configuring dcbench

Several aspects of dcbench's behavior are configurable. For example, you can change the directory into which dcbench downloads artifacts (~/.dcbench by default).

You can see the current state of the dcbench configuration with:

In [1]: import dcbench

In [2]: dcbench.config
Out[2]: DCBenchConfig(local_dir='/home/docs/.dcbench', public_bucket_name='dcbench', hidden_bucket_name='dcbench-hidden', celeba_dir='/home/docs/.dcbench/datasets/celeba', imagenet_dir='/home/docs/.dcbench/datasets/imagenet')

Configuring with YAML

To change the configuration, create a YAML file that sets the fields you want to override, such as local_dir or public_bucket_name.
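One way to produce such a file is sketched below. It assumes the YAML keys mirror the field names shown in the DCBenchConfig output above, which this page does not state explicitly. write_config is a hypothetical helper, the file location is arbitrary, and the values are the placeholders used in the programmatic example in this section. The helper only handles simple string values without quotes or backslashes.

```python
import tempfile
from pathlib import Path


def write_config(path, settings):
    """Write flat key/value settings as simple `key: "value"` YAML lines."""
    lines = [f'{key}: "{value}"' for key, value in settings.items()]
    Path(path).write_text("\n".join(lines) + "\n")
    return Path(path)


# Assumed keys: the field names shown by dcbench.config.
config_file = write_config(
    Path(tempfile.gettempdir()) / "dcbench-config.yaml",
    {"local_dir": "/path/to/storage", "public_bucket_name": "dcbench-test"},
)

# The file now contains:
#   local_dir: "/path/to/storage"
#   public_bucket_name: "dcbench-test"
print(config_file.read_text())
```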

Then set the environment variable DCBENCH_CONFIG to point to the file:

export DCBENCH_CONFIG="/path/to/dcbench-config.yaml"

If you're using conda, you can permanently set this variable for your environment:

conda env config vars set DCBENCH_CONFIG="/path/to/dcbench-config.yaml"
conda activate env_name  # need to reactivate the environment

Configuring programmatically

You can also update the config programmatically, though unlike the YAML method above, these changes will not persist beyond the lifetime of your program.

import dcbench
dcbench.config.local_dir = "/path/to/storage"
dcbench.config.public_bucket_name = "dcbench-test"
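Because these assignments stay in effect for the rest of the program, it can be convenient to scope an override to a single block of code. The following is a minimal sketch, assuming only that config fields are plain assignable attributes, as the assignments above suggest. temporary_config is a hypothetical helper, not part of dcbench, and a stand-in object is used so the example runs without dcbench installed.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_config(config, **overrides):
    """Set attributes on `config`, then restore the previous values on exit."""
    previous = {name: getattr(config, name) for name in overrides}
    try:
        for name, value in overrides.items():
            setattr(config, name, value)
        yield config
    finally:
        for name, value in previous.items():
            setattr(config, name, value)


# Stand-in for dcbench.config; in real use, pass dcbench.config instead.
demo_config = SimpleNamespace(
    local_dir="/home/user/.dcbench", public_bucket_name="dcbench"
)

with temporary_config(demo_config, public_bucket_name="dcbench-test"):
    print(demo_config.public_bucket_name)  # dcbench-test
print(demo_config.public_bucket_name)  # dcbench
```

The previous values are restored in a finally clause, so the override is undone even if the code inside the block raises.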