dcbench.tasks.slice_discovery package

Submodules

dcbench.tasks.slice_discovery.baselines module

confusion_sdm(problem)

A simple slice discovery method (SDM) that returns one slice for each cell of the confusion matrix. For example, for a binary prediction task, it returns four slices corresponding to true positives, false positives, true negatives, and false negatives.

Parameters

problem (SliceDiscoveryProblem) – The slice discovery problem.

Returns

The predicted slices.

Return type

SliceDiscoverySolution
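The idea of one slice per confusion-matrix cell can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not the dcbench implementation: the function name `confusion_slices`, its arguments, and the binary membership-matrix output are all hypothetical.

```python
import numpy as np


def confusion_slices(targets, preds, n_classes):
    """Return a binary membership matrix of shape (n_examples, n_classes ** 2).

    Column ``t * n_classes + p`` marks the examples whose true label is ``t``
    and whose predicted label is ``p``, i.e. one column per confusion cell.
    (Hypothetical helper, not part of dcbench.)
    """
    targets = np.asarray(targets)
    preds = np.asarray(preds)
    cell = targets * n_classes + preds  # index of the confusion-matrix cell
    slices = np.zeros((len(targets), n_classes ** 2), dtype=int)
    slices[np.arange(len(targets)), cell] = 1
    return slices


targets = np.array([1, 0, 1, 0, 1])
preds = np.array([1, 1, 0, 0, 1])
slices = confusion_slices(targets, preds, n_classes=2)

# Column order for the binary case: TN, FP, FN, TP
print(slices.sum(axis=0))  # [1 1 1 2]
```

Each example belongs to exactly one of the four slices, so every row of the matrix sums to 1.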

domino_sdm(problem)
Parameters

problem (dcbench.tasks.slice_discovery.problem.SliceDiscoveryProblem) –

Return type

dcbench.tasks.slice_discovery.problem.SliceDiscoverySolution

dcbench.tasks.slice_discovery.metrics module

precision_at_k(slice, pred_slice, k=25)
Parameters
  • slice (numpy.ndarray) –

  • pred_slice (numpy.ndarray) –

  • k (int) –

recall_at_k(slice, pred_slice, k=25)
Parameters
  • slice (numpy.ndarray) –

  • pred_slice (numpy.ndarray) –

  • k (int) –
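The two functions above are undocumented, so the following is only a sketch of the conventional precision-at-k and recall-at-k definitions, which may differ from what dcbench computes. It assumes `slice` is a binary membership array and `pred_slice` is an array of per-example scores. The `_sketch` suffixes, the `slice_` argument name (chosen to avoid shadowing the built-in `slice`), and the helper `top_k_indices` are hypothetical.

```python
import numpy as np


def top_k_indices(scores, k):
    """Indices of the k highest scores; ties go to the lower index."""
    k = min(k, len(scores))
    return np.argsort(-scores, kind="stable")[:k]


def precision_at_k_sketch(slice_, pred_slice, k=25):
    """Fraction of the top-k scored examples that belong to the slice."""
    slice_ = np.asarray(slice_)
    top = top_k_indices(np.asarray(pred_slice, dtype=float), k)
    return float(slice_[top].mean())


def recall_at_k_sketch(slice_, pred_slice, k=25):
    """Fraction of the slice's members that appear among the top-k examples."""
    slice_ = np.asarray(slice_)
    top = top_k_indices(np.asarray(pred_slice, dtype=float), k)
    n_members = slice_.sum()
    return float(slice_[top].sum() / n_members) if n_members > 0 else 0.0


true_slice = np.array([1, 0, 1, 1, 0, 0])
scores = np.array([0.9, 0.8, 0.7, 0.1, 0.6, 0.2])

# The top 4 examples are indices 0, 1, 2, 4; two of them are in the slice.
precision = precision_at_k_sketch(true_slice, scores, k=4)  # 2 / 4
recall = recall_at_k_sketch(true_slice, scores, k=4)        # 2 / 3
print(precision, recall)
```

The sketch assumes `k` is a positive integer and caps it at the number of examples.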

compute_metrics(slices, pred_slices)

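Compute evaluation metrics that compare ground-truth slices with predicted slices.

Parameters
  • slices (np.ndarray) – The ground-truth slice labels.

  • pred_slices (np.ndarray) – The predicted slice labels.

Returns

The computed metrics.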

Return type

dict

dcbench.tasks.slice_discovery.problem module

class SliceDiscoverySolution(artifacts, attributes=None, container_id=None)

Bases: dcbench.common.solution.Solution

Parameters
  • artifacts (Mapping[str, Artifact]) –

  • attributes (Mapping[str, Attribute]) –

  • container_id (str) –

artifact_specs: Mapping[str, dcbench.common.artifact_container.ArtifactSpec]
  • pred_slices (dcbench.common.artifact.DataPanelArtifact, optional=False) – A DataPanel of predicted slice labels with columns `id` and `pred_slices`.

attribute_specs: Mapping[str, AttributeSpec]
  • problem_id (str, optional=False) – A unique identifier for this problem.

task_id: str = 'slice_discovery'
class SliceDiscoveryProblem(artifacts, attributes=None, container_id=None)

Bases: dcbench.common.problem.Problem

Parameters
  • artifacts (Mapping[str, Artifact]) –

  • attributes (Mapping[str, Attribute]) –

  • container_id (str) –

artifact_specs: Mapping[str, dcbench.common.artifact_container.ArtifactSpec]

All artifact types are defined in dcbench.common.artifact, and every artifact is required (optional=False).

  • activations (DataPanelArtifact) – A DataPanel of the model's activations with columns `id` and `act`.

  • base_dataset (VisionDatasetArtifact) – A DataPanel representing the base dataset with columns `id` and `image`.

  • clip (DataPanelArtifact) – A DataPanel of the image embeddings from OpenAI's CLIP model.

  • model (ModelArtifact) – A trained PyTorch model to audit.

  • test_predictions (DataPanelArtifact) – A DataPanel of the model's predictions with columns `id`, `target`, and `probs`.

  • test_slices (DataPanelArtifact) – A DataPanel of the ground truth slice labels with columns `id` and `slices`.

  • val_predictions (DataPanelArtifact) – A DataPanel of the model's predictions with columns `id`, `target`, and `probs`.

attribute_specs: Mapping[str, AttributeSpec]

Every attribute is required (optional=False).

  • alpha (float) – The alpha parameter for the AUC metric.

  • dataset (str) – The name of the dataset being audited.

  • n_pred_slices (int) – The number of slice predictions that each slice discovery method can return.

  • slice_category (str) – The type of slice.

  • slice_names (list) – The names of the slices in the dataset.

  • target_name (str) – The name of the target column in the dataset.

task_id: str = 'slice_discovery'
solve(pred_slices_dp)
Parameters

pred_slices_dp (meerkat.datapanel.DataPanel) –

Return type

dcbench.tasks.slice_discovery.problem.SliceDiscoverySolution

evaluate(solution)
Parameters

solution (dcbench.tasks.slice_discovery.problem.SliceDiscoverySolution) –

Return type

dict
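The signatures above suggest a simple workflow: a slice discovery method maps a problem to a solution, and `evaluate` scores that solution as a dict. The sketch below illustrates only that call pattern. `run_sdm`, `FakeProblem`, and `fake_sdm` are hypothetical stand-ins so the example runs without dcbench installed, and they do not reproduce the behavior of the real classes.

```python
from typing import Any, Callable


def run_sdm(problem: Any, sdm: Callable[[Any], Any]) -> dict:
    """Apply a slice discovery method to a problem and score the result.

    With dcbench, `sdm` could be confusion_sdm or domino_sdm, and `problem`
    a SliceDiscoveryProblem. (Hypothetical helper, not part of dcbench.)
    """
    solution = sdm(problem)
    return problem.evaluate(solution)


class FakeProblem:
    """Hypothetical stand-in exposing the same two method names."""

    def solve(self, pred_slices_dp):
        # The real method wraps a DataPanel in a SliceDiscoverySolution.
        return {"pred_slices": pred_slices_dp}

    def evaluate(self, solution):
        # The real method returns a dict of metrics; here we only count rows.
        return {"n_rows": len(solution["pred_slices"])}


def fake_sdm(problem):
    # A real method would derive the predicted slices from the problem's artifacts.
    return problem.solve([[1, 0], [0, 1]])


metrics = run_sdm(FakeProblem(), fake_sdm)
print(metrics)  # {'n_rows': 2}
```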

name: str
summary: str
solution_class: type

Module contents

The package namespace also exposes the following names, each documented in full under its submodule above:

  • confusion_sdm and domino_sdm (dcbench.tasks.slice_discovery.baselines)

  • SliceDiscoveryProblem and SliceDiscoverySolution (dcbench.tasks.slice_discovery.problem)