dcbench.tasks.slice_discovery package

Submodules

dcbench.tasks.slice_discovery.baselines module

confusion_sdm(problem)

A simple slice discovery method (SDM) that returns one slice for each cell of the confusion matrix. For example, on a binary prediction task it returns four slices: true positives, false positives, true negatives, and false negatives.

Parameters: problem (SliceDiscoveryProblem) – The slice discovery problem.
Returns: The predicted slices.
Return type: SliceDiscoverySolution
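The idea of one slice per confusion-matrix cell can be sketched with plain NumPy. This is a minimal illustration for the binary case, not the library's implementation: the function name `confusion_slices`, the 0.5 `threshold`, and the column order are assumptions made for the example.

```python
import numpy as np


def confusion_slices(targets, probs, threshold=0.5):
    """Return an (n_examples, 4) 0/1 matrix, one column per confusion-matrix cell.

    Column order (an assumption of this sketch): true positives,
    false positives, true negatives, false negatives.
    """
    targets = np.asarray(targets).astype(bool)
    # Hypothetical decision rule: predict positive when prob >= threshold.
    preds = np.asarray(probs, dtype=float) >= threshold
    cells = [
        targets & preds,    # true positives
        ~targets & preds,   # false positives
        ~targets & ~preds,  # true negatives
        targets & ~preds,   # false negatives
    ]
    return np.stack(cells, axis=1).astype(int)


targets = np.array([1, 0, 0, 1])
probs = np.array([0.9, 0.8, 0.1, 0.3])
slices = confusion_slices(targets, probs)
print(slices)  # every example falls into exactly one of the four slices
```

Because the four cells partition the examples, each row of the result contains exactly one 1.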

domino_sdm(problem)

Parameters: problem (dcbench.tasks.slice_discovery.problem.SliceDiscoveryProblem) –
Return type: dcbench.tasks.slice_discovery.problem.SliceDiscoverySolution

dcbench.tasks.slice_discovery.metrics module

precision_at_k(slice, pred_slice, k=25)

Parameters

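slice (numpy.ndarray) – Ground-truth membership labels for the slice.
pred_slice (numpy.ndarray) – Predicted membership for the slice, used to rank examples.
k (int) – Number of top-ranked examples to consider. Defaults to 25.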
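Precision at k is commonly defined as the fraction of the k highest-scoring examples that truly belong to the slice. The sketch below follows that textbook definition; it is not taken from the library source, and the names `precision_at_k_sketch`, `slice_labels`, and `pred_scores` are hypothetical.

```python
import numpy as np


def precision_at_k_sketch(slice_labels, pred_scores, k=25):
    """Fraction of the k highest-scoring examples that are true slice members."""
    labels = np.asarray(slice_labels).astype(bool)
    scores = np.asarray(pred_scores, dtype=float)
    k = min(k, len(labels))  # cannot look at more examples than exist
    if k == 0:
        return 0.0
    # Indices of the k largest scores; a stable sort keeps ties in input order.
    top_k = np.argsort(-scores, kind="stable")[:k]
    return float(labels[top_k].mean())


labels = np.array([1, 0, 1, 0, 1])
scores = np.array([0.9, 0.8, 0.7, 0.2, 0.1])
p_at_3 = precision_at_k_sketch(labels, scores, k=3)
print(p_at_3)  # two of the three top-ranked examples are slice members
```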

recall_at_k(slice, pred_slice, k=25)

Parameters

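slice (numpy.ndarray) – Ground-truth membership labels for the slice.
pred_slice (numpy.ndarray) – Predicted membership for the slice, used to rank examples.
k (int) – Number of top-ranked examples to consider. Defaults to 25.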
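Recall at k is commonly defined as the fraction of all true slice members that appear among the k highest-scoring examples. The following sketch implements that standard definition under stated assumptions; it is not the library's code, and `recall_at_k_sketch`, `slice_labels`, and `pred_scores` are hypothetical names.

```python
import numpy as np


def recall_at_k_sketch(slice_labels, pred_scores, k=25):
    """Fraction of true slice members found among the k highest-scoring examples."""
    labels = np.asarray(slice_labels).astype(bool)
    scores = np.asarray(pred_scores, dtype=float)
    n_members = int(labels.sum())
    if n_members == 0:
        return 0.0  # convention chosen for this sketch: empty slice gives 0.0
    # Indices of the k largest scores; a stable sort keeps ties in input order.
    top_k = np.argsort(-scores, kind="stable")[:k]
    return float(labels[top_k].sum() / n_members)


labels = np.array([1, 0, 1, 0, 1, 1])
scores = np.array([0.9, 0.8, 0.7, 0.2, 0.1, 0.05])
r_at_3 = recall_at_k_sketch(labels, scores, k=3)
print(r_at_3)  # two of the four slice members are in the top three
```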

compute_metrics(slices, pred_slices)

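Computes evaluation metrics that compare predicted slices against ground-truth slices.

Parameters

slices (np.ndarray) – The ground-truth slices.
pred_slices (np.ndarray) – The predicted slices.

Returns

The computed metrics.

Return type

dict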

dcbench.tasks.slice_discovery.problem module

class SliceDiscoverySolution(artifacts, attributes=None, container_id=None)

Bases: dcbench.common.solution.Solution

Parameters

artifacts (Mapping[str, Artifact]) –
attributes (Mapping[str, Attribute]) –
container_id (str) –

artifact_specs: Mapping[str, dcbench.common.artifact_container.ArtifactSpec] = {'pred_slices': ArtifactSpec(description='A DataPanel of predicted slice labels with columns `id` and `pred_slices`.', artifact_type=<class 'dcbench.common.artifact.DataPanelArtifact'>, optional=False)}

attribute_specs: Mapping[str, AttributeSpec] = {'problem_id': AttributeSpec(description='A unique identifier for this problem.', attribute_type=<class 'str'>, optional=False)}

task_id: str = 'slice_discovery'
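The `pred_slices` artifact is described as a table with columns `id` and `pred_slices`. A sketch of assembling and checking those two columns before wrapping them in a DataPanel might look like the following. The helper name `build_pred_slices_columns` and the assumed shape `(n_examples, n_pred_slices)` for the `pred_slices` column are assumptions of this example, not documented behavior.

```python
import numpy as np


def build_pred_slices_columns(ids, pred_slices, n_pred_slices):
    """Validate and package the two columns named in the artifact spec.

    Assumes (not confirmed by the documentation) that `pred_slices` holds one
    row per example and one column per predicted slice.
    """
    ids = list(ids)
    pred_slices = np.asarray(pred_slices, dtype=float)
    if len(set(ids)) != len(ids):
        raise ValueError("ids must be unique")
    if pred_slices.shape != (len(ids), n_pred_slices):
        raise ValueError(
            f"expected shape {(len(ids), n_pred_slices)}, got {pred_slices.shape}"
        )
    # A plain dict of columns; converting it to a DataPanel is left to the caller.
    return {"id": ids, "pred_slices": pred_slices}


columns = build_pred_slices_columns(["a", "b", "c"], np.zeros((3, 2)), n_pred_slices=2)
print(sorted(columns))
```

Validating the shape up front gives a clearer error than a failure later during evaluation.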

class SliceDiscoveryProblem(artifacts, attributes=None, container_id=None)

Bases: dcbench.common.problem.Problem

Parameters

artifacts (Mapping[str, Artifact]) –
attributes (Mapping[str, Attribute]) –
container_id (str) –

artifact_specs: Mapping[str, dcbench.common.artifact_container.ArtifactSpec] = {'activations': ArtifactSpec(description="A DataPanel of the model's activations with columns `id`,`act`", artifact_type=<class 'dcbench.common.artifact.DataPanelArtifact'>, optional=False), 'base_dataset': ArtifactSpec(description='A DataPanel representing the base dataset with columns `id` and `image`.', artifact_type=<class 'dcbench.common.artifact.VisionDatasetArtifact'>, optional=False), 'clip': ArtifactSpec(description="A DataPanel of the image embeddings from OpenAI's CLIP model", artifact_type=<class 'dcbench.common.artifact.DataPanelArtifact'>, optional=False), 'model': ArtifactSpec(description='A trained PyTorch model to audit.', artifact_type=<class 'dcbench.common.artifact.ModelArtifact'>, optional=False), 'test_predictions': ArtifactSpec(description="A DataPanel of the model's predictions with columns `id`,`target`, and `probs.`", artifact_type=<class 'dcbench.common.artifact.DataPanelArtifact'>, optional=False), 'test_slices': ArtifactSpec(description='A DataPanel of the ground truth slice labels with columns `id`, `slices`.', artifact_type=<class 'dcbench.common.artifact.DataPanelArtifact'>, optional=False), 'val_predictions': ArtifactSpec(description="A DataPanel of the model's predictions with columns `id`,`target`, and `probs.`", artifact_type=<class 'dcbench.common.artifact.DataPanelArtifact'>, optional=False)}

attribute_specs: Mapping[str, AttributeSpec] = {'alpha': AttributeSpec(description='The alpha parameter for the AUC metric.', attribute_type=<class 'float'>, optional=False), 'dataset': AttributeSpec(description='The name of the dataset being audited.', attribute_type=<class 'str'>, optional=False), 'n_pred_slices': AttributeSpec(description='The number of slice predictions that each slice discovery method can return.', attribute_type=<class 'int'>, optional=False), 'slice_category': AttributeSpec(description='The type of slice .', attribute_type=<class 'str'>, optional=False), 'slice_names': AttributeSpec(description='The names of the slices in the dataset.', attribute_type=<class 'list'>, optional=False), 'target_name': AttributeSpec(description='The name of the target column in the dataset.', attribute_type=<class 'str'>, optional=False)}

task_id: str = 'slice_discovery'

solve(pred_slices_dp)

Parameters: pred_slices_dp (meerkat.datapanel.DataPanel) –
Return type: dcbench.tasks.slice_discovery.problem.SliceDiscoverySolution

evaluate(solution)

Parameters: solution (dcbench.tasks.slice_discovery.problem.SliceDiscoverySolution) –
Return type: dict
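Taken together, the signatures on this page suggest a two-step workflow: a slice discovery method maps a problem to a solution, and the problem then scores that solution. The helper below is a minimal sketch of that flow. `run_and_score` is a hypothetical name, and the helper relies only on the documented call shapes `sdm(problem)` and `problem.evaluate(solution)`.

```python
from typing import Any, Callable, Dict


def run_and_score(problem: Any, sdm: Callable[[Any], Any]) -> Dict[str, Any]:
    """Apply a slice discovery method to a problem and evaluate the result.

    `sdm` is any callable with the documented shape, for example
    confusion_sdm or domino_sdm; `problem` is expected to expose
    evaluate(solution) returning a dict.
    """
    solution = sdm(problem)
    return problem.evaluate(solution)


# Hypothetical usage with the library installed and a problem already loaded:
#   from dcbench.tasks.slice_discovery import confusion_sdm
#   metrics = run_and_score(problem, confusion_sdm)
```

Keeping the method as a parameter makes it easy to compare several slice discovery methods on the same problem.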

name: str

summary: str

solution_class: type

Module contents

The package namespace re-exports confusion_sdm, domino_sdm, SliceDiscoveryProblem, and SliceDiscoverySolution. Each is documented in full under its submodule above.