dcbench.tasks.budgetclean package

Submodules

dcbench.tasks.budgetclean.common module

class Preprocessor(num_strategy='mean')[source]

Bases: object

docstring for Preprocessor.

fit(X_train, y_train, X_full=None)[source]
transform(X=None, y=None)[source]
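
The docstring above does not say what `fit` and `transform` compute. As a rough illustration of the usual fit-then-transform pattern, here is a minimal sketch assuming that `num_strategy='mean'` refers to filling missing numeric values with per-column training means. `MeanImputer` is a hypothetical stand-in written for this example, not dcbench's `Preprocessor`. It leaves out the `X_full` and `y` arguments, whose roles are not documented here.

```python
from statistics import mean
from typing import List, Optional, Sequence

Row = List[Optional[float]]


class MeanImputer:
    """Hypothetical stand-in for illustration only (not dcbench's Preprocessor)."""

    def __init__(self) -> None:
        self.column_means: List[float] = []

    def fit(self, X_train: Sequence[Row], y_train: Sequence[int]) -> "MeanImputer":
        # Learn one mean per column from the observed (non-missing) training values.
        # y_train is accepted to mirror the documented signature but is not used here.
        columns = list(zip(*X_train))
        self.column_means = [mean(v for v in col if v is not None) for col in columns]
        return self

    def transform(self, X: Sequence[Row]) -> List[List[float]]:
        # Replace each missing value with the mean learned during fit.
        return [
            [self.column_means[j] if v is None else v for j, v in enumerate(row)]
            for row in X
        ]


# Toy data: None marks a missing cell.
X_train = [[1.0, None], [3.0, 4.0], [None, 8.0]]
y_train = [0, 1, 0]
X_val = [[None, None], [5.0, 2.0]]

imputer = MeanImputer().fit(X_train, y_train)
X_train_filled = imputer.transform(X_train)
X_val_filled = imputer.transform(X_val)
```

The point of the pattern is that statistics are learned once from the training split and then reused unchanged on validation or test data, so no information leaks from those splits into the preprocessing step.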

dcbench.tasks.budgetclean.problem module

class BudgetcleanProblem(id, artifacts, attributes=None)[source]

Bases: dcbench.common.problem.Problem

Parameters
  • id (str) –

  • artifacts (Mapping[str, Artifact]) –

  • attributes (Mapping[str, BASIC_TYPE]) –

artifact_specs: Mapping[str, dcbench.common.artifact.ArtifactSpec] = {
    'X_test': ArtifactSpec(description=('Features of the test dataset used to produce the final evaluation score of the model.',), artifact_type=<class 'dcbench.common.artifact.CSVArtifact'>),
    'X_train_clean': ArtifactSpec(description='Features of the clean training dataset where each dirty value from the dirty dataset is replaced with the correct clean candidate.', artifact_type=<class 'dcbench.common.artifact.CSVArtifact'>),
    'X_train_dirty': ArtifactSpec(description=('Features of the dirty training dataset which we need to clean. Each dirty cell contains an embedded list of clean candidate values.',), artifact_type=<class 'dcbench.common.artifact.CSVArtifact'>),
    'X_val': ArtifactSpec(description='Features of the validation dataset which can be used to guide the cleaning optimization process.', artifact_type=<class 'dcbench.common.artifact.CSVArtifact'>),
    'y_test': ArtifactSpec(description='Labels of the test dataset.', artifact_type=<class 'dcbench.common.artifact.CSVArtifact'>),
    'y_train': ArtifactSpec(description='Labels of the training dataset.', artifact_type=<class 'dcbench.common.artifact.CSVArtifact'>),
    'y_val': ArtifactSpec(description='Labels of the validation dataset.', artifact_type=<class 'dcbench.common.artifact.CSVArtifact'>),
}
evaluate(solution)[source]
Parameters

solution (dcbench.tasks.budgetclean.problem.BudgetcleanSolution) –

Return type

dcbench.common.result.Result

classmethod from_id(scenario_id)[source]
Parameters

scenario_id (str) –

classmethod list()[source]
name: str
solution_class: type
solve(idx_selected, **kwargs)[source]
Parameters
  • idx_selected (Any) –

  • kwargs (Any) –

Return type

dcbench.common.solution.Solution

summary: str
task_id: str = 'budgetclean'

class BudgetcleanSolution(id, artifacts, attributes=None)[source]

Bases: dcbench.common.solution.Solution

Parameters
  • id (str) –

  • artifacts (Mapping[str, Artifact]) –

  • attributes (Mapping[str, BASIC_TYPE]) –

artifact_specs: Mapping[str, dcbench.common.artifact.ArtifactSpec] = {'idx_selected': ArtifactSpec(description='', artifact_type=<class 'dcbench.common.artifact.CSVArtifact'>)}

Module contents