The Automated Data Preprocessing Toolkit automates common preprocessing steps for machine learning: handling missing values, encoding categorical features, and normalizing data. It provides an interface for uploading datasets and aims to improve data quality and, in turn, model performance.

10 Open Issues Need Help (last updated: Sep 8, 2025)

enhancement good first issue SSoC25 Beginner

Python

AI Summary: Write unit tests using pytest for the functions within the `data_optimizer.py` file of the AutoEDA project. The tests should cover both normal and edge cases (incorrect or missing input). The tests should be saved in a new file named `test_data_optimizer.py` within the `unit-tests` directory.

Complexity: 3/5
help wanted SSOC S4 Beginner

AI Summary: Write unit tests using pytest for the feat_scaling.py module, ensuring all scaling functions produce correct outputs and handle incorrect inputs appropriately. Tests should be saved in unit_tests/test_feat_scaling.py and cover both valid and invalid inputs for each function.

Complexity: 3/5
help wanted SSOC S4 Intermediate
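
For the `feat_scaling.py` testing task above, the sketch below shows the general shape such tests could take. The function `min_max_scale` is a hypothetical stand-in defined inline so the example runs on its own; the real module's function names, signatures, and error behavior are not given in the issue summary and may differ.

```python
from typing import List, Sequence


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for one scaling function in feat_scaling.py."""
    if len(values) == 0:
        raise ValueError("cannot scale an empty sequence")
    lo, hi = min(values), max(values)
    if lo == hi:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_valid_input_is_scaled_to_unit_interval():
    assert min_max_scale([10.0, 15.0, 20.0]) == [0.0, 0.5, 1.0]


def test_constant_column_does_not_divide_by_zero():
    assert min_max_scale([3.0, 3.0]) == [0.0, 0.0]


def test_empty_input_raises_value_error():
    # Written without importing pytest so the sketch has no dependencies;
    # inside a pytest suite, `with pytest.raises(ValueError):` is the usual form.
    try:
        min_max_scale([])
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")
```

pytest collects any function whose name starts with `test_`, so in the real test file the inline stand-in would be replaced by an import from the module under test.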

AI Summary: Write unit tests using pytest for the `encoding_categorical.py` file within a Python project, ensuring all encoding functions are correctly tested and handle invalid inputs appropriately. The tests should be saved as `unit_tests/test_encoding_categorical.py` and cover all functions in the target file.

Complexity: 3/5
help wanted SSOC S4 Beginner
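
For the `encoding_categorical.py` testing task above, one possible approach is a table of input/expected pairs plus a round-trip check. `label_encode` below is a hypothetical stand-in written for this illustration; the actual encoders in the project, and how they treat missing categories, are assumptions here.

```python
from typing import Dict, List, Sequence, Tuple


def label_encode(values: Sequence[str]) -> Tuple[List[int], Dict[str, int]]:
    """Hypothetical stand-in: map each category to an integer code."""
    if any(v is None for v in values):
        raise ValueError("missing category; impute before encoding")
    # Sorting makes the mapping deterministic across runs.
    mapping = {cat: code for code, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


# (input, expected codes) pairs covering a normal case and two edge cases.
VALID_CASES = [
    (["red", "blue", "red"], [1, 0, 1]),
    (["a"], [0]),
    ([], []),
]


def test_valid_inputs_produce_expected_codes():
    for values, expected in VALID_CASES:
        codes, _ = label_encode(values)
        assert codes == expected


def test_mapping_can_be_inverted():
    values = ["cat", "dog", "cat", "bird"]
    codes, mapping = label_encode(values)
    inverse = {code: cat for cat, code in mapping.items()}
    assert [inverse[c] for c in codes] == values


def test_missing_category_is_rejected():
    try:
        label_encode(["red", None])
    except ValueError:
        return
    raise AssertionError("expected ValueError for a missing category")
```

In a pytest suite the case table would typically be fed through `pytest.mark.parametrize` so each pair is reported as a separate test; the plain loop is used here only to keep the sketch dependency-free.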

AI Summary: Debug and fix a synchronization issue in a React frontend application where the dark mode toggle icon displays the incorrect state after a page reload. The icon needs to accurately reflect the current theme (light or dark) immediately upon page load, and a single click should correctly toggle the theme and update the icon.

Complexity: 3/5
bug enhancement good first issue SSOC S4

AI Summary: Add a project logo to the top of the README.md file to improve the visual appeal and professionalism of the project's GitHub page.

Complexity: 1/5
documentation good first issue SSOC S4 Beginner

AI Summary: Improve the visual design of the existing scroll-to-top button in the AutoEDA frontend application. The task involves updating the button's appearance to align with modern UI/UX best practices, using the provided before-and-after images as a guide. This likely involves modifying CSS styles within the React component responsible for rendering the button.

Complexity: 2/5
enhancement good first issue SSOC S4 Beginner

AI Summary: The task involves converting Jupyter Notebook code for handling missing values in CSV datasets into a modular Python module. This module should read a CSV, apply various missing value imputation techniques (dropping, replacing with fixed values, mean, median, mode, forward/backward fill), evaluate their effectiveness based on defined metrics, select the best strategy, and save the cleaned dataset. The module needs to be well-documented, testable, and handle edge cases.

Complexity: 4/5
help wanted SSOC S4 Advanced
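
The imputation strategies named in the missing-values task above could be organized as a small registry of functions, as in this minimal sketch. All names here (`impute`, `STRATEGIES`, and so on) are illustrative assumptions, missing values are represented as `None`, and a single numeric column is assumed. CSV reading and writing, the evaluation metrics, and the selection of a best strategy are left out because the issue summary does not define them.

```python
from statistics import mean, median, mode
from typing import Callable, Dict, List, Optional

Column = List[Optional[float]]


def observed(col: Column) -> List[float]:
    """Return only the non-missing values (this is also the 'drop' strategy)."""
    return [v for v in col if v is not None]


def fill_with(col: Column, value: float) -> Column:
    """Replace every missing entry with a fixed value."""
    return [value if v is None else v for v in col]


def forward_fill(col: Column) -> Column:
    filled, last = [], None
    for v in col:
        if v is not None:
            last = v
        filled.append(last)  # leading gaps stay None: nothing to carry forward
    return filled


def backward_fill(col: Column) -> Column:
    # Backward fill is forward fill applied to the reversed column.
    return forward_fill(col[::-1])[::-1]


STRATEGIES: Dict[str, Callable[[Column], Column]] = {
    "drop": observed,
    "mean": lambda col: fill_with(col, mean(observed(col))),
    "median": lambda col: fill_with(col, median(observed(col))),
    "mode": lambda col: fill_with(col, mode(observed(col))),
    "ffill": forward_fill,
    "bfill": backward_fill,
}


def impute(col: Column, strategy: str) -> Column:
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")
    if not observed(col):
        raise ValueError("column has no observed values to impute from")
    return STRATEGIES[strategy](col)


column = [1.0, None, 3.0, None, 5.0]
print(impute(column, "mean"))   # [1.0, 3.0, 3.0, 3.0, 5.0]
print(impute(column, "ffill"))  # [1.0, 1.0, 3.0, 3.0, 5.0]
```

Keeping each strategy as a plain function makes it straightforward to test in isolation and to loop over all of them once an evaluation metric has been agreed on.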

AI Summary: Design a new logo for the AutoEDA project. The logo should be modern, professional, and reflect the project's focus on automated data preprocessing. The design should be submitted as part of a pull request.

Complexity: 3/5
enhancement good first issue SSOC S4 Beginner

AI Summary: The task involves creating a Jupyter Notebook for automated feature selection. The notebook should handle various feature selection methods (variance threshold, correlation analysis, model-based feature importance) on a preprocessed dataset (already cleaned and scaled), logging all steps and producing a new CSV with the selected features. The notebook needs to be generic enough to work for both classification and regression tasks.

Complexity: 4/5
help wanted SSOC S4 Advanced
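
Two of the methods named in the feature-selection task above, variance thresholding and correlation analysis, can be sketched in plain Python as follows. The function names, the column-dictionary layout, and the 0.95 correlation cutoff are assumptions made for this illustration; model-based feature importance, logging, and CSV output are omitted.

```python
from math import sqrt
from statistics import mean, pvariance
from typing import Dict, List

Columns = Dict[str, List[float]]


def variance_filter(columns: Columns, threshold: float = 0.0) -> List[str]:
    """Keep features whose population variance exceeds the threshold."""
    return [name for name, vals in columns.items() if pvariance(vals) > threshold]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; both inputs must have non-zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_filter(columns: Columns, names: List[str],
                       max_abs_corr: float = 0.95) -> List[str]:
    """Greedily keep a feature only if it is not too correlated with any kept one."""
    kept: List[str] = []
    for name in names:
        if all(abs(pearson(columns[name], columns[k])) <= max_abs_corr for k in kept):
            kept.append(name)
    return kept


data = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],  # exact multiple of "a"
    "c": [1.0, 1.0, 1.0, 1.0],  # constant
    "d": [4.0, 1.0, 3.0, 2.0],
}
# Run the variance filter first so constant columns never reach pearson().
non_constant = variance_filter(data)
selected = correlation_filter(data, non_constant)
print(non_constant)  # ['a', 'b', 'd']
print(selected)      # ['a', 'd']
```

Neither filter looks at the target variable, which is why the same code applies to both classification and regression datasets; a model-based importance step would need to branch on the task type.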
