rajatsainju2025/code-explainer

AI/ML • Code Generation & Assistance

package: Add PyPI publish workflow (release.yml) 2 months ago

AI Summary: Create a GitHub Actions workflow (`release.yml`) that builds the `code-explainer` Python package and publishes it to PyPI (the Python Package Index). The workflow should use a PEP 517-compliant build backend, upload with `twine`, and authenticate with a PyPI API token. An optional test deployment to TestPyPI is also requested.

Complexity: 3/5

Labels: good first issue, ci, packaging

rajatsainju2025/code-explainer

0

Python
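The release steps in the summary above could be sketched in Python as follows. The real deliverable would be a YAML workflow; this only illustrates the commands such a workflow might run. It assumes the `build` and `twine` packages are installed, and the function names (`build_command`, `upload_command`, `token_env`, `release`) are hypothetical helpers, not part of the project.

```python
import glob
import os
import subprocess
import sys


def build_command():
    """Command that asks the project's PEP 517 backend for an sdist and a wheel."""
    return [sys.executable, "-m", "build"]


def upload_command(dist_files, repository="pypi"):
    """Command that uploads built files with twine.

    "pypi" and "testpypi" are repository names twine recognises by default.
    """
    if repository not in ("pypi", "testpypi"):
        raise ValueError(f"unknown repository: {repository!r}")
    if not dist_files:
        raise ValueError("nothing to upload; run the build step first")
    return [
        sys.executable, "-m", "twine", "upload",
        "--non-interactive", "--repository", repository,
        *dist_files,
    ]


def token_env(token, base_env=None):
    """Environment for twine when authenticating with a PyPI API token.

    With API tokens the username is the literal string "__token__".
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TWINE_USERNAME"] = "__token__"
    env["TWINE_PASSWORD"] = token
    return env


def release(token, repository="pypi"):
    """Build, then upload everything written to dist/ (assumed output directory)."""
    subprocess.run(build_command(), check=True)
    dist_files = sorted(glob.glob("dist/*"))
    subprocess.run(
        upload_command(dist_files, repository),
        check=True,
        env=token_env(token),
    )
```

In a workflow, the token would typically come from a repository secret, and a dry run against TestPyPI would call `release(token, repository="testpypi")`.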

AI/ML • Code Generation & Assistance

docs: Add CONTRIBUTING quickstart checklist and links to Discussions 2 months ago

AI Summary: Update the project's CONTRIBUTING.md to add a roughly five-minute quickstart checklist, links to the project's Discussions section, and references to examples/README.md and the model presets table in the main README.

Complexity: 2/5

Labels: documentation, good first issue

ci: Add Colab notebook test in CI (nbval or papermill) 2 months ago

AI Summary: Add a Continuous Integration (CI) job that runs the provided Colab notebook example so that it stays functional. This involves choosing a testing tool (nbval or papermill), integrating it into the CI pipeline, and possibly making the job conditional so that it is skipped on forks.

Complexity: 3/5

Labels: good first issue, ci
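As one possible shape for the notebook check, here is a minimal sketch that uses papermill, one of the two tools the issue names. It assumes papermill is installed in the CI environment. The `RUN_NOTEBOOK_TESTS` variable is a hypothetical opt-in switch that a workflow could set only for non-fork runs; nothing in the issue specifies it.

```python
import os
from pathlib import Path


def should_run_notebook_test(env=None):
    """Return True only when the (hypothetical) opt-in variable is set to "1".

    A CI workflow could set RUN_NOTEBOOK_TESTS=1 for runs on the main
    repository and leave it unset on forks, so the job is skipped there.
    """
    env = os.environ if env is None else env
    return env.get("RUN_NOTEBOOK_TESTS", "") == "1"


def run_notebook(notebook_path, out_dir="nb-out"):
    """Execute a notebook top to bottom and return the path of the executed copy.

    papermill raises an exception if any cell fails, which fails the CI job.
    """
    import papermill as pm  # imported lazily so the skip check works without it

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    executed = out / Path(notebook_path).name
    pm.execute_notebook(str(notebook_path), str(executed))
    return executed
```

With nbval the equivalent check would be a pytest invocation with the `--nbval` flag; nbval can also compare stored cell outputs, while the papermill sketch only checks that every cell runs.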

examples: Add more dataset samples and a data schema doc 2 months ago

AI Summary: Expand the existing code explanation datasets by adding 20-50 more small code examples to the JSON files in the `data` directory. Also document the JSON schema and fields used in these datasets in `examples/README.md` (and possibly in a `data/README.md`). The datasets should stay small enough that the evaluation tests still run quickly.

Complexity: 2/5

Labels: documentation, good first issue, data
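A schema document is easier to keep honest when a small validator accompanies it. The sketch below assumes each dataset file is a JSON list of objects with two string fields, `code` and `explanation`. Those field names are illustrative guesses, since the issue does not list the actual schema.

```python
import json

# Hypothetical schema: field name -> required Python type.
REQUIRED_FIELDS = {"code": str, "explanation": str}


def validate_records(records):
    """Return a list of human-readable problems; an empty list means valid."""
    if not isinstance(records, list):
        return ["top-level JSON value must be a list of objects"]
    errors = []
    for i, record in enumerate(records):
        if not isinstance(record, dict):
            errors.append(f"record {i}: expected an object")
            continue
        for field, expected in REQUIRED_FIELDS.items():
            value = record.get(field)
            if not isinstance(value, expected):
                errors.append(f"record {i}: '{field}' must be a {expected.__name__}")
            elif not value.strip():
                errors.append(f"record {i}: '{field}' is empty")
    return errors


def validate_file(path):
    """Load one dataset file and validate its contents."""
    with open(path, encoding="utf-8") as handle:
        return validate_records(json.load(handle))
```

Running such a check over every file in `data` as part of the test suite would catch malformed samples as new examples are added.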

docs: Add model presets table to README with commands 2 months ago

AI Summary: Update the project's README file to include a table summarizing available model presets (DistilGPT-2, CodeT5 Small/Base, CodeGPT, StarCoderBase-1B), along with their corresponding training and evaluation commands. The information should be concise and easily understandable, referencing the existing examples directory for more detailed instructions.

Complexity: 2/5

Labels: documentation, good first issue

examples: Add Google Colab quickstart notebook 2 months ago

AI Summary: Create a Google Colab notebook demonstrating the installation, training, evaluation, and optional serving of the code explainer project using the provided configuration and data. The notebook should be added to the `examples` directory and linked from the project's README.

Complexity: 2/5

Labels: documentation, good first issue

Add CodeT5 preset docs and README section on model presets 2 months ago

AI Summary: Update the project's README to include documentation for new CodeT5 and StarCoder model presets, adding a table summarizing the available presets (architecture, base model, config file, training and evaluation commands), and instructions on how to switch between them using the `configs/default.yaml` file.

Complexity: 2/5

Labels: help wanted, good first issue
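To illustrate what a presets table might contain, the sketch below keeps the presets in one Python structure and renders them as a Markdown table. The preset keys are hypothetical, and the Hugging Face model IDs are assumptions about which checkpoints the project means. Config file paths and training/evaluation commands are left out because the issue does not state them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    architecture: str  # "causal" (decoder-only) or "seq2seq" (encoder-decoder)
    base_model: str    # assumed Hugging Face model ID


# Hypothetical preset names mapped to assumed base checkpoints.
PRESETS = {
    "distilgpt2": Preset("causal", "distilgpt2"),
    "codet5-small": Preset("seq2seq", "Salesforce/codet5-small"),
    "codet5-base": Preset("seq2seq", "Salesforce/codet5-base"),
    "starcoderbase-1b": Preset("causal", "bigcode/starcoderbase-1b"),
}


def render_table(presets=PRESETS):
    """Render the presets as a Markdown table, one row per preset."""
    lines = [
        "| Preset | Architecture | Base model |",
        "| --- | --- | --- |",
    ]
    for name, preset in presets.items():
        lines.append(f"| {name} | {preset.architecture} | `{preset.base_model}` |")
    return "\n".join(lines)
```

The architecture column matters in practice because CodeT5 is an encoder-decoder model while the GPT-style and StarCoder models are decoder-only, so the two groups need different tokenization and loss masking.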

Add tests for seq2seq tokenization and causal masking 2 months ago

AI Summary: Write unit tests for the code explainer project's tokenization and masking logic for both seq2seq and causal language models. The tests should cover source/target tokenization, handling of padding tokens, prompt masking, and padding masking. They should use small configuration files to keep download times low and must run under the project's CI system.

Complexity: 4/5

Labels: help wanted, good first issue
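The masking behaviour these tests target can be sketched without downloading any model. The example below works on plain lists of token IDs and uses `-100` as the ignored label value, which is the default `ignore_index` of PyTorch's cross-entropy loss. The function names are hypothetical and are not taken from the project's code.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def causal_labels(input_ids, attention_mask, prompt_len):
    """Labels for a causal LM: copy input_ids, but ignore prompt and padding.

    The attention mask is used to find padding rather than the pad token ID,
    because some tokenizers reuse the end-of-sequence token as the pad token.
    """
    if len(input_ids) != len(attention_mask):
        raise ValueError("input_ids and attention_mask must have equal length")
    labels = []
    for position, (token, keep) in enumerate(zip(input_ids, attention_mask)):
        if position < prompt_len or keep == 0:
            labels.append(IGNORE_INDEX)
        else:
            labels.append(token)
    return labels


def seq2seq_labels(target_ids, pad_token_id):
    """Labels for a seq2seq model: ignore padded target positions."""
    return [IGNORE_INDEX if token == pad_token_id else token for token in target_ids]


def test_prompt_and_padding_are_masked():
    input_ids = [11, 12, 13, 21, 22, 0, 0]   # 3 prompt tokens, 2 answer tokens, 2 pads
    attention_mask = [1, 1, 1, 1, 1, 0, 0]
    labels = causal_labels(input_ids, attention_mask, prompt_len=3)
    assert labels == [-100, -100, -100, 21, 22, -100, -100]


def test_target_padding_is_masked():
    assert seq2seq_labels([5, 6, 1, 0, 0], pad_token_id=0) == [5, 6, 1, -100, -100]
```

Tests written against the project's real functions would follow the same pattern: build a tiny batch by hand, then assert exactly which positions contribute to the loss.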

Add example datasets and examples/ README for quick start 2 months ago

AI Summary: Create small example datasets (train, eval, test) with about 10 code-explanation pairs each in JSON format and update the examples/README.md to include instructions and copy-pastable commands for training, evaluating, and serving the code explainer using different model presets (DistilGPT-2, CodeT5-small, etc.).

Complexity: 3/5

Labels: help wanted, good first issue
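One way to produce the three small files is to write the pairs once and split them with a short script. This is a sketch under assumptions: the file names `train.json`, `eval.json`, and `test.json` and the shape of each pair are illustrative, not confirmed by the issue.

```python
import json
from pathlib import Path


def write_splits(pairs, out_dir, eval_size=10, test_size=10):
    """Write train/eval/test JSON files and return their paths by split name.

    The first `test_size` pairs go to test, the next `eval_size` to eval,
    and the remainder to train, so the splits never overlap.
    """
    if len(pairs) <= eval_size + test_size:
        raise ValueError("not enough pairs to leave any training data")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    splits = {
        "test": pairs[:test_size],
        "eval": pairs[test_size:test_size + eval_size],
        "train": pairs[test_size + eval_size:],
    }
    paths = {}
    for name, rows in splits.items():
        path = out / f"{name}.json"
        text = json.dumps(rows, indent=2, ensure_ascii=False) + "\n"
        path.write_text(text, encoding="utf-8")
        paths[name] = path
    return paths
```

With 30 hand-written pairs and the default sizes, each file ends up with 10 entries, which matches the "about 10 pairs each" target in the summary.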

Open Issues Need Help