3 Open Issues Need Help
Last updated: Jul 10, 2025

More parsimonious pipeline (about 2 months ago)

AI Summary: Enhance the existing pseudonymization pipeline by adding a Named Entity Recognition (NER) step, then using a library such as Faker or Mimesis to generate more realistic pseudonyms for the detected PII types (names, numbers, addresses, etc.). The goal is a more efficient and potentially higher-quality pseudonymization process than one that relies solely on OpenAI's LLMs.

Complexity: 4/5
enhancement, help wanted

Pseudonymization enhanced by OpenAI LLMs

Python
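
The detect-then-replace flow described in this issue could be sketched roughly as follows. This is a minimal illustration, not the project's code: the regex-based `find_entities` is a hypothetical stand-in for a real NER model, the function names are invented for the example, and `make_generator` uses Faker's `email` and `phone_number` providers only if Faker happens to be installed, falling back to numbered placeholders otherwise.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for the NER step: a real pipeline would use a trained
# model. These two regexes only catch e-mail addresses and simple phone numbers.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d -]{7,}\d"),
}


def find_entities(text):
    """Return (start, end, label) spans, sorted by start offset."""
    spans = []
    for label, pattern in PATTERNS.items():
        spans.extend((m.start(), m.end(), label) for m in pattern.finditer(text))
    return sorted(spans)


def make_generator():
    """Return a function label -> fake value (Faker if installed, else counters)."""
    try:
        from faker import Faker  # optional third-party dependency
    except ImportError:
        counters = defaultdict(int)

        def fallback(label):
            counters[label] += 1
            return f"<{label}_{counters[label]}>"

        return fallback
    fake = Faker()
    providers = {"EMAIL": fake.email, "PHONE": fake.phone_number}
    return lambda label: providers[label]()


def pseudonymize(text, generate):
    """Replace each detected entity; a repeated original reuses its pseudonym."""
    mapping = {}
    out, cursor = [], 0
    for start, end, label in find_entities(text):
        if start < cursor:  # skip spans that overlap an earlier replacement
            continue
        original = text[start:end]
        if original not in mapping:
            mapping[original] = generate(label)
        out.append(text[cursor:start])
        out.append(mapping[original])
        cursor = end
    out.append(text[cursor:])
    return "".join(out), mapping
```

Calling `pseudonymize(text, make_generator())` returns the rewritten text together with the original-to-pseudonym mapping. Keeping that mapping is a design choice made here so that repeated mentions stay consistent; whether the project wants to retain it is not stated in the issue.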

AI Summary: Enhance the existing pseudonymization pipeline to prevent the tagging and subsequent pseudonymization of specific entities, such as public figures and movie titles. This requires integrating an ontology or a custom list of entities to exclude from the annotation process.

Complexity: 4/5
enhancement, help wanted

Pseudonymization enhanced by OpenAI LLMs

Python
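
One way to approach the exclusion described in this issue is to filter detected entities against a custom list before they reach the pseudonymization step. The sketch below assumes entities arrive as dictionaries with `text` and `label` keys; that shape, the function names, and the two sample entries are illustrative assumptions, not the project's data model. An ontology lookup could replace the plain set.

```python
def normalize(name):
    """Case-fold and collapse whitespace so small variations still match."""
    return " ".join(name.casefold().split())


# Hypothetical exclusion list: public figures and titles to leave untouched.
EXCLUDED = {normalize(n) for n in ("Marie Curie", "The Godfather")}


def filter_entities(entities, excluded=EXCLUDED):
    """Drop entities whose normalized text appears in the exclusion list."""
    return [e for e in entities if normalize(e["text"]) not in excluded]


detected = [
    {"text": "marie  curie", "label": "PERSON"},
    {"text": "Jane Roe", "label": "PERSON"},
    {"text": "The Godfather", "label": "WORK_OF_ART"},
]
to_pseudonymize = filter_entities(detected)
```

Here only the "Jane Roe" entity remains in `to_pseudonymize`. Exact matching after normalization is the simplest policy; it will not catch aliases or partial mentions, which is where an ontology would help.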

AI Summary: Integrate support for local LLMs, specifically Ollama, into the existing pseudonymization pipeline. This involves researching suitable Ollama models (referencing the provided arXiv paper), modifying the code to handle Ollama API calls, and potentially adapting existing functions to work with different model outputs. Thorough testing will be required to ensure functionality and accuracy.

Complexity: 4/5
enhancement, help wanted

Pseudonymization enhanced by OpenAI LLMs

Python
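
For the Ollama issue above, the API call itself can be sketched with the standard library alone. The example assumes a local Ollama server on its default port and uses its `/api/generate` endpoint with streaming disabled. The model name `llama3` and the function names are placeholders chosen for illustration; the issue does not say which model or client code the project will adopt.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3", url=OLLAMA_URL):
    """Build a non-streaming generate request for Ollama."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_response(raw):
    """Extract the generated text from a non-streaming JSON reply."""
    return json.loads(raw)["response"]


def ask_ollama(prompt, model="llama3", timeout=60):
    """Send the prompt to the local server and return the model's text."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return parse_response(resp.read())
```

Splitting request building and response parsing from the network call keeps both halves testable without a running server, which may help with the testing the issue calls for. Different models may format their output differently, so any parsing of the returned text would still need per-model checks.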