Open Issues Need Help
AI Summary: The task is to enhance the existing pseudonymization pipeline by integrating a Named Entity Recognition (NER) step followed by the use of libraries like Faker or Mimesis to generate more realistic pseudonyms for various PII types (names, numbers, addresses, etc.). This aims to create a more efficient and potentially higher-quality pseudonymization process compared to relying solely on OpenAI's LLMs.
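A minimal sketch of the detect-then-replace flow this summary describes, using only the standard library. Everything here is an assumption made for illustration: the regex patterns in `PATTERNS` stand in for a real NER model, and the generators in `make_generators` stand in for Faker or Mimesis providers. The function names are hypothetical and are not taken from the project.

```python
import random
import re

# Stand-in for an NER model: hypothetical regex patterns per PII type.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def make_generators(seed):
    """Stand-in for Faker/Mimesis providers: one seeded generator per PII type."""
    rng = random.Random(seed)
    return {
        "EMAIL": lambda: f"user{rng.randrange(10000)}@example.com",
        "PHONE": lambda: f"555-{rng.randrange(10000):04d}",
    }


def detect_entities(text):
    """Return sorted (start, end, label) spans; a real pipeline would call NER here."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def pseudonymize(text, seed=0):
    """Replace each detected span, reusing one pseudonym per distinct original value."""
    generators = make_generators(seed)
    mapping = {}  # original value -> pseudonym
    pieces = []
    cursor = 0
    for start, end, label in detect_entities(text):
        if start < cursor:
            continue  # skip spans that overlap one already replaced
        original = text[start:end]
        if original not in mapping:
            mapping[original] = generators[label]()
        pieces.append(text[cursor:start])
        pieces.append(mapping[original])
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces), mapping
```

Keeping the `mapping` dictionary means a value that appears twice gets the same pseudonym both times, and the fixed seed makes runs repeatable. Swapping in a real NER model would only change `detect_entities`, and swapping in Faker or Mimesis would only change `make_generators`.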
AI Summary: Enhance the existing pseudonymization pipeline to prevent the tagging and subsequent pseudonymization of specific entities, such as public figures and movie titles. This requires integrating an ontology or a custom list of entities to exclude from the annotation process.
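One way the exclusion step above might look, assuming detected entities arrive as `(text, label)` pairs and the exclusion list is a plain set of names. The entries in `EXCLUDED` and the names `normalize` and `filter_entities` are hypothetical; an ontology lookup could replace the set without changing the filter.

```python
def normalize(name):
    """Case-fold and collapse whitespace so 'marie  CURIE' matches 'Marie Curie'."""
    return " ".join(name.casefold().split())


# Hypothetical exclusion list: public figures and titles that should stay untouched.
EXCLUDED = {normalize(name) for name in ["Marie Curie", "Casablanca"]}


def filter_entities(entities, excluded=EXCLUDED):
    """Drop (text, label) pairs whose normalized text is on the exclusion list."""
    return [(text, label) for text, label in entities if normalize(text) not in excluded]
```

Running the filter before annotation means excluded entities are never tagged, so the later replacement step needs no special cases.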
AI Summary: Integrate support for local LLMs, specifically Ollama, into the existing pseudonymization pipeline. This involves researching suitable Ollama models (referencing the provided arXiv paper), modifying the code to handle Ollama API calls, and potentially adapting existing functions to work with different model outputs. Thorough testing will be required to ensure functionality and accuracy.
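A rough sketch of what an Ollama call could look like, using only the standard library. It assumes an Ollama server on its default local port and uses the `/api/generate` endpoint with streaming disabled. The model name `llama3`, the prompt wording, and the function names are placeholders chosen for illustration, not choices made by the project.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

# Hypothetical prompt; the project's real prompt is not shown in the source.
PROMPT_TEMPLATE = (
    "Replace every piece of personal information in the text below "
    "with a realistic pseudonym and return only the rewritten text.\n\n{text}"
)


def build_request(text, model="llama3"):
    """Build the POST request for Ollama's /api/generate endpoint."""
    payload = {
        "model": model,  # placeholder model name
        "prompt": PROMPT_TEMPLATE.format(text=text),
        "stream": False,  # ask for a single JSON object instead of a stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def pseudonymize_with_ollama(text, model="llama3", timeout=60):
    """Send the request and return the generated text from the 'response' field."""
    with urllib.request.urlopen(build_request(text, model), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

Separating `build_request` from the network call keeps the payload testable without a running server, which helps when comparing outputs across different models.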