MartinLeitgab/AISafetyIntervention_LiteratureExtraction

This repository contains all outcomes created in the 2025 Scientific Literature Knowledge Extraction Tool project hosted on the Eleuther AI Discord.

7 stars · 11 forks · 7 watchers · Python · MIT License

3 Open Issues Need Help Last updated: Sep 12, 2025

Open Issues Need Help

Bug: Edges reference unknown nodes 5 months ago

bug help wanted

🚀 Feature: Improve Handling of Oversized Papers in Extraction Script 5 months ago

AI Summary: The extraction script fails on long arXiv papers whose content exceeds the AI model's context window; the requests are rejected with `BadRequestError`. As a result, valuable papers are skipped and the logs fill with noise, and there is no clear strategy for handling oversized inputs. The issue proposes discussing potential solutions, such as explicitly skipping these documents with clearer logging or implementing a chunking mechanism that processes them in smaller sections.

Complexity: 3/5

bug enhancement help wanted good first issue question
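The two options named in the summary above (skip with a clear log message, or chunk) can be sketched as follows. This is a minimal illustration, not the repository's code: `chunk_text` and `prepare_paper` are hypothetical names, and the size limit is measured in characters as a rough stand-in for tokens, which is an assumption rather than how the model actually counts its context window.

```python
import logging

logger = logging.getLogger(__name__)


def chunk_text(text: str, max_chars: int = 8000, overlap: int = 200) -> list[str]:
    """Split text into windows of at most max_chars characters.

    Consecutive windows share `overlap` characters so that content near a
    boundary appears in both neighbouring chunks.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must satisfy 0 <= overlap < max_chars")

    chunks = []
    step = max_chars - overlap  # how far the window advances each time
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


def prepare_paper(
    paper_id: str,
    text: str,
    max_chars: int = 8000,
    overlap: int = 200,
    allow_chunking: bool = True,
) -> list[str]:
    """Return the pieces of a paper to send for extraction.

    Hypothetical helper: papers within the limit pass through unchanged,
    oversized papers are either chunked or skipped with one explicit warning.
    """
    if len(text) <= max_chars:
        return [text]
    if not allow_chunking:
        logger.warning(
            "Skipping %s: %d characters exceeds the limit of %d",
            paper_id, len(text), max_chars,
        )
        return []
    return chunk_text(text, max_chars=max_chars, overlap=overlap)
```

Checking the size up front replaces a failed API call with a single readable log line. If chunking is chosen, the per-chunk results still have to be merged afterwards, which this sketch does not attempt.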

🚀 Feature: Batching Requests to OpenAI API 5 months ago

AI Summary: The `Extractor` class currently sends sequential, blocking requests to the OpenAI API, which is slow for large datasets. This issue proposes implementing batching or parallelization so that multiple requests can run concurrently, with the aim of significantly speeding up processing, making better use of API quotas, and reducing costs.

Complexity: 4/5

enhancement help wanted question
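One way to move from sequential to concurrent requests is a bounded thread pool, sketched below. This is an illustration under assumptions, not the `Extractor` implementation: `extract_concurrently` is a hypothetical name, and `extract_one` stands in for whatever function performs a single blocking API call. The sketch does not call the OpenAI API itself, and it covers only client-side parallelism, not a server-side batch endpoint.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, Sequence


def extract_concurrently(
    papers: Sequence[Any],
    extract_one: Callable[[Any], Any],
    max_workers: int = 4,
) -> list[tuple[str, Any]]:
    """Run extract_one over papers in parallel, preserving input order.

    Each result is ("ok", value) or ("error", exception), so one failed
    request does not abort the remaining ones.
    """
    results: list[tuple[str, Any]] = [("error", None)] * len(papers)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the index of the paper it belongs to.
        futures = {pool.submit(extract_one, p): i for i, p in enumerate(papers)}
        for future in as_completed(futures):
            index = futures[future]
            try:
                results[index] = ("ok", future.result())
            except Exception as exc:  # record the failure and keep going
                results[index] = ("error", exc)
    return results
```

Threads suit this case because the work is dominated by waiting on network responses. `max_workers` caps the number of requests in flight and would need tuning against the account's rate limits; retries and backoff are left out of the sketch.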
