Open Issues Need Help
AI Summary: This GitHub issue proposes implementing an `inspect` command for the `openaudit-catalog` tool. This command will read `catalog_meta.json` from a specified distribution path and print key statistics like catalog version, sources, artifacts, QA status, and row counts from Parquet/SQLite files, if present. The goal is to provide a quick, human-readable overview for auditing and improving maintainer experience during PR and release reviews.
Python CLI to build, validate, and publish the OpenAudit Brasil Canonical Entity Catalog.
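A minimal sketch of what the `inspect` logic could look like, using only the standard library. The function name `inspect_dist`, the metadata keys (`catalog_version`, `sources`, `artifacts`, `qa_status`), and the idea of counting rows per SQLite table are assumptions for illustration; the issue summary does not specify the real `catalog_meta.json` schema. Parquet row counts are left out here because they would need a third-party reader.

```python
import json
import sqlite3
from pathlib import Path


def inspect_dist(dist: Path) -> dict:
    """Collect a human-readable summary of a distribution directory."""
    meta = json.loads((dist / "catalog_meta.json").read_text(encoding="utf-8"))
    summary = {
        # Assumed key names; the real metadata schema may differ.
        "catalog_version": meta.get("catalog_version"),
        "sources": meta.get("sources", []),
        "artifacts": meta.get("artifacts", []),
        "qa_status": meta.get("qa_status"),
        "sqlite_rows": None,  # stays None when catalog.sqlite is absent
    }

    db_path = dist / "catalog.sqlite"
    if db_path.exists():
        conn = sqlite3.connect(db_path)
        try:
            tables = [
                row[0]
                for row in conn.execute(
                    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
                )
            ]
            # Table names come from sqlite_master, so quoting them is enough here.
            summary["sqlite_rows"] = {
                table: conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
                for table in tables
            }
        finally:
            conn.close()
    return summary


def print_summary(summary: dict) -> None:
    """Print one 'key: value' line per field."""
    for key, value in summary.items():
        print(f"{key}: {value}")
```

A CLI command could call `print_summary(inspect_dist(Path("dist")))`, where `dist` is a hypothetical distribution path.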
AI Summary: This GitHub issue proposes creating a comprehensive contribution guide for adding new data sources to the project's catalog. The guide, to be located at `docs/sources/README.md`, will include a template for documenting source records (detailing name, terms, risks, and cache policy) and a checklist for pull requests, ensuring proper documentation, synthetic fixtures, QA, and schema mapping. This initiative aims to enable safe, auditable, and high-quality contributions, reducing legal risks and accelerating catalog growth with proper governance.
AI Summary: This issue proposes implementing a new `checksum` command for the `openaudit-catalog` tool. This command will generate `checksums.sha256` files containing SHA256 hashes for key published artifacts such as `catalog.sqlite`, `catalog_meta.json`, `qa_report.json`, and `entities.parquet`. The primary goal is to ensure the integrity and auditability of these artifacts, with specific rules for hash formatting and deterministic ordering.
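The checksum step could be sketched as follows. The artifact names come from the summary; the line format (`<hex digest>`, two spaces, file name, as `sha256sum` emits) and sorting by file name for deterministic ordering are assumptions, since the issue's exact formatting rules are not reproduced here. `write_checksums` and `sha256_file` are hypothetical names.

```python
import hashlib
from pathlib import Path

# Artifact names listed in the issue summary.
ARTIFACTS = (
    "catalog.sqlite",
    "catalog_meta.json",
    "qa_report.json",
    "entities.parquet",
)


def sha256_file(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large artifacts are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksums(dist: Path) -> Path:
    """Write checksums.sha256 for the artifacts present in `dist`."""
    # Sorting by name keeps the output identical across runs (assumed rule).
    present = sorted(name for name in ARTIFACTS if (dist / name).is_file())
    lines = [f"{sha256_file(dist / name)}  {name}" for name in present]
    out = dist / "checksums.sha256"
    out.write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return out
```

With this line format, the resulting file can be checked with `sha256sum -c checksums.sha256` from inside the distribution directory.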
AI Summary: The issue proposes implementing a new `export --format parquet` command to generate an `entities.parquet` file from the staging area. This file will serve as an auditable snapshot of the catalog, including row counts in its metadata, while ensuring sensitive fields are excluded and the schema remains stable and versioned. This improves auditability and facilitates external data inspection without relying solely on SQLite.
AI Summary: This issue proposes implementing a `normalize_name` function within `openaudit_catalog` to standardize how entity names are processed. The goal is to keep catalog names consistent with the existing search-api's query normalization, thereby improving search result matching, ranking accuracy, and overall data consistency by reducing false positives.
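One plausible shape for `normalize_name` is sketched below. The specific rules (strip accents, case-fold, replace punctuation with spaces, collapse whitespace) are assumptions; the actual function would have to mirror whatever the search-api does to queries, which the summary does not describe.

```python
import re
import unicodedata


def normalize_name(name: str) -> str:
    """Normalize an entity name for matching (assumed rules, see lead-in)."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Case-fold and turn punctuation into spaces.
    cleaned = re.sub(r"[^\w\s]", " ", no_accents.casefold())
    # Collapse runs of whitespace and trim the ends.
    return " ".join(cleaned.split())


print(normalize_name("  Prefeitura de São Paulo, S.A. "))
# prints: prefeitura de sao paulo s a
```

Applying the same function at index time and at query time is what keeps the two sides comparable.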
AI Summary: This issue describes the bootstrapping of the `openaudit-entity-catalog` repository. The goal is to establish a Python CLI using Typer, define the initial folder structure, and set up standard tooling/scripts to enable contributors to build, validate, and export catalog artifacts locally.