Python CLI for building, validating, and publishing the OpenAudit Brasil Canonical Entity Catalog.

catalog cli dataset sqlite
6 open issues need help. Last updated: Feb 28, 2026

Open Issues Need Help

AI Summary: This GitHub issue proposes implementing an `inspect` command for the `openaudit-catalog` tool. This command will read `catalog_meta.json` from a specified distribution path and print key statistics like catalog version, sources, artifacts, QA status, and row counts from Parquet/SQLite files, if present. The goal is to provide a quick, human-readable overview for auditing and improving maintainer experience during PR and release reviews.

Complexity: 3/5
good first issue priority:low type:feature area:tooling
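
A minimal sketch of the summary step behind such an `inspect` command, using only the standard library. The JSON keys (`catalog_version`, `sources`, `qa_status`), the `catalog.sqlite` location inside the distribution directory, and the function name `summarize_dist` are assumptions made for illustration; the real layout of `catalog_meta.json` is not given in the issue summary. Parquet row counts are left out because they would need a third-party reader.

```python
import json
import sqlite3
from pathlib import Path


def summarize_dist(dist_path):
    """Collect a small summary of a distribution directory (hypothetical layout)."""
    dist = Path(dist_path)
    # Assumption: the metadata file lives at <dist>/catalog_meta.json.
    meta = json.loads((dist / "catalog_meta.json").read_text(encoding="utf-8"))

    summary = {
        # Hypothetical keys; a missing key is reported as None instead of failing.
        "catalog_version": meta.get("catalog_version"),
        "sources": meta.get("sources"),
        "qa_status": meta.get("qa_status"),
        "sqlite_row_counts": None,
    }

    # Row counts are only reported if the SQLite artifact is present.
    sqlite_path = dist / "catalog.sqlite"
    if sqlite_path.is_file():
        counts = {}
        conn = sqlite3.connect(str(sqlite_path))
        try:
            tables = [
                row[0]
                for row in conn.execute(
                    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
                )
            ]
            for table in tables:
                # Quote the table name as an identifier before interpolating it.
                quoted = '"' + table.replace('"', '""') + '"'
                counts[table] = conn.execute(
                    f"SELECT COUNT(*) FROM {quoted}"
                ).fetchone()[0]
        finally:
            conn.close()
        summary["sqlite_row_counts"] = counts
    return summary
```

A command wrapper would only need to call `summarize_dist(path)` and print the resulting dictionary one key per line.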

AI Summary: This GitHub issue proposes creating a comprehensive contribution guide for adding new data sources to the project's catalog. The guide, to be located at `docs/sources/README.md`, will include a template for documenting source records (detailing name, terms, risks, and cache policy) and a checklist for pull requests, ensuring proper documentation, synthetic fixtures, QA, and schema mapping. This initiative aims to enable safe, auditable, and high-quality contributions, reducing legal risks and accelerating catalog growth with proper governance.

Complexity: 3/5
good first issue priority:medium type:docs area:community

AI Summary: This issue proposes implementing a new `checksum` command for the `openaudit-catalog` tool. This command will generate `checksums.sha256` files containing SHA256 hashes for key published artifacts such as `catalog.sqlite`, `catalog_meta.json`, `qa_report.json`, and `entities.parquet`. The primary goal is to ensure the integrity and auditability of these artifacts, with specific rules for hash formatting and deterministic ordering.

Complexity: 3/5
good first issue priority:high type:feature area:security area:audit
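
One way the checksum generation could look, as a sketch under stated assumptions. The issue summary mentions rules for hash formatting and deterministic ordering without spelling them out, so this example assumes lowercase hex digests, the `<hash>  <filename>` line format used by `sha256sum`, and entries sorted by file name. The function names are hypothetical.

```python
import hashlib
from pathlib import Path

# Artifact names taken from the issue summary.
ARTIFACTS = ("catalog.sqlite", "catalog_meta.json", "qa_report.json", "entities.parquet")


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large artifacts are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksums(dist_path, names=ARTIFACTS):
    """Write checksums.sha256 for the artifacts that exist in dist_path."""
    dist = Path(dist_path)
    lines = []
    # Assumed ordering rule: sort by file name so the output is deterministic.
    for name in sorted(names):
        path = dist / name
        if path.is_file():
            # Assumed format: lowercase hex, two spaces, file name.
            lines.append(f"{sha256_of(path)}  {name}\n")
    out = dist / "checksums.sha256"
    # Write bytes so line endings are "\n" on every platform.
    out.write_bytes("".join(lines).encode("utf-8"))
    return out
```

With this line format, the resulting file can be checked with `sha256sum -c checksums.sha256` from inside the distribution directory.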

AI Summary: The issue proposes implementing a new `export --format parquet` command to generate an `entities.parquet` file from the staging area. This file will serve as an auditable snapshot of the catalog, including row counts in its metadata, while ensuring sensitive fields are excluded and the schema remains stable and versioned. This improves auditability and facilitates external data inspection without relying solely on SQLite.

Complexity: 3/5
good first issue priority:medium type:feature area:export
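
The two properties named in the summary, excluding sensitive fields and recording row counts in metadata, can be sketched independently of the Parquet writer. The example below is a standard-library illustration only: the column allowlist, the schema version string, and the function name `prepare_export` are hypothetical, and the actual write to `entities.parquet` (which would need a Parquet library) is deliberately left out.

```python
# Hypothetical stable export schema; only these columns ever leave staging.
EXPORT_COLUMNS = ("entity_id", "name", "entity_type")
SCHEMA_VERSION = "1"


def prepare_export(rows, columns=EXPORT_COLUMNS):
    """Project staging rows onto an allowlist and build file-level metadata."""
    # An allowlist (rather than a blocklist) means a sensitive field added to
    # staging later is excluded by default.
    projected = [{col: row.get(col) for col in columns} for row in rows]
    # Values are strings because Parquet key-value metadata is string-valued.
    metadata = {
        "schema_version": SCHEMA_VERSION,
        "row_count": str(len(projected)),
    }
    return projected, metadata
```

The returned rows and metadata would then be handed to whichever Parquet writer the project adopts.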

AI Summary: This issue proposes implementing a `normalize_name` function in `openaudit_catalog` to standardize how entity names are processed. The goal is for catalog names to be normalized the same way the existing search-api normalizes queries, which improves search matching and ranking accuracy, reduces false positives, and keeps the data consistent.

Complexity: 2/5
good first issue priority:high type:feature area:transform
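
A sketch of what `normalize_name` might do. The search-api's actual normalization rules are not stated in the summary, so the steps here (accent stripping, case folding, punctuation removal, whitespace collapsing) are assumptions chosen as typical for name matching; the real function would have to mirror the search-api exactly.

```python
import re
import unicodedata


def normalize_name(name):
    """Normalize an entity name for matching (assumed rules, see lead-in)."""
    # Decompose characters so accents become separate combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, e.g. "Ã" -> "A".
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Case-insensitive comparison form.
    lowered = without_marks.casefold()
    # Replace runs of punctuation, underscores, and whitespace with one space.
    cleaned = re.sub(r"[\W_]+", " ", lowered)
    return cleaned.strip()
```

For example, `normalize_name("  Prefeitura  Municipal de SÃO PAULO ")` returns `"prefeitura municipal de sao paulo"`. The function is idempotent, so applying it to already normalized names is harmless.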

AI Summary: This issue describes the bootstrapping of the `openaudit-entity-catalog` repository. The goal is to establish a Python CLI using Typer, define the initial folder structure, and set up standard tooling/scripts to enable contributors to build, validate, and export catalog artifacts locally.

Complexity: 2/5
good first issue priority:high type:chore area:tooling
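
A possible starting skeleton for the Typer CLI, shown as a sketch. The command names `build`, `validate`, and `export` follow the summary; the option names, defaults, and placeholder bodies are assumptions, and the example requires the third-party `typer` package.

```python
import typer

app = typer.Typer(help="Build, validate, and export catalog artifacts (sketch).")


@app.command()
def build(staging: str = typer.Option("staging", help="Hypothetical staging directory.")):
    """Placeholder: build catalog artifacts into the staging area."""
    typer.echo(f"build: staging={staging}")


@app.command()
def validate(staging: str = typer.Option("staging", help="Hypothetical staging directory.")):
    """Placeholder: run QA checks against the staging area."""
    typer.echo(f"validate: staging={staging}")


@app.command()
def export(fmt: str = typer.Option("sqlite", "--format", help="Output format.")):
    """Placeholder: export artifacts in the requested format."""
    typer.echo(f"export: format={fmt}")
```

Calling `app()` from a module entry point exposes the three subcommands, so each later feature issue can fill in one command body without touching the others.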
