LakeSail's computation framework with a mission to unify batch processing, stream processing, and compute-intensive AI workloads.

arrow artificial-intelligence big-data data data-engineering datafusion distributed-computing machine-learning pyspark python rust spark sql
13 Open Issues Need Help
Last updated: Aug 31, 2025

Open Issues Need Help

AI Summary: Users are encountering a `ValueError` when converting a Spark DataFrame, created with `spark.range()` and `withColumns()`, to an Apache Arrow table using `df.toArrow()`. The error, "Target schema's field names are not matching the table's field names," occurs sporadically (approximately 50% of the time), making it difficult to debug. A minimal reproducible example is provided.

Complexity: 4/5
bug good first issue help wanted
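
When reproducing a failure like the one summarized above, it can help to check whether the column names differ only in order or are genuinely different. The helper below is a hypothetical diagnostic sketch and is not part of Sail or PySpark: `compare_field_names` and its return labels are invented for illustration, and the idea that column ordering plays a role is only a hypothesis suggested by the sporadic nature of the error.

```python
from typing import Sequence


def compare_field_names(expected: Sequence[str], actual: Sequence[str]) -> str:
    """Classify how two lists of column names relate to each other.

    Returns "match" when names and order are identical, "reordered" when the
    same names appear in a different order, and "different" otherwise.
    """
    if list(expected) == list(actual):
        return "match"
    if sorted(expected) == sorted(actual):
        return "reordered"
    return "different"


# Illustrative column names only; they are not taken from the issue's reproducer.
expected = ["id", "a", "b"]
print(compare_field_names(expected, ["id", "a", "b"]))  # match
print(compare_field_names(expected, ["id", "b", "a"]))  # reordered
print(compare_field_names(expected, ["id", "a", "c"]))  # different
```

In practice the two lists could come from `df.columns` collected over repeated runs, or from the field names printed in the error message.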

good first issue help wanted

DuckLake Support (11 days ago)
help wanted

good first issue help wanted

good first issue help wanted

good first issue help wanted

good first issue help wanted

good first issue help wanted

good first issue help wanted

good first issue help wanted epic

good first issue help wanted

AI Summary: Investigate and replicate Spark's type casting logic for null functions in the LakeSail Sail project, ensuring parity with Spark's behavior. This involves analyzing Spark's source code and potentially implementing the logic as user-defined functions (UDFs) within Sail, also considering the broader applicability of this casting behavior to other scenarios.

Complexity: 4/5
good first issue help wanted

AI Summary: Implement support for five new Spark 4.0 functions (`>>`, `<<`, `nullifzero`, `zeroifnull`, `dayname`) within the LakeSail Sail framework, ensuring compatibility with its existing Spark SQL and DataFrame API.

Complexity: 3/5
good first issue help wanted
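
For orientation, the intended behavior of the five functions named above can be sketched in plain Python. This is a reference model under stated assumptions, not Sail's or Spark's implementation: the helper names `shift_left_int32` and `shift_right_int32` are hypothetical, the shifts are assumed to follow 32-bit JVM integer semantics (shift count masked to 5 bits, signed right shift), and `dayname` is assumed to return the three-letter English abbreviation. SQL `NULL` is modeled as `None`.

```python
import datetime
from typing import Optional

_DAY_NAMES = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")


def nullifzero(x: Optional[int]) -> Optional[int]:
    """Return None (NULL) when x is 0 or NULL, otherwise x."""
    if x is None or x == 0:
        return None
    return x


def zeroifnull(x: Optional[int]) -> int:
    """Return 0 when x is None (NULL), otherwise x."""
    return 0 if x is None else x


def dayname(d: Optional[datetime.date]) -> Optional[str]:
    """Return the abbreviated day name; weekday() is 0 for Monday."""
    return None if d is None else _DAY_NAMES[d.weekday()]


def shift_left_int32(value: int, bits: int) -> int:
    """Model `value << bits` for a signed 32-bit integer (assumed semantics)."""
    result = (value << (bits & 31)) & 0xFFFFFFFF
    # Reinterpret the low 32 bits as a signed two's-complement value.
    return result - (1 << 32) if result >= (1 << 31) else result


def shift_right_int32(value: int, bits: int) -> int:
    """Model the signed shift `value >> bits` for a 32-bit integer."""
    return value >> (bits & 31)


print(nullifzero(0), zeroifnull(None))    # None 0
print(dayname(datetime.date(2024, 1, 1)))  # Mon
print(shift_left_int32(2, 1), shift_right_int32(-8, 1))  # 4 -4
```

A model like this can double as a source of expected values when writing tests against the real implementation, provided the assumptions above are first checked against Spark itself.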
