A Python library for converting PyTorch modules into Circle models, a lightweight and efficient representation used in ONE for optimized on-device neural network inference.

1 open issue needs help. Last updated: Jun 19, 2025

Open Issues Need Help

AI Summary: The task involves modifying the TICO library to convert the LlamaAttention module from the Hugging Face Llama model into a single, optimized 'attention' operation within the Circle model format. This requires integrating the RoPE (Rotary Position Embedding) and other LlamaAttention-specific operations into a custom Circle opcode, unlike previous work that relied on PyTorch's standard scaled_dot_product_attention function. The goal is to improve the efficiency of Llama model inference on ONERT.
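For readers unfamiliar with what such a fused operation would have to cover, the following is a minimal sketch of the math involved: rotary position embedding applied to queries and keys, followed by causal scaled dot-product attention. It is not TICO code and does not reflect the actual Circle opcode. The function names `rope` and `attention` are hypothetical, the rotate-half layout and the base of 10000 are assumed to match the common Llama convention, and the sketch omits the linear projections, multiple heads, and any key/value cache.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate one head vector by its position (rotate-half layout, assumed)."""
    dim = len(vec)
    half = dim // 2
    out = [0.0] * dim
    for i in range(half):
        # Each pair (i, i + half) is rotated by an angle that grows with position.
        theta = position * base ** (-2.0 * i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x1, x2 = vec[i], vec[i + half]
        out[i] = x1 * c - x2 * s
        out[i + half] = x2 * c + x1 * s
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention(queries, keys, values):
    """Causal scaled dot-product attention over RoPE-rotated queries and keys.

    Each argument is a list of per-token vectors for a single head.
    """
    dim = len(queries[0])
    q_rot = [rope(q, t) for t, q in enumerate(queries)]
    k_rot = [rope(k, t) for t, k in enumerate(keys)]
    outputs = []
    for t, q in enumerate(q_rot):
        # Token t may only attend to tokens 0..t (causal mask).
        scores = [dot(q, k_rot[s]) / math.sqrt(dim) for s in range(t + 1)]
        peak = max(scores)
        weights = [math.exp(x - peak) for x in scores]  # numerically stable softmax
        total = sum(weights)
        outputs.append([
            sum(w * values[s][j] for s, w in enumerate(weights)) / total
            for j in range(len(values[0]))
        ])
    return outputs
```

Because the rotation angle depends only on position, the score between a query and a key depends on their relative offset. This is why the position embedding can be folded into the attention computation itself rather than kept as separate operations.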

Complexity: 4/5
Labels: enhancement, help wanted

Python