---
title: "Google AI Edge Gallery Skills and Portable Document Capsules"
slug: "google-ai-edge-gallery-portable-documents"
summary: "A research draft on using Google AI Edge Gallery-style on-device skills with Capsules to turn offline work into portable, verifiable document formats."
status: "draft"
version: "0.1"
updated: "2026-05-07"
audience:
  - "agent builders"
  - "mobile AI developers"
  - "document workflow teams"
tags:
  - "google ai edge"
  - "agent skills"
  - "offline"
  - "documents"
  - "portable work"
canonical_path: "/research/google-ai-edge-gallery-portable-documents"
---

# Google AI Edge Gallery Skills and Portable Document Capsules

## Extending Offline Work to Portable Document Formats

### Abstract

Google AI Edge Gallery is an open-source mobile application for running generative models locally. Google describes it as a way to run open-source LLMs on device, with offline privacy, model management, benchmarks, AI chat, image input, audio transcription and translation, mobile actions, and Agent Skills that can augment local models with tools and visual result cards.

Capsules can complement that direction. A Capsule does not need to replace AI Edge Gallery or any on-device model host. It can provide the portable work artifact: source material, extracted structure, document outputs, verification history, skill instructions, and handoff state. An Edge Gallery-compatible Capsule skill would teach a local model how to read a capsule, transform offline work into document formats, and return a structured result that can be appended back to the capsule.

The research question is simple: can offline AI become useful for real work if the work product is not trapped in the app session?

---

## 1. Why Edge Gallery matters

On-device AI changes the deployment assumption. A user can run a capable model without sending data to a server. That matters for field work, disaster response, legal aid, healthcare intake, private notes, and small organizations that cannot depend on cloud access for every step.

Google's AI Edge Gallery is relevant because it combines several practical ingredients:

- local model execution
- model management and benchmarking
- multimodal input through image and audio flows
- Agent Skills as a way to extend model capability
- mobile actions for bounded device-side operations
- a developer-facing open-source codebase

Those ingredients are close to the Capsule thesis: the model can reason locally, but the artifact must carry enough state and provenance to move between people, devices, and models.

---

## 2. Capsule role

A portable document capsule can carry:

- `program.md`: current document goal, transformation steps, review notes, and next-actor instructions
- `agents.md`: actor roles, authority notes, and trust context
- `payload/documents/source/`: original notes, scans, transcripts, or structured records
- `payload/documents/output/`: generated markdown, HTML, PDF-ready HTML, or other export forms
- `manifest.json`: content index and skill trust metadata
- `chain/events.jsonl`: extraction, summarization, review, export, and approval events
- `skills/`: the instructions a local model needs to understand the capsule and produce safe outputs
- `provenance/envelope.json`: signed package metadata and verification boundary

The capsule becomes the portable document workspace. Edge Gallery, Gemma, or another on-device host becomes the local reasoning surface.
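As a minimal sketch of how a host might check that layout before handing a capsule to a local model, the snippet below lists the expected entries and reports which are absent. The function name `missing_capsule_entries` is hypothetical, and treating every entry as required is an assumption rather than part of any capsule specification.

```python
from pathlib import Path

# Entries taken from the list above. Whether each one is mandatory
# is an assumption made for this sketch.
EXPECTED_ENTRIES = [
    "program.md",
    "agents.md",
    "payload/documents/source",
    "payload/documents/output",
    "manifest.json",
    "chain/events.jsonl",
    "skills",
    "provenance/envelope.json",
]


def missing_capsule_entries(root):
    """Return the expected entries that do not exist under the capsule root."""
    root = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]
```

A host could refuse to open a capsule, or open it read-only, when the returned list is not empty.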

---

## 3. The Edge Gallery capsule skill

A Capsule skill for AI Edge Gallery should be small and explicit. It should not assume cloud APIs. It should teach the local agent to:

1. inspect `manifest.json`
2. read `program.md`, payload references, skills, and recent chain events
3. identify available source documents
4. choose a supported output target
5. transform content into a portable document format
6. return a structured result envelope to the host
7. ask the host to append a Pith-compressed event to the capsule

The host still owns file access and mutation. The model should not directly rewrite the capsule. It should return a response that the host can verify, display, and append.
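The first three steps can be sketched on the host side as a read-only context loader. This is an illustration under stated assumptions: `load_capsule_context` is a hypothetical name, the manifest is assumed to be a plain JSON object, and each line of `chain/events.jsonl` is assumed to be one uncompressed JSON object.

```python
import json
from pathlib import Path


def load_capsule_context(root, recent=5):
    """Collect what a local model needs before it chooses an output target."""
    root = Path(root)
    manifest = json.loads((root / "manifest.json").read_text(encoding="utf-8"))
    program = (root / "program.md").read_text(encoding="utf-8")

    # List source documents by name only; the host decides what to load.
    source_dir = root / "payload" / "documents" / "source"
    sources = []
    if source_dir.is_dir():
        sources = sorted(p.name for p in source_dir.iterdir() if p.is_file())

    # Keep only the most recent events so the prompt stays small.
    events = []
    events_path = root / "chain" / "events.jsonl"
    if events_path.is_file() and recent > 0:
        text = events_path.read_text(encoding="utf-8")
        lines = [ln for ln in text.splitlines() if ln.strip()]
        events = [json.loads(ln) for ln in lines[-recent:]]

    return {
        "manifest": manifest,
        "program": program,
        "sources": sources,
        "recent_events": events,
    }
```

The loader never writes to the capsule, which keeps it consistent with the rule that the host, not the model, owns mutation.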

---

## 4. Portable document targets

The first useful targets are deliberately ordinary:

| Target | Why it matters | Capsule treatment |
|---|---|---|
| Markdown | readable by humans and LLMs | canonical editable surface |
| HTML | renderable offline in browsers and WebViews | preview and artifact export |
| PDF-ready HTML | printable and shareable without SaaS | deterministic intermediate before PDF |
| JSON | machine-readable extraction | state and structured result envelope |
| Transcript text | audio/video field notes | source artifact plus summary events |

The capsule should not treat every format as equally verifiable. Markdown, JSON, and HTML are text formats that are easy to diff and inspect. PDF is valuable for distribution, but should usually be treated as a derived export, with the source markdown or HTML preserved inside the capsule.
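One way to keep a derived export tied to its preserved source is to record a digest of the source at export time. The sketch below is an assumption about how a host might do that; the record fields and both function names are hypothetical and not part of any capsule format.

```python
import hashlib
from pathlib import Path


def _sha256_of(path):
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def derived_export_record(source_path, export_name):
    """Describe an export (for example a PDF) in terms of its source file."""
    source = Path(source_path)
    return {
        "export": export_name,
        "derived_from": source.name,
        "source_sha256": _sha256_of(source),
    }


def source_unchanged(record, source_path):
    """True if the source still matches the digest stored at export time."""
    return _sha256_of(source_path) == record["source_sha256"]
```

A reviewer who receives only the PDF and the capsule can then check that the markdown or HTML inside the capsule is the version the export was generated from.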

---

## 5. Offline workflow

A minimal offline workflow looks like this:

1. user opens a capsule on a mobile device
2. host verifies the package and displays the current surface
3. user records audio, imports an image, or adds notes
4. Edge Gallery or another local model transforms the material
5. model returns `llm_return` with document text, citations, and confidence notes
6. host writes updated document output and appends a compressed event
7. user exports the capsule or a derived document

This carries more than a standalone note or PDF because the capsule keeps source, transformation, result, and handoff together.
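Steps 5 and 6 can be sketched as a host function that takes the model's `llm_return`, writes the document, and appends an event. The field names `output_name`, `document_text`, `citations`, and `confidence_notes` are assumptions for illustration, as is the event shape. The event is written as a plain JSON line; any compression step is left out of this sketch.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def apply_llm_return(root, llm_return):
    """Write the generated document and append one event to the chain."""
    root = Path(root)
    name = llm_return["output_name"]
    # Accept a bare file name only, so the model cannot steer the write
    # to a location outside the output directory.
    if not name or Path(name).name != name:
        raise ValueError(f"unsafe output name: {name!r}")

    data = llm_return["document_text"].encode("utf-8")
    output_path = root / "payload" / "documents" / "output" / name
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_bytes(data)

    event = {
        "type": "document_updated",
        "at": datetime.now(timezone.utc).isoformat(),
        "output": output_path.relative_to(root).as_posix(),
        "output_sha256": hashlib.sha256(data).hexdigest(),
        "citations": llm_return.get("citations", []),
        "confidence_notes": llm_return.get("confidence_notes", ""),
    }
    events_path = root / "chain" / "events.jsonl"
    events_path.parent.mkdir(parents=True, exist_ok=True)
    with events_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")
    return event
```

Appending rather than rewriting `events.jsonl` keeps earlier history intact, so a later reader can see each transformation in order.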

---

## 6. Disaster and low-connectivity relevance

The pattern matters in environments where the network is unavailable or not trusted. A field worker can carry a capsule with intake forms, local observations, audio transcripts, and a generated summary. The work can continue offline and later sync as a verified artifact.

This does not require a mesh network to prove value. Mesh transport can come later. The immediate proof is that work can be captured, transformed, checked, and handed off without a cloud runtime.

---

## 7. Implementation sketch

The first implementation should be a lab skill, not a protocol change:

- create `capsule-edge-gallery/SKILL.md`
- include instructions for reading a capsule and producing document outputs
- define a small result envelope with a fixed set of event types: `document_created`, `document_updated`, `citation_added`, `export_requested`
- package sample capsules in [capsules-extra](https://github.com/virionai/capsules-extra)
- test with a local/mobile model host where possible
- keep credentials and private user data out of public examples

If Edge Gallery can load modular skills from a URL, the first test should be a hosted skill URL plus a sample document capsule. If the host cannot mutate capsules directly, the skill should still return the right envelope so a surrounding harness can append the event.
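A surrounding harness could check the envelope before it appends anything. The sketch below assumes an envelope with an `event` field drawn from the four names above and a `payload` object; that two-field shape and the `validate_envelope` name are hypothetical, since the common envelope is still an open question.

```python
# Event names come from the list above; the envelope shape is an assumption.
ALLOWED_EVENTS = {
    "document_created",
    "document_updated",
    "citation_added",
    "export_requested",
}


def validate_envelope(envelope):
    """Return a list of problems; an empty list means the host may proceed."""
    if not isinstance(envelope, dict):
        return ["envelope must be an object"]
    problems = []
    event = envelope.get("event")
    if event not in ALLOWED_EVENTS:
        problems.append(f"unknown event: {event!r}")
    if not isinstance(envelope.get("payload"), dict):
        problems.append("payload must be an object")
    return problems
```

Returning problems instead of raising lets a mobile host show the user why a model response was rejected without interrupting the session.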

---

## 8. Open questions

- What exact skill package shape does Edge Gallery expect for loaded Agent Skills?
- Can the app expose enough local file access for capsule import/export without weakening sandbox boundaries?
- Which portable document formats should be first-class outputs versus derived exports?
- Can a local model reliably preserve citations from source files into the generated document payload?
- What result envelope should be common across mobile, browser, and CLI hosts?

---

## Conclusion

Google AI Edge Gallery points toward a practical future for offline model work. Capsules supply the missing portable artifact. Together they suggest a local-first document workflow where users can capture source material, ask an on-device model to transform it, verify the resulting work packet, and hand it to another person or AI system without losing context.

The goal is not offline chat. The goal is offline work that can travel.

---

## Sources reviewed

- Google AI Edge Gallery GitHub repository: https://github.com/google-ai-edge/gallery
- Google AI Edge Gallery on Google Play: https://play.google.com/store/apps/details?id=com.google.ai.edge.gallery
- Google Developers Blog, AI Edge Gallery with audio and Google Play: https://developers.googleblog.com/en/google-ai-edge-gallery-now-with-audio-and-on-google-play/
- Capsules Extra repository: https://github.com/virionai/capsules-extra