Case Study: GrantFlow
TL;DR
- Built GrantFlow — an agent-native grant workflow API for donor-aware proposal operations, governed review, traceability, and export-ready evidence packs.
- It is not a grant-writing chatbot. It is the API layer an AI agent can discover, register with, call safely, and audit.
- Built around the operational contract a real agent runtime needs: typed .well-known discovery, scoped self-serve credentials, OAuth client-credentials, idempotency keys, deterministic generation, HITL checkpoints, audit events, and .docx / .xlsx / ZIP exports.
- Positioned for NGO / implementer teams with recurring EU and UN workflows, and for any agent runtime that needs governed grant operations rather than ad-hoc LLM calls.
Evidence
- Public GitHub repository: grantflow
- Latest release: v2.1.4 (2026-03-20).
- Universal agent contract: AGENTS.md — shortest path for agents to discover, authenticate, and use the API without reading the full repo.
- Agent quickstart: docs/agents/quickstart.md.
- MCP tool-server guide: docs/agents/mcp.md.
- Architecture, production boundaries, reference topology, and enterprise access layer documented under docs/.
Project state (self-reported)
- Distribution: public Python repository, FastAPI service, container build, MCP stdio server, optional streamable-http MCP transport via pip install "grantflow[mcp]".
- Implemented surfaces: agent discovery (/.well-known/agent-capabilities.json, agent.json, agent-policy.json, agent-tools.json, agent-recipes.json), POST /agents/onboarding, POST /agents/register, POST /agents/oauth/token, POST /agents/introspect, credential rotation and revocation, POST /agents/session, preflight, deterministic generation with idempotency keys, status / quality / grounding / events endpoints, HITL checkpoints, exports (see the discovery sketch after this list).
- CI: unit tests, mypy, ruff, supply-chain checks, demo smoke, HITL smoke, grounded LLM evaluation, docker-compose smoke, nightly grounded tail, synthetic alert delivery check.
- Donor template coverage (current strongest paths): EU and UN; USAID is conditional, depending on use case and operating constraints.
- No production-adoption, customer, or benchmark numbers are claimed. Customer-specific pilot data stays outside the public repository by design.
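To make the discovery step concrete, here is a minimal probe of those .well-known documents, assuming a local deployment at http://localhost:8000. The paths are the ones listed above; nothing is assumed about the response schemas, so the sketch only inspects top-level structure.

```python
# Minimal discovery probe. The .well-known paths are taken from this case
# study; the base URL is an assumed local deployment, and no response
# schema is assumed, so we only report top-level keys or item counts.
import requests

BASE = "http://localhost:8000"  # assumed local deployment

for doc in ("agent-capabilities.json", "agent-tools.json",
            "agent-policy.json", "agent-recipes.json"):
    resp = requests.get(f"{BASE}/.well-known/{doc}", timeout=10)
    resp.raise_for_status()
    body = resp.json()
    summary = sorted(body) if isinstance(body, dict) else f"{len(body)} items"
    print(doc, "->", summary)
```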
Context / Constraint
The next operator running a grant proposal cycle may be an AI agent, not a person clicking through a dashboard. That agent still needs operational controls: discovery, typed contracts, auth, idempotency, preflight gates, review checkpoints, audit events, and deterministic smoke tests.
A single LLM endpoint with a "draft a proposal" prompt is not a workflow — it is an unbounded text generator without traceability, governance, or export-ready outputs.
Donor reviewers and audit teams need traceable evidence; agent runtimes need stable contracts and bounded retries; NGO operators need human checkpoints and review SLAs. All three have to live in one API.
Problem
Most agent-assisted proposal workflows are wrappers around a chat model. They produce text fluently. They struggle with the operational shape of real grant work: tenanted access, idempotent generation across retries, donor-specific preflight gates, structured review states, audit events, grounding inspection, and exportable evidence packs.
Buyers cannot ship that into an EU or UN review process. Agent runtimes cannot orchestrate it reliably. Operations teams cannot audit it after the fact.
Actions
- Reframed the project from "AI proposal assistant" to agent-native grant workflow API. The core artifact is an OpenAPI surface with .well-known discovery, not a chat UI.
- Wrote AGENTS.md as the shortest path for AI agents to discover, authenticate, and use GrantFlow without reading the full repo.
- Defined the agent operational contract: discovery → onboarding → preflight → deterministic generation with an idempotency key → status / quality / events → HITL review → export (sketched in code after this list).
- Implemented self-serve credentials (signed API keys with expiry, tenant, and scopes) and OAuth client-credentials, with introspection, rotation, and revocation. Agent-critical endpoints enforce tenant_id and scopes when API-key auth is active.
- Added an MCP-style stdio tool server with tools/list / tools/call semantics and a production transport (streamable-http) for runtimes that prefer the MCP SDK.
- Implemented HITL checkpoints (architect, table of contents, MEL, logframe), critic findings, review comments with lifecycle status, SLA and portfolio signals, grounding gates, citation checks, and readiness warnings.
- Added structured agent errors for auth, idempotency, and generation startup failures so an agent can branch on them rather than parse free-form messages.
- Added exports to .docx, .xlsx, and buyer-facing ZIP evidence packs.
- Built CI for supply-chain checks, deterministic smoke, HITL smoke, grounded LLM evaluation, docker-compose smoke, nightly grounded tail, and synthetic alert-delivery checks.
- Documented production boundaries explicitly: built-in auth covers controlled deployments; enterprise IAM / OIDC / SAML / RBAC sits at the gateway / platform layer and reuses GrantFlow's onboarding metadata. Customer-specific pilot data stays outside the public repository.
- Defined a canonical pilot path: ICP = NGO / implementer teams with recurring EU / UN workflows; scope = 3–6 representative cases with named owners; exit = Go / No-Go based on cycle-time delta, review-loop delta, and trust in traceability.
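To ground that contract, a condensed agent-side walk-through follows. Only the .well-known paths and /agents/oauth/token appear verbatim in this document; every other path, payload field, and status value is an illustrative assumption about how such an API is typically shaped, not GrantFlow's documented schema.

```python
# Condensed agent-side pass through the operational contract:
# token -> preflight -> idempotent generation -> status -> HITL pause -> export.
# Paths other than /agents/oauth/token, all payload fields, and the status
# values are illustrative assumptions, not GrantFlow's documented schema.
import time
import uuid
import requests

BASE = "http://localhost:8000"  # assumed local deployment

# 1. OAuth client-credentials grant (RFC 6749); credentials come from onboarding.
token = requests.post(
    f"{BASE}/agents/oauth/token",
    data={"grant_type": "client_credentials",
          "client_id": "demo-agent", "client_secret": "..."},
    timeout=10,
).json()["access_token"]
auth = {"Authorization": f"Bearer {token}"}

# 2. Donor-aware preflight gate before any generation starts (path assumed).
preflight = requests.post(
    f"{BASE}/proposals/preflight",
    json={"donor": "EU", "call_id": "demo-call"},
    headers=auth, timeout=30,
).json()
assert preflight.get("ok"), preflight

# 3. Deterministic generation keyed by an idempotency key: replaying the
#    same key after a retry or reconnect must not start duplicate work.
idem_key = str(uuid.uuid4())
job = requests.post(
    f"{BASE}/proposals/generate",
    json={"donor": "EU", "call_id": "demo-call"},
    headers={**auth, "Idempotency-Key": idem_key}, timeout=30,
).json()

# 4. Poll status until the job pauses at a HITL checkpoint or finishes.
#    Approval happens out-of-band; the job resumes only after a human acts.
state = ""
while state not in ("awaiting_review", "done", "failed"):
    time.sleep(5)
    state = requests.get(f"{BASE}/jobs/{job['id']}/status",
                         headers=auth, timeout=10).json()["state"]

# 5. Once done, pull the buyer-facing ZIP evidence pack.
if state == "done":
    pack = requests.get(f"{BASE}/jobs/{job['id']}/export?format=zip",
                        headers=auth, timeout=60)
    with open("evidence-pack.zip", "wb") as f:
        f.write(pack.content)
```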
What it does now
- Lets an AI agent discover its capabilities, tools, policy, and recipes through .well-known endpoints before any real call.
- Onboards agents with self-serve API keys or OAuth client credentials, all tenanted, scoped, and expiring.
- Runs donor-aware preflight gates before a generation starts.
- Runs deterministic generation against an idempotency key so retries and reconnects do not duplicate work (see the retry sketch after this list).
- Surfaces status, quality, grounding, citations, version, and lifecycle events on stable endpoints, including audit-friendly job events.
- Pauses at HITL checkpoints and resumes only after explicit approval.
- Exports .docx, .xlsx, and ZIP evidence packs that are ready for donor review.
- Travels across runtimes: any HTTP-capable agent, plus MCP runtimes via stdio or streamable-http.
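The structured errors and the idempotency key combine into the agent's retry loop. Here is a sketch of how an agent might branch on typed failures, assuming a hypothetical {"error": {"code": ...}} envelope; the code values and the /proposals/generate path are invented for illustration, since only the failure classes (auth, idempotency, generation startup) are documented here.

```python
# Branch on typed error codes instead of parsing free-form messages.
# The envelope shape and the specific code strings below are hypothetical;
# the repo documents only the failure classes, not these exact values.
import requests

def start_generation(base: str, auth: dict, payload: dict,
                     idem_key: str, retries: int = 3) -> dict:
    """Start a generation job, branching on structured agent errors."""
    for _ in range(retries):
        resp = requests.post(
            f"{base}/proposals/generate", json=payload,
            headers={**auth, "Idempotency-Key": idem_key}, timeout=30,
        )
        if resp.ok:
            return resp.json()
        code = resp.json().get("error", {}).get("code", "")
        if code == "auth_token_expired":          # hypothetical: refresh credentials
            raise PermissionError("re-run the client-credentials grant first")
        if code == "idempotency_conflict":        # hypothetical: same key, new payload
            raise ValueError("mint a fresh idempotency key for a new request")
        if code == "generation_startup_failed":   # hypothetical: transient failure
            continue  # retry with the same key, so no duplicate work can start
        resp.raise_for_status()                   # anything else is unexpected
    raise RuntimeError(f"generation did not start after {retries} attempts")
```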
What it is not
- Not a grant-writing chatbot.
- Not a one-click "auto-fill my proposal" tool.
- Not a replacement for human review, donor relationship work, or organizational accountability.
- Not a database of donor calls, beneficiaries, or pre-approved language.
- Not a SaaS product with public customer references in this repository — pilot data stays outside the public repo by design.
- Not an enterprise IAM / OIDC / SAML / RBAC implementation in itself — those concerns sit at the gateway / platform layer.
Tech stack
- Python (FastAPI) service.
- OpenAPI surface with an x-grantflow-agent-recipes extension and a dedicated recipes endpoint.
- MCP tool servers: stdio (grantflow.mcp.server) and optional streamable-http (grantflow.mcp.fastmcp_server) via pip install "grantflow[mcp]" (see the stdio probe after this list).
- Self-serve signed API keys and OAuth client-credentials, with introspection, rotation, and revocation.
- Docker / docker-compose, including a pilot compose file and a production-compose example.
- Makefile-based bootstrap (make bootstrap-dev).
- CI: pytest, mypy, ruff, supply-chain checks, demo smoke, HITL smoke, grounded LLM evaluation, docker-compose smoke, nightly grounded tail, synthetic alert delivery check, release-cut and release-drafter workflows.
- Topics on the public repo: fastapi, mcp, agentic-ai, ai-agents, api-first, grant-proposals, grant-management, proposal-workflow, human-in-the-loop, traceability, donor-workflows, nonprofit-tech, workflow-automation, document-generation, openapi.
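For MCP runtimes, the stdio server can be probed directly. The sketch below assumes MCP's standard transport (newline-delimited JSON-RPC 2.0 over stdin/stdout) and the module name listed above; the handshake follows the public MCP spec, not anything GrantFlow-specific, so treat it as an illustration of the transport rather than a tested client.

```python
# Probe the stdio MCP server for its tool list. Framing and handshake
# follow the public MCP spec (newline-delimited JSON-RPC 2.0); the module
# name grantflow.mcp.server is taken from this case study.
import json
import subprocess
import sys

proc = subprocess.Popen(
    [sys.executable, "-m", "grantflow.mcp.server"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)

def send(msg: dict) -> None:
    proc.stdin.write(json.dumps(msg) + "\n")
    proc.stdin.flush()

# MCP handshake: initialize request, then an "initialized" notification.
send({"jsonrpc": "2.0", "id": 1, "method": "initialize",
      "params": {"protocolVersion": "2024-11-05", "capabilities": {},
                 "clientInfo": {"name": "probe", "version": "0.0"}}})
print(proc.stdout.readline())  # server capabilities
send({"jsonrpc": "2.0", "method": "notifications/initialized"})

# Ask the server which GrantFlow tools it exposes.
send({"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}})
print(proc.stdout.readline())  # advertised tools
proc.terminate()
```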
Companion projects
- Nonprofit Proposal Go/No-Go Engine — a runtime-agnostic decision skill that pairs naturally with GrantFlow: the skill produces the Go / Conditional Go / No-Go judgment; GrantFlow runs the governed generation, review, and export around that decision.
- Agenda Intelligence MD — schemas, validators, and evidence-audit tooling that can score and audit outputs the agent produces through GrantFlow.
Relevance
This project demonstrates how I think about practical infrastructure for agent-driven nonprofit operations: typed contracts before chat UI, governed credentials before "trust the agent", deterministic generation before clever prompting, HITL and audit events before "just ship it", and exports that hold up in front of an EU or UN reviewer.
The honest scope is also part of the design: customer pilot data stays out of the public repository, no production adoption is claimed, and the README's "Buyer Proof" section names current strongest donor-template paths rather than customer references.
Project links
Author: Vassiliy Lakhonin