Case Study: Global Think Tank Analyst
TL;DR
- Built Global Think Tank Analyst — a strategic-risk analysis skill for AI agents that produce policy-risk, sanctions, regulatory, geopolitical, trade, and strategic-risk memos.
- It is a domain reasoning layer (a behavior contract), not an agent framework, validator, MCP server, or eval platform — those concerns live in the companion project Agenda Intelligence MD.
- Forces agents to frame the decision, separate facts from assessments, state uncertainty, reason through actor incentives, identify scenarios, and define watch-next indicators.
- Repositioned the project as the horizontal domain skill in a three-repo portfolio: horizontal skill, vertical specialists (Central Asia & Caspian), and infrastructure / validation (Agenda Intelligence MD).
Evidence
- Public GitHub repository: github.com/vassiliylakhonin/global-think-tank-analyst
- Agent-readable orientation file: llms.txt.
- Universal agent contract: AGENTS.md (project rules: identity, honesty, evidence, naming, definition of done).
- Canonical skill behavior: SKILL.md. Codex variant: codex/SKILL.md.
- Worked illustrative memos under examples/ covering sanctions exposure, regulatory impact, scenario brief, and red-team — every example labels its evidence mode explicitly.
- Human review aids under evals/: review checklist, failure-modes catalogue, starter rubric.
- Public signal archive under signals/ with a contributor template.
Metrics
- Latest release: v1.2.0, 2026-05-08 — strategic-risk skill repositioning.
- Distribution model: plain markdown skill files, attachable to any AI agent. No CLI, no runtime.
- Five memo modes: Quick Brief, Standard Memo, Scenario Brief, Red-Team Challenge, Decision Briefing Pack.
- Four canonical evidence modes: live-source-backed, user-provided sources, illustrative source packet, reasoning-only.
- Domain coverage: policy risk, sanctions, regulatory, geopolitical, trade, strategic risk.
No production-usage, adoption, or benchmark numbers are claimed.
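The five memo modes and four evidence modes above can be encoded as simple constants for any tooling that wants to sanity-check a memo header before processing it. This is a hypothetical sketch for illustration, not an API from the repo; only the mode names themselves come from the skill.

```python
# Hypothetical sketch: memo modes and evidence modes from the Metrics
# section, encoded as constants plus a small header validator.

MEMO_MODES = {
    "Quick Brief",
    "Standard Memo",
    "Scenario Brief",
    "Red-Team Challenge",
    "Decision Briefing Pack",
}

EVIDENCE_MODES = {
    "live-source-backed",
    "user-provided sources",
    "illustrative source packet",
    "reasoning-only",
}

def validate_header(memo_mode: str, evidence_mode: str) -> list[str]:
    """Return a list of problems with a memo's declared modes (empty = ok)."""
    problems = []
    if memo_mode not in MEMO_MODES:
        problems.append(f"unknown memo mode: {memo_mode!r}")
    if evidence_mode not in EVIDENCE_MODES:
        problems.append(f"unknown evidence mode: {evidence_mode!r}")
    return problems
```

A valid pairing such as `validate_header("Quick Brief", "reasoning-only")` returns an empty list; anything outside the two canonical sets is reported by name.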
Context / Constraint
LLMs are good at summarizing geopolitical events. They are weak at turning them into decision-ready intelligence.
The common failure mode: confident-sounding regional commentary, vague "monitor closely" advice, no decision frame, no actor incentives, no triggers, no evidence boundaries.
A skill layer needed to be small enough to attach to any agent and strict enough to actually change the output — without becoming a framework or runtime.
Problem
Most AI-generated strategic-risk analysis is fluent but decision-light. It rarely says what decision is being supported, separates facts from assessments, states confidence honestly, or names the indicators that would update the judgment.
That is fine for background reading. It is weak for compliance, risk committees, sanctions-exposure decisions, regulatory planning, or any operating decision that has to be defensible.
Actions
- Reframed the project as a strategic-risk analysis skill for AI agents — a domain reasoning layer, not infrastructure.
- Wrote AGENTS.md as a canonical project-rules spec: identity, honesty rules, evidence rules, naming hierarchy, and definition of done.
- Defined the analytical contract every memo must respect:
  - Question / Decision / Audience / Time horizon / Evidence mode
  - Fact / Assessment / Assumption / Scenario / Unknown
  - Actor incentives and leverage
  - Options with trade-offs
  - Watch-next indicators (concrete, observable)
  - Confidence and key unknowns
  - What evidence would change the judgment
- Added five memo modes matching real-world request shapes: Quick Brief, Standard Memo, Scenario Brief, Red-Team Challenge, Decision Briefing Pack.
- Added four evidence modes so agents always disclose what their output is grounded in. When live verification is unavailable, the memo must include an explicit EVIDENCE ACCESS LIMITED notice and lower confidence.
- Wrote four flagship illustrative memos under examples/ to make the contract concrete.
- Added human review aids under evals/: a yes/no checklist, a failure-modes catalogue, and a starter scoring rubric — explicitly labeled as review aids, not a validated benchmark.
- Reframed the public signal archive as examples of the skill style, not official intelligence; added signals/TEMPLATE.md for contributors.
- Aligned the weekly signal-generation script with the new template and tightened its system prompt against fabrication.
- Repositioned the repo as the horizontal layer in a three-repo portfolio: horizontal skill (this repo), vertical specialists (Central Asia & Caspian), infrastructure / validation (Agenda Intelligence MD).
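The analytical contract and the evidence-disclosure rule above lend themselves to a mechanical completeness check. The sketch below is an assumption for illustration: the section labels mirror the contract fields listed in this case study, and the actual skill files may phrase or structure them differently.

```python
# Hypothetical checker for the analytical contract: does a memo draft
# mention every required contract field, and does a reasoning-only memo
# carry the mandatory evidence-access notice?

REQUIRED_SECTIONS = [
    "Question",
    "Decision",
    "Audience",
    "Time horizon",
    "Evidence mode",
    "Watch-next indicators",
    "Confidence",
    "What evidence would change the judgment",
]

def check_memo(text: str) -> list[str]:
    """Return the contract items a memo draft is missing (empty = compliant)."""
    missing = [section for section in REQUIRED_SECTIONS if section not in text]
    # When the agent cannot verify sources live, the contract demands an
    # explicit access notice rather than silently confident prose.
    if "reasoning-only" in text and "EVIDENCE ACCESS LIMITED" not in text:
        missing.append("EVIDENCE ACCESS LIMITED notice")
    return missing
```

A string-containment check like this is deliberately crude; the point is that the contract is concrete enough to lint, which is what pushes output away from fluent-but-decision-light commentary.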
What it does now
- Frames a broad geopolitical or policy question as a decision problem before producing analysis.
- Forces a clear answer to the question of which decision the analysis informs, not just a topic summary.
- Separates facts, assessments, assumptions, scenarios, and unknowns visibly.
- Discloses evidence limits when live verification is unavailable.
- Produces concrete watch-next indicators and decision triggers, not vague "monitor closely" endings.
- Travels across runtimes: ChatGPT, Claude, Gemini, Perplexity, Cursor, Codex, OpenClaw, MCP agents, RAG systems, internal copilots.
- Composes with vertical-specialist skills for region-deep analysis and with Agenda Intelligence MD for validation, scoring, and audit.
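Because the skill is a plain markdown file, "travels across runtimes" reduces to prepending that file to whatever system prompt a runtime uses. The sketch below assumes that attachment model; the file name SKILL.md is real, but the prompt-assembly helper is hypothetical, since each runtime (Claude, ChatGPT, MCP agents, and so on) attaches skills its own way.

```python
# Minimal sketch of the distribution model: read the markdown skill
# file and prepend it to an agent's base system prompt.
from pathlib import Path

def build_system_prompt(skill_path: str, base_prompt: str) -> str:
    """Prepend the skill's behavior contract to a runtime's base prompt."""
    skill = Path(skill_path).read_text(encoding="utf-8")
    return f"{skill}\n\n---\n\n{base_prompt}"
```

No CLI and no runtime means this is the whole integration surface: a file read and a string concatenation.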
What it is not
- Not an autonomous intelligence system.
- Not a factuality verifier or live source retriever.
- Not legal, compliance, sanctions, or investment advice.
- Not a generic agent framework, CLI tool, MCP server, or eval framework.
- Not a benchmarked evaluation framework.
- Not a replacement for human analyst judgment.
Portfolio context
Global Think Tank Analyst is the horizontal domain skill in a three-repo portfolio designed to compose:
- Horizontal domain skill — Global Think Tank Analyst (this case study). Reasoning method and memo modes, region- and topic-agnostic.
- Vertical specialists — central-asia-caspian-hybrid-intelligence-skill. Region-deep skills that ride on top of the horizontal method.
- Infrastructure / validation — Agenda Intelligence MD. Schemas, validation, scoring, evidence audit, CLI / MCP / CI tooling.
This repo does not duplicate either neighbor. Vertical depth lives in vertical-specialist repos; validation tooling lives in Agenda Intelligence MD.
Why this version is better
The skill is small enough to attach to any capable agent, and strict enough to change the shape of the output. The contract does not ask the model to sound smarter; it asks the model to frame the decision, label its evidence, and name what to watch next.
That is the part most generic geopolitical analysis misses.
Before / after (illustrative)
Excerpt from a live-source-backed example in the repo, condensed for this page. Full memo with sources, scenarios, options, and watch-next indicators: examples/live-source-backed-eu-ai-act-simplification.md. Evidence mode: live-source-backed.
User question: "What does the May 7, 2026 EU Council–Parliament provisional agreement on AI Act simplification (Omnibus VII) change for our compliance roadmap, and how should we adjust delivery over the next 6 months?"
Before — generic strategic-risk commentary:
The provisional agreement clarifies certain AI Act obligations and indicates a more pragmatic approach to compliance. Companies should monitor the formal adoption process, review their compliance roadmap, and adjust resourcing as needed.
Summarizes the news but does not support a decision. No frame, no evidence boundary, no scenarios, no triggers.
After — with the Global Think Tank Analyst skill attached:
- Decision frame: Question = roadmap impact; Decision = whether to slow / redirect / accelerate AI Act compliance investment over the next 6 months; Audience = head of legal & compliance and product VP at an AI provider with EU customer exposure; Time horizon = 6 months; Evidence mode = live-source-backed; Confidence = Moderate.
- Key judgment (Moderate): The agreement buys schedule, narrows two real obligations, and adds one new prohibition — but does not relax the underlying AI Act architecture. The dominant move is to redeploy compliance budget rather than cut it.
- Facts (with sources): Council and Parliament reached a provisional agreement on 2026-05-07 (Omnibus VII); national-sandbox deadline postponed to 2027-08-02; transparency grace period for AI-generated content reduced from 6 to 3 months (new deadline 2026-12-02); new prohibition on non-consensual sexual / intimate content; SMC privileges extended; GPAI supervision clarified with national-authority carve-outs.
- Assessments (separated from facts): For providers touching synthetic content, the operative tightening is the 3-month grace period, not the high-risk timeline. The new prohibition will create downstream content-moderation, model-card, and vendor-risk obligations broader than the headline.
- Scenarios (6 months): clean adoption (modal); adoption with material amendments; member-state implementation drift; litigation or political shift.
- Watch-next indicators: formal adoption notices and OJ publication date; substantive amendments before adoption; Commission delegated/implementing acts on the new prohibition; AI Office guidance on the GPAI carve-outs.
- What would change the judgment: material amendment narrowing the synthetic-content relief; a CJEU referral on the new prohibition; Commission guidance interpreting the GPAI carve-outs more broadly than the press release implies.
The skill does not retrieve sources or verify facts — that is the job of a source-backed workflow or Agenda Intelligence MD. It asks the agent to frame the decision, label its evidence, and name what would change the view.
Tech stack
- Plain markdown skill files (SKILL.md, codex/SKILL.md, AGENTS.md, llms.txt).
- Worked memo examples and human review aids in markdown.
- Lightweight Python helper for the public signal archive.
- GitHub repository with CI validating skill frontmatter.
- Agent-readable llms.txt and JSON signal index/feed.
Relevance
This project demonstrates how I think about useful agent infrastructure: small reusable layers, explicit reasoning contracts, low context cost, honest evidence discipline, and outputs that improve decisions rather than just sounding polished — composed cleanly with vertical specialist skills and a separate infrastructure layer instead of bundling everything into one repo.
Project links
Author: Vassiliy Lakhonin