API Reference - Colosseum

All endpoints are prefixed with your COLOSSEUM_COPILOT_API_BASE (default: https://copilot.colosseum.com/api/v1). All requests require a Bearer token:

-H "Authorization: Bearer $COLOSSEUM_COPILOT_PAT"
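For clients that are not shell scripts, the same header can be attached programmatically. The sketch below uses only the Python standard library; `build_request` is a hypothetical helper name, and it assumes the token and base URL are supplied through the two environment variables named above.

```python
import json
import os
import urllib.request
from typing import Optional

# Default base URL as stated in this reference.
DEFAULT_BASE = "https://copilot.colosseum.com/api/v1"


def build_request(path: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated request.

    A JSON body makes it a POST; no body makes it a GET.
    """
    base = os.environ.get("COLOSSEUM_COPILOT_API_BASE", DEFAULT_BASE).rstrip("/")
    token = os.environ["COLOSSEUM_COPILOT_PAT"]  # raises KeyError if the PAT is not set
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base + "/" + path.lstrip("/"),
        data=data,
        headers=headers,
        method="POST" if body is not None else "GET",
    )
```

Passing the result to `urllib.request.urlopen` sends it; building the request separately keeps the header logic easy to inspect.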

Rate limits

All limits are per-user (keyed by your Colosseum account). Exceeding a limit returns 429 with a Retry-After header.

Category	Limit	Applies to
Search	30 req/min	`/search/projects`, `/search/archives`
Analysis	10 req/min	`/analyze`, `/compare`
Concurrency	2 in-flight	All data endpoints
Source suggestions	5 req/hr	`/source-suggestions`
Feedback	10 req/hr	`/feedback`

When rate limited (429), honor the Retry-After header. Most agent runtimes handle this automatically.
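If your runtime does not retry automatically, a small wrapper is enough. This is a minimal sketch, not the service's prescribed client: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, the 1-second fallback for a missing header is an assumption, and only the seconds form of Retry-After is parsed.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], object]


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Read Retry-After as seconds; fall back to `default` if absent or malformed."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                return default
    return default


def call_with_retry(send: Callable[[], Response], max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out."""
    response = send()
    for _ in range(max_attempts - 1):
        status, headers, _body = response
        if status != 429:
            break
        sleep(retry_after_seconds(headers))  # wait as long as the server asked
        response = send()
    return response
```

The `sleep` parameter is injectable so the wrapper can be exercised without real delays.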

Endpoints

GET /colosseum_copilot/status

Auth pre-flight check. Call this first to verify your token is valid before making other API calls.

curl "$COLOSSEUM_COPILOT_API_BASE/status" \
  -H "Authorization: Bearer $COLOSSEUM_COPILOT_PAT"

Response:

Field	Type	Description
`authenticated`	boolean	Whether the token is valid
`expiresAt`	string	ISO date when the token expires
`scope`	string	Token scope (e.g., `colosseum_copilot:read`)
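A pre-flight check can be reduced to one predicate over this response. The sketch below is illustrative only: `token_is_usable` and the 5-minute safety margin are assumptions, and it treats an `expiresAt` value without a timezone as UTC because the exact ISO format is not specified here.

```python
from datetime import datetime, timedelta, timezone


def parse_iso(value: str) -> datetime:
    """Parse an ISO 8601 string; a trailing 'Z' and naive values are treated as UTC."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def token_is_usable(status: dict, now: datetime,
                    min_remaining: timedelta = timedelta(minutes=5)) -> bool:
    """True when the token is authenticated and not about to expire."""
    if not status.get("authenticated"):
        return False
    expires_at = status.get("expiresAt")
    if not expires_at:
        return True  # no expiry reported; rely on `authenticated` alone
    return parse_iso(expires_at) - now >= min_remaining
```

Call it with `datetime.now(timezone.utc)` as `now` before starting a longer batch of requests.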

GET /colosseum_copilot/filters

Fetch available filters and the canonical hackathon chronology. Use it to translate hackathon or track names into valid slugs/keys before searching, and to get startDate values for chronology-sensitive answers.

curl "$COLOSSEUM_COPILOT_API_BASE/filters" \
  -H "Authorization: Bearer $COLOSSEUM_COPILOT_PAT"

Response includes:

tracks[]: { key, name, hackathonSlug, projectCount }
hackathons[]: { slug, name, startDate, projectCount, winnerCount } — ordered chronologically (oldest first)
acceleratorBatches[]: { key, name, companyCount }
prizeTypes[]: prize category names
prizePlacements[]: placement ranks
problemTags[]: { tag, count } (top 25 by frequency)
solutionTags[]: { tag, count } (top 25 by frequency)
primitives[]: { tag, count } (top 25 by frequency)
techStack[]: { tag, count } (top 25 by frequency)
targetUsers[]: { tag, count } (top 25 by frequency)
clusters[]: { key, label, projectCount } (key format v<N>-c<N>)
archiveSources[]: { key, label, documentCount? } (documentCount is optional)
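The name-to-slug translation mentioned above can be done with a case-insensitive lookup over `hackathons[]`. This is a sketch under assumptions: `resolve_hackathons` is a hypothetical helper, and matching on exact (case-insensitive) name or slug is a simplification; real input may need fuzzier matching.

```python
def resolve_hackathons(filters: dict, names: list) -> dict:
    """Map user-supplied hackathon names or slugs to {slug, startDate}.

    Names that match nothing map to None, so the caller can ask for clarification.
    """
    index = {}
    for hackathon in filters.get("hackathons", []):
        entry = {"slug": hackathon["slug"], "startDate": hackathon.get("startDate")}
        index[hackathon["slug"].lower()] = entry
        index[hackathon["name"].lower()] = entry
    return {name: index.get(name.strip().lower()) for name in names}
```

The resolved slugs go into the `hackathons` parameter of a search, and `startDate` supports ordering statements such as "X came before Y".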

POST /colosseum_copilot/search/projects

Primary similarity search for hackathon projects.

curl -X POST "$COLOSSEUM_COPILOT_API_BASE/search/projects" \
  -H "Authorization: Bearer $COLOSSEUM_COPILOT_PAT" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "privacy wallet for stablecoin users",
    "limit": 10,
    "filters": {
      "winnersOnly": false,
      "acceleratorOnly": false
    }
  }'

Recommended defaults: limit 8–12, includeFacets false.

Request parameters:

Param	Type	Default	Description
`query`	string	`""`	Natural language query (optional, max 500 chars; omit for filter-only browsing)
`hackathons`	string[]	-	Filter by hackathon slugs, max 10 (e.g., `["cypherpunk", "breakout"]`)
`trackKeys`	string[]	-	Filter by track keys, max 10 (format `<hackathonSlug>/<trackSlug>`)
`limit`	int	10	Max results to return (max 25)
`offset`	int	0	Pagination offset (applied after ranking)
`diversify`	boolean	true	Cross-hackathon diversity ranking. Set `false` for narrow deep-dives
`includeFacets`	boolean	false	Enable facet computation (adds overhead)
`includeDiagnostics`	boolean	false	Include search diagnostics in response

Filter parameters (filters object):

Param	Type	Description
`winnersOnly`	boolean	Only prize-winning projects
`acceleratorOnly`	boolean	Only accelerator portfolio companies
`acceleratorBatchKeys`	string[]	Specific batches, max 10 (format `accelerator/<batchSlug>`)
`prizePlacements`	int[]	Prize placement ranks
`prizeTypes`	string[]	Prize categories, max 10
`isUniversityProject`	boolean	University-affiliated projects
`isSolanaMobile`	boolean	Solana Mobile projects
`techStack`	string[]	Tech stack tags, max 10
`primitives`	string[]	Primitive/protocol tags, max 10
`problemTags`	string[]	Problem domain tags, max 10
`solutionTags`	string[]	Solution approach tags, max 10
`targetUsers`	string[]	Target user segments, max 10
`clusterKeys`	string[]	Cluster keys, max 10 (format `v<N>-c<N>`)

Discover valid filter values via GET /filters.

Facet parameters:

Param	Type	Default	Description
`facets`	string[]	-	Dimensions: `hackathons`, `tracks`, `prizes`, `problemTags`, `solutionTags`, `primitives`, `techStack`, `clusters`. If omitted and `includeFacets=true`, all dimensions are computed
`facetTopK`	int	8	Max buckets per dimension (1–20)

Response:

Field	Type	Description
`results`	object[]	Array of project results (see below)
`filtersApplied`	object	`{ hackathons?: string[], trackKeys?: string[], filters?: object }`
`totalFound`	int	Total matching projects
`hasMore`	boolean	Whether more results are available
`facets`	object?	Facet buckets by dimension (only present when `includeFacets=true`)
`diagnostics`	object?	Search diagnostics (only present when `includeDiagnostics=true`)

Result object (results[]):

Field	Type	Nullable	Description
`slug`	string		Project slug
`name`	string		Project name
`oneLiner`	string	yes	Short project description
`similarity`	number		Match score
`hackathon`	object		`{ name, slug, startDate }`
`tracks`	object[]		`[{ name, key }]`
`links`	object		`{ github, demo, presentation, technicalDemo, twitter, colosseum }` (all fields nullable)
`evidence`	string[]		Short snippets showing why this matched (max 2)
`prize`	object	yes	`{ type, name?, placement?, amount?, trackName? }` (inner fields nullable)
`metrics`	object		`{ likesCount, commentsCount, updatesCount }`
`team`	object		`{ count }`
`crowdedness`	int	yes	Cluster size as a crowdedness proxy
`tags`	object	yes	`{ problemTags[], solutionTags[], primitives[], techStack[], targetUsers[] }`
`cluster`	object	yes	`{ key, label }`
`accelerator`	object	yes	`{ companySlug?, companyName?, batchKey, batchName }` (companySlug/companyName nullable)

Facet bucket shape: { key, label, count, sampleProjectSlugs[] }

Diagnostics object (when includeDiagnostics=true):

Field	Type	Description
`modeUsed`	string	`vector`, `text`, `hybrid`, or `filters`
`fallbackUsed`	boolean	Whether a fallback search tier was used
`fallbackReason`	string?	Reason for fallback (if applicable)
`vectorCandidates`	int	Candidates from vector search
`textCandidates`	int	Candidates from text search
`tagCandidates`	int	Candidates from semantic tag search
`diversityDropped`	int	Results removed by diversity filter
`totalFoundIsEstimate`	boolean	Whether `totalFound` is an estimate
`effectiveFilters`	object	Filters actually applied after resolution
`queryExpanded`	string	Query after synonym expansion

Score interpretation: Scores reflect hybrid RRF fusion across vector, text, and semantic tag channels. Use relative ranking within a result set rather than absolute thresholds.
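The documented limits and the `offset`/`hasMore` pair suggest a request builder plus a simple pagination loop. The sketch below is one way to do it, not the official client: `post` is a hypothetical callable that sends a body to `/search/projects` and returns the parsed JSON, the lower bound of 1 on `limit` is an assumption, and advancing `offset` by `limit` assumes each page is full.

```python
from typing import Callable, Iterator, Optional

MAX_QUERY_CHARS = 500  # documented maximum for `query`
MAX_LIMIT = 25         # documented maximum for `limit`


def build_project_search(query: str = "", limit: int = 10, offset: int = 0,
                         hackathons: Optional[list] = None,
                         filters: Optional[dict] = None) -> dict:
    """Build a /search/projects body, rejecting values outside the documented limits."""
    if len(query) > MAX_QUERY_CHARS:
        raise ValueError("query exceeds 500 characters")
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError("limit must be between 1 and 25")
    if hackathons is not None and len(hackathons) > 10:
        raise ValueError("at most 10 hackathon slugs")
    body = {"limit": limit, "offset": offset}
    if query:  # omit the query for filter-only browsing
        body["query"] = query
    if hackathons:
        body["hackathons"] = hackathons
    if filters:
        body["filters"] = filters
    return body


def iter_projects(post: Callable[[dict], dict], max_pages: int = 3, **params) -> Iterator[dict]:
    """Yield results page by page, advancing `offset` while `hasMore` is true."""
    limit = params.pop("limit", 10)
    offset = params.pop("offset", 0)
    for _ in range(max_pages):
        page = post(build_project_search(limit=limit, offset=offset, **params))
        yield from page.get("results", [])
        if not page.get("hasMore"):
            break
        offset += limit
```

The `max_pages` cap keeps a broad query from consuming the 30 req/min search budget.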

POST /colosseum_copilot/search/archives

Search archival documents for conceptual precedents. Auto-cascades through tiers (vector → chunk text → document text) when a tier returns no results.

curl -X POST "$COLOSSEUM_COPILOT_API_BASE/search/archives" \
  -H "Authorization: Bearer $COLOSSEUM_COPILOT_PAT" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "prediction markets governance",
    "limit": 5,
    "maxChunksPerDoc": 2
  }'

Recommended defaults: limit 4–6, maxChunksPerDoc 2, minSimilarity 0.2.

Request parameters:

Param	Type	Default	Description
`query`	string	-	Search query, required (min 1, max 500 chars). 3–6 focused keywords recommended
`sources`	string[]	-	Filter by source keys, max 20. Use `GET /filters` for valid values
`limit`	int	5	Max documents returned (max 10)
`offset`	int	0	Pagination offset (per document, not per chunk)
`maxChunksPerDoc`	int	2	Chunks per document (min 1, max 4). Up to `maxChunksPerDoc` chunks per document in `vector`/`chunk_text` tiers; `doc_text` returns one snippet per document
`maxDocsPerSource`	int	3	Cap results from any single source (0 for unlimited, max 10)
`intent`	string	`docs`	`docs` for precision, `ideation` for broader recall
`minSimilarity`	float	0.2	Minimum cosine similarity (0–1). Lower for niche queries

Response:

Field	Type	Description
`results`	object[]	Array of archive results (see below)
`filtersApplied`	object	`{ sources?: string[] }`
`searchTier`	string	Which tier produced results: `vector`, `chunk_text`, or `doc_text`
`totalFound`	int	Total matching documents
`totalMatched`	int	Total matched before pagination
`hasMore`	boolean	Whether more results are available

Result object (results[]):

Field	Type	Nullable	Description
`documentId`	string (UUID)		Archive document identifier
`title`	string		Document title
`author`	string	yes	Document author
`source`	string		Source key
`url`	string	yes	Document URL
`publishedAt`	string	yes	ISO date string
`similarity`	number		Cosine similarity score
`snippet`	string		Relevant excerpt from the document
`chunkIndex`	int		Chunk position within the document

Score interpretation: Similarity above 0.4 is a strong topical match. 0.2–0.4 is worth reading, but verify relevance. Below 0.2 is usually tangential. These thresholds apply to vector tier results. For chunk_text and doc_text tiers, prioritize snippet relevance over score magnitude.

Query tips:

Keep to 3–6 focused keywords. Too short is vague; too long dilutes embedding similarity.
If results are all pre-2010 for a modern query, re-query with ecosystem-specific terms.
If empty, try conceptual synonyms (e.g., "prediction markets" → "futarchy").
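The score guidance above depends on which tier produced the results, so a triage step can branch on `searchTier` first. In this sketch the label strings (`strong`, `verify`, `tangential`, `check-snippet`) and both function names are hypothetical; only the 0.4 and 0.2 thresholds come from this reference.

```python
def classify_archive_hit(similarity: float, search_tier: str) -> str:
    """Label one hit; the numeric thresholds are applied to the `vector` tier only."""
    if search_tier != "vector":
        return "check-snippet"  # text tiers: judge by the snippet, not the score
    if similarity > 0.4:
        return "strong"
    if similarity >= 0.2:
        return "verify"
    return "tangential"


def triage(response: dict) -> list:
    """Return (label, title) pairs for every result in a /search/archives response."""
    tier = response["searchTier"]
    return [(classify_archive_hit(hit["similarity"], tier), hit["title"])
            for hit in response.get("results", [])]
```

Hits labelled `verify` or `check-snippet` are the natural candidates for a follow-up fetch of the full document.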

GET /colosseum_copilot/archives/:documentId

Fetch a paged archive document slice.

curl "$COLOSSEUM_COPILOT_API_BASE/archives/DOCUMENT_UUID?offset=0&maxChars=8000" \
  -H "Authorization: Bearer $COLOSSEUM_COPILOT_PAT"

Parameters:

Param	Type	Default	Description
`documentId`	string (UUID)	-	Archive document identifier (required, path parameter)
`offset`	int	0	Character offset to start from (min 0)
`maxChars`	int	8000	Maximum characters to return (min 200, max 20000)

Use offset + maxChars to page through long documents. Check hasMore and use nextOffset for the next page.

Response:

Field	Type	Nullable	Description
`documentId`	string (UUID)		Archive document identifier
`title`	string		Document title
`author`	string	yes	Document author
`source`	string		Source key
`url`	string	yes	Document URL
`publishedAt`	string	yes	ISO date string
`content`	string		Document content slice
`restricted`	boolean		Whether content is truncated due to licensing
`offset`	int		Starting character offset of this slice
`maxChars`	int		Requested max characters
`totalChars`	int		Total document length in characters
`nextOffset`	int	yes	Offset for the next page (`null` if no more content)
`hasMore`	boolean		Whether more content is available
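Paging with `hasMore` and `nextOffset` can be sketched as a loop. `fetch_slice` is a hypothetical callable that requests one slice for a fixed `documentId` and returns the parsed response; the `max_pages` cap is an assumption added to bound the number of requests.

```python
from typing import Callable


def read_document(fetch_slice: Callable[[int, int], dict],
                  max_chars: int = 8000, max_pages: int = 10) -> str:
    """Concatenate document slices by following `nextOffset` until `hasMore` is false."""
    if not 200 <= max_chars <= 20000:
        raise ValueError("maxChars must be between 200 and 20000")
    parts = []
    offset = 0
    for _ in range(max_pages):
        page = fetch_slice(offset, max_chars)
        parts.append(page["content"])
        if not page.get("hasMore") or page.get("nextOffset") is None:
            break
        offset = page["nextOffset"]  # trust the server's offset rather than computing one
    return "".join(parts)
```

When `restricted` is true, the joined text may still be shorter than `totalChars`, so the loop relies on `hasMore` rather than on length arithmetic.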

GET /colosseum_copilot/projects/by-slug/:slug

Fetch full project details by slug.

curl "$COLOSSEUM_COPILOT_API_BASE/projects/by-slug/your-project-slug" \
  -H "Authorization: Bearer $COLOSSEUM_COPILOT_PAT"

Use for 1–2 top results when evidence from search results is insufficient.

Response:

Field	Type	Nullable	Description
`slug`	string		Project slug
`name`	string		Project name
`description`	string	yes	Full project description
`oneLiner`	string	yes	Short project description
`hackathon`	object		`{ name, slug, startDate }`
`tracks`	object[]		`[{ name, key }]`
`links`	object		`{ github, demo, presentation, technicalDemo, twitter, colosseum }` (all fields nullable)
`team`	object		`{ count, members[] }` where each member has `{ displayName?, username?, githubHandle?, twitterHandle? }` (all nullable)
`isWinner`	boolean		Whether the project won a prize
`accelerator`	object	yes	`{ companySlug?, companyName?, batchKey, batchName }` (companySlug/companyName nullable)
`createdAt`	string		ISO date string
`tags`	object	yes	`{ problemTags[], solutionTags[], primitives[], techStack[], targetUsers[] }`
`cluster`	object	yes	`{ key, label }`
`metrics`	object	yes	`{ likesCount, commentsCount, updatesCount }`
`prize`	object	yes	`{ type, name?, placement?, amount?, trackName? }` (inner fields nullable)

Cohort definition

The /analyze and /compare endpoints accept a shared cohort definition to scope which projects are included:

Field	Type	Description
`hackathons`	string[]	Filter by hackathon slugs
`trackKeys`	string[]	Filter by track keys (format `<hackathonSlug>/<trackSlug>`)
`winnersOnly`	boolean	Only prize-winning projects
`acceleratorOnly`	boolean	Only accelerator portfolio companies
`acceleratorBatchKeys`	string[]	Specific batches (format `accelerator/<batchSlug>`)
`prizePlacements`	int[]	Prize placement ranks
`clusterKeys`	string[]	Cluster keys (format `v<N>-c<N>`)

All fields are optional. An empty cohort {} includes all projects.
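Because unknown fields are rejected with `INVALID_QUERY`, it can help to check a cohort locally before sending it. The sketch below is an assumption-laden convenience, not server behavior: `cohort_problems` is a hypothetical helper, and the regular expressions are a loose reading of the key formats in the table above.

```python
import re

# Field name -> optional pattern each list item is expected to match.
COHORT_FIELDS = {
    "hackathons": None,
    "trackKeys": re.compile(r"[^/\s]+/[^/\s]+"),              # <hackathonSlug>/<trackSlug>
    "winnersOnly": None,
    "acceleratorOnly": None,
    "acceleratorBatchKeys": re.compile(r"accelerator/[^/\s]+"),  # accelerator/<batchSlug>
    "prizePlacements": None,
    "clusterKeys": re.compile(r"v\d+-c\d+"),                  # v<N>-c<N>
}


def cohort_problems(cohort: dict) -> list:
    """Return human-readable problems; an empty list means the cohort looks well-formed."""
    problems = []
    for field, value in cohort.items():
        if field not in COHORT_FIELDS:
            problems.append(f"unknown field: {field}")
            continue
        pattern = COHORT_FIELDS[field]
        if pattern is not None:
            problems += [f"{field}: bad key {item!r}" for item in value
                         if not pattern.fullmatch(str(item))]
    return problems
```

An empty cohort `{}` passes, matching the rule that it selects all projects.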

POST /colosseum_copilot/analyze

Summarize tag/track distributions for a cohort.

curl -X POST "$COLOSSEUM_COPILOT_API_BASE/analyze" \
  -H "Authorization: Bearer $COLOSSEUM_COPILOT_PAT" \
  -H "Content-Type: application/json" \
  -d '{
    "cohort": { "hackathons": ["breakout", "radar"], "winnersOnly": true },
    "dimensions": ["tracks", "problemTags"],
    "topK": 5,
    "samplePerBucket": 1
  }'

Request parameters:

Param	Type	Default	Description
`cohort`	object	-	Cohort definition (see above), required
`dimensions`	string[]	-	Dimensions to analyze: `tracks`, `problemTags`, `solutionTags`, `primitives`, `techStack`, `targetUsers`, `clusters`
`topK`	int	10	Max buckets per dimension (1–20)
`samplePerBucket`	int	2	Sample project slugs per bucket (0–5)

Response:

Field	Type	Description
`totals`	object	`{ projects, winners }` (counts for the cohort)
`buckets`	object	Keyed by dimension name, each an array of `{ key, label, count, share, sampleProjectSlugs[] }`
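The request limits and the `buckets` shape above translate into a small builder and reader. Both function names are hypothetical, and the sketch assumes `share` is a fraction of the cohort, which this section does not state explicitly.

```python
ANALYZE_DIMENSIONS = {"tracks", "problemTags", "solutionTags", "primitives",
                      "techStack", "targetUsers", "clusters"}


def build_analyze_request(cohort: dict, dimensions: list,
                          top_k: int = 10, sample_per_bucket: int = 2) -> dict:
    """Build an /analyze body, enforcing the documented ranges."""
    unknown = sorted(set(dimensions) - ANALYZE_DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {unknown}")
    if not 1 <= top_k <= 20:
        raise ValueError("topK must be between 1 and 20")
    if not 0 <= sample_per_bucket <= 5:
        raise ValueError("samplePerBucket must be between 0 and 5")
    return {"cohort": cohort, "dimensions": list(dimensions),
            "topK": top_k, "samplePerBucket": sample_per_bucket}


def bucket_summary(response: dict, dimension: str) -> list:
    """Return (label, count, share) tuples for one dimension, in response order."""
    return [(bucket["label"], bucket["count"], bucket["share"])
            for bucket in response.get("buckets", {}).get(dimension, [])]
```

A dimension that was not requested simply yields an empty list from `bucket_summary`.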

POST /colosseum_copilot/compare

Compare two cohorts across the same dimensions.

curl -X POST "$COLOSSEUM_COPILOT_API_BASE/compare" \
  -H "Authorization: Bearer $COLOSSEUM_COPILOT_PAT" \
  -H "Content-Type: application/json" \
  -d '{
    "cohortA": { "hackathons": ["breakout"], "winnersOnly": true },
    "cohortB": { "hackathons": ["breakout"], "winnersOnly": false },
    "dimensions": ["tracks", "problemTags"],
    "topK": 5
  }'

Request parameters:

Param	Type	Default	Description
`cohortA`	object	-	First cohort definition (see above), required
`cohortB`	object	-	Second cohort definition (see above), required
`dimensions`	string[]	-	Dimensions to compare: `tracks`, `problemTags`, `solutionTags`, `primitives`, `techStack`, `targetUsers`, `clusters`
`topK`	int	10	Max items per dimension (1–20)

Response:

Field	Type	Description
`totalsA`	object	`{ projects, winners }` (counts for cohort A)
`totalsB`	object	`{ projects, winners }` (counts for cohort B)
`results`	object	Keyed by dimension name, each an array of comparison items

Comparison item shape:

Field	Type	Description
`key`	string	Dimension value key
`label`	string	Human-readable label
`countA`	int	Count in cohort A
`shareA`	number	Share in cohort A (0–1)
`countB`	int	Count in cohort B
`shareB`	number	Share in cohort B (0–1)
`lift`	number	Ratio of shares (shareA / shareB); values above 1 mean the item is over-represented in cohort A
`delta`	number	Absolute difference (shareA − shareB)
`examplesA`	string[]	Sample project slugs from cohort A
`examplesB`	string[]	Sample project slugs from cohort B
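To make the two comparison metrics concrete, the sketch below recomputes them from `shareA` and `shareB` and ranks items by the size of the gap. The helper names are hypothetical, and returning `None` when `shareB` is zero is an assumption; how the API reports `lift` in that case is not documented here.

```python
from typing import Optional, Tuple


def lift_and_delta(share_a: float, share_b: float) -> Tuple[Optional[float], float]:
    """Recompute lift (ratio) and delta (difference) from two shares in the 0-1 range."""
    lift = share_a / share_b if share_b > 0 else None  # undefined when cohort B has no matches
    return lift, share_a - share_b


def biggest_gaps(items: list, n: int = 3) -> list:
    """Return the keys of the n items whose shares differ most between the cohorts."""
    ranked = sorted(items, key=lambda item: abs(item["shareA"] - item["shareB"]), reverse=True)
    return [item["key"] for item in ranked[:n]]
```

For example, shares of 0.5 and 0.25 give a lift of 2.0 and a delta of 0.25: the item is twice as common in cohort A, which is 25 percentage points more.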

GET /colosseum_copilot/clusters/:key

Fetch cluster details.

curl "$COLOSSEUM_COPILOT_API_BASE/clusters/v1-c12" \
  -H "Authorization: Bearer $COLOSSEUM_COPILOT_PAT"

Response:

Field	Type	Description
`key`	string	Cluster key (format `v<N>-c<N>`)
`label`	string	Cluster label
`summary`	string	LLM-generated cluster description
`projectCount`	int	Total projects in cluster
`winnerCount`	int	Prize-winning projects in cluster
`representativeProjects`	object[]	`[{ slug, name, oneLiner, isWinner }]`
`topTags`	object	`{ problemTags: [{ tag, count }], primitives: [{ tag, count }], techStack: [{ tag, count }] }`

POST /colosseum_copilot/source-suggestions

Suggest a new source for the archive corpus.

curl -X POST "$COLOSSEUM_COPILOT_API_BASE/source-suggestions" \
  -H "Authorization: Bearer $COLOSSEUM_COPILOT_PAT" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/solana-mev-research",
    "name": "MEV Research Blog",
    "reason": "Great technical analysis of Solana MEV strategies"
  }'

Parameters:

Param	Type	Required	Description
`url`	string	Yes	URL of the source to suggest (must be a valid URL)
`name`	string	No	Name or title of the source (max 200 chars)
`reason`	string	No	Why this source would be valuable (max 500 chars)

Response: 201 Created

{ "message": "Thanks! We'll review your suggestion." }

Every submission is reviewed by the team. Approved sources are added to the archive pipeline.

POST /colosseum_copilot/feedback

Report errors, quality issues, or suggestions to help improve Copilot.

curl -X POST "$COLOSSEUM_COPILOT_API_BASE/feedback" \
  -H "Authorization: Bearer $COLOSSEUM_COPILOT_PAT" \
  -H "Content-Type: application/json" \
  -d '{
    "category": "quality",
    "message": "Search returned low-relevance results for DePIN query",
    "severity": "medium",
    "context": { "query": "DePIN infrastructure", "endpoint": "/search/projects" }
  }'

Parameters:

Param	Type	Required	Description
`category`	string	Yes	One of: `error`, `quality`, `suggestion`, `other`
`message`	string	Yes	Description of the issue (max 5000 chars)
`severity`	string	No	One of: `low`, `medium` (default), `high`, `critical`
`context`	object	No	Structured context such as query, endpoint, error details (max 10KB)

Response: 201 Created

{ "message": "Feedback received. Thank you." }

High and critical severity feedback is escalated to the team immediately.
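With only 10 feedback requests per hour, validating the payload before sending avoids wasting one on a 400. This is a sketch: `build_feedback` is a hypothetical helper, and measuring the 10KB context limit as UTF-8 encoded JSON is an assumption about how the server counts.

```python
import json
from typing import Optional

CATEGORIES = ("error", "quality", "suggestion", "other")
SEVERITIES = ("low", "medium", "high", "critical")


def build_feedback(category: str, message: str, severity: str = "medium",
                   context: Optional[dict] = None) -> dict:
    """Build a /feedback body, checking the documented enums and size limits."""
    if category not in CATEGORIES:
        raise ValueError(f"category must be one of {CATEGORIES}")
    if severity not in SEVERITIES:
        raise ValueError(f"severity must be one of {SEVERITIES}")
    if not message or len(message) > 5000:
        raise ValueError("message is required and limited to 5000 characters")
    body = {"category": category, "message": message, "severity": severity}
    if context is not None:
        if len(json.dumps(context).encode("utf-8")) > 10 * 1024:
            raise ValueError("context exceeds 10KB")
        body["context"] = context
    return body
```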

Error handling

All errors return:

{ "error": "<message>", "code": "<ERROR_CODE>", "retryable": <boolean> }

Server errors (5xx) also include a requestId field for log correlation when reporting issues.

Status	Code	Retryable	Meaning
400	`INVALID_JSON`	No	Request body contains invalid JSON
400	`INVALID_QUERY`	No	Bad params or unknown fields
400	`BAD_REQUEST`	No	Malformed request body
401	`UNAUTHORIZED`	No	Missing or invalid PAT
403	`FORBIDDEN`	No	Access denied for this resource
404	`NOT_FOUND`	No	Resource not found
413	`PAYLOAD_TOO_LARGE`	No	Request body exceeds 1 MB size limit
415	`UNSUPPORTED_MEDIA_TYPE`	No	Unsupported content encoding or charset
429	`RATE_LIMITED`	Yes	Rate or concurrency limit exceeded
500	`INTERNAL_ERROR`	Yes	Unexpected server error
503	`SERVICE_UNAVAILABLE`	Yes	Infrastructure temporarily unavailable

Some 5xx responses may use a more specific code derived from the server-side error class instead of INTERNAL_ERROR. Treat any 5xx with retryable: true as transient and include the requestId when reporting issues. For 429: check the Retry-After header for seconds to wait.
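Because 5xx codes can vary, client logic is safer keyed on the `retryable` flag than on specific code strings. The sketch below shows one way to do that; the function names are hypothetical, and the status-based fallback for bodies without a `retryable` field is an assumption for responses that did not come from the API itself (for example, from a proxy).

```python
def should_retry(status: int, body: dict) -> bool:
    """Prefer the API's own `retryable` flag; otherwise fall back on the status code."""
    if isinstance(body, dict) and "retryable" in body:
        return bool(body["retryable"])
    return status == 429 or status >= 500


def describe_error(status: int, body: dict) -> str:
    """Format an error for logs, including `requestId` when the server supplied one."""
    text = f"{status} {body.get('code', 'UNKNOWN')}: {body.get('error', '')}"
    request_id = body.get("requestId")
    if request_id:
        text += f" (requestId={request_id})"
    return text
```

The string from `describe_error` is suitable for the `message` or `context` of a feedback report.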
