Main orchestration endpoint that coordinates classification, article fetching, embedding generation, and topic discovery across microservices. Returns 202 Accepted immediately so the client can poll for completion.
| Parameter | Type | Description |
| --- | --- | --- |
| query (required) | string | User's research question. |
| auto_detect | boolean | Default: `true`. Auto-detect the best source when no `source` is given. |
| max_articles | integer (>= 1) | Default: `50`. Cap on the number of articles processed. |
| nr_topics | integer (>= 1) | Default: `5`. Desired number of topics to generate. |
| min_cluster_size | integer (>= 1) | Default: `2`. Minimum articles per topic. |
| source | string | Enum: `"arxiv"`, `"reddit"`. |
| category | string | Manual category/feed override. |
{
  "query": "string",
  "auto_detect": true,
  "max_articles": 50,
  "nr_topics": 5,
  "min_cluster_size": 2,
  "source": "arxiv",
  "category": "string"
}
{
  "code": "string",
  "message": "string",
  "details": { }
}
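A minimal client-side sketch of building the request body with the documented defaults before submitting. The function name and default handling are illustrative, not part of the API:

```python
import json

# Documented defaults for the analysis request body.
ANALYSIS_DEFAULTS = {
    "auto_detect": True,
    "max_articles": 50,
    "nr_topics": 5,
    "min_cluster_size": 2,
}

def build_analysis_request(query: str, **overrides) -> dict:
    """Build an analysis request body, applying the documented defaults
    and enforcing the >= 1 constraints before sending."""
    body = {"query": query, **ANALYSIS_DEFAULTS, **overrides}
    for field in ("max_articles", "nr_topics", "min_cluster_size"):
        if body[field] < 1:
            raise ValueError(f"{field} must be >= 1")
    return body

payload = build_analysis_request("graph neural networks", source="arxiv", category="cs.LG")
print(json.dumps(payload, indent=2))
```

Because the endpoint returns 202 Accepted, the client should not wait on this request; it polls the returned analysis ID for completion instead.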
Returns previously submitted analyses ordered by creation date. Does not include nested topic/article data, to keep the payload small.
| Parameter | Type | Description |
| --- | --- | --- |
| limit | integer | Default: `20`. Maximum number of items to return. |
| offset | integer | Default: `0`. Offset into the result set for pagination. |
{
  "total": 0,
  "limit": 0,
  "offset": 0,
  "items": [
    {
      "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
      "status": "PENDING",
      "query": "string",
      "type": "research",
      "total_articles_processed": 0,
      "created_at": "2019-08-24T14:15:22Z",
      "topics": [
        {
          "id": "string",
          "title": "string",
          "description": "string",
          "article_count": 0,
          "relevance": 100,
          "articles": [
            {
              "id": "string",
              "title": "string",
              "summary": "string",
              "authors": [
                "string"
              ],
              "published": "2019-08-24T14:15:22Z",
              "source": "arxiv",
              "metadata": {
                "arxiv_id": "2301.12345",
                "categories": [
                  "cs.AI",
                  "cs.LG"
                ],
                "subreddit": "MachineLearning",
                "score": 42,
                "num_comments": 15
              }
            }
          ]
        }
      ]
    }
  ]
}
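The `limit`/`offset` pair pages through the full list. A sketch of computing the offsets needed to fetch every page, given the `total` returned by the first request (function name is illustrative):

```python
def page_offsets(total: int, limit: int) -> list[int]:
    """Offsets needed to fetch every item, given the server's total count."""
    if limit < 1:
        raise ValueError("limit must be >= 1")
    return list(range(0, total, limit))

# e.g. 45 items at the default page size of 20 -> three requests
print(page_offsets(45, 20))  # [0, 20, 40]
```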
Returns 200 with the full result. The `status` field indicates whether the analysis is complete, in progress, or has failed.
| Parameter | Type | Description |
| --- | --- | --- |
| id (required) | string (uuid) | Identifier of the analysis to retrieve. |
{
  "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "status": "PENDING",
  "query": "string",
  "type": "research",
  "total_articles_processed": 0,
  "created_at": "2019-08-24T14:15:22Z",
  "topics": [
    {
      "id": "string",
      "title": "string",
      "description": "string",
      "article_count": 0,
      "relevance": 100,
      "articles": [
        {
          "id": "string",
          "title": "string",
          "summary": "string",
          "authors": [
            "string"
          ],
          "published": "2019-08-24T14:15:22Z",
          "source": "arxiv",
          "metadata": {
            "arxiv_id": "2301.12345",
            "categories": [
              "cs.AI",
              "cs.LG"
            ],
            "subreddit": "MachineLearning",
            "score": 42,
            "num_comments": 15
          }
        }
      ]
    }
  ]
}
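Since submission returns 202, clients typically poll this endpoint until the status leaves the in-progress states. A sketch of a capped exponential backoff schedule for that loop; note that only `"PENDING"` appears in the sample, so the terminal status names below are assumptions to check against your deployment:

```python
import itertools

def backoff_delays(base: float = 1.0, factor: float = 2.0, cap: float = 30.0):
    """Yield capped exponential delays (in seconds) between status polls."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor

def is_terminal(status: str) -> bool:
    # "PENDING" appears in the sample response; the terminal names are assumed.
    return status in {"COMPLETED", "FAILED"}

print(list(itertools.islice(backoff_delays(), 6)))  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```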
Uses a Large Language Model (LLM) to decide whether the user's query should be answered with peer-reviewed research articles (arXiv) or community-sourced discussions (Reddit). Returns the suggested source and category along with a confidence score.
| Parameter | Type | Description |
| --- | --- | --- |
| query (required) | string | User's natural language query. |
{
  "query": "string"
}
{
  "source": "arxiv",
  "source_type": "research",
  "suggested_category": "string",
  "confidence": 1
}
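One way a caller might use the confidence score: accept the classifier's suggestion only when it clears a threshold, otherwise fall back to a default source. The threshold value and function name here are illustrative choices, not part of the API:

```python
def choose_source(classification: dict, threshold: float = 0.7, fallback: str = "arxiv") -> str:
    """Accept the classified source only when confidence clears the threshold."""
    if classification.get("confidence", 0.0) >= threshold:
        return classification["source"]
    return fallback

print(choose_source({"source": "reddit", "confidence": 0.9}))  # reddit
print(choose_source({"source": "reddit", "confidence": 0.4}))  # arxiv
```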
Generates embeddings for a batch of texts and caches them under the supplied document IDs.
| Parameter | Type | Description |
| --- | --- | --- |
| texts (required) | Array of strings | Texts to generate embeddings for. |
| ids (required) | Array of strings | Document IDs for caching. |
{
  "texts": [
    "string"
  ],
  "ids": [
    "string"
  ]
}
{
  "embeddings": [
    [
      0
    ]
  ],
  "cached_count": 0,
  "found_count": 0
}
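`texts` and `ids` must line up one-to-one. A sketch that validates the pairing and splits large inputs into smaller request bodies; the batch size is an arbitrary client-side choice, not a documented API limit:

```python
def embedding_batches(texts: list[str], ids: list[str], batch_size: int = 32):
    """Yield request bodies for the embeddings endpoint, pairing each text
    with its caching ID and splitting the input into batches."""
    if len(texts) != len(ids):
        raise ValueError("texts and ids must have the same length")
    for i in range(0, len(texts), batch_size):
        yield {"texts": texts[i : i + batch_size], "ids": ids[i : i + batch_size]}

batches = list(embedding_batches(["a", "b", "c"], ["1", "2", "3"], batch_size=2))
print(len(batches))  # 2
print(batches[0])    # {'texts': ['a', 'b'], 'ids': ['1', '2']}
```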
Generates an optimized query for a specific data source using AI.
| Parameter | Type | Description |
| --- | --- | --- |
| source (required) | string | Enum: `"arxiv"`, `"reddit"`. Target data source. |
| search_terms (required) | string | Natural language search terms. |
| filters | object | Source-specific filters. |
{
  "search_terms": "string",
  "filters": {
    "category": "string",
    "subreddit": "string",
    "timeframe": "past_hour",
    "author": "string",
    "language": "string"
  }
}
{
  "query": "string",
  "description": "string",
  "source": "string"
}
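The `filters` object mixes arXiv-style fields (`category`) with Reddit-style fields (`subreddit`, `timeframe`). A client-side sketch that keeps only the keys relevant to the target source before sending; the per-source key sets below are an assumption inferred from the sample, not documented behavior:

```python
# Assumed mapping of filter keys to the source they apply to (inferred, not documented).
SOURCE_FILTER_KEYS = {
    "arxiv": {"category", "author", "language"},
    "reddit": {"subreddit", "timeframe", "author", "language"},
}

def build_query_request(source: str, search_terms: str, filters: dict) -> dict:
    """Drop filter keys that the target source is assumed not to understand."""
    allowed = SOURCE_FILTER_KEYS[source]
    pruned = {k: v for k, v in filters.items() if k in allowed}
    return {"search_terms": search_terms, "filters": pruned}

req = build_query_request(
    "reddit",
    "llm agents",
    {"category": "cs.AI", "subreddit": "MachineLearning", "timeframe": "past_hour"},
)
print(req["filters"])  # {'subreddit': 'MachineLearning', 'timeframe': 'past_hour'}
```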
Takes a prompt and returns generated text from a specified Large Language Model (e.g., Gemini, OpenRouter). This can be used for various generative tasks such as summarizing topics, creating labels, or answering questions based on provided context. The prompt can contain placeholders like [DOCUMENTS] and [KEYWORDS], which should be replaced by the caller.
| Parameter | Type | Description |
| --- | --- | --- |
| prompt (required) | string | The prompt to send to the LLM. Can include placeholders like `[DOCUMENTS]` and `[KEYWORDS]`. |
| model | string | The model to use for generation. Defaults to the service's configured model. |
| max_tokens | integer | Default: `256`. Maximum number of tokens to generate. |
| temperature | number (float) | Default: `0.7`. Controls randomness; lower is more deterministic. |
{
  "prompt": "string",
  "model": "gemini-pro",
  "max_tokens": 256,
  "temperature": 0.7
}
{
  "text": "string",
  "model": "string",
  "prompt": "string"
}
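Since the caller is responsible for replacing the [DOCUMENTS] and [KEYWORDS] placeholders, a small sketch of doing that before sending the prompt (helper name and joining format are illustrative):

```python
def fill_prompt(template: str, documents: list[str], keywords: list[str]) -> str:
    """Substitute the [DOCUMENTS] and [KEYWORDS] placeholders before
    sending the prompt to the generation endpoint."""
    return (
        template.replace("[DOCUMENTS]", "\n".join(documents))
                .replace("[KEYWORDS]", ", ".join(keywords))
    )

prompt = fill_prompt(
    "Label this topic.\nDocuments:\n[DOCUMENTS]\nKeywords: [KEYWORDS]",
    documents=["Paper on diffusion models."],
    keywords=["diffusion", "generative"],
)
print(prompt)
```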
Retrieves articles from research or community sources.
| Parameter | Type | Description |
| --- | --- | --- |
| query (required) | string | Search query (source-specific syntax). |
| limit | integer | Default: `50`. Maximum articles to fetch. |
| source (required) | string | Enum: `"arxiv"`, `"reddit"`. Specific source to fetch from. |
| category | string | Source-specific category (e.g., `cs.AI` or `MachineLearning`). |
| filters | object | Additional filters understood by the target source. |
{
  "query": "string",
  "limit": 50,
  "source": "arxiv",
  "category": "string",
  "filters": { }
}
{
  "articles": [
    {
      "id": "string",
      "title": "string",
      "summary": "string",
      "authors": [
        "string"
      ],
      "published": "2019-08-24T14:15:22Z",
      "source": "arxiv",
      "metadata": {
        "arxiv_id": "2301.12345",
        "categories": [
          "cs.AI",
          "cs.LG"
        ],
        "subreddit": "MachineLearning",
        "score": 42,
        "num_comments": 15
      }
    }
  ],
  "total_found": 0,
  "source": "arxiv"
}
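The `metadata` object's schema carries both arXiv fields and Reddit fields. A sketch of extracting only the fields relevant to each article's source; the field grouping is assumed from the sample values, not documented:

```python
def source_metadata(article: dict) -> dict:
    """Pick the metadata fields assumed to apply to the article's source."""
    meta = article.get("metadata", {})
    if article.get("source") == "arxiv":
        keys = ("arxiv_id", "categories")
    else:  # reddit
        keys = ("subreddit", "score", "num_comments")
    return {k: meta[k] for k in keys if k in meta}

article = {
    "source": "arxiv",
    "metadata": {"arxiv_id": "2301.12345", "categories": ["cs.AI"], "score": 42},
}
print(source_metadata(article))  # {'arxiv_id': '2301.12345', 'categories': ['cs.AI']}
```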
Clusters articles into coherent topics using their cached embeddings.
| Parameter | Type | Description |
| --- | --- | --- |
| query (required) | string | Original user query. |
| article_ids (required) | Array of strings | Article IDs with cached embeddings. |
| articles (required) | Array of objects (Article) | Article metadata. |
| min_cluster_size | integer | Default: `2`. Minimum articles per topic. |
| nr_topics | integer (>= 1) | Default: `5`. Maximum number of topics to generate. |
{
  "query": "string",
  "article_ids": [
    "string"
  ],
  "articles": [
    {
      "id": "string",
      "title": "string",
      "summary": "string",
      "authors": [
        "string"
      ],
      "published": "2019-08-24T14:15:22Z",
      "source": "arxiv",
      "metadata": {
        "arxiv_id": "2301.12345",
        "categories": [
          "cs.AI",
          "cs.LG"
        ],
        "subreddit": "MachineLearning",
        "score": 42,
        "num_comments": 15
      }
    }
  ],
  "min_cluster_size": 2,
  "nr_topics": 5
}
{
  "query": "string",
  "topics": [
    {
      "id": "string",
      "title": "string",
      "description": "string",
      "article_count": 0,
      "relevance": 100,
      "articles": [
        {
          "id": "string",
          "title": "string",
          "summary": "string",
          "authors": [
            "string"
          ],
          "published": "2019-08-24T14:15:22Z",
          "source": "arxiv",
          "metadata": {
            "arxiv_id": "2301.12345",
            "categories": [
              "cs.AI",
              "cs.LG"
            ],
            "subreddit": "MachineLearning",
            "score": 42,
            "num_comments": 15
          }
        }
      ]
    }
  ],
  "total_articles_processed": 0
}
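Each topic carries a `relevance` score (`100` in the sample). A purely client-side sketch of ordering the response's topics by relevance before display; the function name is illustrative:

```python
def topics_by_relevance(response: dict) -> list[dict]:
    """Return the response's topics sorted by descending relevance."""
    return sorted(response.get("topics", []), key=lambda t: t.get("relevance", 0), reverse=True)

resp = {
    "topics": [
        {"id": "a", "relevance": 40, "article_count": 2},
        {"id": "b", "relevance": 90, "article_count": 5},
    ],
    "total_articles_processed": 7,
}
ordered = topics_by_relevance(resp)
print([t["id"] for t in ordered])  # ['b', 'a']
```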