NicheExplorer API (1.0.0)

License: MIT

Unified API specification for NicheExplorer microservices. Provides topic discovery, article fetching, and AI-powered analysis.

Analysis

Launch analysis job

Main orchestration endpoint that coordinates classification, article fetching, embedding generation, and topic discovery across microservices. Returns 202 Accepted immediately so the client can poll for completion.

Request Body schema: application/json
query
required
string

User's research question

auto_detect
boolean
Default: true

Auto-detect the best source when source is omitted.

max_articles
integer >= 1
Default: 50

Cap on number of articles processed.

nr_topics
integer >= 1
Default: 5

The desired number of topics to be generated.

min_cluster_size
integer >= 1
Default: 2

Minimum articles per topic.

source
string
Enum: "arxiv" "reddit"

category
string

Manual category/feed override.

Responses

Request samples

Content type
application/json
{
  "query": "string",
  "auto_detect": true,
  "max_articles": 50,
  "nr_topics": 5,
  "min_cluster_size": 2,
  "source": "arxiv",
  "category": "string"
}
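
A minimal client-side sketch of assembling this request body, in Python. The helper name `build_analysis_request` is hypothetical; the field names, defaults, enum values, and `>= 1` constraints come from the schema above. Rejecting an empty `query` is an assumption of the sketch, and the endpoint path, base URL, and authentication are not given in this section, so sending the request is left out.

```python
import json

ALLOWED_SOURCES = {"arxiv", "reddit"}  # enum from the schema above


def build_analysis_request(query, auto_detect=True, max_articles=50,
                           nr_topics=5, min_cluster_size=2,
                           source=None, category=None):
    """Return a dict shaped like the 'Launch analysis job' request body."""
    # Assumption of this sketch: an empty query is treated as missing.
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query is required and must be a non-empty string")

    # The schema declares these three fields as integer >= 1.
    for name, value in (("max_articles", max_articles),
                        ("nr_topics", nr_topics),
                        ("min_cluster_size", min_cluster_size)):
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be an integer >= 1")

    if source is not None and source not in ALLOWED_SOURCES:
        raise ValueError(f"source must be one of {sorted(ALLOWED_SOURCES)}")

    body = {
        "query": query,
        "auto_detect": auto_detect,
        "max_articles": max_articles,
        "nr_topics": nr_topics,
        "min_cluster_size": min_cluster_size,
    }
    # Optional fields are only included when the caller sets them.
    if source is not None:
        body["source"] = source
    if category is not None:
        body["category"] = category
    return body


print(json.dumps(build_analysis_request("graph neural networks"), indent=2))
```

Leaving `source` out of the body keeps `auto_detect` meaningful, since the schema says auto-detection applies when `source` is omitted.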

Response samples

Content type
application/json
{
  "code": "string",
  "message": "string",
  "details": { }
}

List analyses (paginated)

Returns previously submitted analyses ordered by creation date. Nested topic and article data are omitted to keep the payload small.

query Parameters
limit
integer
Default: 20

Maximum number of items to return.

offset
integer
Default: 0

Offset into the result-set for pagination.

Responses

Response samples

Content type
application/json
{
  "total": 0,
  "limit": 0,
  "offset": 0,
  "items": [ ]
}
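
The `limit`, `offset`, and `total` fields above are enough to walk the whole list. The following is a sketch only: `iter_analyses` and `fetch_page` are hypothetical names, and `fetch_page(limit, offset)` stands in for whatever HTTP call returns the decoded JSON body shown above. It assumes `total` is the number of analyses across all pages.

```python
def iter_analyses(fetch_page, limit=20):
    """Yield every analysis item, requesting one page at a time."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items = page["items"]
        yield from items
        offset += len(items)
        # Stop on an empty page or once every reported item has been seen.
        if not items or offset >= page["total"]:
            break


# In-memory stand-in for the HTTP call so the sketch runs without a server.
_stored = [{"id": n} for n in range(45)]


def fake_fetch_page(limit, offset):
    return {"total": len(_stored), "limit": limit, "offset": offset,
            "items": _stored[offset:offset + limit]}


print(len(list(iter_analyses(fake_fetch_page, limit=20))))
```

Advancing by the number of items actually returned, rather than by `limit`, keeps the loop correct if the server returns a shorter page than requested.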

Poll analysis status

Returns 200 with the full analysis object. The status field indicates whether the analysis is complete, in progress, or failed.

path Parameters
id
required
string <uuid>

Responses

Response samples

Content type
application/json
{
  "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "status": "PENDING",
  "query": "string",
  "type": "research",
  "feed_url": "http://example.com",
  "total_articles_processed": 0,
  "created_at": "2019-08-24T14:15:22Z",
  "topics": [ ]
}
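
Because the launch endpoint returns immediately, a client typically polls this endpoint until `status` reaches a final value. The sketch below shows one way to do that. Only `"PENDING"` appears in this section, so the terminal status names used in the example (`"COMPLETED"`, `"FAILED"`) are assumptions and are passed in by the caller. `wait_for_analysis` and `fetch_status` are hypothetical names; `fetch_status(id)` stands in for the HTTP call.

```python
import time


def wait_for_analysis(fetch_status, analysis_id, terminal_statuses,
                      interval_s=2.0, timeout_s=300.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until the analysis reports a terminal status or time runs out."""
    deadline = clock() + timeout_s
    while True:
        result = fetch_status(analysis_id)
        if result["status"] in terminal_statuses:
            return result
        if clock() >= deadline:
            raise TimeoutError(
                f"analysis {analysis_id} still {result['status']}")
        sleep(interval_s)


# In-memory stand-in for the HTTP call so the sketch runs without a server.
_responses = iter(["PENDING", "PENDING", "COMPLETED"])


def fake_fetch(analysis_id):
    return {"id": analysis_id, "status": next(_responses)}


done = wait_for_analysis(
    fake_fetch, "497f6eca-6276-4993-bfeb-53cbbbba6f08",
    terminal_statuses={"COMPLETED", "FAILED"},  # assumed names
    sleep=lambda seconds: None,                 # skip real waiting in the demo
)
print(done["status"])
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays; a production client would likely add backoff as well.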

Delete analysis

Permanently removes the analysis and all related data.

path Parameters
id
required
string <uuid>

Responses

Response samples

Content type
application/json
{
  "code": "string",
  "message": "string",
  "details": { }
}

AI

Classify query intent (research vs community)

Uses a Large Language Model (LLM) to decide whether the user's query should be answered with peer-reviewed research articles (arXiv) or community-sourced discussions (Reddit). Returns the suggested source and category along with a confidence score.

Request Body schema: application/json
query
required
string

User's natural language query

Responses

Request samples

Content type
application/json
{
  "query": "string"
}

Response samples

Content type
application/json
{
  "source": "arxiv",
  "source_type": "research",
  "suggested_category": "string",
  "confidence": 1
}
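
One way a client might act on this response is to accept the suggestion only when the confidence is high enough. This is a sketch under assumptions: `choose_source` is a hypothetical helper, the `0.6` threshold and the `"arxiv"` fallback are arbitrary illustrative choices, and the confidence score is assumed to lie between 0 and 1.

```python
KNOWN_SOURCES = ("arxiv", "reddit")  # enum values used throughout this API


def choose_source(classification, min_confidence=0.6,
                  fallback_source="arxiv"):
    """Return (source, category) from a classification response.

    Falls back to fallback_source with no category when the suggested
    source is unknown or the confidence is below min_confidence.
    """
    source = classification.get("source")
    confidence = float(classification.get("confidence", 0.0))
    if source in KNOWN_SOURCES and confidence >= min_confidence:
        return source, classification.get("suggested_category")
    return fallback_source, None


print(choose_source({"source": "reddit", "source_type": "community",
                     "suggested_category": "MachineLearning",
                     "confidence": 0.9}))
```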

Generate embeddings

Generate and cache embeddings for multiple texts

Request Body schema: application/json
texts
required
Array of strings

Texts to generate embeddings for

ids
required
Array of strings

Document IDs for caching

Responses

Request samples

Content type
application/json
{
  "texts": [ ],
  "ids": [ ]
}

Response samples

Content type
application/json
{
  "embeddings": [ ],
  "cached_count": 0,
  "found_count": 0
}
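
For larger corpora it can help to split the work into several requests. The sketch below assumes that `texts` and `ids` are parallel lists (one document ID per text), which the schema suggests but does not state. `embedding_batches` is a hypothetical helper, and `batch_size` is a client-side choice; this section does not document any server-side limit.

```python
def embedding_batches(texts, ids, batch_size=100):
    """Yield request bodies holding at most batch_size texts each."""
    # Assumption: each text is paired with the ID at the same index.
    if len(texts) != len(ids):
        raise ValueError("texts and ids must have the same length")
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(texts), batch_size):
        end = start + batch_size
        yield {"texts": texts[start:end], "ids": ids[start:end]}


bodies = list(embedding_batches(["a", "b", "c", "d", "e"],
                                ["1", "2", "3", "4", "5"], batch_size=2))
print(len(bodies))
```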

Retrieve cached embeddings

Get previously generated embeddings by document IDs

query Parameters
ids
required
Array of strings

Document IDs to retrieve embeddings for

Responses

Response samples

Content type
application/json
{
  "embeddings": [ ],
  "cached_count": 0,
  "found_count": 0
}
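
Since `ids` is an array passed in the query string, the client has to pick a serialization. This section does not say which one the service expects. The sketch below supports two common styles: repeated keys (the OpenAPI default for query arrays unless the specification overrides it) and a single comma-separated value. `embeddings_query` is a hypothetical helper.

```python
from urllib.parse import urlencode


def embeddings_query(ids, style="repeat"):
    """Encode document IDs as a query string for the retrieval call."""
    if not ids:
        raise ValueError("ids is required and must not be empty")
    if style == "repeat":
        # ids=a&ids=b
        return urlencode([("ids", doc_id) for doc_id in ids])
    if style == "comma":
        # ids=a%2Cb (the comma is percent-encoded by urlencode)
        return urlencode({"ids": ",".join(ids)})
    raise ValueError("style must be 'repeat' or 'comma'")


print(embeddings_query(["doc-1", "doc-2"]))
```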

Build source-specific query

Generate optimized query for a specific data source using AI

path Parameters
source
required
string
Enum: "arxiv" "reddit"

Target data source

Request Body schema: application/json
search_terms
required
string

Natural language search terms

filters
object

Source-specific filters

Responses

Request samples

Content type
application/json
{
  "search_terms": "string",
  "filters": { }
}

Response samples

Content type
application/json
{
  "query": "string",
  "description": "string",
  "source": "string"
}

Generate text using a large language model

Takes a prompt and returns text generated by the specified Large Language Model (e.g., Gemini, OpenRouter). Useful for generative tasks such as summarizing topics, creating labels, or answering questions based on provided context. The prompt can contain placeholders such as [DOCUMENTS] and [KEYWORDS], which the caller is expected to replace.

Request Body schema: application/json
prompt
required
string

The prompt to send to the LLM. Can include placeholders like [DOCUMENTS] and [KEYWORDS].

model
string

The model to use for generation. Defaults to the service's configured model.

max_tokens
integer
Default: 256

Maximum number of tokens to generate.

temperature
number <float>
Default: 0.7

Controls randomness. Lower is more deterministic.

Responses

Request samples

Content type
application/json
{
  "prompt": "string",
  "model": "gemini-pro",
  "max_tokens": 256,
  "temperature": 0.7
}

Response samples

Content type
application/json
{
  "text": "string",
  "model": "string",
  "prompt": "string"
}
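
The endpoint description leaves placeholder substitution to the caller. The sketch below shows one way to fill `[DOCUMENTS]` and `[KEYWORDS]` and build the request body, assuming the substitution happens client-side before the request is sent. `fill_prompt` and `build_generation_request` are hypothetical helpers, and rendering documents as a dash-prefixed list and keywords as a comma-separated string is an arbitrary formatting choice.

```python
import re

_PLACEHOLDER = re.compile(r"\[(?:DOCUMENTS|KEYWORDS)\]")


def fill_prompt(template, documents, keywords):
    """Replace [DOCUMENTS] and [KEYWORDS] in a single pass."""
    values = {
        "[DOCUMENTS]": "\n".join(f"- {doc}" for doc in documents),
        "[KEYWORDS]": ", ".join(keywords),
    }
    # A single regex pass avoids re-expanding placeholders that happen
    # to appear inside the inserted document text.
    return _PLACEHOLDER.sub(lambda match: values[match.group(0)], template)


def build_generation_request(prompt, model=None, max_tokens=256,
                             temperature=0.7):
    """Return a request body; model is omitted to use the service default."""
    body = {"prompt": prompt, "max_tokens": max_tokens,
            "temperature": temperature}
    if model is not None:
        body["model"] = model
    return body


template = "Give a short label for a topic about [KEYWORDS].\n[DOCUMENTS]"
prompt = fill_prompt(template,
                     ["Paper on graph attention", "Paper on message passing"],
                     ["graph", "attention"])
print(build_generation_request(prompt))
```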

Articles

Fetch articles

Retrieve articles from research or community sources

Request Body schema: application/json
query
required
string

Search query (source-specific syntax)

limit
integer
Default: 50

Maximum articles to fetch

source
required
string
Enum: "arxiv" "reddit"

Specific source to fetch from

category
string

Source-specific category (e.g., cs.AI or MachineLearning)

filters
object

Additional filters understood by the target source

Responses

Request samples

Content type
application/json
{
  "query": "string",
  "limit": 50,
  "source": "arxiv",
  "category": "string",
  "filters": { }
}

Response samples

Content type
application/json
{
  "articles": [ ],
  "total_found": 0,
  "source": "arxiv"
}

Get source categories

Get available categories for a specific data source

path Parameters
source
required
string
Enum: "arxiv" "reddit"

Data source

Responses

Response samples

Content type
application/json
{
  "property1": [ ],
  "property2": [ ]
}

Topics

Discover topics

Cluster articles into coherent topics using embeddings

Request Body schema: application/json
query
required
string

Original user query

article_ids
required
Array of strings

Article IDs with cached embeddings

articles
required
Array of objects (Article)

Article metadata

min_cluster_size
integer
Default: 2

Minimum articles per topic

nr_topics
integer >= 1
Default: 5

Maximum number of topics to generate.

Responses

Request samples

Content type
application/json
{
  "query": "string",
  "article_ids": [ ],
  "articles": [ ],
  "min_cluster_size": 2,
  "nr_topics": 5
}

Response samples

Content type
application/json
{
  "query": "string",
  "topics": [ ],
  "total_articles_processed": 0
}
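
A sketch of assembling the topic discovery request from already-fetched articles. The Article schema is not shown in this section, so reading each article's identifier from an `"id"` key is an assumption (configurable through `id_key`), and `build_topic_request` is a hypothetical helper. The sketch also assumes embeddings for these IDs were cached beforehand through the embeddings endpoint.

```python
def build_topic_request(query, articles, id_key="id",
                        min_cluster_size=2, nr_topics=5):
    """Return a request body for topic discovery.

    article_ids is derived from the articles so the two lists stay aligned.
    """
    if nr_topics < 1:
        raise ValueError("nr_topics must be an integer >= 1")
    # Assumption: every article dict carries its identifier under id_key.
    article_ids = [article[id_key] for article in articles]
    return {
        "query": query,
        "article_ids": article_ids,
        "articles": articles,
        "min_cluster_size": min_cluster_size,
        "nr_topics": nr_topics,
    }


request_body = build_topic_request(
    "graph neural networks",
    [{"id": "a1", "title": "First paper"},
     {"id": "a2", "title": "Second paper"}],
)
print(request_body["article_ids"])
```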