Highlights API#
MK1 Highlights is our API service that retrieves the most relevant source text for any user query. Built on top of our custom large language model (LLM), Highlights is designed to scan large volumes of text quickly while maintaining near-perfect recall.
API Endpoint#
POST https://api.highlights.mk1.ai/search
Authentication#
Authentication is required for all API calls. You’ll need to include your API key in the request headers:
X-API-Key: YOUR-API-KEY
To obtain an API key, please sign up here.
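As a minimal sketch of the header described above, the helper below builds the request headers from an API key. The function name `build_headers` and the environment variable `HIGHLIGHTS_API_KEY` are hypothetical, chosen only for illustration; the `X-API-Key` header is the one documented here.

```python
import os


def build_headers(api_key: str) -> dict:
    """Return request headers carrying the API key in X-API-Key."""
    if not api_key:
        raise ValueError("An API key is required for all API calls")
    return {
        "X-API-Key": api_key,
        "Content-Type": "application/json",
    }


# HIGHLIGHTS_API_KEY is a hypothetical variable name, not defined by the API.
headers = build_headers(os.environ.get("HIGHLIGHTS_API_KEY", "YOUR-API-KEY"))
print(sorted(headers))
```

Reading the key from the environment keeps it out of source control; any other secret store would work equally well.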
Request Format#
The API accepts POST requests with a JSON body. Chunks can be provided in two formats:
Simple string
{
  "query": "Your Query here",
  "chunk_txts": ["Chunk 1 text here", "Chunk 2 text here", "..."],
  "top_n": 3
}
Dictionary with required “text” field and optional metadata
{
  "query": "Your query here",
  "chunk_txts": [
    {
      "text": "First chunk with metadata",
      "metadata": {
        "source": "document1",
        "page": 1
      }
    },
    "Simple text chunk without metadata",
    {
      "text": "Third chunk with metadata",
      "metadata": {
        "source": "document2",
        "category": "introduction"
      }
    }
  ],
  "top_n": 10,
  "true_order": true
}
Parameters#
Parameter | Type | Required | Description |
---|---|---|---|
query | string | Yes | The natural language query to search with |
chunk_txts | array | Yes | Array of text chunks with optional metadata |
top_n | integer | No | Number of top results to return (default: 10) |
true_order | boolean | No | Maintain original chunk order in results (default: true) |
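The parameters above can be assembled into a request body as in the following sketch. The helper name `build_payload` is hypothetical, and the client-side checks are assumptions made for illustration; they are not documented server behavior. The defaults mirror those listed in the table.

```python
import json
from typing import Any, Dict, List, Union

Chunk = Union[str, Dict[str, Any]]


def build_payload(
    query: str,
    chunks: List[Chunk],
    top_n: int = 10,
    true_order: bool = True,
) -> Dict[str, Any]:
    """Assemble the JSON body using the documented parameter names."""
    # These checks are illustrative assumptions, not the API's own rules.
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    if not isinstance(chunks, list) or not chunks:
        raise ValueError("chunks must be a non-empty list")
    if isinstance(top_n, bool) or not isinstance(top_n, int) or top_n < 1:
        raise ValueError("top_n must be a positive integer")
    return {
        "query": query,
        "chunk_txts": chunks,
        "top_n": top_n,
        "true_order": bool(true_order),
    }


payload = build_payload("What is machine learning?", ["chunk one", "chunk two"], top_n=1)
print(json.dumps(payload, indent=2))
```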
Chunk Format Options#
Each chunk in the chunk_txts array can be either a simple string or a dictionary with the following fields:
Field | Type | Required | Description |
---|---|---|---|
text | string | Yes | The content of the text chunk |
metadata | object | No | Additional metadata associated with the chunk |
original_index | integer | No | Original position of chunk in input array (auto-set) |
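To illustrate the two accepted chunk forms, the sketch below converts either one into a dictionary with a "text" field and a "metadata" field. The function `normalize_chunk` is a hypothetical client-side helper, not part of the API, and the rejection of empty text is an assumption based on the guidance that empty chunks cause errors.

```python
from typing import Any, Dict, Union


def normalize_chunk(chunk: Union[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return a dict with 'text' and 'metadata' for either chunk form."""
    if isinstance(chunk, str):
        text, metadata = chunk, {}
    elif isinstance(chunk, dict):
        if "text" not in chunk:
            raise ValueError('dictionary chunks require a "text" field')
        text = chunk["text"]
        metadata = chunk.get("metadata", {})
    else:
        raise TypeError("a chunk must be a string or a dictionary")
    # Assumption: reject empty text locally instead of letting the request fail.
    if not isinstance(text, str) or not text.strip():
        raise ValueError("chunk text must be a non-empty string")
    return {"text": text, "metadata": metadata}


mixed = [
    {"text": "First chunk with metadata", "metadata": {"source": "document1", "page": 1}},
    "Simple text chunk without metadata",
]
print([normalize_chunk(c) for c in mixed])
```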
Response Format#
The API returns a JSON response with relevant text chunks and their metadata:
{
  "results": [
    {
      "chunk_id": 0,
      "chunk_txt": "The relevant text chunk...",
      "chunk_score": 136.13050842285156,
      "metadata": {
        "source": "document1",
        "page": 1
      },
      "original_index": 0
    }
  ],
  "metadata": {
    "num_query_tokens": 5,
    "num_context_tokens": 44
  }
}
Response Fields#
Field | Type | Description |
---|---|---|
results | array | Array of relevant text chunks, ordered by relevance |
chunk_id | integer | Index of the chunk in results array |
chunk_txt | string | The content of the relevant text chunk |
chunk_score | float | Relevance score of the chunk to the query |
metadata | object | Additional metadata associated with the chunk |
original_index | integer | Original position of chunk in input array |
metadata | object | Additional information about the request |
num_query_tokens | integer | Number of tokens in the query |
num_context_tokens | integer | Total number of tokens in the provided text chunks |
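The response fields can be consumed as in the sketch below, which runs against the sample response shown earlier rather than a live call. The helper names `summarize_results` and `total_tokens` are hypothetical, and the "source" metadata key is only present if it was supplied in the request.

```python
from typing import Any, Dict, List


def summarize_results(response: Dict[str, Any]) -> List[str]:
    """Format each result as one line: original index, score, source, text."""
    lines = []
    for result in response.get("results", []):
        # Fall back to "unknown" when the request carried no source metadata.
        source = result.get("metadata", {}).get("source", "unknown")
        lines.append(
            f"[{result['original_index']}] "
            f"score={result['chunk_score']:.2f} "
            f"source={source}: {result['chunk_txt']}"
        )
    return lines


def total_tokens(response: Dict[str, Any]) -> int:
    """Sum the two token counts reported in the top-level metadata."""
    meta = response.get("metadata", {})
    return meta.get("num_query_tokens", 0) + meta.get("num_context_tokens", 0)


sample_response = {
    "results": [
        {
            "chunk_id": 0,
            "chunk_txt": "The relevant text chunk...",
            "chunk_score": 136.13050842285156,
            "metadata": {"source": "document1", "page": 1},
            "original_index": 0,
        }
    ],
    "metadata": {"num_query_tokens": 5, "num_context_tokens": 44},
}

for line in summarize_results(sample_response):
    print(line)
print("total tokens:", total_tokens(sample_response))
```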
Best Practices#
The following guidance helps you get the most out of the Highlights API.
Input Preparation#
Text Chunking#
Keep chunk sizes between 512 and 10,000 characters for optimal performance
Ensure no empty chunks are included, as these will cause errors
Maintain consistent chunk sizes across your dataset
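A pre-flight check along these lines can be sketched as follows. The helper `prepare_chunks` is hypothetical; it drops empty chunks and reports which remaining chunks fall outside the recommended 512 to 10,000 character range, leaving the decision to re-chunk to the caller.

```python
from typing import List, Tuple

MIN_CHARS = 512
MAX_CHARS = 10_000


def prepare_chunks(chunks: List[str]) -> Tuple[List[str], List[int]]:
    """Drop empty chunks; return the cleaned list and out-of-range indices.

    The indices refer to positions in the cleaned list, not the input list.
    """
    cleaned = [c for c in chunks if c.strip()]
    out_of_range = [
        i for i, c in enumerate(cleaned)
        if not MIN_CHARS <= len(c) <= MAX_CHARS
    ]
    return cleaned, out_of_range


cleaned, flagged = prepare_chunks(["", "   ", "a" * 600, "short"])
print(len(cleaned), flagged)
```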
Semantic Chunking#
Split text at natural semantic boundaries like paragraphs and sections
Keep related concepts together within chunks
Include enough context in each chunk for it to be meaningful on its own
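One simple way to follow these guidelines is to split on blank lines and merge neighboring paragraphs until a size limit is reached. The sketch below is an illustrative approach, not the method used by the Highlights service; the function name `chunk_by_paragraph` and the blank-line paragraph convention are assumptions.

```python
import re
from typing import List


def chunk_by_paragraph(text: str, max_chars: int = 10_000) -> List[str]:
    """Greedily merge consecutive paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole rather than
    split mid-paragraph.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: List[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # Adding this paragraph would exceed the limit: close the chunk.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


doc = "A" * 30 + "\n\n" + "B" * 30 + "\n\n" + "C" * 30
print([len(c) for c in chunk_by_paragraph(doc, max_chars=70)])
```

Splitting only at paragraph boundaries keeps related sentences together, at the cost of less uniform chunk sizes.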
Query Construction#
Writing Effective Queries#
Frame queries as clear, specific questions
Use natural language rather than just keywords
Include key terms that match your target content
Query Structure#
Balance brevity with necessary detail
Avoid overly long queries that could dilute relevance
Ensure queries contain enough information to be meaningful
Python Client#
For easier integration, you can use our Python client:
from typing import Dict, List, Optional, Union

import requests


class HighlightsClient:
    def __init__(self, api_key: str, base_url: str = "https://api.highlights.mk1.ai"):
        self.api_key = api_key
        self.base_url = base_url
        self.headers = {
            "X-API-Key": api_key,
            "Content-Type": "application/json"
        }

    def search(
        self,
        query: str,
        text_chunks: List[Union[str, Dict]],
        top_n: Optional[int] = 3,
        true_order: bool = True
    ) -> dict:
        """
        Search through text chunks to find relevant passages.

        Args:
            query: The search query
            text_chunks: List of text passages to search through (strings,
                or dicts with a "text" field and optional "metadata")
            top_n: Number of top results to return
            true_order: Maintain original chunk order in results

        Returns:
            Dictionary containing search results and metadata
        """
        endpoint = f"{self.base_url}/search"
        # The API expects the chunks under the "chunk_txts" key
        payload = {
            "query": query,
            "chunk_txts": text_chunks,
            "top_n": top_n,
            "true_order": true_order
        }
        response = requests.post(endpoint, headers=self.headers, json=payload)
        response.raise_for_status()
        return response.json()
Example Usage#
Python Client Example#
import json

# Assumes the HighlightsClient class above is saved as highlights_client.py
from highlights_client import HighlightsClient

# Initialize the client
client = HighlightsClient(api_key="your-api-key")

# Sample text chunks
text_chunks = [
    "Machine learning models can process vast amounts of data quickly.",
    "Natural language processing helps computers understand human language.",
    "Deep learning is a subset of machine learning based on neural networks.",
    "Data science combines statistics, programming, and domain expertise."
]

# Perform the search
results = client.search(
    query="What is machine learning?",
    text_chunks=text_chunks,
    top_n=2
)

# Display results
print("Search Results:")
print(json.dumps(results, indent=2))

# Output:
# Search Results:
# {
#   "results": [
#     {
#       "chunk_id": 0,
#       "chunk_txt": "Machine learning models can process vast amounts of data quickly.",
#       "chunk_score": 136.13050842285156,
#       "metadata": {},
#       "original_index": 0
#     },
#     {
#       "chunk_id": 2,
#       "chunk_txt": "Deep learning is a subset of machine learning based on neural networks.",
#       "chunk_score": 119.29242706298828,
#       "metadata": {},
#       "original_index": 2
#     }
#   ],
#   "metadata": {
#     "num_query_tokens": 5,
#     "num_context_tokens": 44
#   }
# }
cURL Example#
curl -X POST "https://api.highlights.mk1.ai/search" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR-API-KEY" \
  -d '{
    "query": "What is machine learning?",
    "chunk_txts": [
      "Machine learning models can process vast amounts of data quickly.",
      "Natural language processing helps computers understand human language.",
      "Deep learning is a subset of machine learning based on neural networks.",
      "Data science combines statistics, programming, and domain expertise."
    ],
    "top_n": 2,
    "true_order": true
  }'
Error Codes#
Status Code | Description |
---|---|
200 | Success |
400 | Bad Request - Invalid parameters |
401 | Unauthorized - Authentication failed |
429 | Too Many Requests - Rate limit exceeded |
500 | Internal Server Error |
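One way a caller might act on these status codes is sketched below. Treating 429 and 500 as retryable with exponential backoff is an assumption, not documented API behavior, and `call_with_retry` is a hypothetical helper. It takes a `send` callable that returns a status code and a parsed body, so the retry logic stays independent of any particular HTTP library.

```python
import time
from typing import Callable, Tuple

# Assumption: rate limits and server errors are worth retrying; 400 and 401 are not.
RETRYABLE = {429, 500}


def call_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send() until it returns 200, retrying retryable statuses."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        if status not in RETRYABLE or attempt == max_attempts:
            raise RuntimeError(f"Request failed with status {status}")
        # Exponential backoff: base_delay, then 2x, then 4x, and so on.
        sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("retry loop exited unexpectedly")


# Simulated responses: one rate-limit error followed by a success.
responses = iter([(429, {}), (200, {"results": []})])
delays = []
result = call_with_retry(lambda: next(responses), sleep=delays.append)
print(result, delays)
```

With `requests`, `send` could wrap a `requests.post` call and return `(response.status_code, response.json())` on success.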