Semantic Search Redirector

Maps messy user queries to precise catalog items using TF-IDF vector similarity with cosine scoring. Supports synonym expansion, fuzzy matching, and bigram analysis.

Endpoint: POST /api/semantic-search
Latency: <50ms
Authentication: API Key

What makes this API special?

Unlike traditional keyword search, this API understands semantic meaning through TF-IDF vectorization and cosine similarity scoring. It handles typos via Levenshtein distance, expands synonyms, and weights title matches higher—delivering precise results even from messy, natural-language queries.

Code Examples

curl -X POST "https://api.atomicapis.dev/api/semantic-search" \
  -H "X-RapidAPI-Proxy-Secret: YOUR_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "wireless noise cancelling headphones",
    "items": [
      {
        "id": "prod_001",
        "text": "Industry-leading noise cancellation with 30-hour battery life",
        "title": "Sony WH-1000XM5 Wireless Headphones",
        "category": "audio"
      },
      {
        "id": "prod_002",
        "text": "Premium noise cancelling headphones with comfortable fit",
        "title": "Bose QuietComfort 45",
        "category": "audio"
      },
      {
        "id": "prod_003",
        "text": "Active noise cancellation with spatial audio",
        "title": "Apple AirPods Pro 2",
        "category": "audio"
      }
    ],
    "topK": 10,
    "minScore": 0.01,
    "fuzzyMatch": true,
    "synonymExpansion": true,
    "titleWeight": 2.0
  }'

How It Works

The Semantic Search Redirector API transforms imprecise user queries into accurate catalog matches using a multi-layered approach to text understanding. It combines classical information retrieval techniques with modern NLP concepts—without requiring any ML model or stored state.

TF-IDF Vectorization

Converts text into weighted vectors based on term frequency and inverse document frequency, highlighting unique, meaningful words.
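
The weighting described above can be sketched in a few lines of Python. The API's exact formula is not documented here, so this is an illustration under assumptions: the `tokenize` and `tfidf_vectors` helpers are hypothetical names, and the smoothed IDF `log((1 + N) / (1 + df)) + 1` is one common variant, not necessarily the one the service uses.

```python
import math
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and split it on anything that is not a letter or digit."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def tfidf_vectors(docs: list) -> list:
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Term frequency times smoothed inverse document frequency (assumed variant).
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


# Item texts taken from the request example above.
docs = [
    "Industry-leading noise cancellation with 30-hour battery life",
    "Premium noise cancelling headphones with comfortable fit",
    "Active noise cancellation with spatial audio",
]
vectors = tfidf_vectors(docs)
```

In this sketch "noise" appears in every item, so it receives a lower weight than "premium", which appears in only one.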

Cosine Similarity

Scores relevance as the cosine of the angle between the query vector and each document vector. Because only the direction of the vectors matters, the score does not depend on text length.
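
As a minimal sketch of that scoring step, the function below computes cosine similarity over sparse term-to-weight dictionaries. The function name and the toy vectors are illustrative assumptions, not part of the API.

```python
import math


def cosine_similarity(a: dict, b: dict) -> float:
    """Cosine of the angle between two sparse vectors stored as term -> weight."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # An empty vector has no direction, so treat it as no match.
    return dot / (norm_a * norm_b)


query = {"noise": 1.0, "headphones": 1.0}
short_doc = {"noise": 1.0, "headphones": 1.0}
long_doc = {"noise": 3.0, "headphones": 3.0}  # Same direction, three times the magnitude.
unrelated = {"battery": 1.0}
```

Here `short_doc` and `long_doc` score the same against `query` because they point in the same direction, while `unrelated` shares no terms and scores zero.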

Fuzzy Matching

Uses Levenshtein distance to handle typos and spelling variations, ensuring matches even with imperfect input.
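
For illustration, here is a standard dynamic-programming Levenshtein distance with a small helper that maps a misspelled term to the closest known term. The `fuzzy_match` helper and its `max_distance` threshold are assumptions made for this sketch; the API's actual threshold is not documented here.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # Keep the shorter string in the inner loop to save memory.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def fuzzy_match(term: str, vocabulary: list, max_distance: int = 1) -> Optional[str]:
    """Return the closest vocabulary term within max_distance edits, or None."""
    best = None
    best_distance = max_distance + 1
    for candidate in vocabulary:
        distance = levenshtein(term, candidate)
        if distance < best_distance:
            best, best_distance = candidate, distance
    return best
```

With this helper, a query term such as "headphnes" would resolve to "headphones" when that word is in the catalog vocabulary.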

Bigram Analysis

Analyzes word pairs to capture multi-word concepts and phrases, improving understanding of compound terms.
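
A rough sketch of bigram extraction follows. Joining the pair with an underscore and appending bigrams to the unigram list are assumptions made for this example; the API does not document how it represents word pairs internally.

```python
def bigrams(tokens: list) -> list:
    """Return adjacent word pairs, joined with an underscore."""
    return [f"{first}_{second}" for first, second in zip(tokens, tokens[1:])]


def terms_with_bigrams(text: str) -> list:
    """Unigrams followed by bigrams, so phrases can be matched as single terms."""
    tokens = text.lower().split()
    return tokens + bigrams(tokens)


terms = terms_with_bigrams("wireless noise cancelling headphones")
```

Treating "noise_cancelling" as its own term lets an item that contains the exact phrase outrank one that only mentions the two words separately.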

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | The user's search query. Can be messy, contain typos, or use natural language. |
| items | array | Yes | Array of items to search against. Each item must have id and text, with optional title, category, and metadata. |
| items[].id | string | Yes | Unique identifier for the item. |
| items[].text | string | Yes | Text content for similarity matching. |
| items[].title | string | No | Title of the item. Weighted by titleWeight in scoring. |
| items[].category | string | No | Category for the item. |
| items[].metadata | object | No | Key-value pairs of additional metadata. Passed through to results unchanged. |
| topK | number | No | Maximum number of results to return. Default: 10 |
| minScore | number | No | Minimum similarity score for results. Default: 0.01 |
| fuzzyMatch | boolean | No | Enable fuzzy matching for typo tolerance. Default: true |
| synonymExpansion | boolean | No | Enable synonym expansion. Default: true |
| titleWeight | number | No | Weight multiplier for title matches. Default: 2.0 |
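
The same request can be assembled from Python with only the standard library. This is a sketch under assumptions: the URL and header name are copied from the curl example, while `build_request` and its client-side validation are hypothetical helpers, not part of any official SDK.

```python
import json
import urllib.request

# URL and header name are taken from the curl example; adjust them if your setup differs.
API_URL = "https://api.atomicapis.dev/api/semantic-search"


def build_request(query: str, items: list, api_secret: str, **options) -> urllib.request.Request:
    """Build (but do not send) a POST request for the search endpoint.

    Optional parameters such as topK, minScore, fuzzyMatch, synonymExpansion,
    and titleWeight are passed through as keyword arguments.
    """
    for item in items:
        # The parameter table marks id and text as required on every item.
        if "id" not in item or "text" not in item:
            raise ValueError("each item needs 'id' and 'text'")
    payload = {"query": query, "items": items, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-RapidAPI-Proxy-Secret": api_secret,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "wireless noise cancelling headphones",
    [{"id": "prod_002", "text": "Premium noise cancelling headphones with comfortable fit"}],
    api_secret="YOUR_SECRET",
    topK=5,
)
```

To send it, pass `request` to `urllib.request.urlopen` and decode the JSON body of the response.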

Response Format

200 OK - Successful Response
{
  "query": "wireless noise cancelling headphones",
  "normalizedQuery": "wireless noise cancelling headphones",
  "totalItems": 3,
  "matchedItems": 3,
  "results": [
    {
      "id": "prod_001",
      "score": 0.89,
      "title": "Sony WH-1000XM5 Wireless Headphones",
      "category": "audio",
      "matchedTerms": ["wireless", "noise", "headphones"],
      "metadata": null
    },
    {
      "id": "prod_002",
      "score": 0.76,
      "title": "Bose QuietComfort 45",
      "category": "audio",
      "matchedTerms": ["noise", "cancelling", "headphones"],
      "metadata": null
    },
    {
      "id": "prod_003",
      "score": 0.62,
      "title": "Apple AirPods Pro 2",
      "category": "audio",
      "matchedTerms": ["noise", "cancelling"],
      "metadata": null
    }
  ],
  "searchDurationMs": 12.34
}

Response Fields

| Field | Type | Description |
|---|---|---|
| query | string | The original query string. |
| normalizedQuery | string | The query after normalization (lowercasing and stop-word removal). |
| totalItems | number | Total number of items searched. |
| matchedItems | number | Number of items that matched above minScore. |
| results | array | Array of matched items sorted by relevance score (highest first). |
| results[].id | string | ID of the matched item. |
| results[].score | number | Similarity score. Higher is more relevant. |
| results[].title | string | Title of the matched item (if provided in the request). |
| results[].category | string | Category of the matched item (if provided in the request). |
| results[].matchedTerms | array | Which terms contributed to the match. |
| results[].metadata | object | Key-value metadata passed through from the matched catalog item. |
| searchDurationMs | number | Search processing time in milliseconds. |
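
One way to consume this response in Python is sketched below. The `SearchResult` dataclass and `parse_results` helper are hypothetical names for this example, and the sample body is a trimmed copy of the response shown above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchResult:
    id: str
    score: float
    title: Optional[str]
    category: Optional[str]
    matched_terms: list
    metadata: Optional[dict]


def parse_results(body: str, min_score: float = 0.0) -> list:
    """Parse a response body and keep results scoring at least min_score."""
    data = json.loads(body)
    results = [
        SearchResult(
            id=entry["id"],
            score=entry["score"],
            title=entry.get("title"),        # Only present if sent in the request.
            category=entry.get("category"),  # Only present if sent in the request.
            matched_terms=entry.get("matchedTerms", []),
            metadata=entry.get("metadata"),
        )
        for entry in data["results"]
    ]
    return [result for result in results if result.score >= min_score]


# Trimmed copy of the example response above.
sample_body = json.dumps({
    "query": "wireless noise cancelling headphones",
    "results": [
        {"id": "prod_001", "score": 0.89, "matchedTerms": ["wireless", "noise", "headphones"], "metadata": None},
        {"id": "prod_002", "score": 0.76, "matchedTerms": ["noise", "cancelling", "headphones"], "metadata": None},
        {"id": "prod_003", "score": 0.62, "matchedTerms": ["noise", "cancelling"], "metadata": None},
    ],
})
strong_matches = parse_results(sample_body, min_score=0.7)
```

Filtering on the client like this is an alternative to raising minScore in the request when the same response feeds several views with different thresholds.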

Use Cases

E-commerce Search

Power product search that understands customer intent even with typos, slang, or vague descriptions like "comfy shoes for running."

Tags: Retail, Marketplace

Knowledge Base

Enable natural language search across documentation, FAQs, and support articles without maintaining complex search indexes.

Tags: Docs, Support

Product Discovery

Help users find products through conversational queries, feature-based searches, and semantic recommendations.

Tags: Discovery, Recommendations

Build Constraints

Performance

  • Sub-50ms response time for typical catalogs (<1000 items)
  • Pure computation—no database queries or external calls
  • Linear time complexity O(n) relative to catalog size

Architecture

  • 100% stateless—no data persistence between requests
  • No ML model required—uses classical NLP algorithms
  • User provides catalog with each request

Maximum catalog size: 10,000 items per request. For larger catalogs, consider pre-filtering the items before sending them (for example by category) or splitting the catalog across several requests.
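
Splitting a large catalog to stay under the limit could look like the sketch below. The `batches` helper is a hypothetical client-side utility. One caveat, stated as an assumption: because the service is stateless, term weights are presumably computed from the items in each request, so scores from different batches may not be directly comparable.

```python
MAX_ITEMS_PER_REQUEST = 10_000  # Limit stated in this document.


def batches(items: list, size: int = MAX_ITEMS_PER_REQUEST):
    """Yield consecutive slices of items, each holding at most size entries."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


catalog = [{"id": f"prod_{n}", "text": f"item {n}"} for n in range(25)]
chunk_sizes = [len(chunk) for chunk in batches(catalog, size=10)]
```

Each chunk can then be sent as the items array of its own request, with the top results of each merged afterwards.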

Error Codes

| Code | Status | Description |
|---|---|---|
| 400 | Bad Request | Invalid request body. Check that query and items are provided and properly formatted. |
| 401 | Unauthorized | Missing or invalid API key. Include a valid key in the request header (the curl example uses X-RapidAPI-Proxy-Secret). |
| 413 | Payload Too Large | Catalog exceeds maximum size limit (10,000 items). |
| 429 | Too Many Requests | Rate limit exceeded. Wait before making additional requests. |
| 500 | Server Error | Internal server error. Contact support if the issue persists. |
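
A client might handle these codes with a small retry policy, sketched below. Which codes to retry and the delay values are assumptions for illustration; the API does not publish a recommended backoff schedule.

```python
# Assumption: only rate limiting (429) and server errors (500) are worth retrying.
# 400, 401, and 413 indicate a problem with the request itself.
RETRYABLE_STATUS = {429, 500}


def should_retry(status: int) -> bool:
    """Return True when repeating the same request might succeed."""
    return status in RETRYABLE_STATUS


def backoff_delays(max_retries: int = 3, base: float = 0.5, cap: float = 8.0) -> list:
    """Exponential backoff delays in seconds, capped at cap."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


delays = backoff_delays(max_retries=4)
```

A caller would sleep for each delay in turn between attempts and stop as soon as `should_retry` returns False or a request succeeds.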

Ready to get started?

Start building with the Semantic Search Redirector API today. Free tier includes 1,000 requests per month.

MCP Integration (MCP Ready)

What is MCP?

Model Context Protocol (MCP) allows AI assistants like Claude to call this API as a native tool during conversation. Instead of writing HTTP requests, the AI invokes the tool directly — no API keys or boilerplate needed on the client side.

Tool Details

Tool Class: SemanticSearchTools
Method: SemanticSearch()

Description

Matches a query against a catalog of items using TF-IDF similarity.