Building Enterprise Search with Amazon Kendra - Natural Language Queries and FAQ Auto-Extraction

Build an enterprise search platform that lets you search internal documents using natural language. This article covers data source connector configuration, search accuracy tuning, and RAG integration.

Overview of Kendra

Kendra is an ML-powered enterprise search service. Unlike traditional keyword search, when you ask a natural language question like "How do I apply for paid leave?", it extracts the relevant section from internal policy documents and provides a direct answer. Search results highlight relevant passages within documents. It connects to internal systems through over 40 data source connectors, supports natural language queries in 14 languages, and can also serve as a retriever for RAG.

Data Sources and Accuracy Tuning

Data source connectors link to internal systems such as S3, SharePoint, Confluence, and ServiceNow, periodically crawling and indexing content. ACL-aware connectors filter search results based on user access permissions. For search accuracy tuning, registering custom synonym dictionaries, applying relevance boosting (weighting specific fields), and leveraging user feedback are effective approaches.

RAG and Generative AI Integration

Kendra serves as a retriever for RAG (Retrieval-Augmented Generation), providing high-accuracy search results to generative AI applications. By combining Amazon Bedrock foundation models with Kendra, you can build chatbots that generate accurate answers based on internal documents. The Kendra Retrieve API fetches relevant document excerpts and passes them as context in the prompt to the foundation model. ACL-based access control ensures only results matching the user's permissions are returned, preventing confidential information leakage. Custom document enrichment lets you run Lambda functions to add metadata or preprocess text before indexing. For understanding Kendra's model design, related books (Amazon) can be a useful reference.

Kendra Pricing and Optimization

Kendra pricing depends on the index edition (Developer or Enterprise), connector sync frequency, and document count. The Developer edition costs approximately $810 per month and supports up to 10,000 documents and 4,000 queries per day. The Enterprise edition costs approximately $1,008 per month and supports up to 100,000 documents and 8,000 queries per day. You can scale with additional document storage and query capacity units. Optimize connector sync schedules to match data update frequency and avoid unnecessary re-indexing. Leveraging FAQ data sources to directly answer frequently asked questions improves both search accuracy and user experience.

Summary

Kendra is an ML-based enterprise search service that provides natural language question answering and serves as a retriever for RAG. It connects to internal systems through over 40 data source connectors and returns search results filtered by user permissions through ACL-based access control. Custom document enrichment and FAQ data sources further improve search accuracy.