Amazon Kendra – How-to guide

At re:Invent 2019, Amazon announced Amazon Kendra[1], a new AI service[2]. As of today, AWS has 13 AI services, with Amazon Kendra being one of the latest additions. So what is Amazon Kendra?

Amazon Kendra is a highly accurate and easy to use enterprise search service that’s powered by machine learning. Kendra delivers powerful natural language search capabilities to your websites and applications so your end users can more easily find the information they need within the vast amount of content spread across your company.

In this post, we are going to look at the main benefits of Kendra and walk through a hands-on example of how to set it up.

  • AWS Level: 100
  • Total Cost: $0. As of this writing, Amazon Kendra has a free tier -> https://aws.amazon.com/kendra/pricing/
    • WARNING: After the free tier, it costs $7 per hour! So be careful!

Amazon Kendra

With Kendra, you can quickly build a search engine across multiple sources and distribute the results to applications, websites, search bars, chatbots, and more. There are two main “domains” in Kendra: user interaction and under-the-hood functionality.

  • Amazon Kendra users can express queries in a variety of formats[3].
    • Factoid questions: who, what, when, or where questions such as “Who is Amazon’s CEO?” or “What is the height of the Space Needle?” These require fact-based answers that can be returned as a single word or phrase. The precise answer, however, must be explicitly stated in the ingested text content.
    • Descriptive questions: questions where the answer could be a sentence, a passage, or an entire document. For example, “How do I connect my Echo Plus to my network?” or “How do I obtain tax benefits for lower-income families?”
    • Keyword searches: for questions where the intent and scope aren’t clear, Amazon Kendra uses its deep learning models to return relevant documents. For example, “vacation policy” or “health benefits”.
  • Under the hood, Kendra uses machine learning to understand natural language and answer the questions. The user also has the option to fine-tune the relevance of the search results.
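To make the three query styles a bit more concrete, here is a minimal sketch of how the results could be inspected from code. It assumes a boto3 Kendra client is passed in as `kendra` and that an index already exists; the helper names `group_results_by_type` and `ask` are mine, not part of any AWS library. As far as I know, each result item carries a `Type` such as ANSWER, QUESTION_ANSWER, or DOCUMENT, so grouping by that field gives a rough idea of whether Kendra answered directly or just returned matching documents.

```python
from collections import defaultdict


def group_results_by_type(response):
    """Group the excerpts of a Kendra Query response by result type."""
    grouped = defaultdict(list)
    for item in response.get("ResultItems", []):
        # The excerpt text is nested under DocumentExcerpt -> Text.
        excerpt = item.get("DocumentExcerpt", {}).get("Text", "")
        grouped[item.get("Type", "UNKNOWN")].append(excerpt)
    return dict(grouped)


def ask(kendra, index_id, question):
    """Send one query to the index and return its excerpts grouped by type."""
    response = kendra.query(IndexId=index_id, QueryText=question)
    return group_results_by_type(response)


# Usage (needs boto3, AWS credentials, and a real index ID):
#   import boto3
#   kendra = boto3.client("kendra")
#   for question in ["Who is Amazon's CEO?", "vacation policy"]:
#       print(question, ask(kendra, "<your-index-id>", question))
```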

 

Setting Up Kendra

The setup of Kendra is pretty straightforward. The only things you need to have are the following:

  1. Some data for searching (I downloaded 1000 articles from Wikipedia)
  2. An AWS account with an S3 bucket (to save the files)
  3. The account should have access to Amazon Kendra

So follow these steps, and you will be fine (assuming you already uploaded some data in an S3 bucket):

  • Set up an index in Kendra by clicking “Launch Amazon Kendra -> Create Index.” Give it a name and a description, and create a new role. Creating the index takes about 30 minutes, so go grab a beverage while you wait.

  • When the index is created, add a data source. Click on “Step 2. Add data sources”. Under Amazon S3, click on “Add Connector.” Give the data source a name and a description, and click Next. Enter the name of the bucket and the prefix within the bucket where the data is stored (in my case, the wiki/ prefix). Create a new role and, at the bottom, choose to refresh on demand. Review your options and click “Create.”

 

  • Sync the data on S3 with the index. On the data source’s page (the one from the previous step), click the “Sync Now” button.

  • That’s it. Now we have a functioning search engine!
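I did everything through the console, but the same three steps (create the index, add the S3 data source, sync) could probably be scripted too. The following is a rough sketch, not a tested deployment script. It assumes a boto3 Kendra client is passed in as `kendra`, and that the IAM roles already exist (the console creates them for you; here you would have to supply their ARNs yourself). The function names and the `wait_for` helper are my own.

```python
import time


def wait_for(check, poll_seconds=30, sleep=time.sleep):
    """Call check() until it returns something other than None."""
    while True:
        result = check()
        if result is not None:
            return result
        sleep(poll_seconds)


def create_index(kendra, name, role_arn, description="", **wait_kwargs):
    """Create an index and block until it is ACTIVE (this can take a while)."""
    params = {"Name": name, "RoleArn": role_arn}
    if description:
        params["Description"] = description
    index_id = kendra.create_index(**params)["Id"]

    def check():
        status = kendra.describe_index(Id=index_id)["Status"]
        if status == "FAILED":
            raise RuntimeError(f"Index {index_id} failed to create")
        return index_id if status == "ACTIVE" else None

    return wait_for(check, **wait_kwargs)


def add_s3_data_source(kendra, index_id, name, bucket, prefix, role_arn,
                       **wait_kwargs):
    """Attach an S3 prefix to the index as an on-demand data source."""
    data_source_id = kendra.create_data_source(
        IndexId=index_id,
        Name=name,
        Type="S3",
        RoleArn=role_arn,
        Configuration={
            "S3Configuration": {
                "BucketName": bucket,
                "InclusionPrefixes": [prefix],
            }
        },
        # No Schedule is passed, so the source only syncs when asked to.
    )["Id"]

    def check():
        status = kendra.describe_data_source(
            Id=data_source_id, IndexId=index_id)["Status"]
        if status == "FAILED":
            raise RuntimeError(f"Data source {data_source_id} failed")
        return data_source_id if status == "ACTIVE" else None

    return wait_for(check, **wait_kwargs)


def sync_now(kendra, index_id, data_source_id, **wait_kwargs):
    """Start a sync job and return its final status, e.g. SUCCEEDED."""
    execution_id = kendra.start_data_source_sync_job(
        Id=data_source_id, IndexId=index_id)["ExecutionId"]
    in_progress = {"SYNCING", "SYNCING_INDEXING", "STOPPING"}

    def check():
        history = kendra.list_data_source_sync_jobs(
            Id=data_source_id, IndexId=index_id)["History"]
        for job in history:
            if (job.get("ExecutionId") == execution_id
                    and job["Status"] not in in_progress):
                return job["Status"]
        return None

    return wait_for(check, **wait_kwargs)


# Usage (needs boto3 and AWS credentials; ARNs and names are placeholders):
#   import boto3
#   kendra = boto3.client("kendra")
#   index_id = create_index(kendra, "wiki-index", "<index-role-arn>")
#   ds_id = add_s3_data_source(kendra, index_id, "wiki-s3",
#                              "<your-bucket>", "wiki/", "<source-role-arn>")
#   print(sync_now(kendra, index_id, ds_id))
```

Passing the client in as an argument keeps the functions easy to try against a stub before pointing them at a real (and billable) index.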

 

Query Data

Once the sync is done, go to the index’s page and select “Step 3. Search Console.” Here we will test the index against the data source we just added.

The query we will run is the following:

what is Maine Public Broadcasting Network?

I know that I have a document from Wikipedia that references the Public Broadcasting Network, so we will ask Kendra and see if it can find that document. Keep in mind that Kendra learns from your feedback, so it is worth rating the good results; Kendra uses incremental learning on that feedback to improve the engine.
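The Search Console is the quickest way to try this, but the same query and the same thumbs-up feedback can also be sent through the API. Here is a small sketch under a few assumptions: `kendra` is a boto3 Kendra client passed in by the caller, the index ID is a placeholder, and the helper names `search` and `mark_relevant` are mine.

```python
def search(kendra, index_id, question, top=3):
    """Run a query and return the query ID plus the first few hits."""
    response = kendra.query(IndexId=index_id, QueryText=question)
    hits = []
    for item in response.get("ResultItems", [])[:top]:
        hits.append({
            "id": item["Id"],
            "type": item.get("Type"),
            "title": item.get("DocumentTitle", {}).get("Text", ""),
            "excerpt": item.get("DocumentExcerpt", {}).get("Text", ""),
        })
    return response["QueryId"], hits


def mark_relevant(kendra, index_id, query_id, result_id):
    """Tell Kendra that one result of a previous query was a good one."""
    kendra.submit_feedback(
        IndexId=index_id,
        QueryId=query_id,
        RelevanceFeedbackItems=[
            {"ResultId": result_id, "RelevanceValue": "RELEVANT"},
        ],
    )


# Usage (needs boto3, AWS credentials, and a real index ID):
#   import boto3
#   kendra = boto3.client("kendra")
#   query_id, hits = search(kendra, "<your-index-id>",
#                           "what is Maine Public Broadcasting Network?")
#   for hit in hits:
#       print(hit["type"], hit["title"], hit["excerpt"][:80])
#   if hits:
#       mark_relevant(kendra, "<your-index-id>", query_id, hits[0]["id"])
```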

Clean up

To avoid unnecessary charges, you will have to delete the following:

  • The index from Kendra
  • The files from the S3 bucket
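If you prefer to clean up from a script, something along these lines should work. It is only a sketch: `kendra` and `s3` are assumed to be boto3 clients passed in by the caller, the bucket name and index ID are placeholders, and `delete_index` and `empty_prefix` are my own helper names. Be careful with the prefix, since every object under it is deleted.

```python
def delete_index(kendra, index_id):
    """Delete the Kendra index so it stops accruing hourly charges."""
    kendra.delete_index(Id=index_id)


def empty_prefix(s3, bucket, prefix):
    """Delete every object under a prefix and return how many were removed."""
    deleted = 0
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3.list_objects_v2(**kwargs)
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            # One page holds at most 1000 keys, the limit for delete_objects.
            s3.delete_objects(Bucket=bucket, Delete={"Objects": keys})
            deleted += len(keys)
        if not page.get("IsTruncated"):
            return deleted
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


# Usage (needs boto3 and AWS credentials):
#   import boto3
#   delete_index(boto3.client("kendra"), "<your-index-id>")
#   print(empty_prefix(boto3.client("s3"), "<your-bucket>", "wiki/"))
```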

Conclusion

As you can see, setting up Amazon Kendra is very easy. You can integrate Kendra with your app through Kendra’s API[4]. So far, the service looks very promising. I like the fast setup and the fact that it can ingest many different formats, such as plain text, FAQs, HTML, PDFs, and Microsoft Word[5][6]. I haven’t used Kendra much, so I cannot vouch for the incremental learning, and that brings me to my first drawback: the price. Kendra is pricey! It starts at $5K per month for the Enterprise edition (24/7 use, not counting queries and S3 charges). As far as I know, they are planning to release a Developer edition too, which will be cheaper (around $1.8K per month)[7]. Overall, it is an excellent service, though I would love to see a serverless option!
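For anyone wondering how the $7 per hour from the warning at the top relates to the monthly figure, here is the back-of-the-envelope arithmetic. It assumes a 30-day month with the index running 24/7, and the hourly rate for the Developer edition is only what the $1.8K figure implies, not an official price.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, index always on


def monthly_cost(hourly_rate):
    """Monthly cost in dollars for an index that runs around the clock."""
    return hourly_rate * HOURS_PER_MONTH


def implied_hourly_rate(monthly_price):
    """Hourly rate implied by a monthly price under the same assumption."""
    return monthly_price / HOURS_PER_MONTH


print(monthly_cost(7))            # 5040, i.e. roughly $5K per month
print(implied_hourly_rate(1800))  # 2.5, i.e. about $2.50 per hour
```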

Well, that is it for today, folks. I hope you found this article interesting. If you have any questions, suggestions, or remarks, please let me know in the comment section below or on Twitter at @siaterliskonsta. Make sure to check out my earlier post about Amazon SageMaker. Until next time, take care!

References

[1] Amazon Kendra – https://aws.amazon.com/kendra/

[2] Amazon AI Services – https://aws.amazon.com/machine-learning/ai-services/

[3] What Is Amazon Kendra? – https://docs.aws.amazon.com/kendra/latest/dg/what-is-kendra.html

[4] API Reference – https://docs.aws.amazon.com/kendra/latest/dg/API_Reference.html

[5] Adding Documents Directly to an Index – https://docs.aws.amazon.com/kendra/latest/dg/in-adding-documents.html

[6] Adding Documents from a Data Source – https://docs.aws.amazon.com/kendra/latest/dg/data-source.html

[7] Amazon Kendra pricing – https://aws.amazon.com/kendra/pricing/
