Inverted index and Analyzers in ElasticSearch

Posted By: Rozi Ali | 26th February 2021

 

ElasticSearch is an open-source search engine built on top of Apache Lucene, the library that handles indexing and searching. It stores data in the form of JSON documents, so you don't need to define a schema before storing your data.

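Internally, however, ElasticSearch still maintains a schema, called a mapping, which it uses when handing data to Lucene. The mapping defines how each field is indexed and what its data type is. It can be explicit (defined by you) or implicit (inferred by ElasticSearch from the documents you index).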
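
As a rough illustration, an explicit mapping is a JSON body supplied when an index is created. The sketch below only builds such a body as a Python dictionary and does not talk to a cluster; the field names `title`, `views` and `published` are hypothetical and chosen purely for this example.

```python
import json

# Hypothetical explicit mapping: the field names are made up for illustration.
explicit_mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},       # analyzed for full-text search
            "views": {"type": "integer"},    # numeric field
            "published": {"type": "date"},   # date field
        }
    }
}

# Serialize the mapping to the JSON body that would accompany index creation.
mapping_json = json.dumps(explicit_mapping, indent=2)
print(mapping_json)
```

With an implicit mapping, no such body is supplied and the field types are inferred from the first documents that are indexed.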

In this blog, we will learn how ElasticSearch is able to process data very rapidly. 

 

Inverted Index

The inverted index is a data structure that supports fast full-text search: it maps each term to the documents that contain it. It is the reason behind the fast search that ElasticSearch provides.

How does it work? Let's understand this with a simple example:

 

Suppose we insert two documents:

 

Document 1: "It is a beautiful day"

Document 2: "What a beautiful flower"

 

An inverted index of the above documents would look like this:

 

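Term        Document   Position   Frequency
it          1          1          1
is          1          2          1
a           1, 2       3, 2       1
beautiful   1, 2       4, 3       1
day         1          5          1
what        2          1          1
flower      2          4          1

Positions are listed in the same order as the documents, and Frequency is the number of times the term occurs in each of those documents.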
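
The table above can be reproduced with a few lines of Python. This is only a minimal sketch of the idea, not how Lucene stores its index: it assumes terms are split on white space and lowercased, and `build_inverted_index` is a helper name invented for this example.

```python
from collections import defaultdict

documents = {
    1: "It is a beautiful day",
    2: "What a beautiful flower",
}

def build_inverted_index(docs):
    """Map each lowercased term to {doc_id: [1-based positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, word in enumerate(text.split(), start=1):
            term = word.lower()
            index[term].setdefault(doc_id, []).append(position)
    return dict(index)

inverted_index = build_inverted_index(documents)

for term, postings in inverted_index.items():
    # The frequency of a term in a document is the number of positions recorded.
    frequencies = {doc_id: len(pos) for doc_id, pos in postings.items()}
    print(term, postings, frequencies)
```

For the term "beautiful", for instance, the sketch records position 4 in document 1 and position 3 in document 2, matching the table.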

 

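With this data structure, searching becomes very easy for ElasticSearch: to find the documents that contain a term, it looks the term up in the index and reads off the matching documents, instead of scanning the text of every document.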
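
To see why lookups are cheap, here is a small sketch of a search over the two example documents. It assumes a simplified index that maps each lowercased term to a set of document IDs, and it treats a multi-word query as "all terms must match"; the `search` function is hypothetical and is not an ElasticSearch API.

```python
documents = {
    1: "It is a beautiful day",
    2: "What a beautiful flower",
}

# Simplified inverted index: term -> set of IDs of documents containing it.
index = {}
for doc_id, text in documents.items():
    for term in text.lower().split():
        index.setdefault(term, set()).add(doc_id)

def search(query):
    """Return the IDs of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    # One dictionary lookup per term, then a set intersection.
    return set.intersection(*postings)

print(sorted(search("beautiful")))         # [1, 2]
print(sorted(search("beautiful flower")))  # [2]
```

The cost of a query depends on the number of query terms and the size of their document lists, not on the total amount of text stored.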

By default, ElasticSearch indexes all data in every field, and every indexed field has a dedicated, optimized data structure. For example, text fields are stored in inverted indices.

 

ElasticSearch Analyzers

An analyzer is the algorithm that determines how the content of a text field is transformed into terms in the inverted index. It first breaks the text into terms and then standardizes them. It is a three-step process:

 

Step 1: Character Filtering

This is a pre-processing step in which the stream of characters is transformed by adding, removing, or changing characters.
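
A character filter can be pictured as a plain string-to-string function. The sketch below is an assumption-laden illustration, not ElasticSearch code: it removes HTML-like tags and replaces `&` with the word "and", and the name `char_filter` is made up for this example.

```python
import re

def char_filter(text):
    """Remove HTML-like tags, then replace '&' with 'and'."""
    text = re.sub(r"<[^>]+>", "", text)  # removing characters
    text = text.replace("&", "and")      # changing characters
    return text

print(char_filter("<b>Salt & pepper</b>"))  # Salt and pepper
```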

 

Step 2: Tokenization

In this step, the stream of characters is broken down into terms, also known as tokens. For example, a stream can be tokenized on white space to produce individual words as output.
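
White-space tokenization is simple enough to sketch directly. The helper name `whitespace_tokenize` is hypothetical; the example also records each token's 1-based position, since positions are part of what the inverted index stores.

```python
def whitespace_tokenize(text):
    """Split a character stream into tokens on runs of white space."""
    return text.split()

tokens = whitespace_tokenize("It is a beautiful day")
print(tokens)  # ['It', 'is', 'a', 'beautiful', 'day']

# Pair each token with its 1-based position in the stream.
positioned = list(enumerate(tokens, start=1))
print(positioned)
```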

 

Step 3: Token Filters

In this final step, the tokens are filtered and transformed into the standard form chosen by the user, for example by lowercasing them or removing common words.
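
Two common kinds of token filter are lowercasing and stop-word removal. The sketch below chains them in plain Python; the function names and the tiny `STOP_WORDS` set are assumptions made for this example and are not the lists ElasticSearch ships with.

```python
# Hypothetical stop-word list, kept tiny for the example.
STOP_WORDS = {"a", "an", "is", "it", "the"}

def lowercase_filter(tokens):
    """Standardize tokens so that 'It' and 'it' become the same term."""
    return [token.lower() for token in tokens]

def stop_filter(tokens):
    """Drop very common words that carry little search value."""
    return [token for token in tokens if token not in STOP_WORDS]

tokens = ["It", "is", "a", "beautiful", "day"]
filtered = stop_filter(lowercase_filter(tokens))
print(filtered)  # ['beautiful', 'day']
```

The order matters here: lowercasing runs first so that "It" matches the lowercase entry in the stop-word set.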

 

The result of the analysis process is then put in the inverted index.
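
Putting the three steps together, a toy analyzer and the index it feeds might look like the sketch below. Everything here is illustrative: the `analyze` function, the stop-word set, and the `<b>` markup added to the first document are assumptions for this example, and real analyzers are considerably more careful.

```python
import re
from collections import defaultdict

STOP_WORDS = {"a", "is", "it"}  # hypothetical, kept small for the example

def analyze(text):
    """Run the three analysis steps and return the resulting terms."""
    text = re.sub(r"<[^>]+>", "", text)                  # step 1: character filtering
    tokens = text.split()                                # step 2: tokenization
    tokens = [token.lower() for token in tokens]         # step 3: token filters
    return [token for token in tokens if token not in STOP_WORDS]

# Markup was added to document 1 only to give the character filter work to do.
documents = {
    1: "It is a <b>beautiful</b> day",
    2: "What a beautiful flower",
}

# Store the analysis result in an inverted index: term -> set of document IDs.
index = defaultdict(set)
for doc_id, text in documents.items():
    for term in analyze(text):
        index[term].add(doc_id)

for term in sorted(index):
    print(term, sorted(index[term]))
```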

ElasticSearch analyzers provide great support for improving search accuracy.
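
For reference, a custom analyzer is declared in the index settings by naming one tokenizer plus optional character filters and token filters. The sketch below builds such a settings body as a Python dictionary without contacting a cluster. The analyzer name `my_custom_analyzer` and the field name `content` are hypothetical; `html_strip`, `standard`, `lowercase` and `stop` are, to the best of my knowledge, built-in component names, but check the documentation for your version.

```python
import json

index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_custom_analyzer": {               # hypothetical analyzer name
                    "type": "custom",
                    "char_filter": ["html_strip"],    # step 1: character filtering
                    "tokenizer": "standard",          # step 2: tokenization
                    "filter": ["lowercase", "stop"],  # step 3: token filters
                }
            }
        }
    },
    "mappings": {
        "properties": {
            # Hypothetical text field that uses the analyzer defined above.
            "content": {"type": "text", "analyzer": "my_custom_analyzer"}
        }
    },
}

print(json.dumps(index_body, indent=2))
```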


About Author

Rozi Ali

She is a self-motivated and dedicated member of the Development team. She works with Java technology.
