Tools and Architecture Supporting the Okapi Framework

Posted By: Ravi Rose | 30th June 2021


The Okapi Framework is a free, cross-platform, open-source project offering a range of tools that are useful for translators. There is one caveat, however.

The project was initially developed as a toolset for localization engineers, not translators, which makes it somewhat harder for translators to approach.


The Okapi Framework is a set of components meant to be put together to build processes for various translation-related tasks. Think of Okapi as a box of Lego bricks with which developers or technically minded users can build useful utilities. This is not very practical for many translators, who usually prefer something more concrete to work with.


Its goal is to let tool developers and localizers create new localization processes, or enhance existing ones, to best meet their needs while preserving compatibility and interoperability.



Here are some of the tools and applications built on the framework:


Rainbow - A GUI application used to launch various utilities for translation and localization tasks, such as:

  • Text extraction (to XLIFF, OmegaT projects, RTF, etc.) and merging, 
  • Pre-translation, 
  • Encoding conversion, 
  • Terms extraction, 
  • File format conversions, 
  • Quality verification, 
  • Translation comparison, 
  • Search and replace on filtered text, 
  • Pseudo-translation, and much more.


Using the framework's pipeline mechanism, you can use Rainbow to build chains of steps that perform a custom set of tasks specific to your needs.
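To make the idea of chaining steps more concrete, here is a minimal sketch in Python. It is not Okapi's API (Okapi is a Java framework with its own pipeline classes); the names `Step`, `search_and_replace`, `pseudo_translate` and `run_pipeline` are all hypothetical and only illustrate how the output of one step becomes the input of the next.

```python
from typing import Callable, Iterable, List

# A "step" here is any function that takes a list of text segments and
# returns a new list. This is a simplification for illustration only;
# it is not how Okapi's own pipeline steps are declared.
Step = Callable[[List[str]], List[str]]


def search_and_replace(old: str, new: str) -> Step:
    """Build a step that replaces `old` with `new` in every segment."""
    def step(segments: List[str]) -> List[str]:
        return [s.replace(old, new) for s in segments]
    return step


def pseudo_translate(segments: List[str]) -> List[str]:
    """Swap ASCII vowels for accented ones and bracket each segment."""
    table = str.maketrans("aeiouAEIOU", "àéîõüÀÉÎÕÜ")
    return ["[" + s.translate(table) + "]" for s in segments]


def run_pipeline(steps: Iterable[Step], segments: List[str]) -> List[str]:
    """Feed the output of each step into the next one, in order."""
    for step in steps:
        segments = step(segments)
    return segments


# Hypothetical two-step chain: normalize a spelling, then pseudo-translate.
pipeline = [search_and_replace("colour", "color"), pseudo_translate]
result = run_pipeline(pipeline, ["Pick a colour", "Save file"])
print(result)
```

Because every step has the same shape, steps can be reordered, removed or added without touching the others, which is the property that makes a pipeline convenient for custom task chains.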


CheckMate - A GUI application that performs various quality checks on bilingual translation files such as XLIFF, TMX, TTX, PO, TS, Trados-Tagged RTF, and any other bilingual format supported by the framework.


Ratel - A GUI application to create and maintain segmentation rules. Such rules are used to break translatable text down into more meaningful parts. Ratel uses Okapi's SRX-based segmentation engine. SRX is the Segmentation Rules eXchange format. The application includes a test feature that lets you see the effect of your segmentation rules on sample text immediately as you edit the rules.
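SRX describes segmentation as an ordered list of rules, each saying "break" or "do not break" at a position where one pattern matches the text before it and another matches the text after it. The sketch below imitates that idea in Python. It is a simplified illustration, not Okapi's segmentation engine, and it ignores SRX features such as language-specific rule sets; the `Rule` type, the sample rules and the `segment` function are hypothetical.

```python
import re
from typing import List, NamedTuple


class Rule(NamedTuple):
    is_break: bool  # True = break here, False = exception (no break)
    before: str     # regex that must match the text ending at the position
    after: str      # regex that must match the text starting at the position


# Hypothetical sample rules, listed in priority order: the first rule that
# matches at a position decides whether the text is split there.
RULES = [
    Rule(False, r"\b(?:Mr|Dr|etc)\.", r"\s"),  # no break after abbreviations
    Rule(True, r"[.?!]", r"\s+"),              # break after end punctuation
]


def segment(text: str, rules: List[Rule] = RULES) -> List[str]:
    """Split `text` at every position where a break rule wins."""
    segments, start = [], 0
    for pos in range(1, len(text)):
        for rule in rules:
            before_ok = re.search("(?:" + rule.before + r")\Z", text[:pos])
            after_ok = re.match(rule.after, text[pos:])
            if before_ok and after_ok:
                if rule.is_break:
                    segments.append(text[start:pos].strip())
                    start = pos
                break  # first matching rule wins, later rules are skipped
    segments.append(text[start:].strip())
    return [s for s in segments if s]


print(segment("Dr. Smith arrived. He sat down! Done?"))
```

Putting the exception rule first is what keeps "Dr." from ending a segment; swapping the two rules would split the text after the abbreviation.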


Tikal - A command-line tool that provides many functions, including simple extraction/merging, various file format conversions, access to translation resources, import/export for the Pensieve TM, etc.
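As a rough illustration of the extraction/merging round trip, the sketch below only assembles the command lines and does not run anything. It makes several assumptions: that the launcher is reachable as `tikal` on the PATH (the launcher script in your distribution may be named differently), that `-x` extracts a file to XLIFF, `-m` merges an XLIFF file back, `-sl` and `-tl` set the source and target languages, and that extraction writes a file with `.xlf` appended to the input name. Please check each of these against the help output of your installed Tikal version. The file name and helper functions are hypothetical.

```python
from typing import List


def tikal_extract_args(input_file: str, src_lang: str, trg_lang: str) -> List[str]:
    """Command line for extracting `input_file` to XLIFF (assumed flags)."""
    return ["tikal", "-x", input_file, "-sl", src_lang, "-tl", trg_lang]


def tikal_merge_args(xliff_file: str) -> List[str]:
    """Command line for merging a translated XLIFF file back (assumed flag)."""
    return ["tikal", "-m", xliff_file]


extract_cmd = tikal_extract_args("manual.html", "en", "fr")
# Assumption: the extracted file is named after the input plus ".xlf".
merge_cmd = tikal_merge_args("manual.html.xlf")

print(" ".join(extract_cmd))
print(" ".join(merge_cmd))
# To actually run a command, pass the list to subprocess.run(cmd, check=True).
```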


Filters Plugin for OmegaT - This plugin brings transparent support for additional file formats such as TTX, IDML, JSON, etc. Just drop the JAR file into OmegaT's plugins directory, restart OmegaT, and you're good to go.


Longhorn - A server application for executing pipelines remotely. Pre-defined pipelines and filter configurations can be exported from Rainbow for this purpose. Longhorn provides a REST interface.



The Okapi Framework architecture includes the following parts:


Interface Specifications: The Okapi Framework's components and applications communicate through several common API sets and interfaces, some of which are defined as high-level specifications. Implementing these interfaces lets you plug new components into the overall framework seamlessly. For example, all filters share the same API for parsing input files, so you can write utilities that work with any of the available filters.
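The benefit of a shared filter API can be sketched in a few lines of Python. This is an illustration of the principle only: Okapi's real filter interface is a Java API and is richer than this, and the names `Filter`, `PropertiesFilter`, `CsvFilter`, `text_units` and `word_count` are hypothetical.

```python
import csv
import io
from typing import Iterator, Protocol


class Filter(Protocol):
    """Anything with this method counts as a filter in this sketch."""
    def text_units(self, content: str) -> Iterator[str]: ...


class PropertiesFilter:
    """Yields the values of key=value lines, skipping comments and blanks."""
    def text_units(self, content: str) -> Iterator[str]:
        for line in content.splitlines():
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                yield line.split("=", 1)[1].strip()


class CsvFilter:
    """Yields every non-empty cell of a comma-separated table."""
    def text_units(self, content: str) -> Iterator[str]:
        for row in csv.reader(io.StringIO(content)):
            for cell in row:
                if cell.strip():
                    yield cell.strip()


def word_count(flt: Filter, content: str) -> int:
    """A utility written once against the interface, usable with any filter."""
    return sum(len(unit.split()) for unit in flt.text_units(content))


print(word_count(PropertiesFilter(), "# menu\ngreeting=Hello world\n"))
print(word_count(CsvFilter(), "Open file,Close\nSave as,\n"))
```

The `word_count` utility never needs to know which file format it is reading; supporting a new format means writing one more filter class, not another utility.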


Format Specifications: Storing and exchanging data is a crucial part of the localization process. We can increase interoperability by using as many open standards as possible. Whenever possible, the Okapi Framework makes use of existing standards like XLIFF, SRX, TMX, etc.
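To show what exchanging data through an open standard looks like in practice, here is a small sketch that writes translation pairs as a TMX 1.4 document using only Python's standard library. It does not use Okapi; the function name `build_tmx`, the header values such as `example-script`, and the sample pairs are placeholders chosen for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this namespaced attribute as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang="en", trg_lang="fr"):
    """Return a TMX document (as a string) holding the given text pairs."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",   # placeholder value
        "creationtoolversion": "0.1",       # placeholder value
        "segtype": "sentence",
        "o-tmf": "unknown",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        tu = ET.SubElement(body, "tu")  # one translation unit per pair
        for lang, text in ((src_lang, source), (trg_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


doc = build_tmx([("Hello", "Bonjour"), ("Save", "Enregistrer")])
print(doc)
```

Any tool that reads TMX should be able to import a file like this, regardless of which tool produced it, which is the interoperability argument made above.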


Components: The Okapi Framework also includes a growing set of components that implement the various interface specifications. Some are basic, low-level parts that can be reused when programming higher-level components, while others are plugins that can be used directly in scripts or applications.


Applications: Lastly, the framework also provides end-user applications that can be used out of the box. These tools make use of the Okapi components and provide ready-made platforms for plugging in your own components.



There are two main types of components:

  • Filters - Supported formats include HTML, Microsoft Office files, Java properties files, .NET ResX files, table-type files (e.g. CSV), XLIFF, SDLXLIFF, Qt TS, TMX files, XML, IDML, etc.
  • Utilities - These include term extraction, text extraction and merging, line-break conversion, quality check, RTF to text conversion, translation comparison, text rewriting, etc.
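Text extraction and merging, listed among the utilities above, work as a pair: extraction separates the translatable text from everything else in the file, and merging puts the translated text back into the original layout. The following sketch shows that round trip for a simple key=value format. It is an assumption-laden toy, not how Okapi's filters or utilities are implemented, and `extract`, `merge` and the skeleton structure are hypothetical.

```python
from typing import List, Optional, Tuple

# Each skeleton entry is (literal text, index of the text unit that follows
# it), or (literal text, None) for lines with nothing to translate.
Skeleton = List[Tuple[str, Optional[int]]]


def extract(content: str) -> Tuple[Skeleton, List[str]]:
    """Split key=value content into a skeleton and a list of text units."""
    skeleton: Skeleton = []
    units: List[str] = []
    for line in content.splitlines():
        if "=" in line and not line.lstrip().startswith("#"):
            key, value = line.split("=", 1)
            skeleton.append((key + "=", len(units)))
            units.append(value)
        else:
            skeleton.append((line, None))  # comment or blank line, kept as-is
    return skeleton, units


def merge(skeleton: Skeleton, translated: List[str]) -> str:
    """Rebuild the file, putting each translated unit back in its slot."""
    lines = [
        prefix if index is None else prefix + translated[index]
        for prefix, index in skeleton
    ]
    return "\n".join(lines)


source = "# menu\nfile.open=Open\nfile.save=Save"
skeleton, units = extract(source)
print(units)
print(merge(skeleton, ["Ouvrir", "Enregistrer"]))
```

Keys and comments never reach the translator, so they cannot be changed by accident, and merging the untranslated units reproduces the original file.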


