Mapping the Advantages and Disadvantages of Sikuli for OCR

Posted by: Hemant Chauhan | 30th June 2020



Sikuli is a tool for automating graphical user interfaces (GUIs). It uses image recognition to interact with the elements of a web page or a desktop application window: it treats every on-screen element as an image and locates elements by matching reference images against the screen. In principle, anything that is visible on the screen can be automated this way. We, at Oodles, as an artificial intelligence development company, are constantly adding new automation tools and techniques to our competency bucket. In this article, we explore the advantages and disadvantages of Sikuli with regard to its text recognition capabilities.



What is Sikuli?


Sikuli automates whatever we see on the screen, using image recognition to identify and control elements. It relies on visual image matching: each element of the web page that we want to interact with is captured as an image and stored in the project's resources directory. Sikuli then performs GUI interactions on whichever part of the screen visually matches the image we pass as a parameter to its methods (such as find, click, and doubleClick).
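
To make the idea of a visual match concrete, here is a minimal sketch in plain Java. It is not how Sikuli is implemented (Sikuli delegates matching to OpenCV); the class NaiveTemplateMatch and its methods are hypothetical names used only for illustration. The sketch slides a small template image over a larger screen image and reports the position where the largest fraction of pixels is identical.

```java
import java.awt.Point;
import java.awt.image.BufferedImage;

/** Naive illustration of visual matching: slide a template over a larger image. */
final class NaiveTemplateMatch {

    /** Location and score of the best match found. */
    static final class Result {
        final Point location; // top-left corner of the best match
        final double score;   // 1.0 means every pixel was identical

        Result(Point location, double score) {
            this.location = location;
            this.score = score;
        }
    }

    /** Fraction of template pixels equal to the screen pixels at offset (ox, oy). */
    static double scoreAt(BufferedImage screen, BufferedImage template, int ox, int oy) {
        int same = 0;
        int w = template.getWidth();
        int h = template.getHeight();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (screen.getRGB(ox + x, oy + y) == template.getRGB(x, y)) {
                    same++;
                }
            }
        }
        return (double) same / (w * h);
    }

    /** Returns the best-scoring position, or null if the template does not fit. */
    static Result findBest(BufferedImage screen, BufferedImage template) {
        Result best = null;
        for (int oy = 0; oy + template.getHeight() <= screen.getHeight(); oy++) {
            for (int ox = 0; ox + template.getWidth() <= screen.getWidth(); ox++) {
                double s = scoreAt(screen, template, ox, oy);
                if (best == null || s > best.score) {
                    best = new Result(new Point(ox, oy), s);
                }
            }
        }
        return best;
    }
}
```

In a real script, Sikuli's own find and click methods play this role, with the stored screenshot acting as the template and the live screen as the image being searched.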


As we know, Selenium is used to automate web applications, but it cannot automate Windows desktop applications. By using Sikuli together with Selenium, we can automate both desktop and web applications. File uploads and downloads, which typically go through native OS dialogs, are easy to handle with Sikuli, and Flash objects can be automated as well. Sikuli works best when the GUI is stable, that is, when the appearance of its components does not change.


With Sikuli, we can also automate Adobe Flash content on a website, including audio/video players and Flash games. Sikuli's architecture also includes OpenCV, a computer vision library that is widely used in machine learning applications.


Working with text and OCR features in Sikuli


SikuliX uses the Tess4J Java library, which makes the features of Tesseract OCR available in Java. In other words, it uses Tesseract internally for OCR.

SikuliX version 2.x provides TextOCR (for Python scripts) and TextRecognizer (at the Java API level) for text recognition.


We can easily integrate Sikuli into a Java Maven project by declaring it as a dependency in pom.xml.
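
A minimal sketch of such a dependency entry, assuming the SikuliX API artifact published on Maven Central (com.sikulix:sikulixapi). The version number is only an example and should be replaced with the current release:

```xml
<dependencies>
  <!-- SikuliX Java API; the version shown is an assumption, check Maven Central -->
  <dependency>
    <groupId>com.sikulix</groupId>
    <artifactId>sikulixapi</artifactId>
    <version>2.0.5</version>
  </dependency>
</dependencies>
```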








Advantages of Sikuli


1. It is an open-source tool.

2. It can automate desktop applications.

3. It can easily automate Flash objects and Flash websites.

4. It is useful when you test a web page and do not know the id or name of an element: Sikuli can check whether an image appears on the screen and whether a match is found.

5. It provides a very simple API, and its methods are easy to use.

6. It integrates easily with Selenium and other tools.

7. It can be used on multiple platforms, such as Windows, Linux, and Mac; mobile support is more limited.

8. It allows us to automate anything we see on the screen.


Disadvantages of Sikuli


1. Image matching is not guaranteed to be accurate. If two or more similar images are on the screen, Sikuli may select the wrong one.

2. If any one of the reference images is missing from our directory, the execution of the program is disrupted.

3. We have to manually take many screenshots of the website we are going to automate.

4. A "FindFailed" exception occurs if the on-screen appearance of an image differs in pixel size from the stored screenshot.
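
The first and fourth points can be illustrated with a small sketch in plain Java. It is a hypothetical model, not Sikuli's code: the class ThresholdMatchSketch, the grayscale grids, and the similarity formula are all assumptions made for illustration. SikuliX itself decides whether a match counts by comparing a similarity score against a minimum value, which can be tuned per image (for example with Pattern.similar). The sketch shows that a tolerant threshold accepts a look-alike region as well as the intended one, while a template of a different pixel size matches nothing at all.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: threshold-based matching on tiny grayscale grids. */
final class ThresholdMatchSketch {

    /** Similarity in [0, 1]: 1 minus the mean absolute pixel difference (pixels are 0..255). */
    static double similarity(int[][] screen, int[][] template, int ox, int oy) {
        long diff = 0;
        int h = template.length;
        int w = template[0].length;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                diff += Math.abs(screen[oy + y][ox + x] - template[y][x]);
            }
        }
        return 1.0 - diff / (255.0 * w * h);
    }

    /** All top-left offsets, as {x, y}, whose similarity reaches minSimilarity. */
    static List<int[]> findAll(int[][] screen, int[][] template, double minSimilarity) {
        List<int[]> hits = new ArrayList<>();
        int h = template.length;
        int w = template[0].length;
        for (int oy = 0; oy + h <= screen.length; oy++) {
            for (int ox = 0; ox + w <= screen[0].length; ox++) {
                if (similarity(screen, template, ox, oy) >= minSimilarity) {
                    hits.add(new int[] {ox, oy});
                }
            }
        }
        return hits;
    }
}
```

When findAll returns more than one hit, a script that simply clicks the first or best match may act on the wrong element; when it returns none, the situation corresponds to Sikuli raising FindFailed. Raising the threshold reduces false matches but makes the script more sensitive to small rendering differences.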

About Author

Hemant Chauhan

He has a good understanding of Java and a strong willingness to learn new technologies. He likes to work with the Spring Framework and MySQL, and he is good at sports.
