Envision is an app that can extract visual information from your surroundings and convert it to meaningful audio output. The functions of Envision are categorised into two tabs: Text Recognition and General Recognition.

Text Recognition

All current and future text recognition functions are found under this tab, so they are always easy to access. Within this tab you will find the following functions:

Read Text Instantly

This function works with a live video feed and is meant for instantly reading short pieces of text, like displays at a train station or a price tag in a supermarket.

Hold the camera over the text you want to read and press the button for it to start reading instantly. A soft beep is played when text is detected in a frame, which helps you position the phone over the text.

As new text comes into view, it is also read out, making this function convenient for scanning around.

Read Handwriting

This function is meant for reading handwritten text, like that on a Post-it note or a greeting card.

Hold the camera over the text you want to read and tap this button. This will capture a photo of the text you are pointing at. You will hear a sound indicating that the image is being processed. Then, you will hear the audio output of the text.

Read Documents

This function is meant for reading long pieces of text, like official letters or the ingredients on the back of a package.

Once you press the button, a guidance system is activated. It detects the edges of the page and speaks out the edges that are not visible. Keep adjusting your phone until it announces "all edges visible". A photo of the page is then captured automatically. If Envision is not able to detect edges, or if you are reading something without well-defined edges, you can tap the center of the screen to take a picture manually.

When processing is complete, the text is displayed on a new page where it is accessible with VoiceOver. There is a play/pause button at the bottom for non-VoiceOver users. There is also an export button at the bottom that allows you to share the recognised text across various platforms.
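
For readers curious about how the edge guidance above might work, here is a minimal sketch in Python. It is not Envision's actual code: the edge names, the function names and the wording of the "not visible" message are assumptions made for illustration. Only the phrase "all edges visible" comes from the description above.

```python
# Hypothetical sketch of the edge guidance; not Envision's actual implementation.
EDGES = ("top", "right", "bottom", "left")  # assumed edge names


def guidance_message(visible_edges):
    """Return the phrase to speak for the page edges detected in a frame."""
    visible = set(visible_edges)
    missing = [edge for edge in EDGES if edge not in visible]
    if not missing:
        return "all edges visible"
    # Assumed wording for the spoken hint about missing edges.
    return ", ".join(missing) + " not visible"


def should_auto_capture(visible_edges):
    """A photo is captured automatically once every edge is in view."""
    return set(EDGES) <= set(visible_edges)


print(guidance_message({"top", "left"}))  # right, bottom not visible
print(guidance_message(EDGES))            # all edges visible
```

In this sketch the manual tap in the center of the screen would simply bypass `should_auto_capture`, which matches the fallback described above for pages without well-defined edges.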

Read Multiple Documents

This function allows you to capture multiple pages of a document and have them read out together, like multi-page assignments or chapters from a book.

To activate this, long press on the "Read Documents" button. This brings up a pop-up asking if you would like to read multiple pages. If you select yes, the guidance system will start detecting edges to guide you, just like in the normal document reading mode. You can continue to capture images one after another.

Upon completion, tap on the 'Done' button and you will start hearing the processing sound. The text will then be displayed on a page that can be easily navigated by VoiceOver. You can toggle between the captured pages using the indicators that appear below the text.


The magnifier function allows users with low vision to zoom in on pieces of text they want to read.

This can be activated by simply pinching to zoom on the screen or by pressing the magnifier icon in the top left corner. You can use any of the text recognition functions along with this mode. There is also an option to invert colours if you want to read text with a higher contrast.

General Recognition

All functions that do not involve recognising text can be found under this tab. Currently, this tab offers the following functions:

Describe Scene

This function allows you to capture an image of any scene with your camera and get a description of it. 

Point the phone in the direction of the scene you would like to have described. Tapping the button will take a picture. You will hear a sound indicating the processing of the image. Upon completion, a description of the scene will be spoken out.

The Describe Scene option also incorporates custom face and object recognition. If you have taught Envision a person's face, it will recognise them when you take a picture of that person using the Describe Scene button and tell you what they are doing. Similarly, if you take a picture of a personal belonging you have taught Envision, that is also recognised by the same option.

The Describe Scene option does more than that. It has been trained to recognise certain contexts and provide associated information. For example, if you take an image of a watch or a clock, it speaks out the current time. If you take an image of a window, it speaks out the current weather. We are continuing to add more contexts to this and will update you when we do.

Detect Color

This function can be used to detect the colors of clothes, objects, etc.

Point the camera of your phone at the area you would like to detect the color of and press the button. It will immediately start speaking out the name of the color it sees. You can move the camera around to scan the colors of different areas in real time. This function only picks up the color that is at the very center of the screen, so make sure you are as close as possible to the area whose color you are trying to detect.

In the Settings tab, you can also choose whether you want to recognise just the standard 30 shades of color or a more descriptive 950 shades of color.
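
To illustrate why only the very center of the screen matters, here is a minimal sketch in Python of one common way to name a color: read the center pixel and pick the closest entry in a list of named shades. This is an assumption about the general technique, not a description of Envision's actual method. The six-color palette and the function names are made up for the example, since the app's 30 and 950 shade lists are not given here.

```python
# Hypothetical sketch of center-pixel color naming; not Envision's actual code.
# A tiny illustrative palette stands in for the app's 30 or 950 shades.
PALETTE = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
}


def center_pixel(image):
    """Return the (r, g, b) value at the center of a list-of-rows image."""
    row = image[len(image) // 2]
    return row[len(row) // 2]


def nearest_color_name(rgb, palette=PALETTE):
    """Return the palette name with the smallest squared RGB distance."""
    def distance_sq(name):
        return sum((a - b) ** 2 for a, b in zip(rgb, palette[name]))
    return min(palette, key=distance_sq)


# A 3x3 frame that is black everywhere except a reddish center pixel.
frame = [
    [(0, 0, 0), (0, 0, 0), (0, 0, 0)],
    [(0, 0, 0), (250, 10, 20), (0, 0, 0)],
    [(0, 0, 0), (0, 0, 0), (0, 0, 0)],
]
print(nearest_color_name(center_pixel(frame)))  # red
```

In a sketch like this, switching between the standard and the more descriptive shades would only mean passing a larger palette; the lookup itself stays the same.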

Teach Faces and Objects

Envision allows you to teach it faces and objects, which can later be recognised through the Describe Scene function.

Tap on the Teach Envision button. This takes you to a screen with the options 'Teach a face' and 'Teach an object'. Tapping either of those options takes you to a new screen, where you can start taking photos. By default the back camera of the phone is active, but this can be changed within the screen if you intend to take a selfie. In the 'Teach a face' option, the camera will also provide a guide to help you position your face properly.

You are required to capture at least 5 photos, but we recommend taking around 10 photos for the recognition to be more accurate. It also helps if you take these photos from different angles and with different backgrounds. After taking the photos, press Done. You will be prompted to enter the name of the person or the object. Once you do that, Envision will start teaching itself, which takes a few seconds. Once the teaching is successful, you are taken back to the General Recognition tab.

Within the Teach Envision screen, you also have the option to Open Library. All the faces and objects that you have trained will be displayed here. You have the option to delete any of the faces or objects you no longer want Envision to recognise.

Recognise Images in other Apps

Envision can also be used to read and recognise images you come across in other apps like Photos, Twitter, WhatsApp, etc. This can be done by simply pressing the "Share" button from within that app and selecting the option "Envision it" from the list of actions that show up on the action sheet.

The first time you do this, you will have to enable the option by tapping on the "More" option in the bottom right corner of the share sheet and adding Envision It to the actions.


Envision is a constantly evolving app and we keep improving its functions and capabilities, so make sure you either have automatic updates turned on or check for new updates on a weekly basis. Here are a few more tips, crowdsourced from our users, that may improve your experience of using Envision:

  • A lot of Envision's features still depend on the internet. Though we have made sure that the processing happens lightning fast, having a decent internet connection helps. We never store any image or information that you capture.
  • If you have any feedback for Envision, you can share it using the 'Give Feedback' option in the Settings tab. If you need any help or clarification, you can also request a call from the Settings tab and we will call you back as soon as possible.
  • For all text recognition features, Envision automatically detects the language of the text by default and reads it out. However, if you mostly encounter text in one language and don't want Envision to get confused, you can turn Automatic Language Detection off in the Speech Settings within the app itself.
  • Within the Speech Settings, you can also adjust the speaking rate and the voice of all non-VoiceOver speech within the app. These changes do not affect the VoiceOver settings.
  • If you are a subscriber of Envision, you can refer a friend of yours to use Envision free of cost for one month. All they need to do is enter your associated email in the Referral page on the Settings tab to claim it.
Go to LetsEnvision.com