Tutorials


The functions of Envision AI are organised into three tabs for ease of use: Text Recognition, General Recognition, and Help & Settings. Help & Settings contains the support elements of the app, such as Tutorials, Feedback, Subscription and Account Information. The main features of the app are found in the first two tabs, which are explained in more detail below.

Text Recognition

Under this tab you will find all the features that deal with recognising text. There are currently three options for recognising text:

Live Text

  • This feature enables you to read short pieces of text, such as price tags and street names, in real time.

  • Unlike other features, Live Text Recognition works on a continuous video feed, instead of taking a photo.

  • When you select this option, the app starts scanning the video feed for text. It beeps once it detects text in the frame, which you can use to position and point your camera correctly.

  • Once a piece of text is detected, it is recognised and read out in the phone's system language.

  • Please note that this currently works only with languages that use the Latin script.

Handwritten Text

  • This feature enables you to read handwritten text on postcards, letters, etc.
  • Point your phone at the piece of handwritten text you wish to read.
  • Selecting the option takes a picture of the text you are pointing at.
  • A beeping sound indicates that the image is being processed.
  • The recognised result is read out by the app in the phone's system language.
  • Please note that this currently works only with languages that use the Latin script.

Document Text

  • This feature enables you to read, explore and share longer pieces of text in documents, letters, etc.

  • Point your phone at the piece of text you wish to read.
  • Selecting the option takes a picture of the text you are pointing at.
  • A beeping sound indicates that the image is being processed.

  • After processing, the text is displayed on a new page and read out in the language that the text is in.

  • Using VoiceOver, you can explore the text to read only the parts you are interested in.

  • You can also export this text using the Export button on the page.

  • This feature is currently capable of recognising more than 60 languages.


General Recognition

Under this tab, you will find all the recognition features other than text, such as scene description and barcode recognition. There are currently two options available in this tab:

Describe Scene

  • This feature gives you a description of an image of your surroundings.
  • Point your phone in the direction of what you want described and take a picture by selecting this option.
  • A beeping sound indicates that the image is being processed.
  • A description of the scene is displayed and read out in the phone's system language.
  • If a person in the image has been taught to Envision, their name is included in the description.
  • If the captured image is of an object that you have taught Envision, the name of the object is read out.
  • If the captured image is of a watch, the current time is read out.
  • If the captured image is of a window or the sky, the current weather is read out.
  • Please note that although the accuracy of this feature is constantly improving, it is still experimental and is known to give inaccurate descriptions from time to time, so take the results with a pinch of salt.
  • This feature currently supports output in more than 30 languages.

Train faces and objects

  • This feature enables you to teach Envision the faces of your friends and your personal objects.
  • Select one of the two options: Teach a face or Teach an object.
  • A camera page opens with the phone's back camera active. Switch the camera if you want to take a selfie.
  • Take at least 5 images of the face or the object from different angles and in different surroundings. The more images you take, the better the recognition.
  • Provide a name for the face or object you have just trained.
  • You will hear a success message as your new face or object is added to the library.
  • You can then test this by taking a picture of the person or the object using the "Describe Scene" feature.