Learning Microsoft Cognitive Services

Vision

APIs under the Vision flag allow your apps to understand image and video content. They let you retrieve information about faces, emotions, and other visual content. You can stabilize videos and recognize celebrities. You can also read text in images and generate thumbnails from videos and images.

There are five APIs in the vision domain, which we will look at now.

Computer vision

Using the computer vision API, you can retrieve actionable information from images. You can identify image properties and content, such as image format, image size, colors, and faces. You can detect whether an image contains adult or racy content. The API can recognize text in images and extract it as machine-readable words. It can recognize celebrities from a variety of fields. Lastly, it can generate storage-efficient thumbnails with smart cropping.
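To make the shape of such a call concrete, here is a minimal sketch in Python that builds, but does not send, an image-analysis request. It is an illustration under stated assumptions rather than the code used later in the book: the endpoint host, the `v3.2` version segment, the chosen feature names, and the helper name `build_analyze_request` are all assumptions, and the version your subscription exposes may differ.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_analyze_request(endpoint, key, image_url,
                          features=("Description", "Faces", "Color", "Adult"),
                          details=("Celebrities",)):
    """Build (but do not send) a POST request that asks for image analysis.

    endpoint and key are placeholders for your own resource; the path and
    API version below are assumptions and may need adjusting.
    """
    # The requested features travel in the query string as comma-separated lists.
    query = urlencode({
        "visualFeatures": ",".join(features),
        "details": ",".join(details),
    })
    url = f"{endpoint.rstrip('/')}/vision/v3.2/analyze?{query}"

    # The image is referenced by URL in a small JSON body.
    body = json.dumps({"url": image_url}).encode("utf-8")

    headers = {
        "Ocp-Apim-Subscription-Key": key,  # subscription key header
        "Content-Type": "application/json",
    }
    return Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    req = build_analyze_request(
        "https://example.cognitiveservices.azure.com",  # hypothetical endpoint
        "your-key-here",                                # hypothetical key
        "https://example.com/photo.jpg",                # hypothetical image
    )
    print(req.get_method(), req.full_url)
    # To actually call the service, pass req to urllib.request.urlopen
    # and parse the JSON response.
```

Building the request separately from sending it keeps the example runnable without an API key and makes the query parameters easy to inspect.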

We will look into computer vision in Chapter 2, Analyzing Images to Recognize a Face.

Face

We have already seen a very basic example of what the Face API can do. The rest of the API revolves around the detection, identification, organization, and tagging of faces in photos. In addition to detecting faces, you can check how likely it is that two faces belong to the same person. You can identify faces and find similar-looking faces. You can also use the API to recognize emotions in images.
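The "same person" check mentioned above can be sketched as follows. This is a rough illustration, not the book's implementation: it assumes that two face IDs have already been obtained from an earlier detection call, that the verification operation lives at `/face/v1.0/verify`, and that its JSON response carries a `confidence` score. The helper names, the sample response, and the 0.5 threshold are hypothetical.

```python
import json
from urllib.request import Request


def build_verify_request(endpoint, key, face_id1, face_id2):
    """Build (but do not send) a request that compares two detected faces."""
    url = f"{endpoint.rstrip('/')}/face/v1.0/verify"  # assumed path and version
    body = json.dumps({"faceId1": face_id1, "faceId2": face_id2}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return Request(url, data=body, headers=headers, method="POST")


def same_person(verify_response, threshold=0.5):
    """Decide whether two faces match, based on the returned confidence.

    The threshold is an illustrative choice, not a recommended value.
    """
    return float(verify_response["confidence"]) >= threshold


if __name__ == "__main__":
    # A made-up response in the assumed shape, used instead of a live call.
    sample = {"isIdentical": True, "confidence": 0.82}
    print(same_person(sample))
```

Applying your own threshold to the confidence score, rather than relying only on a yes/no flag, lets an application trade false matches against missed matches.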

We will dive further into the Face API in Chapter 2, Analyzing Images to Recognize a Face.

Video indexer

Using the video indexer API, you can start indexing videos immediately upon upload. This means that you can get video insights without relying on experts or custom code. The insights that the API extracts can in turn be used to make your content more discoverable.

The video indexer API will be covered in greater detail in Chapter 3, Analyzing Videos.

Content moderator

The content moderator API uses machine learning to moderate content automatically. It can detect potentially offensive and unwanted images, videos, and text in over 100 languages. In addition, it allows you to review detected material to improve the service.

The content moderator API will be covered in Chapter 2, Analyzing Images to Recognize a Face.

Custom vision service

The custom vision service allows you to upload your own labeled images to a vision service. This means that you can add images that are specific to your domain to allow recognition using the computer vision API.

The custom vision service will be covered in more detail in Chapter 2, Analyzing Images to Recognize a Face.