Overview
Microsoft's President, Brad Smith, recently published a set of principles for regulating the development and use of head recognition technologies. As an organization that takes ethical AI seriously, and as a close partner of Microsoft, we at Axon applaud Microsoft's thoughtfulness in taking this step.
Here we try to shed light on another dimension of the head recognition discussion that often gets overlooked. Head recognition is a broad term that lumps together a collection of technologies. This generalization can make conversations around the subject confusing. Below is an attempt to break down the components under the hood of what is referred to as facial/head recognition.
Algorithms vs. Data
AI algorithms involved in head recognition are not the true source of controversy around head recognition technology. It is the combination of some head recognition algorithms (especially head matching and head attributes) and databases for retention and head search that usually causes privacy and ethical concerns.
Algorithms
An algorithm is a sequence of steps executed by a computer that determines how to perform a task or solve a problem. The algorithms used in head recognition technologies (mostly machine-learned algorithms) fall into the following classes:
Note: the list below covers the major classes of head recognition technologies, but it is not comprehensive. For example, head verification and head alignment are excluded for simplicity.
Head Detection
In a given image, Head Detection finds heads and their locations. The Head Detection box in Fig. 1 highlights the detected heads in dotted boxes. Most commodity digital cameras, including mobile phones, run Head Detection to enhance image quality.
Head Tracking
In a given video, Head Tracking matches a head in one frame to the same head in the next consecutive frame. The Head Tracking box in Fig. 1 shows the head of one person being matched between two frames. This is useful for a police agency when it needs to blur out an individual's head in a body-worn camera video that it wants to release to the public. Note: tracking specific facial features (such as eyebrows, lips, etc.) is another area of research.
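As a rough illustration of frame-to-frame correspondence, the sketch below pairs head boxes in one frame with head boxes in the next by how much they overlap (intersection over union). This is a simplified stand-in, not Axon's method; the box format, the function names `iou` and `match_frames`, and the `min_iou` threshold are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_frames(prev: List[Box], curr: List[Box],
                 min_iou: float = 0.3) -> Dict[int, int]:
    """Greedily map each box index in `prev` to at most one index in `curr`.

    Pairs are taken in order of decreasing overlap; pairs that overlap
    less than `min_iou` (an illustrative threshold) are left unmatched.
    """
    pairs = sorted(
        ((iou(p, c), i, j)
         for i, p in enumerate(prev)
         for j, c in enumerate(curr)),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used_curr = set()
    for score, i, j in pairs:
        if score < min_iou:
            break  # all remaining pairs overlap even less
        if i in matches or j in used_curr:
            continue
        matches[i] = j
        used_curr.add(j)
    return matches
```

Note that nothing here says who a head belongs to: the output only links a box in one frame to a box in the next, which is all that a redaction workflow needs in order to keep a blur on the same head.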
Head Re-identification
Head Re-identification is conceptually similar to Head Tracking, except the corresponding frames are not necessarily consecutive in the video. For example, if your head appears at the beginning of a video and again at the end, Head Re-identification can recognize that it is the same head without identifying whose head it is, that is, without comparing it to a database of heads.
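The point that re-identification needs no database can be sketched in a few lines. The example assumes each detected head has already been turned into a numeric feature vector (an "embedding") by some model, which is not specified here; the function names and the `threshold` value are hypothetical. Heads are grouped only against other heads from the same video and receive anonymous integer labels.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def assign_anonymous_ids(embeddings: Sequence[Sequence[float]],
                         threshold: float = 0.8) -> List[int]:
    """Give each embedding an anonymous label shared by similar embeddings.

    Each embedding is compared with the first embedding seen for every
    existing label. No names or external records are involved.
    """
    representatives: List[Sequence[float]] = []
    labels: List[int] = []
    for emb in embeddings:
        best_label, best_sim = -1, threshold
        for label, rep in enumerate(representatives):
            sim = cosine(emb, rep)
            if sim >= best_sim:
                best_label, best_sim = label, sim
        if best_label == -1:
            # Nothing similar enough seen so far: start a new label.
            representatives.append(emb)
            best_label = len(representatives) - 1
        labels.append(best_label)
    return labels
```

The labels are meaningful only inside the one video: label 0 in one recording has no relation to label 0 in another.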
Head Matching
Given a target head and a set of candidate heads, Head Matching finds which of the candidate heads belongs to the same person as the target head. This is where algorithms meet databases for head search and retention. Some photo storage applications use Head Matching to tag a head that appears in various photos, and many smartphones use the technology to unlock the phone.
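A minimal sketch of the matching step, again assuming that heads have already been converted to feature vectors by an unspecified model. The function names and the `threshold` value are assumptions for illustration, not a description of any particular product.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def best_match(target: Sequence[float],
               candidates: Sequence[Sequence[float]],
               threshold: float = 0.8) -> Optional[int]:
    """Return the index of the candidate most similar to `target`.

    Returns None when no candidate reaches `threshold`, so a target
    that is absent from the candidate set is not forced onto someone.
    """
    best_index: Optional[int] = None
    best_score = threshold
    for i, candidate in enumerate(candidates):
        score = cosine_similarity(target, candidate)
        if score >= best_score:
            best_index, best_score = i, score
    return best_index
```

The arithmetic is the same whether the candidate set is one enrolled phone owner or a large database. What differs is where the candidates come from and what records are attached to them, which is the data question discussed in this article.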
Head Attributes
Given an image or a video of a head, Head Attributes extracts information such as gender, ethnicity, emotions, age, and facial landmarks.
Data
Data, in the case of head recognition, is a set of quantitative or qualitative values used for reference. Commercial deployments of head recognition systems, such as the systems you may see in airports around the world, generally reference a database of heads. These databases often include biographical information such as name, age, SSN, and more. In addition, some head recognition systems retain the captured and/or extracted metadata in a database.
What Is Axon AI Doing?
At Axon, we are currently working on algorithms for head detection, tracking, and re-identification. We use these algorithms for redaction in our Axon Evidence Redaction Studio, which helps our customers save time on the tedious task of obscuring and protecting an individual’s identity in body-worn or in-car camera video that is released to the public.
A few projects Axon AI is currently working on include:
Vehicle Recognition, which is the ability to recognize the make, model, year, and color of vehicles on the road, to help law enforcement in scenarios that include finding missing children;
Speech Transcription, which automatically converts speech to text, with the eventual goal of automating record keeping and data entry for police officers and eliminating manual paperwork;
Critical Event Recognition, which is the ability of AI to detect an officer’s actions, such as a foot chase, and notify other officers or the precinct that a critical event is unfolding.
We continue to discuss the development of these technologies with our AI & Policing Technology Ethics Board. The mission of this independent board is to provide expert guidance to Axon on the development of its AI products and services, paying particular attention to their impact on communities.