In a world where trust and accountability are paramount, Axon’s body-worn cameras (BWCs) serve as a vital tool in ensuring reliable evidence. But how does Axon approach building these cameras to provide an objective account of incidents in the field? And what is important to understand about the relationship between camera vision and human vision?
We spoke to Juha Alakarhu, VP of Imaging at Axon, to delve into the imaging and design principles behind Axon’s BWCs to uncover how they strike a balance between reflecting the most important aspects of human vision while maintaining objectivity.
Axon designs its BWCs with a singular mission: to provide trustworthy, objective video evidence that empowers law enforcement to be transparent with their communities. In a world rife with misinformation and AI-manipulated content such as deepfakes, Axon focuses on secure, tamper-proof, true-to-life imaging. Ensuring consistent, high-quality footage that accurately reflects the truth means, for instance, avoiding AI-generated artifacts that could distort reality.
“What we want to do here is [ensure our cameras] generate video that our customers can trust. [The camera] is not adding anything that was not there. It's true to life,” says Alakarhu.
From the hardware design to secure evidence storage, everything in Axon body-worn cameras is engineered to ensure the footage remains unaltered and authentic. Unlike consumer devices, which may prioritize enhancing or altering imagery, Axon cameras are built to capture reality as it is, without unnecessary processing that could introduce artifacts or inaccuracies. The footage is securely uploaded to the Axon Evidence platform, complete with configurable permissions and audit trails, ensuring its integrity throughout the process.
Reliability and clarity are at the core of Axon’s imaging principles. Every design decision, from field of view (FOV) to low-light performance, is rooted in ensuring that the cameras support law enforcement professionals in the most challenging scenarios. This philosophy is backed by decades of expertise from Alakarhu’s talented team of camera specialists dedicated to understanding and anticipating user needs.
Body-worn cameras are invaluable tools for precisely documenting the events as they unfold, and having multiple cameras on scene from different angles enhances the overall picture of the incident. However, no camera, no matter the angle, can fully prove or disprove an individual’s perception of a given moment during the incident—whether that person is a member of the public or an officer. The most effective approach to protecting lives and advancing justice combines comprehensive camera footage with thorough, real-time reporting, sound investigative practices, and transparent accountability measures.
Human vision is a powerful imaging system, capable of processing dynamic, complex scenes with extraordinary depth, precision and sensitivity. Axon strives to create technology that aligns closely with human vision’s key characteristics. However, a camera sensor will also differ in certain key areas. Understanding these differences helps contextualize why both human vision and camera vision are valuable in their own ways.
The human eye has a field of view (FOV) of roughly 180-200 degrees horizontally and 130 degrees vertically. Our stereo vision, where both eyes' fields overlap, covers about 120-140 degrees horizontally. "You can easily test this: move a finger into the periphery of your right-hand-side vision, then close your right eye. You'll notice the finger disappears until you move it closer to the center, into the stereo vision area," says Alakarhu.
For the vertical field of view, we can perceive nearly the entire range in stereo. "An interesting fact about the vertical field of view is its slight asymmetry: we can see slightly more with our eyes pointed toward the ground than toward the sky—clearly being able to see where we step has been more crucial than observing what happens above!" notes Alakarhu. Note also that the shape of the human visual field is not a rectangle but closer to an oval, and its aspect ratio is closer to 4:3 than 16:9.
Axon Body 4, the most advanced BWC offered by Axon, has a diagonal FOV of about 160 degrees and two configurable aspect ratio settings. In 16:9 mode, its horizontal FOV of roughly 140 degrees matches the human stereo field of view, but its vertical FOV of 76 degrees is significantly narrower. In the new 4:3 aspect ratio mode, the Axon Body 4 has a 127-degree horizontal and 93-degree vertical FOV. Most consumer video content is 16:9, but the 4:3 option allows the Axon Body 4 to better match the aspect ratio of the human eye if desired.
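To set these numbers side by side, here is a minimal sketch in Python using the approximate figures quoted above; the human stereo values of roughly 140 degrees horizontal and 130 degrees vertical are rough approximations drawn from the ranges discussed earlier.

```python
# Rough comparison of the approximate FOV figures quoted in this article.
# Human stereo field: ~140 deg horizontal, ~130 deg vertical (approximate).
HUMAN_STEREO_H_DEG = 140
HUMAN_STEREO_V_DEG = 130

# Axon Body 4 aspect ratio modes as described above.
body4_modes = {
    "16:9": {"horizontal_deg": 140, "vertical_deg": 76},
    "4:3": {"horizontal_deg": 127, "vertical_deg": 93},
}

for mode, fov in body4_modes.items():
    h_cover = fov["horizontal_deg"] / HUMAN_STEREO_H_DEG
    v_cover = fov["vertical_deg"] / HUMAN_STEREO_V_DEG
    print(f"{mode}: covers ~{h_cover:.0%} of the human stereo width "
          f"and ~{v_cover:.0%} of its height")
```

Neither mode covers the full 180-200 degree monocular horizontal field, which is one reason multiple camera angles and officer statements remain valuable.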
Furthermore, human vision has a blind spot, the area where the optic nerve connects to the retina. The blind spot is not small (about 7.5 degrees vertically and 5.5 degrees horizontally), but we don't normally notice it because we have binocular vision and our brains correct for it automatically by "filling in" the missing information. Axon body-worn cameras and other digital cameras, of course, have no blind spot.
Last but not least, the body camera and the human eye are not perfectly aligned. We observe our surroundings by moving our head and eyes. As a result, the body camera footage may cover things that were not in the officer's field of view at a given moment, or, conversely, the officer may have seen more than was captured. Axon offers the Flex POV camera, which can be attached to eyeglasses or a helmet to bring the two fields of view into closer alignment.
While the field of view is relatively straightforward, things get more interesting when we talk about resolution. Different sources quote different megapixel counts for human vision; however, Alakarhu considers these mostly misleading due to fundamental differences between the human eye and a camera. It is better to use the metric of pixels per degree. Even though we don't see in "pixels," we can compare the human eye and cameras using this metric.
The human eye is capable of a resolution of at least 60 pixels per degree (PPD), but only in the central part of our vision. This part corresponds to a special area of the retina, called the fovea, which has a very high density of the cone photoreceptors responsible for sharp vision and color perception. The fovea covers about 5 degrees of our field of view, and its most central part, the foveola, about 1-2 degrees. Outside the fovea, resolution drops quickly, falling well below 10 PPD in peripheral vision. This may sound confusing, because we feel that we see a high amount of detail in all directions. However, our brains create this illusion while we continuously scan our surroundings. Try reading text without looking directly at the letters; it is very hard!
A digital camera's resolution across the field is defined by its field of view and pixel count. Axon Body 4 achieves about 14, 9, and 6 PPD in its 1080p, 720p, and 480p modes, respectively. There can be slight variation across the field of view due to the properties of the optics, but the resolution is typically quite uniform. This is a significant difference from the human eye, where resolution changes dramatically across the field of view and the area of best resolution moves as we move our heads and eyes.
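These camera figures can be reproduced with a back-of-the-envelope calculation: divide the horizontal pixel count by the horizontal field of view. The sketch below assumes standard 16:9 frame widths of 1920, 1280, and 854 pixels for 1080p, 720p, and 480p (these widths are not stated in the article) together with the roughly 140-degree horizontal FOV quoted earlier.

```python
# Back-of-the-envelope check of the pixels-per-degree (PPD) figures above.
# Assumptions: 16:9 frame widths of 1920/1280/854 px and a ~140-degree
# horizontal FOV; real optics are not perfectly uniform across the field.
HORIZONTAL_FOV_DEG = 140  # Axon Body 4, 16:9 mode (approximate)

frame_widths_px = {"1080p": 1920, "720p": 1280, "480p": 854}

for mode, width_px in frame_widths_px.items():
    ppd = width_px / HORIZONTAL_FOV_DEG
    print(f"{mode}: ~{ppd:.0f} pixels per degree")

# For comparison: the human fovea resolves at least ~60 PPD, while
# peripheral vision falls well below 10 PPD.
```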
Let's consider a typical scenario: something happens at the edge of an officer's field of view. This observation through peripheral vision happens at low resolution, but because of the movement or change occurring in the field of view, it is enough to catch the officer's attention. The officer quickly turns their head and eyes toward the situation, and as the scene lands in foveal vision, it becomes possible to see it in high detail. In this way, the camera may initially capture a higher-resolution image of the situation than the officer's eyes, but a lower-resolution one once the officer has focused on it.
The dynamic nature of the human eye resolution is one of the fundamental reasons why it is very hard to know for sure if the officer saw or didn't see something that is visible in the video footage.
The human eye is incredibly good in low light. Alakarhu gives an example comparing human vision of the Milky Way to that of a camera. "Even if you point a big SLR camera at the Milky Way, you may not be able to capture it in video mode. It is amazing how good the human eye is, especially given how small it is. The human eye is still better than cameras in extreme low-light conditions," says Alakarhu.
However, the human eye needs a longer time to adapt to dark conditions. The full adjustment to see the Milky Way in all its glory could take 30 minutes or even more.
In contrast, Axon cameras adapt to dark or bright conditions very quickly, typically within a second or so. This behavior is quite different from the human eye. While Axon body-worn cameras don't match the full low-light performance of a fully adapted human eye, their ability to adapt quickly to changing light suits the fast-moving nature of the incidents officers respond to, making for very strong overall low-light performance.
Apart from the fundamental differences in resolution, low-light performance, and field of view, there are several other intriguing distinctions, such as dynamic range, colors, and video processing. While we don’t explore all these aspects in detail here, this article focuses on a few critical topics related to using cameras as evidence.
Under stress or intense concentration, humans may focus intently on a single object or event while neglecting their peripheral surroundings. In this kind of situation, our eyes and brain concentrate on the single most important task and ignore everything else.
A camera, by contrast, captures everything within its FOV consistently. It does not pay attention to any particular subject.
Now, if the recorded evidence is viewed later, especially frame by frame, the viewers will have a completely different experience than the officer who was involved. People viewing the video can spend a long time analyzing different areas of each frame, but in real life each frame lasts only a fraction of a second. As a result, viewers may see things that the officer didn't see at all, or at least didn't see in high resolution.
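To put that "fraction of a second" into numbers, here is a small illustrative sketch; the article does not state the recording frame rate, so the values below are common frame rates used purely as examples.

```python
# Illustrative only: how much real time a single video frame represents
# at a few common frame rates (the actual recording rate is not stated here).
for fps in (25, 30, 60):
    frame_duration_ms = 1000 / fps
    print(f"{fps} fps: each frame spans about {frame_duration_ms:.0f} ms")
```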
Unlike cameras, human memory is susceptible to change over time, influenced by stress, emotions, additional developments and subsequent recollections.
Our brain and visual system "compress" the data by storing only the things we believe to be relevant. What was the color of the jacket worn by the person you saw on the street 10 hours ago? You probably don't remember it unless it was very distinct. If it was distinct and you do remember it, your brain evidently judged that information relevant enough to store. Our memories can also fade and change over time, and the problem is that false memories feel as real as true ones.
For cameras, everything is equally relevant, everything is stored, and everything can be accessed later without changes.
On the other hand, people may see things that were not captured by the camera. There can be additional context outside the field of view, human observers may have seen more detail than the camera captured, or events may have unfolded before the camera started recording.
That's why it's essential to have both verbal statements and body camera footage to provide a more comprehensive view of incidents. Both perspectives contribute to a full picture of the objective facts, as well as the experience of those involved and the perceptions that informed their decision-making.
Axon is continually innovating to meet the evolving needs of its users. Future advancements will likely focus on making cameras smaller, lighter, and more intuitive, with enhanced capabilities that further bridge the gap between technology and human perception.
As BWCs evolve, Axon’s commitment to trustworthy imaging ensures that they will remain a cornerstone of transparent, accountable policing. By understanding and respecting the differences between human vision and digital cameras, Axon delivers tools that not only document the truth but also empower those who rely on them.
In a world increasingly shaped by digital imagery, Axon’s cameras remind us of the power of seeing—and capturing—the truth. For Alakarhu, this work — his “dream job” — is about contributing to a system of truth, justice, and understanding through principled camera design. He and his team are able to blend their passion for engineering with the opportunity to create tools that can make a meaningful impact.
In designing cameras for valued public safety professionals, his team explores the intersection of technology and human behavior, constantly striving to bridge the gap between what we see, what we think we see, and what truly happened. "We are very proud of the work we do here at Axon," says Alakarhu of the team's efforts to support those on the front lines.
We would like to thank Associate Professor Soile Nymark of Tampere University for her insightful and valuable comments regarding this article.