An Introduction to Computer Vision
What is Computer Vision?
Computer Vision is a subcategory of artificial intelligence that deals with how computers can gain a high-level understanding of the visual world from digital images or videos. If AI enables computers to think, computer vision enables them to see, observe and understand.
Early experiments in computer vision began in the 1950s, and its first commercial use came in the 1970s, to distinguish between typed and handwritten text. Until the 1990s, however, computer vision worked only in a limited capacity. As the internet grew in the 1990s and made large sets of images available online for analysis, facial recognition programs flourished. These growing data sets helped make it possible for machines to identify specific people in photos and videos.
Factors leading to advancements in this area –
Today, thanks to advances in artificial intelligence, deep learning and neural networks, the field of computer vision has taken great leaps and now surpasses humans in some tasks related to detecting and labelling objects. Reported accuracy rates for these tasks have gone from 50% to 99% in less than a decade.
Several factors have together contributed to bringing about a renaissance in computer vision:
- Mobile phones with built-in cameras have saturated the world with photos and videos (more than 3 billion images are shared online every day).
- The computing power required to analyze the data has become more affordable and easily accessible.
- Hardware designed specifically for computer vision and analysis is more widely available.
- New algorithms like convolutional neural networks can take advantage of these hardware and software capabilities, which improves object detection.
The effects of these advances on the computer vision field have been astounding: the computer vision and hardware market is estimated to reach $48.6 billion by the end of 2022.
How does Computer Vision work?
Computer vision works in three basic steps:
- Capturing an image: Single images, or even large sets of them, can be acquired in real time through video, photos or 3D technology for analysis.
- Processing the image: Deep learning models are used to automate much of this process, but the models are often trained by first being fed thousands of labelled or pre-identified images.
- Understanding the image: This is the final step where an object is identified or classified.
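The three steps above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the 3x3 "images", the `capture`, `process` and `understand` function names, and the nearest-example rule are all hypothetical stand-ins, and no deep learning model is involved. It simply shows how labelled examples feed the final classification step.

```python
from statistics import mean

# Hypothetical 3x3 grayscale "images" (0 = black, 255 = white), labelled by hand.
LABELLED = [
    ([[0, 10, 5], [20, 0, 15], [5, 10, 0]], "dark"),
    ([[250, 240, 255], [230, 245, 250], [255, 235, 240]], "bright"),
]


def capture():
    """Step 1: stand-in for a camera; returns one grayscale image."""
    return [[200, 220, 210], [215, 205, 225], [210, 230, 220]]


def process(image):
    """Step 2: reduce the image to a single feature, its mean pixel value in [0, 1]."""
    pixels = [p for row in image for p in row]
    return mean(pixels) / 255


def understand(feature, labelled=LABELLED):
    """Step 3: return the label of the labelled image whose feature is closest."""
    closest = min(labelled, key=lambda item: abs(process(item[0]) - feature))
    return closest[1]


label = understand(process(capture()))
print(label)
```

A real system would replace the single brightness feature with features learned by a neural network, but the flow from captured pixels to a predicted label is the same.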
Today’s AI systems can go a step further and take actions based on an understanding of the image. Many types of computer vision are used in different ways:
- Image classification – It sees an image and can classify it (a dog, an apple, a person’s face). More precisely, it is able to accurately predict that a given image belongs to a certain class.
- Object detection – It identifies a specific object in an image. Advanced object detection recognises many objects in a single image: a football field, an offensive player, a defensive player, a ball and so on. These models use X and Y coordinates to create a bounding box and identify everything inside the box.
- Image segmentation – It partitions an image into multiple pieces or regions to be examined separately.
- Facial identification – It is an advanced type of object detection that not only recognizes a human face in an image but identifies a specific individual.
- Edge detection – It is a technique used to identify the outside edge of an object or landscape to better identify what is in the image.
- Pattern detection – It is a process of recognizing repeated shapes, colours and other visual indicators in images.
- Feature matching – It is a type of pattern detection that matches similarities in images to help classify them.
- Scene reconstruction – It creates a 3D model of a scene inputted through images or video.
- Image restoration – It removes degradations such as noise and blur from photos using machine-learning-based filters.
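To make one of the techniques in this list concrete, here is a minimal sketch of edge detection. It assumes the Sobel operator, a classic gradient filter that the list above does not name, and it works on a plain list-of-lists grayscale image; the `sobel_magnitude` function name is hypothetical. Production code would normally rely on an image-processing library instead.

```python
import math

# Sobel kernels: KX responds to horizontal intensity changes, KY to vertical ones.
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude of a 2D grayscale image.

    Border pixels are left at 0.0 because the 3x3 kernel does not fit there.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    pixel = image[y + j - 1][x + i - 1]
                    gx += KX[j][i] * pixel
                    gy += KY[j][i] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


# A 5x6 test image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_magnitude(image)
print(edges[2])
```

The response is large only in the two columns where dark meets bright, and zero in the flat regions on either side, which is exactly the "outside edge" the list item describes.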
Any simple application of computer vision may only use one of these techniques, but more advanced uses, like computer vision for self-driving cars, rely on multiple techniques to accomplish their goal.
There is a lot of research being done in the computer vision field, but it’s not just research. Real-world applications demonstrate how important computer vision is to endeavours in business, security, transportation, healthcare, and everyday life.
- Computer Vision in Self-Driving Cars – Self-driving cars have been a major focus of the automobile industry over the past few years, and computer vision promises to turn them into reality. YOLO (You Only Look Once) is an immensely popular object detection algorithm used in autonomous driving to efficiently detect objects in the vehicle's path.
- Computer Vision in Facial Recognition – Computer vision also plays an important role in facial recognition applications, the technology that enables computers to match images of people’s faces to their identities. Consumer devices use facial recognition to authenticate the identities of their owners. Social media apps use facial recognition to detect and tag users. Law enforcement agencies also rely on facial recognition technology to identify criminals in video feeds. Siamese networks are one architecture used to carry out facial recognition.
- Computer Vision in Augmented Reality – Computer vision also plays an important role in augmented reality, the technology that enables computing devices such as smartphones, tablets and smart glasses to overlay and embed virtual objects on real-world imagery. Using computer vision, AR systems can register real and virtual objects in 3D so that the virtual content stays aligned with the scene.
- Computer Vision in Healthcare – Computer-aided diagnosis can assist medical professionals in training. Using computer vision, doctors can interpret medical images, such as those produced by X-ray and MRI, more efficiently.
- Computer Vision in Business via Machine Vision – Machine vision is defined as a set of methods that enable image-based automation for business operations like process control, automated inspection and robot guidance. It is a branch of systems engineering that integrates existing technologies in new ways and uses them to solve real-world problems.
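The Siamese network mentioned under facial recognition can be illustrated with a small sketch. The core idea is that both face images pass through the same encoder with shared weights, and the distance between the two resulting embeddings decides whether they show the same person. Everything concrete below is an assumption made for illustration: the hand-picked `WEIGHTS` stand in for a trained encoder, the 4-value inputs stand in for face images, and the names `embed` and `same_person` and the 0.5 threshold are hypothetical.

```python
import math

# Hypothetical, hand-picked weights standing in for a trained encoder.
WEIGHTS = [
    [0.5, -0.2, 0.1, 0.0],
    [0.0, 0.3, -0.4, 0.6],
]


def embed(features):
    """Map a 4-value input to a 2-value embedding using the shared weights."""
    return [sum(w * f for w, f in zip(row, features)) for row in WEIGHTS]


def same_person(a, b, threshold=0.5):
    """Both inputs go through the same embed(); a small distance means a match."""
    return math.dist(embed(a), embed(b)) < threshold


# Made-up feature vectors: two photos of one person and one photo of another.
alice_1 = [1.0, 0.2, 0.4, 0.9]
alice_2 = [0.9, 0.3, 0.4, 1.0]
bob = [0.1, 0.9, 0.8, 0.1]

print(same_person(alice_1, alice_2))
print(same_person(alice_1, bob))
```

In a real system the encoder is a deep network trained so that images of the same person land close together, and the threshold is tuned on validation data rather than chosen by hand.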
These are just a few applications of Computer Vision. If you want to dive deeper into CV Applications, a link is provided at the end of this article.
Why is Computer Vision difficult to implement?
- Mission-critical Computer Vision Use Cases Depend on Edge AI.
- Computer Vision Is Difficult Because Hardware Limits It.
- The Complexity of Scaling Computer Vision Systems.
Though much progress has been made, both in charting the process and in terms of discovering the tricks and shortcuts used in Computer Vision, we are far away from utilizing the power of computer vision to the fullest. However, with the recent developments in this field, the future looks bright for Computer Vision.
Interesting links related to this topic:
- Stanford University Video Playlist on Computer Vision – https://www.youtube.com/watch?v=vT1JzLTH4G4&list=PLf7L7Kg8_FNxHATtLwDceyh72QQL9pvpQ
- Roadmap to Computer Vision – https://towardsdatascience.com/roadmap-to-computer-vision-79106beb8be4
- 27+ Most Popular Computer Vision Applications and Use case in 2022 – https://www.v7labs.com/blog/computer-vision-applications
- Computer Vision project ideas – https://www.youtube.com/watch?v=ud7MjqDtP-Q
- Computer Vision Algorithms – https://www.sciencedirect.com/topics/computer-science/computer-vision-algorithms