Difference between revisions of "Smartphone Facial Recognition"

Revision as of 23:34, 21 October 2022

By Kenneth Wu

Note: This page is incomplete.

Facial recognition systems are computer programs that match faces against a database. Although recognizing faces is trivial for humans, computers have only recently achieved high levels of accuracy.^[1] Deep learning through the use of convolutional neural networks currently dominates the facial recognition field.^[2] However, deep learning uses much more memory, disk storage, and computational resources than traditional computer vision, which makes facial recognition challenging on the limited hardware of smartphones.^[3] Accordingly, smartphone manufacturers have adopted processors with dedicated neural engines for deep learning tasks^[4] and have created simpler, more compact models that mimic the behavior of more complex models.^[3]

Model

Facial recognition systems work in three steps: detecting the presence of a face, analyzing its features, and confirming the identity of the person.^[5] Training data is fed into a face detection algorithm; the two most popular such algorithms are the Viola-Jones algorithm and convolutional neural networks.^[6]

Viola-Jones Algorithm

The Viola-Jones algorithm was the first real-time object detection framework. It works by converting images to grayscale and looking for edges that signify the presence of human features.^[7] Although it is highly accurate at detecting well-lit, front-facing faces and requires relatively little memory, it is slower than deep-learning-based methods, including the now industry-standard convolutional neural networks (CNNs).^[6]
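The edge search described above can be illustrated with a minimal sketch. It assumes the image is already grayscale and uses the summed-area table ("integral image") and two-rectangle features introduced in the Viola-Jones paper;^[7] the function names and the 4 x 4 patch are hypothetical, and the boosted cascade that combines many such features is omitted.

```python
def integral_image(gray):
    """Build a summed-area table with an extra leading row and column of zeros."""
    h, w = len(gray), len(gray[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum the pixels of a rectangle using only four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def vertical_edge_feature(ii, top, left, height, width):
    """Two-rectangle feature: brightness of the right half minus the left half."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return right_sum - left_sum


# Hypothetical grayscale patch: dark left half, bright right half.
patch = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
ii = integral_image(patch)
edge_response = vertical_edge_feature(ii, 0, 0, 4, 4)
print(edge_response)  # large magnitude suggests a vertical edge in the patch
```

Because every rectangle sum costs four lookups regardless of its size, many such features can be evaluated quickly, which is consistent with the algorithm's low memory footprint noted above.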

Convolutional Neural Networks

Convolutional neural networks are closely related to artificial neural networks (ANNs). Unlike in traditional ANNs, the neurons in a CNN layer are arranged in three dimensions (width, height, and depth) and connect only to a subset of the preceding layer.^[8] Their architecture is sparse, topographic, and feed-forward,^[9] featuring an input layer and an output layer along with three types of hidden layers.^[10]

The first hidden layer type is convolutional. A filter of size n x n sweeps across a larger matrix at a fixed stride, and the dot product computed at each position is written to an activation map.^[8] Because the same small filter is reused at every position, this greatly reduces the amount of information stored compared with an ANN.^[8] Because convolution is a linear operation, another step is needed to introduce non-linearity; the most popular is the rectified linear unit (ReLU), which replaces negative values with zero.^[10]

The next hidden layer type is pooling, which further reduces the size of the data and the computational power required.^[11] The most common pooling method sweeps a window, usually 2 x 2 with a stride of 2, over the activation map and passes the largest value in each window to the next layer.^[11]

The final hidden layer type is the fully-connected layer, in which neurons are fully connected to the two adjacent layers, as in an ANN.^[8] CNNs are usually configured in one of two ways: the first stacks convolutional layers, which then pass to a stack of pooling layers; the second alternates between two stacked convolutional layers and a pooling layer.^[8]
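The convolution, ReLU, and pooling steps described above can be sketched in plain Python. This is an illustrative sketch rather than a reference implementation: the 5 x 5 input and the 2 x 2 filter values are hypothetical, there is no padding, and, as in most CNN libraries, the filter is applied without flipping (strictly a cross-correlation).

```python
def conv2d(image, kernel, stride=1):
    """Slide an n x n kernel over the image; each output cell is a dot product."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for a in range(k):
                for b in range(k):
                    total += image[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(total)
        out.append(row)
    return out


def relu(fmap):
    """Replace every negative value with zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2, stride=2):
    """Keep the largest value in each size x size window."""
    out_h = (len(fmap) - size) // stride + 1
    out_w = (len(fmap[0]) - size) // stride + 1
    return [
        [
            max(
                fmap[i * stride + a][j * stride + b]
                for a in range(size)
                for b in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 5 x 5 input and 2 x 2 filter.
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [2, 0, 1, 0, 2],
    [1, 3, 0, 2, 0],
    [0, 1, 2, 1, 1],
]
kernel = [[1, -1], [-1, 1]]

activation = relu(conv2d(image, kernel))  # 4 x 4 map
pooled = max_pool(activation)             # 2 x 2 map
print(pooled)  # [[3, 4], [4, 3]]
```

The 25 input values shrink to 16 after convolution and to 4 after pooling, which illustrates the size reduction that pooling provides.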

Deep CNNs are the go-to method for supervised training and, given a large enough training data set, are even capable of unsupervised classification.^[12] Training produces learned weights, which encode patterns or rules extracted from the provided images.^[13] The trained filter values extract the visual features of an input image, which the system can then compare against its existing database to find a match.^[13] Once trained, a model can be retrained to recognize faces that were not in the original training set, a process known as transfer learning.^[14] In this process, the weights for feature extraction (finding the features in an image) are retained, while the weights for classification are changed.^[14] In this way, smartphones can learn new faces after they have already been trained.
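The comparison step can be sketched as a nearest-match search over stored feature vectors. The sketch assumes a fixed feature extractor has already turned each face into a short vector; the names, the three-element vectors, the cosine-similarity measure, and the 0.9 threshold are all hypothetical choices for illustration. Adding a vector to the database is a simplified stand-in for learning a new face, not the retraining of classification weights described above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(features, database, threshold=0.9):
    """Return the enrolled name most similar to `features`, or None if below threshold."""
    best_name, best_score = None, threshold
    for name, enrolled in database.items():
        score = cosine_similarity(features, enrolled)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical enrolled feature vectors (real systems use far longer vectors).
database = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.1, 0.8, 0.5],
}

probe = [0.88, 0.12, 0.31]     # features close to alice's enrolled vector
stranger = [0.0, 0.1, 0.9]     # features unlike any enrolled vector

print(best_match(probe, database))     # alice
print(best_match(stranger, database))  # None

# Enrolling a new face: the feature extractor is unchanged, only the database grows.
database["carol"] = stranger
print(best_match(stranger, database))  # carol
```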

History

Applications

References

[1] Brownlee, J. (2019, July 5). A gentle introduction to deep learning for face recognition. Machine Learning Mastery. Retrieved November 14, 2022, from https://machinelearningmastery.com/introduction-to-deep-learning-for-face-recognition/
[2] Almabdy, S., & Elrefaei, L. (2019). Deep convolutional neural network-based approaches for face recognition. Applied Sciences, 9(20), 4397. https://doi.org/10.3390/app9204397
[3] Computer Vision Machine Learning Team. (2017, November). An on-device deep neural network for face detection. Apple Machine Learning Research. Retrieved November 14, 2022, from https://machinelearning.apple.com/research/face-detection#1
[4] Samsung. (2018). Exynos 9810: Mobile Processor. Samsung Semiconductor Global. Retrieved November 14, 2022, from https://semiconductor.samsung.com/processor/mobile-processor/exynos-9-series-9810/
[5] Klosowski, T. (2020, July 15). Facial recognition is everywhere. Here's what we can do about it. The New York Times. Retrieved November 14, 2022, from https://www.nytimes.com/wirecutter/blog/how-facial-recognition-works/
[6] Enriquez, K. (2018, May 15). Faster face detection using Convolutional Neural Networks & the Viola-Jones algorithm (thesis). California State University Stanislaus. Retrieved November 14, 2022, from https://www.csustan.edu/sites/default/files/groups/University%20Honors%20Program/Journals/01_enriquez.pdf
[7] Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001. https://doi.org/10.1109/cvpr.2001.990517
[8] O'Shea, K., & Nash, R. (2015). An introduction to convolutional neural networks. arXiv preprint arXiv:1511.08458.
[9] Gurucharan, M. (2022, July 28). Basic CNN architecture: Explaining 5 layers of Convolutional Neural Network. upGrad. Retrieved November 14, 2022, from https://www.upgrad.com/blog/basic-cnn-architecture/#:~:text=other%20advanced%20tasks.-,What%20is%20the%20architecture%20of%20CNN%3F,the%20main%20responsibility%20for%20computation.
[10] Mishra, M. (2020, August 26). Convolutional neural networks, explained. Towards Data Science. Retrieved November 14, 2022, from https://towardsdatascience.com/convolutional-neural-networks-explained-9cc5188c4939
[11] Saha, S. (2018, December 15). A comprehensive guide to convolutional neural network. Towards Data Science. Retrieved November 14, 2022, from https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53
[12] Guérin, J., Gibaru, O., Thiery, S., & Nyiri, E. (2018). CNN features are also great at unsupervised classification. Computer Science & Information Technology. https://doi.org/10.5121/csit.2018.80308
[13] Khandelwal, R. (2020, May 18). Convolutional Neural Network: Feature map and filter visualization. Towards Data Science. Retrieved November 14, 2022, from https://towardsdatascience.com/convolutional-neural-network-feature-map-and-filter-visualization-f75012a5a49c
[14] Tammina, S. (2019). Transfer learning using VGG-16 with deep convolutional neural network for classifying images. International Journal of Scientific and Research Publications (IJSRP), 9(10), 143–150. https://doi.org/10.29322/ijsrp.9.10.2019.p9420


