June 7, 2016
If businesses could sense emotion through technology at all times, they could capitalize on it to sell to consumers at the opportune moment. Artificial intelligence focused on emotion recognition is a new frontier that could have huge consequences not only in advertising, but also in new startups, healthcare, wearables, education, and more. There are many applications and API-accessible software packages online that parallel the human ability to discern emotive gestures. These algorithm-driven APIs use facial detection and semantic analysis to interpret mood from photos, videos, text, and speech.
The visual detection market is expanding rapidly. It was recently estimated that the global advanced facial recognition market will grow from $2.77 billion in 2015 to $6.19 billion in 2020. Emotion recognition takes facial detection and recognition a step further, and its use cases are nearly endless.
User responses to video games, commercials, or products can all be tested at a larger scale, with large amounts of data collected automatically and therefore more efficiently. Technology that reveals feelings has also been proposed as a way to spot struggling students in a classroom, or to help autistic people interact better with others. Some use cases include:
Helping to better measure TV ratings.
Adding another layer of security at malls, airports, sports arenas, and other public venues to detect malicious intent.
Wearables that help autistic people discern emotion.
Checkout counters and virtual shopping.
Creating new virtual reality experiences.
From a scientific point of view, emotive analytics is an interesting blend of psychology and technology. A core set of emotions has been shown to be universally conveyed by facial expressions around the world: happiness, surprise, fear, anger, disgust, and sadness. The real challenge, then, is to reliably detect and extract micro-expressions by analyzing the relationship between points on the face and the modeled emotions.
The first crucial step is therefore to choose the right features. Two types of features are usually considered: geometric features and appearance features. The former represent the face in terms of the shape and location of its principal characteristics, such as the nose, mouth, and eyes; the latter describe the face in terms of texture, and so take wrinkles, bulges, and furrows into account. This is why we often see something like a fishnet mask over our faces when trying emotion recognition demo software. The mask is composed of our characteristic facial landmarks [3,4], which are unique to each person and, at the same time, general enough to describe a face (model).
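To make the idea of geometric features a bit more concrete, here is a minimal sketch in Python. The landmark names (`left_eye`, `mouth_left`, and so on) and the three features are assumptions chosen for illustration, not the layout of any particular tracker; the only point is that distances between landmarks, divided by the distance between the eyes, give a description of the face that does not depend on image scale.

```python
import math


def distance(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def geometric_features(landmarks):
    """Return scale-invariant shape features from 2D landmark coordinates.

    `landmarks` maps hypothetical landmark names to (x, y) pixel positions.
    """
    # Normalize by the inter-ocular distance so the features do not
    # depend on how large the face appears in the image.
    eye_dist = distance(landmarks["left_eye"], landmarks["right_eye"])
    if eye_dist == 0:
        raise ValueError("eye landmarks coincide; cannot normalize")
    return {
        "mouth_width": distance(landmarks["mouth_left"], landmarks["mouth_right"]) / eye_dist,
        "mouth_opening": distance(landmarks["upper_lip"], landmarks["lower_lip"]) / eye_dist,
        "brow_raise": distance(landmarks["left_brow"], landmarks["left_eye"]) / eye_dist,
    }


# Made-up coordinates for a roughly neutral face.
neutral_face = {
    "left_eye": (30.0, 40.0), "right_eye": (70.0, 40.0),
    "left_brow": (30.0, 30.0),
    "mouth_left": (35.0, 80.0), "mouth_right": (65.0, 80.0),
    "upper_lip": (50.0, 76.0), "lower_lip": (50.0, 84.0),
}
features = geometric_features(neutral_face)
```

Appearance features (wrinkles, bulges, furrows) would need the pixel data itself and are not covered by this sketch.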
The second step is to define a decision or classification rule that associates the synthetic representation of the face with the corresponding facial expression. Expression recognition methods fall into two types of analysis: one focuses on extracting the emotion (with or without a reference) from a single frame [6, 7], while the other considers a sequence of images of a single person [5, 8]. If these two types of analysis are not combined in some way, performance can be limited. For example, a classification method generally assigns any two examples with the same features to the same class. When modelling the perception of a mood state, however, facial expressions are ambiguous and different people may perceive the same expression differently, which leads to errors in assessing the emotion. Moreover, human expertise needs to be incorporated to simplify, accelerate, and improve the modelling process.
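As a rough illustration of how single-frame and sequence analysis can be combined, the sketch below scores each frame against a few class prototypes and then averages the scores over a sequence. It is a simple nearest-prototype rule, not the method of any of the cited papers, and the prototypes, the two-dimensional feature space (mouth width, brow raise), and the sharpness constant are all hypothetical values picked for the example.

```python
import math
from collections import defaultdict

# Hypothetical class prototypes in a 2D feature space
# (mouth_width, brow_raise); a real system would learn these from data.
PROTOTYPES = {
    "happiness": (0.95, 0.25),
    "surprise": (0.70, 0.40),
    "neutral": (0.75, 0.25),
}
SHARPNESS = 10.0  # assumed constant: larger values give more peaked scores


def frame_scores(features):
    """Single-frame analysis: normalized scores from distances to prototypes."""
    weights = {
        label: math.exp(-SHARPNESS * math.dist(features, prototype))
        for label, prototype in PROTOTYPES.items()
    }
    total = sum(weights.values())
    return {label: weight / total for label, weight in weights.items()}


def classify_frame(features):
    """Pick the best-scoring label for one frame."""
    scores = frame_scores(features)
    return max(scores, key=scores.get)


def classify_sequence(frames):
    """Sequence analysis: sum per-frame scores, then pick the best label."""
    if not frames:
        raise ValueError("need at least one frame")
    totals = defaultdict(float)
    for features in frames:
        for label, score in frame_scores(features).items():
            totals[label] += score
    return max(totals, key=totals.get)


# Two frames close to "happiness" and one close to "surprise".
frames = [(0.93, 0.26), (0.96, 0.24), (0.72, 0.38)]
sequence_label = classify_sequence(frames)
```

Here the third frame on its own is closest to the surprise prototype, while the sequence as a whole comes out as happiness, which is the kind of disagreement that pooling frames is meant to smooth out.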
Therefore, new approaches [1] tend to model the possible ambiguities in human perception of static facial expressions and improve the descriptiveness of a face by introducing a more complete set of contextual features. In August 2006, Sorci and Antonini [2] published the internet facial expressions evaluation survey in order to capture humans’ perception of facial expressions directly. The aim of the survey is to collect a dataset created by a sample of real human observers from all around the world, with different jobs, cultural backgrounds, ages, and genders, belonging to different ethnic groups, and taking the survey from different places (work, home, while travelling, etc.). The images used in the survey come from the Cohn–Kanade database [9].
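One simple way to picture what such survey data offers is to keep, for each image, the whole distribution of observers' answers instead of a single "correct" label. The sketch below is only an illustration of that idea and not the model used by the survey's authors; the vote counts are made up, and the entropy measure is an assumed way to quantify how much observers disagree.

```python
import math
from collections import Counter


def perception_distribution(votes):
    """Turn a list of observer labels into a probability distribution."""
    counts = Counter(votes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one vote")
    return {label: n / total for label, n in counts.items()}


def ambiguity(distribution):
    """Shannon entropy in bits: 0 means every observer gave the same label."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)


# Made-up votes from ten observers looking at the same image.
votes = ["fear"] * 5 + ["surprise"] * 4 + ["sadness"]
distribution = perception_distribution(votes)
disagreement = ambiguity(distribution)
```

A classifier trained against distributions like this one can be judged on how well it matches what people actually perceive, not only on whether it hits one hard label.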
Computer vision APIs for mood recognition use facial detection, eye tracking, and specific facial position cues to determine a subject’s mood. There are many APIs that scan an image or video to detect faces, but these go the extra mile and return an emotive state. This is often a combination of weights assigned to 7 basic emotions, plus valence, the subject’s overall sentiment. Here is a list of the most popular APIs for emotion recognition.
1 – Google Cloud Vision
Google has released a beta of its latest cloud-based application programming interface, which can detect faces, signs, landmarks, objects, text, and even emotions within a single image. The Cloud Vision API can also detect facial features, allowing it to find images that display certain emotions. This means Google’s cloud platform can technically detect fear just as well as it can identify a taco or a goldfish. It is a paid service: developers can begin working with the Google Cloud Vision API starting today, with the first 1,000 uses of each feature free per month. Those looking to sift through more photos than that can expect to pay between $0.60 and $5 a month per feature, depending on usage. As always, it is worth watching Google’s next moves in this field, since it has often benefited from putting simple and powerful tools in developers’ hands.
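For developers who want a feel for what a face request looks like, here is a small sketch based on the public v1 REST interface (`images:annotate` with a `FACE_DETECTION` feature), which may differ from the beta described above. It only builds the JSON request body and reads the likelihood fields from a response; authentication and the HTTP call itself are left out, and `sample_response` is a hand-written illustration, not real API output.

```python
import base64

# v1 REST endpoint; the body built below would be POSTed here with credentials.
ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"

# Likelihood values in increasing order of confidence.
LIKELIHOODS = ["UNKNOWN", "VERY_UNLIKELY", "UNLIKELY",
               "POSSIBLE", "LIKELY", "VERY_LIKELY"]

# Emotion-related fields of a face annotation.
EMOTION_FIELDS = {
    "joy": "joyLikelihood",
    "sorrow": "sorrowLikelihood",
    "anger": "angerLikelihood",
    "surprise": "surpriseLikelihood",
}


def build_face_request(image_bytes, max_results=5):
    """Build the JSON body for a face detection request on one image."""
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [{"type": "FACE_DETECTION", "maxResults": max_results}],
        }]
    }


def likely_emotions(response, threshold="LIKELY"):
    """Return, for each detected face, the emotions rated at or above threshold."""
    minimum = LIKELIHOODS.index(threshold)
    faces = []
    for result in response.get("responses", []):
        for face in result.get("faceAnnotations", []):
            faces.append([
                emotion for emotion, field in EMOTION_FIELDS.items()
                if LIKELIHOODS.index(face.get(field, "UNKNOWN")) >= minimum
            ])
    return faces


# Hand-written example of the response shape, for illustration only.
sample_response = {"responses": [{"faceAnnotations": [{
    "joyLikelihood": "VERY_LIKELY",
    "sorrowLikelihood": "VERY_UNLIKELY",
    "angerLikelihood": "VERY_UNLIKELY",
    "surpriseLikelihood": "POSSIBLE",
}]}]}
detected = likely_emotions(sample_response)
```

Lowering `threshold` to `"POSSIBLE"` would also report surprise for this sample face; where to set the cut-off is a judgment call that depends on the application.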
2 – Project Oxford by Microsoft
Microsoft’s Project Oxford is a catalogue of artificial intelligence APIs focused on computer vision, speech, and language analysis. The basic Emotion API works with photos, while the Emotion API for Video recognizes the facial expressions of people in a video and returns an aggregate summary of their emotions. You can use this API to track how a person or a crowd responds to your content over time. The emotions detected are anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise. Upload a photo to the free online demo to test Project Oxford’s computer vision capabilities. Like Google, Microsoft also seems to be very interested in the developer community, which has already helped Microsoft expand the Kinect market.
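To show what tracking a response over time can look like once per-frame scores are available, the sketch below averages scores inside fixed time windows and reports the dominant emotion in each. The emotion names are the eight listed above; the `(timestamp, scores)` sample layout and the windowing scheme are assumptions for illustration, not the format or the aggregation Microsoft's service actually uses.

```python
from collections import defaultdict

EMOTIONS = ("anger", "contempt", "disgust", "fear",
            "happiness", "neutral", "sadness", "surprise")


def aggregate_by_window(samples, window_seconds=1.0):
    """Average emotion scores per time window.

    `samples` is an iterable of (timestamp_seconds, scores) pairs, where
    `scores` maps emotion names to values; missing emotions count as 0.
    """
    sums = defaultdict(lambda: dict.fromkeys(EMOTIONS, 0.0))
    counts = defaultdict(int)
    for timestamp, scores in samples:
        window = int(timestamp // window_seconds)
        counts[window] += 1
        for emotion in EMOTIONS:
            sums[window][emotion] += scores.get(emotion, 0.0)
    return {
        window: {e: sums[window][e] / counts[window] for e in EMOTIONS}
        for window in sorted(sums)
    }


def dominant_per_window(samples, window_seconds=1.0):
    """Return the highest-scoring emotion for each time window."""
    means = aggregate_by_window(samples, window_seconds)
    return {window: max(m, key=m.get) for window, m in means.items()}


# Made-up scores for one viewer over two seconds of content.
samples = [
    (0.2, {"neutral": 0.8, "happiness": 0.2}),
    (0.7, {"neutral": 0.6, "happiness": 0.4}),
    (1.1, {"happiness": 0.9, "neutral": 0.1}),
    (1.6, {"happiness": 0.7, "surprise": 0.3}),
]
timeline = dominant_per_window(samples)
```

For a crowd, the same function can be fed the samples of every detected face, so that each window reflects the average reaction of the audience.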
3 – Emotient
Emotient is now part of iMotions, which syncs with Emotient’s facial expression technology and adds extra layers to detect confusion and frustration. The iMotions API can monitor live video feeds to extract valence, or can aggregate previously recorded videos to analyze them for emotions. Particularly interesting is the possibility of combining it with stimuli, eye tracking, EEG, GSR, and other sensors, as well as its application to sensitive situations such as driving.
4 – Affectiva
With 3,289,274 faces analyzed to date, Affectiva is another solution for massive-scale engagement detection. They offer SDKs and APIs for mobile developers, and provide nice visual analytics to track expressions over time. Visit their test demo to graph data points in response to viewing various ads. The gaming perspective is also interesting: a new generation of games will adapt to our emotions.
5 – Kairos
The Emotion Analysis API by Kairos is a scalable, on-demand tool: you send them video, and they send back coordinates that detect smiles, surprise, anger, dislike, and drowsiness. They offer a free demo (no account setup required) that will analyze and graph your facial responses to a few commercial ads. The Kairos repo could be a developer favorite. It has transparent documentation for its Face Recognition API, Crowd Analytics SDK, and Reporting API. The Emotion Analysis API only recently went live.
6 – FacioMetrics
Founded at Carnegie Mellon University (CMU), FacioMetrics is a company that provides SDKs for incorporating face tracking, pose and gaze tracking, and expression analysis into apps. Moreover, they implement Dense 3D tracking to improve results. Their demo video outlines some creative use cases in virtual reality scenarios. The software can be tested using the Intraface iOS app.
Bonus – Face++
Face++ is more of a face recognition tool that compares faces with stored faces, which makes it perfect for name-tagging photos in social networks. It does, however, determine whether a subject is smiling or not. Face++ has a wide set of developer SDKs and examples in various languages, and an online demo. Moreover, the “default” demo shows the concept of facial landmarks, which are the basis for all applications involving the face, whether they focus on recognition, mood detection, authentication, or alteration.
Wondering why there’s nothing about FANCI mood recognition? We’ll talk about it very soon…
[1] M. Sorci, G. Antonini, J. Cruz, T. Robin, M. Bierlaire, J.-Ph. Thiran, Modelling human perception of static facial expressions, Image and Vision Computing 28 (2010) 790–806.
[2] M. Sorci, G. Antonini, J.-P. Thiran, M. Bierlaire, Facial Expressions Evaluation Survey, iTS (2007).
[3] C. Hu, Y. Chang, R. Feris, M. Turk, Manifold based analysis of facial expression, in: CVPRW ’04: Proceedings of the 2004 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW’04), vol. 5, IEEE Computer Society, Washington, DC, USA, 2004, p. 81.
[4] Y. Zhang, Q. Ji, Active and dynamic information fusion for facial expression understanding from image sequences, Transactions on Pattern Analysis and Machine Intelligence 27 (5) (2005) 699–714.
[5] I. Cohen, N. Sebe, L. Chen, A. Garg, T.S. Huang, Facial expression recognition from video sequences: temporal and static modeling, Computer Vision and Image Understanding (10) (2003) 160–187.
[6] G. Antonini, M. Sorci, M. Bierlaire, J. Thiran, Discrete choice models for static facial expression recognition, in: J. Blanc-Talon, W. Philips, D. Popescu, P. Scheunders (Eds.), 8th International Conference on Advanced Concepts for Intelligent Vision Systems, Lecture Notes in Computer Science, vol. 4179, Springer, Berlin/Heidelberg, 2006, pp. 710–721.
[7] M. Pantic, L.J.M. Rothkrantz, An expert system for recognition of facial actions and their intensity, in: National Conference on Artificial Intelligence (AAAI), 2000, pp. 1026–1033.
[8] I.A. Essa, A.P. Pentland, Coding, analysis, interpretation, and recognition of facial expressions, IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (7) (1997) 757–763.
[9] T. Kanade, J. Cohn, Y.L. Tian, Comprehensive database for facial expression analysis, in: Proceedings of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition (FG’00), 2000, pp. 46–53.