A list of research papers on face recognition
By Josh Davis
This case report describes novel methodology used to identify a 43-year-old post-mortem photo of a drowned male recovered from a London river in the 1970s. Embedded in an array of foils, police super-recognisers (n = 25) possessing superior simultaneous face matching ability, and police controls (n = 139) provided confidence ratings as to the similarity of the post-mortem photo to an ante-mortem photo of a man who went missing at about the same time. Indicative of a match, compared to controls, super-recognisers provided higher ratings to the target than the foils. Effects were enhanced when drawing on the combined wisdom of super-recogniser crowds, but not control crowds. These findings supported additional case evidence allowing the coroner to rule that the deceased male and missing male were likely one and the same person. A description of how similar super-recogniser wisdom of the crowd procedures could be applied to other visual image identification cases when no other method is feasible is provided.
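The crowd procedure described above can be illustrated with a minimal sketch. The abstract does not say how individual ratings were combined, so the plain mean used here is an assumption, and the rater names, image labels and rating values are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def crowd_scores(ratings, crowd_size):
    """Average each image's ratings over every possible crowd of the given size.

    ratings maps rater -> {image: rating}. Returns one {image: mean rating}
    dict per crowd. Simple averaging is an assumed aggregation rule.
    """
    raters = sorted(ratings)
    results = []
    for crowd in combinations(raters, crowd_size):
        images = ratings[crowd[0]].keys()
        results.append({img: mean(ratings[r][img] for r in crowd) for img in images})
    return results


def target_preferred_rate(crowd_results, target):
    """Proportion of crowds whose mean rating is strictly highest for the target."""
    hits = sum(
        1
        for scores in crowd_results
        if all(scores[target] > value for img, value in scores.items() if img != target)
    )
    return hits / len(crowd_results)


# Hypothetical similarity ratings (higher = more similar to the ante-mortem photo).
ratings = {
    "rater_a": {"target": 6, "foil_1": 3, "foil_2": 7},
    "rater_b": {"target": 5, "foil_1": 6, "foil_2": 2},
    "rater_c": {"target": 7, "foil_1": 2, "foil_2": 4},
}

solo = target_preferred_rate(crowd_scores(ratings, 1), "target")
trio = target_preferred_rate(crowd_scores(ratings, 3), "target")
print(f"individuals preferring target: {solo:.2f}, crowds of three: {trio:.2f}")
```

In this toy data only one of the three individual raters scores the target highest, yet the pooled crowd of three does, which is the kind of enhancement the abstract reports for super-recogniser crowds.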
Super-recognisers occupy the extreme top end of a wide spectrum of human face recognition ability. Although test scores provide evidence of super-recognisers’ quantitative superiority, their abilities may be driven by qualitatively different cognitive or neurological mechanisms. Some super-recognisers scoring exceptionally highly on multiple short-term face memory tests do not achieve superior performances on measures of simultaneous face matching, long term face memory and/or spotting faces in a crowd. Heterogeneous performance patterns have implications for police, security or business aiming to utilise super-recognisers’ superior skills. Drawing on a global participant base (n ≈ 6,000,000), as well as theory and empirical research, this paper describes the background, development, and employment of tests designed to measure four components of superior face processing to assist in recruitment and deployment decisions.
It is suggested that accurate personality judgments of faces are driven by a morphological ‘kernel of truth’ from face shape. We hypothesised that this relationship could lead to those with better face identification ability being better at personality judgments. We investigated the relationship between face memory, face matching, Big Five personality traits, and accuracy in recognising Big Five personality traits from 50 photographs of unknown faces. In our sample (n = 792) there was overall good (but varying) face memory and personality judgment accuracy. However, there was convincing evidence that these two skills do not correlate (all r < 0.06). We also replicate the known relationship between extraversion and face memory ability in the largest sample to date.
Super-recognisers inhabit the extreme high end of an adult face processing ability spectrum in the population. While almost all research in this area has evaluated those with poor or mid-range abilities, evaluating whether super-recognisers’ superiority generates distinct electrophysiological brain activity, and extends to faces of different age groups (i.e., children's), is important for enhancing theoretical understanding of normal and impaired face processing. It may also be crucial for policing, as super-recognisers may be deployed to operations involving child identification and protection. In Experiment 1, super-recognisers (n = 315) outperformed controls (n = 499) at adult and infant face recognition, while also displaying larger cross-age effects. These findings were replicated in Experiment 2 (super-recognisers, n = 19; controls, n = 28), although one super-recogniser with frequent infant exposure showed no cross-age effect. Compared to controls, super-recognisers also generated significantly greater electrophysiological activity in event-related potentials associated with pictorial processing (P1) and explicit recognition (P600). Experiment 3, employing an upright and inverted sequential matching design, found super-recognisers (n = 24) outperformed controls (n = 20) at adult and infant face matching, but showed no upright cross-age matching effects. Instead, they displayed larger inversion effects, and cross-age inversion effects, implicating the role of holistic processing in their perceptual superiority. Larger cross-age effects in recognition, but not matching, suggest that super-recognisers’ adult face recognition is partly driven by experience. However, their enhanced infant face recognition suggests super-recognisers’ superiority is also experience-independent, results that have implications for policing and for models of face recognition.
Police worldwide regularly review closed‐circuit television (CCTV) evidence in investigations. This research found that London police experts who work in a full‐time “Super‐Recogniser Unit” and front line police identifiers regularly making suspect identifications from CCTV possessed superior unfamiliar face recognition ability and, with higher levels of confidence, outperformed controls at locating actors in a bespoke Spot the Face in a Crowd Test. Police were also less susceptible to change blindness errors and possessed higher levels of conscientiousness and lower levels of neuroticism and openness. Controls who took part in Spot the Face in a Crowd Test actor familiarisation training outperformed untrained controls, suggesting this exercise might enhance identification of persons of interest in real investigations. This research supports an accumulating body of evidence demonstrating that international police forces may benefit from deploying officers with superior face recognition ability to roles such as CCTV review, as these officers may be the most likely to identify persons of interest.
Durova, M. D., Dimou, A., Litos, G., Daras, P., & Davis, J. P. (2017). TooManyEyes: Super-recogniser directed identification of target individuals on CCTV. Proceedings of the 8th IET International Conference on Imaging for Crime Detection and Prevention (ICDP-17), IET Digital Library, 43-48. DOI.org/10.5281/zenodo.1071986
For the current research, a ‘Spot the Face in a Crowd Test’ (SFCT) comprising six video clips depicting target-actors and multiple bystanders was loaded on TooManyEyes, a bespoke multi-media platform adapted here for the human-directed identification of individuals in CCTV footage. To test the utility of TooManyEyes, police ‘super-recognisers’ (SRs) who may possess exceptional face recognition ability, and police controls, attempted to identify the target-actors from the SFCT. As expected, SRs correctly identified more target-actors, and with higher confidence, than controls. As such, the TooManyEyes system provides a useful platform for uploading tests for selecting police or security staff for CCTV review deployment.
The deployment of police super-recognisers (SRs) with exceptional face recognition ability has transformed the manner in which some forces manage CCTV evidence. In London, SRs make high numbers of suspect identifications from CCTV, some of them of disguised suspects. In two experiments measuring immediate and one-week memory of unfamiliar faces in disguise, SRs were more accurate and confident than controls at correctly identifying targets, and ruling out faces not seen before. Accuracy and confidence were highest when targets wore no disguise, followed by hat and plaster, sunglasses, and balaclavas respectively. Even in the balaclava condition, SRs were more accurate than chance levels. These findings add to an accumulating body of empirical evidence demonstrating that SRs possess wide-ranging enhanced face processing abilities, and their deployment should complement ever advancing computerised face recognition systems.
There are large individual differences in the ability to recognise faces. Super‐recognisers are exceptionally good at face memory tasks. In London, a small specialist pool of police officers (also labelled ‘super‐recognisers’ by the Metropolitan Police Service) annually makes thousands of suspect identifications from closed‐circuit television footage. Some suspects are disguised, have not been encountered recently or are depicted in poor quality images. Across tests measuring familiar face recognition, unfamiliar face memory and unfamiliar face matching, the accuracy of members of this specialist police pool was approximately equal to a group of non‐police super‐recognisers. Both groups were more accurate than matched control members of the public. No reliable relationships were found between the face processing tests and object recognition. Within each group, however, there were large performance variations across tests, and this research has implications for the deployment of police worldwide in operations requiring officers with superior face processing ability.
When the police have no suspect, they may ask an eyewitness to construct a facial composite of that suspect from memory. Faces are primarily processed holistically, and recently developed computerized holistic facial composite systems (e.g., EFIT-V) have been designed to match these processes. The reported research compared children aged 6–11 years with adults on their ability to construct a recognizable EFIT-V composite. Adult constructors' EFIT-Vs received significantly higher composite-suspect likeness ratings from assessors than children's, although there were some notable exceptions. In comparison to adults, the child constructors also overestimated the composite-suspect likeness of their own EFIT-Vs. In a second phase, there were no differences between adult controls and constructors in correct identification rates from video lineups. However, correct suspect identification rates by child constructors were lower than those of child controls, suggesting that a child's memory for the suspect can be adversely influenced by composite construction. Nevertheless, all child constructors coped with the demands of the EFIT-V system, and the implications for research, theory, and criminal justice system practice are discussed.
The paradigm detailed in this manuscript describes an applied experimental method based on real police investigations during which an eyewitness or victim to a crime may create from memory a holistic facial composite of the culprit with the assistance of a police operator. The aim is that the composite is recognized by someone who believes that they know the culprit. For this paradigm, participants view a culprit actor on video and, following a delay, participant-witnesses construct a holistic system facial composite. Controls do not construct a composite. From a series of arrays of computer-generated, but realistic faces, the holistic system construction method primarily requires participant-witnesses to select the facial images most closely meeting their memory of the culprit. Variation between faces in successive arrays is reduced until ideally the final image possesses a close likeness to the culprit. Participant-witness directed tools can also alter facial features, configurations between features and holistic properties (e.g., age, distinctiveness, skin tone), all within a whole face context. The procedure is designed to closely match the holistic manner by which humans process faces. On completion, based on their memory of the culprit, ratings of composite-culprit similarity are collected from the participant-witnesses. Similar ratings are collected from culprit-acquaintance assessors, as a marker of composite recognition likelihood. Following a further delay, all participants, including the controls, attempt to identify the culprit in either a culprit-present or culprit-absent video line-up, to replicate circumstances in which the police have located the correct culprit, or an innocent suspect. Data of control and participant-witness line-up outcomes are presented, demonstrating the positive influence of holistic composite construction on identification accuracy.
Correlational analyses are conducted to measure the relationship between assessor and participant-witness composite-culprit similarity ratings, delay, identification accuracy, and confidence to examine which factors influence video line-up outcomes.
The purpose of this paper is to describe four experiments evaluating post-production enhancement techniques with facial composites mainly created using the EFIT-V holistic system.
Experiments 1-4 were conducted in two stages. In Stage 1, constructors created between one and four individual composites of unfamiliar targets. These were merged to create morphs. Additionally, in Experiment 3, composites were vertically stretched. In Stage 2, participants familiar with the targets named or provided target-similarity ratings to the images.
In Experiments 1-3, correct naming rates were significantly higher for between-witness 4-morphs, within-witness 4-morphs and vertically stretched composites than for individual composites. In Experiment 4, there was a positive relationship between composite-target similarity ratings and between-witness morph-size (2-, 4-, 8-, 16-morphs).
The likelihood of a facial composite being recognised can be improved by morphing and vertical stretch.
This paper improves knowledge of the theoretical underpinnings of these facial composite post-production enhancement techniques. This should encourage acceptance by the criminal justice system, and lead to better detection outcomes.
A street identification or live show-up provides an eyewitness with an opportunity to identify a suspect shortly after a crime. In England, the majority of suspects identified are subsequently included in a video line-up for the same witness to view. In Study 1, robbery squad data from three English police forces recorded 696 crimes, the identification procedures employed and prosecution decisions. A street identification was the most frequent identification procedure, being attempted in 22.7% of investigations, followed by mugshot albums (11.2%) and video line-ups (3.4%). In Study 2, data of 59 crimes were collected in which suspects, identified in a street identification, were subsequently filmed for a video line-up. Across both studies, most (84%) suspects identified in the street were subsequently identified in a video line-up, indicative of a commitment effect, in which a witness conforms to their first identification decision. All suspects identified in two procedures were eventually cautioned or charged to appear in court. The ground truth of suspect guilt in these field data cannot be determined. However, suggestions are made for reducing the likelihood of a mistaken identification of an innocent suspect caught up in an investigation; all possible steps should be taken to reduce the inherent suggestiveness of the street identification procedure.
Witnesses to a crime may be asked to create a facial composite of the offender from memory. They may then view a suspect in a police line‐up. Previous research on this topic has found both recognition impairment and enhancement following composite construction. In Experiment 1, creator‐participants employed the holistic facial composite system EFIT‐V or the feature‐based E‐FIT system to create a single composite, and in Experiment 2, creators constructed up to three EFIT‐Vs. In both experiments, facial composite creators were one‐and‐a‐half times more likely than non‐composite creating controls to make correct target identifications from a video line‐up. No between condition effects were found in target‐absent trials in Experiment 1. The development of holistic facial composite systems has enhanced suspect identification rates in police investigations, and these results suggest that the use of such a system can also have a positive influence on a composite‐creating witness' later recognition of the suspect.
The use of street identification procedures—informal procedures in which witnesses attempt to identify an offender, usually soon after the commission of a crime and close to where it occurred—has attracted significant concern. These procedures are generally thought to give rise to a greater risk of mistaken identification because they lack the safeguards of formal procedures conducted under controlled conditions. This article describes the findings of empirical research undertaken by the authors. The research had three broad objectives. The first was to collect data which would provide some indication of the extent to which street identifications are used by police in England and Wales. The second was to compare the reliability of street identifications and video identification procedures involving the use of foils. The final objective was to investigate the influence that a street identification procedure would have on a subsequent video identification procedure involving the same witness and suspect. The findings suggest that substantial numbers of street identifications are conducted but, perhaps counter-intuitively, in terms of the risk of mistaken identification of innocent suspects, such procedures may be no less reliable than video identification procedures. Following identification of a suspect in a street identification, there is a very high likelihood that a formal procedure involving the same suspect and witness will result in the suspect being identified again, notwithstanding that the suspect is innocent.
A live showup (known as a street identification in the UK) allows the perpetrator to be identified shortly after a street crime. If the suspect disputes the identification, a video line‐up often ensues. Four experiments examined the reliability of live showups and their influence on a subsequent video line‐up using realistic procedures and conditions. Similar proportions of culprits and innocent suspects were identified from live showups and video line‐ups. Both culprits and innocent suspects previously identified were likely to be identified again in a subsequent line‐up, with delays from a few minutes to a month. Only a weak effect of clothing bias was observed. There was strong evidence of commitment to a previous identification but no reliable evidence of source monitoring errors. The results suggest that a live showup is not less fair than a line‐up, but the use of repeated identification procedures introduces an unfair bias against innocent suspects.
Expert witnesses using facial comparison techniques are regularly required to disambiguate cases of disputed identification in CCTV images and other photographic evidence in court. This paper describes a novel software-assisted photo-anthropometric facial landmark identification system, DigitalFace, tested against a database of 70 full-face and profile images of young males meeting a similar description. The system produces 37 linear and 25 angular measurements across the two viewpoints. A series of 64 analyses was conducted to examine whether separate novel probe facial images of target individuals whose face dimensions were already stored within the database would be correctly identified as the same person. Identification verification was found to be unreliable unless multiple distance and angular measurements from both profile and full-face images were included in an analysis.
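As a rough illustration of what linear and angular landmark measurements involve, the sketch below computes one distance and one angle from 2D landmark coordinates. It is not the DigitalFace method: the landmark names, coordinates and choice of measurements are hypothetical, since the abstract does not list the 37 linear and 25 angular measurements.

```python
import math


def distance(p, q):
    """Euclidean distance between two 2D landmark positions."""
    return math.dist(p, q)


def angle_deg(a, vertex, b):
    """Unsigned angle in degrees at `vertex` between the rays towards a and b."""
    v1 = (a[0] - vertex[0], a[1] - vertex[1])
    v2 = (b[0] - vertex[0], b[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return math.degrees(math.atan2(abs(cross), dot))


# Hypothetical full-face landmarks in arbitrary image units.
landmarks = {
    "left_eye": (0.0, 0.0),
    "right_eye": (6.0, 0.0),
    "nose_tip": (3.0, -4.0),
}

# One linear measurement: left eye to nose tip.
eye_nose = distance(landmarks["left_eye"], landmarks["nose_tip"])

# One angular measurement: the angle at the nose tip subtended by the two eyes.
nose_angle = angle_deg(landmarks["left_eye"], landmarks["nose_tip"], landmarks["right_eye"])

print(f"eye-nose distance: {eye_nose:.2f}, angle at nose tip: {nose_angle:.2f} degrees")
```

A single measurement of this kind discriminates poorly between similar-looking faces, which is consistent with the finding above that verification was unreliable unless multiple distance and angular measurements from both viewpoints were combined.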
Student participant-witnesses produced 4 composites of unfamiliar faces with a system that uses a genetic algorithm to evolve appearance of artificial faces. Morphs of 4 composites produced by different witnesses (between-witness morphs) were judged better likenesses (Experiment 1) and were more frequently named (Experiment 2) by participants who were familiar with the target actors than were morphs of 4 composites produced by a single witness (within-witness morphs). Within-witness morphs were judged better likenesses and more frequently named than the best or the first-produced individual composites. The same results for likeness judgments were observed after possible artifacts in the comparison of between- and within-witness morphs were eliminated (Experiment 3). Experiment 4 showed that both internal and external features were better represented in morphs than in the original composites, although the representation of internal features improved more. The results suggest that morphing improves the representation of faces by reducing random error. Between-witness morphs yield more benefit than within-witness morphs by reducing consistent but idiosyncratic errors of individual witnesses. The experiments provide the first demonstration of an advantage for within-witness morphs produced using a single system. Experiment 2 provides the first demonstration of a reliable advantage for between-witness morphs in the most forensically relevant task: naming a composite of a familiar person produced by a witness who was unfamiliar with the target. Morphing would enhance the recognition of facial composites of criminals. Within-witness morphing provides a methodology for use in crimes in which the victim is the only witness. (Contains 2 footnotes, 2 tables, and 5 figures.)
Davis, J. P., Sulley, L., Solomon, C., & Gibson, S. (2010). A comparison of individual and morphed facial composites created using different systems. In G. Howells, K. Sirlantzis, A. Stoica, T. Huntsberger and A.T. Arslan (Eds.), 2010 IEEE International Conference on Emerging Security Technologies (pp. 56–60). Canterbury: IEEE. DOI: 10.1109/EST.2010.29
An evaluation of individual and morphed composites created using the E-FIT and EFIT-V production systems was conducted. With the assistance of trained police staff, composites of unfamiliar targets were constructed from memory following a Cognitive Interview. EFIT-V composites were produced either after a two-day delay or on the same day as viewing a video of the target. E-FIT composites were created on the same day as viewing the target video. Morphs were produced by merging either two or three composites created by the same witness, but with the assistance of a different operator. Participants familiar with the targets supplied similarity-to-target photograph ratings. No differences were found in the rated quality of composites created using E-FIT or EFIT-V, although a two-day delay in production resulted in inferior images. Morphs were rated as better likenesses than individual composites, although the benefits were greater with EFIT-Vs. Encouraging witnesses to create more than one composite image for subsequent morphing might enhance the likelihood of recognition of facial composites of criminals.
Davis, J. P., & Valentine, T. (2009). CCTV on trial: Matching video images with the defendant in the dock. Applied Cognitive Psychology, 23, 482-505. DOI: 10.1002/acp.1490
The experiments reported in this paper investigated simultaneous identity matching of unfamiliar people physically present in person with moving video images typical of that captured by closed circuit television (CCTV). This simulates the decision faced by a jury in court when the identity of somebody caught on CCTV is disputed. Namely, ‘is the defendant in the dock the person depicted in video’? In Experiment 1, the videos depicted medium‐range views of a number of actor ‘culprits’. Experiment 2 used similar quality images taken a year previously, some of which showed the culprits in disguise. Experiment 3 utilised high‐quality close‐up video images. It was consistently found that in both culprit‐present and culprit‐absent videos and in optimal conditions, matching the identity of a person in video can be highly susceptible to error.
It has been previously demonstrated that extensive activation in the dorsolateral temporal lobes is associated with masking a speech target with a speech masker, consistent with the hypothesis that competition for central auditory processes is an important factor in informational masking. Here, masking from speech and two additional maskers derived from the original speech were investigated. One of these is spectrally rotated speech, which is unintelligible and has a similar (inverted) spectrotemporal profile to speech. The authors also controlled for the possibility of “glimpsing” of the target signal during modulated masking sounds by using speech-modulated noise as a masker in a baseline condition. Functional imaging results reveal that masking speech with speech leads to bilateral superior temporal gyrus (STG) activation relative to a speech-in-noise baseline, while masking speech with spectrally rotated speech leads solely to right STG activation relative to the baseline. This result is discussed in terms of hemispheric asymmetries for speech perception, and interpreted as showing that masking effects can arise through two parallel neural systems, in the left and right temporal lobes. This has implications for the competition for resources caused by speech and rotated speech maskers, and may illuminate some of the mechanisms involved in informational masking.