Scores of first 10,000 to complete E2ID project

Josh P Davis and Nikolay Petrov

School of Human Sciences

Institute of Lifecourse Development

University of Greenwich

London SE10 9LS

5 August 2020

Advertised project name: Can you "Beat the Computer" at matching facial composites with a photo of the person they are supposed to depict?

Thanks to those participants who continue to support our ongoing Innovate UK-funded E2ID collaboration between the University of Greenwich and VisionMetric Ltd. See previous blog here.

We keep getting asked “what is a good score?” and this blog aims to provide feedback to those participants who have contributed.

First, we can announce that the latest randomly selected £50 prize winners received their Amazon vouchers by e-mail last week.

L****.w******@****.ch J**m****** k**m**2*** Jc******2*

Strangely, as of today, none have claimed their award – hence a reminder in this blog.

We sent out another eight prizes earlier this year (see here).

E2ID is supported by an Innovate UK grant. The E2ID project aims to develop a radical new approach to achieving fast, accurate automatic matching of facial composite images to police suspect databases. The expected developments will give rise to improved investigative procedures for international police forces, leading to greater case closure and a reduced demand on police resources.

The role of Josh Davis and Nikolay Petrov at the University of Greenwich is to assist “with developing neural (deep learning) procedures that successfully map the human cognitive processes implicit in the recognition of facial composite images. In this way, machine behaviour will be tailored for the first time to achieve composite face recognition” in a similar manner to humans.

Can you beat the computer study?

In one component of the project, participants are invited to provide a series of 50 randomly allocated facial composite-facial photo similarity ratings (1: not at all similar – 7: highly similar) to pairs of images that are sometimes of the same person, sometimes of two different people.

Participants are given their score out of 50 at the end of the project, as we know participants prefer such feedback when taking part in research. Ratings of 1-3 are scored as the participant probably believing the images are of two different people. Ratings of 5-7 are scored as the participant probably believing that the two images correctly depict the same person.

This allows us to generate a measure of confidence in decision making.

In terms of scores given at the end and in the histogram below, a score of 1 is given to each trial in which the belief is in the correct direction. We realise this is an imperfect method, but it probably approximates the actual belief of participants.
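For readers who like to see the rule written out, here is a minimal Python sketch of the scoring described above. It is an illustration only, not the code used in the study: the function names and the example trials are made up, and because the description above does not say how a rating of 4 is treated, the sketch assumes it earns no point.

```python
from typing import Iterable, Optional, Tuple


def implied_belief(rating: int) -> Optional[bool]:
    """Map a 1-7 similarity rating to the belief it probably reflects.

    Returns False for 'two different people' (ratings 1-3),
    True for 'same person' (ratings 5-7), and None for the midpoint.
    """
    if not 1 <= rating <= 7:
        raise ValueError(f"rating must be between 1 and 7, got {rating}")
    if rating <= 3:
        return False
    if rating >= 5:
        return True
    # ASSUMPTION: the blog does not describe a rating of 4, so this
    # sketch treats it as expressing no belief in either direction.
    return None


def score_trials(trials: Iterable[Tuple[int, bool]]) -> int:
    """Count the trials where the implied belief is in the correct direction.

    Each trial is (rating, same_person), where same_person is True when
    the composite and the photo depict the same person.
    """
    return sum(1 for rating, same_person in trials
               if implied_belief(rating) == same_person)


# Hypothetical example: five trials instead of the full 50.
example_trials = [
    (7, True),   # rated highly similar, same person: 1 point
    (1, False),  # rated not at all similar, different people: 1 point
    (4, True),   # midpoint: no point under the assumption above
    (2, True),   # rated dissimilar, but same person: no point
    (6, False),  # rated similar, but different people: no point
]
print(score_trials(example_trials))
```

A full set would contain up to 50 such trials, giving a score out of 50.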

Note – scores on the test do not influence likelihood of winning a prize. This is entirely random.

Note that due to the random allocation system, one participant’s set of up to 50 trials may be far harder to rate than a second participant's, so direct comparison of scores is not advised. In other words, if you achieved a very low score, you may have been issued with very hard pairs to rate.

Pilot study

We conducted a pilot study at the end of 2019 (n = 5256) aiming to identify the ideal number of pairs of images to include in a set to be rated to ensure maximum response numbers (i.e. not to discourage participants by asking them to rate too many images). Note to psychologists: we varied the information participants received about the demands of the study, asking them to rate 10, 20 or 50 image-pairs.

Not surprisingly, participants were most likely to volunteer if asked to rate 10 pairs, then 20, then 50. However, the drop in participant numbers was compensated for by the increase in the total number of ratings, so the final study contained 50 trials. We also assessed the ideal number of ratings that should be given to each pair in a set. In other words, how much extra value is given to the final data by increasing the number of participants providing ratings? We think we found the ‘sweet spot’ and hope to publish these data.

Ongoing study

For the ongoing main project, we uploaded six batches of composite-photo pairs to the Qualtrics system. If pairs are of the same person, composites deliberately vary in how similar they are to the target photo. By 5 August 2020, over 10,000 participants had provided around 50 ratings each to 22,113 image-pairs.

This is a total of 529,952 individual ratings.
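As a rough back-of-envelope check on these totals, the short Python sketch below works out the average number of ratings per image-pair and how many complete sets of 50 the total corresponds to. It assumes every rating belongs to one of the 22,113 pairs; the variable names are ours, and in practice pairs will not all have received the same number of ratings.

```python
# Figures quoted in this blog.
total_ratings = 529_952
image_pairs = 22_113
trials_per_set = 50

# ASSUMPTION: all ratings are spread over the 22,113 pairs, so this is
# only an average; individual pairs may have more or fewer ratings.
ratings_per_pair = total_ratings / image_pairs

# Number of complete 50-trial sets the total would correspond to.
equivalent_full_sets = total_ratings / trials_per_set

print(f"Average ratings per image-pair: {ratings_per_pair:.1f}")
print(f"Equivalent complete sets of 50: {equivalent_full_sets:.0f}")
```

On these figures, each image-pair has been rated about 24 times on average.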

Here is a histogram of the scores out of 50 so far (see above for scoring explanations).

As you will see, no participants have scored 49 or 50 so far. The highest score is 48 – achieved by one participant (three have scored 47).

The E2ID project is continuing until April 2021. If you would like to have a go at “beating the computer” and possibly winning £50 in a future prize draw, please click here.

