Cross-Age LFW (CALFW) Database


Motivation

Attention! We updated the positive/negative lists and baselines for CALFW on September 19, 2018. Please use the new lists for your experiments.

Welcome to the Cross-Age LFW (CALFW) database, a renovation of Labeled Faces in the Wild (LFW), the de facto standard testbed for unconstrained face verification.

The Labeled Faces in the Wild (LFW) database has been widely used as the benchmark for unconstrained face verification, and, thanks to machine learning methods driven by big data, performance on the database is approaching 100%. However, we argue that this accuracy may be too optimistic. Besides differences in pose, illumination, occlusion and expression, age difference is another challenge in face recognition, yet LFW pays little attention to it. We therefore construct Cross-Age LFW (CALFW), for which we deliberately searched for and selected 3,000 positive face pairs with age gaps, adding the aging process to the intra-class variance. Negative pairs with the same gender and race are also selected to reduce the influence of attribute differences between positive and negative pairs. We evaluate several metric learning and deep learning methods on the new database. Compared to the accuracy on LFW, the accuracy drops by about 10%-17% on CALFW. There are three motivations behind the construction of the CALFW benchmark:

1. Establishing a more difficult database for evaluating real-world face verification, so that the effectiveness of face verification methods can be fully assessed.

2. Emphasizing the age gap of positive pairs to further enlarge intra-class variance while still covering other intra-class variations. Negative pairs are also deliberately selected to avoid differences in gender or race, so CALFW combines large intra-class variance with small inter-class variance.

3. Maintaining the data size, the face verification protocol (which provides a 'same/different' benchmark) and the same identities as LFW, so that CALFW can easily be used to evaluate face verification performance.

Comparison with LFW

Age gap comparison

Compared to the positive pairs in LFW, the age gaps of positive pairs in CALFW are larger, which shows that we successfully added the aging process to the intra-class variation. In LFW, the age gaps of most positive pairs are less than 10 years, while those of most negative pairs are larger than 10 years. In CALFW there is no clear boundary between the two kinds of pairs, so the age gap cannot serve as a strong cue for face verification.

Positive pairs comparison

CALFW was collected through crowdsourcing: for the people in LFW, we searched the Internet for pictures with age gaps as large as possible. Compared to LFW, the positive pairs in CALFW show obvious age differences.

Negative pairs comparison

Compared to LFW, the negative pairs in CALFW share the same gender and race, which reduces the influence of attribute differences between positive and negative pairs in face verification.
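The page does not describe how the negative pairs were chosen beyond the same-gender, same-race constraint. As a rough illustration only, the following Python sketch draws negative pairs from within groups of identities that share both attributes. The function name, the attribute labels and the input dictionary are hypothetical and are not part of the CALFW release.

```python
import random
from collections import defaultdict


def sample_matched_negative_pairs(people, n_pairs, seed=0):
    """Draw pairs of different identities that share the same attributes.

    people: dict mapping an identity name to a (gender, race) tuple.
            These labels are an assumption made for this sketch.
    Returns a list of (name_a, name_b) tuples; pairs may repeat.
    """
    # Group identities by their attribute tuple.
    groups = defaultdict(list)
    for name, attrs in people.items():
        groups[attrs].append(name)

    # Only groups with at least two identities can yield a negative pair.
    eligible = [sorted(names) for names in groups.values() if len(names) >= 2]
    if not eligible:
        raise ValueError("no attribute group contains two or more identities")

    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        names = rng.choice(eligible)
        name_a, name_b = rng.sample(names, 2)  # two distinct identities
        pairs.append((name_a, name_b))
    return pairs
```

Every pair returned this way differs in identity but not in the two attributes, which is the property the paragraph above describes.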

We preserve the protocols, dataset size, and identities in each fold of the LFW database in order to encourage fair and meaningful comparisons. More information about the standard LFW protocol can be found on the Labeled Faces in the Wild (LFW) website.
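Because CALFW keeps the LFW 'same/different' protocol, results are reported with 10-fold cross-validation. A minimal Python sketch of how such an evaluation is commonly implemented follows: a decision threshold is chosen on nine folds and accuracy is measured on the held-out fold. It assumes that a similarity score has already been computed for every pair and that the pair list is ordered so that equal contiguous blocks form the folds; both are assumptions of this sketch, and the function names are illustrative.

```python
from statistics import mean


def accuracy(scores, labels, threshold):
    """Fraction of pairs classified correctly when score >= threshold means 'same'."""
    hits = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return hits / len(scores)


def ten_fold_accuracy(scores, labels, n_folds=10):
    """Return (mean accuracy, per-fold accuracies).

    scores: one similarity score per pair (higher = more likely the same person).
    labels: 1 for a positive pair, 0 for a negative pair.
    Assumption: len(scores) is a multiple of n_folds and folds are contiguous.
    """
    scores, labels = list(scores), list(labels)
    fold_size = len(scores) // n_folds
    fold_accs = []
    for k in range(n_folds):
        lo, hi = k * fold_size, (k + 1) * fold_size
        train_s, train_y = scores[:lo] + scores[hi:], labels[:lo] + labels[hi:]
        # Choose the threshold that maximises accuracy on the nine training folds.
        # Brute force over observed scores: simple, but quadratic in the fold size.
        best_t = max(sorted(set(train_s)), key=lambda t: accuracy(train_s, train_y, t))
        # Evaluate that threshold on the held-out fold.
        fold_accs.append(accuracy(scores[lo:hi], labels[lo:hi], best_t))
    return mean(fold_accs), fold_accs
```

For the full 6,000-pair list, a coarser grid of candidate thresholds would keep the search fast in pure Python.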

We expect that CALFW will encourage algorithms that make reliable verification judgements and help close the large gap between the performance reported on benchmarks and the performance on real-world tasks.


Baseline Results

We select four SOTA deep face recognition methods that have achieved top performance on major benchmark databases: LFW, IJB-A and MegaFace.

COMPARISON OF VERIFICATION ACCURACY (%) ON LFW AND CALFW USING FOUR SOTA DEEP FACE RECOGNITION MODELS.

Method              LFW       CALFW
Centerface [1]      98.75%    85.48%
SphereFace [2]      99.27%    90.30%
VGGFace2 [3]        99.43%    90.57%
ArcFace [4]         99.82%    95.45%
HUMAN-Individual    97.27%    82.32%
HUMAN-Fusion        99.85%    86.50%

COMPARISON OF 10-FOLD VALIDATION ERROR (%) OF FOUR SOTA DEEP FACE RECOGNITION MODELS. THE INCREASE OF ERROR IS ALSO ENUMERATED WHEN TRANSFERRING FROM LFW TO CALFW.

Method              LFW     CALFW
Centerface [1]      1.17    14.52 (↑ 1241%)
SphereFace [2]      0.65     9.70 (↑ 1492%)
VGGFace2 [3]        0.49     9.43 (↑ 1924%)
ArcFace [4]         0.10     4.55 (↑ 4550%)

  [1] A discriminative feature learning approach for deep face recognition. In European Conference on Computer Vision, Springer, 2016, pp. 499–515.
  [2] Deep hyperspherical learning. In NIPS, 2017, pp. 3953–3963.
  [3] Q. Cao, L. Shen, W. Xie, O. M. Parkhi, and A. Zisserman. VGGFace2: A dataset for recognising faces across pose and age. arXiv preprint arXiv:1710.08092, 2017.
  [4] ArcFace: Additive angular margin loss for deep face recognition. arXiv preprint arXiv:1801.05599, 2018.
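
The parenthesized figures in the error table appear to give the CALFW error as a percentage of the LFW error, that is, a ratio rather than a relative increase. This reading is inferred from the numbers and is not stated on the page. The short Python check below reproduces the figures from the tabulated error rates and also computes the relative increase for comparison; the function names are illustrative.

```python
# Error rates (%) copied from the error table: method -> (LFW, CALFW).
errors = {
    "Centerface": (1.17, 14.52),
    "SphereFace": (0.65, 9.70),
    "VGGFace2": (0.49, 9.43),
    "ArcFace": (0.10, 4.55),
}


def error_ratio_percent(lfw_error, calfw_error):
    """CALFW error expressed as a percentage of the LFW error."""
    return 100.0 * calfw_error / lfw_error


def relative_increase_percent(lfw_error, calfw_error):
    """Growth of the error relative to the LFW error, in percent."""
    return 100.0 * (calfw_error - lfw_error) / lfw_error


for method, (lfw, calfw) in errors.items():
    ratio = error_ratio_percent(lfw, calfw)
    increase = relative_increase_percent(lfw, calfw)
    print(f"{method}: ratio {ratio:.0f}%, relative increase {increase:.0f}%")
```

The ratio column comes out as 1241%, 1492%, 1924% and 4550%, matching the table, while the relative increase is 100 percentage points lower in each row (for example, 1141% for Centerface).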

Reference

Please cite as:


T. Zheng, W. Deng, and J. Hu, "Cross-age LFW: A database for studying cross-age face recognition in unconstrained environments," CoRR, vol. abs/1708.08197, 2017. [Online]. Available: http://arxiv.org/abs/1708.08197.

BibTeX entry:
@article{DBLP:journals/corr/abs-1708-08197,
  author    = {Tianyue Zheng and
               Weihong Deng and
               Jiani Hu},
  title     = {Cross-Age {LFW:} {A} Database for Studying Cross-Age Face Recognition
               in Unconstrained Environments},
  journal   = {CoRR},
  volume    = {abs/1708.08197},
  year      = {2017},
  url       = {http://arxiv.org/abs/1708.08197},
  archivePrefix = {arXiv},
  eprint    = {1708.08197},
  timestamp = {Tue, 05 Sep 2017 10:03:46 +0200},
  biburl    = {http://dblp.org/rec/bib/journals/corr/abs-1708-08197},
  bibsource = {dblp computer science bibliography, http://dblp.org}
}

Download the database

You can download all the resources in CALFW. The download link contains three folders. The Feature folder contains the LBP features we use in the baseline results. The Images folder contains all the images: similarity_align.rar contains images aligned using a similarity transform, and images&landmarks.rar contains the unaligned images together with their landmarks. The txts folder contains all the list files, including the identity list, the image name list and the list of 6,000 pairs. Please cite the paper given under Reference.


Contact

Please contact Tianyue Zheng (2231135739@qq.com) and Weihong Deng for questions about the database.