Abstract: Zero-shot learning (ZSL) has attracted ever-increasing research interest due to its capability of recognizing novel or unseen classes. Most studies on ZSL build on one of two baseline models: compatible visual-semantic embedding (CVSE) and adversarial visual feature generation (AVFG). In this work, we integrate the merits of the two baselines and propose a novel and effective baseline model, coined adversarial visual-semantic embedding (AVSE). Unlike CVSE and AVFG, AVSE learns visual and semantic embeddings adversarially and jointly in a latent feature space. Additionally, AVSE integrates a classifier to make the latent embeddings discriminative and a regressor to preserve semantic consistency during the embedding procedure. Moreover, we perform embedding-to-image generation, which visually exhibits the embeddings learned by AVSE. Experiments on four standard benchmarks demonstrate the advantage of AVSE over CVSE and AVFG, and provide empirical insights through quantitative and qualitative results. Our code is available at https://github.com/Liuy8/AVSE.