Department Seminar Series
Towards compute efficient large language models
23rd February 2023, 15:00
Ashton Lecture Theatre
Dr. Nikolaos Aletras
Computer Science Department, University of Sheffield
Abstract
Large language models (LLMs) are highly effective when adapted to various downstream NLP tasks. However, pre-training them requires access to large compute resources. In this talk, I will present our work on (1) speeding up pre-training with objectives simpler than the widely used masked language modelling, (2) how the choice of pre-training objective affects the linguistic information that LLMs capture, and (3) how we can support an unlimited vocabulary with a relatively small number of parameters.
Biography
I am a Senior Lecturer (~Associate Professor) in Natural Language Processing at the Computer Science Department of the University of Sheffield. Previously, I was a Lecturer in Data Science at the Information School, University of Sheffield. I have also gained industrial experience working as a scientist at Amazon. I was a research associate at the Department of Computer Science, UCL, and I completed a PhD in Natural Language Processing at the Department of Computer Science, University of Sheffield.