Department Seminar Series

Towards compute efficient large language models

23rd February 2023, 15:00, Ashton Lecture Theatre
Dr. Nikolaos Aletras
Computer Science Department, University of Sheffield

Abstract

Large language models (LLMs) are highly effective when adapted to various downstream NLP tasks. However, pre-training requires access to large compute resources. In this talk, I will present our work on (1) speeding up pre-training with objectives that are simpler than the widely used masked language modelling, (2) how the choice of pre-training objective affects the linguistic information that LLMs capture, and (3) how we can support an unlimited vocabulary with a relatively small number of parameters.

Biography

I am a Senior Lecturer (~Associate Professor) in Natural Language Processing at the Computer Science Department of the University of Sheffield. Previously, I was a Lecturer in Data Science at the Information School, University of Sheffield. I have also gained industrial experience working as a scientist at Amazon. Before that, I was a research associate at the Department of Computer Science, UCL, and I completed a PhD in Natural Language Processing at the Department of Computer Science, University of Sheffield.