Multi-Task Learning in Transformer-Based Architectures for Natural Language Processing
Data Science Conference via YouTube
Overview
Learn about multi-task learning in transformer-based NLP architectures in this 31-minute conference talk, which explores cost-effective alternatives to training a separate model per task. Discover how sharing information across multiple tasks and datasets can improve performance through shared models, representation bias, increased data efficiency, and eavesdropping. Explore solutions to challenges such as catastrophic forgetting and task interference, and dive into general approaches to multi-task learning, adapter-based techniques, hypernetwork methods, and strategies for task sampling and balancing. The presentation covers key topics including the BERT paper, architecture considerations, modularity, function composition, input composition, parameter composition, fusion techniques, and shared hypernetworks, concluding with insights into ChatGPT.
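To make the adapter idea concrete, here is a minimal sketch of an adapter block (bottleneck projection plus residual connection), the building block behind the adapter-based multi-task techniques mentioned above: the shared transformer weights stay frozen while only a small per-task adapter is trained. The dimensions, weight values, and function names are illustrative assumptions, not taken from the talk.

```python
def matvec(W, x):
    # Plain-Python matrix-vector product (rows of W dotted with x).
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def adapter(x, W_down, W_up):
    # Down-project the hidden state to a small bottleneck, apply ReLU,
    # project back up, then add the residual connection so that a
    # zero-initialized adapter starts out as the identity function.
    h = [max(0.0, v) for v in matvec(W_down, x)]
    u = matvec(W_up, h)
    return [xi + ui for xi, ui in zip(x, u)]

# Example: hidden size 4, bottleneck size 2; the up-projection is
# zero-initialized, a common choice so training starts from the
# frozen model's original behavior.
W_down = [[0.5, 0.0, 0.0, 0.0],
          [0.0, 0.5, 0.0, 0.0]]
W_up = [[0.0, 0.0] for _ in range(4)]

x = [1.0, 2.0, 3.0, 4.0]
print(adapter(x, W_down, W_up))  # identity at init: [1.0, 2.0, 3.0, 4.0]
```

Because each task gets its own tiny `W_down`/`W_up` pair while the large transformer is shared, adding a new task costs only the adapter parameters rather than a full model copy.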
Syllabus
Intro
Outline
Agenda
BERT Paper
Architecture
Problems
Adapters
Modularity
Compositions
Overview
Function Composition
Input Composition
Parameter Composition
Fusion
Hypernetworks
Shared Hypernetworks
ChatGPT
Questions
Taught by
Data Science Conference