

Ingesting 35 Million Hotel Images with Python in the Cloud

EuroPython Conference via YouTube

Overview

Learn how Skyscanner built a distributed architecture in Python to process and manage 35 million hotel images in the cloud. Explore the challenges and solutions involved in creating an incremental image processing pipeline that discards poor-quality and duplicate images while optimizing the rest for mobile devices. Discover the tools used, including Pillow, ImageHash, Kombu, and Boto, for tasks such as ingesting partner-provided images, detecting and removing low-quality and duplicate images, resizing images, and scaling the pipeline to finish within time constraints. This conference talk from EuroPython 2016 covers the technical stack, triggering mechanisms, downloading processes, fingerprinting techniques, and methods for choosing the best images.
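
As a rough illustration of the fingerprinting and deduplication steps mentioned above, here is a minimal sketch in plain Python. It is not the code from the talk and does not use ImageHash or Pillow: it hand-rolls a simple average hash over small grayscale grids so that it runs with the standard library only. The function names, the dictionary keys (`url`, `width`, `height`, `thumbnail`), the `max_distance` threshold, and the rule of preferring the largest image are all assumptions made for this example.

```python
from typing import List, Sequence

# Assumption: each image has already been reduced to a tiny grayscale
# thumbnail, given as rows of brightness values in the range 0-255.
Pixels = Sequence[Sequence[int]]


def average_hash(pixels: Pixels) -> int:
    """Return a bit fingerprint: one bit per pixel, set when the pixel
    is brighter than the mean brightness of the thumbnail."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def deduplicate(images: List[dict], max_distance: int = 2) -> List[dict]:
    """Keep one image per group of near-identical fingerprints.

    Images are visited from largest to smallest pixel area, so the
    survivor of each group is the largest one (a hypothetical rule
    for "choosing the best image").
    """
    kept: List[dict] = []
    by_area = sorted(images, key=lambda i: i["width"] * i["height"], reverse=True)
    for image in by_area:
        fingerprint = average_hash(image["thumbnail"])
        is_new = all(
            hamming_distance(fingerprint, k["fingerprint"]) > max_distance
            for k in kept
        )
        if is_new:
            kept.append({**image, "fingerprint": fingerprint})
    return kept
```

Comparing every new fingerprint against every kept one is quadratic, which is fine for the images of a single hotel but would need an index or bucketing scheme to work across millions of images.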

Syllabus

Intro
Processing 35 million images
Tech stack
Triggering
Downloading images
Fingerprinting
Duplication
Duplicator
Guarantee
Choosing the best images
Picking the best image
Resize images
Final result

Taught by

EuroPython Conference

