Why `len('😶‍🌫️') == 4` and Other Weird Things You Should Know About Strings in Python

This PyCon US conference talk explores unexpected behaviors of strings in Python, including why `len('😶‍🌫️') == 4`, why `'ñ'` doesn't equal `'ñ'`, and how `'dlrow olleh'.split()[1]` equals `'olleh'`. Dive into text encoding fundamentals and the Unicode standard to understand how Python handles strings internally. Discover why a single visible character can consist of multiple code points, learn about locale-dependent case conversions, and explore how emoji work at a technical level. Gain practical knowledge of common Unicode pitfalls and best practices for handling Unicode input in Python applications. After watching this 24-minute presentation, walk away with deeper insight into Python's string implementation, Unicode character encoding, and strategies for avoiding text processing issues in your code.
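
As a rough illustration of two of the puzzles named above, the following sketch spells the strings out as escape sequences so the individual code points are visible. It is not taken from the talk itself; the variable names and the `same_text` helper are hypothetical, and it assumes the two `'ñ'` strings differ in the usual way, with one precomposed and one built from `n` plus a combining tilde.

```python
import unicodedata

# The "face in clouds" emoji is a sequence of four code points:
# U+1F636 (face without mouth) + U+200D (zero-width joiner)
# + U+1F32B (fog) + U+FE0F (variation selector-16).
face_in_clouds = "\U0001F636\u200D\U0001F32B\uFE0F"
code_points = [f"U+{ord(ch):04X}" for ch in face_in_clouds]
print(len(face_in_clouds), code_points)
# prints: 4 ['U+1F636', 'U+200D', 'U+1F32B', 'U+FE0F']

# Assumption: the two 'ñ' strings use different Unicode representations.
composed = "\u00F1"      # single precomposed code point
decomposed = "n\u0303"   # 'n' followed by a combining tilde
print(composed == decomposed)  # prints: False


def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(same_text(composed, decomposed))  # prints: True
```

Python's `len` counts code points, not the characters a reader sees, which is why the emoji reports a length of 4. Normalizing input to one form (NFC here) before comparing or storing it is one common way to avoid the `'ñ'` mismatch.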