It's no secret Python's `pickle` module is unsafe. It's also very popular, especially in the scientific Python community. Many of its users actually have pretty good reasons to use it, so just telling them not to isn't very helpful. This talk explores why pickles are unsafe, covers advanced exploitation techniques, and shows how we can make pickles safer without giving up their useful properties.
Python's standard library comes with an object serialization framework called pickle. It's no secret pickles are unsafe. Any time you load a pickle, you really have no guarantee what it'll do. While it's designed to just reconstitute objects from some bytes, it could open network connections, delete all your files, or really anything else it wants.
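To make that concrete, here is a minimal and deliberately harmless sketch of why loading a pickle can do "anything". The class name `Innocent` is made up for illustration; the payload only calls `os.getpid`, but any importable callable (say, `os.system`) could stand in its place.

```python
import os
import pickle


class Innocent:
    """Hypothetical class; its pickle never actually rebuilds an Innocent."""

    def __reduce__(self):
        # __reduce__ tells pickle "to rebuild me, call this callable with
        # these arguments". The loader will happily call whatever is named.
        return (os.getpid, ())


payload = pickle.dumps(Innocent())

# Loading runs os.getpid() and hands back its return value.
result = pickle.loads(payload)
print(type(result).__name__, result)
```

Note that the loader never needs the `Innocent` class at all: the bytes only name the callable and its arguments.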
Despite all of these flaws, it's very popular, especially in the scientific Python community. Many of its users actually have pretty good reasons to use it, so just telling them not to isn't very helpful. For example, numpy's various array types have custom behavior to fine-tune how they are serialized and deserialized -- features explicitly supported by the Pickle module's extension points. Most scientific applications also don't just have a matrix: they might have fairly complex object graphs that require serialization. Pickle supports this out of the box, and that's great!
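As a rough illustration of both points, the sketch below uses only standard library types rather than numpy (an assumption made to keep it self-contained). `datetime.date` stands in for a type that customizes its own serialization through `__reduce__`, and a plain dict with a shared list and a reference to itself stands in for a complex object graph.

```python
import datetime
import pickle

# 1. Extension point: a type decides how it is rebuilt via __reduce__.
#    datetime.date returns a constructor plus a compact argument tuple.
original_date = datetime.date(2020, 1, 1)
constructor, args = original_date.__reduce__()[:2]
rebuilt_date = constructor(*args)

# 2. Object graphs: shared references and cycles survive a round trip.
shared = [1, 2, 3]
graph = {"a": shared, "b": shared}
graph["self"] = graph  # the graph refers to itself

restored = pickle.loads(pickle.dumps(graph))

print(rebuilt_date == original_date)   # value is preserved
print(restored["a"] is restored["b"])  # sharing is preserved
print(restored["self"] is restored)    # the cycle is preserved
```

Formats like JSON would need extra bookkeeping to preserve the sharing and the cycle; pickle's memo table handles it without any help from the caller.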
This talk is intended for intermediate and expert Python programmers. It's expected that you have heard of the Pickle module before and that you know it's unsafe. We'll (quickly) go through how Pickle works at the Pickle VM opcode level. Then we'll show how you can get arbitrary code execution, followed by some more advanced exploitation techniques (e.g. how you can put other behavior in a pickle while keeping equivalent functionality, i.e. the pickle still turns into the object you were expecting). Finally, we'll talk about how we can make Pickle safe anyway, culminating in the practical implementation I have open sourced.
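For a taste of the opcode level and of one possible mitigation, here is a small sketch. It lists the opcodes of a payload with `pickletools.genops`, then loads data through an `Unpickler` subclass that overrides `find_class` with an allowlist. This is the approach described in the standard library documentation, not necessarily the open sourced implementation mentioned above; the names `ALLOWED`, `AllowlistUnpickler`, `safe_loads` and `CallsGetpid` are made up for illustration.

```python
import collections
import io
import os
import pickle
import pickletools


class CallsGetpid:
    """Hypothetical payload class: loading its pickle calls os.getpid()."""

    def __reduce__(self):
        return (os.getpid, ())


payload = pickle.dumps(CallsGetpid())

# The pickle VM is a small stack machine; genops yields one tuple per opcode.
opcode_names = [opcode.name for opcode, arg, pos in pickletools.genops(payload)]
print(opcode_names)  # includes REDUCE, the "call this callable" opcode

# Assumption: this allowlist is purely illustrative.
ALLOWED = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Every global a pickle wants to import passes through here.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is not allowed")


def safe_loads(data):
    """Load a pickle, refusing any global that is not on the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


ordered = safe_loads(pickle.dumps(collections.OrderedDict(a=1)))
print(ordered)

try:
    safe_loads(payload)
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

An allowlist only helps if everything on it is itself harmless to call with attacker-chosen arguments, which is part of why making pickles safe takes more than a few lines.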
If I have time (big stretch for the talk, but maybe not for the hallway track), I'll show techniques for how to attack Pickles that are "protected" by malleable encryption schemes like AES-CTR.
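The core problem with malleable encryption can be sketched without any third-party library. The toy cipher below is not AES-CTR: it is a SHA-256-based keystream XORed onto the data, used here only as a stand-in because it has the same "ciphertext = plaintext XOR keystream" structure. The key, nonce, message, and helper names are all made up, and the attacker is assumed to know (or guess) part of the plaintext and its position.

```python
import hashlib
import pickle


def keystream(key, nonce, length):
    """Toy counter-mode keystream (stand-in for AES-CTR, NOT real AES)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_crypt(key, nonce, data):
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, nonce, len(data))))


key, nonce = b"k" * 32, b"n" * 12  # hypothetical secrets
plaintext = pickle.dumps("pay alice 100")
ciphertext = xor_crypt(key, nonce, plaintext)

# The attacker never sees the key. They only need the offset of a known
# plaintext fragment and a same-length replacement.
old, new = b"alice", b"carol"
offset = plaintext.index(old)  # assumed known or guessed by the attacker

tampered = bytearray(ciphertext)
for i, (o, n) in enumerate(zip(old, new)):
    tampered[offset + i] ^= o ^ n  # flip exactly the bits that differ

recovered = pickle.loads(xor_crypt(key, nonce, bytes(tampered)))
print(recovered)
```

Without an authentication tag (a MAC or an AEAD mode), the receiver has no way to notice the change, so the same trick can rewrite which global a pickle imports.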
Security person who specializes in cryptography and security tooling. Principal at Latacora: we bootstrap information security practices for startups. Writes a good chunk of Clojure but has a long relationship with Python. Ask lvh about: application security, cryptography, security auditing tools (AWS, GCP, GSuite, you name it), network penetration testing, Certificate Transparency...