Evolve or Die: How I Stopped Avoiding and Starting Loving Python Upgrades

30 Minute Talk
Sunday at 2:00 pm in Orchid Ballroom West

Imagine you’re all set to board a flight. You've got your book and snack, and you're nestled into your seat. The pilot assures you of a smooth journey ahead. Comforting, right? But for hundreds of thousands of Southwest passengers last holiday season, the reality was far from comforting. Remember the chaos? Southwest canceled around 17,000 flights due to outdated software in their crew scheduling system, leaving pilots clueless about their assignments.

The fallout was severe: a 15% plummet in share price, a $400 million hit to revenue, and about $300 million in compensation costs—not to mention the lasting stain on their reputation.

Now, let’s turn to you. What upgrades are you putting off? What will it cost you to delay?

Fear of the unknown is a major driver of delaying upgrades and inadvertently risking security. Which deprecations in the standard library actually affect me? Do my packages support the next version of Python? If I upgrade one package, which other packages are affected?

In my talk, I will demo how new tools—data that can be leveraged directly through Snowflake or through an app that also parses requirements.txt to do even more of the heavy lifting—can help the audience navigate minefields of Python upgrades and get to the latest Python version. The audience will also leave with practical, required actions for Python deprecations on popular platforms: AWS, Heroku, and AZURE.

Module Index Data:

We've created a unique database of all standard library Python modules by parsing data across different Python Standard Library Module Index versions—3.8, 3.9, 3.10, 3.11, and 3.12. This allows us to track changes from one version to the next, such as deprecations and removals. I will demo analyses.

Package-Module Data:

We’ve built datasets of the most popular Python Packages—including numpy, requests, python-dateutil, cryptography, boto3, certifi, idna, packaging, pytest, pyyaml, s3transfer, setuptools, six, urllib3, wheel—with their underlying standard library and non-standard library modules, imported objects, package version, and more. This enables us to warn users about modules slated for removal in their specific package dependencies.

I’ll demo how to leverage this data on Snowflake and show useful SQL queries to help uncover module deprecations and imminent removals. What is especially exciting about the data is the ability to connect package data with module deprecation data which exists for different Python versions. The potential use cases are infinite. I will demo a few that I believe will be incredibly beneficial for stakeholders.

  • Identify Deprecated Modules in a Package: This query joins a Python Package table (e.g., numpy) with a Python Standard Library Module Index table (e.g., Python 3.12) to find all modules in a package that are marked as deprecated in a Python release.
  • Track Non-Standard Library Modules: Retrieves detailed information about a package’s modules that are not part of the standard library, including their import path, object imported, and documentation link e.g., s3transfer, socket, and pytest in requests.
  • Discover Deprecated Modules Reinstated in Subsequent Releases: This query identifies modules that were marked as deprecated in one release but are not marked as deprecated in a subsequent release. This case can resolve the need for code changes.
  • Drill Down to Deprecated Module Objects by Package: This query lists the objects imported into a package for each deprecated module within a package. This allows users to identify the specific functionality at risk in their codebase.
  • Study Deprecation Proportion by Release: Calculates the proportion of modules that are deprecated in each release to gauge how much of the library is being phased out in different versions.

As Python continues to evolve, understanding the landscape of dependent modules—including those that are part of the standard library—becomes essential for maintaining a robust and secure codebase and planning successful upgrades. The stakeholders poised to derive significant value from this data include risk audit teams, software developers, and tech companies. These may include risk analysis and reporting, planning for future development, automation, and machine learning.

Then, it gets even better. I’ll also use a popular ML repo, TensorFlow, to demo an upgrade using a new app built in Python to simplify upgrades. A requirements.txt file is all that’s needed!

The app parses requirements.txt to do even more of the heavy lifting and leverages the Snowflake data above to visualize module deprecations and package dependencies. Before this app, a developer would have to scour the changelog, compile a list of deprecated modules, check each package for use of each of these modules individually to know if a package was affected by a deprecation! The Version^ team has done the work once and now the community can benefit.

This tool is accessible to users with varying levels of technical expertise, providing a unified framework for managing upgrades efficiently and thoughtfully. Even those without deep programming knowledge can evaluate the upgrade project's scope through the app's color-coded feedback system and effort estimates. Essentially, it transforms upgrading into a proactive, rather than reactive, endeavor.

For instance, some underlying modules in TensorFlow, an open-source library for numerical computation and machine learning, might be deprecated in Python 3.12. Spoiler alert: msilib is on this list! Some dependent packages may not be known to support Python 3.12 (psutil is one of many!)

For TensorFlow, which includes over 40 packages and 300 modules, manually tracking deprecations would be cumbersome. One would have to find all the deprecated modules in the target upgrade version, visit the repo of all the required packages, and search for each deprecated module in the documentation for each of the 40 packages one by one. The app automates this.

Users upload the file, which the app (Version^) parses to understand the dependencies. It does the following:
- Surfaces the deprecated module dependencies in the app’s packages
- Unpacks package dependencies and sub-dependencies
- Flags package that aren’t known to support the target Python version
- Identifies any modules that have been removed in the target Python version
- Exports a .csv to upload upgrade tasks into project management software

Version^ helps users navigate the complexities of software upgrades, promoting a secure and efficient transition to newer versions.

Finally, I will welcome questions about upgrades from the audience and present future development.

Presented by

Ruby Henry, Ph.d.