On Tuesday, security firm Trellix said its threat researchers had found a vulnerability in Python’s tarfile module, which provides a way to read and write compressed bundles of files known as tarballs. Initially, the bug hunters thought they had stumbled upon a zero-day.
It turned out to be a roughly 5,500-day problem – the bug has lived its best life for the past decade and a half while awaiting extinction.
Identified as CVE-2007-4559, the vulnerability surfaced on August 24, 2007, in a post to the Python mailing list by Jan Matejek, who was the Python package maintainer for SUSE at the time. It can be exploited to overwrite, and potentially hijack, files on a victim’s computer when a vulnerable application opens a malicious tar archive via tarfile.
“The vulnerability is basically like this: if you tar a file named "../../../../../etc/passwd" and then get the administrator to untar it, /etc/passwd is overwritten,” Matejek explained at the time.
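To make Matejek's description concrete, here is a minimal sketch of how such a member name resolves once it is joined to an extraction directory. The directory /home/admin/downloads and the helper resolve_member are hypothetical names chosen for illustration; this is not code from tarfile itself, only the path arithmetic that a naive extractor would effectively perform.

```python
import posixpath  # tar member names use forward slashes, so POSIX rules apply


def resolve_member(dest_dir, member_name):
    """Return the normalized path a naive extractor would end up writing to."""
    # join() keeps the ".." components; normpath() collapses them.
    return posixpath.normpath(posixpath.join(dest_dir, member_name))


dest = "/home/admin/downloads"  # hypothetical extraction directory
target = resolve_member(dest, "../../../../../etc/passwd")
print(target)  # /etc/passwd
```

Each ".." strips one directory from the destination, and any surplus ".." at the filesystem root is simply dropped, so enough of them will always reach the root.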
The tarfile directory traversal flaw was reported on August 29, 2007 by Tomas Hoger, a software engineer at Red Hat.
But it had already been dealt with, more or less. A day earlier, Lars Gustäbel, maintainer of the tarfile module, made a code change that added a check_paths parameter, true by default, and a helper function to the TarFile.extractall() method that raises an error if the path of a file in a tar archive is insecure.
But the fix didn’t address the TarFile.extract() method – which according to Gustäbel “shouldn’t be used at all” – and left open the possibility that extracting data from untrusted archives could cause problems.
In a comment thread, Gustäbel explained that he no longer considered this a security issue. “tarfile.py does nothing wrong, its behavior conforms to the pax definition and the pathname resolution guidelines in POSIX,” he wrote.
“There is no known or possible practical exploit. I [updated] the documentation with a warning that it could be dangerous to extract archives from untrusted sources. This is the only thing to do IMO.”
In fact, the documentation describes this footgun:
Warning: Never extract archives from untrusted sources without prior inspection. It is possible that files are created outside of path, for example members that have absolute filenames starting with "/" or filenames with two dots "..".
Yet here we are, with both extract() and extractall() still posing an arbitrary path traversal threat.
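The "prior inspection" that the documentation warning asks for might look something like the sketch below. It lists archive members whose resolved destination falls outside the extraction directory. The helper names unsafe_members and build_archive, and the directory name extract_here, are hypothetical and exist only for this illustration. The check covers member names only; symbolic-link and hard-link members would need additional handling, so treat it as a starting point, not a complete defense.

```python
import io
import os
import tarfile


def unsafe_members(tar, dest_dir):
    """Return names of members that would be written outside dest_dir."""
    root = os.path.realpath(dest_dir)
    flagged = []
    for member in tar.getmembers():
        # Resolve where a naive extraction would place this member.
        target = os.path.realpath(os.path.join(root, member.name))
        if os.path.commonpath([root, target]) != root:
            flagged.append(member.name)
    return flagged


def build_archive(names):
    """Build an in-memory tar archive of empty members (demonstration only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            tar.addfile(tarfile.TarInfo(name))
    buf.seek(0)
    return buf


archive = build_archive(["notes.txt", "../escape.txt", "/etc/passwd"])
with tarfile.open(fileobj=archive, mode="r") as tar:
    bad = unsafe_members(tar, "extract_here")
print(bad)  # ['../escape.txt', '/etc/passwd']
```

A caller could refuse to extract anything when the returned list is non-empty. Nothing is written to disk here, since the archive is built and inspected entirely in memory.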
“The vulnerability is a path traversal attack in the extract and extractall functions in the tarfile module that allow an attacker to overwrite arbitrary files by adding the ‘..’ sequence to filenames in a tar archive,” Kasimir Schulz, a vulnerability researcher for Trellix, explained in a blog post.
The “..” sequence refers to the parent directory, so a path containing it can climb out of the intended extraction folder. So, using code like the six-line snippet below, says Schulz, the tarfile module can be told to read and modify a file’s metadata before it is added to the tar archive. And the result is an exploit.
import tarfile
def change_name(tarinfo):
    tarinfo.name = "../" + tarinfo.name
    return tarinfo
with tarfile.open("exploit.tar", "w:xz") as tar:
    tar.add("malicious_file", filter=change_name)
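To see what such an archive looks like to whatever program later opens it, the following sketch applies the same renaming idea to an in-memory archive and reads the member names back. It is an illustration under stated assumptions, not Trellix's code: it uses TarFile.addfile() with a hand-built TarInfo and an uncompressed archive so that no file on disk is needed, and the payload is a harmless placeholder.

```python
import io
import tarfile


def change_name(tarinfo):
    # Same idea as the snippet above: prefix the stored name with "../".
    tarinfo.name = "../" + tarinfo.name
    return tarinfo


payload = b"harmless demo content"  # placeholder data for illustration
info = tarfile.TarInfo("malicious_file")
info.size = len(payload)

# Write the renamed member into an in-memory archive.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    tar.addfile(change_name(info), io.BytesIO(payload))

# Read the archive back the way a victim application would.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    names = tar.getnames()
print(names)  # ['../malicious_file']
```

The stored name is entirely under the archive creator's control, which is why the extracting side cannot assume member names stay inside the destination directory.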
According to Schulz, Trellix has created a free tool called Creosote to scan for CVE-2007-4559. The software has already found the lurking bug in applications like Spyder IDE, an open source scientific environment written for Python, and Polemarch, an IT infrastructure management service for Linux and Docker.
The company estimates the tarfile flaw can be found “in over 350,000 open source projects and prevalent in closed-source projects.” It also points out that tarfile is a default module in any Python project and is present in frameworks created by AWS, Facebook, Google, and Intel, and in applications for machine learning, automation, and Docker containers.
Trellix says it is working on making the repaired code available to affected projects.
“Using our tools, we currently have patches for 11,005 repositories, ready for pull requests,” said Charles McFarland, a vulnerability researcher for Trellix, in a blog post. “Each patch will be added to a forked repository and a pull request made over time. This will help both individuals and organizations become aware of the problem and provide them with a one-click solution.
“Due to the size of the vulnerable projects, we plan to continue this process in the coming weeks. This is expected to reach 12.06% of all vulnerable projects, just over 70,000 projects upon completion.”
The remaining 87.94% of affected projects may wish to consider other possible options. ®