
From Self-Hosted GitHub Runner to Self-Hosted Backdoor

Overview

Continuous Integration and Continuous Delivery (CI/CD) systems are powerful and configurable tools within modern environments. At Praetorian, we are seeing organizations migrate to SaaS solutions like GitHub (GitHub.com) as their source code management and CI/CD solution, instead of on-premises tools like BitBucket, Bamboo, and Jenkins. On our Red Team engagements, we routinely employ advanced tradecraft to reach our attack objectives. Recently, we have successfully targeted CI/CD environments to gain expanded network access.

This post will outline how compromised GitHub access can be used to pivot into an organization’s internal environment and often lead to an attacker achieving their objectives, with little to no visibility into an attacker’s actions until it is too late. Our focus is to outline a threat model for organizations that utilize GitHub and leverage self-hosted runners in conjunction with GitHub Actions. We will additionally provide recommendations for security controls that an organization could implement to protect against attackers seeking to conduct this style of attack.

Primer

GitHub is unique in that its primary offering is an Internet facing SaaS tool that provides source control management and a CI/CD system called GitHub Actions. Furthermore, GitHub offers several paid solutions to customers, but we found that many features that are essential to building effective security controls require the most expensive plan. Praetorian leveraged weak security configurations and GitHub’s inherently weak security model to conduct an attack that led to a persistent foothold within an organization’s internal network.

GitHub Actions

GitHub Actions allow the execution of code specified within workflows as part of the CI/CD process. GitHub Actions can be executed on ephemeral runners hosted within Azure or on self-hosted runners executing the GitHub Actions agent software. Many organizations opt to utilize self-hosted runners due to cost savings and integration with internal network infrastructure for package deployments.

If not configured correctly, self-hosted runners can lead to devastating security impacts for an organization. GitHub’s documentation makes no claims to security here; rather, it clearly states:

“Self-hosted runners for GitHub do not have guarantees around running in ephemeral clean virtual machines, and can be persistently compromised by untrusted code in a workflow.”

GitHub Access

Traditional Authentication

GitHub allows logging in via traditional username and password authentication through the web interface as well as through the official GitHub command line interface. Users can configure multi-factor authentication (MFA) when logging in through the CLI or web application.

Personal Access Tokens

GitHub classic Personal Access Tokens (PAT), under their current implementation, are a security risk for organizations. The problem stems from the fact that GitHub connects tokens to a user, not the organization. If a token is generated with the repo scope, that token will have the same access that the user has to all repositories accessible by that user. Suppose the user is an administrator to a private repository within their employer’s GitHub organization, but they created a token to work on a personal project at home. In that case, the token will still have administrative access to their employer’s GitHub repository.

To make matters worse, when a classic PAT exists for a GitHub Teams customer, restricting whether a user’s PAT can work on the organization’s repositories is impossible. Likewise, an organization has no way to inventory valid classic PATs associated with their users.
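To make the scope problem concrete: for classic PATs, GitHub's REST API reports the token's scopes in the `X-OAuth-Scopes` response header. The sketch below only parses such a header value and flags broad scopes. The function names and the `RISKY_SCOPES` set are illustrative assumptions, not a GitHub-defined list.

```python
# Hypothetical helper names; the "risky" set is an assumption for illustration.
RISKY_SCOPES = {"repo", "workflow", "admin:org"}


def parse_scopes(header_value: str) -> set:
    """Split an X-OAuth-Scopes value such as 'repo, workflow' into a set."""
    return {scope.strip() for scope in header_value.split(",") if scope.strip()}


def risky_scopes(header_value: str) -> set:
    """Return the scopes from the header that this sketch treats as high impact."""
    return parse_scopes(header_value) & RISKY_SCOPES


# Example header value as it might appear on an API response.
print(sorted(risky_scopes("admin:org, repo, workflow, read:user")))
```

A token created for a personal project that reports `repo` here would carry that access into every organization repository its owner can reach.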

The Attack

On one of our Red Team engagements, we obtained access to a GitHub personal access token associated with a machine account. There are many different ways that a developer could accidentally disclose a PAT that they have generated, such as phishing, personal laptop compromise, and accidental inclusion in command line logs.

This token possessed the repo scope. Our Red Team then identified and exploited the use of self-hosted runners and created a malicious GitHub Actions workflow to obtain persistence on the runner. This opened the door for privilege escalation and lateral movement.

Figure 1 shows our complete attack path.

Figure 1: A diagram of our attack path using GitHub PATs.

Attack Steps:

Obtained GH access token
Identified use of Self-Hosted Runners
Updated existing GitHub Action to run a payload
Triggered malicious GH action
Received Malware Check-in to Praetorian C2

This organization grouped the GitHub runners into a pool for the entire organization. Any developer in the organization could create a project that utilized GitHub Actions. Upon a trigger event under this configuration, an available runner would pick up the workflow and begin executing code specified in the YAML file specified within the repository. By default, all repositories within an organization have GitHub Actions enabled, and the default configuration is to have a single runner group for the organization.
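As a rough illustration of the second attack step, the sketch below checks workflow YAML text for jobs that target self-hosted runners through the `runs-on` key. It is a line-based approximation with a hypothetical function name. It does not parse YAML, so it misses multi-line label lists and runners selected only by custom labels or groups.

```python
import re

# Matches single-line forms such as "runs-on: self-hosted"
# or "runs-on: [self-hosted, linux]". Commented-out lines do not match.
RUNS_ON = re.compile(r"^\s*runs-on:\s*(.+)$")


def uses_self_hosted(workflow_text: str) -> bool:
    """Return True if any runs-on line mentions the self-hosted label."""
    for line in workflow_text.splitlines():
        match = RUNS_ON.match(line)
        if match and "self-hosted" in match.group(1):
            return True
    return False
```

Defenders can run the same check across `.github/workflows/` in each repository to inventory which projects can reach the shared runner pool.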

Enumeration

Exposed Tokens

As we see in Figure 2, GitHub states clearly that PATs are sensitive, but let us explore the actions that could allow a PAT to be leaked or disclosed.

Figure 2: GitHub’s warning regarding the sensitivity of personal access tokens.

Git Clone Operations: When a user clones a repository with HTTPS using a PAT, the token is stored in plaintext within the .git folder. Simply running `git remote -v` will disclose the token. If a developer clones a repo they are working on with their personal account and shares it without realizing that the token is saved, as Figure 3 demonstrates, they could inadvertently provide access to their employer’s repositories. An example of a `git clone` using a GitHub token for authentication is:

git clone https://ghp_SECRET_TOKEN@github.com/org/repo

GitHub API calls and security logging: A user could make GitHub API calls with curl, and have their commands logged by an EDR agent, which could then be ingested into a SIEM.
Malware: GitHub tokens start with the recognizable string ‘ghp_’, as Figure 3 demonstrates. Malware authors could write custom malware that searches bash history and locally cloned git repos for tokens and sends them to the attacker. Such code is very simple to write and, being custom, is unlikely to trigger any detections. An attacker could even sneak this into legitimate tools that developers are likely to use, such as a malicious Vim plugin.

Figure 3: GitHub tokens start with the ‘ghp_’ string and are stored on the filesystem for repositories cloned using HTTPS.
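The same prefix that makes tokens easy for malware to find also makes them easy for defenders to find first. The following is a minimal sketch that searches text (for example, the contents of `.git/config` or a shell history file) for strings beginning with `ghp_`. The exact character set after the prefix is an assumption, and the function names are hypothetical.

```python
import re

# Assumption: a classic PAT is "ghp_" followed by letters and digits.
TOKEN = re.compile(r"ghp_[A-Za-z0-9]+")


def find_tokens(text: str) -> list:
    """Return every ghp_-prefixed candidate token found in the text."""
    return TOKEN.findall(text)


def redact(text: str) -> str:
    """Replace each candidate token so the text is safe to log or share."""
    return TOKEN.sub("ghp_<redacted>", text)
```

Running `find_tokens` over a repository's `.git/config` before sharing a working copy would catch the clone scenario described above.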

The red team had retrieved a valid GH token through an exposure vulnerability. We were then able to authenticate to the organization’s GH instance and navigate through available repositories. We eventually discovered a repository that contained secrets in an old commit. One of them was another PAT for the same user, which carried the admin:org, repo, and workflow scopes. This broadened the types of attacks that we could use.

The repo scope alone cannot update or create new workflows, but can run a workflow by updating a branch that is already configured to execute workflows or by creating a new branch. If a workflow file calls a shell script or Makefile within the repo as part of a build step, then an attacker can simply update that file in order to execute code they control.

Payload Creation

Our Red Team identified that certain self-hosted runners had Docker installed, and the user running the GitHub runner process was a member of the docker group. This meant that the host could execute a docker container, which could provide root-level access to the host depending on the invocation. Our Red Team utilized this privilege escalation path to obtain root access on the self-hosted runners.

This also provided some degree of evasion from EDR solutions present on the host. We utilized an Alpine Linux Docker container and configured it to download a malicious payload we hosted on GitHub in an encrypted zip. Furthermore, if an attacker executes a Docker container in detached mode, it will persist beyond the execution of the job and will prevent the runner from hanging.
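Defenders can check for this condition directly. The sketch below parses text in the standard `/etc/group` format and lists the supplementary members of the `docker` group. The function name is hypothetical, and the check does not cover users whose primary group is `docker`, since those are recorded in `/etc/passwd` instead.

```python
def docker_group_members(group_file_text: str, group: str = "docker") -> list:
    """Return the users listed as members of the given group.

    Expects /etc/group style lines: name:password:gid:member1,member2
    """
    for line in group_file_text.splitlines():
        fields = line.strip().split(":")
        if len(fields) == 4 and fields[0] == group:
            return [member for member in fields[3].split(",") if member]
    return []
```

If the account that runs the GitHub runner process appears in the result, any workflow that lands on that runner can likely reach root on the host.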

Persistence

We achieved persistence by using Docker; however, we could still choose to run a task in the background from a GitHub Action. Typically, a GitHub Action will clean up all orphan processes upon termination. This theoretically would prevent an attacker from simply executing a background task with nohup or disown. A little-known environment variable does allow processes initiated by the workflow to persist beyond the initial execution, though.

By setting the RUNNER_TRACKING_ID environment variable to 0, the GitHub Actions runner will not attempt to clean up any child process still running after the action completes. An attacker can use this to spawn a persistent background process to obtain a foothold on the action runner itself.

If a workflow calls a shell script, adding `export RUNNER_TRACKING_ID=0` before executing any command will prevent the cleanup step from terminating the orphan process. To test this, we had our runner execute the following lines of code within a shell script:

export RUNNER_TRACKING_ID=0 && nohup python3 -c "import time; time.sleep(100)" &

sleep 25

While the sleep command executed, the Python process ran as a child of the shell script’s process (parent PID 2499 in the second entry below). When sleep 25 finished, the parent process exited, but the Python process continued to run, re-parented to PID 1 (first entry below).

ghrunner    2501       1  0 19:33 pts/0    00:00:00 python3 -c import time; time.sleep(100)

ghrunner    2501    2499  0 19:33 pts/0    00:00:00 python3 -c import time; time.sleep(100)
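One possible detection for this technique is to look for processes on the runner whose environment carries the override. On Linux, `/proc/<pid>/environ` holds a process's initial environment as NUL-separated `KEY=VALUE` entries. The sketch below parses that format and flags the value described above. The function names are hypothetical, and an attacker who sets the variable some other way may not be caught.

```python
def parse_environ(raw: bytes) -> dict:
    """Parse NUL-separated KEY=VALUE bytes, as found in /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def has_tracking_override(raw: bytes) -> bool:
    """Return True if RUNNER_TRACKING_ID was set to 0 for this process."""
    return parse_environ(raw).get("RUNNER_TRACKING_ID") == "0"
```

A scheduled job on the runner could read each `/proc/<pid>/environ` it has permission to open and alert on any match that outlives its workflow job.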

Execution

To trigger our attack, we identified a repository with GitHub Actions enabled and an existing workflow. We created a new branch within that repository and updated the workflow YAML file to run only on self-hosted runners. This workflow file ran bash code that downloaded a payload from an external GitHub repository that we controlled and executed it. With only the repo scope, we could instead have found code called by the existing YAML file and modified it to perform our malicious actions.

This led to an implant callback from within our client’s environment. We utilized this foothold to move laterally and gain substantial access within that environment.

Cleanup

After executing our payload, we utilized GitHub API calls to delete the workflow runs associated with our branch and the branch itself. These API calls required the workflow scope. However, an attacker could simply skip this step.

Detection & Mitigation

Because GitHub restricts security features by plan tier, some recommendations will vary based on the plan in use. Pertinent aspects of GitHub’s security guide are available for reference here.

Logging Blind Spots

Organizations that use GitHub have no visibility into traffic to GitHub.com unless GitHub exposes that information in its audit logs. This is because, with a SaaS solution, all connections go directly to GitHub.com.

We performed a thorough evaluation of GitHub’s audit logging functionality prior to conducting the self-hosted runner attack. We saw the following two issues with GitHub’s audit logging:

Lack of visibility into GET requests made against the GitHub API with PATs
Restriction of what we view as essential logging capabilities

The first issue with GitHub’s audit log is that it currently does not capture events where an attacker is simply querying information about an organization. If an attacker obtains a PAT with no other knowledge about it, they can use a combination of REST API requests and git operations to learn:

This leads into the next issue: many security features that Praetorian considers essential for an organization are locked behind GitHub’s highest price tier, as Figure 4 highlights.

Figure 4: GitHub documentation noting that certain logging capabilities are available only at the most expensive tier.

Implications for Teams Plan Users

The more affordable Teams plan lacks certain audit logging features that would enable an organization’s SOC to learn of an attacker’s actions before it is too late. In particular, clone, fetch, and push actions do not generate audit log events for customers on the Teams plan. This means that if an attacker obtained a GitHub PAT that allowed access to a private organization on the Teams plan, they could clone every single repository within that organization without generating any audit log events.

Within the attack chain, the time between the attacker conducting the first clone operation and executing their attack is the defender’s largest detection window. This is where an attacker will be enumerating repositories for existing workflows and determining what code they can modify, as Figure 5 shows.

Figure 5: The series of steps an attacker must take, with red denoting the steps that bracket the optimal detection window.

Furthermore, the audit log REST API is only enabled for Enterprise customers, while other organizations must use the web interface to view audit logs. This makes it impossible for organizations on the Teams plan to develop a real-time detection to alert defenders of possible malicious activity.

Just as the git clone, push, and fetch events require the more expensive plan, the workflow events associated with executing a GitHub Action are not viewable by users on the Teams plan. To query them, the user must have access to the Audit Log API under the Enterprise plan. Following is an example query that retrieves audit log events associated with workflow run creations:

```
curl \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer <YOUR-TOKEN>" \
  "https://api.github.com/orgs/ORG/audit-log?phrase=action:workflows.created_workflow_run"
```

Pay-gated Security Strikes Again

Only organizations on the Enterprise Plan can create multiple runner groups. Organizations using the Teams plan can either add runners to this default group or configure them on a per-repository basis, as seen in Figure 6.

Figure 6: The option to use either a default group or one-off configuration for runners under the GitHub Teams plan.

This restriction is unfortunate because it creates a situation where developer productivity and security controls are placed directly at odds.

Fine-Grained Access Tokens

On October 18th, 2022, GitHub announced a beta for fine-grained access tokens. These tokens allow granular permissions settings, organization-level approval, and insight into tokens.

This is a step in the right direction for organizations because it shifts the visibility of tokens that apply to an organization from user accounts to the organization itself. Unfortunately, this feature currently is an all-or-nothing setting. An administrator can disable all PATs for an organization, but there is no way to retain classic PAT functionality to facilitate a smooth transition of systems that currently rely on them while disabling them for standard organization members.

General

Adopt a zero-trust security posture concerning self-hosted runners. Minimize access to the broader internal network.
Configure an egress allow-list policy for connections from the runner to the Internet.
Recycle runners routinely. When leveraging a runner pool, developing tooling to disable, delete, and re-register runners can prevent attackers from establishing a long-term foothold on a single runner. GitHub’s API can be used to programmatically provision runners. Combining this with containerization allows runners to be cycled quickly with little overhead.
Ensure that EDR and/or anti-virus solutions are running on self-hosted runners.
Treat the ability to run an action on a runner as equivalent to running actions for any repository which utilizes that runner. If repositories in your organization have more restricted access, make sure they do not share runners with repositories with fewer restrictions.
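The runner recycling recommendation above can be sketched as a small planning step. This example assumes the organization keeps its own record of when each runner was registered, keyed by runner ID. That record, the function name, and the age threshold are all hypothetical. The actual removal and re-registration would go through GitHub's self-hosted runner API endpoints and is not shown here.

```python
from datetime import datetime, timedelta, timezone


def runners_to_recycle(registered_at: dict, now: datetime, max_age: timedelta) -> list:
    """Return the IDs of runners whose registration is at least max_age old.

    registered_at maps runner ID -> registration time (a locally kept record).
    """
    return sorted(
        runner_id
        for runner_id, registered in registered_at.items()
        if now - registered >= max_age
    )


# Example: recycle anything registered more than a day ago.
now = datetime(2024, 1, 8, tzinfo=timezone.utc)
records = {
    11: datetime(2024, 1, 1, tzinfo=timezone.utc),
    12: datetime(2024, 1, 7, 12, tzinfo=timezone.utc),
}
print(runners_to_recycle(records, now, timedelta(days=1)))
```

A shorter maximum age narrows the window in which a persistent process or container started by a malicious workflow can survive.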

Teams

Enable actions only for selected repositories, as demonstrated in Figure 7.

Figure 7: Opting for selected repositories to use GitHub Actions rather than all repositories.

Configure self-hosted runners at the repository level and not the organization for any repositories containing more sensitive source code.

Enterprise

Enable SAML SSO for the organization. This will require that all personal access tokens are individually authorized to the organization.
Consume workflow log events into a SIEM.
Log the org_credential_authorization.grant event and capture the description to determine whether it is associated with a PAT SSO authorization, as in Figure 8.

Figure 8: Logging the org_credential_authorization.grant event and capturing its description.

Log actor IP addresses in audit logs. This will require adjusting the default setting, as shown in Figure 9. We recommend enabling this so events associated with token compromise can be detected. A detection on `git clone` entries combined with source IP checks will detect any incidence of enumeration of repositories themselves conducted by an attacker.

Figure 9: The setting for IP disclosure is not enabled by default.
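The clone detection described above could look roughly like the following once audit log events reach a SIEM or a script. This is a sketch under assumptions: the event field names (`action`, `actor`, `actor_ip`), the `git.clone` action value, and the function name are illustrative and should be checked against the events your organization actually receives.

```python
import ipaddress


def unexpected_clones(events: list, allowed_cidrs: list) -> list:
    """Return git.clone events whose source IP is outside the allowed networks.

    Events without an actor_ip field are skipped, since there is nothing to compare.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
    flagged = []
    for event in events:
        if event.get("action") != "git.clone":
            continue
        ip = event.get("actor_ip")
        if ip is None:
            continue
        address = ipaddress.ip_address(ip)
        if not any(address in network for network in networks):
            flagged.append(event)
    return flagged
```

The allow-list would typically hold corporate egress ranges and CI infrastructure, so that a clone from any other address with a valid token stands out.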

People

Ensure that developers are trained to understand the security implications of their CI/CD configuration and the important role they play in ensuring the security of the organization.
Encourage developers to utilize fine-grained personal access tokens even if it is not possible to switch to them exclusively in the immediate term.

What’s Next?

We hope that Microsoft will alter its stance on security features offered with lower-tier plans. Self-hosted runners expose organizations to considerable risk if not configured correctly. Therefore, if lower-cost plans support self-hosted runners, they also should include the full suite of security features necessary to monitor them before an attacker has already obtained code execution. The next blog post in our CI/CD series will show how we executed a similar attack on self-hosted runners for GitLab, which is a competing SCM and CI/CD platform.

At Praetorian, we understand the latest CI/CD attacks and how to best secure your organization’s CI/CD environment against sophisticated attackers. Let our experts guide you through the process of building detections against malicious CI/CD activity or test the robustness of your existing defense.
