Skip to content

feat: detect vulnerable GitHub Actions #1021

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

behnazh-w
Copy link
Member

@behnazh-w behnazh-w commented Mar 21, 2025

Summary

This pull request introduces a new check mcn_githubactions_vulnerabilities_1 to detect vulnerable GitHub Actions, enhancing the security of workflows and automating the identification of potential risks in CI/CD pipelines. The key changes include:

Changes

  • A check for specific versions of third-party GitHub Actions that are known to be affected by vulnerabilities. The version can be a commit SHA associated with a tag (version).
  • A new module has been added to implement interactions with OSV (Open Source Vulnerability) API.

Documentation

  • The documentation website has been updated to reflect this new check.
  • A tutorial has been added to guide users through the process of setting up and using the check in their workflows.

Tests

  • Integration and unit tests have been added.

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Mar 21, 2025
@behnazh-w behnazh-w force-pushed the behnaz/add-gh-vuln-gha-check branch 2 times, most recently from 3f5b70c to e056b7c Compare March 26, 2025 00:30
@behnazh-w behnazh-w force-pushed the behnaz/add-gh-vuln-gha-check branch 5 times, most recently from 6e429f7 to 9776ec8 Compare April 4, 2025 05:38
@behnazh-w behnazh-w marked this pull request as ready for review April 7, 2025 04:48
@behnazh-w behnazh-w requested a review from tromai as a code owner April 7, 2025 04:48
@benmss benmss self-requested a review April 7, 2025 04:53
@@ -46,12 +46,12 @@ Current checks in Macaron
The table below shows the current set of actionable checks derived from
the requirements that are currently supported by Macaron.

.. list-table:: Mapping SLSA requirements to Macaron checks
.. list-table:: Macaron checks descriptions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.. list-table:: Macaron checks descriptions
.. list-table:: Macaron check descriptions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit a684508.

How to detect vulnerable GitHub Actions
=======================================

This tutorial explains how to use a check in Macaron that detects vulnerable third-party GitHub Actions. This check is important for preventing security issues in your CI/CD pipeline, especially in light of recent incidents, such as vulnerabilities discovered in popular GitHub Actions like `tj-actions/changed-files <https://www.cve.org/CVERecord?id=CVE-2025-30066>`_, and `reviewdog/action-setup <https://www.cve.org/CVERecord?id=CVE-2025-30154>`_.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This tutorial explains how to use a check in Macaron that detects vulnerable third-party GitHub Actions. This check is important for preventing security issues in your CI/CD pipeline, especially in light of recent incidents, such as vulnerabilities discovered in popular GitHub Actions like `tj-actions/changed-files <https://www.cve.org/CVERecord?id=CVE-2025-30066>`_, and `reviewdog/action-setup <https://www.cve.org/CVERecord?id=CVE-2025-30154>`_.
This tutorial explains how to use a check in Macaron to detect vulnerable third-party GitHub Actions. This check is important for preventing security issues in your CI/CD pipeline, especially in light of recent incidents, such as vulnerabilities discovered in popular GitHub Actions: `tj-actions/changed-files <https://www.cve.org/CVERecord?id=CVE-2025-30066>`_, and `reviewdog/action-setup <https://www.cve.org/CVERecord?id=CVE-2025-30154>`_.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit a684508.


This tutorial explains how to use a check in Macaron that detects vulnerable third-party GitHub Actions. This check is important for preventing security issues in your CI/CD pipeline, especially in light of recent incidents, such as vulnerabilities discovered in popular GitHub Actions like `tj-actions/changed-files <https://www.cve.org/CVERecord?id=CVE-2025-30066>`_, and `reviewdog/action-setup <https://www.cve.org/CVERecord?id=CVE-2025-30154>`_.

We will guide you on how to enable and use this check to enhance the security of your development pipeline.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
We will guide you on how to enable and use this check to enhance the security of your development pipeline.
We will demonstrate how to enable and use this check to enhance the security of your development pipeline.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit a684508.

Comment on lines 93 to 96
if isinstance(callee, GitHubWorkflowNode) and callee.node_type in [
GitHubWorkflowType.EXTERNAL,
GitHubWorkflowType.REUSABLE,
]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if isinstance(callee, GitHubWorkflowNode) and callee.node_type in [
GitHubWorkflowType.EXTERNAL,
GitHubWorkflowType.REUSABLE,
]:
if isinstance(callee, GitHubWorkflowNode) and callee.node_type in {
GitHubWorkflowType.EXTERNAL,
GitHubWorkflowType.REUSABLE,
}:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit a684508.

Comment on lines 107 to 111
if not workflow_name:
logger.debug("Workflow %s is not relevant. Skipping...", callee.name)
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only trigger if callee.name has nothing before the @ symbol it contains. Is that expected in some cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also triggers when there is no @ in the callee.name, e.g., when an internal reusable workflow is called in the same repo. We are not interested in these internal workflows. I will improve the condition to be more explicit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit a684508.

CheckResultData
The result of the check.
"""
result_tables: list[CheckFacts] = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest moving this further down the function to just before where it's actually used. E.g. for vuln_res in batch_vulns:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit a684508.

@behnazh-w behnazh-w changed the base branch from staging to main April 8, 2025 04:41
Comment on lines 222 to 264
section_name = "osv_dev"
if not defaults.has_section(section_name):
return []
section = defaults[section_name]

url_netloc = section.get("url_netloc")
if not url_netloc:
raise APIAccessError(
f'The "url_netloc" key is missing in section [{section_name}] of the .ini configuration file.'
)
url_scheme = section.get("url_scheme", "https")
query_endpoint = section.get("querybatch_endpoint")
if not query_endpoint:
raise APIAccessError(
f'The "query_endpoint" key is missing in section [{section_name}] of the .ini configuration file.'
)
try:
url = urllib.parse.urlunsplit(
urllib.parse.SplitResult(
scheme=url_scheme,
netloc=url_netloc,
path=query_endpoint,
query="",
fragment="",
)
)
except ValueError as error:
raise APIAccessError("Failed to construct the API URL.") from error

response = send_post_http_raw(url, json_data=query_data, headers=None)
res_obj = None
if response:
try:
res_obj = response.json()
except requests.exceptions.JSONDecodeError as error:
raise APIAccessError(f"Unable to get a valid response from {url}: {error}") from error

results = res_obj.get("results") if res_obj else None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of this code is identical to call_osv_query_api. I think we should be re-using it instead of duplicating it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit a684508.

# Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/.

description: |
Analyzing with PURL and repository path without dependency resolution.
Analyzing the staging branch of the Macaron repo to detect vulnerable GitHub Actions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test should be updated to use branch main.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit a684508.

provenances=[],
build_info_results=InTotoV01Payload(statement=InferredProvenance().payload),
)
match ci_name:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for using a match statement here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this approach tidier than using if-else statements, and it also makes it easier to extend support for additional CI types. But I can switch to if-else if preferred, it’s not a big deal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit a684508.

Comment on lines 55 to 72
def test_is_affected_version_invalid_commit() -> None:
"""Test if the function can handle invalid commits"""
with pytest.raises(APIAccessError):
OSVDevService.is_version_affected(
vuln={}, pkg_name="pkg", pkg_version="invalid_commit", ecosystem="GitHub Actions"
)


def test_is_affected_version_invalid_response() -> None:
"""Test if the function can handle empty OSV response."""
with pytest.raises(APIAccessError):
OSVDevService.is_version_affected(
vuln={"vulns": []}, pkg_name="repo/workflow", pkg_version="1.0.0", ecosystem="GitHub Actions"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it seems that both of these functions fail because vuln is not correctly populated. In is_version_affected we have the following block:

affected = json_extract(vuln, ["affected"], list)
if not affected:
    raise APIAccessError(f"Failed to extracted info for {pkg_name}@{pkg_version}.")

Can we be sure that the error is being raised from the expected part of the function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will improve the test to make a distinction between these two cases explicitly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit a684508.

@@ -1,4 +1,4 @@
.. Copyright (c) 2023 - 2023, Oracle and/or its affiliates. All rights reserved.
.. Copyright (c) 2023 - 2025, Oracle and/or its affiliates. All rights reserved.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I don't think this file's copyright notice should be updated.

Copy link
Member Author

@behnazh-w behnazh-w Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I noticed that the copyright year was not updated even though the content is changed in previous PRs. I will remove it from my PR, but we should watch for these updates because we don't automatically change the headers in rst files.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit a684508.

@@ -907,3 +907,79 @@ def is_empty_repo(git_obj: Git) -> bool:
return False
except GitCommandError:
return True


def is_commit_hash(version_str: str) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we rename this variable to make it more generic? From the doc string, I don't think it must be a version string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit a684508.

if isinstance(results, list):
if expected_size:
if len(results) != expected_size:
raise APIAccessError(f"Unable to get a valid result from {url}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we add the reason why this result is invalid (for example, saying that the results doesn't have the expected number of elements).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit a684508.


results = res_obj.get("results") if res_obj else None

if isinstance(results, list):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If results is not of type list, should we raise an exception instead ? Or we treat that as an "empty" result 🤔 .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the dosctring indicates, in this case we return an empty list. But I can change that behavior to raise an exception.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Thanks.
I think it's okay to leave it as is, and perhaps it would be good to add an example of improperly formatted being that results is not of an expected type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit a684508.

@behnazh-w behnazh-w force-pushed the behnaz/add-gh-vuln-gha-check branch from 9776ec8 to a684508 Compare April 22, 2025 06:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCA Verified All contributors have signed the Oracle Contributor Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants