# GHSA-c65p-x677-fgj6

Vulnerability from GitHub
## Summary

In the file `vllm/multimodal/hasher.py`, the `MultiModalHasher` class has a security and data integrity issue in its image hashing method. Currently, it serializes `PIL.Image.Image` objects using only `obj.tobytes()`, which returns only the raw pixel data, without including metadata such as the image's shape (width, height, mode). As a result, two images of different sizes (e.g., 30x100 and 100x30) with the same pixel byte sequence could generate the same hash value. This may lead to hash collisions, incorrect cache hits, and even data leakage or security risks.
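This shape ambiguity is easy to reproduce with Pillow. The sketch below (an illustration, not code from vLLM, using grayscale mode `"L"` so each pixel is one byte) builds a 30x100 and a 100x30 image from the same 3000-byte buffer and shows that their raw pixel serializations are identical:

```python
from PIL import Image

# 3000 bytes of pixel data shared by both images (mode "L" = 1 byte/pixel)
payload = bytes(i % 256 for i in range(3000))

img_a = Image.frombytes("L", (30, 100), payload)   # 30 wide, 100 tall
img_b = Image.frombytes("L", (100, 30), payload)   # 100 wide, 30 tall

# Different images, identical raw serialization -> identical hash input
print(img_a.size != img_b.size)            # True
print(img_a.tobytes() == img_b.tobytes())  # True
```

Any hash computed over `tobytes()` alone therefore collides for these two images.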
## Details

- **Affected file:** `vllm/multimodal/hasher.py`
- **Affected method:** `MultiModalHasher.serialize_item`
  https://github.com/vllm-project/vllm/blob/9420a1fc30af1a632bbc2c66eb8668f3af41f026/vllm/multimodal/hasher.py#L34-L35
- **Current behavior:** For `Image.Image` instances, only `obj.tobytes()` is used for hashing.
- **Problem description:** `obj.tobytes()` does not include the image's width, height, or mode metadata.
- **Impact:** Two images with the same pixel byte sequence but different sizes could be regarded as the same image by the cache and hashing system, which may result in:
  - Incorrect cache hits, leading to abnormal responses
  - Deliberate construction of images with different meanings but the same hash value
## Recommendation

In the `serialize_item` method, **serialization of `Image.Image` objects should include not only pixel data, but also all critical metadata**, such as dimensions (`size`), color mode (`mode`), format, and especially the `info` dictionary. The `info` dictionary is particularly important for palette-based images (e.g., mode `'P'`), where the palette itself is stored in `info`. Ignoring `info` can result in hash collisions between visually distinct images with the same pixel bytes but different palettes or metadata. This can lead to incorrect cache hits or even data leakage.

**Summary:** Serializing only the raw pixel data is insecure. Always include all image metadata (`size`, `mode`, `format`, `info`) in the hash calculation to prevent collisions, especially in cases like palette-based images.
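As a rough sketch of this idea (an illustration only, not vLLM's actual patch — the real implementation is in the fix PR), the serialized form can prefix the pixel bytes with the metadata so that same-byte, different-shape images no longer collide:

```python
from PIL import Image

def serialize_image(obj: Image.Image) -> bytes:
    """Illustrative sketch: bind mode, size, format, and the info dict
    to the pixel bytes before hashing, so metadata differences matter."""
    meta = repr((obj.mode, obj.size, obj.format, sorted(obj.info.items())))
    return meta.encode("utf-8") + b"\x00" + obj.tobytes()

# The two same-byte, different-shape images now serialize differently.
payload = bytes(3000)
img_a = Image.frombytes("L", (30, 100), payload)
img_b = Image.frombytes("L", (100, 30), payload)
print(serialize_image(img_a) != serialize_image(img_b))  # True
```

Using `repr` of a tuple here is just one way to get an unambiguous, self-delimiting prefix; any encoding that cannot be confused with a different (metadata, pixels) pair would do.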
**Impact for other modalities**

For video, the input is transformed into a multi-dimensional numpy array whose dimensions encode the video's width, height, number of frames, and so on, so the same problem exists when the array's shape is not included in the serialization.

For audio, mono downmixing is left enabled in `librosa.load` (its default), so the loaded audio is automatically mixed down to a single channel and returned as a one-dimensional numpy array. The array's structure is therefore fixed, and audio is not affected by this issue.
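The numpy side of this can be demonstrated the same way: `ndarray.tobytes()` discards the array's shape, so reshaped views of the same buffer serialize identically unless the shape (and dtype) are hashed as well. A minimal sketch:

```python
import numpy as np

frames = np.arange(24, dtype=np.uint8).reshape(2, 3, 4)  # e.g. 2 frames of 3x4
reshaped = frames.reshape(4, 3, 2)                       # same bytes, new shape

print(frames.tobytes() == reshaped.tobytes())  # True: shape is lost

# Including shape and dtype in the serialized form removes the ambiguity.
def serialize_array(a: np.ndarray) -> bytes:
    return repr((a.shape, str(a.dtype))).encode() + b"\x00" + a.tobytes()

print(serialize_array(frames) == serialize_array(reshaped))  # False
```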
## Fixes
- https://github.com/vllm-project/vllm/pull/17378
## Advisory metadata

- **Advisory ID:** GHSA-c65p-x677-fgj6 ("vLLM has a Weakness in MultiModalHasher Image Hashing Implementation")
- **Aliases:** CVE-2025-46722
- **Affected package:** `vllm` (PyPI), introduced in 0.7.0, fixed in 0.9.0
- **Severity:** Moderate (CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:L/I:N/A:L)
- **CWE IDs:** CWE-1023, CWE-1288
- **Published:** 2025-05-28T18:03:41Z; **modified:** 2025-05-29T21:36:24Z; **NVD published:** 2025-05-29T17:15:21Z
- **References:**
  - https://github.com/vllm-project/vllm/security/advisories/GHSA-c65p-x677-fgj6
  - https://nvd.nist.gov/vuln/detail/CVE-2025-46722
  - https://github.com/vllm-project/vllm/pull/17378
  - https://github.com/vllm-project/vllm/commit/99404f53c72965b41558aceb1bc2380875f5d848
  - https://github.com/pypa/advisory-database/tree/main/vulns/vllm/PYSEC-2025-43.yaml
  - https://github.com/vllm-project/vllm