ghsa-fgc5-qhj8-xm5g
Vulnerability from github
Published: 2024-08-28 09:30
Modified: 2024-09-10 18:30
Details

In the Linux kernel, the following vulnerability has been resolved:

mm: gup: stop abusing try_grab_folio

A kernel warning was reported when pinning a folio in CMA memory while launching an SEV virtual machine. The splat looks like:

[  464.325306] WARNING: CPU: 13 PID: 6734 at mm/gup.c:1313 __get_user_pages+0x423/0x520
[  464.325464] CPU: 13 PID: 6734 Comm: qemu-kvm Kdump: loaded Not tainted 6.6.33+ #6
[  464.325477] RIP: 0010:__get_user_pages+0x423/0x520
[  464.325515] Call Trace:
[  464.325520]  <TASK>
[  464.325523]  ? __get_user_pages+0x423/0x520
[  464.325528]  ? __warn+0x81/0x130
[  464.325536]  ? __get_user_pages+0x423/0x520
[  464.325541]  ? report_bug+0x171/0x1a0
[  464.325549]  ? handle_bug+0x3c/0x70
[  464.325554]  ? exc_invalid_op+0x17/0x70
[  464.325558]  ? asm_exc_invalid_op+0x1a/0x20
[  464.325567]  ? __get_user_pages+0x423/0x520
[  464.325575]  __gup_longterm_locked+0x212/0x7a0
[  464.325583]  internal_get_user_pages_fast+0xfb/0x190
[  464.325590]  pin_user_pages_fast+0x47/0x60
[  464.325598]  sev_pin_memory+0xca/0x170 [kvm_amd]
[  464.325616]  sev_mem_enc_register_region+0x81/0x130 [kvm_amd]

Per yangge's analysis, starting the SEV virtual machine calls pin_user_pages_fast(..., FOLL_LONGTERM, ...) to pin the memory. Because the page is in a CMA area, fast GUP fails the long-term pinnable check in try_grab_folio() and falls back to the slow path.

The slow path tries to pin the pages and then migrate them out of the CMA area. But the slow path also uses try_grab_folio() to pin the page, so it fails on the same check, which triggers the warning above.

In addition, try_grab_folio() is meant for the fast path: it elevates the folio refcount with an add-ref-unless-zero operation. The slow path is guaranteed to hold at least one stable reference, so a simple atomic add suffices there. The performance difference should be trivial, but the misuse can be confusing and misleading.
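The difference between the two refcount operations can be sketched in userspace C11. This is a toy illustration, not the kernel's implementation: struct toy_folio, toy_ref_add_unless_zero() and toy_ref_add() are hypothetical names invented here, and only the reference count is modeled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for a folio (hypothetical); only the refcount is modeled. */
struct toy_folio {
    atomic_int refcount;
};

/*
 * Fast-path style: the caller holds no reference yet, so the folio may be
 * freed concurrently. Take references only if the count is still non-zero.
 */
static bool toy_ref_add_unless_zero(struct toy_folio *f, int refs)
{
    int old = atomic_load(&f->refcount);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current count. */
        if (atomic_compare_exchange_weak(&f->refcount, &old, old + refs))
            return true;
    }
    return false; /* count was zero: the folio is going away */
}

/*
 * Slow-path style: the caller already holds a stable reference, so the
 * count cannot drop to zero underneath it and a plain atomic add is enough.
 */
static void toy_ref_add(struct toy_folio *f, int refs)
{
    atomic_fetch_add(&f->refcount, refs);
}
```

With a count of 1, toy_ref_add_unless_zero(&f, 2) succeeds and leaves the count at 3; with a count of 0 it returns false and leaves the count untouched. toy_ref_add() never fails, which is only safe because its caller is assumed to already own a reference.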

The fix renames try_grab_folio() to try_grab_folio_fast() and try_grab_page() to try_grab_folio(), and uses each in its proper path. This resolves both the misuse and the kernel warning.
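The before and after behavior described above can be modeled with a small userspace sketch. It is a deliberately simplified toy and not kernel code: every toy_ and TOY_ name is hypothetical, a single page stands in for the folio, and migration out of CMA is reduced to clearing a flag.

```c
#include <stdbool.h>

#define TOY_FOLL_PIN      0x1u
#define TOY_FOLL_LONGTERM 0x2u

/* Hypothetical page model: CMA placement plus a plain reference count. */
struct toy_page {
    bool in_cma;
    int refcount;
};

enum toy_result { TOY_PINNED_FAST, TOY_PINNED_SLOW, TOY_WARN_EFAULT };

/* Grab that refuses long-term pins of CMA pages (fast-path behavior). */
static bool toy_grab_checked(struct toy_page *p, unsigned int flags)
{
    if ((flags & TOY_FOLL_LONGTERM) && p->in_cma)
        return false;
    p->refcount++;
    return true;
}

/* Grab without the check, for a caller that already holds a reference. */
static void toy_grab_unchecked(struct toy_page *p)
{
    p->refcount++;
}

/*
 * slow_uses_check == true models the behavior before the fix (the slow path
 * reuses the checked grab); false models the behavior after the fix.
 */
static enum toy_result toy_pin(struct toy_page *p, unsigned int flags,
                               bool slow_uses_check)
{
    if (toy_grab_checked(p, flags))
        return TOY_PINNED_FAST;

    /* Fast path failed: fall back to the slow path. */
    if (slow_uses_check) {
        if (!toy_grab_checked(p, flags))
            return TOY_WARN_EFAULT; /* same check fails again */
    } else {
        toy_grab_unchecked(p);
        p->in_cma = false; /* stands in for migration out of CMA */
    }
    return TOY_PINNED_SLOW;
}
```

For a CMA page pinned with TOY_FOLL_PIN | TOY_FOLL_LONGTERM, the pre-fix model returns TOY_WARN_EFAULT and the post-fix model returns TOY_PINNED_SLOW with the page no longer marked as CMA. A page outside CMA is pinned on the fast path in both models.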

The clearer names make each function's use case explicit and should prevent similar misuse in the future.

peterx said:

: The user will see the pin fails, for gup-slow it further triggers the WARN
: right below that failure (as in the original report):
:
:         folio = try_grab_folio(page, page_increm - 1,
:                                 foll_flags);
:         if (WARN_ON_ONCE(!folio)) { <------------------------ here
:                 /*
:                  * Release the 1st page ref if the
:                  * folio is problematic, fail hard.
:                  */
:                 gup_put_folio(page_folio(page), 1,
:                                 foll_flags);
:                 ret = -EFAULT;
:                 goto out;
:         }

[1] https://lore.kernel.org/linux-mm/1719478388-31917-1-git-send-email-yangge1116@126.com/

[shy828301@gmail.com: fix implicit declaration of function try_grab_folio_fast]
  Link: https://lkml.kernel.org/r/CAHbLzkowMSso-4Nufc9hcMehQsK9PNz3OSu-+eniU-2Mm-xjhA@mail.gmail.com



{
  "affected": [],
  "aliases": [
    "CVE-2024-44943"
  ],
  "database_specific": {
    "cwe_ids": [],
    "github_reviewed": false,
    "github_reviewed_at": null,
    "nvd_published_at": "2024-08-28T08:15:06Z",
    "severity": "MODERATE"
  },
  "details": "In the Linux kernel, the following vulnerability has been resolved:\n\nmm: gup: stop abusing try_grab_folio\n\nA kernel warning was reported when pinning folio in CMA memory when\nlaunching SEV virtual machine.  The splat looks like:\n\n[  464.325306] WARNING: CPU: 13 PID: 6734 at mm/gup.c:1313 __get_user_pages+0x423/0x520\n[  464.325464] CPU: 13 PID: 6734 Comm: qemu-kvm Kdump: loaded Not tainted 6.6.33+ #6\n[  464.325477] RIP: 0010:__get_user_pages+0x423/0x520\n[  464.325515] Call Trace:\n[  464.325520]  \u003cTASK\u003e\n[  464.325523]  ? __get_user_pages+0x423/0x520\n[  464.325528]  ? __warn+0x81/0x130\n[  464.325536]  ? __get_user_pages+0x423/0x520\n[  464.325541]  ? report_bug+0x171/0x1a0\n[  464.325549]  ? handle_bug+0x3c/0x70\n[  464.325554]  ? exc_invalid_op+0x17/0x70\n[  464.325558]  ? asm_exc_invalid_op+0x1a/0x20\n[  464.325567]  ? __get_user_pages+0x423/0x520\n[  464.325575]  __gup_longterm_locked+0x212/0x7a0\n[  464.325583]  internal_get_user_pages_fast+0xfb/0x190\n[  464.325590]  pin_user_pages_fast+0x47/0x60\n[  464.325598]  sev_pin_memory+0xca/0x170 [kvm_amd]\n[  464.325616]  sev_mem_enc_register_region+0x81/0x130 [kvm_amd]\n\nPer the analysis done by yangge, when starting the SEV virtual machine, it\nwill call pin_user_pages_fast(..., FOLL_LONGTERM, ...) to pin the memory. \nBut the page is in CMA area, so fast GUP will fail then fallback to the\nslow path due to the longterm pinnalbe check in try_grab_folio().\n\nThe slow path will try to pin the pages then migrate them out of CMA area.\nBut the slow path also uses try_grab_folio() to pin the page, it will\nalso fail due to the same check then the above warning is triggered.\n\nIn addition, the try_grab_folio() is supposed to be used in fast path and\nit elevates folio refcount by using add ref unless zero.  We are guaranteed\nto have at least one stable reference in slow path, so the simple atomic add\ncould be used.  
The performance difference should be trivial, but the\nmisuse may be confusing and misleading.\n\nRedefined try_grab_folio() to try_grab_folio_fast(), and try_grab_page()\nto try_grab_folio(), and use them in the proper paths.  This solves both\nthe abuse and the kernel warning.\n\nThe proper naming makes their usecase more clear and should prevent from\nabusing in the future.\n\npeterx said:\n\n: The user will see the pin fails, for gpu-slow it further triggers the WARN\n: right below that failure (as in the original report):\n: \n:         folio = try_grab_folio(page, page_increm - 1,\n:                                 foll_flags);\n:         if (WARN_ON_ONCE(!folio)) { \u003c------------------------ here\n:                 /*\n:                         * Release the 1st page ref if the\n:                         * folio is problematic, fail hard.\n:                         */\n:                 gup_put_folio(page_folio(page), 1,\n:                                 foll_flags);\n:                 ret = -EFAULT;\n:                 goto out;\n:         }\n\n[1] https://lore.kernel.org/linux-mm/1719478388-31917-1-git-send-email-yangge1116@126.com/\n\n[shy828301@gmail.com: fix implicit declaration of function try_grab_folio_fast]\n  Link: https://lkml.kernel.org/r/CAHbLzkowMSso-4Nufc9hcMehQsK9PNz3OSu-+eniU-2Mm-xjhA@mail.gmail.com",
  "id": "GHSA-fgc5-qhj8-xm5g",
  "modified": "2024-09-10T18:30:42Z",
  "published": "2024-08-28T09:30:34Z",
  "references": [
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2024-44943"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/26273f5f4cf68b29414e403837093408a9c98e1f"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/f442fa6141379a20b48ae3efabee827a3d260787"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H",
      "type": "CVSS_V3"
    }
  ]
}
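As a cross-check on the record above, the CVSS vector AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H can be turned into a base score by hand. The sketch below applies the CVSS v3.1 base-score equations for the scope-unchanged case only; the function names are invented for illustration, and the metric weights are the published v3.1 values for this specific vector.

```c
/* CVSS v3.1 "Roundup": smallest one-decimal value >= x, computed on
 * integers to avoid floating-point artifacts. */
static double cvss31_roundup(double x)
{
    long scaled = (long)(x * 100000.0 + 0.5);

    if (scaled % 10000 == 0)
        return (double)scaled / 100000.0;
    return (double)(scaled / 10000 + 1) / 10.0;
}

/* Base score for Scope: Unchanged only (hypothetical helper). */
static double cvss31_base_scope_unchanged(double av, double ac, double pr,
                                          double ui, double c, double i,
                                          double a)
{
    double iss = 1.0 - (1.0 - c) * (1.0 - i) * (1.0 - a);
    double impact = 6.42 * iss;
    double exploitability = 8.22 * av * ac * pr * ui;
    double sum = impact + exploitability;

    if (impact <= 0.0)
        return 0.0;
    return cvss31_roundup(sum < 10.0 ? sum : 10.0);
}

/* Weights: AV:L=0.55, AC:L=0.77, PR:L=0.62 (scope unchanged), UI:N=0.85,
 * C:N=0, I:N=0, A:H=0.56. */
static double advisory_base_score(void)
{
    return cvss31_base_scope_unchanged(0.55, 0.77, 0.62, 0.85,
                                       0.0, 0.0, 0.56);
}
```

Here the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is about 1.83, so the sum of about 5.43 rounds up to 5.5. That falls in the CVSS medium band, which is consistent with the MODERATE severity in the record.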

