[LTS 8.6] RDMA: CVE-2023-2176, CVE-2022-48925, CVE-2022-50543; mm: CVE-2023-53178, CVE-2024-26832 #851

Open
pvts-mat wants to merge 6 commits into ctrliq:ciqlts8_6 from pvts-mat:ciqlts8_6-CVE-batch-21

pvts-mat commented Feb 6, 2026

[LTS 8.6]

CVE-2023-2176 VULN-4121
CVE-2022-48925 VULN-175450
CVE-2022-50543 VULN-158126
CVE-2023-53178 VULN-154318
CVE-2024-26832 VULN-175655

Commits

CVE-2023-2176 (+ CVE-2022-48925)

RDMA/cma: Do not change route.addr.src_addr outside state checks

jira VULN-175450
cve CVE-2022-48925
commit-author Jason Gunthorpe <jgg@nvidia.com>
commit 22e9f71072fa605cbf033158db58e0790101928d

RDMA/core: Refactor rdma_bind_addr

jira VULN-4121
cve CVE-2023-2176
commit-author Patrisious Haddad <phaddad@nvidia.com>
commit 8d037973d48c026224ab285e6a06985ccac6f7bf

RDMA/core: Update CMA destination address on rdma_resolve_addr

jira VULN-4121
cve-bf CVE-2023-2176
commit-author Shiraz Saleem <shiraz.saleem@intel.com>
commit 0e15863015d97c1ee2cc29d599abcc7fa2dc3e95

Commit RDMA/cma: Do not change route.addr.src_addr outside state checks is a prerequisite for RDMA/core: Refactor rdma_bind_addr and carries its own CVE-2022-48925. The explanation of the CVE-2023-2176 fix (which isn't a mere code refactor) can be found in #599 for LTS 9.2. Another, independently developed solution converging on the same result, for FIPS 9 compliant, can be found in #584.

Although RHEL 8 is listed as not affected by CVE-2022-48925 at https://access.redhat.com/security/cve/cve-2022-48925, the discussion on Slack led to the fix being included for LTS 8.6 anyway, with a dedicated jira ticket.

we don't have the fixing 22e9f71 in LTS 8.6 while the Fixes: 732d41c was backported as fd0af7f. Maybe it's a matter of config.

Sure seems like 8.6 was affected by CVE-2022-48925 to me.  CONFIG_INFINIBAND and CONFIG_INFINIBAND_ADDR_TRANS are both enabled in 8.6, so it doesn't seem like a config issue.
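The config question raised in the discussion above can be checked mechanically. Below is a minimal sketch, assuming a kernel .config-style text is at hand. The helper name `enabled_options` and the sample text are hypothetical: the `=m`/`=y` values and the `CONFIG_FOO` line are made up for illustration and are not taken from the actual LTS 8.6 config.

```python
import re

def enabled_options(config_text, wanted):
    """Return {option: value} for each wanted option that is set to y or m."""
    found = {}
    for line in config_text.splitlines():
        # Enabled options look like CONFIG_NAME=y or CONFIG_NAME=m;
        # disabled ones appear as comments ("# CONFIG_NAME is not set").
        m = re.match(r"^(CONFIG_[A-Za-z0-9_]+)=(y|m)$", line.strip())
        if m and m.group(1) in wanted:
            found[m.group(1)] = m.group(2)
    return found

# Hypothetical sample, for illustration only (not the real LTS 8.6 config).
SAMPLE_CONFIG = """\
CONFIG_INFINIBAND=m
CONFIG_INFINIBAND_ADDR_TRANS=y
# CONFIG_FOO is not set
"""

wanted = {"CONFIG_INFINIBAND", "CONFIG_INFINIBAND_ADDR_TRANS"}
result = enabled_options(SAMPLE_CONFIG, wanted)
missing = wanted - set(result)
print(sorted(result.items()), "missing:", sorted(missing))
```

If `missing` is empty, both options are enabled and a config difference would not explain a "not affected" classification.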

CVE-2022-50543

RDMA/rxe: Fix mr->map double free

jira VULN-158126
cve CVE-2022-50543
commit-author Li Zhijian <lizhijian@fujitsu.com>
commit 7d984dac8f6bf4ebd3398af82b357e1d181ecaac

The issue of LTS 8.6 being affected by CVE-2022-50543 is a bit convoluted. From the commit message of the upstream fix 7d984da:

This issue was firstly exposed since commit b18c7da ("RDMA/rxe: Fix
memory leak in error path code") and then we fixed it in commit
8ff5f5d ("RDMA/rxe: Prevent double freeing rxe_map_set()") but this
fix was reverted together at last by commit 1e75550 (Revert
"RDMA/rxe: Create duplicate mapping tables for FMRs")

In LTS 8.6 the last two commits are missing, while the first one is backported as 09e1409, so the bug applies.

Commit 8ff5f5d RDMA/rxe: Prevent double freeing rxe_map_set() wasn't used as the actual fix on LTS 8.6 because it requires 647bf13 RDMA/rxe: Create duplicate mapping tables for FMRs as a prerequisite. The reverting commit 1e75550 reverts both of them, so the affected function rxe_mr_init_user(…) goes back to its state at b18c7da RDMA/rxe: Fix memory leak in error path code. As a result it is 7d984da, not 8ff5f5d, that cherry-picks cleanly onto ciqlts8_6, even though 8ff5f5d was chronologically the first solution to this problem.
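The double free itself can be illustrated with a toy model. This is a sketch only: the `Mr` class, `free_map`, `register` and the two init variants are hypothetical Python stand-ins, not the kernel's rxe code, and the init functions simply simulate a failure right after the allocation. The model mimics the ownership rule stated in the commit message of 7d984da, namely that cleanup always frees mr->map once it has been allocated.

```python
class DoubleFree(Exception):
    """Raised by the toy allocator when the same map is freed twice."""

class Mr:
    """Toy stand-in for a memory region owning a single 'map' allocation."""
    def __init__(self):
        self.map = None
        self.freed = False

    def alloc_map(self):
        self.map = object()
        self.freed = False

    def free_map(self):
        if self.map is None:
            return                      # nothing was ever allocated
        if self.freed:
            raise DoubleFree("map freed twice")
        self.freed = True               # pointer stays set, as a dangling one would

def init_user_buggy(mr):
    mr.alloc_map()
    mr.free_map()                       # error path frees the map itself...
    return False                        # ...and then reports failure

def init_user_fixed(mr):
    mr.alloc_map()
    return False                        # error path leaves the map to cleanup

def register(init):
    mr = Mr()
    if not init(mr):
        mr.free_map()                   # cleanup always runs on failure
    return mr

try:
    register(init_user_buggy)
    buggy_outcome = "no error"
except DoubleFree:
    buggy_outcome = "double free"

fixed_mr = register(init_user_fixed)
print(buggy_outcome, fixed_mr.freed)
```

In the buggy variant both the init error path and the cleanup release the same allocation; in the fixed variant only the cleanup does.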

CVE-2023-53178 (+ CVE-2024-26832)

mm: fix zswap writeback race condition

jira VULN-154318
cve CVE-2023-53178
commit-author Domenico Cerasuolo <cerasuolodomenico@gmail.com>
commit 04fc7816089c5a32c29a04ec94b998e219dfb946
upstream-diff Version ciqlts8_6 lacks commit
  75fa68a5d89871a35246aa2759c95d6dfaf1b582 ("mm/swap: convert
  delete_from_swap_cache() to take a folio") so
  `delete_from_swap_cache()' operates on pages directly and the
  `page_folio()' call is not needed. (That function is not defined in
  ciqlts8_6 anyway, as it was introduced in a non-backported commit
  7b230db3b8d373219f88a3d25c8fbbf12cc7f233)

mm: zswap: fix missing folio cleanup in writeback race path

jira VULN-175655
cve CVE-2024-26832
commit-author Yosry Ahmed <yosryahmed@google.com>
commit e3b63e966cac0bf78aaa1efede1827a252815a1d
upstream-diff Manual change; no commit was actually cherry-picked as the
  auto resolver puts the change completely out of place and its contents
  had to be rewritten entirely anyway. Kernel ciqlts8_6 lacks the
  transition from page to folio in `zswap_writeback_entry()' introduced
  in 96c7b0b42239e7b8987b2664b458dc74e825f760, so in the backported
  version `unlock_page(page)' is used instead of `folio_unlock(folio)'
  and `put_page(page)' instead of `folio_put(folio)'. (See also the
  relevant, non-backported commit
  4e1364286d0a2dd384bceb6db6185b99c0e2c0bc). In the upstream, up until
  e3b63e9, the `zswap_writeback_entry()' function underwent major
  refactors in ff9d5ba202f98db53da98330b46b68a9228c63e4,
  98804a944a63237814257dd149a5b04d6b93489c,
  32acba4c04830487ca3002d716325e02069e053a and 96c7b0b compared to the
  ciqlts8_6 version, but the writeback race path addressed by the fix
  remained largely intact, and in ciqlts8_6 falls under the
  ZSWAP_SWAPCACHE_NEW case.

The CVE-2024-26832 fix is a bugfix of the CVE-2023-53178 fix.

The inclusion of the e3b63e9 backport may be contested on the grounds that RH classifies RHEL 8 (and others) as "Not affected" by CVE-2024-26832: https://access.redhat.com/security/cve/cve-2024-26832. However, the fix specifically addresses the "writeback race path" that 04fc781 added to the zswap_writeback_entry() function. LTS 8.6 is therefore either affected by both CVE-2023-53178 and CVE-2024-26832 (provided that the former is fixed by 04fc781) or by neither. RH's CVE-2024-26832 evaluation was likely made simply because CVE-2023-53178 had not been fixed yet at the time: see "Last modified: October 3, 2025" at https://access.redhat.com/security/cve/cve-2024-26832 and "Issued: 2025-11-12" at https://access.redhat.com/errata/RHSA-2025:21084.

Additionally, the end result of backporting both CVE fixes is the same as in rocky8_10: compare the zswap_writeback_entry() function in rocky8_10 (https://github.com/ctrliq/kernel-src-tree/blob/rocky8_10/mm/zswap.c#L892) with the one in this patch, specifically the case ZSWAP_SWAPCACHE_NEW fragment. The CVEs were addressed in rocky8_10 in the back-engineered commits 60f2415 and b923a51. Given the very similar histories of the mm/zswap.c file in ciqlts8_6 and rocky8_10, this further supports the thesis that LTS 8.6 is affected by CVE-2024-26832. Note that the Rocky 8.10 solution was discovered only after the vulnerabilities had already been fixed on LTS 8.6, so the convergence of the results increases confidence in the solution.

kABI check: passed

[1/2] kabi_check_kernel	Check ABI of kernel [ciqlts8_6-CVE-batch-21]	_kabi_check_kernel__x86_64--test--ciqlts8_6-CVE-batch-21
++ uname -m
+ python3 /data/src/ctrliq-github-haskell/kernel-dist-git-el-8.6/SOURCES/check-kabi -k /data/src/ctrliq-github-haskell/kernel-dist-git-el-8.6/SOURCES/Module.kabi_x86_64 -s vms/x86_64--build--ciqlts8_6/build_files/kernel-src-tree-ciqlts8_6-CVE-batch-21/Module.symvers
kABI check passed
+ touch state/kernels/ciqlts8_6-CVE-batch-21/x86_64/kabi_checked

Boot test: passed

boot-test.log

Kselftests: passed relative

Reference

kselftests--ciqlts8_6--run1.log

Patch

kselftests--ciqlts8_6-CVE-batch-21--run1.log

Comparison

The test results for the reference and the patch are the same.

$ ktests.xsh diff  kselftests*.log

Column    File
--------  --------------------------------------------
Status0   kselftests--ciqlts8_6--run1.log
Status1   kselftests--ciqlts8_6-CVE-batch-21--run1.log

TestCase                                     Status0  Status1  Summary
android:run.sh                               skip     skip     same
bpf:get_cgroup_id_user                       pass     pass     same
bpf:test_bpftool.sh                          pass     pass     same
bpf:test_bpftool_build.sh                    pass     pass     same
bpf:test_bpftool_metadata.sh                 pass     pass     same
bpf:test_cgroup_storage                      pass     pass     same
bpf:test_dev_cgroup                          pass     pass     same
bpf:test_doc_build.sh                        pass     pass     same
bpf:test_flow_dissector.sh                   pass     pass     same
bpf:test_lirc_mode2.sh                       pass     pass     same
bpf:test_lpm_map                             pass     pass     same
bpf:test_lru_map                             pass     pass     same
bpf:test_lwt_ip_encap.sh                     pass     pass     same
bpf:test_lwt_seg6local.sh                    pass     pass     same
bpf:test_netcnt                              pass     pass     same
bpf:test_offload.py                          fail     fail     same
bpf:test_skb_cgroup_id.sh                    pass     pass     same
bpf:test_sock                                pass     pass     same
bpf:test_sock_addr.sh                        pass     pass     same
bpf:test_sysctl                              pass     pass     same
bpf:test_tag                                 pass     pass     same
bpf:test_tc_edt.sh                           pass     pass     same
bpf:test_tc_tunnel.sh                        pass     pass     same
bpf:test_tcp_check_syncookie.sh              pass     pass     same
bpf:test_tcpnotify_user                      pass     pass     same
bpf:test_tunnel.sh                           pass     pass     same
bpf:test_verifier                            pass     pass     same
bpf:test_verifier_log                        pass     pass     same
bpf:test_xdp_meta.sh                         pass     pass     same
bpf:test_xdp_redirect.sh                     pass     pass     same
bpf:test_xdp_veth.sh                         pass     pass     same
bpf:test_xdp_vlan_mode_generic.sh            pass     pass     same
bpf:test_xdp_vlan_mode_native.sh             pass     pass     same
bpf:test_xdping.sh                           pass     pass     same
bpf:urandom_read                             pass     pass     same
breakpoints:breakpoint_test                  pass     pass     same
capabilities:test_execve                     pass     pass     same
core:close_range_test                        pass     pass     same
cpu-hotplug:cpu-on-off-test.sh               pass     pass     same
cpufreq:main.sh                              fail     fail     same
exec:execveat                                pass     pass     same
firmware:fw_run_tests.sh                     skip     skip     same
fpu:run_test_fpu.sh                          skip     skip     same
fpu:test_fpu                                 pass     pass     same
ftrace:ftracetest                            fail     fail     same
futex:run.sh                                 pass     pass     same
gpio:gpio-mockup.sh                          fail     fail     same
intel_pstate:run.sh                          pass     pass     same
ipc:msgque                                   pass     pass     same
kcmp:kcmp_test                               pass     pass     same
kexec:test_kexec_file_load.sh                skip     skip     same
kexec:test_kexec_load.sh                     skip     skip     same
kvm:access_tracking_perf_test                fail     fail     same
kvm:amx_test                                 fail     fail     same
kvm:cr4_cpuid_sync_test                      fail     fail     same
kvm:debug_regs                               fail     fail     same
kvm:demand_paging_test                       pass     pass     same
kvm:dirty_log_perf_test                      pass     pass     same
kvm:dirty_log_test                           fail     fail     same
kvm:emulator_error_test                      fail     fail     same
kvm:evmcs_test                               fail     fail     same
kvm:get_cpuid_test                           fail     fail     same
kvm:get_msr_index_features                   fail     fail     same
kvm:hardware_disable_test                    pass     pass     same
kvm:hyperv_clock                             fail     fail     same
kvm:hyperv_cpuid                             fail     fail     same
kvm:hyperv_features                          fail     fail     same
kvm:kvm_binary_stats_test                    pass     pass     same
kvm:kvm_create_max_vcpus                     skip     skip     same
kvm:kvm_page_table_test                      pass     pass     same
kvm:kvm_pv_test                              fail     fail     same
kvm:memslot_modification_stress_test         pass     pass     same
kvm:memslot_perf_test                        fail     fail     same
kvm:mmio_warning_test                        fail     fail     same
kvm:mmu_role_test                            fail     fail     same
kvm:platform_info_test                       fail     fail     same
kvm:rseq_test                                fail     fail     same
kvm:set_boot_cpu_id                          fail     fail     same
kvm:set_memory_region_test                   pass     pass     same
kvm:set_sregs_test                           fail     fail     same
kvm:smm_test                                 fail     fail     same
kvm:state_test                               fail     fail     same
kvm:steal_time                               pass     pass     same
kvm:svm_int_ctl_test                         fail     fail     same
kvm:svm_vmcall_test                          fail     fail     same
kvm:sync_regs_test                           fail     fail     same
kvm:tsc_msrs_test                            fail     fail     same
kvm:userspace_msr_exit_test                  fail     fail     same
kvm:vmx_apic_access_test                     fail     fail     same
kvm:vmx_close_while_nested_test              fail     fail     same
kvm:vmx_dirty_log_test                       fail     fail     same
kvm:vmx_nested_tsc_scaling_test              fail     fail     same
kvm:vmx_pmu_msrs_test                        fail     fail     same
kvm:vmx_preemption_timer_test                fail     fail     same
kvm:vmx_set_nested_state_test                fail     fail     same
kvm:vmx_tsc_adjust_test                      fail     fail     same
kvm:xapic_ipi_test                           fail     fail     same
kvm:xen_shinfo_test                          fail     fail     same
kvm:xen_vmcall_test                          fail     fail     same
kvm:xss_msr_test                             fail     fail     same
lib:bitmap.sh                                skip     skip     same
lib:prime_numbers.sh                         skip     skip     same
lib:printf.sh                                skip     skip     same
lib:scanf.sh                                 fail     fail     same
livepatch:test-callbacks.sh                  pass     pass     same
livepatch:test-ftrace.sh                     pass     pass     same
livepatch:test-livepatch.sh                  pass     pass     same
livepatch:test-shadow-vars.sh                pass     pass     same
livepatch:test-state.sh                      pass     pass     same
membarrier:membarrier_test_multi_thread      pass     pass     same
membarrier:membarrier_test_single_thread     pass     pass     same
memfd:memfd_test                             pass     pass     same
memfd:run_fuse_test.sh                       fail     fail     same
memfd:run_hugetlbfs_test.sh                  pass     pass     same
memory-hotplug:mem-on-off-test.sh            pass     pass     same
mount:run_tests.sh                           pass     pass     same
net/forwarding:bridge_port_isolation.sh      pass     pass     same
net/forwarding:bridge_sticky_fdb.sh          pass     pass     same
net/forwarding:bridge_vlan_aware.sh          fail     fail     same
net/forwarding:bridge_vlan_unaware.sh        pass     pass     same
net/forwarding:ethtool.sh                    fail     fail     same
net/forwarding:gre_multipath.sh              fail     fail     same
net/forwarding:ip6_forward_instats_vrf.sh    fail     fail     same
net/forwarding:ipip_flat_gre.sh              pass     pass     same
net/forwarding:ipip_flat_gre_key.sh          pass     pass     same
net/forwarding:ipip_flat_gre_keys.sh         pass     pass     same
net/forwarding:ipip_hier_gre.sh              pass     pass     same
net/forwarding:ipip_hier_gre_key.sh          pass     pass     same
net/forwarding:loopback.sh                   skip     skip     same
net/forwarding:mirror_gre.sh                 fail     fail     same
net/forwarding:mirror_gre_bound.sh           pass     pass     same
net/forwarding:mirror_gre_bridge_1d.sh       pass     pass     same
net/forwarding:mirror_gre_bridge_1q.sh       pass     pass     same
net/forwarding:mirror_gre_bridge_1q_lag.sh   pass     pass     same
net/forwarding:mirror_gre_changes.sh         fail     fail     same
net/forwarding:mirror_gre_flower.sh          fail     fail     same
net/forwarding:mirror_gre_lag_lacp.sh        pass     pass     same
net/forwarding:mirror_gre_neigh.sh           pass     pass     same
net/forwarding:mirror_gre_nh.sh              pass     pass     same
net/forwarding:mirror_gre_vlan.sh            pass     pass     same
net/forwarding:mirror_vlan.sh                pass     pass     same
net/forwarding:router.sh                     fail     fail     same
net/forwarding:router_bridge.sh              pass     pass     same
net/forwarding:router_bridge_vlan.sh         pass     pass     same
net/forwarding:router_broadcast.sh           fail     fail     same
net/forwarding:router_multicast.sh           fail     fail     same
net/forwarding:router_multipath.sh           fail     fail     same
net/forwarding:router_vid_1.sh               pass     pass     same
net/forwarding:tc_chains.sh                  pass     pass     same
net/forwarding:tc_flower.sh                  pass     pass     same
net/forwarding:tc_flower_router.sh           pass     pass     same
net/forwarding:tc_mpls_l2vpn.sh              pass     pass     same
net/forwarding:tc_shblocks.sh                pass     pass     same
net/forwarding:tc_vlan_modify.sh             pass     pass     same
net/forwarding:vxlan_asymmetric.sh           pass     pass     same
net/forwarding:vxlan_bridge_1d.sh            fail     fail     same
net/forwarding:vxlan_bridge_1d_port_8472.sh  pass     pass     same
net/forwarding:vxlan_bridge_1q.sh            fail     fail     same
net/forwarding:vxlan_bridge_1q_port_8472.sh  pass     pass     same
net/forwarding:vxlan_symmetric.sh            pass     pass     same
net/mptcp:diag.sh                            pass     pass     same
net/mptcp:mptcp_connect.sh                   pass     pass     same
net/mptcp:mptcp_sockopt.sh                   pass     pass     same
net/mptcp:pm_netlink.sh                      pass     pass     same
net:bareudp.sh                               pass     pass     same
net:devlink_port_split.py                    pass     pass     same
net:drop_monitor_tests.sh                    skip     skip     same
net:fcnal-test.sh                            pass     pass     same
net:fib-onlink-tests.sh                      pass     pass     same
net:fib_rule_tests.sh                        fail     fail     same
net:fib_tests.sh                             pass     pass     same
net:gre_gso.sh                               pass     pass     same
net:icmp_redirect.sh                         pass     pass     same
net:ip6_gre_headroom.sh                      pass     pass     same
net:ipv6_flowlabel.sh                        pass     pass     same
net:l2tp.sh                                  pass     pass     same
net:msg_zerocopy.sh                          fail     fail     same
net:netdevice.sh                             pass     pass     same
net:pmtu.sh                                  pass     pass     same
net:psock_snd.sh                             fail     fail     same
net:reuseaddr_conflict                       pass     pass     same
net:reuseport_bpf                            pass     pass     same
net:reuseport_bpf_cpu                        pass     pass     same
net:reuseport_bpf_numa                       pass     pass     same
net:reuseport_dualstack                      pass     pass     same
net:rtnetlink.sh                             skip     skip     same
net:run_afpackettests                        pass     pass     same
net:run_netsocktests                         pass     pass     same
net:rxtimestamp.sh                           pass     pass     same
net:so_txtime.sh                             fail     fail     same
net:test_bpf.sh                              pass     pass     same
net:test_vxlan_fdb_changelink.sh             pass     pass     same
net:tls                                      pass     pass     same
net:traceroute.sh                            pass     pass     same
net:udpgro.sh                                fail     fail     same
net:udpgro_bench.sh                          fail     fail     same
net:udpgso.sh                                pass     pass     same
net:veth.sh                                  fail     fail     same
net:vrf-xfrm-tests.sh                        pass     pass     same
netfilter:conntrack_icmp_related.sh          fail     fail     same
netfilter:conntrack_tcp_unreplied.sh         fail     fail     same
netfilter:ipvs.sh                            skip     skip     same
netfilter:nft_flowtable.sh                   fail     fail     same
netfilter:nft_meta.sh                        pass     pass     same
netfilter:nft_nat.sh                         skip     skip     same
netfilter:nft_queue.sh                       skip     skip     same
nsfs:owner                                   pass     pass     same
nsfs:pidns                                   pass     pass     same
proc:fd-001-lookup                           pass     pass     same
proc:fd-002-posix-eq                         pass     pass     same
proc:fd-003-kthread                          pass     pass     same
proc:proc-loadavg-001                        pass     pass     same
proc:proc-self-map-files-001                 pass     pass     same
proc:proc-self-map-files-002                 fail     fail     same
proc:proc-self-syscall                       pass     pass     same
proc:proc-self-wchan                         pass     pass     same
proc:proc-uptime-001                         pass     pass     same
proc:proc-uptime-002                         pass     pass     same
proc:read                                    pass     pass     same
proc:setns-dcache                            fail     fail     same
pstore:pstore_post_reboot_tests              skip     skip     same
pstore:pstore_tests                          fail     fail     same
ptrace:peeksiginfo                           pass     pass     same
ptrace:vmaccess                              fail     fail     same
rseq:basic_percpu_ops_test                   pass     pass     same
rseq:basic_test                              pass     pass     same
rseq:param_test                              pass     pass     same
rseq:param_test_benchmark                    pass     pass     same
rseq:param_test_compare_twice                pass     pass     same
rseq:run_param_test.sh                       fail     fail     same
sgx:test_sgx                                 fail     fail     same
sigaltstack:sas                              pass     pass     same
size:get_size                                pass     pass     same
splice:default_file_splice_read.sh           pass     pass     same
static_keys:test_static_keys.sh              skip     skip     same
tc-testing:tdc.sh                            pass     pass     same
timens:clock_nanosleep                       pass     pass     same
timens:exec                                  pass     pass     same
timens:procfs                                pass     pass     same
timens:timens                                pass     pass     same
timens:timer                                 pass     pass     same
timens:timerfd                               pass     pass     same
timers:inconsistency-check                   fail     fail     same
timers:mqueue-lat                            pass     pass     same
timers:nanosleep                             pass     pass     same
timers:nsleep-lat                            fail     fail     same
timers:posix_timers                          pass     pass     same
timers:rtcpie                                pass     pass     same
timers:set-timer-lat                         fail     fail     same
timers:threadtest                            pass     pass     same
tpm2:test_smoke.sh                           fail     fail     same
tpm2:test_space.sh                           fail     fail     same
vm:run_vmtests                               fail     fail     same
x86:amx_64                                   fail     fail     same
x86:check_initial_reg_state_64               pass     pass     same
x86:corrupt_xstate_header_64                 pass     pass     same
x86:fsgsbase_64                              pass     pass     same
x86:fsgsbase_restore_64                      pass     pass     same
x86:ioperm_64                                pass     pass     same
x86:iopl_64                                  pass     pass     same
x86:mov_ss_trap_64                           pass     pass     same
x86:mpx-mini-test_64                         fail     fail     same
x86:protection_keys_64                       pass     pass     same
x86:sigaltstack_64                           pass     pass     same
x86:sigreturn_64                             pass     pass     same
x86:single_step_syscall_64                   pass     pass     same
x86:syscall_nt_64                            pass     pass     same
x86:sysret_rip_64                            pass     pass     same
x86:sysret_ss_attrs_64                       pass     pass     same
x86:test_mremap_vdso_64                      pass     pass     same
x86:test_vdso_64                             pass     pass     same
x86:test_vsyscall_64                         pass     pass     same
zram:zram.sh                                 pass     pass     same
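
The comparison above amounts to joining two per-test status maps and flagging any test whose status differs. A minimal sketch of that idea follows. It is not the ktests.xsh implementation: the `compare` helper, the "diff" and "missing" labels, and the in-memory dictionaries (three rows copied from the table) are assumptions made for illustration.

```python
def compare(ref, patch):
    """Join two {test: status} maps into (test, status0, status1, summary) rows."""
    rows = []
    for name in sorted(set(ref) | set(patch)):
        s0 = ref.get(name, "missing")    # test absent from the reference run
        s1 = patch.get(name, "missing")  # test absent from the patched run
        rows.append((name, s0, s1, "same" if s0 == s1 else "diff"))
    return rows

# A few rows taken from the table above; a real run would parse the log files.
reference = {
    "android:run.sh": "skip",
    "bpf:test_offload.py": "fail",
    "zram:zram.sh": "pass",
}
patched = dict(reference)

rows = compare(reference, patched)
for name, s0, s1, summary in rows:
    print(f"{name:<24}{s0:<9}{s1:<9}{summary}")

# Any row not marked "same" would need a closer look before merging.
regressions = [r for r in rows if r[3] != "same"]
```

A test that fails in both runs (such as bpf:test_offload.py) is still reported as "same", which is what "passed relative" means here: the patch is judged against the reference kernel, not against an all-pass baseline.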

jira VULN-175450
cve CVE-2022-48925
commit-author Jason Gunthorpe <jgg@nvidia.com>
commit 22e9f71

If the state is not idle then resolve_prepare_src() should immediately
fail and no change to global state should happen. However, it
unconditionally overwrites the src_addr trying to build a temporary any
address.

For instance if the state is already RDMA_CM_LISTEN then this will corrupt
the src_addr and would cause the test in cma_cancel_operation():

           if (cma_any_addr(cma_src_addr(id_priv)) && !id_priv->cma_dev)

Which would manifest as this trace from syzkaller:

  BUG: KASAN: use-after-free in __list_add_valid+0x93/0xa0 lib/list_debug.c:26
  Read of size 8 at addr ffff8881546491e0 by task syz-executor.1/32204

  CPU: 1 PID: 32204 Comm: syz-executor.1 Not tainted 5.12.0-rc8-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Call Trace:
   __dump_stack lib/dump_stack.c:79 [inline]
   dump_stack+0x141/0x1d7 lib/dump_stack.c:120
   print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232
   __kasan_report mm/kasan/report.c:399 [inline]
   kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416
   __list_add_valid+0x93/0xa0 lib/list_debug.c:26
   __list_add include/linux/list.h:67 [inline]
   list_add_tail include/linux/list.h:100 [inline]
   cma_listen_on_all drivers/infiniband/core/cma.c:2557 [inline]
   rdma_listen+0x787/0xe00 drivers/infiniband/core/cma.c:3751
   ucma_listen+0x16a/0x210 drivers/infiniband/core/ucma.c:1102
   ucma_write+0x259/0x350 drivers/infiniband/core/ucma.c:1732
   vfs_write+0x28e/0xa30 fs/read_write.c:603
   ksys_write+0x1ee/0x250 fs/read_write.c:658
   do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
   entry_SYSCALL_64_after_hwframe+0x44/0xae

This is indicating that an rdma_id_private was destroyed without doing
cma_cancel_listens().

Instead of trying to re-use the src_addr memory to indirectly create an
any address derived from the dst build one explicitly on the stack and
bind to that as any other normal flow would do. rdma_bind_addr() will copy
it over the src_addr once it knows the state is valid.

This is similar to commit bc0bdc5 ("RDMA/cma: Do not change
route.addr.src_addr.ss_family")

Link: https://lore.kernel.org/r/0-v2-e975c8fd9ef2+11e-syz_cma_srcaddr_jgg@nvidia.com
	Cc: stable@vger.kernel.org
Fixes: 732d41c ("RDMA/cma: Make the locking for automatic state transition more clear")
	Reported-by: syzbot+c94a3675a626f6333d74@syzkaller.appspotmail.com
	Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
	Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
(cherry picked from commit 22e9f71)
	Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>

jira VULN-4121
cve CVE-2023-2176
commit-author Patrisious Haddad <phaddad@nvidia.com>
commit 8d03797

Refactor rdma_bind_addr function so that it doesn't require that the
cma destination address be changed before calling it.

So now it will update the destination address internally only when it is
really needed and after passing all the required checks.

Which in turn results in a cleaner and more sensible call and error
handling flows for the functions that call it directly or indirectly.

	Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
	Reported-by: Wei Chen <harperchen1110@gmail.com>
	Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/3d0e9a2fd62bc10ba02fed1c7c48a48638952320.1672819273.git.leonro@nvidia.com
	Signed-off-by: Leon Romanovsky <leon@kernel.org>
(cherry picked from commit 8d03797)
	Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>

jira VULN-4121
cve-bf CVE-2023-2176
commit-author Shiraz Saleem <shiraz.saleem@intel.com>
commit 0e15863

8d03797 ("RDMA/core: Refactor rdma_bind_addr") introduces a regression
on irdma devices in certain tests which use rdma CM, such as cmtime.

No connections can be established, with the MAD QP experiencing a fatal
error on the active side.

The cma destination address is not updated with the dst_addr when ULP
on active side calls rdma_bind_addr followed by rdma_resolve_addr.
The id_priv state is 'bound' in resolve_prepare_src and update is skipped.

This leaves the dgid passed into the irdma driver to create an Address Handle
(AH) for the MAD QP at 0. The create AH descriptor as well as the ARP cache
entry is invalid, and HW throws an asynchronous event as a result.

[ 1207.656888] resolve_prepare_src caller: ucma_resolve_addr+0xff/0x170 [rdma_ucm] daddr=200.0.4.28 id_priv->state=7
[....]
[ 1207.680362] ice 0000:07:00.1 rocep7s0f1: caller: irdma_create_ah+0x3e/0x70 [irdma] ah_id=0 arp_idx=0 dest_ip=0.0.0.0
destMAC=00:00:64:ca:b7:52 ipvalid=1 raw=0000:0000:0000:0000:0000:ffff:0000:0000
[ 1207.682077] ice 0000:07:00.1 rocep7s0f1: abnormal ae_id = 0x401 bool qp=1 qp_id = 1, ae_src=5
[ 1207.691657] infiniband rocep7s0f1: Fatal error (1) on MAD QP (1)

Fix this by updating the CMA destination address when the ULP calls
a resolve address with the CM state already bound.

Fixes: 8d03797 ("RDMA/core: Refactor rdma_bind_addr")
	Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230712234133.1343-1-shiraz.saleem@intel.com
	Signed-off-by: Leon Romanovsky <leon@kernel.org>
(cherry picked from commit 0e15863)
	Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>

jira VULN-158126
cve CVE-2022-50543
commit-author Li Zhijian <lizhijian@fujitsu.com>
commit 7d984da

rxe_mr_cleanup() which tries to free mr->map again will be called when
rxe_mr_init_user() fails:

   CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
   Call Trace:
    <TASK>
    dump_stack_lvl+0x45/0x5d
    panic+0x19e/0x349
    end_report.part.0+0x54/0x7c
    kasan_report.cold+0xa/0xf
    rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
    __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
    rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
    ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
    ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
    ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]

This issue was first exposed by commit b18c7da ("RDMA/rxe: Fix
memory leak in error path code") and then fixed in commit
8ff5f5d ("RDMA/rxe: Prevent double freeing rxe_map_set()"), but that
fix was eventually reverted along with the rest by commit 1e75550
("Revert "RDMA/rxe: Create duplicate mapping tables for FMRs"").

Simply let rxe_mr_cleanup() always handle freeing the mr->map once it is
successfully allocated.
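
The single-owner rule can be sketched in user space as follows. This is an assumption-laden model, not the rxe code: `struct mr_model`, `mr_init_user()`, `mr_cleanup()`, `mr_register()` and the `free_calls` counter are hypothetical names, and the `later_step_fails` flag merely simulates a failure after the map has been allocated.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the memory region and its map array. */
struct mr_model {
    void **map;
    int num_map;
};

/* Counts how many times the map array has actually been freed. */
static int free_calls;

/* The only place that frees the map; safe to call when nothing is allocated. */
static void mr_cleanup(struct mr_model *mr)
{
    if (mr->map) {
        free(mr->map);
        mr->map = NULL;
        free_calls++;
    }
}

/*
 * Models the init path: allocate the map, then run a later step that may
 * fail. On failure the map is deliberately NOT freed here; it is left for
 * mr_cleanup(), so there is exactly one owner of the free.
 */
static int mr_init_user(struct mr_model *mr, int num_map, int later_step_fails)
{
    mr->map = calloc((size_t)num_map, sizeof(*mr->map));
    if (!mr->map)
        return -1;
    mr->num_map = num_map;
    if (later_step_fails)
        return -1;
    return 0;
}

/* Models the caller: on init failure, the generic cleanup runs. */
static int mr_register(struct mr_model *mr, int num_map, int later_step_fails)
{
    int err = mr_init_user(mr, num_map, later_step_fails);

    if (err)
        mr_cleanup(mr);
    return err;
}
```

If the init path also freed the map on its error branch, the cleanup call in `mr_register()` would free the same pointer a second time, which is the pattern the KASAN report above shows.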

Fixes: 1e75550 ("Revert "RDMA/rxe: Create duplicate mapping tables for FMRs"")
Link: https://lore.kernel.org/r/1667099073-2-1-git-send-email-lizhijian@fujitsu.com
	Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
	Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
(cherry picked from commit 7d984da)
	Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>
jira VULN-154318
cve CVE-2023-53178
commit-author Domenico Cerasuolo <cerasuolodomenico@gmail.com>
commit 04fc781
upstream-diff Version ciqlts8_6 lacks commit
  75fa68a ("mm/swap: convert
  delete_from_swap_cache() to take a folio") so
  `delete_from_swap_cache()' operates on pages directly and the
  `page_folio()' call is not needed. (That function is not defined in
  ciqlts8_6 anyway, as it was introduced in a non-backported commit
  7b230db)

The zswap writeback mechanism can cause a race condition resulting in
memory corruption, where a swapped out page gets swapped in with data that
was written to a different page.

The race unfolds like this:
1. a page with data A and swap offset X is stored in zswap
2. page A is removed off the LRU by zpool driver for writeback in
   zswap-shrink work, data for A is mapped by zpool driver
3. user space program faults and invalidates page entry A, offset X is
   considered free
4. kswapd stores page B at offset X in zswap (zswap could also be
   full, if so, page B would then be IOed to X, then skip step 5.)
5. entry A is replaced by B in tree->rbroot, this doesn't affect the
   local reference held by zswap-shrink work
6. zswap-shrink work writes back A at X, and frees zswap entry A
7. swapin of slot X brings A in memory instead of B

The fix:
Once the swap page cache has been allocated (case ZSWAP_SWAPCACHE_NEW),
zswap-shrink work just checks that the local zswap_entry reference is
still the same as the one in the tree.  If it's not the same, it has
either been invalidated or replaced; in both cases the writeback is
aborted because the local entry contains stale data.
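
The staleness check can be illustrated with a small user-space model. This is only a sketch under simplifying assumptions: the tree is a fixed array indexed by swap offset, there is no locking, and `struct entry_model`, `struct tree_model`, `tree_search()` and `model_writeback()` are hypothetical names that do not appear in the kernel.

```c
#include <stddef.h>

#define NSLOTS 8

/* Hypothetical stand-ins: one entry per swap offset, no locking. */
struct entry_model { int data; };
struct tree_model { struct entry_model *slot[NSLOTS]; };

/* Look up the entry currently stored for a swap offset, if any. */
static struct entry_model *tree_search(struct tree_model *t, unsigned int off)
{
    return off < NSLOTS ? t->slot[off] : NULL;
}

/*
 * Models the writeback step after the swap cache page has been allocated.
 * The local reference is compared with what the tree holds now; if they
 * differ, the entry was invalidated or replaced and the writeback is
 * aborted. Returns 0 on writeback, -1 on abort.
 */
static int model_writeback(struct tree_model *t, struct entry_model *local,
                           unsigned int off, int *disk)
{
    if (!local || off >= NSLOTS)
        return -1;
    if (tree_search(t, off) != local)
        return -1;              /* stale: do not overwrite slot on disk */
    disk[off] = local->data;    /* write the data back to the swap slot */
    t->slot[off] = NULL;        /* entry is no longer held in the pool */
    return 0;
}
```

In terms of the race above: if entry A is replaced by B at offset X while the shrinker still holds A, `model_writeback()` called with A returns -1 and the slot on disk is left untouched, so a later swapin cannot observe A's data in place of B's.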

Reproducer:
I originally found this by running `stress` overnight to validate my work
on the zswap writeback mechanism; it manifested after hours on my test
machine.  The key to making it happen is having zswap writebacks, so
whatever setup pumps /sys/kernel/debug/zswap/written_back_pages should do
the trick.

In order to reproduce this faster on a VM, I set up a system with ~100M of
available memory and a 500M swap file; then running `stress --vm 1
--vm-bytes 300000000 --vm-stride 4000` makes it happen in a matter of tens
of minutes.  One can speed things up even more by swinging
/sys/module/zswap/parameters/max_pool_percent up and down between, say, 20
and 1; this makes it reproduce in tens of seconds.  It's crucial to set
`--vm-stride` to something other than 4096, otherwise `stress` won't
realize that memory has been corrupted because all pages would have the
same data.

Link: https://lkml.kernel.org/r/20230503151200.19707-1-cerasuolodomenico@gmail.com
	Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
	Acked-by: Johannes Weiner <hannes@cmpxchg.org>
	Reviewed-by: Chris Li (Google) <chrisl@kernel.org>
	Cc: Dan Streetman <ddstreet@ieee.org>
	Cc: Johannes Weiner <hannes@cmpxchg.org>
	Cc: Minchan Kim <minchan@kernel.org>
	Cc: Nitin Gupta <ngupta@vflare.org>
	Cc: Seth Jennings <sjenning@redhat.com>
	Cc: Vitaly Wool <vitaly.wool@konsulko.com>
	Cc: <stable@vger.kernel.org>
	Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 04fc781)
	Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>
jira VULN-175655
cve CVE-2024-26832
commit-author Yosry Ahmed <yosryahmed@google.com>
commit e3b63e9
upstream-diff Manual change; no commit was actually cherry-picked as the
  auto resolver puts the change completely out of place and its contents
  had to be rewritten entirely anyway. Kernel ciqlts8_6 lacks the
  transition from page to folio in `zswap_writeback_entry()' introduced
  in 96c7b0b, so in the backported
  version `unlock_page(page)' is used instead of `folio_unlock(folio)'
  and `put_page(page)' instead of `folio_put(folio)'. (See also the
  relevant, non-backported commit
  4e13642). In the upstream, up until
  e3b63e9, the `zswap_writeback_entry()' function underwent major
  refactors in ff9d5ba,
  98804a9,
  32acba4 and 96c7b0b compared to the
  ciqlts8_6 version, but the writeback race path addressed by the fix
  remained largely intact, and in ciqlts8_6 falls under the
  ZSWAP_SWAPCACHE_NEW case.

In zswap_writeback_entry(), after we get a folio from
__read_swap_cache_async(), we grab the tree lock again to check that the
swap entry was not invalidated and recycled.  If it was, we delete the
folio we just added to the swap cache and exit.

However, __read_swap_cache_async() returns the folio locked when it is
newly allocated, which is always true for this path, and the folio is
ref'd.  Make sure to unlock and put the folio before returning.

This was discovered by code inspection, probably because this path handles
a race condition that should not happen often, and the bug would not crash
the system; it would only strand the folio indefinitely.
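
The effect of the missing unlock and put can be sketched with a toy reference-counting model. Everything here is an assumption made for illustration: `struct page_model` and the `model_*()` helpers are hypothetical, and the accounting is reduced to one reference for the caller plus one for the swap cache, which is simpler than the kernel's real bookkeeping.

```c
/* Hypothetical page/folio stand-in with a lock bit and a reference count. */
struct page_model {
    int locked;
    int refcount;
    int in_swap_cache;
};

/*
 * Models a newly allocated swap cache page being handed to the caller:
 * locked, with one reference for the caller and one for the swap cache
 * (a simplification assumed for this sketch).
 */
static void model_read_swap_cache_new(struct page_model *p)
{
    p->locked = 1;
    p->refcount = 2;
    p->in_swap_cache = 1;
}

/* Removing the page from the swap cache drops the cache's reference. */
static void model_delete_from_swap_cache(struct page_model *p)
{
    p->in_swap_cache = 0;
    p->refcount--;
}

/*
 * Models the abort path taken when the entry turned out to be stale.
 * With fix_applied set, the page is also unlocked and the caller's
 * reference is dropped before returning.
 */
static void model_abort_writeback(struct page_model *p, int fix_applied)
{
    model_delete_from_swap_cache(p);
    if (fix_applied) {
        p->locked = 0;      /* corresponds to unlocking the page */
        p->refcount--;      /* corresponds to putting the page */
    }
}

/* A page that is still locked or still referenced can never be freed. */
static int model_is_stranded(const struct page_model *p)
{
    return p->locked || p->refcount > 0;
}
```

Without the fix the page leaves the abort path still locked and with one reference outstanding, so `model_is_stranded()` reports it as stuck. With the fix both are released.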

Link: https://lkml.kernel.org/r/20240125085127.1327013-1-yosryahmed@google.com
Fixes: 04fc781 ("mm: fix zswap writeback race condition")
	Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
	Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>
	Acked-by: Johannes Weiner <hannes@cmpxchg.org>
	Reviewed-by: Nhat Pham <nphamcs@gmail.com>
	Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
	Cc: <stable@vger.kernel.org>
	Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit e3b63e9)
	Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>