directory delete fails on multiple MDS

Description

Directories cannot be removed on a setup with multiple MDSs; see the logs from sanity tests 1 and 10 below.

https://jira.whamcloud.com/browse/LU-15978

https://review.whamcloud.com/#/c/47812/

$ PTLDEBUG=-1 SHARED_DIRECTORY=/tmp/test_logs/ LOAD_MODULES_REMOTE=true /usr/lib64/lustre/tests/auster -f multinode -vr -D ~/log10 sanity --only 1

== sanity test 1: mkdir; remkdir; rmdir ================== 10:50:17 (1653303017)
striped dir -i1 -c2 -H all_char /mnt/lustre/d1.sanity
striped dir -i1 -c2 -H crush /mnt/lustre/d1.sanity/d2
mkdir: cannot create directory '/mnt/lustre/d1.sanity/d2': File exists
/mnt/lustre/d1.sanity/d2 has type dir OK
rmdir: failed to remove '/mnt/lustre/d1.sanity/d2': Invalid argument
rmdir: failed to remove '/mnt/lustre/d1.sanity': Directory not empty
/mnt/lustre/d1.sanity exists
sanity test_1: @@@@@@ FAIL: d1.sanity was not removed
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:6406:error()
= /usr/lib64/lustre/tests/sanity.sh:296:test_1()
= /usr/lib64/lustre/tests/test-framework.sh:6723:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:6770:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:6596:run_test()
= /usr/lib64/lustre/tests/sanity.sh:298:main()
Dumping lctl log to /root/log10/sanity.test_1.*.1653303019.log
FAIL 1 (4s)
resend_count is set to 4 4 4 4 4 4 4
resend_count is set to 4 4 4 4 4 4 4
resend_count is set to 4 4 4 4 4 4 4
resend_count is set to 4 4 4 4 4 4 4
resend_count is set to 4 4 4 4 4 4 4
== sanity test complete, duration 28 sec ================= 10:50:23 (1653303023)
sanity: FAIL: test_1 d1.sanity was not removed
rm: cannot remove '/mnt/lustre/d1.sanity': Directory not empty
total 12
drwxr-xr-x 4 root root 4096 May 23 10:50 .
drwxr-xr-x. 7 root root 92 May 20 04:15 ..
drwxr-xr-x 3 root root 8192 May 23 10:50 d1.sanity
sanity test_904: @@@@@@ FAIL: remove sub-test dirs failed
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:6406:error()
= /usr/lib64/lustre/tests/test-framework.sh:5890:check_and_cleanup_lustre()
= /usr/lib64/lustre/tests/sanity.sh:28247:main()
Dumping lctl log to /root/log10/sanity.test_904.*.1653303024.log
sanity returned 1
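
The sequence in the test 1 log can be approximated by hand, which may help when narrowing down the failure outside of auster. The sketch below is only an assumption about what the test does: it treats the "striped dir -i1 -c2 -H ..." log lines as corresponding to `lfs mkdir` with the same options, and the `repro_cmds` helper and the `RUN` and `MOUNT` variables are hypothetical names introduced here. By default it only prints the commands; set RUN=1 on a disposable test filesystem with at least two MDTs to execute them.

```shell
#!/bin/sh
# Hypothetical manual reproduction of the sanity test 1 sequence.
# Assumption: MOUNT is a Lustre client mount backed by at least two MDTs.
MOUNT=${MOUNT:-/mnt/lustre}

# Print the command sequence, one command per line; nothing is executed here.
repro_cmds() {
    dir="$1/d1.sanity"
    # Assumed equivalent of "striped dir -i1 -c2 -H all_char" in the log.
    echo "lfs mkdir -i1 -c2 -H all_char $dir"
    # Assumed equivalent of "striped dir -i1 -c2 -H crush" in the log.
    echo "lfs mkdir -i1 -c2 -H crush $dir/d2"
    # The two removals that fail in the log.
    echo "rmdir $dir/d2"
    echo "rmdir $dir"
}

# Dry run by default; RUN=1 executes each command and reports failures.
if [ "${RUN:-0}" = "1" ]; then
    repro_cmds "$MOUNT" | while read -r cmd; do
        echo "+ $cmd"
        $cmd || echo "failed: $cmd (exit $?)"
    done
else
    repro_cmds "$MOUNT"
fi
```

If the problem reproduces this way, the first `rmdir` would be expected to report "Invalid argument" as in the log above, which would point at the striped-directory removal itself rather than at the test framework.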

$ PTLDEBUG=-1 SHARED_DIRECTORY=/tmp/test_logs/ LOAD_MODULES_REMOTE=true /usr/lib64/lustre/tests/auster -f multinode -vr -D ~/log10 sanity --only 10

== sanity test 10: mkdir .../d10 .../d10/d2; touch .../d10/d2/f ================================================================ 10:44:03 (1653302643)
striped dir -i0 -c2 -H fnv_1a_64 /mnt/lustre/d10.sanity
striped dir -i0 -c2 -H all_char /mnt/lustre/d10.sanity/d2
/mnt/lustre/d10.sanity/d2/f10.sanity has type file OK
PASS 10 (3s)
resend_count is set to 4 4 4 4 4 4 4
resend_count is set to 4 4 4 4 4 4 4
resend_count is set to 4 4 4 4 4 4 4
resend_count is set to 4 4 4 4 4 4 4
resend_count is set to 4 4 4 4 4 4 4
== sanity test complete, duration 28 sec ================= 10:44:08 (1653302648)
rm: cannot remove '/mnt/lustre/d10.sanity': Directory not empty
total 12
drwxr-xr-x 4 root root 4096 May 23 10:44 .
drwxr-xr-x. 7 root root 92 May 20 04:15 ..
drwxr-xr-x 3 root root 8192 May 23 10:44 d10.sanity
sanity test_904: @@@@@@ FAIL: remove sub-test dirs failed
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:6406:error()
= /usr/lib64/lustre/tests/test-framework.sh:5890:check_and_cleanup_lustre()
= /usr/lib64/lustre/tests/sanity.sh:28247:main()
Dumping lctl log to /root/log10/sanity.test_904.*.1653302649.log
sanity returned 1

Activity

Xinliang Liu 
August 21, 2023 at 1:29 AM

Patch merged.

Xinliang Liu 
July 19, 2023 at 2:27 AM

Patch is under review.

Kevin Zhao 
June 30, 2023 at 6:48 AM

Will revisit after the OpenEuler job is finished.

Xinliang Liu 
August 23, 2022 at 2:14 AM

Updated the patch according to the review comments, but it now causes a multi-node Lustre cluster start failure, which needs to be analyzed.

Xinliang Liu 
June 28, 2022 at 10:40 AM

Done

Details

Time tracking

No time logged; 4w remaining

Created May 23, 2022 at 10:37 AM
Updated August 21, 2023 at 1:29 AM
Resolved August 21, 2023 at 1:29 AM