Hello everyone,
I have set up the following system:
Debian 10, RAID 6 (6 devices plus 1 SSD device as a write cache).
Debian itself runs on a separate disk.
Code:
fdisk -l |grep 'Disk /dev/sd'
Disk /dev/sda: 3,7 TiB, 4000787030016 bytes, 7814037168 sectors
Disk /dev/sdb: 3,7 TiB, 4000787030016 bytes, 7814037168 sectors
Disk /dev/sdc: 3,7 TiB, 4000787030016 bytes, 7814037168 sectors
Disk /dev/sdd: 3,7 TiB, 4000787030016 bytes, 7814037168 sectors
Disk /dev/sde: 3,7 TiB, 4000787030016 bytes, 7814037168 sectors
Disk /dev/sdf: 111,8 GiB, 120034123776 bytes, 234441648 sectors
Disk /dev/sdg: 149,1 GiB, 160041885696 bytes, 312581808 sectors
Disk /dev/sdh: 3,7 TiB, 4000787030016 bytes, 7814037168 sectors
Code:
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid6 sdh1[1] sda1[0] sdf3[6](J) sde1[2] sdd1[3] sdc1[4] sdb1[5]
15627540480 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/6] [UUUUUU]
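For anyone reading the output above: the write-cache (journal) member is the one flagged `(J)`. The following is only a minimal sketch for picking it out, assuming the `/proc/mdstat` layout shown here; the function name `journal_devices` is made up for illustration.

```shell
# Print "<array> <device>" for every md member flagged (J), i.e. the journal.
# Reads /proc/mdstat-formatted text from stdin.
# journal_devices is a hypothetical helper name, not part of mdadm.
journal_devices() {
  awk '$2 == ":" {
    for (i = 3; i <= NF; i++)
      if ($i ~ /\(J\)/) {
        dev = $i
        sub(/\[.*/, "", dev)   # drop the "[6](J)" suffix, keep the device name
        print $1, dev
      }
  }'
}
```

Usage: `journal_devices < /proc/mdstat`. For the array shown above this should report `md0 sdf3`.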
I added the cache as described on page 83:
Code:
mdadm --manage /dev/md0 --add-journal /dev/sdf3
On top of the RAID I also set up an LVM volume with a read cache. On the filesystem on that volume, a LUKS container is mounted from a file.
Code:
root@fileserver:~# df -h
Dateisystem Größe Benutzt Verf. Verw% Eingehängt auf
udev 974M 0 974M 0% /dev
tmpfs 198M 9,4M 189M 5% /run
/dev/sdg4 28G 11G 16G 40% /
tmpfs 990M 0 990M 0% /dev/shm
tmpfs 5,0M 0 5,0M 0% /run/lock
tmpfs 990M 0 990M 0% /sys/fs/cgroup
/dev/sdg1 453M 88M 338M 21% /boot
/dev/sdg6 20G 7,4G 12G 40% /var
/dev/mapper/raid6--4T-r6_4T_files 15T 12T 1,8T 88% /files
tmpfs 198M 0 198M 0% /run/user/0
tmpfs 198M 0 198M 0% /run/user/1000
/dev/mapper/tbna-home 2,9T 2,5T 312G 89% /files/sicherung/tbna
So far so good. After transferring about 1 TB, the following messages appear in kern.log:
Code:
Mar 22 21:13:48 fileserver kernel: [36055.261296] nfsd: peername failed (err 107)!
Mar 22 21:54:12 fileserver kernel: [38479.523603] usb 10-5.4: USB disconnect, device number 22
Mar 22 21:54:12 fileserver kernel: [38479.523707] usb 10-5.4.1: USB disconnect, device number 23
Mar 23 00:26:23 fileserver kernel: [47610.442518] INFO: task khugepaged:38 blocked for more than 120 seconds.
Mar 23 00:26:23 fileserver kernel: [47610.442665] Not tainted 4.19.0-8-amd64 #1 Debian 4.19.98-1
Mar 23 00:26:23 fileserver kernel: [47610.442784] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 23 00:26:23 fileserver kernel: [47610.442933] khugepaged D 0 38 2 0x80000000
Mar 23 00:26:23 fileserver kernel: [47610.443046] Call Trace:
Mar 23 00:26:23 fileserver kernel: [47610.443116] ? __schedule+0x2a2/0x870
Mar 23 00:26:23 fileserver kernel: [47610.443199] schedule+0x28/0x80
Mar 23 00:26:23 fileserver kernel: [47610.443269] io_schedule+0x12/0x40
Mar 23 00:26:23 fileserver kernel: [47610.443346] wbt_wait+0x205/0x300
Mar 23 00:26:23 fileserver kernel: [47610.443420] ? wbt_wait+0x300/0x300
Mar 23 00:26:23 fileserver kernel: [47610.443499] rq_qos_throttle+0x31/0x40
Mar 23 00:26:23 fileserver kernel: [47610.443582] blk_mq_make_request+0x111/0x530
Mar 23 00:26:23 fileserver kernel: [47610.443675] generic_make_request+0x1a4/0x400
Mar 23 00:26:23 fileserver kernel: [47610.443770] ? end_swap_bio_read+0xc0/0xc0
Mar 23 00:26:23 fileserver kernel: [47610.443857] submit_bio+0x45/0x130
Mar 23 00:26:23 fileserver kernel: [47610.443932] ? get_swap_bio+0xbb/0xf0
Mar 23 00:26:23 fileserver kernel: [47610.444011] __swap_writepage+0xf2/0x3c0
Mar 23 00:26:23 fileserver kernel: [47610.444095] ? __frontswap_store+0x6e/0xf2
Mar 23 00:26:23 fileserver kernel: [47610.444185] pageout.isra.49+0x117/0x340
Mar 23 00:26:23 fileserver kernel: [47610.444272] shrink_page_list+0xa47/0xc70
Mar 23 00:26:23 fileserver kernel: [47610.444361] shrink_inactive_list+0x207/0x590
Mar 23 00:26:23 fileserver kernel: [47610.444456] shrink_node_memcg+0x20c/0x780
Mar 23 00:26:23 fileserver kernel: [47610.444546] shrink_node+0xcf/0x450
Mar 23 00:26:23 fileserver kernel: [47610.444625] do_try_to_free_pages+0xc6/0x370
Mar 23 00:26:23 fileserver kernel: [47610.444716] try_to_free_pages+0xf0/0x1b0
Mar 23 00:26:23 fileserver kernel: [47610.444805] __alloc_pages_slowpath+0x35a/0xcb0
Mar 23 00:26:23 fileserver kernel: [47610.444900] ? __switch_to+0x8c/0x440
Mar 23 00:26:23 fileserver kernel: [47610.444980] ? put_prev_entity+0x20/0x100
Mar 23 00:26:23 fileserver kernel: [47610.445068] __alloc_pages_nodemask+0x28b/0x2b0
Mar 23 00:26:23 fileserver kernel: [47610.445165] khugepaged_alloc_page+0x17/0x50
Mar 23 00:26:23 fileserver kernel: [47610.445256] khugepaged+0xb6e/0x2110
Mar 23 00:26:23 fileserver kernel: [47610.445340] ? finish_wait+0x80/0x80
Mar 23 00:26:23 fileserver kernel: [47610.445419] ? collapse_shmem+0xc00/0xc00
Mar 23 00:26:23 fileserver kernel: [47610.445503] kthread+0x112/0x130
Mar 23 00:26:23 fileserver kernel: [47610.445575] ? kthread_bind+0x30/0x30
Mar 23 00:26:23 fileserver kernel: [47610.445656] ret_from_fork+0x1f/0x40
Mar 23 00:26:23 fileserver kernel: [47610.445761] INFO: task md0_reclaim:188 blocked for more than 120 seconds.
Mar 23 00:26:23 fileserver kernel: [47610.445895] Not tainted 4.19.0-8-amd64 #1 Debian 4.19.98-1
Mar 23 00:26:23 fileserver kernel: [47610.446012] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 23 00:26:23 fileserver kernel: [47610.446162] md0_reclaim D 0 188 2 0x80000000
Mar 23 00:26:23 fileserver kernel: [47610.446274] Call Trace:
Mar 23 00:26:23 fileserver kernel: [47610.446340] ? __schedule+0x2a2/0x870
Mar 23 00:26:23 fileserver kernel: [47610.446524] schedule+0x28/0x80
Mar 23 00:26:23 fileserver kernel: [47610.446601] io_schedule+0x12/0x40
Mar 23 00:26:23 fileserver kernel: [47610.446677] wbt_wait+0x205/0x300
Mar 23 00:26:23 fileserver kernel: [47610.446751] ? wbt_wait+0x300/0x300
Mar 23 00:26:23 fileserver kernel: [47610.446829] rq_qos_throttle+0x31/0x40
Mar 23 00:26:23 fileserver kernel: [47610.446911] blk_mq_make_request+0x111/0x530
Mar 23 00:26:23 fileserver kernel: [47610.447004] generic_make_request+0x1a4/0x400
Mar 23 00:26:23 fileserver kernel: [47610.447097] ? sched_clock+0x5/0x10
Mar 23 00:26:23 fileserver kernel: [47610.447174] submit_bio+0x45/0x130
Mar 23 00:26:23 fileserver kernel: [47610.447264] ? md_super_write.part.63+0x90/0x120 [md_mod]
Mar 23 00:26:23 fileserver kernel: [47610.447389] md_update_sb.part.65+0x3a3/0x8d0 [md_mod]
Mar 23 00:26:23 fileserver kernel: [47610.447510] r5l_do_reclaim+0x32d/0x3b0 [raid456]
Mar 23 00:26:23 fileserver kernel: [47610.447624] ? md_rdev_init+0xb0/0xb0 [md_mod]
Mar 23 00:26:23 fileserver kernel: [47610.447726] ? r5l_reclaim_thread+0xe2/0x1f0 [raid456]
Mar 23 00:26:23 fileserver kernel: [47610.447844] ? md_rdev_init+0xb0/0xb0 [md_mod]
Mar 23 00:26:23 fileserver kernel: [47610.447949] md_thread+0x94/0x150 [md_mod]
Mar 23 00:26:23 fileserver kernel: [47610.448039] ? finish_wait+0x80/0x80
Mar 23 00:26:23 fileserver kernel: [47610.451995] kthread+0x112/0x130
Mar 23 00:26:23 fileserver kernel: [47610.455936] ? kthread_bind+0x30/0x30
Mar 23 00:26:23 fileserver kernel: [47610.459862] ret_from_fork+0x1f/0x40
Mar 23 00:26:23 fileserver kernel: [47610.463730] INFO: task jbd2/dm-4-8:530 blocked for more than 120 seconds.
Mar 23 00:26:23 fileserver kernel: [47610.467610] Not tainted 4.19.0-8-amd64 #1 Debian 4.19.98-1
Mar 23 00:26:23 fileserver kernel: [47610.471478] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 23 00:26:23 fileserver kernel: [47610.475411] jbd2/dm-4-8 D 0 530 2 0x80000000
Mar 23 00:26:23 fileserver kernel: [47610.479364] Call Trace:
Mar 23 00:26:23 fileserver kernel: [47610.483404] ? __schedule+0x2a2/0x870
Mar 23 00:26:23 fileserver kernel: [47610.487224] ? bio_alloc_bioset+0xdc/0x220
Mar 23 00:26:23 fileserver kernel: [47610.491041] schedule+0x28/0x80
Mar 23 00:26:23 fileserver kernel: [47610.494873] md_write_start+0x14b/0x220 [md_mod]
Mar 23 00:26:23 fileserver kernel: [47610.498733] ? finish_wait+0x80/0x80
Mar 23 00:26:23 fileserver kernel: [47610.502601] ? finish_wait+0x80/0x80
Mar 23 00:26:23 fileserver kernel: [47610.506452] raid5_make_request+0x83/0xb70 [raid456]
Mar 23 00:26:23 fileserver kernel: [47610.510289] ? part_round_stats+0xbb/0x170
Mar 23 00:26:23 fileserver kernel: [47610.514154] ? finish_wait+0x80/0x80
Mar 23 00:26:23 fileserver kernel: [47610.518034] ? __split_and_process_non_flush+0x159/0x1f0 [dm_mod]
Mar 23 00:26:23 fileserver kernel: [47610.521955] ? finish_wait+0x80/0x80
Mar 23 00:26:23 fileserver kernel: [47610.525881] md_handle_request+0x119/0x190 [md_mod]
Mar 23 00:26:23 fileserver kernel: [47610.529851] md_make_request+0x78/0x160 [md_mod]
Mar 23 00:26:23 fileserver kernel: [47610.533831] generic_make_request+0x1a4/0x400
Mar 23 00:26:23 fileserver kernel: [47610.537802] submit_bio+0x45/0x130
Mar 23 00:26:23 fileserver kernel: [47610.541763] ? guard_bio_eod+0x32/0x100
Mar 23 00:26:23 fileserver kernel: [47610.545727] submit_bh_wbc+0x163/0x190
Mar 23 00:26:23 fileserver kernel: [47610.549716] jbd2_journal_commit_transaction+0x5d8/0x1820 [jbd2]
Mar 23 00:26:23 fileserver kernel: [47610.553780] kjournald2+0xbd/0x270 [jbd2]
Mar 23 00:26:23 fileserver kernel: [47610.557844] ? finish_wait+0x80/0x80
Mar 23 00:26:23 fileserver kernel: [47610.561908] ? commit_timeout+0x10/0x10 [jbd2]
Mar 23 00:26:23 fileserver kernel: [47610.565967] kthread+0x112/0x130
Mar 23 00:26:23 fileserver kernel: [47610.570007] ? kthread_bind+0x30/0x30
Mar 23 00:26:23 fileserver kernel: [47610.574149] ret_from_fork+0x1f/0x40
Mar 23 00:26:23 fileserver kernel: [47610.582528] INFO: task loop0:1465 blocked for more than 120 seconds.
Mar 23 00:26:23 fileserver kernel: [47610.587014] Not tainted 4.19.0-8-amd64 #1 Debian 4.19.98-1
Mar 23 00:26:23 fileserver kernel: [47610.591492] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 23 00:26:23 fileserver kernel: [47610.596079] loop0 D 0 1465 2 0x80000000
Mar 23 00:26:23 fileserver kernel: [47610.600713] Call Trace:
Mar 23 00:26:23 fileserver kernel: [47610.605312] ? __schedule+0x2a2/0x870
Mar 23 00:26:23 fileserver kernel: [47610.609929] ? bit_wait_timeout+0x90/0x90
Mar 23 00:26:23 fileserver kernel: [47610.614567] schedule+0x28/0x80
Mar 23 00:26:23 fileserver kernel: [47610.619194] io_schedule+0x12/0x40
Mar 23 00:26:23 fileserver kernel: [47610.623761] bit_wait_io+0xd/0x50
Mar 23 00:26:23 fileserver kernel: [47610.628247] __wait_on_bit+0x73/0x90
Mar 23 00:26:23 fileserver kernel: [47610.632669] out_of_line_wait_on_bit+0x91/0xb0
Mar 23 00:26:23 fileserver kernel: [47610.637051] ? init_wait_var_entry+0x40/0x40
Mar 23 00:26:23 fileserver kernel: [47610.641462] do_get_write_access+0x2d5/0x430 [jbd2]
Mar 23 00:26:23 fileserver kernel: [47610.645917] ? ext4_dirty_inode+0x46/0x60 [ext4]
Mar 23 00:26:23 fileserver kernel: [47610.650335] jbd2_journal_get_write_access+0x37/0x50 [jbd2]
Mar 23 00:26:23 fileserver kernel: [47610.654873] __ext4_journal_get_write_access+0x36/0x70 [ext4]
Mar 23 00:26:23 fileserver kernel: [47610.659446] ext4_reserve_inode_write+0x96/0xc0 [ext4]
Mar 23 00:26:23 fileserver kernel: [47610.664054] ext4_mark_inode_dirty+0x51/0x1d0 [ext4]
Mar 23 00:26:23 fileserver kernel: [47610.668637] ? jbd2__journal_start+0xd9/0x1e0 [jbd2]
Mar 23 00:26:23 fileserver kernel: [47610.673261] ext4_dirty_inode+0x46/0x60 [ext4]
Mar 23 00:26:23 fileserver kernel: [47610.677834] __mark_inode_dirty+0x1ba/0x380
Mar 23 00:26:23 fileserver kernel: [47610.682432] generic_update_time+0xb6/0xd0
Mar 23 00:26:23 fileserver kernel: [47610.687021] file_update_time+0xe1/0x130
Mar 23 00:26:23 fileserver kernel: [47610.691581] __generic_file_write_iter+0x98/0x1c0
Mar 23 00:26:23 fileserver kernel: [47610.696201] ext4_file_write_iter+0xc6/0x3b0 [ext4]
Mar 23 00:26:23 fileserver kernel: [47610.700755] do_iter_readv_writev+0x13a/0x1b0
Mar 23 00:26:23 fileserver kernel: [47610.705504] do_iter_write+0x80/0x190
Mar 23 00:26:23 fileserver kernel: [47610.710033] lo_write_bvec+0x62/0x100 [loop]
Mar 23 00:26:23 fileserver kernel: [47610.714553] loop_queue_work+0x1c2/0x9b0 [loop]
Mar 23 00:26:23 fileserver kernel: [47610.719090] ? loop_info64_to_compat+0x220/0x220 [loop]
Mar 23 00:26:23 fileserver kernel: [47610.723643] kthread_worker_fn+0x7c/0x1c0
Mar 23 00:26:23 fileserver kernel: [47610.728207] kthread+0x112/0x130
Mar 23 00:26:23 fileserver kernel: [47610.732748] ? kthread_bind+0x30/0x30
Mar 23 00:26:23 fileserver kernel: [47610.737304] ret_from_fork+0x1f/0x40
Mar 23 00:26:23 fileserver kernel: [47610.742311] INFO: task kworker/u9:4:6226 blocked for more than 120 seconds.
Mar 23 00:26:23 fileserver kernel: [47610.746895] Not tainted 4.19.0-8-amd64 #1 Debian 4.19.98-1
Mar 23 00:26:23 fileserver kernel: [47610.751442] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 23 00:26:23 fileserver kernel: [47610.756000] kworker/u9:4 D 0 6226 2 0x80000000
Mar 23 00:26:23 fileserver kernel: [47610.760602] Workqueue: kcryptd kcryptd_crypt [dm_crypt]
Mar 23 00:26:23 fileserver kernel: [47610.765236] Call Trace:
Mar 23 00:26:23 fileserver kernel: [47610.769832] ? __schedule+0x2a2/0x870
Mar 23 00:26:23 fileserver kernel: [47610.774461] ? __percpu_counter_sum+0x56/0x60
Mar 23 00:26:23 fileserver kernel: [47610.779072] schedule+0x28/0x80
Mar 23 00:26:23 fileserver kernel: [47610.783658] schedule_preempt_disabled+0xa/0x10
Mar 23 00:26:23 fileserver kernel: [47610.788309] __mutex_lock.isra.8+0x2b5/0x4a0
Mar 23 00:26:23 fileserver kernel: [47610.792962] kcryptd_crypt+0x26e/0x3b0 [dm_crypt]
Mar 23 00:26:23 fileserver kernel: [47610.797629] process_one_work+0x1a7/0x3a0
Mar 23 00:26:23 fileserver kernel: [47610.802294] worker_thread+0x30/0x390
Mar 23 00:26:23 fileserver kernel: [47610.806928] ? create_worker+0x1a0/0x1a0
Mar 23 00:26:23 fileserver kernel: [47610.811514] kthread+0x112/0x130
Mar 23 00:26:23 fileserver kernel: [47610.816169] ? kthread_bind+0x30/0x30
Mar 23 00:26:23 fileserver kernel: [47610.820711] ret_from_fork+0x1f/0x40
Mar 23 00:26:23 fileserver kernel: [47610.825247] INFO: task kworker/0:2:6229 blocked for more than 120 seconds.
Mar 23 00:26:23 fileserver kernel: [47610.829879] Not tainted 4.19.0-8-amd64 #1 Debian 4.19.98-1
Mar 23 00:26:23 fileserver kernel: [47610.834557] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 23 00:26:23 fileserver kernel: [47610.839340] kworker/0:2 D 0 6229 2 0x80000000
Mar 23 00:26:23 fileserver kernel: [47610.844148] Workqueue: kcopyd do_work [dm_mod]
Mar 23 00:26:23 fileserver kernel: [47610.848897] Call Trace:
Mar 23 00:26:23 fileserver kernel: [47610.853618] ? __schedule+0x2a2/0x870
Mar 23 00:26:23 fileserver kernel: [47610.858354] schedule+0x28/0x80
Mar 23 00:26:23 fileserver kernel: [47610.863065] io_schedule+0x12/0x40
Mar 23 00:26:23 fileserver kernel: [47610.867784] wbt_wait+0x205/0x300
Mar 23 00:26:23 fileserver kernel: [47610.872497] ? wbt_wait+0x300/0x300
Mar 23 00:26:23 fileserver kernel: [47610.877158] rq_qos_throttle+0x31/0x40
Mar 23 00:26:23 fileserver kernel: [47610.881776] blk_mq_make_request+0x111/0x530
Mar 23 00:26:23 fileserver kernel: [47610.886335] generic_make_request+0x1a4/0x400
Mar 23 00:26:23 fileserver kernel: [47610.890838] ? bvec_alloc+0x51/0xe0
Mar 23 00:26:23 fileserver kernel: [47610.895328] submit_bio+0x45/0x130
Mar 23 00:26:23 fileserver kernel: [47610.899793] ? bio_add_page+0x48/0x60
Mar 23 00:26:23 fileserver kernel: [47610.904280] dispatch_io+0x1ae/0x3f0 [dm_mod]
Mar 23 00:26:23 fileserver kernel: [47610.908777] ? dm_copy_name_and_uuid+0xa0/0xa0 [dm_mod]
Mar 23 00:26:23 fileserver kernel: [47610.913337] ? list_get_page+0x30/0x30 [dm_mod]
Mar 23 00:26:23 fileserver kernel: [47610.917886] ? blk_mq_run_hw_queue+0x88/0x110
Mar 23 00:26:23 fileserver kernel: [47610.922485] ? dm_kcopyd_do_callback+0x40/0x40 [dm_mod]
Mar 23 00:26:23 fileserver kernel: [47610.927102] dm_io+0x111/0x220 [dm_mod]
Mar 23 00:26:23 fileserver kernel: [47610.931706] ? dm_copy_name_and_uuid+0xa0/0xa0 [dm_mod]
Mar 23 00:26:23 fileserver kernel: [47610.936351] ? list_get_page+0x30/0x30 [dm_mod]
Mar 23 00:26:23 fileserver kernel: [47610.940992] ? blk_mq_run_hw_queue+0x88/0x110
Mar 23 00:26:23 fileserver kernel: [47610.945602] run_io_job+0xe0/0x1d0 [dm_mod]
Mar 23 00:26:23 fileserver kernel: [47610.950175] ? dm_kcopyd_do_callback+0x40/0x40 [dm_mod]
Mar 23 00:26:23 fileserver kernel: [47610.954787] process_jobs+0x89/0x230 [dm_mod]
Mar 23 00:26:23 fileserver kernel: [47610.959368] ? dm_kcopyd_client_destroy+0x140/0x140 [dm_mod]
Mar 23 00:26:23 fileserver kernel: [47610.964032] do_work+0xb9/0xf0 [dm_mod]
Mar 23 00:26:23 fileserver kernel: [47610.968644] process_one_work+0x1a7/0x3a0
Mar 23 00:26:23 fileserver kernel: [47610.973249] worker_thread+0x30/0x390
Mar 23 00:26:23 fileserver kernel: [47610.977847] ? create_worker+0x1a0/0x1a0
Mar 23 00:26:23 fileserver kernel: [47610.982462] kthread+0x112/0x130
Mar 23 00:26:23 fileserver kernel: [47610.987041] ? kthread_bind+0x30/0x30
Mar 23 00:26:23 fileserver kernel: [47610.991635] ret_from_fork+0x1f/0x40
Mar 23 00:26:23 fileserver kernel: [47610.996325] INFO: task kworker/u8:4:6232 blocked for more than 120 seconds.
Mar 23 00:26:23 fileserver kernel: [47611.000721] Not tainted 4.19.0-8-amd64 #1 Debian 4.19.98-1
Mar 23 00:26:23 fileserver kernel: [47611.005102] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 23 00:26:23 fileserver kernel: [47611.009559] kworker/u8:4 D 0 6232 2 0x80000000
Mar 23 00:26:23 fileserver kernel: [47611.014022] Workqueue: writeback wb_workfn (flush-253:4)
Mar 23 00:26:23 fileserver kernel: [47611.018485] Call Trace:
Mar 23 00:26:23 fileserver kernel: [47611.022951] ? __schedule+0x2a2/0x870
Mar 23 00:26:23 fileserver kernel: [47611.027429] schedule+0x28/0x80
Mar 23 00:26:24 fileserver kernel: [47611.031910] md_write_start+0x14b/0x220 [md_mod]
Mar 23 00:26:24 fileserver kernel: [47611.036375] ? finish_wait+0x80/0x80
Mar 23 00:26:24 fileserver kernel: [47611.040782] ? finish_wait+0x80/0x80
Mar 23 00:26:24 fileserver kernel: [47611.045073] raid5_make_request+0x83/0xb70 [raid456]
Mar 23 00:26:24 fileserver kernel: [47611.049349] ? part_round_stats+0xbb/0x170
Mar 23 00:26:24 fileserver kernel: [47611.053574] ? finish_wait+0x80/0x80
Mar 23 00:26:24 fileserver kernel: [47611.057783] ? __split_and_process_non_flush+0x159/0x1f0 [dm_mod]
Mar 23 00:26:24 fileserver kernel: [47611.062047] ? finish_wait+0x80/0x80
Mar 23 00:26:24 fileserver kernel: [47611.066324] md_handle_request+0x119/0x190 [md_mod]
Mar 23 00:26:24 fileserver kernel: [47611.070620] md_make_request+0x78/0x160 [md_mod]
Mar 23 00:26:24 fileserver kernel: [47611.074919] generic_make_request+0x1a4/0x400
Mar 23 00:26:24 fileserver kernel: [47611.079202] ? set_next_entity+0x96/0x1b0
Mar 23 00:26:24 fileserver kernel: [47611.083489] submit_bio+0x45/0x130
Mar 23 00:26:24 fileserver kernel: [47611.087804] ext4_io_submit+0x49/0x60 [ext4]
Mar 23 00:26:24 fileserver kernel: [47611.092140] ext4_bio_write_page+0x24a/0x4d0 [ext4]
Mar 23 00:26:24 fileserver kernel: [47611.096480] mpage_submit_page+0x53/0x70 [ext4]
Mar 23 00:26:24 fileserver kernel: [47611.100851] mpage_process_page_bufs+0xe7/0xf0 [ext4]
Mar 23 00:26:24 fileserver kernel: [47611.105232] mpage_prepare_extent_to_map+0x1db/0x2b0 [ext4]
Mar 23 00:26:24 fileserver kernel: [47611.109658] ext4_writepages+0x3da/0xf00 [ext4]
Mar 23 00:26:24 fileserver kernel: [47611.113984] ? __ip_queue_xmit+0x15d/0x410
Mar 23 00:26:24 fileserver kernel: [47611.118293] ? do_writepages+0x41/0xd0
Mar 23 00:26:24 fileserver kernel: [47611.122517] do_writepages+0x41/0xd0
Mar 23 00:26:24 fileserver kernel: [47611.126643] ? __tcp_push_pending_frames+0x31/0xd0
Mar 23 00:26:24 fileserver kernel: [47611.130769] ? tcp_sendmsg_locked+0x491/0xd50
Mar 23 00:26:24 fileserver kernel: [47611.134905] __writeback_single_inode+0x3d/0x350
Mar 23 00:26:24 fileserver kernel: [47611.139057] writeback_sb_inodes+0x1e3/0x450
Mar 23 00:26:24 fileserver kernel: [47611.143228] __writeback_inodes_wb+0x5d/0xb0
Mar 23 00:26:24 fileserver kernel: [47611.147404] wb_writeback+0x25f/0x2f0
Mar 23 00:26:24 fileserver kernel: [47611.151587] ? get_nr_inodes+0x35/0x50
Mar 23 00:26:24 fileserver kernel: [47611.155770] ? cpumask_next+0x16/0x20
Mar 23 00:26:24 fileserver kernel: [47611.159959] wb_workfn+0x186/0x400
Mar 23 00:26:24 fileserver kernel: [47611.164175] ? call_transmit+0x1b6/0x210 [sunrpc]
Mar 23 00:26:24 fileserver kernel: [47611.168378] process_one_work+0x1a7/0x3a0
Mar 23 00:26:24 fileserver kernel: [47611.172571] worker_thread+0x30/0x390
Mar 23 00:26:24 fileserver kernel: [47611.176773] ? create_worker+0x1a0/0x1a0
Mar 23 00:26:24 fileserver kernel: [47611.180964] kthread+0x112/0x130
Mar 23 00:26:24 fileserver kernel: [47611.185146] ? kthread_bind+0x30/0x30
Mar 23 00:26:24 fileserver kernel: [47611.189294] ret_from_fork+0x1f/0x40
Mar 23 00:26:24 fileserver kernel: [47611.193490] INFO: task kworker/u9:2:6459 blocked for more than 120 seconds.
Mar 23 00:26:24 fileserver kernel: [47611.198110] Not tainted 4.19.0-8-amd64 #1 Debian 4.19.98-1
Mar 23 00:26:24 fileserver kernel: [47611.202969] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 23 00:26:24 fileserver kernel: [47611.207642] kworker/u9:2 D 0 6459 2 0x80000000
Mar 23 00:26:24 fileserver kernel: [47611.212325] Workqueue: kcryptd kcryptd_crypt [dm_crypt]
Mar 23 00:26:24 fileserver kernel: [47611.216990] Call Trace:
Mar 23 00:26:24 fileserver kernel: [47611.221641] ? __schedule+0x2a2/0x870
Mar 23 00:26:24 fileserver kernel: [47611.226318] ? __percpu_counter_sum+0x56/0x60
Mar 23 00:26:24 fileserver kernel: [47611.230967] schedule+0x28/0x80
Mar 23 00:26:24 fileserver kernel: [47611.235574] schedule_preempt_disabled+0xa/0x10
Mar 23 00:26:24 fileserver kernel: [47611.240176] __mutex_lock.isra.8+0x2b5/0x4a0
Mar 23 00:26:24 fileserver kernel: [47611.244757] kcryptd_crypt+0x26e/0x3b0 [dm_crypt]
Mar 23 00:26:24 fileserver kernel: [47611.249357] process_one_work+0x1a7/0x3a0
Mar 23 00:26:24 fileserver kernel: [47611.253949] worker_thread+0x30/0x390
Mar 23 00:26:24 fileserver kernel: [47611.258536] ? create_worker+0x1a0/0x1a0
Mar 23 00:26:24 fileserver kernel: [47611.263120] kthread+0x112/0x130
Mar 23 00:26:24 fileserver kernel: [47611.267700] ? kthread_bind+0x30/0x30
Mar 23 00:26:24 fileserver kernel: [47611.272276] ret_from_fork+0x1f/0x40
Mar 23 00:26:24 fileserver kernel: [47611.276833] INFO: task kworker/u9:0:6527 blocked for more than 120 seconds.
Mar 23 00:26:24 fileserver kernel: [47611.281412] Not tainted 4.19.0-8-amd64 #1 Debian 4.19.98-1
Mar 23 00:26:24 fileserver kernel: [47611.285950] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 23 00:26:24 fileserver kernel: [47611.290541] kworker/u9:0 D 0 6527 2 0x80000000
Mar 23 00:26:24 fileserver kernel: [47611.295135] Workqueue: kcryptd kcryptd_crypt [dm_crypt]
Mar 23 00:26:24 fileserver kernel: [47611.299713] Call Trace:
Mar 23 00:26:24 fileserver kernel: [47611.304273] ? __schedule+0x2a2/0x870
Mar 23 00:26:24 fileserver kernel: [47611.308855] ? __percpu_counter_sum+0x56/0x60
Mar 23 00:26:24 fileserver kernel: [47611.313427] schedule+0x28/0x80
Mar 23 00:26:24 fileserver kernel: [47611.317973] schedule_preempt_disabled+0xa/0x10
Mar 23 00:26:24 fileserver kernel: [47611.322595] __mutex_lock.isra.8+0x2b5/0x4a0
Mar 23 00:26:24 fileserver kernel: [47611.327218] kcryptd_crypt+0x26e/0x3b0 [dm_crypt]
Mar 23 00:26:24 fileserver kernel: [47611.331842] process_one_work+0x1a7/0x3a0
Mar 23 00:26:24 fileserver kernel: [47611.336471] worker_thread+0x30/0x390
Mar 23 00:26:24 fileserver kernel: [47611.341072] ? create_worker+0x1a0/0x1a0
Mar 23 00:26:24 fileserver kernel: [47611.345625] kthread+0x112/0x130
Mar 23 00:26:24 fileserver kernel: [47611.350145] ? kthread_bind+0x30/0x30
Mar 23 00:26:24 fileserver kernel: [47611.354652] ret_from_fork+0x1f/0x40
Mar 23 00:28:26 fileserver kernel: [47733.329407] INFO: task khugepaged:38 blocked for more than 120 seconds.
Mar 23 00:28:26 fileserver kernel: [47733.334326] Not tainted 4.19.0-8-amd64 #1 Debian 4.19.98-1
Mar 23 00:28:26 fileserver kernel: [47733.339055] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 23 00:28:26 fileserver kernel: [47733.343862] khugepaged D 0 38 2 0x80000000
Mar 23 00:28:26 fileserver kernel: [47733.348614] Call Trace:
Mar 23 00:28:26 fileserver kernel: [47733.353322] ? __schedule+0x2a2/0x870
Mar 23 00:28:26 fileserver kernel: [47733.358052] schedule+0x28/0x80
Mar 23 00:28:26 fileserver kernel: [47733.362711] io_schedule+0x12/0x40
Mar 23 00:28:26 fileserver kernel: [47733.367343] wbt_wait+0x205/0x300
Mar 23 00:28:26 fileserver kernel: [47733.371989] ? wbt_wait+0x300/0x300
Mar 23 00:28:26 fileserver kernel: [47733.376592] rq_qos_throttle+0x31/0x40
Mar 23 00:28:26 fileserver kernel: [47733.381204] blk_mq_make_request+0x111/0x530
Mar 23 00:28:26 fileserver kernel: [47733.385851] generic_make_request+0x1a4/0x400
Mar 23 00:28:26 fileserver kernel: [47733.390457] ? end_swap_bio_read+0xc0/0xc0
Mar 23 00:28:26 fileserver kernel: [47733.395055] submit_bio+0x45/0x130
Mar 23 00:28:26 fileserver kernel: [47733.399658] ? get_swap_bio+0xbb/0xf0
Mar 23 00:28:26 fileserver kernel: [47733.404225] __swap_writepage+0xf2/0x3c0
Mar 23 00:28:26 fileserver kernel: [47733.408796] ? __frontswap_store+0x6e/0xf2
Mar 23 00:28:26 fileserver kernel: [47733.413406] pageout.isra.49+0x117/0x340
Mar 23 00:28:26 fileserver kernel: [47733.417993] shrink_page_list+0xa47/0xc70
Mar 23 00:28:26 fileserver kernel: [47733.422615] shrink_inactive_list+0x207/0x590
Mar 23 00:28:26 fileserver kernel: [47733.427195] shrink_node_memcg+0x20c/0x780
Mar 23 00:28:26 fileserver kernel: [47733.431823] shrink_node+0xcf/0x450
Mar 23 00:28:26 fileserver kernel: [47733.436350] do_try_to_free_pages+0xc6/0x370
Mar 23 00:28:26 fileserver kernel: [47733.440901] try_to_free_pages+0xf0/0x1b0
Mar 23 00:28:26 fileserver kernel: [47733.445485] __alloc_pages_slowpath+0x35a/0xcb0
Mar 23 00:28:26 fileserver kernel: [47733.450069] ? __switch_to+0x8c/0x440
Mar 23 00:28:26 fileserver kernel: [47733.454655] ? put_prev_entity+0x20/0x100
Mar 23 00:28:26 fileserver kernel: [47733.459184] __alloc_pages_nodemask+0x28b/0x2b0
Mar 23 00:28:26 fileserver kernel: [47733.463682] khugepaged_alloc_page+0x17/0x50
Mar 23 00:28:26 fileserver kernel: [47733.468108] khugepaged+0xb6e/0x2110
Mar 23 00:28:26 fileserver kernel: [47733.472463] ? finish_wait+0x80/0x80
Mar 23 00:28:26 fileserver kernel: [47733.476750] ? collapse_shmem+0xc00/0xc00
Mar 23 00:28:26 fileserver kernel: [47733.481019] kthread+0x112/0x130
Mar 23 00:28:26 fileserver kernel: [47733.485283] ? kthread_bind+0x30/0x30
Mar 23 00:28:26 fileserver kernel: [47733.489558] ret_from_fork+0x1f/0x40
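Since the traces are long, a condensed view of which tasks were reported as blocked may be easier to discuss. This is a rough sketch, assuming the "INFO: task NAME:PID blocked for more than" line format seen above; `hung_task_summary` is a hypothetical helper name.

```shell
# Count hung-task reports per task name in a kernel log read from stdin.
# Output: "<count> <task name>", most frequent first.
# hung_task_summary is a hypothetical helper name.
hung_task_summary() {
  # Keep only the task name; the greedy match keeps names that contain
  # a colon themselves (e.g. kworker/u9:4) intact and drops only the PID.
  sed -n 's/.*INFO: task \(.*\):[0-9][0-9]* blocked for more than.*/\1/p' |
    awk '{ n[$0]++ } END { for (t in n) print n[t], t }' |
    sort -k1,1nr -k2
}
```

Usage: `hung_task_summary < /var/log/kern.log`. In the excerpt above, khugepaged is the only task reported twice.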
After that, the processes dmcrypt_write and md0_raid6 each keep one thread at 100% load.
The load average of 186 is far from anything "normal".
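On Linux the load average also counts tasks in uninterruptible sleep (state D), which fits the blocked-task messages above. A small sketch for listing those tasks, assuming a procps-style `ps`; `d_state_tasks` is a hypothetical helper name.

```shell
# Print "PID COMMAND" for every task in uninterruptible sleep (state D).
# Expects "STATE PID COMMAND" lines on stdin.
# d_state_tasks is a hypothetical helper name.
d_state_tasks() {
  awk '$1 ~ /^D/ { sub(/^ *[^ ]+ +/, ""); print }'
}
```

Usage: `ps -eo state=,pid=,comm= | d_state_tasks`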
After a reboot the RAID looks like this:
Code:
root@fileserver:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sdh1[1](S) sda1[0](S) sde1[2](S) sdd1[3](S) sdc1[4](S) sdb1[5](S)
23441312685 blocks super 1.2
unused devices: <none>
root@fileserver:~# mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Thu Aug 8 15:30:29 2019
Raid Level : raid6
Used Dev Size : 18446744073709551615
Raid Devices : 6
Total Devices : 7
Persistence : Superblock is persistent
Update Time : Sat Mar 21 19:34:04 2020
State : clean, FAILED, Not Started
Active Devices : 6
Working Devices : 7
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : journal
Name : fileserver:0 (local to host fileserver)
UUID : 09f3e3e1:5f3a19d2:b3dc9e5d:8ad6180a
Events : 51404
Number Major Minor RaidDevice State
- 0 0 0 removed
- 0 0 1 removed
- 0 0 2 removed
- 0 0 3 removed
- 0 0 4 removed
- 0 0 5 removed
- 8 1 0 sync /dev/sda1
- 8 83 - spare /dev/sdf3
- 8 113 1 sync /dev/sdh1
- 8 65 2 sync /dev/sde1
- 8 49 3 sync /dev/sdd1
- 8 33 4 sync /dev/sdc1
- 8 17 5 sync /dev/sdb1
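In addition to `mdadm --detail`, the md driver exposes some of this state through sysfs. The sketch below only reads a few attributes. It assumes the array is `md0` and that the attribute files `array_state`, `consistency_policy` and `journal_mode` are present on this kernel; attributes that are missing (for example while the array is not started) are reported as not available. `md_attrs` is a hypothetical helper name.

```shell
# Print selected md sysfs attributes for one array.
# $1: attribute directory (assumption: /sys/block/md0/md if omitted).
# md_attrs is a hypothetical helper name.
md_attrs() {
  dir=${1:-/sys/block/md0/md}
  for attr in array_state consistency_policy journal_mode; do
    if [ -r "$dir/$attr" ]; then
      printf '%s: %s\n' "$attr" "$(cat "$dir/$attr")"
    else
      printf '%s: (not available)\n' "$attr"
    fi
  done
}
```

Usage: `md_attrs` for md0, or `md_attrs /sys/block/md1/md` for another array.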
All attempts to get the RAID running with --assemble and --run failed.
Only the following cumbersome procedure gives me a working RAID again:
- Comment out the filesystem in /etc/fstab.
- Uninstall lvm2, otherwise mdadm reports that the RAID is busy.
- Delete the cache partition with fdisk.
- Reboot the machine.
- After that the RAID looks like this:
Code:
root@fileserver:~# mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Raid Level : raid0
Total Devices : 6
Persistence : Superblock is persistent
State : inactive
Working Devices : 6
Name : fileserver:0 (local to host fileserver)
UUID : 09f3e3e1:5f3a19d2:b3dc9e5d:8ad6180a
Events : 51404
Number Major Minor RaidDevice
- 8 1 - /dev/sda1
- 8 113 - /dev/sdh1
- 8 65 - /dev/sde1
- 8 49 - /dev/sdd1
- 8 33 - /dev/sdc1
- 8 17 - /dev/sdb1
- Start the RAID with [...]
- Recreate the cache partition on the SSD with fdisk.
- Add the cache with
Code:
mdadm --manage /dev/md0 --add-journal /dev/sdf3
- Install the lvm2 tools.
- Re-enable the filesystem in /etc/fstab.
- Reboot the machine.
Now my two questions:
- Can anyone tell from these messages what exactly the problem is?
- I suspect the RAID's cache is the cause. Does anyone know how to remove it again? There are many sources that show how to create it, but I have not found one that shows how to remove it.
Is any information missing?
Thanks for your help!