From patchwork Thu May 19 08:56:34 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Tremer
X-Patchwork-Id: 5627
From: Michael Tremer
To: development@lists.ipfire.org
Cc: Michael Tremer
Subject: [PATCH 2/2] core168: Add script to automatically repair MDRAID arrays
Date: Thu, 19 May 2022 08:56:34 +0000
Message-Id: <20220519085634.197389-2-michael.tremer@ipfire.org>
In-Reply-To: <20220519085634.197389-1-michael.tremer@ipfire.org>
References: <20220519085634.197389-1-michael.tremer@ipfire.org>
List-Id: IPFire development talk

Please see the header of the script for more details.

Signed-off-by: Michael Tremer
---
 config/rootfiles/common/aarch64/stage2 |   1 +
 config/rootfiles/common/armv6l/stage2  |   1 +
 config/rootfiles/common/x86_64/stage2  |   1 +
 config/rootfiles/core/168/update.sh    |   3 +
 src/scripts/repair-mdraid              | 169 +++++++++++++++++++++++++
 5 files changed, 175 insertions(+)
 create mode 100644 src/scripts/repair-mdraid

diff --git a/config/rootfiles/common/aarch64/stage2 b/config/rootfiles/common/aarch64/stage2
index 352c704d4..e328a4526 100644
--- a/config/rootfiles/common/aarch64/stage2
+++ b/config/rootfiles/common/aarch64/stage2
@@ -99,6 +99,7 @@ usr/local/bin/ipsec-interfaces
 usr/local/bin/makegraphs
 usr/local/bin/qosd
 usr/local/bin/readhash
+usr/local/bin/repair-mdraid
 usr/local/bin/run-parts
 usr/local/bin/scanhd
 usr/local/bin/settime
diff --git a/config/rootfiles/common/armv6l/stage2 b/config/rootfiles/common/armv6l/stage2
index 198461a01..2bd00d968 100644
--- a/config/rootfiles/common/armv6l/stage2
+++ b/config/rootfiles/common/armv6l/stage2
@@ -97,6 +97,7 @@ usr/local/bin/ipsec-interfaces
 usr/local/bin/makegraphs
 usr/local/bin/qosd
 usr/local/bin/readhash
+usr/local/bin/repair-mdraid
 usr/local/bin/run-parts
 usr/local/bin/scanhd
 usr/local/bin/settime
diff --git a/config/rootfiles/common/x86_64/stage2 b/config/rootfiles/common/x86_64/stage2
index b03a7fecf..586b88e3d 100644
--- a/config/rootfiles/common/x86_64/stage2
+++ b/config/rootfiles/common/x86_64/stage2
@@ -99,6 +99,7 @@ usr/local/bin/ipsec-interfaces
 usr/local/bin/makegraphs
 usr/local/bin/qosd
 usr/local/bin/readhash
+usr/local/bin/repair-mdraid
 usr/local/bin/run-parts
 usr/local/bin/scanhd
 usr/local/bin/settime
diff --git a/config/rootfiles/core/168/update.sh b/config/rootfiles/core/168/update.sh
index c4005dba9..84dec941c 100644
--- a/config/rootfiles/core/168/update.sh
+++ b/config/rootfiles/core/168/update.sh
@@ -125,6 +125,9 @@ if ! grep -q rd.auto /etc/default/grub; then
 	sed -e "s/panic=10/& rd.auto/" -i /etc/default/grub
 fi
 
+# Repair any broken MDRAID arrays
+/usr/local/bin/repair-mdraid
+
 # Start services
 /etc/init.d/fcron restart
 /etc/init.d/sshd restart
diff --git a/src/scripts/repair-mdraid b/src/scripts/repair-mdraid
new file mode 100644
index 000000000..a622ff71d
--- /dev/null
+++ b/src/scripts/repair-mdraid
@@ -0,0 +1,169 @@
+#!/bin/bash
+###############################################################################
+#                                                                             #
+# IPFire.org - A linux based firewall                                         #
+# Copyright (C) 2022 IPFire Team                                              #
+#                                                                             #
+# This program is free software: you can redistribute it and/or modify        #
+# it under the terms of the GNU General Public License as published by        #
+# the Free Software Foundation, either version 3 of the License, or           #
+# (at your option) any later version.                                         #
+#                                                                             #
+# This program is distributed in the hope that it will be useful,             #
+# but WITHOUT ANY WARRANTY; without even the implied warranty of              #
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the                #
+# GNU General Public License for more details.                                #
+#                                                                             #
+# You should have received a copy of the GNU General Public License           #
+# along with this program. If not, see .                                      #
+#                                                                             #
+###############################################################################
+#
+# This script is supposed to repair any broken RAID installations
+# where the system has been booted from only one of the RAID devices
+# without the software RAID being activated first.
+#
+# This script works as follows:
+#
+# * It tries to find an inactive RAID called "ipfire:0"
+# * It will then destroy any devices that are still part of this RAID.
+#   This is required because if the RAID is being assembled correctly,
+#   data from the disk that has NOT been mounted will be replicated
+#   back to the device that has been changed. As a result, any data
+#   that has been written to the mounted disk would be lost.
+#   To avoid this, we will partially destroy the RAID.
+# * We will then erase any partition tables and destroy any filesystems
+#   on the devices so that they do not get accidentally mounted again.
+# * The system will then need to be rebooted where the RAID will be
+#   mounted again in a degraded state which might take some extra
+#   time at boot (the system stands still for about a minute).
+# * After the system has been booted up correctly, we will re-add
+#   the devices back to the RAID which will resync and the system
+#   will be back to its intended configuration.
+
+find_inactive_raid() {
+	local status
+	local device
+	local arg
+	local args
+
+	while read -r status device args; do
+		if [ "${status}" = "INACTIVE-ARRAY" ]; then
+			for arg in ${args}; do
+				case "${arg}" in
+					name=ipfire:0)
+						echo "${device}"
+						return 0
+						;;
+				esac
+			done
+		fi
+	done <<< "$(mdadm --detail --scan)"
+
+	return 1
+}
+
+find_root() {
+	local device
+	local mp
+	local fs
+	local args
+
+	while read -r device mp fs args; do
+		if [ "${mp}" = "/" ]; then
+			echo "${device:0:-1}"
+			return 0
+		fi
+	done < /proc/mounts
+
+	return 1
+}
+
+find_raid_devices() {
+	local raid="${1}"
+
+	local IFS=,
+
+	local device
+	for device in $(mdadm -v --detail --scan "${raid}" | awk -F= '/^[ ]+devices/ { print $2 }'); do
+		echo "${device}"
+	done
+
+	return 0
+}
+
+destroy_everything() {
+	local device="${1}"
+	local part
+
+	# Destroy the RAID superblock
+	mdadm --zero-superblock "${device}"
+
+	# Wipe the partition table
+	wipefs -a "${device}"
+
+	# Wipe any partition signatures
+	for part in ${device}*; do
+		wipefs -a "${part}"
+	done
+}
+
+raid_rebuild() {
+	local devices=( "$@" )
+
+	cat > /etc/rc.d/rcsysinit.d/S99fix-raid </dev/null
+
+	# Destroy any useful data on all remaining RAID devices
+	local device
+	for device in ${devices[@]}; do
+		# Skip root
+		[ "${device}" = "${root}" ] && continue
+
+		destroy_everything "${device}"
+	done &>/dev/null
+
+	# Re-add devices to the RAID
+	raid_rebuild "${device}"
+
+	return 0
+}
+
+main "$@" || exit $?
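
The detection step described in the script header can be tried out
without a real array. The following is a minimal sketch, not part of
the patch: it assumes that "mdadm --detail --scan" reports an inactive
array as a line starting with INACTIVE-ARRAY, followed by the device
and a list of key=value fields, which is what find_inactive_raid in the
patch expects. The function name find_inactive_raid_in and the sample
scan output are hypothetical and only serve as an illustration.

```shell
#!/bin/bash
# Sketch only: mirrors the parsing done by find_inactive_raid() in the patch,
# but reads the scan output from a string argument instead of calling mdadm.
# The function name and the sample text below are made up for illustration.

find_inactive_raid_in() {
	local scan="${1}"
	local status device args arg

	while read -r status device args; do
		# Only inactive arrays are of interest
		[ "${status}" = "INACTIVE-ARRAY" ] || continue

		# Look for the array name among the remaining key=value fields
		for arg in ${args}; do
			if [ "${arg}" = "name=ipfire:0" ]; then
				echo "${device}"
				return 0
			fi
		done
	done <<< "${scan}"

	return 1
}

# Hypothetical scan output: one active array and one inactive "ipfire:0"
sample="ARRAY /dev/md/data metadata=1.2 name=host:data UUID=00000000:00000000:00000000:00000001
INACTIVE-ARRAY /dev/md127 metadata=1.2 name=ipfire:0 UUID=00000000:00000000:00000000:00000002"

find_inactive_raid_in "${sample}"   # prints /dev/md127
```

Passing the scan output as an argument keeps the parsing separate from
the mdadm call, so it can be checked on a machine without any RAID.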