Squoggle

Mac's tech blog

CentOS Drive Testing

My server was making uncharacteristic noises. This is how I tested my hard drives for failure.

  1. Install smartmontools:
    # yum install smartmontools
  2. Get a listing of all your hard drives:
    # lsblk
  3. Run a test on one of the hard drives:
    # smartctl -t short /dev/sda
    You will see something similar to the following:
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-693.11.6.el7.x86_64] (local build)
    Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
    Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
    Testing has begun.
    Please wait 2 minutes for test to complete.
    Test will complete after Fri Sep 23 13:02:21 2022
    Use smartctl -X to abort test.
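  4. The output gives the time at which the test will finish. When that time has passed, come back and check the drive like this (smartctl -H reports the drive's overall health assessment; the result of the self-test itself is recorded in the self-test log, which smartctl -l selftest /dev/sda displays):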
    # smartctl -H /dev/sda
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-693.11.6.el7.x86_64] (local build)
    Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
  5. If the test fails you will see something like this:
    # smartctl -H /dev/sdb
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-693.11.6.el7.x86_64] (local build)
    Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: FAILED!
    Drive failure expected in less than 24 hours. SAVE ALL DATA.
    Failed Attributes:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      5 Reallocated_Sector_Ct   0x0033   063   063   140    Pre-fail  Always   FAILING_NOW 1089
  6. Looks like you need to replace /dev/sdb
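
The health check above can be repeated for every disk in one pass. What follows is only a rough sketch, not something from my original notes: smart_health and check_all_disks are hypothetical helper names, and the parsing assumes the ATA-style "overall-health self-assessment" line shown above (drives that report health in a different format will come back as UNKNOWN).

```shell
#!/bin/sh
# smart_health: read `smartctl -H` output on stdin and print
# PASSED, FAILED, or UNKNOWN (no recognizable result line).
smart_health() {
  awk '
    /overall-health self-assessment test result:/ {
      result = ($NF ~ /^PASSED/) ? "PASSED" : "FAILED"
    }
    END { print (result == "" ? "UNKNOWN" : result) }
  '
}

# check_all_disks: needs root and the smartmontools package.
check_all_disks() {
  # -d: whole devices only, -n: no header; keep rows whose TYPE is "disk"
  lsblk -d -n -o NAME,TYPE | awk '$2 == "disk" { print $1 }' |
  while read -r name; do
    status=$(smartctl -H "/dev/$name" | smart_health)
    printf '%s\t%s\n' "/dev/$name" "$status"
  done
}
```

Running check_all_disks as root prints one line per disk, so a FAILED entry stands out without reading the full smartctl banner each time.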

How to Replace the Hard drive

This is what I did to replace the hard drive.

  1. Install lshw package:
    # yum install lshw
  2. Now list hardware of type disk:
    # lshw -class disk
    You should get way too much info.
  3. Filter the info with grep like so:
    # lshw -class disk | grep -A 5 -B 6 /dev/sdb
    You should now only get the one drive you are looking for.
    Mine looks like this:
    # lshw -class disk | grep -A 5 -B 6 /dev/sdb
    *-disk:1
    description: ATA Disk
    product: WDC WD1002FAEX-0
    vendor: Western Digital
    physical id: 1
    bus info: scsi@5:0.0.0
    logical name: /dev/sdb
    version: 1D05
    serial: WD-WCATR1933480
    size: 931GiB (1TB)
    capabilities: partitioned partitioned:dos
    configuration: ansiversion=5 logicalsectorsize=512 sectorsize=512 signature=000cd438

So it looks like I need to replace a 1TB Western Digital. Fortunately this disk is in a two-disk RAID array.
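
If all you want from lshw is the serial number, the grep with -A and -B can be swapped for a small awk filter. This is a sketch under the assumption that the output keeps the "logical name:" and "serial:" field names shown above; disk_serial is a hypothetical helper name.

```shell
#!/bin/sh
# disk_serial DEVICE: read `lshw -class disk` output on stdin and print the
# serial number of the entry whose logical name is DEVICE.
disk_serial() {
  awk -v dev="$1" '
    # A line starting with "*-" opens a new hardware entry.
    /^[ \t]*\*-/ {
      if (found && serial != "") print serial
      found = 0; serial = ""
    }
    $1 == "serial:" { serial = $2 }
    $1 == "logical" && $2 == "name:" && $3 == dev { found = 1 }
    END { if (found && serial != "") print serial }
  '
}
```

Used as lshw -class disk | disk_serial /dev/sdb, it prints just the serial to look for on the drive label.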

Remove the HD from the Raid Array

This is what I did to remove the hard drive from the RAID array. Before proceeding, back up everything. I do a daily offsite backup, so in theory I am covered.

  1. Redo the lsblk command from above to confirm which disk is which:
    # lsblk
    NAME                  MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
    sdb                     8:16   0 931.5G  0 disk
    └─sdb1                  8:17   0 931.5G  0 part
      └─md0                 9:0    0 931.4G  0 raid1
        └─vg_raid-lv_raid 253:4    0 931.4G  0 lvm   /mnt/Raid
    sdc                     8:32   0 931.5G  0 disk
    └─sdc1                  8:33   0 931.5G  0 part
      └─md0                 9:0    0 931.4G  0 raid1
        └─vg_raid-lv_raid 253:4    0 931.4G  0 lvm   /mnt/Raid
  2. Remember that the defective disk in this case is /dev/sdb and the good one is /dev/sdc
  3. Write all cache to disk:
    # sync
  4. Set the disk as failed with mdadm:
    # mdadm --manage /dev/md0 --fail /dev/sdb1
    This is the failed partition from /dev/sdb.
    You should see something like this:
    mdadm: set /dev/sdb1 faulty in /dev/md0
  5. Confirm it has been marked as failed:
    # cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sdc1[1] sdb1[0](F)
    976630464 blocks super 1.2 [2/1] [_U]
    bitmap: 0/8 pages [0KB], 65536KB chunk

    The (F) next to sdb1 indicates Failed.
  6. Now remove the disk with mdadm:
    # mdadm --manage /dev/md0 --remove /dev/sdb1
  7. Now confirm with the cat command as before:
    # cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sdc1[1]
    976630464 blocks super 1.2 [2/1] [_U]
    bitmap: 0/8 pages [0KB], 65536KB chunk

    Notice that sdb1 is now gone.
  8. You can also confirm this with the lsblk command:
    # lsblk
    NAME                  MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
    sdb                     8:16   0 931.5G  0 disk
    └─sdb1                  8:17   0 931.5G  0 part
    sdc                     8:32   0 931.5G  0 disk
    └─sdc1                  8:33   0 931.5G  0 part
      └─md0                 9:0    0 931.4G  0 raid1
        └─vg_raid-lv_raid 253:4    0 931.4G  0 lvm   /mnt/Raid
  9. You can now shut down the server and replace that hard drive.
    It is easy to find the correct hard drive with the serial number you got from the lshw command you ran earlier. The serial number is: WD-WCATR1933480
  10. Power on server.
  11. Here is where I ran into an issue that left me scratching my head for quite some time. I’m documenting it here so if it happens again I can resolve it quickly.
    It turned out that the spare drive I had on hand was not new, as I had thought. It had previously been installed in another system that has since been retired, and it still had a boot partition on it. When I booted the server, that partition booted instead of my regular boot partition. I even had to recover passwords on it because the user and root passwords were not the same, and all along I was thinking something had happened to bork the users. The lesson learned is to make sure any partitions have been removed from a drive before you put it in. I did this by putting the drive in another system and using fdisk to remove the partitions. Now when I boot the server, the normal boot partition boots and the new drive is designated as sdb, as I expect.
  12. Now you can copy the partition information from the good disk (/dev/sdc) to the new disk (/dev/sdb). Be warned that this will destroy any partition information on the new disk. Since I already destroyed any partition information in the previous step I’m good with this. The command looks like this:
    # sfdisk -d /dev/sdc | sfdisk /dev/sdb
  13. You can check the partition info is correct with the lsblk command:
    # lsblk
    NAME                  MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
    sdb                     8:16   0 931.5G  0 disk
    └─sdb1                  8:17   0 931.5G  0 part
    sdc                     8:32   0 931.5G  0 disk
    └─sdc1                  8:33   0 931.5G  0 part
      └─md0                 9:0    0 931.4G  0 raid1
        └─vg_raid-lv_raid 253:2    0 931.4G  0 lvm   /mnt/Raid
  14. Now you can reverse the process and create the mirror that you previously had like this:
    # mdadm --manage /dev/md0 --add /dev/sdb1
  15. Now you can verify the status of your raid like this:
    # mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Tue Jun 27 17:49:31 2017
        Raid Level : raid1
        Array Size : 976630464 (931.39 GiB 1000.07 GB)
     Used Dev Size : 976630464 (931.39 GiB 1000.07 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Sat Sep 24 14:46:35 2022
             State : clean, degraded, recovering 
    Active Devices : 1
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 1

Consistency Policy : bitmap

    Rebuild Status : 1% complete

          Name : Serenity.localdomain:0  (local to host Serenity.localdomain)
          UUID : f06aeaae:e0c9707b:6d982f07:3f320578
        Events : 114297

Number   Major   Minor   RaidDevice State
   2       8       17        0      spare rebuilding   /dev/sdb1
   1       8       33        1      active sync   /dev/sdc1
  • You can see that the Rebuild Status is at 1% and that the array is in a rebuilding state.
  • You can get the status of the rebuild like so:
    # cat /proc/mdstat
Personalities : [raid1] 
md0 : active raid1 sdb1[2] sdc1[1]
      976630464 blocks super 1.2 [2/1] [_U]
      [>....................]  recovery =  0.7% (7077312/976630464) finish=129.7min speed=124486K/sec
      bitmap: 8/8 pages [32KB], 65536KB chunk

You can run this command under watch (for example, watch cat /proc/mdstat) if you want to follow the rebuild as it progresses.
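
The two things I kept reading out of /proc/mdstat by eye, which member is flagged (F) and how far along the rebuild is, can also be pulled out with awk. This is a sketch that assumes the line layout shown in the listings above; mdstat_failed and mdstat_progress are hypothetical helper names.

```shell
#!/bin/sh
# mdstat_failed: read /proc/mdstat text on stdin and print "array: member"
# for every member flagged (F).
mdstat_failed() {
  awk '
    $2 == ":" {
      for (i = 3; i <= NF; i++)
        if ($i ~ /\(F\)$/) {
          member = $i
          sub(/\[[0-9]+\].*$/, "", member)   # "sdb1[0](F)" -> "sdb1"
          print $1 ": " member
        }
    }
  '
}

# mdstat_progress: print "array: NN.N%" while a recovery or resync is running.
mdstat_progress() {
  awk '
    $2 == ":" { array = $1 }
    /(recovery|resync) *=/ {
      for (i = 1; i <= NF; i++)
        if ($i ~ /%$/) print array ": " $i
    }
  '
}
```

Both read standard input, so mdstat_failed < /proc/mdstat and mdstat_progress < /proc/mdstat work without root.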

There’s something missing here. It probably relates to this:

CentOS 7 created mdadm array disappears after reboot

Linux Convert Command

This command requires that the imagemagick package be installed.

sudo apt install imagemagick

To combine two single page pdf files into one multi-page pdf:

convert file1.pdf file2.pdf merged.pdf
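
A small wrapper can check the input files before handing them to convert. This is only a sketch: merge_pdfs is a hypothetical helper name, and IM_CMD is an assumed override variable, included because ImageMagick 7 installs typically provide the tool as magick rather than convert.

```shell
#!/bin/sh
# merge_pdfs OUTPUT INPUT1 INPUT2 [INPUT...]
# The body runs in a subshell, so "exit" only leaves the function.
merge_pdfs() (
  if [ "$#" -lt 3 ]; then
    echo "usage: merge_pdfs OUTPUT INPUT1 INPUT2 [INPUT...]" >&2
    exit 2
  fi
  out=$1
  shift
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "merge_pdfs: cannot read $f" >&2
      exit 1
    fi
  done
  # Inputs first, output last, exactly as in the one-line example.
  "${IM_CMD:-convert}" "$@" "$out"
)
```

For example, merge_pdfs merged.pdf file1.pdf file2.pdf does the same thing as the command above, but stops early if one of the inputs is missing.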

 

More to come

IPTables

These are my notes on IP tables. Maybe at some point I may do a complete tutorial or How To but don’t hold your breath.

Chains:

There are typically 3 chains in a standard setup. They are:

INPUT
FORWARD
OUTPUT

INPUT is for packets coming into the server.
FORWARD is for packets that are forwarded by the server.
OUTPUT is for packets that are leaving the server.

Policies:

There is a default policy (rule) set up for each chain. CentOS comes standard with a default policy of ACCEPT for each of these chains. The command line to set the policy looks like this:

# iptables -P INPUT ACCEPT

Pretty simple really. The -P is the flag to set the policy.

Flush:

The flag that flushes iptables is -F. With no chain named, it deletes every rule in every chain of the table.

# iptables -F

If you are changing rules from a remote host, the tutorial says to first set the INPUT policy to ACCEPT, especially if you are going to flush the table. The flush command removes everything except the default policies, so if INPUT is set to ACCEPT you won’t lock yourself out. Be sure to undo the ACCEPT on the INPUT policy afterwards, or the server will basically be wide open unless you have some other rule locking it down.

So the first two commands, accept on the input policy and flush on the table leave you with a pretty much blank rule set.
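
Those two commands can be wrapped so they always run in the safe order. A sketch only: flush_safely is a hypothetical helper name, and the IPTABLES variable is an assumed override that lets you dry-run the sequence (for example with IPTABLES=echo) before touching the real firewall.

```shell
#!/bin/sh
# By default the real iptables binary is used, which needs root.
IPTABLES="${IPTABLES:-iptables}"

# flush_safely: open the INPUT policy first, and flush only if that worked,
# so a remote session is not cut off by a leftover DROP policy.
flush_safely() {
  "$IPTABLES" -P INPUT ACCEPT && "$IPTABLES" -F
}
```

Remember that this leaves INPUT wide open until you set the policy back or add rules again.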

Saving:

It’s important to understand that the commands take effect immediately so if you do a wrong command you can lock yourself out. However they are not permanently stored until you save them with this:

# service iptables save

If you did lock yourself out then theoretically you could reboot the server before saving and it would revert back to whatever it was on the last save. I have not tested this yet but that is what I understand.
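
Rather than relying on a reboot, the running rules can be dumped to a file before experimenting and loaded back afterwards. This sketch uses iptables-save and iptables-restore, which ship with iptables; backup_rules is a hypothetical helper name, IPTABLES_SAVE is an assumed override for dry runs, and /root is just an assumed default location.

```shell
#!/bin/sh
# backup_rules [DIR]: dump the live ruleset to a timestamped file in DIR
# and print the path of that file. The real command needs root.
backup_rules() {
  dir=${1:-/root}
  file="$dir/iptables-$(date +%Y%m%d-%H%M%S).rules"
  "${IPTABLES_SAVE:-iptables-save}" > "$file" && printf '%s\n' "$file"
}
```

To go back to a snapshot, feed the file to iptables-restore, as in iptables-restore < /root/iptables-20220924-144635.rules (the file name here is only an illustration).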

Showing:

You need to be able to see the results of your commands so you can show your tables like this:

# iptables -L

This leaves a bit to be desired as it shows everything and may be too much information. This will show all Chains. If you just want to list one of the Chains you can do it like this:

# iptables -L [CHAIN]

For example:

# iptables -L INPUT

Even more useful is to list with Line Numbers. This is helpful if you want to insert a rule after a certain existing rule. That command looks like this:

# iptables -L --line-numbers

or

# iptables -L INPUT --line-numbers

Even better is using the -v or the verbose flag. That’s probably the best.

# iptables -L INPUT -v --line-numbers

Adding Rules:

Rules are added to or deleted from a chain with the -A or -D flag. The -A flag appends a rule and the -D flag deletes one. For example:

# iptables -A INPUT -i lo -j ACCEPT

This will allow everything to reach the lo interface. This is a good idea as programs running on the server interact with the lo interface.

By the way, the -i flag specifies an interface. The -j flag is the jump flag.  In the above example if something comes in (INPUT) on the lo interface, then jump to ACCEPT.

It is also important to note that rules are put into the chain in the order they are typed in using the -A (append) command, and iptables acts on the first rule that matches a packet. You need to be sure you do not give permission to something early in the chain and then try to take it away further down. It is also a good idea to set the policies for INPUT and FORWARD to DROP and then specifically set up the exceptions to this with rules.
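
One side effect of -A is that running the same command twice appends the rule twice. A possible way around that is the -C (check) flag, which reports whether an identical rule already exists. In this sketch ensure_rule is a hypothetical helper name and IPTABLES is an assumed override for dry runs.

```shell
#!/bin/sh
IPTABLES="${IPTABLES:-iptables}"   # real binary by default (needs root)

# ensure_rule CHAIN RULE-SPEC...: append the rule only when -C says it is
# not already present in the chain.
ensure_rule() {
  chain=$1
  shift
  "$IPTABLES" -C "$chain" "$@" 2>/dev/null || "$IPTABLES" -A "$chain" "$@"
}
```

For example, ensure_rule INPUT -i lo -j ACCEPT can be run repeatedly and still leaves a single loopback rule in the chain.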

Deleting Rules:

The -D flag can be used with line numbers and is useful to delete specific lines in your IPTables config. The Delete command is done like this:

# iptables -D INPUT 4

In other words, deleting from the INPUT chain rule number 4.

Inserting Rules:

Rules need to be in a certain order or you could cause problems. You can insert a rule at a specific position in a chain with this:

# iptables -I INPUT 3 -p tcp --dport 23 -j ACCEPT

This inserts at line 3 the rule to accept telnet in the INPUT chain.
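
Counting lines by eye gets tedious, so here is a sketch that looks up the number of an existing rule from the --line-numbers listing. rule_line is a hypothetical helper name, and the match is a plain substring search on the listing text, so pick a pattern that is specific enough.

```shell
#!/bin/sh
# rule_line PATTERN: read `iptables -L CHAIN -n --line-numbers` output on
# stdin and print the number of the first rule whose text contains PATTERN.
rule_line() {
  awk -v pat="$1" '
    # Rule rows start with a number; header rows do not.
    $1 ~ /^[0-9]+$/ && index($0, pat) { print $1; exit }
  '
}
```

For example, n=$(iptables -L INPUT -n --line-numbers | rule_line "dpt:22") stores the position of the ssh rule, and iptables -I INPUT "$n" followed by a rule specification then places the new rule directly ahead of it.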

IP Addresses:

You can also specify IP addresses in a rule. For example if I wanted to specify that I wanted to accept connections coming from a certain source IP address I would use something like this:

# iptables -A INPUT -s 192.168.0.4 -j ACCEPT

The -s means source IP.

You can also specify entire networks like this:

# iptables -A INPUT -s 192.168.0.0/24 -j ACCEPT

Comments:

Comments can also be added. This is useful if you are putting the lines into a script or something. Everything after the “#” is ignored.

# iptables -A INPUT -s 192.168.0.0/24 -j ACCEPT  # using standard slash notation

Mac Addresses:

You can also filter by mac addresses in a rule. Something like this:

# iptables -A INPUT -m mac --mac-source 00:26:B9:D1:D9:6B -j ACCEPT

Anything from the specified source mac address will be accepted by the above rule.

You can also add an IP address as well as a mac address for further filtering:

# iptables -A INPUT -s 192.168.0.4 -m mac --mac-source 00:50:8D:FD:E6:32 -j ACCEPT

The above will append the rule to the end of the chain. It is probably better to insert it somewhere like this:

# iptables -I INPUT 75 -s 192.168.0.4 -m mac --mac-source 00:50:8D:FD:E6:32 -j ACCEPT

Protocols & Ports:

To further refine you really need protocols and ports defined in a lot of the rules. Going back to this example:

# iptables -I INPUT 3 -p tcp --dport 23 -j ACCEPT

The -p means protocol, in this case TCP and the --dport means destination port, in this case port 23 or telnet.

To get more granular with the rules you will want to put the matches together. For example, I want to accept on the INPUT chain connections coming from 192.168.0.0/24 to port 23 (telnet). The rule would look like this:

# iptables -A INPUT -s 192.168.0.0/24 -p tcp --dport 23 -j ACCEPT

Drop & Reject:

The default configuration of IP Tables for CentOS is to have the INPUT, FORWARD & OUTPUT policies set to ACCEPT. This is probably not a good idea. Depending on your security posture it might be OK for your OUTPUT policy, unless you want to limit what goes out of your system. The way you solve this issue is to add some REJECT rules to the end of your configs. A REJECT rule looks something like this:

# iptables -A INPUT -j REJECT --reject-with icmp-host-prohibited
# iptables -A FORWARD -j REJECT --reject-with icmp-host-prohibited

A REJECT sends an error back to the sender (in the rules above, an ICMP host-prohibited message), whereas a DROP silently discards the packet. Depending on your security posture a DROP might be a better decision.
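
To make the difference concrete, here are three ways of turning away telnet (port 23, the same port used in the earlier examples). This is a sketch: refuse_telnet is a hypothetical helper name and IPTABLES is an assumed override for dry runs. The icmp-port-unreachable and tcp-reset values are standard --reject-with types; tcp-reset only applies to TCP rules.

```shell
#!/bin/sh
IPTABLES="${IPTABLES:-iptables}"   # real binary by default (needs root)

# refuse_telnet drop|icmp|tcp-reset
refuse_telnet() {
  case "$1" in
    drop)      # discard silently; the client waits until it times out
      "$IPTABLES" -A INPUT -p tcp --dport 23 -j DROP ;;
    icmp)      # answer with an ICMP port-unreachable error
      "$IPTABLES" -A INPUT -p tcp --dport 23 -j REJECT --reject-with icmp-port-unreachable ;;
    tcp-reset) # answer with a TCP RST, like a closed port would
      "$IPTABLES" -A INPUT -p tcp --dport 23 -j REJECT --reject-with tcp-reset ;;
    *)
      echo "usage: refuse_telnet drop|icmp|tcp-reset" >&2
      return 2 ;;
  esac
}
```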

States:

The State Match

The most useful match criterion is supplied by the ‘state’ extension, which interprets the connection-tracking analysis of the ‘ip_conntrack’ module. This is highly recommended.

Specifying ‘-m state’ allows an additional ‘--state’ option, which is a comma-separated list of states to match (the ‘!’ flag indicates not to match those states). These states are:

NEW: A packet which creates a new connection.
ESTABLISHED: A packet which belongs to an existing connection (i.e., a reply packet, or outgoing packet on a connection which has seen replies).
RELATED: A packet which is related to, but not part of, an existing connection, such as an ICMP error, or (with the FTP module inserted), a packet establishing an ftp data connection.
INVALID: A packet which could not be identified for some reason: this includes running out of memory and ICMP errors which don’t correspond to any known connection. Generally these packets should be dropped.

An example of this powerful match extension would be:

# iptables -A FORWARD -i ppp0 -m state ! --state NEW -j DROP

The default iptables configuration has the ESTABLISHED and RELATED states set to ACCEPT on the INPUT chain like this:

# iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

Probably a good idea to keep that and put it first in the list. This will keep any connections going that have already been established. For example if you have an ssh connection going and you change the iptables for SSH and lock yourself out the connection will remain until you close the connection. This has saved me at least once but you could potentially keep bad connections going as well so use with care.

I have also seen that it is a good idea to use states with regular rules like this:

# iptables -A INPUT -s 192.168.0.0/24 -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT

I don’t completely understand it myself. Things should work without it. I’ve just seen posts about it and think it is useful to mention here.

Another good thing to use might be the INVALID state. INVALID means a packet that could not be identified somehow. Just drop them:

# iptables -A INPUT -m state --state INVALID -j DROP
# iptables -A FORWARD -m state --state INVALID -j DROP
# iptables -A OUTPUT -m state --state INVALID -j DROP
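
The iptables-extensions documentation describes the state match as a subset of the conntrack match, so the same rules can also be written with -m conntrack --ctstate. The sketch below shows the INVALID and ESTABLISHED,RELATED rules in that form. It is not what I run in my own notes above; stateful_baseline is a hypothetical helper name and IPTABLES is an assumed override for dry runs.

```shell
#!/bin/sh
IPTABLES="${IPTABLES:-iptables}"   # real binary by default (needs root)

# stateful_baseline: drop unidentifiable packets, then let traffic for
# connections that are already established (or related to one) through.
stateful_baseline() {
  "$IPTABLES" -A INPUT -m conntrack --ctstate INVALID -j DROP &&
  "$IPTABLES" -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
}
```

With these two rules near the top of the chain, later rules that match NEW only have to decide whether a connection may be opened; the packets that follow are handled by the ESTABLISHED,RELATED rule.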

Script:

Here’s a little script that sets up pretty much what I talked about on this page:

#!/bin/bash

# iptables example configuration script
# Flush all current rules from iptables
 iptables -F

# Set default policies for INPUT, FORWARD and OUTPUT chains
 iptables -P INPUT DROP
 iptables -P FORWARD DROP
 iptables -P OUTPUT ACCEPT

# Drop invalid packets
 iptables -A INPUT -m state --state INVALID -j DROP
 iptables -A FORWARD -m state --state INVALID -j DROP
 iptables -A OUTPUT -m state --state INVALID -j DROP

# Accept packets belonging to established and related connections
 iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow private lan to ping
 iptables -A INPUT -s 192.168.0.0/24 -p icmp -j ACCEPT

# Set access for localhost
 iptables -A INPUT -i lo -j ACCEPT

# Allow private lan to ssh
 iptables -A INPUT -s 192.168.0.0/24 -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT

# Save settings
 /sbin/service iptables save

# List rules
 iptables -L -v --line-numbers