Squoggle

Mac's tech blog

Monthly Archives: September 2022

CentOS Drive Testing

My server was making uncharacteristic noises. This is how I tested my hard drives for failure.

  1. Install smartmontools:
    # yum install smartmontools
  2. Get a listing of all your hard drives:
    # lsblk
  3. Run a test on one of the hard drives:
    # smartctl -t short /dev/sda
    You will see something similar to the following:
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-693.11.6.el7.x86_64] (local build)
    Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
    Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
    Testing has begun.
    Please wait 2 minutes for test to complete.
    Test will complete after Fri Sep 23 13:02:21 2022
    Use smartctl -X to abort test.
  4. It will give you a time when the test will be complete. When the time has elapsed, come back and check the drive's overall health like this (the log of the self-test itself can be listed with smartctl -l selftest /dev/sda):
    # smartctl -H /dev/sda
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-693.11.6.el7.x86_64] (local build)
    Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
  5. If the test fails you will see something like this:
    # smartctl -H /dev/sdb
    smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-693.11.6.el7.x86_64] (local build)
    Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: FAILED!
    Drive failure expected in less than 24 hours. SAVE ALL DATA.
    Failed Attributes:
    ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    5 Reallocated_Sector_Ct 0x0033 063 063 140 Pre-fail Always FAILING_NOW 1089
  6. Looks like you need to replace /dev/sdb
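The PASSED/FAILED check in steps 4 and 5 could also be scripted. What follows is a minimal Python sketch, not part of the procedure above. The function name parse_smart_health is made up for illustration, and it only reads text in the ATA output format shown above rather than calling smartctl itself.

```python
import re

def parse_smart_health(output: str) -> str:
    """Return 'PASSED', 'FAILED', or 'UNKNOWN' from `smartctl -H` text."""
    # Assumes the ATA-style result line shown in the transcripts above.
    match = re.search(
        r"overall-health self-assessment test result:\s*(\w+)", output
    )
    if match is None:
        return "UNKNOWN"
    return match.group(1).upper()

sample_ok = (
    "=== START OF READ SMART DATA SECTION ===\n"
    "SMART overall-health self-assessment test result: PASSED\n"
)
sample_bad = (
    "=== START OF READ SMART DATA SECTION ===\n"
    "SMART overall-health self-assessment test result: FAILED!\n"
    "Drive failure expected in less than 24 hours. SAVE ALL DATA.\n"
)

print(parse_smart_health(sample_ok))   # PASSED
print(parse_smart_health(sample_bad))  # FAILED
```

To use it on a real drive you would feed it the text that smartctl -H prints, for example captured with subprocess.run.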

How to Replace the Hard drive

This is what I did to replace the hard drive.

  1. Install lshw package:
    # yum install lshw
  2. Now list hardware of type disk:
    # lshw -class disk
    You should get way too much info.
  3. Filter the info with grep like so:
    # lshw -class disk | grep -A 5 -B 6 /dev/sdb
    You should now only get the one drive you are looking for.
    Mine looks like this:
    # lshw -class disk | grep -A 5 -B 6 /dev/sdb
    *-disk:1
    description: ATA Disk
    product: WDC WD1002FAEX-0
    vendor: Western Digital
    physical id: 1
    bus info: scsi@5:0.0.0
    logical name: /dev/sdb
    version: 1D05
    serial: WD-WCATR1933480
    size: 931GiB (1TB)
    capabilities: partitioned partitioned:dos
    configuration: ansiversion=5 logicalsectorsize=512 sectorsize=512 signature=000cd438

So it looks like I need to replace a 1TB Western Digital. Fortunately this disk is in a two disk raid array.
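If you would rather pull the serial number out with a script than with grep, here is a rough Python sketch. The function name disk_fields is hypothetical, and it assumes the "key: value" layout of the lshw output shown above.

```python
def disk_fields(lshw_block: str) -> dict:
    """Turn the 'key: value' lines of one lshw disk entry into a dict."""
    fields = {}
    for line in lshw_block.splitlines():
        # Lines without ': ' (such as the '*-disk:1' header) are skipped.
        key, sep, value = line.strip().partition(": ")
        if sep:
            fields[key] = value
    return fields

sample = """\
*-disk:1
description: ATA Disk
product: WDC WD1002FAEX-0
vendor: Western Digital
logical name: /dev/sdb
serial: WD-WCATR1933480
size: 931GiB (1TB)
"""

info = disk_fields(sample)
print(info["logical name"], info["serial"])  # /dev/sdb WD-WCATR1933480
```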

Remove the HD from the Raid Array

This is what I did to remove the HD from the Raid Array. Before proceeding, back up everything. I do a daily offsite backup, so I am covered in theory.

  1. Redo the lsblk command from above to confirm which disk is which:
    # lsblk
    NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    sdb 8:16 0 931.5G 0 disk
    └─sdb1 8:17 0 931.5G 0 part
    └─md0 9:0 0 931.4G 0 raid1
    └─vg_raid-lv_raid 253:4 0 931.4G 0 lvm /mnt/Raid
    sdc 8:32 0 931.5G 0 disk
    └─sdc1 8:33 0 931.5G 0 part
    └─md0 9:0 0 931.4G 0 raid1
    └─vg_raid-lv_raid 253:4 0 931.4G 0 lvm /mnt/Raid
  2. Remember that the defective disk in this case is /dev/sdb and the good one is /dev/sdc
  3. Write all cache to disk:
    # sync
  4. Set the disk as failed with mdadm:
    # mdadm --manage /dev/md0 --fail /dev/sdb1
    This is the failed partition from /dev/sdb.
    You should see something like this:
    mdadm: set /dev/sdb1 faulty in /dev/md0
  5. Confirm it has been marked as failed:
    # cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sdc1[1] sdb1[0](F)
    976630464 blocks super 1.2 [2/1] [_U]
    bitmap: 0/8 pages [0KB], 65536KB chunk

    The (F) next to sdb1 indicates Failed.
  6. Now remove the disk with mdadm:
    # mdadm --manage /dev/md0 --remove /dev/sdb1
  7. Now confirm with the cat command as before:
    # cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sdc1[1]
    976630464 blocks super 1.2 [2/1] [_U]
    bitmap: 0/8 pages [0KB], 65536KB chunk

    Notice that sdb1 is now gone.
  8. You can also confirm this with the lsblk command:
    # lsblk
    NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    sdb 8:16 0 931.5G 0 disk
    └─sdb1 8:17 0 931.5G 0 part
    sdc 8:32 0 931.5G 0 disk
    └─sdc1 8:33 0 931.5G 0 part
    └─md0 9:0 0 931.4G 0 raid1
    └─vg_raid-lv_raid 253:4 0 931.4G 0 lvm /mnt/Raid
  9. You can now shut down the server and replace that hard drive.
    It is easy to find the correct hard drive with the serial number you got from the lshw command you ran earlier. The serial number is: WD-WCATR1933480
  10. Power on server.
  11. Here is where I ran into an issue that left me scratching my head for quite some time. I’m documenting it here so if it happens again I can resolve it quickly.
    It turns out that the spare drive I had on hand, which I thought was new, was not. It had been installed in another system that was since retired, and it still had a boot partition on it. When I booted the server, that partition booted instead of my regular boot partition. I even had to recover passwords on it because the user and root passwords were not the same, and all along I was thinking something had happened to bork the users. Lesson learned: make sure the drive you put in has had any partitions removed. I did this by putting the drive in another system and using fdisk to remove the partitions. Now when I boot the server the normal boot partition boots and the new drive is designated as sdb, as I expect.
  12. Now you can copy the partition information from the good disk (/dev/sdc) to the new disk (/dev/sdb). Be warned that this will destroy any partition information on the new disk. Since I already destroyed any partition information in the previous step I’m good with this. The command looks like this:
    # sfdisk -d /dev/sdc | sfdisk /dev/sdb
  13. You can check the partition info is correct with the lsblk command:
    # lsblk
    NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    sdb 8:16 0 931.5G 0 disk
    └─sdb1 8:17 0 931.5G 0 part
    sdc 8:32 0 931.5G 0 disk
    └─sdc1 8:33 0 931.5G 0 part
    └─md0 9:0 0 931.4G 0 raid1
    └─vg_raid-lv_raid 253:2 0 931.4G 0 lvm /mnt/Raid
  14. Now you can reverse the process and create the mirror that you previously had like this:
    # mdadm --manage /dev/md0 --add /dev/sdb1
  15. Now you can verify the status of your raid like this:
    # mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Tue Jun 27 17:49:31 2017
        Raid Level : raid1
        Array Size : 976630464 (931.39 GiB 1000.07 GB)
     Used Dev Size : 976630464 (931.39 GiB 1000.07 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Sat Sep 24 14:46:35 2022
             State : clean, degraded, recovering 
    Active Devices : 1
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 1

Consistency Policy : bitmap

    Rebuild Status : 1% complete

          Name : Serenity.localdomain:0  (local to host Serenity.localdomain)
          UUID : f06aeaae:e0c9707b:6d982f07:3f320578
        Events : 114297

Number   Major   Minor   RaidDevice State
   2       8       17        0      spare rebuilding   /dev/sdb1
   1       8       33        1      active sync   /dev/sdc1
  • You can see that the Rebuild Status is at 1% and that the array is in a rebuilding state.
  • You can get the status of the rebuild like so:
    # cat /proc/mdstat
Personalities : [raid1] 
md0 : active raid1 sdb1[2] sdc1[1]
      976630464 blocks super 1.2 [2/1] [_U]
      [>....................]  recovery =  0.7% (7077312/976630464) finish=129.7min speed=124486K/sec
      bitmap: 8/8 pages [32KB], 65536KB chunk

You can run this under the watch command (for example, watch cat /proc/mdstat) if you want to follow the rebuild as it progresses.
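The two things read by eye from /proc/mdstat above, the (F) marker and the recovery percentage, could be picked out with a short Python sketch like the one below. The function name mdstat_summary is hypothetical, and the regular expressions assume the mdstat layout shown in this post.

```python
import re

def mdstat_summary(text: str) -> dict:
    """Extract failed members and recovery progress from /proc/mdstat text."""
    # A failed member looks like 'sdb1[0](F)'.
    failed = re.findall(r"(\w+)\[\d+\]\(F\)", text)
    # A rebuild in progress has a line like 'recovery =  0.7% (...)'.
    recovery = re.search(r"recovery\s*=\s*([\d.]+)%", text)
    return {
        "failed": failed,
        "recovery_percent": float(recovery.group(1)) if recovery else None,
    }

failed_sample = (
    "md0 : active raid1 sdc1[1] sdb1[0](F)\n"
    "      976630464 blocks super 1.2 [2/1] [_U]\n"
)
rebuild_sample = (
    "md0 : active raid1 sdb1[2] sdc1[1]\n"
    "      976630464 blocks super 1.2 [2/1] [_U]\n"
    "      [>....................]  recovery =  0.7% (7077312/976630464) "
    "finish=129.7min speed=124486K/sec\n"
)

print(mdstat_summary(failed_sample))
print(mdstat_summary(rebuild_sample))
```

On the server itself you would pass in the contents of /proc/mdstat instead of the sample strings.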

There’s something missing here. It probably relates to this:

CentOS 7 created mdadm array disappears after reboot

My Python Notes

Virtual Environments for Python

Python3 is already installed on my Linux distro.

Install Virtual Environments

You will want to use Virtual Environments for Python. Install Virtualenv:

$ pip install virtualenv

Create a Virtual Environment

Create a base directory where you want your Virtual Environments to live:

$ cd ~
$ mkdir -p Python/VirtualEnv

Now create a Virtual Environment within that directory structure. I’m going to use P3VirtEnv to signify it is a Python 3 Virtual Environment.

$ cd ~/Python/VirtualEnv
$ virtualenv P3VirtEnv

You should see some messages that essentially mean that the Virtual Environment was successfully created. You should see a new directory named ~/Python/VirtualEnv/P3VirtEnv.

Now you need to enter the Virtual Environment and set it active:

$ cd P3VirtEnv/
$ source bin/activate

Your prompt will now show what environment you are logged into. At this point it looks like this:

(P3VirtEnv) mac@Gob:~/Python/VirtualEnv/P3VirtEnv$

This process essentially gives the Virtual Environment its own Python executable and its own place for packages, separate from the standard location.
You can test this by seeing where Python is installed:

$ which python
/home/mac/Python/VirtualEnv/P3VirtEnv/bin/python

Exit the Virtual Environment by simply typing the word deactivate.

$ deactivate

Your prompt and path are now returned to normal.
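You can also ask Python itself whether it is running inside a Virtual Environment. This is a small sketch, assuming a reasonably recent virtualenv (version 20 or later) or the built-in venv module. The function name in_virtual_environment is made up for illustration.

```python
import sys

def in_virtual_environment() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    # Inside an environment sys.prefix points at the environment, while
    # sys.base_prefix still points at the underlying Python installation.
    return sys.prefix != sys.base_prefix

print("interpreter:", sys.executable)
print("in virtual environment:", in_virtual_environment())
```

Run it once with the environment activated and once after deactivate to see the difference.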


Packages

Python has packages that can be installed.

To see what packages have been installed do:

$ pip list

You should see a list of all the packages that have been installed. Mine looks something like this at this point:

Package Version
---------- -------
pip        22.2.2
setuptools 65.3.0
wheel      0.37.1

To see the installed packages in requirements format (with this version of pip, packages such as pip, setuptools and wheel are left out by default), do:

$ pip freeze

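The virtualenv package is not listed, most likely because it was installed into the system Python outside this Virtual Environment, and pip inside the environment only reports the environment's own packages.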

Now is a good time to check that out. Install Selenium like this:

$ pip install selenium

A whole bunch of stuff gets installed. Now you can check pip freeze again and see all of what got installed.

Another library that we will need for making API calls is called Requests. Install it like this:

$ pip install requests

Then do the freeze thing to verify.
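The same kind of listing can be produced from inside Python with the standard library. This is a sketch only, assuming Python 3.8 or later. The function name installed_packages is hypothetical, and unlike pip freeze it does not leave anything out.

```python
from importlib import metadata

def installed_packages() -> list:
    """Return sorted 'name==version' strings for the current environment."""
    entries = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip any distribution with broken metadata
            entries.add(f"{name}=={dist.version}")
    return sorted(entries, key=str.lower)

for line in installed_packages():
    print(line)
```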

Install PyCharm

Instructions for downloading and installing PyCharm are located here:

https://www.jetbrains.com/help/pycharm/installation-guide.html

Review Section 2, Chapter 10 in the Class

Variables

When you define a string as a variable you can use single quotes or double quotes to enclose the string. Numbers do not need to be put in quotes. If you put a number in quotes it becomes a string and not a number.

Variable names cannot start with a number, only with a letter or an underscore. After the first character, a name may contain letters, numbers and underscores.
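A short sketch to try these rules in the interpreter. The variable names are just examples.

```python
count = 42          # a number: no quotes
label = "42"        # quotes make this a string, not a number
name = 'Mac'        # single quotes work the same as double quotes

print(type(count).__name__)   # int
print(type(label).__name__)   # str
print(count + 1)              # 43
print(label + "1")            # 421 (the strings are joined, not added)

# str.isidentifier() reports whether some text is a legal variable name.
print("disk_2".isidentifier())   # True
print("_disk".isidentifier())    # True
print("2disk".isidentifier())    # False: starts with a number
```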

Data Types

Integers

Integers are numbers without decimals. They can be positive, negative or zero.

You can do math on integers such as Addition, subtraction, multiplication and division as well as other operations.

If you have multiple operations in a single line you should use parentheses to make the order of evaluation explicit.
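Here is a quick sketch of integer math and of what parentheses change. The numbers are arbitrary.

```python
print(7 + 2)     # 9
print(7 - 2)     # 5
print(7 * 2)     # 14
print(7 / 2)     # 3.5  (the / operator always gives a float)
print(7 // 2)    # 3    (floor division keeps an integer)
print(7 % 2)     # 1    (remainder)
print(7 ** 2)    # 49   (exponent)

print(2 + 3 * 4)     # 14: multiplication happens first
print((2 + 3) * 4)   # 20: parentheses change the order
```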

Float

Floats (Floating Point Numbers) are numbers with decimal places.

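Arithmetic operations on floating point numbers will result in floating point numbers.

In Python, comparing 1 with 1.0 using == gives True because the two values are equal, even though the first is an integer and the second is a float. What differs is the type: type(1) is int and type(1.0) is float.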
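A small sketch for checking how floats behave in the interpreter, including how an integer and a float compare:

```python
total = 1.5 + 2
print(total)                  # 3.5
print(type(total).__name__)   # float
print(2.0 * 3)                # 6.0

print(1 == 1.0)               # True: the values are equal
print(type(1) == type(1.0))   # False: int versus float
```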

Strings

Strings are a sequence of single characters and are represented with quotes. Quotes can be single or double.

You can use one type of quote within the other type, but not within the same type unless you escape it with a backslash.

For example, this is valid:

"Bring me your 'stash' of contraband"

But this is not valid:

"Bring me your "stash" of contraband"

Slicing

A slice is a subset of a string. An index number is the position of a character, counted from either the left or the right end of the string, and gives you that specific character. See chapter 16 in the course.

If counting left to right you start at 0. If counting right to left you start with -1.

Syntax for slicing is:

variable_name[start index : finish index]

The slice does not include the finish index.

Example:

my_string[0]

Would give the first character of the string.

my_string[0:]

Would give the entire string.

my_string[9:12]

Would give the characters at indexes 9 through 11. Remember that counting starts at 0 and not 1.
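To make the indexing concrete, here is a short sketch using the quoted string from the Strings section as my_string.

```python
my_string = "Bring me your 'stash' of contraband"

print(my_string[0])      # B    (first character)
print(my_string[-1])     # d    (last character, counting from the right)
print(my_string[9:12])   # you  (indexes 9, 10 and 11; 12 is not included)
print(my_string[:5])     # Bring
print(my_string[0:] == my_string)   # True: the entire string
```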

Type command

You can determine what data type a variable is with the built-in type() function.

Let's assume you have a variable named var. You can find out what type of data it holds like this:

type(var)

And the class of data will be returned to you.
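For instance, a sketch with a few example values (the value given to var here is arbitrary):

```python
var = "Serenity"

print(type(var))      # <class 'str'>
print(type(42))       # <class 'int'>
print(type(931.5))    # <class 'float'>

# isinstance() is a related check that answers yes or no.
print(isinstance(var, str))   # True
```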

String Methods

https://docs.python.org/3/library/stdtypes.html#string-methods