Wednesday, November 8, 2017

Overview of RAID

What is RAID

  1. RAID is an acronym for Redundant Array of Independent (or Inexpensive) Disks.
  2. RAID combines several independent and relatively small disks into a single large storage unit.
  3. The disks included in the array are called array members.
  4. The disks can be combined into the array in different ways, which are known as RAID levels.

WHAT ARE THE ADVANTAGES / GOALS OF RAID

  1. Increase data reliability
  2. Increase I/O performance 

CHARACTERISTICS OF RAID LEVELS 


FAULT TOLERANCE
  • The ability to survive one or several disk failures.
PERFORMANCE
  • The read and write speed of the entire array differs from that of a single disk; depending on the RAID level it can be higher or lower.
CAPACITY
  • Compared with a single drive, RAID increases the amount of user data that can be stored.
  • The array capacity depends on the RAID level and does not always equal the sum of the capacities of the individual members.

         

        HOW RAID IS ORGANIZED


        RAID is organized along two aspects:

        ORGANIZATION OF DATA
          • How the data is organized in the array, using RAID storage techniques such as striping, mirroring, parity, or a combination of them

        IMPLEMENTATION OF EACH PARTICULAR RAID INSTALLATION
          • Whether it is hardware RAID or software RAID
                  

                RAID STORAGE TECHNIQUES 

                The main methods of storing data in the array are striping, mirroring and parity.
                  

                STRIPING
                1. Striping means splitting the data into blocks of a certain size and writing these blocks across the RAID members one by one.
                2. This way of storing data affects performance.
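
The block-by-block distribution described above can be sketched in Python. This is only an illustration, not how any particular controller works: the function names `stripe` and `unstripe`, the 4-byte block size and the in-memory lists standing in for disks are all assumptions made for the example.

```python
def stripe(data: bytes, num_disks: int, block_size: int) -> list[list[bytes]]:
    """Split data into fixed-size blocks and deal them out round-robin."""
    disks = [[] for _ in range(num_disks)]
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        # Block k goes to disk k mod num_disks.
        disks[(i // block_size) % num_disks].append(block)
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Read the blocks back in the same round-robin order."""
    out = []
    longest = max(len(d) for d in disks)
    for row in range(longest):
        for d in disks:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGHIJKLMNOP", num_disks=2, block_size=4)
print(disks)  # [[b'ABCD', b'IJKL'], [b'EFGH', b'MNOP']]
```

Because consecutive blocks land on different disks, a large read or write can keep all members busy at once, which is where the performance effect comes from.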

                MIRRORING
                1. Identical copies of data are stored on the RAID members simultaneously.
                2. This way of storing data affects fault tolerance as well as performance.

                PARITY
                1. It is an error correction technique.
                2. It is used to reconstruct data on a drive that has failed in an array.
                3. The XOR operation is used to compute the parity.
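
As a rough sketch of how XOR parity allows a lost block to be rebuilt, the example below computes a parity block over three data blocks and then recovers one of them. The helper name `xor_blocks` and the block contents are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per data disk (made-up contents).
d0, d1, d2 = b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"
parity = xor_blocks([d0, d1, d2])

# Suppose the disk holding d1 fails: XOR the survivors with the parity block.
rebuilt = xor_blocks([d0, d2, parity])
print(rebuilt == d1)  # True
```

This works because XOR-ing a value with itself cancels out, so combining the parity with all surviving blocks leaves only the missing one.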

                  

                RAID LEVELS

                NRAID

                1. It stands for Non-RAID
                2. Only one drive is used to create an NRAID
                3. Total array space is equal to the disk space
                4. No redundancy and no striping
                 
                RAID 0

                1. It is based on Striping
                2. A minimum of 2 disks is required
                3. No data redundancy
                4. No fault tolerance
                5. High performance


                1. If R0 is created with disks of different sizes, each member contributes only as much space as the smallest disk, so the total available storage space in the array is limited by the size of the smallest disk.
                2. For example, if a 450GB disk and a 300GB disk are used to create R0, the total vdisk size will be 2*min(450GB,300GB) = (300GB+300GB) = 600GB
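
The capacity rule above can be written as a small helper. This is a sketch under the assumption stated in the text (every member contributes only as much as the smallest disk); the function name `raid0_capacity` is hypothetical.

```python
def raid0_capacity(disk_sizes_gb: list[int]) -> int:
    """Usable RAID 0 size when every member contributes only min(size)."""
    if len(disk_sizes_gb) < 2:
        raise ValueError("RAID 0 needs at least 2 disks")
    return len(disk_sizes_gb) * min(disk_sizes_gb)


print(raid0_capacity([450, 300]))  # 600
```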


                RAID 1

                1. It is based on Mirroring
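                2. A minimum of 2 disks is required
                3. Increased read performance
                4. Write operations are slower than reads, because each write goes to every member
                5. Fault tolerance
                6. Rebuild is faster than with parity-based levels, since data is simply copied from the surviving mirror
                7. Recommended for applications such as email and web servers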

                1. RAID 1 can be created with disks of different sizes, but the total available array size will be the size of the smallest disk.
                2. For example, when a 450GB disk and a 300GB disk are used to create R1, the total array size will be min(450GB,300GB)=300GB


                RAID 10

                1. It is based on Striping (RAID 0) and Mirroring (RAID 1)
                2. A minimum of 4 disks is required
                3. High performance from RAID 0
                4. Fault tolerance from RAID 1
                5. Better performance than all other redundant RAID levels.
                6. Recommended for database applications

                1. RAID 10 can be created with disks of different sizes, but the total available storage space in the array is limited by the size of the smallest disk.
                2. For example, if three 450GB disks and one 300GB disk are used, the total space will be (3+1)/2*min(450GB,300GB)=600GB
                3. Can survive a single drive failure in each R1
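
The capacity formula and the failure rule for RAID 10 can be sketched as follows. The function names are hypothetical, and the pairing of disks (0, 1), (2, 3), ... into mirrors is an assumption of this example; real arrays may pair members differently.

```python
def raid10_capacity(disk_sizes_gb: list[int]) -> int:
    """Usable RAID 10 size: half the disks hold mirror copies."""
    n = len(disk_sizes_gb)
    if n < 4 or n % 2:
        raise ValueError("this sketch expects an even number of disks, at least 4")
    return n // 2 * min(disk_sizes_gb)


def raid10_survives(num_disks: int, failed: set[int]) -> bool:
    """True if every mirror pair still has at least one working disk.

    Assumes disks (0, 1), (2, 3), ... form the mirror pairs.
    """
    pairs = [(i, i + 1) for i in range(0, num_disks, 2)]
    return all(not (a in failed and b in failed) for a, b in pairs)


print(raid10_capacity([450, 450, 450, 300]))  # 600
print(raid10_survives(4, {0, 2}))  # True: one failure in each mirror
print(raid10_survives(4, {0, 1}))  # False: both halves of one mirror lost
```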

                RAID 3

                1. It uses byte level striping with a dedicated parity disk.
                2. The disk spindles have to rotate in sync to access the data
                3. Sequential reads and writes perform well
                4. Random reads and writes perform poorly

                RAID 5

                1. It uses block level striping with distributed parity
                2. A minimum of 3 disks is required
                3. High read performance 
                4. Slow write performance
                5. Fault tolerance
                6. Recommended for backup applications

                1. RAID 5 can be created with disks of different sizes but the total available storage space in the array is limited by the size of the smallest disk
                2. Parity data consumes the equivalent of one complete disk, so if 3 disks are used the array space comes from only 2 of them
                3. For example, if there are two 450GB disks and one 300GB disk, the array space will be (3-1)*min(450GB,300GB)=600GB
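
A small sketch of the RAID 5 capacity rule, together with one possible way of rotating the parity block across the disks. The function names are hypothetical, and the rotation order used in `parity_disk` is just one illustrative choice; actual implementations use several different layouts.

```python
def raid5_capacity(disk_sizes_gb: list[int]) -> int:
    """Usable RAID 5 size: one disk's worth of space goes to parity."""
    n = len(disk_sizes_gb)
    if n < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (n - 1) * min(disk_sizes_gb)


def parity_disk(stripe_index: int, num_disks: int) -> int:
    """Disk holding the parity block of a given stripe.

    Assumed rotation: start on the last disk and move one disk
    to the left for each following stripe.
    """
    return (num_disks - 1 - stripe_index) % num_disks


print(raid5_capacity([450, 450, 300]))  # 600
print([parity_disk(s, 3) for s in range(4)])  # [2, 1, 0, 2]
```

Spreading the parity blocks this way means no single disk handles every parity update, which is what "distributed parity" refers to.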

                 RAID 50


                1. RAID 50 is a combination of R5 (striping and parity) and R0 (striping)
                2. A minimum of 6 disks is required
                3. R50 provides better performance than R5 but requires more disks.
                4. Better write performance than R5
                5. High fault tolerance along with high capacity
                6. R50 is recommended for backup applications


                1. R50 can be created with disks of different sizes, but the total available storage space in the array is limited by the size of the smallest disk.
                2. For example, if there are five 450GB disks and one 300GB disk arranged as two R5 groups, the array size will be (6-2)*min(450GB,300GB)=1200GB
                3. Can survive a single drive failure in each R5
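
The RAID 50 capacity calculation can be sketched like this. The function name `raid50_capacity` and the `num_groups` parameter are hypothetical; the sketch assumes equal-sized R5 groups, each giving up one disk's worth of space to parity.

```python
def raid50_capacity(disk_sizes_gb: list[int], num_groups: int = 2) -> int:
    """Usable RAID 50 size: each RAID 5 group gives up one disk to parity."""
    n = len(disk_sizes_gb)
    if num_groups < 2 or n % num_groups or n // num_groups < 3:
        raise ValueError("need at least 2 equal RAID 5 groups of 3 or more disks")
    return (n - num_groups) * min(disk_sizes_gb)


print(raid50_capacity([450] * 5 + [300]))  # 1200
```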



                RAID 6

                1. Similar to RAID 5, but with two parity blocks distributed across the disks
                2. A minimum of 4 disks is required
                3. Can survive up to 2 disk failures
                4. Read speed is the same as in RAID 5
                5. Writes are slow
                6. Low performance while a failed disk is being reconstructed

                1. R6 can be created with disks of different sizes but the total available space in the array is limited by the smallest disk
                2. Parity data consumes the equivalent of two complete disks, so the total space is that of n-2 disks
                3. For example, if three 450GB disks and one 300GB disk are used, the total space will be (4-2)*min(450GB,300GB)=600GB
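
The n-2 rule and the two-failure limit for RAID 6 can be sketched as below. Both function names are hypothetical and the code only models the arithmetic, not the parity computation itself.

```python
def raid6_capacity(disk_sizes_gb: list[int]) -> int:
    """Usable RAID 6 size: two disks' worth of space goes to parity."""
    n = len(disk_sizes_gb)
    if n < 4:
        raise ValueError("RAID 6 needs at least 4 disks")
    return (n - 2) * min(disk_sizes_gb)


def raid6_survives(num_failed: int) -> bool:
    """RAID 6 keeps working as long as no more than 2 disks have failed."""
    return num_failed <= 2


print(raid6_capacity([450, 450, 450, 300]))  # 600
print(raid6_survives(2), raid6_survives(3))  # True False
```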




                RAID IMPLEMENTATIONS


                RAID can be created in two different ways

                SOFTWARE RAID
                1. When the RAID is implemented by operating system drivers, it is called software RAID
                2. It is the cheapest RAID solution
                3. It has lower performance because it consumes host resources.
                4. Operating systems have a built-in capability to create a RAID, but only a few RAID levels are supported
                5. Windows home editions allow RAID 0
                6. Windows server editions allow RAID 1 and RAID 5
                7. Software RAID uses the host CPU for its implementation
                8. Only RAID 1 can contain the boot partition; a system boot volume cannot be created with RAID 5 or RAID 0
                9. Software RAID doesn't implement hot swapping, so it cannot be used where continuous availability is required

                HARDWARE RAID
                1. When special hardware is used, it is called hardware RAID
                2. High performance
                3. Volume management is performed by the controller card
                4. There are 2 options to create a hardware RAID
                • An inexpensive RAID chip, possibly built into the motherboard
                • A more expensive option: a complex standalone RAID controller, which has its own CPU and battery-backed cache memory and supports hot-swapping.



                ADVANTAGES OF HARDWARE RAID OVER SOFTWARE RAID

                1. It doesn't use the CPU of the host computer
                2. Allows users to create boot partitions
                3. Handles errors better, as it communicates directly with the devices
                4. Supports hot-swapping


                RAID ARRAY IS NOT A BACKUP SOLUTION

                A RAID array does not let you recover a file that was deleted, or corrupted by a bug in your application.


                Doubts


                1. Which is the best RAID level? Which is best for reads and which is best for writes?
                2. Which RAID are we using, hardware or software? (Hardware, but where are the RAID controllers, battery backup, cache and CPU in our storage box?)
                3. How to create a software RAID in Linux and Windows, and what should be taken care of?
                4. What is the use of NRAID?
                5. Any other points missed regarding RAID?
                6. What is the difference between redundancy and fault tolerance?

                Summary

                1. Block size
                2. Hot spare----Spare drive which can be used automatically to replace a failed drive
                3. Hot swap---- 
                4. Throughput----
                5. Chunks----Blocks of data with a minimum size of 4KB; a suitable chunk size can improve I/O performance
                6. Reliability
                7. Redundancy
                8. Fault tolerant
                9. block level striping
                10. byte level striping
                11. Interleaving---A method of making a system more efficient, fast and reliable by arranging data in a non-contiguous manner.
                12. R0 provides the maximum usable disk space
                13. The unique characteristic of R6 is two independent distributed parity blocks
                14. The characteristic of R5 is distributed parity
                15. R5 is a RAID level commonly used for data protection
                16. R5 has the write hole problem --> what is this
                17. If read and write speed are the only criteria, then R0 is a good choice
                18. R6 survives any two disk failures; R10 and R50 survive two failures only if the failed disks are in different R1 or R5 sub-arrays
                19. R10 and R50 are nested RAID levels
                20. The RAID module in the Linux kernel is called md --> what is the full form and how to implement this
                21. If a drive in R5 fails, with or without a hot spare, both reads and writes can continue
                22. RAID table 
                 Refer RAID TABLE.xlsx

                Key points to remember in each Raid level 


                R0

                Minimum 2 disks
                Excellent performance (as blocks are striped)
                No redundancy (no parity, no mirror)
                Not used for any critical system

                R1

                Minimum 2 disks
                Good performance (no striping, no parity)
                Excellent redundancy (as blocks are mirrored)

                R5

                Minimum 3 disks
                Good performance (as blocks are striped)
                Good redundancy (distributed parity)
                Most cost-effective option providing both performance and redundancy
                Used in databases where read performance is important
                Write operations are slow

                R10

                Minimum 4 disks
                Also called a stripe of mirrors
                Excellent redundancy (as blocks are mirrored)
                Excellent performance (as blocks are striped)
                Best option for mission critical applications and databases

                R3

                Uses byte level striping: bytes are striped across the disks
                Uses multiple data disks and a dedicated disk to store parity
                The disk spindles have to rotate in sync to access the data
                Sequential reads and writes perform well
                Random reads and writes perform poorly
                Not commonly used.

                R6

                Minimum 4 disks 
                Block level striping with dual distributed parity
                Can handle 2 disk failures


                Links

                 http://www.freeraidrecovery.com/library/what-is-raid.aspx
                 http://blog.iweb.com/en/2010/05/an-overview-of-raid-technology/4283.html
                 http://www.sanfoundry.com/raid-interview-questions/
                 http://www.sanfoundry.com/raid-questions-answers/
                https://mkskistudy.weebly.com/interview-questions-on-raid.html









