Skip to main content

CHỌN CLUSTER SIZE 4K HAY 64K ?

 First published on TECHNET on Jan 13, 2017

Microsoft’s file systems organize storage devices based on cluster size. Also known as the allocation unit size, cluster size represents the smallest amount of disk space that can be allocated to hold a file. Because ReFS and NTFS don’t reference files at a byte granularity, the cluster size is the smallest unit of size that each file system can reference when accessing storage. Both ReFS and NTFS support multiple cluster sizes, as different sized clusters can offer different performance benefits, depending on the deployment.

In the past couple weeks, we’ve seen some confusion regarding the recommended cluster sizes for ReFS and NTFS, so this blog will hopefully disambiguate previous recommendations while helping to provide the reasoning behind why some cluster sizes are recommended for certain scenarios.

IO amplification

Before jumping into cluster size recommendations, it’ll be important to understand what IO amplification is and why minimizing IO amplification is important when choosing cluster sizes:

  • IO amplification refers to the broad set of circumstances where one IO operation triggers other, unintentional IO operations. Though it may appear that only one IO operation occurred, in reality, the file system had to perform multiple IO operations to successfully service the initial IO. This phenomenon can be especially costly when considering the various optimizations that the file system can no longer make:

    • When performing a write, the file system could perform this write in memory and flush this write to physical storage when appropriate. This helps dramatically accelerate write operations by avoiding accessing slow, non-volatile media before completing every write.

    • Certain writes, however, could force the file system to perform additional IO operations, such as reading in data that is already written to a storage device. Reading data from a storage device significantly delays the completion of the original write, as the file system must wait until the appropriate data is retrieved from storage before making the write.




ReFS cluster sizes

ReFS offers both 4K and 64K clusters. 4K is the default cluster size for ReFS, and we recommend using 4K cluster sizes for most ReFS deployments because it helps reduce costly IO amplification:

  • In general, if the cluster size exceeds the size of the IO, certain workflows can trigger unintended IOs to occur. Consider the following scenarios where a ReFS volume is formatted with 64K clusters:

    • Consider a tiered volume . If a 4K write is made to a range currently in the capacity tier, ReFS must read the entire cluster from the capacity tier into the performance tier before making the write . Because the cluster size is the smallest granularity that the file system can use, ReFS must read the entire cluster, which includes an unmodified 60K region, to be able to complete the 4K write.

    • If a cluster is shared by multiple regions after a block cloning operation occurs, ReFS must copy the entire cluster to maintain isolation between the two regions. So if a 4K write is made to this shared cluster, ReFS must copy the unmodified 60K cluster before making the write.

    • Consider a deployment that enables integrity streams . A sub-cluster granularity write will cause the entire cluster to be re-allocated and re-written, and the new checksum must be computed. This represents additional IO that ReFS must perform before completing the new write, which introduces a larger latency factor to the IO operation.



  • By choosing 4K clusters instead of 64K clusters, one can reduce the number of IOs that occur that are smaller than the cluster size, preventing costly IO amplifications from occurring as frequently.


Additionally, 4K cluster sizes offer greater compatibility with Hyper-V IO granularity, so we strongly recommend using 4K cluster sizes with Hyper-V on ReFS.  64K clusters are applicable when working with large, sequential IO, but otherwise, 4K should be the default cluster size.

NTFS cluster sizes

NTFS offers cluster sizes from 512 to 64K, but in general, we recommend a 4K cluster size on NTFS, as 4K clusters help minimize wasted space when storing small files. We also strongly discourage the usage of cluster sizes smaller than 4K. There are two cases, however, where 64K clusters could be appropriate:

  • 4K clusters limit the maximum volume and file size to be 16TB

    • 64K cluster sizes can offer increased volume and file capacity, which is relevant if you’re are hosting a large deployment on your NTFS volume, such as hosting VHDs or a SQL deployment.



  • NTFS has a fragmentation limit, and larger cluster sizes can help reduce the likelihood of reaching this limit

    • Because NTFS is backward compatible, it must use internal structures that weren’t optimized for modern storage demands. Thus, the metadata in NTFS prevents any file from having more than ~1.5 million extents.

      • One can, however, use the “format /L” option to increase the fragmentation limit to ~6 million. Read more here .



    • 64K cluster deployments are less susceptible to this fragmentation limit, so 64K clusters are a better option if the NTFS fragmentation limit is an issue. (Data deduplication, sparse files, and SQL deployments can cause a high degree of fragmentation.)

      • Unfortunately, NTFS compression only works with 4K clusters, so using 64K clusters isn’t suitable when using NTFS compression. Consider increasing the fragmentation limit instead, as described in the previous bullets.


While a 4K cluster size is the default setting for NTFS, there are many scenarios where 64K cluster sizes make sense, such as: Hyper-V, SQL, deduplication, or when most of the files on a volume are large.

Source link: https://techcommunity.microsoft.com/t5/storage-at-microsoft/cluster-size-recommendations-for-refs-and-ntfs/ba-p/425960

Comments

Popular posts from this blog

[RAID] SWITCH FROM AHCI TO RAID WITH INTEL C600 CONTROLLER

I personally have used other ways to do this. Manipulating some registry settings in combination with a safe boot before booting normally does the trick as well. This works with both SATA SSD and M.2 NVMe drives and it enables relatively fast switching between back and forth between AHCI and RAID. I have described this method below.  I have also tried the same process used to switch from RAD to AHCI and that works as well. Switch to safe boot Reboot into BIOS Change from AHCI to RAID in the BIOS Boot into safe mode Turn off safe mode and reboot normally again Nothing else and that also did the trick, just like with moving from RAID to AHCI.  So the link above and my step by step below is here for completeness. You have options in case one of them doesn’t work! Step by step AHCI to RAID registry method This procedure I describe below works on Windows 10 1803/1809 and has been tested on Dell Latitude E6220 an XPS 13 9360. Editing the registry is...

[Hyper-V] - Lỗi không boot vào được sau khi convert máy vật lý sang máy ảo

XỬ LÝ LỖI KHÔNG BOOT ĐƯỢC VÀO MÁY ẢO SAU KHI CONVERT TỪ MÁY VẬT LÝ BẰNG DISK2VHD Sau khi convert server vật lý sang file VHD để import vào Hyper thì khi start máy ảo lên màn hình máy ảo chỉ nhấp nháy con trỏ chuột trên màn hình đen (blinking cursor) NGUYÊN NHÂN Do máy vật lý sử dụng ổ đĩa cài OS được format theo chuẩn GPT (thay vì MBR như truyền thống, tham khảo GPT và MBR ) XỬ LÝ Bước 1: chuyển ổ GPT thành MBR Copy file VHD của ổ đĩa chứa OS về 1 máy tính Windows 8 trở lên Trên máy Windows 8+ click phải chuột lên file VHD vừa copy, chọn lệnh Mount . Lúc này dùng 1 phần mềm miễn phí (vd: Mini Partition Wizard ) để convert ổ đĩa vừa mount từ GPT  -> MBR Sau đó Delete phần Partition dư ra ở phần đầu ổ đĩa được mount (khoảng vài trăm MB) Set " Active " cho ổ đĩa này để là ổ đĩa boot OS Nhấn Apply để phần mềm thực thi tác vụ Sau khi phần mềm làm xong, tắt phần mềm Mini Partition Wizard, vào My Computer chọn eject ổ đĩa đang mount . Copy file VHD vừa đư...

LỖI "The provided partition "Migration...." is not a valid Migration mailbox"

  Solution for a valid Migration mailbox could not be found for this organization To address this issue, we will: Delete Migration mailbox in Active Directory Users and Computers Recreate Migration mailbox with /PrepareAD command Enable Migration mailbox with Exchange Management Shell 1. Delete Migration mailbox in Active Directory Users and Computers We do see the mailbox in ADUC, let’s remove it. If you don’t see it, search for it. It might be in a different container than the default container  Users . We can always verify in Exchange Management Shell if the Migration mailbox is present. If it shows up in the output, it means that it’s present and enabled. The output should be empty. [PS] C:\> Set - ADServerSettings - ViewEntireForest $true ; Get - Mailbox - Identity "Migration.8f3e7716-2011-43e4-96b1-aba62d229136" - Arbitration | Format-Table Name , ServerName , Database , AdminDisplayVersion , ProhibitSendQuota Copy 2. Recreate Migration mailbox with /Prepare...