I often find myself explaining the same things in real life and online, so I recently started writing technical blog posts.

This one is about why it was a mistake to call 1024 bytes a kilobyte. It’s about a 20-minute read, so thank you very much in advance if you find the time to read it.

Feedback is very much welcome. Thank you.

  • Australis13@fedia.io
    6 months ago

    This whole mess regularly frustrates me… why can’t the units be used consistently?!

    The other peeve of mine with this debacle is that drive capacities using SI units do not use the full available address space (since it’s binary). Is the difference between 250GB and 256GiB really used effectively for wear-levelling (which only applies to SSDs) or spare sectors?

      • Australis13@fedia.io
        6 months ago

        Of course. The thing is, though, that if the units had been consistent to begin with, there wouldn’t be anywhere near as much confusion. Most people would just accept MiB, GiB, etc. as the units on their storage devices. People already accept weird values for DVDs (~4.37GiB / 4.7GB), so if we had to use SI units then a 256GiB drive could be marketed as a ~275GB drive (obviously with the non-rounded value in the fine print, e.g. “Usable space approx. 274.8GB”).
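
The conversions quoted in the comment above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the helper names `gib_to_gb` and `gb_to_gib` are made up for illustration.

```python
GIB = 2**30  # bytes in one gibibyte (binary prefix)
GB = 10**9   # bytes in one gigabyte (SI prefix)


def gib_to_gb(gib: float) -> float:
    """Convert a size in GiB to the equivalent size in GB."""
    return gib * GIB / GB


def gb_to_gib(gb: float) -> float:
    """Convert a size in GB to the equivalent size in GiB."""
    return gb * GB / GIB


# A 256GiB drive expressed in SI units
print(f"256 GiB = {gib_to_gb(256):.3f} GB")   # 274.878 GB
# A single-layer DVD expressed in binary units
print(f"4.7 GB  = {gb_to_gib(4.7):.3f} GiB")  # 4.377 GiB
```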

        • wewbull@feddit.uk
          6 months ago

          They were consistent until around 2005 (it’s an estimate), when drives got large enough that the absolute difference between the two forms became significant. Before that, everyone in computing used base-2 prefixes.

          I bet OP does too when talking about RAM.

      • wischi@programming.devOP
        6 months ago

        It’s not as simple as that. A lot of “computer things” are not exact powers of two. A prominent example would be HDDs.

        • Lmaydev@programming.dev
          6 months ago

          In terms of storage, 1000 and 1024 take the same number of bits to represent. So from a computer’s point of view, 1024 makes a lot more sense.

          It’s just a binary vs. decimal thing. 1000 is not nicely represented in binary, just as 1024 isn’t in decimal.

          Edit: was talking about storing the actual number.
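
One way to check the "same number of bits" claim is Python's built-in `int.bit_length()`. The sketch below makes one assumption about what is being counted: a field that must hold 1000 or 1024 distinct values (0 to 999, or 0 to 1023) needs 10 bits either way, although the number 1024 itself needs an 11th bit. The helper name `bits_for_values` is hypothetical.

```python
def bits_for_values(n_values: int) -> int:
    """Bits needed to number n_values distinct items as 0 .. n_values - 1."""
    return (n_values - 1).bit_length()


# Binary representation and bit length of the numbers themselves
for n in (999, 1000, 1023, 1024):
    print(f"{n:>4} = {n:b} ({n.bit_length()} bits)")

# Bits needed for a counter that distinguishes that many values
print(bits_for_values(1000))  # 10
print(bits_for_values(1024))  # 10
```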

          • abhibeckert@lemmy.world
            6 months ago

            In terms of storage 1000 and 1024 take the same amount of bytes.

            What? No. A terabyte in 1024 units is 8,796,093,022,208 bits. In 1000 units it’s 8,000,000,000,000 bits.

            The difference is substantial with larger numbers.
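
The bit counts above, and how the gap widens with each prefix, can be reproduced with a short calculation. This is a sketch of the arithmetic only; `prefix_gap_percent` is an illustrative name.

```python
PREFIXES = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi"]


def prefix_gap_percent(power: int) -> float:
    """How much larger 1024**power is than 1000**power, in percent."""
    return (1024**power / 1000**power - 1) * 100


# The relative difference grows from 2.4% at kilo to about 10% at tera
for power, name in enumerate(PREFIXES, start=1):
    print(f"{name:>9}: {prefix_gap_percent(power):5.2f}% larger in binary")

tib_bits = 1024**4 * 8  # bits in a "terabyte" counted in 1024 units
tb_bits = 1000**4 * 8   # bits in a terabyte counted in 1000 units
print(f"{tib_bits:,} bits vs {tb_bits:,} bits")
```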

            • Lmaydev@programming.dev
              6 months ago

              Both require the same number of bits again. So the second one makes more sense for a computer.

    • nous@programming.dev
      6 months ago

      Huh? How does the way a drive’s size is measured affect the available address space at all? Drives are broken up into blocks, and each block is addressable. This is independent of whether you measure it in GB or GiB and does not change the address or block size. Hell, you can have a block size in binary units and the overall capacity in SI units and it does not matter - that is how it is typically done, with typical block sizes being 512 bytes or 4096 bytes (4KiB).

      Or how does it have anything to do with wear leveling at all? If you buy a 250GB SSD then you will be able to write 250GB to it - it will have some hidden capacity for wear leveling, but that could be 10GB, 20GB, 50GB or any number they want. No relation to unit conversions at all.
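
The block-addressing point can be illustrated with a small calculation. The sketch below assumes whole blocks only and uses hypothetical helper names (`block_count`, `address_bits`). It shows that a 250GB and a 256GiB capacity need the same number of address bits for a given block size, so the marketing unit does not change the addressing.

```python
def block_count(capacity_bytes: int, block_size: int) -> int:
    """Number of whole blocks that fit in the given capacity."""
    return capacity_bytes // block_size


def address_bits(blocks: int) -> int:
    """Bits needed to number blocks 0 .. blocks - 1."""
    return (blocks - 1).bit_length()


capacities = (("250 GB", 250 * 10**9), ("256 GiB", 256 * 2**30))

for label, capacity in capacities:
    for block_size in (512, 4096):
        blocks = block_count(capacity, block_size)
        print(
            f"{label:>7}, {block_size:>4}-byte blocks: "
            f"{blocks:>11,} blocks, {address_bits(blocks)} address bits"
        )
```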

      • Australis13@fedia.io
        6 months ago

        Huh? What does how a drive size is measured affect the available address space used at all? Drives are broken up into blocks, and each block is addressable.

        Sorry, I probably wasn’t clear. You’re right that the units don’t affect how the address space is used. My peeve is that because of marketing targeting nice round numbers, you end up with (for example) a 250GB drive that does not use the full address space available (since you necessarily have to address up to 256GiB). If the units had been consistent from the get-go, then I suspect the average drive would have just a bit more usable space available by default.

        My comment re wear-levelling was more to suggest that I didn’t think the unused address space (in my example of 250GB vs 256GiB) could be excused by saying it was taken up by spare sectors.

        • nous@programming.dev
          6 months ago

          (for example) a 250GB drive that does not use the full address space available

          Current drives do not have different-sized addressable spaces, and a 256GiB drive does not use the full address space available. If it did, then that would be the maximum size a drive could be. Yet we have 20TB+ drives, and even those are nowhere near the address size limit of storage media.

          then I suspect the average drive would have just a bit more usable space available by default.

          The platter size might differ to get the same density, and the costs would also likely be different, likely resulting in a similar cost per GB, which is the number that generally matters more.

          My comment re wear-levelling was more to suggest that I didn’t think the unused address space (in my example of 250GB vs 256GiB) could be excused by saying it was taken up by spare sectors.

          There is a lot of unused address space - there is no need to come up with an excuse for it. It does not matter what size the drive is; they all use the same number of bits for addressing the data.

          Address space is basically free, so not using it all does not matter. Putting in extra storage that can use the space does cost money, however. So there is no real relation between the address space, the space that is on a drive, and the space that is accessible to the end user. So it makes no difference which units you use to market the drives.

          Instead, the marketing has been incredibly consistent, all the way back to the early days. Physical storage has essentially always been labeled in SI units. There really is no marketing conspiracy here; it is just the way it was always done. And why was it picked that way to begin with? Well, that was back in the day when binary units were not as common, and physical storage never really fit the doubling pattern the way other components such as RAM did. You see all sorts of random sizes in early storage media, so SI units, I guess, did not feel out of place.
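
To put a number on how much address space goes unused, here is a rough sketch. It assumes 48-bit logical block addresses (as in the ATA LBA48 scheme) and 512-byte logical sectors; neither figure comes from the thread, and other interfaces use different limits.

```python
LBA_BITS = 48       # assumption: 48-bit logical block addresses
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors

# Largest capacity that can be addressed under these assumptions
max_bytes = 2**LBA_BITS * SECTOR_BYTES
drive_bytes = 20 * 10**12  # a 20TB drive, in SI units

print(f"Addressable: {max_bytes / 10**15:.1f} PB ({max_bytes // 2**50} PiB)")
print(f"A 20TB drive uses {drive_bytes / max_bytes:.4%} of that address space")
```

Under these assumptions the limit is 128 PiB, so even a 20TB drive touches only a tiny fraction of it.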

    • abhibeckert@lemmy.world
      6 months ago

      The other peeve of mine with this debacle is that drive capacities using SI units do not use the full available address space (since it’s binary).

      The “full available address space” goes down as the drive gets older and bad sectors are removed.

      With a good drive, it might take ten or more years before you actually see the “size” of the drive shrink, but that’s mostly because your 500GB drive actually had something like 650GB of storage when it was brand new.