In this lesson students are introduced to the standard units for measuring the sizes of digital files, from a single byte, all the way up to terabytes and beyond. Students begin the lesson by comparing the size of a plain text file containing "hello" to a Word document with the same contents. Students are introduced to the units kilobyte, megabyte, gigabyte, and terabyte, and research the sizes of files they make use of every day, using the appropriate terminology. This lesson foreshadows an investigation of compression as a means for combatting the rapid growth of digital data.



Students will be able to:


The simple purposes of this lesson are:

  1. Get terminology out in the open.
  2. Become somewhat conversant with file types and sizes.
  3. Grapple with orders-of-magnitude differences between things.

The 8-bit byte has become the de facto fundamental unit with which we measure the "size" of data on computers. In fact, most computers today only let you save data as combinations of whole bytes: even if you only want to store 1 bit of information, you have to use a whole byte to do it, and many computer systems require you to store even more than that. Messages sent over the Internet are also typically laid out in whole bytes, with each part of the message located at a byte offset.
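As a rough illustration of the whole-byte rule, here is a minimal Python sketch. The helper name `bytes_needed` is hypothetical, and the sketch assumes an 8-bit byte and ignores any extra overhead a particular system might add.

```python
def bytes_needed(bit_count):
    """Return the smallest number of whole 8-bit bytes that can hold bit_count bits."""
    # Round up: any leftover bits still occupy a full byte.
    return (bit_count + 7) // 8


for bits in (1, 8, 9, 40):
    print(bits, "bit(s) ->", bytes_needed(bits), "byte(s)")
```

Storing a single bit still costs one byte, and 9 bits already cost two.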

Paralleling the explosion of computing power and speed, the sheer size of the digital data now created and consumed every day is staggering. Units of measure (terabytes) that previously seemed unfathomably large are now making their way into personal computing. This rapid growth of digital data presents many new opportunities and also poses new challenges to engineers and programmers. The implications of so-called Big Data will not be investigated until later in the course, but it's good and interesting to be thinking about the size of things now.


Getting Started

As we start a new unit about Data and Digital Information we need to get familiar with terminology about data and different types of data files. Recall that a single character of ASCII text requires 8 bits. The technical term for 8 bits of data is a byte, which is the standard fundamental unit underlying most computing systems today. You may have heard "kilobyte," "megabyte," "gigabyte," etc., which are all different amounts of bytes. We're going to learn more about them today.

File Size Comparison: .txt vs .doc

Prompt: In Unit 1 we learned that in addition to the actual text of a document, it is usually necessary to store the formatting information that allows the text to be displayed correctly. We might wonder just how much extra information, i.e. how many extra bytes, we need to store when we include all of this formatting. Let's find out!

If a single ASCII character is one byte, then storing the word "hello" in a plain ASCII text file on a computer should take 5 bytes (or 40 bits).
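The 5-byte prediction can be checked with a short Python sketch. It only counts the bytes of the encoded text; it does not create a file, and the variable names are illustrative.

```python
word = "hello"

# Encode the text as ASCII: each character becomes exactly one byte.
encoded = word.encode("ascii")

num_bytes = len(encoded)
num_bits = num_bytes * 8  # 8 bits per byte

print(list(encoded))  # the numeric ASCII code for each character
print(num_bytes, "bytes =", num_bits, "bits")
```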

What about a Microsoft Word document that contains the single word "hello"? How many more bytes will a Word document require to store the word "hello" than a plain text document?

Discuss: Have students silently make their prediction, then share with a partner, then share with the group. Prompt a couple students to share why they chose the size they did.

Demonstrate: Do a live demo where you show the size of the different files. Use the "hello.txt" and "hello.docx" files included in the File Size Demonstration link in the Resources section.

Pro Tip: If you wish, it might be more fun to create these files in front of your students, saving them on the desktop for a quick demo. To make a plain ASCII text file you’ll need to use the correct program:

To find the actual size of a file on your computer, do one of the following:

In general, the Word document should be thousands of times larger than the plain text file.
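If you would rather check sizes from code than from the file browser, a sketch like the following may help. It uses Python's standard `os.path.getsize`. The helper name `file_size_in_bytes` and the temporary folder are assumptions made for this demo and are not part of the lesson's resources.

```python
import os
import tempfile


def file_size_in_bytes(path):
    """Return the size of the file at path, in bytes."""
    return os.path.getsize(path)


# Create a plain ASCII text file containing only "hello" in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    txt_path = os.path.join(folder, "hello.txt")
    with open(txt_path, "w", encoding="ascii", newline="") as f:
        f.write("hello")  # no trailing newline, so exactly 5 characters

    txt_size = file_size_in_bytes(txt_path)
    print("hello.txt:", txt_size, "bytes")
```

To compare against the Word version, call `file_size_in_bytes` with the path to wherever you saved hello.docx and divide the two results.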

Review: Review students' predictions to see how close they were.

The big difference in file size between .txt and .docx is due to the extensive formatting information included along with the actual text in .docx. Modern data files typically measure in the thousands, millions, billions or trillions of bytes. Let's get a little practice looking at files and how big they are.
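To connect raw byte counts to the unit names, here is a small Python sketch. It assumes the decimal convention that matches "thousands, millions, billions or trillions" (1 KB = 1,000 bytes); some systems count in steps of 1,024 instead. The function name `describe_size` is hypothetical.

```python
# Assumption: decimal units, so each step up is a factor of 1,000.
UNITS = ["bytes", "KB", "MB", "GB", "TB"]


def describe_size(num_bytes):
    """Express a byte count using the largest unit that keeps the number at 1 or above."""
    size = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the number is under 1,000,
        # or at TB, the largest unit in this sketch.
        if size < 1000 or unit == UNITS[-1]:
            return f"{size:g} {unit}"
        size /= 1000


for example in (5, 3_500_000, 16_000_000_000, 10**12):
    print(example, "bytes is", describe_size(example))
```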


Put students in pairs to find answers or have them work individually.

Distribute: Bytes and File Sizes - Activity Guide, which introduces the terminology, refers students to websites they can use for reference, and asks questions like:

Allow students time to finish this activity either individually or in pairs by conducting online research.

Wrap Up

Review Worksheet

Share: Provide students an opportunity to clear up any remaining confusion and share interesting pieces of information they came across.

Review: Answers to the questions on the Activity Guide.

Foreshadow Compression

As you have seen, data files can grow in size very quickly. In the modern world there is a lot of data around us, and usually we want it transmitted over the Internet.

There is a problem though: if you want to transmit a lot of data, you are limited by the speed of your Internet connection. Even if you have a fast Internet connection, there is a physical limit to how fast you can transmit bits.

What if the data you want to send is big enough that it takes an unreasonable amount of time to transmit, even with a really fast Internet connection? Assuming you can't make the Internet connection any faster, could you still transmit the data faster somehow?

The answer is yes, and it is probably something you've done before, or even do every day!


Use the last 3 questions on the Activity Guide for assessment:

  1. A salesperson is trying to sell you a phone that has 16 GB of memory saying, "that’s enough space to record an hour of high quality video!" This salesperson is probably wrong, but in which direction? Would you have more than enough memory or not enough?
  2. Shakespeare’s complete works have approximately 3.5 million characters. Which is bigger in file size: Shakespeare’s complete works stored in plain ASCII text or a 4-minute song stored as an MP3? How much bigger?
  3. Tricky: Assume your Internet connection can transmit 1 million bits per second. Approximately how long would it take you to download 1 terabyte of data? (Hint: first, figure out how many bits are in a terabyte; second, be prepared to wait a long time.)
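For checking answers to question 3, the arithmetic can be sketched in Python as follows. This assumes a decimal terabyte (10^12 bytes) and a connection that sustains exactly 1 million bits per second with no overhead. The constant names are illustrative.

```python
BITS_PER_BYTE = 8
BYTES_PER_TERABYTE = 10**12             # assumption: decimal terabyte
CONNECTION_BITS_PER_SECOND = 1_000_000  # given in the question

# Step 1: how many bits are in a terabyte?
total_bits = BYTES_PER_TERABYTE * BITS_PER_BYTE

# Step 2: divide by the connection speed to get seconds, then convert to days.
seconds = total_bits / CONNECTION_BITS_PER_SECOND
days = seconds / (60 * 60 * 24)

print(f"{total_bits:,} bits")
print(f"{seconds:,.0f} seconds, or about {days:.1f} days")
```

Under these assumptions the download takes 8 million seconds, which is roughly three months.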

Standards Alignment