Students use the Text Compression Widget to experiment with compressing songs and poems and try to find their "personal best" compression. A video introduces important vocabulary for the lesson and demonstrates the full features of the widget. Students pick a text they think will be "easy" to compress and one they think will be "difficult", paying attention to why some texts might be more compressible than others. As a wrap-up, students discuss what factors make some texts more compressible than others.


Students will be able to:


As students have been creating images over the last few lessons, the number of bits it takes to represent that information has grown and grown. In this lesson, students are introduced to the concept of compression as a way to address the growing file sizes of all of our information. This lesson is anchored by the Text Compression widget, a hands-on tool that students actively experiment with. Most of the lesson should be spent in the widget, with students trying different compression strategies; the aim is a memorable experience that anchors the concept of compression. Students also watch a video that introduces lossless and lossy compression - today's lesson is an example of lossless compression, while tomorrow's lesson is dedicated to lossy compression. The widget is just one example of lossless compression and students aren't expected to master specific compression strategies - instead, they should understand that lossless compression uses less data and still lets them re-create the original information.


Getting Started

Prompt: This list represents several common abbreviations used in text messages. What other abbreviations could you add to this list?

Prompt: Why might we use abbreviations when sending messages? What are the advantages?

Discussion Goal: There are many possible responses to this - to talk in code, to hide information, to be clever - but an important response to highlight is that abbreviations save time & space when communicating. If a student suggested an abbreviation that not everyone knew, this is a great moment to bring up that both the sender and the receiver need to understand what the abbreviation stands for in order for it to make sense. Both of these points foreshadow today's activity on compression.



Prompt: How is this message the same as the first? What actually gets sent to my friend?

Discussion Goal

Students should notice that each symbol stands for a snippet of text. By replacing each symbol with the text it represents, we can re-create the original message.

Students may need some guidance to see that the entire sent message is really two parts - the text with symbols and the key that shows what each symbol represents. Students should see that both need to be sent in order for the original message to be re-created - if only the text is sent, the receiver won't know what each symbol represents and can't re-create the message.
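
For classes that want to see the same idea in code, here is a minimal Python sketch of symbol substitution. The message, the symbols, and the function names are made up for illustration and are not taken from the lesson slides or the widget. It shows that the receiver needs both the symbol-filled text and the key to re-create the original.

```python
def compress(message, key):
    """Replace each snippet in the message with its symbol."""
    for symbol, snippet in key.items():
        message = message.replace(snippet, symbol)
    return message


def decompress(text, key):
    """Replace each symbol with the snippet it stands for."""
    for symbol, snippet in key.items():
        text = text.replace(symbol, snippet)
    return text


# Hypothetical message and key, chosen only for this example.
original = "the rain in spain stays mainly in the plain"
key = {"#": "the ", "@": "ain"}

sent_text = compress(original, key)
print(sent_text)                   # the text with symbols
print(decompress(sent_text, key))  # the re-created original
print(decompress(sent_text, {}))   # without the key, the symbols stay as symbols
```

This simple version assumes the symbols never appear in the original message and that no snippet contains a symbol; a real scheme would need to handle those cases.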


Text Compression Widget

Do This: Provide students with links to the Lossless Text Compression project.


Circulate: Help students understand how this widget works so they can successfully compress text. Make note of students who have found successful strategies so they can be highlighted in the upcoming discussion.

Regroup: Gather the class back together. Emphasize the current compression rating. Have students make a note of their current Compression Percentage at the bottom of the box.

Prompt: What strategies are you using to compress your sample text? Which ones seem most successful?

Discussion Goal

Students will have encountered a variety of strategies, but there are a few worth emphasizing for the full class:

Video: Show Text Compression widget (tutorial) - Video. Feel free to skip 2:30-5:00, which demonstrates Code.org's widget rather than the project used in this lesson plan, but don't miss the section from 5:00 onward, which covers the concepts. After the video, be sure to emphasize two things:

Do This: Give students another 4 minutes to apply the strategies they've just seen to continue to raise their compression percentage.

Teaching Tip

Competitions: You could incorporate a peer-to-peer competition (in small groups or as a full class) to get the "highest" rating, but that can be isolating for students and suggests there is a single "best" way to do this. An alternative strategy: when students start their second round, have them compete against themselves to beat the rating they reached during the first 4 minutes. In this way, success is measured by personal growth and has a higher chance of letting every student feel successful.

Starting Over: When solving computational problems, it can sometimes be helpful to restart completely from the beginning. This activity may be a good place to suggest this to students, especially those who feel particularly stuck or frustrated - sometimes restarting from the very beginning surfaces new ideas and strategies that we didn't see before.

Circulate: Check in with students on their strategies and their compression rates. Encourage students to keep trying to reach a "personal best" by looking at how their compression rates change when they add or remove items from the dictionary.
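
The effect of adding or removing dictionary items can be sketched in Python. This is a rough model, assuming the rating counts every character of the symbol-filled text plus every character stored in the dictionary; the widget's actual formula may differ, and the text, symbols, and function names here are invented for illustration.

```python
def compressed_size(text, dictionary):
    """Characters needed: the symbol-filled text plus every dictionary entry."""
    for symbol, snippet in dictionary.items():
        text = text.replace(snippet, symbol)
    dictionary_cost = sum(
        len(symbol) + len(snippet) for symbol, snippet in dictionary.items()
    )
    return len(text) + dictionary_cost


def compression_percentage(original, dictionary):
    """Percentage saved compared with the original (assumed formula)."""
    return 100 * (1 - compressed_size(original, dictionary) / len(original))


# Hypothetical sample text and dictionaries.
original = "the rain in spain stays mainly in the plain"
helpful = {"@": "ain"}                  # "ain" appears four times
wasteful = {"@": "ain", "%": "stays"}   # "stays" appears only once

print(f"{compression_percentage(original, helpful):.1f}%")
print(f"{compression_percentage(original, wasteful):.1f}%")
```

In this model, an entry that is used only once costs more characters in the dictionary than it saves in the text, so removing it raises the percentage.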


Comparing Compressions


Group: Have students work with their neighbor for this activity. Place students in groups of 2 with at most one group of 3.

Do This: Students work together to compress an 'easy' text and a 'difficult' text.

Teaching Tip

"aaaa...aaa": Many groups will probably attempt the last option, all A's, as their "easy" text - it's possible to get a compression rating into the mid-80s with this text. This is fine, since it still emphasizes one of the big takeaways from this activity: information with high repetition is easier to compress. However, it is also reasonable to ask groups to do a second "easy" text once they're satisfied with this one.

Priorities: It's not necessary for all groups to pick the same texts, nor is it important to find the very "best" compressions. Instead, students should focus on the qualities that they think make some texts "easier" or more "difficult" than others. You can emphasize this with the questions you ask as you circulate to groups: "What made you pick this for your 'easy' text? What made you pick this for your 'difficult' text?"
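
The link between repetition and compressibility also shows up outside the widget. The sketch below uses Python's built-in zlib module, a general-purpose lossless compressor that works differently from the widget's dictionary, to compare an all-A's text with a text of randomly chosen letters. The sample texts and the percent_saved helper are assumptions made for this example, and the percentages will not match the widget's ratings.

```python
import random
import string
import zlib


def percent_saved(text):
    """Percentage by which zlib shrinks the UTF-8 bytes of the text."""
    raw = text.encode("utf-8")
    return 100 * (1 - len(zlib.compress(raw)) / len(raw))


# Hypothetical "easy" text: one letter repeated 200 times.
repetitive = "a" * 200

# Hypothetical "difficult" text: 200 randomly chosen lowercase letters.
rng = random.Random(0)  # fixed seed so the sample is the same on every run
varied = "".join(rng.choice(string.ascii_lowercase) for _ in range(200))

print(f"repetitive: {percent_saved(repetitive):.0f}% smaller")
print(f"varied:     {percent_saved(varied):.0f}% smaller")
```

The repetitive text shrinks far more than the varied one, which matches what groups tend to observe with their own 'easy' and 'difficult' choices.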

Wrap Up


Prompt: What made some messages "easier" to compress than others? What made some messages more "difficult" to compress than others?

Discussion Goal


Journal: Have students add the definition of lossless compression to their journal.


Question: What is the most important quality of lossless compression?

Question: An author is preparing to send their book to a publisher as an email attachment. The file on their computer is 1000 bytes. When they attach the file to their email, it shows as 750 bytes. The author gets very upset because they are concerned that part of their book has been deleted by the email program. If you could talk to this author, how would you explain what is happening to their book?

Standards Alignment