I am continuing down the rabbit hole of compression. My goal is to compress 1 GB of Wikipedia (the enwik9 dataset) into as small a file as possible. To beat the current record, I need to get the file below 113 MB.

I have written two posts describing various methods I have tried for compressing text data. I have decided to turn "Adventures with Compression" into a series in which I share my notes from each stage of my journey. I will likely make mistakes along the way. If you notice a flaw in my thinking, let me know; compression is a relatively new subject to me. I am excited to see what I learn.

In this post, I talk about my adventures in chunking a document to compress it further.