We consider the minimal directed acyclyc graph (DAG) lossless compression strategy introduced in Kieffer et. al., with the aim of testing its asymptotic effectiveness on binary trees of size n. We have four models for studying the compression strategy: two ways of measuring size (either the number of leaves or the depth of the tree), and two types of probability distributions (all planar trees are equally likely, or all nonplanar trees are equally likely). We calculate the average compression achieved by Kieffer et. al.'s strategy for some specific example classes of binary trees, and then more generally, averaging over all (either nonplanar, or planar) binary trees of a fixed size n. We use the results to draw conclusions about the kinds of trees for which the strategy is effective. An ultimate goal is to determine the extent to which the size of the DAG is correlated with the information embodied in the associated tree.

Author Bio

Mayfawny Bergmann is a junior in mathematics, statistics, and actuarial science at Purdue University. She began doing math research the summer after her freshman year with Dr. Mark Daniel Ward, through the Center for Science of Information. After graduation she plans to earn a master’s degree in applied statistics. Mayfawny enjoys singing and playing the piano and guitar.