But I can't take all the credit, since most of the work is done by HTML::TreeBuilder. All I did was write a little function to traverse the tree and deal with each node (keep, toss, etc.).
The bulk of my own work was coming up with a list of allowable tags and attributes. It's a hash keyed by tag, where each value is a hash keyed by attribute. The attribute values can be regular scalars (an action code) or code refs that decide the action dynamically or even change the attribute's value.
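To make the shape of that table concrete, here is a minimal sketch of how such a hash of hashes might look and how a rule could be applied. It is not my actual list: the action codes `KEEP` and `TOSS`, the `%allowed` entries, and the `filter_attributes` helper are all made up for illustration, and it uses only core Perl so it runs without HTML::TreeBuilder.

```perl
use strict;
use warnings;

# Hypothetical action codes; the real ones are not shown in this post.
use constant { KEEP => 1, TOSS => 0 };

# Hash (tag) of hashes (attributes). Each value is either a plain
# scalar action code or a code ref that decides at run time.
# These entries are illustrative assumptions, not the real list.
my %allowed = (
    a => {
        title => KEEP,
        href  => sub {
            my ($value) = @_;
            # Keep only http(s) links; toss everything else.
            return $value =~ m{^https?://}i ? (KEEP, $value) : (TOSS);
        },
    },
    img => {
        alt   => KEEP,
        width => sub {
            my ($value) = @_;
            # A code ref may also rewrite the value before keeping it.
            return (TOSS) unless $value =~ /^\d+$/;
            return (KEEP, $value > 600 ? 600 : $value);
        },
    },
    b => {},    # tag allowed, but no attributes allowed
);

# Returns undef if the tag is not allowed, otherwise a hash ref
# holding only the attributes that survived the rules.
sub filter_attributes {
    my ($tag, %attrs) = @_;
    my $rules = $allowed{ lc $tag } or return;
    my %kept;
    for my $name (keys %attrs) {
        my $rule = $rules->{ lc $name };
        next unless defined $rule;    # unlisted attribute: toss
        my ($action, $new) = ref($rule) eq 'CODE'
            ? $rule->($attrs{$name})
            : ($rule, $attrs{$name});
        $kept{ lc $name } = $new if $action == KEEP;
    }
    return \%kept;
}
```

In a real traversal the tag name and attributes would come from each node in the tree, so something like `filter_attributes('a', href => $url, onclick => $js)` would be called once per element. Anything not listed gets tossed by default, so the table acts as a whitelist.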