For most of these scripts, if you run them with a file argument, where the file contains some HTML, you should get some output. The 'h*sub' scripts take two arguments: the first is a Perl expression and the second is an HTML file. In any case, all of the files have an explanatory comment.

For example, try running:

    lynx -dump -source -raw http://www.debian.org > /tmp/a.txt
    ./hanchors /tmp/a.txt

Of course, if http://www.debian.org is not your favourite web site, you can make the appropriate substitution.

hanchors - List all anchors in the HTML
hlc      - Correct any upper case tags to lower case
hstrip   - Remove deprecated scripting and styling tags and attributes
htextsub - Apply arbitrary Perl expression to all text within HTML
hrefsub  - Apply arbitrary Perl expression to all hrefs within HTML
htitle   - Print title of the HTML document
hdump    - Output event information whilst parsing HTML document
hform    - Print analysis of form controls present in HTML
htext    - Print all the text from the HTML
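
To give a rough idea of what an anchor-listing script such as hanchors does, here is a minimal sketch in Python. It is not the code of hanchors itself (the scripts described here are Perl, and their exact output format is not reproduced); the names AnchorLister and list_anchors are hypothetical, and printing the href followed by the link text is an assumption made for illustration.

```python
from html.parser import HTMLParser


class AnchorLister(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element.

    Illustrative only: the real hanchors script may report anchors
    differently.
    """

    def __init__(self):
        super().__init__()
        self.anchors = []   # finished (href, text) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive already lower-cased.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a href=...>.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            text = " ".join("".join(self._text).split())
            self.anchors.append((self._href, text))
            self._href = None


def list_anchors(html):
    """Return a list of (href, text) pairs found in the HTML string."""
    parser = AnchorLister()
    parser.feed(html)
    parser.close()
    return parser.anchors


if __name__ == "__main__":
    import sys

    # Usage mirrors the scripts above: one argument, a file containing HTML.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for href, text in list_anchors(fh.read()):
            print(f"{href}\t{text}")
```

Saved under a hypothetical name such as anchors_sketch.py, it could be run against the same /tmp/a.txt file fetched with lynx in the example above.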