opnsense-ports/www/p5-HTML-ExtractContent/pkg-descr
Franco Fichtner 8cb1a96ede ports: pull in a snapshot of the FreeBSD ports tree
Taken from:	https://github.com/freebsd/freebsd-ports.git
Commit id:	5070672073b68be364139bc6b3a89100bd17d331
2014-11-09 14:03:21 +01:00

HTML::ExtractContent is a Perl module that extracts content from HTML
using scoring heuristics.
It guesses which block of HTML looks like content by scoring each block
on its number of punctuation marks and the length of its non-tag text.
It also guesses whether the content ends in the block or continues into
the next block.
WWW: http://search.cpan.org/dist/HTML-ExtractContent/
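
The scoring idea described above can be illustrated with a minimal Python
sketch. This is not the module's actual algorithm or API: the function
names (score_block, best_block), the set of block-level tags, and the
punctuation weight are all assumptions chosen for illustration, and the
continuation guess between adjacent blocks is not modeled.

```python
import re

# Hypothetical patterns, chosen for this sketch only.
TAG_RE = re.compile(r"<[^>]+>")
BLOCK_SPLIT_RE = re.compile(
    r"</?(?:div|td|table|section|article)\b[^>]*>", re.IGNORECASE
)
PUNCT_RE = re.compile(r"[.,!?;:]")


def strip_tags(block_html):
    """Remove tags and collapse whitespace, leaving the non-tag text."""
    return " ".join(TAG_RE.sub("", block_html).split())


def score_block(block_html, punct_weight=10):
    """Score a block by non-tag text length plus weighted punctuation count.

    The weight of 10 is an arbitrary assumption, not a value from the module.
    """
    text = strip_tags(block_html)
    return len(text) + punct_weight * len(PUNCT_RE.findall(text))


def best_block(html):
    """Split HTML on block-level tags and return the text of the top scorer."""
    blocks = [b for b in BLOCK_SPLIT_RE.split(html) if b.strip()]
    if not blocks:
        return ""
    return strip_tags(max(blocks, key=score_block))


if __name__ == "__main__":
    page = (
        '<div><a href="/">Home</a> <a href="/about">About</a></div>'
        "<div><p>This is the article. It has sentences, commas, and more.</p></div>"
    )
    print(best_block(page))
```

A navigation block holds short link text with little punctuation, so it
scores low, while a block of running prose scores high on both measures.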