forked from Lainports/opnsense-ports
Taken from: https://github.com/freebsd/freebsd-ports.git Commit id: 5070672073b68be364139bc6b3a89100bd17d331
HTML::ExtractContent is a module for extracting content from HTML with
scoring heuristics.

It guesses which block of HTML looks like content according to scores
that depend on the number of punctuation marks and the length of the
non-tag text.

It also guesses whether the content ends in the block or continues into
the next block.

WWW: http://search.cpan.org/dist/HTML-ExtractContent/