\PhpTeaser\Teaser

Create a summary from long text blocks

php-teaser is based on the original TextTeaser project, written in Scala by Mojojolo, and on PyTeaser by xiaoxu193. It is a complete rewrite in PHP. The aim of php-teaser is to take any news article and extract a brief summary from it.

php-teaser requires php-readability.

Summary

Public methods: createSummary()
Public properties: $stopWords, $ideal
Constants: none

Protected methods: none
Protected properties: none

Private methods: getArticle(), computeScore(), sbs(), dbs(), splitWords(), computeKeywords(), splitSentences(), computeLengthScore(), computeTitleScore(), computeSentencePositionScore(), cleanText(), getNormalizedBounds()
Private properties: none

Properties

$stopWords

$stopWords : 

List of common words to ignore

Type

$ideal

$ideal : 

Magic number representing a "good" sentence length.

Type

Methods

createSummary()

createSummary(String  $text, String  $type, String  $title = "", Int  $count = 3) : Array

Summarize some text, optionally by extracting an article from a URL

Parameters

String $text

A block of text or a URL

String $type

The type of text given

String $title

The title (optional)

Int $count

The number of sentences to return (optional)

Returns

Array —

The resulting sentences

getArticle()

getArticle(String  $url) 

Extract article from a page using php-readability

Parameters

String $url

The URL of the text to grab

computeScore()

computeScore(Array  $sentences, Array  $titleWords, Array  $keywords) : Array

Score sentences

Parameters

Array $sentences
Array $titleWords
Array $keywords

Returns

Array —

The resulting score for each sentence in $sentences
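As an illustration of how several per-sentence features can be folded into one ranking, here is a minimal sketch. The function names, the four feature inputs, and the weights are all assumptions made for this example; php-teaser's actual weighting is not documented on this page.

```php
<?php
/**
 * Hypothetical helper: weighted average of four per-sentence feature scores.
 * The weights are illustrative assumptions, not php-teaser's actual values.
 */
function sketchCombineScores(float $title, float $keyword, float $length, float $position): float
{
    $weights = ['title' => 1.5, 'keyword' => 2.0, 'length' => 1.0, 'position' => 1.0];
    $total = $title * $weights['title']
        + $keyword * $weights['keyword']
        + $length * $weights['length']
        + $position * $weights['position'];

    return $total / array_sum($weights);
}

/**
 * Hypothetical helper: return the $count highest-scoring sentences.
 * $features maps each sentence to [title, keyword, length, position].
 */
function sketchTopSentences(array $features, int $count = 3): array
{
    $scores = [];
    foreach ($features as $sentence => $f) {
        $scores[$sentence] = sketchCombineScores($f[0], $f[1], $f[2], $f[3]);
    }
    arsort($scores); // highest score first, keys preserved

    return array_slice(array_keys($scores), 0, $count);
}
```

Dividing by the sum of the weights keeps the combined score in the same range as the inputs, which makes it easier to compare sentences across articles.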

sbs()

sbs(Array  $words, Array  $keywords) : Int

Score a sentence based on the presence of keywords

Parameters

Array $words

All the words in the sentence

Array $keywords

All of the keyword and frequency tuples

Returns

Int —

Score for the sentence represented by $words
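One plausible reading of a presence-based keyword score is a sum of the scores of the keywords found in the sentence, normalized by sentence length. The sketch below assumes that reading and assumes $keywords is a list of [keyword, score] pairs; the function name is hypothetical and the real sbs() may scale its result differently.

```php
<?php
/**
 * Hypothetical sketch of a summation-style keyword score.
 * $words: list of words in the sentence.
 * $keywords: list of [keyword, score] pairs (assumed shape).
 */
function sketchKeywordSumScore(array $words, array $keywords): float
{
    if (count($words) === 0) {
        return 0.0;
    }

    // Build a keyword => score lookup from the pairs.
    $lookup = [];
    foreach ($keywords as [$keyword, $score]) {
        $lookup[$keyword] = $score;
    }

    $sum = 0.0;
    foreach ($words as $word) {
        $sum += $lookup[$word] ?? 0.0;
    }

    // Normalize by sentence length so long sentences are not favored.
    return $sum / count($words);
}
```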

dbs()

dbs(Array  $words, Array  $keywords) : Int

Score a sentence based on how close together its keywords appear (d as in distance)

Parameters

Array $words

All the words in the sentence

Array $keywords

All of the keyword and frequency tuples

Returns

Int —

Score for the sentence represented by $words
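A distance-based score can be sketched by letting each pair of neighbouring keywords contribute the product of their scores divided by the squared distance between them. The formula, the normalization term, the function name, and the [keyword, score] pair shape are assumptions for this example, not a description of what dbs() actually computes.

```php
<?php
/**
 * Hypothetical sketch of a distance-based keyword score.
 * Keywords that sit close together contribute more than keywords far apart.
 */
function sketchKeywordDistanceScore(array $words, array $keywords): float
{
    $lookup = [];
    foreach ($keywords as [$keyword, $score]) {
        $lookup[$keyword] = $score;
    }

    $sum = 0.0;
    $previous = null; // [position, score] of the last keyword seen
    $seen = [];       // distinct keywords found in this sentence

    foreach (array_values($words) as $position => $word) {
        if (!isset($lookup[$word])) {
            continue;
        }
        $seen[$word] = true;
        if ($previous !== null) {
            $distance = $position - $previous[0];
            $sum += ($lookup[$word] * $previous[1]) / ($distance ** 2);
        }
        $previous = [$position, $lookup[$word]];
    }

    // Assumed normalization: dampen sentences that contain many distinct keywords.
    $k = count($seen) + 1;

    return $sum / ($k * ($k + 1));
}
```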

splitWords()

splitWords(String  $text) : Array

Split text into words

Parameters

String $text

The Text to split

Returns

Array —

The resulting word list
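A minimal sketch of word splitting with a regular expression follows. Lowercasing the text and treating apostrophes as part of a word are assumptions of this example, and the function name is hypothetical.

```php
<?php
/**
 * Hypothetical sketch: split text into lowercase words.
 * strtolower() only lowercases ASCII letters; mb_strtolower() is the
 * multibyte alternative when the mbstring extension is available.
 */
function sketchSplitWords(string $text): array
{
    // \p{L} matches any letter, \p{N} any digit; the u flag enables UTF-8 mode.
    preg_match_all('/[\p{L}\p{N}\']+/u', strtolower($text), $matches);

    return $matches[0];
}
```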

computeKeywords()

computeKeywords(String  $text) : Array

Compute the top 10 keywords based on frequency, eliminating stopwords

Parameters

String $text

Text source from which to compute keywords

Returns

Array —

Scored keywords where each item is array('keyword',score)
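Frequency-based keyword extraction can be sketched with PHP's built-in array functions. This example takes an already-split word list and uses the raw occurrence count as the score; both choices, and the function name, are assumptions, and the real method may normalize its scores.

```php
<?php
/**
 * Hypothetical sketch: top keywords by frequency, ignoring stop words.
 * Returns a list of [keyword, count] pairs, most frequent first.
 */
function sketchTopKeywords(array $words, array $stopWords, int $limit = 10): array
{
    // Drop stop words, then count how often each remaining word occurs.
    $counts = array_count_values(array_diff($words, $stopWords));
    arsort($counts); // most frequent first, keys preserved

    $pairs = [];
    foreach (array_slice($counts, 0, $limit, true) as $word => $count) {
        $pairs[] = [(string) $word, $count];
    }

    return $pairs;
}
```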

splitSentences()

splitSentences(String  $text) : Array

Split text into sentences with regex

Borrowed from http://stackoverflow.com/questions/5032210/php-sentence-boundaries-detection

Parameters

String $text

Text to split

Returns

Array —

Sentences
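For illustration, a much simpler pattern than the one in the linked Stack Overflow answer is sketched here: it splits on whitespace that follows sentence-ending punctuation and precedes a capital letter or quote. The function name is hypothetical, and the pattern will wrongly split after abbreviations such as "Mr.".

```php
<?php
/**
 * Hypothetical sketch: naive regex-based sentence splitting.
 * Not the pattern php-teaser uses; abbreviations are not handled.
 */
function sketchSplitSentences(string $text): array
{
    // Lookbehind: end punctuation. Lookahead: a capital letter or an opening quote.
    $pattern = '/(?<=[.!?])\s+(?=[A-Z"\'])/';

    return preg_split($pattern, trim($text), -1, PREG_SPLIT_NO_EMPTY);
}
```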

computeLengthScore()

computeLengthScore(Array  $sentence) : Int

Score a sentence based on its length

Parameters

Array $sentence

Returns

Int —

score
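A length score can be sketched as the relative deviation of the sentence's word count from an ideal length. The default of 20 words, the clamping at zero, and the function name are assumptions for this example; the actual value of $ideal is not shown on this page.

```php
<?php
/**
 * Hypothetical sketch: 1.0 for a sentence of exactly $ideal words,
 * falling linearly toward 0.0 as the length moves away from it.
 */
function sketchLengthScore(array $sentenceWords, int $ideal = 20): float
{
    $deviation = abs($ideal - count($sentenceWords));

    // Clamp at zero so very long sentences do not get a negative score.
    return max(0.0, 1.0 - $deviation / $ideal);
}
```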

computeTitleScore()

computeTitleScore(String  $title, String  $sentence) : Int

Score a sentence based on the title

Parameters

String $title
String $sentence

Returns

Int —

score
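One simple way to score a sentence against the title is the fraction of distinct title words that also occur in the sentence. The sketch below works on pre-split word lists rather than raw strings; that choice and the function name are assumptions of this example.

```php
<?php
/**
 * Hypothetical sketch: share of distinct title words found in the sentence.
 */
function sketchTitleScore(array $titleWords, array $sentenceWords): float
{
    $titleWords = array_unique($titleWords);
    if (count($titleWords) === 0) {
        return 0.0;
    }

    // Title words that also appear in the sentence.
    $shared = array_intersect($titleWords, $sentenceWords);

    return count($shared) / count($titleWords);
}
```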

computeSentencePositionScore()

computeSentencePositionScore(Int  $i, Int  $size) : Float

Different sentence positions indicate different probability of being an important sentence

Parameters

Int $i

Sentence position in $text

Int $size

Total number of sentences in $text

Returns

Float —

sentence position score
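Position scoring can be sketched as a lookup: normalize the sentence index to the range 0 to 1, then return the score of the first bucket whose upper bound covers it. The bucket bounds and scores below are illustrative assumptions (as is the 1-based index and the function name); they only demonstrate the idea that opening and closing sentences tend to rank higher.

```php
<?php
/**
 * Hypothetical sketch: map a sentence's relative position to a score.
 * $i is assumed to be 1-based; $size is the total number of sentences.
 */
function sketchPositionScore(int $i, int $size): float
{
    if ($size <= 0) {
        return 0.0;
    }

    // [upper bound of normalized position, score]; values are illustrative.
    $bounds = [
        [0.1, 0.17],
        [0.2, 0.23],
        [0.3, 0.14],
        [0.9, 0.05],
        [1.0, 0.15],
    ];

    $normalized = $i / $size;
    foreach ($bounds as [$upper, $score]) {
        if ($normalized <= $upper) {
            return $score;
        }
    }

    return 0.0; // position beyond the last sentence
}
```

Pairs are used instead of float array keys because PHP truncates float keys to integers.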

cleanText()

cleanText($text)

Remove unwanted tags and other markup from text

Parameters

$text
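A minimal sketch of text cleaning with PHP built-ins follows: strip tags, decode HTML entities, and collapse whitespace. Which kinds of markup the real method removes is not documented here, so the steps and the function name are assumptions.

```php
<?php
/**
 * Hypothetical sketch: reduce an HTML fragment to plain, single-spaced text.
 * Note that strip_tags() keeps the text inside <script> and <style> elements.
 */
function sketchCleanText(string $text): string
{
    $text = strip_tags($text);
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse runs of spaces, tabs, and newlines into a single space.
    $text = (string) preg_replace('/\s+/', ' ', $text);

    return trim($text);
}
```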

getNormalizedBounds()

getNormalizedBounds() : array

Translate sentence positions into scores

Returns

array