Laplacian Eigenmaps for Automatic Story Segmentation of Broadcast News

DOI: 10.1109/TASL.2011.2160853

We propose Laplacian Eigenmaps (LE)-based approaches to automatic story segmentation on speech recognition transcripts of broadcast news. We reinforce story boundaries by applying LE analysis to a sentence connective strength matrix, which reveals the intrinsic geometric structure of stories. Specifically, we construct a Euclidean space in which each sentence is mapped to a vector. As a result, the original inter-sentence connective strength is reflected by the Euclidean distances between the corresponding vectors, and cohesive relations between sentences become geometrically evident. Taking advantage of LE, we present three story segmentation approaches: LE-TextTiling, spectral clustering, and LE-DP. In LE-DP, we formalize story segmentation as a straightforward criterion minimization problem and give a fast dynamic programming solution to it. Extensive story segmentation experiments on three corpora demonstrate that the proposed LE-based approaches achieve superior performance and significantly outperform several state-of-the-art methods. For instance, LE-TextTiling obtains a relative F1-measure increase of 17.8% on the CCTV Mandarin BN corpus compared to conventional TextTiling; LE-DP achieves a high F1-measure of 0.7460 on the TDT2 Mandarin BN corpus, significantly outperforming a recent CRF-prosody approach with an F1-measure of 0.6783.
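The pipeline described in the abstract (embed sentences with LE, then segment by minimizing a criterion with dynamic programming) can be illustrated with a minimal sketch. The abstract gives neither the connective strength measure nor the LE-DP criterion, so everything specific here is an assumption: the toy matrix `W`, the helper names `laplacian_eigenmaps` and `segment_dp`, a fixed number of stories `k`, and a within-segment squared-distance cost. It should be read as a generic instance of the technique, not as the authors' method.

```python
import numpy as np

def laplacian_eigenmaps(W, dim):
    """Map n sentences to dim-dimensional vectors from a symmetric,
    non-negative connective strength matrix W (assumed: every row sum > 0)."""
    W = np.asarray(W, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Skip the trivial first eigenvector; rescale to solve L y = lambda D y.
    return eigvecs[:, 1:dim + 1] * d_inv_sqrt[:, None]

def segment_dp(Y, k):
    """Split the rows of Y into k contiguous segments minimizing the total
    squared distance to each segment mean (an assumed criterion).
    Returns the indices at which a new segment starts."""
    n = len(Y)
    # Prefix sums give each segment cost in O(1).
    s1 = np.vstack([np.zeros(Y.shape[1]), np.cumsum(Y, axis=0)])
    s2 = np.concatenate([[0.0], np.cumsum((Y ** 2).sum(axis=1))])

    def cost(i, j):  # cost of a segment covering sentences i .. j-1
        seg_sum = s1[j] - s1[i]
        return (s2[j] - s2[i]) - seg_sum @ seg_sum / (j - i)

    best = np.full((k + 1, n + 1), np.inf)
    back = np.zeros((k + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1, i] + cost(i, j)
                if c < best[m, j]:
                    best[m, j], back[m, j] = c, i
    # Trace back the segment start positions.
    starts, j = [], n
    for m in range(k, 0, -1):
        j = int(back[m, j])
        starts.append(j)
    return sorted(starts[:-1])  # drop the start of the first segment (0)

# Toy example: six sentences forming two stories of three sentences each.
W = np.full((6, 6), 0.05)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)

Y = laplacian_eigenmaps(W, dim=1)
boundaries = segment_dp(Y, k=2)
print(boundaries)
```

In this toy case the sentences of each story collapse to nearly the same point in the embedded space, so the boundary falls where the second story begins. Real transcripts would need a connective strength measure computed from the text and some way of choosing the number of segments, neither of which is specified in the abstract.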
Journal: IEEE Transactions on Audio, Speech & Language Processing (TASLP), vol. 20, no. 1, pp. 264-277, 2012