MAXIMAL MARGINAL RELEVANCE BASED MALAYALAM TEXT SUMMARIZATION WITH SUCCESSIVE THRESHOLDS

Authors

Ajmal E B and Rosna P Haroon
Mar Athanasius College Of Engineering, India

Abstract

Automatic text summarization is of prime importance in Natural Language Processing. Because a large quantity of information is available on the web, it is very difficult to extract the needed information from it. Text summarization is the process of shortening a document so that it retains only the important points of the original. As the problem of information overload has grown and the quantity of data has assumed greater significance, the need for instant summarization of a largely untouched language, Malayalam, assumes vital importance. Although many summarization systems have already been developed for various languages, there is no well-performing system for Malayalam. In this paper we propose a Malayalam text summarization system based on the Maximal Marginal Relevance (MMR) technique with successive thresholds. Sentences are selected based on the concept of maximal marginal relevance. The key idea is to use a unit step function at each step to decide the maximum marginal relevance; the number of sentences in the summary is equal to the number of paragraphs or the average number of sentences in the text document, which can be achieved by using the successive threshold approach.
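The selection idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact scoring formula or threshold schedule, so the sketch assumes the standard MMR score (relevance to the whole document minus redundancy with already selected sentences), whitespace tokenization, bag-of-words cosine similarity, and a threshold that is lowered by a fixed step whenever no candidate passes the unit step test. The names `mmr_summarize`, `lam`, `start`, and `step` are hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def unit_step(x: float) -> int:
    """Unit step function: 1 when x >= 0, otherwise 0."""
    return 1 if x >= 0 else 0


def mmr_summarize(sentences, target, lam=0.5, start=0.9, step=0.1):
    """Pick up to `target` sentences by MMR with a successively lowered threshold.

    Assumptions (not taken from the paper): whitespace tokenization,
    cosine similarity, and a fixed threshold decrement `step`.
    """
    vecs = [Counter(s.lower().split()) for s in sentences]
    doc = sum(vecs, Counter())  # bag of words for the whole document
    selected = []               # indices of chosen sentences
    threshold = start

    while len(selected) < min(target, len(sentences)):
        # Find the unselected sentence with the highest MMR score.
        best_i, best_score = None, float("-inf")
        for i, v in enumerate(vecs):
            if i in selected:
                continue
            redundancy = max((cosine(v, vecs[j]) for j in selected), default=0.0)
            score = lam * cosine(v, doc) - (1 - lam) * redundancy
            if score > best_score:
                best_i, best_score = i, score

        if unit_step(best_score - threshold):
            selected.append(best_i)   # candidate passes the current threshold
        else:
            threshold -= step         # nothing passes: relax the threshold

    # Return the chosen sentences in their original document order.
    return [sentences[i] for i in sorted(selected)]
```

A caller would split the document into sentences, set `target` to the number of paragraphs (or the average sentence count mentioned above), and call `mmr_summarize(sentences, target)`. Because tokenization is by whitespace only, the sketch applies no Malayalam-specific preprocessing such as stemming or stop-word removal.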

Keywords

Maximum Marginal Relevance, Successive Threshold, Unit step function