PHP: Text cropping with defining the number of characters, words or sentences

// March 18th, 2009 // PHP, PHP Tips, Web Development

I have recently been working on a PHP RSS reader script that had to handle the RSS description text in three different ways. To produce a teaser from that text, I was asked to build the functionality so that the text can be cropped in any of these three ways:

  • by a defined number of words.
  • by a defined number of characters.
  • by a defined number of sentences.

Here is a set of functions that crops a text in these three ways. The usage of the main function is shown at the end of the script.
<?php

/**
 * cropsentence()
 *
 * @param string  $str        Text to crop
 * @param integer $noof       Number of characters, sentences or words to keep
 * @param integer $mode       0 = crop by characters, 1 = crop by sentences, 2 = crop by words
 * @param string  $appendwith Text appended to the result when something was cut off
 * @return string
 */
function cropsentence($str, $noof, $mode = 0, $appendwith = "…")
{
    if ($mode == 0) { // 0 means crop by number of characters
        $cropped = substr($str, 0, $noof);
    } elseif ($mode == 1) { // 1 means crop by number of sentences
        $cropped = getLeadingSentences($str, $noof);
    } elseif ($mode == 2) { // 2 means crop by number of words
        $cropped = wordTrim($str, $noof);
    } else { // any other mode: report it and leave the text untouched
        echo "The current mode is not correct.";
        return $str;
    }

    // Append the suffix only when part of the text was actually cut off.
    if (strlen($cropped) < strlen($str)) {
        return $cropped . $appendwith;
    }
    return $str;
}

//getLeadingSentences
//Copyright (c) 2000 Jason R. Pitoniak.  All rights reserved.
//jason@interbrite.com http://www.interbrite.com

//If you find this code useful, find a bug, or have a suggestion,
//please email me.  Feel free to use this code for any purpose.

/**
 * getLeadingSentences()
 *
 * @param string  $data Text to parse
 * @param integer $max  Maximum number of sentences to return
 * @return string The first $max sentences of $data
 */
function getLeadingSentences($data, $max)
{
    // Given string $data, returns the first $max sentences in that string.
    // (If the number of sentences in the string is less than $max,
    // then the entire string is returned.)

    // A sentence is any characters except ., !, and ?
    // any number of times, plus one or more .s, ?s, or !s
    // and any leading or trailing whitespace.
    // PCRE syntax is used because the ereg functions were removed in PHP 7.
    $re = '/^\s*[^.?!]+[.?!]+\s*/';
    $out = "";
    for ($i = 0; $i < $max; $i++) {
        if (preg_match($re, $data, $match)) {
            // If a sentence is found, take it out of $data and add it to $out.
            $out .= $match[0];
            $data = (string) substr($data, strlen($match[0]));
        } else {
            break; // no more sentences
        }
    }
    return $out;
}

/**
 * getLeadingWords()
 *
 * @param string  $data Text to parse
 * @param integer $max  Maximum number of words to return
 * @return string The first $max words of $data
 */
function getLeadingWords($data, $max)
{
    // Given string $data, returns the first $max words in that string.
    // (If the number of words in the string is less than $max,
    // then the entire string is returned.)

    // A word is assumed here to be any run of non-whitespace characters,
    // plus any leading or trailing whitespace:
    $re = '/^\s*\S+\s*/';
    $out = "";
    for ($i = 0; $i < $max; $i++) {
        if (preg_match($re, $data, $match)) {
            // If a word is found, take it out of $data and add it to $out.
            $out .= $match[0];
            $data = (string) substr($data, strlen($match[0]));
        } else {
            break; // no more words
        }
    }
    return $out;
}

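/**
 * wordTrim()
 *
 * @param string  $str Text to crop; HTML tags are stripped first
 * @param integer $len Maximum number of words to keep
 * @return string
 */
function wordTrim($str, $len)
{
    // Work on the tag-free text so that the length and the character
    // offsets below refer to the same string.
    $str = strip_tags($str);
    $length = strlen($str);

    $wordCount = 0;
    $charCount = 0;

    for ($i = 0; $i < $length; $i++) {
        $charCount++;
        if ($str[$i] == ' ') {
            // Each space is counted as the end of one word.
            $wordCount++;
            if ($wordCount == $len) {
                break;
            }
        }
    }

    // Keep everything up to and including the last counted space.
    return substr($str, 0, $charCount);
}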

# Example used for testing purposes

$str2 = "Hello there I work for you. I have been working for other guys, but you are the coolest guy among them.
Also I really like the way you guide me with new technologies and the latest trends. You always tell me to keep working hard!";
echo cropsentence($str2, 22, 2);
?>
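
The sentence pattern described in the comments of getLeadingSentences() can also be tried on its own. The following is only a minimal sketch, not part of the original script: the helper name leadingSentences() is made up for illustration, and it collects all matches with a single preg_match_all() call instead of the loop used above, so it skips stray leading punctuation rather than stopping at it.

```php
<?php
// Hypothetical helper (not in the original script): returns the first
// $max sentences of $text using one preg_match_all() call.
function leadingSentences(string $text, int $max): string
{
    // A sentence: optional whitespace, characters other than . ? !,
    // then one or more terminators and any trailing whitespace.
    preg_match_all('/\s*[^.?!]+[.?!]+\s*/', $text, $matches);

    // Keep only the first $max matched sentences and glue them back together.
    return implode('', array_slice($matches[0], 0, $max));
}

$text = "First one. Second one! Third one? Fourth one.";
echo rtrim(leadingSentences($text, 2)), "\n";
```

Text without any terminating ., ! or ? yields an empty string, which matches the behaviour of the loop-based function.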

Usage

cropsentence($str, $noof, $mode, $appendwith);

$str: The string / text to be cropped.

$noof: The number of characters, words or sentences to keep, depending on the $mode parameter.

$mode: One of three values:

1. 0 means crop by number of characters.
2. 1 means crop by number of sentences.
3. 2 means crop by number of words.

$appendwith: Any string, such as ‘…’ or ‘…more’, that is appended to the cropped text when it is returned.
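
One thing to keep in mind with mode 0: substr() counts bytes, so in UTF-8 feeds it can cut an accented character in half. Below is a minimal sketch of a character-aware variant. It assumes the mbstring extension is available and that the text is UTF-8; the function name cropCharsUtf8() is hypothetical and not part of the script above.

```php
<?php
// Hypothetical helper (not in the original script): crops $text to at most
// $maxChars characters, counting UTF-8 characters instead of bytes.
// Assumes the mbstring extension is loaded.
function cropCharsUtf8(string $text, int $maxChars, string $suffix = '…'): string
{
    // Short enough already: return the text unchanged, without a suffix.
    if (mb_strlen($text, 'UTF-8') <= $maxChars) {
        return $text;
    }

    // Cut on a character boundary and mark the result as cropped.
    return mb_substr($text, 0, $maxChars, 'UTF-8') . $suffix;
}

echo cropCharsUtf8('Crème brûlée is served cold', 12), "\n";
```

The same idea could be applied inside cropsentence() by swapping substr() and strlen() for their mb_ counterparts in the character mode.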

Download:

You can download the script here.

2 Responses to “PHP: Text cropping with defining the number of characters, words or sentences”

  1. I was just searching Google for “getLeadingSentences,” as I do every once in a while, to see where my code has ended up and I came across this blog. I’m glad that you found that function, that I wrote ages ago, useful.

    Thank you also for leaving that attribution intact. I’ve always believed in giving credit where credit is due, so I appreciate it when others do the same for my work. I’ve seen that function in other places with all of the comments removed and “stolen from somewhere” added to the top.

    –Jason

  2. Fahd Murtaza says:

    Yes Jason

    It's great that you shared it with the community. It's always great to share your code with others. It definitely saved me time. I ended up doing a way cooler thing with it, which would have taken more time without your code. I am mailing you the final output I delivered to the client. It's an XML (RSS) parser that has the above features as well.

    -Fahd
