
Functions to generate meta keywords based on keyword density?

I want to dynamically generate meta keywords for my blog and forum. Is there a PHP function that can determine keyword density in a string or a function that can group together identical values in an array?
If the first function exists, then I can turn the post into a string and use the words with the highest density as the keywords (other than common words).
If the second function exists, then I can explode() the post and group together the words that are repeated to determine density.

PHP

7 Comment(s)

@Jarrod1937 (Jul 24, 2010): That's a cool idea, but I'm not sure it would be worth the effort to develop. Meta keywords are completely ignored by Google and carry very little weight in the ranking algorithms of the other major search engines.

As for implementing it, I could be wrong, but I believe there is no stock PHP function that has this ability. I'd personally implement it by first taking the string and filtering out any stop words, then iterating through the remaining words, placing each one into an array and keeping a count for any repeated words. Then I'd sort the array by word occurrences, take the first few words, and add them to the meta keywords tag.

Depending on the size of your text, this could be a relatively intensive task, so you'd probably want to do it as part of your CMS, save the result into a new database field, and simply print the contents of that field when displaying the page.
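The caching idea above can be sketched roughly as follows. This is only an illustration under assumptions: `extractKeywords()`, `savePost()`, and `renderMetaTag()` are hypothetical helper names, a plain PHP array stands in for the database row, and the stop-word list is cut down to a handful of words. The point is that the counting runs once when the post is saved, and the page view only prints the stored string.

```php
<?php
// Hypothetical sketch: compute keywords once at save time, reuse them on every page view.

// Tiny stand-in stop-word list (assumption; a real list would be much longer).
const STOP_WORDS = ['a', 'and', 'are', 'in', 'is', 'of', 'the', 'to'];

// Return the $limit most frequent non-stop-words in $text.
function extractKeywords(string $text, int $limit = 5): array
{
    $words  = array_map('strtolower', str_word_count(strip_tags($text), 1));
    $words  = array_diff($words, STOP_WORDS);
    $counts = array_count_values($words);
    arsort($counts); // most frequent first
    return array_keys(array_slice($counts, 0, $limit, true));
}

// Build the "row" that would be written to the database
// (the array is a stand-in for a real table with a keywords column).
function savePost(string $body): array
{
    return [
        'body'     => $body,
        'keywords' => implode(',', extractKeywords($body)),
    ];
}

// Print the stored keywords; no counting happens at display time.
function renderMetaTag(array $post): string
{
    return '<meta name="keywords" content="'
        . htmlspecialchars($post['keywords'], ENT_QUOTES) . '">';
}

$post = savePost('<p>PHP arrays and PHP strings: counting words in PHP.</p>');
echo renderMetaTag($post), "\n";
```

With a real database, `savePost()` would presumably become an INSERT or UPDATE that writes the keywords column alongside the post body.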

@NogDog (Jul 25, 2010): While I agree that these days the keywords meta tag is of limited use (and can actually be damaging if abused), I felt like coming up with a solution anyway.
[code=php]
<?php
class Keywords
{
private $stopWords = array("a", "about", "above", "above", "across",
"after", "afterwards", "again", "against", "all", "almost", "alone",
"along", "already", "also", "although", "always", "am", "among",
"amongst", "amoungst", "amount", "an", "and", "another", "any", "anyhow",
"anyone", "anything", "anyway", "anywhere", "are", "around", "as", "at",
"back", "be", "became", "because", "become", "becomes", "becoming",
"been", "before", "beforehand", "behind", "being", "below", "beside",
"besides", "between", "beyond", "bill", "both", "bottom", "but", "by",
"call", "can", "cannot", "cant", "co", "con", "could", "couldnt", "cry",
"de", "describe", "detail", "do", "done", "down", "due", "during", "each",
"eg", "eight", "either", "eleven", "else", "elsewhere", "empty", "enough",
"etc", "even", "ever", "every", "everyone", "everything", "everywhere",
"except", "few", "fifteen", "fifty", "fill", "find", "fire", "first",
"five", "for", "former", "formerly", "forty", "found", "four", "from",
"front", "full", "further", "get", "give", "go", "had", "has", "hasnt",
"have", "he", "hence", "her", "here", "hereafter", "hereby", "herein",
"hereupon", "hers", "herself", "him", "himself", "his", "how", "however",
"hundred", "ie", "if", "in", "inc", "indeed", "interest", "into", "is",
"it", "its", "itself", "keep", "last", "latter", "latterly", "least",
"less", "ltd", "made", "many", "may", "me", "meanwhile", "might", "mill",
"mine", "more", "moreover", "most", "mostly", "move", "much", "must",
"my", "myself", "name", "namely", "neither", "never", "nevertheless",
"next", "nine", "no", "nobody", "none", "noone", "nor", "not", "nothing",
"now", "nowhere", "of", "off", "often", "on", "once", "one", "only",
"onto", "or", "other", "others", "otherwise", "our", "ours", "ourselves",
"out", "over", "own", "part", "per", "perhaps", "please", "put", "rather",
"re", "same", "see", "seem", "seemed", "seeming", "seems", "serious",
"several", "she", "should", "show", "side", "since", "sincere", "six",
"sixty", "so", "some", "somehow", "someone", "something", "sometime",
"sometimes", "somewhere", "still", "such", "system", "take", "ten",
"than", "that", "the", "their", "them", "themselves", "then", "thence",
"there", "thereafter", "thereby", "therefore", "therein", "thereupon",
"these", "they", "thick", "thin", "third", "this", "those", "though",
"three", "through", "throughout", "thru", "thus", "to", "together", "too",
"top", "toward", "towards", "twelve", "twenty", "two", "un", "under",
"until", "up", "upon", "us", "very", "via", "was", "we", "well", "were",
"what", "whatever", "when", "whence", "whenever", "where", "whereafter",
"whereas", "whereby", "wherein", "whereupon", "wherever", "whether",
"which", "while", "whither", "who", "whoever", "whole", "whom", "whose",
"why", "will", "with", "within", "without", "would", "yet", "you", "your",
"yours", "yourself", "yourselves", "the"
);
/**
* Get most common non-stop-words in string
* @return array
* @param string $text
* @param int $nbrWords Number of words to return, default = 5
*/
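public function getKeywords($text, $nbrWords = 5)
{
    // Split the text into words, then lowercase each one in place
    $words = str_word_count($text, 1);
    array_walk($words, array($this, 'filter'));
    // Drop stop words, then count how often each remaining word occurs
    $words = array_diff($words, $this->stopWords);
    $wordCount = array_count_values($words);
    // Most frequent first; keep only the top $nbrWords
    arsort($wordCount);
    $wordCount = array_slice($wordCount, 0, $nbrWords);
    return array_keys($wordCount);
}
// array_walk() callback: lowercases a word by reference
private function filter(&$val, $key)
{
    $val = strtolower($val);
}
// Empties the stop-word list (not called anywhere in this example)
private function setStopWords()
{
    $this->stopWords = array();
}
}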
// USAGE:
$text = "
Four score and seven years ago, our fathers brought forth
upon this continent a new nation, conceived in liberty
and dedicated to the proposition that all men are created equal.
Now we are engaged in a great civil war, testing whether this
nation or any other nation so conceived and so dedicated
can long endure.
";
$test = new Keywords();
$keywords = $test->getKeywords($text, 3);
echo implode(",", $keywords); // nation,conceived,dedicated
[/code]

(stop-word list taken from http://armandbrahaj.blog.al/2009/04/14/list-of-english-stop-words/)

@narutodude000 (author, Jul 25, 2010): Thanks, I'll try that once my web host fixes my database (I'm considering switching).

@narutodude000 (author, Jul 26, 2010): NogDog, it works (almost) perfectly.

The only problem is that contractions like "I'll", "should've", and "can't" are included in the keywords, because str_word_count() keeps the apostrophe as part of the word, so they never match a stop word. Here's a fixed version for whoever else needs it:

[code=php]class Keywords
{
private $stopWords = array("a", "about", "above", "above", "across",
"after", "afterwards", "again", "against", "all", "almost", "alone",
"along", "already", "also", "although", "always", "am", "among",
"amongst", "amoungst", "amount", "an", "and", "another", "any", "anyhow",
"anyone", "anything", "anyway", "anywhere", "are", "around", "as", "at",
"back", "be", "became", "because", "become", "becomes", "becoming",
"been", "before", "beforehand", "behind", "being", "below", "beside",
"besides", "between", "beyond", "bill", "both", "bottom", "but", "by",
"call", "can", "cannot", "cant", "co", "con", "could", "couldn't",
"de", "detail", "do", "done", "down", "due", "during", "each",
"eg", "eight", "either", "eleven", "else", "elsewhere", "empty", "enough",
"etc", "even", "ever", "every", "everyone", "everything", "everywhere",
"except", "few", "fifteen", "fifty", "fill", "find", "first",
"five", "for", "former", "formerly", "forty", "found", "four", "from",
"front", "full", "further", "get", "give", "go", "had", "has", "hasnt",
"have", "he", "hence", "her", "here", "hereafter", "hereby", "herein",
"hereupon", "hers", "herself", "him", "himself", "his", "how", "however",
"hundred", "ie", "if", "in", "inc", "indeed", "interest", "into", "is",
"it", "its", "itself", "keep", "last", "latter", "latterly", "least",
"less", "ltd", "made", "many", "may", "me", "meanwhile", "might", "mill",
"mine", "more", "moreover", "most", "mostly", "move", "much", "must",
"my", "myself", "name", "namely", "neither", "never", "nevertheless",
"next", "nine", "no", "nobody", "none", "noone", "nor", "not", "nothing",
"now", "nowhere", "of", "off", "often", "on", "once", "one", "only",
"onto", "or", "other", "others", "otherwise", "our", "ours", "ourselves",
"out", "over", "own", "part", "per", "perhaps", "please", "put", "rather",
"re", "same", "see", "seem", "seemed", "seeming", "seems", "serious",
"several", "she", "should", "show", "side", "since", "sincere", "six",
"sixty", "so", "some", "somehow", "someone", "something", "sometime",
"sometimes", "somewhere", "still", "such", "take", "ten",
"than", "that", "the", "their", "them", "themselves", "then", "thence",
"there", "thereafter", "thereby", "therefore", "therein", "thereupon",
"these", "they", "thin", "third", "this", "those", "though",
"three", "through", "throughout", "thru", "thus", "to", "together", "too",
"top", "toward", "towards", "twelve", "twenty", "two", "un", "under",
"until", "up", "upon", "us", "very", "via", "was", "we", "well", "were",
"what", "whatever", "when", "whence", "whenever", "where", "whereafter",
"whereas", "whereby", "wherein", "whereupon", "wherever", "whether",
"which", "while", "whither", "who", "whoever", "whole", "whom", "whose",
"why", "will", "with", "within", "without", "would", "yet", "you", "your",
"yours", "yourself", "yourselves", "ll", "t", "s", "d", "ve", "m"
);
/**
* Get most common non-stop-words in string
* @return array
* @param string $text
* @param int $nbrWords Number of words to return, default = 5
*/
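public function getKeywords($text, $nbrWords = 5)
{
    // Replace apostrophes with spaces so "I'll" splits into "I" and "ll"
    $text = preg_replace("/'/", ' ', $text);
    // Split the text into words, then lowercase each one in place
    $words = str_word_count($text, 1);
    array_walk($words, array($this, 'filter'));
    // Drop stop words, then count how often each remaining word occurs
    $words = array_diff($words, $this->stopWords);
    $wordCount = array_count_values($words);
    // Most frequent first; keep only the top $nbrWords
    arsort($wordCount);
    $wordCount = array_slice($wordCount, 0, $nbrWords);
    return array_keys($wordCount);
}
// array_walk() callback: lowercases a word by reference
private function filter(&$val, $key)
{
    $val = strtolower($val);
}
// Empties the stop-word list (not called anywhere in this example)
private function setStopWords()
{
    $this->stopWords = array();
}
}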

// Strip HTML tags and return the five most common keywords as a comma-separated string
function meta_keywords($text) {
    $text = strtolower(strip_tags($text));
    $post = new Keywords();
    $keywords = $post->getKeywords($text, 5);
    return implode(",", $keywords);
}

$text = "Four score and seven years ago, our fathers brought forth
upon this continent a new nation, conceived in liberty
and dedicated to the proposition that all men are created equal.
Now we are engaged in a great civil war, testing whether this
nation or any other nation so conceived and so dedicated
can long endure.";

echo meta_keywords($text);
[/code]

@NogDog (Jul 26, 2010): I might change that preg_replace() to:
[code=php]
$text = preg_replace('/\'\w*\b/', ' ', $text);
[/code]

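That would take care of both contractions and possessives by removing the apostrophe together with whatever follows it. You might need to add processing for a few special cases, such as "don't", which would otherwise be left as "don".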
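One possible way to handle the "don't" case is sketched below. It is an assumption on my part, not something posted in the thread: the hypothetical `wordsWithoutContractions()` helper first drops whole negative contractions (they are stop-word material anyway) and only then strips the remaining apostrophe suffixes. It also assumes apostrophes appear only inside words, not as quotation marks.

```php
<?php
// Hypothetical sketch: strip contractions and possessives before counting words.
function wordsWithoutContractions(string $text): array
{
    // Drop whole negative contractions first ("don't", "can't", "shouldn't"),
    // since stripping only the "'t" would leave fragments such as "don".
    $text = preg_replace('/\b\w+n\'t\b/i', ' ', $text);
    // Then drop the remaining apostrophe suffixes ("I'll" -> "I", "host's" -> "host").
    $text = preg_replace('/\'\w*\b/', ' ', $text);
    // Split into words and lowercase them, as the Keywords class does.
    return array_map('strtolower', str_word_count($text, 1));
}

$words = wordsWithoutContractions("I'll admit the host's database doesn't work, so don't wait.");
echo implode(',', $words), "\n"; // i,admit,the,host,database,work,so,wait
```

The resulting array could then be passed through the same stop-word filter and `array_count_values()` step as before.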

@sohguanh (Jul 27, 2010): Since the stop-word list can grow or shrink over time, it would be better to store it in a text file or a database table. That way, any change to the list only means editing the text file or running SQL against the table, and your PHP program stays unchanged.

@NogDog (Jul 27, 2010): Yes: I assumed that was self-evident, but it's probably good that you mentioned it, since it may not in fact be self-evident to everyone.
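A minimal sketch of the text-file variant, assuming one stop word per line. The `loadStopWords()` helper and the temporary file used for the demo are hypothetical and only there to keep the example self-contained.

```php
<?php
// Hypothetical sketch: keep the stop words in a plain text file, one word per line.

// Read the stop-word file into a de-duplicated array of lowercase words.
function loadStopWords(string $path): array
{
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return []; // unreadable file: fall back to no filtering
    }
    $words = array_map('strtolower', array_map('trim', $lines));
    return array_values(array_unique($words));
}

// Demo: write a small list to a temporary file, then load it back.
$path = tempnam(sys_get_temp_dir(), 'stop');
file_put_contents($path, "The\nand\n\nof\nthe\n");

$stopWords = loadStopWords($path);
$words     = array_map('strtolower', str_word_count('The history of PHP and the web', 1));
$keywords  = array_values(array_diff($words, $stopWords));

unlink($path); // clean up the demo file
echo implode(',', $keywords), "\n"; // history,php,web
```

The loaded array could then replace the hard-coded `$stopWords` property of the `Keywords` class, for example through a constructor argument; that wiring is not shown in the thread.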