
word frequency count script

Hello people,

I’m looking for a script that finds (for example) the 5 most frequently used words in an input text block, ideally with an option to count only words with more than 4 characters.
Does anyone know where to find one?
Thanks!

PHP

2 Comments

@bokeh Apr 24, 2006 — [code=php]function MostPopularWords($target, $number_of_words = 5, $minimum_word_length = 5)
{
    // Collect every run of at least $minimum_word_length letters (a-z) from the lowercased text.
    preg_match_all('/[a-z]{'.$minimum_word_length.',}/', strtolower($target), $matches);
    $words = array();
    // Count how often each word occurs.
    foreach($matches[0] as $word)
    {
        $words[$word] = isset($words[$word]) ? $words[$word] + 1 : 1;
    }
    // Sort by count, highest first; string keys (the words) are preserved.
    array_multisort($words, SORT_NUMERIC, SORT_DESC);
    return array_keys(array_slice($words, 0, $number_of_words));
}[/code]
Returns an array of the [I]n[/I] most popular words, most frequent first. Example use:[code=php]<?php

$target = <<<END
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Fusce neque. Nulla auctor. Curabitur commodo mattis tortor. Sed ut nulla. Donec vel sem. Cras congue. Nulla sollicitudin, felis a suscipit bibendum, nisl arcu tristique lacus, a eleifend massa odio ac odio. Duis tempor, risus id aliquam condimentum, sapien quam facilisis dui, id sodales nunc eros ac lacus. Phasellus ante mi, volutpat ut, mattis vitae, sollicitudin a, nibh. Nam dapibus nisi id lectus. Aenean non arcu. Quisque vehicula. Sed ac nulla nec sapien consectetuer eleifend. Ut porttitor volutpat dui. Ut porta felis ut nisl. Cras placerat accumsan dui. Vestibulum turpis nibh, ornare nec, tristique vitae, eleifend non, libero. Donec dignissim. Vestibulum auctor lectus et ipsum.

In eget tellus at nunc molestie ultricies. Nullam eget lacus vitae leo varius dignissim. In in lacus. Nunc nec est. Praesent quam lacus, ornare id, pharetra non, feugiat eget, enim. Integer lobortis semper risus. Duis tristique convallis erat. Fusce sed nibh. Etiam vitae nisi a sem iaculis aliquam. Suspendisse non ipsum ac erat lobortis dapibus. Nam porttitor fringilla risus.

Aenean gravida. Donec sagittis luctus mi. Pellentesque eu sem ut ligula pulvinar placerat. Ut tempor pharetra nulla. Nulla facilisi. Donec adipiscing, purus eget faucibus mattis, magna dolor interdum ligula, ac sollicitudin ligula lorem at risus. Cras orci dolor, pharetra eu, molestie vel, pharetra vel, quam. Ut tempus elit eu orci. Nulla fermentum, lectus at egestas dignissim, mauris nisl convallis justo, eget semper lorem dolor quis metus. In hac habitasse platea dictumst. Duis vel pede. Ut non est et odio vulputate tristique.

In sed velit interdum felis eleifend aliquet. Ut porta pellentesque dolor. Fusce rhoncus vestibulum quam. Aliquam erat volutpat. Phasellus magna. Sed sed enim. Donec non ipsum. Nullam dapibus, lorem non commodo placerat, risus lacus lobortis magna, scelerisque ornare dui elit vitae neque. Duis felis lorem, varius quis, aliquet non, mollis non, magna. Morbi consectetuer. Donec tempor. Suspendisse potenti. Nam vitae nunc eget urna hendrerit faucibus. Sed nec odio. Etiam sapien lorem, vestibulum non, lacinia nec, eleifend sed, nisl. Quisque mattis, orci non vestibulum luctus, urna dolor dictum massa, aliquam scelerisque leo leo sit amet arcu.

Nullam eu nulla a mi auctor auctor. Mauris et augue eu metus consectetuer nonummy. Curabitur tincidunt purus. Integer vel mi id tellus sodales ornare. Fusce pulvinar. In lorem purus, volutpat et, faucibus id, condimentum ac, turpis. Duis neque. Donec molestie scelerisque est. Ut euismod nunc eu mauris. Etiam quam urna, vestibulum sit amet, volutpat quis, fringilla eget, lectus. Sed vestibulum ultricies urna.
END;

foreach(MostPopularWords($target) as $word)
{
    echo "$word <br>\n";
}

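function MostPopularWords($target, $number_of_words = 5, $minimum_word_length = 5)
{
    // Collect every run of at least $minimum_word_length letters (a-z) from the lowercased text.
    preg_match_all('/[a-z]{'.$minimum_word_length.',}/', strtolower($target), $matches);
    $words = array();
    // Count how often each word occurs.
    foreach($matches[0] as $word)
    {
        $words[$word] = isset($words[$word]) ? $words[$word] + 1 : 1;
    }
    // Sort by count, highest first; string keys (the words) are preserved.
    array_multisort($words, SORT_NUMERIC, SORT_DESC);
    return array_keys(array_slice($words, 0, $number_of_words));
}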

?>[/code]
Prints:[CODE]nulla
lorem
donec
vestibulum
dolor[/CODE]
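If the counts themselves are also needed, a variant along these lines might work. This is only a sketch: wordFrequencies() and its parameter names are made up for illustration and are not part of the function above, and the type declarations assume PHP 7 or later. It uses PHP's built-in array_count_values() and arsort() in place of the manual loop and array_multisort(), and returns word => count pairs rather than just the words.

```php
<?php
// Hypothetical helper (not from the post above): like MostPopularWords(),
// but returns word => count pairs instead of only the words.
function wordFrequencies(string $text, int $limit = 5, int $minLength = 5): array
{
    // Same matching rule as the original: runs of a-z with at least $minLength letters.
    preg_match_all('/[a-z]{' . $minLength . ',}/', strtolower($text), $matches);

    // array_count_values() tallies how often each matched word occurs.
    $counts = array_count_values($matches[0]);

    // arsort() sorts by count, highest first, and keeps the word keys.
    arsort($counts, SORT_NUMERIC);

    // Keep only the first $limit entries, preserving the word keys.
    return array_slice($counts, 0, $limit, true);
}

$sample = 'apple banana apple cherry banana apple kiwi';
foreach (wordFrequencies($sample, 2) as $word => $count) {
    echo "$word: $count\n";
}
```

With the sample string this prints "apple: 3" and "banana: 2"; "kiwi" is skipped because it is shorter than five letters. Words with equal counts may come out in a different order than with the function above.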
@Bobby_S (author) Apr 24, 2006 — Geeh, that's cool. Thanks!