[RESOLVED] RegExp parsing on a TXT file

First off, my main data is stored in a file called “news.txt” (MySQL is another option, but this approach suits its purpose well).
Its contents:

[code]
<!-- Start New Post -->
<!-- Start Author -->
Ultimater 6-18-05
<!-- End Author-->
<!-- Start Clan News -->
<p></p>
<p>Welcome to our new member NightZon! The clan will meet tomorrow at 4:00 eastern time on 6-18-05 in the lounge</p>
<p>
</p>
<!-- End Clan News -->
<!-- Start Site Update -->
<p></p>
<p>Building a new skin for the site. It will have a glassy look to it.</p>
<!-- End Site Update -->
<!-- End New Post -->

<!-- Start New Post -->
<!-- Start Author -->
HarryPothead 8-29-05
<!-- End Author-->
<!-- Start Clan News -->
<p>Congrats to BurningShadow onbehalf of his co-leader promotion</p>
<p>the new ranking system for our clan that involves a system of tests will
(hopefully) be finished in its design by September 1st, its implimentation
date is yet to be set, but it would apear as though a large amount of our
clan aproves of the new system.. where as the old system many people were
objecting to ;) the new system will hopefully make ranks a more.. apeasing
thing to get involved in, where as before many people were “i have a
rank?! what!?” yeah.. that was not how i want our clan to be</p>
<p>MC raid last night!!! even tho i was realy the only person in it.. but i
will have screenshots added to the WoW shot section asap! enjoy!</p>

<p>
</p>
<!-- End Clan News -->
<!-- Start Site Update -->
<p></p>
<p>i apologize for not having made a news post for a wile, i dont want this
to be a place where people say.. oh.. the website.. there is nothing usefull
there.. i will work harder to get all important news updates recorded</p>
<p>Ultimater should finish the news blog before long…</p>
<!-- End Site Update -->
<!-- End New Post -->
[/code]

Now I need to parse the data into three major arrays: $authors, $clannews, and $siteupdates.

Open the above file using PHP; again, it’s called “news.txt” and is located in the root directory.
Next, store all of its contents in a scalar variable named $template for PHP to work with.
Then run the following regular expression on $template and store all the matches in a new array called $newposts:

[code]
/<!--\s*Start\s+New\s+Post\s*-->(.*?)<!--\s*End\s+New\s+Post\s*-->/is (run it with preg_match_all; PHP's PCRE has no "g" flag)
[/code]
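A minimal sketch of this first step, written for a current PHP release. The helper name `extractPosts` and the shortened sample text are made up for illustration; in the real script `$template` would come from `file_get_contents('news.txt')`.

```php
<?php
// Hypothetical helper: returns the inner text of every "New Post" block.
function extractPosts(string $template): array
{
    // "i" = case-insensitive, "s" = the dot also matches newlines.
    // preg_match_all() already collects every match, so no "g" flag is needed.
    $pattern = '/<!--\s*Start\s+New\s+Post\s*-->(.*?)<!--\s*End\s+New\s+Post\s*-->/is';
    preg_match_all($pattern, $template, $matches);
    return $matches[1]; // capture group 1: the text between the markers
}

// Stand-in for the file contents (assumption: same markers as news.txt).
$template = <<<TXT
<!-- Start New Post -->
first post body
<!-- End New Post -->

<!-- Start New Post -->
second post body
<!-- End New Post -->
TXT;

$newposts = extractPosts($template);
print_r(array_map('trim', $newposts));
```

The lazy `(.*?)` matters here: a greedy `(.*)` would swallow everything from the first start marker to the last end marker and return a single match.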

Loop through the array $newposts and run the following three regular expressions, one by one, on each element.

[code]
/<!--\s*Start\s+Author\s*-->(.*?)<!--\s*End\s+Author\s*-->/is (append the matches to an array $authors)
/<!--\s*Start\s+Clan\s+News\s*-->(.*?)<!--\s*End\s+Clan\s+News\s*-->/is (append the matches to an array $clannews)
/<!--\s*Start\s+Site\s+Update\s*-->(.*?)<!--\s*End\s+Site\s+Update\s*-->/is (append the matches to an array $siteupdates)
[/code]
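The loop could look roughly like the sketch below. The one-element `$newposts` array is a shortened stand-in for the result of the previous step, the `$result` map is just one way to organise the three targets, and the `trim()` call is an addition that is not part of the description above.

```php
<?php
// Illustrative input: in the real script $newposts comes from the previous step.
$newposts = array(
    "<!-- Start Author -->\nUltimater 6-18-05\n<!-- End Author-->\n" .
    "<!-- Start Clan News -->\n<p>Clan text</p>\n<!-- End Clan News -->\n" .
    "<!-- Start Site Update -->\n<p>Site text</p>\n<!-- End Site Update -->\n",
);

// One pattern per target array.
$patterns = array(
    'authors'     => '/<!--\s*Start\s+Author\s*-->(.*?)<!--\s*End\s+Author\s*-->/is',
    'clannews'    => '/<!--\s*Start\s+Clan\s+News\s*-->(.*?)<!--\s*End\s+Clan\s+News\s*-->/is',
    'siteupdates' => '/<!--\s*Start\s+Site\s+Update\s*-->(.*?)<!--\s*End\s+Site\s+Update\s*-->/is',
);

$result = array('authors' => array(), 'clannews' => array(), 'siteupdates' => array());

foreach ($newposts as $post) {
    foreach ($patterns as $key => $pattern) {
        // Each post holds one section of each kind, so preg_match() is enough.
        if (preg_match($pattern, $post, $m)) {
            $result[$key][] = trim($m[1]); // trim() drops the surrounding newlines
        }
    }
}

$authors     = $result['authors'];
$clannews    = $result['clannews'];
$siteupdates = $result['siteupdates'];

print_r($authors);
print_r($clannews);
print_r($siteupdates);
```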

PHP should now hold three one-dimensional arrays to work with further:

[code]
$authors=array('Ultimater 6-18-05', 'HarryPothead 8-29-05');

$clannews=array('<p></p>
<p>Welcome to our new member NightZon! The clan will meet tomorrow at 4:00 eastern time on 6-18-05 in the lounge</p>
<p>
</p>',
'<p>Congrats to BurningShadow onbehalf of his co-leader promotion</p>
<p>the new ranking system for our clan that involves a system of tests will
(hopefully) be finished in its design by September 1st, its implimentation
date is yet to be set, but it would apear as though a large amount of our
clan aproves of the new system.. where as the old system many people were
objecting to ;) the new system will hopefully make ranks a more.. apeasing
thing to get involved in, where as before many people were “i have a
rank?! what!?” yeah.. that was not how i want our clan to be</p>
<p>MC raid last night!!! even tho i was realy the only person in it.. but i
will have screenshots added to the WoW shot section asap! enjoy!</p>

<p>
</p>');

$siteupdates=array('<p></p>
<p>Building a new skin for the site. It will have a glassy look to it.</p>',
'<p></p>
<p>i apologize for not having made a news post for a wile, i dont want this
to be a place where people say.. oh.. the website.. there is nothing usefull
there.. i will work harder to get all important news updates recorded</p>
<p>Ultimater should finish the news blog before long…</p>');
[/code]

That is all.


6 Comments

@LiLcRaZyFuZzY, Dec 16, 2005: What's the problem?
@SpectreReturns, Dec 16, 2005: I was actually wondering the same thing, but didn't think it necessary to post. (But now I do; I'm bored.)
@arto, Dec 16, 2005: You really are reinventing the wheel here, Ultimater.

Use XML; that's what it is for. Let's say your data file looks like this: [code=php]<news>
<post>
<author>Ultimater</author>
<date>6-18-05</date>
<clan>
<paragraph>Welcome to our new member NightZon! The clan will meet tomorrow at 4:00 eastern time on 6-18-05 in the lounge</paragraph>
</clan>
<site>
<paragraph>Building a new skin for the site. It will have a glassy look to it.</paragraph>
</site>
</post>
<post>
<author>HarryPothead</author>
<date>8-29-05</date>
<clan>
<paragraph>Congrats to BurningShadow onbehalf of his co-leader promotion</paragraph>
<paragraph>the new ranking system for our clan that involves a system of tests will (hopefully) be finished in its design by September 1st, its implimentation date is yet to be set, but it would apear as though a large amount of our clan aproves of the new system.. where as the old system many people were objecting to ;) the new system will hopefully make ranks a more.. apeasing thing to get involved in, where as before many people were "i have a rank?! what!?" yeah.. that was not how i want our clan to be</paragraph>
<paragraph>MC raid last night!!! even tho i was realy the only person in it.. but i will have screenshots added to the WoW shot section asap! enjoy!</paragraph>
</clan>
<site>
<paragraph>i apologize for not having made a news post for a wile, i dont want this to be a place where people say.. oh.. the website.. there is nothing usefull there.. i will work harder to get all important news updates recorded</paragraph>
<paragraph>Ultimater should finish the news blog before long...</paragraph>
</site>
</post>
</news>[/code]
Now, you can use the SimpleXML extension from PHP5: [code=php]$file='news.xml';

if (!$xml=simplexml_load_file($file))
die("couldn't load $file\n");[/code]
The [I]$xml[/I] object contains all the data from the file. For example:

[I]$xml->post[/I] can be looped over and indexed like an array with two posts in it,

[I]$xml->post[1]->clan->paragraph[1][/I] is the second paragraph of the clan news in the second post ("the new ranking system...").
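To make the indexing concrete, here is a small self-contained sketch. It uses `simplexml_load_string` on a cut-down inline document (an assumption, so it runs without a `news.xml` file); the element names follow the sample file above and the paragraph texts are shortened.

```php
<?php
// Cut-down stand-in for news.xml; element names follow the sample above.
$xmlString = <<<XML
<news>
  <post>
    <author>Ultimater</author>
    <clan><paragraph>Welcome to our new member</paragraph></clan>
  </post>
  <post>
    <author>HarryPothead</author>
    <clan>
      <paragraph>Congrats to BurningShadow</paragraph>
      <paragraph>the new ranking system</paragraph>
    </clan>
  </post>
</news>
XML;

$xml = simplexml_load_string($xmlString);
if ($xml === false) {
    die("couldn't parse the sample XML\n");
}

echo count($xml->post), "\n";                             // 2
echo (string) $xml->post[1]->author, "\n";                // HarryPothead
echo (string) $xml->post[1]->clan->paragraph[1], "\n";    // the new ranking system
```

Indexes start at 0, so `post[1]` is the second post. The `(string)` cast turns the element object into its text content.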

So you can use loops to output news to html: [code=php]foreach ($xml->post as $post) {
echo('<div class="cssPost">');
echo('<div class="cssPostHeader">'.$post->author.', '.$post->date.'</div>');
echo('<div class="cssClanNewsHeader">Clan news:</div>');
foreach ($post->clan->paragraph as $par)
echo('<div class="cssClanNewsPar">'.$par.'</div>');
echo('<div class="cssSiteUpdateHeader">Site update:</div>');
foreach ($post->site->paragraph as $par)
echo('<div class="cssSiteUpdatePar">'.$par.'</div>');
echo('</div>');
}[/code]
Simple, eh? Yes, that's why they call it SimpleXML.

Now, attach some nice stylesheet and voila: [code=php]<style type="text/css">
.cssPost {
border: 1px solid black;
margin: 10px;
padding: 5px;
}
.cssPostHeader {
font-weight: bold;
color: blue;
text-decoration: underline;
}
.cssClanNewsHeader {
font-weight: bold;
}
.cssClanNewsPar {
margin-left: 10px;
margin-bottom: 10px;
}
.cssSiteUpdateHeader {
font-weight: bold;
}
.cssSiteUpdatePar {
margin-left: 10px;
margin-bottom: 10px;
}
</style>[/code]

Also, don't use empty paragraphs for formatting; this should be done with CSS.

Arto

PS. Of course, I would still use a database for something like this.
@Scleppel, Dec 16, 2005: If you're not going to use XML, this might work for you:
[code=php]<?php

// catch all errors while developing.
error_reporting(E_ALL);

// you might want to catch errors opening the file.
$file_contents = file_get_contents($_SERVER['DOCUMENT_ROOT'] . '/news.txt');

$regex = '#<!--\s*Start\s*New\s*Post\s*-->\s*'.
'<!--\s*Start\s*Author\s*-->' .
'(.*)' .
'<!--\s*End\s*Author\s*-->\s*' .
'<!--\s*Start\s*Clan\s*News\s*-->' .
'(.*)' .
'<!--\s*End\s*Clan\s*News\s*-->\s*' .
'<!--\s*Start\s*Site\s*Update\s*-->' .
'(.*)' .
'<!--\s*End\s*Site\s*Update\s*-->\s*' .
'<!--\s*End\s*New\s*Post\s*-->#misU';

preg_match_all($regex,$file_contents,$output);

/* The code below this comment trims the values before putting them into
their named arrays, if you don't care or will trim later, use this instead:

$authors = $output[1];
$clannews = $output[2];
$siteupdates = $output[3];

*/
$authors = array();
$clannews = array();
$siteupdates = array();

foreach($output[1] as $val)
{
$authors[] = trim($val);
}

foreach($output[2] as $val)
{
$clannews[] = trim($val);
}

foreach($output[3] as $val)
{
$siteupdates[] = trim($val);
}

// test it's done it right.
header('Content-Type: text/plain');
print_r($authors);
print_r($clannews);
print_r($siteupdates);

?>[/code]
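On the "you might want to catch errors opening the file" comment: one possible way to do it, sketched for a current PHP release. The helper name `readNewsFile` and the file location are assumptions for illustration.

```php
<?php
// Hypothetical helper: returns the file contents, or null if the file cannot be read.
function readNewsFile(string $path): ?string
{
    // Check first so file_get_contents() does not emit a warning.
    if (!is_readable($path)) {
        return null;
    }
    $contents = file_get_contents($path);
    return $contents === false ? null : $contents;
}

$path = __DIR__ . '/news.txt'; // assumed location of the data file
$contents = readNewsFile($path);

if ($contents === null) {
    echo "Could not read $path\n";
} else {
    echo strlen($contents), " bytes read\n";
}
```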
@Ultimater (author), Dec 16, 2005: @arto

Thank you for writing the code. It looks magnificent, simple, and well-structured. The only problem is that my server sux.
Fatal error: Call to undefined function: simplexml_load_file() in /var/www/html/phptest.php on line 4

Maybe when I upgrade to PHP5 in the future I'll try it out for learning purposes and be amazed.
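Until then, a script could check for SimpleXML before calling it instead of dying with a fatal error. A tiny sketch; the `$parser` variable and the two labels are made up for illustration:

```php
<?php
// Pick a parsing strategy based on what this PHP build supports.
// simplexml_load_file() only exists when the SimpleXML extension is loaded (PHP 5+).
$parser = function_exists('simplexml_load_file') ? 'simplexml' : 'regex';

echo "Using the $parser parser on PHP ", PHP_VERSION, "\n";
```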


@Scleppel

That worked perfectly on my server! Thank you!

Thank you both. You were just as helpful as I am in the JavaScript forum for many others!
@SpectreReturns, Dec 17, 2005: [QUOTE]Thank you both. You were just as helpful as I am in the JavaScript forum for many others![/QUOTE] ****y bastard... (I love you anyway)