
PHP 4 parsing xml or txt files

Hi all,

Alas, I found out my host is only running php4. I read that php5 has new simplexml functions that make life easier.

I wish to read XML or text files and convert them (styled, tabulated) into HTML.

Is there an easy way to do this? I assume using xml is better than trying to read in text.

I do not want to simply cut and paste the whole file but rather look for certain tags to place and style them differently. As such I assume XML is a better choice?

Thanks.


10 Comments

@neilemrich, Oct 10, 2006: I have used XML input before in the form of RSS, and I dug up a couple of functions I wrote:

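[CODE]// Return the text between the first <$val> and </$val> in $contents.
// Returns nothing (NULL) if $val does not occur in $contents at all.
function get_xmlvalue($val, $contents) {
    if (ereg($val, $contents)==true) {
        $chars = strlen($val);
        $valbegin = "<".$val.">";
        $valend = "</".$val.">";
        // strpos() returns false when a tag is missing (for example when the
        // opening tag carries attributes), and the result is then unreliable.
        $beginpos = strpos($contents, $valbegin);
        $endpos = strpos($contents, $valend);
        // Length of the text between the tags; $chars+2 is the length of "<$val>".
        $dif = ($endpos - $beginpos) - ($chars+2);

        return substr($contents, ($endpos-$dif), $dif);
    }
}

// Split $contents on "<$val" and return the content of each <$val> element.
// Note that the returned array is indexed from 1, not 0.
function get_xmlvalue_multi($val, $contents) {
    $split_data = explode("<".$val, $contents);
    $num = count($split_data);
    for($i=1; $i<$num; $i++) {
        $data[$i] = get_xmlvalue($val, "<".$val.$split_data[$i]);
    }
    return $data;
}[/CODE]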


The way it works: call file_get_contents on the URL or file you want to use, then the get_xmlvalue_multi function splits it into the different records using $val as the marker, e.g. in RSS it's item (<item></item>).

The get_xmlvalue function can then be used to get each field of an item by its tag name ($val).

For example:

your xml;

<item>

<name>This is the name</name>

<description>this is the description</description>

</item>

<item>

<name>This is the name2</name>

<description>this is the description2</description>

</item>


Then you could get this data like this;

[CODE]$items = get_xmlvalue_multi("item", file_get_contents(XML));
for($i=0; $i<count($items); $i++) {
echo 'Name of item '.$i.' = '.get_xmlvalue("name", $item[$i]).'<br />';
echo 'Description of item '.$i.' = '.get_xmlvalue("description", $item[$i]).'<br /><br />';
}[/CODE]


which would output;
[CODE]This is the name
this is the description

This is the name2
this is the description2[/CODE]


Hope this makes sense

@NogDog, Oct 10, 2006: [url=http://www.php.net/xml]XML Parser Functions[/url]

[url=http://pear.php.net/manual/en/package.xml.xml-parser.php]Pear XML_Parser[/url]
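
As a rough illustration of the first link, here is a minimal sketch that uses xml_parse_into_struct (one of the XML Parser functions, available in PHP 4) to pull attributes and text out of every <e4> element and print them as table rows. The sample XML and the $rows layout are assumptions made for the example, not anything from a real file.

```php
<?php
// Sketch only: collect the attributes and text of every <e4> element
// using the expat-based XML Parser functions available in PHP 4.
$xml = '<e1><e2><e3></e3>'
     . '<e4 attr1="xx" attr2="yy" attr3="zz">Some Name</e4>'
     . '<e4 attr1="aa" attr2="bb" attr3="cc">Some Name2</e4>'
     . '</e2></e1>';

$parser = xml_parser_create();
// Keep tag and attribute names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parse_into_struct($parser, $xml, $values, $index);

$rows = array();
foreach ($values as $node) {
    // "complete" marks an element with no child elements, such as <e4>.
    if ($node['tag'] == 'e4' && $node['type'] == 'complete') {
        $rows[] = array(
            'name'  => isset($node['value']) ? $node['value'] : '',
            'attr1' => $node['attributes']['attr1'],
            'attr2' => $node['attributes']['attr2'],
        );
    }
}

echo "<table>\n";
foreach ($rows as $row) {
    echo '<tr><td>' . htmlspecialchars($row['name']) . '</td><td>'
       . htmlspecialchars($row['attr1']) . '</td><td>'
       . htmlspecialchars($row['attr2']) . "</td></tr>\n";
}
echo "</table>\n";
```

Because the parser does the tokenising, attribute order and whitespace inside the tags should not matter, which is the main advantage over splitting the file by hand.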

@supersteve3d (author), Oct 10, 2006:

[CODE]$items = get_xmlvalue_multi("item", file_get_contents(XML));
for($i=0; $i<count($items); $i++) {
echo 'Name of item '.$i.' = '.get_xmlvalue("name", $item[$i]).'<br />';
echo 'Description of item '.$i.' = '.get_xmlvalue("description", $item[$i]).'<br /><br />';
}[/CODE]


I am not sure, but I believe the $item[$i] should be $items[$i] instead?

neilemrich, I am trying the code you supplied. It works for getting the whole chunk of xml out. However I am unable to parse the attributes within the element tags.

The XML structure I have is as follows
[CODE]
<e1>
<e2>
<e3></e3>

<e4 attr1="xx" attr2="yy" attr3="zz">Some Name</e4>
<e4 attr1="aa" attr2="bb" attr3="cc">Some Name2</e4>
<e4 attr1="ii" attr2="jj" attr3="kk">Some Name3</e4>

</e2>
</e1>
[/CODE]


How can I extract the attr1 and attr2 values so they can be put into, say, a table?

Many thanks.

@neilemrich, Oct 10, 2006: You could use a function like:

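[CODE]// Return the value of the first $val="..." attribute found in $contents.
function find($val, $contents) {
    // Everything after the first occurrence of $val=" ...
    $parts1 = explode($val.'="', $contents);
    // ... up to the next double quote is the attribute value.
    $parts2 = explode('"', $parts1[1]);
    return $parts2[0];
}[/CODE]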


use it like this:

[CODE]echo find('attr2', $contents);[/CODE]
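
One thing to watch with this approach: if the attribute is missing, $parts1[1] does not exist, and a short name can match the tail of a longer one. Below is a hedged sketch of a more defensive variant. The name find_attr is made up for this example, and it only treats a plain space as the separator before an attribute name (not tabs or newlines).

```php
<?php
// Hypothetical variant of find(): returns '' when the attribute is absent
// and only matches a whole attribute name (one preceded by a space).
function find_attr($name, $contents) {
    // Pad with a space so an attribute at the very start still matches.
    $haystack = ' ' . $contents;
    $needle = ' ' . $name . '="';
    $start = strpos($haystack, $needle);
    if ($start === false) {
        return '';
    }
    $start += strlen($needle);
    // The value runs up to the next double quote.
    $end = strpos($haystack, '"', $start);
    if ($end === false) {
        return '';
    }
    return substr($haystack, $start, $end - $start);
}

$tag = '<e4 attr1="xx" attr2="yy" attr3="zz">Some Name</e4>';
echo find_attr('attr2', $tag), "\n";   // yy
echo find_attr('attr9', $tag), "\n";   // empty line: attribute not present
```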

@supersteve3d (author), Oct 10, 2006:

Neil,

Thank you. I got the attributes to work 80% ? See the output [URL=http://www.stevelim.com/web/firefox_extensions_xmltest.html]here[/URL]

A few problems though.

  • The main element does not get listed properly; it gets mangled with the first attribute (and for the same reason, the first attribute cannot be listed).

  • Not all entries seem to be listed; it stops before the end of the list.


Here is the current code I am using to get the output on the page above. You might notice some element entries are missing.

    [CODE]$items = get_xmlvalue_multi("ext ", file_get_contents('/home/admin/extensions_list.xml'));
    for($i=0; $i<count($items); $i++) {

    //echo 'Name of item '.$i.' = '.get_xmlvalue("ext", $items[$i]).'<br />';
    //echo 'Description of item '.$i.' = '.get_xmlvalue("description", $items[$i]).'<br /><br />';

    echo '<b>'.find("creator", $items[$i]).'</b><br/>';
    echo find("description", $items[$i]).'<br/>';
    echo '<a href="'.find("homepageURL", $items[$i]).'">'.find("homepageURL", $items[$i]).'</a><br/><br/>';
    }[/CODE]


    The xml I am using can be found here -> [URL=http://www.stevelim.com/extensions_list.xml]extensions_list.xml[/URL]

    Thanks.

@neilemrich, Oct 11, 2006: Have you changed something since posting this? I just had a look at the output page you linked to and it looks all right. The XML you linked to has items that aren't on the output page, but I noticed the output page also has items that aren't in the XML. Are you taking info from multiple pages?

    I put the code into a page myself: [URL]http://www.k07.net/testt.php[/URL] and got all the entries coming up bar the last one (so I changed it to $i<count($items)+1; )

@supersteve3d (author), Oct 11, 2006:


    Neil,

Yes, apologies. I have been tinkering instead of remaining passively idle. =)

The output is from another computer; however, it has the same format.

Regardless, going back to the earlier example, take the sample XML element below...

    [CODE]<ext version="0.3" disabled="false" homepageURL="http://philringnalda.com/mozilla/blogthis/" description="Adds right-click access to Blogger's BlogThis popup." creator="Phil Ringnalda" updateURL="http://philringnalda.com/mozilla/blogthis/update.rdf" id="{8F82D6F9-D8F0-4477-8C73-908531D73538}">BlogThis</ext>[/CODE]

What I was trying to say is that, with your old code, I was unable to get the "BlogThis" text between the main <ext ...> and </ext> tags. Also, something related was messing with the first attribute of the <ext> tag, e.g. version="0.3": I was not able to use your find function on the version attribute.
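
For what it's worth, the mangling can probably be traced to the arithmetic in the original get_xmlvalue when it is called with "ext " as the tag name. The sketch below replays just that arithmetic on a shortened version of the element above (attributes trimmed for the example). It leaves out the ereg() check and adds explicit (int) casts, where the original relied on PHP treating false as 0.

```php
<?php
// Replays the position arithmetic of the original get_xmlvalue() for
// $val = "ext " on an element whose opening tag carries attributes.
$val = 'ext ';
$contents = '<ext version="0.3" creator="Phil Ringnalda">BlogThis</ext>';

// Neither "<ext >" nor "</ext >" occurs, so both lookups return false.
$beginpos = strpos($contents, '<' . $val . '>');
$endpos = strpos($contents, '</' . $val . '>');

// With both positions counted as 0, the "length" comes out negative.
$dif = ((int) $endpos - (int) $beginpos) - (strlen($val) + 2);

// substr() then skips the first 6 characters and drops the last 6.
$result = substr($contents, (int) $endpos - $dif, $dif);
echo $dif, "\n";
echo $result, "\n";
```

The first six characters ("<ext v") are cut off, so the chunk starts at ersion="0.3" and find("version", ...) has nothing to match. The last six ("</ext>") go too, which would explain why the element text looked mangled.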

    Thanks for continuing to pursue this thread. Have a kipper on me. =)

    PS. I have since heard that perhaps it is a good idea to look into XSLT for styling XML into XHTML. However, learning how to brute force parse stuff is a good thing to know.

    Steve.

@neilemrich, Oct 12, 2006: I went over the functions because they were really meant for XML in the form of:

    <item>

    <name>Name</name>

    <description>Description</description>

    </item>

Not that it matters; they just weren't written for data in the form of yours. Instead, just use this, which seemed to work for me:

    [CODE]function find($val, $contents) {
    $parts1 = explode($val.'="', $contents);
    $parts2 = explode('"', $parts1[1]);
    return $parts2[0];
    }

    function title($contents) {
    $temp = explode(">", $contents);
    return substr($temp[1], 0, -1);
    }


    function get_xmlvalue_multi($val, $contents) {
    $split_data = explode("<".$val, $contents);
    return $split_data;
    }

    $items = get_xmlvalue_multi("ext ", file_get_contents('http://www.stevelim.com/extensions_list.xml'));
    for($i=0; $i<count($items)+1; $i++) {

    echo '<b>'.title($items[$i]).'</b><br/>';
    echo '<b>'.find("creator", $items[$i]).'</b><br/>';
    echo find("description", $items[$i]).'<br/>';
    echo find("version", $items[$i]).'<br/>';
    echo '<a href="'.find("homepageURL", $items[$i]).'">'.find("homepageURL", $items[$i]).'</a><br/><br/>';
    }[/CODE]


    The functions are a lot simpler like that too.

    Hope this works for you

    Neil

@supersteve3d (author), Oct 12, 2006:

Neil,

Thank you! It works great. However, it's possible I stumbled upon a tiny bug of sorts. The 'find' function works great and in a predictable manner, but the title function is doing some strange things.

    Take for example the scenario of trying to remove the <br/> after the echo title, you will see that the subsequent echo does not trail it as you would expect. In fact, in my case, it seems to disappear(?) When I try removing the embedded <br/> or <b> tags from the echo find lines, it works as expected.

Bizarre. Also, when I tried the code verbatim, I noticed that everything turned bold. I wonder if you experienced the same problem, i.e. it seems like the closing </b> was ignored after the call to the 'title' function.

    Very likely related I suppose.

    Cheers,

    Steve.

@neilemrich, Oct 12, 2006: Ahh yes, just seen that. I fixed it by changing the title function slightly:


    [CODE]function title($contents) {
    $temp = explode(">", $contents);
    return substr($temp[1], 0, -5);
    } [/CODE]


    notice the -5 not -1

    and I changed the line on the for loop to:
    [CODE]
    // explode() indexes from 0, so the last valid index is count($items) - 1
    for($i=1; $i<count($items); $i++) {[/CODE]


    The first record was just rubbish which was confusing things

    Neil
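
To tie the thread together, here is a minimal sketch of the whole thing producing table rows, as the original question asked. It reuses find() as posted. The element_text function is a hypothetical stand-in for title() that cuts at the closing tag instead of trimming a fixed five characters (with -1, title() returns "BlogThis</ex", and that broken tag is likely what swallowed the following </b>). The inline $xml string is an assumption standing in for extensions_list.xml, and its second entry is made up.

```php
<?php
// find() as posted in the thread.
function find($val, $contents) {
    $parts1 = explode($val . '="', $contents);
    $parts2 = explode('"', $parts1[1]);
    return $parts2[0];
}

// Hypothetical replacement for title(): cut at the closing tag itself
// instead of trimming a fixed number of characters.
function element_text($contents, $tag) {
    $start = strpos($contents, '>');
    if ($start === false) {
        return '';
    }
    $start++;
    $end = strpos($contents, '</' . $tag . '>', $start);
    if ($end === false) {
        return '';
    }
    return substr($contents, $start, $end - $start);
}

// Stand-in for the contents of extensions_list.xml (second entry is made up).
$xml = '<list>'
     . '<ext version="0.3" creator="Phil Ringnalda">BlogThis</ext>'
     . '<ext version="1.0" creator="Someone Else">Example</ext>'
     . '</list>';

$items = explode('<ext ', $xml);

$html = "<table>\n";
// $items[0] is whatever precedes the first "<ext ", so start at 1.
// explode() indexes from 0, so the last valid index is count($items) - 1.
for ($i = 1; $i < count($items); $i++) {
    $html .= '<tr><td>' . htmlspecialchars(element_text($items[$i], 'ext')) . '</td>'
           . '<td>' . htmlspecialchars(find('version', $items[$i])) . '</td>'
           . '<td>' . htmlspecialchars(find('creator', $items[$i])) . "</td></tr>\n";
}
$html .= "</table>\n";
echo $html;
```

This still assumes no attribute value contains a ">" character. For anything less regular than this file, the XML Parser functions linked earlier in the thread are probably the safer route.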