XPath Expression To Deal With BR Element

@seniseven Aug 28, 2012

I have a bunch of HTML documents, each with a P element whose ‘id’ attribute is set to ‘title’, like so:

[code=html]<p id="title">Title of the document</p>[/code]

In some cases, I have a title that has a forced line break:

[code=html]<p id="title">This Is A Title Of A Document<br>With A BR Element In It</p>[/code]

I have created an UpdateAndSynchronize.php script that scans the directory tree where all my web documents are, loads each document (using DOMDocument::loadHTML()), sets up the XPath object, and extracts the info I want to put in the MySQL database.
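For readers following along, here is a minimal sketch of the load-and-query step described above. The sample markup and the `$html` variable are assumptions made up for illustration; the real script reads files from disk and writes to MySQL, which is left out here.

```php
<?php
// Hypothetical sample markup standing in for one of the scanned documents.
$html = '<html><body><p id="title">Title of the document</p></body></html>';

$doc = new DOMDocument();
// Collect libxml parse errors internally instead of emitting PHP warnings,
// since real-world HTML is rarely perfectly formed.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// DOMXPath is the PHP class behind the "XPath object" mentioned above.
$htmlXPath = new DOMXPath($doc);
$node = $htmlXPath->query('.//p[@id="title"]')->item(0);

// item(0) returns null when nothing matches, so guard before reading textContent.
$docTitle = $node !== null ? trim($node->textContent) : null;

echo $docTitle, PHP_EOL;
```

The null check is a design choice for documents that have no title paragraph; without it, reading `textContent` on a missing node raises an error.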

My XPath expression to get the document title is:

[code=php]
$docTitle = $htmlXPath->query('.//p[@id="title"]')->item(0)->textContent;
$docTitle = trim(str_replace(array("\n", "\r\n", "\r", "\t"), " ", $docTitle));
[/code]

$htmlXPath is a DOMXPath object.

I had to add the second line to get rid of leading and trailing whitespace.
My problem is that str_replace() is not working, probably because the <br> element matched by the XPath query is being converted (translated?) to some other character.

The question is:

[U][B]How should I be setting up my XPath->query() to convert <br> elements into a single space character?[/B][/U]

Also, is there a good reference (book? web pages?) that shows how to set up XPath queries (evaluations?) with lots of examples?
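One possible approach to the <br> question, offered as a sketch rather than a definitive answer: a <br> element has no text of its own, so it contributes nothing to textContent and str_replace() has no character to match. The sketch below swaps each <br> inside the title paragraph for a text node holding a single space before reading textContent. The sample markup and the position of the <br> in it are assumptions for illustration.

```php
<?php
// Hypothetical sample markup; the position of the <br> is made up.
$html = '<html><body><p id="title">This Is A Title<br>With A BR Element</p></body></html>';

$doc = new DOMDocument();
// Collect libxml parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$htmlXPath = new DOMXPath($doc);
$titleNode = $htmlXPath->query('.//p[@id="title"]')->item(0);

$docTitle = null;
if ($titleNode !== null) {
    // Copy the matches into an array as a precaution before modifying the tree.
    $breaks = iterator_to_array($htmlXPath->query('.//br', $titleNode));
    foreach ($breaks as $br) {
        // Replace each <br> with a text node holding one space.
        $br->parentNode->replaceChild($doc->createTextNode(' '), $br);
    }
    // Collapse any runs of whitespace (newlines, tabs, spaces) into one space.
    $docTitle = trim(preg_replace('/\s+/', ' ', $titleNode->textContent));
}

echo $docTitle, PHP_EOL;
```

Note that this modifies the loaded DOM in memory. If the document is reused afterwards, working on a clone of the title node (`cloneNode(true)`) would leave the original untouched.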
