
innerHTML and RegExp

You may all have heard about this already; if so, I'm sorry for repeating it.
I just don't know what to do.

It seems that one can't use RegExp on innerHTML. Please tell me it isn't so!

At first I thought it had something to do with RegExp.multiline. But when I deleted
all the other lines in the HTML except the one I want to use the RegExp on,
it still would not work.

When I output just document.body.innerHTML,
the result is not the same as the source. Why is that? It is missing a lot of <td> and other tags.
Output: Movie Poster<img src="http://www.test.com/test.jpg">

--START-Script--
var Test = '<tr><td class=rowhead>Movie Poster</td><td align=left><img src="http://www.test.com/test.jpg"></td></tr>';
var Real = document.body.innerHTML;

myRe = /^.*Movie Poster<\/td><td align=left><img src="(.+?)"><\/td>.*$/ig;
myArray = myRe.exec(Real);

var url = RegExp.$1; alert(url);
--END-Script--

The Test variable holds the line of the HTML that I would like to run the RegExp on to extract the image URL. Does anyone have any ideas? I would greatly appreciate it! Thanks.

@pccode (Aug 02, 2006): To extract the image URL you only need to access the src attribute. There are a couple of different ways to accomplish this.
//add an id attribute to your <img> element
<img src="yourimage.gif" id="yourimage">

//retrieve the url
var url = document.getElementById('yourimage').getAttribute('src');

//or if you don't want to add an id attribute
var images = document.getElementsByTagName('img');

//then reference the image by index number
var url = images[0].getAttribute('src');
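
When the page is not yours and holds several images, picking the image by index number can be fragile. Below is a minimal sketch, assuming the poster sits in the cell right after a cell whose text is "Movie Poster", as in the markup from the question. findImageUrlByLabel is a hypothetical helper name, not something from this thread.

```javascript
// Hypothetical helper: returns the src of the first <img> in the cell that
// follows the <td> whose trimmed text equals `label`, or null if none is found.
function findImageUrlByLabel(doc, label) {
  var cells = doc.getElementsByTagName('td');
  for (var i = 0; i < cells.length; i++) {
    var text = (cells[i].textContent || '').replace(/^\s+|\s+$/g, '');
    if (text !== label) continue;

    // Walk to the next element node, skipping whitespace text nodes.
    var next = cells[i].nextSibling;
    while (next && next.nodeType !== 1) next = next.nextSibling;
    if (!next) continue;

    var images = next.getElementsByTagName('img');
    if (images.length > 0) return images[0].getAttribute('src');
  }
  return null;
}
```

In a page this would be called as findImageUrlByLabel(document, 'Movie Poster'). It relies only on the label text and the cell order, so it keeps working if other images are added elsewhere on the page.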


When you are using innerHTML you really shouldn't retrieve the entire document body. Just reference the target element and retrieve its innerHTML.

<table>
<tr>
<td id="example">
This is an example.
</td>
</tr>
</table>

var text = document.getElementById('example').innerHTML;

@Sonej (author, Aug 02, 2006): It is not my homepage that I want to fetch from, so I can't add id attributes; that would have been the easy way.

getElementsByTagName worked great! I googled for two days after work and did not come up with a solution to fetch this darn URL. Thanks to you, now I can.

I will start on the rest of the script tomorrow. It involves fetching this URL without going to the link that contains the img URL: preloading it from the previous page, or something like that.

I don't know where to start. Any pointers?

Thanks, man!

@pccode (Aug 02, 2006): You might want to consider using XMLHttpRequest. It allows you to fetch a webpage's contents and also allows you to access its elements.
function grabURL(url) {
    var request = new XMLHttpRequest();
    request.open("GET", url, false);
    request.send(null);
    var webpage = request.responseXML;

    /* You can also use request.responseText but
       I'm not sure if you can access its elements
       with the DOM */

    var images = webpage.getElementsByTagName('img');
    var url = images[0].getAttribute('src');
}

grabURL('http://the-site-to-retrieve.com/index.htm');

Try googling XMLHttpRequest for more info. The link below gives a basic overview of XMLHttpRequest. The function above is only an example; I haven't tested the code. Cheers.

http://www.xml.com/pub/a/2005/02/09/xml-http-request.html

@Sonej (author, Aug 03, 2006): It seems like XMLHttpRequest is the thing I have been searching for!

I have read up on it at this page: http://www.sitepoint.com/article/build-your-own-ajax-web-apps

It has some good information about XML.

I will have to try this when I get off work. Thanks again!

@Sonej (author, Aug 03, 2006): One more question: should I use xmlhttp.onreadystatechange, or do you think I could go without it?

Or do I need it only when the third argument of request.open is set to true?

@Sonej (author, Aug 03, 2006): The XMLHttpRequest code gave me a TypeError: webpage has no properties

I tried with: var images = request.responseXML.getElementsByTagName('img');

But still no success. Any ideas?

@pccode (Aug 03, 2006): I tested the code using responseXML and was unable to access the elements using the DOM. Perhaps you could retrieve the webpage with responseText instead. It's not as efficient, but it can be done. I have tested the code below and it is working properly. However, I have no idea how it will react to non-local files; you may encounter security issues in Firefox. For IE you will need different code to get it working; the link I listed in a previous post explains how to code it for IE. Regarding onreadystatechange, you probably won't need it: with a synchronous request like this one it should function correctly without it in Firefox.
//test.htm
<html>
<head>
<script>
function grabURL(url) {
    var request = new XMLHttpRequest();
    request.open("GET", url, false);
    request.send(null);
    var webpage = request.responseText;
    var images = webpage.split('img');
    var link = images[1].substr(images[1].indexOf('src=') + 5);
    link = link.substr(0, link.indexOf('>') - 1);
    alert(link);
}
</script>
</head>
<body>
<a href="javascript:grabURL('file:///C:/Documents%20and%20Settings/UserName/Desktop/test2.htm');">test</a>
</body>
</html>

//test2.htm
<html>
<body>
<img src="test.gif">
</body>
</html>
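
On the onreadystatechange question: the handler only matters for asynchronous requests, that is, when the third argument to open() is true. A rough sketch of the asynchronous variant follows. fetchPage and its optional XhrCtor parameter are hypothetical names introduced here; the parameter exists only so the function can be tried with a stand-in object outside a browser.

```javascript
// Hypothetical helper: fetches `url` asynchronously and hands the response
// text (or null on failure) to `callback`. In a page, omit `XhrCtor` and the
// browser's XMLHttpRequest is used.
function fetchPage(url, callback, XhrCtor) {
  var Ctor = XhrCtor || XMLHttpRequest;
  var request = new Ctor();
  request.open('GET', url, true); // true = asynchronous
  request.onreadystatechange = function () {
    if (request.readyState !== 4) return; // 4 = request finished
    // 200 is a successful HTTP response; local file:// requests report 0
    // (0 can also mean a network error, so treat this check as an assumption).
    var ok = request.status === 200 || request.status === 0;
    callback(ok ? request.responseText : null);
  };
  request.send(null);
}
```

With a synchronous request (false as the third argument) send() blocks until the response has arrived, so responseText can be read on the next line and no handler is needed.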

@pccode (Aug 03, 2006): I did some research on the XMLHttpRequest object and it turns out that you can't request pages that are located on a different domain. It will only work if the webpage you are requesting has the same domain as the page that is calling it.

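So unfortunately it looks like you will have to use a different approach to retrieve the webpage contents. The only suggestion I have would be to create an invisible iframe and load the webpage in that, then access the image element using the DOM. Be aware that unprivileged script can only read a frame's document when the frame comes from the same domain, so this approach is subject to the same restriction.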
//your webpage
<html>
<head>
<script>
onload = function() {
    var frame = parent.frames[0].document;
    var images = frame.getElementsByTagName('img');
    var url = images[0].getAttribute('src');
    alert(url);
}
</script>
</head>
<body>
<iframe width="0" height="0" src="http://the-other-webpage.com/index.htm" style="border:none;"></iframe>
</body>
</html>
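
As a rough illustration of the same-domain rule described above: browsers compare the scheme, host, and port of the two URLs, not the path. The sketch below assumes an environment with the standard URL class (modern browsers and Node.js, so not the browsers of 2006). isSameOrigin is a hypothetical helper name.

```javascript
// Hypothetical helper: true when both absolute URLs share scheme, host, and port.
function isSameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// Same host, different path: counts as the same domain.
console.log(isSameOrigin('http://www.test.com/index.htm',
                         'http://www.test.com/random/browse.htm')); // true

// Different host: a cross-domain request.
console.log(isSameOrigin('http://www.test.com/index.htm',
                         'http://the-other-webpage.com/index.htm')); // false
```

So a request from one page of a site to another page of the same site, in whatever directory, is not treated as cross-domain.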

@Sonej (author, Aug 03, 2006): I am making a Firefox extension. It would work like this: if I hover my mouse over a topic link on a forum homepage, it would display the picture that is located on the page behind that link.

And what about when I visit www.test.com/index.htm and make my XMLHttpRequest to www.test.com/random/browse.htm? Would that count as making the request from another domain?

As for the iframe, I can't control the HTML code because it is not my homepage. I actually didn't understand that one.

I already have the picture display script; it just needs to grab some values to get it working. That's what we are working on now.

http://www.dyn-web.com/dhtml/tooltips/tip-txt-img.html

That is the mouse-over image script.

Thanks for lending me some time and helping me understand some parts of JavaScript.

@pccode (Aug 03, 2006): If you are using the XMLHttpRequest object as part of a Firefox extension then you shouldn't have any problems with the security issues. Those issues only present a problem when you are using the object in unprivileged JavaScript.

I've tested the function below and it works, but only to a certain degree. Unfortunately you can only access the elements in the page you retrieve if its MIME type is text/xml. Most webpages use the text/html MIME type, which is why responseXML wasn't working. You can either retrieve the page with responseText and then use a regex to find and return the img URL, or you can trick the object into thinking that it's returning an XML page. I've listed the code below that you can use to do this.

The one downside, and it may present problems, is that XML is a very strict language. Some things that you can get away with in HTML will throw errors in XML. For example, in HTML you don't have to close every tag: image elements never include a closing tag. In XML the image element has to be closed in one of these two ways:

<img src="test.gif"/> or <img src="test.gif"></img>

That means that when the page is viewed as an XML document you won't be able to access any elements that are left open. I'm sure there's a way around this though.
function getPage(url) {
    var request = new XMLHttpRequest();
    request.open("GET", url, false);
    request.overrideMimeType('text/xml');
    request.send(null);
    var page = request.responseXML;
    var images = page.getElementsByTagName('img');
    alert(images.length);
}

getPage('https://webdeveloper.com/forum/showthread.php?t=116073');

The page below works great with the code above because the image tag is closed.
//test.htm
<html>
<body>
<img src="test.gif"></img>
</body>
</html>

For more info on the XMLHttpRequest object, specifically for Firefox, check out the link below. There's also a forum on xulplanet.com; they may be able to offer a better solution for this issue.

http://www.xulplanet.com/references/objref/XMLHttpRequest.html
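
The responseText-plus-regex route mentioned above could look roughly like the sketch below. It is not a full HTML parser: it assumes the first <img> tag in the text is the one wanted, and firstImageSrc is a hypothetical helper name. The pattern tolerates double, single, or missing quotes around the attribute value, which matters because innerHTML and served markup do not always quote attributes the same way as the original source.

```javascript
// Hypothetical helper: returns the src value of the first <img> tag found in
// an HTML string, or null when there is no <img> with a src attribute.
function firstImageSrc(html) {
  var pattern = /<img\b[^>]*?\bsrc\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/i;
  var match = pattern.exec(html);
  if (!match) return null;
  // Exactly one of the three capture groups is set, depending on the quoting.
  if (match[1] !== undefined) return match[1];
  if (match[2] !== undefined) return match[2];
  return match[3];
}
```

Unlike a pattern anchored with ^ and $, this one searches anywhere in the text, so line breaks in the surrounding markup do not prevent a match. With a request object as in the earlier posts it would be called as firstImageSrc(request.responseText).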

@Sonej (author, Aug 06, 2006): Thanks, I will try it as soon as I have some time.