Web comic aggregator RSS feed general questions?

@mysteriousmonk, Jun 27, 2016

Hello, I am trying to create a web comic aggregation website using HTML5, CSS3, and JavaScript. I would like users to be able to view comics from different dates and different websites all in one place. After some research, it seems I will probably need to use RSS feeds to accomplish this. However, I don't fully understand the capabilities and usage of an RSS feed.

First, would it be possible to pull images from comic websites in an automated and orderly fashion using an RSS feed? Or would I need to use something else? Or is it not possible at all? If it is possible with an RSS feed, I'm also somewhat confused about the general implementation. Would the code for the feed be in HTML, or JavaScript, or both? Would I need to use existing libraries and/or APIs? Is there existing code with a similar enough function that I could use as a starting point?

Thanks


@NogDog, Jun 27, 2016: Assuming I'm correctly understanding what you want to do, RSS will probably only help you if the sites you want to pull from already provide an RSS feed. If they do, then you'll need code to fetch their feed, which is an XML document, and parse that XML to get whatever you want to use from it. To avoid hitting each RSS feed whenever a user of your site makes a request, I'd be inclined to do the RSS interactions on the server side of your site rather than via JavaScript in the browser. That allows you to cache feed results for some reasonable amount of time on your site (e.g. in a DB or other data store).
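
The server-side caching idea above could be sketched roughly as follows, in Python purely for illustration (the thread does not settle on a server-side language). `FeedCache`, `fetch_feed`, and the 15-minute lifetime are hypothetical names and values, not anything from the thread, and the cache lives in memory here, where a real site would more likely use a database or files.

```python
import time
import urllib.request


def fetch_feed(url):
    """Download a feed and return its raw XML bytes (one network request)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


class FeedCache:
    """Keep each feed's XML for max_age seconds before fetching it again."""

    def __init__(self, fetcher=fetch_feed, max_age=15 * 60, clock=time.time):
        self._fetcher = fetcher    # function that actually hits the remote site
        self._max_age = max_age    # cache lifetime in seconds (assumed value)
        self._clock = clock        # injectable so the cache can be tested
        self._entries = {}         # url -> (time fetched, xml)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]        # still fresh: no request to the comic site
        xml = self._fetcher(url)   # missing or stale: fetch once and remember
        self._entries[url] = (now, xml)
        return xml
```

With something like this, every page view on the aggregator would call `cache.get(feed_url)`, and the comic site would only be contacted when the stored copy is older than `max_age`.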

@mysteriousmonk (author), Jun 27, 2016: Okay, thanks for the info. After reading your answer and a few other similar questions on other forums, I have some follow-up questions:

First, I am concerned about the overall availability of RSS feeds, or at least of those that output what I want. After doing some research on various web comics to see if they included RSS feeds, I found that they almost overwhelmingly did, although some of them seem to include crippled feeds that don't actually include any images--they just have links to each individual comic on the original website. I also read somewhere else that this is becoming a trend (to make sure users see advertisements from the original website). I was wondering if there is any way around this kind of thing? Because the whole point of my website idea is to make the comics easy to read and all on one webpage, not to include a bunch of different links to different webpages.

Second, I am still a little confused as to the capabilities of RSS feeds. Do these feeds provide content for articles and comics in only a range of recent dates? Or can a typical RSS feed access all articles/comics from the original website across its entire history? Because most of what I'm reading online refers to RSS feeds as a way to get daily updated news articles. Don't get me wrong, I would like my website to update comics daily, but I would also like users to be able to access very old comics, so I'm wondering if a typical RSS feed will provide this capability.

Third, I have heard a lot of suggestions to try to develop this kind of thing with a combination of server-side and client-side code, and I'm wondering about the advantages and disadvantages of this approach. For example, I have heard that writing a website in this way benefits a website in search engine rankings, but I'm not exactly sure how. Are there other advantages?

Finally, I'm curious about the general implementation of a combination of server-side and client-side code in a website. After some research, it looks like I would have to use a different language for the server-side code, like PHP or Ruby or whatever, and I was wondering how the different languages interact. Would it be like the interaction between HTML, CSS, and JavaScript? For example, would I "include" the server-side code file inside the client-side HTML file?

Thanks again

@NogDog, Jun 28, 2016: I'll start with the last question, since it's the easiest (from my perspective of what I do for a living). It's actually the inverse: the server-side code will ultimately output the client-side code (HTML and JavaScript). Therefore, if you build your app in PHP, the PHP code will at some point be outputting HTML, which might include loading of CSS and JS files, as well as in-line JS -- whatever you need to eventually get sent to the browser.

As far as wanting to collect assorted comics from the web and put them in one easy-to-read location: I'll leave that to the intellectual property lawyers. (I'm definitely not a lawyer -- I don't even play one on TV.) My assumption would be that if a copyright owner is supplying an RSS feed, what is in that feed is intended for consumption and eventual display elsewhere -- otherwise there's no reason to supply such a feed. But if you take that info and, for instance, use it to "scrape" additional content directly from their site so that you can display it on yours, you are now possibly stepping into copyright infringement territory. (Again, I'm not a lawyer -- I just want you to think about it, and contact your own legal counsel if you want to cover your butt.)

As for what's in an RSS feed, I believe it is typical to output some limited amount of recent content, but conceivably such a feed could be created to supply it by date range, etc. -- but I'm not sure if that's part of the RSS standard.
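
If a feed really does list only recent entries, one possible workaround is to poll it regularly and keep everything you have already seen, so that history builds up on your own side. The sketch below assumes that approach; `merge_into_archive` and the dict-based `archive` are hypothetical stand-ins for whatever storage the site would actually use, and the item dicts are made-up sample data.

```python
def merge_into_archive(archive, feed_items):
    """Add items not seen before and return how many were new.

    archive    -- dict mapping a unique key (guid, else link) to the item
    feed_items -- list of dicts with at least a "guid" or "link" key
    """
    added = 0
    for item in feed_items:
        key = item.get("guid") or item.get("link")
        if key and key not in archive:
            archive[key] = item
            added += 1
    return added


archive = {}
# First poll: the feed window holds strips 1 and 2.
merge_into_archive(archive, [{"link": "http://comic.example/1"},
                             {"link": "http://comic.example/2"}])
# Later poll: strip 1 has dropped out of the feed, strip 3 is new.
merge_into_archive(archive, [{"link": "http://comic.example/2"},
                             {"link": "http://comic.example/3"}])
print(sorted(archive))  # strip 1 is still in the archive
```

This only accumulates strips published after polling begins; it does not recover anything older than the feed's window at the first poll.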

@mysteriousmonk (author), Jun 29, 2016: Thanks. If I ever get it working, I do plan on trying to pay for the content rather than just hoping it falls under a legal gray area. However, for now I just want to see if I can get it to work at all, and I am trying to do it the simplest way possible.

And as for the other question, just to make sure I understand you correctly, are you saying that if I decide to approach my website by using server-side code I won't need to write HTML/CSS/JavaScript directly? I would just write PHP or a similar language and then it would generate the HTML and other stuff for me?

And as for the RSS feed stuff, that's too bad if they only output recent content. I will have to look into it more. Do you know of any different (and preferably simple) ways to scrape content from sites if the RSS feed thing doesn't work out?

Thanks again

@NogDog, Jun 29, 2016: Scraping is seldom "simple", unless the site in question wants you to access some or all of its content -- in which case it might provide an API to do so. Of course, even if it does, it's unlikely any two APIs would be the same, so you'd need site-specific code for each one. The same is true if you scrape a site directly by crawling it, parsing the HTML, and extracting what you're interested in, as no two sites are likely to have the same HTML structure.

As far as the PHP or other server-side stuff, you'd still be writing HTML/CSS/JavaScript as needed, but it would ultimately be output from that server-side application code, which could happen in a number of ways. For instance, in PHP, you could explicitly echo out the page's HTML, or you can actually just have whatever you want outside of the <?php...?> tags, which is automatically output as is.

[code=php]
<html>
<head><title>This is output as is</title></head>
<body>
<?php
// Code inside the PHP tags runs on the server; only its output reaches the browser.
echo "<h1>A PHP Echo Statement</h1>";
?>
</body>
</html>
[/code]
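
On the scraping point earlier in this reply, the "site-specific code for each one" idea might look roughly like the sketch below. It is written in Python with the standard-library `html.parser` only as an illustration; the hostnames, the `SITE_RULES` table, and the `id`/`class` values are all made up, and each real site would need its own rule worked out by inspecting its HTML.

```python
from html.parser import HTMLParser


class ImageFinder(HTMLParser):
    """Collect src values of <img> tags whose id or class matches a rule."""

    def __init__(self, attr_name, attr_value):
        super().__init__()
        self.attr_name = attr_name
        self.attr_value = attr_value
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # class can hold several space-separated names, so compare word by word
        values = (attributes.get(self.attr_name) or "").split()
        if self.attr_value in values and attributes.get("src"):
            self.found.append(attributes["src"])


# One rule per site, because no two sites share the same HTML structure.
# Hostnames and attribute values are invented for this example.
SITE_RULES = {
    "comic-a.example": ("id", "comic"),
    "comic-b.example": ("class", "strip"),
}


def extract_comic_images(site, html):
    """Return the image URLs that match the rule registered for this site."""
    attr_name, attr_value = SITE_RULES[site]
    finder = ImageFinder(attr_name, attr_value)
    finder.feed(html)
    return finder.found
```

Supporting another comic then means adding one more entry to `SITE_RULES`, and any redesign of a site's pages can silently break its rule, which is part of why scraping is seldom simple.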

@ram12, Jul 06, 2016: RSS is basically a text-based XML feed with well-defined elements, and various extensions give it more flexibility. The gist is that the feed will almost always give you URLs for images rather than the images themselves. You would fetch the RSS feed, scan it, and construct a list of "stories/items" where each item has multiple attributes such as title, description, URL of the original source, and URLs of media items. Then you would use network.request() calls (or whatever HTTP request function your platform provides) to fetch the image from each URL and display it.
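
The "scan it and construct a list of items" step could be sketched like this, using Python's standard `xml.etree.ElementTree` as an example. `parse_items`, the `SAMPLE` feed, and the `comic.example` URLs are invented for illustration; real feeds vary in where they put images, so the three places checked here (`<enclosure>`, Media RSS `<media:content>`, and `<img>` tags inside the description) are common cases rather than a complete list.

```python
import re
import xml.etree.ElementTree as ET

# Namespace of the Media RSS extension (<media:content>).
MEDIA_NS = "{http://search.yahoo.com/mrss/}"
IMG_SRC = re.compile(r"""<img[^>]+src=["']([^"']+)["']""", re.IGNORECASE)

# Made-up feed: the first item embeds its image, the second only links out.
SAMPLE = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Comic</title>
    <item>
      <title>Strip 42</title>
      <link>http://comic.example/42</link>
      <description>&lt;img src="http://comic.example/img/42.png"&gt;</description>
    </item>
    <item>
      <title>Strip 41</title>
      <link>http://comic.example/41</link>
      <description>Read it on the site.</description>
    </item>
  </channel>
</rss>"""


def parse_items(xml_text):
    """Return a list of dicts, one per <item> in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        description = item.findtext("description", default="")
        images = []
        # 1. <enclosure url="..." type="image/..."> attachments
        for enclosure in item.findall("enclosure"):
            if enclosure.get("url") and enclosure.get("type", "").startswith("image/"):
                images.append(enclosure.get("url"))
        # 2. Media RSS <media:content url="..."> elements
        for media in item.findall(MEDIA_NS + "content"):
            if media.get("url"):
                images.append(media.get("url"))
        # 3. <img> tags embedded in the description HTML
        images.extend(IMG_SRC.findall(description))
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": description,
            "images": images,
        })
    return items


for entry in parse_items(SAMPLE):
    print(entry["title"], entry["link"], entry["images"])
```

An item whose `images` list comes back empty is the link-only kind of feed entry discussed earlier in the thread: the feed gives you the page URL but not the comic image itself.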

