
Robot.txt File

I want to discuss the role of the robot.txt file in SEO. What is the main purpose of this file when we submit a sitemap to Google? What is the difference between a sitemap and a robot.txt file?

SEO

20 Comments

@junkpopat, Jun 17, 2014: If you want search engines to include every URL of your website, then do not create a robots.txt file. (There is no need for a robots.txt file if everything should be included in search engines.)
@suneelidea, Jun 17, 2014: The two are different: a sitemap is an XML file, while robots.txt is a text file that tells search engines which information they are allowed to access.
@deathshadow, Jun 18, 2014: First off, it's called robots.txt, with an "s" -- if you don't make it plural, it doesn't do anything.

Second, it exists to block off access to links and directories so they aren't indexed, that's all it's for. As such, it's the OPPOSITE of SEO in that it's designed to make search NOT pay attention to the content it's masking off!

Third, it has absolutely NOTHING to do with what a sitemap is.

Though if we're going to talk 'sitemaps' if you have every page on your site linked to by at least one other page on the site -- aka building a site properly, there is NO reason to build a sitemap or submit it to Google. It's a bunch of bekaptah nonsense that has never served a legitimate purpose.

Finally, on the subject of BOTH robots.txt and the REL attribute, there is no such thing as "follow" and "index" no matter how many people use them out of ignorance, the only valid properties are "nofollow" and "noindex". If you want things followed or indexed, don't include the properties... or the attribute... or the robots.txt file.
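As a rough illustration of where "noindex" and "nofollow" actually show up in a page, here is a minimal sketch. It assumes the post is referring to the robots meta tag and to rel="nofollow" on links; the class name, function name, and sample page are hypothetical, and only Python's standard html.parser module is used.

```python
from html.parser import HTMLParser


class RobotsHintParser(HTMLParser):
    """Collects robots meta directives and links marked rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.meta_directives = set()
        self.nofollow_links = []

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. a bare attribute), so normalise them.
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                if directive.strip():
                    self.meta_directives.add(directive.strip().lower())
        elif tag == "a" and "nofollow" in attr.get("rel", "").lower().split():
            self.nofollow_links.append(attr.get("href", ""))


# Hypothetical page used only for illustration.
SAMPLE_PAGE = """
<html><head><meta name="robots" content="noindex, nofollow"></head>
<body><a href="/login" rel="nofollow">Log in</a> <a href="/about">About</a></body></html>
"""


def scan(html):
    """Return (meta robots directives, hrefs of rel="nofollow" links)."""
    parser = RobotsHintParser()
    parser.feed(html)
    parser.close()
    return parser.meta_directives, parser.nofollow_links
```

Calling scan(SAMPLE_PAGE) collects the directives from the meta tag and the one link that carries rel="nofollow"; the plain /about link is left out.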
@Christina_Willi, Jun 18, 2014: It exists to block off access to links and directories so they aren't indexed.
@anirban09P, Jun 18, 2014: Robots.txt is the common name of a text file that is uploaded to a website's root directory. The robots.txt file is used to provide instructions about the website to web robots and spiders. Web authors can use robots.txt to keep cooperating web robots from accessing all or parts of a website that they want to keep private.
@elenasmithson3, Jun 18, 2014: Robots.txt is a text file that lets you tell search engines which pages or folders may be crawled and which may not.
@parmeshwar, Jun 18, 2014: If you don't want a page or URL of your website to be crawled, create a robots.txt file and add it there. A sitemap is a list of pages for crawlers and visitors.
@peterdrucker27, Jun 20, 2014: The robots.txt file is used to provide instructions about the website to web robots and spiders, keeping them from accessing all or parts of a website that you want to keep private.
@kpkarthik, Jun 20, 2014: "I want to discuss the role of the robot.txt file in SEO. What is the main purpose of this file when we submit a sitemap to Google? What is the difference between a sitemap and a robot.txt file?"

Sitemap file: your website's sitemap file contains the URLs of all the inner pages of your website, so the search engine can find and crawl those URLs with the help of the sitemap file. The URLs placed in the sitemap file are the ones you are asking the search engine to index, so all the inner pages of your website can be indexed with its help.

Robots.txt file: this is the total opposite of the sitemap. If you place a particular URL in the robots.txt file, the search engine does not crawl that particular page.
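To make the sitemap side concrete, here is a minimal sketch of building an XML sitemap that lists page URLs. It assumes the sitemaps.org 0.9 format and uses only Python's standard library; the function name and the example.com URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical inner pages of a site.
PAGES = [
    "https://www.example.com/",
    "https://www.example.com/products.html",
    "https://www.example.com/contact.html",
]

sitemap_xml = build_sitemap(PAGES)
```

The resulting string would be saved as sitemap.xml and submitted to the search engine; optional fields such as last-modified dates are left out of this sketch.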
@nada_book14, Jun 21, 2014: Robots can ignore your /robots.txt. In particular, malware robots that scan the web for security vulnerabilities and email address harvesters used by spammers will pay no attention to it. The /robots.txt file is also publicly available, so anyone can see which sections of your server you don't want robots to use.

And a sitemap contains the internal links of the website, as @kpkarthik discussed above.
@MukeshKr, Jun 22, 2014: The robots.txt file prevents pages from being crawled by any robot or search engine. The XML sitemap serves a totally different purpose: it tells the search engine about all your webpages.
@arunthomas203, Jun 24, 2014: "The robots.txt file consists of those URLs which you don't want to be indexed. The sitemap consists of the URLs which you want search engines to index. So the robots.txt file and the sitemap have different uses."

I am not sure whether you can include URLs in robots.txt. Usually folder paths are specified in robots.txt so that search engines do not crawl the specified directories.

Thanks

Arun

Web Design Cochin
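On the question of folder paths versus individual URLs, here is a small sketch of how one parser treats both. It assumes Python's standard urllib.robotparser, which matches Disallow values as path prefixes; the rules, paths, and the "ExampleBot" agent name are hypothetical, and other crawlers may match slightly differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one folder path and one single-page path.
RULES = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/old-page.html
"""


def blocked(path, agent="ExampleBot"):
    """Return True if the rules above disallow this path for the agent."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return not parser.can_fetch(agent, path)


# A folder rule covers everything underneath it.
folder_hit = blocked("/private/reports/2014.html")
# A single-file rule covers just that page, not its siblings.
file_hit = blocked("/drafts/old-page.html")
sibling_hit = blocked("/drafts/new-page.html")
```

In this sketch both kinds of entry work: the folder rule blocks every path under /private/, while the file rule blocks only the named page.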
@kiwistech (author), Jun 26, 2014: How do we write the URLs in the robots.txt file that we do not want Google to crawl?
@omthavertch, Jul 01, 2014:
User-agent: *
Disallow: /~joe/junk.html
Disallow: /~joe/foo.html
Disallow: /~joe/bar.html
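One way to sanity-check rules like these before uploading them is to run them through a parser. This is a minimal sketch using Python's standard urllib.robotparser; the helper name and the "ExampleBot" agent are assumptions, and a real search engine's own parser may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# The rules posted above, as they would appear in /robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /~joe/junk.html
Disallow: /~joe/foo.html
Disallow: /~joe/bar.html
"""


def is_allowed(path, agent="ExampleBot"):
    """Return True if a compliant crawler may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


# The three listed pages are disallowed; other pages are not.
results = {
    path: is_allowed(path)
    for path in ["/~joe/junk.html", "/~joe/foo.html", "/~joe/index.html", "/"]
}
```

Because the record uses "User-agent: *", the same answer applies to any agent name passed in.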
@ChristinaCa, Jul 10, 2014: Robots.txt is a text file you put on your site to tell search robots which pages or folders may be crawled and which may not.
@FidelaSolom, Jul 10, 2014: If your website has a robots.txt file, it helps you a lot to get ranked on search engines. A robots.txt is a simple text file that tells search engine crawlers about your website: which pages should be crawled and which should be ignored. The sitemap and robots.txt both play an important role; they serve different but complementary purposes.
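One place where the two files meet is the "Sitemap:" line that a robots.txt file may carry to point crawlers at the sitemap. The sketch below assumes Python 3.8 or later (for RobotFileParser.site_maps()); the domain, the /admin/ path, and the "ExampleBot" agent are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that both restricts crawling and advertises a sitemap.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# URLs the site asks crawlers to look at (None if no Sitemap line is present).
sitemaps = parser.site_maps()

# Paths the site asks crawlers to stay out of.
admin_allowed = parser.can_fetch("ExampleBot", "/admin/settings.html")
home_allowed = parser.can_fetch("ExampleBot", "/")
```

Here the Disallow line says what to skip and the Sitemap line says where the list of wanted pages lives, which is one reading of "complementary".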
@root, Jul 10, 2014: The robots.txt file is a reference for web crawlers, which read the file and follow its instructions when crawling a site, and it is completely optional.

If you have directories (folders) on the server that you don't want crawled, then you will have to use server-side .htaccess files, which the server reads when a request is made for that folder. Namely, if you use .htaccess, you are looking for the settings that disallow contents listing / directory listing.
@root, Jul 10, 2014: When I say completely optional, I mean that it is up to the web crawler to honor your request; most will not, and they will crawl the entire site.
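The point that compliance is the crawler's choice can be seen in where the check lives. In this minimal sketch (hypothetical function name, rules, and URLs, using Python's standard urllib.robotparser), the filtering happens entirely on the crawler's side; nothing on the server enforces it.

```python
from urllib.robotparser import RobotFileParser


def polite_filter(robots_lines, urls, agent="ExampleBot"):
    """Keep only the URLs that the robots.txt rules allow for this agent.

    A crawler that never calls a check like this simply requests every URL.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return [url for url in urls if parser.can_fetch(agent, url)]


# Hypothetical rules and crawl queue.
RULES = ["User-agent: *", "Disallow: /private/"]
QUEUE = [
    "http://www.example.com/",
    "http://www.example.com/private/notes.html",
    "http://www.example.com/blog/post.html",
]

to_fetch = polite_filter(RULES, QUEUE)
```

A well-behaved crawler would fetch only what remains in to_fetch; keeping content away from crawlers that skip this step needs server-side controls.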
@CharlesGirard, Jul 10, 2014: The robots.txt is a very simple text file that is placed in your root directory. An example would be www.yourdomain.com/robots.txt. This file tells search engines and other robots which areas of your site they are allowed to visit and index.
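Since the file always sits at the root of the host, its address can be worked out from any page URL. A minimal sketch with Python's standard urllib.parse follows; the function name is hypothetical and the domain is the placeholder from the post.

```python
from urllib.parse import urljoin


def robots_url(page_url):
    """Return the robots.txt address for the site that serves page_url."""
    # A leading slash makes urljoin replace the whole path, query included.
    return urljoin(page_url, "/robots.txt")


location = robots_url("http://www.yourdomain.com/blog/2014/post.html?page=2")
```

However deep the page is, the result points at /robots.txt on the same host, which is the only place crawlers look for it.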
@GeorgeZapat, Jul 15, 2014: Robots.txt gives search engines an idea of which content of the website should be crawled and which should be ignored. It is a text file, and it is placed in the website's root directory. The sitemap and robots.txt are both important for any website.