
Robots.txt troubles

I’m using a robots.txt, partly generated by Google, to stop direct links to my image folder and a folder with files that can be downloaded from my site (CVs, desktop design, PDFs). The contents of these folders (“img” and “dwn”) are still shown by Google Images and other search engines. Could anyone tell me what’s wrong?

[code]User-agent: *
Disallow: /img/

User-agent: msnbot-media
Disallow: /

User-agent: Googlebot-Image
Disallow: /

User-agent: Googlebot-Image
Disallow: /dwn/

User-agent: *
Disallow: /Templates/

User-Agent: BDFetch
Disallow: /

User-Agent: BPImageWalker/2.0
Disallow: /[/code]

(BPImageWalker and BDFetch are malicious crawlers)


7 Comments

@Pris (author), May 24, 2010: Bump
@Pris (author), May 27, 2010: No one? Really?
@Fang, May 30, 2010: I imagine the images are used in documents on your site. The bots are collecting the data from there.
@Pris (author), May 30, 2010: [QUOTE]I imagine the images are used in documents on your site. The bots are collecting the data from there.[/QUOTE]

Hi Fang, thanks for your reply.

I don't really understand what you mean by "the images are used in documents on your site". The images are embedded in the HTML like so:
<img src="img/myimage.jpg" name="placeholder" vspace="0" border="0" class="floatRightbrdr" id="placeholder" onmousedown="if (event.preventDefault) event.preventDefault()" oncontextmenu="return false;" />
I've used this method because the pics (on several pages) are part of a simple image gallery, here's a link. The image gallery is accessible through the links at the top right.

I don't understand why that should matter in this case, though: when a robots.txt blocks a certain folder, Google Images should not be allowed to display its contents, right?

BTW, as a temporary solution, a JavaScript snippet now breaks out of the frame when my site is visited through Google Images search results.
@Fang, May 30, 2010: The bots read your documents, find an image element, and store the image data. It is irrelevant that robots.txt blocks the image folder. In any case, not all bots read the robots.txt file.
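
To illustrate the point about bots reading the documents: a crawler does not need to list the image folder to learn that an image exists, because the URL is already in the page markup. Below is a minimal sketch in Python, standard library only. The `ImageCollector` class, the sample markup and the `www.example.com` address are made up for illustration, and real crawlers are far more elaborate than this.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects the absolute URL of every <img> element fed to it."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # URL of the page being read (hypothetical)
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths like "img/myimage.jpg" against the page URL.
                self.image_urls.append(urljoin(self.base_url, src))


# Simplified stand-in for one gallery page (not the real site markup).
page = '<p>Gallery</p><img src="img/myimage.jpg" class="floatRightbrdr" id="placeholder" />'

collector = ImageCollector("http://www.example.com/gallery.html")
collector.feed(page)
print(collector.image_urls)  # ['http://www.example.com/img/myimage.jpg']
```

Discovering a URL this way is separate from fetching it: a crawler that honours robots.txt would still be expected to check the rules before requesting the image file itself.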
@Pris (author), May 30, 2010: I'm aware several bots don't respect robots.txt, but the Google Image bot should. Is there something wrong with or missing from the code posted above? I've tried the suggestions from Google Webmaster Central, but without result.

Adding <meta name="robots" content="noimageindex"> to the head of the pages also proved useless.
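
One rough way to sanity-check the rules posted at the top of the thread is to feed them to a robots.txt parser and ask what each crawler may fetch. The sketch below uses Python's standard-library `urllib.robotparser`. That is only one implementation and not the parser Google runs, so treat the results as indicative. The `allowed` helper, the `www.example.com` host and the file names `myimage.jpg` and `cv.pdf` are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# The rules exactly as posted in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /img/

User-agent: msnbot-media
Disallow: /

User-agent: Googlebot-Image
Disallow: /

User-agent: Googlebot-Image
Disallow: /dwn/

User-agent: *
Disallow: /Templates/

User-Agent: BDFetch
Disallow: /

User-Agent: BPImageWalker/2.0
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if this parser would let `agent` fetch `path` (hypothetical host)."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


# Googlebot-Image has its own "Disallow: /" group, so it is shut out everywhere.
print(allowed("Googlebot-Image", "/img/myimage.jpg"))
print(allowed("Googlebot-Image", "/dwn/cv.pdf"))

# /dwn/ is only listed under Googlebot-Image, never under "User-agent: *",
# so a crawler that falls back to the "*" rules is not told to stay out of it.
print(allowed("Googlebot", "/dwn/cv.pdf"))
```

Two things stand out under these assumptions. The file already excludes Googlebot-Image from the whole site, and `/dwn/` is not disallowed for any crawler other than Googlebot-Image. The file also repeats the `User-agent: *` and `User-agent: Googlebot-Image` groups; parsers differ in how they treat repeated groups, so keeping a single group per user agent and listing `/img/`, `/dwn/` and `/Templates/` together under `User-agent: *` would remove that ambiguity.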
@Fang, May 30, 2010: Your images have already been indexed. Too late.