
How can a search engine crawl pages in a MySQL database?

I am making a website in PHP where articles can be added.
There is a form on the website to add a new article; when it is submitted, the data (title, text, …) is stored in a MySQL database.

Now I would like the articles to appear in the search results of, e.g., Google, but a search engine does not crawl pages generated from a database.
Do I now have to create a separate page for each article, so that I end up with a lot of pages?

I have already tried to find out on the internet how to solve this, but I don’t really know in which direction to look.
I know there must be a way because many news or forum websites work like this, but I can’t figure out how they do it.

Does anyone know how to fix this or in which direction to look?

PHP

6 Comments

@ginerjm (Jan 29, 2022): I don't think you want to create static pages that show all of your DB content. Instead, you might write a short summary of each article and build a page from those summaries. That page would not be shown to visitors, but since it sits in your site root, any bot could detect and read it. However, I don't know how you would organize it so that the bot could pull the pertinent portion of that 'index' page and give a user a meaningful hit.
@tracknut (Jan 29, 2022): Are these articles displayed somewhere, visible on the internet? For example, do you have a page like "www.example.com/show=best+type+of+beer" that displays the article? If so, then yes, Google can crawl and index it. You could even add that URL to your sitemap to help Google. But if there's only a form to enter an article and no page to display it, then it won't be indexed.
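As a rough illustration of the sitemap idea, here is a minimal sketch that builds a sitemap in the sitemaps.org XML format from a list of article names. The function name `buildSitemap`, the base URL, and the `?show=` URL pattern are assumptions made for this example; adapt them to however your site actually displays an article.

```php
<?php
// Hypothetical helper: builds a sitemap with one <url> entry per article.
// $baseUrl and the "?show=" parameter are assumptions for this sketch.
function buildSitemap(string $baseUrl, array $articleNames): string
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($articleNames as $name) {
        // urlencode() turns spaces into "+", e.g. "best+type+of+beer".
        $url = $baseUrl . '?show=' . urlencode($name);
        // Escape XML special characters such as "&" inside the URL.
        $loc = htmlspecialchars($url, ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $xml .= '  <url><loc>' . $loc . '</loc></url>' . "\n";
    }
    $xml .= '</urlset>' . "\n";
    return $xml;
}

// Example usage with a made-up article list.
$sitemap = buildSitemap('http://www.example.com/articles.php', ['best type of beer']);
```

In practice the list of names would probably come from the database, and the result would be served as sitemap.xml.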
@ginerjm (Jan 29, 2022): Do you have any copyright issues with storing these articles?
@jeff007 (author, Jan 30, 2022): @tracknut#1642026 Currently the articles are only stored in a database, so they cannot be found on the internet. But in your example "www.example.com/show=best+type+of+beer", how do I get my articles to appear after the "/"? I don't really know how this is done.
@tracknut (Jan 30, 2022): @jeff007#1642034 First, let me adjust my example to this form instead, as it will be simpler:

"www.example.com/articles.php?show=best+type+of+beer" (note the file and question mark I added)

It can be done without the question mark, but that requires some more advanced web server configuration that likely isn't important right now. The part after the question mark is called a query string, and you can read it in your PHP program through a superglobal array called $_GET. So if you had (per this example) a file called articles.php, and someone got to it by typing that URL, by following a link on some page (<a href="http://www.example.com/articles.php?show=best+type+of+beer">Check out my article on beer</a>), or maybe through a pull-down in an HTML form, articles.php could look in the $_GET array and use that value to pull the article out of your database and display it.
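To make the query string idea concrete, here is a minimal sketch of how articles.php might read the `show` parameter. The helper name `requestedArticleName` is made up for this example, and the database lookup is deliberately left out.

```php
<?php
// Hypothetical helper: returns the article name from ?show=..., or null.
function requestedArticleName(array $query): ?string
{
    // PHP has already URL-decoded the value, so "best+type+of+beer"
    // arrives here as "best type of beer".
    $name = $query['show'] ?? null;
    if (!is_string($name)) {
        return null; // missing, or an array such as ?show[]=x
    }
    $name = trim($name);
    return $name === '' ? null : $name;
}

$name = requestedArticleName($_GET);
if ($name === null) {
    echo 'No article requested.';
} else {
    // The value is user input, so escape it before echoing it into HTML.
    echo 'Requested article: ' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
}
```

Requesting articles.php?show=best+type+of+beer would make `$name` the string "best type of beer", which is what you would then look up in the database.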

There is a lot more to this than I can type here, and I understand you don't have the code yet, but do some googling on "PHP query strings" or "PHP $_GET array" and you'll likely find a tutorial. One caution: since this string can simply be typed by a user, you'll get all sorts of crap, including attempted hacks, coming into your program via this parameter. You need to sanitize it (i.e., make sure the parameter is legit) before you pass it to the database. The simplest way to do that might be to check whether it's in an array like $article_names that you maintain in PHP, and only call the database if you truly have a legit article name.
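A minimal sketch of that check might look like the following. The contents of `$article_names`, the function names, and the `articles` table with `title` and `text` columns are assumptions for illustration. The prepared statement is an extra safeguard not mentioned above: it is a common way to keep user input from being interpreted as SQL.

```php
<?php
// Hypothetical whitelist of article names that you maintain yourself.
$article_names = ['best type of beer', 'brewing at home'];

// True only if the requested name exactly matches a known article.
function isKnownArticle(string $name, array $knownNames): bool
{
    return in_array($name, $knownNames, true); // strict comparison
}

// Assumed schema: table `articles` with columns `title` and `text`.
// The value is bound as a parameter, never concatenated into the SQL.
function fetchArticle(PDO $pdo, string $name): ?array
{
    $stmt = $pdo->prepare('SELECT title, text FROM articles WHERE title = :title');
    $stmt->execute([':title' => $name]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}

// Only touch the database when the name passes the whitelist check:
// if (isKnownArticle($name, $article_names)) { $article = fetchArticle($pdo, $name); }
```

With this in place, a typed-in value such as `x' OR '1'='1` fails the whitelist check and never reaches the database.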

I hope that gets you pointed in the right direction.
@NogDog (Jan 30, 2022): Ignoring search engines for the moment, how do ordinary users find/view articles on your site? Basically, if people can find an article, a search engine should be able to, too. Search engines do not look at the source files on your web server; they just make HTTP requests and view what would get returned to a normal user's web browser. Therefore, once a search engine knows your site exists, it will:

  1. Look at the HTML of the first page it finds

  2. "Click" on each new link it finds there, and look at the HTML it gets from that

  3. Keep repeating the above until it doesn't find any more new links on any of the HTML it views

So, if people can find and read your articles, then search engines should be able to as well. If people can _not_ find and read articles, then I think you need to address that first (and the search engines will follow).
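The three steps can be sketched as a small loop over an in-memory "site". This is only a toy model, not how any real search engine is implemented: the `$site` array, its URLs, and the `crawl` function are made up for this example, and a real crawler would fetch each page over HTTP instead.

```php
<?php
// Hypothetical site: URL => the HTML a browser would receive for it.
$site = [
    '/'                       => '<a href="/articles.php?show=beer">Beer</a> <a href="/about">About</a>',
    '/about'                  => '<a href="/">Home</a>',
    '/articles.php?show=beer' => '<h1>Beer</h1> <a href="/">Home</a>',
    '/articles.php?show=wine' => '<h1>Wine</h1>', // no page links to this one
];

// Follows links breadth-first from $start and returns every URL reached.
function crawl(array $site, string $start): array
{
    $seen  = [$start => true];
    $queue = [$start];
    while ($queue) {
        $url  = array_shift($queue);
        $html = $site[$url] ?? '';                 // step 1: look at the HTML
        preg_match_all('/href="([^"]+)"/', $html, $matches);
        foreach ($matches[1] as $link) {           // step 2: "click" new links
            if (!isset($seen[$link])) {
                $seen[$link] = true;
                $queue[] = $link;
            }
        }
    }                                              // step 3: stop when no new links
    return array_keys($seen);
}

$found = crawl($site, '/');
```

Here the beer article is discovered because the home page links to it, while the wine article is never reached because nothing links to it, even though the page exists.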