Importing pages

A discussion in the PHP Language forum, part of the PHP Programming Forums category.


Posted 09-22-2004 by AJ

Hi all

I've written a content management system that I'm now selling to my
customers. It's very nice when we have a blank canvas of a site, but a pain
in the arse when there is already a site in place.

What I'm in the process of *trying* to put together is a script that would
do the following:

A simple form where you put the address of the site with the static pages

The script then spiders through the site, takes everything between <body>
and </body> and chucks the rest away

It would then take out all class attributes and all embedded styling like
font tags etc, but leave tables, <p>, <H?> etc
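
A minimal sketch of that clean-up step, written in Python purely for illustration (the thread does not show any code, and the same idea should port to PHP's DOM functions). Which tags and attributes count as "presentational" is an assumption here: the hypothetical DROP_TAGS and DROP_ATTRS sets only cover font tags plus class and style attributes. Comments and doctypes are silently dropped because the parser callbacks for them are not overridden.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: these sets are illustrative, not a complete list.
DROP_TAGS = {"font"}            # tags removed (their contents are kept)
DROP_ATTRS = {"class", "style"}  # attributes removed from every tag


class Cleaner(HTMLParser):
    """Re-emit HTML, leaving out presentational tags and attributes."""

    def __init__(self):
        # Keep entities such as &amp; untouched instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _emit_tag(self, tag, attrs, closing=""):
        if tag in DROP_TAGS:
            return
        parts = [tag]
        for name, value in attrs:
            if name in DROP_ATTRS:
                continue
            if value is None:  # bare attribute, e.g. <td nowrap>
                parts.append(name)
            else:
                parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + closing + ">")

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, " /")

    def handle_endtag(self, tag):
        if tag not in DROP_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)


def strip_presentation(html):
    """Return html without font tags and without class/style attributes."""
    cleaner = Cleaner()
    cleaner.feed(html)
    cleaner.close()
    return "".join(cleaner.out)
```

For example, strip_presentation('<p class="intro"><font size="2">Hi</font></p>') gives '<p>Hi</p>'. A real importer would probably need a longer drop list (bgcolor, align and so on), which is left open here.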

This would leave a very plain page of HTML that would be inserted into a
database. CSS would control the fonts etc. I'm aware that some tidying up
would be needed if there was any JavaScript or the like, along with some
basic formatting.
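
The storage step might look roughly like the sketch below. It is in Python with SQLite only so that it runs without a server; the post does not say which database the CMS uses, and the table name pages and its columns url and body are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the page store. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        " url TEXT PRIMARY KEY,"
        " body TEXT NOT NULL)"
    )
    return conn


def save_page(conn, url, body):
    """Store cleaned HTML; re-importing a URL overwrites the old copy."""
    conn.execute(
        "INSERT OR REPLACE INTO pages (url, body) VALUES (?, ?)",
        (url, body),
    )
    conn.commit()


def load_page(conn, url):
    """Return the stored HTML for url, or None if it was never imported."""
    row = conn.execute(
        "SELECT body FROM pages WHERE url = ?", (url,)
    ).fetchone()
    return row[0] if row else None
```

Keying on the URL means an import can be re-run after fixing the clean-up rules without creating duplicate rows.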

What I want to know is:

1. Has it been done and, if so, where might I find something like this?
2. Might it have any commercial value to other developers?

Regarding 2, I'm thinking how much time something like this might save me if
I have to convert anything more than a few pages of static HTML into
something that I can put in a database.

Your thoughts would be appreciated.

Andy


Posted 09-22-2004 by Nikolai Chuvakhin
Re: Importing pages

"AJ" <nospam@redcatmedia.net> wrote in message
news:<cirrgo$egu$1@hercules.btinternet.com>...
>
> The script then spiders through the site, takes everything between
> <body> and </body> and chucks the rest away


Bad idea. As of HTML 4.0, <head> and <body> tags are optional...
Also, why spider the site, if you can (theoretically, at least)
crawl the local file system?
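
Both points can be sketched together, in Python for illustration only. The hypothetical page_content() falls back to the whole document when the <body> tag was omitted, and local_pages() walks a directory instead of fetching over HTTP. The UTF-8 encoding and the .html/.htm extensions are assumptions, and the fallback is crude: with no <body> tag it also returns anything that would have belonged in the head.

```python
import re
from pathlib import Path

# Matches <body ...> up to </body>, or up to end of input when the
# closing tag was left out.
BODY_RE = re.compile(
    r"<body\b[^>]*>(.*?)(?:</body\s*>|\Z)",
    re.IGNORECASE | re.DOTALL,
)


def page_content(html):
    """Return the markup inside <body>, or all of html if there is none."""
    match = BODY_RE.search(html)
    return match.group(1) if match else html


def local_pages(root):
    """Yield (relative_path, content) for each HTML file under root."""
    root = Path(root)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in {".html", ".htm"}:
            # Assumption: files are UTF-8; undecodable bytes are replaced.
            text = path.read_text(encoding="utf-8", errors="replace")
            yield path.relative_to(root).as_posix(), page_content(text)
```

A proper HTML parser would infer the omitted tags more reliably than a regular expression; the regex is used here only to keep the sketch short.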

> 1. Has it been done and, if so, where might I find something like this


The spidering part along with storing in databases is what search
engines do. What you need to add is the processing in-between.
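
For the spidering part, the core is collecting the links on each page and keeping only those on the same site. A minimal sketch in Python, for illustration; the names LinkCollector and same_site_links are made up for this example, and fetching the pages, queueing them and avoiding revisits are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(page_url, html):
    """Return absolute links found in html that stay on page_url's host.

    Fragments (#top) are dropped and duplicates removed, keeping the
    order in which links first appear.
    """
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    links = []
    for href in collector.hrefs:
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links
```

Each returned URL would then be fetched in turn and passed through the same function until no new links turn up.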

> 2. Might it have any commercial value to other developers?


Developers, I doubt it. Content managers, possibly...

Cheers,
NC