This is a discussion on Database structure decision within the MySQL Database forums, part of the Database Forums category.
AlterEgo wrote:
> Jerry,
>
> Regarding storing images in the database:
>
> 1. If one is looking for quick and easy (as in a hobby application), then I
> totally agree - store them in the database. If one needs to keep a scalable
> product life-cycle in mind, then I would not keep them in the database.

Not sure that I agree with you on this point. You CAN overload a filesystem with too many files in any one directory. You must then employ a software storage scheme (layer) just to keep things manageable.

> 2. If this is a commercial or community driven venture, then it will have to
> scale if it is successful. If it is not successful, then it really won't
> matter.

How does keeping them in the database not scale?

> 3. Transferring binary data from the native file system is way faster than
> any SQL database.

I think that also depends. How many files are in that 'full' directory?

> 4. File systems are more easily scaled than databases.

Manageability (backups, keeping the database and filesystem in sync) can become quite a bear. Keeping it all in the database means one backup operation results in a complete entity that can be re-created without problems. And the backups can be zipped up to save space. You can achieve zipped archives approx. 12% of the original size (that's including the image data).

> 5. Automated image management utilities (for creating thumbnails, converting
> image formats, reading image meta-data, etc.) love working with file
> systems, but hate working with databases.

I don't know of any that are designed to work with databases. That being said, I can load an image from a database, resize/manipulate it with GD (maybe even with imagick - magicwand api) and send it out with no problem, all without touching the filesystem.

> 6. It's far easier to distribute images to the "edge of the web" with
> companies like Akamai or Digital Island hosting the content close to the
> users.
>
> I guess what it really boils down to is: thousands of pictures or millions
> of pictures? hi res, low res, thumbnails, etc.?

Thousands or millions: for me personally, I wouldn't attempt to manage what databases were designed to do for me.

> -- Bill
>
> "Jerry Stuckle" <jstucklex@attglobal.net> wrote in message
> news:w6OdnbneO9eyGyTYnZ2dnUVZ_uKknZ2d@comcast.com...
>> Mikee Freedom wrote:
>>> Good Morning all,
>>>
>>> New member to the list, hoping you might be able to give me some much
>>> needed advice.
>>>
>>> Basically, I have a client who would like to offer the ability for his
>>> users to have their own independent website at his domain. It is not as
>>> clear cut as that but as a generic description it will do.
>>>
>>> I know such services exist and I'm by no means emulating theirs in any
>>> way. The purpose of the individual user sites is fairly
>>> specific, hence why he needs to get us to create it for him.
>>>
>>> In a nutshell, people will be able to sign up, make some configuration
>>> decisions, add some content, and have a website of their own that they
>>> will be able to upload photos to. Lots of photos.
>>>
>>> The decision I was looking at making was whether or not to create
>>> individual databases for each of the new users, whether this was going
>>> to be a good idea or bad, or whether it depended on further factors.
>>>
>>> I've only begun to plan the site but this idea popped into my head and
>>> I was hoping someone could either say - "you ass, what are you
>>> thinking?"; or indicate it may be beneficial.
>>>
>>> My alternate option is to relate all content, photos, albums, etc. to
>>> individual users. This is cool I guess, but I liked the idea of complete
>>> separation.
>>>
>>> One specific question I had was, if I needed to search for a particular
>>> value in multiple databases, is this going to be a pain in the ass, a
>>> terrible load on the server... or anything else that I may be
>>> overlooking?
>>>
>>> Conclusion:
>>>
>>> I like the idea of it, is it a good one?
>>> Are there considerations?
>>>
>>> Thanks everyone,
>>> Mikee
>>>
>>> p.s. if any of what I've written doesn't make sense please feel free to
>>> berate or ask for further explanation :)
>>>
>> One thing to consider here - the users. They'll be uploading their own
>> content. Does this include server-side scripts like PHP, Perl, etc.? Will
>> they need to create their own tables for anything? Will different users
>> have vastly different requirements?
>>
>> If so, I think you should go with separate databases for each user for
>> security purposes. Give each user their own userid and password and only
>> allow them access to their own database.
>>
>> As for storing pictures in the database - I do it regularly. MySQL
>> handles it quite well. I use mainly the InnoDB engine, so I also have
>> foreign key constraints, which I set up to not allow a picture to be
>> deleted as long as it's still being referenced. It also makes it easier
>> to reference the pictures - you don't need a filename. I just keep the
>> pictures in their own table for performance reasons and don't worry about
>> it any more.
>>
>> Not to mention making backups easier.
>>

I agree with Jerry completely, although I normally use MyISAM tables. I'm not a wizard at database things at all.

And Jerry hit on one point that I think a lot of people miss... keeping the images/binary data in separate tables from the other data... very, very important. Using a blob type field in any table definition automatically (and mostly silently) converts all other fixed length fields to variable length fields (at least in MySQL 4.?), i.e. CHARs become VARCHARs.

Norm
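For anyone following along, the layout described above (pictures in their own table, with a foreign key that refuses to delete a picture while something still references it) can be sketched roughly as below. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL/InnoDB so it runs without a server, and the table names, column names and helper functions are all made up for the example, not taken from anyone's actual schema.

```python
import sqlite3


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the image data in its own table."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Binary data lives alone, away from the frequently scanned rows.
        CREATE TABLE image (
            id   INTEGER PRIMARY KEY,
            mime TEXT NOT NULL,
            data BLOB NOT NULL
        );
        -- Hypothetical referencing table; RESTRICT blocks deleting an
        -- image that is still in use.
        CREATE TABLE album_entry (
            id       INTEGER PRIMARY KEY,
            caption  TEXT NOT NULL,
            image_id INTEGER NOT NULL
                     REFERENCES image(id) ON DELETE RESTRICT
        );
        """
    )
    return conn


def add_image(conn: sqlite3.Connection, mime: str, data: bytes) -> int:
    """Store the raw bytes and return the new image id."""
    cur = conn.execute(
        "INSERT INTO image (mime, data) VALUES (?, ?)", (mime, data)
    )
    return cur.lastrowid


def try_delete_image(conn: sqlite3.Connection, image_id: int) -> bool:
    """Return True if the image was deleted, False if it is still referenced."""
    try:
        conn.execute("DELETE FROM image WHERE id = ?", (image_id,))
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place, a delete attempt on a picture that an album entry still points at fails, and succeeds once the entry is gone. An equivalent InnoDB schema would presumably declare the same constraint with FOREIGN KEY ... ON DELETE RESTRICT.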
|
|||
|
AlterEgo wrote:
> (Top posting fixed)

> Jerry,
>
> Regarding storing images in the database:
>
> 1. If one is looking for quick and easy (as in a hobby application), then I
> totally agree - store them in the database. If one needs to keep a scalable
> product life-cycle in mind, then I would not keep them in the database.

I disagree. It scales quite well to larger databases. I've had databases in the tens of gigabytes containing pictures, PDFs and other binary data. It works great.

> 2. If this is a commercial or community driven venture, then it will have to
> scale if it is successful. If it is not successful, then it really won't
> matter.

The busiest has upwards of 100K hits per day on average. Peaks have been over 250K. During testing we pushed it at > 1M hits/day. That's well beyond a "hobby site". In fact I wish some of my other sites got this much traffic :-)

> 3. Transferring binary data from the native file system is way faster than
> any SQL database.

I suggest you check your figures. It may be a little faster - but in no way is it "way faster".

> 4. File systems are more easily scaled than databases.

Again I disagree. I've been doing RDB work since the early 80's when I started with DB2 on IBM mainframes. If properly designed, databases can scale much better than file systems.

> 5. Automated image management utilities (for creating thumbnails, converting
> image formats, reading image meta-data, etc.) love working with file
> systems, but hate working with databases.

So don't use them. Not a problem.

I do use them. When a new image is uploaded, for instance, I may store a thumbnail as well as the image itself. But I don't need the utility after that.

And why should I waste CPU and other system resources creating thumbnails every time they are requested?

> 6. It's far easier to distribute images to the "edge of the web" with
> companies like Akamai or Digital Island hosting the content close to the
> users.
>

That's one way to do it. But it also creates nightmare backups and the like. My customers use mostly dedicated servers and VPSs.

> I guess what it really boils down to is: thousands of pictures or millions
> of pictures? hi res, low res, thumbnails, etc.?
>
> -- Bill
>

Tens of thousands of pictures. Hi res and thumbnails, mostly. As I said, database size in the tens of GB. Don't know what it is lately - I haven't looked at the size.

I suggest you try it before you start telling me how bad it is. As I said - I've done it for a number of sites. It works great. And I've been doing it with RDBs for a lot longer than most people in this group. Proper design, tuning and implementation and it works quite well.

How many have you actually done this on? Or are you just talking through your hat?

P.S. Please don't top post.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
|
|||
|
Gary L. Burnore wrote:
>>> Jerry,
>>>
>>> Regarding storing images in the database:
>>>
>>> 1. If one is looking for quick and easy (as in a hobby application), then I
>>> totally agree - store them in the database. If one needs to keep a scalable
>>> product life-cycle in mind, then I would not keep them in the database.
>>
>> I disagree. It scales quite well to larger databases. I've had
>> databases in the tens of gigabytes containing pictures, PDFs and
>> other binary data. It works great.
>
> Wow, tens of gigabytes? Heh.
>

Yep. How many databases of that size do you deal with? From your other statements I suspect you haven't gotten over 50Kb.

>>> 2. If this is a commercial or community driven venture, then it will have to
>>> scale if it is successful. If it is not successful, then it really won't
>>> matter.
>>
>> The busiest has upwards of 100K hits per day on average. Peaks have been
>> over 250K. During testing we pushed it at > 1M hits/day. That's well
>> beyond a "hobby site". In fact I wish some of my other sites got this
>> much traffic :-)
>>
>>> 3. Transferring binary data from the native file system is way faster than
>>> any SQL database.
>>
>> I suggest you check your figures. It may be a little faster - but in no
>> way is it "way faster".
>
> It's dependent on the filesystem, the database and many other things.
> You're both blowing hot air.
>

Wrong. I don't know of any filesystem which can handle 100K files in one directory very well. But 100M rows is easily handled by a good database.

You really should get some facts before you start accusing others of blowing hot air.

>>> 4. File systems are more easily scaled than databases.
>>
>> Again I disagree. I've been doing RDB work since the early 80's when I
>> started with DB2 on IBM mainframes. If properly designed, databases can
>> scale much better than file systems.
>
> Two more moronic statements. (His and yours). Either can scale well
> if designed correctly.
>

Let's see you scale a filesystem to handle 100K files in a single directory. And no, I'm not talking about putting them in separate directories - where the program has to decide which directory(ies) to search for the file. I'm talking about doing it the way you do in a database - with everything in a single table.

>>> 5. Automated image management utilities (for creating thumbnails, converting
>>> image formats, reading image meta-data, etc.) love working with file
>>> systems, but hate working with databases.
>>
>> So don't use them. Not a problem.
>>
>> I do use them. When a new image is uploaded, for instance, I may store
>> a thumbnail as well as the image itself. But I don't need it after that.
>>
>> And why should I waste CPU and other system resources creating
>> thumbnails every time they are requested?
>>
>>> 6. It's far easier to distribute images to the "edge of the web" with
>>> companies like Akamai or Digital Island hosting the content close to the
>>> users.
>>>
>> That's one way to do it. But it also creates nightmare backups and the
>> like. My customers use mostly dedicated servers and VPSs.
>
> So backing up a bunch of dedicated servers is better how, exactly?

>>> I guess what it really boils down to is: thousands of pictures or millions
>>> of pictures? hi res, low res, thumbnails, etc.?
>>>
>>> -- Bill
>>>
>> Tens of thousands of pictures. Hi res and thumbnails, mostly. As I
>> said, database size in the tens of GB. Don't know what it is lately - I
>> haven't looked at the size.
>
> When you get to tens of terabytes, then you can talk about how well
> you scale. Tens of gigs or even a couple hundred is nothing anymore.
>

No, but it's bigger than most of the websites out there. And how many filesystems handle 10's of terabytes in a single directory?

> He's right about one thing. It makes far more sense to NOT store
> images in a database table.

>> I suggest you try it before you start telling me how bad it is. As I
>> said - I've done it for a number of sites. It works great. And I've
>> been doing it with RDBs for a lot longer than most people in this
>> group.
>
> Bullshit.
>

Yep, and you're the one who's full of it. You know nothing about my background or my experience.

Stoopid asshole.

>> Proper design, tuning and implementation and it works quite well.
>
> Now that is true.
>
>> How many have you actually done this on? Or are you just talking
>> through your hat?
>
> Better than out your ass.

>> P.S. Please don't top post.
>
> We agree on this.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
|
|||
|
Jerry,

Chill dude. I don't want to play techie egos here, but it's late afternoon, I need a little R&R, and since I was asked:

Currently:
Director of Emerging Technologies
www.connect3.com
Our systems manage product content, pricing and advertising (print and web) production for large retailers - over a terabyte of millions of high and low res images combined. Home Depot, Best Buy, Circuit City, Petco ...

Before that:
V.P. Technology
www.local.com
Processing 170 million search requests/day, 1.2 billion click-stream transactions/day across four data centers - all transactions in distributed relational databases.

Speaking engagements:
Professional Association for SQL Server: Enterprise Class Service Levels on a Dotcom Budget.
L.A. .NET Users Group: Breaking the Rules for Blistering OLTP Performance.
.... others.

Amazon, eBay, Akamai and Google do not store their images in a database. As a matter of fact, Google uses a file system to index *all* of its data: http://216.239.37.132/papers/gfs-sosp2003.pdf. Are they wrong also?

Also in my posting I said to store the images in a hive folder structure (unbalanced tree), not in one directory - jeez! If you choose to use UUIDs (GUIDs) as filenames, you get a remarkably balanced tree - at least up through the first 12 characters.

If scaling isn't an issue, by all means store images in a database. If it is an issue, then I'll side with the big boys and store them in a file system. They know a little bit about implementing technology.

Top poster and always will be, sorry.

-- Bill
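The hive folder idea with UUID filenames mentioned above might look something like the sketch below. It is a guess at the general approach, not Bill's actual scheme: the function name hive_path, the two-level/two-character split, the root directory and the .jpg extension are all assumptions chosen for illustration. The directory for a file is derived from the leading hex digits of its UUID, so no lookup table is needed to find it again. With random (version 4) UUIDs the first 12 hex digits are random, and the 13th is the fixed version digit, which may be what "up through the first 12 characters" refers to.

```python
import uuid
from pathlib import PurePosixPath


def hive_path(
    root: str,
    image_id: uuid.UUID,
    levels: int = 2,
    width: int = 2,
    ext: str = ".jpg",
) -> PurePosixPath:
    """Map a UUID to root/aa/bb/<32 hex digits><ext>.

    Each directory level takes `width` hex digits from the front of the
    UUID, so random UUIDs spread evenly across 16**width subdirectories
    per level.
    """
    hex_name = image_id.hex  # 32 lowercase hex characters, no dashes
    parts = [hex_name[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, hex_name + ext)


if __name__ == "__main__":
    # A new upload gets a random UUID; its path follows from the id alone.
    print(hive_path("/srv/images", uuid.uuid4()))
```

With the defaults this gives 256 x 256 = 65,536 leaf directories, so a million images would average around 15 files per directory. Changing levels or width later would move existing files, which is the kind of maintenance cost raised elsewhere in the thread.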
|
|||
|
AlterEgo wrote:
>
> Also in my posting I said to store the images in a hive folder structure
> (unbalanced tree), not in one directory - jeez! If you choose to use UUIDs
> (GUIDs) as filenames, you get a remarkably balanced tree - at least up
> through the first 12 characters.
>

Yep, and as you add new directories you need to keep changing the code. And it creates a management nightmare. What happens when you want to delete an image? Is it used by anything, for instance?

> If scaling isn't an issue, by all means store images in a database. If it is
> an issue, then I'll side with the big boys and store them in a file system.
> They know a little bit about implementing technology.
>

The "big boys" do store images in databases. We used to do it all the way back in the 80's on mainframes - for instance, scanned documents. And we did it for big companies (I was working for IBM at the time). It scales quite well.

Don't tell me it doesn't scale when you haven't tried it. I have. And it does - quite well.

> Top poster and always will be, sorry.
>
> -- Bill

That says it all.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================