how to secure documents in server

  #41 (permalink)  
Old 07-21-2008
The Natural Philosopher
 
Posts: n/a
Default Re: how to secure documents in server

AlmostBob wrote:
> "Bart Van der Donck" <bart@nijlen.com> wrote in message
> news:591ca336-5b7d-4c06-afc7-6408bdc92c41@c65g2000hsa.googlegroups.com...
> The Natural Philosopher wrote:
>
>> Bart Van der Donck wrote:
>>> (1) Read actions without BLOB:
>>> - Application does not load any BLOB data from database.
>>> - Application uses a var holding the system-path (usr/my/path/to/pics/),
>>> adds the ID to it, adds .jpg to it, tests if file exists (-e).
>>> - If yes, use URL-path instead of system-path and output inside an
>>> <IMG> to screen.
>>> - No binary data has to be handled; the major memory use here (if any)
>>> is the -e check for file existence. But even this could be skipped
>>> with a workaround.
>>> (2) Read actions with BLOB:
>>> - Load BLOB from column (already a memory-intensive task of its own).
>>> - Store in some folder (id.).

>
>>> It is my experience that (1) has huge memory benefits compared to
>>> (2).

>> The way I do it, it streams off the database via the unix socket into
>> PHP memory space, and is outputted from there via the web server to the
>> network.
>>
>> VERY little extra PHP or CPU activity is required, but I grant you it's
>> probably held in PHP and SQL type memory areas as well as disk cache
>> memory. It's probably NOT held in e.g. Apache memory though... Apache or
>> whatever will read the stdout of the CGI script that spits it, and just
>> pass the bytes... and memory is cheap. Cheaper than CPU anyway.

>
> All I do is this:
>
> SELECT id FROM table;
> print "<img src=url/to/$id.jpg>";
>
> Compared to your way:
> - Simpler
> - No need to start new php scripts to output raw binary stream for
> every image
> - No sockets
> - No need to read heavy binary BLOB from DB
> - No chance for possible cache attacks in MySQL, PHP, filesystem or
> Apache
>
> I don't want to sound religious, but I think my way is much better.
>


There is no better: it depends on the requirements.

Your way, there is no chance to protect the image directory from random
downloads, for example.

In my case the user may be a user with far greater access than the
general public, and have access to internal data - like plans, drawings
and specifications.

I don't want script kiddies stealing vital info: putting it in a
database is one giant leap in that sense.

Execution speed and efficiency are only one of many, many issues.

In my case the above, plus a general requirement to try and get all
important corporate data in the database, under one backup regime, were
more significant. I especially did NOT want user-accessible image files
that might get deleted by accident. I could protect the database area by
making it accessible only to root or the mysql daemon: directly
accessible download areas had to be at least readable, and if uploaded
to, writable, with the permissions the web server and PHP ran at.


In practice, at moderate loads the download speeds are far more dominant
than CPU or RAM limitations. And indeed the ability to make a special
download script that re-sizes the images on the fly turned out to be a
better way to go than storing thumbnails of varying sizes. One trades
disk space for processing overhead.
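
As a rough illustration of resizing on the fly (a sketch only, not the
download script described above): this assumes PHP's GD extension is
loaded and that the image bytes have already been read from the
database. The function name resize_jpeg and the JPEG quality of 85 are
made up for the example.

```php
<?php
/**
 * Scale image bytes down to at most $maxWidth pixels wide, keeping the
 * aspect ratio, and return JPEG bytes. Images that are already narrow
 * enough are returned unchanged. Hypothetical helper; needs ext-gd.
 */
function resize_jpeg(string $bytes, int $maxWidth): string
{
    if ($maxWidth < 1) {
        throw new InvalidArgumentException('maxWidth must be positive');
    }
    $src = @imagecreatefromstring($bytes);
    if ($src === false) {
        throw new InvalidArgumentException('not a decodable image');
    }

    $w = imagesx($src);
    $h = imagesy($src);
    if ($w <= $maxWidth) {
        return $bytes; // nothing to do: pass the stored bytes through
    }

    // Keep the aspect ratio; never let the height collapse to zero.
    $newH = max(1, (int) round($h * $maxWidth / $w));
    $dst = imagecreatetruecolor($maxWidth, $newH);
    imagecopyresampled($dst, $src, 0, 0, 0, 0, $maxWidth, $newH, $w, $h);

    ob_start();                 // imagejpeg() prints to the output buffer
    imagejpeg($dst, null, 85);  // 85 is an arbitrary quality setting
    return (string) ob_get_clean();
}
```

A download script would then send header('Content-Type: image/jpeg')
and echo the returned bytes. The CPU cost is paid on every request,
which is the disk-space-for-processing trade mentioned above.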

As a practicing engineer all my working life, it still amazes me that
people will always come up with what amounts to a religious statement
about any particular implementation, that it is universally 'better'.

If that were the case, it would be universally adopted instantly.

Jerry has (for once) made an extremely valid point about directory sizes
as well. Databases are far better at finding things quickly in large
amounts of data: far better than a crude directory search. Once the
overhead in scanning the directory exceeds the extra download
efficiency, you are overall on a loser with flat files.

AND if you run into CPU or RAM limitations, it's a lot easier to - say -
move your database to a honking new machine, or upgrade the one you have,
than to completely re-write all your applications that used to use files
so that they use the database.

I am NOT claiming that a database is the 'right' answer in all cases,
just pointing out that it may be a decision you want to make carefully,
as it is somewhat hard to change later on, and in most cases the extra
overhead on using it is more than compensated by the benefits,
particularly in access control.

Which was the primary concern of the OP.
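
To make the access-control point concrete, here is a minimal sketch of
a download script that only hands out a BLOB when the requesting user
is allowed to see it. Everything named here is an assumption for the
example: the documents table, its mime_type, required_level and content
columns, and the integer "user level" (which a real application would
take from its own login/session handling). It also assumes the PDO
driver returns BLOB columns as strings, as pdo_mysql does.

```php
<?php
/**
 * Return ['mime_type' => ..., 'content' => ...] for a document the user
 * may read, or null if it does not exist or the user lacks access.
 * Table and column names are hypothetical.
 */
function fetch_document(PDO $db, int $docId, int $userLevel): ?array
{
    // The access check is part of the query, so a forbidden document
    // is never loaded into PHP at all.
    $stmt = $db->prepare(
        'SELECT mime_type, content FROM documents
          WHERE id = :id AND required_level <= :level'
    );
    $stmt->bindValue(':id', $docId, PDO::PARAM_INT);
    $stmt->bindValue(':level', $userLevel, PDO::PARAM_INT);
    $stmt->execute();
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}

/** Send the document to the browser, or a 404 if access is refused. */
function send_document(PDO $db, int $docId, int $userLevel): void
{
    $doc = fetch_document($db, $docId, $userLevel);
    if ($doc === null) {
        // Same answer for "missing" and "forbidden", so ids can't be probed.
        http_response_code(404);
        return;
    }
    header('Content-Type: ' . $doc['mime_type']);
    header('Content-Length: ' . strlen($doc['content']));
    echo $doc['content'];
}
```

The page then links to something like the download script with an id
parameter instead of a file path, so View Source reveals no location
that can be fetched without passing the check.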




> --
> Bart
>
>
> But Bart,
> View source
> shows the true path to your image. Not good.
>
>

  #42 (permalink)  
Old 07-21-2008
Jerry Stuckle
 
Posts: n/a
Default Re: how to secure documents in server

Bart Van der Donck wrote:
> Jerry Stuckle wrote:
>
>> [...]
>> But don't count MS Access in there. Use a real database. MySQL
>> qualifies. And it has to be configured properly.

>
> Not the real communism ?[*] I partly agree for MS Access [**], but I
> have reasons to believe that my MySQL databases are set up properly.
> This is not a thing I do myself, but sysadmins in one of the giant
> datacenters who stick to one config for the entire park.
>


Not necessarily. Sysadmins cannot correctly set up a system in the
dark. They need communications from the developers on what data is
being stored, how it is being handled, etc.

Unfortunately, most sysadmins know very little about how to tune a
database (not just MySQL) and the result is poor response times.

>> BTW - benchmarks tell exactly one thing - how a database runs UNDER
>> THOSE CONDITIONS. Change the conditions and benchmarks aren't valid any
>> more.
>>
>> With that said, under live conditions, I've seen virtually no slowdown
>> when accessing blob data in a database. And in some cases it actually
>> runs faster.

>
> I think the question is how BLOBs are handled. My situation is a
> browser-based application that consists of many read actions (public
> +intranet) and few update/delete actions (admin). Now suppose:
>
> (1) Read actions without BLOB:
> - Application does not load any BLOB data from database.
> - Application uses a var holding the system-path (usr/my/path/to/pics/),
> adds the ID to it, adds .jpg to it, tests if file exists (-e).
> - If yes, use URL-path instead of system-path and output inside an
> <IMG> to screen.
> - No binary data has to be handled; the major memory use here (if any)
> is the -e check for file existence. But even this could be skipped
> with a workaround.
>


Wrong - binary data is still handled.

> (2) Read actions with BLOB:
> - Load BLOB from column (already a memory-intensive task of its own).
> - Store in some folder (id.).
> - Output with <img>.
>


Not very intensive at all. And you don't store it in some folder.

> (3) Update & delete actions without BLOB:
> - Update/delete instructions stay out of DB, affects file system only.
>


Yep.

> (4) Update & delete actions with BLOB:
> - Update/delete instructions stay out of file system, affects DB only
>


Yep.

> It is my experience that (1) has huge memory benefits compared to
> (2).
>


Memory is nothing nowadays. Sure, you need more memory for the database
to effectively handle large blobs. But a few more megabytes is nothing.


> The difference between (3) and (4) is not so clear; especially because
> MySQL probably optimizes this process. I think in practice you would
> see that (3) is faster for environment A, and (4) for environment B;
> but never with real considerable differences.
>
> And (1) and (2) are much more important since they count for 99.x% of
> the queries in my case.
>


And the difference is much less than you claim.

>[*] -"Communism is great." -"But look how things went in the USSR."
> -"That was not the real communism."
> [**] Many tendencies in MS Access are a good thermometer for general
> database issues; MS Access is just the first that fails :-)
>
> --
> Bart
>


Databases are optimized for retrieving data - especially from large
groups of data. File systems are just low level databases which handle
small amounts of data (a few files) very well.

One of the big differences is that as your data grows, the database
efficiency remains fairly static. However, file system performance
degrades. Eventually, the file system will actually perform worse than
the database does. Try putting 100K files in one directory. Good luck.
But a database handles 100M rows with ease.

And no, MS Access is not a real database, and is not a good thermometer
for anything other than how bad it really is. Real databases work in an
entirely different way and perform much differently.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================

  #43 (permalink)  
Old 07-21-2008
Jerry Stuckle
 
Posts: n/a
Default Re: how to secure documents in server

Bart Van der Donck wrote:
> The Natural Philosopher wrote:
>
>> Bart Van der Donck wrote:
>>> (1) Read actions without BLOB:
>>> - Application does not load any BLOB data from database.
>>> - Application uses a var holding the system-path (usr/my/path/to/pics/),
>>> adds the ID to it, adds .jpg to it, tests if file exists (-e).
>>> - If yes, use URL-path instead of system-path and output inside an
>>> <IMG> to screen.
>>> - No binary data has to be handled; the major memory use here (if any)
>>> is the -e check for file existence. But even this could be skipped
>>> with a workaround.
>>> (2) Read actions with BLOB:
>>> - Load BLOB from column (already a memory-intensive task of its own).
>>> - Store in some folder (id.).

>
>>> It is my experience that (1) has huge memory benefits compared to
>>> (2).

>> The way I do it, it streams off the database via the unix socket into
>> PHP memory space, and is outputted from there via the web server to the
>> network.
>>
>> VERY little extra PHP or CPU activity is required, but I grant you it's
>> probably held in PHP and SQL type memory areas as well as disk cache
>> memory. It's probably NOT held in e.g. Apache memory though... Apache or
>> whatever will read the stdout of the CGI script that spits it, and just
>> pass the bytes... and memory is cheap. Cheaper than CPU anyway.

>
> All I do is this:
>
> SELECT id FROM table;
> print "<img src=url/to/$id.jpg>";
>
> Compared to your way:
> - Simpler
> - No need to start new php scripts to output raw binary stream for
> every image
> - No sockets
> - No need to read heavy binary BLOB from DB
> - No chance for possible cache attacks in MySQL, PHP, filesystem or
> Apache
>
> I don't want to sound religious, but I think my way is much better.
>
> --
> Bart
>


It's easier for YOU. And you THINK your way is better. But you've
never really tried with lots of images, have you? In fact, I suspect
you've never really checked it at all with a real database which has
been designed and configured to do this type of operation.

So all you really have to go on is your opinion.

OTOH, some of us have been doing it for years (over 20, in my case,
starting with DB2 on mainframes), and have both designed databases and
configured RDBMS's to handle these operations efficiently. We've seen
the difference in performance, and it isn't what you claim.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================

  #44 (permalink)  
Old 07-21-2008
Bart Van der Donck
 
Posts: n/a
Default Re: how to secure documents in server

Jerry Stuckle wrote:

> Bart Van der Donck wrote:
>
>>    SELECT id FROM table;
>>    print "<img src=url/to/$id.jpg>";

>
> It's easier for YOU. And you THINK your way is better. But you've
> never really tried with lots of images, have you?


Yes I have, and the tests with BLOBs were disastrous for my case
(although I must admit this study was done 9 years ago).

Perhaps you're right that my requirements were a bit particular; I'm
facing a read load of a few MB/sec and a modest update/delete load
only peaking at nightly cronjobs. Images are spread on the machine
over 57 directories, the largest directory is holding 22,241 images at
this moment. Maybe it's BSD or the running shell that is optimal (?);
one thing I know -and tested well enough- is that my MySQL cannot
handle this kind of BLOB "abuse" under such conditions.
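
For readers who want to spread files over subdirectories without
sorting them by hand, one common scheme is to derive the directory from
a hash of the file's id. The sketch below is only an illustration of
that idea and is not how the 57 directories mentioned above are
organised; the function name sharded_path, the two-character bucket and
the allowed id characters are all assumptions.

```php
<?php
/**
 * Map an image id to "<base>/<2 hex chars>/<id>.jpg", so files end up
 * spread over at most 256 subdirectories. Hypothetical layout.
 */
function sharded_path(string $baseDir, string $id): string
{
    // Refuse anything that could escape the base directory.
    if (!preg_match('/^[0-9A-Za-z_-]+$/', $id)) {
        throw new InvalidArgumentException('unexpected characters in id');
    }
    // md5 is used only to distribute ids evenly, not for security.
    $bucket = substr(md5($id), 0, 2);
    return rtrim($baseDir, '/') . '/' . $bucket . '/' . $id . '.jpg';
}
```

The same id always maps to the same bucket, so no lookup table is
needed to find a file again.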

I can understand that it might be desirable for the URL to the image to
be unknown, as Natural Philosopher said, or that other requirements
make this or that approach preferable. In my case the binaries are
hotel photos that have the hotel's telephone number as the name of
the JPG. This level of protection is acceptable here; performance
criteria are more crucial.

> In fact, I suspect you've never really checked it at all with
> a real database which has been designed and configured to do
> this type of operation.
> So all you really have to go on is your opinion.


It's unwise to draw a conclusion from something you only suspect.

But you're right, it's my opinion, but one based on experience and
preceded by quite some study and benchmarks. I think that, for my
case, it was the best possible design under the given requirements.
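
For reference, the flat-file read path discussed in this thread (build
the system path from the id, test that the file exists, then print an
<img> tag using the URL path) could look roughly like this in PHP. It
is a sketch under assumptions: the function name image_tag and both
path arguments are placeholders, and file_exists() stands in for the
-e test.

```php
<?php
/**
 * Return an <img> tag for the given id if the JPEG exists on disk,
 * or an empty string otherwise. Both paths must end in a slash.
 */
function image_tag(string $systemPath, string $urlPath, string $id): string
{
    // basename() strips any directory part, so "../" in an id is harmless.
    $file = basename($id) . '.jpg';
    if (!file_exists($systemPath . $file)) {
        return '';
    }
    $src = htmlspecialchars($urlPath . $file, ENT_QUOTES);
    return '<img src="' . $src . '" alt="">';
}
```

PHP itself never touches the image bytes here; the web server serves
the file directly when the browser requests the URL.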

--
Bart
  #45 (permalink)  
Old 07-25-2008
Jerry Stuckle
 
Posts: n/a
Default Re: how to secure documents in server

Jones wrote:
> On Mon, 21 Jul 2008 06:46:33 -0400, Jerry Stuckle <jstucklex@attglobal.net>
> wrote:
>
>> Not necessarily. Sysadmins cannot correctly set up a system in the
>> dark. They need communications from the developers on what data is
>> being stored, how it is being handled, etc.

>
> Once upon a time the term, "system analyst" actually meant something.
> And then Alan Sugar started selling desktop PC's to everyone and now
> everyone thinks they're a "software engineer" just because they can hack
> a few lines of PHP or type ./configure.
>
> The "developers" should have worked it all out before the project even started.
> That's the REAL problem - here presumably and elsewhere for certain.
>


No, there are still sysadmins, who are responsible for system tuning.
It isn't just the needs of the database developers which need to be
taken into consideration - there are others, also.

Of course, you're right - nowadays there are too many "system
administrators" who only hold that title because they failed Programming
101.


--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
  #46 (permalink)  
Old 07-25-2008
Jerry Stuckle
 
Posts: n/a
Default Re: how to secure documents in server

Bart Van der Donck wrote:
> Jerry Stuckle wrote:
>
>> Bart Van der Donck wrote:
>>
>>> SELECT id FROM table;
>>> print "<img src=url/to/$id.jpg>";

>> It's easier for YOU. And you THINK your way is better. But you've
>> never really tried with lots of images, have you?

>
> Yes I have, and the tests with BLOBs were disastrous for my case
> (although I must admit this study was done 9 years ago).
>


How many is a lot? I've done it with over 50M images (several terabytes
- but that was a mainframe) in a database with no performance
degradation. But the database and RDBMS were designed to do it, also.

And this was under live conditions, averaging > 10K queries/second.

> Perhaps you're right that my requirements were a bit particular; I'm
> facing a read load of a few MB/sec and a modest update/delete load
> only peaking at nightly cronjobs. Images are spread on the machine
> over 57 directories, the largest directory is holding 22,241 images at
> this moment. Maybe it's BSD or the running shell that is optimal (?);
> one thing I know -and tested well enough- is that my MySQL cannot
> handle this kind of BLOB "abuse" under such conditions.
>


Do it all in one directory. That's what the database effectively does.
And it means you don't need to sort images into different directories,
create new directories when there get to be too many images...

> I can understand that it might be desirable for the URL to the image to
> be unknown, as Natural Philosopher said, or that other requirements
> make this or that approach preferable. In my case the binaries are
> hotel photos that have the hotel's telephone number as the name of
> the JPG. This level of protection is acceptable here; performance
> criteria are more crucial.
>
>> In fact, I suspect you've never really checked it at all with
>> a real database which has been designed and configured to do
>> this type of operation.
>> So all you really have to go on is your opinion.

>
> It's unwise to draw a conclusion from something you only suspect.
>
> But you're right, it's my opinion, but one based on experience and
> preceded by quite some study and benchmarks. I think that, for my
> case, it was the best possible design under the given requirements.
>
> --
> Bart


Yep, but your "study" and "benchmarks" were not necessarily accurate.
So neither are your conclusions.

Tune the RDBMS and design the database correctly, and there is virtually
no overhead. After all, all a file system is is a dumb dbms.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
  #47 (permalink)  
Old 07-26-2008
Geoff Berrow
 
Posts: n/a
Default Re: how to secure documents in server

Message-ID: <g6bf1b$rm5$1@registered.motzarella.org> from Jerry Stuckle
contained the following:

> After all, all a file system is is a dumb dbms.


Don't you mean, a file system is a database?

--
Geoff Berrow 0110001001101100010000000110
001101101011011001000110111101100111001011
100110001101101111001011100111010101101011
http://slipperyhill.co.uk
  #48 (permalink)  
Old 07-26-2008
Jerry Stuckle
 
Posts: n/a
Default Re: how to secure documents in server

Geoff Berrow wrote:
> Message-ID: <g6bf1b$rm5$1@registered.motzarella.org> from Jerry Stuckle
> contained the following:
>
>> After all, all a file system is is a dumb dbms.

>
> Don't you mean, a file system is a database?
>


No, the files are a database. A file system is a dump database
management system.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================

  #49 (permalink)  
Old 07-26-2008
Jerry Stuckle
 
Posts: n/a
Default Re: how to secure documents in server

Jerry Stuckle wrote:
> Geoff Berrow wrote:
>> Message-ID: <g6bf1b$rm5$1@registered.motzarella.org> from Jerry Stuckle
>> contained the following:
>>
>>> After all, all a file system is is a dumb dbms.

>>
>> Don't you mean, a file system is a database?
>>

>
> No, the files are a database. A file system is a dump database
> management system.
>


Whoops - mistype. That should be "A file system is a dumB database
management system". But come to think of it, it is kind of a dump, also :-)

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
