separate table vs. array-like list field

This is a discussion on separate table vs. array-like list field within the MySQL Database forums, part of the Database Forums category; I'm very, very new to mysql and I've been toying with it the last couple weeks. I've ...


Go Back   Usenet Forums > Database Forums > MySQL Database

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 03-20-2006
Rick
 
Posts: n/a
Default separate table vs. array-like list field

I'm very, very new to mysql and I've been toying with it the last
couple weeks. I've hit upon a theoretical question that I would like to
run by more experienced users.

One of the things I've been doing to teach myself is creating databases
and tables that might be similar to databases powering big, popular
sites like Flickr or Digg. For example, I created a user table (id
[primary key], name, nickname, email, location, birthday, profile,
etc.) and then played around with querying it in various ways.

One of the things I did to learn how to use "join" was to create a
separate table for blocked users. Sites like Flickr and Digg often give
users the ability to block other users. The way I implemented this was
to create a table with the blocker's id and the "blockee's" id. Using
that, I can query the blocked users table and cross-reference it to the
user list and come up with the list of blocked users for any specified
user. Pretty cool.

So I wonder. On a big site with a million+ users, would having to query
a separate table for a list of blocked users like that bog down the
site or would it be preferable (in terms of speed and efficiency) to
add a field to the user table that lists the id numbers of that user's
blocked users, sort of a pseudo-array that I collapse and expand when
needed (e.g., a blocked user array-like list might look like
"150|11516|61878|9091|41064" etc.

Any thoughts? Or am I still so new at this that this scenario doesn't
make sense?

--Rick

Reply With Quote
  #2 (permalink)  
Old 03-20-2006
Bill Karwin
 
Posts: n/a
Default Re: separate table vs. array-like list field

"Rick" <inkswamp@hotmail.com> wrote in message
news:1142819964.329383.55800@t31g2000cwb.googlegro ups.com...
> So I wonder. On a big site with a million+ users, would having to query
> a separate table for a list of blocked users like that bog down the
> site or would it be preferable (in terms of speed and efficiency) to
> add a field to the user table that lists the id numbers of that user's
> blocked users, sort of a pseudo-array that I collapse and expand when
> needed (e.g., a blocked user array-like list might look like
> "150|11516|61878|9091|41064" etc.


It's better to use another table. There are several reasons for this.
- You can create constraints so that you know each blocker/blockee id in the
table corresponds to an existing entry in the users table.
- You don't have to worry about the string growing too long to fit in the
field; just add records as needed.
- You create and use your database based on the data model, not the physical
model (this is kind of an abstract reason and hard to appreciate if you're
new to database programming, but it more makes sense after you've had to
manage badly-designed databases for a while).
- Doing the join is something an RDBMS is designed to do efficiently.
Expanding/collapsing a long array of numbers into a string is not. There
are internal features (indexes) that may actually make the join be more
speedy and efficient than implementing the same model using a string like
you describe.

Also, adding and deleting elements by inserting/deleting records is easier
and more efficient. Take for example deleting. You'd have to fetch a
really long string, search for the relevant id number, make a new string
consisting of the prefix and suffix of the string, and then update the table
to save the new string. Compare with deleting a row simply by using DELETE
with appropriate WHERE criteria.

No doubt there are other reasons why it's better to use the separate table
instead of the list-in-a-string.

Regards,
Bill K.


Reply With Quote
  #3 (permalink)  
Old 03-20-2006
Jerry Stuckle
 
Posts: n/a
Default Re: separate table vs. array-like list field

Rick wrote:
> I'm very, very new to mysql and I've been toying with it the last
> couple weeks. I've hit upon a theoretical question that I would like to
> run by more experienced users.
>
> One of the things I've been doing to teach myself is creating databases
> and tables that might be similar to databases powering big, popular
> sites like Flickr or Digg. For example, I created a user table (id
> [primary key], name, nickname, email, location, birthday, profile,
> etc.) and then played around with querying it in various ways.
>
> One of the things I did to learn how to use "join" was to create a
> separate table for blocked users. Sites like Flickr and Digg often give
> users the ability to block other users. The way I implemented this was
> to create a table with the blocker's id and the "blockee's" id. Using
> that, I can query the blocked users table and cross-reference it to the
> user list and come up with the list of blocked users for any specified
> user. Pretty cool.
>
> So I wonder. On a big site with a million+ users, would having to query
> a separate table for a list of blocked users like that bog down the
> site or would it be preferable (in terms of speed and efficiency) to
> add a field to the user table that lists the id numbers of that user's
> blocked users, sort of a pseudo-array that I collapse and expand when
> needed (e.g., a blocked user array-like list might look like
> "150|11516|61878|9091|41064" etc.
>
> Any thoughts? Or am I still so new at this that this scenario doesn't
> make sense?
>
> --Rick
>


First of all, you should check into database normalization. It helps
understand how to design databases.

Additionally, it is much more efficient to let the database do the
searches rather than try to explode a field yourself, especially when
you get into a lot of id's.


--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
Reply With Quote
  #4 (permalink)  
Old 03-20-2006
Gordon Burditt
 
Posts: n/a
Default Re: separate table vs. array-like list field

>I'm very, very new to mysql and I've been toying with it the last
>couple weeks. I've hit upon a theoretical question that I would like to
>run by more experienced users.
>
>One of the things I've been doing to teach myself is creating databases
>and tables that might be similar to databases powering big, popular
>sites like Flickr or Digg. For example, I created a user table (id
>[primary key], name, nickname, email, location, birthday, profile,
>etc.) and then played around with querying it in various ways.
>
>One of the things I did to learn how to use "join" was to create a
>separate table for blocked users. Sites like Flickr and Digg often give
>users the ability to block other users. The way I implemented this was
>to create a table with the blocker's id and the "blockee's" id. Using
>that, I can query the blocked users table and cross-reference it to the
>user list and come up with the list of blocked users for any specified
>user. Pretty cool.


You can also do other things joining this table against other
tables, like "get me a list of articles whose authors are NOT
on the blocked list of this user".

>So I wonder. On a big site with a million+ users, would having to query
>a separate table for a list of blocked users like that bog down the
>site or would it be preferable (in terms of speed and efficiency) to
>add a field to the user table that lists the id numbers of that user's
>blocked users, sort of a pseudo-array that I collapse and expand when
>needed (e.g., a blocked user array-like list might look like
>"150|11516|61878|9091|41064" etc.


Consider what happens when you have a million+ users, EACH OF WHOM
BLOCK HALF OF THE OTHER USERS. All that exploding and unexploding
can get expensive. And you have to be prepared for the huge string
lengths even if one user does that. Now, how do you retrieve a
list of articles by unblocked authors? Assuming there's a reason
to do it, how do you retrieve a list of users and how many other
users blocked them?

>Any thoughts? Or am I still so new at this that this scenario doesn't
>make sense?


Indexes can speed up lookups enormously. Your blocked user list
with an index in (blocking user, blocked user) can quickly check
the presence of an entry even if there's half a trillion entries
in the table.

Gordon L. Burditt
Reply With Quote
  #5 (permalink)  
Old 03-20-2006
Rick
 
Posts: n/a
Default Re: separate table vs. array-like list field

Thank you very much, all three of you.

Strangely, I would have bet (prior to this) that the pseudo-array
approach would have been faster, but I appreciate the three of you
taking the time to answer so thoroughly. What you've written makes a
lot of sense and shows me where I'm making assumptions I ought not to
make.

Reply With Quote
  #6 (permalink)  
Old 03-20-2006
Chuck Anderson
 
Posts: n/a
Default Re: separate table vs. array-like list field

Rick wrote:

>Thank you very much, all three of you.
>
>Strangely, I would have bet (prior to this) that the pseudo-array
>approach would have been faster, but I appreciate the three of you
>taking the time to answer so thoroughly. What you've written makes a
>lot of sense and shows me where I'm making assumptions I ought not to
>make.
>
>
>

And thanks for asking this question. As a novice user of databases I
have run into situations where storing comma separated data in a column
seemed to make sense, and I have wondered about this myself.

--
*****************************
Chuck Anderson • Boulder, CO
http://www.CycleTourist.com
Integrity is obvious.
The lack of it is common.
*****************************
Reply With Quote
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are Off
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On



All times are GMT +1. The time now is 06:52 PM.


Powered by vBulletin® Version 3.7.3
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO 3.0.0