CSV speed

This is a discussion on CSV speed within the PHP General forums, part of the PHP Programming Forums category; I have about 10 csv files I need to open to access data. It takes a lot of time to ...


Go Back   Usenet Forums > PHP Programming Forums > PHP General

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 03-11-2008
Danny Brow
 
Posts: n/a
Default CSV speed

I have about 10 csv files I need to open to access data. It takes a lot
of time to search each file for the values I need. Would it be best to
just dump all the cvs files to an SQL db and then just grab what I need
from there? I'm starting to think it would make a lot of sense. What do
you guys think?

Thanks,
Dan


--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

Reply With Quote
  #2 (permalink)  
Old 03-11-2008
Wolf
 
Posts: n/a
Default Re: [PHP] CSV speed

Danny Brow wrote:
> I have about 10 csv files I need to open to access data. It takes a lot
> of time to search each file for the values I need. Would it be best to
> just dump all the cvs files to an SQL db and then just grab what I need
> from there? I'm starting to think it would make a lot of sense. What do
> you guys think?
>
> Thanks,
> Dan



Dan,

I can tell you that depending on the size of your files is going to
dictate the route you want to go. I have a CSV with 568,000+ lines with
19 different pieces to each line. The files are around 180M apiece and
it takes my server about 2 seconds to run a system grep against the
files. I can run a recursive call 7 times against a MySQL database with
the same information and it takes it about 4 seconds.

IF you have system call ability, a grep wouldn't be bad, otherwise I'd
suggest loading the csv files into MySQL tables and checking them for
the information, then dropping the tables when you get the next files.
You can backup the databases such as a cron job overnight even.

HTH,
Wolf

Reply With Quote
  #3 (permalink)  
Old 03-11-2008
Danny Brow
 
Posts: n/a
Default Re: [PHP] CSV speed

On Mon, 2008-03-10 at 22:36 -0400, Wolf wrote:
> Danny Brow wrote:
> > I have about 10 csv files I need to open to access data. It takes a lot
> > of time to search each file for the values I need. Would it be best to
> > just dump all the cvs files to an SQL db and then just grab what I need
> > from there? I'm starting to think it would make a lot of sense. What do
> > you guys think?
> >
> > Thanks,
> > Dan

>
>
> Dan,
>
> I can tell you that depending on the size of your files is going to
> dictate the route you want to go. I have a CSV with 568,000+ lines with
> 19 different pieces to each line. The files are around 180M apiece and
> it takes my server about 2 seconds to run a system grep against the
> files. I can run a recursive call 7 times against a MySQL database with
> the same information and it takes it about 4 seconds.
>
> IF you have system call ability, a grep wouldn't be bad, otherwise I'd
> suggest loading the csv files into MySQL tables and checking them for
> the information, then dropping the tables when you get the next files.
> You can backup the databases such as a cron job overnight even.
>
> HTH,
> Wolf
>
>



Thanks that sounds like a good idea. I'm still plugging away with how I
started. I want to know how much faster it will be going with a db. I
was actually thinking of using diff for each updated file to and upload
that to the DB.

Dan




--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

Reply With Quote
  #4 (permalink)  
Old 03-11-2008
Greg Donald
 
Posts: n/a
Default Re: [PHP] CSV speed

On Mon, Mar 10, 2008 at 8:09 PM, Danny Brow <maillist@fullmotions.com> wrote:
> I have about 10 csv files I need to open to access data. It takes a lot
> of time to search each file for the values I need. Would it be best to
> just dump all the cvs files to an SQL db and then just grab what I need
> from there? I'm starting to think it would make a lot of sense. What do
> you guys think?


grep foo whatever.csv | php ./script.php


--
Greg Donald
http://destiney.com/
Reply With Quote
  #5 (permalink)  
Old 03-11-2008
Wolf
 
Posts: n/a
Default Re: [PHP] CSV speed

Danny Brow wrote:
> On Mon, 2008-03-10 at 22:36 -0400, Wolf wrote:
>> Danny Brow wrote:
>>> I have about 10 csv files I need to open to access data. It takes a lot
>>> of time to search each file for the values I need. Would it be best to
>>> just dump all the cvs files to an SQL db and then just grab what I need
>>> from there? I'm starting to think it would make a lot of sense. What do
>>> you guys think?
>>>
>>> Thanks,
>>> Dan

>>
>> Dan,
>>
>> I can tell you that depending on the size of your files is going to
>> dictate the route you want to go. I have a CSV with 568,000+ lines with
>> 19 different pieces to each line. The files are around 180M apiece and
>> it takes my server about 2 seconds to run a system grep against the
>> files. I can run a recursive call 7 times against a MySQL database with
>> the same information and it takes it about 4 seconds.
>>
>> IF you have system call ability, a grep wouldn't be bad, otherwise I'd
>> suggest loading the csv files into MySQL tables and checking them for
>> the information, then dropping the tables when you get the next files.
>> You can backup the databases such as a cron job overnight even.
>>
>> HTH,
>> Wolf
>>
>>

>
>
> Thanks that sounds like a good idea. I'm still plugging away with how I
> started. I want to know how much faster it will be going with a db. I
> was actually thinking of using diff for each updated file to and upload
> that to the DB.
>
> Dan



Running a diff and loading the changes wouldn't be a bad way to go, one
thing to take into account as well would be any tracking of the changes
that you would need to do as well. IE, if you update an entry, write it
to a changes table with a date attached as to when it happened.

Wolf

Reply With Quote
  #6 (permalink)  
Old 03-11-2008
Wolf
 
Posts: n/a
Default Re: [PHP] CSV speed

Greg Donald wrote:
> On Mon, Mar 10, 2008 at 8:09 PM, Danny Brow <maillist@fullmotions.com> wrote:
>> I have about 10 csv files I need to open to access data. It takes a lot
>> of time to search each file for the values I need. Would it be best to
>> just dump all the cvs files to an SQL db and then just grab what I need
>> from there? I'm starting to think it would make a lot of sense. What do
>> you guys think?

>
> grep foo whatever.csv | php ./script.php


While that could work if the server is set up to allow you to run system
calls, but you don't find that available outside of test environments.
Also depending on the file sizes, you'll see some differences.

Wolf

Reply With Quote
  #7 (permalink)  
Old 03-11-2008
Arvids Godjuks
 
Posts: n/a
Default Re: [PHP] CSV speed

I should say that running 7 quueries against 200MB table with ~600 000 rows
for 4 seconds looks to me as if there was no indexes and very poor database
design (i mean that CVS file is dumped into database incorrectly). Or you'r
server is Pentium 100 :)

P.S. Having myself a 1.5GB database with some tables 300MB+, having 1070
queries/sec average for last month.

2008/3/11, Wolf <LoneWolf@nc.rr.com>:
>
> Danny Brow wrote:
> > I have about 10 csv files I need to open to access data. It takes a lot
> > of time to search each file for the values I need. Would it be best to
> > just dump all the cvs files to an SQL db and then just grab what I need
> > from there? I'm starting to think it would make a lot of sense. What do
> > you guys think?
> >
> > Thanks,
> > Dan

>
>
>
> Dan,
>
> I can tell you that depending on the size of your files is going to
> dictate the route you want to go. I have a CSV with 568,000+ lines with
> 19 different pieces to each line. The files are around 180M apiece and
> it takes my server about 2 seconds to run a system grep against the
> files. I can run a recursive call 7 times against a MySQL database with
> the same information and it takes it about 4 seconds.
>
> IF you have system call ability, a grep wouldn't be bad, otherwise I'd
> suggest loading the csv files into MySQL tables and checking them for
> the information, then dropping the tables when you get the next files.
> You can backup the databases such as a cron job overnight even.
>
> HTH,
>
> Wolf
>
>
>
> --
> PHP General Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>


Reply With Quote
  #8 (permalink)  
Old 03-11-2008
Ray Hauge
 
Posts: n/a
Default Re: [PHP] CSV speed

Wolf wrote:
> Danny Brow wrote:
>> I have about 10 csv files I need to open to access data. It takes a lot
>> of time to search each file for the values I need. Would it be best to
>> just dump all the cvs files to an SQL db and then just grab what I need
>> from there? I'm starting to think it would make a lot of sense. What do
>> you guys think?
>>
>> Thanks,
>> Dan

>
>
> Dan,
>
> I can tell you that depending on the size of your files is going to
> dictate the route you want to go. I have a CSV with 568,000+ lines with
> 19 different pieces to each line. The files are around 180M apiece and
> it takes my server about 2 seconds to run a system grep against the
> files. I can run a recursive call 7 times against a MySQL database with
> the same information and it takes it about 4 seconds.
>
> IF you have system call ability, a grep wouldn't be bad, otherwise I'd
> suggest loading the csv files into MySQL tables and checking them for
> the information, then dropping the tables when you get the next files.
> You can backup the databases such as a cron job overnight even.
>
> HTH,
> Wolf
>
>


If you do go the MySQL route, MySQL can import CSV files natively, and
it'll be a lot faster than doing it through PHP. Just look up the
syntax for the LOAD DATA INFILE command... or look here.

http://dev.mysql.com/doc/refman/5.1/en/load-data.html

Here's an example:

LOAD DATA LOCAL INFILE '/importfile.csv'
INTO TABLE test_table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(field1, filed2, field3);

I've had to do imports of a million or more records from CSV files, and
PHP is a lot slower than MySQL at importing them :)

--
Ray Hauge
www.primateapplications.com
Reply With Quote
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are Off
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On



All times are GMT +1. The time now is 09:10 AM.


Powered by vBulletin® Version 3.7.3
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO 3.0.0