This is a discussion on threadsort within the PHP Language forums, part of the PHP Programming Forums category; I recently (loosely) translated a file-based perl cgi bulletin board script to php. Both scripts now produce the same ...
|
|||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
|
|||
|
I recently (loosely) translated a file-based perl cgi bulletin board
script to php. Both scripts now produce the same output. But the perl version is substantially faster than my php version. Perhaps someone else can figure out how to optimize the sorting of file-based 'threads' of posts, which is where the bottleneck seems to be. Can someone else do a better job? Both these scripts use a readdir to get the contents of the current directory. Every post is a file with a numerical name like 0004 0004.0001 would be a reply to post 0004 0004.0002.0001 would be the first reply to the second reply to 0004 So you want to assemble individual subject threads as sorted, numerically ascending arrays of string. But the collection of all such arrays of strings (threads) want to be sorted in reverse order, so newer threads appear (on the top of the page) before older threads. So you want to sort those based on the first string in each array of string (the root post of that thread). the perl: #generic sort function by Joseph Hall, joseph@5sigma.com sub fieldsort { my ($sep, $cols); if (ref $_[0]) { $sep = '\\s+' } else { $sep = shift; } unless (ref($cols = shift) eq 'ARRAY') { die "fieldsort columns must be in anon array"; } my (@sortcode, @col); my $col = 1; for (@$cols) { my ($a, $b) = /^-/ ? qw(b a) : qw(a b); my $op = /n$/ ? '<=>' : 'cmp'; push @col, (/(\d+)/)[0] - 1; push @sortcode, "\$${a}->[$col] $op \$${b}->[$col]"; $col++; } my $sortfunc = eval "sub { " . join (" or ", @sortcode) . " } "; my $splitfunc = eval 'sub { (split /$sep/o, $_)[@col] } '; return map $_->[0], sort { $sortfunc->() } map [$_, $splitfunc->($_)], @_; } ==slower, uglier php that does produce the same output ============ function how_deep($msg) { $cnt=0; $notdone=$msg; while($notdone) { $notdone = strstr($msg,"."); $msg=substr($notdone,1); $cnt++; } return($cnt); } function threadBase($msg) { $substr = ereg_replace(strstr($msg,"."),"",$msg); return($substr); } function threadsort($files) { $fcnt = count($files); sort($files, SORT_NUMERIC); $list=array(); for($i=0; $i<$fcnt; $i++) { $key = $files[$i]; $depth = $this->how_deep($key); if($depth == 1) { $thread_head = array(); $thread_head[] = $key; $list[$key] = $thread_head; $this->threadkeys[] = $key; } else { $thread_key = $this->threadBase($key); $list[$thread_key][] = $key; } } rsort($this->threadkeys, SORT_NUMERIC); return($list); } |
|
|||
|
"Salmo Bytes" a écrit le 07/01/2004 :
[...] > ==slower, uglier php that does produce the same output ============ > function how_deep($msg) > { > $cnt=0; > $notdone=$msg; > while($notdone) > { > $notdone = strstr($msg,"."); > $msg=substr($notdone,1); > $cnt++; > } > return($cnt); > } I would do this like : function how_deep($msg) { return sizeof(explode($msg,".")); } -- Have you read the manual? http://www.php.net/manual/en/ |
|
|||
|
"Salmo Bytes" <devnull@montana-riverboats.com> wrote in message
news:a8c6175f.0401061504.234bfdd2@posting.google.c om... > I recently (loosely) translated a file-based perl cgi bulletin board > script to php. > > Both scripts now produce the same output. But the > perl version is substantially faster than my php > version. Perhaps someone else can figure out how to > optimize the sorting of file-based 'threads' of posts, > which is where the bottleneck seems to be. > Can someone else do a better job? > > > Both these scripts use a readdir to get the contents of > the current directory. Every post is a file with > a numerical name like 0004 > 0004.0001 would be a reply to post 0004 > 0004.0002.0001 would be the first reply to the second reply to 0004 > I created a bunch of random files, acording to your scheme (xx - xx.xx - xx.xx.xx ) and read through the entire directory, and added each filename to an array $aryFiles[] = $fileName; then did $sortFiles = arsort($aryFiles); then they all came out in this order 0003.0001 0003.0001.0001 0003.0001.0002 0003.0002 0003.0002.0001 is this what you are looking for? -- Mike Bradley http://www.gzentools.com -- free online php tools |
|
|||
|
"Salmo Bytes" a écrit le 07/01/2004 :
[...] > function threadBase($msg) > { > $substr = ereg_replace(strstr($msg,"."),"",$msg); > return($substr); > } I would have done : function threadBase($msg) { $substr = explode(".",$msg); return($substr[0]); } But maybe you should look at CountScubula post before... -- Have you read the manual? http://www.php.net/manual/en/ |
|
|||
|
"Jedi121" <jedi121news@free.fr.Removethis> wrote in message
news:mesnews.38177d41.cebf82cb.430.2689@free.fr.Re movethis... > "Salmo Bytes" a écrit le 07/01/2004 : > [...] > > function threadBase($msg) > > { > > $substr = ereg_replace(strstr($msg,"."),"",$msg); > > return($substr); > > } > > I would have done : > function threadBase($msg) > { > $substr = explode(".",$msg); > return($substr[0]); > } > > But maybe you should look at CountScubula post before... > > -- > Have you read the manual? > http://www.php.net/manual/en/ Did I miss something? I did realy read his entire post (sorry, I dont want to port code) Did I answer a question that didn't exist? sorry, I will try and read things a little more. -- Mike Bradley http://www.gzentools.com -- free online php tools |
|
|||
|
On 6 Jan 2004 15:04:03 -0800, devnull@montana-riverboats.com (Salmo Bytes)
wrote: >I recently (loosely) translated a file-based perl cgi bulletin board >script to php. > >Both scripts now produce the same output. But the >perl version is substantially faster than my php >version. Perhaps someone else can figure out how to >optimize the sorting of file-based 'threads' of posts, >which is where the bottleneck seems to be. >Can someone else do a better job? > > >Both these scripts use a readdir to get the contents of >the current directory. Every post is a file with >a numerical name like 0004 >0004.0001 would be a reply to post 0004 >0004.0002.0001 would be the first reply to the second reply to 0004 > >So you want to assemble individual subject threads as >sorted, numerically ascending arrays of string. > >But the collection of all such arrays of strings (threads) want to be >sorted in reverse order, so newer threads appear (on the top of the page) >before older threads. So you want to sort those based on the >first string in each array of string (the root post of that thread). How about this; the array in the following code is sorted how I think you want it sorted (descending by first component, ascending by the reply components?), shuffled, then resorted to return to the original sort order: <pre> <?php $a = array( '0003', '0003.0001.0001', '0003.0001.0002', '0003.0001.0002.0001', '0003.0001.0002.0002', '0003.0001.0002.0003', '0003.0002.0001', '0003.0002.0002', '0003.0003.0001', '0002', '0002.0001.0001', '0002.0001.0002', '0002.0002.0001', '0002.0002.0002', '0001', '0001.0001.0001', '0001.0001.0002', '0001.0002.0001', '0001.0002.0002' ); shuffle($a); var_dump($a); function threadsort($a, $b) { $thread_a = substr($a, 0, 4); $thread_b = substr($b, 0, 4); $reply_a = substr($a, 5); $reply_b = substr($b, 5); if ($thread_cmp = strcmp($thread_a, $thread_b)) return -$thread_cmp; else return strcmp($reply_a, $reply_b); } usort($a, 'threadsort'); var_dump($a); ?> </pre> Outputs: array(19) { [0]=> string(14) "0001.0001.0002" [1]=> string(14) "0002.0001.0002" [2]=> string(4) "0001" [3]=> string(14) "0002.0001.0001" [4]=> string(19) "0003.0001.0002.0002" [5]=> string(14) "0003.0002.0002" [6]=> string(14) "0001.0001.0001" [7]=> string(14) "0001.0002.0002" [8]=> string(19) "0003.0001.0002.0001" [9]=> string(14) "0002.0002.0001" [10]=> string(14) "0003.0002.0001" [11]=> string(14) "0003.0001.0001" [12]=> string(14) "0003.0001.0002" [13]=> string(4) "0002" [14]=> string(19) "0003.0001.0002.0003" [15]=> string(14) "0003.0003.0001" [16]=> string(4) "0003" [17]=> string(14) "0002.0002.0002" [18]=> string(14) "0001.0002.0001" } array(19) { [0]=> string(4) "0003" [1]=> string(14) "0003.0001.0001" [2]=> string(14) "0003.0001.0002" [3]=> string(19) "0003.0001.0002.0001" [4]=> string(19) "0003.0001.0002.0002" [5]=> string(19) "0003.0001.0002.0003" [6]=> string(14) "0003.0002.0001" [7]=> string(14) "0003.0002.0002" [8]=> string(14) "0003.0003.0001" [9]=> string(4) "0002" [10]=> string(14) "0002.0001.0001" [11]=> string(14) "0002.0001.0002" [12]=> string(14) "0002.0002.0001" [13]=> string(14) "0002.0002.0002" [14]=> string(4) "0001" [15]=> string(14) "0001.0001.0001" [16]=> string(14) "0001.0001.0002" [17]=> string(14) "0001.0002.0001" [18]=> string(14) "0001.0002.0002" } Anywhere close? Assumes the fields are all zero-padded to the same length. If they aren't, use the natural sort comparison functions (strncmp), which should produce the expected result. -- Andy Hassall (andy@andyh.co.uk) icq(5747695) (http://www.andyh.co.uk) Space: disk usage analysis tool (http://www.andyhsoftware.co.uk/space) |
|
|||
|
"CountScubula" a écrit le 07/01/2004 :
> Did I miss something? I did realy read his entire post (sorry, I dont want > to port code) > Did I answer a question that didn't exist? sorry, I will try and read > things a little more. No, don't misunderstand me, I realised I was replying small bit at a time instead of looking at the overall problem. So my meaning was for Salmo Bytes to read your Post to answer his question and still look at my posts to get ideas how someone else would write his functions. Have a nice night, Jedi -- Have you read the manual? http://www.php.net/manual/en/ |
|
|||
|
"Jedi121" <jedi121news@free.fr.Removethis> wrote in message
news:mesnews.38677d41.d29a298f.432.2689@free.fr.Re movethis... > "CountScubula" a écrit le 07/01/2004 : > > Did I miss something? I did realy read his entire post (sorry, I dont want > > to port code) > > Did I answer a question that didn't exist? sorry, I will try and read > > things a little more. > > No, don't misunderstand me, I realised I was replying small bit at a > time instead of looking at the overall problem. So my meaning was for > Salmo Bytes to read your Post to answer his question and still look at > my posts to get ideas how someone else would write his functions. > > Have a nice night, > Jedi > > -- > Have you read the manual? > http://www.php.net/manual/en/ > IC, Sometimes I end up answering a question that didn't exists, I sometimes lose sight of a question sometimes. And yes, there are 1,001 ways to skin a cat (2,101 in the thailand) -- Mike Bradley http://www.gzentools.com -- free online php tools |