threadsort

This is a discussion on threadsort within the PHP Language forums, part of the PHP Programming Forums category; I recently (loosely) translated a file-based perl cgi bulletin board script to php. Both scripts now produce the same ...


Go Back   Usenet Forums > PHP Programming Forums > PHP Language

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 01-07-2004
Salmo Bytes
 
Posts: n/a
Default threadsort

I recently (loosely) translated a file-based perl cgi bulletin board
script to php.

Both scripts now produce the same output. But the
perl version is substantially faster than my php
version. Perhaps someone else can figure out how to
optimize the sorting of file-based 'threads' of posts,
which is where the bottleneck seems to be.
Can someone else do a better job?


Both these scripts use a readdir to get the contents of
the current directory. Every post is a file with
a numerical name like 0004
0004.0001 would be a reply to post 0004
0004.0002.0001 would be the first reply to the second reply to 0004

So you want to assemble individual subject threads as
sorted, numerically ascending arrays of string.

But the collection of all such arrays of strings (threads) want to be
sorted in reverse order, so newer threads appear (on the top of the page)
before older threads. So you want to sort those based on the
first string in each array of string (the root post of that thread).

the perl:
#generic sort function by Joseph Hall, joseph@5sigma.com
sub fieldsort {
my ($sep, $cols);
if (ref $_[0]) {
$sep = '\\s+'
}
else {
$sep = shift;
}
unless (ref($cols = shift) eq 'ARRAY') {
die "fieldsort columns must be in anon array";
}
my (@sortcode, @col);
my $col = 1;
for (@$cols) {
my ($a, $b) = /^-/ ? qw(b a) : qw(a b);
my $op = /n$/ ? '<=>' : 'cmp';
push @col, (/(\d+)/)[0] - 1;
push @sortcode, "\$${a}->[$col] $op \$${b}->[$col]";
$col++;
}
my $sortfunc = eval "sub { " . join (" or ", @sortcode) . " } ";
my $splitfunc = eval 'sub { (split /$sep/o, $_)[@col] } ';
return
map $_->[0],
sort { $sortfunc->() }
map [$_, $splitfunc->($_)],
@_;

}

==slower, uglier php that does produce the same output ============
function how_deep($msg)
{
$cnt=0;
$notdone=$msg;
while($notdone)
{
$notdone = strstr($msg,".");
$msg=substr($notdone,1);
$cnt++;
}
return($cnt);
}

function threadBase($msg)
{
$substr = ereg_replace(strstr($msg,"."),"",$msg);
return($substr);
}

function threadsort($files)
{
$fcnt = count($files);
sort($files, SORT_NUMERIC);
$list=array();
for($i=0; $i<$fcnt; $i++)
{
$key = $files[$i];
$depth = $this->how_deep($key);
if($depth == 1)
{
$thread_head = array();
$thread_head[] = $key;
$list[$key] = $thread_head;
$this->threadkeys[] = $key;
}
else
{
$thread_key = $this->threadBase($key);
$list[$thread_key][] = $key;
}
}
rsort($this->threadkeys, SORT_NUMERIC);
return($list);
}
Reply With Quote
  #2 (permalink)  
Old 01-07-2004
Jedi121
 
Posts: n/a
Default Re: threadsort

"Salmo Bytes" a écrit le 07/01/2004 :
[...]
> ==slower, uglier php that does produce the same output ============
> function how_deep($msg)
> {
> $cnt=0;
> $notdone=$msg;
> while($notdone)
> {
> $notdone = strstr($msg,".");
> $msg=substr($notdone,1);
> $cnt++;
> }
> return($cnt);
> }


I would do this like :
function how_deep($msg)
{
return sizeof(explode($msg,"."));
}

--
Have you read the manual?
http://www.php.net/manual/en/

Reply With Quote
  #3 (permalink)  
Old 01-07-2004
CountScubula
 
Posts: n/a
Default Re: threadsort

"Salmo Bytes" <devnull@montana-riverboats.com> wrote in message
news:a8c6175f.0401061504.234bfdd2@posting.google.c om...
> I recently (loosely) translated a file-based perl cgi bulletin board
> script to php.
>
> Both scripts now produce the same output. But the
> perl version is substantially faster than my php
> version. Perhaps someone else can figure out how to
> optimize the sorting of file-based 'threads' of posts,
> which is where the bottleneck seems to be.
> Can someone else do a better job?
>
>
> Both these scripts use a readdir to get the contents of
> the current directory. Every post is a file with
> a numerical name like 0004
> 0004.0001 would be a reply to post 0004
> 0004.0002.0001 would be the first reply to the second reply to 0004
>



I created a bunch of random files, acording to your scheme (xx - xx.xx -
xx.xx.xx )
and read through the entire directory, and added each filename to an array

$aryFiles[] = $fileName;

then did
$sortFiles = arsort($aryFiles);

then they all came out in this order

0003.0001
0003.0001.0001
0003.0001.0002
0003.0002
0003.0002.0001

is this what you are looking for?

--
Mike Bradley
http://www.gzentools.com -- free online php tools


Reply With Quote
  #4 (permalink)  
Old 01-07-2004
Jedi121
 
Posts: n/a
Default Re: threadsort

"Salmo Bytes" a écrit le 07/01/2004 :
[...]
> function threadBase($msg)
> {
> $substr = ereg_replace(strstr($msg,"."),"",$msg);
> return($substr);
> }


I would have done :
function threadBase($msg)
{
$substr = explode(".",$msg);
return($substr[0]);
}

But maybe you should look at CountScubula post before...

--
Have you read the manual?
http://www.php.net/manual/en/

Reply With Quote
  #5 (permalink)  
Old 01-07-2004
CountScubula
 
Posts: n/a
Default Re: threadsort

"Jedi121" <jedi121news@free.fr.Removethis> wrote in message
news:mesnews.38177d41.cebf82cb.430.2689@free.fr.Re movethis...
> "Salmo Bytes" a écrit le 07/01/2004 :
> [...]
> > function threadBase($msg)
> > {
> > $substr = ereg_replace(strstr($msg,"."),"",$msg);
> > return($substr);
> > }

>
> I would have done :
> function threadBase($msg)
> {
> $substr = explode(".",$msg);
> return($substr[0]);
> }
>
> But maybe you should look at CountScubula post before...
>
> --
> Have you read the manual?
> http://www.php.net/manual/en/



Did I miss something? I did realy read his entire post (sorry, I dont want
to port code)
Did I answer a question that didn't exist? sorry, I will try and read
things a little more.

--
Mike Bradley
http://www.gzentools.com -- free online php tools



Reply With Quote
  #6 (permalink)  
Old 01-07-2004
Andy Hassall
 
Posts: n/a
Default Re: threadsort

On 6 Jan 2004 15:04:03 -0800, devnull@montana-riverboats.com (Salmo Bytes)
wrote:

>I recently (loosely) translated a file-based perl cgi bulletin board
>script to php.
>
>Both scripts now produce the same output. But the
>perl version is substantially faster than my php
>version. Perhaps someone else can figure out how to
>optimize the sorting of file-based 'threads' of posts,
>which is where the bottleneck seems to be.
>Can someone else do a better job?
>
>
>Both these scripts use a readdir to get the contents of
>the current directory. Every post is a file with
>a numerical name like 0004
>0004.0001 would be a reply to post 0004
>0004.0002.0001 would be the first reply to the second reply to 0004
>
>So you want to assemble individual subject threads as
>sorted, numerically ascending arrays of string.
>
>But the collection of all such arrays of strings (threads) want to be
>sorted in reverse order, so newer threads appear (on the top of the page)
>before older threads. So you want to sort those based on the
>first string in each array of string (the root post of that thread).


How about this; the array in the following code is sorted how I think you want
it sorted (descending by first component, ascending by the reply components?),
shuffled, then resorted to return to the original sort order:

<pre>
<?php
$a = array(
'0003',
'0003.0001.0001',
'0003.0001.0002',
'0003.0001.0002.0001',
'0003.0001.0002.0002',
'0003.0001.0002.0003',
'0003.0002.0001',
'0003.0002.0002',
'0003.0003.0001',
'0002',
'0002.0001.0001',
'0002.0001.0002',
'0002.0002.0001',
'0002.0002.0002',
'0001',
'0001.0001.0001',
'0001.0001.0002',
'0001.0002.0001',
'0001.0002.0002'
);

shuffle($a);

var_dump($a);

function threadsort($a, $b) {
$thread_a = substr($a, 0, 4);
$thread_b = substr($b, 0, 4);

$reply_a = substr($a, 5);
$reply_b = substr($b, 5);

if ($thread_cmp = strcmp($thread_a, $thread_b))
return -$thread_cmp;
else
return strcmp($reply_a, $reply_b);
}

usort($a, 'threadsort');
var_dump($a);

?>
</pre>

Outputs:

array(19) {
[0]=>
string(14) "0001.0001.0002"
[1]=>
string(14) "0002.0001.0002"
[2]=>
string(4) "0001"
[3]=>
string(14) "0002.0001.0001"
[4]=>
string(19) "0003.0001.0002.0002"
[5]=>
string(14) "0003.0002.0002"
[6]=>
string(14) "0001.0001.0001"
[7]=>
string(14) "0001.0002.0002"
[8]=>
string(19) "0003.0001.0002.0001"
[9]=>
string(14) "0002.0002.0001"
[10]=>
string(14) "0003.0002.0001"
[11]=>
string(14) "0003.0001.0001"
[12]=>
string(14) "0003.0001.0002"
[13]=>
string(4) "0002"
[14]=>
string(19) "0003.0001.0002.0003"
[15]=>
string(14) "0003.0003.0001"
[16]=>
string(4) "0003"
[17]=>
string(14) "0002.0002.0002"
[18]=>
string(14) "0001.0002.0001"
}
array(19) {
[0]=>
string(4) "0003"
[1]=>
string(14) "0003.0001.0001"
[2]=>
string(14) "0003.0001.0002"
[3]=>
string(19) "0003.0001.0002.0001"
[4]=>
string(19) "0003.0001.0002.0002"
[5]=>
string(19) "0003.0001.0002.0003"
[6]=>
string(14) "0003.0002.0001"
[7]=>
string(14) "0003.0002.0002"
[8]=>
string(14) "0003.0003.0001"
[9]=>
string(4) "0002"
[10]=>
string(14) "0002.0001.0001"
[11]=>
string(14) "0002.0001.0002"
[12]=>
string(14) "0002.0002.0001"
[13]=>
string(14) "0002.0002.0002"
[14]=>
string(4) "0001"
[15]=>
string(14) "0001.0001.0001"
[16]=>
string(14) "0001.0001.0002"
[17]=>
string(14) "0001.0002.0001"
[18]=>
string(14) "0001.0002.0002"
}

Anywhere close?

Assumes the fields are all zero-padded to the same length. If they aren't, use
the natural sort comparison functions (strncmp), which should produce the
expected result.

--
Andy Hassall (andy@andyh.co.uk) icq(5747695) (http://www.andyh.co.uk)
Space: disk usage analysis tool (http://www.andyhsoftware.co.uk/space)
Reply With Quote
  #7 (permalink)  
Old 01-07-2004
sandy pittendrigh
 
Posts: n/a
Default Re: threadsort

Yes, that's clean, compact and the right
output. I'll plug that into my code and see
how it runs.

Reply With Quote
  #8 (permalink)  
Old 01-07-2004
Jedi121
 
Posts: n/a
Default Re: threadsort

"CountScubula" a écrit le 07/01/2004 :
> Did I miss something? I did realy read his entire post (sorry, I dont want
> to port code)
> Did I answer a question that didn't exist? sorry, I will try and read
> things a little more.


No, don't misunderstand me, I realised I was replying small bit at a
time instead of looking at the overall problem. So my meaning was for
Salmo Bytes to read your Post to answer his question and still look at
my posts to get ideas how someone else would write his functions.

Have a nice night,
Jedi

--
Have you read the manual?
http://www.php.net/manual/en/

Reply With Quote
  #9 (permalink)  
Old 01-07-2004
CountScubula
 
Posts: n/a
Default Re: threadsort

"Jedi121" <jedi121news@free.fr.Removethis> wrote in message
news:mesnews.38677d41.d29a298f.432.2689@free.fr.Re movethis...
> "CountScubula" a écrit le 07/01/2004 :
> > Did I miss something? I did realy read his entire post (sorry, I dont

want
> > to port code)
> > Did I answer a question that didn't exist? sorry, I will try and read
> > things a little more.

>
> No, don't misunderstand me, I realised I was replying small bit at a
> time instead of looking at the overall problem. So my meaning was for
> Salmo Bytes to read your Post to answer his question and still look at
> my posts to get ideas how someone else would write his functions.
>
> Have a nice night,
> Jedi
>
> --
> Have you read the manual?
> http://www.php.net/manual/en/
>


IC,

Sometimes I end up answering a question that didn't exists, I sometimes lose
sight of a question sometimes.

And yes, there are 1,001 ways to skin a cat (2,101 in the thailand)

--
Mike Bradley
http://www.gzentools.com -- free online php tools


Reply With Quote
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are Off
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On




All times are GMT +1. The time now is 07:26 AM.


Powered by vBulletin® Version 3.7.3
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO 3.0.0