Quick'n'Dirty Pub Archive Search Engine
     

This is a quickly written search engine put together on Good Friday (10/4/98) between chocolate Easter eggs! It will probably be revised over the next few weeks.

Current Features...

  • The search strings are compared against a listing of files in the /pub/ hierarchy. Matches are on substrings and are case-insensitive.
  • You can use multiple search keys. All search strings are AND'ed together, e.g. to search for a Macintosh version of Netscape, you could search for "mac netscape".
  • You can enter as many space-separated search strings as you like.
  • Results are returned in directory order, which makes more sense than pure alphabetical order.
  • FYI, the search dataset is the /pub/ls-lR file, and the program is written in Perl 5 using mod_perl with Apache.
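
The matching rules above can be illustrated with a small sketch. This is Python rather than the Perl 5 the engine is actually written in, and it is not the site's code: the function names parse_ls_lr and search, the SAMPLE listing, and the choice to match against the full path (directory plus file name) are all assumptions made for illustration.

```python
def parse_ls_lr(text):
    """Yield full paths from `ls -lR` style output, in listing (directory) order."""
    current_dir = ""
    for line in text.splitlines():
        line = line.rstrip()
        if not line or line.startswith("total "):
            continue  # skip blank separators and per-directory totals
        if line.endswith(":"):
            current_dir = line[:-1]  # directory header such as "./mac:"
            continue
        fields = line.split(None, 8)  # long format: 8 columns, then the name
        if len(fields) < 9:
            continue  # not a long-format entry
        name = fields[8].split(" -> ")[0]  # drop any symlink target
        yield f"{current_dir}/{name}"


def search(paths, query):
    """Return paths containing ALL space-separated keys (case-insensitive substrings)."""
    keys = query.lower().split()
    if not keys:
        return []
    # The input order is preserved, so results stay in directory order.
    return [p for p in paths if all(k in p.lower() for k in keys)]


# Hypothetical listing, made up for this example.
SAMPLE = """\
./mac:
total 2
-rw-r--r--  1 ftp  ftp  1048576 Apr 10  1998 Netscape-4.05.hqx
-rw-r--r--  1 ftp  ftp    20480 Apr 10  1998 Fetch.hqx

./unix:
total 1
-rw-r--r--  1 ftp  ftp  2097152 Apr 10  1998 netscape-4.05.tar.gz
"""

if __name__ == "__main__":
    for hit in search(parse_ls_lr(SAMPLE), "mac netscape"):
        print(hit)
```

With this sample listing, the query "mac netscape" matches only the file under ./mac, because both keys must appear somewhere in the path.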

Current Bugs...

  • Sometimes the script fails with "Internal Server Error". The real error is "Can't undef active subroutine". I'm not sure why yet! I'm sure I'm missing something in mod_perl.

Things To Do...

  • OR'ing of search keys. Initially just simple OR'ing of two or more search strings, e.g. "unix ftp mail" would return all filenames that contain any of those strings.
  • Sorting of results based on date, size and name, à la Apache 1.3.
  • Search stats - search time, number of matches, total file size, etc.
  • Allow a choice of whether to use case-insensitive and substring matching.
  • Directory statistics: total size of directory, last modified date.
  • Limit the number of entries returned. Probably allow a user-defined "maximum number of hits", defaulting to something like 100.
  • Top 50 new files. Files which are new in the past N days or weeks.
  • Look at regular expression matching (this sort of works now, but I've limited the search strings so that the characters used for regex are invalid, until I can determine how safe it is!).
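
A couple of the to-do items (OR'ing of keys, a maximum number of hits, and keeping regex characters harmless) might look roughly like the sketch below. It is Python, not the engine's Perl, and nothing here reflects how the feature will actually be built: search_any, max_hits and the PATHS list are hypothetical names and data. Escaping each key with re.escape is one possible way to make regex characters safe by treating them as literal text; it is a substitute for, not a description of, the current approach of rejecting those characters.

```python
import re
from itertools import islice


def search_any(paths, query, max_hits=100):
    """Return up to max_hits paths containing ANY of the space-separated keys."""
    keys = query.split()
    if not keys:
        return []
    # re.escape turns every key into a literal string, so characters such as
    # '*' or '(' are matched as-is instead of being read as regex syntax.
    pattern = re.compile("|".join(re.escape(k) for k in keys), re.IGNORECASE)
    hits = (p for p in paths if pattern.search(p))
    # islice stops scanning as soon as max_hits matches have been collected.
    return list(islice(hits, max_hits))


# Hypothetical paths, made up for this example.
PATHS = [
    "./unix/tools.tar",
    "./mac/Fetch-FTP.hqx",
    "./pc/mail(1).zip",
    "./mac/netscape.hqx",
]

if __name__ == "__main__":
    for hit in search_any(PATHS, "unix ftp mail"):
        print(hit)
```

Because the keys are escaped, a query such as ".*" only matches paths that literally contain those two characters, rather than matching everything.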



 
 
Modified: 10th September, 2001 
School of Computing & Mathematics  
© University of Western Sydney, 2008