GLIMPSE

                 GLobal IMPlicit SEarch

 
Table of Contents:

  1. Introduction
  2. Features of GLIMPSE
    1. Technical specifications
    2. Indexing features
    3. Search features
    4. Results display
    5. Costs, license, registration
    6. Unique features
  3. GLIMPSE Applications demonstration
  4. GLIMPSE Indexing and search performance
  5. Installation
  6. Our observations
     


 
I. Introduction:   

 
II. Features of GLIMPSE:
 

1. Technical Specifications:
 
 

GLIMPSE: Technical Specifications 
 
Server platforms supported:
  • Operating system
      • Version 4.1 supports the following platforms:
          • OSF/1 DEC Alpha
          • Sparc Solaris 2.5.1
          • Linux 2.0.30
          • SGI IRIX64 6.2
          • SGI IRIX32 6.2
          • SGI IRIX 5.3
          • FreeBSD
          • HP-UX
      • Version 3.6 supports the following platforms:
          • Sparc SunOS 4.1.1
          • FreeBSD
      • Version 3.5 supports the following platforms:
          • IBM AIX 4.1
          • IBM AIX 3.2.5
          • DEC Ultrix
  • Web server
      • No web server specifications.
Scalability (local document collections only, or document collections on multiple web servers through remote indexing):
  • Indexes local document collections and also supports remote indexing up to two levels.
 
Technical support: 
  • E-Mail
  • Mailing list
  • Documentation on Web site
Source code availability   Yes (with the package)
Main program modules:
  • glimpseindex: the indexer.
  • glimpse: the search engine.
  • glimpseserver: provides remote access to the Glimpse database.
  • WebGlimpse: the search interface.
 
 



2. Indexing Interface:
 
 

 
GLIMPSE: Indexing Interface
 
File/document formats supported (HTML, ASCII, PDF, SQL, spreadsheets, WYSIWYG): HTML, DOC, PDF and text files
Indexing level support: 
  • File/directory level: Yes
  • Multi-record files (e.g. bibliographic records in a file): No
    Specification of document collections:
    • Package specific/independent: Package independent
    • Single/multiple source directories (and subdirectories): Multiple source directories and their subdirectories
     
    Standard formats recognised (MARC, BIB-REF, MEDLINE, etc.)  No
    Customisation of document formats  No
    Stemming  Yes (through truncation)
    Stop words  Yes (using the -Sk option). Instead of using a fixed stop list, glimpseindex works out separately for every index which words are too common.
    Field level indexing  No
    Database updating (merging)  Yes (using the -a option with glimpseindex). This adds the given files and/or directories to an existing index.
    Compression support  No
    WEBGLIMPSE
    • WebGlimpse uses the Glimpse search engine, a fast and flexible searching tool.
    • A WebGlimpse archive is built with a script called confarc. If you are configuring a new archive, run confarc (demo). The script first sets up the right paths to the archive, the URL of the cgi-bin programs, the title of the archive, and other administrative details. It then computes neighborhoods, adds search boxes to selected pages, collects remote pages when relevant, and caches those pages locally.
    • Once indexing is done, the search form is created in wgindex.html. Point to this page from your website or copy the form into your own pages. Users can search from wgindex.html.
    • Run wgreindex. If you want regular updates to the archive, put wgreindex in your crontab.
     
    Additional Features
    • Indexes can be created at the command line using different options of glimpseindex.
      Eg: glimpseindex -H "index files path" -a "index directory"
    • You can create different archives on the hard disk using confarc and provide web access to them using WebGlimpse.
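
The stop-word row in the table above says that the stop list is worked out separately for every index rather than taken from a fixed list. The following Python sketch illustrates that idea only. It is not glimpseindex's code: the function name derive_stop_words, the sample documents, and the rule "a word is too common when it occurs in more than a given fraction of the files" are all assumptions made for illustration.

```python
import re
from collections import Counter

def derive_stop_words(documents, max_fraction=0.8):
    """Return the words that occur in more than max_fraction of the documents.

    Assumption: "too common" is measured as document frequency. The rule
    that glimpseindex really applies may differ.
    """
    doc_freq = Counter()
    for text in documents:
        # Count each word at most once per document.
        doc_freq.update(set(re.findall(r"[a-z0-9]+", text.lower())))
    limit = max_fraction * len(documents)
    return {word for word, count in doc_freq.items() if count > limit}

# Hypothetical three-document collection.
docs = [
    "the index of the library",
    "the search engine and the index",
    "the web server",
]
print(sorted(derive_stop_words(docs, max_fraction=0.8)))
```

Because the list is derived from the collection itself, the same word can be a stop word in one index and an ordinary search term in another.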
     


     

    3. Search Interface
     
     

     
    GLIMPSE: Search Interface 
     
    Boolean
    • Yes. Glimpse supports Boolean operations: an 'AND' operation denoted by the symbol ';', an 'OR' operation denoted by the symbol ',', a 'NOT' operation denoted by the symbol '~', or any combination of these.

      Eg: 1. glimpse 'pizza;cheeseburger'
          2. glimpse '{political,computer};science'
          3. glimpse -W 'fame;~glory'

     
    Query term weighting No
    Relevancy ranking No 
    Proximity/phrase searching No
    Approximate matching  Yes. Exact and approximate matching can be combined.
    Eg: competer (a misspelling that can still match 'computer')
    Truncation  The symbol '#' denotes a sequence of any number of characters.
    Eg: comp#
    Pattern search  Yes. Glimpse supports a large variety of patterns, including simple strings, strings with classes of characters, sets of strings, wild cards, and regular expressions.
    Eg: glimpse \^abc\ ,  glimpse [a-ho-z]
    Search set manipulation  No
    Duplicate detection  No
    Field level searching  No
    Thesaurus/concept searching  No
    QBE (Query-By-Example)/ Relevance feedback searching  No
    Customisation (through CGI programming)  Yes (using Perl scripts)
    Soundex search  Yes 
    Additional features  Searches can be run at the command line using different options of glimpse.
    Eg: glimpse -H "index files path" -n information
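
The Boolean row in the table above describes ';' as AND, ',' as OR and '~' as NOT, with braces for grouping. The Python sketch below shows one way such a query could be evaluated against a piece of text. It is an illustration, not Glimpse's parser: the function name matches is hypothetical, and the sketch assumes that operators are applied left to right, that braces group sub-expressions, and that a word matches as a case-insensitive substring. It does not model how Glimpse decides whether a query applies to a record or to a whole file.

```python
def matches(query, text):
    """Evaluate a Glimpse-style Boolean query against one piece of text.

    Assumptions for this sketch: ';' is AND, ',' is OR, '~' is NOT,
    operators are applied left to right, braces group sub-expressions,
    and a word matches as a case-insensitive substring.
    """
    text = text.lower()
    pos = 0

    def parse_expr():
        nonlocal pos
        result = parse_term()
        while pos < len(query) and query[pos] in ";,":
            op = query[pos]
            pos += 1
            right = parse_term()  # always parse, so pos keeps advancing
            result = (result and right) if op == ";" else (result or right)
        return result

    def parse_term():
        nonlocal pos
        if pos >= len(query):
            raise ValueError("query ends where a term was expected")
        if query[pos] == "~":      # NOT
            pos += 1
            return not parse_term()
        if query[pos] == "{":      # grouped sub-expression
            pos += 1
            result = parse_expr()
            pos += 1               # skip the closing brace
            return result
        start = pos
        while pos < len(query) and query[pos] not in ";,{}~":
            pos += 1
        return query[start:pos].strip().lower() in text

    return parse_expr()

# The three example queries from the table, applied to sample sentences.
print(matches("pizza;cheeseburger", "Pizza and cheeseburger menu"))
print(matches("{political,computer};science", "computer science department"))
print(matches("fame;~glory", "fame and glory"))
```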
     
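
The truncation and approximate-matching rows above can be illustrated in the same spirit. This Python sketch translates a pattern containing '#' into a regular expression and uses the textbook dynamic-programming edit distance to find words within one error of the misspelling 'competer'. Both helper names (truncation_to_regex, edit_distance), the word list, and the choice to anchor the truncated pattern at the start of a word are assumptions. The sketch does not reproduce the matching algorithm that Glimpse itself uses.

```python
import re

def truncation_to_regex(pattern):
    """Translate a pattern in which '#' stands for any run of characters."""
    parts = [re.escape(part) for part in pattern.split("#")]
    return re.compile(".*".join(parts))

def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

# Hypothetical vocabulary of indexed words.
words = ["computer", "compiler", "company", "commuter", "science"]

# Truncation: 'comp#' keeps the words that start with 'comp'.
prefix = truncation_to_regex("comp#")
print([w for w in words if prefix.match(w)])

# Approximate matching: words within one error of 'competer'.
print([w for w in words if edit_distance("competer", w) <= 1])
```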



    4. Results Display:
     
     

      
    GLIMPSE: Results Display 
     
    Formats supported (Native format, ASCII, HTML)  The output of a query is a set of records, one for each matching file. WebGlimpse formats the results in four ways:
    • "Context for each match": WebGlimpse outputs the title of each matching URL with a link to it and, since Glimpse provides all the matching lines or records, these are output too.
    • "Providing the line numbers": Glimpse can compute the right line number for each match, and WebGlimpse has an option to open the document automatically at that line number.
    • "Highlighting keywords": All the matched keywords are highlighted, both in the output records and, when line numbers are used, in the documents themselves.
    • "Showing dates of modification": The date on which each file was last modified is shown.
    Relevancy ranking  No
    Option for viewing document summary  Yes
    Keyword-in-context  No
    Customisation of results display  No
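
The "Highlighting keywords" entry above can be pictured with a small Python sketch that wraps every keyword occurrence in an output record in bold tags. This is an illustration only: the function name highlight and the use of <b> tags are assumptions, and the markup WebGlimpse actually generates is not shown in this report.

```python
import html
import re

def highlight(record, keywords):
    """Wrap each keyword occurrence in <b>...</b> and escape the rest as HTML."""
    if not keywords:
        return html.escape(record)
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    pieces, last = [], 0
    for m in pattern.finditer(record):
        pieces.append(html.escape(record[last:m.start()]))      # text before the match
        pieces.append("<b>" + html.escape(m.group(0)) + "</b>")  # the match itself
        last = m.end()
    pieces.append(html.escape(record[last:]))                   # trailing text
    return "".join(pieces)

print(highlight("Glimpse index & search", ["index"]))
```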
     
     


    5. Costs, license, registration:
     
     

      
    GLIMPSE: Costs, license, registration, etc. 
     
    Completely free  Yes
    Maintenance fee, license fee and any other contractual requirements  Any commercial use of this software will require a license. 
      e-mail: ott@u.arizona.edu 
    http://vpr2.admin.arizona.edu/ott/Webnot96.htm 
     
    Registration for download and use  No
     


     

    6. Unique Features:



     III. GLIMPSE Applications demonstration:

    Application 1: