
Trie Class Reference

Stores URL strings by superposition of common prefixes.

#include <trie.h>

Public Methods

 Trie (unsigned long slen, long jlen)
ptrdiff_t FindURL (const char *url)
 Searches for a URL within the trie.

ptrdiff_t InsertURL (const char *url) throw (overflow_error)
 Inserts a URL string into the trie.

void Statistics (ostream &o)
uint32 StatsCumulativeStringSize ()
uint32 StatsBigstringInsertions ()
uint32 StatsJumptableInsertions ()

Public Attributes

char * bigs

Protected Attributes

unsigned long slen_
unsigned long end_of_bigs
SimpleCharPtrHashTable * jumptable
uint32 stats_cumulative_string_size
uint32 stats_bigstring_insertions
uint32 stats_jumptable_insertions

Detailed Description

Stores URL strings by superposition of common prefixes.

This class implements a classic trie structure (Knuth, Vol. 3). It consists of a very long string space together with a hash table (the jumptable) that allows navigation within that space.

The trie allows large space savings by storing common string prefixes only once.

Definition at line 45 of file trie.h.
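
The following is a minimal sketch of how a string space plus a jump table can share common prefixes. It is not the code in trie.cc: the class name SketchTrie, its members, and the layout (a '\0'-terminated run per appended suffix, and a jump table keyed on position and character) are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Illustrative only: SketchTrie is a hypothetical class, not the Trie in trie.h.
class SketchTrie {
 public:
  SketchTrie() : bigs_(1, '\0') {}  // position 0 acts as a root sentinel

  // Inserts url and returns the offset of its last character in the string space.
  std::ptrdiff_t Insert(const char* url) {
    assert(url != nullptr);
    std::size_t cur = 0;
    for (const char* p = url; *p != '\0'; ++p) {
      std::size_t next = 0;
      if (Step(cur, *p, &next)) { cur = next; continue; }
      // No stored path continues with *p: append the remaining suffix as a
      // new '\0'-terminated run and record a jump from cur to its start.
      jump_[Key(cur, *p)] = bigs_.size();
      bigs_.append(p);
      cur = bigs_.size() - 1;
      bigs_.push_back('\0');
      return static_cast<std::ptrdiff_t>(cur);
    }
    // url is a prefix of stored data: mark its end explicitly.
    if (!IsEnd(cur)) jump_[Key(cur, '\0')] = cur;
    return static_cast<std::ptrdiff_t>(cur);
  }

  // Returns the offset of the last character of url, or -1 if it was never inserted.
  std::ptrdiff_t Find(const char* url) const {
    assert(url != nullptr);
    std::size_t cur = 0;
    for (const char* p = url; *p != '\0'; ++p) {
      std::size_t next = 0;
      if (!Step(cur, *p, &next)) return -1;
      cur = next;
    }
    return IsEnd(cur) ? static_cast<std::ptrdiff_t>(cur) : -1;
  }

  std::size_t StringSpaceSize() const { return bigs_.size(); }

 private:
  static std::uint64_t Key(std::size_t pos, char c) {
    return (static_cast<std::uint64_t>(pos) << 8) | static_cast<unsigned char>(c);
  }

  // Moves from cur along character c: first along the current run, then via a jump.
  bool Step(std::size_t cur, char c, std::size_t* next) const {
    if (cur + 1 < bigs_.size() && bigs_[cur + 1] == c) { *next = cur + 1; return true; }
    auto it = jump_.find(Key(cur, c));
    if (it == jump_.end()) return false;
    *next = it->second;
    return true;
  }

  // A position ends a URL if its run terminates right after it or it was marked.
  bool IsEnd(std::size_t cur) const {
    if (cur != 0 && cur + 1 < bigs_.size() && bigs_[cur + 1] == '\0') return true;
    return jump_.count(Key(cur, '\0')) != 0;
  }

  std::string bigs_;                                     // the string space
  std::unordered_map<std::uint64_t, std::size_t> jump_;  // the jump table
};
```

In this sketch, inserting "http://a.com/x" and then "http://a.com/y" stores the shared prefix "http://a.com/" once; the second insertion appends only "y" and adds a single jump-table entry.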


Constructor & Destructor Documentation

Trie::Trie (unsigned long slen, long jlen)
 

Definition at line 29 of file trie.cc.

References bigs, end_of_bigs, jumptable, SimpleCharPtrHashTable, slen_, stats_bigstring_insertions, stats_cumulative_string_size, and stats_jumptable_insertions.


Member Function Documentation

ptrdiff_t Trie::FindURL (const char * url)
 

Searches for a URL within the trie.

If the URL is found, returns the location of the end of the URL string within the trie, expressed as a ptrdiff_t rather than a char pointer; otherwise returns -1.

Definition at line 62 of file trie.cc.

References bigs, SimpleHashTable< char * >::Find(), and jumptable.

Referenced by GraphBuilder::FindLeafNodeKey(), and GraphBuilder::FindWebNode().

ptrdiff_t Trie::InsertURL (const char * url) throw (overflow_error)
 

Inserts a URL string into the trie.

Always returns the location of the end of the inserted string, expressed as a ptrdiff_t rather than a char pointer. The description also refers to a second argument of urltype, which signifies that the URL being inserted is either the current document's URL or a URL found in one of the current document's anchor tags; the signature documented here takes only url.

Note: InsertURL assumes url is a pure 7-bit ASCII string.

Definition at line 131 of file trie.cc.

References NULL.

Referenced by GraphBuilder::NodeSetURL(), and GraphBuilder::TrieInsertLinkURL().
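
A caller may want to enforce the 7-bit ASCII assumption and handle overflow_error before relying on the return value. The sketch below shows one way to do that. IsPureAscii, GuardedInsert, and the two stand-in types are hypothetical and not part of trie.h; returning -1 on rejection is a choice of this sketch, borrowed from the FindURL convention.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Hypothetical helper: true when every byte of url is 7-bit ASCII.
inline bool IsPureAscii(const char* url) {
  assert(url != nullptr);
  for (const unsigned char* p = reinterpret_cast<const unsigned char*>(url);
       *p != 0; ++p) {
    if (*p > 0x7F) return false;
  }
  return true;
}

// Hypothetical wrapper. TrieLike is any type exposing
// ptrdiff_t InsertURL(const char*), as Trie does according to this page.
// Returns -1 if url is not pure ASCII or the insertion throws overflow_error.
template <typename TrieLike>
std::ptrdiff_t GuardedInsert(TrieLike& trie, const char* url) {
  if (!IsPureAscii(url)) return -1;
  try {
    return trie.InsertURL(url);
  } catch (const std::overflow_error&) {
    return -1;
  }
}

// Stand-ins used only to exercise the wrapper; neither is the real Trie.
struct FullTrie {
  std::ptrdiff_t InsertURL(const char*) {
    throw std::overflow_error("string space exhausted");
  }
};

struct EchoTrie {
  std::ptrdiff_t InsertURL(const char* url) {
    return static_cast<std::ptrdiff_t>(std::strlen(url));
  }
};
```

With the real class, the call would look like GuardedInsert(trie, url), where trie is a Trie instance; the stand-ins exist only so the wrapper can be compiled without trie.h.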

void Trie::Statistics (ostream & o)
 

Definition at line 44 of file trie.cc.

References end_of_bigs, jumptable, SimpleHashTable< char * >::Size(), slen_, stats_bigstring_insertions, stats_cumulative_string_size, and stats_jumptable_insertions.

Referenced by GraphBuilder::StatisticsMem().

uint32 Trie::StatsBigstringInsertions () [inline]
 

Definition at line 55 of file trie.h.

References stats_bigstring_insertions, and uint32.

uint32 Trie::StatsCumulativeStringSize () [inline]
 

Definition at line 53 of file trie.h.

References stats_cumulative_string_size, and uint32.

uint32 Trie::StatsJumptableInsertions () [inline]
 

Definition at line 57 of file trie.h.

References stats_jumptable_insertions, and uint32.


Member Data Documentation

char* Trie::bigs
 

Definition at line 60 of file trie.h.

Referenced by FindURL(), Trie(), and GraphBuilder::TrieInsertLinkURL().

unsigned long Trie::end_of_bigs [protected]
 

Definition at line 65 of file trie.h.

Referenced by Statistics(), and Trie().

SimpleCharPtrHashTable* Trie::jumptable [protected]
 

Definition at line 67 of file trie.h.

Referenced by FindURL(), Statistics(), and Trie().

unsigned long Trie::slen_ [protected]
 

Definition at line 64 of file trie.h.

Referenced by Statistics(), and Trie().

uint32 Trie::stats_bigstring_insertions [protected]
 

Definition at line 70 of file trie.h.

Referenced by Statistics(), StatsBigstringInsertions(), and Trie().

uint32 Trie::stats_cumulative_string_size [protected]
 

Definition at line 69 of file trie.h.

Referenced by Statistics(), StatsCumulativeStringSize(), and Trie().

uint32 Trie::stats_jumptable_insertions [protected]
 

Definition at line 71 of file trie.h.

Referenced by Statistics(), StatsJumptableInsertions(), and Trie().


Generated on Wed May 29 11:37:28 2002 for MarkovPR by doxygen 1.2.15