
WebNode Class Reference

Encapsulates a web document.

#include <webnode.h>

[Inheritance diagram for WebNode]

[Collaboration diagram for WebNode]

Public Methods

 WebNode (uint32 idno)
void InsertRawLinks (RawLinkSet *s)
 Inserts a set of tolinks into the WebNode.

void NormalizeRawLinks (SimpleHashTable< WebNodePtr > *h)
 Sorts all valid links into the first part of the tolinks array.

size_t RealSize ()
 Returns the full size of the WebNode including link arrays.

int NumberOfValidToLinks ()
int NumberOfDanglingToLinks ()
int NumberOfValidFromLinks ()
int NumberOfLeafLinks ()
void IncrementNumberOfFromLinks ()
void AppendFromLink (WebNodePtr anothernode) throw (overflow_error)
 This appends a webnode to the fromlinks list.

void UpdateLeafLinks (SimpleLeafNodePtrHashTable *leaftable)
 Sifts LeafNode pointers upward in the tolinks array.

void SetDate (uint16 adate)
 Sets the earliest known date of the WebNode.

WebNodePtr ValidToLink (int k)
WebNodePtr ValidFromLink (int k)
LeafNodePtr ValidLeafLink (int k)
LeafNodePtr ValidLeafLinkDirectly (int k)
uint32 ID ()
uint16 Date ()
void ClearTag ()
void SetTag (int k)
bool Tagged (int k)
void ClearOccupationCount ()
uint32 OccupationCount ()
void IncrementOccupationCount ()
void IncrementOccupationCount (int c)
ScratchStruct Scratch ()
void SetScratch (ScratchStruct ascratch)

Static Private Attributes

MemPool< LinkStruct > global_link_pool

Detailed Description

Encapsulates a web document.

Every web document read by the ripper is represented by a WebNode. Constructing a WebNode is complicated and is done by GraphBuilder, which also links the nodes into a WebLinkGraph. All the data members are defined in a WebNodeStruct; WebNode is really just a wrapper around WebNodeStruct that handles custom memory management. The class inherits its memory management from MemoryPooled<T> (here MemoryPooled< WebNodeStruct >).
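
The wrapper arrangement described above can be pictured with a minimal sketch. The names Pooled, NodeData, and Node are hypothetical stand-ins, not the actual MemoryPooled<T>, WebNodeStruct, or WebNode definitions, and the free-list pool is only an assumption about what the custom memory management might look like. The one detail taken from this page is that the wrapper reaches its fields through a member called data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical plain-data struct standing in for WebNodeStruct.
struct NodeData {
    std::uint32_t id;
    std::uint16_t date;
};

// Hypothetical base class standing in for MemoryPooled<T>. Each object
// owns one slot in a shared pool; released slots are recycled.
template <class T>
class Pooled {
public:
    Pooled() : data(Allocate()) {}
    ~Pooled() { Release(data); }
    Pooled(const Pooled&) = delete;
    Pooled& operator=(const Pooled&) = delete;

    // Number of slots currently handed out.
    static std::size_t LiveCount() { return Live(); }

protected:
    T* data;  // points into the pool, not at an individually new'ed object

private:
    static std::deque<T>& Storage() { static std::deque<T> s; return s; }
    static std::vector<T*>& FreeList() { static std::vector<T*> f; return f; }
    static std::size_t& Live() { static std::size_t n = 0; return n; }

    static T* Allocate() {
        ++Live();
        if (!FreeList().empty()) {        // reuse a released slot first
            T* slot = FreeList().back();
            FreeList().pop_back();
            return slot;
        }
        Storage().emplace_back();         // deque growth keeps old addresses valid
        return &Storage().back();
    }

    static void Release(T* slot) {
        --Live();
        FreeList().push_back(slot);
    }
};

// Hypothetical wrapper standing in for WebNode: accessors only, no fields.
class Node : public Pooled<NodeData> {
public:
    explicit Node(std::uint32_t idno) { data->id = idno; data->date = 0; }
    std::uint32_t ID() const { return data->id; }
    std::uint16_t Date() const { return data->date; }
};
```

Keeping every field in the plain struct means the wrapper adds no storage of its own, so the pool alone decides where node data lives.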

Definition at line 92 of file webnode.h.


Constructor & Destructor Documentation

WebNode::WebNode (uint32 idno)

Definition at line 39 of file webnode.cc.

References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::date, WebNodeStruct::fromlinks, WebNodeStruct::id, WebNodeStruct::num_fromlinks, WebNodeStruct::num_leaflinks, WebNodeStruct::num_tolinks, WebNodeStruct::num_valid_tolinks, WebNodeStruct::tolinks, and uint32.


Member Function Documentation

void WebNode::AppendFromLink (WebNodePtr anothernode) throw (overflow_error)

This appends a webnode to the fromlinks list.

Definition at line 99 of file webnode.cc.

References MemPool< LinkStruct >::Allocate(), MemPoolObject< S >::data, and global_link_pool.

void WebNode::ClearOccupationCount () [inline]

Definition at line 160 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::occupation_count, and WebNodeStruct::scratch.

void WebNode::ClearTag () [inline]

Definition at line 148 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::tag.

uint16 WebNode::Date () [inline]

Definition at line 145 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::date, and uint16.

Referenced by GraphBuilder::NodeGetDate(), GraphBuilder::NodeLaunch(), and DateBiasedPageRankSampler::QEvolveFrom().

uint32 WebNode::ID () [inline]

Definition at line 143 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::id, and uint32.

Referenced by Talker::LoadLeaves(), and GraphBuilder::NodeGetID().

void WebNode::IncrementNumberOfFromLinks () [inline]

Definition at line 109 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::num_fromlinks.

void WebNode::IncrementOccupationCount (int c) [inline]

Definition at line 169 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::occupation_count.

void WebNode::IncrementOccupationCount () [inline]

Definition at line 167 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::occupation_count.

Referenced by WebSampler::SimulateAllocForward(), and WebSampler::TaggedSimulateForward().

void WebNode::InsertRawLinks (RawLinkSet *s)

Inserts a set of tolinks into the WebNode.

Inserts (or merges) the raw (character ptr) links into the webnode's tolinks array.

Definition at line 70 of file webnode.cc.

References MemPool< LinkStruct >::Allocate(), MemPoolObject< LinkStruct >::data, MemoryPooled< WebNodeStruct >::data, MemPool< LinkStruct >::Deallocate(), global_link_pool, WebNodeStruct::num_leaflinks, WebNodeStruct::num_tolinks, WebNodeStruct::num_valid_tolinks, LinkStruct::pointer_diff, RawLinkSet, and WebNodeStruct::tolinks.

Referenced by GraphBuilder::NodeInsertLinks().

void WebNode::NormalizeRawLinks (SimpleHashTable< WebNodePtr > *h)

Sorts all valid links into the first part of the tolinks array.

While sorting, it also converts each pointer_diff into a webnode_ptr. Dangling links are left in the upper part of the array.

Note: the second argument should really be a data member of the first (more elegant), but then we'd have to create a derived SimpleHashTable...
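
The partitioning described above can be sketched as follows. This is only an illustration under stated assumptions: Node, Link, NodeTable, and NormalizeLinks are hypothetical names, std::unordered_map stands in for SimpleHashTable< WebNodePtr >, and the raw key and the resolved pointer are kept in separate fields, which need not match how LinkStruct stores pointer_diff and webnode_ptr.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a node reachable through the hash table.
struct Node {
    std::uint32_t id;
};

// Hypothetical stand-in for LinkStruct.
struct Link {
    std::uint32_t raw_key;  // unresolved reference (plays the role of pointer_diff)
    Node* target;           // resolved pointer (plays the role of webnode_ptr)
};

using NodeTable = std::unordered_map<std::uint32_t, Node*>;

// Moves every resolvable link to the front of the array and fills in its
// target. Returns the number of valid links; everything from that index
// onward is dangling.
std::size_t NormalizeLinks(std::vector<Link>& links, const NodeTable& table) {
    std::size_t num_valid = 0;
    for (std::size_t i = 0; i < links.size(); ++i) {
        auto found = table.find(links[i].raw_key);
        if (found == table.end()) {
            continue;                            // dangling: leave it in the upper part
        }
        links[i].target = found->second;         // convert key to pointer
        std::swap(links[i], links[num_valid]);   // grow the valid prefix
        ++num_valid;
    }
    return num_valid;
}
```

A single pass with one swap per valid link is enough here because only the split between valid and dangling links matters; the order inside each group is not preserved.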

Definition at line 137 of file webnode.cc.

References MemPoolObject< LinkStruct >::data, MemoryPooled< WebNodeStruct >::data, SimpleHashTable< R >::Find(), WebNodeStruct::num_tolinks, WebNodeStruct::num_valid_tolinks, NumberOfValidToLinks(), OccupationCount(), LinkStruct::pointer_diff, WebNodeStruct::tolinks, ValidToLink(), and LinkStruct::webnode_ptr.

int WebNode::NumberOfDanglingToLinks () [inline]

Definition at line 102 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::num_leaflinks, WebNodeStruct::num_tolinks, and WebNodeStruct::num_valid_tolinks.

int WebNode::NumberOfLeafLinks () [inline]

Definition at line 106 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::num_leaflinks.

Referenced by DateBiasedPageRankSampler::QEvolveFrom(), PageRankSampler::QEvolveFrom(), and UpdateLeafLinks().

int WebNode::NumberOfValidFromLinks () [inline]

Definition at line 104 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::num_fromlinks.

Referenced by WebLinkGraph::BuildFromSets(), and TruncatedKleinbergSampler::QEvolveFrom().

int WebNode::NumberOfValidToLinks () [inline]

Definition at line 100 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::num_valid_tolinks.

Referenced by NormalizeRawLinks(), TruncatedKleinbergSampler::QEvolveFrom(), DateBiasedPageRankSampler::QEvolveFrom(), PageRankSampler::QEvolveFrom(), WebSampler::SimulateAllocForward(), WebSampler::TaggedSimulateForward(), and UpdateLeafLinks().

uint32 WebNode::OccupationCount () [inline]

Definition at line 165 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::occupation_count, and uint32.

Referenced by NormalizeRawLinks(), and UpdateLeafLinks().

size_t WebNode::RealSize ()

Returns the full size of the WebNode including link arrays.
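
One plausible reading of "full size" is the fixed struct plus one link record per tolink and fromlink. The sketch below shows that arithmetic only; Link, NodeData, and RealSizeOf are hypothetical names, and the formula is an assumption rather than the code in webnode.cc.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for LinkStruct.
struct Link {
    void* target;
};

// Hypothetical stand-in for WebNodeStruct, reduced to the link bookkeeping.
struct NodeData {
    Link* tolinks;
    Link* fromlinks;
    int num_tolinks;
    int num_fromlinks;
};

// Assumed formula: fixed part plus both link arrays.
std::size_t RealSizeOf(const NodeData& d) {
    std::size_t link_count =
        static_cast<std::size_t>(d.num_tolinks + d.num_fromlinks);
    return sizeof(NodeData) + link_count * sizeof(Link);
}
```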

Definition at line 175 of file webnode.cc.

References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::num_fromlinks, and WebNodeStruct::num_tolinks.

ScratchStruct WebNode::Scratch () [inline]

Definition at line 172 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::scratch, and ScratchStruct.

Referenced by DateBiasedPageRankSampler::QEvolveFrom().

void WebNode::SetDate (uint16 adate)

Sets the earliest known date of the WebNode.

This function is designed to be called several times. The earliest nonzero date is retained.
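
The retention rule can be sketched as a small pure function. EarliestNonzeroDate is a hypothetical helper, not part of WebNode, and it assumes that a date of 0 means "no date known".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns the date that should be stored after a new
// candidate arrives. Assumes 0 means "unknown".
std::uint16_t EarliestNonzeroDate(std::uint16_t current, std::uint16_t candidate) {
    if (candidate == 0) {
        return current;      // candidate carries no information
    }
    if (current == 0) {
        return candidate;    // first real date seen
    }
    return candidate < current ? candidate : current;  // keep the earlier one
}
```

Because the result depends only on the smallest nonzero value seen so far, repeated calls can arrive in any order and still settle on the same date.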

Definition at line 55 of file webnode.cc.

References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::date, and uint16.

Referenced by GraphBuilder::NodeSetDate().

void WebNode::SetScratch (ScratchStruct ascratch) [inline]

Definition at line 174 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::scratch, and ScratchStruct.

Referenced by DateBiasedPageRankSampler::QEvolveFrom().

void WebNode::SetTag (int k) [inline]

Definition at line 150 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::tag, and TAG_NUMBER_OF_BITS.

Referenced by WebLinkGraph::BuildFromSets(), and Talker::BuildTags().

bool WebNode::Tagged (int k) [inline]

Definition at line 155 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, and WebNodeStruct::tag.

Referenced by WebLinkGraph::BuildFromSets(), and WebSampler::TaggedSimulateForward().

void WebNode::UpdateLeafLinks (SimpleLeafNodePtrHashTable *leaftable)

Sifts LeafNode pointers upward in the tolinks array.

The code for this function is very similar to that of NormalizeRawLinks().

Definition at line 185 of file webnode.cc.

References MemPoolObject< LinkStruct >::data, MemoryPooled< WebNodeStruct >::data, SimpleHashTable< LeafNodePtr >::Find(), LinkStruct::leafnode_ptr, WebNodeStruct::num_leaflinks, WebNodeStruct::num_tolinks, WebNodeStruct::num_valid_tolinks, NumberOfLeafLinks(), NumberOfValidToLinks(), OccupationCount(), LinkStruct::pointer_diff, WebNodeStruct::tolinks, and ValidLeafLink().

WebNodePtr WebNode::ValidFromLink (int k) [inline]

Definition at line 124 of file webnode.h.

References MemPoolObject< LinkStruct >::data, MemoryPooled< WebNodeStruct >::data, WebNodeStruct::fromlinks, and LinkStruct::webnode_ptr.

Referenced by WebLinkGraph::BuildFromSets(), and TruncatedKleinbergSampler::QEvolveFrom().

LeafNodePtr WebNode::ValidLeafLink (int k) [inline]

Definition at line 129 of file webnode.h.

References MemoryPooled< WebNodeStruct >::data, WebNodeStruct::num_leaflinks, WebNodeStruct::num_valid_tolinks, and WebNodeStruct::tolinks.

Referenced by DateBiasedPageRankSampler::QEvolveFrom(), and UpdateLeafLinks().

LeafNodePtr WebNode::ValidLeafLinkDirectly (int k) [inline]

Same as ValidLeafLink(), but saves a +/- (one addition or subtraction) when compiled with -O3.

Definition at line 135 of file webnode.h.

References MemPoolObject< LinkStruct >::data, MemoryPooled< WebNodeStruct >::data, LinkStruct::leafnode_ptr, WebNodeStruct::num_leaflinks, WebNodeStruct::num_valid_tolinks, and WebNodeStruct::tolinks.

Referenced by DateBiasedPageRankSampler::QEvolveFrom(), and PageRankSampler::QEvolveFrom().

WebNodePtr WebNode::ValidToLink (int k) [inline]

Definition at line 118 of file webnode.h.

References MemPoolObject< LinkStruct >::data, MemoryPooled< WebNodeStruct >::data, WebNodeStruct::num_valid_tolinks, WebNodeStruct::tolinks, and LinkStruct::webnode_ptr.

Referenced by NormalizeRawLinks(), TruncatedKleinbergSampler::QEvolveFrom(), DateBiasedPageRankSampler::QEvolveFrom(), and PageRankSampler::QEvolveFrom().


Member Data Documentation

MemPool< LinkStruct > WebNode::global_link_pool [static, private]
 

Definition at line 31 of file webnode.cc.

Referenced by AppendFromLink(), and InsertRawLinks().


Generated on Wed May 29 11:37:29 2002 for MarkovPR by doxygen 1.2.15