Package nltk_lite :: Module featurestructure :: Class FeatureStructure

Class FeatureStructure


object --+
         |
        FeatureStructure
Known Subclasses:
parse.category.Category

A structured set of features. These features are represented as a mapping from feature names to feature values, where each feature value is either a basic value (such as a string or an integer), or a nested feature structure.

A feature structure's feature values can be accessed via indexing:

>>> from nltk_lite.featurestructure import FeatureStructure
>>> fstruct1 = FeatureStructure(number='singular', person='3rd')
>>> print fstruct1['number']
singular
>>> fstruct2 = FeatureStructure(subject=fstruct1)
>>> print fstruct2['subject']['person']
3rd

A nested feature value can also be accessed via a feature path, a tuple of feature names that specifies the path to the nested feature value:

>>> fpath = ('subject', 'number')
>>> print fstruct2[fpath]
singular
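
One way a tuple-based lookup of this kind could work is to follow the path one feature name at a time. The following is a plain Python 3 sketch, not the nltk_lite source; the class name PathDict is hypothetical and exists only for this illustration.

```python
class PathDict:
    """Hypothetical stand-in for a feature structure (not the nltk_lite class)."""

    def __init__(self, **features):
        self._features = dict(features)

    def __getitem__(self, index):
        # A plain name looks up a single feature.
        if not isinstance(index, tuple):
            return self._features[index]
        # A tuple is treated as a feature path: descend one name at a time.
        value = self
        for name in index:
            value = value[name]
        return value


inner = PathDict(number='singular', person='3rd')
outer = PathDict(subject=inner)
print(outer['subject', 'number'])  # same lookup as outer[('subject', 'number')]
```

In this sketch an empty tuple returns the structure itself, and a path that runs past a basic value fails with whatever error that value raises on indexing.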

Feature structures may contain reentrant feature values. A reentrant feature value is a single feature value that can be accessed via multiple feature paths.
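
Reentrancy amounts to two paths leading to the same object, as opposed to two equal copies. The difference can be sketched in plain Python 3 with nested dictionaries; the dictionaries and the feature names 'agr' and 'verb' are assumptions made for illustration, and FeatureStructure itself is not used here.

```python
# One shared value, reachable as ['subject']['agr'] and as ['verb']['agr'].
agr = {'number': 'singular', 'person': '3rd'}
shared = {'subject': {'agr': agr}, 'verb': {'agr': agr}}

# Two separate values that merely happen to be equal.
copied = {'subject': {'agr': dict(agr)}, 'verb': {'agr': dict(agr)}}

print(shared['subject']['agr'] is shared['verb']['agr'])   # True: reentrant
print(copied['subject']['agr'] is copied['verb']['agr'])   # False: equal, not shared
print(copied['subject']['agr'] == copied['verb']['agr'])   # True
```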


Note: Should I present them as DAGs instead? That would make it easier to explain reentrancy.

Nested Classes
  _UnificationFailureError
An exception that is used by _destructively_unify to abort unification when a failure is encountered.
Instance Methods

__init__(self, **features)
x.__init__(...) initializes x; see x.__class__.__doc__ for signature

__getitem__(self, index)

feature_names(self)
Returns (list of string): A list of the names of the features whose values are defined by this feature structure.

equal_values(self, other, check_reentrance=True)
Returns: True if self and other assign the same value to every feature.

__eq__(self, other)
Returns: True if self is the same object as other.

__hash__(self)
hash(x)

deepcopy(self, memo=None)
Returns: A new copy of this feature structure.

reentrances(self)
Returns (list of FeatureStructure): A list of all feature structures that can be reached from self by multiple feature paths.

apply_bindings(self, bindings)
Returns (FeatureStructure): The feature structure that is obtained by replacing each variable bound by bindings with its value.

rename_variables(self, newvars=None)
Returns (FeatureStructure): The feature structure that is obtained by replacing each variable in this feature structure with a new variable that has a unique identifier.
 
_apply_bindings(self, bindings, visited)

_rename_variables(self, newvars, visited)

unify(self, other, bindings=None, trace=True)
Unify self with other, and return the resulting feature structure.

_destructively_unify(self, other, bindings, trace=True, ci_str_cmp=True, depth=0)
Attempt to unify self and other by modifying them in-place.

_apply_forwards_to_bindings(self, bindings)
Replace any feature structure that has a forward pointer with the target of its forward pointer (to preserve reentrancy).

_apply_forwards(self, visited)
Replace any feature structure that has a forward pointer with the target of its forward pointer (to preserve reentrancy).

_rebind_aliased_variables(self, bindings, visited)
 
subsumes(self, other)
Check if this feature structure subsumes another feature structure.

__repr__(self)
Display a single-line representation of this feature structure, suitable for embedding in other representations.

__str__(self)
Display a multi-line representation of this feature structure as an FVM (feature value matrix).

_repr(self, reentrances, reentrance_ids)
Returns: A string representation of this feature structure.

_str(self, reentrances, reentrance_ids)
Returns: A list of lines composing a string representation of this feature structure.

_find_reentrances(self, reentrances)
Find all of the feature values contained by self that are reentrant (i.e., that can be reached by multiple paths through the feature structure's features).

Inherited from object: __delattr__, __getattribute__, __new__, __reduce__, __reduce_ex__, __setattr__

Class Methods

_parseval(cls, s, position, reentrances)
Helper function that parses a feature value.

_parse(cls, s, position=0, reentrances=None)
Helper function that parses a feature structure.

parse(cls, s)
Convert a string representation of a feature structure (as displayed by repr) into a FeatureStructure.

Class Variables
  _PARSE_RE = {'assign': re.compile(r'\s*=\s*'), 'bracket': re.c...

Instance Variables
  _features
A dictionary mapping from feature names to values.
  _forward
A pointer to another feature structure that replaced this feature structure.

Properties

Inherited from object: __class__

Method Details

__init__(self, **features)
(Constructor)


x.__init__(...) initializes x; see x.__class__.__doc__ for signature

Overrides: object.__init__
(inherited documentation)

feature_names(self)

Returns: list of string
A list of the names of the features whose values are defined by this feature structure.

equal_values(self, other, check_reentrance=True)

Parameters:
  • check_reentrance - If true, then any difference in the reentrance relations between self and other will cause equal_values to return false.
Returns:
True if self and other assign the same value to every feature. In particular, return true if self[p]==other[p] for every feature path p such that self[p] or other[p] is a base value (i.e., not a nested feature structure).

Note that this is a weaker equality test than ==, which tests for equal identity.
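
The contrast between value equality and identity can be sketched in plain Python 3 as follows. The class Node and the free function equal_values are hypothetical stand-ins, the reentrance check is left out (so the sketch roughly corresponds to check_reentrance=False), and the structures are assumed to be acyclic.

```python
class Node:
    """Hypothetical feature-structure stand-in; == is left as object identity."""

    def __init__(self, **features):
        self.features = dict(features)


def equal_values(a, b):
    """Return True if a and b assign equal values to every feature.

    Reentrance relations are ignored in this sketch.
    """
    if isinstance(a, Node) and isinstance(b, Node):
        if a.features.keys() != b.features.keys():
            return False
        return all(equal_values(a.features[k], b.features[k]) for k in a.features)
    if isinstance(a, Node) or isinstance(b, Node):
        return False  # a nested structure never equals a base value
    return a == b


x = Node(subject=Node(number='singular'))
y = Node(subject=Node(number='singular'))
print(equal_values(x, y))  # True: same values along every path
print(x == y)              # False: different objects
```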

__eq__(self, other)
(Equality operator)

Returns:
True if self is the same object as other. This very strict equality test is necessary because object identity is used to distinguish reentrant objects from non-reentrant ones.

__hash__(self)
(Hashing function)


hash(x)

Overrides: object.__hash__
(inherited documentation)

deepcopy(self, memo=None)

Parameters:
  • memo - The memoization dictionary, which should typically be left unspecified.
Returns:
a new copy of this feature structure.
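
A memoization dictionary matters for reentrancy: it records which objects have already been copied, so a value reached by two paths is copied only once. The standard library's copy.deepcopy behaves this way for plain dictionaries, which the Python 3 sketch below uses as an analogy. Whether FeatureStructure.deepcopy delegates to the copy module is not stated here, so treat the parallel as an assumption.

```python
import copy

agr = {'number': 'singular'}
original = {'subject': agr, 'object': agr}   # 'agr' is reachable by two paths

clone = copy.deepcopy(original)

print(clone['subject'] is clone['object'])      # True: sharing survives the copy
print(clone['subject'] is original['subject'])  # False: the copy is independent
```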

reentrances(self)

Returns: list of FeatureStructure
A list of all feature structures that can be reached from self by multiple feature paths.

apply_bindings(self, bindings)

Returns: FeatureStructure
The feature structure that is obtained by replacing each variable bound by bindings with its value. If self contains an aliased variable that is partially bound by bindings, then that variable's unbound aliases will be bound to its value. E.g., if the bindings <?x=1> are applied to the feature structure [A = ?<x=y>], then the bindings will be updated to <?x=1,?y=1>.
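
The substitution step alone can be sketched in plain Python 3 over nested dictionaries. The Var class and the free function apply_bindings are hypothetical, and bindings is modelled as a plain dict from variable names to values. Aliased variables and reentrancy are deliberately left out of this sketch.

```python
class Var:
    """Hypothetical variable marker, e.g. Var('x') standing for ?x."""

    def __init__(self, name):
        self.name = name


def apply_bindings(fs, bindings):
    """Return a copy of the nested dict fs with bound variables replaced."""
    result = {}
    for name, value in fs.items():
        if isinstance(value, dict):
            result[name] = apply_bindings(value, bindings)
        elif isinstance(value, Var) and value.name in bindings:
            result[name] = bindings[value.name]
        else:
            result[name] = value  # base value or unbound variable
    return result


y = Var('y')
fs = {'A': Var('x'), 'B': {'C': Var('x'), 'D': y}}
bound = apply_bindings(fs, {'x': 1})
print(bound['A'], bound['B']['C'])  # both occurrences of ?x are replaced
```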

rename_variables(self, newvars=None)

Parameters:
  • newvars (dictionary from FeatureStructureVariable to FeatureStructureVariable) - A dictionary that is used to hold the mapping from old variables to new variables. For each variable v in this feature structure:
    • If newvars maps v to v', then v will be replaced by v'.
    • If newvars does not contain v, then a new entry will be added to newvars, mapping v to the new variable that is used to replace it.

    To consistently rename the variables in a set of feature structures, simply apply rename_variables to each one, using the same dictionary:

    >>> newvars = {}  # Maps old vars to alpha-renamed vars
    >>> new_fstruct1 = fstruct1.rename_variables(newvars)
    >>> new_fstruct2 = fstruct2.rename_variables(newvars)
    >>> new_fstruct3 = fstruct3.rename_variables(newvars)

    If newvars is not specified, then an empty dictionary is used.

Returns: FeatureStructure
The feature structure that is obtained by replacing each variable in this feature structure with a new variable that has a unique identifier.
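
The role of the shared newvars dictionary can be sketched in plain Python 3. Here variables are modelled as strings starting with '?' and fresh names are drawn from a counter; both choices are assumptions made for this sketch and are not taken from the nltk_lite source.

```python
import itertools

_counter = itertools.count(1)  # source of fresh variable numbers


def rename_variables(fs, newvars=None):
    """Return a copy of the nested dict fs in which every variable is renamed."""
    if newvars is None:
        newvars = {}  # an empty mapping is used when none is supplied
    result = {}
    for name, value in fs.items():
        if isinstance(value, dict):
            result[name] = rename_variables(value, newvars)
        elif isinstance(value, str) and value.startswith('?'):
            if value not in newvars:
                # First time this variable is seen: record its replacement.
                newvars[value] = '?v%d' % next(_counter)
            result[name] = newvars[value]
        else:
            result[name] = value
    return result


newvars = {}
a = rename_variables({'A': '?x', 'B': {'C': '?y'}}, newvars)
b = rename_variables({'D': '?x'}, newvars)
print(a['A'] == b['D'])  # True: ?x was renamed consistently in both structures
```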

unify(self, other, bindings=None, trace=True)


Unify self with other, and return the resulting feature structure. This unified feature structure is the minimal feature structure that:

  • contains all feature value assignments from both self and other.
  • preserves all reentrance properties of self and other.

If no such feature structure exists (because self and other specify incompatible values for some feature), then unification fails, and unify returns None.

Parameters:
  • bindings - A set of variable bindings to be used and updated during unification. Bound variables are treated as if they were replaced by their values. Unbound variables are bound if they are unified with values; or aliased if they are unified with other unbound variables. If bindings is unspecified, then all variables are assumed to be unbound.
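
The core idea of unification, ignoring variables, bindings, and reentrance, can be sketched in plain Python 3 over nested dictionaries. The free function unify below is a hypothetical simplification and not the method documented here.

```python
def unify(a, b):
    """Return the unification of nested dicts a and b, or None on failure."""
    result = dict(a)
    for name, b_val in b.items():
        if name not in result:
            result[name] = b_val  # only b defines this feature
            continue
        a_val = result[name]
        if isinstance(a_val, dict) and isinstance(b_val, dict):
            merged = unify(a_val, b_val)
            if merged is None:
                return None  # a nested conflict fails the whole unification
            result[name] = merged
        elif a_val != b_val:
            return None  # incompatible values for the same feature
    return result


ok = unify({'subject': {'number': 'singular'}},
           {'subject': {'person': '3rd'}})
clash = unify({'number': 'singular'}, {'number': 'plural'})
print(ok)     # {'subject': {'number': 'singular', 'person': '3rd'}}
print(clash)  # None
```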

_destructively_unify(self, other, bindings, trace=True, ci_str_cmp=True, depth=0)


Attempt to unify self and other by modifying them in-place. If the unification succeeds, then self will contain the unified value, and the value of other is undefined. If the unification fails, then a _UnificationFailureError is raised, and the values of self and other are undefined.
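
Because a failed in-place unification leaves both operands in an undefined state, a caller that wants to keep its inputs would typically work on copies and turn the exception into a return value. The Python 3 sketch below shows that pattern on nested dictionaries. The names UnificationFailure, destructively_unify, and safe_unify are hypothetical, and whether unify is implemented this way is an assumption.

```python
import copy


class UnificationFailure(Exception):
    """Hypothetical counterpart of _UnificationFailureError."""


def destructively_unify(a, b):
    """Merge nested dict b into a in place; raise on the first conflict."""
    for name, b_val in b.items():
        if name not in a:
            a[name] = b_val
        elif isinstance(a[name], dict) and isinstance(b_val, dict):
            destructively_unify(a[name], b_val)
        elif a[name] != b_val:
            raise UnificationFailure(name)


def safe_unify(a, b):
    """Unify deep copies of a and b, so a failure leaves the originals intact."""
    a_copy, b_copy = copy.deepcopy(a), copy.deepcopy(b)
    try:
        destructively_unify(a_copy, b_copy)
    except UnificationFailure:
        return None
    return a_copy


left = {'A': {'B': 1}}
print(safe_unify(left, {'A': {'C': 2}}))  # {'A': {'B': 1, 'C': 2}}
print(safe_unify(left, {'A': {'B': 2}}))  # None
print(left)                               # {'A': {'B': 1}}
```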

__repr__(self)
(Representation operator)


Display a single-line representation of this feature structure, suitable for embedding in other representations.

Overrides: object.__repr__

__str__(self)
(Informal representation operator)


Display a multi-line representation of this feature structure as an FVM (feature value matrix).

Overrides: object.__str__

_repr(self, reentrances, reentrance_ids)

Parameters:
  • reentrances - A dictionary that maps the id of each feature value in self to a boolean indicating whether that value is reentrant.
  • reentrance_ids - A dictionary mapping from the ids of feature values to unique identifiers. This is modified by repr: the first time a reentrant feature value is displayed, an identifier is added to reentrance_ids for it.
Returns:
A string representation of this feature structure.

_str(self, reentrances, reentrance_ids)

Parameters:
  • reentrances - A dictionary that maps the id of each feature value in self to a boolean indicating whether that value is reentrant.
  • reentrance_ids - A dictionary mapping from the ids of feature values to unique identifiers. This is modified by repr: the first time a reentrant feature value is displayed, an identifier is added to reentrance_ids for it.
Returns:
A list of lines composing a string representation of this feature structure.

_find_reentrances(self, reentrances)


Find all of the feature values contained by self that are reentrant (i.e., that can be reached by multiple paths through the feature structure's features). Return a dictionary reentrances that maps from the id of each feature value to a boolean value, indicating whether it is reentrant or not.
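
A traversal of this kind can be sketched in plain Python 3 over nested dictionaries. The free function find_reentrances is hypothetical; it tracks only nested structures (not basic values) and assumes the structure is acyclic.

```python
def find_reentrances(fs, reentrances=None):
    """Map id(value) to True/False for every nested dict reachable from fs."""
    if reentrances is None:
        reentrances = {}
    for value in fs.values():
        if not isinstance(value, dict):
            continue  # only nested structures are tracked in this sketch
        if id(value) in reentrances:
            reentrances[id(value)] = True   # seen before: a second path exists
        else:
            reentrances[id(value)] = False  # first visit
            find_reentrances(value, reentrances)
    return reentrances


agr = {'number': 'singular'}
fs = {'subject': {'agr': agr}, 'verb': {'agr': agr}}
found = find_reentrances(fs)
print(found[id(agr)])            # True: reached via subject and via verb
print(found[id(fs['subject'])])  # False: reached by one path only
```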

_parseval(cls, s, position, reentrances)
Class Method


Helper function that parses a feature value. Currently supports: None, integers, variables, strings, nested feature structures.

Parameters:
  • s - The string to parse.
  • position - The position in the string to start parsing.
  • reentrances - A dictionary from reentrance ids to values.
Returns:
A tuple (val, pos) of the value created by parsing and the position where the parsed value ends.

_parse(cls, s, position=0, reentrances=None)
Class Method


Helper function that parses a feature structure.

Parameters:
  • s - The string to parse.
  • position - The position in the string to start parsing.
  • reentrances - A dictionary from reentrance ids to values.
Returns:
A tuple (val, pos) of the feature structure created by parsing and the position where the parsed feature structure ends.

parse(cls, s)
Class Method


Convert a string representation of a feature structure (as displayed by repr) into a FeatureStructure. This parser imposes the following restrictions on the string representation:

  • Feature names cannot contain any of the following: whitespace, parentheses, quote marks, equals signs, dashes, and square brackets.
  • Only the following basic feature values are supported: strings, integers, variables, None, and unquoted alphanumeric strings.
  • For reentrant values, the first mention must specify a reentrance identifier and a value; and any subsequent mentions must use arrows ('->') to reference the reentrance identifier.

Class Variable Details

_PARSE_RE

Value:
{'assign': re.compile(r'\s*=\s*'),
 'bracket': re.compile(r'\s*\]\s*'),
 'comma': re.compile(r'\s*,\s*'),
 'ident': re.compile(r'\s*\((\d+)\)\s*'),
 'int': re.compile(r'-?\d+(?=\s|\]|,)'),
 'name': re.compile(r'\s*([^\s\(\)"\'-=\[\]]+)\s*'),
 'none': re.compile(r'None(?=\s|\]|,)'),
 'reentrance': re.compile(r'\s*->\s*'),
...
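
A few of the patterns listed above can be tried in isolation with Python's re module. The sketch below (Python 3) copies the 'ident', 'int', and 'reentrance' patterns verbatim from the listing; the sample input strings are made up for illustration and are not claimed to be complete feature structure strings.

```python
import re

# Patterns copied from the _PARSE_RE listing.
IDENT = re.compile(r'\s*\((\d+)\)\s*')
INT = re.compile(r'-?\d+(?=\s|\]|,)')
REENTRANCE = re.compile(r'\s*->\s*')

print(IDENT.match('(1) [B=b]').group(1))   # 1
print(INT.match('-42]').group())           # -42
print(INT.match('42abc'))                  # None: lookahead needs space, ']' or ','
print(bool(REENTRANCE.match(' -> (1)')))   # True
```

The lookahead in the 'int' pattern means a number is only recognized when it is followed by whitespace, a closing bracket, or a comma.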

Instance Variable Details

_forward

A pointer to another feature structure that replaced this feature structure. This is used during the unification process to preserve reentrance. In particular, if we're unifying feature structures A and B, where:
  • x and y are feature paths.
  • A contains a feature structure A[x]
  • B contains a reentrant feature structure B[x]=B[y]

Then we need to ensure that in the unified structure C, C[x]=C[y]. (Here the equals sign is used to denote the object identity relation, i.e., is.)
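
The forward-pointer idea can be sketched in plain Python 3. The Node class and the helpers resolve and apply_forwards are hypothetical stand-ins. The merge step is simulated by hand rather than produced by a real unifier, and for brevity the merged result is written back into A instead of a separate structure C.

```python
class Node:
    """Hypothetical stand-in with a forward pointer, as described above."""

    def __init__(self, **features):
        self.features = dict(features)
        self.forward = None  # set when another node replaces this one


def resolve(node):
    """Follow forward pointers until reaching the node that is still live."""
    while node.forward is not None:
        node = node.forward
    return node


def apply_forwards(node, visited=None):
    """Replace every forwarded child with its target, in place."""
    if visited is None:
        visited = set()
    if id(node) in visited:
        return node
    visited.add(id(node))
    for name, value in list(node.features.items()):
        if isinstance(value, Node):
            value = resolve(value)
            node.features[name] = value
            apply_forwards(value, visited)
    return node


a_x = Node(n=1)
b_xy = Node(p=3)
A = Node(x=a_x)
B = Node(x=b_xy, y=b_xy)            # B[x] and B[y] are the same object

# Simulate unifying A[x] with B[x]: a_x absorbs b_xy, and b_xy forwards to it.
a_x.features.update(b_xy.features)
b_xy.forward = a_x
A.features['y'] = B.features['y']   # still the replaced node at this point

apply_forwards(A)
print(A.features['x'] is A.features['y'])  # True
```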