ged4py.model

Module containing Python in-memory model for GEDCOM data.

Functions

make_record(level, xref_id, tag, value, …)

Create Record instance based on parameters.

Classes

Date()

Sub-class of Record representing the DATE record.

Dialect(value)

Even though the structure of a GEDCOM file is more or less fixed, interpretation of some data may vary depending on which application produced the file.

Individual()

Sub-class of Record representing the INDI record.

Name(names, dialect)

Class representing “summary” of person names.

NameOrder(value)

Names/Individuals can be ordered differently, e.g.

NameRec()

Sub-class of Record representing the NAME record.

Pointer(parser)

Sub-class of Record representing a pointer to a record in a GEDCOM file.

Record()

Class representing a parsed GEDCOM record in a generic format.

ged4py.model.make_record(level, xref_id, tag, value, sub_records, offset, dialect, parser=None)

Create Record instance based on parameters.

Parameters
level: int

Record level number.

xref_id: str

Record reference ID, possibly empty.

tag: str

Tag name.

value: str

Record value, possibly empty. The value can be None, bytes, or a string object. If it is bytes, it should be decoded into a string before calling freeze(); this is normally done by the parser, which knows about encodings.

sub_records: list [ Record ]

Initial list of subordinate records, possibly empty. List can be updated later.

offset: int

Record location in a file.

dialect: Dialect

One of Dialect enums.

parser: GedcomReader

Parser instance, only needed for pointer records.

Returns
record: Record

Instance of Record (or one of its subclasses).

Notes

This is the factory method for record instances. It can create different types of record based on the tag or value:

  • if value has a pointer form (@ref_id@) then Pointer instance is created

  • if tag is “INDI” then Individual instance is created

  • if tag is “NAME” then NameRec instance is created

  • if tag is “DATE” then Date instance is created

  • otherwise Record instance is created

The returned record is not complete and may still be updated by the parser. When the parser finishes its updates, it calls the Record.freeze() method to finalize record construction.
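The selection rules listed above can be sketched as a small dispatch function. This is only an illustration: the empty classes are stand-ins for the real ones, pick_record_class is a hypothetical helper, the pointer pattern is an assumed reading of the "@ref_id@" form, and the sketch assumes the rules are applied in the order they are listed.

```python
import re


# Empty stand-ins for the real classes; only their names matter here.
class Record: ...
class Pointer(Record): ...
class Individual(Record): ...
class NameRec(Record): ...
class Date(Record): ...


# Assumed pointer form: "@", one or more characters other than "@", "@".
_POINTER_RE = re.compile(r"^@[^@]+@$")

# Tag-specific classes, consulted only when the value is not a pointer.
_TAG_CLASSES = {"INDI": Individual, "NAME": NameRec, "DATE": Date}


def pick_record_class(tag, value):
    """Return the class that the listed rules select for (tag, value).

    Only str values are checked for the pointer form; bytes values are
    not handled in this sketch.
    """
    if isinstance(value, str) and _POINTER_RE.match(value):
        return Pointer
    return _TAG_CLASSES.get(tag, Record)


for tag, value in [("FAMC", "@F1@"), ("INDI", None), ("NOTE", "some text")]:
    print(tag, value, "->", pick_record_class(tag, value).__name__)
```

Because the pointer check comes first in this sketch, a record whose value has the pointer form becomes a Pointer whatever its tag is.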

class ged4py.model.Record

Bases: object

Class representing a parsed GEDCOM record in a generic format.

This is the main element of the data model; it represents records in GEDCOM files. Each GEDCOM record consists of a small number of items:

  • level number, integer;

  • optional reference ID, string in format @identifier@;

  • tag name, short string;

  • optional record value, arbitrary string, for pointer records the record value is the reference ID of some other record.

For many record types GEDCOM specifies subordinate (nested) records, each with a level number one higher than that of its parent record.
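To make the four items concrete, the following sketch splits raw GEDCOM lines into level, reference ID, tag, and value. The regular expression, the Line tuple, and split_line are hypothetical helpers written for this example; they are not part of ged4py, and a real parser also has to deal with encodings and malformed input.

```python
import re
from typing import NamedTuple, Optional


class Line(NamedTuple):
    """The four items of one GEDCOM line (hypothetical helper type)."""
    level: int
    xref_id: Optional[str]
    tag: str
    value: Optional[str]


_LINE_RE = re.compile(
    r"^(?P<level>\d+)"            # level number
    r"(?: (?P<xref>@[^@]+@))?"    # optional reference ID, e.g. @I1@
    r" (?P<tag>\w+)"              # tag name
    r"(?: (?P<value>.*))?$"       # optional value: the rest of the line
)


def split_line(text):
    """Split one GEDCOM line into a Line tuple, or raise ValueError."""
    match = _LINE_RE.match(text)
    if match is None:
        raise ValueError(f"not a GEDCOM line: {text!r}")
    return Line(int(match["level"]), match["xref"], match["tag"], match["value"])


sample = ["0 @I1@ INDI", "1 NAME John /Smith/", "1 BIRT", "2 DATE 1 JAN 1900", "1 FAMC @F1@"]
for text in sample:
    print(split_line(text))
```

In the sample, the BIRT line at level 1 is followed by a DATE line at level 2, which is the nesting of subordinate records described above.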

Record class defines an interface that makes it easier to navigate this complex hierarchy of subordinate and referenced records:

  • sub_records attribute contains the list of all immediate subordinate records of this record.

  • sub_tag method finds a subordinate record given its tag. It can search recursively if the tag name contains multiple levels separated by slashes, and it can navigate through pointer records transparently if the follow argument is True.

  • sub_tag_value is a convenience method that finds a subordinate record (via sub_tag call) but returns value of the record instead of record itself. This simplifies handling of missing tags.

  • sub_tags returns the list of immediate subordinate records matching any of the given tag names (no recursion). It is useful when multiple sub-records with the same tag can exist.

There are a few sub-classes of the Record class providing additional methods or facilities for specific tag types.

In general it is impossible to define what constitutes the value or identity of a GEDCOM record, so comparison of records does not make sense. Similarly, the hashing operation cannot be used on Record instances, and the class is explicitly marked as non-hashable.
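In Python, a class is marked as non-hashable by setting __hash__ to None. The sketch below shows the practical effect on a hypothetical stand-in class. It is not the real Record class, and whether ged4py uses exactly this mechanism is an assumption.

```python
class Unhashable:
    """Hypothetical stand-in for a record class that refuses hashing."""

    # Setting __hash__ to None makes hash(instance) raise TypeError.
    __hash__ = None


def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


record = Unhashable()
print(is_hashable(record))   # instances cannot be hashed
print(is_hashable("INDI"))   # ordinary strings can

# Consequently such objects cannot be set members or dictionary keys.
try:
    seen = {record}
except TypeError as exc:
    print("cannot store in a set:", exc)
```

Code that needs to track such objects in a set or dictionary can key on something stable instead, for example the reference ID or the file offset.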

Client code usually does not need to create instances of this class directly, make_record() should be used instead. If you create an instance of this class (or its subclass) then you are responsible for filling its attributes.

Attributes
level: int

Record level number.

xref_id: str

Record reference ID, possibly empty.

tag: str

Tag name.

value: object

Record value, possibly None. For many record types the value is a string or None; some subclasses define a different type of record value.

sub_records: list [ Record ]

List of subordinate records, possibly empty.

offset: int

Record location in a file.

dialect: Dialect

GEDCOM source dialect, one of the Dialect enums.

Methods

freeze()

Method called by parser when updates to this record finish.

sub_tag(path[, follow])

Finds and returns sub-record with given tag name.

sub_tag_value(path[, follow])

Returns value of a direct sub-record.

sub_tags(*tags[, follow])

Returns list of immediate sub-records matching any tag name.

freeze()

Method called by parser when updates to this record finish.

Some sub-classes will override this method to implement conversion of record data to a different representation.

Returns
self: Record

Finalized record instance.

sub_tag(path, follow=True)

Finds and returns sub-record with given tag name.

Path can be a simple tag name, in which case the first direct sub-record of this record with the matching tag is returned. Path can also consist of several tags separated by slashes; in that case sub-records are searched recursively.

If follow is True then pointer records are resolved and the pointed-to record is used instead of the pointer record. This also applies to all intermediate records in a path.

Parameters
path: str

One or more tag names separated by slashes.

follow: bool

If True then resolve pointers.

Returns
record: Record

Subordinate record or None if sub-record with a given tag does not exist.
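The path and pointer behaviour described for sub_tag can be illustrated with a small stand-in. MiniRecord and its target attribute are hypothetical and exist only for this sketch; the real Record and Pointer classes are populated by the parser and resolve pointers through it.

```python
class MiniRecord:
    """Hypothetical stand-in for Record, just enough to show path lookup."""

    def __init__(self, tag, value=None, sub_records=(), target=None):
        self.tag = tag
        self.value = value
        self.sub_records = list(sub_records)
        self.target = target  # set only on pointer-like records

    def sub_tag(self, path, follow=True):
        record = self
        for tag in path.split("/"):
            # Take the first immediate sub-record with a matching tag.
            record = next((r for r in record.sub_records if r.tag == tag), None)
            if record is None:
                return None
            # Replace a pointer with the record it points to, if requested.
            if follow and record.target is not None:
                record = record.target
        return record


family = MiniRecord("FAM", sub_records=[
    MiniRecord("MARR", sub_records=[MiniRecord("DATE", "1 JAN 1900")]),
])
person = MiniRecord("INDI", sub_records=[
    MiniRecord("BIRT", sub_records=[MiniRecord("DATE", "2 FEB 1925")]),
    MiniRecord("FAMC", "@F1@", target=family),
])

print(person.sub_tag("BIRT/DATE").value)        # nested lookup
print(person.sub_tag("FAMC/MARR/DATE").value)   # lookup through a pointer
print(person.sub_tag("FAMC", follow=False).value)
print(person.sub_tag("DEAT/DATE"))              # missing tag gives None
```

In this sketch the lookup stops at the first missing tag, so a long path is safe to use even when an intermediate record is absent.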

sub_tag_value(path, follow=True)

Returns value of a direct sub-record.

Works like sub_tag() but returns the value of the sub-record instead of the sub-record itself.

Parameters
path: str

One or more tag names separated by slashes.

follow: bool

If True then resolve pointers.

Returns
value: object

Subordinate record value or None if sub-record with a given tag does not exist.

sub_tags(*tags, follow=True)

Returns list of immediate sub-records matching any tag name.

Unlike the sub_tag method, this method does not support hierarchical paths. It resolves pointer records if the follow keyword argument is True (default).

Parameters
*tags: str

Names of the sub-record tags.

follow: bool, optional

If True then resolve pointers.

Returns
records: list [ Record ]

List of records, possibly empty.
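A minimal sketch of the matching rule, using plain (tag, value) tuples in place of records. The function below is a hypothetical stand-in written for this example, not the real method; it leaves out pointer resolution and keeps matches in their original order, which is an assumption about the real behaviour.

```python
def sub_tags(sub_records, *tags):
    """Return every immediate sub-record whose tag is one of `tags`.

    `sub_records` is a list of (tag, value) tuples standing in for records.
    """
    wanted = set(tags)
    return [record for record in sub_records if record[0] in wanted]


subs = [
    ("NAME", "Mary /Smith/"),
    ("SEX", "F"),
    ("NAME", "Mary /Jones/"),
    ("BIRT", None),
]

print(sub_tags(subs, "NAME"))          # both NAME records
print(sub_tags(subs, "SEX", "BIRT"))   # several tags in one call
print(sub_tags(subs, "BIRT/DATE"))     # a path is just an unknown tag here
```

The last call shows why hierarchical paths do not work with this kind of matching: the whole string is compared against tag names, so nothing matches.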

class ged4py.model.Pointer(parser)

Bases: ged4py.model.Record

Sub-class of Record representing a pointer to a record in a GEDCOM file.

This class wraps a GEDCOM pointer value and adds a ref property that retrieves the pointed-to record. Instances of this class are used in place of GEDCOM pointers in the objects created by the parser.

Parameters
parser: ged4py.parser.GedcomReader

Instance of parser class.

Attributes
value: str

Value of the GEDCOM pointer (e.g. “@I1234@”).

ref: Record

Referenced GEDCOM record.
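The relationship between value and ref can be pictured as a lazy lookup through the parser. In the sketch below FakeReader, its lookup method, and MiniPointer are all hypothetical, and caching the resolved record after the first access is an assumption rather than documented behaviour.

```python
class FakeReader:
    """Hypothetical stand-in for the parser: maps reference IDs to records."""

    def __init__(self, records):
        self._records = records
        self.lookups = 0  # counts how often a record was fetched

    def lookup(self, xref_id):  # hypothetical method name
        self.lookups += 1
        return self._records.get(xref_id)


class MiniPointer:
    """Hypothetical stand-in for Pointer."""

    def __init__(self, parser, value):
        self.parser = parser
        self.value = value      # pointer value such as "@I1234@"
        self._ref = None        # resolved record, filled in on demand

    @property
    def ref(self):
        # Resolve on first access and reuse the result afterwards.
        if self._ref is None:
            self._ref = self.parser.lookup(self.value)
        return self._ref


reader = FakeReader({"@I1234@": {"tag": "INDI", "name": "John /Smith/"}})
pointer = MiniPointer(reader, "@I1234@")

print(reader.lookups)        # nothing resolved yet
print(pointer.ref["tag"])    # first access triggers the lookup
print(pointer.ref["name"])   # second access reuses the cached record
print(reader.lookups)
```

Deferring the lookup keeps record construction cheap, since the pointed-to record is only fetched when client code asks for it.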

Methods

freeze()

Method called by parser when updates to this record finish.

sub_tag(path[, follow])

Finds and returns sub-record with given tag name.

sub_tag_value(path[, follow])

Returns value of a direct sub-record.

sub_tags(*tags[, follow])

Returns list of immediate sub-records matching any tag name.

property ref

class ged4py.model.NameRec

Bases: ged4py.model.Record

Sub-class of Record representing the NAME record.

This class adds a property for determining the type of the name. It also redefines the type of the value attribute: its type is tuple. The value tuple can contain 3 or 4 elements; if there are 4 elements, the last element is a maiden name. The second element of the tuple is the surname; the first and third elements are pieces of the given name (this is determined entirely by how the name is represented in the GEDCOM file). Any of the elements can be an empty string. If the NAME record value is empty in the GEDCOM file, all three fields of the tuple are empty strings. A few examples:

("John", "Smith", "")
("Mary Joan", "Smith", "", "Ivanova")    # maiden name
("", "Ivanov", "Ivan Ivanovich")
("John", "Smith", "Jr.")
("", "", "")                             # empty NAME record
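The three-element tuples above can be related to raw NAME values, in which GEDCOM encloses the surname in slashes. The sketch below uses a hypothetical split_name helper to show that mapping. It does not attempt to reproduce the four-element form with a maiden name, and it is not the conversion that ged4py itself performs.

```python
def split_name(value):
    """Split a raw NAME value into (before_surname, surname, after_surname)."""
    if not value:
        # Empty or missing NAME value: all components are empty strings.
        return ("", "", "")
    parts = value.split("/")
    before = parts[0].strip()
    surname = parts[1].strip() if len(parts) > 1 else ""
    # Anything after the closing slash belongs to the third element.
    after = "/".join(parts[2:]).strip() if len(parts) > 2 else ""
    return (before, surname, after)


for raw in ["John /Smith/", "/Ivanov/ Ivan Ivanovich", "John /Smith/ Jr.", ""]:
    print(repr(raw), "->", split_name(raw))
```

The first and third elements simply reflect where the text sits relative to the slashes, which is why "Ivan Ivanovich" ends up in the third position in the second example.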

Client code usually does not need to create instances of this class directly, make_record() should be used instead.

Attributes
type

Name type as defined in TYPE record.

Methods

freeze()

Method called by parser when updates to this record finish.

sub_tag(path[, follow])

Finds and returns sub-record with given tag name.

sub_tag_value(path[, follow])

Returns value of a direct sub-record.

sub_tags(*tags[, follow])

Returns list of immediate sub-records matching any tag name.

freeze()

Method called by parser when updates to this record finish.

Returns
self: NameRec

Finalized record instance.

property type

Name type as defined in TYPE record. None if TYPE record is missing, otherwise string, e.g. “aka”, “birth”, “immigrant”, “maiden”, “married” (or anything else).

class ged4py.model.Name(names, dialect)

Bases: object

Class representing “summary” of person names.

Parameters
names: list [ NameRec ]

List of NAME records (NameRec instances).

dialect: Dialect

One of Dialect enums.

Notes

A person in GEDCOM can have multiple NAME records, e.g. an “aka” name, a “maiden” name, etc. This class provides a simple interface for selecting the “best” name from all existing names. The algorithm for choosing the best name is:

  • If there are no NAME records then it makes an empty name (with all empty components).

  • If there is only one NAME record then it is used for person name.

  • If there are multiple NAME records then the first record without a TYPE sub-record is used, or if all records have TYPE sub-records then the first NAME record is used.
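The selection rules above can be written as a short function. In this sketch each NAME record is reduced to a (value_tuple, name_type) pair, where name_type is None when the record has no TYPE sub-record. Both the pair representation and pick_best_name are hypothetical and only illustrate the rules.

```python
def pick_best_name(name_records):
    """Pick the "best" name from (value_tuple, name_type) pairs in file order."""
    if not name_records:
        # No NAME records at all: an empty name.
        return ("", "", "")
    for value, name_type in name_records:
        # The first record without a TYPE wins.
        if name_type is None:
            return value
    # Every record has a TYPE (or there is a single typed record).
    return name_records[0][0]


names = [
    (("Mary", "Jones", ""), "married"),
    (("Mary", "Smith", ""), None),
    (("Masha", "Smith", ""), "aka"),
]
print(pick_best_name(names))
print(pick_best_name(names[:1]))
print(pick_best_name([]))
```

The single-record rule needs no special case here: a lone record is returned either by the loop (no TYPE) or by the fallback (with TYPE).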

Attributes
first

First name is the first part of a given name (drops middle name)

given

Given name could include both first and middle name (str)

maiden

Maiden last name, can be None (str)

surname

Person surname (str)

Methods

format()

Format name for output.

order(order)

Return name order key.

property surname

Person surname (str)

property given

Given name could include both first and middle name (str)

property first

First name is the first part of a given name (drops middle name)

property maiden

Maiden last name, can be None (str)

order(order)

Return name order key.

Returns a tuple of two strings that can be compared to another such tuple obtained from a different name. Note that if you want locale-dependent ordering, you need to compare the strings using a locale-aware method (e.g. locale.strxfrm).

Parameters
order: NameOrder

One of the NameOrder enums.

Returns
order: tuple [ str ]

Tuple of two strings.
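The note about locale-aware comparison can be sketched with the standard library. The tuples below are made-up order keys shaped like the two-string tuples this method returns; no ged4py objects are involved, and locale_key is a hypothetical helper.

```python
import locale

# Made-up order keys: two strings per name, as returned by order().
keys = [("Smith", "Mary"), ("Ivanov", "Ivan"), ("Smith", "John")]


def locale_key(order_key):
    """Transform each string so that tuple comparison follows LC_COLLATE."""
    return tuple(locale.strxfrm(part) for part in order_key)


# Plain tuple comparison orders strings by code point.
plain = sorted(keys)

# Locale-aware ordering uses whatever LC_COLLATE is active in the process.
# Call locale.setlocale(locale.LC_COLLATE, "") first to use the user's locale.
localized = sorted(keys, key=locale_key)

print(plain)
print(localized)
```

For plain ASCII names the two orderings normally agree; they differ once accented or non-Latin characters are involved and a matching collation locale is active.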

format()

Format name for output.

There is no single correct way to represent a name; values returned from this method are only useful in a limited context, e.g. for logging.

Returns
name: str

Formatted name representation.

class ged4py.model.Date

Bases: ged4py.model.Record

Sub-class of Record representing the DATE record.

After the freeze() method is called by the parser, the value attribute contains an instance of the ged4py.date.DateValue class.

Methods

freeze()

Method called by parser when updates to this record finish.

sub_tag(path[, follow])

Finds and returns sub-record with given tag name.

sub_tag_value(path[, follow])

Returns value of a direct sub-record.

sub_tags(*tags[, follow])

Returns list of immediate sub-records matching any tag name.

freeze()

Method called by parser when updates to this record finish.

Returns
self: Date

Finalized record instance.

class ged4py.model.Individual

Bases: ged4py.model.Record

Sub-class of Record representing the INDI record.

An INDI record represents a single person in GEDCOM. This class defines a few properties that are useful shortcuts for accessing person information, such as navigation to parent records, name, etc.

Client code usually does not need to create instances of this class directly, make_record() should be used instead.

Attributes
father

Father of this individual (Individual or None)

mother

Mother of this individual (Individual or None)

name

Person name (Name).

sex

Person sex, one of “M”, “F”, or “U” for unknown (str).

Methods

freeze()

Method called by parser when updates to this record finish.

sub_tag(path[, follow])

Finds and returns sub-record with given tag name.

sub_tag_value(path[, follow])

Returns value of a direct sub-record.

sub_tags(*tags[, follow])

Returns list of immediate sub-records matching any tag name.

property name

Person name (Name).

property sex

Person sex, one of “M”, “F”, or “U” for unknown (str).

property mother

Mother of this individual (Individual or None)

property father

Father of this individual (Individual or None)
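This page does not say how the parent properties are resolved. In GEDCOM an individual links to the family it was born into through a FAMC pointer, and the FAM record points to the parents through HUSB and WIFE pointers. The sketch below follows those links over plain dictionaries. The data layout and the parent helper are hypothetical, and it is an assumption that the mother and father properties take this route.

```python
# Hypothetical miniature data set: reference ID -> record, each a plain dict.
sample = {
    "@I1@": {"tag": "INDI", "name": "Child", "FAMC": "@F1@"},
    "@I2@": {"tag": "INDI", "name": "Father"},
    "@I3@": {"tag": "INDI", "name": "Mother"},
    "@F1@": {"tag": "FAM", "HUSB": "@I2@", "WIFE": "@I3@"},
}


def parent(records, person_id, role):
    """Follow FAMC and then `role` ("HUSB" or "WIFE"); None if a link is missing."""
    family_id = records[person_id].get("FAMC")
    if family_id is None:
        return None
    parent_id = records.get(family_id, {}).get(role)
    return records.get(parent_id)


father = parent(sample, "@I1@", "HUSB")
mother = parent(sample, "@I1@", "WIFE")
print(father["name"], mother["name"])
print(parent(sample, "@I2@", "HUSB"))   # no FAMC link, so None
```

With the Record API described earlier, the same two-step navigation would presumably be a single path lookup through a pointer, which is what the follow argument of sub_tag is for.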