GEDCOM
After several partially successful attempts to write a GEDCOM C++ library, I've run into enough design problems to suggest that my approach is flawed. Taking what I thought was the obvious direction (top-down), I created top-level objects for the major record types found in a GEDCOM file. Starting at the highest level:
- 0 «HEADER»
- 0 «SUBMISSION_RECORD»
- 0 «RECORD»
- 0 TRLR
Each «RECORD» is one of:
- n «FAM_RECORD»
- n «INDIVIDUAL_RECORD»
- n «MULTIMEDIA_RECORD»
- n «NOTE_RECORD»
- n «REPOSITORY_RECORD»
- n «SOURCE_RECORD»
- n «SUBMITTER_RECORD»
A «FAM_RECORD», for example, expands to:
- n @<XREF:FAM>@ FAM
- +1 RESN <RESTRICTION_NOTICE>
- +1 «FAMILY_EVENT_STRUCTURE»
- +1 HUSB @<XREF:INDI>@
- +1 WIFE @<XREF:INDI>@
- +1 CHIL @<XREF:INDI>@
- +1 NCHI <COUNT_OF_CHILDREN>
- +1 SUBM @<XREF:SUBM>@
- +1 «LDS_SPOUSE_SEALING»
- +1 REFN <USER_REFERENCE_NUMBER>
- +2 TYPE <USER_REFERENCE_TYPE>
- +1 RIN <AUTOMATED_RECORD_ID>
- +1 «CHANGE_DATE»
- +1 «NOTE_STRUCTURE»
- +1 «SOURCE_CITATION»
- +1 «MULTIMEDIA_LINK»
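To make the top-down direction concrete, here is a rough sketch of the kind of class hierarchy it leads to. This is not the library's actual code: every type, member, and function name below (`GedcomFile`, `FamilyRecord`, `UserReference`, `family_mentions`, and so on) is hypothetical, and only a few lines of the family record are modelled.

```cpp
#include <string>
#include <vector>

// All names here are hypothetical; they only illustrate the shape of the design.

// +1 REFN <USER_REFERENCE_NUMBER> with its optional +2 TYPE line.
struct UserReference {
    std::string number;
    std::string type;  // empty when no TYPE line is present
};

// One class per record type, one member per line of the grammar.
struct FamilyRecord {
    std::string xref;                       // n @<XREF:FAM>@ FAM
    std::string restriction;                // +1 RESN
    std::string husband;                    // +1 HUSB @<XREF:INDI>@
    std::string wife;                       // +1 WIFE @<XREF:INDI>@
    std::vector<std::string> children;      // +1 CHIL @<XREF:INDI>@
    std::vector<UserReference> references;  // +1 REFN
    // Events, sealings, change dates, notes, citations and multimedia links
    // would each need a class of their own.
};

struct IndividualRecord {
    std::string xref;
    // ...and so on for every line an individual record can hold
};

// Top of the hierarchy: one container per record type.
struct GedcomFile {
    std::vector<FamilyRecord> families;
    std::vector<IndividualRecord> individuals;
    // plus header, submission record, notes, repositories, sources, submitters
};

// A query like this has to know which members of which class hold links.
bool family_mentions(const FamilyRecord& fam, const std::string& indi_xref) {
    if (fam.husband == indi_xref || fam.wife == indi_xref) return true;
    for (const std::string& child : fam.children) {
        if (child == indi_xref) return true;
    }
    return false;
}
```

Each structure in the grammar becomes its own type with its own members, so the number of classes grows with the size of the specification.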
This ramble is a thinking-on-paper exercise: a way to state the problem and, I hope, derive a solution.
That said, my current thinking (hopefully box-escaping) lies in reconsidering the nature of the Lineage-Linked format. In my initial rush to create a hierarchy of objects, I believe I missed the obvious. The clue lies in the word 'linked'. The entire form can (and hopefully should) be thought of as a linked list. In a kind of Lisp-ish format, the highest level can be thought of this way:
- (header)(link)
- (submission_record)(link)
- (record)(link)
- (trlr)
Essentially, the idea is to flatten the entire form into a single data type: a list. Each list has a type and data. The data is either terminal or a list of lists. If terminal, it is a simple string. If a list of lists, it is a simple collection of that same basic data type. (type)(link) all the way down.
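A minimal C++ sketch of that single data type might look like the following. The names `Node`, `add`, `find`, `count_nodes`, and `make_sample_family` are all hypothetical, as are the sample values. One deliberate assumption departs slightly from the "terminal or list of lists" wording: lines such as REFN carry both a value and subordinate lines, so this sketch lets a node hold a string and children at the same time.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical single data type: every GEDCOM line, from the header down to
// a +2 TYPE, becomes a Node.
struct Node {
    std::string type;            // the tag, e.g. "FAM", "HUSB", "TRLR"
    std::string text;            // the simple string; may be empty
    std::vector<Node> children;  // the "list of lists"; empty for a terminal

    bool is_terminal() const { return children.empty(); }

    // Append a child and return a reference to it so nesting can be built in place.
    Node& add(std::string child_type, std::string child_text = "") {
        children.push_back(Node{std::move(child_type), std::move(child_text), {}});
        return children.back();
    }

    // First direct child with the given type, or nullptr if there is none.
    const Node* find(const std::string& wanted) const {
        for (const Node& child : children) {
            if (child.type == wanted) return &child;
        }
        return nullptr;
    }
};

// Count a node plus everything below it; one recursive function covers
// every record type because they all share the same shape.
std::size_t count_nodes(const Node& node) {
    std::size_t total = 1;
    for (const Node& child : node.children) total += count_nodes(child);
    return total;
}

// Build a small family record with made-up identifiers.
Node make_sample_family() {
    Node fam{"FAM", "@F1@", {}};
    fam.add("HUSB", "@I1@");
    fam.add("WIFE", "@I2@");
    fam.add("CHIL", "@I3@");
    Node& refn = fam.add("REFN", "1234");
    refn.add("TYPE", "example");  // the +2 line nests under REFN
    return fam;
}
```

With this shape, a lookup such as `make_sample_family().find("REFN")` works the same way at every level, and nothing in the code needs to know ahead of time which tags exist.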
I'm going to wander off and play with this idea. I'll return to this whiteboard when I learn more about what I am thinking here…