20130502

GEDCOM musings and rambles (no rants)

GEDCOM


After several partially successful attempts to write a GEDCOM C++ library, I've slowly run into enough design problems to suggest my approach is flawed. Taking what I thought was the obvious direction (top-down), I created top-level objects for the major record types found in a GEDCOM file. Starting at the highest level:
  • 0 «Header»
  • 0 «Submission_Record»
  • 0 «Record»
  • 0 Trlr
Where Record expands to:
  • n «FAM_RECORD»
  • n «INDIVIDUAL_RECORD»
  • n «MULTIMEDIA_RECORD»
  • n «NOTE_RECORD»
  • n «REPOSITORY_RECORD»
  • n «SOURCE_RECORD»
  • n «SUBMITTER_RECORD»
And as an example the FAM_RECORD expands to:
  • n @<XREF:FAM>@ FAM
    • +1 RESN <RESTRICTION_NOTICE>
    • +1 «FAMILY_EVENT_STRUCTURE»
    • +1 HUSB @<XREF:INDI>@
    • +1 WIFE @<XREF:INDI>@
    • +1 CHIL @<XREF:INDI>@
    • +1 NCHI <COUNT_OF_CHILDREN>
    • +1 SUBM @<XREF:SUBM>@
    • +1 «LDS_SPOUSE_SEALING»
    • +1 REFN <USER_REFERENCE_NUMBER>
      • +2 TYPE <USER_REFERENCE_TYPE>
    • +1 RIN <AUTOMATED_RECORD_ID>
    • +1 «CHANGE_DATE»
    • +1 «NOTE_STRUCTURE»
    • +1 «SOURCE_CITATION»
    • +1 «MULTIMEDIA_LINK»
Were this all, it would have been a successful strategy. However, note the presence of «SOMETHING» references. These are why GEDCOM is referred to as a Lineage-Linked document form. Any item shown that way links to another sub-record, which may itself be a mix of primitives and higher-level forms. As an illustration of the problem, consider the NOTE_STRUCTURE and CHANGE_DATE links. Interestingly enough, each NOTE_STRUCTURE link contains a CHANGE_DATE link. Even more fun, each CHANGE_DATE link contains a NOTE_STRUCTURE link.
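The circular reference described above can be made concrete with a small sketch. The CONTAINS table is a hypothetical, heavily trimmed stand-in for the grammar: it records only the links named in this post, not the full GEDCOM specification. The helper name find_cycle is likewise just for illustration.

```python
# Hypothetical, trimmed "contains" table: only the links named in the post.
CONTAINS = {
    "FAM_RECORD": ["CHANGE_DATE", "NOTE_STRUCTURE"],
    "NOTE_STRUCTURE": ["CHANGE_DATE"],
    "CHANGE_DATE": ["NOTE_STRUCTURE"],
}


def find_cycle(start, contains, path=()):
    """Return the first chain of structures that loops back on itself, or None."""
    if start in path:
        # We have been here before: report the loop from its first visit.
        return path[path.index(start):] + (start,)
    for child in contains.get(start, []):
        cycle = find_cycle(child, contains, path + (start,))
        if cycle:
            return cycle
    return None


cycle = find_cycle("FAM_RECORD", CONTAINS)
```

Here cycle names the loop CHANGE_DATE, NOTE_STRUCTURE, CHANGE_DATE. A class-per-structure design has to let each of those two types hold the other, which is roughly where a strict top-down object hierarchy starts to fight back.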

This ramble is a kind of thinking on paper exercise to allow stating the problem and hopefully deriving a solution.

That said, my current thinking (hopefully box-escaping) lies in reconsidering the nature of the Lineage-Linked format. In my initial rush to create a hierarchy of objects I believe that I missed the obvious. The clue lies in the word 'linked'. The entire form can (and hopefully should) be thought of as a linked list. In a kind of Lispish format, the highest level can be thought of this way:


  • (header)(link)
  • (submission_record)(link)
  • (record)(link)
  • (trlr)
Essentially the idea is to flatten the entire form into a single data type: a list. Each list has a type and data. Data may be terminal or a list of lists. If terminal, it is a simple string. If a list of lists, it is a simple collection of the same basic data type. (type)(link) all the way down.
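A minimal sketch of that single data type, assuming a node is just a (type, data) pair. The sample values and the walk function are made up for illustration, and cross-reference ids such as @F1@ are left out to keep it short.

```python
# A node is a (type, data) pair; data is either a string (terminal)
# or a list of further (type, data) nodes.
def walk(node, level=0):
    """Yield one 'level type [value]' line per node, depth first."""
    node_type, data = node
    if isinstance(data, str):
        yield f"{level} {node_type} {data}"
    else:
        yield f"{level} {node_type}"
        for child in data:
            yield from walk(child, level + 1)


# Hypothetical sample values, for illustration only.
family = ("FAM", [
    ("HUSB", "@I1@"),
    ("WIFE", "@I2@"),
    ("CHAN", [
        ("DATE", "2 MAY 2013"),
        ("NOTE", "checked against parish register"),
    ]),
])

lines = list(walk(family))
```

One recursive function covers every record type, because there is only one shape to handle. The two-case shape does leave a gap: a line that carries both a value and sub-lines (REFN with its TYPE, for instance) fits neither case as written, so the idea would need a decision on where that value lives.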

I'm going to wander off and play with this idea—I'll return to this white board when I learn more about what I am thinking here…

20130423

A Leaf named Postscript

As in the language that is. I'll probably switch to writing about what I know best and that would most likely be programming. I certainly won't stop ranting, but that doesn't have anything to do with "know best".

This was a draft that I had forgotten about. At a guess it was written while I was doing a fair amount of Postscript programming. For reasons that are unclear to me at least, I've always liked Postscript and before that Forth. Reverse Polish Stack Based Languages (RPSBLs ?) are just cool in my book. More on this at some point—back to the real world ah-well!

At it again...

After a substantial absence it occurs to me that I should get back to it so to speak. So I will. Soon. Really. Trust me™

20080512

New year, new leaf...

While I'm not about to stop ranting (and raving for that matter) I am going to turn over a new leaf and start posting things that I'm doing that may be of interest to an audience greater than one.

The first is Lisp. Unlike many lemmings rushing to the new-to-them drumbeat of nearly the oldest language in computer-dom, I got hooked many years ago, even before 16-bit computers and IBM. Before Microsoft even!! Long ago and in an S-exp far away, I worked for a small company that had recently moved from selling smoke detectors to selling these new-fangled microcomputers. When I wasn't writing the inventory portion of a general ledger package (you know, the one that wasn't boilerplate) I was investigating a product put out by a company in Hawaii called Soft Warehouse. The product was Mu-Lisp, which came with another product called Reduce. While I bought it to acquire Reduce, I was caught up by Mu-Lisp. Here was a small, easy to know, reasonably fast computer language at a time when there weren't all that many for 8-bit machines. In addition to its virtues listed thus far, the best of all was that it was fun to program in. Who knew? I've always programmed because I like to... well, to be truthful, because I have to. I'm addicted to creativity and programming has one of the shortest needles around. When I wasn't investigating symbolic algebra with Reduce, I was writing a rational number package for Mu-Lisp (so if you just finished your package for CL, ha ha, I beat you by 30 years!). Ah, those were the days™.

At the moment, in order to catch up after an absence of 5 years or so, I'm doing a comparative exploration of 3 Lisps: Scheme, CLISP and newLISP. Not to save time, but only because I know the problem space well, I'm porting my Perl code that deals with the care and feeding of PGN files. PGN stands for Portable Game Notation. This is a standard widely used in the chess world to record the scores of games played. Without going into detail (later perhaps), the interesting thing about this task is that in order to parse chess games, you have to write the code to actually play the game. This makes it an ideal source for exploration. Each time I do this in another language or dialect I learn something new about the problem. So far I've done it in C, C++, Basic, Ruby, Python, Perl, newLISP, CLISP and Scheme. If you think you know the best way to do something, try doing it from a different angle. You will find new ideas each time, both in the comparison and in the new ways made available by a different language.
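To see why parsing a game means playing it, here is a small sketch in Python that splits a single PGN move into its written parts. The pattern is a deliberate simplification (an assumption for this example): it ignores castling, promotion, and check or mate suffixes. The names SAN and parse_san are made up here.

```python
import re

# Simplified pattern for one PGN move in standard algebraic notation.
# Assumption: castling, promotion, and check/mate suffixes are ignored.
SAN = re.compile(r"^([KQRBN]?)([a-h]?)([1-8]?)(x?)([a-h][1-8])$")


def parse_san(move):
    """Split a move into its written parts; origin parts may be missing."""
    m = SAN.match(move)
    if m is None:
        return None
    piece, from_file, from_rank, capture, target = m.groups()
    return {
        "piece": piece or "P",           # no letter means a pawn move
        "from_file": from_file or None,  # present only when disambiguating
        "from_rank": from_rank or None,
        "capture": capture == "x",
        "target": target,
    }


knight_move = parse_san("Nf3")
```

For "Nf3" the result says a knight lands on f3, and both origin fields come back as None. Which knight, and from which square, is simply not in the text; it has to be worked out from the current position, and that means tracking the board and generating legal moves.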

My plan for now is to post comparative code over the next 'n' rants, not to prove a point, but just to demonstrate what it takes to do the same thing in these 3 dialects. Should be fun...

ℑ♥λ

20070106

First Rant part the second

Obviously when I bitch about something like deposit lag there is a heavy suggestion that it is a 'problem' that I'm suffering or have recently suffered. And so it is. We are now entering our 8th day of 'hold' (surely that should be 'withhold', yes?) with no action. Either end. They (the writers of the check) are not overdrawn or in danger of being overdrawn. Poor little electrons are just sitting on the side of the road. Probably waiting for their support crew. I'd guess the support folk are driving a broken-down Volvo with 'Breaking Away' stickers on it...

I go to seek justice later today. I will don my 'puzzled old man seeking enlightenment' costume rather than my 'wrath of God' cape and see what is what.

20070104

First Rant

Have you ever wondered just what in the hell the banks are doing with your money when you deposit a largish sum (> $4,000) and then the teller calmly announces that 'of course, this will be on hold you know. Just till it clears...' Next they hand you various pieces of paper that tell you the first batch of yay much money will be available in so many days. Heck of a deal don't you think? Must take quite a bit of time and effort to shove that currency into the pipe and all. Or maybe this explains what happened to the Pony Express; the railroads forced them to go to work for the banks instead. And for that matter, money is heavy so you can only load just so much on a horse at a time (ASPCA and all that).

Of course this is all complete and utter god-damn nonsense. How fast does the money move? Well, how fast do electrons run? How about the speed of light (c)? And even given that restricting them to a copper wire slows them way down to 1% of c, that is still as close to instant as we need to care about. Let's put it this way: if they can take it out of your wallet with a hit of the return key and a brief pause, it seems only fair that they should be able to put it in at the same or better speed.