Jth1 uses YAML syntax, which is both easy for a human to type and understandable by a program.
Example
Here is an example of a person expressed in jth1:
Typed by human
- nom: Vincent Perre
  sexe: M
  profession: Cultivateur
  domicile: Berrias, 07
  naissance:
    date: '1728-06-05'
    lieu: Berrias, 07
  baptême:
    date: '1728-06-07'
    lieu: Berrias, 07
  père: Pierre Perre
  mère: Gabrielle Thibon
  relations:
    - avec: Jeanne Bayle
      mariage:
        date: '1754-11-28'
        lieu: Berrias, 07
    - avec: Thérèse Monbel
      mariage:
        date: '1758-04-04'
        lieu: Berrias, 07
    - avec: Marie Coste
      mariage:
        date: '1762-10-26'
        lieu: Berrias, 07
  sources:
    - acte de décès de son fils Vincent Perre
Generated by program
- local-id: jean-rocard
  sex: M
  official-name: 'Jean ROCARD'
  birth:
    date: ~1606
  death:
    place: 'Vaux-sous-Aubigny,52190,Haute-Marne,Champagne-Ardenne,FRANCE '
    date: '1681-09-07'
  profession: Vigneron
  relations:
    - with: jeanne-dadant
      children:
        - francois-rocard
        - jeanne-chinardet
        - jean-rocard-1
        - jeanne-rocard
        - nicolas-rocard
        - jacques-rocard
  sources:
    - 'http://gw.geneanet.org/jeanpierre16?lang=en;p=jean;n=rocard'
Operating rules
One YAML file is treated like a GEDCOM file, in the sense that it contains definitions of persons and relationships that must be coherent.
As with GEDCOM, coherence is only required within a given file: if you build a tree by aggregating several GEDCOM files, merging the trees and ensuring global coherence is the responsibility of the genealogy software; it is not part of the GEDCOM syntax.
Unique id of a person
This is the most important point: every person must be identified by a unique id.
In GEDCOM files, this is done with syntax like @I1@ INDI for persons, or FAMC @F2@ for relationships.
In jth1, the links are implicit; the user must be aware of this and is responsible for keeping the ids coherent.
Here is the mechanism:
- By default, the id of a person is a "slug" built from the person's name, followed by a hyphen and the birth year (if it is known). The "slug" of a string is another string in which all accents are removed, all letters are lower-cased, and non-alphanumeric characters are replaced by a hyphen. In the previous example, the id of Vincent Perre is therefore vincent-perre-1728.
- In case of duplicates (this happens in particular for homonyms with unknown birth year), the ids are disambiguated by appending -2, etc. to the id.
- This automatic construction can be overridden by the user, who can impose an id using a dedicated field (the generated example above carries a local-id field).
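The id construction described above can be sketched in Python as follows. This is only an illustration, not the actual jth1 implementation: the function names `slug` and `default_id` are hypothetical, and two details are assumptions, namely that a run of non-alphanumeric characters collapses into a single hyphen and that the first duplicate receives the suffix -2.

```python
import re
import unicodedata


def slug(text):
    """Return a slug: accents removed, lower-cased, other characters hyphenated."""
    # Decompose accented letters, then drop the combining accent marks.
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Assumption: a run of non-alphanumeric characters becomes one hyphen,
    # and hyphens at either end are trimmed.
    return re.sub(r"[^a-z0-9]+", "-", without_accents.lower()).strip("-")


def default_id(name, birth_year=None, taken=()):
    """Build the default id of a person, avoiding the ids listed in `taken`."""
    base = slug(name)
    if birth_year is not None:
        base = f"{base}-{birth_year}"
    candidate = base
    suffix = 2  # assumption: numbering of duplicates starts at 2
    while candidate in taken:
        candidate = f"{base}-{suffix}"
        suffix += 1
    return candidate


print(default_id("Vincent Perre", 1728))
print(default_id("Jean Rocard", taken={"jean-rocard"}))
```

Passing the set of ids already in use makes the duplicate rule explicit: the function keeps trying suffixes until it finds an id that is still free.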
Defining relationships
Relationships can be expressed using either person names or person ids.
In general, it is more convenient to use person names directly, but using the person id may be necessary when two persons would otherwise be confused (duplicates).
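To illustrate why an id becomes necessary for duplicates, here is a minimal Python sketch of how a reference could be resolved. The `resolve` function and the id-to-name dictionary are hypothetical and do not describe the real jth1 parser; the sketch assumes a reference is first tried as an id, then as an exact name.

```python
def resolve(reference, persons):
    """Return the id designated by `reference`, which is an id or a name.

    `persons` maps each person id to the person's name (hypothetical structure).
    """
    if reference in persons:
        return reference  # the reference is already an id
    matches = [pid for pid, name in persons.items() if name == reference]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise KeyError(f"unknown person: {reference!r}")
    # Several persons share this name: only an id can tell them apart.
    raise ValueError(f"ambiguous name {reference!r}, use one of {matches}")


persons = {
    "vincent-perre-1728": "Vincent Perre",
    "vincent-perre": "Vincent Perre",  # homonym with unknown birth year
    "jeanne-bayle": "Jeanne Bayle",
}
print(resolve("Jeanne Bayle", persons))
print(resolve("vincent-perre-1728", persons))
```

With this data, the name "Vincent Perre" alone raises a `ValueError`, which is the situation where the relationship has to be written with an id.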
Headers
Each YAML file can have a header, which can hold comments about the author, name the vocabulary file, and indicate the transformations to apply to place names for GeoNames matching.
Vocabulary file
As you can see in the example, some files are written using an English vocabulary, and other files use French. For example, the notion of birth can be expressed as birth: or naissance:
This is made possible by vocabulary files: the parser uses a single vocabulary (almost English), but users can associate a vocabulary file with their YAML files.
Here are extracts from the vocabulary file I used:
# general terms
type: type
id-local: local-id
par: by
rôle: role

### time
date: date
début: begin
fin: end
naissance: birth
mort: death
décès: death

### geo
lieu: place
pays: country
lieu-précis: precise-place

### names
nom: official-name
prénom: given-name
nom-famille: family-name
nom-courant: name
surnom: nickname
noms-alternatifs: alternative-names

### relations and events
relations: relations
mariage: mariage
# marriage with 2 r in correct english
# but jth1 syntax is not exact english
contrat-mariage: mariage-contract
divorce: divorce
union-libre: free-union
notaire: notary
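The effect of a vocabulary file can be pictured as a renaming of keys before parsing. The Python sketch below is an illustration under assumptions, not the real implementation: the function name `to_parser_vocabulary` is hypothetical, the data is shown as already-loaded dictionaries rather than read from YAML, and keys missing from the vocabulary are assumed to be left unchanged.

```python
def to_parser_vocabulary(node, vocabulary):
    """Recursively rename the keys of `node` using the `vocabulary` mapping."""
    if isinstance(node, dict):
        return {
            # Assumption: a key absent from the vocabulary is kept as is.
            vocabulary.get(key, key): to_parser_vocabulary(value, vocabulary)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [to_parser_vocabulary(item, vocabulary) for item in node]
    return node  # scalar values (names, dates, places) are never translated


# A few entries taken from the vocabulary extract above.
vocabulary = {
    "nom": "official-name",
    "naissance": "birth",
    "date": "date",
    "lieu": "place",
}
person = {
    "nom": "Vincent Perre",
    "naissance": {"date": "1728-06-05", "lieu": "Berrias, 07"},
}
print(to_parser_vocabulary(person, vocabulary))
```

Only keys are translated; values such as "Berrias, 07" pass through untouched, so a French-keyed file and an English-keyed file end up with the same structure.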
Splitting a yaml file into multiple files
One drawback of jth1 is that the information may become messy when the tree starts to contain hundreds of persons. So I introduced the notion of "split yaml file", which allows a YAML file to be split into multiple files.
The rule is simple: all the files must be located in the same directory, and one file (the "root file") must contain the header information; this file is the first file in alphabetical order.
For example, here are the files I wrote:
thierry-graff
├── 1.asc-thierry-graff.yml
├── MMM-asc-stella-fetonti-1882.yml
(...)
└── PPPM-asc-clemence-morin-1862.yml
The letters PPPM are a convention I use to organize my files; they are not part of the jth1 syntax.
The important point here is that the file 1.asc-thierry-graff is the root file.
This makes it easy to split the files as the tree grows.
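The root file rule can be sketched in a few lines of Python. This is an illustration only: the function name `root_and_parts` is hypothetical, and "alphabetical order" is assumed here to mean plain string ordering of the file names, which places names starting with a digit before names starting with a letter.

```python
def root_and_parts(filenames):
    """Split the file names of one directory into (root file, other files)."""
    ordered = sorted(filenames)  # assumption: plain string ordering
    if not ordered:
        raise ValueError("the directory contains no jth1 file")
    return ordered[0], ordered[1:]


names = [
    "PPPM-asc-clemence-morin-1862.yml",
    "1.asc-thierry-graff.yml",
    "MMM-asc-stella-fetonti-1882.yml",
]
root, parts = root_and_parts(names)
print(root)
```

Prefixing the root file with a digit, as in the listing above, is one way to guarantee that it sorts first whatever the other files are called.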
Comments and analysis
The development of this syntax was motivated by two main reasons:
- The ability to store my data in a textual format was a non-negotiable requirement for me. GEDCOM is stored in text files, but the syntax is not convenient if one tries to build a genealogy by directly editing GEDCOM files. The normal process is to use genealogy software, which generally stores the information in a relational database.
- The limitations of GEDCOM: things like "free union", "PACS" (a French status between free union and marriage) and several pieces of information that can be found in the acts are not part of the GEDCOM format; in particular, there is no way to express the notion of an approximate date.
Advantages and drawbacks
The main advantage of the jth1 syntax is that the information is stored directly in text files, which makes it possible to manage a genealogical tree like a software project, using tools like git to record modifications frequently and make backups on remote machines.
Another advantage is the ability to expand the vocabulary when a new type of information is found. This is quite easy at this stage of development.
The main drawback is that YAML files can become too big to be manageable when the tree grows. This is addressed by the notion of split yaml file, but one needs to be careful and rigorous.