The online help for Puck users

 

 

Opening window

 

The opening window is the first window that appears after launching Puck. It remains open in the background throughout a session and serves as a platform for switching between simultaneously opened corpus windows.

A dataset may be constructed in three possible ways:

Import the dataset from a file

Double-click the empty bar of the window to choose the file from your computer, or enter the file's address and name directly. Then click "import" to import the dataset. The "title page" of the dataset appears in the corpus window.

Note: If nothing has been entered or chosen, Puck opens the last dataset used by default. A dataset can thus be worked on in successive sessions without having to enter or choose its file name each time.

Puck can read files in text, excel, pajek, gedcom, tip, xml and prolog format (for more information on these file formats click here).

If the file contains kinship references (by identity numbers) to individuals who do not appear themselves, Puck creates the missing relatives as virtual individuals (with the sign "#" instead of a name). If the dataset is imported from a gedcom file, virtual individuals are likewise created to represent missing parents of full siblings (members of one "family" in gedcom format).
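
The creation of virtual individuals for dangling references can be pictured with a minimal Python sketch. It is not Puck's actual import code: the record layout (a dictionary keyed by identity number, with hypothetical `father` and `mother` fields and "M"/"F" gender codes) is an assumption made for illustration only.

```python
def add_virtual_individuals(records):
    """Return a copy of `records` with a "#" placeholder for every
    parent ID that is referenced but has no record of its own."""
    completed = {ident: dict(rec) for ident, rec in records.items()}
    for rec in records.values():
        for key, gender in (("father", "M"), ("mother", "F")):
            ref = rec.get(key)
            if ref and ref not in completed:
                # Only the gender can be inferred from the parental role.
                completed[ref] = {"name": "#", "gender": gender}
    return completed


# Hypothetical input: individual 1 refers to father 2, who has no record.
records = {
    1: {"name": "Pierre Dupont", "gender": "M", "father": 2, "mother": 3},
    3: {"name": "Marie Durand", "gender": "F"},
}
completed = add_virtual_individuals(records)
print(completed[2])  # the placeholder created for the missing father
```

Existing individuals are left untouched; only the missing relative is added, with "#" in place of a name.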

Enter a new dataset

Click the "new" button to access the corpus window, where you can specify the name of the dataset, its file address, and basic information (author, region, period, etc.) by entering them directly into the "title page" of your new dataset. Then click the "start" button of the corpus window to open the data window and start entering your data.

Simulate a random dataset

Click the "random" button to open the simulation parameter window, where you can specify the parameters of your random simulation. Then click the "ok" button of the simulation parameter window to create the random dataset. The "title page" of the random dataset then appears in the corpus window.

Convert a dataset to another format

Click on the button "convert" to open the conversion options window where you can specify the file formats to which you want to convert your dataset.

Then click the "ok" button of the conversion options window. Puck creates the files in all selected formats at once and places them in the same directory as the original file.

Puck can convert files from and to text, excel, pajek, gedcom, tip, xml and prolog format (for more information on these file formats click here).

 

 

Window components

Buttons:

Checkboxes:

  • Details - must be activated if all information in the file is to be imported (if not activated, only identity number, name, gender and basic kinship relations - f, m, sp, ch - are imported)

Clickable labels (activate by double click):

  • Info - opens the info window (information on Puck)

  • Options - opens the general options window

  • Language - opens a dropdown menu to choose the language used by Puck

  • Help - opens the Online Help (requires an internet connection)

Other labels:

  • Counter of currently open corpuses

 

Corpus window

 

The corpus window appears when a dataset (empty, random, or imported from a file) has been opened by Puck (from the opening window). It serves as a sort of "front page" for the "kinship book", whose pages are shown in the data window.

Note: To access the corpus window from "within" the "kinship book" (i.e. from the data window), go to the first page (navigation button "<<") and then click the navigation button "<".

Naming and describing a dataset

The front page contains the name of the dataset, as well as other information that may be stored in the file in lines preceded by an asterisk (*).

If a new corpus is opened (by clicking the button "New" in the opening window), it gets the default name "New Corpus" and shows three lines for information on region and period, source and author. The name and any of these lines can be changed by double-clicking on the text and entering the new text in the bar that appears. A new line can be added by double-clicking on the space below the last line.

To change the name of the dataset, there are thus two possibilities: either double-click on the name and change it directly (the file name changes accordingly), or enter a new filename (the name of the dataset changes accordingly).

Saving a dataset

The bar at the bottom of the corpus window contains the directory and name of the file in which the corpus will be saved. This default file is always in tip format. To save the dataset in another format, use the export function. The default file name always corresponds to the name of the dataset. If the file name is changed, the name of the dataset changes accordingly, and vice versa.

To save a dataset, click on the start button. Saving a corpus is almost the same as exporting it, but

Note: The Save function should be used while a dataset is under construction and still being extended with new entries. When working with datasets from external files, prefer the Export function so as not to alter the original file.

Accessing the dataset

Click the start button to access the first "inner" page of the dataset "book".

 

Window components

Buttons:

Fields:

  • File name field - to enter the name of the file under which the corpus will be saved (in .tip format)

Clickable labels (modify by double click):

  • Corpus information - displays general information on the corpus (such as the corpus name)

 

 

Data window

 

The data window, which is opened by clicking the start button in the corpus window, contains all data concerning a given individual of the corpus. It serves to enter data, to navigate through the corpus along kinship paths, to partition the corpus, and to explore the kinship environment of a given individual.

Click here for information and hints on how to establish a kinship dataset and on the different file formats for it.

Puck organizes genealogical data like a kinship "book", where each individual is represented by a separate "page". Every page contains seven "sections". The first one contains basic information on the individual's identity, the following three (parents, spouses, children) contain information on the individual's immediate kinship ties, and the last three (property, cv event, note) contain supplementary information on the individual's attributes. This supplementary information, which can be used to partition the corpus, is only loaded from the file if the "details" checkbox has been activated in the opening window.

Every section except the first one can be "folded" and "unfolded" by clicking on the headline bar. The appearance at the beginning of a session can be set by activating the corresponding options in the General options window. The first (identity) section always remains open. Clicking on the headline bar of the first section is a quick way of getting from one page to another (see below, navigating through a corpus).

The seven sections of the data window:

The identity section contains

The three kinship sections (one for parents, one for spouses, one for children) contain information on the individual's immediate (first degree) relatives. The headline bar above a kinship section also indicates the number of the respective relatives. Only the relatives' identity numbers can be entered directly. All other information is generated by Puck from the relative's gender and name as indicated on the relative's individual page. In particular, the kinship section displays:

The three attribute sections contain supplementary information on the individual, which can also be used to partition the corpus. All these attributes are exogenous properties.

Puck distinguishes three different categories of attributes and groups them in separate sections. All attributes are characterized by a label (for which standard individual property codes should be used) and a value. For instance, a property with the label "OCCU" and the value "barber" means that the individual's occupation is that of a barber.

For a list of standard property codes and guidelines for personalized property codes click here.

Warning: even an ego's property with specification of alter and automatic attachment of a reciprocal property to alter with specification of ego still remains a combination of individual properties and is not a relation property! This is important when these properties are used to partition the corpus or restrict a relational or matrimonial census to a subcorpus. For instance, search results for matrimonial rings among individuals whose marriages lie within a certain period may well contain marriages outside that period, if both partners have been married before or after.

The last number of a date is automatically interpreted as the year (important for partitioning). This must be taken into account when specifying events as having happened before, after, or around a certain year.
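
The "last number is the year" rule can be illustrated with a short Python sketch. The function name `year_of` and the sample date notations are hypothetical; the sketch only mirrors the rule as stated here and is not taken from Puck itself.

```python
import re


def year_of(date_text):
    """Return the last number found in a date string, or None."""
    numbers = re.findall(r"\d+", date_text)
    return int(numbers[-1]) if numbers else None


print(year_of("12.03.1874"))   # day.month.year: the year comes last
print(year_of("1874-03-12"))   # year-first notation: the day is taken instead
```

Under this reading, a date written with the year first would be interpreted wrongly, which is why the year should always be the last number of a date entry.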

Attention: one and the same label cannot be used at the same time for a simple property and for an event property. For example, the code MARR cannot be used at the same time to indicate if a person is married (simple property) and when, where and to whom she is married (event property).

All attribute fields (with the exception of notes) contain drop-down menus which allow for choosing among existing label, value, place, and date data and existing individuals (for alter). The drop-down menu for alter contains identity numbers and names of all individuals in the corpus, just as for kinship relation entry.

Virtual individuals

Virtual individuals are individuals for whom no information is available except for their kinship relations (and perhaps their gender, if they are parents or spouses of existing individuals). They are identified by having the sign "#" as a name.

Virtual individuals can be directly entered, or may be automatically created during importation (click here for details).

Virtual individuals are ignored in exploring and exporting a corpus if the option "Ignore virtual individuals" is activated in the General options window.

Note: Virtual individuals are systematically ignored when exporting to Tip format by the "Save" function.

Virtual individuals can be eliminated altogether by activating the option "Eliminate virtual individual" in the Transform options window and reducing the dataset to its "real" part.

Doubles

Doubles are individuals who are identical with another individual. The individual with the lower identity number is considered as the original, the one with the higher id number as the double. Doubles can be marked by substituting the original's identity number for their name.

Doubles can be eliminated from the dataset by the eliminate doubles function.
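
As an illustration of the marking convention, the following Python sketch detects individuals whose name is the identity number of a lower-numbered individual. The function and the sample data are hypothetical, and the sketch does not reproduce Puck's own eliminate doubles function.

```python
def find_doubles(names):
    """Map each double's ID to the ID of its original.

    `names` maps identity numbers to names; a double is assumed to
    carry the original's (lower) identity number in place of a name.
    """
    doubles = {}
    for ident, name in names.items():
        text = name.strip()
        if text.isdigit() and int(text) in names and int(text) < ident:
            doubles[ident] = int(text)
    return doubles


# Hypothetical corpus: individual 12 is marked as a double of individual 5.
names = {5: "Pierre Dupont", 12: "5", 13: "Jean Durand"}
doubles = find_doubles(names)
print(doubles)
```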

Data deletion

To remove an individual or a kinship relation from the dataset, delete the individual's or relative's number in the respective ID field (or alternatively enter 0, e.g. via the drop-down menu), and press enter.

To remove an individual's attribute, delete the attribute's label in the property or cv event field, and press enter.

Other fields and functions of the data window:

In addition to data entry, inspection of individuals and navigation through the corpus, the data window can also be used to search for individuals, explore the kinship environment of individuals and partition the corpus according to exogenous or endogenous criteria.

 

Any individual property (see individual property codes for details) can be used to split the corpus into subcorpuses. A subcorpus (whose title indicates the parent corpus as well as the property label and value) has the appearance of an autonomous corpus (with its own corpus window and all dependent windows), and every operation on a corpus can also be performed on a subcorpus. The important difference is that a subcorpus remains linked to the parent corpus, and all relation and ring search processes, while limiting results to individuals in the subcorpus (see the count window for details), always run through the total corpus.

Warning: a subcorpus which is saved and re-imported as a normal corpus loses this important subcorpus property. All links to individuals outside the subcorpus are cut, and external individuals can no longer act as intermediaries for chains between members of the subcorpus!

To partition a corpus, enter the property code in the rightmost field in the last but one line (next to the button with the asterisk). This property code field contains a drop-down menu listing simple endogenous and exogenous property labels (normally, the first code displayed is ALL, a code used to remove partitions and to restore corpus unity). Once "enter" has been pressed, the corpus is reduced to the subcorpus (the former corpus disappears), and the title of the subcorpus (the corpus cluster) is displayed in the field below the property code field. This cluster field contains a drop-down menu listing all clusters of the partition. Choosing another cluster lets you jump to another subcorpus. To restore the original corpus, enter "ALL".

Multiple partitions: A corpus can be partitioned more than once. Successive application of different properties as partition criteria makes it possible to refine the partitioning. It amounts to combining properties by logical "AND".

Navigation along kinship ties (by clicking on the names of relatives) is possible across subcorpus borders. Jumping to an individual in another subcorpus automatically switches to this subcorpus. However, this switching is only possible between clusters of the same immediate super-corpus. This is important in the case of multiple partitioning. For instance, if a corpus has been partitioned first according to birth place and then according to gender, it is possible to switch from female Parisians to male Parisians, but not from female Parisians to female Londoners.
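
That successive partitioning amounts to a logical "AND" can be checked with a small Python sketch. The property labels `place` and `gender` and the three sample individuals are hypothetical; the sketch shows the set logic only, not Puck's internal data structures.

```python
def partition(individuals, label):
    """Group individuals into clusters by the value of one property."""
    clusters = {}
    for ident, props in individuals.items():
        clusters.setdefault(props.get(label), {})[ident] = props
    return clusters


# Hypothetical individuals with a birth place and a gender.
people = {
    1: {"place": "Paris", "gender": "F"},
    2: {"place": "Paris", "gender": "M"},
    3: {"place": "London", "gender": "F"},
}

# Partition by birth place first, then partition the "Paris" cluster by gender.
by_place = partition(people, "place")
female_parisians = partition(by_place["Paris"], "gender")["F"]

# The same set, written as a single filter with logical AND.
direct = {
    ident for ident, p in people.items()
    if p["place"] == "Paris" and p["gender"] == "F"
}
print(set(female_parisians) == direct)
```

In this picture, male and female Parisians are sibling clusters of the same "Paris" super-corpus, whereas female Londoners belong to a different one.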

Navigation methods

In order to run through a corpus by passing from one individual to another, several methods are possible:

Note: the method by headline clicks is particularly convenient for data entry. It lets you enter all relatives of an individual (who may have much higher numbers) and then pass to the next number (using the > button would jump directly to the closest relative). By contrast, the method by navigation buttons is more convenient for reading a corpus, as it lets you skip "empty" pages.

Window components

Editable fields with drop-down menus:

Editable fields without drop-down menus:

Editable fields with drop-down menus and particular functions:

Non-editable fields (serve only for information, not to enter any data)

  1. Name (in the parent, spouse and children sections) - the full name of the relative

  2. Kinship type (in the parent, spouse and children sections) - the kinship type of the relative (according to gender and section)

Navigation buttons:

Note: if the current page is already the first one, this button leads back to the "front page" (corpus window). This is actually the only way to access the corpus window after opening the corpus.

Note: if the current page is already the last one and the corpus is not a subcorpus, this button opens a page for a new individual.

Note: entering new individuals is only possible for the dataset as a whole - this function is blocked if the corpus is a subcorpus produced by partitioning of some larger corpus!

Remaining buttons from the corpus window

 

Export window

 

The Export window (which can be accessed from the corpus window or the data window) serves to export a corpus in various file formats.

Window components

Buttons:

  • Export - exports the corpus

Fields:

Drop-down menus:

  • original - exports names as they appear in the corpus (e.g. "Pierre Dupont")

  • anonymized - replaces names by identity numbers preceded by a letter indicating gender (e.g. "H 5")

Note: In order to be deposited in the open online archive KinSources, a genealogical dataset has to be anonymized!

  • numbered - adds the identity number between parentheses to the original name (e.g. "Pierre Dupont (5)")

Note: Numbered names are convenient if the corpus is exported in pajek format. Pajek files require renumbering of individuals in order to ensure continuous vertex numbers, so numbered names serve to keep the original numbers. If a pajek file with numbered names is imported into Puck, numbers between parentheses are automatically converted back into identity numbers.
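
The three name modes, and the recovery of identity numbers from numbered names, can be sketched in Python as follows. The function names and the regular expression are assumptions made for illustration; only the example outputs ("Pierre Dupont", "H 5", "Pierre Dupont (5)") come from the descriptions above.

```python
import re


def export_name(name, ident, gender, mode):
    """Format a name according to one of the three export modes."""
    if mode == "original":
        return name
    if mode == "anonymized":
        return f"{gender} {ident}"
    if mode == "numbered":
        return f"{name} ({ident})"
    raise ValueError(f"unknown mode: {mode}")


def split_numbered(label):
    """Recover (name, identity number) from a numbered name."""
    match = re.fullmatch(r"(.*) \((\d+)\)", label)
    if match is None:
        return label, None
    return match.group(1), int(match.group(2))


label = export_name("Pierre Dupont", 5, "H", "numbered")
print(label)
print(split_numbered(label))
```

Because the original number travels inside the name, it survives the renumbering that a pajek export imposes on vertex numbers.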

  • the blank (" ") (e.g. "Jean Marie Durand")

  • the slash ("/") (e.g. "Jean Marie / Durand")

  • the asterisk ("*") (e.g. "Jean Marie*Durand")

Warning: in order to be recognizable as separate name parts by Puck, name parts have to be separated by a slash!

Warning! Before exporting to a pajek file you have to eliminate all fictive individuals by activating the checkbox

 

 

Update window

 

The Update window (which can be accessed from the corpus window or the data window) serves to update a corpus using data from a file.

Corpus update requires a file which fulfills the same format requirements as the files used to load a corpus.

Note: if you use a file in text format containing only supplementary information on individuals' properties (so that the first block is empty), make sure that the file begins with two headlines (and not just one). The first headline (which may consist of a single letter) marks the presence of the first block, the second one the switch to the second block. Otherwise Puck will read your supplementary information as basic genealogical information, and the update will fail. These problems do not arise when you use files in tip format.

Updates can be made in append mode or in overwrite mode. In append mode, data are simply added without overwriting existing data (in this case, certain data may not be added, for instance a father that does not correspond to the existing father). In overwrite mode, existing data are replaced by the new data (in this case, an individual who appears in the update file as having no father will lose its father in the corpus).
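
The difference between the two modes can be sketched in Python for a single individual. The field names `father` and `mother` and the function `update_record` are hypothetical; this is one reading of the behaviour described above, not Puck's implementation.

```python
def update_record(current, incoming, mode):
    """Merge the parent fields of one individual from an update file."""
    result = dict(current)
    for key in ("father", "mother"):
        new = incoming.get(key)
        if mode == "overwrite":
            # The update file wins, even when it has no value.
            result[key] = new
        elif mode == "append":
            # Existing values are kept; only empty fields are filled.
            if result.get(key) is None and new is not None:
                result[key] = new
        else:
            raise ValueError(f"unknown mode: {mode}")
    return result


current = {"father": 2, "mother": None}
incoming = {"father": None, "mother": 3}
print(update_record(current, incoming, "append"))
print(update_record(current, incoming, "overwrite"))
```

In append mode the existing father is kept and the missing mother is filled in; in overwrite mode the father is lost because the update file lists none.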

 

 

Window components

Buttons:

  • Update - updates the corpus by adding data from a chosen file

Fields:

Checkboxes:

 

Fuse window

 

The Fuse Window (which can be accessed from the corpus window) serves to produce a corpus by fusing the current corpus with another corpus.

Dataset fusion requires two files:

Corpus fusion implies an automatic renumbering of individuals: individuals of the second corpus who have no double in the first corpus obtain a new identity number by adding the number of the last individual of the first corpus to their old number.

Attention: this renumbering should remain a transitory exception while establishing a definite dataset! As a rule, individuals should have one unique identity number and belong to one unique corpus.
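
The renumbering rule can be written out as a short Python sketch. How doubles are recognised is not specified here, so the `doubles` mapping (second-corpus number to first-corpus number) is a hypothetical input, and letting a double take over the number of its original is an assumption.

```python
def renumber_for_fusion(first_ids, second_ids, doubles):
    """Map old identity numbers of the second corpus to fused numbers."""
    offset = max(first_ids)  # number of the last individual of the first corpus
    return {
        old: doubles[old] if old in doubles else old + offset
        for old in second_ids
    }


# Hypothetical case: individual 2 of the second corpus is a double of
# individual 3 of the first corpus.
mapping = renumber_for_fusion([1, 2, 3], [1, 2], {2: 3})
print(mapping)
```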

Window components

Buttons:

  • Fuse - produces a new corpus by fusion of the current corpus and a second corpus

Fields:

 

Statistics window

 

The Statistics window (which can be accessed from the corpus window or the data window) serves to explore the corpus structure by individual and relation statistics, visualizing them by diagrams and exporting them as pajek partitions.

Available statistics and diagrams:

Note: A direct grouping of relations according to exogenous criteria is not yet possible.

All partitions of individuals can be exported in pajek format (as pajek clu-files) by activating the option "export partitions" in the Statistics Options Window and clicking on the "show" button.

Further statistics and diagrams can be chosen by clicking the "Options" button and opening the Statistics Options Window (see there for details).

Most statistics are available in three different formats:

Window components

Buttons:

  • Show - produces tables, diagrams and partitions according to the chosen criteria. Partitions are directly exported into the directory chosen in the directory name field

  • Save - saves tables and diagrams in the directory chosen in the directory name field

  • Clear - closes all tables and diagrams

  • Options - opens the Statistics Options Window in order to choose output format and additional statistics.

Fields:

Drop-down menus:

Codes for individual properties and kinship relation properties can be chosen from four drop-down menus, two for the main criteria and two for the split criterion. There can be several main criteria but only one split criterion for all individual and one for all relation statistics.

Checkboxes for choosing basic statistics:

Further statistics are accessible via the Statistics Options Window.

 

Count window

 

The Count window (which can be accessed from the corpus window or the data window) serves to count kinship relations and matrimonial rings in a corpus (that is, to carry out a relational or matrimonial census).

The types and parameters of a census can be specified in the search options window.

The results of a relation or ring census may be represented in different formats, that can be chosen via the report options window.

 

Matrimonial vs non-matrimonial census, circuit census vs open-chain census

A matrimonial circuit census counts the matrimonial circuits in a kinship network, that is, kinship and marriage chains between spouses.

For a matrimonial circuit census, activate the option circuits and deactivate the option ego-alter relation in the search options window. The ego-alter relation is thus by default defined as a marriage relation.

A non-matrimonial circuit census counts relational circuits in a kinship network, that is, kinship and marriage chains between people who are linked by previously defined relations (for instance, coresidence, friendship, etc.).

For a relational circuit census, activate the option circuits and the option ego-alter relation in the search options window. Double-clicking the label "relational circuit" opens a window where you can choose or enter the relational property that defines the couples for which linking kinship and marriage chains are searched.

Depending on whether the relation between ego and alter is symmetrical or asymmetrical, the search option symmetry has to be activated or deactivated in the search options window.

Note: this method differs from the method to redefine spouses altogether. In the latter case, all marriage relations are replaced by the chosen relation; whereas in the case of a relational circuit census, only the basic relation is defined as the chosen relation, while the couple may be linked by kinship as well as marriage relations. In a matrimonial circuit census (with or without redefinition of spouses), there are only two kinds of relations (kinship and marriage, or kinship and another relation). In a relational circuit census, there are three kinds of relations: kinship, marriage, and the basic relation.

An open kinship chain census counts the kinship chains in a kinship network, that is, all kinship and marriage chains, no matter how ego and alter are otherwise related.

For a kinship relation census, deactivate the option circuits in the search options window.

Note: The open chain census is in every respect analogous to a matrimonial census, so as to ensure comparability (options concerning ring inclusion, sibling differentiation or permitted consanguineous relations apply to both types of censuses).

Attention: whether or not the open chain census counts only relations between married individuals (so as to be comparable with the matrimonial census), between individuals of different sex, or between all individuals whatsoever, depends on whether or not the options "married individuals only" and/or "no same-sex relations" have been chosen in the search options window.

The relations or circuits to be counted can be determined in two ways:

This is done by a series of numbers such that the ith number indicates the maximal height (canonic degree of consanguineous ties) of rings of width i (including i marriages in the case of a matrimonial census, i-1 marriages in the case of a relational census). For instance, the code 3 2 1 sets the horizon of matrimonial ring search to blood marriages between 2nd cousins (degree 3), marriage redoublings between pairs of 1st cousins (degree 2) and marriage retriplings between pairs of siblings (degree 1). All rings/relations of lower dimensions (for instance, blood marriages between first cousins or marriage redoublings between pairs of siblings) are included in the search.
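
How such a number code bounds the search can be sketched in Python. The function names are hypothetical; the sketch only encodes the rule that the ith number is an upper limit on the height of rings of width i, and that widths beyond the length of the code are excluded.

```python
def parse_horizon(code):
    """Turn a number code such as "3 2 1" into a list of height limits."""
    return [int(token) for token in code.split()]


def within_horizon(code, width, height):
    """True if a ring of the given width and height lies within the code."""
    limits = parse_horizon(code)
    return 1 <= width <= len(limits) and height <= limits[width - 1]


# With the code "3 2 1": second-cousin marriages (width 1, height 3) are
# searched, but a redoubling between second cousins (width 2, height 3) is not.
print(within_horizon("3 2 1", 1, 3))
print(within_horizon("3 2 1", 2, 3))
```

Rings of lower height pass the same test, which is how first-cousin marriages (width 1, height 2) are included under the code 3 2 1.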

This is done by writing a ring type in positional notation, where the letter X serves as a gender variable. For instance, the formula X(H)HX limits matrimonial ring search to marriages with agnatic nieces/nephews or uncles/aunts.

Note that, as a rule, a structure formula is read by Puck in a socio-centered manner. That is, the formulae X(H)HX and XH(H)X designate one and the same type of relation. This is important if the census is run on a subcorpus, where ego may be in the corpus but alter may not. In this case, we may, however, fix an ego-centered perspective of the relation by activating the search option "ego in cluster" (the formula XX(X), for instance, will then give us all grandparents relations of the individuals in the cluster, but not all the grandchildren relations).
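
One way to picture the socio-centered reading is to treat a formula and its mirror image (the same chain read from alter's side) as one class. The Python sketch below does this by reversing the positions of the formula; the tokenization and the choice of a representative are assumptions made for illustration and are not taken from Puck.

```python
import re


def tokens(formula):
    """Split a formula into positions: letters, "(...)" groups and dots."""
    return re.findall(r"\([^)]*\)|[A-Za-z]|\.", formula)


def mirror(formula):
    """Read the formula from alter's side instead of ego's."""
    return "".join(reversed(tokens(formula)))


def socio_centered(formula):
    """Pick one representative for a formula and its mirror image."""
    return min(formula, mirror(formula))


print(mirror("X(H)HX"))
print(socio_centered("X(H)HX") == socio_centered("XH(H)X"))
```

An ego-centered reading, as with the option "ego in cluster", would skip this step and keep XX(X) distinct from its mirror image (X)XX.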

Without further indication, a structure formula limits the search to relations/rings that fit this formula exactly (contrary to the number code method, which only sets an upper limit). We can, however, include all relations/rings that lie within the limits of the formula by placing the sign "<" before it. For instance, the formula "<XXX(X)XX" limits matrimonial ring search to all blood marriage rings within the limits of the 5th civil degree (including marriages between first cousins, between uncles and nieces etc.).

Juxtaposition of several structure formulae is interpreted as a combination by logical "or". For instance, the formula X.X(X)X XX(X)XX limits ring search to marriages between siblings-in-law or 1st cousins.

For the moment, combining formulae by logical "and" (multiple rings/relations) is not yet possible. You can, however, reach the same result by first running a census of the first ring class with the option "Mark Individuals", then Redefining Spouses (so as to limit spouses to relatives of the first type), and then running a census of the second ring class.

This is done by putting a subtraction sign ("-") before the formula. The rings which fit the formula are then eliminated from the rings within the limits of the number code. For instance, a number code 2 combined with the formula -X(X)XX limits matrimonial ring search to all blood marriages between consanguines of degree 2 except oblique relatives.

The type of sibling differentiation can be set in three possible ways:

  • Code 1: one single sibling type, "all siblings assimilated". Full and half siblings are not distinguished.

This method is recommended where half sibling relations rarely occur or do not matter for marriage rules.

  • Code 2: two sibling types, "no siblings assimilated". Only paternal and maternal siblings are distinguished, full siblings are counted twice (once as paternal and once as maternal siblings).

This method is recommended where half-sibling relations are frequent (e.g. because of high rates of polygamy) and the agnatic or uterine relationship is more important than the sibling relationship as such.

This method provides the maximum of information.

Attention: the sibling mode is automatically set to "2" if the report option "matrimonial network" or "single rings" has been chosen in the Report Options Window! (In order to draw the circuits in a network or as subnetworks, the apical ancestors must be preserved.)

The type of permitted consanguinity relations can be set in three possible ways:

The choices Agn or Uter result in a unilinear census.

The symmetry type of the relation between ego and alter (option symmetry in the search options window) decides on the permutability or non-permutability of ego and alter.

Example: if kinship chains between co-residents are searched, the option "symmetry" has to be activated, for co-residence is a symmetric relation. Accordingly, the chains "father-son" and "son-father" will be counted as one single category. By contrast, if kinship chains between persons and their heirs are searched, the option "symmetry" has to be deactivated, for inheritance is an asymmetric relation. Accordingly, the chains "father-son" and "son-father" will be counted as different categories.

Note: the symmetry type choice is only relevant for a non-matrimonial circuit census (option "ego-alter"). If matrimonial circuits or open kinship chains are searched, ego and alter are always considered as permutable (in the first case, male or female ego will be chosen according to the chosen option, in the second case, there is no criterion for the selection of ego or alter).
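
The effect of the symmetry option on the counting of chain categories can be sketched in Python. The role labels and the function `count_chain_types` are hypothetical; the sketch shows only that a symmetric relation merges a chain with its reverse, while an asymmetric one keeps them apart.

```python
from collections import Counter


def count_chain_types(chains, symmetric):
    """Count (ego role, alter role) chains, merging reversed pairs
    into one category when the ego-alter relation is symmetric."""
    counts = Counter()
    for ego_role, alter_role in chains:
        if symmetric:
            key = tuple(sorted((ego_role, alter_role)))
        else:
            key = (ego_role, alter_role)
        counts[key] += 1
    return counts


chains = [("father", "son"), ("son", "father"), ("father", "son")]
print(count_chain_types(chains, symmetric=True))    # co-residence
print(count_chain_types(chains, symmetric=False))   # inheritance
```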

Whether smaller circuits may be included, or whether circuits may intersect with smaller (or smaller and narrower) circuits, can be set by activating or deactivating the options "Rings only", "Minor rings only" and "Minimal rings only" in the search options window.

 

Special types of censuses

Restricted matrimonial / relational census

A census can be restricted, either to a subset of individuals or to a subset of marriages:

Note: restriction only affects the pivots (ego and alter, and possibly married intermediate individuals) and not the linking individuals. The chains between ego and alter run through the entire network, not only through the subnetwork. The results are therefore different from a matrimonial census run on a subcorpus that has been exported and re-imported as a total corpus (export and re-import cut all kinship chains through external individuals!).

Example: the code +SKEW O restricts the census to oblique marriages, -LINE A restricts the census to marriages that are not with agnatic kin.

Note: this restriction applies to marriages and not only to ring types! That is, marriages are only counted if they are (are not) in a ring having the given property, no matter under which ring type they may appear. For instance, if the code +SKEW O is chosen, then horizontal marriages will only be counted if they are also oblique; if the code -LINE A is chosen, uterine marriages will only be counted if they are not also agnatic, and so on.

Complex matrimonial / relational census

A more complex relational or matrimonial census can be carried out by combining two censuses, using the results of the first (stored as relational properties by the "Mark Individuals" function) in order to redefine spouses, and running the second census on the corpus thus transformed.

In this manner, one can search for MBD marriages that are at the same time ZD marriages, bilateral cross cousins, and so on.

Such a complex census is a useful analytical complement to the inspection of the ring intersection network.

Instead of generating relational data from a preliminary relational or matrimonial census, they can also be directly read from a file, for instance a list of ego-alter-pairs (in the form of a two-column text file). For the precise method see the entry Relational properties from text files.

Aggregate matrimonial census

By activating the "ring clustering" option in the Search Options Window, you can group ring types according to a certain property (chosen from a drop-down menu by double-clicking on the checkbox label) and carry out a census of matrimonial rings classified by these aggregate ring types rather than by elementary ring types.

Mixed matrimonial census and connubial census

By activating the "individual clustering" option in the Search Options Window, you can regroup individuals according to a certain property (chosen from a drop-down menu by double-clicking on the checkbox label) and carry out a census of

  • mixed matrimonial circuits containing the relation "belonging to the same cluster" (symbolized by the tilde ~): Puck distinguishes for the moment 9 types of mixed matrimonial rings, according as H, HF or HM belong to the same cluster as W, WF or WM.

  • connubial circuits consisting of marriage alliance relations (each of which may represent several marriages) between groups of individuals (that is, rings in an alliance network, where arrows point from the wife's group to the husband's group): Puck distinguishes endogamous rings (1 group), redoubling circuits or exchange circuits (2 groups, arrows pointing in the same or in inverse directions) and cycle circuits (3 groups, arrows consistently directed).

For each ring, the census lists (1) the number of connubial circuits, (2) the number of distinct cycles that may be formed from the marriages that constitute the connubial circuit, (3) the weight of the circuit (the geometric mean of marriages joining two groups in the ring) and (4) the probability of the circuit, given the relative numbers of potential spouses in the constitutive groups.
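As an illustration of point (3), the weight can be sketched in a few lines of Python. This is a minimal sketch and not Puck's own code: the function name `circuit_weight` and the example counts are hypothetical, and each alliance relation of the circuit is assumed to be given simply as its number of marriages.

```python
import math


def circuit_weight(marriage_counts):
    """Geometric mean of the marriage counts on the alliance relations of one circuit.

    `marriage_counts` is a list with one positive integer per alliance relation
    (hypothetical input format, chosen for this sketch).
    """
    if not marriage_counts or any(m <= 0 for m in marriage_counts):
        raise ValueError("each alliance relation needs at least one marriage")
    # Averaging logarithms is equivalent to taking the k-th root of the product.
    log_sum = sum(math.log(m) for m in marriage_counts)
    return math.exp(log_sum / len(marriage_counts))


# Hypothetical exchange circuit: 4 marriages in one direction, 9 in the other.
weight = circuit_weight([4, 9])
```

For this exchange circuit the sketch yields a weight of about 6, the square root of 4 × 9.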

If the "individual clustering" option in the Search Options Window is activated, several options in the Report Options Window change their meaning accordingly:

  • General survey: lists the numbers of mixed matrimonial circuits and connubial circuits for each circuit type

  • List by type: lists the individual couples for each mixed matrimonial circuit type and the cluster couples (for example, families) for each connubial circuit type

  • List by couple: lists the alliance partners - separated into wife givers and wife takers - for each cluster (for example, each family) and the number of marriages concluded. This is tantamount to a list version of the alliance matrix.

  • Matrimonial Network: produces the alliance network and the corresponding alliance matrix of the chosen cluster set.

Window components

Buttons:

Fields:

Drop-down menus:

 

Info window

 


The info window (accessible by double-click from the opening window) displays basic information on the Puck program, together with a file name field and a button for saving that information as a text file.

This information includes:

  • The current version

  • License information

  • The author

  • The website

  • The contact email

  • The date of the last update

 

Simulation parameter window

 


Clicking on the button "Random" in the Opening Window opens the Simulation Parameter Window which prepares a random simulation of a genealogical dataset.

This virtual corpus opens just like a corpus loaded from a file and can be treated in all respects like any other corpus. The virtual society has been called the "Pucks", and its members have fantasy names to facilitate identification and analysis.

Note that we are not talking about simulation of a real genealogical network, but of a genealogical corpus which is a lacunary and biased picture of the real network. This makes it comparable with the datasets we are analyzing.

Puck's method to do this is to simulate a virtual fieldwork situation. If the option "protocols" is activated in the General Options Window, the protocol of this fieldwork is produced in an editable window that can be saved as a text file. This virtual fieldwork protocol contains the list of individuals interviewed and, for each interview, the relatives recalled by the individual, with their kinship relation, birth year and death year. In this manner the generation of the random corpus can be controlled at every step.

 

 

Virtual fieldwork

 


Warning! This module is currently little more than a demonstrator for exploring a new method of kinship network simulation. It has not yet been tested, and its results should not be used in scientific publications!

Window components

Fields:

 

General options

 

The general options window (accessible by double mouse click from the opening window) contains a list of checkboxes for various options concerning Puck's general behaviour.

Underlined checkboxes open another options window (by double-clicking on the checkbox label) in which the choice can be refined.

 

Window components

Options concerning the default layout of the data window :

  • Show parents: Show/mask the parents panel in the data window

  • Show spouses: Show/mask the spouses panel in the data window

  • Show children: Show/mask the children panel in the data window

  • Show properties: Show/mask the properties panel in the data window

  • Show cv: Show/mask the cv panel in the data window

  • Show notes: Show/mask the notes panel in the data window

Options concerning the information furnished when opening a corpus:

Double-clicking opens the Info options window for choosing the kind of information to be furnished at the opening of the corpus

Double-clicking opens the Control Options Window for choosing the kind of irregularities to be reported at the opening of the corpus

Options concerning census reports and lists:

  • Sort couples by wives: Sort couples by the wife's (and not the husband's) identity number

  • List rings for every ego: List matrimonial rings from all possible ego-perspectives in census reports

  • Last name first: Edit and sort names as "LastName, Firstname" instead of "Firstname LastName"

  • Reports in excel format: Produce census reports, matrices and partition statistics in excel table format (if this option is deactivated, text format is chosen)

Options concerning the treatment of virtual relatives:

  • Ignore virtual individuals: Do not count fictive parents in corpus statistics and only use them in relation counts (for totally removing fictive parents, use the "Eliminate virtual individuals" option in the transform options window).

Reset options:

  • Reset at close: Resets all parameters to their default value. At the next launch, Puck will have forgotten all modifications by the user and behave as if it were launched for the very first time.

 

Info options window

 


If the option "show info at start" is activated in the General options window, loading a corpus automatically produces a window containing some basic statistical information on the corpus. The same information can also be obtained at any moment and for any open (sub)corpus by activating the option "survey" in the Statistics options window.

The kind of information can be chosen in the Info Options Window.

Some statistical analyses (such as component analysis) may be relatively time-consuming for large networks, and the user may want to skip them without giving up basic information such as gender proportions or population size.

The information is displayed in an editable window and may be saved as a text file:

 

General info

 

Image

Image - Image

 

Window Components

Checkboxes

 

Control options window

 


If the option "protocols" is activated in the General options window, loading a corpus automatically produces a window indicating some structural features which, according to the control options chosen, represent irregularities and may be due to errors during data entry.

Some of these items (such as persons without name or gender, or homosexual marriages) may actually be correct and wanted, but they are often due to simple errors during data entry. Puck indicates them in order to facilitate error control for the researcher, but never automatically "corrects" possible errors.

Even if they are true representations of the kinship network, some of these irregularities may hinder certain functions of Puck from working correctly or necessitate a reconsideration of analytical methods.

Some errors may cause failure of some functionalities (for instance, cyclic descent causes infinite loops in agnatic/uterine component computation, so that the basic corpus statistics window will not open) or will lead to erroneous results in others (for instance, the presence of male mothers or female fathers causes calculation errors in the matrimonial census).

The Control Options Window allows the user to choose the features to be checked before starting to work on the corpus.

The results of the control are displayed in an editable window and can be saved as a text file:

 

 

Image

Image - Image

Note that these options are only valid for a corpus that is loaded from a file. A randomly generated corpus by definition cannot contain any structural irregularities. In the case of a random corpus, the option "protocols" does not result in an irregularity protocol, but in a virtual fieldwork protocol.

Types of possible coding errors

Cases of cyclic descent make all generation and component calculations inoperative and may block functions which presuppose them. Cyclic descent always represents a major error in genealogical networks! Networks where cyclic descent is allowed lose practically all of the characteristic features of kinship networks and should not be treated by a kinship network program, but by general network analysis programs (such as Pajek).
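To check a dataset for cyclic descent outside Puck, for instance before import, a standard depth-first search over the parent links is sufficient. The following is a minimal sketch under stated assumptions, not Puck's own control routine: the function name `find_descent_cycle` is hypothetical, and the corpus is assumed to be available as a dictionary mapping each identity number to the identity numbers of the known parents.

```python
def find_descent_cycle(parents):
    """Return a list of identity numbers forming one descent cycle, or None.

    `parents` maps each individual to an iterable of its known parents
    (hypothetical input format, chosen for this sketch).
    """
    UNSEEN, OPEN, DONE = 0, 1, 2
    state = {}
    for start in parents:
        if state.get(start, UNSEEN) != UNSEEN:
            continue
        # Iterative depth-first search, so that deep genealogies do not
        # exhaust the recursion limit.
        state[start] = OPEN
        path = [start]
        stack = [iter(parents.get(start, ()))]
        while stack:
            for parent in stack[-1]:
                seen = state.get(parent, UNSEEN)
                if seen == OPEN:
                    # The parent is already on the current ascending line:
                    # somebody is their own ancestor.
                    return path[path.index(parent):]
                if seen == UNSEEN:
                    state[parent] = OPEN
                    path.append(parent)
                    stack.append(iter(parents.get(parent, ())))
                    break
            else:
                # All parents of the current individual have been explored.
                state[path.pop()] = DONE
                stack.pop()
    return None
```

A result other than `None` names the individuals involved, which is usually enough to locate the faulty parent link in the data.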

 

Window components

Checkboxes:

 

Transform options window

 


The transform options window (which can be accessed from the corpus window) contains a list of checkboxes for various options for transforming a corpus into a new corpus. The transformed corpus appears in a new corpus window, the original corpus is not closed.

Pressing the "ok" button starts the transform operation.

Window components

Transform options:

  • Maximal bicomponent - Reduce the kinship network to its maximal bicomponent

  • Kernel - Reduce the kinship network to its kernel

  • Cut Tails - Eliminate recursively all individuals who have only one kinship link (which amounts to eliminating all tails of the kinship network)

  • Eliminate unmarried children - Eliminate all individuals who have neither spouses nor children (the structural children of the kinship network)

  • Expand (bottom-up) - This method works only for a subnetwork. It augments the subnetwork by all ascendants of its members.

  • Core - Reduce the kinship network to the sum of its matrimonial bicomponents - the core

  • Eliminate virtual individuals - Eliminates all virtual individuals (individuals that only serve to represent the common parents of full siblings).

This reduction is recommended for the exploratory analysis of a corpus containing fictive individuals.

  • Eliminate doubles - Eliminates all individuals whose name consists of a number and transfers all relatives and attributes to the individual who has this number as identity number

This is the way to note double entries in the corpus: replace the name of the double by the identity number of the original. You can then eliminate (as required) all doubles by the "Eliminate doubles" function.

  • Redefine spouses - replaces the marriage relation by a chosen relation (which may be itself a more narrowly defined marriage relation, for example marriages concluded in a certain period)

The relation is set by double-clicking on the label "redefine spouses" and entering a relational property in the window that opens (manually or by choosing from the drop-down menu). A relational property is a property which may have an "alter".

For example, a code like "FRIEND" replaces spouses by the individuals whose identity numbers appear in the "alter" field of the property "FRIEND" (in the "property" panel of the data window). A matrimonial census run on this new corpus thus counts kinship relations linking friends (or business partners, neighbors, successors in office, or whatever). In this manner, the matrimonial census is transformed into a general multiple relation census.

The property can also be binarized. This is particularly useful for restricting a matrimonial census to marriages that fulfill certain place or date criteria.

For example, a code like "MARR Year 1701-1800" replaces spouses by spouses who have been married in the 18th century. A matrimonial census can thus be restricted not only to a subset of individuals, but also to a subset of marriages.

  • Redefine parents - replaces the parent relation by a chosen relation, where the type of parent (father/mother) corresponds to the gender of alter.

The procedure is the same as in redefining spouses. Redefined parents (for example adoptive parents, godparents, etc.) then replace genealogical parents in a matrimonial census or any other function run on the new corpus.

Note that one and the same relation (for instance adoption) can be meaningfully used to "redefine parents" as well as to "redefine spouses", for example to examine by a matrimonial census genealogical relations between godparents and godchildren, or to construct the alliance network created by adoption (rather than marriage) relations between groups.

  • Equivalence relations - relationalizes a chosen property by creating a relation between all individuals that have the same property value. The relation is stored in the form of a relational property for each ego. This relational property has the same label as the relationalized property, preceded by the prefix "Co-".

For example, relationalizing the property "residence" leads to the relationalized property "co-residence" which lists, for a given ego, every alter having the same residence as ego.

Relationalized properties can in turn be used in order to redefine spouses or to run a relational circuit census. For instance, a relational circuit census based on the relationalized property "co-residence" will count kinship and marriage relations between co-residents.

  • Positive cluster values - reduces the network to those vertices that have a positive cluster value in a chosen partition.

For example, if "PROG 1" (number of children) is chosen as partition command, then the network is reduced to people who have children. If "RES" is chosen, the network will only contain people whose residence is known, etc.
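Two of the transformations listed above, "Cut Tails" and "Equivalence relations", can be sketched in Python as follows. This is an illustrative sketch and not Puck's implementation: the function names `cut_tails` and `equivalence_relation` and the input formats (a list of undirected kinship links, a dictionary of property values) are assumptions made for the example, as is the decision to leave individuals with an unknown property value out of the equivalence relation.

```python
from collections import defaultdict


def cut_tails(links):
    """Recursively drop individuals that have exactly one kinship link.

    `links` is an iterable of (a, b) pairs, each an undirected kinship link
    (hypothetical input format). Returns the surviving links as frozensets.
    """
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    tails = [v for v, ns in neighbours.items() if len(ns) == 1]
    while tails:
        v = tails.pop()
        if len(neighbours[v]) != 1:
            continue  # already removed, or no longer a tail end
        (u,) = neighbours[v]
        neighbours[u].discard(v)
        neighbours[v].clear()
        # Removing v may turn its only neighbour into a new tail end.
        if len(neighbours[u]) == 1:
            tails.append(u)
    return {frozenset((a, b)) for a, ns in neighbours.items() for b in ns}


def equivalence_relation(values, label):
    """Relate each ego to every alter sharing the same property value.

    `values` maps an identity number to a property value, or to None when the
    value is unknown (assumed here to be excluded from the relation).
    Returns the new label with the "Co-" prefix and a dictionary ego -> alters.
    """
    groups = defaultdict(list)
    for ego, value in values.items():
        if value is not None:
            groups[value].append(ego)
    relation = {}
    for members in groups.values():
        for ego in members:
            relation[ego] = sorted(m for m in members if m != ego)
    return "Co-" + label, relation
```

In the first function, a triangle of links with a chain attached to one corner is reduced to the triangle alone. In the second, the resulting dictionary plays the role of the "alter" entries of the relationalized property.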

 

 

Conversion options window

 


The conversion options window is opened by double-clicking on the "convert" button in the opening window. It displays the options for converting a file into different formats.

Puck can convert files from and to text, excel, pajek, gedcom, tip, xml and prolog format (for more information on these file formats click here).

 

Search options window

 

The Search Options window (which can be reached from the Count window) contains a series of checkboxes for options concerning relational and matrimonial censuses.

Underlined checkboxes open (by double-clicking on the checkbox label) a drop-down menu from which cluster categories can be chosen (for individual and relation clustering); multiple choices are possible.

 

 

 

 

 

 

 

General options

  • Circuits: count matrimonial circuits (and not open consanguineous relations). Deactivating this checkbox amounts to a switch from a matrimonial or relational circuit census to a kinship relation census.

  • Ego-alter relation: take a relation other than marriage as the base relation of the circuit (the relation can be set by choosing a relational property from the drop-down menu accessible via double-click). If this checkbox is deactivated, the marriage relation is chosen by default.

Drop-down menu (opened by double-clicking): individual property codes (for choosing a relational property)

  • Married individuals only: count only relations and circuits joining married individuals

Note: If this option has been chosen, the resulting kinship relation census counts the relations that could form part of the matrimonial circuits. Their numbers are then identical with those indicated in the matrimonial census report when the option "show relation frequencies" has been chosen in the report options window.

  • No same-sex relations: count only relations and circuits joining individuals of different sex

Note: in the case of a matrimonial census (where the option "circuits" is activated and the option "ego-alter relation" is deactivated) the census is automatically restricted to married individuals of different sex. The two preceding options are therefore only relevant for a relational circuit census (where the option "ego-alter relation" is activated and a non-marriage relation is chosen as the base relation) and for a kinship relation census (where the option "circuits" is deactivated and ego and alter are not related by a second relation).

  • Rings only: do not consider also circuits/relations that completely include shorter ones (that is, only count matrimonial rings and not all matrimonial circuits)

  • Minor rings only: do not consider also circuits/relations that intersect with shorter and narrower ones (that is, only count minor matrimonial rings and not all matrimonial circuits)

  • Minimal rings only: do not consider also circuits/relations that intersect with shorter ones (that is, only count minimal matrimonial rings and not all matrimonial circuits)

  • Symmetry: consider the ego-alter relation as symmetric and therefore allow permutation of ego and alter

Options concerning clustering of individuals or relations/rings

Drop-down menu (opened by double-clicking): individual property codes

  • Ring clustering: group matrimonial rings according to a chosen kinship relation property and count the corresponding ring frequencies

Drop-down menu (opened by double-clicking): kinship relation property codes

Options concerning censuses restricted to a cluster

  • All in cluster: all married individuals (matrimonial ring pivots) must belong to the chosen cluster

  • Ego in cluster: ego (according to a chosen kinship schema) must belong to the chosen cluster (presupposes ring search by formula).

  • Last married in cluster: the last married individual must belong to the chosen cluster (presupposes marriage dates, for the moment still treated as individual properties) Warning! This method is only an approximation! If you really want to restrict a matrimonial census to a subset of marriages, use the redefine spouses function!

 

Report options window

 

The Report Options window (which can be reached from the Count window) contains a series of checkboxes for options concerning the format of relational and matrimonial census reports.

Underlined checkboxes open (by double-clicking on the checkbox label) a drop-down menu from which cluster categories can be chosen (for individual and relation clustering); multiple choices are possible.

Click here for information on ring and relation notation.

Click here for information on text and pajek formats.

 

 

 

 

Options concerning types of census reports in text format

Every report in text format is headed by information on the corpus, the type of census (matrimonial or relational), the date of the census, the version of Puck used, the maximal dimension (height and width) of the relations/rings concerned, the type of sibling differentiation, the treatment of ring inclusion, and the total number of marriages and married men and women in the corpus.

  • General survey: 1. Indicate the number of marriages and individuals concerned by the census (both in absolute numbers and as percentages, differentiated by men and women). Ring types of different width (number of marriages implied) are listed in separate blocks. Before each block, indicate the total number of rings and ring types (with the mean number of rings per type), the width of the rings and the maximal height of the ring (e.g. the canonic degree of the consanguinity relations implied).

2. List, for each ring/relation type (unless the census has been made per relation formula, only for types with nonzero frequency), continuously numbered (across blocks), their current index, their standard and positional notation, and the number of individuals and relations / individuals, marriages and rings concerned (both in absolute numbers and as percentages)

  • List by types: List, for each type of relation/ring (indicated in standard notation), all relations/rings found in the corpus, with nominal indication of their pivots (and their relations: "=" for marriage, "-" for consanguinity) and the complete chain in positional notation (individuals being indicated by their identity numbers)

  • List by couples: List, for each couple or pair of relatives concerned by the census, all the rings/relations that link them, both in standard notation of the type and in positional notation of the complete chain (individuals being indicated by their identity numbers)

  • Sortable list: List, in a continuous (thus easily sortable) list, all relations/rings found in the census, with the index, standard and positional notation of the relation/ring type, nominal indication of their pivots (and their relations: "=" for marriage, "-" for consanguinity) and the complete chain in positional notation (individuals being indicated by their identity numbers)

  • Decomposition: The report of ring intersection network decomposition. This is a method for studying the interdependence of matrimonial ring types by eliminating ring types one after the other, in order of size, and diminishing all other ring types by the marriages they have in common with the eliminated ring type (if no marriages are left, a ring type is also eliminated).

For each elimination step, the decomposition report lists the eliminated ring type, the number of rings of this type (absolute and as a percentage of their original ring number), the number of remaining rings and ring types (absolute and as a percentage of their original numbers), and the secondary ring types that are diminished in size or eliminated.
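The elimination procedure behind the decomposition report can be sketched as follows. This is a simplified illustration and not Puck's algorithm: each ring type is represented only by the set of marriages occurring in its rings, "order of size" is assumed to mean the largest remaining type first, ties are broken by label, and the function name `decompose` and the ring type labels in the usage example are hypothetical.

```python
def decompose(ring_types):
    """Eliminate ring types one by one and record what each step changes.

    `ring_types` maps a ring type label to the set of marriages found in
    rings of that type (simplified representation, assumed for this sketch).
    Returns a list of steps: (eliminated type, its number of marriages,
    types diminished by the step, types emptied and thus also eliminated).
    """
    remaining = {t: set(m) for t, m in ring_types.items() if m}
    steps = []
    while remaining:
        # Assumed order: largest remaining type first, ties broken by label.
        top = min(remaining, key=lambda t: (-len(remaining[t]), t))
        removed = remaining.pop(top)
        diminished, emptied = [], []
        for t in sorted(remaining):
            if remaining[t] & removed:
                remaining[t] -= removed
                (diminished if remaining[t] else emptied).append(t)
        for t in emptied:
            del remaining[t]
        steps.append((top, len(removed), diminished, emptied))
    return steps


# Hypothetical census result: marriages are identified by arbitrary numbers.
steps = decompose({"MBD": {1, 2, 3, 4}, "FZD": {3, 4, 5}, "ZD": {1, 2}})
```

In this example the first step eliminates "MBD", diminishes "FZD" to a single marriage and empties "ZD"; the second step eliminates what is left of "FZD".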

Options concerning types of network outputs

All networks are exported in pajek .net format. Partitions are exported in .clu format if they concern non-numerical data, in .vec format if they concern numerical data. All network outputs of a given session are united in a single pajek project (.paj) file.

  • Matrimonial network: produce the matrimonial network corresponding to the matrimonial census (that is, a network made up only of the links that form part of some matrimonial circuit), as well as a partition to distinguish vertices that occupy spouse positions in the matrimonial network

Attention: If this option is activated, the sibling mode is automatically set to 2 (no siblings assimilated)

  • Matrimonial network (frame): produce the frame of the matrimonial network corresponding to the matrimonial census, embedded in the total kinship network. To extract the frame, use the Tip4Pajek macro A 2.3 ("Reduce to frame").

  • Single circuits: export all matrimonial rings found by the census as separate networks (with nominal indication of individuals as vertex labels), as well as a partition to distinguish vertices that occupy spouse positions in the circuit

Attention: If this option is activated, the sibling mode is automatically set to 2 (no siblings assimilated)

  • Circuit intersection network: produce the ring intersection network corresponding to the matrimonial census, as well as the circuit intersection matrix in text format.

Options concerning supplementary information in the general survey

  • Show relation frequencies: count, for each matrimonial ring type, the number of corresponding relations (between married individuals) in the network, and calculate the preference index (as the quotient of the shares of relations and rings of the given type relative to total relation and ring frequencies within the horizon of the census). Relation frequencies and preference indices are indicated in separate columns in the general survey.

  • Show circuit clusters: partition rings according to several chosen circuit properties and indicate, for each ring, the clusters to which it belongs. For each ring property there is a separate cluster column, headed by the property name. Ring properties can be chosen from a drop-down menu (opened by double-click).

Options concerning partitions

  • Mark individuals: add the binary property of being in a circuit of a given type to each individual who forms a pivot of a circuit. The label of the property is the circuit type in standard notation, preceded by a "#" (see the Individual Property Codes). The property includes indication of Alter and appears in the "cv data" panel of the Data window.

Partitioning the corpus according to such a property makes it possible to extract the sub-corpus of all individuals that form part of a circuit of the given type.

Using this property for redefining spouses in order to effect a second relational or matrimonial census permits a complex matrimonial or relational census (multiple kinship relations or intersecting matrimonial circuits).

 

Statistics options window

 


The Statistics Options window (which can be reached from the Statistics window) contains a series of checkboxes for options concerning statistics and diagrams other than those concerning the distribution of individual and relation properties, which are chosen directly in the Statistics window.

The statistics available (other than property distribution) are currently:

  • Statistics concerning the distribution of relations of certain types according to generational depth or degree:

The following options can be chosen from a special statistics options window opened by double-click on the label "analyze structure":

  • Agnatic/uterine weight: the number of individuals for whom the agnatic/uterine/the agnatic and the uterine linear ascendant of a given degree is known, as a percentage of individuals for whom the agnatic or the uterine ascendant of that degree is known. This is a measure of the agnatic or uterine bias of the corpus

  • Agnatic/uterine net weight: the number of individuals for whom only the agnatic/uterine ascendant of a given degree is known, as a percentage of individuals for whom the agnatic/uterine ascendant of that degree is known. Another measure of the agnatic or uterine bias of the corpus

  • Agnatic/uterine component distribution: the distribution of agnatic/uterine components (connected subnetworks made up entirely by paternal/maternal ties) according to their size: the abscissa of the diagram indicates the relative size of components (as a percentage of total network size, where size = number of individuals), the ordinate indicates the relative frequency of components of given size (as a percentage of total number of components)

  • Genealogical completeness: the mean number of known ascendants of a given generational depth, as a percentage of the possible number of ascendants of that degree

  • Differential density: the density of consanguineous relations of a given degree, that is, the number of consanguineous relations as a percentage of the number of possible relations between non-identical individuals in the network

  • Fratry distribution: the distribution of agnatic and uterine fratries (sibling groups) according to their size
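The definitions of genealogical completeness and differential density given above translate into short calculations. The following Python sketch is only an illustration of these definitions, not Puck's code: the function names and the dictionary of parent links are hypothetical, the completeness is averaged over all individuals listed in the dictionary (an assumption), and the density uses n*(n-1)/2 as the number of possible relations between n non-identical individuals.

```python
def ancestors_at_depth(parents, ego, depth):
    """Known ascendants of `ego` exactly `depth` generations up.

    An ancestor reached through several lines is listed once per line,
    since each line fills one of the 2**depth possible positions.
    """
    level = [ego]
    for _ in range(depth):
        level = [p for person in level for p in parents.get(person, ())]
    return level


def completeness(parents, depth):
    """Mean number of known ascendants at `depth`, as a percentage of 2**depth.

    `parents` maps every individual of the corpus to its known parents
    (hypothetical input format; individuals without known parents map to []).
    """
    egos = list(parents)
    if not egos:
        return 0.0
    known = sum(len(ancestors_at_depth(parents, e, depth)) for e in egos)
    return 100.0 * known / (len(egos) * 2 ** depth)


def density(n_relations, n_individuals):
    """Relations of a given degree as a percentage of all possible pairs."""
    possible = n_individuals * (n_individuals - 1) // 2
    return 100.0 * n_relations / possible if possible else 0.0
```

With four individuals of whom one has two known parents and one has a single known parent, the completeness at depth 1 is 3 known out of 8 possible positions, that is 37.5 percent.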

  • Statistics concerning the distribution of agnatic/uterine components according to number (as a percentage of the total number of components) and size (as a percentage of the total corpus size)

  • Some frequently required statistics are available as predefined standard statistics:

Note: if this option is activated, Puck automatically inactivates the option ego-alter-relation in the Search Options Window!

Note: According to the options chosen via the Search Options Window, the consanguine relations can be restricted to individuals of different sex and/or to married individuals.

This statistic makes it possible, for example, to list the kinship composition of households, thus providing a detailed insight into residential organization.

The underlying partition is established for the chosen attribute labels listed in the left-hand area of the statistics window.

The following types of relation are counted (both in absolute numbers and in density terms, that is, as a percentage of the total number of possible relations between cluster members, which is n*(n-1)/2 relations for a cluster of size n):

Agnatic, uterine, consanguineous, bilateral and parent-child relatedness are differentiated according to sex: HH (male-male), FF (female-female) and HF (male-female)

Agnatic, uterine or consanguineous relatedness is determined without limitation of degree. The average of the minimal genealogical distance (roman degree) is indicated in a separate column.

In addition, total size, male and female population is indicated for each cluster in the first three columns.

The first line gives total, the second line average values for the kinship compositions of all clusters.

This statistic makes it possible, for example, to list the kinship relations of the inhabitants of a house (cluster X in partition A "residence") to the closest individuals whose family (cluster Y in partition B "own family") is the family of the owners of the house (cluster Y in the partition C "family of the house-owner").

The three underlying partitions A, B, C correspond to the first three chosen attribute labels listed in the left-hand area of the statistics window.

The following kinship links are counted (separately for male and female cluster-members): Ego, F, M, FF, FM, MF, MM, FFF, MMM, H/W, FW, MH, FFW, MM, SpF, SpM

The first line gives total, the second line average frequencies of the cluster-affiliation links for all clusters.

Expected relinking frequencies are obtained by a permutation test: marriages are randomly permuted among married individuals of the same generation (so that the size of sibling groups and the number of allies of each sibling group remain invariant).
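The permutation step of such a test can be sketched in Python. This is a minimal sketch under simplifying assumptions and not Puck's procedure: marriages are given as (husband, wife) pairs, the generation of a marriage is taken from the husband, and wives are shuffled among the husbands of the same generation so that every individual keeps the same number of marriages. The function name `permute_marriages` and both input formats are hypothetical.

```python
import random
from collections import defaultdict


def permute_marriages(marriages, generation, rng=None):
    """Randomly reassign wives to husbands within each generation.

    `marriages` is a list of (husband, wife) pairs; `generation` maps each
    husband to a generation number (both formats assumed for this sketch).
    Returns a new list of pairs in the original husband order.
    """
    rng = rng or random.Random()
    by_generation = defaultdict(list)
    for index, (husband, _wife) in enumerate(marriages):
        by_generation[generation[husband]].append(index)
    permuted = list(marriages)
    for indices in by_generation.values():
        wives = [marriages[i][1] for i in indices]
        rng.shuffle(wives)
        for i, wife in zip(indices, wives):
            permuted[i] = (marriages[i][0], wife)
    return permuted
```

Running a census on many permuted corpora and averaging the ring counts then gives the expected frequencies against which the observed ones can be compared.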

Note: The same kind of information can also be obtained at the opening of a corpus if the option "show info at start" has been activated in the General options window. However, while the latter option provides only information for the total corpus at the moment of its opening, the option in the statistics options window allows obtaining it for any corpus, subcorpus or transformed corpus at any moment.

Window components

Checkboxes for choosing output modes:

Checkboxes for choosing statistics:

 

Kinship calculator

 


The Kinship calculator serves to analyze, transform and compute kinship relations.

The kinship calculator can be accessed from the opening window or from the corpus window by double-click into the kinship relation schema field. In the latter case, it directly adopts the kinship relation entered in this field, and keeps the reference to the underlying corpus whose corresponding ring/relations can be searched. In the former case, relations/rings are totally abstract.

Kinship relations can be entered in any notation (standard, positional or numeric). The calculator contains three lines so as to allow for unary and binary operations. These operations can act on fully specified relations or on a relational schema (without specification of gender).

 

Relation analysis includes information on:

The results of the relation or ring type analysis are displayed in an editable window and can be saved as a text file.

Unary operations:

Binary operations:

Multiple output operations

Unary and multiple output operations can act on relations in any of the three fields. Binary operations act on relations 1 and 2 to produce relation 3.

Ring construction as networks

The export function constructs an abstract representative of the ring type (or a set of representatives of a class of ring types) in network format. These networks can be used as fragments in a Pajek Fragment Search (an alternative to Puck's matrimonial census as a method of matrimonial ring search).

Window components

Buttons:

Drop-down menus:

 

Kinship dataset

 

How to establish a kinship dataset

 

 

 

Data are not only a result but also a means of data collection.

 

They should be easily accessible in order to guide your research and to cross-check your informant’s answers. When dealing with archives, this is often fairly simple: you can take a computer with you. But in many fieldwork situations this is not possible. However, noting kinship ‘by hand’ can be extremely fast and efficient, if some basic principles are observed.

 

— Always use a compact medium, such as a notebook. Do not use file sheets or loose papers. You cannot use them during interviews, and there is a high risk of losing some of them.

 

— Separate graphics and text. A good method is to use a notebook with the left page for drawing genealogies, the right page for listing the individuals and their properties, and numbers for identifying these individuals (if numbers get large, it is recommended to use initial letters in addition, to prevent identification problems in case of numbering errors).
— Attribute an identity number to each individual and never attribute that number to another individual. If you have ‘doubles’, make a link to the original number but do not re-assign it. Holes in the series of numbers do not cause any damage, but ambiguities in identity numbers cause much damage, and are extremely difficult to detect.

 

Do not use identity numbers as codes. Identity numbers serve to identify individuals - and nothing else (except, perhaps, to recall the order in which you have entered them and to document the history of your corpus). If you want to convey information on individuals’ gender, clan affiliation, residence, etc., do not use identity numbers for that.

 

Never forget to make copies regularly and to store them in different places. This holds for all data, but especially for kinship data, due to the network properties of kinship: one lost notebook may render twenty others useless.

 

 

 

Frequently asked questions

 

— Do I have to number individuals continuously?
— No. Discontinuous numbering is no problem for Puck or for most other genealogical programs. Pajek requires continuous numbering, but Puck can convert datasets into pajek file format, renumbering them without losing the information on original numbers (by using the option "numbered" for export). However, you should avoid overly large gaps between identity numbers, because some search methods may become more time-consuming.

 

— Some individuals in my dataset are doubles, do not exist, or have become obsolete. Can I delete them ?
— Yes - but do not reassign their identity numbers to other individuals! Just leave their positions empty. In the case of doubles, it can be useful to keep them in your dataset, so that you can easily find information on the individuals in the different places in your notebooks. You can mark them as doubles by giving them the identity number of the original as a name. If needed, you can always eliminate them with the eliminate doubles option.

 

— How do I code kinship relations between individuals when I do not know the exact genealogical chain?
— If you know the exact genealogical relation, you may introduce virtual individuals (with « # » as a name) into your dataset as intermediary links (for instance, if you know that A is B’s paternal brother, you may introduce a virtual common father). Make sure, however, that the kinship term people give you really corresponds to the supposed genealogical relation (in many societies, kinship terms may designate large classes of relations, some of which may have no genealogical foundation whatsoever!). If you are not 100% sure that your « brother » really is a brother in a genealogical sense, you should rather store the information in a note or as a relational property of the individuals concerned.

 

— How do I code divorced spouses ?
— Like all other spouses, living or dead, married or divorced. You can store the information on divorce among the individuals’ properties, using the property code DIV.

 

 

 

 

File formats

 

Kinship data can be stored in files of different formats.

All example files are representations of the Ragusan corpus of Irmgard Mahnken (click here for more information).

Text format (file extension .txt) [example]

This is a tab delimited text file, composed of two blocks separated by an empty line.

The first block contains the basic information for each individual in separate columns:

  • A unique identity number (ID)

  • Name(s), where different name parts are separated by a slash (/)

  • Gender: M or H (man), W or F (woman), X (gender unknown). Gender letters are not case sensitive

  • Father's ID number

  • Mother's ID number

  • Spouse(s) ID number, where the ID numbers of different spouses appear in different columns

A headline may be convenient for data entry, but is not necessary for Puck.

Attention! If you use a headline, do not call the column of identity numbers " ID " (rather use " Id " or " Nr " or something similar), otherwise the file will not be opened by Microsoft Excel (this has nothing to do with Puck, but is a general Microsoft Excel bug).

In the case of multiple spouses, there are two possibilities (which may be combined):

  • either the ID numbers of an individual's spouses appear in one single line but different columns (from the sixth column on). This is the output produced by Puck and the most convenient solution if an individual's spouses are immediately known

  • or the ID numbers of an individual's spouses appear in the same column (the sixth) but in different lines, which means that the individual's ID number has to be entered several times. This solution may be more comfortable if information on an individual's spouses is dispersed.

In the second block, each line contains an individual's ID number and, in successive columns, items of supplementary information concerning the individual. These items are the values of individual properties, columns being labelled according to individual property codes.
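The layout described above can be read with a few lines of Python. This is only a minimal sketch, not Puck's own import routine: the function names `parse_text_format` and `to_id` are hypothetical, the second block is assumed to start with a headline that labels its columns, and both ways of entering multiple spouses (several columns, or several lines repeating the same ID) are merged.

```python
GENDER = {"M": "M", "H": "M", "W": "F", "F": "F", "X": "X"}


def to_id(cell):
    """Return an ID number, or None for an empty cell."""
    cell = cell.strip()
    return int(cell) if cell else None


def parse_text_format(text):
    """Parse the two tab-delimited blocks into (individuals, properties)."""
    # The two blocks are separated by an empty line.
    blocks = text.strip("\n").split("\n\n", 1)
    individuals = {}
    for line in blocks[0].splitlines():
        cols = line.split("\t")
        if not cols[0].strip().isdigit():
            continue  # optional headline
        cols += [""] * (6 - len(cols))  # pad short lines
        ident = int(cols[0])
        # Repeated lines for the same ID reuse the record created first.
        person = individuals.setdefault(ident, {
            "name": cols[1].split("/"),
            "gender": GENDER.get(cols[2].strip().upper(), "X"),
            "father": to_id(cols[3]),
            "mother": to_id(cols[4]),
            "spouses": [],
        })
        # Spouse IDs start in the sixth column.
        for cell in cols[5:]:
            spouse = to_id(cell)
            if spouse is not None and spouse not in person["spouses"]:
                person["spouses"].append(spouse)
    properties = {}
    if len(blocks) > 1:
        lines = blocks[1].splitlines()
        header = lines[0].split("\t")  # assumption: first line holds property codes
        for line in lines[1:]:
            cols = line.split("\t")
            if cols[0].strip().isdigit():
                properties[int(cols[0])] = dict(zip(header[1:], cols[1:]))
    return individuals, properties
```

Calling `parse_text_format(open(path).read())` on a small test file is an easy way to check a manually entered dataset before importing it.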

Excel table format (file extension .xls) [example]

The table format is the most simple format for entering data " manually " (without using the Puck data entry form or another genealogy program).

It is organized just as a file in text format, with the only difference that the various blocks are on different spreadsheets.

Excel table format is defined as the default format for the export of matrimonial census reports as well as for matrices and statistics. To change the default format for these outputs to txt, deactivate the option "reports in excel format" in the general options window.

Pajek network format (file extension .paj) [examples]

Kinship data in pajek format can be transformed, manipulated and analyzed with the computer program pajek.

For a free download of pajek click here.

For an introduction to pajek see the pajek manual (in pdf format).

For a series of macros that can be used to analyze kinship data with Pajek see Tip4Pajek.

Puck exports data into a pajek project file (that is, a package of network, partition and vector files), which contains the basic information for each individual in a network, and the supplementary information in partitions and vectors.

The basic information is stored as a network which consists of two parts:

  • A vertex list, where each individual is represented by a line containing

    • A current index (which must be continuous).

Warning! This index is generally not identical with the individual's ID number. In particular, it is never identical with it if the network represents a subcorpus of the original corpus. In order to save original ID numbers when exporting to a paj file, use the "numbered" option in the Export window.

    • The individual's name between apostrophes, where different name parts are separated by a slash (/).

Warning! Make sure that there are no apostrophes within any individual's name!

    • A geometrical figure name indicating the individual's gender: triangle for male, ellipse or circle for female, square for unknown gender

  • An arc list, where each arc of the network is represented by three numbers: the indices of the two vertices connected by the arc, and the value of the arc, using the codes of kinship networks in tip format:

1 for an arc connecting wife and husband

2 for an arc connecting mother and daughter

3 for an arc connecting mother and son

4 for an arc connecting father and daughter

5 for an arc connecting father and son
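The two ingredients of the network part, continuous indices and arc values, can be sketched as follows. The helper names `continuous_index` and `parent_child_code` are hypothetical, and the sketch makes no claim about which end of an arc Puck writes first, since the text above does not specify it.

```python
# Arc values as listed above (codes of kinship networks in tip format).
ARC_CODES = {
    1: "wife - husband",
    2: "mother - daughter",
    3: "mother - son",
    4: "father - daughter",
    5: "father - son",
}


def continuous_index(ids):
    """Map original ID numbers to continuous vertex indices 1..n."""
    return {ident: i for i, ident in enumerate(sorted(ids), start=1)}


def parent_child_code(parent_gender, child_gender):
    """Return the arc value for a parent-child tie ('M' male, 'F' female).

    Unknown genders raise KeyError: the list above defines no code for them.
    """
    codes = {("F", "F"): 2, ("F", "M"): 3, ("M", "F"): 4, ("M", "M"): 5}
    return codes[(parent_gender, child_gender)]
```

For example, a corpus with the ID numbers 3, 7 and 10 gets the vertex indices 1, 2 and 3, which is why the original numbers have to be saved separately.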

Any supplementary information is stored in partitions (.clu) or vectors (.vec): These are simple lists of numbers, where each number corresponds to a different cluster of the partition or a different value of the vector. Each partition or vector should be named according to individual property codes.

Vectors are used for properties with numeric values, partitions for all others. As a consequence, a cluster value is just a label to identify clusters without any intrinsic meaning.

Warning: Original cluster labels (property values) are lost in pajek format!

Gedcom format (file extension .ged) [example]

This is the format used by most genealogy programs (commercial and noncommercial).

For an introduction to gedcom formats click here

The individual property codes used by Puck correspond, as far as possible, to standard gedcom codes.

Tip format (file extension .tip) [example]

This is the format most rapidly read by Puck. In this format a corpus is stored when pressing the "Save" button.

The tip format stores all information in three kinds of lines, distinguished by the first number of the line. The different items of each entry are separated by tabs:

0 Identity line: contains

a. the individual's ID number

b. gender number (0 male, 1 female, 2 unknown gender)

c. name (different parts being separated by a slash (/))

1 Kinship line: contains

a. the individual's ID number

b. alter's ID number

c. the code of the kinship relation (0 father, 1 mother, 2 spouse)

2 Property line: contains

a. the individual's ID number

b. the label of the property, using individual property codes

c. the property value

d. the place (in case of cv events)

e. the date (in case of cv events)

f. alter's ID number (in case of cv events)

There is no particular order prescribed. Although Puck exports data ordered by individuals, any information can be added manually at the end of the file.
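Because every line announces its own type, a tip file can be read line by line in any order. The following is a minimal sketch under that reading, not Puck's actual reader; `parse_tip` is a hypothetical name, and lines starting with an asterisk (general corpus information) are simply skipped.

```python
GENDERS = {0: "male", 1: "female", 2: "unknown"}
RELATIONS = {0: "father", 1: "mother", 2: "spouse"}


def parse_tip(text):
    """Split tip lines into individuals, kinship ties and properties."""
    individuals, kinship, properties = {}, [], []
    for line in text.splitlines():
        if not line.strip() or line.startswith("*"):
            continue  # blank lines and general corpus information
        fields = line.split("\t")
        fields += [""] * (7 - len(fields))  # pad missing trailing items
        kind, ident = fields[0], int(fields[1])
        if kind == "0":  # identity line: ID, gender number, name
            individuals[ident] = {
                "gender": GENDERS[int(fields[2])],
                "name": fields[3].split("/"),
            }
        elif kind == "1":  # kinship line: ID, alter's ID, relation code
            kinship.append((ident, int(fields[2]), RELATIONS[int(fields[3])]))
        elif kind == "2":  # property line: ID, label, value, place, date, alter
            properties.append({
                "id": ident,
                "label": fields[2],
                "value": fields[3],
                "place": fields[4],
                "date": fields[5],
                "alter": int(fields[6]) if fields[6].strip() else None,
            })
    return individuals, kinship, properties
```

Since no order is prescribed, the parser collects kinship and property lines in lists and does not require the identity line of an individual to come first.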

Extensible Markup Language format (file extension .xml) [example]

Xml is a widely used language for the representation of arbitrary data structures, for example in web services (click here for a detailed description of its encoding rules). It is used as the standard output format of the Kinship Editor.

For kinship datasets, the following markup terminology has been defined:

Prolog format (file extension .pl) [example]

Prolog is a general-purpose logic programming language. The program logic is expressed in terms of relations. For more information on the prolog language click here. For an introduction to the use of prolog for representing kinship relations click here. To download a free prolog implementation click here.

In prolog format, all relations and attributes are represented as pairs of the form r(a, b), where r is the name of the attribute or relation, a is the identity number of ego (preceded by the letter p), and b is either an attribute value (between single quotes) or the identity number of alter (preceded by the letter p). For example:

daughter(p1,p2) means that p1 is p2's daughter

gname(p1,'Mary') means that p1 has the name 'Mary'

The prolog format for kinship data readable by Puck or the Kinship Editor uses the following terminology:

relations: father, mother, daughter, son, husband, wife

attributes: gname, sex, info1, info2 etc.
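Facts of this form are regular enough to be extracted with a single regular expression. The sketch below is an illustration only: `parse_facts` is a hypothetical name, and it assumes that attribute values are written between single quotes, as in the gname example.

```python
import re

# name(pEGO, pALTER) for relations, name(pEGO, 'value') for attributes
FACT = re.compile(r"(\w+)\(\s*p(\d+)\s*,\s*(?:p(\d+)|'([^']*)')\s*\)")


def parse_facts(text):
    """Return (relations, attributes) found in a prolog-format text."""
    relations, attributes = [], []
    for name, ego, alter, value in FACT.findall(text):
        if alter:  # second argument is another individual
            relations.append((name, int(ego), int(alter)))
        else:      # second argument is a quoted attribute value
            attributes.append((name, int(ego), value))
    return relations, attributes
```

Applied to the two example facts, this yields the relation `("daughter", 1, 2)` and the attribute `("gname", 1, "Mary")`.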

Special topics

General corpus information

General information on the corpus (author, coder, region and period, etc.) is placed at the top of txt, tip and paj files, each line preceded by an asterisk. Puck shows this information after import on the "front page" of the dataset (the corpus window).

Note: information preceded by asterisks is not read by pajek and has no effect (neither positive nor negative) when opening the paj file in pajek.

Note: In ged files, there are special gedcom codes for different kinds of information, which are not yet read by the current version of Puck.

Coding of multiple marriages

Multiple marriages can be coded in tip, ged and txt files.

Example: Individual nr. 1 was married to individual nr. 8 in Paris in 1986, and to individual nr. 10 in London in 1999.

In tip format:

2 1 MARR 1986 Paris 8

2 1 MARR 1999 London 10

In gedcom format:

0 @F1@ FAM

1 HUSB @I1@

1 WIFE @I8@

1 MARR

2 PLAC Paris

2 DATE 1986

0 @F2@ FAM

1 HUSB @I1@

1 WIFE @I10@

1 MARR

2 PLAC London

2 DATE 1999
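To illustrate how the gedcom coding above carries the same information as the two property lines, the following sketch collects one record per FAM entry. It is a simplified reader written for this example only (`marriages_from_gedcom` is a hypothetical name) and handles just the tags shown above.

```python
def marriages_from_gedcom(text):
    """Collect husband, wife, place and date from each FAM record."""
    marriages, current, in_marr = [], None, False
    for raw in text.splitlines():
        parts = raw.split(None, 2)
        if len(parts) < 2:
            continue  # blank or malformed line
        level, tag = parts[0], parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if level == "0":
            # In "0 @F1@ FAM" the record type comes after the reference.
            current, in_marr = None, False
            if value == "FAM":
                current = {"husband": None, "wife": None,
                           "place": None, "date": None}
                marriages.append(current)
        elif current is not None and level == "1":
            in_marr = tag == "MARR"
            if tag == "HUSB":
                current["husband"] = value.strip("@")
            elif tag == "WIFE":
                current["wife"] = value.strip("@")
        elif current is not None and level == "2" and in_marr:
            if tag == "PLAC":
                current["place"] = value
            elif tag == "DATE":
                current["date"] = value
    return marriages
```

For the example above, the function returns two records, both with husband `I1`, one for each marriage.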

 

Kinship relation notation

 

 

Codes for individual properties

 

 

 

Codes for relation and ring properties

 

 

 

 

Graphs

 

 

Paths and cycles

 

 

Subgraphs, chains and rings

 

 

Kinship network representations

 

 

Kinship networks (Ore graphs)

 

 

Kinship tracks and relations

 

 

Kinship chains and their properties

 

 

Matrimonial circuits

 

 

 

Kinship relations notations

A kinship relation can be represented in several different notations:

Standard notation

The conventional notation of kinship relations uses capital letters to indicate the type of eight basic kinship relations. These letters are mostly abbreviations of the corresponding English kinship term. They contain information on the gender of Alter and on the direction of the basic kinship relation (ascendance, descendance, marriage, as well as siblingship).

 

              Male Alter      Female Alter
Ascendance    F for father    M for mother
Descendance   S for son       D for daughter
Marriage      H for husband   W for wife
Siblingship   B for brother   Z for sister

 

The gender of Ego must be indicated by additional signs such as ♂ [male Ego] or ♀ [female Ego] placed before the initial letter.

These basic relations are composed into more complex relations by simple juxtaposition of letters according to their position in the kinship chain, starting from ego (as in English, but contrary, for example, to French, where kinship terms have to be composed starting with alter!). The resulting combination of letters can be read as a direct abbreviation of an English kinship term: MBD (mother's brother's daughter, a matrilateral cross-cousin), ZH (sister's husband, a brother-in-law), FWS (father's wife's son, a step-brother) are examples of this.

Half-sibling relations are distinguished from full sibling relations by using an explicit combination of ascendance and descendance letters instead of sibling letters: for instance, FS (father's son, paternal half-brother).

In addition to genealogical relations, relative age can be indicated by the lower-case letters e (elder) and y (younger) placed before the kinship letter concerned: for instance, FeB (father's elder brother), MyZ (mother's younger sister).
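The reading rule for standard notation (juxtaposition from ego outwards, with optional e/y age markers) is simple enough to be written out. The function name `expand` is hypothetical and the sketch covers only the eight basic letters and the two age markers described above.

```python
TERMS = {
    "F": "father", "M": "mother", "S": "son", "D": "daughter",
    "H": "husband", "W": "wife", "B": "brother", "Z": "sister",
}
AGE = {"e": "elder", "y": "younger"}


def expand(relation):
    """Spell out a standard-notation relation as an English kinship term."""
    words, age = [], None
    for letter in relation:
        if letter in AGE:
            age = AGE[letter]  # applies to the next kinship letter
        elif letter in TERMS:
            term = TERMS[letter]
            words.append(f"{age} {term}" if age else term)
            age = None
        else:
            raise ValueError(f"unknown letter in relation: {letter!r}")
    return "'s ".join(words)
```

For instance, `expand("MBD")` gives "mother's brother's daughter" and `expand("FeB")` gives "father's elder brother".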

Standard kinship notation is highly intuitive and easy to read (at least for anglophones). However, it expresses the ethnocentric viewpoint of English kinship terminology and, by using simple abbreviations, tells us little or nothing about the structure of the kinship relation. It is therefore certainly not the best tool for analytical purposes.

Positional notation

In positional notation (developed by L. Barry 2004), a kinship relation is represented by a sequence of letters indicating gender (by abbreviations of the French terms H - homme - for male, and F - femme - for female) and two diacritical signs:

Relations of ascendance and descendance are indicated by simple juxtaposition, where direction changes after every pair of parentheses and every marriage dot. By convention, the starting direction is ascendance.

By replacing gender letters with the variable X, more comprehensive classes of kinship relations can be represented in positional notation. For instance, X(H)X denotes paternal half-siblings, XX(X)F direct aunts, X(F)FH uterine nephews.
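The role of the variable X can be illustrated by a small matching function: a fully specified relation belongs to a class if it agrees with the schema in every position, with X standing for either gender letter. This is a sketch of that idea only; `matches` is a hypothetical name and says nothing about how Puck compares relations internally.

```python
def matches(schema, relation):
    """Check whether a fully specified positional relation fits a schema.

    X in the schema stands for H or F; parentheses and dots must coincide.
    """
    if len(schema) != len(relation):
        return False
    for s, r in zip(schema, relation):
        if s == "X":
            if r not in ("H", "F"):
                return False
        elif s != r:
            return False
    return True
```

With this definition, H(H)F (a man's paternal half-sister) matches the schema X(H)X, whereas H(F)F does not.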

Note that the translation of kinship relations from standard notation (without using ♂ and ♀ signs for the gender of ego) into positional notation always implies the variable letter X in the first position!

Positional notation can be used not only to represent abstract kinship relations, but also concrete kinship chains. In this case, gender letters are replaced by identity numbers of the individuals in the respective positions.

The major advantages of this notation are:

P-graph notation

P-graph notation (first sketched by White and Jorion 1992, later elaborated) corresponds to the representation of kinship networks as P-graphs. As in positional notation, a kinship relation is represented by a sequence of letters indicating gender (by abbreviations of the French terms H - homme - for male, and F - femme - for female). However, these letters now do not indicate vertices in Ore graphs but arcs in P-graphs.

Capital letters are used if arcs point from children's marriages to parents' marriages, lower case letters if arcs point in the inverse direction.

If a parent marriage is linked to several children's marriages by one and the same individual (married several times to different spouses), the identity of the two arcs is marked by a full stop " . "

Numeric notation

Any linear kinship relation can be represented by a characteristic number

$\lambda(k) = \sum_{i} (1 + \sigma_i) \cdot 2^{i}$

where $k+1$ is the degree of the relation (see above) and $\sigma_i$ is the gender number (0 = male, 1 = female) of the $i$-th individual (starting with the ancestor, $i = 0$). For example:

relation   positional   binary   characteristic number
male ego   H            0        1∙2^0 = 1
M/S        HF           01       1∙2^0 + 2∙2^1 = 5
MF/DD      FFH          110      2∙2^0 + 2∙2^1 + 1∙2^2 = 10
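The three worked examples can be reproduced with a short function. Note the assumption made here: positions are indexed from the first letter of the positional string (i = 0), because that is the reading under which the table values 1, 5 and 10 come out. The name `characteristic_number` is hypothetical.

```python
SIGMA = {"H": 0, "F": 1}  # gender numbers: 0 = male, 1 = female


def characteristic_number(positional):
    """Sum (1 + sigma_i) * 2**i over the letters of a linear relation."""
    total = 0
    for i, letter in enumerate(positional):
        total += (1 + SIGMA[letter]) * 2 ** i
    return total
```

For FFH this gives 2∙1 + 2∙2 + 1∙4 = 10, in agreement with the table.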

 

Numeric notation of linear relations is an elaborated variant of the ahnentafel genealogical numbering system, dating from the 16th century.

Any compound kinship relation can be represented by a characteristic vector - that is, the sequence of the characteristic numbers of all linear relations it contains (identity included). By convention, every consanguineous relation is represented as a combination of an ascending and a descending relation. A simple ascending relation will therefore be represented as a combination of an ascending relation and identity. Thus, any complex relation of width n can be represented in numeric notation by a sequence of 2(n+1) characteristic numbers.

Full sibling relations can be represented in numeric notation by choosing the characteristic numbers of the corresponding relations with male apical ancestors and giving them negative sign.

Clearly, this is not a notation for "everyday use". Its major advantage is that it provides a simple vector representation of all kinds of kinship relations which

Basic notation

Basic notation consists of a chain of gender indices (0 for male, 1 for female, 2 for unknown), with a dot after each apical vertex and between pivotal vertices.

Basic notation is an extremely simple analytical notation that is very suitable for computer analysis but not very readable for humans. It is used as a basis for Puck's construction of circuits from signatures.

Some examples

Positional      Standard   Numeric        P-graph      Basic
H()F.(H)F       ZHD        -3 -4 1 4      HfH.ff       02.1.0.1.
H(F).H(H)HH     MHFSS      4 1 3 7        HF.fHH.hhh   01..00.00.
FF()F.H         MMZH       -10 -4 1 1     FFfH.h       112.1.0.
HH(F)FH.(F)H    FMDSWS     11 13 2 5      HHFfhF.fh    001.10.10.

 

Individual property codes are used to cluster individuals according to simple or more complex properties.

These codes can be used

Some of the cluster codes can be used as such (see the list below).

Attention: except for the cases listed below, property codes have to consist of a single word without blanks. For example, use "AGE_CLASS" and not "AGE CLASS", "EVENT-1" and not "EVENT 1".

Others permit or require specification by, or combination with, additional code:

Property specification

Attention: there must be a blank between the label "PEDG" or "PROG" and the number that follows! For example, the code PEDG3 cannot be read by Puck.

Attention: the specifications "place", "date" and "year" are fixed language-independent codes (they have to be used in any language). However, they may be used with lower-case or upper-case letters.

Attention: if text (or excel) format is used, the property and its specification ("place", "date" or "year") form a single heading for one column!

Cluster regrouping and periodization

Clusters can be regrouped into more comprehensive superclusters by combining the property with additional code. This is particularly meaningful in the case of date or year properties, where the comprehensive superclusters represent time periods. Regrouping can be done in different ways:

Cluster expansion / unilinear transfer of cluster affiliation

Cluster affiliation can be made "inheritable" in the paternal or maternal line. By letting the code "PATRIC" or "MATRIC" precede the property code, cluster affiliation is transmitted to all agnatic/uterine descendants.

Note: this method only makes sense if descendants' properties do not contradict those of their ascendants. In particular, it can be used to construct "clans", i.e. groups that recruit their members by unilinear filiation but do not necessarily consist of genealogically related members. In this case it is sufficient to indicate clan affiliation of apical ancestors and to use cluster expansion to get "clans". In a sense the "lineages" constructed by simple PATRIC and MATRIC codes are limiting cases of clan construction, each apical ancestor belonging to exactly one "clan".
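The idea of cluster expansion in the paternal line can be sketched as follows: every individual without a known affiliation receives that of the nearest agnatic ancestor who has one. The function name `expand_patrilineal` and the dictionary layout are assumptions made for this illustration; known affiliations of descendants are left untouched.

```python
def expand_patrilineal(fathers, affiliation):
    """Transmit cluster affiliation down the paternal line.

    fathers: individual ID -> father's ID (absent or None if unknown)
    affiliation: individual ID -> cluster, given for apical ancestors
    """
    result = dict(affiliation)
    for ident in fathers:
        line, current = [], ident
        # Walk up the paternal line until an affiliated ancestor is found.
        while current is not None and current not in result and current not in line:
            line.append(current)
            current = fathers.get(current)
        if current in result:
            for member in line:
                result[member] = result[current]
    return result
```

The maternal variant is obtained by passing a mapping from individuals to their mothers instead.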

Date properties : status at a point of time

The property code that consists of a simple date (for the moment, only years are possible) assigns to each individual its status (living, dead, unborn, married, widowed, or unknown) at that date.

Cumulation

Numeric data (e.g. birth year or number of ancestors) can be partitioned in a cumulative manner by putting an addition sign ("+") or a subtraction sign ("-") before the code, depending on whether cumulation is to be done in ascending (<=, "at most", "before") or descending (>=, "at least", "after") order.

For example,

+PEDG 2 gives the numbers of individuals that have at most 1, 2, 3 etc. known grandparents

+BIRT Year *100 1500 gives the numbers of individuals that are born before 1500, 1600, 1700 etc.
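The difference between ascending and descending cumulation can be made concrete with a few lines of code. This is a sketch with hypothetical names (`cumulate`) and made-up birth years, not Puck's statistics routine.

```python
def cumulate(values, thresholds, sign="+"):
    """Count values cumulatively for each threshold.

    "+" counts values <= threshold (ascending, "at most"),
    "-" counts values >= threshold (descending, "at least").
    """
    if sign == "+":
        return {t: sum(1 for v in values if v <= t) for t in thresholds}
    if sign == "-":
        return {t: sum(1 for v in values if v >= t) for t in thresholds}
    raise ValueError("sign must be '+' or '-'")
```

For the birth years 1480, 1550, 1620 and 1710 and the thresholds 1500, 1600 and 1700, the "+" variant counts 1, 2 and 3 individuals, the "-" variant 3, 2 and 1.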

Warning: cumulative statistics are not compatible with cross partitioning. If a split criterion (for cross partitioning) has been chosen, the overall cumulation statistics will be correct but will not be reflected in the split diagrams. Deactivate the split option in order to produce the diagrams for cumulative statistics!

Binarization

Any individual property can be binarized into a true/false dichotomy by entering label and value of the property, joined by the equality sign ("=").

For example, the code "BIRT Place = Paris" partitions the individuals of the corpus into born Parisians and Non-Parisians.

If only a part of the property value has to match the entry, then add an asterisk.

For example, the code "LASTN = *Mor" partitions the corpus into people named Moreschi, Morpurgo, Morrovalle etc. and all the others.

If the property is numeric (for example, birth year), it can also be binarized by assigning an interval rather than a single value. This is done by entering, instead of a single value, two limiting values joined by a hyphen (the interval is inclusive, that is, limiting values belong to it).

For example, the code "BIRT Year = 1701-1800" partitions the individuals of the corpus into those born in the 18th century and those born in other centuries.
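The three forms of binarization (exact value, partial match, inclusive interval) can be summarized in one small function that turns a code into a true/false test. The name `binarize` is hypothetical, and treating the asterisk as "the value contains this text" is an assumption made for this sketch.

```python
def binarize(code):
    """Turn a code such as 'BIRT Year = 1701-1800' into (label, test)."""
    label, _, value = code.partition("=")
    label, value = label.strip(), value.strip()
    low, sep, high = value.partition("-")
    if sep and low.strip().isdigit() and high.strip().isdigit():
        lo, hi = int(low), int(high)
        # Inclusive interval: limiting values belong to it.
        return label, lambda v: lo <= int(v) <= hi
    if "*" in value:
        part = value.replace("*", "")
        # Assumption: the asterisk asks for a partial match.
        return label, lambda v: part in str(v)
    return label, lambda v: str(v) == value
```

The returned test can then be applied to each individual's property value to split the corpus into two clusters.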

Partitioning is particularly useful for enlarging or restricting a relational or matrimonial census by redefining spouses.

Endogenous and exogenous properties

Some special endogenous properties

Generation

Generation and generational distance are not unique concepts. Except in kinship networks that consist of trees, there are usually several alternative ways to arrange an individual on a generational level inferior to its ascendants and superior to its descendants. For instance, if a man has married his sister's daughter, his children will be at the same time grandchildren and great-grandchildren of his father. One has to decide on the path along which generational distance shall be calculated.

The algorithm used by Puck is identical to that of Pajek. It consists in navigating through the network along kinship paths and assigning to parents, spouses and children of each individual the generational level of that individual, augmented by 1, 0 or -1 according to the nature of the kinship tie.

Note: the identity of the algorithm does not necessarily imply the identity of the results. The result depends on the navigation path, which may be different for Pajek and Puck, since arcs are not necessarily stored in the same order.
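The principle of the algorithm can be sketched as a traversal that assigns a level to each newly reached individual. The breadth-first order used here is an assumption chosen for the example (as noted above, the actual navigation order determines the result), and the function name `generations` is hypothetical.

```python
from collections import deque


def generations(parents, spouses, start):
    """Assign generational levels relative to a starting individual.

    parents: individual ID -> list of parent IDs
    spouses: individual ID -> list of spouse IDs
    """
    children = {}
    for child, ps in parents.items():
        for p in ps:
            children.setdefault(p, []).append(child)
    level = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        steps = ([(p, 1) for p in parents.get(person, [])]      # parents: +1
                 + [(s, 0) for s in spouses.get(person, [])]    # spouses: 0
                 + [(c, -1) for c in children.get(person, [])])  # children: -1
        for other, delta in steps:
            if other not in level:  # the first path reached decides the level
                level[other] = level[person] + delta
                queue.append(other)
    return level
```

Because each individual keeps the level of the first path that reaches it, a different traversal order may place the same individual on a different level when the network contains marriages between generations.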

 

 

 

 

Codes

In addition to the standardized codes listed above, you are free to enter any other property label you want.

Warning: only use single-word codes - Puck does not allow for empty spaces in property codes!

Note: property codes are fixed and language-independent. They do not change by switching from one language to another.

Endogenous properties:

Exogenous properties (Gedcom codes):

Using this property for redefining spouses in order to effect a second relational or matrimonial census permits a complex matrimonial or relational census (multiple kinship relations or intersecting matrimonial rings).

Binarizing this property according to place, date or period and using this binarized property for redefining spouses in order to effect a second relational or matrimonial census permits a restricted matrimonial census.