- How to Establish a Kinship Dataset
- File Formats for Kinship Data
- Kinship Relations Notation
- Property Codes
- Bibliographic References
A genealogical corpus is a set of individuals linked by relations of kinship and marriage, together with basic and supplementary information coded for each individual:
- A unique identity number (ID)
- Gender: H (man), F (woman), X (gender unknown)
- Father’s ID number
- Mother’s ID number
- Spouse(s) ID number
- Biographical information (birth, marriage and death dates and places, other properties)
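The fields listed above can be pictured as one record per individual. The following is a minimal sketch in Python, not Puck's internal data model: the class name `Individual`, the function `check_corpus` and the dictionary layout are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Individual:
    """One coded individual (hypothetical layout, not Puck's own)."""
    id: int                                   # unique identity number
    gender: str = "X"                         # "H" (man), "F" (woman), "X" (unknown)
    father: Optional[int] = None              # father's ID number, if known
    mother: Optional[int] = None              # mother's ID number, if known
    spouses: List[int] = field(default_factory=list)
    properties: Dict[str, str] = field(default_factory=dict)


def check_corpus(corpus: Dict[int, Individual]) -> List[str]:
    """Return a list of problems: invalid gender codes or dangling ID references."""
    problems = []
    for ind in corpus.values():
        if ind.gender not in ("H", "F", "X"):
            problems.append(f"{ind.id}: invalid gender {ind.gender!r}")
        for role, ref in (("father", ind.father), ("mother", ind.mother)):
            if ref is not None and ref not in corpus:
                problems.append(f"{ind.id}: unknown {role} {ref}")
        for spouse in ind.spouses:
            if spouse not in corpus:
                problems.append(f"{ind.id}: unknown spouse {spouse}")
    return problems
```

A check of this kind is useful precisely because kinship data form a network: a single mistyped ID number silently attaches an individual to the wrong family.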
Tips for Collecting Kinship Data
Data are not only a result but also a means of data collection. They should be easily accessible in order to guide your research and to cross-check your informants’ answers. When working in archives, this is often fairly simple: you can take a computer with you. In many fieldwork situations, however, this is not possible. Even so, recording kinship "by hand" can be extremely fast and efficient if some basic principles are observed:
- Always use a compact medium, such as a notebook. Do not use file sheets or loose papers. You cannot use them during interviews, and there is a high risk of losing some of them.
- Separate graphics and text. A good method is to use a notebook with the left page for drawing genealogies, the right page for listing the individuals and their properties, and numbers for identifying these individuals (if the numbers get large, it is advisable to add initials to prevent identification problems in case of numbering errors).
- Assign an identity number to each individual and never assign that number to another individual. If you have "doubles", make a link to the original number but do not reassign it. Gaps in the series of numbers do no harm, but ambiguous identity numbers do a great deal of harm and are extremely difficult to detect.
- Do not use identity numbers as codes. Identity numbers serve to identify individuals and nothing else (except, perhaps, to recall the order in which you entered them and to document the history of your corpus). If you want to convey information on an individual’s gender, clan affiliation, residence, etc., do not use identity numbers for that.
- Make copies regularly and store them in different places. This holds for all data, but especially for kinship data, because of the network properties of kinship: one lost notebook may render twenty others useless.
- Do I have to number individuals continuously?
No. Discontinuous numbering is not a problem for Puck or for most other genealogical programs. Pajek requires continuous numbering, but Puck can convert datasets to the Pajek file format, renumbering individuals without losing the information on their original numbers (use the "numbered" export option). However, you should avoid very large gaps between identity numbers, because some search methods may become slower.
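The idea of renumbering without losing the original numbers can be sketched as follows. This is an illustration only and not Puck's export code: the function name `renumber` and the `{id: (father, mother)}` layout are assumptions.

```python
def renumber(parents):
    """Renumber individuals continuously from 1, keeping a concordance.

    parents: dict mapping an ID to a (father_id, mother_id) pair, with None
    for unknown parents. Returns (renumbered, concordance), where
    concordance maps each original ID to its new ID.
    """
    # Collect every ID that occurs, including parents without their own entry.
    ids = set(parents)
    for father, mother in parents.values():
        ids.update(p for p in (father, mother) if p is not None)

    concordance = {old: new for new, old in enumerate(sorted(ids), start=1)}

    renumbered = {}
    for old in sorted(ids):
        father, mother = parents.get(old, (None, None))
        renumbered[concordance[old]] = (
            concordance[father] if father is not None else None,
            concordance[mother] if mother is not None else None,
        )
    return renumbered, concordance
```

Keeping the concordance alongside the renumbered data makes it possible to go back from any result to the numbers used in the field notebooks.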
- Some individuals in my dataset are doubles, do not exist, or have become obsolete. Can I delete them?
Yes, but do not reassign their identity numbers to other individuals! Just leave their positions empty. In the case of doubles, it can be useful to keep them in your dataset, so that you can easily find information on these individuals in the different places in your notebooks. You can mark them as doubles by giving them the identity number of the original as a name. If needed, you can always remove them later with the "eliminate doubles" option.
- How do I code kinship relations between individuals when I do not know the exact genealogical chain?
If you know the exact genealogical relation, you may introduce virtual individuals (named « # ») into your dataset as intermediary links (for instance, if you know that A is B’s paternal brother, you may introduce a virtual common father). Make sure, however, that the kinship term people give you really corresponds to the supposed genealogical relation: in many societies, kinship terms may designate large classes of relations, some of which may have no genealogical foundation whatsoever. If you are not 100% sure that your « brother » really is a brother in a genealogical sense, you should instead store the information in a note or as a relational property of the individuals concerned.
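The virtual-father example can be sketched in a few lines. The helper `add_virtual_father` and the dictionary layout are hypothetical, and the choice of the next free number is a simplification: a real corpus should also avoid the numbers of deleted individuals.

```python
def add_virtual_father(corpus, child_a, child_b):
    """Link two individuals known to share a father through a virtual one.

    corpus: dict mapping ID -> {"name", "gender", "father", "mother"}.
    Returns the ID given to the virtual individual, whose name is "#".
    """
    for child in (child_a, child_b):
        if corpus[child]["father"] is not None:
            raise ValueError(f"individual {child} already has a father")

    # Simplification: take the number after the highest one currently in use.
    virtual_id = max(corpus) + 1
    corpus[virtual_id] = {"name": "#", "gender": "H", "father": None, "mother": None}
    corpus[child_a]["father"] = virtual_id
    corpus[child_b]["father"] = virtual_id
    return virtual_id
```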
- How do I code divorced spouses?
Like all other spouses, living or dead, married or divorced. You can store the information on divorce among the individual’s properties (see also File Formats for Kinship Data).
This page contains some references to the Kinsources project website. To know more about Kinsources, click here.
Kinship data can be stored in files of different formats:
- Text and Excel format (file extensions .txt and .xls)
- Pajek Network format (file extension .paj)
- Gedcom format (file extension .ged)
- Kinship editor xml format (file extension .xml)
- Prolog format (file extension .pl)
A kinship relation can be represented in several different notations. Puck basically uses two of them: the standard notation and the positional notation.
The standard (conventional) notation of kinship relations uses capital letters to indicate eight basic types of kinship relation. These letters are mostly abbreviations of the corresponding English kinship terms. They carry information on the gender of alter and on the direction of the basic kinship relation (ascent, descent, marriage or siblingship). The following table shows its logic:
These basic kinship relations are composed into more complex ones by simple juxtaposition of letters according to their position in the kinship chain, starting from ego (as in English, but contrary, for example, to French, where kinship terms have to be composed starting from alter). The gender of ego must be indicated by additional signs such as ♂ (male ego) or ♀ (female ego) placed before the initial letter. The resulting combination of letters can be read as a direct abbreviation of an English kinship term: MBD (mother’s brother’s daughter, a matrilateral cross-cousin), ZH (sister’s husband, a brother-in-law) and FWS (father’s wife’s son, a step-brother) are examples of this.
Half-sibling relations are distinguished from full sibling relations by using an explicit combination of ascent and descent letters instead of sibling letters: for instance, FS (father’s son, paternal half-brother). In addition to genealogical relations, relative age can be indicated by the lowercase letters e (elder) and y (younger) placed before the kinship letter concerned: for instance, FeB (father’s elder brother), MyZ (mother’s younger sister). Standard kinship notation is highly intuitive and easy to read (at least for anglophones). However, it expresses the ethnocentric viewpoint of English kinship terminology and, because it relies on simple abbreviations, tells us little or nothing about the structure of the kinship relation. It is therefore certainly not the best tool for analytical purposes.
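Reading standard notation as an abbreviated English kinship term can be sketched as a simple lookup. The sketch assumes the usual eight letters of this notation (F, M, B, Z, S, D, H, W), all of which occur in the examples above; the function name `expand` is hypothetical and the ♂/♀ ego signs are not handled.

```python
# Letter codes of standard notation (Z is used for sister, since S means son).
TERMS = {
    "F": "father", "M": "mother",
    "B": "brother", "Z": "sister",
    "S": "son", "D": "daughter",
    "H": "husband", "W": "wife",
}
# Lowercase relative-age markers placed before the letter concerned.
AGE = {"e": "elder", "y": "younger"}


def expand(notation):
    """Spell out a standard-notation string as an English kinship term."""
    words = []
    age = None
    for symbol in notation:
        if symbol in AGE:
            age = AGE[symbol]            # applies to the next kinship letter
        elif symbol in TERMS:
            term = TERMS[symbol]
            words.append(f"{age} {term}" if age else term)
            age = None
        else:
            raise ValueError(f"unknown symbol {symbol!r}")
    return "'s ".join(words)
```

For instance, `expand("MBD")` spells out the matrilateral cross-cousin and `expand("FeB")` the father's elder brother.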
In the positional notation, developed by Laurent Barry (Barry, 2004), a kinship relation is represented by a sequence of letters indicating gender (abbreviations of the French terms: H, for homme, male; F, for femme, female) and two diacritical signs:
- The point or full stop “.”, which indicates marriage;
- The parentheses “()”, which surround an apical position, that is, the position of an individual who is not a descendant of any of its neighbors. If both neighbors are spouses, the parentheses may be dropped.
Relations of ascendance and descent are indicated by simple juxtaposition, where direction changes after every pair of parentheses and every marriage dot. By convention, the starting direction is ascendance.
By replacing gender letters with the variable X, more comprehensive classes of kinship relations can be represented in positional notation. For instance, X(H)X denotes paternal half-siblings, XX(X)F direct aunts, X(F)FH uterine nephews.
Note that the translation of kinship relations from standard notation (without using ♀ and ♂ signs for the gender of ego) into positional notation always implies the variable letter X in the first position.
Positional notation can be used not only to represent abstract kinship relations, but also concrete kinship chains. In this case, gender letters are replaced by identity numbers of the individuals in the respective positions.
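The use of the variable X as a classification tool can be sketched as a character-by-character comparison. This is only an illustration, assuming relations are written as plain strings of H, F, X, dots and parentheses; the function names `generalize` and `matches` are hypothetical and are not part of Puck.

```python
def generalize(relation):
    """Replace every gender letter by the variable X."""
    return relation.replace(" ", "").replace("H", "X").replace("F", "X")


def matches(pattern, relation):
    """Tell whether a concrete relation belongs to the class given by pattern.

    X in the pattern stands for either gender; every other character
    (H, F, dot, parentheses) must be identical in both strings.
    """
    pattern = pattern.replace(" ", "")
    relation = relation.replace(" ", "")
    if len(pattern) != len(relation):
        return False
    return all(
        p == r or (p == "X" and r in ("H", "F"))
        for p, r in zip(pattern, relation)
    )
```

For example, the class X(H)X of paternal half-siblings contains H(H)F (a man and his paternal half-sister) but not H(F)F, whose apical position is a woman.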
The major advantages of positional notation are:
- The clear representation of the structural properties of kinship relations, which remain visible under symmetry transformations: HF()HF becomes FH()FH, whereas MBD becomes FZS;
- The integration of the sex of ego and not only of alter;
- The applicability not only as a notation but also as a classification tool (by use of gender variables);
- The homogeneity of the notations for kinship chains (with individual numbers), kinship relations (with gender letters) and kinship relation classes (with gender variables).
The following table shows some examples of the translation of kinship relations from positional to standard notation:
Endogenous and exogenous properties are designated by standard codes. In addition to the standardized codes listed here, you are free to enter any other property label you want.
Warning: use only single-word codes. Puck does not allow spaces in property codes.
Note: property codes are fixed and language-independent. They do not change when you switch from one language to another.
Main Endogenous Properties
“Endogenous” criteria of classification are calculated by Puck from the genealogical data and are derived automatically from the kinship network itself : sibling group size, number of known ascendants, number of spouses, etc. They need not and should not be explicitly specified, and their codes should not be used to enter properties or to load them from a file.
- ALL - a pseudo-property that serves to remove a partition and to restore the unity of the underlying corpus
- BIRTH_ORDER - birth order
- GENDER - gender
- GEN - generation (see here)
- FIRSTN - first name
- LASTN - last name
- FRATP - father; agnatic sibling group
- FRATM - mother; uterine sibling group
- PATRIC - agnatic apical ancestor, “patrilineage”
- MATRIC - uterine apical ancestress, “matrilineage”
- PATRID - distance to the agnatic apical ancestor, “agnatic generation”
- MATRID - distance to the uterine apical ancestress, “uterine generation”
- DEPTH - distance to the most remote ancestor, maximal generational depth
- MDEPTH - mean distance to ancestors, mean generational depth. The formula was defined by Cazes & Cazes (1996)
- PEDG x - number of ascendants (where x is a number specifying generational distance)
- PROG x - number of descendants (where x is a number specifying generational distance)
Note: The properties PEDG (pedigree) and PROG (progeny) require a number that indicates generational distance. For instance, PEDG 2 is the number of grandparents, PROG 1 the number of children.
- SPOU - number of spouses
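What properties such as PEDG x, PROG x and DEPTH measure can be sketched on a small parent table. This is not Puck's implementation: the function names and the `{id: (father, mother)}` layout are assumptions, and the sketch counts distinct individuals, so an ancestor reached by two paths is counted once.

```python
def pedg(parents, ego, x):
    """Number of known ascendants of ego at generational distance x."""
    level = {ego}
    for _ in range(x):
        level = {
            p
            for i in level
            for p in parents.get(i, (None, None))
            if p is not None
        }
    return len(level)


def prog(parents, ego, x):
    """Number of known descendants of ego at generational distance x."""
    level = {ego}
    for _ in range(x):
        level = {
            child
            for child, pair in parents.items()
            if level & {p for p in pair if p is not None}
        }
    return len(level)


def depth(parents, ego):
    """Distance from ego to its most remote known ancestor (assumes no cycles)."""
    known = [p for p in parents.get(ego, (None, None)) if p is not None]
    return 0 if not known else 1 + max(depth(parents, p) for p in known)
```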
Main Exogenous Properties
“Exogenous” classification criteria do not derive from the kinship network itself: dates of birth, death or marriage, profession, residence, religion, etc. Exogenous properties have to be specified explicitly for each individual, either in the file from which the corpus is loaded or by entering them in the data window. Puck uses the standard GEDCOM codes for exogenous properties.
- BIRT_DATE - birth date
- BIRT_PLACE - birth place
- DEAT_DATE - death date
- DEAT_PLACE - death place
- MARR - marriage (place/date/year/alter)
Note: This property can be binarized by place, date or period. Using the binarized property to redefine spouses and then running a second relational or matrimonial census yields a restricted matrimonial census.
- DIV - divorce (place/date/year/alter)
- BAP - baptism (place/date/year)
- BURI - burial (place/date/year)
- DECO - decoration (place/date/year)
- EDUC - education
- NATI - nationality
- OCCU - occupation
- RELI - religion
- RESI - residence
- TITL - title
a) According to the arc and edge pattern of lines:
- Length : the number of arcs and edges included (Roman degree in the case of consanguine relations)
- Height : the length of the longest linear chain included (German degree in the case of consanguine relations)
- Width : the number of marriage edges included (consanguine relations have width 1, relinking marriages width 2 or more.)
b) According to the gender pattern of vertices :
- Descent : agnatic, uterine or cognatic according to the gender of vertices in consanguine chains
- Crossness : cross or parallel according to the gender difference of intermediate pairs of vertices in consanguine chains
- Terminal crossness : cross or parallel according to the gender difference of terminal pairs of vertices in consanguine chains
c) According to symmetry features :
- Skewedness: horizontal, ascending or descending according to differences in the length of the linear chains composing a consanguine chain
- Automorphy: percentage of symmetry transformations that leave the kinship relation unchanged
- SIMPLE - the relation or ring type as such (the "finest" classification: each relation is in a separate class)
- LENGTH - length : the number of links between ego and alter (in consanguine relations this corresponds to civil or roman degree)
- HEIGTH - height : the maximal number of links to an apical ancestor (in consanguine relations this corresponds to canonic or germanic degree)
- WIDTH - width: the number of consanguine components implied in the relation
- SYM - symmetry: the number of automorphic transformations as a percentage of all possible transformations which leave gender and direction invariant
- HETERO - a binary property, true if all married couples as well as the pair ego/alter are heterosexual, false otherwise
- DEGREE - civil degree (number of links between consanguines)
- ENDS - gender combination of ego/alter
- SKEW - skewedness (generational distance between ego and alter)
- SKEW+ - skewedness (in three classes: horizontal, oblique, alterne)
- LINE - unilinearity type (agnatic, uterine, cognatic, bilateral or identity)
- AGNA - agnatic coefficient (percentage of agnatic links)
- UTER - uterine coefficient (percentage of uterine links)
- DRAV - dravidian crossness
- SWITCHES - number of gender switches
- ARCH - gender combination of the apical siblings (children of the apical ancestor of the relation), not defined for linear relations
- Status (allowed / not allowed / not defined) according to particular marriage systems
- DRAV-H - dravidian crossness (horizontal system, Chimane model)
- DRAV-O - dravidian crossness (oblique system, Parakana model)
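For a consanguine relation, length, height and skewedness follow directly from the two linear chains that meet at the apical ancestor. The sketch below is illustrative only: the function name, the dictionary keys and the sign convention for skewedness (ascending steps minus descending steps) are assumptions, not Puck's definitions.

```python
def consanguine_measures(up, down):
    """Basic measures of a consanguine chain.

    up:   number of ascending links from ego to the apical ancestor
    down: number of descending links from the apical ancestor to alter
    """
    return {
        "length": up + down,      # civil (Roman) degree
        "height": max(up, down),  # canonic (German) degree
        "skew": up - down,        # 0 for horizontal relations (sign is an assumption)
    }
```

For mother's brother's daughter (two links up, two links down) this gives length 4, height 2 and skew 0. For father's brother (two up, one down) it gives length 3, height 2 and skew 1.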
BARRY Laurent,
2004, "Historique et Spécificités techniques du programme Genos", Ecole « Collecte et traitement des données de terrains », Available online at http://llacan.vjf.cnrs.fr/SousSites/EcoleDonnees/extras/Genos.pdf
BARRY Laurent, & GASPERONI Michaël,
2008, "L’oubli des origines. Amnésie et information généalogiques en histoire et en ethnologie", Annales de démographie historique, 116, 53-104.
CAZES Marie-Hélène, & CAZES Pierre,
1996, "Comment mesurer la profondeur généalogique d’une ascendance?", Population, 51/1, 117-140.
GRANGE Cyril, & HOUSEMAN Michael,
2010, "Objets d’analyse pour l’étude des réseaux de parenté: une application aux familles de la grande bourgeoisie juive parisienne XIXe-XXe siècles", Annales de démographie historique, 116(2), 105-144.
HAMBERGER Klaus & DAILLANT Isabelle,
2008, "L’analyse de réseaux de parenté: concepts et outils", Annales de démographie historique, 116, 13-52.
HAMBERGER Klaus, & GARGIULO Floriana,
2013, "Virtual Fieldwork. Modeling Observer Bias in Kinship and Alliance Networks", Journal for Artificial Societies and Social Simulation, 17(3), 2. Available online at http://jasss.soc.surrey.ac.uk/17/3/2.html.
HAMBERGER Klaus, HOUSEMAN Michael, DAILLANT Isabelle, WHITE Douglas R., & BARRY Laurent,
2004, "Matrimonial ring structures", Mathématiques et Sciences Humaines. Mathematics and Social Sciences, (168), p.83-120.
HAMBERGER Klaus, HOUSEMAN Michael, & GRANGE Cyril,
2009, "La parenté radiographiée", L’Homme, 191(3), 107-137.
HAMBERGER Klaus, HOUSEMAN Michael & GRANGE Cyril,
2014, "Scanning for patterns of relationship: analyzing kinship and marriage networks with Puck 2.0", The History of the Family, publication in progress, see http://www.tandfonline.com/loi/rhof20 (restricted access).
HAMBERGER Klaus, HOUSEMAN Michael, & WHITE Douglas R.,
2012, "Kinship Network Analysis", In P. Carrington & J. Scott (Eds.), The Sage Handbook of Social Network Analysis (p. 533-549). Sage Publications.
WHITE Douglas R., & HOUSEMAN Michael,
1996, "Structures réticulaires de la pratique matrimoniale", L’Homme, 36(139), 59-85.
WHITE Douglas R., & JORION Paul,
1992, "Representing and Analyzing Kinship: A New Approach", Current Anthropology, 33, 454-462.
This section of the site is currently under construction...
Program for the Use and Computation of Kinship data
© Research Group TIP (Kinship and Computing)
Centre National de la Recherche Scientifique, Paris
Distributed under CeCILL licence version 2 (http://www.cecill.info/)
Created 2007 by Klaus Hamberger
Developed by Klaus Hamberger, Christian Momon, Edoardo Savoia, Telmo Menezes and Éric Mermet
Visualization powered by KinOath (developed by Peter Withers) and Geneaquilts (developed by Anastasia Bezerianos, Pierre Dragicevic and Jean-Daniel Fekete)
What is PUCK?
Puck is a computer program for analyzing genealogical and other kinship-related data. It was conceived in continuity with the gradual introduction of computers into kinship studies since the 1970s.
Puck is a product of the research group TIP and an outcome of the project « Informatical treatment of kinship phenomena » (« Traitement Informatique des phénomènes de parenté », 2006-2009) financed by the French National Research Agency (ANR).
For a presentation of the software and of the theoretical issues involved, English-speaking readers may consult Hamberger, Houseman, & Grange (2014); French-speaking readers may refer to Hamberger, Houseman, & Grange (2009).
What does PUCK do?
« A paradox is at the heart of studies on kinship: the marriage choices which, as the foundation of kinship systems, should be the primary object were neglected, especially in their empirical dimension. The anthropology of kinship began as an analysis of terminology and developed into a science of rules and norms. Today, it appears as a multifaceted research field dealing with representations and institutions, political strategies and symbolic operations. Still, the actual practices that generate matrimonial networks continue to occupy a marginal position » (Hamberger, Houseman, & Grange, 2009, p. 107). Because kinship cannot be treated as a collection of isolated elements, computer processing of matrimonial practices has proved fundamental.
Starting from this premise, the software not only supports fine-grained and precise analyses of matrimonial structures and of the configurations of genealogical networks, but also helps assess the quality and biases of the dataset studied. The computational treatment of kinship relations is thus an opportunity to consider theoretical issues and methodological choices, since it respects the particular way in which field data or records were collected and organized.
Puck accompanies researchers throughout their work, from data input to final analysis. It is compatible with the most commonly used formats (Excel, Gedcom, Pajek, etc.) and can import and export files in all of these formats. Puck is written in Java 1.6 and is continuously updated.
- Matrimonial and Relational Census
Puck is the first software able to run a full matrimonial and relational census on a genealogical dataset. It identifies the matrimonial circuits that can be found in a kinship network, classifies them and assists the researcher in analyzing the network topology. By producing the circuit intersection network of a given dataset, it highlights the combinations that actually occur between different marriage types and thus makes it possible to focus on the most structurally significant areas of the network. These operations can help determine whether the frequency of given matrimonial structures is due to specific social norms (preferences, avoidances...) or is a mechanical effect of network density.
- Diagnostics of Kinship Datasets
Genealogical corpora produced by researchers are not neutral objects. Genealogies constructed from informants or documentary sources are often incomplete and androcentric, which implies an asymmetry of relations in the network. It is very important to detect errors of data collection or data entry in order to correct them, and to know the biases in order to put the raw results into perspective (Barry & Gasperoni, 2008; Hamberger & Gargiulo, 2013; Hamberger, Houseman, & White, 2012, p. 546-547). Before analyzing a family dataset, the first step is therefore to establish a profile of the network, from basic tools (population count and demographic composition, gender distribution, genealogical depth, density) to more complex ones (family completeness, gender bias), not only for the overall network but also for specific parts of it. By partitioning the dataset and focusing on specific parts of the network, it is then possible to refine the analysis.
- Kinship Networks Partitioning
PUCK can partition a kinship network according to criteria of your choice. These can be individual properties (e.g., gender, age, birth period, occupation...) as well as family properties (e.g., union status, number of children...). Partitioning can be essential for refining and supplementing a qualitative analysis (dataset diagnosis) and/or a matrimonial census.
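Conceptually, a partition groups individuals into clusters according to the value of one property. The sketch below illustrates the idea only; the function name `partition` and the dictionary layout are assumptions, and individuals lacking the property are collected under `None`.

```python
from collections import defaultdict


def partition(individuals, code):
    """Group individual IDs by the value of the property with the given code.

    individuals: dict mapping ID -> dict of property code -> value.
    Individuals without the property end up in the cluster keyed by None.
    """
    clusters = defaultdict(list)
    for ident, props in individuals.items():
        clusters[props.get(code)].append(ident)
    return dict(clusters)
```

Each resulting cluster can then be analyzed separately, for example by restricting a census to the marriages of one occupational group.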
- Kinship Networks Simulation
Puck not only handles actual kinship networks; it also provides refined tools for simulating them. The technique, which has recently spread in kinship studies, consists in producing randomly generated networks and using them as a basis for comparison. Such an analytical process can help discern whether some of the regularities found in matrimonial practices are merely the product of given statistical criteria or actually derive from social institutions.
- Management and Editing of Kinship Networks
Physical media, such as fieldwork notebooks and hand-drawn genealogical diagrams, are especially vulnerable to time, weather and, more commonly, entropy. Ten minutes of rain can severely damage the fruit of a long period of work. PUCK can be used as a tool to generate, manage and store kinship networks in digital form. This can be crucial for securing the conservation of kinship data and therefore their analysis.
- Navigation through Kinship Networks
Kinship networks are complex collections of individuals, ties, events and properties. Even if such data have been gathered personally by the researcher, individual after individual, they can be surprisingly hard to remember and confusing to browse. Moreover, media such as fieldwork notebooks can carry biases: for instance, a household-based enquiry can easily give unwanted prominence to the residential factor over other ones, such as matrimonial alliances.
Puck has therefore been designed to provide researchers with a tool not only for storing data, but also for navigating through them as freely and neutrally as possible. Its components make it possible to move smoothly through kinship corpora, as well as to "jump" from one point to another.
Download and System Requirements
The software can be downloaded for free here, where you will find the latest version (PUCK 2.0) as well as earlier ones (PUCK 1.0).
System requirements: written in Java, Puck works with most common operating systems, on PC (Linux, Solaris, Windows) as well as on Mac. Before running the application, make sure that your computer is equipped with an adequate version of Java.
PUCK is free software, distributed under the CeCILL license (a French variant of the GPL). Puck has been deposited at the APP (Agence pour la Protection des Programmes, Program Protection Agency) and is protected by French law.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS « AS IS » AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
About this guide
This guide has been designed for users of every level. Its main objective is to offer accessible and clear answers to your questions about Puck 2, its functionalities and its use. As the guide focuses on the program itself, we suggest exploring the bibliographic section for a deeper treatment of the theoretical issues surrounding the program's design.
To make the guide easier to use, here is a short presentation of its different sections:
- The section called Functionalities contains a detailed and exhaustive description of PUCK’s functional components. It is the core of the guide. Its organisation follows the program components themselves, and a specific menu is dedicated to it. Its "Start" page contains a map of the program, indicating the main PUCK components described on each page of the Functionalities section.
- The Alphabetic and Thematic indexes are meant to guide the reader through the guide’s contents. In particular, the Thematic Index is designed to introduce new Puck users to the program in a more progressive way.
- The Appendices section contains several additional chapters whose contents do not directly concern Puck’s components themselves, but are useful introductions to its use. These are about:
- The Glossary provides terminological support for reading the guide. Most of its entries are English translations of an earlier glossary that the TIP research group published in a thematic issue of the French journal Annales de démographie historique (2008/2, no. 116, p. 233-235).
Agnatic (Uterine) Weight - The number of individuals whose linear agnatic (uterine) ascendant of a given generation is known, as a percentage of all individuals for whom at least one (agnatic or uterine) ascendant of that generation is known.
Agnatic (Uterine) Net Weight - The number of individuals for whom only the agnatic (uterine) ascendant of a given generation is known, as a percentage of the individuals for whom at least one (agnatic or uterine) ascendant of that generation is known.
Alliance Matrix - The matrix of a matrimonial alliance network. It is a weighted contingency (or cross) table that indicates the number of spouses "exchanged" between a given set of classes/groups of individuals. As the classes are listed in the top row and in the left-hand column, the descending diagonal of the matrix represents the endogamous (intra-group) marriages.
Bicomponent - A bicomponent (or bi-connected component) is a graph where two distinct paths can connect any two vertices to each other. Thus, a bicomponent contains no cut-point whose elimination would cut it into two disconnected components. Consequently, any two vertices in a bicomponent form part of a cycle (Grange & Houseman, 2010; Hamberger & Daillant, 2008).
Cycle - A cycle is a path where the first and the last vertex are identical. Using this notion presupposes an ego-centered view of kinship ties.
Circuit - A circuit is a (sub)graph whose vertices and arcs form a single cycle. Using this notion presupposes a socio-centered view of kinship ties.
Circuit Census - See matrimonial census.
Circuit Composition Table - A symmetrical cross table that indicates which matrimonial circuit type is the product of the intersection of two other circuit types (for an example of a circuit composition table, limited to the third canonical degree, see Hamberger & Daillant, 2008).
Circuit Intersection Matrix - The matrix of a circuit intersection network. It indicates the occurrences of each circuit type, as well as the number of marriages involved in a circuit intersection.
Circuit Intersection Network - The network of all the circuit intersections in a given kinship network. It shows both the occurrences of each circuit type and the number of marriages involved in each circuit intersection.
Classificatory Matrimonial Census - A specific kind of matrimonial census. It consists in searching not for individual circuit types, but for classes of circuit types that share a given formal criterion.
Concordance Table - A concordance table is a file containing information about the renumbering of individuals. It is a simple text file with two columns where the ID numbers of the members of the second and the first corpus are listed. Individuals that appear in only one corpus do not have to be listed in this table, and the numbers need not be ordered.
Connubial Circuit - A circuit that is part of a matrimonial alliance network.
Core - The core of a kinship network is the sum of its matrimonial bicomponents.
Frame - The frame of a matrimonial circuit is the graph obtained by replacing each of its consanguine chains with a simple arc. Consequently, only two types of lines compose it: marriages and consanguine ties. The frame of a matrimonial network is the union of all the circuit frames that compose the network.
Generation - Generation and generational distance are not unique concepts. Except in kinship networks that consist of trees, there are usually several alternative ways to place an individual on a generational level below its ascendants and above its descendants. For instance, if a man has married his sister's daughter, his children will be at the same time grandchildren and great-grandchildren of his father. One has to decide on the path along which generational distance is to be calculated.
The algorithm used by Puck is identical to that of Pajek. It consists in navigating through the network along kinship paths and assigning to parents, spouses and children of each individual the generational level of that individual, augmented by 1, 0 or -1 according to the nature of the kinship tie.
Note: The identity of the algorithm does not necessarily imply the identity of the results. The result depends on the navigation path, which may be different for Pajek and Puck, since arcs are not necessarily stored in the same order.
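The navigation principle described above can be sketched as a breadth-first traversal. This is a hedged illustration, not the code of Puck or Pajek: the function name, the data layout and the reading of "1, 0 or -1" as parents +1, spouses 0, children -1 are assumptions.

```python
from collections import deque


def generations(parents, spouses, start):
    """Assign a generational level to everyone reachable from start.

    parents: dict ID -> (father_id, mother_id), None for unknown parents
    spouses: dict ID -> list of spouse IDs
    The first level assigned to an individual is kept, so the result
    depends on the starting point and on the order of traversal.
    """
    children = {}
    for child, pair in parents.items():
        for p in pair:
            if p is not None:
                children.setdefault(p, []).append(child)

    level = {start: 0}
    queue = deque([start])
    while queue:
        ego = queue.popleft()
        neighbours = (
            [(p, +1) for p in parents.get(ego, ()) if p is not None]
            + [(s, 0) for s in spouses.get(ego, ())]
            + [(c, -1) for c in children.get(ego, ())]
        )
        for alter, step in neighbours:
            if alter not in level:
                level[alter] = level[ego] + step
                queue.append(alter)
    return level
```

Applied to the example of a man who marries his sister's daughter, the levels obtained differ according to where the traversal starts, which is exactly the path dependence mentioned in the note.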
Kernel - The kernel of a kinship network is its largest matrimonial bicomponent.
Kinship Network - "Kinship networks are characterized by the interplay of three fundamental principles: filiation, marriage, and gender. We ordinarily represent filiation by a set of arcs (descent arcs) that are directed from parents to children, and marriage by a set of undirected edges (marriage edges) between spouses (for alternative representations of kinship networks without edges, see below). Kinship networks thus are mixed graphs, containing both arcs and edges. Gender is usually taken into account by a partitioning of the vertex set (the gender partition), usually into two or three disjoint classes (male, female, and possibly unknown sex)" (Hamberger, Houseman, & White, 2012).
Matrimonial Alliance Network - A network of marriage ties between groups (classes) of individuals. In a matrimonial alliance network, arcs represent (one or more) marriage ties and point from the wife's group to the husband's group; nodes represent groups of individuals who share a given property (place of residence, consanguinity...).
Matrimonial Bicomponent - A matrimonial bicomponent is a maximal subgraph in which every two vertices are part of a matrimonial circuit. Also, any two vertices in a matrimonial bicomponent can be linked to each other by two distinct kinship chains that do not pass through and do not meet in “structural children”. Matrimonial bicomponents are closely related (but not identical) to matrimonial components : both are line-biconnected (two distinct line-series link each vertex to every other), but matrimonial bicomponents are also vertex-biconnected (the two interconnecting line-series never run through the same vertex).
Matrimonial Circuit - A matrimonial circuit is a kinship chain that is closed by a marriage and does not involve childless and unmarried individuals (Hamberger, Houseman, Daillant, White, & Barry, 2004).
Pragmatically, matrimonial circuit types correspond to types of consanguine marriage (between consanguine kin, such as between a man and his mother’s brother’s daughter) and types of affine “relinkings” incorporating one, two or more intermediary marriage ties.
- Consanguine marriages, that incorporate a single marriage tie (and a single consanguine kinship chain), form matrimonial circuits of “width” 1, e.g. a man marries his mother’s brother’s daughter.
- Relinkings incorporating two marriage ties (and two consanguine kinship chains) form matrimonial circuits of “width” 2, e.g. a man and his sister marry a sister and her brother, or a man marries his mother’s brother’s wife’s brother’s daughter.
- Relinkings incorporating three marriage ties (and three consanguine kinship chains) form matrimonial circuits of “width” 3, e.g. a man marries his mother’s brother’s wife’s brother’s daughter’s husband’s sister.
Matrimonial circuits are indicators of sociological constraints of matrimonial choice (rules, preferences and avoidances, opportunities) and of the dynamics of self-organization of the network. They have to be studied as a whole. For the concept of the matrimonial circuit, see Hamberger, Houseman, & White (2012, p. 539‑540).
***Matrimonial Constellation – It is the largest component of a matrimonial network frame.
***CORR? Matrimonial Network - A matrimonial network is a subgraph induced by matrimonial circuits. Matrimonial networks are line-induced and not vertex-induced subgraphs. This means that every line of the subgraph is part of a circuit (it is not enough that its endpoints are in a circuit). The matrimonial network derived from a set of matrimonial circuits found in a kinship network is thus simply the network composed of these circuits. It consists, in other words, of the matrimonially “interesting” regions of the original kinship network.
The connected parts of the matrimonial network (the matrimonial components) represent continuous regions of densely interconnected circuits, which may be studied from various perspectives.
On the one hand, we may suppose that the frequent occurrence of particular matrimonial patterns is correlated with other properties of the network region concerned (for instance social class, geographical region or historical period); we may then apply several partitions to the network in order to evaluate the degree to which partition clusters correspond to matrimonial components.
On the other hand, we may interpret the density of circuits as an effect of self-reinforcing social mechanisms (behavior transmission, imitation or the presence of rules) or as a simple network effect (rings combining to compose other circuits) which we did not consider when defining the criteria for our initial circuit search.
The concept of a matrimonial network is also meaningful in and of itself, independent of any particular circuit set. Even without being able to precisely identify all matrimonial circuits (without limits of size) which may exist in a kinship network, it is possible to determine which part of the network is composed of matrimonial circuits. The result is the absolute matrimonial network, the subgraph induced by all lines in the network which are in some circuit whatsoever. This absolute matrimonial network is equivalent to the sum of all matrimonial bicomponents. It corresponds to what has been called the “core” in a P-graph context (Grange & Houseman, 2010; White & Houseman, 1996) .
Every matrimonial network constitutes a network without tails (every vertex must have a degree greater than one) and without structural children (every vertex must have an outdegree greater than zero). However, the reverse is not the case. There may be networks where all vertices fulfil these two degree criteria, but which nevertheless are not matrimonial, as they contain lines which do not form part of any matrimonial circuit. Filial triads (father, mother and child) or marriage ties connecting disjoint matrimonial components are instances of this.
***Mean Genealogical Depth - The mean genealogical distance from apical ascendants, calculated with the Cazes formula (Cazes, & Cazes, 1996).
Ore-graphs - Named after the Scandinavian mathematician Oystein Ore (1970), developed by Vladimir Batagelj and Andrej Mrvar. In an Ore graph, vertices represent individuals, arcs filial ties and edges marriages. Vertex-labels represent gender; two different types of lines represent paternal and maternal ties.
P-graph - Developed by Douglas White and Paul Jorion (White, & Jorion, 1992), used by the homonymous computer program p-graph. In a P-graph couples or unmarried individuals are represented by vertices, married individuals by gender labeled lines running from the couple in which they are partner to the couple of which they are born.
P-graphs have the advantages of being directed acyclic and of incorporating fewer lines and vertices, allowing semi-cycles (that correspond to matrimonial circuits in Ore-graphs) to be more easily detected. Note, however, that an individual who marries several times will be represented by several lines. Lines therefore have to be name-labeled in order to distinguish identity from siblingship.
***Relation Census - A census of kinship relations that either comply with given quantitative (size) or qualitative (formula) criteria, or correspond to the criteria established in a matrimonial census.
***Structural Children - In a kinship network, structural children are individuals who have neither spouses nor children.
Tip-graph - Named after the research group TIP (Traitement Informatique de la Parenté), used by the macros of the Tip4Pajek series (2007). In a Tip-graph, filial and marriage ties are represented by arcs. All information on the type of tie and on gender is contained in line values. There are five types of lines:
- a marriage arc pointing from female to male,
- a filial arc pointing from female (mother) to female (daughter),
- a filial arc pointing from female to male (son),
- a filial arc pointing from male (father) to female,
- a filial arc pointing from male to male.
Because a Tip-Graph does not involve vertex labeling, it is a highly economical representation of a kinship network. Its major disadvantage is that it is not directed acyclic. Many analyses therefore require its being re-transformed into a conventional Ore-graph. To export a dataset in tip-graph format (as a pajek project file) the option “tip” has to be chosen.
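As a rough illustration of the five line types, the sketch below turns a table of individuals into a list of typed arcs. The record layout, the function name and the text labels used as line values are assumptions made for this example; the numeric line values actually used by the Tip4Pajek macros are not given here. Gender codes follow the H/F/X convention of this guide, and ties involving individuals of unknown gender are simply skipped in this sketch.

```python
# Hypothetical labels standing in for the line values of a Tip-graph.
FILIAL = {
    ("F", "F"): "mother-daughter",
    ("F", "H"): "mother-son",
    ("H", "F"): "father-daughter",
    ("H", "H"): "father-son",
}


def tip_arcs(people):
    """Return (source, target, line_type) arcs for a dict of individuals.

    people: dict id -> {"gender": "H"/"F"/"X", "father": id, "mother": id,
                        "spouses": [ids]} (assumed record layout)
    """
    arcs = []
    for pid, person in people.items():
        # Filial arcs point from the parent to the child.
        for key in ("father", "mother"):
            parent = person.get(key)
            if parent is None:
                continue
            line_type = FILIAL.get((people[parent]["gender"], person["gender"]))
            if line_type is not None:
                arcs.append((parent, pid, line_type))
        # Marriage arcs point from female to male, so emit them from wives only.
        if person["gender"] == "F":
            for spouse in person.get("spouses", []):
                if people[spouse]["gender"] == "H":
                    arcs.append((pid, spouse, "marriage"))
    return arcs


family = {
    1: {"gender": "H", "spouses": [2]},
    2: {"gender": "F", "spouses": [1]},
    3: {"gender": "F", "father": 1, "mother": 2},
    4: {"gender": "H", "father": 1, "mother": 2},
}
arcs = tip_arcs(family)
```

No gender attribute is attached to the vertices in the output: the gender of both endpoints can be read off the arc types, which is what makes the representation economical.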
***Virtual Individuals - Virtual individuals are individuals for whom no information is available except for their kinship relations (and perhaps their gender, if they are parents or spouses of existing individuals). They only serve to represent the common parents of full siblings.
This section of the site is currently under construction...
Welcome to the "Functionalities" section of the Puck 2.0 online help. Here, you can find a detailed description of all PUCK functional components. As shown in the image below, these fall into three main components : the Menu Bar, the Main Window and the Partitions Bar.
This part of the guide is thus divided into sections, each matching one of these Puck components. To navigate through the guide and find the functionality you are looking for, identify the PUCK main component to which it is related, and then navigate to the pertinent section of the guide by using the "Functionalities Menu" (located on the right side of this web page).
For instance, to find out what the "Additional data" frame is (see below) and how to use it, click on the "Main Window" entry of the "Functionalities" menu, and then search the page with a Ctrl+F text query.
Be aware that some entries appear in the "Functionalities" menu even though they do not belong to the Puck main components shown below. These entries relate to specific commands that the guide treats in the Menu Bar sections.
The File menu provides some of the most fundamental Puck functionalities such as creating, importing and exporting in a wide range of formats, fusing or updating kinship datasets. In addition, this menu has been recently enhanced with some kinship simulation functions.
In order to create a new kinship network, use the command File > New > Empty network.
It is possible to enter data directly into Puck, but you can also use the software of your choice : the program is compatible with most popular formats.
The submenu File > New > Random network allows creating several types of randomly produced kinship networks. This simulation technique can be useful in order to compare "real" kinship data - which represent actual social practices of a given population - with randomly generated ones.
In order to create a new random network, there are two possibilities :
- File > New > Random network > Classic
- File > New > Random network > Birth-Centered
As kinship simulation is among the PUCK main functions, a specific section of this guide is dedicated to these two commands. If you want to keep reading about them, click here or follow the "Functionalities Menu".
Import a dataset by clicking File > Open... and choosing your file. The data appear in the main window. It is possible to open a recently used file (File > Open recent) and to browse a recent folder (File > Open Recent Folder).
By default, Puck assumes that the dataset is a UTF-8 file. However, it is possible to choose another encoding via the command : File > Open encoding.
***Open from Kinsources
The command File > Open from Kinsources allows downloading a corpus directly from the Kinsources project website. When executing the command, a dialog window called Kinsources Catalog Selector opens. You can then select a corpus and open it with PUCK.
***Reload / Revert
The command File > Reload restores a modified dataset to its original version (as it was when the file was opened). Every modification (to individuals, families, ties, attributes, partitions...) is thereby erased. Note that once a modification has been made to a corpus, the command name automatically switches to File > Revert. To avoid data loss, a dialog window opens automatically, asking you to confirm the action.
In order to fuse two datasets you can use the command : File > Merge. When executing the command, a dialog window opens requiring a “File to be joined” (the second dataset) and a concordance table. The latter, which enables the program to identify double entries, has to be provided if (and only if) some individuals appear in both datasets.
Warning : merging two datasets implies an automatic renumbering of individuals : individuals of the second corpus who have no doubles in the first corpus obtain a new Id number by adding the number of the last individual of the first corpus to their old number. Renumbering should remain a transitory exception while establishing a definite dataset. As a rule, individuals should have one unique identity number and belong to one unique corpus.
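The renumbering rule can be sketched as follows. This is only an illustration of the rule as stated above, under two assumptions: "the number of the last individual" is read as the highest Id of the first corpus, and the concordance table is modelled as a simple mapping from second-corpus Ids to first-corpus Ids. The function name and input formats are hypothetical.

```python
def renumber_for_merge(first_ids, second_ids, concordance):
    """Map each Id of the second corpus to its Id in the merged corpus.

    first_ids:   Ids used in the first corpus
    second_ids:  Ids used in the second corpus
    concordance: dict second-corpus Id -> first-corpus Id, for doubles only
    """
    offset = max(first_ids)  # assumed meaning of "the last individual"
    return {
        old: concordance.get(old, old + offset)  # doubles keep the original Id
        for old in second_ids
    }


mapping = renumber_for_merge(
    first_ids=[1, 2, 3, 10],
    second_ids=[1, 2, 5],
    concordance={2: 3},  # individual 2 of the second corpus is individual 3 of the first
)
```

Here individuals 1 and 5 of the second corpus become 11 and 15, while individual 2 is recognised as a double and keeps the Id 3 of the first corpus.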
Update a Dataset
Genealogical corpuses can be updated and new supplementary information added to them. To update a corpus and add data from another dataset, Puck provides a specific command : File > Update. For safety, Puck generates a new dataset to prevent the original dataset from being erased by mistake. The new dataset is then stored in a new file, in the same format as the original one. A corpus update requires a file which fulfills the same format requirements as the files used to load a corpus.
Note : if you use a file in text format containing only supplementary information on individuals’ properties (so that the first block is empty), make sure that the file begins with two headlines (and not just one). The first headline (which may consist in a single letter) marks the presence of the first block, the second one indicates the switch to the second block. Otherwise Puck will read your supplementary information as basic genealogical information, and the update will fail. These problems do not happen when you use files in .tip format.
You have two choices: File > Update overwriting and File > Update appending.
With Update appending, data are simply appended without overwriting existing data (in this case, certain data may not be added, for instance a father that does not correspond to the actual father). With Update overwriting, data are used to overwrite existing data with new ones (in this case, an individual who appears in the update file as having no father will lose his father in the corpus).
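The difference between the two update modes can be illustrated on a single individual's record. This is a simplified sketch, not Puck's implementation: the function name and the dictionary record (with None standing for a missing value) are assumptions made for the example.

```python
def update_record(current, incoming, overwrite):
    """Return a copy of `current` updated with the fields of `incoming`.

    current, incoming: dict field -> value, where None means "no value"
    overwrite=False: appending mode, only fills in missing values
    overwrite=True:  overwriting mode, every incoming field replaces the old one
    """
    result = dict(current)
    for field, value in incoming.items():
        if overwrite:
            result[field] = value  # an empty incoming value erases the old one
        elif result.get(field) is None:
            result[field] = value  # existing values are left untouched
    return result


record = {"father": 12, "mother": None}
update = {"father": None, "mother": 7}

appended = update_record(record, update, overwrite=False)
overwritten = update_record(record, update, overwrite=True)
conflicting = update_record(record, {"father": 99}, overwrite=False)
```

In appending mode the individual keeps father 12 and gains mother 7, and a conflicting father (99) is not added. In overwriting mode the individual gains mother 7 but loses the father, because the update file lists none.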
***Save and Export Datasets
When you execute the command File > Save (Ctrl+S), PUCK writes the recent changes to the original file. To secure this operation, a dialog window opens automatically, asking you to confirm it.
By the commands File > Save as and File > Save a Copy, it is possible to:
- Export the current dataset to a format of your choice : Gedcom, Pajek, Text, XLS, PUC, etc. (for a synoptic presentation of the different formats properties, click here). Generally, you save as much information as possible by exporting in .puc format
- Create a backup version of the dataset
- Save an updated version of a modified corpus (in which case you will have to change its name)
NOTE : on the current version of PUCK, the drop-down list of extensions does not work to actually change the file format. In order to do so, you must find the format in the "Files of type" list and then type the extension in the "File Name" field.
***Export to Pajek
The command File > Export to Pajek produces a file which can be opened with Pajek. This can be useful if you want the current kinship network to be drawn in a graph. When executing the command, the Export to Pajek Input window automatically opens. There you find the Graph Type frame, which contains a check-box list showing three possibilities (click on the file type to view its Glossary definition):
***The Partitions Label frame allows defining several properties (i.e., gender, birth place, patri-clan...) as partitions, that can be read and represented by Pajek.
Once these parameters are set, you can choose the destination file and create it by clicking on the Export button.
Close and Quit
To exit from PUCK there are two possibilities : the command File > Close (Ctrl+W) refers to the current dataset ; the command File > Quit refers both to the current dataset and the program. In both cases, Puck will automatically detect and notify unsaved changes on the current file.
Manage and Edit Data
Once the dataset is created, you can begin entering new data by creating individuals, kinship relations, families, additional data (place of birth, occupation, religion, date of birth, etc.). To do so, you can use the following commands from the Edit Menu, the corresponding keyboard shortcuts, or the Main Window Bottom Toolbar (see here) :
||Ctrl + I|
||Ctrl + U|
||Ctrl + P|
||Ctrl + K|
||Ctrl + Shift + U|
||Ctrl + F|
Note : The command Add Origin family creates a family in which Ego appears as a child. The command Add family creates a simple union tie, whose individuals must be defined.
The command Edit > Preferences allows choosing in which language Puck will run. The available languages are : English, French, German, Italian and Spanish.
*** In the Preferences dialog window, the Input settings frame allows defining how PUCK has to treat, by default, several special features. This can be useful, in particular, in order to prevent data input mistakes. There are three possibilities :
- None : the selected Special Feature is not reported
- Warning : the selected Special Feature is reported and a confirmation is asked
- Error : the selected Special Feature is denied
The Report menu contains several commands that can help organizing the dataset records, searching for potential errors and producing attribute statistics.
The sub-menu Reports > List (...) lists all the individuals of the corpus according to the ordering criterion you choose. After you select a command, Puck sorts all the individuals of the dataset and produces an exportable report. You can then save the results as a .txt or .xls file by clicking on the "Save" button (bottom right-hand corner of the results window).
The command Report > Homonyms detects and lists individuals who carry the same name, or a part of it (e.g., the first name). A dialog window opens when executing the command, asking you to specify :
- in the Name parts field, the number of words (separated by a "/" symbol) that Puck has to consider as pertinent for regrouping homonyms.
- in the Minimal Number of names field, the minimal number of individuals who share a given name (or name part) that Puck will list.
E.g., the individuals who share their first name can be found by setting Name parts to "1" ; if you are looking for names shared by at least five persons, set the Minimal number of names to "5".
Through the submenu Report > Controls, PUCK lets you quickly find Special Features (possible errors) that the dataset may contain (including those introduced during data input). The commands in the submenu let you choose which type of error to check for and report. The following list summarizes the types of errors that PUCK recognises :
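The two settings can be read as a grouping key and a threshold, as in the sketch below. It assumes that Name parts counts the leading "/"-separated words of a name and that the Minimal number of names is an inclusive lower bound; the function name and the input format are hypothetical, and Puck's own matching may differ in detail.

```python
from collections import defaultdict


def homonyms(names, name_parts, min_count):
    """Group individuals whose first `name_parts` name parts are identical.

    names: dict Id -> name, with parts separated by "/" (e.g. "Jean/Dupont")
    Returns dict shared key -> list of Ids, for groups of at least `min_count`.
    """
    groups = defaultdict(list)
    for pid, name in names.items():
        key = "/".join(name.split("/")[:name_parts])
        groups[key].append(pid)
    return {key: ids for key, ids in groups.items() if len(ids) >= min_count}


names = {1: "Jean/Dupont", 2: "Jean/Martin", 3: "Marie/Dupont"}
same_first_name = homonyms(names, name_parts=1, min_count=2)
same_full_name = homonyms(names, name_parts=2, min_count=2)
```

With one name part, individuals 1 and 2 are grouped under "Jean"; with two name parts, no group reaches the minimum of two.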
- Same-Sex spouses
- Female Fathers or Male Mothers
- Multiple Fathers or Mothers
- ***Cyclic descent cases - individuals for whom a descendant is, at the same time, an ascendant
- Unknown sex persons
- Unknown Sex Parents or Spouses
- Nameless Persons
- Parent-child marriages
- Inconsistent Dates
- Missing Dates
The command Reports > Controls > Special Features produces a full report of all these possible errors. Some of them (such as persons lacking name or gender, or marriages between same-sex spouses or between parents and children) may actually be correct and intended (depending on the field and the sources), but they are often due to simple mistakes introduced during data input. Puck reports them to help the researcher check the dataset, but never automatically “corrects” possible mistakes.
When executing the command Reports > Controls > Special Features, a dialog window automatically opens. There, it is possible to define which kind of potential errors you want Puck to check for.
Otherwise, it is possible to run a step-by-step check-up, by selecting a single special feature in the submenu.
Note : even if they are true representations of a real kinship network, some of these irregularities may hinder certain functions of Puck from working correctly, which should lead to a reconsideration of analytical methods. Some errors can even cause PUCK crashes (for instance, cyclic descent cases cause infinite loops) ; some others will lead to erroneous results (for instance, the presence of male mothers or female fathers causes calculation errors in the matrimonial census).
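Two of these controls can be sketched outside Puck as follows. The record layout and function names are assumptions made for the example, and the cyclic-descent check is phrased as "an individual who is his or her own ascendant", which flags the same cases as the definition in the list above. The search keeps a set of visited ascendants, so it terminates even on cyclic data.

```python
def female_fathers_or_male_mothers(people):
    """Return (child, parent) pairs where a father is coded F or a mother H."""
    flagged = []
    for pid, person in people.items():
        father, mother = person.get("father"), person.get("mother")
        if father is not None and people[father]["gender"] == "F":
            flagged.append((pid, father))
        if mother is not None and people[mother]["gender"] == "H":
            flagged.append((pid, mother))
    return flagged


def cyclic_descent(people):
    """Return the Ids of individuals who appear among their own ascendants."""
    def ascendants(start):
        seen, stack = set(), [start]
        while stack:
            current = stack.pop()
            for key in ("father", "mother"):
                parent = people[current].get(key)
                if parent is not None and parent not in seen:
                    seen.add(parent)  # visiting each ascendant once prevents looping
                    stack.append(parent)
        return seen

    return [pid for pid in people if pid in ascendants(pid)]


people = {
    1: {"gender": "H", "father": 3},
    2: {"gender": "H", "father": 1},
    3: {"gender": "H", "father": 2},  # 1 -> 2 -> 3 -> 1 is a descent cycle
    4: {"gender": "F"},
    5: {"gender": "H", "father": 4},  # father coded as female
}
cycles = cyclic_descent(people)
wrong_gender = female_fathers_or_male_mothers(people)
```

Like Puck, the sketch only reports the suspicious records and leaves any correction to the researcher.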
The command Reports > Attribute Statistics allows obtaining basic statistics concerning the distribution of attributes. By default, these are classed as Not-set, Set blank, Filled and Set. In the Report Window, PUCK counts all the Corpus, Individuals and Families attributes, and sorts them by label. The results are exportable in .txt or .xls formats by clicking on the "Save" button placed in the bottom right-hand corner of the window.
The Transform menu features a number of commands allowing several systematic changes on the current dataset (duplication, anonymization, reduction, extraction, expansion, shrinking...). Some of these transformations concern the individuals names, their attributes, as well as the relations existing between them ; others can target the dataset as a whole or just some partitions of it.
The command Transform > Duplicate creates an exact "live" copy of the original dataset. This can be useful if you want to test some transforming operations on a dataset, without affecting the original file with unwanted changes. Note that the duplicate which PUCK thus produces is not automatically saved into a file.
The Anonymization commands are used for hiding individuals' names. This can be useful, for example, if you work on a recent population and wish to publish some analytical results without revealing the individuals' identity.
The following commands let you choose the most appropriate form of anonymization :
- Transform > Anonymize by First Name ;
- Transform > Anonymize by Last Name ;
- Transform > Anonymize by Gender & ID.
Note : Numbered names are convenient if the corpus is exported in .paj format (Pajek files require renumbering of individuals in order to assure continuous vertex numbers. Numbered names thus serve to keep the original numbers). If a Pajek file with numbered names is imported to PUCK, numbers between parentheses are automatically re-converted into identity numbers.
The following commands let you make systematic changes to attributes. They spare you from making such changes one by one throughout the entire dataset :
- Transform > Rename Attribute (acts on all Labels)
- Transform > Filter Exogenous Attribute (acts on some Values)
- Transform > Set Attribute Value (acts on all Values)
- Transform > Replace Attribute Value (acts on some Values)
- Transform > Valuate exo. Attribute
- Transform > Remove all Attributes (acts on all Label and Values)
- Transform > Transmit Attribute Value (acts on all Values)
When executing each one of these commands, a dialog window automatically opens asking you to specify which attribute has to be changed, and how. Each command produces a specific dialog window, but some recurring fields can be identified (we will leave the rest implicit, as these functions are sufficiently intuitive). The Target field indicates the type of attribute concerned (e.g., All, Individual, Family...). The Label field indicates the "name" of the attribute concerned (e.g., BIRTH_DATE...). The Value field indicates the actual content of the attribute (e.g., "1955" for a birth date).
The command Transform > Marry Coparents associates to each fertile couple a matrimonial link. This can be useful, for instance, in order to make those unions visible in a matrimonial census. The command is effective on the entire dataset without taking into account partitioning and it doesn't change the family numbering.
The Transform > Re-number Ids sub-menu lets you renumber the Ids of all individuals in the dataset. The new numbering will start from 0 and will cover all individuals without gaps.
If you have a pre-existent corpus where the Id number is defined as an attribute, you can use the commands Transform > Renumber Ids from ID attr. and Transform > Renumber Ids from REFN attr. in order to make Puck recognize it as the actual individuals Id.
The sub-menu Transform > Reduce allows removing specific segment types from a kinship network. The reduction operations concern segments that can be considered structurally irrelevant (unmarried people, structural children, etc.). Thus, these operations are meant to "clean" the corpus before proceeding to an analysis of its structure. This can be useful when preparing, for instance, a matrimonial circuit census, e.g. in order to refine an analysis of the dataset's gender bias. The following list of commands gives the detail of each possible reduction :
- ***Transform > Reduce > Acyclic Segments : eliminates from the kinship network all the segments that do not contain cycles.
- Transform > Reduce > Marked doubles : eliminates all doubles. The individual with the lower identity number is considered as the original, the one with the higher ID number as the double. Doubles can be marked by substituting the original’s identity number for their name.
- Transform > Reduce > Structural children : eliminates all individuals who have neither spouses nor children (the structural children of the kinship network).
- Transform > Reduce > Unmarried : eliminates all individuals who are not married.
- Transform > Reduce > Virtual individuals : eliminates all virtual individuals. This reduction is recommended for the exploratory analysis of a corpus containing fictive individuals.
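As an illustration of one of these reductions, the sketch below removes structural children from a table of individuals. The record layout and function name are assumptions, and the sketch makes a single pass over the data; whether Puck repeats the reduction until no structural children remain is not stated here.

```python
def reduce_structural_children(people):
    """Return the individuals who have at least one spouse or one child.

    people: dict Id -> {"father": Id, "mother": Id, "spouses": [Ids]}
            (assumed record layout; missing keys mean "unknown" or "none")
    """
    # Anyone named as a father or mother has at least one child.
    has_children = set()
    for person in people.values():
        for key in ("father", "mother"):
            if person.get(key) is not None:
                has_children.add(person[key])

    return {
        pid: person
        for pid, person in people.items()
        if person.get("spouses") or pid in has_children
    }


people = {
    1: {"spouses": [2]},
    2: {"spouses": [1]},
    3: {"father": 1, "mother": 2},                  # no spouse, no child: removed
    4: {"father": 1, "mother": 2, "spouses": [5]},
    5: {"spouses": [4]},
}
reduced = reduce_structural_children(people)
```

Removing individual 3 leaves no dangling references, since a structural child is by definition nobody's parent and nobody's spouse.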
The sub-menu Transform > Extract lets you create a new dataset by selecting a specific sub-corpus of the original one. The new dataset will thus be composed only of the vertices of the selected sub-corpus. With the following commands, you can choose to extract different types of sub-corpuses :
- Transform > Extract > Current Segment
- Transform > Extract > Current Cluster (Ctrl+E)
- Transform > Extract > Kernel (maximal matrimonial bicomponent)
- Transform > Extract > Max. Bicomponent
- Transform > Extract > Core
- Transform > Extract > By cluster size / By cluster value : ***CORR*** reduces the kinship network to those vertices who have a positive cluster value in a chosen partition. For example, if “OCCU” (occupation) is chosen, then the network is reduced to the people whose occupation is known.
Note : Some of the commands in the list above presuppose a basic knowledge of the partitioning process. To read more about partitioning, click here or use the "Functionalities" menu (click on the entry : "Partitions Bar").
The sub-menu Transform > Expand current segment (...) allows creating a new dataset composed both of the selected partition members and of individuals somehow connected to them. This can be useful, for instance, when you want to operate a circuit census on a given partition of the dataset, without losing data about the ties that exist between its members (which could involve non-members of the partition).
Thus, the submenu allows expanding a partition to its connected non-members and, in addition, selecting among them, based on the type of ties existing between the segment members and the "to be included" non-members. This selection is made with the following commands, which indicate different classes of connected individuals :
- Transform > Expand current segment > Special Features... : Includes individuals connected by special features
- Transform > Expand current segment > Universal : Includes individuals connected by all existing ties
- Transform > Expand current segment > All related : Includes all individuals connected by ties defined as a Relation Model
- Transform > Expand current segment > All Kin : Includes all individuals connected by marriage and filiation ties
- Transform > Expand current segment > Ascending : ***Includes all individuals connected by ascending ties
- Transform > Expand current segment > Ascending (Agnatic) : ***Includes all individuals connected by agnatic ascending ties
- Transform > Expand current segment > Ascending (Uterine) : ***Includes all individuals connected by uterine ascending ties
- Transform > Expand current segment > Descending : ***Includes all individuals connected by descending ties
- Transform > Expand current segment > Descending (Agnatic) : ***Includes all individuals connected by descending agnatic ties
- Transform > Expand current segment > Descending (Uterine) : ***Includes all individuals connected by descending uterine ties
- Transform > Expand current segment > Horizontal : ***Includes all individuals connected by marriage ties
The Shrink function allows regrouping the individuals of the dataset following a given criterion ; it also allows generating and analyzing the network of links existing between such groups. The results can be exported in .paj and .dat formats. Such networks can thus be represented as directed graphs, where both nodes and arcs have values. The node values quantify the partition size (number of individuals), and the arc values quantify their weight (number of ties).
The Transform > Shrink > Alliance Network command produces a directed graph where nodes represent groups of individuals sharing a given endogenous/exogenous property and arcs represent the number of links between such groups. For instance, it can be used in order to analyze the matrimonial alliance network between different patrilineages, or to study the transmission of professions through filiation.
In Puck, when executing the command, a dialog window automatically opens asking to set some criteria.
The Label field allows defining which endogenous/exogenous property will actually regroup the dataset individuals.
The Alliance Type field allows defining as "alliance" relations three different types of ties : wife-husband, sister-brother and parent-child. Choosing one of these will produce, respectively, a matrimonial exchange network, a network of siblingship or a network of filiation ties.
The Weighted Arcs check-box allows choosing whether or not the arcs weight will appear in the results.
Finally, three fields let you filter the resulting network by the Minimal number of : links (node degree), alliances per node (node strength), and alliances per link (link weight).
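For the wife-husband alliance type, the shrinking step amounts to counting marriages between groups, as in the following sketch. The function name, the input formats and the choice to skip individuals whose group is unknown are assumptions made for the example; only the minimal link weight filter is shown.

```python
from collections import Counter


def alliance_network(marriages, group_of, min_weight=1):
    """Count marriage ties between groups.

    marriages: iterable of (wife Id, husband Id) pairs
    group_of:  dict Id -> group label (e.g. a patri-clan)
    Returns dict (wife's group, husband's group) -> number of marriages,
    keeping only arcs whose weight is at least `min_weight`.
    """
    arcs = Counter()
    for wife, husband in marriages:
        wife_group, husband_group = group_of.get(wife), group_of.get(husband)
        if wife_group is None or husband_group is None:
            continue  # assumption: ties with an unknown group are ignored
        arcs[(wife_group, husband_group)] += 1  # arc points from wife's group
    return {arc: weight for arc, weight in arcs.items() if weight >= min_weight}


marriages = [(2, 1), (4, 3), (6, 5)]
group_of = {1: "A", 2: "B", 3: "A", 4: "B", 5: "B", 6: "A"}
network = alliance_network(marriages, group_of)
strong_links = alliance_network(marriages, group_of, min_weight=2)
```

Two women of group B marry men of group A, and one woman of group A marries a man of group B, so the arc B > A has weight 2 and the arc A > B has weight 1. With a minimal link weight of 2, only the first arc remains.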
The results can be viewed (and managed) both as an autonomous Alliance Network Window (on the dialog window, click on the Launch button) and as a statistic report (on the dialog window, click on the Statistics button). For a description of the Alliance Network Window, see here.
The statistic report window is composed of six tabs :
- Alliance Network Report, which summarizes the input criteria and allows exporting the results in the .paj, .paj (edge version) and .dat formats ;
- Analysis, which presents the results according to a number of indicators such as the number of nodes and arcs, the maximal weight (links per arc) and strength (links per node), the potential endogamic pairs, the distribution of circuits (etc.) (for more details see here) ;
- Matrix, which shows the network alliance matrix in a table and allows exporting it in .txt and .xls formats ;
- Couples, which lists in a table the linked cluster-to-cluster couples, as well as the composition of (directed) links connecting each couple. Every block then specifies the link weight and the individuals couples (their Id number, gender and Name) that are connected by marriage, siblingship or filiation tie (depending on the chosen criterion).
- Sortable List, which indicates : in the first column, the origin vertex (wife, sister or parent, depending on the chosen criterion) of each link : its Id number, Gender and Name ; in the second column, the destination vertex (husband, brother or child) its Id Number, Gender and Name ; in the third column, the cluster to which the origin vertex belongs ; in the fourth column, the cluster to which the destination vertex belongs ; in the fifth column, the link weight.
- ***GAP Sides, which lists [...]
The command Transform > Shrink > Flow Network lets you produce, analyze and manage flow networks. Here, nodes represent segments that regroup the individuals of the dataset who share one (of two) given endogenous/exogenous properties. The weighted arcs of the network connect segments that contain the same individual ; their weight corresponds to the number of individuals who share the two given properties, and they point from the first cluster to the other.
The command can be useful, for instance, for studying migration flows (from the birth place to the death place of the dataset individuals).
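The counting behind such a flow network can be illustrated outside Puck. The following is a minimal sketch, not Puck's implementation: `flow_network` is a hypothetical helper, individuals are represented as plain dictionaries of property values, and skipping individuals who lack one of the two properties is an assumption.

```python
from collections import Counter

def flow_network(individuals, source_label, target_label):
    """Count the individuals behind each (source value, target value) arc."""
    flows = Counter()
    for props in individuals:
        source = props.get(source_label)
        target = props.get(target_label)
        # Assumption: individuals lacking one of the two properties are skipped.
        if source is None or target is None:
            continue
        flows[(source, target)] += 1
    return flows

# Illustrative data only.
people = [
    {"BIRT_PLACE": "Rome", "DEAT_PLACE": "Venice"},
    {"BIRT_PLACE": "Rome", "DEAT_PLACE": "Venice"},
    {"BIRT_PLACE": "Venice", "DEAT_PLACE": "Rome"},
    {"BIRT_PLACE": "Rome"},
]
flows = flow_network(people, "BIRT_PLACE", "DEAT_PLACE")
for (source, target), weight in sorted(flows.items()):
    print(f"{source} > {target}: {weight}")
```

Each key of `flows` corresponds to one directed arc from a source segment to a target segment, and its value to the arc weight.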
When executing the command, a dialog window automatically opens.
Here, the Source Label field defines the first property used for grouping (e.g., BIRT_PLACE), and the Target Label field defines the second one (e.g., DEAT_PLACE).
***The Minimal number of links field allows excluding from the results network all the source nodes whose size doesn't reach a given number of individuals.
Unlike for alliance networks, the results are shown only as statistics. The Report Window is made up of five tabs:
- Flow Network Report, where one can find a review of the input criteria, as well as the possibility to export the network in .paj format ;
- Analysis, which presents specific statistics on the flow network ;
- Matrix, which shows the flow network matrix in the same form as an Alliance Matrix ;
- Flows, which lists the source > target pairs of nodes and, for each pair, the individuals appearing in both segments.
- ***GAP Sortable List, where [...]
***GAP Simulation Tools
The command Transform > Reshuffling produces the network that results from randomizing the marriages of the corpus (while keeping the rest of it as it is). This simulation technique helps to assess to what extent specific matrimonial configurations depend on demographic and/or data collection biases.
When executing the command, a dialog window automatically opens, asking to specify :
- The Number of edge permutations per step - [...]
- The Maximum generational distance - [...]
- The Minimum shuffle percentage (stop condition) - [...]
- The Minimum stable iterations (stop condition) - [...]
The command Transform > Virtual Network simulates the biases introduced by data collection. It is useful for seeing how the network morphology would change if, for instance, all informants came from a small set of families.
When executing the command, a dialog window automatically opens, asking to specify :
- The Number of informants - [...]
- The Kin proximity - [...]
- The Kin degree - [...]
- The Near Kin weight - [...]
- The Memory - [...]
- The Acceptance of both a Male Informant and a Female Informant
- The Kin Recall rates of both Men's Kin (first degree) and Women's Kin (first degree).
The command Transform > Virtual Fieldwork Variations allows [...]
In addition to data entry and navigation through the corpus, Puck lets you explore the kinship environment of individuals and run structural, socio-centered analyses. The Analysis menu contains the tools for both kinds of analysis.
Pedigree and Progeniture
The commands Analysis > Pedigree and Analysis > Progeniture produce a complete list of Ego’s ascendants or descendants, up to a given degree. After executing either command, a dialog window opens asking for the maximal generational depth of the ascent/descent ties you are looking for. Enter a single number that sets the generational limit of the search. For instance, entering “3” produces a tree-structured report of Ego's known ascendants/descendants up to great-grandparents/great-grandchildren.
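The effect of the depth limit can be sketched in a few lines. This is only an illustration under stated assumptions: `pedigree` is a hypothetical function, and the corpus is reduced to a dictionary mapping each Id number to a (father, mother) pair, with `None` for an unknown parent.

```python
def pedigree(ego, parents, max_depth):
    """Return {generation: [ascendant ids]} for ego, up to max_depth generations."""
    result = {}
    current = [ego]
    for depth in range(1, max_depth + 1):
        next_generation = []
        for person in current:
            father, mother = parents.get(person, (None, None))
            next_generation.extend(p for p in (father, mother) if p is not None)
        if not next_generation:
            break  # no known ascendants beyond this generation
        result[depth] = next_generation
        current = next_generation
    return result

# Illustrative corpus: id -> (father id, mother id).
parents = {1: (2, 3), 2: (4, 5), 3: (6, None), 4: (7, 8)}
print(pedigree(1, parents, 3))
```

With a limit of 3, the search stops at the great-grandparents even if earlier ascendants are recorded in the corpus.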
The command Analysis > Relatives produces a complete list of Ego’s relatives of a given type. Enter a structure formula in positional notation in the dialog window, just as for a matrimonial or relational census (see infra). Note, however, that this function is ego-centered : the first individual in the formula is the currently selected individual.
For instance, entering “XX(X)XX” will produce the list of all Ego's cousins (with their names, identity numbers, and exact kinship relation types).
The command Analysis > Kinship Chains produces an exhaustive list of the kinship chains connecting Ego to another individual. Enter two numbers : the first is alter’s identity number, the second the canonical degree (maximal genealogical depth). A third number can be entered to specify the maximal order of the chain, that is, the maximal number of marriages it may contain. Puck lists all chains between ego and alter within the specified bounds, in the classification of your choice.
The command Analysis > Distances classifies ego's relatives according to their genealogical distance from ego. When executing the command, a dialog window opens asking which kinds of relatives are to be taken into account. Select the desired Filiation Type in the check-box list, then specify the upper limit of the search in the Max Distance field.
When launching the count, PUCK automatically adds a new attribute to all individuals. The attribute label will be, e.g., "DIST 1" if ego's Id number is "1", and its value will be the genealogical distance between ego and alter (the minimal number of arcs connecting them).
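Since the distance is defined as the minimal number of arcs connecting two individuals, it can be computed with a breadth-first search. The sketch below assumes a simplified model in which every kinship tie is an undirected arc between two Id numbers; `distances` is a hypothetical helper, and the Filiation Type filter is left out.

```python
from collections import deque

def distances(ego, arcs, max_distance):
    """Breadth-first search: minimal number of arcs from ego to each individual."""
    neighbours = {}
    for a, b in arcs:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    dist = {ego: 0}
    queue = deque([ego])
    while queue:
        person = queue.popleft()
        if dist[person] == max_distance:
            continue  # do not explore beyond the chosen upper limit
        for other in neighbours.get(person, ()):
            if other not in dist:
                dist[other] = dist[person] + 1
                queue.append(other)
    return dist

# Illustrative arcs between Id numbers.
arcs = [(1, 2), (1, 3), (2, 4), (4, 5), (3, 6)]
ego = 1
label = f"DIST {ego}"  # attribute label as described in the text
values = distances(ego, arcs, max_distance=2)
print(label, values)
```

Individual 5 lies three arcs away from ego and therefore receives no value when the limit is 2.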
The command Analysis > Basic information (CTRL+B) gives access to the basic information of a dataset, which is the starting point for the analysis of its structure.
It produces a report that contains basic information such as :
- The number of individuals (differentiated by gender : men/women/unknown)
- The number of marriage relations and unions (differentiated by gender)
- The number of parent-child relations
- The number of fertile marriages (couples with children), in absolute terms and as a percentage of total marriages
- The number of co-spouse relations (relations between co-wives and between co-husbands).
- The number of components (maximal connected subnetworks) and the size of the largest component, which is useful for evaluating the cohesion (or disintegration) of the dataset.
- The mean share (size divided by total network size) of agnatic/uterine components and the share of the largest agnatic/uterine component, and the percentage of marriages involving a member of the largest agnatic/uterine component.
- Elementary Cycles : The cyclomatic number (number of independent cycles) of the network.
- The Density of the kinship network : The number of marriage and filial relations divided by the total number of possible relations between two different individuals.
- Maximal and Mean Depth : the mean genealogical depth is computed as an average of the mean generational depth of each individual’s pedigree, according to the formula of Cazes (Cazes & Cazes, 1996).
- Mean number of spouses (differentiated by gender).
- Mean fratry size (mean number of cognatic, agnatic and uterine groups of siblings).
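Three of the indicators listed above (components, cyclomatic number and density) follow from simple counts. The sketch below is an illustration, not Puck's code: `basic_measures` is a hypothetical function, relations are treated as undirected ties, the cyclomatic number uses the standard graph-theoretical formula (ties minus individuals plus components), and the density follows the definition given in the list.

```python
def basic_measures(individuals, relations):
    """Components, cyclomatic number and density of a small kinship network."""
    parent = {i: i for i in individuals}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in relations:
        parent[find(a)] = find(b)

    n = len(individuals)
    m = len(relations)
    components = len({find(i) for i in individuals})
    return {
        "components": components,
        # Standard formula: ties - individuals + components.
        "cyclomatic_number": m - n + components,
        # Ties divided by the number of possible pairs (requires n >= 2).
        "density": m / (n * (n - 1) / 2),
    }

# Illustrative data: a couple (1, 2) with child 3, and a childless couple (4, 5).
individuals = [1, 2, 3, 4, 5]
relations = [(1, 2), (1, 3), (2, 3), (4, 5)]
measures = basic_measures(individuals, relations)
print(measures)
```

In this toy network the single independent cycle is the one formed by the two spouses and their child.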
After identifying the errors affecting the dataset, the researcher should take its limits and biases into account. This is done by exploring the network morphology, and it should be seen as a precondition for any analysis or matrimonial census. For this purpose, Puck offers a wide range of tools, accessible from the command : Analysis > Statistics (CTRL + G). When executing the command, the Statistics Input Window automatically opens. This is a fundamental tool for dataset diagnostics, which is one of PUCK's main functions. A specific section of this guide is thus dedicated to it. To move there, you can click here or use the "Functionalities" menu.
Note : the use of this command presumes a basic knowledge of the partitioning process. To read more about partitioning, see here.
The command Analysis > Partition Statistics produces statistics on the distribution of given partitions over the dataset. When executing the command, a dialog window opens asking you to set the Partitions Statistics Input criteria. This is done, firstly, in the Partition Diagrams Criteria frame, where you can set more than one partition criterion at a time. ***GAP Secondly, the Split Partition Criteria frame allows [...].
After launching the count, PUCK shows the results in a new tab, both as diagrams and as tables. In the diagrams, the abscissa indicates the size (number of members) of each resulting partition ; the ordinate indicates the number of existing partitions of each size.
One of the most important PUCK functions consists in running a circuit census of your kinship network. This can be done by executing the command Analysis > Circuit Census... (Ctrl+H), which automatically opens the dialog window called Census Reporter Inputs. A number of settings can be chosen from it, following your analysis needs.
A specific section of this guide is dedicated to the use of the Census Reporter Inputs. To move there, you can click here or use the "Functionalities" menu of this guide.
The command Analysis > Differential census runs a segment-based census and provides several means of comparison, which mainly concern the relations between members of the same segment. Its significant advantage over a global census is the possibility of considering several segments separately. By applying a differential census to all the clusters that a given segmentation defines in the dataset, statistical results can be obtained concerning, for instance, uterine-agnatic relations within a household or kinship relations within a profession. The results of a Differential Census appear both as diagrams and as tables. A Differential census produces : relational statistics for each cluster ; global and mean percentages of relations ; the distribution of relation percentages by cluster size.
In the diagrams, the abscissa orders the selected partitioning criterion (e.g., occupation) and the ordinate [...]
The Kinship calculator allows converting, transforming and analyzing kinship relations.
You can access it with the command Tools > Calculator.
Kinship relations can be entered in any notation. The calculator contains three lines in order to allow unary and binary operations. These operations can act on fully specified relations or on a relational schema (without specification of gender).
The Standard button allows bringing the entered formula to its standard form. Then it will begin with the longest ascending and most "agnatic" chain (a chain is the more agnatic the more male members it contains and, in case of equality, the higher the position of these members).
You can change the ego/alter point of view by clicking on :
- Reflect : swaps ego and alter
- Rotate : replaces ego by the next married pivot (not married to ego). In consanguine relations, it is equivalent to identity.
You can perform some binary operations to compose and combine kinship relations :
- Compose : composes relation 3 by linking alter of relation 1 to ego of relation 2 (by marriage in the heterosexual case, identity in the homosexual case)
- Insert : calculates the relation 3 implied between ego and alter of relation 1 if ego's parents are in relation 2
- Switch : switches the selected relation from standard to positional notation, and vice-versa.
- ***GAP Develop : [...] all relations of a same type
- ***GAP Analyze : [...] a relation producing the analytic profile of the relation in the Report window
The Main Window provides both the navigation and the data management functions. From the Main Window it is possible to add, modify and delete individuals, families, additional data and relations. It can be used for data input, even for large corpora.
On the upper left side, there are the Navigation Tabs : Corpus, Individuals and Family. The Corpus tab contains general information on the dataset. The Individuals tab contains the list of all individuals and the general information related to them. The Family tab contains the list of all the families of the dataset.
- Ego's Identity Number. Each individual has a unique Id number which serves to identify him or her in all contexts (such as matrimonial circuits, or the kinship section of another individual’s page). Ideally, this value should be assigned once and only once. However, it can be changed by clicking on it. If you re-assign a value that is already in use, Puck reports this in an error window.
- Ego's Gender. In the Individuals Tab, gender is represented by a circle for women, a triangle for men and squares for individuals whose gender is unknown. In every individuals frame, gender is represented by the commonly used symbols : ♂ for men, ♀ for women and ⚲ for unknown. Here, it is possible to change the gender of the selected individual by clicking on the symbol.
- The individual’s first name and last name. When importing or exporting kinship data, Puck identifies separate parts of names by a slash “/”. The name part before the first slash is identified as the first name and all the other parts as the last name. However, any number of names can be distinguished by using the slash separator. The two name fields are auto-completing. To facilitate data entry, clicking on the name makes a drop-down menu appear in which all first and last names of the corpus are displayed.
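The slash convention for names can be made concrete with a short example. `split_name` is a hypothetical helper that only mirrors the rule stated above (first part before the first slash, remaining parts afterwards) ; how Puck itself stores or joins the remaining parts is not assumed here, so they are returned as a list.

```python
def split_name(full_name):
    """Split a slash-separated name into (first name, list of remaining parts)."""
    first, _, rest = full_name.partition("/")
    # Empty parts (for example from a trailing slash) are dropped.
    other_parts = [part.strip() for part in rest.split("/") if part.strip()]
    return first.strip(), other_parts

print(split_name("Maria/Rossi"))
print(split_name("Maria/Rossi/Bianchi"))
```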
Ego's First Degree Relatives Frame
On the right side of the Identity Frame, a specific section shows Ego's parents, who are designated by their gender, names and Id Numbers. Next to it, the yellow circles indicate their Union Status.
If Ego doesn't have a parent and you want to add one, click on the name field and either choose an existing individual or create a new one by typing a non-assigned number.
Below Ego's Identity frame are two further frames : the first is the Partners frame, which contains information about Ego's spouses ; the second is the Children frame, which contains information about Ego's first degree descendants.
Additional Data Frame
The Additional data frame contains the attributes of the individual or family (such as event dates and places). All these attributes are exogenous properties (date and place of birth/marriage/death, occupation, etc.) and can be used to partition the corpus (see here).
Puck distinguishes three different categories of attributes and groups them in separate sections. All attributes are characterized by a label (for which standard individual property codes should be used) and a value. For instance, a property with the label “OCCU” and the value “merchant” means that the individual’s occupation is that of a merchant. When importing GEDCOM files, relational properties are automatically transformed into properties of the individuals concerned, with reciprocal specification of alter. For a list of standard property codes and guidelines for personalized property codes click here.
In order to add an individual/family attribute : on the Additional Data frame headline, click on the "+" button and a line will appear inside the frame. Click on the first cell from the left and type the property code (e.g., "BIRT_PLACE") ; then, in the second cell from the left, type the corresponding value (e.g., "New York") and press Enter.
In order to add an existing individual/family attribute, in Ego's Additional Data frame headline, click on the "++" button. This will make all the existing properties appear, so that you won't have to type the property code again!
If you want to clear Ego's Additional Data frame, on the headline, click on the "c" button. This will erase all empty attributes.
In order to remove an individual/family attribute : on the Additional Data frame, select the attribute to be removed, right-click and click on "delete".
In order to remove (or to make changes on) all attributes, see here.
A property of ego that specifies an alter, and whose reciprocal property is automatically attached to alter with ego specified, still remains a combination of individual properties and is not a property of the relation. This is important when these properties are used to partition the corpus or to restrict a relational or matrimonial census to a subcorpus.
For instance, search results for matrimonial rings among individuals whose marriages lie within a certain period may well contain marriages outside that period, if both partners have been married before or after.
The last number of a date is automatically interpreted as the year (important for partitioning). This must be taken into account when specifying events that occurred before, after, or around a certain year.
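The practical consequence of this rule is easy to check with a small example. `year_of` is a hypothetical helper that applies the rule literally (the last number found in the value is taken as the year) ; it is a sketch of the stated convention, not Puck's actual date parser.

```python
import re

def year_of(date_value):
    """Return the last number found in a date value, read as the year."""
    numbers = re.findall(r"\d+", date_value)
    return int(numbers[-1]) if numbers else None

for value in ["12.03.1854", "03/1854", "ca. 1850", "1854-03-12"]:
    print(value, "->", year_of(value))
```

Under this reading, a date written year-first such as "1854-03-12" would yield 12 rather than 1854, which is why the year should be written last.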
One and the same label cannot be used at the same time for a simple property and for an event property. For example, the code MARR cannot be used at the same time to indicate if a person is married (simple property) and when, where and whom he or she is married to (event property). “Notes” are simple properties with the only difference that they allow entering long texts and line breaks. The label of notes (“NOTE”) is not displayed and cannot be chosen or changed.
All attribute fields (with the exception of notes) contain drop-down menus which allow choosing among existing label, value, place, and date data and existing individuals (for alter). The drop-down menu for alter contains identity numbers and names of all individuals in the corpus, just as for kinship relation entries.
The Relation frame contains the list of Ego's alters of a specific relation (for its use, see the relation model paragraph). On the headline of each frame the number of the respective relatives (number of spouses or children) is indicated. By double-clicking on the name of a related individual in each frame, you can jump to its individual page. Jumping from one relative to another is a way to navigate through the corpus along kinship paths.
The Family tab looks just like the Individuals tab, except that its records are families and not individuals. Families have Id numbers and are listed in the left side box ; each has an Identity frame, as well as a Children frame and an Additional Data frame.
If you wish to change the Union Status of a single union (e.g., from "married" to "divorced") you can simply click on the Union Status symbol until you reach the right setting.
- adding an individual : by clicking on the "+" symbol ;
- removing an individual : by clicking on the "-" symbol ;
- sorting the individuals list by Id number, First Name, Last Name : by clicking on the "Sort" button ;
- navigating through the individuals list one by one : by clicking on "Previous" or "Next" buttons ;
- choosing the Id number assigning strategy : filling numbering gaps or always starting from the biggest one ;
- choosing the default Union Status (married, divorced, unmarried).
Remove a Kinship Relation
In order to remove a descent tie : move to the Family navigation tab, select the family in which the tie to be removed appears, select the child or children concerned in the Children frame, then click on "-" in the Bottom Toolbar.
In order to remove a union : on the Family navigation tab select the family to-be-removed, on the Bottom Toolbar click on "-". Then, move to the Individual navigation tab, select one individual from the couple to-be-removed and, in the Partner frame right-click on the partner and click on "Delete".
Search Dialog Box
A Search Dialog Box is located on the right side of the Bottom Toolbar (bottom right-hand corner of the main window). It allows searching for individuals or families in the dataset, by entering the name, part of the name, or the Id number of the individual(s) sought. If several individuals fit the search criteria, pressing Enter repeatedly moves from one matching individual to the next, and thus lets you navigate through the corpus. This can be useful, for instance, when a family name has been entered, when the same individual appears in more than one family, or in cases of homonymy.
Note : For a brief introduction to the Statistics function, see here.
The Statistics Inputs window (Analysis > Statistics... Ctrl+G ) contains, first of all, the General frame. It allows obtaining information and diagrams about the current dataset. By selecting the query criteria from the checkbox list contained in the frame, you will obtain information about some dataset structural properties, such as :
- The corpus Gender bias (weight)
- The corpus Gender bias (net weight)
- The distribution of Components
- The corpus Genealogical Completeness
- The corpus Ancestor chains (“Ancestor types”, choosing degree)
- The Fratry Distribution
- Consanguine Chains
- Four Cousins Marriages
Partition Diagrams Criteria
The Partition Diagrams Criteria frame allows obtaining statistics and diagrams about partitions of the dataset. More than one partition can be analyzed at a time. The resulting diagrams show the partitioning criterion on the abscissa and the cluster sizes on the ordinate.
***GAP The Split Partition Criteria frame allows [...]
***GAP The Mean Cluster Values checkbox allows [...]
After launching a query from the Statistics Inputs window, the results are displayed in a report window, both as graphs and tables. Graphs can be viewed individually (by clicking on them). Results can be saved in .txt or .xls formats (by clicking on the “Save” button and choosing the destination folder). The report window also provides information on the distribution of properties.
The first measure of the corpus gender bias is the Agnatic (Uterine) Weight. The following snapshot shows an example of Gender Bias (weight) report diagram, taken from M. Gasperoni's "Ebrei" corpus.
The other Gender Bias measure is the Agnatic (Uterine) Net Weight. The following snapshot shows an example of Gender Bias (net weight) report diagram, taken from M. Gasperoni's "Ebrei" corpus.
Both Gender Bias measures are useful indicators of the interdependence and interconnection of the genealogical knowledge : the closer the uterine and agnatic curves are to one another, the higher their interdependence ; the further apart they are, the more autonomous they become, that is to say, what is known are the agnatic or the uterine unisexual lines. Low curves indicate interconnection.
Note : it is important to analyze the gender bias on partitions (e.g., by generation or by date of birth of the individuals). Partitioning is particularly useful for focusing on one part of a network.
Distribution of Components
The Components diagram shows the distribution of agnatic/uterine components (connected subnetworks made up entirely of paternal/maternal ties) according to their size : the abscissa of the diagram indicates the relative size of components (as a percentage of total network size, where size = number of individuals), and the ordinate indicates the relative frequency of components of a given size (as a percentage of the total number of components). The following snapshot shows an example of Components report diagram, taken from M. Gasperoni's "Ebrei" corpus.
The Genealogical Completeness of a kinship network corresponds to the percentage of known ascendants (agnatic, uterine and overall) by generation. The following snapshot shows an example of Genealogical Completeness report diagram, taken from M. Gasperoni's "Ebrei" corpus.
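To make the notion of completeness concrete, here is a minimal sketch of the overall (non-gendered) measure. It rests on an assumption not stated in this guide : at generation g every individual has 2^g possible ascendants, so the rate is the number of known ascendants at that generation divided by the number of individuals times 2^g. `completeness` is a hypothetical function, and the agnatic and uterine variants are left out.

```python
def completeness(parents, max_generation):
    """Percentage of known ascendants per generation, over all individuals."""
    rates = {}
    # For each ego, the ascendants known at the current generation.
    frontier = {ego: [ego] for ego in parents}
    for generation in range(1, max_generation + 1):
        known = 0
        for ego, people in frontier.items():
            ascendants = []
            for person in people:
                for p in parents.get(person, (None, None)):
                    if p is not None:
                        ascendants.append(p)
            frontier[ego] = ascendants
            known += len(ascendants)
        # Assumption: 2 ** generation possible ascendants per individual.
        rates[generation] = 100.0 * known / (len(parents) * 2 ** generation)
    return rates

# Illustrative corpus: id -> (father id, mother id), None if unknown.
parents = {1: (2, 3), 2: (4, None), 3: (None, None), 4: (None, None)}
rates = completeness(parents, 2)
print(rates)
```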
The Fratry Distribution is the distribution of agnatic and uterine fratries (sibling groups) according to their size. The following snapshot shows an example of Fratry Distribution report diagram, taken from M. Gasperoni's "Ebrei" corpus.
***First Cousin Marriage
The First Cousin Marriages diagram shows the occurrences of the four types of first cousin marriage : between cross/parallel patri-/matri-lateral cousins. The four cousin types are distributed along the abscissa and their respective numbers of occurrences are given on the ordinate. The following snapshot shows the First Cousin Marriages report diagram of M. Gasperoni's "Ebrei" corpus.
The Ancestor chains diagram shows the composition of ancestor chains, expressed in positional notation, for a degree of your choice and by gender. The distribution of consanguine chains is important to know because it provides an additional measure of bias (Barry & Gasperoni, 2008, p. 71‑77). The following snapshot shows an Ancestor Chains report diagram, taken from M. Gasperoni's "Ebrei" corpus.
***Distribution of properties
Using the Statistics Window, it is also possible to analyze the distribution of endogenous and exogenous properties, combining queries and property codes and exporting the results as partitions (to be displayed with other software such as Pajek). This gives a precise picture of the profile and composition of the dataset, which serves not only to analyze it but also to improve and complete it afterwards. Puck generates information and statistics on all genealogical data (number and distribution of known ascendants/descendants etc.) and exogenous data (occupation, place and date of birth, etc.). The results appear in the form of tables, diagrams or partitions.
The Census Reporter Inputs window (Analysis > Circuit Census (CTRL+H) ) lets you configure and run a circuit census on your dataset. It presents a number of parameters, which can be used depending on the needs of the analysis.
The Pattern field is a fundamental tool, which lets you define the type of circuits that you wish to count, and thus the type of census desired : consanguine marriages, two-group and three-group relinking. It accepts two input methods.
- The first way to use the Pattern field is to specify the maximal dimensions of the matrimonial circuits to be searched for.
Matrimonial Circuits have two dimensions : their width (or order) and the maximal canonic degree of their consanguine relations (or depth). The first number to type into the Pattern field indicates the maximal canonic degree of circuits with width 1 (i.e. incorporating 1 marriage arc, corresponding to consanguine unions) ; the second number indicates the maximal canonic degree of circuits with width 2 (i.e. incorporating 2 marriage arcs, corresponding to two-group relinkings) ; the third number indicates the maximal canonic degree of circuits with width 3 (i.e. incorporating 3 marriage arcs, corresponding to three-group relinkings).
For instance, the code 3 2 1 sets the horizon of matrimonial circuit search to blood marriages between 2nd cousins (degree 3), marriage redoublings between pairs of 1st cousins (degree 2) and marriage retriplings between pairs of siblings (degree 1). All circuits/relations of lower dimensions (for instance, blood marriages between first cousins or marriage redoublings between pairs of siblings) are included in the search.
- The second way to use the Pattern field is to enter a structural schema expressed in positional notation (e.g. XX(X)XX will find all circuits corresponding to marriages between cousins). Note that, as a rule, Puck reads a structure formula in a socio-centered manner. That is, the formulas X(H)HX and XH(H)X designate one and the same type of relation. This is important if the census is run on a subcorpus, where ego may be included but alter may not. Without further indication, a structure formula limits the search to circuits/relations that exactly fit this formula (unlike the first method, which only sets an upper limit). It is possible, however, to include all circuits/relations that lie within the limits of the formula by prefixing it with the character “<”. For instance, the formula “<XXX(X)XX” limits the search to all blood marriage circuits within the limits of the 5th civil degree (including marriages between first cousins, between uncles and nieces etc.). The juxtaposition of several structure formulas is interpreted as a combination by logical “or”. For instance, the formula X.X(X)X XX(X)XX limits the circuit search to marriages between siblings-in-law or 1st cousins.
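The first input method (the numeric code, such as 3 2 1) lends itself to a short illustration. The sketch below is not Puck's parser : `parse_pattern` and `within_horizon` are hypothetical helpers that only restate the rule given above, namely that the n-th number is the maximal canonic degree allowed for circuits of width n, and that circuits of lower dimensions are included.

```python
def parse_pattern(pattern):
    """Map each circuit width (1, 2, 3, ...) to its maximal canonic degree."""
    degrees = [int(token) for token in pattern.split()]
    return {width: degree for width, degree in enumerate(degrees, start=1)}

def within_horizon(limits, width, degree):
    """True if a circuit of the given width and degree falls inside the search horizon."""
    return width in limits and degree <= limits[width]

limits = parse_pattern("3 2 1")
print(limits)
print(within_horizon(limits, 1, 2))  # first cousin marriage: width 1, degree 2
print(within_horizon(limits, 2, 3))  # width 2 but degree 3: outside the horizon
```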
After setting the Pattern field, a matrimonial census can be refined according to specific criteria, namely : Filiation, Symmetry, Sibling, Circuit and Restriction types.
- Cognatic : all consanguinity relations are permitted ;
- Agnatic : only agnatic relations are permitted (unilinear census) ;
- Uterine : only uterine relations are permitted (unilinear census) ;
- Bilateral : only bilateral relations are permitted.
The Symmetry Type checkbox determines whether ego and alter are permutable in the relations counted. For example, if kinship chains between co-residents are searched, the option “symmetry” has to be activated, because co-residence is a symmetric relation. Accordingly, the chains “father-son” and “son-father” will be counted as one single category. By contrast, if kinship chains between persons and their heirs are searched, the option “symmetry” has to be deactivated, because inheritance is an asymmetric relation. Accordingly, the chains “father-son” and “son-father” will be counted as different categories.
Note : the symmetry type choice is only relevant for a non-matrimonial circuit census. If matrimonial circuits or open kinship chains are searched, ego and alter are always considered as permutable (in the first case, male or female ego will be chosen according to the chosen option, in the second case, there is no criterion for the selection of ego or alter).
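The effect of the symmetry option on the resulting categories can be mimicked with a toy count. This is an illustrative sketch only : chains are reduced to hypothetical (role of ego, role of alter) pairs, and `count_chain_types` is not a Puck function.

```python
from collections import Counter

def count_chain_types(chains, symmetric):
    """Count chain categories, merging a chain with its reverse when symmetric."""
    counts = Counter()
    for ego_role, alter_role in chains:
        key = (ego_role, alter_role)
        if symmetric:
            # Ego and alter are permutable: use one canonical ordering.
            key = tuple(sorted(key))
        counts[key] += 1
    return counts

observed = [("father", "son"), ("son", "father"), ("father", "son")]
merged = count_chain_types(observed, symmetric=True)
separate = count_chain_types(observed, symmetric=False)
print(dict(merged))
print(dict(separate))
```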
- 2 (None) : no siblings assimilated. Only paternal and maternal siblings are distinguished, and full siblings are counted twice (once as paternal and once as maternal siblings). This method is recommended when half-sibling relations are frequent (e.g. because of high rates of polygamy) and the agnatic or uterine relationship is more important than the sibling relationship as such.
- 3 (Full) : full siblings assimilated. True paternal and maternal half-siblings are distinguished from full siblings. This method provides the maximum of information.
- 1 (All ) : all siblings assimilated. Full and half siblings are not distinguished. This method is recommended when half sibling relations rarely occur or do not matter for marriage rules.
- Circuit : counts matrimonial circuits.
- Ring : does not consider circuits/relations that completely include shorter ones (matrimonial rings and not all matrimonial circuits).
- Minor : does not consider circuits/relations that intersect with shorter and narrower ones (minor matrimonial rings and not all matrimonial circuits).
- Minimal (Minimal rings only) : does not consider circuits/relations that intersect with shorter ones (minimal matrimonial rings and not all matrimonial circuits)
- All : all married individuals (matrimonial circuit pivots) must belong to the chosen cluster.
- Ego : ego (according to a chosen kinship schema) must belong to the chosen cluster (presupposes a search expressed by formula).
- Last married : the last married individual must belong to the chosen cluster (this presupposes marriage dates, which for the moment are still treated as individual properties).
With the closing relation, it is possible to choose different types of census (matrimonial, relational, absence of closing relations) based on endogenous or exogenous properties (occupation, residence etc.), thereby extending the search criteria and producing diagrams.
- The Details and Diagrams frame allows [...] The Label column enables to choose [...] The Report and Diagram check-boxes allow to [...]
- The Couples only check-box [...]
- The Mark Individuals check-box allows adding the binary property of being in a circuit of a given type to each individual who forms a pivot of a circuit. The label of the property is the circuit type in standard notation, preceded by "CENSUS" (see the Individual Property Codes). The property includes indication of Alter and appears in the Additional Data frame.
Partitioning the corpus according to such a property makes it possible to extract/expand the sub-corpus of all individuals that are part of a circuit of the given type.
Using this property to redefine spouses and then running a second relational or matrimonial census makes a complex matrimonial or relational census possible (multiple kinship relations or intersecting matrimonial circuits).
- The Circuits as Relations check-box [...]
- The Cross-sex chains only check-box [...]
- The List out-of-circuits pairs check-box [...]
- The List all perspectives check-box [...]
The Filter field lets you exclude a relation type from a circuit census. Define the relation to be filtered by typing it in positional notation.
Open Chains Frequencies and Closure Rate
A Relational Census can be used to count non-matrimonial relations and, more importantly for the structural analysis of a kinship network, to evaluate some of the biases of the dataset. For example, if a kinship network contains many more patrilateral cross cousins than matrilateral ones, marriages between the former will automatically be more frequent. The prominence of a given marriage type therefore does not necessarily indicate a social preference for that type of marriage : it may merely result from the higher frequency of that specific relation (not "closed" by a marriage tie) compared to others. Reading high frequencies of given matrimonial circuits as a direct sign of social preference can thus be misleading.
The closure rate (Hamberger & Daillant, 2008, p. 27-28) is an indicator designed to prevent such mistakes. As the snapshot below shows, the Closure Rate is calculated by selecting the Open Chains Frequencies check-box in the PUCK Census Reporter Inputs window. The Results table then shows, for each type of matrimonial circuit, both the total number of Open Chains and the Closure Rate.
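As a rough illustration of the idea, the sketch below computes a closure rate per relation type as the share of chains that are closed by a marriage. The function name, the sample counts and the exact ratio (closed chains divided by all chains of that type, closed ones included) are assumptions made for illustration only; PUCK's own computation follows Hamberger & Daillant (2008) and may differ in detail.

```python
def closure_rate(closed_chains: int, open_chains: int) -> float:
    """Share (in percent) of chains of one type that are closed by a marriage.

    Assumption: `open_chains` counts every chain of the type, whether or not
    it is closed by a marriage, so the rate lies between 0 and 100.
    """
    if open_chains == 0:
        return 0.0
    return 100.0 * closed_chains / open_chains


# Hypothetical counts, not taken from any real corpus.
counts = {
    "patrilateral cross cousin": {"closed": 12, "open": 400},
    "matrilateral cross cousin": {"closed": 6, "open": 100},
}

rates = {
    relation: closure_rate(c["closed"], c["open"])
    for relation, c in counts.items()
}
```

In this invented example the patrilateral type has twice as many marriages but half the closure rate of the matrilateral type, which is the kind of bias the indicator is meant to expose.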
It has to be set before launching a census, and it offers several options :
- Circuits as network : exports all matrimonial circuits found by the census as separate networks (with nominal indication of individuals as vertex labels), as well as a partition to distinguish vertices that occupy spouse positions in the circuit.
- Circuit induced network : produces the matrimonial network corresponding to the matrimonial census (that is a network made up only of the links that form part of some matrimonial circuit), as well as a partition to distinguish vertices that occupy spouse positions in the matrimonial network.
- Circuit induced frame network : (consanguine chains reduced to lines) produces the matrimonial network frame corresponding to the matrimonial census, embedded in the total kinship network in which marriages are coded according to the circuit(s) they are part of. This may provide a synthetic representation in which certain connective features of the matrimonial network can be identified.
- Circuit intersection network : produces the circuit intersection network corresponding to the matrimonial census, as well as the circuit intersection matrix (Hamberger & Daillant, 2008, p. 22‑24).
Relational to Complex Matrimonial Census
Whereas the matrimonial circuit census counts the matrimonial circuits in a kinship network, a non-matrimonial circuit census counts relational circuits, that is, kinship chains "closed" by a previously defined relation (for instance, co-residence, friendship, etc.). The Closing relation frame lets you run such a census by selecting the desired relation.
As shown in the example above, setting the Closing Relation to "Open" and selecting the Couples only checkbox leads to a census of open chains concerning married people.
A more complex relational or matrimonial census can be carried out by combining two censuses : the results of the first (stored as relational properties by the Mark Individuals function) are used to redefine spouses, and the second census is then run on the corpus thus transformed.
In this manner, one can search for MBD marriages that are at the same time ZD marriages, bilateral cross cousins, and so on.
Such a complex census is a useful analytical complement to the inspection of the circuit intersection network.
Instead of being generated from a preliminary relational or matrimonial census, relational data can also be read directly from a file, for instance a list of ego-alter pairs (in the form of a two-column text file). For the precise method see the entry Relational properties from text files.
Mixed Matrimonial and Connubial Circuits
PUCK allows you to group individuals according to a chosen property (selected from a drop-down menu by double-clicking on the checkbox label) and to carry out a census of :
- Mixed Matrimonial Circuits containing the relation “belonging to the same cluster”. PUCK currently distinguishes 9 types of mixed matrimonial circuits, according to whether H, HF or HM belongs to the same cluster as W, WF or WM.
- Connubial Circuits : PUCK distinguishes endogamous circuits (1 group), redoubling or exchange circuits (2 groups, arrows pointing in the same or in inverse directions) and cycle circuits (3 groups, arrows consistently directed). For each circuit, the census lists the number of connubial circuits, the number of distinct cycles that may be formed from the marriages that constitute the connubial circuit, the weight of the circuit (the geometric mean of marriages joining two groups in the circuit) and the probability of the circuit, given the relative numbers of potential spouses in the constitutive groups.
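The weight of a connubial circuit is described above as a geometric mean of marriage counts. A minimal sketch of that calculation follows; the function name and the input (one marriage count per link between two groups of the circuit) are assumptions for illustration, not PUCK's actual data structures.

```python
import math


def circuit_weight(marriages_per_link):
    """Geometric mean of the marriage counts on the links of a circuit.

    Assumption: each entry is the number of marriages joining one pair of
    groups in the circuit.
    """
    if not marriages_per_link:
        raise ValueError("a circuit needs at least one link")
    return math.prod(marriages_per_link) ** (1.0 / len(marriages_per_link))


# Hypothetical counts: an exchange circuit (2 groups) and a cycle (3 groups).
exchange = circuit_weight([2, 8])      # square root of 16
cycle = circuit_weight([1, 8, 27])     # cube root of 216
```

A geometric mean drops sharply when any single link is weak, so a circuit that rests on one isolated marriage gets a low weight even if its other links are dense.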
After a circuit census has been launched, PUCK produces a report that indicates the precise number and type of the circuits searched for, as well as their classification (by individuals and by couples). Each report can be saved in .txt or .xls format by clicking on the "Save" button in the bottom right-hand corner.
- Census : indicates the total number of circuits and the maximum height (the canonic degree), the number of different circuits, the number of individuals and couples involved, in absolute numbers and as percentages, both in total and on the circuits concerned.
- Circuits : lists, for each type of circuit/relation (indicated in standard notation), all relations found in the corpus, with nominal indication of their pivots (and their relations: “=” for marriage, “-” for consanguinity) and the complete chain in positional notation (individuals being indicated by their identity numbers).
- Couples : lists, for each couple or pair of relatives concerned by the census, all the circuits/relations that link them, both in standard notation of the type and in positional notation of the complete chain (individuals being indicated by their identity numbers).
- Sortable list : lists all relations found in the census, with the index, standard and positional notation of the relation/ring type, nominal indication of their pivots and their relations (“=” for marriage, “-” for consanguinity) and the complete chain in positional notation (individuals being indicated by their identity numbers).
- Diagrams : currently, this function is not implemented.
Any individual/family property (for details, see the Property Codes section) can be used to split the corpus into subcorpora. A subcorpus (whose title indicates the parent corpus as well as the property label and value) has the appearance of an autonomous corpus (with its own corpus window and all dependent windows), and every operation on a corpus can also be carried out on a subcorpus. The important difference is that a subcorpus remains linked to the parent corpus : all relation and circuit search processes, while limiting results to individuals in the subcorpus, always run through the total corpus.
The partitioning commands are located on a specific bar at the top of the PUCK Main Window.
Warning : a subcorpus that is saved and re-imported as a normal corpus loses this important subcorpus property. All links to individuals outside the subcorpus are cut, and external individuals can no longer act as intermediaries for chains between members of the subcorpus.
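The difference between a linked subcorpus and one that has been saved and re-imported can be pictured with a small chain search. The sketch below is only an illustration under simplifying assumptions : kin links are treated as undirected pairs of identity numbers, and `find_chain` is a hypothetical helper, not a PUCK function.

```python
from collections import deque


def find_chain(links, start, goal):
    """Breadth-first search for a shortest chain of kin links."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical corpus: individual 2 is a parent of both 1 and 3.
links = [(1, 2), (2, 3)]
subcorpus = {1, 3}

# Linked subcorpus: endpoints are in the subcorpus, search uses the full corpus.
linked = find_chain(links, 1, 3)

# Re-imported subcorpus: links to outside individuals are cut.
detached_links = [(a, b) for a, b in links if a in subcorpus and b in subcorpus]
detached = find_chain(detached_links, 1, 3)
```

In the linked case the chain passes through individual 2, who is outside the subcorpus; once the links are cut, no chain between 1 and 3 remains.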
Here, the Model drop-down menu lets you choose between Individual and Family, the two models of partition criteria. The Label drop-down menu contains a list of endogenous and exogenous properties that can be set as partitioning criteria. Note that PUCK automatically adds to the list any attributes/additional data that you defined during the encoding phase.
For an explanatory list of the property codes, see the Property Codes section.
The Parameter field is used with partitioning criteria, such as "PEDG" or "PROG", that require a specification to operate.
For example, to segment the dataset on the basis of the number of known ascendants of a given degree, the Model field must be set to "INDIVIDUAL" and the Label field to "PEDG". In the Parameter field, you then define the maximal generational depth to which each individual's pedigree is calculated. If you set the Parameter to "2" and the Type to "Raw", PUCK will create five clusters :
- "0" (composed of individuals whose grand-parents are all unknown) ;
- "1" (one grand-parent known) ;
- "2" (two grand-parents known) ;
- "3" (three grand-parents known) ;
- "4" (all grand-parents known).
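The five clusters listed above correspond to a simple count of known ascendants at generation 2. A minimal sketch of such a count follows; the `parents` dictionary (identity number mapped to father and mother identity numbers, `None` when unknown) and the function name are assumptions for illustration and do not reflect PUCK's internal data model. An ancestor reached through two lines is counted twice in this simplified version.

```python
def known_ascendants(parents, ego, depth):
    """Count known ascendants of `ego` exactly `depth` generations up."""
    generation = [ego]
    for _ in range(depth):
        next_generation = []
        for person in generation:
            father, mother = parents.get(person, (None, None))
            next_generation += [p for p in (father, mother) if p is not None]
        generation = next_generation
    return len(generation)


# Hypothetical corpus: id -> (father id, mother id).
parents = {
    1: (2, 3),
    2: (4, 5),
    3: (6, None),
    7: (None, None),
}

# Raw partition on the number of known grand-parents (depth 2).
clusters = {}
for ego in parents:
    clusters.setdefault(known_ascendants(parents, ego, 2), []).append(ego)
```

Individual 1 has three known grand-parents (4, 5 and 6) and lands in cluster "3"; the others have none and land in cluster "0".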
The Type checkbox list lets you choose, among several possibilities, the most pertinent way to group the clusters of a given partition. This choice depends on the partitioning criterion. Here is a brief description of how these operators work :
- Raw : produces a basic partition containing as many clusters as necessary. E.g. the Gender partitioning (Model : "INDIVIDUAL", Label : "GENDER") produces 3 clusters : a Null one, a Female one and a Male one. A last name partition (Model : "INDIVIDUAL", Label : "LASTN") produces as many clusters as there are last names in the corpus (plus, possibly, a null cluster) ;
- Binarization : always produces 2 clusters, one corresponding to the Pattern and the other regrouping everything but the Pattern ;
- Free grouping : allows fixing irregular (or regular) date intervals ;
- Counted grouping : allows fixing the number of clusters in which you want to divide a given period (from start … to end) ;
- Sized grouping : allows fixing the duration of clusters in which you want to divide a given period (from start … to end).
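To make the operators more concrete, the sketch below imitates Raw, Binarization and Counted grouping on a toy property table. All function names, the `True`/`False` cluster keys and the equal-width splitting rule for Counted grouping are assumptions made for illustration; they are not taken from PUCK.

```python
from collections import defaultdict


def raw_partition(values):
    """One cluster per distinct property value (None acts as the null cluster)."""
    clusters = defaultdict(list)
    for ident, value in values.items():
        clusters[value].append(ident)
    return dict(clusters)


def binarize(values, pattern):
    """Two clusters: individuals matching `pattern` (True) and all others (False)."""
    clusters = {True: [], False: []}
    for ident, value in values.items():
        clusters[value == pattern].append(ident)
    return clusters


def counted_bounds(start, end, count):
    """Boundaries that split the period start..end into `count` equal intervals."""
    width = (end - start) / count
    return [start + i * width for i in range(count + 1)]


# Hypothetical gender codes by identity number (None = not recorded).
gender = {1: "H", 2: "F", 3: None, 4: "H"}

raw = raw_partition(gender)
binary = binarize(gender, "F")
bounds = counted_bounds(1500, 1900, 4)
```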
Note : The dataset can be partitioned more than once, so you can superimpose different partitioning criteria. To do so, click the Add Segment button (symbol "+") again. Applying different properties successively as partition criteria lets you refine the partitioning as far as your analysis requires.
Navigate through partitions and clusters
After partitioning the dataset, you can navigate through the different partitions and clusters by using the Partition / Cluster drop-down menus, or by clicking the Up / Down one segment buttons on the right side of the instruments bar ("Δ"/"∇" symbols). All the individuals and families listed in the Individuals and Families navigation tabs then belong to the selected cluster. However, Ego's relatives are not necessarily in the same cluster as Ego, and the selected cluster changes if you double-click on an individual who does not belong to the current cluster. So be careful : if you navigate through the corpus by clicking on individuals, you may leave the starting cluster without noticing it!
- Remove current segment (the "-" symbol on the bar), which erases the selected partition ;
- Clear all segments (the "broom" symbol on the bar), which erases all partitions.
Warning : partitions are hierarchically organized. If you erase the first partition (see the Partition drop-down menu) by using the command "-", the action also applies to all the other partitions.
- Genealogical corpora can be partitioned on a first level according to gender (Model : INDIVIDUAL, Label : SEX) ; on a second level, one of the resulting clusters can be partitioned according to family name (Model : INDIVIDUAL, Label : LASTN), and one of these clusters can in turn be partitioned according to first name (Model : INDIVIDUAL, Label : FIRSTN). Use the arrow buttons to move up and down between the levels of partition. For each item in the "Partition" list, the corresponding clusters appear in the "Cluster" list (Fig. 26).
- Partitioning the corpus according to individuals' date of birth : Start = 1500, Size = 100 and End = 1900 divides individuals born from 1500 up to 1900 into subsets corresponding to 100-year intervals, adding another subset containing individuals born before 1500 and a further subset containing individuals whose birth date is unknown (Sized grouping) (Fig. 27).
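The Sized grouping example above can be sketched as a function that assigns a birth year to a cluster. The function name and cluster labels are invented for illustration, and the handling of years at or after End is an assumption, since the text does not state how PUCK treats them.

```python
def sized_cluster(year, start, size, end):
    """Return a cluster label for `year` under a Sized grouping."""
    if year is None:
        return "unknown"
    if year < start:
        return f"< {start}"
    if year >= end:
        # Assumption: years at or after End get their own cluster.
        return f">= {end}"
    lower = start + ((year - start) // size) * size
    upper = min(lower + size, end)
    return f"{lower}-{upper - 1}"


# Hypothetical birth years by identity number.
births = {1: 1450, 2: 1500, 3: 1650, 4: 1899, 5: None}
labels = {
    ident: sized_cluster(year, start=1500, size=100, end=1900)
    for ident, year in births.items()
}
```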
Note : for a very brief introduction to kinship simulation techniques, click here.
The submenu File > New > Random network offers two types of random network : Classic and Birth-Centered. In either case, after you execute the command, a dialog window opens asking you to enter some statistical criteria. These determine the dimensions and form of the network.