Note: For a brief introduction to the Statistics function, see here.
The Statistics Inputs window (Analysis > Statistics... Ctrl+G) opens with the General frame, which provides information and diagrams about the current dataset. By selecting query criteria from the checkbox list in this frame, you obtain information about structural properties of the dataset, such as:
- The corpus Gender bias (weight)
- The corpus Gender bias (net weight)
- The distribution of Components
- The corpus Genealogical Completeness
- The corpus Ancestor chains (“Ancestor types”, choosing degree)
- The Fratry Distribution
- Consanguine Chains
- Four Cousins Marriages
Partition Diagrams Criteria
The Partition Diagrams Criteria frame provides statistics and diagrams about partitions of the dataset. More than one partition can be analyzed at a time. The resulting diagrams show the partitioning criterion on the abscissa and the cluster sizes on the ordinate.
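To make the axes of a partition diagram concrete, here is a minimal Python sketch of the underlying count. It is an illustration only, not Puck's own code: the toy records, the property names and the `cluster_sizes` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical toy dataset: individual id -> property values (not Puck's data model).
individuals = {
    1: {"gender": "M", "birth_place": "Roma"},
    2: {"gender": "F", "birth_place": "Roma"},
    3: {"gender": "M", "birth_place": "Urbino"},
    4: {"gender": "F"},  # birth place unknown
}

def cluster_sizes(individuals, criterion):
    """Count individuals per value of the chosen property (one cluster per value)."""
    return Counter(props.get(criterion, "unknown") for props in individuals.values())

sizes = cluster_sizes(individuals, "birth_place")
# Abscissa: the property values; ordinate: the number of individuals in each cluster.
for value, size in sorted(sizes.items()):
    print(value, size)
```

Calling the helper once per criterion mirrors analyzing several partitions in the same query.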
***GAP The Split Partition Criteria frame allows [...]
***GAP The Mean Cluster Values checkbox allows [...]
After launching a query from the Statistics Inputs window, the results are displayed in a report window, both as graphs and tables. Graphs can be viewed individually (by clicking on them). Results can be saved in .txt or .xls formats (by clicking on the “Save” button and choosing the destination folder). The report window also provides information on the distribution of properties.
The first measure of the corpus gender bias is the Agnatic (Uterine) Weight. The following snapshot shows an example of a Gender Bias (weight) report diagram, taken from M. Gasperoni's "Ebrei" corpus.
The other Gender Bias measure is the Agnatic (Uterine) Net Weight. The following snapshot shows an example of a Gender Bias (net weight) report diagram, taken from M. Gasperoni's "Ebrei" corpus.
Both Gender Bias measures are useful indicators of the interdependence and interconnection of genealogical knowledge: the closer the two curves (uterine and agnatic) are to one another, the higher their interdependence; the further apart they are, the more autonomous they become, that is to say, what is known are the agnatic or uterine unisexual lines. Low curves indicate interconnection.
Note: it is important to analyze the gender bias on partitions (e.g., by generation or by date of birth of individuals). Partitioning is particularly useful for focusing on one part of a network.
Distribution of Components
The Components diagram shows the distribution of agnatic/uterine components (connected subnetworks made up entirely of paternal/maternal ties) according to their size. The abscissa of the diagram indicates the relative size of components (as a percentage of total network size, where size = number of individuals); the ordinate indicates the relative frequency of components of a given size (as a percentage of the total number of components). The following snapshot shows an example of a Components report diagram, taken from M. Gasperoni's "Ebrei" corpus.
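The two percentages can be sketched as follows for agnatic components. This is a minimal Python illustration, not Puck's implementation: the `father_of` mapping and the helper names are hypothetical, and it assumes that isolated individuals count as components of size 1.

```python
from collections import Counter

def component_sizes(individuals, parent_of):
    """Sizes of the connected components formed by the given parent ties only."""
    root = {i: i for i in individuals}

    def find(x):
        # Follow links up to the representative, shortening the path on the way.
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for child, parent in parent_of.items():
        if child in root and parent in root:
            root[find(child)] = find(parent)

    return list(Counter(find(i) for i in individuals).values())

def size_distribution(sizes, network_size):
    """Relative size (% of network) -> relative frequency (% of all components)."""
    freq = Counter(sizes)
    return {100 * s / network_size: 100 * n / len(sizes) for s, n in sorted(freq.items())}

# Hypothetical toy network: child -> father (assumption: singletons are components).
individuals = [1, 2, 3, 4, 5, 6, 7, 8]
father_of = {3: 1, 4: 1, 5: 3, 7: 6}

sizes = component_sizes(individuals, father_of)
dist = size_distribution(sizes, len(individuals))
for rel_size, rel_freq in dist.items():
    print(f"{rel_size:.1f}% of network: {rel_freq:.1f}% of components")
```

Passing a child-to-mother mapping instead of `father_of` gives the uterine components.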
The Genealogical Completeness of a kinship network corresponds to the percentage of known ascendants (agnatic, uterine and overall) by generation. The following snapshot shows an example of a Genealogical Completeness report diagram, taken from M. Gasperoni's "Ebrei" corpus.
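As an illustration of this percentage, the sketch below compares the known ancestors at a given generation with the theoretical maximum (2^g positions overall, one position for a unisexual line). It is a Python sketch under assumptions, not Puck's code: the parent mappings, the `lines` parameter and the choice of reference population are hypothetical.

```python
def ancestors_at(person, generation, father_of, mother_of, lines=("F", "M")):
    """Known ancestors of `person` at the given generation along the allowed lines."""
    current = [person]
    for _ in range(generation):
        nxt = []
        for p in current:
            if "F" in lines and father_of.get(p) is not None:
                nxt.append(father_of[p])
            if "M" in lines and mother_of.get(p) is not None:
                nxt.append(mother_of[p])
        current = nxt
    return current

def completeness(people, generation, father_of, mother_of, lines=("F", "M")):
    """Percentage of filled ancestor positions among all possible positions."""
    possible = len(lines) ** generation  # 2^g overall, 1 for agnatic or uterine
    known = sum(len(ancestors_at(p, generation, father_of, mother_of, lines))
                for p in people)
    return 100 * known / (possible * len(people))

# Hypothetical toy data: child -> father, child -> mother.
father_of = {3: 1, 4: 1, 5: 3}
mother_of = {3: 2, 4: 2}
people = [3, 4, 5]  # assumption: completeness is measured for these individuals

for g in (1, 2):
    print(g,
          completeness(people, g, father_of, mother_of, lines=("F",)),  # agnatic
          completeness(people, g, father_of, mother_of, lines=("M",)),  # uterine
          completeness(people, g, father_of, mother_of))                # overall
```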
The Fratry Distribution is the distribution of agnatic and uterine fratries (sibling groups) according to their size. The following snapshot shows an example of a Fratry Distribution report diagram, taken from M. Gasperoni's "Ebrei" corpus.
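A fratry distribution of this kind can be sketched by grouping children under a shared father (agnatic) or a shared mother (uterine) and counting the groups by size. The Python below is only an illustration; the mappings and the helper name are hypothetical, and Puck's own computation may differ.

```python
from collections import Counter, defaultdict

def fratry_distribution(parent_of):
    """Map sibling-group size -> number of groups, grouping children by one parent."""
    groups = defaultdict(list)
    for child, parent in parent_of.items():
        if parent is not None:
            groups[parent].append(child)
    return Counter(len(children) for children in groups.values())

# Hypothetical toy data: child -> father, child -> mother.
father_of = {3: 1, 4: 1, 5: 1, 8: 6, 9: 6, 10: 7}
mother_of = {3: 2, 4: 2, 5: 11, 8: 12, 9: 12, 10: 12}

agnatic = fratry_distribution(father_of)   # groups sharing the same father
uterine = fratry_distribution(mother_of)   # groups sharing the same mother
print(sorted(agnatic.items()), sorted(uterine.items()))
```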
First Cousin Marriages
The First Cousin Marriages diagram shows the occurrences of the four types of first cousin marriage: between patrilateral or matrilateral, cross or parallel cousins. The four cousin types are distributed along the abscissa and their respective numbers of occurrences are given on the ordinate. The following snapshot shows the First Cousin Marriages report diagram of M. Gasperoni's "Ebrei" corpus.
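The four types can be told apart by the two linking parents: the side (patrilateral or matrilateral) depends on which parent of ego is involved, and the cousins are parallel when the two linking siblings are of the same sex, cross otherwise. The Python sketch below illustrates this. It is not Puck's algorithm: the data and function names are hypothetical, the classification is taken from the husband's point of view, and half-siblings are treated as siblings, both of which are assumptions.

```python
from collections import Counter

def are_siblings(a, b, father_of, mother_of):
    """True if a and b are distinct and share at least one known parent."""
    if a is None or b is None or a == b:
        return False
    fa, fb = father_of.get(a), father_of.get(b)
    ma, mb = mother_of.get(a), mother_of.get(b)
    return (fa is not None and fa == fb) or (ma is not None and ma == mb)

def cousin_types(husband, wife, father_of, mother_of):
    """First-cousin relations linking a couple, seen from the husband (assumption)."""
    types = []
    for side, ego_parent, ego_sex in (("patrilateral", father_of.get(husband), "M"),
                                      ("matrilateral", mother_of.get(husband), "F")):
        for alter_parent, alter_sex in ((father_of.get(wife), "M"),
                                        (mother_of.get(wife), "F")):
            if are_siblings(ego_parent, alter_parent, father_of, mother_of):
                kind = "parallel" if ego_sex == alter_sex else "cross"
                types.append(f"{side} {kind}")
    return types

# Hypothetical toy data: 3, 4 (men) and 5 (woman) are children of 1 and 2.
father_of = {3: 1, 4: 1, 5: 1, 10: 3, 11: 4, 12: 8}
mother_of = {3: 2, 4: 2, 5: 2, 10: 6, 11: 7, 12: 5}
marriages = [(10, 11), (10, 12), (3, 6)]  # (husband, wife)

counts = Counter(t for h, w in marriages
                 for t in cousin_types(h, w, father_of, mother_of))
print(counts)
```

Returning a list rather than a single label allows for couples who are first cousins through more than one pair of siblings.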
The Ancestor Chains diagram shows the composition of ancestor chains, expressed in positional notation, for a given degree (of your choice) and gender. Knowing the distribution of consanguine chains is very important, and it provides an additional measure of bias (Barry & Gasperoni, 2008, p. 71‑77). The following snapshot shows an Ancestor Chains report diagram, taken from M. Gasperoni's "Ebrei" corpus.
Distribution of Properties
Using the Statistics window, it is also possible to analyze the distribution of endogenous and exogenous properties by combining queries and property codes, and to export the results as partitions (for display in other software such as Pajek). This gives a precise picture of the profile and composition of the dataset, which is useful not only for analyzing it but also for improving and completing it afterwards. Puck generates information and statistics on all genealogical data (number and distribution of known ascendants/descendants, etc.) and exogenous data (occupation, place and date of birth, etc.). The results appear in the form of tables, diagrams or partitions.
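For readers who want to see what a partition looks like once exported, the sketch below builds the text of a Pajek partition (.clu) file: a `*Vertices n` header followed by one integer cluster number per vertex, in vertex order. It is a Python illustration and not Puck's export routine; the function name, the toy property values and the use of cluster 0 for missing values are assumptions.

```python
def to_pajek_partition(ordered_ids, value_of):
    """Return the .clu text and the value -> cluster number coding that was used."""
    codes = {}
    lines = [f"*Vertices {len(ordered_ids)}"]
    for i in ordered_ids:
        value = value_of.get(i)
        # Values get cluster numbers 1, 2, ... in order of first appearance;
        # 0 marks a missing value (an assumption of this sketch).
        code = 0 if value is None else codes.setdefault(value, len(codes) + 1)
        lines.append(str(code))
    return "\n".join(lines) + "\n", codes

# Hypothetical toy property: individual id -> occupation.
occupation = {1: "merchant", 2: "rabbi", 3: "merchant"}
ordered_ids = [1, 2, 3, 4]  # must follow the vertex order of the network file

text, codes = to_pajek_partition(ordered_ids, occupation)
print(text)
print(codes)
```

The coding table is returned alongside the text because the .clu file itself only stores numbers, so the mapping back to property values has to be kept separately.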