Menu : Transform

Details: Updated: 2 January 2016


Generalities
The Transform menu offers a number of commands for making systematic changes to the current dataset (duplication, anonymization, reduction, extraction, expansion, shrinking...). Some of these transformations affect individuals' names, their attributes, or the relations between them; others target the dataset as a whole or only some partitions of it.

Duplicate
The command Transform > Duplicate creates an exact "live" copy of the original dataset. This is useful if you want to test transformations on a dataset without introducing unwanted changes into the original file. Note that the duplicate that PUCK produces is not automatically saved to a file.

Anonymization
The Anonymization commands hide individuals' names. This can be useful, for example, if you work on a recent population and wish to publish analytical results without revealing the identity of individuals.
The following commands let you choose the most appropriate form of anonymization:

Transform > Anonymize by First Name ;
Transform > Anonymize by Last Name ;
Transform > Anonymize by Gender & ID.

Number Names
The command Transform > Number names appends each individual's identity number to their original name, between parentheses (e.g. “Pierre Dupont (5)”).

Note: Numbered names are convenient if the corpus is exported in .paj format. Pajek files require individuals to be renumbered so that vertex numbers are continuous, and numbered names preserve the original numbers. If a Pajek file with numbered names is imported into PUCK, the numbers between parentheses are automatically converted back into identity numbers.
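The naming convention described above can be sketched in a few lines of Python. This is only an illustration of the idea, not PUCK's own code; the function names number_name and parse_numbered_name are hypothetical, and the rule that a trailing group of digits in parentheses is read as the identity number is an assumption.

```python
import re

def number_name(name, ident):
    """Append the identity number to a name, between parentheses."""
    return f"{name} ({ident})"

# Assumption: a trailing "(digits)" group is read back as the identity number.
_NUMBERED = re.compile(r"^(?P<name>.*?)\s*\((?P<id>\d+)\)$")

def parse_numbered_name(label):
    """Split a numbered name into (name, id); id is None if no number is found."""
    match = _NUMBERED.match(label)
    if match is None:
        return label, None
    return match.group("name"), int(match.group("id"))

numbered = number_name("Pierre Dupont", 5)
print(numbered)
print(parse_numbered_name(numbered))
```

A name without a trailing number is returned unchanged, with None as its identity number.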

Transform Attributes
The following commands make systematic changes to attributes, which avoids having to make such changes one by one throughout the dataset:

Transform > Rename Attribute (acts on all Labels)
Transform > Filter Exogenous Attribute (acts on some Values)
Transform > Set Attribute Value (acts on all Values)
Transform > Replace Attribute Value (acts on some Values)
Transform > Valuate exo. Attribute
Transform > Remove all Attributes (acts on all Labels and Values)
Transform > Transmit Attribute Value (acts on all Values)

Each of these commands opens a dialog window asking which attribute is to be changed, and how. The dialog differs from one command to the next, but some fields recur (the remaining ones are left undescribed here, as they are sufficiently intuitive). The Target field indicates the type of attribute concerned (e.g. All, Individual, Family...). The Label field indicates the "name" of the attribute concerned (e.g. BIRTH_DATE...). The Value field indicates the actual content of the attribute (e.g. "1955" for a birth date).

Redefining Relations
***GAP
Transform > Redefining Relation

***Marry Coparents
The command Transform > Marry Coparents adds a matrimonial link to each fertile couple, that is, to each pair of individuals who have a child in common. This can be useful, for instance, to make those unions visible in a matrimonial census. The command applies to the entire dataset, regardless of partitioning, and does not change the family numbering.
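As a rough illustration of this operation, the following Python sketch adds a marriage for every pair of known co-parents. The data layout (a dictionary from child Id to a father/mother pair, and marriages as unordered pairs) and the function name marry_coparents are assumptions made for the example, not PUCK's internal representation.

```python
def marry_coparents(parents, marriages):
    """Return the marriage set extended with every pair of co-parents.

    parents: dict child_id -> (father_id, mother_id), None when unknown.
    marriages: set of frozenset({a, b}) pairs (hypothetical layout).
    """
    result = set(marriages)
    for father, mother in parents.values():
        # Only couples with both parents known can be linked.
        if father is not None and mother is not None:
            result.add(frozenset((father, mother)))
    return result

parents = {3: (1, 2), 4: (1, None)}
print(marry_coparents(parents, set()))
```

Existing marriages are kept as they are; only missing links between co-parents are added.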

***Re-number Ids
The Transform > Re-number Ids sub-menu changes the Id numbering of all individuals in the dataset. The new numbering starts from 0 and covers all individuals without gaps.
If you have a pre-existing corpus in which the Id number is defined as an attribute, you can use the commands Transform > Renumber Ids from ID attr. and Transform > Renumber Ids from REFN attr. to make PUCK recognize it as the individuals' actual Id.
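A gap-free renumbering of this kind can be sketched as follows. The sketch assumes that the new numbers follow the order of the old Ids, which the text above does not specify; renumber_ids and apply_mapping are hypothetical helper names.

```python
def renumber_ids(ids, start=0):
    """Map each old Id to a new consecutive Id (assumed: in old-Id order)."""
    return {old: new for new, old in enumerate(sorted(ids), start=start)}

def apply_mapping(parents, mapping):
    """Rewrite child -> (father, mother) links with the new Ids."""
    def remap(ident):
        return None if ident is None else mapping[ident]
    return {mapping[child]: (remap(f), remap(m))
            for child, (f, m) in parents.items()}

mapping = renumber_ids([12, 3, 40, 7])
print(mapping)
print(apply_mapping({40: (3, 7), 12: (3, None)}, mapping))
```

Every reference to an individual (here, the parent links) has to be rewritten with the same mapping so that the network stays consistent.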

Editable families
***GAP
Transform > Editable families

Life events
***GAP
The command Transform > Add life events processes biographic events (births, marriages, migrations...) as relation models rather than as attributes.

Enlarge from Attributes
***GAP
Transform > Enlarge from attributes

Reduce (Submenu)

The sub-menu Transform > Reduce removes specific types of segments from a kinship network. The reductions concern segments that can be considered structurally irrelevant (unmarried people, structural children, etc.), and are meant to "clean" the corpus before its structure is analyzed. This can be useful when preparing, for instance, a matrimonial circuit census, e.g. in order to refine an analysis of the dataset's gender bias. The following list details each possible reduction:

***Transform > Reduce > Acyclic Segments : eliminates from the kinship network all the segments that do not contain cycles.
Transform > Reduce > Marked doubles : eliminates all doubles. The individual with the lower identity number is considered as the original, the one with the higher ID number as the double. Doubles can be marked by substituting the original’s identity number for their name.
Transform > Reduce > Structural children : eliminates all individuals who have neither spouses nor children (the structural children of the kinship network).
Transform > Reduce > Unmarried : eliminates all individuals who are not married.
Transform > Reduce > Virtual individuals : eliminates all virtual individuals. This reduction is recommended for the exploratory analysis of a corpus containing fictive individuals.
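Two of the reductions listed above, Structural children and Unmarried, can be illustrated with simple set filters. This is a minimal sketch under assumed data structures (child-to-parents dictionary, list of married couples); it performs a single pass, and whether PUCK repeats the reduction until nothing changes is not stated here.

```python
def reduce_structural_children(ids, parents, marriages):
    """Drop individuals who have neither spouses nor children (one pass)."""
    with_children = {p for pair in parents.values() for p in pair if p is not None}
    with_spouse = {person for couple in marriages for person in couple}
    return {i for i in ids if i in with_children or i in with_spouse}

def reduce_unmarried(ids, marriages):
    """Drop individuals who do not appear in any marriage."""
    with_spouse = {person for couple in marriages for person in couple}
    return {i for i in ids if i in with_spouse}

ids = {1, 2, 3, 4, 5, 6}
parents = {3: (1, 2), 4: (1, 2), 5: (6, None)}  # child -> (father, mother)
marriages = [(1, 2), (3, 5)]

print(sorted(reduce_structural_children(ids, parents, marriages)))
print(sorted(reduce_unmarried(ids, marriages)))
```

In this toy dataset, individual 4 has no spouse and no children and is removed by both reductions, while individual 6 (an unmarried parent) is removed only by the Unmarried reduction.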

Extract (Submenu)

The sub-menu Transform > Extract creates a new dataset from a selected sub-corpus of the original one. The new dataset is thus composed only of the vertices of the selected sub-corpus. The following commands extract different types of sub-corpora:

Transform > Extract > Current Segment
Transform > Extract > Current Cluster (Ctrl+E)
Transform > Extract > Kernel (maximal matrimonial bicomponent)
Transform > Extract > Max. Bicomponent
Transform > Extract > Core
Transform > Extract > By cluster size / By cluster value : ***CORR*** reduces the kinship network to those vertices who have a positive cluster value in a chosen partition. For example, if “OCCU” (occupation) is chosen, then the network is reduced to the people whose occupation is known.

Note: Some of the commands in the list above presuppose a basic knowledge of the partitioning process. To read more about partitioning, click here or open the "Partitions Bar" entry of the "Functionalities" menu.
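The extraction by cluster value described in the list above (keeping only the people whose occupation is known, for example) amounts to a filter on an attribute. The sketch below assumes that attributes are stored as one dictionary per individual and treats a missing or empty value as "unknown"; extract_known is a hypothetical name.

```python
def extract_known(attributes, label):
    """Keep the Ids whose attribute `label` has a non-empty value."""
    return {ident for ident, attrs in attributes.items() if attrs.get(label)}

attributes = {
    1: {"OCCU": "farmer"},
    2: {},                 # occupation not recorded
    3: {"OCCU": ""},       # occupation recorded but empty
}
print(extract_known(attributes, "OCCU"))
```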

Expand Current Segment (Submenu)

The sub-menu Transform > Expand current segment (...) creates a new dataset composed of the members of the selected partition together with individuals connected to them. This can be useful, for instance, when you want to run a circuit census on a given partition of the dataset without losing data about the ties between its members (which may involve non-members of the partition).
The submenu thus expands a partition to its connected non-members and, in addition, lets you select among them according to the type of tie linking the segment members to the non-members to be included. The following commands make this selection, each indicating a different class of connected individuals:

Transform > Expand current segment > Special Features... : Includes individuals connected by special features
Transform > Expand current segment > Universal : Includes individuals connected by all existing ties
Transform > Expand current segment > All related : Includes all individuals connected by ties defined as a Relation Model
Transform > Expand current segment > All Kin : Includes all individuals connected by marriage and filiation ties
Transform > Expand current segment > Ascending : ***Includes all individuals connected by ascending ties
Transform > Expand current segment > Ascending (Agnatic) : ***Includes all individuals connected by agnatic ascending ties
Transform > Expand current segment > Ascending (Uterine) : ***Includes all individuals connected by uterine ascending ties
Transform > Expand current segment > Descending : ***Includes all individuals connected by descending ties
Transform > Expand current segment > Descending (Agnatic) : ***Includes all individuals connected by descending agnatic ties
Transform > Expand current segment > Descending (Uterine) : ***Includes all individuals connected by descending uterine ties
Transform > Expand current segment > Horizontal : ***Includes all individuals connected by marriage ties
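The ascending expansions in the list above can be pictured as a graph traversal over parent links. The following Python sketch is one possible reading, assuming that the expansion follows ascending ties transitively (all ancestors, not only parents); the data layout and the function name expand_ascending are hypothetical.

```python
from collections import deque

def expand_ascending(segment, parents, line=None):
    """Add the ancestors of the segment members to the segment.

    parents: dict child_id -> (father_id, mother_id), None when unknown.
    line: None for both parents, "agnatic" for fathers only,
          "uterine" for mothers only.
    """
    result = set(segment)
    queue = deque(segment)
    while queue:
        person = queue.popleft()
        father, mother = parents.get(person, (None, None))
        if line == "agnatic":
            candidates = (father,)
        elif line == "uterine":
            candidates = (mother,)
        else:
            candidates = (father, mother)
        for parent in candidates:
            if parent is not None and parent not in result:
                result.add(parent)
                queue.append(parent)
    return result

parents = {5: (3, 4), 3: (1, 2)}
print(sorted(expand_ascending({5}, parents)))
print(sorted(expand_ascending({5}, parents, line="agnatic")))
print(sorted(expand_ascending({5}, parents, line="uterine")))
```

A descending expansion would follow the same pattern with the links reversed (from parents to children).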

***Shrink

The Shrink function regroups the individuals of the dataset according to a given criterion; it also generates and analyzes the network of links between the resulting groups. The results can be exported in .paj and .dat formats. Such networks can thus be represented as directed graphs in which both nodes and arcs have values: node values give the size of each group (number of individuals), and arc values give its weight (number of ties).

The Transform > Shrink > Alliance Network command produces a directed graph in which nodes represent groups of individuals sharing a given endogenous/exogenous property and arcs represent the number of links between such groups. For instance, it can be used to analyze the matrimonial alliance network between different patrilineages, or to study the transmission of professions through filiation.
When the command is executed, PUCK opens a dialog window asking you to set some criteria.
The Label field defines which endogenous/exogenous property is used to group the individuals of the dataset.
The Alliance Type field defines which of three types of ties counts as an "alliance" relation: wife-husband, sister-brother or parent-child. These produce, respectively, a matrimonial exchange network, a network of siblingship or a network of filiation ties.
The Weighted Arcs check-box determines whether or not arc weights appear in the results.
Finally, three fields filter the resulting network by the minimal number of links (node degree), alliances per node (node strength), and alliances per link (link weight).
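To make the construction of an alliance network more concrete, here is a minimal Python sketch for the wife-husband case. It counts marriages between clusters as weighted arcs pointing from the wife's cluster to the husband's cluster, and applies a minimal link weight. The data structures and the name alliance_network are assumptions made for the example; the other two filters (node degree and node strength) are not covered.

```python
from collections import Counter

def alliance_network(marriages, cluster_of, min_weight=1):
    """Count marriages between clusters as weighted, directed arcs.

    marriages: iterable of (wife_id, husband_id) pairs.
    cluster_of: dict id -> cluster label (e.g. a patrilineage).
    """
    arcs = Counter()
    for wife, husband in marriages:
        source, target = cluster_of.get(wife), cluster_of.get(husband)
        # Individuals without a cluster value are ignored.
        if source is not None and target is not None:
            arcs[(source, target)] += 1
    return {arc: weight for arc, weight in arcs.items() if weight >= min_weight}

marriages = [(1, 2), (3, 4), (5, 6)]
cluster_of = {1: "A", 2: "B", 3: "A", 4: "B", 5: "B", 6: "A"}
print(alliance_network(marriages, cluster_of))
print(alliance_network(marriages, cluster_of, min_weight=2))
```

Node values (group sizes) could be obtained in the same spirit by counting the individuals per cluster label.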

The results can be viewed (and managed) both as a separate Alliance Network Window (in the dialog window, click the Launch button) and as a statistical report (in the dialog window, click the Statistics button). For a description of the Alliance Network Window, see here.
The statistical report window is composed of six tabs:

Alliance Network Report, which summarizes the input criteria and allows exporting the results in the .paj, .paj (edge version) and .dat formats;
Analysis, which presents the results according to a number of indicators such as the number of nodes and arcs, the maximal weight (links per arc) and strength (links per node), the potential endogamic pairs, the distribution of circuits, etc. (for more details see here);
Matrix, which shows the network alliance matrix in a table and allows exporting it in .txt and .xls formats;
Couples, which lists in a table the linked pairs of clusters, as well as the composition of the (directed) links connecting each pair. Each block specifies the link weight and the pairs of individuals (their Id number, gender and name) connected by a marriage, siblingship or filiation tie (depending on the chosen criterion);
Sortable List, which indicates: in the first column, the origin vertex of each link (wife, sister or parent, depending on the chosen criterion), with its Id number, gender and name; in the second column, the destination vertex (husband, brother or child), with its Id number, gender and name; in the third column, the cluster to which the origin vertex belongs; in the fourth column, the cluster to which the destination vertex belongs; in the fifth column, the link weight.
***GAP Sides, which lists [...]

The command Transform > Shrink > Flow Network produces, analyzes and manages flow networks. Here, nodes represent segments that group the individuals of the dataset sharing a given value of one of two endogenous/exogenous properties. A weighted arc connects two segments that contain the same individual; it points from the cluster of the first property to the cluster of the second, and its weight is the number of individuals who belong to both.
The command can be useful, for instance, for studying migration flows (from the birth place to the death place of the individuals in the dataset).
When the command is executed, a dialog window opens.
Here, the Source Label field defines the first property used for grouping (e.g. BIRT_PLACE), and the Target Label field defines the second one (e.g. DEAT_PLACE).
***The Minimal number of links field allows excluding from the results network all the source nodes whose size doesn't reach a given number of individuals.
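The flow network described above can be illustrated by counting, for each pair of source and target values, the individuals who carry both. The sketch below is only an illustration with assumed data structures (one attribute dictionary per individual); flow_network is a hypothetical name, and the minimal-number filter is left out.

```python
from collections import Counter

def flow_network(attributes, source_label, target_label):
    """Count individuals per (source value, target value) pair."""
    arcs = Counter()
    for attrs in attributes.values():
        source, target = attrs.get(source_label), attrs.get(target_label)
        # Individuals lacking either value contribute to no arc.
        if source and target:
            arcs[(source, target)] += 1
    return dict(arcs)

attributes = {
    1: {"BIRT_PLACE": "Lyon", "DEAT_PLACE": "Paris"},
    2: {"BIRT_PLACE": "Lyon", "DEAT_PLACE": "Paris"},
    3: {"BIRT_PLACE": "Paris", "DEAT_PLACE": "Paris"},
    4: {"BIRT_PLACE": "Lyon"},  # death place unknown
}
print(flow_network(attributes, "BIRT_PLACE", "DEAT_PLACE"))
```

In this toy dataset, two individuals flow from Lyon to Paris and one stays in Paris; the fourth individual is ignored because the death place is unknown.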

Unlike alliance networks, flow network results are shown only as statistics. The Report Window is made up of five tabs:

Flow Network Report, which reviews the input criteria and allows exporting the network in .paj format;
Analysis, which shows specific statistics on the flow network;
Matrix, which shows the flow network matrix in the same form as an Alliance Matrix;
Flows, which lists the source > target pairs of nodes and, for each pair, the individuals appearing in both segments.
***GAP Sortable List, where [...]

***GAP Simulation Tools
The command Transform > Reshuffling produces the network that results from randomizing the marriages of the corpus (while keeping the rest of it as it is). This simulation technique can be useful for understanding to what extent specific matrimonial configurations depend on demographic and/or data collection biases.
When the command is executed, a dialog window opens, asking you to specify:

The Number of edge permutations per step - [...]
The Maximum generational distance - [...]
The Minimum shuffle percentage (stop condition) - [...]
The Minimum stable iterations (stop condition) - [...]
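The general idea of randomizing marriages while keeping everything else fixed can be illustrated with a simple edge-swap procedure. The sketch below is not PUCK's algorithm: it only swaps husbands between randomly chosen couples, and it ignores the generational distance and the stop conditions listed above, whose exact meaning is not documented here. The function name reshuffle_marriages and its parameters are hypothetical.

```python
import random

def reshuffle_marriages(marriages, swaps, seed=None):
    """Randomize marriages by repeatedly swapping husbands between two couples.

    marriages: list of (wife_id, husband_id) pairs.
    swaps: number of swap attempts (assumed parameter).
    """
    rng = random.Random(seed)
    result = list(marriages)
    if len(result) < 2:
        return result
    for _ in range(swaps):
        i, j = rng.sample(range(len(result)), 2)
        (wife_i, husband_i), (wife_j, husband_j) = result[i], result[j]
        result[i], result[j] = (wife_i, husband_j), (wife_j, husband_i)
    return result

shuffled = reshuffle_marriages([(1, 2), (3, 4), (5, 6)], swaps=10, seed=0)
print(shuffled)
```

Each swap preserves the number of marriages and the number of marriages per individual, so only the pairing itself is randomized.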

The command Transform > Virtual Network simulates the biases introduced by data collection. It can be useful for seeing how the network morphology would change if, for instance, all informants came from a small set of families.
When the command is executed, a dialog window opens, asking you to specify:

The Number of informants - [...]
The Kin proximity - [...]
The Kin degree - [...]
The Near Kin weight - [...]
The Memory - [...]
The Acceptance of both a Male Informant and a Female Informant
The Kin Recall rates of both Men's Kin (first degree) and Women's Kin (first degree).

The command Transform > Virtual Fieldwork Variations allows [...]
