Subset Configuration¶
The DBSnapper Agent is configured using a YAML file, which is created when you run dbsnapper config init.
In this file you can specify multiple target configurations, each target being a set of options for a database you want to subset.
Referring to our sample configuration file, the highlighted lines show the configuration options for database subsetting:
~/.config/dbsnapper/dbsnapper.yml
example
Configuration options¶
The subset configuration for a target provides several options to control the subsetting process. The options are listed below, grouped by purpose:
Subset Connection Strings¶
subset
- Subset configuration for the target
src_url
- Connection string of source database
dst_url
- Connection string for the database where the subset will be created. This database will be overwritten.
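As a rough sketch, the two connection strings might sit under a target's subset key as shown here. The top-level targets key, the target name app, and both PostgreSQL URLs are assumptions made for illustration; they are not taken from the sample configuration file.

```yaml
# Hypothetical fragment: the `targets` key, the target name `app`,
# and both connection strings are placeholders.
targets:
  app:
    subset:
      # Source database that rows are read from
      src_url: postgres://user:password@localhost:5432/app_production
      # Database where the subset is created; it will be overwritten
      dst_url: postgres://user:password@localhost:5432/app_subset
```

Because dst_url is overwritten, it is worth double-checking that it does not point at the source database.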
Subset Tables¶
subset_tables
- List of tables to be subsetted. The subset tables are the initial tables of interest, and the where and percent clauses can be used to control the portion of each table to be included in the subset. Rows from other tables are included as needed to maintain referential integrity with the subset tables.
table
- The name of the table to be subsetted. All tables are specified in the format schema.table.
where
- Providing a where clause will subset the table based on the condition specified.
percent
- Specifying the percent clause will take a random sample of the table based on the percentage specified.
subset_tables clauses
One (and only one) of the where or percent clauses must be provided for each table in the subset_tables list.
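The rule above could look like the following sketch, in which each entry carries exactly one of where or percent. The table names and the filter condition are hypothetical. Writing where as a quoted SQL condition and percent as a whole number (10 meaning roughly 10 percent) are also assumptions, not confirmed syntax.

```yaml
# Hypothetical entries nested under a target's `subset` key;
# table names and values are placeholders.
subset_tables:
  # Keep only the rows that match the condition
  - table: public.orders
    where: "created_at > '2024-01-01'"
  # Keep a random sample of the table (assumed to mean about 10 percent)
  - table: public.customers
    percent: 10
```

Rows from related tables would then be pulled in as needed to keep these rows referentially intact.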
Tables to Copy or Exclude¶
copy_tables
- List of tables to be copied in whole to the subset database (dst_url). These tables are copied as-is from the source database to the subset database.
excluded_tables
- List of tables to be excluded from the subset database (dst_url). These tables are not copied to the subset database.
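A minimal sketch of the two lists follows, assuming each entry is a plain schema.table string. The table names are hypothetical: a small lookup table is a typical candidate for copying in whole, and a large log table is a typical candidate for exclusion.

```yaml
# Hypothetical entries nested under a target's `subset` key;
# table names are placeholders.
copy_tables:
  # Copied as-is from the source database
  - public.countries
excluded_tables:
  # Not copied to the subset database
  - public.audit_log
```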
Defining Relationships¶
added_relationships
- List of table relationships to be considered. These relationships are added to the configuration when a set of tables has a foreign key relationship that is not defined in the database schema via foreign key constraints.
Each added_relationships entry is defined by the fk_table, fk_columns, ref_table, and ref_columns attributes.
excluded_relationships
- List of table relationships to be excluded. A relationship should be excluded when a circular dependency is detected in the database schema. This can occur when a table has a foreign key relationship to another table that also has a foreign key relationship back to the original table.
Each excluded_relationships entry is defined by the fk_table and ref_table attributes.
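The attributes described above might be combined as in this sketch. All table and column names are hypothetical. Writing fk_columns and ref_columns as single column names rather than lists is an assumption about the value format.

```yaml
# Hypothetical entries nested under a target's `subset` key;
# table and column names are placeholders.
added_relationships:
  # order_items.order_id refers to orders.id, but the schema
  # declares no foreign key constraint for it
  - fk_table: public.order_items
    fk_columns: order_id
    ref_table: public.orders
    ref_columns: id
excluded_relationships:
  # Ignore one side of a circular dependency between two tables
  - fk_table: public.employees
    ref_table: public.departments
```

In the circular case, excluding a single direction of the relationship should be enough to break the cycle while the other direction is still followed.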