Syntax overview

dbt's node selection syntax makes it possible to run only specific resources in a given invocation of dbt. This selection syntax is used for the following subcommands:

command            argument(s)
run                --select, --exclude, --selector, --defer
test               --select, --exclude, --selector, --defer
seed               --select, --exclude, --selector
snapshot           --select, --exclude, --selector
ls (list)          --select, --exclude, --selector, --resource-type
compile            --select, --exclude, --selector, --inline
source freshness   --select, --exclude, --selector
build              --select, --exclude, --selector, --resource-type, --defer
docs generate      --select, --exclude, --selector

Nodes and resources

We use the terms "nodes" and "resources" interchangeably. These encompass all the models, tests, sources, seeds, snapshots, exposures, and analyses in your project. They are the objects that make up dbt's DAG (directed acyclic graph).

Specifying resources

By default, dbt run executes all of the models in the dependency graph, dbt seed creates all seeds, and dbt snapshot performs every snapshot. The --select flag specifies a subset of nodes to execute.

To follow POSIX standards and make things easier to understand, we recommend CLI users use quotes when passing arguments to the --select or --exclude option (including single or multiple space-delimited, or comma-delimited arguments). Not using quotes might not work reliably on all operating systems, terminals, and user interfaces. For example, dbt run --select "my_dbt_project_name" runs all models in your project.
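To make the effect of quoting concrete, here is a minimal sketch using Python's shlex module, which models POSIX word splitting only (it does not model globbing or variable expansion). The selection string tag:nightly my_model+ is an illustrative assumption, not taken from a real project.

```python
import shlex

# Quoted: the whole selection reaches dbt as a single argument after --select.
quoted = shlex.split('dbt run --select "tag:nightly my_model+"')

# Unquoted: POSIX word splitting turns the selection into two separate arguments.
unquoted = shlex.split('dbt run --select tag:nightly my_model+')

print(quoted)
print(unquoted)
```

With quotes, the argument boundaries are decided by you rather than by the shell or terminal in use.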

How does selection work?

  1. dbt gathers all the resources that are matched by one or more of the --select criteria, in the order of selection methods (e.g. tag:), then graph operators (e.g. +), then finally set operators (unions, intersections, exclusions).

  2. The selected resources may be models, sources, seeds, snapshots, or tests. (Tests can also be selected "indirectly" via their parents; see test selection examples for details.)

  3. dbt now has a list of still-selected resources of varying types. As a final step, it tosses away any resource that does not match the resource type of the current task. (Only seeds are kept for dbt seed, only models for dbt run, only tests for dbt test, and so on.)
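The three steps above can be sketched as a toy model. This is not dbt's implementation: the Node class, the node names, and the select function are hypothetical, and only a single tag criterion is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    resource_type: str              # "model", "seed", "test", ...
    tags: frozenset = frozenset()


# Hypothetical project contents, for illustration only.
NODES = [
    Node("stg_orders", "model", frozenset({"nightly"})),
    Node("country_codes", "seed", frozenset({"nightly"})),
    Node("not_null_stg_orders_id", "test", frozenset({"nightly"})),
    Node("dim_customers", "model"),
]


def select(nodes, tag, task_resource_type):
    # Steps 1-2: gather every resource matching the criterion, whatever its type.
    matched = [n for n in nodes if tag in n.tags]
    # Step 3: discard resources whose type does not fit the current task.
    return [n.name for n in matched if n.resource_type == task_resource_type]


run_selection = select(NODES, "nightly", "model")   # what "dbt run" would keep
seed_selection = select(NODES, "nightly", "seed")   # what "dbt seed" would keep
print(run_selection)
print(seed_selection)
```

The same criterion yields different results per task because the type filter is applied last.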

Shorthand

Select resources to build (run, test, seed, snapshot) or check freshness: --select, -s

Examples

By default, dbt run will execute all of the models in the dependency graph. During development (and deployment), it is useful to specify only a subset of models to run. Use the --select flag with dbt run to select a subset of models to run. Note that the following arguments (--select, --exclude, and --selector) also apply to other dbt tasks, such as test and build.

The --select flag accepts one or more arguments. Each argument can be one of:

  1. a package name
  2. a model name
  3. a fully-qualified path to a directory of models
  4. a selection method (path:, tag:, config:, test_type:, test_name:)

Examples:

dbt run --select "my_dbt_project_name"    # runs all models in your project
dbt run --select "my_dbt_model"           # runs a specific model
dbt run --select "path/to/my/models"      # runs all models in a specific directory
dbt run --select "my_package.some_model"  # runs a specific model in a specific package
dbt run --select "tag:nightly"            # runs models with the "nightly" tag
dbt run --select "path/to/models"         # runs models contained in path/to/models
dbt run --select "path/to/my_model.sql"   # runs a specific model by its path

As your selection logic gets more complex and becomes unwieldy to type out as command-line arguments, consider using a YAML selector. You can use a predefined definition with the --selector flag. Note that when you're using --selector, most other flags (namely --select and --exclude) will be ignored.

Troubleshoot with the ls command

Constructing and debugging your selection syntax can be challenging. To get a "preview" of what will be selected, we recommend using the list command. This command, when combined with your selection syntax, will output a list of the nodes that meet those selection criteria. The dbt ls command supports all types of selection syntax arguments, for example:

dbt ls --select "path/to/my/models" # Lists all models in a specific directory.
dbt ls --select "source_status:fresher+" # Shows sources updated since the last dbt source freshness run.
dbt ls --select "state:modified+" # Displays nodes modified in comparison to a previous state (requires --state or DBT_STATE).
dbt ls --select "result:<status>+" "state:modified+" --state ./<dbt-artifact-path> # Lists nodes that match the given result status or are modified, plus their descendants.

State selection

One of the greatest underlying assumptions about dbt is that its operations should be stateless and idempotent. That is, it doesn't matter how many times a model has been run before, or if it has ever been run before. It doesn't matter if you run it once or a thousand times. Given the same raw data, you can expect the same transformed result. A given run of dbt doesn't need to "know" about any other run; it just needs to know about the code in the project and the objects in your database as they exist right now.

That said, dbt does store "state" — a detailed, point-in-time view of project resources (also referred to as nodes), database objects, and invocation results — in the form of its artifacts. If you choose, dbt can use these artifacts to inform certain operations. Crucially, the operations themselves are still stateless and idempotent: given the same manifest and the same raw data, dbt will produce the same transformed result.

dbt can leverage artifacts from a prior invocation as long as their file path is passed to the --state flag. This is a prerequisite for:

  • The state selector, whereby dbt can identify resources that are new or modified by comparing code in the current project against the state manifest.
  • Deferring to another environment, whereby dbt can identify upstream, unselected resources that don't exist in your current environment and instead "defer" their references to the environment provided by the state manifest.
  • The dbt clone command, whereby dbt can clone nodes based on their location in the manifest provided to the --state flag.

Together, the state selector and deferral enable "slim CI". We expect to add more features in future releases that can leverage artifacts passed to the --state flag.

Establishing state

State and defer can be set by environment variables as well as CLI flags:

  • --state or DBT_STATE: file path
  • --defer or DBT_DEFER: boolean
  • --defer-state or DBT_DEFER_STATE: file path to use for deferral only (optional)

If --defer-state is not specified, deferral will use the artifacts supplied by --state. This enables more granular control in cases where you want to compare against logical state from one environment or past point in time, and defer to applied state from a different environment or point in time.

If both the flag and env var are provided, the flag takes precedence.
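The precedence and fallback rules above can be sketched as follows. This is an illustration of the stated rules, not dbt's actual code; resolve_state_paths and its parameters are hypothetical names.

```python
import os


def resolve_state_paths(cli_state=None, cli_defer_state=None, env=None):
    """Return (state_path, defer_state_path) following the rules described above."""
    env = os.environ if env is None else env

    # A CLI flag takes precedence over the matching environment variable.
    state = cli_state if cli_state is not None else env.get("DBT_STATE")
    defer_state = (
        cli_defer_state if cli_defer_state is not None else env.get("DBT_DEFER_STATE")
    )

    # Without an explicit defer state, deferral uses the --state artifacts.
    if defer_state is None:
        defer_state = state
    return state, defer_state


# Flag and env var both set: the flag wins, and deferral falls back to it.
print(resolve_state_paths("cli/artifacts", None, {"DBT_STATE": "env/artifacts"}))
```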

Notes:

  • The --state artifacts must be of schema versions that are compatible with the currently running dbt version.
  • These are powerful, complex features. Read about known caveats and limitations to state comparison.

Syntax deprecated

In dbt v1.5, we deprecated the original syntax for state (DBT_ARTIFACT_STATE_PATH) and defer (DBT_DEFER_TO_STATE). dbt remains backward compatible with the old syntax for now, but we will remove it in a future release that is yet to be determined.

The "result" status

Another element of job state is the result of a prior dbt invocation. After executing a dbt run, for example, dbt creates the run_results.json artifact which contains execution times and success / error status for dbt models. You can read more about run_results.json on the 'run results' page.

The following dbt commands produce run_results.json artifacts whose results can be referenced in subsequent dbt invocations:

  • dbt run
  • dbt test
  • dbt build (new in dbt version v0.21.0)
  • dbt seed

After issuing one of the above commands, you can reference the results by adding a selector to a subsequent command as follows:

# You can also set the DBT_STATE environment variable instead of the --state flag.
dbt run --select "result:<status>" --defer --state path/to/prod/artifacts

The available options depend on the resource (node) type (model, seed, snapshot, or test). The possible statuses are:

  • result:error
  • result:success
  • result:skipped
  • result:fail
  • result:warn
  • result:pass
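As a rough illustration of what a result:<status> selector looks up, the sketch below filters a trimmed-down, hypothetical stand-in for run_results.json. The real artifact carries many more fields, and the node identifiers here are invented for the example.

```python
# Hypothetical, heavily trimmed stand-in for a run_results.json artifact.
run_results = {
    "results": [
        {"unique_id": "model.shop.stg_orders", "status": "success"},
        {"unique_id": "model.shop.fct_orders", "status": "error"},
        {"unique_id": "test.shop.not_null_stg_orders_id", "status": "fail"},
    ]
}


def nodes_with_status(artifact, status):
    # Collect the identifiers of nodes whose prior result matches the status.
    return [r["unique_id"] for r in artifact["results"] if r["status"] == status]


errored = nodes_with_status(run_results, "error")
failed = nodes_with_status(run_results, "fail")
print(errored)
print(failed)
```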

Combining state and result selectors

The state and result selectors can also be combined in a single invocation of dbt to capture errors from a previous run OR any new or modified models.

dbt run --select "result:<status>+" "state:modified+" --defer --state ./<dbt-artifact-path>
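The sketch below shows how such a combined selection behaves as a union, with the + operator pulling in descendants. The DAG, the node names, and the sets standing in for result:error and state:modified are all hypothetical, and this is a simplified model rather than dbt's implementation.

```python
# Hypothetical DAG: each node maps to its direct children.
CHILDREN = {
    "stg_orders": ["fct_orders"],
    "stg_customers": ["dim_customers"],
    "fct_orders": ["orders_report"],
    "dim_customers": [],
    "orders_report": [],
}


def with_descendants(start_nodes, children):
    # Models the "+" suffix: the starting nodes plus everything downstream.
    selected, stack = set(), list(start_nodes)
    while stack:
        node = stack.pop()
        if node not in selected:
            selected.add(node)
            stack.extend(children.get(node, []))
    return selected


errored = {"fct_orders"}        # stands in for result:error
modified = {"stg_customers"}    # stands in for state:modified

# Space-delimited criteria are combined as a union (OR).
selection = with_descendants(errored, CHILDREN) | with_descendants(modified, CHILDREN)
print(sorted(selection))
```

Note that stg_orders is left out: it neither errored nor changed, and it sits upstream of the errored node rather than downstream.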

The "source_status" status

Another element of job state is the source_status of a prior dbt invocation. After executing dbt source freshness, for example, dbt creates the sources.json artifact which contains execution times and max_loaded_at dates for dbt sources. You can read more about sources.json on the 'sources' page.

The dbt source freshness command produces a sources.json artifact whose results can be referenced in subsequent dbt invocations.

When a job is selected, dbt Cloud will surface the artifacts from that job's most recent successful run. dbt will then use those artifacts to determine the set of fresh sources. In your job commands, you can signal dbt to run and test only the fresher sources and their children by including the source_status:fresher+ argument. This requires both the previous and current states to have the sources.json artifact available. Put plainly, both job states need to have run dbt source freshness.

After issuing the dbt source freshness command, you can reference the source freshness results by adding a selector to a subsequent command:

# You can also set the DBT_STATE environment variable instead of the --state flag.
dbt source freshness # must be run again to compare current to previous state
dbt build --select "source_status:fresher+" --state path/to/prod/artifacts
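A simplified way to picture the comparison: for each source, check whether the current max_loaded_at is later than the one recorded in the previous state. The sketch below assumes exactly that rule, and also assumes that a source absent from the previous state counts as fresher. The dictionaries are hypothetical stand-ins, not the real sources.json layout.

```python
from datetime import datetime


def fresher_sources(previous, current):
    # A source counts as fresher when its max_loaded_at moved forward.
    # Assumption: a source missing from the previous state also counts.
    fresher = []
    for source, loaded_at in current.items():
        before = previous.get(source)
        if before is None or loaded_at > before:
            fresher.append(source)
    return sorted(fresher)


# Hypothetical max_loaded_at values keyed by source identifier.
previous = {
    "source.shop.orders": datetime(2024, 1, 1, 6, 0),
    "source.shop.customers": datetime(2024, 1, 1, 6, 0),
}
current = {
    "source.shop.orders": datetime(2024, 1, 2, 6, 0),
    "source.shop.customers": datetime(2024, 1, 1, 6, 0),
}

fresh = fresher_sources(previous, current)
print(fresh)
```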

For more example commands, refer to Pro-tips for workflows.
