SELECT [ hints ] [ ALL | DISTINCT ] { named_expression | star_clause } [, ...]
FROM from_item [, ...]
[ LATERAL VIEW clause ]
[ PIVOT clause ]
[ WHERE clause ]
[ GROUP BY clause ]
[ HAVING clause]
[ QUALIFY clause ]
from_item
{ table_name [ TABLESAMPLE clause ] [ table_alias ] |
JOIN clause |
[ LATERAL ] table_valued_function [ table_alias ] |
VALUES clause |
[ LATERAL ] ( query ) [ TABLESAMPLE clause ] [ table_alias ] }
named_expression
expression [ column_alias ]
star_clause
[ { table_name | view_name } . ] *
Select all matching rows from the relation. Enabled by default.
Select all matching rows from the relation after removing duplicates in results.
An expression with an optional assigned name.
A combination of one or more values, operators, and SQL functions that evaluates to a value.
An optional column identifier naming the expression result. If no column_alias is provided Databricks Runtime derives one.
A shorthand to name all the referencable columns in the FROM clause. The list of columns is ordered by the order of from_items and the order of columns within each from_item.
The _metadata column is not included this list. You must reference it explicitly.
If present limits the columns to be named to those in the specified referencable table.
If specified limits the columns to be expanded to those in the specified referencable view.
A source of input for the SELECT. One of the following:
- table_name
Identifies a table that may contain a temporal specification. See Query an older snapshot of a table (time travel) for details.
- view_name
Identifies a view.
- JOIN
Combines two or more relations using a join.
- [ LATERAL ] table_valued_function
Invokes a table function. To refer to columns exposed by a preceding from_item in the same FROM clause you must specify LATERAL.
- VALUES
Defines an inline table.
- [ LATERAL ] ( query )
Computes a relation using a query. A query prefixed by LATERAL may reference columns exposed by a preceding from_item in the same FROM clause. Such a construct is called a correlated or dependent query.
LATERAL is supported since Databricks Runtime 9.0.
- TABLESAMPLE
Optionally reduce the size of the result set by only sampling a fraction of the rows.
- table_alias
Optionally specifies a label for the from_item. If the table_alias includes column_identifiers their number must match the number of columns in the from_item.
Used for data perspective; you can get the aggregated values based on specific column value.
Used in conjunction with generator functions such as EXPLODE, which generates a virtual table containing one or more rows. LATERAL VIEW applies the rows to each original output row.
Filters the result of the FROM clause based on the supplied predicates.
The expressions that are used to group the rows. This is used in conjunction with aggregate functions (MIN, MAX, COUNT, SUM, AVG) to group rows based on the grouping expressions and aggregate values in each group. When a FILTER clause is attached to an aggregate function, only the matching rows are passed to that function.
The predicates by which the rows produced by GROUP BY are filtered. The HAVING clause is used to filter rows after the grouping is performed. If you specify HAVING without GROUP BY, it indicates a GROUP BY without grouping expressions (global aggregate).
The predicates that are used to filter the results of window functions. To use QUALIFY, at least one window function is required to be present in the SELECT list or the QUALIFY clause.