Dataframe grouping functions
Dataframe grouping methods
- class pystarburst.relational_grouped_dataframe.GroupingSets(*sets: Column | List[Column])
Bases: object
Creates a GroupingSets object from a list of column/expression sets that you pass to DataFrame.group_by_grouping_sets(). See DataFrame.group_by_grouping_sets() for examples of how to use this class with a DataFrame. See GROUP BY GROUPING SETS for its counterpart in SQL (several examples are shown below).

| Python interface | SQL interface |
| --- | --- |
| GroupingSets([col("a")], [col("b")]) | GROUPING SETS ((a), (b)) |
| GroupingSets([col("a"), col("b")], [col("c"), col("d")]) | GROUPING SETS ((a, b), (c, d)) |
| GroupingSets([col("a"), col("b")]) | GROUPING SETS ((a, b)) |
| GroupingSets(col("a"), col("b")) | GROUPING SETS ((a, b)) |
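To make the SQL semantics behind these mappings concrete, here is a minimal pure-Python sketch of what GROUP BY GROUPING SETS computes. It does not use PyStarburst: `grouping_sets_sum`, the sample rows, and the `total` output key are hypothetical names chosen for illustration. Each grouping set is aggregated independently, and columns that are not part of a given set come back as NULL (`None` here).

```python
from collections import defaultdict


def grouping_sets_sum(rows, all_keys, grouping_sets, value):
    """Emulate sum(value) ... GROUP BY GROUPING SETS (...) over a list of dicts."""
    result = []
    for keys in grouping_sets:
        # Aggregate once per grouping set, independently of the other sets.
        totals = defaultdict(int)
        for row in rows:
            totals[tuple(row[k] for k in keys)] += row[value]
        for group, total in totals.items():
            out = {k: None for k in all_keys}  # columns outside the set are NULL
            out.update(zip(keys, group))
            out["total"] = total
            result.append(out)
    return result


rows = [
    {"a": 1, "b": "x", "v": 10},
    {"a": 1, "b": "y", "v": 20},
    {"a": 2, "b": "x", "v": 30},
]

# Counterpart of GROUPING SETS ((a), (b)): one result block per set.
by_a_and_b = grouping_sets_sum(rows, ["a", "b"], [["a"], ["b"]], "v")

# Counterpart of GROUPING SETS ((a, b)): a single set over both columns.
by_pair = grouping_sets_sum(rows, ["a", "b"], [["a", "b"]], "v")
```

With this data, the two-set form yields four rows (two groups for `a`, two for `b`), while the single-set form yields one row per distinct `(a, b)` pair.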
- class pystarburst.relational_grouped_dataframe.RelationalGroupedDataFrame(df: DataFrame, grouping_exprs: List[Expression], group_type: _GroupType)
Bases: object
Represents an underlying DataFrame with rows that are grouped by common values. Can be used to define aggregations on these grouped DataFrames.
- agg(*exprs: Column | Tuple[ColumnOrName, str] | Dict[str, str]) → DataFrame
Returns a DataFrame with computed aggregates. See examples in DataFrame.group_by().
Parameters:
exprs – A variable-length argument list where every element is one of:
- a Column object
- a tuple where the first element is a Column object or a column name and the second element is the name of the aggregate function
- a list of the above
- a dict that maps column names to aggregate function names
Note
The name of the aggregate function to compute must be a valid Trino aggregate function.
See also
DataFrame.agg()
DataFrame.group_by()
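The argument shapes accepted by agg() can be illustrated with a small pure-Python emulation. This is a sketch of the calling convention only, not PyStarburst's implementation: `normalize`, `group_and_agg`, the `AGGREGATES` table, and the `"sum(v)"` style result keys are all assumptions made for the example, and Column objects are left out so that plain column names suffice.

```python
from collections import defaultdict

# Illustrative stand-ins for a few aggregate function names.
AGGREGATES = {"sum": sum, "min": min, "max": max}


def normalize(exprs):
    """Flatten tuples, lists of tuples, and dicts into (column, function) pairs."""
    pairs = []
    for expr in exprs:
        if isinstance(expr, dict):
            pairs.extend(expr.items())      # {"v": "sum"} -> ("v", "sum")
        elif isinstance(expr, list):
            pairs.extend(normalize(expr))   # a list of the other forms
        else:
            pairs.append(tuple(expr))       # ("v", "sum")
    return pairs


def group_and_agg(rows, key, *exprs):
    """Group rows by `key`, then apply each requested aggregate per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return {
        group_key: {
            f"{func}({column})": AGGREGATES[func]([m[column] for m in members])
            for column, func in normalize(exprs)
        }
        for group_key, members in groups.items()
    }


rows = [{"k": "x", "v": 1}, {"k": "x", "v": 5}, {"k": "y", "v": 7}]

from_tuples = group_and_agg(rows, "k", ("v", "sum"), ("v", "max"))
from_list = group_and_agg(rows, "k", [("v", "sum"), ("v", "max")])
from_dict = group_and_agg(rows, "k", {"v": "sum"})
```

In this sketch the tuple form and the list form produce identical results. The dict form can name only one aggregate per column, because dict keys are unique.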
- builtin(agg_name: str) → Callable
Computes the builtin aggregate agg_name over the specified columns. Use this function to invoke any aggregates not explicitly listed in this class. See examples in DataFrame.group_by().
- function(agg_name: str) → Callable
Computes the builtin aggregate agg_name over the specified columns. Use this function to invoke any aggregates not explicitly listed in this class. See examples in DataFrame.group_by().
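Because builtin() and function() return a Callable, they are used in two steps: the first call selects the aggregate by name, and the returned callable is then invoked with the columns. The pure-Python sketch below mimics that pattern only. `GroupedRows` and `AGGREGATES` are hypothetical stand-ins, not PyStarburst classes, and the sketch operates on a single in-memory group.

```python
# Illustrative subset of aggregate names; chosen for the example only.
AGGREGATES = {"sum": sum, "min": min, "max": max}


class GroupedRows:
    """Hypothetical stand-in that holds the rows of one group."""

    def __init__(self, rows):
        self.rows = rows

    def builtin(self, agg_name):
        aggregate = AGGREGATES[agg_name]  # step 1: resolve the aggregate by name

        def apply(*columns):
            # Step 2: apply the resolved aggregate to each named column.
            return {c: aggregate([r[c] for r in self.rows]) for c in columns}

        return apply

    # Both names share one description in the reference, so the sketch aliases them.
    function = builtin


group = GroupedRows([{"a": 3, "b": 10}, {"a": 8, "b": 4}])
highest = group.builtin("max")("a", "b")
total = group.function("sum")("a")
```

Here `highest` holds the per-column maximum and `total` the sum of column `a`. The real methods are documented as returning a DataFrame-producing callable, so treat the dict results as a simplification.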