Window function#

Window frames in pystarburst.

class pystarburst.window.Window#

Bases: object

Examples

>>> from pystarburst.functions import col, avg
>>> from pystarburst.window import Window
>>> window1 = Window.partition_by("value").order_by("key").rows_between(Window.CURRENT_ROW, 2)
>>> window2 = Window.order_by(col("key").desc()).range_between(Window.UNBOUNDED_PRECEDING, Window.UNBOUNDED_FOLLOWING)
>>> df = session.create_dataframe([(1, "1"), (2, "2"), (1, "3"), (2, "4")], schema=["key", "value"])
>>> df.select(avg("value").over(window1).as_("window1"), avg("value").over(window2).as_("window2")).collect()
[Row(WINDOW1=3.0, WINDOW2=2.5), Row(WINDOW1=2.0, WINDOW2=2.5), Row(WINDOW1=1.0, WINDOW2=2.5), Row(WINDOW1=4.0, WINDOW2=2.5)]
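The values in the output above can be checked by hand. The following plain-Python sketch does not use pystarburst; it assumes the string values are treated as numbers when averaged, and the helper name `window1_avg` is hypothetical.

```python
from collections import defaultdict

# Same rows as the example above: (key, value), with value cast to float.
rows = [(1, 1.0), (2, 2.0), (1, 3.0), (2, 4.0)]

def window1_avg(rows):
    """PARTITION BY value ORDER BY key ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING."""
    partitions = defaultdict(list)
    for key, value in rows:
        partitions[value].append((key, value))
    result = {}
    for part in partitions.values():
        part.sort(key=lambda r: r[0])  # ORDER BY key
        for i, row in enumerate(part):
            frame = part[i : i + 3]  # current row plus up to 2 following rows
            result[row] = sum(v for _, v in frame) / len(frame)
    return result

# window2 has no partitioning and an unbounded frame, so every row
# sees the average of the whole input.
window2 = sum(v for _, v in rows) / len(rows)

w1 = window1_avg(rows)
for row in rows:
    print(row, w1[row], window2)
```

Because every value is distinct, each partition of window1 holds a single row, so its average is that row's own value, while window2 is 2.5 for every row.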
CURRENT_ROW: int = 0#

A value representing the current row.

UNBOUNDED_FOLLOWING: int = 9223372036854775807#

A value representing unbounded following: the frame extends to the last row of the partition.

UNBOUNDED_PRECEDING: int = -9223372036854775807#

A value representing unbounded preceding: the frame extends back to the first row of the partition.

currentRow: int = 0#

Same value as CURRENT_ROW.

static orderBy(*cols: ColumnOrName | Iterable[ColumnOrName]) WindowSpec#

Returns a WindowSpec object with the given order by clause.

Parameters:

cols – One or more columns, each given as a str name or a Column, or as a list of those.

static order_by(*cols: ColumnOrName | Iterable[ColumnOrName]) WindowSpec#

Returns a WindowSpec object with the given order by clause.

Parameters:

cols – One or more columns, each given as a str name or a Column, or as a list of those.

static partitionBy(*cols: ColumnOrName | Iterable[ColumnOrName]) WindowSpec#

Returns a WindowSpec object with the given partition by clause.

Parameters:

cols – One or more columns, each given as a str name or a Column, or as a list of those.

static partition_by(*cols: ColumnOrName | Iterable[ColumnOrName]) WindowSpec#

Returns a WindowSpec object with the given partition by clause.

Parameters:

cols – One or more columns, each given as a str name or a Column, or as a list of those.

static rangeBetween(start: int, end: int) WindowSpec#

Returns a WindowSpec object with the range frame clause.

Parameters:
  • start – The relative position from the current row as a boundary start (inclusive). The frame is unbounded if this is Window.UNBOUNDED_PRECEDING, or any value less than or equal to -9223372036854775807 (-sys.maxsize).

  • end – The relative position from the current row as a boundary end (inclusive). The frame is unbounded if this is Window.UNBOUNDED_FOLLOWING, or any value greater than or equal to 9223372036854775807 (sys.maxsize).

Note

You can use Window.UNBOUNDED_PRECEDING, Window.UNBOUNDED_FOLLOWING, and Window.CURRENT_ROW to specify start and end, instead of using integral values directly.
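The start and end rules above can be read as a mapping from an integer offset to a SQL frame bound. The sketch below is a plain-Python illustration built around a hypothetical helper, `describe_bound`; it is not how pystarburst renders frames internally, and the constant values are copied from this page rather than imported.

```python
# Values copied from the Window attributes documented on this page.
CURRENT_ROW = 0
UNBOUNDED_PRECEDING = -9223372036854775807
UNBOUNDED_FOLLOWING = 9223372036854775807

def describe_bound(offset: int) -> str:
    """Hypothetical helper: name the frame bound that an integer offset denotes."""
    if offset <= UNBOUNDED_PRECEDING:
        return "UNBOUNDED PRECEDING"
    if offset >= UNBOUNDED_FOLLOWING:
        return "UNBOUNDED FOLLOWING"
    if offset == CURRENT_ROW:
        return "CURRENT ROW"
    # Negative offsets look backwards, positive offsets look forwards.
    return f"{-offset} PRECEDING" if offset < 0 else f"{offset} FOLLOWING"

print(describe_bound(-1), "AND", describe_bound(2))
print(describe_bound(UNBOUNDED_PRECEDING), "AND", describe_bound(CURRENT_ROW))
```

The first call prints `1 PRECEDING AND 2 FOLLOWING`, the second `UNBOUNDED PRECEDING AND CURRENT ROW`.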

static range_between(start: int, end: int) WindowSpec#

Returns a WindowSpec object with the range frame clause.

Parameters:
  • start – The relative position from the current row as a boundary start (inclusive). The frame is unbounded if this is Window.UNBOUNDED_PRECEDING, or any value less than or equal to -9223372036854775807 (-sys.maxsize).

  • end – The relative position from the current row as a boundary end (inclusive). The frame is unbounded if this is Window.UNBOUNDED_FOLLOWING, or any value greater than or equal to 9223372036854775807 (sys.maxsize).

Note

You can use Window.UNBOUNDED_PRECEDING, Window.UNBOUNDED_FOLLOWING, and Window.CURRENT_ROW to specify start and end, instead of using integral values directly.
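In standard SQL, a range frame selects rows by the value of the ordering column rather than by position: for ascending order, a row belongs to the frame when its ordering value lies between the current row's value plus start and the current row's value plus end, inclusive. The plain-Python sketch below illustrates that rule with a hypothetical function, `range_frame_sum`; it does not call pystarburst.

```python
def range_frame_sum(keys, values, start, end):
    """Sum `values` over a RANGE frame on `keys`, for ascending order.

    Row j is in row i's frame when keys[i] + start <= keys[j] <= keys[i] + end.
    """
    sums = []
    for k in keys:
        lo, hi = k + start, k + end
        sums.append(sum(v for kj, v in zip(keys, values) if lo <= kj <= hi))
    return sums

keys = [1, 2, 2, 5]
values = [10, 20, 30, 40]
# RANGE BETWEEN 1 PRECEDING AND CURRENT ROW
print(range_frame_sum(keys, values, -1, 0))  # [10, 60, 60, 40]
```

The two rows with key 2 are peers and share one frame, and the row with key 5 sees only itself because no other key falls in [4, 5].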

static rowsBetween(start: int, end: int) WindowSpec#

Returns a WindowSpec object with the row frame clause.

Parameters:
  • start – The relative position from the current row as a boundary start (inclusive). The frame is unbounded if this is Window.UNBOUNDED_PRECEDING, or any value less than or equal to -9223372036854775807 (-sys.maxsize).

  • end – The relative position from the current row as a boundary end (inclusive). The frame is unbounded if this is Window.UNBOUNDED_FOLLOWING, or any value greater than or equal to 9223372036854775807 (sys.maxsize).

Note

You can use Window.UNBOUNDED_PRECEDING, Window.UNBOUNDED_FOLLOWING, and Window.CURRENT_ROW to specify start and end, instead of using integral values directly.

static rows_between(start: int, end: int) WindowSpec#

Returns a WindowSpec object with the row frame clause.

Parameters:
  • start – The relative position from the current row as a boundary start (inclusive). The frame is unbounded if this is Window.UNBOUNDED_PRECEDING, or any value less than or equal to -9223372036854775807 (-sys.maxsize).

  • end – The relative position from the current row as a boundary end (inclusive). The frame is unbounded if this is Window.UNBOUNDED_FOLLOWING, or any value greater than or equal to 9223372036854775807 (sys.maxsize).

Note

You can use Window.UNBOUNDED_PRECEDING, Window.UNBOUNDED_FOLLOWING, and Window.CURRENT_ROW to specify start and end, instead of using integral values directly.
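A row frame counts physical rows relative to the current one, so start and end act as inclusive position offsets that are clipped at the partition edges. The following plain-Python sketch shows that behaviour with a hypothetical function, `rows_frame_sum`; it is an illustration of the semantics, not pystarburst code.

```python
def rows_frame_sum(values, start, end):
    """Sum `values` over a ROWS frame; `values` is already in window order.

    Row i's frame covers positions i + start through i + end, inclusive,
    clipped to the first and last row of the partition.
    """
    n = len(values)
    sums = []
    for i in range(n):
        lo = max(0, i + start)
        hi = min(n - 1, i + end)
        sums.append(sum(values[lo : hi + 1]))
    return sums

values = [10, 20, 30, 40]
# ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING
print(rows_frame_sum(values, 0, 2))   # [60, 90, 70, 40]
# ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
print(rows_frame_sum(values, -1, 0))  # [10, 30, 50, 70]
```

Passing the unbounded sentinel values as start or end works in this sketch too, because the clipping reduces them to the first or last position.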

unboundedFollowing: int = 9223372036854775807#

Same value as UNBOUNDED_FOLLOWING.

unboundedPreceding: int = -9223372036854775807#

Same value as UNBOUNDED_PRECEDING.

class pystarburst.window.WindowSpec(partition_spec: List[Expression] | None, order_spec: List[SortOrder] | None, frame: WindowFrame)#

Bases: object

Represents a window specification: the partition by, order by, and frame clauses of a window.

orderBy(*cols: ColumnOrName | Iterable[ColumnOrName]) WindowSpec#

Returns a new WindowSpec object with the new order by clause.

orderBy() is an alias of order_by().

order_by(*cols: ColumnOrName | Iterable[ColumnOrName]) WindowSpec#

Returns a new WindowSpec object with the new order by clause.

orderBy() is an alias of order_by().

partitionBy(*cols: ColumnOrName | Iterable[ColumnOrName]) WindowSpec#

Returns a new WindowSpec object with the new partition by clause.

partitionBy() is an alias of partition_by().

partition_by(*cols: ColumnOrName | Iterable[ColumnOrName]) WindowSpec#

Returns a new WindowSpec object with the new partition by clause.

partitionBy() is an alias of partition_by().

rangeBetween(start: int, end: int) WindowSpec#

Returns a new WindowSpec object with the new range frame clause.

rangeBetween() is an alias of range_between().

range_between(start: int, end: int) WindowSpec#

Returns a new WindowSpec object with the new range frame clause.

rangeBetween() is an alias of range_between().

rowsBetween(start: int, end: int) WindowSpec#

Returns a new WindowSpec object with the new row frame clause.

rowsBetween() is an alias of rows_between().

rows_between(start: int, end: int) WindowSpec#

Returns a new WindowSpec object with the new row frame clause.

rowsBetween() is an alias of rows_between().
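Each WindowSpec method is documented as returning a new WindowSpec, which suggests a builder style where a base specification can be reused and extended. The sketch below models that pattern with a hypothetical frozen dataclass, `SketchWindowSpec`; it is a stand-in for illustration only, and it assumes the receiver is left unchanged, which this page implies but does not state.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

@dataclass(frozen=True)
class SketchWindowSpec:
    """Hypothetical stand-in for WindowSpec; not the pystarburst class."""
    partition_cols: Tuple[str, ...] = ()
    order_cols: Tuple[str, ...] = ()
    frame: Optional[Tuple[str, int, int]] = None

    def partition_by(self, *cols: str) -> "SketchWindowSpec":
        # Return a copy with the partition clause replaced.
        return replace(self, partition_cols=cols)

    def order_by(self, *cols: str) -> "SketchWindowSpec":
        # Return a copy with the order clause replaced.
        return replace(self, order_cols=cols)

    def rows_between(self, start: int, end: int) -> "SketchWindowSpec":
        # Return a copy with a row frame attached.
        return replace(self, frame=("rows", start, end))

base = SketchWindowSpec().partition_by("value")
framed = base.order_by("key").rows_between(0, 2)

print(base.order_cols)  # ()
print(framed.frame)     # ('rows', 0, 2)
```

Under that assumption, `base` can be shared by several windows that differ only in ordering or frame.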