Advanced topics
Pagination
By default, Subgrounds handles GraphQL query pagination automatically. That is, if a query selects more than 1000 entities using the first argument (1000 being The Graph's limit on the first argument), Subgrounds automatically splits it into multiple queries that each select at most 1000 entities.
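To make the splitting concrete, here is a minimal sketch of how a request for more than 1000 entities could be broken into page-sized requests. It is an illustration only: the helper split_first is hypothetical, and it uses plain skip offsets, which is an assumption and not necessarily how Subgrounds paginates internally.

```python
from typing import List, Tuple

PAGE_LIMIT = 1000  # The Graph's limit on the `first` argument


def split_first(first: int, page_limit: int = PAGE_LIMIT) -> List[Tuple[int, int]]:
    """Hypothetical helper: split `first` into (skip, first) pairs of at most `page_limit`."""
    pages = []
    skip = 0
    while skip < first:
        size = min(page_limit, first - skip)
        pages.append((skip, size))
        skip += size
    return pages


print(split_first(2500))  # [(0, 1000), (1000, 1000), (2000, 500)]
```

Each pair corresponds to one underlying query, so a request for 2500 entities becomes three queries.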
Pagination is performed with a pagination strategy: a class that implements the PaginationStrategy protocol. Subgrounds provides two pagination strategies out of the box; users who want to implement their own strategy should create a class that implements this protocol (see the Custom pagination strategy section).
If an unhandled exception occurs during the pagination process, Subgrounds raises a PaginationError exception containing the initial exception message as well as the PaginationStrategy object in the state it was in when the error occurred. With iterative querying (e.g., when using query_df_iter), this state can be useful for recovering and resuming pagination from a later stage.
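The recovery idea can be sketched without a network connection. Everything below is a stand-in written for illustration: the local PaginationError class, its strategy attribute name, the OffsetState object, and the iter_pages generator are assumptions, not Subgrounds' actual classes. The point is only that the state carried by the exception lets you resume instead of starting over.

```python
from typing import Any, Iterator, List


class PaginationError(RuntimeError):
    """Stand-in for Subgrounds' exception; the `strategy` attribute name is an assumption."""

    def __init__(self, message: str, strategy: Any) -> None:
        super().__init__(message)
        self.strategy = strategy


class OffsetState:
    """Hypothetical strategy state: tracks how many entities were fetched so far."""

    def __init__(self, skip: int = 0) -> None:
        self.skip = skip


def iter_pages(
    state: OffsetState, total: int, page_size: int, fail_at: int = -1
) -> Iterator[List[int]]:
    """Yield fake pages of entity ids; fail once the offset reaches `fail_at`."""
    while state.skip < total:
        try:
            if state.skip == fail_at:
                raise ConnectionError("simulated network failure")
            page = list(range(state.skip, min(state.skip + page_size, total)))
        except Exception as exn:
            # Wrap the error together with the current pagination state
            raise PaginationError(exn.args[0], state) from exn
        state.skip += len(page)
        yield page


fetched: List[int] = []
try:
    for page in iter_pages(OffsetState(), total=5, page_size=2, fail_at=2):
        fetched.extend(page)
except PaginationError as err:
    # Resume from the state captured in the exception instead of starting over
    for page in iter_pages(err.strategy, total=5, page_size=2):
        fetched.extend(page)

print(fetched)  # [0, 1, 2, 3, 4]
```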
Available pagination strategies
Subgrounds provides two pagination strategies out of the box:
LegacyStrategy: A pagination strategy that implements the pagination algorithm that was used by default prior to this update. This strategy supports pagination on nested fields, but is quite slow. Below is an example of a query for which you should use this strategy:
    query {
      liquidityPools(first: 10) {
        swaps(first: 5000) {
          id
        }
      }
    }
ShallowStrategy: A new pagination strategy that is faster than the LegacyStrategy, but does not paginate on nested list fields. In other words, this strategy is best when nested list fields select fewer than 1000 entities. Below is an example of a query for which you should use this strategy:
    query {
      liquidityPools(first: 5000) {
        swaps(first: 10) {
          id
        }
      }
    }
To use either pagination strategy, set the pagination_strategy argument of top-level querying functions:
from subgrounds import Subgrounds
from subgrounds.pagination import ShallowStrategy

sg = Subgrounds()
subgraph = sg.load_subgraph("https://api.thegraph.com/subgraphs/name/messari/compound-ethereum")

# Select the 1000 most recent daily market snapshots
mkt_daily_snapshots = subgraph.Query.marketDailySnapshots(
    orderBy='timestamp',
    orderDirection='desc',
    first=1000
)

field_paths = [
    mkt_daily_snapshots.timestamp,
    mkt_daily_snapshots.market.inputToken.symbol,
    mkt_daily_snapshots.rates.rate,
    mkt_daily_snapshots.rates.side,
]

df = sg.query_df(field_paths, pagination_strategy=ShallowStrategy)
Note that pagination can be explicitly disabled by setting pagination_strategy to None, in which case the query will be executed as-is:
df = sg.query_df(field_paths, pagination_strategy=None)
Custom pagination strategy
Subgrounds allows developers to create their own pagination strategy by creating a class that implements the PaginationStrategy protocol:
from typing import Any, Optional, Protocol, Tuple

# SchemaMeta and Document are Subgrounds types
class PaginationStrategy(Protocol):
    def __init__(
        self,
        schema: SchemaMeta,
        document: Document
    ) -> None: ...

    def step(
        self,
        page_data: Optional[dict[str, Any]] = None
    ) -> Tuple[Document, dict[str, Any]]: ...
The class's constructor should accept a SchemaMeta argument, which represents the schema of the subgraph API that the query is directed to, and a Document argument, which represents the query to be paginated. If no pagination is required for the given document, the constructor should raise a SkipPagination exception.
The class's step method is where the main logic of the pagination strategy is located. The method accepts a single argument, page_data, which is a dictionary containing the response data of the previous query (i.e., the previous page of data), or None on the first call. The step method should return a tuple (doc, vars), where doc is a Document representing the query to be made to fetch the next page of data and vars is a dictionary of variables to send with that query. When pagination is over (e.g., when all pages of data have been fetched), the step method should raise a StopPagination exception.
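As an illustration of the contract described above, here is a minimal sketch of a toy strategy. It is self-contained, so Doc, SkipPagination, and StopPagination are simplified local stand-ins for the Subgrounds types of similar name, and the OffsetStrategy class (including its first/skip variables and its assumption of a single top-level list in the response) is hypothetical, not one of the strategies shipped with Subgrounds.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


class SkipPagination(Exception):
    """Stand-in: raised by the constructor when no pagination is needed."""


class StopPagination(Exception):
    """Stand-in: raised by `step` when all pages have been fetched."""


@dataclass
class Doc:
    """Hypothetical, simplified stand-in for Subgrounds' Document."""
    graphql: str
    first: int  # number of entities requested by the query
    variables: Dict[str, Any] = field(default_factory=dict)


class OffsetStrategy:
    """Toy strategy paging through one top-level list with `first`/`skip` variables."""

    PAGE_SIZE = 1000

    def __init__(self, schema: Any, document: Doc) -> None:
        if document.first <= self.PAGE_SIZE:
            raise SkipPagination()  # a single query is enough
        self.document = document
        self.remaining = document.first
        self.skip = 0
        self.last_size = 0

    def step(
        self, page_data: Optional[Dict[str, Any]] = None
    ) -> Tuple[Doc, Dict[str, Any]]:
        if page_data is not None:
            # Account for the page that was just fetched
            received = len(next(iter(page_data.values())))
            self.skip += received
            self.remaining -= received
            if received < self.last_size:
                raise StopPagination()  # short page: the server has no more entities
        if self.remaining <= 0:
            raise StopPagination()
        self.last_size = min(self.PAGE_SIZE, self.remaining)
        return self.document, {"first": self.last_size, "skip": self.skip}


query = "query($first: Int, $skip: Int) { swaps(first: $first, skip: $skip) { id } }"
strategy = OffsetStrategy(schema=None, document=Doc(query, first=2500))
doc, variables = strategy.step()
print(variables)  # {'first': 1000, 'skip': 0}
```

Keeping all mutable state (skip, remaining) on the strategy object is what makes it possible to inspect or resume a partially completed pagination.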
Below is the algorithm used by Subgrounds to paginate over a query document given a pagination strategy:
def paginate(
    schema: SchemaMeta,
    doc: Document,
    pagination_strategy: Type[PaginationStrategy]
) -> dict[str, Any]:
    try:
        # Initialize the strategy
        strategy = pagination_strategy(schema, doc)

        data: dict[str, Any] = {}

        # Compute the query document and variables to get the first page of data
        next_page_doc, variables = strategy.step(page_data=None)

        while True:
            try:
                # Fetch a data page
                page_data = client.query(
                    url=next_page_doc.url,
                    query_str=next_page_doc.graphql,
                    variables=next_page_doc.variables | variables
                )

                # Merge the page with the data blob
                data = merge(data, page_data)

                # Compute the query document and variables to get the next page of data
                next_page_doc, variables = strategy.step(page_data=page_data)

            except StopPagination:
                break

            except Exception as exn:
                raise PaginationError(exn.args[0], strategy)

        return data

    except SkipPagination:
        # Execute the query document as is if `SkipPagination` is raised
        return client.query(doc.url, doc.graphql, variables=doc.variables)
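The algorithm above relies on a merge step whose behavior is not shown. As a rough sketch of what such a step could look like, the function below concatenates lists found under the same key and recurses into nested dictionaries. This merge is a hypothetical illustration, not Subgrounds' own implementation, and the sample pages are made up.

```python
from typing import Any, Dict


def merge(data: Dict[str, Any], page: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical merge: concatenate lists under the same key, recurse into dicts."""
    merged = dict(data)
    for key, value in page.items():
        if key not in merged:
            merged[key] = value
        elif isinstance(merged[key], list) and isinstance(value, list):
            merged[key] = merged[key] + value
        elif isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value  # scalar: keep the most recent page's value
    return merged


data: Dict[str, Any] = {}
for page in ({"swaps": [{"id": "a"}, {"id": "b"}]}, {"swaps": [{"id": "c"}]}):
    data = merge(data, page)

print(data)  # {'swaps': [{'id': 'a'}, {'id': 'b'}, {'id': 'c'}]}
```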