I’m creating an R API wrapper around my state’s public transport service. To make life easier for the users, the responses from the API calls are parsed and returned as tibbles/data frames. To make life easier for me, I need to keep track of the API call behind each tibble. I do this by using the tibble::new_tibble()
function to attach metadata to the tibble as attributes, and creating a custom print
method to make the metadata visible.
First, the raw response from the API call is structured as a list:
response <- list(
request = request_url_without_auth,
retrieved = format(
Sys.time(),
format = "%Y-%m-%d %H:%M:%OS %Z",
tz = "Australia/Melbourne"
),
status_code = status_code,
content = content
)
Some other function will take the content in this list, process it, and create a parsed
tibble. We hand this off to the tibble::new_tibble()
function. Along with the new class name — “ptvapi” — we also attach the response
metadata as attributes to the new tibble.
tibble::new_tibble(
parsed,
nrow = nrow(parsed),
class = "ptvapi",
request = response$request,
retrieved = response$retrieved,
status_code = response$status_code,
content = response$content
)
Let’s say we have a tibble flinders_departures
created through this process. flinders_departures
will have the following classes, in order: “ptvapi”, “tbl_df”, “tbl”, and “data.frame”.
For those who are unfamiliar to S3, some functions in R — like print()
, summary()
and predict()
— are generics. When we call a generic, R will look through the classes of the argument to find the right method to call. When we call print(flinders_departures)
(or, equivalently, enter flinders_departures
at the console) R will first look for the print.ptvapi()
method. If it can’t find that, it will move on to print.tbl_df()
, and so on, until it tries print.default()
.
If I were to let R go through this method I would print flinders_departures
without printing the metadata. So I created a simple way of printing out those attributes:
print.ptvapi <- function(x) {
if (!is.null(attr(x, "request"))) {
cat("Request:", attr(x, "request"), "\n")
}
if (!is.null(attr(x, "retrieved"))) {
cat("Retrieved:", attr(x, "retrieved"), "\n")
}
if (!is.null(attr(x, "status_code"))) {
cat("Status code:", attr(x, "status_code"), "\n")
}
NextMethod()
}
This method will print out the attributes if they exist. NextMethod()
will then make R go down the class chain, until it prints like a regular tibble. This is great for debugging. The response of the API call is (more or less) determined by the three attributes I specially print. So it makes life much easier for me to be able relate the parsed tibble to the API response.
> flinders_departures
Request: http://timetableapi.ptv.vic.gov.au/v3/departures/route_type/0/stop/1071?max_results=5&date_utc=2020-05-18T12:14:10&include_cancelled=false
Retrieved: 2020-05-18 22:14:11 AEST
Status code: 200
# A tibble: 75 x 11
direction_id stop_id route_id run_id platform_number at_platform departure_seque…
<int> <int> <int> <int> <chr> <lgl> <int>
1 5 1071 6 952531 9 FALSE 0
2 2 1071 3 953881 5 FALSE 0
3 13 1071 14 954655 4 FALSE 0
4 6 1071 7 950675 2 FALSE 0
5 9 1071 5 949763 1 FALSE 0
6 16 1071 17 954539 9 FALSE 0
7 11 1071 12 988175 13 FALSE 0
8 10 1071 11 952689 6 FALSE 0
9 8 1071 9 951849 3 FALSE 0
10 14 1071 15 953653 4 FALSE 0
# … with 65 more rows, and 4 more variables: scheduled_departure <dttm>,
# estimated_departure <dttm>, flags <chr>, disruption_ids <list>
Most importantly, because I haven’t defined any methods like mutate.ptvapi()
, every generic other than print()
will treat this tibble as a tibble. So all of my data manipulation functions will ignore the metadata I’ve attached to this tibble.
A quick heads up: it’s not guaranteed that every function will preserve attributes. So after manipulation, the “ptvapi” class may be lost, along with the metadata. That’s fine for my purpose, but maybe not for yours.
Share this post
Twitter
Google+
Facebook
Reddit
LinkedIn
StumbleUpon
Pinterest
Email