Date last run: 14May2020
Introduction
In the LinkedIn group Centraal Bureau voor de Statistiek; Open Data I saw the article New beta release for CBS OData4 dataportal. The article points to the CBS Dataportal page on their website for more information and mentions the new root pointer.
In the past I included two functions in package HOQCutil for version OData4: get_table_cbs_odata4 and get_table_cbs_odata4_GET.
In this blog entry I check whether these two functions in HOQCutil still work.
OData3
In the past I made the package odataR for OData3. I rebuilt this package for R 4.0.0 and did not find any errors. The remainder of this document only concerns OData4.
OData4
For the previous beta some of the functionality was already tested. The results of that test can be found in the pdf file opendata_beta_versie4_dec2018_20181225.pdf. In this document I describe the tests done for the new version.
OData4 CBS documentation
On the CBS Dataportal the following documentation can be found:
- FAQ with among others
  - a reference to infoservice for questions about (open) data
  - OData4 information under the header Welke OData 4 commando's zijn beschikbaar? and examples of use for the commands that are implemented. SpecRunner is supposed to deliver an overview of all implemented functions, but in the Firefox and Chrome browsers it gives the following error: Error: error at Object.<anonymous> (https://beta-odata4.cbs.nl/spec/Validation%20OData4/DateAndTimeFunctions.js:219:22) at u (https://beta-odata4.cbs.nl/external/jquery-3.3.1.min.js:2:33479) at Object.fireWith [as rejectWith] (https://beta-odata4.cbs.nl/external/jquery-3.3.1.min.js:2:34432) at k (https://beta-odata4.cbs.nl/external/jquery-3.3.1.min.js:2:93855) at XMLHttpRequest.<anonymous> (https://beta-odata4.cbs.nl/external/jquery-3.3.1.min.js:2:96455)
  - a pdf handleiding (manual) in Dutch with the differences between OData3 and OData4 and information about how to convert from version 3 to version 4.
- in the tab Informatie voor (Information for), subsection ontwikkelaars (developers), we find
  - Snelstartgids OData v4 (quick guide), which gives information about retrieving CBS data for the construction of a map and for creating time series in R or Python. The layout suggests that there is also information about filters and Metadata, but this is not visible.
  - a reference to the GitHub repository CBS Open Data v4 with the same code examples. This repository is said to contain an R package for OData4. I could not find a package in this repository.
- in the tab Informatie voor (Information for), subsection data analysts, we find
Changes made in the HOQCutil package
While checking whether the two package functions get_table_cbs_odata4 and get_table_cbs_odata4_GET were still working, I realized that I should have made unit tests for the various pieces of functionality in OData and in my functions. So I decided to do this now. The test functions can be found in the package subfolder testthat.
I also took the opportunity to add the possibility for JSON output. I renamed the response parameter (it is now called restype). The three possible values for restype with their meaning:
- '': the output will be a data.frame wherever possible. This is the default. A call with subtable='Properties' will always return a list.
- 'resp': the output will be the response object returned by the OData server.
- 'json': the (original) JSON output of the OData server will not be converted to data.frame or list.
Test results
The root for the tables
As announced in the CBS blog the root for the CBS tables has been changed. Therefore the default for parameter odata_root in the function get_table_cbs_odata4 is now changed to https://beta-odata4.cbs.nl.
The list ‘Welke OData 4 commando’s zijn beschikbaar?’
The list in the FAQ is not complete. The list lacks the following functions that worked in the previous beta and still work now:
$count
This works, but differently in OData3 than in OData4. In OData3 the result is an integer; in OData4 it is a character string preceded by a Unicode character (U+FEFF, the byte order mark). NB: because the data are built up differently in the two versions, it is not surprising that the reported numbers differ.
# OData3: the result is an integer
count <- odataR::odataR_get_table(
  table_id = '81589NED',
  query = "$count")
str(count)
#> int 80244
# OData4: the result is a character string with a leading U+FEFF
count <- HOQCutil::get_table_cbs_odata4(
  table_id = '81589NED',
  subtable = 'Observations',
  query = "$count",
  verbose = TRUE)
#> generated url : https://beta-odata4.cbs.nl/CBS/81589NED/Observations/$count
#> unencoded query: $count
str(count)
#> chr "<U+FEFF>1034682"
# OData4 again, now with the 'raw' httr calls
resp <- httr::GET('https://beta-odata4.cbs.nl/CBS/81589NED/Observations/$count')
count <- httr::content(resp, as = "text", encoding = 'UTF-8')
str(count)
#> chr "<U+FEFF>1034682"
In the last case I also used the ‘raw’ httr function calls (with the same result) to show that the unicode string result is not caused by the get_table_cbs_odata4 function.
I think that the current behaviour of $count is an error.
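Until this is fixed on the server, the count can be cleaned up on the client side. The following is only a minimal sketch in base R: the helper name strip_bom is my own (it is not part of HOQCutil), and the input string simply reproduces the value shown above instead of calling the server.

```r
# Value as returned by the OData4 $count call above:
# a byte order mark (U+FEFF) followed by the digits.
raw_count <- "\ufeff1034682"

# Hypothetical helper: remove a leading U+FEFF if it is present.
strip_bom <- function(x) sub("^\ufeff", "", x)

# Convert the cleaned string to an integer, as OData3 would return it.
count <- as.integer(strip_bom(raw_count))
str(count)
#> int 1034682
```

A string without the leading character is returned unchanged by strip_bom, so the same code keeps working if the server behaviour is corrected later.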
Other functions working but not documented
The following OData3 functions are not documented in the list ‘Welke OData 4 commando’s zijn beschikbaar?’ but are working in version 4:
- $select
- length
- indexof
- tolower and toupper
- month, day and minute
- mod
Functions working but poorly documented
I think it is advisable to tell the reader that the second parameter of the function substring works with zero origin: the first character of a string is character 0. The same goes for function indexof, but that function is not mentioned at all.
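The zero origin is easy to overlook for R users, because substr in R counts from 1. The sketch below mimics the OData indexing locally in base R so that the difference becomes visible. The function names odata_substring and odata_indexof are hypothetical: they are not part of HOQCutil or of the CBS service, and they only illustrate the indexing convention.

```r
# Hypothetical local emulation of OData 'substring' (zero origin).
# start : position of the first character to keep, counted from 0
# length: optional number of characters to keep
odata_substring <- function(x, start, length = NULL) {
  stop_pos <- if (is.null(length)) nchar(x) else start + length
  substr(x, start + 1, stop_pos)   # R's substr counts from 1
}

# Hypothetical local emulation of OData 'indexof' (zero origin);
# returns -1 when the search string is not found.
odata_indexof <- function(x, needle) {
  pos <- as.integer(regexpr(needle, x, fixed = TRUE))
  ifelse(pos == -1L, -1L, pos - 1L)
}

odata_substring("Amsterdam", 0, 3)
#> [1] "Ams"
odata_indexof("Amsterdam", "ster")
#> [1] 2
```

So a filter that should match on the first three characters of a field needs start position 0 in OData, where the equivalent substr call in R would use 1.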
OData3 functions that are not working in OData4
$orderby
In OData4 $orderby is recognized as a function, and when it is used it requires an argument. However, it has no effect on the order, whatever the additional argument is: you can specify ‘asc’, ‘desc’, ‘??’ or no additional argument at all and the order of the results stays the same.
Because it is recognized as a function, I think that the behaviour of $orderby is an error.
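As long as $orderby has no effect, the ordering can be done in R after the data has been retrieved. This is only a sketch of that workaround: the data.frame below is a made-up stand-in for a retrieved table, and its column names and values are assumptions for illustration, not actual CBS data.

```r
# Made-up stand-in for a data.frame returned by the OData server;
# the column names and values are hypothetical.
obs <- data.frame(
  Id    = c(3L, 1L, 2L),
  Value = c(10.5, 7.2, 9.9)
)

# Ascending order on Value (what an 'asc' argument would be expected to do)
obs_asc <- obs[order(obs$Value), ]

# Descending order on Value (what a 'desc' argument would be expected to do)
obs_desc <- obs[order(obs$Value, decreasing = TRUE), ]

obs_asc$Id
#> [1] 1 2 3
obs_desc$Id
#> [1] 3 2 1
```

The drawback is that the whole result has to be downloaded before it can be sorted, so this does not replace a working $orderby for large tables.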
Conclusion
- The functions $count and $orderby have an error.
- The functionality SpecRunner gives an error.
- The documentation of the available functions is not yet complete.
- The documentation in tab Informatie voor, subsection ontwikkelaars, is not fully accurate.
Session Info
This document was produced on 14May2020 with the following R environment:
#> R version 4.0.0 (2020-04-24)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18363)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.1252
#> [2] LC_CTYPE=English_United States.1252
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] HOQCutil_0.1.22
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.4.6 digest_0.6.25 R6_2.4.1 jsonlite_1.6.1
#> [5] magrittr_1.5 evaluate_0.14 httr_1.4.1 odataR_0.1.4
#> [9] rlang_0.4.6 stringi_1.4.6 curl_4.3 rmarkdown_2.1
#> [13] tools_4.0.0 stringr_1.4.0 glue_1.4.0 purrr_0.3.4
#> [17] xfun_0.13 compiler_4.0.0 htmltools_0.4.0 knitr_1.28