Gbif
GBIFDownload
dataclass
Represents a GBIF download URL and some associated metadata.
Source code in dataimporter/ext/gbif.py
GBIFDownloadError
Bases: Exception
Raised when GBIF reports a failed or killed status during a download status check, so that we avoid waiting an hour for a download that is never going to succeed.
Source code in dataimporter/ext/gbif.py
GBIFDownloadTimeout
Bases: Exception
Raised when we time out trying to access the GBIF download file, i.e. we wait for GBIF to generate the download but it takes too long.
Source code in dataimporter/ext/gbif.py
GBIFView
Bases: View
View for GBIF records.
Source code in dataimporter/ext/gbif.py
transform(record)
Converts the GBIF record's raw data to a dict which will then be embedded in specimen records and presented on the Data Portal.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| record | SourceRecord | the record to project | required |
Returns:
| Type | Description |
|---|---|
| dict | a dict containing the data for this record that should be combined with a specimen record |
Source code in dataimporter/ext/gbif.py
get_changed_records(store, gbif_username, gbif_password)
Get a stream of the latest records from GBIF. This function takes time to complete because it requests a new download of the NHM's specimen dataset on GBIF, downloads it, and then streams the records from the downloaded CSV that have changed compared to the ones already in the data DB.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| store | Store | the GBIF Store | required |
| gbif_username | str | a GBIF username for requesting the download | required |
| gbif_password | str | a GBIF password for requesting the download | required |
Returns:
| Type | Description |
|---|---|
| Iterable[SourceRecord] | yields the changed SourceRecord objects |
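
The "stream only what changed" step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the helper name `iter_changed_rows`, the use of a plain dict in place of the GBIF Store, the `gbifID` identifier column, and the tab-delimited sample data. The real function works with `Store` and `SourceRecord` objects, which are not reproduced here.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, Tuple


def iter_changed_rows(
    rows: Iterable[dict],
    existing: Dict[str, dict],
    id_field: str = "gbifID",  # assumed name of the identifier column
) -> Iterator[Tuple[str, dict]]:
    """Yield (id, row) for rows that are new or differ from the stored copy.

    `existing` is a hypothetical stand-in for the data already held in the
    store, keyed by record ID.
    """
    for row in rows:
        record_id = row[id_field]
        # A missing ID gives None, which never equals a row, so new records
        # are yielded along with modified ones.
        if existing.get(record_id) != row:
            yield record_id, row


# Illustrative data only: two rows, one of which differs from the stored copy.
sample_csv = "gbifID\tcountry\n1\tUK\n2\tFR\n"
stored = {
    "1": {"gbifID": "1", "country": "UK"},
    "2": {"gbifID": "2", "country": "DE"},
}
reader = csv.DictReader(io.StringIO(sample_csv), delimiter="\t")
changed = list(iter_changed_rows(reader, stored))
```

Because the helper is a generator, rows are compared one at a time as the CSV is read, so the whole download never has to be held in memory.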
Source code in dataimporter/ext/gbif.py
get_download_url(download_id)
Wait for the given download to be ready and then return the URL to download the file.
This function tries every minute for an hour, checking the download status via the GBIF API. Once the status changes to "SUCCEEDED", the download URL is returned. If the status doesn't change to "SUCCEEDED" within the hour, a GBIFDownloadTimeout exception is raised. If a failed status appears, the function raises a GBIFDownloadError exception.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| download_id | str | the GBIF download ID | required |
Returns:
| Type | Description |
|---|---|
| GBIFDownload | a GBIFDownload object |
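
The polling behaviour described above can be sketched roughly as follows. This is not the code in dataimporter/ext/gbif.py: the function name `wait_for_download`, the injected `fetch_status` callable, the `GBIFDownload` field names, and the `"downloadLink"` response key are all assumptions made so the example is self-contained. The exception classes are local stand-ins for the ones documented on this page.

```python
import time
from dataclasses import dataclass
from typing import Callable


class GBIFDownloadError(Exception):
    """Local stand-in for the documented exception."""


class GBIFDownloadTimeout(Exception):
    """Local stand-in for the documented exception."""


@dataclass
class GBIFDownload:
    # Field names are assumptions; the real dataclass may carry other metadata.
    download_id: str
    url: str


def wait_for_download(
    download_id: str,
    fetch_status: Callable[[str], dict],  # hypothetical: returns the status JSON as a dict
    attempts: int = 60,                   # once a minute for an hour
    interval: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> GBIFDownload:
    """Poll until the download succeeds, fails, or the attempts run out."""
    for _ in range(attempts):
        info = fetch_status(download_id)
        status = info.get("status")
        if status == "SUCCEEDED":
            # "downloadLink" is an assumed key name for the file URL.
            return GBIFDownload(download_id, info["downloadLink"])
        if status in {"FAILED", "KILLED"}:
            # Fail fast instead of waiting out the remaining attempts.
            raise GBIFDownloadError(
                f"download {download_id} ended with status {status}"
            )
        sleep(interval)
    raise GBIFDownloadTimeout(
        f"download {download_id} was not ready after {attempts} attempts"
    )
```

Passing `fetch_status` and `sleep` in as parameters keeps the loop free of network and timing dependencies, so it can be exercised with canned responses.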
Source code in dataimporter/ext/gbif.py
request_download(gbif_username, gbif_password)
Request a download of the NHM's specimen dataset from GBIF. To request a download we need to be authenticated with GBIF, hence the username and password parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| gbif_username | str | GBIF account username | required |
| gbif_password | str | GBIF account password | required |
Returns:
| Type | Description |
|---|---|
| str | the GBIF download ID |
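
As a rough illustration of what an authenticated download request might look like, the sketch below builds a POST request with HTTP Basic credentials using only the standard library. The endpoint URL, the payload shape (`creator`, `format`, `predicate`), the `dataset_key` parameter, and both helper names are assumptions about the GBIF occurrence download API and are not taken from dataimporter/ext/gbif.py. Check the GBIF API documentation before relying on any of them.

```python
import base64
import json
import urllib.request

# Assumed endpoint for requesting occurrence downloads.
GBIF_REQUEST_URL = "https://api.gbif.org/v1/occurrence/download/request"


def build_download_request(
    gbif_username: str,
    gbif_password: str,
    dataset_key: str,  # hypothetical: the key of the dataset to download
) -> urllib.request.Request:
    """Build (but do not send) an authenticated download request."""
    # Assumed payload shape: ask for every record in a single dataset.
    payload = {
        "creator": gbif_username,
        "format": "SIMPLE_CSV",
        "predicate": {
            "type": "equals",
            "key": "DATASET_KEY",
            "value": dataset_key,
        },
    }
    credentials = f"{gbif_username}:{gbif_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        GBIF_REQUEST_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def send_download_request(request: urllib.request.Request) -> str:
    """Send the request and return the response body.

    Assumes the body is the download ID as plain text.
    """
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8").strip()
```

Splitting the request construction from the network call means the payload and headers can be inspected without contacting GBIF.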
Source code in dataimporter/ext/gbif.py