Viewing entries tagged with curl. Return to full view.

FDWs, curl and LIMIT

I'm sure everybody is aware you can have PostgreSQL fetch data live across the internet (or locally of course) as part of your queries and use the data directly. In fact there are a large number of drivers available to access different kinds of data for download. But in the simplest case, we can also just use the file_fdw wrapper that's included in the standard PostgreSQL packages, together with everybody's http-swiss-army-knife, curl.

In attempting to adapt this post for the public, what more time-relevant dataset to work off in these pandemic-affected times than the open data provided by the ECDC, being both open data and very current. In particular for this example, they provide public datasets with COVID numbers from across the world (actual public data, requiring no registration to read, and actual data, not just a limited API).

So, let's see how we can access this data from PostgreSQL:

CREATE EXTENSION file_fdw;

CREATE SERVER curly FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE _rawdata (
 daterep text not null,
 day int not null,
 month int not null,
 year int not null,
 cases int not null,
 deaths int not null,
 countries text not null,
 geoid text not null,
 countrycode text null,
 popdata int null,
 continent text not null,
 cumulative14days float null
)
SERVER curly
OPTIONS (
 PROGRAM 'curl -s https://opendata.ecdc.europa.eu/covid19/casedistribution/csv/',
 FORMAT 'csv',
 HEADER 'on'
);

And to use this we can simply query the foreign table, which will then trigger a http GET of the data:

covid=# SELECT count(*) FROM _rawdata;
 count 
-------
 41837
(1 row)

However, there is an unfortunate interaction with LIMIT. So if we for example try to get just the first 10 rows:

covid=# SELECT * FROM _rawdata LIMIT 10;
ERROR:  program "curl -s https://opendata.ecdc.europa.eu/covid19/casedistribution/csv/" failed
DETAIL:  child process exited with exit code 23

Continue reading

Conferences

I speak at and organize conferences around Open Source in general and PostgreSQL in particular.

Upcoming

PGConf.EU 2023
Dec 12-15, 2023
Prague, Czechia

Past

PGCon 2023
May 30-Jun 2, 2023
Ottawa, Canada
PGDay Chicago 2023
Apr 20, 2023
Chicago, USA
PGDay/MED 2023
Apr 13, 2023
St Julian's, Malta
pgday.Paris 2023
Mar 23, 2023
Paris, France
Nordic PGDay 2023
Mar 21, 2023
Stockholm, Sweden
More past conferences