Official documentation is here: docs.data.world/documentation/api/
The last week has been a real whirlwind. Early in the week, our cofounder Jon Loyens was interviewed on the Partially Derivative podcast. Then on Friday, we were featured on Product Hunt. What a week!
We’ve had a few questions this week around APIs: what does our data catalog support today, and what are our plans for the future?
I want to give a quick update on where we are today. At data.world, we believe that a big part of what makes data open is making it available when and where you need it. To this end, we will be supporting a large variety of integrations—the only question is one of prioritization.
Java / JVM / JDBC-enabled tools
A number of data analytics and visualization tools on the market today have some level of support for JDBC. It was therefore natural for us to implement a data.world JDBC driver sooner rather than later. Our driver has full support for SQL and SPARQL. Check out the docs for more information on how to integrate.
REST-ful Query Interface
If you find yourself on the command line and want to pull in some data, the easiest way is to hit our query endpoint. Perfect your query in the app, then bring it to the command line to integrate with your external process.
curl 'https://query.data.world/sql/<user>/<dataset>' -H 'Authorization: Bearer <api-token>' -H 'Accept: application/sparql-results+json' --data 'query=SELECT%20*%20FROM%20tablename'
There’s an equivalent endpoint for querying with SPARQL.
I say command line, but really the only limit is your imagination. This is a REST-ful endpoint which could easily be integrated into any language.
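To make that concrete, here is a minimal Python sketch of the same request as the curl example, using only the standard library. The function names `build_sql_request` and `run_sql` are invented for this post, and the URL, headers, and `query` form field are simply copied from the curl command above, so treat it as a starting point under those assumptions rather than an official client.

```python
import urllib.parse
import urllib.request


def build_sql_request(user, dataset, api_token, query):
    """Build (but do not send) a POST request mirroring the curl example.

    Hypothetical helper: the endpoint and headers are taken from the curl
    command in this post, not from a published client library.
    """
    url = "https://query.data.world/sql/{}/{}".format(
        urllib.parse.quote(user, safe=""),
        urllib.parse.quote(dataset, safe=""),
    )
    # Form-encode the query; the curl example did this escaping by hand.
    body = urllib.parse.urlencode({"query": query}).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": "Bearer " + api_token,
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def run_sql(user, dataset, api_token, query):
    """Send the request and return the raw response body as text."""
    request = build_sql_request(user, dataset, api_token, query)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `run_sql("<user>", "<dataset>", "<api-token>", "SELECT * FROM tablename")` with your own values should behave like the curl command. Keeping request construction separate from sending makes it easy to inspect the URL and headers before anything goes over the network.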
<api-token> is in your advanced settings.
Python/Pandas and R have become increasingly popular environments for doing data science. In an effort to empower these users, we’ve created the ability to quickly pull query results directly into a local data frame. This is exposed through the “Export” option on the query tool.
Simply paste it into your Python/Pandas or R environment and load up a data frame ready for additional processing!
import pandas
df = pandas.read_csv('https://download.data.world/sql_query_result_download/producthunt/product-hunt-research?filename=product-hunt-research-QueryResult&mimetype=text%2Fcsv&query=--%20NOTE%3A%20from%20a%20sample%20of%205000%20records%20in%20the%20full%20dataset%0ASELECT%20name%2C%20time_of_day%2C%20PostsForExploration.date%0AFROM%20PostsForExploration%20&auth=<USER-FILE-SPECIFIC-TOKEN>')
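That export URL is hard to read because the SQL is percent-encoded inside it. If you want to check what a pasted URL will actually run, a small sketch like the one below decodes its parameters with the standard library. The helper name `export_params` is made up for illustration, and the parameter names are just the ones visible in the example URL above.

```python
from urllib.parse import parse_qs, urlsplit

# The example export URL from this post, split across lines for readability.
EXPORT_URL = (
    "https://download.data.world/sql_query_result_download/"
    "producthunt/product-hunt-research"
    "?filename=product-hunt-research-QueryResult"
    "&mimetype=text%2Fcsv"
    "&query=--%20NOTE%3A%20from%20a%20sample%20of%205000%20records"
    "%20in%20the%20full%20dataset%0ASELECT%20name%2C%20time_of_day"
    "%2C%20PostsForExploration.date%0AFROM%20PostsForExploration%20"
    "&auth=<USER-FILE-SPECIFIC-TOKEN>"
)


def export_params(url):
    """Return the decoded query-string parameters of an export URL.

    Hypothetical helper: keeps the first value for each parameter name.
    """
    params = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in params.items()}


if __name__ == "__main__":
    # Print the SQL embedded in the URL, with its line breaks restored.
    print(export_params(EXPORT_URL)["query"])
```

Running it prints the embedded SQL as written in the query tool, which is a quick sanity check before handing the URL to `read_csv`.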
In the coming weeks, you’ll also find this capability for copying files directly on the dataset (no querying necessary).
That’s all for now, but stay tuned for more in the coming months! This is really just the start. 😁