Category Archives: Blog

csvtool manual page

Reading time: 4 minutes

I just discovered csvtool, a command-line utility available in the default Xubuntu package repository, while I was looking for other utilities (namely csvsort, csvstat, csvcut and co.). I wanted to shout for joy, but I couldn’t find the manual online: it only shows up when running csvtool --help, and it isn’t on the Ubuntu website.


From datasets to linked datasets in open government data

Reading time: 11 minutes

Helping machines understand what the data means, so that they can help us search and cross-reference the data published by our governments.

Initially published on Medium on September 14th, 2014. I’ve replaced some links with more accessible sources (i.e. not hardcore standard specifications) and I’ve also rephrased the last sections to make them more accessible.


Summary (tl;dr): Publishing open government data dissipates the mist around the business of the State and its tentacles, and enriches the dialogue between the administration and the citizens. Today, however, this data is usually not described with machine-readable semantics, which makes it hard to search at large scale and to create value by cross-referencing datasets.

Semantic Web technologies come to the rescue: they enable worldwide identifiers for concepts (URIs), definitions with machine-readable semantics, and data storage and querying as a graph. The result is a standard, semantics-driven open data platform.


cURL examples to query Wikidata

Reading time: 1 minute

The SPARQL endpoint is http://wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql and it has a Web form to fire queries.

The repository doesn’t have named graphs, or at least the SPARQL endpoint rejects graph queries. The classes of entities (rdf:type) are not described in the repository. However, the property http://www.wikidata.org/prop/direct/P31 ("instance of") tells you what the entity is.
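As a rough sketch of what such a cURL call could look like: the snippet below asks the endpoint named above for the "instance of" (P31) values of one entity. The entity URI (Q42), the LIMIT and the Accept header are my own illustrative assumptions, not taken from the original post, and the beta endpoint may no longer be reachable, so treat this as a template rather than a verified request.

```shell
# SPARQL endpoint quoted in the post (a beta host; it may have moved since).
ENDPOINT='http://wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql'

# Illustrative query: which classes is the entity an "instance of" (P31)?
# Q42 is an assumed example entity, chosen only for the sake of the sketch.
QUERY='SELECT ?class WHERE { <http://www.wikidata.org/entity/Q42> <http://www.wikidata.org/prop/direct/P31> ?class } LIMIT 10'

# Send the query as a URL-encoded GET parameter and ask for JSON results.
run_query() {
  curl -sS -G "$ENDPOINT" \
    -H 'Accept: application/sparql-results+json' \
    --data-urlencode "query=$QUERY"
}

# Uncomment to actually fire the request:
# run_query
```

Using -G together with --data-urlencode lets cURL handle the escaping of the query string, which avoids hand-encoding braces and angle brackets in the URL.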
