Yann has written a blog post on why it is good for transportation companies to share their data (French). Go read it. He has forgotten brand and commercial reasons.
When I was looking at the Canadian open data landscape, I wondered about open data in Montréal. Since then, Montréal Ouvert has launched with a clear mission:
We are a citizen initiative that promotes open access to civic information for the region of Montreal.
We believe open access to civic information and data increases civic engagement, makes services more accessible, and creates opportunities for innovation.
Through this space we hope to initiate and sustain a productive dialogue on open access between stakeholders for the benefit of all Montréalers. Join the conversation!
Disclaimer: Most of the information here is based on reverse engineering the HTML source code of these sites, so it should not be considered stable. On top of that, none of it is documented.
STM
The STM (Montréal’s bus company) has gone a long way toward making its data available in Google Maps. That is quite cool: users can find their route directly in Google Maps. It would be even better if the STM published its data in a structured format (XML, JSON, or RDF).
So far, there are HTML pages listing the bus stops for a line (e.g., line 69):
http://www.stm.info/bus/geomet/geo69.htm
Each bus stop has a unique id, such as 50150, which gives access to the schedule for the line at that stop. Unfortunately, the schedule is served as preformatted text, which makes it pretty useless even for HTML scraping.
http://www2.stm.info/taz/horaire.php?l=69&d=E&t=50150
The use of parameters shows that the page is very likely generated from a template, so offering a structured format is probably just a question of creating the right template. That would be easy to fix.
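As a rough sketch, here is how the schedule URI above could be rebuilt from its parameters in Python. The meaning of the parameters is my guess from the URI, not something STM documents: `l` looks like the line number, `d` like a direction, and `t` like the stop id. The function name is mine.

```python
from urllib.parse import urlencode

# Base URI observed in the STM pages (undocumented, may change).
BASE_URL = "http://www2.stm.info/taz/horaire.php"


def schedule_url(line, direction, stop_id):
    """Build the schedule URI for one line at one stop.

    Assumed parameter meanings: l = line, d = direction, t = stop id.
    """
    query = urlencode({"l": line, "d": direction, "t": stop_id})
    return f"{BASE_URL}?{query}"


if __name__ == "__main__":
    print(schedule_url(69, "E", 50150))
```

Called with 69, "E" and 50150, this gives back the URI shown above.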
AMT
AMT (local commuter trains) gives the list of schedules in a large HTML table. The values of the form options are not straightforward and use obscure ids. I’m also still wondering why the form uses an HTTP POST instead of a GET, but this blog post is not about HTTP. ;) We get a URI of this type for the default schedule:
http://www.amt.qc.ca/train/blainville-st-jerome/horaires.aspx
These data will be more difficult to parse from the HTML, but it is still possible.
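To give an idea of what such a scraper involves, here is a minimal sketch in Python that pulls the rows out of an HTML table using only the standard library. The sample table is invented for illustration (station names and times are placeholders); the real AMT markup is certainly messier and would need its own adjustments.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Invented sample, NOT real AMT data.
SAMPLE = """
<table>
  <tr><th>Station</th><th>Train 1</th><th>Train 2</th></tr>
  <tr><td>Station A</td><td>06:00</td><td>07:00</td></tr>
  <tr><td>Station B</td><td>06:12</td><td>07:12</td></tr>
</table>
"""

rows = parse_table(SAMPLE)
```

Each entry of `rows` is a list of cell texts, with the header row first.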
Bixi
Bixi is publishing a map of their dock stations. Let’s open Firebug and see what is happening on the wire. We can discover the following URI.
https://profil.bixi.ca/data/bikeStations.xml
The stations are described following this pattern:
<stations>
<station>
<id>1</id>
<name>Notre Dame / Place Jacques Cartier</name>
<terminalName>6001</terminalName>
<lat>45.508183</lat>
<long>-73.554094</long>
<installed>true</installed>
<locked>false</locked>
<installDate>1276012920000</installDate>
<removalDate/>
<temporary>false</temporary>
<nbBikes>1</nbBikes>
<nbEmptyDocks>30</nbEmptyDocks>
</station>
…
</stations>
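Because this is plain XML, reading it takes a few lines. Here is a small sketch in Python using the station shown above as inline sample data. The function name and the keys of the resulting dictionaries are my own choice, and I am assuming every station carries the same child elements as this one.

```python
import xml.etree.ElementTree as ET

# The single station from the excerpt above, used as sample input.
SAMPLE = """<stations>
<station>
<id>1</id>
<name>Notre Dame / Place Jacques Cartier</name>
<terminalName>6001</terminalName>
<lat>45.508183</lat>
<long>-73.554094</long>
<installed>true</installed>
<locked>false</locked>
<installDate>1276012920000</installDate>
<removalDate/>
<temporary>false</temporary>
<nbBikes>1</nbBikes>
<nbEmptyDocks>30</nbEmptyDocks>
</station>
</stations>"""


def parse_stations(xml_text):
    """Turn the <stations> document into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    stations = []
    for node in root.findall("station"):
        stations.append({
            "id": int(node.findtext("id")),
            "name": node.findtext("name"),
            "lat": float(node.findtext("lat")),
            "long": float(node.findtext("long")),
            "installed": node.findtext("installed") == "true",
            "bikes": int(node.findtext("nbBikes")),
            "empty_docks": int(node.findtext("nbEmptyDocks")),
        })
    return stations


stations = parse_stations(SAMPLE)
```

The same function should work on the full file once it has been downloaded, as long as the format stays the same.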
Via Rail
Via Rail, the Canadian railway company,
Station Code List
Looking at the JavaScript, it is quite easy to find the list of train stations in JSON format. There is no global list, but if you query, for example, with ‘M’:
http://reservia.viarail.ca/GetStations.aspx?q=M
Note: Be careful, the returned MIME type is incorrect: Content-Type: text/html; charset=iso-8859-1
You get something like this:
[{"sc":"MLCH","sn":"MALACHI","pv":"ON","dEn":"MALACHI"},
{"sc":"MLHT","sn":"MALAHAT","pv":"BC","dEn":"MALAHAT"},
{"sc":"MWWC","sn":"MANIWAWA CLUB","pv":"QC","dEn":"MANIWAWA CLUB"},
…
{"sc":"MTRL","sn":"MONTRÉAL","pv":"QC","dEn":"MONTRÉAL" },
…
]
We can guess that
- sc = station code
- sn = station name
- pv = province
- dEn = I suspect the English spelling of a city, but I’m not sure.
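Here is a sketch in Python of how this response could be read, using two records from the excerpt above as sample data. Since the server labels the response as iso-8859-1, I decode the bytes with that charset before parsing. The long field names are my guesses from the list above, and `english_name` in particular is only a suspicion.

```python
import json

# Guessed meanings of the short keys (not documented by Via Rail).
FIELD_NAMES = {
    "sc": "station_code",
    "sn": "station_name",
    "pv": "province",
    "dEn": "english_name",  # a guess, see the list above
}

# Two records from the excerpt, encoded the way the server declares them.
SAMPLE_BODY = (
    '[{"sc":"MLCH","sn":"MALACHI","pv":"ON","dEn":"MALACHI"},'
    '{"sc":"MTRL","sn":"MONTRÉAL","pv":"QC","dEn":"MONTRÉAL" }]'
).encode("iso-8859-1")


def parse_station_list(body, charset="iso-8859-1"):
    """Decode the raw response bytes and rename the short keys."""
    records = json.loads(body.decode(charset))
    return [
        {FIELD_NAMES.get(key, key): value for key, value in record.items()}
        for record in records
    ]


station_list = parse_station_list(SAMPLE_BODY)
```

Decoding explicitly avoids trusting the text/html label and keeps accented names such as MONTRÉAL intact.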
Information about the stations
When searching for a train schedule, we get a list of possible departures. Each train station name links to a URI which, for Montréal, looks like this:
http://www.viarail.ca/cgi-bin/genericXSLT?xml=MTRL.xml&xsl=en_station.xsl
We can recognize the same station code in MTRL.xml, and en_station.xsl suggests an XSLT stylesheet that transforms the XML data. Unfortunately, it is not possible to access the XML data directly. It would still be possible to write an HTML scraper to extract the data without too many hurdles, but it would definitely be nicer to get the data in a structured format.
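The link between station codes and station pages can be sketched in Python like this. I am assuming that every station code follows the same pattern as MTRL, which I have not verified, and the function names are mine.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint observed for the Montréal station page (undocumented).
STATION_PAGE = "http://www.viarail.ca/cgi-bin/genericXSLT"


def station_page_url(station_code):
    """Build the station page URI, assuming the MTRL pattern holds for all codes."""
    query = urlencode({"xml": f"{station_code}.xml", "xsl": "en_station.xsl"})
    return f"{STATION_PAGE}?{query}"


def station_code_from_url(url):
    """Recover the station code from the xml= parameter of a station page URI."""
    xml_name = parse_qs(urlsplit(url).query)["xml"][0]
    if xml_name.endswith(".xml"):
        return xml_name[: -len(".xml")]
    return xml_name


montreal_url = station_page_url("MTRL")
```

Combined with the station list, this would give the page to scrape for each station.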
Schedules
The schedules are accessible for each train line in an HTML table and would also benefit from being available in a documented, structured format.
http://www.viarail.ca/fr/trains/quebec-et-ontario/montreal-toronto/horaires
Conclusion
Except for Bixi, little of Montréal’s transportation data is available in an easy-to-use form. We would have to rely on HTML scrapers to extract it all.
What would be the benefits for all these organizations of exposing their data with clear documentation and a license?
- Create a richer online data ecosystem
- Make it possible for the brand to live outside of its own domain name property (Creative Commons offers the BY license, i.e. Attribution)
- Enable richer visualizations of these data.
- Give people the opportunity to develop applications, commercial or not, that help or drive users to order tickets or use the service more.
Budgets in these organizations are often limited. Letting geeks and other businesses build on top of these data increases the chances of revenue for these organizations. It maximizes the online surface of these services.
No doubt Montréal Ouvert will succeed in this endeavor.
4 opinions
There are also some companies directly sharing their Google Transit Data Feeds. It’s painful because parsing has to be done, again.
In Bordeaux, France, we’ve published on-demand bike availability by combining RSS, GeoRSS and a namespace extension: http://www.vcub.fr/stations/feed/rss
Everything is then parsable at a high level (even on Google Maps, in 2 seconds), or at a lower level (to build a custom service for example).
Maybe what is lacking is a common format to improve sharing of public transportation data.
What do you think of that?
[...] This post was mentioned on Twitter by karl dubost, Nik Garkusha and Patrick M. Lozeau, PlateformeLive. PlateformeLive said: RT @karlpro: Montreal Transportation Open Data http://lab.pheromone.ca/2010/08/09/montreal-transportation-open-data/ [...]
@Tom Indeed. A common format makes it a lot easier for people to publish their data. At least, it lowers the barrier.
Google has published the General Transit Feed Specification. I do not much like comma-delimited text for structured data, but why not. A quick search didn’t return a lot of parsers for these data, which is another issue: not a healthy ecosystem.
There are many APIs related to travel and transportation. I’m pretty sure it would not be that hard to find out what is essential.
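That said, reading the comma-delimited files of the General Transit Feed Specification does not require much. Here is a minimal sketch in Python for `stops.txt`, using the standard `csv` module. The column names come from the specification; the two stops are invented placeholders, not real data.

```python
import csv
import io

# Invented sample in the stops.txt layout (stop_id, stop_name, stop_lat, stop_lon).
SAMPLE_STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Example Stop One,45.5000,-73.5500
S2,"Example Stop, Two",45.5100,-73.5600
"""


def read_stops(text):
    """Parse the contents of a GTFS stops.txt file into dictionaries."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "id": row["stop_id"],
            "name": row["stop_name"],
            "lat": float(row["stop_lat"]),
            "lon": float(row["stop_lon"]),
        }
        for row in reader
    ]


stops = read_stops(SAMPLE_STOPS_TXT)
```

The `csv` module handles quoted fields, so a comma inside a stop name is not a problem.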
Connecting dots.
STM has released [Open data](http://www.stm.info/en-bref/developpeurs.htm)