Adding public transport data to Transitous

Original link: https://www.volkerkrause.eu/2025/06/14/transitous-adding-data.html

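Transitous is a community-run public transport routing service that backs applications such as GNOME Maps, and its accuracy depends on a wide range of datasets. You can contribute by comparing Transitous data with actual timetables and identifying discrepancies. Key data sources include GTFS feeds (for schedules), GTFS Realtime feeds (for delay and disruption updates) and GBFS feeds (for shared mobility options). Adding a GTFS feed to Transitous usually takes only a few lines of JSON pointing to the feed. The Transitous map shows coverage, and missing areas or outdated information can be improved by finding and adding relevant feeds from operator websites, open data portals (National Access Points in the EU) and registries such as Mobility Database. OpenStreetMap data, especially floor level information in stations, also needs attention. Future improvements include making use of currently unused fare information, extending GTFS to cover car ferries, converting other data formats such as NeTEx to GTFS, and generating trip updates from vehicle position data. Join the Transitous Matrix channel, the Transitous Hack Weekend and the Open Transport Community Conference to contribute!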

Hacker News users are discussing Transitous, a project focused on public transport data. The original post highlights the addition of public transport data to the service. A commenter ("butz") points out the common frustration with GTFS data providers who silently change URLs or stop updating data, leading to user dissatisfaction. Another commenter ("danielhep"), who works on OpenTripPlanner, acknowledges Transitous as a "competing" open-source routing engine and notes its unique features, despite operating at a different scale. The discussion showcases the challenges and interest in open-source public transit routing solutions.

Original text

I had mentioned a number of new Transitous features in a previous post. As those largely depend on the corresponding data being available, here’s an overview of how you can help to find, add and improve that data.

Transitous

Transitous is a community-run public transport routing service built on top of the MOTIS routing engine and thousands of datasets from all over the world. Transitous backs public transport related features in applications like GNOME Maps, KDE Itinerary or Träwelling.

Just like OpenStreetMap, this needs people on the ground identifying issues or gaps in the data, figuring out where things go wrong and who to talk to at the local operators to get things fixed.

The first step to help is just comparing data you get from Transitous with the reality around you, i.e. does the public transport schedule match what’s actually happening, and are all relevant services included?

If things are missing or outdated, the sections below list the types of datasets consumed by Transitous and how to inspect and add them.

The central piece is a set of JSON files in the Transitous Git repository, which define all the datasets to be used as well as a few parameters and metadata for those. Once a day those datasets are retrieved, validated, filtered and post-processed by Transitous’ import pipeline for importing into MOTIS.

Public transport schedules

The backbone of public transport routing is static GTFS schedule data; that’s the bare minimum for Transitous to work in a region. GTFS feeds are essentially zip files containing a set of CSV tables, which makes them relatively easy to inspect, although nationwide aggregated feeds in particular can get rather large.
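Since a feed is just a zip of CSV tables, a first look needs nothing beyond the Python standard library. The following is a minimal sketch, not part of Transitous; the function name `summarize_gtfs` is made up for illustration, and it assumes the tables sit in the archive as UTF-8 `.txt` files, as the GTFS specification describes.

```python
import csv
import io
import zipfile


def summarize_gtfs(zip_source):
    """Return {table_name: (column_names, row_count)} for every .txt table in a GTFS zip.

    zip_source may be a file path or a binary file-like object.
    """
    summary = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in sorted(archive.namelist()):
            if not name.endswith(".txt"):
                continue
            with archive.open(name) as raw:
                # utf-8-sig drops the byte order mark some feeds put before the header
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                reader = csv.reader(text)
                header = next(reader, [])
                row_count = sum(1 for _ in reader)
            summary[name] = (header, row_count)
    return summary
```

Calling `summarize_gtfs("feed.zip")` on a downloaded feed and printing the result gives a quick overview of which tables exist and how large they are, which is often enough to spot an empty or truncated feed.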

GTFS feeds ideally contain data for several months into the future, but can nevertheless receive regular updates. Transitous checks for updates daily, so for this to work practically we also need a stable URL for them (that might seem obvious to you, but apparently not to all feed providers…).
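How far a feed reaches into the future can be estimated from its calendar tables. This is a rough sketch under the assumption that the feed uses `calendar.txt` (`end_date`) and/or `calendar_dates.txt` (`date`, `exception_type`) as defined in the GTFS Schedule specification; the helper names are hypothetical and this is not how Transitous itself checks feeds.

```python
import csv
import io
import zipfile
from datetime import date, datetime


def _rows(archive, table):
    """Yield the rows of one GTFS table as dicts, or nothing if the table is absent."""
    if table not in archive.namelist():
        return
    with archive.open(table) as raw:
        text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
        yield from csv.DictReader(text)


def last_service_day(zip_source):
    """Return the latest date on which the feed defines service, or None."""
    days = []
    with zipfile.ZipFile(zip_source) as archive:
        # calendar.txt: regular service periods end on end_date (YYYYMMDD)
        days += [row["end_date"] for row in _rows(archive, "calendar.txt")]
        # calendar_dates.txt: exception_type 1 adds service on a date, 2 removes it
        days += [row["date"] for row in _rows(archive, "calendar_dates.txt")
                 if row["exception_type"].strip() == "1"]
    parsed = [datetime.strptime(d.strip(), "%Y%m%d").date() for d in days if d.strip()]
    return max(parsed, default=None)


def days_of_coverage(zip_source, today=None):
    """Days from `today` until the last service day; negative if the feed has expired."""
    latest = last_service_day(zip_source)
    if latest is None:
        return None
    return (latest - (today or date.today())).days
```

A small or negative result from `days_of_coverage` suggests the feed is about to expire or has already done so, which is worth raising with the provider.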

We currently have more than 1800 of those, from 55 countries. The Transitous map view gives you an impression of how well an area is covered: each of the colored markers there is the (estimated) current position of a public transport vehicle.

Map view with train and ferry positions densely covering all of South Korea.
Recently added coverage in South Korea.

If your area is incomplete or not covered at all, the hardest part of changing that is probably finding the corresponding GTFS feeds. There are a few places worth looking at:

  • The public transport operators themselves, who might just publish data on their website.
  • Regional or national open data portals, especially in countries with regulation requiring public transport data to be published. In the EU, those are called “National Access Point” (NAP).
  • GTFS feed registries such as Mobility Database and Transitland.
  • Google Maps having public transport data in your region is a strong indicator that GTFS feeds exist, as Google uses those as well.

Adding a GTFS feed to Transitous is then usually just a matter of a few lines of JSON pointing to the feed. In rare cases it might require a bit more automation work, such as in France, where there are hundreds of small feeds to manage.

And every feed is welcome, no matter whether it’s a nation-wide railway operator or a single community-run bus to help people in a rural area, as long as it’s for a service open to the general public.

Realtime data

So far this is all static though. For properly dealing with delays, disruptions and all kinds of other unplanned and short-notice service changes we also need GTFS Realtime (RT) feeds. Those are polled once a minute for updates.

GTFS-RT feeds come in three different flavors:

  • Trip updates, that is realtime schedule changes like delays, cancellations, etc.
  • Service alerts, that is textual descriptions of disruptions beyond a specific connection, such as upcoming construction work.
  • Vehicle positions, that is geographic coordinates of the current position of trains or buses.

MOTIS can handle the first two so far. Support for vehicle positions is also on the wishlist, and not just for showing current positions on a map: vehicle positions could also be used to interpolate trip updates when those are not available.

Adding GTFS-RT feeds to Transitous is very similar to adding static GTFS feeds. However, GTFS-RT feeds usually only work in combination with their respective static equivalent, since the realtime data refers to trips and stops defined in the static feed. Combining a smaller realtime feed of a single operator with a nationwide aggregated static feed will thus usually not work out of the box. There are ways to exclude certain operators from a larger static feed though, so with a bit of puzzle work this can usually be made to work as well.

GTFS-RT feeds use Protocol Buffers, but there’s nevertheless a simple way to look at their content:

curl https://the.feed.url | protoc gtfs-realtime.proto --decode=transit_realtime.FeedMessage | less

The Protocol Buffers schema file needed for this can be downloaded here.
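The decoded text output of that command can also help answer whether a realtime feed belongs to a given static feed. The sketch below is one possible check, not something Transitous provides: it pulls `trip_id` values out of the decoded text with a regular expression and compares them with `trips.txt` from the static feed. All function names are invented for illustration, and it assumes the realtime feed identifies trips by `trip_id` at all, which some feeds do not.

```python
import csv
import io
import re
import zipfile


def realtime_trip_ids(decoded_text):
    """Collect trip_id values from the text that `protoc --decode` prints."""
    return set(re.findall(r'trip_id:\s*"([^"]*)"', decoded_text))


def static_trip_ids(zip_source):
    """Collect all trip_id values from trips.txt of a static GTFS zip."""
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open("trips.txt") as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return {row["trip_id"] for row in csv.DictReader(text)}


def match_ratio(rt_ids, static_ids):
    """Share of realtime trip IDs that also exist in the static feed (0.0 to 1.0)."""
    rt_ids = set(rt_ids)
    if not rt_ids:
        return 0.0
    return len(rt_ids & set(static_ids)) / len(rt_ids)
```

A ratio close to 1 suggests the two feeds fit together. A ratio near 0 usually means the realtime feed was generated against a different static feed, or that the static feed is outdated.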

To see the realtime coverage available in Transitous, you can toggle the color coding of vehicles on its map view in the upper right corner. A green/yellow/red gradient shows the amount of delay for the corresponding trip, while gray vehicles have no realtime information.

Map view with color-coded vehicle positions indicating delays in Amsterdam, Netherlands.
Color-coded realtime data in Amsterdam, Netherlands.

Shared mobility data

Transitous doesn’t just handle scheduled public transport though, but also vehicle sharing, which can be particularly interesting for the first and last mile of a trip.

The data for this is provided by GBFS feeds. This includes information about the type of vehicles (bikes, cargo bikes, kick scooters, mopeds, cars, etc.) and their method of propulsion (human powered, electric, etc.), where to pick them up and where to return them (same location as pickup, designated docks of the provider, free floating within a specific area, etc.) and most importantly where vehicles are currently available.

Adding GBFS feeds to Transitous is also just a matter of a few lines of JSON. We currently don’t have a built-in UI to see the results, although showing all available vehicles on the map is certainly on the wishlist. GBFS is relatively easy to inspect manually: the entry point is a small JSON manifest that contains links to JSON files with the actual information, generally split up by how often certain aspects are expected to change.

Same as for GTFS feeds, any service accessible to the general public is welcome here, whether it’s a small community-run OpenBike instance or a provider with hundreds of vehicles.
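Walking the manifest can be sketched in a few lines of Python. This assumes the layout from the GBFS specification as I understand it: versions 1.x and 2.x nest the feed list under a language code inside `data`, while 3.x puts `feeds` directly under `data`. The function names are hypothetical, and the manifest URL is whatever the provider publishes.

```python
import json
import urllib.request


def fetch_json(url, timeout=30):
    """Download and parse a JSON document such as a gbfs.json manifest."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def list_gbfs_feeds(manifest):
    """Return {feed_name: url} from a parsed GBFS manifest."""
    data = manifest.get("data", {})
    if "feeds" in data:
        # GBFS 3.x: the feed list sits directly under "data"
        feed_lists = [data["feeds"]]
    else:
        # GBFS 1.x/2.x: one feed list per language code
        feed_lists = [entry.get("feeds", []) for entry in data.values()
                      if isinstance(entry, dict)]
    feeds = {}
    for feed_list in feed_lists:
        for feed in feed_list:
            # keep the first URL seen for each feed name
            feeds.setdefault(feed["name"], feed["url"])
    return feeds
```

With the manifest loaded via `fetch_json`, the entries of `list_gbfs_feeds(manifest)` can be fetched the same way, for example the vehicle or station status files that change most often.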

On-demand services

Somewhere between scheduled transport and shared mobility are on-demand services. That is, services that require some form of booking beforehand and might be anything from an on-demand bus that still follows a somewhat fixed route with pre-defined stops to something closer to a taxi with a more flexible route that picks up or drops off passengers anywhere in a given area.

These services are often used at times and/or in areas with lower demand, which often makes them the only mobility option then and there. That makes it all the more important to have those covered as well.

Modeling on-demand services is challenging, given the variety in how those services work and their inherently very dynamic nature. There’s the relatively new GTFS-Flex standard covering this, which MOTIS has supported since v2.0.66.

GTFS-Flex feeds might be included in static GTFS data or provided separately, and adding them to Transitous again takes just a few lines of JSON.

There’s one caveat though: the validator we use in pre-processing, gtfsclean, doesn’t support GTFS-Flex yet, so those feeds are currently imported without any sanity checking or validation. Therefore we need to be extra careful with adding such feeds until that is fixed. If you know a bit of Go and want to help with that, get in touch!
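To find out whether a static feed already carries Flex data, one option is to look for the files that GTFS-Flex adds to the archive. The sketch below assumes the file names from the adopted GTFS-Flex extension as I know them (`locations.geojson`, `location_groups.txt`, `location_group_stops.txt`, `booking_rules.txt`); feeds built against older drafts may use other names, and `flex_files_in` is a made-up helper.

```python
import zipfile

# File names introduced by GTFS-Flex (assumed from the adopted specification)
FLEX_FILES = {
    "locations.geojson",
    "location_groups.txt",
    "location_group_stops.txt",
    "booking_rules.txt",
}


def flex_files_in(zip_source):
    """Return the sorted list of GTFS-Flex files found in a GTFS zip."""
    with zipfile.ZipFile(zip_source) as archive:
        # compare base names so feeds zipped inside a folder are still recognised
        present = {name.rsplit("/", 1)[-1] for name in archive.namelist()}
    return sorted(FLEX_FILES & present)
```

An empty list does not rule out Flex entirely, since the extension also adds columns to existing tables, but a non-empty list is a strong hint.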

For GTFS-Flex data there’s some diagnostic visualization in the map view in debug mode, when zooming in far enough.

Map view with colored areas indicating on-demand service areas around Lausanne, Switzerland.
Diagnostic view of on-demand service areas in Switzerland.

OSM

A crucial dataset for all road-based and in-building routing is OpenStreetMap. While that is generally very comprehensive and up-to-date, there’s one aspect that needs fixes more often: the floor level separation. That’s not visible in most OSM-based maps and thus is easy to miss while mapping. For Transitous this is particularly important for in-building routing in train stations.

When zoomed in enough the map view of Transitous will offer you a floor level selector at the lower right. That can give you a first indication if elements are misplaced (showing up on the wrong level) or not assigned to a floor level at all (showing up on all levels). For reviewing smaller elements indoor= can also be useful, and for fixing things JOSM has a built-in level selector on the top left.

Transitous map view with floor level selector on the lower right, showing the ground floor (passenger) level of a station with some railway tracks crossing that should be one level up.
Railway tracks running through the passenger level of Bremen central station (upper right).

In most cases adding or fixing the level tag is all that’s needed. Elements that allow moving between levels (stairs, ramps, elevators, escalators, etc.) are especially important for routing.

And more

All of the above is just the current state. There’s much more to look at, such as:

  • Unused information in the existing datasets, such as fare information or the remaining range of sharing vehicles.
  • Expanding the GTFS standard to cover things currently not modeled. This ranges from relatively simple things like car ferries or car transport trains to highly detailed information about a bus or train interior with regards to accessibility.
  • Converting other data formats to GTFS, such as NeTEx, SIRI or SSIM.
  • Generating GTFS-RT trip updates from realtime vehicle position data.
  • Extending the import pipeline to augment and normalize GTFS feeds, e.g. by injecting line colors and logos from Wikidata or accessibility information and missing paths from OSM, or to normalize train names.
  • Considering elevation data for street routing. MOTIS has initial support for this by now, but even the relatively coarse global 30m SRTM grid data would require an extra 50-100GB of (fast) storage, with quadratic growth for smaller grid sizes (1m or 2m grids are available in a number of regions).
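To put the quadratic growth of elevation data into numbers: halving the grid spacing quadruples the number of cells. The following back-of-the-envelope sketch scales the 50-100GB figure for the 30m grid to finer grids. It assumes the same storage per cell and a purely hypothetical global coverage at the finer resolution, which does not exist; the function name is made up.

```python
def scaled_storage_gb(base_gb, base_grid_m, target_grid_m):
    """Scale a storage estimate by the cell count, which grows with (base/target)^2."""
    return base_gb * (base_grid_m / target_grid_m) ** 2


# Lower and upper estimate for the 30m grid, scaled to 2m and 1m grids
for grid_m in (2, 1):
    low = scaled_storage_gb(50, 30, grid_m)
    high = scaled_storage_gb(100, 30, grid_m)
    print(f"{grid_m}m grid: {low / 1000:.2f} to {high / 1000:.2f} TB")
```

Under these assumptions a 1m grid needs 900 times the storage of the 30m grid, so finer data is only realistic for selected regions.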

In other words, plenty of rabbit holes to explore, no matter whether you are into code, data, math, trains, buses, IT operations or lobby work :)

You can help!

Check around you whether information from Transitous matches the reality on the ground, join the Transitous Matrix channel, join the Transitous Hack Weekend in a few weeks and join the Open Transport Community Conference in October!
