Periodic Reporting for period 1 - PRISM (PRobabilistic PRedictIon for Smart Mobility under stress scenarios)

Summary

With new, smart mobility modes, accurate demand and supply prediction becomes critical. For example, a one-way car-sharing service (e.g. DriveNow) or an autonomous mobility-on-demand service is highly sensitive to rebalancing operations (moving vehicles to where demand is expected). Bad demand predictions can lead to disastrous outcomes by placing supply where it is not needed and removing it from where it is required.

The PRISM approach combines the latest research from Transport Engineering and Machine Learning by using Probabilistic Graphical Models (PGMs) and Deep Neural Networks (DNNs), two complementary tools that involve Bayesian statistics, graph theory, and optimization. As research areas, PGMs and DNNs already have solid foundations, large communities, and mature software tools, and PRISM extended their impact to the field of transportation. The research behind the project has led to 12 journal publications, 2 completed and 4 ongoing PhD students, and other forms of international recognition, including journal editorial boards, keynote speeches, one new book edition, and 5 international PhD co-supervisions.


The Experienced Researcher (ER) returned to Europe in 2015, after several years of research in Singapore and the USA with the Massachusetts Institute of Technology (MIT), and this Marie Skłodowska-Curie fellowship (2017-2019) was instrumental for his growth and recognition in the Danish and European context. In fact, PRISM was the backbone for the establishment and growth of the Machine Learning for Smart Mobility group (http://MLSM.man.dtu.dk).

Work performed

a. Objectives and workplan
The PRISM workplan included two major tracks (picture attached): empirical research and exploration, and advanced learning and studying. In the former, I planned to collect different urban transport datasets and carry out their initial exploration with non-PGM methods. The second half of the first year would conclude with first approaches to a PGM-based version of that work. The second year would be dedicated to refinements and a field test at ITSWorld 2018.

On the advanced learning and studying side, I planned to study Probabilistic Graphical Models (PGMs) in depth. In the second semester, I expected to prepare a course based on PGMs and to start studying new methods for optimization and transport demand modeling.

b. Work developed
During the first six months, together with the team, and building on a collaboration with Google Research and Movia (Denmark’s public transport agency), we collected and analysed different datasets, which reached my team already anonymized and de-identified. We also used a large public dataset of New York City taxis.
Using these datasets, and looking into the distributed nature of traffic (i.e. multi-output models, where each output predicts for a specific area of the network or city), we developed Multi-Output Gaussian Processes for Crowdsourced Traffic Data Imputation (Rodrigues et al, 2019). This model takes advantage of the correlation structure that emerges between the different network links in the city (e.g. neighboring links have correlated speed profiles). The benefit was most notable for filling in missing data (i.e. imputation) across the network, more so than for prediction itself.
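To illustrate how correlation between links can help imputation, here is a minimal sketch under stated assumptions. It is not the model of Rodrigues et al (2019). It uses a generic intrinsic coregionalization construction, in which the covariance between two readings is a link-to-link term multiplied by a time kernel. The function names (`rbf_kernel`, `icm_impute`), the link covariance matrix `B`, the hyperparameter values, and the toy speeds are all hypothetical.

```python
import numpy as np

def rbf_kernel(t1, t2, lengthscale=1.0):
    """Squared-exponential kernel between two vectors of time stamps."""
    d = t1[:, None] - t2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def icm_impute(t, speeds, B, lengthscale=1.0, noise=0.1):
    """Fill NaN entries of `speeds` (links x times) with a GP posterior mean.

    Assumed covariance between (link i, time s) and (link j, time u):
    B[i, j] * k(s, u), i.e. an intrinsic coregionalization model.
    """
    n_links, n_times = speeds.shape
    # Kronecker product matches the row-major flattening of `speeds`.
    K = np.kron(B, rbf_kernel(t, t, lengthscale))
    y = speeds.ravel()
    obs = ~np.isnan(y)
    mean = y[obs].mean()  # constant prior mean (a simplifying assumption)
    K_oo = K[np.ix_(obs, obs)] + noise ** 2 * np.eye(obs.sum())
    alpha = np.linalg.solve(K_oo, y[obs] - mean)
    filled = y.copy()
    filled[~obs] = mean + K[np.ix_(~obs, obs)] @ alpha
    return filled.reshape(n_links, n_times)

# Toy data (hypothetical): two neighboring links share a congestion dip.
t = np.arange(12.0)
truth = np.full((2, 12), 50.0)
truth[:, 5:7] = 20.0            # congestion at t = 5 and t = 6
speeds = truth.copy()
speeds[1, 4:8] = np.nan         # link 1 has no readings around the dip

B_corr = np.array([[1.0, 0.9], [0.9, 1.0]])  # links assumed correlated
B_indep = np.eye(2)                          # links assumed independent

imputed_corr = icm_impute(t, speeds, B_corr)
imputed_indep = icm_impute(t, speeds, B_indep)
print("correlated :", imputed_corr[1, 4:8].round(1))
print("independent:", imputed_indep[1, 4:8].round(1))
```

With the correlated `B`, the missing values on link 1 are pulled toward the congestion observed on link 0. With the independent `B`, the model has no information about the dip and stays close to the free-flow speed. In practice `B` and the kernel hyperparameters would be learned from data rather than fixed by hand.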

During the second semester of PRISM (Oct 2017-Feb 2018), another essential part of the workplan was carried out, namely the design and creation of a new Master's course in Probabilistic Graphical Models. The course is called “Model Based Machine Learning”. It has already run in Spring 2018 and Spring 2019.

The second year was dedicated to the development of more elaborate models. Besides PGMs, my team and I decided to also consider Deep Neural Networks, especially their combination with Bayesian methods and, more generally, their comparison with other methods such as Gaussian Processes. In this period, other works were produced, triggered by earlier developments from PRISM, namely:
- Deep Learning from Crowds (Rodrigues and Pereira, 2018) – presented at one of the most prestigious AI conferences (AAAI). This work falls into the “multiple annotators” category, where we use a large dataset labeled by people with different levels of expertise (e.g. different medical doctors have different skills in evaluating patients). The novelty is in the way we combine the experts' knowledge into a deep neural network.
- Predicting taxi demand hotspots using automated Internet Search Queries (Markou et al, 2019) – returning to the NYC dataset, the goal is to automatically explain demand anomalies (e.g. “why so many people?”) by automatically generating and interpreting search queries through search engine APIs (Google and Bing).
- Multi-output bus travel time prediction with convolutional LSTM neural network (Petersen et al, 2019) – using the Movia dataset of bus location information, the model generates arrival time estimates that are better than those of the current system in Denmark. This was the first task of the new PhD student, under the Movia collaboration.
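The “multiple annotators” setting from the first item above can be sketched as follows. This is a generic confusion-matrix noise model, not necessarily the formulation of Rodrigues and Pereira (2018). Each annotator is assumed to have a row-stochastic matrix describing how they relabel each true class, and the classifier's class probabilities are pushed through it before being compared with that annotator's labels. The function names, the confusion matrices, and the toy numbers are hypothetical.

```python
import numpy as np

def annotator_label_probs(class_probs, confusion):
    """Probability of each label, per annotator and item.

    class_probs: (n_items, n_classes) output of some classifier.
    confusion:   (n_annotators, n_classes, n_classes), where
                 confusion[r, c, k] = P(annotator r says k | true class c).
    Returns an array of shape (n_annotators, n_items, n_classes).
    """
    return np.einsum("ic,rck->rik", class_probs, confusion)

def crowd_nll(class_probs, confusion, labels):
    """Negative log-likelihood of the crowd labels.

    labels: (n_annotators, n_items) integer labels, -1 where an
            annotator did not label the item.
    """
    probs = annotator_label_probs(class_probs, confusion)
    mask = labels >= 0
    r_idx, i_idx = np.nonzero(mask)
    picked = probs[r_idx, i_idx, labels[mask]]
    return -np.log(picked).sum()

# Toy example (hypothetical): an expert and a noisy novice, two items.
class_probs = np.array([[0.9, 0.1],
                        [0.2, 0.8]])
confusion = np.array([
    [[1.0, 0.0], [0.0, 1.0]],   # expert: always reports the true class
    [[0.6, 0.4], [0.4, 0.6]],   # novice: wrong 40% of the time
])
labels = np.array([[0, 1],
                   [1, -1]])    # the novice skipped the second item

probs = annotator_label_probs(class_probs, confusion)
loss = crowd_nll(class_probs, confusion, labels)
print(probs[1, 0], loss)
```

In a deep learning setting, `class_probs` would come from the network and the loss would be minimized jointly over the network weights and the annotator parameters, so that disagreement from less reliable annotators is discounted rather than treated as ground truth.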

I believe we can say that, in the end, the general objectives of PRISM were achieved, in terms of the methodologies, but also in terms of deliverables and “by-products”:
- 3 journal papers + 1 paper at a prestigious conference (with full acknowledgement to PRISM funding)
- 9 journal papers in related topics (without full acknowledgement to PRISM funding)
- 1 Special issue on “Social network analysis in future transportation systems: Contributions on observability, behaviour and structure”, Transport Research P

Final results

The PRISM project introduced several Machine Learning based methodologies into transport research, as mentioned before: the use of Heteroskedastic, Multi-output, and Deep Gaussian Processes; several Deep Learning methods (e.g. mixing textual and taxi demand data, using data from multiple annotators, multi-output LSTM networks); and automatic information retrieval (for explaining anomalies from web pages). We also applied Probabilistic Graphical Models in different settings, including a new course at DTU.

For the period after PRISM, I expect to extend the current set of papers with 4 new publications in 2019. Finally, as planned in PRISM, I also took the opportunity to learn Danish (I am currently at B2 level) and to considerably expand my local network (current collaborators include the Danish Road Directorate, Movia, Copenhagen Municipality, Connected Cars, Donkey Republic and Autonomous Mobility). The next strategic step is to build on PRISM and other recent opportunities to prepare a competitive ERC proposal, as also planned in PRISM.

Website & more info

More info: http://mlsm.man.dtu.dk/research-projects/prism.