Neural networks study images of Amsterdam’s streets

29 March 2021

A partnership of 9 cities and 3 universities has resulted in an initiative that uses information about existing objects to optimise a data set with panoramic images of Amsterdam. This may sound somewhat abstract, but such a data set could facilitate many new AI projects aimed at enhancing the quality of life.

SCORE is a broad initiative covering the North Sea region in the field of AI and Data Science for Smart Cities. Under this initiative, Amsterdam and other participating cities focus on the reuse of open data in order to solve metropolitan problems. Such projects could be about sustainability, logistics or some other area. The universities involved in the research are Bradford, Aarhus and UvA.

The research partners are developing innovative solutions on the basis of open data. An important goal is to share insights and working methods as a way of providing better local government services to the public. This manifests itself in a range of projects that address sustainable mobility, air quality improvement and crowd management, among other things. In her PhD thesis, Inske Groenen (Amsterdam Business School) examines the application of neural networks to panoramic images, aiming to give us a better idea of how liveable the urban environment is. Neural networks are systems that loosely mimic the functioning of the human brain and learn by seeing examples. This differs from other systems, such as those based on decision trees, which can be preprogrammed so that specific actions are always carried out in the same way.

Training neural networks on image data

The data set used by Groenen is limited to images of Amsterdam, at least for now. 'Amsterdam City Council has its own car drive around and take panoramic pictures. But, for other cities, we can draw on images from Google Street View as well.' An advantage of working with the Amsterdam data is that geographical information about objects in the streets, such as rubbish bins and lamp posts, is well documented. The goal is to train neural networks on the image data and use the geographical information to label the images. 'What this will allow us to do is combine the photos with a contour map to determine how tall a building is.'

As far as Amsterdam goes, a development pipeline has now been created for 24 types of objects together with their 3D coordinates and a camera module to convert these to 2D coordinates on a street plan. 'What you then get from the data are so-called "boxes", indicating where any particular object is located. For Amsterdam, this gives a data set of more than 14 million boxes.' Such a huge number is almost impossible to label manually, so Groenen is looking for an automated solution.
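To make the conversion from 3D coordinates to 2D boxes more concrete, here is a minimal sketch in Python. It is not the project's actual pipeline: it assumes an equirectangular panorama, a camera whose compass heading points at the horizontal centre of the image, and object positions given as east/north/up offsets in metres from the camera. The function names and parameters are hypothetical.

```python
import math


def project_to_panorama(dx, dy, dz, heading_deg, img_w, img_h):
    """Map a point (east, north, up offsets in metres from the camera)
    to pixel coordinates (u, v) in an equirectangular panorama.

    Assumption: heading_deg is the compass direction shown at the
    horizontal centre of the image, and the horizon is at mid-height.
    """
    # Compass bearing of the point, clockwise from north.
    azimuth = math.degrees(math.atan2(dx, dy))
    # Bearing relative to the camera heading, wrapped to [-180, 180).
    relative = (azimuth - heading_deg + 180.0) % 360.0 - 180.0
    # Angle above (+) or below (-) the horizon.
    elevation = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
    u = (relative / 360.0 + 0.5) * img_w
    v = (0.5 - elevation / 180.0) * img_h
    return u, v


def box_for_object(dx, dy, base_dz, obj_height, obj_width,
                   heading_deg, img_w, img_h):
    """Approximate 2D box (x_min, y_min, x_max, y_max) for an upright object.

    Simplification: boxes that straddle the left/right seam of the
    panorama are not handled.
    """
    u_centre, v_bottom = project_to_panorama(
        dx, dy, base_dz, heading_deg, img_w, img_h)
    _, v_top = project_to_panorama(
        dx, dy, base_dz + obj_height, heading_deg, img_w, img_h)
    # Angular half-width of the object as seen from the camera, in pixels.
    half_angle = math.degrees(math.atan2(obj_width / 2.0, math.hypot(dx, dy)))
    half_px = half_angle / 360.0 * img_w
    return (u_centre - half_px, v_top, u_centre + half_px, v_bottom)


# Hypothetical lamp post: 10 m due north, base 2 m below the camera,
# 4 m tall and 1 m wide, in a 2000 x 1000 pixel panorama facing north.
lamp_post_box = box_for_object(0.0, 10.0, -2.0, 4.0, 1.0, 0.0, 2000, 1000)
```

Repeating this for every documented object visible from every camera position is how a data set can reach millions of boxes without any manual labelling.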

Crowdsourcing campaign to remove noise

'The challenge is that labels don’t always fit very well. And some of the geographical data could be wrong. The upshot is that, when you generate a representation in a box, it might show an object in a place where it’s not actually located. This gives an unclear signal and reduces the performance of the network trained on the data.' To identify any weaknesses and discover how a network can in fact be trained properly, a subset with clean labels was needed. Such a subset, consisting of 7,000 images together with 150,000 boxes, all labelled correctly, was obtained via crowdsourcing. This meant that large numbers of people all over the world were given the opportunity to help carry out the work in exchange for payment.
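One common way to quantify how well an automatically generated box fits a manually verified one is intersection over union (IoU). The sketch below is an illustration only; the article does not say which measure the project uses, and the function names and the 0.5 threshold are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def flag_poor_fits(generated, verified, threshold=0.5):
    """Return the indices of generated boxes whose best overlap with any
    verified box falls below the threshold (an assumed cut-off)."""
    return [
        i for i, g in enumerate(generated)
        if max((iou(g, v) for v in verified), default=0.0) < threshold
    ]


# Hypothetical example: the second generated box has no verified counterpart.
generated_boxes = [(0, 0, 10, 10), (100, 100, 110, 110)]
verified_boxes = [(1, 1, 10, 10)]
noisy = flag_poor_fits(generated_boxes, verified_boxes)
```

Comparing the generated boxes against a clean subset in this way gives a rough picture of how much noise the automatically generated labels contain.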

With the aid of the Amsterdam council, a crowdsourcing campaign was initiated with an annotation application on Amazon Mechanical Turk. 'That’s quite standard in this line of work. It basically needs to be set up properly, people need to be given clear instructions and the execution of tasks should be monitored continuously.' With the subset of correctly labelled images, Groenen wants to reproduce the entire data set. 'This opens up new research possibilities. From a council perspective, the accurate recording of where objects are positioned is an important task and efforts are underway to find smart ways of optimising the process. But from a scientific perspective, it would also be worthwhile to train specific networks that reduce the distortion of panoramic images, just to mention one area of interest.'

Starting shot for innovation

So what’s next? 'Because the labelling of big data sets is such a labour-intensive and high-cost process, researchers are looking for ways of dealing with this limitation. One way is the use of "weak object detection", or the detection and pinpointing of objects that you know should appear in certain images based on existing labels. Innovation often begins with the availability of big, labelled data sets. With this in mind, I hope we’ll also stimulate innovation ourselves.'

Groenen will now embark on what she considers to be the most interesting part of her thesis. 'We can now unleash experiments and various networks on our data set. SCORE and Amsterdam City Council would also like us to zoom in on the quality of city life. One approach would be an analysis of specific objects to identify characteristics that affect the liveability of a neighbourhood, like the presence of parks or particular architectural styles. We don’t know yet how exactly we’re going to do that. But if we manage to develop something that will give us greater insight into the quality of city life, then I’d be very happy.'
