Road Work: Using Mobile Data to Alleviate Traffic Congestion

Thank you for that generous introduction. I want to thank the Wenk family for their generous endowment and for making these lectures possible, and Jeff Ban, my host in the civil engineering department, for the amazing logistics of the whole visit. And it’s really nice to see old friends and
good friends in the audience– Don, Lily, and many others. Today, I’m going to talk about inference and control in routing games. That might sound like a bit of an abstract title, but it’s actually very practical. And we’ll try to juggle back and forth between
some theory and some practice. And hopefully, you’ll connect with many of
the themes, because many of you were stuck in traffic today, and will be stuck in traffic
next week going to Thanksgiving dinner. So we should all prepare collectively for
this. So first, I’m going to go through a general
framework for traffic operations, which is the underlying backbone of the work we do. And then I’m going to talk about two specific
problems– inference problems, which is how you find out what traffic is like and what
you do about it, and then games, which is something that, as you will find out through
the lecture, we all play– willingly or not, knowingly or not– but that we are all part
of. So this is the diagram that undergrads taking
any control theory or any optimization class probably go through in their first lecture. And this is a very good representation of
how any system that is controlled with sensors and actuators works. There’s a plant, which is the system itself, whether it’s an airplane, a car, or the road– anything. There are sensors and actuators, and then there’s
a controller to control it. And what I’d like to do is try to think about
the transportation system in these terms, which will set up the stage for the rest of
the talk. So at the core of the system is dynamics,
which is how the system physically works. So think about next week, when you’re going
to your Thanksgiving dinner. Unless you leave really early, that’s the
type of traffic you will experience. This is last year’s Thanksgiving traffic in
LA. And I’m sure you could generate similar videos
pretty close to wherever you live. So the point here is that to model something
like this is not that easy. There are lots of nonlinear phenomena that
happen. And there’s decades and decades of work, generations
and generations of traffic engineers who spent a lot of time trying to understand the
physics of how these systems evolve. And so if you understand that to a certain
level, then in a sense you have a good understanding of how the system works. And so now, just like in any control system,
you need to understand how to sense it. You might have a good understanding of how
it works, but you need to understand what it’s doing. And if you think about sensing infrastructure
around you, you’re probably familiar with this type of technology. First, it’s not very elaborate. And second, there’s not much of it. Whether these are loop detectors, whether
these are cameras, whether these are radars, FasTrak tubes, anything– you name it– these
are mostly tools which have historically been developed and deployed by public agencies. And at most, maybe 10% of the road is instrumented
with it. So there’s very little data, if you will,
to measure traffic. On the control side, to control the system,
it’s not that different. If you think of yourself in your car in the
morning, where are you going to be, quote, controlled? There are traffic lights. There’s tolling, which might have metering. There is metering at the on-ramps and the
off-ramps. There are some restricted access lanes, the
405 here being one example, with the tolling, maybe some ways to give you more information. And that’s pretty much it. And even if you take a corridor like the 210 in
California– so this is one of the test beds that we’re working on with the California
DOT. Think of this corridor, which is probably
20 kilometers long. And there are 450 assets on this corridor to
control traffic. Every day, there are hundreds of thousands of
people going through this corridor. And you have 450 control points to control
this entire flow of people. That’s very little. And this is the reality of transportation
engineering for the last half century. For the last half century, this has been the
only way at the disposal of public agencies to control your mobility, which is to make
you flow better from wherever you live to wherever you work or wherever you want to
go throughout the day. And so that upper box– in a sense, this is
like a two minute summary of the way that upper box works. Now you can wonder, what’s under the hood? All these traffic lights, all this urban infrastructure
I see, how does it work? And here’s another two minute description
of how it works. At the simplest level, you have state estimation
and demand forecast. State estimation is the process of trying
to collect whatever data you have and reconstruct the most likely state of the system given
the data you have. So you can think about it as a dashboard that
is essentially empowering a traffic engineer in a command center somewhere to make these
decisions. And so to understand traffic, engineers have
simple representation tools called time-space diagrams, where, in a 2D diagram, you plot
time and space. And the color level encodes the speed of the
traffic. And so that color level fills up throughout
the day. And if you look at what it looks like now,
that gives you a depiction of the way traffic flows on a single lane of freeway. And then you have demand forecast, which is essentially trying to understand what traffic will most likely look like on
a Monday, on a Tuesday, on a Wednesday. So with historical data, you can pretty much
try to model that there is a morning peak. So demand will rise in the morning as people
go to work. Then it slows down a bit. And there’s an evening peak, where there are
people going back home from work. And then it slows down throughout the night. And so with these two blocks, essentially
these are the tools for traffic engineers today to get the most likely traffic on a
Monday, on a Tuesday, on a Wednesday, and then a depiction of what’s happening today. And that’s what’s used by the system to control
it. And then comes some optimization, because obviously
accidents happen, weather disturbances happen, special events like ballgames happen. And so the job of the traffic engineer is
to make that process better. The dream situation of any traffic engineer is to have everything green. So that means somehow they figure out the
magic way to sync up the traffic lights and everybody is flowing fast and nobody is stuck
in traffic. That’s obviously ideal. The reality is very different. But that’s an aspiration that you could think
about as someone optimizing the system. And to do that, there is an interface. Because even though there are a lot of science
fiction movies where everything flows automatically, at the end of the day, in the US and in many
countries in the world, you still have humans in charge. If you’ve watched the movie The Italian Job,
there is a traffic center that gets hacked. This is that traffic center– the LA traffic center. And there are two of them, actually, because
there’s the state and there is the MPO in charge. And in that traffic center, it’s just like
an air traffic control center. There are lots of displays with traffic and
people making decisions and semi-automated tools. And part of the job we do at Berkeley is to
help the California DOT build these tools, decision support tools, that will provide
better regulation, if you will, of traffic. And what that means is that in this interface–
the operator cannot optimize traffic online in his or her head. So there is a database of playbooks– situations
which are known, things which have been optimized offline, things which we know work. And what the human does is select, semi-automatically, the playbook for today from this database. Oh, we know it’s a Friday– Friday traffic. We know it’s raining. And we know there is a lane closure. And we know that playbook number 37 has worked
really well. Of course, that gets semi-automated. So things are pushed to the operator, or at
least that’s the dream. Things get pushed to the operator so the operator
can select the best way to make things flow. So this is, in a sense, the contribution of
the last century. The last century brought the automobile and the traffic light, as well as a lot of technology. And that’s the way things work today. So what I’m going to do now is try to walk
through some of the underlying tools that are under the hood. That will bring us to maybe 2/3 of the talk. And in the last third of the talk, I’m going
to explain how we have a Frankenstein on the loose and how everything I’ve spoken about
before doesn’t really work anymore. And that will be an explanation of why we
have all these traffic problems that become harder and harder to cope with, if you will. So first, one of the things which is interesting
that has happened over the last five years is we have witnessed a complete revolution
of demand forecast. 10 years ago, if you wanted to know where
people are coming from and where they’re working and how you can use that information to regulate
traffic, essentially the only source of data you would have was people running the census, knocking on your door every 10 years, asking you to fill out a questionnaire answering questions about where your children go to school, where you work, where you shop. And based on that, they would build these
models. And then with the explosion of smartphones,
what has happened is that probably all of you in this room, or most of you, have
a phone. And even if you’re not willing to disclose
your precise location, the cell phone operator– AT&T, T-Mobile, Sprint, whoever you’re using–
has the knowledge of the cell tower you’re talking to, because your phone is connected
to some cell tower. Which means that if the access provider has enough of you– for example, AT&T has about 40% of the US population, or at least that’s true in California– then that operator has the ability to know, all the time when your phone is on, which cell tower you’re talking to, which gives a notion of your location within roughly 200 or 300 meters. And that’s enough. It’s not precise. They can’t know exactly which block you’re on
or which street you’re on. But that’s enough to figure out large scale
mobility of traffic. And so what that means is that if you don’t
do math, you can skip this. If you do math, this is probably trivial. The point is that, with fairly common convex optimization tools, you can find a least squares estimate of the most likely place people live and the most likely place they go, using that data. [? Kathy ?] [? Woo, ?] who is sitting here, was one of the people who did that work. And so what it means in practical terms is
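To make that concrete, here is a toy sketch of the idea. Everything in it– the tower-coverage matrix, the flows, the noise– is invented for illustration; this is not the actual formulation from that work:

```python
import numpy as np

# Toy example: 2 origin-destination (OD) flows observed through 3 cell towers.
# A[i, j] = fraction of OD flow j that passes within range of tower i
# (all numbers here are invented for illustration).
A = np.array([
    [1.0, 0.0],   # tower 1 sees only OD pair 1
    [0.5, 1.0],   # tower 2 sees half of OD pair 1 and all of OD pair 2
    [0.0, 1.0],   # tower 3 sees only OD pair 2
])

true_flows = np.array([1000.0, 400.0])   # unknown in practice
noise = np.array([20.0, -15.0, 10.0])    # measurement noise
counts = A @ true_flows + noise          # what the towers actually report

# Least-squares estimate of the OD flows from the aggregate tower counts.
est, *_ = np.linalg.lstsq(A, counts, rcond=None)
print(est)  # roughly [1000, 400], up to the noise
```

A real formulation adds constraints (nonnegative flows, for instance) and many thousands of OD pairs, which is where the convex optimization machinery mentioned in the talk comes in.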
that’s what the data looks like. So this is for LA. And what you see in pink is an example of
cell tower records of a person who was driving through LA. And it’s not the precise location. You can see this little kink here. That’s because the cell tower it’s talking
to happens to be off the main road. But from that, you can find the rough mobility
pattern of that person. So thank you to all of you for contributing this data, because we all contribute to this database– willingly or not, knowingly or
not. But what’s interesting here is that over the
last five years, that data has become widely
various agreements they have with these data providers, which essentially enables transportation
practitioners to model much more accurately the patterns of people. And you can see interesting things. This is an example of a FedEx truck doing
deliveries. So that explains why that truck is, throughout the day, delivering to different locations. This is a person going to work, probably going
to the bar or to the shopping mall or something, and then going home after the end of a hard
day, and so on and so forth. And so the way these sets of data are used
nowadays is this: from this approximate set of locations, which you have at a very high level and at large scale, you can identify what are called origin-destination
matrices, which are a way to encode where people live and where people work. And from that, with models– and that’s part
of the work we do in transportation engineering and in control and optimization– figure out
the network loading. Which is essentially: if you know where people are, if you know where they work, and if you have an approximate routing with good models,
you should be able to somehow forecast to a certain degree of accuracy their routes. And so the type of results that you can extract
from this depends on the density of cell towers. So some places, like downtown Seattle, have a very high density– lots of cell towers. Some rural places have a lower density, with fewer cell towers. And based on the density of these towers, you can essentially determine the accuracy up to which you can estimate this type of demand. So that state and demand forecast, once you have it, together with good statistical models, is a very efficient tool for practitioners,
because now you have a good way to forecast what’s going to happen on a typical Friday
when it’s raining or when there’s a ballgame. Now, another revolution that happened over
the last 10 years which all of us here benefited from is the revolution of traffic estimation. And so that’s the story I’ll tell for the
next few minutes. So state estimation is the process of taking
data– whatever you have: cell phone data, loop data, detector data– plus a model, and finding the most likely state of traffic, given that the model is not perfect, the data is not perfect, and the truth is somewhere in the middle. And so the typical tool that transportation
engineers use for this is this time-space diagram I was talking about before, which
plots the time on the horizontal axis, the location on the freeway on the vertical axis. A certain color level indicates the congestion
level. So 70 miles an hour is going to be blue. And low speed is going to be red. And if you are able to reconstruct this all
the time, then that means you have perfect vision of the speed of people everywhere on
the network. And you do this for every link. And of course, it’s not possible, but you
can get as close as you can. And so part of the work we’ve been doing over
the last 10 years is exactly the development of algorithms that enable the reconstruction
of such time-space diagrams. And to show you how it works, imagine you
have none of this data. But that’s what you’re trying to reconstruct. So just keep this in the back of your mind. If you were able to sample data at given locations
on the freeway, like a loop detector, that’s essentially saying that you have access to
the data along the black lines. And if they’re interrupted, that means we
have a sensor failure. So you don’t have access for these particular
times. If you were able to make an estimate that
at midnight the freeway is nearly empty, that means you have knowledge of the data along
the green line. If you were able to track some of you– say, because you’re willing to share your data– that means you’re able to sample the data along
a trajectory. If you do environmental science, that’s called
a Lagrangian measurement. And that means you’re able to measure that
function along that line. And if you’re just able to re-identify cars–
like for example you have an EZPass or a FasTrak or a transponder, that means you’re able to
say, oh, the vehicle at this dot here is the same as the vehicle on that dot up there. So that’s roughly the data that is available
today. And so part of the work that we’ve been doing
over the last 10 years, and that I think is very mature now, is the process of developing
algorithms so that if you have this type of data, you’re able to reconstruct the picture. And to do that, what you use is you use mathematical
models of traffic, which are represented here in the form of a partial differential equation. That’s a Hamilton-Jacobi equation. And that’s essentially this model of traffic. It’s like this, 50 years of work of transportation
engineer leading to a model. And then so that’s what goes in this box. And then, having that sensing data, I essentially
say that you can sample the data at a given location. You have initial conditions. You have value of the function along some
trajectories and at specific points. And so part of the work of optimization engineers
is essentially trying to find a mathematical formulation that is able to say that, given
the data I see, that is not perfect. Given the model I have, that is not perfect
either, represented by these two polygons. My guess, resulting from the optimization
of a convex program, is that the value of the system– the state of this system– is this. And that’s what these theorems say here. If you do partial differential equations,
it says that you can optimize a least square problem under the Hamilton-Jacobi constraint
as a convex program. Otherwise, just think about it as: given imperfect
data, given an imperfect model, what is the truth? And the truth is somewhere in the middle. And so this is one of the things we’ve been
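Schematically– and this is my paraphrase, not the exact formulation from the papers– the program being described looks like:

```latex
\begin{aligned}
\min_{M}\quad & \sum_{k} \big( y_k - M(t_k, x_k) \big)^2
  && \text{(least-squares fit to the measurements)} \\
\text{s.t.}\quad & M \text{ satisfies the Hamilton--Jacobi traffic model}
  && \text{(the physics constraint)}
\end{aligned}
```

Here $M$ stands for the traffic state being reconstructed, and the $y_k$ are the sensor readings at time-space points $(t_k, x_k)$; the result cited in the talk is that this can be posed as a convex program.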
working on for the last 10 years. And so let me show you how it works in practice. This is data which represents GPS tracks of
2% of traffic– so again, time-space diagram, time, location. And each dot is a GPS track of a car moving
on the freeway, sending you their position every three seconds. And that represents about 2% of traffic here. And I’ll explain later how that data was collected. So essentially what the algorithms do is they
take that data. And they are able to fill the data in where there are holes. So here, there are nearly no holes. So you can read traffic. But say I get rid of 99% of the data. That’s what it looks like now. Then if I asked my five-year-old son or my
9-year-old daughter to paint in between, they’d probably figure out the colors. But essentially what the algorithms do is
that. Except they do it mathematically. So they’re able to reconstruct this from this. And this is the ground truth. And so this is a graphical way of explaining
how the system works here. And so 10 years ago, when there were no phones,
essentially there was no such data. Initially, people thought, well, with a few
percent of data, can we reconstruct traffic? And it was not obvious then. And obviously today, the reality is we all have
traffic on our phones. And so this has become part of everybody’s
technology. And so it’s interesting to see how this domain
has matured over the years, because today it has become a commodity that we can get
everywhere for free. And this has a lot of consequences that we’ll
talk about in the second part of the talk. So when we started the project– and Jeff was one of the early participants– back in 2007, the iPhone 1 did not have a GPS. The smartphone of the moment was the Nokia, a now-defunct brand. And Android was not even born. And so what we did– and Jeff was one of the
main architects of the first system– is we built the first traffic app that was ever
deployed in North America to collect traffic from these phones– which look like a vintage object you’d want to collect today, if you are into these kinds of things, but which was the smartphone of the moment. This was one of the first smartphones ever. And we deployed this app, mostly in California, and were able to reconstruct one of the first traffic maps ever built from mobile
data. So within a few years, we managed to change
this, which is a vintage website– actually, if you go to California, they still have it up– which was the only way you could get access to traffic in 2008, into this, which
all of you have on your phone today. And so to appreciate this, you have to put yourself back 10 years, when nobody really believed– and, in particular, a lot of the DOTs didn’t
believe it could be done. So in order to prove to the DOT that it could
be done, we had to first try it at small scale. So we ran an experiment called the Mobile
Century experiment, in which we hired 100 students– actually 200 students– to drive
a hundred cars back and forth 10 hours on 10 miles of freeway– a very boring job. But the goal of that experiment was to prove
that this sample, which was about 2% of traffic, was enough to reconstruct traffic. And think again– nowadays you don’t even question it. You have it on your phone. But 10 years ago it was not even clear it
could be done. So we ran that experiment for 10 hours and
collected enough data then to demonstrate that with this percentage of data, you would
have good enough accuracy. This 2% number was not chosen at random. Companies at the time, including Google and Apple,
thought that within 18 to 24 months of the experiment, that would be the penetration
rate on US freeways– which obviously now is above 100%, but which then was a really good forecast. So when we ran the experiment, it was like
a nice military operation back in my old days in the military, where we had a command center. Each car had this smartphone in it. And essentially, we were running things out
of a parking lot for the whole day, having a lot of TV crews there to film the experiment. We made practically every TV channel. An interesting side story is we were trying
a lot of different video cameras on bridges to film the entire thing. And the student who is here, who is now on the faculty
at UT Austin, who was running all the preliminary tests, went every day for a few weeks on the
same bridge to try a different camera with different weather to make sure it would work. He was almost arrested by the police on that
day, because people were reporting terrorist activities on the bridge. So the police officer came to the bridge really
upset and told him, what the hell are you doing here? And he said, oh, I’m working for UC Berkeley
and Caltrans on this experiment. And the police person told him, OK, put a
helmet on next time so you don’t make people nervous. So the next day, he puts a helmet on and nobody
called. And so I guess the conclusion is if you want
to do anything shifty on a bridge, as long as you wear the right equipment, you’ll be
fine. [LAUGHTER] So we ran the experiment and had a press conference
with the CTO of Nokia, who flew in from Helsinki on that day. This is what you would have seen from the
helicopter if you had been above the experiment. Each of the dots here is a vehicle running
that very first experiment. And while we were doing it, we essentially
ran a live view of traffic. And this is probably the first time ever in
history that people could watch traffic collected exclusively from smartphones. Again, something that you have on your phones
everyday today, but which we managed to put online for 10 hours on 10 miles of freeway
just to demonstrate it could be done. What you see here, in front of the press, is people pointing at the display at something they couldn’t believe at the time:
around 10 o’clock in the morning, when traffic should have been smooth, we saw this gigantic
piece of congestion. And we couldn’t believe that there was a traffic
jam there. So at that point, I realized that I was probably
going to lose my job, because it looked like the algorithm was not working well,
and this would all be on live TV. That person used to work at, another
defunct company. And what she’s doing on that picture is actually
calling, because nobody would believe it was happening. And it turns out there was a five-car pileup
accident that we caught before the police. We know this because they had pagers and other
defunct devices. And the pagers all started to beep as soon as
there was an accident. And so instantaneously, it switched the mood
with the TV crews there, because they figured out, wow, that stuff is really cool. And the rumor since then is that we staged the accident
on purpose to demonstrate the technology. [LAUGHTER] And that rumor came from Stanford and other
schools in the Bay Area that we won’t talk about. So now back to where we are today, we continued
building the system. In fact, Jeff was– actually, I’ll tell another
story. I never tell that story. But since Jeff is here, we’re going to end
up five minutes late today. But this is a story I never tell. So when we ran that experiment, we had to
build that system in about four weeks. And nobody knew how to program in Java except
Jeff. So we asked Jeff to teach everybody how to
program Java. And in a week, everybody was programming Java
and doing it absolutely horribly. And so we put the program together. The last compilation of the code happened
at 4:00 AM. And the launch was at 10:00 AM that morning. A bunch of students were sleeping on the floor
at Nokia headquarters. Finally, at 4 o’clock, we compiled the code. It seemed to be working. So we started the experiment at 10:00. And then Jeff was in the command center and
making sure that nothing broke. And then there was something called a memory leak. If you have programmed, you know about memory leaks. Essentially, when memory leaks and leaks and leaks, at some point you saturate your memory, and your computer crashes. And so I don’t exactly know how this worked. But essentially there was so much leaked memory that every 45 minutes we had to reboot the system because of the memory leak problem. And so when we were giving that presentation
in front of TV, and then later at a seminar, I had no idea about this memory leak problem. So Jeff was calling someone– I don’t even know who it was. And someone would walk up to me like, hey, the
system is going to crash in three minutes. You need to move in front of the screen while
we reboot the system. And so essentially, every 45 minutes,
Jeff was calling someone who was telling me, move in front of the screen. We’re rebooting. And then continue the demo. And this is so– I never told that story before. But since Jeff– and I really should acknowledge
the fact that this was really Jeff’s work. So this was the first version. In engineering, you have to rebuild a system
many times until it works well. And after a while, it worked well. So this is an example of what the data looks
like. This is 500 taxis in the city of San Francisco. You can recognize it, because each taxi drops a GPS
point every 30 seconds. So after a while, you can recognize the city. And that gives you a sense. If you can collect data from 500 vehicles
in a city like San Francisco, that’s the scale of data you get. Now imagine the city of San Francisco today
has 40,000 registered Uber drivers. And each of these Uber drivers plus you, if
you ride an Uber, transmits one GPS point every two or three seconds, because that’s how you
see the car moving on your screen. So you can imagine, this data– this was revolutionary
10 years ago. Maybe nine or eight years ago, it enabled
traffic information. Today, there is so much data that this is
almost a mature field. And so what companies like Apple, like Google,
Waze– which are now all the same, anyways– have built is essentially a traffic monitoring
system. And it more or less all looks the same. There’s a bunch of data feeds that are filtered
that feed models like the one I’ve shown before, that produce estimates– that’s what you get
on your phone– that get transmitted to customers whether you are using a phone, a web browser,
or any other system. And that’s the genesis of these systems. So a few years into the project, people were
really trying to figure out, well, how well is that system working? And so in 2009, one of the things we were
doing at the time is we were benchmarking ourselves against Google, because this was the
beginning of the supremacy of Google Maps. And so the type of things we were looking
at is– at the time, Google was not using smartphone data. And so we wanted to know, how much faster than the state of the art could this system be? So what you see on the left is a movie of–
you’ll see a movie of Google traffic. It’s accelerated. One frame is 30 seconds, because watching
traffic on the web is a very boring thing. And then, on the right, is the interface of
the system we had built. And so what we were trying to demonstrate at
the time is that essentially you can monitor traffic much faster. And so on the left, you see an accident happening. You’ll see it in about two or three seconds. It’s happening right now. And that will create a big shock wave in the
back. And it took, at the time, Google about 15
minutes to show it on their display. So this was what we were showing to kind of
proselytize to the public agencies, to say you should really start to invest in technology to ingest
the data, because that is really the way traffic will be managed in the future. And so the project matured. And a lot of companies have hired our students
since. The person running Apple traffic, Ryan Herring,
was one of the co-creators of that system with Jeff. People at Google and many other companies
essentially took a lot of the things we did in the lab and migrated them to your
phone. And so I think it’s fair to say now, about
10 years later, that there’s still a lot of research happening, because traffic predictions
are not perfect. I’ve missed only one flight ever in my life
because of bad traffic predictions. Maybe you have. We certainly have missed meetings. I have. But I think, overall, this is a fairly mature
field. And while modeling contributions, estimation,
experimental, or data quality contributions still need to be made, things have progressed
quite well. So now the question is: we collectively have
built this as a community. It’s 2017. Where does that leave us? And so this goes into the more interesting
part of the talk. Every morning or every day you are using Google
Maps or any of your other favorite apps. For example, if you want to go from Berkeley
to that other university I mentioned before on the other side of the Bay, you punch your
destination– that’s my alma mater, by the way, so I’m a bit conflicted. So you punch your destination. It will give you the shortest path. And it probably will give you another two
alternative paths, which usually have about the same travel time. That’s the way these things work. Now think about it for a minute. What does it do to the system? 10 years ago, you knew how to go to work. And since you had no information, you would
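Under the hood, these apps are solving a shortest-path problem on a road graph whose edge weights are current travel times. Here is a minimal static sketch with Dijkstra’s algorithm– the graph, the node names, and the minute values are all invented for illustration:

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra on a dict graph: node -> list of (neighbor, minutes)."""
    dist = {src: 0.0}
    prev = {}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(pq, (nd, v))
    # Walk back from the destination to recover the route.
    path, node = [], dst
    while node != src:
        path.append(node)
        node = prev[node]
    path.append(src)
    return list(reversed(path)), dist[dst]

# Invented toy network: two routes across the Bay.
roads = {
    "Berkeley":   [("BayBridge", 25), ("SanMateoBr", 35)],
    "BayBridge":  [("PaloAlto", 45)],
    "SanMateoBr": [("PaloAlto", 20)],
}
route, minutes = shortest_path(roads, "Berkeley", "PaloAlto")
```

The feedback loop appears when the edge weights themselves depend on how many people follow the recommendation– which is exactly the game the talk turns to.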
take roughly the same route to work, because you knew how to. But now you have information. That means you have a feedback loop around
you. And if that feedback loop tells you, ah-ah,
don’t take this road. Take that one instead. You’ll do it. So that means that technology that all of
you have for free all the time everywhere on your phone enables you to have a feedback
loop inside the system. And that’s very bad. Imagine you are building an autopilot for
a truck. But now you’re transplanting the software
onto a car. It’s not meant for a car. It was built for a truck. Or an autopilot for an aircraft– a jet fighter–
and then you put it on a 747. It doesn’t work. Well, that’s what’s happening to traffic now. Now, you could think, well, that’s kind of
an abstract way of thinking about it. But actually, it’s not. If you’re trying to optimize a system but
you have a wrong model of the system because you have forgotten this feedback loop, things
could go really wrong. And here’s an example. So initially, people thought, oh, everybody
has traffic information. And if we can give everybody the best route to
their destination, that will make traffic flow better. And elected officials even made these statements
in public. I don’t want to point fingers at people or
organizations. But if you look on the web, you’ll find tons
of these, people assuming that if all of us have the most efficient route to our destination,
we improve traffic. But then, sooner or later, people started
to realize that something’s wrong. And in fact, I’m just going to flip through
these. But if you have noticed more traffic in your
neighborhood recently, you are not alone. In fact, there’s a lot of places where people
have started to figure out– that’s strange, you know, there’s a lot more traffic. And then they figured out, oh, yeah, that’s because of these popular apps. It turns out that these apps are routing more and more people through my neighborhood. So if you have seen that happening, well,
that’s a well-known phenomenon. Then people started to get really upset. So they started to resist. First, they did funny things, like posting signs: “You should be ashamed to come to my neighborhood.” Then they started to do funnier things, like
posting fake detour signs to confuse people going through their neighborhoods. There’s even manuals on how to spoof and confuse
people using these apps. And then the city started to get involved
and organized. So for example, what they did is they started
to build more speed bumps. Because if you build speed bumps or stop signs, traffic slows down. So by degrading your own traffic, you’re making everybody’s life more miserable. But then people avoid your neighborhood. This is a reality of urban planning. And there are other ways to do it. You could add turn restrictions. You could start retiming the traffic lights
in your neighborhood to make sure that it’s getting more and more– it’s taking more and
more time to drive through your neighborhood. And so it’s an interesting situation in which
urban planners today have found no better way to resist than making it harder to drive
through their own neighborhoods. So this is why I mentioned Frankenstein at
the beginning, because we collectively built that routing engine that is free that all
can use. And now, because those residential streets
are not meant to support all that traffic, we’re in a situation where people are trying
to make traffic worse so that people stay out of their neighborhood. And that war went on and on. That war is still happening, to the point
that elected officials are now trying to figure out how to solve the problem. And the bad news is there is no easy solution
to that problem. At some point, people even thought that citizens would start to sue Waze. And ironically, Waze, which is one of the leading companies in this field, was born in Israel– and the first lawsuit was filed in Israel. So it’s ironic that the first lawsuit happened where the company was born. And it’s probably only a matter of time before
we see class actions in the US. So now we should ask ourselves, what have
we done? Right? We have given everybody that free information,
the ability to be more efficient. But somehow it has made things worse. And this is a widespread phenomenon. For example, what you can see here is a map
of the places where this has hit the news. If it’s bad enough that it hits the news,
that means there’s a lot of other places where it’s happening. If you have issues in your neighborhood, please email us, because we are trying to build an inventory. This is like a cancer. It’s growing. It’s everywhere, and we’d like to understand
what’s happening. So now that this has become a very severe problem, people have started to ask, well, can we explain it? Can we understand it? Because if you understand it, maybe you can
fix it. So let me explain, let me show you why this
is happening. And then you’ll understand the complexity
of sharing information of that nature with drivers, because there’s no easy fix. The movie you are about to see shows the same simulation twice. The bottom part is, say, 10 years ago, when nobody had that information. And the upper part is, let’s say, 20% of the people– you– using the same app, whether it’s Waze or Google or Apple or INRIX or
anything you want. And then here’s what happens when you give
people information. So this is a simulation. Let’s say there is an accident that blocks three lanes. That creates a big bottleneck. So a traffic jam happens. I’m sure you’ve all been in a traffic jam
before. But if you give people information, then people
want to leave the freeway immediately, because that’s what the app tells you to do. If you don’t give people the information,
they stay on the freeway. Ultimately, the freeway gets cleared, and things start flowing again. But if you have given people the information, once the freeway is being flushed out, you now have a lot of people, hundreds of people,
on the off-ramp. So that has a lot of interesting consequences. First, it creates more traffic, because everybody wants to leave the freeway at the same time– the off-ramp is not designed for that. So now you have a ton of traffic on the freeway, because people have that information. In trying to make traffic on the freeway better, you’ve made traffic on the freeway worse. But in addition, you now have these hundreds of people on the side streets, which are absolutely not meant to support that traffic either– first, because they are small, and second, because they have a ton of traffic lights, which are not meant to process that much traffic. So now you’ve made that traffic much worse also. And this is a reality of traffic information
today. This is why you see a lot more traffic in
your neighborhood. And this is why your neighborhood is suffering
from it, because it’s not meant to support that extra traffic. These simple things– which in isolation are well understood and have been studied by generations and generations of traffic engineers– become problematic when they all become coupled through that information system. People understand bottlenecks. People understand rerouting. People understand bottleneck removal. People understand routing on residential streets. But all of these sequenced together is what’s causing the problem. And so this is where analysis really should
enable us to at least understand the problem. And the good news is there is a good theory
that can explain this. And that theory is game theory. So what’s happening here is that at the inner
loop of this feedback system, people are inherently behaving selfishly. In the morning, when you punch in your destination on your phone, what you want is to get to work as fast as possible. So you selfishly want what’s best for you. And I don’t blame you– I do the same. The point is that this is a feature of traffic. You wouldn’t want to take a 10 or 20 minute
detour for the greater good of society. You want to go to your work. You want to go to whatever you have to do. So if you model this, there are many models that have been developed over the years to capture that system. And there’s a very famous concept in economics, the Nash equilibrium, which can characterize this. So I’m going to skip the formalism and explain
it on a very simple example. Say you go from here to here. And 10 years ago, you had no information. There’s no reason why you would leave the
freeway, even if there is traffic. Because first, you don’t know the neighborhood. Second, maybe it’s dangerous. Third, you’ve never done it before. So most of the people will just stay on the
freeway, which means essentially you’d have a ton of travel time on that freeway. And nobody would go to these side streets. Then, as more and more people get that information, more and more people will figure out, oh, I have an alternative. And people will clog that street. And when that street gets clogged, then people will see the map and say, oh, this one is actually better, and start to clog that
street. And so if you try to compute the travel time
as more and more people use the app, that’s what it looks like. Think of it this way: 10 years ago, when nobody had the information, travel time on the freeway was terrible. Travel time in Pasadena was awesome. And so everybody stayed on the freeway. That’s the way things were 10 years ago. Today, maybe 20% of the population use the app. In the meantime, as more and more people use the app, they clog the first street. Then they clog the second street. And finally, travel time becomes the same. It’s equally bad whether you’re on the freeway or on the arterial roads. That’s called a Nash equilibrium. If you’ve taken any economics class, that’s probably something you’ve seen before. If you’ve seen the movie A Beautiful Mind, that’s the story of John Nash. And the reason he got the Nobel Prize for this is that if everybody behaves selfishly in this way, no one has an incentive to change their behavior. But one can prove that, at least in the case of transportation, that outcome is not optimal. That means that if all of us take what we think is the shortest path to work, it actually leads to a worse outcome than if we were all working together collaboratively.
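That two-route story is easy to make concrete. Here is a minimal sketch– my own toy numbers, not figures from the talk– with a freeway and one side street, each with a linear travel-time function. At the Nash (Wardrop) equilibrium, every used route takes the same time, and as the share of app-equipped drivers grows, the split drifts to exactly that point.

```python
# Toy two-route example (invented numbers): freeway vs. side street.
def t_freeway(x):
    """Travel time in minutes when a fraction x of all drivers is on the freeway."""
    return 20 + 10 * x

def t_side(x):
    """Travel time in minutes when a fraction x of all drivers is on the side street."""
    return 10 + 40 * x

# Nash (Wardrop) equilibrium: both used routes take equal time.
# 20 + 10 f = 10 + 40 (1 - f)  =>  f = 0.6, and both routes take 26 minutes.
F_EQ = 0.6

def freeway_share(informed):
    """Freeway flow when a fraction `informed` of drivers uses a routing app.
    Drivers without the app always stay on the freeway."""
    if t_side(informed) < t_freeway(1 - informed):
        return 1 - informed   # the side street still wins: every app user diverts
    return F_EQ               # app users split until both routes take equal time

for p in (0.0, 0.1, 0.3, 1.0):
    f = freeway_share(p)
    print(f"{p:4.0%} informed: freeway {t_freeway(f):.1f} min, "
          f"side street {t_side(1 - f):.1f} min")
```

With nobody informed, the freeway takes 30 minutes while the empty side street would take 10; at full app penetration both routes take 26 minutes– equally bad everywhere, which is the equilibrium described above.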
And that’s why all the headlines you see– a mayor of this city teams up with that company to improve traffic by giving you the best traffic information– are just completely wrong economically. But it’s also demonstrated in practice. And so the sad situation of sharing traffic information is that today, by giving people better traffic information, we are actually making the system worse because of that Nash equilibrium problem. And there is something called the price of anarchy, which is essentially a measure of how much worse that Nash equilibrium is than if everybody
was collaborating. That price of anarchy is a measure of how bad the system is. And so part of the work we are going to do in the future is understanding the proper structures and mechanisms one should use to fix that problem.
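The price of anarchy can be computed exactly on textbook networks. Here is the classic Pigou example– a standard illustration from the selfish-routing literature, not a network from the talk– where the gap between selfish and coordinated routing works out to exactly 4/3.

```python
# Pigou's example: route A always takes 1.0 unit of time; route B takes x,
# where x is the fraction of drivers on B.
def avg_time(x_b):
    """Average travel time when a fraction x_b of drivers takes route B."""
    return x_b * x_b + (1 - x_b) * 1.0

# Selfish (Nash/Wardrop) outcome: route B is never slower than route A,
# so everyone takes it, and the average time is 1.0.
nash = avg_time(1.0)

# Social optimum: minimize the average time; a grid search finds x_b = 0.5,
# where the average time is 0.75.
opt = min(avg_time(i / 1000) for i in range(1001))

print(f"price of anarchy = {nash / opt:.4f}")  # 4/3 ~ 1.3333
```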
So I’m going to skip a lot of the mathematics here. But to understand the problem, think of it as a two-layer problem. The first layer is that we all want what’s best for ourselves. We all want the shortest path, or the fastest path, to go to work. So we’re not going to collaborate by sending people on a longer route for the greater good of society. And then, at the higher level, there’s absolutely
no incentive for companies to do so either. If I’m Google and I’m good at estimating travel times, my incentive, as Google, is to give you the route that’s best for you. Because if I give you something better for
society that puts you on a longer path, the first thing you’re going to do is delete the
app on your phone and try another app. And that’s the reality. So these apps are not going to collaborate
with each other either, for the same reason. And so what you end up with is a situation in which not only do you have a feedback loop, but you have a set of different feedback loops that are not easy to understand either. Some of you use Google. Some of you use Waze– that’s nearly the same company nowadays. Some use Apple. Some use INRIX. All of them have a slightly different algorithm
that tries to give you what’s best for you. And so I think the game over the next 10 years
is to try to understand what kind of control mechanisms can be happening at this level,
which is the level of regulation, to fix that problem. And there’s probably another 10 years of work there. And it’s probably even harder than that, because none of this happens in one day. The truth is, if you think about Google today, Google six months ago, Google two years ago, or whatever app you were using, the system itself also changes. You’ve probably noticed that Google or any of these apps pushes you different routes, gives you different alternatives. That’s because the algorithm itself is learning
over time. So in addition to that problem, which up to now I’ve described in a pretty static manner, it turns out that there is also a learning dynamic. The system itself learns over time. And if you don’t understand that second dynamic, it makes it even harder to control the system. And so part of the work we do consists in understanding how these algorithms learn over time. Because if we have at least a good understanding of the way they learn, then we can say something about what they converge to. And so in game-theoretic terms, that means
that if I am a routing app, every day essentially I route a bunch of users. I look at how well I did, because I can tell how fast they traveled, and I can compare the alternatives. That’s what’s probably happening under the
hood at all these apps. And then the next day the algorithm changes
based on the data. So part of the work we’re doing now is trying
to understand: if we can make assumptions on the way these apps learn, can we say anything about what this is going to converge to? That field is called repeated games. And depending on the assumptions you make on the games that these apps are playing, what you can prove is that in some cases, at best, the system converges to a Nash equilibrium. And in some cases, it doesn’t converge at all. That means, if not used properly at very large scale, you can have really bad situations where on Monday the apps send everybody onto one freeway, and on Tuesday onto the next freeway, because the first one was terrible. Things could get really unstable.
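That Monday/Tuesday flip-flop is exactly what naive best-response dynamics produce. A toy sketch– my own illustration, not a model of any real app– with two identical routes, where the whole population jumps each day to whichever route was faster the day before:

```python
def t_a(x):               # travel time on route A when a fraction x uses it
    return 10 + 30 * x

def t_b(x):               # route B is identical to route A
    return 10 + 30 * x

x = 1.0                   # day 0: everybody happens to be on route A
history = []
for day in range(6):
    history.append(x)
    # tomorrow, everyone takes whichever route was faster today
    x = 0.0 if t_a(x) > t_b(1 - x) else 1.0

print(history)            # [1.0, 0.0, 1.0, 0.0, 1.0, 0.0] -- it never settles
```

Smaller, smoothed updates damp this oscillation, which is why the assumptions made on the learning rule matter so much.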
And these are things you can already start to see happening. And so part of the theoretical work we do
consists in trying to prove that, at least for certain learning mechanisms. So you can think about it as a machine learning
algorithm that runs all the statistics on the collected data. And depending on the assumption you make on
the learning which happens inside these apps, you could at least make some guarantees that
if some conditions are satisfied, this is not going to converge to something bad. And if possible, this could converge to something
better. And so part of understanding that behavior essentially amounts to characterizing Nash equilibria and showing that, at least in some cases, the system converges to a Nash equilibrium. If you make almost no assumptions on the system and assume only that the algorithms used by these apps are no-regret algorithms, which form a very general class of algorithms, then you can’t prove convergence of the actual play– you can only prove convergence on average. If you assume that these apps work a little closer to the Hedge algorithm, which is a very well-known algorithm in online learning, then at least you can show that the apps converge almost surely to a Nash equilibrium. That’s not great news, but it’s still better. And then if you assume that these algorithms learn using a mirror descent step, which is much closer to a concept commonly used in convex optimization, then you can prove not only convergence to a Nash equilibrium but also a rate of convergence. I’m going to skip the details of the proofs here, but I’m happy to come back to them at the end.
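To give a flavor of what such a learning rule looks like, here is the Hedge (multiplicative-weights) update applied to the earlier freeway-versus-side-street example– a sketch under my own toy assumptions, not the algorithm any actual app uses. The population’s route split settles at the equilibrium share of 0.6, where both routes take 26 minutes.

```python
import math

def t_freeway(x):         # travel time in minutes with a fraction x on the freeway
    return 20 + 10 * x

def t_side(x):            # travel time in minutes with a fraction x on the side street
    return 10 + 40 * x

eta = 0.2                 # learning rate
w = [0.99, 0.01]          # initial weights: almost everyone on the freeway

for day in range(1000):
    x = w[0] / (w[0] + w[1])                           # today's freeway share
    losses = [t_freeway(x) / 50, t_side(1 - x) / 50]   # rescale times to [0, 1]
    # Hedge / multiplicative-weights step: slower routes lose weight
    w = [wi * math.exp(-eta * li) for wi, li in zip(w, losses)]

x = w[0] / (w[0] + w[1])
print(f"freeway share after learning: {x:.3f}")  # approaches the equilibrium 0.6
```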
So the point is that if you view the system as an environment which evolves, with agents routed by these algorithms, then depending on the way the algorithms learn, you can prove that the system will be steered to different types of equilibria. And that’s really the beginning of something
which is very important in the future, because we’re going to see a lot more regulation of
traffic. It’s just unavoidable. Look at the way Silicon Valley is asphyxiating: when I was a student at that university in the south of the Bay, the commute time between San Francisco and Cupertino was about 45 minutes. Nowadays, it’s about an hour and 20 minutes. So there is no choice. At some point, regulation will kick in. But to understand that regulation, we really
have to think hierarchically about this matryoshka problem. At the core is you, who have the need for
the fastest route. So that means you’re regulating a system which
is part of a much broader control scheme, which itself will keep changing every day
as the algorithms that are routing you learn. And that’s a reality of urban planning. And that goes maybe back to the spirit of
the Wenk lecture, that technology should really inform policy. These problems are a reality and will become more and more crucial. The smartphone explosion is the story of the last 10 years: 10 years ago, probably nobody had a smartphone; now everybody has one. But then if you think about demographic growth,
which is very true for Seattle and certainly crucial in California, that’s going to only
make things worse. And then in addition to this, if you think
about perturbations of the system which are local, there was a startup called Google when
I was a student at Stanford, which had 10 people. Now Google’s campus in Mountain View probably
has 30,000 or 40,000 people. A few years ago, there was a company named
Tesla that made electric cars. Today, the Tesla factory in the south of the
Bay has a couple thousand people. In three years, Treasure Island will have 18,000 additional apartments open to the public. You can imagine what that will do to the Bay,
to the bridge. And so understanding what these perturbations
of the system do as the algorithm learn is going to be essential to regulate traffic
better. And so the reason why these problems are difficult
is because at their core, you have dynamics which are inherently complex. The way people route themselves, the way people drive, the way traffic behaves– the underlying nonlinear dynamics are quite hard to understand. The system is also distributed and decentralized, so the way control is done is not that easy. You have non-cooperative players. There’s no reason why you would sacrifice
your travel time to help society. People want to go to work as fast as they
can. And you have humans in the loop, which also
induce additional difficulties. And so understanding all these complex loops
in this game theoretical framework is going to be one of the first steps towards at least
understanding the problem. Fixing the problem is even harder. Because now even if you knew that you should
send 20% of your commuters on a different route to improve traffic, how do you do it? Do you pay people? Do you give people a free cappuccino to take a 20-minute detour? That’s probably not likely to work. You could prevent people from entering the freeway by not letting cars with fewer than five people into a given lane. That’s a potential alternative. And there are several other ways to do this. But even when we understand this, which is
not happening tomorrow– how to actually do something about it is even harder to conceive. And that’s something where a lot of work is
going to be needed. And that work will be needed essentially in
an area which is kind of in the middle of technology and policy. And that’s one of the areas that we’re very
excited to work on at Berkeley. So to finish in the last two or three minutes,
I’d also like to talk about a few other mobile sensor problems that we’ve been working on,
which are paramount to civil engineering. In fact, one of the first apps we worked on
when I joined the faculty 10 years ago was iShake. iShake was one of the first apps to monitor earthquakes using smartphones. Another beauty of smartphones is that they have accelerometers. And so one of the things we did initially
is try the following– this is a shake table; that’s what’s used to replicate earthquakes on scale models, if you’re not a structural engineer. So we put a bunch of smartphones on shake
tables and tried to see how good a measurement you would get, and whether that would enable us to measure earthquakes. It turned out that it worked quite well. So the seismo lab took this, and today MyShake is an app which can be used to monitor earthquakes using smartphones. Another thing which we were also very excited
about is this is– you just saw a phone that shakes. We just spoke about phones that drive. That’s a phone that swims. One of the things we did initially when we
started that mobile sensing program is trying to understand how we could use mobile phones
in floaters to monitor water quality and river motion. California has a lot of water problems, as
you know. And so we built these 100 floating sensors, which essentially have propellers and can measure things, and then deployed them in the Sacramento delta. Watching water is about as boring as watching traffic, so this movie is accelerated– they don’t actually swim as fast as you see here. But the point is that they can readjust
themselves and float along the river, transmitting their data, so that we can reconstruct river flow in real time. And then maybe one of the last mobile sensing
problems we have started to work on recently– you can see that the theme is really you always
try to use mobile measurements to reconstruct something. There is a problem that is really dear to
a lot of our members in our group, because a lot of us have had families affected by
Alzheimer’s disease. And so we try to instrument patients with
Alzheimer’s disease to see if we could monitor falls. And we thought smartwatches were going to be the next thing. It turned out they really didn’t work well. But one thing we’ve managed to do today is use video cameras with deep learning. So we deploy– this is an Alzheimer’s patient in a memory care facility we’re operating in, in California. Using deep learning and video camera networks,
we’re actually able not only to detect when people fall– the red color here indicates the algorithm has detected that this patient has fallen down– but also to start to collect
enough data to predict when people will fall. And it’s interesting because in these types of situations, you can see that a fall is about to happen by the way the person walks
or by the way the person interacts. And so what we really hope with this is to
be able to reduce the number of falls in memory care facilities by a lot of different techniques
that we’re working on with occupational therapists. So we’ve come a long way from mobile phones used for traffic, which is still kind of the core of what our group does, to a lot
of different applications. And health care is one that is of particular
interest. So with this, I’d like to stop. I, again, would like to thank the University
of Washington for inviting me for this lecture, the Wenk family for their generous endowment,
Jeff for being my host. And if we have time, I’d love to answer any
questions you have. Thank you.

One thought on “Road Work: Using Mobile Data to Alleviate Traffic Congestion”

  1. I think that Prof. Bayen is perhaps too cynical about the willingness of routing providers to collaborate in order to reduce overall trip times (especially Google). As routing providers get more penetration, they inevitably control enough of the system to affect their own route selections; I'm sure Google knows enough about users to predict, or maybe they could just ask, that they may not want the fastest route for a particular trip, necessarily. I'm sure there is some room for collaboration between the top two, or maybe the biggest and third biggest.
