There are now several dozen companies trying to make the technology for autonomous cars, across OEMs, their traditional suppliers, existing major tech companies and startups. Clearly, not all of these will succeed, but enough of them have a chance that one wonders what and where the winner-take-all effects could be, and what kinds of leverage there might be. Are there network effects that would allow the top one or two companies to squeeze the rest out, as happened in smartphone or PC operating systems? Or might there be room for five or ten companies to compete indefinitely? And for what layers in the stack does victory give power in other layers?
These kinds of questions matter because they point to the balance of power in the car industry of the future. A world in which car manufacturers can buy commodity ‘autonomy in a box’ from any of half a dozen companies (or make it themselves), much as they buy ABS today, is very different from one in which Waymo and perhaps Uber are the only real options, and can set the business model of their choice, as Google did with Android. Microsoft and Intel found choke points in the PC world, and Google did in smartphones - what might those points be in autonomy?
To begin with, it seems pretty clear that the hardware and sensors for autonomy - and, probably, for electric - will be commodities. There is plenty of science and engineering in these (and a lot more work to do), just as there is in, say, LCD screens, but there is no reason why you have to use one rather than another just because everyone else is. There are strong manufacturing scale effects, but no network effect. So, LIDAR, for example, will go from a ‘spinning KFC bucket’ that costs $50k to a small solid-state widget at a few hundred dollars or less, and there will be winners within that segment, but there’s no network effect, while winning LIDAR doesn’t give leverage at other layers of the stack (unless you get a monopoly), any more than making the best image sensors (and selling them to Apple) helps Sony’s smartphone business. In the same way, it’s likely that batteries (and motors and battery/motor control) will be as much of a commodity as RAM is today - again, scale, lots of science and perhaps some winners within each category, but no broader leverage.
On the other hand, there probably won’t be direct parallels to the third-party software developer ecosystems that we see in PCs or smartphones. Windows squashed the Mac and then iOS and Android squashed Windows Phone because of the virtuous circle of developer adoption above anything else, but you won’t buy a car (if you own a car at all, of course) based on how many apps you can run on it. They’ll all run Uber and Lyft and Didi, and have Netflix embedded in the screens, but any other apps will happen on your phone (or watch, or glasses).
Rather, the place to look is not within the cars directly but still further up the stack - in the autonomous software that enables a car to move down a road without hitting anything, in the city-wide optimisation and routing that mean we might automate all cars as a system, not just each individual car, and in the on-demand fleets of 'robo-taxis' that will ride on all of this. The network effects in on-demand are self-evident, but it will get much more complex with autonomy (which will cut the cost of an on-demand ride by three quarters or more). On-demand robo-taxi fleets will dynamically pre-position their cars, and both these and quite possibly all other cars will co-ordinate their routes in real time for maximum efficiency, perhaps across fleets, to avoid, for example, all cars picking the same route at the same time. This in turn could be combined not just with surge pricing but with all sorts of differential road pricing - you might pay more to get to your destination faster in busy times, or pick an arrival time by price.
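As a toy sketch of what that fleet-level coordination implies (entirely hypothetical - the route names and the tie-breaking rule here are mine, not any real dispatcher's), a router might assign each new trip to whichever candidate route currently carries the least fleet load, rather than letting every car greedily pick the same ‘fastest’ road:

```python
from collections import defaultdict

def assign_route(candidate_routes, route_load):
    """Pick the candidate route with the least current fleet load.

    A real dispatcher would weigh travel time, demand forecasts and
    road pricing; this toy version only spreads cars out so they
    don't all choose the same route at the same time."""
    return min(candidate_routes, key=lambda r: route_load[r])

route_load = defaultdict(int)            # live count of fleet cars per route
for _trip in range(5):                   # five trips, same three route options
    route = assign_route(["A", "B", "C"], route_load)
    route_load[route] += 1
print(dict(route_load))                  # {'A': 2, 'B': 2, 'C': 1}
```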
From a technological point of view, these three layers (driving, routing & optimisation, and on-demand) are largely independent - you could install the Lyft app in a GM autonomous car and let the pre-installed Waymo autonomy module drive people around, hypothetically. Clearly, some people hope there will be leverage across layers, or perhaps bundling - Tesla says that it plans to forbid people from using its autonomous cars with any on-demand service other than its own. This doesn't work the other way - Uber won't insist you use only its own autonomous systems. But though Microsoft cross-leveraged Office and Windows, both of these won in their own markets with their own network effects: a small OEM insisting you use its small robo-taxi service would be like Apple insisting you buy AppleWorks instead of Microsoft Office in 1995. I suspect that a more neutral approach might prevail. This would especially be the case if we have cross-city co-ordination of all vehicles, or even vehicle-to-vehicle communication at junctions - you would need some sort of common layer (though my bias is always towards decentralised systems).
All this is pretty speculative, though, like trying to predict what traffic jams would look like from 1900. The one area where we can talk about what the key network effects might look like is in autonomy itself. This is about hardware, and sensors, and software, but mostly it's about data, and there are two sorts of data that matter for autonomy - maps and driving data. First, ‘maps.’
Our brains are continuously processing sensor data and building a 3D model of the world around us, in real time and quite unconsciously, such that when we run through a forest we don’t trip over a root or bang our head on a branch (mostly). In autonomy this is referred to as SLAM (Simultaneous Localisation And Mapping) - we map our surroundings and localise ourselves within them. This is obviously a basic requirement for autonomy - AVs need to work out where they are on the road and what features might be around (lanes, turnings, curbs, traffic lights etc), and they also need to work out what other vehicles are on the road and how fast they’re moving.
Doing this in real time on a real road remains very hard. Humans drive using vision (and sound), but extracting a sufficiently accurate 3D model of your surroundings from imaging alone (especially 2D imaging) remains an unsolved problem: machine learning makes it conceivable but no-one can do it yet with the accuracy necessary for driving. So, we take shortcuts. This is why almost all autonomy projects are combining imaging with 360-degree LIDAR: each of these sensors has its limitations, but by combining them (‘sensor fusion’) you can get a complete picture. Building a model of the world around you with imaging alone will certainly be possible at some point in the future, but using more sensors gets you there a lot quicker, even given that you have to wait for the cost and form factor of those sensors to become practical. That is, LIDAR is a shortcut to get to a model of the world around you. Once you've got that, you often use machine learning to understand what's in it - that shape is a car, that one is a cyclist - but here there doesn't seem to be a network effect (or at least not a strong one): you can get enough images of cyclists yourself without needing a fleet of cars.
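To make ‘sensor fusion’ slightly more concrete, here is a minimal sketch (my own toy example with hypothetical numbers, not any shipping system) of its simplest form: fusing two noisy distance estimates of the same object - one from a camera, one from LIDAR - weighted by how much you trust each sensor:

```python
def fuse(camera_dist, camera_var, lidar_dist, lidar_var):
    """Inverse-variance weighted average of two independent estimates
    of the same distance - the textbook one-dimensional special case
    of a Kalman update. The noisier sensor simply gets less weight."""
    w_cam = 1.0 / camera_var
    w_lid = 1.0 / lidar_var
    fused = (w_cam * camera_dist + w_lid * lidar_dist) / (w_cam + w_lid)
    fused_var = 1.0 / (w_cam + w_lid)
    return fused, fused_var

# Hypothetical numbers: the camera thinks a cyclist is ~21m away but is
# noisy (variance 2.0); LIDAR says 20.2m with far less noise (0.1).
dist, var = fuse(21.0, 2.0, 20.2, 0.1)
print(round(dist, 2), round(var, 3))     # 20.24 0.095
```

The fused estimate sits close to the LIDAR reading, which is the whole point: where one sensor is weak, the other covers for it.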
If LIDAR is one shortcut to SLAM, the other and more interesting one is to use pre-built maps, which actually means ‘high-definition 3D models’. You survey the road in advance, process all the data at leisure, build a model of the street and then put it onto any car that’s going to drive down the road. The autonomous car doesn’t now have to process all that data and spot the turning or traffic light against all the other clutter in real time at 65 miles an hour - instead it knows where to look for the traffic light, and it can take sightings of key landmarks against the model to localise itself on the road at any given time. So, your car uses cameras and LIDAR to work out where it is on the road and where the traffic signals etc are by comparing what it can see with a pre-built map instead of having to do it from scratch, and also uses those inputs to spot other vehicles around it in real time.
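That localisation step can be caricatured in a few lines (a toy with hypothetical coordinates, not how any production stack does it): if the pre-built map says where each landmark sits in world coordinates, and the sensors say where each landmark sits relative to the car, then every matched landmark implies a car position, and averaging those implied positions gives a simple fix:

```python
# Pre-built HD map: landmark -> position in world coordinates (metres).
hd_map = {"traffic_light_17": (105.0, 42.0),
          "lane_marker_203":  (98.0, 40.0)}

# Live sensor output: landmark -> position relative to the car.
observed = {"traffic_light_17": (12.1, 3.9),
            "lane_marker_203":  (5.0, 2.1)}

def localise(hd_map, observed):
    """Each matched landmark implies a car position (its world position
    minus the car-relative offset); average the implied positions."""
    xs, ys = [], []
    for name, (dx, dy) in observed.items():
        mx, my = hd_map[name]
        xs.append(mx - dx)
        ys.append(my - dy)
    return sum(xs) / len(xs), sum(ys) / len(ys)

print(localise(hd_map, observed))        # (92.95, 38.0)
```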
Maps have network effects. When any autonomous car drives down a pre-mapped road, it is both comparing the road to the map and updating the map: every AV can also be a survey car. If you have sold 500,000 AVs and someone else has only sold 10,000, your maps will be updated more often and be more accurate, and so your cars will have less chance of encountering something totally new and unexpected and getting confused. The more cars you sell the better all of your cars are - the definition of a network effect.
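That update loop is easy to caricature too (again, purely illustrative): every car that traverses a mapped tile compares what it sees with the stored version, and any disagreement is folded back into the shared map, so a bigger fleet means fewer stale tiles:

```python
shared_map = {"tile_42": {"lanes": 2, "observations": 1}}

def drive_through(tile_id, seen_lanes):
    """An AV doubling as a survey car: correct the tile if reality has
    changed, and bump the observation count either way."""
    tile = shared_map[tile_id]
    if tile["lanes"] != seen_lanes:      # e.g. roadworks changed the layout
        tile["lanes"] = seen_lanes       # fold the correction back in
    tile["observations"] += 1            # fresher with every pass

drive_through("tile_42", seen_lanes=3)   # one car spots the change...
drive_through("tile_42", seen_lanes=3)   # ...the next simply confirms it
print(shared_map)   # {'tile_42': {'lanes': 3, 'observations': 3}}
```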
The risk here is that in the long term it is possible that just as cars could do SLAM without LIDAR, they could also do it without pre-built maps - after all, again, humans do. When and whether that would happen is unclear, but at the moment it appears that it would be long enough after autonomous cars go on sale that all the rest of the landscape might look quite different as well (that is, 🤷🏻‍♂️).
So, maps are the first network effect in data - the second comes in what the car does once it understands its surroundings. Driving on an empty road, or indeed on a road full of other AVs, is one problem, once you can see it, but working out what the other humans on the road are going to do, and what to do about it, is another problem entirely.