[ipv6-wg] Re: [address-policy-wg] Re: 200 customer requirements for IPv6

Iljitsch van Beijnum iljitsch at muada.com
Tue Dec 6 21:42:57 CET 2005

On 6-dec-2005, at 20:10, Oliver Bartels wrote:

>> I can't imagine what such a layer would look like...

> Clustering all PI-prefixes originating at the same AS to form
> a single "super-prefix" makes policy processing a lot easier,
> because it needs to be done just once for the whole block.

I'm not sure I understand the "superprefix" but obviously a lot of  
work that now happens per-prefix in BGP should happen per-AS. But  
that's mostly moot in IPv6 as we should never reach the numbers of  
prefixes per AS that we see in IPv4.

> With as few as 256MByte of DDRAM plus a 64K TCAM chip it is
> possible to handle max. 8 Million Forwarding entries at full 128 bit
> resolution

I guess that means you throw everything in the TCAM first to get from  
8M to about 125 entries and then look those up in a tree or hash table?

Obviously it's possible to build architectures that allow fast  
forwarding with big tables. However, this doesn't come free: it takes  
more iron to do this, and also more power. TCAMs suck down the juice  
like a depressed alcoholic. This is bad for your design (both the box  
itself and the datacenter), your wallet and the environment.

> I personally just received a patent on such router hardware concept.

So big routing tables are good business for you, then?

>> Sure. But trying to aggregate on network topology is never going to
>> work for two very simple reasons:

>> 1. It changes all the time

> The same is true for geographical aggregation.

I guess if you live in California or another place that is plagued by
frequent earthquakes...

> Geographical aggregation would require free transit, otherwise
> it is not compatible with the ISP's business models.

The point is to keep the aggregation inside the ISP network. Tier-1  
ISPs would still have a full routing table, but rather than have a  
copy in each router, it's distributed over the network. So there is  
no free transit requirement.

> country boundaries.

> Such boundaries are artificial, the EU tries to avoid them.

The idea behind aggregation is that you can move up or down. If  
country borders get in the way, drill down a bit and start looking at  
provinces or cities. In our design there are potentially 64k distinct  
areas (mostly cities) so if you want, you can have a route for each  
of those in your routing table and never run into country borders.

>> 2. You can't express a topology with loops in it in an addressing
>> hierarchy

> Avoiding loops is the job of the routing protocol, not the
> topology.

??? Are we talking about spanning tree now? Loops in the topology are  
good. You can't remove them. Routing is also dynamic, BTW.

>> Distance is actually very important. It's very hard to do decent high
>> speed file transfer on out of the box OSes and applications with high
>> delay. Also, it often makes sense to backhaul traffic over SOME
>> distance, but that doesn't mean it also makes sense to backhaul it
>> over even larger distances. I.e., even if a link to New York is
>> cheap, you don't want to go over Palo Alto.

> If I would be located in Seattle, Palo Alto might be an alternative
> way point as well as Chicago or even Dallas.

Of course. But we're in Europe. If you're in Seattle you'll see a lot  
of your traffic to other people in Seattle flow through Palo Alto.  
That's normal, because it's not economical to peer with everyone  
everywhere. It's not so cool when intra-Seattle traffic starts to  
flow through Miami.

> What you are trying is:

> Map a two-and-a-half-dimensional world on a one-dimensional
> address range. This won't work by Math.
> Dimensions can only be replaced by dimensions.

Ah, but we're not mathematicians but engineers. In software, you have  
one dimensional memory. Still, you can have multidimensional arrays.
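
To illustrate the point about one-dimensional memory, here is a minimal sketch of row-major indexing, one common way a two-dimensional array is laid out in flat memory. The names `flat_index` and `memory` are made up for this example.

```python
def flat_index(row: int, col: int, n_cols: int) -> int:
    """Map a 2-D (row, col) coordinate to an offset in flat memory (row-major)."""
    return row * n_cols + col

n_rows, n_cols = 3, 4
memory = [0] * (n_rows * n_cols)   # one-dimensional storage for a 3x4 array

# Write to element (2, 1) of the logical 2-D array.
memory[flat_index(2, 1, n_cols)] = 42
offset = flat_index(2, 1, n_cols)
```

Neighbours within a row stay adjacent in memory, while neighbours within a column end up `n_cols` slots apart, so the mapping works but does not preserve locality equally in both dimensions.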

> Ask a database programmer how difficult it is to implement
> a geographical 10km around some place search on a database
> and ask them about the algorithms in use.

Easy: select everything that's in a 20x20 km grid around the center
point and then do the real distance calculation on everything in that
grid. Obviously you'll select stuff that's at x+8, y+8 = ~11km from
the center, but that's only true for a relatively small part of the
intermediate results.
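
The two-step search described above could be sketched roughly as follows. This assumes planar coordinates already expressed in kilometres (a real database would have to deal with latitude/longitude), and `within_radius` is a hypothetical helper name.

```python
import math

def within_radius(points, center, radius_km=10.0):
    """Return the points within radius_km of center (planar km coordinates)."""
    cx, cy = center
    # Step 1: cheap bounding-box filter, a 2r x 2r square around the center.
    candidates = [
        (x, y) for (x, y) in points
        if abs(x - cx) <= radius_km and abs(y - cy) <= radius_km
    ]
    # Step 2: exact distance calculation, only on what survived step 1.
    return [
        (x, y) for (x, y) in candidates
        if math.hypot(x - cx, y - cy) <= radius_km
    ]

points = [(0, 0), (3, 4), (8, 8), (10, 0), (15, 2)]
result = within_radius(points, (0, 0))
```

Here (8, 8) passes the box filter but is dropped by the distance check, because it sits in a corner of the square at roughly 11.3 km from the center; (15, 2) never gets past the box filter.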

> What they try is interleaving the West-East (X) and North-South (Y)
> coordinates bitwise in the search key and handle overruns by  
> exceptions,

That sounds like Tony Hain's geographical addressing.
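
For readers unfamiliar with the bitwise interleaving mentioned in the quoted text, a minimal sketch (often called a Z-order or Morton key) might look like this. It is an illustration of the general technique only, not a description of any particular addressing proposal, and `interleave_bits` is a made-up name.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one search key.

    Bit i of x lands on key bit 2*i, bit i of y on key bit 2*i + 1.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key

key = interleave_bits(3, 5, bits=4)

# Two cells that are direct neighbours on the X axis...
left = interleave_bits(7, 0, bits=4)
right = interleave_bits(8, 0, bits=4)
```

The last two keys show why extra handling is needed: the cells at x=7 and x=8 are adjacent on the map, yet their keys are far apart, so a simple range scan on the key misses neighbours across such boundaries.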

The variant Michel Py and I came up with is based on
administrative borders such as countries, so you already have one
dimension: the alphabet. (Ok, not entirely how it works, but still.)

> However this requires a _significant_ exception handling effort,
> nothing someone would like to implement in a fast forwarding
> engine for packet routing.

Geography is long gone by the time we're forwarding packets.

>> Today, IPv4 routing works but it has come close to the edge of the
>> cliff twice (early 1990s just before CIDR routing tables were too
>> large and late 1990s flapping cost too much CPU) and it's still
>> pushing towards that edge, which we can't see clearly but know is out
>> there somewhere.

> It works. Period.

Hm, if you only discern "works" / "doesn't work" it's hard to say
anything about the routing system... Some quantitative and
qualitative analysis can be helpful.

> And it will continue to work, because of the
> economic pressure. Engineers have found a solution, thus:
> Don't worry.

Guess what. I'm an engineer. I'm working on this stuff. And I'm  
saying: when de facto unlimited PI is allowed, it may not mean the  
end of the internet, but it's certainly reason to worry. Of course  
things will continue to work. However, they'll be less reliable and  
more expensive.

>> So because you can't prove that you're right I should just believe
>> you without proof?

> Yes, because the theory of computer science gives you the
> proof that there are theorems in this world which can't be proven.

There are also many theorems that turn out to be false. Proof is a  
pretty good method to avoid those. If we can't have proof we'll need  
to have less reliable methods to avoid them. Just accepting anything is
not the solution.

>> The scenario that de facto unlimited PI in IPv6 will make routing
>> tables so large that it becomes problematic in some way or another is
>> entirely reasonable, on the other hand.

> The current experience lets us make a reasonable and responsible
> assumption that an IPv6 routing table would take not much more
> growth than the current IPv4 table, whereas current technology
> permits tables of 10 to 100 times that size.

Today, people sometimes deaggregate a /16. That's bad: 255  
unnecessary routes. What if they do the same thing with a /32 in  
IPv6? 65535 unnecessary routes. That will probably kill most existing  
IPv6 routers today.
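
The arithmetic behind those numbers can be checked with a short sketch. It assumes the /16 is split into /24s and the /32 into /48s, which is what the figures 255 and 65535 imply; the prefixes used are reserved example ranges and `extra_routes` is a hypothetical helper.

```python
import ipaddress

def extra_routes(prefix: str, new_len: int) -> int:
    """Routes added when `prefix` is fully deaggregated into /new_len pieces.

    Splitting yields 2**(new_len - prefixlen) routes; one route (the
    aggregate) was needed anyway, so the rest are unnecessary.
    """
    net = ipaddress.ip_network(prefix)
    return 2 ** (new_len - net.prefixlen) - 1

v4_extra = extra_routes("10.0.0.0/16", 24)      # assumed split into /24s
v6_extra = extra_routes("2001:db8::/32", 48)    # assumed split into /48s
```

The same 16 bits of deaggregation in IPv6 that are only 8 bits in the IPv4 case account for the jump from hundreds to tens of thousands of extra routes.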

10 times is 1.75M routes, 100 times is 17.5M. The former is probably doable
for IPv4 on some extremely high end boxes but I'm not sure how those
would handle real issues such as flapping, lots of full feeds etc. I
don't believe the latter exists or will exist in the foreseeable
future, at least not in a way that anyone can afford to actually use.
Even those 1.75M boxes will be very expensive and only affordable by  
the largest networks. Don't forget you and I all pay for their  
hardware, directly or indirectly.

Iljitsch

-- 
I've written another book! http://www.runningipv6.net/
