Re: shim6 @ NANOG (forwarded note from John Payne)

From: Joe Abley (no email)
Date: Wed Mar 01 2006 - 10:07:39 EST

  • Next message: John Payne: "Re: shim6 @ NANOG (forwarded note from John Payne)"

    On 1-Mar-2006, at 02:56, Kevin Day wrote:

    > On Mar 1, 2006, at 12:47 AM, Joe Abley wrote:
    >>
    >>> o a small to medium multi-homed tier-n isp
    >>
    >> A small-to-medium, multi-homed, tier-n ISP can get PI space from
    >> its RIR, and doesn't need to worry about shim6 at all. Ditto
    >> larger ISPs, up to and including the largest.
    >
    > If you include "Web hosting company" in your definition of ISP,
    > that's not true.

    Right. I wasn't; I listed them separately.

    It's important to note that even if you are a hosting company who
    *does* qualify for PI v6 space, you still need shim6-capable servers,
    if you want to make them optimally available to multi-homed, shim6-
    capable hosts. The difference PI makes is in the distribution of
    addresses to servers (the servers only need a single set).

    > You don't get PI space, and Shim6 is looking like your only
    > alternative for multihoming.

    Right. For a hosting company with multiple PA netblocks, shim6 is the
    option on the table.

    > Many content providers set up multiple non-interconnected POPs in
    > different geographical locations. The only way this can be
    > accomplished is by making separate announcements in each POP for
    > each space. This means either being able to deaggregate, or to get
    > a block for each POP. I don't know of *ANY* that are deploying
    > 5000+ servers per POP.

    Right. With shim6, getting a block per POP is trivial, since they are
    all PA assignments from transit providers.

    > I'm just one guy, one ASN, and one content/hosting network. But I
    > can tell you that to switch to using shim6 instead of BGP speaking
    > would be a complete overhaul of how we do things.

    You are not alone in fearing change.

    > Putting routing decisions in the control of servers we don't
    > operate scares me. I wouldn't rely on 90% of our customers to get
    > this right unless it was completely idiot proof. Even if it was, I
    > don't see how we can trust that users aren't messing with things to
    > "game the system" somehow.

    This is the kind of feedback that the shim6 architects need. There is
    talk at present of whether the protocol needs to be able to
    accommodate a site-policy middlebox function to enforce site policy
    in the event that host behaviour needs to be controlled. The scope of
    that policy mediation function depends strongly on people like you
    saying "at a high level, this is the kind of decision I am not happy
    with the hosts making".

    > We deal with long lived TCP sessions (hours/days). I don't see how
    > routing updates can happen that won't result in a disconnect/
    > reconnect, which isn't acceptable.

    One of the primary objectives of shim6 is to provide session
    survivability over re-homing events. Since routing protocols are not
    used to manage re-homing, the speed at which a session can recover
    from a topological event depends on the operation of the shim6
    protocol between client and server.

    It seems reasonable to say that in some cases shim6 re-homing
    transitions will be faster than the equivalent routing transition in
    v4; in other cases they will be slower. Depends on the network, and
    how enthusiastically you flap, perhaps.

    The experience of people who provide services involving long-held TCP
    sessions is exactly the kind of thing that the shim6 architects need
    to hear about.

    > We have peering arrangements with about 120 ASNs. How do we mix BGP
    > IPv6 peering and Shim6 for transit?

    You advertise all your PA netblocks to all your peers.

    > So far it looks like Shim6 is going to rely on DNS. The DNS caching
    > issue is a real problem. We need changes to happen faster than DNS
    > caching will allow.

    Well, not quite.

    If you change a transit provider, then you need to remove a set of
    AAAA records from the servers you operate, and substitute a new set.
    The time taken for this change to propagate in the DNS is non-zero,
    assuming you use reasonable TTLs. This is your point above, I think.

    With shim6-capable clients and servers, the dark period during which
    the changes propagate is handled by an address selection/retry
    algorithm in the client (for new sessions) and by the shim6 protocol
    doing failure detection and selecting a new locator (for established
    sessions).
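
    As a rough illustration of the client-side retry for new sessions,
    here is a minimal Python sketch. It is not shim6, and it is not any
    standardized address selection algorithm; it only shows the idea of
    walking the AAAA set until one address answers. The names
    first_reachable and tcp_connect are hypothetical, and the 2001:db8::
    addresses are documentation placeholders.

```python
import socket


def first_reachable(addresses, connect):
    """Try each address in order and return (connection, failures).

    `connect` is any callable that takes one address and either returns
    a connection object or raises OSError. `failures` lists the
    (address, error) pairs that were tried before one succeeded.
    """
    failures = []
    for addr in addresses:
        try:
            return connect(addr), failures
        except OSError as exc:
            # Possibly a stale AAAA record for a provider that has gone
            # away; remember the failure and move on to the next one.
            failures.append((addr, exc))
    raise ConnectionError(f"none of {len(failures)} addresses were reachable")


def tcp_connect(port, timeout=3.0):
    """Build a `connect` callable that opens a TCP session to `port`."""
    def connect(addr):
        return socket.create_connection((addr, port), timeout=timeout)
    return connect
```

    A client holding both an old and a new AAAA record might call
    first_reachable(["2001:db8:a::1", "2001:db8:b::1"], tcp_connect(80)).
    The short timeout matters: the length of the dark period seen by a
    new session is roughly the timeout multiplied by the number of dead
    addresses tried first.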

    Once the DNS change has propagated, the address selection and shim6
    band-aids are no longer required, and clients have an accurate set of
    information.

    Renumbering for hosting providers can be a monstrous pain in the
    neck, especially for hosting providers who rely on third parties (or,
    horrors, their customers) to maintain the zone files within which
    services are named.

    Some hosting providers of my acquaintance insist on customer zones
    being redelegated to the hosting providers' nameservers, so that any
    renumbering that needs to happen can be coordinated by the hosting
    provider directly. Hosting providers who don't do this, and who use
    PA addresses with shim6 to multi-home, are definitely going to face
    some challenges.

    > Our network is complicated. We have a /21 that's split into 4 /23s.
    > One for each non-interconnected POP. We only advertise the /23 for
    > each POP out to transit, but we give peers access to our entire
    > network wherever they peer with us and we pay to haul/tunnel it
    > around. How do we even do this without PI space, let alone through
    > shim6?

    You avoid it completely, and use PA space in every POP. You can still
    announce PA space from other POPs to peers, if you want to retain
    your tunnels.
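
    For concreteness, the announcement pattern described above (one /23
    per POP towards transit, everything towards peers) can be sketched
    with Python's ipaddress module. The 10.0.0.0/21 block and the POP
    names are placeholders, not anyone's real addressing, and
    announcement_plan is a hypothetical helper.

```python
import ipaddress


def announcement_plan(aggregate, pops, new_prefix=23):
    """Split `aggregate` into one block per POP and list what each POP
    announces to transit (its own block) and to peers (every block)."""
    blocks = list(ipaddress.ip_network(aggregate).subnets(new_prefix=new_prefix))
    if len(blocks) != len(pops):
        raise ValueError(f"{len(blocks)} blocks for {len(pops)} POPs")
    own_block = dict(zip(pops, blocks))
    return {
        pop: {"transit": [own_block[pop]], "peers": list(blocks)}
        for pop in pops
    }


plan = announcement_plan("10.0.0.0/21", ["pop-a", "pop-b", "pop-c", "pop-d"])
for pop, routes in plan.items():
    print(pop, "transit:", routes["transit"][0], "peers:", len(routes["peers"]), "blocks")
```

    With PA space the per-POP blocks would come from each POP's transit
    providers rather than from one aggregate, but the split between what
    is sent to transit and what is sent to peers has the same shape.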

    > For quite the foreseeable future, we'd be running IPv4 and IPv6 at
    > the same time, over the same transit connections. We'd have to TE
    > our IPv6 bits completely differently than our IPv4 bits, even
    > though we'd be billed for the aggregate usage of both. Automated
    > tools for tweaking total usage per transit port is hard enough in
    > BGP. Having to tweak both BGP and some external shim6 method of TE
    > when the goal is a common aggregate number is going to be a very
    > difficult issue.

    Yep. Difficult and expensive.

    > Some of our applications are extremely sensitive to jitter/latency.
    > We've spent ages tweaking route-maps manually (and through
    > automated continual tweaking) to make sure we avoid any congested
    > links. [...]

    The site-policy middlebox that I alluded to earlier seems like the
    analogous place to specify this policy. Such a facility might
    actually give you more control than you have now -- tweaking BGP
    attributes to accomplish this kind of thing is often like a game of
    whack-a-mole; if you were able to control the route taken by traffic
    in both directions by influencing the locator selection for each and
    every session, you'd have far greater, and more fine-grained, control
    over your external traffic than BGP/swamp-abuse gives you currently.
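
    To make the idea of per-session locator selection slightly more
    concrete, here is a toy Python sketch. It is entirely hypothetical:
    shim6 does not currently define a site-policy interface, and the
    function name, the preference list and the 2001:db8:: prefixes are
    all assumptions made for illustration.

```python
import ipaddress
from itertools import product


def choose_locator_pair(local_locators, remote_locators,
                        prefix_preference, unreachable=frozenset()):
    """Pick a (local, remote) locator pair for one session.

    `prefix_preference` lists the site's local prefixes, most preferred
    first. Pairs in `unreachable` (for example, ones that failure
    detection has marked as dead) are skipped.
    """
    def rank(local):
        addr = ipaddress.ip_address(local)
        for position, prefix in enumerate(prefix_preference):
            if addr in ipaddress.ip_network(prefix):
                return position
        # Locators outside every listed prefix are least preferred.
        return len(prefix_preference)

    candidates = [pair for pair in product(local_locators, remote_locators)
                  if pair not in unreachable]
    if not candidates:
        raise LookupError("no usable locator pair")
    return min(candidates, key=lambda pair: rank(pair[0]))
```

    A site that wanted to steer traffic away from a congested transit
    provider would move that provider's prefix down the preference list,
    instead of adjusting BGP attributes and waiting to see what the rest
    of the network does with them.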

    Your specific requirements in this regard (the high-level objectives
    that you currently meet using BGP) would no doubt be gratefully
    received on the shim6 list.

    > We'd still be relying on PA space. No matter how great dhcp6 is,
    > there will be significant renumbering pain when providers are
    > changed. Static ACLs, firewall rules, etc. If you're including
    > customer machines in the renumbering, many simply won't do it.

    Agreed, renumbering is a pain. Dhcp6 sounds like a scary thing to use
    with servers. Customers suck. Change in operational practices will be
    required.

    Lest I sound too much like a foam-at-the-mouth shim6 advocate, I
    think it would be perfectly fine if, in the final analysis, the
    conclusion was that shim6 and PA/renumbering was not an option for
    hosting providers. A reasoned technical argument which came to that
    conclusion would provide a solid basis for the RIRs to modify their
    allocation policies such that hosting providers could use PI space
    instead. As perhaps the recent attempt to change the v6 PI policy
    indicates, the chances of making changes without such a reasoned
    argument are slim.

    However, I think it's possible that shim6, incorporating some
    facility for a site to manage the locator selection of the hosts,
    could actually make some things easier for hosting providers. There
    might even be reasons to like it :-)

    Joe

