[NANOG] Microsoft.com PMTUD black hole?

From: Nathan Anderson/FSR (no email)
Date: Tue May 06 2008 - 15:07:05 EDT

  • Next message: Brandon Butterworth: "Re: [NANOG] Microsoft.com PMTUD black hole?"

    Hello,

    Has anyone else here seen problems with microsoft/msn/hotmail/live.com
    sites not performing PMTUD correctly? We have, for a while now, had
    people on our network complain of poor microsoft.com reachability, and
    discovered we can work around the issue by changing MSS on all TCP SYN
    as they go out of our network.
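
    For illustration only, the MSS rewrite described above comes down to
    the arithmetic below. This is a minimal Python sketch, not any
    router's actual implementation: the function name is hypothetical,
    and it assumes 20-byte IPv4 and TCP headers with no options. The
    1492-byte MTU is the uplink figure quoted in the forwarded message
    further down.

```python
IPV4_HEADER_BYTES = 20  # assumption: no IP options
TCP_HEADER_BYTES = 20   # assumption: no TCP options on data segments


def clamped_mss(advertised_mss: int, path_mtu: int) -> int:
    """Return the MSS a clamping router would write into an outgoing SYN.

    The MSS is only ever lowered, never raised, so a host that already
    advertises a small MSS is left alone.
    """
    largest_safe_mss = path_mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES
    return min(advertised_mss, largest_safe_mss)


# A client on a 1500-byte LAN advertises 1460; the uplink MTU is 1492.
print(clamped_mss(1460, 1492))  # 1452
```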

    I recently watched the whole conversation between msn.com and a host on
    our network (with the MSS rewrite disabled), and if I'm reading it
    right, we are following PMTUD protocol correctly by sending back ICMP
    type 3 code 4, but all Microsoft hosts seem to ignore this and continue
    to send packets back to our host at a size that is too large.

    I hope I'm wrong and that it is we who are doing something stupid, but
    after cruising Google for a while, I found a multitude of other
    complaints from people connected to other ISPs specifically about not
    being able to reach Microsoft web sites. It seems crazy that MS could
    have PMTUD broken for so long with nobody ever raising a complaint to
    them directly, though, which makes me wonder if there is another answer
    here that I'm missing.

    I sent the following message to a couple of addresses that I gleaned
    from ARIN WHOIS for the IP block in question and threw hostmaster in
    there just in case it went somewhere, but appears to
    be defunct. I have yet to receive acknowledgment of receipt from the
    other address.

    Are there any microsoft.com admins that hang out here that can comment
    on this or get in touch with me, or is there perhaps someone on here
    with connections to the Microsoft NOC?

    (BTW, I stripped the referenced libpcap attachment off of this message
    to the list just so that I wouldn't accidentally incur the wrath of
    NANOG...if y'all want to see it, I'm happy to post it.)

    Thanks,

    -- 
    Nathan Anderson
    First Step Internet, LLC

    -------- Original Message --------
    Subject: Microsoft/MSN/Live!/Hotmail behind blackhole router?
    Date: Thu, 01 May 2008 19:00:46 -0700
    From: Nathan Anderson/FSR <>
    To: , , 
    To microsoft.com NOC admins:

    I work for a regional ISP in the inland Pacific Northwest.  Many of our
    customers' connections have MTUs of less than 1500, and we get routine
    complaints from them that they have trouble reaching web sites that are
    under your administration.

    Usually we can fix the problem by "mangling" the TCP SYNs originating
    from our customers and headed to the world to reflect a lower value;
    however, we would rather not have to do that.  The fact that we are
    REQUIRED to do this in order for your sites to be reachable by our
    customers strongly suggests that either the servers that respond to HTTP
    requests sent to www.microsoft/msn/hotmail/live.com are behind routers
    that are blocking ALL ICMP traffic sent their way -- even ICMP type 3
    code 4 (packet too large, DF set), which is necessary in order for Path
    MTU Discovery to work -- or the servers themselves are not listening to
    the ICMP messages that we are sending their way when our routers are
    forced to drop a packet sent by you which is too large to be forwarded
    to a customer of ours.
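
    To make the ICMP message in question concrete, here is a hedged
    Python sketch of its layout as defined in RFC 1191: type 3, code 4,
    a checksum, two unused bytes, and a 16-bit next-hop MTU, followed by
    the IP header and first 8 bytes of the dropped datagram. The helper
    names are hypothetical, and the quoted datagram is a zero-filled
    placeholder rather than real captured data.

```python
import struct

ICMP_DEST_UNREACHABLE = 3
ICMP_FRAG_NEEDED_DF_SET = 4


def inet_checksum(data: bytes) -> int:
    """Standard Internet checksum: one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_frag_needed(next_hop_mtu: int, quoted: bytes) -> bytes:
    """Build the ICMP message a router sends when it drops a DF packet."""
    header = struct.pack("!BBHHH", ICMP_DEST_UNREACHABLE,
                         ICMP_FRAG_NEEDED_DF_SET, 0, 0, next_hop_mtu)
    checksum = inet_checksum(header + quoted)
    header = struct.pack("!BBHHH", ICMP_DEST_UNREACHABLE,
                         ICMP_FRAG_NEEDED_DF_SET, checksum, 0, next_hop_mtu)
    return header + quoted


def parse_next_hop_mtu(icmp: bytes):
    """Return the advertised next-hop MTU, or None for other ICMP messages."""
    icmp_type, code, _checksum, _unused, mtu = struct.unpack("!BBHHH", icmp[:8])
    if icmp_type == ICMP_DEST_UNREACHABLE and code == ICMP_FRAG_NEEDED_DF_SET:
        return mtu
    return None


# Placeholder for the quoted IP header (20 bytes) plus 8 bytes of payload.
message = build_frag_needed(1492, bytes(28))
print(parse_next_hop_mtu(message))  # 1492
```

    A sender that never receives this message, or receives it and
    ignores the MTU field, has no way to learn that its packets are too
    large.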

    I set up a test connection "on the bench" so to speak, and had our
    router capture a copy of the conversation between our test client and
    www.msnbc.msn.com and forward that conversation encapsulated in TZSP to
    the same test client over a different interface.  The capture clearly
    shows our test client establishing the TCP connection with MSNBC
    (SYN/SYN+ACK/ACK), and then goes on to show MSNBC send ethernet
    MTU-sized packets our way that an intermediate router of ours drops and
    responds with "packet too big, DF set."  Despite this, MSNBC continues
    to retransmit the original packet with the same payload and the same size
    back to us.  We continue to respond "packet too big, DF set," but the
    MSNBC server never seems to get the message (literally).
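
    The difference between a working PMTUD sender and the behavior in
    the capture can be sketched as a toy simulation. This is a
    deliberately simplified Python model, not a TCP implementation: the
    function name, the retry limit, and the single bottleneck link are
    all assumptions made for illustration.

```python
def send_with_pmtud(packet_size: int, sender_sees_icmp: bool,
                    link_mtu: int = 1492, max_tries: int = 5):
    """Simulate a sender pushing one DF-marked packet across a small link.

    Returns (sizes_attempted, delivered). When the packet is too big the
    router drops it and reports link_mtu via ICMP type 3 code 4; the
    sender only adapts if that ICMP message actually reaches it.
    """
    attempts = []
    size = packet_size
    for _ in range(max_tries):
        attempts.append(size)
        if size <= link_mtu:
            return attempts, True   # packet fits and is forwarded
        if sender_sees_icmp:
            size = link_mtu         # RFC 1191: shrink to the reported MTU
        # otherwise the sender times out and retransmits at the same size
    return attempts, False


print(send_with_pmtud(1500, sender_sees_icmp=True))
# ([1500, 1492], True)
print(send_with_pmtud(1500, sender_sees_icmp=False))
# ([1500, 1500, 1500, 1500, 1500], False)
```

    The second call mirrors what the capture shows: the same full-sized
    packet retransmitted over and over, never getting through.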

    We see the same behavior with all sites across the board contained
    within the 207.46.0.0/16 space, regardless of actual hostname/FQDN.

    We also find this ironic considering that Microsoft published a Technet
    article a few years back on black hole routers and the problems they
    pose, found at http://technet.microsoft.com/en-us/library/bb878081.aspx
    (which we can't read/access unless we are mangling the MSS).

    We would appreciate it if Microsoft NOC admins would please look into
    the matter and take the appropriate corrective action: allowing ICMP
    type 3 code 4 messages through your routers/firewalls, and making sure
    that your servers respond to them appropriately as defined in RFC 1191.

    I have attached the capture we made of the conversation to this e-mail
    message in libpcap format for your analysis.  The test client itself had
    a 1500 MTU to a desktop router, which in turn had an MTU of 1492 on its
    uplink to us.

    I am available to answer any additional clarifying questions you may have.

    Thank you for your time and attention to this matter.

    Regards,
    -- 
    Nathan Anderson
    First Step Internet, LLC
    _______________________________________________
    NANOG mailing list
    http://mailman.nanog.org/mailman/listinfo/nanog
    
