From owner-tech+M42049=david=gwynne.id.au@openbsd.org Tue May 12 20:13:02 2015 Delivered-To: david@gwynne.id.au Received: by 10.194.71.146 with SMTP id v18csp1864545wju; Tue, 12 May 2015 03:13:03 -0700 (PDT) X-Received: by 10.52.131.9 with SMTP id oi9mr11106898vdb.76.1431425582909; Tue, 12 May 2015 03:13:02 -0700 (PDT) Return-Path: Received: from shear.ucar.edu (lists.openbsd.org. [192.43.244.163]) by mx.google.com with ESMTPS id fa3si14021124vdb.22.2015.05.12.03.13.02 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 12 May 2015 03:13:02 -0700 (PDT) Received-SPF: pass (google.com: domain of owner-tech+M42049=david=gwynne.id.au@openbsd.org designates 192.43.244.163 as permitted sender) client-ip=192.43.244.163; Authentication-Results: mx.google.com; spf=pass (google.com: domain of owner-tech+M42049=david=gwynne.id.au@openbsd.org designates 192.43.244.163 as permitted sender) smtp.mail=owner-tech+M42049=david=gwynne.id.au@openbsd.org Received: from openbsd.org (localhost [127.0.0.1]) by shear.ucar.edu (8.14.7/8.14.7) with ESMTP id t4CACsfx031488 for ; Tue, 12 May 2015 04:13:01 -0600 (MDT) Received: from relay3-d.mail.gandi.net (relay3-d.mail.gandi.net [217.70.183.195]) by shear.ucar.edu (8.14.7/8.14.7) with ESMTP id t4CACger017485 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NO) for ; Tue, 12 May 2015 04:12:43 -0600 (MDT) Received: from mfilter34-d.gandi.net (mfilter34-d.gandi.net [217.70.178.165]) by relay3-d.mail.gandi.net (Postfix) with ESMTP id 08CCCA80ED for ; Tue, 12 May 2015 12:12:41 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at mfilter34-d.gandi.net Received: from relay3-d.mail.gandi.net ([217.70.183.195]) by mfilter34-d.gandi.net (mfilter34-d.gandi.net [10.0.15.180]) (amavisd-new, port 10024) with ESMTP id bIaAoFGbWF41 for ; Tue, 12 May 2015 12:12:39 +0200 (CEST) X-Originating-IP: 195.132.27.120 Received: from figo.nolizard.org (195-132-27-120.rev.numericable.fr [195.132.27.120]) (Authenticated sender: mpieuchot@nolizard.org) by relay3-d.mail.gandi.net (Postfix) with ESMTPSA id 259ECA80D9 for ; Tue, 12 May 2015 12:12:37 +0200 (CEST) Received: from localhost (figo.nolizard.org [local]); by figo.nolizard.org (OpenSMTPD) with ESMTPA id ca425a82; for ; Tue, 12 May 2015 12:12:37 +0200 (CEST) Date: Tue, 12 May 2015 12:12:37 +0200 From: Martin Pieuchot To: tech@openbsd.org Subject: Take vlan(4) out of ether_input() Message-ID: <20150512101237.GA6921@figo.nolizard.org> Mail-Followup-To: tech@openbsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline User-Agent: Mutt/1.5.23 (2014-03-12) List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Loop: tech@openbsd.org Precedence: list Sender: owner-tech@openbsd.org Status: RO Content-Length: 10604 Lines: 353 Diff below convert vlan(4) to the new if_input() framework which means that vlan_input() will now be executed before ether_input(). Compared to trunk(4) multiple vlan(4)s can be attached to the same parent interface. When such thing happens only one input handler is added to keep the "if_inputs" handler list as small as possible. With this diff pseudo-drivers using the if_input() framework can now be stacked. That's why if_input_process() has been modified to deal with multiple ifp/lists of handlers. Reviewers might notice that this diff introduces a behavior change when a trunk is configured on top of a vlan. vlan_input() now runs before trunk_input() which means that the trunk interface now see packets with encapsulation removed. I know that various configurations involving vlans are currently broken. This diff won't fix them but it should not introduce newer problem, so please test and report back. Comments and oks are also welcome. Index: net/if.c =================================================================== RCS file: /cvs/src/sys/net/if.c,v retrieving revision 1.330 diff -u -p -r1.330 if.c --- net/if.c 23 Apr 2015 09:45:24 -0000 1.330 +++ net/if.c 11 May 2015 13:15:01 -0000 @@ -497,10 +497,19 @@ if_input_process(void *xmq) if ((++mit & 0x1f) == 0) yield(); +again: + /* + * Pass this mbuf to all input handlers of its + * interface until it is consumed. + */ ifp = m->m_pkthdr.rcvif; SLIST_FOREACH(ifih, &ifp->if_inputs, ifih_next) { if ((*ifih->ifih_input)(m, NULL)) break; + + /* Pseudo-drivers might be stacked. */ + if (ifp != m->m_pkthdr.rcvif) + goto again; } } splx(s); Index: net/if_ethersubr.c =================================================================== RCS file: /cvs/src/sys/net/if_ethersubr.c,v retrieving revision 1.196 diff -u -p -r1.196 if_ethersubr.c --- net/if_ethersubr.c 11 May 2015 08:41:43 -0000 1.196 +++ net/if_ethersubr.c 11 May 2015 08:58:48 -0000 @@ -456,7 +456,7 @@ bad: int ether_input(struct mbuf *m, void *hdr) { - struct ifnet *ifp0, *ifp; + struct ifnet *ifp; struct ether_header *eh = hdr; struct niqueue *inq; u_int16_t etype; @@ -469,7 +469,7 @@ ether_input(struct mbuf *m, void *hdr) /* mark incoming routing table */ - ifp = ifp0 = m->m_pkthdr.rcvif; + ifp = m->m_pkthdr.rcvif; m->m_pkthdr.ph_rtableid = ifp->if_rdomain; if (eh == NULL) { @@ -511,12 +511,6 @@ ether_input(struct mbuf *m, void *hdr) atomic_setbits_int(&netisr, (1 << NETISR_RND_DONE)); } -#if NVLAN > 0 - if (((m->m_flags & M_VLANTAG) || etype == ETHERTYPE_VLAN || - etype == ETHERTYPE_QINQ) && (vlan_input(eh, m) == 0)) - return (1); -#endif - #if NBRIDGE > 0 /* * Tap the packet off here for a bridge, if configured and @@ -565,7 +559,7 @@ ether_input(struct mbuf *m, void *hdr) * is for us. Drop otherwise. */ if ((m->m_flags & (M_BCAST|M_MCAST)) == 0 && - ((ifp->if_flags & IFF_PROMISC) || (ifp0->if_flags & IFF_PROMISC))) { + (ifp->if_flags & IFF_PROMISC)) { if (memcmp(ac->ac_enaddr, eh->ether_dhost, ETHER_ADDR_LEN)) { m_freem(m); return (1); Index: net/if_var.h =================================================================== RCS file: /cvs/src/sys/net/if_var.h,v retrieving revision 1.25 diff -u -p -r1.25 if_var.h --- net/if_var.h 23 Apr 2015 09:45:24 -0000 1.25 +++ net/if_var.h 11 May 2015 09:10:48 -0000 @@ -115,6 +115,7 @@ struct ifqueue { struct ifih { SLIST_ENTRY(ifih) ifih_next; int (*ifih_input)(struct mbuf *, void *); + int ifih_refcnt; }; /* Index: net/if_vlan.c =================================================================== RCS file: /cvs/src/sys/net/if_vlan.c,v retrieving revision 1.118 diff -u -p -r1.118 if_vlan.c --- net/if_vlan.c 22 Apr 2015 06:42:11 -0000 1.118 +++ net/if_vlan.c 11 May 2015 13:57:02 -0000 @@ -79,6 +79,8 @@ u_long vlan_tagmask, svlan_tagmask; #define TAG_HASH(tag) (tag & vlan_tagmask) LIST_HEAD(vlan_taghash, ifvlan) *vlan_tagh, *svlan_tagh; + +int vlan_input(struct mbuf *, void *); int vlan_output(struct ifnet *, struct mbuf *, struct sockaddr *, struct rtentry *); void vlan_start(struct ifnet *ifp); @@ -268,32 +270,44 @@ vlan_start(struct ifnet *ifp) } /* - * vlan_input() returns 0 if it has consumed the packet, 1 otherwise. + * vlan_input() returns 1 if it has consumed the packet, 0 otherwise. */ int -vlan_input(struct ether_header *eh, struct mbuf *m) +vlan_input(struct mbuf *m, void *hdr) { - struct ifvlan *ifv; - struct ifnet *ifp = m->m_pkthdr.rcvif; - struct vlan_taghash *tagh; - u_int tag; - u_int16_t etype; - struct ether_header *eh1; + struct ifvlan *ifv; + struct ifnet *ifp; + struct ether_vlan_header *evl; + struct ether_header *eh; + struct vlan_taghash *tagh; + u_int tag; + u_int16_t etype; + + ifp = m->m_pkthdr.rcvif; + eh = mtod(m, struct ether_header *); + + etype = ntohs(eh->ether_type); if (m->m_flags & M_VLANTAG) { etype = ETHERTYPE_VLAN; tagh = vlan_tagh; - } else { + } else if ((etype == ETHERTYPE_VLAN) || (etype == ETHERTYPE_QINQ)) { if (m->m_len < EVL_ENCAPLEN && (m = m_pullup(m, EVL_ENCAPLEN)) == NULL) { ifp->if_ierrors++; - return (0); + return (1); } - etype = ntohs(eh->ether_type); + evl = mtod(m, struct ether_vlan_header *); + m->m_pkthdr.ether_vtag = ntohs(evl->evl_tag); tagh = etype == ETHERTYPE_QINQ ? svlan_tagh : vlan_tagh; - m->m_pkthdr.ether_vtag = ntohs(*mtod(m, u_int16_t *)); + } else { + /* Skip non-VLAN packets. */ + return (0); } + + ifp->if_ibytes += m->m_pkthdr.len; + /* From now on ether_vtag is fine */ tag = EVL_VLANOFTAG(m->m_pkthdr.ether_vtag); m->m_pkthdr.pf.prio = EVL_PRIOFTAG(m->m_pkthdr.ether_vtag); @@ -303,7 +317,7 @@ vlan_input(struct ether_header *eh, stru m->m_pkthdr.pf.prio = !m->m_pkthdr.pf.prio; LIST_FOREACH(ifv, &tagh[TAG_HASH(tag)], ifv_list) { - if (m->m_pkthdr.rcvif == ifv->ifv_p && tag == ifv->ifv_tag && + if (ifp == ifv->ifv_p && tag == ifv->ifv_tag && etype == ifv->ifv_type) break; } @@ -315,40 +329,35 @@ vlan_input(struct ether_header *eh, stru * it a chance. */ if (ifp->if_bridgeport && (m->m_flags & M_PROTO1) == 0) - return (1); + return (0); #endif ifp->if_noproto++; m_freem(m); - return (0); + return (1); } if ((ifv->ifv_if.if_flags & (IFF_UP|IFF_RUNNING)) != (IFF_UP|IFF_RUNNING)) { m_freem(m); - return (0); + return (1); } /* * Having found a valid vlan interface corresponding to * the given source interface and vlan tag, remove the - * encapsulation, and run the real packet through - * ether_input() a second time (it had better be - * reentrant!). + * encapsulation. */ - m->m_pkthdr.rcvif = &ifv->ifv_if; if (m->m_flags & M_VLANTAG) { m->m_flags &= ~M_VLANTAG; } else { - eh->ether_type = mtod(m, u_int16_t *)[1]; - m->m_len -= EVL_ENCAPLEN; - m->m_data += EVL_ENCAPLEN; - m->m_pkthdr.len -= EVL_ENCAPLEN; + eh->ether_type = evl->evl_proto; + memmove((char *)eh + EVL_ENCAPLEN, eh, sizeof(*eh)); + m_adj(m, EVL_ENCAPLEN); } #if NBPFILTER > 0 if (ifv->ifv_if.if_bpf) - bpf_mtap_hdr(ifv->ifv_if.if_bpf, (char *)eh, ETHER_HDR_LEN, m, - BPF_DIRECTION_IN, NULL); + bpf_mtap_ether(ifv->ifv_if.if_bpf, m, BPF_DIRECTION_IN); #endif /* @@ -361,25 +370,19 @@ vlan_input(struct ether_header *eh, stru if (bcmp(&ifv->ifv_ac.ac_enaddr, eh->ether_dhost, ETHER_ADDR_LEN)) { m_freem(m); - return (0); + return (1); } } - M_PREPEND(m, sizeof(*eh1), M_DONTWAIT); - if (m == NULL) - return (-1); - eh1 = mtod(m, struct ether_header *); - memmove(eh1, eh, sizeof(*eh1)); - ifv->ifv_if.if_ipackets++; - ether_input_mbuf(&ifv->ifv_if, m); - + m->m_pkthdr.rcvif = &ifv->ifv_if; return (0); } int vlan_config(struct ifvlan *ifv, struct ifnet *p, u_int16_t tag) { + struct ifih *vlan_ifih; struct sockaddr_dl *sdl1, *sdl2; struct vlan_taghash *tagh; u_int flags; @@ -390,6 +393,16 @@ vlan_config(struct ifvlan *ifv, struct i if (ifv->ifv_p == p && ifv->ifv_tag == tag) /* noop */ return (0); + /* Share an ifih between multiple vlan(4) instances. */ + vlan_ifih = SLIST_FIRST(&p->if_inputs); + if (vlan_ifih->ifih_input != vlan_input) { + vlan_ifih = malloc(sizeof(*vlan_ifih), M_DEVBUF, M_NOWAIT); + if (vlan_ifih == NULL) + return (ENOMEM); + vlan_ifih->ifih_input = vlan_input; + vlan_ifih->ifih_refcnt = 0; + } + /* Remember existing interface flags and reset the interface */ flags = ifv->ifv_flags; vlan_unconfig(&ifv->ifv_if, p); @@ -441,9 +454,10 @@ vlan_config(struct ifvlan *ifv, struct i bcopy(LLADDR(sdl2), ifv->ifv_ac.ac_enaddr, ETHER_ADDR_LEN); ifv->ifv_tag = tag; - s = splnet(); - tagh = ifv->ifv_type == ETHERTYPE_QINQ ? svlan_tagh : vlan_tagh; - LIST_INSERT_HEAD(&tagh[TAG_HASH(tag)], ifv, ifv_list); + + /* Change input handler of the physical interface. */ + if (++vlan_ifih->ifih_refcnt == 1) + SLIST_INSERT_HEAD(&p->if_inputs, vlan_ifih, ifih_next); /* Register callback for physical link state changes */ ifv->lh_cookie = hook_establish(p->if_linkstatehooks, 1, @@ -454,6 +468,10 @@ vlan_config(struct ifvlan *ifv, struct i vlan_ifdetach, ifv); vlan_vlandev_state(ifv); + + tagh = ifv->ifv_type == ETHERTYPE_QINQ ? svlan_tagh : vlan_tagh; + s = splnet(); + LIST_INSERT_HEAD(&tagh[TAG_HASH(tag)], ifv, ifv_list); splx(s); return (0); @@ -462,6 +480,7 @@ vlan_config(struct ifvlan *ifv, struct i int vlan_unconfig(struct ifnet *ifp, struct ifnet *newp) { + struct ifih *vlan_ifih; struct sockaddr_dl *sdl; struct ifvlan *ifv; struct ifnet *p; @@ -487,6 +506,14 @@ vlan_unconfig(struct ifnet *ifp, struct if (newp != NULL) { ifp->if_link_state = LINK_STATE_INVALID; if_link_state_change(ifp); + } + + /* Restore previous input handler. */ + vlan_ifih = SLIST_FIRST(&p->if_inputs); + KASSERT(vlan_ifih->ifih_input == vlan_input); + if (--vlan_ifih->ifih_refcnt == 0) { + SLIST_REMOVE_HEAD(&p->if_inputs, ifih_next); + free(vlan_ifih, M_DEVBUF, sizeof(*vlan_ifih)); } /* Index: net/if_vlan_var.h =================================================================== RCS file: /cvs/src/sys/net/if_vlan_var.h,v retrieving revision 1.24 diff -u -p -r1.24 if_vlan_var.h --- net/if_vlan_var.h 24 Oct 2013 11:14:33 -0000 1.24 +++ net/if_vlan_var.h 11 May 2015 08:54:38 -0000 @@ -94,8 +94,6 @@ struct ifvlan { #define ifv_prio ifv_mib.ifvm_prio #define ifv_type ifv_mib.ifvm_type #define IFVF_PROMISC 0x01 - -extern int vlan_input(struct ether_header *eh, struct mbuf *m); #endif /* _KERNEL */ #endif /* _NET_IF_VLAN_VAR_H_ */