Index: share/man/man4/Makefile =================================================================== RCS file: /cvs/src/share/man/man4/Makefile,v retrieving revision 1.697 diff -u -p -r1.697 Makefile --- share/man/man4/Makefile 23 Nov 2018 12:38:44 -0000 1.697 +++ share/man/man4/Makefile 19 Dec 2018 04:16:07 -0000 @@ -15,7 +15,7 @@ MAN= aac.4 ac97.4 acphy.4 acrtc.4 \ auacer.4 audio.4 aue.4 auglx.4 auich.4 auixp.4 autri.4 auvia.4 \ axe.4 axen.4 axppmic.4 azalia.4 \ bce.4 bcmaux.4 bcmdog.4 bcmrng.4 bcmtemp.4 berkwdt.4 bge.4 \ - bgw.4 bio.4 bktr.4 bmtphy.4 bnx.4 bnxt.4 \ + bgw.4 bio.4 bpe.4 bktr.4 bmtphy.4 bnx.4 bnxt.4 \ boca.4 bpf.4 brgphy.4 bridge.4 brswphy.4 bwfm.4 bwi.4 bytgpio.4 \ cac.4 cas.4 cardbus.4 carp.4 ccp.4 ccpmic.4 cd.4 cdce.4 cfxga.4 \ ch.4 chvgpio.4 ciphy.4 ciss.4 clcs.4 clct.4 cmpci.4 \ Index: share/man/man4/bpe.4 =================================================================== RCS file: share/man/man4/bpe.4 diff -N share/man/man4/bpe.4 --- /dev/null 1 Jan 1970 00:00:00 -0000 +++ share/man/man4/bpe.4 19 Dec 2018 04:16:07 -0000 @@ -0,0 +1,159 @@ +.\" $OpenBSD$ +.\" +.\" Copyright (c) 2018 David Gwynne +.\" +.\" Permission to use, copy, modify, and distribute this software for any +.\" purpose with or without fee is hereby granted, provided that the above +.\" copyright notice and this permission notice appear in all copies. +.\" +.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES +.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF +.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR +.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES +.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN +.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF +.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. +.\" +.Dd $Mdocdate: November 16 2018 $ +.Dt BPE 4 +.Os +.Sh NAME +.Nm bpe +.Nd Backbone Provider Edge pseudo-device +.Sh SYNOPSIS +.Cd "pseudo-device bpe" +.Sh DESCRIPTION +The +.Nm bpe +driver allows construction of IEEE 802.1Q Provider Backbone Bridge +(PBB) networks by acting as a Backbone Edge Bridge (BEB). +PBB, also known as mac-in-mac, was originally specified in +IEEE 802.1ah-2008 and became part of IEEE 802.1Q-2011. +.Pp +A Provider Backbone Bridge Network (PBBN) consists of BEBs +interconnected by Backbone Core Bridges (BCBs) to form an Ethernet +network for the transport of encapsulated Ethernet packets. +Where VLAN and SVLAN protocols add a shim to differentiate Ethernet +packets for different networks but retain the Ethernet addresses +of encapsulated traffic, PBB completely encapsulates Ethernet packets +for transmission between BEBs on a PBBN. +This removes the need for intermediate BCB devices on the backbone +network to learn the Ethernet addresses of devices on the encapsulated +network, but requires each BEB to maintain a mapping of addresses +on the encapsulated network to peer BEBs. +.Pp +A PBB packet consists of another Ethernet frame containing Ethernet +addresses for BEBs and the PBB Ethernet protocol type (0x88e7), a +32-bit Backbone Service Instance Tag (I-TAG), followed by the +encapsulated Ethernet frame. +The I-TAG contains a 24-bit Backbone Service Instance Identifiier +(I-SID) to differentiate different PBBNs on the same backbone network +.Pp +IEEE 802.1Q describes Customer VLANs being encapsulated by PBB, +which in turn uses an S-VLAN service. +This can be implemented with +.Xr vlan 4 +using a +.Nm bpe +interface as the parent, +and with the +.Nm bpe +interface using +.Xr svlan 4 +as the parent. +.Nm bpe +itself does not require this topology, therefore allowing flexible +deployment and network topologies. +.Pp +The +Nm. bpe +driver implements a learning bridge on each interface. +The driver will learn the mapping of BEPs to encapsulated Ethernet +address based on traffic received from other devices on the backbone +network. +Traffic sent to broadcast, multicast, or unknown unicast Etherent +addresses will be flooded to a multicast address on the backbone network. +The multicast address used for each PBB Service Instance +will begin with 01:1e:83 as the first three octets, with the I-SID +as the last three octets, e.g., a +.Nm bpe +interface with a vnetid of 1024 (0x400 in hex) will have a multicast +group address of 01:1e:83:00:04:00. +The address learning in +.Nm bpe +only uses the Ethernet addresses of encapsulated traffic for it's +forwarding decisions, it does not use VLAN or S-VLAN tags to +differentiate services. +.Pp +.Nm bpe +interfaces can be created at runtime using the +.Ic ifconfig bpe Ns Ar N Ic create +command or by setting up a +.Xr hostname.if 5 +configuration file for +.Xr netstart 8 . +The interface itself can be configured with +.Xr ifconfig 8 ; +see its manual page for more information. +.Pp +.Nm bpe +interfaces must be configured with a parent Ethernet interface to +operate, and a virtual network identifier for use as the I-SID. +.Pp +The I-TAG includes a priority field. +By default, the value of the priority field in a transmitted packet +is based on the priority of packets sent over the interface, which +may be altered via +.Xr pf.conf 5 ; +see the +.Cm prio +option for more information. +Alternatively, +.Cm txprio +can set a specific priority for transmitted packets. +.Pp +.Nm bpe +interfaces support the following +.Xr ioctl 2 Ns s : +.Bl -tag -width indent -offset 3n +.It Dv SIOCSIFPARENT Fa "struct if_parent *" +Set the parent interface. +The parent may only be configured while the virtual interface is +administratively down. +.It Dv SIOCGIFPARENT Fa "struct if_parent *" +Get the currently configured parent interface. +.It Dv SIOCDIFPARENT Fa "struct ifreq *" +Delete the parent interface configuration. +The parent may only be removed while the virtual interface is +administratively down. +.It Dv SIOCSVNETID Fa "struct ifreq *" +Set the virtual network identifier. +Valid identifiers are 24-bit values in the range 0 to 16777215. +.It Dv SIOCGVNETID Fa "struct if_parent *" +Get the currently configured virtual network identifier. +.El +.Sh SEE ALSO +.Xr bridge 4 , +.Xr inet 4 , +.Xr ip 4 , +.Xr netintro 4 , +.Xr vlan 4 , +.Xr hostname.if 5 , +.Xr pf.conf 5 , +.Xr ifconfig 8 , +.Xr netstart 8 +.Rs +.%T IEEE 802.1Q standard +.%U http://standards.ieee.org/getieee802/802.1.html +.Re +.Rs +.%Q Provider Backbone Bridges +.%T IEEE 802.1ah standard +.Re +.Sh HISTORY +The +.Nm +driver first appeared in +.Ox 6.6 . +.Sh AUTHORS +.An David Gwynne Aq Mt dlg@openbsd.org . Index: sys/net/if_bpe.c =================================================================== RCS file: sys/net/if_bpe.c diff -N sys/net/if_bpe.c --- /dev/null 1 Jan 1970 00:00:00 -0000 +++ sys/net/if_bpe.c 19 Dec 2018 04:16:07 -0000 @@ -0,0 +1,1016 @@ +/* $OpenBSD$ */ +/* + * Copyright (c) 2018 David Gwynne + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include "bpfilter.h" + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +#include +#include + +/* for bridge stuff */ +#include + + +#if NBPFILTER > 0 +#include +#endif + +#include + +#define PBB_ITAG_ISID 0x00ffffff +#define PBB_ITAG_ISID_MIN 0x00000000 +#define PBB_ITAG_ISID_MAX 0x00ffffff +#define PBB_ITAG_RES2 0x03000000 /* must be zero on input */ +#define PBB_ITAG_RES1 0x04000000 /* ignore on input */ +#define PBB_ITAG_UCA 0x08000000 +#define PBB_ITAG_DEI 0x10000000 +#define PBB_ITAG_PCP_SHIFT 29 +#define PBB_ITAG_PCP_MASK (0x7U << PBB_ITAG_PCP_SHIFT) + +#define BPE_BRIDGE_AGE_TMO 100 /* seconds */ + +struct bpe_key { + int k_if; + uint32_t k_isid; + + RBT_ENTRY(bpe_tunnel) k_entry; +}; + +RBT_HEAD(bpe_tree, bpe_key); + +static inline int bpe_cmp(const struct bpe_key *, const struct bpe_key *); + +RBT_PROTOTYPE(bpe_tree, bpe_key, k_entry, bpe_cmp); +RBT_GENERATE(bpe_tree, bpe_key, k_entry, bpe_cmp); + +struct bpe_entry { + struct ether_addr be_c_da; /* customer address - must be first */ + struct ether_addr be_b_da; /* bridge address */ + unsigned int be_type; +#define BPE_ENTRY_DYNAMIC 0 +#define BPE_ENTRY_STATIC 1 + struct refcnt be_refs; + time_t be_age; + + RBT_ENTRY(bpe_entry) be_entry; +}; + +RBT_HEAD(bpe_map, bpe_entry); + +static inline int bpe_entry_cmp(const struct bpe_entry *, + const struct bpe_entry *); + +RBT_PROTOTYPE(bpe_map, bpe_entry, be_entry, bpe_entry_cmp); +RBT_GENERATE(bpe_map, bpe_entry, be_entry, bpe_entry_cmp); + +struct bpe_softc { + struct bpe_key sc_key; /* must be first */ + struct arpcom sc_ac; + struct ifmedia sc_media; + int sc_txhprio; + uint8_t sc_group[ETHER_ADDR_LEN]; + + void * sc_lh_cookie; + void * sc_dh_cookie; + + struct bpe_map sc_bridge_map; + struct rwlock sc_bridge_lock; + unsigned int sc_bridge_num; + unsigned int sc_bridge_max; + int sc_bridge_tmo; /* seconds */ + struct timeout sc_bridge_age; +}; + +void bpeattach(int); + +static int bpe_clone_create(struct if_clone *, int); +static int bpe_clone_destroy(struct ifnet *); + +static void bpe_start(struct ifnet *); +static int bpe_ioctl(struct ifnet *, u_long, caddr_t); +static int bpe_media_get(struct bpe_softc *, struct ifreq *); +static int bpe_up(struct bpe_softc *); +static int bpe_down(struct bpe_softc *); +static int bpe_multi(struct bpe_softc *, struct ifnet *, u_long); +static int bpe_set_vnetid(struct bpe_softc *, const struct ifreq *); +static void bpe_set_group(struct bpe_softc *, uint32_t); +static int bpe_set_parent(struct bpe_softc *, const struct if_parent *); +static int bpe_get_parent(struct bpe_softc *, struct if_parent *); +static int bpe_del_parent(struct bpe_softc *); +static void bpe_link_hook(void *); +static void bpe_link_state(struct bpe_softc *, u_char, uint64_t); +static void bpe_detach_hook(void *); + +static void bpe_input_map(struct bpe_softc *, + const uint8_t *, const uint8_t *); +static void bpe_bridge_age(void *); + +static struct if_clone bpe_cloner = + IF_CLONE_INITIALIZER("bpe", bpe_clone_create, bpe_clone_destroy); + +static struct bpe_tree bpe_interfaces = RBT_INITIALIZER(); +static struct rwlock bpe_lock = RWLOCK_INITIALIZER("bpeifs"); +static struct pool bpe_entry_pool; + +#define ether_cmp(_a, _b) memcmp((_a), (_b), ETHER_ADDR_LEN) +#define ether_is_eq(_a, _b) (ether_cmp((_a), (_b)) == 0) +#define ether_is_bcast(_a) ether_is_eq((_a), etherbroadcastaddr) + +void +bpeattach(int count) +{ + if_clone_attach(&bpe_cloner); +} + +static int +bpe_clone_create(struct if_clone *ifc, int unit) +{ + struct bpe_softc *sc; + struct ifnet *ifp; + + if (bpe_entry_pool.pr_size == 0) { + pool_init(&bpe_entry_pool, sizeof(struct bpe_entry), 0, + IPL_NONE, 0, "bpepl", NULL); + } + + sc = malloc(sizeof(*sc), M_DEVBUF, M_WAITOK|M_ZERO); + ifp = &sc->sc_ac.ac_if; + + snprintf(ifp->if_xname, sizeof(ifp->if_xname), "%s%d", + ifc->ifc_name, unit); + + sc->sc_key.k_if = 0; + sc->sc_key.k_isid = 0; + bpe_set_group(sc, 0); + + sc->sc_txhprio = IF_HDRPRIO_PACKET; + + rw_init(&sc->sc_bridge_lock, "bpebr"); + RBT_INIT(bpe_map, &sc->sc_bridge_map); + sc->sc_bridge_num = 0; + sc->sc_bridge_max = 100; /* XXX */ + sc->sc_bridge_tmo = 240; + timeout_set_proc(&sc->sc_bridge_age, bpe_bridge_age, sc); + + ifp->if_softc = sc; + ifp->if_hardmtu = ETHER_MAX_HARDMTU_LEN; + ifp->if_ioctl = bpe_ioctl; + ifp->if_start = bpe_start; + ifp->if_flags = IFF_BROADCAST | IFF_MULTICAST; + ifp->if_xflags = IFXF_CLONED; + IFQ_SET_MAXLEN(&ifp->if_snd, IFQ_MAXLEN); + ether_fakeaddr(ifp); + + if_attach(ifp); + ether_ifattach(ifp); + + return (0); +} + +static int +bpe_clone_destroy(struct ifnet *ifp) +{ + struct bpe_softc *sc = ifp->if_softc; + + NET_LOCK(); + if (ISSET(ifp->if_flags, IFF_RUNNING)) + bpe_down(sc); + NET_UNLOCK(); + + ifmedia_delete_instance(&sc->sc_media, IFM_INST_ANY); + ether_ifdetach(ifp); + if_detach(ifp); + + free(sc, M_DEVBUF, sizeof(*sc)); + + return (0); +} + +static inline int +bpe_entry_valid(struct bpe_softc *sc, const struct bpe_entry *be) +{ + time_t diff; + + if (be == NULL) + return (0); + + if (be->be_type == BPE_ENTRY_STATIC) + return (1); + + diff = time_uptime - be->be_age; + if (diff < sc->sc_bridge_tmo) + return (1); + + return (0); +} + +static void +bpe_start(struct ifnet *ifp) +{ + struct bpe_softc *sc = ifp->if_softc; + struct ifnet *ifp0; + struct mbuf *m0, *m; + struct ether_header *ceh; + struct ether_header *beh; + uint32_t itag, *itagp; + int hlen = sizeof(*beh) + sizeof(*itagp); +#if NBPFILTER > 0 + caddr_t if_bpf; +#endif + int txprio; + uint8_t prio; + + ifp0 = if_get(sc->sc_key.k_if); + if (ifp0 == NULL || !ISSET(ifp0->if_flags, IFF_RUNNING)) { + ifq_purge(&ifp->if_snd); + goto done; + } + + txprio = sc->sc_txhprio; + + while ((m0 = ifq_dequeue(&ifp->if_snd)) != NULL) { +#if NBPFILTER > 0 + if_bpf = ifp->if_bpf; + if (if_bpf) + bpf_mtap_ether(if_bpf, m0, BPF_DIRECTION_OUT); +#endif + + ceh = mtod(m0, struct ether_header *); + + /* force prepend of a whole mbuf because of alignment */ + m = m_get(M_DONTWAIT, m0->m_type); + if (m == NULL) { + m_freem(m0); + continue; + } + + M_MOVE_PKTHDR(m, m0); + m->m_next = m0; + + m_align(m, 0); + m->m_len = 0; + + m = m_prepend(m, hlen, M_DONTWAIT); + if (m == NULL) + continue; + + beh = mtod(m, struct ether_header *); + + if (ether_is_bcast(ceh->ether_dhost)) { + memcpy(beh->ether_dhost, sc->sc_group, + sizeof(beh->ether_dhost)); + } else { + struct bpe_entry *be; + + rw_enter_read(&sc->sc_bridge_lock); + be = RBT_FIND(bpe_map, &sc->sc_bridge_map, + (struct bpe_entry *)ceh->ether_dhost); + if (bpe_entry_valid(sc, be)) { + memcpy(beh->ether_dhost, &be->be_b_da, + sizeof(beh->ether_dhost)); + } else { + /* "flood" to unknown hosts */ + memcpy(beh->ether_dhost, sc->sc_group, + sizeof(beh->ether_dhost)); + } + rw_exit_read(&sc->sc_bridge_lock); + } + + memcpy(beh->ether_shost, ((struct arpcom *)ifp0)->ac_enaddr, + sizeof(beh->ether_shost)); + beh->ether_type = htons(ETHERTYPE_PBB); + + prio = (txprio == IF_HDRPRIO_PACKET) ? + m->m_pkthdr.pf.prio : txprio; + + itag = sc->sc_key.k_isid; + itag |= prio << PBB_ITAG_PCP_SHIFT; + itagp = (uint32_t *)(beh + 1); + + htobem32(itagp, itag); + + if_enqueue(ifp0, m); + } + +done: + if_put(ifp0); +} + +static void +bpe_bridge_age(void *arg) +{ + struct bpe_softc *sc = arg; + struct bpe_entry *be, *nbe; + time_t diff; + + timeout_add_sec(&sc->sc_bridge_age, BPE_BRIDGE_AGE_TMO); + + rw_enter_write(&sc->sc_bridge_lock); + RBT_FOREACH_SAFE(be, bpe_map, &sc->sc_bridge_map, nbe) { + if (be->be_type != BPE_ENTRY_DYNAMIC) + continue; + + diff = time_uptime - be->be_age; + if (diff < sc->sc_bridge_tmo) + continue; + + sc->sc_bridge_num--; + RBT_REMOVE(bpe_map, &sc->sc_bridge_map, be); + if (refcnt_rele(&be->be_refs)) + pool_put(&bpe_entry_pool, be); + } + rw_exit_write(&sc->sc_bridge_lock); +} + +static int +bpe_rtfind(struct bpe_softc *sc, struct ifbaconf *baconf) +{ + struct ifnet *ifp = &sc->sc_ac.ac_if; + struct bpe_entry *be; + struct ifbareq bareq; + caddr_t uaddr, end; + int error; + time_t age; + struct sockaddr_dl *sdl; + + if (baconf->ifbac_len == 0) { + /* single read is atomic */ + baconf->ifbac_len = sc->sc_bridge_num * sizeof(bareq); + return (0); + } + + uaddr = baconf->ifbac_buf; + end = uaddr + baconf->ifbac_len; + + rw_enter_read(&sc->sc_bridge_lock); + RBT_FOREACH(be, bpe_map, &sc->sc_bridge_map) { + if (uaddr >= end) + break; + + memcpy(bareq.ifba_name, ifp->if_xname, + sizeof(bareq.ifba_name)); + memcpy(bareq.ifba_ifsname, ifp->if_xname, + sizeof(bareq.ifba_ifsname)); + memcpy(&bareq.ifba_dst, &be->be_c_da, + sizeof(bareq.ifba_dst)); + + memset(&bareq.ifba_dstsa, 0, sizeof(bareq.ifba_dstsa)); + + bzero(&bareq.ifba_dstsa, sizeof(bareq.ifba_dstsa)); + sdl = (struct sockaddr_dl *)&bareq.ifba_dstsa; + sdl->sdl_len = sizeof(sdl); + sdl->sdl_family = AF_LINK; + sdl->sdl_index = 0; + sdl->sdl_type = IFT_ETHER; + sdl->sdl_nlen = 0; + sdl->sdl_alen = sizeof(be->be_b_da); + CTASSERT(sizeof(sdl->sdl_data) >= sizeof(be->be_b_da)); + memcpy(sdl->sdl_data, &be->be_b_da, sizeof(be->be_b_da)); + + switch (be->be_type) { + case BPE_ENTRY_DYNAMIC: + age = time_uptime - be->be_age; + bareq.ifba_age = MIN(age, 0xff); + bareq.ifba_flags = IFBAF_DYNAMIC; + break; + case BPE_ENTRY_STATIC: + bareq.ifba_age = 0; + bareq.ifba_flags = IFBAF_STATIC; + break; + } + + error = copyout(&bareq, uaddr, sizeof(bareq)); + if (error != 0) { + rw_exit_read(&sc->sc_bridge_lock); + return (error); + } + + uaddr += sizeof(bareq); + } + baconf->ifbac_len = sc->sc_bridge_num * sizeof(bareq); + rw_exit_read(&sc->sc_bridge_lock); + + return (0); +} + +static void +bpe_flush_map(struct bpe_softc *sc, uint32_t flags) +{ + struct bpe_entry *be, *nbe; + + rw_enter_write(&sc->sc_bridge_lock); + RBT_FOREACH_SAFE(be, bpe_map, &sc->sc_bridge_map, nbe) { + if (flags == IFBF_FLUSHDYN && + be->be_type != BPE_ENTRY_DYNAMIC) + continue; + + RBT_REMOVE(bpe_map, &sc->sc_bridge_map, be); + if (refcnt_rele(&be->be_refs)) + pool_put(&bpe_entry_pool, be); + } + rw_exit_write(&sc->sc_bridge_lock); +} + +static int +bpe_ioctl(struct ifnet *ifp, u_long cmd, caddr_t data) +{ + struct bpe_softc *sc = ifp->if_softc; + struct ifreq *ifr = (struct ifreq *)data; + struct ifbrparam *bparam = (struct ifbrparam *)data; + int error = 0; + + switch (cmd) { + case SIOCSIFFLAGS: + if (ISSET(ifp->if_flags, IFF_UP)) { + if (!ISSET(ifp->if_flags, IFF_RUNNING)) + error = bpe_up(sc); + else + error = 0; + } else { + if (ISSET(ifp->if_flags, IFF_RUNNING)) + error = bpe_down(sc); + } + break; + + case SIOCSVNETID: + error = bpe_set_vnetid(sc, ifr); + case SIOCGVNETID: + ifr->ifr_vnetid = sc->sc_key.k_isid; + break; + + case SIOCSIFPARENT: + error = bpe_set_parent(sc, (struct if_parent *)data); + break; + case SIOCGIFPARENT: + error = bpe_get_parent(sc, (struct if_parent *)data); + break; + case SIOCDIFPARENT: + error = bpe_del_parent(sc); + break; + + case SIOCSTXHPRIO: + if (ifr->ifr_hdrprio == IF_HDRPRIO_PACKET) /* use mbuf prio */ + ; + else if (ifr->ifr_hdrprio < IF_HDRPRIO_MIN || + ifr->ifr_hdrprio > IF_HDRPRIO_MAX) { + error = EINVAL; + break; + } + + sc->sc_txhprio = ifr->ifr_hdrprio; + break; + case SIOCGTXHPRIO: + ifr->ifr_hdrprio = sc->sc_txhprio; + break; + + case SIOCGIFMEDIA: + error = bpe_media_get(sc, ifr); + break; + + case SIOCBRDGSCACHE: + error = suser(curproc); + if (error != 0) + break; + + if (bparam->ifbrp_csize < 1) { + error = EINVAL; + break; + } + + /* commit */ + sc->sc_bridge_max = bparam->ifbrp_csize; + break; + case SIOCBRDGGCACHE: + bparam->ifbrp_csize = sc->sc_bridge_max; + break; + + case SIOCBRDGSTO: + error = suser(curproc); + if (error != 0) + break; + + if (bparam->ifbrp_ctime < 8 || + bparam->ifbrp_ctime > 3600) { + error = EINVAL; + break; + } + sc->sc_bridge_tmo = bparam->ifbrp_ctime; + break; + case SIOCBRDGGTO: + bparam->ifbrp_ctime = sc->sc_bridge_tmo; + break; + + case SIOCBRDGRTS: + error = bpe_rtfind(sc, (struct ifbaconf *)data); + break; + case SIOCBRDGFLUSH: + error = suser(curproc); + if (error != 0) + break; + + bpe_flush_map(sc, + ((struct ifbreq *)data)->ifbr_ifsflags); + break; + + default: + error = ether_ioctl(ifp, &sc->sc_ac, cmd, data); + break; + } + + return (error); +} + +static int +bpe_media_get(struct bpe_softc *sc, struct ifreq *ifr) +{ + struct ifnet *ifp0; + int error; + + ifp0 = if_get(sc->sc_key.k_if); + if (ifp0 != NULL) + error = (*ifp0->if_ioctl)(ifp0, SIOCGIFMEDIA, (caddr_t)ifr); + else + error = ENOTTY; + if_put(ifp0); + + return (error); +} + +static int +bpe_up(struct bpe_softc *sc) +{ + struct ifnet *ifp = &sc->sc_ac.ac_if; + struct ifnet *ifp0; + struct bpe_softc *osc; + int error = 0; + u_int hardmtu; + + KASSERT(!ISSET(ifp->if_flags, IFF_RUNNING)); + NET_ASSERT_LOCKED(); + + ifp0 = if_get(sc->sc_key.k_if); + if (ifp0 == NULL) + return (ENXIO); + + /* check again if bpe will work on top of the parent */ + if (ifp0->if_type != IFT_ETHER) { + error = EPROTONOSUPPORT; + goto put; + } + + hardmtu = ifp0->if_hardmtu; + /* check min mtu? */ + if (ifp->if_mtu > hardmtu) { + error = ENOBUFS; + goto put; + } + + /* parent is fine, let's prepare the bpe to handle packets */ + ifp->if_hardmtu = hardmtu; + SET(ifp->if_flags, ifp0->if_flags & IFF_SIMPLEX); + + /* commit the interface */ + error = rw_enter(&bpe_lock, RW_WRITE | RW_INTR); + if (error != 0) + goto scrub; + + osc = (struct bpe_softc *)RBT_INSERT(bpe_tree, &bpe_interfaces, + (struct bpe_key *)sc); + rw_exit(&bpe_lock); + + if (osc != NULL) { + error = EADDRINUSE; + goto scrub; + } + + if (bpe_multi(sc, ifp0, SIOCADDMULTI) != 0) { + error = ENOTCONN; + goto remove; + } + + /* Register callback for physical link state changes */ + sc->sc_lh_cookie = hook_establish(ifp0->if_linkstatehooks, 1, + bpe_link_hook, sc); + + /* Register callback if parent wants to unregister */ + sc->sc_dh_cookie = hook_establish(ifp0->if_detachhooks, 0, + bpe_detach_hook, sc); + + /* we're running now */ + SET(ifp->if_flags, IFF_RUNNING); + bpe_link_state(sc, ifp0->if_link_state, ifp0->if_baudrate); + + if_put(ifp0); + + timeout_add_sec(&sc->sc_bridge_age, BPE_BRIDGE_AGE_TMO); + + return (0); + +remove: + rw_enter(&bpe_lock, RW_WRITE); + RBT_REMOVE(bpe_tree, &bpe_interfaces, (struct bpe_key *)sc); + rw_exit(&bpe_lock); +scrub: + CLR(ifp->if_flags, IFF_SIMPLEX); + ifp->if_hardmtu = 0xffff; +put: + if_put(ifp0); + + return (error); +} + +static int +bpe_down(struct bpe_softc *sc) +{ + struct ifnet *ifp = &sc->sc_ac.ac_if; + struct ifnet *ifp0; + + NET_ASSERT_LOCKED(); + + CLR(ifp->if_flags, IFF_RUNNING); + + ifp0 = if_get(sc->sc_key.k_if); + if (ifp0 != NULL) { + hook_disestablish(ifp0->if_detachhooks, sc->sc_dh_cookie); + hook_disestablish(ifp0->if_linkstatehooks, sc->sc_lh_cookie); + bpe_multi(sc, ifp0, SIOCDELMULTI); + } + if_put(ifp0); + + rw_enter(&bpe_lock, RW_WRITE); + RBT_REMOVE(bpe_tree, &bpe_interfaces, (struct bpe_key *)sc); + rw_exit(&bpe_lock); + + CLR(ifp->if_flags, IFF_SIMPLEX); + ifp->if_hardmtu = 0xffff; + + return (0); +} + +static int +bpe_multi(struct bpe_softc *sc, struct ifnet *ifp0, u_long cmd) +{ + struct ifreq ifr; + struct sockaddr *sa; + + /* make it convincing */ + CTASSERT(sizeof(ifr.ifr_name) == sizeof(ifp0->if_xname)); + memcpy(ifr.ifr_name, ifp0->if_xname, sizeof(ifr.ifr_name)); + + sa = &ifr.ifr_addr; + CTASSERT(sizeof(sa->sa_data) >= sizeof(sc->sc_group)); + + sa->sa_family = AF_UNSPEC; + memcpy(sa->sa_data, sc->sc_group, sizeof(sc->sc_group)); + + return ((*ifp0->if_ioctl)(ifp0, cmd, (caddr_t)&ifr)); +} + +static void +bpe_set_group(struct bpe_softc *sc, uint32_t isid) +{ + uint8_t *group = sc->sc_group; + + group[0] = 0x01; + group[1] = 0x1e; + group[2] = 0x83; + group[3] = isid >> 16; + group[4] = isid >> 8; + group[5] = isid >> 0; +} + +static int +bpe_set_vnetid(struct bpe_softc *sc, const struct ifreq *ifr) +{ + struct ifnet *ifp = &sc->sc_ac.ac_if; + uint32_t isid; + + if (ifr->ifr_vnetid < PBB_ITAG_ISID_MIN || + ifr->ifr_vnetid > PBB_ITAG_ISID_MAX) + return (EINVAL); + + isid = ifr->ifr_vnetid; + if (isid == sc->sc_key.k_isid) + return (0); + + if (ISSET(ifp->if_flags, IFF_RUNNING)) + return (EBUSY); + + /* commit */ + sc->sc_key.k_isid = isid; + bpe_set_group(sc, isid); + bpe_flush_map(sc, IFBF_FLUSHALL); + + return (0); +} + +static int +bpe_set_parent(struct bpe_softc *sc, const struct if_parent *p) +{ + struct ifnet *ifp = &sc->sc_ac.ac_if; + struct ifnet *ifp0; + + ifp0 = ifunit(p->ifp_parent); /* doesn't need an if_put */ + if (ifp0 == NULL) + return (ENXIO); + + if (ifp0->if_type != IFT_ETHER) + return (ENXIO); + + if (ifp0->if_index == sc->sc_key.k_if) + return (0); + + if (ISSET(ifp->if_flags, IFF_RUNNING)) + return (EBUSY); + + /* commit */ + sc->sc_key.k_if = ifp0->if_index; + bpe_flush_map(sc, IFBF_FLUSHALL); + + return (0); +} + +static int +bpe_get_parent(struct bpe_softc *sc, struct if_parent *p) +{ + struct ifnet *ifp0; + int error = 0; + + ifp0 = if_get(sc->sc_key.k_if); + if (ifp0 == NULL) + error = EADDRNOTAVAIL; + else + memcpy(p->ifp_parent, ifp0->if_xname, sizeof(p->ifp_parent)); + if_put(ifp0); + + return (error); +} + +static int +bpe_del_parent(struct bpe_softc *sc) +{ + struct ifnet *ifp = &sc->sc_ac.ac_if; + + if (ISSET(ifp->if_flags, IFF_RUNNING)) + return (EBUSY); + + /* commit */ + sc->sc_key.k_if = 0; + bpe_flush_map(sc, IFBF_FLUSHALL); + + return (0); +} + +static inline struct bpe_softc * +bpe_find(struct ifnet *ifp0, uint32_t isid) +{ + struct bpe_key k = { .k_if = ifp0->if_index, .k_isid = isid }; + struct bpe_softc *sc; + + rw_enter_read(&bpe_lock); + sc = (struct bpe_softc *)RBT_FIND(bpe_tree, &bpe_interfaces, &k); + rw_exit_read(&bpe_lock); + + return (sc); +} + +static void +bpe_input_map(struct bpe_softc *sc, const uint8_t *ba, const uint8_t *ca) +{ + struct bpe_entry *be; + int new = 0; + + if (ETHER_IS_MULTICAST(ca)) + return; + + /* remember where it came from */ + rw_enter_read(&sc->sc_bridge_lock); + be = RBT_FIND(bpe_map, &sc->sc_bridge_map, (struct bpe_entry *)ca); + if (be == NULL) + new = 1; + else { + be->be_age = time_uptime; /* only a little bit racy */ + + if (be->be_type != BPE_ENTRY_DYNAMIC || + ether_is_eq(ba, &be->be_b_da)) + be = NULL; + else + refcnt_take(&be->be_refs); + } + rw_exit_read(&sc->sc_bridge_lock); + + if (new) { + struct bpe_entry *obe; + unsigned int num; + + be = pool_get(&bpe_entry_pool, PR_NOWAIT); + if (be == NULL) { + /* oh well */ + return; + } + + memcpy(&be->be_c_da, ca, sizeof(be->be_c_da)); + memcpy(&be->be_b_da, ba, sizeof(be->be_b_da)); + be->be_type = BPE_ENTRY_DYNAMIC; + refcnt_init(&be->be_refs); + be->be_age = time_uptime; + + rw_enter_write(&sc->sc_bridge_lock); + num = sc->sc_bridge_num; + if (++num > sc->sc_bridge_max) + obe = be; + else { + /* try and give the ref to the map */ + obe = RBT_INSERT(bpe_map, &sc->sc_bridge_map, be); + if (obe == NULL) { + /* count the insert */ + sc->sc_bridge_num = num; + } + } + rw_exit_write(&sc->sc_bridge_lock); + + if (obe != NULL) + pool_put(&bpe_entry_pool, obe); + } else if (be != NULL) { + rw_enter_write(&sc->sc_bridge_lock); + memcpy(&be->be_b_da, ba, sizeof(be->be_b_da)); + rw_exit_write(&sc->sc_bridge_lock); + + if (refcnt_rele(&be->be_refs)) { + /* ioctl may have deleted the entry */ + pool_put(&bpe_entry_pool, be); + } + } +} + +void +bpe_input(struct ifnet *ifp0, struct mbuf *m) +{ + struct mbuf_list ml = MBUF_LIST_INITIALIZER(); + struct bpe_softc *sc; + struct ifnet *ifp; + struct ether_header *beh, *ceh; + uint32_t *itagp, itag; + unsigned int hlen = sizeof(*beh) + sizeof(*itagp) + sizeof(*ceh); + struct mbuf *n; + int off; + + if (m->m_len < hlen) { + m = m_pullup(m, hlen); + if (m == NULL) { + /* pbb short ++ */ + return; + } + } + + beh = mtod(m, struct ether_header *); + itagp = (uint32_t *)(beh + 1); + itag = bemtoh32(itagp); + + if (itag & PBB_ITAG_RES2) { + /* dropped by res2 ++ */ + goto drop; + } + + sc = bpe_find(ifp0, itag & PBB_ITAG_ISID); + if (sc == NULL) { + /* no interface found */ + goto drop; + } + + ceh = (struct ether_header *)(itagp + 1); + + bpe_input_map(sc, beh->ether_shost, ceh->ether_shost); + + m_adj(m, sizeof(*beh) + sizeof(*itagp)); + + n = m_getptr(m, sizeof(*ceh), &off); + if (n == NULL) { + /* no data ++ */ + goto drop; + } + + if (!ALIGNED_POINTER(mtod(n, caddr_t) + off, uint32_t)) { + /* unaligned ++ */ + n = m_dup_pkt(m, ETHER_ALIGN, M_NOWAIT); + m_freem(m); + if (n == NULL) + return; + + m = n; + } + + ifp = &sc->sc_ac.ac_if; + + m->m_flags &= ~(M_BCAST|M_MCAST); + m->m_pkthdr.ph_ifidx = ifp->if_index; + m->m_pkthdr.ph_rtableid = ifp->if_rdomain; + +#if NPF > 0 + pf_pkt_addr_changed(m); +#endif + + ml_enqueue(&ml, m); + if_input(ifp, &ml); + return; + +drop: + m_freem(m); +} + +void +bpe_detach_hook(void *arg) +{ + struct bpe_softc *sc = arg; + struct ifnet *ifp = &sc->sc_ac.ac_if; + + if (ISSET(ifp->if_flags, IFF_RUNNING)) { + bpe_down(sc); + CLR(ifp->if_flags, IFF_UP); + } + + sc->sc_key.k_if = 0; +} + +static void +bpe_link_hook(void *arg) +{ + struct bpe_softc *sc = arg; + struct ifnet *ifp0; + u_char link = LINK_STATE_DOWN; + uint64_t baud = 0; + + ifp0 = if_get(sc->sc_key.k_if); + if (ifp0 != NULL) { + link = ifp0->if_link_state; + baud = ifp0->if_baudrate; + } + if_put(ifp0); + + bpe_link_state(sc, link, baud); +} + +void +bpe_link_state(struct bpe_softc *sc, u_char link, uint64_t baud) +{ + struct ifnet *ifp = &sc->sc_ac.ac_if; + + if (ifp->if_link_state == link) + return; + + ifp->if_link_state = link; + ifp->if_baudrate = baud; + + if_link_state_change(ifp); +} + +static inline int +bpe_cmp(const struct bpe_key *a, const struct bpe_key *b) +{ + if (a->k_if > b->k_if) + return (1); + if (a->k_if < b->k_if) + return (-1); + if (a->k_isid > b->k_isid) + return (1); + if (a->k_isid < b->k_isid) + return (-1); + + return (0); +} + +static inline int +bpe_entry_cmp(const struct bpe_entry *a, const struct bpe_entry *b) +{ + return memcmp(&a->be_c_da, &b->be_c_da, sizeof(a->be_c_da)); +} Index: sys/net/if_bpe.h =================================================================== RCS file: sys/net/if_bpe.h diff -N sys/net/if_bpe.h --- /dev/null 1 Jan 1970 00:00:00 -0000 +++ sys/net/if_bpe.h 19 Dec 2018 04:16:07 -0000 @@ -0,0 +1,26 @@ +/* $OpenBSD$ */ + +/* + * Copyright (c) 2018 David Gwynne + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#ifndef _NET_IF_BPE_H +#define _NET_IF_BPE_H + +#ifdef _KERNEL +void bpe_input(struct ifnet *, struct mbuf *); +#endif + +#endif /* _NET_IF_GRE_H_ */ Index: sys/net/if_ethersubr.c =================================================================== RCS file: /cvs/src/sys/net/if_ethersubr.c,v retrieving revision 1.255 diff -u -p -r1.255 if_ethersubr.c --- sys/net/if_ethersubr.c 12 Dec 2018 05:38:26 -0000 1.255 +++ sys/net/if_ethersubr.c 19 Dec 2018 04:16:07 -0000 @@ -108,6 +108,11 @@ didn't get a copy, you may request one f #include #endif +#include "bpe.h" +#if NBPE > 0 +#include +#endif + #ifdef INET6 #include #include @@ -440,6 +445,11 @@ ether_input(struct ifnet *ifp, struct mb case ETHERTYPE_MPLS_MCAST: input = mpls_input; break; +#endif +#if NBPE > 0 + case ETHERTYPE_PBB: + bpe_input(ifp, m); + return (1); #endif default: goto dropanyway; Index: sys/conf/GENERIC =================================================================== RCS file: /cvs/src/sys/conf/GENERIC,v retrieving revision 1.256 diff -u -p -r1.256 GENERIC --- sys/conf/GENERIC 18 Oct 2018 02:10:54 -0000 1.256 +++ sys/conf/GENERIC 19 Dec 2018 04:16:07 -0000 @@ -94,6 +94,7 @@ pseudo-device mobileip # MobileIP encaps pseudo-device loop # network loopback pseudo-device mpe # MPLS PE interface pseudo-device mpw # MPLS pseudowire support +pseudo-device bpe # Provider Backbone Bridge edge interface pseudo-device pair # Virtual Ethernet interface pair pseudo-device ppp # PPP pseudo-device pppoe # PPP over Ethernet (RFC 2516) Index: sys/conf/files =================================================================== RCS file: /cvs/src/sys/conf/files,v retrieving revision 1.665 diff -u -p -r1.665 files --- sys/conf/files 21 Aug 2018 18:06:12 -0000 1.665 +++ sys/conf/files 19 Dec 2018 04:16:07 -0000 @@ -563,6 +563,7 @@ pseudo-device crypto: ifnet pseudo-device trunk: ifnet, ether, ifmedia pseudo-device mpe: ifnet, ether pseudo-device mpw: ifnet, ether +pseudo-device bpe: ifnet, ether, ifmedia pseudo-device vether: ifnet, ether pseudo-device pppx: ifnet pseudo-device vxlan: ifnet, ether, ifmedia @@ -813,6 +814,7 @@ file net/if_trunk.c trunk needs-coun file net/trunklacp.c trunk file net/if_mpe.c mpe needs-count file net/if_mpw.c mpw & bridge needs-count +file net/if_bpe.c bpe needs-count file net/if_vether.c vether needs-count file net/if_pair.c pair needs-count file net/if_pppx.c pppx needs-count