----- Forwarded message from David Gwynne -----

From: David Gwynne
Subject: kstat - expose kernel statistics to userland

when playing with some multiq/interrupt/rss/toeplitz stuff recently, i couldn't tell what effect these kinds of changes had because most info is rolled up to the interface level when we make it available for userland. currently the only info we get to see is vmstat -i, and systat mbuf info. vmstat -i isn't that great cos we're handling both tx and rx from the same interrupt. systat mbuf isn't that great cos it only shows if an rx ring grows, it doesn't say how much work it is doing, and doesn't show tx info at all.

i can technically munge through the kernel with bsd.gdb, but that's very tiring to do repeatedly. i could also add some high level stats and use pstat -d to get at them, but that doesn't scale very well. i could expose some stuff with sensors, but i only feel regret every time i have to work with that api now.

i started on some ioctls to expose ifq and ifiq info to userland like a cross between the rxr info and ifidata stuff, but it's tedious work that feels like i'm going to get it wrong. it's also frustrating cos i often want to get info out of the kernel and feel like i reinvent the wheel every time, and 90% of the time i throw it away because it's specific to the thing i'm working on, which makes it hard to justify polluting userland with.

other systems have generic mechanisms for stuff. procfs and sysfs don't appeal to me though. the least worst i've come across is kstat in solaris, so i had a go at implementing an opinionated version of that based on their doco.

there are three parts:

1. a pseudo-device which sits between userland and the kernel, and provides ioctls for userland to access the info with.
2. a kernel api that is provided if the pseudo-device is configured.
3. a basic userland tool to query and print the info that the kernel provides.

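
[editor's sketch] in the diff further down, the pseudo-device's ioctls can look a kstat up by id, with an "nfind" variant that returns the first kstat at or after the requested id, and ids start at 1 so userland can begin iterating from 0. the following is only a model of how a tool might walk every kstat that way. all the demo_* names and the in-memory table are made up for illustration; it does not open the real device or use the real struct kstat_req.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's set of kstats, kept sorted by id. */
struct demo_kstat {
	uint64_t	 id;
	const char	*name;
};

static const struct demo_kstat demo_table[] = {
	{ 1, "ix0:0:rxq:0" },
	{ 2, "ix0:0:txq:0" },
	{ 5, "lo0:0:rxq:0" },	/* gaps appear once kstats are destroyed */
};

/*
 * Models an "nfind by id" lookup: return the first entry whose id is
 * greater than or equal to key, or NULL when there is none.
 */
static const struct demo_kstat *
demo_nfind_id(uint64_t key)
{
	size_t i;

	for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
		if (demo_table[i].id >= key)
			return (&demo_table[i]);
	}

	return (NULL);
}

/*
 * Walk all entries: start at key 0, and after each hit ask for id + 1.
 * Stores the ids seen in ids[] and returns how many were found.
 */
static size_t
demo_walk(uint64_t *ids, size_t max)
{
	const struct demo_kstat *ks;
	uint64_t key = 0;
	size_t n = 0;

	assert(ids != NULL);

	while (n < max && (ks = demo_nfind_id(key)) != NULL) {
		ids[n++] = ks->id;
		key = ks->id + 1;
	}

	return (n);
}
```

calling demo_walk() with room for eight ids visits 1, 2 and 5 in that order and then stops, because the lookup for 6 finds nothing. the real ioctls also carry a version number so a tool can notice when the set of kstats changed mid-walk; that part is left out of this model.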
a kstat is an arbitrary chunk of information (ie, bytes) that the kernel can provide to userland. this could just be an opaque blob of bytes, but the most useful form is a series of key/value pairs. solaris calls this the "named" data type, but it makes more sense to me as kv data.

the kstat_kv struct is richer than the equivalent named data in solaris. where solaris lets you specify values as integers of various sizes, strings, and raw bytes, kstat_kv differentiates between integer and counter types, adds null and bool types, and lets you specify a unit for integer types, like what the sensors stuff provides.

the userland tool is currently pretty basic, but has been good enough for me to validate that some of the changes to network drivers have or haven't been working.

the diff below includes the kstat code, but also changes to ifq.c so i can see what that part of the network stack is doing, and tweaks to vmx and ix so i could see what they're doing. the ix and vmx stats are based on what info i can get off the hardware.

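
[editor's sketch] to make the kv idea concrete, here is a small self-contained model of a typed key/value pair with an optional unit, and a formatter that produces lines shaped like the sample output below. the demo_* types, the field names and the "null" rendering are assumptions for illustration only; the real struct kstat_kv and its enums are defined in the diff's sys/kstat.h and differ from this.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names: a cut-down model, not the real kstat_kv. */
enum demo_kv_type { DEMO_T_NULL, DEMO_T_BOOL, DEMO_T_COUNTER64, DEMO_T_UINT64 };
enum demo_kv_unit { DEMO_U_NONE, DEMO_U_PACKETS, DEMO_U_BYTES };

struct demo_kv {
	char			key[16];
	enum demo_kv_type	type;
	enum demo_kv_unit	unit;
	union {
		bool		b;
		uint64_t	u64;
	}			v;
};

static void
demo_kv_init(struct demo_kv *kv, const char *key, enum demo_kv_type type,
    enum demo_kv_unit unit)
{
	/* units only make sense on integer-like values */
	assert(unit == DEMO_U_NONE ||
	    type == DEMO_T_COUNTER64 || type == DEMO_T_UINT64);

	memset(kv, 0, sizeof(*kv));
	snprintf(kv->key, sizeof(kv->key), "%s", key);	/* may truncate */
	kv->type = type;
	kv->unit = unit;
}

static const char *
demo_unit_name(enum demo_kv_unit unit)
{
	switch (unit) {
	case DEMO_U_PACKETS:
		return ("packets");
	case DEMO_U_BYTES:
		return ("bytes");
	default:
		return ("");
	}
}

/* Format one pair as "key: value" or "key: value unit". */
static int
demo_kv_format(const struct demo_kv *kv, char *buf, size_t len)
{
	switch (kv->type) {
	case DEMO_T_BOOL:
		return (snprintf(buf, len, "%s: %s", kv->key,
		    kv->v.b ? "true" : "false"));
	case DEMO_T_COUNTER64:
	case DEMO_T_UINT64:
		if (kv->unit == DEMO_U_NONE)
			return (snprintf(buf, len, "%s: %" PRIu64,
			    kv->key, kv->v.u64));
		return (snprintf(buf, len, "%s: %" PRIu64 " %s",
		    kv->key, kv->v.u64, demo_unit_name(kv->unit)));
	default:
		return (snprintf(buf, len, "%s: null", kv->key));
	}
}
```

a counter initialised with key "packets" and the packets unit, then set to 2039, formats as "packets: 2039 packets"; a bool with key "oactive" set to false formats as "oactive: false". keeping type and unit next to the value is what lets a generic tool print any provider's data without knowing the provider.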
kstat on a box with ix looks like this:

dlg@ix ~$ kstat
aggr0:0:rxq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
aggr0:0:txq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 256 packets
    oactive: false
    dispatch: 0
    defers: 0
enc0:0:rxq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
enc0:0:txq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 256 packets
    oactive: false
    dispatch: 0
    defers: 0
ix0:0:ix:0
    Rx CRC Errs: 0 packets
    Link XON Tx: 0 packets
    Link XON Rx: 0 packets
    Link XOFF Tx: 0 packets
    Link XOFF Rx: 0 packets
    64B Rx: 14 packets
    65-127B Rx: 1871 packets
    128-255B Rx: 83 packets
    256-511B Rx: 6 packets
    512-1023B Rx: 11 packets
    1024-MaxB Rx: 54 packets
    Good Tx: 640 packets
    Good Rx: 382021 bytes
    Good Tx: 96236 bytes
    Rx Undersize: 0 packets
    Rx Fragment: 0 packets
    Rx Oversize: 0 packets
    Rx Jabber: 0 packets
    Total Rx: 492802 bytes
    Total Rx: 3541 packets
    Total Tx: 640 packets
    Good Rx: 2039 packets
    Broadcast Rx: 22 packets
    Multicast Rx: 58 packets
    64B Tx: 39 packets
    65-127B Tx: 409 packets
    128-255B Tx: 121 packets
    256-511B Tx: 58 packets
    512-1023B Tx: 4 packets
    1024-MaxB Tx: 9 packets
    Multicast Tx: 50 packets
    Broadcast Tx: 5 packets
ix0:0:ix-rxq:0
    packets: 2039 packets
    bytes: 373865 bytes
    qdrops: 0 packets
ix0:0:ix-txq:0
    packets: 640 packets
    bytes: 93392 bytes
ix0:0:rxq:0
    packets: 2039 packets
    bytes: 373865 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
ix0:0:txq:0
    packets: 640 packets
    bytes: 93392 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 255 packets
    oactive: false
    dispatch: 1
    defers: 606
ix1:0:ix:0
    Rx CRC Errs: 0 packets
    Link XON Tx: 0 packets
    Link XON Rx: 0 packets
    Link XOFF Tx: 0 packets
    Link XOFF Rx: 0 packets
    64B Rx: 539 packets
    65-127B Rx: 10177262 packets
    128-255B Rx: 56 packets
    256-511B Rx: 61 packets
    512-1023B Rx: 5 packets
    1024-MaxB Rx: 34 packets
    Good Tx: 17560683 packets
    Good Rx: 712893366 bytes
    Good Tx: 26655742873 bytes
    Rx Undersize: 0 packets
    Rx Fragment: 0 packets
    Rx Oversize: 0 packets
    Rx Jabber: 0 packets
    Total Rx: 713049937 bytes
    Total Rx: 10179658 packets
    Total Tx: 17560683 packets
    Good Rx: 10177957 packets
    Broadcast Rx: 519 packets
    Multicast Rx: 90 packets
    64B Tx: 2 packets
    65-127B Tx: 589 packets
    128-255B Tx: 336 packets
    256-511B Tx: 54 packets
    512-1023B Tx: 22 packets
    1024-MaxB Tx: 17559680 packets
    Multicast Tx: 55 packets
    Broadcast Tx: 0 packets
ix1:0:ix-rxq:0
    packets: 10177957 packets
    bytes: 672181538 bytes
    qdrops: 0 packets
ix1:0:ix-txq:0
    packets: 17560683 packets
    bytes: 26585500134 bytes
ix1:0:rxq:0
    packets: 10177957 packets
    bytes: 672181538 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
ix1:0:txq:0
    packets: 17560683 packets
    bytes: 26585500134 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 255 packets
    oactive: false
    dispatch: 82466
    defers: 7137703
lo0:0:rxq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
lo0:0:txq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 256 packets
    oactive: false
    dispatch: 0
    defers: 0
pflog0:0:rxq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
pflog0:0:txq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 256 packets
    oactive: false
    dispatch: 0
    defers: 0
vio0:0:rxq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
vio0:0:txq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 255 packets
    oactive: false
    dispatch: 0
    defers: 0

on a box with vmx:

dlg@kbuild ~$ kstat
enc0:0:rxq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
enc0:0:txq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 256 packets
    oactive: false
lo0:0:rxq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
lo0:0:txq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 256 packets
    oactive: false
pflog0:0:rxq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
pflog0:0:txq:0
    packets: 0 packets
    bytes: 0 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 256 packets
    oactive: false
vmx0:0:rxq:0
    packets: 2869 packets
    bytes: 1021612 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
vmx0:0:rxq:1
    packets: 474 packets
    bytes: 53537 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
vmx0:0:rxq:2
    packets: 481 packets
    bytes: 59467 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
vmx0:0:rxq:3
    packets: 17554893 packets
    bytes: 26577199846 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
vmx0:0:txq:0
    packets: 343 packets
    bytes: 64226 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 512 packets
    oactive: false
vmx0:0:txq:1
    packets: 1927 packets
    bytes: 837752 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 512 packets
    oactive: false
vmx0:0:txq:2
    packets: 535 packets
    bytes: 74533 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 512 packets
    oactive: false
vmx0:0:txq:3
    packets: 10177527 packets
    bytes: 671887185 bytes
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 512 packets
    oactive: false
vmx0:0:vmx-rxstats:0
    LRO packets: 0 packets
    LRO bytes: 0 bytes
    ucast packets: 1823 packets
    ucast bytes: 164300 bytes
    mcast packets: 0 packets
    mcast bytes: 0 bytes
    bcast packets: 1046 packets
    bcast bytes: 853984 bytes
    no buffers: 0 packets
    errors: 0 packets
vmx0:0:vmx-rxstats:1
    LRO packets: 0 packets
    LRO bytes: 0 bytes
    ucast packets: 474 packets
    ucast bytes: 53505 bytes
    mcast packets: 0 packets
    mcast bytes: 0 bytes
    bcast packets: 0 packets
    bcast bytes: 0 bytes
    no buffers: 0 packets
    errors: 0 packets
vmx0:0:vmx-rxstats:2
    LRO packets: 0 packets
    LRO bytes: 0 bytes
    ucast packets: 481 packets
    ucast bytes: 59447 bytes
    mcast packets: 0 packets
    mcast bytes: 0 bytes
    bcast packets: 0 packets
    bcast bytes: 0 bytes
    no buffers: 0 packets
    errors: 0 packets
vmx0:0:vmx-rxstats:3
    LRO packets: 0 packets
    LRO bytes: 0 bytes
    ucast packets: 17554893 packets
    ucast bytes: 26577199810 bytes
    mcast packets: 0 packets
    mcast bytes: 0 bytes
    bcast packets: 0 packets
    bcast bytes: 0 bytes
    no buffers: 5431 packets
    errors: 0 packets
vmx0:0:vmx-txstats:0
    TSO packets: 0 packets
    TSO bytes: 0 bytes
    ucast packets: 339 packets
    ucast bytes: 64058 bytes
    mcast packets: 0 packets
    mcast bytes: 0 bytes
    bcast packets: 4 packets
    bcast bytes: 168 bytes
    errors: 0 packets
    discards: 0 packets
vmx0:0:vmx-txstats:1
    TSO packets: 0 packets
    TSO bytes: 0 bytes
    ucast packets: 1927 packets
    ucast bytes: 837752 bytes
    mcast packets: 0 packets
    mcast bytes: 0 bytes
    bcast packets: 0 packets
    bcast bytes: 0 bytes
    errors: 0 packets
    discards: 0 packets
vmx0:0:vmx-txstats:2
    TSO packets: 0 packets
    TSO bytes: 0 bytes
    ucast packets: 535 packets
    ucast bytes: 74533 bytes
    mcast packets: 0 packets
    mcast bytes: 0 bytes
    bcast packets: 0 packets
    bcast bytes: 0 bytes
    errors: 0 packets
    discards: 0 packets
vmx0:0:vmx-txstats:3
    TSO packets: 0 packets
    TSO bytes: 0 bytes
    ucast packets: 10177527 packets
    ucast bytes: 671887185 bytes
    mcast packets: 0 packets
    mcast bytes: 0 bytes
    bcast packets: 0 packets
    bcast bytes: 0 bytes
    errors: 0 packets
    discards: 0 packets

if you want a better feel for how it works, i suggest reading:

- https://illumos.org/man/9F/kstat_create#description
- https://illumos.org/man/9s/kstat
- https://github.com/illumos/illumos-gate/blob/4e0c5eff9af325c80994e9527b7cb8b3a1ffd1d4/usr/src/uts/common/sys/kstat.h#L131-L141
- https://github.com/illumos/illumos-gate/blob/4e0c5eff9af325c80994e9527b7cb8b3a1ffd1d4/usr/src/uts/common/sys/kstat.h#L177-L183
- 
https://github.com/illumos/illumos-gate/blob/4e0c5eff9af325c80994e9527b7cb8b3a1ffd1d4/usr/src/uts/common/sys/kstat.h#L360-L375 anyway, i'll put my asbestos undies on now. thoughts? Index: sys/arch/amd64/amd64/conf.c =================================================================== RCS file: /cvs/src/sys/arch/amd64/amd64/conf.c,v retrieving revision 1.70 diff -u -p -r1.70 conf.c --- sys/arch/amd64/amd64/conf.c 25 May 2020 06:37:52 -0000 1.70 +++ sys/arch/amd64/amd64/conf.c 24 Jun 2020 06:04:29 -0000 @@ -142,6 +142,7 @@ cdev_decl(cy); #include "pctr.h" #include "bktr.h" #include "ksyms.h" +#include "kstat.h" #include "usb.h" #include "uhid.h" #include "fido.h" @@ -238,7 +239,7 @@ struct cdevsw cdevsw[] = cdev_notdef(), /* 48 */ cdev_bktr_init(NBKTR,bktr), /* 49: Bt848 video capture device */ cdev_ksyms_init(NKSYMS,ksyms), /* 50: Kernel symbols device */ - cdev_notdef(), /* 51 */ + cdev_kstat_init(NKSTAT,kstat), /* 51: Kernel statistics */ cdev_midi_init(NMIDI,midi), /* 52: MIDI I/O */ cdev_notdef(), /* 53 was: sequencer I/O */ cdev_notdef(), /* 54 was: RAIDframe disk driver */ Index: sys/conf/GENERIC =================================================================== RCS file: /cvs/src/sys/conf/GENERIC,v retrieving revision 1.271 diff -u -p -r1.271 GENERIC --- sys/conf/GENERIC 23 Jun 2020 23:35:39 -0000 1.271 +++ sys/conf/GENERIC 24 Jun 2020 06:04:29 -0000 @@ -82,6 +82,7 @@ pseudo-device msts 1 # MSTS line discipl pseudo-device endrun 1 # EndRun line discipline pseudo-device vnd 4 # vnode disk devices pseudo-device ksyms 1 # kernel symbols device +pseudo-device kstat 1 # kernel statistics #pseudo-device dt # Dynamic Tracer # clonable devices Index: sys/conf/files =================================================================== RCS file: /cvs/src/sys/conf/files,v retrieving revision 1.689 diff -u -p -r1.689 files --- sys/conf/files 21 Jun 2020 12:14:48 -0000 1.689 +++ sys/conf/files 24 Jun 2020 06:04:29 -0000 @@ -573,6 +573,9 @@ pseudo-device wg: ifnet 
pseudo-device ksyms file dev/ksyms.c ksyms needs-flag +pseudo-device kstat +file dev/kstat.c kstat needs-flag + pseudo-device fuse file miscfs/fuse/fuse_device.c fuse needs-flag file miscfs/fuse/fuse_file.c fuse Index: sys/dev/kstat.c =================================================================== RCS file: sys/dev/kstat.c diff -N sys/dev/kstat.c --- /dev/null 1 Jan 1970 00:00:00 -0000 +++ sys/dev/kstat.c 24 Jun 2020 06:04:29 -0000 @@ -0,0 +1,691 @@ +/* $OpenBSD$ */ + +/* + * Copyright (c) 2020 David Gwynne + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 
+ */ + +#include +#include +#include +#include +#include +#include + +/* for kstat_set_cpu */ +#include +#include + +#include + +RBT_HEAD(kstat_id_tree, kstat); + +static inline int +kstat_id_cmp(const struct kstat *a, const struct kstat *b) +{ + if (a->ks_id > b->ks_id) + return (1); + if (a->ks_id < b->ks_id) + return (-1); + + return (0); +} + +RBT_PROTOTYPE(kstat_id_tree, kstat, ks_id_entry, kstat_id_cmp); + +RBT_HEAD(kstat_pv_tree, kstat); + +static inline int +kstat_pv_cmp(const struct kstat *a, const struct kstat *b) +{ + int rv; + + rv = strcmp(a->ks_provider, b->ks_provider); + if (rv != 0) + return (rv); + + if (a->ks_instance > b->ks_instance) + return (1); + if (a->ks_instance < b->ks_instance) + return (-1); + + rv = strcmp(a->ks_name, b->ks_name); + if (rv != 0) + return (rv); + + if (a->ks_unit > b->ks_unit) + return (1); + if (a->ks_unit < b->ks_unit) + return (-1); + + return (0); +} + +RBT_PROTOTYPE(kstat_pv_tree, kstat, ks_pv_entry, kstat_pv_cmp); + +RBT_HEAD(kstat_nm_tree, kstat); + +static inline int +kstat_nm_cmp(const struct kstat *a, const struct kstat *b) +{ + int rv; + + rv = strcmp(a->ks_name, b->ks_name); + if (rv != 0) + return (rv); + + if (a->ks_unit > b->ks_unit) + return (1); + if (a->ks_unit < b->ks_unit) + return (-1); + + rv = strcmp(a->ks_provider, b->ks_provider); + if (rv != 0) + return (rv); + + if (a->ks_instance > b->ks_instance) + return (1); + if (a->ks_instance < b->ks_instance) + return (-1); + + return (0); +} + +RBT_PROTOTYPE(kstat_nm_tree, kstat, ks_nm_entry, kstat_nm_cmp); + +struct kstat_lock_ops { + void (*enter)(void *); + void (*leave)(void *); +}; + +#define kstat_enter(_ks) (_ks)->ks_lock_ops->enter((_ks)->ks_lock) +#define kstat_leave(_ks) (_ks)->ks_lock_ops->leave((_ks)->ks_lock) + +static const struct kstat_lock_ops kstat_rlock_ops = { + (void (*)(void *))rw_enter_read, + (void (*)(void *))rw_exit_read, +}; + +static const struct kstat_lock_ops kstat_wlock_ops = { + (void (*)(void *))rw_enter_write, + (void 
(*)(void *))rw_exit_write, +}; + +static const struct kstat_lock_ops kstat_mutex_ops = { + (void (*)(void *))mtx_enter, + (void (*)(void *))mtx_leave, +}; + +static void kstat_cpu_enter(void *); +static void kstat_cpu_leave(void *); + +static const struct kstat_lock_ops kstat_cpu_ops = { + kstat_cpu_enter, + kstat_cpu_leave, +}; + +static struct rwlock kstat_lock = RWLOCK_INITIALIZER("kstat"); + +/* + * The global state is versioned so changes to the set of kstats + * can be detected. This is an int so it can be read atomically on + * any arch, which is ridiculous optimisation, really. + */ +static unsigned int kstat_version = 0; + +/* + * kstat structures have a unique identifier so they can be found + * quickly. Identifiers are 64bit in the hope that it won't wrap + * during the runtime of a system. The identifiers start at 1 so that + * 0 can be used as the first value for userland to iterate with. + */ +static uint64_t kstat_next_id = 1; + +static struct kstat_id_tree kstat_id_tree = RBT_INITIALIZER(); +static struct kstat_pv_tree kstat_pv_tree = RBT_INITIALIZER(); +static struct kstat_nm_tree kstat_nm_tree = RBT_INITIALIZER(); +static struct pool kstat_pool; + +static struct rwlock kstat_default_lock = + RWLOCK_INITIALIZER("kstatlk"); + +static int kstat_read(struct kstat *); +static int kstat_copy(struct kstat *, void *); + +int +kstatattach(int num) +{ + /* XXX install system stats here */ + return (0); +} + +int +kstatopen(dev_t dev, int flag, int mode, struct proc *p) +{ + return (0); +} + +int +kstatclose(dev_t dev, int flag, int mode, struct proc *p) +{ + return (0); +} + +static int +kstatioc_enter(struct kstat_req *ksreq) +{ + int error; + + error = rw_enter(&kstat_lock, RW_READ | RW_INTR); + if (error != 0) + return (error); + + if (!ISSET(ksreq->ks_rflags, KSTATIOC_F_IGNVER) && + ksreq->ks_version != kstat_version) { + error = EBUSY; + goto error; + } + + return (0); + +error: + rw_exit(&kstat_lock); + return (error); +} + +#define sstrlcpy(_dst, 
_src) \ + (strlcpy((_dst), (_src), sizeof((_dst))) >= sizeof((_dst))) + +static int +kstatioc_leave(struct kstat_req *ksreq, struct kstat *ks) +{ + void *buf = NULL; + size_t klen = 0, ulen = 0; + struct timespec updated; + int error = 0; + + if (ks == NULL) { + error = ENOENT; + goto error; + } + + switch (ks->ks_state) { + case KSTAT_S_CREATED: + ksreq->ks_updated = ks->ks_created; + ksreq->ks_interval.tv_sec = 0; + ksreq->ks_interval.tv_nsec = 0; + ksreq->ks_datalen = 0; + ksreq->ks_dataver = 0; + break; + + case KSTAT_S_INSTALLED: + ksreq->ks_dataver = ks->ks_dataver; + ksreq->ks_interval = ks->ks_interval; + + if (ksreq->ks_data == NULL) { + /* userland doesn't want actual data, so shortcut */ + kstat_enter(ks); + ksreq->ks_datalen = ks->ks_datalen; + ksreq->ks_updated = ks->ks_updated; + kstat_leave(ks); + break; + } + + klen = ks->ks_datalen; /* KSTAT_F_REALLOC */ + buf = malloc(klen, M_TEMP, M_WAITOK|M_CANFAIL); + if (buf == NULL) { + error = ENOMEM; + goto error; + } + + kstat_enter(ks); + error = (*ks->ks_read)(ks); + if (error == 0) { + updated = ks->ks_updated; + + /* KSTAT_F_REALLOC */ + KASSERTMSG(ks->ks_datalen == klen, + "kstat doesnt support resized data yet"); + + error = (*ks->ks_copy)(ks, buf); + } + kstat_leave(ks); + + if (error != 0) + goto error; + + ulen = ksreq->ks_datalen; + ksreq->ks_datalen = klen; /* KSTAT_F_REALLOC */ + ksreq->ks_updated = updated; + break; + default: + panic("ks %p unexpected state %u", ks, ks->ks_state); + } + + ksreq->ks_version = kstat_version; + ksreq->ks_id = ks->ks_id; + + if (sstrlcpy(ksreq->ks_provider, ks->ks_provider) != 0) + panic("kstat provider string has grown"); + ksreq->ks_instance = ks->ks_instance; + if (sstrlcpy(ksreq->ks_name, ks->ks_name) != 0) + panic("kstat name string has grown"); + ksreq->ks_unit = ks->ks_unit; + + ksreq->ks_created = ks->ks_created; + ksreq->ks_type = ks->ks_type; + ksreq->ks_state = ks->ks_state; + +error: + rw_exit(&kstat_lock); + + if (buf != NULL) { + if (error == 0) + 
error = copyout(buf, ksreq->ks_data, min(klen, ulen)); + + free(buf, M_TEMP, klen); + } + + return (error); +} + +static int +kstatioc_find_id(struct kstat_req *ksreq) +{ + struct kstat *ks, key; + int error; + + error = kstatioc_enter(ksreq); + if (error != 0) + return (error); + + key.ks_id = ksreq->ks_id; + + ks = RBT_FIND(kstat_id_tree, &kstat_id_tree, &key); + + return (kstatioc_leave(ksreq, ks)); +} + +static int +kstatioc_nfind_id(struct kstat_req *ksreq) +{ + struct kstat *ks, key; + int error; + + error = kstatioc_enter(ksreq); + if (error != 0) + return (error); + + key.ks_id = ksreq->ks_id; + + ks = RBT_NFIND(kstat_id_tree, &kstat_id_tree, &key); + + return (kstatioc_leave(ksreq, ks)); +} + +static int +kstatioc_find_pv(struct kstat_req *ksreq) +{ + struct kstat *ks, key; + int error; + + error = kstatioc_enter(ksreq); + if (error != 0) + return (error); + + key.ks_provider = ksreq->ks_provider; + key.ks_instance = ksreq->ks_instance; + key.ks_name = ksreq->ks_name; + key.ks_unit = ksreq->ks_unit; + + ks = RBT_FIND(kstat_pv_tree, &kstat_pv_tree, &key); + + return (kstatioc_leave(ksreq, ks)); +} + +static int +kstatioc_nfind_pv(struct kstat_req *ksreq) +{ + struct kstat *ks, key; + int error; + + error = kstatioc_enter(ksreq); + if (error != 0) + return (error); + + key.ks_provider = ksreq->ks_provider; + key.ks_instance = ksreq->ks_instance; + key.ks_name = ksreq->ks_name; + key.ks_unit = ksreq->ks_unit; + + ks = RBT_NFIND(kstat_pv_tree, &kstat_pv_tree, &key); + + return (kstatioc_leave(ksreq, ks)); +} + +static int +kstatioc_find_nm(struct kstat_req *ksreq) +{ + struct kstat *ks, key; + int error; + + error = kstatioc_enter(ksreq); + if (error != 0) + return (error); + + key.ks_name = ksreq->ks_name; + key.ks_unit = ksreq->ks_unit; + key.ks_provider = ksreq->ks_provider; + key.ks_instance = ksreq->ks_instance; + + ks = RBT_FIND(kstat_nm_tree, &kstat_nm_tree, &key); + + return (kstatioc_leave(ksreq, ks)); +} + +static int +kstatioc_nfind_nm(struct 
kstat_req *ksreq) +{ + struct kstat *ks, key; + int error; + + error = kstatioc_enter(ksreq); + if (error != 0) + return (error); + + key.ks_name = ksreq->ks_name; + key.ks_unit = ksreq->ks_unit; + key.ks_provider = ksreq->ks_provider; + key.ks_instance = ksreq->ks_instance; + + ks = RBT_NFIND(kstat_nm_tree, &kstat_nm_tree, &key); + + return (kstatioc_leave(ksreq, ks)); +} + +int +kstatioctl(dev_t dev, u_long cmd, caddr_t data, int flag, struct proc *p) +{ + struct kstat_req *ksreq = (struct kstat_req *)data; + int error = 0; + + KERNEL_UNLOCK(); + + switch (cmd) { + case KSTATIOC_VERSION: + *(unsigned int *)data = kstat_version; + break; + + case KSTATIOC_FIND_ID: + error = kstatioc_find_id(ksreq); + break; + case KSTATIOC_NFIND_ID: + error = kstatioc_nfind_id(ksreq); + break; + case KSTATIOC_FIND_PROVIDER: + error = kstatioc_find_pv(ksreq); + break; + case KSTATIOC_NFIND_PROVIDER: + error = kstatioc_nfind_pv(ksreq); + break; + case KSTATIOC_FIND_NAME: + error = kstatioc_find_nm(ksreq); + break; + case KSTATIOC_NFIND_NAME: + error = kstatioc_nfind_nm(ksreq); + break; + + default: + error = ENOTTY; + break; + } + + KERNEL_LOCK(); + + return (error); +} + +static void +kstat_init(void) +{ + static int initialized = 0; + + if (initialized) + return; + + pool_init(&kstat_pool, sizeof(struct kstat), 0, IPL_NONE, + PR_WAITOK | PR_RWLOCK, "kstatmem", NULL); + + initialized = 1; +} + +static int +kstat_strcheck(const char *str) +{ + size_t i, l; + + l = strlen(str); + if (l == 0 || l >= KSTAT_STRLEN) + return (-1); + for (i = 0; i < l; i++) { + int ch = str[i]; + if (ch >= 'a' && ch <= 'z') + continue; + if (ch >= 'A' && ch <= 'Z') + continue; + if (ch >= '0' && ch <= '9') + continue; + switch (ch) { + case '-': + case '_': + case '.': + break; + default: + return (-1); + } + } + + return (0); +} + +struct kstat * +kstat_create(const char *provider, unsigned int instance, + const char *name, unsigned int unit, + unsigned int type, unsigned int flags) +{ + struct kstat 
*ks, *oks; + + if (kstat_strcheck(provider) == -1) + panic("invalid provider string"); + if (kstat_strcheck(name) == -1) + panic("invalid name string"); + + kstat_init(); + + ks = pool_get(&kstat_pool, PR_WAITOK|PR_ZERO); + + ks->ks_provider = provider; + ks->ks_instance = instance; + ks->ks_name = name; + ks->ks_unit = unit; + ks->ks_flags = flags; + ks->ks_type = type; + ks->ks_state = KSTAT_S_CREATED; + + getnanouptime(&ks->ks_created); + ks->ks_updated = ks->ks_created; + + ks->ks_lock = &kstat_default_lock; + ks->ks_lock_ops = &kstat_wlock_ops; + ks->ks_read = kstat_read; + ks->ks_copy = kstat_copy; + + rw_enter_write(&kstat_lock); + ks->ks_id = kstat_next_id; + + oks = RBT_INSERT(kstat_pv_tree, &kstat_pv_tree, ks); + if (oks == NULL) { + /* commit */ + kstat_next_id++; + kstat_version++; + + oks = RBT_INSERT(kstat_nm_tree, &kstat_nm_tree, ks); + if (oks != NULL) + panic("kstat name collision! (%llu)", ks->ks_id); + + oks = RBT_INSERT(kstat_id_tree, &kstat_id_tree, ks); + if (oks != NULL) + panic("kstat id collision! 
(%llu)", ks->ks_id); + } + rw_exit_write(&kstat_lock); + + if (oks != NULL) { + pool_put(&kstat_pool, ks); + return (NULL); + } + + return (ks); +} + +void +kstat_set_rlock(struct kstat *ks, struct rwlock *rwl) +{ + KASSERT(ks->ks_state == KSTAT_S_CREATED); + + ks->ks_lock = rwl; + ks->ks_lock_ops = &kstat_rlock_ops; +} + +void +kstat_set_wlock(struct kstat *ks, struct rwlock *rwl) +{ + KASSERT(ks->ks_state == KSTAT_S_CREATED); + + ks->ks_lock = rwl; + ks->ks_lock_ops = &kstat_wlock_ops; +} + +void +kstat_set_mutex(struct kstat *ks, struct mutex *mtx) +{ + KASSERT(ks->ks_state == KSTAT_S_CREATED); + + ks->ks_lock = mtx; + ks->ks_lock_ops = &kstat_mutex_ops; +} + +static void +kstat_cpu_enter(void *p) +{ + struct cpu_info *ci = p; + sched_peg_curproc(ci); +} + +static void +kstat_cpu_leave(void *p) +{ + atomic_clearbits_int(&curproc->p_flag, P_CPUPEG); +} + +void +kstat_set_cpu(struct kstat *ks, struct cpu_info *ci) +{ + KASSERT(ks->ks_state == KSTAT_S_CREATED); + + ks->ks_lock = ci; + ks->ks_lock_ops = &kstat_cpu_ops; +} + +int +kstat_read_nop(struct kstat *ks) +{ + return (0); +} + +void +kstat_install(struct kstat *ks) +{ + if (!ISSET(ks->ks_flags, KSTAT_F_REALLOC)) { + KASSERTMSG(ks->ks_copy != NULL || ks->ks_data != NULL, + "kstat %s:%u:%s:%u must provide ks_copy or ks_data", + ks->ks_provider, ks->ks_instance, ks->ks_name, ks->ks_unit); + KASSERT(ks->ks_datalen > 0); + } + + rw_enter_write(&kstat_lock); + ks->ks_state = KSTAT_S_INSTALLED; + rw_exit_write(&kstat_lock); +} + +void +kstat_destroy(struct kstat *ks) +{ + rw_enter_write(&kstat_lock); + RBT_REMOVE(kstat_id_tree, &kstat_id_tree, ks); + RBT_REMOVE(kstat_pv_tree, &kstat_pv_tree, ks); + RBT_REMOVE(kstat_nm_tree, &kstat_nm_tree, ks); + kstat_version++; + rw_exit_write(&kstat_lock); + + pool_put(&kstat_pool, ks); +} + +static int +kstat_read(struct kstat *ks) +{ + getnanouptime(&ks->ks_updated); + return (0); +} + +static int +kstat_copy(struct kstat *ks, void *buf) +{ + memcpy(buf, ks->ks_data, 
ks->ks_datalen); + return (0); +} + +RBT_GENERATE(kstat_id_tree, kstat, ks_id_entry, kstat_id_cmp); +RBT_GENERATE(kstat_pv_tree, kstat, ks_pv_entry, kstat_pv_cmp); +RBT_GENERATE(kstat_nm_tree, kstat, ks_nm_entry, kstat_nm_cmp); + +void +kstat_kv_init(struct kstat_kv *kv, const char *name, enum kstat_kv_type type) +{ + memset(kv, 0, sizeof(*kv)); + strlcpy(kv->kv_key, name, sizeof(kv->kv_key)); /* XXX truncated? */ + kv->kv_type = type; + kv->kv_unit = KSTAT_KV_U_NONE; +} + +void +kstat_kv_unit_init(struct kstat_kv *kv, const char *name, + enum kstat_kv_type type, enum kstat_kv_unit unit) +{ + switch (type) { + case KSTAT_KV_T_COUNTER64: + case KSTAT_KV_T_COUNTER32: + case KSTAT_KV_T_UINT64: + case KSTAT_KV_T_INT64: + case KSTAT_KV_T_UINT32: + case KSTAT_KV_T_INT32: + break; + default: + panic("kv unit init %s: unit for non-integer type", name); + } + + memset(kv, 0, sizeof(*kv)); + strlcpy(kv->kv_key, name, sizeof(kv->kv_key)); /* XXX truncated? */ + kv->kv_type = type; + kv->kv_unit = unit; +} Index: sys/dev/pci/if_vmx.c =================================================================== RCS file: /cvs/src/sys/dev/pci/if_vmx.c,v retrieving revision 1.59 diff -u -p -r1.59 if_vmx.c --- sys/dev/pci/if_vmx.c 17 Jun 2020 07:08:39 -0000 1.59 +++ sys/dev/pci/if_vmx.c 24 Jun 2020 06:04:29 -0000 @@ -17,6 +17,7 @@ */ #include "bpfilter.h" +#include "kstat.h" #include #include @@ -26,6 +27,7 @@ #include #include #include +#include #include #include @@ -92,20 +94,48 @@ struct vmxnet3_comp_ring { u_int32_t gen; }; +struct vmx_txstats_kv { + struct kstat_kv tso_packets; + struct kstat_kv tso_bytes; + struct kstat_kv ucast_packets; + struct kstat_kv ucast_bytes; + struct kstat_kv mcast_packets; + struct kstat_kv mcast_bytes; + struct kstat_kv bcast_packets; + struct kstat_kv bcast_bytes; + struct kstat_kv errors; + struct kstat_kv discards; +}; + struct vmxnet3_txqueue { struct vmxnet3_softc *sc; /* sigh */ struct vmxnet3_txring cmd_ring; struct vmxnet3_comp_ring comp_ring; 
struct vmxnet3_txq_shared *ts; struct ifqueue *ifq; + struct kstat *txkstat; } __aligned(64); +struct vmx_rxstats_kv { + struct kstat_kv lro_packets; + struct kstat_kv lro_bytes; + struct kstat_kv ucast_packets; + struct kstat_kv ucast_bytes; + struct kstat_kv mcast_packets; + struct kstat_kv mcast_bytes; + struct kstat_kv bcast_packets; + struct kstat_kv bcast_bytes; + struct kstat_kv nobuffers; + struct kstat_kv errors; + }; + struct vmxnet3_rxqueue { struct vmxnet3_softc *sc; /* sigh */ struct vmxnet3_rxring cmd_ring[2]; struct vmxnet3_comp_ring comp_ring; struct vmxnet3_rxq_shared *rs; struct ifiqueue *ifiq; + struct kstat *rxkstat; } __aligned(64); struct vmxnet3_queue { @@ -117,6 +147,14 @@ struct vmxnet3_queue { int intr; }; +struct vmx_kstats { + struct rwlock lock; + struct timeval updated; + + struct vmx_txstats_kv txstats; + struct vmx_rxstats_kv rxstats; +}; + struct vmxnet3_softc { struct device sc_dev; struct arpcom sc_arpcom; @@ -136,24 +174,11 @@ struct vmxnet3_softc { struct vmxnet3_driver_shared *sc_ds; u_int8_t *sc_mcast; struct vmxnet3_upt1_rss_conf *sc_rss; -}; - -#define VMXNET3_STAT -#ifdef VMXNET3_STAT -struct { - u_int ntxdesc; - u_int nrxdesc; - u_int txhead; - u_int txdone; - u_int maxtxlen; - u_int rxdone; - u_int rxfill; - u_int intr; -} vmxstat = { - NTXDESC, NRXDESC -}; +#if NKSTAT > 0 + struct vmx_kstats sc_kstats; #endif +}; #define JUMBO_LEN (1024 * 9) #define DMAADDR(map) ((map)->dm_segs[0].ds_addr) @@ -202,6 +227,14 @@ void vmxnet3_media_status(struct ifnet * int vmxnet3_media_change(struct ifnet *); void *vmxnet3_dma_allocmem(struct vmxnet3_softc *, u_int, u_int, bus_addr_t *); +#if NKSTAT > 0 +static void vmx_kstat_init(struct vmxnet3_softc *); +static void vmx_kstat_txstats(struct vmxnet3_softc *, + struct vmxnet3_txqueue *, int); +static void vmx_kstat_rxstats(struct vmxnet3_softc *, + struct vmxnet3_rxqueue *, int); +#endif /* NKSTAT > 0 */ + const struct pci_matchid vmx_devices[] = { { PCI_VENDOR_VMWARE, 
PCI_PRODUCT_VMWARE_NET_3 } }; @@ -323,9 +356,9 @@ vmxnet3_attach(struct device *parent, st snprintf(q->intrname, sizeof(q->intrname), "%s:%d", self->dv_xname, i); /* this should be pci_intr_establish_cpu */ - q->ih = pci_intr_establish(pa->pa_pc, ih, + q->ih = pci_intr_establish_cpu(pa->pa_pc, ih, IPL_NET | IPL_MPSAFE, - /* intrmap_cpu(sc->sc_intrmap, i), */ + intrmap_cpu(sc->sc_intrmap, i), vmxnet3_intr_queue, q, q->intrname); q->intr = vec; @@ -389,10 +422,20 @@ vmxnet3_attach(struct device *parent, st if_attach_queues(ifp, sc->sc_nqueues); if_attach_iqueues(ifp, sc->sc_nqueues); + +#if NKSTAT > 0 + vmx_kstat_init(sc); +#endif + for (i = 0; i < sc->sc_nqueues; i++) { ifp->if_ifqs[i]->ifq_softc = &sc->sc_q[i].tx; sc->sc_q[i].tx.ifq = ifp->if_ifqs[i]; sc->sc_q[i].rx.ifiq = ifp->if_iqs[i]; + +#if NKSTAT > 0 + vmx_kstat_txstats(sc, &sc->sc_q[i].tx, i); + vmx_kstat_rxstats(sc, &sc->sc_q[i].rx, i); +#endif } } @@ -1024,9 +1067,6 @@ vmxnet3_rxintr(struct vmxnet3_softc *sc, ml_enqueue(&ml, m); skip_buffer: -#ifdef VMXNET3_STAT - vmxstat.rxdone = idx; -#endif if (rq->rs->update_rxhead) { u_int qid = letoh32((rxcd->rxc_word0 >> VMXNET3_RXC_QID_S) & VMXNET3_RXC_QID_M); @@ -1424,3 +1464,154 @@ vmxnet3_dma_allocmem(struct vmxnet3_soft bus_dmamap_destroy(t, map); return va; } + +#if NKSTAT > 0 +static const struct timeval vmx_kstat_rate = { 1, 0 }; + +static void +vmx_kstat_init(struct vmxnet3_softc *sc) +{ + struct vmx_txstats_kv *txkvs = &sc->sc_kstats.txstats; + struct vmx_rxstats_kv *rxkvs = &sc->sc_kstats.rxstats; + + rw_init(&sc->sc_kstats.lock, "vmxstats"); + + kstat_kv_unit_init(&txkvs->tso_packets, "TSO packets", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS); + kstat_kv_unit_init(&txkvs->tso_bytes, "TSO bytes", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_BYTES); + kstat_kv_unit_init(&txkvs->ucast_packets, "ucast packets", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS); + kstat_kv_unit_init(&txkvs->ucast_bytes, "ucast bytes", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_BYTES); + 
kstat_kv_unit_init(&txkvs->mcast_packets, "mcast packets", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS); + kstat_kv_unit_init(&txkvs->mcast_bytes, "mcast bytes", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_BYTES); + kstat_kv_unit_init(&txkvs->bcast_packets, "bcast packets", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS); + kstat_kv_unit_init(&txkvs->bcast_bytes, "bcast bytes", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_BYTES); + kstat_kv_unit_init(&txkvs->errors, "errors", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS); + kstat_kv_unit_init(&txkvs->discards, "discards", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS); + + kstat_kv_unit_init(&rxkvs->lro_packets, "LRO packets", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS); + kstat_kv_unit_init(&rxkvs->lro_bytes, "LRO bytes", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_BYTES); + kstat_kv_unit_init(&rxkvs->ucast_packets, "ucast packets", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS); + kstat_kv_unit_init(&rxkvs->ucast_bytes, "ucast bytes", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_BYTES); + kstat_kv_unit_init(&rxkvs->mcast_packets, "mcast packets", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS); + kstat_kv_unit_init(&rxkvs->mcast_bytes, "mcast bytes", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_BYTES); + kstat_kv_unit_init(&rxkvs->bcast_packets, "bcast packets", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS); + kstat_kv_unit_init(&rxkvs->bcast_bytes, "bcast bytes", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_BYTES); + kstat_kv_unit_init(&rxkvs->nobuffers, "no buffers", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS); + kstat_kv_unit_init(&rxkvs->errors, "errors", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS); +} + +static int +vmx_txstats_read(struct kstat *ks) +{ + struct vmxnet3_txqueue *tq = ks->ks_softc; + struct vmxnet3_softc *sc = tq->sc; + struct vmx_txstats_kv *txkvs = ks->ks_data; + struct UPT1_TxStats *txstats = &tq->ts->stats; + + if (ratecheck(&sc->sc_kstats.updated, &vmx_kstat_rate)) + WRITE_CMD(sc, VMXNET3_CMD_GET_STATS); + + txkvs->tso_packets.kv_v.v_u64 = txstats->TSO_packets; + 
txkvs->tso_bytes.kv_v.v_u64 = txstats->TSO_bytes; + txkvs->ucast_packets.kv_v.v_u64 = txstats->ucast_packets; + txkvs->ucast_bytes.kv_v.v_u64 = txstats->ucast_bytes; + txkvs->mcast_packets.kv_v.v_u64 = txstats->mcast_packets; + txkvs->mcast_bytes.kv_v.v_u64 = txstats->mcast_bytes; + txkvs->bcast_packets.kv_v.v_u64 = txstats->bcast_packets; + txkvs->bcast_bytes.kv_v.v_u64 = txstats->bcast_bytes; + txkvs->errors.kv_v.v_u64 = txstats->error; + txkvs->discards.kv_v.v_u64 = txstats->discard; + + TIMEVAL_TO_TIMESPEC(&sc->sc_kstats.updated, &ks->ks_updated); + + return (0); +} + +static void +vmx_kstat_txstats(struct vmxnet3_softc *sc, struct vmxnet3_txqueue *tq, int i) +{ + tq->sc = sc; + + tq->txkstat = kstat_create(sc->sc_dev.dv_xname, 0, "vmx-txstats", i, + KSTAT_T_KV, 0); + if (tq->txkstat == NULL) + return; + + kstat_set_wlock(tq->txkstat, &sc->sc_kstats.lock); + + tq->txkstat->ks_softc = tq; + tq->txkstat->ks_data = &sc->sc_kstats.txstats; + tq->txkstat->ks_datalen = sizeof(sc->sc_kstats.txstats); + tq->txkstat->ks_read = vmx_txstats_read; + TIMEVAL_TO_TIMESPEC(&vmx_kstat_rate, &tq->txkstat->ks_interval); + + kstat_install(tq->txkstat); +} + +static int +vmx_rxstats_read(struct kstat *ks) +{ + struct vmxnet3_rxqueue *rq = ks->ks_softc; + struct vmxnet3_softc *sc = rq->sc; + struct vmx_rxstats_kv *rxkvs = ks->ks_data; + struct UPT1_RxStats *rxstats = &rq->rs->stats; + + if (ratecheck(&sc->sc_kstats.updated, &vmx_kstat_rate)) + WRITE_CMD(sc, VMXNET3_CMD_GET_STATS); + + rxkvs->lro_packets.kv_v.v_u64 = rxstats->LRO_packets; + rxkvs->lro_bytes.kv_v.v_u64 = rxstats->LRO_bytes; + rxkvs->ucast_packets.kv_v.v_u64 = rxstats->ucast_packets; + rxkvs->ucast_bytes.kv_v.v_u64 = rxstats->ucast_bytes; + rxkvs->mcast_packets.kv_v.v_u64 = rxstats->mcast_packets; + rxkvs->mcast_bytes.kv_v.v_u64 = rxstats->mcast_bytes; + rxkvs->bcast_packets.kv_v.v_u64 = rxstats->bcast_packets; + rxkvs->bcast_bytes.kv_v.v_u64 = rxstats->bcast_bytes; + rxkvs->nobuffers.kv_v.v_u64 = rxstats->nobuffer; + 
rxkvs->errors.kv_v.v_u64 = rxstats->error; + + TIMEVAL_TO_TIMESPEC(&sc->sc_kstats.updated, &ks->ks_updated); + + return (0); +} + +static void +vmx_kstat_rxstats(struct vmxnet3_softc *sc, struct vmxnet3_rxqueue *rq, int i) +{ + rq->sc = sc; + + rq->rxkstat = kstat_create(sc->sc_dev.dv_xname, 0, "vmx-rxstats", i, + KSTAT_T_KV, 0); + if (rq->rxkstat == NULL) + return; + + kstat_set_wlock(rq->rxkstat, &rq->sc->sc_kstats.lock); + + rq->rxkstat->ks_softc = rq; + rq->rxkstat->ks_data = &sc->sc_kstats.rxstats; + rq->rxkstat->ks_datalen = sizeof(sc->sc_kstats.rxstats); + rq->rxkstat->ks_read = vmx_rxstats_read; + TIMEVAL_TO_TIMESPEC(&vmx_kstat_rate, &rq->rxkstat->ks_interval); + + kstat_install(rq->rxkstat); +} +#endif /* NKSTAT > 0 */ Index: sys/net/ifq.c =================================================================== RCS file: /cvs/src/sys/net/ifq.c,v retrieving revision 1.40 diff -u -p -r1.40 ifq.c --- sys/net/ifq.c 17 Jun 2020 06:45:22 -0000 1.40 +++ sys/net/ifq.c 24 Jun 2020 06:04:29 -0000 @@ -17,6 +17,7 @@ */ #include "bpfilter.h" +#include "kstat.h" #include #include @@ -32,6 +33,10 @@ #include #endif +#if NKSTAT > 0 +#include +#endif + /* * priq glue */ @@ -122,7 +127,10 @@ ifq_is_serialized(struct ifqueue *ifq) void ifq_start(struct ifqueue *ifq) { - if (ifq_len(ifq) >= min(ifq->ifq_if->if_txmit, ifq->ifq_maxlen)) { + struct ifnet *ifp = ifq->ifq_if; + + if (ISSET(ifp->if_xflags, IFXF_MPSAFE) && + ifq_len(ifq) >= min(ifp->if_txmit, ifq->ifq_maxlen)) { task_del(ifq->ifq_softnet, &ifq->ifq_bundle); ifq_run_start(ifq); } else @@ -188,11 +196,42 @@ ifq_barrier_task(void *p) * ifqueue mbuf queue API */ +#if NKSTAT > 0 +struct ifq_kstat_data { + struct kstat_kv kd_packets; + struct kstat_kv kd_bytes; + struct kstat_kv kd_qdrops; + struct kstat_kv kd_errors; + struct kstat_kv kd_qlen; + struct kstat_kv kd_maxqlen; + struct kstat_kv kd_oactive; +}; + +static const struct ifq_kstat_data ifq_kstat_tpl = { + KSTAT_KV_UNIT_INITIALIZER("packets", + KSTAT_KV_T_COUNTER64, 
KSTAT_KV_U_PACKETS), + KSTAT_KV_UNIT_INITIALIZER("bytes", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_BYTES), + KSTAT_KV_UNIT_INITIALIZER("qdrops", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS), + KSTAT_KV_UNIT_INITIALIZER("errors", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS), + KSTAT_KV_UNIT_INITIALIZER("qlen", + KSTAT_KV_T_UINT32, KSTAT_KV_U_PACKETS), + KSTAT_KV_UNIT_INITIALIZER("maxqlen", + KSTAT_KV_T_UINT32, KSTAT_KV_U_PACKETS), + KSTAT_KV_INITIALIZER("oactive", KSTAT_KV_T_BOOL), +}; + +static int ifq_kstat_copy(struct kstat *, void *); +#endif + void ifq_init(struct ifqueue *ifq, struct ifnet *ifp, unsigned int idx) { ifq->ifq_if = ifp; - ifq->ifq_softnet = net_tq(ifp->if_index); /* + idx */ + ifq->ifq_softnet = ISSET(ifp->if_xflags, IFXF_MPSAFE) ? + net_tq(ifp->if_index /* + idx */) : systq; ifq->ifq_softc = NULL; mtx_init(&ifq->ifq_mtx, IPL_NET); @@ -222,6 +261,18 @@ ifq_init(struct ifqueue *ifq, struct ifn ifq_set_maxlen(ifq, IFQ_MAXLEN); ifq->ifq_idx = idx; + +#if NKSTAT > 0 + /* XXX xname vs driver name and unit */ + ifq->ifq_kstat = kstat_create(ifp->if_xname, 0, + "txq", ifq->ifq_idx, KSTAT_T_KV, 0); + KASSERT(ifq->ifq_kstat != NULL); + kstat_set_mutex(ifq->ifq_kstat, &ifq->ifq_mtx); + ifq->ifq_kstat->ks_softc = ifq; + ifq->ifq_kstat->ks_datalen = sizeof(ifq_kstat_tpl); + ifq->ifq_kstat->ks_copy = ifq_kstat_copy; + kstat_install(ifq->ifq_kstat); +#endif } void @@ -265,6 +316,10 @@ ifq_destroy(struct ifqueue *ifq) { struct mbuf_list ml = MBUF_LIST_INITIALIZER(); +#if NKSTAT > 0 + kstat_destroy(ifq->ifq_kstat); +#endif + NET_ASSERT_UNLOCKED(); if (!task_del(ifq->ifq_softnet, &ifq->ifq_bundle)) taskq_barrier(ifq->ifq_softnet); @@ -289,6 +344,26 @@ ifq_add_data(struct ifqueue *ifq, struct mtx_leave(&ifq->ifq_mtx); } +#if NKSTAT > 0 +static int +ifq_kstat_copy(struct kstat *ks, void *dst) +{ + struct ifqueue *ifq = ks->ks_softc; + struct ifq_kstat_data *kd = dst; + + *kd = ifq_kstat_tpl; + kd->kd_packets.kv_v.v_u64 = ifq->ifq_packets; + kd->kd_bytes.kv_v.v_u64 = 
ifq->ifq_bytes; + kd->kd_qdrops.kv_v.v_u64 = ifq->ifq_qdrops; + kd->kd_errors.kv_v.v_u64 = ifq->ifq_errors; + kd->kd_qlen.kv_v.v_u32 = ifq->ifq_len; + kd->kd_maxqlen.kv_v.v_u32 = ifq->ifq_maxlen; + kd->kd_oactive.kv_v.v_bool = ifq->ifq_oactive; + + return (0); +} +#endif + int ifq_enqueue(struct ifqueue *ifq, struct mbuf *m) { @@ -505,6 +580,31 @@ ifq_mfreeml(struct ifqueue *ifq, struct * ifiq */ +#if NKSTAT > 0 +struct ifiq_kstat_data { + struct kstat_kv kd_packets; + struct kstat_kv kd_bytes; + struct kstat_kv kd_qdrops; + struct kstat_kv kd_errors; + struct kstat_kv kd_qlen; +}; + +static const struct ifiq_kstat_data ifiq_kstat_tpl = { + KSTAT_KV_UNIT_INITIALIZER("packets", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS), + KSTAT_KV_UNIT_INITIALIZER("bytes", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_BYTES), + KSTAT_KV_UNIT_INITIALIZER("qdrops", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS), + KSTAT_KV_UNIT_INITIALIZER("errors", + KSTAT_KV_T_COUNTER64, KSTAT_KV_U_PACKETS), + KSTAT_KV_UNIT_INITIALIZER("qlen", + KSTAT_KV_T_UINT32, KSTAT_KV_U_PACKETS), +}; + +static int ifiq_kstat_copy(struct kstat *, void *); +#endif + static void ifiq_process(void *); void @@ -525,11 +625,27 @@ ifiq_init(struct ifiqueue *ifiq, struct ifiq->ifiq_errors = 0; ifiq->ifiq_idx = idx; + +#if NKSTAT > 0 + /* XXX xname vs driver name and unit */ + ifiq->ifiq_kstat = kstat_create(ifp->if_xname, 0, + "rxq", ifiq->ifiq_idx, KSTAT_T_KV, 0); + KASSERT(ifiq->ifiq_kstat != NULL); + kstat_set_mutex(ifiq->ifiq_kstat, &ifiq->ifiq_mtx); + ifiq->ifiq_kstat->ks_softc = ifiq; + ifiq->ifiq_kstat->ks_datalen = sizeof(ifiq_kstat_tpl); + ifiq->ifiq_kstat->ks_copy = ifiq_kstat_copy; + kstat_install(ifiq->ifiq_kstat); +#endif } void ifiq_destroy(struct ifiqueue *ifiq) { +#if NKSTAT > 0 + kstat_destroy(ifiq->ifiq_kstat); +#endif + NET_ASSERT_UNLOCKED(); if (!task_del(ifiq->ifiq_softnet, &ifiq->ifiq_task)) taskq_barrier(ifiq->ifiq_softnet); @@ -616,6 +732,24 @@ ifiq_add_data(struct ifiqueue *ifiq, str data->ifi_iqdrops += 
ifiq->ifiq_qdrops; mtx_leave(&ifiq->ifiq_mtx); } + +#if NKSTAT > 0 +static int +ifiq_kstat_copy(struct kstat *ks, void *dst) +{ + struct ifiqueue *ifiq = ks->ks_softc; + struct ifiq_kstat_data *kd = dst; + + *kd = ifiq_kstat_tpl; + kd->kd_packets.kv_v.v_u64 = ifiq->ifiq_packets; + kd->kd_bytes.kv_v.v_u64 = ifiq->ifiq_bytes; + kd->kd_qdrops.kv_v.v_u64 = ifiq->ifiq_qdrops; + kd->kd_errors.kv_v.v_u64 = ifiq->ifiq_errors; + kd->kd_qlen.kv_v.v_u32 = ml_len(&ifiq->ifiq_ml); + + return (0); +} +#endif int ifiq_enqueue(struct ifiqueue *ifiq, struct mbuf *m) Index: sys/net/ifq.h =================================================================== RCS file: /cvs/src/sys/net/ifq.h,v retrieving revision 1.31 diff -u -p -r1.31 ifq.h --- sys/net/ifq.h 22 May 2020 07:02:24 -0000 1.31 +++ sys/net/ifq.h 24 Jun 2020 06:04:29 -0000 @@ -20,6 +20,7 @@ #define _NET_IFQ_H_ struct ifnet; +struct kstat; struct ifq_ops; @@ -54,6 +55,8 @@ struct ifqueue { uint64_t ifq_errors; uint64_t ifq_mcasts; + struct kstat *ifq_kstat; + /* work serialisation */ struct mutex ifq_task_mtx; struct task_list ifq_task_list; @@ -91,6 +94,8 @@ struct ifiqueue { uint64_t ifiq_errors; uint64_t ifiq_mcasts; uint64_t ifiq_noproto; + + struct kstat *ifiq_kstat; /* properties */ unsigned int ifiq_idx; Index: sys/sys/conf.h =================================================================== RCS file: /cvs/src/sys/sys/conf.h,v retrieving revision 1.152 diff -u -p -r1.152 conf.h --- sys/sys/conf.h 26 May 2020 07:53:00 -0000 1.152 +++ sys/sys/conf.h 24 Jun 2020 06:04:30 -0000 @@ -328,6 +328,13 @@ extern struct cdevsw cdevsw[]; (dev_type_stop((*))) enodev, 0, seltrue, \ (dev_type_mmap((*))) enodev, 0, 0, seltrue_kqfilter } +/* open, close, ioctl */ +#define cdev_kstat_init(c,n) { \ + dev_init(c,n,open), dev_init(c,n,close), (dev_type_read((*))) enodev, \ + (dev_type_write((*))) enodev, dev_init(c,n,ioctl), \ + (dev_type_stop((*))) enodev, 0, selfalse, \ + (dev_type_mmap((*))) enodev } + /* open, close, read, write, ioctl, 
stop, tty, poll, mmap, kqfilter */ #define cdev_wsdisplay_init(c,n) { \ dev_init(c,n,open), dev_init(c,n,close), dev_init(c,n,read), \ @@ -605,6 +612,7 @@ cdev_decl(wsmouse); cdev_decl(wsmux); cdev_decl(ksyms); +cdev_decl(kstat); cdev_decl(bio); cdev_decl(vscsi); Index: sys/sys/kstat.h =================================================================== RCS file: sys/sys/kstat.h diff -N sys/sys/kstat.h --- /dev/null 1 Jan 1970 00:00:00 -0000 +++ sys/sys/kstat.h 24 Jun 2020 06:04:30 -0000 @@ -0,0 +1,197 @@ +/* $OpenBSD$ */ + +/* + * Copyright (c) 2020 David Gwynne + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 
+ */ + +#ifndef _SYS_KSTAT_H_ +#define _SYS_KSTAT_H_ + +#include + +#define KSTAT_STRLEN 32 + +#define KSTAT_T_RAW 0 +#define KSTAT_T_KV 1 +#define KSTAT_T_COUNTERS 2 + +struct kstat_req { + unsigned int ks_rflags; +#define KSTATIOC_F_IGNVER (1 << 0) + /* the current version of the kstat subsystem */ + unsigned int ks_version; + + uint64_t ks_id; + + char ks_provider[KSTAT_STRLEN]; + unsigned int ks_instance; + char ks_name[KSTAT_STRLEN]; + unsigned int ks_unit; + + struct timespec ks_created; + struct timespec ks_updated; + struct timespec ks_interval; + unsigned int ks_type; + unsigned int ks_state; + + void *ks_data; + size_t ks_datalen; + unsigned int ks_dataver; +}; + +/* ioctls */ + +#define KSTATIOC_VERSION _IOR('k', 1, unsigned int) +#define KSTATIOC_FIND_ID _IOWR('k', 2, struct kstat_req) +#define KSTATIOC_NFIND_ID _IOWR('k', 3, struct kstat_req) +#define KSTATIOC_FIND_PROVIDER _IOWR('k', 4, struct kstat_req) +#define KSTATIOC_NFIND_PROVIDER _IOWR('k', 5, struct kstat_req) +#define KSTATIOC_FIND_NAME _IOWR('k', 6, struct kstat_req) +#define KSTATIOC_NFIND_NAME _IOWR('k', 7, struct kstat_req) + +/* named data */ + +#define KSTAT_KV_NAMELEN 16 +#define KSTAT_KV_ALIGN sizeof(uint64_t) + +enum kstat_kv_type { + KSTAT_KV_T_NULL, + KSTAT_KV_T_BOOL, + KSTAT_KV_T_COUNTER64, + KSTAT_KV_T_COUNTER32, + KSTAT_KV_T_UINT64, + KSTAT_KV_T_INT64, + KSTAT_KV_T_UINT32, + KSTAT_KV_T_INT32, + KSTAT_KV_T_ISTR, /* inline string */ + KSTAT_KV_T_STR, /* trailing string */ + KSTAT_KV_T_BYTES, /* trailing bytes */ +}; + +/* units only apply to integer types */ +enum kstat_kv_unit { + KSTAT_KV_U_NONE = 0, + KSTAT_KV_U_PACKETS, /* packets */ + KSTAT_KV_U_BYTES, /* bytes */ + KSTAT_KV_U_CYCLES, /* cycles */ + KSTAT_KV_U_TEMP, /* temperature (uK) */ + KSTAT_KV_U_FANRPM, /* fan revolution speed */ + KSTAT_KV_U_VOLTS_DC, /* voltage (uV DC) */ + KSTAT_KV_U_VOLTS_AC, /* voltage (uV AC) */ + KSTAT_KV_U_OHMS, /* resistance */ + KSTAT_KV_U_WATTS, /* power (uW) */ + KSTAT_KV_U_AMPS, /* current 
(uA) */ + KSTAT_KV_U_WATTHOUR, /* power capacity (uWh) */ + KSTAT_KV_U_AMPHOUR, /* power capacity (uAh) */ + KSTAT_KV_U_PERCENT, /* percent (m%) */ + KSTAT_KV_U_LUX, /* illuminance (ulx) */ + KSTAT_KV_U_TIMEDELTA, /* system time error (nSec) */ + KSTAT_KV_U_HUMIDITY, /* humidity (m%RH) */ + KSTAT_KV_U_FREQ, /* frequency (uHz) */ + KSTAT_KV_U_ANGLE, /* angle (uDegrees) */ + KSTAT_KV_U_DISTANCE, /* distance (uMeter) */ + KSTAT_KV_U_PRESSURE, /* pressure (mPa) */ + KSTAT_KV_U_ACCEL, /* acceleration (u m/s^2) */ + KSTAT_KV_U_VELOCITY, /* velocity (u m/s) */ +}; + +struct kstat_kv { + char kv_key[KSTAT_KV_NAMELEN]; + union { + char v_istr[16]; + unsigned int v_bool; + uint64_t v_u64; + int64_t v_s64; + uint32_t v_u32; + int32_t v_s32; + size_t v_len; + } kv_v; + enum kstat_kv_type kv_type; + enum kstat_kv_unit kv_unit; +} __aligned(KSTAT_KV_ALIGN); + +#define KSTAT_KV_UNIT_INITIALIZER(_key, _type, _unit) { \ + .kv_key = (_key), \ + .kv_type = (_type), \ + .kv_unit = (_unit), \ +} + +#define KSTAT_KV_INITIALIZER(_key, _type) \ + KSTAT_KV_UNIT_INITIALIZER((_key), (_type), KSTAT_KV_U_NONE) + +void kstat_kv_init(struct kstat_kv *, const char *, enum kstat_kv_type); +void kstat_kv_unit_init(struct kstat_kv *, const char *, + enum kstat_kv_type, enum kstat_kv_unit); + +#ifdef _KERNEL + +#include + +struct kstat_lock_ops; + +struct kstat { + uint64_t ks_id; + + const char *ks_provider; + unsigned int ks_instance; + const char *ks_name; + unsigned int ks_unit; + + unsigned int ks_type; + unsigned int ks_flags; +#define KSTAT_F_REALLOC (1 << 0) + unsigned int ks_state; +#define KSTAT_S_CREATED 0 +#define KSTAT_S_INSTALLED 1 + + struct timespec ks_created; + RBT_ENTRY(kstat) ks_id_entry; + RBT_ENTRY(kstat) ks_pv_entry; + RBT_ENTRY(kstat) ks_nm_entry; + + /* the driver can update these between kstat creation and install */ + unsigned int ks_dataver; + void *ks_softc; + int (*ks_read)(struct kstat *); + int (*ks_copy)(struct kstat *, void *); + + const struct kstat_lock_ops * + 
ks_lock_ops; + void *ks_lock; + + /* the data that is updated by ks_read */ + void *ks_data; + size_t ks_datalen; + struct timespec ks_updated; + struct timespec ks_interval; +}; + +struct kstat *kstat_create(const char *, unsigned int, + const char *, unsigned int, + unsigned int, unsigned int); + +void kstat_set_rlock(struct kstat *, struct rwlock *); +void kstat_set_wlock(struct kstat *, struct rwlock *); +void kstat_set_mutex(struct kstat *, struct mutex *); +void kstat_set_cpu(struct kstat *, struct cpu_info *); + +int kstat_read_nop(struct kstat *); + +void kstat_install(struct kstat *); +void kstat_destroy(struct kstat *); + +#endif /* _KERNEL */ + +#endif /* _SYS_KSTAT_H_ */ Index: usr.bin/kstat/Makefile =================================================================== RCS file: usr.bin/kstat/Makefile diff -N usr.bin/kstat/Makefile --- /dev/null 1 Jan 1970 00:00:00 -0000 +++ usr.bin/kstat/Makefile 24 Jun 2020 06:04:30 -0000 @@ -0,0 +1,9 @@ +# $OpenBSD$ + +PROG= kstat +SRCS= kstat.c +MAN= +WARNINGS=Yes +DEBUG=-g + +.include Index: usr.bin/kstat/kstat.c =================================================================== RCS file: usr.bin/kstat/kstat.c diff -N usr.bin/kstat/kstat.c --- /dev/null 1 Jan 1970 00:00:00 -0000 +++ usr.bin/kstat/kstat.c 24 Jun 2020 06:04:30 -0000 @@ -0,0 +1,322 @@ +/* $OpenBSD$ */ + +/* + * Copyright (c) 2020 David Gwynne + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include "/usr/src/sys/sys/kstat.h" + +#ifndef roundup +#define roundup(x, y) ((((x)+((y)-1))/(y))*(y)) +#endif + +#define DEV_KSTAT "/dev/kstat" + +static void kstat_list(int, unsigned int); + +#if 0 +__dead static void +usage(void) +{ + extern char *__progname; + fprintf(stderr, "usage: %s\n", __progname); + exit(1); +} +#endif + +int +main(int argc, char *argv[]) +{ + unsigned int version; + int fd; + + fd = open(DEV_KSTAT, O_RDONLY); + if (fd == -1) + err(1, "%s", DEV_KSTAT); + + if (ioctl(fd, KSTATIOC_VERSION, &version) == -1) + err(1, "kstat version"); + + kstat_list(fd, version); + + return (0); +} + +struct kstat_entry { + struct kstat_req kstat; + RBT_ENTRY(kstat_entry) entry; +}; + +RBT_HEAD(kstat_tree, kstat_entry); + +static inline int +kstat_cmp(const struct kstat_entry *ea, const struct kstat_entry *eb) +{ + const struct kstat_req *a = &ea->kstat; + const struct kstat_req *b = &eb->kstat; + int rv; + + rv = strncmp(a->ks_provider, b->ks_provider, sizeof(a->ks_provider)); + if (rv != 0) + return (rv); + if (a->ks_instance > b->ks_instance) + return (1); + if (a->ks_instance < b->ks_instance) + return (-1); + + rv = strncmp(a->ks_name, b->ks_name, sizeof(a->ks_name)); + if (rv != 0) + return (rv); + if (a->ks_unit > b->ks_unit) + return (1); + if (a->ks_unit < b->ks_unit) + return (-1); + + return (0); +} + +RBT_PROTOTYPE(kstat_tree, kstat_entry, entry, kstat_cmp); +RBT_GENERATE(kstat_tree, kstat_entry, entry, kstat_cmp); + +static int +printable(int ch) +{ + if (ch == '\0') + return 
('_'); + if (!isprint(ch)) + return ('~'); + return (ch); +} + +static void +hexdump(const void *d, size_t datalen) +{ + const uint8_t *data = d; + size_t i, j = 0; + + for (i = 0; i < datalen; i += j) { + printf("%4zu: ", i); + + for (j = 0; j < 16 && i+j < datalen; j++) + printf("%02x ", data[i + j]); + while (j++ < 16) + printf("   "); + printf("|"); + + for (j = 0; j < 16 && i+j < datalen; j++) + putchar(printable(data[i + j])); + printf("|\n"); + } +} + +static void +strdump(const void *s, size_t len) +{ + const char *str = s; + char dst[8]; + size_t i; + + for (i = 0; i < len; i++) { + char ch = str[i]; + if (ch == '\0') + break; + + vis(dst, ch, VIS_TAB | VIS_NL, 0); + printf("%s", dst); + } + printf("\n"); +} + +static void +kstat_kv(const void *d, ssize_t len) +{ + const uint8_t *buf; + const struct kstat_kv *kv; + ssize_t blen; + void (*trailer)(const void *, size_t); + + if (len < (ssize_t)sizeof(*kv)) { + warnx("short kv (len %zd < size %zu)", len, sizeof(*kv)); + return; + } + + buf = d; + do { + kv = (const struct kstat_kv *)buf; + + buf += sizeof(*kv); + len -= sizeof(*kv); + + blen = 0; + trailer = hexdump; + + printf("%16.16s: ", kv->kv_key); + + switch (kv->kv_type) { + case KSTAT_KV_T_NULL: + printf("null"); + break; + case KSTAT_KV_T_BOOL: + printf("%s", kv->kv_v.v_bool ? 
"true" : "false"); + break; + case KSTAT_KV_T_COUNTER64: + case KSTAT_KV_T_UINT64: + printf("%" PRIu64, kv->kv_v.v_u64); + break; + case KSTAT_KV_T_INT64: + printf("%" PRId64, kv->kv_v.v_s64); + break; + case KSTAT_KV_T_COUNTER32: + case KSTAT_KV_T_UINT32: + printf("%" PRIu32, kv->kv_v.v_u32); + break; + case KSTAT_KV_T_INT32: + printf("%" PRId32, kv->kv_v.v_s32); + break; + case KSTAT_KV_T_STR: + blen = kv->kv_v.v_len; + trailer = strdump; + break; + case KSTAT_KV_T_BYTES: + blen = kv->kv_v.v_len; + trailer = hexdump; + + printf("\n"); + break; + + case KSTAT_KV_T_ISTR: + printf("%.*s\n", (int)sizeof(kv->kv_v.v_istr), + kv->kv_v.v_istr); + break; + default: + printf("unknown type %u, stopping\n", kv->kv_type); + return; + } + + switch (kv->kv_unit) { + case KSTAT_KV_U_NONE: + break; + case KSTAT_KV_U_PACKETS: + printf(" packets"); + break; + case KSTAT_KV_U_BYTES: + printf(" bytes"); + break; + case KSTAT_KV_U_CYCLES: + printf(" cycles"); + break; + + default: + printf(" unit-type-%u", kv->kv_unit); + break; + } + + if (blen > 0) { + if (blen > len) { + blen = len; + } + + (*trailer)(buf, blen); + } else + printf("\n"); + + blen = roundup(blen, KSTAT_KV_ALIGN); + buf += blen; + len -= blen; + } while (len >= (ssize_t)sizeof(*kv)); +} + +static void +kstat_list(int fd, unsigned int version) +{ + struct kstat_entry *kse; + struct kstat_req *ksreq; + size_t len; + uint64_t id = 0; + struct kstat_tree kstat_tree = RBT_INITIALIZER(); + + for (;;) { + kse = malloc(sizeof(*kse)); + if (kse == NULL) + err(1, NULL); + + memset(kse, 0, sizeof(*kse)); + ksreq = &kse->kstat; + ksreq->ks_version = version; + ksreq->ks_id = ++id; + + ksreq->ks_datalen = len = 64; /* magic */ + ksreq->ks_data = malloc(len); + if (ksreq->ks_data == NULL) + err(1, "data alloc"); + + if (ioctl(fd, KSTATIOC_NFIND_ID, ksreq) == -1) { + if (errno == ENOENT) { + free(ksreq->ks_data); + free(kse); + break; + } + + err(1, "nfind id %llu", id); + } + + while (ksreq->ks_datalen > len) { + len = 
ksreq->ks_datalen; + ksreq->ks_data = realloc(ksreq->ks_data, len); + if (ksreq->ks_data == NULL) + err(1, "data resize (%zu)", len); + + if (ioctl(fd, KSTATIOC_FIND_ID, ksreq) == -1) + err(1, "find id %llu", id); + } + + if (RBT_INSERT(kstat_tree, &kstat_tree, kse) != NULL) + errx(1, "duplicate kstat entry"); + + id = ksreq->ks_id; + } + + RBT_FOREACH(kse, kstat_tree, &kstat_tree) { + ksreq = &kse->kstat; + printf("%s:%u:%s:%u\n", + ksreq->ks_provider, ksreq->ks_instance, + ksreq->ks_name, ksreq->ks_unit); + switch (ksreq->ks_type) { + case KSTAT_T_RAW: + hexdump(ksreq->ks_data, ksreq->ks_datalen); + break; + case KSTAT_T_KV: + kstat_kv(ksreq->ks_data, ksreq->ks_datalen); + break; + default: + hexdump(ksreq->ks_data, ksreq->ks_datalen); + break; + } + } +} +
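
The record walk in kstat_kv() above may be easier to follow in isolation: each fixed-size record can be followed by trailing bytes, and the trailer is padded up to KSTAT_KV_ALIGN before the next record starts. What follows is only a minimal sketch of that layout, not code from the diff. struct demo_kv, DEMO_ALIGN, DEMO_ROUNDUP and demo_kv_count() are hypothetical stand-ins for struct kstat_kv, KSTAT_KV_ALIGN, roundup() and the loop in kstat_kv(). Because the sketch uses an unsigned length, it clamps the padded trailer a second time, where the diff relies on a signed len going negative to end the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* hypothetical stand-ins for KSTAT_KV_ALIGN and roundup() */
#define DEMO_ALIGN		sizeof(uint64_t)
#define DEMO_ROUNDUP(x, y)	((((x) + ((y) - 1)) / (y)) * (y))

/* hypothetical, cut-down stand-in for struct kstat_kv */
struct demo_kv {
	char		key[16];
	uint64_t	trailer_len;	/* 0 for fixed-size values */
};

/*
 * Walk a buffer of records and return how many complete record
 * headers it holds. Each header may be followed by trailer_len
 * bytes of data, padded up to a multiple of DEMO_ALIGN.
 */
static size_t
demo_kv_count(const void *d, size_t len)
{
	const uint8_t *buf = d;
	size_t n = 0;

	assert(d != NULL || len == 0);

	while (len >= sizeof(struct demo_kv)) {
		struct demo_kv kv;
		size_t blen;

		/* copy out the header to avoid unaligned access */
		memcpy(&kv, buf, sizeof(kv));
		buf += sizeof(kv);
		len -= sizeof(kv);

		/* a trailer longer than the buffer is clamped */
		if (kv.trailer_len > len)
			blen = len;
		else
			blen = (size_t)kv.trailer_len;

		/* skip the trailer and its padding */
		blen = DEMO_ROUNDUP(blen, DEMO_ALIGN);
		if (blen > len)
			blen = len;
		buf += blen;
		len -= blen;

		n++;
	}

	return (n);
}
```

With these stand-ins a header is 24 bytes, so a buffer holding a plain record, a record with a 5 byte trailer (padded to 8), and another plain record is 80 bytes long and the walk visits three records.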