sFlow vs netFlow/IPFIX

Pavel Odintsov pavel.odintsov at gmail.com
Mon Feb 29 08:12:32 UTC 2016


What do you mean by the lack of ifindex in sFlow?

I can offer a description of the sFlow v5 sample structure as an
example (it's from my C++ sFlow parser, but it's pretty simple to
understand):

        uint32_t sample_sequence_number; // sample sequence number
        uint32_t source_id_type;         // source id type
        uint32_t source_id_index;        // source id index
        uint32_t sampling_rate;          // sampling ratio (1 in N packets)
        uint32_t sample_pool;            // packets that could have been sampled
        uint32_t drops_count;            // samples dropped due to lack of resources
        uint32_t input_port_type;        // input port type
        uint32_t input_port_index;       // input port index (ifindex)
        uint32_t output_port_type;       // output port type
        uint32_t output_port_index;      // output port index (ifindex)
        uint32_t number_of_flow_records; // flow records that follow
        ssize_t original_payload_length; // original length of the sampled packet

As you can see, we have the source id and the sampling rate, and we
definitely have full information about the input (source) and output
(destination) ifindexes.
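
For illustration, here is a minimal sketch of reading those fields from
a raw buffer. It is not the code from my parser. It assumes the eleven
32-bit fields arrive back to back in network byte order, in the order
listed above; the names FlowSampleHeader, read_be32 and
parse_flow_sample_header are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the eleven 32-bit fields listed above.
struct FlowSampleHeader {
    uint32_t sample_sequence_number;
    uint32_t source_id_type;
    uint32_t source_id_index;
    uint32_t sampling_rate;
    uint32_t sample_pool;
    uint32_t drops_count;
    uint32_t input_port_type;
    uint32_t input_port_index;
    uint32_t output_port_type;
    uint32_t output_port_index;
    uint32_t number_of_flow_records;
};

// Read one big-endian (network order) 32-bit value byte by byte, so the
// result does not depend on host endianness or alignment.
inline uint32_t read_be32(const uint8_t* p) {
    assert(p != nullptr);
    return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
           (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}

// Fill `out` from `data`; returns false if the buffer is too short.
// Assumption: the fields are laid out consecutively in the listed order.
inline bool parse_flow_sample_header(const uint8_t* data, size_t length,
                                     FlowSampleHeader& out) {
    const size_t field_count = 11;
    if (data == nullptr || length < field_count * sizeof(uint32_t)) {
        return false;
    }
    uint32_t f[field_count];
    for (size_t i = 0; i < field_count; ++i) {
        f[i] = read_be32(data + i * sizeof(uint32_t));
    }
    out.sample_sequence_number = f[0];
    out.source_id_type = f[1];
    out.source_id_index = f[2];
    out.sampling_rate = f[3];
    out.sample_pool = f[4];
    out.drops_count = f[5];
    out.input_port_type = f[6];
    out.input_port_index = f[7];
    out.output_port_type = f[8];
    out.output_port_index = f[9];
    out.number_of_flow_records = f[10];
    return true;
}
```

The input and output ifindexes are then simply input_port_index and
output_port_index of the decoded header.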

In addition to the flow sample structure (which carries the first X
bytes of each sampled packet) we have counter structures, which work
like good old "SNMP counters" and offer detailed information about the
load on each port.
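
As with SNMP counters, port load comes from the difference between two
consecutive readings. A rough sketch, assuming 64-bit octet counters
and a known interval between the two counter samples (the function name
bits_per_second is made up for this example):

```cpp
#include <cassert>
#include <cstdint>

// Convert two readings of a monotonically increasing octet counter into
// an average rate in bits per second over the polling interval.
inline double bits_per_second(uint64_t previous_octets,
                              uint64_t current_octets,
                              double interval_seconds) {
    assert(interval_seconds > 0.0);
    // Unsigned subtraction also gives the right delta if the 64-bit
    // counter wrapped once between the two readings.
    const uint64_t delta_octets = current_octets - previous_octets;
    return static_cast<double>(delta_octets) * 8.0 / interval_seconds;
}
```

For example, 1,250,000 octets counted over 10 seconds correspond to an
average of 1 Mbps on that port.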


It looks like you don't have much field experience with sFlow. I can
help by sharing some real field experience below.

---

A few words about NetFlow.

When you speak about "NetFlow" you should explicitly mention vendors,
because NetFlow is very, very vendor specific.

I have my own collector implementation for NetFlow v5, NetFlow v9 and
IPFIX (just check my repository:
https://github.com/pavel-odintsov/fastnetmon/blob/master/src/netflow_plugin/netflow_collector.cpp).

I have spent many nights debugging these protocols.

Do you know about the MikroTik implementation of NetFlow (the minimum
possible active and inactive timeout there is 60 seconds)?

Or what about old Cisco routers which support only 180 seconds as the
active timeout?

Can they deliver telemetry in an acceptable time?

The only way to get accurate bandwidth data from NetFlow is to use
some sort of average or moving average over a certain period (30
seconds, for example).
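
A minimal sketch of such a moving average, assuming byte counts are
binned by the second they arrive at the collector and that timestamps
never go backwards. The class name MovingAverageBandwidth is made up
for this example, and a real collector would also have to decide how to
spread a long flow's bytes over its duration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sliding window of per-second byte buckets (a ring buffer).
class MovingAverageBandwidth {
public:
    explicit MovingAverageBandwidth(size_t window_seconds)
        : buckets_(window_seconds, 0), last_seen_(0) {
        assert(window_seconds > 0);
    }

    // Account `bytes` to the bucket for `timestamp_seconds`.
    void add_bytes(uint64_t timestamp_seconds, uint64_t bytes) {
        advance(timestamp_seconds);
        buckets_[timestamp_seconds % buckets_.size()] += bytes;
    }

    // Average rate over the window that ends at `now_seconds`.
    double average_bits_per_second(uint64_t now_seconds) {
        advance(now_seconds);
        uint64_t total_bytes = 0;
        for (uint64_t b : buckets_) {
            total_bytes += b;
        }
        return static_cast<double>(total_bytes) * 8.0 /
               static_cast<double>(buckets_.size());
    }

private:
    // Zero the buckets for seconds that slid out of the window.
    void advance(uint64_t now_seconds) {
        if (now_seconds <= last_seen_) {
            return;
        }
        uint64_t gap = now_seconds - last_seen_;
        if (gap > buckets_.size()) {
            gap = buckets_.size();
        }
        for (uint64_t i = 1; i <= gap; ++i) {
            buckets_[(last_seen_ + i) % buckets_.size()] = 0;
        }
        last_seen_ = now_seconds;
    }

    std::vector<uint64_t> buckets_;
    uint64_t last_seen_;
};
```

With a 30 second window, a burst is smoothed over 30 seconds, so the
figure is stable but reacts more slowly to sudden changes.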

But if you have a really huge network you have to use NetFlow sampling.

And here I have several nice questions! Cisco and Juniper encode the
sampling rate in really incompatible ways. It's funny, but that's how
it is. I do not know about other vendors because sampling is a very
specific feature.

But with sampling (if your collector can decode yet another
incompatible NetFlow implementation) things get really weird :) You
can get accurate bandwidth data only if you have a really HUGE network
with only 10-100GE customers, because the measured traffic speed for
small (100-1000 Mbps) customers will be really weird :)
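
To put rough numbers on that, here is a sketch that scales sampled
bytes back up by the sampling rate and estimates how noisy the result
is. It assumes random 1-in-N packet sampling and uses the usual
1/sqrt(n) rule of thumb for the relative error of a count of n samples;
the function names are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Scale a sampled byte count back up to an estimate of the real traffic.
inline uint64_t estimate_total_bytes(uint64_t sampled_bytes,
                                     uint32_t sampling_rate) {
    assert(sampling_rate > 0);
    return sampled_bytes * sampling_rate;
}

// Rule-of-thumb relative error (about one standard deviation) of an
// estimate built from `sample_count` randomly sampled packets.
inline double approximate_relative_error(uint64_t sample_count) {
    if (sample_count == 0) {
        // No samples at all: the estimate carries no information.
        return std::numeric_limits<double>::infinity();
    }
    return 1.0 / std::sqrt(static_cast<double>(sample_count));
}
```

As an illustration only (assuming 1000-byte packets and 1:1000
sampling): a 100 Mbps customer produces about 12 samples per second,
which gives an error of roughly 28% on a one second reading, while a
10 Gbps customer produces about 1250 samples per second and an error of
roughly 3%.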

What are your thoughts about all this? Please mention vendor names
when you vote for NetFlow next time, because not all NetFlow
implementations are OK, and some NetFlow implementations are
definitely broken.



On Mon, Feb 29, 2016 at 10:53 AM, Roland Dobbins <rdobbins at arbor.net> wrote:
> On 29 Feb 2016, at 14:41, Pavel Odintsov wrote:
>
>> Could you describe they in details?
>
>
> Inconsistent stats, lack of ifindex information.
>
>> But netflow __could__ delay telemetry up to 30 seconds (in case of huge
>> syn/syn-ack flood for example) and you network will experience downtime.
>
>
> This is incorrect, and reflects an inaccurate understanding of how
> NetFlow/IPFIX actually works, in practice.  It's often repeated by those
> with little or no operational experience with NetFlow/IPFIX.
>
> -----------------------------------
> Roland Dobbins <rdobbins at arbor.net>



-- 
Sincerely yours, Pavel Odintsov


