DPDK and energy efficiency

Eric Kuhnke eric.kuhnke at gmail.com
Fri Mar 5 00:26:45 UTC 2021


A great deal of this discussion could be resolved by the use of a $20
in-line 120VAC watt meter [1] plugged into something as simple as a $500 1U
server with some of the DPDK-enabled network cards connected to its PCI-E
bus, running DANOS.

Characterizing the idle load, average usage load, and absolute maximum
wattage load of an x86-64 platform is not excessively difficult or complicated.

[1]
https://www.homedepot.com/p/Kill-A-Watt-Electricity-Monitor-P4400/202196386
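
A minimal sketch of how readings taken by hand from such a meter could be
reduced to the three figures mentioned above. The function name and every
sample value are hypothetical, not measurements from any real server:

```python
from statistics import mean


def characterize(idle_w, typical_w, stress_w):
    """Reduce wall-power readings (watts) to idle, average and maximum load.

    idle_w:    readings with no traffic offered
    typical_w: readings under ordinary production-like traffic
    stress_w:  readings while the box is driven as hard as possible
    """
    every_reading = list(idle_w) + list(typical_w) + list(stress_w)
    return {
        "idle_w": mean(idle_w),
        "average_w": mean(typical_w),
        "max_w": max(every_reading),
    }


# Hypothetical readings, for illustration only.
summary = characterize(
    idle_w=[48.0, 50.0, 52.0],
    typical_w=[70.0, 80.0, 90.0],
    stress_w=[118.0, 120.0, 125.0],
)
print(summary)
```

Taking the same readings with the dataplane stopped and started would
separate the platform's idle draw from whatever the DPDK application adds.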


On Thu, Mar 4, 2021 at 11:28 AM Etienne-Victor Depasquale <edepa at ieee.org>
wrote:

> *TL;DR - DPDK applications embody the phrase caveat emptor.*
>
> As Robert Bays put it:  "Please ask your open source dev and/or vendor of
> choice to verify."
> On the other hand, I do not recommend taking the following (citing Robert
> Bays again) for granted:
> "But the reality is [open source projects and commercial products] have
> all been designed from day one not to unnecessarily consume power."
>
> This note is presented in two sections.
> Section 1 presents the preamble necessary to avoid misinformation.
> Section 2 presents the survey.
>
> If so inclined, please read on.
>
> *SECTION 1*
> There are three issues at stake:
>
> 1.  the ground truth about the power/energy efficiency of (current)
> deployments that use DPDK,
> 2.  my choice of words for the first question, as this constitutes the
> claimed source of misinformation, and
> 3.  apportionment of responsibility for the attained level of
> power/energy efficiency of a deployment that uses DPDK.
>
> *Issue #1: ground truth on current deployments*
> I base this on (a) research papers and (b) Pawel Malachowski's data. Numbered
> references are listed at the end of this e-mail.
>
> [1] investigates software data planes, including OvS-DPDK. Citing directly:
> "DPDK-OVS always works with high power consumption even when [there is] no
> traffic to handle.
> Considering the inefficiency [in] power, DPDK provides power management
> APIs to compromise
> between power consumption and performance."
> "For DPDK-OVS, due to the feature of DPDK’s Polling Mode Driver (PMD),
> once the first DPDK port is added to vswitchd process,
> it creates a polling thread and polls DPDK device in continuous loop.
> Therefore CPU utilization for that thread is always 100%,
> and the power consumption rises to about 138 Watt"
>
> [2] investigates multimedia content delivery and benchmarks *DPDK-OvS* in
> the process. Citing directly:
> "Even when no traffic was in transit,
> OvS-DPDK consumed approximately three
> times more energy than the other two data
> planes, adding 250 percent energy overhead
> (15.57 W) on top of the host OS."
>
> [3] proposes the use of ACPI P-states and the halt instruction to control
> power consumption,
> in the context of *a bespoke application*. Citing directly:
> "For example, a Xeon(R) E5-2620 v3 dual socket CPU consumes
> about 22W of power when it is idle; but if a DPDK-based software
> router runs on it, the CPU power soars to 83W even
> when no packets arrive. That is a power gap of more than
> 60W."
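
To put the 22 W and 83 W figures quoted from [3] in perspective, here is a
rough sketch of the arithmetic. It assumes, purely for illustration, that
the router sees no traffic for an entire year, so the result is an upper
bound on the idle-time overhead and not a measurement:

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def idle_gap_kwh_per_year(idle_w, polling_w):
    """Energy (kWh/year) spent on the gap between true idle and busy polling."""
    gap_w = polling_w - idle_w
    return gap_w * HOURS_PER_YEAR / 1000.0


# 22 W idle and 83 W with a DPDK router running are the figures quoted from [3].
gap_kwh = idle_gap_kwh_per_year(idle_w=22.0, polling_w=83.0)
print(f"{gap_kwh:.2f} kWh per year")
```

That works out to roughly 534 kWh per year for one dual-socket CPU under
the stated assumption.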
>
> [4] investigates the energy-efficient use of *Pktgen-DPDK*. Citing
> directly:
> "We find that high performance comes at the cost of high energy
> consumption."
>
> Pawel Malachowski shows a list of cores (13 out of 16) in use by a DPDK
> application
> ("DPDK-based 100G DDoS scrubber currently lifting some low traffic using
> cores 1-13
> on 16 core host. It uses naive DPDK::rte_pause() throttling to enter C1").
> The list shows the cores spending most of their time in C1.
> This means that cores are in a low-power-idle state and therefore not in
> an active (C0) state.
> This shows a power-aware DPDK application.
>
> *Issue #2: my choice of words, as a source of misinformation*
> Issue has been taken with the text of question 1.
> I addressed this to the NANOG community,
> who are busy and knowledgeable.
> I chose, *with hindsight wrongly*, to paraphrase,
> with the expectation that a reader would interpret correctly.
> A better expression, that would still have been terse, would be:
> "Are you aware that *naïve* use of DPDK on a processor core keeps
> utilization at 100% regardless of packet activity?"
>
>
> *Issue #3: apportionment of responsibility for the attained level of
> power/energy efficiency of a deployment that uses DPDK*
> Pawel Malachowski states that "It consumes 100% only if you busy poll
> (which is the default approach)."
>
> Since it is the application that exploits the DPDK API,
> and since the DPDK API promotes run-to-completion (
> https://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html),
> then *it is the application that determines power consumption*
> but it is DPDK's poll-mode driver *that poses a real threat to power
> efficiency, if used in "the default approach".*
>
> Robert Bays states:
> "The vast majority of applications that this audience would actually
> install in their networks do not do tight polling all the time and
> therefore don’t consume 100% of the CPU all the time."
> *Would this audience (an audience of network operators) truly not be
> interested in using OvS-DPDK?*
> *Caveat emptor.*
>
> *SECTION 2: Survey results*
> *Q1*
> [image: image.png]
> *Q2*
> [image: image.png]
>
>
>
> [1] Z. Xu, F. Liu, T. Wang, and H. Xu, “Demystifying the energy efficiency
> of Network Function Virtualization,”
> in 2016 IEEE/ACM 24th International Symposium on Quality of Service
> (IWQoS), Jun. 2016, pp. 1–10.
> DOI: 10.1109/IWQoS.2016.7590429.
>
> [2] S. Fu, J. Liu, and W. Zhu, “Multimedia Content Delivery with Network
> Function Virtualization: The Energy Perspective,”
>  IEEE MultiMedia, vol. 24, no. 3, pp. 38–47, 2017, ISSN: 1941-0166.
> DOI: 10.1109/MMUL.2017.3051514.
>
> [3] X. Li, W. Cheng, T. Zhang, F. Ren, and B. Yang, “Towards Power
> Efficient High Performance Packet I/O,”
> IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 4, pp.
> 981–996, April 2020,
> ISSN:1558-2183. DOI: 10.1109/TPDS.2019.2957746.
>
> [4] G. Li, D. Zhang, Y. Li, and K. Li, “Toward energy efficiency
> optimization of pktgen-DPDK for green network testbeds,”
> China Communications, vol. 15, no. 11, pp. 199–207, November 2018,
> ISSN: 1673-5447. DOI: 10.1109/CC.2018.8543100.
>
>
> On Sat, Feb 27, 2021 at 5:11 PM Etienne-Victor Depasquale <edepa at ieee.org>
> wrote:
>
>> Just a quick note to say that I've closed the survey.
>>
>> I haven't published the results yet as I said that I would write notes
>> necessary as a preamble to correctly inform potential readers,
>> and these notes are taking longer to write than I have time available.
>>
>> Cheers,
>>
>> Etienne
>>
>> On Wed, Feb 24, 2021 at 7:07 PM Etienne-Victor Depasquale <edepa at ieee.org>
>> wrote:
>>
>>> I think I need to calm this thread down.
>>>
>>> I'm a researcher, and my interest is in the truth, not in my opinion.
>>>
>>> I've read some facts in this thread that are necessary
>>> as a prerequisite to the publication of the results on Friday.
>>>
>>> I do want to ensure that no future reader is misinformed and will do my
>>> best,
>>> with the help of contribution from my peers in this good community,
>>> to summarize all objections to this survey's questions,
>>> in the same message as that which publishes the result.
>>>
>>> All peace and good wishes,
>>>
>>> Etienne
>>>
>>> On Wed, Feb 24, 2021 at 4:35 PM Robert Bays <robert at gdk.org> wrote:
>>>
>>>> To the nanog community, I’m sorry to have dragged this conversation out
>>>> further.  I'm only responding to this because there are a significant
>>>> number of open source projects and commercial products that use DPDK, or
>>>> similar userspace network environment in their implementations.  The
>>>> statements in this thread incorrectly cast them, because they use DPDK, as
>>>> inefficient.  But the reality is they have all been designed from day one
>>>> not to unnecessarily consume power.  Please ask your open source dev and/or
>>>> vendor of choice to verify.  But please don’t rely on the information in
>>>> this thread to make decisions about what you deploy in your network.
>>>>
>>>> On Feb 23, 2021, at 11:44 PM, Etienne-Victor Depasquale <edepa at ieee.org>
>>>> wrote:
>>>>
>>>> Hello Robert,
>>>>
>>>> Your statement that DPDK “keeps utilization at 100% regardless of
>>>>> packet activity” is just not correct.  You further pre-suppose "widespread
>>>>> DPDK's core operating inefficiency” without any data to back up the
>>>>> operating inefficacy assertion.
>>>>>
>>>>
>>>> This statement is incorrect.
>>>> I have provided references (please see earlier e-mails) that
>>>> investigate the operation of DPDK.
>>>> These references are items of peer-reviewed research that investigate a
>>>> perceived problem with deployment of DPDK.
>>>> If the power consumption incurred while running DPDK were a corner
>>>> case,
>>>> then there would be little to no research value in investigating such
>>>> behavior.
>>>>
>>>>
>>>> Your references don’t take into account the code that this community
>>>> would actually deploy; open source implementations like DANOS, FD.io,
>>>> or OVS.  They don’t audit any commercial products that implement userspace
>>>> stacks.  None of your references say that DPDK is inherently inefficient.
>>>> The closest they come is to say that tight polling is inefficient.  But
>>>> tight polling, even in the earliest days of DPDK, was never meant to be a
>>>> design pattern that was actually deployed into production.  I was there for
>>>> those early conversations.
>>>>
>>>> Please don’t mislead the community into believing that DPDK == power bad
>>>>>
>>>> I have to object to this statement. It does seem to imply malice, or,
>>>> at best, amateurish behaviour, whether you intended it or not.
>>>>
>>>>
>>>> Object all you want.  You are misleading people with your comments.
>>>> And in the process you are denigrating a large swath of OSS projects and
>>>> commercial products that use DPDK.  Your survey questions are leading and
>>>> provide a false dichotomy.  And when you post the results here, they will
>>>> be archived forever to continue to spread misinformation, unfortunately.
>>>>
>>>> Everything following is informational.  Stop here if so inclined.
>>>>>
>>>>  Please stop delving into the detail of DPDK's facilities without
>>>> regard for your logical omission:
>>>> that whether the facilities are available or not, DPDK's deployment
>>>> profile (meaning: how it's being used in general), as indicated by the
>>>> references I've provided,
>>>> is leading to high power inefficiency on cores partitioned to the data
>>>> plane.
>>>>
>>>>
>>>> I’ve been writing network appliance code for over 20 years.  I designed
>>>> network architectures for years before that.  I have 10s of thousands of
>>>> DPDK based appliances in production at this moment across multiple
>>>> different use cases. I work with companies that have 100s of thousands of
>>>> units in production that leverage userspace runtimes.  I do think I
>>>> understand DPDK’s deployment profile better than you.  That’s what I have
>>>> been trying to tell you.  People don’t write inefficient DPDK code to put
>>>> into production.  We’re not dumb.  We’ve been thinking about power
>>>> consumption from day one.  DPDK was never supposed to be just a tight loop
>>>> poll.  You were always supposed to put in the very minimal extra work to
>>>> modulate power consumption.
>>>>
>>>> The takeaway is that DPDK (and similar) doesn’t guarantee runaway power
>>>>> bills.
>>>>>
>>>> Of course it doesn't.
>>>> Even the second question of that bare-bones survey tried to communicate
>>>> this much.
>>>>
>>>> If you have questions, I’d be happy to discuss off line
>>>>>
>>>> I would be happy to answer your objections in detail off line too.
>>>> Just let me know.
>>>>
>>>>
>>>> Unfortunately, you don’t seem to be receptive to the numerous people
>>>> contradicting your assertions.  So I’m out.  I’ll let my comments stand
>>>> here.
>>>>
>>>> Cheers,
>>>>
>>>> Etienne
>>>>
>>>>
>>>> On Wed, Feb 24, 2021 at 12:12 AM Robert Bays <robert at gdk.org> wrote:
>>>>
>>>>> Hi Etienne,
>>>>>
>>>>> Your statement that DPDK “keeps utilization at 100% regardless of
>>>>> packet activity” is just not correct.  You further pre-suppose "widespread
>>>>> DPDK's core operating inefficiency” without any data to back up the
>>>>> operating inefficacy assertion.  Your statements, taken at face value, lead
>>>>> people to believe that if a project uses DPDK it’s going to increase their
>>>>> power costs.  And that’s just not the case.  Please don’t mislead the
>>>>> community into believing that DPDK == power bad.
>>>>>
>>>>> Everything following is informational.  Stop here if so inclined.
>>>>>
>>>>> DPDK does not dictate CPU utilization or power consumption, the
>>>>> application leveraging DPDK does.  It’s the application that decides how to
>>>>> poll packets.  If an application implements DPDK using only a tight polling
>>>>> loop, then it will keep CPU cores that are running DPDK threads at 100%.
>>>>> But only the most simple and/or bespoke (think trading) applications are
>>>>> implemented this way.  You don’t need tight polling all the time to get the
>>>>> performance gains provided by DPDK or similar environments.  The vast
>>>>> majority of applications that this audience would actually install in their
>>>>> networks do not do tight polling all the time and therefore don’t consume
>>>>> 100% of the CPU all the time.   An interesting, helpful research effort you
>>>>> could lead would be to survey the ecosystem to catalog those applications
>>>>> that do fall into the power hungry category and help them to change their
>>>>> code.
>>>>>
>>>>> Intel DPDK application development guidelines don’t pre-suppose tight
>>>>> polling all the time and offer at least two methods for optimizing power
>>>>> against throughput.  The older method is to use adaptive polling;
>>>>> increasing the polling frequency as traffic load increases.  This keeps CPU
>>>>> utilization low when packet load is light and increases it as traffic
>>>>> levels warrant.  The second method is to use P-states and/or C-states to
>>>>> put the processor into lower power modes when traffic loads are lighter.
>>>>> We have found that adaptive polling works better across a larger pool of
>>>>> hardware types, and therefore that is what DANOS uses, amongst other
>>>>> things.
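
The adaptive polling idea described above can be sketched as a back-off on
the pause between polls. This is a simplified simulation only: it is not
DANOS's algorithm and uses no DPDK API, and the function names, the doubling
rule and the 250 microsecond cap are all assumptions made for illustration:

```python
def next_sleep_us(sleep_us, packets, min_us=0, max_us=250):
    """Return the pause (microseconds) to take before the next poll.

    Any received packets reset the pause to the minimum (tight polling);
    each empty poll doubles the pause, up to max_us.
    """
    if packets > 0:
        return min_us
    if sleep_us == 0:
        return 1
    return min(sleep_us * 2, max_us)


def simulate(burst_sizes):
    """Replay a list of per-poll packet counts and record the chosen pauses."""
    sleep_us = 0
    schedule = []
    for packets in burst_sizes:
        sleep_us = next_sleep_us(sleep_us, packets)
        schedule.append(sleep_us)
    return schedule


# Hypothetical trace: a burst, a quiet spell, another burst, then quiet again.
schedule = simulate([32, 0, 0, 0, 0, 8, 0])
print(schedule)
```

The core spends progressively longer asleep while the queue stays empty and
returns to tight polling as soon as traffic shows up, which is the tradeoff
between power, latency and ramp-up drops discussed in this thread.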
>>>>>
>>>>> Further, performance and power consumption are dictated by a
>>>>> multivariate set of application decisions including: design patterns such
>>>>> as single thread run to completion models vs. passing mbufs between
>>>>> multiple threads, buffer sizes and cache management algorithms, combining
>>>>> and/or separating tx/rx threads, binding threads to specific lcores,
>>>>> reserved cores for DPDK threads, hyperthreading, kernel schedulers,
>>>>> hypervisor schedulers, interface drivers, etc.  All of these are
>>>>> application specific, not DPDK generic.  Well written applications that
>>>>> leverage DPDK provide knobs for the user to tune these settings for their
>>>>> specific environment and use case.  None of this is unique to DPDK.  Solution
>>>>> designs were cribbed from previous technologies.
>>>>>
>>>>> The takeaway is that DPDK (and similar) doesn’t guarantee runaway
>>>>> power bills.  Power consumption is dictated by the application.  Look for
>>>>> well behaved applications and everything will be alright.
>>>>>
>>>>> If you have questions, I’d be happy to discuss off line.
>>>>>
>>>>> Thanks,
>>>>> Robert.
>>>>>
>>>>>
>>>>> > On Feb 22, 2021, at 11:27 PM, Etienne-Victor Depasquale <
>>>>> edepa at ieee.org> wrote:
>>>>> >
>>>>> > Sorry, last line should have been:
>>>>> > "intended to get an impression of how widespread ***knowledge of***
>>>>> DPDK's core operating inefficiency is",
>>>>> > not:
>>>>> > "intended to get an impression of how widespread DPDK's core
>>>>> operating inefficiency is"
>>>>> >
>>>>> > On Tue, Feb 23, 2021 at 8:22 AM Etienne-Victor Depasquale <
>>>>> edepa at ieee.org> wrote:
>>>>> > Beyond RX/TX CPU affinity, in DANOS you can further tune power
>>>>> consumption by changing the adaptive polling rate.  It doesn’t, per the
>>>>> survey, "keep utilization at 100% regardless of packet activity.”
>>>>> > Robert, you seem to be conflating DPDK
>>>>> > with DANOS' power control algorithms that modulate DPDK's default
>>>>> behaviour.
>>>>> >
>>>>> > Let me know what you think; otherwise, I'm pretty confident that
>>>>> DPDK does:
>>>>> > "keep utilization at 100% regardless of packet activity.”
>>>>> >
>>>>> > Keep in mind that this is a bare-bones survey intended for busy,
>>>>> knowledgeable people (the ones you'd find on NANOG) -
>>>>> > not a detailed breakdown of modes of operation of DPDK or DANOS.
>>>>> > DPDK has been designed for fast I/O that's unencumbered by the
>>>>> trappings of general-purpose OSes,
>>>>> > and that's the impression that needs to be forefront.
>>>>> > Power control, as well as any other dimensions of modulation,
>>>>> > are detailed modes of operation that are well beyond the scope of a
>>>>> bare-bones 2-question survey
>>>>> > intended to get an impression of how widespread DPDK's core
>>>>> operating inefficiency is.
>>>>> >
>>>>> > Cheers,
>>>>> >
>>>>> > Etienne
>>>>> >
>>>>> > On Mon, Feb 22, 2021 at 10:20 PM Robert Bays <robert at gdk.org> wrote:
>>>>> > Beyond RX/TX CPU affinity, in DANOS you can further tune power
>>>>> consumption by changing the adaptive polling rate.  It doesn’t, per the
>>>>> survey, "keep utilization at 100% regardless of packet activity.”  Adaptive
>>>>> polling changes in DPDK optimize for tradeoffs between power consumption,
>>>>> latency/jitter and drops during throughput ramp up periods.  Ideally your
>>>>> DPDK implementation has an algorithm that tries to automatically optimize
>>>>> based on current traffic patterns.
>>>>> >
>>>>> > In DANOS refer to the “system default dataplane power-profile”
>>>>> config command tree for adaptive polling settings.  Interface RX/TX
>>>>> affinity is configured on a per interface basis under the “interfaces
>>>>> dataplane” config command tree.
>>>>> >
>>>>> > -robert
>>>>> >
>>>>> >
>>>>> > > On Feb 22, 2021, at 11:46 AM, Jared Geiger <jared at compuwizz.net>
>>>>> wrote:
>>>>> > >
>>>>> > > DANOS lets you specify how many dataplane cores you use versus
>>>>> control plane cores. So if you put a 16 core host in to handle 2GB of
>>>>> traffic, you can adjust the dataplane worker cores as needed. Control plane
>>>>> cores don't stay at 100% utilization.
>>>>> > >
>>>>> > > I use that technique plus DANOS runs on VMware (not
>>>>> oversubscribed) which allows me to use the hardware for other VMs. NICs are
>>>>> attached to the VM via PCI Passthrough which helps eliminate the overhead
>>>>> to the VMware hypervisor itself.
>>>>> > >
>>>>> > > I have an 8 core VM with 4 cores set to dataplane and 4 to control
>>>>> plane. The 4 control plane cores are typically idle only processing BGP
>>>>> route updates, SNMP, logs, etc.
>>>>> > >
>>>>> > > ~Jared
>>>>> > >
>>>>> > > On Sun, Feb 21, 2021 at 11:30 PM Etienne-Victor Depasquale <
>>>>> edepa at ieee.org> wrote:
>>>>> > > Hello folks,
>>>>> > >
>>>>> > > I've just followed a thread regarding use of CGNAT and noted a
>>>>> suggestion (regarding DANOS) that includes use of DPDK.
>>>>> > >
>>>>> > > As I'm interested in the breadth of adoption of DPDK, and as I'm a
>>>>> researcher into energy and power efficiency, I'd love to hear your feedback
>>>>> on your use of power consumption control by DPDK.
>>>>> > >
>>>>> > > I've drawn up a bare-bones, 2-question survey at this link:
>>>>> > >
>>>>> > > https://www.surveymonkey.com/r/J886DPY.
>>>>> > >
>>>>> > > Responses have been set to anonymous.
>>>>> > >
>>>>> > > Cheers,
>>>>> > >
>>>>> > > Etienne
>>>>> > >
>>>>> > > --
>>>>> > > Ing. Etienne-Victor Depasquale
>>>>> > > Assistant Lecturer
>>>>> > > Department of Communications & Computer Engineering
>>>>> > > Faculty of Information & Communication Technology
>>>>> > > University of Malta
>>>>> > > Web. https://www.um.edu.mt/profile/etiennedepasquale