=== Receive Segment Coalescing ===

The device MAY support Receive Segment Coalescing (RSC). If the device supports this feature, it MUST follow the rules on packet combining below.

Receive Segment Coalescing reduces the packet rate from device to host by building a single large packet from multiple consecutive packet payloads in the same stream. The concept applies well to TCP, which defines payload as a contiguous byte stream.

The feature is also referred to as Large Receive Offload (LRO) and Hardware Generic Receive Offload (HW-GRO). The three mechanisms can differ in the exact rules on when and how to coalesce. RSC and LRO were originally defined only for TCP/IP. This section defines a broader set of rules. It takes the software Generic Receive Offload (GRO) in Linux v6.3 as ground truth. If the two disagree, that source code takes precedence.

Receive Segment Coalescing is the common term for this behavior. To avoid confusion we do not introduce yet another acronym. But the RSC rules defined here differ from those previously defined by Microsoft [ref_id:msft_rsc], at a minimum in the following ways:

* This spec generalizes to other protocols besides IP and TCP
* This spec requires all TCP options to be supported

<span id="segment-size"></span>
===== Segment size =====

The device MUST pass to the host, along with the large (SO) packet, a segment size field that encodes the payload length of the original packets.

This field implies that packets are only coalesced if they have the same size on the wire. Coalescing stops when a packet of a different size arrives. If it is larger than the previous packets, it cannot be appended. If it is smaller, it can be. If the segment size does not evenly divide the SO packet payload length, then the remainder is the payload length of this last packet.

''Reversibility''

The segment size field is mandatory. It must be possible to reconstruct the original packet stream.
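
As an illustration of how the segment size field makes the original payload lengths recoverable, the following is a minimal sketch. The function name <code>segment_payload_lengths</code> and its parameters are hypothetical and not part of this specification. It assumes the only inputs are the SO packet payload length and the segment size.

```python
def segment_payload_lengths(so_payload_len, segment_size):
    """Recover the payload lengths of the original packets.

    Hypothetical helper: every segment has length segment_size, except
    possibly the last one, which carries the remainder.
    """
    if segment_size <= 0:
        raise ValueError("segment size must be positive")
    full, remainder = divmod(so_payload_len, segment_size)
    lengths = [segment_size] * full
    if remainder:
        # A shorter final packet was appended before the context closed.
        lengths.append(remainder)
    return lengths


# Two full 1448-byte segments followed by a shorter last segment.
print(segment_payload_lengths(4000, 1448))  # [1448, 1448, 1104]
```

Because the lengths always sum to the SO packet payload length, a segmentation offload given the same segment size can cut the payload at the same boundaries.
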
This reversibility capability is a hard requirement, to be able to use RSC plus TSO/USO/PISO for forwarding without creating externally observable changes to the packet stream compared to when both offloads are disabled.

The ground rule is that receive offload must be the exact inverse of segmentation offload. That is, if TSO/USO/PISO splits a large packet into a chain of small ones, RSC will rebuild the exact same packet. The inverse also holds. An RSC packet forwarded to a device for transmission with TSO/USO/PISO will result in the same packets on the wire as arrived before RSC coalescing.

Reconstructing the original packet stream imposes constraints on header coalescing beyond segment size. Each operation has to be reversible at segmentation offload. When fields are identical, coalescing is a trivially reversible operation. All other cases are explicitly listed below, by protocol. Only in exceptional cases, where explicitly stated, do we allow information loss by coalescing packets with fields that differ.

<span id="stateful"></span>
===== Stateful =====

Receive Segment Coalescing is not stateless. This specification does not prescribe a concrete implementation. But in an abstract design, RSC maintains a table of RSC contexts. This specification does not state a minimum required number of contexts. Each RSC context can hold one SO packet. Each flow maps onto at most one context.

When a packet arrives, it is compared to all contexts. See RSS for flow matching. If a context matches the packet's flow, the next phase begins.

A packet matches a context if it matches the flow, is consecutive to the SO packet and all header fields match. Fields match if they are the same, with some protocol-specific exceptions to this rule, all listed below.

<span id="context-closure"></span>
==== Context Closure ====

An SO context closes if a packet matches the flow, but not the other conditions. Then the data is flushed to the host and the context released.
In the common case, the SO packet and incoming packet are then passed to the host as two packets. In a few specific exception cases, the incoming packet is appended to the SO packet and the single larger SO packet is passed to the host.

This special case MUST happen if all fields match except for payload size, and the payload size of the incoming packet is less than that of the previous segments. It then forms the valid remainder of the SO packet. The same SHOULD also happen if all fields match except for PSH or FIN, and either or both of these are set on the incoming packet.

<span id="general-match-exceptions"></span>
===== General Match Exceptions =====

* Length: if shorter than the previous segments, the packet may be appended, after which the context closes.
* Checksums: must all have been verified before coalescing. They are ignored for packet matching.

<span id="tcp-header-field-exceptions"></span>
===== TCP Header Field Exceptions =====

* Sequence number: must be the previous sequence number plus segment size.
* Flags: a packet with FIN or PSH set may only be appended, after which the context closes.
* Flags: all other flag differences close the context without append.

To state explicitly: Ack sequence number and TCP options must match.

''IP Header Field Exceptions''

* Fragmentation: fragmented packets are not coalesced. Detection of a first fragment closes the context for a flow, if open.
* The IP ID must either increment for each segment or be the same for all segments.
** The first is common. The second may be the result of segmentation offload.

To state explicitly: TTL, hop limit and flowlabel fields must match.

Contexts can also be closed if a maximum number of segments is reached. This maximum may be host configurable.

<span id="asynchronous-close"></span>
====== Asynchronous Close ======

Contexts can also be closed asynchronously, due to one of two events.

If the device applies RSC to a flow, it must set an expiry timer when the first packet opens an RSC context. The device must send the packet to the host no later than the timeout.
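
The TCP closure conditions described so far, including the expiry timer, can be summarized in a minimal sketch. This is not a prescribed implementation. The names <code>Context</code>, <code>Packet</code>, <code>classify</code> and <code>expired</code> are hypothetical, flow matching and IP header checks are assumed to have happened already, and appending the packet that reaches the segment maximum is an assumption rather than a stated rule.

```python
from dataclasses import dataclass

FIN, PSH = 0x01, 0x08  # TCP flag bits


@dataclass
class Packet:
    seq: int
    payload_len: int
    ack: int
    options: bytes
    flags: int


@dataclass
class Context:
    next_seq: int      # sequence number expected from the next segment
    segment_size: int  # payload length of the segments coalesced so far
    ack: int
    options: bytes
    flags: int         # flags of the segments coalesced so far
    segments: int      # number of segments coalesced so far
    deadline: float    # time at which the expiry timer fires


def classify(ctx, pkt, max_segments):
    """Decide what happens to a packet that already matched the flow."""
    ignore = FIN | PSH  # only these flag differences still allow an append
    if (pkt.seq != ctx.next_seq
            or pkt.ack != ctx.ack
            or pkt.options != ctx.options
            or (pkt.flags | ignore) != (ctx.flags | ignore)
            or pkt.payload_len > ctx.segment_size):
        return "close"          # flush SO packet, pass packet separately
    if pkt.payload_len < ctx.segment_size or pkt.flags & ignore:
        return "append_close"   # valid remainder, or FIN/PSH set
    if ctx.segments + 1 >= max_segments:
        return "append_close"   # assumption: append, then close at the limit
    return "append"


def expired(ctx, now):
    """True once the context must be flushed because its timer fired."""
    return now >= ctx.deadline


ctx = Context(next_seq=1000, segment_size=100, ack=5, options=b"",
              flags=0x10, segments=2, deadline=10.0)
print(classify(ctx, Packet(1000, 100, 5, b"", 0x10), max_segments=64))
print(classify(ctx, Packet(1000, 60, 5, b"", 0x10), max_segments=64))
print(classify(ctx, Packet(1000, 120, 5, b"", 0x10), max_segments=64))
```

The sketch ignores sequence number wraparound and leaves updating the context after an append to the caller.
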
The flow timeout value MUST be host readable and SHOULD be host configurable.

The host may also notify the device that it wants RSC to be disabled. All outstanding contexts must then be closed asynchronously, in the same manner as if their timers had expired.

<span id="so-packet-construction"></span>
==== SO packet construction ====

The device must adjust all protocol header length fields to match the length of the combined payload.

<span id="tcp-header-field-adjustments-1"></span>
===== TCP Header Field Adjustments =====

* Sequence number: sequence number of the first segment.
* Checksum: undefined.
* Flags: FIN and PSH are set if present in the last segment.
* Flags: CWR is set if present in the first segment.

''IP Header Field Adjustments''

TSO requires protocol-specific changes to the preceding IPv4 or IPv6 header of the last segment, if this is shorter than full MSS:

* IPv4 total length is updated to match the SO packet.
** Or set to zero and the below jumbo rules apply.
* IPv6 payload length is updated to match the SO packet.
** Or set to zero and the below jumbo rules apply.
* IPv4 IP ID is the ID of the first segment.
* IPv4 checksum is valid.

<span id="jumbogram-receive-segmentation-offload"></span>
===== Jumbogram Receive Segmentation Offload =====

Devices SHOULD support coalescing of packet streams that exceed the maximum IPv4 or IPv6 packet size.

Jumbogram RSC is the inverse of Jumbogram Segmentation Offload. It solves the length field limitation in the same way: the length field MUST be set to zero and the length communicated out-of-band, likely as a descriptor field.

Jumbogram RSC MUST only be applied if the total length exceeds what the IPv4 total length or IPv6 payload length field can encode.

<span id="timestamping"></span>