The draft is complete, except that more packet types will be added for the end-programmer's benefit. This will not affect the actual protocol, since only Packet and ExitPacket are absolutely required.
Please review this draft and send any comments and suggestions to gillius AT mail DOT rit DOT edu.
This document describes the method by which GNE communicates between a client and a server. More information about the GNE library can be found at http://www.gillius.org/. Note that when this document refers to client and server, it applies these concepts only at the low level. A server is simply the computer that is listening for new connections and a client is the computer that is initiating them. These terms do not refer to any role the two peers play in communicating data for the networking application.
This document should be adequate by itself to allow another programmer to write a compatible interface that can interact transparently with a peer using the GNE library, for example to write a compatible C version of the library or to write utilities that interact with game servers using GNE (like Gamespy).
This document will also make references to TCP and UDP. Since GNE uses HawkNL, GNE can work over any network type, so when TCP is mentioned it can mean TCP or any other similarly reliable protocol, and for UDP it means any unreliable datagram protocol. More precisely, TCP means a HawkNL socket of type NL_RELIABLE_PACKETS and UDP of type NL_UNRELIABLE.
This document will also refer to the term "user," meaning the programmer who will be using GNE to develop their game. This term is used interchangeably with the term "end-programmer."
GNE always performs its transfers using a packet-based method, never a stream method; however, TCP is currently used to transfer reliable packets. Packets are transferred over a stream socket using the NL_RELIABLE_PACKETS socket type in HawkNL, the network library that the C++ implementation of GNE uses to encapsulate the OS's socket functions.
Breaking a stream socket into packets was chosen not only for its convenience, but also to allow additional communication methods to be added to the GNE protocol (in version 2). For example, the two sockets might be combined into a single UDP or IPX socket that sends both unreliable and reliable packets together.
The format in the TCP stream consists of the two bytes 'N' and 'L', then a 16-bit packet content length in big-endian format, followed by the content. The content length does not include the 4-byte packet header. So where TCP is mentioned in this document, NL_RELIABLE_PACKETS is what is meant.
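The framing described above can be sketched as follows. This is only an illustration of the byte layout: the function names are hypothetical and belong neither to HawkNL nor to GNE, and the content is assumed to be already serialized.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Wraps `content` in the 4-byte header: 'N', 'L', then the content
// length as a 16-bit big-endian integer (header not counted).
std::vector<std::uint8_t> frameReliablePacket(const std::vector<std::uint8_t>& content) {
    if (content.size() > 0xFFFF) {
        throw std::length_error("content does not fit in a 16-bit length");
    }
    std::vector<std::uint8_t> out;
    out.reserve(content.size() + 4);
    out.push_back('N');
    out.push_back('L');
    out.push_back(static_cast<std::uint8_t>(content.size() >> 8));    // high byte first
    out.push_back(static_cast<std::uint8_t>(content.size() & 0xFF));  // then low byte
    out.insert(out.end(), content.begin(), content.end());
    return out;
}

// Reads the content length from a 4-byte header.
// Returns false if the buffer is too short or the magic bytes are wrong.
bool readFrameLength(const std::vector<std::uint8_t>& buf, std::size_t& length) {
    if (buf.size() < 4 || buf[0] != 'N' || buf[1] != 'L') return false;
    length = (static_cast<std::size_t>(buf[2]) << 8) | buf[3];
    return true;
}
```

Note that this header is the one place in the stream where a big-endian value appears; everything inside the content follows GNE's own byte order.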
Because GNE will be running on multiple platforms and architectures, it is necessary to define several common variable types. All of the data types will be in little-endian format. This is different from most portable networking libraries, which usually send data in big-endian format. The reason is that I expect the library to be used almost exclusively on the Intel x86-based architecture, rather than the other main supported architectures, PPC (Macintosh) and UltraSparc (Sun), which are both big-endian processors.
As a refresher, in little-endian format the least-significant byte comes first, so the bytes appear in memory in the opposite order from how we write the number on paper. In big-endian the most-significant byte is first. So the number 45,682 is 0xB272 in hex. If we assign this number to a 32-bit integer, in memory on a little-endian processor we will see the bytes 72, B2, 00, 00 and on a big-endian processor we will see the bytes 00, 00, B2, 72. x86 CPUs are little-endian but Macs and UltraSparc machines are big-endian.
GNE sends all of its data in little-endian format. Of course, this does not force the GNE user to send inner-packet data as little-endian, but it is suggested that they do so. The RawPacket class's stream operators in the C++ implementation automatically convert from the host format into little-endian.
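As an illustration of the byte order, here is a small sketch that converts a 32-bit value to and from little-endian bytes using shifts, so it behaves the same on any host. The function names are hypothetical and are not the RawPacket operators themselves.

```cpp
#include <array>
#include <cstdint>

// Encodes a 32-bit value least-significant byte first, regardless of host byte order.
std::array<std::uint8_t, 4> toLittleEndian32(std::uint32_t v) {
    return {{
        static_cast<std::uint8_t>(v & 0xFF),
        static_cast<std::uint8_t>((v >> 8) & 0xFF),
        static_cast<std::uint8_t>((v >> 16) & 0xFF),
        static_cast<std::uint8_t>((v >> 24) & 0xFF)
    }};
}

// Rebuilds the value from four little-endian bytes.
std::uint32_t fromLittleEndian32(const std::array<std::uint8_t, 4>& b) {
    return static_cast<std::uint32_t>(b[0])
         | (static_cast<std::uint32_t>(b[1]) << 8)
         | (static_cast<std::uint32_t>(b[2]) << 16)
         | (static_cast<std::uint32_t>(b[3]) << 24);
}
```

For the number 45,682 used above, toLittleEndian32 yields the bytes 72, B2, 00, 00.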
gbyte, gint8, gint16, gint32, guint8, guint16, guint32
These are integer types. The number specifies their width in bits, and the u denotes unsigned types. gbyte is a synonym for guint8. gint64 and guint64 might be added in a future revision of the GNE protocol.
These are single-precision (32 bit -- float in most compilers) and double-precision (64 bit -- double in most compilers) floating point numbers, as defined by the IEEE 754-1985 standard. From what I can tell, x86, Mac, and UltraSparc platforms all use and understand these numbers natively, so luckily it seems no conversion other than endian is required. If in the future it is found that conversion is needed, the RawPacket operators will handle this conversion, always converting to and from the network format.
std::string, const char*
GNE can read and write strings. A string used by GNE cannot be longer than 255 characters. If a longer string is required, it is suggested that a new packet type be created which holds only the string in the format the user chooses (a packet that does this MIGHT be added in the future). The reason for this limit is that packets shouldn't be much larger than 250 bytes, and alongside the other data a larger string should not be needed.
The string format is a guint8 byte specifying a string size x from 0 to 255. The next x bytes of the packet are the string. This method was chosen over C-style strings for two reasons:
At the moment, 8-bit strings are assumed, but in the future 16-bit or wider strings will likely be supported through std::wstring. In this case, the format will be the same: a size byte will specify the number of bytes the string consists of, and then that number of bytes will be copied into memory, so it will hopefully be format-transparent. This change might be made before protocol 1 is finalized, or it might be added into protocol version 2, depending both on the demand and the difficulty of the support.
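The 8-bit string format (one size byte followed by that many bytes) could be written and read along these lines. This is a sketch with hypothetical helper names, not the RawPacket implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Appends a size byte followed by the string's bytes.
void writeString(std::vector<std::uint8_t>& out, const std::string& s) {
    if (s.size() > 255) throw std::length_error("GNE strings are limited to 255 bytes");
    out.push_back(static_cast<std::uint8_t>(s.size()));
    out.insert(out.end(), s.begin(), s.end());
}

// Reads a string starting at `pos` and advances `pos` past it.
std::string readString(const std::vector<std::uint8_t>& in, std::size_t& pos) {
    if (pos >= in.size()) throw std::out_of_range("missing string size byte");
    const std::size_t len = in[pos];
    if (in.size() - pos - 1 < len) throw std::out_of_range("truncated string");
    const auto first = in.begin() + static_cast<std::ptrdiff_t>(pos + 1);
    std::string s(first, first + static_cast<std::ptrdiff_t>(len));
    pos += 1 + len;
    return s;
}
```

Because the size is known before the bytes are read, the reader can validate the length up front instead of scanning for a terminator.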
A time format. This value is 64 bits long and consists of 2 gint32 parts. The first 4 bytes are a gint32 holding the number of seconds. This number can be negative, which makes sense only if the time is a relative time. The last 4 bytes are another gint32 holding the number of microseconds. This value must be in the range [0, 999,999]. The sign of the microseconds part is always positive, and the entire time value should be interpreted as seconds + microseconds. This is of particular note if the seconds part is negative -- if the seconds value is -5 and the microseconds value is 500,000, then the time is to be interpreted as -4.5 seconds and not -5.5 seconds.
For representing small relative time values, or small absolute time values (such as current frame count), the user will likely want to use a guint32 to conserve bandwidth.
In the C++ implementation of GNE, the gtime data type is represented by the GNE::Time class.
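To make the sign rule for gtime concrete, the following sketch converts between the two-field form and a single signed microsecond count. The GTime struct and both function names are hypothetical stand-ins and do not reflect the interface of GNE::Time.

```cpp
#include <cstdint>

// Hypothetical stand-in for the two gint32 fields of a gtime value.
struct GTime {
    std::int32_t seconds;       // may be negative for relative times
    std::int32_t microseconds;  // always in [0, 999999]
};

// The value is interpreted as seconds + microseconds.
std::int64_t toMicroseconds(const GTime& t) {
    return static_cast<std::int64_t>(t.seconds) * 1000000 + t.microseconds;
}

// Splits a signed microsecond count so the microsecond field stays non-negative.
GTime fromMicroseconds(std::int64_t us) {
    std::int64_t sec = us / 1000000;
    std::int64_t rem = us % 1000000;
    if (rem < 0) {  // C++ division truncates toward zero, so borrow one second
        rem += 1000000;
        sec -= 1;
    }
    return GTime{static_cast<std::int32_t>(sec), static_cast<std::int32_t>(rem)};
}
```

With these helpers, a seconds value of -5 with 500,000 microseconds maps to -4,500,000 microseconds, that is, -4.5 seconds.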
The start of the connection process works as usual for the reliable protocol that is in use (such as TCP). Once the low-level connection has been established, the client (the connecting machine) sends a packet called the connection request packet (CRP), with an exact size of 48 bytes, containing the following information. The CRP starts with a header so that the server can more easily tell whether the packet came from GNE rather than from some random client.
Game string notes: The 32-byte game string is used because, without it, one GNE program could connect to a different GNE program without either side noticing. It is expected that many users will use user protocol version numbers of 0, 1, 2, and so on. Therefore, if one was running a GNE game "A" and a GNE game "B", they are likely to have the same GNE protocol versions, and somewhat likely to have the same user versions, so the way chosen to differentiate between the two games is to send the game name or some other unique identifier string. Format: this string is a NULL-terminated ASCII string of length no greater than 31 characters. Any unused space in the buffer shall consist of bytes with the value of 0.
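A sketch of how the 32-byte game string field could be filled and compared follows. The function names are hypothetical; comparing the whole buffer is an assumption that works because unused bytes are required to be 0.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <string>

// Builds the 32-byte game string field: up to 31 ASCII characters,
// with every remaining byte (including the terminator) set to 0.
std::array<char, 32> makeGameString(const std::string& name) {
    if (name.size() > 31) throw std::length_error("game string is limited to 31 characters");
    std::array<char, 32> buf{};  // value-initialized: all bytes are 0
    for (std::size_t i = 0; i < name.size(); ++i) buf[i] = name[i];
    return buf;
}

// Since the padding is always 0, comparing all 32 bytes compares the names.
bool sameGame(const std::array<char, 32>& a, const std::array<char, 32>& b) {
    return a == b;
}
```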
The build number is used only as a convenience for development and can be used during the development process because the implementation may only be partially completed. The version number of the connecting GNE client should match exactly what the server expects, and therefore the number should only change when changes make communication incompatible. This protocol version number is not related to the version number of the GNE code itself.
The protocol version number is read as version.sub.build. So the version number for a final production GNE implementation of this protocol should be 1.0.0. Non-released development versions (as you will see if you get GNE from CVS) will have the previous GNE version number with a non-zero build, so the first attempts at implementing 1.0.0 will be versioned 0.0.1, 0.0.2, and so on. Only CVS versions should have a non-zero build number.
The user version is supplied by the user code, and its content is a number which should match exactly between both ends of the connection. It is unsigned so that nothing is implied by the bits in memory. One can note that the user version number uses the same amount of space (32 bits) as the GNE version number, and therefore a user could implement a similar version numbering scheme to GNE's.
The first two version numbers are picked as guint8 values rather than gint32 or guint32 because they are endian and signed-number-format independent and thus are forwards compatible. Even if the default endian format or the signed number format of GNE changes, these values will remain compatible and comparable.
Once the server receives the CRP, it can choose to accept or refuse the connection. It must refuse the connection if any of the versions do not match. It can also refuse the connection for any other reason it wishes (e.g., a server may choose to allow only IP addresses in its subnet, or it may be too busy). Both the refusal and the acceptance start with a header so that the client can more easily tell whether the returned packet came from GNE rather than from some random server.
If the server decides to refuse, it sends a refusal packet with the server's version information with a size of exactly 44 bytes:
The refusal packet tells the client what the version mismatch was. If the versions match, then the client cannot tell why it was refused.
If the server accepts, it sends a connection accepted packet (CAP) of 8 or more bytes, or exactly 12 bytes when using the TCP/UDP or IPX/SPX protocols:
The last part of the CAP is defined depending on the low-level protocol in use to actually connect the peers. For internet and other port-based protocols, a port number is sent in the form of a gint32. The address should be the same and therefore is not sent. The client requests an unreliable port by sending a value of true in the appropriate field of the CRP.
If the client did not request an unreliable connection, or the server refused it, then the client considers the connection complete as soon as it receives the CAP. The server considers the connection complete as soon as it sends the CAP.
If an unreliable connection was successfully negotiated, then once the client receives the CAP, it will send a packet over its reliable connection with only the data needed for the server to be able to respond to the client, in nearly the same format that the server sent.
For UDP and IPX this shall be a gint32 from 0 to 65535, inclusive.
The client may optionally send a packet over the unreliable connection at this time to the server if that will help in opening up a firewall or gateway. This optional packet is a GNE-level packet consisting of zero user-level packets, meaning it should send a single packet of a single guint8 with a value of 255.
Once the client has sent the one or two packets it can consider the connection process completed and start to send data. The server will consider the connection complete once it receives the packet sent over the reliable connection.
Very likely the user will want to perform further transfers of information to start their connection. Any extra connection communication is dependent on the user and is the reason for the user version number. This process will be handled through the onNewConn and onConnect events in the GNE library.
Whether reliable or unreliable, streaming or non-streaming, there are two layers of packets for GNE: user-level and GNE-level. A GNE-level packet is a low-level packet that cannot be fragmented (and is small). If there can be fragmentation, the networking layer will take care of this (like HawkNL's NL_RELIABLE_PACKETS for packet-based transmission over TCP).
The user-level packets are the only thing the user can see. From the user's perspective, the packets are coming and going in a stream through the PacketStream class. The end-programmer chooses whether to send a packet reliably or not. Sending it reliably guarantees order, reception, and uniqueness (no duplicated packets). Without sending it reliably, packets might be lost, out of order, excessively late, or duplicated. In the default C++ implementation, all packets come in from the same place -- the client cannot tell how a packet got there, so the reliable and unreliable connections both feed into the same packet stream. If the client did not request an unreliable connection, it is suggested that a GNE implementation send packets marked for the unreliable connection over the reliable one, or at least generate an error. It is suggested that implementations of the GNE protocol treat reliable packets as having a higher priority than unreliable packets and send reliable packets first, or implement an interface that allows the end-programmer to specify priority levels (the default C++ implementation does not have one at this time, but treats reliable packets as having the highest priority).
Note: Only the user-level packets follow these guarantees, not GNE-level packets. Therefore, a future revision of the GNE protocol might specify a way to internally reorder packets, combine the reliable and unreliable data into a single UDP or TCP connection, or various other methods to optimize network usage, as long as by the time the packets reach the user the guaranteed conditions are met.
The user is meant to create their packets to be as small as possible, so even a packet containing 5 bytes is perfectly acceptable. This is because GNE is expected to optimize the connection and combine these packets.
When connecting, the client and server send each other their maximum incoming data rates. This means that the peer should not transmit more than approximately that many bytes per second. This can be used to better cooperate and adjust for modem users, or to reserve some bandwidth on a broadband connection for other uses. On a server it can be used to control the amount of incoming data for various reasons. Thus if the client requests an incoming maximum of 3200 bytes per second, the server in any particular second should not send out more than approximately 3200 bytes to that client. It is stated as approximately because the protocol does not have to (nor can, given the unreliable connection) send EXACTLY 3200 bytes or less, but the bandwidth cap should be honored as much as is reasonable.
Note that the value of zero sent in the connection process means that no rate limiting is requested, so that the other side can feel free to send as much data as it wants. The value can still be changed through RateAdjustPackets to or from this "unlimited" value.
Note that if no data is sent for a time, a "deficit" does not build up. This is consistent with how modems or low-bandwidth connections work. The average sent over a second should still match the requested rate or less.
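One possible way to honor such a cap, shown only as a sketch, is a fixed one-second window whose unused allowance is discarded rather than carried over. The RateLimiter class is hypothetical and is not how the C++ implementation of GNE is known to work; the caller supplies the time in milliseconds from any monotonic clock.

```cpp
#include <cstdint>

class RateLimiter {
public:
    // rate is in bytes per second; 0 means "unlimited", as in the connection process.
    explicit RateLimiter(std::uint32_t rate)
        : rate_(rate), windowStart_(0), sentInWindow_(0) {}

    void setRate(std::uint32_t rate) { rate_ = rate; }

    // Returns true and records the bytes if sending `bytes` at time `nowMs`
    // stays within the cap for the current one-second window.
    bool trySend(std::uint32_t bytes, std::uint64_t nowMs) {
        if (rate_ == 0) return true;
        if (nowMs - windowStart_ >= 1000) {
            // New window: the unused allowance is dropped, so no "deficit" accumulates.
            windowStart_ = nowMs;
            sentInWindow_ = 0;
        }
        if (sentInWindow_ + bytes > rate_) return false;
        sentInWindow_ += bytes;
        return true;
    }

private:
    std::uint32_t rate_;
    std::uint64_t windowStart_;
    std::uint64_t sentInWindow_;
};
```

A fixed window is only approximate: a burst at the end of one window followed by a burst at the start of the next can briefly exceed the rate, which the text above tolerates. A sliding window would be stricter at the cost of more bookkeeping.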
In the default C++ implementation of GNE, the end-programmer also gets to specify outgoing data rates, and when connecting, the implementation picks the minimum of the two rates. This is implementation dependent, however, as the peer is free to select an actual data transfer rate equal to or less than the requested rate.
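Picking the minimum needs some care because zero means "unlimited" rather than "nothing". A sketch of the selection follows, assuming that a locally configured outgoing rate of zero also means unlimited; the function name is hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Returns the rate to actually send at, in bytes per second (0 = unlimited).
std::uint32_t effectiveRate(std::uint32_t peerRequested, std::uint32_t localOutgoing) {
    if (peerRequested == 0) return localOutgoing;   // peer imposes no limit
    if (localOutgoing == 0) return peerRequested;   // local side imposes no limit
    return std::min(peerRequested, localOutgoing);
}
```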
The peer is also allowed to change this rate up and down at any time by sending a RateAdjustPacket over the reliable connection. The other end of the connection should start to honor this new rate "soon," where soon is defined as within a second after receiving the RateAdjustPacket, but as before, you can't "unsend" sent packets and a burst of unreliable packets might arrive late, so the exact results are still fuzzy. The rate change ability can be useful to servers, especially if they are low bandwidth, to allow clients to send them more when few people are playing but to have them reduce their sending when the server becomes loaded with extra players. This can also be used by advanced programmers in their attempt to tweak the connection and throttle it up and down depending on changing conditions.
If the programmer is sending more packets to the implementation than can be sent out, the implementation may do one or more of the following:
The implementation is NOT allowed to drop reliable packets. If the bottleneck becomes large enough that reliable packets cannot be sent any longer, the connection should be considered broken and/or timed out and should be terminated. At what point this occurs is decided by the implementation and possibly the end-programmer.
Each GNE-level packet consists of zero or more user-level packets. The format is as follows:
A GNE-level packet will contain zero user-level packets in the case of the client choosing to send its optional UDP packet to open up a firewall or gateway.
So in other terms, each packet's first byte is an ID identifying the type of packet that it is, followed by a set of packet-defined data which is parsed by code in the class that represents the packet. ANYTHING can exist in there, and a packet does not have to be of a set size for its ID (common examples are CustomPacket and packets with strings). The ID 255 represents the end of the GNE-level packet, and parsing stops at that point.
If an error occurs during parsing, the implementation defines what happens. It could throw out the current packet and all other packets left unparsed in this GNE-level packet. It could also attempt to recover data in the current packet and any others left unparsed. In either case the implementation should warn the client code that an error occurred and data was lost.
Note that this means even reliable data can be lost to earlier, corrupted, reliable packets! Since the network layer guarantees that the data inside of our packets will always arrive correctly, errors during parsing can only represent programmer errors or a hacked client sending improper packets (which is really a subset of programming errors).
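The parsing loop described above might look like the following sketch. The parser registry and all names are hypothetical, and the sketch assumes that every GNE-level packet ends with the ID 255; on an error it simply discards the rest, which is one of the behaviors an implementation may choose.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <vector>

const std::uint8_t END_OF_PACKET = 255;

// A parser consumes one user-level packet body starting at `pos` (just past
// the ID byte), advances `pos`, and returns false if the data is malformed.
using BodyParser = std::function<bool(const std::vector<std::uint8_t>&, std::size_t&)>;

// Returns the number of user-level packets parsed, or -1 on a parsing error.
int parseGnePacket(const std::vector<std::uint8_t>& data,
                   const std::map<std::uint8_t, BodyParser>& parsers) {
    std::size_t pos = 0;
    int count = 0;
    while (pos < data.size()) {
        const std::uint8_t id = data[pos++];
        if (id == END_OF_PACKET) return count;  // parsing stops at the end marker
        auto it = parsers.find(id);
        if (it == parsers.end() || !it->second(data, pos)) return -1;
        ++count;
    }
    return -1;  // ran out of data before the end marker
}
```

A GNE-level packet holding zero user-level packets is then just the single byte 255, for which this function returns 0.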
By user-level, this refers to packets that are placed into the packet stream, and those packets that are contained inside the GNE-level packets that are sent between the peers. The end-programmer will see almost all of these packets enter their packet stream. The exceptions are:
It is highly suggested that the underlying GNE implementation handle these packets. The C++ implementation completely hides the existence of RateAdjustPackets, and the interaction with ExitPacket is known only through an onExit event.
GNE defines many user-level packets, especially the 'Packet' packet which all packets must be derived from.
Packets are considered to "derive" from each other. Derive used in this document means just what it does in C++: a packet that is derived from another is adding on to its attributes. Users of GNE are meant to create their own derivations of the packets that GNE provides. A packet that is derived from another is called "the child" and the packet it was derived from is called "the parent." The notation that GNE uses is a colon, starting with the base packet and going to the final child, in the form grandparent:parent:child.
The order of data in the serialized form of the user-level packet is strictly defined. All of the parent's data comes before the child's. Since Packet is always the first in the derivation, its data goes first, and its data consists of a packet ID which the child should define to be its own. Therefore the packet ID always comes first, which forms valid GNE-level packets.
In the default C++ implementation, each packet class parses its own data. So the parsing function for a child packet first calls the parent's parsing function, then parses its own data. If that packet in turn has a child, then its parsing function was itself called by the child's.
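The same parent-first ordering applies when writing. The sketch below uses hypothetical classes (BasePacket and PositionPacket are not the GNE library's real classes, and the ID 100 is arbitrary) to show a child calling its parent before adding its own data.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical base class: it owns the packet ID, which the child supplies.
class BasePacket {
public:
    explicit BasePacket(std::uint8_t id) : id_(id) {}
    virtual ~BasePacket() {}
    // The base writes only the packet ID, so the ID is always the first byte.
    virtual void write(std::vector<std::uint8_t>& out) const { out.push_back(id_); }

private:
    std::uint8_t id_;
};

// Hypothetical child packet carrying two one-byte fields.
class PositionPacket : public BasePacket {
public:
    PositionPacket(std::uint8_t x, std::uint8_t y) : BasePacket(100), x_(x), y_(y) {}
    void write(std::vector<std::uint8_t>& out) const override {
        BasePacket::write(out);  // parent's data first
        out.push_back(x_);       // then this packet's own data
        out.push_back(y_);
    }

private:
    std::uint8_t x_;
    std::uint8_t y_;
};
```

A parsing function would mirror this shape: read the parent's fields through the parent's function, then read the child's fields in the same order they were written.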
Making sure that the clients' clocks are synchronized to the server is often useful for games. Because GNE will be used in games, the program will not want to actually change the client's clock, but it will want to know the time offset between itself and the server. Because finding round trip time and clock offset are so related, and because the amount of extra packet data needed to find clock offset is small (16 bytes), the PingPacket serves both purposes.
A choice had to be made between synchronizing actual real clock times, or synchronizing on an arbitrary server clock. Because absolute time clocks have less resolution on some platforms (e.g., Windows), and because of possible difficulties surrounding differing timezones and GMT-UTC differences, the clock synchronization will be based on an arbitrary clock on the server. This allows the server to use the clock with the highest resolution, and perhaps more importantly, the server can use the same clock for synchronizing and for timing the game logic. The important part to know from this is that the clock offsets may be extreme (as the C++ implementation uses GMT in Linux, but seconds since CPU powerup in Windows), and thus may not fit into a 32-bit integer with microsecond accuracy.
The method used to determine time offset and round trip time is the same that is used for NTP (RFC 1305) and SNTP (RFC 2030). There are 6 variables in this scheme:
The actual times at which T1-T4 are taken can vary. Ideally these times would be taken at the moment the network interface actually sent or received the packets. It is up to the implementation to decide when to take these times, but the closer the time is taken to the physical send/receive time, the better the measurement of round-trip time and clock offset.
The round-trip time is found through the simple formula: R = (T4 - T1) - (T3 - T2).
The clock offset time is found from the following formula: O = ( (T2 - T1) + (T3 - T4) ) / 2.
The accuracy of the offset depends on how close the two one-way trip times are to each other. The algorithm performs perfectly if the time the packet takes to reach the remote is the same as the time the reply takes to return. The amount of error is exact, and is the difference in the latencies divided by 2. Substituting T2 = T1 + LatencyTo + True Offset and T3 = T4 - LatencyFrom + True Offset into the formula for O gives: True Offset = O - (LatencyTo - LatencyFrom) / 2. I've spent a large amount of time researching this and I could find no way to discover the latencies, or even to discover whether the offset O is not the true offset.
Even so, the true offset's range is always known exactly. Since the two one-way latencies are non-negative and add up to R, their difference cannot exceed R, so the maximum error is E = R / 2 and: O - E <= True Offset <= O + E.
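The formulas for R, O, and E can be computed directly. In this sketch all times are in microseconds, T1 and T4 are assumed to be read from the local clock and T2 and T3 from the remote clock (as in NTP), and the struct and function names are hypothetical.

```cpp
#include <cstdint>

struct PingResult {
    std::int64_t roundTrip;  // R
    std::int64_t offset;     // O
    std::int64_t maxError;   // E = R / 2
};

PingResult computePing(std::int64_t t1, std::int64_t t2,
                       std::int64_t t3, std::int64_t t4) {
    PingResult r;
    r.roundTrip = (t4 - t1) - (t3 - t2);
    r.offset = ((t2 - t1) + (t3 - t4)) / 2;
    r.maxError = r.roundTrip / 2;
    return r;
}
```

As a worked case, suppose the remote clock is exactly 1,000,000 us ahead, the request takes 30,000 us, the remote spends 5,000 us before replying, and the reply takes 10,000 us. Then T1 = 0, T2 = 1,030,000, T3 = 1,035,000, and T4 = 45,000, which gives R = 40,000, O = 1,010,000, and E = 20,000. The real offset of 1,000,000 lies inside [O - E, O + E] as expected.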
The packet listings here use the same names as their class names in the standard C++ implementation of GNE, but at the network level their names have no significance, only their guint8 id. The last packet (the new child) of the derivation defines the packet ID of the whole packet. The data listed for each packet is only for that packet -- all of the parent's data is assumed and is not stated again.
A GNE implementation is required to be able to understand and parse all of these packets so that end-programmers can derive their own packet types from them. Packet, CustomPacket, RateAdjustPacket, and ExitPacket provide the base functionality; the rest are more suited to GNE being a protocol for gaming engines.
|Data Type||Symbolic Name||Description|
|guint8||Packet ID||An ID, defined by child packets, that is written when the packet is serialized into a GNE-level packet.|
|This packet serves as a base packet for all of the other packets and contains the ID.|
|Data Type||Symbolic Name||Description|
|guint8||length||The length of the encapsulated raw packet.|
|undefined||raw packet||The raw packet of the previously given length.|
|This packet allows the client to send some information without having to define a packet to encapsulate it, which is useful when sending once-only data. It is mostly meant for end-programmers to use with a SyncConnection to do the connection handshaking for their own game or to send things like maps or other large binary data during connection time.|
|This packet contains no actual data. It is used by GNE to signify a graceful disconnect.|
|Data Type||Symbolic Name||Description|
|guint32||rate||The maximum incoming data rate in bytes per second. Zero for "unlimited" rates allowed (turn off throttling).|
|RateAdjustPacket's functionality is discussed in the bandwidth control section of this document. Sending this packet will tell the other side the maximum amount of data to send to this side. The programmer does not send this packet explicitly -- it is sent as a result of requesting a rate change in the implementation.|
|Data Type||Symbolic Name||Description|
A session-unique request ID.
|gtime||T2||Remote packet reception time, relative to any point in time the server chooses.|
|gtime||T3||Remote packet sent time, relative to any point in time the server chooses.|
This packet can be sent by the programmer or by the GNE implementation. The ID in the packet is used to match a request to its reply and should be unique for a given connection whether sent by GNE or the user. A GNE implementation is encouraged to reply to the ping.
In the request packet, T2 and T3 are to have time values of 0 seconds and 0 microseconds. In the reply packet, T2 and T3 should be set to the times the remote end received the request and sent the reply. T2 must be <= T3. Equality is allowed if the time between receiving and replying is very small. The values T2 and T3 are used to find the clock offset, which is used for synchronizing clocks. The epoch (starting point) for times T2 and T3 must remain the same for the whole connection, and for the synchronizing to mean anything, all times reported by the server should use the same epoch.
How these values are interpreted to determine ping time and offset is discussed in the clock synchronization section of this document.
More packets will be created in a separate document, with the high-level game engine features of GNE, which will be an additional specification on top of this protocol. For this purpose, the packet IDs from 5 to 15 inclusive are reserved.
When a client disconnects, it should send a valid GNE-level packet consisting of an ExitPacket over the reliable connection as its last data if possible, then close its sockets. A GNE implementation has the option of ignoring any packets that arrive after an ExitPacket (because late unreliable packets might still come).
More types of connections might be added in the future. Right now there is reliable TCP data and unreliable UDP data. TCP guarantees order and single reception and UDP guarantees neither. It might be possible to provide an implementation over UDP that guarantees reception but not order or single reception, which would be more efficient than using TCP and UDP. Since GNE abstracts the low level and only asks what it needs to guarantee, optimizations such as this are transparent to the user and therefore could be done later.
As mentioned earlier, the capability to send wide character formats will likely be added into GNE.