kylebrandt/Full Long Dataplane.md

## Full Long Dataplane.md

      
    Raw
  

              Full Long Dataplane.md
            
          
    Full Long Formats

Two new data types have been introduced with SQL expressions. timeseries-full-long and numeric-full-long. This is a row-oriented tabular format similar to long (labels are not used), but structured in a way that allows more lossless conversion of information between the labeled (wide and multi) formats.
So the use case for this format is when you need to be able to convert between labeled and tabular formats with more preservation of the information. Whereas, when converting back and forth with Long can create artifacts.
Properties shared by Full Long Formats

They are single frame formats.
There are three reserved fields (columns):

__metric_name__: The name of the metric
__value__: A nullable *float64 numeric field, where this is the value that that represents combination of the metric name and dimensions for the row.
__display_name__: (Optional) When converting from other formats, if the DisplayNameFromDS property is set, it is flatted into this field
Note: Additional reserved names may be added in the same naming style. This would be to flatten other column metadata such as a data links or units.

Dimension Columns

Like long any dimensions (that would be labels in in wide or multi) become their own field. The dimension fields name corresponds to key of the dimensions, and the values values are in the rows of that field.
In the case of full-long they are nullable string fields. When the value on a row is null, the label is considered absent. This helps with more correct conversion to and from the wide and multi formats because of all of the items do not have the same set of label keys.
Numeric Full Long

The numeric kind is for when each item in the response as a value. The numeric kind in format full long format is identified by the data frame type numeric-full-long.
Numeric Full Long Examples

Single Metric Example


    __metric_name__
    __value__
    host
    region
  
  
    cpu_load
    0.82
    a
    us-east-1
  
  
    cpu_load
    0.61
    b
    us-east-1
  

Multi-Metric Example


    __metric_name__
    __value__
    host
    region
  
  
    cpu_load
    0.82
    a
    us-east-1
  
  
    cpu_load
    0.61
    b
    null
  
  
    disk_free_bytes
    1.23e+12
    a
    us-east-1
  
  
    disk_free_bytes
    9.80e+11
    b
    us-east-1
  

Timeseries Full Long

The timeseries kind shares the same properties as the numeric kind for the full long format in terms of reserved columns and the behavior of dimensions. The timeseries kind in format full long format is identified by the data frame type timeseries_full-long.
The main difference is that there must be a time column the is is sorted by time in ascending order (old to new). When this type is produced by converting from the multi or wide formats, the name of the field (column) is kept.
Timeseries Full Long Examples

Example 1


    time
    __value__
    __metric_name__
    host
    iface
  
  
    2025-10-02 15:55:41.000
    1
    cpu
    a
    x
  
  
    2025-10-02 15:55:41.000
    6
    cpu
    b
    y
  
  
    2025-10-02 15:55:42.000
    4
    cpu
    a
    x
  
  
    2025-10-02 15:55:42.000
    8
    cpu
    b
    y
  
  
    2025-10-02 15:55:43.000
    2
    cpu
    a
    x
  

How Full Long differs from the Long format

The long long format is partially flatted, where as the full-long format is fully flatted.
In the full-long formats, the metric name is treated as another dimension. This creates two reserved fields for this purpose in full long: __value__ and __metric_name__.
In the long and each unique metric name gets a field (column). The field's Name property is the metric name. Like long the dimensions that would be labels in the wide and multi formats get flatted. But in long the metric name is not treated as a dimension and does not get flatted, whereas it does in full-long
This format also uses nullable strings for dimensions. This means that if the value is null for the row, where converting to labels, that label should be considered not present. This allows full-long to avoid creating label sets that didn't exist during a round trip conversion from wide or multi.
In the case of the timeseries kind, with roundtrip conversion (multi or wide to long, and then back) long can create false datapoints. Since long does not flatten the metric name portion, there was no way to represent timestamped data points that existed for one set of series sharing a name but not the other set of series. Since full-long is fully flattened into __value__ and __metric_name__, those missing points don't get falsely created like to with full long with doing roundtrip conversion.
In long, the numeric type for the value is persevered. For example a metric that is a unit64 will stay that way. Because full-long flattens to a a single column (__value__), all numeric values become nullable float64 values. In most cases, upstream we effectively convert to something like a float64 (or JS float), so this likely doesn't matter much in current practice.
Conversion between wide and multi formats

From wide/multi to full-long


Fundamentally we are flattening a column-oriented structure with labels into row-oriented structure without labels.
Each label key present becomes a field (column) with the name of the label key
All the numeric value columns are combined and flatted into one __value__ field. Whatever the numeric type is, it will be converted to a nullable *float64 field.
The field.Name from wide/multi gets flatted into the __metric_name__ field.
If the timeseries kind, the first field of type time gets flatted and keeps the name

Use in SQL expressions

When SQL expressions receives the data in the kinds of timeseries or numeric in the wide or multi formats, it converts it to the corresponding full long format. This is because there is no notion of labels in SQL, so the data must be flatted into tabular format.
__metric_name__	__value__	host	region
cpu_load	0.82	a	us-east-1
cpu_load	0.61	b	us-east-1
time	__value__	__metric_name__	host	iface
2025-10-02 15:55:41.000	1	cpu	a	x
2025-10-02 15:55:41.000	6	cpu	b	y
2025-10-02 15:55:42.000	4	cpu	a	x
2025-10-02 15:55:42.000	8	cpu	b	y
2025-10-02 15:55:43.000	2	cpu	a	x